Intron AfriSpeech-200 Automatic Speech Recognition Challenge
Can you create an automatic speech recognition (ASR) model for African accents, for use by doctors?
Bug with Whisper Models Fixed
Data · 4 Mar 2023, 18:43 · 3

I think we finally found the fix for this error.

RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1384, in forward
    return_dict=return_dict,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1249, in forward
    return_dict=return_dict,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1001, in forward
    hidden_states = inputs_embeds + positions
RuntimeError: The size of tensor a (449) must match the size of tensor b (448) at non-singleton dimension 1

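The 449 vs. 448 mismatch most likely means a tokenized transcript is one token longer than the 448 positions the Whisper decoder supports. Removing all samples in train/dev whose transcript is longer than 300 characters should take care of this problem.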
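
A minimal sketch of that filter, assuming each sample is a dict with a 'transcript' key as in the dump further down this thread. The names MAX_TRANSCRIPT_CHARS, keep_sample and filter_long_transcripts are made up for illustration and are not part of the challenge starter code.

```python
# Hypothetical helper names; only the 300-character threshold comes from the post.
MAX_TRANSCRIPT_CHARS = 300


def keep_sample(sample, max_chars=MAX_TRANSCRIPT_CHARS):
    """Return True when the transcript is short enough to keep."""
    return len(sample["transcript"]) <= max_chars


def filter_long_transcripts(samples, max_chars=MAX_TRANSCRIPT_CHARS):
    """Drop every sample whose transcript has more than max_chars characters."""
    return [s for s in samples if keep_sample(s, max_chars)]


if __name__ == "__main__":
    demo = [
        {"transcript": "short note"},
        {"transcript": "x" * 301},  # one character over the limit, gets dropped
    ]
    print(len(filter_long_transcripts(demo)))
```

If the data is loaded with the Hugging Face datasets library (an assumption, the post does not say), the same predicate could be passed to Dataset.filter instead of using the list comprehension.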

Discussion 3 answers

Thanks @intron,

When I used truncation with max_length = 80, the error changed to "expected sequence of length 80 at dim 1 (got 90)", and I'm still looking into it. I think it would be better to truncate the end of the transcript than to drop all samples over 300 characters.
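
A rough sketch of truncating the label ids instead of dropping the sample. The helper truncate_labels is hypothetical, and the end-of-text id 50256 is only taken from the last id in the labels dump posted later in this thread, so check it against your own tokenizer before relying on it.

```python
def truncate_labels(label_ids, max_length, eos_token_id):
    """Cut label_ids to at most max_length tokens.

    When tokens are cut off, the last kept position is overwritten with
    eos_token_id so the sequence still ends with an end-of-text marker.
    """
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    if len(label_ids) <= max_length:
        return list(label_ids)
    return list(label_ids[: max_length - 1]) + [eos_token_id]


if __name__ == "__main__":
    EOS_ID = 50256  # assumption: last id in the dump from this thread
    labels = list(range(89)) + [EOS_ID]  # 90 ids, like the "got 90" case
    print(len(truncate_labels(labels, 80, EOS_ID)))
```

Truncation alone only caps the length. Shorter sequences in the same batch still need padding before they can be stacked into one tensor.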

5 Mar 2023, 07:49

Even after filtering out the transcripts longer than 300 characters, the error remains: "ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length."

NB: padding and truncation are already activated.

When I find out the reason, I will let you know.
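
One common cause of this ValueError is that the label lists inside a batch still have different lengths when they are turned into a tensor. The sketch below pads them to the longest one in the batch. The helper pad_labels is hypothetical, and using -100 as the fill value is an assumption based on the default ignore_index of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # positions with this value are skipped by the loss by default


def pad_labels(batch_labels, pad_value=IGNORE_INDEX):
    """Right-pad every label list to the length of the longest one in the batch."""
    if not batch_labels:
        return []
    max_len = max(len(labels) for labels in batch_labels)
    return [
        list(labels) + [pad_value] * (max_len - len(labels))
        for labels in batch_labels
    ]


if __name__ == "__main__":
    batch = [[1, 2, 3], [4, 5], [6]]
    for row in pad_labels(batch):
        print(row)
```

After this step every row has the same length, so the batch can be converted to a single tensor in a data collator.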

5 Mar 2023, 10:05

This is one of the batches where I have the problem. The length of the input_features is 80 (correct), but the length of the labels is 85 (!= label_ids). So if you have the same problem, you need to adjust the preparation function.

(80, 3000)
85
{'audio_id': '0c73853b-d758-4ab1-bd3f-70b7029c4ae3/38f602fc33a9375738a4a6b5645047fc',
 'audio': {'path': '0c73853b-d758-4ab1-bd3f-70b7029c4ae3/38f602fc33a9375738a4a6b5645047fc.wav',
  'array': tensor([2.8217e-04, 4.3279e-04, 1.1999e-05,  ..., 2.4362e-03, 5.1772e-03,
          4.6573e-03], dtype=torch.float64),
  'sampling_rate': 16000},
 'transcript': 'Displaced fracture of proximal phalanx of left little finger, initial. TABLET, ORAL PERPHENAZINE AND AMITRIPTYLINE HYDROCHLORIDE, AMITRIPTYLINE HYDROCHLORIDE; PERPHENAZINE, 25MG;2MG. Pedestrian on other standing micro-mobility pedestrian conveyance inju',
 'input_features': array([[-0.50099504, -0.36960256, -0.2437787 , ..., -0.6693915 ,
         -0.6693915 , -0.6693915 ],
        [-0.35442793, -0.26131332, -0.05092454, ..., -0.6693915 ,
         -0.6693915 , -0.6693915 ],
        [-0.54630923, -0.40773106,  0.07135439, ..., -0.6693915 ,
         -0.6693915 , -0.6693915 ],
        ...,
        [-0.6693915 , -0.6693915 , -0.6693915 , ..., -0.6693915 ,
         -0.6693915 , -0.6693915 ],
        [-0.6693915 , -0.6693915 , -0.6693915 , ..., -0.6693915 ,
         -0.6693915 , -0.6693915 ],
        [-0.6693915 , -0.6693915 , -0.6693915 , ..., -0.6693915 ,
         -0.6693915 , -0.6693915 ]], dtype=float32),
 'labels': [50257, 50258, 50358, 50362, 7279, 21820, 31846, 286, 14793, 4402, 872, 25786, 87, 286, 1364, 1310, 7660, 11, 4238, 13, 43679, 51, 11, 6375, 1847, 19878, 11909, 1677, 22778, 8881, 5357, 3001, 2043, 32618, 9936, 24027, 43624, 7707, 46, 3398, 43, 1581, 14114, 11, 3001, 2043, 32618, 9936, 24027, 43624, 7707, 46, 3398, 43, 1581, 14114, 26, 19878, 11909, 1677, 22778, 8881, 11, 1679, 20474, 26, 17, 20474, 13, 13457, 395, 4484, 319, 584, 5055, 4580, 12, 39949, 879, 22382, 13878, 590, 2529, 84, 50256]}
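
To locate samples like the one above before training, a small check over the prepared label lists can help. This is only a sketch with a hypothetical helper, find_overlong_labels. Note also that, as far as I know, the 80 in the (80, 3000) shape of input_features is the number of mel-frequency bins rather than a sequence length, so the labels do not need to match it; the limit to compare against is whatever max_length the preparation function uses.

```python
def find_overlong_labels(label_lists, limit):
    """Return (index, length) pairs for every label list longer than limit."""
    return [
        (index, len(labels))
        for index, labels in enumerate(label_lists)
        if len(labels) > limit
    ]


if __name__ == "__main__":
    # Toy data standing in for the 'labels' column of the prepared dataset.
    prepared = [[0] * 80, [0] * 85, [0] * 12]
    for index, length in find_overlong_labels(prepared, limit=80):
        print(f"sample {index} has {length} label ids")
```

Any index reported here points at a sample that the preparation function should truncate or that should be dropped.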
5 Mar 2023, 13:52