I think we finally found the fix for this error.
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1384, in forward
    return_dict=return_dict,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1249, in forward
    return_dict=return_dict,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1001, in forward
    hidden_states = inputs_embeds + positions
RuntimeError: The size of tensor a (449) must match the size of tensor b (448) at non-singleton dimension 1

Removing all samples in train/dev where the number of characters in the transcript is over 300 should take care of this problem. The mismatch happens because Whisper's decoder has only 448 positional embeddings (max_target_positions), so any label sequence that tokenizes to more than 448 tokens overflows them; capping transcripts at 300 characters keeps the token count below that limit in practice.
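For reference, a minimal sketch of that filter, assuming a Hugging Face `datasets.DatasetDict` with the transcript in a "sentence" column (rename the column and splits to match your data):

```python
# Minimal sketch: drop samples whose transcript exceeds 300 characters.
# Assumes Hugging Face `datasets` with the transcript in a "sentence" column.
MAX_TRANSCRIPT_CHARS = 300

def transcript_is_short(example):
    return len(example["sentence"]) <= MAX_TRANSCRIPT_CHARS

dataset["train"] = dataset["train"].filter(transcript_is_short)
dataset["validation"] = dataset["validation"].filter(transcript_is_short)
```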
Thanks @intron,
When I used truncation with max_length = 80, the error changed to "expected sequence of length 80 at dim 1 (got 90)", and I'm still looking into it. I think it would be better to remove that truncation than to drop all samples over 300 characters.
Even after filtering out transcripts longer than 300 characters, the error remains: "ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length."
NB: padding and truncation are already activated.
When I find out the reason, I will let you know.
This is one of the batches where I hit the problem: the length of input_features is 80 (correct), but the labels length is 85 (!= label_ids), so if you have the same problem, you need to adjust the preparation function; see the sketch after the sample below.
{'audio_id': '0c73853b-d758-4ab1-bd3f-70b7029c4ae3/38f602fc33a9375738a4a6b5645047fc', 'audio': {'path': '0c73853b-d758-4ab1-bd3f-70b7029c4ae3/38f602fc33a9375738a4a6b5645047fc.wav',
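For anyone hitting the same mismatch, here is a minimal sketch of a preparation function and data collator that tokenize labels without a fixed max_length and pad them per batch instead (the usual pattern for Whisper fine-tuning). The "openai/whisper-small" checkpoint and the "sentence" column name are assumptions; adjust them to your setup.

```python
# Minimal sketch, assuming the usual Whisper fine-tuning setup:
# tokenize labels WITHOUT a fixed max_length, then pad per batch in the collator.
from dataclasses import dataclass

import torch
from transformers import WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-small")  # assumed checkpoint

def prepare_dataset(batch):
    audio = batch["audio"]
    # Log-Mel input features; the feature extractor always yields 80 mel bins.
    batch["input_features"] = processor.feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    # No padding/truncation here: labels stay variable-length until collation.
    batch["labels"] = processor.tokenizer(batch["sentence"]).input_ids
    return batch

@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    processor: WhisperProcessor

    def __call__(self, features):
        # Pad input features into a uniform tensor.
        input_features = [{"input_features": f["input_features"]} for f in features]
        batch = self.processor.feature_extractor.pad(input_features, return_tensors="pt")
        # Pad labels to the longest sequence in the batch; mask padding with -100
        # so it is ignored by the loss.
        label_features = [{"input_ids": f["labels"]} for f in features]
        labels_batch = self.processor.tokenizer.pad(label_features, return_tensors="pt")
        labels = labels_batch["input_ids"].masked_fill(
            labels_batch.attention_mask.ne(1), -100
        )
        # Strip the BOS token if the tokenizer already prepended it;
        # the model re-adds it during training.
        if (labels[:, 0] == self.processor.tokenizer.bos_token_id).all().cpu().item():
            labels = labels[:, 1:]
        batch["labels"] = labels
        return batch
```

With this split, the preparation step never forces labels to a fixed length, so input_features and labels can no longer disagree the way the batch above shows.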