I have tried to change all the parameters, but the same error always occurs at the same step.
Can I try other models as well, or should we just stick with Whisper?
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1384, in forward
    return_dict=return_dict,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1249, in forward
    return_dict=return_dict,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/transformers/models/whisper/modeling_whisper.py", line 1001, in forward
    hidden_states = inputs_embeds + positions
RuntimeError: The size of tensor a (449) must match the size of tensor b (448) at non-singleton dimension 1
Have you tried wav2vec2? There are a ton of pretrained models you can try out. Check this one out
https://huggingface.co/blog/fine-tune-wav2vec2-english
@intron is right. You can find all the pretrained models here https://huggingface.co/models?pipeline_tag=automatic-speech-recognition
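In case it helps, here's a minimal sketch of trying one of those hub checkpoints with the transformers pipeline. The checkpoint name and audio path are just placeholders, swap in whichever model you want to test:

```python
from transformers import pipeline

# any checkpoint under the automatic-speech-recognition tag should work here;
# facebook/wav2vec2-base-960h is just one example
asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

# "sample.wav" is a placeholder path to a local audio clip
result = asr("sample.wav")
print(result["text"])
```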
That's why I asked: I used a Wav2Vec2 model on an African Mozilla Common Voice dataset before and it gave me promising results. I wanted to know if you prefer a specific architecture.
I think we finally found the fix for this error: if you remove all samples in train/dev where the transcript is over 300 characters, that should take care of the problem.
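For anyone else hitting this: the 448 in the traceback is Whisper's decoder position limit (max_target_positions), so any transcript that tokenizes to more labels than that breaks the inputs_embeds + positions add. A rough sketch of the filtering step with datasets, assuming a Common Voice-style setup; the dataset name and the "sentence" column are placeholders for whatever you're actually using:

```python
from datasets import load_dataset

MAX_TRANSCRIPT_CHARS = 300  # threshold suggested above

# placeholder dataset/config; swap in your own train/dev data
dataset = load_dataset("mozilla-foundation/common_voice_11_0", "sw")

def short_enough(example):
    # keep only samples whose transcript stays under the character cap
    return len(example["sentence"]) <= MAX_TRANSCRIPT_CHARS

dataset["train"] = dataset["train"].filter(short_enough)
dataset["validation"] = dataset["validation"].filter(short_enough)
```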