Hi,
If you haven't resolved the problem yet, I would suggest reducing the batch_size and the tokenizer max_length (any value below 80 should work with a batch_size of 16 or lower).
Here is something I like to do:
# Token length of each Yoruba sentence (including special tokens)
train['length'] = train['Yoruba'].apply(lambda x: len(tokenizer.encode(x)))
train['length'].hist()      # look at the distribution of lengths
train['length'].describe()  # summary stats: mean, max, percentiles
# Do the same for English, and choose your max_length accordingly.
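Once you have picked a value, pass it to the tokenizer when preparing your inputs. Here is a minimal sketch, assuming a Hugging Face tokenizer and the same column names as above; MAX_LENGTH = 64 is just an illustrative value below 80, not a recommendation:

MAX_LENGTH = 64  # illustrative; pick it from the histogram/percentiles above

# Truncate anything longer than MAX_LENGTH and pad shorter sentences,
# so every batch has a fixed, smaller sequence length.
model_inputs = tokenizer(
    train['Yoruba'].tolist(),
    max_length=MAX_LENGTH,
    truncation=True,
    padding='max_length',
)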
It is also important to keep in mind that a max_length that is too low may hurt your BLEU score, since longer sentences get truncated and lose content. But I guess that is a compromise you have to accept, depending on your training hardware.
Thanks