Hello everyone,
I'm quite new to transformers. I started this competition to learn more about them. Thanks to Hugging Face's transformers library, things are pretty simple to set up.
I've tried BERT (base-uncased), RoBERTa (base and large), and DistilBERT, and so far the best score I got (with RoBERTa) is around 0.37...
The classifier I put on top of them is a simple linear layer with 4 outputs (multi-label classification).
I also noticed that these models easily overfit on the dataset, within a few epochs (like 4 or 5).
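For reference, a head like the one I described might look like this minimal PyTorch sketch (the hidden size, dropout rate, and use of the pooled output are assumptions about a typical setup, not my exact code):

```python
import torch
import torch.nn as nn

class MultiLabelHead(nn.Module):
    """Hypothetical multi-label head: one linear layer over the encoder's
    pooled output (e.g. BERT's [CLS] vector, hidden size 768 assumed)."""

    def __init__(self, hidden_size=768, num_labels=4, dropout=0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, pooled_output):
        # Raw logits; for multi-label, each label gets its own sigmoid.
        return self.classifier(self.dropout(pooled_output))

head = MultiLabelHead()
logits = head(torch.randn(8, 768))  # batch of 8 pooled vectors
# BCEWithLogitsLoss = independent sigmoid per label, not a softmax
loss = nn.BCEWithLogitsLoss()(logits, torch.zeros(8, 4))
```

The key point for multi-label is the loss: `BCEWithLogitsLoss` treats the 4 outputs as independent, whereas a softmax + cross-entropy would force them to compete.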
What is your experience with transformers?
Do you have any tips/best practices to share, in particular on small datasets with short phrases?
Let's learn together :)
A possible idea could be to leverage a transformer to generate synthetic data and then train on it.
Have you tried this approach? I'm doubtful about the quality of the generated results given the small size of the dataset.
No, I have not tried it. Also, are you using simpletransformers? You could try language-modelling fine-tuning first.
I did not try LM fine-tuning. I'll look into it, thank you!
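For anyone else looking into it: the core of masked-LM fine-tuning is the BERT-style dynamic masking applied to the inputs before computing the LM loss. A rough sketch of just that masking step in plain PyTorch (the mask token id and vocab size are BERT-ish assumptions, not from this thread):

```python
import torch

def mask_tokens(input_ids, mask_token_id=103, vocab_size=30522,
                mlm_prob=0.15):
    """BERT-style masking sketch: select ~15% of tokens; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged."""
    labels = input_ids.clone()
    masked = torch.bernoulli(torch.full(input_ids.shape, mlm_prob)).bool()
    labels[~masked] = -100  # -100 is ignored by the LM loss

    input_ids = input_ids.clone()
    # 80% of the selected tokens -> [MASK]
    replace = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & masked
    input_ids[replace] = mask_token_id
    # half of the remainder (10% overall) -> random token
    rand = (torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool()
            & masked & ~replace)
    input_ids[rand] = torch.randint(vocab_size, input_ids.shape)[rand]
    return input_ids, labels
```

In practice the transformers library handles this for you (its data collator for language modelling does exactly this kind of masking), so you wouldn't write it by hand; this is just to show what the fine-tuning objective sees.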
Are you using the simpletransformers package?
No, transformers only. Is simpletransformers better?
It provides a nice wrapper around transformers, which is good for a noob like me.
Any tips on how I should choose my batch size for this small dataset?