Model - A single fine-tuned RoBERTa-base, using the Hugging Face Transformers library with PyTorch.
NB: Skipping text preprocessing/cleaning performed better than applying it. The only step needed was removing duplicated texts.
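A minimal sketch of that single cleaning step, assuming the data is in a pandas DataFrame with a `text` column (the column name is my assumption, not from the original post):

```python
import pandas as pd

def drop_duplicate_texts(df: pd.DataFrame, text_col: str = "text") -> pd.DataFrame:
    # The only cleaning performed: drop rows whose text is an exact duplicate.
    # Keeps the first occurrence; no lowercasing, punctuation stripping, etc.
    return df.drop_duplicates(subset=[text_col]).reset_index(drop=True)
```

Everything else (casing, punctuation, URLs) is left untouched, since further cleaning hurt the score.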
Hyperparameters
5 CV folds run 5 times with different seeds for sampling the data, for a total of 25 runs. This was done to reduce variance in the predictions, as the dataset was very small. Test-set predictions were made within each fold and then averaged over all runs.
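The seeded repeated-CV scheme above can be sketched as follows. The `train_model` and `predict` callables are hypothetical placeholders for the RoBERTa fine-tuning and inference steps, which are not shown in the original post:

```python
import numpy as np
from sklearn.model_selection import KFold

def run_seeded_cv(X, y, X_test, train_model, predict,
                  n_folds=5, seeds=(0, 1, 2, 3, 4)):
    """5-fold CV repeated with 5 different seeds (25 runs in total).

    Each fold's model predicts on the test set; the 25 prediction
    arrays are averaged at the end to reduce variance.
    """
    test_preds = []
    for seed in seeds:
        # A different seed reshuffles the fold assignment each repeat.
        kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
        for train_idx, val_idx in kf.split(X):
            model = train_model(X[train_idx], y[train_idx],
                                X[val_idx], y[val_idx])
            test_preds.append(predict(model, X_test))
    # Average the 25 per-fold test predictions.
    return np.mean(test_preds, axis=0)
```

With only a small dataset, repeating the folds under several seeds smooths out the luck of any single train/validation split before the final average.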
Link to code