Hi everyone, a lot of folks are at the top of the leaderboard because of the hard work they put into the just-concluded hackathon. That means you can also make the top ten if you put in the work, so I won't be sharing my notebook. Instead, here are some tips on how you can beat my current score on the leaderboard.
0) Do proper cleaning of the text data.
1) Use a stratified split for training and validation, since the classes are quite imbalanced.
2) Read up on pseudo labelling and try applying it here.
3) Avoid overfitting; it's easy to get a high score that way, so don't trust it too quickly.
4) ML approaches are good enough, but you may want to try deep learning approaches.
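To make tips 0 to 2 a bit more concrete, here is a rough sketch in plain Python. It is not my notebook or my actual pipeline, just an illustration. The function names (`clean_text`, `stratified_split`, `select_pseudo_labels`), the cleaning rules (which assume English/ASCII text), the 0.2 validation fraction and the 0.9 confidence threshold are all assumptions you should tune for your own data.

```python
import random
import re
from collections import defaultdict


def clean_text(text):
    """Lowercase, drop URLs and non-alphanumeric characters, collapse whitespace.

    Assumes English/ASCII text; adjust the patterns for other languages.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def stratified_split(labels, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx) so each class keeps roughly the same
    share in the training and validation parts."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        # Keep at least one validation row per class when the class has 2+ rows.
        n_val = max(1, round(len(idx) * val_fraction)) if len(idx) > 1 else 0
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return sorted(train_idx), sorted(val_idx)


def select_pseudo_labels(probabilities, threshold=0.9):
    """Given per-class probabilities predicted on unlabeled rows, keep
    (row, predicted_class) pairs whose top probability is >= threshold."""
    selected = []
    for row, probs in enumerate(probabilities):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((row, best))
    return selected
```

The idea for pseudo labelling is: train on the labeled data, predict on the test set, add only the confident predictions back to the training data with their predicted class, then retrain. If you use scikit-learn, `train_test_split(..., stratify=y)` gives you the stratified split without writing it by hand.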
Expecting to see y'all above my score on the leaderboard... 😉
Thank you @Professor for sharing your approach.