
Machine Learning Hackathon by AMLD Africa

Helping Morocco
$1 000 USD
Challenge completed ~4 years ago
Natural Language Processing
Classification
171 joined
78 active
Start: Aug 27, 21
Close: Sep 05, 21
Reveal: Sep 06, 21
Mansoor
Text classification with Simple Transformers
23 Sep 2021, 10:17 · 2

Hi, here's a sample notebook for using the Simple Transformers library for text classification: https://github.com/FaatimahM/Tweet_classification_simpletransformers. This is by no means pro level. I'm relatively new to data science and programming in Python, and I only learnt about this library when I enrolled in this hackathon. I hope it will be useful to someone on their data science journey :)

Discussion 2 answers
Lone_Wolf
University of Ghana

Hey @Mansoor, thanks for sharing this.

I'd like to know how you handled all the unknowns in the dataset, and even the violence phrases.

Also, I noticed there was more than one language in the dataset. Did you handle this as well, or did you feed them into the model as they were?

23 Sep 2021, 10:54
Mansoor

Hey @ZzyZx, with these models I found that applying no pre-processing produced the best results, i.e. feeding the raw data into the model. The pre-processing I tried for cleaning the text was removing 'RT', '&amp', '', '' and '#'. I chose not to remove stop words because I read somewhere that they may carry important information needed by the language models.
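
For anyone who wants to try the same comparison, here is a rough sketch of that kind of cleaning step. It is not the code from the notebook: the function name `clean_tweet` and the sample tweet are made up for illustration, and it only covers 'RT', '&amp' and '#' (the two items that came through blank above are left out because I can't tell what they were).

```python
import re


def clean_tweet(text: str) -> str:
    """Hypothetical helper: strip 'RT', '&amp' and '#' from a tweet."""
    # Drop the retweet marker only when it stands alone as a word.
    text = re.sub(r"\bRT\b", " ", text)
    # Drop the HTML-escaped ampersand, with or without the trailing ';'.
    text = text.replace("&amp;", " ").replace("&amp", " ")
    # Drop the hash symbol but keep the hashtag word itself.
    text = text.replace("#", "")
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


# Made-up example tweet, not taken from the competition data.
raw = "RT @user: Floods &amp; fires #Morocco"
print(clean_tweet(raw))
```

Training once on the raw text and once on the cleaned text, then comparing the validation scores, is one way to check whether the cleaning helps or hurts for a given model.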

Just a general question: can anyone please advise which metrics or methods to look at when trying to interpret what a model is actually doing?