9 Jul 2020, 06:49

Meet the winners of the #ZindiWeekendz To Vaccinate or Not to Vaccinate: It’s not a Question Challenge

Zindi is excited to introduce the winners of the #ZindiWeekendz To Vaccinate or Not to Vaccinate: It’s not a Question Challenge. In just 60 hours, the virtual hackathon attracted 222 data scientists from across the continent and around the world, with 130 placing on the leaderboard.

The objective of this challenge was to develop a machine learning model to assess whether a Twitter post related to vaccinations was positive, neutral, or negative. This solution can help governments and other public health actors monitor public sentiment towards COVID-19 vaccinations and help improve public health policy, vaccine communication strategies, and vaccination programs across the world.

Although it may be many months before we see COVID-19 vaccines available on a global scale, it is important to monitor public sentiment towards vaccinations now, and especially in the future when COVID-19 vaccines are offered to the public. Anti-vaccination sentiment could pose a serious threat to global efforts to bring COVID-19 under control in the long term.

The winners of this challenge are: devnikhilmishra from India in 1st place, Muhamed_Tuo from Côte d'Ivoire in 2nd place, and Rajat_Ranjan from India in 3rd place.

This hackathon will be re-opened as a knowledge competition.

A special thank you to the 1st and 2nd place winners for their insights.

Nikhil Kumar Mishra (1st place)

Zindi handle: devnikhilmishra

Where are you from? India

Tell us a bit about yourself?

A final-year student with more than 2 years of experience in learning and applying Data Science, I have always loved participating in Data Science competitions. Competitions give you an opportunity to try out new ideas, or discover them, and see if they work or not. I believe Data Science is far from saturation, and new ideas help uncover its potential and use cases.

Tell us about the approach you took.

Transformers are the state of the art for NLP tasks, and Simple Transformers is a library that provides a high-level interface to them. I used simpletransformers to create models with the RoBERTa architecture and finally did a weighted blend to get the final solution. I also treated the task as a regression problem with RMSE as the metric, whereas most fellow competitors treated it as a classification problem.
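For readers who want to try this setup, a minimal sketch of the regression formulation with simpletransformers might look like the following; the file name, column names, and hyperparameters are illustrative assumptions, not Nikhil's exact configuration.

    # Minimal sketch: RoBERTa as a regressor with simpletransformers.
    # Assumes a CSV with "text" and "labels" columns, where labels are
    # sentiment scores (e.g. -1, 0, 1) treated as continuous targets.
    import pandas as pd
    from simpletransformers.classification import ClassificationModel

    train_df = pd.read_csv("train.csv")[["text", "labels"]]
    train_df["labels"] = train_df["labels"].astype(float)  # regression targets

    model = ClassificationModel(
        "roberta",
        "roberta-base",
        num_labels=1,                 # single continuous output
        args={"regression": True,     # optimise squared error, not cross-entropy
              "num_train_epochs": 3},
    )
    model.train_model(train_df)

    preds, _ = model.predict(["Vaccines saved my family."])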

What were the things that made the difference for you that you think others can learn from?

Blending and ensembling different solutions always helps. And weighting the blend by each solution's individual quality, rather than taking a simple average of all your solutions, gives a much better overall result (a small sketch follows below). Competitions are a bit different from real-world work, where the focus is on a single robust model; here, many decent models built with different approaches can outperform one strong model. So always try to create different kinds of models with different ideas.
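As an illustration of the weighting point, the blend can be computed in a few lines; deriving the weights from each model's validation RMSE, as below, is one reasonable scheme rather than the exact one used by the winner.

    # Weighted blending sketch: weight each model's predictions by a score
    # derived from its validation RMSE (lower RMSE -> larger weight).
    import numpy as np

    def weighted_blend(preds, rmses):
        """preds: list of 1-D prediction arrays; rmses: one validation RMSE per model."""
        weights = 1.0 / np.asarray(rmses)   # inverse-error weighting
        weights /= weights.sum()            # normalise weights to sum to 1
        return sum(w * p for w, p in zip(weights, preds))

    blend = weighted_blend(
        [np.array([0.9, -0.2]), np.array([0.7, 0.1])],  # two models' test predictions
        rmses=[0.45, 0.52],                              # their validation scores
    )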

Tuo Muhamed (2nd place)

Zindi handle: Muhamed_Tuo

Where are you from? Côte d'Ivoire

Tell us a bit about yourself?

I'm a third-year student in Math and CS. I'm wildly excited about all ML topics, but I mostly favor NLP. I read about NLP every day, and I hope to bring my own small contribution to the NLP world.

Tell us about the approach you took.

Solution: My solution is an ensemble of RoBERTa-large models, where each model learns from a different fold of the dataset (see the sketch below). That configuration helped make the final predictions more robust and stable.
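One concrete way to express this fold-per-model pattern is sketched below, reusing the simpletransformers API mentioned earlier rather than Tuo's own training loop; the fold count and model arguments are assumptions for illustration.

    # Sketch of a per-fold ensemble: one RoBERTa-large model is trained on
    # each K-fold training split, and test predictions are averaged.
    import numpy as np
    from sklearn.model_selection import KFold
    from simpletransformers.classification import ClassificationModel

    def fold_ensemble(train_df, test_texts, n_splits=5):
        fold_preds = []
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)
        for train_idx, _ in kf.split(train_df):
            model = ClassificationModel(
                "roberta", "roberta-large", num_labels=1,
                args={"regression": True, "overwrite_output_dir": True},
            )
            model.train_model(train_df.iloc[train_idx])  # fit on this fold's split
            preds, _ = model.predict(test_texts)
            fold_preds.append(preds)
        return np.mean(fold_preds, axis=0)   # averaging stabilises the output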

Approach: When it comes to NLP tasks like classification or regression, RoBERTa seems to always work. So my first attempt was to try roberta-base as a baseline. From there, I slowly improved every part of the training pipeline, from the activation function of the last dense layer to the loss function. When I reached the point where I couldn't improve it any further, I switched to RoBERTa-large as my main backbone model.
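To make the "last dense layer" concrete: the head being tuned sits on top of the encoder roughly as in this sketch, written with the Hugging Face transformers library. The tanh activation and MSE loss here are placeholders for the choices he experimented with, not his final configuration.

    # Sketch of a RoBERTa backbone with a custom regression head, showing
    # where the tweakable pieces (activation, loss) live in the pipeline.
    import torch.nn as nn
    from transformers import AutoModel

    class RobertaRegressor(nn.Module):
        def __init__(self, backbone="roberta-base"):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(backbone)
            hidden = self.encoder.config.hidden_size
            self.head = nn.Sequential(
                nn.Linear(hidden, hidden),
                nn.Tanh(),              # placeholder for the activation being tuned
                nn.Linear(hidden, 1),   # single sentiment score
            )

        def forward(self, input_ids, attention_mask):
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
            first_token = out.last_hidden_state[:, 0]   # <s> token, RoBERTa's CLS
            return self.head(first_token).squeeze(-1)

    loss_fn = nn.MSELoss()   # the loss function is another tuning knob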

What I could have done: One could simply train different model architectures in the same configuration as above and make an ensemble out of them.

What were the things that made the difference for you that you think others can learn from?

I think it comes down to two factors:

  • First, I started really small. I tried TF-IDF + logistic regression and roberta-base, and my training loop was minimal at the early stage (a baseline sketch follows after this list).
  • Second, I read a lot of resources while building the training pipeline for the transformer models. I can't count the number of times I visited Hugging Face's GitHub and docs pages.
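The baseline from the first point takes only a few lines with scikit-learn; the toy data below stands in for the challenge's actual tweets.

    # Minimal TF-IDF + logistic regression baseline for tweet sentiment.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy examples in place of the real training tweets (labels: -1, 0, 1).
    train_texts = ["vaccines are a scam", "got my shot today, feeling great",
                   "the clinic opens at 9am", "so grateful for this vaccine"]
    train_labels = [-1, 1, 0, 1]

    baseline = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),   # unigram + bigram features
        LogisticRegression(max_iter=1000),
    )
    baseline.fit(train_texts, train_labels)
    print(baseline.predict(["I can't wait for the vaccine!"]))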

What are the biggest areas of opportunity you see in AI in Africa over the next few years?

It is surely in applying ML algorithms to ease everyday difficulties, for example:

  • Time-series apps to help farmers remotely monitor their farms' conditions, or to help forecast natural catastrophes
  • ML-powered healthcare apps for those who are miles away from the nearest health center

What are you looking forward to most about the Zindi community?

I'm looking to learn from others, to help those I can, and, most of all, to contribute to making Africa great again.

What are your thoughts on our winners' feedback? Engage via the Discussion page or leave a comment on social media.