🙌 Meet the winners of UmojaHack #1: SAEON Marine Invertebrates Identification Challenge

Meet the winners · 2 Apr 2020, 06:55 · 7 mins read ·

UmojaHack Africa brought more than 1000 data science students from across Africa to the Zindi platform on 21 March 2020. Of the 342 data scientists from across the continent who signed up for the SAEON Marine Invertebrates Identification Challenge, 46 made it onto the leaderboard. Only the best of the best made it to the top.

The aim of this challenge was to develop an automated image classification solution for photographs of marine invertebrates taken by researchers in South Africa. This solution will substantially reduce manual image processing efforts and enable researchers to detect any changing patterns in marine invertebrates and the fragile ecosystems they live in much faster.

The winners of this challenge are: Rinnqd from Tunisia in 1st place, team HMF_Tunisia (Ashrafmahdhi, Firas_Zarai, HaythemTELLILI) from Tunisia in 2nd place and Team_DeepDivers_Unilag (IrekponorVictor, Yhemmy_DSN, stanleydukor) from Nigeria in 3rd place.

A special thank you to the 1st place and 3rd place winners for sharing some insights into how they succeeded in this challenge.

Firas Baba (1st place)

Zindi handle: Rinnqd

Where are you from? Tunisia

Tell us a bit about yourself.

I am a third-year telecommunications engineering student at Sup'Com and a Kaggle competition master with an interest in computer vision and natural language processing. I have worked with multiple model types, such as auto-encoders, LSTMs, CNNs, transformer models and decision tree models (LGBM, XGBOOST, CATBOOST).

Tell us about the approach you took.

The images were of good quality, although some of them were duplicated. I hashed all the images to get rid of the duplicates, and I kept the bigger images for training. Since the number of images was not large and I did not have time to experiment on the data, I decided to train many models and to use the power of ensembling in my final solution. I trained 2 resnet and 2 efficientnet models. Unfortunately, I didn't have enough time to train densenet and se_resnext models. I think the computational power I had (RTX 2080 and P100) played a big part in my success. Below are the models I trained:

resnet50 with size 224x224, 256x256, 512x512
resnet152 with size 224x224, 350x350
efficientnet b0 with size 224x224, 512x512
efficientnet b4 with size 224x224
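
The duplicate-removal step described above can be sketched roughly as follows. The interview does not say which hash was used, so this sketch assumes a simple average hash computed on grayscale pixel grids (plain nested lists here, to stay dependency-free), and it assumes "kept the bigger images" means keeping the copy with the most pixels. The names `average_hash` and `drop_duplicates` are hypothetical.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale pixels, one list of 0-255 values per row


def average_hash(img: Image, hash_size: int = 8) -> int:
    """Shrink the image to hash_size x hash_size block means, then set one
    bit per cell depending on whether it is brighter than the overall mean.
    Assumes the image is at least hash_size pixels in each dimension."""
    height, width = len(img), len(img[0])
    cells = []
    for i in range(hash_size):
        r0, r1 = i * height // hash_size, (i + 1) * height // hash_size
        for j in range(hash_size):
            c0, c1 = j * width // hash_size, (j + 1) * width // hash_size
            block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def drop_duplicates(images: Dict[str, Image]) -> Dict[str, Image]:
    """Keep one image per hash value, preferring the one with the most pixels."""
    best: Dict[int, str] = {}
    for name, img in images.items():
        key = average_hash(img)
        area = len(img) * len(img[0])
        kept = best.get(key)
        if kept is None or area > len(images[kept]) * len(images[kept][0]):
            best[key] = name
    return {name: images[name] for name in best.values()}
```

With real photographs the pixel grids would come from an image library, and near-duplicates may need a tolerance on the number of differing bits instead of an exact match.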

My final model was a simple average of these models' predictions. Note that I used TTA (test-time augmentation) with 5 predictions for the resnet models (I didn't have enough time to predict with TTA for the efficientnet models).
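
As a rough illustration of averaging with and without TTA, here is a minimal sketch. It assumes each model is a callable returning class probabilities and that TTA means averaging the predictions over several augmented views of the same image. The function names and the toy models in the usage note are hypothetical, not the winner's code.

```python
from typing import Callable, List, Sequence, Tuple

Probs = List[float]  # one probability per class
Model = Callable[[object], Probs]
Augmentation = Callable[[object], object]


def mean_probs(predictions: Sequence[Probs]) -> Probs:
    """Element-wise mean of several probability vectors."""
    n = len(predictions)
    return [sum(column) / n for column in zip(*predictions)]


def tta_predict(model: Model, image: object,
                augmentations: Sequence[Augmentation]) -> Probs:
    """Average one model's predictions over augmented views of the image."""
    return mean_probs([model(aug(image)) for aug in augmentations])


def ensemble_predict(members: Sequence[Tuple[Model, Sequence[Augmentation]]],
                     image: object) -> Probs:
    """Simple average over models; each model brings its own list of views.
    A model without TTA just uses a single identity augmentation."""
    return mean_probs([tta_predict(m, image, augs) for m, augs in members])
```

For example, with `identity = lambda x: x`, a member listed as `(resnet, [identity, flip, crop])` is predicted with TTA, while `(efficientnet, [identity])` is predicted once on the original image.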

My validation strategy was simple: a plain KFold (n=5) was enough. It brought the public score down from 0.3+ (using a simple split) to less than 0.2 (using a 5-fold average). I used the Adam optimizer for all the models (with lr=1e-4) and a StepLR scheduler with 0.6 decay every 7 epochs. Training needed between 10 and 20 epochs to converge, depending on the model and the fold.
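
To make the fold split and the learning-rate schedule concrete, here is a small dependency-free sketch. It assumes unshuffled contiguous folds and a schedule of the form lr = base_lr * gamma^(epoch // step_size), which is how a step scheduler such as PyTorch's `StepLR` with `step_size=7` and `gamma=0.6` behaves. The helper names `kfold_indices` and `step_lr` are hypothetical.

```python
from typing import List, Tuple


def kfold_indices(n_samples: int,
                  n_splits: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return (train, validation) index lists for each fold.
    Folds are contiguous and unshuffled; the first folds absorb any remainder."""
    sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
             for i in range(n_splits)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, val))
        start = stop
    return folds


def step_lr(epoch: int, base_lr: float = 1e-4,
            gamma: float = 0.6, step_size: int = 7) -> float:
    """Learning rate at a given epoch: multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    for epoch in (0, 6, 7, 14):
        print(epoch, step_lr(epoch))
```

Under these assumptions the rate stays at 1e-4 for epochs 0 to 6, drops to 6e-5 at epoch 7 and to 3.6e-5 at epoch 14, so a 10 to 20 epoch run sees one or two decays.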

I also used different learning rates for different models and differential learning rates for fine-tuning efficientnet models (lr_min=1e-5, lr_max=3e-3). I think a single resnet50 model would have put me in second place.
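
One common way to apply differential learning rates is to give the earliest layer group the smallest rate and the classification head the largest, with the groups in between spaced geometrically. The interview does not say how the rates between lr_min and lr_max were assigned, so the geometric spacing and the name `layer_group_lrs` below are assumptions made for illustration.

```python
from typing import List


def layer_group_lrs(lr_min: float, lr_max: float, n_groups: int) -> List[float]:
    """Geometrically spaced learning rates, lowest for the first layer group
    (pre-trained early layers) and highest for the last (the new head)."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if n_groups == 1:
        return [lr_max]
    ratio = (lr_max / lr_min) ** (1 / (n_groups - 1))
    return [lr_min * ratio ** i for i in range(n_groups)]


if __name__ == "__main__":
    # Values quoted in the interview, split over three hypothetical groups.
    print(layer_group_lrs(1e-5, 3e-3, 3))
```

Each rate would then be attached to its own parameter group in the optimizer, so the pre-trained early layers change slowly while the head adapts quickly.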

What were the things that made the difference for you that you think others can learn from?

When you are dealing with a Deep Learning competition (which is the case for this computer vision competition), you won't have enough time to experiment with a lot of ideas. It is better to focus on training different independent models with different parameters (different architectures, learning rates, image sizes, optimizers, schedulers, transformations, with and without TTA). My message here is clear: do not rely on the strength of a single model in DL hackathons (those shorter than 48 hours). Two weak models will outperform a good baseline. Focus on quantity in this case, not quality. Having a good single model is always a good idea, but getting that model will cost you a lot of experiments and hours (especially if you don't have enough computational power).

What are the biggest areas of opportunity you see in AI in Africa over the next few years?

I think Africa can be one of the AI hubs of the world. Africa has only 2% of the world's wealth. That means opportunities are limited for Africans who want to launch a big project in areas such as car manufacturing, chemical products or electronic device manufacturing. This is not the case for the AI industry, as the cost of projects in this industry is lower than in the industries mentioned above. Africa has many young talents in the AI field, and everyone can start developing their own ideas and models with a small laptop at home and some coding and math skills. Talent on our continent is growing fast, and this is something we can see when we look at some of Zindi's competition leaderboards and discussions. We are starting to get smart solutions and inspiring ideas from many African data scientists. Efforts are being made to train computer scientists from African nations, as AI can be used to solve many complex challenges. Unfortunately, AI has been a missed opportunity for improving African lives because the industry was missing out on talent from African nations, who did not have access to the right education. Thanks to many African governments, AI startups and investors, we can now see many opportunities that are offered freely to all African citizens. I think that we will soon see an African revolution based on new tech, especially AI algorithms and systems.

What are you looking forward to most about the Zindi community?

I remember that a year and a half ago, no one had heard about Zindi in Tunisia and many other countries. African data scientists did not have a culture of joining data science competitions and learning from a community. We can see this in the number of African competitors on other platforms like Kaggle, DrivenData and Analytics Vidhya, which is quite low. However, after the launch of Zindi, we can see a huge difference. Many African data scientists are starting to spend time on competitive data science, the quality of the solutions has improved and Africans are now using SOTA algorithms. Not only that, but they even add some new innovative touches to these models. We are also starting to see high-ranked competitors from other platforms, from countries such as Russia, Canada, India and Germany, which is good and helps us learn from the best of the best. As an African student, I can only express my happiness and my satisfaction with the effort that Zindi is making for our AI community.

Obumneme Stanley Dukor (3rd place)

Zindi handle: Team_DeepDivers_Unilag (IrekponorVictor, Yhemmy_DSN, stanleydukor)

Where are you from? Nigeria

Tell us a bit about yourself.

I am a computer engineering student with a huge interest in computer vision.

Tell us about the approach you took.

We chose to use the fastai library, which is built on PyTorch, because of its ease of use for such a short hackathon period. We then wrote a list of pre-trained models that, in previous competitions, tests and practice, had performed better. We loaded the dataset into fastai's dataloader, using a batch size of 16, resizing the images to 299, and applying many data augmentation techniques. Next, we set up the model, using callbacks such as early stopping and saving the best models per epoch. After training some of these models (we didn't have enough time to train all the pre-trained models we wrote down), we made submissions to get an idea of their performance. Finally, we ensembled 4 of the models we trained, which gave our best performance.
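
The early-stopping and best-model callbacks mentioned above boil down to a small piece of bookkeeping. The sketch below shows that logic in plain Python and is not fastai's implementation. The class name `EarlyStopper`, the `patience` and `min_delta` parameters and the validation losses in the usage note are hypothetical.

```python
from typing import Optional, Sequence, Tuple


class EarlyStopper:
    """Track the best validation loss and signal when to stop training."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss: Optional[float] = None
        self.best_epoch: Optional[int] = None
        self.bad_epochs = 0

    def update(self, epoch: int, val_loss: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if self.best_loss is None or val_loss < self.best_loss - self.min_delta:
            # Improvement: this is the point where a checkpoint would be saved.
            self.best_loss, self.best_epoch = val_loss, epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run(val_losses: Sequence[float],
        patience: int = 3) -> Tuple[Optional[int], int]:
    """Replay a list of validation losses; return (best_epoch, last_epoch_run)."""
    stopper = EarlyStopper(patience)
    last_epoch = -1
    for epoch, loss in enumerate(val_losses):
        last_epoch = epoch
        if stopper.update(epoch, loss):
            break
    return stopper.best_epoch, last_epoch
```

Replaying the made-up losses `[0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.3]` with `patience=3` stops after epoch 5 and reports epoch 2 as the best, which shows the trade-off: a short patience saves time in a hackathon but can miss a later improvement.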

What were the things that made the difference for you that you think others can learn from?

Ensembling of models is key.

What are the biggest areas of opportunity you see for AI in Africa over the next few years?

I see the biggest opportunities for Africa and for African data scientists in computer vision, blockchain and reinforcement learning.

What are you looking forward to most about the Zindi community?

I look forward to Zindi becoming well known across the world and the availability of African-based datasets.

This competition was hosted by SAEON.

What are your thoughts on our winners' feedback? Engage via the Discussion page or leave a comment on social media.
