15 Oct 2019, 06:59

Meet the winners of the Xente Fraud Detection Challenge

Get insights from challenge winners! A special thank you to the winners for their generous feedback.

Zindi is excited to announce the winners of the Xente Fraud Detection Challenge. The objective of the competition was to create a machine learning model that accurately identifies fraudulent transactions on Xente's e-commerce platform. The challenge attracted 1098 data scientists from across the continent and around the world, of whom over 547 were on the leaderboard. We are happy to introduce the winners of the competition: Alexander Ageev of Russia, Dmitry Fedotov of Russia and Rose Wambui of Kenya!

Name: Alexander Ageev (1st prize)

Zindi handle: Veegaaa

Where are you from? Russia

Tell us a bit about yourself.

I am a graduate of the Higher School of Economics. I have a wide range of experience, from research to software development. I'm currently working as a Senior Data Scientist at one of the most innovative banks in the world.

Tell us about the approach you took.

I spent time exploring the dataset by hand. I performed thorough EDA that allowed me to understand the patterns of data and construct new unusual features.

What were the things that made the difference for you that you think others can learn from?

It is important to dig into the data and understand everything about it.

Name: Dmitry Fedotov (2nd prize)

Zindi handle: Nikodim

Where are you from? Russia

Tell us a bit about yourself.

I am an Oracle PL/SQL Developer at Norilsk Nickel. My main hobby is machine learning.

Tell us about the approach you took.

I found it best to model on a few main features: CustomerID, day, hour, minute, weekday and the datetime in ordinal format. For training I used only records with a Value of 150000 or more, as this gave good results on the train-test split. Using different models such as XGBoost, CatBoost, LightGBM, KNN, RandomForestClassifier, DecisionTreeClassifier and QuadraticDiscriminantAnalysis, I found that 18 records in the test dataset had a fractional predicted FraudResult (an average value near 0.5). By submitting different values of FraudResult for those 18 records, I found the values that resulted in a perfect score on the public leaderboard.
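As a rough illustration of the feature set described above, here is a minimal pandas sketch. It is not Dmitry's actual code: the toy rows are invented, and the timestamp column name `TransactionStartTime`, the helper `add_time_features` and the output column `date_ordinal` are assumptions made for the example. Only `CustomerID`, `Value`, `FraudResult` and the 150000 threshold come from the description.

```python
import pandas as pd

# Toy rows for illustration only; "TransactionStartTime" is an assumed column name.
df = pd.DataFrame({
    "CustomerID": ["C1", "C2", "C1", "C3"],
    "TransactionStartTime": [
        "2018-11-15T02:18:49Z",
        "2018-11-15T02:19:08Z",
        "2018-12-01T14:05:00Z",
        "2019-01-07T23:59:00Z",
    ],
    "Value": [1000, 200000, 150000, 149999],
    "FraudResult": [0, 1, 0, 0],
})


def add_time_features(frame):
    """Return a copy with day, hour, minute, weekday and ordinal-date columns."""
    out = frame.copy()
    ts = pd.to_datetime(out["TransactionStartTime"], utc=True)
    out["day"] = ts.dt.day
    out["hour"] = ts.dt.hour
    out["minute"] = ts.dt.minute
    out["weekday"] = ts.dt.weekday  # Monday = 0
    # Proleptic Gregorian ordinal of the calendar date
    out["date_ordinal"] = ts.map(lambda t: t.toordinal())
    return out


features = add_time_features(df)

# Keep only large-value records for training, as described in the interview.
large = features[features["Value"] >= 150000]
```

In this toy frame, `large` keeps the two rows whose `Value` is at least 150000. `CustomerID` is left as a raw string here; it would still need encoding before being passed to most models.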

What were the things that made the difference for you that you think others can learn from?

The most important difference was attention to the data. For example, anybody could see that the fraudulent transactions in the train data mostly had large values. That prompted me to focus on large-value transactions only.

What are the biggest areas of opportunity you see for AI in Africa over the next few years?

I think the biggest area of opportunity is in logistics, financial transactions and agriculture.

Name: Rose Wambui (3rd prize)

Zindi handle: rwambu

Where are you from? Kenya

Tell us a bit about yourself.

I'm a Software Engineer and hold a bachelor's degree in Computer Security and Digital Forensics. I'm passionate about using AI to solve problems - particularly cyber threats.

Tell us about the approach you took.

Data exploration and analysis

This helps you understand the type of dataset you are dealing with. You get to know the features, which of them are categorical and which are not, and a lot more.

Data preprocessing

Transform the categorical features. You can use one-hot encoding; in this case I used pandas `get_dummies`. This is suitable for features with low cardinality.
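A small sketch of what this step can look like with pandas `get_dummies`. The column names and values below are made up for illustration and are not taken from Rose's solution.

```python
import pandas as pd

# Hypothetical low-cardinality categorical columns plus one numeric column.
df = pd.DataFrame({
    "ProductCategory": ["airtime", "financial_services", "airtime", "utility_bill"],
    "Channel": ["web", "android", "web", "web"],
    "Amount": [1000.0, -20.0, 500.0, 20000.0],
})

# Each listed column is replaced by one indicator column per distinct value.
encoded = pd.get_dummies(df, columns=["ProductCategory", "Channel"])
```

Each distinct value becomes its own column, so the width of `encoded` grows with the number of categories. That is why this approach is best kept to low-cardinality features.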

Model training

I used both Random Forest Classifier and XGBoost Classifier with Grid Search CV for parameter tuning. Both worked well, but Random Forest was better by 0.02.
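For readers unfamiliar with this setup, the sketch below shows grid search over a Random Forest with scikit-learn. It is only an illustration under assumptions: the data is synthetic, the parameter grid is arbitrary, and F1 is assumed as the scoring metric. The interview does not state the grid or the metric used, and the XGBoost half of the comparison is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for a fraud dataset (about 10% positives).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grid; these values are assumptions, not the ones used in the challenge.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",  # assumed metric
    cv=3,
)
search.fit(X_train, y_train)

best_params = search.best_params_
test_score = search.score(X_test, y_test)  # F1 of the refit best model
```

Passing another estimator and its own grid to `GridSearchCV` with the same `scoring` and `cv` settings gives a like-for-like comparison between models.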

What were the things that made the difference for you that you think others can learn from?

Data preprocessing and feature engineering - knowing how to transform the features given and generating new ones.

What are the biggest areas of opportunity you see for AI in Africa over the next few years?

Africa has great potential and resources given the right leadership. AI can be used in different areas which include but are not limited to:
Agriculture - to maximise produce, detect pests and diseases, and predict weather patterns.
Health - for faster detection of diseases.
Digital forensics - fraud detection, differentiating between fake and real evidence, steganography, etc.

This competition was hosted by Xente (xente.co) and sponsored by Innovation Village (www.innovationvillage.co.ug) and insight2impact (i2ifacility.org).

What are your thoughts on our winners' feedback? Engage via the Discussions page or leave a comment on social media.