Xente Fraud Detection Challenge
$4,500 USD
Accurately classify the fraudulent transactions from Xente's e-commerce platform
20 May–22 September 2019 23:59
1027 data scientists enrolled, 508 on the leaderboard
Any tips and tricks to handle imbalanced data?
published 28 Jun 2019, 15:09

I am new to data science and want to learn how to handle imbalanced data. How do I do that?


I tried SMOTE, but the results were very bad.

Hi jayesh! There are quite a number of ways to deal with imbalanced datasets. Some of these are:

1. Use of SMOTE

2. Use of the NearMiss algorithm

3. Random Sampling (Undersampling and/or Oversampling)

4. Anomaly Detection

5. Use of certain ensemble learning models.

Since SMOTE gave you low scores, I would advise you to check your features very carefully; something may be fishy.
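To make option 3 from the list above (random sampling) a bit more concrete, here is a minimal sketch using only the Python standard library. The data is made up, and the helper names `group_by_label` and `resample` are hypothetical, not from any library. It assumes you resample only the training split, so that validation scores still reflect the real class balance.

```python
import random
from collections import Counter


def group_by_label(rows, labels):
    """Collect rows into one list per class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def resample(rows, labels, mode, seed=0):
    """Balance classes by random under- or oversampling.

    mode="under": shrink every class to the size of the rarest class.
    mode="over":  grow every class to the size of the largest class
                  by duplicating randomly chosen rows.
    """
    rng = random.Random(seed)
    groups = group_by_label(rows, labels)
    sizes = [len(members) for members in groups.values()]
    target = min(sizes) if mode == "under" else max(sizes)

    pairs = []
    for label, members in groups.items():
        if len(members) >= target:
            # Enough rows already: pick `target` of them without replacement.
            chosen = rng.sample(members, target)
        else:
            # Too few rows: keep all of them and add random duplicates.
            extra = [rng.choice(members) for _ in range(target - len(members))]
            chosen = members + extra
        pairs.extend((row, label) for row in chosen)

    rng.shuffle(pairs)
    new_rows = [row for row, _ in pairs]
    new_labels = [label for _, label in pairs]
    return new_rows, new_labels


# Made-up training data: 95 legitimate rows (0) and 5 fraudulent rows (1).
X_train = [[float(i)] for i in range(100)]
y_train = [0] * 95 + [1] * 5

X_under, y_under = resample(X_train, y_train, mode="under")
X_over, y_over = resample(X_train, y_train, mode="over")

print(Counter(y_under))
print(Counter(y_over))
```

Undersampling throws away most of the majority class, while oversampling only repeats the few fraud rows you already have, so neither adds new information. If the features carry little signal, resampling alone is unlikely to rescue the score.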

I tried SMOTE and it didn't work!

On the anomaly detection with autoencoders front, I am looking at this document: https://www.cs.ru.nl/bachelors-theses/2018/Tom_Sweers___4584325___Autoencoding_credit_card_fraude.pdf

I am trying it just for kicks, because I think this is a very signal-poor dataset.
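For anyone curious what the reconstruction-error idea looks like in code, here is a rough sketch. To be clear, this is not a neural autoencoder: it swaps in a linear stand-in (PCA via NumPy's SVD) that "encodes" to one dimension and "decodes" back. The data is synthetic and the threshold choice (99th percentile of errors on normal rows) is just an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: "normal" rows lie close to a line in 3-D space.
t = rng.normal(size=(500, 1))
direction = np.array([[1.0, 2.0, -1.0]])
normal = t @ direction + 0.05 * rng.normal(size=(500, 3))

# Two hand-placed "anomalies" that sit far away from that line.
anomalies = np.array([[3.0, -3.0, 3.0], [-4.0, 4.0, 4.0]])

# Fit on normal rows only: centre the data and take the top principal direction.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:1]  # 1-D bottleneck


def reconstruction_error(x):
    """Squared error between each row and its encode-then-decode reconstruction."""
    centred = x - mean
    encoded = centred @ components.T   # project down to the bottleneck
    decoded = encoded @ components     # map back to the original space
    return np.sum((centred - decoded) ** 2, axis=1)


# Flag rows whose error exceeds the 99th percentile seen on normal data.
threshold = np.quantile(reconstruction_error(normal), 0.99)
flags = reconstruction_error(anomalies) > threshold

print(flags)
```

The model only ever sees normal rows, so anything it reconstructs badly gets flagged. Whether that separates fraud from legitimate transactions on a real dataset depends on how much signal the features carry.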