Absa Customer Income Prediction Challenge

Helping South Africa
$5 000 USD
Completed (~3 years ago)
Prediction
341 joined
54 active
Start: Nov 29, 22
Close: Feb 26, 23
Reveal: Feb 26, 23
About

The transaction data covers a period of 14 months, running from the start of July 2021 to the end of August 2022. There are 5144 customers and 46926 unique accounts, with some customers having more than one account. The number of transactions per customer ranges from 1 to just over 2 000 for the 14-month period.

The train set contains 3600 customers along with their declared net income and the test data contains the remaining 1544 customers with the declared income excluded.

Along with the train and test files is the transaction history for each customer which details the various transactions each customer engaged in over the entire period of recording. There are also data files containing demographic information on each customer as well as files describing some of the categorical variables related to the customers.
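
Because each customer has anywhere from 1 to about 2 000 transactions, a common first step is to collapse the transaction history into one row of features per customer. The sketch below shows that idea using only the Python standard library. The AMOUNT column and the sample rows are hypothetical stand-ins, since the actual columns of transactions.csv are not listed here; only CUSTOMER_IDENTIFIER is a name taken from the challenge text.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-in for transactions.csv.
# ASSUMPTION: AMOUNT is a hypothetical column name; the real file may differ.
SAMPLE_TRANSACTIONS = """CUSTOMER_IDENTIFIER,AMOUNT
c1,100.0
c1,-40.0
c2,250.0
"""


def aggregate_transactions(rows):
    """Collapse transaction rows into one feature dict per customer."""
    features = defaultdict(
        lambda: {"n_transactions": 0, "total_in": 0.0, "total_out": 0.0}
    )
    for row in rows:
        amount = float(row["AMOUNT"])
        customer = features[row["CUSTOMER_IDENTIFIER"]]
        customer["n_transactions"] += 1
        if amount >= 0:
            customer["total_in"] += amount    # money received
        else:
            customer["total_out"] += -amount  # money spent, stored as a positive total
    return dict(features)


customer_features = aggregate_transactions(
    csv.DictReader(io.StringIO(SAMPLE_TRANSACTIONS))
)
```

With the real data, the `io.StringIO` stand-in would be replaced by an open file handle for transactions.csv, and the resulting features could be joined to the demographic fields from customer.csv on the customer identifier.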

The objective of this challenge is to create a machine learning solution to determine a customer’s income based on their transaction history over 14 months. Your model can make use of all the data provided with the target being the declared net income from the train file.
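
Before building a full model, it can help to have a trivial reference point to compare against. The following is a minimal sketch of one such baseline, which predicts the median declared net income of the training customers for every test customer. The income values and identifiers are made up for illustration, and the median is an assumed choice; the evaluation metric is not stated in this description, so a different constant may suit it better.

```python
import statistics

# Made-up training rows: (CUSTOMER_IDENTIFIER, DECLARED_NET_INCOME).
# ASSUMPTION: these values are illustrative only, not taken from Train.csv.
train = [("c1", 12000.0), ("c2", 8000.0), ("c3", 30000.0)]
test_ids = ["c4", "c5"]


def median_baseline(train_rows, ids):
    """Predict the training-set median income for every given customer id."""
    median_income = statistics.median(income for _, income in train_rows)
    return {customer_id: median_income for customer_id in ids}


predictions = median_baseline(train, test_ids)
```

Any model trained on transaction and demographic features should at least beat this constant prediction on a held-out part of the train set.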

Variable Definitions

  • employment_status.csv - contains descriptions of the various categories of customer employment status
  • income_group.csv - describes how customers are grouped based on their income
  • Test.csv - contains customer identifiers as well as record dates for customers in the test set
  • SampleSubmission.csv - a sample of the test file with zero predictions on the declared net income
  • Train.csv - contains customer identifiers as well as their declared net income (the target) used to train your model
  • customer.csv - contains demographic information on the customers that can be used as predictors in your model
  • transactions.csv - customer transaction history for the past 14 months
Files
SampleSubmission.csv shows the submission format for this competition, with the ‘CUSTOMER_IDENTIFIER’ column mirroring that of Test.csv and the ‘DECLARED_NET_INCOME’ column containing your predictions. The order of the rows does not matter, but the values in the ‘CUSTOMER_IDENTIFIER’ column must be correct.
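
The submission format described above can be sketched as a small writer that emits the two required columns, CUSTOMER_IDENTIFIER and DECLARED_NET_INCOME. The customer identifiers and income values here are placeholders, and the helper name `write_submission` is hypothetical; in practice the identifiers would come from Test.csv.

```python
import csv
import io


def write_submission(predictions, out):
    """Write {customer_id: predicted_income} as a two-column CSV to `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["CUSTOMER_IDENTIFIER", "DECLARED_NET_INCOME"])
    for customer_id, income in predictions.items():
        writer.writerow([customer_id, income])


# Placeholder predictions; real identifiers must match those in Test.csv.
buffer = io.StringIO()
write_submission({"c4": 12000.0, "c5": 9500.5}, buffer)
submission_text = buffer.getvalue()
```

To produce a file instead of a string, pass a handle opened with `open("submission.csv", "w", newline="")` in place of the `io.StringIO` buffer. Row order is not significant, so the writer makes no attempt to sort.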