Standard Bank Tech Impact Challenge: Xente credit scoring challenge
2000 Zindi Points
Predict the likelihood of credit default of ecommerce clients
387 data scientists enrolled, 96 on the leaderboard
Financial Services, Prediction, Structured
Uganda
30 August 2019—2 December 2019

Xente is an e-commerce and financial service app serving 30,000+ customers in Uganda.

This dataset includes a sample of approximately 2,665 unique e-commerce transactions that occurred between 21 September 2018 and 17 July 2019. During this period, 1,631 loans were issued to Xente clients.

The data have been split chronologically into a training set and a test set, so that buyers' earlier history can be used to predict their likelihood of default.
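Because the split is chronological, any feature built from a buyer's history should only look backwards in time. The following is a minimal sketch of that idea using only the Python standard library. The sample rows, the timestamp format, and the feature names PriorTransactions and PriorDefaults are assumptions made for illustration and are not part of the dataset.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Tiny made-up sample; the real Train.csv has many more columns and rows.
SAMPLE = """CustomerId,TransactionId,TransactionStartTime,IsDefaulted
C1,T1,2018-10-01 09:00:00,0
C2,T2,2018-10-03 12:30:00,1
C1,T3,2018-11-15 08:45:00,1
C1,T4,2019-01-20 17:10:00,0
"""


def add_history_features(rows, time_format="%Y-%m-%d %H:%M:%S"):
    """Attach counts of each customer's earlier transactions and defaults.

    The timestamp format is an assumption; adjust it to match the file.
    """
    ordered = sorted(
        rows,
        key=lambda r: datetime.strptime(r["TransactionStartTime"], time_format),
    )
    seen = defaultdict(lambda: {"n": 0, "defaults": 0})
    out = []
    for row in ordered:
        past = seen[row["CustomerId"]]
        out.append(
            {**row, "PriorTransactions": past["n"], "PriorDefaults": past["defaults"]}
        )
        # Update the running totals only after the row has been emitted,
        # so a transaction never sees its own outcome.
        past["n"] += 1
        past["defaults"] += int(row["IsDefaulted"])
    return out


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
features = add_history_features(rows)
for f in features:
    print(f["TransactionId"], f["PriorTransactions"], f["PriorDefaults"])
```

This sketch treats a default as known at the moment of the transaction. In practice the outcome is only known after the loan's due date, so a stricter version would delay each update accordingly.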

The training set contains 1,769 unique transactions and the test set contains 905 unique transactions. The number of observations in the training set exceeds the number of transactions because some transactions were repaid in split payments or installments.
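Since one transaction can span several rows, it may help to collapse the training data to one label per TransactionId before modelling. The sketch below shows one possible way, under the assumption that a transaction counts as defaulted if any of its rows is flagged. The sample rows are made up for illustration.

```python
import csv
import io

# Made-up rows: T2 and T3 were repaid in several installments,
# so each of them appears on more than one line.
SAMPLE = """TransactionId,PayBackId,IsDefaulted
T1,P1,0
T2,P2,1
T2,P3,1
T3,P4,0
T3,P5,0
T3,P6,0
"""


def one_label_per_transaction(rows):
    """Map each TransactionId to a single 0/1 label.

    Assumption: a transaction is treated as defaulted if any row says so.
    """
    labels = {}
    for row in rows:
        tid = row["TransactionId"]
        labels[tid] = max(labels.get(tid, 0), int(row["IsDefaulted"]))
    return labels


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
labels = one_label_per_transaction(rows)
print(len(rows), "rows ->", len(labels), "transactions")
```

If the flag is already constant within a transaction, taking the maximum simply returns that value, so the choice of aggregation does no harm.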

Variable definitions

  • CustomerId: Unique number identifying the customer on platform
  • TransactionStartTime: Transaction start time
  • Value: Value of transaction
  • Amount: Value of Transaction with charges
  • TransactionId: Unique transaction identifier on platform
  • BatchId: Identifier for bulk transactions being done on an account
  • SubscriptionId: You can have one account with multiple subscriptions
  • CurrencyCode: Country currency
  • CountryCode: Numerical geographical code of country
  • ProviderId: Source provider of Item bought
  • ProductId: Item name being bought
  • ProductCategory: Type of product
  • ChannelId: Identifies if customer used Xente Paylater on any other channel
  • TransactionStatus: Loan application status (1 = accepted, 0 = rejected)
  • IssuedDateLoan: Date loan is issued
  • AmountLoan: Value of the loan issued
  • Currency: Ugandan shillings Denominations
  • LoanId: Loan transaction unique identifier
  • PaidOnDate: Date on which the loan was paid
  • IsFinalPayBack: Last payback installment
  • InvestorId: Loan issuer or network owner
  • DueDate: Date loan is due
  • LoanApplicationId: Unique identifier for loan application
  • PayBackId: Loan payback number identifier
  • ThirdPartyId: Transaction id for a loan payback
  • IsThirdPartyConfirmed: Loan order succeeded on platform
  • IsDefaulted: Exceeded agreed payback time (1 = default, 0 = non-default)
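The definitions of DueDate, PaidOnDate and IsDefaulted suggest a simple lateness measure. The sketch below compares a payment date with the due date. It is only one possible reading of "exceeded agreed payback time": the organisers' exact rule is not stated here, and the date format and both function names are assumptions for illustration.

```python
from datetime import datetime


def days_late(due_date, paid_on_date, fmt="%Y-%m-%d"):
    """Whole days between the due date and the payment date.

    Positive means the payment came after the due date.
    The date format is an assumption; adjust it to match the file.
    """
    due = datetime.strptime(due_date, fmt)
    paid = datetime.strptime(paid_on_date, fmt)
    return (paid - due).days


def paid_after_due(due_date, paid_on_date, fmt="%Y-%m-%d"):
    """1 if the payment was made after the due date, else 0 (hypothetical rule)."""
    return int(days_late(due_date, paid_on_date, fmt) > 0)


print(days_late("2019-01-31", "2019-02-05"))
print(paid_after_due("2019-01-31", "2019-01-30"))
```

A value like this can only be computed on the training data, because the loan-related variables are removed from the test set.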

Files available for download

The files for download here are:

  • VariableDefinitions.csv: Definition of the features per transaction
  • Train.csv: E-commerce transactions and associated loans from 21 September 2018 to 31 March 2019, including whether or not a customer defaulted on their loan. This is the dataset that you will use to train your model.
  • Test.csv: E-commerce transactions and associated loans from 31 March 2019 to 17 July 2019, excluding whether or not a customer defaulted on their loan. Additional variables associated with the loan have also been removed from the test set. This is the dataset that you will use to test your model.
  • unlinked_masked_final.csv: E-commerce transactions not linked to any loans, but associated with customers who have loan-linked e-commerce transactions.
  • sample_submission.csv: An example of what your submission file should look like, including a list of unique transaction IDs and the associated loan status. The order of the rows does not matter, but the TransactionId values must be correct.
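A submission file with one row per unique TransactionId could be written as in the sketch below. The name of the status column (IsDefaulted here) and the example IDs are assumptions; check sample_submission.csv for the exact header before submitting.

```python
import csv
import io


def write_submission(predictions, out_file, label_column="IsDefaulted"):
    """Write (TransactionId, prediction) pairs as a CSV submission.

    label_column is an assumed header name; copy the real one
    from sample_submission.csv.
    """
    ids = [tid for tid, _ in predictions]
    if len(ids) != len(set(ids)):
        raise ValueError("TransactionId values must be unique")
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["TransactionId", label_column])
    writer.writerows(predictions)


# Usage with an in-memory buffer; pass an opened file
# (open(path, "w", newline="")) to write to disk instead.
buffer = io.StringIO()
write_submission([("TransactionId_1", 0), ("TransactionId_2", 1)], buffer)
print(buffer.getvalue())
```

The uniqueness check mirrors the requirement that each transaction appears once. Row order is left as given, since it does not matter.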