SFC PAYGo Solar Credit Repayment Competition
$5 000 USD
Can you predict PAYGo solar customer payments?
744 data scientists enrolled, 175 on the leaderboard
6 June—29 August

This competition focuses on contract data for pay-as-you-go (PAYGo) solar home systems (SHS).

When a customer applies for a loan, banks and other credit providers use statistical models to determine whether or not to grant the loan based on the likelihood of the loan being repaid. The factors involved in determining this likelihood are complex, and extensive statistical analysis and modelling are required to predict the outcome for each individual case. You must implement a similar model that predicts PAYGo SHS contract repayments or defaults based on the data provided.

In this competition, you must explore and clean a dataset of approximately 37,000 PAYGo SHS contracts to determine the best way to predict the repayment profile. You must then build a machine learning model that returns the expected payments for the next n months (for this competition, n = 6).
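Before training a machine learning model, it can help to have a naive baseline to compare against. The sketch below is one possible baseline, not the organisers' method: it repeats the mean of a contract's most recent payments for each of the n = 6 future months. The function names, the `window` parameter, the example payment values and the choice of mean absolute error are all assumptions made for illustration; the actual column layout of Train.csv and the competition's scoring metric should be checked on the competition page.

```python
from statistics import mean


def naive_forecast(history, horizon=6, window=3):
    """Repeat the mean of the last `window` payments for `horizon` months.

    `history` is a list of past monthly payments for one contract
    (a hypothetical layout; adapt it to the real Train.csv columns).
    """
    if not history:
        # No payment history at all: fall back to predicting zero.
        return [0.0] * horizon
    level = mean(history[-window:])
    return [float(level)] * horizon


def mean_absolute_error(actual, predicted):
    """Average absolute difference; an illustrative metric only."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up payments for a single contract, for demonstration only.
past_payments = [100, 200, 300]
forecast = naive_forecast(past_payments, horizon=6, window=2)
```

A learned model is only worth its complexity if it beats a baseline like this on held-out contracts.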

You could strengthen your solution by also predicting the contract repayment status label (the probability that the contract is paid or not paid). This could indicate whether the contract will be fully paid or will default.
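One simple way to relate a payment forecast to a repayment status is to project how much of the contract value would be covered at the end of the forecast horizon. The sketch below is a heuristic score between 0 and 1, not a calibrated probability and not a trained classifier. The inputs `paid_to_date` and `total_contract_value` are hypothetical names; whether equivalent fields exist in metadata.csv should be confirmed in VariablesDefinition.txt.

```python
def projected_paid_fraction(paid_to_date, forecast, total_contract_value):
    """Fraction of the contract value covered after the forecast horizon.

    All three arguments are assumed inputs for illustration:
    the amount already paid, the predicted future payments,
    and the total amount owed under the contract.
    """
    if total_contract_value <= 0:
        raise ValueError("total_contract_value must be positive")
    # Negative predictions make no sense as payments, so clip them to zero.
    projected = paid_to_date + sum(max(p, 0.0) for p in forecast)
    # Cap at 1.0: a contract cannot be more than fully paid.
    return min(projected / total_contract_value, 1.0)


# Made-up numbers for demonstration only.
score = projected_paid_fraction(400, [50] * 6, 1000)
```

A score near 1 suggests the contract is on track to be fully paid within the horizon, while a low score flags a contract that may default. A proper status predictor would instead be trained on labelled contracts.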

Files available for download:

  • Train.csv - contains the target (the next 6 months of payments to predict). This is the dataset that you will use to train your model.
  • Test.csv - resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
  • metadata.csv - contains some extra features; could be used to build a payment status label predictor.
  • SampleSubmission.csv - shows the submission format for this competition, with the ‘ID’ column mirroring that of Test.csv and the ‘Target’ column containing your predictions for each month. The order of the rows does not matter, but the ID values must be correct.
  • VariablesDefinition.txt - describes the variables in the metadata file.
  • StarterNotebook.ipynb - the starter notebook from the earlier hackathon, which used a subset of this dataset. You can use this notebook as inspiration for this competition.
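As a rough illustration of the submission format described above, the sketch below turns per-contract forecasts into rows with ‘ID’ and ‘Target’ columns and checks them against a list of expected IDs. The ID pattern used here (`"{contract} x m{month}"`) is purely an assumption; the real pattern must be copied from SampleSubmission.csv. The function names and example values are also hypothetical.

```python
import csv
import io


def build_submission_rows(forecasts, id_template="{contract} x m{month}"):
    """Flatten {contract_id: [m1, ..., m6]} into one row per contract-month.

    `id_template` is a guessed pattern; replace it with the one
    actually used in SampleSubmission.csv.
    """
    rows = []
    for contract, predictions in forecasts.items():
        for month, value in enumerate(predictions, start=1):
            row_id = id_template.format(contract=contract, month=month)
            rows.append({"ID": row_id, "Target": value})
    return rows


def write_submission(rows, handle):
    """Write rows as CSV with the 'ID' and 'Target' header."""
    writer = csv.DictWriter(handle, fieldnames=["ID", "Target"])
    writer.writeheader()
    writer.writerows(rows)


def missing_ids(rows, expected_ids):
    """Return expected IDs that the submission does not cover."""
    return set(expected_ids) - {row["ID"] for row in rows}


# Made-up forecasts for two contracts, two months each, for demonstration.
rows = build_submission_rows({"c1": [10.0, 12.5], "c2": [0.0, 3.0]})
buffer = io.StringIO()
write_submission(rows, buffer)
```

In practice, `expected_ids` would be the ‘ID’ column read from SampleSubmission.csv, and the file handle would come from `open("submission.csv", "w", newline="")`. Because row order does not matter, checking that `missing_ids` returns an empty set is a quick way to catch ID mismatches before uploading.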