African Credit Scoring Challenge

Helping Africa
$5 000 USD
Completed (~1 year ago)
1959 joined
1022 active
Start: Nov 29, 24
Close: Jan 12, 25
Reveal: Jan 13, 25
Peter-Murimi
Upwork
EDA
Data · 6 Jan 2025, 07:31 · 2

Has anybody looked at the disparity between Total_Amount and Total_Amount_to_Repay? In some loans the amount to be repaid is less than or equal to the amount borrowed, which is not what I would expect. Any insights?

# Total_Amount vs Total_Amount_to_Repay consistency check
# Each print shows True only if every loan repays more than was borrowed

print((train['Total_Amount'] < train['Total_Amount_to_Repay']).all())

print((test['Total_Amount'] < test['Total_Amount_to_Repay']).all())

# Identify inconsistent rows (amount to repay is less than or equal to the amount borrowed)

inconsistent_train = train[train['Total_Amount'] >= train['Total_Amount_to_Repay']]

inconsistent_test = test[test['Total_Amount'] >= test['Total_Amount_to_Repay']]

Discussion 2 answers

Sometimes a loan is syndicated, meaning the total loan amount is jointly provided by two or more lenders. That is why there is also a lender_portion_funded column, which shows the ratio funded by each lender. In that case, the total to be repaid to that specific lender_id will be less than the total amount borrowed.
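
One way to test that explanation is to compare the repayment against the lender's own share of the loan rather than the full amount. This is only a rough sketch on invented toy rows: the exact column spellings, the reading of lender_portion_funded as a fraction between 0 and 1, and the helper name flag_unexplained are all assumptions, not something confirmed by the data dictionary.

```python
import pandas as pd

# Toy rows; column names follow the thread, values are invented for illustration.
loans = pd.DataFrame({
    "tbl_loan_id": [101, 102, 103],
    "Total_Amount": [1000.0, 1000.0, 500.0],
    "Total_Amount_to_Repay": [330.0, 1100.0, 450.0],
    "lender_portion_funded": [0.3, 1.0, 1.0],  # assumed to be a fraction in [0, 1]
})

def flag_unexplained(df):
    """Return rows whose repayment is below the lender's own funded share.

    Hypothetical helper: it assumes Total_Amount * lender_portion_funded
    is the amount this lender actually put in.
    """
    lender_share = df["Total_Amount"] * df["lender_portion_funded"]
    return df[df["Total_Amount_to_Repay"] < lender_share]

unexplained = flag_unexplained(loans)
print(unexplained["tbl_loan_id"].tolist())
```

In the toy data, loan 101 repays less than the full amount but more than the 30% share, so syndication accounts for it; loan 103 is fully funded by one lender and still repays less, so it would remain flagged.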

6 Jan 2025, 10:50
Upvotes 0

Also check tbl_loan_id, which identifies each loan with a unique ID. You'll see that some loans are provided by 2 different lenders.
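
A quick way to find those loans might be to count distinct lenders per tbl_loan_id. The snippet below is a sketch on made-up rows, assuming the columns are named tbl_loan_id and lender_id as written in this thread; swap in train or test to run it on the real data.

```python
import pandas as pd

# Toy rows; loan 201 appears twice with two different lenders.
loans = pd.DataFrame({
    "tbl_loan_id": [201, 201, 202, 203],
    "lender_id": ["A", "B", "A", "C"],
})

# Number of distinct lenders attached to each loan ID
lenders_per_loan = loans.groupby("tbl_loan_id")["lender_id"].nunique()

# Loans funded by more than one lender
syndicated_ids = lenders_per_loan[lenders_per_loan > 1].index.tolist()
print(syndicated_ids)
```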