Has anybody looked at the disparity between Total_Amount and Total_Amount_to_Repay? In some loans the amount to be repaid is less than or equal to the amount borrowed, which doesn't look right. Any insights here?
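# Total_Amount vs Total_Amount_to_Repay consistency check
# Prints True only if every loan repays strictly more than was borrowed
print((train['Total_Amount'] < train['Total_Amount_to_Repay']).all())
print((test['Total_Amount'] < test['Total_Amount_to_Repay']).all())
# Identify inconsistent rows (repayment less than or equal to the amount borrowed)
inconsistent_train = train[train['Total_Amount'] >= train['Total_Amount_to_Repay']]
inconsistent_test = test[test['Total_Amount'] >= test['Total_Amount_to_Repay']]
# How many rows are affected in each set
print(len(inconsistent_train), len(inconsistent_test))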
A loan is sometimes syndicated (the total loan amount is jointly provided by two or more lenders). That is why there is also a lender_portion_funded column, which shows the share funded by each lender. In that case, the total to be repaid to that specific lender_id will be less than the total amount borrowed.
Also check tbl_loan_id, which identifies each loan with a unique ID. You'll see that some loans are provided by two different lenders.
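If it helps, here is a rough sketch of how the syndication explanation could be checked. It runs on a small made-up DataFrame rather than the real data, and it assumes that lender_portion_funded is a fraction between 0 and 1 and that Total_Amount_to_Repay is what is owed to that row's lender_id. The function name flag_repayment_issues and the new columns it adds are just illustrative.

```python
import pandas as pd

# Toy rows standing in for the real train set; all values are made up.
toy = pd.DataFrame({
    "tbl_loan_id": [101, 101, 102, 103],
    "lender_id": ["A", "B", "A", "A"],
    "Total_Amount": [1000.0, 1000.0, 500.0, 800.0],
    "lender_portion_funded": [0.3, 0.7, 1.0, 1.0],
    "Total_Amount_to_Repay": [330.0, 770.0, 550.0, 800.0],
})

def flag_repayment_issues(df):
    """Hypothetical helper: mark syndicated loans and compare repayment
    against both the full loan amount and the lender's assumed share."""
    out = df.copy()
    # Distinct lenders per loan; more than one suggests a syndicated loan.
    n_lenders = out.groupby("tbl_loan_id")["lender_id"].transform("nunique")
    out["is_syndicated"] = n_lenders > 1
    # Assumption: the lender's principal is Total_Amount * lender_portion_funded.
    out["lender_principal"] = out["Total_Amount"] * out["lender_portion_funded"]
    # The original check from the question (repayment <= amount borrowed).
    out["repay_below_total"] = out["Total_Amount_to_Repay"] <= out["Total_Amount"]
    # The same check against the lender's own share of the loan.
    out["repay_below_share"] = out["Total_Amount_to_Repay"] <= out["lender_principal"]
    return out

flagged = flag_repayment_issues(toy)
print(flagged[["tbl_loan_id", "lender_id", "is_syndicated",
               "repay_below_total", "repay_below_share"]])
```

Rows where repay_below_total is True but repay_below_share is False would be explained by syndication. Rows where repay_below_share is still True (like loan 103 in the toy data) are the ones that would need a closer look.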