How did you get your 6.xx scores?
What is the CV score behind your best LB score?
My CV is 7.4 and my LB is 6.19.
I honestly don't know why your CVs are that high while you're still getting good LB scores.
My CV correlates with the LB slightly: my current CV is 6.24 vs 6.35 on the LB.
I have lots of CVs around 6, and they range from 6.35 to 6.6 on the LB. I wonder what people at the top are using, although the scores around 3 are most likely post-processing. I doubt that's a score straight from a model.
Yeah, definitely.
I think there are many 0 values in the total column. This skews the target toward 0 and therefore gives a very high CV when you predict non-zero values. However, a high CV together with a good LB score can also be a sign of overfitting to the PL if the private board is mostly filled with 0 values. If you group by location, you are likely to get a score around 6 with a correlating CV. If you also aggregate with the mean or any other form of aggregation, you might also get a 6 and a correlating CV.
With the last part, do you mean how the repeating IDs in the train set are handled?
Yes. That's what I meant.
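To make the grouping point above concrete, here is a minimal sketch of a group-by-location mean baseline. It assumes the metric is RMSE (the thread doesn't say which metric is used), and the rows, the location labels and the `total` values are all made up for illustration. It only shows how the same baseline can score very differently depending on how many zeros end up in the validation set.

```python
import math
from collections import defaultdict


def rmse(y_true, y_pred):
    """Root mean squared error (assumed metric, not confirmed in the thread)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def group_means(rows):
    """rows: list of (location, total). Returns per-location means and the global mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for loc, total in rows:
        sums[loc] += total
        counts[loc] += 1
    means = {loc: sums[loc] / counts[loc] for loc in sums}
    global_mean = sum(total for _, total in rows) / len(rows)
    return means, global_mean


# Hypothetical training rows: repeated locations, many zero totals.
train = [("A", 0), ("A", 0), ("A", 2), ("B", 10), ("B", 14), ("C", 0), ("C", 0)]
means, global_mean = group_means(train)


def score(valid):
    """Predict each row with its location mean (global mean for unseen locations)."""
    preds = [means.get(loc, global_mean) for loc, _ in valid]
    return rmse([total for _, total in valid], preds)


# Two hypothetical validation sets: one dominated by zeros, one with few zeros.
valid_mostly_zero = [("A", 0), ("C", 0), ("C", 0), ("B", 12)]
valid_mixed = [("A", 4), ("B", 20), ("B", 6), ("C", 3)]

score_zero = score(valid_mostly_zero)
score_mixed = score(valid_mixed)
print(f"mostly-zero fold: {score_zero:.2f}, mixed fold: {score_mixed:.2f}")
```

The zero-heavy fold scores much lower than the mixed one with the exact same predictor, which is one way a CV and an LB can disagree when their share of zeros differs.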
How do you compute your CV? It's very hard to compare if we have different strategies and dates!
That depends on you. You can treat it as a time series challenge, do a normal train/test split, or use any other scheme.
Okay thanks!
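For anyone comparing the two options mentioned above, here is a rough sketch of a time-ordered split next to a shuffled K-fold split, in plain Python. The function names, the number of folds and the sample dates are all hypothetical; nothing here comes from the competition data.

```python
import random


def time_ordered_folds(dates, n_folds):
    """Expanding-window split: train on earlier dates, validate on the next block."""
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    block = len(order) // (n_folds + 1)
    folds = []
    for k in range(1, n_folds + 1):
        train_idx = order[: k * block]
        valid_idx = order[k * block : (k + 1) * block]
        folds.append((train_idx, valid_idx))
    return folds


def shuffled_folds(n_rows, n_folds, seed=0):
    """Plain K-fold on shuffled row indices, ignoring dates entirely."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    folds = []
    for k in range(n_folds):
        valid_idx = idx[k::n_folds]
        valid_set = set(valid_idx)
        train_idx = [i for i in idx if i not in valid_set]
        folds.append((train_idx, valid_idx))
    return folds


# Hypothetical ISO dates (they sort correctly as strings).
dates = [
    "2023-01-05", "2023-03-01", "2023-01-20", "2023-02-10",
    "2023-04-15", "2023-02-25", "2023-03-20", "2023-04-01",
]

ts_folds = time_ordered_folds(dates, n_folds=3)
kf_folds = shuffled_folds(len(dates), n_folds=4)

for train_idx, valid_idx in ts_folds:
    print("train up to", max(dates[i] for i in train_idx),
          "| validate from", min(dates[i] for i in valid_idx))
```

With the time-ordered version every validation block is strictly later than its training rows, so CV numbers from the two schemes are not directly comparable.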
Hey Yisak, I am sure you will improve on the leaderboard in no time. I can't share much about my approach, though, because I believe that's against the rules as long as the competition is on. The extra data did not help me much and my current CV looks really bad. One question for you: does your current 6.19 on the leaderboard correspond to your CV?
No, I haven't found a good way to stabilize it yet.
Hello @Ds_Queen. Is there extra data? If so, what is it? Could you please share?
I'm sure she's talking about the additional datasets.
@marching_learning. Does your current score make use of the additional data?
Okay, I re-read the data page. Additional data refers to toilets, water & waste. Yes, I am using them, but they are not the most important variables based on boosting importances. The original dataset is the most relevant.
Got it. Thanks
Yeah, sure, it's toilets, water & waste I meant.
Surely off topic, but could you please share your winning solution to the Africa Credit Scoring Challenge?
Yeah, if possible please share, @Yisakberhanu, even if it's just a write-up at least. Thanks.
Let's get the code review done first, then I'm going to briefly write up my solution.
I've seen that the final winners have been declared for the credit challenge. Congratulations on winning that and this challenge; very good performance.
Kindly don't forget to share your solution for the credit classification challenge as promised.
Today's thought of the day, I guess asking for help is easier than helping the community learn :)
sad.
a write up would have been nice if the code is too hard :)