You're welcome, and thanks for your questions. I think this discussion acts as the write-up I was too lazy to write😅. Congratulations again on your recent win, and I wish you more wins to come!
@Koleshjr Thanks for sharing this. Could you please share the code that generated these two datasets: "train = pd.read_csv(new_path + "train_monthly_stats_patch_8.csv")
test = pd.read_csv(new_path + "test_monthly_stats_patch_8.csv")"? I would appreciate it; I just want to learn what approach you took.
@Koleshjr Thanks for sharing. I have a few questions to ask😅.
1. How much boost did the robust scaler add to your score? I normally train on the raw values when it comes to gbdt.
2. On my side, some indices actually worsened my results; only GNDVI, NDVI and CI helped. Did I do something wrong?
3. What different things did you do to get a very high private score (almost touching 0.98, that's amazing bro🔥🔥)?
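For reference, the three indices mentioned in question 2 follow standard formulas. A minimal sketch, assuming reflectance arrays for the red, green and NIR bands (the band names and the green variant of the chlorophyll index are my assumptions, not the original code):

```python
import numpy as np

def vegetation_indices(red, green, nir, eps=1e-9):
    """Compute NDVI, GNDVI and the green Chlorophyll Index.

    red/green/nir are reflectance arrays; eps guards against
    division by zero. Band naming is an assumption.
    """
    ndvi = (nir - red) / (nir + red + eps)       # Normalized Difference Vegetation Index
    gndvi = (nir - green) / (nir + green + eps)  # Green NDVI
    ci_green = nir / (green + eps) - 1.0         # Chlorophyll Index (green variant)
    return ndvi, gndvi, ci_green

red = np.array([0.1, 0.2])
green = np.array([0.08, 0.15])
nir = np.array([0.5, 0.6])
ndvi, gndvi, ci = vegetation_indices(red, green, nir)
```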
1. The robust scaler made no significant difference; the improvement was negligible.
2. If you look at the feature importance plot, we focused on the median alone, so feature engineering with median features led to a significant improvement.
3. Tricks to get a high score: patching plus median features.
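A rough sketch of what "median features per month" could look like in pandas. The schema here (a long-format table with `field_id`, `month` and an index column) is my assumption, not the actual pipeline:

```python
import pandas as pd

# Toy long-format table: one index reading per field and month.
# Column names are hypothetical; the real schema is not shown.
df = pd.DataFrame({
    "field_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "month":    [3, 3, 4, 4, 3, 3, 4, 4],
    "ndvi":     [0.2, 0.4, 0.5, 0.7, 0.1, 0.3, 0.6, 0.8],
})

# Median per field per month, then pivot so each month becomes a feature column.
monthly_median = (
    df.groupby(["field_id", "month"])["ndvi"].median()
      .unstack("month")
      .add_prefix("ndvi_median_m")
)
```

Each row of `monthly_median` is then one training example with a median feature per month, ready to join onto the target table.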
Oh alright, that makes sense.
Could you elaborate more on patching?
You don't use the whole image to create features; you create an N×N crop of it, where N is tunable.
Ohhh okay! This is definitely new to me. So you focus on the center of the image, or wherever the phenomenon is mostly located.
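A minimal sketch of patching as described above: take an N×N crop before computing statistics. The center placement is an assumption (the patch could sit wherever the phenomenon is); N = 8 matches the `patch_8` in the dataset filenames:

```python
import numpy as np

def center_patch(img, n):
    """Return the central n x n crop of a (H, W) or (H, W, C) image.

    Cropping from the image center is an assumption; in principle the
    patch can be placed wherever the signal of interest is located.
    """
    h, w = img.shape[:2]
    top = (h - n) // 2
    left = (w - n) // 2
    return img[top:top + n, left:left + n]

img = np.arange(100, dtype=float).reshape(10, 10)  # toy 10x10 "image"
patch = center_patch(img, 8)                       # N = 8, as in the patch_8 filenames
median_feature = np.median(patch)                  # median feature computed on the patch
```

N is then a hyperparameter to tune: smaller patches cut out border noise, at the cost of discarding pixels.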
Thank you for taking the time to answer my questions. I am really grateful; keep winning!
I have talked about it in previous competitions:
2nd Place Solution - Zindi
Thanks
Thank you @Koleshjr. Learning a lot from you!
Thank you big man😊! More wins for all of us😅.