I see the final leaderboard has been revealed, but the 1st-place team has a score very different from everyone else's, so I'm fairly sure they used the type-of-damage column.
They should have been disqualified. Can the organizers explain this?
I don't understand why he wasn't eliminated =D I think with that score, the type-of-damage column was definitely used. Otherwise I would like to know his brilliant solution.
Hi @kamsamita. First of all, congrats to all who participated in this competition. My solution does not utilize the type of damage at all.
My solution follows these steps:
1- Data exploration
2- Excluding and filtering some seasons and growth stages. For example, I exclude (SLR2020, ....), which reduces the data size and accordingly decreases training time, which allows me to ensemble different models.
3- Postprocessing based on the extent distribution for each season and growth stage.
Btw, I drop the damage column right after reading the train and test files. I extend my best wishes to you.
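The filtering and column-dropping steps above could be sketched in pandas like this. Note this is only a minimal illustration: the column names (`season`, `growth_stage`, `damage`, `extent`), the toy rows, and the exclusion list are all assumptions, not the author's actual code.

```python
import pandas as pd

# Toy frame with assumed column names -- the real dataset may differ.
train = pd.DataFrame({
    "ID": ["a", "b", "c", "d"],
    "season": ["SLR2020", "LR2021", "SR2021", "LR2021"],
    "growth_stage": ["V", "F", "M", "S"],
    "damage": ["DR", "G", "ND", "WD"],
    "extent": [0, 30, 10, 50],
})

# Step 2: exclude seasons to shrink the data and cut training time.
excluded_seasons = {"SLR2020"}  # assumed exclusion list
train = train[~train["season"].isin(excluded_seasons)]

# Drop the damage column immediately so it can never leak into the model.
train = train.drop(columns=["damage"])
```

The same `drop` would be applied to the test frame right after reading it, so no downstream code can touch the damage column.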
No, I don't use it at all. The tricks are reducing training time so I can ensemble models, and the most important one is postprocessing based on the distribution of extent per season and growth stage, in addition to upper and lower capping.
For example, if you take each combination of season and growth stage and figure out the maximum and minimum values of extent, in addition to the distribution of extent levels (you can figure it out by plotting a histogram), you can postprocess your predictions by capping, for example. I used this technique after I saw a good correlation between LB and CV, so most probably the distribution of extent per season and growth stage in train is similar to test, which is expressed by the public LB. I hope the meaning of postprocessing is now clear.
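The capping idea above can be sketched as follows. This is a toy illustration under assumed column names; the real bounds would come from the actual train distribution, not these made-up rows.

```python
import pandas as pd

# Toy train frame and model predictions with assumed column names.
train = pd.DataFrame({
    "season": ["LR2021", "LR2021", "SR2021", "SR2021"],
    "growth_stage": ["F", "F", "M", "M"],
    "extent": [10, 60, 0, 30],
})
test_preds = pd.DataFrame({
    "season": ["LR2021", "SR2021"],
    "growth_stage": ["F", "M"],
    "pred": [75.0, -5.0],
})

# Min/max extent observed in train for each (season, growth_stage) group.
bounds = (
    train.groupby(["season", "growth_stage"])["extent"]
    .agg(["min", "max"])
    .reset_index()
)

# Cap each prediction to the range seen in train for its group.
merged = test_preds.merge(bounds, on=["season", "growth_stage"], how="left")
merged["pred"] = merged["pred"].clip(merged["min"], merged["max"])
```

Here the 75.0 prediction is capped down to the group's train maximum of 60, and the -5.0 prediction is capped up to the group's minimum of 0.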
I have tried adding the growth stage and season columns to the model as metadata. The result is that the score did not change. If season and growth stage were useful features, they would increase the score.
Me too: when I used them as metadata for the NN it didn't boost the score, but using them in postprocessing works. Without this trick, my model scored 7.7 on the LB, btw. The pipeline is just a classification stage (0 vs non-zero), then passing highly confident non-zeros and low-confidence zeros to a regression model, by dropping the last classifier head and replacing it with a linear layer. All of these steps come after removing badly labeled images and any image with growth stage S.
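The routing logic of that two-stage pipeline might look something like the sketch below. Everything here is hypothetical: the probabilities, regression outputs, and confidence threshold are made-up values standing in for the real stage-1 classifier and stage-2 regressor outputs.

```python
import numpy as np

# Assumed outputs: p_nonzero from the stage-1 classifier (probability that
# extent > 0), reg_pred from the stage-2 regressor (the same backbone with
# the classifier head swapped for a linear layer).
p_nonzero = np.array([0.05, 0.45, 0.92, 0.60])
reg_pred = np.array([12.0, 20.0, 35.0, 8.0])

zero_conf = 0.2  # assumed threshold: below this, confidently predict 0

# Confident zeros get extent 0; everything else (confident non-zeros and
# low-confidence zeros) is scored by the regression model.
final = np.where(p_nonzero < zero_conf, 0.0, reg_pred)
```

The key design point described above is that only confident zeros are forced to 0, while borderline cases still pass through the regressor.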
Maybe there's some magic they discovered... but yes, more likely they should be removed.
Coincidentally.. I'm 1 spot out of a Gold medal currently D:
Congrats to you!
@Mohamed_abdelrazik Congratulations on your 1st place!
Because of your score, I think you have modified other columns. Is DR 0?
Thank you very much.
I didn't believe that score until I saw your solution. Amazing!
Could you give me a reference to your code so I can prepare for the upcoming competition?
Congratulations @Mohamed_abdelrazik! Which models and training approach did you use, and did you go with regression or classification?
How do you postprocess based on the distribution of extent per season and growth stage? How exactly do you decide which images to modify the extent for?
Congrats on the 1st place!
Can you share your LB score before post-processing?
Sure. Before post-processing: public LB 7.7XX and private LB 8.23XX.
Great score before post-processing!
Wow, if this holds up, that's amazing. How did you choose which images to remove? And why growth stage S?
No need to defend yourself. You ranked first in private leaderboard and that should be enough.
I think this is cheating. The top 1 team should prove it with their code; otherwise they definitely cheated.
I'm sure Zindi has already vetted @Mohamed_abdelrazik's code. I would be interested in learning from the solution, however!