Hello everyone. I'm quite new here on Zindi, and I really enjoyed taking part in the crop detection challenge. However, now that the competition is closed, I have one doubt concerning the input data and their distribution. With my numerical models, I have seen that results on internal validation varied a lot between random sampling and stratified sampling. This is consistent with the input dataset, where the classes are clearly not balanced. My doubt is the following: according to the participants or to the Zindi organizers, are the classes in the validation dataset distributed as in the training dataset? What I have seen is that using random sampling for my internal validation I got results around 1.25-1.35, whereas using stratified sampling I always go below 1 no matter what model I use, random forest or deep learning.
Can someone help me understand? Thank you, and congratulations to you all!
My local validation score is very close to the leaderboard score. I used stratified sampling on the fields, not on individual pixels, so as not to oversample fields with a lot of pixels.
Ok, I noticed the difference both when dealing with pixels and when dealing with fields... maybe it also depends on other settings of the model...
Hi KarimAmer, could you please clarify stratifying on pixels vs. fields? I am a bit lost. My view is that per field there are multiple rows showing readings for each location. It's clear if you group the location readings per field into one row and then stratify by the target variable, "label". Or do you mean you were stratifying by Field_id?
Hi DrFad, I group the pixels of each field into one row, so that I have only 3,286 training samples, then apply stratified sampling on that data.
Thanks for the clarification, KarimAmer. So what's the difference between pixel and field stratification?
Since the competition metric is cross entropy, the results are heavily affected by the class distribution. So if the training distribution is different from the test distribution, test results will be worse than validation results.
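To see why cross entropy is so sensitive to a distribution shift, here is a toy sketch with made-up numbers (plain Python, not real competition data): a binary model calibrated to a 90/10 training distribution is scored on a matching validation set and on a shifted 50/50 test set.

```python
import math

def log_loss(y_true, y_pred_probs):
    """Mean cross entropy for hard labels and predicted class probabilities."""
    return -sum(math.log(p[y]) for y, p in zip(y_true, y_pred_probs)) / len(y_true)

# A model calibrated to a training set that is 90% class 0
# always predicts these probabilities: [P(class 0), P(class 1)].
pred = [0.9, 0.1]

# Validation set drawn with the same 90/10 distribution:
val_labels = [0] * 9 + [1] * 1
val_loss = log_loss(val_labels, [pred] * len(val_labels))

# Test set drawn with a shifted 50/50 distribution:
test_labels = [0] * 5 + [1] * 5
test_loss = log_loss(test_labels, [pred] * len(test_labels))

print(round(val_loss, 3), round(test_loss, 3))  # test loss is far higher
```

Same model, same predictions; only the label distribution changed, and the cross entropy roughly quadruples.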
Applying that to our case, it appears that the class distributions in training and testing are very close, and that the split was done on fields, not on pixels. So applying pixel stratification oversamples some fields, which changes the class distribution in training and leads to bad test results.
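To make the field-level split concrete, here is a minimal sketch using only the standard library (the field IDs, labels, and split fraction are made up for illustration; this is not my actual competition code):

```python
import random
from collections import defaultdict

def stratified_field_split(field_labels, val_frac=0.2, seed=0):
    """Split field IDs into train/val, stratified by each field's label.

    field_labels: dict mapping field_id -> crop label (one label per field,
    i.e. after aggregating all of a field's pixels into a single row).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for fid, label in field_labels.items():
        by_label[label].append(fid)

    train_ids, val_ids = [], []
    for label, fids in by_label.items():
        rng.shuffle(fids)
        n_val = max(1, int(len(fids) * val_frac))  # keep every class in val
        val_ids.extend(fids[:n_val])
        train_ids.extend(fids[n_val:])
    return train_ids, val_ids

# Toy data: 10 fields with unbalanced labels.
fields = {f"f{i}": ("maize" if i < 7 else "wheat") for i in range(10)}
train_ids, val_ids = stratified_field_split(fields)
# All pixels of a field then go to the same side of the split,
# so no field is oversampled relative to the test distribution.
```

Since the split is decided per field ID, a field with thousands of pixels counts exactly once when stratifying, which is the whole point of field-level stratification.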
Here is another example of a competition where the training and test class distributions differ, and how fixing that can improve the results: https://www.kaggle.com/c/quora-question-pairs/discussion/31179
I think I understand what you are trying to say. Pixel stratification is stratifying labels before aggregation. Field stratification is stratifying labels after aggregation, i.e. after aggregating to 3,286 training samples. Is this correct?
Yes
Thank you Karim, you clarified the problem I was having. Now I see there is no longer a difference between my local results and the results on the leaderboard. Can I ask what approach you used to tackle the problem? I have only recently started studying this problem, and I'm trying to understand which techniques are the most promising for getting good final results.
You are welcome @ESA-Philab. I used deep learning only with various augmentations. I will be happy to share my approach after the code review stage.
I would be very glad to look at your approach! Thank you for the answers you gave me, I really appreciate it.
Happy to help
Karim, first of all, congrats on your first-place finish! I would like to ask if you can share some details of your implementation so I can study your approach, thank you :)
Thanks a lot for your kind words.
I used a 3-layer conv net (shared across time steps) followed by a 3-layer bi-directional GRU. The input to the network is a crop around the field's center pixel. Extensive augmentations were applied, including spatial augmentation, mix-up, and time augmentation (randomly dropping one time sample). I will let you know when the implementation is uploaded to GitHub.
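For anyone who wants a picture of that architecture before the code is released, here is a rough PyTorch sketch of the shape of such a model. All layer sizes, channel counts, the pooling, and the classification head are my own guesses for illustration, not the actual winning implementation:

```python
import torch
import torch.nn as nn

class CropClassifier(nn.Module):
    """Sketch: a small conv net applied to each time step's image crop
    (weights shared across time), then a 3-layer bi-directional GRU
    over the per-timestep features. Sizes are illustrative guesses."""

    def __init__(self, in_channels=13, n_classes=7, hidden=64):
        super().__init__()
        # 3-layer conv net, shared across time steps.
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, hidden, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (B*T, hidden, 1, 1)
        )
        # 3-layer bi-directional GRU over the time dimension.
        self.gru = nn.GRU(hidden, hidden, num_layers=3,
                          batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        # x: (batch, time, channels, height, width)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).flatten(1)  # (b*t, hidden)
        feats = feats.view(b, t, -1)                  # (b, t, hidden)
        out, _ = self.gru(feats)                      # (b, t, 2*hidden)
        return self.head(out[:, -1])                  # (b, n_classes)

# 2 fields, 6 time steps, 13 spectral bands, 7x7 crop around the center pixel.
x = torch.randn(2, 6, 13, 7, 7)
logits = CropClassifier()(x)
print(logits.shape)  # torch.Size([2, 7])
```

The key idea is that the same 2D conv weights extract features from every time step, and the recurrent layers then model how those features evolve over the growing season.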
Let me know if you have any further questions.