What are the tricks used by top teams to get such a great score? I have used lots of different methods but only reached 8.91/9.99 (public/private) without any leaks. Did I miss something obvious, other than the leaks that were not allowed?
I treated this problem as binary classification by dividing the targets by 100. Training time is around 3 hours for a single model on a P100 GPU, and the final score is an ensemble of 2 models.
I had a similar approach. It's motivated by a few Kaggle notebooks which recommended this: you set the target as (extent / 100) and train with binary cross-entropy, so the "class value" is the extent.
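For anyone unfamiliar with this soft-target trick: BCE with a target anywhere in [0, 1] is minimized when the predicted probability equals the target, which is why training on extent / 100 and multiplying the sigmoid output by 100 at inference recovers the extent. A minimal NumPy sketch (the grid search over p is purely for illustration, not anyone's actual training code):

```python
import numpy as np

def bce(p, y):
    # binary cross-entropy with a *soft* target y in [0, 1]
    eps = 1e-7
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# target: extent 35 -> 0.35 after dividing by 100
y = 0.35
ps = np.linspace(0.01, 0.99, 99)   # candidate predicted probabilities
losses = bce(ps, y)
best_p = ps[np.argmin(losses)]
# the loss is minimized when the predicted probability equals the soft
# target, so sigmoid_output * 100 recovers the extent at inference time
print(round(best_p, 2))  # 0.35
```

So even though the loss is "classification-style", the model effectively learns a calibrated regression in [0, 1].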
Great solution @Agastya, and yes, I believe you'll get a top-5 placement on the LB once Zindi clarifies the issues with the damage feature (their previous explanation does little to iron out the issues raised earlier, especially considering the goal of the competition).
I personally have two subs: one that doesn't rely on the damage feature at all, and another that uses pre- and post-processing with the damage feature. So I expect a drop once they give a clearer explanation of what they'll consider a leak, taking the goal of the competition into account.
I'm nowhere near the top, as my score was around 13. However, just this morning I thought about using an ensemble of CNNs: one classifies an image as having a severity of 0 or not (binary classification); images predicted to be non-zero are then fed to a second CNN trained only on the non-zero values. I think that would be a good fit for a problem with this many zeros.
Something I was able to achieve, though, is training the models really fast. The architecture I used was similar to VGG, with about 16 conv layers and 4 dense layers. My notebook ran in an average of 13 minutes (loading, training and inference) on a 4 GB GPU with 16 GB of RAM. I think it ran fast because I didn't use the built-in dataloader; instead, I read the images into memory and trained on them from there. I found that much faster than using the dataloader.
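The in-memory approach described above can be sketched roughly like this (the shapes, dataset size, and batch size are made up for illustration; the poster's actual pipeline isn't shown in the thread):

```python
import numpy as np

# Hypothetical setup: all images already decoded into one in-memory array.
# Preloading avoids per-batch disk reads and dataloader worker overhead,
# which is the speed-up described above (feasible when the data fits in RAM).
rng = np.random.default_rng(0)
images = rng.random((1024, 64, 64, 3), dtype=np.float32)  # whole dataset in RAM
labels = rng.random(1024, dtype=np.float32)               # extent / 100 targets

def minibatches(x, y, batch_size=128, shuffle=True, seed=0):
    """Yield batches by slicing the in-memory arrays (no DataLoader)."""
    idx = np.arange(len(x))
    if shuffle:
        np.random.default_rng(seed).shuffle(idx)
    for start in range(0, len(x), batch_size):
        sel = idx[start:start + batch_size]
        yield x[sel], y[sel]

n_batches = sum(1 for _ in minibatches(images, labels))
print(n_batches)  # 8
```

Each epoch then just iterates this generator, so the only per-batch cost is an array slice rather than image decoding.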
I did not use the data leak. I think this can be approached with an ensemble of 2 CNNs. You create a new column in your df indicating whether an image has an "extent" value of 0 or not, and use that column as the target to train the first CNN as a binary classifier (extent 0 vs. non-zero). Its output then determines which images go to the second model, which is trained to predict the extent of the non-zero images. Note that this is just something I thought of, not what I used in the competition; my own model, a plain CNN, did not perform very well.
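A rough sketch of that two-stage routing logic, with trivial stand-in functions in place of the two CNNs (the threshold/mean "models" here are purely illustrative, not trained classifiers):

```python
import numpy as np

# Stage 1 decides zero vs. non-zero; stage 2 predicts extent for the
# images that stage 1 flags as non-zero. Zeros stay at exactly 0.
def stage1_is_nonzero(batch):
    # hypothetical binary classifier: here, a fixed threshold on mean intensity
    return batch.mean(axis=(1, 2, 3)) > 0.5

def stage2_extent(batch):
    # hypothetical regressor applied only to the non-zero images
    return 100.0 * batch.mean(axis=(1, 2, 3))

def predict(batch):
    preds = np.zeros(len(batch), dtype=np.float32)
    nonzero = stage1_is_nonzero(batch)
    if nonzero.any():
        preds[nonzero] = stage2_extent(batch[nonzero])
    return preds

rng = np.random.default_rng(1)
batch = rng.random((4, 8, 8, 3), dtype=np.float32)
print(predict(batch).shape)  # (4,)
```

The appeal for a zero-heavy target is that the regressor never has to trade off fitting the zeros against fitting the damaged images.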
@Ahmadmwali I see you said your score is about 13, but on the LB I see it is 9.27/9.98. I'm not sure which score of 13 you're talking about? Thanks
Just checked the leaderboard. I saw that they selected the first model I made, which used the data leak. My later models were not trained with it and scored about 13 RMSE; I forgot to select those later submissions. I'm getting disqualified as well.
In my opinion, the top ~17 teams didn't get realistic scores. I expect there will be quite a few disqualifications, unless there really is some trick that doesn't use the damage feature to get that low.
What model did you use to train?
I did the same thing: I used an ensemble of 4-5 different CNNs trained on extent / 100.
However, my best models took too long to train so I couldn't include them in the ensemble.
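For reference, combining such an ensemble usually comes down to averaging the per-model outputs and rescaling back to the 0-100 extent range (the poster doesn't say exactly how they blended, so this shows only the common mean blend, with made-up numbers):

```python
import numpy as np

# Each row is one model's sigmoid outputs on three images (extent / 100 scale).
model_preds = np.array([
    [0.30, 0.00, 0.80],   # model 1
    [0.40, 0.10, 0.70],   # model 2
    [0.35, 0.05, 0.75],   # model 3
])

# Mean blend across models, then rescale to the 0-100 extent target.
ensemble = model_preds.mean(axis=0) * 100
print(ensemble)  # [35.  5. 75.]
```

Weighted averages or rank averaging are common variants, but a plain mean is the usual baseline for blending CNNs trained on the same target.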
Did you get that score without using the 'damage' column in training or testing?
Yes, I got this score without using the 'damage' column in either training or testing.
Great score! What method did you use? How long was your training?
Nice one. This is a very clever way of solving this. Thanks for sharing
Congratulations! I think you will get a top-5 score after all disqualifications.
What do you mean by dividing a target by 100? What will the class be for a value of 0.9: 0 or 1?
I briefly tried an approach of training a classifier for `extent` 0 vs > 0 and I'm not sure if your approach was the same or not.
I don't understand your solution very well. Can you explain it more clearly? And importantly, did you use the data leak in your solution?
Oh, thank you! So your solution was just a simple CNN with a regression output?
I tried this too but couldn't get it to work. Nice job!
@Ahmadmwali Oh, thank you! I have learned a lot from your new CNN idea (classification + regression). I appreciate it.