Congrats to the winners, and thanks to DeepMind and @Zindi for hosting. I had a great time in this competition. I noticed, however, that many folks were unable to cross over to 0.9+, so I made a simple notebook that takes you to 0.94+ on the private LB, with ideas for improvement using simple classification. Feel free to access it here: https://github.com/osinkolu/Turtle-Recall-Conservation-Challenge
Great post. Thanks
Welcome aninda_bitm
Thanks for the share @Professor ! Can I ask how long this model takes to train?
Hi @DanielBruintjies, it depends on your hardware infrastructure, the whole notebook took about 200 minutes to run on Colab pro with a Tesla P100.
Ah okay, thanks, and congrats on your strong finish!
Congrats, and thanks for sharing! I can see that dropping the low-count classes had a net positive effect.
Yes, it did. There were too many classes in the extra data without enough images to learn from properly, so it was best to leave them out.
Thanks for sharing. Did you train your model over all 2000+ classes?
Nah, the model for this sample notebook only saw about 405 unique classes, since I cut out classes with fewer than 7 images.
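For anyone curious how that filtering step might look, here is a minimal sketch in pandas. The column names (`image_id`, `turtle_id`) and the toy data are assumptions for illustration, not the notebook's actual schema:

```python
import pandas as pd

# Hypothetical training metadata: one row per image, labelled with its turtle ID.
labels = pd.DataFrame({
    "image_id": [f"img_{i}" for i in range(12)],
    "turtle_id": ["t_a"] * 8 + ["t_b"] * 3 + ["t_c"] * 1,
})

MIN_IMAGES = 7  # classes with fewer images than this get dropped

# Count images per class and keep only the well-represented IDs.
counts = labels["turtle_id"].value_counts()
keep_ids = counts[counts >= MIN_IMAGES].index
filtered = labels[labels["turtle_id"].isin(keep_ids)].reset_index(drop=True)

print(len(filtered))                     # 8 images survive the filter
print(filtered["turtle_id"].nunique())   # 1 class survives the filter
```

With the real extra data, the same cut reduces 2000+ classes to roughly the 405 mentioned above.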
Wait, what?! The data I used only had 100 unique classes??
My bad, got what you mean. The external data had 2000+ classes. Interesting, that never crossed my mind.
Thanks for sharing @Professor. Elegant approach.
I'm wondering what happens at Cell Output 51: Prediction 5 seems to have a None rather than a turtle ID.
Did you fix this in your best submission, or might there be other instances of this?
@flamethrower, congratulations once again. Yes, in fact the best single submission had many NaN cells, because of the strategy I implemented: my code makes it impossible to have the same turtle twice on a row.
I took care of the NaN cells during ensembling, using other submissions to fill them in, besides taking the mode.
However, one thing I noticed is that filling the cells alone made almost no change in score. In fact, deleting prediction 5 entirely, and probably prediction 4, may not change the score at all. This is because the model's best guess is most likely in column 1, or in the worst case column 2; the rest are most likely wrong.
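The NaN-filling step above can be sketched with pandas alignment. The submission layout (an `image_id` column plus `prediction1`..`prediction5`) and the toy values are assumptions; only two prediction columns are shown to keep it short:

```python
import pandas as pd

# Hypothetical main submission with a missing cell in prediction5.
sub_main = pd.DataFrame({
    "image_id": ["a", "b"],
    "prediction1": ["t_1", "t_9"],
    "prediction5": [None, "t_4"],
})

# Another submission from the ensemble, with the same rows and columns.
sub_other = pd.DataFrame({
    "image_id": ["a", "b"],
    "prediction1": ["t_1", "t_9"],
    "prediction5": ["t_7", "t_4"],
})

# fillna with a DataFrame aligns on index and columns, so each NaN cell
# is filled from the matching cell of the other submission.
filled = (
    sub_main.set_index("image_id")
    .fillna(sub_other.set_index("image_id"))
    .reset_index()
)
print(filled.loc[0, "prediction5"])  # the gap is filled with "t_7"
```

Taking the mode across several submissions instead of a single donor works the same way, just with `pd.concat` and `DataFrame.mode` per cell.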
Thank you bro. Yes, from validation on my end, accuracy against ground truth for prediction 1 is around 0.9+, prediction 2 is between 0.2 and 0.5, and predictions 3-5 are below 0.1. However, you could get a different score on private if predictions 3-5 contain a correct turtle ID; public would probably stay the same.
Additionally, did you explore using thresholds to detect entirely new turtles not in the extra images/train database? Also, since you dropped turtle IDs with fewer than 7 images, the model can't take those into account at test time, even though they should be classified as new turtles.
Yes, true @flamethrower. I made a kind of tradeoff by only using classes with 7 samples or more. For the initial question: yes, I had notebooks where I used thresholds. One in particular scored up to 0.92 on private, where I used diminishing thresholds as the model predicts from prediction 1 to 5 for each row. The amazing thing is that no extra data was used in that notebook.
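A minimal sketch of what "diminishing thresholds" could mean, assuming softmax scores for the top-5 IDs of one test image; the IDs, scores, threshold values, and the `new_turtle` label are all illustrative, not the actual notebook's values:

```python
import numpy as np

# Hypothetical top-5 predicted IDs and their softmax scores for one image.
top5_ids = ["t_3", "t_8", "t_1", "t_5", "t_2"]
top5_scores = np.array([0.62, 0.14, 0.08, 0.05, 0.03])

# Diminishing thresholds: further down the ranking, we require less
# confidence before trusting the predicted ID over "new_turtle".
thresholds = [0.50, 0.20, 0.10, 0.05, 0.02]

final = [
    tid if score >= thr else "new_turtle"
    for tid, score, thr in zip(top5_ids, top5_scores, thresholds)
]
print(final)  # ['t_3', 'new_turtle', 'new_turtle', 't_5', 't_2']
```

The idea is that a low score even at rank 1 suggests the turtle is not in the training set at all, so the slot falls back to the new-turtle label instead of a shaky ID.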