Hello Zindians,
I joined this challenge early this week and built a baseline. My local validation score was reasonable, but it didn't match what I got on the public leaderboard.
I started investigating this because I had seen many threads about incorrect metric calculation on the Zindi backend, until I found this thread https://zindi.africa/competitions/turtle-recall-conservation-challenge/discussions/9474 where @picekl asked, "would you mind sharing the script that handles the mAP calculation at the backend?" and @stigvp responded, "Hi Lukas, please see the latest version of the tutorial, which now includes such an example."
The problem is that the metric is implemented incorrectly in the starter notebook, and here's a notebook I made explaining the reasons in detail: https://colab.research.google.com/drive/1de_tHzS-rasM1sV0BmBWgjudYxFlKVxv?usp=sharing
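In short, since each image has exactly one true turtle ID, MAP@5 reduces to the mean of 1/rank over samples, where rank is the 1-based position of the true label in the top-5 predictions (a sample contributes 0 if the true label is absent). A minimal sketch of that simplification (the function name and data here are illustrative, not from the starter notebook):

```python
def map5_single_label(true_labels, predictions):
    """MAP@5 when each sample has exactly one ground-truth label:
    the mean of 1/rank, where rank is the 1-based position of the
    true label in the top-5 predictions (0 contribution if absent)."""
    total = 0.0
    for truth, preds in zip(true_labels, predictions):
        for rank, p in enumerate(preds[:5], start=1):
            if p == truth:
                total += 1.0 / rank
                break
    return total / len(true_labels)

print(map5_single_label(
    ["E", "B"],
    [["A", "E", "C", "D", "F"],   # "E" at rank 2 -> 1/2
     ["B", "A", "C", "D", "F"]],  # "B" at rank 1 -> 1/1
))  # 0.75
```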
The question now is: "Is Zindi using the same function to calculate our scores?" According to @stigvp's answer in @picekl's thread, "YES".
Could you confirm this also?
Cordially,
Hello Fadhloun,
Thanks for taking the time to debug this issue, and for the simplified MAP@5 formula.
In some of the unit tests for the starter notebook implementation, I noticed you're passing a list of actual labels to the apk function, e.g. actual = ["E"] * len(predicted).
If we only pass the label e.g. actual = "E", both functions seem to be identical.
Could you confirm?
Yes, I ran the same unit tests and it worked perfectly. I now wonder whether they are passing the true labels as a list or not; maybe that's the issue. Let's wait and see.
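To make the suspected difference concrete, here is a sketch assuming the widely copied Kaggle-benchmarks-style apk() (an assumption about what the starter notebook is based on; the labels and predictions below are made up). Passing the true label as a repeated list inflates the denominator, because the function divides by min(len(actual), k):

```python
def apk(actual, predicted, k=5):
    """Average precision at k for one sample (Kaggle-benchmarks style)."""
    if len(predicted) > k:
        predicted = predicted[:k]
    score = 0.0
    num_hits = 0.0
    for i, p in enumerate(predicted):
        # count a hit only the first time a prediction appears
        if p in actual and p not in predicted[:i]:
            num_hits += 1.0
            score += num_hits / (i + 1.0)
    if not actual:
        return 0.0
    return score / min(len(actual), k)

predicted = ["A", "E", "B", "C", "D"]   # true label "E" ranked 2nd

print(apk(["E"], predicted))                   # 0.5 -> expected for a single label
print(apk(["E"] * len(predicted), predicted))  # 0.1 -> denominator is min(5, 5) = 5
```

So if the backend passes actual as a repeated list rather than a single-element list (or a one-character string, for which len(actual) is also 1), every non-perfect score would be scaled down, which would explain a gap between local validation and the leaderboard.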
I am afraid the metric calculation in this competition actually works just fine.
I had a look at the most similar images found on the test set (for a model that performs pretty well on train), and to be honest they are far from perfect; the scores on the LB make sense (at least to me).
My feeling is that the images in train (+extra) and test differ in time, which makes sense, and models have a hard time generalizing across time.
This sounds like a pretty hard challenge to me.
https://zindi.africa/competitions/turtle-recall-conservation-challenge/discussions/10167
I'm afraid that if there is a problem, the public leaderboard is not representative: top-scoring participants could have a lot of errors in their predictions. I compared the top-1 predictions of my models with the ground truth, and the turtles are mostly the same.
This is the main reason why this task is difficult. It is less about the metric and more about the differences between the train and test sets.
I noticed this earlier and already asked @Zindi and @Deepmind about the key differences between the two sets. While they responded that there wasn't any major difference, I have reason to believe that isn't the case.
To validate this, I tested a good image-location classification model on a subset of the train set (a val set) and on the test set. The model performed quite well on the val set but very poorly on the test set, just as has been the case in this competition.
While I am not sure what the difference actually is, maybe temporal like you suggested, I believe the difficulty of this task is due to some sort of disparity between the train and test sets and not necessarily the metric.
I don't agree with that, since I checked a lot of my top-1 predictions very carefully, and @Fnoa did the same. The predictions are quite good. But in general it doesn't matter: the organizers and "experts" from Zindi are just ignoring all our discussions here. @amyflorida626
I agree with you. And there are only 6 days left in this competition, but we haven't heard a logical response from the @zindi team yet.
If there's a real issue, this challenge should be extended, and I think Zindi would agree with that, as they are seeking the best solutions for their customers, and during all this period there has been no real competition.