
Lacuna Masakhane Parts of Speech Classification Challenge

Helping Africa
$7 000 USD
Completed (over 2 years ago)
Classification
Natural Language Processing
472 joined
101 active
Start: Jun 08, 23
Close: Sep 17, 23
Reveal: Sep 17, 23
Ethical thin-line clarification about "CV"
Data · 13 Sep 2023, 12:45 · 3

I am reading about "CV" vs "LB" in other posts.

I wasn't sure what cross-validation we were talking about... because in the absence of direct data for the target tasks, there is clearly no such thing as "CV"...

The only CV that makes sense in this competition would be in terms of using the LB scores to guide your training.

But at what point does using this informal evaluation technique "violate the spirit of the competition"?

Because everyone makes use of the information informally to guide their training, e.g. "Bambara brought my score down, next round I won't tune on Bambara".

But there's 5 days left, and I'm sitting at 10th place, and only the top 10 go forward.

As far as I can tell, there have been multiple discussions talking openly about formalising a CV play that makes sense in the competition, which I take to mean that using LB scores in your CV is probably fair game, because no one has said anything yet...

But if I algorithmically formalise a fine tuning CV evaluation based on my LB scores, I will be pushing anyone who doesn't do this out of the top 10, and out of the competition.

So are we allowed to use our LB scores in our CV evaluation? That is, can you use them to rigorously or systematically evaluate and fine-tune your model towards a higher score?

If yes, then ok I'll formalise the play this evening, and see you guys in the top 5 in a couple of days, and sorry to contestant #11 who didn't think to use their LB scores to game the system.

If not, then I think some of these top 5 models will need to be disqualified for making use of leaked LB score information in their evaluation.

It's my first data science competition, so I don't know where the ethical boundary line is. Informal use is inherent to the competition, yet systematising it seems unfair, and yet it's talked about freely and is now practically a necessity if I want to stay in the top 10.

Discussion 3 answers

I'm pretty sure there are some ways to get a robust CV in this comp, at least robust enough not to have to probe the LB as your CV. You are allowed to use the LB as an indicator of your improvements, don't worry. It's not unfair, since you are most likely going to shake (up or down?) given how small the data you are relying on is. Using the LB as your score is allowed because it's never reliable, so don't feel bad about it, and choose your final subs the way you want.
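
The reply does not say which CV scheme it has in mind. One common option when the target languages have no labelled data is to hold out one training language at a time and validate on it, so a rough sketch under that assumption (the function name, the `(language, sentence)` layout and the toy data are all hypothetical) could look like this:

```python
from collections import defaultdict


def leave_one_language_out(examples):
    """Yield (held_out_language, train, valid) splits, one per language.

    `examples` is a list of (language, sentence) pairs. This layout is an
    assumption made for illustration, not the competition's data format.
    """
    by_lang = defaultdict(list)
    for lang, sentence in examples:
        by_lang[lang].append(sentence)

    languages = sorted(by_lang)
    for held_out in languages:
        # Train on every language except the held-out one.
        train = [s for lang in languages if lang != held_out for s in by_lang[lang]]
        # Validate on the held-out language, which stands in for an unseen target.
        valid = list(by_lang[held_out])
        yield held_out, train, valid


# Toy placeholder data, not real competition sentences.
examples = [("bam", "s1"), ("bam", "s2"), ("yor", "s3"), ("hau", "s4")]
splits = list(leave_one_language_out(examples))
```

Each split treats one source language as if it were unseen, which gives a local score that does not depend on probing the LB.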

13 Sep 2023, 13:33
Upvotes 4
HungryLearner

In my view, discussions of CV vs. LB scores are mainly there to openly help guide ourselves collectively, as I believe challenges should not only be about who wins but also about learning collectively.

Providing your CV can help someone ponder why their own CV/LB correlation looks the way it does, and why whatever was shared by others can be useful for them as well. It's basically a choice whether to provide yours or not, and no rule is against any form of discussion on the platform. What is frowned upon is out-of-team collaboration, where people work together without being in a team on the leaderboard. In fact, you are free to openly share baseline code to guide others, but it is not advisable in the last few days of the competition.
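
For anyone who wants to check their own CV/LB correlation, a minimal sketch is below. It assumes you have logged a local CV score and the matching LB score for each submission; the `pearson` helper is written out by hand, and the scores in the usage lines are placeholders, not real results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Placeholder scores for illustration only, one entry per submission.
cv_scores = [0.60, 0.62, 0.65, 0.66]
lb_scores = [0.58, 0.61, 0.60, 0.64]
r = pearson(cv_scores, lb_scores)
```

A value of `r` near 1 suggests the local CV tracks the LB; a value near 0 suggests the two are measuring different things, although with only a handful of submissions the estimate is itself noisy.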

13 Sep 2023, 13:42
Upvotes 3

@level devil There are definitely ways to build a strong CV in this competition, robust enough that you don't have to rely solely on the leaderboard (LB) as your CV. It's perfectly fine to use the LB to gauge your progress; don't worry, it's not unfair. Given the small size of the dataset, you might see fluctuations in your ranking. Using the LB as an indicator for your score is acceptable because it’s inherently unreliable. So, don't feel bad about it—choose your final submissions based on what you feel is best.

24 Jul 2024, 03:48
Upvotes 0