
Lacuna Masakhane Parts of Speech Classification Challenge

Helping Africa
$7 000 USD
Completed (over 2 years ago)
Classification
Natural Language Processing
472 joined
101 active
Start: Jun 08, 23
Close: Sep 17, 23
Reveal: Sep 17, 23
jpandeinge
University of Manchester
Public Score
Help · 22 Aug 2023, 12:28 · 10

My public score isn't improving even though my local score is. I checked that my model isn't overfitting, so I find this a bit odd: my local accuracy is around 0.68, but my public score can't get beyond 0.43. Any idea why this might be the case?

Discussion 10 answers
HungryLearner

Many factors affect how well local and public scores correlate.

For this particular challenge, you are not given a training dataset for the expected test languages, so we can't be sure how the local score relates to the public one.

The local score depends on what you validate your model with. That may be the dev split of the language(s) you trained on, or a separate set of languages held out for validation and excluded from training.

Both approaches have their merits and demerits for this kind of out-of-domain challenge, but neither is guaranteed to give a correlated public score, because the validation languages are not exactly the test languages.

It boils down to recalibrating your experimental setup and deciding logically how you perform local validation. You may be lucky enough to find a setup that correlates with the expected final (private) score.
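As a rough illustration of the second validation strategy described above (holding out whole languages), here is a minimal sketch of a leave-one-language-out split. The `(language_code, sentence)` data layout, the function name `leave_one_language_out`, and the placeholder language codes are assumptions made for the example; they are not taken from the competition data.

```python
from collections import defaultdict


def leave_one_language_out(examples):
    """Yield (held_out_lang, train_idx, valid_idx), holding out one language per fold.

    `examples` is a list of (language_code, sentence) pairs. This layout is
    hypothetical and only serves the illustration.
    """
    by_lang = defaultdict(list)
    for idx, (lang, _sentence) in enumerate(examples):
        by_lang[lang].append(idx)

    for held_out in sorted(by_lang):
        valid_idx = by_lang[held_out]
        # Train on every sentence whose language is not the held-out one.
        train_idx = sorted(
            i for lang, idxs in by_lang.items() if lang != held_out for i in idxs
        )
        yield held_out, train_idx, valid_idx


# Toy data with placeholder language codes, not the competition files.
examples = [
    ("lang_a", "sentence 1"),
    ("lang_a", "sentence 2"),
    ("lang_b", "sentence 3"),
    ("lang_c", "sentence 4"),
    ("lang_c", "sentence 5"),
]

folds = list(leave_one_language_out(examples))
for held_out, train_idx, valid_idx in folds:
    print(held_out, "train:", train_idx, "valid:", valid_idx)
```

Each fold scores the model on a language it never saw during training, which is closer to the out-of-domain setting of the test set than validating on the dev split of a training language. Even so, there is no guarantee that the held-out languages behave like the actual test languages.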

Also note that a public score does not necessarily imply a well-performing model, as it is computed on only a segment of the given test set. Shakeups often occur when the private scores are revealed after the challenge deadline, leading to a lot of repositioning of participants on the leaderboard.
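To make the point about the public segment concrete, here is a small simulation on synthetic data. The 30% public fraction, the 500-item test set, and the 0.6 per-item accuracy are all assumptions chosen for illustration; the actual public/private split of this competition is not stated in the thread.

```python
import random


def accuracy(correct_flags, indices):
    """Fraction of correct predictions among the selected test items."""
    return sum(correct_flags[i] for i in indices) / len(indices)


def public_private_split(n_items, public_fraction, seed):
    """Randomly split item indices into a public and a private part."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = int(n_items * public_fraction)
    return indices[:cut], indices[cut:]


# Synthetic correctness flags (1 = correct prediction). These are made up,
# not results from any real submission.
data_rng = random.Random(0)
correct_flags = [1 if data_rng.random() < 0.6 else 0 for _ in range(500)]
full_score = accuracy(correct_flags, range(len(correct_flags)))

# Score the same fixed predictions under several hypothetical splits.
results = []
for seed in range(5):
    public_idx, private_idx = public_private_split(len(correct_flags), 0.3, seed)
    results.append(
        (accuracy(correct_flags, public_idx), accuracy(correct_flags, private_idx))
    )

print(f"full test set: {full_score:.3f}")
for pub, priv in results:
    print(f"public: {pub:.3f}  private: {priv:.3f}")
```

The predictions never change, yet the public and private numbers differ from split to split. A purely random split like this one probably understates the effect: if the public segment over-represents one test language, the gap could be larger.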

Happy coding !!!

22 Aug 2023, 13:08
Upvotes 3
jpandeinge
University of Manchester

That's what I was thinking, but I was worried because the final evaluation might be based on the score shown on the public leaderboard, even though I have models with higher local accuracy that didn't improve the public score.

Thanks for the clarity, cheers!

Hello, you need to set up proper CV.

Think about what you will be evaluated on, and see if your validation score reflects that. You need to trust your CV, but only if it is set up properly, and a gap this large is nowhere near a good sign. I don't know if you made your own baseline or used one of those shared so far, but if the latter, I strongly advise you to rethink it, as the techniques shared so far will not get you the shakeup you are hoping for.

22 Aug 2023, 14:28
Upvotes 2

Comment deleted since it makes no sense, sorry!

28 Aug 2023, 11:22
Upvotes 0

There are duplicates, but 80%?

I don't know if that was an exaggeration to make a more striking point or if that's what you actually measured from the competition data, but that's not what my teammate and I found. The techniques discussed in the paper focus more on having the same language in train and validation, as you said, but I would like to see how you can get 92% on this competition with their approach 🤔.

I also wonder how you realised that the labels are noisy. I don't personally speak any of the labelled languages, so I wasn't able to assess that, but maybe you do. What I did find from the labelled French data, though, is that the sentences don't make much sense. I don't know if that's the case in other languages...

PS: Congrats on the contrails comp top 20 :)

Nonsense again!

Hey, yes indeed, you are missing something, but that's all I can say without my teammate getting angry at me for sharing too much :)

As I said before, you shouldn't expect a shakeup to 0.70ish with the current LB you have.

Obviously you don't have to trust my word for it, but I'm sure my LB neighbours would tend to align with what I'm telling you.

I think I get it. Thank you so much.

I will delete my comments as they make no sense, haha, and will confuse others.

Again, thanks and good luck!

No problem at all, we are all in this to learn.

Good luck too!

Seeing your new score, glad you understood what I meant :)