Lacuna Masakhane Parts of Speech Classification Challenge

Helping Africa
$7 000 USD
Completed (over 2 years ago)
Classification
Natural Language Processing
472 joined
101 active
Start: Jun 08, 23
Close: Sep 17, 23
Reveal: Sep 17, 23
Making use of Language column
Data · 11 Sep 2023, 16:12 · 1

Here's the format of Test.csv:

Id              Word  Language  Pos
Id00qog2f11n_0  Ne    luo
Id00qog2f11n_1  otim  luo
Id00qog2f11n_2  penj  luo
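For anyone who wants to look at the Language column before deciding, here is a minimal sketch of reading rows in this shape with Python's standard csv module. It assumes the file is comma-separated with the header shown above, that Pos is empty in the test set, and that each Id is a sentence id followed by "_" and a token index. None of that is confirmed by the organisers, and the inline sample just mirrors the three rows above.

```python
import csv
import io
from collections import Counter

# Inline sample mirroring the three rows shown above.
# Assumption: the real Test.csv is comma-separated and Pos is empty.
sample = """Id,Word,Language,Pos
Id00qog2f11n_0,Ne,luo,
Id00qog2f11n_1,otim,luo,
Id00qog2f11n_2,penj,luo,
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Count how many tokens belong to each language.
language_counts = Counter(row["Language"] for row in rows)

# Assumption: an Id has the form "<sentence id>_<token index>",
# so splitting on the last underscore regroups tokens into sentences.
sentences = {}
for row in rows:
    sentence_id, token_index = row["Id"].rsplit("_", 1)
    sentences.setdefault(sentence_id, []).append((int(token_index), row["Word"]))

print(language_counts)
print(sentences)
```

To try it on the real file, replace io.StringIO(sample) with an opened Test.csv.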

I don't plan on doing it myself, as it is probably a bit more technical than I have time for, but I was wondering whether contestants are allowed to change the code to make use of the Language field.

I imagine this would be a pretty useful bit of info. But there's this instruction: "PLEASE DO NOT MAKE ANY CHANGES IN THIS SECTION", which sounds official but only says "please". It doesn't say you can't.

Discussion 1 answer

Hey, according to the competition description:

"It is important that only one solution be built for both languages as this is a step in creating a solution that can be applied to many different languages, instead of having to create a model for each language."

My understanding is that the model, or the ideas behind its training, will then be applied to bigger POS datasets and used for inference on languages it has not seen.

If you intend to do it by feeding the language to an FCNN as an input alongside the text embeddings, that's easily doable within 6 days. For new languages there will be a problem, though: the model won't have a known representation for them (I'm not talking about luo and tsn, but about the languages the model will be used on after the competition).
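To make the unseen-language problem concrete, here is a rough sketch of one way the language could be fed in: a one-hot vector appended to the word embedding. The names KNOWN_LANGUAGES, language_one_hot and build_input are hypothetical, the embedding values are placeholders, and "yor" only stands in for some language that was not in training. This is not the competition's starter code.

```python
# Languages seen during training (luo and tsn in this competition).
KNOWN_LANGUAGES = ["luo", "tsn"]


def language_one_hot(language):
    """Return a one-hot vector over the known languages.

    A language that was not seen in training matches no slot,
    so it maps to an all-zero vector.
    """
    return [1.0 if language == known else 0.0 for known in KNOWN_LANGUAGES]


def build_input(word_embedding, language):
    """Concatenate the word embedding with the language one-hot vector."""
    return list(word_embedding) + language_one_hot(language)


# Placeholder embedding; a real one would come from a text encoder.
embedding = [0.25, -0.5, 0.75]

luo_input = build_input(embedding, "luo")
unseen_input = build_input(embedding, "yor")  # hypothetical unseen language

print(luo_input)
print(unseen_input)
```

The network would only ever have trained on inputs where exactly one language slot is set, so the all-zero case for a new language is something it has never learned to handle.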

They also specifically ask us not to do it, and I think we should respect that.

12 Sep 2023, 00:25