Here's the format of Test.csv:
Id Word Language Pos
Id00qog2f11n_0 Ne luo
Id00qog2f11n_1 otim luo
Id00qog2f11n_2 penj luo
I don't plan on doing this myself, as it is probably a bit more technical than I have time for,
but I was wondering whether contestants are allowed to change the code to make use of the Language field?
I imagine that this would be a pretty useful bit of info. But there's this instruction: PLEASE DO NOT MAKE ANY CHANGES IN THIS SECTION
which sounds official, but only says 'Please'. It doesn't say you can't.
Hey, according to the competition description:
"It is important that only one solution be built for both languages as this is a step in creating a solution that can be applied to many different languages, instead of having to create a model for each language."
My understanding is that the model, or the ideas behind its training, will then be applied to bigger POS datasets and used for inference on languages it has not seen.
If you intend to do it by feeding the language to an FCNN as an input alongside the text embeddings, that's easily doable within 6 days. For new languages, though, there will be a problem, as those languages won't have a known representation in the model (I'm not talking about luo and tsn, but about the languages the model will be used on after the comp).
They also specifically ask us not to do it, and I think we should respect that.
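To make the unseen-language concern concrete, here is a rough sketch of what feeding the language in as an extra input could look like. Everything in it is an assumption for illustration: the function names, the 4-value placeholder embedding, and the "swa" code standing in for a language that was not in training. It is not the competition's starter code.

```python
# Hypothetical sketch: names and values are illustrative, not from the starter code.
KNOWN_LANGUAGES = ["luo", "tsn"]  # the two languages present in the competition data


def language_one_hot(language):
    """Return a one-hot vector over the languages seen in training."""
    return [1.0 if language == known else 0.0 for known in KNOWN_LANGUAGES]


def build_features(word_embedding, language):
    """Concatenate a word embedding with the language indicator."""
    return list(word_embedding) + language_one_hot(language)


embedding = [0.1, -0.3, 0.7, 0.2]  # placeholder values, not a real embedding

luo_features = build_features(embedding, "luo")
new_features = build_features(embedding, "swa")  # assumed unseen language code

print(luo_features)  # language slots mark "luo"
print(new_features)  # language slots are all zero: no known representation
```

For a language outside the training set, the language part of the input is all zeros, so the network gets a pattern it never saw during training. That is the generalisation problem described above.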