
Lacuna Masakhane Parts of Speech Classification Challenge

Helping Africa
$7 000 USD
Completed (over 2 years ago)
Classification
Natural Language Processing
472 joined
101 active
Start: Jun 08, 23
Close: Sep 17, 23
Reveal: Sep 17, 23
Wow 0.7
Connect · 7 Aug 2023, 05:36 · 4

I think at this point, I have to relax because the idea of 0.7 is just too big for my brain. LOL. How did you guys get your notebooks that high? LOL

Discussion 4 answers
User avatar
isaacOluwafemiOg
Kwame Nkrumah University of Science and Technology

We'll probably have to wait till the end of the competition. I've learnt that winning solutions are published some time after the competition ends.

Also, the public score isn't always a true reflection of a model's performance. A model with a low public score may still outperform the 0.7 model, so don't give up on refining your model and making more submissions.
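(A quick sketch of what "local validation" means here: token-level accuracy over a held-out split you score yourself, instead of trusting the public leaderboard. The tags and sentences below are invented for illustration.)

```python
# Minimal sketch: token-level accuracy for POS predictions on a local
# held-out split. Sentences and tags here are made up for illustration.

def pos_accuracy(gold, pred):
    """Fraction of tokens whose predicted POS tag matches the gold tag."""
    total = correct = 0
    for gold_sent, pred_sent in zip(gold, pred):
        assert len(gold_sent) == len(pred_sent), "sentences must align token-by-token"
        total += len(gold_sent)
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
    return correct / total

gold = [["PRON", "VERB", "NOUN"], ["DET", "NOUN"]]
pred = [["PRON", "VERB", "ADJ"],  ["DET", "NOUN"]]
print(pos_accuracy(gold, pred))  # 4 of 5 tokens correct -> 0.8
```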

7 Aug 2023, 07:17
Upvotes 0

Hi kenyor, I'll help everyone out, because I am just in this for fun:

My current score of 0.48 is just from messing around with the train_pos.ipynb notebook, trying different fine-tunings with languages similar to the target languages.

Hint: They tell you in the paper how to get up to 0.7

7 Aug 2023, 10:07
Upvotes 1
User avatar
jpandeinge
University of Manchester

I have done it, but my current public score is lower. I don't think I am overfitting, which is a bit weird: I can't figure out why my public score isn't improving, since my local accuracy is around 0.68, yet my public score can't get beyond 0.43.

I've also now worked out adapters, but no combination I've used has scored above 0.44.

We don't have any target-language POS data to work with, so as far as I can tell, the only real validation is submitting to the public leaderboard. I haven't worked out how to extract the prediction per adapter, to see how each one contributes to my fusion layer. They don't mention a fusion layer in the paper.

But by turning them on and off, I can see what influence each adapter has on the POS results.
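That on/off probing is essentially an ablation loop: score the full adapter stack, then re-score with each adapter disabled and look at the drop. Here is a self-contained mock-up of that loop; `score_with_adapters` stands in for a real evaluation run, and every language code and score below is invented.

```python
# Hypothetical adapter-ablation sketch. score_with_adapters is a stand-in
# for running the POS model with a given set of language adapters active;
# the scores in MOCK_SCORES are invented for illustration.

MOCK_SCORES = {
    frozenset({"swa", "hau", "yor"}): 0.44,  # full stack
    frozenset({"hau", "yor"}): 0.41,         # without swa
    frozenset({"swa", "yor"}): 0.37,         # without hau
    frozenset({"swa", "hau"}): 0.43,         # without yor
}

def score_with_adapters(active):
    """Stand-in for evaluating the model with only `active` adapters on."""
    return MOCK_SCORES[frozenset(active)]

adapter_names = ["swa", "hau", "yor"]
full = score_with_adapters(adapter_names)
for name in adapter_names:
    ablated = score_with_adapters([a for a in adapter_names if a != name])
    # A large positive drop means that adapter is pulling its weight.
    print(f"without {name}: {ablated:.2f} (drop {full - ablated:+.2f})")
```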

In the answer to one of the questions, they explicitly allowed any unlabelled monolingual data, but the task adaptation seems limited to the source languages. So whatever your local accuracy is, it probably doesn't mean much.

The instructions in the paper are the following, but I don't think it's clear whether they add a fusion layer above the task and language adapters and train that:

(1) We train language adapters/SFTs using monolingual news corpora of our focus languages. We perform language adaptation on the news corpus to match the POS task domain, similar to (Alabi et al., 2022). We provide details of the monolingual corpus in Appendix E.

(2) We train a task adapter/SFT on the source language labelled data, using the source language adapter/SFT.

(3) We substitute the source language adapter/SFT with the target language adapter/SFT to run prediction on the target language test set, while retaining the task adapter.
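Read literally, step (3) is just a swap of the language component while the task component is held fixed (the MAD-X-style recipe). Here is a toy model of that composition in plain Python, with no adapter library; the class, field names, checkpoint name, and language codes are all illustrative, not the authors' code.

```python
# Toy model of the paper's steps: a "model" is a shared base plus a
# language adapter plus a task adapter; cross-lingual transfer swaps
# only the language part. All names here are illustrative.

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class AdapterStack:
    base: str      # shared pretrained encoder, e.g. an XLM-R checkpoint
    language: str  # language adapter/SFT from monolingual data (step 1)
    task: str      # task adapter/SFT trained on source labels (step 2)

# Steps 1-2: language adapter + POS task adapter on a source language
# (here "hau" as an example source, "ibo" as an example target).
source_model = AdapterStack(base="xlmr", language="hau", task="pos")

# Step 3: substitute the target language adapter, keep the task adapter.
target_model = replace(source_model, language="ibo")

print(target_model)  # AdapterStack(base='xlmr', language='ibo', task='pos')
```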