Lacuna Masakhane Parts of Speech Classification Challenge 📚

Helping Africa

$7 000 USD

Completed (almost 3 years ago)

Skills you will learn

Classification

Natural Language Processing

470 joined

100 active

Start: Jun 08, 23

Close: Sep 17, 23

Reveal: Sep 17, 23

About

The training set of 19 languages is available at this repo: https://github.com/masakhane-io/masakhane-pos

Use this starter notebook to get started: https://github.com/masakhane-io/masakhane-pos/blob/main/train_pos.ipynb

The test set covers 17 parts of speech for Luo and 17 parts of speech for Setswana. Both of these languages are unseen in the training set.

You can read more about the dataset, and about some ideas that have worked in the past, in this paper (https://arxiv.org/pdf/2305.13989.pdf). However, you are encouraged to come up with your own methods.

Files

An example of what your submission file should look like. The order of the rows does not matter, but the values in the "ID" column must be correct.
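Because rows may appear in any order but every "ID" must match, it can help to compare the IDs as sets before uploading. The following is a minimal sketch, not part of the official challenge tooling. The "Pos" column name, the ID values, and the inline CSV strings are hypothetical placeholders; only the "ID" column name comes from the description above.

```python
import csv
import io
from collections import Counter


def read_ids(csv_text, id_column="ID"):
    """Return the values of the ID column from CSV content given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise ValueError(f"column {id_column!r} not found")
    return [row[id_column] for row in reader]


def compare_ids(sample_ids, submission_ids):
    """Compare IDs as sets, so the order of the rows is irrelevant."""
    counts = Counter(submission_ids)
    return {
        "missing": sorted(set(sample_ids) - set(submission_ids)),
        "unexpected": sorted(set(submission_ids) - set(sample_ids)),
        "duplicated": sorted(i for i, n in counts.items() if n > 1),
    }


# Hypothetical file contents; the real files define the actual IDs and columns.
sample_csv = "ID,Pos\nId_a_0,X\nId_a_1,X\nId_b_0,X\n"
shuffled_csv = "ID,Pos\nId_b_0,NOUN\nId_a_0,VERB\nId_a_1,PUNCT\n"
broken_csv = "ID,Pos\nId_a_0,VERB\nId_a_0,VERB\nId_c_9,NOUN\n"

sample_ids = read_ids(sample_csv)
report_ok = compare_ids(sample_ids, read_ids(shuffled_csv))
report_bad = compare_ids(sample_ids, read_ids(broken_csv))

print(report_ok)   # all three lists are empty: same IDs, different order
print(report_bad)  # lists the missing, unexpected, and duplicated IDs
```

To use this with real files, read each file's text (for example with `open(path, encoding="utf-8").read()`) and pass it to `read_ids`. A submission is consistent with the example file when all three lists in the report are empty.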

These are the text files in CSV format for Luo and Setswana.
