Beat the Enigma Machine 📚

Beat the Enigma Machine by AI Hack

Helping Tunisia

TBD

Completed (over 3 years ago)

Skills you will learn

Natural Language Processing

126 joined

29 active


Start

Aug 29, 22

Aug 30, 22

Reveal

Aug 31, 22

About

The data contains phrases from a movie. All special characters and spaces have been removed.

There are ~56 000 phrases in train and ~2 500 in test.

The training dataset has 3 columns:

Plain_text: The original text from the transcript, with special characters and spaces removed
encrypted_text: The original text, with special characters and spaces replaced by X, encrypted using the Enigma machine
encryption_key: The encrypted message key used to encrypt and decrypt the phrase
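The two text columns above are prepared differently: one drops non-letter characters, the other replaces them with X before encryption. A minimal sketch of both rules follows. It assumes an uppercase A-Z alphabet and that every non-letter counts as a "special character"; neither detail is stated on this page, and the function names are hypothetical.

```python
import re

def strip_non_letters(text: str) -> str:
    """Drop every character outside A-Z (assumed rule for Plain_text)."""
    return re.sub(r"[^A-Z]", "", text.upper())

def replace_non_letters(text: str, filler: str = "X") -> str:
    """Replace every character outside A-Z with a filler letter
    (assumed rule applied before Enigma encryption)."""
    return re.sub(r"[^A-Z]", filler, text.upper())

phrase = "Hello, world!"
plain = strip_non_letters(phrase)     # "HELLOWORLD"
padded = replace_non_letters(phrase)  # "HELLOXXWORLDX"
```

Under this reading, the stripped and padded versions of a phrase can differ in length, which is worth checking against the actual data before aligning the two columns position by position.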

Submissions are evaluated with log loss against a one-hot encoding of the test set labels. We therefore ask you to submit the raw probabilities output by your model: for a given ID, each row represents a position in the sequence and each column holds the probability of one label (token).
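To make the evaluation described above concrete, here is a rough sketch of log loss computed against one-hot encoded characters. It assumes the labels are the 26 letters A-Z and that the score is averaged over sequence positions; the exact label set, averaging, and clipping used by the official scorer are not specified here, so treat this as an illustration only.

```python
import math

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed label set (one column per token)

def one_hot(text):
    """One row per position in the sequence, one column per label."""
    return [[1.0 if ch == label else 0.0 for label in ALPHABET] for ch in text]

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood over positions.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0)."""
    total = 0.0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            p = min(max(p, eps), 1.0 - eps)
            total -= t * math.log(p)
    return total / len(y_true)

target = one_hot("HELLO")

# A model that knows nothing: uniform probability over all 26 labels.
uniform = [[1.0 / len(ALPHABET)] * len(ALPHABET) for _ in "HELLO"]
uniform_loss = log_loss(target, uniform)  # equals log(26)

# A model that is certain and correct at every position.
perfect_loss = log_loss(target, target)   # close to 0
```

A uniform guess scores log(26), so any useful model should land well below that value under these assumptions.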

Files

Contains the target. This is the dataset that you will use to train your model.

Resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.

Shows the submission format for this competition, with the ‘ID’ column mirroring that of Test.csv. The order of the rows does not matter, but the values in the ‘ID’ column must be correct.
