
Beat the Enigma Machine by AI Hack

Helping Tunisia
TBD
Completed (over 3 years ago)
Natural Language Processing
126 joined
29 active
Start
Aug 29, 22
Close
Aug 30, 22
Reveal
Aug 31, 22
About

The data contains phrases from a movie. All special characters and spaces have been removed.

There are ~56 000 phrases in train and ~2 500 in test.

The training dataset has 3 columns:

  • Plain_text: Original text in the transcript with special characters and spaces removed
  • encrypted_text: The original text with special characters and spaces replaced by X and encrypted using the enigma machine
  • encryption_key: The encrypted message key used to encrypt and decrypt the phrase
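As a rough illustration of the two preprocessing variants described in the column list, here is a minimal sketch. The function names are hypothetical, and the choices to uppercase the text and to treat everything outside A-Z (including digits) as a special character are assumptions; the competition page does not spell out the exact rules.

```python
import re

def strip_non_letters(text: str) -> str:
    """Drop every non-letter character (assumed Plain_text-style cleaning)."""
    return re.sub(r"[^A-Za-z]", "", text).upper()

def replace_non_letters_with_x(text: str) -> str:
    """Replace every non-letter with X (assumed cleaning applied before encryption)."""
    return re.sub(r"[^A-Za-z]", "X", text).upper()

phrase = "Hello, world!"
print(strip_non_letters(phrase))           # HELLOWORLD
print(replace_non_letters_with_x(phrase))  # HELLOXXWORLDX
```

Under these assumptions the two strings can differ in length, so it is worth checking the lengths of Plain_text and encrypted_text in the actual data before relying on a position-by-position alignment.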

Submissions are evaluated with log loss. Because the evaluation test set is one-hot encoded, you are asked to submit the raw probabilities output by your model: for a given ID, each row represents a position in the sequence and each column holds the probability of one label (token).
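The evaluation idea can be sketched as follows. This is only an illustration: the label set (A-Z), the helper names, and the choice to average the loss over positions are assumptions, since the exact scoring code is not shown on this page.

```python
import math
import string

ALPHABET = string.ascii_uppercase  # assumed label set: A-Z

def one_hot(text):
    """One row per position in the sequence, one column per label."""
    return [[1.0 if ch == label else 0.0 for label in ALPHABET] for ch in text]

def log_loss(targets, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true label at each position."""
    total = 0.0
    for t_row, p_row in zip(targets, probs):
        for t, p in zip(t_row, p_row):
            if t == 1.0:
                # Clip to avoid log(0) when a model assigns zero probability.
                total -= math.log(min(max(p, eps), 1.0))
    return total / len(targets)

target = one_hot("HI")
# A model that knows nothing spreads probability evenly over all labels.
uniform = [[1.0 / len(ALPHABET)] * len(ALPHABET) for _ in "HI"]
print(round(log_loss(target, uniform), 4))  # 3.2581, i.e. ln(26)
```

A uniform prediction gives a useful baseline: any model scoring above ln(26) under this scheme is doing worse than guessing.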

Files
  • Train.csv: Contains the target. This is the dataset that you will use to train your model.
  • Test.csv: Resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
  • Sample submission file: Shows the submission format for this competition, with the ‘ID’ column mirroring that of Test.csv. The order of the rows does not matter, but the ‘ID’ values must be correct.