Sea Turtle Rescue: Error Detection Challenge

Helping Kenya
$1 950 USD
Challenge completed over 6 years ago
Classification
Anomaly Detection
321 joined
51 active
Start: Nov 30, 2018
Close: Apr 28, 2019
Reveal: Apr 29, 2019
About

About Local Ocean’s sea turtle rescue program

Since 1998, Local Ocean Conservation (LOC) has been running a bycatch net release program. Bycatch occurs when a non-target species, in this case endangered and critically endangered marine turtles, is accidentally captured in fishing gear. The turtles become tangled in nets and risk injury, drowning and even slaughter. What do we do about it? We work with around 350 local fishermen who, instead of slaughtering the turtles they catch or leaving them to die, contact us. We are then able to race to the rescue!

  1. We assess the turtle's condition, checking for parasites and injuries.
  2. We collect data such as measurements, weight, species, gender, etc.
  3. We attach a tag or record the number of an existing tag.
  4. We maintain a database of all recorded data.
  5. We monitor the condition of turtles that have been recaptured. Sometimes these are even ex-patients from our Rehabilitation Centre!

If the turtle is fit and healthy, we transport it to a safe release site where it is returned to the ocean. If the turtle is sick or injured, we transport it back to our Rehabilitation Centre for medical treatment and provide specialist care until it is strong enough to be released.

We provide a small amount of remuneration to help participating fishermen to repair any damage to their gear and cover any other expenses incurred such as telephone and transport costs.

Each year the number of fishermen involved in this programme increases along with the number of turtles released, a reflection of the success of the linked education and community development programmes.

Since 1998 over 10,000 turtles have been released through this programme.

The problem YOU will help us solve!

Each time a fisher catches a turtle, he or she delivers it to LOC researchers, who record the data on the rescue in a handwritten logbook. (Though we hope to introduce an electronic form of data collection soon.) Back in the office, LOC staff and volunteers then enter the handwritten data into an Access database. At the end of the year, LOC staff review each field of each rescue, comparing the database entry with the handwritten logbook to ensure the integrity of the data.

LOC would like to reduce the amount of time and effort needed for this review process. (And it is a lot!)

The objective of this challenge is to process the data in the turtle rescue database and assign a probability that any given field has been entered erroneously from the logbook into the database.

Note: if the entry for a particular field is "NA" in both the clean and dirty data, this should be assessed as no error.
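
As a rough illustration of how per-field error labels could be derived from a clean record and its dirty counterpart, here is a minimal sketch in Python. The field names (`Species`, `Weight_Kg`, `Tag`), the function names and the treatment of empty strings as "NA" are all assumptions made for this example and are not taken from the challenge files. The comparison is an exact string match after trimming whitespace, so it would flag "12.5" against "12.50" as an error.

```python
from typing import Dict, Optional

NA = "NA"  # missing-value marker used in the challenge description


def normalise(value: Optional[str]) -> str:
    """Map None, empty strings and any casing of "NA" to one marker (an assumption)."""
    if value is None:
        return NA
    text = str(value).strip()
    return NA if text == "" or text.upper() == NA else text


def field_error_labels(clean: Dict[str, str], dirty: Dict[str, str]) -> Dict[str, int]:
    """Return 1 for each field whose dirty entry differs from the clean entry, else 0."""
    labels = {}
    for field, clean_value in clean.items():
        dirty_value = dirty.get(field)
        # "NA" in both records normalises to the same marker, so it is labelled 0 (no error).
        labels[field] = int(normalise(clean_value) != normalise(dirty_value))
    return labels


# Hypothetical records: the weight was mistyped, the tag is missing in both.
clean_row = {"Species": "Green", "Weight_Kg": "12.5", "Tag": "NA"}
dirty_row = {"Species": "Green", "Weight_Kg": "21.5", "Tag": "NA"}
print(field_error_labels(clean_row, dirty_row))
```

Labels built this way could serve as the 0/1 targets for a model that outputs an error probability per field, which is what the objective above asks for.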

Files
Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
Train contains the target. This is the dataset that you will use to train your model.
This file describes the variables found in train and test.
An example of what your submission file should look like. The order of the rows does not matter, but the "ID" values must be correct.
All of the data from 1998 to 2010 (except for 2008 and 2009) as originally entered into the database from the handwritten logbooks.