The data have been split into a training set and a test set. The training set, on which you will train your models, contains 353 x-ray images of TB-positive lungs and 365 x-ray images of healthy lungs. The test set contains 82 images.
You are asked to build a machine learning model that predicts the likelihood that the lungs in an x-ray are TB-positive. In your submission file, LABEL=1 means the x-ray is TB-positive. Please keep your predicted values as probabilities (values between 0 and 1).
Files available for download
- Train.zip: the images you will use to train your model.
- Test.zip: the images on which you will apply and test your model.
- SampleSubmission.csv: an example of what your submission file should look like. The order of the rows does not matter, but the names of the IDs must be correct.
- test.csv: a list of all test file names and their IDs.
- train.csv: a list of all train file names and their IDs.
We realize the image files may be too large for some people. To make the dataset easier to use, we offer you the option of working with reduced-resolution (800 pixels each) versions of the train and test images. This may reduce the accuracy of your solutions slightly. These can be downloaded as:
- Train_small.zip
- Test_small.zip
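As a rough illustration of the submission requirements described above, here is a minimal Python sketch that writes one row per test image with a probability in the LABEL column. The header names `ID` and `LABEL` are assumptions based on the description; check SampleSubmission.csv for the exact layout. The function name `write_submission` and the example IDs are hypothetical and only serve the illustration.

```python
import csv
import io


def write_submission(ids, probabilities, out_file):
    """Write (ID, LABEL) rows, where LABEL is the predicted probability of TB.

    Assumption: the submission header is "ID,LABEL". Verify this against
    SampleSubmission.csv before submitting.
    """
    ids = list(ids)
    probabilities = [float(p) for p in probabilities]

    if len(ids) != len(probabilities):
        raise ValueError("ids and probabilities must have the same length")
    if len(set(ids)) != len(ids):
        raise ValueError("each ID may appear only once")

    writer = csv.writer(out_file)
    writer.writerow(["ID", "LABEL"])
    for image_id, p in zip(ids, probabilities):
        # Values must stay probabilities, so reject anything outside [0, 1].
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for {image_id}: {p}")
        writer.writerow([image_id, p])


# Example with placeholder IDs (not real IDs from test.csv).
buffer = io.StringIO()
write_submission(["example_id_1", "example_id_2"], [0.87, 0.12], buffer)
print(buffer.getvalue())
```

To write a real file, open it with `open("submission.csv", "w", newline="")` and pass the file object as `out_file`, taking the IDs from test.csv. Row order does not matter, but every ID must match test.csv exactly.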