The objective of this challenge is to classify network activity from various websites as either cryptojacking or not, based on features related to both network-based and host-based data.
There are ~4,000 recordings in the test set and ~9,000 recordings in the training set.
The data for this challenge was obtained with the help of researchers from the Universidad del Norte whose work you can access here.
The data was originally collected by researchers from Instituto Politecnico Nacional whose work, including details on how the data was prepared, is available here. The data contains features related to network activity and a description of the variables is included in the Variable Definitions file.
Files
This is a starter notebook to help you make your first submission. If the file does not open correctly in your browser, press Ctrl+S to save it to your downloads folder.
Train contains the target. This is the dataset that you will use to train your model.
Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
Description of variables in the train and test data
This is an example of what your submission file should look like. The order of the rows does not matter, but the ID values must be correct.
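Because row order is ignored but IDs must match, it can help to compare the IDs in a submission against those in the test file before uploading. The following is a minimal sketch, not an official validator. It assumes the identifier column is named `ID` and uses a hypothetical `Label` prediction column and made-up rows. Check the sample submission file for the real column names.

```python
import csv
import io


def read_ids(csv_text, id_column="ID"):
    """Return the values of the ID column from CSV content given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[id_column] for row in reader]


def check_submission(test_csv_text, submission_csv_text, id_column="ID"):
    """Compare submission IDs with test IDs, ignoring row order.

    Returns a dict listing IDs that are missing from the submission,
    IDs that do not exist in the test set, and IDs that appear twice.
    """
    test_ids = set(read_ids(test_csv_text, id_column))
    seen, duplicated = set(), set()
    for row_id in read_ids(submission_csv_text, id_column):
        if row_id in seen:
            duplicated.add(row_id)
        seen.add(row_id)
    return {
        "missing": sorted(test_ids - seen),
        "unexpected": sorted(seen - test_ids),
        "duplicated": sorted(duplicated),
    }


# Hypothetical toy data; the real files have many more rows and columns.
test_csv = "ID,feature_a\nID_001,0.5\nID_002,1.2\nID_003,0.1\n"
submission_csv = "ID,Label\nID_003,0\nID_001,1\nID_002,0\n"

report = check_submission(test_csv, submission_csv)
print(report)  # {'missing': [], 'unexpected': [], 'duplicated': []}
```

To check real files, read their contents with `open(path, newline="").read()` and pass the resulting strings to `check_submission`. All three lists should be empty for a well-formed submission.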