Sendy Logistics Challenge
$7,000 USD
Predict the estimated time of arrival (ETA) for motorbike deliveries in Nairobi
23 August–25 November 2019 23:59
1114 data scientists enrolled, 433 on the leaderboard
How is Order ID generated?
published 16 Nov 2019, 08:16

Is the Order ID autogenerated, or was it assigned to the data just for the sake of this competition?

Could it have been autogenerated? I agree it looks a bit strange. Average age on the test set is higher than on the training set, while the mean Order ID is about the same. I experienced heavy information leakage, even with a pipeline and CV. If the test set comes after the train set, I should probably have used calendar folding (validation folds that always come later in time than the rows they are trained on). But it's too late now...
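For anyone wondering what calendar folding could look like, here is a minimal sketch, not anything the organisers prescribed. It assumes you have some per-row ordering key (a timestamp, or the numeric part of the Order ID if it really is auto-incremented). The function name `ordered_folds` and the toy keys are made up for illustration.

```python
def ordered_folds(keys, n_splits=3):
    """Yield (train_idx, val_idx) pairs in an expanding-window scheme.

    Rows are sorted by `keys`, so no validation row has a key earlier
    than any row in the training part of the same fold.
    """
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    block = len(order) // (n_splits + 1)
    if block == 0:
        raise ValueError("not enough rows for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_idx = order[: k * block]
        # The last fold absorbs any remainder rows.
        end = (k + 1) * block if k < n_splits else len(order)
        val_idx = order[k * block : end]
        yield train_idx, val_idx


# Toy ordering keys (hypothetical, not competition data).
keys = [5, 1, 4, 2, 3, 0, 7, 6]
for train_idx, val_idx in ordered_folds(keys, n_splits=3):
    print(train_idx, val_idx)
```

If the test set really does come after the train set, scores from folds like these should be closer to the leaderboard than scores from shuffled K-fold. If the split was random, shuffled folds are fine.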

edited 10 minutes later

Yes, it could be autogenerated, in the sense of an auto-increment column in the database. If that were the case, you could extract some date features from it, since a larger ID would mean a later order. However, I realized it might have been assigned just for the sake of the competition.

Also, if the Order ID is anything to go by, you can see that the training and test sets were randomly sampled. In fact, they appear to have been deliberately sampled to contain nearly the same number of outliers.
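A rough way to check this yourself is to compare the numeric part of the Order ID across the two sets. The sketch below assumes IDs end in an integer (something like `Order_No_4211`); that format, the helper names, and the sample IDs are assumptions for illustration, not taken from the competition files.

```python
import re
from statistics import median


def order_number(order_id):
    """Return the trailing integer of an ID such as 'Order_No_4211' (assumed format)."""
    match = re.search(r"(\d+)$", order_id)
    if match is None:
        raise ValueError(f"no trailing number in {order_id!r}")
    return int(match.group(1))


def id_summary(order_ids):
    """Summarise the numeric IDs by their minimum, median and maximum."""
    numbers = sorted(order_number(x) for x in order_ids)
    return {"min": numbers[0], "median": median(numbers), "max": numbers[-1]}


# Made-up IDs for illustration only.
train_ids = ["Order_No_12", "Order_No_480", "Order_No_951", "Order_No_230"]
test_ids = ["Order_No_35", "Order_No_512", "Order_No_897", "Order_No_301"]

print("train:", id_summary(train_ids))
print("test: ", id_summary(test_ids))
```

If both sets cover roughly the same ID range with similar medians, that is consistent with random sampling. If the test IDs sat almost entirely above the train IDs, a split by time would be the more likely explanation.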