I recently downloaded the train.csv file for this competition, but the isAccident column described in the variable definitions is missing.
I also saw a discussion saying that 'standing vehicle' etc. counts as an incident, in which case the target just becomes a vector of 1s.
My question: what is not an incident?
You are correct: train.csv contains only incidents, so a target built from it alone would be a vector of 1s. You need to create a DataFrame of all possible combinations of dates and road_segments, then merge the train data onto this grid (see the starter code). That gives you a much larger dataset in which only some of the observations (those that matched a row in the original train data) have target = 1; every other row has target = 0, and those are the non-incidents.
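For anyone who wants to see the grid-and-merge idea in code, here is a minimal pandas sketch. It is not the starter code: the column names `datetime` and `road_segment_id`, the hourly time slots, and the toy rows are all assumptions for illustration, so adjust them to whatever train.csv actually contains.

```python
import pandas as pd

# Toy stand-in for train.csv: every row is an incident.
# Column names and values are assumed for illustration only.
train = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2018-01-01 00:10", "2018-01-01 02:30", "2018-01-01 02:45"]
    ),
    "road_segment_id": ["A", "B", "B"],
})

# Snap each incident to its time slot (hourly slots are an assumption).
train["datetime"] = train["datetime"].dt.floor("h")

# Build the full grid: every time slot crossed with every road segment.
slots = pd.date_range(train["datetime"].min(), train["datetime"].max(), freq="h")
segments = train["road_segment_id"].unique()
grid = pd.MultiIndex.from_product(
    [slots, segments], names=["datetime", "road_segment_id"]
).to_frame(index=False)

# One row per (slot, segment) that had at least one incident, flagged with 1.
incidents = (
    train[["datetime", "road_segment_id"]].drop_duplicates().assign(target=1)
)

# Left-merge onto the grid; unmatched grid rows are the non-incidents.
full = grid.merge(incidents, on=["datetime", "road_segment_id"], how="left")
full["target"] = full["target"].fillna(0).astype(int)

print(full)
```

With the real data the grid gets large quickly (number of time slots times number of segments), so a coarser slot size is one way to keep it manageable.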
Ohhh. That makes sense. Thanks a lot!