A particular field (14952, and I guess some other fields as well) has a mix of labels, the other label being 0. So when grouping by field, the label wasn't an integer; it was a value between 4 and 5.
I believe the line in the code that should have handled that is: data = data[data.label != 0]
The team might want to fix this in the starter notebook.
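To show what I mean, here is a small sketch with made-up data. The column names other than label and the second field ID are placeholders, and I am assuming the per-field label is computed with a mean, which would explain a value between 4 and 5.

```python
import pandas as pd

# Toy data: field 14952 has nine pixels labelled 5 and one labelled 0.
# field_id and the second field (20001) are made up for illustration.
data = pd.DataFrame({
    "field_id": [14952] * 10 + [20001] * 2,
    "label": [5] * 9 + [0] + [3] * 2,
})

# Grouping before filtering averages the 0 into the label: 45 / 10 = 4.5
mixed = data.groupby("field_id")["label"].mean()

# Dropping the 0 labels first keeps the per-field label an integer value.
clean = data[data.label != 0]
fixed = clean.groupby("field_id")["label"].mean()

print(mixed.loc[14952], fixed.loc[14952])
```

So the filter has to run before the grouping step, not after it.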
Did you solve the mismatched dimension problem at: X = np.append(X, X_tile, axis=0)?
Yes, I found that the most tiles I can run before my Colab session crashes is 400.
So in this line: for tile_id in tile_ids_train, I ran it in batches of 400:
1. for tile_id in tile_ids_train[:400]
2. for tile_id in tile_ids_train[400:800]
and so on, then I combined my dataframes after running all of the batches.
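If you don't want to edit the slice by hand each time, the batching could be sketched roughly like this. process_tile is a hypothetical stand-in for whatever the notebook does per tile, and run_in_batches is just a name I made up.

```python
import pandas as pd

def process_tile(tile_id):
    # Hypothetical stand-in for the notebook's per-tile loading code.
    return pd.DataFrame({"tile_id": [tile_id], "value": [tile_id * 2]})

def run_in_batches(tile_ids, batch_size=400):
    # Process the tiles in slices of batch_size, then combine the results.
    batch_frames = []
    for start in range(0, len(tile_ids), batch_size):
        batch = tile_ids[start:start + batch_size]
        frames = [process_tile(tile_id) for tile_id in batch]
        batch_frames.append(pd.concat(frames, ignore_index=True))
    return pd.concat(batch_frames, ignore_index=True)

tile_ids_train = list(range(1000))  # placeholder IDs
combined = run_in_batches(tile_ids_train, batch_size=400)
print(combined.shape)
```

On Colab you would probably want to save each batch to disk before starting the next one, so a crash doesn't lose the earlier batches.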
Also, for the mismatch problem, it depends on what you are trying to do. Some tiles don't have data for all of the 41 time frames, so you might need an exception block to catch those tiles if you are trying to get all the time frames at once.
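A rough sketch of that exception block is below. tile_features, the band count, and the tile IDs are all made up for illustration; the only real thing here is that np.append with axis=0 raises a ValueError when the other dimensions don't match.

```python
import numpy as np

N_TIMES = 41  # time frames a complete tile should have
N_BANDS = 2   # assumed band count, for illustration only

def tile_features(n_pixels, n_times):
    # Hypothetical stand-in: one row per pixel, one column per (time, band).
    return np.ones((n_pixels, n_times * N_BANDS))

X = np.empty((0, N_TIMES * N_BANDS))
skipped = []

# Made-up tiles: tile "b" is missing three time frames.
tiles = {"a": 41, "b": 38, "c": 41}
for tile_id, n_times in tiles.items():
    X_tile = tile_features(3, n_times)
    try:
        X = np.append(X, X_tile, axis=0)
    except ValueError:
        # Column count differs from X, so record the tile and move on.
        skipped.append(tile_id)

print(X.shape, skipped)
```

Whether you skip those tiles or pad the missing time frames depends on your model, so treat the skip as just one option.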
So you split your data; that's a smart move.
Thanks for the help, I will try it. By the way, if you use Kaggle, it is faster than Colab.