This was a great challenge and I’m grateful to have achieved 1st place on the private leaderboard, also my first 1st-place finish on a private LB, which is awesome! Huge thanks to my teammate @wuuthraad and to the competition organizers for setting up such a challenging problem.
Our solution used LightGBM with carefully constructed temporal features to predict congestion across all required future horizons. One of the biggest challenges was the weak correlation between local CV and the public leaderboard, which meant trusting cross-validation heavily and tracking experiments and improvements thoroughly. Because of the strong class imbalance, we also relied on a two-stage balancing strategy: downsampling fully free-flowing sequences, then oversampling minority classes per fold during training, which optimized macro-F1 without leaking information into validation. A custom macro-F1 eval function handled early stopping.
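To make the two-stage balancing and the custom eval concrete, here is a minimal sketch of the idea (column names, the downsampling fraction, and the free-flow definition are illustrative, not the exact notebook code):

```python
import numpy as np
import pandas as pd
from sklearn.metrics import f1_score

def downsample_free_flow(df, label_col="congestion", frac=0.3, seed=42):
    """Stage 1: keep only a fraction of fully free-flowing (class 0) rows."""
    free = df[df[label_col] == 0]
    rest = df[df[label_col] != 0]
    return pd.concat([free.sample(frac=frac, random_state=seed), rest])

def oversample_minority(train_df, label_col="congestion", seed=42):
    """Stage 2: inside a fold's TRAIN split only, resample minority classes
    up to the majority count, so the validation split is never touched."""
    counts = train_df[label_col].value_counts()
    target = counts.max()
    parts = []
    for cls, n in counts.items():
        part = train_df[train_df[label_col] == cls]
        if n < target:
            part = part.sample(n=target, replace=True, random_state=seed)
        parts.append(part)
    return pd.concat(parts).sample(frac=1.0, random_state=seed)

def f1_macro_eval(preds, dataset):
    """Custom LightGBM eval so early stopping tracks macro-F1 directly."""
    y_true = dataset.get_label()
    if preds.ndim == 1:  # older LightGBM passes a flat, class-major array
        preds = preds.reshape(-1, len(y_true)).T
    y_hat = preds.argmax(axis=1)
    return "f1_macro", f1_score(y_true, y_hat, average="macro"), True
```

The key point is that oversampling happens per fold, after the train/validation split, so duplicated minority rows can never appear on both sides of the split.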
The training data is constructed using strict, gap-free 15-minute windows, grouped by camera, and only samples with all future targets present are retained after building out lag features. This enforces temporal integrity and fully respects the embargo and real-time inference constraints. I also experimented with sequence-based LSTM models trained on full 15-minute multivariate inputs to predict all future minutes jointly, but these did not outperform the per-horizon LightGBM models in terms of generalization.
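As a rough illustration of that windowing, the sketch below builds per-camera lag features, checks that every step in the lag window is strictly contiguous, and keeps only rows with all future targets present (column names, the one-minute cadence, and the window length are illustrative, not the exact notebook code):

```python
import pandas as pd

def build_windows(df, target_col="congestion", n_lags=15,
                  future_steps=(3, 4, 5, 6, 7),
                  group_col="view_label", time_col="timestamp"):
    """Per-camera lag features over strictly contiguous readings; only rows
    with a gap-free history and every future target present are kept."""
    df = df.sort_values([group_col, time_col]).copy()
    g = df.groupby(group_col)
    # lag features: the past n_lags readings
    for lag in range(1, n_lags + 1):
        df[f"{target_col}_lag{lag}"] = g[target_col].shift(lag)
    # future targets (these are dropped from X before training)
    for step in future_steps:
        df[f"{target_col}_t{step}"] = g[target_col].shift(-step)
    # gap check: each step back across the lag window must be exactly 1 minute
    contiguous = pd.Series(True, index=df.index)
    for lag in range(1, n_lags + 1):
        contiguous &= g[time_col].diff(lag) == pd.Timedelta(minutes=lag)
    has_all_targets = df[[f"{target_col}_t{s}" for s in future_steps]].notna().all(axis=1)
    return df[contiguous & has_all_targets]
```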
In addition, I explored video-derived features extracted using YOLO tracking (e.g. average, max, and standard deviation of vehicle counts across frames). These features consistently improved LSTM validation performance, but did not improve LightGBM results and did not generalize to the test set. In practice, the tree-based models appeared to generalize better by focusing on cleaner temporal signals, while the higher-capacity sequence models were more sensitive to the noise introduced by video-level features at test time.
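The clip-level aggregation itself was straightforward; roughly (a simplified sketch, assuming a list of per-frame vehicle counts from the tracker, with hypothetical feature names):

```python
import numpy as np

def aggregate_track_counts(per_frame_counts):
    """Collapse per-frame vehicle counts from a YOLO tracker into
    clip-level statistics (mean / max / std across frames)."""
    counts = np.asarray(per_frame_counts, dtype=float)
    return {
        "veh_count_mean": counts.mean(),
        "veh_count_max": counts.max(),
        "veh_count_std": counts.std(),
    }
```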
Please find the model training notebook and repository for our final solution at the link below:
Notebook: https://github.com/daniel-bru/Barbados-Traffic-Analysis-Solution/blob/main/modelling_v2.ipynb
Repo: https://github.com/daniel-bru/Barbados-Traffic-Analysis-Solution/tree/main
Repo & notebook notes: End-to-end experiments can be run with `modelling_v2.ipynb`. Each experiment is saved under `lgbm_training_history/`; to run a new one, just update the notebook and set a new experiment name in the notebook config.
Curious to hear others' approaches, and whether extracting features from the videos helped improve your models.
Thank you Daniel for sharing your solution. I really appreciate it.
I have one question though: doesn't creating future targets leak future information into the features?
def create_future_target_features(df, target_cols, future_steps=[3, 4, 5, 6, 7]):
    """Create future target features for prediction"""
    df_feat = df.copy()
    for col in target_cols:
        for step in future_steps:
            # shift(-step) pulls the value from `step` rows ahead, per camera view
            df_feat[f'{col}_t{step}'] = df_feat.groupby('view_label')[col].shift(-step)
    return df_feat
The future targets won't be there in production yes?
@Koleshjr, Thanks. That function is used to create the dataset for training, but when we actually start training to split X and y, all those targets are dropped from X. You will see that in the prepare_modeling_data() function👍
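In spirit, the split looks something like this (a simplified sketch, not the actual prepare_modeling_data(); column names follow the snippet above):

```python
import re
import pandas as pd

def split_X_y(df, target_col, step, id_cols=("view_label", "timestamp")):
    """Drop every future-target column (*_t3 ... *_t7) from X and keep only
    the single horizon being trained as y, so no future information
    reaches the features."""
    future_cols = [c for c in df.columns if re.search(r"_t\d+$", c)]
    y = df[f"{target_col}_t{step}"]
    X = df.drop(columns=future_cols + [c for c in id_cols if c in df.columns])
    return X, y
```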
I have actually seen that now. My bad. You used it for filtering.
Thanks again for sharing 🤝
Awesome, yes I used it to filter out what shouldn't be there and only point to the correct target for y
Can I ask which were the main features that led to the most improvement?
All the time-based features were crucial and worked well with the lag features. The most improvement came from a lower learning rate and the oversampling approaches.
overall it was a challenging but really great competition, I would like to thank @21db and the @zindi team for a great competition
@21db was the key driver behind the overall approach. We experimented with a wide range of modeling techniques, including deep learning, linear regression, and various ensembling strategies, to identify the most robust solution. As mentioned, during submission we observed a lack of correlation between our local cross-validation results and leaderboard performance. This was particularly evident when I was training a CatBoost model: although it significantly outperformed LightGBM in local CV, its public leaderboard performance was comparable to LightGBM, and its private leaderboard score dropped substantially relative to LightGBM. We also explored the inclusion of video-based features; however, these resulted in little to no improvement in overall performance. Based on this, @21db made the decision to focus primarily on LightGBM and invest effort into hyperparameter tuning, which in the end paid off.
Let's star this repo guys ⭐. Amazing work by the team @21db and @wuuthraad! Congratulations, and we really appreciate you sharing your code!
Thanks @CodeJoe