Check out M&M's winning approach to the Amini Canopy or Crop Challenge, where the goal was to classify Sentinel-2 pixels to distinguish between forest cover and canopy crops.
Their approach focused on building a simple yet powerful machine learning pipeline using XGBoost, a gradient boosting framework known for its speed, scalability, and predictive accuracy.
The first step was to prepare the data for modeling. M&M converted timestamp fields into datetime objects, which allowed them to extract meaningful temporal features. They then grouped the data by unique ID and computed five statistical aggregations per feature: mean, standard deviation, minimum, maximum, and median.
These statistics were calculated for both the training and test sets, creating a uniform feature space.
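The aggregation step can be sketched roughly as follows. This is a minimal illustration, not M&M's actual code: the column names (ID, time, B2, B8), the toy values, and the build_features helper are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data: several observations per ID.
# Column names and values are assumptions, not the challenge schema.
train = pd.DataFrame({
    "ID": ["a", "a", "a", "b", "b"],
    "time": ["2024-01-01", "2024-01-11", "2024-01-21", "2024-01-01", "2024-01-11"],
    "B2": [0.10, 0.20, 0.30, 0.40, 0.60],
    "B8": [0.50, 0.70, 0.90, 0.20, 0.40],
})

def build_features(df, bands=("B2", "B8")):
    """Collapse per-observation rows into one row of statistics per ID."""
    df = df.copy()
    # Parse timestamps so temporal features could be derived from them.
    df["time"] = pd.to_datetime(df["time"])
    df = df.sort_values(["ID", "time"])
    # Five aggregations per band, as described in the text.
    agg = df.groupby("ID")[list(bands)].agg(["mean", "std", "min", "max", "median"])
    # Flatten the (band, statistic) column MultiIndex into names like "B2_mean".
    agg.columns = [f"{band}_{stat}" for band, stat in agg.columns]
    return agg.reset_index()

features = build_features(train)
print(features)
```

Calling the same function on the test set would yield identically named columns, which is what gives both sets a uniform feature space.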
To handle missing values, they filled NaNs in both the train and test datasets with the mean value from the training set, which helped maintain consistency and avoid data leakage during inference.
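A small sketch of that imputation idea, assuming the features are held in pandas DataFrames. The frame names, column names, and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical aggregated features with some gaps.
train_X = pd.DataFrame({"B2_mean": [0.2, np.nan, 0.4], "B8_mean": [0.6, 0.8, np.nan]})
test_X = pd.DataFrame({"B2_mean": [np.nan, 0.9], "B8_mean": [0.1, np.nan]})

# Column means are computed on the training set only (NaNs are skipped by default).
train_means = train_X.mean()

# The same training means fill gaps in both sets, so no test statistics are used.
train_filled = train_X.fillna(train_means)
test_filled = test_X.fillna(train_means)
```

Reusing train_means for the test set is the key point: the test data never contributes to the values used for filling.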
For model evaluation, M&M used Stratified K-Fold Cross-Validation with n_splits=5, shuffle=True, and a fixed random_state=42. Stratification preserved the distribution of the target classes in every fold, so each validation score reflected the full class mix. This gave a more robust estimate of model performance and made overfitting easier to spot.
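To see what stratification does, here is a short sketch using scikit-learn's StratifiedKFold with the settings mentioned above. The imbalanced label array is synthetic and exists only for this example.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic, imbalanced labels (an assumption): 80 samples of class 0, 20 of class 1.
y = np.array([0] * 80 + [1] * 20)
X = np.zeros((len(y), 1))  # placeholder features; only the labels drive the split

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

fold_positive_rates = []
for train_idx, val_idx in skf.split(X, y):
    # Share of class 1 in this validation fold.
    fold_positive_rates.append(float(y[val_idx].mean()))

print(fold_positive_rates)
```

Every validation fold ends up with the same 20% share of class 1 as the full dataset, which a plain shuffled K-Fold would not guarantee.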
At the core of M&M's solution was an XGBoost Classifier, chosen for its excellent performance on tabular datasets. They iteratively tuned hyperparameters based on validation performance, refining the model until it achieved a local F1 score of 0.99.
Beyond F1, they also used XGBoost’s built-in feature importance scores to gain insights into which variables were contributing most to the model’s predictions. This helped with both interpretability and further feature refinement.
Thanks to a strong pipeline, careful validation, and the power of XGBoost, M&M's model delivered highly accurate and generalisable predictions, ultimately earning a winning spot on the leaderboard.
📍 You can view and run the full solution on the Data Page here.
Moetaz Bel Hadj Youssef and Mouna Mnejja are both second-year engineering students at Tunisia Polytechnic School. This marks Moetaz's 11th and Mouna's 8th challenge on Zindi. Passionate about applying data science to real-world problems, they’re using Zindi to build hands-on experience and grow their CVs, so that by the time they enter the industry, they’ll already have a track record of solving impactful challenges.
I think there is a mistake with the links. They direct us to the Amini soil prediction challenge instead of the notebook. Also, the notebook cannot be found when you go to the data page.
I hope this will be corrected soon. I really want to learn from their wonderful solution.
@Zindi @Amy_Bray
Oh no! Can you look now please.
Everything is looking good now. Thank you @Amy_Bray.