Fossil Demand Forecasting Challenge

Prize: $5 000 USD
Status: Completed
Type: Forecast
1009 joined · 200 active
Start: May 24, 22 · Close: Aug 28, 22 · Reveal: Aug 28, 22
Nelly43
Zindi
1st Place Solution
Notebooks · 20 Jan 2023, 11:53 · 8

This is an overview of the approach I took in my solution, which was ranked first following the top 5 challenge and review by Fossil. I may have to get confirmation from Zindi and/or Fossil as to whether I can publicly share the code, but I've tried to highlight some important components of the solution below. Many thanks to the organizers and Zindi for putting together this very interesting competition, and thanks to all for the challenge as well as the unique learning experience!

Overview

The solution is made up of several ensemble models comprising LightGBM (LGBMModel), XGBoost, and CatBoost models. A total of around 140 models are trained in just under 1 hour (not really sure here haha), with the predictions/forecasts from each ensemble model combined and averaged to give a CV score of 133538.77022.
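The averaging step can be sketched as follows. This is only an illustration, assuming a plain unweighted mean across models; the function name `average_forecasts`, the dictionary keys, and the numbers are all hypothetical and are not taken from the actual solution.

```python
from statistics import fmean


def average_forecasts(forecasts_by_model):
    """Return the element-wise mean of equally long forecast lists.

    `forecasts_by_model` maps a (hypothetical) model name to its forecasts,
    where position i refers to the same product in every list.
    """
    lengths = {len(f) for f in forecasts_by_model.values()}
    if len(lengths) != 1:
        raise ValueError("every model must forecast the same set of rows")
    # zip(*...) groups the i-th forecast of every model together.
    return [fmean(values) for values in zip(*forecasts_by_model.values())]


# Made-up numbers purely for illustration, not competition outputs.
forecasts = {
    "lightgbm": [10.0, 0.0, 4.0],
    "xgboost": [12.0, 3.0, 4.0],
    "catboost": [14.0, 0.0, 7.0],
}
blended = average_forecasts(forecasts)
print(blended)
```

An unweighted mean is the simplest way to combine models; a weighted blend would follow the same shape with one weight per model.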

Approach

This solution uses both time series analysis and causal (regression) modeling to identify patterns in the sales data in order to forecast demand for the various products four months into the future. The solution treats the data as a multivariate time series, with one series for each of the variables associated with sales of the products.

The solution is designed as a stacked ensemble model. The time series analysis and projection are carried out in the base model. Its forecasts, along with the original features, are then reshaped and fed into the meta learner as predictors for the target, which is the sellin for each product four months into the future.

Base Model

The base model learns the autoregressive as well as temporal patterns within the time series. As such, the base model uses past values to try and forecast demand at current or future time steps. The model is trained recursively to forecast future values one time step at a time, using forecasts from previous time step(s) as features for the current time step.
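The recursive idea can be sketched in a few lines. This is a minimal illustration, not the actual base model: `recursive_forecast` and `mean_of_lags` are hypothetical names, and the one-step model here is a simple stand-in (the mean of the last lags) where a gradient boosting regressor would normally go.

```python
def recursive_forecast(history, predict_next, n_lags, horizon):
    """Forecast `horizon` steps ahead, one step at a time.

    Each forecast is appended to the lag window, so later steps use
    earlier forecasts as features in place of unseen actual values.
    """
    if len(history) < n_lags:
        raise ValueError("history must contain at least n_lags values")
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y_hat = predict_next(window)
        forecasts.append(y_hat)
        # Drop the oldest lag and feed the new forecast back in.
        window = window[1:] + [y_hat]
    return forecasts


def mean_of_lags(lags):
    """Stand-in one-step model (assumption): average of the lag window."""
    return sum(lags) / len(lags)


history = [2.0, 4.0, 6.0]  # made-up values for one series
forecasts = recursive_forecast(history, mean_of_lags, n_lags=2, horizon=3)
print(forecasts)
```

Because each step consumes earlier forecasts, errors can compound over the horizon, which is one reason to keep the horizon short and validate on several periods.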

Meta Learner

The meta learner tries to establish direct relationships between future demand and the factors influencing it, i.e. sellout, inventory, etc. Parthasarathy (1994) suggests that critical factors related to demand need to be selected through analysis of past data, and their effect quantified and expressed in the form of mathematical equations. These factors are then projected forward and the forecasted values are used as predictors in the causal model. This is the concept underlying the meta learner, where sellout forecasts at the final time step are used as predictors for the sellin variable at the corresponding time step.
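One way to picture how the meta learner's inputs are assembled is the sketch below. It assumes one row per product, made of that product's original features plus the base model's sellout forecast at the final time step; `build_meta_rows`, the SKU names, and all values are hypothetical, and the real solution's reshaping may differ.

```python
def build_meta_rows(features_by_sku, sellout_forecasts_by_sku):
    """Append the final-step sellout forecast to each product's features.

    The resulting rows would be the predictors for the sellin target
    at the same (final) time step.
    """
    rows = {}
    for sku, features in features_by_sku.items():
        # Only the last forecast step lines up with the target horizon.
        final_step_forecast = sellout_forecasts_by_sku[sku][-1]
        rows[sku] = list(features) + [final_step_forecast]
    return rows


# Made-up example: two original features per product and a
# four-step sellout forecast from the base model.
features_by_sku = {"sku_a": [120.0, 35.0], "sku_b": [80.0, 10.0]}
sellout_forecasts_by_sku = {
    "sku_a": [50.0, 52.0, 55.0, 60.0],
    "sku_b": [20.0, 18.0, 17.0, 15.0],
}
meta_rows = build_meta_rows(features_by_sku, sellout_forecasts_by_sku)
print(meta_rows)
```

Any regressor could then be fit on these rows against the observed sellin four months ahead.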

Feature Engineering

This solution had very minimal feature engineering or fine-tuning, as I tried to focus more on precision and consistency, especially since the model was expected to hold for various periods throughout the year as well as account for product variation that may occur prior to retraining the model.

Features were extracted for each SKU separately, after which the model learned patterns across all 3000+ products simultaneously. This was to ensure the model generalizes well even in the case of new products. Moreover, walk-forward validation, along with the stacked ensemble models, helped keep the performance of the model in check and identify any seasonal trends that could have skewed performance.
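Walk-forward validation can be sketched as an expanding training window followed by a block of later periods held out for testing. The fold layout below is an assumption for illustration (the post does not describe the exact folds), and `walk_forward_splits` with its parameters is a hypothetical helper.

```python
def walk_forward_splits(n_periods, min_train, horizon):
    """Yield (train_indices, test_indices) pairs in time order.

    The training window always starts at period 0 and grows by
    `horizon` periods per fold; the test block is the `horizon`
    periods immediately after it, so no fold trains on the future.
    """
    train_end = min_train
    while train_end + horizon <= n_periods:
        train_idx = list(range(train_end))
        test_idx = list(range(train_end, train_end + horizon))
        yield train_idx, test_idx
        train_end += horizon


# Example: 12 monthly periods, at least 4 for training, 4-month test blocks.
splits = list(walk_forward_splits(n_periods=12, min_train=4, horizon=4))
for train_idx, test_idx in splits:
    print(len(train_idx), "train periods ->", test_idx)
```

Scoring every fold separately makes it easier to spot a period of the year where the model does noticeably worse.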

References

Brownlee, J. (2016, December 19). How To Backtest Machine Learning Models for Time Series Forecasting. Retrieved from Machine Learning Mastery: https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/

Chambers, J. C., Mullick, S. K., & Smith, D. D. (1971, July). How to Choose the Right Forecasting Technique. Retrieved from Harvard Business Review: https://hbr.org/1971/07/how-to-choose-the-right-forecasting-technique

Gujarati, D. N., & Porter, D. C. (2009). Time Series Econometrics: Forecasting. In D. N. Gujarati, & D. C. Porter, BASIC ECONOMETRICS (pp. 773-798). New York, NY: McGraw-Hill.

Hillier, F. S., & Lieberman, G. J. (2010). Markov Chains. In F. S. Hillier, & G. J. Lieberman, Introduction to Operations Research Ninth Edition (pp. 723-725). New York, NY: McGraw-Hill.

Hoseinzade, E., & Haratizadeh, S. (2019). CNNpred: CNN-based stock market prediction using a diverse set of variables. Journal of Expert Systems with Applications.

Lee, J. B. (2018, November 12). How to Develop Convolutional Neural Network Models for Time Series Forecasting. Retrieved from Machine Learning Mastery: https://machinelearningmastery.com/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting/

Parthasarathy, N. S. (1994). Demand forecasting for fertilizer marketing. Retrieved from FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS: https://www.fao.org/3/t4240e/T4240E00.htm#TOC

Sandmann, W., & Bober, O. (2010). STOCHASTIC MODELS FOR INTERMITTENT DEMANDS FORECASTING AND STOCK CONTROL. University of Bamberg, Germany.

Szabłowski, B. (2021, December 15). Sell Out Sell In Forecasting: Machine Learning for sales forecasting at Nestlé. Retrieved from Towards Data Science: https://towardsdatascience.com/sell-out-sell-in-forecasting-45637005d6ee

Discussion 8 answers
flamethrower

Big Congratulations Nelly! You have a stunning framework. Well deserved.

20 Jan 2023, 12:27
Nelly43
Zindi

Thanks so much, and congrats as well! You did excellent work throughout the competition, and definitely kept me on my toes haha.

skaak
Ferra Solutions

Wow Nelly, this is amazing. Thanks for sharing!

You know, if you have some computing power and you can program, why not throw 140 models at the data! Nice animations btw - thanks for the effort.

20 Jan 2023, 14:31
Nelly43
Zindi

Hi skaak! Glad you like it, and I hope it's helpful.

Hehe, they're pretty lightweight models actually; the entire solution runs on the free version of Colab!

skaak
Ferra Solutions

PS: perhaps one question - did you vary any features between all the models or did you just vary the seed?

20 Jan 2023, 14:41
Nelly43
Zindi

I only varied features between the base model and the meta learner; each variant of the base model and of the meta learner got the same set of features. The seed was also constant throughout the solution.

Raheem_Nasirudeen
The polytechnic ibadan

This is superb, congratulations once again.

21 Jan 2023, 07:33
Nelly43
Zindi

Thanks so much Raheem!