Hey everyone! 👋
I noticed that many of you were looking for more insights after the webinar, so I decided to share a starter notebook that improves performance. This one achieves a 7.17 RMSE, which is a step up from the baseline. 🚀
🔗 https://www.kaggle.com/code/dukekojokongo/ibm-skillsbuild-improved-starter-notebook
✅ Feature engineering tricks to improve model performance
✅ Better handling of missing values and zero values from months 1 to 6
✅ Optimized model parameters for better forecasting
I hope this helps! Feel free to test it, tweak it, and share your thoughts. Let’s push for even better scores together! 💪🔥
Credits to @MuhammadQasimShabbeer for the uploaded dataset
Don't forget to upvote the notebook :)
Nice and neat notebook @CodeJoe, I appreciate your effort and help
Thank you for the kind words @Knowledge_Seeker101
Hi @CodeJoe! I'm still struggling to find the best strategies for validation and feature engineering, and your notebooks are helping a lot! I've tried to reproduce your example, but the "complete_data" dataframe in your notebook has 32906 rows, while the one I build from the raw data on the "Data" page has 136409 rows, roughly four times as many. Have you done any pre-processing, e.g. filtering out some date periods or some combinations of consumer_device/user_id?
@silvaemqap Yes, I basically dropped the zero values to get a better score with gradient boosting models. All of it is explained in the notebook.
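For anyone trying to reproduce the smaller row count, here is a minimal sketch of what dropping zero values could look like in pandas. The column name `target` and the toy dataframe are assumptions for illustration only; the notebook itself is the reference for the actual columns and filtering logic.

```python
import pandas as pd


def drop_zero_targets(df: pd.DataFrame, target_col: str = "target") -> pd.DataFrame:
    """Return a copy of df without the rows whose target is exactly zero.

    `target_col` is a hypothetical column name, not taken from the notebook.
    Rows with a missing (NaN) target are kept, since NaN != 0.
    """
    return df.loc[df[target_col] != 0].reset_index(drop=True)


# Toy data standing in for the competition dataframe (made-up values).
demo = pd.DataFrame(
    {
        "month": [1, 2, 3, 4],
        "target": [0.0, 5.2, 0.0, 7.1],
    }
)

filtered = drop_zero_targets(demo)
print(len(demo), "rows before,", len(filtered), "rows after")
```

Comparing `len()` before and after the filter is a quick way to check whether this step explains a row-count difference like the one described above.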
Good job @CodeJoe - I'm also trying to make peace with the massive gaps between CV scores and the public leaderboard in my experiments. This is the kind of competition that may give us another earthquake, just when I had fully recovered from the cassava root estimation one.
@silvaemqap If you haven't yet, you can try a sliding window validation and see how it turns out.
I understand your plight 😂. We hope for the best.
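In case it helps, here is a rough sketch of one way to build sliding window splits for time-ordered data. The function name, the window sizes, and the assumption that rows are already sorted by time are all mine, not something taken from the notebook or the competition setup.

```python
import numpy as np


def sliding_window_splits(n_samples, train_size, test_size, step=None):
    """Yield (train_idx, test_idx) pairs for a fixed-size sliding window.

    Assumes the rows are sorted by time. Each test block comes right after
    its training block, and the whole window moves forward by `step` rows
    (by default, one test block at a time).
    """
    step = test_size if step is None else step
    start = 0
    while start + train_size + test_size <= n_samples:
        train_end = start + train_size
        train_idx = np.arange(start, train_end)
        test_idx = np.arange(train_end, train_end + test_size)
        yield train_idx, test_idx
        start += step


# Hypothetical sizes: 10 time steps, train on 4, validate on the next 2.
splits = list(sliding_window_splits(n_samples=10, train_size=4, test_size=2))
for train_idx, test_idx in splits:
    print("train:", train_idx.tolist(), "test:", test_idx.tolist())
```

Because every validation block lies strictly after its training block, the score from each fold is closer to how the model is judged on future periods than a random shuffle split would be.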
Thank you very much @CodeJoe, you just made my day, I am happy. Your notebook is very neat and very comprehensible. You are a great data scientist, keep it up.
@RareGem Thank you for the kind words. I really appreciate it.