Hi guys, this notebook uses only the data in train.csv and test.csv. It is a baseline approach that scores 1.075 without hyperparameter tuning, k-fold cross-validation, or much feature engineering.
Please star my repo and follow me on GitHub so that I can keep sharing notebooks.
Here is the link to the notebook.
https://github.com/mkm-world/laduma-analytics/blob/main/zindi_football_baseline_public.ipynb
What's the intuition behind the preprocess function? Could you please explain?
Preprocessing the data is the first stage. Check out the getting started notebook (in the Data tab); it has good samples and explanations.
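As a rough illustration only, and not the actual preprocess function from the notebook linked above, here is a minimal sketch of what a preprocessing step commonly does with pandas: fill missing values and turn text columns into numbers, applied to train and test together so both get the same encoding. The function name `preprocess_features` and the columns `team` and `shots` are made up for the example.

```python
import pandas as pd


def preprocess_features(train: pd.DataFrame, test: pd.DataFrame):
    """Generic cleanup sketch; pass feature columns only (no target)."""
    n_train = len(train)
    # Stack train and test so both are cleaned and encoded the same way.
    full = pd.concat([train, test], ignore_index=True)

    # Fill missing numeric values with the column median.
    num_cols = full.select_dtypes(include="number").columns
    full[num_cols] = full[num_cols].fillna(full[num_cols].median())

    # Replace each non-numeric column with integer category codes.
    cat_cols = full.select_dtypes(exclude="number").columns
    for col in cat_cols:
        full[col] = full[col].fillna("missing").astype("category").cat.codes

    # Split back into the original train and test parts.
    train_out = full.iloc[:n_train].reset_index(drop=True)
    test_out = full.iloc[n_train:].reset_index(drop=True)
    return train_out, test_out


# Hypothetical toy data standing in for train.csv and test.csv features.
train_df = pd.DataFrame({"team": ["A", "B", None], "shots": [1.0, None, 3.0]})
test_df = pd.DataFrame({"team": ["B", "C"], "shots": [None, 4.0]})
train_clean, test_clean = preprocess_features(train_df, test_df)
```

The intuition is that most models need a fully numeric table with no gaps, and that train and test must go through identical steps, otherwise the same team could end up with different codes in the two files.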