Beginner looking for group to join
Help · 23 Apr 2020, 09:05 · 6

Hi Everyone,

Hope you are keeping safe indoors. I am a newbie in the data science world, looking to start entering some competitions.

Recently I obtained a PGDip at UJ for Machine Learning and AI, which was mostly the theory and math behind the various algorithms. I am subscribed to DataCamp, use Simplilearn, and read a number of blogs. I do MOOC after MOOC.

I am now looking for a team I can join, so I can start doing some competitions under their guidance, because it is the only way to REALLY learn.

Looking forward to being adopted.

Discussion 6 answers

Hey, good day to you

I just started my data analysis course, and I'm looking for a mentor, someone to guide me through.

And please, if you know of anyone, do reply to me.

Thanks

23 Apr 2020, 09:52
Upvotes 0
WanjohiChristopher
Jomo Kenyatta University of Agriculture and Technology

Hello Neiloe, I am currently in the same situation. I think we could form a team.

23 Apr 2020, 11:28
Upvotes 0

Hi,

I'm in a similar situation. Self-taught using one course and several books. I'm looking for a team I can contribute to.

Thanks

23 Apr 2020, 12:40
Upvotes 0

Hi Neiloe,

Firstly, good on you for putting yourself out there. It's the first step in a whole new world of learning. I wanted to give some advice (if I may) about how to best utilise this platform.

For a Data Scientist / ML enthusiast, Zindi (and most competitive data science platforms) is a very good environment to apply techniques that you have already learnt, or experiment with new ones. Although competing and getting yourself up on that leaderboard is the ultimate goal here, it's not the best way for a beginner to learn in my opinion.

How to proceed then? Well, I think the best way to start is to pick a type of problem you think you know something about - let's say a tabular (not time-series) regression problem. That's probably the easiest to start with. Then go and see which of the methods you've learnt so far could be used to tackle that problem. Don't just go for the ones that are kicking ass on Kaggle (yep, talking about boosted trees). Rather, go for the methods you understand, because you will learn the most by really trying to understand why your predictions come out the way they do.

Once you've chosen a type of problem you want to focus on, and have a set of techniques in mind you want to try out, pick a competition. You can start with a knowledge one or a live one, it doesn't matter, but try not to worry too much about getting the best score (that will come later). What I suggest you focus on is three things:

  1. Data exploration. @DrFad has shown us all in a couple of competitions that great features beat complex ensembles. And the only way to engineer great features is to understand the data, and to practice doing it. Remember, the point is not to make pretty plots. The point is to understand the relationships between the variables you have and the variables you want to model.
  2. Validation strategies. In real-world projects this is super critical, because you can realistically only put one (or maybe two or three) models into production, and to do that you need to measure performance on the data you have using some validation strategy. Same thing on Zindi: you need to figure out what will likely give you good performance in general, and not just on the public leaderboard.
  3. Model interpretation. Once you're at the point where you are making predictions, inspect the residuals. Think about what that model is trying to do to the data and why it could be making the errors it's making. That will bring you back to data exploration. Residuals could be a good way to find new features.
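
To make points 2 and 3 a bit more concrete, here is a rough sketch of what that loop could look like. Everything in it is an assumption for illustration: the data is synthetic, the model is plain least squares, and the helper names (`kfold_indices`, `cross_validate`, and so on) are made up rather than taken from any competition or library. It runs a shuffled k-fold validation, collects the out-of-fold residuals, and then checks whether the residuals point to a missing feature.

```python
import numpy as np


def kfold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for i in range(n_splits):
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        yield train_idx, folds[i]


def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict with the coefficients returned by fit_linear."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def cross_validate(X, y, n_splits=5, seed=0):
    """Return per-fold RMSE and out-of-fold residuals (actual minus predicted)."""
    residuals = np.empty(len(y))
    rmses = []
    for train_idx, val_idx in kfold_indices(len(y), n_splits, seed):
        coef = fit_linear(X[train_idx], y[train_idx])
        pred = predict_linear(coef, X[val_idx])
        residuals[val_idx] = y[val_idx] - pred
        rmses.append(float(np.sqrt(np.mean((y[val_idx] - pred) ** 2))))
    return rmses, residuals


# Synthetic data, invented for this example: the target is linear in x0
# but quadratic in x1, so a purely linear model will miss part of it.
rng = np.random.default_rng(42)
X = rng.uniform(-2, 2, size=(500, 2))
y = 3.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(0, 0.1, size=500)

rmses, residuals = cross_validate(X, y)
print("baseline mean RMSE:", round(float(np.mean(rmses)), 3))

# Residuals that correlate strongly with a candidate feature suggest
# the model is missing that feature.
corr_before = np.corrcoef(residuals, X[:, 1] ** 2)[0, 1]
print("corr(residuals, x1^2):", round(float(corr_before), 3))

# Add the candidate feature and validate again with the same folds.
X_new = np.column_stack([X, X[:, 1] ** 2])
rmses_new, residuals_new = cross_validate(X_new, y)
print("mean RMSE with x1^2:", round(float(np.mean(rmses_new)), 3))
```

In a real competition you would swap in the actual data and whichever model you understand best. The part worth keeping is the habit: score on held-out folds rather than on the public leaderboard, then look at the out-of-fold errors to decide what to explore next.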

Now, teaming up is great and a fantastic way to learn, but you are also your own best teacher. Many of the top performers have shared their notebooks in the discussions (check out the last few hackathons for some really good ones), which is almost as good as having been on their team.

If you want to team up on a knowledge competition, let me know and hopefully we can both learn something ;-)

23 Apr 2020, 14:08
Upvotes 0

Wow. Thank you so much for the insightful information. I was really starting to get frustrated with myself, going through a lot of theory that would most likely make more sense if I actually tried something. You are right about Kaggle :D , I went into shock and decided to try a smaller platform.

I will surely take your advice, peruse a few notebooks and try to get going. I really do appreciate the advice.