Uber Nairobi Ambulance Perambulation Challenge

Helping Kenya

$6 000 USD

Completed (~5 years ago)

Skills you will learn

Prediction

1091 joined

330 active

Start

Sep 17, 20

Jan 24, 21

Reveal

Jan 24, 21

Abdellam

Sharing Solutions And Methodology

Data · 25 Jan 2021, 17:47 · edited 21 minutes later · 7

Hello everyone,

Congrats to the winners of this challenge! Well done!

It would be great if the top-ranked teams (top 20, for instance, or more if you want :D) could share some insights into the solutions they implemented (or maybe a GitHub link :D).

On our side, we scored 44.13 (rank 42 on the final leaderboard).

We used an optimization algorithm (scipy) to minimize the challenge loss function, focusing on initialization, as we noticed that the results depended heavily on it.

We tried different initializations:

- Gaussian, based on the dataset distribution

- EM algorithm (Gaussian Mixtures)
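
For anyone who wants to try this kind of approach, here is a minimal sketch of the idea, not our actual code. Everything specific in it is an assumption made for illustration: the loss (sum of straight-line distances from each crash to its nearest ambulance, in raw coordinate units), the number of ambulances, the Nelder-Mead method, the synthetic crash points, and the helper names `placement_loss` and `fit_positions`.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.mixture import GaussianMixture

N_AMBULANCES = 6  # assumption for this sketch


def placement_loss(flat_positions, crashes):
    """Assumed loss: sum of distances from each crash to its nearest ambulance."""
    positions = flat_positions.reshape(-1, 2)
    # Pairwise distances, shape (n_crashes, n_ambulances).
    dists = np.linalg.norm(crashes[:, None, :] - positions[None, :, :], axis=2)
    return dists.min(axis=1).sum()


def fit_positions(crashes, n_ambulances=N_AMBULANCES, seed=0):
    """Initialize at Gaussian mixture means, then refine with scipy."""
    gmm = GaussianMixture(n_components=n_ambulances, random_state=seed).fit(crashes)
    init = gmm.means_.copy()
    result = minimize(
        placement_loss,
        init.ravel(),
        args=(crashes,),
        method="Nelder-Mead",  # derivative-free; the loss is not smooth
        options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-6},
    )
    return init, result.x.reshape(-1, 2)


# Synthetic stand-in for the historical crash locations (hypothetical data).
rng = np.random.default_rng(0)
crashes = rng.normal(size=(300, 2))

init_positions, final_positions = fit_positions(crashes)
print("loss at GMM init:", placement_loss(init_positions.ravel(), crashes))
print("loss after scipy:", placement_loss(final_positions.ravel(), crashes))
```

Changing `seed` (or swapping the mixture means for random draws) is a quick way to see how much the final loss depends on the starting point.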

Also, we did not manage to properly include the hour / day / month information in our clustering, as it seemed to lead to overfitting (same for the other variables, although we did not actually spend much time on them).

Did you manage to leverage other variables? Or is your solution based only on the "historical accidents data"?

Discussion 7 answers

luffy

Hi @Abdellam,

Thank you for sharing.

What is the EM algorithm, please?

25 Jan 2021, 18:02

Abdellam

EM stands for expectation-maximization. It is an iterative optimization algorithm that is used (among other use cases) to fit a clustering model called "Gaussian Mixtures" (the scikit-learn link: https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html)
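
As a rough illustration, here is a small sketch of a Gaussian mixture fitted with EM in scikit-learn. The two synthetic blobs and all parameter values are made up for the example and have nothing to do with the challenge data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
# Synthetic 2-D points from two well-separated blobs (illustrative data only).
points = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(150, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(150, 2)),
])

# fit() runs EM: the E-step computes soft assignments of points to components,
# the M-step re-estimates each component's weight, mean, and covariance.
gmm = GaussianMixture(n_components=2, random_state=0).fit(points)

responsibilities = gmm.predict_proba(points)  # soft assignments, rows sum to 1
labels = gmm.predict(points)                  # hard cluster labels

print("converged:", gmm.converged_, "after", gmm.n_iter_, "EM iterations")
print("component means:")
print(gmm.means_)
```

Unlike k-means, each point gets a probability of belonging to every cluster, and the fitted means can serve as starting positions for another optimizer.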

replied to luffy · 25 Jan 2021, 18:10

personnon

Hi,

I think many of the top places are down to randomness. For example, I jumped from 83rd to 7th place with a very simple solution that uses fixed positions for all cars at all times)))

26 Jan 2021, 08:41
