SUA Outsmarting Outbreaks Challenge

Helping Tanzania, United Republic of
$12 500 USD + AWS credits
Completed (~1 year ago)
Prediction
815 joined
395 active
Start
Dec 06, 24
Close
Jan 31, 25
Reveal
Feb 01, 25
Knowledge_Seeker101
Freelance
Cholera
Data · 19 Jan 2025, 14:18 · 10

Has anyone noticed how unusual the cholera instances are?

Discussion 10 answers

Yes, there are almost no cholera rows in the training data but a lot in the test data. The same goes for some places and locations.

In other words, for some rows we are supposed to make predictions for an unknown disease and an unknown place! I'm puzzled about the train/test split.
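If anyone wants to check this themselves, here is a minimal sketch of how the comparison could be done. It uses toy lists, not the competition data, and the category names (`Cholera`, `Other`) and the helper `category_shift` are hypothetical; with the real files you would pass in whatever column holds the disease or location.

```python
from collections import Counter


def category_shift(train_values, test_values):
    """Return {category: (train_share, test_share)} for every category in either split."""
    train_counts = Counter(train_values)
    test_counts = Counter(test_values)
    # Guard against empty inputs so the division never fails.
    n_train = len(train_values) or 1
    n_test = len(test_values) or 1
    return {
        cat: (train_counts[cat] / n_train, test_counts[cat] / n_test)
        for cat in sorted(set(train_counts) | set(test_counts))
    }


# Toy values for illustration only (assumed, not taken from the competition files).
train_disease = ["Other"] * 9 + ["Cholera"] * 1
test_disease = ["Other"] * 6 + ["Cholera"] * 4

shift = category_shift(train_disease, test_disease)
for cat, (train_share, test_share) in shift.items():
    print(f"{cat:10s} train={train_share:.2f} test={test_share:.2f}")
```

A category whose test share is far above its train share (or that is missing from train entirely) is the kind of mismatch described above.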

19 Jan 2025, 17:45
Upvotes 1
Knowledge_Seeker101
Freelance

Yeah, that's very confusing. How are we going to train a model to predict cholera if we have almost no cholera instances to train it on?

Koleshjr
Multimedia university of kenya

I think the test set gives you hints on how to handle the cholera class.

Knowledge_Seeker101
Freelance

Thanks for the insight.

51pegasi

You can get a hint from the test set. The tricky part is that the public leaderboard does not seem to score the cholera rows.

20 Jan 2025, 15:39
Upvotes 0
Knowledge_Seeker101
Freelance

You mean the public leaderboard is not evaluating the correctness of the cholera predictions?

51pegasi

Yes

I wouldn't worry too much about cholera, as no model will be able to predict out-of-distribution values very well.

24 Jan 2025, 17:14
Upvotes 1
Knowledge_Seeker101
Freelance

Yeah, that's true. The dataset is heavily skewed.

Thank you for highlighting this. I think either there was a cholera outbreak, or the reporting regulations changed and every facility was required to include cholera in its monthly reports.

28 Jan 2025, 07:58
Upvotes 0