SUA Outsmarting Outbreaks Challenge

robson_dsp

Discrepancies in Disease Occurrence Records for the Same Hospital and Time Period

Help · 19 Jan 2025, 00:28 · 2

Hello!

I know it's a bit late, but I would like to understand something. Look at the code below.

mask = train['ID'] == 'ID_704a38c1-35ca-4e81-ab81-02fcf41d1f72_9_2019_Diarrhea'

train.loc[mask, :]

Based on my understanding, I interpret the resulting dataset as follows:

In September 2019, the hospital behind the ID "ID_704a38c1-35ca-4e81-ab81-02fcf41d1f72_9_2019_Diarrhea", located at the coordinates given by "Transformed_Latitude" and "Transformed_Longitude", recorded 10.0 occurrences of diarrhea in one row and 0 occurrences in two other rows. This doesn't make sense to me! How can the same hospital report different counts of the same disease for the same time period?
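To make the situation concrete, here is a minimal sketch on made-up rows, not the competition file. The shortened IDs are hypothetical, and `Total` is only an assumed name for the case-count column, so swap in whatever the real column is called. It counts how many distinct values appear under each ID and keeps the IDs where the rows disagree.

```python
import pandas as pd

# Toy stand-in for the training data; the real file is not reproduced here.
# "Total" is an assumed name for the case-count column.
toy = pd.DataFrame({
    "ID": ["ID_a_9_2019_Diarrhea"] * 3 + ["ID_b_9_2019_Diarrhea"],
    "Total": [10.0, 0.0, 0.0, 4.0],
})

# Number of distinct counts reported under each ID.
distinct_counts = toy.groupby("ID")["Total"].nunique()

# IDs whose rows disagree on the count.
conflicting_ids = distinct_counts[distinct_counts > 1].index.tolist()
print(conflicting_ids)
```

In this toy frame only the first ID is flagged, because its three rows carry two different counts (10.0 and 0.0), which mirrors the pattern described above.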

Below is a code snippet that shows this happens multiple times.