Southern Malawi experienced major flooding in 2015 and again in 2019 with Cyclone Idai. The approximate dates of impact are 13 January 2015 and 14 March 2019, respectively.
We have divided the map of southern Malawi into rectangles of approximately 1 km² each. Each rectangle has a unique ID and has been assigned a "target" value, which is the fraction of that rectangle that was flooded in 2015.
For this competition, the training data is the flood extent in 2015 in southern Malawi. However, you are encouraged to source flood data for other nearby regions and other historic floods to train your model. (Just be sure to propose any new datasets that are not listed here to Zindi at email@example.com for approval.)
The test data to measure the accuracy of your model is the flood extent in southern Malawi in 2019.
Each unique rectangle also has some additional features that we have already extracted for you. Although we encourage you to add more yourself, these features are included as a starting point. They are:
Train.csv has the target variable for 2015, along with the above features (including rainfall for both the 2015 and 2019 flood events). The submission file should have the predicted target for 2019 and cover the same locations as Train. The X, Y coordinates given represent a rectangle 0.01 degrees on each side, centered on that X-Y location.
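As a rough illustration of the geometry described above, the sketch below turns a rectangle's center into its bounding box and estimates its size in kilometers. It assumes that X is longitude and Y is latitude in decimal degrees, and it uses the common approximation of about 111.32 km per degree of latitude; the function names are hypothetical and not part of the competition files.

```python
import math

# Assumption: roughly 111.32 km per degree of latitude (spherical approximation).
KM_PER_DEG_LAT = 111.32


def cell_bounds(x, y, size_deg=0.01):
    """Return (min_x, min_y, max_x, max_y) for a cell centered on (x, y)."""
    half = size_deg / 2.0
    return (x - half, y - half, x + half, y + half)


def cell_size_km(y, size_deg=0.01):
    """Approximate (width_km, height_km) of a cell centered at latitude y."""
    height = size_deg * KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    width = height * math.cos(math.radians(y))
    return width, height


# Example with a made-up center point at a latitude typical of southern Malawi.
bounds = cell_bounds(35.0, -16.0)
width_km, height_km = cell_size_km(-16.0)
```

Under these assumptions a cell at about 16 degrees south is a little over 1 km on each side, which is consistent with the "approximately 1 km²" description.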
The target is the fraction of the given rectangle that was flooded, with a value between 0 and 1.
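Because the target is bounded, a regression model may produce predictions slightly outside the valid range. A minimal sketch of one way to handle this is to clip predictions to the interval from 0 to 1 before submitting. The helper name `clip_fraction` is hypothetical, and clipping is only one option, not a competition requirement.

```python
def clip_fraction(values, low=0.0, high=1.0):
    """Clamp each predicted value into the valid target range [low, high]."""
    return [min(max(v, low), high) for v in values]


# Made-up raw predictions, two of which fall outside the valid range.
raw_predictions = [-0.2, 0.35, 1.4]
clipped = clip_fraction(raw_predictions)
```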
In addition to the features we have provided in the train and test CSV, you are free to extract additional datasets and features from the sites listed below:
Think about features such as land cover, elevation and slope, and soil properties that affect how water moves through the environment. You may also use data on weather and rainfall leading up to and during the flooding.
Note that you cannot use images to detect the actual flood extent in the test data. In other words, this is not a computer vision challenge for identifying flooding. Any solution that uses models to detect flood extent from images of the 2019 flooding in southern Malawi will be disqualified. However, you may use imagery from before the flood events (it must be from at least one month before the flooding) to extract features you think might be useful to your model.
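The imagery cutoff can be checked programmatically. The sketch below uses the approximate impact dates given at the top of this page and assumes that "one month" can be read conservatively as 31 days; that interpretation and the function name `imagery_allowed` are assumptions, so confirm the exact rule with the organizers if a date is borderline.

```python
from datetime import date, timedelta

# Approximate impact dates stated in this description.
FLOOD_DATES = {2015: date(2015, 1, 13), 2019: date(2019, 3, 14)}

# Assumption: treat "at least one month before" as at least 31 days before.
MIN_GAP = timedelta(days=31)


def imagery_allowed(acquired, flood_year):
    """Return True if an image acquired on `acquired` predates the cutoff."""
    cutoff = FLOOD_DATES[flood_year] - MIN_GAP
    return acquired <= cutoff


# Example: an image from early February 2019 passes, one from March does not.
early_ok = imagery_allowed(date(2019, 2, 1), 2019)
late_ok = imagery_allowed(date(2019, 3, 1), 2019)
```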
Finally, you can also propose other publicly-available datasets or data sources to us. We will review and approve your proposals and add them to the official list of accepted datasets above.
To propose additional datasets, email firstname.lastname@example.org. New datasets will not be accepted after 8 May 2020.
Please note that you cannot use data from during or after the 2019 cyclones. If you are using rainfall data, you can use it for 18 weeks beginning 2 months before the 2019 cyclone.
Historic rainfall and temperature data
Malawi geospatial data
Other data sites
Please document all datasets used.