22 Apr 2020, 13:34

Zindi solutions: A useful open-source model of urban air quality for Africa

In just a few days, the Zindi community has built a model that can accurately predict air quality in cities and towns across Africa, filling in the blanks where there are no air quality sensors.

While searching for data about air quality in Africa, we came across an image that neatly illustrates the problem of monitoring air pollution on the continent. This map of internet-connected air quality sensors around the world, from the World Air Quality Index (WAQI) project, inadvertently highlights why we need new approaches for tracking air quality in Africa.

The WAQI project maps ground-based air pollution sensors across the world, but its coverage of Africa is sparse. Source: WAQI Project.

This is backed up by recent reports from the University of Pretoria and UNICEF, which both conclude that we need better data to understand Africa’s air pollution situation before trying to improve it.

The case for better air quality data in Africa

Beyond general respiratory health issues, there is early evidence that air pollution affects COVID-19 mortality rates. This recent study (published on the medRxiv preprint server, which means it has not yet been peer reviewed) reports a statistical link between long-term exposure to air pollution and COVID-19 mortality.

Epidemiological, public health, and economic models all need air quality data for accurate outputs. The authors of the above study say that stricter social distancing and increased medical preparedness are required where pollution levels are higher.

While Zindi hackathons are designed to offer a challenge and learning opportunity to African data scientists, we’re also very interested in contributing practical, open-source AI solutions to help in the battle against COVID-19. So as soon as we got devnikhilmishra’s winning solution to the recent #ZindiWeekendz Urban Air Pollution Challenge (you can find it yourself on GitHub), we put it to the test. We wanted to see how well it predicted air quality in places where we have no ground-based pollution sensors.

Zindi's community rises to the challenge

The challenge was to build a model that could take in satellite data for a location and predict the air quality on the ground, as measured by ground-based sensors that record fine particulate matter (PM2.5, a common measure of air pollution). This air quality challenge ran from 10 to 12 April, and attracted over 200 data scientists across Africa and around the world.

We plotted the winning model’s predictions against the ‘true’ values from sensors in locations that were not included in the training set. You can see that the prediction (in orange) closely matches the sensor data (in blue) for a large city (London), but also for a smaller town in South Africa (Worcester, with a population of fewer than 100 000). This suggests that the model works well for both large and small urban centres.

Our air quality predictions accurately match actual data gathered from on-the-ground sensors in London and Worcester, SA. Source: Zindi.
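For readers who want to run a similar check, here is a minimal sketch of the idea: keep one location out of the training data, then score the predictions for that location against its sensor readings. The field names (location, pm25), the helper functions and the numbers are all made up for illustration, and root-mean-square error is simply one common way to summarise the gap between the two lines in a plot like the one above. It is not taken from the winning solution.

```python
import math


def split_by_location(rows, held_out):
    """Split records into training rows and rows from held-out locations."""
    train = [r for r in rows if r["location"] not in held_out]
    test = [r for r in rows if r["location"] in held_out]
    return train, test


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up sensor readings, for illustration only.
rows = [
    {"location": "A", "pm25": 12.0},
    {"location": "B", "pm25": 30.0},
    {"location": "C", "pm25": 18.0},
    {"location": "C", "pm25": 22.0},
]

# Location "C" is never shown to the model during training.
train, test = split_by_location(rows, held_out={"C"})
observed = [r["pm25"] for r in test]

# Stand-in for the output of a model trained on `train` only.
predicted = [17.0, 25.0]

print(round(rmse(predicted, observed), 3))
```

A lower score means the predictions sit closer to the sensor readings. Splitting by location, rather than at random, is what makes this a test of how the model behaves in a place it has never seen.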

A better picture of air quality in Africa

Now that we’ve checked that the model works as expected, we can put it to use! Since the model’s only inputs come from satellite data, we can apply this model to any location, even one that doesn’t have any ground-based sensors.

This lets us build up a picture of air quality for places where there was previously no data available. The map below shows our air quality model applied to major cities across Africa, predicting the air quality for a single day.

Predicted air quality for major cities in Africa for a single day (April 2, 2020). Source: Zindi.

We’ve borrowed the WAQI colour-coded air quality indicator system for the image above, but you’ll notice that this map of Africa carries far more air quality information than the WAQI map shown earlier. This is useful, usable information for policymakers, public health researchers and epidemiologists modelling the COVID-19 outbreak and its public health impacts.
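As a rough illustration of how a colour-coded indicator like this can be produced, the sketch below maps an index value to a label and a colour. It assumes the value is already on the 0 to 500 Air Quality Index scale and that the colour scheme follows the standard US EPA AQI bands. The function name aqi_band is ours, and none of this is taken from the winning solution or from the code behind the map.

```python
# Upper bound of each band on the US EPA AQI scale, with its usual label and colour.
AQI_BANDS = [
    (50, "Good", "green"),
    (100, "Moderate", "yellow"),
    (150, "Unhealthy for Sensitive Groups", "orange"),
    (200, "Unhealthy", "red"),
    (300, "Very Unhealthy", "purple"),
]


def aqi_band(aqi):
    """Return the (label, colour) pair for an AQI value (hypothetical helper)."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    for upper, label, colour in AQI_BANDS:
        if aqi <= upper:
            return label, colour
    # Anything above 300 falls in the top band.
    return "Hazardous", "maroon"


# Made-up index values for three unnamed cities, for illustration only.
for value in (42, 135, 310):
    label, colour = aqi_band(value)
    print(value, label, colour)
```

If a model outputs a PM2.5 concentration rather than an index value, the concentration would first need to be converted to the index scale before a lookup like this applies.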

With access to historical satellite data, we can easily look back at long-term air quality trends to better understand health outcomes, or compare the COVID-19 lockdown period to a period of normal activity.

We’re excited to say that this model is freely available for use under a CC BY-SA 4.0 license, so please get in touch (zindi@zindi.africa) if you’d like to put it to use.

We’d like to say a big thank you to all our Zindians who participated and helped make this project possible, particularly those who placed at the top of the leaderboard. We’re also grateful to Microsoft for sponsoring #ZindiWeekendz. If you’re interested in seeing some other solutions, check out the GitHub repositories below. And if you want to put your skills towards our COVID-19 challenges, join a #ZindiWeekendz hack this Friday!

devnikhilmishra (1st place) - GitHub Repo

CoviData (2nd place) - GitHub Repo

Klai (3rd place) - GitHub Repo