Finding unmapped schools from space with AI
Accurate data about schools is critical to provide quality education and promote lifelong learning, ensure equal access to opportunity, and ultimately to reduce poverty. These are core aspects of human dignity and human development, and accordingly, are core UN Sustainable Development Goals (SDG4, SDG10, and SDG1 respectively). However, in much of the world educational facility records are inaccurate, incomplete or non-existent.
UNICEF has launched a bold initiative to map the connectivity level of every school in the world. Under the initiative, a big data platform known as Magic Box combines private sector data with new computational techniques to gain critical insights into the needs of the most vulnerable populations. Development Seed and Maxar (previously DigitalGlobe) are supporting this effort by bringing state-of-the-art machine learning to Maxar’s massive global archive of high-resolution satellite imagery. The first phase of this effort mapped schools in Liberia. The latest phase added 7,000 previously unmapped schools across Colombia and eleven Eastern Caribbean nations.
Schools are like any other infrastructure. They have a clear primary purpose, but they play many other roles in communities. They provide a place for public gatherings, public recreation facilities, shelters, and polling stations. One implication of this is that schools look different from region to region. Their appearance can vary with funding, available materials, and land availability, as well as local architectural style, history, and local purposes. Because of these differences, it’s hard to imagine a consistent set of rules that you could give a computer to reliably detect schools from space.
Modern deep learning approaches provide a viable path to overcome this challenge. These novel methods are presented with a wide variety of images of schools and a wide variety of images of buildings that are not schools. From this, they are able to develop a more sophisticated understanding of what is and is not a school. The results from the automated approaches aren’t perfect yet. But combined with expert human validators, deep learning can quickly scan a huge amount of data and empower human validators to quickly add thousands of schools to the map. It significantly reduces time and cost for such a mapping task.
Schools in high-resolution imagery
Despite their varied structure, many schools have identifiable overhead signatures that might make it possible to detect them with modern deep learning techniques. Schools can be observed from space and have clear features, e.g. building size, shape, and facilities. Compared with the surrounding residential buildings, schools are bigger, and their footprints are often U, O, H, or L shaped (Figure 2). Schools usually have multiple facilities attached to them: parking lots, playgrounds, basketball courts, swimming pools, etc. Schools in urban and wealthier neighborhoods can have a group of buildings with a similar roof type and color, and sometimes they come with a clear school boundary. A school in a rural area may have bare ground as a playfield for kids, be surrounded by green space, and be accessible by roads or trails.
Maxar supported this research by providing access to its 31cm resolution Vivid Basemap. This project was selected as a recipient of a GBDX Research Award and will continue to be supported by DigitalGlobe after the life of the award. In this phase of the project, we use DigitalGlobe Vivid imagery (DG Vivid). DG Vivid is a snapshot of the Earth, mosaicked, sharpened, and color-enhanced. This high-resolution imagery product has a 50cm spatial resolution and global coverage. At this sub-meter spatial resolution, objects like buildings, cars, and trees can be clearly observed from space.
One challenge of working with high-resolution imagery is that it is A LOT of data. Scanning the entire country of Colombia to find schools required processing more than 500 GB of imagery data. In order to accomplish this project within a reasonable timeframe, we used a new approach and a new set of open source tools our team created.
Building a School Identifier with Deep Learning
In this project, we applied Convolutional Neural Networks (CNNs) as a school classifier to narrow the search space. Briefly, this works by feeding individual satellite images to our CNN, which returns the probability that a school is present in each image. Images that are predicted to contain a school are sent to our Data Team, composed of expert mappers, who verify the school in the image and map it (Figure 3). The Data Team also prepared an extremely high-quality training dataset that helped us get accurate results. We built our school classifier by fine-tuning two models pre-trained on ImageNet, Xception and MobileNetV2, with the validated and cleaned dataset of known schools in Colombia. For the detailed technical approach, you can read our online report.
We did initial testing on two promising model architectures: Xception and MobileNetV2. A binary school classifier was trained with the existing, cleaned school dataset for Colombia. Our preliminary testing showed slightly better results from Xception, so we selected Xception for the remainder of the project. However, MobileNetV2 took only a quarter of the time per training iteration on exactly the same training set, with only a 1% drop in validation accuracy. If you don't have access to rich cloud computing resources, you might consider using MobileNetV2 to train the model instead.
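The routing step described above, where the classifier's probability decides which images go to human reviewers, can be sketched in a few lines of Python. This is only an illustration and not the project's code: the tile identifiers, the `select_for_review` helper, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def select_for_review(
    probabilities: Dict[str, float], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return (tile_id, probability) pairs at or above the threshold.

    `probabilities` maps a hypothetical tile id to the model's predicted
    probability that the tile contains a school. The threshold is an
    assumed value, not the one used in the project.
    """
    flagged = [(tile, p) for tile, p in probabilities.items() if p >= threshold]
    # Most confident predictions first, so reviewers see likely schools early.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical model outputs for four tiles.
example_scores = {
    "tile-a": 0.92,
    "tile-b": 0.10,
    "tile-c": 0.55,
    "tile-d": 0.49,
}
review_queue = select_for_review(example_scores)
```

A lower threshold sends more tiles to the reviewers and misses fewer schools, while a higher one saves reviewer time at the cost of recall.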
The next step was to apply this model to small image patches to detect whether a school was present in each one. In total, we scanned 52 million of these small patches drawn from DigitalGlobe Vivid imagery over Colombia and the Eastern Caribbean. To deal with the huge scale of this model inference, we used Chip n’ Scale, an open source tool created by Development Seed for massive satellite AI. Using Chip n’ Scale, and with support from the team managing DigitalGlobe’s imagery API, we were able to process 1.6 million images per hour, completing the entire inference in 32 hours.
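To get a feel for the scale involved, here is a rough Python sketch that counts the web-map tiles covering a bounding box and estimates inference time at a given throughput. It uses the standard slippy-map (XYZ) tiling formula. The bounding box and zoom level in the example are assumptions for illustration only, and this is not how Chip n’ Scale itself is implemented.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Convert a longitude/latitude to slippy-map (x, y) tile indices."""
    n = 2 ** zoom
    # Web Mercator is only defined up to roughly +/-85.0511 degrees latitude.
    lat = max(min(lat, 85.0511), -85.0511)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lon == 180 or extreme latitudes stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def count_tiles(west: float, south: float, east: float, north: float, zoom: int) -> int:
    """Count the tiles at `zoom` that intersect the bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # tile y grows southward
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return (x_max - x_min + 1) * (y_max - y_min + 1)


def inference_hours(n_images: int, images_per_hour: float) -> float:
    """Estimate wall-clock hours needed to run inference on n_images."""
    return n_images / images_per_hour


# Figures from the text: 52 million patches at 1.6 million images per hour.
hours = inference_hours(52_000_000, 1_600_000)

# Hypothetical small bounding box and zoom level, for illustration only.
tiles = count_tiles(west=-1.0, south=-1.0, east=1.0, north=1.0, zoom=1)
```

With the figures quoted above, the estimate works out to 32.5 hours, in line with the roughly 32 hours reported.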
Our model identified 73,000 tiles that it predicted might contain an unmapped school. Five expert mappers reviewed all 73,000 locations over eight working days. While we couldn’t confirm every school from the imagery alone, the team was able to confirm around 11,000 schools in Colombia and Caribbean nations, including around 7,000 schools that didn’t previously exist on the map. The team identified a further 62,000 locations to be referred for ground validation by field agents and local government.
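As a back-of-the-envelope check on the review effort, the rounded figures above can be combined in a short Python sketch. The variable names are ours, and because the source numbers are approximate ("around 11,000", "around 7,000"), the derived values are approximate as well.

```python
# Rounded figures quoted in the text.
flagged_tiles = 73_000      # tiles the model flagged for review
mappers = 5                 # expert mappers on the Data Team
working_days = 8            # days spent on the review
confirmed_schools = 11_000  # schools confirmed from imagery (approximate)
new_schools = 7_000         # confirmed schools not previously mapped (approximate)
referred = 62_000           # locations referred for ground validation

# Average number of tiles each mapper reviewed per working day.
tiles_per_mapper_per_day = flagged_tiles / (mappers * working_days)

# Share of flagged tiles confirmed from imagery alone.
confirmed_share = confirmed_schools / flagged_tiles

# Share of confirmed schools that were new to the map.
new_share = new_schools / confirmed_schools

print(f"{tiles_per_mapper_per_day:.0f} tiles per mapper per day")
print(f"{confirmed_share:.1%} of flagged tiles confirmed from imagery")
print(f"{new_share:.1%} of confirmed schools were previously unmapped")
```

On these numbers each mapper reviewed roughly 1,800 tiles per day, which gives a sense of how much the classifier narrowed the search compared with scanning all 52 million patches by eye.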
Our study showed that current deep learning and inexpensive cloud computing can assist humans to detect schools at scale in a rapid, rigorous manner. This provides the first object-based detection model for schools. A complete and accurate school facility map can further reduce the digital divide in education and improve children’s access to information, digital goods, and opportunities, and make the best use of limited educational resources.
The complete school map, combined with connectivity data collected by UNICEF’s Project Connect initiative, will be used to reduce the digital divide in education and improve access to information, digital goods and opportunities for entire communities. In addition, understanding the location of schools can help governments and international organizations gain critical insights into the needs of vulnerable populations, and better prepare for and respond to exogenous shocks such as disease outbreaks or natural disasters.
For more information, consult the full project report. The peer-reviewed paper on this subject has been accepted at the first Computer Vision for Global Challenges workshop on June 16th. If you are planning to be there, we would love to hear from you. Feel free to reach out to me on Twitter or LinkedIn.
If you are interested in deriving insights from satellite imagery and AI-assisted point of interest mapping (e.g. schools, health centers, human settlements), then please get in touch.