Jakarta Indonesia, 4th January 2020. (AP/Dita Alangkara)

Flood Water Level Estimation from Social Media Using Machine Learning

Priyanka Chaudhary
Published in EcoVisionETH
Feb 13, 2020 · 7 min read


Floods are among the most common catastrophic natural disasters: they accounted for 47% of all weather-related disasters between 1995 and 2015 and affected over 2 billion people, according to the 2015 report from the Centre for Research on the Epidemiology of Disasters (CRED) and the UN Office for Disaster Risk Reduction (UNISDR) [1]. During such events, it is extremely important to build accurate flood maps for emergency plans and rescue operations. Building such maps requires promptly gathering information from the disaster area. We have developed a system that automatically provides real-time flood level predictions from social media imagery.

Our approach is to determine the flood water level by looking at objects of known sizes that appear in social media images and quantifying how much they are submerged.

Flood water level predictions are obtained by passing images through a neural network that estimates to what level the objects in the image are submerged: level 0 corresponds to not submerged, and level 10 to an average-sized person being fully submerged. (Black: True Level, Green: Predicted Level)

Social media provides free and real-time data

Traditionally flood mapping is based on either of these data sources:

  • Field collection
  • Stream gauges
  • Remote sensing

For field data collection, people visit the disaster areas and survey the high water marks after the flood event. However, real-time field data collection is often expensive, dangerous, and difficult to obtain. Stream gauges can provide real-time data, but only for monitored locations; such locations are sparsely distributed and often cannot provide enough information to map the flooded area completely. Remotely sensed satellite imagery is extensively used for monitoring the extent of disaster impact, but due to the satellites' long revisit cycles and frequent cloud cover, the information collected is not real-time [2].

Left: Sentinel-2 image of Kerala, India, taken before the 2018 floods. Right: Image taken after the floods. (Source: Earth Observatory/NASA)

In contrast, social media platforms are a viable alternative data source. In fact, people located in the affected areas often share messages and pictures describing the situation. This data can easily be collected and has the advantage of being available in real-time and for free.

By defining a new meaningful annotation strategy, we can train a deep convolutional neural network to automatically predict flood water levels from images collected on social media.

How can we best exploit this source of data?

Our method is based on a neural network that we train in a supervised way. As with most deep learning applications, we need annotated images; of course, it is not trivial to estimate the water level in centimeters from an image, even for a human. To make annotation feasible, we rely on objects of known sizes that are commonly found in pictures and determine how much of each object is submerged in water, in terms of a few coarsely defined levels. We consider flood levels from level 0, which means no water, to level 10, which represents a human body of average height completely submerged in water.

Shows the annotation strategy for object classes Person and Bicycle

The height of the different levels is inspired by drawing artists, who use head height as the building block for the human figure. To map level classes to actual flood heights, we assume a human body of average height and derive the height of each level in centimeters.
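As a rough illustration, the level-to-centimeter mapping can be sketched as follows, assuming an average body height of 170 cm and uniformly spaced levels (both assumptions are ours for this sketch; the actual level boundaries follow head-height proportions):

```python
# Hypothetical mapping from categorical flood levels to water height in cm.
# Assumes a 170 cm average body height split uniformly across 10 levels;
# the paper's actual boundaries follow head-height building blocks.
AVG_HEIGHT_CM = 170.0
NUM_LEVELS = 10

def level_to_cm(level: int) -> float:
    """Approximate water height in centimeters for a categorical level."""
    if not 0 <= level <= NUM_LEVELS:
        raise ValueError(f"level must be in [0, {NUM_LEVELS}], got {level}")
    return level * AVG_HEIGHT_CM / NUM_LEVELS
```

Under these assumptions, level 0 maps to 0 cm (no water) and level 10 to 170 cm (an average person fully submerged).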

We chose the following common object classes: Person, Car, Bicycle, House and Bus. For the ground truth generation, we annotate every pixel of each image in the dataset. Each pixel of the image is assigned to a specific instance of a class. Then, of course, each instance of the five selected common object classes is assigned a level value from 0 to 10. In addition, we also annotate the flood water in the image as an additional class Flood.
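The annotation for one object instance could be represented roughly like this (the field names are illustrative, not the actual dataset schema):

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative annotation record; the real annotation is pixel-wise, this
# only sketches the per-instance information described above.
OBJECT_CLASSES = ("Person", "Car", "Bicycle", "House", "Bus")

@dataclass
class InstanceAnnotation:
    class_name: str             # one of OBJECT_CLASSES, or "Flood"
    mask: list                  # binary pixel mask for this instance
    flood_level: Optional[int]  # 0-10 for object classes, None for "Flood"

    def __post_init__(self):
        if self.class_name not in OBJECT_CLASSES + ("Flood",):
            raise ValueError(f"unknown class {self.class_name!r}")
        if self.class_name != "Flood" and not 0 <= self.flood_level <= 10:
            raise ValueError("flood_level must be in [0, 10]")
```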

Overall, the network is trained on 1,268 annotated images collected from several social media platforms. As this is a relatively small dataset by deep learning standards, we use the MS COCO [5] dataset for pretraining.

An example of original and annotated image

Our neural network for flood level estimation

We use Mask R-CNN [3] as the base architecture to make our predictions. The backbone of the architecture works as the main feature extractor where we can use any standard convolutional neural network (CNN). The Region Proposal Network (RPN) is a neural network that scans over the image and gives scores based on whether there is an object or not in the scanned region. The regions, also known as anchors, that obtain a high score are then sent to the next stage for classification.
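Conceptually, the proposal selection step can be sketched as a simple score filter (the threshold and scores below are illustrative; a real RPN also regresses anchor offsets and applies non-maximum suppression):

```python
# Toy sketch of RPN proposal selection: keep anchors whose objectness
# score clears a threshold. Real RPNs also refine the anchor boxes and
# apply non-maximum suppression before passing proposals on.
def select_proposals(scored_anchors, threshold=0.7):
    """scored_anchors: list of (anchor_box, objectness_score) pairs."""
    return [box for box, score in scored_anchors if score >= threshold]
```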

In our adaptation of Mask R-CNN, the proposal classification stage generates four outputs:

  • Class: We classify the object in the regions proposed by the RPN. If the class turns out to be background we drop the proposal.
  • Bounding box (bbox) regression: The bounding box for each classified proposal is refined to obtain a more accurate position of the object.
  • Mask: The task of identifying the exact silhouette of an object is called instance segmentation. To achieve this, an additional binary mask is generated with a small CNN that is applied to each region of interest and works in parallel to the proposal classification [4].
  • Flood level: This is our extension of the Mask R-CNN, which predicts the flood level for each object in the image.
Shows the overall architecture and respective output from each stage
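The second stage's per-proposal logic can be summarized in a few lines (the names are illustrative; in the actual model each value comes from a learned head inside the Mask R-CNN framework):

```python
BACKGROUND = "background"

# Sketch of how the four head outputs combine for one proposal. In the
# real model each value is predicted by a learned head; here they are
# simply passed in as inputs.
def process_proposal(class_pred, refined_bbox, mask, flood_level):
    """Return the final detection dict, or None if the proposal is background."""
    if class_pred == BACKGROUND:
        return None  # background proposals are dropped
    return {
        "class": class_pred,         # object category
        "bbox": refined_bbox,        # refined bounding-box coordinates
        "mask": mask,                # instance-segmentation silhouette
        "flood_level": flood_level,  # our added output, a level from 0 to 10
    }
```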

The total training loss is the sum of all individual losses:

L_total = L_class + L_bbox + L_mask + L_level

We use cross-entropy as the loss function for level prediction and use the loss functions proposed by [3] for class, bbox, and mask losses.
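A minimal sketch of the combined loss, with cross-entropy for the level head (the class, bbox, and mask losses are taken as given here; in practice they come from the standard Mask R-CNN implementation):

```python
import math

def level_cross_entropy(level_probs, true_level):
    """Cross-entropy (negative log-likelihood) for the flood-level head."""
    return -math.log(level_probs[true_level])

def total_loss(l_class, l_bbox, l_mask, level_probs, true_level):
    """Total training loss: the sum of the four individual losses."""
    return l_class + l_bbox + l_mask + level_cross_entropy(level_probs, true_level)
```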

How does it perform?

The figures below show our model’s predictions for images of recent flood events. Note that the masks are randomly colored in these qualitative results to distinguish between different object instances. For each detected object instance, the true flood level (black box) and the predicted flood level (green box) are given.

Jakarta Indonesia, 1st January 2020. (Black: True Level, Green: Predicted Level)
Guerneville USA, February 27th 2019. (Black: True Level, Green: Predicted Level)
Venice Italy, November 2019. (Black: True Level, Green: Predicted Level)

When converting our results from categorical levels back to centimeters we obtain a root mean square error of 8.07 cm on the test dataset with a 5-fold cross-validation procedure. For all the details about the experiments please refer to our paper.
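For illustration, converting levels back to centimeters and computing the RMSE looks roughly like this (the uniform 17 cm per level is our assumption for this sketch, and the toy numbers are not the paper's data; the 8.07 cm figure comes from 5-fold cross-validation on the full test set):

```python
import math

CM_PER_LEVEL = 17.0  # assumes 170 cm spread uniformly over 10 levels

def rmse_cm(predicted_levels, true_levels):
    """Root mean square error, in centimeters, between level predictions."""
    errors = [(p - t) * CM_PER_LEVEL for p, t in zip(predicted_levels, true_levels)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```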

What next?

We have presented a model to predict the flood water level from social media images in a fully automated way. This work is a significant step towards building real-time flood maps. However, there are still challenges to be addressed before such a system can be used in real emergency relief situations. For example, accurate geolocation of images from social media platforms is not publicly available, as location metadata is often stripped for privacy reasons. Another issue is that images posted during such events may be reposts from previous floods, which could lead to erroneous information.

To learn more about our approach, you can refer to the full research paper:

Chaudhary, P., D’Aronco, S., Moy de Vitry, M., Leitão, J. P., and Wegner, J. D., Flood-Water Level Estimation From Social Media Images, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., IV-2/W5, 5–12, 2019, https://doi.org/10.5194/isprs-annals-IV-2-W5-5-2019.

References

[1] P. Wallemacq, R. Below, D. McClean, The human cost of weather-related disasters, 1995–2015, https://www.unisdr.org/we/inform/publications/46796.

[2] Zhenlong Li, Cuizhen Wang, Christopher T. Emrich, Diansheng Guo, A novel approach to leveraging social media for rapid flood mapping: a case study of the 2015 South Carolina floods, Cartography and Geographic Information Science, 45:2, 97–110, 2018, DOI:10.1080/15230406.2016.1271356.

[3] He, Kaiming, Georgia Gkioxari, Piotr Dollár, Ross B. Girshick, Mask R-CNN, IEEE International Conference on Computer Vision (ICCV), 2017: 2980–2988.

[4] Abdulla, W., Splash of Color: Instance Segmentation with Mask R-CNN and TensorFlow, 2018, https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46.

[5] Lin, Tsung-Yi, Michael Maire, Serge J. Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár and C. Lawrence Zitnick. Microsoft COCO: Common Objects in Context. ECCV, 2014.
