Automatic Flood Detection from Satellite Images Using Deep Learning

Ömer Buğrahan Çalışkan
12 min read · Aug 7, 2022


This project was carried out under the supervision of our valuable advisor, Assist. Prof. Ali Can Karaca, as a graduation project of the Yıldız Technical University Computer Engineering undergraduate program. As a team, my project partner Uğur Altındal and I (Ömer Buğrahan Çalışkan) worked together and put effort into every step of the project. Uğur Altındal, our advisor Ali Can Karaca, and I have equal rights to the project. Our advisor provided guidance and assistance of every kind regarding the method, direction, and process, from the idea phase through realization, and guided us to complete the project successfully.

Introduction

Identifying the areas affected by natural disasters, and the damage they cause, is of great importance, especially during a disaster. Automatic remote sensing technologies are therefore a critical and beneficial field for human life.

The project I will introduce in this article aims to automatically detect the impact area of a flood event. Pre-disaster and post-disaster images taken by the Sentinel-1 satellite were examined, and various processes were applied to these images to make the disaster area clearly visible.

The dataset was created from many different images taken from around the world at various times.

Three deep learning models commonly used for image segmentation were trained to detect flooding in the images. Approximately 87% success was achieved with UNET and LinkNet, and 79% with SegNet.

Introducing SAR Images, Sentinel-1 and SNAP Application

Synthetic Aperture Radar (SAR) is a radar system for high-resolution earth imaging and moving-target detection that can be used on manned and unmanned aerial platforms.

It works day and night in all weather conditions, so it can perform imaging and moving-target detection even in rainy and cloudy weather.

The main purpose of the Sentinel-1 satellite is to monitor the land and ocean.
Sentinel-1 operates day and night in a near-polar orbit using radar imaging, and acquires images regardless of weather conditions.

You can obtain free SAR images from the Copernicus Open Access Hub (https://scihub.copernicus.eu/dhus/#/home) for anywhere in the world and any date range. You can also refine your search results with various filters, such as mission (Sentinel-1 or Sentinel-2), polarisation, sensor mode, etc.

Example Copernicus Usage

SNAP (Sentinel Application Platform, https://step.esa.int/main/download/snap-download/) is a tool developed by the European Space Agency for analyzing and editing remote sensing images. To improve the images and increase the detection success of the deep learning models, the four processes explained below were applied to the images.

SNAP Application

Apply Orbit File: The orbit state vectors found in SAR product metadata are often not accurate. Precise satellite orbits are determined later and become available weeks after product release. This operation updates the data with the current orbit state vectors, providing accurate position and velocity information.

Thermal Noise Removal: This step reduces noise effects across the Sentinel-1 scene, in particular by normalizing the backscatter signal and reducing discontinuities.

Calibration: Radiometric distortions related to the sensor and acquisition can occur in SAR images. While the images are being acquired, other non-system errors, such as radiometric errors due to the local atmosphere or to topography, also enter the images. This operation aims to remove these effects.

Speckle Filter: SAR images contain a granular noise pattern called speckle, caused by the interference of waves reflected from many elementary scatterers. Speckle filtering aims to improve the quality of the image by reducing these effects.
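For reference, these four operations can also be chained programmatically. Below is a minimal sketch using SNAP's snappy Python bindings, assuming SNAP and snappy are installed; the parameter choices and file names are illustrative, not our exact processing configuration:

    # Minimal sketch of the preprocessing chain via SNAP's snappy bindings.
    # Assumes SNAP + snappy are installed; paths are hypothetical.
    from snappy import ProductIO, GPF, HashMap

    def preprocess(input_path, output_path):
        product = ProductIO.readProduct(input_path)
        # 1. Update the product with precise orbit state vectors
        product = GPF.createProduct('Apply-Orbit-File', HashMap(), product)
        # 2. Remove additive thermal noise
        product = GPF.createProduct('ThermalNoiseRemoval', HashMap(), product)
        # 3. Radiometric calibration to sigma0 backscatter
        cal_params = HashMap()
        cal_params.put('outputSigmaBand', True)
        product = GPF.createProduct('Calibration', cal_params, product)
        # 4. Reduce speckle (Lee filter as an example choice)
        spk_params = HashMap()
        spk_params.put('filter', 'Lee')
        product = GPF.createProduct('Speckle-Filter', spk_params, product)
        ProductIO.writeProduct(product, output_path, 'GeoTIFF')

    # Usage (hypothetical file names):
    # preprocess('S1A_IW_GRDH_example.zip', 'preprocessed.tif')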

After applying these editing operations in SNAP to a sample Sentinel-1 SAR image obtained from the Copernicus site, you get before-and-after disaster images like the ones below.

Before Flood / After Flood — Malawi, 22.01.2015

Preparing the Dataset

The satellite images obtained through ESA Copernicus are downloaded and processed in the SNAP program. The pre- and post-disaster data go through the Apply Orbit File, Thermal Noise Removal, Calibration, and Range Doppler Terrain Correction operations, respectively. The resulting sigma bands are then combined so that the water areas and the land areas are separated from each other. With the help of RGB coloring, the data is made ready for labeling, which reduces human error in the labeling process.

Malawi Flood on 22.01.2015 — Overflow Areas are Shown in Blue

After the flood areas are identified, they are manually labeled with the LabelMe application and exported in JSON format.

LabelMe Application

With a conversion script written in Python, the JSON annotations are turned into mask image files.
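A sketch of what such a converter can look like, assuming the standard LabelMe JSON fields (shapes, points, imageWidth, imageHeight); the file names are hypothetical:

    # Sketch: convert a LabelMe JSON annotation into a binary mask image.
    # Field names follow the standard LabelMe format; paths are hypothetical.
    import json
    from PIL import Image, ImageDraw

    def json_to_mask(json_path, mask_path):
        with open(json_path) as f:
            ann = json.load(f)
        w, h = ann['imageWidth'], ann['imageHeight']
        mask = Image.new('L', (w, h), 0)          # start with all black (no flood)
        draw = ImageDraw.Draw(mask)
        for shape in ann['shapes']:               # one polygon per labeled region
            polygon = [tuple(p) for p in shape['points']]
            draw.polygon(polygon, outline=255, fill=255)  # flooded pixels -> white
        mask.save(mask_path)

    # Usage (hypothetical file names):
    # json_to_mask('malawi_flood.json', 'malawi_flood_mask.png')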

The images obtained after these steps are called masks. The mask, pre-disaster, and post-disaster images are tiled into 512x512-pixel patches and filed so that they are suitable for the deep learning models.
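The tiling itself is straightforward. A minimal NumPy sketch that cuts a scene into non-overlapping 512x512 patches (edge remainders are dropped; the dummy scene is illustrative):

    # Sketch: tile a large scene into non-overlapping 512x512 patches.
    import numpy as np

    def tile(image, size=512):
        h, w = image.shape[:2]
        patches = []
        for y in range(0, h - size + 1, size):
            for x in range(0, w - size + 1, size):
                patches.append(image[y:y + size, x:x + size])
        return patches

    # Dummy 2048x1536 scene -> 4 x 3 = 12 patches
    scene = np.zeros((2048, 1536, 3), dtype=np.uint8)
    print(len(tile(scene)))  # 12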

Dataset Preprocessing for Deep Learning Model

After this step, the 3-channel, 512x512-pixel images representing the same location before and after the disaster were stacked on top of each other to obtain a single 6-channel image. A visual representation of this step can be seen below.

Deep Learning Model Draft
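In NumPy terms, this stacking is a single concatenation along the channel axis. A small sketch, with dummy arrays standing in for the real patches:

    # Sketch: stack co-registered pre/post patches into one 6-channel input.
    import numpy as np

    pre = np.zeros((512, 512, 3), dtype=np.float32)   # pre-disaster patch
    post = np.zeros((512, 512, 3), dtype=np.float32)  # post-disaster patch

    stacked = np.concatenate([pre, post], axis=-1)    # shape: (512, 512, 6)
    print(stacked.shape)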

The resulting image dataset is divided into three parts: train, validation, and test, at rates of 80%, 10%, and 10%, respectively. The dataset was randomly shuffled before splitting.
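A minimal sketch of such a shuffled 80/10/10 split with scikit-learn's train_test_split (the seed and the sample list are illustrative):

    # Sketch: shuffled 80/10/10 train/validation/test split.
    from sklearn.model_selection import train_test_split

    samples = list(range(1545))  # stand-ins for the image tiles
    train, rest = train_test_split(samples, test_size=0.2, random_state=42)
    val, test = train_test_split(rest, test_size=0.5, random_state=42)
    print(len(train), len(val), len(test))  # roughly 80% / 10% / 10%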

Image Segmentation

Unlike object recognition, image segmentation classifies every pixel in the image. Objects are not represented by bounding boxes as in object recognition; instead, recognition is performed by delineating the boundaries of the object itself.

Comparison of Classification Techniques

There are two types of segmentation techniques: semantic and instance segmentation. In semantic segmentation, identical objects in the picture are classified by assigning them the same label; for example, if semantic segmentation were used in the image above instead of instance segmentation, all the dogs would be the same color. In instance segmentation, a unique label is assigned to each instance of the detected object, as seen in the image above.

In the project, 3 different semantic segmentation methods, namely UNET, LinkNet and SegNet, were used.

UNET

UNET

The UNET architecture shown above was introduced in 2015 for segmentation of biomedical images. Segmentation normally requires very large datasets and high-capacity hardware, but the UNET approach reduced these requirements.

The UNET model basically consists of a contraction path and an expansion path. In the contraction path, which forms the left side of the diagram, the image size decreases while the number of channels increases, because convolution and max pooling operations are performed in this step. In the expansion path, which forms the right side of the diagram, the number of channels decreases while the image size increases, because of the upsampling and convolution operations.
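To make the two paths concrete, here is a compact Keras sketch of the U-Net idea for our 6-channel, 512x512 inputs. The filter counts are illustrative, not the exact network we trained:

    # Compact U-Net sketch in Keras; filter counts are illustrative.
    from tensorflow.keras import layers, Model

    def conv_block(x, filters):
        x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
        x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
        return x

    def build_unet(input_shape=(512, 512, 6)):
        inputs = layers.Input(input_shape)
        # Contraction path: spatial size halves, channel count grows
        c1 = conv_block(inputs, 16); p1 = layers.MaxPooling2D()(c1)
        c2 = conv_block(p1, 32);     p2 = layers.MaxPooling2D()(c2)
        c3 = conv_block(p2, 64);     p3 = layers.MaxPooling2D()(c3)
        b = conv_block(p3, 128)      # bottleneck
        # Expansion path: upsample, then concatenate the encoder skip features
        u3 = layers.Concatenate()([layers.UpSampling2D()(b), c3]);  c4 = conv_block(u3, 64)
        u2 = layers.Concatenate()([layers.UpSampling2D()(c4), c2]); c5 = conv_block(u2, 32)
        u1 = layers.Concatenate()([layers.UpSampling2D()(c5), c1]); c6 = conv_block(u1, 16)
        # One sigmoid channel: per-pixel flood probability
        outputs = layers.Conv2D(1, 1, activation='sigmoid')(c6)
        return Model(inputs, outputs)

    model = build_unet()
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])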

LinkNet

LinkNet is a similar image segmentation approach derived from the UNET model. In the LinkNet network structure, the image is downsampled through the encoder stages and reconstructed through the decoder stages before entering the final convolution layer. Since this network is designed to minimize the number of parameters, it runs very fast and allows real-time segmentation.

The key difference: in the UNET model, encoder features are concatenated at the decoder stage, while in LinkNet they are added directly.

UNET vs LinkNet
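In code, the difference pictured above comes down to a single layer choice at each skip connection. A minimal Keras sketch with illustrative shapes:

    # Sketch: the decoder merge, U-Net style vs LinkNet style.
    from tensorflow.keras import Input, layers

    u = Input((64, 64, 64))  # upsampled decoder features
    s = Input((64, 64, 64))  # matching encoder (skip) features

    unet_merge = layers.Concatenate()([u, s])  # channels double: 64 -> 128
    linknet_merge = layers.Add()([u, s])       # channels stay at 64
    print(unet_merge.shape, linknet_merge.shape)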

SegNet

SegNet

In the encoder, convolution and max pooling operations are performed. At this stage, 2x2 max pooling is applied and the max pooling indices are stored.

Upsampling and convolutions are performed at the decoder stage. During upsampling, the max pooling indices stored in the corresponding encoder layer are recalled. In the last step, the chosen classifier predicts the class of each pixel.

In UNET, entire feature maps are transferred from encoder to decoder, while SegNet reuses only the pooling indices. Transferring full feature maps increases the model size and the memory requirement.
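The index mechanism can be sketched in TensorFlow: max_pool_with_argmax remembers where each maximum came from, and the decoder scatters values back to those positions. This is a toy illustration, not our training code:

    # Sketch: pooling with stored indices, and index-based unpooling.
    import tensorflow as tf

    x = tf.random.normal((1, 4, 4, 1))
    # Pool and remember the flat position of each maximum
    pooled, argmax = tf.nn.max_pool_with_argmax(x, ksize=2, strides=2,
                                                padding='SAME')
    # Decoder-side unpooling: scatter pooled values back to stored positions
    flat = tf.scatter_nd(tf.reshape(argmax, [-1, 1]),
                         tf.reshape(pooled, [-1]),
                         tf.cast(tf.reshape(tf.size(x), [1]), tf.int64))
    unpooled = tf.reshape(flat, tf.shape(x))  # sparse 4x4 map, zeros elsewhere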

Dataset Design

To create the dataset, flood disasters in Turkey and abroad were investigated. As a result of this research, the dataset was built from disaster images taken from 5 different regions.

Dataset Summary

The various image correction and cleaning processes described earlier were applied. The images were then tiled into 512x512-pixel patches to suit the deep learning models. At this stage, a total of 1545 images were obtained.

Dataset Split

System Work

All the development steps necessary to realize the project were written by us in Python. For training the models in particular, libraries such as Keras, TensorFlow, Pandas, NumPy, and OpenCV were used, and everything was implemented in the Google Colab environment.

The program processes the two given images, taken before and after the disaster, with the desired model and, if disaster regions exist, displays them to the user.

The pre- and post-flood photos are selected with the Select Image button, the model whose result is desired is chosen, and then pressing the Get Result button shows the user both the manually created mask of the region and the mask produced by the model.

Flood Detection GUI

Two example uses are shown above. As can be seen in the image on the right, some flood areas were missed during manual labeling due to human error; our model nevertheless marked the area as flooded, as the post-disaster image confirms, thanks to how well it learned during training.

Performance Metrics and Experimental Results

The term "processed images" at this stage refers to images to which the various correction and filtering operations described earlier were applied in SNAP. "Unprocessed images" accordingly means the raw data.

Experiment 1: Comparison of Processed Images in Different Models
Experiment 2: Comparison of Unprocessed Images in Different Models
Experiment 3: Comparison of Processed Images from Regions Not Used in Training in Different Models

The models assign a value to each pixel of the image during prediction. Pixels whose predicted value falls below the threshold are not marked. Using a threshold value aims to prevent inconsistent results.
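The thresholding itself is a one-liner. In the sketch below, random values stand in for a model's sigmoid output:

    # Sketch: turn per-pixel probabilities into a binary flood mask.
    import numpy as np

    probs = np.random.rand(512, 512)   # stand-in for the model's sigmoid output
    threshold = 0.3                    # e.g. the best value we found for UNET
    flood_mask = (probs >= threshold).astype(np.uint8)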

The following tables show the success rates of the processed images at given threshold values. Many values were tried, with the aim of finding the best success rate for each model.

These metrics carry different semantic weight. For example, accuracy alone is not a meaningful criterion of segmentation success, while Mean IoU is a more important factor. This is why even the SegNet model, which can be called less successful, shows a very high accuracy value.
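For reference, IoU for binary masks can be computed as follows (a NumPy sketch; the random masks are placeholders):

    # Sketch: intersection-over-union for binary masks.
    import numpy as np

    def iou(pred, truth):
        inter = np.logical_and(pred, truth).sum()
        union = np.logical_or(pred, truth).sum()
        return inter / union if union else 1.0  # both empty -> perfect match

    pred = np.random.rand(512, 512) > 0.5
    truth = np.random.rand(512, 512) > 0.5
    print(iou(pred, truth))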

Experiment 1 : Processed Images

SegNet
LinkNet
UNET

Looking at the metric results for the threshold values given in the tables, the most successful threshold is 0.3 for UNET and LinkNet, and 0.4 for SegNet.

Experiment 2: Unprocessed Images

The lower success rates in this experiment stem from the models using raw images during training. As the outputs show, processed images allow the models to make more successful predictions, which demonstrates how effective and necessary our filtering processes are.

SegNet
LinkNet
UNET

Looking at the metric results for the threshold values given in the tables, the most successful threshold is 0.5 for UNET and LinkNet, and 0.7 for SegNet.

Comparison of Processed Images in Different Models

Experiment 1 Visual Results

Comparison of Unprocessed Images in Different Models

Experiment 2 Visual Results

Comparison of Processed Images from Regions Not Used in Training in Different Models

At this stage, disaster areas that were not used in training the models were tested. These regions were processed, labeled, and masked so they could be used in the third experiment.

Experiment 3 Visual Results

Results

The project aims to automatically detect flood disasters from satellite images. The obtained images were separated into pre- and post-disaster sets, prepared through processing in the SNAP environment, and finally tiled to 512x512 pixels to make them suitable for model training.

Then, various image segmentation algorithms were researched and three models were trained: UNET, LinkNet, and SegNet. The results show that the models, especially UNET and LinkNet, produced highly successful outputs.

In the experimental phase, the threshold values of the model predictions were tested first, and the most successful value was selected for each model. Training success was then evaluated on raw images that had not been processed in SNAP, and on images that did not participate in model training. Various metrics were used to measure model success; they are based on comparing the values predicted by the models against the true values.

In the final stage, a GUI was prepared. The user supplies pre- and post-disaster images of a region as input, selects the desired model, and obtains as output a mask image in which the disaster areas are marked.

Our next goal in the development phase of our project is to automate the manual labeling process to minimize human errors.

I would like to thank my teammate Uğur Altındal and our valuable advisor Ali Can Karaca for their contributions and efforts in the completion of the project.

References

  • A 2021 Guide to Semantic Segmentation
    https://nanonets.com/blog/semantic-image-segmentation-2020/
  • U-Net: Convolutional Networks for Biomedical Image Segmentation
    Ronneberger, Olaf; Fischer, Philipp; Brox, Thomas
    International Conference on Medical Image Computing and Computer-Assisted Intervention
  • SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
    Badrinarayanan, Vijay; Kendall, Alex; Cipolla, Roberto
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation
    Chaurasia, Abhishek; Culurciello, Eugenio
    2017 IEEE Visual Communications and Image Processing
  • Rapid Flood Mapping and Evaluation with a Supervised Classifier and Change Detection in Shouguang Using Sentinel-1 SAR and Sentinel-2 Optical Data
    Huang, Minmin; Jin, Shuanggen
    MDPI
  • ESA Sentinel Missions
    https://sentinel.esa.int/web/sentinel/missions
  • Sentinel-1 GRD Preprocessing Workflow
    Filipponi, Federico
    MDPI Proceedings
  • Breaking Limits of Remote Sensing by Deep Learning from Simulated Data for Flood and Debris-Flow Mapping
    Yokoya, Naoto; Yamanoi, Kazuki; He, Wei; Baier, Gerald; Adriano, Bruno; Miura, Hiroyuki; Oishi, Satoru
    IEEE Transactions on Geoscience and Remote Sensing
  • Building Instance Change Detection from Large-Scale Aerial Images Using Convolutional Neural Networks and Simulated Samples
    Ji, Shunping; Shen, Yanyun; Lu, Meng; Zhang, Yongjun
    MDPI
  • Monitoring the Summer Flooding in the Poyang Lake Area of China in 2020 Based on Sentinel-1 Data and Multiple Convolutional Neural Networks
    Dong, Zhen; Wang, Guojie; Amankwah, Solomon Obiri Yeboah; Wei, Xikun; Hu, Yifan; Feng, Aiqing
    International Journal of Applied Earth Observation and Geoinformation
  • Early Flood Detection Using SAR Images and Remote Sensing Techniques: Case Study Kut City in Iraq
    Obeydi, A.L.; Al-Hummadi, S.K.; Al-Saady, A.H.
    IOP Conference Series: Materials Science and Engineering
