Towards urban flood susceptibility mapping using machine and deep learning models (part 4): Convolutional neural networks

Omar Seleem
Published in Hydroinformatics
Feb 3, 2023

In this article, we will prepare the data for mapping urban flood susceptibility using convolutional neural networks (CNNs). The prepared images are what we later use to develop the models and map urban flood susceptibility with the trained model. This series of articles summarizes and explains (with Python code) the paper “Towards urban flood susceptibility mapping using data-driven models in Berlin”. The complete Jupyter Notebook, sample data for flooded and non-flooded locations, and the predictive features used in the paper are available here

The first problem you face when you want to build a convolutional neural network model is how to prepare your own dataset. Most online sources and courses use standard datasets, so you never learn how to prepare your own.

In a previous article, I showed how to prepare datasets for point-based models such as random forest and support vector machine. This article shows how to prepare data for convolutional neural networks using simple coding skills.

As mentioned before, we need a flood inventory to map urban flood susceptibility with a data-driven model. Flood inventories store flooded locations as points, so we need to convert these points to images. We will proceed as follows:

1- Convert the points to polygons (polygon dimensions represent the required image size). Then, we split the polygons' shapefile into multiple shapefiles (each shapefile represents one polygon (image)) and save the flooded polygons and not flooded polygons in separate folders.

2- Read the raster (predictive feature) that you need to clip

3- Clip the raster with the polygons’ shapefiles

The final product of these steps is two folders (Flooded and NotFlooded) containing one image for each flooded or non-flooded location (point).

The used dataset and code are found here

1- Convert points to images (polygons) and split the polygons

We first import the packages we will use. Then we use the buffer function to draw a circle around each point and the envelope function to convert the circle to a square (the image footprint), as shown in Fig 1.

Fig 1. Converting the point to an image with size (Image size x Image size).
# import the packages
import geopandas as gpd
import glob
import os
from osgeo import ogr
from osgeo import gdal
import numpy as np
import matplotlib.pyplot as plt
# read the points shape file using geopandas and plot the points
points=gpd.read_file('Points.shp')
points.plot()
buffer_dist=115 #buffer distance = image size x spatial resolution /2

# Buffer each point into a circle of radius buffer_dist
points['geometry'] = points.buffer(buffer_dist)

# Replace each circle with its bounding box: a square with side length 2 x buffer_dist
points['geometry'] = points.geometry.envelope

# Save the new shapefile
points.to_file("squares.shp")
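
To make the buffer distance concrete, here is a minimal sketch of the arithmetic behind the buffer_dist comment and of the square that buffer followed by envelope produces around a point. The 23-pixel image size and 10 m resolution are hypothetical values chosen only because they give 115, and the helper names are illustrative, not part of the notebook.

```python
def buffer_distance(image_size_px, resolution):
    """Half the side length of the image footprint, in map units."""
    return image_size_px * resolution / 2


def square_bounds(x, y, buffer_dist):
    """Bounding box (minx, miny, maxx, maxy) of a circle of radius buffer_dist
    centred on (x, y). This is the square that envelope returns for the buffer."""
    return (x - buffer_dist, y - buffer_dist, x + buffer_dist, y + buffer_dist)


# Hypothetical values: a 23 x 23 pixel image on a 10 m raster
dist = buffer_distance(image_size_px=23, resolution=10)
minx, miny, maxx, maxy = square_bounds(0.0, 0.0, dist)
side = maxx - minx
print(dist, side)  # 115.0 230.0
```

Note that the buffer distance is expressed in the units of the layer's coordinate reference system, so this arithmetic only makes sense when the points are in a projected CRS with metric units.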

Now we have a polygon shapefile that includes all the flooded and non-flooded locations. Create a folder named divided. We then split the features in the polygon shapefile, saving each feature in a separate shapefile.

# Split the flooded and nonflooded points 
#points_flooded=points[points['Label']==1]
#points_notflooded=points[points['Label']==0]

# Iterate over each feature in the shapefile
for index, feature in points.iterrows():
    # Create a new GeoDataFrame with just the current feature
    # (.loc selects by index label, which is what iterrows() returns)
    feature_gdf = points.loc[[index]]

    # Save the feature to a new shapefile in the "divided" folder
    feature_gdf.to_file(os.path.join("divided", "feature_{}.shp".format(index)))

2- Read the raster (predictive features) which you need to clip

Here we open the raster using GDAL and read it as an array. We have several predictive features, so we can either put all of them in one raster (with several bands, each band representing one predictive feature) or loop over the predictive features and repeat the following steps for each one.

Composite rasters can be made in ArcMap using the Composite Bands tool or in QGIS as shown here

# Read raster files with GDAL
ds = gdal.Open("Composite_raster.tif") # open a raster with several bands, each band represents one predictive feature
gt= ds.GetGeoTransform() #get the transformation data
proj = ds.GetProjection() #get the projection

band = ds.GetRasterBand(1) #read the first band
array = band.ReadAsArray() #read the first band as an array

plt.figure() #plot the raster to check that everything is working well
plt.imshow(array)
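
The geotransform returned by GetGeoTransform() is a tuple of six numbers: the x coordinate of the upper-left corner, the pixel width, a row rotation term, the y coordinate of the upper-left corner, a column rotation term, and the pixel height (negative for north-up rasters). The sketch below shows how a map coordinate is converted to a row and column, assuming no rotation. The geotransform values are made up and world_to_pixel is a hypothetical helper, not a GDAL function.

```python
def world_to_pixel(gt, x, y):
    """Convert map coordinates to (row, col) for a north-up raster.

    gt is a GDAL-style geotransform:
    (origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height)
    The rotation terms gt[2] and gt[4] are assumed to be zero.
    """
    col = int((x - gt[0]) / gt[1])
    row = int((y - gt[3]) / gt[5])
    return row, col


# Made-up geotransform: upper-left corner at (1000, 5000), 10 m pixels
example_gt = (1000.0, 10.0, 0.0, 5000.0, 0.0, -10.0)
row, col = world_to_pixel(example_gt, x=1115.0, y=4885.0)
print(row, col)  # 11 11
```

With the real raster, gt from the code above would take the place of example_gt.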

3- Clip the raster with the polygons’ shapefiles

We now have the polygons that represent our images and the predictive features as a raster. We iterate over the polygons and clip the predictive features raster with each polygon. The output is two folders (Flooded and NotFlooded) of images in TIFF format.

# change the path to the folder where we saved the split polygons
shp_path=r"D:\divided"
os.chdir(shp_path)
shp_file = glob.glob('*.shp')
for file in shp_file:
    shp_ds = gpd.read_file(file)
    # Clip the raster with each polygon and save flooded and not flooded locations in different folders
    # Label = 0 for non-flooded locations, Label = 1 for flooded locations
    if shp_ds['Label'].iloc[0] == 0:  # clip and save not flooded locations
        # Save the clipped raster as a GeoTIFF
        dsClip = gdal.Warp(r"D:\Predictive_features\NotFlooded\feature_"+str(file[:-4])+".tif", ds, cutlineDSName = file,
                           cropToCutline = True, dstNodata = np.nan)
    else:  # clip and save flooded locations
        # Save the clipped raster as a GeoTIFF
        dsClip = gdal.Warp(r"D:\Predictive_features\Flooded\feature_"+str(file[:-4])+".tif", ds, cutlineDSName = file,
                           cropToCutline = True, dstNodata = np.nan)
    dsClip = None  # release the dataset so GDAL flushes the file to disk
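
The two branches above differ only in the output folder, so the target path can be built in one place. The following is a small sketch of that idea. The folder names mirror the ones used above, but output_path is a hypothetical helper and the root folder is a placeholder. Using os.path.splitext on a name such as feature_0.shp also avoids the doubled prefix (feature_feature_0.tif) that the string concatenation above produces.

```python
import os


def output_path(root, label, shp_name):
    """Build the GeoTIFF path for one clipped polygon.

    label 0 goes to the NotFlooded folder, anything else to Flooded.
    """
    folder = "NotFlooded" if label == 0 else "Flooded"
    stem = os.path.splitext(os.path.basename(shp_name))[0]
    return os.path.join(root, folder, stem + ".tif")


# Placeholder root folder; replace it with the folder that holds Flooded and NotFlooded
flooded = output_path("Predictive_features", 1, "feature_0.shp")
not_flooded = output_path("Predictive_features", 0, "feature_1.shp")
print(flooded)
print(not_flooded)
```

The returned string can be passed as the first argument of gdal.Warp in place of the concatenated path.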

We have now prepared the images needed to train a convolutional neural network. In the next article, we will read these images in Python and train the network.
