Creating a Very Simple U-Net Model with PyTorch for Semantic Segmentation of Satellite Images

Maurício Cordeiro
Published in Analytics Vidhya
8 min read · Apr 5, 2020

Cloud segmentation: RGB image (left), ground truth (middle) and our model predictions (right).

Introduction

In my previous story (here), I showed how to create a multi-channel dataset for satellite images from scratch, without using the torchvision module.

Now we will move on to create a simple deep learning model for semantic segmentation of satellite images and check how it performs on the 38-Cloud: Cloud Segmentation in Satellite Images dataset, from Kaggle.

To make things easier, this code is available in the Kaggle notebook 38-Cloud-Simple_Unet, here.

The Model

Many different model architectures can be used for pixel-level segmentation of images. The 2019 Guide to Semantic Segmentation is a good overview of many of them and shows the main differences in their concepts. However, when we check the official PyTorch model zoo (a repository of pre-trained deep learning models), the only models available are:

Besides being very deep and complex models (they require a lot of memory and time to train), they were conceived and pre-trained to identify a completely different set of classes (for example: boat, bird, car, cat) from our main objective here, which is clouds in satellite images.

The model code that I found on GitHub for PyTorch was also complex to understand and to implement, so I decided to create a cut-down version of the U-Net model, proposed for biomedical image segmentation in 2015 (the original paper can be found here).

Figure 1: Example of a similar U-Net architecture. Source: https://www.nature.com/articles/s41598-019-53797-9/figures/1

Instead of the original U-Net architecture, our model is more similar to the one in Figure 1, from the “Convolutional Neural Networks enable efficient, accurate and fine-grained segmentation of plant species and communities from high-resolution UAV imagery” paper, where we have 3 contracting blocks and 3 upsampling (or expanding) blocks. Let’s get into the details.
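To make the "3 contracting blocks and 3 expanding blocks" layout concrete, here is a minimal sketch of how such a network could be wired in PyTorch. It is an illustration and not necessarily the exact model built in this story: the class name TinyUNet, the helper conv_block, the channel widths (32, 64, 128), the 4 input channels and the 2 output classes are all assumptions chosen for the example.

```python
import torch
from torch import nn


def conv_block(in_ch, out_ch):
    # Hypothetical helper: one 3x3 convolution that keeps the spatial size,
    # followed by batch normalization and ReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Illustrative 3-down / 3-up U-Net; all sizes here are assumptions."""

    def __init__(self, in_channels=4, out_channels=2):
        super().__init__()
        self.pool = nn.MaxPool2d(kernel_size=2)
        # Contracting path: each block is followed by a 2x downsampling.
        self.down1 = conv_block(in_channels, 32)
        self.down2 = conv_block(32, 64)
        self.down3 = conv_block(64, 128)
        # Expanding path: each transposed convolution doubles the spatial size.
        # Input widths of up2 and up1 include the concatenated skip connection.
        self.up3 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.up2 = nn.ConvTranspose2d(64 + 64, 32, kernel_size=2, stride=2)
        self.up1 = nn.ConvTranspose2d(32 + 32, out_channels, kernel_size=2, stride=2)

    def forward(self, x):
        c1 = self.pool(self.down1(x))   # 32 channels, H/2
        c2 = self.pool(self.down2(c1))  # 64 channels, H/4
        c3 = self.pool(self.down3(c2))  # 128 channels, H/8
        u3 = self.up3(c3)                          # 64 channels, H/4
        u2 = self.up2(torch.cat([u3, c2], dim=1))  # 32 channels, H/2
        u1 = self.up1(torch.cat([u2, c1], dim=1))  # out_channels, H
        return u1


model = TinyUNet()
model.eval()
with torch.no_grad():
    out = model(torch.randn(1, 4, 64, 64))
print(out.shape)  # torch.Size([1, 2, 64, 64])
```

The output has the same height and width as the input, with one channel per class, which is what a pixel-level segmentation needs. Note that the input size must be divisible by 8 in this sketch, so that the three downsampling steps and the three upsampling steps line up for the concatenations.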

Defining the Contracting Block
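As a rough illustration of what a contracting block usually looks like in a U-Net style network, the sketch below stacks two convolutions (each followed by batch normalization and ReLU) and ends with a max-pooling layer that halves the spatial resolution. The function name contract_block, its parameters and the pooling settings are assumptions made for this example, not a definitive implementation.

```python
import torch
from torch import nn


def contract_block(in_channels, out_channels, kernel_size=3, padding=1):
    """Hypothetical contracting block: (conv -> BN -> ReLU) x 2, then max-pool."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, kernel_size, padding=padding),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        # Assumed pooling choice: a 2x2 window with stride 2 halves H and W.
        nn.MaxPool2d(kernel_size=2, stride=2),
    )


block = contract_block(4, 32)     # 4 input channels is an assumption
x = torch.randn(2, 4, 128, 128)   # batch of 2 random 128x128 patches
y = block(x)
print(y.shape)  # torch.Size([2, 32, 64, 64])
```

With a kernel size of 3 and a padding of 1, the convolutions preserve height and width, so only the pooling layer changes the spatial size. Each contracting block therefore trades resolution for a larger number of feature channels.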

Maurício Cordeiro
Ph.D. Geospatial Data Scientist and water specialist at Brazilian National Water and Sanitation Agency. To get in touch: https://www.linkedin.com/in/cordmaur/