Creating a Very Simple U-Net Model with PyTorch for Semantic Segmentation of Satellite Images
--
Introduction
In my previous story (here), I showed how to create a multi-channel dataset for satellite images from scratch, without using the torchvision module.
Now we will move on to create a simple deep learning model for semantic segmentation of satellite images, and check how it performs on the 38-Cloud: Cloud Segmentation in Satellite Images dataset, from Kaggle.
To make things easier, this code is available in the Kaggle notebook 38-Cloud-Simple_Unet, here.
The Model
Many different model architectures can be used for pixel-level segmentation of images. The 2019 Guide to Semantic Segmentation is a good guide to many of them, showing the main differences in their concepts. However, when we check the official PyTorch model zoo (the repository of pre-trained deep learning models), the only semantic segmentation models available are FCN and DeepLabV3, both built on ResNet backbones.
Besides being very deep and complex models (requiring a lot of memory and time to train), they are conceived and pre-trained to identify a completely different set of classes (for example: boat, bird, car, cat, etc.) than our target here, which is clouds in satellite images.
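To see what this means in practice, here is a quick sketch that loads one of these zoo models and counts its parameters (this uses the torchvision API of that era, with the pretrained flag; the printed figure is approximate):

```python
import torchvision

# DeepLabV3 with a ResNet-101 backbone, pre-trained on a subset of COCO
# using the 21 PASCAL VOC classes (background, aeroplane, ..., boat, bird,
# car, cat, ...). None of these classes is "cloud", and the model expects
# 3-channel RGB input, while our satellite patches have 4 bands.
model = torchvision.models.segmentation.deeplabv3_resnet101(pretrained=True)
model.eval()

# On the order of 60M parameters -- heavy to fine-tune on a small dataset.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```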
The model code that I found on GitHub for PyTorch was also complex to understand and to implement, so I decided to create a cut-down version of the U-Net model, proposed for biomedical image segmentation in 2015 (the original paper can be found here).
Instead of the original U-Net architecture, our model is more similar to the one in Figure 1 of the paper “Convolutional Neural Networks enable efficient, accurate and fine-grained segmentation of plant species and communities from high-resolution UAV imagery”, where we have 3 contracting blocks and 3 upsampling (or expanding) blocks, as sketched below. Let’s get into the details.
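Here is a minimal sketch of what such a cut-down U-Net can look like in PyTorch. The exact channel sizes and the contract_block/expand_block helpers are illustrative assumptions; in_channels=4 assumes the four 38-Cloud bands (red, green, blue and near-infrared), and out_channels=2 assumes the two classes (cloud / not cloud):

```python
import torch
import torch.nn as nn

class SimpleUNet(nn.Module):
    """Cut-down U-Net sketch: 3 contracting blocks and 3 expanding blocks."""

    def __init__(self, in_channels=4, out_channels=2):
        super().__init__()
        # Contracting path: each block doubles the channels and halves H x W.
        self.conv1 = self.contract_block(in_channels, 32, 7, 3)
        self.conv2 = self.contract_block(32, 64, 3, 1)
        self.conv3 = self.contract_block(64, 128, 3, 1)
        # Expanding path: blocks 2 and 1 take doubled input channels
        # because of the concatenated skip connections.
        self.upconv3 = self.expand_block(128, 64, 3, 1)
        self.upconv2 = self.expand_block(64 * 2, 32, 3, 1)
        self.upconv1 = self.expand_block(32 * 2, out_channels, 3, 1)

    def forward(self, x):
        conv1 = self.conv1(x)
        conv2 = self.conv2(conv1)
        conv3 = self.conv3(conv2)
        upconv3 = self.upconv3(conv3)
        # Skip connections: concatenate encoder features along channels.
        upconv2 = self.upconv2(torch.cat([upconv3, conv2], dim=1))
        upconv1 = self.upconv1(torch.cat([upconv2, conv1], dim=1))
        return upconv1

    def contract_block(self, in_channels, out_channels, kernel_size, padding):
        # Two conv + batch-norm + ReLU layers, then max-pool to downsample.
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )

    def expand_block(self, in_channels, out_channels, kernel_size, padding):
        # Two conv + batch-norm + ReLU layers, then a transposed
        # convolution to double the spatial resolution.
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, kernel_size, padding=padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.ConvTranspose2d(out_channels, out_channels, kernel_size=3,
                               stride=2, padding=1, output_padding=1),
        )
```

Because each expanding block upsamples back to the resolution of the matching contracting block, the outputs can be concatenated directly, and the final output has the same height and width as the input. A quick shape check with a random 384×384 patch (the 38-Cloud patch size):

```python
model = SimpleUNet(in_channels=4, out_channels=2)
x = torch.randn(1, 4, 384, 384)  # one 4-band patch
print(model(x).shape)            # torch.Size([1, 2, 384, 384])
```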