From a research paper to a deep learning model with Keras and Python for image segmentation
OK, you have discovered U-Net, cloned a repository from GitHub, and have a feel for what is going on. Nothing teaches more than doing, so this post will step you through coding up a deep learning U-Net style model, taking it from a research paper to a working tool. It assumes you have a basic level of Python and have worked a little with Keras and deep learning models.
The paper we are working from is Road Detection and Centerline Extraction via Deep Recurrent Convolutional Neural Network U-Net by Yang et al., which proposes a model to simultaneously extract road networks and centerlines from satellite images using a recurrent CNN (RCNN).
What is a Recurrent CNN?
Above you can see a conventional CNN block on the left and a recurrent CNN block on the right. The advantages of the RCNN¹ are that it preserves low-level detail and, because it reuses the convolutional layer in the block complete with its weights, it has a similar, if not lower, number of parameters than a conventional CNN.
From Diagram to Code
Keras contains prebuilt layers that make translating the diagram to code straightforward. All the layers we need are there, but we do need a few tricks. We need Conv2D, BatchNormalization, Activation, and Add to build up an RCNN model block. With reference to the RCNN diagram above, note that the ReLU activation comes after batch normalisation, so rather than use the activation argument of the Conv2D layer, we stack 3 layers: Conv2D, BatchNormalization, and Activation.
Using the Keras functional API, tensor_output = layer()(tensor_input), the model block is implemented below. Note the additional convolutional layer that transforms the input to the correct shape for reuse by the recurrent layer, which is then applied repeatedly. Also note that the block has 5 convolutional layers, consistent with the paper, instead of the 3 in the diagram above, and that the function returns a Keras model for instantiation.
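Since the original listing is not reproduced here, the following is a minimal sketch of such a block. The function name rcnn_block and the steps parameter are my own naming, and it assumes the Keras bundled with TensorFlow; the 1×1 projection plus the shared 3×3 convolution applied 4 times gives the 5 convolutional layers mentioned above.

from tensorflow.keras import layers, models

def rcnn_block(input_shape, filters, steps=3, name='rcnn'):
    # A recurrent convolutional block: one shared Conv2D layer is applied
    # repeatedly, with the shape-matched input added back before each pass.
    inputs = layers.Input(shape=input_shape)

    # 1x1 convolution to project the input to `filters` channels so it
    # can be summed with the recurrent feature maps.
    skip = layers.Conv2D(filters, 1, padding='same')(inputs)

    # Defined once and reused at every step, so the weights are shared.
    shared_conv = layers.Conv2D(filters, 3, padding='same')

    x = shared_conv(skip)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)   # ReLU after batch normalisation

    for _ in range(steps):
        x = layers.Add()([x, skip])    # feed the block input back in
        x = shared_conv(x)             # same layer, same weights
        x = layers.BatchNormalization()(x)
        x = layers.Activation('relu')(x)

    return models.Model(inputs, x, name=name)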
Overall Model Structure
The overall model structure from the paper is a typical U-Net, with Yang et al. proposing 3 derivatives; see the diagram below. We will code up version b, RCNN-Unet2, which has 2 outputs.
In their paper, Yang et al. present the model in a table, as most research papers do, and this becomes our implementation blueprint. With an RCNN block defined, it becomes easy to implement the full model: for the encoder, an RCNN model block is followed by a pooling layer, and for the decoder, an RCNN model block is preceded by an upsampling layer.
The model implementation, roadNet, is detailed below. Working through the block levels from 1 to 7, the model is built up from RCNN blocks together with the pooling and upsampling layers. Note how the input size for each RCNN block is set when it is defined, and how the concatenate layers connect the features from the encoder to the decoder.
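Again, as the original listing is not reproduced here, below is a sketch of what roadNet might look like, built on the rcnn_block above. The input size, filter counts, and depth are illustrative assumptions; the table in the paper is the authoritative blueprint.

def road_net(input_shape=(256, 256, 3)):
    inputs = layers.Input(shape=input_shape)

    # Encoder: each RCNN block is followed by max pooling
    rcnn1 = rcnn_block((256, 256, 3), 32, name='rcnn1')(inputs)
    pool1 = layers.MaxPooling2D(2)(rcnn1)
    rcnn2 = rcnn_block((128, 128, 32), 64, name='rcnn2')(pool1)
    pool2 = layers.MaxPooling2D(2)(rcnn2)
    rcnn3 = rcnn_block((64, 64, 64), 128, name='rcnn3')(pool2)
    pool3 = layers.MaxPooling2D(2)(rcnn3)

    # Bottleneck
    rcnn4 = rcnn_block((32, 32, 128), 256, name='rcnn4')(pool3)

    # Decoder: upsample, concatenate the encoder features, RCNN block
    up5 = layers.UpSampling2D(2)(rcnn4)
    cat5 = layers.concatenate([up5, rcnn3])
    rcnn5 = rcnn_block((64, 64, 384), 128, name='rcnn5')(cat5)

    up6 = layers.UpSampling2D(2)(rcnn5)
    cat6 = layers.concatenate([up6, rcnn2])
    rcnn6 = rcnn_block((128, 128, 192), 64, name='rcnn6')(cat6)

    up7 = layers.UpSampling2D(2)(rcnn6)
    cat7 = layers.concatenate([up7, rcnn1])
    rcnn7 = rcnn_block((256, 256, 96), 32, name='rcnn7')(cat7)

    # Split after block 7: each output passes through 2 more layers
    # (the head layer sizes here are assumptions, not the paper's values)
    road = layers.Conv2D(32, 3, padding='same', activation='relu')(rcnn7)
    road = layers.Conv2D(1, 1, activation='sigmoid', name='road')(road)
    centerline = layers.Conv2D(32, 3, padding='same', activation='relu')(rcnn7)
    centerline = layers.Conv2D(1, 1, activation='sigmoid', name='centerline')(centerline)

    return models.Model(inputs, [road, centerline], name='roadNet')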
The model version we are implementing has 2 outputs that are optimized concurrently: one for extracting the road network and one for extracting the centerline of the road. Above you can see the split at conv7, which passes through 2 more layers to each of the road and centerline outputs.
The code for the multiple outputs and losses is shown below.
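A minimal sketch of compiling the two-output model, assuming binary cross-entropy for both heads and equal loss weights (the paper's exact loss functions and weights may differ):

# Instantiate the model and compile with one loss per named output.
# The loss choices and weights below are assumptions for illustration.
model = road_net()
model.compile(
    optimizer='adam',
    loss={'road': 'binary_crossentropy',
          'centerline': 'binary_crossentropy'},
    loss_weights={'road': 1.0, 'centerline': 1.0},
)

When training, the targets are then passed per output, e.g. model.fit(x, {'road': road_masks, 'centerline': centerline_masks}, ...).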
What does the model look like?
Below is an image of the final model structure, generated with the following code.
# Generate and save a plot of the model architecture
from tensorflow.keras import utils

utils.plot_model(model, 'model.png')
Note the U-Net architecture with the feed-forward (skip) connections from the encoder to the decoder. The model trains well for my work on LIDAR, but hasn’t been applied to road networks yet.
Conclusion
In this post you have seen how to use the standard Keras layers to code a deep learning model from a research paper. If more sophisticated layers or loss functions are needed, then Keras allows for custom layers and loss functions.
If you are interested in the data generators used for this model and the training code, please let me know. I hope you have found this post useful.
¹ Road Detection and Centerline Extraction via Deep Recurrent Convolutional Neural Network U-Net, Yang et al., IEEE Transactions on Geoscience and Remote Sensing, May 2019.