Mapping Weeds and Crops in Precision Agriculture with Convolutional Neural Networks
Use Semantic Segmentation to identify crops, weeds and soil in agricultural aerial images
In this post we show you how to automatically map weeds in a plantation from aerial images using Semantic Segmentation and TensorFlow. We use sugar cane plantation images from Brazilian farms, acquired with a fixed-wing agricultural drone, and ground truths generated by a biologist, to train and compare several semantic segmentation models.
Precision agriculture is a relatively new field characterized by the use of technology to increase crop productivity and quality while applying specific policies to preserve the environment. One promising application is mapping with fixed-wing drones: even a large plantation can be covered quickly, and the acquired images can be used to target the application of fertilizers, herbicides and pesticides. This is attractive both economically and environmentally: precise use of fertilizers and pesticides means fewer potentially harmful chemicals in the environment.
Infrared images have long been a standard for investigating vegetation from the air, so drones were used to acquire multispectral images. This required expensive and heavy cameras, which reduced the drone’s flight time and range. This has changed.
Convolutional neural networks are much better at differentiating types of vegetation because the description of texture and color they learn is integrated and customized to the task. This allows us to use ordinary visible-light images and off-the-shelf RGB cameras. The result? Less expensive equipment, with lighter cameras and smaller drones.
Sugar Cane Plantations, Orthomosaics and Ground Truths
We use orthorectified images from two sugar cane fields in Brazil. The images were captured with a fixed-wing UAV flying at an altitude of 125 to 200 meters, resulting in a resolution of approximately 5 cm/pixel. The images were rectified with drone-mapping photogrammetry software.
To make the huge orthomosaic images manageable for the neural networks, we automatically divided the orthomosaics into non-overlapping image fields of 512 x 512 pixels. We discarded fields containing only black pixels. This left a total of 228 images with actual content. We randomly divided these image fields into a training set (n=161, about 70%), a validation set (n=44, about 20%) and a test set (n=23, about 10%).
For the whole dataset, an expert biologist produced a manual ground truth (GT). There are three classes: crop, soil and weed. We showed the first plantation in the figure at the beginning of this post. The second one is shown below. For both orthomosaics we show one of our 512 x 512 fields and its GT.
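The tiling and splitting steps described above can be sketched roughly as follows. This is only an illustration, not our original preprocessing script: the function names `tile_orthomosaic` and `split_dataset` are hypothetical, the orthomosaic is assumed to be already loaded as a NumPy array, and partial tiles at the image borders are simply dropped (an assumption).

```python
import random

import numpy as np

TILE = 512  # edge length of each image field in pixels, as in the text


def tile_orthomosaic(mosaic, tile=TILE):
    """Cut an H x W x C array into non-overlapping tile x tile patches,
    skipping patches that contain only black (all-zero) pixels."""
    tiles = []
    height, width = mosaic.shape[:2]
    # Assumption: partial tiles at the right and bottom borders are ignored.
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            patch = mosaic[top:top + tile, left:left + tile]
            if patch.any():  # keep only patches with actual content
                tiles.append(patch)
    return tiles


def split_dataset(items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items and split them into train / validation / test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(fractions[0] * len(items))
    n_val = round(fractions[1] * len(items))
    # Whatever remains after train and validation goes to the test set.
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]
```

Because of rounding, the set sizes produced by this sketch may differ by a few images from the counts reported above.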
The whole dataset is available for download at:
The first field contains only sugar cane. The second one contains weeds.
Semantic Segmentation of Weeds and Crops
A lot has been written about semantic segmentation here on Towards Data Science. We won’t repeat it and will go directly to our experiments. We employed the Semantic Segmentation Suite by George Seif.
We selected four CNN models that have achieved very good results in different semantic segmentation applications, e.g. indoor and outdoor scene parsing, road and city scenarios, and biomedical imagery from competition datasets. The SegNet and UNet models are more traditional and were among the first to outperform classical computer vision techniques in segmentation challenges such as PASCAL VOC. The other two, PSPNet and FRRN, are used in this work because of their innovative feature extraction techniques and their strong results on very difficult segmentation datasets.
Our code, with a Jupyter Notebook that makes a few adaptations to George Seif’s original code and trains and tests the four neural network models above (SegNet, UNet, PSPNet and FRRN) using the two sugar cane field datasets, is available here:
Let’s look at it! The code below assumes you’ll be running everything on Google Colab and have downloaded the Semantic Segmentation Suite by George Seif or our Git repository (which contains a copy of all parts of the suite you’ll need). If you plan to run it on your own computer, you’ll have to install TensorFlow and Jupyter.
Setting your Dataset
The first thing you need to do is organize the folder structure of your data as explained in the “Usage” part of the repository. Do not forget to edit the text file “class_dict.csv” so that it describes your own classes. Note that our dataset was stored in a folder called Dataset_ArticleBackground, and the code below reflects this. You will have to adapt the code to your environment. After that, you just need to upload the content to your Google Drive.
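As a rough sketch of this step, the snippet below creates the sub-folders and a “class_dict.csv” for our three classes. The folder names and the `name,r,g,b` header reflect our reading of the suite’s “Usage” section, so please verify them against the repository. The RGB values and the helper name `prepare_dataset_folder` are hypothetical: use the colors that actually appear in your ground-truth images.

```python
import csv
from pathlib import Path

# Sub-folders as we understand them from the suite's "Usage" section
# (an assumption; check the repository README).
SUBFOLDERS = ["train", "train_labels", "val", "val_labels", "test", "test_labels"]

# Hypothetical label colours; replace with the RGB values used in your GTs.
CLASSES = [("crop", 0, 255, 0), ("soil", 255, 0, 0), ("weed", 255, 255, 0)]


def prepare_dataset_folder(root, classes=CLASSES):
    """Create the dataset sub-folders and write class_dict.csv inside root."""
    root = Path(root)
    for name in SUBFOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)
    csv_path = root / "class_dict.csv"
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["name", "r", "g", "b"])  # header row
        writer.writerows(classes)                 # one row per class
    return csv_path
```

Calling `prepare_dataset_folder("Dataset_ArticleBackground")` would give you an empty skeleton into which you copy your image fields and their ground truths.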
Mounting your data:
Next, mount your Google Drive so that the notebook can reach the place where the repository scripts and your dataset are stored:
# Mount Google Drive under /content/drive
from google.colab import drive
drive.mount('/content/drive')
Check the processor
To use the available GPU, go to:
Edit >> Notebook settings, then choose the Runtime type and select GPU as Hardware accelerator. The code below checks which GPU is being used.
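One way to do this check is sketched here. It assumes the `nvidia-smi` tool is on the PATH, as is usually the case on NVIDIA GPU runtimes; the helper name `describe_gpu` is our own, and it returns `None` when no GPU tooling is found.

```python
import shutil
import subprocess


def describe_gpu():
    """Return a line describing the visible NVIDIA GPU, or None if none is found."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on this machine
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # the tool exists but could not talk to a GPU
    return result.stdout.strip() or None


print(describe_gpu() or "No GPU detected: check the Notebook settings")
```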
Train the model
Go to the directory where you mounted your project and call the training script.
In our work this is train_balancing_metrics.py. You also need to provide some parameters. We used the following:
- num_epochs = 200
- dataset = “The folder where our dataset is located”
- num_val_images = 44, the number of images in our validation set
- h_flip and v_flip = True, to use flip operations for data augmentation
- model = “FRRN-B”, or any other model you choose
- batch_size = 3 (worked for us!)
- continue_training = False, to start training from the beginning
In our repository there is an explanation for all the parameters that can be used.
%cd /content/drive/My\ Drive/DeepLearning/Semantic-Segmentation-Suite-master/
!python train_balancing_metrics.py --num_epochs=200 --dataset="Dataset_ArticleBackground" --num_val_images=44 --h_flip=True --v_flip=True --model="FRRN-B" --batch_size=3 --continue_training=False
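To give an idea of what the h_flip and v_flip options do, here is a small illustration of flip-based augmentation. It is not the suite’s actual implementation, and the function name `random_flip` is hypothetical. The important point is that the image and its ground truth must always receive the same flip, otherwise the labels no longer match the pixels.

```python
import random

import numpy as np


def random_flip(image, label, h_flip=True, v_flip=True, rng=random):
    """Randomly mirror an image and its label mask with the same transform."""
    if h_flip and rng.random() < 0.5:
        # Left-right mirror, applied to both arrays.
        image, label = image[:, ::-1], label[:, ::-1]
    if v_flip and rng.random() < 0.5:
        # Top-bottom mirror, applied to both arrays.
        image, label = image[::-1, :], label[::-1, :]
    return image, label


# Example on a dummy 512 x 512 RGB field and its single-channel label mask.
field = np.zeros((512, 512, 3), dtype=np.uint8)
mask = np.zeros((512, 512), dtype=np.uint8)
aug_field, aug_mask = random_flip(field, mask)
```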
Test the model
Here is the code to test your model on your test set. Call the test script (test.py) and pass the parameters. The checkpoint_path is the path where the weights of the trained model are located.
%cd /content/drive/My\ Drive/DeepLearning/Semantic-Segmentation-Suite-master/
!python test.py --dataset="Dataset_ArticleBackground" --model="FRRN-B" --checkpoint_path='checkpoints/latest_model_FRRN-B_Dataset_ArticleBackground.ckpt'
Make a Prediction
Use this code when you want to make a prediction for a single new image.
Call predict.py with the correct parameters.
%cd /content/drive/My\ Drive/DeepLearning/Semantic-Segmentation-Suite-master/
!python predict.py --dataset="Dataset_ArticleBackground" --model="FRRN-B" --checkpoint_path='checkpoints/latest_model_FRRN-B_Dataset_ArticleBackground.ckpt' --crop_height=512 --crop_width=512 --image="Dataset_ArticleBackground/test/115.png"
We obtained the best results with SegNet. The table below shows the results we obtained:
The picture at the beginning of this post shows a comparison of some results.
This work was the result of a collaborative effort of a team of engaged researchers besides me:
- Alexandre Monteiro <email@example.com>
- Paulo Cesar Pereira Junior <firstname.lastname@example.org>
- Antonio Carlos Sobieranski <email@example.com>
- Rafael da Luz Ribeiro <firstname.lastname@example.org>