AI-assisted mapping of vegetation in power line corridors

Vladimir Ignatiev
Published in Geoalert platform
Oct 29, 2020

We are ready to announce the joint project “Monitoring power lines”, the result of a collaboration between the AeronetLab R&D team and GeoAlert LLC. GeoAlert and Skoltech launched it to monitor and vectorize vegetation in the protected zones of power lines. It is a cloud service that runs on-the-fly analytics of very-high-resolution (VHR) satellite imagery and provides access to the processing history through the GeoAlert platform’s API.

At the moment, we have successfully completed two commercial projects, received a patent for our technology, and are now moving the product into its scaling stage.

Producing a detailed map of the forested areas within power line corridors throughout the Leningrad region (with a total length of about 35,000 km) on our platform turned out to be a full-scale experiment similar to the “Urban Mapping” project, in which all buildings across the Russian Federation were automatically mapped by a neural network. This time we detected and vectorised about one million undesirable patches of trees and shrubs that encroach on the power line corridors. The combined area of this vegetation amounted to about 30% of the total area of the protected zones, which were derived from OpenStreetMap (OSM).

Distribution of forested areas and number of vector polygons by height category, extracted with our service for power lines in the Leningrad region

These simple statistics show that 50% of all forested areas fall into the height category “from 4 to 10 meters”. Regions with vegetation higher than 10 meters are more compact and well delineated, whereas vegetation lower than 4 meters appears as numerous small, vaguely outlined patches.
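A summary like this can be sketched in a few lines. The snippet below is only an illustration: the function names, the exact boundary handling (whether 4 m and 10 m belong to the middle category) and the sample values are assumptions, not part of our service.

```python
from collections import defaultdict

def height_category(height_m):
    """Map a height in meters to one of the three ranges named in the text.
    Treating 4 m and 10 m as part of the middle range is an assumption."""
    if height_m < 4:
        return "below 4 m"
    if height_m <= 10:
        return "4 to 10 m"
    return "above 10 m"

def summarize(polygons):
    """polygons: iterable of (area, mean_height_m) pairs.
    Returns {category: (polygon_count, share_of_total_area)}."""
    counts = defaultdict(int)
    areas = defaultdict(float)
    for area, height in polygons:
        category = height_category(height)
        counts[category] += 1
        areas[category] += area
    total = sum(areas.values())
    return {c: (counts[c], areas[c] / total) for c in counts}

# Made-up sample polygons (area in hectares, mean height in meters).
sample = [(0.5, 2.0), (0.5, 3.0), (2.0, 6.0), (1.0, 12.0)]
stats = summarize(sample)
```

In this toy sample the low category has the most polygons but only a quarter of the area, which mirrors the "numerous small patches" pattern described above.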

Processing for power lines in the Leningrad region (left) and fragment of the results of automatic segmentation (right). Security zones are derived from OSM
Fragment of the results of automatic segmentation coloured in Quantum GIS

Knowing the exact location of vegetation-covered areas makes it possible to assess, in a differentiated way, the effort required to clear power line corridors of woody vegetation and keep them clear, on both the regional and the local scale.

How a small part of a huge problem became an independent product

About a year and a half ago, our team at the Skoltech laboratory started a study on applying satellite imagery to forest inventory. We investigated how deep neural networks perform on a combination of LiDAR data and multispectral satellite imagery with submeter spatial resolution. For that, we used a subscription to Maxar’s GBDX service, which provides an API for processing archives of high-resolution multispectral images from sensors such as WorldView-2 and WorldView-3. One of the key parameters for forest inventory is the height of the forest stand. Estimating it still seems to be an unsolved problem for remote, scalable methods that must also deliver sufficient accuracy and a high level of automation. Yes, height can easily be measured from a 3D point cloud obtained through LiDAR scanning, for example using the LAStools software. This method gives centimetre-level accuracy in height, but it is quite expensive and resource-intensive. Our team likes to solve complex problems using AI and satellite images, the cheapest global source of remote sensing data, so the idea of replacing expensive LiDAR data with a neural network that can “generate” heights from a VHR image, for example from the WorldView-3 sensor, seemed very promising to us.

The training process was organized as follows (see the figure below). The input to the neural network is RGB (or RGB+NIR) imagery with a resolution of 0.3–0.8 m/px acquired during the active vegetation season. As reference data for heights, we used a digital Canopy Height Model (CHM): a georeferenced raster in which each pixel holds an absolute height value calculated from a 3D point cloud by interpolation. We had to tinker a bit with the shift between the CHM raster and the satellite imagery, which is caused by differences in acquisition conditions and in the spatial resolution of the data. After trying manual and automatic ways of aligning these data, we concluded that it is enough to choose a spatial resolution at which the shift no longer affects the accuracy of the prediction.

Design of training NN models with satellite imagery and CHM as ground truth

The first version of the technology was presented at the 2019 annual contest “Energy Breakthrough”, supported by the Skolkovo Foundation and PJSC Rosseti. This version of the algorithm detects forested areas and predicts vegetation heights inside them reasonably well: a pixel-wise F1-score of 0.84 for segmentation and an MAE of 3.3 meters for height estimation.
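For readers unfamiliar with these two metrics, here is a minimal sketch of how a pixel-wise F1-score and a mean absolute error (MAE) can be computed. The function names and toy rasters are hypothetical, and restricting the MAE to pixels that are forested in the reference data is an assumption rather than a description of our evaluation protocol.

```python
def pixel_f1(pred_mask, true_mask):
    """Pixel-wise F1-score for two binary masks given as lists of rows."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred_mask, true_mask):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def masked_mae(pred_heights, true_heights, mask):
    """Mean absolute height error over the pixels where mask is truthy."""
    errors = []
    for p_row, t_row, m_row in zip(pred_heights, true_heights, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:
                errors.append(abs(p - t))
    return sum(errors) / len(errors) if errors else 0.0

# Made-up 2 x 3 example: masks (1 = forest) and heights in meters.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
pred_h = [[10.0, 5.0, 0.0], [0.0, 8.0, 0.0]]
true_h = [[12.0, 0.0, 0.0], [0.0, 7.0, 6.0]]

f1 = pixel_f1(pred_mask, true_mask)
mae = masked_mae(pred_h, true_h, true_mask)
```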

Comparison of the overall forested areas of protected zones in the Republic of Tatarstan calculated by NN models and extracted from OSM

Although we did not finish among the top three, we received a wide range of feedback, from questions by the high-ranking jury about why we had decided to use images from foreign satellites to requests from IDGC technical specialists to test our technology. We also gained an understanding of the potential market volume (see the figure below). This motivated us to develop the technology further and package it into a self-contained product.

Segments of Russian market of vegetation monitoring in protected zones

Version 1.0

Modern image analysis techniques make it possible to map vegetation automatically from VHR satellite images. To fully automate the process, you also need to be able to obtain up-to-date images quickly at the user’s request. Our service combines all the necessary components of this process and makes it as user-friendly as possible.

To make our technology meet the demands of potential customers, two improvements were needed. We had to:

  1. increase the prediction accuracy of the heights and contours of forested areas in the power line corridors;
  2. automate the delineation of the protected zones of power lines according to the regulatory documents of PJSC Rosseti.

Improvement of segmentation masks

To address the first goal, we fine-tuned our models on a highly accurate dataset (see the figure below) that covers an area of ~50 sq. km and contains 14,547 labelled objects.

Examples of markup prepared to train CNN for precise segmentation of vegetation

From height classification to regression

At the beginning, we formulated height estimation as classifying each pixel into one of several predefined height ranges. During the pilot stages, we learned that it is better to reformulate the problem as regression on the neural network output, which lets us change the resulting height ranges on demand.
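The practical benefit of regression is that height ranges become a post-processing step. The sketch below illustrates the idea with hypothetical boundaries and heights; assigning a value that equals a boundary to the upper range is an assumption.

```python
from bisect import bisect_right

def to_category(height_m, boundaries):
    """Return the index of the range that height_m falls into.
    boundaries is a sorted list of range limits in meters; a height equal
    to a limit is assigned to the upper range (an assumption)."""
    return bisect_right(boundaries, height_m)

# Made-up regressed heights in meters.
heights = [1.5, 4.0, 9.9, 10.0, 17.2]

# Three ranges: below 4 m, 4 to 10 m, 10 m and above.
three_classes = [to_category(h, [4, 10]) for h in heights]

# A different set of ranges on demand, with no retraining of the model.
four_classes = [to_category(h, [2, 4, 10]) for h in heights]
```

With a classification model, changing the ranges would mean relabelling the data and retraining; here only the list of boundaries changes.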

Buffer design

The task of constructing the buffer of a protected zone belongs more to classical GIS development than to data analysis. However, it still requires experience with vector data and an understanding of end-user scenarios. The result is a convenient tool for the operator, delivered as a plugin for QGIS, the commonly used open-source GIS. The plugin sends requests to the processing platform via the GeoAlert Platform’s API, stores the processing history, and allows the user to load the processing results into a new project. Several examples of constructing a buffer zone for power lines in the Leningrad region from our pilot project are shown in the figure below.

Example of constructing a buffer of a protected zone for spans 156–157–158 of “VL-129 PS-57 Kuznechnoe-PS-34 Lahdenpohya”: the borders of spans and towers are shown in black, the lines of wire projections are shown in red, and the central line connecting neighboring towers is shown in green
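The geometric core of a span buffer can be illustrated with a deliberately simplified sketch: a rectangle around the straight centre line between two neighbouring towers. This is not the production algorithm. The function name and the 20 m half-width are hypothetical, the real width depends on the regulatory documents and the line's voltage class, and a real buffer would also account for the wire projections shown in the figure.

```python
import math

def span_buffer(tower_a, tower_b, half_width):
    """Corner points of a rectangular buffer around the straight segment
    between two towers. Coordinates are (x, y) in a projected CRS, in meters."""
    (ax, ay), (bx, by) = tower_a, tower_b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        raise ValueError("towers must not coincide")
    # Unit normal to the span direction, scaled to the buffer half-width.
    ox = -(by - ay) / length * half_width
    oy = (bx - ax) / length * half_width
    return [
        (ax + ox, ay + oy),
        (bx + ox, by + oy),
        (bx - ox, by - oy),
        (ax - ox, ay - oy),
    ]

# Hypothetical 100 m span running east, with a 20 m half-width.
corners = span_buffer((0.0, 0.0), (100.0, 0.0), 20.0)
```

Working span by span keeps the buffer aligned with the actual tower positions, and the per-span rectangles can then be merged into a continuous corridor in the GIS.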

Results and future work

Testing the performance of our algorithms during pilot projects consisted of visually comparing the model forecast with the CHM map and calculating the following set of metrics:

  1. Localization accuracy of forested areas:

  2. Height estimation accuracy:

To name a few of the benchmarks we reached:

  • the speed of automatic processing for a 110 kV power line is 2 km of corridor per minute on 1 GPU;
  • a manyfold acceleration of the workflow: we have not finished collecting feedback yet, but the preliminary estimate is 12–16 times;
  • the average score for localization of vegetated areas in the protected zone is E_loc = 0.8;
  • the average score for height classification of vegetated areas in the protected zone is E_cl = 0.75, with MAE = 2.7.
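To put the processing speed in perspective, here is a back-of-the-envelope estimate for a network the size of the Leningrad region's (about 35,000 km). It assumes, purely for illustration, that the 110 kV rate applies to every line and that throughput scales linearly with the number of GPUs; neither assumption has been measured.

```python
def processing_time_hours(length_km, km_per_minute, gpus=1):
    """Rough processing time in hours, assuming linear scaling with GPU count."""
    return length_km / (km_per_minute * gpus) / 60

hours_one_gpu = processing_time_hours(35_000, 2)
hours_four_gpus = processing_time_hours(35_000, 2, gpus=4)
```

Under these assumptions a single GPU would need roughly 290 hours, or about 12 days, of pure processing time for the whole region, excluding image acquisition and data transfer.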

At this stage, we offer users a new service deployed on the GeoAlert platform, which connects data sources, scales up processing, and provides access to the most advanced satellite imaging systems.

We appreciate your comments and proposals. Stay tuned for news about the project and sign up for our newsletter.
