Area Monitoring — Homogeneity Marker

Jernej Puc

Published in Sentinel Hub Blog

6 min read · Oct 13, 2020

Detecting uniformity of parcels through crop signatures

This post is one in a series of blog posts about our work in Area Monitoring. We have decided to share our knowledge on this subject openly, as we believe that discussion and comparison of approaches are needed among all the groups involved. We welcome any kind of feedback, ideas and lessons learned. For those willing to share them publicly, we are happy to host them here.

The content:

High-Level Concept
Data Handling
Outlier detection
Identifying built-up areas
Similarity Score
Bare Soil Marker
Mowing Marker
Pixel-level Mowing Marker
Crop Type Marker
Homogeneity Marker (this post)
Parcel Boundary Detection
Land Cover Classification (still to come)
Minimum Agriculture Activity (still to come)
Combining the Markers into Decisions
The Challenge of Small Parcels
The value of super resolution — real world use case
Traffic Light System
Expert Judgement Application
Agricultural Activity Throughout the Year

Since most of our processing is based on average top-of-atmosphere reflectance values per feature of interest (FOI) throughout the year, it is convenient to treat each FOI as having a single crop type that covers the whole polygon. This is true for the majority of FOIs, and if an FOI is known to consist of multiple polygons of different types, it can be trivially broken down into separate homogeneous parts as well.

However, this assumption does not always hold: a supposedly uniform FOI may actually be covered by crops of several types, which corrupts the subsequent results.

The homogeneity marker aims to identify these inhomogeneous FOIs.

Intuition

In an optimistic scenario, two types of crops should have different year-long sequences of reflectances (spectral signatures). While averaged values over pixels within an FOI are not enough to practically decouple the original signals, adding the minimum and maximum alone could already prove sufficiently indicative.

An idealised scenario, where the minimum and maximum could imprint the two original signals, while the mean would lead to a loss of information.

Comparing the difference between the maximum and the minimum of different FOIs would require some sort of normalisation, e.g. dividing the difference by the mean. This gets us close to another metric: the standard deviation (std). Additionally, it may be more sensible to average it over longer time-frames instead of looking at values at every observation date.
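The two spread measures mentioned above can be compared in a few lines. The following is only a minimal sketch using the Python standard library: the function names and the toy reflectance values are hypothetical, and the population standard deviation is an assumption, as the text does not say which estimator was used.

```python
from statistics import mean, pstdev

def spread_metrics(pixel_values):
    """Return (normalised range, standard deviation) for one FOI on one date."""
    mu = mean(pixel_values)
    normalised_range = (max(pixel_values) - min(pixel_values)) / mu
    return normalised_range, pstdev(pixel_values)

def time_averaged_std(observations):
    """Average the per-date standard deviation over several observation dates."""
    return mean(pstdev(pixels) for pixels in observations)

# Hypothetical B04 reflectances of the pixels inside one FOI on three dates.
uniform_foi = [[0.10, 0.11, 0.10, 0.11],
               [0.20, 0.21, 0.20, 0.21],
               [0.30, 0.31, 0.30, 0.31]]
# Same layout, but half of the pixels follow a different crop signature.
mixed_foi = [[0.10, 0.11, 0.30, 0.31],
             [0.20, 0.21, 0.05, 0.06],
             [0.30, 0.31, 0.10, 0.11]]

uniform_score = time_averaged_std(uniform_foi)
mixed_score = time_averaged_std(mixed_foi)
```

In this toy case the mixed FOI gets a clearly higher time-averaged standard deviation than the uniform one, which is the behaviour the marker relies on.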

An experiment, where the distributions of B04_std for homogeneous and inhomogeneous samples are easily distinguishable. It is expected that the latter distribution is shifted towards higher values, as it corresponds to greater differences.

Results

The classifier model, based on standard deviations of reflectances that were processed as described above, was used to process the entirety of our target dataset.

Distribution of classifier outputs for the target dataset.

As suspected, some FOIs are confidently predicted to be inhomogeneous:

Inhomogeneous examples, identified by the model, as seen in different parts of the season.

FOIs with the lowest degree of confidence towards either class can be ambiguous for various reasons. Most such examples that we looked at had very few pixels, so their statistical measures are bound to be noisy and inaccurate, which is why the model was not exposed to them during training. However, as shown in the example below, low confidence can also result from localised changes, e.g. farming activity that affects only part of the FOI.

An example that cannot be confidently assigned a class by the homogeneity model, as seen in different parts of the season.

Data

The training set was composed of homogeneous (positive) and inhomogeneous (negative) examples. The latter were sourced from FOIs that were known to consist of multiple polygons of different crop types, while the former were sourced from FOIs for which homogeneity was assumed (note that these FOIs were also the intended target for the marker to review).

Because the quality of statistical metrics (including the standard deviation) depends on the number of samples over which they are computed, we imposed a threshold for reliability. After the discretisation of polygonal bounds, we only considered FOIs with geometries containing at least 8 full (Sentinel-2) pixels. Inhomogeneous FOIs were filtered by an additional criterion: by not allowing the area of the most represented crop within an FOI to exceed 62.5% of the total area of that FOI, the ambiguity between positive and negative examples should be decreased.
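The two filtering rules can be written down compactly. This is a sketch under stated assumptions rather than the actual pipeline: the thresholds come from the text, while the function names and the representation of an FOI as a pixel count plus a dictionary of per-crop areas are hypothetical.

```python
MIN_FULL_PIXELS = 8         # minimum number of full Sentinel-2 pixels (from the text)
MAX_DOMINANT_SHARE = 0.625  # cap on the dominant crop's share of the area (from the text)

def dominant_share(crop_areas):
    """Share of the FOI area covered by its most represented crop."""
    return max(crop_areas.values()) / sum(crop_areas.values())

def is_reliable(pixel_count):
    """An FOI is usable only if it contains enough full pixels."""
    return pixel_count >= MIN_FULL_PIXELS

def is_eligible_negative(pixel_count, crop_areas):
    """Inhomogeneous training example: enough pixels and no overly dominant crop."""
    return is_reliable(pixel_count) and dominant_share(crop_areas) <= MAX_DOMINANT_SHARE

# Hypothetical FOIs, described by per-crop areas in arbitrary units.
kept = is_eligible_negative(20, {"wheat": 6.0, "maize": 4.0})     # share 0.6
dropped = is_eligible_negative(20, {"wheat": 9.0, "maize": 1.0})  # share 0.9
```

An FOI in which one crop covers 90% of the area is dropped from the negative pool, since it would look almost homogeneous to the model.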

Taking these rules into account we were left with about 10,000 negative examples. Since the pool of eligible positive examples was much larger, the training set was balanced by sampling from it the same number of FOIs (without replacement).
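Balancing by sampling without replacement can be sketched as follows. The function name, the fixed seed and the integer FOI identifiers are assumptions made for illustration only.

```python
import random

def balance_by_undersampling(positives, negatives, seed=0):
    """Draw as many positives as there are negatives, without replacement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    sampled_positives = rng.sample(positives, k=len(negatives))
    return sampled_positives, negatives

# Hypothetical FOI identifiers: a large positive pool and a smaller negative one.
positive_ids = list(range(1000))
negative_ids = list(range(1000, 1100))

balanced_pos, balanced_neg = balance_by_undersampling(positive_ids, negative_ids)
```

Because `random.sample` never picks the same element twice, no positive FOI appears more than once in the balanced set.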

Model

During feature exploration, which included the experiment from the beginning, it was observed that:

distributions vary across bands and parts of the season, and
distributions corresponding to bands B01, B09, and B10 do not differ much between homogeneous and inhomogeneous samples, while the differences for the rest are generally prominent.

This would suggest that the other 10 bands are suitable feature sources for the homogeneity model. Ultimately, the features were obtained by cleaning up and averaging the standard deviation per FOI and the 10 relevant bands over 3 distinct parts of the season, which amounted to 30 input features per sample.
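The feature construction described above might look roughly like the sketch below. The band list is what remains of the Sentinel-2 bands after excluding B01, B09 and B10. The month ranges for the three parts of the season, the feature naming scheme and the input layout are hypothetical, and the clean-up step is assumed to have been applied already.

```python
from statistics import mean

BANDS = ["B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B11", "B12"]

# Hypothetical inclusive month ranges; the actual season boundaries are not given.
SEASON_PARTS = {"early": (3, 5), "mid": (6, 8), "late": (9, 10)}

def build_features(std_by_band):
    """Map {band: [(month, std), ...]} for one FOI to a flat feature dictionary."""
    features = {}
    for band in BANDS:
        for part, (first, last) in SEASON_PARTS.items():
            values = [std for month, std in std_by_band[band] if first <= month <= last]
            # Missing observations yield NaN instead of raising an error.
            features[f"{band}_std_{part}"] = mean(values) if values else float("nan")
    return features

# Toy input: one cleaned std value per month (March to October) for every band.
example = {band: [(month, 0.01 * month) for month in range(3, 11)] for band in BANDS}
features = build_features(example)
```

With 10 bands and 3 parts of the season, the resulting dictionary holds the 30 input features per sample.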

Validation

After training, there are many ways of verifying that the model performs as intended. For the current version, we trained a LightGBM model, which is based on decision trees, so we can start by taking a look at the feature importances.

In this case, feature importances represent the number of times each feature was used to split a decision tree during training. Splitting is done by finding the feature and value threshold that best divide the remaining samples into distinct classes, which implies that the most frequently chosen features are the most relevant for a given instance of classification.
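To make the notion of a split concrete, here is a simplified, self-contained illustration of searching for the best feature and threshold at a single tree node. It is not LightGBM's implementation: Gini impurity is swapped in as the split criterion for simplicity, and the function names and toy data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return the (feature index, threshold) with the lowest weighted impurity."""
    best_feature, best_threshold, best_score = None, None, float("inf")
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best_feature, best_threshold, best_score = feature, threshold, score
    return best_feature, best_threshold

# Toy samples with two features; only the second one separates the classes.
rows = [[0.5, 0.01], [0.4, 0.02], [0.6, 0.08], [0.5, 0.09]]
labels = [1, 1, 0, 0]  # 1 = homogeneous, 0 = inhomogeneous

chosen_feature, chosen_threshold = best_split(rows, labels)
```

Counting how often each feature index is chosen across all nodes of all trees gives the split-based importance described above.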

Note that bands B04 and B08 find themselves among the most important features. Intuitively, this makes sense, as they are both constituents of the normalised difference vegetation index (NDVI), widely used for similar purposes. Emphasis on the mid and late season was also expected, as suggested by the distributions in the initial feature exploration.
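For reference, NDVI combines the two bands as a normalised difference of the near-infrared (B08) and red (B04) reflectances. The helper below is a small sketch with a hypothetical function name; it does not guard against a zero denominator.

```python
def ndvi(nir, red):
    """NDVI from near-infrared (B08) and red (B04) reflectances."""
    return (nir - red) / (nir + red)

# Hypothetical reflectances of a vegetated pixel.
example_ndvi = ndvi(nir=0.4, red=0.1)
```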

Next, we run the classifier on the validation set, i.e. a chunk of the preprocessed data that was not used for training the model.

The validation set consists of random samples that the classifier had not seen during training and serves as a proxy for estimating the general capabilities of the classifier.

The histograms above show that, for the most part, the model confidently separates the two classes. This corresponds to reasonable performance scores on the validation set:

As for where the (potential) errors come from, we can look into some likely sources of bias.

For starters, the classifier’s performance could depend on the number of pixels within FOI geometries: higher numbers would lead to better statistical measures, but would also mean that the FOI covers a larger area, where natural differences (e.g. in the soil) could occur. In the end, no notable relationship between the pixel count and the classifier’s pseudo-probability output was found: for both classes, the joint distribution resembles the combination of the two overall one-dimensional distributions.

Coloured scatterplots, functioning as 2-dimensional histograms and depicting (the absence of) the relationship between classifier outputs and pixel count per FOI.

More interesting observations can be derived by looking at the relationship with how the inhomogeneous FOIs are split:

Classifier pseudo-probability versus the computed homogeneity, i.e. the ratio between the area covered by the most represented crop and the total area per FOI. The asymmetry around homogeneity 0.5 is caused by the nonlinear max operation: if an FOI consists of two fields, the larger one determines the score, resulting in values of at least 0.5.

The model reflects the fact that 62.5% was chosen as the cut-off point for the training data. FOIs with higher computed homogeneity are spread between the two extremes, with the pseudo-probability appearing to move towards higher values as computed homogeneity increases.

Finally, we can see that FOIs with more crops are more easily recognised as inhomogeneous:

These results are in line with our intentions. Now that we can identify inhomogeneous outliers, the next step is to split them into homogeneous parts using a method for field delineation, such as the one we describe in another of our blog posts.

Check the Area Monitoring documentation for more information.

Our research in this field is kindly supported, in grants and know-how, through our cooperation in Horizon 2020 (Perceptive Sentinel, NIVA, Dione) and ESA projects (Sen4CAP).