Area Monitoring — Bare Soil Marker

Sinergise
Published in Sentinel Hub Blog
8 min read · Sep 1, 2020

Detecting ploughing, harvest and similar events

This post is one of a series of blog posts related to our work in Area Monitoring. We have decided to openly share our knowledge on this subject, as we believe that discussion and comparison of approaches are required among all the groups involved in it. We would welcome any kind of feedback, ideas and lessons learned. For those willing to share them publicly, we are happy to host them here.

The content:

The bare soil marker identifies all observations in which the feature of interest (FOI) is bare: either with exposed bare soil as a result of ploughing, or covered with non-photosynthetic vegetation as a consequence of harvest or of vegetation drying up on the field. The event itself, whether ploughing or harvest, cannot be detected in satellite imagery, but its consequences are observable. The bare soil marker thus identifies ploughing events by detecting exposed bare soil.

The marker is based on a decision tree with the following indices as input features:

  • Normalized Difference Vegetation Index
  • Bare-soil index
  • Normalized Difference Vegetation Index with Red Edge 3
  • and chlorophyll index with Red Edge
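The post names the four indices but does not spell out their formulas. As a rough sketch, assuming commonly used Sentinel-2 definitions (the band choices, in particular for the two red-edge indices, are assumptions, and the function names are hypothetical), the input features could be computed from band reflectances like this:

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def bare_soil_features(b02, b04, b05, b07, b08, b11):
    """Compute the four input features from Sentinel-2 reflectances.

    Assumed definitions (not confirmed by the post):
      NDVI     = (B08 - B04) / (B08 + B04)
      BSI      = ((B11 + B04) - (B08 + B02)) / ((B11 + B04) + (B08 + B02))
      NDVI_RE3 = (B08 - B07) / (B08 + B07)
      CL_RE    = B07 / B05 - 1
    """
    return {
        "NDVI": normalized_difference(b08, b04),
        "BSI": normalized_difference(b11 + b04, b08 + b02),
        "NDVI_RE3": normalized_difference(b08, b07),
        "CL_RE": b07 / b05 - 1.0 if b05 != 0 else 0.0,
    }


# Made-up reflectances, for illustration only.
features = bare_soil_features(b02=0.1, b04=0.2, b05=0.25, b07=0.3, b08=0.3, b11=0.4)
```

With these made-up reflectances, the NDVI comes out at about 0.2, in the range the post associates with the sowing period.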

Training dataset curation

One of the main challenges in the EO domain, and CAP and its use cases are no exception, is the lack of ground truth data. To our knowledge, there are no publicly available reference data that could be used to build and test event-based markers like the bare-soil (or ploughing) marker. We therefore relied on the domain knowledge of experts who are familiar with farming practices and the phenological properties of different crop types.

These domain experts provided the expected sowing, crop presence and harvest times for each of the crop groups found in the Slovenian GSAA dataset. Crops are usually sown in bare soil, and it takes a while before the sown plants emerge from the ground. Around the time of sowing, there is therefore a high probability that the ground is bare. We confirmed this for a large number of crop types by looking at NDVI and BSI profiles averaged over many FOIs. The figures below show averaged NDVI and BSI profiles for selected crop types with superimposed sowing, crop presence, and harvest time intervals. As can be seen from the profiles, the NDVI values are low (around 0.2) and the BSI values are high (around 0.0) during the sowing period, as expected. The NDVI and BSI profiles of meadows, which are never ploughed and never expected to be bare, are also shown for comparison.

Maize (left), summer barley (right)
Potato (left), meadow (right)

A training set for the bare-soil marker can be curated by selecting single bare-soil and not-bare-soil observations in the following way:

Bare-soil observations

  • Randomly sample 10% of FOIs claiming to grow any of 28 selected crop types belonging to different crop groups, such as winter cereals, summer cereals, corn, soybean, potato, buckwheat, etc. We confirmed that the sowing time interval for these crop types overlaps very well with low NDVI and high BSI values.
  • A single observation is then randomly selected from each selected FOI within a crop-type-specific sowing time interval.

Not-bare-soil observations

  • Randomly sample 10% of FOIs claiming to grow any of 34 selected crop types. The selected crop types are 28 crop types from above with the addition of crop types for which soil is never expected to be bare, such as meadows, grass and clover mixtures, orchards, etc.
  • Randomly sample a single observation from each selected FOI within a crop-type-specific crop presence time interval.

In addition, all sampled observations are required to have BSI > -0.1.
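The sampling procedure above can be sketched in a few lines. This is only an illustration: the data layout (dictionaries with `id`, `crop`, `observations`, `date` and `bsi` keys) and the function name are hypothetical, and applying the BSI filter before the random pick, rather than after it, is an assumption.

```python
import random
from datetime import date


def sample_observations(fois, intervals, fraction=0.10, min_bsi=-0.1, seed=0):
    """Pick at most one observation per sampled FOI within its crop's interval.

    fois: list of dicts with hypothetical keys "id", "crop" and
          "observations" (a list of dicts with "date" and "bsi").
    intervals: maps crop type -> (start_date, end_date), e.g. the sowing
               interval for bare-soil samples or the crop presence interval
               for not-bare-soil samples.
    """
    rng = random.Random(seed)
    # Only FOIs whose claimed crop type is among the selected crop types.
    eligible = [f for f in fois if f["crop"] in intervals]
    n = round(fraction * len(eligible))
    picked = []
    for foi in rng.sample(eligible, n):
        start, end = intervals[foi["crop"]]
        candidates = [
            o for o in foi["observations"]
            if start <= o["date"] <= end and o["bsi"] > min_bsi
        ]
        if candidates:  # FOIs without a valid observation are skipped
            picked.append((foi["id"], rng.choice(candidates)))
    return picked


# Synthetic example: 20 maize FOIs with three observations each.
example_fois = [
    {
        "id": i,
        "crop": "maize",
        "observations": [
            {"date": date(2019, 4, 20), "bsi": 0.0},
            {"date": date(2019, 5, 1), "bsi": -0.5},
            {"date": date(2019, 7, 15), "bsi": -0.3},
        ],
    }
    for i in range(20)
]
sowing = {"maize": (date(2019, 4, 15), date(2019, 5, 15))}
bare_soil_samples = sample_observations(example_fois, sowing)
```

Calling the same function with crop presence intervals instead of sowing intervals would give the not-bare-soil samples.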

It is important to note that the training set curated in this way by design includes “noisy” samples, i.e. not all sampled bare-soil observations are observations of bare soil.

Test set curation

To reliably estimate the performance of the bare-soil marker model, all Sentinel-2 observations of 2000 FOIs were labelled, with the help of domain experts, as bare soil, low vegetation (crop emerging from the ground or presence of residual vegetation after harvest), or not bare soil. In total, around 100000 Sentinel-2 observations were reviewed and labelled. We built a dedicated dashboard displaying true- or false-color images of all Sentinel-2 observations per FOI. Experts had to select the ones where bare soil or low vegetation was visible. A snapshot of the labelling dashboard is shown in the figure below.

Bare soil and low vegetation are best visible in false colour, where bare soil appears as brownish or greyish. When the crop starts to emerge (low vegetation) more pinkish and reddish tones appear. Fully-developed vegetation appears as red. Below are a few observations marked by experts as bare-soil and low vegetation.

Bare soil
Low vegetation

The test set consists of 10507 observations labeled as bare soil and 15494 observations labeled as low vegetation. The figure below shows the distribution of NDVI and BSI for manually labeled bare soil (BS), low vegetation (LV), and neither of these two classes (0).

Model

A decision tree classifier was trained to separate bare-soil from not-bare-soil observations. A decision tree was selected because it is believed to be less susceptible to noisy labels and less prone to overfitting. Once the model is trained, it is applied to classify all observations of all FOIs in the dataset as bare-soil or not-bare-soil.
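The post does not give the tree depth, the learned thresholds, or the library used. To illustrate only the mechanics, here is a minimal sketch of the simplest possible decision tree, a single split (a "stump") chosen by Gini impurity, trained on a handful of synthetic (NDVI, BSI) values. The data and all function names are made up for the example and say nothing about the actual model.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels):
    """Most frequent 0/1 label (ties resolved towards 1)."""
    return int(2 * sum(labels) >= len(labels))


def fit_stump(rows, labels):
    """Find the single (feature, threshold) split with the lowest weighted Gini.

    Returns (feature_index, threshold, label_if_at_or_below, label_if_above).
    """
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[1:]


def predict(stump, row):
    """Classify one feature row with a fitted stump."""
    j, t, at_or_below, above = stump
    return at_or_below if row[j] <= t else above


# Synthetic (NDVI, BSI) rows; label 1 = bare soil, 0 = not bare soil.
X = [(0.15, 0.05), (0.20, 0.00), (0.18, 0.02), (0.70, -0.30), (0.80, -0.35), (0.60, -0.25)]
y = [1, 1, 1, 0, 0, 0]
stump = fit_stump(X, y)
```

A real decision tree repeats this split search recursively on each side of the split; in practice one would normally rely on an existing library implementation rather than hand-rolled code.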

Results

The performance of the bare-soil marker is evaluated at the observation level and at the FOI level using different evaluation metrics. FOI-level metrics relate more closely to CAP metrics, where decisions are made per FOI.

Observation-level performance

The fraction of correctly identified bare-soil observations (true positives) depends on the crop type (group) and ranges between 80% (grass, clover mixtures) and 97% (winter cereals). The fraction of falsely identified bare-soil observations (false positives) also depends on the crop type (group) and ranges between 0.5% (grass, clover mixtures) and 10% (corn, winter cereals).
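As a small illustration of how such observation-level fractions can be computed, the sketch below assumes they correspond to the usual true positive rate and false positive rate; the post does not state the exact denominators, so this reading is an assumption, and the function name and data are hypothetical.

```python
def rates(predicted, labelled):
    """Return (true positive rate, false positive rate) for 0/1 lists.

    predicted: marker output per observation (1 = bare soil).
    labelled:  expert label per observation (1 = bare soil).
    """
    tp = sum(1 for p, t in zip(predicted, labelled) if p and t)
    fp = sum(1 for p, t in zip(predicted, labelled) if p and not t)
    positives = sum(1 for t in labelled if t)
    negatives = len(labelled) - positives
    tpr = tp / positives if positives else float("nan")
    fpr = fp / negatives if negatives else float("nan")
    return tpr, fpr


# Five made-up observations of one crop group.
tpr, fpr = rates(predicted=[1, 1, 0, 0, 1], labelled=[1, 0, 0, 0, 1])
```

Computing the two rates separately per crop group gives ranges of the kind quoted above.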

FOI-level performance

Intersection over union

We evaluate intersection over union (IOU), defined as

IOU = |M ∩ L| / |M ∪ L|,

where M and L denote the sets of bare-soil observations made by the marker and by the labelers for a given FOI. The numerator counts the observations flagged by both, and the denominator counts the observations flagged by either. The IOU metric is commonly used to evaluate the performance of semantic segmentation algorithms in the computer vision domain.

As an example calculation for FOI = 90536:

  • Ground truth bare soil observations (in day-of-year): {1, 11, 46, 51, 56, 71, 81, 91, 106, 216, 221, 231, 256}
  • Bare-soil marker observations (in day-of-year): {51, 56, 71, 81, 91, 111, 206, 216, 221, 231, 256}
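For these two sets, the calculation can be reproduced with plain Python sets (the variable and function names are chosen for illustration):

```python
# Day-of-year sets for FOI 90536, copied from the lists above.
truth = {1, 11, 46, 51, 56, 71, 81, 91, 106, 216, 221, 231, 256}
marker = {51, 56, 71, 81, 91, 111, 206, 216, 221, 231, 256}


def iou(a, b):
    """Intersection over union of two sets (NaN if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else float("nan")


shared = truth & marker    # observations flagged by both: 9
combined = truth | marker  # observations flagged by either: 15
print(iou(truth, marker))  # 9 / 15 = 0.6
```

Nine observations are shared, four appear only in the ground truth and two only in the marker output, giving 9 / 15 = 0.6 for this FOI.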

Intersection over union averaged over all FOIs in the test set is found to be 0.57.

FOI groupings and counts

The final evaluation is performed by grouping FOIs into three groups according to their counts of labelled bare soil and low vegetation observations:

  • not-ploughed: the number of labelled bare soil and low vegetation observations is 0,
  • ploughed: the number of labelled bare soil observations is larger than 0,
  • farmed: the number of labelled bare soil observations is 0 but the number of labelled low vegetation observations is larger than 0.

For each of the above groups, we calculate the fraction of FOIs without any bare-soil observation made by the marker. Averaging over all crop groups, the following results are obtained.

If an FOI has at least one observation labelled as bare soil, the bare-soil marker identifies at least one bare-soil observation in more than 99% of FOIs. If an FOI has zero observations labelled as low vegetation or bare soil, the bare-soil marker identifies at least one bare-soil observation in 35% of FOIs. Note that this is based on a very small number of FOIs (37).
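The grouping rule and the per-group fraction of FOIs without any marker detection can be sketched as follows. The tuple layout and function names are hypothetical, and the counts in the example are made up; the fraction of FOIs with at least one detection is simply one minus the returned value.

```python
def foi_group(n_bare, n_low):
    """Assign an FOI to a group from its counts of labelled observations."""
    if n_bare > 0:
        return "ploughed"
    if n_low > 0:
        return "farmed"
    return "not-ploughed"


def fraction_without_detection(fois):
    """Per group, the fraction of FOIs with zero bare-soil marker detections.

    fois: iterable of (n_bare_labelled, n_low_labelled, n_marker_detections).
    """
    totals, missed = {}, {}
    for n_bare, n_low, n_marker in fois:
        group = foi_group(n_bare, n_low)
        totals[group] = totals.get(group, 0) + 1
        missed[group] = missed.get(group, 0) + (1 if n_marker == 0 else 0)
    return {group: missed[group] / totals[group] for group in totals}


# Five made-up FOIs: (labelled bare soil, labelled low vegetation, marker detections).
example = [(3, 1, 2), (1, 0, 0), (0, 2, 1), (0, 0, 0), (0, 0, 4)]
fractions = fraction_without_detection(example)
```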

It is also interesting to take a closer look at the results per specific crop group. The table below shows the results for crop groups containing meadows, grasses, clover, and mixtures of these.

The test set intentionally consists of FOIs that have been ploughed. According to the results, the bare-soil marker correctly identifies 99% of such FOIs. Mowing events are in general much more frequent for these crop groups. A mowing event typically results in low vegetation, which the marker identifies as bare soil in around 22% of cases. This number could be reduced by removing bare-soil marker events that coincide with an identified mowing event, as in the figures below showing identified bare-soil and mowing events.

Bare soil events for a specific FOI.
Mowing events for a specific FOI.
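One possible way to implement the suggested clean-up is to drop bare-soil detections that fall close in time to a detected mowing event. The sketch below is only an illustration: the function name is hypothetical, and the 5-day tolerance is an arbitrary assumption, since the post does not define what "coincide" means.

```python
def drop_mowing_coincident(bare_soil_days, mowing_days, window=5):
    """Keep bare-soil detections more than `window` days from every mowing event.

    Both inputs are day-of-year values for a single FOI; `window` is an
    assumed tolerance, not a value taken from the post.
    """
    return [
        day for day in bare_soil_days
        if all(abs(day - mowing) > window for mowing in mowing_days)
    ]


# Made-up detections for one meadow FOI.
kept = drop_mowing_coincident([100, 150, 200], mowing_days=[148, 260])
```

In this made-up example, the detection on day 150 is removed because it lies within five days of the mowing event on day 148.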

Corn and winter cereals are two other very frequent groups. Their results are shown in the tables below.

The bare soil marker identifies all FOIs that have been ploughed or had low vegetation, which in these cases is due to farming activities: sowing leading to crop emergence, or harvest leading to residual vegetation on the field mixed with exposed soil.

An overview of the bare-soil classifier over Balaton (Hungary), taken on 2020-09-14. The violet overlay marks the detected bare soil. Click on the image to open it on EO Browser.

If you are curious to see how the bare soil marker performs in your area, try it in EO Browser. Let us know how it performs!

Check the Area Monitoring documentation for more information.

Our research in this field is kindly supported, in grants and knowhow, by our cooperation in Horizon 2020 (Perceptive Sentinel, NIVA, Dione) and ESA projects (Sen4CAP).
