Area Monitoring — Crop Type Marker

Published in Sentinel Hub Blog, Sep 1, 2020


Separating nuances of crop growth stages throughout the season

This post is part of a series of blog posts about our work in Area Monitoring. We have decided to share our knowledge on this subject openly, because we believe that all groups involved need to discuss and compare approaches. We welcome any kind of feedback, ideas and lessons learned. For those willing to share them publicly, we are happy to host them here.

The crop type marker assigns each feature of interest (FOI) to a crop type group using a trained machine learning (ML) model. Farmers' declarations typically contain hundreds of different crop types, which are grouped by properties such as phenology and farming practice. The grouping of crop types into crop groups can, and should, also take business goals into account. This post presents the results of a crop type marker trained to classify FOIs in Slovenia into eighteen different crop groups.

Reference data

The Slovenian Geospatial Aid Application (GSAA) datasets from 2017, 2018, and 2019 were used as the source of ground truth. Each dataset typically contains around 800,000 FOIs, and each FOI has a declared main crop cultivated during the growing season. The Slovenian datasets comprise almost 200 different crop types, which were grouped into meadows, fallow land, peas, hop, grass, winter rape, maize, winter cereals, ready legumes and/or grass mixture, pumpkins, vegetables, buckwheat, potatoes, vineyards, soybean, orchards and other.


The crop type model is a Long Short-Term Memory (LSTM) recurrent neural network. The benefits of LSTMs in the EO domain have been studied extensively (for example, in "A Satellite Time Series Dataset for Crop Type Identification" and "Self-Attention for Raw Optical Satellite Time Series Classification"), and they were shown to produce state-of-the-art results. The LSTM model can take raw, unprocessed EO time series as input and requires no cloud filtering. Temporal resampling to a fixed time grid is not required either. The results presented below show that LSTM crop type models can generalize across years: a model trained on past years can be used to transfer knowledge to the target year. This leads to better performance than models trained on target-year data only. In addition, less training data from the target year is required to obtain close-to-optimal results.

The training and test sets are constructed by dividing Slovenia into a grid with a cell size of around 10 km × 10 km, as illustrated in the figure below. Cells were randomly split into training and verification (test) cells. All FOIs (not shown) from blue cells are part of the training set, and all FOIs from red cells are part of the test set. The training set consists of around 300,000 FOIs per year and the test set of around 100,000.
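The grid-based split described above can be sketched in a few lines. This is a minimal illustration, not the code used in the project: it assumes each FOI is represented by a centroid in a projected coordinate system with metre units, and the names `grid_cell`, `split_by_cells`, `test_fraction` and `seed` are hypothetical. The fraction of cells assigned to the test set is also an assumption.

```python
import random
from collections import defaultdict

CELL_SIZE_M = 10_000  # roughly 10 km, as stated in the text


def grid_cell(x, y, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing point (x, y)."""
    return (int(x // cell_size), int(y // cell_size))


def split_by_cells(foi_centroids, test_fraction=0.25, seed=0):
    """Split FOIs so that every grid cell belongs to exactly one of the two sets.

    foi_centroids: dict mapping FOI id -> (x, y) centroid in metres (assumed).
    test_fraction: share of cells (not FOIs) used for testing (assumed value).
    """
    # Group FOI ids by the grid cell their centroid falls into.
    cells = defaultdict(list)
    for foi_id, (x, y) in foi_centroids.items():
        cells[grid_cell(x, y)].append(foi_id)

    # Shuffle the cells, not the FOIs, so neighbouring FOIs stay together.
    cell_keys = sorted(cells)
    random.Random(seed).shuffle(cell_keys)
    n_test = max(1, round(len(cell_keys) * test_fraction))
    test_cells = set(cell_keys[:n_test])

    train = [f for c in cell_keys if c not in test_cells for f in cells[c]]
    test = [f for c in cell_keys if c in test_cells for f in cells[c]]
    return train, test, test_cells
```

Splitting by cell rather than by individual FOI keeps neighbouring parcels in the same set, so the test score reflects performance on regions the model has not seen.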

The input features are time series of all 13 Sentinel-2 bands, without any cloud filtering or temporal resampling applied. Training deep learning models in batches requires fixed-length time series, so each time series is randomly sub-sampled to a fixed length of 45 observations while preserving the temporal order. When an FOI has fewer than 45 observations in total, all observations are taken and the time series is padded with a constant value.
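A minimal sketch of this sub-sampling and padding step is given below. The function name `to_fixed_length`, the padding value of 0.0 and the choice to pad at the end of the series are assumptions made for illustration; the text only states that a constant value is used.

```python
import random

SEQ_LEN = 45      # fixed sequence length from the text
N_BANDS = 13      # number of Sentinel-2 bands from the text
PAD_VALUE = 0.0   # assumption: the actual constant is not given in the text


def to_fixed_length(series, seq_len=SEQ_LEN, n_bands=N_BANDS,
                    pad_value=PAD_VALUE, rng=None):
    """Return exactly seq_len observations, kept in temporal order.

    series: list of observations in temporal order, each a list of n_bands values.
    """
    if rng is None:
        rng = random.Random()
    n = len(series)
    if n >= seq_len:
        # Draw seq_len distinct indices and sort them to preserve the order.
        keep = sorted(rng.sample(range(n), seq_len))
        return [series[i] for i in keep]
    # Too few observations: keep all of them and pad the end (assumed position).
    padding = [[pad_value] * n_bands for _ in range(seq_len - n)]
    return list(series) + padding
```

Drawing a new random subset each time a sample is loaded would also act as a form of augmentation; whether the original pipeline re-samples per epoch is not stated in the text.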

Results and discussion

We obtain the best result for 2019 by fine-tuning a model that was pre-trained on the 2017 and 2018 data. The per-class precision, recall, F1-score, and support are given in the table below, as evaluated on an independent test set of FOIs from geographically separate regions. The overall accuracy is 89.7%.
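For readers less familiar with these metrics, the sketch below shows how per-class precision, recall, F1-score and support, together with the overall accuracy, follow from a confusion matrix. The function name `classification_report` and the two-class toy matrix in the usage note are illustrative only and are not the results reported in this post.

```python
def classification_report(confusion, labels):
    """Compute per-class metrics and overall accuracy from a confusion matrix.

    confusion[i][j] is the number of FOIs whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    report = {}
    for i, label in enumerate(labels):
        true_positive = confusion[i][i]
        support = sum(confusion[i])                        # all FOIs truly in class i
        predicted = sum(confusion[r][i] for r in range(n)) # all FOIs predicted as class i
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }

    accuracy = correct / total if total else 0.0
    return report, accuracy
```

For a toy matrix `[[8, 2], [1, 9]]` with labels `["maize", "meadows"]`, the recall for maize is 8/10 and the overall accuracy is 17/20.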

The figure below shows the confusion matrix of the best performing model.

One of the biggest benefits of neural networks is the ability to fine-tune them. In practice, this means training a model on dataset A, in the hope that it learns a domain-specific representation, and then fine-tuning and applying the model on dataset B. During fine-tuning, even the learning rates of individual layers can be adjusted, which controls how much the low-level or high-level parameters are allowed to change.
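The idea of layer-specific learning rates can be illustrated without any deep learning framework. The sketch below applies one plain gradient descent step with a separate learning rate per named layer. The layer names, parameter values, gradients and learning rates are all made up for illustration, and the post does not state which optimizer or rates were actually used.

```python
def sgd_step(params, grads, learning_rates):
    """Apply one gradient descent step with a separate learning rate per layer.

    params, grads: dict mapping layer name -> list of floats.
    learning_rates: dict mapping layer name -> float (0.0 freezes the layer).
    """
    return {
        name: [w - learning_rates[name] * g for w, g in zip(values, grads[name])]
        for name, values in params.items()
    }


# Hypothetical two-layer model: a recurrent layer and a classification head.
params = {"lstm": [0.5, -0.2], "classifier": [1.0, 0.3]}
grads = {"lstm": [0.1, 0.1], "classifier": [0.1, 0.1]}

# A small rate keeps the pre-trained low-level layer close to its starting
# point, while the head is allowed to adapt faster to the new dataset.
learning_rates = {"lstm": 1e-4, "classifier": 1e-2}

updated = sgd_step(params, grads, learning_rates)
```

With identical gradients, the classifier weights move one hundred times further than the LSTM weights, which is the kind of control over low-level and high-level parameters described above.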

The figure below illustrates the benefit of training a model on other years (2017 and 2018) and then fine-tuning it on data from the target year (2019). Fine-tuning was carried out with different amounts of 2019 data to understand how performance depends on the size of the training dataset. The performance of a model trained from scratch on 2019 data only is shown for comparison.

The figure shows that fine-tuning leads to better-performing models for all training set sizes. A pre-trained model that is fine-tuned with only a few percent of the FOIs from 2019 can surpass a model trained from scratch on the entire 2019 training data. Fine-tuning the pre-trained model has the largest impact on less frequent classes, as shown in the figures below.

Recently, another model was trained to classify FOIs into twenty different crop groups. This grouping is more in line with the goals of the Common Agricultural Policy's controls, but it also takes the phenological properties of crops into account. The biggest differences from the grouping above are that permanent crops (orchards, hops, vineyards) are removed and that all crop types forming the "other" group are assigned to their proper group. With this grouping, a model trained from scratch on the 2019 dataset alone achieves an overall accuracy of 93.9%. The confusion matrix of this model is shown in the figure below. These results show how important it is to choose an appropriate crop type grouping.

Check the Area Monitoring documentation for more information.

Our research in this field is kindly supported, through grants and know-how, by our cooperation in Horizon 2020 (Perceptive Sentinel, NIVA, Dione) and ESA (Sen4CAP) projects.