Dstl Satellite Imagery Feature Detection

By Anudeep Joshi
Published in CodeX, Jun 19, 2022
Kaggle

Table of Contents:

1. Business problem

2. Overview of Data

3. Performance Metrics

4. Exploratory Data Analysis

5. Existing Approach

6. Data Pre-Processing

7. Modelling

8. Error Analysis

9. Deployment

10. Future Work

11. References

1. Business Problem

Analysis of satellite and aerial images plays a major role in fields such as disaster management, defence, monitoring the effects of global warming and urban planning. Much of this work can be automated by combining it with deep learning and computer vision.

Object recognition is a primary task in analyzing satellite images. It can now be done more accurately thanks to advances in both hardware (CPUs and GPUs) and deep learning techniques.

This blog covers a similar problem statement taken from a Kaggle competition, in which 1 km x 1 km satellite images are provided in various bands, and the goal is to detect and classify the types of objects found in each region.

2. Overview of Data

2.1. train_wkt.csv

The WKT format of all the training labels.

• ImageId — ID of the image
• ClassType — the type of objects (1–10)
• MultipolygonWKT — the labeled area, a multipolygon geometry represented in WKT (well-known text) format

2.2. three_band.zip

The complete dataset of 3-band satellite images.

2.3. sixteen_band.zip

The complete dataset of 16-band satellite images.

Bands

2.4. grid_sizes.csv

The grid sizes for all the images.

• ImageId — ID of the image
• Xmax — maximum X coordinate for the image
• Ymin — minimum Y coordinate for the image

2.5. train_geojson.zip

The GeoJSON format of all the training labels (essentially the same information as train_wkt.csv).

2.6. Class Label

  1. Buildings — large building, residential, non-residential, fuel storage facility, fortified building
  2. Misc. Manmade structures
  3. Road
  4. Track — poor/dirt/cart track, footpath/trail
  5. Trees — woodland, hedgerows, groups of trees, standalone trees
  6. Crops — contour ploughing/cropland, grain (wheat) crops, row (potatoes, turnips) crops
  7. Waterway
  8. Standing water
  9. Vehicle Large — large vehicle (e.g. lorry, truck, bus), logistics vehicle
  10. Vehicle Small — small vehicle (car, van), motorbike

3. Performance Metrics

The Jaccard index (intersection over union) is used as the performance metric.

Jaccard index illustrations (source: Wikipedia)
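For two sets A and B the Jaccard index is |A ∩ B| / |A ∪ B|, which for binary masks equals TP / (TP + FP + FN). Below is a minimal sketch of that definition for a single class, assuming NumPy arrays as masks. The function name and the convention for two empty masks are my own assumptions for illustration, not necessarily what the competition scorer does.

```python
import numpy as np


def jaccard_index(y_true, y_pred):
    """Jaccard index (intersection over union) of two binary masks."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return float(intersection / union)
```

A score of 1 means the predicted mask matches the label exactly, and 0 means there is no overlap at all.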

4. Exploratory Data Analysis

4.1. Frequency of Class Label

Bar Plot
heatmap of number of polygons in classes vs Images

Observations:

  1. There are 25 unique images
  2. Every image contains trees
  3. Waterways are present in only a few images
  4. Almost all images contain both trees and tracks

4.2. Multipolygon

A few comparisons between the labeled polygons and the original images are shown below.

multipolygon

4.3. ClassWise Multipolygon

The image with its class-wise multipolygons is shown below.

Classwise multipolygon

4.4. Areas of object

Area of the object in Images

Observations:

  1. Crops cover the largest area
  2. Waterways and vehicles have the lowest area coverage

5. Existing Approach

  1. Existing approach 1

Overview of Dataset

Number of bands of various types of Images

All images are resized to the size of the 3-band RGB image and then concatenated. The result has 20 channels and a size of 3348 x 3392. Because this array is too large to process at once, it is split into patches of size 112 x 112 x 20. Patching is applied to both the images and the masks, and various DNN architectures, such as a multispectral U-Net, an inverted pyramid model and PSPNet, are trained on the patches.
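The resize-and-concatenate step can be sketched roughly as follows. This is only an illustration under my own assumptions: the arrays are channels-last, the 20 channels come from 3 (RGB) + 1 (P) + 8 (M) + 8 (A) bands, and a simple nearest-neighbour resize in NumPy stands in for whatever interpolation the original approach used. The helper names `resize_nearest` and `stack_bands` are hypothetical.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Resize an (H, W, C) array to (out_h, out_w, C) by nearest-neighbour sampling."""
    in_h, in_w = img.shape[:2]
    # For every output row/column, pick the matching source row/column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows[:, None], cols[None, :]]


def stack_bands(rgb, p, m, a):
    """Resize the P, M and A bands to the RGB size and concatenate along channels."""
    h, w = rgb.shape[:2]
    resized = [resize_nearest(band, h, w) for band in (p, m, a)]
    return np.concatenate([rgb] + resized, axis=-1)
```

With a 3-channel RGB image, a 1-channel P image and 8-channel M and A images, `stack_bands` returns an array with 20 channels at the RGB resolution.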

2. Existing approach 2

In this approach, only the 8 bands of the M-band images are used to train the model, which is a U-Net.

6. Data Pre-Processing

The main goal here is to prepare the dataset that will later be used to train the model.

Several functions are created to scale pixel values to a given range and to extract masks from the MultipolygonWKT values provided in the DataFrame (train_wkt_v4.csv).

Using these functions, the masks of the images are extracted and stored in a folder as .tif files.
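Two of the helper steps described above could look roughly like this. It is a sketch, not the exact functions from my code: the percentile limits are assumed values, and the coordinate conversion follows the scaling suggested in the competition's data description as I understand it (Xmax and Ymin come from grid_sizes.csv), so treat both as assumptions. Both function names are hypothetical. Filling the converted polygons to get the actual mask is left to a rasterizer and is not shown.

```python
import numpy as np


def scale_percentile(img, low=2, high=98):
    """Stretch each channel of an (H, W, C) image to [0, 1] between two percentiles."""
    img = img.astype(np.float32)
    lo = np.percentile(img, low, axis=(0, 1), keepdims=True)
    hi = np.percentile(img, high, axis=(0, 1), keepdims=True)
    # Guard against a zero range in constant channels.
    scaled = (img - lo) / np.maximum(hi - lo, 1e-6)
    return np.clip(scaled, 0.0, 1.0)


def wkt_coords_to_pixels(coords, x_max, y_min, height, width):
    """Map polygon vertices from grid coordinates to integer (col, row) pixels."""
    coords = np.asarray(coords, dtype=np.float64)
    # Assumed scaling: W' = W * W / (W + 1), H' = H * H / (H + 1).
    w_scaled = width * width / (width + 1)
    h_scaled = height * height / (height + 1)
    cols = coords[:, 0] / x_max * w_scaled
    rows = coords[:, 1] / y_min * h_scaled
    return np.stack([cols, rows], axis=1).astype(np.int32)
```

Since Ymin is negative and the polygon y values are negative too, dividing by Ymin gives positive row indices that grow downwards, which matches image coordinates.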

Patches of all the input images and masks are then created and stored as .npy files.

The final all_images.npy and all_masks.npy files are used for training the model.
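The patching step can be sketched as below. This is a minimal version under my own assumptions: non-overlapping square patches, zero padding at the bottom and right edges, and a default patch size of 112 borrowed from the existing approach described earlier. The function name `make_patches` is hypothetical.

```python
import numpy as np


def make_patches(array, patch_size=112):
    """Split an (H, W, C) array into non-overlapping square patches.

    The array is zero-padded at the bottom and right so that both
    dimensions become multiples of patch_size. Patches are returned
    row by row with shape (n_patches, patch_size, patch_size, C).
    """
    h, w, c = array.shape
    pad_h = (-h) % patch_size
    pad_w = (-w) % patch_size
    padded = np.pad(array, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    n_rows = padded.shape[0] // patch_size
    n_cols = padded.shape[1] // patch_size
    # Split both spatial axes into (block index, offset inside block).
    patches = padded.reshape(n_rows, patch_size, n_cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(n_rows * n_cols, patch_size, patch_size, c)
```

The same function is applied to an image and to its mask so that patch i of the image lines up with patch i of the mask. For a 3348 x 3392 array and 112 x 112 patches this padding scheme gives 30 x 31 = 930 patches.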

7. Modelling

A U-Net model is trained on the dataset generated above.

U-Net is a well-known architecture for segmentation tasks that was originally developed for biomedical images. Its name comes from its symmetric U shape: a contracting encoder and an expanding decoder that are linked by skip connections.

UNet Architecture

The plots of epoch vs loss and epoch vs Jaccard_coef are shown below.

epoch vs Jaccard_coef and epoch vs loss

One thing to note is that the last layer uses a sigmoid activation and 'binary_crossentropy' is used as the loss.

As a result, the model outputs a probability score for every pixel in every channel, i.e. for every class label.

A separate threshold is then chosen for each class label to turn these probabilities into binary masks.
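One common way to pick such thresholds is to sweep a few candidate values per class and keep the one with the highest Jaccard score on held-out data. The sketch below shows that idea only; the candidate grid, the array layout (N, H, W, C) and the function names are assumptions for illustration and may differ from the original snippet.

```python
import numpy as np


def _jaccard(y_true, y_pred):
    """Jaccard index of two boolean masks (two empty masks score 1.0)."""
    inter = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    return 1.0 if union == 0 else float(inter / union)


def best_thresholds(y_true, y_prob, candidates=None):
    """Pick, for every class, the threshold with the highest Jaccard score.

    y_true: binary masks of shape (N, H, W, C).
    y_prob: predicted probabilities of the same shape.
    Returns (thresholds, scores), each of length C.
    """
    if candidates is None:
        candidates = np.arange(0.1, 1.0, 0.1)  # assumed search grid
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob)
    n_classes = y_true.shape[-1]
    thresholds = np.zeros(n_classes)
    scores = np.full(n_classes, -1.0)
    for c in range(n_classes):
        for t in candidates:
            score = _jaccard(y_true[..., c], y_prob[..., c] > t)
            # Keep the first threshold that reaches the best score.
            if score > scores[c]:
                thresholds[c], scores[c] = t, score
    return thresholds, scores
```

The thresholds should be tuned on validation patches rather than on the test set, otherwise the reported score is optimistic.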

The Jaccard score on the test dataset is 0.67.

Prediction of the above model

The input image is split into patches, and the model predicts a mask for each patch. These predicted mask patches are then combined and compared with the original masks.

Original mask vs predicted mask
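Combining the predicted patches back into a full mask is the reverse of patching. Here is a rough sketch, assuming the patches are non-overlapping, ordered row by row, and that the image was padded at the bottom and right before patching. The function name `stitch_patches` is hypothetical.

```python
import numpy as np


def stitch_patches(patches, height, width):
    """Reassemble (n_patches, P, P, C) patches into a (height, width, C) array.

    Patches are expected row by row; padding at the bottom and right
    edges is cropped away.
    """
    n, p, _, c = patches.shape
    n_rows = -(-height // p)  # ceiling division
    n_cols = -(-width // p)
    if n != n_rows * n_cols:
        raise ValueError("number of patches does not match the target size")
    grid = patches.reshape(n_rows, n_cols, p, p, c).transpose(0, 2, 1, 3, 4)
    full = grid.reshape(n_rows * p, n_cols * p, c)
    return full[:height, :width]
```

The stitched array has the same height and width as the original image, so it can be compared with the original mask pixel by pixel.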

8. Error Analysis

Image IDs with a low Jaccard score (i.e. below 0.2) are extracted, and their EDA is shown below.

number of objects per Image for a low Jaccard score

Image IDs with an average Jaccard score (i.e. between 0.2 and 0.6) are extracted, and their EDA is shown below.

number of objects per Image for an average Jaccard score

Observations:

  1. The low Jaccard scores are mainly due to the very small area covered by some object classes (such as small and large vehicles). For these classes many pixels are misclassified, which drags the Jaccard score of that class towards 0. This in turn lowers the overall Jaccard score of the image, since it is the average of the Jaccard scores of all classes.

9. Deployment

The complete final pipeline is deployed using Streamlit.

YouTube link for the above pipeline execution ->

10. Future Work

  1. A SegNet model can be trained on the dataset to try to improve the Jaccard score
  2. A separate model can be trained for small objects like vehicles
  3. All 20 bands can be used for training and prediction

11. References

  1. https://www.kaggle.com/code/anomsulardi/dstl-semantic-segmentation
  2. https://www.kaggle.com/code/visoft/export-pixel-wise-mask/script
  3. https://www.kaggle.com/code/drn01z3/end-to-end-baseline-with-u-net-keras/script
  4. https://www.kaggle.com/code/ksishawon/segnet-dstl
  5. https://www.appliedaicourse.com/course/11/Applied-Machine-learning-course

You can find my complete code here: GitHub Repo

You can connect with me on LinkedIn
