Severstal Steel defect detection

Kovi Bhargav
8 min read · Sep 4, 2021


Steel defect detection using deep learning technology

Image source: https://www.istockphoto.com/photos/steel-manufacturing

Table of contents

  1. Introduction
  2. Business Problem
  3. Mapping to Deep Learning problem
  4. Understanding the Data
  5. Exploratory Data Analysis
  6. Functions to convert images to masks and masks to images
  7. Visualizing Images and masks of various defect types
  8. Understanding the Unet Architecture
  9. Model Explanation
  10. Results and Deployments
  11. Conclusions and Future Work
  12. Profile
  13. References

1. Introduction

Steel is prone to defects during manufacturing and shipping, and large manufacturers find it difficult to detect these defects through manual inspection alone. This leaves scope to train a machine learning or deep learning model to detect them.

Severstal is a Russian company operating mainly in the steel and mining industry, headquartered in Cherepovets. Severstal hosted a Kaggle competition and provided images of defective steel as the data.

This story describes my work on that Kaggle competition.

2. Business Problem

Identifying defects in steel is a tedious, repetitive task for humans. There are different types of defects (scratches, broken parts, welding sediments, etc.), and it is sometimes very difficult to assign a defect to one of these types. Delivering defective steel to a customer leads to customer dissatisfaction.

3. Mapping to Deep Learning problem

Given an image with some defect pixels, the task is to classify each defect pixel into its correct defect type and to identify the non-defect pixels as well.

Because the classification happens at the pixel level, the problem can be posed as an image segmentation task.

Dice loss is used as the loss function and the IoU (intersection over union) score as the metric.
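As a rough illustration of the loss and the metric, here is a minimal NumPy sketch. It assumes one-hot masks of shape (height, width, classes) and a smoothing term of 1.0 to avoid division by zero; both choices are assumptions for this example, not details taken from my training code.

```python
import numpy as np

def dice_loss(y_true, y_pred, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap) / (sum of both masks)."""
    intersection = np.sum(y_true * y_pred)
    total = np.sum(y_true) + np.sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (total + smooth)

def iou_score(y_true, y_pred, smooth=1.0):
    """Intersection over union of the two masks."""
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)

# Toy 2 x 2 image with 2 classes (one-hot on the last axis).
y_true = np.array([[[1, 0], [0, 1]],
                   [[0, 1], [1, 0]]], dtype=float)
perfect = y_true.copy()   # prediction identical to the ground truth
wrong = 1.0 - y_true      # every pixel assigned to the other class

print(dice_loss(y_true, perfect), iou_score(y_true, perfect))
print(dice_loss(y_true, wrong), iou_score(y_true, wrong))
```

A perfect prediction gives a Dice loss of 0 and an IoU of 1, while a completely wrong prediction pushes the loss towards 1 and the IoU towards 0.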

4. Understanding the Data

The given data consists of a folder of training images and a CSV file with the image ID, class ID, and encoded pixels. The CSV file can therefore contain multiple rows with the same image ID, each row for a different defect type.

For this work, the CSV file was pivoted so that the data frame has only one row per image ID, with the columns below:

ImageId: unique image ID.

Defect_1: Defect 1 encoded pixels, if present.

Defect_2: Defect 2 encoded pixels, if present.

Defect_3: Defect 3 encoded pixels, if present.

Defect_4: Defect 4 encoded pixels, if present.

hasDefect: flag to indicate whether pixels of at least one defect type are present.

hasDefect_1: flag to indicate whether defect type 1 pixels are present.

hasDefect_2: flag to indicate whether defect type 2 pixels are present.

hasDefect_3: flag to indicate whether defect type 3 pixels are present.

hasDefect_4: flag to indicate whether defect type 4 pixels are present.
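The pivot can be sketched with pandas roughly as follows. The input column names (ImageId, ClassId, EncodedPixels), the tiny example rows, and the list of image names are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical long-format rows: one row per (image, defect class).
long_df = pd.DataFrame({
    "ImageId": ["a.jpg", "a.jpg", "b.jpg"],
    "ClassId": [1, 3, 4],
    "EncodedPixels": ["1 3", "10 2", "5 4"],
})
# Hypothetical list of all images; c.jpg has no rows, so it has no defects.
all_images = ["a.jpg", "b.jpg", "c.jpg"]

# One row per image, one column per defect class.
wide = long_df.pivot(index="ImageId", columns="ClassId", values="EncodedPixels")
wide = wide.reindex(index=all_images, columns=[1, 2, 3, 4])
wide.columns = [f"Defect_{c}" for c in wide.columns]

# Flags: 1 if encoded pixels exist for that defect type, else 0.
for c in range(1, 5):
    wide[f"hasDefect_{c}"] = wide[f"Defect_{c}"].notna().astype(int)
wide["hasDefect"] = wide[[f"hasDefect_{c}" for c in range(1, 5)]].max(axis=1)

wide = wide.rename_axis("ImageId").reset_index()
print(wide)
```

Reindexing against the full image list is what keeps defect-free images in the data frame, since they have no rows in the long format.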

5. Exploratory Data Analysis

Distribution of steel images among defect and non-defect images

There are 10054 images in total: 5333 with at least one defect and 4721 with no defects.

53.04% of the images show defective steel and 46.96% show no defects, so the data is almost balanced between defect and non-defect images.
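The percentages follow directly from the counts, as this small check shows (the variable names are only for illustration):

```python
total, defective, clean = 10054, 5333, 4721

# The two groups together should account for every image.
assert defective + clean == total

pct_defective = round(100 * defective / total, 2)
pct_clean = round(100 * clean / total, 2)
print(pct_defective, pct_clean)  # 53.04 46.96
```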

Distribution of steel images among defect type images

Observation: There are 4127 type 3 defects and just 198 type 2 defects. Hence, the data is highly imbalanced.

Distribution of the number of defects in each image

Observation: There are very few images with multiple defects.

Images with no defects

Observation: The above images are labelled as no-defect steel images, yet some marks are still visible on them. This is a challenge for the model: even though other kinds of marks exist, it has to detect only the pixels of the 4 given defect types and treat everything else as no-defect pixels.

6. Functions to convert images to masks and masks to images

rle2_1frame: Given a list with the run-length encoded information of the 4 defects, the function returns a matrix with the same shape as the input image, in which each pixel holds 0 (no defect) or 1, 2, 3, or 4 for the respective defect type.

rle_to_RGBmask: Given a list of lists of run-length encoded information for the 4 defect types and a class-to-colour dictionary, the function returns an RGB image.
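A minimal sketch of this kind of decoding is shown below. The function names rle_to_mask and rles_to_frame are hypothetical, not the ones in my notebook. The sketch assumes the competition's encoding: "start length" pairs with 1-indexed starts, and pixels numbered top to bottom, then left to right.

```python
import numpy as np

def rle_to_mask(rle, height, width):
    """Decode one 'start length start length ...' string into a binary mask."""
    flat = np.zeros(height * width, dtype=np.uint8)
    if isinstance(rle, str) and rle.strip():
        values = np.array(rle.split(), dtype=int)
        starts, lengths = values[0::2] - 1, values[1::2]  # starts are 1-indexed
        for start, length in zip(starts, lengths):
            flat[start:start + length] = 1
    # Column-major reshape: pixels run down each column first.
    return flat.reshape((height, width), order="F")

def rles_to_frame(rles, height=256, width=1600):
    """Combine four RLE entries (defects 1 to 4) into one frame with values 0 to 4."""
    frame = np.zeros((height, width), dtype=np.uint8)
    for class_id, rle in enumerate(rles, start=1):
        frame[rle_to_mask(rle, height, width) == 1] = class_id
    return frame

# Tiny 3 x 2 example: defect 1 covers pixels 1-2, defect 3 covers pixel 4.
frame = rles_to_frame(["1 2", None, "4 1", None], height=3, width=2)
print(frame)
```

Missing entries (None or NaN) simply produce an empty mask. If two defect types overlapped on a pixel, the higher class number would win in this sketch.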

classes_tocolour = dict({0: [0, 0, 0], 1: [255, 105, 180], 2: [180,255,105], 3:[105, 180,255], 4: [ 255, 255,105]})

RGBmask_to_width_height_classes: Given an RGB image and a colour map, the function returns a NumPy array of shape width X height X classes. In this case study, the number of classes is 5 (the four defect types plus the no-defect background).

colourmap = [[0, 0, 0], [255, 105, 180], [ 180,255,105],[ 105, 180,255], [255, 255,105]]
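One way to turn an RGB mask into one channel per class is to compare every pixel with each colour in the map. The function name rgb_to_onehot is hypothetical, and the sketch stores arrays in NumPy's usual (rows, columns, channels) order.

```python
import numpy as np

colourmap = [[0, 0, 0], [255, 105, 180], [180, 255, 105], [105, 180, 255], [255, 255, 105]]

def rgb_to_onehot(rgb_mask, colourmap):
    """Return an array with one 0/1 channel per colour in the colour map."""
    height, width, _ = rgb_mask.shape
    onehot = np.zeros((height, width, len(colourmap)), dtype=np.float32)
    for class_id, colour in enumerate(colourmap):
        # A pixel belongs to the class when all three colour values match.
        onehot[..., class_id] = np.all(rgb_mask == np.array(colour), axis=-1)
    return onehot

# 2 x 2 RGB mask containing classes 0, 1, 3 and 4.
rgb = np.array([[[0, 0, 0], [255, 105, 180]],
                [[105, 180, 255], [255, 255, 105]]], dtype=np.uint8)
onehot = rgb_to_onehot(rgb, colourmap)
print(onehot.shape)  # (2, 2, 5)
```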

width_height_classes_toRGB: Given a NumPy array of shape (width, height, classes), the function returns an RGB image.

classes_tocolour = dict({0: [0, 0, 0], 1: [255, 105, 180], 2: [180,255,105], 3:[105, 180,255], 4: [ 255, 255,105]})

one_frame_rgb: Given a single frame with values 0 to 4 denoting the defect type, the function returns an RGB image.
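Both conversions back to RGB can be sketched with a colour palette lookup. The names frame_to_rgb and onehot_to_rgb are hypothetical, and the sketch assumes the dictionary keys are the contiguous class numbers 0 to 4.

```python
import numpy as np

classes_tocolour = {0: [0, 0, 0], 1: [255, 105, 180], 2: [180, 255, 105],
                    3: [105, 180, 255], 4: [255, 255, 105]}

def frame_to_rgb(frame, classes_tocolour):
    """Map a frame of class numbers (H, W) to an RGB image (H, W, 3)."""
    palette = np.array([classes_tocolour[c] for c in sorted(classes_tocolour)],
                       dtype=np.uint8)
    return palette[frame]  # integer indexing looks up one colour per pixel

def onehot_to_rgb(onehot, classes_tocolour):
    """Pick the strongest channel per pixel, then colour it."""
    return frame_to_rgb(np.argmax(onehot, axis=-1), classes_tocolour)

frame = np.array([[0, 1], [3, 4]])
rgb = frame_to_rgb(frame, classes_tocolour)
print(rgb[0, 1])  # [255 105 180]
```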

7. Visualizing Images and masks of various defect types

No defect images and masks: No-defect pixels are shown in black.

Defect 1 images and masks: Defect 1 pixels are shown in hot pink.

Defect 2 images and masks: Defect 2 pixels are shown in light green.

Defect 3 images and masks: Defect 3 pixels are shown in light blue.

Defect 4 images and masks: Defect 4 pixels are shown in yellow.

Multi-defect images and masks:

Images and masks with pixels of defect types 1, 2, and 3:

Images and masks with pixels of defect types 3 and 4:

8. Understanding the Unet Architecture

Unet Architecture

Image source: https://medium.com/coinmonks/learn-how-to-train-u-net-on-your-dataset-8e3f89fbd623

Unet is an encoder-decoder type of architecture.

Encoding stage:

In each encoding step, the spatial size of the feature map is halved and the number of channels is doubled.

Decoding stage:

Each decoder step receives an input matrix from the previous decoder step and a skip connection from the block at the same level of the encoder. These 2 matrices are concatenated along the last (channel) axis. If their height and width do not match, the larger matrix has to be trimmed so that it can be concatenated with the other one.

The number of channels is halved in each decoding step.
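The shape bookkeeping and the trim-then-concatenate step can be sketched in NumPy as below. The starting channel count of 64, the four encoder steps, and the helper names center_crop and concat_skip are assumptions for illustration. Trimming around the centre is one common choice and is assumed here as well.

```python
import numpy as np

def center_crop(feature, target_h, target_w):
    """Trim a (H, W, C) feature map around its centre to (target_h, target_w, C)."""
    h, w, _ = feature.shape
    top, left = (h - target_h) // 2, (w - target_w) // 2
    return feature[top:top + target_h, left:left + target_w, :]

def concat_skip(decoder_feature, encoder_feature):
    """Crop the larger map to the smaller one's size, then join on the last axis."""
    h = min(decoder_feature.shape[0], encoder_feature.shape[0])
    w = min(decoder_feature.shape[1], encoder_feature.shape[1])
    return np.concatenate(
        [center_crop(decoder_feature, h, w), center_crop(encoder_feature, h, w)],
        axis=-1,
    )

# Encoder: halve height and width, double the channels at every step.
h, w, c = 256, 1600, 64
encoder_shapes = [(h, w, c)]
for _ in range(4):
    h, w, c = h // 2, w // 2, c * 2
    encoder_shapes.append((h, w, c))
print(encoder_shapes)

# Decoder: a 4 x 6 decoder map meets a slightly larger 6 x 8 encoder map.
merged = concat_skip(np.zeros((4, 6, 8)), np.zeros((6, 8, 8)))
print(merged.shape)  # (4, 6, 16)
```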

Final output:

The final output has the same width and height as the input image, but its number of channels equals the number of classes. In each channel, the pixels belonging to that channel's class are marked with 1 and all other pixels with 0.

For more information, read the original Unet paper:
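To illustrate the output format, the sketch below turns per-pixel class scores into 0/1 channels. Applying a softmax followed by an argmax is an assumption about how raw network outputs would be converted; the random scores stand in for a real prediction.

```python
import numpy as np

def probs_to_onehot(probs):
    """Turn per-pixel class probabilities (H, W, C) into 0/1 channels."""
    num_classes = probs.shape[-1]
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.uint8)[np.argmax(probs, axis=-1)]

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 6, 5))  # stand-in scores for a 4 x 6 image, 5 classes
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax
onehot = probs_to_onehot(probs)
print(onehot.shape)  # (4, 6, 5)
```

Every pixel ends up with exactly one channel set to 1, which is the format described above.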

https://arxiv.org/abs/1505.04597v1

9. Model Explanation

EfficientNetB7 is used as the encoder with an input shape of 256 X 1600 X None. The decoder is built from a custom decoder block that doubles the spatial size with the help of Conv2DTranspose and also uses the skip connection from the matching encoder stage.
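To show why a transposed convolution doubles the spatial size, here is a single-channel NumPy illustration rather than the Keras layer itself. It assumes a kernel size of 2 and a stride of 2, which are not necessarily the settings of my decoder block. With those settings each input value fills its own 2 x 2 output patch, so the operation reduces to a Kronecker product.

```python
import numpy as np

def transposed_conv_2x(x, kernel):
    """Stride-2 transposed convolution of a single-channel map with a 2 x 2 kernel.

    Because the kernel size equals the stride, output patches do not overlap
    and the result is the Kronecker product of the input and the kernel.
    """
    assert kernel.shape == (2, 2)
    return np.kron(x, kernel)

def decoder_block(x, skip, kernel):
    """Upsample x to twice its size and stack it with the encoder map."""
    up = transposed_conv_2x(x, kernel)    # (H, W) -> (2H, 2W)
    return np.stack([up, skip], axis=-1)  # join the two maps on a channel axis

x = np.array([[1.0, 2.0], [3.0, 4.0]])
kernel = np.array([[1.0, 0.5], [0.5, 0.25]])  # arbitrary example weights
skip = np.zeros((4, 4))                       # stand-in for the encoder feature map
out = decoder_block(x, skip, kernel)
print(out.shape)  # (4, 4, 2)
```

In a real network the kernel weights are learned and each layer has many channels, but the doubling of height and width works the same way.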

10. Results and Deployments

On the test data, the model obtained an IoU score of 0.95.

Confusion Matrix

confusion, precision, and recall matrix for the images with no defects
confusion, precision, and recall matrix for the images with defect type 1
confusion, precision, and recall matrix for the images with defect type 2
confusion, precision, and recall matrix for the images with defect type 3
confusion, precision, and recall matrix for the images with defect type 4
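Matrices of this kind can be computed per pixel along the following lines. The sketch assumes class-number frames with values 0 to 4 for both the true and the predicted mask, and the toy frames are made up for illustration; it is not necessarily how the matrices above were produced.

```python
import numpy as np

def confusion_matrix(true_frame, pred_frame, num_classes=5):
    """Rows are true classes, columns are predicted classes, counted per pixel."""
    idx = true_frame.ravel() * num_classes + pred_frame.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def precision_recall(cm):
    """Per-class precision and recall; classes that never occur get 0."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # column sums: pixels predicted as each class
    actual = cm.sum(axis=1)     # row sums: pixels truly in each class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return precision, recall

true_frame = np.array([[0, 0, 3], [3, 3, 1]])
pred_frame = np.array([[0, 3, 3], [3, 0, 1]])
cm = confusion_matrix(true_frame, pred_frame)
precision, recall = precision_recall(cm)
print(cm)
print(precision, recall)
```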

Visualization of images, original masks, and predicted masks for the four defect types.

Images, original masks, and predicted masks for no-defect images

Observation: The image has no defect, and hence the predicted mask is completely black.

Images, original masks, and predicted masks for defect 1 images

Observation: The model predicts most of the defect 1 spots.

Images, original masks, and predicted masks for defect 2 images

Observation: The model is poor at detecting type 2 defects. One likely reason is that there is very little data for type 2 defects.

Images, original masks, and predicted masks for defect 3 images

Observation: The model is very good at detecting type 3 defects.

Images, original masks, and predicted masks for defect 4 images

Observation: The model's performance on defect 4 images is acceptable.

Deployment: The model is deployed on a local system, and Streamlit was used to build an application that demonstrates my work.

Model deployment in streamlit application

11. Conclusions and Future Work

Conclusion: The plots above show that the model's prediction performance ranks as defect 3 > defect 1 > defect 4 > defect 2, and this order matches the distribution of defect types in the given data.

Future Work:

  1. More data with type 2 defects can be added so that the model has enough examples to learn defect 2.
  2. Individual models can be trained, one per defect type, to detect the pixels of that type.

12. Profile

LinkedIn profile

www.linkedin.com/in/bhargav-kovi-01863411b

GitHub repository

13. References
