Detecting and sorting nuclear waste using deep convolutional neural networks

Orfeas Kypris
Published in Analytics Vidhya · 9 min read · Dec 18, 2020


This article is a walk-through of a pipeline that combines deep learning and classical computer vision techniques in order to pick out radioactive springs from a pool of waste material using a robot.

Detected springs in a pool of waste material

Introduction

Nuclear decommissioning, due to its hazardous nature, is rarely performed by humans and thus requires robotic automation. According to a review study, most robotic systems used for nuclear decommissioning employ virtually no autonomy, or even pre-programmed motion. This is due partly to the absence of cognition, which can be improved by introducing additional sensors, visual or otherwise, and partly to a high degree of customization, which reduces flexibility.

A popular and straightforward means of improving cognition is the inclusion of optical sensing coupled with an object detection algorithm. On that front, research efforts largely concentrate on classical computer vision algorithms, such as shape recognition methods for nuclear waste sorting using Hu moments.

Our work showcases a radioactive Magnox spring detection and instance segmentation pipeline, integrated within the Minotaur-R robotic system, which outputs the following parameters:

  • Spring positions
  • Spring occlusion level by debris
  • Spring orientation, to facilitate grasping by the robot

The final version of the system can pick and place springs with a high level of accuracy, which significantly pushes the boundary of robotic nuclear waste sorting. The system includes a number of features beyond spring detection, but they will not be detailed in this article; here we focus solely on the vision pipeline.

Have a look at the final system:

Tools used for spring detection

The backbone of our algorithm is the Mask R-CNN model, which is typically used for instance segmentation. Mask R-CNN is designed for pixel-to-pixel alignment between network inputs and outputs, returning a high-quality segmentation mask for each instance. This makes it possible to detect springs with high accuracy, even partially occluded ones, against the scrap background.
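To make the model's output concrete, here is a minimal inference sketch. The article does not name the exact implementation, so this assumes the widely used Matterport Mask R-CNN package, and the configuration values and weights file are placeholders:

import numpy as np
from mrcnn.config import Config
from mrcnn import model as modellib

class SpringInferenceConfig(Config):
    # Hypothetical inference configuration for a single "spring" class
    NAME = "spring"
    NUM_CLASSES = 1 + 1      # background + spring
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

model = modellib.MaskRCNN(mode="inference", config=SpringInferenceConfig(), model_dir="./logs")
model.load_weights("mask_rcnn_spring.h5", by_name=True)  # assumed trained weights file

frame = np.zeros((2048, 2048, 3), dtype=np.uint8)  # placeholder for a camera frame
result = model.detect([frame], verbose=0)[0]
# result["masks"]:  (H, W, N) boolean array, one mask per detected spring
# result["scores"]: (N,) detection probabilities, reused later for ranking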

Training dataset

The model was trained using Mask R-CNN’s suite of utility functions. The training process was repeated several times with various parametrizations in order to achieve the fastest convergence to the loss function minimum. The best result was achieved using ResNet50 as the backbone network, which is a smaller and faster version of ResNet101. The input image size was another parameter we had to determine; we settled on 2048 x 2048 pixels, as anything larger led to unacceptable processing times.
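In code, those choices might look roughly like the configuration below, again assuming the Matterport package; the epoch count, step count, and COCO starting weights are illustrative assumptions rather than the project's actual settings:

from mrcnn.config import Config
from mrcnn import model as modellib

class SpringConfig(Config):
    NAME = "spring"
    NUM_CLASSES = 1 + 1       # background + spring
    BACKBONE = "resnet50"     # smaller and faster than the resnet101 default
    IMAGE_MIN_DIM = 1024
    IMAGE_MAX_DIM = 2048      # larger inputs led to unacceptable processing times
    IMAGES_PER_GPU = 1
    STEPS_PER_EPOCH = 100

config = SpringConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])

# dataset_train / dataset_val stand for mrcnn.utils.Dataset subclasses
# wrapping the annotated spring images (loading code omitted).
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE, epochs=30, layers="heads")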

The heterogeneity of the data was increased to improve the generalization capability and to reduce model bias. The dataset that was used for training purposes consisted of 32 images containing 222 manually labeled springs in total, from various altitudes, orientations, and in different lighting conditions. These images were annotated carefully in order to avoid dataset contamination.

The dataset was split into training (80%), validation (10%), and testing (10%) subsets, and fed into two different backbone networks, ResNet50 and ResNet101, in order to evaluate their respective performances separately.
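One simple way to produce such a split (the team's exact procedure is not described, so this sketch just shuffles a hypothetical list of annotated image files):

import random

def split_dataset(image_files, seed=42):
    """Shuffle and split the annotated images into 80% train, 10% validation, 10% test."""
    files = list(image_files)
    random.Random(seed).shuffle(files)
    n_train, n_val = int(0.8 * len(files)), int(0.1 * len(files))
    return files[:n_train], files[n_train:n_train + n_val], files[n_train + n_val:]

train_files, val_files, test_files = split_dataset([f"spring_{i:02d}.png" for i in range(32)])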

The model successfully detected 220 springs (true positives), and failed to detect 2 springs (false negatives). In addition, the model did not detect any scrap as springs (zero false positives).

Post-processing

The nature of the problem requires the precise knowledge of 1) the spring’s orientation, so that the robot’s end effector knows exactly how to approach and grasp the spring, and 2) the level of occlusion by debris. The latter is important because the system needs to prioritize visible (and therefore graspable) springs over their non-visible, or semi-visible, counterparts. Let’s dive into these, one at a time.

  1. Calculating the spring orientation

The post-processing stage takes as input the predicted spring polygons and extracts each spring's (x, y) position, orientation φ, and occlusion level. To obtain the position of the i-th spring, we compute the mask centroid as follows:

# mask: (N, 2) array of the predicted spring polygon's (x, y) coordinates
mask_centroid = (np.mean(mask[:, 0]), np.mean(mask[:, 1]))

Next, the spring orientation φ is computed by taking the (x, y) coordinates and running them through a principal component analysis (PCA) algorithm. PCA is a dimensionality reduction method that can be used to obtain the direction along which a dataset (in our case, an x-y scatterplot) varies the most. Here it is used to calculate the orientation of the principal component of the spring shape, which is its length. It is worth noting that this method will fail when the detected spring is partially occluded, as the mask will not retain the original length-to-width ratio (approximately 3:1 in our case). The code below generates a scatterplot similar to the following image,

Generated x-y data

and uses PCA to obtain the eigenvector matrix of the covariance matrix of the centered data, along with the corresponding diagonal matrix of eigenvalues. From the eigenvector matrix we can obtain the angle of rotation needed to map the scattered data onto the x-axis, like so:

Generated x-y data rotated by angle φ

The following code can be used to generate an x-y dataset and compute the transformation needed to map the scattered data onto the x-axis, thus obtaining the angle φ with respect to the x-axis.

Let’s go through the functionality step by step. Function generate_sample_data generates points along a line f(x) = β₀x + β₁, where β₀ is the slope (equivalent to tan(φ)) and β₁ is an offset drawn from N(0, 1), i.e. normally distributed with a mean of 0 and a standard deviation of 1. The function returns an array of the (x, y) coordinates of the generated points.

Function load_image can be used to load a monochrome image that we would like to convert to an array of (x, y) coordinates, so that we can feed it into our core algorithm function, compute_data_rotation_angle. To facilitate processing, load_image makes an assumption about the dynamic range of the image: it is either 0 to 255 or 0 to 1. In the former case, anything above 127.5 (or 127 if integer division is used) is assigned a 1, and everything else a 0. This allows us to extract the set of (x, y) coordinates that we feed to compute_data_rotation_angle.

Function compute_data_rotation_angle is where the core of the method lies. It takes the (x, y) coordinates of the pixels extracted by load_image and outputs the angle (in degrees) of the spring.
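Since the original code embed is not reproduced here, below is a minimal reconstruction of the three functions from the description above. The function names come from the article; the implementation details (OpenCV for image loading, NumPy for the eigen-decomposition, the exact thresholding rule) are assumptions:

import cv2
import numpy as np

def generate_sample_data(phi_deg=30.0, n_points=500):
    """Generate points along f(x) = tan(phi) * x + b, with b drawn from N(0, 1)."""
    x = np.linspace(0.0, 10.0, n_points)
    y = np.tan(np.radians(phi_deg)) * x + np.random.normal(0.0, 1.0, n_points)
    return np.column_stack((x, y))

def load_image(path):
    """Load a monochrome image and return the (x, y) coordinates of its foreground pixels.

    Assumes the dynamic range is either [0, 255] or [0, 1]; pixels above half of the
    maximum value are treated as foreground (1), everything else as background (0).
    """
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE).astype(np.float64)
    ys, xs = np.nonzero(img > img.max() / 2.0)
    return np.column_stack((xs, ys))

def compute_data_rotation_angle(points):
    """Return the angle phi (degrees) between the data's principal axis and the x-axis."""
    centered = points - points.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)               # 2x2 covariance matrix
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)    # eigh: covariance is symmetric
    principal = eigenvectors[:, np.argmax(eigenvalues)]       # direction of largest variance
    # Orientation is defined modulo 180 degrees (the eigenvector sign is arbitrary)
    return np.degrees(np.arctan2(principal[1], principal[0])) % 180.0

print(compute_data_rotation_angle(generate_sample_data(phi_deg=30.0)))  # roughly 30; the noise shifts it slightly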

Calculating the spring orientation

2. Calculating the level of occlusion in debris

The robot needs to know which spring to prioritize in order to maximize the chance of grasping it. The question we need to ask ourselves here is, what is the most important criterion that determines grasping success? The answer is, occlusion level. How occluded is the spring by other objects? Will the end effector of the robot be able to approach it and isolate it, without grasping other objects in the process? How to solve this?

We already know that the spring has a particular shape: roughly rectangular, with a length-to-width ratio of approximately 3:1. That shape is very unlikely to change unless we melt the spring. Under typical circumstances the spring will not deform, so this is a piece of information we can use to further assess how occluded it is. If our DL model outputs a mask shaped like a birthday balloon, then one of two things is true: 1) the model has failed miserably, or 2) we have detected part of a spring. Option (2) cannot be easily discounted (yet). However, we can play it safe against (1) and discard all predictions with a detection probability below a certain threshold, say 0.85. What we are left with are almost certainly springs. Now we need to rank the springs in terms of their similarity to a spring template: the more similar a detected object is to what we have agreed a spring looks like, the more visible it must be.

One way to do that is by using image matching. First, we have to topologically analyze the image in order to produce a contour (or, a higher level summary of the image), then feed the result into a shape matching algorithm.

In our implementation, we decided to use the OpenCV library, which offers a suite of algorithms for contour detection and dissimilarity score computation. One of those contour detection algorithms is Suzuki’s border following algorithm, implemented in the findContours() function. A step-by-step description of this algorithm can be found here. After we obtain a contour, we feed it to the matchShapes() function, which compares two contours via their Hu moments. Image moments summarize shape properties such as the centroid and area; the seven Hu moments are combinations of these that are invariant to translation, scale, and rotation, which makes them well suited to shape matching. The resulting dissimilarity score serves as a metric of how visible or occluded a spring is.
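A sketch of this step, assuming OpenCV 4 (where findContours returns two values) and a binary template mask of an unoccluded spring; the exact flags and template are not specified in the article:

import cv2

def spring_dissimilarity(predicted_mask, template_mask):
    """Compare a predicted spring mask to a reference spring template using Hu moments.

    Both inputs are binary uint8 images (0 = background, 255 = foreground).
    Returns OpenCV's matchShapes score: lower means more similar to the template,
    i.e. a more visible (less occluded) spring.
    """
    contours, _ = cv2.findContours(predicted_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    template_contours, _ = cv2.findContours(template_mask, cv2.RETR_EXTERNAL,
                                            cv2.CHAIN_APPROX_SIMPLE)
    # Keep the largest contour in each image (assumed to be the spring itself)
    contour = max(contours, key=cv2.contourArea)
    template = max(template_contours, key=cv2.contourArea)
    return cv2.matchShapes(contour, template, cv2.CONTOURS_MATCH_I1, 0.0)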

Defining a “graspability” metric

Now that we have determined the angle of our spring and the occlusion level, we need to define a metric that the system will rank the springs against, to determine which one to pick first. We have two quantities at our disposal:

  1. Probability of detection
  2. Occlusion level

We need to combine these so that both are summarized effectively in a single metric. To combine them we can use a variant of the harmonic mean, which is expressed by

Spring occlusion metric

where p_i is the probability of detection produced by the DL model, and d_i is the dissimilarity metric produced by the matching algorithm. The figure below visualizes how this metric behaves. Notice that as we move down the rows of images, the springs get more occluded, which leads to an increasing occlusion metric. The smaller the number, the more a spring is visible. A higher number means that the spring is heavily occluded and thus cannot be easily grasped.
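To illustrate the ranking step in code, the sketch below combines the two quantities and sorts candidate springs. The exact harmonic-mean variant used in the project is the one shown in the formula above; the combine function here is only an assumed stand-in with the same qualitative behaviour (the score grows as the detection probability drops and as the dissimilarity grows):

def occlusion_metric(p, d):
    """Assumed stand-in for the article's harmonic-mean variant: combines detection
    probability p and dissimilarity d so that the score increases with occlusion
    (lower p, higher d)."""
    return 2.0 * d / (1.0 + p * d)   # equivalent to 2 / (1/d + p)

def rank_springs(detections):
    """Sort detections (dicts with 'score' and 'dissimilarity') by graspability,
    most visible and most confidently detected spring first."""
    return sorted(detections, key=lambda s: occlusion_metric(s["score"], s["dissimilarity"]))

springs = [
    {"id": 0, "score": 0.99, "dissimilarity": 0.05},   # fully visible spring
    {"id": 1, "score": 0.92, "dissimilarity": 0.40},   # partially occluded
    {"id": 2, "score": 0.87, "dissimilarity": 1.10},   # heavily occluded
]
print([s["id"] for s in rank_springs(springs)])  # [0, 1, 2]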

Overall Pipeline

After combining the two previous stages, we know a) the exact position (x, y) of each spring, b) the in-plane orientation of each spring (its rotation about the z-axis), and c) the level of occlusion of each spring. The following flowchart shows an overview of the process.

Spring detection pipeline
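Put together, the per-frame flow could be sketched as follows, reusing the hypothetical helpers from the earlier snippets (compute_data_rotation_angle, spring_dissimilarity, occlusion_metric) and the Matterport-style detection output; the actual Minotaur-R implementation naturally involves more bookkeeping than this:

import numpy as np

def process_frame(image, model, template_mask, min_confidence=0.85):
    """Detect springs in a frame and return them ranked by graspability.

    Each result carries the centroid (x, y), the in-plane angle phi in degrees,
    and the combined occlusion metric used for ranking.
    """
    result = model.detect([image], verbose=0)[0]            # Mask R-CNN inference
    candidates = []
    for i, score in enumerate(result["scores"]):
        if score < min_confidence:                          # discard low-confidence masks
            continue
        mask = result["masks"][:, :, i].astype(np.uint8) * 255
        ys, xs = np.nonzero(mask)
        points = np.column_stack((xs, ys))
        centroid = tuple(points.mean(axis=0))               # (x, y) position of the spring
        phi = compute_data_rotation_angle(points)           # orientation via PCA
        d = spring_dissimilarity(mask, template_mask)       # occlusion via Hu moments
        candidates.append({"centroid": centroid, "angle_deg": phi,
                           "metric": occlusion_metric(score, d)})
    # Smallest metric first: the most visible, most confidently detected spring
    return sorted(candidates, key=lambda c: c["metric"])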

Final words

Detection of particular types of objects in a nuclear decommissioning setting is a challenging task, both because of the number of potentially different classes of objects that need to be detected and isolated, and because the levels of radiation may damage equipment and thus impose additional constraints on the design of the overall system. Nevertheless, having a working pipeline for detecting and grasping one class of object that needs to be stored away safely is a promising start.

Acknowledgement

The parent robotics project Minotaur-R was funded by ESMERA, through the European Union's Horizon 2020 research and innovation programme under grant agreement No 780265, and developed by the robotics team at iKnowHow SA.

Team:

Michalis Logothetis — Robotic control and implementation

Nikos Valmantonis — Mechanical design

Orfeas Kypris — Vision pipeline design and implementation

Lefteris Gryparis — Vision pipeline training and tuning

Angeliki Pilalitou — Proposal development and project management

Makis Pachos — Project management

Orfeas Kypris

I am a software engineer and data scientist, with a passion for solving challenging problems of high societal and environmental impact.