Good or Bad Pill?

Computer Vision for the Pharmaceutical Industry

Published in Intel Analytics Software · 5 min read · Oct 17, 2022

This tutorial shows how to use the Visual Quality Inspection AI Reference Kit to build computer vision solutions. The reference kit provides links to various datasets that illustrate the concept of visually inspecting damaged products in the manufacturing process. The same approach works for other kinds of manufacturing defects, but we will focus on pill quality in this tutorial. In this dataset, consumer over-the-counter medical supplements are classified into good or bad categories. We will use this dataset to transfer learn a pretrained VGG-16 model and create an automated pill quality control tool.

VGG-16 is a convolutional neural network with 16 weight layers (13 convolutional and 3 fully connected). It was one of the best-performing architectures in the ILSVRC 2014 challenge. It was the runner-up in the classification task with a top-5 classification error of 7.32% (only behind GoogLeNet with a classification error of 6.66%). It was also the winner of the localization task with a 25.32% localization error.

However, VGG-16 is very slow to train from scratch. Its ImageNet-trained weights take up 528 MB, which costs considerable disk space and bandwidth. Its 138 million parameters also make full training computationally expensive and harder to keep stable (for example, because of exploding gradients). Therefore, instead of training from scratch, we leverage the Intel Extension for PyTorch (IPEX) running on an AWS EC2 m6i.4xlarge instance (3rd Generation 2.9 GHz Intel Xeon Platinum 8375C processors) to transfer learn the pretrained VGGNet classification architecture on the pill dataset.
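To see where the 138 million parameters and the 528 MB figure come from, here is a small back-of-the-envelope sketch. It assumes the standard VGG-16 layout (3x3 convolutions, three fully connected layers, 1000 ImageNet classes) and 4 bytes per float32 parameter; the constant and function names are ours, chosen for illustration.

```python
# Count VGG-16 parameters from its layer configuration
# (13 convolutional layers followed by 3 fully connected layers).

# Output channels of each 3x3 conv layer, in order; the input is a 3-channel image.
CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]
# (in_features, out_features) of the fully connected layers.
# 512 * 7 * 7 is the flattened feature map after the last pooling layer (224x224 input).
FC_LAYERS = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]


def count_vgg16_parameters() -> int:
    total = 0
    in_channels = 3
    for out_channels in CONV_CHANNELS:
        # 3x3 kernel weights plus one bias per output channel
        total += 3 * 3 * in_channels * out_channels + out_channels
        in_channels = out_channels
    for in_features, out_features in FC_LAYERS:
        # weight matrix plus one bias per output unit
        total += in_features * out_features + out_features
    return total


n_params = count_vgg16_parameters()
size_mib = n_params * 4 / 2**20  # 4 bytes per float32 parameter
print(f"{n_params:,} parameters, about {size_mib:.0f} MiB as float32")
# prints: 138,357,544 parameters, about 528 MiB as float32
```

Almost 90% of those parameters sit in the three fully connected layers, which is one reason replacing the original classifier with a small custom head is attractive for transfer learning.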

IPEX contains optimizations to boost PyTorch performance on Intel hardware (Figure 1). It includes a Python API that allows users to take advantage of these optimizations by just modifying 2–3 lines of code. Most of the optimizations in IPEX will eventually be included in stock PyTorch releases.
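As a rough illustration of how small the code change is, the sketch below applies IPEX to a model for inference. The tiny network is a hypothetical stand-in, not the model from this tutorial, and the import is wrapped in a try/except so that the script still runs with stock PyTorch when IPEX is not installed.

```python
import torch
import torch.nn as nn

try:
    import intel_extension_for_pytorch as ipex  # added line 1
except ImportError:
    ipex = None  # fall back to stock PyTorch

# Hypothetical stand-in model; any torch.nn.Module is handled the same way.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)
model.eval()  # without an optimizer, ipex.optimize is meant for inference mode

if ipex is not None:
    model = ipex.optimize(model)  # added line 2

with torch.no_grad():
    scores = model(torch.rand(1, 3, 64, 64))
print(scores.shape)
```

The rest of the inference code stays the same as it would be with stock PyTorch.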

Figure 1. Intel Extension for PyTorch speedup in prediction time compared to stock PyTorch. This comparison was done with v1.8.0 but the latest version at the time of publication is v1.12.0.

3rd Generation Xeon processors natively support low-precision BFloat16 with Intel Advanced Vector Extensions (AVX-512), and future generations will support mixed-precision with Intel Advanced Matrix Extensions (AMX). With AMX, you’ll be able to train with half-precision while maintaining the network accuracy achieved with single-precision.
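To make the low-precision idea concrete, here is a minimal sketch that runs a forward pass under BFloat16 autocast on the CPU using stock PyTorch's `torch.autocast`. This is an assumption about how one might try BFloat16, not necessarily what the reference kit does, and the small network is a hypothetical stand-in. On hardware without native BFloat16 support the code still runs, just without the speedup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 2),
).eval()
x = torch.rand(4, 3, 16, 16)

# Convolution and linear layers run in BFloat16 inside the autocast region
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out_bf16 = model(x)

# Reference forward pass in single precision
with torch.no_grad():
    out_fp32 = model(x)

max_abs_diff = (out_bf16.float() - out_fp32).abs().max().item()
print(out_bf16.dtype, out_fp32.dtype, f"max abs difference: {max_abs_diff:.4f}")
```

Comparing the two outputs is a quick sanity check that the reduced precision does not change the predictions in a meaningful way for your model.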

IPEX transparently supports the fusion of frequently used operator patterns, like Conv2D+ReLU, Linear+ReLU, etc. to optimize performance even further with TorchScript. It also optimizes operators and implements several customized operators. A few ATen operators are replaced by their optimized counterparts in IPEX via the ATen registration mechanism. Moreover, some customized operators are implemented for several popular topologies. For instance, ROIAlign and NMS are defined in Mask R-CNN. IPEX also optimizes these customized operators to improve the performance of these topologies.
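Operator fusion happens at the TorchScript graph level, so the model first has to be converted to TorchScript. The sketch below shows that conversion with stock PyTorch (`torch.jit.trace` followed by `torch.jit.freeze`) on a hypothetical Conv2D+ReLU and Linear stand-in model. Which patterns actually get fused depends on the PyTorch and IPEX versions installed, so treat this as a starting point rather than a guarantee of any specific fusion.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model containing a Conv2D+ReLU pattern
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 32 * 32, 2),
).eval()
example = torch.rand(1, 3, 32, 32)

with torch.no_grad():
    traced = torch.jit.trace(model, example)  # record the graph for this input shape
    frozen = torch.jit.freeze(traced)         # inline weights so graph passes can fuse ops
    eager_out = model(example)
    script_out = frozen(example)

# The TorchScript module should reproduce the eager-mode result
outputs_match = torch.allclose(eager_out, script_out, atol=1e-5)
print("outputs match:", outputs_match)
```

When IPEX is used, the model is typically passed through `ipex.optimize` before tracing so that its graph optimizations can apply to the frozen module.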

Exploratory Data Analysis

Quality control is a critical component of the pharmaceutical industry. Let’s look at the different aspects of acceptable (Figure 2a) and defective (Figure 2b) pills. Some of the defects to look for are discoloration, contamination, cracks, faulty imprints, the wrong pill type, and scratches.

Figure 2a. Examples of good/acceptable pills

Figure 2b. Examples of bad/defective pills

A difference analysis of average acceptable and defective pills indicates a defect in the upper right of the pills and what appears to be discoloration on the right side of the pill (Figure 3). This could indicate an issue with the manufacturing process that damages the right side of some pills.

Figure 3. Result of taking the difference between the average acceptable and average defective pills
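A difference analysis like this takes only a few lines of NumPy. The sketch below uses synthetic grayscale images as a stand-in for the pill photos (the array shapes, the random data, and the artificial bright patch in the upper right are all made up for illustration), averages each class pixel by pixel, and looks at where the two averages differ the most.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 20 grayscale 32x32 images per class, values near 0.5
good_images = rng.uniform(0.4, 0.6, size=(20, 32, 32))
bad_images = rng.uniform(0.4, 0.6, size=(20, 32, 32))
# Simulate a defect that tends to appear in the upper right of defective pills
bad_images[:, 4:12, 20:28] += 0.3

# Pixel-wise average image per class, then the absolute difference
avg_good = good_images.mean(axis=0)
avg_bad = bad_images.mean(axis=0)
difference = np.abs(avg_good - avg_bad)

# Location of the largest difference (row 0 is the top of the image)
row, col = np.unravel_index(np.argmax(difference), difference.shape)
print(f"largest difference at row {row}, column {col}")
```

With real data, you would stack the loaded images of each class into the two arrays and plot `difference` as an image to get a picture similar to Figure 3.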

Our Custom VGG-16 Model Definition

We will define a custom multiclass classification model with a VGG-16 feature extractor, pre-trained on ImageNet, and a custom classification head. Parameters for the first convolutional blocks are frozen to allow for transfer learning. The model returns class scores in training mode; in evaluation mode, it returns class probabilities and a normalized feature map.

If this is your first time seeing a PyTorch model training schema, you’ll notice some boilerplate code needs to be written. PyTorch models typically require the following components:

Data Preparation: Extend the PyTorch Dataset class for data loading and customization. This is useful for extracting images from folders, their respective labels, and other necessary metadata.
Data Loading: PyTorch also provides the DataLoader class to help navigate your Dataset class during the training and evaluation of your model. A DataLoader is an iterable that serves batches of data to your model during training and inference.
Model Definition: Defining a PyTorch model involves defining a class that extends the Module class. The constructor (__init__) is responsible for determining the layers of the model, and the forward() function defines how to forward propagate the input data through the defined layers of the model.
Training Function: You will be required to define a loss function and an optimization algorithm for the training component.
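The first two components can be sketched as follows. `PillDataset` is a hypothetical in-memory dataset built from random tensors; a real version would read image files and labels from disk inside `__getitem__`.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class PillDataset(Dataset):
    """Hypothetical dataset holding images and labels (0 = good, 1 = bad) in memory."""

    def __init__(self, images, labels):
        self.images = images
        self.labels = labels

    def __len__(self):
        # Number of samples; the DataLoader uses this to build batches
        return len(self.labels)

    def __getitem__(self, idx):
        # Return one (image, label) pair
        return self.images[idx], self.labels[idx]


# Random stand-in data: 10 RGB images of 32x32 pixels
images = torch.rand(10, 3, 32, 32)
labels = torch.randint(0, 2, (10,))
dataset = PillDataset(images, labels)

# The DataLoader shuffles the samples and groups them into batches
loader = DataLoader(dataset, batch_size=4, shuffle=True)
batch_sizes = []
for batch_images, batch_labels in loader:
    batch_sizes.append(batch_images.shape[0])
    print(batch_images.shape, batch_labels.shape)
```

With 10 samples and a batch size of 4, the loader yields two full batches and one final batch of 2 samples.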

The code snippet below represents this example’s model definition (Figure 4). The constructor defines the VGG-16 model we are extending and the custom layers we add to the existing architecture. The “_freeze_params” method freezes the shallow layers of the VGG-16 model by setting requires_grad to False, which prevents their weights from being updated during transfer learning. The forward() method applies the weights and moves data through the model.

Figure 4. Custom VGG-16 model class code snippet

Transfer Learning with Intel Extension for PyTorch

As previously discussed, IPEX is used to transfer learn the pretrained VGGNet classification model on the pill dataset. Line 11 in the code snippet below shows the call to the IPEX optimize method that applies various optimizations to our model (Figure 5). See the official documentation to learn more about applying IPEX to models.

Figure 5. Model training code snippet
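If you want to try the training-side optimization yourself, the following is a minimal sketch of a training loop with the IPEX call added. It uses a hypothetical small network and random data instead of the tutorial's model and pill images, and the loss function, optimizer, and hyperparameters are placeholders. The import is optional so that the loop also runs with stock PyTorch.

```python
import torch
import torch.nn as nn

try:
    import intel_extension_for_pytorch as ipex
except ImportError:
    ipex = None  # fall back to stock PyTorch

torch.manual_seed(0)

# Hypothetical stand-in model and data
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)
images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 2, (8,))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

model.train()
if ipex is not None:
    # For training, the model and the optimizer are optimized together
    model, optimizer = ipex.optimize(model, optimizer=optimizer)

losses = []
for epoch in range(5):
    optimizer.zero_grad()                    # reset gradients from the previous step
    loss = criterion(model(images), labels)  # forward pass and loss
    loss.backward()                          # backward pass
    optimizer.step()                         # weight update
    losses.append(loss.item())
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```

Note that the model is put into training mode before the call and that the returned model and optimizer replace the originals for the rest of the loop.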

Let’s evaluate the architecture of our trained model and see the total trained parameters (Figure 6). When a model is optimized using IPEX, the prefix “_IPEX” is added to the names of the affected layers, which is a good way to verify that IPEX has been applied.

from torchsummary import summary
# Print a layer-by-layer summary for an input of 3 channels and 224x224 pixels
summary(trained_model, (3, 224, 224))

Figure 6. Architecture of the IPEX model
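The same two checks can also be done by hand without torchsummary. The sketch below counts total and trainable parameters and lists the layer class names so you can look for the “_IPEX” prefix. It uses a hypothetical small model with its first layer frozen; with stock PyTorch (no IPEX applied) the prefix check simply reports False.

```python
import torch.nn as nn

# Hypothetical stand-in model with its first layer frozen
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)
for param in model[0].parameters():
    param.requires_grad = False

# Parameters with requires_grad=True are the ones updated during training
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{total_params} total, {trainable_params} trainable")
# prints: 242 total, 18 trainable

# Class names of the layers; IPEX-optimized layers carry the "_IPEX" prefix
layer_names = [type(layer).__name__ for layer in model.children()]
has_ipex_layers = any(name.startswith("_IPEX") for name in layer_names)
print(layer_names, "IPEX layers found:", has_ipex_layers)
```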

Predictions on the hold-out test data show that our model classifies two of the three pictures shown as “bad” pills (Figure 7). Bounding boxes and heatmaps highlight the most prominent defects found on the pill. As expected, we see problems with the “FF” imprint, chipping, and discoloration.

Figure 7. Predictions on three test images from the pill dataset. Red bounding boxes surround defects. A heatmap is overlaid and shows concentrations of critical defects.
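One simple way to turn a normalized heatmap into a bounding box is to threshold it and take the extent of the pixels above the threshold. The sketch below shows that idea with NumPy. The function name and the 0.8 threshold are our own assumptions for illustration and may differ from how the reference kit draws its boxes.

```python
import numpy as np


def heatmap_to_bbox(heatmap, threshold=0.8):
    """Return (x_min, y_min, x_max, y_max) around values >= threshold, or None."""
    mask = heatmap >= threshold
    if not mask.any():
        return None  # nothing exceeds the threshold, so there is no box to draw
    rows = np.where(mask.any(axis=1))[0]  # rows containing at least one hot pixel
    cols = np.where(mask.any(axis=0))[0]  # columns containing at least one hot pixel
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])


# Synthetic 8x8 heatmap with one hot region
heatmap = np.zeros((8, 8))
heatmap[2:5, 3:7] = 0.9
print(heatmap_to_bbox(heatmap))  # prints (3, 2, 6, 4)
```

A lower threshold gives a larger box that covers more of the weakly highlighted area, so the value is worth tuning against a few known defects.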

Conclusions

With just a few additions to our code, we optimized a custom VGG-16 binary classifier on Intel hardware. Efficient transfer learning lets us take a pretrained PyTorch VGG-16 model, fine-tune it with images from our pill dataset, and turn it into an efficient classification tool for pharmaceutical quality control.