Introduction to Anomaly Detection in Images
I guess the first thought that crossed your mind when you read the title was: anomaly detection, what's the big deal? It's not that complex a topic to write an article about.
And maybe you are right 🙌. But maybe not? 😖
I somewhat agree: anomaly detection has lately been one of the most common yet complex problems to solve in data science, and there is an abundance of incredible resources on the internet to help. However, it becomes more complex, and more exciting, when you have to detect anomalies in images, because a humongous amount of detail/information is hidden in an image.
Any guesses on what’s the most essential part of an image?
Yes, it’s the pixels of an image 🌆.
Any image that you work on is a beautiful collection of pixels, where each pixel value represents a specific color or intensity. In the data science context, an image can also be represented as a feature vector: each pixel is treated as a feature, and its intensity, ranging from 0 to 255, is the feature value.
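As a minimal sketch of this idea, here is a tiny made-up 2 x 2 grayscale "image" flattened into a feature vector, one feature per pixel (the numbers are arbitrary, not from a real image):

```python
import numpy as np

# A tiny 2 x 2 grayscale "image": each entry is a pixel intensity (0-255)
img = np.array([[ 12, 200],
                [ 85,  34]], dtype=np.uint8)

# Flattening turns the pixel grid into a feature vector, one feature per pixel
feature_vector = img.flatten()
print(feature_vector)        # [ 12 200  85  34]
print(feature_vector.shape)  # (4,)
```

A real 256 x 256 grayscale image flattened this way would give a vector of 65,536 features, which is exactly why descriptors like HOG, which summarize pixels, are so useful.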
When we talk about features, they can broadly be categorized into Global Features and Local Features. Now, let’s have a glance at the below image to understand this better.
Note: Global features can generalize an entire object with a single vector. Local features, on the other hand, are computed at multiple points in the image.
Histogram of Oriented Gradients
Now, before diving into detail, let’s first understand what is an anomaly.
An anomaly is something that deviates from the normal. For images, it means an image that is distorted: its local features deviate from those of a normal image, or its global features are different.
Now, to detect an anomalous image, I have used the HOG feature descriptor, i.e. the histogram of oriented gradients. It is a computer vision technique, used as a feature descriptor, that counts the occurrences of gradient orientations in localized portions of an image. So, one thing we know for sure is that it captures the local features of an image.
Note: It is different from edge detection, as it also provides the direction of the edges in an image.
Let’s dive straight into the intuition behind it and understand its working.
One important thing to note is that the HOG of an image is calculated in patches, not on the complete image at once. Now, suppose we have an 8 x 8 patch of pixels from the image, as given below.
Note: The above pixel matrix is not from a real image but a random one.
Now, the HOG has 4 main components. Let’s understand and calculate them one by one to get a good understanding of the math behind HOG.
X-Gradient
The gradients are calculated for each pixel in the image. For now, let’s calculate the gradient for the highlighted pixel. To get the x-gradient for the highlighted pixel, we need to subtract the pixel on the left of the highlighted pixel from the pixel that is to the right.
So, the X-Gradient will be equal to 89 − 78 = 11 (Gx)
Y-Gradient
Just as we calculated the X-Gradient previously, we need to calculate the Y-Gradient for the highlighted pixel. To calculate this, we need to subtract the pixel below the highlighted pixel from the one above it.
So, the Y-Gradient will be equal to 68 − 56 = 8 (Gy)
Magnitude and Orientation of Gradients
Now, to calculate the magnitude and orientation of the gradients, we will use the Pythagorean theorem.
Here, the hypotenuse is the Magnitude and the angle θ is the orientation of the gradients. The calculation for the magnitude is done below.
Total Gradient Magnitude = √[(Gx)² + (Gy)²]
Total Gradient Magnitude = √[(11)² + (8)²] ≈ 13.6
Now, let us calculate the orientation of the gradients. Time for a quick flashback: we are all very familiar with the formula below. (Only if trigonometry was your favorite.)
tan(θ) = Gy / Gx
After some simple rearranging, we get the equation below.
θ = atan(Gy / Gx)
We will use this formula to calculate the orientation. Given the values of Gx and Gy, which are 11 and 8 respectively, the orientation comes out to be approximately 36 degrees.
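Both values can be checked with Python's `math` module (using `atan2`, which handles the Gx = 0 case that a plain `atan` division would not):

```python
import math

gx, gy = 11, 8  # gradients from the example above

magnitude = math.sqrt(gx**2 + gy**2)            # sqrt(185) ≈ 13.6
orientation = math.degrees(math.atan2(gy, gx))  # ≈ 36 degrees

print(round(magnitude, 1), round(orientation))  # 13.6 36
```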
This computation happens for all the pixels in the matrix. Also, for the corner pixels, the same calculation is done after applying padding.
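A small sketch of that full-matrix computation, assuming zero padding and the same subtraction convention as above (the 3 x 3 matrix is an illustrative stand-in, not a real image):

```python
import numpy as np

# A small illustrative pixel matrix (random numbers, not a real image)
patch = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])

# Zero-pad by one pixel so edge and corner pixels also have two neighbours
padded = np.pad(patch, 1, mode='constant')

# Gradients for every pixel at once: right minus left, above minus below
gx = padded[1:-1, 2:] - padded[1:-1, :-2]
gy = padded[:-2, 1:-1] - padded[2:, 1:-1]

print(gx[1, 1], gy[1, 1])  # gradients of the centre pixel: 20 -60
```

The slicing trick computes every pixel's gradient in one vectorized step instead of looping, which is also roughly how library implementations do it.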
Now that we are done with these loooonnngggg calculations, let's jump to creating a histogram matrix using the orientations and magnitudes, and then we are good to go to use it as input to our anomaly detection model.
HOG — Histogram Of Oriented Gradients
Now, you might feel that HOG i.e. histogram of oriented gradients nowhere mentions the magnitude of the gradients which we calculated earlier. So, why did we even calculate the magnitude in the first place?
That is a very good and legit question. The answer lies in how we finally arrive at the histogram, so let's jump straight to that.
Note: The gradient direction in HOG is represented as an angle that ranges from 0 to 180 degrees. Using 180 degrees as the maximum means the gradients are "unsigned": a direction and its exact opposite (say 30° and 210°) are treated as the same orientation, so 180 degrees is enough to cover them all.
Now, we create bins of 20 degrees each (9 bins in total for 180 degrees) and build a histogram by assigning each pixel's orientation to a bin. From our example, we have 13.6 as the magnitude value and 36 as the orientation value.
Looking at the above image, you might have understood the use of the magnitude of the gradients. The magnitude is what gets added to the histogram, split proportionally between the two bins on either side of the orientation. This is how we finally arrive at the histogram of oriented gradients.
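The proportional split can be sketched in a few lines, assuming bins anchored at 0, 20, ..., 160 degrees as described above (our orientation of 36 lies between the 20° and 40° anchors):

```python
# Splitting one gradient vote between the two nearest orientation bins.
# Bins are 20 degrees wide, anchored at 0, 20, ..., 160 (9 bins covering 0-180).
magnitude, orientation = 13.6, 36

bin_width, n_bins = 20, 9
lower_bin = orientation // bin_width   # index 1, the 20-degree bin
upper_bin = (lower_bin + 1) % n_bins   # index 2, the 40-degree bin

# The closer the orientation lies to a bin's anchor, the larger that bin's share
upper_share = (orientation - lower_bin * bin_width) / bin_width  # 0.8
lower_share = 1 - upper_share                                    # 0.2

histogram = [0.0] * n_bins
histogram[lower_bin] += magnitude * lower_share  # 2.72 goes to the 20-degree bin
histogram[upper_bin] += magnitude * upper_share  # 10.88 goes to the 40-degree bin

print([round(v, 2) for v in histogram])
# [0.0, 2.72, 10.88, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Doing this for every pixel in the patch and summing the votes gives the patch's 1 x 9 histogram.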
But…. Wait Wait Wait!!! We only calculated this for a patch of image right? How do we calculate it for the complete image? Also, what will be the shape of the histogram matrix? Cool, let’s answer that.
Let’s take 2 main assumptions.
- Our original image is of size 256 x 256 (Original paper recommends 64 x 128)
- The patch is of size 8 x 8 (as mentioned earlier)
One step that we missed is the normalization of the gradients, which is done to reduce the effect of lighting variation. For this, we group multiple 8 x 8 patches together and normalize them as a block. Since an 8 x 8 patch gives us a histogram of shape 1 x 9, a 16 x 16 block (2 x 2 patches) gives us a concatenated histogram of shape 1 x 36.
Now, to answer the question of what the shape of the HOG features will be: let's say we move the 16 x 16 block across the image with a stride of one cell (8 pixels). This requires 31 block positions across the length of the image and 31 across the width. Since each block generates a histogram of size 1 x 36, the total number of features will be 31 x 31 x 36, i.e. 34596.
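That arithmetic can be checked in a few lines (the sizes are the assumptions stated above: 256 x 256 image, 8 x 8 cells, 2 x 2 cells per block, 9 bins):

```python
# Feature-count arithmetic for HOG (sizes assumed from the text above)
image_side, cell, cells_per_block, bins = 256, 8, 2, 9

cells_per_side = image_side // cell                       # 32 cells per side
blocks_per_side = cells_per_side - (cells_per_block - 1)  # 31 block positions
features = blocks_per_side ** 2 * cells_per_block ** 2 * bins

print(features)  # 34596
```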
Note: You can reduce the number of features by either resizing the image or by increasing the patch size.
Anomaly Detection
Now, once we get the HOG features from an image, we are ready to put them in the model to get the anomaly image. For this, I use the human faces data as the normal images and some random non-human images as the negative sample images.
Also, for anomaly detection, I use the one-class classification SVM to get the anomalous images. In short, it predicts the output as 1 for a normal image and -1 for an anomaly image. Let’s first create a list of HOG features for the training images i.e. all the human faces.
I have also used Haralick textural features for better anomaly detection.
import os
import numpy as np
import mahotas as mt
import matplotlib.pyplot as plt
from skimage import img_as_ubyte
from skimage.io import imread
from skimage.transform import resize
from skimage.feature import hog
from skimage.color import rgb2gray

hog_desc = []
resize_shape = (224, 224)

def extract_features(image):
    """
    Extract Haralick textural features of the image
    """
    # Haralick features expect a 2-D integer image, so convert RGB to grayscale
    gray = img_as_ubyte(rgb2gray(image)) if image.ndim == 3 else image
    # calculate haralick texture features for 4 types of adjacency
    textures = mt.features.haralick(gray)
    # take the mean of it and return it
    ht_mean = textures.mean(axis=0)
    return ht_mean

# Extract the HOG and Haralick features of every training image
for imgs in os.listdir('./human'):
    try:
        if imgs.split('.')[-1] == 'jpg':
            img = imread('./human/' + imgs)
            resized_img = resize(img, resize_shape)
            # cells of 8 x 8 pixels with cells_per_block of (2, 2) give a block of 16 x 16
            # (on scikit-image < 0.19, use multichannel=True instead of channel_axis)
            fd, hog_image = hog(resized_img, orientations=9, pixels_per_cell=(8, 8),
                                cells_per_block=(2, 2), visualize=True, channel_axis=-1)
            # Haralick
            feat = extract_features(img)
            # concat all
            all_feat = np.concatenate((fd, feat))
            hog_desc.append(all_feat)
            fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 8), sharex=True, sharey=True)
            ax1.imshow(resized_img, cmap=plt.cm.gray)
            ax1.set_title('Input image')
            ax2.imshow(hog_image, cmap=plt.cm.gray)
            ax2.set_title('Histogram of Oriented Gradients')
            plt.show()
    except Exception as e:
        print(e)
The above code block generates the HOG features and Haralick textural features of an image, concatenates them, and stores the result as an array in a list.
Once you have the list of oriented gradients for all the training images, use the below code snippet to train your model.
#Training one-class SVM model
from sklearn.svm import OneClassSVM
clf = OneClassSVM(kernel="poly", gamma=0.1, nu=0.1, max_iter=500, verbose=True)
clf.fit(hog_desc)
Once your model is trained, you can input any of the images in the below function to get the output as anomaly or non-anomaly.
def predict_anomaly(image_path, resize_shape):
    """
    Predicts if an image is an anomaly or not
    """
    # read the image
    img = imread(image_path)
    resized_img = resize(img, resize_shape)
    # same HOG settings as in training (multichannel=True on scikit-image < 0.19)
    fd, _ = hog(resized_img, orientations=9, pixels_per_cell=(8, 8),
                cells_per_block=(2, 2), visualize=True, channel_axis=-1)
    # Haralick features
    feat = extract_features(img)
    # concat all
    fdd = np.concatenate((fd, feat))
    # prediction
    if clf.predict(fdd.reshape(1, -1)) == -1:
        result = "This is an anomaly Image"
    else:
        result = "This is not an anomaly Image"
    plt.imshow(resized_img)
    plt.title(result)
    plt.show()
    return result
Let me display some results on normal images from the above experiment.
Now, we have seen that it decently predicts non-anomaly images. What if we give an anomaly image to test the prediction? Let’s display those results as well.
As we can see, the HOG feature descriptor works decently when it comes to detecting anomalies. However, there is plenty of scope for improvement, for instance by concatenating some global features to the feature array.
I hope that you enjoyed this complete experiment in anomaly detection on images. I will update you when I have a better or improved version of this model.
Till then,
Happy reading !!! ;)