Anomaly Detection in Images — AUTOENCODERS

Subham Sarkar
Jun 13

Introduction:

“Autoencoding” is a data compression algorithm where the compression and decompression functions are:

  1. data-specific: they can only compress data similar to what they were trained on.
  2. lossy: the decompressed outputs are degraded compared to the original inputs.
  3. learned automatically from examples rather than engineered by a human: it is easy to train specialised instances of the algorithm that perform well on a specific type of input. This requires no new engineering, just appropriate training data.

It is an unsupervised learning technique in which we leverage neural networks for the task of representation learning.

  • An Autoencoder consists of an Encoder network and a Decoder network. The encoder encodes the high-dimensional input into a lower-dimensional latent representation, also referred to as the bottleneck layer. The decoder takes this lower-dimensional latent representation and reconstructs the original input.
  • We can take an unlabelled dataset and frame it as a supervised learning problem tasked with outputting x̂, a reconstruction of the original input x. This network can be trained by minimising the reconstruction error, L(x, x̂), which measures the difference between the original input and its reconstruction.
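
To make the reconstruction error concrete, here is a minimal NumPy sketch that uses mean squared error as L(x, x̂). The choice of MSE and the function name `reconstruction_error` are illustrative assumptions, not something the article prescribes; later sections use an SSIM-based loss instead.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Per-sample mean squared error between inputs and reconstructions."""
    x = np.asarray(x, dtype=np.float64)
    x_hat = np.asarray(x_hat, dtype=np.float64)
    # Average the squared difference over every axis except the batch axis.
    axes = tuple(range(1, x.ndim))
    return np.mean((x - x_hat) ** 2, axis=axes)

x = np.array([[0.0, 1.0, 0.5, 0.5]])   # one sample with four "pixels"
perfect = x.copy()                     # an ideal reconstruction
blurry = np.full_like(x, 0.5)          # a reconstruction that lost all detail
print(reconstruction_error(x, perfect))  # [0.]
print(reconstruction_error(x, blurry))   # [0.125]
```

Training an autoencoder amounts to adjusting the encoder and decoder weights so that this number shrinks on the training data.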


The interesting practical applications of Autoencoders are:

  1. Data denoising.
  2. Dimensionality reduction for data visualisation: With appropriate dimensionality and sparsity constraints, Autoencoders can learn data projections that are more interesting than PCA or other basic techniques. Because neural networks are capable of learning nonlinear relationships, this can be thought of as a more powerful (nonlinear) generalisation of PCA. Whereas PCA attempts to discover a lower dimensional hyperplane which describes the original data, Autoencoders are capable of learning nonlinear manifolds.
  3. Image recognition, anomaly detection and semantic segmentation.
  4. Recommendation Engines.

Structural Similarity Index (SSIM) Loss Function:

  • SSIM is used as a metric to measure the similarity between two given images.
  • The Structural Similarity Index (SSIM) metric extracts 3 key features from an image: luminance, contrast and structure.
  • The SSIM between two given images is a value between -1 and +1. A value of +1 indicates that the two images are very similar or identical, while a value of -1 indicates that they are very different.
  • A common choice of SSIM loss is 1 − SSIM, so for similar images the SSIM loss will be smaller and for anomalous images it will be larger.
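
The sketch below shows how the three components combine into a single score. It is a simplified, single-window version of SSIM computed over the whole image, which is an assumption made for brevity: practical implementations slide a Gaussian window over the image and average the local scores. The function names `global_ssim` and `ssim_loss` are hypothetical, and the constants 0.01 and 0.03 are the conventional stabilising defaults.

```python
import numpy as np

def global_ssim(x, y, max_val=1.0):
    """Simplified SSIM computed once over the whole image (no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Small constants that stabilise the division when means or variances are near zero.
    c1 = (0.01 * max_val) ** 2
    c2 = (0.03 * max_val) ** 2
    mu_x, mu_y = x.mean(), y.mean()              # luminance
    var_x, var_y = x.var(), y.var()              # contrast
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()    # structure
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

def ssim_loss(x, y):
    """Turn the similarity into a loss: 0 for identical images, larger otherwise."""
    return 1.0 - global_ssim(x, y)

rng = np.random.default_rng(0)
image = rng.random((28, 28))
print(global_ssim(image, image))        # 1.0 for identical images (up to rounding)
print(global_ssim(image, 1.0 - image))  # negative for an inverted image
```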

Problem Statement:

  • Given a set of “n” Fashion-MNIST images that contains a small number, “m”, of images from the MNIST handwritten digit data, we need to filter out the anomalies without using any transfer learning techniques.

Let’s see how we approach this problem:

  • Importing required libraries
  • Loading the Fashion MNIST train and test datasets, normalising them and reshaping them.
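
These two steps might look like the sketch below. The helper names `preprocess` and `load_fashion_mnist` are hypothetical; `keras.datasets.fashion_mnist.load_data()` is the standard Keras loader and downloads the data on first use. The labels are discarded because the autoencoder is trained without them.

```python
import numpy as np

def preprocess(images):
    """Scale uint8 pixels to [0, 1] and add a trailing channel axis."""
    images = np.asarray(images).astype("float32") / 255.0
    return images.reshape((-1, 28, 28, 1))

def load_fashion_mnist():
    """Download Fashion-MNIST via Keras and return preprocessed train/test images."""
    # Imported lazily so that `preprocess` can be used without TensorFlow installed.
    from tensorflow import keras
    (x_train, _), (x_test, _) = keras.datasets.fashion_mnist.load_data()
    return preprocess(x_train), preprocess(x_test)

# Shape check on stand-in data with the same layout as the raw dataset.
dummy = np.random.randint(0, 256, size=(5, 28, 28), dtype=np.uint8)
print(preprocess(dummy).shape)  # (5, 28, 28, 1)
```

Calling `x_train, x_test = load_fashion_mnist()` then gives arrays ready to feed to a convolutional autoencoder.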


  • Building the ENCODER part: The encoder encodes the high-dimensional input into a lower-dimensional latent representation, also referred to as the bottleneck layer.
  • Building the DECODER part : The decoder will decompress the latent representation to recreate the input data.
  • The output layer uses a sigmoid activation function because it squashes the output into the range [0, 1], matching the normalised pixel values.
  • Defining the Structural Similarity Index (SSIM) loss function: For similar images, the SSIM loss will be smaller and for anomalous images, the SSIM loss will be larger.
  • Defining the Autoencoder: Optimiser: Adam, Loss: SSIM loss.
  • Let’s view the architecture of our Autoencoder neural network
  • Setting up TENSORBOARD as a callback for logging the loss metric, and TRAINING: the model is trained for 10 epochs with a batch size of 128.
  • Let’s view the train and test losses in TENSORBOARD: Training for more epochs may give better results.
  • Reconstruct the Fashion MNIST images for the test data and visualise : Pass the test dataset to the Autoencoder and predict the reconstructed data. Visualise the original and the reconstructed images.
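
The steps above can be sketched in Keras as follows. The layer sizes and the 7x7x16 bottleneck are assumptions chosen for illustration, since the article's exact architecture is not reproduced here; the helper names `ssim_loss`, `build_autoencoder` and `train` and the log directory are likewise hypothetical. The SSIM loss is written as 1 minus the batch mean of `tf.image.ssim`, which is one common way to define it.

```python
import tensorflow as tf

layers = tf.keras.layers

def ssim_loss(y_true, y_pred):
    """1 - mean SSIM over the batch: 0 for identical images, larger when they differ."""
    y_true = tf.cast(y_true, y_pred.dtype)
    return 1.0 - tf.reduce_mean(tf.image.ssim(y_true, y_pred, max_val=1.0))

def build_autoencoder(input_shape=(28, 28, 1)):
    inputs = tf.keras.Input(shape=input_shape)
    # Encoder: 28x28x1 -> 7x7x16 bottleneck (layer sizes are illustrative).
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D(2, padding="same")(x)
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    latent = layers.MaxPooling2D(2, padding="same")(x)
    # Decoder: 7x7x16 -> 28x28x1.
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(latent)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    # Sigmoid keeps every output pixel in [0, 1], matching the normalised inputs.
    outputs = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)
    model = tf.keras.Model(inputs, outputs, name="autoencoder")
    model.compile(optimizer="adam", loss=ssim_loss)
    return model

def train(model, x_train, x_test, log_dir="logs/autoencoder"):
    """Fit the model to reconstruct its own input, logging losses for TensorBoard."""
    tensorboard = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
    return model.fit(x_train, x_train, epochs=10, batch_size=128,
                     validation_data=(x_test, x_test), callbacks=[tensorboard])

autoencoder = build_autoencoder()
autoencoder.summary()  # prints the layer-by-layer architecture
```

With preprocessed arrays `x_train` and `x_test` in hand, `train(autoencoder, x_train, x_test)` runs the 10-epoch fit, `tensorboard --logdir logs` displays the loss curves, and `autoencoder.predict(x_test)` returns reconstructions that can be plotted next to the originals.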

Now our Autoencoder has been trained to reconstruct images from Fashion MNIST data.

  • Now, let’s introduce MNIST handwritten digit images, which our Autoencoder should treat as anomalies based on their SSIM loss.
  • Loading the MNIST Handwritten train and test data, normalising it and reshaping it.
  • Now, reconstruct both the Fashion-MNIST data and the MNIST Handwritten data using our AUTOENCODER, which was trained only on the Fashion-MNIST dataset, and check their SSIM loss.
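
One way to compare the two datasets is to compute a separate SSIM loss for every image rather than a single batch average. The sketch below assumes a trained Keras model and images already scaled to [0, 1] with shape (n, 28, 28, 1); the function name `per_image_ssim_loss` and the variable names in the usage comments are hypothetical.

```python
import numpy as np
import tensorflow as tf

def per_image_ssim_loss(model, images):
    """Return one SSIM loss (1 - SSIM) per image for a model's reconstructions."""
    images = np.asarray(images, dtype="float32")
    reconstructed = np.asarray(model.predict(images, verbose=0), dtype="float32")
    # tf.image.ssim returns one score per image in the batch.
    scores = tf.image.ssim(tf.constant(images), tf.constant(reconstructed), max_val=1.0)
    return (1.0 - scores).numpy()

# Usage sketch (assumes a trained `autoencoder` and preprocessed image arrays):
# fashion_losses = per_image_ssim_loss(autoencoder, x_test_fashion)
# mnist_losses = per_image_ssim_loss(autoencoder, x_test_mnist)
# print(fashion_losses.mean(), mnist_losses.mean())
```

MNIST can be loaded the same way as Fashion-MNIST, through `tf.keras.datasets.mnist.load_data()`, and passed through the same normalising and reshaping step.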


  • Comparing the two sets of losses shows that the SSIM loss is minimal for reconstructions of the dataset the Autoencoder was trained on (Fashion MNIST), whereas it is higher for the dataset it was not trained on (MNIST Handwritten).
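
To actually filter out the anomalies, the per-image losses need a cut-off. The article does not state one, so the sketch below makes an assumption: it flags any image whose loss exceeds a high percentile of the losses measured on normal (Fashion-MNIST) data. The function name `flag_anomalies`, the 99th percentile and the loss values are all made up for illustration and are not results from the experiment.

```python
import numpy as np

def flag_anomalies(losses, reference_losses, percentile=99.0):
    """Flag losses above a percentile of the losses seen on normal data."""
    threshold = np.percentile(reference_losses, percentile)
    return np.asarray(losses) > threshold, threshold

# Made-up loss values purely for illustration, not results from the article.
normal_losses = np.array([0.10, 0.12, 0.11, 0.09, 0.13])
mixed_losses = np.array([0.11, 0.45, 0.10, 0.52])
flags, threshold = flag_anomalies(mixed_losses, normal_losses)
print(flags)  # [False  True False  True]
```

The percentile trades off false alarms against missed anomalies, so it is worth tuning on a held-out set of known-normal images.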


  • Autoencoders work well for identifying anomalies. Because they learn to compress data based on attributes discovered during training (i.e. correlations between the input features), these models can typically reconstruct only data similar to the class of observations they saw during training.

Thanks for reading this blog. If you liked it please clap, follow and share.

Where can you find my code?

GitHub:


Analytics Vidhya