Autoencoder for Anomaly Detection

Rick Qiu
3 min read · May 4, 2020

An autoencoder (AE) is a type of artificial neural network (ANN) used to learn patterns in data. A typical AE has two parts: an encoder and a decoder. The encoder encodes/compresses an input into latent-space variables, and the decoder decodes/reconstructs the input from those latent variables. A hidden layer called the "bottleneck" connects the encoder and the decoder. The latent variables, which represent the encoded data in the latent space, are the outputs of the activation functions in the bottleneck layer (see Figure 1).

Figure 1: Autoencoder diagram (courtesy of the compthree blog)

What is the use of an AE that predicts its own input?

The AE learns an approximation to the identity function, so the predicted output is similar to the input. By placing constraints on the network, e.g., limiting the number of hidden units, we can discover interesting structure in the data (Ng, 2011).

In an unsupervised learning setting, we don't know the target labels of the dataset, but we do know that it contains a small number of outliers/anomalies. A well-trained AE learns the regularities of the normal data. Hence, the AE will produce low reconstruction errors for normal examples and high errors for anomalous examples (Agmon, 2020).

Modelling

It is time to write the code for the autoencoder model. The following code block defines a five-layer ANN. It takes data points with 8 features as input and outputs data points with the same number of features.
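A minimal Keras sketch of such a model is given below. The 8-feature input and the 2-D bottleneck follow the text; the 4-unit hidden layers and the activation choices are assumptions.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

n_features = 8  # number of features per data point

# Encoder: compress the 8 features down to 2 latent variables.
inputs = Input(shape=(n_features,))
encoded = Dense(4, activation='relu')(inputs)                         # hidden size assumed
bottleneck = Dense(2, activation='relu', name='bottleneck')(encoded)  # latent space

# Decoder: reconstruct the 8 features from the latent variables.
decoded = Dense(4, activation='relu')(bottleneck)                     # hidden size assumed
outputs = Dense(n_features, activation='sigmoid')(decoded)            # assumes inputs scaled to [0, 1]

autoencoder = Model(inputs, outputs)
```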

Table 1 shows a summary of the AE model.

Table 1: Summary of the AE model

Model Training

Training the model is an unsupervised learning process. We set epochs=200, batch_size=128, and shuffle=True (shuffle the data at the beginning of each epoch). We select the Adam optimizer to minimize the mean squared error.
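In Keras terms, that training setup looks roughly like this (x_train is an assumed name for the matrix of training examples):

```python
# Unsupervised training: the input also serves as the target.
autoencoder.compile(optimizer='adam', loss='mean_squared_error')

autoencoder.fit(
    x_train, x_train,  # assumed name: (n_samples, 8) matrix of examples
    epochs=200,
    batch_size=128,
    shuffle=True,      # reshuffle the data at the start of each epoch
)
```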

Prediction

After the model has been trained on the training dataset with 8 features, we first select a threshold based on the reconstruction error, i.e., the mean squared error (MSE), and the known anomaly ratio of 10/50,000.
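One way to derive such a cut-off is to take the (1 - anomaly ratio) quantile of the reconstruction errors; a sketch, assuming x_train holds the 50,000 examples:

```python
import numpy as np

# Reconstruction error (MSE) for every example.
reconstructions = autoencoder.predict(x_train)
mse = np.mean(np.square(x_train - reconstructions), axis=1)

# Choose the threshold so that roughly 10 of the 50,000 points exceed it.
anomaly_ratio = 10 / 50000
threshold = np.quantile(mse, 1 - anomaly_ratio)
print('MSE threshold:', threshold)
```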

MSE threshold: 0.0911897969373238

This calculated reconstruction MSE threshold can be used as a cut-off for predicting anomalies: data points with an MSE above the cut-off are flagged as anomalous.
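A sketch of that prediction step (x_test and raw_records, the original un-encoded rows, are assumed names):

```python
# Score each test point and flag those whose error exceeds the cut-off.
test_recon = autoencoder.predict(x_test)
test_mse = np.mean(np.square(x_test - test_recon), axis=1)

for i in np.where(test_mse > threshold)[0]:
    print(raw_records[i])  # hypothetical: the original categorical row for index i
```

Running the prediction prints the flagged rows: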

['X', 'Y', 'D', 'C', '2', 'D', 'C', 'A']
['T', 'X', 'S', 'X', '1', 'A', 'B', 'C']
['X', 'T', 'S', 'X', '1', 'A', 'B', 'C']

Finally, we have the predicted anomalies.

Each input data point is encoded into two-dimensional data in the latent space, so it is easy to visualize the latent outputs in a chart (see Figure 2).
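With the sketch above, an encoder-only model that stops at the bottleneck layer can produce those latent outputs:

```python
import matplotlib.pyplot as plt
from tensorflow.keras.models import Model

# Truncate the trained autoencoder at the bottleneck layer.
encoder = Model(autoencoder.input, autoencoder.get_layer('bottleneck').output)

latent = encoder.predict(x_test)  # shape: (n_samples, 2)
plt.scatter(latent[:, 0], latent[:, 1], s=5)
plt.xlabel('latent dimension 1')
plt.ylabel('latent dimension 2')
plt.show()
```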

Figure 2: Scatter plot of the hidden variables on test dataset

The full code is in my GitHub repository [link].

References

Ng, A., 2011. CS294A Lecture notes: Sparse autoencoder. [online] Web.stanford.edu. Available at: <https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf> [Accessed 4 May 2020].

Agmon, A., 2020. A Keras-Based Autoencoder For Anomaly Detection In Sequences. [online] Medium. Available at: <https://towardsdatascience.com/a-keras-based-autoencoder-for-anomaly-detection-in-sequences-75337eaed0e5> [Accessed 4 May 2020].
