Building the simplest Auto-Encoder in Keras

Nikhil Anand · Analytics Vidhya · May 11, 2020

A good machine learning model requires a good feature representation. Once a good representation is extracted, the machine learning algorithm performs very well. Sometimes this task, known as feature engineering, is very challenging, especially when we are dealing with high-dimensional data.

Feature extraction techniques have made feature engineering easier to some extent by extracting the most explainable features from the data. Auto-Encoders can be thought of as a feature extraction technique that can be used for non-linear dimensionality reduction.

Auto-Encoders are self-supervised learning techniques where the target is generated from the data itself. An Auto-Encoder maps the input data to an internal latent representation, which is then used to produce an output that should be the same as the input data.

A deep Auto-encoder

An Auto-Encoder is composed of three components: the encoder, the bottleneck (or latent representation), and the decoder. The encoder maps the input into the code, the decoder maps the code back to a reconstruction of the original input, and the bottleneck holds the lower-dimensional representation of the data. Now, let us jump directly into building the simplest possible auto-encoder using Keras.

The pictorial representation of the network that we are going to create is shown below. All three components are represented in the figure.

The three components of Auto-encoder

Importing the libraries

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
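
If you want the noisy sequences and the network's initial weights to be reproducible between runs, you can optionally fix the random seeds (this step is not in the original post):

# Optional (not in the original post): fix random seeds for reproducibility.
np.random.seed(42)
tf.random.set_seed(42)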

Creating the dataset

We have created five linear sequences and added some noise to them. The first, third, and fifth sequences are highly positively correlated with one another, while the second and fourth are highly negatively correlated with the first, third, and fifth (a quick correlation check follows the code below).

seq_1 = np.linspace(0,100,100) + np.random.uniform(-1.5,1.5,100)
seq_2 = np.linspace(100,0,100) + np.random.uniform(-1.5,1.5,100)
seq_3 = np.linspace(0,100,100) + np.random.uniform(-1.5,1.5,100)
seq_4 = np.linspace(100,0,100) + np.random.uniform(-1.5,1.5,100)
seq_5 = np.linspace(0,100,100) + np.random.uniform(-1.5,1.5,100)
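
A quick way to sanity-check these correlations (an illustrative addition, not part of the original post) is to print the pairwise correlation matrix of the five sequences:

# Illustrative check: pairwise correlations, rows/columns in the order seq_1 ... seq_5
corr = np.corrcoef([seq_1, seq_2, seq_3, seq_4, seq_5])
print(np.round(corr, 2))
# Values close to +1 appear between seq_1, seq_3 and seq_5,
# and values close to -1 between those and seq_2 and seq_4.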

Plotting the data

fig, a = plt.subplots(3, 2)
a[0][0].plot(seq_1[0:20], 'o', color="red")
a[0][0].set_title('Sequence 1')
a[0][1].plot(seq_2[0:20], 'o', color="green")
a[0][1].set_title('Sequence 2')
a[1][0].plot(seq_3[0:20], 'o', color="blue")
a[1][0].set_title('Sequence 3')
a[1][1].plot(seq_4[0:20], 'o', color="yellow")
a[1][1].set_title('Sequence 4')
a[2][0].plot(seq_5[0:20], 'o', color="pink")
a[2][0].set_title('Sequence 5')
plt.show()
The five linear sequences that we will be using in our auto-encoder

Creating the DataFrame

data = pd.DataFrame({'col_1':seq_1,'col_2':seq_2,'col_3':seq_3,'col_4':seq_4,'col_5':seq_5})
numpy_array = data.values

Building the network

Since we are recreating the input data at the output layer, the best choice for our loss function would be the mean squared error between the input and its reconstruction.

L(x, x') = ||x − x'||²

where x is the original input and x' is the reconstructed input.
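
As a small worked example (not from the original post), the same loss can be computed by hand with NumPy for a single sample and a hypothetical reconstruction:

# Illustrative example: mean squared reconstruction error for one sample
x = np.array([10.0, 90.0, 10.0, 90.0, 10.0])        # original input
x_prime = np.array([10.5, 89.2, 9.8, 90.4, 10.1])   # hypothetical reconstruction
loss = np.mean((x - x_prime) ** 2)
print(loss)  # ≈ 0.22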

model = Sequential()
model.add(Dense(5, activation='linear', input_shape=(5,)))  # encoder
model.add(Dense(2, activation='linear'))                    # bottleneck (latent representation)
model.add(Dense(5, activation='linear'))                    # decoder / reconstruction
model.compile(optimizer='adam', loss='mse')
history = model.fit(numpy_array, numpy_array, epochs=1000, verbose=0)
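
If you want to look at the 2-dimensional bottleneck codes themselves, one option (a sketch, not shown in the original post) is to wrap the trained layers up to the bottleneck in a second model:

# Illustrative sketch: reuse the trained layers up to the bottleneck as an encoder
from tensorflow.keras.models import Model
encoder = Model(inputs=model.input, outputs=model.layers[1].output)
latent_codes = encoder.predict(numpy_array)
print(latent_codes.shape)  # (100, 2)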

Evaluating the model

mse = model.evaluate(numpy_array, numpy_array, verbose=0)

The mean squared error on the above data comes out to be 0.58. Now let's feed the network some test data it has never seen and see how it performs.

test = np.array([[10, 1000, 10, 1000, 10]])
predicted_results = model.predict(test)
print(predicted_results)
array([[9.514883,1000.5625,10.023716,999.5494,10.274501]],dtype=float32)
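
To connect this prediction back to the loss defined above, we can (illustratively) compute the reconstruction error for this single test sample by hand:

# Illustrative: mean squared reconstruction error for the single test sample
test_mse = np.mean((test - predicted_results) ** 2)
print(test_mse)  # roughly 0.17 for the prediction shown above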

From the above example, we can see how a small auto-encoder trained on five highly correlated sequences learns to reconstruct its input. A representation of the input and the corresponding output recreated by the network is shown below.

Reconstructed input from the random data

There are several variants of auto-encoders, such as the undercomplete auto-encoder, the denoising auto-encoder, the sparse auto-encoder, and the adversarial auto-encoder. All of these architectures aim to learn meaningful representations of the original input data.

Auto-Encoders have a wide range of applications, including anomaly detection, image compression, and information retrieval.
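
In anomaly detection, for example, the reconstruction error itself becomes the signal: samples that the auto-encoder reconstructs poorly are flagged. A minimal sketch of this idea on the data above (the threshold is an illustrative choice, not from the original post):

# Illustrative sketch: flag samples with unusually high reconstruction error
reconstructions = model.predict(numpy_array)
errors = np.mean((numpy_array - reconstructions) ** 2, axis=1)   # per-sample MSE
threshold = errors.mean() + 3 * errors.std()                     # example cut-off
anomalies = np.where(errors > threshold)[0]
print(anomalies)  # indices of potential anomalies (likely empty for this clean data)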
