Complete Guide to build an AutoEncoder in Pytorch and Keras

Sai Durga Mahesh
Published in Analytics Vidhya · Jul 6, 2020

This article is a continuation of my previous article, a complete guide to building a CNN using PyTorch and Keras.

Loading input from standard or custom datasets is already covered in the complete guide to CNN using PyTorch and Keras, so we can jump straight to a short introduction to autoencoders and then implement one.

AutoEncoders

An autoencoder is a neural network that learns to encode data into a compressed representation and decode it back with minimal loss of information.

Autoencoder
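To make the idea concrete, here is a minimal fully connected autoencoder sketch in PyTorch. The layer sizes and class name are illustrative only and are not part of the VAE implemented later in this article.

#Pytorch (illustrative sketch of a plain autoencoder)
import torch.nn as nn

class PlainAutoEncoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=32):
        super(PlainAutoEncoder, self).__init__()
        # encoder compresses the input into a small code
        self.encoder = nn.Sequential(nn.Linear(input_dim, code_dim), nn.ReLU())
        # decoder reconstructs the input from the code
        self.decoder = nn.Sequential(nn.Linear(code_dim, input_dim), nn.Sigmoid())

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code)

Training such a network to reproduce its own input (for example with a mean-squared-error loss) forces the small code layer to keep only the most important information.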

There are many variants of the above network. Some of them are:

Sparse AutoEncoder

This autoencoder reduces overfitting by adding a sparsity regularization term on the activations of the hidden nodes, so that only a few hidden units are strongly active for any given input.
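A minimal sketch of such a sparsity penalty, assuming an L1 term on the hidden activations (the coefficient and function names are illustrative):

#Pytorch (illustrative sparse-autoencoder loss)
import torch.nn.functional as F

def sparse_loss(reconstructed_x, x, hidden_activations, l1_coeff=1e-3):
    # reconstruction error of the autoencoder
    reconstruction_loss = F.mse_loss(reconstructed_x, x)
    # L1 penalty pushes most hidden activations towards zero
    sparsity_penalty = l1_coeff * hidden_activations.abs().mean()
    return reconstruction_loss + sparsity_penalty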

Denoising AutoEncoder

This autoencoder is trained on inputs corrupted with noise while being asked to reconstruct the clean input, so that at evaluation time it can remove noise from its input.
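A sketch of one training step of this idea, assuming Gaussian corruption and a generic autoencoder model (the names and noise level are illustrative):

#Pytorch (illustrative denoising-autoencoder training step)
import torch
import torch.nn.functional as F

def denoising_step(autoencoder, x, noise_std=0.3):
    # corrupt the input with Gaussian noise ...
    noisy_x = x + noise_std * torch.randn_like(x)
    # ... but measure reconstruction against the clean input
    reconstructed = autoencoder(noisy_x)
    return F.mse_loss(reconstructed, x)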

Variational AutoEncoder

This is a kind of deep generative neural network. A major limitation of plain autoencoders is that they only try to minimise the reconstruction error and never care about the underlying latent representation.

A good latent representation should be meaningful so that it can be used in generative neural networks such as GANs. Here, meaningful refers to the arrangement of the latent space: data points from the same class are grouped closer together, while data points from different classes sit a little farther apart.

https://blog.keras.io/building-autoencoders-in-keras.html

This kind of latent representation can be achieved by changing the structure of the network as follows:

VAE

Unlike the other autoencoders, we generate a latent distribution with a mean and a standard deviation instead of a single latent vector. We then sample from this latent distribution to reconstruct the input.

The two important things about the variational autoencoder are:

While sampling, we need to handle the randomness of the sampling node using the re-parametrization trick, because a purely random node would block backpropagation through it.

N(μ, 𝛔) ≈ μ + 𝛔 · N(0, 1), i.e. z = μ + 𝛔 · ε with ε ∼ N(0, 1)

The re-parametrization trick does not change the distribution, but it moves the randomness into a separate noise input ε, so that gradients can flow back through μ and 𝛔 during backpropagation.
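A quick sketch of why this allows backpropagation: the sample becomes a differentiable function of μ and 𝛔, with the randomness isolated in ε (the values below are illustrative):

#Pytorch (illustrative check that gradients flow through the re-parametrized sample)
import torch

mu = torch.tensor([0.5], requires_grad=True)
sigma = torch.tensor([1.2], requires_grad=True)

eps = torch.randn(1)          # randomness lives only in eps
z = mu + sigma * eps          # re-parametrized sample

z.sum().backward()
print(mu.grad, sigma.grad)    # gradients reach mu and sigma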

The variational autoencoder regularizes the cost function using the following equation.

Regularized Cost Function = Loss + KL(N(μ, 𝛔), N(0, 1))

This forces the latent distribution to follow the standard normal distribution, which extends its usage in deep generative models.
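For example, once the VAE implemented below has been trained, new data can be generated simply by decoding random draws from N(0, 1). A sketch, assuming the PyTorch vae model built later in this article (with z_dim = 2 and 784-dimensional inputs):

#Pytorch (illustrative generation from a trained VAE)
import torch

with torch.no_grad():
    z = torch.randn(16, 2)        # 16 latent vectors drawn from N(0, 1)
    generated = vae.decoder(z)    # 16 new 784-dimensional samples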

You can read more about VAEs in this article and more about the various types of autoencoders here. We will implement a VAE in this article.

Implementation

Any autoencoder comprises two networks: an encoder and a decoder. As said before, a VAE additionally uses a regularized cost function.

Encoder

The encoder takes the input and returns the mean and standard deviation of the latent distribution.

#Pytorch
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x, h1, h2, z):
        super(VAE, self).__init__()
        # encoder layers
        self.fc1 = nn.Linear(x, h1)
        self.fc2 = nn.Linear(h1, h2)
        self.fc_mean = nn.Linear(h2, z)
        self.fc_sd = nn.Linear(h2, z)

    def encoder(self, x):
        h1 = F.relu(self.fc1(x))
        h2 = F.relu(self.fc2(h1))
        return self.fc_mean(h2), self.fc_sd(h2)  # mu, log_var
#Keras
from keras.layers import Input, Dense

x = Input(batch_shape=(batch_size, original_dim))
h = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_sigma = Dense(latent_dim)(h)

Sampling

From the mean and standard deviation obtained from the encoder, we generate the input to the decoder by sampling. The re-parametrization trick mentioned above comes into the picture here.

#Pytorch (method of the VAE class above)
    def sampling(self, mu, log_var):
        std = torch.exp(0.5 * log_var)   # log variance -> standard deviation
        eps = torch.randn_like(std)      # eps ~ N(0, 1)
        return eps.mul(std).add_(mu)     # z = mu + std * eps

#Keras
from keras import backend as K

def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim),
                              mean=0., std=epsilon_std)   # epsilon_std is typically 1.0
    return z_mean + K.exp(z_log_sigma) * epsilon
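In Keras, this sampling function has to be wrapped in a Lambda layer so that it becomes a node in the model graph and produces the latent tensor z used by the decoder below (a sketch following the Keras blog referenced above):

#Keras
from keras.layers import Lambda

z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_sigma])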

Decoder

The decoder takes the output of the sampling step and tries to reconstruct the original input.

#Pytorch
class VAE(nn.Module):
    def __init__(self, x, h1, h2, z):
        super(VAE, self).__init__()
        # encoder
        self.fc1 = nn.Linear(x, h1)
        self.fc2 = nn.Linear(h1, h2)
        self.fc_mean = nn.Linear(h2, z)
        self.fc_sd = nn.Linear(h2, z)
        # decoder
        self.fc4 = nn.Linear(z, h2)
        self.fc5 = nn.Linear(h2, h1)
        self.fc6 = nn.Linear(h1, x)

    def decoder(self, z):
        h1 = F.relu(self.fc4(z))
        h2 = F.relu(self.fc5(h1))
        return torch.sigmoid(self.fc6(h2))
#Keras
decoder_h = Dense(intermediate_dim, activation='relu')
decoder_mean = Dense(original_dim, activation='sigmoid')
h_decoded = decoder_h(z)
x_decoded_mean = decoder_mean(h_decoded)

Loss Function

As previously mentioned, the VAE uses a regularized loss function: the reconstruction loss plus a KL-divergence term.

The KL divergence between a distribution with mean μᵢ and standard deviation 𝛔ᵢ and the standard normal distribution, KL(N(μᵢ, 𝛔ᵢ), N(0, 1)), has the closed form −0.5 · Σᵢ (1 + log 𝛔ᵢ² − μᵢ² − 𝛔ᵢ²), which is exactly the regularization term in the code below.

#Pytorch
def loss_function(reconstructed_x, x, mu, log_var):
    # reconstruction error
    loss = F.binary_cross_entropy(reconstructed_x, x.view(-1, 784),
                                  reduction='sum')
    # closed-form KL(N(mu, sigma), N(0, 1))
    regularized_term = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return loss + regularized_term
#Keras
from keras import objectives   # Keras 1.x; in Keras 2 use keras.losses instead

def vae_loss(x, x_decoded_mean):
    xent_loss = objectives.binary_crossentropy(x, x_decoded_mean)
    kl_loss = -0.5 * K.mean(1 + z_log_sigma - K.square(z_mean) -
                            K.exp(z_log_sigma), axis=-1)
    return xent_loss + kl_loss

Flow of data

Data flows from the encoder, through the sampling step, and then into the decoder.

#Pytorch (method of the VAE class above)
    def forward(self, x):
        mu, log_var = self.encoder(x.view(-1, 784))
        z = self.sampling(mu, log_var)
        return self.decoder(z), mu, log_var

In Keras, there is no need for a forward function. Data flows in the order in which you modelled the network.

Compiling the Network with the Loss Function

#Pytorch
import torch.optim as optim

vae = VAE(x=784, h1=512, h2=256, z=2)
optimizer = optim.Adam(vae.parameters())   # any optimizer works; Adam is a common choice

# one training step on a batch `data` of flattened 784-dimensional images
reconstructed, mu, log_var = vae(data)
loss = loss_function(reconstructed, data, mu, log_var)

optimizer.zero_grad()
loss.backward()
optimizer.step()
#Keras
from keras.models import Model

vae = Model(x, x_decoded_mean)
vae.compile(optimizer='rmsprop', loss=vae_loss)
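In Keras, training the compiled model is then a single fit call, with the input used as its own target. A sketch, assuming x_train holds flattened training images (names and epoch count are illustrative):

#Keras
vae.fit(x_train, x_train,
        shuffle=True,
        epochs=50,
        batch_size=batch_size)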

In the next article, we will also cover the implementation of a GAN in PyTorch and Keras.

Thanks for reading:))
