Autoencoders Explained

om pramod
6 min read · Jun 16, 2024


Part 5: Denoising Autoencoders

In traditional autoencoders, the goal is to learn a compressed representation that preserves the essential information in the data. However, if the input data is noisy or corrupted, the autoencoder may learn to encode and decode the noise as well, resulting in a poor compressed representation. Denoising autoencoders address this problem by learning to denoise the input data during training.

One way to encourage an autoencoder to learn a useful representation is to keep the code layer small, which forces the model to compress the data rather than simply copy the input to the output. Another way is to add random noise to the input and train the autoencoder to recover the original, noise-free data. The autoencoder then cannot simply copy its input to its output, because the input contains noise that the target does not. A model trained this way is called a denoising autoencoder, and the training setup pushes it to remove the noise and learn the underlying, meaningful features of the data.

A denoising autoencoder (DAE) is a type of autoencoder that is trained to remove noise from data. To achieve this, random noise is added to the input data during training, while the clean data serves as the target. The amount and type of noise depend on the problem being solved. In image denoising, for example, the pixels of an image can be corrupted with Gaussian noise, salt-and-pepper noise, or random pixel dropout. Similarly, in speech denoising, white noise or recorded background noise can be mixed into the audio signal.
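The three image corruptions mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming images stored as float arrays scaled to [0, 1]; the function names, default noise levels, and the random stand-in image are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the example is repeatable


def add_gaussian_noise(img, sigma=0.1):
    """Add zero-mean Gaussian noise, then clip back to the valid [0, 1] range."""
    return np.clip(img + rng.normal(0.0, sigma, size=img.shape), 0.0, 1.0)


def add_salt_and_pepper(img, amount=0.05):
    """Set roughly an `amount` fraction of pixels to 0 (pepper) or 1 (salt)."""
    noisy = img.copy()
    u = rng.random(img.shape)
    noisy[u < amount / 2] = 0.0      # pepper
    noisy[u > 1 - amount / 2] = 1.0  # salt
    return noisy


def add_pixel_dropout(img, drop_prob=0.2):
    """Zero out each pixel independently with probability `drop_prob`."""
    keep = rng.random(img.shape) >= drop_prob
    return img * keep


img = rng.random((28, 28))  # random stand-in for a real image in [0, 1]
noisy_versions = {
    "gaussian": add_gaussian_noise(img),
    "salt_and_pepper": add_salt_and_pepper(img),
    "pixel_dropout": add_pixel_dropout(img),
}
for name, noisy in noisy_versions.items():
    print(name, noisy.shape, float(noisy.min()), float(noisy.max()))
```

Each function returns a new array and leaves the clean image untouched, which matters because the clean image is still needed as the training target.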

For example, if we choose to use Gaussian noise, we can generate random values from a normal distribution with mean 0 and standard deviation sigma and add them to the pixel values of each image in the dataset. The amount of noise added can be controlled by adjusting the standard deviation of the normal distribution. Suppose we have an input image X of size 28x28, which is a grayscale image with pixel values ranging from 0 to 255. We can add Gaussian noise to this image by generating a noise matrix N of the same size, where each element of N is a random value drawn from a normal distribution with mean 0 and standard deviation sigma.

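The noisy image X’ is obtained by adding the noise matrix N to the original image X, i.e., X’ = X + N. The value of sigma controls the amount of noise, and it must be read relative to the pixel scale. For example, if the pixel values are first scaled to the range [0, 1], then sigma=0.1 adds mild noise and the image stays relatively clear, whereas sigma=1.0 adds noise as large as the full pixel range and leaves the image barely recognizable. On the raw 0 to 255 scale, comparable corruption would need sigma values of roughly 25 and 255. After the noise is added, values that fall outside the valid range are usually clipped back into it.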
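To make the effect of sigma concrete, here is a small sketch that builds X’ = X + N for two noise levels and measures how far each result is from the clean image. It assumes pixel values scaled to [0, 1]; the random stand-in image and the helper name `corrupt` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
X = rng.random((28, 28))  # hypothetical 28x28 image, already scaled to [0, 1]


def corrupt(image, sigma):
    """Return X' = X + N with N ~ Normal(0, sigma^2), clipped to [0, 1]."""
    N = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image + N, 0.0, 1.0)


rmse_by_sigma = {}
for sigma in (0.1, 1.0):
    X_noisy = corrupt(X, sigma)
    # Root-mean-square difference between the noisy and the clean image
    rmse_by_sigma[sigma] = float(np.sqrt(np.mean((X_noisy - X) ** 2)))
    print(f"sigma={sigma}: RMSE vs. clean image = {rmse_by_sigma[sigma]:.3f}")
```

The larger sigma produces the larger error, and the clipping step keeps the corrupted image inside the valid pixel range.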

Let’s say we have a 1D signal (a vector) x of length n = 5:

x = [2, 3, 4, 5, 6]

We can add random Gaussian noise to this signal by generating a vector of noise samples with the same length and adding it to the original signal. Let’s generate noise with mean 0 and standard deviation 0.5:

noise = [0.1, -0.4, 0.2, 0.3, -0.2]

We can then add this noise to the signal element-wise to get the noisy version of the signal:

x_noisy = x + noise = [2.1, 2.6, 4.2, 5.3, 5.8]
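The same arithmetic can be checked in NumPy. The first part reuses the hand-picked noise values from the text; the second part shows how the noise would normally be sampled instead, where the seed is an arbitrary choice for repeatability.

```python
import numpy as np

x = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
noise = np.array([0.1, -0.4, 0.2, 0.3, -0.2])  # the fixed values from the text

# Element-wise addition gives the noisy signal
x_noisy = x + noise
print(x_noisy)

# In practice the noise is sampled rather than written by hand:
rng = np.random.default_rng(seed=0)
sampled_noise = rng.normal(loc=0.0, scale=0.5, size=x.shape)
x_noisy_sampled = x + sampled_noise
print(x_noisy_sampled)
```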

Here’s a Python example of how to add Gaussian noise to an image:

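import cv2
import numpy as np

# Read input image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('input_image.png')
if img is None:
    raise FileNotFoundError('input_image.png could not be read')

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Generate Gaussian noise with mean 0 and standard deviation 10
noise = np.random.normal(0, 10, gray.shape)

# Add noise in floating point, then clip to [0, 255] and convert back to
# uint8 so the result is a valid 8-bit image
noisy_img = np.clip(gray.astype(np.float64) + noise, 0, 255).astype(np.uint8)

# Save output image
cv2.imwrite('noisy_image.png', noisy_img)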

Note: The noise array has the same shape as the grayscale image because gray.shape is passed as the size argument to np.random.normal.

After adding the noise, we use the corrupted images as input to the denoising autoencoder. The network is then trained to reconstruct the original, clean data from the noisy input by minimizing the difference between its reconstructed output and the clean image (not the noisy input it was given).

Before and After the Noise Removal of an Image of a Playful Dog. Reference

Here’s an example of image denoising using Python:
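As a minimal, self-contained sketch of the training procedure described above, the following toy example trains a one-hidden-layer autoencoder in plain NumPy to map noisy inputs to clean targets. Everything here is an assumption made for illustration: the synthetic vectors stand in for flattened images, and the names `train_toy_dae` and `mean_sq_error`, the layer size, the learning rate, and the number of steps are hypothetical choices, not a tuned image-denoising pipeline.

```python
import numpy as np


def mean_sq_error(a, b):
    """Squared error summed over features, averaged over samples."""
    return float(np.mean(np.sum((a - b) ** 2, axis=1)))


def train_toy_dae(x_noisy, x_clean, hidden=8, lr=0.02, steps=3000, seed=0):
    """Train a tanh-encoder / linear-decoder autoencoder with full-batch
    gradient descent so that noisy inputs are mapped to clean targets."""
    rng = np.random.default_rng(seed)
    n, d = x_noisy.shape
    W1 = rng.normal(0.0, 0.1, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, size=(hidden, d))
    b2 = np.zeros(d)

    def forward(x):
        h = np.tanh(x @ W1 + b1)   # encoder
        return h, h @ W2 + b2      # decoder

    initial_loss = mean_sq_error(forward(x_noisy)[1], x_clean)
    for _ in range(steps):
        h, out = forward(x_noisy)
        # Loss compares the output with the CLEAN data, not the noisy input
        grad_out = 2.0 * (out - x_clean) / n
        grad_W2 = h.T @ grad_out
        grad_b2 = grad_out.sum(axis=0)
        grad_pre = (grad_out @ W2.T) * (1.0 - h ** 2)  # back through tanh
        grad_W1 = x_noisy.T @ grad_pre
        grad_b1 = grad_pre.sum(axis=0)
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2

    def denoise(x):
        return forward(x)[1]

    return denoise, initial_loss


# Synthetic "clean" data: 16-dimensional vectors lying on a 3-D subspace
rng = np.random.default_rng(0)
n_samples, n_features, n_latent = 512, 16, 3
basis = rng.normal(size=(n_latent, n_features)) / np.sqrt(n_features)
x_clean = rng.normal(size=(n_samples, n_latent)) @ basis

# Corrupt the inputs with Gaussian noise; the targets stay clean
sigma = 0.2
x_noisy = x_clean + rng.normal(0.0, sigma, size=x_clean.shape)

denoise, initial_loss = train_toy_dae(x_noisy, x_clean)
noisy_error = mean_sq_error(x_noisy, x_clean)
denoised_error = mean_sq_error(denoise(x_noisy), x_clean)
print(f"loss before training:     {initial_loss:.3f}")
print(f"error of noisy input:     {noisy_error:.3f}")
print(f"error after denoising:    {denoised_error:.3f}")
```

The key design point is in the training loop: the network receives `x_noisy` as input, but the error is computed against `x_clean`. For real images, the same input/target pairing would typically be passed to a deep learning framework instead of this hand-written loop.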

Comparing PCA and Autoencoders for Dimensionality Reduction: Linear vs. Nonlinear Approaches

PCA is a linear technique that finds the directions of maximum variance in the data and projects the data onto the lower-dimensional subspace spanned by these directions. The goal is to retain as much variance as possible while reducing the number of dimensions. Because PCA is restricted to linear transformations, it can only capture linear relationships in the data.

Autoencoders, on the other hand, are neural networks that can perform nonlinear dimensionality reduction when their layers use nonlinear activations. The encoder maps the input to a lower-dimensional latent space, and the decoder maps it back to the original space. Training minimizes the difference between the input and the output, which encourages the encoder to keep the most important features of the input in the latent space. Here is a Python implementation that demonstrates the difference by compressing MNIST digits to 32 dimensions with both methods:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Load the MNIST dataset
(x_train, _), (x_test, _) = mnist.load_data()

# Reshape the data to a 1D vector and normalize the values to be between 0 and 1
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Perform PCA on the training data
pca = PCA(n_components=32)
pca.fit(x_train)
x_train_pca = pca.transform(x_train)
x_test_pca = pca.transform(x_test)

# Define the autoencoder architecture
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
encoded = Dense(32, activation='relu')(encoded)

decoded = Dense(64, activation='relu')(encoded)
decoded = Dense(128, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(decoded)

# Create the autoencoder model
autoencoder = Model(input_img, decoded)

# Compile the model
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train the autoencoder
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256, shuffle=True, validation_data=(x_test, x_test))

# Use the autoencoder to reconstruct the test data
x_test_autoencoder = autoencoder.predict(x_test)

# Plot the first 5 original images and their PCA reconstructions
x_test_pca_rec = pca.inverse_transform(x_test_pca)  # reconstruct once, outside the loop
plt.figure(figsize=(12, 6))
for i in range(5):
    # Original image
    ax = plt.subplot(2, 5, i + 1)
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Reconstructed image using PCA
    ax = plt.subplot(2, 5, i + 6)
    plt.imshow(x_test_pca_rec[i].reshape(28, 28), cmap='gray')
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.suptitle("Original Images (top row) vs. PCA Reconstruction (bottom row)")
plt.show()

# Plot the first 5 reconstructed images for the autoencoder (single row)
plt.figure(figsize=(12, 3))
for i in range(5):
    ax = plt.subplot(1, 5, i + 1)
    plt.imshow(x_test_autoencoder[i].reshape(28, 28), cmap='gray')
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.suptitle("Autoencoder Reconstruction")
plt.show()
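The plots give only a visual impression of reconstruction quality. One way to put a number on it is the mean squared error between the original and reconstructed images. The sketch below defines such a helper and, to stay self-contained, exercises it on a small NumPy implementation of the PCA encode/decode steps described earlier. The names `reconstruction_mse` and `pca_reconstruct` and the random stand-in data are hypothetical; with the MNIST listing above, the helper could be applied to x_test together with either reconstruction.

```python
import numpy as np


def reconstruction_mse(original, reconstructed):
    """Mean squared error between two arrays of flattened images."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    return float(np.mean((original - reconstructed) ** 2))


def pca_reconstruct(data, n_components):
    """Project onto the top principal directions and map back (linear PCA)."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    codes = centered @ components.T   # encode: lower-dimensional representation
    return codes @ components + mean  # decode: back to the original space


rng = np.random.default_rng(0)
data = rng.random((200, 20))  # hypothetical stand-in for flattened images

errors = {}
for k in (2, 10, 20):
    errors[k] = reconstruction_mse(data, pca_reconstruct(data, k))
    print(f"{k:2d} components: reconstruction MSE = {errors[k]:.6f}")
```

Keeping more components lowers the error, and keeping all of them reconstructs the data up to floating-point precision. Comparing this number for PCA and for the autoencoder at the same code size (32 in the listing above) gives a like-for-like measure of how much each method loses.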

Closing note — Congratulations on completing this journey through the world of autoencoders! Continue experimenting and applying what you’ve learned to your projects. The possibilities are endless!
