DLOA (Part-13)-LeNet CNN and Implementation

Dewansh Singh
7 min read · May 8, 2023


Hey readers, hope you are all doing well, safe, and sound. In the previous blog, we briefly discussed Convolutional Neural Networks; if you haven’t read it yet, you can go through this link. In this blog, we’ll be discussing the LeNet CNN, how it works, and its implementation.

Introduction

LeNet-5 is one of the earliest and most influential Convolutional Neural Network (CNN) architectures. It was designed by Yann LeCun in the 1990s and was originally used for handwritten digit recognition. In this blog, we will explain the architecture of LeNet-5 in detail.

LeNet was one of the first successful CNNs and is often considered the “Hello World” of deep learning. It has been successfully applied to tasks such as handwritten digit recognition. The LeNet-5 architecture consists of alternating convolutional and pooling layers followed by fully connected layers: three convolutional layers, two subsampling layers, and two fully connected layers in total. LeNet marked the beginning of CNNs in deep learning for computer vision problems.

Networks of this era were difficult to train, and large feature maps made models slow and prone to overfitting. LeNet addresses this in part by inserting subsampling (pooling) layers between the convolutional layers to reduce the spatial size of the feature maps, which cuts the number of parameters, helps prevent overfitting, and allows the network to train more effectively.

LeNet CNN Architecture

The LeNet CNN is a simple yet powerful model that has been used for various tasks such as handwritten digit recognition, traffic sign recognition, and face detection. Although LeNet was developed more than 20 years ago, its architecture is still relevant today and continues to be used.

Working of LeNet CNN

The LeNet-5 architecture consists of seven layers, including three convolutional layers, two subsampling (pooling) layers, and two fully connected layers. The input to the network is a grayscale image of size 32x32 pixels (the original paper pads the 28x28 MNIST digits to 32x32), and the output is a probability distribution over ten classes, corresponding to the digits 0–9.

The working of the LeNet-5 architecture can be summarized as follows:

  1. Convolutional layers: The first layer in the network is a convolutional layer with six filters of size 5x5. The stride is set to one, so the output size of this layer is 28x28x6. The weights in the filters are learned during training using backpropagation. The second convolutional layer has 16 filters of size 5x5, and its input is the output of the first subsampling layer. The stride is again set to one, so the output size of this layer is 10x10x16. The third and final convolutional layer has 120 filters of size 5x5. Its input is the 5x5x16 output of the second subsampling layer, so each filter covers the entire spatial extent and the output size is 1x1x120 (equivalently, this layer can be implemented as a fully connected layer applied to the flattened vector of length 400). The shape arithmetic behind these numbers is sketched right after this list.
  2. Subsampling (pooling) layers: After the first convolutional layer, LeNet-5 includes a subsampling layer, also known as a pooling layer. The purpose of the pooling layer is to downsample the feature maps and reduce the spatial resolution of the input. The original paper used a form of average pooling; modern implementations, including the one below, use max-pooling with a filter size of 2x2 and a stride of two, which gives an output size of 14x14x6. The second subsampling layer works the same way and produces an output of size 5x5x16.
  3. Fully connected layers: After the third convolutional layer, LeNet-5 includes two fully connected layers. The first fully connected layer has 84 units and is followed by a tanh activation function, matching the squashing activations used throughout the network. The second fully connected layer has 10 units, corresponding to the number of classes in the output, and is followed by a softmax activation function.
  4. Training: During training, LeNet-5 is trained using backpropagation with a cross-entropy loss (as in the implementation below). The weights in the filters and fully connected layers are updated using gradient descent, and the learning rate is typically set to a small value (e.g., 0.01).
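To see where these sizes come from, note that a convolution or pooling layer with no padding produces an output of size (input − kernel) / stride + 1 along each spatial dimension. Here is a minimal sketch that traces a 32x32 input through the network (the Keras implementation below feeds in MNIST’s 28x28 images directly, without padding them to 32x32, so its intermediate shapes come out slightly smaller):

def out_size(n, kernel, stride=1):
    # Output width/height of an unpadded conv/pool layer
    return (n - kernel) // stride + 1

n = 32                 # input: 32x32 grayscale image
n = out_size(n, 5)     # C1: 5x5 conv, 6 filters    -> 28x28x6
n = out_size(n, 2, 2)  # S2: 2x2 pool, stride 2     -> 14x14x6
n = out_size(n, 5)     # C3: 5x5 conv, 16 filters   -> 10x10x16
n = out_size(n, 2, 2)  # S4: 2x2 pool, stride 2     -> 5x5x16
n = out_size(n, 5)     # C5: 5x5 conv, 120 filters  -> 1x1x120
print(n)               # prints 1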

In summary, LeNet-5 uses a series of convolutional and subsampling layers to learn hierarchical features from images. The convolutional layers learn low-level features such as edges and corners, while the subsampling layers downsample the feature maps and reduce the spatial resolution. The fully connected layers then use these learned features to make predictions about the class of the input image. Through the process of training, the weights in the filters and fully connected layers are adjusted to minimize the cross-entropy loss and improve the accuracy of the predictions.

Implementation

Let’s say we have an input image of a handwritten digit “5” that is grayscale and has a size of 28x28 pixels, like this:

[Image: a handwritten digit “5” from MNIST]

We begin by importing the required libraries from Keras. The mnist module provides access to the MNIST dataset of handwritten digits, while Sequential, Conv2D, MaxPooling2D, Flatten, and Dense are Keras classes used to define the LeNet-5 CNN architecture. The to_categorical function is used to convert the integer class labels into one-hot encoded vectors.

from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

Next, we load the MNIST dataset using the load_data function and split it into training and testing sets.

(X_train, y_train), (X_test, y_test) = mnist.load_data()

We preprocess the data by reshaping the input images to have a channel dimension of 1 (since they are grayscale), scaling the pixel values to be between 0 and 1, and one-hot encoding the class labels.

X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

We define the LeNet-5 CNN architecture using the Sequential model in Keras. We begin by instantiating a new Sequential model object and adding layers to it using the add method.

The first layer is a convolutional layer with 6 filters of size 5x5 and a hyperbolic tangent activation function. The filters argument specifies the number of filters to use, and the kernel_size argument specifies the size of each filter. The activation argument specifies the activation function to use after the convolutional operation. The input_shape argument specifies the shape of the input data, which in this case is a 28x28 grayscale image with a channel dimension of 1. Note that because we feed in the 28x28 MNIST images directly (rather than padding them to 32x32 as in the original paper), the output of this first layer is 24x24x6 rather than the 28x28x6 described above.

model = Sequential()
model.add(Conv2D(filters=6, kernel_size=(5, 5), activation='tanh', input_shape=(28, 28, 1)))

Next, we add a max-pooling layer with a filter size of 2x2 using the MaxPooling2D class. This layer reduces the spatial dimensions of the output from the previous layer by a factor of 2 in both the width and height directions.

model.add(MaxPooling2D(pool_size=(2, 2)))

The second convolutional layer has 16 filters of size 5x5 and a hyperbolic tangent activation function, followed by another max-pooling layer.

model.add(Conv2D(filters=16, kernel_size=(5, 5), activation='tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))

The output from the second max-pooling layer is flattened and passed through two fully connected layers with 120 and 84 units and hyperbolic tangent activation functions, respectively. (In this Keras version, the 120-unit C5 layer of the original architecture is implemented as a dense layer on the flattened features rather than as a 5x5 convolution.)

Finally, the output layer has 10 units and a softmax activation function to produce a probability distribution over the 10 possible classes.

model.add(Flatten())
model.add(Dense(units=120, activation='tanh'))
model.add(Dense(units=84, activation='tanh'))
model.add(Dense(units=10, activation='softmax'))

We compile the model using categorical cross-entropy loss and the Adam optimizer and train it on the training data for 10 epochs with a batch size of 128.

We also evaluate the model on the test data and print the test accuracy.

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=128, validation_data=(X_test, y_test))

# Evaluate the model on the test data
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print('Test accuracy: %.2f%%' % (accuracy * 100.0))
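If you would rather mirror the plain gradient-descent training described in the Working section (with a small learning rate such as 0.01), you could swap the optimizer at the compile step. A sketch, while the results in this blog use Adam:

from tensorflow.keras.optimizers import SGD

# Plain stochastic gradient descent with the small learning rate
# mentioned in the Working section (0.01)
model.compile(loss='categorical_crossentropy', optimizer=SGD(learning_rate=0.01), metrics=['accuracy'])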

Trained as above, this LeNet-5-style model typically reaches around 98–99% accuracy on the MNIST test set after 10 epochs.
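As a quick sanity check, you can print the layer shapes and run the trained model on a single test image, such as the handwritten “5” from earlier. A minimal sketch, assuming the model and data defined above are still in scope:

import numpy as np

model.summary()  # prints each layer's output shape and parameter count

# Predict the class of the first test image
probs = model.predict(X_test[:1])   # shape (1, 10): one probability per digit
print('Predicted digit:', np.argmax(probs))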

For your reference here is the whole code:

from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

# Load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Preprocess the data
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Define the LeNet-5 CNN architecture
model = Sequential()
model.add(Conv2D(filters=6, kernel_size=(5, 5), activation='tanh', input_shape=(28, 28, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(filters=16, kernel_size=(5, 5), activation='tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(units=120, activation='tanh'))
model.add(Dense(units=84, activation='tanh'))
model.add(Dense(units=10, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=128, validation_data=(X_test, y_test))

# Evaluate the model on the test data
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print('Test accuracy: %.2f%%' % (accuracy * 100.0))

That’s it for now….I hope you liked my blog and got to know about the LeNet CNN, how it works, and the example I used while implementing the code.

In the next blog, I will be discussing AlexNet CNN and its Implementation.

Till then, stay tuned for the next blog…

***Next Blog***
