Complete Guide to Building a CNN in PyTorch and Keras

Sai Durga Mahesh · Published in Analytics Vidhya · Jun 4, 2020 · 5 min read

Convolutional Neural Networks (CNNs) have gained a lot of attention in recent years, as they give better results when working with images.

PyTorch and Keras are two important open-source machine learning libraries used in computer vision applications.

PyTorch is known for its define-by-run nature and has emerged as a favourite among researchers. Keras, on the other hand, is very popular for prototyping.

We will build a convolutional network step by step.

Convolutional Neural Network

CNNs are a popular choice for image classification and recognition.

The three important layers in a CNN are the convolution layer, the pooling layer and the fully connected layer. The most commonly used activation function is ReLU.

Some important terminology to be aware of in each layer:

Convolution Layer

This is the first layer after the input, and it extracts features from the image.

The image matrix has three dimensions: (width, height, depth).

A kernel (or filter) matrix is used for feature extraction.

If (w, h, d) is the input dimension and (a, b, d) is the dimension of each of n kernels, then with stride 1 and no padding, the output of the convolution layer is (w-a+1, h-b+1, n).

Stride is the number of pixels by which we shift the kernel over the input matrix.

Padding is the adjustment we make to the image so that the filter fits it. It involves either padding with zeros or dropping part of the image.
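To make the arithmetic concrete, here is a minimal PyTorch sketch that checks the (w-a+1, h-b+1, n) formula; the 32×32×3 input and the choice of 8 kernels are arbitrary:

import torch
import torch.nn as nn

# input: one image with d = 3 channels and w = h = 32
x = torch.randn(1, 3, 32, 32)

# n = 8 kernels of size (a, b) = (5, 5), stride 1, no padding
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=5, stride=1, padding=0)

print(conv(x).shape)  # torch.Size([1, 8, 28, 28]), i.e. (w-a+1, h-b+1, n) = (28, 28, 8)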

Pooling Layer

The pooling layer reduces the number of parameters by downsampling the feature maps. Three commonly used types of pooling are:

Max Pooling : takes the maximum value in each window of the feature map.

Average Pooling : takes the average of the values in each window of the feature map.

Sum Pooling : takes the sum of the values in each window of the feature map.
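A quick sketch of the three pooling operations on a single 2×2 feature map; since PyTorch has no dedicated sum-pooling call, the sketch derives it from the average:

import torch
import torch.nn.functional as F

fmap = torch.tensor([[1., 3.], [2., 4.]]).reshape(1, 1, 2, 2)  # one 2x2 feature map

print(F.max_pool2d(fmap, 2))      # tensor([[[[4.]]]])     -> max pooling
print(F.avg_pool2d(fmap, 2))      # tensor([[[[2.5000]]]]) -> average pooling
print(F.avg_pool2d(fmap, 2) * 4)  # tensor([[[[10.]]]])    -> sum pooling (average x window area)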

Fully Connected Layer

The output from the pooling layer (or from the convolution layer, when no pooling layer is needed) is flattened before being fed to the fully connected layer.


Implementation of a CNN

Importing libraries

Keras

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D

PyTorch

import torch
import torchvision.datasets as datasets
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

Loading Input

Input can be loaded either from the standard datasets available in torchvision and Keras, or from a user-specified directory.

Input from standard datasets in Keras and PyTorch:

#keras
from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

#pytorch
import torchvision.datasets as datasets
import torchvision.transforms as transforms

transform = transforms.ToTensor()
mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
mnist_testset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
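The PyTorch training and evaluation loops later in this article iterate over trainloader and testloader. A minimal sketch of wrapping the datasets above (the batch size of 64 is an arbitrary choice):

from torch.utils.data import DataLoader

trainloader = DataLoader(mnist_trainset, batch_size=64, shuffle=True)
testloader = DataLoader(mnist_testset, batch_size=64, shuffle=False)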

Input from a user-specified directory in Keras and PyTorch:

#keras
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory("./classify/dataset/training_set",
                                                 target_size=(64, 64),
                                                 batch_size=5)

test_set = test_datagen.flow_from_directory("./classify/dataset/test_set",
                                            target_size=(64, 64),
                                            batch_size=5)
#pytorch
from torchvision import datasets, transforms
from torch.utils import data

dataset = datasets.ImageFolder(root='./classify/dataset/training_set/',
                               transform=transforms.ToTensor())
loader = data.DataLoader(dataset, batch_size=8, shuffle=True)
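Note that ImageFolder assigns one class per subdirectory, so the directory above is expected to be organised along these lines (the class and file names here are purely hypothetical):

# ./classify/dataset/training_set/
#     cats/cat001.jpg, cats/cat002.jpg, ...
#     dogs/dog001.jpg, dogs/dog002.jpg, ...
#
# dataset.classes would then be ['cats', 'dogs']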

Adding Convolution Layers

Before adding convolution layers, let us look at the most common network layout in Keras and PyTorch.

In Keras, we start with model = Sequential() and add all the layers to the model.

In PyTorch, we start by defining a class, initialise it with all the layers, and then add a forward function to define the flow of data:

class NeuralNet(nn.Module):
    def __init__(self):
        # layers are defined here
        ...

    def forward(self, x):
        # flow of data is defined here
        ...

Adding a convolution layer in Keras:

model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(28, 28, 1), activation='relu'))

Here 32 is the number of filters and the kernel size is 5×5. ReLU is the activation function.

In PyTorch:

class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)

In conv1, the first 3 is the number of input channels and 32 is the number of filters (i.e. output channels). The second 3 is the kernel size and 1 is the stride.

Adding a pooling layer:

We will add a max pooling layer with kernel size 2×2.

In Keras:

model.add(MaxPooling2D(pool_size=(2, 2)))

In PyTorch:

x = F.max_pool2d(x, 2)

Adding a Fully Connected Layer

Having covered the fully connected layer above, we will now add one in Keras and PyTorch.

#keras
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

#pytorch
class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
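The 9216 in fc1 is not arbitrary. Assuming a 28×28 input, the two 3×3 convolutions shrink it to 26×26 and then 24×24, the 2×2 max pooling added in the forward pass (next section) halves that to 12×12, and 64 channels × 12 × 12 = 9216. A quick shape check:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 28, 28)           # assumes a 28x28 input with 3 channels
x = F.relu(nn.Conv2d(3, 32, 3, 1)(x))   # 28 -> 26
x = F.relu(nn.Conv2d(32, 64, 3, 1)(x))  # 26 -> 24
x = F.max_pool2d(x, 2)                  # 24 -> 12
print(torch.flatten(x, 1).shape)        # torch.Size([1, 9216]) = 64 * 12 * 12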

Now we have added all the layers. But we still need to define the flow of data from the input layer to the output layer (i.e., which layer should come after which).

Flow of Data in the Network

This section applies purely to PyTorch, as we need to add a forward method to the NeuralNet class.

In Keras, the order in which we add the layers describes the flow of data, and the arguments we pass to each layer define its behaviour.

num_classes = 10  # e.g., 10 digit classes for MNIST

model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
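Once the layers are in place, model.summary() is a convenient way to sanity-check the flow; it prints each layer with its output shape and parameter count:

model.summary()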

In PyTorch, we add a forward function that describes the order of the layers added in __init__:

class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)      # 2x2 max pooling
        x = torch.flatten(x, 1)     # flatten all dimensions except the batch
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)  # log-probabilities per class
        return output
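As a quick check that the flow is wired correctly, a dummy forward pass (assuming a 28×28, 3-channel input, matching the 9216 calculation above) should yield one log-probability per class:

model = NeuralNet()
out = model(torch.randn(1, 3, 28, 28))
print(out.shape)  # torch.Size([1, 10]): one log-probability for each of the 10 classes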

Fitting the Model to Input Data

In Keras we compile the model with a selected loss function and fit the model to the data. Epochs, optimizer and batch size are passed as parameters.

Epochs are the number of times we iterate the model through the entire dataset.

Batch size is the number of images fed to the network before the weights are updated. Batching is used to keep memory requirements manageable.

Different types of optimizer algorithms are available, and you can read about them here. In general, Adam is a popular choice.

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200)
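One caveat: categorical_crossentropy expects one-hot labels, and Conv2D expects a channel dimension, so the raw arrays from mnist.load_data() need a little preprocessing first. A minimal sketch:

from keras.utils import to_categorical

# add the channel dimension and scale pixel values to [0, 1]
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255
X_test = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255

# one-hot encode the integer labels for categorical_crossentropy
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)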

In PyTorch:

model = NeuralNet()
optimizer = optim.Adam(model.parameters())

for (i, l) in trainloader:      # i: batch of images, l: batch of labels
    optimizer.zero_grad()
    output = model(i)
    loss = F.nll_loss(output, l)
    loss.backward()
    optimizer.step()

nll_loss is the negative log likelihood loss. The combination of F.log_softmax() and F.nll_loss() is the same as the categorical cross-entropy function.
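This equivalence is easy to verify, since F.cross_entropy applies log_softmax and nll_loss in one call:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)          # a batch of 4 samples, 10 classes
labels = torch.randint(0, 10, (4,))

a = F.nll_loss(F.log_softmax(logits, dim=1), labels)
b = F.cross_entropy(logits, labels)  # log_softmax + nll_loss in one call
print(torch.allclose(a, b))          # True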

optimizer.zero_grad() clears the gradients from the previous batch.

loss.backward() calculates the gradients, and optimizer.step() then updates the weights.
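In practice, the loop above is wrapped in an outer epoch loop, and model.train() is called as the counterpart of the model.eval() used below. A sketch, with an arbitrary epoch count:

model.train()  # training mode; the counterpart of model.eval()
for epoch in range(10):
    running_loss = 0.0
    for images, labels in trainloader:
        optimizer.zero_grad()
        loss = F.nll_loss(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f'epoch {epoch}: average loss {running_loss / len(trainloader):.4f}')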

Evaluating the Model

Evaluating the model in Keras:

score = model.evaluate(X_test, y_test, verbose=0)
print(f'Test loss: {score[0]} / Test accuracy: {score[1]}')

For custom data in Keras, you can use the following functions:

model.fit_generator(training_set,
                    epochs=2,
                    validation_data=test_set,
                    verbose=1)

score = model.evaluate_generator(test_set)

In PyTorch:

model.eval()
test_loss = 0
correct = 0

with torch.no_grad():
    for data, target in testloader:
        output = model(data)
        test_loss += F.nll_loss(output, target, reduction='sum').item()
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()

test_loss /= len(testloader.dataset)

print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
    test_loss, correct, len(testloader.dataset),
    100. * correct / len(testloader.dataset)))

model.eval() tells the model that we are in the evaluation phase. This is needed because the behaviour of certain layers, such as Dropout, differs between training and testing.

torch.no_grad() turns off gradient calculation so that memory is conserved.

Final Thoughts

I feel I have more control over the flow of data when using PyTorch; for the same reason, it quickly became a favourite among researchers.

We will see implementations of GANs and autoencoders in later articles.

