Image Classification using CNN in PyTorch

Manish Kumar
Published in Analytics Vidhya · Jul 16, 2020 · 14 min read

In this article, we will discuss multiclass image classification using a CNN in PyTorch. We will use the Inception v3 deep learning architecture.

Inception architecture

Takeaways from this article

  • You will learn about a deeper convolutional neural network for image classification
  • Why PyTorch is more flexible compared to other frameworks
  • Use of Inception v3 with the CIFAR-10 dataset

What is a convolutional neural network?

In deep learning, a convolutional neural network is a class of deep neural networks, most commonly applied to analyzing visual imagery. They are also known as shift invariant or space invariant artificial neural networks, based on their shared-weights architecture and translation invariance characteristics.
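To make the "shared weights" idea concrete, here is a tiny PyTorch snippet (my own illustration, not part of the article's model): a single convolutional layer slides the same small kernels over every position of the image, so the number of weights depends on the kernel size, not on the image size.

import torch
import torch.nn as nn

# One 3x3 convolution over an RGB image: the same 16 kernels are applied at
# every spatial location, which is where translation invariance comes from.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

image = torch.randn(1, 3, 32, 32)                  # one CIFAR-10-sized image
features = conv(image)
print(features.shape)                              # torch.Size([1, 16, 32, 32])
print(sum(p.numel() for p in conv.parameters()))   # 3*3*3*16 weights + 16 biases = 448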

Dataset used: CIFAR-10

About the dataset:

CIFAR-10 is an established computer-vision dataset used for object recognition. It is a subset of the 80 million tiny images dataset and consists of 60,000 32x32 color images containing one of 10 object classes, with 6000 images per class. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

Kaggle is hosting a CIFAR-10 leaderboard for the machine learning community to use for fun and practice. You can see how your approach compares to the latest research methods on Rodrigo Benenson’s classification results page.

GitHub link for the code implementation: https://github.com/vatsmanish/Inception-v3-with-pytorch

The paper behind Inception v3:

There’s a simple but powerful way of creating better deep learning models. You can just make a bigger model, either in terms of deepness, i.e., number of layers, or the number of neurons in each layer. But as you can imagine, this can often create complications:

  • The bigger the model, the more prone it is to overfitting. This is particularly noticeable when the training data is small
  • Increasing the number of parameters means you need to increase your existing computational resources

A solution to this, as the paper suggests, is to move to sparsely connected network architectures that replace fully connected architectures, especially inside convolutional layers. This idea can be conceptualized in the images below:

Densely connected architecture

Sparsely connected architecture

This paper proposes a new idea of creating deep architectures. This approach lets you maintain the “computational budget”, while increasing the depth and width of the network. Sounds too good to be true! This is how the conceptualized idea looks:

Let us look at the proposed architecture in a bit more detail.

Proposed Architectural Details

The paper proposes a new type of architecture — GoogLeNet or Inception v1. It is basically a convolutional neural network (CNN) which is 27 layers deep. Below is the model summary:

Notice in the above image that there is a layer called inception layer. This is actually the main idea behind the paper’s approach. The inception layer is the core concept of a sparsely connected architecture.

Idea of an Inception module

Let me explain in a bit more detail what an inception layer is all about. Taking an excerpt from the paper:

“(Inception Layer) is a combination of all those layers (namely, 1×1 Convolutional layer, 3×3 Convolutional layer, 5×5 Convolutional layer) with their output filter banks concatenated into a single output vector forming the input of the next stage.”

Along with the above-mentioned layers, there are two major add-ons in the original inception layer:

  • 1×1 Convolutional layer before applying another layer, which is mainly used for dimensionality reduction
  • A parallel Max Pooling layer, which provides another option to the inception layer

Inception Layer
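To make the idea concrete, here is a minimal sketch of such a module (my own illustration; the channel counts are arbitrary, not the ones from the paper, and activations/batch norm are omitted for brevity): four parallel branches whose outputs are concatenated along the channel dimension, with 1x1 convolutions used as cheap dimensionality reductions before the larger kernels.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NaiveInceptionModule(nn.Module):
    """Parallel 1x1, 3x3 and 5x5 convolutions plus a max-pool branch,
    concatenated along the channel axis."""
    def __init__(self, in_channels):
        super().__init__()
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        # 1x1 reductions before the larger convolutions keep the cost down
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=1),
            nn.Conv2d(16, 24, kernel_size=3, padding=1),
        )
        self.branch5x5 = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=1),
            nn.Conv2d(8, 16, kernel_size=5, padding=2),
        )
        self.branch_pool = nn.Conv2d(in_channels, 8, kernel_size=1)

    def forward(self, x):
        pooled = F.max_pool2d(x, kernel_size=3, stride=1, padding=1)
        return torch.cat([
            self.branch1x1(x),
            self.branch3x3(x),
            self.branch5x5(x),
            self.branch_pool(pooled),
        ], dim=1)

x = torch.randn(1, 32, 28, 28)
print(NaiveInceptionModule(32)(x).shape)  # torch.Size([1, 64, 28, 28]): 16+24+16+8 channels

The filter sizes all work on the same input and their outputs stack side by side, which is exactly the "pick and choose the relevant filter size" behaviour described next.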

To understand the importance of the inception layer’s structure, the author calls on the Hebbian principle from human learning. This says that “neurons that fire together, wire together”. The author suggests that when creating a subsequent layer in a deep learning model, one should pay attention to the learnings of the previous layer.

Suppose, for example, a layer in our deep learning model has learned to focus on individual parts of a face. The next layer of the network would probably focus on the overall face in the image to identify the different objects present there. Now to actually do this, the layer should have the appropriate filter sizes to detect different objects.

This is where the inception layer comes to the fore. It allows the internal layers to pick and choose which filter size will be relevant to learn the required information. So even if the size of the face in the image is different (as seen in the images below), the layer works accordingly to recognize the face. For the first image, it would probably take a higher filter size, while it’ll take a lower one for the second image.

The overall architecture, with all the specifications, looks like this:

Implementation of Inception v3 on the CIFAR-10 dataset using PyTorch, with step-by-step code explanation

I have used Google Colab (GPU) for training the model and Google Colab (CPU) for testing.

1 — Import useful libraries and mount Google Drive.

from collections import namedtuple

import torch
import torch.nn as nn
import torch.nn.functional as F

__all__ = ['Inception3', 'inception_v3']

_InceptionOutputs = namedtuple('InceptionOutputs', ['logits', 'aux_logits'])
Mount Drive
from google.colab import drive
drive.mount('/content/drive')
Change the path
import os
if not os.path.exists('/content/drive/My Drive/Inception_CIFAR10/'):
    os.makedirs('/content/drive/My Drive/Inception_CIFAR10/')
os.chdir('/content/drive/My Drive/Inception_CIFAR10/')

2 — Inception v3 from scratch

def inception_v3(pretrained=False, **kwargs):
    if pretrained:
        if 'transform_input' not in kwargs:
            kwargs['transform_input'] = True
        if 'aux_logits' in kwargs:
            original_aux_logits = kwargs['aux_logits']
            kwargs['aux_logits'] = True
        else:
            original_aux_logits = True
        model = Inception3(**kwargs)
        if not original_aux_logits:
            model.aux_logits = False
        return model

    return Inception3(**kwargs)

3 — We build the model from scratch, so inception_v3() simply forwards its keyword arguments to Inception3. **kwargs lets a function accept a variable number of named (keyword) arguments and pass them on to another call unchanged. Here is an example to get you going with it:
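A quick toy illustration of the mechanism (my own example, not from the repository):

def describe(**kwargs):
    # All keyword arguments arrive bundled in a plain dict called kwargs
    for name, value in kwargs.items():
        print(name, '=', value)

describe(num_classes=10, aux_logits=True, transform_input=True)
# num_classes = 10
# aux_logits = True
# transform_input = True

The inception_v3() function above uses exactly this pattern to forward options such as num_classes, aux_logits and transform_input to Inception3.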

class Inception3(nn.Module):

    def __init__(self, num_classes=10, aux_logits=True, transform_input=True):
        super(Inception3, self).__init__()
        self.aux_logits = aux_logits
        self.transform_input = transform_input
        self.Conv2d_4a_3x3 = BasicConv2d(3, 32, kernel_size=3, padding=1)
        self.Mixed_5b = InceptionA(32, pool_features=8)
        self.Mixed_5c = InceptionA(64, pool_features=72)
        self.Mixed_6a = InceptionB(128)
        self.Mixed_6b = InceptionC(256, channels_7x7=64)
        if aux_logits:
            self.AuxLogits = InceptionAux(512, num_classes)
        self.Mixed_7a = InceptionD(512)
        self.fc = nn.Linear(768, num_classes)

        # truncated-normal initialization for conv/linear weights,
        # constant initialization for batch-norm parameters
        for m in self.modules():
            if isinstance(m, nn.Conv2d) or isinstance(m, nn.Linear):
                import scipy.stats as stats
                stddev = m.stddev if hasattr(m, 'stddev') else 0.1
                X = stats.truncnorm(-2, 2, scale=stddev)
                values = torch.as_tensor(X.rvs(m.weight.numel()), dtype=m.weight.dtype)
                values = values.view(m.weight.size())
                with torch.no_grad():
                    m.weight.copy_(values)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        global aux
        print(x.shape)
        x = self.Conv2d_4a_3x3(x)
        x = self.Mixed_5b(x)
        x = self.Mixed_5c(x)
        x = self.Mixed_6a(x)
        x = self.Mixed_6b(x)
        if self.training and self.aux_logits:
            # auxiliary classifier output, computed only during training
            aux = self.AuxLogits(x)
        x = self.Mixed_7a(x)
        x = F.adaptive_avg_pool2d(x, (1, 1))
        x = F.dropout(x, training=self.training)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        print(x.shape)
        if self.training and self.aux_logits:
            return _InceptionOutputs(x, aux)
        return x

4 — Here we have defined the model class and passed the number of classes, which is 10 for CIFAR-10. The auxiliary logits (aux_logits) are only returned in train() mode, so make sure the model is back in training mode before the next epoch; transform_input is a flag for optionally re-normalizing the input images (it is stored here but not used in this forward pass). Note that InceptionAux, the auxiliary classifier head used above, is not listed in this article; see the GitHub repository linked earlier for its definition.
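A small sketch of what that means in practice (my own example; it assumes the branch classes defined in the next step and the repository's InceptionAux head are already in scope): in training mode the forward pass returns a namedtuple with both outputs, while in eval mode it returns only the main logits.

criterion = nn.CrossEntropyLoss()
images = torch.randn(8, 3, 32, 32)            # dummy CIFAR-10-sized batch
labels = torch.randint(0, 10, (8,))

model = Inception3(num_classes=10)            # aux_logits=True by default

model.train()                                 # auxiliary head is active
outputs, aux_outputs = model(images)          # namedtuple: (logits, aux_logits)
loss = criterion(outputs, labels) + 0.4 * criterion(aux_outputs, labels)

model.eval()                                  # auxiliary head is skipped
with torch.no_grad():
    outputs = model(images)                   # plain logits tensor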

class InceptionA(nn.Module):

    def __init__(self, in_channels, pool_features):
        super(InceptionA, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 8, kernel_size=1)

        self.branch5x5_1 = BasicConv2d(in_channels, 8, kernel_size=1)
        self.branch5x5_2 = BasicConv2d(8, 16, kernel_size=5, padding=2)

        self.branch3x3dbl_1 = BasicConv2d(in_channels, 8, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(8, 16, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = BasicConv2d(16, 32, kernel_size=3, padding=1)

        self.branch_pool = BasicConv2d(in_channels, pool_features, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)

class InceptionB(nn.Module):

    def __init__(self, in_channels):
        super(InceptionB, self).__init__()
        self.branch3x3 = BasicConv2d(in_channels, 32, kernel_size=3, stride=2)

        self.branch3x3dbl_1 = BasicConv2d(in_channels, 32, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(32, 64, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = BasicConv2d(64, 96, kernel_size=3, stride=2)

    def forward(self, x):
        branch3x3 = self.branch3x3(x)

        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

        branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)

        outputs = [branch3x3, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)

class InceptionC(nn.Module):

    def __init__(self, in_channels, channels_7x7):
        super(InceptionC, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 128, kernel_size=1)

        c7 = channels_7x7
        self.branch7x7_1 = BasicConv2d(in_channels, c7, kernel_size=1)
        self.branch7x7_2 = BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7_3 = BasicConv2d(c7, 128, kernel_size=(7, 1), padding=(3, 0))

        self.branch7x7dbl_1 = BasicConv2d(in_channels, c7, kernel_size=1)
        self.branch7x7dbl_2 = BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7dbl_3 = BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7dbl_4 = BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7dbl_5 = BasicConv2d(c7, 128, kernel_size=(1, 7), padding=(0, 3))

        self.branch_pool = BasicConv2d(in_channels, 128, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch7x7 = self.branch7x7_1(x)
        branch7x7 = self.branch7x7_2(branch7x7)
        branch7x7 = self.branch7x7_3(branch7x7)

        branch7x7dbl = self.branch7x7dbl_1(x)
        branch7x7dbl = self.branch7x7dbl_2(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_3(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_4(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_5(branch7x7dbl)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch7x7, branch7x7dbl, branch_pool]
        return torch.cat(outputs, 1)

class InceptionD(nn.Module):

    def __init__(self, in_channels):
        super(InceptionD, self).__init__()
        self.branch3x3_1 = BasicConv2d(in_channels, 32, kernel_size=1)
        self.branch3x3_2 = BasicConv2d(32, 64, kernel_size=3, stride=2)

        self.branch7x7x3_1 = BasicConv2d(in_channels, 32, kernel_size=1)
        self.branch7x7x3_2 = BasicConv2d(32, 64, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7x3_3 = BasicConv2d(64, 128, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7x3_4 = BasicConv2d(128, 192, kernel_size=3, stride=2)

    def forward(self, x):
        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)

        branch7x7x3 = self.branch7x7x3_1(x)
        branch7x7x3 = self.branch7x7x3_2(branch7x7x3)
        branch7x7x3 = self.branch7x7x3_3(branch7x7x3)
        branch7x7x3 = self.branch7x7x3_4(branch7x7x3)

        branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)
        outputs = [branch3x3, branch7x7x3, branch_pool]
        return torch.cat(outputs, 1)

class BasicConv2d(nn.Module):

    def __init__(self, in_channels, out_channels, **kwargs):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels, eps=0.001)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return F.relu(x, inplace=True)
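Before training, it can help to sanity-check the tensor shapes. A minimal check (my addition, not from the repo; it assumes scipy is available for the weight initialization, as it is in Colab): build the network without the auxiliary head, so only the classes shown above are needed, and push one CIFAR-10-sized batch through it.

model_check = Inception3(num_classes=10, aux_logits=False)
model_check.eval()
with torch.no_grad():
    out = model_check(torch.randn(4, 3, 32, 32))   # the forward also prints the shapes
print(out.shape)                                   # expected: torch.Size([4, 10])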

5 — Train the model

# For incremental (resumed) training, comment out the fresh model line and load the saved weights instead (see main() below).

import time
import torchvision
from torchvision import transforms

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

WORK_DIR = './data'
NUM_EPOCHS = 100
BATCH_SIZE = 32
#LEARNING_RATE = 0.01

MODEL_PATH = './model'
MODEL_NAME = 'Inception_v3.pth'

# Create the model directory
if not os.path.exists(MODEL_PATH):
    os.makedirs(MODEL_PATH)

# Augmentations
transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    #torchvision.transforms.ColorJitter(brightness=0.2, contrast=0.4, saturation=0.5, hue=0.1),
    transforms.RandomHorizontalFlip(),
    torchvision.transforms.RandomVerticalFlip(),
    #torchvision.transforms.RandomAffine(degrees=0, translate=(0.2, 0.2), scale=None, shear=50, resample=False, fillcolor=0),
    torchvision.transforms.RandomRotation(20, resample=False, expand=False, center=None),
    transforms.ToTensor(),
    transforms.Normalize([0.4913997551666284, 0.48215855929893703, 0.4465309133731618],
                         [0.24703225141799082, 0.24348516474564, 0.26158783926049628])
])

# Load data
dataset = torchvision.datasets.CIFAR10(root=WORK_DIR,
                                       download=True,
                                       train=True,
                                       transform=transform)

dataset_loader = torch.utils.data.DataLoader(dataset=dataset,
                                             batch_size=BATCH_SIZE,
                                             shuffle=True)
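The six numbers passed to transforms.Normalize are the per-channel mean and standard deviation of the CIFAR-10 training images. A small sketch (my addition, reusing WORK_DIR and the imports from the block above) that recomputes them:

raw = torchvision.datasets.CIFAR10(root=WORK_DIR, train=True, download=True,
                                   transform=transforms.ToTensor())
raw_loader = torch.utils.data.DataLoader(raw, batch_size=1000)

channel_sum = torch.zeros(3)
channel_sq_sum = torch.zeros(3)
num_pixels = 0
for batch, _ in raw_loader:
    channel_sum += batch.sum(dim=(0, 2, 3))
    channel_sq_sum += (batch ** 2).sum(dim=(0, 2, 3))
    num_pixels += batch.numel() // 3

mean = channel_sum / num_pixels
std = (channel_sq_sum / num_pixels - mean ** 2).sqrt()
print(mean)   # roughly [0.4914, 0.4822, 0.4465]
print(std)    # roughly [0.2470, 0.2435, 0.2616]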

6 — Now calculate the number of parameters in the model

# Total parameters
model = inception_v3().to(device)
pytorch_total_params = sum(p.numel() for p in model.parameters())
pytorch_total_params
2534260

def main():
    print(f"Train numbers:{len(dataset)}")
    LEARNING_RATE = 0.001
    MOMENTUM = 0.9
    # for the first training run, use this line instead of loading saved weights
    #model = inception_v3().to(device)
    print(model)
    #model_save_name = 'Inception_v3e1.pth'
    model.load_state_dict(torch.load(MODEL_NAME))
    # Load model
    #if device == 'cuda':
    #    model = torch.load(MODEL_PATH + MODEL_NAME).to(device)
    #else:
    #    model = torch.load(MODEL_PATH + MODEL_NAME, map_location='cpu')

    # Loss function
    cast = torch.nn.CrossEntropyLoss().to(device)
    # Optimizer
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=LEARNING_RATE,
        momentum=MOMENTUM)
    step = 1
    loss_values = []
    for epoch in range(1, NUM_EPOCHS + 1):
        print(loss_values)
        model.train()
        running_loss = 0.0

        # time one epoch
        start = time.time()
        correct = 0
        total = 0
        for images, labels in dataset_loader:
            images = images.to(device)
            print(images.shape)
            labels = labels.to(device)

            # main and auxiliary outputs; the auxiliary loss is weighted by 0.4
            outputs, aux_outputs = model(images)
            loss1 = cast(outputs, labels)
            loss2 = cast(aux_outputs, labels)
            loss = loss1 + 0.4 * loss2
            running_loss += loss.item() * images.size(0)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            print("epoch: ", epoch)
            print(f"Step [{step * BATCH_SIZE}/{NUM_EPOCHS * len(dataset)}], "
                  f"Loss: {loss.item():.8f}.")
            print("Running Loss=", running_loss)
            step += 1
            # compare predictions and labels for the running accuracy
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

            print(f"Acc: {correct / total:.4f}.")

        # time taken by one training epoch
        end = time.time()
        loss_values.append(running_loss / len(dataset_loader))

        print(f"Epoch [{epoch}/{NUM_EPOCHS}], "
              f"time: {end - start} sec!")

        # Save a model checkpoint every 20 epochs
        if epoch % 20 == 0:
            # LEARNING_RATE = LEARNING_RATE / 10
            # torch.save(model, MODEL_PATH + '/' + MODEL_NAME)
            model_save_name = 'Inception_v3_CIFAR10_32BATCH_lr0.001_crop_bflip_rot' + str(epoch) + '.pth'  # we keep changing this name and saving states
            torch.save(model.state_dict(), model_save_name)
            print("epoch completed and model copy completed")

    torch.save(model, MODEL_NAME)
    print(f"Model save to {MODEL_PATH + '/' + MODEL_NAME}.")

if __name__ == '__main__':
    main()
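The loop above collects the average loss of each epoch in loss_values but never plots it. A small optional sketch (my addition): return loss_values from main(), or make it global, and pass it to a helper like this to see the training curve.

import matplotlib.pyplot as plt

def plot_losses(loss_values):
    # loss_values is the per-epoch list built inside main()
    plt.plot(range(1, len(loss_values) + 1), loss_values)
    plt.xlabel('epoch')
    plt.ylabel('average training loss')
    plt.title('Inception v3 on CIFAR-10')
    plt.show()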

7 — Training accuracy

torch.Size([64, 3, 32, 32])
torch.Size([64, 3, 32, 32])
torch.Size([64, 10])
epoch: 96
Step [959296/1000000], Loss: 0.01983919.
Running Loss= 1.2697083950042725
Acc: 0.9987.
torch.Size([64, 3, 32, 32])
torch.Size([64, 3, 32, 32])
torch.Size([64, 10])
epoch: 96
Step [959360/1000000], Loss: 0.01000556.
Running Loss= 0.6403558254241943
Acc: 0.9988.
torch.Size([64, 3, 32, 32])
torch.Size([64, 3, 32, 32])
torch.Size([64, 10])
epoch: 96
Step [959424/1000000], Loss: 0.01426543.
Running Loss= 0.9129873514175415
Acc: 0.9988.
torch.Size([64, 3, 32, 32])
torch.Size([64, 3, 32, 32])
torch.Size([64, 10])
epoch: 96
Step [959488/1000000], Loss: 0.01021658.
Running Loss= 0.6538611650466919
Acc: 0.9988.
torch.Size([64, 3, 32, 32])
torch.Size([64, 3, 32, 32])
torch.Size([64, 10])
epoch: 96
Step [959552/1000000], Loss: 0.01176894.
Running Loss= 0.7532118558883667
Acc: 0.9988.
...

The model produces a large amount of logging per step and per epoch. If you want to see the per-step loss for the whole run, you can go through my GitHub repository.

8 — Test the model

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

WORK_DIR = './data'
BATCH_SIZE = 32

#MODEL_PATH = './model'
#MODEL_NAME = 'Inception_v3.pth'
#MODEL_NAME = "Inception_v3_CIFAR10_32SIZE_512BATCH_102_lr_low_para10.pth"
MODEL_NAME = "Inception_v3_CIFAR10_32BATCH_lr0.001_crop_bflip_rot100.pth"

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.4913997551666284, 0.48215855929893703, 0.4465309133731618],
                         [0.24703225141799082, 0.24348516474564, 0.26158783926049628])
])

9 — Load the validation dataset

dataset = torchvision.datasets.CIFAR10(root=WORK_DIR,
                                       download=True,
                                       train=False,
                                       transform=transform)

dataset_loader = torch.utils.data.DataLoader(dataset=dataset,
                                             batch_size=BATCH_SIZE,
                                             shuffle=True)

Code to run validation and get the validation accuracy

def main():
    print(f"Val numbers:{len(dataset)}")
    #model = inception_v3().to(device)
    print(model)
    # Load the trained weights (use map_location so a GPU-trained
    # checkpoint can also be loaded on a CPU-only machine)
    if device.type == 'cuda':
        #model = torch.load(MODEL_PATH+"/"+MODEL_NAME).to(device)
        model.load_state_dict(torch.load(MODEL_NAME))
    else:
        #model = torch.load(MODEL_PATH+"/"+MODEL_NAME, map_location='cpu')
        model.load_state_dict(torch.load(MODEL_NAME, map_location='cpu'))
    model.eval()

    correct = 0.
    total = 0
    for images, labels in dataset_loader:
        # move the batch to the selected device
        images = images.to(device)
        labels = labels.to(device)
        # forward pass
        outputs = model(images)
        # compare predictions and labels
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print(f"Acc: {correct / total:.4f}.")

if __name__ == '__main__':
    main()
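If you want more detail than a single accuracy number, a possible extension (my addition, reusing model, device and dataset_loader from above) is per-class accuracy; the class names follow CIFAR-10's label order.

classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')
correct_per_class = torch.zeros(10)
total_per_class = torch.zeros(10)

model.eval()
with torch.no_grad():
    for images, labels in dataset_loader:
        images, labels = images.to(device), labels.to(device)
        _, predicted = torch.max(model(images), 1)
        for label, pred in zip(labels, predicted):
            total_per_class[label.item()] += 1
            correct_per_class[label.item()] += int(label == pred)

for name, c, t in zip(classes, correct_per_class, total_per_class):
    print(f"{name}: {c.item() / t.item():.4f}")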

10 — Model Summary

Val numbers:10000
Inception3(
  (Conv2d_4a_3x3): BasicConv2d(
    (conv): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (Mixed_5b): InceptionA(
    (branch1x1): BasicConv2d(
      (conv): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(8, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch5x5_1): BasicConv2d(
      (conv): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(8, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch5x5_2): BasicConv2d(
      (conv): Conv2d(8, 16, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
      (bn): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_1): BasicConv2d(
      (conv): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(8, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_2): BasicConv2d(
      (conv): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_3): BasicConv2d(
      (conv): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch_pool): BasicConv2d(
      (conv): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(8, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (Mixed_5c): InceptionA(
    (branch1x1): BasicConv2d(
      (conv): Conv2d(64, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(8, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch5x5_1): BasicConv2d(
      (conv): Conv2d(64, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(8, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch5x5_2): BasicConv2d(
      (conv): Conv2d(8, 16, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
      (bn): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_1): BasicConv2d(
      (conv): Conv2d(64, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(8, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_2): BasicConv2d(
      (conv): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_3): BasicConv2d(
      (conv): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch_pool): BasicConv2d(
      (conv): Conv2d(64, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(72, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (Mixed_6a): InceptionB(
    (branch3x3): BasicConv2d(
      (conv): Conv2d(128, 32, kernel_size=(3, 3), stride=(2, 2), bias=False)
      (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_1): BasicConv2d(
      (conv): Conv2d(128, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_2): BasicConv2d(
      (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3dbl_3): BasicConv2d(
      (conv): Conv2d(64, 96, kernel_size=(3, 3), stride=(2, 2), bias=False)
      (bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (Mixed_6b): InceptionC(
    (branch1x1): BasicConv2d(
      (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7_1): BasicConv2d(
      (conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7_2): BasicConv2d(
      (conv): Conv2d(64, 64, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7_3): BasicConv2d(
      (conv): Conv2d(64, 128, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7dbl_1): BasicConv2d(
      (conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7dbl_2): BasicConv2d(
      (conv): Conv2d(64, 64, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7dbl_3): BasicConv2d(
      (conv): Conv2d(64, 64, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7dbl_4): BasicConv2d(
      (conv): Conv2d(64, 64, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7dbl_5): BasicConv2d(
      (conv): Conv2d(64, 128, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch_pool): BasicConv2d(
      (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (AuxLogits): InceptionAux(
    (conv0): BasicConv2d(
      (conv): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (conv1): BasicConv2d(
      (conv): Conv2d(128, 512, kernel_size=(5, 5), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (fc): Linear(in_features=512, out_features=10, bias=True)
  )
  (Mixed_7a): InceptionD(
    (branch3x3_1): BasicConv2d(
      (conv): Conv2d(512, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch3x3_2): BasicConv2d(
      (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7x3_1): BasicConv2d(
      (conv): Conv2d(512, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7x3_2): BasicConv2d(
      (conv): Conv2d(32, 64, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7x3_3): BasicConv2d(
      (conv): Conv2d(64, 128, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
      (bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
    (branch7x7x3_4): BasicConv2d(
      (conv): Conv2d(128, 192, kernel_size=(3, 3), stride=(2, 2), bias=False)
      (bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (fc): Linear(in_features=768, out_features=10, bias=True)
)
torch.Size([32, 3, 32, 32])
torch.Size([32, 10])
torch.Size([32, 3, 32, 32])
torch.Size([32, 10])
...
torch.Size([16, 3, 32, 32])
torch.Size([16, 10])
Model accuracy on 10000 test images: 99.95%

Here we got 99.95% accuracy on the test images. Some additional work, such as the data augmentation experiments, is not explained here, so please go through my GitHub repo, where you can also fork the model and deploy it.

Model deployment part:

https://github.com/vatsmanish/Deploy_Inception_v3

This article is getting long, so I will cover the deployment of the model in the next part.

References: "Rethinking the Inception Architecture for Computer Vision" (https://arxiv.org/pdf/1512.00567.pdf)

My other article on Medium: https://medium.com/@manish.kumar_61520/introduction-to-vector-autoregression-6ec386db387e
