Introduction to Computer Vision with PyTorch (2/6)

The V Notebook
10 min read · Sep 1, 2023

Previous << Introduction to Computer Vision with PyTorch (1/6)

In this unit, we start with the simplest possible approach for image classification — a fully-connected neural network, which is also called a perceptron. We will recap the way neural networks are defined in PyTorch, and how the training algorithm works.

First, we use the pytorchcv helper module to load the data.

# Download the helper module used in this series (notebook shell command)
!wget https://raw.githubusercontent.com/MicrosoftDocs/pytorchfundamentals/main/computer-vision-pytorch/pytorchcv.py
import torch
import torch.nn as nn
import torchvision
import matplotlib.pyplot as plt
from torchinfo import summary

# Load the MNIST dataset through the helper
from pytorchcv import load_mnist, plot_results
load_mnist()

Fully Connected Dense Neural Networks

A basic neural network in PyTorch consists of a number of layers. The simplest network contains just one fully connected layer, called a Linear layer, with 784 inputs (one for each pixel of the input image) and 10 outputs (one for each class).

As discussed earlier, the dimension of our digit images is 1 × 28 × 28, i.e., each image contains 28 × 28 = 784 pixels. Because a linear layer expects its input as a one-dimensional vector, we need to insert another layer into the network, called Flatten, to change the input tensor shape from 1 × 28 × 28 to 784. After Flatten, there is the main linear layer (nn.Linear in PyTorch, known as a Dense layer in some other frameworks such as Keras) that converts…
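
The Flatten-then-Linear structure described above can be sketched roughly as follows. This is a minimal illustration rather than the exact network built later in the article: the batch of random tensors is stand-in data (not MNIST), the variable names are made up for this example, and any output activation is deliberately left out.

```python
import torch
import torch.nn as nn

# One-layer classifier sketch:
# Flatten reshapes each 1x28x28 image into a 784-element vector,
# Linear maps those 784 inputs to 10 outputs (one per digit class).
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 10),
)

# Hypothetical stand-in batch: 4 random "images" shaped like MNIST digits.
batch = torch.rand(4, 1, 28, 28)
logits = net(batch)

# Shapes after flattening and after the linear layer (batch dimension is kept).
flat_shape = tuple(nn.Flatten()(batch).shape)
out_shape = tuple(logits.shape)

# Trainable parameters: 784 * 10 weights + 10 biases.
n_params = sum(p.numel() for p in net.parameters())

print(flat_shape)  # (4, 784)
print(out_shape)   # (4, 10)
print(n_params)    # 7850
```

Note that Flatten keeps the batch dimension and only collapses the remaining ones, so the linear layer sees one 784-element row per image.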
