My lightning experience with PyTorch

Deepanshi Sharma
Google Developer Student Clubs TIET
6 min read · Jul 4, 2022

Imagine a fine evening: it's drizzling outside, and you are sitting beside a window, exploring Python (the language, not the snake!) while sipping hot coffee.

You are going through all the articles, blogs, and videos that spark your curiosity when suddenly lightning strikes and you come across an excellent blog about a library called PyTorch.

PyTorch is a deep learning library for training neural networks using optimised tensors. Plus, it's open source!

source:pytorch.org

Facebook released PyTorch in 2016. Python programmers should find it simpler to use than other deep learning frameworks because its Python interface is more polished and more narrowly focused, though PyTorch also comes with a C++ frontend. On top of that, PyTorch makes it easier to tailor your code to a particular machine learning problem.

PyTorch and its rival TensorFlow both employ tensors, but their main distinction is that PyTorch uses dynamic computation graphs while TensorFlow has traditionally used static ones, which gives PyTorch an extra edge when building complex architectures.
Initially, PyTorch used Visdom as its visualisation package, but it has since added full support for TensorBoard.
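
To see what "dynamic" buys you in practice, here is a minimal sketch (the toy module CoinFlipNet and its layers are invented purely for illustration): because the graph is rebuilt on every forward pass, you can use plain Python control flow such as if statements inside forward, and autograd still keeps up.

import random
import torch
import torch.nn as nn

# A toy module whose forward pass branches with ordinary Python control flow.
# PyTorch records operations as they run, so each call may take a different
# path through the code and gradients still flow correctly.
class CoinFlipNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        if random.random() > 0.5:   # branch decided at runtime
            x = torch.relu(self.linear(x))
        return self.linear(x)

net = CoinFlipNet()
out = net(torch.rand(1, 4)).sum()
out.backward()                      # gradients flow through whichever path ran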

The more you scroll, the more curious you grow, and you finally decide to install PyTorch.

Installation

To install PyTorch locally, head over to the official website, select your configuration, and run the generated command in your terminal.

Note: PyTorch supports both GPU and CPU. The GPU-enabled variant includes support for CUDA-accelerated features, whereas the CPU-only variant omits the functions that would require a GPU.

For Conda,

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

For Pip,

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
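
Whichever route you pick, a quick sanity check (a small sketch, not part of the official install commands) tells you whether PyTorch imports correctly and whether the GPU build can actually see a GPU:

import torch

# Use the GPU if the CUDA build of PyTorch found one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

x = torch.rand(3, 3).to(device)   # the same code runs on either device
print(x)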

Your struggles with Pip and Conda lead you to try something simpler, and Google Colab is the answer!

Come on, let’s hop on to Google Colab (if you too are intimidated by local Python environments).

Importing necessary modules

import torch
import math
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim   # optimizers, used in the Optim Module section below
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms

Tensor?

In layman’s terms, a tensor is nothing but a multi-dimensional matrix whose values all share a single (homogeneous) data type.
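
For instance, a single number, a vector, and a matrix are all tensors; they just have different numbers of dimensions (a small illustrative snippet, using the imports from above):

scalar = torch.tensor(3.14)                # 0-dimensional tensor
vector = torch.tensor([1.0, 2.0])          # 1-dimensional tensor
matrix = torch.tensor([[1, 2], [3, 4]])    # 2-dimensional tensor

print(scalar.shape, vector.shape, matrix.shape)
print(matrix.dtype)                        # every element shares a single dtype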

Simple Mathematical Operations on a tensor,

p = torch.empty(2, 2, 3)    # uninitialised tensor of shape (2, 2, 3)
print(p)
x = torch.rand(2, 2)        # random values in [0, 1)
y = torch.rand(2, 2)
print(x)
print(y)
z = x + y                   # element-wise addition
z = torch.add(x, y)         # same operation as a function call
print(z)
# y.add_(x)                 # in-place addition (the trailing underscore means in-place)
# print(y)

q = torch.mul(x, y)         # element-wise multiplication
print(q)

Using Random and reshaping with view,

x = torch.rand(4, 4)
print(x)
y = x.view(-1, 8)   # reshape the 4x4 tensor into 2x8; -1 lets PyTorch infer that dimension
print(y)

Tensor to Numpy Array

a = torch.ones(5)
print(a)
b = a.numpy()    # tensor to numpy array
print(type(b))
a.add_(1)        # an in-place change to the tensor...
print(a)
print(b)         # ...is visible in the numpy array too: they share memory on the CPU

Numpy Array to Tensor

a = np.ones(5)
print(a)
b = torch.from_numpy(a)   # numpy array to tensor
print(b)
a += 1                    # modify the array in place...
print(a)
print(b)                  # ...and the tensor changes too: they share the same memory

Gentle Warning: Now, as you move closer to the complexities, you decide to grab another cup of coffee to keep yourself from falling asleep.

Autograd Module

To calculate gradients, the autograd module works like a recorder: it records the operations carried out during the forward pass and then replays them in reverse to compute the gradients. This technique is particularly effective for building neural networks, since it spares you from deriving and coding the parameter gradients by hand.

weights = torch.ones(4, requires_grad=True)

for epoch in range(2):
    model_output = (weights * 3).sum()
    model_output.backward()    # compute d(model_output)/d(weights)
    print(weights.grad)
    weights.grad.zero_()       # reset the gradients, otherwise they accumulate across epochs

Optim Module

A PyTorch package called torch.optim contains different optimization techniques. Most of the regularly used optimizers are already supported, and the interface is straightforward enough that more complex ones can easily be integrated in the future.

SGD_optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.7)
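
Every optimizer in torch.optim is used with the same three-call rhythm inside a training loop: zero the old gradients, backpropagate, then step. Here is a minimal self-contained sketch of that pattern (the toy data and the tiny linear model are invented purely for illustration):

# Toy data and model, just to demonstrate the optimizer's usage pattern.
data = torch.rand(16, 4)
labels = torch.rand(16, 1)
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()

optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.7)

for step in range(5):
    optimizer.zero_grad()                  # clear gradients from the previous step
    loss = loss_fn(model(data), labels)    # forward pass
    loss.backward()                        # compute gradients
    optimizer.step()                       # update the parameters
    print(loss.item())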

nn Module

PyTorch makes it simple to define computational graphs and take gradients, but raw autograd can be a little too low-level for creating sophisticated neural networks. This is where the nn module comes in handy: it offers layers and tools to quickly design neural networks by simply specifying the layers of the network.
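
For example, a small fully connected network can be declared in just a few lines with the layers the nn module provides (a minimal sketch; the layer sizes here are arbitrary):

# A tiny two-layer network: 4 inputs -> 8 hidden units -> 1 output.
net = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)

out = net(torch.rand(2, 4))   # a batch of 2 samples with 4 features each
print(out.shape)              # torch.Size([2, 1])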

Forward Propagation

In forward propagation, the Neural Network tries to predict the right output as accurately as possible. To arrive at this estimation, it passes the supplied data through all of its functions.

source:jovian.ai
prediction = model(data)

Back Propagation

Backprop involves adjusting the NN’s parameters in accordance with the error in its guess. This is achieved by travelling backwards from the output, taking the derivatives of the error with respect to the function parameters (the gradients), and optimising the parameters using gradient descent.

source:jovian.ai
loss = (prediction - labels).sum()
loss.backward()

Gradient Descent

In the simplest terms, Gradient Descent is an optimization algorithm that calculates the derivative (gradient) of the loss function, updates the weights in the opposite direction of that gradient, and repeats this until it reaches a minimum of the loss function, thereby reducing the loss.
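
In code, one gradient-descent step is nothing more than w = w - lr * dw, where dw is the gradient of the loss with respect to the weight w and lr is the learning rate. A purely manual sketch on the toy function f(x) = w * x (the numbers are illustrative) looks like this:

# Manual gradient descent for f(x) = w * x with a squared-error loss.
w = 0.0
lr = 0.01
x, y = 2.0, 4.0                   # one training example; the true w is 2

for _ in range(100):
    y_pred = w * x
    dw = 2 * x * (y_pred - y)     # d/dw of (y_pred - y)**2
    w = w - lr * dw               # the gradient-descent update
print(w)                          # converges towards 2.0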

Gradient Descent was difficult for me until I stumbled onto a helpful PyTorch lesson (link in the references), so here are a few steps to make it easier for you.

  1. Prediction: PyTorch model
  2. Gradient Computation: Autograd
  3. Loss Computation: PyTorch loss
  4. Parameters Update: PyTorch optimizer
X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)

X_test = torch.tensor([5], dtype=torch.float32)
n_samples, n_features = X.shape
print(n_samples, n_features)

input_size = n_features
output_size = n_features
# model = nn.Linear(input_size, output_size)

class LinearRegression(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegression, self).__init__()
        # define layers
        self.lin = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        return self.lin(x)

model = LinearRegression(input_size, output_size)
print(f'Prediction before training: f(5) = {model(X_test).item():.3f}')

# Training
learning_rate = 0.01
n_iters = 100

loss = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for epoch in range(n_iters):
    # prediction = forward pass
    y_pred = model(X)

    # loss
    l = loss(Y, y_pred)

    # gradients = backward pass
    # dw = gradient(X, Y, y_pred)
    l.backward()  # dl/dw

    # update weights
    optimizer.step()

    # zero gradients
    optimizer.zero_grad()

    if epoch % 10 == 0:
        [w, b] = model.parameters()
        print(f'epoch {epoch+1}: w = {w[0][0].item():.3f}, loss = {l:.8f}')

print(f'Prediction after training: f(5) = {model(X_test).item():.3f}')

Linear Regression

Ah! Here it finally comes. Linear Regression rules machine learning with just y = mx + c. It is an attractive model with a simple representation, yet it has become a proven way to scientifically and reliably predict the future.

Now let’s prepare our data first,

X_numpy, y_numpy = datasets.make_regression(n_samples=100, n_features=1, noise=20, random_state=1)

# cast to float tensor
X = torch.from_numpy(X_numpy.astype(np.float32))
y = torch.from_numpy(y_numpy.astype(np.float32))
y = y.view(y.shape[0], 1)

n_samples, n_features = X.shape   # so that n_features below refers to this dataset

Linear Model (y=mx+c),

input_size = n_features
output_size = 1
model = nn.Linear(input_size, output_size)

Defining the loss and the optimiser,

learning_rate = 0.01

criterion = nn.MSELoss() # Mean Square Error
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate) # SGD = Stochastic Gradient Descent

The training loop,

num_epochs = 100
for epoch in range(num_epochs):
    # Forward pass and loss
    y_predicted = model(X)
    loss = criterion(y_predicted, y)

    # Backward pass and update
    loss.backward()
    optimizer.step()

    # zero grad before new step
    optimizer.zero_grad()

    if (epoch+1) % 10 == 0:
        print(f'epoch: {epoch+1}, loss = {loss.item():.4f}')

Plotting,

predicted = model(X).detach().numpy()

plt.plot(X_numpy, y_numpy, 'ro')
plt.plot(X_numpy, predicted, 'b')
plt.show()
Output Graph

Activation Functions

An activation function decides whether a neuron should be activated or not by computing the weighted sum of its inputs and adding a bias to it. Its purpose is to introduce non-linearity into the output of a neuron.

Initialising a tensor,

d = torch.tensor([-1.0, 2.0, 3.0, 4.0])

SoftMax,

source:researchgate
output = torch.softmax(d, dim=0)
print(output)
sm = nn.Softmax(dim=0)
h = sm(d)
print(h)

Sigmoid,

source: geeksforgeeks
output = torch.sigmoid(d)
print(output)
s = nn.Sigmoid()
h = s(d)
print(h)

Tanh

source: wolfram.mathworld
output = torch.tanh(d)
print(output)
t = nn.Tanh()
h = t(d)
print(h)

ReLU,

source: machinelearningmastery.com
output = torch.relu(d)
print(output)
relu = nn.ReLU()
h = relu(d)
print(h)

Leaky ReLU,

source: machinelearningmastery
output = F.leaky_relu(d)
print(output)
lrelu = nn.LeakyReLU()
h = lrelu(d)
print(h)

You must be pretty tough if you have read all the way to here. Now jump over to my repository to explore more code!

References

For me, the excellent blog mentioned in the story was Why Pytorch is the deep learning framework of the future.

The video that saved me while learning Gradient Descent was Deep Learning with Pytorch.

Additional references: Wikipedia and the official PyTorch website.
