Feedforward neural networks are also known as Multi-layered Networks of Neurons (MLN). These models are called feedforward because information travels only forward through the network: through the input nodes, then through the hidden layers (one or many), and finally through the output nodes.
Traditional models such as the McCulloch-Pitts neuron, the Perceptron and the Sigmoid neuron are limited to linear decision boundaries. To handle a complex non-linear decision boundary between the input and the output, we use a Multi-layered Network of Neurons.
Citation Note: The content and the structure of this article are based on the deep learning lectures from One-Fourth Labs (PadhAI).
In this post, we will discuss how to build a feed-forward neural network using PyTorch, and we will do it incrementally. First, we will generate non-linearly separable data with four classes. Then we will build a simple feedforward neural network using only PyTorch tensor functionality. After that, we will use the abstractions available in PyTorch, such as Functional, Sequential, Linear and Optim, to make our neural network concise, flexible and efficient. Finally, we will move our network to CUDA and see how fast it performs.
Note: This tutorial assumes you already have PyTorch installed on your local machine or know how to use PyTorch in Google Colab with CUDA support, and that you are familiar with the basics of tensor operations. If you are not familiar with these concepts, kindly refer to my previous post linked below.
Getting Started With Pytorch In Google Collab With Free GPU
Learn Pytorch in Colab With CUDA Support
The rest of the article is structured as follows:
- Import libraries
- Generate non-linearly separable data
- Feedforward network using tensors and auto-grad
- Train our feedforward network
- NN.Linear and Optim
- Moving the Network to GPU
If you want to skip the theory part and get into the code right away, all the code files related to the deep learning course from PadhAI are in the GitHub repository Niranjankumar-c/DeepLearning-PadhAI.
Before we start building our network, we need to import the required libraries. We import numpy for matrix multiplication and dot products between vectors, matplotlib to visualize the data, and functions from the sklearn package to generate data and evaluate the network's performance. We import torch for everything related to PyTorch.
#required libraries
import math
import torch
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, mean_squared_error, log_loss
from tqdm import tqdm_notebook
from IPython.display import HTML
from sklearn.preprocessing import OneHotEncoder
from sklearn.datasets import make_blobs
Generate non-linearly separable data
In this section, we will see how to randomly generate non-linearly separable data using the make_blobs function from sklearn.
#generate data using make_blobs function from sklearn.
#centers=4 creates four blobs, one per class
data, labels = make_blobs(n_samples=1000, centers=4, n_features=2, random_state=0)
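#visualize the data
#my_cmap is not defined in this excerpt; a built-in matplotlib colormap is assumed here
my_cmap = "viridis"
plt.scatter(data[:,0], data[:,1], c=labels, cmap=my_cmap)
plt.show()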
#splitting the data into train and test
X_train, X_val, Y_train, Y_val = train_test_split(data, labels, stratify=labels, random_state=0)
print(X_train.shape, X_val.shape, labels.shape)
To generate data randomly, we use make_blobs, which produces blobs of points with a Gaussian distribution. I have generated 1000 data points in 2D space with four blobs (centers=4) as a multi-class classification problem. Each data point has two inputs and a class label of 0, 1, 2 or 3.
Once the data is ready, I use the train_test_split function to split it into training and validation sets in the ratio 75:25.
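As a quick sanity check on the split described above, the following sketch repeats the same two calls and prints the split sizes and per-class counts. It assumes only the sklearn and numpy calls already used in this post; the use of np.bincount for counting is my addition.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# Same call as in the article: 1000 points, 4 blobs, 2 features.
data, labels = make_blobs(n_samples=1000, centers=4, n_features=2, random_state=0)

# train_test_split holds out 25% by default; stratify keeps the class ratios
# the same in both splits.
X_train, X_val, Y_train, Y_val = train_test_split(
    data, labels, stratify=labels, random_state=0
)

print(X_train.shape, X_val.shape)  # sizes of the two splits
print(np.bincount(Y_train))        # per-class counts in the training split
print(np.bincount(Y_val))          # per-class counts in the validation split
```

Because the split is stratified, each of the four classes should appear in roughly equal numbers in both sets.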
Feedforward network using tensors and auto-grad
In this section, we will see how to build and train a simple neural network using PyTorch tensors and auto-grad. The network has six neurons in total: two in the hidden layer and four in the output layer. For each of these neurons, the pre-activation is represented by 'a' and the post-activation by 'h'. The network has a total of 18 parameters: 12 weights (2 x 2 for the hidden layer plus 2 x 4 for the output layer) and 6 bias terms.
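The parameter count can be checked with a few lines of plain Python. This is only an illustrative sketch; the name layer_sizes is hypothetical and not part of the article's code.

```python
# Layer widths: 2 inputs -> 2 hidden neurons -> 4 output neurons.
layer_sizes = [2, 2, 4]

# One weight per (input, neuron) pair between consecutive layers.
n_weights = sum(
    n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
)
# One bias per neuron in every non-input layer.
n_biases = sum(layer_sizes[1:])

print(n_weights, n_biases, n_weights + n_biases)  # 12 6 18
```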
We will use the map function to convert the numpy arrays to PyTorch tensors efficiently.
#converting the numpy array to torch tensors
X_train, Y_train, X_val, Y_val = map(torch.tensor, (X_train, Y_train, X_val, Y_val))
print(X_train.shape, Y_train.shape)
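One detail worth knowing about this conversion is that torch.tensor keeps the numpy dtype, so float64 inputs stay float64 until they are cast. The sketch below uses small made-up arrays (X_np and Y_np are hypothetical stand-ins for the real data) to show this.

```python
import numpy as np
import torch

# Small stand-ins for the real arrays (values are made up for illustration).
X_np = np.array([[0.5, -1.2], [2.0, 0.3]])  # float64 by default in numpy
Y_np = np.array([0, 3], dtype=np.int64)     # integer class labels

X_t, Y_t = map(torch.tensor, (X_np, Y_np))
print(X_t.dtype, Y_t.dtype)  # dtypes are carried over from numpy

# Cast the inputs to float32, the default dtype of tensors made by torch.randn.
X_t = X_t.float()
print(X_t.dtype)
```

This is why the inputs are cast with .float() before being multiplied by weights created with torch.randn: matmul expects both operands to share a dtype.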
After converting the data to tensors, we need to write a function that helps us to compute the forward pass for the network.
#function for computing forward pass in the network
def model(x):
    A1 = torch.matmul(x, weights1) + bias1 # (N, 2) x (2, 2) -> (N, 2)
    H1 = A1.sigmoid() # (N, 2)
    A2 = torch.matmul(H1, weights2) + bias2 # (N, 2) x (2, 4) -> (N, 4)
    H2 = A2.exp()/A2.exp().sum(-1).unsqueeze(-1) # (N, 4) softmax
    return H2
We define a function model that characterizes the forward pass. For each neuron in the network, the forward pass involves two steps:
- Pre-activation, represented by 'a': the weighted sum of the inputs plus the bias.
- Activation, represented by 'h': the Sigmoid function applied to the pre-activation.
Since the network has a multi-class output, we use the Softmax activation instead of Sigmoid at the output layer (the second layer), written with PyTorch's method chaining. The activation output of the final layer is the predicted value of our network. The function returns this value so that we can use it to calculate the loss of the network.
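To confirm that the chained expression really is a softmax, here is a small sketch that compares it with the built-in torch.softmax on made-up pre-activations (the tensor A2 below is random and is not the network's actual output).

```python
import torch

torch.manual_seed(0)
A2 = torch.randn(5, 4)  # made-up pre-activations: 5 samples, 4 classes

# Hand-written softmax, written the same way as in the forward pass.
H2_manual = A2.exp() / A2.exp().sum(-1).unsqueeze(-1)
# Built-in equivalent.
H2_builtin = torch.softmax(A2, dim=-1)

print(torch.allclose(H2_manual, H2_builtin))  # the two versions agree
print(H2_manual.sum(-1))                      # each row sums to 1
```

The hand-written version is fine for a small network like this one. For large pre-activations, exp can overflow, so the built-in function is generally the safer choice.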
#function to calculate the cross-entropy loss
#y_hat -> predicted probabilities & y -> actual class labels
def loss_fn(y_hat, y):
    #pick the predicted probability of the true class for each sample
    return -(y_hat[range(y.shape[0]), y].log()).mean()
#function to calculate accuracy of model
def accuracy(y_hat, y):
    pred = torch.argmax(y_hat, dim=1)
    return (pred == y).float().mean()
Next, we have our loss function. In this case, instead of the mean squared error, we use the cross-entropy loss. The cross-entropy loss measures the difference between the predicted probability distribution and the actual one; for class labels, it reduces to the mean negative log of the probability assigned to the true class.
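As a check on the loss function, the sketch below compares the hand-written version with PyTorch's built-in F.cross_entropy on made-up scores. Note the difference in inputs: the built-in function takes raw scores (logits), while the hand-written one takes softmax probabilities. The tensors logits and y are hypothetical test values.

```python
import torch
import torch.nn.functional as F

def loss_fn(y_hat, y):
    # Mean negative log-probability assigned to the true class.
    return -(y_hat[range(y.shape[0]), y].log()).mean()

torch.manual_seed(0)
logits = torch.randn(6, 4)            # made-up scores: 6 samples, 4 classes
y = torch.tensor([0, 1, 2, 3, 1, 0])  # made-up true labels

probs = torch.softmax(logits, dim=-1)
manual = loss_fn(probs, y)
builtin = F.cross_entropy(logits, y)  # applies log-softmax internally

print(manual.item(), builtin.item())  # the two values should match closely
```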
Train our feed-forward network
We will now train the feed-forward network we created on our data. First, we initialize all the weights in the network using Xavier initialization, which draws them from a distribution with zero mean and a specific variance (here by multiplying standard normal samples by 1/sqrt(n), where n is the number of inputs to the layer). Since each layer has only two inputs, we divide the weights by sqrt(2). We then call the model function on the training data for 10000 epochs with the learning rate set to 0.2.
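The effect of the 1/sqrt(n) scaling is easier to see on a larger layer than the 2-input one used here. The following sketch uses a made-up layer of 256 inputs and 256 outputs (n_in and n_out are hypothetical, chosen only so the statistics are stable) and checks that the weights have a standard deviation near 1/sqrt(n) and that the pre-activations of standard normal inputs keep a standard deviation near 1.

```python
import math
import torch

torch.manual_seed(0)
n_in, n_out = 256, 256  # made-up large layer so the statistics are stable

# Standard normal samples scaled by 1/sqrt(n_in), as in the article.
w = torch.randn(n_in, n_out) / math.sqrt(n_in)
print(w.mean().item(), w.std().item(), 1 / math.sqrt(n_in))

# With unit-variance inputs, the pre-activations stay close to unit variance.
x = torch.randn(1000, n_in)
a = x @ w
print(a.std().item())
```

Keeping the pre-activations at a moderate scale helps the sigmoid avoid its flat regions at the start of training.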
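#set the seed (the value 0 is an assumption; the original value is not shown)
torch.manual_seed(0)
#initialize the weights and biases using Xavier Initialization
weights1 = torch.randn(2, 2) / math.sqrt(2)
weights1.requires_grad_() #track gradients for the scaled weights
bias1 = torch.zeros(2, requires_grad=True)
weights2 = torch.randn(2, 4) / math.sqrt(2)
weights2.requires_grad_()
bias2 = torch.zeros(4, requires_grad=True)
#set the parameters for training the model
learning_rate = 0.2
epochs = 10000
X_train = X_train.float()
Y_train = Y_train.long()
loss_arr = []
acc_arr = []
#training the network
for epoch in range(epochs):
    y_hat = model(X_train) #compute the predicted distribution
    loss = loss_fn(y_hat, Y_train) #compute the loss of the network
    loss.backward() #backpropagate the gradients
    loss_arr.append(loss.item()) #record the loss
    acc_arr.append(accuracy(y_hat, Y_train).item()) #record the accuracy
    with torch.no_grad(): #update the weights and biases
        weights1 -= weights1.grad * learning_rate
        bias1 -= bias1.grad * learning_rate
        weights2 -= weights2.grad * learning_rate
        bias2 -= bias2.grad * learning_rate
        #reset the gradients so they do not accumulate across epochs
        weights1.grad.zero_()
        bias1.grad.zero_()
        weights2.grad.zero_()
        bias2.grad.zero_()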
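The outline at the top of this post lists nn.Linear and Optim as the next step. As a rough preview only, and not the code from the full article, the same 2 -> 2 -> 4 network and update rule could be sketched with nn.Sequential and torch.optim.SGD as below. The assumptions here are mine: the names net, criterion and optimizer, the reduced epoch count, training on the full dataset rather than the 75% split, and the use of nn.Linear's default initialization instead of the Xavier scaling above.

```python
import torch
import torch.nn as nn
from sklearn.datasets import make_blobs

torch.manual_seed(0)
data, labels = make_blobs(n_samples=1000, centers=4, n_features=2, random_state=0)
X = torch.tensor(data).float()
Y = torch.tensor(labels).long()

# Same 2 -> 2 -> 4 architecture; the softmax is folded into CrossEntropyLoss.
net = nn.Sequential(nn.Linear(2, 2), nn.Sigmoid(), nn.Linear(2, 4))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.2)

losses = []
for epoch in range(1000):  # fewer epochs than the article, for a quick run
    optimizer.zero_grad()  # clear gradients from the previous step
    loss = criterion(net(X), Y)
    loss.backward()        # backpropagate the gradients
    optimizer.step()       # plain gradient-descent update
    losses.append(loss.item())

print(losses[0], losses[-1])  # loss at the first and last epoch
```

The optimizer replaces the manual update and gradient reset inside torch.no_grad(), and net.parameters() replaces the four hand-managed tensors.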
Continue reading this article at the source, Marktechpost (no paywall): Building a Feedforward Neural Network using Pytorch NN Module
None of the blogs that I publish, whether on Medium or on third-party websites like Marktechpost, will be kept behind a paywall. If you like my content, please consider supporting what I do. You can find all of my blogs here.
The entire code discussed in the article is present in this GitHub repository. Feel free to fork it or download it. In my next article, we will discuss how to use matplotlib and seaborn to create awesome visualizations for Exploratory Data Analysis. It's going to be a beginner-friendly post, so make sure you follow me on Medium to get notified as soon as it drops.
Until then Peace :)
Niranjan Kumar is a Retail Risk Analyst Intern at the HSBC Analytics division. He is passionate about Deep Learning and Artificial Intelligence. He was one of the top writers at Medium in Artificial Intelligence for 2.5 months. You can find all of Niranjan's blogs here. You can connect with Niranjan on LinkedIn, Twitter and GitHub to stay up to date with his latest blog posts.
I am looking for opportunities, either full-time roles or freelance projects, in the field of Machine Learning and Deep Learning. If there are any relevant opportunities, feel free to drop me a message on LinkedIn, or you can reach me through email as well. I would love to discuss.
Originally published at https://www.marktechpost.com on June 30, 2019.