Building a Feedforward Neural Network using PyTorch NN Module

Niranjan Kumar
Jun 30 · 7 min read

Feedforward neural networks are also known as Multi-layered Networks of Neurons (MLN). These networks are called feedforward because information only travels forward through the network: through the input nodes, then through the hidden layers (one or many), and finally through the output nodes.

Traditional models such as the McCulloch Pitts, Perceptron and Sigmoid neuron models are limited in capacity to linear functions. To handle a complex non-linear decision boundary between the input and the output, we use a Multi-layered Network of Neurons.

Citation Note: The content and the structure of this article are based on the deep learning lectures from One-Fourth Labs (Padhai).

Outline

In this post, we will discuss how to build a feedforward neural network using PyTorch. We will do this incrementally using the PyTorch nn module. First, we will generate non-linearly separable data with four classes. Then we will build a simple feedforward neural network using PyTorch tensor functionality. After that, we will use the abstraction features available in the PyTorch module, such as Functional, Sequential, Linear and Optim, to make our neural network concise, flexible and efficient. Finally, we will move our network to CUDA and see how fast it performs.

Note: This tutorial assumes you already have PyTorch installed on your local machine or know how to use PyTorch in Google Colab with CUDA support, and that you are familiar with the basics of tensor operations. If you are not familiar with these concepts, kindly refer to my previous post linked below.

The rest of the article is structured as follows:

  • Import libraries
  • Generate non-linearly separable data
  • Feedforward network using tensors and auto-grad
  • Train our feedforward network
  • NN.Functional
  • NN.Parameter
  • NN.Linear and Optim
  • NN.Sequential
  • Moving the Network to GPU

If you want to skip the theory part and get into the code right away, see the GitHub repository mentioned at the end of this article.

Import libraries

Before we start building our network, we need to import the required libraries. We import numpy for matrix multiplication and dot products between vectors, matplotlib to visualize the data, and functions from the sklearn package to generate data and evaluate the network's performance. Finally, we import torch for everything related to PyTorch.
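The imports described above might look like the following. This is a minimal sketch: the exact set of sklearn functions (make_blobs, train_test_split, accuracy_score) and the fixed random seed are assumptions for illustration, not necessarily what the original notebook uses.

```python
import math

import matplotlib.pyplot as plt
import numpy as np
import torch
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Fixing the seed (an assumed choice) makes the random weights reproducible.
torch.manual_seed(0)

print(torch.__version__)
```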

Generate non-linearly separable data

In this section, we will see how to randomly generate non-linearly separable data using sklearn.

To generate data randomly, we will use make_blobs, which generates blobs of points with a Gaussian distribution. I have generated 1000 data points in 2D space with four blobs (centers=4) as a multi-class classification problem. Each data point has two input features and a class label of 0, 1, 2 or 3.

Once the data is ready, I use the train_test_split function to split it into training and validation sets in the ratio of 75:25.
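As a rough sketch of the data generation step, assuming a random_state of 0 purely for reproducibility (the seed is not given in this post):

```python
from sklearn.datasets import make_blobs

# 1000 points in 2D drawn from four Gaussian blobs.
# random_state=0 is an assumed seed, chosen only so the example is repeatable.
data, labels = make_blobs(n_samples=1000, n_features=2, centers=4, random_state=0)

print(data.shape)                    # (1000, 2)
print(labels.shape)                  # (1000,)
print(sorted(set(labels.tolist())))  # [0, 1, 2, 3]
```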
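The split could be written as below. The stratify argument and the seed values are assumptions added for illustration; test_size=0.25 is what produces the 75:25 ratio.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

data, labels = make_blobs(n_samples=1000, n_features=2, centers=4, random_state=0)

# test_size=0.25 gives the 75:25 split.
# stratify=labels (an assumed choice) keeps the class balance equal in both parts.
X_train, X_val, Y_train, Y_val = train_test_split(
    data, labels, test_size=0.25, stratify=labels, random_state=0
)

print(X_train.shape, X_val.shape)  # (750, 2) (250, 2)
```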

Feedforward network using tensors and auto-grad

In this section, we will see how to build and train a simple neural network using PyTorch tensors and auto-grad. The network has six neurons in total: two in the first hidden layer and four in the output layer. For each of these neurons, pre-activation is represented by 'a' and post-activation is represented by 'h'. The network has a total of 18 parameters: 12 weight parameters and 6 bias terms.

We will use the map function to concisely convert the numpy arrays to PyTorch tensors.
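The parameter count can be checked with a few lines. The tensor names below (weights1, bias1, weights2, bias2) are illustrative assumptions.

```python
import torch

weights1 = torch.randn(2, 2)  # 2 input features -> 2 hidden neurons
bias1 = torch.zeros(2)        # one bias per hidden neuron
weights2 = torch.randn(2, 4)  # 2 hidden neurons -> 4 output neurons
bias2 = torch.zeros(4)        # one bias per output neuron

n_weights = weights1.numel() + weights2.numel()
n_biases = bias1.numel() + bias2.numel()
print(n_weights, n_biases, n_weights + n_biases)  # 12 6 18
```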
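A sketch of that conversion is shown here with stand-in arrays; the random values are hypothetical and only the shapes match this post. The explicit casts are an assumption: float32 inputs and int64 labels are convenient for the matrix products and label indexing used later.

```python
import numpy as np
import torch

# Stand-in arrays with the same shapes as the real splits (hypothetical values).
X_train = np.random.rand(750, 2)
Y_train = np.random.randint(0, 4, size=750)
X_val = np.random.rand(250, 2)
Y_val = np.random.randint(0, 4, size=250)

# map applies torch.tensor to each array in one line.
X_train, Y_train, X_val, Y_val = map(torch.tensor, (X_train, Y_train, X_val, Y_val))

# numpy floats arrive as float64; cast inputs to float32 and labels to int64.
X_train, X_val = X_train.float(), X_val.float()
Y_train, Y_val = Y_train.long(), Y_val.long()

print(X_train.dtype, Y_train.dtype)  # torch.float32 torch.int64
```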

After converting the data to tensors, we need to write a function that helps us to compute the forward pass for the network.

We will define a function model which characterizes the forward pass. For each neuron present in the network, forward pass involves two steps:

  1. Pre-activation represented by 'a': It is a weighted sum of inputs plus the bias.
  2. Activation represented by 'h': It is the Sigmoid function applied to the pre-activation.

Since the network has a multi-class output, we use Softmax activation instead of Sigmoid activation at the output layer (the second layer), implemented with PyTorch's method chaining. The activation output of the final layer is the predicted value of our network. The function returns this value so that we can use it to calculate the loss of the network.

Next, we have our loss function. In this case, instead of the mean squared error, we use the cross-entropy loss. Cross-entropy measures the difference between the predicted probability distribution and the actual probability distribution, which gives us the loss of the network.
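The two steps and the Softmax output described above can be sketched as follows. The variable names and the random test input are assumptions, and the softmax is written out with chained tensor methods rather than a library call.

```python
import torch

torch.manual_seed(0)
weights1 = torch.randn(2, 2)
bias1 = torch.zeros(2)
weights2 = torch.randn(2, 4)
bias2 = torch.zeros(4)

def model(x):
    a1 = torch.matmul(x, weights1) + bias1   # pre-activation, hidden layer
    h1 = a1.sigmoid()                        # activation, hidden layer
    a2 = torch.matmul(h1, weights2) + bias2  # pre-activation, output layer
    # Softmax through method chaining: exponentiate, then normalize each row.
    h2 = a2.exp() / a2.exp().sum(-1).unsqueeze(-1)
    return h2

x = torch.randn(5, 2)  # five hypothetical data points
y_hat = model(x)
print(y_hat.shape)        # torch.Size([5, 4])
print(y_hat.sum(dim=-1))  # every row sums to 1 (up to rounding)
```

Each row of the output is a probability distribution over the four classes, which is what the loss function expects.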
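A minimal sketch of the cross-entropy loss, with an accuracy helper added as an assumed convenience. The function names and the toy predictions are illustrative only.

```python
import torch

def loss_fn(y_hat, y):
    # Pick the predicted probability of the true class in each row,
    # then average the negative log of those probabilities.
    rows = torch.arange(y.shape[0])
    return -(y_hat[rows, y].log()).mean()

def accuracy(y_hat, y):
    pred = torch.argmax(y_hat, dim=1)
    return (pred == y).float().mean()

# Toy predictions for two data points (hypothetical values).
y_hat = torch.tensor([[0.7, 0.1, 0.1, 0.1],
                      [0.1, 0.2, 0.3, 0.4]])
y = torch.tensor([0, 2])

print(loss_fn(y_hat, y))   # -(ln 0.7 + ln 0.3) / 2, about 0.78
print(accuracy(y_hat, y))  # first row is right, second is wrong: 0.5
```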

Train our feedforward network

We will now train the feedforward network we created on our data. First, we initialize all the weights in the network using Xavier initialization, which draws the weights from a distribution with zero mean and a specific variance (here by multiplying standard normal samples by 1/sqrt(n), where n is the number of inputs to the layer).
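The initialization might be sketched like this. It assumes the simple scaled-normal form described above (standard normal samples multiplied by 1/sqrt(n)); the layer-size names are hypothetical.

```python
import math
import torch

torch.manual_seed(0)

n_inputs, n_hidden, n_classes = 2, 2, 4

# Scale standard normal samples by 1/sqrt(n), n = number of inputs to the layer.
weights1 = (torch.randn(n_inputs, n_hidden) / math.sqrt(n_inputs)).requires_grad_()
bias1 = torch.zeros(n_hidden, requires_grad=True)
weights2 = (torch.randn(n_hidden, n_classes) / math.sqrt(n_hidden)).requires_grad_()
bias2 = torch.zeros(n_classes, requires_grad=True)

print(weights1.shape, weights2.shape)
```

Setting requires_grad after the division keeps the weights as leaf tensors, so auto-grad stores their gradients in .grad during training.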

Since we have only two input features, we divide the weights by sqrt(2). We then call the model function on the training data for 10000 epochs with the learning rate set to 0.2.
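Putting the pieces together, a full-batch gradient descent loop could look like the sketch below. The seeds, the stratified split and the variable names are assumptions; the epoch count and learning rate follow the values stated above.

```python
import math
import torch
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

torch.manual_seed(0)

# Data: four blobs, 75:25 split (seeds are assumed for reproducibility).
data, labels = make_blobs(n_samples=1000, n_features=2, centers=4, random_state=0)
X_train, X_val, Y_train, Y_val = train_test_split(
    data, labels, test_size=0.25, stratify=labels, random_state=0
)
X_train, Y_train, X_val, Y_val = map(torch.tensor, (X_train, Y_train, X_val, Y_val))
X_train, X_val = X_train.float(), X_val.float()
Y_train, Y_val = Y_train.long(), Y_val.long()

# Parameters, scaled by 1/sqrt(2) because each layer has two inputs.
weights1 = (torch.randn(2, 2) / math.sqrt(2)).requires_grad_()
bias1 = torch.zeros(2, requires_grad=True)
weights2 = (torch.randn(2, 4) / math.sqrt(2)).requires_grad_()
bias2 = torch.zeros(4, requires_grad=True)
params = [weights1, bias1, weights2, bias2]

def model(x):
    h1 = (x @ weights1 + bias1).sigmoid()
    a2 = h1 @ weights2 + bias2
    return a2.exp() / a2.exp().sum(-1, keepdim=True)

def loss_fn(y_hat, y):
    return -(y_hat[torch.arange(y.shape[0]), y].log()).mean()

learning_rate, epochs = 0.2, 10000
loss_history = []

for epoch in range(epochs):
    loss = loss_fn(model(X_train), Y_train)
    loss.backward()  # auto-grad fills p.grad for every parameter
    loss_history.append(loss.item())
    with torch.no_grad():  # update the weights without tracking the update itself
        for p in params:
            p -= learning_rate * p.grad
            p.grad.zero_()  # reset, since gradients accumulate across backward calls

print(f"loss before: {loss_history[0]:.4f}, after: {loss_history[-1]:.4f}")
```

The torch.no_grad() block matters here: without it, the in-place updates to tensors that require gradients would raise an error.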

Continue reading this article at the source: Marktechpost (no paywall).

None of the blogs that I publish, either on Medium or on third-party websites like Marktechpost, will be kept behind a paywall. If you like my content, please consider supporting what I do. You can find all of my blogs here.

The entire code discussed in the article is present in this GitHub repository. Feel free to fork it or download it. In my next article, we will discuss how to use matplotlib and seaborn to create awesome visualizations for Exploratory Data Analysis. It's going to be a beginner-friendly post, so make sure you follow me on Medium to get notified as soon as it drops.

Until then Peace :)

NK.

Author Bio

Niranjan Kumar is a Retail Risk Analyst Intern at the HSBC Analytics division. He is passionate about Deep Learning and Artificial Intelligence. He was one of the top writers at Medium in Artificial Intelligence for 2.5 months. You can find all of Niranjan's blogs here. You can connect with Niranjan on LinkedIn, Twitter and GitHub to stay up to date with his latest blog posts.

I am looking for opportunities either full-time or freelance projects, in the field of Machine Learning and Deep Learning. If there are any relevant opportunities, feel free to drop me a message on LinkedIn or you can reach me through email as well. I would love to discuss.


Originally published at https://www.marktechpost.com on June 30, 2019.

