Step-By-Step Building A Neural Network From Scratch

A full guide to implementing a neural network from scratch in Python using only NumPy.

Published in Analytics Vidhya · 9 min read · Aug 21, 2020

I worked through this sentiment classification project in the Udacity Deep Learning Foundation Nanodegree and wanted to share my experience, along with a full guide on how to implement it.

You can get the source code and dataset from this link on GitHub.

https://github.com/udacity/deep-learning/tree/master/sentiment-network

We are going to build a neural network model from scratch without using TensorFlow, PyTorch, or any other machine learning framework. We will build everything from the ground up in Python, with NumPy for the array math, so we can see what happens behind the scenes in a neural network.

In this project we'll implement a multilayer perceptron model that classifies a given text as POSITIVE or NEGATIVE.

Components of A Neural Network:

First, let's identify the components of the neural network model:

Layer 1 → Input Layer
Layer 2 → Hidden Layer
Layer 3 → Output Layer
Activation Function (We will choose a sigmoid function)
Learning Rate
Dataset (Reviews text)

Functions to be Implemented:

Pre Processing Data.
Label To Binary (Convert Output Labels Positive as ‘1’ and Negative as ‘0’).
Sigmoid Function.
Sigmoid Derivative Function.
Train.
Test.
Run.

Steps we’ll go through:

Read Dataset
Import Numpy library and Counter function
Sentiment Classification Class
Create Pre Processing Data Function
Initialize Network Function
Label To Binary
Sigmoid function and its derivative
Implement the training function (Forward Pass)
Backward Pass
Run & Test
Let’s run this network

Let’s get started.

Step 1: Read Dataset

Use the open function to read the reviews text file, then use map to strip the trailing newline from each line and build a list of reviews. Repeat the same for the labels text file, converting each label to upper case.

with open('reviews.txt', 'r') as reviews_file:  # What we know!
    reviews = list(map(lambda x: x[:-1], reviews_file.readlines()))

with open('labels.txt', 'r') as labels_file:  # What we WANT to know!
    labels = list(map(lambda x: x[:-1].upper(), labels_file.readlines()))
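Slicing with x[:-1] assumes every line ends with a newline; if the last line of a file does not, its final character is dropped instead. A minimal alternative sketch, where read_lines is a hypothetical helper (not part of the original project), uses splitlines() to avoid that:

```python
def read_lines(path, upper=False):
    """Return the lines of a text file without their trailing newlines.

    read_lines is a hypothetical helper for illustration only.
    """
    with open(path, "r") as f:
        lines = f.read().splitlines()
    # Optionally upper-case every line (useful for the label file)
    return [line.upper() for line in lines] if upper else lines
```

With this helper, the two lists could be loaded as read_lines('reviews.txt') and read_lines('labels.txt', upper=True).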

Each entry in reviews is now one review stored as a single string, and labels holds the matching POSITIVE or NEGATIVE label at the same index.

Step 2: Import Numpy library and Counter function

We'll use NumPy throughout, and the project also imports the Counter class from the collections module, so let's import both first.

import numpy as np
from collections import Counter
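As a quick illustration of what Counter does, here is a standalone sketch with made-up review text (not project code). It tallies how often each item appears:

```python
from collections import Counter

# Toy review text invented for this illustration
review = "this movie was great great fun"

word_counts = Counter(review.split(" "))

print(word_counts["great"])        # 2
print(word_counts["boring"])       # 0: missing keys count as zero
print(word_counts.most_common(1))  # [('great', 2)]
```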

Step 3: Sentiment Classification Class

Create a class that holds the whole network and the functions we'll use. Let's name it SentimentNetwork.

Initialize it with these four parameters: reviews data, labels, hidden nodes, and learning rate.

The constructor first calls the pre_process_data function, which we'll implement in the next step, and then calls init_network, which initializes the whole network and which we'll implement in Step 5.

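class SentimentNetwork:
    def __init__(self, reviews, labels, hidden_nodes=10, learning_rate=0.1):
        # Build the vocabulary and the word/label lookup tables
        self.pre_process_data(reviews, labels)
        # One input node per vocabulary word, one output node
        self.init_network(len(self.review_vocab), hidden_nodes, 1, learning_rate)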

Note: all the functions we implement in the following steps are methods of this class.

Step 4: Create Pre Processing Data Function

The aim of this function is to give every word in the dataset an id. That id is the word's index in the vocabulary list.

The pre_process_data function takes two parameters, the reviews and the labels. We'll create a review_vocab set to store every word that appears in the reviews, then convert the set to a list so each word can be accessed by index.

The for loop goes through all the reviews, and for each review it loops through each word and adds it to the set. Calling split(" ") breaks each review at every space to get the individual words.

class SentimentNetwork:
    def pre_process_data(self, reviews, labels):
        review_vocab = set()
        for review in reviews:
            for word in review.split(" "):
                review_vocab.add(word)
        self.review_vocab = list(review_vocab)

We'll do the same thing for the labels, but here we don't need split because each line holds only one word (POSITIVE or NEGATIVE).

class SentimentNetwork:
    def pre_process_data(self, reviews, labels):
        review_vocab = set()
        for review in reviews:
            for word in review.split(" "):
                review_vocab.add(word)
        self.review_vocab = list(review_vocab)

        label_vocab = set()
        for label in labels:
            label_vocab.add(label)
        self.label_vocab = list(label_vocab)

Then create a dictionary that maps every word to its location in the list. This loop goes through every word and converts it to a number using its index in the list.

self.word2index = {}
for i, word in enumerate(self.review_vocab):
    self.word2index[word] = i

The resulting word2index dictionary holds each word together with its id.

Do the same thing for the labels.

self.label2index = {}
for i, label in enumerate(self.label_vocab):
    self.label2index[label] = i

This is the final code for the function in the class.

def pre_process_data(self, reviews, labels):
        review_vocab = set()
        for review in reviews:
            for word in review.split(" "):
                review_vocab.add(word)

        self.review_vocab = list(review_vocab)
        
        label_vocab = set()
        for label in labels:
            label_vocab.add(label)
       
        self.label_vocab = list(label_vocab)
        
       
        self.word2index = {}
        for i, word in enumerate(self.review_vocab):
            self.word2index[word] = i
        
        self.label2index = {}
        for i, label in enumerate(self.label_vocab):
            self.label2index[label] = i
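To see what this function produces, here is a standalone sketch on two made-up reviews. build_index is a hypothetical helper name, and sorting the vocabulary is an assumption added so the ids are reproducible (the class keeps whatever order the set yields).

```python
def build_index(items):
    """Map each distinct item to an integer id (its position in the vocabulary)."""
    vocab = sorted(set(items))  # sorted only to make the ids deterministic
    return {item: i for i, item in enumerate(vocab)}

# Toy data invented for this illustration
toy_reviews = ["good movie", "bad movie"]
toy_labels = ["POSITIVE", "NEGATIVE"]

words = [word for review in toy_reviews for word in review.split(" ")]
word2index = build_index(words)
label2index = build_index(toy_labels)

print(word2index)   # {'bad': 0, 'good': 1, 'movie': 2}
print(label2index)  # {'NEGATIVE': 0, 'POSITIVE': 1}
```

The word "movie" appears in both reviews but gets a single id, because the vocabulary is built from a set.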

Step 5: Initialize Network Function

Create a function that initializes everything in the network: the number of input, hidden, and output nodes, the learning rate, the weights, and the hidden layer (layer_1).

Below we store the number of input, hidden, and output nodes and the learning rate, all of which are passed in from the class constructor.

def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
       
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes
        
        self.learning_rate = learning_rate

Initialize the weights between the input and hidden layers, self.weights_0_1, with zeros, and the weights between the hidden and output layers, self.weights_1_2, with random values drawn from a normal distribution.

self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))
self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5,
                                    (self.hidden_nodes, self.output_nodes))

Initialize the hidden layer, layer_1, with zeros.

self.layer_1 = np.zeros((1,hidden_nodes))

This is the full code for the initialize function.

def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
       
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes
        
        self.learning_rate = learning_rate
        
        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))
        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5,
                                            (self.hidden_nodes, self.output_nodes))
        
        self.layer_1 = np.zeros((1,hidden_nodes))
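A quick standalone check of the shapes involved, using small made-up sizes (5 vocabulary words, 3 hidden nodes, 1 output node) rather than the real dataset:

```python
import numpy as np

input_nodes, hidden_nodes, output_nodes = 5, 3, 1  # toy sizes for illustration

weights_0_1 = np.zeros((input_nodes, hidden_nodes))
weights_1_2 = np.random.normal(0.0, output_nodes ** -0.5,
                               (hidden_nodes, output_nodes))
layer_1 = np.zeros((1, hidden_nodes))

print(weights_0_1.shape)  # (5, 3): one row of hidden weights per word
print(weights_1_2.shape)  # (3, 1): one weight per hidden node
print(layer_1.shape)      # (1, 3): a single row vector
```

With a single output node, the standard deviation output_nodes ** -0.5 works out to 1.0.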

Step 6: Label To Binary

A small function that converts a label, which is either POSITIVE or NEGATIVE, to a binary value: 1 for positive and 0 for negative.

def get_target_for_label(self,label):
    if(label == 'POSITIVE'):
        return 1
    else:
        return 0

Step 7: Sigmoid function and its derivative

At the output layer we'll use a sigmoid activation, computed with the function below. For the backward pass we'll also need the derivative of the sigmoid, which can be computed directly from the sigmoid's output.

def sigmoid(self,x):
    return 1 / (1 + np.exp(-x))

def sigmoid_output_2_derivative(self,output):
    return output * (1 - output)
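As a sanity check, the derivative formula can be compared against a numerical estimate. The sketch below uses standalone copies of the two functions (without self) so it runs on its own:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_output_2_derivative(output):
    # Expects the sigmoid's output, not its input
    return output * (1 - output)

x = 0.0
out = sigmoid(x)
analytic = sigmoid_output_2_derivative(out)

# Central finite-difference estimate of d(sigmoid)/dx at x
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

print(out)                             # 0.5
print(analytic)                        # 0.25
print(abs(analytic - numeric) < 1e-6)  # True
```

Note that the derivative function takes the already-computed sigmoid output, which is why the backward pass can reuse layer_2 instead of recomputing the exponential.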

Step 8: Implement the training function (Forward Pass)

This is the main function. It loops through the whole dataset, updating the weights as it learns.

When training a neural network we make two passes over each example: a forward pass, which computes the output for the input, and a backward pass, which computes the output error and uses it to update the weights.

In this step we'll go through the forward pass. First, let's define the function and its parameters: the training reviews and their labels.

def train(self, training_reviews_raw, training_labels):

Now let's create a list of lists: the outer list holds one entry per review, and each inner list holds the ids of the distinct words in that review.

training_reviews = list()
for review in training_reviews_raw:
    indices = set()
    for word in review.split(" "):
        # word2index is the dictionary built in pre_process_data
        if word in self.word2index:
            indices.add(self.word2index[word])
    training_reviews.append(list(indices))
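The conversion of one review can be sketched on its own with a toy vocabulary. review_to_indices is a hypothetical helper name, and the vocabulary is invented for the example:

```python
# Toy vocabulary invented for this illustration
word2index = {"bad": 0, "good": 1, "movie": 2}

def review_to_indices(review, word2index):
    """Return the ids of the distinct known words in a review."""
    indices = set()
    for word in review.split(" "):
        if word in word2index:       # unknown words are skipped
            indices.add(word2index[word])
    return sorted(indices)           # sorted only for a stable printout

print(review_to_indices("good good movie tonight", word2index))  # [1, 2]
```

Because the ids are collected in a set, a repeated word such as "good" counts only once, and a word outside the vocabulary such as "tonight" is ignored.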

Initialize correct_so_far = 0 to track how many predictions are correct; we'll increment it after each correct prediction.

Now let's loop through all the training reviews. In each iteration we run the forward pass, compute the error, and update the weights.

for i in range(len(training_reviews)):
    review = training_reviews[i]
    label = training_labels[i]

    self.layer_1 *= 0
    for index in review:
        self.layer_1 += self.weights_0_1[index]

    layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))

Let me explain what happens in the code above :

We reset the first layer nodes with zeros.

self.layer_1 *= 0

Now we loop over the review, take each word's index (its id), and add the matching row of weights_0_1, the weights between the input layer and the hidden layer that we initialized in Step 5, to layer 1.

for index in review:
 self.layer_1 += self.weights_0_1[index]

The output of layer 2 is computed with the sigmoid function we implemented in Step 7, applied to the dot product of layer 1 and the weights between the hidden and output layers.

layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))
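Adding up weight rows is a shortcut for multiplying a 0/1 input vector by the weight matrix. The standalone sketch below checks that the two give the same hidden layer; the sizes, seed, and random weights are assumptions chosen only so the comparison is not trivially zero:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_nodes = 6, 3           # toy sizes for illustration
weights_0_1 = rng.normal(size=(vocab_size, hidden_nodes))
weights_1_2 = rng.normal(size=(hidden_nodes, 1))

review = [1, 4]                           # ids of the words in one review

# Shortcut used in the text: add up the rows of the words that occur
layer_1 = np.zeros((1, hidden_nodes))
for index in review:
    layer_1 += weights_0_1[index]

# Equivalent dense version: multiply a 0/1 input vector by the weights
layer_0 = np.zeros((1, vocab_size))
layer_0[0, review] = 1
layer_1_dense = layer_0.dot(weights_0_1)

print(np.allclose(layer_1, layer_1_dense))  # True

layer_2 = 1 / (1 + np.exp(-layer_1.dot(weights_1_2)))
print(layer_2.shape)                        # (1, 1)
```

The shortcut only touches the rows for words that actually occur, which matters when the vocabulary is large and each review uses a small part of it.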

That’s it for the forward pass.

Step 9: Backward Pass

In the backward pass we calculate the error of the output and update the weights between the input and hidden layers, weights_0_1, and the weights between the hidden and output layers, weights_1_2.

First: calculate the output error

The output error is the error at layer 2. We compute it by subtracting the true label (positive or negative) from the layer_2 output we got in the last step. We can't subtract a text label from a number, so we use the function we implemented in Step 6, which converts the label text to a binary number.

layer_2_error = layer_2 - self.get_target_for_label(label)

Now multiply this error by the derivative of the sigmoid at layer 2.

layer_2_delta = layer_2_error*self.sigmoid_output_2_derivative(layer_2)

Second: calculate the hidden layer error

For the hidden layer we take the layer 2 delta from the previous step and compute its dot product with the transposed weights between the hidden and output layers.

layer_1_error = layer_2_delta.dot(self.weights_1_2.T)

Since we’re not using any activation function in the hidden layer then layer 1 delta will be the same as the error.

layer_1_delta = layer_1_error

Finally let’s update the weights :

— Update weights_0_1 :

Layer 0 has one input for each word in the vocabulary, but only the words that occur in the review contributed to the output, so we only update the weight rows for those words.

So let's loop through every word index in the review and subtract the layer 1 delta, multiplied by the learning rate, from that word's weights.

for index in review:
 self.weights_0_1[index] -= layer_1_delta[0] * self.learning_rate

— Update weights_1_2 :

self.weights_1_2 -= self.layer_1.T.dot(layer_2_delta) * self.learning_rate
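Putting the forward and backward passes together, the standalone sketch below applies the same update rule repeatedly to one made-up review and checks that the prediction moves toward the target. The toy sizes, the seed, and the train_step name are assumptions for illustration, not part of the class:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train_step(review, target, weights_0_1, weights_1_2, learning_rate):
    """One forward and backward pass; returns the prediction before the update."""
    # Forward pass: sum the weight rows of the review's words
    layer_1 = np.zeros((1, weights_0_1.shape[1]))
    for index in review:
        layer_1 += weights_0_1[index]
    layer_2 = sigmoid(layer_1.dot(weights_1_2))

    # Backward pass: output delta, then hidden delta (no hidden activation)
    layer_2_delta = (layer_2 - target) * layer_2 * (1 - layer_2)
    layer_1_delta = layer_2_delta.dot(weights_1_2.T)

    # In-place weight updates
    weights_1_2 -= layer_1.T.dot(layer_2_delta) * learning_rate
    for index in review:
        weights_0_1[index] -= layer_1_delta[0] * learning_rate
    return layer_2[0, 0]

rng = np.random.default_rng(1)
vocab_size, hidden_nodes, learning_rate = 6, 3, 0.1   # toy values
weights_0_1 = np.zeros((vocab_size, hidden_nodes))
weights_1_2 = rng.normal(0.0, 1.0, (hidden_nodes, 1))

review, target = [1, 4], 1            # a toy review labelled POSITIVE
first = train_step(review, target, weights_0_1, weights_1_2, learning_rate)
for _ in range(200):
    last = train_step(review, target, weights_0_1, weights_1_2, learning_rate)

print(first)         # 0.5, because weights_0_1 starts at zero
print(last > first)  # True: the output has moved toward the target of 1
```

The first prediction is exactly 0.5 because the zero-initialized weights_0_1 make the hidden layer all zeros, and sigmoid(0) is 0.5.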

— Track correct outputs:

if(layer_2 >= 0.5 and label == 'POSITIVE'):
    correct_so_far += 1
elif(layer_2 < 0.5 and label == 'NEGATIVE'):
    correct_so_far += 1

Now that the weights are updated after every iteration, the training function is complete.

Step 10: Run & Test

Now let's implement a function run that takes a review as input and computes the output (at layer 2).

As in the train function, we split the review into words, look up each word's index, and use these indices to add the matching rows of the weights between the input and hidden layers to layer 1.

def run(self, review):
    self.layer_1 *= 0
    unique_indices = set()
    for word in review.lower().split(" "):
        # word2index is the dictionary built in pre_process_data
        if word in self.word2index:
            unique_indices.add(self.word2index[word])
    for index in unique_indices:
        self.layer_1 += self.weights_0_1[index]

For the output layer we take the dot product of layer 1 and the weights between the hidden and output layers, pass the result to the sigmoid function, and get the output.

layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))

Ideally layer 2 would give us exactly 1 or 0 depending on the class, but in practice the output is a value between 0 and 1. So we treat anything below 0.5 as '0' (NEGATIVE) and anything at or above 0.5 as '1' (POSITIVE).

if(layer_2[0] >= 0.5):
     return "POSITIVE"
else:
     return "NEGATIVE"

For the test function
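A minimal sketch of what a test function could look like, assuming it simply calls run on each held-out review and reports the fraction predicted correctly. The name evaluate_accuracy and the predict argument are hypothetical, and the stand-in predictor exists only so the sketch runs on its own:

```python
def evaluate_accuracy(predict, testing_reviews, testing_labels):
    """Return the fraction of reviews whose predicted label matches the true one."""
    correct = 0
    for review, label in zip(testing_reviews, testing_labels):
        if predict(review) == label:
            correct += 1
    return correct / len(testing_reviews)

# Stand-in predictor invented for this illustration; with a trained
# network you would pass its run method from Step 10 instead
def toy_predict(review):
    return "POSITIVE" if "good" in review.split(" ") else "NEGATIVE"

toy_reviews = ["good movie", "bad movie", "not good"]
toy_labels = ["POSITIVE", "NEGATIVE", "NEGATIVE"]

print(evaluate_accuracy(toy_predict, toy_reviews, toy_labels))  # 0.6666666666666666
```

The stand-in gets the third review wrong because it only looks for the word "good", which gives 2 correct out of 3.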

Step 11: Let’s Run this network

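We'll create an object from this class and pass it these parameters: the reviews and labels, the number of hidden neurons, and the learning rate.

Then we call the training function with the reviews and labels to start training. The slice [:-1000] passes everything except the last 1,000 reviews, which are left out of training.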

mlp = SentimentNetwork(reviews[:-1000],labels[:-1000],10,learning_rate=0.01)
mlp.train(reviews[:-1000],labels[:-1000])

I hope this blog helped you build this network and gave you a good understanding of how a neural network works behind the scenes.

Let me know your feedback. Thanks for reading!