Feed-Forward Implemented: Deep Learning with PyTorch Part #4

Using Basic Python to Implement a Simple Neural Network From Scratch

Jinal Shah
Geek Culture
6 min read · May 19, 2021

Welcome to Deep Learning with PyTorch Part #4! If you are new, feel free to take a look at the previous parts, which are all on my Medium page. As a refresher, in the last article I walked through the mathematics of feed-forward, one of the fundamental concepts of neural networks. Feed-forward is the process in which your neural network takes in your inputs, “feeds” them through your hidden layers, and “spits out” an output.

In this article, I will provide a thorough implementation of feed-forward in Python. All of the code presented in this article can be found on my GitHub page in this repository. I have done my best to comment the code thoroughly so that you understand what is happening at each step. I highly suggest that you take the code sample and experiment with it. After all, when it comes to coding, you learn by doing. Without further ado, let’s get to work!

Basic Setup

Before we jump right into the code, it is very, very important that we make sure our environment is set up properly. The editor you use is really up to you. I highly suggest that you download the Anaconda Platform, because it automatically installs all of the basic data science tools, such as Jupyter Notebook, sklearn, etc. If you choose not to download Anaconda, make sure you have the NumPy library installed. This can be done by typing the following command:

  • pip install numpy
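
To confirm that the install worked, you can run a quick one-liner that simply imports NumPy and prints its version:

  • python -c "import numpy; print(numpy.__version__)"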

Fair warning: in this tutorial, I am assuming that you have basic Python knowledge and are familiar with object-oriented programming (OOP) concepts.

Code

The neural network architecture that we will build is a simple 3-layer neural network. Figure 1 provides a visual understanding of this architecture.

Figure 1: This is the neural network architecture that we will be implementing. X1 represents the one and only input. B1 & B2 represent the biases for each layer, and w1, w2, w3, and w4 represent the weights. The image was created by the author.

Furthermore, the activation function that I will be using is sigmoid. In practice, you would probably use a variety of activation functions such as ReLU or tanh, but I am using sigmoid for simplicity.

You may have noticed that the architecture for the neural network is relatively simple. The main reason for this is that I want to provide you with a simple but concrete understanding of how feed-forward works in neural networks. If we implemented a more complex architecture with many, many hidden layers, I guarantee that your brain would be completely fried after implementing it from scratch. Adding on, the neural network that we will implement most likely won’t do well on real datasets. There are 2 reasons for that:

  • The neural network isn’t trained yet. I will introduce the training aspect of neural networks in a future article.
  • It is way too simple. It is extremely rare to run across a problem that you could solve with this neural network architecture. I know nothing is impossible, but this situation would be very, very close to it.

Here is the full code:

Figure 2: The full code of the feed-forward implementation for the neural network
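
Since the code is embedded as an image, here is a minimal, runnable sketch of it in NumPy for readers who cannot view Figure 2. It follows the walkthrough below; the names (NeuralNetwork, hidden_weights, output_weights) are my own placeholders, not necessarily the ones used in the repository.

    import numpy as np

    def sigmoid(x):
        """The sigmoid activation function: squashes any value into (0, 1)."""
        return 1.0 / (1.0 + np.exp(-x))

    class NeuralNetwork:
        def __init__(self):
            # The weights start out random (see the discussion below for why).
            # Shapes follow Figure 1: 1 input -> 2 hidden neurons -> 1 output,
            # with an extra row in each matrix for the bias term.
            self.hidden_weights = np.random.rand(2, 2)  # rows: [bias, x1]
            self.output_weights = np.random.rand(3, 1)  # rows: [bias, h1, h2]
            # The bias value itself is fixed at 1 for both layers.
            self.bias = np.array([1.0])

        def forward(self, X):
            # Hidden layer: concatenate the bias onto the input, multiply by
            # the weights via matrix multiplication, then apply sigmoid.
            hidden_in = np.concatenate((self.bias, X))
            hidden_out = sigmoid(hidden_in @ self.hidden_weights)
            # Output layer: the same two steps, but with no activation.
            output_in = np.concatenate((self.bias, hidden_out))
            return output_in @ self.output_weights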

This code is a lot, so let’s walk through it. If you turn your attention to the bottom of Figure 2, you can see that I have implemented the sigmoid function. As I mentioned above, this function will serve as the activation function for the neural network. If you are not familiar with the sigmoid activation function, Figure 3 provides a mathematical representation of it.

Figure 3: The mathematical representation of the sigmoid function. Source: https://analyticsindiamag.com/beginners-guide-neural-network-math-python/
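
In case the figure doesn’t come through for you, the formula it depicts is:

    \sigma(x) = \frac{1}{1 + e^{-x}}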

Furthermore, Figure 4 displays a graphical understanding of it.

Figure 4: A graphical representation of the sigmoid function. Source: https://analyticsindiamag.com/beginners-guide-neural-network-math-python/

As you can see in Figure 4, the sigmoid function prides itself on squeezing values between 0 & 1. If you are familiar with classical machine learning, you may recognize it as the same function that powers logistic regression.
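
A quick demonstration of that squashing behavior, using the sigmoid function from the sketch above:

    print(sigmoid(-10))  # ~0.0000454 -- large negative inputs approach 0
    print(sigmoid(0))    # 0.5 -- zero maps to the midpoint
    print(sigmoid(10))   # ~0.9999546 -- large positive inputs approach 1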

Enough about the sigmoid function; let’s turn our attention to the forward function, which is the core of this code. In the forward function, you can see how I put the input, X, through the hidden layer and the output layer. In other words, I perform the feed-forward phase of the neural network. As we discussed in the previous articles, feed-forward is essentially a lot of matrix multiplications, and this is evident in the code. As you can see, I first concatenate the bias & input matrices into 1 matrix. Then, I multiply this matrix by the weights via matrix multiplication. I repeat these steps for both the hidden layer and the output layer. The only difference between the 2 layers is that I apply the sigmoid activation function to the outputs of the hidden layer.

I didn’t apply the sigmoid function to the output layer for simplicity; however, there may be times when you want to apply an activation function to your output layer. For example, if you want to get probabilities as outputs, you may want to apply the softmax activation function to your output layer.
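
To see the whole thing in action, here is a quick sanity check using the hypothetical NeuralNetwork class from the sketch above:

    network = NeuralNetwork()
    output = network.forward(np.array([0.5]))  # X1 = 0.5
    print(output)  # an essentially random prediction -- the network is untrained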

The last thing that I will point out is the declaration of the weights and biases in the constructor (the __init__ function). As you can see, I set the weights to random numbers. An important point to make here is that whenever you build a neural network, you will always start with random weights. If we set every weight to the same value (0, for example), back-propagation would compute the same update for each weight, and the neurons in a layer could never learn to detect different features. To avoid that mess, we set our weights randomly. Furthermore, you may also notice that I set the biases to equal 1. In this formulation, the bias input is conventionally fixed at 1; its effective contribution is then controlled by the corresponding bias weight.
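
One practical tip while you experiment: you can seed NumPy’s random number generator so that the “random” starting weights are reproducible from run to run (again using the hypothetical class from the sketch):

    np.random.seed(42)         # any fixed integer works
    network = NeuralNetwork()  # now the starting weights are the same every run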

Why use Frameworks?

I would like to close off this article by discussing why we use deep learning frameworks when we could just implement neural networks from scratch. Honestly, there is a very simple answer to this question: it would be way too much work for us to implement a very deep neural network (100+ hidden layers, say) from scratch. Furthermore, the math behind a very deep neural network may make your brain quite literally explode. Frameworks like PyTorch and TensorFlow do all the heavy lifting for us. They make it very simple & easy to build neural networks: all we have to do is define our architecture, loss function, and so on. Why should we have to “re-invent the wheel”?
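
To make that contrast concrete, here is a sketch of the same 1 → 2 → 1 architecture from Figure 1 expressed in PyTorch. This is an illustration of the framework’s style, not code from this series:

    import torch
    from torch import nn

    # The same 1 -> 2 -> 1 architecture, defined in a few lines.
    model = nn.Sequential(
        nn.Linear(1, 2),  # input -> hidden; weights & biases created for us
        nn.Sigmoid(),     # hidden-layer activation
        nn.Linear(2, 1),  # hidden -> output; no output activation, as above
    )

    output = model(torch.tensor([[0.5]]))  # the entire feed-forward pass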

Closing

If you made it to the end of this article, thank you. Your support as a reader really motivates me to create more meaningful content. I hope you learned a thing or two about implementing feed-forward from scratch. As always, if you have any questions, feel free to leave them in the comments below. If you have any feedback, feel free to let me know in the comments as well, or in a private note on this article. If you would like to reach out, feel free to connect with me on LinkedIn. Till next time! Enjoy Deep Learning!
