An Introduction to PyTorch: Tackling the Moons Dataset with a Neural Network

Romano Vacca · Coinmonks · 7 min read · Jul 23, 2018

“By far the greatest danger of Artificial Intelligence is that people conclude too early that they understand it.”
Eliezer Yudkowsky

This post is the first in a series on learning to use the deep learning library PyTorch. It is an attempt to help others who are just getting started with artificial intelligence and struggle with tutorials that start off at a high level. The approach in this series is to first understand the code behind PyTorch’s built-in functions, so that the theory behind them becomes clearer. With every tutorial, more of the custom code will be replaced by built-in functions, utilizing the true power of PyTorch.

The code used in this tutorial is shown in snippets throughout the post.

How does PyTorch work?

PyTorch is still fairly new; its 1.0 release was announced only recently. The framework has a number of components, but one of the most significant is GPU utilisation. This means that PyTorch can do much of what numpy does, but using GPU power to calculate faster, which is very useful for neural networks. The important part to mention here is that this takes only one line of code to activate, whereas TensorFlow and other frameworks require a bit more work.
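As a minimal illustration of that one line (this snippet is not from the original post, and assumes a CUDA-capable GPU):

```python
import torch

x = torch.randn(3, 3)

# Moving a tensor to the GPU takes a single line:
if torch.cuda.is_available():
    x = x.cuda()
```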

Required packages

  • PyTorch
  • Pandas
  • NumPy
  • Matplotlib
  • scikit-learn
  • Seaborn

Getting Acquainted with PyTorch

PyTorch uses Matrix-like structures called Tensors. These tensors can be seen as a generalization of matrices and look like n-dimensional arrays (ndarrays in numpy). Click here if you want a more in-depth explanation about tensors.

Vector/matrix/tensor visualisation, from: https://hackernoon.com/learning-ai-if-you-suck-at-math-p4-tensors-illustrated-with-cats-27f0002c9b32

To create a tensor, we first have to import the PyTorch module, called torch.

We create four variables x0, x1, x2, x3, all with different shapes, which we can check by calling .shape, just as we would for a normal numpy array.
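A minimal sketch of this; the exact values and shapes in the original snippet may differ:

```python
import torch

x0 = torch.tensor(3.5)            # scalar, shape: torch.Size([])
x1 = torch.tensor([1.0, 2.0])     # vector, shape: torch.Size([2])
x2 = torch.randn(2, 3)            # matrix, shape: torch.Size([2, 3])
x3 = torch.randn(2, 3, 4)         # 3-d tensor, shape: torch.Size([2, 3, 4])

print(x0.shape, x1.shape, x2.shape, x3.shape)
```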

Torch tensors also have types such as:

  • torch.LongTensor
  • torch.FloatTensor
  • torch.DoubleTensor

Watch out: Tensor types need to match when doing calculations with them. If you get errors about a type mismatch, you might need to set the dtype of your tensor.
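For example (a sketch; mixing an integer tensor with a float tensor is a classic source of such errors):

```python
import torch

a = torch.tensor([1, 2, 3])        # int64 -> torch.LongTensor
b = torch.tensor([1.0, 2.0, 3.0])  # float32 -> torch.FloatTensor

# Mixing these types in a calculation can raise a type-mismatch
# error, so cast explicitly:
c = a.float() * b
print(c.type())  # torch.FloatTensor
```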

Convert numpy to torch tensor

A powerful transformation in PyTorch is the conversion from numpy array to a torch tensor and vice versa.
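A short sketch of both directions:

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])
t = torch.from_numpy(arr)   # numpy array -> torch tensor (shares memory)
back = t.numpy()            # torch tensor -> numpy array
```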

Now that you have seen some of PyTorch’s capabilities, let’s start with the Moons dataset and explore other PyTorch features.

The Moons dataset

Example Moons distribution

The moons dataset is a simple built-in dataset from scikit-learn. We will use a neural network, which we will create ourselves, to tackle this problem.

We import all the necessary libraries. Note: if you are using Jupyter Notebook, you can uncomment the last two lines so the graphs can be generated in the notebook.
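A sketch of those imports; which two notebook lines the original meant is an assumption, rendered here as the usual matplotlib magics:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import torch
from sklearn.datasets import make_moons

# Uncomment these two lines when running inside a Jupyter Notebook:
# %matplotlib inline
# %config InlineBackend.figure_format = 'retina'
```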

We create two variables, X and y, which will store the data points and the corresponding labels. We want a y array of shape (200, 1), so that we can match each data point to the right label, hence the reshape.
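A sketch of the data creation; the noise level is an assumption, the original value may differ:

```python
X, y = make_moons(n_samples=200, noise=0.1)
y = y.reshape(200, 1)   # one label per data point, shape (200, 1)
```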

Now that we have our data, it’s time to make our own neural network!

Neural network

PyTorch has its own built-in neural network class, but for the purpose of demonstrating and learning, we will build our own. More information about the PyTorch neural net can be found here.

There is a lot going on in this class, and it can be understood better with a visualisation. Let’s start with the __init__, sketched below:
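A sketch of what the __init__ might look like; the random initialisation is an assumption, and the original code may differ:

```python
import torch

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Weight matrices (capital W) between the layers, plus biases.
        # requires_grad=True tells PyTorch to track gradients for .backward().
        self.W1 = torch.randn(input_size, hidden_size, requires_grad=True)
        self.b1 = torch.zeros(hidden_size, requires_grad=True)
        self.W2 = torch.randn(hidden_size, hidden_size, requires_grad=True)
        self.b2 = torch.zeros(hidden_size, requires_grad=True)
        self.W3 = torch.randn(hidden_size, output_size, requires_grad=True)
        self.b3 = torch.zeros(output_size, requires_grad=True)
```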

Neural net visualization from https://playground.tensorflow.org/

This image is a visualisation of the neural network we are building. The input size (2) matches the two features (x1, x2) in the image. We have set the hidden size to 3, which corresponds to the three vertical blocks, or neurons, per hidden layer in the image. In the code we have specified W1, W2 and W3, which are matrices, hence the capital W. W1, for example, consists of the six weights [w1,1 | w1,2 | w1,3 | w2,1 | w2,2 | w2,3]: one from each of the two inputs to each of the three hidden neurons. A bias is also added.

Forward propagation of another neural network

The forward pass is calculated by multiplying X by W1 and adding the bias. In PyTorch we use the function .mm, short for matrix multiplication (since X and W1 are matrices), and add the bias to the result. Then we apply an activation function, in this case the sigmoid, to introduce nonlinearity into the model. This is done so the model can learn more complex relationships in the data. This output is then used as input for the next layer, and so on.
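Continuing the NeuralNetwork class above, a sketch of the forward method:

```python
    def forward(self, X):
        # Each layer: matrix multiplication (.mm), add the bias, then sigmoid.
        z1 = torch.sigmoid(X.mm(self.W1) + self.b1)
        z2 = torch.sigmoid(z1.mm(self.W2) + self.b2)
        return torch.sigmoid(z2.mm(self.W3) + self.b3)
```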

The code we have so far only does half the job: it enables us to feed input through the network, but the network also needs to adapt to get better results, which is accomplished by backward propagation.

We create the variable model, passing it the input size of 2, the hidden size of 3 and the output size of 1. The inputs and labels are in this case both floats (x and y values). We also create a variable losses, so we can later show how the loss developed during training.
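A sketch of this setup; the exact conversion calls are an assumption:

```python
model = NeuralNetwork(input_size=2, hidden_size=3, output_size=1)

inputs = torch.from_numpy(X).float()   # data points as floats
labels = torch.from_numpy(y).float()   # labels as floats
losses = []                            # loss per epoch, for plotting later
```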

Then, we create a loop for the backpropagation:

First, we pass the inputs through the model, which gives us a certain output. PyTorch has a built-in function to calculate the binary cross-entropy, but to understand it better, we build it ourselves. The backward propagation can be easily calculated by PyTorch by calling .backward(), which automatically computes all the gradients. Now that we have the direction in which the weights should be changed to predict the outcome of a given input more accurately, we have to actually update them.
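A sketch of the loop so far; the learning rate and number of epochs are assumed hyperparameters, and the weight update that completes the loop body follows below:

```python
learning_rate = 0.01   # assumption: the original values may differ
epochs = 1000

for epoch in range(epochs):
    outputs = model.forward(inputs)

    # Hand-written binary cross-entropy loss
    loss = -(labels * torch.log(outputs)
             + (1 - labels) * torch.log(1 - outputs)).mean()
    losses.append(loss.item())

    loss.backward()    # autograd fills in .grad for every weight
    # ... the weight update continues below
```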

Here we say that all the weights of W1 should be decreased by (learning rate * gradient). After doing that, we need to reset the calculated gradients to zero; otherwise they will accumulate and mess up our network.
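Continuing inside the loop body, a sketch of the update step, generalised here to all the weights and biases:

```python
    # Still inside the training loop: gradient-descent update.
    with torch.no_grad():
        for param in [model.W1, model.b1, model.W2, model.b2,
                      model.W3, model.b3]:
            param -= learning_rate * param.grad
            param.grad.zero_()   # reset, otherwise gradients accumulate
```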

Finally, we can plot the loss to see how we performed.
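For example:

```python
import matplotlib.pyplot as plt

plt.plot(losses)
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.show()
```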

Improvements

To test how well you understand the implementation, I suggest adding more layers or changing the hidden size. Trying a different activation function is also a possibility.

Conclusions

This article showed how to get started with PyTorch by using low-level built-in functions. In the next tutorials, we will use more of the built-in functions of PyTorch. The goal of this article was to better understand how a neural network can be implemented.

In the next part of this series, we will use PyTorch and neural nets on the Titanic dataset.
