5 Essential PyTorch Functions for Dealing With Tensors

Dev Bhartra · Published in The Startup · 7 min read · May 30, 2020

I finally decided to try PyTorch, and to document my progress with this exciting library in these posts. This article is my take on a few of the most fundamental and basic functions that form the foundation of your next dream deep learning project!

An introduction to PyTorch: According to Wikipedia, PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing. That’s the definition. But what exactly is PyTorch? It is nothing but a helpful tool that makes an ML developer’s life easier. Essentially, it is a library of resources from which one can pick any functionality and use it directly in their code, without necessarily having to understand the intricate details of how that functionality was implemented. To put it in simple words, “it just works”. We don’t always have to code our basics from scratch (though I definitely recommend doing that first to get a firm hold on the concepts and deepen your understanding!).

But to make full use of that offered functionality, we need to understand how to invoke those promised functions from within the library itself. And that is the purpose of this article.

Before we jump in, we also need to understand what tensors are. At this point, we just need to know that a tensor is a container that can house data, scaling to multiple dimensions. Worked with NumPy before? Well, PyTorch tensors are essentially similar to NumPy ndarrays, but with the added ability to operate on GPUs.

The functions we will be looking at today are:

  • torch.mm()
  • tensor.new_ones(), tensor.new_zeros() & torch.eye()
  • .reshape()
  • torch.randn() & torch.rand()
  • backward() with Autograd

And a bonus for those of you with NVIDIA GPUs:

  • torch.cuda.current_device(), torch.cuda.device() and torch.cuda.get_device_name()

Let’s get started!

We start by importing torch into our Python environment. Run the command: import torch. Now we can take a look at our functions.

Function 1 — torch.mm()

As stated above, a tensor can be thought of as a matrix. One of the most common and useful operations when dealing with matrices is multiplying two of them together. In NumPy, we can use the dot() function. torch.mm() aims to provide similar functionality, but with tensors.

The syntax is as follows: torch.mm(input, mat2, out=None) → Tensor

It performs a matrix multiplication of the matrices input and mat2.

Matrix multiplication is one of the most essential operations when dealing with matrices in the field of machine learning and deep learning. torch.mm() takes two or three parameters, namely:

  • input (Tensor) – the first matrix to be multiplied
  • mat2 (Tensor) – the second matrix to be multiplied
  • out (Tensor, optional) – the output tensor

out is taken as None by default if it isn't specified explicitly.

The mm() function performs a simple matrix multiplication of the two input tensors t1 and t2; I have stored the result in another tensor variable, t3. To get an in-depth understanding of matrix multiplication, have a look at this link.
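A small sketch of what such a cell could look like (the values of t1 and t2 here are just illustrative):

    import torch

    # (2 x 3) @ (3 x 2) -> (2 x 2)
    t1 = torch.tensor([[1., 2., 3.],
                       [4., 5., 6.]])
    t2 = torch.tensor([[7., 8.],
                       [9., 10.],
                       [11., 12.]])

    t3 = torch.mm(t1, t2)   # store the product in a third tensor
    print(t3)
    # tensor([[ 58.,  64.],
    #         [139., 154.]])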

Here we are multiplying a tensor by itself, also called squaring a matrix. As expected, the negative entries, when multiplied with other negatives, contribute positive values. Decimals can also be present in the tensor, as I have shown.
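For instance, squaring a small matrix with negative and decimal entries (again, the values are illustrative):

    import torch

    # every product of two negative entries is positive
    t = torch.tensor([[-2.0, -0.5],
                      [-1.5, -3.0]])
    print(torch.mm(t, t))
    # tensor([[4.7500, 2.5000],
    #         [7.5000, 9.7500]])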

There are some rules when multiplying matrices: they need to be compatible. We need to make sure that the number of columns in the first one equals the number of rows in the second one (the prerequisite to be able to multiply); otherwise torch.mm() throws an error, as the sketch below illustrates.
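A quick sketch of the rule in action (the shapes are chosen just for illustration):

    import torch

    a = torch.randn(2, 3)
    b = torch.randn(4, 2)

    # torch.mm(a, b)    # error: (2 x 3) and (4 x 2) are not compatible
    c = torch.mm(b, a)  # (4 x 2) @ (2 x 3) -> (4 x 3), this pair is compatible
    print(c.shape)      # torch.Size([4, 3])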

In the world of deep learning, one of the most fundamental things is the feed-forward neural network, which is one of the simplest and also most useful networks. It is basically an abstracted composite function, built from a combination of matrix multiplications. Vectors and matrices are not the only way to perform these operations, but they make them highly efficient, enabling parallelism when a large number of computations is required. But more on that later.

Function(s) 2 — tensor.new_ones(), tensor.new_zeros() & torch.eye()

Often while working with matrices, we quickly need a matrix filled with all 1s or maybe all 0s. These functions can generate such matrices, saving us the hassle of declaring them by hand.

The syntax: new_ones(size, dtype=None, device=None, requires_grad=False)

The example is pretty much self-explanatory: we have generated a 3x2 matrix, filled with 1s.
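Something along these lines (the base tensor t is just an illustrative placeholder; new_ones() copies its dtype and device):

    import torch

    t = torch.tensor([1.0, 2.0, 3.0])
    ones = t.new_ones((3, 2))   # a 3x2 tensor of 1s, same dtype/device as t
    print(ones)
    # tensor([[1., 1.],
    #         [1., 1.],
    #         [1., 1.]])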

Once again, like the above function, we have generated a 4x4 matrix, this time filled with 0s using new_zeros().
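For example (again with an illustrative base tensor):

    import torch

    t = torch.tensor([1.0, 2.0, 3.0])
    zeros = t.new_zeros((4, 4))   # a 4x4 tensor of 0s, same dtype/device as t
    print(zeros)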

Here we are trying to generate a tensor whose 3rd dimension takes a negative value. This does not make sense, and hence throws an error.
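A sketch of such a failing call (the sizes are illustrative):

    import torch

    t = torch.tensor([1.0, 2.0, 3.0])
    # t.new_ones((2, 2, -1))   # raises a RuntimeError (negative dimension)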

Often in the area of machine and deep learning, we need matrices which are pre-initialised to be filled with 0s or 1s, or just an identity matrix. The functions above make the job easier for us.
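For the identity matrix, torch.eye() does the job; a tiny example:

    import torch

    identity = torch.eye(3)   # a 3x3 identity matrix
    print(identity)
    # tensor([[1., 0., 0.],
    #         [0., 1., 0.],
    #         [0., 0., 1.]])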

Functions 3 & 4 — .reshape() and torch.rand()/randn()

Often we will have a tensor which we want to reshape into another form to aid in our calculations. .reshape() is the function that helps us here.

Here we can see two torch functions in play: .reshape() and torch.randn(). Reshape allows us to alter the dimensions of the tensor at hand, while randn generates a tensor filled with random numbers from a normal distribution with mean 0 and variance 1 (also called the standard normal distribution).
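A minimal sketch of the two together (the shapes are illustrative):

    import torch

    t = torch.randn(4, 3)     # 12 values drawn from the standard normal distribution
    r = t.reshape(2, 3, 2)    # the same 12 values, rearranged into a (2, 3, 2) tensor
    print(t.shape, r.shape)   # torch.Size([4, 3]) torch.Size([2, 3, 2])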

Looking at another example:
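A sketch with illustrative shapes:

    import torch

    t = torch.rand(2, 3, 4)   # values drawn uniformly from [0, 1), so none are negative
    r = t.reshape(6, 4)       # the two leading dimensions merged: 2 x 3 x 4 = 6 x 4 = 24 elements
    print(t.shape, r.shape)   # torch.Size([2, 3, 4]) torch.Size([6, 4])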

Here we can see how reshape can also eliminate an entire dimension in the process of reshaping, if we so desire. Also, if you noticed, I have now used the function rand() instead of randn(). rand() draws its samples uniformly from the interval [0, 1), so all the generated values are non-negative, unlike randn(), which can produce negative values.

Looking at yet another example:
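Again, a sketch with illustrative shapes:

    import torch

    t = torch.rand(2, 3, 4)   # 24 elements in total
    # t.reshape(12, 3)        # raises a RuntimeError: 12 x 3 = 36 elements can't be made from 24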

Here we have tried to reshape the tensor from (2,3,4) to (12,3). This is not allowed by torch, and doesn't make sense logically either: the total number of elements must stay the same, and 2x3x4 = 24 while 12x3 = 36. That condition did hold for the two examples above. The reshape operation must be compatible for it to be executed successfully.

Both reshape() and rand()/randn() are used widely when we need to slightly tweak the dimensions of weight matrices and the like, and rand()/randn() help us generate random samples, useful when initialising random weight matrices.

Function 5 — backward()

The function computes the gradient of the current tensor w.r.t. the graph leaves. Gradients are calculated by tracing the graph from the root to the leaves and multiplying every gradient along the way using the chain rule.

We have created the tensor x with requires_grad = True. This means that such tensors start forming a backward graph that tracks every operation applied to them, in order to calculate the gradients using something called a dynamic computation graph (DCG). source

Hence x is something we can differentiate with respect to. So, when we call the backward() function on z, we are performing the differentiation of z with respect to x. Thus, by computing dz/dx, we get (3 + 2 - 4), which evaluates to 1, the result obtained in the sketch below.
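One definition of z consistent with that gradient (the exact expression and the value of x are assumptions on my part) is z = 3x + 2x - 4x:

    import torch

    x = torch.tensor(2.0, requires_grad=True)   # the value of x is illustrative
    z = 3 * x + 2 * x - 4 * x                   # dz/dx = 3 + 2 - 4 = 1

    z.backward()      # walk the graph backwards and accumulate gradients
    print(x.grad)     # tensor(1.)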

Find out more about DCGs here.

Here we have defined two tensors which can be differentiated, namely x and y. Then we have individually differentiated the term z, first with respect to x and then with respect to y. The results obtained are the numbers 192 and 12.
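One definition of z that is consistent with those gradients (again, an assumption on my part) is z = x³ + y³ evaluated at x = 8 and y = 2:

    import torch

    x = torch.tensor(8.0, requires_grad=True)
    y = torch.tensor(2.0, requires_grad=True)

    z = x ** 3 + y ** 3   # dz/dx = 3x^2, dz/dy = 3y^2
    z.backward()

    print(x.grad)   # tensor(192.)  ->  3 * 8^2
    print(y.grad)   # tensor(12.)   ->  3 * 2^2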

Here I have defined x to be a tensor, but we haven't set the requires_grad property. Thus the backward graph isn't formed, and differentiation cannot be performed.
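A sketch of what that failure looks like:

    import torch

    x = torch.tensor(2.0)   # requires_grad defaults to False
    z = 3 * x

    # z.backward()          # raises a RuntimeError: the tensor does not require grad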

Machine learning uses derivatives (and hence differentiation) in optimization problems. Optimization algorithms like gradient descent use derivatives to decide whether to increase or decrease weights in order to maximize or minimize some objective (e.g. a model's accuracy or error function). Hence we can see how backward() can be of immense importance to us.

Function(s) 6 — torch.cuda.current_device(), torch.cuda.device() and torch.cuda.get_device_name()

This set of functions helps us figure out whether our tensor is residing on the GPU, and is used in most cases to make sure that the PyTorch code is actually running on our expensive GPUs. source
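A small sketch of how these might be used (the printed device name is just an example):

    import torch

    if torch.cuda.is_available():
        print(torch.cuda.current_device())       # index of the selected GPU, e.g. 0
        print(torch.cuda.device(0))              # a device context object for GPU 0
        print(torch.cuda.get_device_name(0))     # e.g. 'GeForce GTX 1080'
    else:
        print("No CUDA-capable GPU detected")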

Conclusion

So in this notebook, we have just touched upon a few of the functions that make PyTorch the robust library that it is. These functions can help us get started with messing around with tensors, and are what I consider the basic building blocks of any program in the field of Deep Learning and Machine Learning with PyTorch. Here is a great comprehensive course on PyTorch for Deep Learning by freeCodeCamp on YouTube.

If you're still reading, thank you! This is my first Medium post, so I'm still a bit rusty. Any sort of constructive criticism is always welcome!

If you notice a mistake, please let me know in the comments and I’ll be more than happy to correct myself!

Find me @ https://www.linkedin.com/in/dev-bhartra-45b639165/
