Deep Learning With PyTorch — Tensor Basics: Stride, Offset, Contiguous Tensors

Moein Shariatnia · Published in The Startup · Jul 10, 2020

On July 6th, the full version of the Deep Learning with PyTorch book was released. It is a great book and I have just started studying it. The third chapter covers the basics of tensor creation and operations, and I thought it would be a good idea to write this post about one of the most fundamental components of deep learning libraries: the tensor.

If you are reading this, chances are you have come across the word tensor and have probably used it in your deep learning journey. Tensor is simply the name computer science gives to all multi-dimensional arrays, instead of using a different word for each rank as in math, where a 1-D array is called a vector and a 2-D array is called a matrix. So, in computer science and deep learning, a scalar is a 0-D tensor, a list of numbers is a 1-D tensor, and so on.
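
For instance, here is a minimal sketch (the values are arbitrary) showing tensors of different dimensionality and their ndim attribute:

import torch

scalar = torch.tensor(3.)                    # 0-D tensor
vector = torch.tensor([1., 2., 3.])          # 1-D tensor
matrix = torch.tensor([[1., 2.], [3., 4.]])  # 2-D tensor
scalar.ndim, vector.ndim, matrix.ndim
>>output:
(0, 1, 2)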

Now, let’s look at tensors from another angle; here is something you may not have noticed while using them. Tensors are not the things that actually store the numbers for you! (That may sound odd.) They are just views onto a one-dimensional storage of numbers in your RAM. What does that mean? Let’s learn this with code, not pure theory:

import torch

X = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])
X.storage()
>>output:
1.0
2.0
3.0
4.0
5.0
6.0
[torch.FloatStorage of size 6]

In the snippet above, X is a 3 by 2 tensor, but when we call the storage() method on it, PyTorch gives us the real storage of X in RAM, which is, of course, a 1-D array of size 6. We can save this storage object in a variable and check its type.

storage = X.storage()
type(storage), storage.dtype
>>output:
(torch.FloatStorage, torch.float32)
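
The storage can also be indexed directly, and since the tensor is only a view onto it, changing the storage changes the tensor as well. A quick sketch, continuing with the same X (the value 10. is just an arbitrary example):

storage[0]
>>output:
1.0
storage[0] = 10.   # write into the storage directly
X                  # the change shows up in the tensor that views it
>>output:
tensor([[10.,  2.],
        [ 3.,  4.],
        [ 5.,  6.]])
storage[0] = 1.    # restore the original value for the rest of the post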

Now let’s transpose our X tensor into X_t and see whether the storage changes. You can either print out the storage of the new X_t tensor or check its id to see whether it matches that of X. Also, notice that the shape has been transposed.

X_t = X.transpose(0,1)
X_t.storage(), X_t.shape
>>output:
1.0
2.0
3.0
4.0
5.0
6.0
[torch.FloatStorage of size 6], torch.Size([2, 3])
id(X_t.storage())==id(X.storage())
>>output:
True
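
Comparing the ids of the two storage objects works here, but another way to confirm that both tensors are backed by the same memory (an alternative check, not from the book) is to compare the addresses they point to:

X.data_ptr() == X_t.data_ptr()   # both tensors start at the same memory address
>>output:
True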

So, it seems the storage is intact. But if the storage has not changed, what did the transpose method do to X? Here is the thing: PyTorch tensors are objects with attributes and methods, like other objects in Python. Stride is a property of a tensor that tells us how many elements we have to step over in the storage array to get from one element to the next along a given dimension of the tensor. That definition probably didn’t help much on its own, so let’s clarify it. In the first dimension of X, the row dimension of our 3 by 2 tensor, to go from the first element of the first row (the number 1.) to the first element of the second row (the number 3.), we need to move 2 elements forward in the storage array (remember that the storage is just the list of numbers from 1. to 6.). So, the stride in this dimension is 2. What about the next dimension, the column dimension? In that case, consecutive elements sit right next to each other in the storage, so the stride is 1. If we call the stride() method on X, we get exactly these two numbers, one per dimension.

X.stride()
>>output:
(2, 1)
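
Putting stride, offset, and storage together: the element at row i, column j of a 2-D tensor lives at position storage_offset + i * stride[0] + j * stride[1] in the storage (more on storage_offset at the end of this post). A small sketch of that arithmetic; the flat_index helper below is just for illustration:

def flat_index(tensor, i, j):
    # map a 2-D index (i, j) to a position in the underlying 1-D storage
    s0, s1 = tensor.stride()
    return tensor.storage_offset() + i * s0 + j * s1

flat_index(X, 2, 1)               # row 2, column 1 of X
>>output:
5
X.storage()[flat_index(X, 2, 1)]  # the number 6. indeed sits at storage index 5
>>output:
6.0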

Below you can see a picture from the book of a 3 by 3 tensor. In this particular example, the stride on the first dimension is 3 and on the second dimension is 1.

source: Deep Learning with PyTorch Textbook

This is a great feature! When transposing or performing certain other operations on tensors, PyTorch doesn’t touch the actual storage of numbers in RAM; it just changes the stride (and shape) of the tensor object to present a different view of that storage. So, think of a tensor as a window through which the real storage is seen.

X_t
>>output:
tensor([[1., 3., 5.],
        [2., 4., 6.]])
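
We can see this concretely by comparing the strides of X and its transpose; the two numbers are simply swapped, which is all the transpose had to do:

X.stride(), X_t.stride()   # transposing swaps the strides, the storage is untouched
>>output:
((2, 1), (1, 2))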

Let’s define a new tensor X_prime with the same data as X_t and see its stride:

X_prime = torch.tensor([[1., 3., 5.],
                        [2., 4., 6.]])
X_prime.stride(), X_prime.storage()
>>output:
(3, 1),
1.0
3.0
5.0
2.0
4.0
6.0
[torch.FloatStorage of size 6]

So, X_prime and X_t hold the same data in the same order (as tensors!), but they are backed by different storage arrays, laid out differently, and therefore have different strides.

Now that we have learned about stride, we can learn another concept: contiguous tensors. Consider our X_prime tensor. If we move along a single row of this tensor, the elements appear in the same order as in its storage and we do not have to jump over any numbers. A tensor like this is called “contiguous”. However, if we move along a single row of our X_t tensor, the order differs from what we see in its storage, and we have to skip over some numbers in the storage to reach the next element of the tensor; so X_t is not contiguous.

X_prime.is_contiguous(), X_t.is_contiguous()
>>output:
True, False

By the way, making a tensor contiguous is as easy as calling the contiguous() method on that tensor:

X_t = X_t.contiguous()
X_t.is_contiguous()
>>output:
True

If you check X_t’s storage after this operation, you will see that it has changed and now matches X_prime’s.
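
Here is a quick check of what contiguous() did; note that it copied the data into a new, row-major storage and recomputed the stride accordingly:

X_t.storage(), X_t.stride()
>>output:
1.0
3.0
5.0
2.0
4.0
6.0
[torch.FloatStorage of size 6], (3, 1)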

It is worth mentioning that some operations, such as view() in PyTorch, work only on contiguous tensors.
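
For example, here is a small sketch of that restriction, using a fresh transposed tensor since our X_t has already been made contiguous above (the exact error message may differ across PyTorch versions):

Y = X.transpose(0, 1)    # a non-contiguous view of X
Y.view(6)                # view() refuses to work on this layout
>>output:
RuntimeError: view size is not compatible with input tensor's size and stride ...
Y.contiguous().view(6)   # copying to a contiguous layout first makes view() work
>>output:
tensor([1., 3., 5., 2., 4., 6.])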

In the last part of this post, I want to introduce another term, “offset”, which is really straightforward. When you take a chunk of a tensor by indexing into it, as you might have guessed, the underlying storage is left intact. Again! PyTorch simply gives you another tensor that presents another view of the same storage array. The extra piece of information the new tensor needs, namely where in the main storage it starts, is kept in another property of the tensor object called “storage_offset”. Let’s look at the code:

X = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])
X_chunk = X[2]
X_chunk, X_chunk.storage_offset()
>>output:
tensor([5., 6.]), 4

In the code above, we selected the third row of our X tensor, saved it in a new tensor named X_chunk, and then printed out the new tensor and its storage offset. The storage offset tells us (or maybe PyTorch!) that this new tensor is a view of the same storage as X, but starting from the element at index 4 in the storage (i.e. the number 5.). That’s it! You now know what offset means!
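
Because X_chunk is just a view, writing into it also changes X; a quick sketch to confirm that the two tensors share the same storage (the value 50. is arbitrary):

X_chunk[0] = 50.   # modify the view in place
X                  # the original tensor sees the change
>>output:
tensor([[ 1.,  2.],
        [ 3.,  4.],
        [50.,  6.]])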

I hope this post was helpful in understanding what PyTorch does under the hood when creating or changing existing tensors. It’s really a nice feature that when we change a tensor’s shape or index into it, we are not allocating new memory for those new tensors, and the real storage of the numbers stays intact. The examples here were kept simple to build intuition for the definitions, but when we are dealing with large tensors, such as the weights of a huge model like ResNet-101, this feature can save us a lot of memory and speed up our computations.

Many of the ideas were inspired by the Deep Learning with PyTorch textbook: https://pytorch.org/deep-learning-with-pytorch
