Comparison between Pytorch Tensor and Numpy Array

Ashish Singh
3 min read · Aug 11, 2020

In this article, we will be covering the following topics:

  1. What is the difference between a NumPy array and a PyTorch tensor?
  2. How to create NumPy arrays and PyTorch tensors, and how to perform the same operations on both data types.

This blog is also part of assignment 1 of the awesome course ZerotoGans. I recommend this course to everybody who is starting out with PyTorch.

What is the difference between a NumPy array and a PyTorch tensor?

  1. NumPy arrays are the core data structure of the NumPy package, designed to support fast mathematical operations. Unlike Python’s built-in list, they can only hold elements of a single data type. Libraries like pandas, which is widely used for data preprocessing, are built around the NumPy array. PyTorch tensors are similar to NumPy arrays, but can additionally be operated on by a CUDA-capable Nvidia GPU.
  2. NumPy arrays are mainly used in classical machine learning algorithms (such as k-means or decision trees in scikit-learn), whereas PyTorch tensors are mainly used in deep learning, which requires heavy matrix computation.
  3. Unlike NumPy arrays, PyTorch tensors accept two additional arguments at creation time: device (whether the tensor is stored on the CPU or GPU) and requires_grad (whether derivatives should be computed with respect to the tensor).
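Point 3 can be illustrated with a minimal sketch using the standard torch API (the values here are just an illustration):

```python
import torch

# Create a tensor on the CPU that tracks gradients.
# device="cuda" would place it on a GPU if one is available.
x = torch.tensor([1.0, 2.0, 3.0], device="cpu", requires_grad=True)

y = (x ** 2).sum()   # a scalar computed from x
y.backward()         # populates x.grad with dy/dx = 2 * x

print(x.grad)        # tensor([2., 4., 6.])
```

NumPy has no equivalent of these arguments; gradient tracking and device placement are what make tensors the workhorse of deep learning.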

How to create NumPy arrays and PyTorch tensors?

When converting a tensor to a NumPy array, note that the underlying storage is shared: modifying one modifies the other.
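A minimal sketch of the shared-storage behaviour (for CPU tensors, where torch.Tensor.numpy and torch.from_numpy share memory):

```python
import numpy as np
import torch

t = torch.ones(3)
a = t.numpy()        # shares memory with t (CPU tensors only)

t[0] = 42.0          # modifying the tensor...
print(a)             # ...is visible in the numpy array: [42.  1.  1.]

# the reverse conversion also shares storage
b = torch.from_numpy(np.zeros(3))
```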

Rand function

The rand function is used to draw random samples from a uniform distribution on the interval [0, 1). Its arguments specify the size of the desired array. For example, rand(2, 3) creates a 2-d array of shape 2×3 (2 rows and 3 columns).
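A minimal sketch of both variants:

```python
import numpy as np
import torch

# numpy: dimensions are passed as separate arguments
a = np.random.rand(2, 3)

# pytorch: dimensions can be passed as separate arguments or as a tuple
t = torch.rand(2, 3)

print(a.shape)   # (2, 3)
print(t.shape)   # torch.Size([2, 3])
```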

Seed function

The seed function ensures reproducibility: seeding the random number generator with a fixed value makes it produce the same sequence of random numbers on every run.
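A minimal sketch, using np.random.seed on the NumPy side and torch.manual_seed on the PyTorch side:

```python
import numpy as np
import torch

np.random.seed(42)
a1 = np.random.rand(2, 3)
np.random.seed(42)        # re-seeding reproduces the same numbers
a2 = np.random.rand(2, 3)
assert (a1 == a2).all()

torch.manual_seed(42)
t1 = torch.rand(2, 3)
torch.manual_seed(42)
t2 = torch.rand(2, 3)
assert torch.equal(t1, t2)
```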

Reshaping the array

Reshaping means changing the shape of the array without changing its data. Specifying -1 in the reshape method tells NumPy or PyTorch to infer that dimension automatically; -1 can be used for at most one dimension.
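A minimal sketch of reshape with an inferred dimension, in both libraries:

```python
import numpy as np
import torch

a = np.arange(6)            # [0 1 2 3 4 5]
print(a.reshape(2, -1))     # -1 is inferred as 3 -> shape (2, 3)

t = torch.arange(6)
print(t.reshape(-1, 2))     # -1 is inferred as 3 -> shape (3, 2)
```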

Note that PyTorch provides two different functions to rearrange a tensor. One is permute, which reorders the dimensions (like a transpose) without modifying the underlying data, and the other is reshape, which changes the shape while keeping the elements in their original flattened order. For the same target shape, the two therefore produce differently arranged results.
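A minimal sketch contrasting the two on the same 2×3 tensor:

```python
import torch

t = torch.arange(6).reshape(2, 3)
# tensor([[0, 1, 2],
#         [3, 4, 5]])

print(t.reshape(3, 2))
# tensor([[0, 1],
#         [2, 3],
#         [4, 5]])   <- flattened order preserved

print(t.permute(1, 0))
# tensor([[0, 3],
#         [1, 4],
#         [2, 5]])   <- axes swapped, like a transpose
```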

Slicing the arrays

Slicing works the same way for both NumPy arrays and PyTorch tensors.
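A minimal sketch showing identical slicing syntax on both:

```python
import numpy as np
import torch

a = np.arange(12).reshape(3, 4)
t = torch.arange(12).reshape(3, 4)

print(a[0])        # first row: [0 1 2 3]
print(a[:, 1])     # second column: [1 5 9]
print(a[1:, 2:])   # rows 1..end, columns 2..end

# the exact same syntax works on the tensor
print(t[1:, 2:])
```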

Add a new dimension

To add a new dimension to a NumPy array, we use expand_dims.

To add a new dimension in PyTorch, we use the unsqueeze method.
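A minimal sketch of both, inserting a size-1 dimension at the front or at position 1:

```python
import numpy as np
import torch

a = np.arange(3)                          # shape (3,)
print(np.expand_dims(a, axis=0).shape)    # (1, 3)
print(np.expand_dims(a, axis=1).shape)    # (3, 1)

t = torch.arange(3)                       # shape torch.Size([3])
print(t.unsqueeze(0).shape)               # torch.Size([1, 3])
print(t.unsqueeze(1).shape)               # torch.Size([3, 1])
```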

I hope you liked this article. Please let me know in the comments if you find any issues.
