Five PyTorch Tensor Operations you had absolutely no idea about
Getting Started with PyTorch
Introduction
PyTorch is a relatively new and robust deep learning framework whose dynamic computation graph makes it flexible to work with. It was designed primarily by Soumith Chintala at Facebook AI Research. Recent versions (1.4+) support deployment without the need for Open Neural Network Exchange (ONNX), which makes it production ready. The torch.distributed backend enables scalable distributed training and performance optimization in both research and production. It has a robust ecosystem for computer vision and natural language processing, among other deep learning applications. It is well supported on cloud platforms, and Microsoft recently added support for it on Azure. Its dynamic graph makes PyTorch one of the most flexible deep learning frameworks out there.
PyTorch, like every other deep learning framework, uses N-dimensional arrays for computation, and these can be used on GPUs too. These N-dimensional arrays are called Tensors. NumPy arrays were not made for deep learning and won’t run on GPUs. This is the reason we use tensors, which are similar to NumPy arrays but are written with a CUDA and C++ backend so as to offer GPU computation. PyTorch provides a large variety of Tensor operations, similar to its contemporaries TensorFlow with Keras and MXNet with Gluon.
You might be familiar with the generic ones such as torch.rand(), torch.tensor(), torch.abs(), torch.ones() and many more, but there are a few lesser-known Tensor operations that stay hidden unless you dig deep into the documentation. This article covers five of them:
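To make the NumPy/tensor relationship concrete, here is a minimal sketch. It assumes NumPy is installed alongside PyTorch, and it falls back to the CPU when no CUDA device is present. The variable names are only illustrative.

```python
import numpy as np
import torch

# A NumPy array always lives in CPU memory.
arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# torch.from_numpy wraps the array as a tensor without copying it.
t = torch.from_numpy(arr)

# Use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

# Compute on the chosen device, then bring the result back to NumPy.
doubled = (t_dev * 2).cpu().numpy()
print(t.shape, t_dev.device)
```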
- torch.baddbmm
- torch.bmm
- torch.cholesky
- torch.geqrf
- torch.logdet
Installation
To install PyTorch in your kernel/notebook, go to https://pytorch.org. Under Quick Start Locally you’ll find the installation options. I have installed the CUDA 10.1 version using Anaconda. Run this command once you’re in your conda virtual environment.
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
You can install locally via pip too. This downloads a wheel file and installs it.
pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
Importing the torch library
To import PyTorch after installation, simply type the following line in your Python file/shell or Jupyter notebook.
import torch
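As a quick sanity check after importing, you can print the installed version and see whether a CUDA device is visible. This is only a sketch; the values printed depend entirely on your own installation.

```python
import torch

# Version string of the installed PyTorch build.
version = str(torch.__version__)

# True only if PyTorch was built with CUDA and a GPU is visible.
has_cuda = torch.cuda.is_available()

print("PyTorch version:", version)
print("CUDA available:", has_cuda)
```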
I hope that you have already installed PyTorch and are eager to know more about the non-generic functions. So let’s get started.
torch.baddbmm
torch.baddbmm(input, batch1, batch2,*,beta=1,alpha=1,out=None)
Returns:
- a Tensor
Utility:
Performs a batchwise matrix multiplication of the tensors batch1 and batch2 and adds the input tensor to the product. alpha and beta act as the weights for the product and the input tensor respectively, so the result is beta × input + alpha × (batch1 @ batch2)
Requirements:
- batch1 and batch2 must be 3-D tensors containing the same number of matrices
- if batch1’s shape is (b × n × m) and batch2’s shape is (b × m × p), then input must be broadcastable to a (b × n × p) tensor, and the output has shape (b × n × p)
Params:
- input — the input tensor
- batch1 — the tensor of shape (b × n × m)
- batch2 — the tensor of shape (b × m × p)
- beta — (optional) — weight for input
- alpha — (optional) — weight for the product matrix of batch1 and batch2
- out — (optional) — the output tensor
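Here is a minimal sketch of the call described above. The shapes and the beta/alpha values are chosen only for illustration, and the result is compared against the same computation written out by hand.

```python
import torch

torch.manual_seed(0)

inp = torch.randn(5, 20, 10)     # (b x n x p)
batch1 = torch.randn(5, 20, 30)  # (b x n x m)
batch2 = torch.randn(5, 30, 10)  # (b x m x p)

# beta weights the input tensor, alpha weights the batched product.
out = torch.baddbmm(inp, batch1, batch2, beta=0.5, alpha=2.0)

# The same computation written out explicitly.
manual = 0.5 * inp + 2.0 * torch.bmm(batch1, batch2)

print(out.shape)  # torch.Size([5, 20, 10])
print(torch.allclose(out, manual, rtol=1e-4, atol=1e-4))
```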
A tensor of shape (5,20,10) is returned
A tensor of shape (6,2,3) is assigned to the tensor ‘a’ which can be seen by performing a.size()
Here the function raises an error because the first tensor has shape (b × n × m) but the second tensor does not have the expected shape (b × m × p), so the two batches cannot be multiplied together
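The shape mismatch described above can be reproduced with a sketch like the following. The shapes are made up for illustration; the point is that the inner dimensions of batch1 and the second batch disagree (4 versus 5).

```python
import torch

inp = torch.randn(6, 2, 3)
batch1 = torch.randn(6, 2, 4)      # (b x n x m) with m = 4
bad_batch2 = torch.randn(6, 5, 3)  # inner dimension is 5, not 4

try:
    torch.baddbmm(inp, batch1, bad_batch2)
    failed = False
except RuntimeError as err:
    # PyTorch reports the incompatible shapes in the error message.
    failed = True
    print("baddbmm rejected the shapes:", err)
```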
This function should be used to perform weighted batch-wise multiplication of tensors.
torch.bmm
torch.bmm(input, mat2, out=None)
This is a simplified version of torch.baddbmm
Returns:
- a Tensor
Utility:
- Performs a matrix multiplication on the input and mat2, batchwise
Requirements:
- input and mat2 must be 3-D tensors, with the same number of matrices
Params:
- input — the input tensor of shape (b ×n ×m)
- mat2 — the tensor of shape (b ×m × p)
- out — (optional) — the output tensor
Note that this function does not broadcast
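A minimal sketch of torch.bmm, with illustrative shapes. It also shows the no-broadcast rule: a batch of size 1 is rejected by torch.bmm, whereas torch.matmul (used here only for contrast) does broadcast the batch dimension.

```python
import torch

torch.manual_seed(0)

inp = torch.randn(5, 20, 30)  # (b x n x m)
mat2 = torch.randn(5, 30, 4)  # (b x m x p)

out = torch.bmm(inp, mat2)
print(out.shape)  # torch.Size([5, 20, 4])

# torch.bmm insists that both batches have the same size b.
single = torch.randn(1, 20, 30)
try:
    torch.bmm(single, mat2)
    bmm_broadcasts = True
except RuntimeError:
    bmm_broadcasts = False

# torch.matmul, by contrast, broadcasts the batch dimension.
broadcast_out = torch.matmul(single, mat2)
print(bmm_broadcasts, broadcast_out.shape)
```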
A Tensor of shape (5,20,4) is returned
A tensor of shape (6,2,4) is assigned to the tensor ‘a’ which can be seen by performing a.size()
Here the function raises an error because the first tensor has shape (b × n × m) but the second tensor does not have the expected shape (b × m × p).
This function can be used in Multi Layer Perceptrons.
torch.cholesky
torch.cholesky(input, upper=False, out=None)
Returns:
- a Tensor
Utility:
- Performs the Cholesky decomposition of a symmetric positive-definite matrix A or its batches
Requirements:
- input tensor of shape (a,n,n), where a is a whole number and n is a natural number. ‘a’ denotes the batch size, if any, of the symmetric positive-definite matrices
Params:
- input — the input tensor of shape (*,n,n)
- upper — (optional) — boolean value; if set to True the function returns the upper triangular matrix, else it returns the lower triangular matrix. The default value is False
- out — (optional) — the output tensor
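A minimal sketch of both settings of upper, using a small textbook symmetric positive-definite matrix chosen for illustration. Note that newer PyTorch releases mark torch.cholesky as deprecated in favour of torch.linalg.cholesky, so you may see a warning when running this.

```python
import torch

# A classic symmetric positive-definite example matrix.
a = torch.tensor([[4.0, 12.0, -16.0],
                  [12.0, 37.0, -43.0],
                  [-16.0, -43.0, 98.0]])

# Default (upper=False): lower triangular factor L with a = L @ L.T
lower = torch.cholesky(a)

# upper=True: upper triangular factor U with a = U.T @ U
upper = torch.cholesky(a, upper=True)

print(lower)
print(torch.allclose(lower @ lower.t(), a, atol=1e-4))
print(torch.allclose(upper.t() @ upper, a, atol=1e-4))
```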
We get a lower triangular matrix since the default value of upper is False
We get an upper triangular matrix since we set the value of upper to True
The Cholesky decomposition of a singular matrix cannot be computed, so the function raises an error
Can be used to solve linear systems that involve symmetric positive-definite matrices
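The failure on a singular matrix can be sketched as follows. The matrix here is a made-up example that is symmetric but has determinant zero, so it is not positive-definite.

```python
import torch

# Symmetric, but singular (determinant is 0), so not positive-definite.
singular = torch.tensor([[1.0, 1.0],
                         [1.0, 1.0]])

try:
    torch.cholesky(singular)
    failed = False
except RuntimeError as err:
    # The factorization cannot be completed for this input.
    failed = True
    print("cholesky failed:", err)
```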
torch.geqrf
torch.geqrf(input, out=None)
Returns:
- a tuple of Tensors (a,tau) of the form (Tensor,Tensor)
Utility:
- Low level function for calling LAPACK directly. torch.qr() can also be used
- Computes a QR decomposition of input, but without constructing Q and R as explicit separate matrices.
Requirements:
- input matrix of a given shape
Params:
- input — the input tensor
- out — (optional) — the output tensor
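A minimal sketch of calling torch.geqrf on a random matrix (the 4 x 3 shape is chosen only for illustration). Since Q is not built explicitly, the check below relies on the fact that R sits in the upper triangle of the first returned tensor and that R.T @ R must equal x.T @ x for any QR decomposition of x.

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 3)

# Returns a tuple (a, tau); Q is only stored implicitly.
result = torch.geqrf(x)
a, tau = result

print(len(result))         # 2
print(a.shape, tau.shape)  # torch.Size([4, 3]) torch.Size([3])

# R is the upper triangle of a (first 3 rows for a 4 x 3 input).
r = torch.triu(a)[:3]

# Because Q has orthonormal columns, x.T @ x == R.T @ R.
print(torch.allclose(r.t() @ r, x.t() @ x, atol=1e-4))
```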
A tuple of Tensors (a,tau) of the form (Tensor,Tensor) is returned.
The length of the tuple is 2
Passing a tensor of the wrong data type can result in an error
Used in implicit QR decomposition
torch.logdet
torch.logdet(input)
Returns:
- a Tensor
Utility:
- Performs the computation of the log determinant of a square matrix or its batches
Requirements:
- input tensor of shape (a,n,n), where a is a whole number and n is a natural number. ‘a’ denotes the batch size, if any, of the square matrices
Params:
- input — the input tensor of shape (*,n,n)
Note: -inf is returned if determinant is zero and nan if determinant is negative
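The three cases from the note can be sketched with small hand-picked matrices: one with a positive determinant, one with a zero determinant and one with a negative determinant. The matrices are illustrative only.

```python
import math
import torch

positive = torch.tensor([[2.0, 0.0], [0.0, 3.0]])  # det = 6
singular = torch.tensor([[1.0, 2.0], [2.0, 4.0]])  # det = 0
negative = torch.tensor([[0.0, 1.0], [1.0, 0.0]])  # det = -1

ld_pos = torch.logdet(positive)   # log(6)
ld_zero = torch.logdet(singular)  # -inf
ld_neg = torch.logdet(negative)   # nan

# Batches work too: input of shape (2, 2, 2) gives output of shape (2,).
ld_batch = torch.logdet(torch.stack([positive, positive]))

print(ld_pos, ld_zero, ld_neg, ld_batch.shape)
```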
In both working examples, the log of the determinant of the square matrix is computed correctly
The error occurs because the matrix c is not a square matrix
According to the documentation, Singular Value Decomposition results are used in the backward pass when the input is not invertible.
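The non-square failure can be reproduced with a sketch like this one, where c is a made-up 2 x 3 matrix.

```python
import torch

# c has shape (2, 3), so it is not a square matrix.
c = torch.randn(2, 3)

try:
    torch.logdet(c)
    failed = False
except RuntimeError as err:
    # logdet only accepts square matrices or batches of them.
    failed = True
    print("logdet failed:", err)
```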
Conclusion
You have learnt about five awesome PyTorch Tensor operations: their return types, their arguments, their requirements, their utilities, and how to use and not use them in Python code.
Author’s Note
This is just the beginning of your journey in deep learning. A lot remains to be discovered. So what are you waiting for? Go, fall in love with PyTorch and dive deep into the ocean called Deep Learning.