Introduction to “Tensors”

Tensors: the building blocks of machine learning!

Summary: I like to learn by playing around with code and sharing my knowledge via blogs like this one. This post provides a quick introduction to tensors: what a tensor is, its attributes, how it differs from a matrix, some code examples, and its use cases. For more posts like these on machine learning basics, follow me here or on Twitter: Shaistha Fathima.

Here is the link to all the posts of Introduction to “Tensors” series:

If Interested you may also check the other series on Basic Concepts You Should Know Before Starting with the “Neural Networks” (NN)

What are Tensors?!

Tensors are the basic building blocks of modern machine learning. You can think of a tensor as a container storing numbers, which we refer to as numerical data.

Types of Tensors:

0-D tensor / Scalar: A container holding just a single number, say 23 or 5 or 4, is called a 0-D tensor (or simply a scalar in general science!).

1-D tensor / Vector: If you remember just a little bit of physics (ahh, those never-ending classes! I know, right? Honestly, I liked chemistry more! Coming back...), a vector was...? Yes, a vector is a mathematical object that has a size, called the magnitude, and a direction. So you do remember... just messing with you, haha!

Now, seriously, if you know any programming language, you might remember something called “arrays”, right? Yes, a 1-D array is nothing but a string of data stored in a single row or column. In deep learning we call it a 1-D tensor! 1-D arrays are also known as vectors!

I think you can now put the pieces together. What is a 2-D tensor?... Yes, it’s called a “matrix”! Bet you all knew that!

So what’s a 3-D tensor?…

Yup! It’s just a tensor, no extra naming...

So, in general, we call a 2-D array a matrix, but anything of higher dimension is just called a “tensor” (at least in deep learning, though multi-dimensional matrices do exist in mathematics!).

Putting it all together, a tensor can be thought of as a multi-dimensional array, or simply a generalized form of a matrix. If you remember, arrays are named scalar, vector, or matrix based on their number of indexes, as shown below (or simply a number, 1-D array, and 2-D array in computer science) when the number of indexes is 0, 1, or 2 respectively; anything with more than 2 indexes is just called a tensor!

Array and its name as per its index number

The term “rank” is used for the number of dimensions, or indexes, of a tensor.

Example: a 2-D tensor has rank = 2; a 3-D tensor has rank = 3.
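To make the rank idea concrete, here is a small sketch using NumPy (any array library would do; the `ndim` attribute reports the rank):

```python
import numpy as np

scalar = np.array(5)                  # 0-D tensor (rank 0)
vector = np.array([1, 2, 3])          # 1-D tensor (rank 1)
matrix = np.array([[1, 2], [3, 4]])   # 2-D tensor (rank 2)
cube = np.ones((2, 2, 2))             # 3-D tensor (rank 3)

# ndim gives the rank, i.e., how many indexes you need to pick out one element
print(scalar.ndim, vector.ndim, matrix.ndim, cube.ndim)  # 0 1 2 3
```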

This might not make much sense yet! But believe me, all you need to remember is that tensors are nothing but n-dimensional matrices, without going too deep into it!

For example, an array with 3 indexes is a 3-D tensor (OK, OK, a 3-D matrix is also the same). So, how do they differ?

If you do insist, as per Wikipedia: In mathematics, a tensor is an algebraic object related to a vector space and its dual space that can take several different forms, for example, a scalar, a tangent vector at a point, a cotangent vector (dual vector) at a point, or a multi-linear map between vector spaces. Euclidean vectors and scalars (which are often used in elementary physics and engineering applications where general relativity is irrelevant) are the simplest tensors.[1] While tensors are defined independent of any basis, the literature on physics often refers to them by their components in a basis related to a particular coordinate system...

Okay! That was too much to handle... A simpler mathematical definition would be: a tensor is a mathematical object analogous to, but more general than, a vector, represented by an array of components that are functions of the coordinates of a space.

To put it simply, as per an amazing explanation by Steven Steinke in his Medium blog post “What’s the difference between a matrix and a tensor?” (a must-read for better understanding!):

A tensor is a mathematical entity that lives in a structure and interacts with other mathematical entities. If one transforms the other entities in the structure in a regular way, then the tensor must obey a related transformation rule.

This “dynamical” property of a tensor is the key that distinguishes it from a mere matrix. It’s a team player whose numerical values shift around along with those of its teammates when a transformation is introduced that affects all of them.

Any rank-2 tensor can be represented as a matrix, but not every matrix is really a rank-2 tensor. The numerical values of a tensor’s matrix representation depend on what transformation rules have been applied to the entire system.

Coming back, we may simply think of a tensor as a generalized form of a matrix; for example, a 3-D tensor could be pictured as a “cube” of numbers!

1-D, 2-D, 3-D tensors respectively

Fundamental Attributes of Tensor:

  • Rank: The rank represents the number of dimensions (indexes) present within the tensor.

Example: a 3-D tensor has 3 indexes, i.e., it is a 3-dimensional array or matrix.

  • Axes: Each dimension of a tensor is called an axis; the tensor’s elements are said to run along its axes. Each axis has a “length”, i.e., the number of indexes available along that axis.

Example

a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

Tensor "a" has rank = 2, i.e., it is a 2-D tensor: the number of indexes = 2, so the number of axes = 2.

Each element along the first axis is an array:

a[0] = [1, 2, 3]
a[1] = [4, 5, 6]
a[2] = [7, 8, 9]

Each element along the second axis is a number (the actual data):

a[0][0] = 1
a[0][1] = 2
a[0][2] = 3
...
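The same axes can be inspected programmatically; here is a quick NumPy sketch with the array values from the example above:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(a.ndim)    # 2 -> the tensor has two axes
print(a[0])      # first element along axis 0 is an array: [1 2 3]
print(a[0][0])   # indexing along axis 1 as well reaches a single number: 1
```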
  • Shape: Every tensor has a shape, which is given by the “length” of each axis, i.e., how many indexes are present along it.

Example: in a tensor like this, size and shape are the same:

a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

It is a 3-by-3 matrix, i.e., with 3 rows and 3 columns.
size or shape = [3, 3]

Note: The shape of a tensor can be changed! That is, the tensor can be reshaped, and its rows and columns can be interchanged (transposed)!
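As a sketch of the shape attribute and of changing it (using NumPy here; PyTorch’s `reshape` and `t()` behave analogously):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(a.shape)       # (3, 3): length 3 along each of the two axes

flat = a.reshape(9)  # same 9 numbers, rearranged into a 1-D tensor
t = a.T              # transpose: rows and columns interchanged
print(t[0])          # first row of t is the first column of a: [1 4 7]
```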

We now know what a tensor is, its attributes, and how it differs from a matrix. But this brings us to another question: how and where do we use it?

As mentioned in another awesome post by Daniel Jeffries at Hackernoon:

Tensors are containers storing numbers, right? Here are some common types of datasets that we store in various types of tensors:

  • 3D Tensor = Time series, RGB color images
  • 4D Tensor = Images (JPEG)
  • 5D Tensor = Videos

Okay... How do we do that?

In almost every one of these tensors the common thread will be sample size. Sample size is the number of things in the set. That could be the number of images, the number of videos, the number of documents, or the number of tweets.

Typically, the actual data will have one dimension less than the full tensor, the extra dimension being sample_size:

all_dimensions - sample_size = actual_dimensions_of_data

For example, an image is really represented by three fields, like this:

(width, height, color_depth) = 3D

But we don’t usually work with a single image or document in machine learning. We have a set. We might have 10,000 images of tulips, which means we have a 4D tensor, like this:

(sample_size, width, height, color_depth) = 4D
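A sketch of such a batch, using a hypothetical stand-in with a smaller sample size and image size just to show the layout:

```python
import numpy as np

# hypothetical batch: 100 RGB "tulip" images of 64 x 64 pixels
sample_size, width, height, color_depth = 100, 64, 64, 3
images = np.zeros((sample_size, width, height, color_depth), dtype=np.uint8)

print(images.ndim)      # 4 -> the whole set is a 4D tensor
print(images.shape)     # (100, 64, 64, 3)
print(images[0].shape)  # one image is a 3D tensor: (64, 64, 3)
```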

Time Series Data

3D tensors are very effective for time series data.

Example: Medical scans. We can encode an electroencephalogram (EEG) signal from the brain as a 3D tensor, because it can be encapsulated with 3 parameters:

(time, frequency, channel)
The transformation would look like the above

Now if we had multiple patients with EEG scans, that would become a 4D tensor, like this:

(sample_size, time, frequency, channel)
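A minimal sketch of that EEG layout (the sizes here are made up purely for illustration):

```python
import numpy as np

# hypothetical: 4 patients, 1000 time steps, 50 frequency bins, 8 electrode channels
eeg = np.zeros((4, 1000, 50, 8))

print(eeg.ndim)      # 4 -> the set of scans is a 4D tensor
print(eeg[0].shape)  # one patient's scan is a 3D tensor: (1000, 50, 8)
```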

Images

4D tensors are great at storing a series of images, like JPEGs. As we noted earlier, a single image is stored with three parameters:

  • Height
  • Width
  • Color depth

A single image is a 3D tensor, but a set of images makes it 4D. Remember, the fourth field is sample_size.

In TensorFlow (e.g., for the MNIST dataset), an image batch is stored channels-last:

(sample_size, height, width, color_depth)

whereas PyTorch stores it channels-first:

(sample_size, color_depth, height, width)
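Converting between the two layouts is just a reordering of axes; a sketch with NumPy (in PyTorch the equivalent call would be `permute`):

```python
import numpy as np

# a batch in TensorFlow's channels-last layout: (sample_size, height, width, color_depth)
nhwc = np.zeros((10, 28, 28, 3))

# reorder the axes into PyTorch's channels-first layout:
# (sample_size, color_depth, height, width)
nchw = nhwc.transpose(0, 3, 1, 2)

print(nchw.shape)  # (10, 3, 28, 28)
```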

Color photos can have different color depths, depending on their resolution and encoding. A typical JPG image uses RGB, so it has a color depth of 3, one channel each for red, green, and blue.

Example: a picture of an awesome cat. A 750 pixel x 750 pixel image will have the following shape: (750, 750, 3).

750 x 750 pixel

Hence this cat would get reduced to a series of cold numbers as it is “transformed” or “flowed” through the network.

Then let’s say we had a bunch of images of different types of cats. Perhaps we have 10,000 that are 750 pixels high by 750 pixels wide. We would define that set of data to PyTorch as a 4D tensor of shape:

(10000, 3, 750, 750)

5D Tensors

A 5D tensor can store video data. In TensorFlow, video data is encoded as:

(sample_size, frames, width, height, color_depth)

If we took a five-minute video (5 minutes x 60 seconds = 300 seconds) at 1080p HD, which is 1920 pixels x 1080 pixels, sampled at 15 frames per second (300 seconds x 15 = 4500 frames), with a color depth of 3, we would store it as a 4D tensor that looks like this:

(4500,1920,1080,3)

The fifth field in the tensor comes into play when we have multiple videos in our video set. So if we had 10 videos just like the one above, we would have a 5D tensor of shape:

(10,4500,1920,1080,3)
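A sketch of that 5D layout, with tiny made-up sizes so it actually fits in memory:

```python
import numpy as np

# hypothetical: 2 clips, 30 frames each, 64 x 36 pixels, RGB
videos = np.zeros((2, 30, 64, 36, 3), dtype=np.uint8)

print(videos.ndim)      # 5 -> the video set is a 5D tensor
print(videos[0].shape)  # a single video is a 4D tensor: (30, 64, 36, 3)
```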

Let’s not go too deep into it for now...

Conclusion

I hope you now understand what tensors are and where we can use them. In the next post, here, we will use tensors in PyTorch for the first time and look at the different operations that can be performed on them.

Feel free to ping me if you have any doubts regarding this post!

Till then stay tuned, and happy coding!

shaistha fathima
Secure and Private AI Writing Challenge

ML Privacy and Security Enthusiast | Research Scientist @openminedorg | Computer Vision | Twitter @shaistha24