Introduction To Deep Learning 🤖— Chapter 6

satyabrata pal
Published in ML and Automation
5 min read · Oct 31, 2020

Understanding The 🚂 Of A Deep Learning System

This is a sequel to previous chapters.

What Is An Image? 🖼️

Images are nothing but a collection of pixels, and they can be represented as a multidimensional array of numbers.
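To make the "grid of numbers" idea concrete, here is a minimal sketch in plain Python. The 5x5 grid and its values are invented for illustration (they are not MNIST pixels), and the names tiny_image, height and width are my own.

```python
# A made-up 5x5 grayscale "image": 0 is empty background,
# larger numbers mean a stronger mark at that pixel.
tiny_image = [
    [0,   0,   0,   0, 0],
    [0, 255, 255, 255, 0],
    [0,   0,   0, 255, 0],
    [0,   0, 255,   0, 0],
    [0,   0, 255,   0, 0],
]

height = len(tiny_image)       # number of rows
width = len(tiny_image[0])     # number of columns
print(height, width)

# A single pixel is looked up by row first, then column.
print(tiny_image[1][2])
```

If you squint at the non-zero entries you can already see a rough "7" in the grid, which is the same effect we will look for in the real data.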

It is easier to explain with a real example. For this, let me grab a few images from the MNIST dataset.

from fastai.vision.all import *  # brings in untar_data, URLs, tensor, Image and pd
path=untar_data(URLs.MNIST_SAMPLE)

As usual, we will use untar_data to download a subset of the original MNIST data.

We can check that we have got image data and not any funny things. We can use ls() for this, as explained in previous chapters.

path.ls()

See! We got different directories inside our dataset directory. What we are interested in at this moment is the train directory.

I will grab a sample image from this directory like so.

imgof3=(path/'train'/'3').ls()
Image.open(imgof3[1])

So we got the kind of image which we wanted for this chapter. The MNIST dataset consists of images of handwritten digits. The task is to identify which digit each image shows.

Not too long ago, this was a benchmark for judging the performance of image recognition systems. Today, deep learning systems have advanced so much that this dataset is often regarded as a “toy” dataset. Still, datasets like this help us build examples for understanding the processes which power the decision making of neural networks.
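If you want to see what the directory listing and the [1] indexing from this section boil down to, here is a rough sketch using only the standard library. As far as I know, ls() is a convenience that fastai adds on top of pathlib's Path, so iterdir() is a close plain-Python cousin. The temporary folder and the img_0.png style file names are hypothetical placeholders, not the real MNIST files.

```python
import tempfile
from pathlib import Path

# Build a throwaway folder that mimics the train/3 and train/7 layout.
root = Path(tempfile.mkdtemp())
for digit, count in [("3", 4), ("7", 2)]:
    folder = root / "train" / digit
    folder.mkdir(parents=True)
    for i in range(count):
        (folder / f"img_{i}.png").touch()   # empty placeholder files

# List each directory. Sorting makes the order repeatable,
# because iterdir() does not promise any particular order.
threes = sorted((root / "train" / "3").iterdir())
sevens = sorted((root / "train" / "7").iterdir())

print(len(threes), len(sevens))
print(threes[1].name)   # the second entry, like imgof3[1] in the text
```

Sorting the listing is a design choice worth copying: without it, "the second image" may not be the same file on every run.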

How To Convert An Image To Something A 🖥️ Can Understand

Computers perceive images differently than humans. They recognize only numbers. How do we take the image which we saw in the previous section and convert it into numbers?

There is something known as “tensors”. Think of these as multidimensional arrays. An image can be represented as a tensor, which makes it easier for a computer to do computations on it.
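Here is a small sketch of what "multidimensional" means in practice. I am using NumPy arrays as a stand-in for tensors, which is an assumption made for convenience. The variable names are my own, and the 28x28 size matches the MNIST images.

```python
import numpy as np

scalar = np.array(7)                   # 0 dimensions: a single number
vector = np.array([1, 2, 3])           # 1 dimension: a row of numbers
gray_image = np.zeros((28, 28))        # 2 dimensions: height x width
color_image = np.zeros((28, 28, 3))    # 3 dimensions: height x width x color channels

# ndim is the number of dimensions, shape is the size along each one.
for name, arr in [("scalar", scalar), ("vector", vector),
                  ("gray_image", gray_image), ("color_image", color_image)]:
    print(name, arr.ndim, arr.shape)
```

A grayscale digit needs only two dimensions, while a color photo needs a third one to hold the red, green and blue values of each pixel.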

Fastai has a function known as tensor(). It takes in an image and spits back a tensor for that image.

Notice how it takes in an object and gives back a numeric representation of the same. For a grayscale MNIST image, that means a grid of whole numbers between 0 and 255, one per pixel.

tensorof3=tensor(Image.open(imgof3[1]))
tensorof3

The above tensor is what a computer sees when you give it an image.

Great for the computer, not so great for us humans, and this chapter is for humans. So, we need to represent the tensor in a way that is more comfortable for us.

This can be easily done by taking the above tensor and converting it into a pandas dataframe.

df=pd.DataFrame(tensorof3)
df

Again, we didn’t do a good job. I still can’t make anything out of the bunch of numbers here. Let us pick a subset of the above matrix and see if we get a better view of the number hidden in the tensor.

df=pd.DataFrame(tensorof3[4:24,4:24])
df

Look! The outline of a number is just about visible now. There’s one more trick which we can exploit.

df.style.set_properties(**{'font-size':'6pt'}).background_gradient('Blues')

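Now you see! We used the pandas styling API here: style.set_properties shrinks the font so the whole grid fits on screen, and background_gradient shades each cell by its value, so the zeros stay pale while the larger numbers turn a deeper blue.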

Did you notice how there are numbers other than 0 only in the places where the handwritten digit was drawn? The 0s represent the empty places, i.e. the background pixels where nothing was written.
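The same "zeros are empty, everything else is ink" idea can be checked without any styling at all. The sketch below is only an illustration: it uses NumPy and a synthetic 28x28 array with two hand-made strokes as a stand-in for tensorof3, then applies the same [4:24,4:24] crop and prints a text picture.

```python
import numpy as np

# Synthetic 28x28 "digit" (not real MNIST data): two simple strokes.
image = np.zeros((28, 28), dtype=np.uint8)
image[6:22, 13:15] = 200      # vertical stroke
image[6:8, 8:20] = 255        # horizontal stroke across the top

# Same crop as in the text: rows 4 to 23 and columns 4 to 23.
crop = image[4:24, 4:24]

# '#' marks a non-zero pixel (ink), '.' marks a zero pixel (empty).
for row in crop:
    print("".join("#" if value > 0 else "." for value in row))

# Count how many pixels in the crop carry any ink at all.
ink_pixels = int((crop > 0).sum())
print(crop.shape, ink_pixels)
```

Swapping the synthetic array for a real image tensor should give the same kind of picture, with the digit appearing wherever the values are non-zero.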

We can do the same for the other digit in this dataset, i.e. the number 7.

imgof7=(path/'train'/'7').ls()
tensorof7=tensor(Image.open(imgof7[1]))
tensorof7

Now it’s time to put everything together and collect the tensors for all the images.

def tensorsofnum(imgofnum):
    # Open every image path in the list and convert it to a tensor
    return [tensor(Image.open(img)) for img in imgofnum]

The tensorsofnum function takes in a list of image paths and returns a list of tensors, one per image.

tensorsof3=tensorsofnum(imgof3)
tensorsof7=tensorsofnum(imgof7)

Printing the length of each list helps us check whether all the images made it into our lists.

(len(tensorsof3),len(tensorsof7))
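The length check above can be taken one small step further by also confirming that every item has the same shape. This is only a sketch under assumptions: fake_loader is a hypothetical stand-in for tensor(Image.open(path)) that returns a blank NumPy grid, and the file names are made up.

```python
import numpy as np

def fake_loader(path):
    # Hypothetical stand-in for tensor(Image.open(path)):
    # ignores the path and returns a blank 28x28 grid.
    return np.zeros((28, 28), dtype=np.uint8)

def arrays_of_num(paths, loader=fake_loader):
    # Same list-comprehension pattern as tensorsofnum in the text.
    return [loader(p) for p in paths]

paths_of_3 = [f"3/img_{i}.png" for i in range(5)]   # made-up file names
arrays_of_3 = arrays_of_num(paths_of_3)

# Check 1: we got exactly one array per path.
same_length = len(arrays_of_3) == len(paths_of_3)

# Check 2: collect the distinct shapes; one entry means all images agree.
shapes = {a.shape for a in arrays_of_3}
print(same_length, shapes)
```

If the set of shapes ever holds more than one entry, some image has a different size and would cause trouble once the images are combined.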

Conclusion

We have got what we wanted. We have our data, we have an idea of how an image is perceived by a computer, and we know how to convert an image to a tensor. The real reason for going through all this is to get our data into a form on which we can do image recognition using deep learning. Before using deep learning, though, we will find the simplest non-deep learning way to accomplish the task of image recognition, and after that we will improve on it by using deep learning.

In the next chapter we will do exactly that. We will devise a simple method to recognize the numbers in the images, and then we will use that simple model as a benchmark for our deep learning model.


satyabrata pal
ML and Automation

A QA engineer by profession, ML enthusiast by interest, Photography enthusiast by passion and Fitness freak by nature