Deep Learning For Beginners

If you work in the tech sector or have an interest in the tech scene, you’ve probably heard the term “deep learning” floating around quite a bit. It’s an area of computer science that is reshaping artificial intelligence and powering many of the systems being built today. Although deep learning is making our lives easier, understanding how it works can be hard. Having spent quite some time exploring the world of deep learning, mostly for computer vision applications, I’ve learned a thing or two about what it’s all about, and I’m here to share what I learned.

Firstly, before you understand deep learning, it’s important that you know what machine learning is. Quite simply, machine learning is an area of Artificial Intelligence (AI) that allows computers to ‘learn’. Traditionally, we got computers to do things by giving them a strict set of instructions (a.k.a. computer programs). Machine learning uses a very different approach. Instead of giving the computer a set of instructions on how to do something, we give it instructions on how to learn to do something. We do that by giving it data and programming it to use mathematical and statistical models that make sense of the data and learn to make decisions based on it. For example: think of a system that can classify pictures of animals as ‘cat’, ‘dog’, ‘tiger’, ‘lion’ or ‘elephant’. Instead of manually finding unique visual characteristics and patterns in images of those animals and then coding them up, you can program the computer to take in images of those animals and find the visual patterns and differences between them all by itself. That can be done using a range of different algorithms. The idea here is that the computer ‘learns’ by itself instead of being specifically programmed to do a certain task (in this case, classifying images of different animals). The process of teaching the computer (i.e. giving it data to learn from) is referred to as training.
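To make the ‘learning from data’ idea a bit more concrete, here is a minimal sketch in Python. It is not how real image classifiers work: I’m assuming each animal is described by just two made-up numbers (weight in kg and height in cm), and the `train` and `predict` functions are hypothetical names I chose for illustration. The point is only that nobody writes a rule like “cats are small”; the program works that out from the examples it is given.

```python
# Toy example of learning from data: a nearest-centroid classifier.
# The features (weight in kg, height in cm) and all numbers are invented.

def train(examples):
    """Learn one average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], features))

training_data = [
    ([4.0, 25.0], "cat"),
    ([5.0, 23.0], "cat"),
    ([30.0, 60.0], "dog"),
    ([25.0, 55.0], "dog"),
]

centroids = train(training_data)       # the "training" step
print(predict(centroids, [4.5, 24.0]))   # a small animal it has not seen
print(predict(centroids, [28.0, 58.0]))  # a large animal it has not seen
```

The first call should print `cat` and the second `dog`, because each new animal sits closest to the average of the matching training examples.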

Deep learning is a subset of machine learning. It mostly involves using deep artificial neural networks (algorithms/computational models loosely inspired by the human brain) to tackle machine learning problems. Going back to the example I gave earlier, state-of-the-art image classification solutions today use deep learning. Note: other machine learning algorithms, such as decision trees, are generally not considered deep learning; the term almost always refers to neural networks with many layers.

So, what is a neural network? Here’s an analogy: imagine a neural network as a series of doors one after another and think of yourself as the ‘input’ to the neural network. Every time you open a door, you become a different person (i.e. you change in some way). By the time you open the last door, you have become a very different person. When you exit through the last door, you become the ‘output’ of the neural network. Each door, in this case, represents a layer. A neural network, therefore, is a collection of layers that transform the input in some way to produce an output. Each layer in the neural network consists of ‘weights’ and ‘biases’; these are just numbers that are used to transform the input. The overall idea of a neural network is that it takes in some input (usually a collection of numbers that represent something, e.g. Red-Green-Blue values of pixels in an image), applies some mathematical transformations to the input using the weights and biases in its layers and eventually spits out an output. If you’ve taken a linear algebra class before, you can look at the input, output and weights as matrices. The input matrix gets transformed by a series of matrices (i.e. the weights and biases of the layers) and that becomes your output. Of course, this is a very simplified description of how a neural network works but you get the idea (I hope).
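Here is a rough sketch of the ‘doors’ idea in plain Python, under a few assumptions of mine: the input is a single pixel’s Red-Green-Blue values, the weights and biases are numbers I made up, and the names `dense` and `relu` are just labels I picked. One thing the description above glosses over is that real networks usually apply a simple non-linear function between layers; I’ve included one common choice (ReLU, which replaces negative numbers with zero) and marked it in the code.

```python
# A tiny two-layer neural network, forward pass only.
# All weights, biases and the input pixel are invented for illustration.

def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def relu(values):
    """Non-linearity applied between layers (not covered in the text above)."""
    return [max(0.0, v) for v in values]

pixel = [0.2, 0.5, 0.1]  # Red, Green, Blue values of one pixel

# First 'door': 3 inputs -> 2 outputs
w1 = [[1.0, 0.0, -1.0],
      [0.5, 0.5, 0.5]]
b1 = [0.0, 0.1]

# Second 'door': 2 inputs -> 1 output
w2 = [[1.0, -1.0]]
b2 = [0.5]

hidden = relu(dense(pixel, w1, b1))  # the input after the first door
output = dense(hidden, w2, b2)       # the input after the last door

print(hidden)
print(output)
```

With these particular numbers the hidden values come out at roughly 0.1 and 0.5, and the final output at roughly 0.1. Change any weight or bias and the output changes, which is exactly the knob that training turns.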

A deep neural network is just a neural network with many layers (as you stack layers on top of one another, the neural network keeps getting ‘deeper’). How many is many? Well, there’s the VGG16 architecture (used for image classification), which has 16 layers with weights, and then there’s ResNet-152, one variant of the ResNet architecture (also used for image classification), which has 152 layers. So, the range is pretty wide. The basic idea of deep learning is using neural networks with multiple layers.
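In code, ‘deeper’ just means a longer list of layers to loop over. The sketch below stacks 16 small layers with random weights. To be clear, this is not VGG16 (which is built from convolutional layers and is far larger); the layer sizes, the `make_layer` and `forward` helpers, and the ReLU step between layers are all my own assumptions for illustration.

```python
import random

def dense(inputs, weights, biases):
    """One layer: weighted sums of the inputs plus biases."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def relu(values):
    """Simple non-linearity applied after each layer (an assumption here)."""
    return [max(0.0, v) for v in values]

def make_layer(n_in, n_out, rng):
    """Create a layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(inputs, layers):
    """Pass the input through every layer in order."""
    values = inputs
    for weights, biases in layers:
        values = relu(dense(values, weights, biases))
    return values

rng = random.Random(0)          # fixed seed so runs are repeatable
sizes = [3] + [4] * 15 + [2]    # 3 inputs, 15 hidden layers of 4, 2 outputs
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

output = forward([0.2, 0.5, 0.1], layers)
print(len(layers), len(output))  # number of layers, number of outputs
```

Making the network deeper or shallower is just a matter of changing `sizes`; the `forward` loop stays the same.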

Now, the question is: how does a neural network learn? Through gradient descent and backpropagation! As I said before, neural networks consist of layers that consist of weights and biases (which are just collections of numbers). During the training phase, the neural network tries to find the weights/biases that lead to the most accurate output. Before a neural network is trained, the weights/biases are initialized, either randomly or from a previously trained model. Either way, when training happens, the neural network changes those weights and biases based on what it ‘learns’. When we build a neural network, we have to decide on (i.e. choose or design) something called a cost function. The cost function is basically just a mathematical function that takes in the output from a neural network (for a given input) and the ground truth data (i.e. the expected output from the neural network for that given input) and calculates how far off the result from the neural network was. Backpropagation is the method that works out, layer by layer from the output back to the input, how much each weight and bias contributed to that cost (in calculus terms, the gradient of the cost with respect to each one). An optimization technique like gradient descent then uses those gradients to nudge the weights and biases in the direction that reduces the cost (this will make more sense to you if you have taken calculus before; remember optimizing/minimizing/maximizing functions?). The network keeps doing this as it trains on more and more data (get the output from the neural network, calculate the cost, backpropagate and update the weights). Over time, the weights and biases adjust to the data and (hopefully) you end up with a neural network that has a high output accuracy. Remember, the practical effectiveness or accuracy of a neural network is largely dependent on the data used to train it, so it’s very important that the proper dataset is built or chosen. Without good data (and a good amount of data) it can be very hard to train an accurate neural network.
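The training loop described above can be sketched with the smallest possible ‘network’: a single weight `w` and a single bias `b`, so the output is just `w * x + b`. Everything here is an assumption on my part for the sake of a runnable example: the data (points on the line y = 2x + 1), the cost function (mean squared error), the learning rate and the number of steps. With only one layer, the gradients can be written out by hand; backpropagation is what lets us compute the same kind of gradients when there are many layers.

```python
# Gradient descent on a one-weight, one-bias model with a mean squared error cost.
# Data, learning rate and step count are invented for illustration.

def predict(w, b, x):
    return w * x + b

def cost(w, b, data):
    """Mean squared error between predictions and the expected outputs."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def gradients(w, b, data):
    """Partial derivatives of the cost with respect to w and b."""
    n = len(data)
    dw = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
    db = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
    return dw, db

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # points on y = 2x + 1

w, b = 0.0, 0.0          # initial weight and bias
learning_rate = 0.05     # how big each adjustment is
initial_cost = cost(w, b, data)

for step in range(2000):
    dw, db = gradients(w, b, data)
    w -= learning_rate * dw   # move against the gradient to reduce the cost
    b -= learning_rate * db

final_cost = cost(w, b, data)
print(round(w, 3), round(b, 3))
print(initial_cost, final_cost)
```

After training, `w` should end up very close to 2 and `b` very close to 1, and the final cost should be far smaller than the initial one. Nobody told the program the rule ‘multiply by 2 and add 1’; it found those numbers by repeatedly reducing the cost.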

PS: The cost function basically measures how inaccurate the neural network is; as we minimize the cost function by changing weights/biases, we are essentially trying to make the neural network mathematically more accurate (as defined by the cost function). However, that accuracy is measured on the data it’s trained on, so a low cost does not necessarily mean the neural network is adequately trained or that it will do well on data it has not seen.
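A deliberately extreme sketch of that caveat: below, a ‘model’ that simply memorizes its training examples gets a perfect cost on the training data and a terrible one on new data. The two models (`memorizer` and `doubler`), the data and the use of mean squared error as the cost are all hypothetical choices of mine, not something a real neural network would look like.

```python
# Low training cost does not guarantee good results on unseen data.
# Models and data are invented for illustration.

def mean_squared_error(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # the rule is y = 2x
held_out_data = [(4.0, 8.0), (5.0, 10.0)]              # same rule, unseen inputs

lookup = dict(training_data)

def memorizer(x):
    """Returns the memorized answer for seen inputs and 0.0 for anything else."""
    return lookup.get(x, 0.0)

def doubler(x):
    """A model that has captured the underlying rule."""
    return 2.0 * x

print(mean_squared_error(memorizer, training_data))  # 0.0
print(mean_squared_error(memorizer, held_out_data))  # 82.0
print(mean_squared_error(doubler, training_data))    # 0.0
print(mean_squared_error(doubler, held_out_data))    # 0.0
```

Both models look equally good if you only check the training data. The difference only shows up on examples that were held back, which is why it’s common practice to set some data aside purely for evaluation.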

By now, you should have a basic understanding of what machine learning, deep learning and neural networks are. That’s all I wanted to share through this post. I know that many of the resources online may seem very technical or intimidating, so I tried to keep it simple. Of course, a lot of what I wrote is an oversimplification intended to help anyone get a basic idea of deep learning. If you want to know more, you should consider doing more research in this field. It has been a very rewarding experience for me!

Lastly, this is for those of you who would like to learn more about deep learning. If you have never looked into machine learning before, you can consider taking the Intro to Machine Learning course on Udacity. It’s a pretty great start I’d say, especially if you have absolutely no past experience with or knowledge of machine learning. Going into deep learning will require you to be comfortable with calculus and linear algebra, at least at a basic level, to understand what’s going on. If you’ve got that covered, I would recommend taking the CS231N course at Stanford University (available for free online), but that’s just because most of my experience with deep learning has been with computer vision and that’s what the course focuses on. There’s also another course offered by Stanford called CS224N that focuses on natural language processing with deep learning. Either of them should be a good starting point. At the end of the day, however, most of the learning happens when you try to build things on your own, so get the basics sorted and start experimenting with neural networks if you want to go deeper into deep learning. Also, look for research papers on neural networks (architecture, applications, etc.) and read them! I am still learning myself, so if you come across any interesting learning material, please do share it with me.

Looking forward to an exciting future powered by AI and Deep Learning!