Day 8 of 100DaysofML

Charan Soneji · Published in 100DaysofMLcode · Jun 24, 2020

Deep Learning is one of the core fundamentals on our journey in Machine Learning and can be regarded as a subset of ML. As the term suggests, it aims at deeper learning and better performance in both supervised and unsupervised settings.

Some of the abbreviations that we shall constantly use and you should keep in mind are:
DL-Deep Learning
NN-Neural Networks
ML-Machine Learning

Let me keep the explanation of when and why we use Deep Learning simple. Machine learning that makes use of Neural Networks is termed Deep Learning, so whenever we build a neural network, we are working with DL.

Just like ML, DL has two different types of learning, classified as Supervised and Unsupervised. We already know the difference between the two, so I'm going to go straight to the main algorithms under each. The supervised learning algorithms in DL are:
1. ANN (Artificial Neural Network)
2. CNN (Convolutional Neural Network)
3. RNN (Recurrent Neural Network)

On the other hand, the unsupervised DL algorithm types are:
1. Self Organizing maps
2. Boltzmann machine
3. Autoencoders (Recommendation Engines)

Basic types of DL algos

I’m going to focus more on the supervised learning algorithms, but I’ll try to give a gist of most of the algos. First, though, why DL over ML? For starters, let’s take a look at this graph:

ML vs DL performance

We can train DL models much the same way we train ML models, but the main difference shows up in performance. For large amounts of data, DL generally scales better and delivers higher performance than traditional ML. This is a commonly observed trend.
Take the example of a recommendation system, which is built on the data we feed it. It tries to understand the usage patterns of the user: Spotify, for instance, learns the genres of music you mostly listen to and then suggests songs of a similar genre based on the data it has collected. The more the data, the better the prediction will be.

Now, let's get to the concept.

1. ANN or Artificial Neural Network
We can perform both classification and regression using an ANN. Both are supervised learning tasks: regression predicts values, classification predicts categories.

How does it work?
The entire idea of a neural net is that it consists of a number of layers. Let us look at the diagram below:

Basic representation of a neural net

Here, each of the colors in the diagram represents one layer of the neural net. Inside these layers, we have a number of nodes (the circles of different colors). A neural network may consist of many layers, and the complexity of the NN keeps increasing as the number of layers increases. Ignore the arrows in the diagram for now.

Let us take another diagram to understand the 3 most important layers of the NN which are:

  1. Input layer: This is the first layer of the NN; it receives the data, hence the name input layer. The weights assigned to each node in this layer differ from one another because they are randomly initialized.
  2. Hidden layer: These are the middle layers of the NN. In the given diagram there is only one such layer, but we can certainly have more; as we add hidden layers, the complexity of the NN keeps increasing. The nodes in these layers have their own activation function.
  3. Output layer: As the term suggests, this is the last layer, which produces the output of the NN. For instance, it may give us a value in the case of regression or predict a class in the case of classification.

Obviously, these layers are a bit more complex than they look and have a number of moving parts, but let's just take baby steps for now.

Let us now break things into even smaller parts and understand a single layer of the NN with the help of a diagram.

Given single layer of NN

Here, in the diagram, we have a single layer of the neural network with 3 main circles, or nodes. These are the fundamental units of a NN and are also called perceptrons. Each of the circles numbered 1, 2 and 3 represents a node. I shall talk about the functions and features of these nodes in tomorrow’s blog; I just want to cover the fundamentals of a NN for now.

So, a quick summary before it gets overwhelming: a neural network consists of a number of layers used during training. There are 3 main kinds of layers, termed input, hidden and output, and each layer consists of a number of nodes, each with its own activation function and assigned weights.
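To make those layers concrete in code, here is a minimal sketch of such a network written with Keras. This is just my illustration, not the network we will build later in this series, and the sizes (4 input features, 8 hidden nodes, 1 output) are assumed values chosen purely for demonstration.

from tensorflow import keras
from tensorflow.keras import layers

# A minimal ANN sketch: one input layer, one hidden layer, one output layer.
# The layer sizes below are assumed values used only for illustration.
model = keras.Sequential([
    layers.Input(shape=(4,)),               # input layer: 4 features enter the network
    layers.Dense(8, activation="relu"),     # hidden layer: 8 nodes, each with its own activation
    layers.Dense(1, activation="sigmoid"),  # output layer: one value for binary classification
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()

Swapping the last layer for a single node with no sigmoid (and a mean-squared-error loss) would turn the same network into a regression model instead of a classifier.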

2. CNN or Convolutional Neural Network
These types of neural networks are used for computer vision. Tasks such as image recognition and object detection are done with the help of Convolutional Neural Networks. Let us take an example: we are given a picture and need to identify the animal in it. The network has to learn the different features of such pictures, and we must provide tons of pictures to train the model before it can actually predict the animal in a new picture accurately. For these kinds of scenarios, we use CNNs or Convolutional Neural Networks. Conceptually, a CNN works in 4 main steps. I shall describe these steps in brief, and we can look into them in the upcoming blogs.
- Convolution
- ReLU activation function
- Max Pooling
- Flattening
These are the main steps; we can look into them in detail later. A diagram giving the basic flow is shown below, and a short code sketch follows it.

CNN flow for prediction of animal
NN gif (towardsdatascience.com) which shows the representation for prediction of an animal
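As a rough code sketch of those four steps (Keras again, with the image size, filter count and number of classes all being assumed values for illustration), a tiny CNN could look like this:

from tensorflow import keras
from tensorflow.keras import layers

# A tiny CNN sketch following the four steps above.
# Input size (64x64 RGB) and the 10 animal classes are assumptions for illustration only.
cnn = keras.Sequential([
    layers.Input(shape=(64, 64, 3)),
    layers.Conv2D(16, (3, 3), activation="relu"),  # convolution + ReLU activation
    layers.MaxPooling2D((2, 2)),                   # max pooling shrinks the feature maps
    layers.Flatten(),                              # flattening turns the maps into one long vector
    layers.Dense(10, activation="softmax"),        # predict which of the 10 assumed classes it is
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])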

3. RNN or Recurrent Neural Network
RNN is the most complex type of NN in my opinion. I shall try to explain with an example for better understanding. Let us take an incomplete sentence that needs to be completed.
Eg: Charan went to the supermarket and bought ___________.
This is the sentence that needs to be completed. When we pass it to the DL model, the model uses its training instances to identify the positioning of verbs, nouns, etc. in the sentence and comes up with a prediction of what could fill the blank.
RNNs make use of a lot of memory: they need to keep track of the positioning of verbs, nouns, etc., which takes a considerable amount of it. Understanding and building RNN models takes some time, but we will get there eventually. Some of the different types of RNN are:
- LSTM (Long Short Term Memory): a heavier unit with more gates and parameters, typically used when there is enough data and compute to support it.
- GRU (Gated Recurrent Units): a lighter-weight unit with fewer gates, useful on weaker systems or smaller datasets.
We shall try to understand them later. Right now, the understanding of the fundamental concept behind the different layers is important.

Simplified diagram of RNN

In the diagram above, if you are wondering what (t-1), t and (t+1) stand for, they are the positions of the words. The word at position (t-1) influences the word at position t, which in turn affects the word at position (t+1).
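A rough sketch of that idea in code (Keras once more; the vocabulary size, sequence length and layer sizes are all assumed numbers, so treat this as the shape of a model rather than a working language model):

from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 5000   # assumed size of the vocabulary
seq_length = 10     # the previous 10 word positions are used to predict the next word

# An RNN sketch for predicting the word that fills the blank from earlier positions.
rnn = keras.Sequential([
    layers.Input(shape=(seq_length,)),
    layers.Embedding(vocab_size, 64),                # turn word indices into dense vectors
    layers.LSTM(128),                                # carries memory of earlier positions; swap for layers.GRU(128) on weaker systems
    layers.Dense(vocab_size, activation="softmax"),  # probability of each word filling the blank
])
rnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")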

That was a gist of the supervised learning algos. Now let's give a gist of the unsupervised learning algos.

  1. Self-organizing maps: These unsupervised learning algos are used to cluster data based solely on the features given to us. For example, in credit card fraud detection, transactions can be grouped as fraud or not fraud purely from the input features.
  2. Boltzmann machine: The main purpose of a Boltzmann Machine is to optimize the solution of a problem; it optimizes the weights and quantities related to that particular problem.
  3. Autoencoders (recommendation engines): These are NNs that learn a compressed representation of the data fed to them and use it to make predictions, just like the Spotify example I mentioned above. A minimal sketch is shown right after this list.
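Here is that minimal autoencoder sketch (Keras again; the 100 input features and the 16-number bottleneck are assumed values for illustration):

from tensorflow import keras
from tensorflow.keras import layers

# A minimal autoencoder sketch: squeeze the input through a small bottleneck and rebuild it.
autoencoder = keras.Sequential([
    layers.Input(shape=(100,)),
    layers.Dense(16, activation="relu"),      # encoder: compress the input into 16 numbers
    layers.Dense(100, activation="sigmoid"),  # decoder: try to reconstruct the original input
])
autoencoder.compile(optimizer="adam", loss="mse")

Trained on something like user listening histories, the bottleneck becomes a compact representation of a user's taste that a recommendation engine can compare between users.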

That's all I'm going to cover for today. I shall try to cover more about the nodes and their activation functions soon, along with building my own NN. Keep Learning.

Cheers.
