Getting started with Deep Learning

Shashank Verma
CSE Association SRM
8 min read · Jul 6, 2020

If words like Machine Learning, Artificial Intelligence, Deep Learning, neural networks, and so on fascinate you, well, you don’t need to be intimidated by them anymore.
If you are a beginner, even an absolute one, these few minutes should boost your confidence and lay out a proper path for starting your journey towards Deep Learning and AI.

What is deep learning?
In simple words, deep learning is a branch of Machine Learning whose power comes from Artificial Neural Networks.

Now the question arises: what is a neural network?
You must have heard of the neurons in our brains, which help us transmit signals, perform calculations, remember things, speak a language, identify objects, and much more.
Artificial Neural Networks loosely mimic how the brain operates, with neurons that fire bits of information.

Neural networks are built on perceptron logic. A perceptron can be understood by visualizing it as a logic gate: there are some inputs, a certain function is performed on them, and then there is an output.

The figure shown above is the basic structure of a perceptron: several inputs, a function applied to them (a weighted sum), and a step function that produces the output.
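To make the logic-gate picture concrete, here is a minimal sketch of a perceptron in Python. The function name `perceptron` and the weights `w_and` and `b_and` are made up for illustration; they are chosen by hand so that the perceptron behaves like an AND gate.

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of the inputs followed by a step function."""
    z = np.dot(w, x) + b        # linear combination of the inputs
    return 1 if z >= 0 else 0   # step function: fire (1) or stay silent (0)

# Hypothetical hand-picked weights that make the perceptron act as an AND gate.
w_and = np.array([1.0, 1.0])
b_and = -1.5

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", perceptron(np.array([x1, x2]), w_and, b_and))
```

Only the input (1, 1) pushes the weighted sum above zero, so only that input produces 1. In a real network the weights are learned from data rather than picked by hand.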

Ever wondered how, when you see an animal on the road, you can quickly tell whether it is a cow, a dog, or a lion? Well, I hope you don’t see a lion, but anyway, we will get to know how a machine can identify whether a given picture is of a particular animal or not.

Prerequisites before you start your Deep Learning journey:
1) Spend some time on matrix math and basic algebra.
2) Basic knowledge of how the Python language works.
3) Python libraries: NumPy, Pandas, Matplotlib, and scikit-learn (sklearn) in particular would help a lot.
4) You must know how to work with Jupyter Notebooks.
5) Fifth, and most important, is patience: give yourself enough time to absorb what you have learned.

These prerequisites will take you around 10 days to complete. When you are confident enough, you can follow this simple Deep Learning project, which identifies whether a particular image is an image of a cat or not.

This type of task, where we provide data to our model and it predicts yes or no, is called Binary Classification.

Before starting, I would recommend following a course or reading some articles on neural networks and how they work, covering logistic regression, the loss function, the cost function, gradient descent, and backpropagation as well. Get familiar with the notions of weights and features. Understand what images actually are: how each pixel represents a particular value and how an image can be represented as a vector. All of these are very important for understanding how this model works.

Let’s begin the discussion on this very simple and entry-level deep learning model of classifying whether a given image is of a cat or not.

All of the work is done in a Python 3.6 environment in a Jupyter notebook.

Firstly, we will import all the necessary packages that will be required throughout the making of the model.

We will then load our dataset. Here the dataset has the .h5 extension and contains a certain number of images, split into a training set and a testing set.
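As a rough sketch, .h5 files can be read with the third-party `h5py` package. The function name `load_dataset`, the file paths you pass in, and the key names (`"train_set_x"`, `"train_set_y"`, `"test_set_x"`, `"test_set_y"`) are all assumptions about how the files are organised, so adjust them to match your own dataset.

```python
import numpy as np

def load_dataset(train_path, test_path):
    """Read images and labels from two HDF5 files.

    The key names used below are assumptions; change them to whatever
    keys your .h5 files actually contain.
    """
    import h5py  # third-party package, installed with: pip install h5py

    with h5py.File(train_path, "r") as train_file:
        train_set_x_orig = np.array(train_file["train_set_x"][:])  # images
        train_set_y = np.array(train_file["train_set_y"][:])       # labels

    with h5py.File(test_path, "r") as test_file:
        test_set_x_orig = np.array(test_file["test_set_x"][:])
        test_set_y = np.array(test_file["test_set_y"][:])

    # Store the labels as row vectors of shape (1, number_of_examples).
    train_set_y = train_set_y.reshape(1, -1)
    test_set_y = test_set_y.reshape(1, -1)
    return train_set_x_orig, train_set_y, test_set_x_orig, test_set_y
```

Keeping the labels as row vectors makes them line up with the column-per-image layout used later on.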

Let us now take a look at what our data looks like:

This code helps us display the image at the 68th position in train_set_x_orig. The output y=1 tells us that the image is of a cat.

If it were y=0, the image would not represent a cat.

As we can see here, the image shows spray bottles and the label is y=0, which means the given image is not of a cat.

Now, the next step is to find the number of training samples, the number of testing samples, and the dimensions of the images. To do that, we’ll use the .shape attribute of the NumPy arrays.

We can clearly see that the training set contains 209 images and the testing set contains 50 images.
Each image has dimensions 64x64x3, where 64x64 is the height and width of the image and 3 is the number of RGB color channels.
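Here is a small sketch of how those numbers can be read off with `.shape`. The arrays below are zero-filled placeholders with the same shapes as described above, standing in for the real dataset; the test array name `test_set_x_orig` is assumed by analogy with `train_set_x_orig`.

```python
import numpy as np

# Placeholder arrays with the shapes described above (not the real images).
train_set_x_orig = np.zeros((209, 64, 64, 3), dtype=np.uint8)
test_set_x_orig = np.zeros((50, 64, 64, 3), dtype=np.uint8)

m_train = train_set_x_orig.shape[0]   # number of training examples
m_test = test_set_x_orig.shape[0]     # number of testing examples
num_px = train_set_x_orig.shape[1]    # height (and width) of each image

print("Number of training examples:", m_train)
print("Number of testing examples:", m_test)
print("Each image is of size:", (num_px, num_px, 3))
```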

For convenience, we should now reshape each image of shape (num_px, num_px, 3) into a NumPy array of shape (num_px * num_px * 3, 1). After this, our training and test datasets are NumPy arrays in which each column represents a flattened image. The training array should have m_train columns.
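One way to do this flattening, sketched here on random placeholder pixels instead of the real images, is a single `reshape` followed by a transpose. The name `train_set_x_flatten` is an assumption made for this example.

```python
import numpy as np

# Random placeholder pixels with the same shape as the training images.
rng = np.random.default_rng(0)
train_set_x_orig = rng.integers(0, 256, size=(209, 64, 64, 3), dtype=np.uint8)

m_train = train_set_x_orig.shape[0]

# Flatten every image into one row, then transpose so each image is a column.
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T

print(train_set_x_flatten.shape)  # (12288, 209)
```

Here 12288 is 64 * 64 * 3, and column i of the result holds the pixels of image i.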

One common preprocessing step in machine learning is to center and standardize your dataset, meaning that you subtract the mean of the whole NumPy array from each example and then divide each example by the standard deviation of the whole NumPy array. For picture datasets, however, it is simpler, more convenient, and works almost as well to just divide every row of the dataset by 255 (the maximum value of a pixel channel).
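A tiny sketch of that scaling step, using a hand-made 2x2 array of pixel values in place of the real flattened dataset:

```python
import numpy as np

# Made-up pixel values standing in for the flattened image matrix.
train_set_x_flatten = np.array([[0, 255],
                                [51, 102]], dtype=np.uint8)

# Dividing by 255.0 converts to floats and maps every pixel into [0, 1].
train_set_x = train_set_x_flatten / 255.0

print(train_set_x)
```

Dividing by a float matters here: it turns the integer pixel values into floating-point numbers between 0 and 1.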

So far what we have done:
1) Figured out the dimensions and shapes of the problem.
2) Reshaped the datasets such that each example is now a vector of size (64*64*3,1)
3) Standardized the data.

Now we will build the algorithm using two mathematical equations, a sigmoid function and a log loss function, which you might have studied if you’ve followed the prerequisites and reached this far.
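For reference, these are the standard forms of those equations for logistic regression, which we assume are the ones meant here. For one example x^(i) with label y^(i), and m examples in total:

```latex
\hat{y}^{(i)} = a^{(i)} = \sigma\left(w^{T} x^{(i)} + b\right),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}

\mathcal{L}\left(a^{(i)}, y^{(i)}\right)
  = -\,y^{(i)} \log a^{(i)} - \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right)

J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\left(a^{(i)}, y^{(i)}\right)
```

The first line is the prediction, the second is the log loss for a single example, and the third is the cost, which averages the loss over all examples.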

Now, the main steps for building a Neural Network are:

  1. Define the model structure (such as the number of input features).
  2. Initialize the model’s parameters.
  3. Loop:
     - Calculate the current loss (forward propagation)
     - Calculate the current gradient (backward propagation)
     - Update the parameters (gradient descent)

So now, we will define all the helper functions one by one and then use them in building the final model.

So this is a sigmoid function, which is used to map any real value into the range 0 to 1.
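A minimal NumPy sketch of such a sigmoid function could look like this:

```python
import numpy as np

def sigmoid(z):
    """Squash a scalar or NumPy array into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Large negative inputs land near 0, zero lands at 0.5, large positive near 1.
print(sigmoid(np.array([-10.0, 0.0, 10.0])))
```

Because `np.exp` works element-wise, the same function handles a single number or a whole array of activations.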

Here, we have implemented parameter initialization in the cell below.
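As a sketch, and assuming the parameters start at zero (a common choice for logistic regression, though not the only one), initialization could look like the following. The function name `initialize_with_zeros` is an assumption.

```python
import numpy as np

def initialize_with_zeros(dim):
    """Create a (dim, 1) weight vector of zeros and a bias of 0."""
    w = np.zeros((dim, 1))  # one weight per input feature (pixel value)
    b = 0.0                 # a single scalar bias
    return w, b

w, b = initialize_with_zeros(3)
print(w.shape, b)
```

For our 64x64x3 images, `dim` would be 64 * 64 * 3, one weight for every pixel value.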

Now it’s time for propagation:
Now that our parameters are initialized, we can do the forward and backward propagation steps, which compute the cost and its gradient.
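Here is a rough sketch of both passes for logistic regression. The function name `propagate`, the dictionary keys `"dw"` and `"db"`, and the tiny input matrix are assumptions made for illustration; X holds one example per column and Y is a row vector of labels.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """Return the gradients and the cost for the current w and b."""
    m = X.shape[1]  # number of examples (one per column)

    # Forward propagation: activations and log loss cost.
    A = sigmoid(np.dot(w.T, X) + b)                          # shape (1, m)
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))

    # Backward propagation: gradients of the cost w.r.t. w and b.
    dw = np.dot(X, (A - Y).T) / m                            # shape like w
    db = np.sum(A - Y) / m                                   # scalar

    return {"dw": dw, "db": db}, float(cost)

# Tiny made-up example: 2 features, 2 examples.
X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[1.0, 0.0]])
grads, cost = propagate(np.zeros((2, 1)), 0.0, X, Y)
print(cost, grads["dw"].ravel(), grads["db"])
```

With all-zero parameters every activation is 0.5, so the cost starts at log(2), roughly 0.693.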

Then we have to design a function named optimize, in which our parameters get updated on every iteration.

  1. We have initialized our parameters
  2. We are also able to compute a cost function and its gradient.
  3. Now, we have to update the parameters using gradient descent.

For a parameter w, the update rule is w = w - (learning_rate * dw), where dw is the gradient of the cost with respect to w. The same rule applies to b, using db.
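To see the update rule in action without the rest of the model, here is a toy sketch. The helper name `update_parameters` and the toy cost f(w) = (w - 3)^2, whose gradient is 2 * (w - 3), are made up for this example.

```python
def update_parameters(w, b, dw, db, learning_rate):
    """Apply one gradient descent step to w and b."""
    w = w - learning_rate * dw
    b = b - learning_rate * db
    return w, b

# Toy problem: minimise f(w) = (w - 3)^2. The bias is not used by this
# toy cost, so its gradient is simply 0.
w, b = 0.0, 0.0
for _ in range(100):
    dw = 2 * (w - 3)   # gradient of the toy cost with respect to w
    db = 0.0
    w, b = update_parameters(w, b, dw, db, learning_rate=0.1)

print(w)  # very close to 3, the minimum of the toy cost
```

Each step moves w a little way against the gradient, so it slides towards the value where the cost is lowest. The optimize function does the same thing, except that dw and db come from backward propagation.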

The previous function will output the learned w and b. We can then use w and b to predict the labels for a dataset x in two steps:

  1. Calculate yhat = A = sigmoid(w^T x + b).
  2. Convert each entry into 0 (if activation ≤ 0.5) or 1 (if activation > 0.5), and store all the predictions in a vector Y_prediction.
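The two steps above can be sketched as follows. The function name `predict` and the small hand-made w, b, and X are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, X):
    """Return a (1, m) vector of 0/1 predictions, one per column of X."""
    A = sigmoid(np.dot(w.T, X) + b)        # step 1: activations in (0, 1)
    Y_prediction = (A > 0.5).astype(float)  # step 2: threshold at 0.5
    return Y_prediction

# Hand-made parameters and three examples (one per column).
w = np.array([[1.0], [-1.0]])
b = 0.0
X = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0]])
print(predict(w, b, X))  # [[1. 0. 0.]]
```

The third example has an activation of exactly 0.5, so it is labelled 0, matching the "≤ 0.5" rule.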

Now we’ll merge all the functions into a single model. But before that let’s discuss what we’ve done so far:

We’ve implemented several functions that: initialize (w, b); optimize the loss iteratively to learn the parameters (w, b), by computing the cost and its gradient and updating the parameters using gradient descent; and use the learned (w, b) to predict the labels for a given set of examples.

Now we’ll see how the overall model is structured by putting all the building blocks together in the right order.
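Here is a compact, self-contained sketch of how the pieces might fit together. It is not the exact notebook code: the function name `model`, its default hyperparameters, the returned dictionary keys, and the synthetic two-feature dataset (used instead of the cat images) are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(X_train, Y_train, X_test, Y_test,
          num_iterations=1000, learning_rate=0.5):
    """Train logistic regression with gradient descent and report accuracy."""
    n_features, m = X_train.shape
    w, b = np.zeros((n_features, 1)), 0.0   # initialize the parameters
    costs = []

    for i in range(num_iterations):
        # Forward propagation: activations and cost.
        A = sigmoid(np.dot(w.T, X_train) + b)
        cost = -np.mean(Y_train * np.log(A) + (1 - Y_train) * np.log(1 - A))
        # Backward propagation: gradients.
        dw = np.dot(X_train, (A - Y_train).T) / m
        db = np.sum(A - Y_train) / m
        # Gradient descent update.
        w = w - learning_rate * dw
        b = b - learning_rate * db
        if i % 100 == 0:
            costs.append(float(cost))

    def predict(X):
        return (sigmoid(np.dot(w.T, X) + b) > 0.5).astype(float)

    # Accuracy in percent: 100 minus the percentage of wrong predictions.
    train_acc = 100 - np.mean(np.abs(predict(X_train) - Y_train)) * 100
    test_acc = 100 - np.mean(np.abs(predict(X_test) - Y_test)) * 100
    return {"w": w, "b": b, "costs": costs,
            "train_accuracy": train_acc, "test_accuracy": test_acc}

def make_data(rng, m):
    """Synthetic data: class 1 points are shifted by 2 in both features."""
    Y = (rng.random((1, m)) > 0.5).astype(float)
    X = rng.normal(size=(2, m)) + 2.0 * Y
    return X, Y

rng = np.random.default_rng(0)
X_train, Y_train = make_data(rng, 200)
X_test, Y_test = make_data(rng, 100)

result = model(X_train, Y_train, X_test, Y_test)
print("train accuracy:", result["train_accuracy"])
print("test accuracy:", result["test_accuracy"])
```

For the cat dataset, X_train and X_test would be the flattened, scaled image matrices and Y_train and Y_test the label row vectors.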

Now we’ll train our model:

We can see that we have achieved a training accuracy of 99 percent and a test accuracy of 70 percent. The training accuracy is close to 100 percent, which is a good sanity check: our model is working and has high enough capacity to fit the training data. The test accuracy of 70 percent is not bad for this entry-level, most basic model.

Let’s see if our model correctly classifies the image:

We can also test our model with our own image:

  1. Add your image to the Jupyter notebook’s directory, in the “images” folder.
  2. Change your image’s name in the following code.
  3. Run the code and check if the algorithm is right (1 = cat, 0 = non-cat).

If it is a cat, the output will say that it is a “cat”.

And now we are done.

Now, the thing to remember here is that this was just an entry-level model. Later we’ll learn about more concepts that will help us improve our test accuracy and build a model that generalizes better. But as a starting point, this is a good first binary classifier to build.

So, I would suggest studying the prerequisites properly and then trying to build this model by yourself after studying the concepts of perceptrons and logistic regression. Don’t rush into things; give yourself time to absorb all of this, and only then move ahead to the next concepts. If you follow this, your journey to learn deep learning will surely be a beautiful one.
