Deep Learning — Artificial Neural Network (ANN)

Arun Purakkatt
Jul 29, 2020

Building your first neural network in less than 30 lines of code.

1. What is Deep Learning?

Deep learning is the branch of AI that learns features directly from the data without any human intervention, and the data can be unstructured and unlabeled.

1.1 Why deep learning?

Traditional ML techniques become insufficient as the amount of data increases. Until the last decade, the success of a model relied heavily on hand-crafted feature engineering; those models fall under the category of machine learning. Deep learning models, in contrast, find these features automatically from the raw data.

1.2 Machine learning vs Deep learning

ML vs DL (Source: https://www.kaggle.com/kanncaa1/deep-learning-tutorial-for-beginners)

2. What is an Artificial Neural Network?

2.1 Structure of a neural network:

In a neural network, as the structure suggests, there is at least one hidden layer between the input and output layers. The hidden layers do not see the inputs directly. The word “deep” is a relative term that refers to how many hidden layers a neural network has.

When counting the layers of a network, the input layer is ignored. For example, the picture below shows a 3-layer neural network, because, as mentioned, the input layer is not counted.

Layers in an ANN:

1 Dense or fully connected layers

2 Convolution layers

3 Pooling layers

4 Recurrent layers

5 Normalization layers

6 Many others

Different layers perform different types of transformations on the input. A convolution layer is mainly used to perform convolution operations when working with image data. A recurrent layer is used when working with time-series data. A dense layer is a fully connected layer. In a nutshell, each layer has its own characteristics and is used to perform a specific task.
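As an illustration (not part of the original tutorial's code), here is roughly how a few of these layer types are created with the Keras layers API; the layer sizes below are arbitrary example values.

```python
from tensorflow.keras import layers

# Illustrative only -- the layer sizes are arbitrary example values.
dense = layers.Dense(64, activation='relu')    # dense / fully connected layer
conv = layers.Conv2D(32, kernel_size=(3, 3))   # convolution layer, typically for image data
pool = layers.MaxPooling2D(pool_size=(2, 2))   # pooling layer
recurrent = layers.LSTM(32)                    # recurrent layer, for sequence/time-series data
norm = layers.BatchNormalization()             # normalization layer
```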

Structure of a neural network (Source: https://www.gabormelli.com/RKB/Neural_Network_Hidden_Layer)

2.2 Structure of a 2-layer neural network:

structure of a 2 layer neural network(Source: https://ibb.co/rQmCkqG)

Input layer: Each node in the input layer represents an individual feature from each sample in our data set that will be passed to the model.

Hidden layer: Consider the connections between the input layer and the hidden layer; each connection transfers the output of the previous unit as input to the receiving unit. Each connection has its own assigned weight. Each input is multiplied by its weight, and the output is an activation function applied to this weighted sum of inputs.

To recap: weights are assigned to each connection, and we compute the weighted sum of all inputs pointing to the same neuron (node) in the next layer. That sum is passed through an activation function that transforms it into a bounded number, for example between 0 and 1. This result is then passed on to the neurons in the next layer. The process repeats layer by layer until it reaches the output layer.

Let's consider part 1, the connections between the input layer and the hidden layer, as in the figure above. Here the activation function we are using is the tanh function.

Z1 = W1 X + b1

A1 = tanh(Z1)

Let's consider part 2, the connections between the hidden layer and the output layer, as in the figure above. Here the activation function we are using is the sigmoid function.

Z2 = W2 A1 + b2

A2 = σ(Z2)
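To make the two parts concrete, here is a minimal NumPy sketch of this forward pass for a single sample; the layer sizes (3 inputs, 4 hidden units, 1 output) and the random weights are arbitrary choices for illustration, not values from the figure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Arbitrary sizes for illustration: 3 input features, 4 hidden units, 1 output unit.
X = rng.normal(size=(3, 1))     # one input sample (column vector)
W1 = rng.normal(size=(4, 3))    # weights between input layer and hidden layer
b1 = np.zeros((4, 1))
W2 = rng.normal(size=(1, 4))    # weights between hidden layer and output layer
b2 = np.zeros((1, 1))

# Part 1: input layer -> hidden layer, tanh activation
Z1 = W1 @ X + b1
A1 = np.tanh(Z1)

# Part 2: hidden layer -> output layer, sigmoid activation
Z2 = W2 @ A1 + b2
A2 = sigmoid(Z2)                # output squashed into the range (0, 1)
```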

During training the weights change continuously as the model learns from the data, moving toward optimized weights for each connection.

Output layer: For a binary classification problem such as classifying cats versus dogs, the output layer has 2 neurons. In general, the output layer has one neuron for each possible outcome or category of outcomes.

Note that the number of neurons in the hidden layer is a hyperparameter, just like the learning rate.

3. Building your first neural network with Keras in less than 30 lines of code

3.1 What is Keras?

There are many deep learning frameworks. Keras is a high-level API written in Python which runs on top of popular frameworks such as TensorFlow and Theano, giving the machine learning practitioner a layer of abstraction that reduces the inherent complexity of writing neural networks.

3.2 Time to work on GPU:

In this tutorial we will be using Keras with a TensorFlow backend. We will use pip commands to install both in an Anaconda environment:

· pip3 install Keras

· pip3 install Tensorflow

Make sure that you enable the GPU runtime if you are using Google Colab.

Google Colab GPU activation

We are using the MNIST data set in this tutorial. The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

We start by importing the necessary modules.
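The original post shows the code as screenshots, so the snippets that follow are sketches that track the text, written against the tensorflow.keras API (the original may have used standalone Keras imports instead):

```python
# Imports assumed for the rest of the tutorial.
from tensorflow.keras import models, layers
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
```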

Next we load the data set as training and test splits.
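A sketch of loading MNIST with the Keras datasets helper:

```python
# Load MNIST as (training images, training labels) and (test images, test labels).
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

print(train_images.shape)  # (60000, 28, 28)
print(test_images.shape)   # (10000, 28, 28)
```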

Now, with our training and test data, we are ready to build our neural network.

In this example we will be using dense layers. A dense layer is nothing but a fully connected layer, which means each neuron receives input from all the neurons in the previous layer. The shape of our input is [60000, 28, 28], which is 60,000 images with a pixel height and width of 28 x 28.

The values 784 and 10 refer to the dimension of each layer's output space, which becomes the number of inputs to the subsequent layer. We are solving a classification problem with 10 possible categories (the digits 0 to 9), hence the final layer has 10 output units.

Activation functions come in different types; relu is the most widely used. In the output layer we are using softmax here.
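A sketch of a model consistent with this description (a 784-unit relu hidden layer and a 10-unit softmax output); the exact layer sizes in the original screenshot may differ:

```python
# Sequential model with two dense layers, as described above.
network = models.Sequential()
network.add(layers.Dense(784, activation='relu', input_shape=(28 * 28,)))  # hidden layer
network.add(layers.Dense(10, activation='softmax'))                        # one unit per digit class (0-9)
```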

With our neural network defined, we compile it with the adam optimizer, the categorical_crossentropy loss function, and accuracy as the metric. These can be changed based upon the need.
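A sketch of the compile step with those settings:

```python
# Compile with the optimizer, loss function, and metric mentioned above.
network.compile(optimizer='adam',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
```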

AIWA!!! You have just built your first neural network.

You may have questions about the terms we used during model building, like relu, softmax, and adam. These require in-depth explanations; I would suggest you read the book Deep Learning with Python by François Chollet, which inspired this tutorial.

We reshape our data set, keeping the split of 60,000 training images and 10,000 test images.
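A sketch of the reshaping step; scaling the pixel values to the 0 to 1 range is a common additional step and an assumption here:

```python
# Flatten each 28x28 image into a 784-dimensional vector and scale pixels to [0, 1].
train_images = train_images.reshape((60000, 28 * 28)).astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28)).astype('float32') / 255
```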

We use categorical (one-hot) encoding so that each label becomes a vector of 10 values that the network can work with numerically.
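A sketch of the encoding step using Keras's to_categorical helper:

```python
# One-hot encode the labels: the digit 3 becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
```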

Our data set is split into train and test, our model is compiled, and the data is reshaped and encoded. The next step is to train our neural network (NN).

Here we pass the training images and training labels as well as the number of epochs. One epoch is when the entire data set is passed forward and backward through the neural network exactly once. The batch size is the number of samples that propagate through the network at a time.
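A sketch of the training call; 5 epochs and a batch size of 128 are typical values and an assumption here, the original screenshot may use different ones:

```python
# Train the network on the training images and labels.
network.fit(train_images, train_labels, epochs=5, batch_size=128)
```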

Finally, we measure the performance of our model to see how well it performed. You will get a test accuracy of around 98%, which means the model predicted the correct digit about 98 percent of the time on the test set.
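A sketch of the evaluation step:

```python
# Evaluate on the held-out test set.
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test accuracy:', test_acc)  # typically around 0.98
```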

This is what a first look at a neural network is like. It is not the end, just a beginning, before we dive deeper into the different aspects of neural networks. You have just taken the first step of a long and exciting journey.

Stay focused , keep learning , stay curious.

“Don’t take rest after your first victory because if you fail in second, more lips are waiting to say that your first victory was just luck.” — Dr APJ Abdul Kalam

Reference: Deep Learning with Python, François Chollet, ISBN 9781617294433

Stay connected — https://www.linkedin.com/in/arun-purakkatt-mba-m-tech-31429367/
