Deep Learning — Artificial Neural Network(ANN)

Arun Purakkatt
Published in Analytics Vidhya
7 min read, Jul 29, 2020

Building your first neural network in less than 30 lines of code.

1. What is Deep Learning?

Deep learning is a branch of AI in which models learn features directly from the data without manual feature engineering; the data can be unstructured and unlabeled.

1.1 Why deep learning?

Classical ML techniques became insufficient as the amount of data increased. Until the last decade, the success of a model relied heavily on hand-crafted feature engineering, and such models fell under the category of machine learning. Deep learning models, by contrast, find these features automatically from the raw data.

1.2 Machine learning vs Deep learning

ML vs DL (Source: https://www.kaggle.com/kanncaa1/deep-learning-tutorial-for-beginners)

2. What is an Artificial Neural Network?

2.1 Structure of a neural network:

As the structure shows, a neural network has at least one hidden layer between the input and output layers. These layers are called hidden because their values are not directly observed in the training data. The word “deep” is a relative term that refers to how many hidden layers a neural network has.

When counting layers, the input layer is ignored. For example, the network in the picture below has 3 layers, because the input layer is not counted.

Layers in an ANN:

1 Dense or fully connected layers

2 Convolution layers

3 Pooling layers

4 Recurrent layers

5 Normalization layers

6 Many others

Different layers perform different types of transformations on their input. A convolution layer is mainly used to perform the convolution operation when working with image data. A recurrent layer is used when working with sequential data such as time series. A dense layer is a fully connected layer. In a nutshell, each layer has its own characteristics and is used for a specific task.

Structure of a neural network (Source: https://www.gabormelli.com/RKB/Neural_Network_Hidden_Layer)

2.2 Structure of a 2 layer neural network:

Structure of a 2 layer neural network (Source: https://ibb.co/rQmCkqG)

Input layer: Each node in the input layer represents one feature of a sample in our data set that is passed to the model.

Hidden layer: Each connection between the input layer and the hidden layer transfers the output of the previous unit as input to the receiving unit. Each connection has its own weight. Each input is multiplied by its weight, and the unit's output is an activation function applied to the weighted sum of its inputs.

To recap, each connection has a weight, and for each neuron (node) in the next layer we compute the weighted sum of the connections pointing to it. That sum is passed to an activation function, which transforms it into a bounded number, for example between 0 and 1 for the sigmoid function. The result is passed on to the neurons in the next layer. This process repeats until the output layer is reached.

Let's consider part 1, the connections between the input layer and the hidden layer in the figure above. Here the activation function is the tanh function.

Z1 = W1 X + b1

A1 = tanh(Z1)

Let's consider part 2, the connections between the hidden layer and the output layer in the figure above. Here the activation function is the sigmoid function.

Z2 = W2 A1 + b2

A2 = σ(Z2)
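
The two steps above can be sketched in a few lines of NumPy. This is only an illustration of the forward pass: the layer sizes, the random weights, and the helper names `sigmoid` and `forward` are assumptions made for the example, not values from the figure.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, squashing values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Forward pass of a 2 layer network: tanh hidden layer, sigmoid output."""
    Z1 = W1 @ X + b1      # weighted sum into the hidden layer
    A1 = np.tanh(Z1)      # hidden activations, each in (-1, 1)
    Z2 = W2 @ A1 + b2     # weighted sum into the output layer
    A2 = sigmoid(Z2)      # output activations, each in (0, 1)
    return A1, A2

# Hypothetical sizes: 3 input features, 4 hidden units, 1 output unit, 5 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))                          # one column per sample
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))   # input -> hidden
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))   # hidden -> output

A1, A2 = forward(X, W1, b1, W2, b2)
print(A1.shape, A2.shape)
```

Note that W1 has one row per hidden unit and W2 has one row per output unit, which is why the second step needs its own weight matrix.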

During training the weights are updated continually, moving toward optimized values for each connection as the model learns from the data.

Output layer: The output layer has one neuron for each possible outcome or category. For example, in a binary classification problem that classifies cats and dogs, the output layer can have 2 neurons, one per class (a single sigmoid neuron, as in the equations above, is also common).

Please note that the number of neurons in the hidden layer is a hyperparameter, like the learning rate.

3. Building your first neural network with keras in less than 30 lines of code

3.1 What is Keras?

There are many deep learning frameworks. Keras is a high-level API written in Python that runs on top of popular frameworks such as TensorFlow and Theano, providing the machine learning practitioner with a layer of abstraction that reduces the inherent complexity of writing NNs.

3.2 Time to work on GPU:

In this tutorial we will use Keras with the TensorFlow backend. We will use pip commands to install both in an Anaconda environment.

· pip3 install keras

· pip3 install tensorflow

Make sure that you enable the GPU if you are using Google Colab.

Google Colab GPU activation

We are using the MNIST data set in this tutorial. The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

First we import the necessary modules.

Then we load the data set as training and test sets.
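
A minimal sketch of the import and loading steps, assuming the `tensorflow.keras` namespace (the exact import lines used originally may differ). `keras.datasets.mnist.load_data()` downloads the data on first use, so it needs network access once.

```python
from tensorflow import keras

# Downloads MNIST on first use and caches it locally.
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()

# Images are 28 x 28 grayscale arrays; labels are the digits 0 to 9.
print(train_images.shape, train_labels.shape)
print(test_images.shape, test_labels.shape)
print(train_images.dtype, train_labels[:5])
```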

Now with our training & test data we are ready to build our Neural network.

In this example we will use dense layers. A dense layer is a fully connected layer, which means each neuron receives input from all the neurons in the previous layer. The shape of our input is [60000,28,28], that is, 60000 images with a pixel height and width of 28 x 28.

The arguments 784 and 10 are the number of units in each dense layer, that is, the dimension of that layer's output space, which becomes the number of inputs to the subsequent layer. We are solving a classification problem with 10 possible categories (digits from 0 to 9), so the final layer has 10 output units.

Activation functions come in different types; relu is the most widely used for hidden layers. In the output layer we use softmax, which turns the 10 outputs into probabilities that sum to 1.

Now that our neural network is defined, we compile it with adam as the optimizer, categorical_crossentropy as the loss function, and accuracy as the metric. These can be changed as needed.
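
The model definition and compile step described above might look like the following sketch. The layer sizes 784 and 10 and the adam, categorical_crossentropy, and accuracy settings come from the text; the variable name `network` and the use of `keras.Input` to declare the flattened input shape are assumptions.

```python
from tensorflow import keras

network = keras.Sequential([
    keras.Input(shape=(28 * 28,)),                  # one flattened 28 x 28 image
    keras.layers.Dense(784, activation="relu"),     # hidden layer
    keras.layers.Dense(10, activation="softmax"),   # one probability per digit 0-9
])

network.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# Prints each layer with its output shape and number of trainable weights.
network.summary()
```

The hidden layer has 784 x 784 weights plus 784 biases, and the output layer has 784 x 10 weights plus 10 biases, which is where the parameter counts in the summary come from.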

AIWA !!! You have just built your first neural network.

You may have questions about the terms we used while building the model, such as relu, softmax, and adam. These require in-depth explanations, and I would suggest reading the book Deep Learning with Python by Francois Chollet, which inspired this tutorial.

Next we reshape our data set, which is split into a training set of 60000 images and a test set of 10000 images, so that each 28 x 28 image becomes a flat vector of 784 values.
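
The reshape can be sketched with NumPy alone. To keep the example self-contained, it uses a small random stand-in array with the same per-image shape as MNIST; with the real data you would apply the same calls to the training and test images. Scaling the pixels to [0, 1] is an assumed but common preprocessing step that the text does not spell out.

```python
import numpy as np

# Stand-in for MNIST images: 8 random 28 x 28 arrays with pixel values 0-255.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)

# Flatten each 28 x 28 image into one row of 784 values for the dense layer.
flat = images.reshape((images.shape[0], 28 * 28))

# Scale pixels from [0, 255] to [0, 1] (assumed preprocessing step).
flat = flat.astype("float32") / 255.0

print(images.shape, "->", flat.shape)
```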

We will use categorical (one-hot) encoding for the labels, so that each digit label becomes a vector of 10 values that matches the 10 outputs of the network and the categorical cross-entropy loss.

Our data set is split into train and test, our model is compiled, and the data is reshaped and encoded. The next step is to train our neural network (NN).
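
As a small illustration of the encoding, here is a sketch using `keras.utils.to_categorical` on a handful of example labels (the label values are made up for the example).

```python
import numpy as np
from tensorflow import keras

labels = np.array([5, 0, 4, 1, 9])   # example digit labels

# Each label becomes a length-10 vector with a 1 at the digit's index.
one_hot = keras.utils.to_categorical(labels, num_classes=10)

print(one_hot.shape)
print(one_hot[0])
```

Taking the index of the largest value in each row recovers the original digit, which is also how a prediction is read off the softmax output.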

Here we pass the training images and training labels, along with the number of epochs and the batch size. One epoch is one full pass of the entire training set forward and backward through the neural network. Batch size is the number of samples propagated through the network before each weight update.
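
To make the relationship between epochs and batch size concrete, the short calculation below counts weight updates. The values `batch_size = 128` and `epochs = 5` are assumptions chosen for illustration, not settings stated in this article.

```python
import math

num_samples = 60000   # MNIST training images
batch_size = 128      # assumed value for illustration
epochs = 5            # assumed value for illustration

# One epoch covers every sample once; the last batch may be smaller.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)   # 469 2345
```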

Finally we measure the performance of our model on the test set. You should get a test accuracy of around 98%, which means the model predicted the correct digit for about 98 percent of the test images.

This is a first look at a neural network. It is not the end, just a beginning before we take a deep dive into different aspects of neural networks. You have just taken the first step on a long and exciting journey.

Stay focused, keep learning, stay curious.
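
Putting the steps together, a complete run might look like the sketch below. It is a reconstruction under assumptions rather than the exact original code: the `tensorflow.keras` imports, the pixel scaling, `epochs=5`, and `batch_size=128` are choices made for this example, and the accuracy you see will vary from run to run.

```python
from tensorflow import keras

(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()

# Flatten to 784 features per image and scale pixels to [0, 1] (scaling is an assumption).
train_x = train_images.reshape((60000, 28 * 28)).astype("float32") / 255.0
test_x = test_images.reshape((10000, 28 * 28)).astype("float32") / 255.0

# One-hot encode the digit labels to match the 10 softmax outputs.
train_y = keras.utils.to_categorical(train_labels, num_classes=10)
test_y = keras.utils.to_categorical(test_labels, num_classes=10)

network = keras.Sequential([
    keras.Input(shape=(28 * 28,)),
    keras.layers.Dense(784, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
network.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# epochs and batch_size are assumed values; tune them as needed.
network.fit(train_x, train_y, epochs=5, batch_size=128)

# evaluate returns the loss followed by each compiled metric.
test_loss, test_acc = network.evaluate(test_x, test_y)
print("test accuracy:", test_acc)
```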

“Don’t take rest after your first victory because if you fail in second, more lips are waiting to say that your first victory was just luck.” — Dr APJ Abdul Kalam

Reference: Deep Learning with Python, François Chollet, ISBN 9781617294433

Stay connected — https://www.linkedin.com/in/arun-purakkatt-mba-m-tech-31429367/
