A to Z about Artificial Neural Networks (ANN) (Theory N Hands-on)

Shreyak

Published in

Analytics Vidhya

6 min read · May 13, 2020

In this blog, I will walk you through the core ideas behind Artificial Neural Networks.

We'll go deep into the theory and then solve a problem using an ANN model.

Artificial neural networks (ANN) are computing systems vaguely inspired by the biological neural networks that constitute animal brains. Such systems “learn” to perform tasks by considering examples, generally without being programmed with task-specific rules. For example, in image recognition, they might learn to identify images that contain cats by analyzing example images that have been manually labelled as “cat” or “no cat” and using the results to identify cats in other images.

Perceptron Model

The perceptron is an early form of neural network, introduced by Frank Rosenblatt in 1958. A perceptron is a single-layer neural network; stacking several layers gives a multi-layer perceptron, which is what is usually meant by a neural network.

A perceptron is a binary linear classifier. It is used in supervised learning to classify the given input data.

If the function f is just a sum of the weighted inputs, then y = x1w1 + x2w2 + x3w3.
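As a minimal sketch of that weighted sum, the snippet below computes y for three inputs. The input values, the weights and the threshold of 0 are assumptions picked only for illustration.

```python
import numpy as np

# Hypothetical inputs and weights, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0])    # inputs x1, x2, x3
w = np.array([0.5, -1.0, 2.0])   # weights w1, w2, w3

# Weighted sum: y = x1*w1 + x2*w2 + x3*w3 = 0.5 - 2.0 + 6.0
y = float(np.dot(x, w))
print(y)  # 4.5

# A binary perceptron then thresholds the sum (the threshold 0 is an assumption).
label = 1 if y > 0 else 0
print(label)  # 1
```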

But a single perceptron won’t be enough to learn complicated systems. Fortunately, we can expand the single perceptron to create a multi-layer perceptron model. We'll also introduce the idea of the activation function.

To build a network of perceptrons, we can connect layers of perceptrons, using a multi-layer perceptron model.

Input Layer- Receives the input data from the dataset.

Output Layer (it can have more than one unit)- Gives the output as a prediction from the model.

Hidden Layers- Any layers between the input and output layers.
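To make the three kinds of layers concrete, here is a rough sketch of one forward pass through a tiny network. The layer sizes (3, 4, 1), the random weights and the use of a ReLU in the hidden layer are all assumptions for illustration, not part of the project code.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_inputs, n_hidden, n_outputs = 3, 4, 1   # hypothetical layer sizes

# One weight matrix and one bias vector per connection between layers.
W1, b1 = rng.normal(size=(n_inputs, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_outputs)), np.zeros(n_outputs)

x = np.array([[0.2, 0.7, 0.1]])           # input layer: one sample, 3 features
hidden = np.maximum(0.0, x @ W1 + b1)     # hidden layer (ReLU assumed here)
output = hidden @ W2 + b2                 # output layer: the raw prediction

print(hidden.shape, output.shape)         # (1, 4) (1, 1)
```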

Activation Function

It is used to set boundaries for the overall output values of:

x*w+b

We can also write this as:

z=x*w+b

We then pass z through some activation function f(z) to limit its values. Keep in mind that you will often see a capital X used to denote a tensor input consisting of multiple values.

Sigmoid- The sigmoid function squashes any input into the range between 0 and 1, so it suits outputs that represent one of two classes (0 or 1).

Rectified Linear Unit (ReLU)- ReLU has been found to perform very well, especially in mitigating the vanishing gradient problem. It is often used as the default activation for hidden layers because of its good overall performance.
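A small sketch of both activation functions applied to z = x*w + b may help. The values of x, w and b below are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    """Squash any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Pass positive values through unchanged and clip negatives to 0."""
    return np.maximum(0.0, z)

# Hypothetical inputs, weight and bias for a single neuron.
x, w, b = np.array([-2.0, 0.0, 3.0]), 1.5, 0.5
z = x * w + b          # z = x*w + b gives -2.5, 0.5 and 5.0

print(sigmoid(z))      # every value lies strictly between 0 and 1
print(relu(z))         # the negative entry becomes 0, the others are unchanged
```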

Cost Functions

The cost function is often also referred to as a loss function. We can keep track of our loss/cost during training to monitor network performance. A cost function is a measure of “how well” a neural network did with respect to its given training samples and the expected output. It may also depend on variables such as weights and biases. A cost function returns a single value, not a vector, because it rates how well the neural network did as a whole.
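As one concrete example, here is a sketch of binary cross-entropy, the loss used for the classifier later in this post. The label and prediction arrays are invented, and the clipping constant eps is an assumption added to avoid taking log(0).

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Average binary cross-entropy over all samples (one scalar)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # keep log() away from 0
    per_sample = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return float(-np.mean(per_sample))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.8, 0.2])   # close to the true labels
poor = np.array([0.4, 0.6, 0.3, 0.7])        # on the wrong side of 0.5

loss_confident = binary_cross_entropy(y_true, confident)
loss_poor = binary_cross_entropy(y_true, poor)
print(loss_confident, loss_poor)             # the first value is the smaller one
```

Note that the whole batch is reduced to a single number, which is what lets us track it as one curve during training.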

Backpropagation

Back-propagation is the essence of neural net training. It is the method of fine-tuning the weights of a neural net based on the error rate obtained in the previous epoch (i.e., iteration). Proper tuning of the weights allows you to reduce error rates and to make the model reliable by increasing its generalization.

Backpropagation is short for “backward propagation of errors.” It is a standard method of training artificial neural networks. This method calculates the gradient of a loss function with respect to all the weights in the network.
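The simplest case I can sketch is a "network" with a single sigmoid neuron trained with binary cross-entropy. The toy data, the learning rate of 0.5 and the 50 epochs are assumptions; in a deeper network the same chain rule is applied layer by layer, from the output back to the input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_fn(w, b, x, y):
    """Binary cross-entropy of a one-neuron model a = sigmoid(x*w + b)."""
    a = sigmoid(x * w + b)
    return float(-np.mean(y * np.log(a) + (1 - y) * np.log(1 - a)))

def gradients(w, b, x, y):
    """Chain rule: dL/dz = a - y, so dL/dw = mean((a - y) * x), dL/db = mean(a - y)."""
    a = sigmoid(x * w + b)
    dz = a - y
    return float(np.mean(dz * x)), float(np.mean(dz))

# Toy data (hypothetical): negative inputs are class 0, positive inputs class 1.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

w, b, learning_rate = 0.0, 0.0, 0.5
initial_loss = loss_fn(w, b, x, y)
for epoch in range(50):
    dw, db = gradients(w, b, x, y)
    w -= learning_rate * dw   # step against the gradient
    b -= learning_rate * db
final_loss = loss_fn(w, b, x, y)

print(initial_loss, final_loss)  # the loss after training is lower
```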

Why Do We Need Backpropagation?

The most prominent advantages of backpropagation are:

Backpropagation is fast, simple and easy to program
It has no parameters to tune apart from the number of inputs
It is a flexible method, as it does not require prior knowledge about the network
It is a standard method that generally works well
It does not need any special mention of the features of the function to be learned

Dropout Layer

Dropout refers to ignoring a randomly chosen set of units (i.e. neurons) during the training phase. By “ignoring”, I mean these units are not considered during a particular forward or backward pass.

More technically, at each training stage, individual nodes are either dropped out of the net with probability 1-p or kept with probability p, so that a reduced network is left; incoming and outgoing edges to a dropped-out node are also removed.
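The masking step can be sketched in a few lines of NumPy. The function name and the rescaling by 1/p ("inverted dropout") are assumptions of this sketch, added so that the expected activation stays the same during training.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def dropout(activations, keep_prob, rng):
    """Keep each unit with probability keep_prob and zero it otherwise.

    Kept values are divided by keep_prob (inverted dropout, an assumption
    of this sketch) so the expected activation is unchanged.
    """
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.ones(10)                       # pretend these are layer activations
dropped = dropout(a, keep_prob=0.5, rng=rng)
print(dropped)                        # each entry is either 0.0 or 2.0
```

One thing to watch: the Keras Dropout layer takes the fraction of units to drop, which is 1-p in the notation above, not the keep probability p.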

Why do we need Dropout?

The answer to this question in one sentence is “TO PREVENT OVERFITTING” on the training dataset.

CODE

Now let's move on to a project in which I used an ANN, to give you a better feel for how an Artificial Neural Network (ANN) works.

Project name — Breast Cancer Classifier Using Artificial Neural Networks.

Dataset Used- https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)

GitHub repository for this project— Click Here

STEP 1. Import the necessary Libraries

#Import necessary libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation,Dropout
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report,confusion_matrix

STEP 2. Load your dataset- Here I have CSV dataset. I used pandas to load that in my Colab.

df = pd.read_csv('cancer_classification.csv')
df

STEP 3. Check info of the dataset.

df.info()

STEP 4. Check correlation

df.corr()['benign_0__mal_1'].sort_values(ascending = True)

STEP 5. Divide dataset into X and y

X = df.drop('benign_0__mal_1',axis=1).values
y = df['benign_0__mal_1'].values

Here X has all features except "benign_0__mal_1".

y has "benign_0__mal_1".

STEP 6. Split dataset into training and testing

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.25,random_state=101)

STEP 7. Apply some preprocessing in order to make the model more accurate.

from sklearn.preprocessing import MinMaxScaler
scaler=MinMaxScaler()
scaler.fit(X_train)

STEP 8. Transform the training and test features (X_train and X_test) with the fitted scaler.

X_train=scaler.transform(X_train)
X_test=scaler.transform(X_test)
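To see why the scaler is fitted on the training set only, here is a hand-rolled sketch of min-max scaling with its default 0 to 1 range, the same formula MinMaxScaler applies. The tiny arrays and the names train_demo and test_demo are hypothetical.

```python
import numpy as np

# Hypothetical single-feature data, unrelated to the cancer dataset.
train_demo = np.array([[10.0], [20.0], [30.0]])
test_demo = np.array([[15.0], [40.0]])

# "Fit": learn the per-column minimum and maximum from the training set only.
col_min = train_demo.min(axis=0)
col_max = train_demo.max(axis=0)

def min_max_transform(X):
    """Rescale with the training-set statistics: (X - min) / (max - min)."""
    return (X - col_min) / (col_max - col_min)

train_scaled = min_max_transform(train_demo)   # 0.0, 0.5, 1.0
test_scaled = min_max_transform(test_demo)     # 0.25, 1.5

print(train_scaled.ravel(), test_scaled.ravel())
```

The test value 40 maps to 1.5, outside the 0 to 1 range. That is expected: the test set is scaled with what was learned from the training set, so no information from the test data leaks into preprocessing.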

STEP 9. Get the shape of the dataset; it will help us choose the number of units for the dense layers.

X_train.shape

Here the shape of the dataset is (426, 30)

STEP 10. Let's build our ANN model. Here we use 3 Dense layers of 30, 15 and 1 units respectively. The first layer has 30 units because X_train has 30 features, and the last layer has 1 unit because this is a binary classifier with a single output.

model = Sequential()
model.add(Dense(units=30,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=15,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=1,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

Summary of the model.

STEP 11. Apply early stop for the epochs.

early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=25)
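The idea behind patience can be sketched in plain Python. This is a simplified illustration of the rule, not Keras's actual implementation, and the function name and loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` epochs in a row.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss      # new best value: reset the counter
            wait = 0
        else:
            wait += 1        # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses, one per epoch.
losses = [0.60, 0.50, 0.45, 0.46, 0.47, 0.44, 0.45, 0.46, 0.47]
print(early_stopping_epoch(losses, patience=3))  # 9
print(early_stopping_epoch(losses, patience=2))  # 5
```

A larger patience tolerates temporary bumps in the validation loss, which is why a fairly generous value such as 25 is used with a high epoch limit.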

STEP 12. Fit the dataset to the ANN model.

model.fit(x=X_train,y=y_train,epochs=600,validation_data=(X_test, y_test), verbose=1,callbacks=[early_stop])

STEP 13. Get out the predictions from the model.

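# predict_classes was removed in TensorFlow 2.6; threshold the sigmoid output instead
predictions = (model.predict(X_test) > 0.5).astype("int32").ravel()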

STEP 14. Get the accuracy of the model.

import sklearn.metrics as metrics
print("Accuracy: ({0:.4f})".format(metrics.accuracy_score(y_test,predictions)))

Accuracy: (0.9790)
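Accuracy alone hides which kind of mistakes the model makes. STEP 1 already imports classification_report and confusion_matrix, so here is a sketch of how they could be used. The arrays below are made-up stand-ins for y_test and predictions, not results from the model above.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical labels and predictions, standing in for y_test and predictions.
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 0, 1, 1, 1, 0, 1, 0])

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)
print(cm)

# Precision, recall and F1-score for each class.
print(classification_report(y_true, y_pred))
```

For a cancer classifier the false negatives (a malignant case predicted as benign) usually matter most, and they appear in the bottom-left cell of the matrix.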

So our accuracy is 97.90%, which is very good. You can apply other neural network architectures to the same dataset and share your accuracy and experience in the comments below.

In this simple project, I wanted to give you some hands-on experience with an Artificial Neural Network and show you how it works.

I hope you liked this blog. Follow me on Medium, where I post blogs related to Data Science, Machine Learning, Python and Deep Learning.

Feel free to share your thoughts in the comment section. You can also connect with me on:
Linkedin — https://www.linkedin.com/in/shreyak007/
Github — https://github.com/Shreyakkk
Twitter — https://twitter.com/Shreyakkkk
Instagram — https://www.instagram.com/shreyakkk/
Snapchat — shreyak001
Facebook — https://www.facebook.com/007shreyak

Thank You!