Introduction To Neural Networks

Srajan Gupta
5 min read · Jan 2, 2019


Hi, everybody. This post is about one of the most interesting topics in machine learning: Neural Networks. In this post, we will get an overview of what a Neural Network is and how it is so useful in solving complex problems.

An artificial neuron resembles a biological neuron. A biological neuron has a cell body, dendrites and synaptic terminals (which are connected to the dendrites of other neurons). It receives signals through its dendrites in the form of electric pulses and gives an output which is also an electric pulse. In a similar fashion, an artificial neuron receives signals or inputs from other neurons, performs a computation and gives the corresponding output.

While studying Neural Networks we often hear the term Deep Learning. This is nothing but the study of networks of artificial neurons connected in a particular manner, typically in many stacked layers.

Let’s first start with the basics.

Neuron

A Neuron is the basic unit of a Neural Network. It can take inputs from other neurons and give the corresponding output. Its only limitation is that the inputs and output can only be a binary number, i.e. 0 or 1 (this is the original McCulloch-Pitts style of artificial neuron).
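To make this concrete, here is a tiny sketch (my own toy example, not part of any standard library) of such a binary neuron, which fires only when enough of its inputs are active:

# A minimal sketch of a binary neuron: inputs and output are 0 or 1.
# It fires (outputs 1) when the number of active inputs reaches a threshold.
def binary_neuron(inputs, threshold):
    return 1 if sum(inputs) >= threshold else 0

# With a threshold of 1, this neuron behaves like a logical OR gate.
print(binary_neuron([0, 0], threshold=1))  # 0
print(binary_neuron([1, 0], threshold=1))  # 1
print(binary_neuron([1, 1], threshold=1))  # 1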

Perceptron

A Perceptron, on the other hand, is a single layer of LTUs (discussed below). An LTU is similar to an artificial neuron, with the difference that its inputs and output need not be binary; they can be any number. Each input connection is associated with a weight. As shown in the figure below, the LTU applies a function f(x) to the weighted combination of its inputs.

A Linear Threshold Unit (LTU), as shown above, computes the linear combination of its inputs and weights:

Z = x1*w1 + x2*w2

After this, it applies the function f(x) to Z; the result is the output of the LTU. But for an LTU to give a useful output, it needs to know the values of the weights w1 and w2. Now here comes the training part: the LTU is trained first to obtain the values of w1 and w2.
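As a quick sketch, here is what a single LTU looks like in Python, with a step function as f(x) (the weights here are just assumed example values, not trained ones):

import numpy as np

# A sketch of a single LTU: a weighted sum followed by a step function.
x = np.array([2.0, 3.0])       # inputs x1, x2
w = np.array([0.5, -0.2])      # weights w1, w2 (assumed for illustration)

Z = np.dot(w, x)               # Z = x1*w1 + x2*w2
output = 1 if Z >= 0 else 0    # f(x): a step function (hard threshold)
print(Z, output)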

A Perceptron is composed of a single layer of LTUs, each of which is connected to every LTU of the previous layer, or in other words to every output of the previous Perceptron.

The combination of neurons shown in the figure above receives two inputs and gives one output after the whole computation process.

Perceptrons do not output a class probability; rather, they make predictions based on a hard threshold.

Training a Perceptron

A Perceptron is fed one training instance at a time, and for every output neuron that produced a wrong prediction, it reinforces the connection weights from the inputs that would have contributed to the correct prediction.

W(i, j) := W(i, j) + n*(y - Y)*x(i)

W(i, j) — Connection weight between ith input neuron and jth output neuron.

n — Learning rate

Y — Output of the jth output neuron for the current training instance

y — Target output of the jth output neuron for the current training instance

x(i) — ith input value of the current training instance
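To see the rule in action, here is a minimal from-scratch sketch of this training loop for a single LTU (the learning rate and toy data are assumed values for illustration, and the variable names match the symbols above; the scikit-learn version in the next section does all of this for you):

import numpy as np

# A sketch of the perceptron update rule for one LTU (one output neuron).
n = 0.1                               # learning rate (assumed value)
W = np.zeros(2)                       # one weight per input
X = np.array([[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])  # toy instances
targets = np.array([0, 1, 1])         # toy target outputs

for epoch in range(10):
    for x, y in zip(X, targets):      # one training instance at a time
        Y = 1 if np.dot(W, x) >= 0 else 0   # current prediction
        W = W + n * (y - Y) * x       # W(i) := W(i) + n*(y - Y)*x(i)

print(W)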

Implementing a Perceptron in Python

>>> import pandas as pd
>>> import numpy as np
>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import Perceptron
>>> iris = load_iris()
>>> X = pd.DataFrame(data = iris.data, columns = iris.feature_names)
>>> y = iris.target
>>> per_clf = Perceptron(random_state = 42)
>>> per_clf.fit(X,y)
Perceptron(alpha=0.0001, class_weight=None, early_stopping=False, eta0=1.0,
           fit_intercept=True, max_iter=None, n_iter=None, n_iter_no_change=5,
           n_jobs=None, penalty=None, random_state=42, shuffle=True, tol=None,
           validation_fraction=0.1, verbose=0, warm_start=False)
>>> per_clf.predict([[1, 2, 3, 4]])
array([2])

Multi-Layer Perceptron

A Neural Network model which has more than one layer of perceptrons is known as a Multi-Layer Perceptron (MLP). It comprises an input layer, one or more layers of LTUs and one output layer. The layers other than the input and output layers are known as hidden layers. When there are two or more hidden layers, the Neural Network is known as a Deep Neural Network. A 2-Layer Perceptron is given in the figure below -

Each layer of the Neural Network except the output layer has a neuron which always gives 1 as its output. This neuron is known as the Bias Neuron.
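As a quick sketch, scikit-learn's MLPClassifier lets us build and train an MLP in a few lines (the hidden-layer sizes here are assumed values chosen arbitrarily, continuing the iris example from above):

>>> from sklearn.datasets import load_iris
>>> from sklearn.neural_network import MLPClassifier
>>> iris = load_iris()
>>> # An MLP with two hidden layers of 10 LTUs each.
>>> mlp_clf = MLPClassifier(hidden_layer_sizes=(10, 10), max_iter=1000, random_state=42)
>>> mlp_clf.fit(iris.data, iris.target)
>>> mlp_clf.predict([[1, 2, 3, 4]])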

Let’s now study how to train an MLP.

Backpropagation Algorithm

When a training instance is fed into the Neural Network, computations are performed and the output is calculated for every layer of the network. Then the output error of the network (the difference between the target output and the predicted output) is calculated, and the algorithm computes how much each neuron in the last hidden layer contributed to that error. It then proceeds to measure how much of these error contributions came from the previous hidden layer, and so on until the algorithm reaches the input layer. Finally, using the error gradients measured along the way, it performs a Gradient Descent step that adjusts every connection weight in the network to reduce the error.
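Here is a minimal from-scratch sketch of this algorithm for a tiny network with one hidden layer, trained on the classic XOR problem (the layer sizes, learning rate and number of steps are all assumed values; it uses the sigmoid function described in the next section):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(42)

# Toy data: the XOR problem (2 binary inputs, 1 binary output).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights and biases for a 2 -> 4 -> 1 network (sizes chosen arbitrarily).
W1 = rng.normal(size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))
b2 = np.zeros((1, 1))
lr = 0.5

for step in range(10000):
    # Forward pass: compute the output of every layer.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: measure the output error, then how much each
    # hidden neuron contributed to it (using the sigmoid's derivative).
    output_delta = (output - y) * output * (1 - output)
    hidden_delta = (output_delta @ W2.T) * hidden * (1 - hidden)

    # Gradient Descent step on all connection weights and biases.
    W2 -= lr * hidden.T @ output_delta
    b2 -= lr * output_delta.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ hidden_delta
    b1 -= lr * hidden_delta.sum(axis=0, keepdims=True)

# The predictions should approach [[0], [1], [1], [0]],
# though the exact values depend on the random initialisation.
print(output.round(2))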

Activation Functions

Let us now learn what these functions f(x), g(x) and h(x) actually are.

These functions are called activation functions, and each of them can be any one of the following.

Sigmoid Function:

σ(z) = 1 / (1 + exp(-z))

The Sigmoid Function ranges from 0 to 1.

Hyperbolic Tangent Function:

tanh(z) = 2σ(2z) - 1

The tanh function ranges from -1 to +1.

ReLU Function:

ReLU(z) = max(0, z)

The ReLU function ranges from 0 to infinity. This function is non-differentiable at z = 0.
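All three functions are easy to try out in NumPy (a small sketch, just evaluating each one at a few sample points):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))    # ranges from 0 to 1

def tanh(z):
    return 2 * sigmoid(2 * z) - 1  # equivalent to np.tanh(z); ranges from -1 to +1

def relu(z):
    return np.maximum(0, z)        # ranges from 0 to infinity

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))  # approximately [0.119 0.5   0.881]
print(tanh(z))     # approximately [-0.964  0.     0.964]
print(relu(z))     # [0. 0. 2.]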

The below figure shows many other commonly used Activation Functions.
