Perceptron: the simplest artificial neural network

Utakarsh
5 min read · Jul 9, 2024


The perceptron, one of the simplest artificial neural network architectures, was introduced by Frank Rosenblatt in 1957. It is the simplest type of feedforward neural network, consisting of a single layer of input nodes that are fully connected to a layer of output nodes, and it can learn only linearly separable patterns. It uses a slightly different type of artificial neuron known as the threshold logic unit (TLU).

Let us assume we want to select a group of students for university admission, and that we decide by comparing each student's test score and grade.

Let's draw a perceptron: P = 2·(Test score) + 1·(Grade) − 18.

Here, 2 and 1 are the weights, (Test score, Grade) are the input features, and −18 is the bias (it multiplies a constant input of 1).

  • Input Features: The perceptron takes multiple input features; each input feature represents a characteristic or attribute of the input data.
  • Weights: Each input feature is associated with a weight, determining the significance of each input feature in influencing the perceptron’s output. During training, these weights are adjusted to learn the optimal values.
  • Bias: A bias term is often included in the perceptron model. The bias allows the model to make adjustments that are independent of the input. It is an additional parameter that is learned during training.

Now we can generalize the perceptron:

Z = w1·x1 + w2·x2 + w3·x3 + … + wn·xn + B

(w1, w2, w3, …, wn) are the weights associated with the input features (x1, x2, x3, …, xn), and B is the bias.

In short, it can be written as Z = W·X + B, where W and X are the weight and input vectors.
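
As a quick illustration, the weighted sum can be computed with NumPy (a minimal sketch; the feature values, weights, and bias below are arbitrary, not taken from the example above):

import numpy as np

# Minimal sketch of the weighted sum Z = W·X + B.
# The values below are arbitrary illustration values.
X = np.array([0.5, 0.8, 0.2])   # input features x1, x2, x3
W = np.array([2.0, 1.0, -1.0])  # weights w1, w2, w3
B = 0.5                         # bias

Z = np.dot(W, X) + B
print(Z)  # 2.0*0.5 + 1.0*0.8 + (-1.0)*0.2 + 0.5 = 2.1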

We can represent the input features as nodes and the connections as weights, just as in biological neurons.

Z is the weighted sum, which is compared with a threshold to draw the conclusion; the conclusive output is either 1 or 0 (i.e., Yes or No).

  • Activation Function: The weighted sum is then passed through an activation function. The perceptron uses the Heaviside step function, which takes the weighted sum as input, compares it with the threshold, and outputs 0 or 1.
  • Output: The final output of the perceptron is determined by the activation function’s result. For example, in binary classification problems, the output might represent a predicted class (0 or 1).

The activation function is written h(z), where z is the weighted sum: if h(z) equals 0, the conclusion drawn is NO, whereas if h(z) equals 1, the conclusion drawn is YES.

E.g., in the above university-selection case, the threshold is 0:

if P ≥ 0 (so h(P) = 1), the student will be accepted;

else (P < 0, so h(P) = 0), the student will be rejected,

where h is the activation function applied to the weighted sum P.
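
Here is a minimal Python sketch of this selection rule (assuming the bias of −18 as above; the students' scores are hypothetical):

def h(z):
    # Heaviside step activation: 1 (YES) if z >= 0, else 0 (NO)
    return 1 if z >= 0 else 0

def admit(test_score, grade):
    P = 2 * test_score + 1 * grade - 18  # weights (2, 1), bias -18
    return "accepted" if h(P) == 1 else "rejected"

print(admit(9, 1))  # P = 2*9 + 1 - 18 = 1  >= 0 -> accepted
print(admit(5, 6))  # P = 2*5 + 6 - 18 = -2 <  0 -> rejected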

Till now we have been drawing perceptrons by hand, but what we really want is to build a perceptron from a dataset of input features.

Below is the dataset (data.csv) we will use; each row contains two input features followed by a binary label.

data.csv

0.78051, -0.063669,1
0.28774, 0.29139,1
0.40714, 0.17878,1
0.2923, 0.4217,1
0.50922, 0.35256,1
0.27785, 0.10802,1
0.27527, 0.33223,1
0.43999, 0.31245,1
0.33557, 0.42984,1
0.23448, 0.24986,1
0.0084492, 0.13658,1
0.12419, 0.33595,1
0.25644, 0.42624,1
0.4591, 0.40426,1
0.44547, 0.45117,1
0.42218, 0.20118,1
0.49563, 0.21445,1
0.30848, 0.24306,1
0.39707, 0.44438,1
0.32945, 0.39217,1
0.40739, 0.40271,1
0.3106, 0.50702,1
0.49638, 0.45384,1
0.10073, 0.32053,1
0.69907, 0.37307,1
0.29767, 0.69648,1
0.15099, 0.57341,1
0.16427, 0.27759,1
0.33259, 0.055964,1
0.53741, 0.28637,1
0.19503, 0.36879,1
0.40278, 0.035148,1
0.21296, 0.55169,1
0.48447, 0.56991,1
0.25476, 0.34596,1
0.21726, 0.28641,1
0.67078, 0.46538,1
0.3815, 0.4622,1
0.53838, 0.32774,1
0.4849, 0.26071,1
0.37095, 0.38809,1
0.54527, 0.63911,1
0.32149, 0.12007,1
0.42216, 0.61666,1
0.10194, 0.060408,1
0.15254, 0.2168,1
0.45558, 0.43769,1
0.28488, 0.52142,1
0.27633, 0.21264,1
0.39748, 0.31902,1
0.5533, 1,0
0.44274, 0.59205,0
0.85176, 0.6612,0
0.60436, 0.86605,0
0.68243, 0.48301,0
1, 0.76815,0
0.72989, 0.8107,0
0.67377, 0.77975,0
0.78761, 0.58177,0
0.71442, 0.7668,0
0.49379, 0.54226,0
0.78974, 0.74233,0
0.67905, 0.60921,0
0.6642, 0.72519,0
0.79396, 0.56789,0
0.70758, 0.76022,0
0.59421, 0.61857,0
0.49364, 0.56224,0
0.77707, 0.35025,0
0.79785, 0.76921,0
0.70876, 0.96764,0
0.69176, 0.60865,0
0.66408, 0.92075,0
0.65973, 0.66666,0
0.64574, 0.56845,0
0.89639, 0.7085,0
0.85476, 0.63167,0
0.62091, 0.80424,0
0.79057, 0.56108,0
0.58935, 0.71582,0
0.56846, 0.7406,0
0.65912, 0.71548,0
0.70938, 0.74041,0
0.59154, 0.62927,0
0.45829, 0.4641,0
0.79982, 0.74847,0
0.60974, 0.54757,0
0.68127, 0.86985,0
0.76694, 0.64736,0
0.69048, 0.83058,0
0.68122, 0.96541,0
0.73229, 0.64245,0
0.76145, 0.60138,0
0.58985, 0.86955,0
0.73145, 0.74516,0
0.77029, 0.7014,0
0.73156, 0.71782,0
0.44556, 0.57991,0
0.85275, 0.85987,0
0.51912, 0.62359,0
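
To see the two classes, we can load and plot the file (a sketch assuming the rows above are saved as data.csv in the working directory):

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('data.csv', delimiter=',')
X, y = data[:, :2], data[:, 2]

# Blue points carry label 1, red points carry label 0.
plt.scatter(X[y == 1, 0], X[y == 1, 1], color='blue', label='label 1')
plt.scatter(X[y == 0, 0], X[y == 0, 1], color='red', label='label 0')
plt.xlabel('x1')
plt.ylabel('x2')
plt.legend()
plt.show()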

Let's define a learning rate for our algorithm.

Learning Rate: The learning rate is a hyperparameter that controls the step size by which the weights of a neural network are updated during training.

Perceptron learning algorithm:

z = w1·x1 + w2·x2 + … + wn·xn + B (in general, z = W·X + B)

h(z) is the activation function.

>Start with random weights w1, w2, w3, …, wn, B and a learning rate a.

>For every misclassified point (x1, x2, x3, …, xn):

>>if the prediction h(z) = 0 (but the label is 1):

>>>for i = 1, 2, …, n:

>>>>change wi to wi + a·xi

>>>change B to B + a

>>if the prediction h(z) = 1 (but the label is 0):

>>>for i = 1, 2, …, n:

>>>>change wi to wi − a·xi

>>>change B to B − a
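
Translated directly into Python, one pass of this update rule might look like the following (a sketch; the function name perceptron_step is my own choice):

import numpy as np

def h(z):
    # Heaviside step activation
    return 1 if z >= 0 else 0

def perceptron_step(X, y, W, B, a=0.01):
    # Apply the update rule once to every misclassified point.
    for i in range(len(X)):
        y_hat = h(np.dot(W, X[i]) + B)
        if y[i] - y_hat == 1:       # predicted 0, label 1: add
            W = W + a * X[i]
            B = B + a
        elif y[i] - y_hat == -1:    # predicted 1, label 0: subtract
            W = W - a * X[i]
            B = B - a
    return W, B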

Coding the Perceptron Algorithm

For a point with coordinates (p, q), label y, and prediction given by the equation y_hat = step(w1·p + w2·q + B):

  • If the point is correctly classified, do nothing.
  • If the point is classified positive but has a negative label, subtract a·p, a·q, and a from w1, w2, and B respectively.
  • If the point is classified negative but has a positive label, add a·p, a·q, and a to w1, w2, and B respectively.
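
Putting it together, here is a minimal training loop (reusing perceptron_step and the NumPy import from the sketch above; the epoch count and learning rate are hypothetical choices):

def train_perceptron(X, y, a=0.01, epochs=25):
    # Start with random weights and bias, as the algorithm prescribes.
    np.random.seed(42)              # fixed seed for reproducibility
    W = np.random.rand(X.shape[1])
    B = np.random.rand()
    for _ in range(epochs):
        W, B = perceptron_step(X, y, W, B, a)
    return W, B

# Usage with the arrays loaded from data.csv earlier:
# W, B = train_perceptron(X, y)
# The learned boundary is the line W[0]*x1 + W[1]*x2 + B = 0.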

Conclusion:

Our perceptron algorithm works as expected; one can further improve its accuracy by tuning the learning rate and the number of epochs (each epoch redraws the boundary line).
