The Perceptron Algorithm: pseudocode

Rajwrita Nath
May 30, 2020


What you need to remember before diving in (probably just me talking to my future self):

A perceptron is a classification model that consists of a set of weights, or scores, one for every feature, and a threshold. The perceptron multiplies each feature value by its corresponding weight and adds them up, obtaining a score. If this score is greater than or equal to the threshold, then the perceptron returns a ‘yes’, or a ‘true’, or the value 1. If it is smaller, then it returns a ‘no’, or a ‘false’, or the value 0.
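
As a tiny worked example of that rule (every number below is made up for illustration):

# Two words, made-up weights for each, and a threshold of -1,
# which is the same as adding a bias of +1 to the score
weights = [2.0, -3.0]
bias = 1.0
features = [3, 1]  # how many times each word appears in a sentence

sentence_score = sum(w * x for w, x in zip(weights, features)) + bias
print(1 if sentence_score >= 0 else 0)  # 2*3 + (-3)*1 + 1 = 4, so this prints 1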

Input:

  • A dataset of points, where every point has a positive or negative label.
  • A number of epochs, n.
  • A learning rate.

Procedure:

  • Start with a random line. In other words, start with random values for the weight of each word, and for the bias.
  • Repeat the following procedure n times:
      o Pick a random point.
      o Apply the perceptron trick to the point and the line. In other words, if the point is correctly classified, do nothing; if it is misclassified, move the line a little bit closer to the point.
  • Enjoy your well-fitted line!

Now the question: how many times should we run this procedure of picking a random point and adjusting the line to fit it better? There are several criteria for figuring out how many times to run it, such as the following:

  • Run the algorithm a fixed number of times, which could be based on our computing power, or the amount of time we have.
  • Run the algorithm until the error is lower than a certain threshold we set beforehand.
  • Run the algorithm until the error doesn’t change significantly for a certain amount of time.

Normally, if we have the computing power, it’s okay to run it many more times than needed, since once we have a well-fitted line, it won’t move very much. In the following section I code the perceptron algorithm and analyze it by measuring the error at each step, so we’ll get a better idea of when to stop running it.
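
For example, here is a rough sketch of the third criterion (the function and all of its parameters are my own invention, not from any library): keep updating until the error stops improving by a meaningful amount for a while.

def run_until_stable(step, tol=1e-3, patience=10, max_steps=10000):
    # step() performs one update and returns the current total error;
    # stop after `patience` consecutive steps with no real improvement
    best = float('inf')
    stale = 0
    for _ in range(max_steps):
        err = step()
        if best - err > tol:
            best, stale = err, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best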

Some small terms before we start:

# The features are the number of times each word appears, denoted x1 through xn.

# The scores for the words are called the weights, denoted w1 through wn. They also include the bias, denoted b.

#The score for a particular sentence is, just as before, the sum of the number of times each word appears (xi) times the weight of that word (wi), plus the bias (b).

# The prediction is simply 1 (happy) if the score is greater than or equal to zero, or 0 (sad) if the score is less than zero.

Finally! There we go:

# Coding the score function (plus the imports used throughout):

import numpy as np
import random

def score(weights, bias, features):
    # Dot product of the feature values and the weights, plus the bias
    return features.dot(weights) + bias
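
One thing to flag: every snippet below calls a prediction function that this post never defines. A minimal sketch, following the rule stated above (1 if the score is non-negative, 0 otherwise):

def prediction(weights, bias, features):
    # Step function: 1 (happy) if the score is >= 0, else 0 (sad)
    return 1 if score(weights, bias, features) >= 0 else 0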

# The error function:

def error(weights, bias, features, label):
    # A correctly classified point contributes no error; a misclassified
    # one contributes the absolute value of its score
    pred = prediction(weights, bias, features)
    if pred == label:
        return 0
    else:
        return np.abs(score(weights, bias, features))

# Total error:

def total_error(weights, bias, X, y):
    # Add up the error of every point in the dataset
    total = 0
    for i in range(len(X)):
        total += error(weights, bias, X.loc[i], y[i])
    return total
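
A quick sanity check on a made-up two-word dataset (the words and counts are invented just for illustration):

import pandas as pd

X = pd.DataFrame({'aack': [1, 0, 1], 'beep': [0, 2, 1]})
y = [1, 0, 1]

print(total_error([1.0, 1.0], -1.0, X, y))  # only the second point is misclassified, so this prints 1.0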

# The perceptron trick: checks whether the point is correctly classified. If it is, it does nothing. If it is misclassified, it moves the weights and bias a small step towards the point, adding or subtracting the features times the learning rate:

def perceptron_trick(weights, bias, features, label, learning_rate=0.01):
    pred = prediction(weights, bias, features)
    if pred == label:
        # Correctly classified: leave the line where it is
        return weights, bias
    if label == 1 and pred == 0:
        # False negative: move the line towards the point by adding
        for i in range(len(weights)):
            weights[i] += features.iloc[i]*learning_rate  # .iloc: positional access into the pandas row
        bias += learning_rate
    elif label == 0 and pred == 1:
        # False positive: move the line away from the point by subtracting
        for i in range(len(weights)):
            weights[i] -= features.iloc[i]*learning_rate
        bias -= learning_rate
    return weights, bias

# Shortcut: notice that the difference between the label and the prediction (namely label - pred) is 0 when they are equal, +1 when the label is 1 and the prediction is 0, and -1 when the label is 0 and the prediction is 1. This quantity lets us remove the if statement and simplify the perceptron trick as follows.

def perceptron_trick_clever(weights, bias, features, label, learning_rate=0.01):
    pred = prediction(weights, bias, features)
    # (label - pred) is 0 when correct and +1 or -1 when wrong,
    # so one update rule covers all three cases
    for i in range(len(weights)):
        weights[i] += (label-pred)*features.iloc[i]*learning_rate
    bias += (label-pred)*learning_rate
    return weights, bias
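
A one-step check on the toy dataset from above: the second point is a false positive, so one application of the trick should pull the weights and bias down slightly.

weights, bias = perceptron_trick_clever([1.0, 1.0], -1.0, X.loc[1], y[1])
print(weights, bias)  # [1.0, 0.98] -1.01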

# Last step: for the perceptron algorithm, all we need to do is repeat the perceptron trick many times (as many times as the number of epochs). For convenience, we also keep track of the error at each epoch. As inputs we have not only the data (features and labels) but also the learning rate, which defaults to 0.01, and the number of epochs, which defaults to 200. Here is the code for the perceptron algorithm:

def perceptron_algorithm(X, y, learning_rate=0.01, epochs=200):
    weights = [1.0 for i in range(len(X.loc[0]))]
    bias = 0.0
    errors = []
    for i in range(epochs):
        # Record the error, then adjust the line to fit one random point
        errors.append(total_error(weights, bias, X, y))
        j = random.randint(0, len(X)-1)
        weights, bias = perceptron_trick(weights, bias, X.loc[j], y[j], learning_rate)
    return weights, bias, errors
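
And a minimal end-to-end run on the same toy dataset (seeding random so the run is repeatable):

random.seed(0)
weights, bias, errors = perceptron_algorithm(X, y, learning_rate=0.1, epochs=50)
print(weights, bias)
print(errors[0], errors[-1])  # the error should shrink as the line settles into place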

Run! Yay! Done for the day zzz.


