The Perceptron Algorithm in Python
The following is a basic example of the perceptron algorithm in Python. In this post, I will show how we can use the algorithm to build a simple classifier for the Iris dataset. This post is a continuation of my AI series; if you need a refresher, see Part 1–1, where I introduced the algorithm.
The data we will be working with is the all too famous Iris dataset which, as described by scikit-learn, contains “3 different types of irises’ (Setosa, Versicolour, and Virginica) petal and sepal length, stored in a 150x4 numpy.ndarray”. We will be using the sepal and petal lengths to distinguish Setosa from Virginica.
First, we need to load in a few packages:
# Initializations
import numpy as np               # For working with arrays
import pandas as pd              # For transforming data
from sklearn import datasets     # For loading in the dataset
import matplotlib.pyplot as plt  # For plotting
Next, let's load in our data and drop the extra class:
data = datasets.load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['target'] = data['target']
# Dropping Iris-versicolor (target == 1)
df = df.loc[df['target']!=1]
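As a quick sanity check (this print is my own addition), only the Setosa (0) and Virginica (2) labels should remain:
print(df['target'].unique())  # Expected: [0 2]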
As a reminder, for the perceptron algorithm to work, we need the data to be linearly separable. So let’s see what our data looks like to determine if we can continue:
df.plot.scatter(x='sepal length (cm)', y='petal length (cm)', c='target', colormap='viridis')
plt.show()
Looks separable to me! Now let's transform the data to make it a little easier to work with. First, we will relabel the targets as 1 (Iris-virginica) and -1 (Iris-setosa). Then we will keep only the sepal and petal length columns.
df['target'] = np.where(df['target']==2, 1, -1)
df = df[['sepal length (cm)','petal length (cm)','target']]
df = df.rename(columns={'sepal length (cm)': 'sepal', 'petal length (cm)': 'petal'})
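To confirm the relabeling worked (another check of my own), each class should now have 50 rows:
print(df['target'].value_counts())  # Expected: 50 rows each for 1 and -1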
Next, we will define the input and output arrays, randomize the weights, and set eta (the learning rate) and the number of iterations.
# Setting the random state for reproducibility
m = np.random.RandomState(94)
x = df[['sepal', 'petal']].values
y = df['target'].values
# Randomizing the weights (small values centered on zero)
w = m.normal(loc=0.0, scale=0.01, size=1 + x.shape[1])
# Setting the learning rate and number of epochs
eta = .1
n_iter = 10
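A quick shape check (my addition) confirms everything lines up: 100 samples with two features each, and three weights (one bias plus one weight per feature):
print(x.shape, y.shape, w.shape)  # Expected: (100, 2) (100,) (3,)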
Now we should define our prediction function, which takes the dot product of the input values and the weights, adds the bias term, and returns 1 if the result is non-negative and -1 otherwise.
def predict(X, W):
    # Net input: dot product of features and weights, plus the bias W[0]
    return np.where(np.dot(X, W[1:]) + W[0] >= 0.0, 1, -1)
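Before training, we can sanity-check predict on the full input matrix (this check is my addition); with near-zero random weights, the initial accuracy is essentially arbitrary:
print(np.mean(predict(x, w) == y))  # Accuracy before training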
Now for the last step: we can finally implement the learning loop and track the number of misclassifications per epoch to see how well our algorithm works.
errors = []
for i in range(n_iter):
    error = 0
    for xi, target in zip(x, y):
        # Perceptron update rule: eta * (true label - predicted label)
        update = eta * (target - predict(xi, w))
        w[1:] += update * xi
        w[0] += update
        # Count the sample as misclassified if an update occurred
        error += int(update != 0.0)
    errors.append(error)
print(errors)
#OUTPUT
[2, 2, 3, 1, 0, 0, 0, 0, 0, 0]
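To make convergence easier to see, here are two optional plots (my own additions, reusing the matplotlib import from above): the misclassification count per epoch, and the learned decision boundary, which is just the line w[0] + w[1]*sepal + w[2]*petal = 0.
# Misclassifications per epoch
plt.plot(range(1, n_iter + 1), errors, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Misclassifications')
plt.show()
# Learned decision boundary: w[0] + w[1]*sepal + w[2]*petal = 0
sepal_vals = np.linspace(x[:, 0].min(), x[:, 0].max(), 100)
petal_vals = -(w[0] + w[1] * sepal_vals) / w[2]
plt.scatter(x[:, 0], x[:, 1], c=y, cmap='viridis')
plt.plot(sepal_vals, petal_vals, 'k--')
plt.xlabel('sepal length (cm)')
plt.ylabel('petal length (cm)')
plt.show()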
Amazingly, after just four epochs the weights converge and the misclassification count drops to 0! Now this is a cherry-picked example, and in future posts our error will be much worse. However, it is surprising to see such a simple algorithm converge so quickly. Here you can find the GitHub repo.
Zackary Nay is a full-time software developer working within the construction industry, where he implements artificial intelligence and other software to expedite repetitive tasks. In his free time, he likes to read and go backpacking.