Perceptron Convergence Theorem
This article assumes that you have a basic understanding of the Perceptron Learning Algorithm (PLA), as it focuses solely on its convergence.
Theorem: “For any finite set of linearly separable labeled examples, the Perceptron Learning Algorithm (PLA) will terminate after a finite number of iterations.”
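As a concrete companion to the statement, here is a minimal NumPy sketch of the PLA. The function name `perceptron` and the safety cap `max_updates` are my own additions for illustration, not part of the article; the theorem itself is what guarantees termination on separable data.

```python
import numpy as np

def perceptron(X, y, max_updates=10_000):
    """Perceptron Learning Algorithm (PLA) sketch.

    X : (P, d) array of examples x^mu; y : (P,) array of labels in {-1, +1}.
    Returns the learned weights and the number of update steps performed.
    `max_updates` is only a safety cap for this illustration; on linearly
    separable data the convergence theorem guarantees termination.
    """
    P, d = X.shape
    w = np.zeros(d)  # w_0 is the null vector, as in the proof
    updates = 0
    while updates < max_updates:
        # find the misclassified examples: y^mu (w . x^mu) <= 0
        mis = [mu for mu in range(P) if y[mu] * (w @ X[mu]) <= 0]
        if not mis:
            return w, updates  # every example is classified correctly
        mu = mis[0]
        w = w + y[mu] * X[mu]  # the PLA update rule
        updates += 1
    return w, updates
```

Only iterations that actually change the weights are counted, which matches how the proof counts learning steps.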
Notations:
We denote by w* a weight vector that perfectly separates our dataset:

$ \forall \mu : \quad y^\mu \, (w^* \cdot x^\mu) > 0 $

where our dataset S is defined as follows (P is an integer greater than 1):

$ S = \{ (x^\mu, y^\mu) \}_{\mu = 1}^{P}, \qquad y^\mu \in \{-1, +1\} $
We will count as a learning step n only the iterations in which the weights are actually updated.
Before proceeding further with the proof, let’s define a few parameters:
- The angle θ_n between the vector w_n and the vector w*, through its cosine:

$ c(n) = \cos(\theta_n) = \dfrac{w_n \cdot w^*}{\|w_n\| \, \|w^*\|} $
- The Euclidean distance of the point x^µ to the hyperplane defined by w*, which we will refer to as the margin. We express the margins as follows:

$ \gamma^\mu = \dfrac{y^\mu \, (w^* \cdot x^\mu)}{\|w^*\|} $
We also define δ, the minimum distance between a point x^µ and the hyperplane defined by w*:

$ \delta = \min_{\mu} \gamma^\mu > 0 $
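These definitions translate directly into code. A short sketch, assuming NumPy; the helper names `margins` and `min_margin` are hypothetical, introduced here only to mirror γ^µ and δ:

```python
import numpy as np

def margins(X, y, w_star):
    """Margins gamma^mu = y^mu (w* . x^mu) / ||w*|| of each example
    with respect to the separating hyperplane defined by w*."""
    return y * (X @ w_star) / np.linalg.norm(w_star)

def min_margin(X, y, w_star):
    """delta = min over mu of gamma^mu; positive iff w* separates the data."""
    return margins(X, y, w_star).min()
```

A positive `min_margin` is exactly the linear-separability assumption of the theorem.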
Proof:
The intuition here is to lower-bound c(n) by a quantity that grows with n, and then use the fact that the cosine function is less than or equal to 1.
For a given step n, since an update has occurred on some misclassified example (x^µ, y^µ), the squared norm of the weight vector evolves as follows:

$ \|w_{n+1}\|^2 = \|w_n + y^\mu x^\mu\|^2 = \|w_n\|^2 + 2\, y^\mu (w_n \cdot x^\mu) + \|x^\mu\|^2 $
Let’s define R as the radius of a closed ball centered at the origin of our axes and containing all of the vectors x^µ:

$ R = \max_{\mu} \|x^\mu\| $
And since n is an update step, the example was misclassified, so:

$ y^\mu \, (w_n \cdot x^\mu) \le 0 $
So we have:

$ \|w_{n+1}\|^2 \le \|w_n\|^2 + R^2 $
We repeat the process n times (knowing that the initial vector w_0 is the null vector) to get:

$ \|w_n\|^2 \le n R^2 \qquad (1) $
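Inequality (1) can be sanity-checked numerically along a PLA run. A sketch on a hypothetical toy dataset (the data points and variable names are my own illustrative choices, not from the article):

```python
import numpy as np

# Hypothetical separable toy data; the vector (0, 1) separates the labels.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
y = np.array([1, -1, -1])
R2 = (np.linalg.norm(X, axis=1) ** 2).max()  # R^2, squared radius of the data

w = np.zeros(2)  # w_0 is the null vector
n = 0
while True:
    mis = np.flatnonzero(y * (X @ w) <= 0)  # misclassified examples
    if mis.size == 0:
        break
    w += y[mis[0]] * X[mis[0]]  # PLA update on the first misclassified point
    n += 1
    # inequality (1): after n updates the squared norm is at most n * R^2
    assert w @ w <= n * R2 + 1e-9
```

On this tiny dataset the bound is in fact tight after each update, which illustrates that (1) cannot be improved in general.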
This concludes the first part of our proof, dedicated to the denominator of c(n); we now investigate its numerator.
The step n is an update step, so:

$ w_{n+1} \cdot w^* = (w_n + y^\mu x^\mu) \cdot w^* = w_n \cdot w^* + y^\mu \, (x^\mu \cdot w^*) $
The expression of the margin δ gives:

$ y^\mu \, (x^\mu \cdot w^*) \ge \delta \, \|w^*\| $
So:

$ w_{n+1} \cdot w^* \ge w_n \cdot w^* + \delta \, \|w^*\| $
We repeat the process n times (knowing that the initial vector w_0 is the null vector) to get:

$ w_n \cdot w^* \ge n \, \delta \, \|w^*\| \qquad (2) $
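Like the denominator bound, inequality (2) can be checked empirically along a run. A sketch on the same kind of hypothetical toy data (all names and values here are illustrative assumptions):

```python
import numpy as np

# Hypothetical toy data, separated by w* = (0, 1).
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
y = np.array([1, -1, -1])
w_star = np.array([0.0, 1.0])
delta = (y * (X @ w_star)).min() / np.linalg.norm(w_star)  # minimum margin

w = np.zeros(2)  # w_0 is the null vector
n = 0
while True:
    mis = np.flatnonzero(y * (X @ w) <= 0)  # misclassified examples
    if mis.size == 0:
        break
    w += y[mis[0]] * X[mis[0]]  # PLA update
    n += 1
    # inequality (2): after n updates, w_n . w* >= n * delta * ||w*||
    assert w @ w_star >= n * delta * np.linalg.norm(w_star) - 1e-9
```

The projection of w_n onto w* grows at least linearly in n, while by (1) its norm grows only like √n, which is exactly what makes c(n) grow.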
From (1) and (2) and the expression of c(n), we have:

$ c(n) = \dfrac{w_n \cdot w^*}{\|w_n\| \, \|w^*\|} \ge \dfrac{n \, \delta \, \|w^*\|}{\sqrt{n} \, R \, \|w^*\|} = \sqrt{n} \, \dfrac{\delta}{R} $
We could simply use the Cauchy–Schwarz inequality here, but the cosine gives more intuition: it shows how the alignment between w_n and w* increases with √n.
Finally, since c(n) ≤ 1:

$ \sqrt{n} \, \dfrac{\delta}{R} \le 1 \quad \Longrightarrow \quad n \le \dfrac{R^2}{\delta^2} $

The number of update steps is therefore bounded by a finite constant, so the PLA terminates after a finite number of iterations. ∎
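The final bound n ≤ R²/δ² can be sanity-checked empirically. A sketch on randomly generated separable data; the chosen w*, the margin threshold 0.05, and all variable names are illustrative assumptions, not part of the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Label random points by a fixed w*, discarding points too close to the
# hyperplane so that the minimum margin delta is strictly positive.
w_star = np.array([1.0, -2.0])
pts = rng.uniform(-1, 1, size=(500, 2))
scores = pts @ w_star
keep = np.abs(scores) / np.linalg.norm(w_star) > 0.05  # enforce a margin
X, y = pts[keep], np.sign(scores[keep])

R = np.linalg.norm(X, axis=1).max()                        # data radius
delta = (y * (X @ w_star)).min() / np.linalg.norm(w_star)  # minimum margin

# PLA, counting only update steps, starting from w_0 = 0.
w = np.zeros(2)
updates = 0
while True:
    mis = np.flatnonzero(y * (X @ w) <= 0)
    if mis.size == 0:
        break
    w += y[mis[0]] * X[mis[0]]
    updates += 1

print(updates, (R / delta) ** 2)  # updates should not exceed R^2 / delta^2
```

Note that the bound uses the margin of the unknown separator w*, so it explains *why* the PLA stops but cannot be used in advance without knowing δ.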