Completing the Square

Valerie Dela Cruz

Published in The Startup, Nov 28, 2020

A useful technique when dealing with multivariate Gaussian distributions

This post assumes some familiarity with the Gaussian (also called Normal) distribution and matrix operations.

In my article called Maths Behind Machine Learning, I briefly touched on the idea of Gaussian distributions. The Gaussian distribution is so ubiquitous in applications of statistics that it helps to have tools for spotting it when it is hidden inside an expression. One of these tools is completing the square.

What is the problem statement?

Let’s take the density function p(y) below, where y is an n x 1 vector.

(1)

We can show that (1) is a multivariate Gaussian distribution with mean as shown in (2):

(2)

and covariance as shown in (3):

(3)
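As a sketch of what (1) to (3) state, assume the density takes the common form below, where A is a symmetric positive definite n x n matrix and b is an n x 1 vector. The symbol A is an assumption introduced here for illustration; b is the vector referred to later in this post.

```latex
% (1) assumed form of the density, up to a normalising constant
p(y) \propto \exp\left\{ -\tfrac{1}{2}\, y^{\top} A\, y + b^{\top} y \right\}

% (2) mean
\mu = A^{-1} b

% (3) covariance
\Sigma = A^{-1}
```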

How do we show this?

We do this in two steps.

First, we will write out what the probability density function of a Gaussian distribution with mean as per expression (2) and covariance as per expression (3) looks like.

Second, we will show how to manipulate p(y) in (1) to reveal that it is indeed a Gaussian distribution.

Executing Step 1

To recap, in step 1 we write out what the probability density function of a Gaussian distribution with mean as per expression (2) and covariance as per expression (3) looks like.

For our purposes, we can disregard the normalising factor. What really matters is the quadratic form inside the exponential function, which is the squared Mahalanobis distance, as shown in (4):

(4)

We expand (4) and get the following:

(5)

If we look closely at (5), we can see that its first two terms are the ones that appear inside the exponential in expression (1). We just have to account for the factor of 2. This is illustrated better in the boxed expression below.

Same as (5), but with emphasis on the first two terms
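The expansion can be sketched as follows, assuming the mean is A^{-1} b and the covariance is A^{-1} for a symmetric positive definite matrix A (the symbol A is an assumption used for illustration). Symmetry of A is what lets the two cross terms merge into a single term with a factor of 2.

```latex
% (4) quadratic form of a Gaussian with mean A^{-1} b and covariance A^{-1}
(y - A^{-1} b)^{\top} A \, (y - A^{-1} b)

% (5) expanded: the two cross terms are equal because A is symmetric
= y^{\top} A\, y \;-\; 2\, b^{\top} y \;+\; b^{\top} A^{-1} b
```

Multiplying the first two terms by -1/2 gives -1/2 y^T A y + b^T y, which is the assumed exponent of p(y).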

Executing Step 2

With step 1 in mind, we now work on p(y) to show that it is indeed a Gaussian distribution.

Let’s look at (1) again, or at least at the expression inside the exponential function.

Same as (1), but with emphasis on the terms inside the exponential

We add and subtract a term that emerged in (5), which is

(6)

in order to get the following calculations

(7) : This is the expression inside the exponential function in (1)

(8): This is obtained by adding and subtracting (6) in (7)

(9): This is obtained by grouping some terms in (8)

(10)
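The chain of steps from (7) to (10) can be sketched as below, under the assumption that the exponent of p(y) is -1/2 y^T A y + b^T y with A symmetric positive definite (A is a symbol assumed here for illustration). The term being added and subtracted is b^T A^{-1} b.

```latex
% (7) exponent of p(y), assumed form
-\tfrac{1}{2}\, y^{\top} A\, y + b^{\top} y

% (8) add and subtract b^T A^{-1} b inside the bracket
= -\tfrac{1}{2} \left[ y^{\top} A\, y - 2\, b^{\top} y
    + b^{\top} A^{-1} b - b^{\top} A^{-1} b \right]

% (9) group the first three terms, which form a perfect square
= -\tfrac{1}{2} \left[ (y - A^{-1} b)^{\top} A\, (y - A^{-1} b) \right]
  + \tfrac{1}{2}\, b^{\top} A^{-1} b

% (10) so, absorbing the constant into the normaliser
p(y) \propto \exp\left\{ -\tfrac{1}{2}\, (y - A^{-1} b)^{\top} A\, (y - A^{-1} b) \right\}
```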

Compare expression (10) with (4) and it can be seen that this is a Gaussian distribution. The last term, which involves only b, is a constant (i.e. it does not depend on y), so it does not matter when identifying the form of the probability density function; it is absorbed into the normalising factor.
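A quick numerical check can make the argument concrete. The sketch below assumes the exponent of p(y) is -1/2 y^T A y + b^T y and uses an illustrative 2 x 2 matrix A and vector b (both values are made up for the example). If completing the square is right, the assumed exponent and the Gaussian exponent should differ by the same constant, 1/2 b^T A^{-1} b, at every point y.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def dot(u, v):
    """Inner product of two length-2 vectors."""
    return u[0] * v[0] + u[1] * v[1]


def exponent(y, A, b):
    """Assumed exponent of p(y): -1/2 y^T A y + b^T y."""
    return -0.5 * dot(y, matvec(A, y)) + dot(b, y)


def gaussian_exponent(y, mean, precision):
    """Gaussian exponent without the normaliser: -1/2 (y - mean)^T precision (y - mean)."""
    d = [y[0] - mean[0], y[1] - mean[1]]
    return -0.5 * dot(d, matvec(precision, d))


# Illustrative values (assumptions, not from the post)
A = [[2.0, 0.5], [0.5, 1.0]]  # symmetric positive definite
b = [1.0, -1.0]

cov = inv2(A)          # covariance A^{-1}
mean = matvec(cov, b)  # mean A^{-1} b

points = [[0.0, 0.0], [1.0, 2.0], [-3.0, 0.5], [0.7, -1.2]]
# Difference between the two exponents at each test point
gaps = [exponent(y, A, b) - gaussian_exponent(y, mean, A) for y in points]
# The constant predicted by completing the square: 1/2 b^T A^{-1} b
constant = 0.5 * dot(b, matvec(cov, b))

for y, gap in zip(points, gaps):
    print(y, round(gap, 12), round(constant, 12))
```

Since the gap does not depend on y, the two densities differ only by a multiplicative constant, which is exactly what the normalising factor takes care of.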

So what is the point?

It is very useful to recognise a probability density function of the form in (1), because it can be shown to be a Gaussian distribution with the mean in (2) and the covariance in (3).

We also like Gaussian distributions because many calculations involving them become easier.

So there we have it, an added item in our math tool kit!