Different topics in mathematics are normally taught separately, with very little discussion of how they are connected. Granted, most of the time the connection requires a very deep understanding of the topics involved, and possibly of other, seemingly unrelated ones. In some cases, however, the connection is obvious.
The Gram-Schmidt orthogonalization is a process that transforms a set of vectors (or functions) into a set of orthogonal (or orthonormal, depending on the formulation) vectors. It is a useful procedure if you want to perform the QR decomposition of a matrix, where Q is the matrix of orthonormal columns obtained by applying Gram-Schmidt to the columns of the original matrix.
Consider a matrix A with columns $a_i$. We want to generate a matrix Q with columns $q_i$, such that the columns are orthonormal. In other words,

$$ q_i \cdot q_j = \delta_{ij}, $$

where $\delta_{ij}$ is the Kronecker delta.
Gram-Schmidt gives us a procedure to get from A to Q. It is as follows. Let $A_i$ denote the orthogonal (but not yet normalized) vectors we will construct, so that $q_i = A_i / \lVert A_i \rVert$.

The choice of the first vector (the one to which all subsequent vectors will be made orthogonal) is arbitrary. So,

$$ A_1 = a_1. $$

Now, to get the next orthogonal vector, we need to remove from $a_2$ any component parallel to $A_1$. This can be done simply by

$$ A_2 = a_2 - \frac{a_2 \cdot A_1}{A_1 \cdot A_1}\, A_1. $$

This process of removing the components parallel to the previous $A_i$ can be repeated for the remaining vectors, giving the general formula

$$ A_n = a_n - \sum_{i=1}^{n-1} \frac{a_n \cdot A_i}{A_i \cdot A_i}\, A_i. $$
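The vector version of the procedure can be sketched in a few lines of Python. This is a minimal illustration of the formula above, not a numerically robust QR routine; the example columns and helper names are my own choices:

```python
import math

def dot(u, v):
    """Dot product of two vectors stored as plain lists."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Return orthonormal vectors q_i spanning the same space as the input."""
    orthogonal = []  # the A_i: orthogonal but not yet normalized
    for a in vectors:
        # subtract the component of a parallel to every previous A_i
        A = list(a)
        for prev in orthogonal:
            coeff = dot(a, prev) / dot(prev, prev)
            A = [x - coeff * p for x, p in zip(A, prev)]
        orthogonal.append(A)
    # normalize each A_i to get q_i = A_i / ||A_i||
    return [[x / math.sqrt(dot(A, A)) for x in A] for A in orthogonal]

# illustrative columns of a 3x3 matrix A (linearly independent)
columns = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
Q = gram_schmidt(columns)
```

The resulting `Q` has pairwise-orthogonal unit columns, which is exactly the defining property $q_i \cdot q_j = \delta_{ij}$.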
Now one may wonder: how could a process built on dot products and projections be applied to functions? One application arises in the Sturm-Liouville theory of differential equations.
Gram-Schmidt for functions
When looking at differential equations of the Sturm-Liouville form, we sometimes obtain multiple eigenfunctions for the same eigenvalue. This poses a problem for one of the postulates of Sturm-Liouville theory, which states that the eigenfunctions obtained as solutions of the differential equation are orthogonal to each other: orthogonality is only guaranteed between eigenfunctions of distinct eigenvalues, so eigenfunctions sharing an eigenvalue need not be orthogonal. Hence, we can apply Gram-Schmidt to make these eigenfunctions orthogonal to each other. The process is as follows:
Let

$$ y_1, y_2, \ldots, y_n $$

be the eigenfunctions corresponding to a single eigenvalue. We want to find eigenfunctions

$$ \phi_1, \phi_2, \ldots, \phi_n $$

such that they are orthogonal,

$$ \int_a^b \phi_i(x)\, \phi_j(x)\, dx = k_i\, \delta_{ij}, $$

where $k_i$ would be the mod (squared norm) of the function $\phi_i$.
The procedure is as follows. As with the vectors, the first eigenfunction can be any one of the given eigenfunctions. So,

$$ \phi_1 = y_1. $$

Now, to make the second eigenfunction orthogonal to the first, we need to remove the part of $y_2$ that is 'parallel' to $\phi_1$, and so we can assume the second eigenfunction has the form

$$ \phi_2 = y_2 + c\, \phi_1, $$

where $c$ is a constant. We can determine $c$ by imposing orthogonality:

$$ \int_a^b \phi_1\, (y_2 + c\, \phi_1)\, dx = 0. $$

Expanding the brackets and moving the terms around, we get

$$ c = -\frac{\int_a^b \phi_1\, y_2\, dx}{\int_a^b \phi_1^2\, dx}. $$

As with the vectors, we can iterate this procedure and get the general formula

$$ \phi_n = y_n - \sum_{i=1}^{n-1} \frac{\int_a^b \phi_i\, y_n\, dx}{\int_a^b \phi_i^2\, dx}\, \phi_i. $$
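The function version can be checked numerically by sampling each function on a grid and approximating the integrals with a midpoint sum. The grid size, the interval $[-1, 1]$, and the choice of functions $1, x, x^2$ are illustrative assumptions of mine; orthogonalizing these happens to recover (multiples of) the first Legendre polynomials:

```python
# sample grid for the midpoint-rule approximation of integrals on [a, b]
N = 2000
a, b = -1.0, 1.0
h = (b - a) / N
xs = [a + (k + 0.5) * h for k in range(N)]

def inner(f, g):
    """<f, g> = integral of f*g over [a, b], midpoint rule on sampled values."""
    return h * sum(fv * gv for fv, gv in zip(f, g))

def orthogonalize(samples):
    """Gram-Schmidt on sampled functions; returns orthogonal (not normalized) phis."""
    phis = []
    for y in samples:
        phi = list(y)
        for p in phis:
            # remove the part of y 'parallel' to the previous phi_i
            c = inner(p, y) / inner(p, p)
            phi = [fv - c * pv for fv, pv in zip(phi, p)]
        phis.append(phi)
    return phis

y1 = [1.0 for x in xs]      # y_1(x) = 1
y2 = [x for x in xs]        # y_2(x) = x
y3 = [x * x for x in xs]    # y_3(x) = x^2
phi1, phi2, phi3 = orthogonalize([y1, y2, y3])
```

Here `phi3` comes out as approximately $x^2 - \tfrac{1}{3}$, since $\int_{-1}^{1} x^2\,dx \big/ \int_{-1}^{1} 1\,dx = \tfrac{1}{3}$ and $x$ and $x^2$ are already orthogonal on a symmetric interval.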
This is the formula for finding orthogonal eigenfunctions. Making them orthonormal is trivial: just divide each by its norm (mod).
You might have already noticed that both formulas contain the same kind of terms and were derived in a very similar manner. For clarity's sake, here are both formulas again:

$$ A_n = a_n - \sum_{i=1}^{n-1} \frac{a_n \cdot A_i}{A_i \cdot A_i}\, A_i \quad \text{(vectors)}, $$

$$ \phi_n = y_n - \sum_{i=1}^{n-1} \frac{\int_a^b \phi_i\, y_n\, dx}{\int_a^b \phi_i^2\, dx}\, \phi_i \quad \text{(functions)}. $$
It makes you wonder: is there a more abstract mathematical notion that encompasses functions, vectors, dot products, and integrals alike? The answer is yes, and it is called an inner product space. An inner product space is a vector space equipped with an inner product. Inner products are the generalization of dot products, and for continuous functions the inner product is defined as the integral of the product over the domain. In general, the inner product of two elements $u$ and $v$ is written as

$$ \langle u, v \rangle. $$
So, we can now write the Gram-Schmidt orthogonalization in a very general form:

$$ e_n = v_n - \sum_{i=1}^{n-1} \frac{\langle v_n, e_i \rangle}{\langle e_i, e_i \rangle}\, e_i, $$

where the $v_i$ are the given elements of the inner product space and the $e_i$ are the orthogonal elements produced.
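This general form translates directly into code: the same routine works for any element type once the inner product is passed in as a function. A minimal sketch, with illustrative names of my own:

```python
def gram_schmidt_general(elements, inner):
    """Gram-Schmidt against an arbitrary inner product.

    elements: list of elements (here: plain lists of floats)
    inner:    callable computing <u, v> for two elements
    """
    ortho = []
    for v in elements:
        e = list(v)
        for prev in ortho:
            # e_n = v_n - sum <v_n, e_i> / <e_i, e_i> * e_i
            c = inner(v, prev) / inner(prev, prev)
            e = [x - c * p for x, p in zip(e, prev)]
        ortho.append(e)
    return ortho

# with the usual dot product, this is the matrix version again;
# with a quadrature sum over sampled functions, it is the function version
dot = lambda u, v: sum(x * y for x, y in zip(u, v))
E = gram_schmidt_general([[1.0, 1.0], [1.0, 0.0]], dot)
```

Swapping `dot` for an integral approximation reuses the exact same loop, which is the whole point of the abstraction.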
So essentially, Gram-Schmidt can be used to generate orthogonal elements of any inner product space given elements from that inner product space. This is a very useful result that goes well beyond matrices and functions.