Computational Linear Algebra: Lecture 2 — Matrix Multiplication and Inverse

Monit Sharma
10 min read · May 19, 2023

Introduction:

Welcome back to our blog series on Computational Linear Algebra! In the previous lecture, we explored the fundamental concepts of scalars, vectors, matrices, and basic operations like matrix addition and subtraction. Today, we will delve deeper into the world of matrices and explore two crucial operations: matrix multiplication and matrix inverse.

Matrix Multiplication:

Matrix multiplication is a fundamental operation in linear algebra that allows us to combine matrices in a meaningful way. It is important to note that matrix multiplication is not commutative, meaning the order of multiplication matters.

Let’s consider two matrices, A and B, where A is an m x n matrix and B is an n x p matrix. To multiply these matrices, the number of columns in A must be equal to the number of rows in B. The resulting matrix, denoted as C, will have dimensions m x p.

The element at row i and column j of matrix C is calculated by taking the dot product of the i-th row of matrix A and the j-th column of matrix B. Mathematically, it can be expressed as:

C[i][j] = A[i][1] * B[1][j] + A[i][2] * B[2][j] + ... + A[i][n] * B[n][j]

Matrix multiplication is a powerful tool that enables us to perform a wide range of operations, such as transformations, solving systems of linear equations, and more. It forms the backbone of many computational algorithms and applications.
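
To make this formula concrete, here is a minimal pure-Python sketch of matrix multiplication (a didactic triple loop over lists of lists, using 0-based indices rather than the 1-based indices of the formula; real code should use an optimized library such as NumPy):

def matmul(A, B):
    # C[i][j] is the dot product of row i of A and column j of B
    m, n = len(A), len(A[0])
    assert n == len(B), "columns of A must equal rows of B"
    p = len(B[0])
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

# a (2 x 3) matrix times a (3 x 2) matrix gives a (2 x 2) matrix
print(matmul([[1, 2, 3], [4, 5, 6]], [[1, 0], [0, 1], [1, 1]]))
# [[4, 5], [10, 11]]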

Matrix Inverse:

The inverse of a matrix is a concept that plays a vital role in solving equations and finding solutions to various linear systems. For a square matrix A, if there exists another square matrix B such that the product of A and B is the identity matrix (denoted as I), then B is the inverse of A.

Mathematically, if A * B = B * A = I, then B is the inverse of A, denoted as A^(-1).

Not all matrices have an inverse. A matrix is invertible, or non-singular, if and only if its determinant is nonzero. If the determinant of a matrix is zero, it is called a singular matrix, and it does not have an inverse.

Finding the inverse of a matrix can be achieved through various methods, such as Gaussian elimination, LU decomposition, or using specialized algorithms like the Gauss-Jordan method. These methods aim to transform the given matrix into reduced row-echelon form or upper triangular form, allowing us to determine the inverse if it exists.
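
As an illustration of the Gauss-Jordan method, here is a minimal NumPy sketch (a teaching aid, not production code): we augment A with the identity matrix and row-reduce [A | I] until the left block becomes I, at which point the right block is A^(-1). Partial pivoting is included for numerical stability.

import numpy as np

def gauss_jordan_inverse(A):
    # Row-reduce the augmented matrix [A | I] to [I | A^(-1)]
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])
    for col in range(n):
        # partial pivoting: bring the largest entry of the column to the pivot
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is singular, no inverse exists")
        aug[[col, pivot]] = aug[[pivot, col]]  # swap rows
        aug[col] /= aug[col, col]              # scale the pivot row to 1
        for row in range(n):                   # zero out the rest of the column
            if row != col:
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(gauss_jordan_inverse(A))  # matches np.linalg.inv(A)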

The identity matrix, denoted as I, is a special square matrix where all the diagonal elements are 1 and all the off-diagonal elements are 0. It has the property that multiplying any matrix by it leaves that matrix unchanged. In other words, I * A = A * I = A.

Applications and Importance:

Matrix multiplication and matrix inverse find extensive applications in various fields, including computer graphics, data analysis, optimization problems, machine learning, and more. They are fundamental tools in solving systems of linear equations, transforming geometric objects, performing projections, and modeling real-world phenomena.

Jupyter Notebook

# import numpy
import numpy as np

# avoid inaccurate floating point display
np.set_printoptions(suppress=True)

Introduction

We will see some very important concepts in this chapter. The dot product appears in nearly every equation behind data science algorithms, so it is worth the effort to understand it. Then we will see some properties of this operation. Finally, we will get some intuition on the link between matrices and systems of linear equations.

Multiplying Matrices and Vectors

The standard way to multiply matrices is not to multiply each element of one with each element of the other (that is the element-wise product) but to calculate the sum of products between rows and columns.

The number of columns of the first matrix must be equal to the number of rows of the second matrix. Thus, if the shape of the first matrix is (m × n), the second matrix needs to be of shape (n × p). The resulting matrix will have the shape (m × p).

Example 1

It is a good habit to check the dimensions of the matrices to see what is going on. We can see in this example that the shape of A is (3 × 2) and the shape of B is (2 × 1). So the dimensions of C are (3 × 1).

With Numpy

The Numpy function dot() can be used to compute the matrix product (or dot product). Let's try to reproduce the last example:

A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

B = np.array([[2], [4]])
B

array([[2],
       [4]])

C = np.dot(A, B)
C

array([[10],
       [22],
       [34]])

It is equivalent to use the method dot() of Numpy arrays:


C = A.dot(B)
C

array([[10],
       [22],
       [34]])
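
As noted above, it is a good habit to check the shapes; in Numpy they are available through the shape attribute:

# the (3 x 2) . (2 x 1) -> (3 x 1) rule, visible in the shapes
print(A.shape, B.shape, C.shape)
# (3, 2) (2, 1) (3, 1)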

Let’s compute a bigger example with Numpy: a (4 × 3) matrix times a (3 × 2) matrix gives a (4 × 2) matrix.



A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
A
B = np.array([[2, 7], [1, 2], [3, 6]])
B
C = A.dot(B)
C

array([[ 13,  29],
       [ 31,  74],
       [ 49, 119],
       [ 67, 164]])
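
We can verify one entry by hand with the formula above: the top-left element of C is the dot product of the first row of A and the first column of B.

# C[0][0] = 1*2 + 2*1 + 3*3 = 13, matching the output above
print(1*2 + 2*1 + 3*3)       # 13
print(A[0, :].dot(B[:, 0]))  # 13, the same dot product with Numpy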

Formalization of the dot product

The product C = AB is defined entry by entry: C[i][j] = A[i][1] * B[1][j] + A[i][2] * B[2][j] + ... + A[i][n] * B[n][j], i.e. a sum over the shared dimension n.

Properties of the dot product

We will now see some interesting properties of matrix multiplication. They will become useful as we move forward in the chapters. Using simple examples for each property provides a way to check them while we get used to the Numpy functions.

Matrix multiplication is distributive

A(B + C) = AB + AC

A = np.array([[2, 3], [1, 4], [7, 6]])
A
B = np.array([[5], [2]])
B

C = np.array([[4], [3]])
C

A(B+C)

D = A.dot(B+C)
D

array([[33],
       [29],
       [93]])

is equivalent to AB + AC

D = A.dot(B) + A.dot(C)
D

array([[33],
       [29],
       [93]])

Matrix multiplication is associative

A(BC) = (AB)C

(We reuse the vector C defined in the previous example.)

A = np.array([[2, 3], [1, 4], [7, 6]])
A
B = np.array([[5, 3], [2, 2]])
B

A(BC)

D = A.dot(B.dot(C))
D

array([[100],
       [ 85],
       [287]])

is equivalent to (AB)C

D = (A.dot(B)).dot(C)
D

array([[100],
       [ 85],
       [287]])

Matrix multiplication is not commutative

In general, AB ≠ BA

A = np.array([[2, 3], [6, 5]])
A
B = np.array([[5, 3], [2, 2]])
B

AB

AB = np.dot(A, B)
AB

array([[16, 12],
       [40, 28]])

is different from BA

BA = np.dot(B, A)
BA

array([[28, 30],
       [16, 16]])

However, the dot product of two vectors is commutative: x^T y = y^T x

x = np.array([[2], [6]])
x
y = np.array([[5], [2]])
y
x_ty = x.T.dot(y)
x_ty

array([[22]])

is equivalent to

y_tx = y.T.dot(x)
y_tx

array([[22]])
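
Note that x and y here are column vectors, i.e. 2-D arrays of shape (2, 1), which is why the result is a (1, 1) array. With plain 1-D arrays, np.dot returns a scalar directly and the symmetry is even more apparent:

# with 1-D arrays the dot product is a scalar, and the order doesn't matter
u = np.array([2, 6])
v = np.array([5, 2])
print(np.dot(u, v), np.dot(v, u))  # 22 22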

Simplification of the matrix product

The transpose of a product reverses the order of the factors: (AB)^T = B^T A^T

A = np.array([[2, 3], [1, 4], [7, 6]])
A
B = np.array([[5, 3], [2, 2]])
B
AB_t = A.dot(B).T
AB_t

array([[16, 13, 47],
       [12, 11, 33]])

is equivalent to B^T A^T

B_tA = B.T.dot(A.T)
B_tA

array([[16, 13, 47],
       [12, 11, 33]])
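
As a quick numerical sanity check, we can verify all of these properties at once with np.allclose, using randomly generated square matrices so that every product is defined (an illustrative check, not a proof):

# check the properties on random (3 x 3) matrices
rng = np.random.default_rng(0)
A, B, C = (rng.random((3, 3)) for _ in range(3))

print(np.allclose(A.dot(B + C), A.dot(B) + A.dot(C)))   # distributive: True
print(np.allclose(A.dot(B.dot(C)), (A.dot(B)).dot(C)))  # associative: True
print(np.allclose(A.dot(B).T, B.T.dot(A.T)))            # transpose rule: True
print(np.allclose(A.dot(B), B.dot(A)))                  # commutative? False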

Identity and Inverse Matrices

# importing packages
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# plotting parameters
sns.set()
%matplotlib inline
plt.rcParams['figure.figsize'] = (4, 4)

# avoid inaccurate floating point display
np.set_printoptions(suppress=True)

This chapter is light but contains some important definitions. The identity matrix or the inverse of a matrix are concepts that will be very useful in the next chapters. We will see at the end of this chapter that we can solve systems of linear equations by using the inverse matrix.

The identity matrix I is a special matrix of shape (n × n) that is filled with 0s, except for the diagonal, which is filled with 1s.

An identity matrix can be created with the Numpy function eye():

np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

When we apply the identity matrix to a vector, the result is that same vector:

x = np.array([[2],[6],[3]])

x

array([[2],
       [6],
       [3]])

xid = np.eye(x.shape[0]).dot(x)
xid

array([[2.],
       [6.],
       [3.]])

Intuition

You can think of a matrix as a way to transform objects in an n-dimensional space: it applies a linear transformation of the space. We say that we apply a matrix to an element, meaning that we take the dot product between this matrix and the element. We will see this notion in depth in the next chapters, but the identity matrix is a good first example. It is a special case because the space doesn’t change when we apply the identity matrix to it.

Indeed, we saw above that x was not altered after being multiplied by I.

Inverse Matrices

The matrix inverse of A is denoted A^(-1). It is the matrix that results in the identity matrix when it is multiplied by A:

A^(-1) A = A A^(-1) = I

This means that if we apply a linear transformation to the space with A, it is possible to go back with A^(-1). It provides a way to cancel the transformation.
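
Here is a small sketch of this cancellation, using an arbitrary invertible matrix for illustration: we transform a vector with A and recover it with A^(-1).

# applying A and then A^(-1) brings the vector back where it started
A = np.array([[2.0, 1.0], [1.0, 3.0]])
x = np.array([[1.0], [2.0]])
y = A.dot(x)                      # transform x with A
x_back = np.linalg.inv(A).dot(y)  # undo the transformation
print(x_back)                     # [[1.], [2.]]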

Example 2

For this example, we will use the Numpy function linalg.inv() to calculate the inverse of A:

A = np.array([[3, 0, 2], [2, 0, -2], [0, 1, 1]])
A

array([[ 3,  0,  2],
       [ 2,  0, -2],
       [ 0,  1,  1]])

Now, we calculate its inverse

A_inv = np.linalg.inv(A)
A_inv

array([[ 0.2,  0.2,  0. ],
       [-0.2,  0.3,  1. ],
       [ 0.2, -0.3, -0. ]])

We can check with Python that A^(-1) is indeed the inverse of A:

A_bis = A_inv.dot(A)
A_bis

array([[ 1.,  0., -0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
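
The definition also requires A A^(-1) = I, which we can check the same way:

# the product in the other order should also give the identity
print(np.allclose(A.dot(A_inv), np.eye(3)))  # True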

We will see that matrix inverses can be very useful, for instance to solve sets of linear equations. Note, however, that non-square matrices (matrices with more columns than rows, or more rows than columns) do not have an inverse.

Solving a system of linear equations

The inverse matrix can be used to solve the equation Ax = b by multiplying both sides by A^(-1):

A^(-1)Ax = A^(-1)b

Since we know by definition that A^(-1)A = I, this gives:

Ix = A^(-1)b

We saw that a vector is not changed when multiplied by the identity matrix. So we can write:

x = A^(-1)b

This is great! We can solve a set of linear equations just by computing the inverse of A and applying it to the vector of results b!
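
A practical aside: explicitly inverting A is rarely the best numerical route. Numpy provides np.linalg.solve, which solves Ax = b directly through a factorization and is generally faster and more accurate than forming A^(-1). A quick sketch on a made-up system:

# solve Ax = b without forming the inverse explicitly
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([[9.0], [8.0]])
print(np.linalg.solve(A, b))  # [[2.], [3.]]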

Example 3

We will take a simple solvable example:

y = 2x
y = -x + 3

We will use the notation introduced before: x1 corresponds to x and x2 corresponds to y. So we have:

2*x1 - x2 = 0
x1 + x2 = 3

Our matrix A of weights is:

A = [[2, -1],
     [1, 1]]

And the vector b containing the right-hand sides of the individual equations is:

b = [[0],
     [3]]

In matrix form, our system becomes Ax = b.

Let’s find the inverse of A

A = np.array([[2, -1], [1, 1]])
A
A_inv = np.linalg.inv(A)
A_inv

array([[ 0.33333333,  0.33333333],
       [-0.33333333,  0.66666667]])

We also have:

b = np.array([[0], [3]])

We then compute x = A^(-1)b:

x = A_inv.dot(b)
x

array([[1.],
       [2.]])

This is our solution!
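
We can double-check it by substituting the solution back into the system:

# A . x should reproduce b
print(A.dot(x))  # [[0.], [3.]]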

This means that the point of coordinates (1, 2) is the solution and is at the intersection of the lines representing the equations. Let’s plot them to check this solution:



x = np.arange(-10, 10)
y = 2*x
y1 = -x + 3

plt.figure()
plt.plot(x, y)
plt.plot(x, y1)
plt.xlim(0, 3)
plt.ylim(0, 3)
# draw axes
plt.axvline(x=0, color='grey')
plt.axhline(y=0, color='grey')
plt.show()
plt.close()

We can see that the solution (where the lines cross) is x = 1 and y = 2. It confirms what we found with the matrix inversion!

Singular Matrices

Some matrices are not invertible; they are called singular. A square matrix is singular exactly when its determinant is zero, i.e., when its rows (or columns) are linearly dependent.
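
For instance, a matrix whose second row is twice its first has determinant 0, and np.linalg.inv raises a LinAlgError when asked to invert it:

# the rows are linearly dependent, so the determinant is 0
S = np.array([[1, 2], [2, 4]])
print(np.linalg.det(S))  # 0.0

try:
    np.linalg.inv(S)
except np.linalg.LinAlgError as e:
    print("not invertible:", e)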

Conclusion:

In this lecture, we have explored the concepts of matrix multiplication and inverse, two essential operations in computational linear algebra. Matrix multiplication allows us to combine matrices and perform a wide range of transformations and calculations. The matrix inverse, when it exists, enables us to solve equations and find solutions to linear systems.

In the next lecture, we will continue our journey into computational linear algebra by exploring more advanced topics, such as eigenvectors, eigenvalues, and diagonalization. Stay tuned.
