Feed forward and back propagation back-to-back — Part 2 (Linear equation in multidimensional space)

Bruno Osiek · Published in Analytics Vidhya · May 29, 2020 · 8 min read

Preface

In part 1 of this series (Linear Equation as a Neural Network building block) we saw what linear equations are and also had a glimpse of their importance in building neural nets.

There was a slight simplification, though: we saw the linear equation only in 2-dimensional space. Why is this an issue?

The answer is that many problems in the world have more than one independent variable. Consider, for example, a function that maps symptoms to illness. Normally a patient has to show more than one symptom to be diagnosed with an illness. Take the flu, for instance: fever alone does not result in a flu diagnosis, but fever together with a headache, a runny nose, a wet cough and fatigue is enough for a doctor to diagnose a case of flu. In this hypothetical function the symptoms are the independent variables and the illness is the dependent one. So if we want our linear equation to address real-world problems we need to add dimensions to it. This is why we need to generalize the linear equation to spaces with more than 2 dimensions, that is, to define the linear equation in n-dimensional space, where n is any positive integer.

The geometric interpretation of a linear equation in more than 2 dimensions is not a line: in 3D space it is a plane, and in n-dimensional space it is a hyperplane. But we still want to be able to define a line. This is where Linear Algebra comes to our rescue! OMG, Linear Algebra?

If you happen to be reading this story without having read part 1 don’t worry. In this series I have a covenant with the reader stated as follows:

Covenant: A note of comfort to the eventual reader

I won’t let concepts like gradient and gradient descent, calculus and multivariate calculus, derivatives, the chain rule, linear algebra, linear combination and linear equation become boulders blocking your path to understanding the math required to master neural networks. By the end of this series, hopefully, the reader will perceive these concepts as the powerful tools they are and see how simply they apply to building neural networks.

Linear equation in 3 dimensions (3D)

The general form of a linear equation in 3 dimensions, where at least one of the coefficients a, b and c is nonzero, is:

ax + by + cz = d

Equation 1: General form of a linear equation in 3 dimensions

The geometric interpretation of this equation is a plane. Figure 1 contains the plot of the plane defined by the equation -4x + y - 10z = -20, on which the points A = (-2,2,3), B = (3,2,1) and C = (-4,4,4) all lie.

Figure 1: Plane defined by equation -4x + y - 10z = -20
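A quick way to convince yourself that these three points really lie on the plane is to plug them into the equation. Below is a minimal check in Python (the helper name on_plane is my illustrative choice, not something from the post):

```python
# Check that points A, B and C satisfy -4x + y - 10z = -20.
def on_plane(point):
    x, y, z = point
    return -4 * x + y - 10 * z == -20

for name, p in [("A", (-2, 2, 3)), ("B", (3, 2, 1)), ("C", (-4, 4, 4))]:
    print(name, on_plane(p))  # prints True for all three points
```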

In 3-dimensional space a line is defined by the intersection of two planes. The intersection of planes -4x + y -10z = -20 and x + 5y = 8 is plotted in figure 2:

Figure 2: Intersection of -4x + y -10z = -20 and x + 5y = 8

The conclusion is that for a point to be on the line it has to satisfy both equations. An example of such a point is point A.
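The same style of check works for the line: a point is on it only if it satisfies both plane equations at once. A small sketch (again, the helper name on_line is just illustrative):

```python
# A point is on the intersection line only if it satisfies BOTH planes.
def on_line(point):
    x, y, z = point
    return (-4 * x + y - 10 * z == -20) and (x + 5 * y == 8)

print(on_line((-2, 2, 3)))  # point A: on both planes -> True
print(on_line((3, 2, 1)))   # point B: on the first plane only -> False
```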

Wait a minute. Things are starting to get a little hard to follow, and trying to push further down this road could make them worse: to work with this line we would have to work with its projections onto the XY, XZ and YZ planes, which are themselves lines. These projections are plotted in figure 3.

Figure 3: Line projections onto the XY, XZ and YZ planes

One observation: all figures in this post were plotted with GeoGebra.

Yes! Things are starting to become too complicated for my taste. Don’t worry, in case you are too: I won’t follow this path. Why? Because this is not what neural network algorithms are built on. They are built on Linear Algebra, which gives us a much simpler way to represent and describe a line in any number of dimensions, as we shall see, and, by the end of this story, I hope you will agree.

Linear algebra

Linear algebra is the branch of mathematics concerning linear equations (such as the equations of the planes above), solutions to systems of linear equations, and linear maps and their representation in vector spaces and as matrices.

The solution set of the linear equation of a plane is the infinite set of points that satisfy it. So a line in 3-dimensional Euclidean space (the space we have been working in so far) is the set of points that satisfy two plane equations simultaneously. Linear algebra helps us deal with the complexities I went through above.

What I will do now is define what a vector is and show how to represent a line in vector space.

Simply speaking, a vector in Euclidean space is an object that has a magnitude and a direction. Figure 4 shows its geometrical representation.

Figure 4: A vector in 2D

In vector space a vector is just an array of numbers, as we can see in equation 3 below. Here I will adopt the following convention: vectors are column arrays, which I will write inline in transposed form, for example (2, 2)ᵀ, while points keep the usual coordinate notation, like a = (0,0) and b = (2,2).

Vector(u) (in yellow in figure 4) has a magnitude, which is the distance between points a = (0,0) and b = (2,2), and a direction, which is given by the angle α. It is computed as the difference between its end point and its starting point:

vector(u) = b - a

Equation 2: Vector(u)

Considering the given points and the equation above we have:

vector(u) = b - a = (2,2) - (0,0) = (2, 2)ᵀ

Equation 3: Vector(u) = b - a in vector space

One should read vector(u) as: starting from point a, push an object 2 units in the x dimension and 2 units in the y dimension, reaching point b.
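In code this is a single array subtraction. A minimal NumPy sketch, which also computes the magnitude and the angle α mentioned above:

```python
import numpy as np

a = np.array([0, 0])
b = np.array([2, 2])
u = b - a                                   # vector(u) = b - a
magnitude = np.linalg.norm(u)               # distance from a to b: ~2.83
alpha = np.degrees(np.arctan2(u[1], u[0]))  # direction: 45 degrees
print(u, magnitude, alpha)                  # [2 2] 2.828... 45.0
```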

In figure 5 the same 2D vector plotted above is plotted again, only this time in 3D.

Figure 5: Vector in 3D

Equation 4 below is the same as equation 3 above; the only difference is that equation 3 is in 2D while equation 4 is in 3D.

vector(u) = b - a = (2,2,0) - (0,0,0) = (2, 2, 0)ᵀ

Equation 4: Vector(u) = b - a in 3D

Cool, isn’t it? In vector space the algebra is significantly simplified, as it is independent of the number of dimensions.
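The point is easy to demonstrate in code: the very same subtraction works whatever the number of dimensions. A small sketch:

```python
import numpy as np

# The same expression b - a defines a vector in any number of dimensions.
print(np.array([2, 2]) - np.array([0, 0]))        # 2D: [2 2]
print(np.array([2, 2, 0]) - np.array([0, 0, 0]))  # 3D: [2 2 0]
```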

Let’s consider the point c = (6,4) and calculate vector(v) as shown below:

Figure 6: c = b + vector(v)
vector(v) = c - b = (6,4) - (2,2) = (4, 2)ᵀ

Equation 5: Computing the value of vector(v)

What is the value of vector(w), the sum of vector(u) and vector(v), in both 2D and 3D?

vector(w) = vector(u) + vector(v) = (2, 2)ᵀ + (4, 2)ᵀ = (6, 4)ᵀ

Equation 6: vector(w) = vector(u) + vector(v) in 2D

vector(w) = vector(u) + vector(v) = (2, 2, 0)ᵀ + (4, 2, 0)ᵀ = (6, 4, 0)ᵀ

Equation 7: vector(w) = vector(u) + vector(v) in 3D

The geometric representation of vector(w) in both 2D and 3D follows:

Figure 7: Vector(w) in 2D
Figure 8: Vector(w) in 3D
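Vector addition is just as direct in code; a short sketch reproducing equations 6 and 7:

```python
import numpy as np

# Coordinate-wise addition in 2D and in 3D.
w_2d = np.array([2, 2]) + np.array([4, 2])        # equation 6: [6 4]
w_3d = np.array([2, 2, 0]) + np.array([4, 2, 0])  # equation 7: [6 4 0]
print(w_2d, w_3d)
```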

To define the equation of a line in vector space we need one more operation, namely scalar multiplication.

Scalar multiplication is simply multiplying all coordinates of a vector by a number. Formally:

λ · vector(u) = λ · (u₁, u₂, …, uₙ)ᵀ = (λu₁, λu₂, …, λuₙ)ᵀ

Equation 8: Scalar multiplication

Figure 9 shows two examples, where the resulting vectors, i.e., the scalar multiples of vector(u) by λ = 2 and by λ = 0.5, are in red:

Figure 9: Scalar multiplication example
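In code, scalar multiplication is a single *; a sketch mirroring figure 9:

```python
import numpy as np

u = np.array([2, 2])
print(2 * u)    # [4 4]: same direction, twice the magnitude
print(0.5 * u)  # [1. 1.]: same direction, half the magnitude
```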

When we multiply a vector by a scalar the new vector has the same direction (reversed if the scalar is negative) but a different magnitude. Now we have all the tools to define the equation of a line in vector space. This equation is:

x = a + λ · vector(u)

Equation 9: Equation of a line in vector space

Where a is a known point on the line and vector(u) gives its direction. By making λ range from -∞ to +∞ we can represent in vector space any of the infinite points on a line.
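A short sketch makes this concrete, sweeping λ over a few values; here I reuse the point a = (0,0) and vector(u) = (2, 2)ᵀ from figure 4 as an illustrative choice:

```python
import numpy as np

a = np.array([0, 0])  # a known point on the line
u = np.array([2, 2])  # the line's direction vector
for lam in [-1.0, 0.0, 0.5, 1.0]:
    print(lam, a + lam * u)  # [-2. -2.], [0. 0.], [1. 1.], [2. 2.]
```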

Geometrically, considering the points O = (0,0), A = (2,2), B = (6,4), C = (10,6) and E = (-2,0), we have:

Figure 10: Geometric visualization of a line as sum of vectors

At first the above figure seems confusing, so let’s remedy that. Vector(w) is the same vector plotted in figure 7, i.e., the sum of vector(u) [black arrow] and vector(v) [red arrow]. In this case λ = 1.

vector(w) = vector(u) + vector(v) = (2, 2)ᵀ + (4, 2)ᵀ = (6, 4)ᵀ

Equation 10: Vector(w)

Vector(w₁) [dotted blue arrow linking point O to C] is the sum of vector(u) plus 2 * vector(v) [yellow arrow linking points A to C]. In this case λ = 2.

vector(w₁) = vector(u) + 2 · vector(v) = (2, 2)ᵀ + (8, 4)ᵀ = (10, 6)ᵀ

Equation 11: Vector(w₁)

Vector(w₂) [dotted blue arrow linking point O to E] is vector(u) minus vector(v) [yellow arrow linking points A to E]. In this case λ = -1.

vector(w₂) = vector(u) - vector(v) = (2, 2)ᵀ - (4, 2)ᵀ = (-2, 0)ᵀ

Equation 12: Vector(w₂)

We can now define the equation of the above line, following the general format shown in equation 9:

(x, y)ᵀ = (2, 2)ᵀ + λ · (4, 2)ᵀ

Equation 13: The equation of the line in vector space
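We can reproduce w, w₁ and w₂ directly from this equation; each value of λ yields the tip of the corresponding vector. A sketch:

```python
import numpy as np

a = np.array([2, 2])  # point A
v = np.array([4, 2])  # vector(v)
for lam, label in [(1, "w  tip = B"), (2, "w1 tip = C"), (-1, "w2 tip = E")]:
    print(label, a + lam * v)  # [6 4], [10 6], [-2 0]
```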

There is one thing missing, though. What is the relationship between the above equation and the equation of the line, y = (1/2)x + 1, plotted in dashed red in figure 10? In other words, how do we pass from an equation in vector space to one in Euclidean space?

Let’s write the two coordinates out explicitly:

x = 2 + 4λ
y = 2 + 2λ

And then isolate λ in each equation:

λ = (x - 2) / 4
λ = (y - 2) / 2

This entails, since λ must be equal in both expressions:

(x - 2) / 4 = (y - 2) / 2

Resulting in:

y = (1/2)x + 1
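A quick numeric cross-check that the two forms describe the same line:

```python
import numpy as np

a, v = np.array([2, 2]), np.array([4, 2])
for lam in np.linspace(-3, 3, 7):
    x, y = a + lam * v
    assert np.isclose(y, 0.5 * x + 1)  # every point satisfies y = x/2 + 1
print("all sampled points lie on y = (1/2)x + 1")
```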

To formalize the equation of a line in vector space in n dimensions I will change the notation slightly. Instead of using letters as coordinates, like x and y, I will use the letter x with subscripts, like x₁ and x₂.

(x₁, x₂, …, xₙ)ᵀ = (a₁, a₂, …, aₙ)ᵀ + λ · (u₁, u₂, …, uₙ)ᵀ

Equation 14: Equation of a line in n dimensions

So equation 9 is the simplified form of equation 14. They both mean the same thing.
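And because arrays carry their own length, the n-dimensional formula is the same single line of code; a sketch with illustrative 4D values:

```python
import numpy as np

# A line in 4D: same formula, just longer arrays (values are illustrative).
a = np.array([1.0, 0.0, 2.0, -1.0])  # a known point on the line
u = np.array([2.0, 1.0, 0.0, 3.0])   # the direction vector
print(a + 0.5 * u)                   # a point on the line: [2. 0.5 2. 0.5]
```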

Epilogue of part 2

As we move to working with many features, in other words many independent variables, working in vector space considerably simplifies the math. To define a line we only need to know one point on the line and the vector that gives its direction in space.

In part 3 of this series we shall see how n-dimensional linear equations can be combined to build what we all know as a neural network.

Watch this space to check when I update this series by publishing part 3. See you then!
