Essential Math for Machine Learning: Hyperplanes

The Linear Cut

10 min read · Jan 31, 2024

This article is part of the series Essential Math for Machine Learning.

Introduction

In machine learning, hyperplanes play a pivotal role in shaping concepts, solving problems, and providing insights into complex datasets. Often encountered in algorithms such as Approximate Nearest Neighbor (ANN) search and Support Vector Machines (SVMs), hyperplanes can seem daunting due to their abstract nature. However, understanding hyperplanes in n-dimensional space is crucial for grasping the fundamentals of many advanced topics. This blog post aims to demystify hyperplanes by explaining their fundamental properties, how they are defined, and how we can manipulate them for practical applications.

What is a Hyperplane?

A hyperplane can be thought of as a flat, n-1 dimensional subset of an n-dimensional space. For instance, in a three-dimensional space, a hyperplane is a two-dimensional plane. In two dimensions, it’s simply a line. The power of hyperplanes lies in their ability to divide a space into two half-spaces, which is incredibly useful in classification tasks, optimization problems, and more.

Formula for a Hyperplane

The general equation of a hyperplane in an n-dimensional space is given by:

a1*x1 + a2*x2 + … + an*xn + d = 0

Here, x1, x2, …, xn represent the coordinates in n-dimensional space, a1, a2, …, an are the coefficients determining the orientation of the hyperplane, and d is the constant term. This equation effectively defines a boundary where every point on the hyperplane satisfies the equation.

Note that the equation can be scaled by any non-zero value without changing the hyperplane it describes, as long as all coefficients and the constant term d are scaled together. In particular, we might want to scale by 1/sqrt(a1²+a2²+…+an²), so that after scaling b1²+b2²+…+bn² = 1, where bi = ai/sqrt(a1²+a2²+…+an²).
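The scaling described above can be sketched in a few lines. The function name normalize_hyperplane and the example line 3x + 4y - 10 = 0 are assumptions made for illustration; they do not come from the original article.

```python
import numpy as np

def normalize_hyperplane(coefficients):
    """Scale (a1, ..., an, d) so that the normal vector has unit length."""
    coefficients = np.asarray(coefficients, dtype=float)
    norm = np.linalg.norm(coefficients[:-1])
    if norm == 0:
        raise ValueError("At least one of a1, ..., an must be non-zero.")
    # Divide every coefficient, including the constant term d, by ||N||
    return coefficients / norm

# 3x + 4y - 10 = 0 describes the same line as 0.6x + 0.8y - 2 = 0
unit = normalize_hyperplane([3, 4, -10])
print(unit)                       # approximately 0.6, 0.8, -2.0
print(np.linalg.norm(unit[:-1]))  # approximately 1.0
```

Note that the constant term is divided as well; scaling only a1, …, an would describe a different hyperplane.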

The Normal Vector of a Hyperplane

The normal vector to a hyperplane is a vector that is perpendicular to the hyperplane. For the hyperplane equation given above, the normal vector N can be represented as:

N = (a1, a2, …, an)

Proof:

Consider two points on the hyperplane, P(p1, p2, …, pn) and Q(q1, q2, …, qn). The vector PQ = (q1-p1, q2-p2, …, qn-pn) is parallel to the hyperplane. Since both points satisfy the hyperplane equation, we have:

a1*p1 + a2*p2 + … + an*pn + d = 0

a1*q1 + a2*q2 + … + an*qn + d = 0

Subtracting the first equation from the second, we get:

a1(q1-p1) + a2(q2-p2) + … + an(qn-pn) = 0

The left-hand side is the dot product of N and PQ. Since P and Q were chosen arbitrarily, the normal vector is perpendicular to every vector parallel to the hyperplane.

With the normal vector, the hyperplane equation can also be interpreted through the dot product of the normal vector N = (a1, a2, …, an) and any vector V = (v1, v2, …, vn) that starts at the origin and ends on the hyperplane. Their dot product is the constant -d:

dot(N, V) = -d

This means that when projecting any vector whose endpoint is on the hyperplane onto the normal vector, the signed length of the projection is a constant: -d / ||N||. Its absolute value, |d| / ||N||, is the distance from the origin to the hyperplane.
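As a quick numerical check of these two facts, here is a minimal sketch using the line 2x + y - 3 = 0. The three points are hand-picked for illustration and are not part of the original article.

```python
import numpy as np

# Hyperplane 2x + y - 3 = 0, so N = (2, 1) and d = -3
N = np.array([2.0, 1.0])
d = -3.0

# Three points chosen by hand so that 2x + y = 3
points = np.array([[0.0, 3.0], [1.0, 1.0], [1.5, 0.0]])

# A vector between two points on the hyperplane is orthogonal to N
PQ = points[1] - points[0]
print(np.dot(N, PQ))   # 0.0

# dot(N, V) is the same constant, -d, for every point on the hyperplane
dots = points @ N
signed_lengths = dots / np.linalg.norm(N)
print(dots)            # every entry equals -d = 3
print(signed_lengths)  # every entry equals -d / ||N|| = 3 / sqrt(5)
```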

import numpy as np
import matplotlib.pyplot as plt

# Hyperplane: 2x + y - 3 = 0

# Create a range of x values
x = np.linspace(-2, 2, 400)

# Calculate the corresponding y values from the hyperplane equation
y = 3 - 2 * x

# The normal vector to the hyperplane 2x + y - 3 = 0 is [2, 1]
normal_vector = np.array([2, 1])

# Plot origin for the normal vector
origin = [0, 0]  # Origin point

# Create the 2D plot with the axes going through the origin (0, 0)
plt.figure()

# Plot the line representing the hyperplane in 2D space
plt.plot(x, y, label='Hyperplane: 2x + y - 3 = 0')

# Plot the normal vector with precise control over its length
plt.quiver(*origin, *normal_vector, scale_units='xy', angles='xy', scale=1, color='red', label='Normal Vector [2, 1]')

# Display the value of the normal vector on the plot
plt.text(1.2, 1.2, 'Normal Vector: [2, 1]', color='red')

# Setting labels
plt.xlabel('X axis')
plt.ylabel('Y axis')

# Move the spines to go through the origin
plt.axhline(0, color='black', linewidth=0.5)
plt.axvline(0, color='black', linewidth=0.5)
plt.gca().spines['left'].set_position('zero')
plt.gca().spines['bottom'].set_position('zero')
plt.gca().spines['right'].set_color('none')
plt.gca().spines['top'].set_color('none')

# Set the aspect of the plot to be equal
plt.axis('equal')

# Setting plot limits for better visualization
plt.xlim([-2, 2])
plt.ylim([-2, 5])

# Adding a legend
plt.legend()

# Show the plot
plt.show()

Vectors Parallel to a Hyperplane

A vector is parallel to a hyperplane if and only if it is orthogonal to the normal vector of the hyperplane, that is, the dot product of the vector V and the normal vector N is 0.

import numpy as np

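def is_parallel_to_hyperplane(vector, hyperplane_coefficients):
    # The normal vector is every coefficient except the constant term d
    normal_vector = np.array(hyperplane_coefficients[:-1])
    dot_product = np.dot(normal_vector, vector)
    # Compare with a tolerance: exact equality is fragile for floating-point input
    return bool(np.isclose(dot_product, 0))

# Coefficients (a1, a2, a3, d) for a hyperplane in 3D space
hyperplane_coefficients = np.array([1, 2, -1, 3])

# dot([1, 2, 3], [1, 2, -1]) = 2, so this vector is not parallel
vector = np.array([1, 2, 3])
print("Is the vector parallel to the hyperplane?", is_parallel_to_hyperplane(vector, hyperplane_coefficients))

# dot([1, 0, 1], [1, 2, -1]) = 0, so this vector is parallel
parallel_vector = np.array([1, 0, 1])
print("Is the vector parallel to the hyperplane?", is_parallel_to_hyperplane(parallel_vector, hyperplane_coefficients))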

Projecting a Vector onto a Hyperplane

Projecting a vector onto a hyperplane involves decomposing the vector into two orthogonal components: one that is parallel to the hyperplane and one that is parallel to the normal vector. Removing the latter from the vector leaves the former. Because only directions matter here, the constant term d plays no role. The component parallel to the normal vector is the projection of the vector onto the normal vector, which can be calculated with

project(V, N) = (dot(V, N) / dot(N, N)) * N

and the length of the projection equals

||project(V, N)|| = |dot(V, N)| / ||N||

Now, let’s provide a Python code example that demonstrates how to project a vector onto a hyperplane:

import numpy as np

def project_onto_hyperplane(vector, normal):
    # Coefficient of the component of `vector` along `normal`
    scalar_projection = np.dot(vector, normal) / np.dot(normal, normal)
    # Component of `vector` parallel to the normal
    vector_projection = scalar_projection * normal
    # Removing it leaves the component parallel to the hyperplane
    projection = vector - vector_projection
    return projection

# Example vector and hyperplane normal
v = np.array([3, 4, 5])  # Vector to be projected
n = np.array([1, 0, 0])  # Normal vector of the hyperplane

# Project v onto the hyperplane; the component along the x axis is removed
projection = project_onto_hyperplane(v, n)
print("Projection onto the hyperplane:", projection)
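To see the decomposition at work with a normal that is not axis-aligned, here is a small sketch. The normal (1, 2, -1) and the variable names are assumptions chosen for illustration.

```python
import numpy as np

v = np.array([3.0, 4.0, 5.0])
n = np.array([1.0, 2.0, -1.0])  # hypothetical normal, not axis-aligned

along_normal = (np.dot(v, n) / np.dot(n, n)) * n  # project(V, N)
in_plane = v - along_normal                       # component parallel to the hyperplane

# The in-plane component is orthogonal to the normal
print(np.dot(in_plane, n))  # 0.0 up to floating-point error

# The two components add back up to the original vector
print(np.allclose(in_plane + along_normal, v))

# The length of the normal component matches |dot(V, N)| / ||N||
print(np.linalg.norm(along_normal), abs(np.dot(v, n)) / np.linalg.norm(n))
```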

The Distance from a Point to a Hyperplane

To calculate the distance from a point to a hyperplane in n-dimensional space, we project the point's position vector V onto the normal vector and subtract the signed offset of the hyperplane from the origin, -d / ||N||. The formula is:

D = |dot(V, N) + d| / ||N||

The geometric interpretation of the formula is as follows: dot(V, N) is the signed length of the projection of V onto N, scaled by ||N||, and -d is the signed offset of the hyperplane from the origin, also scaled by ||N||. Their difference, dot(V, N) + d, is therefore the signed distance scaled by ||N||, so taking the absolute value and dividing by ||N|| gives the distance. The formula also shows that points on the hyperplane have distance 0 to the hyperplane.

import numpy as np

def distance_point_to_hyperplane(point, hyperplane_coefficients):
    """
    Calculates the distance from a point to a hyperplane.

    Parameters:
    point (numpy.ndarray): Coordinates of the point.
    hyperplane_coefficients (numpy.ndarray): Coefficients of the hyperplane (a1, a2, ..., an, d).

    Returns:
    float: The distance from the point to the hyperplane.
    """
    normal_vector = hyperplane_coefficients[:-1]
    d = hyperplane_coefficients[-1]
    distance = np.abs(np.dot(normal_vector, point) + d) / np.linalg.norm(normal_vector)
    return distance

point = np.array([1, 2, 3])
hyperplane_coefficients = np.array([1, 2, -1, 3])
distance = distance_point_to_hyperplane(point, hyperplane_coefficients)
print("Distance from the point to the hyperplane:", distance)
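One way to sanity-check the formula is to compute the closest point on the hyperplane and measure the distance to it directly. The sketch below does that; the helper name closest_point_on_hyperplane is an assumption introduced for illustration.

```python
import numpy as np

def closest_point_on_hyperplane(point, hyperplane_coefficients):
    """Move from `point` along the normal until the hyperplane equation holds."""
    normal = hyperplane_coefficients[:-1]
    d = hyperplane_coefficients[-1]
    step = (np.dot(normal, point) + d) / np.dot(normal, normal)
    return point - step * normal

point = np.array([1.0, 2.0, 3.0])
coeffs = np.array([1.0, 2.0, -1.0, 3.0])

foot = closest_point_on_hyperplane(point, coeffs)
residual = np.dot(coeffs[:-1], foot) + coeffs[-1]       # 0 if foot is on the hyperplane
direct_distance = np.linalg.norm(point - foot)
formula_distance = abs(np.dot(coeffs[:-1], point) + coeffs[-1]) / np.linalg.norm(coeffs[:-1])

print(residual)                           # 0.0 up to floating-point error
print(direct_distance, formula_distance)  # both equal 5 / sqrt(6)
```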

Randomly Selecting a Vector in a Hyperplane

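Choosing a random vector whose endpoint lies on the hyperplane can be done by generating a random vector and then projecting it onto the hyperplane. Here the projection takes the constant term d into account, so that the resulting vector satisfies the hyperplane equation.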

import numpy as np

def random_vector_in_hyperplane(n_dimensions, hyperplane_coefficients):
    # Start from an arbitrary random vector
    random_vector = np.random.randn(n_dimensions)
    normal_vector = hyperplane_coefficients[:-1]
    constant_term = hyperplane_coefficients[-1]
    # Move along the normal until a1*x1 + ... + an*xn + d = 0 holds
    step = (np.dot(normal_vector, random_vector) + constant_term) / np.dot(normal_vector, normal_vector)
    adjusted_vector = random_vector - step * normal_vector
    # Return both the vector on the hyperplane and the original random vector
    return adjusted_vector, random_vector

# Example usage
n_dimensions = 3
hyperplane_coefficients = np.array([1, 2, -1, 3])  # Example coefficients for a hyperplane in 3D space
vector_in_hyperplane, random_vector = random_vector_in_hyperplane(n_dimensions, hyperplane_coefficients)
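The same adjustment can be applied to a whole batch of random vectors at once, which also makes it easy to confirm that every result satisfies the hyperplane equation. This is a sketch under assumptions: the seed, the batch size of 5, and the variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
coeffs = np.array([1.0, 2.0, -1.0, 3.0])
normal, d = coeffs[:-1], coeffs[-1]

# Five random vectors in 3D space, one per row
random_vectors = rng.standard_normal((5, 3))

# Per-row step along the normal that brings each vector onto the hyperplane
steps = (random_vectors @ normal + d) / np.dot(normal, normal)
on_plane = random_vectors - steps[:, None] * normal

# Plug every adjusted vector back into a1*x1 + a2*x2 + a3*x3 + d
residuals = on_plane @ normal + d
print(np.max(np.abs(residuals)))  # close to 0 up to floating-point error
```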

Randomly Selecting a Point in a Hyperplane

Selecting a point in a hyperplane is different from selecting a vector. In the context of a hyperplane, a point refers to a specific location in space that lies on the hyperplane, while a vector can be thought of as a direction and magnitude from the origin. When we talk about selecting a random point in a hyperplane, we’re essentially talking about choosing coordinates that satisfy the hyperplane equation.

To randomly select a point on a hyperplane in n-dimensional space, randomly choose n−1 coordinates, then solve the hyperplane equation for the remaining coordinate. This ensures that the point lies on the hyperplane, provided the coefficient of the remaining coordinate is non-zero.

import numpy as np

def random_point_in_hyperplane(hyperplane_coefficients):
    """
    Generates a random point in a hyperplane.

    Parameters:
    hyperplane_coefficients (numpy.ndarray): Coefficients of the hyperplane (a1, a2, ..., an, d).

    Returns:
    numpy.ndarray: A random point in the hyperplane.
    """
    n_dimensions = len(hyperplane_coefficients) - 1
    point = np.random.rand(n_dimensions - 1)  # Randomly choose n-1 dimensions

    # Last dimension calculation to satisfy the hyperplane equation
    last_dimension = -(np.dot(hyperplane_coefficients[:-1], np.append(point, 0)) + hyperplane_coefficients[-1]) / hyperplane_coefficients[-2]
    return np.append(point, last_dimension)

# Example usage
hyperplane_coefficients = np.array([1, 2, -1, 3])  # For a hyperplane in 3D space
random_point = random_point_in_hyperplane(hyperplane_coefficients)
print("Random point in the hyperplane:", random_point)
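Solving for the last coordinate divides by an, which fails when an is 0. A possible variant, sketched below, solves for the coordinate with the largest coefficient in absolute value instead. The function name random_point_with_pivot and the optional rng parameter are assumptions, not part of the original article.

```python
import numpy as np

def random_point_with_pivot(hyperplane_coefficients, rng=None):
    """Pick n-1 coordinates at random and solve for the one with the largest |a_i|."""
    rng = np.random.default_rng() if rng is None else rng
    coeffs = np.asarray(hyperplane_coefficients, dtype=float)
    normal, d = coeffs[:-1], coeffs[-1]
    pivot = int(np.argmax(np.abs(normal)))
    if normal[pivot] == 0:
        raise ValueError("The normal vector must be non-zero.")
    point = rng.random(normal.size)
    point[pivot] = 0.0  # placeholder so it does not contribute to the dot product
    point[pivot] = -(np.dot(normal, point) + d) / normal[pivot]
    return point

# The last coefficient is 0 here, so solving for the last coordinate would divide by zero
coeffs = np.array([1.0, 2.0, 0.0, 3.0])
p = random_point_with_pivot(coeffs, np.random.default_rng(0))
print(np.dot(coeffs[:-1], p) + coeffs[-1])  # 0.0 up to floating-point error
```

Neither this variant nor the original samples uniformly over the hyperplane; both only guarantee that the returned point satisfies the equation.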

Intersection Point of a Vector and a Hyperplane

The intersection point of a vector v and a hyperplane is the point where the line through the origin in the direction of v meets the hyperplane. The intersection point x = (x1, x2, …, xn) is constrained by 2 conditions:

It is on the hyperplane, so a1*x1 + a2*x2 + … + an*xn + d = 0.
It lies on the line along v, so we can represent x as (t*v1, t*v2, …, t*vn), where t is a scaling factor.

So we have 1 variable t and 1 equation:

(a1*v1 + a2*v2 + … + an*vn) * t + d = 0

t = -d / (a1*v1 + a2*v2 + … + an*vn)

x = t * v

There is an edge case: if the vector v is perpendicular to the normal vector, a1*v1 + a2*v2 + … + an*vn is 0. If d is also 0, the line along v lies in the hyperplane and there are infinitely many intersection points; otherwise, there are no intersection points at all.

import numpy as np

def intersection_point_vector_hyperplane(vector, hyperplane_coefficients):
    """
    Finds the intersection point of a vector and a hyperplane.

    Parameters:
    vector (numpy.ndarray): The vector.
    hyperplane_coefficients (numpy.ndarray): Coefficients of the hyperplane (a1, a2, ..., an, d).

    Returns:
    numpy.ndarray: The intersection point, or None if no intersection exists.
    """
    # Extract normal vector and constant term from hyperplane
    normal_vector = hyperplane_coefficients[:-1]
    constant_term = hyperplane_coefficients[-1]

    # Calculate the value of t for intersection
    dot_product = np.dot(normal_vector, vector)
    if np.isclose(dot_product, 0):  # tolerance-based check for floating-point input
        return None  # No intersection or line lies in the hyperplane

    t = -constant_term / dot_product

    # Calculate the intersection point
    intersection_point = t * vector
    return intersection_point

# Example usage
vector = np.array([1, 2, 3])  # Example vector
hyperplane_coefficients = np.array([1, 2, -1, 3])  # For a hyperplane in 3D space
intersection = intersection_point_vector_hyperplane(vector, hyperplane_coefficients)
print("Intersection point:", intersection)
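The function above returns None for both edge cases. If the two need to be told apart, one option is to return a status alongside the point, as in the sketch below. The function name classify_intersection, the status strings, and the tolerance value are assumptions made for illustration.

```python
import numpy as np

def classify_intersection(vector, hyperplane_coefficients, tol=1e-12):
    """Return (status, point) for the line through the origin along `vector`."""
    normal = hyperplane_coefficients[:-1]
    d = hyperplane_coefficients[-1]
    denom = np.dot(normal, vector)
    if abs(denom) > tol:
        return "single point", (-d / denom) * vector
    if abs(d) <= tol:
        return "line lies in hyperplane", None  # infinitely many intersection points
    return "no intersection", None

coeffs = np.array([1.0, 2.0, -1.0, 3.0])
through_origin = np.array([1.0, 2.0, -1.0, 0.0])  # same normal, but d = 0

status_a, x = classify_intersection(np.array([1.0, 2.0, 3.0]), coeffs)
status_b, _ = classify_intersection(np.array([1.0, 0.0, 1.0]), coeffs)
status_c, _ = classify_intersection(np.array([1.0, 0.0, 1.0]), through_origin)

print(status_a, x)  # single point at (-1.5, -3.0, -4.5)
print(status_b)     # no intersection
print(status_c)     # line lies in hyperplane
```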

Which Side of a Hyperplane a Point Lies

To determine on which side of a hyperplane a point lies in n-dimensional space, we use the hyperplane equation and the coordinates of the point. The formula is an extension of the hyperplane equation itself:

Given a hyperplane defined by the equation:

a1*x1 + a2*x2 + … + an*xn + d = 0

and a point P with coordinates (p1, p2, … pn), the side of the hyperplane on which the point lies is determined by the sign of the expression obtained by substituting the point’s coordinates into the hyperplane equation:

f(P) = a1*p1 + a2*p2 + … + an*pn + d

If f(P) > 0, the point lies on one side of the hyperplane.
If f(P) < 0, the point lies on the other side.
If f(P) = 0, the point lies exactly on the hyperplane.

The code is available in this colab notebook.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def determine_side(point, hyperplane_coefficients):
    value = np.dot(hyperplane_coefficients[:-1], point) + hyperplane_coefficients[-1]
    return np.sign(value)

# Hyperplane coefficients
hyperplane_coefficients = np.array([1, 1, -1, 3])

# Specific points
point_on_hyperplane = np.array([1, -2, 2])  # 1 + (-2) - 2 + 3 = 0
point_positive_side = np.array([3, 0, -5])
point_negative_side = np.array([-3, -2, 6])
points = np.array([point_on_hyperplane, point_positive_side, point_negative_side])

# Visualization
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Plot the hyperplane
xx, yy = np.meshgrid(range(-5,6), range(-5,6))
zz = (-hyperplane_coefficients[0] * xx - hyperplane_coefficients[1] * yy - hyperplane_coefficients[-1]) / hyperplane_coefficients[2]
ax.plot_surface(xx, yy, zz, alpha=0.4, color='yellow', label='Hyperplane')

# Plot and check the side for each point
for point in points:
    side = determine_side(point, hyperplane_coefficients)
    if side == 1:
        ax.scatter(*point, color='blue', label='Positive Side')
    elif side == -1:
        ax.scatter(*point, color='red', label='Negative Side')
    else:
        ax.scatter(*point, color='green', label='On Hyperplane')

# Setting plot limits and labels
ax.set_xlim([-5, 5])
ax.set_ylim([-5, 5])
ax.set_zlim([-5, 5])
ax.set_xlabel('X axis')
ax.set_ylabel('Y axis')
ax.set_zlabel('Z axis')

# Show the plot
plt.show()
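When many points need to be classified at once, the same sign test can be applied to a whole array with a single matrix-vector product. The sketch below assumes points are stored one per row; the function name sides_of_hyperplane is an illustrative choice.

```python
import numpy as np

def sides_of_hyperplane(points, hyperplane_coefficients):
    """Return the sign of f(P) for every row P of `points`."""
    points = np.asarray(points, dtype=float)
    normal = hyperplane_coefficients[:-1]
    d = hyperplane_coefficients[-1]
    return np.sign(points @ normal + d)

coeffs = np.array([1.0, 1.0, -1.0, 3.0])
points = np.array([
    [1, -2, 2],   # f = 0: on the hyperplane
    [3, 0, -5],   # f = 11: positive side
    [-3, -2, 6],  # f = -8: negative side
])

sides = sides_of_hyperplane(points, coeffs)
print(sides)  # signs 0, +1, -1 in that order
```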

Conclusion

Hyperplanes are foundational elements in understanding geometric and algebraic properties of n-dimensional spaces. They serve as the backbone for numerous algorithms and theories in data science, optimization, and beyond. By understanding the core principles that define hyperplanes, their normal vectors, and how vectors interact with them, one can gain deeper insights into the structure and behavior of complex, multidimensional spaces. Whether you’re generating random hyperplanes for simulations or dissecting high-dimensional data, the concepts outlined here provide a solid foundation for exploring and leveraging the power of hyperplanes in various applications.

Written by Dagang Wei