Topological Data Analysis with Spectral Analysis of Combinatorial Laplacians

Alexander Del Toro Barba (PhD)
15 min readAug 9, 2024

--

Photo by SIMON LEE on Unsplash

Combinatorial Laplacian for Topological Data Analysis

In the first part of our series we talked about how to do Topological Data Analysis with Persistent Homology. Now we look at Combinatorial Laplacians for identifying topological invariants.

We need tools from graph theory (Laplacian), linear algebra (Eigenvalues), and topology (Betti numbers). tldr: The number of zero eigenvalues of the combinatorial Laplacian corresponds to the number of connected components of a graph, which is the zeroth Betti number (β₀).

Graph Laplacian (L) is the starting point. Also called Laplacian matrix, it represents a graph and is constructed from the adjacency matrix (which describes connections) and the degree matrix (which tracks how many neighbors each node has).

An example of the Laplacian matrix of a simple network (n = 4) (Source)

Combinatorial Laplacian is a generalization of the graph Laplacian. It is a matrix associated with a graph that gives insights into the properties and structure of the graph. The combinatorial Laplacian of a graph G with vertex set V and edge set E is defined as L = D — A, where D is the degree matrix and A is the adjacency matrix.

Combinatorial Laplacian that talks about connectivity of (k+1) cliques for βₖ

Eigenvalues and Eigenvectors of the combinatorial Laplacians help us to detect topological invariants, the Betti numbers. Eigenvalues and associated Eigenvectors satisfy the equation L × v = λ × v, where λ is an eigenvalue and v an eigenvector. There are two important concepts we will need:

  • Algebraic Multiplicity is the number of times an eigenvalue appears as a root of the characteristic polynomial of the matrix.
  • Kernel (Nullspace) is the set of all vectors that, when multiplied by the Laplacian matrix, result in the zero vector (0).

Spectral theory helps us to determine the Eigenvalues. Eigenvalues are λ ≤ λ ≤ λ≤ · · · ≤ λₙ-₁, where n is the number of nodes (or vertices) in a graph.

Spectral gap is used to measure the connectivity of a system. It is defined as the difference between the smallest and second-smallest eigenvalues of a normalized Laplacian matrix of graph. The smallest eigenvalue is always zero for connected graphs, and a graph with large spectral gap λ>> 0 is well-connected. We know that the smallest nonzero eigenvalue of combinatorial Laplacian β₁ is directly linked to the number of connected components in the data, and subsequent non-zero eigenvalues relate to higher-order topological features, like the number of holes in the data measured by the second smallest non-zero eigenvalue etc.

Cheeger’s inequality is the origin of this concept and demonstrates that the magnitudes of the small non-zero eigenvalues of the graph Laplacian characterize the connectedness of the graph. Similar results hold for combinatorial Laplacians in higher dimensions, where the eigenvalues reflect topological features of the simplicial complex, such as connected components, cycles, and higher-dimensional holes.

Betti numbers β are topological features, or invariants, we are interested in. Betti numbers measure the number of connected components, holes, and voids. For example, Connected Components are a part of a graph where you can reach any node from any other node by following edges.

  • Zeroth Betti numbers indicate number of connected components,
  • First Betti numbers indicate number of holes,
  • Second Betti numbers indicate number of voids,
  • Higher Betti numbers indicate number of higher dimensional holes.

Connected Components = Zeroth Betti number β₀

Let us focus on the connected components first, before we go higher: A connected component in a graph is a subset of vertices where you can reach any vertex from any other vertex within that subset by following edges. In simpler terms, it is a group of vertices that are linked together directly or indirectly. And the zeroth Betti number (β₀) specifically represents the number of connected components in the data.

A graph with three components (Source)

Algebraically, the number of zero eigenvalues of combinatorial Laplacians corresponds to the number of connected components of a graph. This is a fundamental result in spectral graph theory.

The kernel of the Laplacian can be used to count the number of connected components in a simplicial complex.

This is because the kernel of the combinatorial Laplacian captures connectedness by focusing on vectors orthogonal to all eigenvectors with eigenvalue 0, which remain unchanged under the action of the Laplacian. In TDA this directly relates to the homology groups of the underlying simplicial complex, allowing for the calculation of Betti numbers.

This means that the number of connected components in the graph is captured by the the dimension of the kernel (null space or set of solutions) of the Laplacian, as well as the algebraic multiplicity of the 0 eigenvalue (=how many times the same Eigenvalue). In more detail:

  • Zero Eigenvalue and Connected Components: The Laplacian always has an eigenvalue of zero (λ = 0). The number of times zero appears as an eigenvalue (=its algebraic multiplicity) directly tells you how many separate, disconnected pieces (connected components) your graph has.
  • Nullspace and Connected Components: The eigenvectors associated with the zero eigenvalue form a special space called the nullspace (kernel) of the Laplacian. The dimension of this space (how many independent vectors it contains) is also equal to the number of connected components in the graph.

A large zeroth Betti number (β₀) directly translates to a large number of connected components. In the context of topological data analysis a connected component is considered a topological invariant and relates to the idea that the number of connected components (and thus the zeroth Betti number) remains unchanged under continuous deformations like stretching or bending, as long as no tearing or gluing occurs.

A large number of connected components suggests that the data is more fragmented or dispersed, with many distinct groups or clusters. This could be interesting when looking for diversity or heterogeneity in the data, or when trying to identify outliers or anomalies. A small number of connected components could indicate that the data is relatively well-connected and perhaps even clustered together.

You can imagine a graph as a map of islands connected by bridges: Each separate island represents a connected component. The number of islands tells you how many times zero is an eigenvalue of the Laplacian and also the dimension of the nullspace. The nullspace holds special vectors that essentially highlight each individual island within your graph.

Deciding if a combinatorial Laplacian has a trivial or non-trivial kernel (i.e., Betti number zero or non-zero) is hence an important task in TDA. If β₀= 0, it might indicate a single, well-connected cluster. If β₀> 0, it suggests the presence of multiple clusters.

Holes = β₁ , Voids = β₂ and Higher Betti numbers βₖ

Now we move higher: the first Betti number (β₁) characterizes the number of one-dimensional holes or loops in a complex, the second Betti number (β) characterizes the the number of two-dimensional voids in a complex.

For example, the kernel of a combinatorial Laplacian of a circle is one-dimensional, which corresponds to the circle’s single Betti number, β₁ = 1. This reflects the connectedness of the circle as a topological space. For a torus β₁ = 2, since it exhibits have two holes in this dimension.

For a torus, the first Betti number is β₁ = 2 , which can be thought of as the number of circular “holes” (Source)

Now you see why mathematicians say that Betti numbers represent the number of “holes” of different dimensions in the simplicial complex. And the dimension of the kernel at a specific ‘k’ corresponds to a single Betti number (not the sum of all Betti numbers, just to be clear!)

Now we reached the core, the pièce de résistance of TDA when using combinatorial Laplacians, when we generalize to arbitrary dimensions:

The number of k-simplices in the kernel of the k-th Laplacian is equal to the number of k-dimensional holes in the simplicial complex.

The k-th Laplacian Δₖ​ is an operator that plays an important role in understanding the topological features of the complex. It is defined in terms of the boundary operators ∂ₖ that maps k-simplices to (k−1)-simplices, where ∂ₖ is applied to a k-simplex to give a formal sum of its (k−1)-dimensional faces, with appropriate orientation and coefficients.

Spectral Analysis of Combinatorial Laplacians

Let us look at an example on how to use the combinatorial Laplacian and spectral analysis to determine the topological features of a simplicial complex efficiently.

The k-th Betti number βₖ ​is the dimension of the k-th homology group Hₖ​​

In terms of the combinatorial Laplacian, the Betti number βₖ​​ is equal to the dimension of the kernel of k-th Laplacian:

Betti number βₖ​​ is equal to the dimension of the kernel of k-th Laplacian Lₖ in combinatorial Laplacian

The k-th combinatorial Laplacian Δis constructed using boundary operators ∂ₖ for each dimension k (it’s better to use Δₖ instead of Lₖ​, because Lₖ​ is more often reserved for the so-called Hodge Laplacian):

Combinatorial Laplacian

Higher-order Betti numbers can be computed using spectral analysis of the Laplacian matrices associated with a simplicial complex:

  1. Construct Chain Complex: Consider a simplicial complex K. For each dimension k, you have a chain group Cₖ consisting of formal sums of k-simplices. Identify all simplices of simplicial complex. For each dimension k, form the chain groups Cₖ​, which are vector spaces generated by the k-simplices.
  2. Compute Boundary and Coboundary Operators: Compute the boundary operators ∂ₖ for each dimension k, which maps k-simplices to (k−1)-simplices. The boundary operator ∂ₖ : Cₖ → Cₖ​₋₁ maps k-simplices to (k−1)-simplices. Then compute the coboundary operator δₖ : Cₖ​₋₁ → Cₖ which is the adjoint of the boundary operator, mapping (k−1)-simplices to k-simplices. Boundary and Coboundary Operators define how simplices in different dimensions are related, with boundary operators mapping higher-dimensional simplices to their boundaries and coboundary operators serving as their adjoints.
  3. Compute Laplacians: Using boundary operators ∂ and their adjoints (transposes), construct the k-th combinatorial Laplacian Δₖ​ = ∂ₖ​​₊₁ ​δₖ ​​+ δₖ​​₋₁ ​∂ₖ for each dimension​. Note that δₖ​ ​is the adjoint (transpose) of the boundary operator ∂ₖ​​₋₁. For a given simplicial complex, construct the Laplacian matrices for each dimension k: Δₖ = ∂ₖ ∂ₖ₊₁​ᵀ + ∂ₖᵀ∂ₖ. The k-th combinatorial Laplacian is defined as: Lₖ​ = dₖ​₋₁ δₖ​​₋₁ + ​δₖ​dₖ​ or equivalently Δₖ = ∂ₖ₊₁​δₖ + δₖ​​₋₁∂ₖ since δₖ = ∂ₖ​​₋₁ᵀ and δₖ​​₋₁ = ∂ₖᵀ. This operator acts on k-chains. The k-th Laplacian combines information about how k-simplices relate to (k+1)-simplices and (k−1)-simplices, which provides a way to analyze topology of the simplicial complex with linear algebra.
  4. Perform Spectral Analysis and Identify Zero Eigenvalues: Compute the eigenvalues of each Laplacian matrix Δₖ = eigenvalue decomposition to obtain spectrum of eigenvalues for each Δₖ. Determine the dimension of the kernel = the number of zero eigenvalues. Count the number of zero eigenvalues for each Laplacian Δ. The number or multiplicity of zero eigenvalues in the spectrum of Δₖ​ corresponds to the dimension of the kernel of Δ which is exactly the k-th Betti number βₖ. The higher order Betti numbers βₖ​ represent the rank of the k-th homology group, which indicate the number of k-dimensional holes in a simplicial complex.
A complete example of Topological Data Analysis (Source)

By using the combinatorial Laplacian and spectral analysis, one can leverage the eigenvalues of the Laplacian matrices associated with the simplicial complex to identify the dimensions of the homology groups. It provides a way to understand the topological features such as computing higher-order Betti numbers.

Example Calculation

Let us walk through an example of using the combinatorial Laplacian and spectral analysis to compute Betti numbers for a simple simplicial complex to get insight into the topological features of the complex.

Consider a simplicial complex K consisting of the following:

  • Vertices: v, v, v
  • Edges: e= (v, v), e = (v, v), e = (v, v)
  • A single 2-simplex (triangle): t₀ = (e, e, e)
2-simplex (triangle)

Before we start, a crucial point is the orientation of the simplices. Each simplex is assigned an orientation, which determines the signs in the boundary operators and the boundary matrix:

We orient the edge from vᵢ to vⱼ. The boundary operator ∂​ maps edges (1-simplices) to vertices (0-simplices) according to this orientation: For an edge eₖ = (vᵢ , vⱼ), oriented from vᵢ​ to vⱼ​, the boundary operator ∂​ is:

Boundary operator for ∂₁

This means ∂​(eₖ​) should have a +1 for vⱼ​ and a -1 for vᵢ​​.

For the simplicial complex with vertices v, v, vand edges: e= (v, v), e = (v, v), e = (v, v), the boundary operator ∂​​ is calculated as follows:

  • Edge e= (v, v) → ∂​​(e​) = v​− vwith −1 for v​ and +1 for v
  • Edge e = (v, v) ∂₁ (e₁​) = v− vwith −1 for v​ and +1 for v
  • Edge e = (v, v) ∂₁ ​(e₂​) = v​ − vwith −1 for v​ and +1 for v

Step 1: Construct Chain Complex

a) Chain Groups:

  • C​ (0-chains): generated by v, v, v
  • C​ (1-chains): generated by e₀, e, e
  • C​ (2-chains): generated by t

b) Boundary Operators:

Boundary operator (Source)

​: C​ → C:

  • ∂₁ ​(e₀​) = v​− v
  • ∂₁ (e₁​) = v− v
  • ∂₁ ​(e₂​) = v​ − v

​ ​: C→ C ​:

  • ∂₂​ (t₀​) = e​ + e ​− e
Illustration of boundary operators ∂, chain, cycle, and boundary groups along with their images under the boundary operators. The black dot is the empty set ∅

Step 2: Compute Boundary Matrices

a) Boundary Matrix ∂₁:

  • Columns correspond to edges e₀, e, e
  • Rows correspond to vertices v, v, v
  • Arranging edges and vertices, we construct the boundary matrix ∂:
Column 1 = e= (v, v) | Column 2 = e = (v, v) | Column 3 = e = (v, v)

b) Boundary Matrix ∂₂:

  • Boundary matrix ∂₂​ (t₀​) = e​ + e ​− e
  • Columns correspond to the single 2-simplex t₀ = (e, e, e)
  • Rows correspond to edges e₀, e, e₂ = 1, 1, -1

Step 3: Compute Laplacians

a) 0-th Laplacian L₀ or Δ:

b) 1-th Laplacian L₁​ or Δ₁:

Step 4: Perform Spectral Analysis

Now we compute the eigenvalues of the matrix L₀ and L₁​ by solving the so-called characteristic equation where λ represents the eigenvalues, I is the identity matrix, and L₀​ is the target matrix:

a) 0-th Laplacian L₀: Compute eigenvalues:

  • Eigenvalues: 0,3,3
  • The number of zero eigenvalues = 1, thus β=1

b) 1-th Laplacian L₁: Compute eigenvalues:

  • Eigenvalues: 0,3,6
  • The number of zero eigenvalues = 1, thus β=1.

Summary of Betti Numbers

  • β= 1 : There is one connected component.
  • β= 1 : There is one loop (1-dimensional hole).
  • β= 0 : There are no 2-dimensional holes (since the single 2-simplex forms the whole space in this simple case).
2-simplex (triangle)

Persistent Homology: Moving from one simplicial complex to a chain complex across different filtrations ϵ

In the example above we calculated Betti numbers for one filtration. Given a distance threshold ϵ between vertices, we constructed one simplicial complex Kϵ, and for different dimensions Cₙ (for example C​ generated by v, v, v₂, and C​ generated by e₀, e, eand C​ generated by t).

If we increase the distance of ε, then include more points and make the entire calculation of Betti numbers for different dimensions again, we create a chain complex across different filtrations with each distance ϵ.

A chain complex is a sequence of chain groups Cₖ​ (groups of k-simplices) connected by boundary operators ∂ₖ​.

Simplicial complex at a given distance ϵ (vertical) and chain complex with different distances ϵₙ (horizontal). In each step, Betti numbers are computed for different dimensions β₀, β₁, β₂ etc. This process leads us to persistent homology, where the stability of topological invariants is monitored.

This would be the entire TDA process then. Each step in the filtration requires constructing a new simplicial complex and performing the entire calculation of Betti numbers for different dimensions. By analyzing the changes in these Betti numbers across different ϵ-values, persistent homology provides a comprehensive picture of the topological features of the data across multiple scales:

Illustration of data from a point cloud data and the corresponding barcode plots of persistent features

This is the example workflow of Topological Data Analysis! In topological data analysis (TDA), we work with a filtration, which is a sequence of nested simplicial complexes built at increasing distance thresholds (ε-values). Each step in this filtration involves constructing a simplicial complex at a particular ε-value and then computing the Betti numbers for that complex. Generally, we compute Betti Numbers for Kϵ:

  • For the constructed simplicial complex Kϵ​ at different filtrations ϵ​, compute the chain groups Cₖ, boundary operators ∂ₖ​​, and Laplacians Lₖ
  • Perform spectral analysis on Laplacians to determine the eigenvalues and thus compute the Betti numbers βₖ​ for each dimension k.
  1. Given a Set of Points:
  • Start with a set of data points in a metric space

2. Filtration Step 1 (ε = 0.5):

  • Construct simplicial complex Kϵ = K₀.
  • Compute ∂ₖ​ and Lₖ such as ​: C​ → Cand ∂₂​ : C₂ → Cetc.
  • Compute Betti numbers β₀​, β₁, β₂ etc. from Eigenvalues

3. Filtration Step 2 (ε = 1.0):

  • Increment the distance threshold ϵ to ϵ′, where ϵ′ > ϵ
  • Construct new simplicial complex Kϵ′ = K₁.
  • Compute boundary operators ∂ₖ​​, and Laplacians Lₖ
  • Compute Betti numbers β₀​, β₁, β₂ etc. from Eigenvalues

4. Continue for Increasing ε-values:

  • Repeat the process for ϵ = 1.5 , 2.0 , …

Standard Homology βₖ vs Reduced Homology β̃ₖ

Now there is one more twist: We saw that there is a strong relationship between the kernel of the combinatorial Laplacian and the Betti numbers of a simplicial complex. The kernel of a linear operator (like the Laplacian) is the set of all vectors that the operator maps to zero. The dimension of the kernel is the number of linearly independent vectors in this set.

The connection lies in the fact that the kernel of the combinatorial Laplacian at a specific dimension ‘k’ is related to the k-th Betti number. But the exact relationship depends on whether we are using standard homology or reduced homology. So far we considered standard homology.

Reduced Homology is motivated by the intuition that all of the homology groups of a single point should be equal to zero. This affects our understanding of the relationship between the number of k-th eigenvalues, the number of topological features like connected components, and the number of k-th betti number. The number of zero eigenvalues of the Laplacian corresponds to the dimension of the kernel of the Laplacian. But there is a difference in the number of connected components for the zeroth Betti number:

  • In standard homology, the number of zero eigenvalues (counting multiplicities) in the original Laplacian matrix (unreduced) equals the zeroth Betti number (β₀), represents the number of connected components in a simplicial complex.
  • In reduced homology, the number of zero eigenvalues (counting multiplicities) in the reduced Laplacian matrix equals the zeroth Betti number (β̃₀), represents the number of connected components minus one in a simplicial complex (β₀ — 1).

The reduced Laplacian is obtained by removing a row and a column from the original Laplacian. This process effectively “collapses” one of the connected components, reducing the number of zero eigenvalues by one.

To summarize for reduced homology, which is often used in TDA: Number of connected components β₀ = β̃₀ + 1 with following relations:

  • The number of zero eigenvalues of the Laplacian =
  • the dimension of the kernel of the Laplacian =
  • the zeroth Betti number (β̃₀) =
  • the number of connected components β₀ minus one

Constructing filtrations to extract Betti numbers in topological data analysis is computationally intensive, especially for larger datasets or more complex spaces. This is a potential angle for quantum computing. We will look into quantifying quantum advantage for TDA in part 4 of this series.

Disclaimer: The views and opinions expressed in this article are my own and do not necessarily reflect those of my employer.

--

--