Demystifying Spectral Embedding
A Dimensionality Reduction Technique
Hello to you all,
In this blog, we will take all the complexity out of Spectral Embedding, a technique used for non-linear dimensionality reduction. If you are unfamiliar with what dimensionality reduction is and what it is used for, refer to my previous blog. Without further ado, let’s quickly go over the things that we will be covering in this article.
- Introduction to Spectral Embedding & Laplacian Eigenmaps
- Pre-requisites for understanding the algorithm
- The Algorithm
- Additional Resources
Introduction to Spectral Embedding & Laplacian Eigenmaps
Spectral Embedding is a technique used for non-linear dimensionality reduction. Under the hood, the algorithm at work is Laplacian Eigenmaps. Laplacian Eigenmaps is quite similar to Isometric Feature Mapping (also referred to as Isomap); you can find an excellent reference on Isomap towards the end of this section if you are unfamiliar with it. The primary difference between the two lies in their goals: Isomap aims to directly preserve the global (non-linear) geometry, whereas Laplacian Eigenmaps aims to preserve the local geometry (i.e., nearby points in the original space remain nearby in the reduced space).
This technique relies on the basic assumption that the data lies on a low-dimensional manifold in a high-dimensional space. This very statement may cause many readers to stumble, since the term “manifold” is an entire mathematical concept in itself, and a hefty and important one at that. If you are familiar with the concept of manifolds, and at least loosely with the manifold hypothesis, you are good to go; otherwise, you can find an amazingly easy-to-understand reference towards the end of this section.
The Spectral Embedding (Laplacian Eigenmaps) algorithm consists of three stages:
- Constructing the Adjacency Graph
- Choosing the Weights
- Obtaining the Eigenmaps
In the third section (i.e., The Algorithm), we will dive into each of these steps and the different ways in which they can be performed.
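Before dissecting the individual steps, here is what the technique looks like in practice: a minimal sketch using scikit-learn's SpectralEmbedding on the classic Swiss-roll dataset. The parameter values here are illustrative choices of mine, not prescriptions.

```python
# A minimal sketch of Spectral Embedding with scikit-learn.
# Parameter values (n_neighbors, n_components) are illustrative.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import SpectralEmbedding

X, _ = make_swiss_roll(n_samples=500, random_state=0)  # shape (500, 3)

# affinity="nearest_neighbors" corresponds to the graph-construction
# step described in the next sections.
embedder = SpectralEmbedding(n_components=2,
                             affinity="nearest_neighbors",
                             n_neighbors=10,
                             random_state=0)
X_low = embedder.fit_transform(X)
print(X_low.shape)  # 500 points, now in 2 dimensions
```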
Isomap Embedding — An Awesome Approach to Non-linear Dimensionality Reduction
How to use Isometric Mapping to “unroll the Swiss roll”?
Pre-requisites for understanding the algorithm
In this section, we will go over the pre-requisites for understanding Laplacian Eigenmaps. In order to keep the focus on Laplacian Eigenmaps itself, I will be outsourcing the explanation of most of these pre-requisites; there is little benefit in reiterating what is already available elsewhere in a much more comprehensive form.
The Algorithm
In this section, we will take a deep dive into the three primary steps of the algorithm.
1. Constructing the Adjacency Graph
The first step is to construct an adjacency graph from the given data. We put an edge between nodes i and j if the corresponding data points are “close”. Now, there are multiple ways to define “close”. I will stick to the ones defined in the research paper, so that if you want to explore the original research work, you will easily be able to draw up analogies.
- Nearest Neighbors: Two points xᵢ and xⱼ are connected by an edge if xᵢ is among the K nearest neighbors of xⱼ, or vice versa.
- Epsilon Neighborhood: Two points xᵢ and xⱼ are connected by an edge if ‖xᵢ − xⱼ‖ < ε, where ‖·‖ is the usual Euclidean norm.
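Both graph constructions are available in scikit-learn's neighbors module. Here is a sketch of the two “closeness” rules (the neighbor count and radius are arbitrary values picked for illustration):

```python
# A sketch of the two adjacency-graph constructions.
import numpy as np
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # toy data: 100 points in 3-D

# Rule 1: connect each point to its K nearest neighbors,
# then symmetrize so the graph is undirected.
A_knn = kneighbors_graph(X, n_neighbors=5, mode="connectivity")
A_knn = ((A_knn + A_knn.T) > 0).astype(float)

# Rule 2: connect points whose Euclidean distance is below eps.
A_eps = radius_neighbors_graph(X, radius=1.5, mode="connectivity")

print(A_knn.shape, A_eps.shape)  # both are 100 x 100 sparse matrices
```

Note that the raw K-nearest-neighbors relation is not symmetric (xᵢ can be among xⱼ's neighbors without the converse holding), which is why the “or vice versa” symmetrization step is needed.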
2. Choosing the Weights
Now, the next step is to weight the edges that we defined in the first step. Here too, we have two different variations.
- Gaussian Weights: If nodes i and j are not connected, put Wᵢⱼ = 0; otherwise use the formulation Wᵢⱼ = exp[−‖xᵢ − xⱼ‖² / t], where t is a parameter to be chosen.
- 0/1 Weights: Wᵢⱼ = 1 if vertices i and j are connected by an edge, and Wᵢⱼ = 0 otherwise. This simple choice avoids having to pick t.
3. Obtaining the Eigenmaps
After the second step, we have the weight matrix W. From W, we obtain the diagonal weight matrix D, whose entries are the column (or row, since W is symmetric) sums of W, i.e., Dᵢᵢ = ∑ⱼ Wⱼᵢ. Once we have D, we obtain the Laplacian matrix L = D − W.
The Laplacian is a symmetric, positive semi-definite matrix that can be thought of as an operator on functions defined on the vertices of G.
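The construction of D and L, and the two properties just mentioned, can be checked on a tiny hand-built weight matrix:

```python
# A sketch of building D and L = D - W from a symmetric weight matrix.
import numpy as np

W = np.array([[0.0, 0.5, 0.2],
              [0.5, 0.0, 0.0],
              [0.2, 0.0, 0.0]])

D = np.diag(W.sum(axis=0))  # D_ii = sum_j W_ji (column sums)
L = D - W

# L is symmetric and positive semi-definite: all eigenvalues are >= 0,
# and the constant (all-ones) vector lies in its null space.
print(np.linalg.eigvalsh(L))
print(L @ np.ones(3))  # a vector of (numerical) zeros
```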
And now, finally, we pose the generalized eigenvector problem
L f = λ D f
and find its solutions. Once again, if you are interested in learning how to solve generalized eigenvector problems, refer to this great tutorial. But let me assure you, you won’t need it, since scikit-learn will take care of it for you. So, let’s say that we solve this problem, and we obtain its solutions as follows:
L f₀ = λ₀ D f₀
L f₁ = λ₁ D f₁
⋮
L fₖ₋₁ = λₖ₋₁ D fₖ₋₁
where f₀, f₁, …, fₖ₋₁ are the eigenvectors of this problem, ordered according to their eigenvalues, i.e., 0 = λ₀ ≤ λ₁ ≤ … ≤ λₖ₋₁. From these k eigenvectors, we leave out the eigenvector f₀ corresponding to eigenvalue 0, and use the next m eigenvectors to obtain the lower m-dimensional representations, i.e.,
xᵢ → (f₁(i), …, fₘ(i))
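This final step can be sketched with SciPy's generalized symmetric eigensolver, scipy.linalg.eigh, which accepts a second matrix and solves L f = λ D f directly (the tiny path graph here is my own toy example):

```python
# A sketch of the eigenmap step: solve L f = lambda D f, drop the
# trivial eigenvector f_0, and use the next m eigenvectors as coordinates.
import numpy as np
from scipy.linalg import eigh

# Toy graph: a path 0 - 1 - 2 - 3 with 0/1 weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=0))
L = D - W

# eigh(L, D) solves the generalized problem and returns
# eigenvalues in ascending order.
eigvals, eigvecs = eigh(L, D)

m = 1                     # target dimensionality
Y = eigvecs[:, 1:1 + m]   # skip f_0; row i is the embedding of x_i
print(np.round(eigvals, 4))  # the smallest eigenvalue is 0
```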
Though I have pretty much described the entire algorithm behind Laplacian Eigenmaps, if you want to explore it in even greater depth, check out the original research work, the link to which can be found in the last section.
In the previous section, we went over the algorithm in great detail. However, I did not explain the motivation and significance of these steps, or how exactly they lead us to the low-dimensional embeddings. This is for two reasons. The first is that the justification behind these steps is quite involved, and trying to summarize it in this blog would make it exceedingly long; if you want the full justification, I urge you to check out the original research work, which provides it in great detail. The second reason is that I do not feel qualified to summarize the justification provided in the research work.
Nonetheless, let me highlight the justifications provided in the original research work, so that you can decide for yourself whether you want to explore them.
- The authors first show that the embedding provided by the Laplacian Eigenmap algorithm preserves local information optimally in a certain sense.
- Then they provide a justification as to why the eigenfunctions of the Laplace Beltrami operator have properties desirable for embedding.
- And finally, they demonstrate how the Laplace Beltrami operator on differentiable functions on a manifold is intimately related to heat flow.
The Laplacian of a graph is analogous to the Laplace Beltrami operator on manifolds.
Additional Resources
- Below you can find the link to the original research work behind this genius technique. Feel free to give it a read while sipping some tea, or perhaps some coffee.
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
- Below you can find a short and concise description of Spectral Embedding on Scikit-Learn’s User Guide
2.2. Manifold learning
- Below you can find the Scikit-Learn’s implementation of Spectral Embedding
A little about ME 👋
You can safely skip this section, if you have no interest in knowing the author, or you already know me. I promise that there is no hidden treasure in this section 😆.
I am a Machine Learning and Deep Learning enthusiast, and this is my second piece of content on the subject. If you liked it, do put your hands together 👏, and if you would like to read further articles on Machine Learning and Deep Learning, #StayTuned.