# Demystifying Spectral Embedding

## A Dimensionality Reduction Technique

## Hello to you all,

In this blog, we will be taking all the complexity out from **Spectral Embedding**, a technique used for non-linear dimensionality reduction. If you are unfamiliar with what is Dimensionality Reduction and it’s purposes, then refer to my previous *blog*. Without any further ado, let’s quickly go over the things that we will be covering in this article.

# Overview

- Introduction to Spectral Embedding & Laplacian Eigenmaps
- Pre-requisites for understanding the algorithm
- Algorithm
- Intuition
- Additional Resources

# Introduction to Spectral Embedding & Laplacian Eigenmaps

Spectral Embedding is a technique used for **non-linear dimensionality reduction**. Under it’s hood, the algorithm in action is **Laplacian Eigenmaps**. Laplacian Eigenmaps is considerably similar to **Isometric Feature Mapping** (also referred to as Isomap). You can find an amazing reference to Isomap towards the end of this section, if you are unfamiliar with it. The **primary difference** between Isomap and Laplacian Eigenmaps is that the goal of Isomap is to directly preserve the global (non-linear) geometry, but the goal of Laplacian Eigenmaps is to preserve the local geometry (i.e., nearby points in the original space remain nearby in the reduced space).

This technique relies on the basic assumption that the **data lies in a low-dimensional manifold in a high-dimensional space**. Now, this very statement, may lead many readers to stumble, since the term “manifold” here is a entire mathematical concept in itself, that too a hefty and important one. If you are familiar with the concept of **manifolds **and slightly with the concept of **manifold hypothesis**, then you are good to go, otherwise, you can find an amazingly easy-to-understand reference for the same, towards the end of this section.

The Spectral Embedding (Laplacian Eigenmaps) algorithm consists of three stages:

- Constructing the Adjacency Graph
- Choosing the Weights
- Obtaining the Eigenmaps

In the third section (i.e., Algorithm), we will dive into each of these steps and see how we can perform these steps in different ways.

# Pre-requisites for understanding the algorithm

In this section, we will go over the pre-requisites for understanding Laplacian Eigenmaps. In order to keep the focus revolving around Laplacian Eigenmaps, I will be out-sourcing the explanation for most of these pre-requisites. Furthermore, there’s no benefit in reiterating over what is already available, that too in a much more comprehensive manner.

# Algorithm

In this section, we will take a deep-dive into the three primary steps of the algorithm.

## 1. Constructing the Adjacency Graph

The first step is to construct an adjacency graph based on the given data. We put an edge between nodes i and j if the corresponding data-points are “close”. Now, there are multiple ways to define “close”. I will be sticking to the ones defined in the research paper, so that if you want to explore the original research work, you will easily be able to draw up analogies.

*Nearest Neighbors:*Two points, xᵢ and xⱼ are connected by an edge if one is among the K-Nearest Neighbors of each other.*Epsilon Neighborhood:*Two points, say xᵢ and xⱼ, are connected by an edge if Norm(xᵢ-xⱼ) < eps, where Norm(x) is the usual Euclidean Norm

## 2. Choosing the Weights

Now, the next step is to weight the edges, which we have defined in the first step. Here too, we have 2 different variations.

*Gaussian Weights*: If nodes i and j are not connected then put Wᵢⱼ = 0, otherwise use the formulation, Wᵢⱼ = exp[-||xᵢ - xⱼ||² / t]*0/1 Weights*: Wᵢⱼ = 1 if vertices i and j are connected by an edge, otherwise put Wᵢⱼ = 0.

## 3. Obtaining the Eigenmaps

After the second step, we will have the weight matrix (W) with us. Using W, we will obtain the diagonal weight matrix (D), whose entries are column (or row, since W is symmetric) sums of W, i.e., Dᵢᵢ = ∑ⱼ Wⱼᵢ. Once we have obtained D, we will obtain the Laplacian Matrix (L), where L = D-W.

Laplacian is a symmetric, positive semi-definite matrix which can be thought of as an operator on functions defined on vertices of G.

And now, finally, we pose the generalized eigenvector problem,

Lf=λDf

and find out the solutions to this problem. Once again, if you are interested in learning how to solve generalized eigenvector problems, refer to this great *tutorial*. But let me assure you, you won’t be needing it, since scikit-learn will take care of it for you. So, let’s say that we solve this problem, and we obtain the solutions of this problem as follows:

L f₀ =λ₀Df₀

L f₁ =λ₁Df₁

…

L fₖ -₁ =λₖ -₁Dfₖ -₁

where, f₀,f₁ … fₖ -₁, represent the eigenvectors to this problem, ordered according to their eigenvalues, i.e., 0 = λ₀ ≤ λ₁ … ≤ λₖ -₁. From these k eigenvectors, we leave out the eigenvector f₀ corresponding to eigenvalue 0, and use the next *m* eigenvectors for obtaining the **lower m-dimensional representations**, i.e.,

xᵢ = [f₁(i), … fₘ(i)]

Though, I have pretty much described the entire algorithm behind Laplacian Eigenmaps, still, if you want to explore it in even a greater depth, check-out the original research work, the link to which can be found in the last section.

# Intuition

In the previous section, we went over the algorithm in a great detail. However, I did not provide any explanation as to the motivation and significance of these steps, and how these exact steps lead us to the low-dimensional embeddings. This is primarily because of 2 reasons. The first is that the justification behind these steps is pretty comprehensive, and trying to summarize it in this blog, would make it exceedingly long. So, if you want to get the justification, I urge you to check out the original research work, which provides the justification in a great detail. And the second reason is that, I am not very qualified to summarize the justification provided in the research work.

Nonetheless, let me highlight the justifications provided in the original research work, so that you can decide for yourself, if you want to explore them or not.

- The authors first show that the embedding provided by the Laplacian Eigenmap algorithm preserves local information optimally in a certain sense.
- Then they provided a justification as to why the eigenfunctions of the Laplace Beltrami operator have properties desirable for embedding.
- And finally, they demonstrated how the Laplace Beltrami operator on differentiable functions on a manifold is intimately related to the heat flow.

The Laplacian of a graph is analogous to the Laplace Beltrami operator on manifolds.

# Additional Resources

- Below you can find the link to the original research work behind this genius technique. Feel free to give it a read while sipping some tea or perhaps some coffee.

- Below you can find a short and concise description of Spectral Embedding on Scikit-Learn’s User Guide

- Below you can find the Scikit-Learn’s implementation of Spectral Embedding

# A little about ME 👋

You can safely skip this section, if you have no interest in knowing the author, or you already know me. I promise that there is no hidden treasure in this section 😆.

I am a Machine Learning and Deep Learning Enthusiast, and this is my second piece of content based on the same. If you like it, do put your hands together 👏 and if you would like to read further articles based on Machine Learning and Deep Learning *#StayTuned.*