Search Like Light Speed — (2) LSH

Published in

Dataman in AI

16 min readOct 21, 2023

In “Search like light speed — (1) HNSW,” we have learned Approximate Nearest Neighbors (ANN) and the graph-based Hierarchy Navigable Small World (HNSW) algorithm. In this post, we will learn the hashing-based Local-Sensitivity Hashing (LSH) algorithm. If you have not heard about hashing — do not worry. I will start with the basics to explain what hashing is and its advantages. I then explain why LSH works in high-dimensional spaces. This post covers the following topics:

What is hashing?
The advantages of hashing
Why is hashing useful for similarity search?
Hashing for similarity search is proven in low-dimensional spaces
Hashing for similarity search is challenging in high-dimensional spaces
Why does LSH work in high-dimensional spaces?
Code examples

After reading this post, you will gain a good understanding of hashing algorithms for similarity search, and you will perform coding on real data. Let’s start!

What is Hashing?

Hashing is the process of converting data into a fixed-length string of letters and numbers. Hashing is a kind of data labeling. It converts data into fixed-length strings called hash values. The data can be another kind of data such as…

Search Like Light Speed — (2) LSH

Written by Chris Kuo/Dr. Dataman