Search Like Light Speed — (2) LSH
In “Search like light speed — (1) HNSW,” we have learned Approximate Nearest Neighbors (ANN) and the graph-based Hierarchy Navigable Small World (HNSW) algorithm. In this post, we will learn the hashing-based Local-Sensitivity Hashing (LSH) algorithm. If you have not heard about hashing — do not worry. I will start with the basics to explain what hashing is and its advantages. I then explain why LSH works in high-dimensional spaces. This post covers the following topics:
- What is hashing?
- The advantages of hashing
- Why is hashing useful for similarity search?
- Hashing for similarity search is proven in low-dimensional spaces
- Hashing for similarity search is challenging in high-dimensional spaces
- Why does LSH work in high-dimensional spaces?
- Code examples
After reading this post, you will gain a good understanding of hashing algorithms for similarity search, and you will perform coding on real data. Let’s start!
What is Hashing?
Hashing is the process of converting data into a fixed-length string of letters and numbers. Hashing is a kind of data labeling. It converts data into fixed-length strings called hash values. The data can be another kind of data such as…