Introducing Autofaiss: An Automatic K-Nearest-Neighbor Indexing Library At Scale

An article by Victor Paltz, Romain Beaumont

Criteo R&D Blog · Aug 12, 2021


BIG NEWS: Creating a fast and optimized KNN index has never been easier! Even on hundreds of millions of vectors!

Links: GitHub, Docs, Getting started in 10 seconds

TL;DR: Autofaiss wraps the Faiss library and automatically tunes the best KNN index to maximize recall given your RAM and query speed constraints.

For example, using Faiss’s efficient indices, binary search, and heuristics, Autofaiss can automatically build a large KNN index (200 million vectors, 1TB) in 3 hours, with a low memory footprint (15GB) and millisecond latency (10ms).

Introduction

Let’s start with some definitions:

  • k-nearest-neighbor (KNN): A simple algorithm that searches for the k vectors most similar to a query vector, based on the score given by a similarity function (a minimal sketch follows this list).
  • Index: A data structure containing a transformed version of the initial vectors, designed so that the KNN algorithm can run on it optimally.
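
To make the definition concrete, here is a minimal, exact KNN in pure NumPy. It is purely illustrative: this brute-force scan over every vector is exactly the work that KNN indices are designed to avoid.

```python
import numpy as np

def knn(query: np.ndarray, database: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k database vectors closest to the query."""
    distances = np.linalg.norm(database - query, axis=1)  # L2 distance to every vector
    return np.argsort(distances)[:k]                      # ids of the k smallest distances

database = np.random.rand(10_000, 64).astype("float32")
query = database[0]
print(knn(query, database, k=5))  # the closest vector is the query itself
```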

Then why do we need such a data structure?

Well, it is used in most of the search and recommendation engines that you use every day! It can also power a tremendous number of other applications; the flower image search engine described below is one example.

Demo usage: An image search engine on flowers

Let’s dive into how these recommendation engines work! To illustrate, let’s take the example of a flower image database on which we would like to do an image search. A simple application of such a search engine would be retrieving information about a flower from a picture of it, without knowing its name.

You can try a little demo here!

For such a recommendation engine, the code would be divided into two parts:

Figure 1: Recommendation engine schema for image search on flowers

The first part is the encoding function (the Encoder arrow in Figure 1). It transforms an image into a high-dimensional vector. These vectors are designed so that the distance between two vectors reflects the similarity of the corresponding images. For instance, two pictures of roses would have very close vectors, while the distance between a rose vector and a sunflower vector would be high. The encoding function can be computed using deep learning methods with a BPR loss.
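
As an illustration, here is a minimal sketch of such an encoder, using a pretrained ResNet-50 as a stand-in; the encoder described here, trained with a BPR loss, would of course differ.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Pretrained ResNet-50 with its classifier removed: the 2048-d features
# act as the image embedding (a stand-in for a BPR-trained encoder).
model = models.resnet50(pretrained=True)
model.fc = torch.nn.Identity()
model.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def encode(image_path: str) -> torch.Tensor:
    """Map an image to a single high-dimensional embedding vector."""
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return model(image).squeeze(0)  # shape: (2048,)
```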

The second part of the recommendation algorithm is the KNN index. Given an input query image, the algorithm first computes the corresponding query vector. Then, the KNN index finds the vectors closest to the query vector, which correspond to the most similar images.
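
Putting the two parts together, a minimal sketch with an exact Faiss index might look like this (the flower embeddings are stand-in random data, and encode is the hypothetical encoder sketched above):

```python
import faiss
import numpy as np

flower_embeddings = np.random.rand(10_000, 2048).astype("float32")  # stand-in data

index = faiss.IndexFlatL2(flower_embeddings.shape[1])  # exact (brute-force) L2 index
index.add(flower_embeddings)

query_vector = np.random.rand(1, 2048).astype("float32")  # e.g. encode("rose.jpg")
distances, ids = index.search(query_vector, 5)  # ids of the 5 most similar images
```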

You can find more information about how to create image embeddings in this Medium post. To learn more about semantic search engines, you can look here.

To create your own image search engine in 5 minutes, try this notebook!

A few words on Faiss, the library wrapped by Autofaiss

Faiss is a library for efficient similarity search and clustering of dense vectors. It implements many different state-of-the-art KNN indexing algorithms which are highly tunable.

But why do we need KNN indices? Couldn’t we just compute all the distances and take the top k smallest ones?

In most situations, exhaustive brute-force search has a prohibitive cost in both runtime and memory. As a consequence, much research has gone into designing efficient similarity search algorithms.

The principle behind most of these algorithms is to approximate the nearest neighbor search to reduce the overall algorithm complexity.
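
For example, an inverted file (IVF) index, one of the approximate index types Faiss provides, groups the vectors into clusters and only scans the few clusters closest to the query. A minimal sketch (sizes are illustrative):

```python
import faiss
import numpy as np

d = 128
xb = np.random.rand(100_000, d).astype("float32")

nlist = 1024                                     # number of clusters
quantizer = faiss.IndexFlatL2(d)                 # assigns vectors to clusters
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(xb)                                  # learn the cluster centroids
index.add(xb)

index.nprobe = 16  # scan only 16 of the 1024 clusters at query time
distances, ids = index.search(xb[:5], 10)  # approximate 10-NN search
```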

Yet, while many algorithms solve the approximate nearest neighbor search in decent time, the amount of data they store in RAM can be prohibitive, which makes them incompatible with scalable applications handling hundreds of millions of vectors.

Faiss also implements quantization-based algorithms that can drastically reduce the RAM used while keeping a good trade-off between recall and query time.

We managed to build a KNN index of only 10GB from a 1TB dataset of 200M images, with a search time of only 10ms and a good recall, in just 3 hours!

Such a compression ratio and query speed wouldn’t be possible without product quantization.

The main idea behind product quantization is to split the original vector space into many smaller sub-spaces. Then, for each of these sub-spaces, we find 256 cluster centers that represent the data inside it well. Each sub-vector is then stored as the number of its closest cluster center. Consequently, instead of storing a floating-point sub-vector, you just need to store a single byte, and this trick greatly reduces the amount of RAM required to store the vectors.

Figure 2: Product quantization explained

In the example of Figure 2, the compression ratio is x16. It is possible to compress the data even more, but too much compression would decrease the recall of the similarity search, and the compressed vectors would reconstruct the original vectors too poorly.
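
Here is a minimal sketch of product quantization with Faiss, matching the x16 ratio above: 128 float32 values (512 bytes) are compressed into 32 one-byte codes. The dimensions and sub-space count are illustrative.

```python
import faiss
import numpy as np

d = 128      # original vector: 128 float32 values = 512 bytes
m = 32       # number of sub-spaces, each encoded on one byte
nbits = 8    # 8 bits per code -> 256 cluster centers per sub-space

xt = np.random.rand(10_000, d).astype("float32")  # stand-in training data

index = faiss.IndexPQ(d, m, nbits)
index.train(xt)  # learn the 256 centroids of each sub-space with k-means
index.add(xt)    # each vector is stored as 32 bytes: a x16 compression

distances, ids = index.search(xt[:5], 10)  # approximate 10-NN search
```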

Introducing Autofaiss, the automatic KNN index builder

Now that you know why KNN indices are so useful and why Faiss is a great indexing library, let’s look at why it is still quite challenging to use.

There are hundreds of algorithm combinations that you can use in Faiss, and each of them can require up to 6 hyperparameters that define how the index is built. In addition, other hyperparameters need to be set to optimize the trade-off between the recall and the search time on the index. Making it all work can be really long and complicated.

Most users don’t understand the subtleties of the different indices and hyperparameters. They often end up spending a lot of time creating sub-optimal indices!
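
To give a feel for this search space, here is a sketch of what manual Faiss tuning looks like; the index_key string below is one arbitrary combination among hundreds (an OPQ pre-transform, an IVF coarse quantizer built on an HNSW graph, and product quantization for storage):

```python
import faiss
import numpy as np

d = 256
xt = np.random.rand(50_000, d).astype("float32")

# One combination among hundreds, each piece carrying its own hyperparameters.
index = faiss.index_factory(d, "OPQ16_64,IVF1024_HNSW32,PQ16x8")
index.train(xt)
index.add(xt)

# Building the index is only half the work: query-time hyperparameters such
# as nprobe still control the recall / search-time trade-off.
faiss.ParameterSpace().set_index_parameter(index, "nprobe", 64)
distances, ids = index.search(xt[:5], 10)
```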

Let’s take the example of our dataset of 200M image embeddings (1TB). Figure 3 shows the trade-off between the index size and the recall at a given query speed constraint (10ms latency).

Figure 3: Benchmark of different indices with size VS recall trade-off at fixed query speed (10ms)

That’s where Autofaiss comes in!
At Criteo, we too spent weeks calibrating indexing algorithms to find the best trade-offs between query speed, index size, recall, and index construction time. The result of this effort is Autofaiss, an indexing library that builds optimal indices given query speed and RAM usage constraints.

Github: https://github.com/criteo/autofaiss

The library is very simple to use: a single call does everything for you.
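
A minimal sketch following the project’s quick-start; the paths and memory budgets below are illustrative:

```python
from autofaiss import build_index
import numpy as np

embeddings = np.random.rand(1_000, 100).astype("float32")

build_index(
    embeddings=embeddings,                # or a path to a folder of .npy files
    index_path="knn.index",               # where the optimized index is written
    index_infos_path="index_infos.json",  # metrics of the chosen index
    max_index_memory_usage="4G",          # RAM budget for the final index
    current_memory_available="16G",       # RAM available to build it
)
```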

You can try creating your own index in Google Colab from here!

If you want more transparency on how Autofaiss chooses the best index, you can use this demo to play with different parameters.

Figure 4: The index selection demo (link)

It is also possible to tune an index of your choice with Autofaiss by providing the Faiss index_key parameter in the CLI :)
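
For instance, a hypothetical CLI invocation might look like this; the index_key value is one arbitrary Faiss factory string, chosen for illustration:

```
autofaiss build_index --embeddings="embeddings" --index_path="knn.index" \
    --index_infos_path="index_infos.json" --index_key="OPQ16_64,IVF1024_HNSW32,PQ16x8"
```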

Conclusion

KNN indices are key components of machine learning and recommendation engines. Some libraries implement powerful algorithms to build them, but the ramp-up is long and difficult. With Autofaiss, we are opening this technology to everyone and providing tools to automatically create optimized indices based on RAM usage and query speed constraints.

We hope this library will simplify the work of hundreds of researchers and developers and that amazing new projects will be enabled thanks to it!

Links: GitHub, Docs, Getting started in 10 seconds
