Candidate retrieval in a recommendation system

Jun Xie
2 min read · Jul 22, 2022


Candidate retrieval is a critical step in a recommendation system, and one that can take significant engineering effort across different teams to build, scale, and maintain.

Any modern recommendation system has two major stages: candidate retrieval and candidate ranking. Assume there is a large corpus with millions, or even billions, of documents. Given an incoming user request, the retrieval step aims to find hundreds to thousands of similar documents quickly by matching signals between the user and the documents. Given the retrieval output, the ranking step extracts document features and performs heavy ranking using deep learning models; “heavy” here means it incurs much higher latency than the retrieval step. The ranked documents are then sorted by score, and a subset of the top-ranked documents is returned to the user.
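As a toy illustration of this two-stage flow, the sketch below retrieves candidates with a cheap dot-product pass over pre-computed embeddings and then re-ranks the survivors. The corpus size, embedding dimension, and the cosine-similarity “heavy ranker” are all stand-ins, not what a production system would actually use (a real ranker is a deep model over rich features).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical corpus: pre-computed document embeddings, one row per document.
doc_embeddings = rng.normal(size=(10_000, 64)).astype(np.float32)

def retrieve(user_embedding, k=500):
    """Cheap retrieval: one dot-product pass over the whole corpus.
    argpartition selects the top-k without a full sort."""
    scores = doc_embeddings @ user_embedding
    return np.argpartition(scores, -k)[-k:]

def rank(user_embedding, candidate_ids, n=10):
    """Stand-in for the heavy ranker: exact cosine similarity over the
    small candidate set only, followed by a full sort."""
    cands = doc_embeddings[candidate_ids]
    sims = (cands @ user_embedding) / (
        np.linalg.norm(cands, axis=1) * np.linalg.norm(user_embedding)
    )
    order = np.argsort(-sims)
    return candidate_ids[order[:n]]

user = rng.normal(size=64).astype(np.float32)
candidates = retrieve(user, k=500)       # hundreds of candidates, fast
top_docs = rank(user, candidates, n=10)  # small, expensively-ranked subset
```

The key property is that the expensive step only ever touches the few hundred candidates the cheap step surfaced, never the full corpus.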

The retrieval step needs to be blazingly fast given the large amount of data. The industry-standard solution is to transform unstructured data (documents, images, and videos) into vectors/embeddings (semantic representations) using pre-trained models (like BERT) and then index those vectors for fast retrieval. First, the generated vectors are stored in a permanent database (like Bigtable). Two components then build the index and serve similar candidates. The index publisher pulls vectors and other document metadata out of the database and indexes them accordingly; the generated indexes are typically stored in cloud storage. The index server is the serving component: it periodically loads those indexes into memory and serves requests from the in-memory indexes.
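A minimal sketch of the publisher/server split, with loud assumptions: a flat `.npz` file stands in for a serialized ANN index in cloud storage, a Python list stands in for the permanent database, and exact cosine search stands in for a real approximate-nearest-neighbor index.

```python
import os
import tempfile
import numpy as np

def publish_index(db_rows, path):
    """Index publisher: pulls (doc_id, vector) rows from the 'database' and
    writes an index file (stand-in for a serialized index in cloud storage).
    Vectors are unit-normalized so a dot product at query time is cosine
    similarity."""
    ids = np.array([doc_id for doc_id, _ in db_rows])
    vecs = np.stack([vec for _, vec in db_rows]).astype(np.float32)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    np.savez(path, ids=ids, vecs=vecs)

class IndexServer:
    """Index server: loads a published index into memory and answers
    similarity queries from the in-memory copy."""

    def load(self, path):
        data = np.load(path)
        self.ids, self.vecs = data["ids"], data["vecs"]

    def query(self, q, k=3):
        q = np.asarray(q, dtype=np.float32)
        q = q / np.linalg.norm(q)
        scores = self.vecs @ q          # cosine similarity (both unit-norm)
        top = np.argsort(-scores)[:k]   # exact search; real servers use ANN
        return [str(self.ids[i]) for i in top]

# Usage: publish once, then the server reloads periodically.
rng = np.random.default_rng(1)
rows = [(f"doc{i}", rng.normal(size=8)) for i in range(100)]
path = os.path.join(tempfile.mkdtemp(), "index.npz")
publish_index(rows, path)
server = IndexServer()
server.load(path)
results = server.query(rows[0][1], k=3)  # doc0's own vector ranks itself first
```

In production the publisher runs as a periodic batch job and the server swaps in the fresh index without downtime; the file handoff here only illustrates the decoupling between the two.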

The solution above consists of multiple components, which are costly to build and maintain and require cross-team effort. When it comes to scaling up (from handling millions of documents to billions), every component involved has to scale up accordingly. That can be a complicated and daunting initiative for any organization, from small startup to big enterprise. In addition, the cost grows accordingly, often linearly rather than sub-linearly, and the same applies to the operational work.

Overall, candidate retrieval is not easy to build because of these pain points. A lot of companies, like Google, Elastic, Milvus, and Pinecone, are working in this area to make engineering life less painful.

Jun Xie

Founder and ex-Snap software engineer. I am interested in machine learning and databases. Feel free to drop me an email: xiejuncs@gmail.com