Zheng Liu | Pinterest tech lead, Search
Pinterest Search handles billions of queries every month and returns nearly four billion Pins every day. Over the past year, monthly mobile text searches have increased 40 percent and visual searches are up nearly 60 percent. We recently made Search and Lens more front and center on our apps by launching them on home feed, because nearly 85 percent of searches now happen on mobile devices.
To continue scaling search, our system needs to find the most relevant results among more than 100 billion Pins for every Pinner. Previously, our search system was built on top of Lucene and written in Java. But as we’ve grown and introduced new discovery features, our legacy system faced challenges and could no longer support us. That’s why we built Manas, a customized full stack search system written in C++ that dramatically decreased latency while increasing capacity. In this post we’ll give an overview of Manas’ architecture and a look at what’s next for Pinterest Search.
As search usage on Pinterest has quickly grown, our Lucene-based solution increasingly faced challenges, including:
- The query volume and index size grew so fast that we needed to reduce serving latency and improve capacity.
- Besides Search, this system powered multiple use cases inside Pinterest, including Pinner Search, board Search, Related Pins, home feed recommendations and more. We needed the flexibility to customize the search process, which wasn’t possible before.
- We wanted to apply the system to complicated and powerful ranking models, but the Lucene index format and scorer interface weren’t suitable for these models.
- We also wanted to personalize search results, which the standard Lucene system couldn’t support.
We built Manas to solve these challenges. Manas is designed as a generic search framework with high performance, high availability and high scalability. Compared to the old system, Manas has cut search backend latency in half while increasing capacity by 30 percent.
Manas is a full stack search indexing and serving system. The serving system consists of several stages: query understanding, candidate retrieving, lightweight scoring, full scoring and blending.
The Manas index includes an inverted index and a forward index.
Like a common inverted index, the Manas inverted index stores the mapping from each term to a list of postings. Each posting records the internal doc ID and a payload. To optimize index size and serving latency, we implemented dense posting lists and split posting lists, which are two ways to encode a posting list based on the distribution of key terms among all documents. The inverted index is used for candidate generation and lightweight scoring.
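As a rough illustration of the term-to-postings mapping described above, here is a minimal Python sketch. It is not the Manas encoding: the `Posting` and `InvertedIndex` names are hypothetical, the payload contents are an assumption, and the dense and split encodings are not modeled.

```python
from collections import defaultdict
from typing import NamedTuple


class Posting(NamedTuple):
    doc_id: int     # internal doc ID
    payload: bytes  # opaque per-posting data (contents are an assumption)


class InvertedIndex:
    """Toy term -> posting list mapping, held in plain Python lists."""

    def __init__(self) -> None:
        self._postings = defaultdict(list)

    def add(self, term: str, doc_id: int, payload: bytes = b"") -> None:
        self._postings[term].append(Posting(doc_id, payload))

    def finalize(self) -> None:
        # Keep each posting list sorted by doc ID so lists can be merged linearly.
        for posting_list in self._postings.values():
            posting_list.sort(key=lambda posting: posting.doc_id)

    def postings(self, term: str) -> list:
        # .get() avoids creating empty lists for unknown terms.
        return self._postings.get(term, [])


index = InvertedIndex()
index.add("cat", 2, b"\x01")
index.add("cat", 0, b"\x02")
index.add("kitten", 1)
index.finalize()
```

Sorting posting lists by doc ID is a common design choice because it lets retrieval operators walk several lists in lockstep.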
Manas’ forward index, on the other hand, stores the mapping from internal doc ID to the actual document. To optimize data locality, the forward index supports column families, similar to HFile. The forward index is used for full scoring.
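To make the column-family idea concrete, the sketch below keeps one contiguous list per family, so reading a single family for many documents touches only that family's data. The `ForwardIndex` class and the family names `signals` and `text` are hypothetical and only illustrate the layout.

```python
class ForwardIndex:
    """Toy forward index: doc ID -> document data, grouped by column family."""

    def __init__(self, families):
        # One contiguous list per column family, indexed by internal doc ID.
        self._columns = {name: [] for name in families}
        self._size = 0

    def append(self, doc):
        """Store one document and return its internal doc ID."""
        doc_id = self._size
        for name, column in self._columns.items():
            column.append(doc.get(name, b""))  # missing families become empty blobs
        self._size += 1
        return doc_id

    def get(self, doc_id, family):
        return self._columns[family][doc_id]

    def scan(self, family):
        """Read one family for all docs without touching the other families."""
        return list(self._columns[family])


forward = ForwardIndex(["signals", "text"])
first = forward.append({"signals": b"\x01\x02", "text": b"cute cat"})
second = forward.append({"signals": b"\x03\x04"})
```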
We define the Manas doc as a unified schema that different applications use to describe the data they want to index for each document. In a Manas doc, matching terms can be specified for retrieval, and properties of the document can be added for filtering and lightweight scoring. For example, the system can return only English documents by filtering results on the language property.
The index builder takes a batch of Manas docs and builds an index. Because the Manas doc schema is unified, the index builder can be shared across different use cases.
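The real Manas doc schema isn't shown in this post, so the following is only a sketch of the idea: a document carries matching terms plus properties, and a property such as language can drive filtering. The field names `doc_key`, `terms` and `properties` are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ManasDoc:
    doc_key: str                                    # application-level ID (hypothetical)
    terms: list = field(default_factory=list)       # matching terms used for retrieval
    properties: dict = field(default_factory=dict)  # used for filtering and lightweight scoring


def filter_by_property(docs, name, value):
    """Keep only the docs whose property `name` equals `value`."""
    return [doc for doc in docs if doc.properties.get(name) == value]


docs = [
    ManasDoc("pin-1", ["cute", "cat"], {"language": "en"}),
    ManasDoc("pin-2", ["chat", "mignon"], {"language": "fr"}),
    ManasDoc("pin-3", ["cute", "kitten"], {"language": "en"}),
]
english = filter_by_property(docs, "language", "en")
```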
The above graph illustrates the indexing pipeline.
- Different applications generate the Manas doc for their corpus.
- Manas docs are partitioned into multiple groups.
- The index builder transforms all Manas docs in a partition into an index segment. Each index segment is a small portion of the full index.
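The pipeline steps above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the post doesn't say how docs are partitioned, so hashing the document key is a guess, and the segment layout (a dict holding a tiny inverted and forward index) is hypothetical.

```python
import zlib
from collections import defaultdict


def partition_docs(docs, num_partitions):
    """Assign each doc to a partition by hashing its key (hashing is an assumption)."""
    partitions = [[] for _ in range(num_partitions)]
    for doc in docs:
        slot = zlib.crc32(doc["key"].encode("utf-8")) % num_partitions
        partitions[slot].append(doc)
    return partitions


def build_segment(partition):
    """Turn one partition into a segment with a tiny inverted and forward index."""
    inverted = defaultdict(list)
    forward = []
    for local_id, doc in enumerate(partition):  # doc IDs are segment-local here
        forward.append(doc)
        for term in sorted(set(doc["terms"])):
            inverted[term].append(local_id)
    return {"inverted": dict(inverted), "forward": forward}


docs = [
    {"key": "pin-1", "terms": ["cute", "cat"]},
    {"key": "pin-2", "terms": ["cute", "kitten"]},
    {"key": "pin-3", "terms": ["garden", "ideas"]},
]
segments = [build_segment(p) for p in partition_docs(docs, num_partitions=2)]
```

Each segment is self-contained, which is what lets a leaf load and serve only its own portion of the full index.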
The graph below illustrates the life of a search in Manas.
Here’s what happens when a query enters the system:
- The query understanding service processes the raw query and generates an execution plan.
- Corpora are served with a serving tree. The Blender fans out the request to the roots of different corpora, collects their results and blends them. We store the blended results in cache for pagination.
- Root is a scatter-gather service. It aggregates results from leaves and reranks them.
- Leaf starts by loading the index segment(s) built by the indexing pipeline. It retrieves the candidates and does both lightweight and full scoring.
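The blender, root and leaf roles described above can be modeled as three small functions. This sketch runs sequentially in one process and blends purely by score. Both are simplifications, and the function names and data layout are hypothetical rather than the Manas API.

```python
import heapq


def leaf_search(segment, term, limit):
    """Leaf: retrieve and score candidates from its own index segment."""
    hits = [(score, doc) for doc, score in segment.get(term, [])]
    return heapq.nlargest(limit, hits)


def root_search(segments, term, limit):
    """Root: scatter the request to every leaf, gather the results, rerank."""
    gathered = []
    for segment in segments:  # sequential here; a real root would fan out in parallel
        gathered.extend(leaf_search(segment, term, limit))
    return heapq.nlargest(limit, gathered)


def blend(corpora, term, limit):
    """Blender: fan out to the root of each corpus and merge the results."""
    blended = []
    for corpus_name, segments in corpora.items():
        for score, doc in root_search(segments, term, limit):
            blended.append((score, corpus_name, doc))
    return heapq.nlargest(limit, blended)


corpora = {
    "pins": [
        {"cat": [("pin-1", 0.9), ("pin-2", 0.4)]},
        {"cat": [("pin-7", 0.7)]},
    ],
    "boards": [
        {"cat": [("board-3", 0.8)]},
    ],
}
results = blend(corpora, "cat", limit=3)
```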
Manas Leaf is extensible and can be customized for multiple applications. This is achieved by encapsulating application-specific information in the index. Application-specific scoring logic can also be embedded, so that Manas executes only the tasks an application specifies when scoring documents.
The service architecture is designed with multiple layers and well-defined interfaces between them so that each layer is extensible. The architecture of the Leaf node is structured as follows:
As illustrated above, the storage layer is responsible for loading the index and provides an abstraction for obtaining a contiguous block of binary data given an identifier. This layer allows us to easily change the underlying storage of the index. On top of the storage layer, the index layer decodes the binary data into an index and provides the interface for reading it. The posting list layer gives us flexibility in how we implement the inverted index. The operator layer defines an interface for implementing query operators, and the model runner defines the model interface for full scoring. Finally, the API layer specifies the query formats evaluated by the Leaf node.
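To show how the bottom two layers might fit together, here is a small sketch of a storage abstraction that returns raw bytes for an identifier and an index layer that decodes them. The class names, the `postings/<term>` identifier scheme and the fixed-width encoding are all assumptions made for illustration.

```python
import struct
from abc import ABC, abstractmethod


class Storage(ABC):
    """Storage layer: return a contiguous block of bytes for an identifier."""

    @abstractmethod
    def read(self, identifier: str) -> bytes:
        ...


class InMemoryStorage(Storage):
    """One possible backend; a file or mmap backend could replace it."""

    def __init__(self, blobs) -> None:
        self._blobs = blobs

    def read(self, identifier: str) -> bytes:
        return self._blobs[identifier]


class IndexReader:
    """Index layer: decode raw bytes from storage into doc IDs."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def doc_ids(self, term: str) -> list:
        raw = self._storage.read(f"postings/{term}")
        # Hypothetical encoding: little-endian uint32 doc IDs, back to back.
        return list(struct.unpack(f"<{len(raw) // 4}I", raw))


storage = InMemoryStorage({"postings/cat": struct.pack("<3I", 1, 5, 9)})
reader = IndexReader(storage)
```

Because `IndexReader` depends only on the `Storage` interface, swapping the backend doesn't require changes in the layers above it.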
Candidate retrieving and lightweight scoring
In addition to supporting the normal “AND,” “OR” and “NOT” operators, we built “Weak And” support in Leaf (based on this paper). This allows us to skip quickly through posting lists.
We use Squery to represent a structured query in a tree format. It describes how the Leaf retrieves and lightweight scores candidates from the index. Leaf understands the Squery and executes it against the index.
The above graph is an example of a Squery asking Leaf to retrieve English-only documents that match the term “cute” and either “cat” or “kitten.” It gives a document a higher score if its click rate is high.
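For readers unfamiliar with the idea, the sketch below shows a common formulation of weak-AND style retrieval: each term carries an upper bound on its score contribution, and documents that cannot beat the current top-k threshold are skipped without being scored. Whether Manas implements it this way is an assumption, and the names `TermCursor` and `wand_top_k` are hypothetical.

```python
import heapq
from bisect import bisect_left

INF = float("inf")


class TermCursor:
    """Cursor over one posting list of (doc_id, score) pairs sorted by doc_id."""

    def __init__(self, postings):
        self.doc_ids = [doc_id for doc_id, _ in postings]
        self.scores = [score for _, score in postings]
        self.upper_bound = max(self.scores, default=0.0)
        self.pos = 0

    def doc(self):
        return self.doc_ids[self.pos] if self.pos < len(self.doc_ids) else INF

    def seek(self, target):
        # Jump to the first posting with doc_id >= target.
        self.pos = bisect_left(self.doc_ids, target, self.pos)


def wand_top_k(posting_lists, k):
    cursors = [TermCursor(postings) for postings in posting_lists]
    heap = []  # min-heap of (score, doc_id) holding the current top k
    while True:
        threshold = heap[0][0] if len(heap) == k else -INF
        cursors.sort(key=lambda cursor: cursor.doc())
        # Pivot: first doc whose accumulated upper bounds can beat the threshold.
        accumulated, pivot = 0.0, None
        for cursor in cursors:
            if cursor.doc() == INF:
                break
            accumulated += cursor.upper_bound
            if accumulated > threshold:
                pivot = cursor.doc()
                break
        if pivot is None:
            break  # no remaining document can enter the top k
        if cursors[0].doc() == pivot:
            # Every cursor positioned on the pivot contributes to its score.
            score = 0.0
            for cursor in cursors:
                if cursor.doc() != pivot:
                    break
                score += cursor.scores[cursor.pos]
                cursor.pos += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, pivot))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, pivot))
        else:
            # Docs before the pivot cannot qualify, so skip over them.
            cursors[0].seek(pivot)
    return sorted(heap, reverse=True)


posting_lists = [
    [(1, 1.0), (2, 0.5), (5, 2.0)],  # "cute"
    [(2, 1.5), (5, 0.25), (8, 1.0)],  # "cat"
    [(3, 0.25), (5, 1.0)],           # "kitten"
]
top = wand_top_k(posting_lists, k=2)
```

In this example, document 3 is never scored: once two results are in the heap, its only matching term cannot exceed the threshold on its own.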
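The Squery format itself isn't reproduced in this post, so the following is a hypothetical stand-in: a nested dict tree evaluated recursively, using the same example of English documents matching “cute” and either “cat” or “kitten,” ranked by click rate. The node shapes, operator names and the scoring rule are assumptions.

```python
def evaluate(node, index, properties):
    """Return the set of doc IDs matched by a Squery-like tree."""
    op = node["op"]
    if op == "term":
        return set(index.get(node["value"], ()))
    if op == "property":
        return {
            doc_id
            for doc_id, props in properties.items()
            if props.get(node["name"]) == node["value"]
        }
    children = [evaluate(child, index, properties) for child in node["children"]]
    if op == "and":
        return set.intersection(*children)
    if op == "or":
        return set.union(*children)
    raise ValueError(f"unknown operator: {op}")


def lightweight_score(doc_id, properties):
    # Assumed scoring rule: a higher click rate gives a higher score.
    return properties[doc_id].get("click_rate", 0.0)


index = {"cute": [0, 1, 2, 3], "cat": [0, 2], "kitten": [1, 3]}
properties = {
    0: {"lang": "en", "click_rate": 0.10},
    1: {"lang": "en", "click_rate": 0.30},
    2: {"lang": "fr", "click_rate": 0.50},
    3: {"lang": "en"},
}
squery = {
    "op": "and",
    "children": [
        {"op": "property", "name": "lang", "value": "en"},
        {"op": "term", "value": "cute"},
        {"op": "or", "children": [
            {"op": "term", "value": "cat"},
            {"op": "term", "value": "kitten"},
        ]},
    ],
}
matched = evaluate(squery, index, properties)
ranked = sorted(matched, key=lambda d: lightweight_score(d, properties), reverse=True)
```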
Different applications use different algorithms to calculate the final score. To keep Manas generic, we introduced the forward index, which is a binary blob that can hold anything. In practice, the forward index is a serialized Thrift object. Manas doesn’t interpret the forward index; instead, it injects it into a DSL model and executes that model to calculate the score. The DSL is a domain-specific language used at Pinterest to customize how features are extracted from the forward index and to select the machine learning model that calculates the score from those features. Different applications can create different DSL models and specify which forward index should be injected.
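Pinterest's DSL isn't public in this post, so the sketch below only mimics the flow: an opaque blob is deserialized, named features are extracted, and a model chosen by the application produces the score. JSON stands in for Thrift, and the linear model, feature names and the `search_model` structure are all hypothetical.

```python
import json


def extract_features(blob, feature_names):
    """Deserialize an opaque forward-index blob and pull out the requested features."""
    record = json.loads(blob.decode("utf-8"))  # JSON stands in for Thrift here
    return [float(record.get(name, 0.0)) for name in feature_names]


def linear_model(weights, bias=0.0):
    """Return a scoring function; a stand-in for any machine learning model."""
    def predict(features):
        return bias + sum(w * x for w, x in zip(weights, features))
    return predict


# A hypothetical "DSL model": which features to extract and which model to apply.
search_model = {
    "features": ["click_rate", "freshness"],
    "model": linear_model(weights=[2.0, 0.5]),
}


def full_score(blob, dsl_model):
    # The serving code never inspects the blob itself; only the model spec does.
    features = extract_features(blob, dsl_model["features"])
    return dsl_model["model"](features)


blob = json.dumps(
    {"click_rate": 0.25, "freshness": 1.0, "title": "cute cat"}
).encode("utf-8")
score = full_score(blob, search_model)
```

A different application could supply its own feature list and model without any change to `full_score`, which is the kind of separation the paragraph above describes.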
With a sizable forward index to support complicated scoring algorithms, the total index size is increased significantly. To support more sophisticated scoring in the future, we’ll add more signals to the index. It wouldn’t be scalable to load all of the index into memory, so Manas only loads the inverted index for candidate retrieving and lightweight scoring, and serves the forward index from SSD and local cache.
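The memory and SSD split can be illustrated with a small read-through cache. In this sketch a temporary file stands in for the SSD-resident forward index and an LRU dictionary stands in for the local cache. The class name, record layout and eviction policy are assumptions.

```python
import os
import tempfile
from collections import OrderedDict


class DiskForwardIndex:
    """Reads records from a file, keeping recently used records in an LRU cache."""

    def __init__(self, path, offsets, cache_size):
        self._path = path
        self._offsets = offsets  # doc_id -> (offset, length)
        self._cache = OrderedDict()
        self._cache_size = cache_size
        self.disk_reads = 0

    def get(self, doc_id):
        if doc_id in self._cache:
            self._cache.move_to_end(doc_id)  # mark as most recently used
            return self._cache[doc_id]
        offset, length = self._offsets[doc_id]
        with open(self._path, "rb") as handle:
            handle.seek(offset)
            blob = handle.read(length)
        self.disk_reads += 1
        self._cache[doc_id] = blob
        if len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return blob


# Write three records to a temporary file that stands in for the on-SSD index.
records = [b"doc-zero", b"doc-one", b"doc-two"]
offsets, position = {}, 0
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for doc_id, blob in enumerate(records):
        tmp.write(blob)
        offsets[doc_id] = (position, len(blob))
        position += len(blob)

forward = DiskForwardIndex(tmp.name, offsets, cache_size=2)
first = forward.get(1)   # read from disk
again = forward.get(1)   # served from the cache
other = forward.get(2)   # read from disk
os.remove(tmp.name)
```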
Periodically, we execute the indexing pipeline to build the index. Once the new index is ready, we allocate new instances from AWS to create a cluster. We deploy the new index to the newly created cluster. Then the blender will switch traffic to the new cluster, and the old cluster will be deprecated.
We just launched Manas as the backbone of Pinterest Search. It also powers embedding-based Search and Ads. This is just the start, and there are many more use cases coming. Meanwhile, we’re still improving the system with ongoing projects, such as in-place index swap, incremental indexing, real-time index updates and customized re-ranking. If you’re passionate about building a world-class search system, join our team!
Acknowledgements: Many engineers at Pinterest worked together to build Manas, including Andrew Chen, Caijie Zhang, Chao Wang, Charlie Luo, Chengcheng Hu, Lance Riedel, Mukund Narasimhan, Jan Gelin, Jiacheng Hong, Jinru He, Keita Fuiji, Randall Keller, Roger Wang, Sonja Knoll, Tim Koh, Vanja Josifovski, Vitaliy Kulikov, Wangfan Fu, Wenchang Hu, Xinding Sun, Shu Zhang, Yongsheng Wu and Zhao Zheng.