Vector Database for AI

How Xiaomi Browser Uses Milvus to Build Its News Recommender System

Making recommendations through vector similarity search


Milvus is a vector database for AI that was officially open-sourced six months ago and has so far attracted 200+ corporate and organizational users worldwide. Xiaomi, a global mobile Internet company focused on the research and development of smart devices, is one of the most important users of Milvus.

Application Scenarios

As the Internet goes mobile, the explosion of information poses a new challenge to information processing. For users, quickly locating the content they want amid exponentially growing information has become an essential requirement; for platforms, pushing the right content to each user has become increasingly difficult.

When a user opens the browser on a Xiaomi phone, the backend recommends content that matches the user’s interests, using Milvus to speed up the retrieval of related articles.

The fundamental function of an article recommender system is to select appropriate articles from a massive article library and push them to users. To cope with the large volume of content in the library, a recommender system is generally divided into two stages: the retrieval stage and the sorting stage. At the retrieval stage, the system narrows the library down to a candidate set of articles the user may be interested in; at the sorting stage, it sorts the candidate set according to certain indicators and then pushes the top content to the user.
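The two-stage split can be sketched with a toy example. The vectors, click-through rates, and helper names below are purely illustrative and are not part of Xiaomi’s actual system; a real library holds far more articles, and the retrieval stage uses an ANN index rather than a brute-force scan.

```python
import math

# Toy article library: id -> (title embedding, click-through rate).
LIBRARY = {
    1: ([0.9, 0.1], 0.02),
    2: ([0.8, 0.2], 0.09),
    3: ([0.1, 0.9], 0.30),
    4: ([0.7, 0.3], 0.05),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(user_vec, k):
    """Retrieval stage: narrow the library to the k most similar candidates."""
    ranked = sorted(LIBRARY, key=lambda i: cosine(user_vec, LIBRARY[i][0]),
                    reverse=True)
    return ranked[:k]

def sort_candidates(candidates):
    """Sorting stage: order the candidate set by an indicator
    (here, click-through rate)."""
    return sorted(candidates, key=lambda i: LIBRARY[i][1], reverse=True)

user_vec = [1.0, 0.0]                 # a user interested in the first topic
candidates = retrieve(user_vec, k=3)  # similarity cut: [1, 2, 4]
recommended = sort_candidates(candidates)  # re-ordered by CTR: [2, 4, 1]
```

Note that the two stages optimize different things: retrieval maximizes relevance to the user vector, while sorting re-orders the survivors by a business indicator.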

At the retrieval stage, the system uses the user’s interests and click history to retrieve the thousands of articles in the underlying library that are most suitable for that user, sorts them, and then displays the recommendations on the client. When the user clicks in the client, the system receives real-time feedback on that behavior, quickly tracks the user’s preferences, and makes new recommendations.

When it comes to recommending news articles, retrieval must be up to date, and the model must meet the following requirements:

(1) Efficiency: Must complete a retrieval within a short timeframe.

(2) Relevance: The retrieved articles must match the user’s interest as much as possible.

(3) Timeliness: Newly-published articles must also be retrieved so that the latest contents stand a chance of being viewed.

Key Technology

Milvus

Milvus, an open-source vector database, can be integrated with deep learning models for image recognition, video processing, voice recognition, natural language processing, and more, to provide search and analysis services for vectorized unstructured data. Unstructured data is converted into feature vectors by deep learning models and imported into Milvus, which stores and indexes them. When you run a search, Milvus returns the vectors most similar to the query vector.

BERT

BERT, short for Bidirectional Encoder Representations from Transformers, is a language representation model. You can think of it as a general NLU (Natural Language Understanding) model that supports a range of NLP tasks. Its key features are:

  • The Transformer is the main building block of the model; its attention mechanism captures the bidirectional relationships within a sentence more thoroughly than earlier architectures;
  • Multi-task pre-training objectives: Masked Language Model (MLM) and Next Sentence Prediction (NSP);
  • Pre-training with greater compute on larger-scale data lifts BERT’s results to a new level. Users can use BERT directly, much as they would a Word2Vec embedding matrix, and apply it efficiently to their own tasks.

BERT’s network architecture uses a multi-layer Transformer structure, abandoning the traditional RNN and CNN. Through the attention mechanism, the effective distance between any two words in a sentence is reduced to 1, which solves the long-range dependency problem that has long troubled NLP.
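The attention step described above can be sketched in a few lines of NumPy. This is a minimal single-head illustration with random stand-in matrices; a full Transformer adds learned Q/K/V projections, multiple heads, masking, and residual connections.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Every position attends to every other position directly, so the
    effective distance between any two tokens is 1."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8
x = rng.normal(size=(seq_len, d_model))  # stand-in token embeddings
out, weights = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V
```

Each row of `weights` is a probability distribution over all positions, which is exactly why no recurrence over intermediate positions is needed.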

Transformer’s network architecture is shown in the figure below: it comprises multi-head attention and a fully connected feed-forward network, and converts the input corpus into feature vectors.

The network structure of BERT is shown in the following figure. A ‘trm’ in this figure corresponds to the network architecture of Transformer in the above figure.

BERT comes in a simple and a complex configuration, with the following hyperparameters: BERT Base: L = 12, H = 768, A = 12, 110M total parameters; BERT Large: L = 24, H = 1024, A = 16, 340M total parameters.

In the above hyperparameters, L is the number of layers in the network (i.e., the number of Transformer blocks), H is the hidden size, A is the number of self-attention heads in multi-head attention, and the feed-forward filter size is 4H.
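As a rough sanity check on these totals, each Transformer block contributes about 12H² parameters (4H² for the attention projections plus 8H² for the feed-forward filter of size 4H), and BERT’s token embedding table adds roughly vocab × H more. The 30,522-token WordPiece vocabulary is an assumption taken from the public BERT release; biases, layer norms, and position embeddings are ignored, which is why the estimate lands slightly below the quoted figures.

```python
def bert_param_estimate(L, H, vocab=30522):
    """Back-of-the-envelope BERT parameter count from L, H, and vocab size."""
    attention = 4 * H * H            # Q, K, V and output projections
    feed_forward = 2 * H * (4 * H)   # H -> 4H -> H, filter size 4H
    per_layer = attention + feed_forward
    embeddings = vocab * H           # token embedding table
    return L * per_layer + embeddings

base = bert_param_estimate(L=12, H=768)    # ~108M, near the quoted 110M
large = bert_param_estimate(L=24, H=1024)  # ~333M, near the quoted 340M
```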

Implementation

The homepage recommender system in the Xiaomi browser can be divided into three services: a vectorization service, an ANN service, and an ID Mapping service.

The vectorization service converts an article title into a general-purpose sentence vector, using the SimBert model, which is based on BERT. SimBert is a 12-layer model with a hidden size of 768. It takes Chinese_L-12_H-768_A-12 as its starting checkpoint and continues training on a “metric learning + UniLM” objective, running 1.17 million steps on a single TITAN RTX with the Adam optimizer (learning rate 2e-6, batch size 128). Simply put, it is a BERT model optimized for sentence similarity.
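One common way a BERT-style model’s per-token outputs are reduced to a single sentence vector is mean pooling over the non-padding tokens. The sketch below uses random stand-in token vectors and is only an illustration of the pooling idea, not the actual SimBert inference code.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average the vectors of real (non-padding) tokens into one
    fixed-size sentence vector."""
    mask = attention_mask[:, None]                  # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)  # zero out padding rows
    return summed / mask.sum()

rng = np.random.default_rng(1)
hidden_size = 768                            # SimBert's hidden size
tokens = rng.normal(size=(16, hidden_size))  # stand-in for model output
mask = np.array([1] * 10 + [0] * 6)          # 10 real tokens, 6 padding
sentence_vec = mean_pool(tokens, mask)       # one 768-d title vector
```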

The ANN service inserts the feature vector of each article title into a Milvus collection (a collection is the equivalent of a table in a relational database) and then uses Milvus vector similarity search to obtain the IDs of similar articles.

The ID Mapping service uses the IDs returned by the Milvus search to look up related information, such as page views and clicks, for the corresponding articles.
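The ANN and ID Mapping services can be sketched together with an in-memory stand-in for a Milvus collection. The class, field names, and statistics below are hypothetical; a real deployment would talk to Milvus through its SDK and use an ANN index rather than this brute-force scan.

```python
import numpy as np

class ToyCollection:
    """Minimal stand-in for a Milvus collection: insert vectors by ID,
    search top-k by inner product."""
    def __init__(self):
        self.ids, self.vectors = [], []

    def insert(self, ids, vectors):
        self.ids += list(ids)
        self.vectors += [np.asarray(v, dtype=float) for v in vectors]

    def search(self, query, top_k):
        scores = [float(np.dot(query, v)) for v in self.vectors]
        order = np.argsort(scores)[::-1][:top_k]
        return [self.ids[i] for i in order]

# ANN service: title vectors keyed by article ID.
collection = ToyCollection()
collection.insert([101, 102, 103], [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])

# ID Mapping service: article ID -> page views and clicks (made-up numbers).
stats = {101: {"views": 5000, "clicks": 150},
         102: {"views": 800,  "clicks": 90},
         103: {"views": 2000, "clicks": 40}}

hit_ids = collection.search(np.array([1.0, 0.0]), top_k=2)  # -> [101, 103]
hits = [(article_id, stats[article_id]) for article_id in hit_ids]
```

The search returns only IDs, mirroring the division of labor in the text: similarity lives in the vector store, while article metadata lives behind the ID mapping.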

The overall architecture diagram of a retrieval system is as follows:

Thousands of articles are stored in the scenario described above. Since new article data is generated every day and outdated data needs to be deleted, the system performs a full update covering the data of the previous T-1 days and incremental updates for the data generated on day T, the current day.

A full update is the offline update shown in the figure above: every morning, the old collection is deleted, and the processed data of the previous T-1 days is inserted into a new collection. An incremental update is the real-time update shown in the figure above: new data generated during the day is inserted as it arrives. After the data is inserted, a similarity search is run in Milvus; the retrieved similar articles are then re-sorted by click-through rate, and those with high click-through rates are returned. In scenarios requiring frequent data updates, Milvus’ rapid data insertion and high-performance queries greatly speed up both the refreshing of the article library and the retrieval of articles.
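The daily update cycle above can be sketched as follows. The store class, collection name, and helper functions are illustrative stand-ins, not Milvus API calls.

```python
import datetime

class Store:
    """Stand-in for the vector store: named collections of article records."""
    def __init__(self):
        self.collections = {}

    def drop(self, name):
        self.collections.pop(name, None)

    def insert(self, name, articles):
        self.collections.setdefault(name, []).extend(articles)

def full_update(store, today, archive):
    """Offline update each morning: drop the old collection and reload
    the processed data of the previous T-1 days."""
    store.drop("articles")
    for past_day in sorted(archive):
        if past_day < today:
            store.insert("articles", archive[past_day])

def incremental_update(store, new_articles):
    """Real-time update: insert articles published on day T as they arrive."""
    store.insert("articles", new_articles)
```

A usage example: rebuilding from a two-day archive, then streaming in one new article.

```python
store = Store()
archive = {datetime.date(2020, 5, 1): [1, 2],
           datetime.date(2020, 5, 2): [3]}
full_update(store, datetime.date(2020, 5, 3), archive)   # articles: [1, 2, 3]
incremental_update(store, [4])                           # articles: [1, 2, 3, 4]
```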

Conclusion

The most commonly used retrieval method in an article recommender system is to vectorize information about items (such as articles or commodities) and users, and then compute the similarity between the vectors to perform the retrieval.

The growing popularity of this technique owes much to the rise of vector similarity search engines based on ANNS (approximate nearest neighbor search), which greatly improve the efficiency of vector similarity calculation.

Compared with similar products, Milvus implements its own data storage, offers abundant SDKs for a variety of platforms, and provides distributed deployment solutions, which greatly reduces the workload of building the retrieval layer. What’s more, the Milvus community is active and provides strong project support, which is another important reason why Xiaomi chose Milvus.

We hope to see Milvus going further down the path of unstructured data processing and enabling more enterprises. In the meantime, we hope more like-minded people would join and contribute to the Milvus open source community.

Visit Milvus GitHub: https://github.com/milvus-io/milvus
