Yunsong Guo | Pinterest engineer, Discovery
Pinterest hosts more than 30 billion Pins (and growing) with rich contextual and visual information. Tens of millions of Pinners (users) interact with the site every day by browsing, searching, Pinning, and clicking through to external sites. The home feed, a collection of Pins from the people, boards and interests followed, as well as recommendations including Picked for You, is the most heavily user-engaged part of the service, and contributes a large fraction of total repins. The more people Pin, the better Pinterest can get for each person, which puts us in a unique position to serve up inspiration as a discovery engine on an ongoing basis.
The home feed is a key way to discover new content, which is valuable to the Pinner, but poses a challenging question. Given the ever increasing number of Pins from various sources, how can we surface the most personalized and relevant Pins? Our answer is Pinnability.
Pinnability is the collective name of the machine learning models we developed to help Pinners find the best content in their home feed. It’s part of the technology powered by smart feed, which we introduced last August, and it estimates a relevance score: how likely a Pinner is to interact with a Pin. With accurate predictions, we prioritize Pins with high relevance scores and show them at the top of home feed.
The benefits of Pinnability
Before launching Pinnability a few months ago, all home feed content from each source (e.g., following and Picked For You) was arranged chronologically, without taking into account which Pins people may find to be most interesting. In other words, a newer Pin from the same source always appeared before an older Pin. This simple rule is easy to understand and implement, but it lacked the ability to effectively help Pinners discover Pins that really interest them, because a low-relevance Pin could very well appear before a high-relevance one (see Figure 1).
With Pinnability launched, the candidate Pins for home feed are scored using the Pinnability models. The scores represent the personalized relevance between a Pinner and the candidate Pins. Pins in home feed are prioritized by the relevance score as illustrated in Figure 2.
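To make the difference between the two orderings concrete, here is a minimal sketch. The `CandidatePin` class, its field names and the example scores are hypothetical, chosen only for illustration; they are not the production data model.

```python
from dataclasses import dataclass


@dataclass
class CandidatePin:
    pin_id: str
    created_at: int    # hypothetical Unix-style timestamp
    relevance: float   # hypothetical model score in [0, 1]


def rank_chronologically(pins):
    """Pre-Pinnability ordering: newest Pin first."""
    return sorted(pins, key=lambda p: p.created_at, reverse=True)


def rank_by_relevance(pins):
    """Pinnability-style ordering: highest relevance score first."""
    return sorted(pins, key=lambda p: p.relevance, reverse=True)


candidates = [
    CandidatePin("a", created_at=300, relevance=0.10),
    CandidatePin("b", created_at=200, relevance=0.85),
    CandidatePin("c", created_at=100, relevance=0.40),
]
chronological_ids = [p.pin_id for p in rank_chronologically(candidates)]
relevance_ids = [p.pin_id for p in rank_by_relevance(candidates)]
```

In this toy example the newest Pin ("a") has the lowest score, so it leads the chronological ordering but comes last once Pins are ordered by relevance.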
Powering Pinnability with machine learning
To accurately predict how likely a Pinner is to interact with a Pin, we applied state-of-the-art machine learning models including Logistic Regression (LR), Support Vector Machines (SVM), Gradient Boosted Decision Trees (GBDT) and Convolutional Neural Networks (CNN). We extracted and tested thousands of textual and visual features that are useful for accurate prediction of the relevance score. Before we launch a model for an online A/B experiment, we thoroughly evaluate its offline performance based on historical data.
Figure 3 summarizes the three major components of our Pinnability workflow, namely training instance generation, model generation and home feed serving.
Training instance generation
The basis of the Pinnability training data is the historical Pinner interaction with home feed Pins. For example, after viewing a Pin in home feed, a Pinner may choose to like, repin, click for a Pin closeup, click through, comment, hide, or do nothing. We record a subset of these “positive actions” and “negative actions” as training instances. Naturally, the number of Pins viewed is often much larger than the number of Pins on which the Pinner took a positive action, so we sample the positive and negative instances at different rates. With these defined, we test thousands of informative features to improve Pinnability’s prediction accuracy.
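Sampling positives and negatives at different rates can be sketched as follows. The function name, the `(features, label)` event shape and the rates of 1.0 and 0.1 are assumptions for illustration; the actual rates are not stated here.

```python
import random


def sample_instances(events, positive_rate, negative_rate, seed=0):
    """Keep each event with a probability that depends on its label.

    `events` is an iterable of (features, label) pairs, where label is 1
    for a positive action and 0 otherwise. Both rates are hypothetical.
    """
    rng = random.Random(seed)
    kept = []
    for features, label in events:
        rate = positive_rate if label == 1 else negative_rate
        # rng.random() is in [0, 1), so a rate of 1.0 keeps every event.
        if rng.random() < rate:
            kept.append((features, label))
    return kept


# Toy data: 1 in 20 viewed Pins receives a positive action.
events = [({"pin": i}, 1 if i % 20 == 0 else 0) for i in range(1000)]
sampled = sample_instances(events, positive_rate=1.0, negative_rate=0.1)
```

Keeping all positives while downsampling negatives produces a less skewed training set. A model trained this way sees a different class balance than production traffic, which matters if the raw scores are read as probabilities rather than used only for ranking.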
Our unique data set contains abundant human-curated content, so that Pin, board and user dynamics provide informative features for accurate Pinnability prediction. These features fall into three general categories: Pin features, Pinner features and interaction features:
- Pin features capture the intrinsic quality of a Pin, such as historical popularity, Pin freshness and likelihood of spam. Visual features from Convolutional Neural Networks (CNN) are also included.
- Pinner features are about the particulars of a user, such as how active the Pinner is, gender and board status.
- Interaction features represent the Pinner’s past interaction with Pins of a similar type.
Some features are subject to transformation and normalization. For instance, log transformation is applied to many “count features”, such as the number of Pins a Pinner owns, to obtain regression-friendly distributions.
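As a small illustration of the log transformation on count features, the sketch below uses log(1 + x) so that a count of zero stays well defined. The choice of log1p and the example counts are assumptions; the exact transform is not specified above.

```python
import math


def log_transform(count):
    """Compress a non-negative count with log(1 + x), which is defined at 0."""
    if count < 0:
        raise ValueError("count features must be non-negative")
    return math.log1p(count)


# Hypothetical Pin counts spanning several orders of magnitude.
pin_counts = [0, 9, 99, 9999]
transformed = [log_transform(c) for c in pin_counts]
```

Counts that differ by orders of magnitude end up a few units apart after the transform, which keeps a handful of very active Pinners from dominating a linear model's input range.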
The major challenge we faced in developing a robust training data generation pipeline was how to cope with the large data scale. We built MapReduce pipelines to generate the training instances, each representing a Pinner/Pin interaction. A training instance contains three parts of information:
- Meta data (Pin ID, Pinner ID, source of the interaction, timestamp, etc.) for data grouping when we want to train and analyze a Pinnability model with a subset of training instances, such as following and Picked For You (PFY) models.
- Target value to indicate whether a Pinner has taken a positive action after viewing the Pin. We can train separate models that optimize different positive actions such as repins and clickthroughs.
- Feature vector that contains the informative signals for interaction prediction.
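The three parts of a training instance could be represented roughly as below. The class, field names, source labels ("following", "pfy") and feature names are hypothetical; only the three-part structure comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrainingInstance:
    # Part 1: metadata, used to group or filter instances.
    pin_id: str
    pinner_id: str
    source: str        # hypothetical labels, e.g. "following" or "pfy"
    timestamp: int
    # Part 2: target value (1 if a positive action followed the view).
    label: int
    # Part 3: feature vector, stored here as a name-to-value mapping.
    features: Dict[str, float] = field(default_factory=dict)


def filter_by_source(instances: List[TrainingInstance], source: str):
    """Select the subset of instances used to train a per-source model."""
    return [inst for inst in instances if inst.source == source]


instances = [
    TrainingInstance("p1", "u1", "following", 1000, 1, {"log_repins": 2.3}),
    TrainingInstance("p2", "u1", "pfy", 1001, 0, {"log_repins": 0.7}),
    TrainingInstance("p3", "u2", "following", 1002, 0, {"log_repins": 1.1}),
]
pfy_only = filter_by_source(instances, "pfy")
```

Keeping the metadata alongside the label and features is what makes it cheap to slice one data set into separate training sets, for example one per source.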
Pinnability model generation
In training Pinnability models, we use Area Under the ROC Curve (AUC) as our main offline evaluation metric, along with R-squared and root mean squared error. We optimized for AUC not only because it is a widely used metric in similar prediction systems, but also because we’ve observed a strong positive correlation between the AUC gain from offline testing and an increase in Pinner engagement in online A/B experiments. Our production Pinnability model achieves an AUC score averaging around 90 percent in home feed.
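For readers less familiar with AUC, it can be read as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. It is quadratic in the number of instances and meant only as an illustration, not as how the metric is computed at scale; the labels and scores are made up.

```python
def auc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.5, 0.1]
example_auc = auc(labels, scores)  # 5 of 6 positive/negative pairs are ordered correctly
```

Because AUC depends only on how instances are ordered, not on the absolute score values, it is a natural fit for a model whose output is used to rank Pins.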
We experimented with multiple machine learning models, including LR, GBDT, SVM and CNN, and we use the AUC score in 10-fold cross-validation and 90/10 train-test split settings with proper model parameters for evaluation. We observed that, given a fixed feature set, the winning model tends to be either LR or GBDT for Pinnability. For online A/B experimentation, we prioritize models based on offline AUC scores.
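The two evaluation settings mentioned here, a 90/10 train-test split and 10-fold cross-validation, can be sketched as index generators. The function names, shuffling and seeding scheme are assumptions for illustration.

```python
import random


def train_test_split(n, test_fraction=0.1, seed=0):
    """Shuffle indices 0..n-1 and hold out `test_fraction` of them for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each index is in exactly one test fold."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


train_idx, test_idx = train_test_split(100)
cv_splits = list(k_fold_indices(100, k=10))
```

With cross-validation, the model is trained and scored once per fold and the per-fold AUC values are averaged, which gives a more stable estimate than a single split at roughly ten times the training cost.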
Among the thousands of features we added to the training instances, we select features that significantly increase our offline AUC metric as candidates for online A/B experiments. Given the large number of candidate features, we often test new features in smaller groups, such as recency, Pin owner quality and category match features. The A/B experiments we conducted compare Pinner engagement between the control group using production features and the treatment group using the new experimental features. If the results are positive, we evaluate the extra data size and latency impact on our servers before adding the new features to our production Pinnability models. We iterate quickly on new features, supported by our robust training instance generation, model training and evaluation pipelines. To monitor the models’ ongoing performance, we keep a small holdout user group that is not exposed to the Pinnability models. Comparing the engagement difference between the holdout and enabled groups provides valuable insights about Pinnability’s long-term performance.
Currently we only use offline batch data to train our Pinnability models. This poses a potential issue in that we’re not utilizing the most recent data to dynamically adjust model coefficients in serving. On the other hand, we tested and confirmed that the model coefficients do not change substantially when trained on different batches of data separated by several days, so the benefits of online model adjustment are subject to further evaluation.
We’re also exploring ways to apply online training with real-time instances to augment our offline training pipeline so our models are calibrated immediately after we gather new home feed activity data. Online training poses new challenges, both algorithmically for our machine learning pipeline and at the system level for our home feed serving framework.
Home feed serving
Home feed is powered by our in-house smart feed infrastructure. When a new Pin is repinned, the smart feed worker sends a request to the Pinnability servers for the relevance scores between the repinned Pin and all the people following the repinning Pinner or board. It then inserts the Pin, along with its scores, into the pool that contains all followed Pins. PFY Pins are inserted into the PFY pool with the Pinnability relevance score in a similar fashion.
When a user logs on or refreshes home feed, the smart feed content generator materializes the new content from the various pools while respecting the relevance scores within each pool, and the smart feed service renders a home feed for the Pinner that prioritizes Pins by relevance score.
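The pool-based flow can be sketched with a priority queue per pool. Everything below is a simplified illustration under assumptions: the `ScoredPool` class, the `materialize` function and the rule of taking a fixed number of Pins from each pool and then ordering them by score are hypothetical, since the actual mixing policy of smart feed is not described here.

```python
import heapq


class ScoredPool:
    """A pool of Pins that yields the highest-scoring Pin first."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker: equal scores pop in insertion order

    def insert(self, pin_id, score):
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-score, self._counter, pin_id))
        self._counter += 1

    def pop_top(self, n):
        """Remove and return up to n (pin_id, score) pairs, best first."""
        out = []
        while self._heap and len(out) < n:
            neg_score, _, pin_id = heapq.heappop(self._heap)
            out.append((pin_id, -neg_score))
        return out


def materialize(pools, per_pool):
    """Take the top `per_pool` Pins from each pool, then order them by score."""
    chunk = []
    for pool in pools.values():
        chunk.extend(pool.pop_top(per_pool))
    return sorted(chunk, key=lambda item: item[1], reverse=True)


following = ScoredPool()
for pin_id, score in [("f1", 0.2), ("f2", 0.9), ("f3", 0.5)]:
    following.insert(pin_id, score)
pfy = ScoredPool()
for pin_id, score in [("p1", 0.7), ("p2", 0.1)]:
    pfy.insert(pin_id, score)

feed = materialize({"following": following, "pfy": pfy}, per_pool=2)
feed_ids = [pin_id for pin_id, _ in feed]
```

Scoring at insertion time, as described above, means the expensive model call happens once per Pin and follower, while the work at refresh time is limited to popping the best entries from each pool.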
We continue to refine Pinnability and have released several improvements to date. With each iteration, we’ve observed significant boosts in Pinner engagement, including an increase in the home feed repinner count by more than 20 percent. We’ve also observed significant gains in other metrics including total repins and clickthroughs.
Given the importance of home feed and the boost in Pinner engagement, Pinnability continues to be a core project in building our discovery engine. We’ve also begun to expand the use of our Pinnability models to help improve our other products outside home feed.
We’re always looking for bright engineers to join Pinterest to help us solve impactful problems like Pinnability.
Yunsong Guo is a software engineer on the Recommendations team
Acknowledgements: Pinnability is a long-term strategic project being developed in collaboration with Mukund Narasimhan, Chris Pinchak, Yuchen Liu, Dmitry Chechik and Hui Xu. This team, as well as people across the company, helped make this project a reality with their technical insights and invaluable feedback.