Applying deep learning to Related Pins

Pinterest Engineering
The Graph
Published in
6 min read · Jan 12, 2017

By Kevin Ma | Pinterest engineer, Discovery

One of the most popular ways people find ideas on Pinterest is through Related Pins, an item-to-item recommendation system that uses collaborative filtering. Previously, candidates were generated using board co-occurrence, which draws signals from all the boards a Pin is saved to. Now, for the first time, we’re applying deep learning to make Related Pins even more relevant. We developed a scalable system that evolves with our product and people’s interests, so we can surface the most relevant recommendations through Related Pins. In this post, we’ll cover how we use deep learning to generate recommendation candidates, which, in testing, has increased engagement with Related Pins by 5 percent.

Board co-occurrence

For years, Related Pins have been generated using board co-occurrence. For example, suppose Pin X has been saved to N boards by many people. Related Pins for Pin X are generated from all the other Pins saved to those N boards. Board co-occurrence provides a practically unlimited supply of candidates to recommend: every Pin has been saved to a board, so every Pin has at least some Pins we can generate recommendations from.
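To make the idea concrete, here is a minimal sketch of board co-occurrence on toy data. The function name, the board names, the Pin IDs and the ranking by shared-board count are all assumptions for illustration, not the production logic.

```python
from collections import Counter

def cooccurrence_candidates(query_pin, boards, top_k=3):
    """Rank Pins by how many boards they share with query_pin."""
    counts = Counter()
    for pins in boards.values():
        if query_pin in pins:
            # Every other Pin on this board co-occurs once with the query Pin.
            counts.update(p for p in pins if p != query_pin)
    return counts.most_common(top_k)

# Toy data: board name -> set of Pin IDs (all hypothetical).
boards = {
    "animals": {"lion_couple", "zebra", "elephant"},
    "wild animals": {"lion_couple", "zebra", "tiger"},
    "recipes": {"pasta", "salad"},
}
candidates = cooccurrence_candidates("lion_couple", boards)
```

In this toy example the zebra Pin ranks first because it shares two boards with the query Pin, while none of the candidates is guaranteed to match the query's specific subject.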

However, board co-occurrence has a few disadvantages:

  • Board segmentation. Pinners often have multiple boards for one interest. Someone may save a wine-related idea to a wine board, but cocktails made with wine to a separate cocktails board. This gap in boards makes it challenging to recommend related drinks for the wine Pin.
  • Board granularity. Boards are usually created for a broad topic, so Related Pins surfaced using board co-occurrence can be only tangentially related. Figure 2 shows a query Pin of a lion couple cuddling that was saved to the boards “animals” and “wild animals.” As a result, the Related Pins are not specifically cuddling lions, but all kinds of wild animals.
  • Board drifting. The topic of a board tends to change over time as an interest evolves. For example, a “healthy” board could start with fitness ideas and eventually evolve into other areas like recipes.

Board co-occurrence can lose the context of a Pin, so we needed a way to better understand Pins and their relationships to Pinners. Inspired by the word2vec approach for creating word embeddings in the context of human language, we developed Pin2Vec to embed Pins in the context of Pinners’ activity.

Using deep learning to generate Related Pins

We built Pin2Vec to embed all Pins in a 128-dimensional space. First, we label a Pin with each of the other Pins someone has saved in the same activity session, forming Pin tuples. These Pin tuples are used in supervised training to learn the embedding matrix, which covers each of the tens of millions of Pins in the vocabulary. We use TensorFlow as the trainer. At serving time, the nearest neighbors of each Pin in the embedding space are returned as its Related Pins.

Training data is collected from recent engagement, such as saving or clicking, and a sliding window is applied. Low-quality Pins and Pins with no engagement are removed from training. Then, each Pin is assigned a unique Pin ID. Within the sliding window, training pairs are extracted such that the first Pin is the example and each of the following Pins is its label. Figure 3 illustrates an example session and its training pairs. You can think of each user session as a sentence, with Pins as the words.
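The pair extraction described above can be sketched as follows. The window size is not stated in this post, so `window_size=3` is an assumption, as are the function name and the integer Pin IDs.

```python
def extract_training_pairs(session, window_size):
    """Return (example, label) pairs for one session.

    Each Pin is the example; every Pin that follows it inside the
    window (window_size Pins in total) becomes one of its labels.
    """
    pairs = []
    for i, example in enumerate(session):
        for label in session[i + 1 : i + window_size]:
            pairs.append((example, label))
    return pairs

# A hypothetical session of four engaged Pins, in order of engagement.
session = [101, 102, 103, 104]
pairs = extract_training_pairs(session, window_size=3)
# pairs == [(101, 102), (101, 103), (102, 103), (102, 104), (103, 104)]
```

A larger window ties together Pins that are further apart in a session, at the cost of more training pairs and noisier labels.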

We used a feedforward neural network with a 128-dimensional hidden layer, inspired by word2vec. Figure 4 shows the architecture. The input is a one-hot vector whose size equals the vocabulary, which in our case is tens of millions of Pins. Multiplying it by the hidden layer weight matrix reduces it to a 128-dimensional vector. An ELU activation function is applied after the hidden layer. Finally, the hidden layer output is multiplied by the softmax weight matrix, and cross-entropy is used to calculate the loss. Rather than iterating over tens of millions of Pins, we sampled 64 negative Pins when optimizing the loss. We trained the Pin2Vec embedding on machines with 32 cores and 244GB of memory.
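The forward pass and loss for a single training pair can be sketched in plain Python. This is an illustration under stated assumptions, not the TensorFlow trainer: the sizes are toy values (the post uses 128 dimensions and 64 negatives), negatives are drawn uniformly, and the sampling-probability correction that production sampled-softmax implementations usually apply is omitted. All names are hypothetical.

```python
import math
import random

def elu(x, alpha=1.0):
    """Exponential linear unit."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def sampled_softmax_loss(embed, out_w, example, label, num_neg, rng):
    # A one-hot input times the hidden weight matrix is just a row lookup.
    hidden = [elu(v) for v in embed[example]]
    # Sample negatives uniformly, excluding the true label (an assumption).
    others = [p for p in range(len(embed)) if p != label]
    negatives = rng.sample(others, num_neg)
    candidates = [label] + negatives
    # Score only the true label and the sampled negatives, not the full vocabulary.
    logits = [sum(w * h for w, h in zip(out_w[c], hidden)) for c in candidates]
    # Cross-entropy with the true label at index 0, computed stably.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]

rng = random.Random(0)
VOCAB, DIM, NUM_NEG = 20, 8, 4  # toy sizes, far smaller than in the post
embed = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(VOCAB)]
out_w = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(VOCAB)]
loss = sampled_softmax_loss(embed, out_w, example=3, label=7,
                            num_neg=NUM_NEG, rng=rng)
```

The key saving is in the `logits` line: each step scores `1 + num_neg` Pins instead of the whole vocabulary, which is what makes training over tens of millions of Pins tractable.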

The learned Pin2Vec embedding groups Pins according to Pinners’ recent engagement. Related Pins lie closer to each other in the high-dimensional space. Figure 5 shows a 3D representation of the Pins, with each Pin depicted as a point.

Next, we look up the N nearest Pins for each Pin by distance. We found the results to be more relevant than those from board co-occurrence signals. Figure 6 shows the Related Pins that Pin2Vec found for the cuddling lions. Every Pin is an image of a male and a female lion snuggling up, suggesting Pinners were looking for that exact topic when they saved these Pins. Figure 7 shows the Related Pins for a bottle of wine. Board co-occurrence found only images of bottled wine, whereas Pin2Vec found recommendations for drinks made with wine. This suggests Pinners saved the bottled wine Pin and the wine cocktail Pins close together in time. Pin2Vec helps bridge the gap.
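A brute-force version of the nearest-neighbor lookup might look like the sketch below. The 2D vectors and Pin names are made up for illustration, and a linear scan like this is only workable on toy data; how the lookup is done at the scale of tens of millions of Pins is not described in this post.

```python
import heapq
import math

def nearest_pins(query_id, embeddings, n):
    """Return the n (distance, pin_id) pairs closest to query_id."""
    query = embeddings[query_id]
    scored = (
        (math.dist(query, vec), pin_id)  # Euclidean distance
        for pin_id, vec in embeddings.items()
        if pin_id != query_id
    )
    return heapq.nsmallest(n, scored)

# Hypothetical 2D embeddings (the real space has 128 dimensions).
embeddings = {
    "wine_bottle": (0.0, 0.0),
    "sangria": (0.1, 0.2),
    "mulled_wine": (0.3, 0.1),
    "lion_couple": (5.0, 5.0),
}
neighbors = nearest_pins("wine_bottle", embeddings, n=2)
```

Here the two wine drinks come back as neighbors of the wine bottle, while the unrelated lion Pin is far away in the space.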

Pin2Vec has become an important source for generating candidates, but it doesn’t replace board co-occurrence. In our tests, board co-occurrence performed better for long-tail Pins, which have sparse engagement data. Both board co-occurrence and Pin2Vec are used to generate candidates, while a separate reranking system sorts the candidates based on richer features (we’ll discuss reranking in a separate post). The Euclidean distance to Pin2Vec neighbors will be one of those features.
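One way to picture the two candidate sources feeding a reranker is a merged feature record per candidate. This is purely a hypothetical sketch: the feature names, the input shapes and the merge logic are assumptions, and the actual reranking system is not described in this post.

```python
def merge_candidates(cooccurrence, pin2vec):
    """Union two candidate sources into per-Pin feature records.

    cooccurrence: {pin_id: shared_board_count}
    pin2vec:      {pin_id: Euclidean distance to the query Pin}
    """
    merged = {}
    for pin_id in sorted(set(cooccurrence) | set(pin2vec)):
        merged[pin_id] = {
            "board_cooccurrence": cooccurrence.get(pin_id, 0),
            # None marks a candidate that Pin2Vec did not return.
            "pin2vec_distance": pin2vec.get(pin_id),
        }
    return merged

# Hypothetical candidates for a wine bottle query Pin.
cooccurrence = {"red_wine": 12, "white_wine": 9}
pin2vec = {"sangria": 0.22, "red_wine": 0.41}
features = merge_candidates(cooccurrence, pin2vec)
```

Keeping both signals on each record lets a downstream ranker fall back on board co-occurrence for long-tail Pins where a Pin2Vec distance is missing.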

Pin2Vec is now an important candidate source powering Related Pins, and a big step in improving their relevance. Looking ahead, we’re working on making our models faster and analyzing more signals to better personalize recommendations for Pinners around the world.

If you’re interested in building neural-network-learned embeddings like Pin2Vec, join us!

Acknowledgements: This technology was built in collaboration with Diveesh Singh, David Liu, Raymond Shiau, Mukund Narasimhan, Andrew Zhai, Frances Haugen, Zhigang Zhong, Stephenie Rogers, Derek Cheng, Vanja Josifovski, Dmitry Chechik, Xin Liu, Dmitry Kislyuk and Jenny Liu. People across the whole company helped launch this feature with their insights and feedback.
