Rui Li & Heath Vinicombe | Engineers, Content Knowledge
Every day, people come to Pinterest to shop and find ideas in categories like home and fashion, and it’s our responsibility to show them the most relevant and inspiring products. We measure relevance in a few different dimensions, but most importantly, we need to show Pinners results that match their expectations. For example, they wouldn’t want to see boots when they search for sandals.
We’ve outlined our work in classifying Pins with the correct entity (sandals vs. boots) through Pin2Interest, which provides high-quality labels for Pins that align with our rich taxonomy of interests. However, we’ve learned there are also instances where shoppers want help identifying attributes in their search (for example, a shopper might specifically be looking for a “purple empire waist dress” for her prom). If we understand the specific attributes of each Pin, we can return even more relevant results.
We built a machine learning system with the help of fashion specialists who developed a detailed guide of fashion terms and descriptions. The system has resulted in greater accuracy in results and better recommendations.
Without attribute understanding, an attribute keyword is treated no differently from any other keyword, which makes the results noisy. For example, a search for “pink tennis shoe outfit” in Product Pins returned some results that are not pink at all.
Ultimately, the goal here is to understand our Pins at entity + attribute levels:
How do we do it?
In our first phase, we focused on seven objective fashion attributes and enabled them for relevant categories in Women’s Fashion, one of our top product verticals:
- Dress Length
- Dress Style
We have released the following for each of these seven attributes:
- Attribute values
- Labeling guide and high-quality human-labeled data
- A Pin level classifier called Pin2Attribute to tag attributes to each Pin
Defining attribute values:
We collaborated with fashion specialists to define the values for each attribute. They chose attributes that are useful from both an inventory perspective and a shopper perspective. The values are defined at a fine granularity, e.g. “geometric” and “watercolor” for the “pattern” attribute. On average, we defined 15–30 values for each attribute (excluding brand), which lets us understand the attributes at a fine-grained level when needed.
High quality labeled data collection:
Like many other companies, we use crowdsourcing extensively for labeled data collection across teams. Crowdsourcing works well for simple tasks such as, “Is this picture relevant to this query?”, but attribute labeling needed more expert input. Because crowdsourced labelers lack fashion domain knowledge, we weren’t able to get the high-quality labels we needed from crowdsourcing alone, so we partnered with fashion specialists on these more complicated labeling tasks.
To give the labelers enough relevant information, the fashion specialists provided elaborate labeling guidance with both text definitions and example images, as well as more context for confusing cases.
Here is a typical definition page (for attribute “Dress Style”):
Here’s a page explaining the difference between “Sheath” and “A Line”:
This detailed labeling guide enabled us to obtain high quality labeled data, resulting in a >95% agreement rate and >95% labeling accuracy.
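As a rough illustration of the two figures above, here is a minimal sketch of how an agreement rate and a labeling accuracy could be computed. The definitions are assumptions, since they are not spelled out here: agreement is taken as the share of individual votes that match their item's majority label, and accuracy as the share of items whose majority label matches a specialist-provided gold label. The function names and sample votes are hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Return the most common label among one item's labeler votes."""
    return Counter(labels).most_common(1)[0][0]


def agreement_rate(votes_per_item):
    """Share of individual votes that match their item's majority label (assumed definition)."""
    total = sum(len(votes) for votes in votes_per_item)
    agreeing = sum(
        sum(1 for label in votes if label == majority_label(votes))
        for votes in votes_per_item
    )
    return agreeing / total


def labeling_accuracy(votes_per_item, gold_labels):
    """Share of items whose majority label matches the gold label (assumed definition)."""
    correct = sum(
        1
        for votes, gold in zip(votes_per_item, gold_labels)
        if majority_label(votes) == gold
    )
    return correct / len(gold_labels)


# Hypothetical votes from three labelers on four dresses.
votes = [
    ["Sheath", "Sheath", "Sheath"],
    ["A Line", "A Line", "Sheath"],
    ["A Line", "A Line", "A Line"],
    ["Sheath", "Sheath", "Sheath"],
]
gold = ["Sheath", "A Line", "A Line", "Sheath"]

print(agreement_rate(votes))           # 11 of 12 votes agree with the majority
print(labeling_accuracy(votes, gold))  # all four majority labels match gold
```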
Using a Pin-level keyword extraction system called annotations, we built a prototype for our Pin2Attribute classifier. We began by piggybacking on the annotation results and text matching. The idea is quite simple: if the annotation contains a keyword that’s one of the defined attribute values, e.g. “red”, we tag the Pin with this attribute: Color = red. We also used a simple heuristic scoring function that is tuned based on text sources.
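The text-matching idea can be sketched in a few lines. This is only an illustration under stated assumptions: the attribute vocabulary, the text source names and the per-source weights below are hypothetical stand-ins, and the real scoring heuristic is not described in detail here.

```python
import re

# Hypothetical vocabulary; the real value lists were defined by fashion specialists.
ATTRIBUTE_VALUES = {
    "color": {"red", "pink", "purple"},
    "pattern": {"geometric", "watercolor"},
}

# Hypothetical per-source weights standing in for the tuned heuristic.
SOURCE_WEIGHTS = {"title": 1.0, "description": 0.6, "board_name": 0.3}


def tag_attributes(annotations):
    """Score attribute values found in a Pin's annotations.

    annotations: list of (keyword, text_source) pairs for one Pin.
    Returns {attribute: {value: score}}; empty when no keyword matches.
    """
    scores = {}
    for keyword, source in annotations:
        weight = SOURCE_WEIGHTS.get(source, 0.1)  # small default for unknown sources
        tokens = set(re.findall(r"[a-z]+", keyword.lower()))
        for attribute, values in ATTRIBUTE_VALUES.items():
            for value in tokens & values:
                by_value = scores.setdefault(attribute, {})
                by_value[value] = by_value.get(value, 0.0) + weight
    return scores


def best_values(scores):
    """Pick the highest-scoring value for each attribute."""
    return {attr: max(vals, key=vals.get) for attr, vals in scores.items()}


annotations = [
    ("red dress", "title"),
    ("geometric print", "description"),
    ("pink", "board_name"),
]
scores = tag_attributes(annotations)
print(best_values(scores))  # {'color': 'red', 'pattern': 'geometric'}
```

A Pin whose annotations contain none of the defined values comes back with an empty result, which is exactly the limitation of a purely text-based approach.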
The text-based solution yields decent performance on key attributes, such as 70% precision for “color”. However, it has one major problem: low coverage. If a Pin doesn’t contain any of the attribute keywords, no attribute is extracted at all, so the coverage of the text-based color classifier is only 50%.
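A minimal sketch of how the two metrics could be computed, assuming coverage means the share of Pins that receive any value for the attribute and precision means the share of tagged Pins whose value is correct. Both definitions and the sample data are assumptions for illustration.

```python
def coverage_and_precision(predictions, gold_labels):
    """Compute (coverage, precision) for one attribute.

    predictions: predicted value per Pin, or None when nothing was extracted.
    gold_labels: human-labeled value per Pin, in the same order.
    """
    covered = [(p, g) for p, g in zip(predictions, gold_labels) if p is not None]
    coverage = len(covered) / len(predictions)
    if not covered:
        return coverage, 0.0
    precision = sum(1 for p, g in covered if p == g) / len(covered)
    return coverage, precision


# Hypothetical sample: two of four Pins have no color keyword in their text.
predictions = ["red", None, "pink", None]
gold_labels = ["red", "pink", "purple", "red"]

coverage, precision = coverage_and_precision(predictions, gold_labels)
print(coverage, precision)  # 0.5 0.5
```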
We’ve invested years of work in image understanding, so we were able to leverage the unified visual embedding built by our visual search team and build a text + visual fused classifier on top of it. At this stage, it’s a simple shallow model. The input is a concatenation of the text-based predictions, encoded as a categorical feature (16 dimensions for “color”), and the unified visual embedding (2048 dimensions). Then there’s one fully connected hidden layer and an output softmax layer, which outputs the probability of each attribute value (16 dimensions for “color”):
This simple shallow neural network yields 100% coverage, which solves our coverage issue. In addition, it improved the precision to 79%. We have productized the visual classifier and are working on productizing pattern, texture, etc.
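The shape of such a fused model can be sketched with a plain-Python forward pass. The input and output sizes come from the description above; the hidden layer size, the ReLU activation and the random, untrained weights are assumptions for illustration, not the production model.

```python
import math
import random

TEXT_DIM = 16      # text-based "color" predictions
VISUAL_DIM = 2048  # unified visual embedding
HIDDEN_DIM = 32    # assumption: the hidden layer size is not given
NUM_COLORS = 16    # one output probability per color value


def init_layer(rng, n_in, n_out):
    """Create random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x):
    """Apply a fully connected layer: W x + b."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_color(hidden_layer, output_layer, text_scores, visual_embedding):
    """Concatenate text and visual features, then apply hidden layer and softmax."""
    fused = list(text_scores) + list(visual_embedding)  # 16 + 2048 = 2064 dims
    return softmax(dense(output_layer, relu(dense(hidden_layer, fused))))


rng = random.Random(0)
hidden_layer = init_layer(rng, TEXT_DIM + VISUAL_DIM, HIDDEN_DIM)
output_layer = init_layer(rng, HIDDEN_DIM, NUM_COLORS)

text_scores = [0.0] * TEXT_DIM  # all zeros: no color keyword found in the text
visual_embedding = [rng.gauss(0.0, 1.0) for _ in range(VISUAL_DIM)]  # stand-in embedding
probs = predict_color(hidden_layer, output_layer, text_scores, visual_embedding)
print(len(probs), round(sum(probs), 6))
```

Because the visual embedding is always available, the model produces a probability for every color value even when the text features are all zeros.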
Here is an example of the Pin2Attribute results for a Pin (with unnormalized scores):
If we look at the example we presented at the beginning of this post, 25% of the search results were not relevant in terms of color. After adding color similarity and brand similarity to the search ranking model, we saw a relevance improvement for products.
The image below shows the control group (on the left) and the treatment group (on the right). You can see that in the treatment group, 100% of the Pins are pink, showing the contribution of the color attribute. The overall relevance improvement from using color and brand is 1.4%. Relevance is measured by comparing offline human evaluations of the control and treatment search results for a given set of queries.
Another major surface for shopping discovery is Related Products, the recommendations that appear below a selected subject Pin. Adopting the seven attributes has improved engagement, raising overall impressions by 1%, repins by 2.7%, overall Product Pin impressions by 0.7% and repins by 1.4%.
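To make the idea of a color similarity signal concrete, here is a hypothetical sketch. It assumes the signal is the probability a Pin's color distribution assigns to the color named in the query, and that it is added to a base relevance score with a fixed weight. The actual ranking model and its features are not described here, so every name, weight and score below is illustrative.

```python
def color_similarity(query_color, pin_color_probs):
    """Probability mass the Pin's color distribution puts on the query's color."""
    if query_color is None:
        return 0.0
    return pin_color_probs.get(query_color, 0.0)


def rerank(results, query_color, weight=0.5):
    """Order Pin ids by base score plus a weighted color similarity.

    results: list of (pin_id, base_score, pin_color_probs) tuples.
    weight: hypothetical blend weight, not a tuned value.
    """
    def score(result):
        _, base_score, probs = result
        return base_score + weight * color_similarity(query_color, probs)

    return [pin_id for pin_id, _, _ in sorted(results, key=score, reverse=True)]


# Hypothetical candidates for the query "pink tennis shoe outfit".
results = [
    ("pin_a", 0.80, {"white": 0.9, "pink": 0.1}),
    ("pin_b", 0.75, {"pink": 0.95, "white": 0.05}),
    ("pin_c", 0.60, {"pink": 0.7, "red": 0.3}),
]

print(rerank(results, None))    # ['pin_a', 'pin_b', 'pin_c']
print(rerank(results, "pink"))  # ['pin_b', 'pin_c', 'pin_a']
```

With no color in the query the base order is unchanged; with “pink” in the query, the mostly white Pin drops below the two pink ones.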
Below is a qualitative example:
The top row of Pins is the “control” group, with the leftmost Pin as the subject Pin and the ones next to it as the Related Product Pins. Here we can see that with a “red dress” as the subject, three “related products” aren’t red. After adopting the seven attributes, only one “related product” isn’t red. That remaining miss comes from the Pin2Attribute classifier’s precision being below 100%, which shows we have room to improve.
While Pinterest is great for exploring ideas like “prom dress”, “outfit for working”, “winter outfit” and “bohemian look”, we’re working on ways for results to become even more specific when searches allow for it. Attributes such as occasion, season and style, which go beyond the previous seven attributes, make this doable.
In phase two, we’re going to test out these subjective attributes. We’ll be using a different approach than the text + visual based classifier for Pin2Attribute classification. Stay tuned.
In this post, we presented how Pinterest approaches attribute understanding for products at an early stage. We’ve shown that by understanding seven objective attributes of fashion products with a simple text + visual classifier, without extensive model tuning, we can improve relevance for both product search and related products.
This gives us the confidence to invest more in this project by turning our attention to:
- enabling more subjective attributes
- expanding to new verticals, e.g. home decor and food and drinks.
Great thanks to everyone who has contributed to this project: our EM Yunsong Guo; our PMs Angela Guo, Miwa Takaki and Mohak Nahta; our engineers Ruben Sipos, Avinash Nayak, Monadhika Sharma, Olafur Gudmundsson, Neng Gu, Eric Kim, Raymond Shiau, Eileen Li, Paul Baltescu, Yiwei Sun, Emaad Ahmed Manzoor (intern), Hanzi Mao (intern), Mikayla Timm (intern); our fashion specialists and labeling team: Mariellen Barros, Serena Perfetto, Lauren Goodnow, Marta Scotto and Jade Feltwell. Also special thanks to Yu Liu, Tim Weingarten, Jeff Harris, Chuck Rosenberg, Kunlong Gu, Troy Ma, Edmarc Hedrick, Stephanie Rogers, Lulu Cheng, Helene Labriet-Gross and Kate Taylor for their support and help.
We’re building the world’s first visual discovery engine. More than 300 million people around the world use Pinterest to dream about, plan and prepare for things they want to do in life. Come join us!