Building a more inclusive way to search
Laksh Bhasin | Pinterest engineer, Search Quality
Every day, tens of millions of people search for ideas on Pinterest, whether it’s a recipe to cook for dinner that night or a new hairstyle to try over the weekend. According to a Pinterest study, 70 percent of people use Pinterest to discover and save everyday looks and styles they want to try. With more than eight billion beauty and hair ideas saved, we’ve been testing a new feature to help Pinners more easily find relevant beauty ideas in search. Today we’re starting to roll out the beta version, which enables you to narrow beauty results by a skin tone range. In this post we’ll cover how we built and implemented a more inclusive search experience.
More than 200 million people use Pinterest every month. Our product reflects the global interests and tastes of people all over the world, with more than 100 billion ideas to explore. However, it’s not always easy to find the most relevant results. The majority of queries on Pinterest are fewer than three words, which presents an interesting serving challenge. In addition, our current ranking algorithms are heavily influenced by what the majority of people have engaged with over time. This means that some Pinners have had to work harder to find what they were looking for.
Any search engineer knows how important it is to make the experience seamless and easy to use, so that a user re-queries as few times as possible. We heard from Pinners that they couldn’t always find what they were looking for when searching for hair and beauty ideas, so we wanted to address this problem, starting with skin tone ranges, which let you customize your search results by a range of skin tones. We’re starting with four palettes, each representing a range of skin tones. As our technology improves and we gather more feedback from Pinners, we plan to expand to more skin tones.
Detecting a skin tone in an image is a challenging problem, because it depends a lot on lighting, shadows, how prominent a face is, blurriness, and many other factors. The most accurate way to detect a skin tone in a Pin’s image would be to have a human label every single Pin image according to a scientific skin tone palette. But with billions of unique images, and many more created each day, that’s an expensive approach that would be hard to scale.
Instead, we use a method that works at scale — machine learning. It’s not always perfect, but feedback from Pinners has been encouraging so far.
In order to move quickly, we used a third-party Face AI library from ModiFace, a company specializing in augmented reality and machine learning for beauty applications. Using deep neural networks, ModiFace built a skin tone detection algorithm, and together we went through several iterations to improve it, especially for images with poor lighting and prominent shadows. For example, the initial model detected the image below as a dark skin tone, because the lighting and shadows in the image are difficult for a machine learning algorithm to pick up on.
One way we generated more training data to correct the machine learning algorithm was by fetching search results and passing them through Sofia, our human evaluation platform.
There were also a number of other factors to consider. For example, not every Pin contains a face, so we do some preliminary filtering based on a Pin’s category. We distribute the backfill process by running a Spark job across multiple worker nodes, while being careful not to send too much traffic at once to Amazon S3. To keep the backfill relatively fast, we use smaller image sizes, which speeds up the skin tone detection algorithm even though the algorithm itself is largely size-independent. There’s a definite tradeoff between speed and accuracy, which we’ll continue to improve.
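The Spark job itself is internal, but the throttling idea can be sketched in plain Python. In this sketch, `fetch_image`, `detect_skin_tone`, and the worker and in-flight counts are all hypothetical stand-ins for the real image store client and the ModiFace-based detector; a semaphore caps concurrent fetches so detection threads don’t flood the image store.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Semaphore

def backfill(pin_ids, fetch_image, detect_skin_tone,
             max_workers=16, max_in_flight=4):
    """Run skin tone detection over one worker's share of the backfill.

    `fetch_image` and `detect_skin_tone` are stand-ins for the real image
    store client and detector. The semaphore caps how many fetches are in
    flight at once, so many detection threads can run without sending too
    much traffic to the image store (S3 in our case).
    """
    fetch_gate = Semaphore(max_in_flight)
    results = {}

    def process(pin_id):
        with fetch_gate:                 # throttle concurrent fetches
            image = fetch_image(pin_id)  # fetch a smaller image size
        return pin_id, detect_skin_tone(image)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for pin_id, tone in pool.map(process, pin_ids):
            results[pin_id] = tone
    return results
```

In the real pipeline each Spark worker would process its own partition of Pin ids this way, writing detected tones back to the index.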
As new Pins are added to the system, we incrementally run the skin tone detection algorithm on just those new Pins, so we continue to increase our coverage of skin tone data and improve results.
When we run the skin tone detection algorithm on each image, we convert the RGB color output into the CIE L*a*b* color space. This color space has a lightness axis L* and two color axes, a* (green-red) and b* (blue-yellow). Complexions typically fall into a specific subspace of the a*b* plane, and we use the L* coordinate to select different skin tones, from lighter skin tones (high L*) to darker skin tones (low L*). Since many of the images on Pinterest are high quality photos with good lighting, we used the D65 illuminant (daylight) with a 2-degree viewing angle.
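For reference, the standard sRGB to CIE L*a*b* conversion (D65 illuminant, 2-degree observer) can be written out directly. The constants below are the published sRGB/CIE values, not Pinterest-specific code; a production system would use a vetted color library instead.

```python
def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB color to CIE L*a*b* (D65, 2-degree observer)."""
    # Normalize to [0, 1] and undo the sRGB gamma encoding.
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear RGB -> XYZ using the standard sRGB matrix (D65 white point).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # XYZ -> L*a*b* relative to the D65 reference white.
    xn, yn, zn = 0.95047, 1.00000, 1.08883

    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16          # L*: 0 (black) to 100 (white)
    a_star = 500 * (fx - fy)           # a*: green (-) to red (+)
    b_star = 200 * (fy - fz)           # b*: blue (-) to yellow (+)
    return lightness, a_star, b_star
```

With this conversion, bucketing detected skin pixels by L* is a simple threshold comparison.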
Serving skin tone ranges
We built much of the serving and logging logic for skin tone ranges on previous work we did for recipe filters, which let Pinners search for recipes that match their dietary preferences.
This initial version of skin tone ranges filters content based on skin tone darkness or lightness using four ranges, each with some amount of overlap. On the front end, skin tone ranges are shown to users as quadrants, so that it’s clear each palette actually encompasses a range of skin tones. To ensure a good experience for Pinners, skin tone ranges are currently only shown for a predetermined list of common hair and beauty queries.
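The filtering step can be sketched as interval checks over L*. The boundary values below are hypothetical (the real ranges are tuned internally); the key property is that adjacent ranges overlap, so one Pin can match two ranges.

```python
# Hypothetical L* boundaries for four overlapping skin tone ranges,
# from darkest (1) to lightest (4). Real boundaries are tuned internally.
SKIN_TONE_RANGES = {
    1: (0.0, 45.0),
    2: (35.0, 60.0),
    3: (50.0, 75.0),
    4: (65.0, 100.0),
}

def matching_ranges(lightness):
    """Return the ids of every range whose interval covers this L* value."""
    return [rid for rid, (lo, hi) in sorted(SKIN_TONE_RANGES.items())
            if lo <= lightness <= hi]

def filter_pins(pins, selected_range):
    """Keep only pins whose detected skin L* falls in the selected range.

    Pins without a detected skin tone (no face found) are dropped here;
    in practice they may be handled separately.
    """
    lo, hi = SKIN_TONE_RANGES[selected_range]
    return [p for p in pins
            if p.get("skin_l") is not None and lo <= p["skin_l"] <= hi]
```

Because the ranges overlap, a Pin with L* = 40 in this sketch would surface for both of the two darkest palettes.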
Pinterest is about more than just finding inspiring images (in this case, of people sporting hair and beauty ideas). Pinners expect to find actionable content, like beauty products and tutorials (we take a similar approach for Lens results). In order to make our results more actionable, we blend different types of Pins with the appropriate skin tones into results using our query rewriting and understanding framework.
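One simple way to blend content types, shown here purely as an illustration (the slot pattern and source names are hypothetical, not our production blending logic), is to interleave ranked lists following a repeating slot pattern:

```python
def blend_results(sources, pattern):
    """Interleave ranked result lists following a repeating slot pattern.

    `sources` maps a slot name (e.g. "idea", "product", "tutorial") to its
    ranked pins. Exhausted slots are skipped; anything the pattern can no
    longer emit is flushed at the end in order.
    """
    queues = {name: list(pins) for name, pins in sources.items()}
    blended, i, stalls = [], 0, 0
    while any(queues.values()) and stalls < len(pattern):
        slot = pattern[i % len(pattern)]
        i += 1
        if queues.get(slot):
            blended.append(queues[slot].pop(0))
            stalls = 0
        else:
            stalls += 1            # this slot's source is empty; move on
    for q in queues.values():      # flush anything the pattern missed
        blended.extend(q)
    return blended
```

A pattern like `("idea", "idea", "product", "idea", "tutorial")` would keep most slots for inspirational Pins while reserving regular positions for actionable content.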
We also rewrite queries to ensure more engaging results for each specific skin tone range. If a user searches for “makeup” and chooses a darker skin tone range, that tells us they’re looking for deeper-toned makeup ideas, and enables us to refine our search ranking logic to better meet their expectations.
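At its simplest, this kind of rewriting appends refinement terms tied to the chosen range. The mapping below is a hypothetical illustration; the real rules live in our query understanding framework.

```python
# Hypothetical refinement terms per skin tone range (1 = darkest).
RANGE_TERMS = {
    1: "deep skin tone",
    2: "medium deep skin tone",
    3: "medium skin tone",
    4: "fair skin tone",
}

def rewrite_query(query, skin_tone_range=None):
    """Append skin-tone refinement terms when the Pinner picks a range."""
    if skin_tone_range is None:
        return query
    return f"{query} {RANGE_TERMS[skin_tone_range]}"
```

The rewritten query then flows through the normal retrieval and ranking stack, alongside the hard L*-based filter.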
It’s important that Pinners know we respect their privacy. That’s why if you tap a skin tone range, we do not store this information or use it to build a profile for you. This means you’ll need to tap a skin tone range each time you search. We also don’t use this information to target ads. We do not attempt to predict a user’s personal information, such as ethnicity.
Once this beta experiment rolls out to all Pinners, our future work will generally focus on improving the accuracy of results and bringing the experience to more platforms. We’ll try out new methods of query rewriting and blending in actionable content and hope to improve our search ranking models in order to better take into account the selected skin tone. To make experimentation easier with new skin tone detection algorithms, we’ll have to make a few backend changes to allow us to index multiple detected skin tones and run A/B experiments more easily.
Lastly, skin tones are just the start of building a more inclusive search. We hope to help Pinners find more personalized results by offering more ways to narrow your search.
We’re always working on improving our system to give Pinners a more personalized search experience. If you like working on search problems like this, join our team!
Acknowledgements: This project was built by a cross-functional team of passionate Pinployees who care deeply about inclusion and diversity, and wish to scale these values for Pinners. Thank you to Rahim Daya, Xixia Wang, Stephanie Rogers, Frances Haugen, Candice Morgan, Karen Gomez, Larkin Brown, Antonio Alucema, Andrey Gusev, Christina Lin and Aaron Ru.