Ultra Personalization: Where Machine Learning Meets Human Expertise

Published in CODE + CONTOUR by IPSY, Sep 2, 2021

Personalizing the member experience (and innovating new ways to do so) is at the heart of all our BFA brands. We continually ask ourselves how we can offer the very best personalized experience, and in August 2020, this was our answer: Ultra Personalization.

Ultra Personalization is a feature that gives IPSY Glam Bag Plus members more control over their bags, allowing them to build their perfect bag every month by choosing three of their five full-size items. Here is how it works: we select the first two bag items using our proprietary allocation algorithm, IPSY Match. Next, we personalize an assortment of additional products from which members choose the remaining three.

Now, one year after the launch of Ultra Personalization, Senior Product Manager Barbara Evangelista is here to walk us through the technical overview of this massive beauty operation. Read on to learn more about how we couple machine learning with human expertise to create unprecedented personalization.

Every month, from the 2nd to the 3rd, Ipsters subscribed to Glam Bag Plus participate in Ultra Personalization.

The Data

A feature like Ultra Personalization requires a massive amount of data, and it all starts with a member’s Beauty Profile: a unique profile created the moment a member takes the Beauty Quiz and joins IPSY. Each Beauty Profile leverages hundreds of data points about a member’s preferences, and since members are also able to update their profiles as often as they’d like, we can observe changes they make to their preferences over time. We also use data gathered from product and bag reviews, which provide valuable feedback about products members have been matched with, added to their bag, or chosen on their own.

Information about the products themselves comes into play as well. Thanks to the expertise of our talented Merchandising teams, we have product information, or metadata, ranging from the simple (category, brand, or shade) to the complex (ingredients, packaging, or format), that can be mapped to member preferences directly through the Beauty Profile. This metadata helps us understand patterns in member behavior gathered from various touchpoints, from site engagement to ratings and purchases. Take a red lipstick, for example. For a simple approach, we could observe how a member rated this product. Or we could take it further by calculating the member’s affinity at broader levels: for the “lipstick” category, for a specific brand of lipstick, or for a specific shade across lip products in general.
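
To make the red lipstick example concrete, here is a minimal sketch of how affinities at different levels could be derived from ratings. The review records, the attribute names, and the use of a plain average are assumptions for illustration, not a description of our production pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical reviews for one member: (product metadata, rating from 1 to 5).
reviews = [
    ({"category": "lipstick", "brand": "BrandA", "shade": "red"}, 5),
    ({"category": "lipstick", "brand": "BrandB", "shade": "nude"}, 3),
    ({"category": "lip gloss", "brand": "BrandA", "shade": "red"}, 4),
    ({"category": "mascara", "brand": "BrandC", "shade": "black"}, 2),
]

def affinities(reviews, attribute):
    """Average the member's ratings, grouped by one metadata attribute."""
    grouped = defaultdict(list)
    for product, rating in reviews:
        grouped[product[attribute]].append(rating)
    return {value: mean(ratings) for value, ratings in grouped.items()}

category_affinity = affinities(reviews, "category")  # e.g. all lipsticks
brand_affinity = affinities(reviews, "brand")        # e.g. everything from BrandA
shade_affinity = affinities(reviews, "shade")        # e.g. red across lip products
```

The same ratings produce a different signal depending on the attribute used for grouping, which is why richer metadata gives more ways to describe what a member likes.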

Product metadata also helps us identify products that may not be viable matches for members. Some products, like foundations, concealers, or brow products, are targeted to specific skin tones or hair colors. Additionally, there are products like aerosol hair sprays or CBD lotions, which have shipping and age restrictions, respectively. Product metadata enables us to filter these products out for members they don’t suit, ensuring they are displayed only to the appropriate members during the Ultra Personalization experience.
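
A filter of this kind might look like the sketch below. The field names, the age threshold of 21, and the shipping flag are hypothetical placeholders chosen for the example; the real restrictions are defined by the product metadata.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    name: str
    skin_tones: set = field(default_factory=set)  # empty set: no skin tone targeting
    min_age: int = 0                              # hypothetical age restriction
    restricted_shipping: bool = False             # e.g. aerosols

@dataclass
class Member:
    skin_tone: str
    age: int
    can_receive_restricted_shipping: bool

def is_eligible(product, member):
    """Return True only if no metadata rule excludes this member."""
    if product.skin_tones and member.skin_tone not in product.skin_tones:
        return False
    if member.age < product.min_age:
        return False
    if product.restricted_shipping and not member.can_receive_restricted_shipping:
        return False
    return True

assortment = [
    Product("foundation (deep)", skin_tones={"deep"}),
    Product("CBD lotion", min_age=21),
    Product("aerosol hair spray", restricted_shipping=True),
    Product("mascara"),
]
member = Member(skin_tone="light", age=19, can_receive_restricted_shipping=True)
eligible = [p.name for p in assortment if is_eligible(p, member)]
```

For this member, the deep-toned foundation and the CBD lotion are filtered out, while the hair spray and the mascara remain.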

The Personalization Process

For the Ultra Personalization experience, we want to give members plenty of options to choose from. In a typical month, we have over 50 products available in the Glam Bag Plus assortment, so we split the total assortment into three mutually exclusive sets, one for each of the three choice product spots in a member’s bag. Each set is balanced by category and brand for maximum variety and is then further personalized to every member.
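
One simple way to build mutually exclusive, roughly balanced sets is to sort the assortment by category and brand and then deal products out in turn, like cards. This is an illustrative heuristic under that assumption, not necessarily the balancing method we use.

```python
def split_into_sets(products, n_sets=3):
    """Deal products into n_sets disjoint sets, spreading categories and brands."""
    # Sorting groups similar products together, so round-robin dealing
    # sends neighbours (same category, then same brand) to different sets.
    ordered = sorted(products, key=lambda p: (p["category"], p["brand"]))
    sets = [[] for _ in range(n_sets)]
    for index, product in enumerate(ordered):
        sets[index % n_sets].append(product)
    return sets

# Hypothetical mini assortment.
assortment = [
    {"id": 1, "category": "lipstick", "brand": "BrandA"},
    {"id": 2, "category": "mascara", "brand": "BrandB"},
    {"id": 3, "category": "lipstick", "brand": "BrandC"},
    {"id": 4, "category": "mascara", "brand": "BrandA"},
    {"id": 5, "category": "lipstick", "brand": "BrandB"},
    {"id": 6, "category": "mascara", "brand": "BrandC"},
]
choice_sets = split_into_sets(assortment)
```

Each of the three sets ends up with one lipstick and one mascara, and no product appears in more than one set.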

An essential part of making Ultra Personalization as delightful as possible is being able to quantify each member’s affinity for every product. By transforming the data described above to the member-product level for all historical pairs of members and products, we can train a model to predict whether a member will or will not choose a product. We can then use this model to produce a score for every upcoming member-product pair, given a new assortment of products, which is why we call this model the “scoring model.”
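
As a rough illustration of the idea, the sketch below trains a tiny logistic regression on historical member-product pairs and then scores upcoming pairs. The two features, the toy data, and the choice of logistic regression are all assumptions made for the example; the text above does not specify the model family.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Probability that the member chooses the product."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train_scoring_model(rows, labels, epochs=500, learning_rate=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, chosen in zip(rows, labels):
            error = predict(weights, bias, features) - chosen
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical historical pairs: [category affinity (0 to 1), product is new (0 or 1)].
history = [[0.9, 1.0], [0.8, 0.0], [0.2, 1.0], [0.1, 0.0], [0.7, 1.0], [0.3, 0.0]]
chosen = [1, 1, 0, 0, 1, 0]  # 1 if the member chose the product

weights, bias = train_scoring_model(history, chosen)

# Score upcoming member-product pairs from a new assortment.
upcoming = {"liquid lipstick": [0.95, 0.0], "brow gel": [0.05, 0.0]}
scores = {name: predict(weights, bias, f) for name, f in upcoming.items()}
```

The product in a category the member has favoured historically receives the higher score, which is the ordering signal the later steps rely on.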

Since we can train models optimized for various behaviors, the scoring model is the perfect opportunity for A/B testing. For example, we might train the model on churn, looking to answer whether a member will cancel their subscription after receiving a product. We can also train on product ratings, determining how a member will rate a product on a scale of one to five.
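
One common way to run such a test is to assign each member deterministically to a model variant by hashing their ID. The sketch below assumes that approach; the experiment name and variant names are hypothetical, and the text above does not describe how assignment actually works.

```python
import hashlib

# Hypothetical variants: the same scoring step, trained on different targets.
VARIANTS = ["choice_model", "churn_model", "rating_model"]

def assign_variant(member_id, experiment="scoring-model-test", variants=VARIANTS):
    """Map a member to a variant; the same member always gets the same one."""
    key = f"{experiment}:{member_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]

assignments = {member_id: assign_variant(member_id) for member_id in range(1000)}
```

Including the experiment name in the hash keeps assignments stable within one test while letting a later test reshuffle members independently.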

For every member, we create a “collection” of products, in other words, three sets of products ordered by the output of the scoring model. We eventually pass this collection to the site to be displayed during Ultra Personalization. During the experience, we will surface four to six products per choice product spot (for a total of 12 to 18 products) to ensure a streamlined and user-friendly mobile experience.
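
In code, building a collection could be as simple as the sketch below: sort each of the three sets by score and keep the top few products per spot. The product names, the scores, and the `per_spot` parameter are made up for illustration.

```python
def build_collection(product_sets, scores, per_spot=4):
    """Order each set by descending score and keep the top products per spot."""
    return [
        sorted(products, key=lambda name: scores[name], reverse=True)[:per_spot]
        for products in product_sets
    ]

# Hypothetical scores for one member, and three mutually exclusive sets.
scores = {
    "lipstick": 0.91, "mascara": 0.45, "serum": 0.78, "blush": 0.60, "toner": 0.30,
    "eyeliner": 0.82, "primer": 0.51, "bronzer": 0.66, "cleanser": 0.72, "gloss": 0.40,
    "palette": 0.88, "highlighter": 0.59, "mask": 0.35, "brow gel": 0.20, "balm": 0.64,
}
product_sets = [
    ["lipstick", "mascara", "serum", "blush", "toner"],
    ["eyeliner", "primer", "bronzer", "cleanser", "gloss"],
    ["palette", "highlighter", "mask", "brow gel", "balm"],
]
collection = build_collection(product_sets, scores, per_spot=4)
total_displayed = sum(len(spot) for spot in collection)
```

With four products per spot, the member sees 12 products in total, the lower end of the 12 to 18 range.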

While these scores may be able to predict whether a member will or will not choose a product with strong confidence (based on model KPIs, such as AUC, RMSE, or NDCG, depending on what type of model we use), we don’t rely solely on these scores to create our members’ product collections. Why? Because we risk curating a collection from the assortment that isn’t actually as exciting as the score suggests.
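
For readers unfamiliar with AUC, one of the KPIs mentioned above, it can be computed as the share of (chosen, not chosen) pairs in which the chosen product received the higher score. The sketch below uses that pairwise definition with made-up labels and scores.

```python
def auc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical outcomes (1 = chosen) and model scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3]
result = auc(labels, scores)  # 3 of the 4 pairs are ordered correctly: 0.75
```

A high AUC says the ranking is accurate pair by pair, but it says nothing about how varied the top of the list is, which is the gap described next.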

Here’s an example: Let’s say we have a member who indicates they want to receive liquid lipsticks as often as possible, and there happen to be three different shades of the same product in a single set of choice products. Our scoring model might rank all three shades as the top products in that set, which, when displayed onsite, would place them in the most valuable real estate in the Ultra Personalization interface. Showing three products out of the four displayed with minimal variety except in shade is not the personalized beauty discovery experience we are aiming for. In fact, it would seem we are forcing the member to choose a liquid lipstick, where the only true choice is the shade.

To mitigate this, we work with our merchandise and planning teams to produce a set of criteria (based on NPS surveys, product and bag reviews, and comments) to guide what products we display to members. For example, we’ve found that members prefer to see products that are new to IPSY or diverse in terms of brand or category, so we built these rules into an algorithm intended to create diverse collections consisting of products we know the member will love as well as new items that can surprise and delight.
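
A greedy re-ranking is one way such rules could be applied on top of the model scores. In the sketch below, each pick is penalized for repeating a category or brand already selected and rewarded for being new to IPSY. The penalty and bonus values, field names, and products are hypothetical, and this is not a description of our actual algorithm.

```python
def diversify(products, k=4, new_bonus=0.1, category_penalty=0.15, brand_penalty=0.1):
    """Greedily pick k products, trading raw score against variety."""
    selected = []
    remaining = list(products)
    while remaining and len(selected) < k:
        def adjusted(product):
            same_category = sum(s["category"] == product["category"] for s in selected)
            same_brand = sum(s["brand"] == product["brand"] for s in selected)
            bonus = new_bonus if product["is_new"] else 0.0
            return (product["score"] + bonus
                    - category_penalty * same_category
                    - brand_penalty * same_brand)
        best = max(remaining, key=adjusted)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical set: three shades of one liquid lipstick dominate the raw scores.
choice_set = [
    {"name": "liquid lipstick (red)", "category": "liquid lipstick", "brand": "BrandA", "is_new": False, "score": 0.95},
    {"name": "liquid lipstick (pink)", "category": "liquid lipstick", "brand": "BrandA", "is_new": False, "score": 0.94},
    {"name": "liquid lipstick (nude)", "category": "liquid lipstick", "brand": "BrandA", "is_new": False, "score": 0.93},
    {"name": "mascara", "category": "mascara", "brand": "BrandB", "is_new": False, "score": 0.80},
    {"name": "serum", "category": "skincare", "brand": "BrandC", "is_new": True, "score": 0.75},
    {"name": "blush", "category": "blush", "brand": "BrandD", "is_new": False, "score": 0.70},
]
displayed = [p["name"] for p in diversify(choice_set, k=4)]
```

Ranking by raw score alone would fill three of the four slots with liquid lipstick shades. With these weights, the top-scoring shade still leads, followed by the new serum, the mascara, and the blush.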

The Takeaway

At BFA, we are passionate about constantly improving the member experience. Applying machine learning to Ultra Personalization has certainly helped us deliver on our promise of delightful, personalized beauty for everyone, but it also has given us an amazing opportunity to learn from our members. We’re now discovering patterns of behavior that would have otherwise been hidden from us. With more Ipsters than ever before engaging with our site and our products in new ways, we’ve unlocked opportunities to apply machine learning, which naturally includes experimenting with our IPSY Match algorithm. Keep reading our blog, Code+Contour, for more content around how we continue to integrate machine learning into the BFA member experience.

Written by Barbara Evangelista