Using Statistical Scoring to Find the Best Algorithm for Dress Recommendations

Erin Liu
Queenly Engineering
May 15, 2020
An example of Queenly’s “More Like This” feature

Here at Queenly, our goal is simple: to provide an affordable platform that allows girls to feel and look their best. It is no surprise that recommendation systems are everywhere and essential to the business success of many large companies (Amazon, Spotify, YouTube). We are constantly looking for ways to improve the experience for our users, and we figured, “what better way to do that than implementing a recommendation engine?” In this post, we will outline the steps we took to develop an algorithm for our dress recommendation engine, share the final results, and explore our next steps for improvement.

The Problem — Relevancy and Accuracy in Most Similar Dresses

Since we are still in the early stages and thus lack the amount of data needed to generate recommendations with a content-based or collaborative filtering approach, we decided to start off with a knowledge-based system. If you have ever shopped for dresses and assume that recommendations should be based on some combination of brand, color, size, style, etc., then you’re right.

Through a custom classification system of extracting keywords from text and images, we created a list of tags based on a seller’s given attributes of a dress for each item in our collection (to be described in a later blog post). The guidelines for our tags field attribute are constantly being refined and revised, as this field will serve as the basis for our recommendation engine.

Algorithm First Iteration: Generating Candidates by Applying TF-IDF

The first step of the algorithm is generating candidates for each wardrobe item in the database, which is why we place such a large emphasis on clean, accurate tags.

For each dress, we used scikit-learn’s built-in TfidfVectorizer class to create Term Frequency–Inverse Document Frequency (TF-IDF) vectors derived from each dress’s unique tags.

In short, TF-IDF is a technique that breaks text down into word features and scores each word by how often it appears in one document, discounted by how common it is across all documents. Using the matrix generated by TfidfVectorizer, we calculated cosine similarity scores between every pair of dresses. In just a few lines of code, we now have a set of values that we can work with for ranking:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Build TF-IDF vectors from each dress's tags, then compute pairwise
# cosine similarity (linear_kernel is equivalent on L2-normalized TF-IDF rows).
tfidf = TfidfVectorizer()
tfidf_matrix = tfidf.fit_transform(wardrobe['wardrobe_tags'])
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
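With cosine_sim in hand, pulling the most similar dresses for any target is a simple argsort. A minimal NumPy sketch (top_similar is a hypothetical helper for illustration, not our production code):

import numpy as np

def top_similar(cosine_sim, index, k=5):
    """Return indices of the k dresses most similar to the dress
    at `index`, excluding the dress itself."""
    scores = cosine_sim[index].copy()
    scores[index] = -1.0  # never recommend the dress itself
    return np.argsort(scores)[::-1][:k]

# Toy similarity matrix for three dresses:
sim = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
])
print(top_similar(sim, 0, k=2))  # dress 1 is the closest match to dress 0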

Algorithm Second Iteration: Weighted Scoring with Filtering

Now that we have numbers defining the similarity between dresses, there needs to be a way to rank the top most similar dresses for each dress in order to output our recommendations. Of course, this can easily be done with pandas’ sort_values(), but we decided to assign weights to specific attributes that we deemed important.

A simplified representation of our custom algorithm is as such:

The first set consists of dress candidates that match target dress j on both color and size; their TF-IDF similarity scores are given a weight of 𝜶. The second set consists of dress candidates that match target dress j on size only; their scores are given a lower weight of 𝜷. All other dress candidates that fit neither criterion are discarded as irrelevant matches.

Our top recommendations are then created from the union of the two sets of dress candidates. We use topK() as a shorthand for labeling the process of extracting the most relevant recommendations and filtering out the ones that do not match on size or color, which is achieved through basic pandas filtering and sorting methods.
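The two-set scheme above can be sketched with basic pandas filtering and sorting. The column names, the 𝜶/𝜷 values, and the top_k helper below are all illustrative assumptions, not our production code:

import pandas as pd

ALPHA, BETA = 1.0, 0.5  # illustrative weights; alpha > beta

def top_k(candidates, target, k=3):
    """Weight candidates matching on color+size by ALPHA, on size only
    by BETA, discard the rest, and return the k highest scores."""
    size_match = candidates["size"] == target["size"]
    color_match = candidates["color"] == target["color"]
    weights = pd.Series(0.0, index=candidates.index)
    weights[size_match] = BETA
    weights[size_match & color_match] = ALPHA
    scored = candidates.assign(score=candidates["tfidf_sim"] * weights)
    return scored[scored["score"] > 0].sort_values("score", ascending=False).head(k)

target = pd.Series({"color": "red", "size": "4"})
candidates = pd.DataFrame({
    "color":     ["red", "blue", "red",  "green"],
    "size":      ["4",   "4",    "6",    "8"],
    "tfidf_sim": [0.6,   0.9,    0.95,   0.99],
})
print(top_k(candidates, target))

Note that a size-mismatched dress is dropped even with the highest raw TF-IDF similarity, which mirrors the filtering step described above.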

Final Results

Our weighted TF-IDF recommendation engine was then put into production as a processing step in our data pipeline. After applying it to our dress database, this is how it appears on our web app! (not bad, right?)

Web example of the “More Like This” feature on Queenly

Future Plans

Just as we are constantly looking for ways to improve the experience for our users, we are also hoping to improve the algorithm over the next few iterations. In the future, we hope to implement the algorithm using a collaborative filtering approach in conjunction with weighting additional features in our custom ranking equation. In this method, we aim to build co-occurrence matrices based on user engagement data (dresses tapped, saved dresses, etc.) to generate even more relevant content for our users.

In a more advanced approach, our recommendation engine will utilize a knowledge dependency graph where a node represents a dress and an edge represents a user who saved that dress, giving us the ability to capture a large amount of rich data from our users.

As our inventory and user base expands, knowledge graphs will generate the most relevant candidates by traversing the graph from a set of nodes that are already known to the user. From here, we can use a random walk algorithm to find dress recommendations at the intersection of multiple starting points and output the most personalized recommendations for our users.
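As a rough sketch of that idea, a random walk on a dress–user graph can surface dresses reachable from items a user already saved. The graph, walk scheme, and scoring below are hypothetical simplifications, not a design we have built:

import random
from collections import Counter

# Hypothetical save data: each edge links a user to a dress they saved.
saves = {
    "user_a": ["dress_1", "dress_2"],
    "user_b": ["dress_2", "dress_3"],
    "user_c": ["dress_3", "dress_4"],
}
# Invert to dress -> users so the walk can hop in both directions.
savers = {}
for user, dresses in saves.items():
    for dress in dresses:
        savers.setdefault(dress, []).append(user)

def random_walk_recs(start_dresses, steps=1000, rng=None):
    """Repeatedly hop dress -> user -> dress; dresses visited most often
    (excluding the starting set) become the recommendations."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    visits = Counter()
    for _ in range(steps):
        dress = rng.choice(start_dresses)
        user = rng.choice(savers[dress])
        dress = rng.choice(saves[user])
        if dress not in start_dresses:
            visits[dress] += 1
    return [d for d, _ in visits.most_common()]

print(random_walk_recs(["dress_1"]))

Starting from multiple saved dresses biases the walk toward dresses at the intersection of a user’s tastes, which is the personalization effect described above.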

Thank you for reading! If you’re interested, make sure to follow us on Facebook and Instagram for more updates on our company. Sign up to become a user today and join us in our mission of empowering and helping girls all over the world 💜

About the Author

Erin Liu is a software engineering intern at Queenly and a rising senior at UC Berkeley studying data science.
