Community Spotlight

Spending 66 Hours To Build a Lipstick-Choosing AI

Using DocArray and Jina to save the (Chinese Valentine’s) Day

Alex C-G
Jina AI


Note: this article is a translation. You can read the original Chinese version here.

Introduction

Chinese Valentine’s Day has just been and gone. What, you didn’t know? Yikes, you’d better go out and buy some lipstick for your significant other to apologize!

But how can you choose the lipstick that’s perfect for their skin tone? What are you, some kind of skin tone expert? I know I’m not.

Luckily, one of Jina’s community members, Simon Liang, has built an app to do just that.

To use the lipstick search model, upload a pic of your loved one (or loved ones — we don’t judge), and then you’ll instantly find the best lipstick.

Simon is the CEO of Xiansi Technology. Let’s read his own (translated) words to see how the project went from concept to implementation. Take it away, Simon!

Watch the video

Development background

Every year on Valentine’s Day, anniversaries, and my girlfriend’s birthday, I would worry about choosing a gift. Buying lipstick seems simple, but it’s actually a bottomless pit. If you accidentally buy the wrong color number, it can easily turn into a disaster.

Analysis design

So what information can be extracted from a photo to help search for lipstick? I noticed that every time my girlfriend tried a color at the cosmetics counter, she would apply lipstick to the back of her hand or forearm and repeatedly check whether her skin tone and the lipstick color matched.

In other words, the key is the relationship between lipstick and skin tone: which lipsticks suit which skin tones. So as long as the lipstick and skin color information is extracted from each color-test video, we can find out what lipsticks people with a similar skin tone are wearing, and let the data drive the purchasing decision toward the most suitable lipstick.

Lipstick database

Since the color-test videos are shot in all sorts of different ways, I use Google’s MediaPipe face-landmark algorithm, which defines 468 feature points on a 3D model of the face and, through machine learning, maps those points onto a 2D picture. As long as the swatch image contains a full face, it can reliably locate the cheeks, lips, and chin, and thus capture the pixels of these key regions.
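In the real pipeline this step uses MediaPipe Face Mesh, which returns 468 normalized (x, y) landmark coordinates per detected face. As a minimal sketch of the pixel-capture idea (the landmark positions and indices below are made up for illustration, not MediaPipe’s actual lip or cheek indices):

```python
import numpy as np

def sample_pixels(image, landmarks, indices):
    """Collect the pixels under the given landmark indices.

    image:     H x W x 3 uint8 array
    landmarks: N x 2 array of normalized (x, y) coordinates in [0, 1]
    indices:   which landmarks to sample (e.g. lip or cheek points)
    """
    h, w = image.shape[:2]
    pts = landmarks[indices]
    xs = np.clip((pts[:, 0] * w).astype(int), 0, w - 1)
    ys = np.clip((pts[:, 1] * h).astype(int), 0, h - 1)
    return image[ys, xs]  # one pixel per selected landmark

# Toy usage: a flat-colored "face" image and three fake landmarks.
img = np.full((100, 100, 3), 200, dtype=np.uint8)
lms = np.array([[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]])
pixels = sample_pixels(img, lms, [0, 1])
print(pixels.shape)  # (2, 3)
```

Because MediaPipe’s coordinates are normalized to the image size, the same index list finds the same facial region regardless of how the swatch photo was shot.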

Overview of model structure¹

After capturing the pixels, I use K-means clustering to extract 20 representative skin colors and 20 lip colors. Since raw RGB values cannot intuitively represent lip color, they are converted to HSV, which separates hue, saturation, and brightness. Plotting the distribution of these 20 colors as a histogram, it is easy to see that lipstick hues are concentrated in one region: the pinker a lipstick, the more its hue leans toward blue-purple; the more orange a lipstick, the more its hue leans toward orange-yellow.
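The clustering and color-space conversion can be sketched as follows (a toy version of the idea, not the project’s code: it uses scikit-learn’s K-means with k=2 on synthetic pixels rather than k=20 on real face pixels):

```python
import colorsys

import numpy as np
from sklearn.cluster import KMeans

def representative_colors(pixels, k=20):
    """Cluster RGB pixels and return k representative colors in HSV.

    pixels: N x 3 float array with RGB values in [0, 1]
    """
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    # Each cluster center is a representative RGB color; convert to HSV
    # so hue, saturation, and brightness become separate axes.
    return np.array([colorsys.rgb_to_hsv(*c) for c in km.cluster_centers_])

# Toy usage: noisy pixels around a pink tone and an orange tone.
rng = np.random.default_rng(0)
pink = np.clip(rng.normal([0.9, 0.4, 0.6], 0.02, (50, 3)), 0, 1)
orange = np.clip(rng.normal([0.9, 0.5, 0.2], 0.02, (50, 3)), 0, 1)
hsv = representative_colors(np.vstack([pink, orange]), k=2)
print(hsv.shape)  # (2, 3): hue, saturation, value per cluster
```

In HSV, the hue channel alone captures the pink-versus-orange distinction the article describes, which is why the histograms cluster so cleanly.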

Finally, by converting the color-distribution histogram into a vector, the similarity between any two such vectors can be computed. Once the algorithm is understood, the project is, so to speak, 80% done. But my 20 years of work experience tell me that the last 20% often takes 80% of the time. This is where Jina’s ecosystem gave me a lot of pleasant surprises.
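That last step, turning a hue distribution into a vector and comparing vectors, can be sketched like this (the bin count and toy hue values are my own choices for illustration):

```python
import numpy as np

def hue_histogram(hues, bins=36):
    """Turn hue values in [0, 1] into an L2-normalized histogram vector."""
    hist, _ = np.histogram(hues, bins=bins, range=(0.0, 1.0))
    hist = hist.astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

def cosine_similarity(a, b):
    # Vectors are already L2-normalized, so the dot product is the cosine.
    return float(np.dot(a, b))

# Two pinkish hue distributions and one orangish one.
pinkish = hue_histogram(np.array([0.90, 0.92, 0.91, 0.89]))
also_pink = hue_histogram(np.array([0.90, 0.91, 0.93, 0.90]))
orangish = hue_histogram(np.array([0.08, 0.10, 0.09, 0.11]))

print(cosine_similarity(pinkish, also_pink) > cosine_similarity(pinkish, orangish))  # True
```

Similar color distributions produce nearly parallel vectors (cosine close to 1), while pink and orange hue ranges land in disjoint bins and score near 0.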

Traditional search vs neural search

As an Elasticsearch certified engineer, I am very familiar with building traditional search engines. Traditional image search mostly works by associating each image with corresponding text labels, enabling text-to-image search.

Neural search, on the other hand, can “see” what’s in a picture almost like a human. “Everything can be embedded”: anything can be described by a vector, and the computer compares and analyzes the vectors returned by the AI model to find similar results.

Jina experience

DocArray, part of Jina’s ecosystem, is a very flexible data container. It can store not only text and numbers, but also binary data such as images and audio, converting them into NumPy arrays and storing them in a highly efficient way.

In addition, DocArray provides efficient vector search, completing a search over millions of vectors in seconds. With DocArray, I could quickly build a lipstick class in Python and store the algorithm’s output in it from a notebook. Search and retrieval in code are very user-friendly.
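Conceptually, this kind of vector search is a nearest-neighbor lookup over the stored embeddings. A minimal NumPy stand-in for what the lipstick search does (the lipstick names and vectors below are invented for illustration, not the project’s data):

```python
import numpy as np

# Hypothetical database: one color-histogram embedding per lipstick.
names = ["Rouge 01", "Coral 12", "Berry 33"]
db = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.7, 0.0, 0.3],
])
db = db / np.linalg.norm(db, axis=1, keepdims=True)  # L2-normalize rows

def top_k(query, k=2):
    """Return the k most similar lipsticks by cosine similarity."""
    q = query / np.linalg.norm(query)
    scores = db @ q                      # cosine against every lipstick
    order = np.argsort(-scores)[:k]      # indices of the best matches
    return [(names[i], float(scores[i])) for i in order]

result = top_k(np.array([1.0, 0.0, 0.1]))
print(result[0][0])  # Rouge 01
```

In the real project, DocArray handles this matching (plus the storage of images and metadata) so none of it has to be written by hand.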

As for the Jina framework itself, it is an AI framework tailored for engineers. Many machine learning libraries have high hardware requirements and need a GPU, but in our project perhaps only one of the steps needs a GPU, while a CPU can handle the rest. With Jina, I can break each step of the algorithm into Executors and deploy them to the cloud. At execution time, Jina calls these Executors in the preset order, assembles the results, and returns them; this is cloud-native technology.

In this project, I created two Executors, s3_downloader and FaceMesher. I will also upload both of them to Jina Hub (a platform for sharing reusable components) for everyone to use.

Some problems I encountered during development were also quickly fixed with the support of the Jina community. The last 20% of the project went smoothly, and I soon had the lipstick search engine up and running.

Project demo

In addition to the skin tone search shown above, I also implemented a lip color search. If you want to know which lipstick can reproduce Liu Yifei’s lip color, you can search with a still photo and find a similar product matching the makeup look.

Developer’s feedback

Q (Community Assistant J): Hello Simon, how do you feel about the experience of using Jina during the development of this project?

A (Simon Liang): On the whole, it’s the same as before: our team will keep using Jina to encapsulate our ML pipelines. We have been thinking about how to add AI models to our core products, and Jina is a very good container that lets us orchestrate these models very quickly. Without Jina, we would need to figure out how to distill these AI models, use a native deep learning inference framework, and then implement the serving ourselves. **But with Jina, I just need to wrap the model up and deploy it. That is a big improvement to both the speed of our launches and the robustness of the entire solution.**

Q (Community Assistant J): During the development of this project, how did Jina specifically help you?

A (Simon Liang): Before getting to know Jina, I had to write a lot of code by hand: first using Transformers to build a service, then wrapping it as a REST service. With Jina, that whole piece can be omitted. If you need to ship quickly in the early stage, once the model is fine-tuned you can serve it directly with Jina. The entire development process is shortened a lot, because the functionality and performance Jina provides out of the box are relatively well guaranteed.

Tools that quickly productionize AI technology and put AI models to work are quite rare on the market, and Jina fills exactly that gap. Many teams behind ML pipelines have no dedicated staff to distill their PyTorch models and put them online, so this has traditionally been difficult. With Jina, it can be done very quickly. In general, Jina makes these complex AI applications much more approachable.

References

[1] “Face Mesh.” MediaPipe, 2022. https://google.github.io/mediapipe/solutions/face_mesh.
