“Osoroi Code” — How to dress like your loved ones
This project was carried out as part of the TechLabs “Digital Shaper Program” in Münster (Term 2020/01).
Abstract: This project’s goal was to create a clothes-matching tool as a web application. First, the Data Science team generated a clothing image dataset in the correct format with the needed labels. Using this dataset, the AI team trained a model that outputs the most probable labels for a new piece of clothing. From these labels, the model determined similar clothing images. In the final step, a web application presented the results.
Have you ever wondered why there are so many pictures of people at Disneyland in Tokyo wearing the same outfit? Like it’s a pretty serious thing for them?
→ Fun Fact 1: Neither did we.
→ Fun Fact 2: It is!
Usually, you may think more about not wearing the same outfit as your friends do, but in Japan, this is a desirable trend. It’s called “Osoroi Code” (お揃いコーデ for native speakers), has plenty of followers, especially among young women, and is still growing in popularity. People following Osoroi love to show their friendships and adhere to the values of the group they belong to. To make this belonging as visible as possible, they wear the same shirts, pullovers, or accessories. This is where our project kicks in. Our idea was to develop an application that analyzes a user-provided image containing a piece of clothing, attempts to classify the type of clothing, and presents similar articles. That way, finding a matching outfit, Osoroi-style, should be straightforward for the user.
Two professions were essential for the realization of this project: data scientists and AI developers. Our two data scientists mainly worked on the data to prevent “garbage in — garbage out” issues, while our three AI developers taught Python how to classify fancy shirts. During an initial brainstorming session, we filled a Kanban board to keep an overview of all tasks and defined some milestones that we felt were necessary to achieve during our journey. Once per week, we presented the results of the past week, discussed new insights or blockers, and held coding sessions via Slack group calls. We also used Git for version control and Notion.so as a project management tool.
We started our project work by searching for a dataset we could work with and found one published by the University of Hong Kong containing almost 290,000 pictures, each of them labeled and categorized. Over the following weeks, we made some significant findings. First, we detected labels that seemed insufficient for our algorithm (have you ever needed to describe a pullover as “Spongebob / MinaKwon / PatrickStar / Nylon / Hoodie”?). Second, clothes with prints increased the number of imperfect labels since they are unique and hard to classify. Third, we recognized that this dataset was the result of another labeling algorithm and, as such, not helpful for training ours: had we used it, we would have inherited all the shortcomings of that algorithm.

We started looking for another dataset and were successful on Kaggle. We found a dataset containing “only” 44,500 pictures, labeled by category, (sub)type, and gender target (see figure 1 for an overview). After data cleaning and transformation using Python and the libraries Pandas, NumPy, and Matplotlib, we had a dataset that we could use to train our model.

For this task, our AI team decided to use PyTorch because it is easy to learn, is excellently documented, and offers straightforward debugging and data parallelism with GPU support, making it capable of competing with TensorFlow. To structure our code more clearly, we used PyTorch Lightning as a wrapper. At first, we experimented with classifying clothes from the Fashion-MNIST toy dataset. It worked really well, even with a small neural network. But when we tried to classify higher-resolution images, we realized that our network wasn’t powerful enough. So we took advantage of transfer learning and used ResNet-50. This pre-trained model was a good starting point for our classification task, offering a good compromise between training duration and prediction accuracy.
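The data cleaning described above can be sketched with Pandas. The column names and the minimum-count threshold below are assumptions for illustration, not the project’s actual schema:

```python
import pandas as pd

# Hypothetical excerpt of the Kaggle metadata; column names are assumed.
raw = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "gender": ["Men", "Women", None, "Women", "Men", "Women"],
    "masterCategory": ["Apparel", "Apparel", "Apparel",
                       "Accessories", "Apparel", "Apparel"],
    "articleType": ["Tshirts", "Dresses", "Tshirts",
                    "Watches", "Tshirts", None],
})

# Drop rows with missing labels and keep only clothing categories.
clean = raw.dropna(subset=["gender", "articleType"])
clean = clean[clean["masterCategory"] == "Apparel"]

# Keep only article types with enough examples to train on
# (MIN_COUNT = 2 is a toy threshold; a real run would use a larger one).
MIN_COUNT = 2
counts = clean["articleType"].value_counts()
clean = clean[clean["articleType"].isin(counts[counts >= MIN_COUNT].index)]

print(len(clean))  # number of usable samples
```

This kind of filtering directly attacks the “garbage in — garbage out” problem: rows with missing or extremely rare labels would otherwise add noise to the training signal.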
We only had to modify the last layer of our model to get the needed number of targets. During the first training run, we discovered that our algorithm’s execution time was impractical, so we tweaked the last layer again to achieve a feasible training time of around three minutes per epoch on a GPU. With the trained model, we could then start classifying clothes.
The main focus of our project was to develop an algorithm that is capable of classifying pictures of clothes to find similar pieces of clothing.
The Kaggle clothes picture dataset we used for training and evaluating our model contained more than 44,500 pictures with almost 200 different labels. Our neural network was built on the foundation of ResNet-50, with the output layer adapted to produce the correct labels for our classification process. Our web app then displayed the results.
The interface of the web app is deliberately kept simple. The user uploads an image containing a specific piece of clothing. After a short analysis, our algorithm presents a set of best-matching labels for this image, each with its matching score. It also shows four additional pictures that best match these attributes.
Throughout our TechLabs journey, we experienced how important a high-quality dataset is and what role data cleaning plays in developing such an algorithm. It was incredibly challenging to find a suitable dataset with correct and useful labels, and crucial to transform and clean it according to our requirements. Apart from that, we also learned that training an image recognition algorithm can be quite time-consuming without a GPU with CUDA support, so system performance can be a serious barrier.
A positive surprise was that pre-trained models are valuable and an excellent base to build upon. Moreover, we were impressed by how easy it felt to transfer our algorithm to a web app.
Impact & Outlook
At first glance, our tool may seem like a solution to a problem that most people in Western society have never considered one. But on closer inspection, it could be applied to fields beyond clothes, such as various DIY scenarios. For instance, one could simply upload a picture of a screw whose product characteristics are unknown in order to repurchase the same screw. Such features could revolutionize online shopping. However, our algorithm would need certain advancements to be integrated into webshops of any kind. In particular, the ability to recognize specific patterns, instead of colors only, is one of the essential items on our wish list.