Matching Attributes: How Alibaba’s AI Learned to Pick an Outfit
This article is part of the Academic Alibaba series and is taken from the paper titled “Interpretable Partitioned Embedding for Customized Fashion Outfit Composition” by Zunlei Feng, Zhenyu Yu, Yongcheng Jing, Mingli Song, Yezhou Yang and Junxiao Jiang, accepted by ICMR 2018. The research project belongs to the Alibaba-Zhejiang University Joint Institute of Frontier Technologies (AZFT). The full paper can be read here:
Training computers to process images for tasks like navigation and facial recognition is hard enough; now imagine training one to put together fashionable outfits from pictures of clothing items. It would have to recognize not only each item’s individual qualities but also how well that item pairs with others, from simply belonging in the same outfit to meeting the more elusive criteria of good taste. A system that could do these things would not simply take over a stylist’s job; it would tell stylists exactly which criteria make their best choices work and others flop, enabling them to do more with their own fashion taste and expertise.
Now Alibaba’s tech team, in collaboration with researchers from Zhejiang University and Arizona State University, is showing it can do just that with the help of a partitioned embedding network that recognizes attractive clothing combinations and forms suggestions to show customers on e-commerce platforms. Building on outfit composition systems already being developed for online shopping, these latest efforts greatly improve on previous “unexplainable” embedding techniques that obscure the specific decision processes and criteria behind system outputs.
The Alibaba-backed research team designed its partitioned embedding network to learn interpretable attributes from clothing items using an architecture built from three components: an auto-encoder module, a supervised attributes module, and a multi-independent module. The auto-encoder module encodes all information relevant to an item’s attributes into the embedding. The supervised attributes module then uses multiple attribute labels to ensure that different parts of the overall embedding correspond to different attributes. Finally, the multi-independent module applies an adversarial operation to enforce a mutual-independence constraint among the parts of the embedding.
The framework of the partitioned embedding network. The overall network architecture consists of three components: an auto-encoder module, a supervised attributes module, and a multi-independent module. The auto-encoder module encodes all useful information into the embedding. In the supervised attributes module, attribute networks ensure that different parts of the overall embedding correspond to different attributes. In the multi-independent module, adversarial prediction networks ensure that different parts of the whole embedding are independent of one another.
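To make the idea of a partitioned embedding concrete, the following is a minimal sketch, not the paper's implementation. It assumes a six-dimensional embedding sliced into three hypothetical attribute parts (`color`, `shape`, `texture`) and combines one loss term per module described above. Two simplifications are swapped in and should be read as assumptions: the attribute term uses a squared error rather than whatever classifier the authors trained, and the independence term is a plain cross-covariance penalty standing in for the adversarial prediction networks.

```python
from statistics import mean

# Hypothetical layout: which slice of the embedding belongs to which attribute.
# The attribute names and slice sizes are illustrative assumptions.
PARTS = {"color": (0, 2), "shape": (2, 4), "texture": (4, 6)}


def split_embedding(embedding):
    """Return the named slices of one embedding vector."""
    return {name: embedding[lo:hi] for name, (lo, hi) in PARTS.items()}


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return mean((x - y) ** 2 for x, y in zip(a, b))


def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return mean((x - mx) * (y - my) for x, y in zip(xs, ys))


def independence_penalty(batch):
    """Sum of squared covariances between dimensions of different parts.

    Stand-in for the adversarial predictors (an assumption): the penalty is
    zero when no dimension of one part is linearly related to a dimension
    of another part across the batch.
    """
    names = list(PARTS)
    penalty = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for da in range(*PARTS[a]):
                for db in range(*PARTS[b]):
                    col_a = [e[da] for e in batch]
                    col_b = [e[db] for e in batch]
                    penalty += covariance(col_a, col_b) ** 2
    return penalty


def total_loss(inputs, reconstructions, attr_preds, attr_targets, batch,
               w_attr=1.0, w_indep=1.0):
    """Weighted sum of the three terms; the weights are assumptions."""
    recon = mean(mse(x, r) for x, r in zip(inputs, reconstructions))
    attr = mean(mse(p, t) for p, t in zip(attr_preds, attr_targets))
    return recon + w_attr * attr + w_indep * independence_penalty(batch)
```

In this sketch, `split_embedding` is what makes the representation interpretable: each named slice can be inspected or compared on its own, while the independence term discourages one slice from leaking information about another.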
With the interpretable, partitioned embedding established, the team constructed an outfit composition graph and an attribute-matching map. Given specific attribute descriptions, the system was able to recommend a ranked list of outfit compositions, each with interpretable scores for how well its items match. Experiments show that the parts of the partitioned embedding corresponding to different attributes stayed unmingled, and that the input information was preserved in the process.
Fashion outfit composition graph. In the graph, all items are classified into five classes according to category. Within each category, items are grouped around different cluster centers according to attribute importance.
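The ranking step can be illustrated with a small sketch of how per-attribute scores might roll up into an interpretable outfit score. Everything specific here is hypothetical: the cluster labels, the values in `MATCHING_MAP`, the rule that identical clusters score 1.0, and the use of simple averaging are illustrative assumptions, not the scoring function from the paper.

```python
from itertools import combinations

# Hypothetical attribute-matching map: for each attribute, a score in [0, 1]
# for how well two cluster labels go together. All values are made up.
MATCHING_MAP = {
    "color": {frozenset(["navy", "white"]): 0.9,
              frozenset(["navy", "orange"]): 0.4,
              frozenset(["white", "orange"]): 0.6},
    "texture": {frozenset(["denim", "cotton"]): 0.8,
                frozenset(["denim", "silk"]): 0.3,
                frozenset(["cotton", "silk"]): 0.5},
}


def pair_score(attribute, label_a, label_b, default=0.5):
    """Look up the match score for two cluster labels of one attribute."""
    if label_a == label_b:
        return 1.0  # assumption: identical clusters always match
    return MATCHING_MAP[attribute].get(frozenset([label_a, label_b]), default)


def score_outfit(outfit):
    """Return (overall score, per-attribute scores) for a list of items."""
    per_attribute = {}
    for attribute in MATCHING_MAP:
        scores = [pair_score(attribute, a[attribute], b[attribute])
                  for a, b in combinations(outfit, 2)]
        per_attribute[attribute] = sum(scores) / len(scores)
    overall = sum(per_attribute.values()) / len(per_attribute)
    return overall, per_attribute


def rank_outfits(candidates):
    """Sort candidate outfits from best to worst overall score."""
    return sorted(candidates, key=lambda o: score_outfit(o)[0], reverse=True)


jeans = {"color": "navy", "texture": "denim"}
shirt = {"color": "white", "texture": "cotton"}
scarf = {"color": "orange", "texture": "silk"}
best = rank_outfits([[jeans, scarf], [jeans, shirt]])[0]
```

Because `score_outfit` returns the per-attribute breakdown alongside the overall number, a low-ranked outfit can be traced back to the attribute that dragged it down, which is the kind of explanation an unpartitioned embedding cannot offer.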
Results from a study involving 30 professional stylists already indicate that Alibaba’s approach can generate more subjectively desirable outfit combinations than previous models while further illuminating the factors that contribute to an outfit’s appeal. Because a specific input criterion (i.e., an attribute) stays intact through every step leading to the output, even when combined with other attributes, the approach makes it possible to explore how specific qualities of clothing items influence their suitability in a complete outfit. As well as eliminating the uncertainty of previous approaches, it provides a foundation for future products aimed at helping designers, businesses, and consumers select better product lines and wardrobes.