FashionGen and ECCV Challenge

High quality, professional photos with detailed annotations for fashion image generation

Thomas Boquet

Published in

Element AI Lab

Jun 8, 2018

Written with Peli Grietzer, Negar Rz and Simon Hudson

This post has been updated following the release of the dataset. Work on the dataset and the competition was led by Element AI fundamental research scientist Negar Rostamzadeh.

The rapid evolution of research in image synthesis and transformation over the last several years has many of us in the AI research sphere excited about using generative models for augmented creativity in fields like art, design, and fashion. For the creative artist or designer, access to classic generative model applications like style transfer, semantic manipulation, latent-space interpolation and text-to-image generation can support exciting new forms of creative exploration, experimentation and discovery in their domain. In the longer run, the growth of generative models could give designers a means to instantly visualize and modify ideas, creating mock-ups as fast as they can think of them.

State-of-the-art generative models can already generate extremely realistic, polished images, and recent architectures are achieving consistently high performance on conditioned generation. However, applied work on generative models has stayed mostly theoretical for lack of good datasets: there are not yet enough high-quality public datasets to develop industry-grade models for the intricate visual domains where artists and designers ply their trade. For now, most research in generative models for augmented creativity is restricted to proof-of-concept applications.

To help solve this problem, global fashion platform SSENSE is sharing its expert-annotated e-commerce photographs with the AI research community, and partnering with Element AI to prepare and host the data. FashionGen consists of hundreds of thousands of annotated studio close-ups of fashion items in multiple poses, pairing each item with fine-grained design descriptions sourced from fashion-industry professionals. The items are additionally tagged with attributes like season, designer, country and ensemble membership. This is the first large-scale dataset to feature expert annotations for high-quality, clean images drawn from a specialized design domain, making it an especially promising benchmark for state-of-the-art text-to-image generation. In addition to presenting a realistic public benchmark for design-oriented generative models, the dataset improves on existing large-scale computer vision datasets in several useful ways:

High-definition photographs taken under consistent studio conditions.
Tens of thousands of objects rotated in multiple poses.
Paragraph-length descriptive captions sourced from experts.
Exclusively high-fashion images, providing a uniquely structured visual domain.
Uniform format and rich metadata.
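
As a rough illustration of how the properties listed above fit together, the sketch below models one annotated item as a Python record and expands it into caption-image pairs of the kind a text-to-image model trains on. The class name, field names, file paths and metadata keys are all hypothetical, chosen only for this example; they are not the schema of the released dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class FashionItem:
    """Hypothetical record for one annotated item (illustrative field names)."""
    item_id: str
    description: str        # paragraph-length caption written by an expert
    image_paths: List[str]  # one studio photograph per pose
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. season, designer


def caption_image_pairs(items: Iterable[FashionItem]) -> Iterator[Tuple[str, str]]:
    """Yield one (caption, image_path) pair for every pose of every item."""
    for item in items:
        for path in item.image_paths:
            yield item.description, path


# Placeholder values only; none of this is real dataset content.
example = FashionItem(
    item_id="item-0001",
    description="Placeholder caption text.",
    image_paths=["item-0001/pose_1.jpg", "item-0001/pose_2.jpg"],
    metadata={"season": "placeholder", "designer": "placeholder"},
)
pairs = list(caption_image_pairs([example]))
```

Because every pose of an item shares the same description, an item photographed in several poses contributes several training pairs with an identical caption.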

To celebrate the upcoming release of FashionGen, Element AI and SSENSE are jointly sponsoring a text-to-image generation challenge for ECCV ’18 (European Conference on Computer Vision) in Munich, Germany. We invite interested parties to sign up at www.fashion-gen.com to get pre-release access to the FashionGen dataset and throw their hats in the ring for the competition. We will evaluate the quality of generated images in two ways: with a pre-trained network hooked up to the challenge API, and with the expert opinion of human judges from the fields of AI, art and fashion. We limit the number of submissions per day to protect the critic network from attempts to reverse-engineer its parameters. Awards will be given to the best submissions, and the winners will present their work at the inaugural Computer Vision for Fashion, Art and Design workshop at ECCV ’18.

Update: following ECCV, Element AI has released the FashionGen dataset.
