Case study: AI/AR-based Virtual Try-on for e-commerce

Omal Perera
Ascentic Technology
6 min read · Sep 5, 2023

What is Virtual Try-On (VTO)?

Virtual Try-On enables consumers to superimpose digital representations of products onto their real-world environment using their smartphones, tablets, or other AR-enabled devices. This technology is commonly used in the fashion and beauty industries, where users can try on clothing, accessories, eyewear, makeup, and even hairstyles virtually. It offers several advantages for e-commerce:

1. Enhanced customer engagement
2. Reduced purchase uncertainty
3. Improved personalization
4. Increased sales and conversion rates
5. Social sharing, viral marketing, and competitive advantage

A model uses FXGear’s virtual fitting room FXMirror to virtually try on a garment. / Courtesy of FXGear

Forecasting the future of the Virtual Try-On market

The global virtual fitting room market grew significantly, reaching USD 3.78 billion in 2022. This upward trend is projected to continue at a strong compound annual growth rate of 24.1% from 2023 to 2030, according to fortunebusinessinsights.com. The market’s expansion can be attributed to the increasing adoption of Virtual Reality (VR) and Augmented Reality (AR) technologies in virtual fitting room solutions, providing users with a heightened sense of realism and engagement during the online shopping experience.

Moreover, businesses across various industries are actively exploring ways to integrate VR shopping experiences into their physical storefronts and online platforms, recognizing the potential benefits of enhanced user engagement and satisfaction. The incorporation of VR and AR technologies enhances the overall purchasing experience for consumers, enabling them to visualize products better before deciding.

A few big steps in the e-commerce world with virtual try-on

Image courtesy — https://corporate.walmart.com

A prominent example of this trend is Walmart Inc. In March 2022, Walmart launched a cutting-edge virtual try-on tool. This tool empowers customers to virtually try on clothing items through a personalized virtual model that aligns with their individual body type, hair colour, and skin tone. The success of this initiative has been a driving force behind the market’s growth, as it addresses common concerns related to online apparel shopping, such as fit and suitability.

Adidas’s virtual try-on feature in its iOS app helps shoppers pick out new shoe styles without setting foot in a shoe store, improving the customer experience.

How does it actually work?

Several approaches have been used in these applications; let’s look briefly at a couple of them.

1. Overlaying a pre-designed asset on a 2D/3D model using AR

The following are some screenshots from the Wanna wear app, which uses a similar technology.

Collection of screenshots from the Wanna wear app

The following is the foundational approach for such an implementation; a minimal skeleton of the pipeline is sketched after the list. But always keep in mind that building an accurate solution on top of these foundation steps is a huge development effort.

  1. Detect the user’s pose in real time.
  2. Transform the 3D digital garment assets to match the detected posture.
  3. Map the garment to body landmarks and inpaint occluded regions.
  4. Render the result on top of the AR-detected character.
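As a rough illustration only, here is that pipeline as a minimal Python skeleton. Every function in it is a hypothetical stub standing in for a substantial component, not a real library call.

```python
# A minimal, hypothetical skeleton of the four steps above. Each stub
# stands in for a substantial component (pose model, 3D renderer,
# inpainting, compositor) and is NOT a real library call.

def detect_pose(frame):
    """Step 1: return body landmarks, or None if no person is found."""
    return None  # stub; see the MediaPipe example below

def transform_garment(asset, landmarks):
    """Step 2: deform the 3D garment asset to match the posture."""
    return asset  # stub

def map_and_inpaint(garment, landmarks, frame):
    """Step 3: align the garment to body landmarks, inpaint occlusions."""
    return garment  # stub

def composite(frame, garment):
    """Step 4: render the garment on top of the detected person."""
    return frame  # stub

def virtual_tryon_frame(frame, garment_asset):
    landmarks = detect_pose(frame)
    if landmarks is None:
        return frame  # nothing to overlay; show the raw camera feed
    garment = transform_garment(garment_asset, landmarks)
    garment = map_and_inpaint(garment, landmarks, frame)
    return composite(frame, garment)
```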

Let’s discuss some challenges in the above implementation.

Posture detection

Posture detection and tracking of body landmarks are critical components of AR applications that aim to interact with users in a natural and intuitive way. On iOS, ARKit, Apple’s AR development framework, offers powerful tools for creating immersive AR experiences, while Python’s pose-estimation libraries make it easy to prototype the same capability. With these tools, apps can track and identify the key points of a person’s body in real time. These body landmarks include the positions of joints such as elbows, wrists, knees, and ankles, enabling the AR application to interpret the user’s posture and movements.

MediaPipe is a popular Python library for working with body landmarks.
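As a minimal sketch of this step, the snippet below uses MediaPipe’s Pose solution to extract body landmarks from a webcam feed. The landmark indices follow MediaPipe’s published PoseLandmark enum.

```python
# A minimal sketch of real-time pose detection with MediaPipe + OpenCV.
# pip install mediapipe opencv-python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # default webcam
with mp_pose.Pose(model_complexity=1, min_detection_confidence=0.5) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR.
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            # Normalized (0-1) coordinates of, e.g., the left elbow.
            elbow = results.pose_landmarks.landmark[
                mp_pose.PoseLandmark.LEFT_ELBOW]
            print(f"left elbow: x={elbow.x:.2f}, y={elbow.y:.2f}")
            mp_draw.draw_landmarks(frame, results.pose_landmarks,
                                   mp_pose.POSE_CONNECTIONS)
        cv2.imshow("pose", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
            break
cap.release()
cv2.destroyAllWindows()
```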

Designing and transforming digital assets to match the 2D/3D model

To ensure a realistic and immersive try-on experience, designers and developers employ various techniques, including computer vision, image processing, and machine learning algorithms when preparing digital media resources.

The process is far more complex than creating a static 3D object, as resource objects such as clothing items, accessories, or cosmetics must follow human movements and be accurately mapped to fit the user’s body or face in real time.

Managing digital 3D assets from Vyking Studio to use in its apps — https://www.vyking.io/
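As a deliberately simplified sketch of what this mapping involves, even in the easiest 2D case, the snippet below warps a transparent garment image so that two hypothetical anchor points on it land on the detected shoulder landmarks. Production systems deform full 3D meshes instead; this only illustrates the mapping idea.

```python
# A simplified 2D sketch: warp a garment PNG (with an alpha channel) so
# that two anchor points on it land on the detected shoulder positions.
import cv2
import numpy as np

def overlay_garment(frame, garment_rgba, garment_anchors, shoulder_px):
    """garment_anchors: two (x, y) points on the garment image.
    shoulder_px: the matching shoulder landmarks in frame pixels."""
    # Two point pairs determine a similarity transform
    # (rotation + uniform scale + translation).
    m, _ = cv2.estimateAffinePartial2D(np.float32(garment_anchors),
                                       np.float32(shoulder_px))
    h, w = frame.shape[:2]
    warped = cv2.warpAffine(garment_rgba, m, (w, h))

    # Alpha-composite the warped garment over the camera frame.
    alpha = warped[:, :, 3:4].astype(np.float32) / 255.0
    out = alpha * warped[:, :, :3] + (1.0 - alpha) * frame
    return out.astype(np.uint8)
```

A single affine warp like this cannot reproduce drape, folds, or occlusion, which is exactly why the real implementations invest in 3D asset pipelines like Vyking’s.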

2. TryOnDiffusion — for 2D images

TryOnDiffusion is a cutting-edge approach that is changing the way users experience products in the digital world. It leverages advanced computer vision and machine learning techniques to seamlessly render garments onto images of real people, creating a more realistic and immersive try-on experience.

Very recently (June 2023), Google released a new virtual try-on tool that uses generative artificial intelligence and is available through its search engine. To start, it works only on women’s tops.

A selection of three different models wearing the same shirt. With Google’s virtual try-on feature, users can select models ranging from size XXS to 4XL, with different skin tones, body shapes, and hair types. Image courtesy: https://www.retaildive.com

Beneath the surface, it uses a generative AI model that can take just one clothing image and accurately reflect how it would drape, fold, cling, stretch, and form wrinkles and shadows on a diverse set of real models in various poses. It supports people in sizes ranging from XXS to 4XL, representing different skin tones (using the Monk Skin Tone Scale as a guide), body shapes, ethnicities, and hair types.

The biggest advantage of this approach is the ability to work without pre-designed dedicated overlay images of clothing or accessories.

The diffusion model sends images to its own neural network (a U-Net) to generate the output: a photorealistic image of the person wearing the garment. https://blog.google/
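Google has not released TryOnDiffusion’s code, but the underlying mechanism, conditional diffusion sampling, can be sketched in a few lines. The denoise_step callable below is a hypothetical stand-in for the trained network described above; nothing here corresponds to a public Google API.

```python
# A conceptual sketch of conditional diffusion sampling, the mechanism
# behind TryOnDiffusion. `denoise_step` is a hypothetical stand-in for
# the trained U-Net; nothing here is a public Google API.
import numpy as np

def sample_tryon(person_img, garment_img, denoise_step, num_steps=50):
    """Start from pure noise and iteratively denoise it, conditioning
    every step on the clothing-agnostic person image and the single
    garment photo."""
    x = np.random.randn(*person_img.shape)  # pure Gaussian noise
    for t in reversed(range(num_steps)):
        # The network predicts a slightly less noisy image at each step;
        # the conditioning is what makes the garment drape, fold, and
        # shadow correctly on this particular body and pose.
        x = denoise_step(x, t, person=person_img, garment=garment_img)
    return x  # the final photorealistic try-on render
```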

In today’s market, there are several web and mobile app solutions demonstrating successful AR/AI virtual try-on implementations. Companies such as Wanna, Reactive Reality, Vyking, FXMirror, and Snapchat AR are just a few examples of such platforms.

Each of these solutions comes with its own set of capabilities and limitations, and their performance, accuracy, reliability, and features vary significantly. Each takes a unique approach tailored to the products it caters to.

As discussed earlier, some solutions support only 2D images of models, which yield a more accurate silhouette, while others enable real-time 3D body mapping but cannot guarantee 100% accuracy.

Consequently, choosing the most suitable solution for your business will heavily depend on your specific requirements.

Finally…

Applications of virtual try-on for day-to-day activities are on a rapid growth trajectory, with VR and AR technologies playing a critical role in enhancing user experiences.

As discussed previously, there are several approaches on which current solutions are based, and we can expect many more new developments to be introduced in the future. As companies invest more in virtual shopping solutions, the market is set to grow significantly, catering to the changing preferences of tech-savvy consumers and shaping the future of retail. Remember, the future of shopping is at our fingertips!

Thanks for reading! If you have had any encounters with virtual try-on implementations, your insights would be invaluable to us. Feel free to let me know your opinions, suggestions, or any interesting experiences you may have had.
