
Understanding MixNMatch: Creating A More Realistic Synthetic Image

Combine different factors from multiple real images into a single synthetic image

Nahid Alam
Towards Data Science
6 min read · May 6, 2022


Figure 1: An overview of what is possible with MixNMatch Generative Model

I recently stumbled upon a paper called MixNMatch that aims to combine different factors from multiple real images into a single synthetic image — with minimal supervision. This post is intended to be detailed and requires some background in deep learning and generative models. If you are looking for a TL;DR version, you can check out my Twitter thread here.

If you like this post, please share it with your network.

Summary

At its core, MixNMatch is a conditional image generation technique built on a conditional Generative Adversarial Network (GAN). MixNMatch disentangles and encodes multiple factors from different real images, then combines them into a single synthetic image. Specifically, it combines the background, pose, shape, and texture from different real images into one synthetic image with minimal supervision.

During training, MixNMatch only needs a loose bounding box around the object to model the background, but no annotations at all for the object’s pose, shape, or texture.
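To make the disentanglement idea concrete, here is a minimal, hypothetical sketch in plain NumPy — not the authors’ code, and the code dimensions and encoder/generator are toy stand-ins — of how four separate latent codes (background, pose, shape, texture) could each be taken from a *different* source image’s encoding and fed jointly to a single generator:

```python
import numpy as np

# Hypothetical per-factor code dimensions (not the paper's actual sizes).
DIMS = {"background": 8, "pose": 4, "shape": 6, "texture": 6}

def encode(image_id: int) -> dict:
    """Stand-in encoder: map an image to one latent code per factor.
    In MixNMatch, separate learned encoders predict these codes
    from real images; here we just seed a RNG with the image id."""
    local = np.random.default_rng(image_id)
    return {name: local.standard_normal(d) for name, d in DIMS.items()}

def mix_codes(sources: dict) -> np.ndarray:
    """Take each factor's code from a different source image and
    concatenate them into the generator's joint input vector."""
    return np.concatenate([encode(sources[name])[name] for name in DIMS])

def toy_generator(z: np.ndarray) -> np.ndarray:
    """Stub generator: a fixed random projection to a tiny 4x4 'image'.
    The real model is a deep conditional GAN generator."""
    W = np.random.default_rng(42).standard_normal((16, z.size))
    return (W @ z).reshape(4, 4)

# Background from image 0, pose from image 1, shape from 2, texture from 3.
z = mix_codes({"background": 0, "pose": 1, "shape": 2, "texture": 3})
img = toy_generator(z)
print(z.shape, img.shape)  # (24,) (4, 4)
```

The point of the sketch is the interface, not the model: because each factor lives in its own code, swapping the `background` entry in `sources` changes only that factor’s contribution to the generator input, which is exactly the mix-and-match behavior the paper’s name refers to.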

Problem
