Prototyping our VFX-inspired Configurator

Using a compositing technique for near real-time image generation

Xavier Ayme
Plum-living
5 min read · May 5, 2023

--

In the first article of this series, I shared the genesis of the Plum styler and why we dismissed existing technologies in favor of building our own 3D inspiration tool. Keep reading to discover how we brought it to life.

After dismissing the two main technologies available on the market — real-time 3D and precomputed 3D — we had no option but to create our 3D inspiration tool from scratch. But where to start? Our breakthrough came from a discussion with Julien, a 3D and VFX expert who became Plum’s Lead 3D Artist along the way. Thank you Julien for making all this happen 🙏.

Using the VFX Principles of Compositing

In the VFX industry, one technique often used to create images is called “compositing”. In broad terms, compositing works like a digital collage: you take pixel scissors and glue to combine elements into a single image. The simplest, best-known compositing technique is the green screen used to insert a background into a video — like the weather report on TV, or the latest Marvel movie 😊

Compositing of a Tap & Kitchen environment

What if we used this technique to create our inspiration tool?

In our case, we would need to sum up all the kitchen elements (fronts, handles, taps, etc.) pixel by pixel, including their respective contributions to other objects (reflections, shadows, color bleeding), to create a photorealistic picture of the selected design. Our tool would then build the resulting image and display it on our website interface.
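
To make the idea concrete, here is a minimal sketch of the pixel-summing step in C# (illustrative only, not our production code, assuming each element’s contribution has already been rendered as a raw RGB byte layer of identical size):

```csharp
using System;
using System.Collections.Generic;

// A minimal sketch of additive compositing. Each layer holds one element's
// contribution as raw RGB bytes; the final image is the per-channel sum
// of all the layers.
static byte[] SumLayers(IReadOnlyList<byte[]> layers)
{
    var result = new byte[layers[0].Length];
    foreach (byte[] layer in layers)
    {
        for (int i = 0; i < result.Length; i++)
        {
            // Clamp so a bright highlight cannot overflow a byte.
            result[i] = (byte)Math.Min(255, result[i] + layer[i]);
        }
    }
    return result;
}
```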

Different levels of contributions explained

See how changing the tap modifies the environment around it:

Changing the tap influences the environment around it, and the other way around

The concept is enticing, but it requires carefully selecting dependencies between variables to prevent the number of possible combinations from exploding.

Indeed, in a kitchen, fronts and handles are dependent on each other, but high cabinets and base cabinets, for example, are independent: their cross-shadows and light reflections can be ignored without compromising the photorealism of the image. Therefore, the generated picture will be the sum of the top and bottom parts, each being the combination of its variables (= the objects that compose the scene) and their contributions.
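
A quick back-of-the-envelope comparison (with made-up numbers) shows why this independence matters:

```csharp
using System;

// Hypothetical numbers: 40 front finishes x 25 handles per half of the kitchen.
long topVariants = 40 * 25;      // 1,000 top-cabinet combinations
long bottomVariants = 40 * 25;   // 1,000 base-cabinet combinations

// If the two halves interacted, every pair would need its own render:
long dependentRenders = topVariants * bottomVariants;    // 1,000,000 renders

// Treated as independent, each half only needs to be rendered once:
long independentParts = topVariants + bottomVariants;    // 2,000 image parts

Console.WriteLine($"{dependentRenders:N0} renders vs {independentParts:N0} parts");
```

Multiplication becomes addition: that is what keeps the number of images to render and store tractable.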

All we have to do is build it 😄.

Building our VFX-based Inspiration Tool

The compositing technique is a 2-step process:

Step 1: Generating Image Parts

Based on VFX techniques and software, the “Styler pipeline” takes all of the variables and automatically generates unique image parts. Each image is rendered on a render farm — a data center optimized for 3D rendering — and decomposed into its atomic parts.

For this stage, we decided to partner with Qarnot Computing, a French company that offers low-carbon, high-performance computing power. Selecting a data center that reuses its computers’ waste heat seemed obvious. Qarnot Computing has provided great support (many thanks Ariane!) and integrates natively with our 3D rendering software.

Once the rendering is complete, the 100,000 unique image parts are stored in the cloud, ready to be picked up by our recomposition service.

Step 2: Achieving near Real-Time Compositing

Our recomposition service is a C#-based microservice hosted on Azure. When it receives a request to generate a configuration, it performs the following operations sequentially (a sketch of this flow follows the list):

  1. it works out the list of image parts to be “glued” together,
  2. downloads and decompresses them,
  3. sums them up pixel by pixel, channel by channel (RGB), to create the resulting bitmap,
  4. and uploads the result to our CDN (Content Delivery Network), ready to be displayed in the client’s browser.

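Put together, the flow looks roughly like this (a hypothetical outline; the helper names are illustrative placeholders, not our actual service code):

```csharp
// Hypothetical outline of the recomposition flow; every helper name here
// is an illustrative placeholder, not our actual service code.
async Task<Uri> RecomposeAsync(KitchenConfiguration config)
{
    // 1. Work out which image parts this configuration needs.
    IReadOnlyList<string> partKeys = ResolveParts(config);

    // 2. Download each JPEG part and decompress it to raw RGB bytes.
    var layers = new List<byte[]>();
    foreach (string key in partKeys)
        layers.Add(await DownloadAndDecodeAsync(key));

    // 3. Sum the layers pixel by pixel, channel by channel
    //    (see the SumLayers sketch earlier in this article).
    byte[] bitmap = SumLayers(layers);

    // 4. Re-encode the bitmap as JPEG and upload it to the CDN.
    return await UploadToCdnAsync(EncodeJpeg(bitmap));
}
```
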
For instance, our demo kitchen is composed of:

  • Beige Rosé matt lacquered fronts and natural oak sides for high blocks
  • Bleu Nuit matt lacquered fronts for low blocks
  • Mini-rond brass handles
  • A brass plinth
  • and a Loop brass mixer tap.

Our recomposition service will sum up all the individual images corresponding to the requested configuration (between 50 and 100 on average) and send the result to the customer’s browser.

The overall image is a sum of different independent blocks ‘glued’ together

Are you still following? 😅

Developing our First Prototype

After six months of development, Julien and I had cleared a lot of hurdles, and we finally released our first working prototype, capable of managing 100,000 billion combinations with three different points of view and returning a photorealistic picture ✌️.

But there’s still a minor issue: it takes 45 seconds to compute a single image 😬.

We started investigating and found a hidden pitfall. Remember: for each configuration, our service sums up about 100 images, each around 100 KB. That is roughly 10 MB in total, which seems an acceptable amount of data for near real-time processing. So why does it take so long?

We eventually found that this calculation does not take the compression factor into account: each image is around 100 KB after JPEG compression, but the algorithm needs uncompressed byte arrays as input. After decompression, we get about 100 images, each 1800 × 1350 pixels × 3 colors per pixel (red, green, blue) × 1 byte per color ≈ 7.3 MB, for a total of about 700 MB to handle instead of 10 MB!
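
In code, the arithmetic is unforgiving:

```csharp
using System;

const int imageCount = 100;
long bytesPerImage = 1800L * 1350 * 3;           // ≈ 7.3 MB uncompressed
long totalBytes = imageCount * bytesPerImage;    // ≈ 729 MB per configuration
Console.WriteLine($"{totalBytes / 1_000_000.0:F0} MB to process");
```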

Unfortunately, the C# language (and its Bitmap class) is too high-level to manage such low-level parallel operations efficiently. Our prototype remains incompatible with real-time usage in a graphical interface 😓.
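
To give a feel for the bottleneck, here is the kind of naive per-pixel loop a first version might use with System.Drawing (a simplified sketch, not our exact code). Every Bitmap.GetPixel call goes through GDI+ interop, and at roughly 100 layers of 1800 × 1350 pixels per configuration, that means hundreds of millions of slow calls:

```csharp
using System.Drawing; // System.Drawing.Common package

// Naive accumulation with Bitmap.GetPixel (simplified sketch of the slow path).
// The caller is assumed to pass a sum buffer of size Width * Height * 3.
static void Accumulate(int[] sum, Bitmap layer)
{
    for (int y = 0; y < layer.Height; y++)
    {
        for (int x = 0; x < layer.Width; x++)
        {
            // Each GetPixel call crosses into GDI+, which is very slow
            // when repeated millions of times per image.
            Color c = layer.GetPixel(x, y);
            int i = (y * layer.Width + x) * 3;
            sum[i]     += c.R;
            sum[i + 1] += c.G;
            sum[i + 2] += c.B;
        }
    }
}
```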

Our first Styler implementation: the ‘demo’ kitchen

In the next article, I’ll share the solutions we implemented to drastically speed up the recomposition process.
