Automating render captures with PlayCanvas

Generate captures of 3D models without human input using PlayCanvas

Daniel Murdolo
XRLO — eXtended Reality Lowdown
6 min read · May 20, 2022


Daniel is a Web Lead Developer at Magnopus. He has been in the industry for 20 years, moving from front-end development to design and back to full-stack development. In the last few years, he has made a sideways move into game, 3D, and mixed reality development, working on projects for The North Face, Wella Professionals, and Nokia, to name a few, plus a Drum award-winning project for Napapijri.

Why we decided to explore automating render captures

As part of our 3D asset pipeline, we identified the need to generate thumbnails in response to a user event. The requirement: generate a consistent thumbnail for any 3D model that has been uploaded, and return an image to display in the UI once the upload is complete.

Since PlayCanvas is our 3D engine of choice, it made sense to investigate automating the capture process within the engine itself, rather than reaching for external tools when the capability was already there.

With all of that in mind, how would we go about implementing it?

The implementation

While the goal of our implementation is to automate sending a 3D model for capture to our tool, we need to start with a fixed model to ensure the export works as expected.

A summary of the steps we’ll take to achieve this:

  • Set up a PlayCanvas instance that renders our model to show what we’ll capture, plus an additional camera for capturing the screenshot.
  • Add a render target and textures to hold the buffer data to capture the content of the canvas and allow for the data to be extracted.
  • Create and apply image data to an HTML canvas element for extraction.
  • Convert the image data to a URL and download the captured screenshot.

Let’s introduce our go-to test model, duck.glb from the glTF sample models collection.

Side note: The code below uses our custom rendering library, built on the PlayCanvas engine API. While some syntax will differ from the PlayCanvas docs, the logic for the task remains the same.

To begin, we’ll need to set up an instance of PlayCanvas that loads our Duck model and can be viewed in the browser, with some camera controls to change the rotation of the asset for the final output.

renderer.scene.createCamera({
  name: "camera",
  fov: 0.4,
  clearColor: new Color().fromString(clearColor),
  disableBloom: true,
  position: [0, 0.75, 4],
  rotation: [0, 0, 0, 1],
});
const renderCameraRef = renderer.scene.createCamera({
  name: "renderCamera",
  fov: 0.4,
  clearColor: new Color().fromString(clearColor),
  disableBloom: true,
  position: [0, 0.75, 4],
  rotation: [0, 0, 0, 1],
});
// Ensure it renders first so as not to interfere with other cameras
renderCameraRef.value.camera.priority = -1;
const mesh = await renderer.scene.loadMesh({
  name: assetName,
  filename: `${assetName}.glb`,
  url: assetUrl,
  position: [0, 0, 0],
});

This will set up a canvas element that is the full size of our browser window and renders the duck inside our environment.

Additionally, as part of this step, we created a second PlayCanvas camera that will be used to capture the image and its associated pixel data.
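For readers following along with the stock engine rather than our wrapper, the equivalent setup looks roughly like this. It’s a sketch, not our actual code: the entity names, clear colour, and canvas id are illustrative, and note that the stock camera component takes its fov in degrees.

import * as pc from 'playcanvas';

// A sketch of the same two-camera setup against the stock engine API.
const canvasEl = document.getElementById('application-canvas') as HTMLCanvasElement;
const app = new pc.Application(canvasEl, {});
app.setCanvasFillMode(pc.FILLMODE_FILL_WINDOW);
app.setCanvasResolution(pc.RESOLUTION_AUTO);

// Main viewing camera
const camera = new pc.Entity('camera');
camera.addComponent('camera', { clearColor: new pc.Color(0.9, 0.9, 0.9) });
camera.setPosition(0, 0.75, 4);
app.root.addChild(camera);

// Second camera dedicated to the capture; priority -1 renders it first
const renderCamera = new pc.Entity('renderCamera');
renderCamera.addComponent('camera', { clearColor: new pc.Color(0.9, 0.9, 0.9) });
renderCamera.camera!.priority = -1;
renderCamera.setPosition(0, 0.75, 4);
app.root.addChild(renderCamera);

// Load the glTF container and add the duck to the scene
app.assets.loadFromUrl(assetUrl, 'container', (err, asset) => {
  if (err || !asset) return;
  app.root.addChild(asset.resource.instantiateRenderEntity());
  app.start();
});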

A preview of the PlayCanvas test application.

This, plus the button in the bottom left, will be our test application for exporting what you can see in the canvas as a PNG file.
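The button itself only needs to invoke the capture routine we assemble in the steps below; the wiring might look like this (the element id and captureScreenshot are placeholder names, not part of our actual code):

// Hypothetical wiring: 'capture-button' and captureScreenshot() are
// placeholders for the capture routine built in the following steps.
document.getElementById('capture-button')?.addEventListener('click', () => {
  captureScreenshot();
});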

Next, we’ll create a PlayCanvas render target and some textures to hold the buffer data we’ll use when collecting the pixel data from the screen.

// Create render target
const device = renderer._app.graphicsDevice;
const colorBufferTex = new pc.Texture(device, {
  width: device.width,
  height: device.height,
  format: pc.PIXELFORMAT_R8_G8_B8_A8,
  // @ts-ignore
  autoMipmap: true,
});
const depthBufferTex = new pc.Texture(device, {
  format: pc.PIXELFORMAT_DEPTHSTENCIL,
  width: device.width,
  height: device.height,
  mipmaps: false,
  addressU: pc.ADDRESS_CLAMP_TO_EDGE,
  addressV: pc.ADDRESS_CLAMP_TO_EDGE,
});
colorBufferTex.minFilter = pc.FILTER_LINEAR;
colorBufferTex.magFilter = pc.FILTER_LINEAR;
const renderTarget = new pc.RenderTarget({
  colorBuffer: colorBufferTex,
  depthBuffer: depthBufferTex,
  samples: 4, // Enable anti-aliasing
});
renderCameraRef.value.camera.renderTarget = renderTarget;

We’ll need a new canvas element to hold the extracted image data in preparation for downloading it as an image. The data collected in the buffer textures and the pixels array will be written to the 2D context of the new canvas.

Finally, after assigning all this data to the canvas, we call drawImage on the context to render the image data the right way up.

const cb = renderTarget.colorBuffer;
// Create a canvas context to render the screenshot to
const canvas = window.document.createElement('canvas');
const context = canvas.getContext('2d') as CanvasRenderingContext2D;
canvas.width = cb.width;
canvas.height = cb.height;
// WebGL reads pixels bottom-to-top, so the render arrives upside down;
// set up a vertical flip to correct it
context.globalCompositeOperation = 'copy';
context.setTransform(1, 0, 0, 1, 0, 0);
context.scale(1, -1);
context.translate(0, -canvas.height);
const pixels = new Uint8Array(cb.width * cb.height * 4);
const depthBuffer = renderTarget.depthBuffer;
context.save();
context.setTransform(1, 0, 0, 1, 0, 0);
context.clearRect(0, 0, cb.width, cb.height);
context.restore();
// Bind the render target's textures to a temporary framebuffer and
// read the pixel data back from the GPU
const gl = device.gl;
const fb = gl.createFramebuffer();
// We are accessing a private property here that has changed between
// engine v1.51.7 and v1.52.2
// @ts-ignore
const colorGlTexture = cb.impl ? cb.impl._glTexture : cb._glTexture;
// @ts-ignore
const depthGlTexture = depthBuffer.impl ? depthBuffer.impl._glTexture : depthBuffer._glTexture;
gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, colorGlTexture, 0);
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.DEPTH_STENCIL_ATTACHMENT, gl.TEXTURE_2D, depthGlTexture, 0);
gl.readPixels(0, 0, cb.width, cb.height, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
gl.deleteFramebuffer(fb);
// Copy the raw pixels into an ImageData and write it to the canvas
const imgData = context.createImageData(cb.width, cb.height);
imgData.data.set(new Uint8ClampedArray(pixels)); // values 0..255, RGBA, pre-multiplied
context.putImageData(imgData, 0, 0);
// putImageData ignores the transform, so redraw the canvas onto itself
// through the flip transform set above
context.drawImage(canvas, 0, 0);

Lastly, we extract the image data from the canvas element as a base64 string with a MIME type of image/png, which we then replace with image/octet-stream so the browser will download the image.

const link = window.document.createElement('a');
const b64 = canvas.toDataURL('image/png').replace('image/png', 'image/octet-stream');
link.setAttribute('download', filename + '.png');
link.setAttribute('href', b64);
link.click(); // Trigger the download
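As an aside, for larger captures canvas.toBlob avoids building a very long base64 string. A minimal sketch of the same download via an object URL, under the same assumptions as above:

// Alternative sketch: download via a Blob and object URL instead of a
// base64 data URL (lighter on memory for large canvases)
canvas.toBlob((blob) => {
  if (!blob) return;
  const url = URL.createObjectURL(blob);
  link.setAttribute('download', filename + '.png');
  link.setAttribute('href', url);
  link.click();
  URL.revokeObjectURL(url);
}, 'image/png');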

The result:

The application and the resulting image.

This is now complete and ready to be integrated into an asset pipeline or user experience with a few small changes.

We’ll remove the screenshot trigger from the button and move it into the initialise function, update the mesh loading to accept a parameter, and, on completion, return the b64 string for consumption within our UI.
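A rough sketch of what that entry point might look like; generateThumbnail, nextFrame, and captureScreenshot are hypothetical names standing in for our internal helpers:

// Hypothetical pipeline entry point: load the model, let the render
// camera draw it, then return the capture to the caller.
async function generateThumbnail(assetName: string, assetUrl: string): Promise<string> {
  await renderer.scene.loadMesh({
    name: assetName,
    filename: `${assetName}.glb`,
    url: assetUrl,
    position: [0, 0, 0],
  });
  await nextFrame();          // wait one frame so the new mesh is rendered
  return captureScreenshot(); // the readPixels-to-canvas logic above, as a b64 string
}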

Now that we have this functionality in place and it generates thumbnails for 3D models, what else could we do with it?

Other possible future uses

Here are some more use cases we’ve highlighted that could benefit from this functionality.

Add a mode to allow users to frame their 3D asset and choose when to capture the thumbnail.

An example of how Epic Games’ Unreal Engine handles custom thumbnail manipulation.

Allow users to capture custom avatar portraits and apply them as a profile picture.

A close up headshot captured using a Ready Player Me avatar in our PlayCanvas capture tool.

Add a camera mode so users can capture stills of a 3D environment and share them inside the same environment as planes.

Capture of a 3D environment placed inside the same 3D environment as an image plane.

The applications are far-reaching and can be used to offer fun and powerful experiences to users in multiple ways.

Conclusion

Automating this process increases the visibility of cause and effect for our users: the feedback they get in the user interface matches their actions. Without it, the responsibility for adding a thumbnail would fall on the users themselves, or we would need to introduce tools outside of our architecture.

It also opens the door for the implementation of new features, experiences, and enhancements to the process that we wouldn’t have been able to achieve without this automation.
