A Quick Guide to Designing for Augmented Reality on Mobile (Part 4)

This article is Part 4 of an ongoing series. Catch up on Part 1, Part 2, and Part 3 here.

Understanding depth and volume is crucial when designing for Augmented Reality. We experience the world in three dimensions and need to design accordingly. In this article, I’ll shine a light on what 3D means in Augmented Reality. We will take a look at the components of a 3D scene, how the underlying technology works, and methods for producing 3D objects. This article is ideal for readers who want to:

  • Create their own 3D content for AR
  • Learn how to source 3D content
  • Have a better understanding of the current landscape for 3D

How can designers start learning 3D?

For many years, 3D had an impossibly high barrier to entry for designers. It’s long been considered a technical medium reserved for industries like visual effects, video games, and medical illustration. Adoption in design has been further blocked by a steep (and rocky) learning curve, poor user experience, and prohibitively expensive entry fees.

As the need for designers with 3D skills skyrockets, however, new initiatives are emerging to make learning 3D easy, accessible, and fun for designers.

One such initiative is 3D for Designers, which has become the de facto starting point for designers and illustrators who want to add 3D to their toolkit. Created and taught by design industry veteran Devon Ko, it includes both free lessons and a comprehensive 3D design fundamentals course. If you want to learn shape, form, lighting, color, and animation in 3D, I highly recommend starting here.

Disclosure: Devon helps me edit this Augmented Reality on Mobile series, and I’m a Teaching Assistant in her course 💖


Composition of a 3D Scene

A 3D scene is composed of several objects that exist in x, y, and z space. Some of the more common objects you will find in a scene are as follows:

Mesh: A collection of points, edges and faces that describe an object. 
e.g.: A 3D mesh of a sphere

Camera: The viewpoint from which the scene is seen, much like a physical camera.
e.g.: A view that stays fixed on the sphere no matter where it goes.

Light: A source of illumination that creates light and shadow in the scene.
e.g.: A spotlight that only exposes something in the middle of a scene.

Material: Contains information about how an object looks, including color, texture, and density.
e.g.: A sphere that looks like it’s made out of wood.

Shader: Similar to a material but contains procedural data about how an object should look.
e.g.: Creating the effect of electricity zapping away a sphere.

An example of a shader that can be applied to a mesh in Unity

A 3D file may also contain additional data such as animation and rigging information. A mesh by itself or with additional data is also referred to as a 3D model.
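
To make the idea of a mesh concrete, here is a minimal sketch in Python of how points and faces can be stored, and how the edges follow from them. The shape (a tetrahedron), the variable names, and the `edges_of` helper are illustrative assumptions, not the internal layout of any particular 3D tool.

```python
# A tetrahedron: the simplest closed mesh (4 points, 4 triangular faces).
# Each point is an (x, y, z) position in space.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]

# Each face lists the indices of the points at its corners.
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]


def edges_of(faces):
    """Collect the unique edges implied by a list of faces."""
    edges = set()
    for face in faces:
        # Pair each corner with the next one, wrapping around at the end.
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)


edges = edges_of(faces)
print(len(vertices), "points,", len(edges), "edges,", len(faces), "faces")
```

Storing faces as indices into a shared list of points means neighbouring faces reuse the same points instead of duplicating them.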

Pre-Rendering vs Real-Time Rendering

Rendering is the process of interpreting 3D scene data into a graphic. Traditionally, 3D software has a render preview mode that can be activated to see a (typically) lower-quality preview of the final render. Depending on the hardware and complexity, a single frame can take a few seconds or even days to render.

AR as a medium cannot afford to wait seconds, let alone days, to show content. Rendering must happen instantly and continuously, and therefore requires a real-time rendering solution.

A real-time renderer is capable of displaying and updating information immediately. Examples include Unity’s renderer and Eevee, the real-time renderer in Blender.

For a real-time experience to feel smooth, frames should update at about 60 frames per second (60 fps). Keeping an eye on the fps helps ensure that virtual content keeps pace with the real world instead of stuttering or lagging behind it.

Examples of different frames per second
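
A frame rate translates directly into a time budget for each frame. The short Python sketch below works out that budget for a few common frame rates; the function name `frame_budget_ms` is made up for illustration.

```python
def frame_budget_ms(fps):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps


for fps in (24, 30, 60):
    print(f"{fps} fps -> {frame_budget_ms(fps):.1f} ms per frame")
```

At 60 fps, everything in the scene has to be drawn in roughly 16.7 milliseconds, which is why real-time content has to be so much lighter than pre-rendered content.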

Common 3D content types

There are several 3D file formats and extensions. Some are exclusive to their authoring environment, whereas others are open. The following are common and emerging formats:

  • Object File (.obj): This is the most common and basic type of 3D file. It contains only the geometry information of an object. Most .obj files are accompanied by an .mtl (material library file) that contains information about the material and textures.
  • Filmbox (.fbx): This file format stores more than just geometry and includes scene, camera, lighting, rigging and other 3D information. It does not include textures but may sometimes have base surface color. This format is maintained by Autodesk.
  • Collada (.dae): Similar to .fbx, but maintained by the open source community.
  • GL Transmission Format (.glTF): The self-proclaimed ‘JPEG of 3D’, glTF is an open, royalty-free 3D file format developed by The Khronos Group. It can carry a diverse range of data and was created to meet the emerging needs of web and mobile 3D. It is functionally similar to an .fbx but much smaller in size.
  • Universal Scene Description ZIP (.USDZ): This is a zipped version of a .USD file, a format developed by Pixar to hold robust and interchangeable data in a single file format. This means that you can have variations of an object in the same model instead of having several models.
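
To show how simple the most basic of these formats is, here is a minimal Python sketch that writes geometry as .obj text. It covers only the two most basic record types (`v` for a point and `f` for a face, with face indices starting at 1) and ignores materials, normals, and texture coordinates. The `to_obj` helper is a hypothetical name, not part of any library.

```python
def to_obj(vertices, faces):
    """Serialize points and faces as minimal Wavefront .obj text."""
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    # .obj face indices are 1-based, so shift each 0-based index by one.
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines) + "\n"


triangle_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
triangle_faces = [(0, 1, 2)]

print(to_obj(triangle_vertices, triangle_faces))
```

Because the result is plain text, an .obj file can be opened in any text editor, which is handy when checking whether a purchased model actually contains the geometry you expect.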

So what’s the best file format for AR?

The challenge with loading 3D objects into an AR application is that it needs to happen immediately, in real time.

The best file format is one that loads and performs as fast as possible, and that’s the goal of both .glTF and .USDZ.

The major conflict between these formats is that the industry is divided in its support. For example, Apple exclusively supports and encourages the use of .USDZ for ARKit applications, whereas Chrome does not support this format on the web and promotes .glTF instead.

If you are considering commissioning or buying a 3D asset, my recommendation would be an .fbx, since you can convert it into both .glTF and .USDZ based on your needs.


Constructing a 3D object

3D models are constructed in two primary ways: sculpting and modeling. Figuring out first what type of object you need will help determine the software or artist that is right for the task.

Sculpting: The act of deforming polygons with a brush point, similar to working with clay. This method is best for organic shapes like figures and animals. The most popular software packages specialized for this type of work are ZBrush and Mudbox.

Modeling: Controlling every point with precision, similar to working with vectors. This method is best for hard surfaces like machines and architecture.

3D tools such as Blender, Maya, and Cinema 4D are capable of both types of authoring.

Whichever process is chosen, it will ultimately create a mesh. A mesh is simply a collection of points, edges, and faces that help describe a 3D object in space. These points, edges, and faces combine to form polygons. The higher the polygon count, the smoother your 3D model will look. However, a higher polygon count can also hurt performance and load times. This is one of the main reasons current AR games do not look as high fidelity as a game designed for a console.

Example of a sphere going from a low to a high polygon count
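
To get a feel for how quickly polygon counts grow, the Python sketch below counts the triangles in a sphere at three levels of detail. It assumes a "UV sphere" layout (a globe-like grid of longitude segments and latitude rings, with triangle fans at the two poles). The function name and the chosen resolutions are illustrative only.

```python
def uv_sphere_triangles(segments, rings):
    """Triangle count for a UV sphere.

    segments: divisions around the equator (longitude).
    rings: horizontal bands from pole to pole (latitude).
    The top and bottom bands are fans of `segments` triangles each;
    every other band is made of `segments` quads, or 2 triangles each.
    """
    cap_triangles = 2 * segments
    body_triangles = 2 * segments * (rings - 2)
    return cap_triangles + body_triangles


for segments, rings in [(8, 4), (32, 16), (128, 64)]:
    count = uv_sphere_triangles(segments, rings)
    print(f"{segments} x {rings}: {count} triangles")
```

Each step here quadruples the resolution in both directions, and the triangle count grows from 48 to 960 to 16,128. Smoothness comes at a steep cost, which is why mobile AR content tends to use the lowest count that still looks acceptable.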

Sometimes, instead of creating a high-polygon mesh, a material can introduce a similar level of detail with a faster load time.

Materials

There’s a misconception that materials are like a layer of paint on an object. Materials are more like upholstery: they have structure and properties of their own and can even influence the shape of a model. They can also exist separately from the mesh and be reused. The most popular tools for authoring materials are Substance Painter and Substance Designer.

All of these spheres have the same mesh but different materials applied over them

UV Map: When a material is applied to an object, the UV map is one way to define how it is projected on the mesh. For example, if a flat texture of the earth is applied to a sphere, without a UV map it would look like all of the continents are squished into one spot.
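
For the earth example, a UV map boils down to a rule that gives every point on the sphere a position (u, v) on the flat image, with both values between 0 and 1. The Python sketch below uses one common convention for a unit sphere (longitude becomes u, latitude becomes v, with y as the up axis). Real tools differ in axis orientation and in where the seam sits, so treat the function name and conventions as assumptions.

```python
import math


def sphere_uv(x, y, z):
    """Map a point on a unit sphere to (u, v) texture coordinates.

    u wraps around the equator (longitude), v runs from the south
    pole (0.0) to the north pole (1.0). Assumes y is the up axis.
    """
    u = 0.5 + math.atan2(z, x) / (2 * math.pi)
    v = 0.5 + math.asin(y) / math.pi
    return u, v


print(sphere_uv(1.0, 0.0, 0.0))   # a point on the equator
print(sphere_uv(0.0, 1.0, 0.0))   # the north pole
```

Every point on the equator gets a different u value, so the continents spread around the globe instead of collapsing into one spot.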

In some cases a purchased model may not look correct, oftentimes because the materials are missing or the mapping is incorrect.

Engine Specific Materials

Not all renderers are the same. In fact, each rendering engine has unique characteristics that affect how something looks. For example, a model purchased with V-Ray materials will only look as intended if you are using the V-Ray renderer.

Examples of popular rendering engines

A material is not the only thing that affects how your model looks. The following can also influence the visualization of an object.

Lighting

Lighting does more than just brighten a dark scene. It’s the secret to making something look like it belongs in an environment. Incorrect lighting is one of the most obvious visual discrepancies a viewer can pick up on.

Lighting can sometimes also have a greater impact on the color of an object than the materials themselves.
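
A rough way to see why is the simplest textbook shading model, diffuse (Lambertian) shading, where the color that reaches the eye is the material’s color multiplied by the light’s color and by how directly the surface faces the light. The Python sketch below is a deliberate simplification with made-up color values. Real engines layer far more on top of this.

```python
def lambert(albedo, light_color, normal, light_dir):
    """Diffuse shading: material color x light color x facing factor.

    albedo, light_color: (r, g, b) values between 0 and 1.
    normal, light_dir: unit vectors; light_dir points toward the light.
    """
    facing = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return tuple(a * c * facing for a, c in zip(albedo, light_color))


red_material = (0.8, 0.2, 0.2)
up = (0.0, 0.0, 1.0)

# The same red material under white light and under blue light.
print(lambert(red_material, (1.0, 1.0, 1.0), up, up))
print(lambert(red_material, (0.1, 0.1, 1.0), up, up))
```

Under white light the surface reads as red. Under the blue light the blue channel ends up strongest, so the same material appears dark blue.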

For lighting to be believable, it needs to be accompanied by the right shadows. Our eyes are trained to recognize where an object is in space based on the cues that shadows provide. For example, natural light from the sun casts stronger directional light and shadows than artificial light.

Some models come with a lighting rig, which is a set of lights that mimic different studios or setups. These setups can be static or animated and can have different properties.

Sometimes the lighting is defined by a 360° image. These are called Image Based Lights (IBL) and are an inexpensive way to create realistic lighting.

An example of an IBL being used to light the kettle.

Many AR services use a method called lighting estimation to generate more realistic lighting and shadows. This is a method of using computer vision to understand the world around the user and then generate the correct lighting setup. This process is similar to an IBL and is constantly analyzing information so it can update to match any new changes that take place.

An example of real-time lighting estimation in ARCore. The introduction of a bright light illuminates the model.
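
As a toy illustration of the idea, and not how ARCore or any other AR service actually implements it, the Python sketch below estimates a single ambient brightness value by averaging the perceived brightness of the pixels in a camera frame. The weights are the standard Rec. 709 luminance coefficients. The function name and the tiny hand-made frames are assumptions for the example.

```python
def estimate_ambient_intensity(pixels):
    """Average perceived brightness of a frame, between 0 and 1.

    pixels: a list of (r, g, b) values, each between 0 and 1.
    Green is weighted most heavily because the eye is most
    sensitive to it (Rec. 709 luminance coefficients).
    """
    if not pixels:
        return 0.0
    total = sum(0.2126 * r + 0.7152 * g + 0.0722 * b for r, g, b in pixels)
    return total / len(pixels)


dim_room = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2)]
lamp_on = [(0.1, 0.1, 0.1), (1.0, 1.0, 0.9)]

print(estimate_ambient_intensity(dim_room))
print(estimate_ambient_intensity(lamp_on))
```

Re-running an estimate like this on every new camera frame is what lets the virtual lighting respond when a real lamp is switched on.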

3D Scanning

With the advancement of cameras and sensors on phones, 3D scanning is starting to pick up momentum as a viable way of generating content from a phone. One of the most common methods of scanning is using photogrammetry.

An application like Meshroom can take image data from your phone and help generate a mesh so you have a 3D model.

An excellent case study of using different 3D techniques including scanning by Aaron Covrett

At the moment, 3D scanning via phones still has some way to go, since the resulting mesh is not 100% accurate. If you do go this route, be prepared to clean up the materials and the mesh that get generated.

The good news is that mobile device manufacturers are continuously investing in better hardware and sensors, which will in turn enable even more precise scanning in the future. A time when you can instantly scan a production-ready 3D asset might not be that far away.


Resources

There are several resources that can help make it easier to get the right 3D content. Here are a few recommendations:

ArtStation: A resource and community for sharing work and finding professional 3D artists.

TurboSquid: A paid stock site with some of the most extensive offerings of 3D models and textures.

Google Poly: Thousands of free 3D assets optimized for AR/VR.

Sketchfab: A web-based service and community for viewing 3D models and finding artists.

Unity Asset Store: A large offering of paid and free 3D models and extensions optimized for a real-time engine.

Mixamo: A free web service that automatically rigs biped models.


In Part 5, I will explore tips and tricks from the visual effects (VFX) industry that can help craft more realistic and compelling AR experiences.

Thanks, as always, to Devon Ko for the editing, and to Tony Parisi, Jeremy Cowles, and Brendan Ford for the insights.