The Uncanny Valley of a Digital Landscape

Eric Arvai
Gen City Labs
Oct 27, 2022 · 5 min read

In 1970, robotics professor Masahiro Mori described the concept that was later translated as the “uncanny valley”: when we observe a humanoid object that imperfectly resembles us, it can provoke revulsion. The valley is the “dip” in our affinity for the replica.

By Smurrayinchester — self-made, based on image by Masahiro Mori and Karl MacDorman at http://www.androidscience.com/theuncannyvalley/proceedings2005/uncannyvalley.html, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=2041097

Does this translate to how we perceive a digital landscape in a metaverse space? Do we feel the same “revulsion” we feel when viewing a digital replica of a human face? The success of a metaverse project can depend not only on the reaction to the way the space actually looks, feels, and sounds, but on its utility as well.

This post will explore the various types of technology available to deploy a 3D landscape and the trade-offs that a developer needs to consider.

These trade-offs are usually between quality, money, and time. In many cases (though not all), the more money you spend, the less time the user spends waiting for the 3D space to load and the higher the fidelity you can deliver. These trade-offs apply most to cloud-based rendering schemes.

The closer the landscape resembles “real life,” the more processing power must be allocated to render it. That processing can happen on the individual user’s GPU/CPU or on a cloud-based one, and it can be done in real time or pre-loaded. There are exceptions, such as photogrammetry and LIDAR tools, but in general, the higher the fidelity you want to deliver in a visual landscape, the more data needs to be processed.
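As a rough, back-of-envelope sketch of this fidelity/data relationship: a mesh’s memory footprint grows linearly with its vertex count. The 32-bytes-per-vertex figure below is an illustrative assumption (position + normal + UV), not a fixed standard.

```javascript
// Rough illustration: higher-fidelity meshes mean more data to process.
// bytesPerVertex = 32 is an assumed layout (position + normal + UV),
// used here only to make the scaling concrete.
function meshMemoryBytes(vertexCount, bytesPerVertex = 32) {
  return vertexCount * bytesPerVertex;
}

console.log(meshMemoryBytes(10_000));    // 320000 bytes — a low-poly prop
console.log(meshMemoryBytes(5_000_000)); // 160000000 bytes — a dense photogrammetry scan
```

The point isn’t the exact numbers; it’s that a 500× jump in vertex count is a 500× jump in data that must be stored, transferred, and rendered, wherever that rendering happens.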

Unreal Engine (Pixel Streaming): The most “hyper-realistic” look with the highest cost

Pixel streaming is cloud-based rendering that essentially streams a video to your browser. There is a per-user, per-minute cost associated with the technology. This approach also has wide device compatibility, although mobile support would likely be an issue.
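Because the cost is billed per user, per minute, it compounds quickly with audience size. A minimal sketch of that arithmetic, using a purely hypothetical rate (real vendor pricing varies):

```javascript
// Back-of-envelope pixel-streaming cost model.
// ratePerUserMinute is a hypothetical placeholder, NOT vendor pricing.
function pixelStreamingCost({ concurrentUsers, minutesPerUser, ratePerUserMinute }) {
  return concurrentUsers * minutesPerUser * ratePerUserMinute;
}

// e.g. 100 concurrent users for 30 minutes each at an assumed $0.05/user-minute
const cost = pixelStreamingCost({
  concurrentUsers: 100,
  minutesPerUser: 30,
  ratePerUserMinute: 0.05,
});
console.log(cost); // 150 (dollars, under these assumed inputs)
```

Doubling the audience doubles the bill, which is exactly the “compounded by number of live users” problem listed under CON below.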

PRO:
Your slow machine won’t affect look and performance because all the processing is done in the cloud

Wide device compatibility because it essentially streams a video to your browser

Hyper-realistic

Decent compatibility across playback platforms

CON:
Expensive cloud-based processing, with costs that compound with the number of live users

Unity WebGL Build: “Highly realistic, low cost”

WebGL is a cross-platform, royalty-free web standard for a low-level 3D graphics API based on OpenGL ES. WebGL brings plugin-free 3D to the web, implemented right into the browser. Major browser vendors Apple, Google, Microsoft, and Mozilla are members of the WebGL Working Group.

WebGL is a delivery method that can incorporate many different scanning technologies to determine a 3D object’s dimensions including photogrammetry and LIDAR.

The most ubiquitous example of WebGL is Google Maps’ terrain view.

Google Maps Terrain View of Bay Area, CA

The 3D Marketplace Thingiverse uses WebGL to present 3D objects to download and print using a 3D printer. Users can rotate, zoom, and move around the object from their browser.

Downloadable and printable 3D object from Thingiverse.com

Spatial.io teamed up with OpenSea to create a 3D gallery space celebrating Juneteenth. Users can move about the space as they would in a real gallery.

Spatial.io Juneteenth Gallery

Matterport is a scanning technology that stitches together 2D photos into a 3D mesh. Users can move through a space with photorealistic visual quality. The technology is widely used in real estate. The following examples are of the house from the movie “The Silence of the Lambs.”

Matterport scan of the house from “The Silence of the Lambs”
Matterport scan dollhouse view of house from — “The Silence of the Lambs”

PRO:
Highly realistic
Wide device compatibility
Decent compatibility across playback platforms (Web, Desktop, Mobile; iOS, Android)

CON:
Potentially long loading time with slow network speeds

Voxel

Voxel is short for volume pixel: the smallest distinguishable box-shaped part of a 3D image, and the 3D equivalent of the 2D pixel. Minecraft is often cited as an example of voxel imagery, but that isn’t entirely accurate, because the game is rendered with polygons that determine the position of each “voxel.” True voxel-based platforms determine the position of each voxel relative to the others, and not necessarily on x, y, and z axes.

https://www.artstation.com/artwork/9znkL
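To make the “3D equivalent of the 2D pixel” idea concrete, here is a minimal sketch of a dense voxel grid stored in a flat array, the same way a 2D image is stored as a flat pixel buffer. The class and method names are illustrative only, not any particular engine’s API.

```javascript
// Minimal dense voxel grid: an x/y/z volume backed by a flat byte array,
// analogous to a 2D pixel buffer with one extra dimension.
class VoxelGrid {
  constructor(sizeX, sizeY, sizeZ) {
    this.sizeX = sizeX;
    this.sizeY = sizeY;
    this.sizeZ = sizeZ;
    this.data = new Uint8Array(sizeX * sizeY * sizeZ); // one byte per voxel
  }
  // Flatten (x, y, z) into a single array offset.
  index(x, y, z) {
    return x + this.sizeX * (y + this.sizeY * z);
  }
  set(x, y, z, value) { this.data[this.index(x, y, z)] = value; }
  get(x, y, z) { return this.data[this.index(x, y, z)]; }
}

const grid = new VoxelGrid(16, 16, 16);
grid.set(3, 5, 7, 1); // mark one voxel as solid
console.log(grid.get(3, 5, 7)); // 1
console.log(grid.data.length);  // 4096 voxels in a 16×16×16 volume
```

Note how memory grows with the cube of the resolution — doubling the grid to 32³ means 8× the voxels — which is why voxel platforms lean on sparse structures and chunking rather than dense arrays like this one.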

Do we really care if the digital landscape looks like real life? If it’s too close to real life, do we have an aversion to it? Definitely not. The uncanny valley for a digital landscape is determined far more by the user experience than by the quality of the renderings.

My 11-year-old son is an avid Minecraft and Fortnite player. As long as the game experience is meaningful, he doesn’t seem to care that the voxel-like experience isn’t as hyper real as the pixel streamed one. It’s all about the in-game experience.

Humans are trained from infancy to be sensitive to micro facial expressions as we connect with our mothers and fathers. If something is “off” with a digital representation of a face, we spot it instantly. The opposite seems to be true for virtual spaces. Developers strive for the most realistic representation of the real world in most cases, but are forced to make trade-offs depending on the limitations of their resources and target audience.

In the end, there is no virtual uncanny valley that applies to 3D environments. There are only canny ones.

/fin
