Becoming Human

CryEngine’s latest tech makes great strides toward ascending the uncanny valley

Steve Haske
Feb 26, 2014 · 9 min read

Exponential growth in technology is making it harder and harder to convince ourselves we’re not living in a piece of everyday science fiction. Between DARPA contractually collaborating on faunal field operators and smartphone AI developing fast enough to outpace our own neural networks in a few short years, the future feels as present as it is striking.

So naturally conversations about the uncanny valley, that “off” feeling you get when encountering something artificial that’s mimicking a human without quite pulling it off, have become increasingly commonplace. And like so many other complex ventures into new territory, the crucial difference between climbing out of that wide gulf and falling just short is a matter of critical nuance.

As a next-generation game designed to take full advantage of the power of Microsoft’s Xbox One, the vividly barbarous legionary tale Ryse: Son of Rome is about as subtle as a gladius to the gut, even as the tech behind it employs countless seductive technical subtleties. If nothing else, the wizardry of developer Crytek’s proprietary CryEngine is a sight to behold: Ryse’s opening setpiece rages through the haze and cinder of battle as Rome burns and the brushed metallic finish of an army of centurions reflects dully against a late afternoon sun.

CryEngine’s technical wizardry brings everything from legit cinematography to realistic lighting to life in near-CG quality.

Though some aspects of the user interface are necessarily immersion-breaking, filmic touches like camera shake, depth-of-field effects and motion blur can make it hard to differentiate between Ryse and a motion-captured CG film, particularly at a glance. Of course, the uncanny valley is most squarely focused on human likeness — the term was coined in 1970 by robotics scientist Masahiro Mori to explain why we find technological facsimiles that look like us so disturbing — and breaking free from the confines of the uncanny is where Crytek has arguably labored the most, creating some of the most realistically rendered game characters to date.

When it comes to believability, this focus is to be expected.

“The biggest [factor] to get over the uncanny valley is definitely the facial animation,” says Crytek US engine business development manager Sean Tracy. “That’s the thing that breaks more often than anything, is the faces of the characters.”

A player’s sense of unease tends to go from non-existent to extreme when a character’s face itself breaks: say, when a glitch causes a character to clench their teeth in an unnaturally horrifying open-mouthed smile while they’re supposed to be speaking lines of dialogue. (This has actually happened to me in a big-budget game, though it wasn’t Ryse.)

These breakdowns can be the result of allowing only a small number of bones in a model’s facial skeleton to influence each vertex (the points that define the edges and shape of a rendered mesh). The fewer bone influences per vertex, the harder it is to accurately mold complex layered surfaces like skin. Typically four is the number that animators use, but Crytek has doubled that, developing what they call “eight weight skinning”.

“When you have a really dense face, it’s a little bit tricky to only have four bones influence a single vert because you can’t do the folds, you can’t get things around the nose or around the mouth deforming the way you would really expect it to deform,” Tracy says.
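
To make that concrete, here’s a minimal linear-blend-skinning sketch in Python (purely illustrative; the bone matrices, weights and the cap of eight influences per vertex stand in for whatever CryEngine actually does internally). Each vertex ends up at a weighted average of where its influencing bones would carry it, and raising the cap from four influences to eight is what “eight weight skinning” buys the animators.

```python
import numpy as np

MAX_INFLUENCES = 8  # "eight weight skinning"; classic pipelines often cap this at 4

def skin_vertex(rest_pos, influences):
    """Linear blend skinning for a single vertex.

    rest_pos   -- vertex position in the bind pose, shape (3,)
    influences -- list of (bone_matrix, weight) pairs, at most MAX_INFLUENCES long;
                  bone_matrix is a 4x4 transform from bind pose to current pose,
                  and the weights are assumed to sum to 1.0
    """
    assert len(influences) <= MAX_INFLUENCES
    rest_h = np.append(rest_pos, 1.0)  # homogeneous coordinates
    skinned = np.zeros(4)
    for bone_matrix, weight in influences:
        skinned += weight * (bone_matrix @ rest_h)
    return skinned[:3]

# With only four influences, tight regions like the nose and lip folds can't
# deform smoothly; doubling the budget gives animators much finer control.
```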

Crytek has been working towards perfecting realistically rendered faces for years in games like Ryse and Crysis 3.

Eight-weight skinning is only part of Crytek’s realistic face equation. In addition to motion capture, which the team did with the help of an outside effects house, the next step is using corrective blend targets. Think of these as a kind of composite, crafted by the engine choosing from a library of facial models based on the current animation of a character’s face.

That library of “morph targets” is how animations were typically done before tech advancements changed the game.

“Basically you would have maybe 90 or 100 models of this face in different sort of shapes, so he might be saying ‘O’ or ‘Yea’ or whatever those different phonemes are that we want from the lips,” Tracy says. “So in the past you would actually just blend in different morph targets depending on what he’s saying.”
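
A rough sketch of that older morph-target approach might look like the following (the phoneme names and weights are invented for illustration): each target stores the full set of vertex positions for one extreme shape, and the displayed face is simply a weighted blend of those targets against the neutral pose.

```python
import numpy as np

def blend_morph_targets(neutral, targets, weights):
    """Blend a neutral face mesh toward a library of morph targets.

    neutral -- (N, 3) array of vertex positions for the resting face
    targets -- dict mapping target name -> (N, 3) array of vertex positions
    weights -- dict mapping target name -> blend weight in [0, 1]
    """
    result = neutral.copy()
    for name, weight in weights.items():
        # Add the weighted offset of each target relative to the neutral pose.
        result += weight * (targets[name] - neutral)
    return result

# Hypothetical usage: lean mostly into an "O" phoneme shape mid-word.
# face = blend_morph_targets(neutral, targets, {"phoneme_O": 0.8, "phoneme_E": 0.1})
```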

Now that more primitive process is coupled with the performance capture data.

“When we’re doing a certain bone animation — for example when [protagonist Marius Titus] is screaming, we’ll actually blend in a sort of screaming morph target during the bone animation. So what happens is you get a mix of the morph target, plus this bone animation,” Tracy says.

That may all seem pretty technical, but the result is a face free of any unnatural mathematical tearing at its seams — sort of an animation equivalent of using Photoshop’s healing brush.

“That’s why these are corrective,” Tracy says. “[We’re] sort of fixing the mouth so it doesn’t get completely torn apart, because typically in games you do lose some control of the vertices, especially on the outer edges, so [Marius’] mouth might look way too wide or something.”
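
In code terms, a corrective target can be thought of as a second pass layered over the already-skinned vertices rather than over the bind pose (the single blend weight below is a simplification for illustration, not Crytek’s actual math).

```python
import numpy as np

def apply_corrective_blend(skinned, corrective_target, weight):
    """Layer a corrective morph target over an already bone-skinned face.

    skinned           -- (N, 3) vertex positions produced by the bone animation
    corrective_target -- (N, 3) sculpted "fixed" shape for this pose (e.g. a scream)
    weight            -- how strongly to pull toward the corrective shape, in [0, 1]
    """
    # The corrective offset repairs regions (mouth corners, lip folds) that the
    # bone deformation alone tears or over-stretches.
    return skinned + weight * (corrective_target - skinned)
```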

Complications notwithstanding, there’s no one-step solution to effectively combining bone animations and corrective blend targets.

“There’s not a lot of magic in terms of technology for the facial system,” Tracy says. “That’s a lot of really hard work by a lot of artists in Ryse.”

Crytek may be getting close to escaping the uncanny valley, but their pursuit of realism has always been a point in and of itself rather than an outgrowth of some deeper philosophy. With big-budget game productions yielding shorter and shorter experiences, it’s important that fans get the highest production values possible for their money, says Tracy, adding that the company’s push into the photoreal has always reflected that.

You could practically live here (though you probably wouldn’t want to).

“With Cevat, it’s always been photoreal — it needs to be believable, it needs to be immersive,” Tracy says, referring to Crytek president and CEO Cevat Yerli. “That’s kind of our vision, if you would: trying to get to pure CG quality in real time. And honestly we’re very close.”

A key difference that widens the gap between Crytek and their competitors is that the Frankfurt, Germany-based developer goes out of their way to keep as many aspects of their tech in-house as possible, so compatibility problems with third-party middleware (for instance, Simplygon, a mesh-simplification tool often used alongside engines like Unreal that renders scenery in greater or lesser detail depending on how close the player is to an object) never arise. It’s a methodology that highlights the extreme attention to detail they bring to their work.

This high-end direction is nothing new. Their Crysis series has been lauded countless times over the past several years for its performance and detail when running on a souped-up PC, though with the advanced processing capabilities of next-gen hardware, Ryse is something of a benchmark for console titles.

Every ounce of power is needed, too. Apart from the game’s various animation techniques, Tracy says other components contributing to Ryse’s rich output unsurprisingly eat up a lot of CryEngine’s bandwidth.

“That’s kind of our vision, trying to get to pure CG quality in real time. And honestly we’re very close.”

Probably the other biggest contributing factor to making Ryse look as photoreal as it does is realistic lighting. Crytek’s solution here, physically based shading, is something they have been building toward for years. In a nutshell, physically based shading renders a world with realistic lighting — that is, lighting that accurately reflects, refracts and diffuses according to the different types of materials it touches.

“Classically in games this is actually a really tricky thing to solve, because all our materials react very differently to certain types of lights,” Tracy says. “Whether it’s fire that’s flickering or whether it’s the sunlight. So what we needed was consistency across the entire game.”

With the recent advancements in next-gen hardware, the team was able to get this computationally aggressive set of algorithms up and running, and Tracy says the abundant juxtaposition of metal and non-metal materials in Ryse made it a good candidate to put their physically-based shaders through their paces.
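
As a very loose illustration of what “physically based” means in practice (a textbook-style sketch using the standard Lambert diffuse and Schlick Fresnel terms, not CryEngine’s shader code), every surface is described by the same small set of parameters, so torchlight and sunlight obey one consistent set of rules whether they land on a bronze cuirass or a leather strap:

```python
import numpy as np

def shade(albedo, metalness, n_dot_l, n_dot_v, light_color):
    """Minimal metal/non-metal shading split in the spirit of physically based models.

    albedo      -- base surface color, (3,) array in [0, 1]
    metalness   -- 0.0 for dielectrics (cloth, skin, stone), 1.0 for metals
    n_dot_l     -- cosine between surface normal and light direction (clamped >= 0)
    n_dot_v     -- cosine between surface normal and view direction (clamped >= 0)
    light_color -- incoming light color/intensity, (3,) array
    """
    # Dielectrics reflect roughly 4% of light at normal incidence; metals tint
    # their reflections with their own color and contribute no diffuse term.
    f0 = np.full(3, 0.04) * (1.0 - metalness) + albedo * metalness
    fresnel = f0 + (1.0 - f0) * (1.0 - n_dot_v) ** 5   # Schlick approximation
    diffuse = albedo * (1.0 - metalness) / np.pi        # Lambert, energy-scaled
    return (diffuse + fresnel) * light_color * n_dot_l
```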

In the past, any photorealism Crytek was able to achieve was hampered by tech that wasn’t quite there.

“As soon as you’d actually do anything in the world with it in real time it would sort of break down. You need those shading rules while the light’s rolling over the surface — how it’s gonna react to a different index of refraction,” Tracy says. “But once you have an entire game that’s actually physically based, not only do you have a photoreal game, but you also have a photoreal game that can move.”

There are numerous other facets of CryEngine utilized to create a realistic space. Tracy explains how CryEngine’s own LOD, or level of detail, generator frees up processing power by only rendering as much detail as is needed in relation to the player’s distance from any given object, similar to Simplygon; he says it’s not actually the polygons that gobble resources, but the number of different materials present in any given rendered model (so, for example, the wood, glass, metal and other materials inside a building).
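
A toy version of that distance-based trade-off might look like this (the cutoff distances and level count are invented; a real LOD generator also weighs screen-space size and, as Tracy notes, the number of materials in the mesh):

```python
def select_lod(distance_to_camera, lod_distances=(10.0, 30.0, 80.0)):
    """Pick a level-of-detail index based on distance from the player.

    lod_distances -- cutoffs in world units; LOD 0 is the full-detail mesh,
                     higher indices are progressively simplified versions.
    """
    for lod, cutoff in enumerate(lod_distances):
        if distance_to_camera < cutoff:
            return lod
    return len(lod_distances)  # farthest objects use the cheapest mesh

# A building far in the distance might drop from LOD 0 to LOD 3, collapsing its
# wood, glass and metal sub-materials into fewer, cheaper draw calls.
```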

With advanced rendering tricks like geom caching, Crytek could soon revolutionize how in-game body and facial animations are achieved.

Ryse is also using a system called geom cache, which seems to have the quiet potential to be a revolution for rendering technology. Essentially, any time an in-game action requires movement — whether it’s an explosion of flame, waves crashing against a beaten shore or even potentially the primal scream of a Roman warrior — that animation requires a skeleton from which to build. Geom caching eliminates that requirement.

Geom caching works based on rendering techniques used in motion pictures. The film equivalent, called Alembic, pulls a point cache — essentially storing the positions of vertices on any given asset as a series of points that can be used in-engine — once per frame.

Geom caching does this in real time by throwing out gobs of duplicate data that overlap frame to frame, freeing up enough memory to play these prebaked animations back on the fly. The result? The point cache data from the animation’s vertices replaces the need for a skeleton, saving designers a significant amount of time.
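
A loose sketch of that idea (the tolerance and data layout below are assumptions for illustration, not the Alembic or CryEngine formats): store the full mesh once, then per frame keep only the vertices that actually moved, so playback needs no skeleton at all.

```python
import numpy as np

def build_point_cache(frames, tolerance=1e-5):
    """Compress a baked vertex animation by storing only the vertices that move.

    frames -- list of (N, 3) arrays, one snapshot of vertex positions per frame
    Returns (base_frame, deltas), where each delta holds just the indices and new
    positions of vertices that changed since the previous frame; playback applies
    the deltas in order, so no skeleton is needed to reproduce the motion.
    """
    base_frame = frames[0].copy()
    deltas = []
    previous = base_frame
    for positions in frames[1:]:
        moved = np.any(np.abs(positions - previous) > tolerance, axis=1)
        indices = np.nonzero(moved)[0]
        deltas.append((indices, positions[indices].copy()))
        previous = positions
    return base_frame, deltas
```
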
Tracy says he hopes Crytek can further develop geom caching to replace the need for morph targets in facial animation, among other developments the company isn’t talking about yet.

“Once you have an entire game that’s physically-based, not only do you have a photoreal game, but you have a photoreal game that can move.”

“In the future what we hope to see is expanding the geom cache system to try to do facial and things like this, because again if we can get rid of the bones out of the face and not use morph targets and use something that’s even more advanced — that would make all the sense in the world to do.”

In any case, Ryse’s sophistication will likely soon be outclassed by whatever Crytek is doing next. Continual advancement is perhaps their real philosophy, and Tracy says they don’t plan on stopping any time soon.

“It’s never going to be like that for Crytek,” he says. “As we finish one piece of tech there’s ten other pieces of tech we’re wanting to work on or wanting to research.”

Nor has the developer’s pursuit of realism yet resulted in a fully-realized digital human replica — or even any would-be simulacra that may lie beyond.

“And I still don’t think we’ve totally overcome the uncanny valley,” Tracy says. “I think it’s gonna be awhile before we can actually break through that.”

