Achieving Photorealism using Neural Networks

Ayush Aggarwal
Sep 4, 2018 · 5 min read
The face of Skynet!

AI² — The saga continues…

Only this time, it will be even more boring than my previous article.

Let’s start with “AI²”. I have decided to write a series on my work involving the Active Foreground Neural Network (AFNN). For those of you who read my previous article, I would like you to think of a virtual gold medallion that I am giving to all of you. For those who didn’t, well, that’s too bad. No gold medallions for you! (Check out the article, then award one to yourself.)

So what’s so boring this time that I have decided to write about it? Well, it partly involves AFNN (once again, only avid readers will know this), and partly the project I did during my internship at PlayStation / Sony Interactive Entertainment.

Now I won’t be writing about my experience at PlayStation or the fun I had as a graphics engineer. All I can say at this moment is that the experience was great. It is imperative you understand that I am not allowed to say just how enjoyable my time at PlayStation has been.

As of this moment, I am still doing my internship, so I will have to use a combination of present and past tenses. Please don’t judge me for this.

My role as a graphics engineer on the game engine (under NDA) team was to work on the tessellation module. In April, when I started my internship, I was tasked with trying to implement photorealism in a non-procedural, closed-world game on a PlayStation 4, without drastically increasing the TDP.

For my non-technical readers out there, photorealism can be thought of as rendering a game so the visuals look almost life-like. In short, the game looks much more beautiful.

Now photorealism is one branch of the gaming industry that is commonly referred to as a black hole: whatever you try to do, there will always be some hardware limitation.

Before beginning my dreaded story, I would like to say that my work is still in progress, but it seems to be working for the most part. Most of you would know by now that NVIDIA just released a new series of GPUs called RTX (RT stands for Ray Tracing). What NVIDIA is trying to achieve is ray tracing on mainstream desktops, at a fraction of the fortune Pixar currently spends. However, such technology is nowhere in sight for consoles.

Luckily, I got this project, and thus began my work. I started off by live-training AFNN to scan and estimate the tessellation count for individual objects in every scene that had to be rendered. The scene running alongside the neural network had to be optimized for the PS4, so my first encounter with a PS4 devkit was a linker issue in one of my C++ DLL files. Once the linker error was fixed, the entire project had to be converted from CUDA to OpenCL, since the PS4’s GPU is made by AMD and supports OpenCL, not CUDA.
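To give a flavour of what such a port involves (the actual engine code is under NDA, so this is a minimal, generic sketch of my own, not the project’s code): in CUDA a kernel computes its index from blockIdx and threadIdx, while in OpenCL the same kernel reads get_global_id() and is compiled at run time on the host.

```cpp
#include <CL/cl.h>
#include <cstdio>
#include <vector>

// The CUDA original would look roughly like:
//   __global__ void scale(float* v, float s) { v[blockIdx.x * blockDim.x + threadIdx.x] *= s; }
// The OpenCL equivalent replaces the block/thread indices with get_global_id():
static const char* kSource = R"CLC(
__kernel void scale(__global float* v, const float s) {
    v[get_global_id(0)] *= s;
}
)CLC";

int main() {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

    cl_int err;
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

    std::vector<float> data(1024, 1.0f);
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                data.size() * sizeof(float), data.data(), &err);

    // Unlike CUDA, the kernel source is compiled at run time for the target GPU.
    cl_program prog = clCreateProgramWithSource(ctx, 1, &kSource, nullptr, &err);
    clBuildProgram(prog, 1, &device, nullptr, nullptr, nullptr);
    cl_kernel kernel = clCreateKernel(prog, "scale", &err);

    float s = 2.0f;
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &buf);
    clSetKernelArg(kernel, 1, sizeof(float), &s);

    size_t global = data.size();
    clEnqueueNDRangeKernel(queue, kernel, 1, nullptr, &global, nullptr, 0, nullptr, nullptr);
    clEnqueueReadBuffer(queue, buf, CL_TRUE, 0, data.size() * sizeof(float),
                        data.data(), 0, nullptr, nullptr);
    printf("data[0] = %f\n", data[0]); // expect 2.0

    clReleaseKernel(kernel); clReleaseProgram(prog); clReleaseMemObject(buf);
    clReleaseCommandQueue(queue); clReleaseContext(ctx);
    return 0;
}
```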

Preliminary checks complete, training of the NN began. The game ran at 10FPS, but was cranked up to its highest visual settings, and with the neural network running in the background, the TDP charts were through the roof. Nevertheless, I was adamant that this technique would work. However, the GPU was simply not strong enough to render a game and simultaneously train and run the NN.

After about 1,000 cycles through the same scene, I somehow hit a bottleneck at 2 epochs, and my performance charts were not looking good either. On a GTX 1080 Ti, AFNN can perform image recognition in 500 computations, at an epoch rate of 1.5, with performance in the upper 90s (in percent). But since I didn’t have much time, I couldn’t optimize the AI for use with OpenCL.

So the conclusion…

The game ran at 10FPS with the best visual scaling, with a very high TDP and a soundtrack only aliens could enjoy (I am talking about the fans, because, you know, there isn’t any wind in space).

Or is it?

After a gap of about 2 months, I tried my hand at the problem once again. Now, one of the best ways to achieve photorealism is to increase the tessellation count of assets and perform ray tracing to get the most natural lighting effects. Processing High Dynamic Range (or, for my less techy friends, HDR) is currently possible only on very powerful GPUs, such as the NVIDIA Quadro.

The solution I found was to train AFNN in such a way that the running game is pre-optimized for the consoles it has to run on. Since the technology wasn’t meant for PC, I could control at least one parameter: the end hardware would be either a PS4 or a PS4 Pro.

This time, I didn’t care about the TDP or the FPS at which the game was running. I built the game to run only at the maximum possible settings. Then, using a characteristic feature of AFNN, I started processing the tessellation count for each object currently in focus, training the network on live data in the background and using those values simply to obtain a tessellation count. This time, I didn’t push the data directly into the engine to increase or decrease the tessellation count; one of the many great things about non-procedural games is that the designers can control the level of detail required. With everything set at max, I ran the scene about 10,000 times (to increase the compute cycles and reduce the Hamming distance). My epoch rate suddenly came down to ~1.5, and the performance was in the upper 80s to middle 90s (in percent).
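To make the workflow concrete, here is a rough sketch of that logging loop. Everything named here is hypothetical: AFNN’s real interface is proprietary, so the AFNN struct, predictTessellationCount(), and the object IDs below are stand-ins for the idea, not the actual API.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the proprietary network. The real interface is
// under NDA; this only illustrates the shape of the workflow.
struct AFNN {
    int predictTessellationCount(const std::string& objectId) {
        (void)objectId;
        return 16; // placeholder: real inference would run here
    }
};

struct SceneObject { std::string id; };

// For each object currently in focus, query a tessellation count and log it
// to CSV, instead of pushing the value straight back into the engine.
void recordTessellationCounts(const std::vector<SceneObject>& inFocus,
                              AFNN& net, std::ostream& csv) {
    for (const auto& obj : inFocus) {
        csv << obj.id << ',' << net.predictTessellationCount(obj.id) << '\n';
    }
}
```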

The final result of running the neural network was a CSV file that contained the best and worst possible tessellation counts for each object viewable in the viewport. To add to the fun, I added backface culling to almost every object that could be loaded at a later time, depending upon its location in the scene. I used the CSV file as a plugin and modified the engine to use run-time tessellation count values.
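As an illustration of how such a table might be consumed at run time (the actual plugin is NDA’d, so the column layout and names below are my assumptions): the engine loads the CSV once, then looks up each object’s allowed tessellation range as the scene streams in.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <unordered_map>

// Assumed CSV layout: object_id,worst_tessellation,best_tessellation
struct TessRange { int worst; int best; };

std::unordered_map<std::string, TessRange>
loadTessellationTable(const std::string& path) {
    std::unordered_map<std::string, TessRange> table;
    std::ifstream in(path);
    std::string line;
    std::getline(in, line); // skip the header row
    while (std::getline(in, line)) {
        std::istringstream row(line);
        std::string id, worst, best;
        if (std::getline(row, id, ',') &&
            std::getline(row, worst, ',') &&
            std::getline(row, best, ',')) {
            table[id] = { std::stoi(worst), std::stoi(best) };
        }
    }
    return table;
}

// The engine would then clamp an object's run-time tessellation count into
// its recorded range, e.g.:
//   auto it = table.find(obj.id);
//   if (it != table.end())
//       count = std::clamp(count, it->second.worst, it->second.best);
```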

This time, there were no issues. The game ran at 30FPS, with near-photorealistic graphics while rendering with HBAO. To manage the TDP, I created a shader that applied a fake gaussian blur for a fake depth-of-field (DOF) effect, and reduced the tessellation count of objects hidden behind the blur. As for objects that were consuming compute to render even though they weren’t visible in the viewport, those were hidden completely using another instance of AFNN.
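The blur itself is standard separable gaussian filtering; the only engine-specific part is which objects it covers. Since the shader is NDA’d, here is just the weight math, sketched in C++ rather than shader code:

```cpp
#include <cmath>
#include <vector>

// Compute normalized weights for one 1D pass of a separable gaussian blur.
// A horizontal pass followed by a vertical pass with these weights gives the
// full 2D blur at a fraction of the cost of a single 2D kernel.
std::vector<float> gaussianWeights(int radius, float sigma) {
    std::vector<float> w(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        w[i + radius] = std::exp(-(i * i) / (2.0f * sigma * sigma));
        sum += w[i + radius];
    }
    for (float& x : w) x /= sum; // normalize so the blur preserves brightness
    return w;
}
```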

Long story short, here is the second and definitely the final conclusion…

The game ran at 30FPS (locked), with backface culling and frustum culling. The NN was no longer required, and the TDP was still within limits. Photorealism was achieved, without having to work on ray tracing.

This can easily be implemented using AFNN, and it’s currently in production for the entire game engine. As for ray tracing, we can only hope Sony releases its next-gen consoles.

Thanks for reading.

P.S.: Be on the lookout for more articles in the series.
