Jules Urbach on RTX & VR, Capture & XR, AI & Rendering, and Self Aware AI. Neural Lace Podcast S2 E2
Urbach on how RTX may vastly improve VR and AR, and on how scene capture with AI, which enables turning the world into a CG object, may change Mixed Reality for the better. We also talk about Self Aware Networks or Artificial General Intelligence, about the importance of rendering to AI and the importance of AI to rendering, and we answer a lot of questions that the community has had after major announcements at Siggraph from Nvidia and Otoy. Thanks very much to the artists who gave their permission to have their art shared in the video component of this podcast.
You’re listening to the Neural Lace Podcast Season 2 Episode 2: Octane: AI & Rendering, with your host Micah Blumberg. The link is here: https://youtu.be/yMsaNsqzjFQ
Presented by NeurotechX and Silicon Valley Global News SVGN.io
Interview Transcription (this is not a perfect transcription, I did my best to convey the intended meaning, it’s my best estimate of what was said) I put my comments in italics to make it easier to differentiate. I also enlarged the font when my attention was particularly drawn to things Jules Urbach said.
J: Jules Urbach, CEO, Otoy
M: Micah Blumberg, Journalist, Neurohacker.
Begin the Neural Lace Podcast Season 2 Episode 2: Octane: AI & Rendering
M: Hello everyone, this is Micah with Silicon Valley Global News and also with the Neural Lace Podcast, and I am here talking to Jules Urbach, the CEO of Otoy.
M: The inspiration for the call was that we recently had some epic news at Siggraph, and even following Siggraph at other events where we were seeing the latest news on Nvidia’s RTX. Also, Otoy has committed to releasing Octane Render for Unreal Engine 4, so now Octane is integrated into the two top game engines, UE4 and Unity.
M: I was watching one of the talks: Jules was interviewed at Siggraph by Nvidia, and Jules was really surprised by the fact that we now have real time ray tracing on a single GPU. Is that correct, did I interpret that correctly?
J: Oh yeah, yeah, it was a great interview and it was one of those cases where I’d been championing the cause of ray tracing hardware for a while. I was first exposed to that possibility ten years ago at Siggraph when I was on a panel with a company called Caustic Labs, which was later acquired by Imagination Technologies. They had really early ray tracing hardware and it was promising, but it wasn’t enough to really move the needle and do anything. When they got acquired by Imagination Technologies, which was the company that powered iPhone GPUs up until last year (they got bought and Apple stopped using their GPUs and that cratered the company), I was concerned that ray tracing hardware was like the electric car, the car that runs on water: it would just wither on the vine. We had tested the Imagination ray tracing hardware two years ago, and at Siggraph we announced that we were going to ship Octane on this hardware. It’s a game changer: it was 10 times faster than the compute cores for ray tracing, and therefore for path tracing, which is what we use for high quality visual effects in rendering. I think that Nvidia has been keeping this really quiet. They’ve been working on this for almost a decade, and we got let in on the secret when the hardware was almost a year away from shipping. We got the development hardware and we rebuilt as much as we could of Octane, as did everyone, on the experimental RTX hardware. We specifically had one implementation that we built on Vulkan, an API for computer graphics that was fully able to leverage those ray tracing cores, and we were seeing 8x the performance of the same class of card from the Pascal generation. It’s very rare to see a ten times improvement in speed overnight in one instance.
There is still work to be done, it’s so new, even with the year of advance work that we did, we still have probably another six months before we can start shipping products that have the ability to absolutely tap into the full power of the ray tracing cores. The other part of that interview that I did with Nvidia was that
“there is no doubt in my mind that the next decade of computer graphics, if not the next twenty years, is going to be defined by ray tracing hardware at the foundational layer, the same way that GPUs have just made 3D graphics a commodity on our phones: you can play Fortnite on an iPhone.” Jules Urbach
M: There’s been a lot of people, and I’ve shared a couple of articles on this topic, and I really wanted to point out the fact that ray tracing is going to make a huge difference in how we experience virtual reality. It’s coming in this generation of cards, the 20-series cards. It’s going to change Virtual Reality, the sense of immersion that we had in VR before. I tell people this and I get this skepticism back, and they say “well, there is no way that we can: VR has to run at 90fps, and you have to render the entire scene in real time ray tracing.” I’m trying to argue you can use it as an effect. I wanted to draw back from the hype a little bit and get a sense of the realism from you: how is this real time ray tracing going to actually affect our VR experiences, and do you expect to see headsets coming this year or next year? What do you think the reality is in terms of the industry for VR and AR?
J: I’m a big believer, I just answered a Quora post yesterday.
J: It asked “Is Real Time Raytracing a gimmick?” It’s not. There are two parts of the rendering pipeline that are transformed by having this kind of hardware in consumer cards, with the chip in pretty much everything: the low end, the high end, everything is going to have ray tracing hardware, even the lower end 20-series Nvidia cards. The first thing you can do with that in VR is you can trace primary rays. We were showing at Siggraph that we could do 4k 120 [frames per second], that’s easily 4k 120 [fps] with path tracing and shading and everything. If you just want to do foveated rendering, do all these tricks, you don’t have to work and render everything twice, and we can do all that stuff without using traditional rasterization on these RTX cards, and even on one of those cards you could drive an existing VR headset and switch from rasterization to ray tracing. I think that is a big leap for a lot of game engines. One of the reasons why we have integrated Octane into Unreal and Unity is that those engines have been built for a very long time on rasterization. We have a great relationship with both companies. I was talking with Tim Sweeney about Unreal and Brigade and all those things and he was like “Well, the first step for us is clearly hybrid, where we keep traditional rasterization the way it is now, nothing changes, but the reflections and shading (subsurface scattering) are improved by Real Time Raytracing because that’s the low hanging fruit.” So all of the RTX demos that Jensen was showing of games typically have that improvement in quality from that, but that’s almost nothing, that’s almost missing the true value of this. If you go one step further, what I think we will be able to deliver to Unity and Unreal Engine users right off the bat is that you will be able to switch to complete ray tracing of the scene.
Immediately scene complexity, like forests, can be rendered; anything that Octane can render, even up to shading any film, any feature film like Ant-Man and the Wasp, can be dropped into VR and 1 or 2 of these RTX cards can power that.
One of these headsets that have 4k x 4k per eye if you ever were to go that far, you can imagine things progressing to that level, the RTX hardware can keep up with that, you can start to skip a lot of the hacks and a lot of the problems that…
one, held back scene complexity and visual fidelity, because you had to render everything with triangles and not rays. With ray tracing, light and also things like anti-aliasing are solved, depth of field gets solved; there are a lot of things that ray tracing just solves correctly without having to hack it,
and that’s doubly the case in VR, where you can just render those rays instead of doing two renders, doing a low res one, and using foveated rendering to mix both of those,
you can just use ray tracing to basically do a heat map and just send more rays to the parts of the view port that are looked at by the eye and that’s something you can’t do in traditional rasterization and it would be very expensive to do without RTX.
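To make the "heat map" idea concrete, here is a minimal Python sketch of gaze-weighted ray allocation. It is not Otoy's implementation: the Gaussian falloff, the function name `samples_per_pixel`, and every default value (`max_spp`, `min_spp`, `fovea_radius`) are assumptions chosen only for illustration.

```python
import math

def samples_per_pixel(width, height, gaze_x, gaze_y,
                      max_spp=16, min_spp=1, fovea_radius=0.15):
    """Return a height x width grid of per-pixel ray counts.

    Pixels near the gaze point get up to max_spp rays. The count falls off
    with a Gaussian of the distance from the gaze point (normalized by the
    image width) and never drops below min_spp. All names and defaults here
    are illustrative assumptions.
    """
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance from the gaze point as a fraction of the image width.
            d = math.hypot(x - gaze_x, y - gaze_y) / width
            weight = math.exp(-(d * d) / (2.0 * fovea_radius ** 2))
            row.append(max(min_spp, round(max_spp * weight)))
        grid.append(row)
    return grid

# A tiny 64 x 36 "viewport" with the eye looking right of center.
heat_map = samples_per_pixel(64, 36, gaze_x=40, gaze_y=18)
total_rays = sum(sum(row) for row in heat_map)
uniform_rays = 64 * 36 * 16  # cost of sending max_spp rays everywhere
print(f"foveated: {total_rays} rays, uniform: {uniform_rays} rays")
```

The pixel under the gaze point receives the full ray count and the far corners receive the minimum, so the total ray budget is well below the uniform case. A rasterizer has no equivalent per-pixel knob, which is the contrast Jules is drawing.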
Two, the second boost from this whole RTX and GPU ecosystem is not just the ray tracing hardware but also the tensor cores and the FP16 floating point operations on there. So we have our own AI Denoiser, but frankly the one they were showing at Siggraph for real time is really good. Being able to do real time denoisers that are AI based or machine learning based, to clean up shading noise when you are doing really high quality ray tracing or path tracing, also becomes a lot cheaper and a lot faster on this generation of GPUs. So when you combine those together you end up getting really close to solving the rendering equation, in other words all the laws of physics you need to drive [light rendering], in real time. I think it’s going to take new content to drive that. Converting your Unreal Engine [rasterization based rendering] game to better reflections with RTX is easy and it can look great; look at that Star Wars demo that they did.
But there is a deeper and frankly a much higher quality option that is probably going to take six months to maybe even nine months to get into the hands of game developers, we are trying to do our part by providing Octane [integration into game engines]
in Unity for free, and we are going to make Octane and Brigade inside of Unreal Engine also very inexpensive, because we want people to develop content. We are working on a real time ray tracing system. It will obviously be able to go the distance with the RTX hardware, but at Siggraph, as you could see, it was running on an iPhone, and we can do real time ray tracing on that device and on Intel Integrated Graphics. So we decided that we could solve this problem for everyone and everywhere we can, but there is no denying that while you can do this at 1080p, if you want to go to 4k or if you want to do this for a VR headset, or even AR, ray tracing hardware is a game changer.
The idea is that we will bring, through just the two integrations we have in these game engines, we will bring the entire cinematic pipeline that the film studios use for rendering movies and we will bring that to real time and we will bring that to VR and then it’s up to the VR headsets to come up with higher resolution and higher frame rates.
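For a rough sense of scale, the following Python sketch counts the primary rays implied by the resolutions and frame rates mentioned above. It assumes "4k" means 3840 x 2160 and "4k x 4k per eye" means 4096 x 4096, with one primary ray per pixel; bounce rays and extra samples, which path tracing needs, would multiply these figures. The function names are hypothetical.

```python
def primary_rays_per_second(width, height, fps, views=1, samples_per_pixel=1):
    """Primary rays needed each second for a given resolution and frame rate."""
    return width * height * fps * views * samples_per_pixel

def frame_budget_ms(fps):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps

# "4k 120": assumed to be a single 3840 x 2160 view at 120 frames per second.
uhd_120 = primary_rays_per_second(3840, 2160, 120)

# "4k x 4k per eye": assumed 4096 x 4096, two eyes, at the 90 fps VR target.
vr_90 = primary_rays_per_second(4096, 4096, 90, views=2)

print(f"4k @ 120 fps:          {uhd_120 / 1e9:.2f} billion primary rays/s")
print(f"4k x 4k x 2 @ 90 fps:  {vr_90 / 1e9:.2f} billion primary rays/s")
print(f"frame budget at 90 fps: {frame_budget_ms(90):.1f} ms")
```

Under these assumptions the flat 4k case needs about one billion primary rays per second and the hypothetical 4k-per-eye headset about three billion, each frame finishing in roughly 11 ms.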
M: That’s something that a lot of people are really astonished by: Real Time Raytracing is revolutionizing the movie industry in addition to the game industry. And you are promising that we are going to be able to hit 90 frames per second at 4k resolution with titles that look as good as existing VR titles today?
J: Yeah. Hopefully better. Hopefully with ray tracing you can do something better. With ray tracing you can do it the hard way, you can brute force unbiased rendering, which is what Octane does, but there are also the same tricks you have in games: if you want to pre-compute the scene and still do ray tracing it will still look really, really good, and in fact many movies use that very same technique. Not all movies are done in Octane. Octane is the laws of physics computed; you can still do shader tricks, like caustics and stuff, with renderers like Arnold, which is used in many films. Arnold is also moving to the GPU, which is great. I think the quality will definitely go up, and the things you can do with ray tracing hardware are not necessarily super obvious from day one with the launch of these cards, and I think that is why people maybe aren’t getting how big of a deal this is. I think it will be pretty clear on the cinematic film side of things, where you can really do real time and have that film quality, and also of course for your immersive HMDs and the like.
M: Will developers have to start shifting away from using polygons in order to really maximize the use of these new cards. Will they need to start modeling in voxels or lightfields or?
J: No no no. On the content side, on the art side, nothing really changes. In Octane there are really almost two paths and RTX only accelerates one of those, which is polygons. The only thing ray tracing hardware does accelerate is the rendering of triangle meshes. The thing you can do for Octane in films is you can take something like a hair object, something that isn’t easy to turn into a triangle mesh, and you can still turn it into triangle meshes. You can turn everything into a triangle soup and you can send that to the ray tracing hardware and it will be much faster. From the art side, if anything, it’s much simpler, because the artists often start with a much higher quality version of what they would do for a cutscene or a film and they have to figure out how to reduce that quality to get it into a game engine. Half the work we are doing with integrations into these game engines is not just the renderer but also the art pipeline:
you can take something you created in Cinema 4D or Maya, or right from a Marvel movie, which is maybe Cinema 4D and Octane, which was used for the title: Ant-Man and the Wasp, you can just drop that into an ORBX package, which is the interchange format we open sourced for Octane, and you can drop it into Unity and Unreal, and with RTX hardware you can now render that quality basically in real time.
That makes it easier: you don’t have to come up with two pipelines, you can stick with one and start with the highest quality version to begin with.
M: I went back and I watched a talk that you did at Siggraph in 2015 part of a lightfield talk with a bunch of other lightfield talks, a really great series of talks, I will link them with the article that’s going to accompany this,
- Light Field Imaging: The Future of VR-AR-MR- Part 1: Paul Debevec https://www.youtube.com/watch?v=Raw-VVmaXbg
- Light Field Imaging: The Future of VR-AR-MR- Part 2: Mark Bolas https://www.youtube.com/watch?v=ftZd6h-RaHE
- Light Field Imaging: The Future of VR-AR-MR- Part 3: Jules Urbach https://www.youtube.com/watch?v=0LLHMpbIJNA&t=25s
- Light Field Imaging: The Future of VR-AR-MR- Part 4: Jon Karafin https://www.youtube.com/watch?v=_PVok9nUxME
M: I really want to go back to lightfield capture which is how I first heard of your work, it looks like at Siggraph that Google has basically reinvented the wheel with their lightfield capture back in 2014.
“Tripping the Light VR a talk by Google at Siggraph 2018
The Making of Welcome to Light Fields VR
This talk describes the technology and production techniques used to create Welcome to Light Fields, a new application by Google VR, freely available on Steam, that allows users to step inside panoramic light field still photographs and experience real-world reflections, depth, and translucence like never before in VR.”
J: I should actually point out that of the employees that work there, Paul Debevec, who’s now doing that project at Google, worked at Otoy, was one of the co-creators of Lightstage, and worked with us for 8 years. He is a friend of the company; he actually was part of that project that we did in our office, and he basically rebuilt that, you’re right, rebuilt that at Google, and they’ve been experimenting with that, and it’s super cool. That’s pretty much the same thing that we had been doing with the spinning camera. The improvement of having 16 GoPros in an arc is that it’s probably faster, which is great.
M: Well the astonishing thing is they went from 16 GoPros back down to the spinning camera rig that you had. So they actually went up to 16 GoPros and then reduced it back down to the two camera sort of thing, except that you only had one camera that was active and they had two cameras that were active. It was interesting to see that sort of big circle. From trying to fill in the gaps in the lightfield, is that where the concept of AI denoising came from, the technology that then became real time ray tracing?
J: So they are all sort of separate pieces. For me, when you look at where lightfields end up, it’s not so much the capture that is the real value. Even the work that Paul [Debevec] did, and that we did together when we were building Lightstage, all this came out of our Lightstage work, and Lightstage is more than a lightfield capture. This is what we do for high end films: we scan in all the actors for the Avengers movie, or one of these big tent pole films, and we do this every week, every day, all the DC TV shows, all the Lightstage captures. We don’t just capture a mesh, and we don’t just capture a lightfield, we capture something more than that called the reflectance field, which gives you all the ways that light bounces off that surface, so you essentially don’t have to, you know, have artists write shaders for skin. You have the pore level details, you have the way light bounces, and that’s really important for mixed reality as well as for films,
one of the reasons we didn’t go deeper into “lightfield capture” and push further on that experimental thing that Paul Debevec is doing at Google was because really we want to be able to capture what we capture on a lightstage basically in real time and make that something that is consumable inside of Octane or inside of the VR Pipeline because if you are just capturing a lightfield that is better than RGB and depth or maybe stereo or maybe pano it is still a very small subset of the data you want, what you want is a CG recreation that you can drop into a renderer and treat it like a CG object that then matches the real world and for that you need to capture materials
and there is a lot of work that we are doing where we have the light stage, we have captured the ground truth. Our AI denoising work, all the AI stuff that we are doing, is mostly to speed up rendering, to make it so that the rendering you would get out of the ray tracer/path tracer can be done in a tenth of the time. AI can just finish the work; you don’t need to finish the simulation, the AI can just figure out the rest of the pieces heuristically. Capture is very similar to that. Our work with light stage is we have the “absolutely” and we have 800 lights, we can capture a perfect holographic representation of people, but you have to bring them into a big stage. What we have been trying to work on, what we have shown every year, not just at Siggraph, at GDC and others, is that we can take a 120 frames per second stereo camera, which is going into phones these days, and we can pretty much use a little bit of machine learning plus some simple lighting patterns that are much less than what we do on a light stage, and we can start to get this absolutely perfect live captured reality. That is really our goal and AI will play a role in that. There is no doubt that capture and AI is super important, and we have to get that to work in real time, otherwise AR and Mixed Reality will never look right. That’s something I discovered, frankly, when I was looking at the Magic Leap device, which I got to test, and which struck me because nothing was really being relit.
I think your app on the Magic Leap platform has to request being able to capture the world around you, but things like that are really important. So what we were showing at Siggraph on the phone basically is able to reconstruct the scene from the phone camera, and then we are doing a little mini version of Octane called Octane lite that is designed to do the minimum out of ray tracing so you can have objects that are mixed into reality, live on your phone, and cast shadows and have reflections and just look great. In order to do that, not only do you have to render and denoise, to get the highest possible quality, but you also have to do some sort of scene reconstruction live, and do that without depth sensors. Magic Leap has them, and other devices like the Tango had them, but the iPhones just have two cameras, and they basically use computer vision algorithms and machine learning tricks to get that sort of information into ARKit. I can’t even imagine VR, frankly, existing in the future without pass through cameras that allow you to switch to AR mode trivially. In which case you are back to the same problem, in which you are either seeing the world through your eyes and there is an overlay, like Magic Leap, or it’s something like what Oculus is talking about, and who knows when it will be out, but basically you will have camera pass through where you are actually reporting through two cameras what’s in the field of view, and you could do AR, you could blend it with VR. That kind of stuff is super interesting.
I feel like that requires everything to just be up in terms of the quality so AI is important for scene reconstruction, for object segmentation, it’s important for physics, it’s important for making high quality rendering work in real time, it’s more of a visual description of light, it’s like all the rays of light that make up both your left and right view for depth but also depth of field that you see in terms: All that stuff is essentially something you could emit from a holographic display if you had something like 4000dpi and you had some sort of filter on top of it to guide the rays of light, even Paul [Debevec], even us, everyone wants to build a structured Holodeck or get to that point, and it’s possible.
J: From a marketing perspective we are actually looking ahead of AR and VR because if you can have a glasses free experience from a table or a wall and that becomes sort of the fabric of buildings and surfaces in offices and furniture in homes and sidewalks going forward that is going to be a lot easier to consume than putting on a pair of glasses and I have tried some of the lightest weight glasses, that is something where ray-tracing hardware is absolutely critical to make the display panel at that resolution running in real time. It would be very difficult to do that without Ray Tracing hardware making that 10 times faster, and because it’s 10 times faster we can now drive holographic displays probably in the next six months with this kind of speed in real time.
M: I wanted to just sort of jump off the deep end in terms of science. I was at a talk last night: Mary Lou Jepsen has this company [Open Water], and she is creating a new functional near infrared spectroscopy device. One of the technologies behind it is photoacoustic microscopy, and photoacoustic tomography, and there are some other technologies like ultrasound optical tomography, but basically these are new kinds of medical imaging technologies that can be combined with things like electrical impedance tomography. What that means is we are combining light and electricity and sound to image the brain. We are basically creating a lightfield of what used to be an X-ray. Well, we still use X-rays, and there are new X-ray tomography technologies too, but basically all these technologies can be combined to create a lightfield of a person’s brain. We don’t know the full depth of what we can get. All these people have these conversations at these really great conferences, but there is not a lot of talk about using AI denoising or even using Deep Learning at all. It’s almost like there is a huge opportunity to figure out if real time ray tracing, or predicting what should be there, can be medically accurate in terms of helping to model the brain, helping to model brain activity, ionic flows of activity, the flow of ions, and the electromagnetic field of the brain. It would be very interesting if we could begin to accelerate that research in medical imaging by basically combining the lightfield that we are trying to render with some of this AI denoising technology or real time ray tracing technology.
M: And I wonder what the real differences are between the AI that’s involved in Real Time Ray Tracing and the AI that’s involved in AI denoising, and whether it would be useful. Maybe you can speculate as to how accurate this AI can be, and whether it would be medically useful.
J: I should probably provide almost like a breakdown of how AI is used even in Octane. There are different AI libraries. Similar to how there are different filters in Photoshop and not all of them could be used.
J: AI assisted rendering: We went through all the things that are slow in rendering and we applied AI to fix them. So the first thing is that when you do rendering correctly it’s noisy. It’s just like when you take a low exposure photo, there is a lot of noise: you need light to gather, and you need to finish rendering to get a properly clean image if you are doing really high quality photorealistic rendering. AI denoising just finishes the render, and you are essentially trading compute for guesswork, but the AI is so good, it has looked at enough finished renders to know how to finish any arbitrary view. We built that ourselves. Nvidia has built that into OptiX, and that is absolutely fundamental to rendering; that is an AI layer you add to rendering. We’ve also added in something that is pretty straightforward at least, it’s called AI Upscaling. It takes a low res image and scales it up in a way that you can’t do with a normal bicubic filter or anything like that: it does recreate the edges, and you have spatial resolution or even temporal resolution added, which is why you can easily go to 120 or 240 from 30 frames per second. Nvidia actually showed a great version of that working. And then there is other stuff: how you can start to figure out, if there is a piece of an object or a scene, can you actually finish the actual scene layout, can you figure out from the photo what the actual geometry of the forest is and actually create the scattering of the leaves and the trees and all that stuff. That’s deeper, that’s further out, but it’s still something that is really interesting.
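The "trading compute for guesswork" point rests on a general property of Monte Carlo rendering: noise shrinks only with the square root of the number of samples, so a clean image by brute force is expensive. The Python sketch below illustrates that with a toy one-pixel "renderer". It is not Octane or OptiX code, and the 0.3 hit probability, the trial count and the function names are all assumptions.

```python
import random
import statistics

def render_pixel(num_samples, rng):
    """Estimate one pixel's brightness by averaging random light samples.

    Toy stand-in for path tracing: each sample contributes 1.0 if a random
    ray "hits" the light (assumed probability 0.3) and 0.0 otherwise, so the
    converged value is 0.3.
    """
    hits = sum(1.0 for _ in range(num_samples) if rng.random() < 0.3)
    return hits / num_samples

def noise_level(num_samples, trials=500, seed=0):
    """Standard deviation of the pixel estimate over many independent renders."""
    rng = random.Random(seed)
    estimates = [render_pixel(num_samples, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)

low = noise_level(16)    # a quick, noisy render
high = noise_level(256)  # 16x more samples per pixel
print(f"noise at  16 samples: {low:.3f}")
print(f"noise at 256 samples: {high:.3f}")
print(f"16x the compute reduced noise by about {low / high:.1f}x")
```

Sixteen times the compute buys only around a fourfold reduction in noise. A learned denoiser tries to skip that tail by predicting the converged image from the noisy one, which is the trade Jules describes.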
J: I think that is where AI and Rendering really do have a lot of cohesion, then there is stuff that is really more for humans to be creative: having AI that takes a simple stroke of your hand and creates all this stuff that is augmented based on your history of how you paint or how you do stuff in Quill or Medium or Tilt Brush those things are really key tools of the future for sure.
J: I also wanted to talk a little bit about lightfield rendering and baking. For me lightfields are a shortcut, like, we don’t have all the time in the world to use AI denoising and rendering with ray tracing. Lightfields were a way to render something into a lightfield, so that you were basically generating a digital hologram, and then the hologram could be looked at from any angle and you’re done. It’s so cheap and inexpensive that at the very least you can run this on low end hardware like a phone, so you can use the compute cost and ray tracing power you save on other things. That’s why I see that lightfield rendering can be super useful for games in real time without over saturating the hardware. Even a 10x speedup in ray tracing hardware is something you can really use up if you are doing really complex light transfers. What you are describing though, with imaging of the brain as a lightfield or volumetric capture, is super interesting, and I think I was reading one of your posts where someone, maybe it was you, was looking at the rendering of that in the VR headset, and that feedback where you’re seeing what your brain was doing was probably some kernel of an interface in the future that I’ve been thinking about for a long time. So I think this area of research and activity is super interesting. There are a lot of talks about having thought powered experiences, if you can see what your brain is doing; there is a lot of research about biofeedback loops. I think that the fact that you can visualize that in AR and VR, and frankly yes, ray trace it, which is way better: you can take volumetric data sets and you can ray trace those, that’s what ray tracing is really great at versus doing it in triangles. That is, to your earlier question, the one thing: you can bring in fire and volumetrics perfectly into ray tracing and it will look great.
Unfortunately RTX hardware doesn’t do volumetrics. You can still add it in there, it will still be fast, but you still need to use the compute cores to render that. When you are seeing your own brain and you are thinking stuff and you are seeing your brain state, I’ve spent years thinking about what that could mean, and I’ve been waiting for the foundational pieces of what we have been building on the rendering side, on the ecosystem and platform side. What’s fascinating is that there is a lot of work that you can imagine that can be done with AI looking even at that feedback loop, at filling in some holes and details, to create things that frankly we don’t have great vernacular or words to describe how they work. We know that they [AI] are important tools for input or for content creation or for who knows what more. You are scratching at the surface of something that is really, really fundamentally important for the future. I know it, and I think that’s why you’re on that channel. That’s great.
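As a small illustration of what ray tracing a volumetric data set involves, here is a Python sketch of ray marching with the Beer-Lambert law. It is a generic textbook technique, not Octane's renderer: the `blob` density function stands in for real scan data, and the step size and all names are assumptions. Per Jules's remark, this is the kind of per-ray loop that would run on general compute cores rather than on the triangle-oriented ray tracing hardware.

```python
import math

def march_ray(density, origin, direction, t_max, step=0.01):
    """Return the fraction of light transmitted along a ray through a volume.

    density(x, y, z) gives the extinction coefficient at a point. The ray is
    sampled at the midpoint of each step, the optical depth is accumulated,
    and the Beer-Lambert law converts it to a transmittance between 0 and 1.
    """
    n_steps = int(round(t_max / step))
    optical_depth = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * step
        x = origin[0] + t * direction[0]
        y = origin[1] + t * direction[1]
        z = origin[2] + t * direction[2]
        optical_depth += density(x, y, z) * step
    return math.exp(-optical_depth)

def blob(x, y, z):
    """Hypothetical data set: a soft spherical blob centred at the origin."""
    return 2.0 * math.exp(-(x * x + y * y + z * z) / 0.1)

def uniform_fog(x, y, z):
    """Constant density, used to check the marcher against a known answer."""
    return 1.5

through_blob = march_ray(blob, (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), 2.0)
past_blob = march_ray(blob, (-1.0, 0.8, 0.0), (1.0, 0.0, 0.0), 2.0)
fog = march_ray(uniform_fog, (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), 2.0)
print(f"through the blob: {through_blob:.3f} of the light survives")
print(f"grazing past it:  {past_blob:.3f}")
print(f"uniform fog:      {fog:.4f} (analytic: {math.exp(-3.0):.4f})")
```

A ray through the middle of the blob is noticeably dimmed while a ray that passes beside it is almost untouched, and the uniform case matches the closed-form answer. No triangle mesh is involved at any point.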
M: So I’m leading a meetup every week it’s called NeurotechSF and we have successfully connected an EEG device to WebVR to an Oculus Go.
J: I saw that, that was great
M: Perhaps because of that one of my friends or connections is going to enable me to have access to an electrical impedance tomography machine. So the second phase of the goal is to, we are also still acquiring, another friend is sponsoring me to get some depth cameras we are going to get a few Lucid Cameras and these cameras are going to enable us to create a volumetric video and we are going to be collecting volumetric data of a person’s mind with our electrical impedance tomography machine. I wrote about this 3D crosshair convolutional neural networks so that I could try to do object segmentation on the volumes of the data that we are getting from our electrical impedance tomography and on the volume of data that we are collecting with our 3D cameras so that I can begin to correlate. So that we can have the AI begin to look for correlations between the person’s brain activity, the volume changes, and the volumes they are seeing in the world around them. This is just a big experiment we have four titan V graphics cards in the cloud, but we have to somehow stream that data up into the cloud and then back in real time. We are going to be running this in VR so people can see their medical imaging, we can correlate. VR is very useful because you can correlate whatever the user is looking at, or whatever their head is pointed at with EEG, actually that is probably our next step is that we are going to correlate what they are looking at with EEG, but eventually we are going to be correlating the EIT with what they are looking at with the volumes of data that we are going to be representing in VR and then VR also also allows us to isolate random signals from the environment. So we can begin to look for how a person’s brain changes in response to that specific content and the other idea is that you can look at how two people, how their brains are responding differently to the same piece of content. 
So the point I was getting as was going back to doing object segmentation on 3D volumes of data. It seems like, because we are talking about lightfields, and we are talking about volumetric capture, and we are talking about: so I haven’t really seen I think another area where this technology that you have been working on could be really useful comes to self driving cars, all the self driving cars that I know of are to my knowledge, and obviously I don’t know what’s going on inside these companies but on the outside it looks like are all using two dimensional slices of data streams, you get a lidar stream you get a 2D dimensional camera stream and you are running two dimensional convolutional neural networks on them and I wonder at some point if it would be useful to start actually creating a volume of data first and doing semantic segmentation on this volume and doing upscaling. I’m trying to think of how rendering can help solve the problem of self driving cars.
M: I asked, do self driving cars need to become self aware networks? Does a car need to become aware of itself, does it need to have… the thinking here is that… There is a book by Peter Tse where he talks about his theory of what consciousness is for: it’s a higher level information pipeline that exists to help us solve problems that are too big for the unconscious mind to solve, and so he sees it as an additional floating point operation… like when you get a computer that is powerful enough to solve those logistical problems, it would probably be a conscious computer. So that is why I was asking: is that where we have to go with self driving cars to get to the level 5 car? Do we need to have a lightfield render? I asked the CEO of Nvidia if we are going to need a self aware self driving car, and he said that he didn’t want to build a black box; he wanted to have every component basically understood separately so they could understand what was going on in the car if something wasn’t working. But I think you could visualize, you could render, what a neural network is doing. So that’s my thinking; I don’t know if you have any thoughts on that topic.
J: I definitely would defer to Jensen on the self driving cars. Nvidia is obviously doing some crazy stuff in that field, and Tesla until recently was using Nvidia GPUs all over the place, but I don’t know if the car needs to be conscious. I think people can drive and almost be unconscious, and almost by rote do the right thing.
J: I’m serious, it’s sad to say. You have the reflexes and everything, you can almost be trained, like muscles, even for things like driving; it doesn’t necessarily require you to have the same consciousness that we do. So I don’t know if you need consciousness [for driving] when you have existential philosophical decisions, which sometimes you have, where the car [might face the question] do I run over the school children or do I save the driver? Even there it’s clear that you can program the outcome you want into the system, so it is basically pre-decided. The issue with rendering is that it’s not really every… even for Otoy, we are not just a rendering company: we have capture, we have streaming, and we have other high level things, which is basically the entire feedback loop between those things. You really do need the software; any system can probably become what we want it to be. If we are going to recreate a perfect representation of reality, or if we are talking about brain machine interfaces, we have to capture the world, we have to render it, we have to edit it if we are going to understand it, and then feed that back into the interface, with a human in the middle of it who hopefully can do something with it. That’s why the company is divided into three parts; sometimes for some people parts are hidden [some people are only aware of the rendering]. The capturing and the rendering and even the streaming: as you were pointing out, you have 5 to 10 GPUs in the cloud, you’ve got a VR headset that’s not in the cloud, you have to combine those two, and a lot of the technology that we build just does that; it solves that problem for you. That’s where we see ourselves fitting into this ecosystem: we build these tools to solve some of the harder problems.
You bring up a good point which is what does rendering really do for AI?
If you look at some of these things where they simulate the evolution of jellyfish or multicellular organisms, and you can see them basically be pushed onto the land and grow limbs and all this stuff, take that to its logical conclusion, which [leads] right into the simulation theory: if you simulate everything in the physical world to the point where it’s so granular that the entities that are in it are covered, you can imagine basically creating life that learns and grows in an environment that is basically the same as ours, but can be accelerated, and you can have shortcuts and do all sorts of things. That’s why Elon [Musk] was one of the people who made this simulation theory popular: if you just imagined a video game, a VR game, where even if you still wore a pair of goggles it just looked absolutely real and your eyes were fooled, and then you figure out touch and other stuff, then you come to an area of [questioning] where [we have to ask] are we AI running in a simulation? And how do we prove that we are not, how do we figure out what that even is or isn’t? It’s a fascinating problem. We still have work to do to get to the point where reality itself is digitized so convincingly, one, that cars can understand it like we do, naturally, versus with rough approximations. A lightfield is something we can render in a simulation; the data is actually less interesting without the source to generate it or how it was captured.
M: Going back to 2015 at Siggraph, the first of the four talks on lightfields, I believe it was with the gentleman you were mentioning [Paul Debevec] who now works at Google. He was saying “our eyes were capturing a light field from two points; eyes have lots of different cells that are responding to light.”
J: Cone Receptors
M: That’s right. So the idea is that, with research going back into the 1980s on neural networks that can be used as signal processors, it seems very plausible that… and looking at Jack Gallant’s work where he basically created a movie with MRI: he had someone watching a movie, and he had the machine watching their blood flow and the movie, and they trained it like that, and then the machine just watched the blood flow and it pulled a movie, it recreated a movie from just the blood flow. So the idea is that basically what we are seeing is video, what we are seeing is a lightfield that our brains are constructing
J: Absolutely, yep
M: and this was an idea that I saw. So if we are creating a video, and we have the ability to learn from this video, but we are also rendering it… and I’m thinking about Google’s Deep Dream, and there was a huge story at Siggraph about the Deep Fakes [I meant to say Deep Video Portraits]
Deep Video Portraits at Siggraph 2018
they are able to recreate an actor or recreate a person, and you can make this person… So the question for me is what does AI bring to, what does rendering bring to AI? If we are going to create an artificial sentient mind, it seems that our minds have neural networks but we are also rendering, and our rendering is then being processed like it’s in a feedback loop with the AI: it’s rendering, it’s crunching what was rendered, and then it’s rendering on top of that again, and these are the feedback loops of the mind, I think. So at some point I think that your technology leads towards… you are already combining AI and rendering, and that seems to me like, you put that in a feedback loop and you eventually have a mind.
J: Yeah. And it’s funny, because as a product I want to release AI that’s useful right now, this year, but the stuff you were talking about where you can recreate a movie [from brain activity], I was reading about that. I think it was from Japan or something, maybe I met the same researcher, where you can basically figure out the letters that you are thinking about by, again, sort of scanning the back of your head, and you could almost get a 16 x 16 black and white pixel representation of what your brain is seeing or visualizing. You take that to its logical conclusion and that is going to be essentially the stuff of Neural Lace that Elon [Musk] talks about: the bandwidth back and forth to the brain is not necessarily going to be through goggles on the eyes, there is some deeper stuff going on. I think you are right also about consciousness and how we experience things and how an AI might experience things. We don’t render the same way that a computer does, we don’t even see the world the same way a camera does. Our eyes move around, and we basically build an image by our eyeballs scanning the scene very quickly and then coming up with a sort of complete picture in our brains. And what our brains see, even if we see a light field, which is how you see convergence and depth of field and all that stuff in our eyeballs, even if that goes into our receptors and our brains are trying to figure out what that means, what we are seeing inside our minds is still something that is… we have no idea if people see blue the same way. There are even tests where people don’t have a word for the color blue, and they don’t see that color the same way we do; they don’t see the sky and plants with the same shades or the same colors, they see different shades of green.
How Language Changes The Way We See Color
so it’s like your brain gets trained, and because of that training it can absolutely process the same physical cues very differently. So what we are rendering in our brains, even just visually, may have identical sources from the real world, but it may just feel different to different people, and maybe segment differently; you talk about object segmentation, and how we process it and analyze it becomes different. What’s interesting about AI is that you do not necessarily have those limitations: you can basically just keep rebuilding AIs that think and look at the world in different ways. Even if it’s a philosophical zombie and it’s not true AI, the ability for it to process and do stuff and kind of act the way that humans would with that information, and do more with it, is fascinating. I think we are going to get to the point where, and Google’s stuff is here with those digital assistants that were on the phone that sounded real, you are going to get to a simulated person with AI, even one that looks human, long before you get to the true consciousness of a machine, which is still… we do not even fully understand our own consciousness. We can do it in a way that the human brain almost can’t: the human brain can’t render perfectly like a photograph does or a CGI render does, with the precision that we have, and so the details that AI can do visually or pick up, cues that are there, which AI is already good at, could be augmented in ways we can’t even begin to imagine. That’s why the next ten or fifteen years are just so exciting for me, from the artist and creative perspective, and the philosophical perspective as well.
M: Awesome, this is going to be a lot of great food for thought for folks when they have a listen, so thank you very much, Jules Urbach, the CEO of Otoy.
J: Thank you so much, it was always a pleasure.
AI Lighting, AI Denoising, Scene AI, using AI to fix 360 video, using AI to predict holograms from…
A podcast interview featuring Jules Urbach the CEO of OTOY from March 2018
GTC 2017 GPU Technology Conference, The Neural Lace Podcast #5 Guest Jules Urbach, CEO at OTOY
The Neural Lace Talks is a podcast about Science and Technology. Main website http://vrma.io Contact via firstname.lastname@example.org
Credits. I got permission from several folks participating in Octane groups on Facebook to show their art as the video of the podcast.