Thoughts on Augmented Reality

Andy Jazz · Published in Geek Culture · Jan 9, 2022 · 5 min read

Hi, I’m Andy. When AR came into my life, I was already a Certified NUKE Trainer. My compositing background helped me a lot in understanding what AR really is: camera tracking, depth-channel occlusion, anchoring, dense point cloud reconstruction, and transform matrices are what I’m used to.

I started my acquaintance with AR in 2011, when I was working on a theater project in which several particle systems were pinned to a performing actor with the help of the Quartz Composer app and Kinect sensors. Later there were HoloDesk and HoloLens, Vuforia and Project Tango. I have been teaching ARKit since 2017 and RealityKit since 2019.

Over the past decade, I have formed a certain vision of what an Augmented Reality experience should be, primarily from the user’s point of view. In this story, I want to share with you some of my thoughts on this topic.

Year 2030

I firmly believe that no IT company in the world building its own AR ecosystem can boast a clear idea of what augmented reality will look like in 8–10 years. And this is not surprising, because it is simply impossible to predict the sequence of interrelated events leading to the final result; the planning horizon in most companies is usually five years. Below, I outline what, in my humble opinion, framework engineers should focus on in order to create a worldwide AR ecosystem by 2030.

Glasses and controllers

Although Apple, Microsoft, Google, and PTC have done a tremendous amount of work over the past years, the Augmented Reality industry is still in its infancy. Only the release of high-resolution AR glasses can produce a tangible leap in development, and those glasses must offer good performance, long battery life, and an attractive price. The assortment of Bluetooth IMU controllers should be wide: rings, bracelets, gloves, joysticks, adhesive sensors, etc. The first iterations of smart tags have already been introduced: Apple AirTag, Samsung Galaxy SmartTag, Tile…

Hardware constraints

An average user doesn’t care how many polygons a model has or what resolution its textures are. The user just wants to feel the wow effect of a 3D scene in a mobile AR app. To achieve such an experience, framework engineers need to get rid of the current recommended limits of 100,000 polygons and 2K textures. It is quite obvious that this scenario will require a technological breakthrough in semiconductor manufacturing, a transition to picometer-scale process nodes. Likewise, the adoption of high-capacity graphene batteries is a top priority for any manufacturer of AR glasses.
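
Until such a breakthrough arrives, it helps to check assets against today’s budget before shipping them. Below is a minimal Swift sketch (SceneKit, with a hypothetical asset path) that counts a model’s triangles so it can be compared against the commonly recommended 100,000-polygon limit.

```swift
import SceneKit

// Counts triangles in a model so it can be checked against the
// commonly recommended mobile AR budget (~100,000 polygons).
// "art.scnassets/robot.scn" is a hypothetical asset path.
func triangleCount(inSceneNamed name: String) -> Int {
    guard let scene = SCNScene(named: name) else { return 0 }
    var total = 0
    scene.rootNode.enumerateHierarchy { node, _ in
        guard let geometry = node.geometry else { return }
        for element in geometry.elements where element.primitiveType == .triangles {
            total += element.primitiveCount
        }
    }
    return total
}

let count = triangleCount(inSceneNamed: "art.scnassets/robot.scn")
print(count <= 100_000 ? "Within budget: \(count)" : "Over budget: \(count)")
```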

Online Render Farms

Thanks to the widespread adoption of high-speed 5G mobile internet, framework developers will be able to render AR content off-device. It seems quite logical that rendering high-poly models with ray-traced shadows on several powerful remote computers, then transmitting the resulting frames over the internet, is a considerably less processor-intensive task than rendering on-device. After all, a median 5G connection is already fast enough to stream 8K, 120 fps, 10-bit color video. Nice, isn’t it?
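
No standard client API for this exists yet, so the following is only a rough sketch of the idea, with a hypothetical endpoint and payload: the device sends its camera pose to a render farm and receives an encoded frame back. A production system would more likely stream video over WebRTC than fetch individual frames.

```swift
import Foundation
import simd

// A minimal sketch of pulling remotely rendered frames.
// The endpoint and JSON payload are hypothetical.
struct RemoteRenderClient {
    let endpoint: URL   // e.g. https://renderfarm.example.com/frame (hypothetical)

    func fetchFrame(pose: simd_float4x4, completion: @escaping (Data?) -> Void) {
        var request = URLRequest(url: endpoint)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")

        // Send the device camera pose so the farm renders from the right viewpoint.
        let matrix = [pose.columns.0, pose.columns.1, pose.columns.2, pose.columns.3]
            .flatMap { [$0.x, $0.y, $0.z, $0.w] }
        request.httpBody = try? JSONEncoder().encode(matrix)

        URLSession.shared.dataTask(with: request) { data, _, _ in
            completion(data)   // encoded frame to composite over the camera image
        }.resume()
    }
}
```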

Relocalization

Quite a lot has already been said about cloud relocalization, so there is no point in my dwelling on it today; many startups are now ready to offer their own solutions. However, it would be great if the main AR market players offered out-of-the-box tools for combining fundamentally different scenes (built on ARKit, ARCore, MRTK, or Vuforia) in one multisession. Basically, this means that the world anchors of each proprietary framework should become interchangeable. In addition, geo-anchors must work everywhere, not only on the central streets of megalopolises, and combining the AR world maps of two adjacent locations should be as simple as saying “cheese”.
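
Within a single framework, the persistence primitives already exist. Here is a minimal ARKit sketch of saving an ARWorldMap and relocalizing against it on a later launch; this is the building block that any cross-session (and, someday, cross-framework) relocalization would build upon.

```swift
import ARKit

let mapURL = FileManager.default.temporaryDirectory.appendingPathComponent("worldMap")

// Serialize the current world map to disk.
func saveWorldMap(from session: ARSession) {
    session.getCurrentWorldMap { worldMap, error in
        guard let map = worldMap,
              let data = try? NSKeyedArchiver.archivedData(withRootObject: map,
                                                           requiringSecureCoding: true)
        else { return }
        try? data.write(to: mapURL)
    }
}

// Relocalize against the previously saved map.
func restoreWorldMap(into session: ARSession) throws {
    let data = try Data(contentsOf: mapURL)
    guard let map = try NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self,
                                                           from: data) else { return }
    let config = ARWorldTrackingConfiguration()
    config.initialWorldMap = map
    session.run(config, options: [.resetTracking, .removeExistingAnchors])
}
```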

Scene reconstruction and object occlusion

As of January 2022, the working distance of mobile iToF and dToF laser sensors is within 5 meters. Doubling that working distance would notably expand the boundaries of the reconstructed scene and significantly improve the quality and precision of the depth channel. More importantly, the integration of a high-quality depth channel (a staple of day-to-day NUKE compositing) gives any AR framework the ability to separate virtual objects from real-world objects with the highest possible accuracy.
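
On LiDAR-equipped devices, ARKit and RealityKit already expose exactly this depth-driven pipeline. A minimal setup for scene reconstruction plus occlusion looks something like this:

```swift
import ARKit
import RealityKit

func enableSceneReconstructionAndOcclusion(on arView: ARView) {
    let config = ARWorldTrackingConfiguration()

    // Scene reconstruction requires a LiDAR-equipped device.
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
        config.sceneReconstruction = .mesh
    }
    // Per-frame depth data for separating virtual and real objects.
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
        config.frameSemantics.insert(.sceneDepth)
    }
    // Let real-world geometry occlude virtual content.
    arView.environment.sceneUnderstanding.options.insert(.occlusion)

    arView.session.run(config)
}
```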

Far clipping plane of a camera frustum

Modern 3D engines are bad at rendering virtual models more than 1,000 meters away: at a considerable distance from the camera, models are rendered with artifacts. As you might guess, raycasting methods are also limited by that distance. Engineers will have to solve the problem of how a user can place, for example, a high-quality model of a skyscraper at a distance of 5 kilometers from the device.
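
In SceneKit, for example, the far plane can simply be pushed out, but that is no free lunch: stretching the zNear–zFar range thins out depth-buffer precision, which is precisely where the far-distance artifacts come from. A small sketch:

```swift
import SceneKit

// Push the far clipping plane out to 5 km so distant models are not culled.
// Caveat: widening the zNear–zFar range degrades depth-buffer precision,
// which is the root cause of far-distance z-fighting artifacts.
let camera = SCNCamera()
camera.zNear = 0.1
camera.zFar = 5_000
camera.automaticallyAdjustsZRange = false   // keep our explicit range

let cameraNode = SCNNode()
cameraNode.camera = camera
```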

Tracking errors

One of the most serious problems in AR is tracking error that accumulates with distance: the farther the user travels, the lower the accuracy of the AR world map. It is quite possible that satellite internet providers will help us correct our AR world maps.
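
ARKit’s geo tracking already hints at this approach: anchors tied to geographic coordinates are corrected by GPS and localization imagery instead of accumulating visual-inertial drift. A minimal sketch (iOS 14+, available only in supported regions; the coordinates are supplied by the caller):

```swift
import ARKit
import CoreLocation

// Drift correction via geo tracking: the anchor's position is pinned to
// geographic coordinates, so error does not grow with distance traveled.
func runGeoTracking(on session: ARSession, latitude: Double, longitude: Double) {
    ARGeoTrackingConfiguration.checkAvailability { available, _ in
        guard available else { return }   // only works in supported cities
        session.run(ARGeoTrackingConfiguration())
        let coordinate = CLLocationCoordinate2D(latitude: latitude,
                                                longitude: longitude)
        session.add(anchor: ARGeoAnchor(coordinate: coordinate))
    }
}
```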

Content at a reasonable price

Have you ever wondered what the most important thing about AR is? The answer is evident: its content. Content creators will have to produce huge libraries of animated 3D models adapted for AR, from scratch. Of course, for many indie developers photogrammetry is like a breath of fresh air; however, static models still need to be brought to life, and not every company has a professional character animator on staff. In this situation, the ability to record skeletal animation in real time is the most important AR feature of the near future.
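
ARKit’s body tracking already points in this direction. The sketch below (with a hypothetical rigged character named “robot”) drives a skeletal model from a tracked human body in real time; capturing those joint transforms frame by frame is essentially the animation-recording workflow described above.

```swift
import ARKit
import RealityKit
import Combine

// A sketch of realtime skeletal capture with ARKit body tracking (A12+ devices).
// "robot" is a hypothetical .usdz character rigged to ARKit's joint hierarchy.
func startBodyTracking(in arView: ARView) {
    guard ARBodyTrackingConfiguration.isSupported else { return }
    arView.session.run(ARBodyTrackingConfiguration())

    var subscription: Cancellable?
    subscription = Entity.loadBodyTrackedAsync(named: "robot").sink(
        receiveCompletion: { _ in subscription?.cancel() },
        receiveValue: { character in
            // The character's skeleton is driven by the tracked body each frame.
            let anchor = AnchorEntity(.body)
            anchor.addChild(character)
            arView.scene.addAnchor(anchor)
        })
}
```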

Metaverse-like services

Neither the Metaverse nor Zuckerberg’s ideas excite me. I would like Augmented Reality to help people in their lives, not make them addicted. Besides, by the time Zuckerberg’s Metaverse is fully implemented, it will be quite different from the original idea.

Conclusion

Today there are plenty of weaknesses in AR: unstable tracking, imperfect light estimation, unrealistic physics, no support for refractive materials. But all these drawbacks are insignificant compared with AR’s future potential. Just imagine the possibilities for drivers and pedestrians, doctors and restaurateurs, architects and designers, plumbers and electricians, gamers and educators…

In conclusion, I would like to say one apparent thing: all major framework developers have reached the point where augmented reality libraries should become part of their operating systems. That means that very soon we will see not the AR frameworks we’re accustomed to, but rather the birth of a brand-new visionOS, realityOS, Windows Reality, or Real Fuchsia.

We are living in a wonderful time of change.

À bientôt!
