Why This Is The Time of AR — And What Comes Next

VR is the long-term goal, but AR will not only be the first to reach the mainstream: it will also remain with us as a digital substrate to the real.

To the oft-quoted trope “Where’s my Jetpack?” the answer is, in a very real sense: “in your pocket.” The dawn of the Space Age focused our collective imagination on the idea that we could all have rockets strapped to our backs, which, while theoretically entertaining, is probably not a very good idea in practical terms.

What we have done instead is to build something far more powerful and useful: a dense, globally interconnected mesh of services and devices so far beyond our shared expectations of even fifty years ago that very few people imagined any of it was remotely possible.

We have, more often than not, focused on replacing, rather than augmenting; replicating, rather than reimagining. We have spent a lot of time pursuing the idea that machines should remember for us and do our work for us, removing human input and interaction as much as possible from any process.

Augmenting is Hard

Having computing devices completely take over a task, bending processes to more manageable forms, is easier and more achievable in terms of implementation than integrating a device into pre-existing workflows. Starting with a hardware example: up to a few years ago it was a no-brainer to jump from hand sketches to CAD drawings, skipping some sketching and visualization steps in between, since there was no way to bridge the gap. A good chunk of pre-existing processes in design and architecture that had to do with sketching could not be easily integrated into the digital workflow. The iPad Pro and Microsoft Surface have, however, re-invigorated the sketching “stage” of projects by allowing for high-precision bitmap/vector combined drawings and sketches, augmenting a process instead of replacing it.

The same applies at lower levels in the stack. Early on, if you had to store a large volume of documents for the purposes of lookup, you would normalize their data and store them in a relational database. Indexing was expensive, computationally speaking, as were computer memory and storage. Moreover, storage had much higher failure rates, which meant you needed a way to prove, mathematically, that what you got back was what you had stored; hence, Relational Algebra and Calculus. Today, you can rely on infinitesimal failure rates, distributed replicated storage, and sophisticated failover systems, so you could just store all the documents in a search engine (or another search-and-retrieve system based on what is essentially an inverted index), which is a more “natural” way of solving the specific problem of keyword-based search for unstructured documents.
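To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular search engine is implemented: the function names (tokenize, build_index, search) and the three sample documents are made up for this example, and real systems add ranking, stemming, compression, and persistence on top of the basic structure.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: each term maps to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Look up the posting set for each term, then keep only the common ids.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents, stored as-is with no normalization into tables.
docs = {
    1: "Augmented reality overlays digital information on the real world",
    2: "Virtual reality replaces the real world with a simulated one",
    3: "Search engines retrieve unstructured documents by keyword",
}

index = build_index(docs)
print(search(index, "real world"))       # {1, 2}
print(search(index, "digital reality"))  # {1}
```

The documents go in whole, and the lookup structure is derived from their words rather than from a schema designed in advance, which is what makes this approach feel closer to the problem of keyword search than a normalized relational layout.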

This is obvious to experts in any particular field, but it is nevertheless too often forgotten: to a large degree, the particular solution paths we have chosen have been dictated by the combined limitations of hardware and software at any point in time.

The list of such limitations is long: lack of compute power, memory and storage; drastic variances in speed and reliability of both public and private networks; the relative rarity of computing devices, which until a few years ago were typically confined to a desk, bulky and unmovable; lack of accurate input mechanisms beyond keyboards and mice; lack of sensors of various kinds; lack of a widely available infrastructure for obtaining information that isn’t commonly measured by typical device sensors, like traffic information or the weather; and even constraints created by the expressivity, or lack thereof, of early mainstream computer languages. Simula, after all, dates back to the 1960s, and LISP, from the late 1950s, is the second-oldest high-level programming language still in widespread use.

[Image: A Crazy Supercomputer: As Powerful as Your Phone, Plus Sitting!]

This Time Is Different (Really)

We are now living through the first stages of a shift in which all of these limitations are disappearing or have disappeared already.

Evolution in computing has proceeded in waves, as particular technologies were implemented in one realm (e.g. military applications) and then imported into others (e.g. the enterprise) when their cost became low enough, or when some other constraint, such as manufacturing capacity or even patents and other protections, was removed. Over time, functions that originally required specialized hardware could be handled by mainframes, minicomputers, and eventually PCs. This evolutionary path can be traced for all technologies: high-resolution printing, scanning, photography, portability, sensors. (Note: in recent years many innovations, in particular mobile phones, have followed the opposite path, appearing first in consumer applications and then moving to the enterprise. In most cases, however, military applications precede consumer applications by years or decades.)

As importantly, hardware advances were quickly followed by a transition of software that could use those new capabilities.

Over the last decade or so, however, something interesting has happened: hardware has continued to evolve (arguably faster than before) in many areas, such as embedded systems. New technologies and new platforms appear and disappear faster than ever before, making it difficult for even large development teams to keep up. Given the wide reach of Web technologies and an expanding number of software/hardware platforms, developers have generally opted for lowest-common-denominator approaches that by necessity push the most complex processes to the server. Simultaneously, the rise of virtualization, on-demand cloud computing and lower server costs (resulting, to a large degree, from the emergence of Linux as a solid OS alternative in the late 90s) have made this path affordable and manageable for very small teams with limited funding.

Server-centric architectures have come to dominate software development, in some cases relegating personal devices to being “playback” devices, with most of the storage and compute power placed on the server side. Input can be locally cached but it is uploaded to servers as quickly as possible, with little computation done locally on the client prior to that.

This approach has many advantages, but it also involves an important tradeoff that is not often discussed: because software targets the lowest common denominator, a significant percentage of devices never see their power fully exploited, creating a capabilities/usage gap. Faster processors, more memory, and more storage all evolve in cycles that are faster than software release cycles. Whenever someone upgrades their phone but not their software (a common occurrence), the difference becomes more pronounced.

As a result, for most types of applications the gap between the raw compute capabilities of personal devices and software’s effective use of them is wider than it’s ever been, with gaming and video being the major exceptions. Apple and, to some degree, Microsoft have maintained a focus on client-centric functionality, but there’s a lot more to be done.

The current generation of computing devices is light, with (relatively) long-lasting batteries. These devices combine high-speed local and wide-area networking capabilities such as Bluetooth, Wi-Fi, 3G and LTE. They provide access to real-time information from a multitude of sensors: HD cameras, accelerometers, gyroscopes, compasses, and GPS. They provide advanced feedback mechanisms like built-in high-resolution screens, sound, vibration, and in some cases haptic feedback technology. And, critically, they also put in our pockets, or on our laps, large amounts of storage and memory, along with computing power that exceeds that of multi-million dollar supercomputers from twenty years ago: computers that occupied entire rooms and weighed multiple tons.

Hardware and software limitations have also historically restricted the very metaphors we could use. (Note: for example, early computer hardware and display technology could more easily handle vector-based display representations. Bitmapped graphics technologies (aka raster graphics) emerged only when enough memory and compute power existed to, in a sense, simulate the result of vector computations on CRTs, essentially turning the analog into digital.) As those limitations shrink in scope or disappear, we also have an opportunity to revisit those metaphors, including the very abstractions used by software and the way people interact with it.

VR may have gotten the ball rolling but it’s AR that will have the most impact first.

A Brave New World

New technologies and paradigms are often portrayed in nearly eschatological terms relative to what’s already there, because we confuse usefulness with overwhelming economic dominance: “Facebook will kill Google,” or “Television is Dead,” to name just two. These are all-or-nothing propositions in which we act as if the services and paradigms of the present will be instantly swept away by whatever’s next. Controversy sells and motivates, so this way of thinking and framing ideas makes for good copy and it is, admittedly, an effective way to “rally the troops.” But as far as getting results goes, it is as shortsighted and arrogant as it is useless.

The new complements the old, and both new and old technologies and approaches coexist, sometimes for decades or even centuries. Radio didn’t immediately eliminate the need for newspapers. TV didn’t wipe out Radio. The Internet did not eliminate any of them. In fact, each new medium has not eliminated the need for, or the function of, older ones, but it has, more often than not, affected the way in which those older media were monetized, which is a big (if often ignored) difference.

This is important because framing the discussion around a new idea as a complete replacement for something else not only puts unnecessary and unrealistic expectations on the idea but also can (as in our case it would) end up distorting it to satisfy this either-or mentality. If we were to frame the problem, say, as “the future of all search” we would very quickly end up comparing the new system’s results to Google’s even in cases in which it makes no sense to do so; similarly, if we were to say we aim to “replace Web Browsers” we would suddenly find ourselves worrying about W3C standards and JavaScript benchmarks.

In reality, what we should be looking for is to let each tool and service do what it does best and focus on providing alternatives for tasks for which existing solutions are either inefficient or just plain ill-suited. Microsoft Word allows you to create primitive spreadsheets, but that doesn’t negate the existence of, or the need for, Excel. Facebook allows you to do professional networking and publish short messages, but LinkedIn and Twitter not only exist but dominate those areas, respectively.

Just because information discovery, retrieval, storage, and navigation have come to be dominated by web search, browsers, file systems, and other tools, it doesn’t mean that alternatives can’t exist or would never succeed in the marketplace. In particular, as we shift to new platforms around mobile devices, with different capabilities and new interaction paradigms, there’s an opportunity to also change how we interact with information, how we present it, understand it, and relate to it.