Cheng Jang Thye
Published in Antaeus AR
6 min read · Dec 20, 2023

Apple Vision Pro, harbinger of the next generation of human interface devices

Many of you will have heard about Apple’s new device, the Vision Pro, from various web sites and YouTube videos. Most of them talk about its exorbitant price and its many amazing new features. As a new Apple device category, it certainly deserves significant attention from the market. Here, I would like to focus on the context in which this device is being introduced: the evolution of our human-machine interface with computers.

(Source: Apple Computer)

We started computing with the dumb terminal connected to a mainframe computer. The user interface was famously known as the green screen, typified by the IBM 3270 terminal (just google “3270 terminal”). In its simplest form it is a character stream device: the computer outputs a sequence of characters, pauses, and then lets you type a sequence of characters of your own, terminated with the “Enter” key. This was already an improvement over the typewriter-like keypunch, which let you punch your commands and data onto cards that were then read by the mainframe. The point here is not the history of this class of devices but the way humans had to interact with the computer. The interaction is single dimensional, with one input stream and one output stream, and our eyes and fingers are the main way we interact with the computer.

Then we evolved to the next generation: a screen on which any position can be addressed directly by its column and row. With this, the display becomes two dimensional (2-D). We started off with character based positioning, meaning the screen behaves like a grid of 25 x 80 characters (25 lines of 80 characters each), and the computer outputs information by placing individual characters in any of the grid positions. For human interaction, we introduced the concept of a cursor, a location on the screen (typically marked with a blinking rectangle to catch our attention) where the human operator enters information. This is a leap ahead of the single dimensional device: we can now interact with the computer at different locations on the screen by “moving” the cursor with the arrow keys, and information can be displayed anywhere on the screen.
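To make the idea of an addressable character grid concrete, here is a minimal Python sketch. It is only an illustration of the concept described above, not how any particular terminal was implemented; the class name `TextScreen` and its methods are my own hypothetical names.

```python
ROWS, COLS = 25, 80  # the classic text-mode grid described above


class TextScreen:
    """A toy character-cell display with a movable cursor (hypothetical)."""

    def __init__(self, rows=ROWS, cols=COLS):
        self.rows, self.cols = rows, cols
        self.cells = [[" "] * cols for _ in range(rows)]
        self.cursor = (0, 0)  # (row, column)

    def move_cursor(self, d_row, d_col):
        """Move the cursor like an arrow key would, staying on screen."""
        row = min(max(self.cursor[0] + d_row, 0), self.rows - 1)
        col = min(max(self.cursor[1] + d_col, 0), self.cols - 1)
        self.cursor = (row, col)

    def put_text(self, row, col, text):
        """Write text starting at a grid position, clipped at the right edge."""
        for offset, ch in enumerate(text[: self.cols - col]):
            self.cells[row][col + offset] = ch

    def type_char(self, ch):
        """Enter one character at the cursor, then advance one cell right."""
        row, col = self.cursor
        self.cells[row][col] = ch
        self.move_cursor(0, 1)

    def line(self, row):
        """Return one row of the screen as a string."""
        return "".join(self.cells[row])


# The computer places a label anywhere on the grid...
screen = TextScreen()
screen.put_text(12, 30, "NAME:")
# ...and the operator types at the cursor position.
screen.cursor = (12, 36)
for ch in "ADA":
    screen.type_char(ch)
```

The computer writes the label at row 12, column 30, and the operator’s typing lands wherever the cursor happens to be. Output and input each have their own position on the screen, which the single stream of the older terminal could not offer.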

(Photo by Super Snapper on Unsplash)

Of course, this evolved further into what we commonly see today. The screen now divides its display space into pixels (short for “picture elements”, the smallest dots the screen can display), and we can have millions of them on the screen. We are no longer displaying characters but rather images of characters, each built from a collection of colored pixels. The screen has become graphical, so that images and graphics can be displayed seamlessly. It now gives our eyes a 2-dimensional view which, like a photograph, lets us freeze a moment of reality and examine it in greater detail. We also have new ways of interacting with the screen: a pointing device (such as a mouse or trackpad) directly moves the cursor, which now appears as a pointer on the screen. This style of interaction is known as the Graphical User Interface (GUI).
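The step from “displaying a character” to “displaying an image of a character” can be sketched in a few lines of Python. The 5 x 7 bitmap for the letter “A”, the tiny 16 x 9 framebuffer and the function name `draw_glyph` are all made up for this illustration; real systems use proper font files and far larger screens.

```python
# A hypothetical 5x7 bitmap for the letter "A": "#" is a lit pixel, "." is unlit.
GLYPH_A = [
    ".###.",
    "#...#",
    "#...#",
    "#####",
    "#...#",
    "#...#",
    "#...#",
]


def draw_glyph(framebuffer, glyph, left, top, color):
    """Copy the lit pixels of a glyph into the framebuffer at (left, top)."""
    for dy, row in enumerate(glyph):
        for dx, mark in enumerate(row):
            if mark == "#":
                framebuffer[top + dy][left + dx] = color


# A tiny screen of 16 x 9 pixels, each pixel an (R, G, B) color.
WIDTH, HEIGHT = 16, 9
BLACK, GREEN = (0, 0, 0), (0, 255, 0)
framebuffer = [[BLACK] * WIDTH for _ in range(HEIGHT)]

draw_glyph(framebuffer, GLYPH_A, left=2, top=1, color=GREEN)

# Count how many pixels were lit to draw the letter.
lit = sum(pixel == GREEN for row in framebuffer for pixel in row)
```

Once the screen is just a grid of colored pixels, a letter is nothing special: the same framebuffer can hold a photograph, a chart or a game scene.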

This progress is essentially about bringing the display of information as close as possible to what our eyes can see, to make the most of our visual abilities. We can now digitally capture something we see as an image, then view and manipulate it on our computers (to share, to store, to print, etc.). We can see flattened 2-D (two dimensional) images of the physical world, and can even play games that simulate a 3-D (three dimensional) environment of the real world, like the Call of Duty first person shooter games. The development of such 3-D games and of Virtual/Augmented Reality are steps towards giving the human eye a better experience of the real world than the flat (non 3-D) screen can. We are still operating a device with our hands and fingers, and relying on hand motion and tactile feedback to improve our interaction with what is running on the screen.

(Photo by Campaign Creators on Unsplash)

By tapping into the stereoscopic vision of our eyes, we now have devices that let us see in 3-D. There have been several attempts to design such devices, and the latest ones all place a small screen in front of each eye so that we see objects with depth. This may sound simple, but it requires a total revamp of how we organize and process information to produce a realistic 3-D rendition for our eyes. Apple’s Vision Pro is one of the latest devices and aims to offer the most realistic rendition. Other vendors of course offer devices too, initially aimed at 3-D gaming, Virtual Reality simulation and Augmented Reality: Sony PlayStation VR2, Meta Quest Pro and others reached the market well before Apple’s Vision Pro.
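Why do two screens give us depth? Each eye sees the same object from a slightly different position, and the closer the object, the bigger the difference between the two images. The Python sketch below illustrates this with a simple pinhole projection. It is a textbook simplification and not how the Vision Pro or any other headset actually renders; the function names, the focal length of 1.0 and the eye separation of 0.063 m (roughly a typical adult value) are assumptions I chose for the example.

```python
def project_for_eye(point, eye_x, focal_length=1.0):
    """Pinhole projection of a 3-D point (x, y, z) for an eye at (eye_x, 0, 0).

    z is the distance in front of the viewer and must be positive.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the viewer")
    return (focal_length * (x - eye_x) / z, focal_length * y / z)


def stereo_pair(point, eye_separation=0.063, focal_length=1.0):
    """Return the (left, right) image positions of one point."""
    half = eye_separation / 2
    left = project_for_eye(point, -half, focal_length)
    right = project_for_eye(point, +half, focal_length)
    return left, right


def disparity(point, eye_separation=0.063, focal_length=1.0):
    """Horizontal offset between the two images; it shrinks as depth grows."""
    left, right = stereo_pair(point, eye_separation, focal_length)
    return left[0] - right[0]


near = disparity((0.0, 0.0, 0.5))  # an object half a metre away
far = disparity((0.0, 0.0, 5.0))   # the same object five metres away
```

In this simplified model the offset works out to the focal length times the eye separation divided by the distance, so the object ten times farther away shows one tenth of the offset. Our brain reads that offset as depth, which is why every frame has to be rendered twice, once for each eye.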

(Photo by Jessica Lewis on Unsplash)

They were all great attempts at building a virtual or augmented reality world, but they lack the realism that Apple’s Vision Pro provides. The intent of this article is not to extol the advantages of Apple’s device, but to shed light on the new capabilities we gain when we truly move to 3-D visualization. Other devices will likely catch up with Apple’s offering, at competitive prices.

The IBM 3270 terminal stood for the single dimensional interface, and the GUI pioneered at Xerox PARC, which inspired the Apple Lisa (predecessor of the Apple Macintosh), brought the 2-D interface that is now in all the devices we hold in our hands. In the same way, Apple’s Vision Pro may be the device that heralds a new generation of 3-D experience, as the Netscape browser did for the Internet. Over the years, we have developed the algorithms to render 3-D objects, scenes and even open spaces in an almost lifelike way; just look at the visuals in games like Call of Duty. The new experience (how you see and provide input), metaphors (what you do) and mechanics (how you move in 3-D) offer even more ways for us to interact with each other, with the world and even with abstract objects.

So, what can we expect the experience to be like? You can find more information on how application services can be built for Vision Pro here: https://developer.apple.com/visionos/

But I think the true portal experience may soon be upon us, where you jump into a site that offers personalization. (Remember the early days of portal web sites in Internet browsing?)

(Photo by maxim bober on Unsplash)

I hope you have enjoyed reading this article and that it inspires you to come up with more ideas about what is to come. Thank you.

Cheng Jang Thye
Antaeus AR

An IT guy by profession, a sports fan (multiple sports), a husband with a loving wife and family, and a thinker wondering what is happening to our world.