The future of human interaction with machines?
Most computing today, especially on web and mobile, is built for consumption. The user provides minimal input to the computer but gets a lot of information out. With a few thumb swipes, the user can scroll endlessly through pages of images, video and text.
What’s interesting about emerging human-machine interface tech is the hope that the user may be able to “upload” as much as they can “download” today. The most promising application is in augmented creativity — where the user works with a computer to design something new.
A few different technologies are finally starting to reach maturity for commercial use — HD sensors, predictive artificial intelligence, and 3D displays — and combined they can be used to reshape the way people interact with computers.
This new way of putting together software and hardware can be called a “3D human-machine interface”. Its defining characteristics are:
- HD sensors: High-resolution, passive input from the user
- Predictive AI: Models that continuously infer the user’s intent
- 3D displays: Realistic AR/VR interfaces with haptic feedback
How does it work?
An example of a 3D human-machine interface technology is an eye-tracker. Eye-trackers use high-speed, high-resolution cameras to track users’ eye movements and use that data to understand their attention and intent. With a mouse, you send one click every few seconds. An eye-tracker may be able to infer several bits of information about your attention and intent multiple times per second.
Imagine an eye-tracker being used in a 3D modelling program. It can tell which part of the 3D model you’re focusing on, and maybe automatically zoom in. It might use a trained AI to guess from your eye fixation behavior which menu you want to open, and automatically click through the menus to get to the action you want. If the computer gets it wrong, you can manually say so, maybe by pressing a “cancel” button.
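To make the zoom example concrete, here is a minimal sketch of how an application might turn raw gaze samples into a zoom suggestion. It assumes the tracker delivers (x, y) screen coordinates at a fixed rate and uses a simplified dispersion-threshold approach to find fixations, which is one common technique and not necessarily what any particular product does. The function names, the thresholds and the `regions` mapping are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]              # one gaze sample in screen pixels
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Fixation:
    x: float        # centroid of the clustered samples
    y: float
    n_samples: int  # how many consecutive samples stayed in the cluster


def dispersion(window: List[Point]) -> float:
    """Spread of a window of samples: x range plus y range."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(
    samples: List[Point],
    max_dispersion: float = 30.0,  # assumed threshold in pixels
    min_samples: int = 6,          # assumed minimum dwell, in samples
) -> List[Fixation]:
    """Group consecutive, tightly clustered samples into fixations."""
    fixations: List[Fixation] = []
    start = 0
    while start + min_samples <= len(samples):
        end = start + min_samples
        if dispersion(samples[start:end]) > max_dispersion:
            start += 1  # gaze is still moving, slide the window forward
            continue
        # Grow the window while the gaze stays tightly clustered.
        while end < len(samples) and dispersion(samples[start:end + 1]) <= max_dispersion:
            end += 1
        window = samples[start:end]
        cx = sum(p[0] for p in window) / len(window)
        cy = sum(p[1] for p in window) / len(window)
        fixations.append(Fixation(cx, cy, len(window)))
        start = end
    return fixations


def region_at(fix: Fixation, regions: Dict[str, Box]) -> Optional[str]:
    """Return the name of the first region containing the fixation, if any."""
    for name, (x0, y0, x1, y1) in regions.items():
        if x0 <= fix.x <= x1 and y0 <= fix.y <= y1:
            return name
    return None


def propose_zoom(samples: List[Point], regions: Dict[str, Box]) -> Optional[str]:
    """Suggest the region under the longest fixation, or None if unsure."""
    fixations = detect_fixations(samples)
    if not fixations:
        return None
    longest = max(fixations, key=lambda f: f.n_samples)
    return region_at(longest, regions)
```

Note that `propose_zoom` only returns a suggestion. The application would still decide whether to act on it, which leaves room for the “cancel” button when the guess is wrong.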
Why human-machine interfaces are an upgrade over current interfaces
The process of putting your thoughts into a computer may be 10x or even 100x faster than with just a mouse. If it took 10 hours to create a 3D model before, a 100x speedup would bring that down to 6 minutes.
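The arithmetic behind that figure is simple enough to check in a few lines. This is only an illustration, with a hypothetical helper name, and it assumes the speedup applies evenly to the whole task.

```python
def minutes_after_speedup(hours_before: float, speedup: float) -> float:
    """Return the new task duration in minutes for a given speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return hours_before * 60.0 / speedup


# A 10-hour task at the two speedups mentioned in the text.
ten_x = minutes_after_speedup(10, 10)       # 60 minutes
hundred_x = minutes_after_speedup(10, 100)  # 6 minutes
```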
The “display” part of a human-machine interface shows the user a working copy of what they’re making. AR/VR headsets allow for much richer, wide field-of-view rendered displays compared to smartphones or laptops. In the 3D modeling example, a headset display would allow for viewing the working model in much more detail. AR/VR can also be combined with haptics to create a display that looks and feels like it’s made up of real physical objects, which makes interaction feel all the more natural.
Building a human-machine interface
The next big thing in software depends a lot on what’s happening in hardware. Moore’s law has arguably been running out of steam since around 2016, which means we can no longer count on chips getting exponentially faster. Instead, the future is chips getting cheaper, with computation spread across more and more parallel cores and specialized chips, such as GPUs and AI chips.
The other big trend in hardware has been the smartphone supply chain, which has led to an abundance of cameras, sensors and display technology. 5G will start its rollout later this year, which will make it easy for these sensors and displays to exchange large amounts of data with a processor.
Over the past decade, the biggest new thing in software has been artificial intelligence, in the form of machine learning and neural networks. These technologies are really good at three things:
- Filtering: Neural networks and ML are really good at getting a signal from noisy data, especially camera data. They can detect spoken words in an audio clip, and understand the content of a photograph.
- Learning: Neural networks and ML can be used to train adaptive agents. They can be optimized to win inside a fixed environment, such as a video game or an application.
- Generating: Creative AI can be used to synthesize new things from scratch, like photos, music, or even code.
These techniques complement what’s happening in hardware. We can use cameras and sensors to get lots of passive data from the human body. We can use ML to accurately interpret those signals, train agents to predict what the user wants and generate new content optimized for the user’s desires. Finally, we can use our new GPU chips and AR/VR headsets to display the output.
What’s promising about human-machine interfaces is that most of the underlying technologies (AI, specialized chips, VR/AR headsets, sensors) are predicted to improve dramatically over the next few years. Unlike other applications of AI such as self-driving cars, human-machine interfaces don’t need models with perfect accuracy. Approximate predictions are just fine, because the user can correct the computer when it guesses wrong.
How long until deployment?
The missing piece of the puzzle in human-machine interfaces is a deeper understanding of how the mind interacts with computers. Computational neuroscience, including the study of inferring mental states from biometrics, is a relatively young field. Interaction design for human-machine interfaces is also almost completely unexplored, outside of a few research labs. However, the biggest missing piece is in applications. The mechanics of augmenting human creativity are grounded in relatively well-established technology. The real question is: what should augmented creativity be used to create?