After 50 Years, Is It Time to Say Goodbye to the Keyboard?
An Overview of Human-Computer Interfaces. What Comes After Touch Screen and Voice Recognition?
An Apple Watch, which no one would call a particularly powerful computer, can process gigabytes of data each second. Our brains have tens of billions of neurons and over a quadrillion connections, and they process an amount of data each second that we cannot even estimate. Yet the humble keyboard and mouse are still, to this date, the fastest bridge between the powerful human brain and the ultrafast world of 0s and 1s.
The Apple Watch is 250 times more powerful than the computer that landed Apollo on the Moon. While computers have shrunk from occupying an entire building to chips measured in nanometers, the keyboard remains the most reliable and most widely used human-computer interface.
Moving Beyond the Keyboard and Mouse?
Computers are getting embedded into all kinds of objects, and since we cannot connect a keyboard and mouse to every object around us, we need other interfaces. The current way to interact with smart objects, a.k.a. the IoT, is voice recognition, which has obvious limitations, such as being awkward to use in public. Let’s take a look at the methods that researchers and companies are working on at the moment.
Advances in multi-touch technology and multi-touch gestures (like pinching) have made the touch screen the favorite interface. Researchers and startups are working on a better touch experience: sensing how firm your touch is, which part of your finger is touching, and even whose finger is touching.
DARPA funded research in this area as early as the 1970s, but voice recognition was not practical until recently. Thanks to deep learning, we have now gotten pretty good at it. The biggest challenge with voice at this moment is not transcription, but rather perceiving meaning from context.
In eye tracking, we measure either the gaze (where one is looking) or the motion of the eye relative to the head. With the falling cost of cameras and sensors, as well as the increasing popularity of virtual-reality eyewear, eye tracking is becoming useful as an interface.
Gesture control is the human-computer interface closest to my heart. I have personally done scientific research on various gesture control methods. Some of the technologies used for gesture detection are:
Inertial Measurement Unit (IMU)
The data from the accelerometer, gyroscope, and compass (all or some of them) is used to detect gestures. The need for frequent recalibration and relatively low accuracy are among the drawbacks of this method.
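To make the idea concrete, here is a minimal sketch of IMU-based gesture detection: a simple threshold on the accelerometer's magnitude to spot a "shake" gesture. The sample values, threshold, and peak count are hypothetical, and real systems use far more robust classifiers, but the core idea is the same.

```python
# Minimal sketch: detecting a "shake" gesture from accelerometer data.
# The samples below are hypothetical readings in units of g (1 g = gravity).
import math

def acceleration_magnitude(sample):
    """Euclidean norm of a 3-axis accelerometer sample (ax, ay, az)."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)

def detect_shake(samples, threshold=2.5, min_peaks=3):
    """Flag a shake when the magnitude exceeds `threshold` g
    at least `min_peaks` times within the window."""
    peaks = sum(1 for s in samples if acceleration_magnitude(s) > threshold)
    return peaks >= min_peaks

resting = [(0.0, 0.0, 1.0)] * 20               # device sitting still (~1 g)
shaking = [(3.0, 0.5, 1.0), (-2.8, 0.2, 1.0),  # vigorous back-and-forth
           (3.1, -0.4, 1.0), (0.1, 0.0, 1.0)] * 5

print(detect_shake(resting))  # False
print(detect_shake(shaking))  # True
```

The recalibration problem mentioned above shows up here too: if the sensor drifts, a fixed threshold like `2.5` g stops working, which is why real devices recalibrate against the resting baseline.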
Infrared+Camera (Depth Sensor)
Most of the cool gesture detection systems we have seen use a combination of a high-quality camera, an infrared illuminator, and an infrared camera. In the structured-light variant, the system projects thousands of small dots into the scene; the pattern distorts differently depending on how far away an object is (there are other methods, such as time-of-flight (ToF), that I will not go into). Kinect, Intel’s RealSense, Leap Motion, and Google’s Tango all use some variation of this technology.
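The ToF method mentioned in passing has a pleasantly simple core: the sensor emits a light pulse and times how long it takes to bounce back. A minimal sketch of that arithmetic (the round-trip time below is a made-up example value):

```python
# Minimal sketch of the time-of-flight (ToF) depth principle:
# depth = (speed of light * round-trip time) / 2
# (divided by two because the pulse travels to the object and back).
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def tof_depth(round_trip_seconds):
    """Distance to the reflecting surface, in meters."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A pulse returning after ~6.67 nanoseconds bounced off an object
# about one meter away.
print(round(tof_depth(6.67e-9), 2))  # ~1.0
```

Note how tight the timing must be: resolving a few millimeters of hand motion means measuring time differences of picoseconds, which is why these sensors need specialized hardware.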
Electric Field Sensing

In this method, the user’s finger or body acts as a conductive object that distorts an electromagnetic field produced by transmitter and receiver antennas embedded in an object.
Radar

Radars have long been used to track objects, from airplanes and ships to cars. Google’s Advanced Technology and Projects (ATAP) group has done a remarkable job of shrinking radar onto an 8 mm by 10 mm microchip. This general-purpose gesture control chipset can be embedded into smartwatches, TVs, and other objects for gesture tracking.
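What the radar actually measures from a moving hand is, in large part, the Doppler shift of the reflected signal. A minimal sketch of that conversion, assuming a 60 GHz carrier (the band used by millimeter-wave gesture radars; the 400 Hz shift below is a hypothetical reading):

```python
# Minimal sketch: a moving hand shifts the frequency of the reflected
# radar signal (Doppler effect). The radial velocity is
#   v = doppler_shift * wavelength / 2
# (halved because the shift accumulates on both the outbound and
# reflected paths).
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def radial_velocity(doppler_shift_hz, carrier_hz=60e9):
    """Speed of the reflector toward the sensor, in m/s."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return doppler_shift_hz * wavelength / 2.0

# A 400 Hz Doppler shift at 60 GHz corresponds to a hand moving
# about 1 m/s toward the sensor.
print(round(radial_velocity(400.0), 3))
```

Gesture recognition then works on the pattern of these velocities over time: a finger rub and a hand swipe produce very different Doppler signatures.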
If you haven’t been wowed yet, let’s take it even further. All the methods mentioned above measure and detect a by-product of our hand gestures.
By processing the signals directly coming from the nerves in our muscles, we can get one step closer to the intent.
Surface EMG (sEMG), acquired by placing sensors on the skin over your biceps, triceps, or forearm, picks up signals from different muscle motor units. While sEMG is a very noisy signal, it is possible to detect a number of gestures from it.
Ideally, you would want to wear the sensors on the wrist. The muscles in the wrist, however, are deep, which makes it difficult to acquire a signal clean enough for accurate gesture detection.
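A classic first step in working with that noisy signal is rectify-and-smooth: take the absolute value of the raw sEMG, average it into an "envelope," and threshold the envelope to decide whether the muscle is contracting. A minimal sketch with hypothetical signal values (real pipelines add band-pass filtering and per-user calibration):

```python
# Minimal sketch of a classic sEMG pipeline: rectify the raw signal,
# smooth it into an envelope with a moving average, then threshold.
# Signal values are hypothetical, in arbitrary units.

def envelope(signal, window=3):
    """Rectify (absolute value) and smooth with a trailing moving average."""
    rectified = [abs(x) for x in signal]
    smoothed = []
    for i in range(len(rectified)):
        chunk = rectified[max(0, i - window + 1):i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def muscle_active(signal, threshold=0.5):
    """True if the smoothed envelope ever crosses the threshold."""
    return any(v > threshold for v in envelope(signal))

rest = [0.05, -0.04, 0.06, -0.05, 0.03, -0.06, 0.04, -0.03]
contraction = [0.1, -0.9, 1.1, -1.0, 0.8, -1.2, 0.9, -0.7]

print(muscle_active(rest))         # False
print(muscle_active(contraction))  # True
```

Distinguishing *which* gesture was made, rather than just detecting activity, requires comparing envelopes across multiple sensor channels, which is where machine learning comes in.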
A new company called CTRL Labs does gesture control from wrist sEMG signals. CTRL Labs’ device measures the sEMG signal and detects the neural drive from the brain behind the motion, which is one step closer to the brain. With their technology, you would be able to keep your hands in your pockets and still type on your phone.
A lot has happened in the past year. DARPA is spending $65M to fund neural interfaces, Elon Musk has raised $27M for Neuralink, Kernel has received $100M in funding from its founder Bryan Johnson, and Facebook is working on a brain-computer interface. There are two very distinct types of BCIs:
Electroencephalography (EEG) picks up signals from electrodes placed on the skin of the scalp.
It’s like putting a microphone above a football stadium: you will not know what each person is saying, but you can tell when a goal is scored (from the loud cheers and claps!).
EEG-based interfaces don’t really read your mind. For example, the most widely used BCI paradigm is the P300 speller. Say you want to type the letter “R”: the computer flashes different characters at random, and when “R” appears on the screen, your brain is surprised and emits a characteristic signal. It’s clever, but I would not call it “mind reading”: we cannot detect you thinking about “R”; rather, we have found a magic trick that works.
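The way that "surprise" signal is pulled out of noisy EEG is worth a sketch. Because the response is buried in noise, spellers average many epochs time-locked to each character's flash; the noise cancels out while the target character's response survives. The epochs below are synthetic (random noise plus an injected bump), not real EEG, and the timing is simplified:

```python
# Minimal sketch of how a P300 speller picks the character you attended
# to: average the EEG epochs following each character's flash; only the
# target character's average keeps a large peak (~300 ms after the flash).
# Epochs here are synthetic: Gaussian noise plus an injected bump.
import random

random.seed(0)

def make_epoch(is_target, length=20, peak_index=12):
    """One post-flash epoch: noise, plus a bump at `peak_index` for targets."""
    epoch = [random.gauss(0.0, 1.0) for _ in range(length)]
    if is_target:
        epoch[peak_index] += 5.0  # the "surprise" (P300-like) deflection
    return epoch

def average_epochs(epochs):
    n = len(epochs)
    return [sum(e[i] for e in epochs) / n for i in range(len(epochs[0]))]

def p300_score(epochs, peak_index=12):
    """Amplitude of the averaged response at the expected P300 latency."""
    return average_epochs(epochs)[peak_index]

target_flashes = [make_epoch(True) for _ in range(30)]      # flashes of "R"
other_flashes = [make_epoch(False) for _ in range(30)]      # other letters

print(p300_score(target_flashes) > p300_score(other_flashes))  # True
```

This averaging trick is also why P300 spellers are slow: each character needs many flashes before the signal rises cleanly above the noise.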
Companies like Emotiv, NeuroSky, Neurable, and a few others have developed consumer-grade EEG headsets. Facebook’s Building 8 announced a brain-typing project that uses another brain-sensing technique, functional near-infrared spectroscopy (fNIRS), and aims to reach a speed of 100 words per minute.
Invasive BCIs are the ultimate human-computer interface, placing electrodes directly inside the brain; however, there are serious challenges to overcome.
Given all the interesting technologies mentioned above, it may have occurred to you to ask why we are still stuck with the keyboard and mouse. There is a checklist of features that a human-computer interaction technology needs to tick before it can make its way to the mass market.
Would you use a touch screen as your main interface if it only worked 7 times out of 10? To serve as a primary interface, a technology needs very high accuracy.
Imagine for a moment that the letters you type showed up one second after you pressed each key. That one second alone would kill the experience. A human-computer interface with more than a couple of hundred milliseconds of latency is simply useless.
A human-computer interface should not require the user to spend a lot of time learning new gestures (e.g., a separate gesture for each letter of the alphabet!).
The click of a keyboard key, the vibration of a phone, the little beep of a voice assistant: all of these close the feedback loop. The feedback loop is one of the most important aspects of interface design, and it usually goes unnoticed by users. Our brain keeps looking for confirmation that its action has yielded a result.
One of the reasons it is so hard to replace the keyboard with any gesture control device is the lack of a strong feedback loop.
The Future of Human-Computer Interfaces
Due to the challenges mentioned above, it seems we are not yet in a position to replace the keyboard. Here is what I think the future of interfaces will look like:
- Multimodal: We will use different interfaces on different occasions. We may still use the keyboard for typing, touch screens for drawing and designing, voice to interact with our digital personal assistants, radar-based gesture control in the car, muscle-based gesture control for games and VR, and brain-computer interfaces to select the best music for our mood.
- Contextually Aware: You read an article on your laptop about wildfires in Northern California, then ask the voice assistant in your smart headphones, “How windy is it over there?” It should understand that you are referring to where the fires are.
- Automated: With the help of AI, computers will get better at predicting what you intend to do, so you won’t even need to issue a command. Your device will know which music you want played when you wake up, so you won’t need an interface to find and play a song in the morning.
I am an entrepreneur in Silicon Valley, and my passion is human-computer interaction. I have done research on brain-computer interfaces, muscle-machine interfaces, and gesture control devices. I post about entrepreneurship, venture capital, and new technologies. Please follow me on LinkedIn, Twitter, and Medium.