From Phone Booths to Smart Phones: How Human-Centered AI is Bringing Intelligence to Remote Collaboration

Andrew Rabinovich
5 min read · Dec 13, 2022

In 2013, Spike Jonze released a film called Her about a man who falls in love with an artificially intelligent (AI) companion. For many, this was the first exposure to the concept of AI and its ability to understand human interactions. Today AI is nearly ubiquitous, used in everyday life for applications such as search engines, movie recommendations, facial recognition, and many more. Yet we’ve barely scratched the surface of all that AI can unlock for human interactions.

Human-Centered AI (HCAI) is an emerging discipline intent on creating AI systems that amplify and augment rather than displace human abilities. If Jonze’s Her showed the hazard of replacing a human with AI, this new frontier is seeking to enhance the best of human interaction while solving complex problems.

During my time at Magic Leap, we set out to build Her. She was called Mica, a digital human with the ambition to cross the Uncanny Valley and pass the Turing Test, convincingly human in how she spoke and interacted. Visually, Mica was stunning, from skin texture to eye gaze and motion kinematics. However, she lacked the intelligence of human interactions. To bridge this gap given our current understanding of AI, lots of data with variable supervision is required. This means examples of interactions between humans, including the stimuli and responses between a person and the world around them. While the world-facing and user-facing sensors of head-mounted devices (MR/VR) capture both, virtual and mixed reality, unlike AI, aren’t ubiquitous, and examples for learning human interaction in those domains are scarce. So where can we find such data at scale?

Video conferencing had promise: here was an interaction model where one webcam and microphone, augmented with avatars and virtual backgrounds, captured the world while another webcam and microphone captured the user’s reaction to what they see on their screen. I became intrigued by the potential to leverage the vast amount of human interaction data produced by video conferencing to train HCAI to understand human interactions and solve one of the most complex and common problems with remote work: ineffective meetings. When the pandemic hit in 2020, I, along with millions of others, pivoted to working from home. Now virtual meetings were not only an interesting use case for training AI models, but a real opportunity to leverage AI to make meetings more productive and collaborative.

Headroom was conceived as an AI layer on top of existing video conferencing platforms, but we soon realized that to build accurate real-time AI, we had to develop our own first-in-class video conferencing solution with real-time AI directly integrated into its content delivery framework. Our goal is to bring the best parts of human interaction to virtual meetings, while automating the parts that detract from connection and collaboration.

As remote work and online collaboration have become more prominent, it’s clear that collaboration is greatly aided by presence: seeing and hearing each other helps us collaborate. Presence alone, however, doesn’t truly simulate in-person interaction and certainly doesn’t go beyond it to create an amplified environment for collaboration. We need co-presence: the sense of “being there,” which means not just seeing and hearing another person, but understanding and remembering them in the context of your shared virtual environment. Human-centered AI can greatly aid this understanding and memory, making collaboration more effective and teams more productive.

What does this look like in the meetings we have each week, from 1:1s to brainstorming sessions, board meetings to sprint reviews?

First, it means offloading the tedium of note taking, gauging engagement, and assigning action items to the AI, allowing human participants to focus on creating, sharing, and participating. Not only does this bring greater focus to meetings, it also produces a recording, summary, and analysis that no single person could capture in their entirety.

Second, video communication and collaboration are not disjointed processes. Sharing a screen of a whiteboard or canvas does not accurately represent the experience of sitting in a room with people, covering the wall with ideas on sticky notes. We need to think differently about the ways we work together and the tools we use if we want to truly replicate and enhance the experience of face-to-face collaboration in a virtual setting.

To do this, video conferencing must be ubiquitous across all collaboration platforms. Rather than integrating your tools with video conferencing platforms, video conferencing must integrate into all collaboration platforms. We call this Headroom Anywhere: a co-presence experience available on all the platforms where you love to work. With Headroom Anywhere, whether synchronous or asynchronous, each participant can elaborate on their idea and plan. They can talk through the process itself, whether developing a diagram, design, or presentation, or even writing code or documents, combining conversation with the objects of collaboration.

Third, each meeting contributes to a personalized intelligence platform. Capturing these moments in Headroom not only amplifies in-meeting collaboration; it also creates perfect meeting memory, where the multimodal content of meetings is searchable and shareable through video highlight reels, written summaries, and automated action items. The entire digital footprint, along with the AI insights, will be available both at the source of collaboration and in Headroom. Going even further, Headroom will be able to collate conversations and environments on the basis of content and context so you can trace an idea from inception to execution across the thread of the discussions that shaped it.

Human-centered AI is the key to the future of remote collaboration. One hundred years ago, to make a phone call one had to find a phone booth. With the introduction of landlines you could call from your home, and today with mobile phones you can make a phone call from anywhere. We are in the phone booth era of video conferencing: to collaborate on a document, spreadsheet, whiteboard, or presentation, the conventional thinking is to start a virtual meeting and share your screen. We are on a mission to bring video conferencing into the smart phone era. Instead of going to a video conferencing platform to see and hear from others, you should be able to collaborate from anywhere with anyone, natively in the environment where you’re already working. The future of workplace productivity is here, and AI is the driving force that will enable collaborative teams to get the most out of the interactions we have with each other.

Learn more about Headroom here.
