Progressive Attention AI: An Artificial Intelligence That Accurately Emulates Human Behavior In The Workplace

Inside the development of a dual-system AI and NLP platform that processes conversational data with higher transcription and highlighting accuracy than industry benchmarks

Artificial Intelligence is a hot topic, but in our eyes AI is a means to an end. For us at Voicera, AI is a tool that can radically improve your productivity at work, and our most recent development is built to have exactly that kind of impact.

We spent more than a year developing an AI that can mimic human behavior. The human brain is an incredibly rich and complex system that we can learn from. It processes information from multiple inputs at the same time and determines which input is most important at that very moment, even while it is already deeply focused on a particular task. You can think of it in terms of this quote from our CEO, Omar Tawakol:

“The human brain has always had the luxury of constantly scanning its total environment while not wasting precious brain resources on every noise. Then when the brain senses something important, it allocates more attention to carefully understand a particular input from the environment. Without it, people wouldn’t be able to both engage in high level conversations and thoughts while also staying safe from threats. Most AI systems suffer from this same ‘sophie’s choice’ of expensive time-consuming processing OR lower accuracy. Progressive Attention AI solves this, by allowing a fast analysis of the environment to identify relevant inputs followed by very deep and focused processing of important moments that can yield a much higher accuracy.”

The development here is unique because it allows a machine to run dual systems of processing that functionally emulate how a person reacts and responds in a work environment. The new system mimics human-level attentiveness in meetings and conversations by extracting the items that are most relevant to the host and the participants. It then applies progressively higher computational focus to those portions in order to produce much better outputs. The result is a more accurate transcript than established industry benchmarks, as well as more comprehensive “Meeting Minutes” in which Eva surfaces highlights and notes from the meeting.
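To make the two-stage idea concrete, here is a minimal sketch of a "scan everything cheaply, then focus on what matters" loop. It is an illustration only and not Voicera's implementation: the `Segment` type, the keyword-based `keyword_score`, the `pretend_deep_process` stand-in, and the `budget` parameter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    start: float  # offset into the meeting, in seconds
    text: str     # draft text produced by the fast first pass


def progressive_pass(
    segments: List[Segment],
    cheap_score: Callable[[Segment], float],
    deep_process: Callable[[Segment], str],
    budget: int,
) -> List[str]:
    """Score every segment cheaply, then deep-process only the top `budget`."""
    ranked = sorted(
        range(len(segments)),
        key=lambda i: cheap_score(segments[i]),
        reverse=True,
    )
    chosen = set(ranked[:budget])
    # Segments outside the budget keep their fast-pass draft text.
    return [
        deep_process(seg) if i in chosen else seg.text
        for i, seg in enumerate(segments)
    ]


# Hypothetical stand-ins for the two systems (assumptions for illustration).
CUES = ("action item", "follow up", "deadline", "decision")


def keyword_score(seg: Segment) -> float:
    """Fast pass: count cue phrases that hint at an important moment."""
    text = seg.text.lower()
    return float(sum(text.count(cue) for cue in CUES))


def pretend_deep_process(seg: Segment) -> str:
    """Placeholder for the expensive, focused pass (here it only uppercases)."""
    return seg.text.upper()


segments = [
    Segment(0.0, "thanks everyone for joining"),
    Segment(12.5, "action item: send the deadline to legal"),
    Segment(30.0, "nice weather today"),
]
result = progressive_pass(segments, keyword_score, pretend_deep_process, budget=1)
```

With `budget=1`, only the second segment receives the expensive pass; the other two keep their draft text. A real system would replace both stand-ins with actual models, but the control flow stays the same.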

We Learned A Lot In The Last 10+ Months…

Voicera operated in open beta for almost 6 months and aggregated data from 100,000 meetings. We launched the first commercially available version of our product in April and have seen significant success since then. In the meetings we were a part of, we witnessed how everyday users leverage an in-meeting AI. At first, the overwhelming interest was in the full meeting transcript, with 50% of users visiting the transcript tab in our UI within the first 7 days of a meeting. We also saw that as many as 35% of users opened and/or shared their post-meeting highlights email, while the vast majority visited the highlight tab in the full desktop browser UI. Since we launched the commercial product, which featured the first version of our Progressive Attention AI, the percentage of people leveraging the highlights has been increasing relative to the percentage of people who visit and review a full transcript.

A. Transcripts are “table stakes”

An initial goal of the platform was to give people what they thought they needed, which was highlights, but we quickly realized you can’t get to a highlight until you build trust in the underlying transcript. When we launched, we were leveraging some external transcription tools layered over a couple of our own, but we saw inaccurate transcripts and decided to fix that. Part of the Progressive Attention AI is a 5-layer proprietary transcription engine we refer to as Ensemble. Ensemble reaches much better accuracy within 30 minutes of a meeting, with no human error correction. As a result of developing this transcription engine, we see accuracy that is 100–110% better in terms of Word Error Rate (WER) than industry-accepted benchmarks and established enterprise players.
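For readers unfamiliar with the metric, WER is the standard word-level edit distance between a reference transcript and the engine's output (substitutions, deletions, and insertions), divided by the number of reference words; lower is better. The sketch below shows the general calculation only. It says nothing about how Ensemble itself is evaluated, and the function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("the" -> "a") and one deletion ("by") over 5 reference words.
wer = word_error_rate("send the contract by friday", "send a contract friday")
```

In this example `wer` is 0.4, meaning two errors for every five reference words.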

Voicera Ensemble produces 2x and 3x the coverage at a 90% target accuracy compared with the #1 and #2 major public transcription APIs.
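One plausible way to read "coverage at a target accuracy" is the share of transcribed words that fall in segments meeting the accuracy bar. That definition is an assumption made for this sketch, not a statement of how the figure above was measured, and the `coverage_at_accuracy` helper and sample numbers are hypothetical.

```python
from typing import List, Tuple


def coverage_at_accuracy(
    segments: List[Tuple[int, float]], target: float = 0.90
) -> float:
    """Fraction of words in segments whose accuracy is at least `target`.

    Each segment is a (word_count, accuracy) pair, with accuracy in [0, 1].
    """
    total_words = sum(count for count, _ in segments)
    if total_words == 0:
        return 0.0
    covered_words = sum(count for count, acc in segments if acc >= target)
    return covered_words / total_words


# Made-up numbers: 150 of 200 words sit in segments at or above 90% accuracy.
coverage = coverage_at_accuracy([(100, 0.95), (50, 0.80), (50, 0.92)])
```

Under this reading, "2x the coverage" would mean twice as much of the meeting clears the 90% bar.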

B. “Cliffs Notes” for meetings are where the value really is

Our research took two paths. One path was quantitative and came from analyzing the data in our systems. The second path was qualitative and stemmed from talking to more than 300 users over a period of 2 months. From both methods we established that users do not want to relive an entire meeting. They do not want to review a full transcript because that is not an efficient use of time. They did review it a couple of times in the first week or two, but once they trusted the accuracy of those transcripts they moved on to the highlights. Transcripts are a safety net, but highlights are the value that is extracted from the meeting. Highlights are the notes that can be used to share insights, guide follow-up, and ensure that the customer is heard properly. Our focus quickly shifted from the accuracy of full transcripts to the accuracy and readability of highlights.

C. Actionability drives performance

This leads us to the most important piece: the highlights themselves have to be actionable. They have to come via email in a format that is useful. They have to be deliverable to where your team is doing its work. And not only do they need to be deliverable, they need to be simple to use. We refer to this as Actionable Activation. From a usability perspective this refers to things like diarization (the ability of the AI to recognize different speakers and attribute the transcribed text to them), zoning (the AI recognizing where a highlight should start and end when it automatically captures a moment), and customization (the ability of the user to teach the AI what is important to them, which may or may not be important to others). For highlights to be actionable, these elements need to be addressed and the highlights delivered to the workflow system of your choosing, such as email, Salesforce, Slack and others. Meetings are traditionally ephemeral. Our goal is to change that so that outcomes from a meeting can impact your workflow. Users do not need a new platform to log into. They want a platform that delivers information to where they already are.
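As a rough illustration of how diarization and zoning fit together, the sketch below takes speaker-attributed turns and a trigger time, then keeps every turn that overlaps a window around that moment. The `Turn` type, the `zone_highlight` and `render` helpers, and the padding values are assumptions invented for this example; a production system would likely learn the boundaries rather than use fixed padding.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # label assigned by diarization
    start: float  # seconds from the start of the meeting
    end: float
    text: str


def zone_highlight(
    turns: List[Turn],
    trigger_time: float,
    pad_before: float = 10.0,
    pad_after: float = 20.0,
) -> List[Turn]:
    """Keep the turns that overlap a fixed window around the trigger moment."""
    window_start = trigger_time - pad_before
    window_end = trigger_time + pad_after
    return [t for t in turns if t.end > window_start and t.start < window_end]


def render(turns: List[Turn]) -> str:
    """Format a highlight as speaker-attributed lines, e.g. for an email body."""
    return "\n".join(f"{t.speaker}: {t.text}" for t in turns)


turns = [
    Turn("Ana", 0.0, 8.0, "Welcome back."),
    Turn("Raj", 8.0, 25.0, "We still need the pricing sheet."),
    Turn("Ana", 25.0, 40.0, "I'll send it by Friday."),
    Turn("Raj", 70.0, 80.0, "Moving on to hiring."),
]
highlight = zone_highlight(turns, trigger_time=27.0)
highlight_text = render(highlight)
```

Here the commitment at 27 seconds pulls in the request that preceded it, so the highlight reads as a complete exchange instead of a single clipped sentence.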

Why? Because Progressive Attention AI delivers the future for Conversational Data

The development of a fully functioning Progressive Attention AI is key to the future of how companies will leverage Conversational Data. If you can aggregate conversations so that they can be analyzed and acted upon, you can impact everything from employee retention to customer and partner revenue.

  • Do meeting culture and tone in internal meetings directly correlate with employee retention?
  • Do the tone and actionability of meetings lead to higher-value deals for your enterprise?
  • Do the volume and tone of customer contact relate to creating “raving fans” for your brand?

These are just the beginning of what Conversational Data could provide. To make this a reality you need two things. You need to aggregate the data, and you need to extract the value by processing the most important moments to generate higher levels of insight. The accuracy needed to automatically act on conversational data is higher than most systems are able to produce today. Extracted highlights that are processed by a dual-system AI can now make these meetings accurate and actionable, and that is fundamentally why Progressive Attention AI matters in the workplace.


Voicera is an AI technology company based in Menlo Park, CA. Voicera focuses on using AI technology to harness voice in the workplace, connecting meetings to the rest of your collaboration systems. This is done by offering Eva, the enterprise voice AI, to both individuals and the enterprise. Eva listens and takes notes, and automatically provides those notes so your meetings can be activated. You can find out more about Voicera and Eva by visiting www.voicera.com and signing up for a free account.