The human-to-machine communication model

Discover better insights with higher confidence, faster than humanly possible. The key to taking a first step in the right direction with cognitive applications is asking not what a system could do, but what it should do. With artificial intelligence (AI), we have decades of data and research into human thought processes and communication to use as a blueprint. To simulate human relationships, we begin by observing and better understanding ourselves.

Jennifer Aue
IBM Design

--

Published January 2018 by IBM developerWorks.
Written by Jennifer Sukis and Leah Lawrence.

August 2020 update:
We are proud to share that The Human-to-Machine Communication Model has been honored with IBM’s High Value Patent Award by the IBM Corporate Intellectual Property Team. The award is reserved for patents deemed most critical to transactions that generate significant intellectual property (IP) income.

Introduction

So you want to build a cognitive application, and you want it to be great. You want it to be useful, exciting, and inspiring; in essence, you want to create a truly cognitive experience. You might be wondering: What is a cognitive experience? Should the application I’m designing be cognitive? If it should, can I measure how cognitive it is?

Cognition is the mental action or process of acquiring knowledge and understanding through thought, experience, and the senses. It describes how humans process thoughts to communicate with one another. For computers, cognition describes a system that simulates this human thought process by using algorithmic models. A cognitive computer’s goal is to interact with humans in a way that feels natural to us and, in doing so, to augment our own cognitive capabilities.

This, in turn, raises a fundamental question: How does a cognitive system simulate human cognitive processes? The answer is that it must activate our human senses (sight, sound, touch, and so on), create a motivating experience, and inspire our thought processes, all with the goal of enabling us to acquire knowledge and shape our understanding. We call this presence. The most compelling cognitive solutions are ones that create a presence in our lives and, through that presence, augment our own cognitive capabilities.

When compared with non-cognitive systems, cognitive applications go beyond what we experience today in transactional apps (for example, push a button, get a determined response). They are distinguished by IBM as systems that have the ability to understand, reason, learn, and interact naturally. To accomplish this, cognitive systems analyze massive quantities of data to compose insightful, contextually aware, and continuously improving relationships with users. Their growing knowledge of a user’s needs, goals, and values allows them to provide individualized responses, suggest relevant insights, and reveal contextually significant discoveries.

To understand, reason, learn, and naturally interact, there are five elements of human thinking and communication that cognitive systems must be able to recognize, understand, analyze, and simulate:

  • Perception
  • Motivation
  • Reasoning
  • Learning
  • Knowledge

The Communication Model for Cognitive Systems. IBM Patent Application 15843302.

There are different levels of cognitive function depending on the autonomy of the application. A low-level cognitive application needs a lot of assistance from the user or programmer, while a higher-level application behaves more on its own. A brand new application might start out with low cognitive function because a programmer must train knowledge and behavior until the application responds reliably. Over time, it will have more advanced cognitive functions. To get to more advanced functions faster, IBM Watson™ provides a base framework to build upon. Each cognitive capability can also have different levels of functions depending on the amount of programmer or user intervention.

One interaction between a user and a cognitive application might not need every one of these capabilities, but the application itself will need every capability to complete a wholly cognitive experience — something that has presence with the user.

The human-to-machine communication model ties the components necessary for cognitive systems together into a methodology for creating cognitive experiences. Its purpose is to guide and inspire intentional innovation, and provide a structure for making responsible design decisions based on humankind’s needs, values, and expectations.

Part 1. Input: Understanding the world

Knowledge

Humankind’s impetus to interact with a technology, whether a hammer, a microwave, or a quantum computer, aligns directly with that technology’s ability to improve human lives and to extend our strength or reach. For cognitive computing, the improvement that lures us to interact with it is its ability to process and synthesize vast amounts of data in ways that augment our thinking, allowing us to make better decisions and new discoveries faster than humanly possible. It’s this unique capability that’s helping doctors spend less time researching and more time caring for patients, helping teachers create targeted lesson plans for every student’s unique needs, and helping companies serve millions of customers simultaneously, personally, and proactively.

Knowledge is the summation of everything a cognitive system knows — from the ground-truth data that it’s originally trained with to the learnings of every interaction it experiences. Cognitive systems can be trained on any topic if given a model for that domain. They are especially good at reading, identifying, and remembering massive amounts of unstructured information in a way that would be impossible for the human mind to process. They can analyze thousands of pages of content and summarize highlights, or listen to hours of music then compose their own songs, or browse terabytes of images to reveal relationships and patterns across previously unrelated research. They improve their ability to provide us with personalized responses, relevant insights, and new discoveries with every new piece of data they add to their knowledge base.

Knowledge is the application’s ground truth and ever-growing expertise and skill set:

  • Good — The application has subject-matter expert knowledge that makes the application a skillful tool for problem solving.
  • Better — The application allows a user or programmer to update the knowledge base with trained or live data.
  • Best — The application updates its knowledge base on its own using live sources.

Perception and motivation

To respond to input from the outside world, cognitive applications need to understand context: the circumstances surrounding an event, a statement, or an idea. Context allows the system to fully comprehend the meaning of a user’s intent at the moment of interaction and to provide insightful, timely, natural responses.

For instance, recognizing the date, author, quality of information, and validity of sources regarding an incoming article allows a cognitive system to determine what priority to give the new information. Similarly, the state of the world at the moment of interaction also provides important context for understanding a user’s needs. If negative press comes out regarding a company’s product, the CFO might want to immediately begin analyzing possible repercussions in stock prices. A cognitive system could couple the news alert with significant insights and stock analysis, knowing it’s a priority for the CFO.

Context comes from any source that affects the system’s ability to provide intelligent responses to a user and should be considered from two viewpoints that correspond to human thought processes.

Perception

“Hey computer, where is Jones?”

If a cognitive system knows that the user, Tom, is at home rather than in his car, it might infer that he’s likely asking about his dog, Jones, rather than Jones Street. Furthermore, suppose the system knows that the vet recommended that Jones go outside every 2 hours, because Tom received the instructions in his email, and that following the vet’s instructions was important to Tom in the past, because he set medication reminders by using the last email from the vet. It can then deduce that it would be valuable to alert Tom when Jones needs to go outside and to tell him where Jones is in the house, rather than waiting for him to ask.
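The first inference in the Jones example, choosing between the dog and the street based on where the user is, can be sketched in a few lines of Python. This is only an illustrative sketch: the `Context` class, the `CANDIDATES` table, and the `resolve_entity` function are hypothetical names invented here, and a real system would weigh many more perceptual signals than a single location.

```python
from dataclasses import dataclass


@dataclass
class Context:
    # Assumption: the user's location is already available from device sensors.
    location: str  # for example "home" or "car"


# Hypothetical candidate meanings of "Jones", each mapped to the location
# in which that meaning is assumed to be more plausible.
CANDIDATES = {
    "Jones (dog)": "home",
    "Jones Street": "car",
}


def resolve_entity(candidates: dict[str, str], context: Context) -> str:
    """Return the candidate whose typical location matches the context.

    Falls back to the first candidate when nothing matches.
    """
    for candidate, typical_location in candidates.items():
        if typical_location == context.location:
            return candidate
    return next(iter(candidates))


print(resolve_entity(CANDIDATES, Context(location="home")))
```

The fallback to the first candidate is an arbitrary design choice for this sketch; a system could just as reasonably ask the user a clarifying question.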

Perception is the application’s ability to consume, organize, and classify information about the user’s context: physical and digital, current and historical. Perceptual data includes things like location, date, time, mood, expression, environment, physiological responses, connected applications, networks, and nearby devices. The application can also use APIs to stream information about the world, including weather, traffic jams, delays, events, and social media. The more perceptual data a cognitive system can collect, both historically and in the moment, the more insightful and natural its response can be.

Perception is the application’s ability to consume, organize, and classify information about the user’s physical and digital context:

  • Good — The application classifies and organizes information according to its pre-training.
  • Better — The application is able to classify and organize new information from live sources and from what it has learned.
  • Best — The application infers information based on other information. For example, if Mary is in the hospital and her doctors have instructed her to drink fluids, a cognitive system can see that she has a glass that she’s drinking from and can inform her doctors that she is staying hydrated.

Motivation

Understanding motivation gives a cognitive application knowledge about a user’s priorities, goals, and values so that it can customize an insightful response that meets the user’s expectations for interaction with the system. Data that defines a user’s motivation can be gathered from their setup experience, preferences, responses, expressions, and interactions over time. As the user’s interaction history grows, so does the system’s understanding of the user’s needs and behaviors.

Motivation allows the system to understand and prioritize behavioral and personal information about the user that reasoning can then use to create a valuable response. It does this by evaluating the success of past responses when the user was in similar circumstances that defined his needs and values in that moment. For example, a cognitive system can decide not to interrupt a call with a work notification because in the past, the user has dismissed similar notifications when on the phone with his mom. It can choose to alert a user to a news feed he isn’t subscribed to because the user has recently been focusing on this new topic at work. Or it could listen in on a meeting and decide to send the team feedback for improving how they run stand-ups along with the meeting minutes, knowing their manager’s goal is to improve agile practices.
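The notification example above, where the system holds back a work notification because similar ones were dismissed in the same circumstances, can be sketched as a lookup over past outcomes. The function name `should_interrupt`, the history format, the context labels, and the 0.5 threshold are all assumptions made for illustration, not part of the model itself.

```python
def should_interrupt(history, context, threshold=0.5):
    """Decide whether to deliver a notification in the given context.

    history: list of (context_label, was_dismissed) pairs from past
    notifications. Both the labels and the format are hypothetical.
    """
    outcomes = [dismissed for past_context, dismissed in history
                if past_context == context]
    if not outcomes:
        # Assumption: with no evidence for this context, deliver by default.
        return True
    dismissal_rate = sum(outcomes) / len(outcomes)
    return dismissal_rate < threshold


history = [
    ("call_with_mom", True),   # dismissed
    ("call_with_mom", True),   # dismissed
    ("at_desk", False),        # read
]
print(should_interrupt(history, "call_with_mom"))
```

Exact matching on a context label keeps the sketch short; deciding what counts as "similar circumstances" is the harder problem in practice.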

Cognitive systems also must account for why a user is asking a question of that specific system. What would the expectations be for interacting with a system that is built by an athletic clothing brand versus a system that is built by a music-reviews brand? If a user asked each system what to do on a Saturday night, he’d rightfully expect a range of answers based on the values the brand of that system represents. Cognitive systems have goals and values that are defined by their creators that need to be accounted for and expressed in their responses to meet the user’s expectations.

Think of perception and motivation as the core components for making a user feel understood. The system should reflect ways that it knows the user, remembers past interactions, and anticipates needs without being directed. It should reduce friction and cut the number of steps it takes to complete a task. When done well, a cognitive system should feel like it really knows you and understands your needs.

Motivation is the application’s ability to understand the user’s intent, priorities, goals, and values:

  • Good — The application knows the demographic factors and business focuses of its users and surfaces information accordingly.
  • Better — The application identifies individuals and their specific behaviors. It recognizes a user’s emotions and responds with the most appropriate emotion.
  • Best — The application proactively interacts with the user, based on how the user is likely to respond. For example, an assistant knows whom it is assisting and plans ahead to accommodate that person’s needs.

Part 2. Output: Responding naturally

Reasoning

An application can have intelligent interactions with a user simply by providing the ground-truth information that is stored in its knowledge base. However, by using what it has learned about the context of a user, the cognitive application can go beyond a literal translation and respond with a more valuable, big-picture answer.

Reasoning is the application’s ability to have cognitive interactions by considering all of the information available through perception, motivation, and knowledge. Even if a solid knowledge base alone produces intelligent responses, the application does not feel cognitive if it does not also serve the individual and consider their context in some way.

By applying confidence scores to potential answers based on contextual findings and previous interactions, the system can reason about how to compose a response that is personalized and predictive. A single interaction might not always have all of the capabilities in use at the same time, but a cognitive application can compose its response based on what it’s learned to date, then aim to improve that response in the future based on the user’s reaction to that response.
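One simple way to picture confidence scoring is a weighted sum over what the knowledge base, the context, and the interaction history each say about a candidate answer. The sketch below assumes that form; the `Candidate` fields, the weights, and the function names are hypothetical, and the source does not specify how scores are actually computed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    knowledge_score: float  # support from the knowledge base, 0 to 1
    context_score: float    # fit with perception and motivation, 0 to 1
    history_score: float    # success of similar past responses, 0 to 1


def confidence(candidate, weights=(0.5, 0.3, 0.2)):
    """Combine the three scores into one confidence value.

    The weights are illustrative assumptions, not values from the model.
    """
    w_knowledge, w_context, w_history = weights
    return (w_knowledge * candidate.knowledge_score
            + w_context * candidate.context_score
            + w_history * candidate.history_score)


def best_response(candidates):
    """Return the candidate with the highest confidence."""
    return max(candidates, key=confidence)


literal = Candidate("literal answer", 0.9, 0.2, 0.2)
tailored = Candidate("tailored answer", 0.7, 0.9, 0.8)
print(best_response([literal, tailored]).text)
```

With these weights, the tailored answer wins even though the literal answer has stronger support from the knowledge base, which mirrors the point that context can outweigh a purely literal reply.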

For a first-time use case, a cognitive application will not have enough information about the user or the context to form a response relevant to them. This is where the application will need to present a default set of information based on assumptions about the user. For example, if John is a new sales manager at IBM browsing for leads, all the application knows is that John is typically a man’s name, what his role-based goals are, and what IBM’s values are as a company. The app will suggest content that is statistically preferred by sales managers and provide direction in a tone of voice that aligns with IBM’s values. As John clicks options, the app starts to understand his behaviors and preferences. Perception and motivation capabilities collect the contextual information about John so the application can use reasoning to provide personalized and intelligent interactions.

The purpose of reasoning is to intelligently tailor large amounts of information to an individual’s needs and situation. An intelligent response can come from the app’s knowledge base alone, but to feel like it truly understands the user, it has to consider and apply what it’s learned about their context and historic interactions.

Reasoning is the application’s ability to have intelligent interactions based on contextual and historic knowledge of the user:

  • Good — The application will generate predetermined responses that are specific to the domain or targeted problem space. It does not necessarily use perception or motivation capabilities when forming responses, but relies heavily on pre-trained knowledge.
  • Better — The application generates creative responses, and relies on perception, motivation, and knowledge capabilities when forming responses.
  • Best — The application anticipates the user’s needs, responds to the user directly, and makes recommendations that benefit their specific needs and context beyond anything they stated explicitly.

Learning

With every interaction, cognitive applications update their knowledge about a user, new data, and the world based on the user’s response. Maybe the user immediately clicked the link the system suggested, or perhaps they dismissed it without reading the content. Learning, the application’s ability to improve interaction over time, updates a matrix of information about the user, the context, and the app’s expertise and skill set. Cognitive systems are constantly updating the way that they interact with people based on their findings from individual and collective historic experience. They remember past interactions and adjust responses based on those learnings by making adjustments to the confidence scoring of content in the matrix.
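The adjustment to confidence scoring described above can be illustrated with a simple incremental update: each time the user accepts or dismisses a piece of content, its score moves a small step toward 1 or 0. This is a sketch under stated assumptions; the function name `update_confidence`, the learning rate, and the neutral prior of 0.5 are hypothetical, and the source does not say which update rule is used.

```python
def update_confidence(scores, content_id, accepted,
                      learning_rate=0.2, prior=0.5):
    """Nudge the stored confidence for content_id toward the observed outcome.

    scores: dict mapping content ids to confidence values in [0, 1].
    accepted: True if the user engaged with the content, False if dismissed.
    """
    old = scores.get(content_id, prior)  # unseen content starts at the prior
    target = 1.0 if accepted else 0.0
    scores[content_id] = old + learning_rate * (target - old)
    return scores[content_id]


scores = {}
first = update_confidence(scores, "suggested-link", accepted=True)
second = update_confidence(scores, "suggested-link", accepted=False)
print(first, second)
```

A small learning rate makes the score change gradually, so a single dismissal does not erase what earlier interactions suggested.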

Consider a cognitive system that’s intended to act as an educational companion for a child as they progress from elementary to high school. The system would recognize when the child’s interaction capabilities improved as they started making faster decisions or absorbing more complex content. The system could adapt to the child’s evolving needs by changing its tone, providing more challenging queries, and drawing on its knowledge base of childhood development needs to provide customized exercises that support the child’s learning goals.

Through constant interaction and user feedback, a cognitive application learns to train itself for a specific user, increasing the system’s accuracy and value.

The improvement of cognitive systems by learning over time is similar to how humans develop. When first started, the system is learning and absorbing large amounts of new information, but it’s not knowledgeable enough to be personalized, insightful, or predictive. Over time, with more interactions and feedback, the system improves and becomes increasingly intelligent and skilled at perceiving and predicting what users need and value. As it ages, it continues to become progressively more sophisticated and knowledgeable.

Learning is the application’s ability to interpret user responses and apply that knowledge to improving interactions over time:

  • Good — The application allows the user to train pre-packaged information in the interface or code. It does not necessarily train the perception or motivation capabilities, and its purpose is to create more trained knowledge.
  • Better — The application learns through user interaction and behavior and explicit feedback. It trains perception and motivation capabilities.
  • Best — The application updates or trains its own information, without user intervention.
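The Good, Better, and Best levels given for each of the five capabilities suggest one way to approach the earlier question of measuring how cognitive an application is. The sketch below records a level per capability and reports the weakest one. Treating the weakest capability as the overall level is an assumption made here for illustration, as are the names `Level`, `CAPABILITIES`, and `assess`.

```python
from enum import IntEnum


class Level(IntEnum):
    GOOD = 1
    BETTER = 2
    BEST = 3


# The five capabilities named in the communication model.
CAPABILITIES = ("knowledge", "perception", "motivation",
                "reasoning", "learning")


def assess(ratings: dict[str, Level]) -> dict:
    """Summarize a set of per-capability ratings.

    Assumption: the overall level is that of the weakest capability,
    because a wholly cognitive experience needs every capability.
    """
    missing = [c for c in CAPABILITIES if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    weakest = min(CAPABILITIES, key=lambda c: ratings[c])
    return {"weakest": weakest, "overall": ratings[weakest]}


ratings = {
    "knowledge": Level.BEST,
    "perception": Level.BETTER,
    "motivation": Level.GOOD,
    "reasoning": Level.BETTER,
    "learning": Level.GOOD,
}
result = assess(ratings)
print(result["weakest"], result["overall"].name)
```

When several capabilities tie for the lowest level, this sketch reports the first one in the order listed, which is simply where a design team might start improving.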

Conclusion

For those of us designing cognitive human-to-machine experiences, it’s easy to be overwhelmed by the speed and volume of incoming developments (quantum computing, algorithms for nonverbal communication, anthropomorphic embodiments), not to mention the avalanche of speculation around possibilities and implications, both good and bad.

Technology revolutions like the one we’re experiencing today can dazzle and blind us when we begin looking for opportunities to apply them. The more buzz and hype, the greater the pressure to rush into creating something using whatever new features have caught our imagination. The benefit of moments like these comes from failing fast, discovering what we don’t know, and rediscovering what will continue to hold true for years to come.

With the dawn of AI, we need a best-practices guide for what a cognitive relationship with machines — ones that can hold a conversation, interpret our emotions, predict our needs, and draw from the entirety of human knowledge — could and should look like. When we find ourselves in such unfamiliar territory, the best way to begin is by reminding ourselves of the purpose for any technology innovation: to improve the quality of human life.

Keeping that purpose in view is the key to taking a first step in the right direction: asking not what a system could do but what it should do. With AI, we have decades of data and research into human thought processes and communication to use as a blueprint. To simulate human relationships, we begin by observing and better understanding ourselves.

We initiated our research by looking forward, imagining what we wished human-to-machine relationships looked like. We spent time with robots, observing the thoughts, feelings, and expectations they evoked. We realized that the more human-like the computer’s embodiment, the more it’s expected to respond like a real human. Anything less is a disappointment.

Simultaneously, we looked back, pulling research publications from past decades to rediscover what science could confidently state about the nature of human cognition. What we found were core elements of thought processes and communication that computers would need to simulate to develop a cognitive relationship with humans. We evolved these elements into the components of the human-to-machine communication model. Its purpose is to define a process that can be used to strategically design and measure cognitive interactions rooted in known truths about human needs and values, and so to contribute to global efforts to improve the quality of human life as we enter a new era of technology, relationships, and possibilities.

Jennifer Sukis is a Watson AI Practices Design Principal at IBM based in Austin, TX. The above article is personal and does not necessarily represent IBM’s positions, strategies or opinions.

Jennifer Aue
IBM Design

AI design leader + educator | Former IBM Watson + frog | Podcast host of AI Zen with Andrew and Jen + Undesign the Grind