Common Sense — Still not Common in AI

Sep 4, 2020

Tassilo Klein and Moin Nabi (SAP AI Research)

Deep learning has heralded a new era in artificial intelligence, establishing itself within a short time in integral parts of today’s world. Despite its immense power, often achieving super-human performance at specific tasks, modern AI suffers from numerous shortcomings and is still far away from what is known as artificial general intelligence. These shortcomings become particularly prominent in AI’s limited capability to understand human language. Anyone who has interacted in one way or another with a chatbot or a text generation engine (e.g., OpenAI’s GPT-3) may have noticed that the longer the interaction goes on, the staler it gets. When generating long passages of text, for instance, the output loses consistency and its human feel. Essentially, this highlights that the underlying model does not really understand what it says. Rather, it is more or less walking along paths of statistical patterns of word usage and argument structure, acquired during training by perusing huge text corpora. This rote behavior of replicating statistical patterns reveals the absence of a crucial component: common sense.


But what exactly is common sense? There is no clear definition. It is one of those things we take for granted and only notice when it is missing. Common sense incorporates aspects of literally everything we deal with, ranging from natural laws and social conventions to unwritten rules. The spectrum covered by the concept is therefore quite broad, which explains the fuzziness of its definition. Even though common sense is generic and applies to all kinds of domains, one medium stands out as a popular testbed: natural language. It is therefore no big surprise that injecting common sense into NLP is a fundamental research challenge. And because text processing applications have far-reaching practical implications for consumers, common sense in AI is more than just an academic gimmick.
To better understand why this is the case, let us first look at the shortcomings of current models in more detail.

Why Does Deep Learning Struggle With Common Sense?


The absence of these human reasoning capabilities is precisely what makes machine learning models take shortcuts and behave in seemingly non-intuitive ways. This problem becomes particularly prominent in the presence of infrequent but significant events, for which machines lack generalization schemes. Such events are also referred to as “black swans”, a metaphor that captures the essence of the issue in a figurative fashion: it originates in the long-prevailing assumption in Europe that all swans are white. A system such as a self-driving car AI might only have been exposed to white swans during training. In the absence of sophisticated reasoning mechanisms, the car’s control system may react in a rather unpredictable way when confronted with something new.

Given the near-infinite combinatorial space of concepts in the real world, mastering black swans requires that a model possess a notion of transfer between concepts. Knowing the concept of “animal” with its subgroup “swan”, and the concept of color, it should be able to connect the two without having seen the combination before. Mastering black swans therefore entails acquiring the capability to conceptualize during training, so that concepts can be transferred. However, as the space of combinations is huge, gauging plausibility at inference time is crucial, which connects the problem directly to common sense. Commonsense reasoning, with its inherent ambiguity in terms of concepts and their relationships, is a case in point. To truly reason about common sense, a model has to disentangle concepts and perform compositional inference.

Now that we know a bit more about common sense, its importance, and its intersection with AI: how is common sense actually defined in the AI space? If you expect a crisp definition, you might again be disappointed.
However, one of the first definitions of common sense in AI was put forward by AI pioneer John McCarthy, who also coined the term “artificial intelligence.” In his seminal work “Programs with Common Sense” (1958) he wrote:

“We shall therefore say that a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows.

[…]

Our ultimate objective is to make programs that learn from their experience as effectively as humans do.”

Assessing Commonsense Reasoning

A popular testbed for commonsense reasoning is the Winograd Schema Challenge (WSC): pairs of sentences that differ in a single word, each containing an ambiguous pronoun whose resolution flips with that word. Consider the following pair:

1) The trophy doesn’t fit in the suitcase because it is too small.

2) The trophy doesn’t fit in the suitcase because it is too big.

Answer Candidates: A) the trophy B) the suitcase

In this example, the nouns are “the trophy” and “the suitcase,” with the ambiguous pronoun being “it.” As can be seen, changing the adjective from “too small” to “too big” reverses the direction of the relationship, which makes the task extremely hard. Resolving it entails conceptualizing an item (trophy) and a container (suitcase) via a relation (fitting). Understanding the high-level concepts behind the sentence thus lets us handle all kinds of combinations: replacing the suitcase with some other container, the AI system should still come to the same conclusion.
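
Language-model-based systems typically tackle such schemas by substituting each answer candidate for the pronoun and comparing how plausible the resulting sentences are. The sketch below illustrates only the mechanism: the hand-crafted plausibility table is a stand-in for the probabilities a real language model would supply.

```python
def resolve_pronoun(sentence, candidates, score):
    """Substitute each candidate for the ambiguous ' it ' and keep the best-scoring one."""
    return max(candidates, key=lambda c: score(sentence.replace(" it ", f" {c} ", 1)))

# Hand-crafted stand-in scores; a real system would query a language model instead.
PLAUSIBILITY = {
    "the trophy is too big": 1.0,
    "the suitcase is too small": 1.0,
    "the trophy is too small": 0.1,
    "the suitcase is too big": 0.1,
}

def toy_score(sentence):
    # Score a sentence by the most plausible known phrase it contains.
    return max((v for k, v in PLAUSIBILITY.items() if k in sentence), default=0.0)

s1 = "The trophy doesn't fit in the suitcase because it is too small."
s2 = "The trophy doesn't fit in the suitcase because it is too big."
print(resolve_pronoun(s1, ["the trophy", "the suitcase"], toy_score))  # the suitcase
print(resolve_pronoun(s2, ["the trophy", "the suitcase"], toy_score))  # the trophy
```

Note that the same scoring code resolves both variants correctly only because the plausibility table encodes the relevant world knowledge; acquiring that knowledge automatically is exactly the hard part.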
Now that you are familiar with common sense and a way to test it, we will discuss how common sense reasoning has been approached technically.

Commonsense Reasoning in AI

Approaches to commonsense reasoning in AI can broadly be divided into three categories:

  • Rule and knowledge-based approaches
  • Generic AI approaches
  • AI language model approaches

The current best-performing approaches come from the last category. The underlying assumption of these methods is that their training corpora, such as encyclopedias, implicitly contain some commonsense knowledge that the model can absorb. This assumption is problematic, however, because such texts rarely state commonsense knowledge explicitly, precisely because it is assumed to be trivial. These methods usually follow a two-stage learning pipeline: starting from an initial self-supervised model, commonsense-aware word embeddings are obtained in a subsequent finetuning phase. Finetuning forces the learned embeddings to solve the downstream WSC task as a plain coreference resolution task. Additionally, to fully utilize the power of language models, conventional approaches require annotated training data specifying what is right and wrong. However, the creation of large labeled datasets and knowledge bases is cumbersome and expensive, as it is done manually by experts. This applies particularly to commonsense reasoning, where compiling the complete set of commonsense entities of the world is intractable due to the potentially infinite number of concepts and combinations.
Language models capture the probabilities of word occurrence based on the text they are exposed to during training. Apart from capturing word statistics, neural language models also learn word embeddings, i.e., vector representations of words derived from raw text data. The more recently proposed BERT picks up the notion of language modeling in a slightly different way. Instead of optimizing a standard language-model objective (modeling the probability of a word given its preceding context), BERT has a pseudo-language-model objective. Specifically, BERT leverages what is known as a masked language model, which tries to complete sentences in which randomly selected words have been replaced by a mask (“_____”), as in the following example.

“The trophy does not fit into the suitcase, because ____ is too big.”
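
The masking procedure that produces such training examples can be sketched in a few lines. The version below only replaces selected tokens with a mask token; the actual BERT recipe additionally keeps some selected tokens unchanged or swaps in random words, which is omitted here for brevity.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens with [MASK]; the model must predict the originals."""
    rng = rng or random.Random()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok          # ground-truth token the model has to recover
            masked.append(MASK)
        else:
            masked.append(tok)
    return masked, targets

sentence = "the trophy does not fit into the suitcase because it is too big"
masked, targets = mask_tokens(sentence.split(), mask_prob=0.3, rng=random.Random(1))
print(" ".join(masked))
```

Because the targets are derived from the text itself, no manual labels are needed, which is what makes this objective self-supervised.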

To solve this task, the model trains a so-called attention mechanism. It provides cues as to which words the model should pay more attention to when solving a task; in the preceding example, more attention to the word “trophy” than to “suitcase”, because the trophy being too big is the more plausible reading. However, as we will see shortly, filling in words like this is particularly challenging due to the inherent ambiguity, and it effectively requires a notion of common sense. Apart from improving model performance, self-attention is also said to provide insights into a model’s inner workings, a desirable property given that deep learning is often derided as a black box. In addition to masked word prediction, training BERT entails another auxiliary classification task: a binary objective predicting whether two sentences are consecutive. Taken together, this yields embeddings that can easily be transferred to a wide range of downstream tasks via finetuning, which propelled the field of NLP into a new era.
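
Under the hood, these cues come from scaled dot-product attention: every token is compared with every other token, and the resulting weights determine how much each token contributes to the others’ representations. Below is a minimal NumPy sketch with toy dimensions; the matrix names and sizes are illustrative, not BERT’s actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X (n x d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise token-to-token relevance
    weights = softmax(scores, axis=-1)        # each row sums to 1: how strongly a
                                              # token attends to every other token
    return weights @ V, weights

rng = np.random.default_rng(0)
n, d = 5, 8                                   # toy sizes: 5 tokens, 8-dim embeddings
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Inspecting the rows of `weights` is what visualizations of "what the model attends to" are built on.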

What’s up next?

References:
Gregory L. Murphy, “The big book of concepts”, 2002
John McCarthy, “Programs with common sense”, 1958

SAP AI Research

We are a part of SAP Artificial Technologies and this is our blog where we write about our current machine learning research projects and share the latest news.

Written by Tassilo Klein, Senior Researcher at SAP ML Research, Berlin
