What is Semantic AI? Is it a step towards Strong AI?

Puneet Agarwal
8 min read · Oct 23, 2020


Modern artificial intelligence can decide on its own whether it should use the width of a person’s lips to detect a smile, or some other factor, or a combination of multiple factors (referred to as representation learning). This and a few other achievements of modern AI (such as reinforcement learning) have prompted people to re-think whether Artificial General Intelligence (AGI, or Strong AI) can actually be achieved anytime soon. No wonder many articles have been published on this topic recently, in Nature [1], Forbes [2], and by McKinsey [3], among others. These articles argue that AGI is far from becoming reality anytime soon. After reading this blog, you will see why they say so, and also learn about a new and emerging form of artificial intelligence, “Semantic AI”, which I believe is a step ahead of the current form of AI (weak AI).

In this article, I first share a perspective on the need for Semantic AI in an enterprise context and draw an analogy to “Reading Comprehension” (M1). Next, we will see how humans perform such tasks (M2, optional read), and then understand how Semantic AI attempts to solve such problems (M4). In between, a brief background on Knowledge Graphs will also be covered (M3).

M1: Comprehension in Enterprise IT:

Similar to reading comprehension against a small paragraph (as we did in school), in enterprises we need to answer ad-hoc questions against business operations. For example:

a) Document Comprehension: A rather well-known business demand is: “Can a system read a document, or a set of documents, and answer questions?” For example, design documents of a large ERP system; process documents of a procurement division, such as quotations, purchase orders, delivery notes, invoices, etc.; or, in the pharmaceutical domain, documents containing information such as the usage, side effects, and dosage of a drug. Not only these semi-structured documents, but also unstructured documents such as security or HR policy documents.

Table 1: Sample Questions for Enterprise Comprehension

b) Drawing Comprehension: Instead of a paragraph or a document, can a system read a complex industrial drawing and answer questions? For example, see the questions given in Table 1 against the very simple drawing shown in Figure 1. Similarly, can we answer such questions against a set of complex drawings, even when certain information is not explicitly available in the drawing?

Figure 1: House Floor Plan (image taken from https://commons.wikimedia.org/wiki/File:3-bedrooms_house_floor_plan.png; the word “DINING” was removed from the drawing to make a point in this blog)

c) Department Comprehension: Can there be a system that knows all the people and their roles, as well as the processes followed in a department? Can this system answer questions against that department? See the questions in Table 1 against the IPR Cell (the department that is the custodian of patents) of an enterprise.

d) Infrastructure Comprehension: Can a system comprehend all the IT systems involved in a large infrastructure that provides a special service, for example, telecom services, port operations, or airline operations? Can the system then answer ad-hoc questions that are often required in contact-center operations of the related service? Can a system answer questions on functional matters of port operations?

e) Business Operation Comprehension: Can a system read all the IT systems involved in a business function, as well as all the related documents (and read people’s minds too, if required), to answer questions against that business function, such as the procure-to-pay function of F&A, payroll processing, or other similar functions?

M2: Comprehension via Natural Intelligence (optional read)

Enterprise Comprehension tasks, such as those described above, are often performed by subject matter experts (SMEs). They first gather the necessary data, analyze the situation, and draw inferences.

Key elements of human intelligence are knowledge, sensory perception, linguistic comprehension, and logic. Our ability to use all of these together to draw an inference, and our ability to learn newer things very fast, help us perform comprehension-related tasks with ease. Let’s look at some of the key components of natural intelligence in detail:

A) Knowledge: (Probably related to memory, as studied in epistemology.) We structure our knowledge in our own way when storing it in our memory. We can a) retrieve facts from that knowledge, and b) correlate newer knowledge with prior knowledge. Most of us will agree that we associate a “type” with every new artifact or entity we learn about. For example, if we hear about a new person, say “Geoffrey Hinton”, we associate him (the entity) with many entity-types, such as “Person”, “Male”, “Scientist”, etc. When we use such knowledge with logic, we are able to draw inferences.

B) Pattern Recognition: For drawing comprehension, humans can easily identify the patterns of shapes that indicate a sofa, a dining table, etc. Humans can then draw inferences using background knowledge to answer questions such as those listed in Table 1.

C) Logic: A lot has been written about logic in philosophy; however, three common forms are also applicable in artificial intelligence: deductive, abductive, and inductive. We (humans) use these three forms interchangeably to draw inferences or to learn newer tricks for performing a task.
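To make the deductive form concrete, here is a minimal sketch of forward-chaining deduction over a handful of facts. The facts, relation names, and the single rule are invented for illustration; they are not from the article.

```python
# Illustrative facts as (subject, relation, object) triples.
facts = {("Hinton", "is_a", "scientist"), ("scientist", "subtype_of", "person")}

def deduce(facts):
    """Repeatedly apply one deductive rule until no new facts appear:
    if X is_a T and T subtype_of S, then X is_a S."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for (x, r1, t) in facts:
            if r1 != "is_a":
                continue
            for (t2, r2, s) in facts:
                if r2 == "subtype_of" and t2 == t and (x, "is_a", s) not in facts:
                    new.add((x, "is_a", s))
        if new:
            facts |= new
            changed = True
    return facts

print(("Hinton", "is_a", "person") in deduce(facts))  # True
```

Abduction (guessing the most plausible explanation) and induction (generalizing from examples) are harder to mechanize; this sketch only covers the deductive case.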

M3: Background on Knowledge Graphs (KG):

A Knowledge Graph comprises entities and relationships. For example, in “Hinton lives-in Toronto”, “Hinton” and “Toronto” are entities and “lives-in” is a relationship between them. A knowledge graph is a network of many such entities and relationships. Key features of a knowledge graph are (also shown in Figure 2):

Figure 2: Key Features of Knowledge Graphs with respect to relational databases (RDBMS)

1. An entity can have more than one entity-type. As we saw in the context of natural intelligence, “Geoffrey Hinton” is a “person”, a “scientist”, a “male”, etc.

2. Relationships, between a pair of entities, have associated textual labels.

3. There can be more than one relationship between a pair of entities, e.g., “Nicolas Cage acted-in Lord of War” and “Nicolas Cage produced Lord of War” indicate two relationships between the same pair of entities.

4. An entity can have many different values of the same attribute, e.g., there can be many different authors of a document.

5. Knowledge graphs have a provision for an ontology, which hierarchically defines the entity-types, e.g., an “actor” is an “artist”, who is a “person”.
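The features above can be illustrated with a minimal in-memory knowledge graph stored as a set of (subject, predicate, object) triples. The entity and relation names below reuse the article’s examples; the `query` helper is an invented convenience, not a real KG API.

```python
# Minimal knowledge graph as a set of (subject, predicate, object) triples.
kg = {
    ("Geoffrey Hinton", "type", "person"),
    ("Geoffrey Hinton", "type", "scientist"),     # feature 1: multiple entity-types
    ("Nicolas Cage", "acted-in", "Lord of War"),  # feature 3: multiple relationships
    ("Nicolas Cage", "produced", "Lord of War"),  #   between the same pair of entities
    ("actor", "subclass-of", "artist"),           # feature 5: ontology hierarchy
    ("artist", "subclass-of", "person"),
}

def query(kg, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [(ts, tp, to) for (ts, tp, to) in kg
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Both relationships between Nicolas Cage and Lord of War:
print(query(kg, s="Nicolas Cage", o="Lord of War"))
```

In a real deployment the same pattern-matching idea is what a SPARQL engine (or a property-graph query language) provides over a persistent triple store; the in-memory set is only a sketch of the data model.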

M4: Comprehension via Semantic AI

Semantic AI provides a framework to perform complex end-to-end tasks automatically. It uses many different machine learning and logic-based approaches, and also utilizes background knowledge, often stored in knowledge graphs. Semantic AI is a key enabler for a Digital Twin. Using this technology, we can convert an enterprise from its current state of being driven by tacit information (implicitly available or stored) to a desired state of being driven by explicit knowledge, which can propel its journey towards Business 4.0.

Figure 3: Different Shapes, often used, to indicate dining table in a 2D house layout

Pattern Recognition via Artificial Intelligence: We can use machine learning to detect basic shapes such as lines, circles, rectangles, and arcs, or to directly identify frequently occurring complex shapes such as the one for a dining table. Further, there can be many variations of a complex shape, as shown in Figure 3 for a dining table. Modern AI can identify the dining table despite such variations with high accuracy, albeit with significant supervision. This is probably one of the reasons why this form of AI is considered weak or narrow AI.
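The supervision point can be seen even in a toy setting: a nearest-neighbour classifier only recognizes a dining-table variant because each variant was labelled by a human. The shape features (vertex count, aspect ratio) and the training examples below are entirely made up for the sketch; a real system would learn from rendered drawings.

```python
import math

# Hand-labelled shape descriptors: (vertex_count, aspect_ratio) -> label.
training = [
    ((4, 1.8), "dining-table"),   # long rectangle variant
    ((4, 2.2), "dining-table"),
    ((8, 2.0), "dining-table"),   # rectangle with chair bumps
    ((12, 1.0), "sofa"),
    ((10, 1.1), "sofa"),
]

def classify(features):
    """1-nearest-neighbour over the labelled shapes."""
    return min(training, key=lambda ex: math.dist(ex[0], features))[1]

print(classify((4, 2.0)))  # nearest labelled shape is a dining-table variant
```

Every new variant (a round table, a table drawn at an angle) needs fresh labels before it is recognized, which is exactly the “significant supervision” the paragraph above refers to.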

How does Semantic AI work? The key steps involved in Semantic AI, as I have observed, are (also shown in Figure 4):

Figure 4: Steps involved in Semantic AI: Detection, Interpretation, Reconciliation, and Action

1. Detection: First, we parse an input (document, image, or drawing) with respect to the background knowledge and create a structured representation of the input in another knowledge graph. Here, most often, the objective is to detect basic elements directly available in the input file, e.g., lines and circles in a drawing, or words used to mention an entity in a sentence. Sometimes we use supervised models (artificial intelligence for pattern recognition) to directly recognize complex shapes such as a dining table. In the context of a document, in the detection step we identify phrases that indicate an entity. As a result of parsing, we get a structured representation of the input, which can be referred to as the parsed-KG if stored in a KG.

2. Interpretation: We then interpret the parsed-KG to establish newer insights, i.e., new entities. We associate the parsed entities with entity-types already present in the knowledge graph. We also correlate the detected entities present in the parsed-KG with each other and with the entities of the background-KG. For example, some of the rectangles are associated with the entity-type “room”. We also link various rooms with each other using relationships such as “left-of”, “right-of”, etc. Sometimes deductive logic is the best approach, and sometimes pattern recognition is also required for interpretation, such as in natural language processing. The evolved parsed-KG is referred to as the inferred-KG.

3. Correlation & Reconciliation: The input drawings do not explicitly contain all the information; some of it has to be inferred and reconciled based on prior knowledge and rules of thumb. For example, can we infer the size of the living room based on the information present? We also need to reconcile and ensure that all inferred facts are in agreement with each other. Another example of correlation: using background knowledge and deductive logic, we associate a “room” that contains the dining table with another entity-type, “dining room”. We often use background knowledge and logical reasoning in this step. The inferred-KG thus becomes the comprehended-KG after this step.

4. Action: The comprehended-KG can be used by humans or by another IT system for the amplification of SMEs or the automation of a task. Information retrieval is a critical component of the action step. Knowledge Retrieval: Oftentimes, there is a need for compositional reasoning [4] to retrieve specific information from a knowledge graph. We convert a natural language question into a sketch of a structured query (SQL, SPARQL, etc.) and execute it on the knowledge graph to retrieve the desired information.
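The four steps above can be walked through end to end on a toy “drawing”. Everything here is invented for illustration: the input format, the entity names, the left-of heuristic, and the dining-room rule stand in for the far richer parsers, models, and ontologies a real Semantic AI system would use.

```python
# --- 1. Detection: parse primitives of the input into a parsed-KG -----------
drawing = [
    {"id": "r1", "kind": "rect", "x": 0, "y": 0, "w": 5, "h": 4},
    {"id": "r2", "kind": "rect", "x": 5, "y": 0, "w": 3, "h": 4},
    {"id": "t1", "kind": "dining-table-shape", "inside": "r2"},
]
parsed_kg = set()
for el in drawing:
    parsed_kg.add((el["id"], "detected-as", el["kind"]))
    if "inside" in el:
        parsed_kg.add((el["inside"], "contains", el["id"]))

# --- 2. Interpretation: promote primitives to typed entities -----------------
inferred_kg = set(parsed_kg)
for (s, p, o) in parsed_kg:
    if p == "detected-as" and o == "rect":
        inferred_kg.add((s, "type", "room"))        # rectangle -> entity-type "room"
geom = {el["id"]: el for el in drawing if el["kind"] == "rect"}
for a in geom:
    for b in geom:
        if a != b and geom[a]["x"] + geom[a]["w"] == geom[b]["x"]:
            inferred_kg.add((a, "left-of", b))      # spatial relationship

# --- 3. Correlation & Reconciliation: apply background knowledge -------------
comprehended_kg = set(inferred_kg)
for (s, p, o) in inferred_kg:
    # rule of thumb: a room that contains a dining-table shape is a dining room
    if p == "contains" and (o, "detected-as", "dining-table-shape") in inferred_kg \
            and (s, "type", "room") in inferred_kg:
        comprehended_kg.add((s, "type", "dining-room"))

# --- 4. Action: answer a question via a structured query over the KG ---------
def left_of_type(kg, room_type):
    """Which room is left of a room of the given type?"""
    for (s, p, o) in kg:
        if p == "left-of" and (o, "type", room_type) in kg:
            return s
    return None

print(left_of_type(comprehended_kg, "dining-room"))  # r1
```

Note that the final answer is recoverable even though “dining room” was never written in the input, which mirrors the article’s point about the removed “DINING” label in Figure 1: the label is inferred from background knowledge, not read off the drawing.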

The above steps can help us perform drawing and/or document comprehension. However, for comprehension against a department, infrastructure, or business function, we need to gather, integrate, and clean the data from multiple systems, documents, and people. It becomes more complex because often only partial information is allowed to be stored in the knowledge graph. I will probably touch on such aspects in future articles.

Conclusion: In summary, Semantic AI is a key enabler for comprehension over large artifacts (rather than a single paragraph), such as the business operations of an enterprise. A software architecture that combines statistical approaches with background knowledge to draw higher-level inferences can be called Semantic AI, which is probably a step towards a stronger form of artificial intelligence, though still far away from Strong AI.

Hope you enjoyed reading this as much as I enjoyed writing it. Thanks for reading, and please share your comments. Puneet Agarwal

References

1. Ragnar Fjelland, “Why general artificial intelligence will not be realized”, Nature, June 2020; (URL: https://doi.org/10.1057/s41599-020-0494-4)

2. Navin Joshi, “How Far Are We From Achieving Artificial General Intelligence?” Forbes, June 2019; (URL: https://www.forbes.com/sites/cognitiveworld/2019/06/10/how-far-are-we-from-achieving-artificial-general-intelligence/)

3. Federico Berruti, Rob Whiteman, “An executive primer on artificial general intelligence”, McKinsey & Company, April 2020; (URL: https://www.mckinsey.com/business-functions/operations/our-insights/an-executive-primer-on-artificial-general-intelligence)

4. Puneet Agarwal, Maya Ramanath, Gautam Shroff, “Retrieving Relationships from a Knowledge Graph for Question Answering”, ECIR 2019.



Puneet Agarwal

Head Semantic AI and Principal Scientist, Tata Consultancy Services Ltd.