Knowledge graphs or plain ol' snake oil: How would you know?

Mike Dillinger
11 min read · Nov 7, 2023

The overarching goal of knowledge graphs is to store and reproduce conceptual information in a machine-accessible way, as verifiable facts linked to objective data — not only as statements or opinions. We want and need to transform the extremely valuable knowledge in people’s minds into tangible, FAIR, and far more valuable assets that we can accumulate, share, and reuse at scale.

Interested CEOs and investors ask (with good reason): Is this knowledge graph stuff hype or the real deal? Is there really knowledge in the machine or only in the developer’s imagination? Being able to answer these questions clearly and convincingly is, in my mind, a matter of survival for AI and for the semantic technology industry in particular. So it's worth a longer post than usual.

TL;DR. Takeaways for the Hurried and Harried

  • We want and need to transform the extremely valuable knowledge in people’s minds into tangible, FAIR, and far more valuable assets (knowledge graphs) that we can accumulate, share, and reuse at scale.
  • If you need a human to understand and validate what was meant by a fact or triple, then it’s not a knowledge graph.
  • If facts or triples don’t document components and characteristics of things (not strings), then it’s not a knowledge graph.
  • If facts about the components and characteristics of thing concepts are not explicitly supported by evidence or criteria, then it’s not a knowledge graph.
  • Taxonomies and databases are hardest to convert to knowledge graphs when categories and column names (the relations or predicates in knowledge graph triples) are not explicitly defined. Ontologies are easiest; in fact, they're often knowledge graphs in disguise.
  • Knowledge graphs are here, are growing, and are finding ever more use cases.

No definition?!

In spite of its importance, I keep hearing that there is no standard definition of knowledge graph — so people can use the term almost any way they want. This view is incorrect, counterproductive, and dangerous.

  • It’s incorrect because there is a continuing and growing consensus on what knowledge graphs are and aren’t — based on work spanning the last 50 years (see extensive reviews like Hogan et al. (2021)) — even if some non-practitioners aren’t aware of it. I’ve been working to gather and synthesize elements of this consensus in a series of blog posts like this one on LinkedIn.
  • It’s counterproductive because concepts are first and foremost tools. Concepts — in particular what I call keystone concepts — tell us what’s important and what’s not, what to include and what to exclude, what’s constant and what varies. Think of the concept of customer, which drives almost all business processes at a company. If your company doesn’t leverage this tool (i.e., align on a clear, detailed concept of customer), then your marketing, sales, product, and production processes will be all over the map — and your real customers will be confused about your value. Knowledge graph is clearly a keystone concept, so it’s essential that your team aligns on what it means.
  • And it’s dangerous because ill-defined concepts undermine the value of crucial, systematic concepts that happen to have the same label. An ill-defined concept can be (and often is) abused to create exaggerated expectations and misunderstandings, which confuses both investors and consumers. Ill-defined concepts also make us waste time and resources because they are unclear as goals, which endangers the success of your project or company — your keystone concepts are supposed to guide where you focus your efforts or investments.

Criteria

These are the top three criteria that I use to distinguish knowledge graphs from other resources that, while often very valuable, are just impoverished reflections of real knowledge graphs.

1. If you need a human to understand and validate what was meant, then it’s not a knowledge graph.

Knowledge graphs are for computers, so these graphs need to include machine-accessible information about concepts so that computers can understand (even approximately) labels, strings, or statements. Labels or definitions that only humans can read give computers no useful information about concepts, even if they’re great reminders for us. Labeling things just isn’t enough.

Knowledge graphs help us avoid what I call categorization by declaration: How do we know that X is/has Y? Answer: some labeler, taxonomist, or subject matter expert declared it to be so. But when we’re double-checking data quality or when the labeler or expert is absent, this approach leaves us with no record of what the categorization was based on; that information (the concepts) stays in the labeler’s head and never makes it into the computer. So in this case only the humans have the concepts, and without them the computer can’t understand or validate the statements that we store in a graph.

Taxonomies, for example, contain systematic collections of statements like X instanceOf Y or Y subcategoryOf Z, often arranged as trees. But for many reasons, most taxonomies are built by declaration, which severely limits their scale, reliability, and (re-)usability. So all too often the only things we know about X, Y, and Z are their labels (or some only-human-readable definition strings). The intricate, valuable knowledge and reasoning behind these statements stay stuck in the taxonomists’ brains and are not machine accessible. This means that taxonomies are most often not good examples of knowledge graphs, even though reliable taxonomic relations play a crucial role in building and leveraging knowledge graphs. Taxonomies often end up being trees of interconnected but unsupported opinions that require human interpretation, not trees of grounded facts.

Example: tnahpele subcategoryOf lammam

This is what the computer sees in many "knowledge" resources: just strings. It's very important to remember that AIs (and humans!) can’t verify (or leverage reliably) a statement like this until they have access to the concepts that each item denotes — i.e., until the inert strings become concepts.

Contrast this approach with categorization by description: in this case, we know that X is/has Y because we have an explicit checklist of features — triples in a knowledge graph, for example — that describe or define the concepts X and Y. This list is what ontologists call the axiomatization of a concept and what machine learning engineers associate with featurization of an entity. Using this approach, we know which facts the categorization was based on, regardless of who the labeler is or where they are.
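
To make the contrast concrete, here is a minimal sketch in Python. The identifiers, predicates, and features are all illustrative (and biologically oversimplified), not drawn from any real graph:

```python
# Categorization by declaration: one bare, unsupported assertion.
by_declaration = [("elephant", "subcategoryOf", "mammal")]

# Categorization by description: the defining features of "mammal"
# are explicit triples, and "elephant" is shown to satisfy them.
by_description = [
    ("mammal",   "definedByFeature", ("has", "mammary glands")),
    ("mammal",   "definedByFeature", ("has", "fur or hair")),
    ("elephant", "has", "mammary glands"),
    ("elephant", "has", "fur or hair"),
]

def supports_subcategory(graph, x, y):
    """Check whether x exhibits every defining feature of y."""
    defining = {o for (s, p, o) in graph if s == y and p == "definedByFeature"}
    features = {(p, o) for (s, p, o) in graph if s == x}
    return len(defining) > 0 and defining <= features

print(supports_subcategory(by_declaration, "elephant", "mammal"))  # False: no evidence
print(supports_subcategory(by_description, "elephant", "mammal"))  # True: checklist satisfied
```

With the declaration, the computer can only repeat the claim; with the description, it can point to the exact facts the claim rests on.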

A crucial difference between knowledge graphs and other graph-structured data?

Knowledge graphs store concepts as networks of interconnected, grounded facts or features, not of unsupported labels, opinions, or declarations.

The meaning or semantics of each node concept is its graph neighborhood — the relations and other node concepts that are connected to it.
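
As a toy illustration of that view (hypothetical triples):

```python
# A tiny graph of (subject, predicate, object) triples; ids are illustrative.
triples = {
    ("elephant", "subcategoryOf", "mammal"),
    ("elephant", "hasComponent",  "trunk"),
    ("mammal",   "hasComponent",  "mammary glands"),
}

def neighborhood(graph, node):
    """Every triple that touches the node: a crude stand-in for its
    meaning under the 'semantics = graph neighborhood' view."""
    return {t for t in graph if node in (t[0], t[2])}

print(neighborhood(triples, "elephant"))
# The two elephant triples (set order may vary).
```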

2. If graph triples don’t document components and characteristics of things (not strings), then it’s not a knowledge graph.

Knowledge graphs are about world knowledge, not knowledge of the pixels, sound waves, or strings that we use to represent the world. A lexical database or language model — even if structured as a graph — is not a knowledge graph, even though both are extremely helpful for building knowledge graphs.

When we describe a thing or kind of thing that we don’t know the name of, we mention its components as well as characteristics like attributes, behaviors, and relations to other things. This information is what we call a concept: we model concepts as collections of features with certain kinds of information about some thing, regardless of what people call it. So things are separate (and very different) from strings or labels, and knowledge graphs focus on documenting what we know about things, even when they may also include information about labels.

Here’s one litmus test for knowledge graph vs other graph:
If the label is missing (as in the example collection of features below), can you tell which thing or kind of thing is being described? (In the sketch below, items marked with → are pointers to concepts documented elsewhere; items in quotes are string literals.)
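
Here is a hedged sketch of such a label-free collection of features; the specific features are mine, for illustration only:

```
(concept, label withheld)
    subcategoryOf   →mammal
    hasComponent    →trunk
    hasComponent    →tusk
    livesIn         →savanna
    averageMass     "about 6,000 kg"
```

Most readers can name this concept from its features alone, which is exactly the point: the description, not the label, carries the meaning.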

This criterion about the kinds of information needed in a knowledge graph (components and characteristics of things) has significant implications in practice:

  1. We need to unpack concepts and document them piece by piece, or feature by feature. Otherwise, computers have no access to concepts or meaning.
  2. We need to be careful to document and process strings and things separately to capture the facts that a) strings can relate to different concepts, and b) the same concept can be related to different strings. It will likely be best to train strings and concepts as separate modalities in a multimodal LLM framework to make this separation work.
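
Here is a minimal sketch of that separation; all identifiers are hypothetical:

```python
# Strings and things modeled separately.
# a) the same string can point to different concepts...
string_to_concepts = {
    "jaguar": {"concept:big_cat", "concept:car_brand"},
}

# b) ...and the same concept can carry different strings.
concept_to_strings = {
    "concept:big_cat": {"jaguar", "Panthera onca", "onça-pintada"},
}
```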

Another crucial difference between knowledge graphs and other data graphs?

Knowledge graphs are built from explicit, machine-accessible concepts that include information about components and characteristics of things, not strings.

3. If the components and characteristics of concepts are not explicitly supported by evidence or criteria, then it’s not a knowledge graph.

When we say, for example, X subcategoryOf Y, what evidence or criteria — grounding — can we use to verify that? We focus on categorization by description (described above), so we need to specify which kinds of features we will accept as evidence in our concept descriptions.

For knowledge graph concepts, we usually rely on two kinds of features for evidence: sensations and concepts that are unpacked elsewhere.

Sensations for computers are what is sometimes called raw data: pixels from images, frames from video, sound waves from audio, sensors, etc. — or the annotations that we use to describe them. For example, the sensations for concept Q3133 (whatever its label might be) can be described with an annotation such as rgb(0, 255, 0) or by using a canonical swatch or set of swatches as reference. And we have reliable algorithms to compute how similar some other sensation is to Q3133, when we ground the concept in sensations.
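
As one minimal example of such an algorithm: Euclidean distance between RGB annotations. This is a deliberately crude similarity measure; a perceptual space like CIELAB would be more faithful:

```python
import math

# Q3133's grounding as raw sensation data, from the annotation above.
Q3133_RGB = (0, 255, 0)

def rgb_distance(a, b):
    """Euclidean distance between two RGB annotations; smaller = more similar."""
    return math.dist(a, b)

print(rgb_distance(Q3133_RGB, (10, 240, 5)))  # small: a similar sensation
print(rgb_distance(Q3133_RGB, (255, 0, 0)))   # large: a very different one
```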

Concepts for computers are collections of features with certain kinds of information. For example, concept Q3133 can also be described as an example of Q1075 (Q3133 instanceOf Q1075), as representing a culture (Q3133 represents Q24527291), as studied by a particular field (Q3133 studiedBy Q14620), etc., where each of these concepts is in turn unpacked or defined explicitly — not just labeled. So knowledge graph triples are an alternative format for the features that power other approaches to machine learning.
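
Here is a sketch of that equivalence, reusing the triples just listed; the featurize helper is hypothetical:

```python
# The triples from the Q3133 example above.
triples = [
    ("Q3133", "instanceOf", "Q1075"),
    ("Q3133", "represents", "Q24527291"),
    ("Q3133", "studiedBy",  "Q14620"),
]

def featurize(entity, graph):
    """Re-express an entity's outgoing triples as a sparse feature dict,
    the 'featurization' that machine learning engineers work with."""
    return {f"{p}={o}": 1 for (s, p, o) in graph if s == entity}

print(featurize("Q3133", triples))
# {'instanceOf=Q1075': 1, 'represents=Q24527291': 1, 'studiedBy=Q14620': 1}
```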

The features that we collectively call grounding have very important roles:

  1. They describe and define the concepts (i.e., their meaning or semantics) that we want to store and relate.
  2. They distinguish each concept from all others to ensure that it is unique.
  3. They make explicit the evidence or criteria that we need to validate statements about how concepts are related.

Examples for Discussion

Let’s apply these criteria to some examples.

Are terminology databases, dictionaries, glossaries, and encyclopedias knowledge graphs? Terms in term bases are linked to other terms defined elsewhere, so they have an implicit graph structure — it’s likely to be possible to reformat them reliably in triple form, for example. Terminology work consists of carefully and explicitly delimiting concepts in definitions and sometimes documenting taxonomic relations as well — then choosing preferred labels for them (as well as documenting other observed labels for the same concept). Delimiting and documenting concepts is crucial, core knowledge graph development work, but there are two differences:

1. Term concepts are defined using free-form human-readable definitions rather than structured features or triples. So we need humans to interpret and leverage them.

2. Term databases often structure and focus on characteristics of strings — which are not directly relevant for knowledge graphs — in addition to definitions.

I see terminologists and lexicographers as crucial front-line workers for developing and validating reliable knowledge graphs and these resources as extremely valuable starting points.

But until we can re-interpret human-readable definitions reliably as arrays of non-redundant features, we won’t be able to leverage these kinds of human-centric resources effectively.
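
To make the earlier point about reformatting term bases in triple form concrete, here is what it might look like for one hypothetical glossary entry. Note that the definition survives only as a string literal that still needs a human reader:

```python
# A hypothetical term-base entry, reformatted as triples.
entry = {
    "id": "term:0042",
    "prefLabel": "elephant",
    "altLabels": ["pachyderm"],
    "definition": "A very large plant-eating mammal with a trunk.",
    "broader": "term:0007",   # e.g., the entry for 'mammal'
}

triples = [
    (entry["id"], "prefLabel",  entry["prefLabel"]),
    (entry["id"], "broader",    entry["broader"]),
    (entry["id"], "definition", entry["definition"]),  # still just a string
] + [(entry["id"], "altLabel", label) for label in entry["altLabels"]]
```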

Are taxonomies knowledge graphs? Taxonomic information is a crucial part — but only a part — of a knowledge graph, so well-constructed taxonomies should play a very important role in knowledge graph construction. Unfortunately, many taxonomies are built on categorization by declaration: people doing taxonomy work often have little specialized training (they’re “accidental taxonomists,” per Heather Hedden’s apt phrase), have only generic guidelines, only sometimes define key concepts, and have few tools to re-use and double-check their work.

As a result, taxonomies rarely have the explicit definitions that are needed for knowledge graphs.

Are ontologies knowledge graphs? Ontologies are in fact a key subtype of knowledge graph, one that focuses on the extremely valuable relations between category concepts but de-emphasizes individuals. Axioms are assertions that are equivalent to tuples like knowledge graph triples. In more recent work, ontologies in many fields have systematically included individuals as well, so that knowledge graphs and ontologies are now nearly synonymous.
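
For example, a description-logic axiom lines up directly with a triple (class names illustrative; rdfs:subClassOf is the standard RDF Schema property):

```python
# The axiom  Elephant ⊑ Mammal  ('every elephant is a mammal')
# corresponds one-to-one to a knowledge graph triple:
axiom_as_triple = ("Elephant", "rdfs:subClassOf", "Mammal")
```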

Ontologies are the easiest kind of resource to use for building knowledge graphs.

Are language models knowledge graphs? Initially, large language models shared with knowledge graphs only the abstract structure of interrelated nodes; but in these LLMs the nodes were strings, so they were not knowledge graphs and in fact did not contain any concepts. LLMs model sequences of strings exceedingly well (regardless of meaning), with strings as nodes and a single predicate like nextString. But this kind of model has no access to concepts or meanings, only to strings, so no conceptual world knowledge is included and no reasoning over concepts is possible. Predicting the best next string creates sentences that humans enrich with their own concepts, creating the illusion of an AI that understands and manipulates concepts. These older LLMs were neither given conceptual information at training nor induced to build it autonomously — they simply mimicked the human tendency to produce coherent sequences of strings. However, LLMs synthesize so much information that they can in many cases reliably produce knowledge graph triples. We still don’t understand which predicates they handle well and which they do not.
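
A toy version of that string-only model makes the point concrete:

```python
from collections import Counter

# A string-only 'graph': nodes are strings, and the single predicate
# nextString just counts which string followed which in the corpus.
corpus = "the cat sat on the mat".split()
next_string = Counter(zip(corpus, corpus[1:]))

print(next_string[("the", "cat")])  # 1
# Nothing here is grounded in a concept, so the model can rank likely
# continuations but cannot reason about cats, mats, or sitting.
```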

Text-only LLMs model strings, not concepts.

Are multimodal language models knowledge graphs? Multimodal LLMs are created when researchers explicitly train models to pair strings with pixels or swatches (for image generation), strings with notes or audio (for music generation), etc. In these cases the inert strings of earlier models (related only to other strings) become meaningful pointers to things, not strings. The sensory features like pixel values constitute the (distributed) concept that gives a string its meaning in these models. Building multimodal LLMs, however, requires novel kinds of training to capture conceptual knowledge as well as sensations.
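
In data terms, the shift looks roughly like this; these are schematic records, not any real model's training format:

```python
# Text-only training item: a string related only to other strings.
text_only = {"string": "green", "nextString": "light"}

# Multimodal training item: the same string paired with raw sensation
# data, which gives it a (distributed) concept to point to.
multimodal = {"string": "green", "sensation": {"rgb": (0, 255, 0)}}
```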

Multimodal LLMs map strings to concepts created from sensations.

Upshot

I’ve mentioned in earlier posts that I really hate dichotomies like “is or is not a knowledge graph” or human-only vs AI-only. In fact, I strongly suspect that we’d be better off simply banning booleans. So in this post I’ve identified some criteria for a more nuanced, incremental, and explainable way to decide how similar different resources are to knowledge graphs.

We want and need to transform the extremely valuable knowledge in people’s minds into far more valuable, tangible, and FAIR assets that we can accumulate, share, and reuse at scale. And knowledge graphs are already real for a growing number of domains, with growing coverage of the concepts in each one of them.
