What is an Ontology?
The simplest ontology definition you’ll find…or your money back*
This is a short blog post to introduce the concept of an ontology for those who are unfamiliar with the term, or who have previously encountered explanations in the realm of artificial intelligence (AI) that make little or no sense, as I have. I’m aiming to “democratise knowledge of this topic” as one of my colleagues put it.
Having spent the last six months reading around ontologies and artificial intelligence (AI) while at Vaticle, with numerous frustrated and cyclical Google searches, I think I’m in a good position to attempt a simple explanation of an ontology. If, like me, you aren’t a philosopher, mathematician or hard-core computer science PhD, you may be put off by definitions such as this from Wikipedia:
“…an ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. It is thus a practical application of philosophical ontology, with a taxonomy…”
Another example:
“An ontology is a formal, explicit specification of a shared conceptualization.”
Maybe it’s just me, but although I know the meaning of those words, I don’t really understand anything when they’re put together as above. Google isn’t massively helpful either.
To be fair, if you stick with it rather than run away screaming, the Wikipedia article I cited above continues to explain that an ontology is:
“…a model for describing the world that consists of a set of types, properties, and relationship types. There is also generally an expectation that the features of the model in an ontology should closely resemble the real world (related to the object)…”
This makes more sense to me, as I’m familiar with object-oriented C++, and it doesn’t sound that unlike it. I thought I’d illustrate the definition with an example using an ontology in TypeDB to define objects, express their properties, and show how the objects relate to one another.
Still with me? I hope I’ve not slipped into the territory of confusing semantic terminology just yet (by the way, please see the end of this article for more examples of ontological explicative horrors).
The TypeDB Model
In TypeDB, we can create an ontology using the following four concepts:
- Entity: Represents an object or thing, for example: person, man, woman.
- Relation: Represents relationships between things, for example, a parent-child relationship between two person entities.
- Role: Describes the participation of entities in a relation. For example, a marriage relation has the roles of husband and wife.
- Attribute: Represents the attributes associated with an entity, a relation, or even another attribute, for example, a name or date. Attributes consist of primitive types and values, such as strings or integers.
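If, like me, you find it easier to think in a general-purpose programming language, the four concepts can be pictured with a minimal Python sketch. This is only an analogy and not how TypeDB is implemented or used: the class names (Attribute, Entity, Relation), their fields and the sample values are all assumptions I have made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    """A labelled primitive value, for example a name or a date."""
    label: str
    value: object


@dataclass
class Entity:
    """An object or thing, for example a person, that can own attributes."""
    type_label: str
    attributes: list = field(default_factory=list)


@dataclass
class Relation:
    """A relationship in which entities take part through named roles."""
    type_label: str
    role_players: dict = field(default_factory=dict)  # role name -> Entity
    attributes: list = field(default_factory=list)


# Hypothetical sample data, invented purely for this sketch.
mary = Entity("person", [Attribute("firstname", "Mary")])
john = Entity("person", [Attribute("firstname", "John")])

# A relation links the two people; each one plays a named role in it.
parentship = Relation(
    "parentship",
    role_players={"parent": mary, "child": john},
    attributes=[Attribute("birth-date", "1810-01-01")],
)
```

Note that the role is not a property of the person: it only exists inside the relation, which is why it lives in the relation's role_players mapping in this sketch.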
TypeDB has its own declarative graph query language, called TypeQL, for expressing an ontology that can be loaded into the database. Let’s look at a simple example from the TypeDB documentation, which uses genealogy data from a family that lived in the 18th and 19th centuries. Here is the ontology:
define

# Entities
person sub entity,
    plays parentship:parent,
    plays parentship:child,
    owns identifier,
    owns firstname,
    owns surname,
    owns middlename,
    owns age-at-death,
    owns death-date,
    owns gender;

# Attributes
identifier sub attribute, value string;
firstname sub attribute, value string;
surname sub attribute, value string;
middlename sub attribute, value string;
age-at-death sub attribute, value long;
death-date sub attribute, value string;
gender sub attribute, value string;
birth-date sub attribute, value string;

# Relations
parentship sub relation,
    relates parent,
    relates child,
    owns birth-date;
The sub keyword expresses sub-typing, so person sub entity simply states that a person is a sub-type of the built-in TypeQL entity type.
In the ontology above, there is one entity sub-type, or category of object, which is a person. The person can take one of two different roles in a parentship relation with another person entity: a parent or child role. The parentship relation has a single attribute associated with it (a date of birth), while the person entity has a number of attributes, such as names, age and gender.
Of course, this is a contrived example, designed to show the basic elements of a TypeDB ontology. It is possible to build a more extensive hierarchy through inheritance (entities to represent a man or a woman could inherit from an abstract person entity for example) and to introduce additional relations and roles (for example, marriage).
Why Do We Need an Ontology?
We need an ontology to allow TypeDB to discover whether the data has any inconsistencies (also known as validation) and to extract implicit information from data (known as inference).
As an example of validation, consider a glitch in the data whereby a person was incorrectly named as a parent of themselves. In adding the data, TypeDB would spot that a person entity was attempting to take both parent and child roles in a parentship relation, and would flag it up as an inconsistency.
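To make the idea of validation concrete, here is a small Python sketch of the kind of check described above. It is not TypeDB's mechanism, which I don't describe here. The SCHEMA_ROLES table, the validate_relation function and the person identifiers are hypothetical names invented for this example.

```python
# Roles that each relation type declares, mirroring "relates parent" and
# "relates child" in the ontology. Hypothetical structure for this sketch.
SCHEMA_ROLES = {"parentship": {"parent", "child"}}


def validate_relation(relation_type, role_players):
    """Return a list of problems found in one relation instance.

    role_players maps a role name to the identifier of the person playing it.
    An empty list means no inconsistency was found.
    """
    allowed = SCHEMA_ROLES.get(relation_type)
    if allowed is None:
        return [f"unknown relation type: {relation_type}"]

    problems = []
    # Every role used in the data must be declared by the relation type.
    for role in role_players:
        if role not in allowed:
            problems.append(f"{relation_type} does not relate role: {role}")

    # Nobody can be their own parent.
    if relation_type == "parentship":
        parent = role_players.get("parent")
        if parent is not None and parent == role_players.get("child"):
            problems.append(f"{parent} is both parent and child")

    return problems
```

Calling validate_relation("parentship", {"parent": "p1", "child": "p1"}) reports the self-parent glitch, while a relation between two different identifiers comes back clean.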
TypeQL uses automated reasoning to perform inference over the data and ontology, to discover implicit associations. A nice example is to use the gender of a person to infer more specific details of their role in a parentship relation (whether they are the mother or father, or daughter or son).
We can extend the ontology we defined above to show those additional roles:
person sub entity,
    plays parentship:son,
    plays parentship:daughter,
    plays parentship:mother,
    plays parentship:father;

parentship sub relation,
    relates son,
    relates daughter,
    relates mother,
    relates father;
We also need to add to this ontology some “rules” to follow to infer these new roles in this parentship relation. The rules effectively say that, in a parentship relation:
- if the parent is male and the child is male, the roles are father and son.
- if the parent is female and the child is male, the roles are mother and son.
- if the parent is male and the child is female, the roles are father and daughter.
- if the parent is female and the child is female, the roles are mother and daughter.
The rules extend the ontology and are applied by TypeQL across the dataset to infer new knowledge from what is already contained in the data.
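The four rules boil down to two independent lookups, one for the parent and one for the child. Here is a minimal Python sketch of that logic. It only illustrates what the rules conclude, not how TypeQL rules are written or evaluated. The infer_roles function is a hypothetical name, and I am assuming gender is stored as the strings "male" and "female".

```python
# Assumed gender values; the real data may encode gender differently.
PARENT_ROLE = {"male": "father", "female": "mother"}
CHILD_ROLE = {"male": "son", "female": "daughter"}


def infer_roles(parent_gender, child_gender):
    """Return the specific (parent role, child role) pair for a parentship.

    Returns None when either gender is missing or unrecognised, because
    none of the four rules applies in that case.
    """
    parent_role = PARENT_ROLE.get(parent_gender)
    child_role = CHILD_ROLE.get(child_gender)
    if parent_role is None or child_role is None:
        return None
    return parent_role, child_role
```

For example, infer_roles("female", "male") gives the pair mother and son, matching the second rule in the list. The important point is that nothing in the stored data says "mother": it is derived from facts that are already there.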
To find out more about how to build an ontology with TypeDB, we recommend you take a look at our documentation: in particular the Quickstart Tutorial and the Schema Overview. If you’re stuck or need to talk to us, please join our growing TypeDB Discord channel.
*Money back guarantee
I hope this post has simplified a topic that is difficult to describe. As the article is provided for free, I’m sorry that I can’t offer you a refund if I failed. But let me know in the comments.
I’d like to be clear that I mean no disrespect to the original authors of some of the above terrifying explanations of ontologies in artificial intelligence (AI), who are experts in their fields and writing for the cognoscenti. Just for entertainment, here are a few other amazingly opaque definitions. Please hit us up in the comments or tweet @VaticleHQ if we haven’t included your personal favourite…
Gruber 2008: “…an ontology defines a set of representational primitives with which to model a domain of knowledge or discourse.”
Gene Ontology Consortium: “Ontologies are ‘specifications of a relational vocabulary’. In other words they are sets of defined terms like the sort that you would find in a dictionary, but the terms are given hierarchical relationships to one another. The terms in a given vocabulary are likely to be restricted to those used in a particular field or domain, and in the case of GO, the terms are all biological.”
If you liked this post, please take the time to recommend it or leave us a comment.