Sentences & Notations

Published in

OpenLink Software Blog

3 min read · Jul 20, 2016

Wrapping our minds around the simple notion that a Sentence is a Datum (i.e., a single unit of observation [1]) can be challenging in any context.

Imagine the confusion that arises in the digital realm, where a variety of interchangeable, bi-directionally translatable notations may be used to represent the same sentence. The choice of notation is often based only on the preferences of the writer or, sometimes, the reader.

This confusion is at the root of many issues that have stifled innovation in Artificial Intelligence-oriented realms such as Machine Learning, Natural Language Processing, Entity Extraction, Reasoning & Inference Engines, and modern Deductive Databases (e.g., RDF triple/quad stores).

Examples

Here is one collection of sentences represented using three different notations. English is for humans who speak and read it. RDF-Turtle and JSON-LD serve both humans and machines that understand those particular notations.

English Notation Example

I am a Person.
My name is “Kingsley Uyi Idehen”, in English.
I am the main entity of the page at <https://twitter.com/kidehen>.
I am identified by the following pages: <https://www.facebook.com/kidehen>,
<https://www.twitter.com/kidehen>,
<https://plus.google.com/112399767740508618350>.

RDF-Turtle Notation Example (using Nanotation)

{
<#i> a schema:Person .
<#i> schema:name "Kingsley Uyi Idehen"@en .
<#i> schema:mainEntityOfPage <https://twitter.com/kidehen> .
<#i> schema:sameAs <https://www.facebook.com/kidehen> ,
                   <https://www.twitter.com/kidehen> ,
                   <https://plus.google.com/112399767740508618350> .
}

The above RDF-Turtle, visualized via the OpenLink Structured Data Sniffer:

Result of Processing RDF Language sentences created using RDF-Turtle Notation

JSON-LD Notation (using Nanotation)

## JSON-LD Start ##
{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "schema": "http://schema.org/"
  },
  "@id": "https://medium.com/p/8b090cf28574/edit#i",
  "@type": "schema:Person",
  "schema:mainEntityOfPage": {
    "@id": "https://twitter.com/kidehen"
  },
  "schema:name": {
    "@language": "en",
    "@value": "Kingsley Uyi Idehen"
  },
  "schema:sameAs": [
    {
      "@id": "https://www.facebook.com/kidehen"
    },
    {
      "@id": "https://www.twitter.com/kidehen"
    },
    {
      "@id": "https://plus.google.com/112399767740508618350"
    }
  ]
}
## JSON-LD End ##

The above JSON-LD, visualized via the OpenLink Structured Data Sniffer:

Result of Processing RDF Language sentences created using JSON-LD Notation
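To make the "same sentences, different notation" point concrete, here is a minimal Python sketch that reads the JSON-LD above and reduces it to plain subject, predicate, object sentences, then compares them with the Turtle sentences transcribed by hand. It is not a conforming JSON-LD processor: the function names (expand, sentences_from_json_ld) are hypothetical, it only handles the flat shape used in this post, and it assumes the Turtle <#i> resolves against the same base as the JSON-LD "@id".

```python
import json

SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumption: the Turtle <#i> resolves against the same base as the JSON-LD "@id".
ME = "https://medium.com/p/8b090cf28574/edit#i"

# The Turtle sentences, transcribed by hand as (subject, predicate, object).
# An object is ("iri", value) or ("literal", value, language).
TURTLE_SENTENCES = {
    (ME, RDF_TYPE, ("iri", SCHEMA + "Person")),
    (ME, SCHEMA + "name", ("literal", "Kingsley Uyi Idehen", "en")),
    (ME, SCHEMA + "mainEntityOfPage", ("iri", "https://twitter.com/kidehen")),
    (ME, SCHEMA + "sameAs", ("iri", "https://www.facebook.com/kidehen")),
    (ME, SCHEMA + "sameAs", ("iri", "https://www.twitter.com/kidehen")),
    (ME, SCHEMA + "sameAs", ("iri", "https://plus.google.com/112399767740508618350")),
}

JSON_LD = """
{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "schema": "http://schema.org/"
  },
  "@id": "https://medium.com/p/8b090cf28574/edit#i",
  "@type": "schema:Person",
  "schema:mainEntityOfPage": {"@id": "https://twitter.com/kidehen"},
  "schema:name": {"@language": "en", "@value": "Kingsley Uyi Idehen"},
  "schema:sameAs": [
    {"@id": "https://www.facebook.com/kidehen"},
    {"@id": "https://www.twitter.com/kidehen"},
    {"@id": "https://plus.google.com/112399767740508618350"}
  ]
}
"""


def expand(term, context):
    """Turn a prefixed name such as schema:name into a full IRI."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


def sentences_from_json_ld(doc):
    """Flatten a single, non-nested JSON-LD node into a set of sentences."""
    context = doc.get("@context", {})
    subject = doc["@id"]
    sentences = set()
    for key, value in doc.items():
        if key in ("@context", "@id"):
            continue
        values = value if isinstance(value, list) else [value]
        for v in values:
            if key == "@type":
                sentences.add((subject, RDF_TYPE, ("iri", expand(v, context))))
            elif "@id" in v:
                sentences.add((subject, expand(key, context), ("iri", v["@id"])))
            else:
                literal = ("literal", v["@value"], v.get("@language"))
                sentences.add((subject, expand(key, context), literal))
    return sentences


json_ld_sentences = sentences_from_json_ld(json.loads(JSON_LD))
print(len(json_ld_sentences), "sentences; same as Turtle:",
      json_ld_sentences == TURTLE_SENTENCES)
```

Once both notations are reduced to the same set of sentences, the notation they arrived in no longer matters to the program consuming them.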

What does this mean?

If we can get beyond the distraction of squabbling over specific notations, we may realize a powerful yet underutilized frontier, in which we are able to communicate with Digital Agents (Bots) using language that possesses the same information delivery prowess as English — one of many notations used by humans to encode and decode information.

Typical Software (Basic Digital Agents) can be transformed into highly intelligent Enhanced Software (Smart Agents), simply by building in an ability to understand basic sentences, such as those above.
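As a rough illustration of what "understanding basic sentences" could look like at its simplest, the following Python sketch stores sentences as subject, predicate, object statements and answers two kinds of questions about them. The SentenceStore class and its tell, ask, and who methods are hypothetical names invented for this example, and the "understanding" here is plain lookup with no reasoning or inference.

```python
from collections import defaultdict


class SentenceStore:
    """Hypothetical helper: remembers sentences and answers simple lookups."""

    def __init__(self):
        self._by_subject = defaultdict(set)  # (subject, predicate) -> objects
        self._by_object = defaultdict(set)   # (predicate, object) -> subjects

    def tell(self, subject, predicate, obj):
        """Record one sentence."""
        self._by_subject[(subject, predicate)].add(obj)
        self._by_object[(predicate, obj)].add(subject)

    def ask(self, subject, predicate):
        """What does this subject have for this predicate?"""
        return sorted(self._by_subject.get((subject, predicate), ()))

    def who(self, predicate, obj):
        """Which subjects have this object for this predicate?"""
        return sorted(self._by_object.get((predicate, obj), ()))


agent = SentenceStore()
agent.tell("#i", "rdf:type", "schema:Person")
agent.tell("#i", "schema:name", "Kingsley Uyi Idehen")
agent.tell("#i", "schema:mainEntityOfPage", "https://twitter.com/kidehen")
agent.tell("#i", "schema:sameAs", "https://www.facebook.com/kidehen")
agent.tell("#i", "schema:sameAs", "https://www.twitter.com/kidehen")
agent.tell("#i", "schema:sameAs", "https://plus.google.com/112399767740508618350")

print(agent.ask("#i", "schema:name"))
print(agent.who("schema:sameAs", "https://www.facebook.com/kidehen"))
```

The sentences told to the agent are the same ones written in English, RDF-Turtle, and JSON-LD above; only the entry point changes with the notation.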

Illustration by John O Gorman shared via comment thread

There is a common core of logic upon which all human languages are based. To bring the benefits of this core into the software world, we simply need to stop letting preferences of notation (akin to preferring black ink over blue, or Palatino over Times New Roman) become a distraction.