Generating a Knowledge Graph comprising Linked Data from a Tweet — using Nanotation

Kingsley Uyi Idehen
OpenLink Software Blog
Jun 29, 2017

On a good day, a Knowledge Graph is misperceived as a “nice to have” aspect of the Web that’s rife with pitfalls en route to realization.

To address the problem outlined above, we created a service called URIBurner. This service delivers “deceptively simple” exploitation of Web DNA, in the form of Linked Data, by adhering to the following principles:

  • Entities are named unambiguously using Hyperlinks — providing the time-saving benefit of negating nuances associated with creating and publishing 5-Star Linked Data.
  • Entity Names resolve (when clicked or de-referenced) to Entity Description Documents, providing the benefit of cost-effective Knowledge Graph construction
  • Entity Descriptions take the form of Subject (Entity), Predicate (Attribute), Object (Value) structured sentences, where each component is identified by a Hyperlink, except the Object, which may instead be a Typed or Untyped Literal
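To make the third principle concrete, here is a minimal Python sketch of a subject, predicate, object sentence in which the object may be either a hyperlink or a typed or untyped literal. The class and function names (`IRI`, `Literal`, `Triple`, `to_ntriples`) are illustrative assumptions of mine, not part of URIBurner or Virtuoso; the output follows the standard N-Triples line format.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class IRI:
    """A hyperlink used as an entity, attribute, or value name."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A literal value; datatype=None means an untyped (plain) literal."""
    lexical: str
    datatype: Optional[IRI] = None


@dataclass(frozen=True)
class Triple:
    subject: IRI                # always a hyperlink
    predicate: IRI              # always a hyperlink
    obj: Union[IRI, Literal]    # a hyperlink or a literal


def to_ntriples(triple: Triple) -> str:
    """Serialize one sentence as a single N-Triples line."""
    def term(node: Union[IRI, Literal]) -> str:
        if isinstance(node, IRI):
            return f"<{node.value}>"
        # Escape backslashes first, then quotes and line breaks.
        escaped = (
            node.lexical.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        if node.datatype is None:
            return f'"{escaped}"'
        return f'"{escaped}"^^<{node.datatype.value}>'

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."
```

For example, `to_ntriples(Triple(IRI("http://dbpedia.org/resource/Marketing"), IRI("http://www.w3.org/2000/01/rdf-schema#label"), Literal("Marketing")))` yields one sentence whose object is an untyped literal.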

Anyone using one of our browser extensions, the OpenLink Data Explorer (ODE) or the OpenLink Structured Data Sniffer (OSDS), in tandem with URIBurner (or with their own private instance of the Virtuoso Sponger Middleware Module) can contribute content to a Knowledge Graph through a simple URL pattern and/or mouse-click.

In addition to URIBurner, ODE, and OSDS, we introduced the concept of Nanotation whereby anyone can construct and post Linked Data wherever text content is accepted, using notations such as JSON-LD or RDF-Turtle. This powerful mechanism for Data Definition has the added benefit of Data De-silo-fication, since that Data is no longer held captive by any Social Media service.

Using Nanotation to Generate a Semantic Web of Linked Data from a Tweet — A How-To Guide

Twitter is a spectacularly under-utilized platform. For instance, it provides a global zeitgeist (a/k/a “hey twitter”) platform that’s crowdsourced by its members.

Twitter’s use of # hashtags as topic identifiers and @ handles as agent (person, organization, or bot) identifiers is the key to its nascent role as a highly functional Knowledge Graph enclave.

Here’s a use-case example that demonstrates how Nanotation, URIBurner, ODE, and OSDS collectively leverage Twitter as a Knowledge Graph enclave and launch-point into the larger LOD Cloud Knowledge Graph.

Scenario

Earlier today, I stumbled across a tweet that included an infographic titled “Your Digital Marketing Map”, taken from a similarly titled document.

Of interest to me was the use of the following hashtags: #Ads, #DigitalMarketing, and #SEO. Bearing in mind the uniform use of these tags, I sought to use a reply Tweet to make some explicit statements about what those hashtags identified, with two goals in mind:

  • Describe how the referents of (entities identified by) those hashtags are related, in a structured manner
  • Save these notes to the knowledgebase behind URIBurner (i.e., a Virtuoso multi-model RDBMS instance that naturally handles data represented as RDF sentences)

Steps

(1) I write my reply tweet, using nanotation to describe the entity relationship types that I have in mind, using terms from the SKOS vocabulary — which describes terms for constructing taxonomy trees.

Nanotation-laced Tweet

Nanotation text based on the depiction above, with a few DBpedia integration enhancements:
{
<#DigitalMarketing> a skos:Concept.
<#DigitalMarketing> skos:related dbpedia:SEO, dbpedia:Advertising.
<#DigitalMarketing> skos:broader dbpedia:Marketing .
}
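As a rough illustration of what a nanotation processor has to do with the text above, the following Python sketch pulls the brace-delimited block out of a tweet, splits it into statements, and expands prefixed names into full hyperlinks. It is a deliberately naive reading of the notation, not the Sponger's actual parser: it assumes one statement per line, no literals, and a hand-written prefix table. The `base` argument stands in for the tweet's own URL and is hypothetical.

```python
import re

# Standard namespace IRIs for the prefixes used in the tweet.
# Which prefixes a given service predeclares is an assumption here.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dbpedia": "http://dbpedia.org/resource/",
}

NANOTATION = """{
<#DigitalMarketing> a skos:Concept.
<#DigitalMarketing> skos:related dbpedia:SEO, dbpedia:Advertising.
<#DigitalMarketing> skos:broader dbpedia:Marketing .
}"""


def expand(term, base):
    """Turn a Turtle term into a full IRI string."""
    if term == "a":  # Turtle shorthand for rdf:type
        return PREFIXES["rdf"] + "type"
    if term.startswith("<") and term.endswith(">"):
        inner = term[1:-1]
        # Relative fragment IRIs such as <#Tag> resolve against the base.
        return base + inner if inner.startswith("#") else inner
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local


def parse_nanotation(text, base):
    """Return (subject, predicate, object) IRI tuples found inside { ... }."""
    match = re.search(r"\{(.*?)\}", text, re.DOTALL)
    if match is None:
        return []
    triples = []
    for line in match.group(1).splitlines():
        line = line.strip().rstrip(".").strip()  # drop the statement terminator
        if not line:
            continue
        subject, predicate, objects = line.split(None, 2)
        for obj in objects.split(","):  # a comma repeats subject and predicate
            triples.append(
                (expand(subject, base), expand(predicate, base), expand(obj.strip(), base))
            )
    return triples
```

Calling `parse_nanotation(NANOTATION, "https://example.org/tweet/1")` with a placeholder base produces four sentences, because the comma in the `skos:related` line expands to two statements that share a subject and predicate.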

(2) While viewing my tweet, I can click on the OSDS icon (found in my browser’s toolbar), and OSDS will present a translation of the nanotation I embedded in the tweet.

(3) Alternatively, leveraging the fact that our ODE extension adds a context menu item to my browser, I can place my mouse pointer over the URI that identifies my Tweet and then use the CTRL+Click combination to invoke an Extract, Transform, and Load (ETL) operation on URIBurner, which then returns the page depicted below:

Result of translating the RDF sentences created via nanotation embedded in the tweet
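The "simple URL pattern" mentioned earlier can be pictured as a proxy-style URL: a service prefix followed by the address of the document to describe. The sketch below builds such a URL in Python. The prefix `http://linkeddata.uriburner.com/about/html/` is my assumption about the pattern, and `describe_url` is a hypothetical helper, so check the service's own documentation before relying on either.

```python
from urllib.parse import urldefrag

# Assumed description-page prefix; not confirmed by this post.
DESCRIBE_PREFIX = "http://linkeddata.uriburner.com/about/html/"


def describe_url(document_url, prefix=DESCRIBE_PREFIX):
    """Build a proxy-style URL asking the service to describe a document."""
    # Fragments are never sent to a server, so drop any '#...' part first.
    without_fragment, _fragment = urldefrag(document_url.strip())
    return prefix + without_fragment
```

Passing a tweet's URL to `describe_url` gives an address that can be pasted into a browser, which is handy when no extension is installed.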

(4) The “Browse using” dropdown at the top of that page lets me switch to alternative views that provide additional functionality, such as pivoting from the page describing a single tweet to a page that lists every instance of a Tweet that exists in the URIBurner database.

(5) When I click on the entry labeled “Embedded Turtle Statement 7”, URIBurner returns a page that explicitly reveals the subject, predicate, object structure of that statement.

(6) Bearing in mind my goal of creating a taxonomy tree from the hashtags in the original tweet, I click on the “DigitalMarketing” link, which returns the page below. As you can see, DigitalMarketing is presented as a sub-category of Marketing, in line with my goal.

What’s most important here is the fact that the nature of the relationship type (skos:broader) that connects Marketing and DigitalMarketing is coherent to and understandable by both a human and a piece of RDF-compliant software that understands Relationship Type (Relations) Semantics as defined by a particular vocabulary.
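To show how a piece of software can act on that relationship type, here is a small Python sketch that reads skos:broader sentences and renders the taxonomy tree they imply. The function names and the short entity names in the usage note are illustrative assumptions; only the skos:broader IRI comes from the SKOS vocabulary itself.

```python
from collections import defaultdict

SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def build_children_index(triples):
    """Map each broader concept to the concepts that sit beneath it."""
    children = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == SKOS_BROADER:
            # "subject skos:broader obj" means obj is the parent of subject.
            children[obj].append(subject)
    return children


def render_tree(root, children, depth=0, seen=None):
    """Return indented lines for the tree under root, guarding against cycles."""
    seen = set() if seen is None else seen
    lines = ["  " * depth + root]
    if root in seen:
        return lines  # already expanded once; do not recurse again
    seen.add(root)
    for child in sorted(children.get(root, [])):
        lines.extend(render_tree(child, children, depth + 1, seen))
    return lines
```

Given the single sentence `("DigitalMarketing", SKOS_BROADER, "Marketing")`, `render_tree("Marketing", build_children_index(...))` lists DigitalMarketing indented beneath Marketing, which mirrors the sub-category view described in step (6).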

What happened here?

I took notes in a tweet by treating hashtags and words as terms in subject→predicate→object structured statements, using the digital shorthand provided by nanotation.

Why is this important?

I am able to create notes on a whim that cumulatively enrich the URIBurner database, which in turn enriches the larger LOD Cloud Knowledge Graph. Most important of all, I have a powerful ability to tap into this knowledge using a variety of methods:

  • Keyword Search
  • Precision Find — where I might start with a keyword or a known URI, and use attribute filtering to more specifically home in on what I seek, based on the subject or object roles of all the entity relationships in the URIBurner database
  • SPARQL Query over HTTP
  • SQL Query using an ODBC, JDBC, ADO.NET, or OLE-DB connection
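As an example of the "SPARQL Query over HTTP" option, the sketch below prepares a GET request in the style of the SPARQL Protocol, where the query travels in a `query` parameter and the desired result format is named in the `Accept` header. The endpoint address is an assumption on my part, and `build_sparql_request` is a hypothetical helper. The code only builds the request; nothing is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint address; substitute the one for your own instance.
ENDPOINT = "http://linkeddata.uriburner.com/sparql"

# List concepts together with the broader concept each one falls under.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?narrower ?broader
WHERE { ?narrower skos:broader ?broader }
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Prepare (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})
```

To actually run the query, pass the result of `build_sparql_request(ENDPOINT, QUERY)` to `urllib.request.urlopen` and decode the JSON response body.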

Conclusion

Capturing notes is the most powerful practice I know for skills acquisition and enhancement.

In a world that increasingly demands accumulation and agile exploitation of knowledge, being able to create notes wherever and whenever is priceless!

Imagine what this all means when those notes progressively enhance a Knowledge Graph (be it public, private, or a mix) that fuels Artificial Intelligence through Machine Learning, and beyond.

Kingsley Uyi Idehen is CEO of OpenLink Software (High-Performance Data Centric Technology Providers).