What about Juan? What do we really understand about him?

Data representation using JSON vs. XML: is there a difference? Yes!
Why have an extra step if your use case for customer information is Web Apps, Cloud Analytics, Marketing Stacks, REST APIs, Customer Data Platforms, Big Data, AI, etc.?

I have recently had conversations that reminded me that IT often does not think about, or know of, what lies beyond XML for data representation and system interchange.

That, coupled with a recent post by Kurt Cagle, “JSON-LD rewrites the Semantic Web”, made me decide to provide examples of what is possible when you think outside of the boX.

In the process, I will address several topics relevant to IT developers, architects, and directors including:

How does JSON format aid in managing and interchanging information?

Why do so many developers not know about it and its advantages?

What is JSON-LD in the context of Schema.org?

What is magical about Linked Data, Semantic Web and RDF formats?

What are Perse and MeGraph and why do I care?


Let’s start with an introduction to my subject Juan…

Juan Kristoff Canadian, aka Kris

What about Juan? Juan is a fictitious contact used to demonstrate alternative data formats that are commonly used to integrate systems, to mark up or tag content with metadata, and, increasingly, to add context and meaning to text-based information.

The contact information for today's examples is:

Contact Info record

Name: Juan 'Kris' Canadian
Full Name: Mr. Juan Kristoff Canadian Esq.
Address: Delavan, WI 53115
Location: Delavan, Wisconsin, United States
Job Position: Information Architect to the stars
Organization: Northeastern Life
Home Phone: +1-425-123-4567
Work Email: juankristoff@example.com
Birthdate: January 1, 1970
Photo: http://www.example.com/dir_photos/my_photo.gif

Data representation and formats are key to integrating systems, or simply to exchanging information between them.

Let’s start simple with a common format for exchanging contact information.

The contact information for Juan can be published as a Virtual Card File (vCard) and imported into almost every contact management tool imaginable. As the first example of how contact information can be formatted, here is a vCard text file:

vCard

BEGIN:VCARD
VERSION:3.0
FN:Mr. Juan Kristoff Canadian Esq.
N:Canadian;Juan;Kristoff;Mr.;Esq.
NICKNAME:Kris
TEL;TYPE=WORK:+1-425-123-4567
ADR;TYPE=HOME:;;;Delavan;WI;53115;United States
ORG:Northeastern Life
TITLE:Information Architect to the stars
BDAY:1970-01-01
EMAIL;TYPE=INTERNET;TYPE=WORK:juankristoff@example.com
PHOTO;VALUE=URI:http://www.example.com/dir_photos/my_photo.gif
END:VCARD
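To make the shape of those lines concrete, here is a minimal sketch in Python that splits a vCard content line into its property name, parameters and value. It assumes one property per line, with no folded lines, property groups or quoted parameter values, all of which real vCard files may contain. The function name parse_vcard_line is made up for this illustration.

```python
def parse_vcard_line(line):
    """Split one unfolded vCard content line into (name, params, value)."""
    # Everything before the first colon is the property name plus parameters.
    head, _, value = line.partition(":")
    name, *raw_params = head.split(";")
    params = {}
    for raw in raw_params:
        key, _, val = raw.partition("=")
        # A parameter such as TYPE may repeat, so collect its values in a list.
        params.setdefault(key.upper(), []).append(val)
    return name.upper(), params, value


sample = """BEGIN:VCARD
VERSION:3.0
FN:Mr. Juan Kristoff Canadian Esq.
N:Canadian;Juan;Kristoff;Mr.;Esq.
EMAIL;TYPE=INTERNET;TYPE=WORK:juankristoff@example.com
END:VCARD"""

card = [parse_vcard_line(line) for line in sample.splitlines()]
for name, params, value in card:
    print(name, params, value)
```

The structured N value is still one string here; splitting it on ";" yields the family name, given name, additional name, prefix and suffix in that order.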

The vCard standards also define several other formats for the same information. For instance, an xCard is an eXtensible Markup Language (XML) document:

xCard

<?xml version="1.0" encoding="UTF-8"?>
<vcards xmlns="urn:ietf:params:xml:ns:vcard-4.0">
<vcard>
<n>
<surname>Canadian</surname>
<given>Juan</given>
<additional>Kristoff</additional>
<prefix>Mr.</prefix>
<suffix>Esq.</suffix>
</n>
<fn>
<text>Mr. Juan Kristoff Canadian Esq.</text>
</fn>
<org>
<text>Northeastern Life</text>
</org>
<title>
<text>Information Architect to the stars</text>
</title>
<photo>
<parameters>
<mediatype>
<text>image/gif</text>
</mediatype>
</parameters>
<uri>http://www.example.com/dir_photos/my_photo.gif</uri>
</photo>
<tel>
<parameters>
<type>
<text>work</text>
<text>voice</text>
</type>
</parameters>
<uri>tel:+1-425-123-4567</uri>
</tel>
<adr>
<parameters>
<label>
<text>Delavan, WI 53115
United States</text>
</label>
<type>
<text>home</text>
</type>
<pref>
<integer>1</integer>
</pref>
</parameters>
<pobox/>
<ext/>
<street></street>
<locality>Delavan</locality>
<region>WI</region>
<code>53115</code>
<country>United States</country>
</adr>
<bday>
<date-time>19700101T000000Z</date-time>
</bday>
<email>
<text>juankristoff@example.com</text>
</email>
<rev>
<timestamp>20180424T195243Z</timestamp>
</rev>
</vcard>
</vcards>

XML and its sister technologies, XML Schema and Extensible Stylesheet Language (XSL), first appeared about 20 years ago. My first reaction was, wow, this solves so many problems!

Since that time, XML has become dominant in many system development areas. It is especially used and useful in system integration tools and enterprise application integration platforms. Many enterprise programming languages (.NET, Java, etc.) and stacks support XML extensively.

But what I have found is that many vendors and enterprise integration experts have been slow to adopt lighter-weight scripting approaches, especially JavaScript-based solutions.

As the saying goes, when you have a hammer… you use it to beat on everything, or something similarly dramatic.

In addition to XML, vCard defines a JavaScript Object Notation (JSON) format called jCard. JSON is particularly useful for JavaScript language apps and scripting environments because it’s the native serialization for JavaScript objects.

It is lighter character-wise than XML in most cases when exchanging information, and it does not require function libraries for import or export in JavaScript: it IS JavaScript (the built-in JSON.parse and JSON.stringify are all you need).
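As a rough illustration of the "lighter character-wise" point, the sketch below serializes the same three fields as compact JSON and as XML and compares the lengths. It is written in Python using only the standard library. The flat record layout and the contact root element are assumptions made for this comparison; they are not the xCard or jCard structures.

```python
import json
import xml.etree.ElementTree as ET

# A small, flat record; the field names are illustrative only.
record = {
    "given": "Juan",
    "surname": "Canadian",
    "email": "juankristoff@example.com",
}

# Serialize the record as compact JSON (no spaces after separators).
as_json = json.dumps(record, separators=(",", ":"))

# Serialize the same fields as child elements of a <contact> root.
root = ET.Element("contact")
for key, value in record.items():
    ET.SubElement(root, key).text = value
as_xml = ET.tostring(root, encoding="unicode")

print(len(as_json), as_json)
print(len(as_xml), as_xml)
```

XML repeats every field name in a closing tag, which is where most of the extra characters come from. The gap varies with the data, so treat this as a tendency rather than a rule.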

For example, a jCard version of Juan’s contact information is:

jCard

["vcard",
[
["version", {}, "text", "4.0"],
["n", {}, "text", ["Canadian", "Juan", "Kristoff", "Mr.", "Esq."]],
["fn", {}, "text", "Mr. Juan Kristoff Canadian Esq."],
["org", {}, "text", "Northeastern Life"],
["title", {} ,"text", "Information Architect to the stars"],
["photo", {"mediatype":"image/gif"}, "uri", "http://www.example.com/dir_photos/my_photo.gif"],
["tel", {"type":["work", "voice"]}, "uri", "tel:+1-425-123-4567"],
["adr",
{"label":"\nDelavan, WI 53115\nUnited States", "type":"home", "pref":"1"},
"text",
["", "", "", "Delavan", "WI", "53115", "United States"]
],
["bday", {}, "date-time", "1970-01-01T00:00:00Z"],
["email", {}, "text", "juankristoff@example.com"],
["rev", {}, "timestamp", "2008-04-24T19:52:43Z"]
]
]
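Because a jCard is plain JSON, reading it takes nothing more than a JSON parser. The sketch below uses Python's standard json module on a shortened copy of the jCard above and builds a lookup by property name. It assumes each property name appears only once; a card with two tel entries would need a list per name instead.

```python
import json

jcard_text = """
["vcard",
  [
    ["version", {}, "text", "4.0"],
    ["n", {}, "text", ["Canadian", "Juan", "Kristoff", "Mr.", "Esq."]],
    ["fn", {}, "text", "Mr. Juan Kristoff Canadian Esq."],
    ["tel", {"type": ["work", "voice"]}, "uri", "tel:+1-425-123-4567"]
  ]
]
"""

# A jCard is a two-element array: the string "vcard" and a list of properties.
kind, properties = json.loads(jcard_text)

# Each property is [name, parameters, value type, value, ...].
by_name = {}
for name, _params, _value_type, *values in properties:
    # Most properties carry one value; keep a list only when there are several.
    by_name[name] = values[0] if len(values) == 1 else values

print(kind)
print(by_name["fn"])
print(by_name["n"][1])
```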

While jCard is not necessarily popular in practice, JSON-formatted data has grown in importance far beyond contact information.

In many, if not most, use cases for stored structured information, the data ends up in a web browser-based interface.

This includes providing back-end services for single-page applications, as I describe in “Fundamentally Different Way to Simplify Web Applications”.

Why not just store or have the primary format be JSON instead of XML?


Structured Data, JSON-LD and Schema.org

Several years ago, Google, Microsoft, Yahoo and Yandex founded Schema.org as an open source collaboration on semantic markup vocabularies.

Since then, millions of sites have added structured content markup “to power rich, extensible experiences”. “Schema.org vocabulary can be used with many different encodings, including RDFa, Microdata and JSON-LD” (JavaScript Object Notation for Linked Data).

Search engines use this semantic information to produce richer, more useful results. For instance, searching Google for “google” [ https://www.google.com/search?q=google ] produces a result that includes a details panel on the right side of the page.

This structured information results from RDFa, Microdata or JSON-LD markup in the google.com page content.

Cagle’s post “JSON-LD rewrites the Semantic Web” does an excellent job of describing the flexibility of JSON-LD and its inherent importance to Semantic Web content markup.

I could embed a JSON-LD version of my contact info example in this page to tell search engines what the elements are and what they mean relative to the Schema.org definition of Person.

For instance, using Google’s Structured Data Testing Tool I can produce this:

{
"@type": "http://schema.org/Person",
"http://schema.org/email": "juankristoff@example.com",
"http://schema.org/familyName": "Canadian",
"http://schema.org/givenName": "Juan",
"http://schema.org/additionalName": "Kristoff",
"http://schema.org/honorificPrefix": "Mr.",
"http://schema.org/honorificSuffix": "Esq.",
"http://schema.org/jobTitle": "Information Architect to the stars",
"http://schema.org/name": "Juan Kristoff Canadian",
"http://schema.org/telephone": "(425) 123-4567",
"http://schema.org/birthDate": "1970-01-01",
"http://schema.org/worksFor": "Northeastern Life"
}

These properties can be added to page content to enable search engines to understand the meaning of each data element and to produce richer results.
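JSON-LD is normally added to a page inside a script element whose type is application/ld+json. The following Python sketch builds that element from a dictionary. It uses the compact form with an @context rather than the full IRIs shown above, and the helper name to_script_tag is an assumption for illustration.

```python
import json

person = {
    "@context": "http://schema.org/",
    "@type": "Person",
    "name": "Juan Kristoff Canadian",
    "email": "juankristoff@example.com",
    "jobTitle": "Information Architect to the stars",
}


def to_script_tag(data):
    """Wrap a JSON-LD document in the script element used for page markup."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(to_script_tag(person))
```

The printed element can be pasted into the head or body of an HTML page. Browsers do not execute or render it.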


JSON-LD has become an important way for the web to become the Semantic Web by enabling Linked Data. Three additional magical things about JSON-LD include:

1) With a defined and popular vocabulary, meaning (aka semantics) can be defined as part of the data itself.

2) JSON-LD permits additional pieces of information (semantic triples) to be appended as layers. This allows ad hoc addition of information without schema changes and enables Linked Open Data enhancement.

3) JSON-LD is one of several interchangeable Resource Description Framework (RDF) syntaxes (RDF/XML, Turtle, JSON, N3, etc.), which makes it an ideal publish-and-subscribe format with extreme flexibility and scalability.
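The layering idea in point 2 can be sketched with nothing more than sets of triples. In the Python example below, each statement is a (subject, predicate, object) tuple and a second layer of facts is merged in with a set union. The subject IRI http://example.com/people/juan is a hypothetical identifier invented for this sketch; a real dataset would use its own.

```python
# Hypothetical identifier for the subject; not part of the original dataset.
JUAN = "http://example.com/people/juan"

# The base layer: what the first system knows.
base = {
    (JUAN, "http://schema.org/name", "Juan Kristoff Canadian"),
    (JUAN, "http://schema.org/email", "juankristoff@example.com"),
}

# A second layer, perhaps from another system, about the same subject.
extra = {
    (JUAN, "http://schema.org/jobTitle", "Information Architect to the stars"),
    (JUAN, "http://schema.org/email", "juankristoff@example.com"),  # duplicate
}

# Merging layers is a set union: no schema change, and duplicates collapse.
graph = base | extra

for subject, predicate, obj in sorted(graph):
    print(subject, predicate, obj)
```

The merge works because both layers use the same identifier for Juan. Agreeing on identifiers is the real work in Linked Data.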

One last feature to point out is that JSON-LD has several format variations that are useful for different types of application.

A great way to try out this flexibility is to experiment with the JSON-LD Playground, whose output is shown here:

JSON-LD Formats

// Compacted
{
"@context": "http://schema.org/",
"type": "Person",
"additionalName": "Kristoff",
"schema:birthDate": "1970-01-01",
"email": "juankristoff@example.com",
"familyName": "Canadian",
"givenName": "Juan",
"honorificPrefix": "Mr.",
"honorificSuffix": "Esq.",
"jobTitle": "Information Architect to the stars",
"name": "Juan Kristoff Canadian",
"telephone": "(425) 123-4567",
"worksFor": "Northeastern Life"
}
// Expanded
[
{
"@type": [
"http://schema.org/Person"
],
"http://schema.org/additionalName": [
{
"@value": "Kristoff"
}
],
"http://schema.org/birthDate": [
{
"@value": "1970-01-01"
}
],
"http://schema.org/email": [
{
"@value": "juankristoff@example.com"
}
],
"http://schema.org/familyName": [
{
"@value": "Canadian"
}
],
"http://schema.org/givenName": [
{
"@value": "Juan"
}
],
"http://schema.org/honorificPrefix": [
{
"@value": "Mr."
}
],
"http://schema.org/honorificSuffix": [
{
"@value": "Esq."
}
],
"http://schema.org/jobTitle": [
{
"@value": "Information Architect to the stars"
}
],
"http://schema.org/name": [
{
"@value": "Juan Kristoff Canadian"
}
],
"http://schema.org/telephone": [
{
"@value": "(425) 123-4567"
}
],
"http://schema.org/worksFor": [
{
"@value": "Northeastern Life"
}
]
}
]
// Flattened
{
"@context": "http://schema.org/",
"@graph": [
{
"id": "_:b0",
"type": "Person",
"additionalName": "Kristoff",
"schema:birthDate": "1970-01-01",
"email": "juankristoff@example.com",
"familyName": "Canadian",
"givenName": "Juan",
"honorificPrefix": "Mr.",
"honorificSuffix": "Esq.",
"jobTitle": "Information Architect to the stars",
"name": "Juan Kristoff Canadian",
"telephone": "(425) 123-4567",
"worksFor": "Northeastern Life"
}
]
}
// N-Quads
_:b0 <http://schema.org/additionalName> "Kristoff" .
_:b0 <http://schema.org/birthDate> "1970-01-01" .
_:b0 <http://schema.org/email> "juankristoff@example.com" .
_:b0 <http://schema.org/familyName> "Canadian" .
_:b0 <http://schema.org/givenName> "Juan" .
_:b0 <http://schema.org/honorificPrefix> "Mr." .
_:b0 <http://schema.org/honorificSuffix> "Esq." .
_:b0 <http://schema.org/jobTitle> "Information Architect to the stars" .
_:b0 <http://schema.org/name> "Juan Kristoff Canadian" .
_:b0 <http://schema.org/telephone> "(425) 123-4567" .
_:b0 <http://schema.org/worksFor> "Northeastern Life" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
// Normalized
_:c14n0 <http://schema.org/additionalName> "Kristoff" .
_:c14n0 <http://schema.org/birthDate> "1970-01-01" .
_:c14n0 <http://schema.org/email> "juankristoff@example.com" .
_:c14n0 <http://schema.org/familyName> "Canadian" .
_:c14n0 <http://schema.org/givenName> "Juan" .
_:c14n0 <http://schema.org/honorificPrefix> "Mr." .
_:c14n0 <http://schema.org/honorificSuffix> "Esq." .
_:c14n0 <http://schema.org/jobTitle> "Information Architect to the stars" .
_:c14n0 <http://schema.org/name> "Juan Kristoff Canadian" .
_:c14n0 <http://schema.org/telephone> "(425) 123-4567" .
_:c14n0 <http://schema.org/worksFor> "Northeastern Life" .
_:c14n0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .

These format transformations are easily accomplished with a number of open source libraries. Since my preferred platform and language are Node.js and JavaScript, I often use these npm packages:

https://www.npmjs.com/package/jsonld

https://www.npmjs.com/package/json2csv

https://www.npmjs.com/package/csvtojson

https://www.npmjs.com/package/xml2js
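To show what one of these transformations involves, here is a hand-rolled Python sketch that turns a small expanded-form document into N-Quads lines like those listed earlier. It is not how the jsonld package works internally. It handles only rdf:type and plain string values, and it ignores contexts, typed or language-tagged literals, nested nodes and named graphs, all of which a full JSON-LD library takes care of.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

expanded = [
    {
        "@type": ["http://schema.org/Person"],
        "http://schema.org/givenName": [{"@value": "Juan"}],
        "http://schema.org/familyName": [{"@value": "Canadian"}],
    }
]


def subject_term(node, index):
    """Return the N-Quads subject for a node, inventing a blank node if needed."""
    node_id = node.get("@id")
    if node_id is None:
        return "_:b%d" % index
    if node_id.startswith("_:"):
        return node_id
    return "<%s>" % node_id


def to_nquads(nodes):
    """Emit one line per statement, for string literals and types only."""
    lines = []
    for index, node in enumerate(nodes):
        subject = subject_term(node, index)
        for key, values in node.items():
            if key == "@id":
                continue
            if key == "@type":
                for type_iri in values:
                    lines.append("%s <%s> <%s> ." % (subject, RDF_TYPE, type_iri))
                continue
            for item in values:
                # json.dumps supplies the quotes and escapes the literal.
                literal = json.dumps(item["@value"])
                lines.append("%s <%s> %s ." % (subject, key, literal))
    return sorted(lines)


for line in to_nquads(expanded):
    print(line)
```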


These tools form the basis of defining “Things not Strings” or semantic orientation for data. They are central components in tool kits I developed including:

Semantic Data Master (SDM) — Next Gen MDM

“Create and maintain an enterprise conceptual meta and data model; Capture semantic references in master data model with understanding of objects and relationships between objects, i.e. ”Things not Strings”; Simplify Data Architecture and reduce future technical debt.”

Semantic data definition is a great way to handle master data management elements. It allows the meaning of the data to be embedded in the data. The purpose of the SDM toolkit is to enable transformation of datasets with the help of JSON Schema and ontologies. But that is a different story.

Semantic Data Master transforms allow me to show you another very important format for schema and data values: the ubiquitous comma-separated values (CSV) file.

This is the format of the SDM FormFieldDef and FieldVal files, where snippets with Juan look like:

FormFieldDef portion

FieldVal record

I would be happy to go over the details one on one; contact me at jstewart@asteriusmedia.com.

I have used this set of tools for many client projects and have evolved and improved the methodology over time. I dove into semantics and the inherent simplicity and extensibility of knowledge graphs in “A little ditty about Knowledge Graphs, Places, Things and Einstein”:

https://www.linkedin.com/pulse/little-ditty-knowledge-graphs-places-things-einstein-jeffrey-stewart/

All of these examples get us to the future of data management endeavors: truly semantic data sets whose meaning enables questions to be asked of the data.

What I produce are data sets in a format where information can be easily appended, and where logical reasoning (inference) engines and cognitive computing services can surface hidden information in response to complex questions.

The ability to layer additional information onto a dataset is key to creating knowledge graphs and solutions such as 360-degree customer views.

Sounds way too hard and not needed? Well, these are the techniques that Google, LinkedIn, Facebook, Microsoft, Amazon and IBM use to produce Q&A systems, and they are how Big Data analytics is done.


My final example format for Juan is a semantic network or semantic graph. Here is a visualization of a simple graph using my MeGraph and Perse projects.

http://megraph.azurewebsites.net/public/default.html

A portion of the semantic network graph dataset looks like this:

{
"@graph": [
{
"@id": "Person_Juan_Kristoff_Canadian",
"@context": "http://schema.org",
"@type": "schema:Person",
"rdf:type": "dbo:Person",
"dbo:type": "foaf:Person",
"rdfs:label": "Person: Juan Kristoff Canadian",
"dc:abstract": "What about him? Fictitious contact to demonstrate JSON-LD, Semantic Datasets and MeGraph",
"place:locations": [
"Place:United_States",
"Place:Wisconsin,_United_States",
"Place:Delavan,_Wisconsin",
"Place:Person_Juan_Kristoff_Canadian_PostalAddress"
],
"schema:name": "Juan Kristoff Canadian",
"schema:givenName": "Juan",
"schema:familyName": "Canadian",
"schema:additionalName": "Kristoff",
"schema:honorificPrefix": "Mr.",
"schema:honorificSuffix": "Esq.",
"schema:telephone": "(425)123-4567",
"schema:birthDate": "1970-01-01",
"schema:agerange": "45-54",
"schema:gender": "Male",
"schema:email": "juankristoff@example.com",
"schema:organization": "Northeastern Life",
"schema:jobTitle": "Information Architect to the stars",
"place:location": [
"Place:PostalAddress_Home"
],
"place:basedNear": [
"Place:Delavan,_Wisconsin"
],
"owl:sameAs": "",
"schema:addressLocality": "Delavan",
"schema:addressRegion": "WI",
"schema:postalCode": "53115",
"schema:streetAddress": "",
"schema:City": "Delavan",
"schema:State": "Wisconsin",
"schema:Country": "United States",
"vcard:vcard": [
"VCard:Juan_Kristoff_Canadian"
],
"rdfs:seeAlso": [
"http://www.example.com/dir_photos/my_photo.gif"
]
},
{
"@id": "Place:Delavan,_Wisconsin",
"@context": "http://schema.org",
"@type": "schema:Place",
"rdf:type": "dbo:City",
"dbo:type": "dbo:Place",
"rdfs:label": "Delavan, Wisconsin",
"dc:abstract": "Delavan is a city in Walworth County, Wisconsin, United States. The population was 8,463 at the 2010 census. The city is located partially within the Town of Delavan but the two entities are politically independent. City events include the Delavan Train Show in March, Cinco de Mayo in May, Heritage Fest in August, and Scarecrow Fest in September. (en)",
"place:basedNear": "",
"owl:sameAs": [
"http://dbpedia.org/page/Delavan,_Wisconsin"
],
"schema:City": "Delavan",
"schema:State": "Wisconsin",
"schema:Country": "United States",
"dbo:isPartOf": "Place:Wisconsin,_United_States"
},
{
"@id": "Place:PostalAddress_Home",
"@context": "http://schema.org",
"@type": "schema:PostalAddress",
"rdf:type": "",
"dbo:type": "dbo:PostalAddress",
"rdfs:label": "Home Address",
"dc:abstract": "",
"place:locations": [
"Place:Delavan,_Wisconsin"
],
"place:basedNear": "Place:Delavan,_Wisconsin",
"owl:sameAs": "",
"schema:addressLocality": "Delavan",
"schema:addressRegion": "WI",
"schema:postalCode": "53115",
"schema:streetAddress": "",
"schema:City": "Delavan",
"schema:State": "Wisconsin",
"schema:Country": "United States",
"dbo:isPartOf": "Place:Delavan,_Wisconsin"
}
],
"@id": "Person_Juan_Kristoff_Canadian"
}
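What makes this a graph rather than a set of records is that values such as "Place:Delavan,_Wisconsin" are references to other nodes. The Python sketch below indexes a cut-down copy of the dataset by @id and follows one of those references. The helper name follow is made up for illustration, and the sample keeps only a few of the properties shown above.

```python
dataset = {
    "@graph": [
        {
            "@id": "Person_Juan_Kristoff_Canadian",
            "rdfs:label": "Person: Juan Kristoff Canadian",
            "place:basedNear": ["Place:Delavan,_Wisconsin"],
        },
        {
            "@id": "Place:Delavan,_Wisconsin",
            "rdfs:label": "Delavan, Wisconsin",
            "place:basedNear": "",
            "owl:sameAs": ["http://dbpedia.org/page/Delavan,_Wisconsin"],
        },
    ]
}

# Index every node by its @id so a reference can be followed in one lookup.
nodes = {node["@id"]: node for node in dataset["@graph"]}


def follow(node_id, predicate):
    """Return the nodes that node_id points to through predicate."""
    targets = nodes[node_id].get(predicate, [])
    if isinstance(targets, str):
        # The dataset uses a bare string (possibly empty) for single values.
        targets = [targets] if targets else []
    # Skip references to nodes that are not present in this portion.
    return [nodes[target] for target in targets if target in nodes]


for place in follow("Person_Juan_Kristoff_Canadian", "place:basedNear"):
    print(place["rdfs:label"], place["owl:sameAs"])
```

Following owl:sameAs out to DBpedia is the same move across datasets, which is how additional layers of information get attached to Juan.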

The purpose of my RDF/JSON-LD Perse ontology is to structure Person, Organization, Place, Event, and Action information (see also “Everything you need to run your business is on your wrist or in your pocket”) in the form of semantic nets or, more broadly, knowledge representation and reasoning (KR²).

As you can see, the transformation of tabular information and XML representations into forms that facilitate Big Data and AI is pretty straightforward.


Conclusion

I have tried to show you the power and simplicity of JSON and RDF formats, to provide demonstrations of JSON-LD and Schema.org for SEO as well as its broader uses and benefits as structured Linked Data, aka Semantic Graphs.

Deliverables today:

  • Example of JSON-LD embedding for fictitious contact Juan
  • Examples of the various JSON-LD format versions and their relationship to XML
  • Description of JSON as core format of MDM and introduction to SDM
  • A brief intro and benefits of Perse and MeGraph

I am looking for clients who are frustrated with their current approaches for modeling customer experiences and seek a 10x gain in value from their existing customer data platforms.

Interested?