Semantic Data Master (SDM) — Next Gen MDM

Create and maintain an enterprise conceptual meta and data model; capture semantic references in the master data model with an understanding of objects and the relationships between them, i.e. “Things not Strings”; simplify data architecture and reduce future technical debt.

As you create digital products, you must take action to minimize future technical debt while architecting an operationalized business model.

I consult and train organizations on marketing and sales automation technology integration.

This almost always benefits from a discussion of Enterprise and Business Architecture.

Based on that experience, I propose an essential building block for every digital product: the Semantic Data Master.

Semantic Data Master (SDM) is a proof of concept (POC) of JavaScript processes that enable automated form building with JSON Forms via JSON Schema definitions derived from Web Ontology Language (OWL) ontology business models.

SDM is a simple, business-oriented definition of a data architecture as part of a larger Digital Product Master (DPM). It provides a single, coherent view of data and relationships at the core of a digital product.

The components define a Data Model of how application systems are integrated and how data flows between Tools and Teams. An SDM provides a canonical model for an enterprise architecture. A Data Model has three perspectives, which are represented in SDM as follows:

  • A Conceptual view, provided by the OWL ontological representation of the business entities and their relationships
  • A Logical view, provided by a JSON Schema representing how entities are connected via properties, constraints and restrictions
  • A Physical view, realized via JSON Forms touchpoints and data store functionality

Together these views form a model that “explicitly determines the structure of data”. They provide a specification of the business entities in a Business Model and of the data relationships, semantics and constraints that support Process Model application integrations.

Overview of data modeling context: Data model is based on Data, Data relationship, Data semantic and Data constraint. — Wikipedia

In addition to data structure, the model defines the data properties and organization that make up a Data Architecture.

The resulting Data Architecture is part of a Technology Model within the overall set of Digital Product Model Master artifacts.

The SDM defines a reference architecture for master data representation. It provides a conceptual design for ensuring that the underlying physical data structures can support a wide range of data access needs.

The reference model is a data dictionary that enables data integration, synchronization and reflection, and facilitates interoperable system connections.

It does so while leveraging and promoting best practices (not reinventing the wheel) and warehousing the meaning of data properties and entity relationships.

This is a next-step technique for Master Data Management (MDM) and Reference Data Management (RDM): create a reference model using Linked Data semantic ontologies that drive data dictionaries and reference data types for multiple application touchpoints, and that prepare datasets for AI / ML.

The Web is about links

The Web is about links… The Semantic Web is about the relationships in those links — Dan Brickley

Besides being an ideal way of creating a reference dictionary with MEANING (aka Semantics), building a metamodel with Linked Data technology prepares data for AI and ML.

As Dan Brickley states in Semantic Web: Learning from Machine Learning, “Semantic Web formats and approaches…provide mechanisms that can help provide more real world background knowledge to ML workflows.”

A chief advantage of semantic references in MDM is that they not only align concepts from multiple data sources to a common concept, but also provide context for that concept in the greater world.

If I define First Name, fname, _fn, and cellD3 as equivalent to a master reference givenName, I understand the interoperability of that field between systems.

On the other hand, if I assign these to a semantic reference <http://xmlns.com/foaf/spec/#term_givenName>, I give the reference definition meaning in the Friend of a Friend (FOAF) ontology / vocabulary.

This in turn can support inference across numerous Linked Data datasets and provide the basis for machine understanding of what a first name data element means.
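As a minimal sketch of the alias idea above: the source field names and the FOAF reference come from the text, while the table layout and the function names (buildAliasIndex, toMasterRecord) are hypothetical and not part of the SDM project.

```javascript
// Hypothetical alias table: source-system field names mapped to one master
// reference and to the semantic reference quoted in the text.
const GIVEN_NAME = {
  master: "givenName",
  iri: "http://xmlns.com/foaf/spec/#term_givenName",
  aliases: ["First Name", "fname", "_fn", "cellD3"],
};

// Build a lookup from every alias (and the master name) to its definition.
function buildAliasIndex(definitions) {
  const index = new Map();
  for (const def of definitions) {
    index.set(def.master, def);
    for (const alias of def.aliases) index.set(alias, def);
  }
  return index;
}

// Rename the keys of a source record to master names.
// Keys without a known alias pass through unchanged.
function toMasterRecord(record, index) {
  const out = {};
  for (const [key, value] of Object.entries(record)) {
    const def = index.get(key);
    out[def ? def.master : key] = value;
  }
  return out;
}

const aliasIndex = buildAliasIndex([GIVEN_NAME]);
const crmRow = toMasterRecord({ fname: "Ada", city: "London" }, aliasIndex);
const sheetRow = toMasterRecord({ cellD3: "Grace" }, aliasIndex);

console.log(crmRow, sheetRow, aliasIndex.get("_fn").iri);
```

The alias table alone gives interoperability; carrying the IRI alongside it is what lets a downstream tool look up what the field means.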

Digital Businesses tend to assemble enabling tools and systems via loose coupling of various cloud services, resulting in linked systems and linked data across the Web.

By enabling system data to express facts that include their own meaning, applications can interpret the meaning of the data from the content itself.

This is a major objective of building a Semantic Web.

“A lot of people have been working on making the software smarter. In the semantic web, we’re going to start making the data smarter.” — David Siegel

Motivation

This POC project is inspired by several dev projects that I am bringing together to produce a new solution for a number of complex problems.

One part is an extension of work discussed in Fundamentally Different Way to Simplify Web Applications in which I leverage JSON Schema for automated application form generation.

Another part is based on my exploration of semantic tools for knowledge management systems. Examples include several of my projects for organizational competitive intelligence.

A frequent criticism of semantics is the unrealistic expectation that humans will tag information and content when they create it. They won’t. An automated process for classification and semantic enhancement (cognification) is needed.

I describe a version of this, which I call MrKnowItAll, in So what’s different? Can you say “Personality as a Service”?

It is about interactively building a knowledge graph for a Knowledge Management and Competitive Intelligence system framework that can Gather, Classify, and Enhance Information to create Useful Knowledge and Insight Discovery.

The project(s) are organized around a knowledge management ontology with metamodel(s) defined by Web Ontology Language (OWL) for ontology development and semantic graph creation.

Another example is the Perse Ontology as described in Can’t We Just Have a Conversation? “Of course, what would you like to know?”

And finally I am motivated to complete this POC because of a LinkedIn post and discussion by Kurt Cagle. The gist is the use of JavaScript as the glue between data, queries and application code. My project ties a number of these steps together into a cohesive solution in JavaScript.

I am showing how JavaScript can be not only the glue but also the implementation tool, based on related standards that provide object modeling, schema definitions, data field specifications, and HTML form generation and control.

The linchpin is JSON Schema and the realization that all data models use the elemental data types defined in JavaScript Primitive Data Types (string, number, boolean and undefined, among others) and HTML 5 Input Type Attributes.
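To make that correspondence concrete, here is a small sketch that picks an HTML5 input type for each property of a JSON Schema. The two lookup tables are my own illustrative assumptions, not a mapping taken from the SDM project or from any forms library; the format names and input types themselves are standard.

```javascript
// Assumed lookup tables (illustrative only).
// JSON Schema "format" values for strings -> HTML5 input types.
const FORMAT_TO_INPUT = new Map([
  ["email", "email"],
  ["date", "date"],
  ["time", "time"],
  ["date-time", "datetime-local"],
  ["uri", "url"],
]);

// JSON Schema "type" values -> HTML5 input types.
const TYPE_TO_INPUT = new Map([
  ["string", "text"],
  ["number", "number"],
  ["integer", "number"],
  ["boolean", "checkbox"],
]);

// Pick an HTML5 input type for one JSON Schema property definition.
// Anything unrecognized falls back to a plain text input.
function inputTypeFor(property) {
  if (property.type === "string" && FORMAT_TO_INPUT.has(property.format)) {
    return FORMAT_TO_INPUT.get(property.format);
  }
  return TYPE_TO_INPUT.get(property.type) || "text";
}

// Hypothetical Contact schema used only for this example.
const contactSchema = {
  type: "object",
  properties: {
    givenName: { type: "string" },
    email: { type: "string", format: "email" },
    employees: { type: "integer" },
    optIn: { type: "boolean" },
  },
};

const fieldTypes = Object.fromEntries(
  Object.entries(contactSchema.properties).map(([name, property]) => [
    name,
    inputTypeFor(property),
  ])
);

console.log(fieldTypes);
// { givenName: 'text', email: 'email', employees: 'number', optIn: 'checkbox' }
```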

SDM is a component of the larger framework I call Digital Product Model Masters (DPMM). DPMM is centered on the advantages of Business Architecture methodologies and on the construct that needed Business Capabilities are a combination of People, Process and Technology.

“Business architecture is the bridge between the enterprise business model and enterprise strategy on one side, and the business functionality of the enterprise on the other side.” — Wikipedia

What is the Problem?

Many industries are struggling with much more complexity in their business.

The sources and feeds of customer data have exploded, and making sense of all of the activity requires refinement of those data.

Businesses need to have a 360-degree view of products, customers, store locations, supply chain, unfinished goods, and digital assets… the list goes on and on.

MDM systems and techniques have long been used to provide a comprehensive enterprise view by importing and aligning data from multiple disparate systems.

Often the result is a master file that provides a common point of reference. MDM streamlines data sharing among personnel and departments and enables analytics that support decision making.

CIOs, CTOs, CMTOs, IT Directors and Managers, System Architects, Data Scientists and Software Developers seek to create holistic enterprise data that captures complete and refined views (or at least they should).

This has been the realm of Master Data Management (MDM).

As data complexity has increased, MDM systems have adapted by becoming more complex themselves, adding multi-domain modeling, reference data management, and semantic tools and techniques.

SDM addresses this with a standards-based approach that integrates one or more semantic data models with a practical implementation that supports DevOps, reduces risk, and improves time to market.

Elements of Semantic Data Master Models

I use standards to define the elements of an SDM. The POC project is an instance of a Semantic Data Master Model.

The standards define specific file formats that capture details of each of the Conceptual, Logical and Physical views of a data model.

The objective of this project is to connect and transform these formats through relationship definition mappings to produce a front-to-back architectural solution.

Processes

The project is an exploration of the tools and techniques that map and transform these file formats. Each process function transforms one standard format into the next or provides import / export functionality for a system. The visualization of these processes is:

Components

OWL (OWLJSONLD) Entity Ontologies

  • Web Ontology Language (OWL) in TTL or JSON-LD that specifies entities, properties, relationships and restrictions for a domain
  • Import / Export: JSON-LD, TTL, RDF/XML, N3

JSON Schema (JSONSchema)

  • JSON Schema that specifies the fields and properties of domain objects
  • Import / Export: JSON, JSON-LD, TTL, RDF/XML, N3, SQL Script, Form Field Defs, OWL JSON-LD

Form Field Definition (FormFieldDef) Schema

  • Field definition table with columns that specify HTML5 Form Element, Input Types and Restrictions
  • Import / Export: CSV / XLS / Sheets, JSON Schema

Field Editor Schema (FieldSchema)

  • CSV or JSON Array with records that specify individual domain field definitions: types, restrictions, attributes, rules and triggers

HTML JSON Forms (JSONForm)

  • JSON Form generation of form elements defined by JSON Schema specification

Benefits

The benefits of developing and using an SDM process engine include:

  • Create and maintain an enterprise conceptual meta and data model
  • Capture semantic references in the master data reference model -> understanding of objects and the relationships between objects, i.e. “Things not Strings”
  • Elegant and manageable data properties and schema definitions that all enterprise info can map to -> dramatically aiding Analytics and Cognification
  • Easy transformation between meta models, schema definitions and application data field specifications
  • Automate system setup and configuration changes -> dramatically less administrative work, fewer errors, and a move toward DevOps throughout an Enterprise

Towards an Open Source solution

#semanticdatamaster (SDM) is a proof of concept open source project in the making that explores using Linked Data Semantic ontologies to drive JSON Schema and Schema Forms.

SDM consists of JavaScript in the form of UI, Node.js, and AWS Lambda functions that enable form building via JSON Schema definitions and Web Ontology Language (OWL).

The intention is to empower the use of JSON Schema (via jsonform) to describe a data model and to produce the JSON Schema from OWL specifications using a version of owl2jsonschema.js.
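The OWL-to-JSON-Schema step can be sketched as follows. This is not the owl2jsonschema.js mentioned above; it is a simplified illustration that assumes a flattened JSON-LD "@graph" with compact IRIs and exactly one rdfs:domain and one rdfs:range per datatype property. The ex: namespace, the Contact class and the function name owlToJsonSchema are hypothetical.

```javascript
// Assumed mapping from XSD datatypes to JSON Schema type/format pairs.
const XSD_TO_JSON_TYPE = new Map([
  ["xsd:string", { type: "string" }],
  ["xsd:integer", { type: "integer" }],
  ["xsd:decimal", { type: "number" }],
  ["xsd:boolean", { type: "boolean" }],
  ["xsd:date", { type: "string", format: "date" }],
  ["xsd:dateTime", { type: "string", format: "date-time" }],
]);

// "ex:givenName" -> "givenName"; also handles "#" and "/" separated IRIs.
const localName = (id) => id.split(/[#:\/]/).pop();

// Collect every owl:DatatypeProperty whose rdfs:domain is the given class
// and emit it as a property of a JSON Schema object.
function owlToJsonSchema(ontology, classId) {
  const properties = {};
  for (const node of ontology["@graph"]) {
    if (node["@type"] !== "owl:DatatypeProperty") continue;
    if (!node["rdfs:domain"] || node["rdfs:domain"]["@id"] !== classId) continue;
    const range = node["rdfs:range"] ? node["rdfs:range"]["@id"] : undefined;
    properties[localName(node["@id"])] = {
      // Unknown or missing ranges fall back to a plain string.
      ...(XSD_TO_JSON_TYPE.get(range) || { type: "string" }),
      // Keep the ontology identifier so the semantic link is not lost.
      $comment: node["@id"],
    };
  }
  return {
    $schema: "https://json-schema.org/draft/2020-12/schema",
    title: localName(classId),
    type: "object",
    properties,
  };
}

// Hypothetical miniature ontology.
const ontology = {
  "@context": {
    owl: "http://www.w3.org/2002/07/owl#",
    rdfs: "http://www.w3.org/2000/01/rdf-schema#",
    xsd: "http://www.w3.org/2001/XMLSchema#",
    ex: "http://example.org/sdm#",
  },
  "@graph": [
    { "@id": "ex:Contact", "@type": "owl:Class" },
    { "@id": "ex:givenName", "@type": "owl:DatatypeProperty",
      "rdfs:domain": { "@id": "ex:Contact" }, "rdfs:range": { "@id": "xsd:string" } },
    { "@id": "ex:birthDate", "@type": "owl:DatatypeProperty",
      "rdfs:domain": { "@id": "ex:Contact" }, "rdfs:range": { "@id": "xsd:date" } },
    { "@id": "ex:orderTotal", "@type": "owl:DatatypeProperty",
      "rdfs:domain": { "@id": "ex:Order" }, "rdfs:range": { "@id": "xsd:decimal" } },
  ],
};

const generatedSchema = owlToJsonSchema(ontology, "ex:Contact");
console.log(JSON.stringify(generatedSchema, null, 2));
```

A real ontology would also carry object properties, cardinality restrictions and subclass relations, none of which this sketch handles.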

This would be in alignment, more or less, with the purpose of XML Schema and eventually the Shapes Constraint Language (SHACL) to define schema dependencies, constraints, and restrictions.

SDM can import table extracts (SQL Server, spreadsheets, Salesforce CSVs, etc.) as CSV, then transform and map them using a JSON Schema definition file.

The base JSON Schema is generated from a domain-specific OWL metamodel.
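A rough sketch of such an import, under stated assumptions: the CSV has a header row and no line breaks inside quoted cells, a hypothetical columnMap renames source columns to schema property names, and only columns present in the schema are kept. None of these function names come from the SDM code.

```javascript
// Split one CSV line into cells, honoring double-quoted cells and "" escapes.
function parseCsvLine(line) {
  const cells = [];
  let cell = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { cell += '"'; i++; }
      else if (ch === '"') inQuotes = false;
      else cell += ch;
    } else if (ch === '"') inQuotes = true;
    else if (ch === ",") { cells.push(cell); cell = ""; }
    else cell += ch;
  }
  cells.push(cell);
  return cells;
}

// Convert a raw CSV string to the type declared in the JSON Schema property.
function coerce(value, property) {
  switch (property.type) {
    case "integer":
    case "number":
      return Number(value);
    case "boolean":
      return value.trim().toLowerCase() === "true";
    default:
      return value;
  }
}

// Map each CSV row to a record keyed by schema property names.
// Columns not in the schema and empty cells are skipped.
function importCsv(csvText, schema, columnMap) {
  const [headerLine, ...lines] = csvText.trim().split(/\r?\n/);
  const headers = parseCsvLine(headerLine);
  return lines.map((line) => {
    const cells = parseCsvLine(line);
    const record = {};
    headers.forEach((header, i) => {
      const name = columnMap[header] || header;
      const property = schema.properties[name];
      const raw = cells[i];
      if (property && raw !== undefined && raw !== "") {
        record[name] = coerce(raw, property);
      }
    });
    return record;
  });
}

// Hypothetical schema, column map and extract.
const importSchema = {
  type: "object",
  properties: {
    givenName: { type: "string" },
    familyName: { type: "string" },
    employees: { type: "integer" },
    optIn: { type: "boolean" },
  },
};
const columnMap = {
  "First Name": "givenName",
  "Last Name": "familyName",
  Employees: "employees",
  "Opt In": "optIn",
};
const csv =
  'First Name,Last Name,Employees,Opt In,Notes\n' +
  '"Ada","Lovelace, Countess",12,TRUE,ignored\n' +
  'Grace,Hopper,,false,x';

const records = importCsv(csv, importSchema, columnMap);
console.log(records);
```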

The schema defines field properties in a dataset and a CSV can be exported with schema column name definitions.

These field property definitions can be edited using HTML 5 Form Elements, Input Types, and Attributes. Changes (physical) can be saved back into the JSON Schema file (logical) and can potentially flow back into the OWL metamodel (conceptual).
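One way the physical-to-logical step might look is sketched below. The attribute-to-keyword table is an assumption on my part, and applyFieldEdit is a hypothetical helper rather than an SDM function. It returns a new schema instead of mutating the original, so an edit can be reviewed before it is saved.

```javascript
// Assumed correspondence between HTML5 input attributes and JSON Schema keywords.
// Note: the HTML pattern attribute must match the whole value, while JSON Schema
// "pattern" matches anywhere unless the expression is anchored.
const ATTRIBUTE_TO_KEYWORD = new Map([
  ["maxlength", "maxLength"],
  ["minlength", "minLength"],
  ["min", "minimum"],
  ["max", "maximum"],
  ["pattern", "pattern"],
]);

// Apply edited form-field attributes to one property of a JSON Schema.
function applyFieldEdit(schema, fieldName, attributes) {
  const property = { ...(schema.properties[fieldName] || {}) };
  for (const [attribute, value] of Object.entries(attributes)) {
    const keyword = ATTRIBUTE_TO_KEYWORD.get(attribute);
    if (keyword) property[keyword] = value;
  }
  // "required" lives on the parent object in JSON Schema, not on the property.
  const required = new Set(schema.required || []);
  if (attributes.required === true) required.add(fieldName);
  if (attributes.required === false) required.delete(fieldName);
  return {
    ...schema,
    properties: { ...schema.properties, [fieldName]: property },
    required: [...required],
  };
}

// Hypothetical starting schema and edit.
const originalSchema = {
  type: "object",
  properties: { givenName: { type: "string" } },
  required: [],
};
const editedSchema = applyFieldEdit(originalSchema, "givenName", {
  maxlength: 40,
  required: true,
});

console.log(JSON.stringify(editedSchema, null, 2));
```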

A CSV or source system data, such as Salesforce Contacts or Companies, can be imported, and an ETL process generates a JSON-LD version of the source dataset.
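The last step of such an ETL process could be sketched like this, assuming the records have already been mapped to master field names. The FOAF namespace and terms are real; the base IRI, the id field and the toJsonLd helper are hypothetical.

```javascript
// Context that gives each master field name a meaning in the FOAF vocabulary.
const context = {
  foaf: "http://xmlns.com/foaf/0.1/",
  givenName: "foaf:givenName",
  familyName: "foaf:familyName",
};

// Wrap plain records in a JSON-LD document: every record becomes a node with
// an "@id" built from its identifier field and a shared "@type".
function toJsonLd(records, { context, type, baseIri, idField }) {
  return {
    "@context": context,
    "@graph": records.map((record) => {
      const { [idField]: id, ...fields } = record;
      return { "@id": baseIri + encodeURIComponent(id), "@type": type, ...fields };
    }),
  };
}

// Hypothetical source rows, e.g. from a Contacts extract.
const contacts = [
  { id: "003A1", givenName: "Ada", familyName: "Lovelace" },
  { id: "003A2", givenName: "Grace", familyName: "Hopper" },
];

const jsonLd = toJsonLd(contacts, {
  context,
  type: "foaf:Person",
  baseIri: "http://example.org/contact/",
  idField: "id",
});

console.log(JSON.stringify(jsonLd, null, 2));
```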

The JSON-LD contains the values or content associated with a JSON Schema. This can be exported as a Content tree and used to populate form default values in a JSON object containing Schema, Form and Values for use in forms libraries.

This in turn can be fed into an HTML5 form using JSON Forms with the same JSON Schema definition and content file.
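To show the principle without depending on any particular forms library, here is a bare-bones renderer that turns a schema plus a values object into form markup. It is a stand-in for illustration only and does not reproduce the API of JSON Forms or jsonform; renderForm and the sample schema are hypothetical.

```javascript
// Escape text for safe use in HTML content and double-quoted attributes.
const escapeHtml = (text) =>
  String(text).replace(/[&<>"]/g, (ch) =>
    ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" }[ch])
  );

// Render one <input> per schema property, pre-filled from values or from the
// schema's own "default" keyword.
function renderForm(schema, values = {}) {
  const required = new Set(schema.required || []);
  const fields = Object.entries(schema.properties).map(([name, property]) => {
    const type =
      property.type === "boolean" ? "checkbox"
      : property.type === "integer" || property.type === "number" ? "number"
      : "text";
    const value = values[name] ?? property.default;
    const attrs = [`type="${type}"`, `name="${escapeHtml(name)}"`];
    if (type === "checkbox") {
      if (value === true) attrs.push("checked");
    } else if (value !== undefined) {
      attrs.push(`value="${escapeHtml(value)}"`);
    }
    if (required.has(name)) attrs.push("required");
    return `<label>${escapeHtml(property.title || name)} <input ${attrs.join(" ")}></label>`;
  });
  return `<form>\n${fields.join("\n")}\n</form>`;
}

// Hypothetical schema and content values.
const formSchema = {
  type: "object",
  properties: {
    givenName: { type: "string", title: "Given name" },
    employees: { type: "integer", default: 1 },
    optIn: { type: "boolean", title: "Opt in" },
  },
  required: ["givenName"],
};

const html = renderForm(formSchema, { givenName: 'Ada "A."', optIn: true });
console.log(html);
```

The same schema and values objects could be handed to a forms library instead; the point is that both the scaffolding and the default content come from data, not from hand-written markup.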

Remember that an HTML5 web app uses JS / CSS3 to create UI and UX. A JSON Form library builds the scaffolding and default content. It can be embedded in any app framework for form generation and control.

The Typical Alternative

While this is a controlled example, the process is vastly simpler and less costly than what is typically done today in most organizations.

An example is how application development is typically done. Applications are often developed by separate teams and in multiple organizations. They all end up creating many project artifacts, including:

  • UML diagrams
  • Data dictionaries
  • Database schemas
  • Object models
  • Form elements
  • Integration mappings

This process may or may not include using a master reference. More than likely each team works alone and has different team members over time. That, along with organizational changes and M&A, quickly produces a multitude of systems. Master Data Management is often brought in to solve the Tower of Babel problem.

German Late Medieval (c. 1370s) depiction of the construction of the tower. By Meister der Weltenchronik

What we need is a BabelFish — a Universal Translator and extensible knowledge base.

http://puredigitalco.com/2017/10/11/babel-fish-reality/

Next Steps toward Things not Strings

There is a lot of truth in the saying that people do not learn until they are ready to learn, and many need to see a working example to comprehend any new idea.

In presenting a different way of solving a problem, I want to publish and drive an open source project for this solution.

The Semantic Data Master has a lot of moving parts, all of which are based substantially on standards and other open source projects.

The goal is to make a complex idea simple. In order to do this I NEED YOUR HELP. Contact me on how we can collaborate to coevolve this solution.

Working together we can simplify data architecture and reduce future technical debt by creating “Things not Strings”.