The Data Model as the (dynamic) soul of your project. Part 1

Ricardo Cid
Feb 16, 2016

--

Or how to build a web application that is future-proof.

How cool would it be to write a data model, insert it into some sort of magic slot, and have its database, API, back end, cache, front end, etc. built on the fly and deployed for you?

You know what would be way cooler? Being able to modify that data model at will without breaking anything (eternal backward and forward compatibility).

Let me trip a little bit more: my data model should be technology agnostic. I should be able to take my data model to a brand-new shiny slot and have it running on a completely different stack before I go to lunch (including exporting all the data from the old system to the new one).

Why not? After all, if the data model is the soul of my project and the data it contains is the blood, whatever body (stack) I’m currently using should be irrelevant. Only by jumping from body to body can an idea become immortal, always taking advantage of the latest technologies.

Does this sound like a vampire story? Why aren’t we building software like this already? All the technological pieces are in place (thank you NoSQL, thank you dynamically typed languages, thank you client-side frameworks, thank you modern ORMs, etc.).

How could this be implemented? Let’s use a simple example:

Example 1: I want to help a dog shelter find loving owners for their dogs (but I only have one day to finish the project). After a 5-minute call with the shelter’s director, where I ask what information they can easily provide for each dog, I write the following data model:

{
  "_entity": "dogs4adoption",
  "_semantic": "https://en.wikipedia.org/wiki/Pet_adoption",
  "_attributes": [
    {
      "_name": "dogs_name",
      "_type": "string",
      "_arrange": 1,
      "_level": 1,
      "_widget": "text",
      "_default": "",
      "_cardinality": "single",
      "_semantic": "https://en.wikipedia.org/wiki/Personal_name"
    }, {
      "_name": "breed",
      "_type": "entity:breeds",
      "_arrange": 2,
      "_level": 1,
      "_widget": "select",
      "_default": "",
      "_cardinality": "multiple",
      "_semantic": "https://en.wikipedia.org/wiki/Dog_breed"
    }, {
      "_name": "gender",
      "_type": "genders",
      "_arrange": 3,
      "_level": 1,
      "_widget": "select",
      "_cardinality": "single",
      "_semantic": "https://en.wikipedia.org/wiki/Gender"
    }, {
      "_name": "short_description",
      "_type": "string",
      "_arrange": 4,
      "_level": 2,
      "_widget": "textarea",
      "_default": "",
      "_cardinality": "single",
      "_semantic": "http://www.merriam-webster.com/dictionary/description"
    }, {
      "_name": "neutered",
      "_type": "boolean",
      "_arrange": 5,
      "_level": 2,
      "_widget": "checkbox",
      "_default": false,
      "_cardinality": "single",
      "_semantic": "https://en.wikipedia.org/wiki/Neutering"
    }, {
      "_name": "dog_picture",
      "_type": "image",
      "_arrange": 6,
      "_level": 2,
      "_widget": "image",
      "_default": "",
      "_cardinality": "multiple",
      "_semantic": "https://en.wikipedia.org/wiki/Image"
    }
  ]
}

I’m defining an entity called “dogs4adoption”, whose definition has the following fields:

_entity [string]

This is the unique name I’ll use to refer to this particular data subject. In other words, I’m creating a new data type and this is its name. (Its rough physical representation is a table in an RDBMS or a collection in a document DB.)

_attributes [array]

An entity has a series of attributes that all together form the structure of this data type.

In other words, for something (object, document, etc.) to be considered of type x, it must have every one of the attributes specified in the _attributes list.
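As a rough sketch of that rule in Python (the helper name `conforms` and the trimmed two-attribute model are my own illustration, not part of any real library), a conformance check could look like this:

```python
def conforms(record: dict, model: dict) -> bool:
    """Return True if `record` has every attribute named in the model.

    Hypothetical helper: it only checks that the attributes are present,
    not that their values match the declared _type.
    """
    required = {attr["_name"] for attr in model["_attributes"]}
    return required.issubset(record.keys())


# A trimmed-down version of the dogs4adoption model (two attributes only).
model = {
    "_entity": "dogs4adoption",
    "_attributes": [
        {"_name": "dogs_name", "_type": "string"},
        {"_name": "neutered", "_type": "boolean"},
    ],
}

print(conforms({"dogs_name": "Rex", "neutered": True}, model))  # True
print(conforms({"dogs_name": "Rex"}, model))                    # False
```

Extra keys in the record are tolerated here, which is one possible reading of the rule; a stricter check could reject them.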

An entity has one or more attributes. Each attribute is described by the following characteristics:

_name [string]

This is the name of the attribute. _name is URL-safe and contains only alphanumeric characters and “_” (underscore). The reason for such strict rules is that _name plays the role of an ID. Personally, I think this should be just a random hash, but readability is also important.
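A minimal sketch of that naming rule, assuming “alphanumeric” means ASCII letters and digits (the function name `is_valid_name` is hypothetical):

```python
import re

# Assumption: only ASCII letters, digits and underscore are allowed,
# and an empty name is not valid.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` can be used as an attribute _name."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("dogs_name"))   # True
print(is_valid_name("dog's name"))  # False
```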

_type [string]

_type is either a primitive type like int, string, double, or boolean, or another entity, which will itself be built out of primitives (or yet other entities).

When you refer to another entity you are effectively connecting entities and creating relationships between them.
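To make the primitive/entity split concrete, here is a small Python sketch. The “entity:” prefix is taken from the breed attribute in the example; treating any other non-primitive name as an entity reference is my assumption, and the primitive set contains only the four types named above (it would need extending for types like “image”). The function names are hypothetical.

```python
from typing import List, Tuple

# Only the primitives named in the text; extend as needed.
PRIMITIVES = {"int", "string", "double", "boolean"}
ENTITY_PREFIX = "entity:"


def classify_type(type_name: str) -> Tuple[str, str]:
    """Return ("primitive", name) or ("entity", referenced entity name)."""
    if type_name in PRIMITIVES:
        return ("primitive", type_name)
    if type_name.startswith(ENTITY_PREFIX):
        return ("entity", type_name[len(ENTITY_PREFIX):])
    # Assumption: any other non-primitive name refers to an entity.
    return ("entity", type_name)


def relationships(model: dict) -> List[Tuple[str, str]]:
    """List (attribute name, referenced entity) pairs for a model."""
    pairs = []
    for attr in model["_attributes"]:
        kind, target = classify_type(attr["_type"])
        if kind == "entity":
            pairs.append((attr["_name"], target))
    return pairs


model = {
    "_entity": "dogs4adoption",
    "_attributes": [
        {"_name": "dogs_name", "_type": "string"},
        {"_name": "breed", "_type": "entity:breeds"},
    ],
}

print(relationships(model))  # [('breed', 'breeds')]
```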

_source (optional) [URI]

While _type refers to the entity metadata, _source refers to the data itself. That is the logical/physical duality of entities: an entity represents not only a description of the type but also all the stored records that conform to that type (in whatever permanent storage our application currently lives in).

_source is by default equal to _type and is almost never included in the data model… unless it needs to be different. Why would it be different? Maybe I want to refer to the dog breed list maintained by the American Kennel Club. My local _type: breed would still be described in the model, but the interfaces could use the AKC list to auto-suggest or validate the entries.
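The fallback rule is short enough to sketch in Python. The helper name `effective_source` is hypothetical, and the URL is a placeholder, not a real AKC endpoint:

```python
def effective_source(attribute: dict) -> str:
    """Return _source if the attribute declares one, otherwise its _type."""
    return attribute.get("_source", attribute["_type"])


local_breed = {"_name": "breed", "_type": "entity:breeds"}
external_breed = {
    "_name": "breed",
    "_type": "entity:breeds",
    # Placeholder URI for an externally maintained breed list.
    "_source": "https://example.org/breeds",
}

print(effective_source(local_breed))     # entity:breeds
print(effective_source(external_breed))  # https://example.org/breeds
```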

_arrange (optional) [int]

Determines the order in which the attributes are arranged in any given representation. This somewhat banal property is irrelevant for stack layers like the DB or the API but extremely important for the front end.

_level (optional) [int]

Determines how important this attribute is in relation to the others. Attributes with lower numbers are more important. E.g., if I need to show a preview of an item that conforms to this type, I would show only those attributes with level 1.
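Here is a sketch of how a front end might combine _arrange and _level to build such a preview. The function name `preview_fields` is hypothetical, and since both properties are optional I’m assuming a missing _level counts as 1 and a missing _arrange sorts last:

```python
def preview_fields(model: dict, max_level: int = 1) -> list:
    """Return attribute names up to `max_level`, ordered by _arrange."""
    visible = [
        attr for attr in model["_attributes"]
        if attr.get("_level", 1) <= max_level  # assumption: default level is 1
    ]
    # Assumption: attributes without _arrange go to the end.
    ordered = sorted(visible, key=lambda attr: attr.get("_arrange", float("inf")))
    return [attr["_name"] for attr in ordered]


# Attributes deliberately listed out of order.
model = {
    "_entity": "dogs4adoption",
    "_attributes": [
        {"_name": "short_description", "_arrange": 4, "_level": 2},
        {"_name": "gender", "_arrange": 3, "_level": 1},
        {"_name": "dogs_name", "_arrange": 1, "_level": 1},
        {"_name": "breed", "_arrange": 2, "_level": 1},
    ],
}

print(preview_fields(model))  # ['dogs_name', 'breed', 'gender']
```

Calling `preview_fields(model, max_level=2)` would include short_description as well, which is roughly what a detail page would want.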

_widget (optional) [string, URI]

The ideal representation of this attribute. E.g., in interfaces, an address would have a “map” widget while a picture would have an “image_gallery” widget. Ideally this is a URI that identifies that representation.

_default (optional) [string]

The value that should be inserted in the attribute if none is specified.
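A possible way to apply defaults when a record comes in, sketched in Python (the helper name `apply_defaults` is hypothetical; attributes without a _default are simply left out):

```python
import copy


def apply_defaults(record: dict, model: dict) -> dict:
    """Return a copy of `record` with missing attributes set to their _default."""
    filled = dict(record)
    for attr in model["_attributes"]:
        if "_default" in attr and attr["_name"] not in filled:
            # Deep copy so mutable defaults are never shared between records.
            filled[attr["_name"]] = copy.deepcopy(attr["_default"])
    return filled


model = {
    "_entity": "dogs4adoption",
    "_attributes": [
        {"_name": "dogs_name", "_type": "string", "_default": ""},
        {"_name": "neutered", "_type": "boolean", "_default": False},
        {"_name": "gender", "_type": "genders"},  # no _default declared
    ],
}

print(apply_defaults({"dogs_name": "Rex"}, model))
# {'dogs_name': 'Rex', 'neutered': False}
```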

_semantic (optional) [URI]

You know… for robots to know what I’m talking about. This attribute should generally contain a URI pointing to a semantic dictionary or similar.

_cardinality (optional) [“single”, “multiple”]

Determines whether it is OK to have more than one of what the attribute specifies. E.g., for dogs it is OK to have more than one breed (“multiple”), but they can have only one gender (“single”).
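A minimal sketch of a cardinality check in Python. I’m assuming that “multiple” values are stored as a list (even when there is only one), that “single” values are never lists, and that an omitted _cardinality means “single”; the function name is hypothetical.

```python
def check_cardinality(value, attribute: dict) -> bool:
    """Return True if the shape of `value` matches the attribute's _cardinality."""
    # Assumption: "single" is the default when _cardinality is omitted.
    cardinality = attribute.get("_cardinality", "single")
    is_list = isinstance(value, list)
    return is_list if cardinality == "multiple" else not is_list


breed = {"_name": "breed", "_cardinality": "multiple"}
gender = {"_name": "gender", "_cardinality": "single"}

print(check_cardinality(["labrador", "poodle"], breed))  # True
print(check_cardinality(["male", "female"], gender))     # False
```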

==========================

In the next post I’ll write about how a Document DB is a perfect match to implement this.
