An Introduction to Modeling and Language Engineering — Christmas Edition

Published in itemis, Dec 21, 2017

It’s Christmas time again. Time to work less and play more. As a kid, a parent or just for the fun of it. Christmas is also the time of meeting friends and family and answering questions like “How is your job?” and “What does itemis do?”.

With modeling and language engineering being the core of many itemis projects and activities, let me try to introduce some modeling basics with the help of games and toys — hopefully making it a bit more accessible and fun!

About Models, Abstractions and Meta Models

Already back in 1976 the statistician George Box figured it out: “All models are wrong!”
The Playmobil® horse was never a real horse, the LEGO® spaceship was not a real spaceship and Super Mario® and Lara Croft were not real either. But we didn’t have a problem with them not being real because they were perfectly useful to play with.

Let’s take the Millennium Falcon from the Star Wars universe as an example to explain a few things that we deal with at itemis every day: models, abstractions and meta models.

For those born before 1932 or after 2014, this is a picture of the Millennium Falcon.

(Picture by William Warby: https://www.flickr.com/photos/wwarby/31618033034)

It is obvious that a “real” Millennium Falcon is a pretty complex system: hyperdrive, weapons, hiding places and so on. Spaceships are very big and very expensive to build.

So in order to model a spaceship as a useful toy, we have to introduce abstractions. The abstractions define the specific aspects that the model encompasses. This reduces complexity and also makes the model affordable.

Here are a few examples of how to model the Millennium Falcon and what the main characteristics of the models are:

A LEGO® model: consists of colored plastic components that can be mechanically attached to one another, representing the rough shape and color of the spaceship. It is a lot smaller than a real spaceship. It can’t fly.
A cast-iron model: made of metal like a real spaceship. No moving parts to play with. Does not fly.
A CAD computer model: digital model of the structure and geometry of the individual components of the spaceship and how they relate to each other. Still can’t fly.
A spaceship model in a Star Wars computer game: a digital model that looks exactly like the real thing and includes the behavior of the real Millennium Falcon. And yes, it can fly!

All of the above models are useful in different scenarios and for different types of users. Hundreds of other models exist, serving different purposes and focusing on different aspects of the physical structure, the components or the behavior of the Millennium Falcon. Think of photos, puzzles or life-size models used in the film studios during the production of the Star Wars movies.

Take away points:

Models are very useful representations of our reality: we use them all the time.
Models reduce complexity by introducing abstractions.

Now that we have an idea of what models are, we can look at another important modeling aspect: meta models.

Meta models define the elements that can be used in the modeling process and how they relate to one another. In the world of LEGO® the meta model defines the type of building blocks and the way they can be connected.

Meta models can also be seen as the rules that are associated with building the model. Strong meta models enforce strict rules and boundaries when creating a model. Weaker meta models give more freedom and apply fewer restrictions.

Creating a clay model of a spaceship gives maximum freedom and creativity in the modeling process. The shape, the size and the details of the ship can vary greatly. The clay does not restrict or guide the modeling process. The clay meta model is weak.

A 500-piece puzzle of our Millennium Falcon, in contrast, has the strongest possible meta model. There is only one way to use the puzzle correctly. There is no room for creativity in the process. The puzzle’s meta model automatically helps you to “get it right”, as missing or wrongly placed parts can easily be spotted.

The LEGO® model is in between the clay model and the puzzle model in terms of the degree of freedom and the number of limiting rules. It allows the player to combine the available parts in a very flexible way while at the same time enforcing some basic rules: e.g. bricks only fit together in a certain orientation.
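The difference between a weak and a stronger meta model can be sketched in a few lines of Python. This is only an illustration: the class names, the set of allowed brick kinds and the orientation rule are assumptions made up for this example and do not describe any real LEGO® specification or itemis tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    kind: str       # e.g. "2x4" (hypothetical brick kinds)
    studs_up: bool  # orientation of the brick


class ClayModel:
    """Weak meta model: anything can be added, nothing is checked."""

    def __init__(self):
        self.parts = []

    def add(self, part):
        self.parts.append(part)


class BrickModel:
    """Stronger meta model: only known brick kinds in the right orientation."""

    ALLOWED_KINDS = frozenset({"2x2", "2x4", "plate"})  # assumed for this sketch

    def __init__(self):
        self.parts = []

    def add(self, part):
        # The rules of the meta model are enforced while the model is built.
        if not isinstance(part, Brick):
            raise TypeError("only Brick instances are allowed")
        if part.kind not in self.ALLOWED_KINDS:
            raise ValueError(f"unknown brick kind: {part.kind!r}")
        if not part.studs_up:
            raise ValueError("bricks only connect with studs pointing up")
        self.parts.append(part)
```

The clay model accepts a blob, a number or a brick alike, while the brick model rejects anything that breaks its rules at the moment it is added. That early feedback is the support a stronger meta model offers in exchange for less freedom.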

What is more fun to play with? That only depends on personal preference.

What is more fun to work with? That depends on the task at hand.

If the model is small and easy enough to fully understand all dependencies and side effects, then a weak meta model with hardly any rules might be just fine. Think of Microsoft Word® or PowerPoint® as the “clay of system modeling”, with no real limitation on the author’s creativity.

In a more complex world with thousands of requirements, product variations and many sometimes conflicting forces such as cost, performance, safety and security, you might want to give up some “creative freedom” and in return get some support from the system to ensure the model’s correctness or completeness.

Take away points:

Meta models define the underlying model elements and how they relate to each other. In a very real sense, they define the “language” you can use to express your models.
The structures and rules enforced by the meta model might limit the degrees of freedom of the user but in return can help to “get the model right”.

Model Transformations

Transformers® make good Christmas presents. They have the ability to transform from a car or a truck into a robot, and then back again. This is a very cool trick!

System models can also be transformed from one model to another model. Depending on the underlying meta models, this transformation might or might not work equally well in both directions.

LEGO® models again can serve as an example. If one model consists of big, black bricks it can easily be transformed into a model consisting of smaller, colorful bricks without losing any previously modeled information like the physical shape of the modeled object.

The transformation in the opposite direction might not work equally well as the larger bricks are limited in their ability to form smaller shapes and black bricks can’t represent different colors.
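The two directions can be sketched in Python as follows. The brick classes and the 2x2 size ratio are assumptions chosen for this illustration only: one big black brick covers the area of four small bricks, and small bricks can have any color.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallBrick:
    x: int
    y: int
    color: str


@dataclass(frozen=True)
class BigBrick:
    # Covers a 2x2 block of small-brick positions and is always black.
    x: int  # position in big-brick grid units
    y: int


def big_to_small(big_bricks):
    """Each big black brick becomes four small black bricks; nothing is lost."""
    return {
        SmallBrick(2 * b.x + dx, 2 * b.y + dy, "black")
        for b in big_bricks
        for dx in (0, 1)
        for dy in (0, 1)
    }


def small_to_big(small_bricks):
    """Approximate small bricks with big ones; color and fine shape are lost."""
    return {BigBrick(s.x // 2, s.y // 2) for s in small_bricks}
```

Going from big to small and back returns the original model. Starting with a single small red brick, going to big and back yields four black bricks instead: the target meta model simply has no way to express the color or the finer shape.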

Of course, at itemis we typically don’t play with LEGO®. We deal with models like UML, SysML, EMF, Franca or AUTOSAR. Very often it is useful to transform models. Every software language consists of specific notations, syntax and grammar (= meta model). The programs written in a software language are models. Because of this, model transformation can help us to pull off a very useful trick: it can help us to generate software code from other models like EMF or Franca. This is usually referred to as “Code Generation”.

Yes, and sometimes we actually do play with LEGO®.

(itemis LEGO® model and instruction manual courtesy of Mathis Birken)
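To make the code generation idea from this section a bit more concrete, here is a minimal Python sketch. It does not use EMF or Franca. The Entity and Attribute classes are a hypothetical, hand-made meta model, and generate_class is an invented helper that transforms one such model into Python source text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    type_name: str  # name of a Python type, e.g. "int"


@dataclass(frozen=True)
class Entity:
    name: str
    attributes: tuple  # tuple of Attribute


def generate_class(entity):
    """Model-to-text transformation: turn an Entity into Python source code."""
    params = ", ".join(f"{a.name}: {a.type_name}" for a in entity.attributes)
    signature = f"self, {params}" if params else "self"
    lines = [f"class {entity.name}:", f"    def __init__({signature}):"]
    if entity.attributes:
        lines += [f"        self.{a.name} = {a.name}" for a in entity.attributes]
    else:
        lines.append("        pass")
    return "\n".join(lines) + "\n"


# The source model: a spaceship with two attributes.
spaceship = Entity(
    "Spaceship",
    (Attribute("name", "str"), Attribute("crew_size", "int")),
)
source = generate_class(spaceship)
print(source)
```

The generated text is itself a model, written in the “language” of Python. Real generators work on much richer meta models, but the principle is the same: walk the source model and emit the corresponding elements of the target language.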

About Models and Languages

As we have seen, modeling always implies the presence of an underlying meta model. In the LEGO® world the meta model includes the various types, shapes and colors of the building blocks and the principles of how they can be connected. So the meta model of any given LEGO® set is defined by the types of pieces provided in the box and the principles of how the pieces can be connected. The pieces and the connection mechanisms can be seen as the “language” of any given LEGO® set.

In software engineering models are often created based on text-based programming languages. In this case the meta model or “language” defines the syntax, the grammar and the possibility to include functions in the process of modeling.

Programming languages do not only vary in terms of their syntax and grammar, they are also different on the level of the underlying features and concepts.

C is often deployed on specialized hardware like microcontrollers. These controllers have limited resources and are often used in vehicles, industrial systems or home automation. The concepts and the structure of C mirror the hardware instruction set and are optimized to be resource efficient.

Java, in contrast, is a higher-level, object-oriented programming language. It typically does not run on resource-restricted microcontrollers but on servers with plenty of resources. Since Java programs are often bigger than software written in C, Java includes higher-level concepts that help to structure and reuse software functions, e.g. classes that can inherit properties and behavior from other classes.

SQL — Structured Query Language — is another type of programming language. SQL is designed for one specific purpose: handling information stored in relational databases. The syntax and the provided functions therefore are tailored to the domain of relational databases.

While being different in scope and structure, C and Java both fall in the category of general purpose languages. They are not designed to solve one particular type of problem but can be used across many domains. Languages like C++, C# or Python fall in the same category.

SQL in contrast is tailored to one group of tasks in one domain: the query and manipulation of relational databases. It therefore falls in the category of Domain Specific Languages or DSLs. Other DSLs are HTML or MATLAB.
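The contrast can be sketched with Python’s built-in sqlite3 module. The same question is answered twice: once in general purpose style and once with an SQL query. The table layout, the set names and the piece counts are invented for this example.

```python
import sqlite3

# Invented toy data: (name, theme, pieces). The piece counts are made up.
SETS = [
    ("Millennium Falcon", "Star Wars", 1000),
    ("Pirate Ship", "Pirates", 500),
    ("X-Wing", "Star Wars", 400),
]


def star_wars_sets_python(rows):
    """General purpose style: spell out how to filter, sort and project."""
    matching = [row for row in rows if row[1] == "Star Wars"]
    matching.sort(key=lambda row: row[2], reverse=True)
    return [row[0] for row in matching]


def star_wars_sets_sql(rows):
    """DSL style: state what is wanted and let the database work out how."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sets (name TEXT, theme TEXT, pieces INTEGER)")
        conn.executemany("INSERT INTO sets VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT name FROM sets WHERE theme = ? ORDER BY pieces DESC",
            ("Star Wars",),
        )
        return [name for (name,) in cursor]
    finally:
        conn.close()
```

Both functions return the same list of names. The SQL version reads almost like the question itself, which is the point of a language tailored to one domain, but it would be of little help for anything outside relational data.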

What is the equivalent of general purpose languages in the world of LEGO®? Well, when I grew up there were only 5 or 6 LEGO® colors available, and the shapes of the blocks were mostly rectangular. LEGO® figurines looked very generic. Building a spaceship required some creativity and some willingness to compromise.

Today there is a variety of Domain Specific LEGO® “DSLs” available. Pirates with their ships, policemen including German Shepherds — and even our Millennium Falcon.

(Picture by Kim Do-hyun: https://www.flickr.com/photos/stickkim/6966161674)

Note the custom parts used in the spaceship and the custom figurines and their accessories. With a very specific task at hand — building a Millennium Falcon — custom parts like antennas or cockpit windows make the modeling process a lot more efficient and the results more impressive.

Building a LEGO® Millennium Falcon in 1976 would have been a much more challenging task. I wonder whether anybody would have recognized Chewbacca with a red top and blue pants…

So who gets to enjoy the advantages of Domain Specific Languages, be it in the realm of computer languages, system engineering or LEGO®? Well, not everybody.

In the past there had to be a significant number of potential users (= market size) among programmers, engineers or gamers to justify the development and maintenance of any DSL. Consequently, DSLs often maintained a rather broad and technical character: the common denominator of the targeted group of experts.

So if you belong to the group of specialists that need to write queries for relational databases (SQL) or if you are a Star Wars fan you are lucky and you get your DSL.

However, if your area of specialization or interest is “niche”, you have to make do with general purpose languages and go back to using square and rectangular LEGO® bricks.

Take away points:

The modeling language and the meta model are closely related. They define the functions and concepts that can be expressed by the model.
Domain Specific Languages (DSLs) provide customized notations, syntax and meta models to efficiently express functions and concepts for specific domains.
In the past, creating DSLs for computer languages or systems was cumbersome and expensive and therefore limited to larger user groups and/or broad, general domains.

Language Workbenches

Now that the advantages of DSLs have been established, the question is how to give more experts and specialists access to more efficient ways of modeling in their domains.

Luckily we finally have arrived in the age of 3D printing!

Licensing and copyright issues aside, a 3D printer in conjunction with the matching software would allow LEGO® players to define and print any LEGO® parts of their choice. Unicorns instead of horses: simply print them! Star Trek instead of Star Wars figurines: just print them!

The new parts now can be engineered exactly to the player’s needs.

Fortunately, with respect to creating Domain Specific Languages in modeling and software engineering, there also have been significant advances in the past years.

Martin Fowler’s concept of Language Workbenches (LWB) is already more than a decade old. In the meantime, his vision and ideas have become reality.

Language Workbenches are software tools that are designed to efficiently build new programming languages including their syntax, grammar and underlying concepts (= meta models).

Language Workbenches are the “3D printers of software engineering”. They allow the efficient creation of DSLs — even for very small groups of experts with very specific modeling needs. The user group for DSLs can now be as small as one team in one department.

Today, a number of Language Workbenches are available. One category focuses on a textual modeling approach. A popular example of a textual Language Workbench is Xtext.
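To give a feeling for what a small textual DSL involves, here is a hand-written Python sketch. It is not how Xtext works internally: a textual Language Workbench derives the parser and the editor from a grammar definition, whereas this example hard-codes two regular expressions. The mini language (set, part, x) and all names are invented for this illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical mini DSL, invented for this sketch:
#   set <Name> {
#     part <name> x <count>
#   }
SET_HEADER = re.compile(r"set\s+(\w+)\s*\{")
PART_LINE = re.compile(r"part\s+(\w+)\s+x\s+(\d+)")


@dataclass
class BrickSet:
    name: str
    parts: dict = field(default_factory=dict)  # part name -> count


def parse(text):
    """Turn DSL text into a model; reject lines the grammar does not allow."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected 'set <Name> {' followed by '}'")
    header = SET_HEADER.fullmatch(lines[0])
    if header is None or lines[-1] != "}":
        raise ValueError("expected 'set <Name> {' followed by '}'")
    model = BrickSet(header.group(1))
    for line in lines[1:-1]:
        part = PART_LINE.fullmatch(line)
        if part is None:
            raise ValueError(f"not a valid part line: {line!r}")
        model.parts[part.group(1)] = int(part.group(2))
    return model
```

A text such as "set Falcon {", "part cockpit x 1", "part wing x 2", "}" (one statement per line) is turned into a BrickSet model, while a line like "paint it gold" is rejected. The syntax is the notation the user sees, and the BrickSet class plays the role of the meta model behind it.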

Projectional Language Workbenches, on the other hand, are not limited to one type of notation. Instead, they can flexibly project the content of models into any type of representation according to the user’s needs and preferences. JetBrains’ Meta Programming System (MPS) is today the most popular projectional Language Workbench.

Projectional LWBs store model data in trees. Such a tree is also referred to as an Abstract Syntax Tree (AST). The modeling information is stored in the elements of the AST and can then be projected to any notation: a text, a table, a mathematical formula or any form of graphical representation. As part of the projection process, the system can highlight certain aspects of the underlying model and hide other aspects. The user can flexibly pick and choose the abstractions relevant for the task at hand: “abstractions on demand”.
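The idea of one tree with several notations can be sketched in Python. This is only an illustration of the principle and not how MPS is implemented. The node classes and the two projection functions are assumptions made for this example: the same small arithmetic AST is shown once as plain text and once as a LaTeX formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str        # "+", "-", "*" or "/"
    left: object   # Num or BinOp
    right: object  # Num or BinOp


def project_text(node):
    """Projection 1: fully parenthesized plain text."""
    if isinstance(node, Num):
        return str(node.value)
    return f"({project_text(node.left)} {node.op} {project_text(node.right)})"


def project_latex(node):
    """Projection 2: LaTeX, where division is rendered as a fraction."""
    if isinstance(node, Num):
        return str(node.value)
    left, right = project_latex(node.left), project_latex(node.right)
    if node.op == "/":
        return rf"\frac{{{left}}}{{{right}}}"
    return f"({left} {node.op} {right})"


# One model, stored once as a tree ...
ast = BinOp("/", BinOp("+", Num(1), Num(2)), Num(3))

# ... and two different notations of it.
print(project_text(ast))
print(project_latex(ast))
```

Neither notation is “the” model. Both are views computed from the tree, so an edit to the tree shows up in every projection, and further views such as a table or a diagram can be added without touching the stored data.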

Let’s have a brief look at one real world example.

Security experts analyze and evaluate the security properties of technical systems. This could be a vehicle or an IoT device. In order to do that, they first have to understand the basic structure and the main functions of the system. Here, they work closely with the engineering team that is familiar with the components, interfaces and data associated with the system. Based on this structure, security goals, attack vectors, damage potentials and propagation paths have to be modeled in an iterative way. Identified risks are then mitigated, e.g. by adding encryption and the required keys. This in turn has an effect on the original architecture and functions of the system.

While it is not important to fully understand every detail of the specific workflow above, the modeling challenges hopefully become clear. We have to define a model that designers, architects and security engineers can work on simultaneously while focusing on different aspects of the model. And, different tasks within the process benefit from different notations: graphs, tables, text, charts etc.

What a DSL in a customized editor built with a projectional language workbench could look like is depicted below. Please note that these are different projections of ONE model emphasizing different abstractions for different steps in the analysis phase incorporating a specific DSL for the security domain.

(Multiple notations / projections used in ONE security model)

Take away points:

Language Workbenches are tools to efficiently build Domain Specific Languages. They are the “3D Printers of software engineering”.
Projectional Language Workbenches give advanced modeling options as they can provide different views and notations on the same model.

So what is the job of a Language Engineer?

Language Engineers solve challenging problems in the realm of system or software modeling by creating Domain Specific Languages and models.

Often these problems are the result of increased complexity and the inability to manage this complexity with the established software tools and methods.

The reasons for the increased complexity vary: a growing number of functions and product variants, stricter regulatory documentation needs, or the requirement to analyse the dependencies of cross-cutting concerns like cost, performance, safety and security along the entire product development process.

Often weak meta models, e.g. requirements written in plain prose text, and the missing tool support for the experts “to get their models right” are the root cause of the need for better methods and tools.

Language Engineers are typically part of a multidisciplinary team that includes team members and domain experts from the customer side. This means that language engineers get deep insights into many different domains: from automotive to industrial automation, medical, insurance or telecommunication. Language engineers also have to learn and understand the dependencies between the work products of different user groups on the customer side. Jointly with the domain experts, language engineers then help find the right abstractions, including the syntax, grammar and functions, to model all relevant aspects of a domain. Language engineers are usually also involved in designing and building the tools that allow the domain experts to create and work with the models.

The DSLs and the tool created for the purpose of the security analysis above are a typical example of the work of language engineers. Security methodology experts, potential users from different customers and the itemis language engineers defined the model, including the relevant abstractions, suitable notations and projections. itemis then built the editor, in this case based on the projectional LWB MPS from JetBrains, to model, analyze and document security analysis projects.

If DSLs and language engineering sound like viable business approaches or exciting personal challenges for you for 2018, feel free to reach out and find out more.

But for the days ahead: work less and play more! Perhaps with LEGO®.

Enjoy the festive season and have a happy and healthy 2018!

Originally published at blogs.itemis.com.