Using a Graph Database to Explore Your ArchiMate Model

Ljubica Lazarevic
Mar 18 · 9 min read

Explore your enterprise architecture assets with Neo4j. *Republished to Medium with updates for GDS*

There is a lot of potential benefit in importing and interrogating your enterprise architecture components in a graph database: you can easily carry out extensive analysis of dependencies and redundancies, as well as what-if and root cause analysis. ArchiMate is a commonly used framework for representing different architectures; it is well defined, used by many organisations, and templates are available in many diagramming packages.

There are a number of great blogs showing the power of doing just this in Neo4j.

The challenge in getting started is often the move from existing modelling tools and artefact repositories to a graph database. Fortunately, this is relatively straightforward with Neo4j. In this blog we are going to cover, step by step, how to:

  • Import an existing ArchiMate diagram into Neo4j using either the Archi database plugin, or Neo4j Cypher queries
  • Export data from Neo4j back out again
  • Query the diagram, from basic checks such as lone elements and connectivity through to graph algorithms that look at element relationship strengths

The diagramming package we use to explore the ArchiMate diagram is Archi, an open source modelling tool.

*Updated blog post*

You can now follow along in Neo4j sandbox with no downloads necessary! Read on to find out more.

Getting started — ArchIsurance Case Study

This is a great example which shows how complex your architecture can become. It’s difficult to view the entire model at once, as it conveys a lot of information, so it is necessary to explore the data using the different views that have been created.

We are going to import the entire case study into Neo4j and start to interrogate the different elements of the model, beginning with component dependencies and moving on to the graph algorithm procedures now available to look for connectedness within the diagram.

Importing data into Neo4j

  • Via the Archi database plugin
  • Via the Neo4j CSV importer

Via the Archi database plugin

The Archi database plugin is available from here and version 2.07 supports data exports to Neo4j. Once the plugin has been downloaded and copied into the Archi plugin folder, and the details for your running Neo4j instance have been entered, it is simply a matter of exporting the model into Neo4j. Calling up the Neo4j browser and running MATCH (n) RETURN * will show the loaded data:

The export tool stores the element names and relationship types within the properties of the respective nodes and relationships. For example:

  • Relationship type is stored as a property on the relationship, e.g. [:relationships {class:"TriggeringRelationship"}]
  • Element type is stored as a property on the node, e.g. (:elements {class:"BusinessRole"})
  • Element name is also stored as a property on the node, e.g. (:elements {name:"Customer files service"})
  • Any properties attached to elements are represented as additional nodes linked by a “hasProperty” relationship, with each property held in the name and value properties of its own node

Via the Neo4j CSV importer

We can also use the LOAD CSV tool available in Cypher to load our ArchiMate model. Firstly, in Archi, choose “Export → Model to CSV”. This will create three CSV files:

  • elements.csv
  • properties.csv
  • relations.csv

I’ve taken the liberty of making the export available on my GitHub repo, so you can follow along using a blank Neo4j Sandbox. None of the Cypher queries below require you to download anything. Loading the data this way will produce the same output in Neo4j as the Archi export plugin.
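
Before loading, it can help to sanity-check what an export contains. The following is a minimal Python sketch, not part of the original workflow: it counts elements by type in an elements.csv-style string. The ID, Type, Name and Documentation columns are the ones used by the load query; the three sample rows and the count_by_type helper are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row sample in the shape of elements.csv; a real export
# has one row per element with the same four columns.
SAMPLE_ELEMENTS_CSV = '''"ID","Type","Name","Documentation"
"e1","BusinessRole","Customer",""
"e2","BusinessService","Claims handling","Handles incoming claims"
"e3","BusinessService","Customer information",""
'''


def count_by_type(csv_text):
    """Return a Counter of element types found in an elements.csv-style string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Type"] for row in reader)


type_counts = count_by_type(SAMPLE_ELEMENTS_CSV)
for element_type, count in type_counts.most_common():
    print(f"{element_type}: {count}")
```

To try this against a real export, read the file contents into a string and pass that to count_by_type instead of the sample.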

Load the elements data:

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/lju-lazarevic/misc/main/archi/elements.csv' AS line
CREATE (:elements {class:line.Type, name:line.Name, documentation:line.Documentation,
id:line.ID })

Load relations info:

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/lju-lazarevic/misc/main/archi/relations.csv' AS line
// Look up both endpoints by id, restricted to the elements label created above
MATCH (n:elements {id:line.Source})
MATCH (m:elements {id:line.Target})
// Rows whose Source or Target has no matching node create nothing
CREATE (n)-[:relationships {id:line.ID, class:line.Type, documentation:line.Documentation,
name:line.Name}]->(m)
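
One thing to be aware of with the relations load: a row whose Source or Target does not match any node's id simply produces no relationship, without an error. As a rough pre-flight check, the Python sketch below lists such rows. The find_dangling_relations helper and the sample rows are hypothetical; only the ID, Type, Source and Target column names come from the queries above.

```python
def find_dangling_relations(element_ids, relations):
    """Return the relation rows whose Source or Target is not a known element ID."""
    known = set(element_ids)
    return [
        row for row in relations
        if row["Source"] not in known or row["Target"] not in known
    ]


# Hypothetical sample data: r2 points at a missing element and r3 starts
# from something that is not an element, so both would be skipped on load.
element_ids = ["e1", "e2", "e3"]
relations = [
    {"ID": "r1", "Type": "ServingRelationship", "Source": "e1", "Target": "e2"},
    {"ID": "r2", "Type": "TriggeringRelationship", "Source": "e2", "Target": "e9"},
    {"ID": "r3", "Type": "AssociationRelationship", "Source": "r1", "Target": "e3"},
]

dangling = find_dangling_relations(element_ids, relations)
for row in dangling:
    print(f"{row['ID']}: {row['Source']} -> {row['Target']}")
```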

Exporting the model back to Archi

We can export the information back into Archi, again via the CSV function. First we need to create outputs that resemble the CSV format Archi expects. To do this, we recreate the two files we originally imported: run each of the Cypher queries below, then click the export function once the query has executed.

Output the elements in an Archi CSV-friendly format (elements.csv)

MATCH (n)
WHERE n.id <>"null"
RETURN n.id AS ID, n.class AS Type, n.name AS Name, n.documentation AS Documentation

Output the relations in an Archi CSV-friendly format (relations.csv)

MATCH (n)-[r]->(m)
RETURN r.id AS ID, r.class AS Type, r.name AS Name, r.documentation AS Documentation,
n.id AS Source, m.id AS Target

From here, we use the import CSV function in Archi, including the properties.csv file that we didn’t load into Neo4j, and we’re good to go.
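
If you would rather script the export than click through it, the same shape of file can be produced from query results in a few lines. This is a minimal Python sketch under stated assumptions: the rows would come from the elements query above (here they are a hypothetical hard-coded list), rows_to_csv is an illustrative helper, and quoting every field is a guess at what Archi is comfortable with rather than a documented requirement.

```python
import csv
import io

# Column order taken from the RETURN clause of the elements export query.
ELEMENT_COLUMNS = ["ID", "Type", "Name", "Documentation"]


def rows_to_csv(rows, columns):
    """Serialise a list of dicts to CSV text with a header row, quoting every field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Missing or null values become empty strings.
        writer.writerow({column: (row.get(column) or "") for column in columns})
    return buffer.getvalue()


# Hypothetical query result; the second row has no documentation.
rows = [
    {"ID": "e1", "Type": "BusinessRole", "Name": "Customer", "Documentation": "A client"},
    {"ID": "e2", "Type": "BusinessService", "Name": "Claims handling", "Documentation": None},
]

elements_csv = rows_to_csv(rows, ELEMENT_COLUMNS)
print(elements_csv)
```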

Exploring ArchIsurance in Neo4j

As we’re going to use some targeted queries in the examples looking at particular layers, the following query will take the ArchiMate element types (from the class property on the nodes) and apply them as labels to the nodes, courtesy of APOC:

MATCH (n:elements)
CALL apoc.create.addLabels(n, [ n.class ]) YIELD node
RETURN node

We can also change the relationship type to the ArchiMate relation (currently held as [:relationships {class:"reltype"}]) via a Cypher query if we wish to do so.

As we’ve now added new labels to the nodes, we can colour them up in the browser, which helps us to visualise the different layers of the elements:

Relationship directions between nodes are a powerful concept in Neo4j, especially when it comes to interrogating our ArchIsurance model: we can ignore the direction when we’re understanding the strength of connections between elements, and we can use it to understand underpinning dependencies for components within the layers.

The queries below answer questions we have about our model, ranging from consistency issues, such as forgetting to add relationships, to practical considerations, such as identifying points of failure and understanding component dependencies.

Q1: Have we got any elements that haven’t been linked in?

This question seeks to answer whether we have any elements in our ArchiMate diagram that we may have forgotten to link to other elements. We achieve this by searching for nodes without relationships:

MATCH (n)
WHERE NOT (n)--()
RETURN n.name, n.class

Q2: Are there any application elements that don’t link from other applications or lower layers?

Building up from the previous question, we are now looking to see whether there are any application-level components that are missing upward relationships from other elements, most likely in the technology layer:

MATCH (n)
WHERE (n:ApplicationComponent OR n:ApplicationService OR n:ApplicationFunction)
AND NOT (n)<-[]-()
RETURN n.name, n.class

Q3: What business services are dependent on the “Policy Data Management” application component?

Now we are specifically interested to understand what business services have dependencies on an application component, even though they are not directly joined:

MATCH (bizServ {class:"BusinessService"})<-[*1..10]-({name:"Policy Data Management"})
RETURN DISTINCT bizServ.name

Note that because we’ve added labels to the nodes, the following query also returns the same result:

MATCH (bizServ:BusinessService )<-[*1..10]-({name:"Policy Data Management"})
RETURN DISTINCT bizServ.name

Q4: What elements are impacted if technology service “Customer File Service” stops working?

This question seeks to understand our dependencies and the knock-on effects of the technology service not being available:

MATCH (n)<-[*1..10]-(:TechnologyService {name:"Customer File Service"})
RETURN DISTINCT n.class, n.name
ORDER BY n.class, n.name
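
To make the variable-length pattern in the query above a little more concrete, here is a minimal Python sketch of the same idea: start from one element and follow outgoing relationships for between 1 and 10 hops, collecting everything reached. The impacted_elements function and the toy graph are hypothetical illustrations, not data from the ArchIsurance model, and Neo4j evaluates the pattern differently under the hood.

```python
from collections import deque


def impacted_elements(outgoing, start, max_hops=10):
    """Return the elements reachable from start via 1..max_hops outgoing relationships."""
    reached = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # hop limit reached, do not expand further
        for neighbour in outgoing.get(node, []):
            if neighbour not in reached:
                reached.add(neighbour)
                frontier.append((neighbour, hops + 1))
    return reached


# Hypothetical toy graph: each key points at the elements it has
# outgoing relationships to.
outgoing = {
    "File service": ["Data app"],
    "Data app": ["Policy service"],
    "Policy service": ["Customer"],
    "Other server": ["Other app"],
}

impacted = impacted_elements(outgoing, "File service")
print(sorted(impacted))
```

Lowering max_hops narrows the blast radius: with max_hops=1 only the directly connected element is returned.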

As we’ve represented the ArchIsurance ArchiMate model as a graph, we can use graph algorithms to find heavily connected elements that may not necessarily be immediately obvious. To do this, we will use the Neo4j Graph Data Science Library.

Q5: What are the most connected elements in my estate?

By understanding which are the most connected components, we can get a rough idea of the importance of that element and whether we wish to split the element out. We use the PageRank algorithm (of Google fame) to do this.

First of all, we’ll create a graph projection (the procedure name here is from GDS 1.x; from GDS 2.0 onwards it is gds.graph.project):

CALL gds.graph.create(
'myGraph',
'elements',
'relationships')

And now to run PageRank:

CALL gds.pageRank.stream('myGraph')
YIELD nodeId, score
WITH gds.util.asNode(nodeId) as node, score
RETURN node.name, node.class, score
ORDER BY score DESC
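
For a feel of what the score represents, here is a minimal Python sketch of textbook PageRank using power iteration on a toy graph. It is a simplified illustration, not the GDS implementation, so the numbers will not match what gds.pageRank.stream returns. The pagerank function, the damping factor of 0.85 (the commonly quoted value), the iteration count and the four-node graph are all assumptions made for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_degree = {node: 0 for node in nodes}
    for source, _ in edges:
        out_degree[source] += 1

    score = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing edges is shared evenly.
        dangling = sum(score[node] for node in nodes if out_degree[node] == 0)
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_score = {node: base for node in nodes}
        # Each node passes its score on in equal shares along its outgoing edges.
        for source, target in edges:
            new_score[target] += damping * score[source] / out_degree[source]
        score = new_score
    return score


# Hypothetical toy graph: A and B both point at C, which points at D.
edges = [("A", "C"), ("B", "C"), ("C", "D")]
scores = pagerank(edges)
for name, value in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

The element at the end of the chain scores highest because everything upstream ultimately feeds into it, which is the same intuition we apply when reading the results for the architecture model.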

Q6: What are the most connected application/technology elements in my estate?

We can extend the query above to look specifically at application and technology level elements. The interest here is to see how interconnected these elements are, which starts to give us a feel for dependencies and for risk in case of outages. We’re going to use the labels we’ve attached to the nodes to call up all the elements within the application and technology layers. First of all, let’s add labels for everything that lives in either the application or the technology layer:

Add Application Layer labels:

MATCH (n:elements)
WHERE n:DataObject OR n:ApplicationComponent OR n:ApplicationCollaboration
OR n:ApplicationInterface OR n:ApplicationFunction OR n:ApplicationInteraction
OR n:ApplicationProcess OR n:ApplicationEvent OR n:ApplicationService
CALL apoc.create.addLabels(n, [ "ApplicationLayer" ]) YIELD node
RETURN node

Add Technology Layer labels:

MATCH (n:elements)
WHERE n:Node OR n:Device OR n:SystemSoftware OR n:Path OR n:CommunicationNetwork
OR n:Artifact OR n:TechnologyCollaboration OR n:TechnologyInterface
OR n:TechnologyFunction OR n:TechnologyProcess OR n:TechnologyInteraction
OR n:TechnologyEvent OR n:TechnologyService
CALL apoc.create.addLabels(n, [ "TechnologyLayer" ])
YIELD node
RETURN node

Now that we’ve added the new labels, we can create our new graph projection for our PageRank algorithm (again using the GDS 1.x procedure name; in GDS 2.x this is gds.graph.project.cypher):

CALL gds.graph.create.cypher('apptechGraph',
'MATCH (n) WHERE n:ApplicationLayer OR n:TechnologyLayer RETURN id(n) as id',
'MATCH (n)-->(m) WHERE (n:ApplicationLayer OR n:TechnologyLayer) AND (m:ApplicationLayer OR m:TechnologyLayer) RETURN id(n) as source, id(m) as target')
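
The relationship query in this projection only keeps relationships where both ends sit in the application or technology layer, so anything crossing into another layer is left out of the PageRank calculation. The Python sketch below shows the same filtering on a toy example; the induced_edges helper, the element names and the layer assignments are hypothetical.

```python
def induced_edges(edges, layer_of, wanted_layers):
    """Keep only the edges whose source and target both belong to a wanted layer."""
    return [
        (source, target)
        for source, target in edges
        if layer_of.get(source) in wanted_layers
        and layer_of.get(target) in wanted_layers
    ]


# Hypothetical elements and layers for illustration only.
layer_of = {
    "Server": "TechnologyLayer",
    "Database": "TechnologyLayer",
    "Claims app": "ApplicationLayer",
    "Handle claim": "BusinessLayer",
}
edges = [
    ("Server", "Database"),
    ("Database", "Claims app"),
    ("Claims app", "Handle claim"),
]

apptech_edges = induced_edges(edges, layer_of, {"ApplicationLayer", "TechnologyLayer"})
print(apptech_edges)
```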

And now for PageRank once again:

CALL gds.pageRank.stream('apptechGraph')
YIELD nodeId, score
WITH gds.util.asNode(nodeId) as node, score
RETURN node.name, node.class, score
ORDER BY score DESC

Extending the model

Using a tool like Archi and maintaining the model in Neo4j makes extending the model easy: we can move the ArchiMate model back and forth between the tools without needing to delete or sync other data sources, provided that care is taken with existing elements and the unique identifiers do not change. In the next post we’ll start to explore what this might look like.

Final Words

Geek Culture
