Executable (computational) graphs & diagrams

Pavel Vlasov
Nasdanika
Published in
11 min readJun 24, 2024

--

June 2024 release of Nasdanika Core Graph has introduced a new approach to constructing executable/compute graphs and diagrams. This story provides an overview of the approach and its possible applications. It starts with introducing graph concepts and terminology.

Table of Contents

· What’s in it for me?
· What is a graph
Hello, World!
Computing circumference
· Executable/Computational Graphs
· Metaphors
PCB
Cubicle
City
· API
Graph
Processors & Configs
Connection processors and pass-through connections
Capability processor factory
· Diagrams
· Conclusion

What’s in it for me?

Before delving into the WHAT and HOW, let’s take a look at the WHY!

The approach explained in this story helps me to maintain 31 GitHub repository with 123 Java modules containing 2.5K Java files with 415K lines of code, as well as 27 web sites with about five thousand pages. Generated module graph helps me to understand dependencies between Nasdanika modules. Full module graph also contains external dependencies. I use generated class diagrams and graphs to quickly grasp relationships between model elements. All of this allows me to efficiently navigate between levels of abstraction, and manage complexity.

I also generate sites from diagrams, which are graphs: Nasdanika documentation site, architecture documentation sites such as Internet Banking System. While the Internet Banking System is a demo, I used the approach to document and communicate real-life applications with multiple runtime environments and dozens of components.

So, the approach explained here may help you to increase your personal efficiency and effectiveness, and efficiency and effectiveness of your team or organization.

What is a graph

Wikipedia defines a graph as:

a structure amounting to a set of objects in which some pairs of the objects are in some sense “related”. The objects are represented by abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge (also called link or line)

Nasdanika’s term for vertex is Node and for edge is Connection. Connections are directional.

Many real life things can be represented as a graph, and may have multiple graph representations used for different purposes. Let’s take a look at a couple of examples!

Hello, World!

Let’s take a look at the classic “Hello, World!” and how it can be represented as a graph for different purposes. The diagram below shows a graph of character nodes connected by “Next” connections.

A graph of character nodes with “Next” connections.

Instead of characters we can use tokens representing more than one character as shown below:

“Hello, World!” encoded using cl100K_base encoder
  • 9906 — “Hello”
  • 11 — “,”
  • 4435 — “ World”
  • 0 — “!”

And we can use words:

Words graph

We can modify the words graph by introducing a dictionary. With the dictionary we can count occurrences of each word and navigate to that occurrence in the text (graph):

Words graph with a dictionary

In the above graph we have two types of nodes:

  • Dictionary entry
  • Word in text

And two types of connections:

  • Next connection between words
  • A connection between a word and a dictionary entry

Then we can extend our dictionary with semantic connections — “Hello” is similar to “Hi”, and “World” is similar to “Universe”. Similarity may have a value, e.g. a number between -1 and 1 with positive values for similar concepts (synonyms) and negative numbers for opposite concepts (antonyms).

Dictionary with semantic connections

We can take it even further by generating an embedding and adding it to a Hierarchical navigable small world (HNSW) index, so we can perform similarity searches. HNSW is a graph-based search technique.

Indexed embedding for similarity search

Computing circumference

In this section we’ll take a look at a graph representing computation of the circumference of a circle with a given radius:

r = 2.718
C = 2πr

The above computation can be represented as the below graph:

In this graph we have nodes of 6 different types. Some nodes types support values:

  • Constant/Literal contains numeric value
  • Variable contains name of the variable
  • Variable reference contains name of the variable

There are 3 types of connections:

  • Containment — statement list (Solution) contains statements
  • Operand — assignment and multiplication have (contain) operands
  • Variable reference

Executable/Computational Graphs

In an executable graph elements (nodes and connections) collaborate to perform some actions. In a computational graph the actions are computations of some result. In this story we will use executable and computational interchangeably.

Nasdanika uses 3-layered approach to build computational graphs:

  • Value/data
  • Graph element — node/connection
  • Processor
3 layers of computational graphs

For our circumference graph the value layer contains numeric literals and variable names. Values may have its own internal structure. For example, a PDF document contains pages, articles, paragraphs, lines and words. Azure Document Intelligence model also contains coordinates of text blocks.

We may say that the value level would typically contain “raw” data obtained from some external data source. That data needs to be enriched with additional information in order to compute result/perform actions, this is why we need the graph level on top of it. For example, we may need to cross-reference GitLab Contributor with a person in the company directory. Or we may need to link PDF or OCR documents paragraphs to named entities in those paragraphs.

In our case the variable reference node needs to be connected to the variable assignment node.

Finally, processors “sit” on top of the nodes and perform actions/computations. As we have seen in the “Hello, World!” section we may have different graphs on top of data/values. We may also have different processors on top of graphs. In our computational example we may have:

  • Synchronous (pull) processors — a call to multiply would multiply the operands and return the result.
  • Reactive (push) processors — client code would subscribe to the multiply processor and receive multiplication results via the subscription. The multiply processor would in turn subscribe to the operand processors and perform computations when a new value is published to it.
  • Asynchronous (pull) processors — a call to multiply would return a CompletionStage and the result will be obtained via one of .then methods.

We may also use different data types for our computations — Float, Double, BigDecimal. Depending on the data type we’ll get different precision and performance.

Metaphors

Before we delve into low level mechanics of node wiring and processor configuration, let’s take a look at a few metaphors. Metaphors are a great communication tool, IMO, and different metaphors would resonate with different people. Pick a metaphor which resonates with you!

PCB

When I was a teenager I spent a lot of time building different electronic devices — amplifiers, radio receivers, computers. So my brain was wired to think in terms of signal propagation and transformation. Maybe this was the defined factor for designing the Nasdanika Graph API the way it is.

Computer motherboard with sockets (nodes) connected by wires. Processors are plugged into the sockets.
Processor wiring

In this metaphor there is no value layer:

  • The graph layer is a PCB
  • Pass-through connections are wires
  • Connection processors are 2-pin components such as capacitors, diodes and resistors
  • Node processors are CPU’s, RAM, PCI cards such as video cards

Cubicle

In this metaphor a cubicle is a node connected to other nodes with a computer and a phone. A person working in that cubicle is a processor — they don’t need to worry about connectivity — it is provided by the node (cubicle)

City

In this metaphor roads are connections, buildings are nodes and people /organizations working/operating in those building are processors.

We may extend the metaphor by introducing waterways and saying that roads are bi-directional, but waterways are unidirectional — things go only downstream.

API

This section explains core concepts and classes/interfaces of the graph API.

Graph

The graph API has 3 interfaces:

  • Element — super-interface for Node and Connection below. Elements may have internal structure which can be navigated by accepting a visitor.
  • Node — extends Element and may have incoming and outgoing connections.
  • Connection — extends Element and has source and target nodes.

There are object (value) flavors of node and connection:

as well as a number of specializations, e.g. EObjectNode.

There are different ways to construct graphs from data/values, consult documentation, sources and examples for details.

Processors & Configs

Previously it was mentioned that the approach to executable graphs has 3 layers — data, graph (nodes & connections), and processors. There is actually one more layer — config.

Configs (click to open an interactive diagram)

It was not mentioned because it is a helper layer taking care of establishing inter-processor communications given the fact that when, say, the Multiplication processor is created in the above diagram, the π Literal processor may not yet exist.

Using our PCB metaphor you may thing of config as a socket. Or, using the cubicle metaphor, it is a phone and a phone number — Multiplication can call the π Literal and wait for it to answer, which might not be right away. In the API it is implemented as a call-back — Multiplication would be notified when the π Literal becomes available.

There are 3 types of configs corresponding to 3 types of graph elements:

Using configs processors obtain endpoints to communicate with other processors and register handlers so other processors can communicate with them. Endpoints and handlers are objects of a certain type. Endpoints may be of the same type as handlers or of different type.

VGA/HDMI adapter

If an endpoint type is different from a handler type you may think of it metaphorically as an adapter. Say, VGA -> HDMI.

For example, in a distributed system an endpoint method may have data argument of type InputStream. That stream would be saved to some storage (say committed to Git) and a handler method corresponding to the endpoint method would have an argument of type URI or URL.

So, handlers and endpoints can be thought of as connectors (VGA/HDMI/…) with method corresponding to pins.

Connection processors and pass-through connections

Connections may have processors in the same way as nodes. You may think of a connection processor as a processor for a node with one incoming “connection” — source node, and one outgoing “connection” — target node.

Let’s introduce one more metaphor to exemplify — an electronic circuit as shown below.

Emitter repeater — capacitors and resistors can be represented as nodes or as connections

If we wanted to create a simiulator for this circuit using a graph with processors we could represent wires as “pass-through” connections with no processors:

Pass-through connection

Capacitors and resistors could be represented either as nodes or as connections with processors:

Connection with processors

Capability processor factory

As with graph elements, there is a number of ways to create processors for elements. June 2024 release introduces a new approach which leverages the Nasdanika Capability framework.

With this approach processors are provided via capability providers and client code uses CapabilityProcessorFactory to collect processors from capability providers.

This approach provides high level of decoupling, flexibility and reuse — the same factory can be used for multiple graphs and the same graph may leverage different factories by specifying different requirements and by specifying different dependencies in pom.xml.

Diagrams

Graphs are typically represented as diagrams. Diagrams are graphs, Nasdanika Drawio API is built on top of the Graph API.

However, diagram graphs AS-IS are typically not very useful because they reflect low-level diagram structure. In order to diagram a problem domain of interest you’d need to “map” diagram elements to the elements of your problem domain. There is a book on the subject — Beyond Diagrams. Core concepts are available in the free version, you’d need to pay for the full version only if you decide to use the approach explained in the book and want to accelerate getting up to speed.

One example of diagram mapping is the Internet Banking System.

In my opinion a great value proposition of executable diagrams is What You See is What You Get — diagrams can be progressively elaborated to make them executable. The same diagrams may be used to generate documentation and report system health by highlighting diagram elements using, say, runtime metrics or development metrics — code coverage, number of vulnerabilities, … This is drastically different from the “traditional” approach — draw a diagram, code something. Whether that something aligns to the diagram — who knows?

Conclusion

The goal of this story is to nudge you in the direction of “thinking in graphs”, thinking non-linearly. While text is a graph, it is a very simple form of a graph. AI gets better and better at dealing with text. How can humans stay competitive? By thinking on a high-level in graphs and using diagrams, not text, to grasp high-level concepts. Lower-level linear “thinking” can be delegated to AI. For example, there might be a node processor implementation which uses GenAI to generate code. GenAI can also be used for summarization using RAG on top of graphs and text similarity along with graph proximity — number of hops, connection “strength”.

The Nasdanika approach to graph execution provides flexibility in graph and processor creation and allows to bind decisions at appropriate points in development process — not too soon, not too late.

There is a lot more that I’d like to explain in this story, e.g. deployment of executable graphs as Maven modules, federated graphs built from multiple sub-graphs, … However, it would turn this story into a book. So I’d rather add chapters to the Beyond Diagrams book, when I get a chance.

When I started writing this story I created compute graph diagrams just for illustration purposes. Then I’ve realized that they can be used to perform actual computations. I’ve created Compute Graph GitHub repository for this effort — you may watch it if you are interested.

The above case with compute graph diagrams illustrates one of important points of the Beyond Diagrams book — a diagram/graph may be created without a backing model/notation and that notation may be added later, even elicited from a set of related graphs/diagrams. Such diagrams may be created by non-technical people and then technical people may create domain models from them, processors, reusable shape libraries, execution using Nasdanika CLI, graph inheritance, …

One of my former colleagues once said that he wanted “PowerPoint to code” solution. Well, it is quite possible PowerPoint -> Drawio -> Domain model -> Generated code or executable diagrams.

--

--

Nasdanika
Nasdanika

Published in Nasdanika

Stories about Nasdanika capabilities and their applications

Pavel Vlasov
Pavel Vlasov

Written by Pavel Vlasov

I'm a principal engineer at U.S. Bank at day and an Open Source enthusiast at my free time focusing on productivity capabilities for "Java Guerilla Innovators"

No responses yet