Data Integration Language

Raymond Meester
Sep 23, 2022

DIL, or Data Integration Language, is a programming language designed to make data integrations easy to write.


DIL has three basic concepts: Data, Components and Links. These three concepts are inspired by flow-based programming. All other concepts are derived from these three concepts.

This post is a summary of the DIL language. For a more technical, in-depth look, you can read this blog.

Concepts

1. Data

In computing, data often means simply a fact that has a certain form (an image, a sound file, a record, a text, an object and so on). DIL has a narrower definition of data, namely data as a data message.

2. Components

Components can be seen as software blocks with which you can build a solution. A component encapsulates a set of functions that process data.

Component A

A component may contain one or more other components on a lower level.

As seen in the above diagram, components in DIL are nested. The component that contains all other components is on the first level, its subcomponents are one level below, and so on.

3. Links

Components can be linked with each other:

A DIL program

Component links and levels together make a DIL program:

Components can also link between levels. It’s really the data message that travels between levels and through components.

Each level in DIL is named differently:

1. INTEGRATION (first level component or root component)

2. FLOW (second level component)

3. STEP (third level component)

4. BLOCK (fourth level component)

5. CORE (fifth level or core component)

Why are there multiple levels? Sometimes there is a need for low-level (technical) implementations and sometimes for high-level (business) implementations. A lower level provides the building blocks for a higher level. Conversely, if a building block is missing on a higher level, you can go a level deeper to implement it.

Depending on the level you work at, DIL thus targets both business developers and programmers.
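
The nesting of the upper three levels can be sketched in DIL's XML form. The skeleton below uses only element names and values that appear in the explicit "Hello World" example later in this post. It is meant to illustrate the nesting and is not a complete flow (the sink step and the links are left out). How the BLOCK and CORE levels are written is not shown in this post, so they are omitted here as well.

```xml
<dil>
  <integrations>
    <!-- Level 1: INTEGRATION (root component) -->
    <integration>
      <id>1</id>
      <name>default</name>
      <flows>
        <!-- Level 2: FLOW -->
        <flow>
          <id>12345</id>
          <name>HelloWorld2</name>
          <steps>
            <!-- Level 3: STEP -->
            <step>
              <id>1</id>
              <type>source</type>
              <uri>component:timer:foo</uri>
            </step>
          </steps>
        </flow>
      </flows>
    </integration>
  </integrations>
</dil>
```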

Roles

We will now discuss the role of every component on the various levels:

1. Integration

An integration is a component on the root level. Integration as a concept has nothing to do with a specific programming language or technology. It revolves around a set of data. For example, “orders” or “employees”.

2. Flow

A flow is a higher-level component that describes the flow of the integration. For example, “order from A to B”. A flow consists of a series of processing steps performed on a message.

3. Step

Steps are mid-level components. A step performs an action on data (in the form of a message).

4. Block

Blocks are low-level components that are the building blocks of a step. The types of blocks are based on the core components.

5. Core

The core components are components on the lowest level. They are, so to speak, atomic components that have no links.

Core types

Summary:

Blocks are assembled from core components. Together, these blocks form a step. Multiple steps make up a flow, and linked flows form an integration. These levels make it flexible to implement data integrations for both technical and business use cases.

Hello world!

Here is a “Hello World” example in DIL:

<flow>
  <name>HelloWorld</name>
  <steps>
    <step>
      <type>source</type>
      <uri>timer:foo</uri>
    </step>
    <step>
      <type>sink</type>
      <uri>print:Hello World!</uri>
    </step>
  </steps>
</flow>

The output:

2022-09-02 15:54:32.675  INFO 1848 --- [2 - timer://foo] 1-2                                      : Hello World

Let us briefly analyze this program. First, the program is named “HelloWorld”. With the name or an ID, the program can be managed, for example started and stopped. The program contains two levels: flow and step. It has one source step with a timer that triggers the program and one sink step that prints “Hello World!”.

The example is quite minimal and implicit. In some cases, it’s better to be very explicit to have full control of the program. The same example, but now explicit:

<dil>
  <integrations>
    <integration>
      <id>1</id>
      <name>default</name>
      <options>
        <environmentName>PRODUCTION</environmentName>
        <stage>PRODUCTION</stage>
      </options>
      <flows>
        <flow>
          <id>12345</id>
          <name>HelloWorld2</name>
          <steps>
            <step>
              <id>1</id>
              <type>source</type>
              <uri>component:timer:foo</uri>
              <links>
                <link>
                  <id>link1to2</id>
                  <type>sync</type>
                  <bound>out</bound>
                </link>
              </links>
            </step>
            <step>
              <id>2</id>
              <type>sink</type>
              <uri>block:print:Hello World!</uri>
              <links>
                <link>
                  <id>link1to2</id>
                  <type>sync</type>
                  <bound>in</bound>
                </link>
              </links>
            </step>
          </steps>
        </flow>
      </flows>
    </integration>
  </integrations>
</dil>

Though this gives exactly the same output as the first example, it explicitly defines levels, IDs and links. The first example is easy to write yourself; the second is better when you generate the file, for example with a visual tool.
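
To make the link pairing easier to see, here is an annotated excerpt of the explicit example, reduced to the two steps and their links. The XML comments are explanatory additions and not part of the original. Reading `out` as the sending side and `in` as the receiving side is an assumption based on the element names.

```xml
<steps>
  <step>
    <id>1</id>
    <type>source</type>
    <uri>component:timer:foo</uri>
    <links>
      <link>
        <!-- Same link id as on step 2: this ties the two steps together -->
        <id>link1to2</id>
        <type>sync</type>
        <!-- Assumed meaning: the message leaves step 1 over this link -->
        <bound>out</bound>
      </link>
    </links>
  </step>
  <step>
    <id>2</id>
    <type>sink</type>
    <uri>block:print:Hello World!</uri>
    <links>
      <link>
        <!-- Matches the link id declared on step 1 -->
        <id>link1to2</id>
        <type>sync</type>
        <!-- Assumed meaning: the message enters step 2 over this link -->
        <bound>in</bound>
      </link>
    </links>
  </step>
</steps>
```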

Try it yourself!

The “hello-world” program and other examples can be tried with Assimbly Gateway, an open-source integration tool.

Download & Run

java -jar gateway-[version].jar

There are three ways to try out the examples:

Folder

  1. Create a file in a text editor
  2. Save the file in the deploy folder:
/{user.home}/.assimbly/deploy

The file (and changes in the file) are automatically detected. Delete the file to stop the flow.

GUI

  1. Log in to the gateway (admin/admin by default):
http://localhost:8080
  2. On the main page, go to:
Actions → Create service
  3. Create and save the flow
  4. On the main page, run the flow

API

  1. After starting the gateway, go to the following URL:
http://localhost:8080/#/admin/docs
  2. On the (Swagger) API documentation page, use the following endpoint:
/api/integration/{integrationId}/flow/test/{flowId}

Note that integrationId and flowId should match the IDs in the DIL configuration. For integrationId you can use 1 as the default. For the explicit example above, with integration ID 1 and flow ID 12345, the path becomes /api/integration/1/flow/test/12345.
