
KNIME Analytics: Flow Variables, The Red Line

Creating dynamically running workflows with a few clicks

Kerem Kabil
Low Code for Data Science

As first published in LinkedIn Pulse

A workflow can be defined as a collection of connected nodes. Depending on the nodes used and how they are configured, a workflow can run either statically or dynamically.

So, what do “statically running” and “dynamically running” mean?

A workflow runs statically if it generates the same result at each run. By contrast, a workflow that takes different parameters at each run, and can therefore generate different results, runs dynamically. To make a workflow dynamic, we need flow variables.

How do flow variables appear in a workflow?

Generally, there are two different lines, black and red, connecting nodes in a workflow. The black one carries data (tables, images, JSON, HTML, etc.) from one node to another, while the red one carries flow variables.

All good, but how can we create them?

The answer to this question depends on your needs: we first decide what we need and then apply it. Let’s take a look at how we can use flow variables in a workflow.

There are many different ways to create and use flow variables in a workflow. Here we are going to describe two very popular ones.

Configuration Nodes For Flow Variable Creation

Sometimes we need to ask the user to enter a value (a parameter) before running a workflow. This value may vary according to the needs of the user, so we need a structure that captures whatever the user enters. Configuration nodes (Double Configuration, Integer Configuration, etc.) and Selection nodes (Value Selection, Single Selection, etc.) solve this problem.

We can see a sample workflow above. The Value Selection node captures the values entered by the user. As we see above, the user has entered two values: Gender, a column name, and M, a value in that column. Both values are made available as flow variables. Now, let’s feed our workflow with these variables. To do that, we need to connect the corresponding nodes with a red line.

Now, let’s take a closer look at the configuration dialog of the Row Filter node. When we jump to the Flow Variables tab, we can see the ColumnName and Pattern selections. As given below, we can define these selections to be fed by variables.

ColumnName is fed by a user-defined variable, e.g. the column Gender; and Pattern is fed by another user-defined variable, e.g. the value M in the column Gender.

If we take a look at the output of the Row Filter node shown below, we can see the data filtered by the value M in the Gender column thanks to flow variables.

Thus, if the user enters another value, the output data will be filtered by that value instead. Whatever value the user enters, the final table is filtered by it automatically, without reconfiguring any node. This means that our workflow works dynamically.

If our goal is to filter a table by a specific value, this is not the only way to do that. There are multiple ways to build a workflow in KNIME. In this case, we could have used only the Row Filter node, with the value typed directly into its configuration dialog, but the workflow would then be “static”: each time we wanted to change the filtering criteria, we would need to configure the Row Filter node again.
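The difference between the two approaches can be sketched in plain Python. This is only an analogy, not KNIME code: the small customer table and the function names are made up for illustration, and the dictionary keys simply mirror the ColumnName and Pattern settings mentioned earlier.

```python
# Plain-Python analogy of a Row Filter; the table and names are hypothetical.
customers = [
    {"Name": "Ann", "Gender": "F"},
    {"Name": "Bob", "Gender": "M"},
    {"Name": "Cem", "Gender": "M"},
]


def row_filter_static(table):
    """Static: the criteria are fixed, like values typed into the dialog."""
    return [row for row in table if row["Gender"] == "M"]


def row_filter_dynamic(table, flow_variables):
    """Dynamic: column and pattern arrive as variables at run time."""
    column = flow_variables["ColumnName"]
    pattern = flow_variables["Pattern"]
    return [row for row in table if row[column] == pattern]


# Changing the criteria only means passing different variables.
males = row_filter_dynamic(customers, {"ColumnName": "Gender", "Pattern": "M"})
females = row_filter_dynamic(customers, {"ColumnName": "Gender", "Pattern": "F"})
print(males)
print(females)
```

To change what the static version returns, we would have to edit the function itself, which corresponds to reopening the node's configuration dialog. The dynamic version stays untouched and only its inputs change.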

Building a Loop

Another way to use flow variables is to build a loop. Sometimes we do not want the user to pick a single value. Instead, we need to work through a set of values iteratively.

As we can see in the example above, there is a loop structure fed by flow variables at each iteration.

Consider, for example, the case where we have transaction data, and we want to mark each transaction with a specific value like “1” or “0”.

First, let’s take a look at our data in the figure below.

What we want to do here is to mark each transaction according to whether or not it contains a certain value.

Let’s show these markups in a new column with the help of the Rule Engine node.

As we can see above, we have just one marked transaction so far. If we run the Loop End node, all transactions will be marked thanks to a flow variable loop structure.

Thanks to the Table Row to Variable Loop Start node, we have marked all transaction data using flow variables in each iteration.
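The same idea can be sketched in plain Python. Again, this is only an analogy under stated assumptions: the transaction table, the list of values to search for, and the helper `mark` are all invented for illustration, and we assume that each iteration marks the transactions against one value and that the results of all iterations are collected at the end.

```python
# Plain-Python analogy of a variable-driven loop; data and names are hypothetical.
transactions = [
    {"id": 1, "items": "milk, bread"},
    {"id": 2, "items": "eggs"},
    {"id": 3, "items": "bread, butter"},
]

# One row per iteration, like the table feeding a loop start node.
search_values = [{"item": "bread"}, {"item": "eggs"}]


def mark(rows, item):
    """Rule-like step: mark 1 if the transaction contains the item, else 0."""
    return [
        {
            "id": row["id"],
            "item": item,
            "mark": 1 if item in row["items"].split(", ") else 0,
        }
        for row in rows
    ]


collected = []  # plays the role of the loop end, gathering every iteration
for row in search_values:
    flow_variables = dict(row)  # the current row becomes the variables
    collected.extend(mark(transactions, flow_variables["item"]))

for result in collected:
    print(result)
```

Adding another value to `search_values` adds another iteration without any change to the marking step, which is the point of driving the loop with variables.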

Conclusion

In this article, we illustrated how flow variables are created and how they can be used in general. Of course, the specific use of flow variables depends on the user’s needs. The practical applications of flow variables are not limited to those shown here, as we covered only some of the most common uses.
