GETTING STARTED | COLUMN SELECTION | KNIME ANALYTICS PLATFORM

Default Include or Exclude New Columns in KNIME

Enabling different data selections with just one click

Bob Peers
Low Code for Data Science


As first published on Creative Data

It’s common for the source data to change after a workflow is finished. The most frequent case is a database table or view that gains extra columns.

In this scenario, you need to make sure that your workflow doesn’t break because of the introduction of new data.

The way to control this is through the Enforce exclusion and Enforce inclusion settings in the column selection lists of many configuration dialogs.

KNIME Column Filter dialog.

Enforce exclusion

With this setting, the columns you add to the Exclude list are always removed from the workflow, while any new columns that appear are automatically added to the Include list and passed through.

Enforce inclusion

This is basically the opposite. Only the columns in the Include list pass through the workflow; any new columns that appear are automatically added to the Exclude list and removed.
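The two settings can be mimicked outside KNIME in a few lines of Python. This is only a sketch of the behaviour described above, not how KNIME implements it; the function `filter_columns`, the mode strings, and the column names are all made up for illustration.

```python
def filter_columns(incoming, listed, mode):
    """Return the columns that pass through, in their incoming order.

    incoming: column names arriving at the node
    listed:   the names placed in the enforced list
    mode:     "enforce_inclusion" or "enforce_exclusion" (hypothetical labels)
    """
    if mode == "enforce_inclusion":
        # Only listed columns pass; anything new is dropped.
        return [c for c in incoming if c in listed]
    if mode == "enforce_exclusion":
        # Listed columns are dropped; anything new passes.
        return [c for c in incoming if c not in listed]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical source: configured with three columns, then it gains "region".
before = ["id", "name", "internal_note"]
after = before + ["region"]

# Enforce inclusion of id and name: the new column is filtered out.
print(filter_columns(after, {"id", "name"}, "enforce_inclusion"))
# ['id', 'name']

# Enforce exclusion of internal_note: the new column passes through.
print(filter_columns(after, {"internal_note"}, "enforce_exclusion"))
# ['id', 'name', 'region']
```

Both calls give the same result on the original three columns; they only differ once the source changes.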

Default Include or Exclude?

In general, I want the data coming from a data source, whether it’s a database or a file, to be static so new fields are not automatically fed through my workflow. If I want them to propagate I’d rather control this myself. For this reason, I set this to Enforce inclusion for the initial nodes directly after loading the data.

For nodes later in my workflow, I do want changes to propagate, so if I make a new column it will automatically pass through to downstream nodes. Therefore, I set this to the default Enforce exclusion for all the downstream nodes after initial data load.
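To make the combined effect concrete, here is a rough Python simulation of a three-step workflow set up this way. It is a sketch under assumptions, not KNIME code: the helper functions, the `run_workflow` wrapper, and the column names are hypothetical.

```python
def enforce_inclusion(columns, include):
    # Keep only the listed columns; unknown columns are dropped.
    return [c for c in columns if c in include]


def enforce_exclusion(columns, exclude):
    # Drop only the listed columns; unknown columns pass through.
    return [c for c in columns if c not in exclude]


def run_workflow(source_columns, derived_columns):
    # Step 1: filter directly after the data load, set to Enforce inclusion.
    loaded = enforce_inclusion(source_columns, {"order_id", "amount"})
    # Step 2: a node in the middle appends columns I created myself.
    enriched = loaded + derived_columns
    # Step 3: downstream filter, set to Enforce exclusion (drops "amount").
    return enforce_exclusion(enriched, {"amount"})


# The source gains "supplier_code": it is stopped at step 1.
print(run_workflow(["order_id", "amount", "supplier_code"], ["amount_eur"]))
# ['order_id', 'amount_eur']

# I add a second derived column: it flows through step 3 untouched.
print(run_workflow(["order_id", "amount"], ["amount_eur", "amount_band"]))
# ['order_id', 'amount_eur', 'amount_band']
```

Changes I did not ask for stop at the front of the workflow, while changes I make myself do not require reopening every downstream dialog.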

The KNIME default for new nodes is Enforce exclusion, which is fine in many cases, but be aware of the effect this will have on your workflow if your data source changes to include extra fields.
