Speed up your Netezza integration in Metl

Nathan Richardson
Data Weekly by Jumpmind
3 min readMay 7, 2018

I recently used the open source Metl integration platform to load a large (7+ million row) table into the Netezza database platform. The load of the table took approximately 23 minutes which was not fast enough for me so I began my quest to determine where the bottleneck was. After multiple tweaks to the RDBMS Reader and RDBMS Writer component of rows per message, number of threads, batch mode, use previously cached table metadata, etc, I began my search of an alternative way to load my data and I found one that pleasantly surprised me giving me a 5x faster performance.

My test was a simple one. I wanted to perform a full load of one table from a separate database platform to a Netezza database. The table contained over 7 million rows and only 12 columns. For my initial attempts I was using a typical flow example as below:

Initial Metl Flow

Using this flow, the fastest time I had was just under 23 minutes. Here are the times and scenarios I tried before I decided this was not fast enough:

After researching various Netezza load techniques I stumbled upon using External Tables which allows Netezza to treat an external flat file as a database table. Metl does really well creating text files so I modified my flow to handle the extraction of data to a flat file first and then added the logic necessary to perform the load using an External Table in a SQL Executor component. Here is what I ended up with.

Metl Flow using External Table

With this flow I tried the below scenarios each resulting in an actual Netezza load time of approximately 4 seconds and the best overall run time of just over 4 minutes. Clearly a much more acceptable time than the previous Metl flow design. Based on these results my next challenge was to determine how fast I could generate a flat file of the data.

What does the SQL Executor look like?

First, create an external table that references your flat file, then clear the contents of your target table followed by an insert into the target from the external table. That’s it.

Here is an example of what the SQL Executor might look like:

Wait, what about…

You might now ask, what about when I don’t have a full table load and want to do upserts, will this method work? Yes it will, the difference is your flat file you generate should only contain the records that are to be updated and inserted. Then in the SQL Executor instead of performing a truncate table you will delete the matching records first and then do an insert.

Here is an example of an upsert:

For more details about Metl visit the Wiki page on GitHub (https://github.com/JumpMind/metl/wiki).

--

--