Data Exchange Rates: Understanding the Phases of a SymmetricDS Batch
“How fast will this go?” is a question we often get asked when setting up database sync scenarios. And it’s a tricky question because of how much SymmetricDS relies on your existing databases to get its work done. Often, “How fast will this go?” is answered with another question: “How fast will your database go?” That is, SymmetricDS replication is usually limited by the speed of your source and target databases. This article will discuss the phases of a SymmetricDS batch with some details so you will be better equipped to tune for performance.
Pinpoint the Bottleneck
If you are interested in seeing your replication be faster and more efficient, the first step is determine where your current bottleneck is. A good place to start is your outgoing batch screen in SymmetricDS Pro, or in the sym_outgoing_batch table in the open source version. It’s important to understand which phase of your batch’s replication is the bottleneck.
The Phases of a Data Batch
You can see the phases of the batch load on the Pro console, under Manage > Outgoing Batches and locking on a batch.
Let’s talk through each phase and note some things that could slow that phase down.
Routing Phase
Routing is the process of creating outgoing batches in SymmetricDS. It determines where your captured data should be delivered. The time routing takes is how long it took to read captured rows from sym_data and create the corresponding sym_data_event and sym_outgoing_batch rows.
Causes of Slow Routing
Symptom: Slow access reading sym_data
Suggestion: Check the table statistics and general health of the indexes on sym_data. Many of the sym tables grow and are shrink regularly, so it’s important to do regular maintenance on these tables to keep them fast. This suggestion alone covers the majority of routing performance problems we see.
See our maintenance and tuning guide:
https://support.jumpmind.com/kb/article/14-Maintenance-and-Tuning
Symptom: Slow access writing sym_data_event or sym_outgoing_batch
Suggestion: Same suggestion as above.
Check For: Non-default routers in use.
Suggestion: The “Default” Router is the fastest. The subselect router can be expensive as it requires more database hits in order to route the data. The beanshell router can also be expensive if it’s doing anything computationally intensive (and also routing).
Routing Log Hints:
SymmetricDS will log whenever a statement takes longer than about 20 seconds to execute. For example:
2018–07–10 16:06:20,286 INFO [corp-000] [JdbcSqlTemplate] [corp-000-job-1] Long Running: (20222ms.) select max(data_id) from sym_data
The sym_data_gap table is also used during routing. Here is an example where the system detects slow reads from sym_data_gap:
2018–07–10 16:07:55,310 INFO [corp-000] [DataGapFastDetector] [corp-000-job-1] Querying data in gaps from database took 85017 ms
Whenever routing takes more than 60 seconds, SymemtricDS will emit status logging messages like this:
2018–07–10 16:14:14,352 INFO [corp-000] [RouterService] [background-refresher-1] Routing item, data.enqueue.time.ms=0, data.events.insert.count=1000, data.events.insert.time.ms=140081, data.read.total.time.ms=9, data.routed.count=500, data.router.time.ms=2, total.time.ms=150109
Extract Phase
The extract phase is where captured data is read from sym_data and transferred to a staging file on the disk for delivery to one more more nodes.
Causes of Slow Extract
Symptom: Slow access reading sym_data
Suggestion: See suggestion above for sym_data… Statistics and index maintenance can’t be emphasized enough.
Check for: Inefficient Extract Transforms
Suggestion: Review what extract transforms you have assigned to the table. Too much row by row computation or SQL could slow things down.
Check for: Issues with Lobs
Suggestion: Does your table contain one or more large object field? Long fields (LOB’s or CLOBS) can pose special challenges for efficient replication. There are some trade off options such as:
- Stream Lobs (under Configure > Table Triggers.) if this is false, the system will attempt to capture lobs into sym_data, which can slow down insert or update times or your application. If this is true, lobs are not captured to sym_data upfront, which could speed up your application’s inserts but slow down the extract process.
- Capture Lobs: Could make sense when you’re using Oracle and have stream lobs = false. This is an Oracle-only setting that changes the SymmetricDS trigger DDL to use “clob” functions and work around 4k character limit during data capture and initial load situations.
Log Hints:
This kind of logging could be generated from a very slow extract query to sym_data:
2018–07–10 16:31:10,254 INFO [store-001] [SimpleStagingDataWriter] [store-001-pull-default-2] Batch ‘?’, for node ‘?’, for process ‘transfer to stage’ has been processing for 60 seconds. The following stats have been gathered: LINES=1, BYTES=0
Network
Network represents the time it takes to transfer the batch bytes over the network. Normally this is compressed and is pretty fast. It is rare that network transfer is the source of a bottleneck. If it is, check for firewalls and proxies in between, and double check what hops network traffic is taking to route between your client and server.
Filter
Filter represents the amount of time it takes the target to process an incoming batch before loading it.
Check for: Slow LOAD table transforms. LOAD transforms run on the target side and could contribute to slow filter times, especially if they are running their own SQL against the target database.
Load
Finally we get to the point of actually loading the data. This is probably where we see most of the replication bottlenecks in practice.
Symptom: Slow Updates
Suggestion: Check statistics and indexes on the target table. An update has to first locate the existing record to update (like a select). An INSERT needs to update any corresponding indexes on the table efficiently.
Symptom: Slow Inserts
Suggestion: Check foreign key indexes involved with the table. If the target table is involved in a foreign key relationship, it’s possible the database is scanning parent tables to check the integrity of the foreign key relationship.
- Consider a bulk loader. SymmetricDS supports bulk loaders for various platforms, with Postgres and MySQL being some of the most full featured. The upside of a bulk loader is dramatically increased performance. The downside is that they don’t all support partially populated tables (ie, some require the table to be empty when starting the load).
Log Hints:
Here is an example of logging of a 100 rows batch, where each row is taking 1 second to INSERT.
2018–07–10 16:38:28,365 INFO [store-001] [DataProcessor] [store-001-dataloader-4] Batch ‘139’, for node ‘001’, for process ‘data load from stage’ has been processing for 60 seconds. The following stats have been gathered: {STATEMENTCOUNT=48, LOADMILLIS=60027, STARTTIME=1531255048328, INSERTCOUNT=48, FILTERMILLIS=6, ITEM INSERTCOUNT=48, LINENUMBER=48}
I hope these hints help you tune your SymmetricDS batch loading times. If you have any suggestions, please leave me a comment!