Keeping Databases in Sync “Open Source Style” using SymmetricDS

Chris Henson
Data Weekly by Jumpmind
4 min readDec 19, 2017

Keeping databases in sync is a common problem across the IT landscape. Whether it is sharing data across a microservices architecture, getting data from an on premise data center to the cloud or dispersing data to Sqlite enabled IoT devices … SymmetricDS can help out!

What is SymmetricDS?

SymmetricDS is an open source database replication tool. It is platform-independent, web-enabled, and database-agnostic. SymmetricDS was built to replicate relational data changes uni-directionally or bi-directionally, and also has advanced features such as conflict resolution, data transformation and data routing.

SymmetricDS is a mature product. The first release of SymmetricDS was in November of 2007. Since then, it has gone through three major releases and support for numerous databases has been added. In the neighborhood of 18 different databases are currently supported. The most common databases are Oracle, Sql Server, Postgres, DB2 and MySQL. SymmetricDS runs on small systems like Android phones and big systems like the iSeries (AS/400).

SymmetricDS also features many extension points that can be used to customize behavior. Extensions can be used for things like implementing data loaders (to bulk load data or write to a NoSQL database) or publishing changes to a messaging queue (like RabbitMQ, JMS or Kafka).

How does SymmetricDS work?

The SymmetricDS application process hosts what are called nodes. Each node represents a database that is being synchronized. All configuration is stored in database tables which are installed when a node is brought online.

Configuration is based on node groups. A node belongs to a group and inherits the synchronization profile for that group. All configuration is synchronized across all nodes that belong to the same synchronization network. Because the configuration is stored in relational database tables the configuration is kept in sync by SymmetricDS itself!

Multiple nodes can be hosted in the same SymmetricDS application process or they can be hosted in different SymmetricDS processes that exist in different locations. When nodes are remote, SymmetricDS sends data changes over HTTP/S in a CSV format.

A SymmetricDS node is represented by a properties file that exists in an engines directory under the installation. Each properties file in the engines directory represents a different node. The properties file contains node specific settings and database connection information.

When a SymmetricDS node starts up, it checks to make sure that configuration and runtime database tables are up to date with the current version of SymmetricDS. If the node already has configuration loaded, it then starts a series of jobs that run in the background to perform synchronization. If the node has not joined a synchronization network yet, it uses the registration.url property found the in the properties file which is uses to “register” with the network.

Based on the configuration, SymmetricDS installs database triggers on tables that are configured to sync. When data changes the database triggers store data in a data capture table called sym_data. After data is captured, it must be routed to target nodes. This is done by the route job. The route job creates batches for target nodes. Captured data is batched to be transactionality consistent. Batches are stored in the sym_outgoing_batch table where a status is tracked for each target node.

Depending on the configuration, data is either pushed to or pulled from other nodes when there are outstanding batches. The transmission process happens when a node connects to another node via HTTP as part of the push or pull jobs.

The following diagram demonstrates how data is captured and synced.

And voila … data is replicated!

Where is SymmetricDS Used?

SymmetricDS was originally built to scale to sync thousands of point of sales databases for the retail industry. While it still has a good foothold in retail, SymmetricDS has since spread out to lots of different industries including:

  • Medical
  • Government
  • Logistics
  • Energy
  • Finance

Use cases across industries include but are not limited to:

  • Consolidating data across distributed locations
  • Bridging applications with different data models
  • Moving data to and from the cloud
  • Migrating from one database platform to another
  • Syncing mobile devices

Getting Started with SymmetricDS

The SymmetricDS software can be downloaded from the community site at symmetricds.org. Source code can be found on Github. The project is sponsored by JumpMind, Inc.

--

--