Data Migration with Zero Downtime

Eric Long
Data Weekly by Jumpmind
4 min readMay 24, 2019

--

Move applications to a new database while keeping systems online.

Whether switching database platforms or upgrading to a new version, moving data from an old database to a new one is a big task that requires planning and automation. Some businesses are open around the clock and can’t afford downtime during the migration. Fortunately, data replication software can provide a solution that keeps databases online and minimizes the impact to application systems.

Migration Assessment

Before starting the data migration, perform an assessment of the database and the environment. Conduct an inventory of the objects in the database, including all tables, views, triggers, and procedures. Some data replication software only deals with tables, so you may need a plan for moving (and converting) other objects. Make a list of tables with the number of rows and size on disk. Some data replication software will let you balance tables across multiple processes. Are there tables using large object (LOB) data? These tables can sometimes be difficult to migrate and may need isolated for special attention.

Automation

The data migration should be automated and configurable, and you’ll want to execute the steps in a lower environment until it performs well. Create a test environment, using a backup copy of the production database, where you can simulate the data migration and refine the process. Be careful to perform steps just like you plan to in production. For example, if tables will be created on the new database, then always perform that step, instead of taking a shortcut to clear the tables each time.

Three Steps to Synchronization

The first step is to enable change data capture (CDC) for the old database to begin data replication. With CDC, any changes made are immediately loaded to the new database. The new database can be empty at this point, and the replication should be able to ignore changes for missing tables.

The second step is to start an initial data load to populate the new database. The replication may have an option to create the tables and other objects in the new database, otherwise you’ll need to use your own scripts. Some replication software can output a data definition language (DDL) file that can be edited for fine tuning before running on the database.

With two parallel production databases, applications can be migrated over time.

The initial load can be a long-running process. It’s helpful if the replication software has a screen to monitor the data load progress and estimate the time remaining. If not, follow the logs and watch disk space as it accumulates as a rough way to monitor the migration. The speed of the load can be improved with multiple threads and direct path loading, if the replication software supports it.

The third step is data validation. After the initial load, the CDC backlog is applied automatically, and both databases are put into sync. Now validation can be performed to prove the process worked and data matches. Replication software might include this feature or provide a separate utility for data comparison. If not, some simple row counts and spot checking of rows can help with validation.

Cutting Over

It’s called zero downtime because there are two databases online for applications to use, but there is still a cut-over when you change an application to use the new database. In some cases, you’ll simply change the connection settings and restart the application. When migrating to a new platform, you may have a new version of the application to deploy. Some businesses have many complex applications, so they need more time to migrate applications when they are ready.

We can extend the cut-over window for application migration by enabling CDC for the new database, so replication goes in both directions. The replication should have conflict management to resolve cases when two users modify the same piece of data. Two parallel production databases means that applications can be moved to the new database or rolled back to the old one at any time.

Conclusion

While moving all your applications to a new database is a daunting task, a smooth data migration can be achieved through careful planning and an automated process. Data replication software is a key solution that keeps systems online during the migration and minimizes impact for the business.

--

--

Eric Long
Data Weekly by Jumpmind

I’m a software developer for JumpMind, a dedicated Linux user, and a technology enthusiast who appreciates the benefits of open source.