How We Performed Data Migrations for a Monolith to Microservice Transition

Sai Charan Chinta
Published in Unibuddy · Jun 4, 2021 · 5 min read

Data migration can be quite tricky, especially on a live application that is constantly reading and writing data. In most cases, you will need to write code that populates both the new and old data stores. You would also have to facilitate roll-forward and rollback if needed. In this blog, we will talk about how we migrated our data into a new data store, all while keeping the application actively serving requests.

Prerequisites: Kafka and MongoDB basics

Setting the context

We have a monolithic application which uses MongoDB as its main database. We have a complex user module (call it User V1) which is hard to maintain and not flexible enough to extend. We also have an ever-growing volume of user data, and we are already hitting the limits of the database.

So we went ahead and wrote a new user module (User V2) from scratch which is flexible and easier to extend. This module uses a separate data store and so will remove some of the load from our database. One of the main changes is the way the user data is structured, so we need to first translate the existing data from V1 to V2 and also take care of any updates or new data constantly being added to V1.
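
To make the structural change concrete, here is a purely hypothetical sketch of that kind of translation; the real V1 and V2 schemas (and the field names below) are not shown in this post, so this only illustrates the shape of the problem:

```python
# Hypothetical illustration only: the real V1 and V2 user schemas differ.
def transform_v1_to_v2(v1_user: dict) -> dict:
    """Reshape a flat V1 user document into a (made-up) nested V2 structure."""
    return {
        "_id": v1_user["_id"],
        "profile": {
            "first_name": v1_user.get("firstName"),
            "last_name": v1_user.get("lastName"),
        },
        "account": {
            "email": v1_user.get("email"),
            "university_id": v1_user.get("universityId"),
        },
    }
```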

Note: There are reasons why we chose to write a new module instead of a new service, which I won't get into in this post, but the eventual plan is to move the V2 code into a separate service once the data migration is completed.

We want to ensure the following when we do the data migration:

  1. The service should always be running and correctly serving requests while the migration is happening
  2. Writing to both databases in a reliable way and keeping them in sync is very hard, so we made a choice to always keep the V2 database in sync with the V1 database and not vice versa
  3. Since rollback is not an option, we want to be sure that there are no data discrepancies between the two sets of data
  4. We only switch to the new implementation when it's absolutely safe and we are 100% confident

The current state of the monolith. The dotted line is where we want to go!

So, how did we do the migration?

This is not a straightforward data dump, since the V2 data structure is different from V1, which means we need an intermediate step that transforms the data from V1 to V2.

We could do a data dump and have a migration script which does the transformation, but how do we handle updates? These updates also need to be ordered. For example, if a user is created and then updated, we do not want to process the update first and the create next, because that would simply fail. So we need some form of ordered queue in the middle.

But how do we get data into the queue? Kafka Connect to the rescue! Kafka also guarantees that events are ordered (within a partition). MongoDB provides an official MongoDB Kafka Connector which can transfer data between Kafka and MongoDB. This connector can copy the existing data into Kafka and then watch over a collection/database, sending the changes to Kafka as change stream documents.

Note: We are using Confluent Cloud and its managed connectors for this use case.
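
Since the managed connector is configured through the Confluent Cloud console, we never registered it by hand, but for illustration here is a rough sketch of what the underlying MongoDB source connector configuration looks like when registered against a self-managed Kafka Connect cluster. The connection string, database, collection and topic prefix are placeholders:

```python
import json
import requests

# Sketch of the MongoDB source connector settings; names below are placeholders.
connector_config = {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "mongodb+srv://user:password@cluster.example.net",
    "database": "monolith",
    "collection": "users",
    # Copy all existing documents into the topic before streaming new changes.
    "copy.existing": "true",
    "topic.prefix": "migration",
}

# With a self-managed Connect cluster, the connector could be registered like this:
resp = requests.put(
    "http://connect:8083/connectors/mongo-user-source/config",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector_config),
)
resp.raise_for_status()
```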

Now that we have a Kafka topic with the user data, we could just write a consumer which consumes this data and uses the transformation logic to write data to the new DB.

This is what our data migration pipeline looks like:

Data migration pipeline

We chose to have the consumer (the event consumer in the above picture) as a separate service because we don't want to make too many changes to the monolith. This service just gets the event data and calls the UserV2 module's API endpoint, which transforms the event data into the new format and writes it to the V2 database.
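
As a rough idea of what this service does (not our actual code), here is a minimal sketch using the confluent-kafka Python client. The topic, group id and endpoint are hypothetical, and each message value is a change stream document serialized as JSON by the connector:

```python
import json
import requests
from confluent_kafka import Consumer

# Hypothetical names: the real topic, group id and endpoint differ.
TOPIC = "migration.monolith.users"
USER_V2_ENDPOINT = "http://monolith.internal/user-v2/sync"

consumer = Consumer({
    "bootstrap.servers": "kafka:9092",
    "group.id": "user-migration-consumer",
    # Start from the beginning so the copied existing data is processed too.
    "auto.offset.reset": "earliest",
    # Commit only after the event has been handed off successfully.
    "enable.auto.commit": False,
})
consumer.subscribe([TOPIC])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            raise RuntimeError(msg.error())

        # Each message is a MongoDB change stream document produced by the connector.
        event = json.loads(msg.value())

        # Hand the event over to the UserV2 module, which owns the V1 -> V2 transformation.
        resp = requests.post(USER_V2_ENDPOINT, json=event)
        resp.raise_for_status()

        consumer.commit(msg)
finally:
    consumer.close()
```

Because all instances of this service share the same group.id, running more of them spreads the topic's partitions across them, which is how we control the throughput mentioned below.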

There are a lot of advantages to this approach:

  1. The whole pipeline is very non-intrusive because none of our existing services are affected
  2. We don't have to write a lot of code to set this up
  3. The main access point is the origin database, which simply follows an event notification pattern to notify us of any update as a change stream event
  4. We have a lot of user data, so there is guaranteed to be a lot of messages in the Kafka topic, but we do not want to overwhelm our monolith with requests. Instead, the event consumer handles the load, and we can easily control the throughput by changing the number of event consumer instances.

Once we have this pipeline running, the two data stores are in sync and the data is guaranteed to be eventually consistent.

But when do we switch to the new implementation?

We need to be confident before doing this because rollback is not an option. We also need some way to check whether the V2 data is actually correct. So we created a hook inside the UserV1 module that calls UserV2 in parallel every time user data is requested and compares the results. This hook sends a Slack alert with the necessary information every time there is a data discrepancy.

This is how it looks:

A simple hook which compares V1 and V2 data
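
The gist of the hook is something like the following sketch. The helper functions, the comparison and the Slack webhook are hypothetical stand-ins for our real implementation, and the USE_V2_ONLY flag represents whatever switch is used to flip over once we trust the data:

```python
import requests

# Everything below is a sketch: helpers, webhook URL and comparison are hypothetical.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"
USE_V2_ONLY = False  # flipped once we fully trust the V2 data


def get_user_v1(user_id):
    """Existing UserV1 read path (stubbed)."""
    raise NotImplementedError


def get_user_v2(user_id):
    """New UserV2 read path, returning data mapped back to the V1 shape (stubbed)."""
    raise NotImplementedError


def get_user(user_id):
    if USE_V2_ONLY:
        return get_user_v2(user_id)

    v1_user = get_user_v1(user_id)
    try:
        v2_user = get_user_v2(user_id)
        if v1_user != v2_user:
            # Alert with enough context to investigate the discrepancy manually.
            requests.post(SLACK_WEBHOOK_URL, json={
                "text": f"User data mismatch for {user_id}: V1 and V2 disagree",
            })
    except Exception:
        # The comparison must never break the request path; V1 remains the source of truth.
        pass

    return v1_user
```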

We leave the hook in place and monitor for some time. We might initially get a lot of alerts, but once the consumer catches up with all the messages, we should see a drastic fall in their number. Since the data is going to be eventually consistent, you can verify these alerts manually and confirm that they are false positives.

Finally, after monitoring for some time and verifying that the data is being synced correctly, we can change the hook to always use UserV2.

The final state of the monolith

TL;DR

Here is a summary of the steps we took to complete the full migration:

  1. Move all the data in the V1 database to a Kafka topic using Kafka Connect
  2. Write a consumer which consumes the user data in the topic, processes it and writes it to the V2 database. We will now have the two databases in sync.
  3. Implement a hook which also queries data from V2 every time V1 data is requested, compares the results and triggers an alert on any discrepancy
  4. Once there are no more alerts, change the hook to only use V2
