Replacing the old with the new: Tale as old as time
When coding, we often introduce a new feature, endpoint, or variable to replace an older one. The older one is usually part of a bigger system that is currently in use.
Looking at the picture above, if you were to take out one of the supporting blocks, the whole structure would fall down. To avoid that, you would first need to put a similar block in its place. Likewise, when replacing the use of one system with another, we found a set of steps that can be generalized to offer a smooth transition.
Context:
We recently refactored a system that tells us which health insurance plans we can offer to a particular company. In France, most companies have an identifier called a CCN (Convention Collective Nationale) which we use at Alan to check and build compliant health insurance coverages.
In addition to the fixed list of plans we offer companies with a particular CCN, we sometimes make exceptions and offer additional plans after discussions with the company. The whitelisting system for the additional plans was initially based on offering a group of plans. We had a Company <-> Plan Group relation. The plan group had a list of plans associated with it. However, this made things complicated as we had to create new groups just to offer one particular plan to a company.
To simplify this, we introduced a new table called the WhitelistedPlans table, in which each row relates one company to one plan. A company can be whitelisted on several plans by having several rows. This removed the plan group abstraction and made it easier to see what we could offer a company. Below is an illustration of what we were replacing.
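To make the two models concrete, here is a minimal in-memory sketch. The class and field names (PlanGroup, Company, plan_group, whitelisted_plans) are assumptions chosen to mirror the attribute paths used in this post, and plain dataclasses stand in for the real database tables.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanGroup:
    # Old abstraction: a named bundle of plans (hypothetical shape).
    name: str
    plans: set[str] = field(default_factory=set)


@dataclass
class Company:
    name: str
    # Old relation: a company points to at most one plan group.
    plan_group: Optional[PlanGroup] = None
    # New relation: stands in for the rows of the WhitelistedPlans table.
    whitelisted_plans: set[str] = field(default_factory=set)


# Old model: whitelisting a single plan requires a dedicated group.
old_style = Company("acme", plan_group=PlanGroup("acme-extras", {"plan_a"}))

# New model: the plan is attached to the company directly.
new_style = Company("acme")
new_style.whitelisted_plans.add("plan_a")
```

In the new model, adding a second plan for the same company is one more entry rather than a new or modified group.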
Before getting started, it’s good to take a step back and ask the right questions.
- Who does this affect: Is there anyone that needs to be aware of this change? Talking to stakeholders early on in the process can help clear roadblocks ahead of time.
- What will change: Will there be a breaking change between the two systems? Is there any data that needs to be moved over?
- When does it need to be done: By when should the new system be used? Maybe this feature needs to be out before a fixed date.
- Is a backup plan needed: Is the change you are making on a critical system that cannot go down? It’s good to make sure you have a backup plan in this case.
- How is success measured: How do you know when the migration is successful? Success may not mean replacing all the usages or sunsetting the old system completely. It could simply mean surfacing an error to stop any new usages.
Here are the steps we took when doing the replacement:
Step 1: Check out current usage
For our first step, we went through how we use the associated tables and where in the codebase we would need to replace these usages. As companies sign up on Alan all the time, we needed to make sure that the system was always available to return the plans a company can sign up for.
Step 2: Handle new entries
We were migrating from using company.plan_group.plans to company.whitelisted_plans.
This meant that any newly whitelisted plan needed to be written to both systems so that we didn’t lose track of any data. Therefore, we made a code change to create new entries in both the Company table and the WhitelistedPlans table.
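This "write to both" step is often called a dual write. Here is a minimal sketch of the idea, assuming simplified in-memory stand-ins for the two relations. The whitelist_plan helper and the data shapes are hypothetical, not our actual code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanGroup:
    name: str
    plans: set[str] = field(default_factory=set)


@dataclass
class Company:
    name: str
    plan_group: Optional[PlanGroup] = None  # old system
    whitelisted_plans: set[str] = field(default_factory=set)  # new system


def whitelist_plan(company: Company, plan: str) -> None:
    """Record a newly whitelisted plan in both the old and new systems."""
    # Old system: the plan has to live inside a plan group.
    if company.plan_group is None:
        company.plan_group = PlanGroup(f"{company.name}-extras")
    company.plan_group.plans.add(plan)
    # New system: the plan is related to the company directly.
    company.whitelisted_plans.add(plan)


acme = Company("acme")
whitelist_plan(acme, "plan_a")
```

With a real database, both writes would ideally happen in the same transaction so that one cannot succeed without the other.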
Step 3: Migrate existing data
As we already had plans whitelisted for a company through plan groups, we needed to backfill this data into the WhitelistedPlans table. Only once this was done could we return the same result from both systems.
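A backfill like this can be sketched as a loop that copies whatever is missing from the old relation into the new one. The version below is an illustration under assumed in-memory data shapes (the backfill_whitelisted_plans function is hypothetical). It is written to be idempotent, so running it a second time copies nothing.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanGroup:
    name: str
    plans: set[str] = field(default_factory=set)


@dataclass
class Company:
    name: str
    plan_group: Optional[PlanGroup] = None  # old system
    whitelisted_plans: set[str] = field(default_factory=set)  # new system


def backfill_whitelisted_plans(companies: list[Company]) -> int:
    """Copy plans from each company's plan group into whitelisted_plans.

    Returns the number of (company, plan) pairs that were copied.
    """
    copied = 0
    for company in companies:
        if company.plan_group is None:
            continue  # nothing was whitelisted through the old system
        missing = company.plan_group.plans - company.whitelisted_plans
        company.whitelisted_plans |= missing
        copied += len(missing)
    return copied


companies = [
    Company("acme", plan_group=PlanGroup("group-1", {"plan_a", "plan_b"})),
    Company("globex"),
]
first_run = backfill_whitelisted_plans(companies)
second_run = backfill_whitelisted_plans(companies)
```

Here first_run counts the two plans copied for the first company, and second_run is zero because the new relation already matches the old one.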
Step 4: Monitor usage
To check that the new system could reliably replace the old one, we added some logging, specifically for the cases where the old and new systems returned different results. It’s also a good idea to add in some monitoring. Check out this article where Chaïmaa mentions how we leveraged monitoring tools when doing a big refactor.
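One way to log only the mismatches is to call both systems, compare the results, and keep serving one of them. The sketch below uses Python's standard logging module. The function names, the logger name, and the choice to keep returning the old result during the monitoring phase are all assumptions made for illustration.

```python
import logging
from dataclasses import dataclass, field
from typing import Optional

logger = logging.getLogger("whitelist_migration")  # hypothetical logger name


@dataclass
class PlanGroup:
    name: str
    plans: set[str] = field(default_factory=set)


@dataclass
class Company:
    name: str
    plan_group: Optional[PlanGroup] = None  # old system
    whitelisted_plans: set[str] = field(default_factory=set)  # new system


def plans_from_old_system(company: Company) -> set[str]:
    return set(company.plan_group.plans) if company.plan_group else set()


def plans_from_new_system(company: Company) -> set[str]:
    return set(company.whitelisted_plans)


def get_whitelisted_plans_with_comparison(company: Company) -> set[str]:
    """Return the old result, logging a warning when the new one differs."""
    old = plans_from_old_system(company)
    new = plans_from_new_system(company)
    if old != new:
        logger.warning(
            "Whitelist mismatch for %s: only_in_old=%s only_in_new=%s",
            company.name,
            sorted(old - new),
            sorted(new - old),
        )
    # Assumption: the old system stays the source of truth while monitoring.
    return old
```

Once the mismatch log stays quiet for long enough, switching the return value to the new result becomes a low-risk change.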
To gradually release new features, you can also use feature flags. A flag can be a simple conditional check that chooses whether the new or old system is used when a call is made. Doing this provides an easy way to go back to the old system if needed and makes for a good backup plan.
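In its simplest form, such a flag is a boolean looked up at call time. The sketch below keeps the flag in a module-level dictionary. The flag name and the FLAGS dictionary are hypothetical, and a real setup would more likely read the value from a feature flag service or from configuration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical flag store; flipping the value switches systems at runtime.
FLAGS = {"use_whitelisted_plans": False}


@dataclass
class PlanGroup:
    name: str
    plans: set[str] = field(default_factory=set)


@dataclass
class Company:
    name: str
    plan_group: Optional[PlanGroup] = None  # old system
    whitelisted_plans: set[str] = field(default_factory=set)  # new system


def get_whitelisted_plans(company: Company) -> set[str]:
    """Read from the new system when the flag is on, else from the old one."""
    if FLAGS["use_whitelisted_plans"]:
        return set(company.whitelisted_plans)
    if company.plan_group is None:
        return set()
    return set(company.plan_group.plans)
```

Rolling back is then a matter of setting the flag to False again, with no code change or deployment required.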
Using all of the steps above, we were able to safely replace the usage of company.plan_group.plans with company.whitelisted_plans. We did this all while keeping the system up and running!
Summarizing the above, here are the steps that we found were common to replacing the usage of an old system with a new one:
- Step 1: Check out current usage
- Step 2: Handle new entries
- Step 3: Migrate existing data, if needed
- Step 4: Monitor usage
- Step 5: Bonus step! Go get yourself some ice cream (Optional, but highly recommended)
Wrapping up:
It is important to define when the migration from one system to another is complete. This can mean switching to the new system entirely and getting rid of the old one, or it can mean supporting the old system for a while until all usage is moved over. Either way, outlining this is crucial so that stakeholders know what to expect.
Another point to note is that you may not need all the steps mentioned above. For example, Step 3 of migrating existing data is optional if your system does not depend on any stored data.
I hope this post has shed some light on the important questions to ask and the main steps involved when replacing an old system with a new one. Remember, success can look different depending on what you are replacing. Now, on to getting some ice cream 😋