How to Migrate to Your New Platform

Dominic Wong
7 min read · Jul 29, 2015

There’s a lot of talk out there on microservices, debating whether or not it’s a good pattern and which technologies and frameworks you should use to implement them. This seems to have convinced many companies that they need to rewrite their systems. However, there seem to be very few blog posts on the realities of how you go from your existing system to a new (microservices) one. In this post I want to outline some of the techniques and patterns that we used at Hailo while we migrated from v1 to v2 of our platform. No doubt there are some things that we would do differently given a second chance, but I hope this can add some real experience to the wider discussion. For more info on how the Hailo platform works please take a look at this excellent three part post.

Multitasking is hard — maintaining two platforms in parallel

What most people tend to gloss over is that creating your new microservices platform will take a long time. Much longer than you think. At Hailo it took about three months to get something vaguely functional and run some production traffic through it. It took another six months before we had a barebones replica of the full system and a further three months before we were fully happy to run an entire city on it. Your company cannot survive twelve months (or more!) of no new feature development while your engineers build the new platform. This means you will inevitably find yourself maintaining and improving the current platform while building the new one. Consequently, v2 will have a moving target when trying to identify all the features to be implemented to replace v1. If you only take away one point from this post make it this — do whatever it takes to make this period as short as possible; running two platforms in parallel is painful.

Build new features on the new platform

As soon as v2 is ready you should halt any new development on v1, which should now go into “maintenance mode”. Any new development on v1 will be duplicated (i.e. wasted) effort. Building new features on the new platform will also prove the new platform as quickly as possible. Getting developers to build new features on your platform early provides a great feedback mechanism and can help you focus on which aspects of the platform need the most attention.

If you absolutely must implement a feature on v1 you should ensure that the feature is developed on both platforms by the same team. Getting the same team to do both implementations should force them to think about backwards/forwards compatibility of the feature, which should make for a smoother migration.

Mixing old and new

You’ll likely need to call your new services from your old platform and vice versa. There are a number of techniques to help achieve this:

Wrappers/Shims

The most obvious way to get your old and new services talking to each other is to provide wrappers or shims that present the appropriate interface to the consumer and pass the correct parameters to the provider of the API. Wrappers are typically written by consumers of the service to allow the underlying implementation to be swapped out. Shims are written by the provider to intercept calls to v1, translate the API parameters to v2 and pass them on to v2 of the service (or vice versa).
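As a rough illustration of the shim idea, here is a minimal Python sketch. Everything in it is hypothetical: the v1 parameter names (`name`, `phone_number`), the v2 request shape and the stand-in `v2_create_customer` function are invented for the example and are not Hailo's actual APIs.

```python
from dataclasses import dataclass


@dataclass
class V2CreateCustomerRequest:
    """Hypothetical v2 request shape: structured name, renamed phone field."""
    first_name: str
    last_name: str
    phone: str


def v2_create_customer(req: V2CreateCustomerRequest) -> dict:
    """Stand-in for the real v2 service call (hypothetical response shape)."""
    return {
        "customer_id": "c-1",
        "full_name": f"{req.first_name} {req.last_name}".strip(),
    }


def v1_create_customer_shim(params: dict) -> dict:
    """Accept a v1-style call, translate it to v2, and answer in v1's shape."""
    # Translate v1 parameters into the v2 request.
    first_name, _, last_name = params["name"].partition(" ")
    req = V2CreateCustomerRequest(first_name, last_name, params["phone_number"])

    # Pass the call on to v2.
    resp = v2_create_customer(req)

    # Translate the v2 response back so v1 callers see no difference.
    return {"id": resp["customer_id"], "name": resp["full_name"]}
```

Existing v1 callers keep calling what looks like the v1 API, while the work is actually done by v2. The same structure works in the other direction.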

Dual Publish

While migrating services you will probably be running old and new services, which provide the same functionality, in parallel. If they use different data stores you may need to call both services in parallel to ensure both old and new stay in synch. For example if creating customer records you might need to call both v1 and v2 customer services to create the customer record in both old and new systems. This obviously presents problems in the event of failure — what happens if customer creation fails in v1 but succeeds in v2?
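One possible answer to the failure question is to write to one platform first and compensate if the second write fails. The sketch below assumes that approach; it is not necessarily what Hailo did, and the callables `create_v1`, `create_v2` and `delete_v1` are hypothetical stand-ins for your real service clients.

```python
class DualPublishError(Exception):
    """Raised when the record could not be created on both platforms."""


def create_customer_everywhere(record, create_v1, create_v2, delete_v1):
    """Create the record on v1 and v2, undoing the v1 write if v2 fails.

    If the v1 call fails, nothing has been written yet and its exception
    simply propagates to the caller.
    """
    v1_id = create_v1(record)
    try:
        v2_id = create_v2(record)
    except Exception as exc:
        # Compensate: remove the v1 record so the two platforms stay in step.
        delete_v1(v1_id)
        raise DualPublishError("v2 create failed; v1 write rolled back") from exc
    return v1_id, v2_id
```

Note that the compensating delete can itself fail, so in practice you would still want some reconciliation process to catch records that exist on only one side.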

Data Synching

As an alternative to dual publish you can instead keep old and new in synch in the background. Create a “synching” service which will trigger calls between v1 and v2 when data is changed. This is slightly less real time than dual publish but is arguably easier to implement than having to intercept and fan out calls.
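A synching service can be sketched as something that receives change events and applies them to the other platform. The example below is a simplified illustration: the `Change` event, the `SyncService` class and the use of plain dictionaries as data stores are all assumptions, and it pretends both platforms store records in the same shape, whereas a real service would translate between the v1 and v2 models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical change event emitted when a record is written."""
    source: str  # platform where the change originated, e.g. "v1" or "v2"
    key: str
    value: dict


class SyncService:
    def __init__(self, stores: dict):
        # Maps platform name to its store; dicts stand in for real databases.
        self.stores = stores

    def handle(self, change: Change) -> int:
        """Copy a change to every platform except its source.

        Returns the number of stores actually written. Stores that already
        hold the same value are skipped, so replaying an event is harmless.
        """
        written = 0
        for name, store in self.stores.items():
            if name == change.source:
                continue
            if store.get(change.key) != change.value:
                store[change.key] = change.value
                written += 1
        return written
```

Skipping writes that would not change anything makes the handler safe to retry, and it also helps to avoid v1 and v2 bouncing the same update back and forth if both sides emit change events.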

Reusing Data Stores

If you are lucky enough to have got your domain model correct the first time around and you are happy with your data stores you could use the same data store for v1 and v2 of the service. Connecting v1 and v2 services to the same MySQL/Cassandra/etc instance could save you lots of data migration hassle but is only really advisable if you’re absolutely happy with the data model.

Migrating clients

The above is great for calls that originate from services running on your platform, but what about external clients? Your mobile apps, web pages, third party clients are probably all going via some API layer which will proxy the call to your platform. Something like HAProxy does a great job of this. As you migrate functionality from v1 to v2 you can configure HAProxy to switch the traffic over to the new platform feature by feature.

At Hailo, we wanted to be able to control traffic redirecting at a more granular level. We wanted to be able to route X% to v1 and Y% to v2 and we wanted to be able to do this based on some characteristics that we defined so that we could have “stickiness”. For example, a specific customer/city/entity should always be routed to v1 or always be routed to v2 to ensure they have a consistent experience. At the customer level this gives us the ability to canary new releases/features by slowly dialling up the percentage of customers being directed to v2. Doing this at a city level gives a nice way to treat an entire geography as your beta testers. Obviously writing your own proxy layer isn’t something that you should do lightly but for us the benefits outweighed the extra complexity.
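A common way to get percentage-based routing with stickiness is to hash the entity identifier into a fixed bucket and compare the bucket with the rollout percentage. The sketch below assumes that technique; the function names are hypothetical and this is not Hailo's actual proxy code.

```python
import hashlib


def bucket(entity_id: str) -> int:
    """Map an entity id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_platform(entity_id: str, v2_percent: int) -> str:
    """Route roughly v2_percent% of entities to v2, the rest to v1.

    The same id always lands in the same bucket, so an entity's routing
    only changes when the percentage crosses its bucket.
    """
    return "v2" if bucket(entity_id) < v2_percent else "v1"
```

The entity id could be a customer id or a city code, depending on the level you want to be sticky at. Because buckets are stable, dialling `v2_percent` up only ever moves entities from v1 to v2, never back. A cryptographic hash is used instead of Python's built-in `hash()` because the latter is salted per process for strings, so different proxy instances would disagree.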

The Human Side

It’s important to note that migrating to your new platform is about more than just choosing your tech stack; there is a wealth of non-technical aspects to consider too. Firstly, making the transition to writing microservices requires developers to change the way they think about writing services. Implementing truly distributed systems requires you to start thinking much more about things like coordination, locking, idempotence, scalability, fault tolerance, monitoring, debugging, etc. This can be more of a learning curve for some than others. Get your developers talking to each other, to bounce ideas and designs off each other, and to share best practices. Working out the correct size for a “micro” service is a pretty inexact science; it is something you get a feel for with experience, and it is probably specific to your platform.

Make sure you deliver value quickly. As with any major rewrite, replatforming your company to microservices will be a long process. Getting some early wins will not only boost morale and get buy-in from others but will also allow you to iterate and improve your platform faster with real feedback. Try to identify features from your roadmap which could be good candidates to implement on the new platform; maybe something that isn’t mission critical in the first instance while you prove out your system.

Be sure to spread the love. If you are working on both v1 and v2 of your platform try to ensure that everyone gets to work on the new shiny. If you have a team working on implementing v2 don’t forget those who are working on v1 (you know, the thing that is actually generating traffic/revenue for the company). Rotate people in and out wherever possible; this will help make everyone feel like owners of the new platform and provide more opportunities for feedback.

Flipping the Switch

Once you’re ready to migrate your systems to v2 there are a few techniques you can employ to help to reduce the risk of issues.

Practise, Practise, Practise

Do as many dry runs of migration and, crucially, rollback steps as possible. You should be so well practised that you are completely relaxed about the process and can do it in your sleep (which may help if you need to do your migration at 4am during your quiet period).

Test in Live

You’ve tested in staging and everything works great but are you sure it will work the same in live? You’ll be surprised how easy it is to forget some configuration or other change when setting up your new services in live. Test your live systems before the migration wherever possible. At Hailo we added a feature to our API layer that allowed us to choose which version of the platform was used for client calls by adding a header to our requests. This allowed us to test in live using real apps while real users were still being directed to the old platform.
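The header override can be sketched in a few lines. The header name `X-Platform-Version` and the function below are assumptions made for illustration; the post does not say what Hailo's header was called or how its API layer was implemented.

```python
OVERRIDE_HEADER = "x-platform-version"  # hypothetical header name
KNOWN_PLATFORMS = {"v1", "v2"}


def platform_for_request(headers: dict, default: str = "v1") -> str:
    """Pick the platform for a request, honouring an explicit override.

    Requests without a valid override header go to the default platform,
    so real users are unaffected while testers opt in per request.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    requested = normalised.get(OVERRIDE_HEADER, "").strip().lower()
    return requested if requested in KNOWN_PLATFORMS else default
```

In a real deployment you would probably also want to restrict who is allowed to send the override, for example to internal builds of the app.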

Canary Cities

Still not sure about your new system? Pick out a city (or some other easily divisible subset of your users) and make them a canary release to test. Perhaps best to choose a low traffic city or one with understanding users in case the worst happens.

Most (if not all) of the above should apply equally well to any platform migration, not just microservices, but with the microservices movement there seem to be many more companies investing in replatforming right now. It’s not something to be taken lightly, but with the right preparation and execution it should hopefully pay off for your company.

Dominic Wong

Music lover, tech geek, couch potato, ex @hailo @monzo