Deployment strategies at scale using Edge at MakeMyTrip
A reliable, robust, rapid and scalable approach to deployment automation
At MakeMyTrip, deployment automation is a crucial step on which engineering teams rely to ship their changes to production reliably and rapidly. For this purpose we have developed our own application deployment automation tool: Edge.
Deployment automation at MakeMyTrip is responsible for shipping new features and fixes to the production environment reliably and as fast as possible. Edge ensures zero downtime during deployment, with maximum parallelization. Deployments run in staggered mode, with timely “canary” checks to ensure that new changes are not degrading system, application and business KPIs.
Salient features ensuring reliability and speed:
- Zero downtime staggered deployments
- Canary checks for metric comparison
- Auto roll-forward or roll-back based on Canary decision
- Parallel deployments across data centers
- Application health checks
- Robust and readily available reporting
- Scheduled rollouts
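To make the staggered deployment and canary decision more concrete, here is a minimal Python sketch. It is not Edge's actual implementation: the function names, the 10% tolerance, and the "higher metric value is worse" convention are all assumptions made for illustration. Hosts are deployed batch by batch, a canary check compares current metrics against a baseline after each batch, and a failed check rolls back everything deployed so far.

```python
from typing import Callable, Dict, Iterator, List

Metrics = Dict[str, float]


def batches(hosts: List[str], size: int) -> Iterator[List[str]]:
    """Yield hosts in consecutive batches of at most `size`."""
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def canary_passes(baseline: Metrics, current: Metrics, tolerance: float = 0.10) -> bool:
    """Pass if no metric is more than `tolerance` worse than its baseline.

    Assumption: every metric is "higher is worse" (error rate, latency).
    """
    return all(current[name] <= value * (1 + tolerance)
               for name, value in baseline.items())


def staggered_deploy(
    hosts: List[str],
    batch_size: int,
    deploy: Callable[[str], None],           # hypothetical: ships the new build to one host
    rollback: Callable[[str], None],         # hypothetical: restores the previous build
    read_metrics: Callable[[List[str]], Metrics],  # hypothetical: KPIs for a batch
    baseline: Metrics,
) -> str:
    deployed: List[str] = []
    for batch in batches(hosts, batch_size):
        for host in batch:
            deploy(host)
            deployed.append(host)
        # Canary decision after every batch: roll forward or roll back.
        if not canary_passes(baseline, read_metrics(batch)):
            for host in reversed(deployed):
                rollback(host)
            return "rolled_back"
    return "rolled_forward"
```

Because only one batch is touched at a time, the remaining hosts keep serving traffic, which is what makes a zero-downtime rollout possible in this model.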
1. Distributed Architecture
The Edge deployment framework is itself a distributed application, capable of running multiple frontend and backend services over multiple nodes across data centers.
2. Limiting distractions & empowering Engineers
Different engineering teams use Edge for their application deployments; in total there are ~250 different projects. It is therefore necessary to have restrictions, checks and balances on team members, so that it is easier for each of them to find the relevant projects.
Edge uses widgets to ease the pain of visualizing information in different places. A consolidated widget gives team members a good summary of everything happening within their respective projects.
The MMT Edge system has extensive reports and end-to-end tracking, which come in handy when investigating deployment issues related to a project. These can be tracked via the Jira tickets and Git changes that were shipped to production.
Engineers are given the required authorization based on their team. This also helps enforce the required restrictions, ensuring that only the right people are authorized to take actions on their respective teams and projects.
Another feature of Edge is that it can take a preferred scheduled time for a deployment. The rollout kicks off automatically as per the defined schedule.
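As an illustration of team-based authorization, the sketch below filters the projects an engineer sees and checks whether a deployment action is allowed. The data model (`Engineer`, `PROJECT_OWNERS`) and the project and team names are hypothetical; the article does not describe how Edge stores this mapping.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Engineer:
    name: str
    teams: FrozenSet[str]


# Hypothetical project -> owning team mapping; Edge's real store is not described.
PROJECT_OWNERS: Dict[str, str] = {
    "flights-search": "flights",
    "flights-pricing": "flights",
    "hotels-booking": "hotels",
}


def visible_projects(engineer: Engineer) -> List[str]:
    """Show only the projects owned by the engineer's teams."""
    return sorted(project for project, team in PROJECT_OWNERS.items()
                  if team in engineer.teams)


def can_deploy(engineer: Engineer, project: str) -> bool:
    """Allow an action only on projects owned by one of the engineer's teams."""
    return PROJECT_OWNERS.get(project) in engineer.teams
```

The same mapping serves both purposes here: it narrows roughly 250 projects down to the handful a person cares about, and it gates who may act on them.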
3. Handling large volumes of application deployments
At MakeMyTrip, like any other eCommerce setup, we are constantly rolling out new features and fixes to the production environment. Edge is built around Docker, and its service orientation allows Edge to scale horizontally.
Running multiple Docker containers helps achieve high parallelization while using the available resources efficiently. At present we support more than 1K deployments per day.
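The fan-out idea can be sketched in a few lines of Python. Note the substitution: Edge parallelizes across Docker containers, whereas this sketch uses a thread pool inside one process as a stand-in for those workers. `deploy_one` is a hypothetical callable that deploys to a single host and reports success.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def deploy_in_parallel(
    hosts: List[str],
    deploy_one: Callable[[str], bool],  # hypothetical: True when the host deployed cleanly
    max_workers: int = 8,
) -> Dict[str, bool]:
    """Run `deploy_one` for every host concurrently and collect per-host results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with hosts.
        results = list(pool.map(deploy_one, hosts))
    return dict(zip(hosts, results))
```

Raising `max_workers` here plays the role that adding worker containers plays in a horizontally scaled setup: more deployments in flight at once, bounded by available resources.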
4. Deploying to heterogeneous environments
Edge is designed so that it can be plugged into any environment. At MakeMyTrip we deploy web applications, microservices and topologies to production machines hosted in data centers and to servers hosted on MakeMyTrip's internal private cloud.
All one needs to do is write deployment strategies for the environment, and Edge is ready to deploy to it. Keeping every environment isolated from the others ensures that there are no conflicts at execution time.
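One way to picture pluggable, per-environment strategies is a small registry, sketched below in Python. This is an assumed design, not Edge's actual plugin mechanism: the `strategy` decorator, the environment names, and the step strings are all invented for illustration.

```python
from typing import Callable, Dict, List

# A strategy turns (app, version) into an ordered list of steps for one environment.
Strategy = Callable[[str, str], List[str]]
STRATEGIES: Dict[str, Strategy] = {}


def strategy(environment: str) -> Callable[[Strategy], Strategy]:
    """Register a deployment strategy under an environment name."""
    def register(func: Strategy) -> Strategy:
        STRATEGIES[environment] = func
        return func
    return register


@strategy("datacenter")
def deploy_to_datacenter(app: str, version: str) -> List[str]:
    return [f"drain {app}", f"install {app}:{version}",
            f"health-check {app}", f"enable {app}"]


@strategy("private-cloud")
def deploy_to_private_cloud(app: str, version: str) -> List[str]:
    return [f"provision {app}:{version}", f"health-check {app}",
            f"switch-traffic {app}"]


def plan(environment: str, app: str, version: str) -> List[str]:
    """Look up the strategy for an environment and build its step list."""
    if environment not in STRATEGIES:
        raise ValueError(f"no deployment strategy registered for {environment!r}")
    return STRATEGIES[environment](app, version)
```

Supporting a new environment then means adding one more registered function, and since each strategy only knows about its own environment, the strategies cannot interfere with each other at execution time.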
5. Grooving with Docker and Elasticsearch
Our approach of using Docker in building Edge proved to be beneficial in several ways. Any new feature in Edge can be rolled out seamlessly using the CI setup on Jenkins and an in-house Docker registry configured on VMware Harbor. It is straightforward to horizontally scale Edge workers or backend services to cater to a high number of deployments at once.
Using Elasticsearch for storing deployment logs helps Edge run as a distributed service. Multiple backend or frontend services can rely on the data present in Elasticsearch. In addition, ELK provides interesting data and its trends over time, which help in understanding and estimating the deployment needs of different teams and projects.
Logging of deployment tasks is crucial for investigating and tracking events in the deployment life cycle. The Edge frontend and backend keep all crucial logs in Elasticsearch, so that the logs are available to every service running in the Edge cluster.
Some interesting data from ELK:
- Number of deployments per team or project.
- Number of successful deployments.
- Average time taken to perform health checks.
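The three figures above map naturally onto Elasticsearch aggregations. The Python sketch below builds a search request body that asks for all of them at once. The field names (`team`, `status`, `health_check_seconds`, `@timestamp`) are assumptions, since the article does not show Edge's log schema, and `team` and `status` are assumed to be keyword fields.

```python
import json
from typing import Any, Dict


def deployment_stats_query(since: str = "now-7d") -> Dict[str, Any]:
    """Build an Elasticsearch search body for deployment statistics.

    All field names are hypothetical; adjust them to the real log mapping.
    """
    return {
        "size": 0,  # only aggregation results are needed, not individual log hits
        "query": {"range": {"@timestamp": {"gte": since}}},
        "aggs": {
            # Number of deployments per team (bucket count per distinct value).
            "deployments_per_team": {"terms": {"field": "team"}},
            # Number of successful deployments.
            "successful_deployments": {"filter": {"term": {"status": "success"}}},
            # Average time taken to perform health checks.
            "avg_health_check_seconds": {"avg": {"field": "health_check_seconds"}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(deployment_stats_query(), indent=2))
```

The resulting body can be sent to the `_search` endpoint of whichever index holds the deployment logs; swapping `team` for a project field gives the per-project breakdown.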