Push-button pipeline for digital transformation API stack with an automated Canary release (using Terraform & Jenkins)
Introduction
Continuous integration and continuous deployment (CI/CD) are now mainstream among software companies. The pipeline is best understood as the process pathway through which a single unit of production-ready software is delivered. For IT leadership, understanding the CI/CD pipeline is critical to keeping the organization on track for its digital transformation goals. CI and CD form the backbone of the modern DevOps environment, and companies practice continuous delivery throughout the software lifecycle. But can your pipeline perform push-button deployments of any version of the software to any environment on demand?
That’s where the true challenge lies in the CI/CD process.
Key challenges with push-button pipelines in digital transformation (DBT) solutions
No single full-stack deployment pipeline
A CD pipeline may sound easy and smooth to implement. However, many teams struggle to build these pipelines for large digital transformation solutions, especially when standing up full environments with all the feature functionality, such as experience APIs and microservices. A single-microservice CI pipeline is great for development, but standing up integration and performance environments on demand for an MVP requires a single full-stack pipeline.
An interesting fact here: the accrued cost of manually setting up full environments for every release quickly adds up.
Inadequate configuration management and non-reusable IaC (infrastructure as code) scripts
One of the common problems faced by organizations with large infrastructure estates is inadequate configuration management for that infrastructure, i.e., an inability to maintain consistency across the services used. For example, suppose an application has to be built and deployed in two environments, Test and Prod. The environments have different configurations: Test needs smaller-capacity servers than Prod, and the application in Test is deployed in only one geo-location, while in the cloud, Prod needs to be deployed in multiple regions for failure recovery.
Lack of canary deployment, let alone automated one
As a best practice, application code is first tested in pre-production environments. But even when code is continuously integrated, deploying it to production can still be manual, tedious, and error-prone. This leads to ever-larger differences between the code to be deployed and the code running in production, fueling a risky vicious cycle that can negatively affect site reliability. A failure of code in production is a huge cost to the organization, and preventing application downtime after every deployment is one of the biggest challenges DevOps teams face. Because of the heavy interdependencies between services, the frequency of deployments, and the multiple underlying software layers, rolling out application code in smaller increments is challenging.
No Enforcement of Engineering Discipline
Embedding the entire networking stack, such as service discovery, routing, circuit breaking, and load balancing, together with functional testing, performance testing, and security authorization, inside application code is a huge challenge; these concerns belong in an abstraction at the infrastructure layer rather than in each application.
Our Framework to solve these challenges
Modular & Reusable Infrastructure as Code
Terraform as Infrastructure as Code — Terraform describes infrastructure in a high-level configuration syntax, which allows a blueprint of the datacenter to be versioned and treated as you would any other code; infrastructure definitions can also be shared and reused. Execution plans eliminate surprises when Terraform manipulates infrastructure: what will change, and in what order, is known in advance from the execution plan and resource graph, avoiding many possible human errors. With Terraform, any N-tier DBT solution can be scaled simply by modifying a single count configuration value; because the creation and provisioning of resources are codified and automated, scaling elastically with load becomes trivial.
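As a minimal sketch of the count-based scaling idea (the variable names, AMI variable, and `aws_instance` resource here are illustrative assumptions, not the framework's actual code):

```hcl
# Hypothetical example: scale one tier of the stack by changing a single value.
variable "api_server_count" {
  description = "Number of API servers to run in this environment"
  type        = number
  default     = 2
}

variable "api_ami_id" {
  type = string
}

variable "api_instance_type" {
  type    = string
  default = "t3.medium"
}

resource "aws_instance" "api_server" {
  count         = var.api_server_count # scaling is a one-line config change
  ami           = var.api_ami_id
  instance_type = var.api_instance_type

  tags = {
    Name = "api-server-${count.index}"
  }
}
```

Bumping `api_server_count` from 2 to 10 and re-running the pipeline is all it takes to grow the tier; the plan output shows exactly which instances will be added before anything changes.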
Externalized Configurations & Consistency in deployment strategy across environments
Configuration is externalized in AWS SSM Parameter Store (a KV store) to allow seamless changes to infrastructure parameters such as auto-scaling groups, instance types, or AMI IDs.
Specifications such as server capacity, geo-location, and auto-scaling instance types are managed as configurations here and can be modified as desired without any code changes.
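A sketch of how such externalized configuration might look in Terraform, assuming a hypothetical `/myapp/<env>/...` parameter naming convention in SSM Parameter Store:

```hcl
# Hypothetical example: read environment-specific settings from AWS SSM
# Parameter Store so the same Terraform code serves every environment.
variable "environment" {
  type = string # e.g. "qa", "uat", "prod"
}

variable "api_ami_id" {
  type = string
}

data "aws_ssm_parameter" "instance_type" {
  name = "/myapp/${var.environment}/instance_type" # e.g. /myapp/qa/instance_type
}

resource "aws_launch_template" "api" {
  name_prefix   = "api-${var.environment}-"
  image_id      = var.api_ami_id
  instance_type = data.aws_ssm_parameter.instance_type.value
}
```

Changing the instance size for an environment then means updating one SSM parameter and re-running the pipeline; the codebase itself stays untouched.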
Using Jenkins as the orchestrator:
• Picks the modules to execute
• Injects configurations into the modules
• Modules apply configurations to tasks
• Tasks execute to accomplish use cases
Two different environment deployments (e.g. qa and uat) are created from the same codebase and pipeline in the cloud account. Just add those environments' values to the KV store and the pipeline takes care of deploying the tested software, maintaining a consistent deployment strategy across regions and environments.
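The orchestration steps above could be sketched as a declarative Jenkinsfile; the stage names, parameter choices, and file paths here are illustrative assumptions, not the framework's actual pipeline:

```groovy
// Hypothetical Jenkinsfile sketch: the environment parameter selects the
// SSM parameter path, so one pipeline deploys qa, uat, or any environment.
pipeline {
    agent any
    parameters {
        choice(name: 'ENVIRONMENT', choices: ['qa', 'uat'],
               description: 'Target environment to deploy')
    }
    stages {
        stage('Plan') {
            steps {
                // Configurations are injected into the Terraform modules
                sh 'terraform init'
                sh "terraform plan -var environment=${params.ENVIRONMENT} -out=tfplan"
            }
        }
        stage('Apply') {
            steps {
                // Tasks execute to stand up the full stack
                sh 'terraform apply -auto-approve tfplan'
            }
        }
        stage('Test') {
            steps {
                // Newman runs the Postman test cases against the new stack
                sh 'newman run tests/api-smoke.postman_collection.json'
            }
        }
    }
}
```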
Cloud-Native
The framework supports cloud-native DBT solutions. Microservices are built with enhanced resiliency in the cloud ecosystem using App Mesh, Istio, and Kubernetes. Operational analytics such as monitoring and observability are supported through cloud-specific services. All of this is done with security in mind, using IAM-based controls and a KV store with versioning and sensitivity controls for key information.
Support for Canary Deployment
This pipeline supports canary deployment (or rollout): a new version of a service is introduced by first testing it on a small percentage of user traffic; if all goes well, that percentage is increased, possibly in gradual increments, while the old version is simultaneously phased out. If anything goes wrong along the way, the rollout is aborted and rolled back to the previous version.
The canary support is automated: define the new service, update the KV store with the traffic percentage, and use the pipeline to push the changes. A confirm-canary option lets business teams validate the canary results and then promote the changes to 100% accordingly.
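One way this weighted canary split could be expressed, assuming AWS App Mesh and a hypothetical `canary_weight` parameter in the KV store (resource names and the parameter path are illustrative):

```hcl
# Hypothetical canary split via AWS App Mesh: the weight is read from the
# KV store, so shifting traffic is a parameter change plus a pipeline run.
data "aws_ssm_parameter" "canary_weight" {
  name = "/myapp/${var.environment}/canary_weight" # e.g. "10" for 10%
}

resource "aws_appmesh_route" "api" {
  name                = "api-route"
  mesh_name           = var.mesh_name
  virtual_router_name = var.router_name

  spec {
    http_route {
      match {
        prefix = "/"
      }
      action {
        weighted_target {
          virtual_node = "api-v1" # current stable version
          weight       = 100 - tonumber(data.aws_ssm_parameter.canary_weight.value)
        }
        weighted_target {
          virtual_node = "api-v2" # the canary
          weight       = tonumber(data.aws_ssm_parameter.canary_weight.value)
        }
      }
    }
  }
}
```

Promoting the canary to 100% (or rolling it back to 0%) is then the same push-button operation as any other configuration change.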
Engineering Practices
Service discovery, routing, circuit breaking, and load balancing are addressed with cloud-specific tools such as AWS App Mesh or Istio (on GCP). Because registration is part of the deployment template, each service is registered in the mesh and can be monitored and observed automatically once deployed. Logging follows the cloud architecture: a Fluentd module ships service logs to Amazon CloudWatch on AWS or to Cloud Logging on GCP.
Functional testing and performance testing are integrated as a test harness within the pipeline. Newman (Postman's command-line runner) is kicked off with the defined scripts (test cases written in Postman) to test every level of the architecture and certify the software.
Security authorization follows the cloud-native principle of using IAM policies. Service accounts, roles, and privileges are created programmatically with IaC. This keeps deployments safe and less error-prone per environment.
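A sketch of such programmatic IAM setup in Terraform, assuming an ECS-based service and the hypothetical `/myapp/<env>/` parameter namespace used earlier (role and policy names are illustrative):

```hcl
# Hypothetical example: a least-privilege service role created per environment
# in code, so IAM setup is repeatable and reviewable like any other change.
resource "aws_iam_role" "api_service" {
  name = "api-service-${var.environment}"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ecs-tasks.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "read_config" {
  name = "read-ssm-config"
  role = aws_iam_role.api_service.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["ssm:GetParameter"]
      # Scope access to this environment's own parameters only
      Resource = "arn:aws:ssm:*:*:parameter/myapp/${var.environment}/*"
    }]
  })
}
```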
Framework in Action
Conclusion
In a world where every company needs to think like a software company to stay cutting-edge and innovative, the role of DevOps is immense. For successful DevOps, and to meet the most important business objectives, CI/CD is a must, and CI/CD with the right attributes is even more critical to reducing labor cost, improving time-to-market, and setting engineering standards and quality. Our framework does just that: it alleviates IT leaders' headaches over recurring DBT solution failures and takes a step closer to achieving No-Ops.
Contributors
Anupam Singh — Senior Associate, DevOps
Amit Srivastava — Senior Software Development Engineer
Amit Sharma — Director, Engineering
Ravi Evani — Vice President, Engineering