Our requirements from a deployment pipeline

Published in ClearTax Engineering, Apr 10, 2017

Recently, we (ClearTax) started to look for a uniform, standardised deployment pipeline.

We have grown, and we now have applications written in several languages that interact with each other. We didn’t want each product team to reinvent the wheel — we wanted a common infrastructure platform that will take care of most of the problems faced by individual teams, enabling them to focus on the product.

There are a lot of options out there (Ansible! Chef! Jenkins 2!). It’s hard to make a decision when faced with so many options.

I thought it would be useful to first articulate what we actually wanted. This blog post is just about the features we wanted in a deployment pipeline — it’s not exhaustive, and it’s tailored for our needs — but I hope it’s useful for others too. I will explain our final platform choices in a future blog post.

Note: for the sake of brevity, this is mostly a bulleted list of points (the outline). Please leave a comment if something is not clear.

Must Have: Fully Automated Deployments

Deployment to staging should be automatic once you merge in a pull request to master.
Production deployment should be 1-click (promote current staging to production) or automatable (e.g. deploy latest staging to production once a day).
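To make the two triggers above concrete, here is a minimal Python sketch. The `Pipeline` class, its method names, and the commit identifiers are hypothetical and only model the flow; this is not the API of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pipeline:
    """Hypothetical in-memory model of a staging -> production pipeline."""

    staging: Optional[str] = None     # release currently on staging
    production: Optional[str] = None  # release currently on production

    def on_merge_to_master(self, commit_sha: str) -> None:
        # Every merge to master goes to staging with no human step.
        self.staging = commit_sha

    def promote(self) -> str:
        # The "1-click" action: production gets exactly what staging runs.
        # A scheduler could call this once a day for the automated variant.
        if self.staging is None:
            raise RuntimeError("nothing on staging to promote")
        self.production = self.staging
        return self.production


pipeline = Pipeline()
pipeline.on_merge_to_master("abc123")
pipeline.on_merge_to_master("def456")
pipeline.promote()
print(pipeline.production)  # def456
```

The point of the design is that production only ever receives a release that has already run on staging.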

Must Have: Developer Servers

Developers should be able to provision servers for any branch they are working on, so that the current PR can be deployed to a URL where others can play with it, run tests against it, etc.
We should be able to do this ourselves, without jumping through hoops or approval requests.
This should be as friction-free as possible.
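One low-friction way to give every branch its own URL is to derive the hostname from the branch name. The sketch below assumes this naming scheme; the function name `preview_hostname` and the domain `dev.example.com` are placeholders, not something we have settled on.

```python
import re


def preview_hostname(branch: str, base_domain: str = "dev.example.com") -> str:
    """Turn a git branch name into a DNS-safe hostname (hypothetical scheme)."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # A single DNS label may be at most 63 characters long.
    slug = slug[:63].rstrip("-")
    if not slug:
        raise ValueError(f"cannot derive a hostname from branch {branch!r}")
    return f"{slug}.{base_domain}"


print(preview_hostname("feature/GST-Filing_v2"))
# feature-gst-filing-v2.dev.example.com
```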

Must Have: Immutable Deployments

All deployments will be immutable.
Immutable means that once an application is deployed to a server, any change will result in a new server being launched, and the old server being de-provisioned.
Individual servers are throw-away. There is no local state.
Deployments are deterministic and predictable.
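The idea can be modelled in a few lines of Python. This is only an illustration: `Server`, `deploy`, and the ID counter are hypothetical names, and a real pipeline would call a cloud provider here instead of building tuples.

```python
from dataclasses import dataclass
from itertools import count
from typing import Tuple

_next_id = count(1)  # hypothetical source of fresh server IDs


@dataclass(frozen=True)  # frozen: a server's release cannot be changed in place
class Server:
    server_id: int
    release: str


Fleet = Tuple[Server, ...]


def deploy(fleet: Fleet, release: str) -> Tuple[Fleet, Fleet]:
    """Return (new_fleet, servers_to_deprovision); never edits a server."""
    new_fleet = tuple(Server(next(_next_id), release) for _ in fleet)
    return new_fleet, fleet


old_fleet = (Server(next(_next_id), "v1"), Server(next(_next_id), "v1"))
new_fleet, to_deprovision = deploy(old_fleet, "v2")
print([s.release for s in new_fleet])  # ['v2', 'v2']
```

Because `Server` is frozen, the only way to ship a change is to build a new fleet, which is exactly the property we want from the real infrastructure.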

Must Have: Zero downtime deployment strategy

Deployment should not incur any downtime.
Rough process for ensuring this: launch new servers with the latest release, wait for the servers to be healthy, switch over to the new server cluster at the load balancer level, then gradually de-provision the older servers.
Use blue-green style deployments for critical services, and wherever the ability to roll back quickly is crucial.
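The rough process above can be sketched in Python as follows. Everything here is an assumption made for illustration: `LoadBalancer` is a stand-in class, and `launch`, `is_healthy`, and `deprovision` are callbacks that a real pipeline would back with its cloud provider.

```python
from typing import Callable, List


class LoadBalancer:
    """Hypothetical stand-in for a real load balancer API."""

    def __init__(self, targets: List[str]) -> None:
        self.targets = list(targets)


def zero_downtime_deploy(
    lb: LoadBalancer,
    release: str,
    count: int,
    launch: Callable[[str, int], str],
    is_healthy: Callable[[str], bool],
    deprovision: Callable[[str], None],
) -> bool:
    # 1. Launch new servers with the latest release.
    new_servers = [launch(release, i) for i in range(count)]

    # 2. Wait for them to be healthy. A real pipeline would poll with a
    #    timeout; this sketch checks once and aborts on failure.
    if not all(is_healthy(server) for server in new_servers):
        for server in new_servers:
            deprovision(server)
        return False  # the old servers keep serving traffic

    # 3. Switch over at the load balancer level.
    old_servers = lb.targets
    lb.targets = new_servers

    # 4. De-provision the older servers one by one.
    for server in old_servers:
        deprovision(server)
    return True


removed: List[str] = []
lb = LoadBalancer(["v1-0", "v1-1"])
ok = zero_downtime_deploy(
    lb,
    release="v2",
    count=2,
    launch=lambda release, i: f"{release}-{i}",
    is_healthy=lambda server: True,
    deprovision=removed.append,
)
print(ok, lb.targets, removed)
# True ['v2-0', 'v2-1'] ['v1-0', 'v1-1']
```

For a blue-green variant, step 4 would be delayed so that the old servers stay around and rolling back is just pointing the load balancer at them again.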

Must Have: Audit Trails

Ability to see when deployments happened, who triggered a deployment, etc.
Traceability is a must in case of any issues we run into.
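As an illustration of the minimum an audit record might hold, here is a small Python sketch. The field names and the `record_deployment` helper are hypothetical; an append-only list stands in for whatever durable store a real pipeline would write to.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class DeploymentEvent:
    application: str
    environment: str   # e.g. "staging" or "production"
    release: str       # commit SHA or version that was deployed
    triggered_by: str  # who (or which automation) started the deployment
    timestamp: str     # ISO 8601, UTC


def record_deployment(
    log: List[str],
    application: str,
    environment: str,
    release: str,
    triggered_by: str,
    now: Optional[datetime] = None,
) -> DeploymentEvent:
    """Append one JSON line per deployment to an append-only log."""
    moment = now or datetime.now(timezone.utc)
    event = DeploymentEvent(
        application, environment, release, triggered_by, moment.isoformat()
    )
    log.append(json.dumps(asdict(event), sort_keys=True))
    return event


audit_log: List[str] = []
record_deployment(audit_log, "billing-api", "production", "def456", "ankit")
print(audit_log[0])
```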

Must Have: Easy to use, self-service nature

You should not need to learn new syntax or new markup languages in order to deploy your branch to a URL.
The deployment tool should have a UI (CUI or GUI) that is intuitive to use, and does not require learning arcane incantations.
This is a requirement as we want everyone to be as self-sufficient as possible.

Must Have: ‘Platform’ layer that takes care of shared requirements

New applications have all the basic requirements available by default: log aggregation, monitoring, alerting, etc.
A default configuration of firewalls, security groups, etc. is made available to each new application.

Good to Have: Credential Management

Code should never have credentials committed to it.
The deployment pipeline should take care of giving credentials for the current environment (staging, production, etc) to the running application automatically.
Credentials should not be stored in plain text, and should not be accessible to developers.
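From the application's side, one common arrangement is for the pipeline to inject credentials as environment variables at launch time. The sketch below assumes that arrangement; the `Secret` wrapper, the `load_credentials` helper, and the variable name `DB_PASSWORD` are hypothetical.

```python
import os
from typing import Dict, Iterable, Mapping


class Secret:
    """Wraps a credential so it does not leak into logs or tracebacks."""

    def __init__(self, value: str) -> None:
        self._value = value

    def reveal(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "Secret(****)"


def load_credentials(
    required: Iterable[str], environ: Mapping[str, str] = os.environ
) -> Dict[str, Secret]:
    """Read credentials injected by the pipeline; fail fast if any is missing."""
    names = list(required)
    missing = [name for name in names if name not in environ]
    if missing:
        raise RuntimeError("missing credentials: " + ", ".join(missing))
    return {name: Secret(environ[name]) for name in names}


# The dict below simulates what the pipeline would inject for one environment.
creds = load_credentials(["DB_PASSWORD"], environ={"DB_PASSWORD": "s3cr3t"})
print(creds)  # {'DB_PASSWORD': Secret(****)}
```

The code itself never contains a credential, and the same build runs unchanged in staging and production because only the injected values differ.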

Good to Have: Managing ‘Static’ Resources

Ability to manage resources such as databases, load balancers, network configurations (i.e., AWS VPC settings, AWS Subnet Groups, etc) within the deployment pipeline itself.
Self-service tools for provisioning new resources like this.

Good to Have: Windows Support

Some of our applications and services are written in .NET.
We need to be able to use the same pipeline to deploy a service to either Windows or Linux.

Good to Have: Ability to deploy non-web services

Not just HTTP: We need the ability to deploy queue consumers, background workers, other daemons, etc.
Have the same level of monitoring, log aggregation, alerting, etc available to non-HTTP services.

Good to Have: Speedy deployments

Deployments should be fast, and easy to do.
It should not take too long for new code to be live. Avoid wasting time waiting for deployment to occur.

The next blog post (which I should get around to writing soon!) will talk about the different tools we evaluated, and what we ended up using.

Suggestions, thoughts? Please leave a comment.

I’m currently setting up an infrastructure / site reliability team for ClearTax. Please let me know if you’re interested! You can email me at ankit@cleartax.in