Our requirements from a deployment pipeline

Recently, we (ClearTax) started to look for a uniform, standardised deployment pipeline.

We have grown, and we now have applications written in several languages that interact with each other. We didn’t want each product team to reinvent the wheel: we wanted a common infrastructure platform that would take care of most of the problems faced by individual teams, enabling them to focus on the product.

There are a lot of options out there (Ansible! Chef! Jenkins 2!). It’s hard to make a decision when faced with so many options.

I thought it would be useful to first articulate what we actually wanted. This blog post is just about the features we wanted in a deployment pipeline — it’s not exhaustive, and it’s tailored for our needs — but I hope it’s useful for others too. I will explain our final platform choices in a future blog post.

Note: for the sake of brevity, this is mostly a bulleted list of points (the outline). Please leave a comment if something is not clear.

Must Have: Fully Automated Deployments

  • Deployment to staging should be automatic once you merge in a pull request to master.
  • Production deployment should be one-click (promote current staging to production) or automatable (e.g., deploy latest staging to production once a day).
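
To make the two bullets above concrete, here is a minimal sketch of the flow in Python. The `Pipeline` class, its method names and the commit SHAs are all hypothetical, made up for illustration; this is a toy model of the behaviour we wanted, not the API of any particular tool.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Pipeline:
    """Toy model (hypothetical): tracks which release is live per environment."""

    live: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"staging": None, "production": None}
    )

    def on_merge_to_master(self, commit_sha: str) -> None:
        # Every merge to master goes straight to staging, with no manual step.
        self.live["staging"] = commit_sha

    def promote_staging_to_production(self) -> str:
        # The "one click": production receives exactly what staging is running.
        release = self.live["staging"]
        if release is None:
            raise RuntimeError("nothing has been deployed to staging yet")
        self.live["production"] = release
        return release


pipeline = Pipeline()
pipeline.on_merge_to_master("abc123")  # made-up commit SHAs
pipeline.on_merge_to_master("def456")
promoted = pipeline.promote_staging_to_production()
```

Because promotion takes no arguments, the same call can sit behind a button or a daily scheduled job, which covers both the one-click and the automatable case.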

Must Have: Developer Servers

  • Developers should be able to provision servers for any branch they are working on: for example, deploying your current PR to a URL so that others can play with it, run tests against it, etc.
  • We should be able to do this ourselves, without jumping through hoops or approval requests.
  • This should be as friction-free as possible.

Must Have: Immutable Deployments

  • All deployments will be immutable.
  • Immutable means that once an application is deployed to a server, any change will result in a new server being launched, and the old server being de-provisioned.
  • Individual servers are throw-away. There is no local state.
  • Deployments are deterministic and predictable.
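
A rough way to picture the immutability rule in code: a deployment never edits a server, it only produces a new fleet and hands back the old one for de-provisioning. The `Server` type and the `launch` / `deploy` helpers below are hypothetical names used for this sketch only.

```python
from dataclasses import dataclass, FrozenInstanceError
from itertools import count
from typing import List, Tuple

_next_id = count(1)


@dataclass(frozen=True)
class Server:
    """Hypothetical server record; frozen so it cannot be changed after launch."""

    server_id: int
    release: str


def launch(release: str) -> Server:
    # Each launch yields a brand-new server with a fresh id.
    return Server(server_id=next(_next_id), release=release)


def deploy(current: List[Server], release: str) -> Tuple[List[Server], List[Server]]:
    """Return (new_fleet, servers_to_deprovision) without touching `current`."""
    size = max(len(current), 1)  # keep the fleet size, at least one server
    new_fleet = [launch(release) for _ in range(size)]
    return new_fleet, list(current)


fleet_v1, _ = deploy([], "v1")
fleet_v2, retired = deploy(fleet_v1, "v2")
```

Trying to change `fleet_v2[0].release` in place raises `FrozenInstanceError`, which is the point: the only way to change what is running is to launch something new.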

Must Have: Zero downtime deployment strategy

  • Deployment should not incur any downtime.
  • Rough process for ensuring this: launch new servers with the latest release, wait for them to be healthy, switch over to the new server cluster at the load balancer level, then gradually de-provision the older servers.
  • Use blue-green style deployments for critical services, and wherever the ability to roll back quickly is crucial.
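
The rough process above can be sketched as follows. This is a simplified simulation under stated assumptions: `LoadBalancer`, `zero_downtime_deploy` and the server names are hypothetical, the health check is passed in as a plain function, and a real pipeline would wait between health polls rather than loop immediately.

```python
from typing import Callable, List


class LoadBalancer:
    """Stand-in for a real load balancer: it only remembers its targets."""

    def __init__(self, targets: List[str]) -> None:
        self.targets = list(targets)


def zero_downtime_deploy(
    lb: LoadBalancer,
    new_servers: List[str],
    is_healthy: Callable[[str], bool],
    max_checks: int = 3,
) -> List[str]:
    """Point `lb` at `new_servers` once they are all healthy.

    Returns the previous targets so the caller can de-provision them
    gradually, or keep them around for a quick blue-green rollback.
    """
    if not new_servers:
        raise ValueError("refusing to switch traffic to an empty cluster")

    for _ in range(max_checks):
        # A real pipeline would sleep between polls; omitted in this sketch.
        if all(is_healthy(server) for server in new_servers):
            break
    else:
        # The new servers never became healthy: leave live traffic untouched.
        raise RuntimeError("new servers failed health checks; not switching")

    old_servers = lb.targets
    lb.targets = list(new_servers)  # the switchover is a single step at the LB
    return old_servers


lb = LoadBalancer(["blue-1", "blue-2"])
to_retire = zero_downtime_deploy(lb, ["green-1", "green-2"], lambda s: True)
```

The old servers are returned rather than destroyed inside the function. For a blue-green deployment, rolling back is then just pointing the load balancer at `to_retire` again.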

Must Have: Audit Trails

  • Ability to see when deployments happened, who triggered a deployment, etc.
  • Traceability is a must in case of any issues we run into.
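
As an illustration of the minimum we would want recorded, here is a small sketch of an append-only audit log. The `DeploymentRecord` fields and the `AuditLog` class are assumptions for this example, and the user and release names are made up; a real tool would persist this somewhere durable rather than in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class DeploymentRecord:
    """One audit entry: when, who, where and what (hypothetical fields)."""

    timestamp: datetime
    triggered_by: str
    environment: str
    release: str


class AuditLog:
    """In-memory, append-only log: entries are added but never edited."""

    def __init__(self) -> None:
        self._records: List[DeploymentRecord] = []

    def record(self, triggered_by: str, environment: str, release: str) -> DeploymentRecord:
        entry = DeploymentRecord(
            timestamp=datetime.now(timezone.utc),
            triggered_by=triggered_by,
            environment=environment,
            release=release,
        )
        self._records.append(entry)
        return entry

    def history(self, environment: str) -> List[DeploymentRecord]:
        # Oldest first, so the last item is what is currently deployed.
        return [r for r in self._records if r.environment == environment]


log = AuditLog()
log.record("alice", "staging", "v41")  # made-up users and releases
log.record("bob", "production", "v41")
log.record("alice", "staging", "v42")
staging_history = log.history("staging")
```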

Must Have: Easy to use, self-service nature

  • You should not need to learn new syntax or new markup languages in order to deploy your branch to a URL.
  • The deployment tool should have a UI (CUI or GUI) that is intuitive to use, and does not require learning arcane incantations.
  • This is a requirement as we want everyone to be as self-sufficient as possible.

Must Have: ‘Platform’ layer that takes care of shared requirements

  • New applications have all the basic requirements available by default: log aggregation, monitoring, alerting, etc.
  • Default configuration of firewalls, security groups, etc. is made available to each new application.

Good To Have: Credential Management

  • Code should never have credentials committed to it.
  • The deployment pipeline should take care of giving credentials for the current environment (staging, production, etc) to the running application automatically.
  • Credentials should not be stored in plain text, and should not be accessible to developers.
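
From the application's side, this could look something like the sketch below: the code only knows the name of a credential, and the pipeline is assumed to inject the value into the process environment at deploy time. The `load_credential` helper, the `MissingCredentialError` type and the `DATABASE_PASSWORD` name are all hypothetical.

```python
import os
from typing import Mapping


class MissingCredentialError(RuntimeError):
    """Raised when the pipeline did not inject an expected credential."""


def load_credential(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Read a credential injected by the deployment pipeline.

    Nothing secret lives in the repository: the code only knows the name,
    and the value differs per environment (staging, production, etc).
    """
    value = env.get(name)
    if not value:
        # Fail loudly at startup instead of running with a missing secret.
        raise MissingCredentialError(
            f"credential {name!r} was not injected into this environment"
        )
    return value


# Demo with an explicit mapping; a deployed app would rely on os.environ.
demo_env = {"DATABASE_PASSWORD": "example-only"}
password = load_credential("DATABASE_PASSWORD", env=demo_env)
```

Environment variables are only one possible delivery mechanism; the requirement is simply that the application receives the right credentials for its environment without any developer having to handle them.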

Good To Have: Managing ‘Static’ Resources

  • Ability to manage resources such as databases, load balancers, network configurations (e.g., AWS VPC settings, AWS subnet groups, etc.) within the deployment pipeline itself.
  • Self-service tools for provisioning new resources like this.

Good To Have: Windows Support

  • Some of our applications and services are written in .NET.
  • We need to be able to use the same pipeline to deploy a service to either Windows or Linux.

Good To Have: Ability to deploy non-web services

  • Not just HTTP: We need the ability to deploy queue consumers, background workers, other daemons, etc.
  • Non-HTTP services should get the same level of monitoring, log aggregation, alerting, etc.

Good To Have: Speedy deployments

  • Deployments should be fast, and easy to do.
  • It should not take too long for new code to be live. Avoid wasting time waiting for deployment to occur.

The next blog post (which I should get around to writing soon!) will cover the different tools we evaluated, and what we ended up using.

Suggestions, thoughts? Please leave a comment.

I’m currently setting up an infrastructure / site reliability team for ClearTax. Please let me know if you’re interested! You can email me at ankit@cleartax.in

Like what you read? Give Ankit Solanki a round of applause.
