Stop using separate environments for development, testing, and production (DTAP)
Why DTAP encourages queues and waste. And what the alternatives are.
If you are working with Scrum and you have a DTAP-pipeline or DTAP-street (Development > Testing > Acceptance > Production), you have a great opportunity to improve. I strongly believe that DTAP, or at least having separate environments for testing (T and A), is an anti-pattern in an Agile environment. In this post, I explain why and what you can do about it.
P.s. if you like to listen instead of reading, you can listen to us read this post in our podcast The Liberators Network.
Recognize this? …
I don’t say this lightly. I’ve just seen and been in too many Scrum Teams that struggle with their DTAP-pipeline. This struggle manifests in several behaviors:
- ‘Deploy Friday’: When deployment takes a considerable effort, teams often postpone deployment to the final day of the sprint. This becomes increasingly pressing as the number of environments that need to be updated increases (like ‘Acceptance’ and ‘Testing’). Because deployments rarely go the way you want them to go if you have a lot to deploy and you do it by hand, the Fridays will end up being very stressful. Worse, a lot of breaking issues are often discovered on this final day of the sprint;
- Complicated administration: In an effort to keep track of what features are deployed to which environment, teams resort to all sorts of complicated administration. Like a Scrum Board with different columns for the different environments (I’ve seen one team with 10 columns), or tag/label-based administration in a tool like JIRA. Administration requires time and discipline;
- ‘Where is this running again?’: Conversations within the team, and especially Daily Scrums, tend to feature a lot of talk in terms of ‘What environment is this running on?’ or ‘Have you already deployed the feature to environment X?’. Teams spend a lot of resources, time and effort to figure out and keep track of where a feature is deployed;
- ‘This only happens on staging’: The idea of a DTAP-pipeline is that you can test functionality in an identical environment before rolling out to the next phase. But without exception, teams that I work with have to work with under-powered, slow and insecure pre-production environments. This is often further complicated by different configurations of the server and the application, causing bugs and problems in particular environments. Teams spend a lot of time dealing with environment-related issues or making comments like ‘It isn’t that slow on Production’ or ‘This only happens on Staging’;
- ‘We can’t move X to environment Z because Y is not done/tested/accepted’: The batch-wise nature of how DTAP-pipelines are used encourages teams to see deployments as ‘new versions’ encompassing a whole bunch of features. As new features are added to the batch (the ‘new version’), already tested features have to be re-tested to make certain everything still works. This obviously takes time and effort.
These are some of the behaviors I often observe when it comes to DTAP-pipelines. But why do I consider it an Anti-Pattern?
Why is DTAP an Anti-Pattern?
In a DTAP-pipeline, features tend to pool up in an environment until the team feels ready (or has the time) to deploy to the next phase. The bigger the pool, the trickier this becomes. The deployment takes more time, a lot has to be configured and tested on the new environment, and the chance of bugs and errors increases. A vicious cycle tends to form as teams postpone deployment in favor of adding more new features.
So, DTAP-pipelines encourage the creation of queues. Agile software development, on the other hand, encourages the removal of queues. Why? Because queues cause all sorts of waste:
- Wait-time: Features that are awaiting deployment to the next phase of a DTAP-pipeline are not adding any value until they are deployed to production. Until features are available on the production environment, they are essentially ‘inventory waste’;
- Maintenance: Maintaining the various environments, keeping them up-to-date and diagnosing environment-specific issues often takes a lot of time and effort;
- Cost: DTAP-environments are expensive, especially when they mirror the production-environment, which is what a good DTAP-pipeline should do. It’s true that corners are often cut here by using lower-end hardware and software. But this means we also lose one of the major reasons for having a DTAP-pipeline in the first place: to ‘test’ features in a realistic environment;
- Errors: There is more risk in deploying 50 new features in one batch than only one (or a handful). There’s more that needs to be configured, deployed, tested and checked. And there is a lot more room for things to break down. Errors will be harder to diagnose as more has changed, meaning that transparency is also lost;
- Coordination: More environments require more coordination. Teams need to track which feature is deployed to which environment. Bugs need to be associated with environments. Every environment represents a particular ‘state’ of the codebase, and this has to be tracked somewhere to make sure that customers & stakeholders are seeing the right things;
- Silos: In a subtle way, DTAP-pipelines maintain ‘silo-thinking’. By separating ‘development’ and ‘testing’ into different environments, it is easy to maintain that ‘testing’ is a separate responsibility, and not something done by a Development Team;
- Slow feedback: In a DTAP-pipeline, integration issues can surface anywhere down the pipeline. This is especially annoying if a lot of time passes between the phases. If a bug is discovered by a customer on ‘acceptance’, but the feature was developed five sprints ago, the team essentially has to rediscover how the feature was built. And what about performance-issues that only appear on production (with actual data)?
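To put a rough number on the ‘Errors’ point above, here is a small back-of-the-envelope sketch. It assumes that every feature has the same, independent chance of causing a problem on deployment. That is a simplification, and the 2% figure is a made-up value for illustration, not a measurement.

```python
def batch_failure_probability(per_feature_risk: float, batch_size: int) -> float:
    """Chance that at least one feature in a batch causes a problem.

    Assumes every feature fails independently with the same probability,
    which is a simplification for the sake of the example.
    """
    return 1 - (1 - per_feature_risk) ** batch_size


# Hypothetical 2% risk per feature, chosen purely for illustration.
RISK_PER_FEATURE = 0.02

for size in (1, 5, 50):
    chance = batch_failure_probability(RISK_PER_FEATURE, size)
    print(f"{size:>2} feature(s) per deployment: {chance:.0%} chance of at least one problem")
```

Under these assumptions, the chance of at least one problem grows from roughly 2% for a single feature to roughly 64% for a batch of 50. And when the big batch does break, there are 50 candidates to investigate instead of one.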
If there are so many problems associated with a DTAP-pipeline, why are there so many of them? Why are they often heralded as a ‘best practice’? One reason is that a DTAP-pipeline fits naturally in a waterfall-environment, where all features flow from one stage (development) to another (e.g. testing). It makes little sense in an Agile development environment, however. The most compelling reason for having a DTAP-pipeline in Agile environments is that it can increase the reliability and stability of deployments. When a customer or key user is testing new features, it is obviously annoying when the environment breaks down because of changes by developers. A DTAP-pipeline offers a ‘solution’ for this by creating separate environments that are only updated occasionally (batch-wise). This way we can keep them stable and available so that a particular task (e.g. acceptance, testing or training) can be carried out without disruptions.
Are we actually solving the problem here? Or are we just sweeping it under the rug? It seems to me that we are fighting the symptoms, but not curing the problem. Why are our deployments causing instability? Why are we unable to deploy reliably? Why are changes in the codebase breaking our application this often?
Imagine if …
Imagine a team that can reliably deploy a single feature to a live environment without disrupting it. They develop new features on their local machines and continuously integrate work through version-control (e.g. Git). Features are developed on a ‘develop’-branch in their version-control system. When a feature is done, it is merged into the ‘master’-branch. A build server picks up the commit, builds the entire codebase from scratch and runs all the automated tests it can find. When there are no issues, the build is packaged for deployment. A deployment server picks up the package, connects to the webserver/webfarm, creates a snapshot for rapid rollback and runs the deployment. The webfarm is updated one server at a time. If a problem occurs, the entire deployment is halted and rolled back. The deployment takes place in such a manner that active users don’t experience disruptions. After deployment, a number of automated smoke tests are run to verify that critical components are still functioning and performing. Various telemetry sensors monitor the application throughout the deployment to notify the team as soon as something breaks down.
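The core of that deployment step (snapshot, update one server at a time, halt and roll back on the first problem) can be sketched in a few lines. This is a toy model, not a real deployment tool: `Server`, `rolling_deploy` and the `smoke_test` callback are hypothetical names, and a ‘server’ here is just an object holding a version string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Server:
    """Toy stand-in for one server in the webfarm (hypothetical)."""
    name: str
    version: str


def rolling_deploy(
    servers: List[Server],
    new_version: str,
    smoke_test: Callable[[Server], bool],
) -> bool:
    """Update servers one at a time; halt and roll back on the first failure.

    Returns True when every server runs the new version, and False when the
    deployment was halted and the already-updated servers were restored.
    """
    # Snapshot the current state so a rollback is quick.
    snapshot: Dict[str, str] = {s.name: s.version for s in servers}
    updated: List[Server] = []

    for server in servers:
        server.version = new_version
        updated.append(server)
        if not smoke_test(server):
            # Halt: restore every server touched so far, including this one.
            for touched in updated:
                touched.version = snapshot[touched.name]
            return False
    return True


# Usage with made-up server names and a smoke test that fails on 'web-2'.
farm = [Server("web-1", "1.0"), Server("web-2", "1.0"), Server("web-3", "1.0")]
succeeded = rolling_deploy(farm, "1.1", smoke_test=lambda s: s.name != "web-2")
print(succeeded, [s.version for s in farm])
```

In a real pipeline the smoke test would call the running application, and taking a server out of rotation while it is updated is what keeps active users from noticing. Both are left out here to keep the sketch short.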
Does this team need a full DTAP-pipeline? No. They need an environment where they can develop new features (preferably local) and a production-environment. They certainly don’t need Test-, Acceptance- and Training-environments. The team can perform all desired levels of testing on the development- or on the production-environment. When necessary, features can be deployed to the production environment in a manner that makes them invisible to all but a select group of ‘acceptance users’ (like the customer). The reasoning here is that “the production-environment is the most realistic test-environment”. So let’s test new features there in as safe and controlled a manner as possible.
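Making a feature invisible to everyone except a group of acceptance users does not have to be complicated. Below is a minimal sketch of such a toggle. The class, the feature name and the user names are all hypothetical, and a real application would typically load the toggle state from configuration or a toggle service rather than hard-code it.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FeatureToggle:
    """A feature that is live on production but only visible to some users."""
    name: str
    enabled_for_everyone: bool = False
    acceptance_users: Set[str] = field(default_factory=set)

    def is_visible_to(self, user_id: str) -> bool:
        return self.enabled_for_everyone or user_id in self.acceptance_users


# Hypothetical feature and user, for illustration only.
new_checkout = FeatureToggle("new-checkout", acceptance_users={"customer-anna"})


def render_checkout(user_id: str) -> str:
    """Pick the old or the new code path based on the toggle."""
    if new_checkout.is_visible_to(user_id):
        return "new checkout"
    return "old checkout"


print(render_checkout("customer-anna"))   # acceptance user sees the new feature
print(render_checkout("regular-visitor")) # everyone else keeps the old one
```

Once the customer has accepted the feature, setting `enabled_for_everyone` to `True` releases it to all users without another deployment.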
The bottom line is that instead of creating an extensive and expensive DTAP-pipeline, and introducing a lot of waste in your process, a more Agile approach is to aggressively reduce the number of environments and simplify your pipeline by continually asking yourself: “What is needed from us and our tooling to move as close to this model as possible?”
Just to make sure: here’s what I’m not saying …
To make sure we’re on the same page, let me emphasize what I’m not arguing in this post:
- DTAP is always bad: DTAP can serve a purpose. But it needs to be aggressively automated and optimized to prevent queues from forming, and features need to be pushed through as quickly as possible. Both rarely happen. So I think we should explore alternative approaches, of which there are quite a few;
- No more Testing and Acceptance: Absolutely not! Testing (including acceptance) is incredibly important and should be automated where possible. But I don’t believe you need different dedicated environments for it. That promotes the creation of queues and silos, and you don’t want that;
- Yay! Coding on production: If this is what you took from the post, I suggest you read it again. Deploying well-tested features to production through a well-oiled, automated deployment-process is entirely different from copy-pasting code to your server through FTP, Remote Desktop, etc. That’s just dumb.
The Agile Alternative
Suppose you’re reading this and you agree. That’s great. But where do you start? This might feel like a daunting task. Thankfully there are a lot of tools, frameworks and concepts that can help us here. Although there will be effort involved in moving towards a simpler deployment-pipeline, imagine how much time, effort and money you’ll save by not having to maintain all these environments. Below I offer some tips to get started:
- Independent features: Work towards creating independent features that can be deployed in isolation from other features. Break-down features into smaller (but still valuable) bits if they are too large for a sprint;
- Use feature-toggles: Use feature-toggles to make features invisible on the production-environment until they are ok-ed and/or sufficiently tested. This is also how a company like Spotify releases (even incomplete) features. This might seem counter-intuitive, but it is the fastest and best way to learn if a feature continues to work with the existing codebase and on the most realistic environment you have: the production-environment;
- Automate testing and deployment: You can’t reliably test and deploy new features if you have to do it by hand. Automation is key if you want to move towards the Agile alternative. Start building up a solid battery of automated tests. Follow the testing pyramid: a lot of unit tests, some integration tests, and a few UI-tests. Also, work towards including performance- and security tests. Automate deployment and potential rollbacks with tools and platforms like CircleCI, AppVeyor, Docker, and Azure that offer a lot of this out-of-the-box;
- Use branching in your version-control system to keep things transparent: The simpler the better; use a ‘develop’-branch for features that are under development and a ‘master’-branch for features that are on production. If you use ‘feature’-branches, limit their use and merge them back into ‘master’ at least before the end of the Sprint; otherwise, you lose transparency. This way your VCS can tell you exactly where features are. If there’s a bug in your live-environment, you can check out the ‘master’-branch to fix it;
- Use blue/green deployments: Use a deployment-pattern like ‘blue/green-deployment’ to allow rapid fallback in case of problems. This is also a great way to deploy complicated applications without disrupting users;
- Create an early-warning system: Wrap your environment in a net of sensors that monitor the behavior of the system. Monitor up-time (e.g. UptimeRobot), monitor performance (e.g. New Relic) and monitor security. If an issue still gets through, despite the extensive automation of testing and deployment, you’ll at least know about it as soon as possible;
- Use containers: Containerization with tools like Docker and Kubernetes has made it much easier to automate deployments and reduce the impact of errors. Containerized applications can be quickly replaced with other versions, taken offline, or duplicated to increase capacity.
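To make the blue/green tip a bit more concrete, here is a minimal sketch of the idea: two identical environments, of which only one receives live traffic. The `BlueGreenRouter` class and the `is_healthy` callback are hypothetical names for illustration. In practice the ‘switch’ is usually a load balancer or DNS change, and the health check would call the freshly deployed application.

```python
from typing import Callable, Dict


class BlueGreenRouter:
    """Toy model of two environments where only one serves live traffic."""

    def __init__(self, live: str, versions: Dict[str, str]) -> None:
        self.live = live                # "blue" or "green"
        self.versions = dict(versions)  # version deployed on each environment

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def release(self, new_version: str, is_healthy: Callable[[str], bool]) -> bool:
        """Deploy to the idle environment and switch traffic only if it is healthy."""
        target = self.idle
        self.versions[target] = new_version  # users are still on the live side
        if not is_healthy(new_version):
            return False                     # live environment was never touched
        self.live = target                   # the actual release: flip the switch
        return True

    def fall_back(self) -> None:
        """Point traffic back at the previous environment, which still runs the old version."""
        self.live = self.idle


# Usage with made-up version numbers.
router = BlueGreenRouter(live="blue", versions={"blue": "1.0", "green": "0.9"})
router.release("1.1", is_healthy=lambda version: True)
print(router.live, router.versions[router.live])
router.fall_back()
print(router.live, router.versions[router.live])
```

Because the previous version keeps running on the other environment, falling back is a matter of flipping the switch again instead of redeploying.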
In this post, I made the argument that DTAP-pipelines are a source of avoidable waste. In Agile environments, where we need to respond more quickly and deliver value faster, we do well to explore alternatives that remove the need for DTAP and improve reliability and quality. Thankfully, there are many approaches and technologies available, and an increasing number of teams have experience with this. Good luck!