Applying the Three Ways of DevOps to Accelerate Your Organization
The emergence of DevOps has marked a seismic shift in the software world in recent years. The allure is the opportunity it offers to organizations to increase their levels of productivity by orders of magnitude and outpace their competitors to win in the marketplace.
However, DevOps draws from a wide array of areas of research and represents their convergence into a broad set of philosophies, practices, and tools. With such a vast body of knowledge to consider, it’s hard to know where to begin when attempting to adopt it.
Fortunately, we can look to DevOps thought leaders to point us in the right direction. First introduced in The Phoenix Project and The DevOps Handbook is the concept of The Three Ways. It captures the conceptual underpinnings of the entire movement. This model is a powerful tool that identifies the characteristics of DevOps maturity and describes the path that your organization can follow to get there.
We’ll be leveraging The Three Ways in this article by examining some practical actions your organization can take while pursuing DevOps maturity. It can help when the purported benefits of DevOps seem elusive. This feeling burdens organizations that struggle with the underlying principles of DevOps.
We’ll discuss these principles in more detail within the context of The Three Ways to provide the insight necessary to fight past barriers to adoption. As with any cultural transformation, the adoption of DevOps requires a shift in mindset and values embodied by specific practices and approaches. This guide is a collection of ideas to get started or to strategize about potential next steps. It can be used by any technology professional at any level of your organization to accelerate or even begin its DevOps journey.
The Three Ways
The First Way
The First Way is about accelerating the pace of delivery through your value stream. It can encompass the work of an individual contributor, a team, or even an entire organization. It describes how the business defines value-creating functionality for the development organization. Development builds the software that captures the value and passes it to operations to deliver it as a service to the customer.
The arrow only points from left to right, suggesting that there is never a backward flow. The implication is that you never pass known defects downstream, never let local optimization negatively impact the whole system, and always pursue greater throughput by continuously deepening your understanding of the system in order to improve it.
The Second Way
In The Second Way, your organization will establish a feedback loop that amplifies signals of quality and efficiency and enables the practice of continually making improvements by addressing any uncovered issues. You create a virtuous cycle of refinement, which allows a better understanding of customer needs and faster detection of problems, ideally moving to the predictive phase to prevent the issues from occurring in the first place.
Now you can begin the work of shortening the feedback cycle, which paves the way to add even more sensing mechanisms to detect weaker signals. By “sensing mechanisms,” we mean ways to inform developers or operations of issues occurring in production. “Weaker signals” refers to different characteristics of the running software that provide insight into the quality, stability, or other essential aspects of the system.
The Second Way helps your organization be proactive by reacting to predictive indicators of problems and addressing them before problems occur. Most of the detection mechanisms can be automated, which eliminates waste and helps your entire organization move faster without fear of breaking something.
The Third Way
As your organization leverages the apparatus created in The First and Second Ways, The Third Way revolves around the idea of enabling rapid experimentation for an even more in-depth understanding of customer needs. Since the apparatus is all-around promoting a fast flow, prevention of issues, and recovery from problems, organizations can take more chances in The Third Way and can conduct bold experiments right in production.
The cultural impacts of these concepts are apparent in several practices. Teams able to begin their journey on The Third Way regularly allocate time for improving daily work. They may also intentionally introduce faults into the system to test their ability to respond and recover, improving both the system and their skills. Organizations will also reward bold experimentation to foster innovation, nurture learning, and embed courageous behaviour into their cultural DNA.
The Theory of Constraints
To better understand The Three Ways, we must understand the Theory of Constraints (ToC).
Introduced in The Goal by Dr. Eliyahu Goldratt, the ToC revolves around the idea that at any given point in time, an organization has a single bottleneck that limits that organization from improving the throughput of its value-delivery workflow. Until the blockage gets removed, the organization will not be able to improve its throughput.
In other words, when looking at an organization’s workflow from end-to-end, there is one critical point of friction or waste that is preventing the achievement of greater efficiency. The organization must focus on removing that problem point. As you do so, another bottleneck in the system will emerge, and then you must repeat the process.
The ToC provides steps to move the organization from being limited by its constraint to exploiting it:
- Identify the constraint.
- Exploit the constraint, meaning to apply whatever tools or knowledge are available to improve the performance of the constraint.
- Subordinate and synchronize to the constraint, meaning to focus all supporting activities to ensure that they are improving the constraint.
- Elevate the performance of the constraint, meaning to take any further, potentially radical actions to ensure that the constraint is no longer a bottleneck.
- Repeat from the first step because the constraint has now moved somewhere else.
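As a rough illustration of the steps above, the following sketch models a value stream as a chain of stages, each with a capacity in work items per day. The stage names and numbers are invented for the example, and treating the flow as strictly serial is a simplifying assumption, not something the ToC prescribes.

```python
def find_constraint(capacities):
    """Return the (stage, capacity) pair with the lowest capacity."""
    return min(capacities.items(), key=lambda item: item[1])


def system_throughput(capacities):
    """A serial flow can move no faster than its slowest stage."""
    return min(capacities.values())


# Hypothetical capacities, in work items per day.
stages = {
    "analysis": 12,
    "development": 8,
    "code review": 3,
    "testing": 6,
    "deployment": 10,
}

# Step 1: identify the constraint.
constraint, capacity = find_constraint(stages)
print(constraint, capacity)  # code review 3

# Improving a stage that is not the constraint changes nothing overall.
stages["development"] = 20
print(system_throughput(stages))  # 3

# Elevating the constraint raises throughput and moves the bottleneck.
stages["code review"] = 9
new_constraint, new_capacity = find_constraint(stages)
print(new_constraint, new_capacity)  # testing 6
```

The second print is the important one: more than doubling development capacity left the system exactly where it was, which is why the ToC asks you to find the constraint before investing anywhere else.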
The ToC has led to revolutionary transformations in manufacturing and technology. The ideas behind it are timeless and applicable to any organization that delivers value to customers.
Applying The Three Ways
Armed with a general understanding of these concepts, let’s look at ways to leverage them.
Use Value Stream Mapping to Identify Bottlenecks
One of the best pieces of advice from The Phoenix Project is to start with value stream maps (VSMs). Using VSMs, you can gain situational awareness of how value flows through your organization and begin to optimize the flow. This practice sets you on the path to address The First Way.
VSMs can take many forms. There are multi-day sessions that involve representatives from the entire organization, and smaller Agile value stream mapping sessions that are hours long and focus on part of the value stream. While it’s true that the ToC tells us that we should be looking at the entire flow to identify the organizational bottleneck, your group might need to settle for smaller sessions in order to get started.
The following example VSM shows a fictitious value stream. The vital information that it displays includes:
- The steps in the process.
- The various people or systems that are responsible for each step.
- The cycle time for each step, meaning the time spent actively working on the step once the work has started.
- The lead time for each step, meaning the total time the step takes, measured from the instant the work in the previous step was completed. It therefore includes any time the work spends waiting.
Using the information gathered in the VSM, your group can identify the bottleneck and begin the work to remove it. VSMs should be revisited on a regular basis since the bottleneck will move around.
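To make the arithmetic concrete, here is a minimal sketch of how the numbers from a VSM might be analyzed. The steps, owners, and hours are all invented, and the rule "the step with the longest wait is the likely bottleneck" is just one simple heuristic, not a definitive method.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    owner: str
    cycle_hours: float  # time spent actively working on the step
    lead_hours: float   # time from the previous step finishing to this one finishing

    @property
    def wait_hours(self):
        """Time the work sat idle before anyone picked it up."""
        return self.lead_hours - self.cycle_hours


# Hypothetical value stream; every figure is invented for illustration.
value_stream = [
    Step("Write user story", "Product owner", 2, 10),
    Step("Develop feature", "Developer", 16, 24),
    Step("Peer code review", "Developer", 1, 30),
    Step("QA sign-off", "Tester", 4, 12),
    Step("Deploy to production", "Operations", 1, 20),
]

total_lead = sum(step.lead_hours for step in value_stream)
total_cycle = sum(step.cycle_hours for step in value_stream)

# Share of the elapsed time in which someone was actually working.
flow_efficiency = total_cycle / total_lead

# Heuristic: the step where work waits longest is the likely bottleneck.
bottleneck = max(value_stream, key=lambda step: step.wait_hours)

print(f"Total lead time: {total_lead} hours")
print(f"Flow efficiency: {flow_efficiency:.0%}")
print(f"Likely bottleneck: {bottleneck.name} ({bottleneck.wait_hours} hours waiting)")
```

In this made-up data, the code review itself takes an hour, but the work waits 29 hours to be picked up. That kind of gap between cycle time and lead time is exactly what a VSM session is meant to surface.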
Practice Continuous Integration
Continuous Integration (CI) is the practice of automating the merging of code into a central code repository branch. As CI has evolved, it’s become expected that merging occurs with high frequency, often many times per day. Because it is one of the most straightforward practices to adopt and bestows many benefits, CI is a natural part of The First Way.
The main benefit of doing this is that it introduces automation to the code development flow. It also reduces the potential friction that emerges with code branches that get worked on for extended periods. Finally, it provides a centralized environment where automated tests can be guaranteed to run instead of having developers run automated tests locally where environmental or local code differences may yield different testing outcomes.
CI is a prerequisite of many of the practices we’ll discuss shortly. It’s also a well-established practice that most modern software teams already employ.
Peer Code Reviews
Another well-established practice in software is Peer Code Reviews. DevOps highlights this as one of the most important techniques a software team can adopt because it ensures that another set of eyes is on the code before making its way into production. This approach establishes quality as a high-priority concern by building in a rigorous inspection.
In addition to creating a focus on quality, it can also be leveraged to spread knowledge around the team, enforce agreed-upon standards like coding style agreements, reinforce the team’s process, and be used as a training mechanism.
Of course, Peer Code Reviews are only useful if teams take them seriously. Often, time pressures, a lack of understanding of standards, and other forces prompt developers to approve code with only a cursory glance to keep code moving through the value stream quickly. This behaviour circumvents the process and introduces the risk of defects making their way downstream. It is a failure to recognize the purpose of Peer Code Reviews.
To combat this, teams can review code as a group to establish a higher bar for the quality of the reviews themselves and communicate levels of acceptable quality.
What we want to avoid is “rubber-stamping” our reviews, meaning that reviewers only give a superficial look at the code before approving it. This sort of behaviour results in a missed opportunity to leverage the most critical quality-control activity we have at our disposal — this is where we conduct checks that can’t be automated. It’s the difference between the level of inspection that humans can provide versus automation. Mature DevOps teams understand this and conduct thorough Peer Code Reviews.
Practice Continuous Delivery
Continuous Delivery (CD) is the well-known practice of getting changes into customers’ hands quickly and safely through the automation of deployments. Naturally, there is much more to it than that. The point we want to make here is that CD is pretty much a must-have for The Three Ways.
In addition to the benefits to throughput when considering The First Way, having a CD pipeline in place facilitates both of the other Ways. For The Second Way, your CD pipeline is the system to which you can add more sensing mechanisms to amplify ever weaker signals and deliver ever faster feedback.
For The Third Way, the CD pipeline is where you’ll put your Feature Flags to release your rapid experiments as you approach DevOps maturity.
For teams to successfully adopt CD, it’s essential to follow the principles at the heart of the philosophy behind it:
- Build quality in — detect and fix mistakes in ever-faster feedback loops because the further downstream they make it, the more waste they generate.
- Work in small batches — delivering smaller work units fosters agility, shortens feedback cycles, simplifies the detection and addressing of problems, and improves efficiency. Delivering smaller collections of work streamlines the process.
- Automate repetitive tasks, let people solve problems — let computers do the mind-numbing, repetitive, uncreative work that humans tend to be bad at, and instead let them be proactive, creative problem-solvers. This approach makes work more rewarding and engaging and increases efficiency by orders of magnitude. It’s a win-win practice.
- Relentlessly pursue continuous improvement — make improving all aspects of work part of the work itself. Every team member should be working to improve things, whether they are significant capabilities that will super-charge productivity, micro-optimizations that make things slightly better, or simple habits that help keep entropy at bay.
- Everyone is responsible — every team member should have an active hand in ensuring the quality, correctness, and stability of their product, as well as the mechanisms that they leverage to deliver it.
For a more nuanced understanding and advice on adoption, the original book Continuous Delivery by Jez Humble and David Farley is an excellent place to start. Jez Humble also has a great collection of resources on adoption at continuousdelivery.com.
Pursue Fast Automated Testing
You can’t have a CD pipeline without automated testing, or else all you’ll be delivering is a product with inferior quality very quickly. When you do have tests, make sure they’re fast, or they will become a bottleneck, and your delivery speed will suffer.
A prevalent model describing the relationship between various types of tests is The Test Pyramid.
Even though UI Tests exercise more parts of the system and thus provide more value, they are also traditionally slower and harder to maintain. Unit and integration tests tend to run faster, even though they test less of the holistic system. By relying on mixed types of testing, you can maximize speed while still getting value out of tests.
Considering the bottom of the Test Pyramid, unit tests (and especially when you write unit tests using TDD) can cover much ground and are generally the fastest tests to run because they run purely at the code level. The reason that the bottom of the Test Pyramid is wider than the top is that unit tests tend to be easier to write, run, and maintain. Even though they do not test as much, they still cover a wide array of cases eliminating the need to run slower tests from further up the Test Pyramid.
Additionally, a relatively unheralded effect of unit tests is that they tend to improve code modularity and other characteristics associated with more maintainable codebases. These lead to even further reduction of waste and a better throughput of changes.
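For readers who haven’t written one, this is roughly what the bottom of the Test Pyramid looks like in practice. The sketch uses Python’s standard unittest module; the apply_discount function is a hypothetical example, not code from any real product.

```python
import unittest


def apply_discount(price_cents, percent):
    """Return the price after a percentage discount, rounded down to whole cents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


class ApplyDiscountTest(unittest.TestCase):
    def test_no_discount_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(1999, 0), 1999)

    def test_quarter_off(self):
        self.assertEqual(apply_discount(2000, 25), 1500)

    def test_rejects_impossible_discount(self):
        with self.assertRaises(ValueError):
            apply_discount(2000, 150)


# Run the tests programmatically so the result can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these touch no network, database, or browser, so thousands of them can run in seconds. Note also that the function takes everything it needs as arguments, which is the kind of modularity that writing unit tests tends to encourage.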
In the context of The Three Ways, fast, automated testing is a critical part of The First Way. It’s something that must be continuously maintained lest bottlenecks form or outages occur and adversely impact throughput. Quality gets reinforced in an environment where changes are ideally delivered rapidly.
Adopt a Continuous Improvement Mindset
At the heart of DevOps is the Lean concept that continuous improvement is how you can maximize your throughput and stave off the effects of entropy, which leads to regression or friction in your workflow. We can adopt this by continuously reviewing the entirety of our system, looking for bottlenecks, and fixing them.
The Plan-Do-Check-Act cycle is a tried and true methodology to ensure the adoption of a continuous improvement practice. PDCA has heavily influenced Agile methods, so they all have the concept of regular improvement cycles built in. A few examples are:
- Sprint Retrospectives in the Scrum Agile framework.
- The Lean notion of the Andon cord where you stop the line to swarm problems to find a solution. Learning is shared immediately across the group.
- The Service Delivery Review in Kanban, where there is a regular discussion of how the workflow can be improved.
No matter the specific practice you adopt, the vital part is to ensure that teams do not skip these improvement activities. They are just as important as delivery. Thus, regular reflection on how the work went and thinking of ways to improve it is a critical aspect of The Three Ways.
Practice Test-Driven Development
As I’ve written in a previous article, there is a natural affinity between TDD and DevOps. Please refer to that article if you would like to understand more about the benefits and adoption challenges of TDD.
To summarize, TDD is one of the best ways to prevent tech debt from accumulating because it helps prevent it from entering the system in the first place. Expertise with TDD can also speed up delivery because it eliminates waste from the process. So there is a natural fit for TDD as we seek to maximize our throughput in The First Way.
Use Monitoring and Alerting
These sensing mechanisms have been at the heart of DevOps, SRE, and ops for a long time. These are how you realize The Second Way. The main take-away is that, just as with every other DevOps practice, these tools should not be regarded as a project-based activity; the work in maintaining and improving them is never done.
Another concept in play here is working towards being proactive in preventing problems from occurring in the first place, rather than continually reacting to situations. Monitoring and alerting is a baseline requirement of shifting to a proactive mindset.
The final concept we should consider related to The Second Way is identifying and amplifying different signals. Detect outliers and use them in active prevention. Consider other metrics that can offer more in-depth insight through varied perspectives. Perform analysis of monitoring data to understand the nature of problems.
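As one example of detecting outliers, the sketch below flags unusual latency samples using the median absolute deviation (MAD). The threshold of 3.5 and the sample data are assumptions chosen for illustration; a real monitoring stack would typically provide its own anomaly detection, and this is not a substitute for it.

```python
from statistics import median


def find_outliers(samples, threshold=3.5):
    """Return samples whose modified z-score exceeds the threshold.

    Uses the median absolute deviation (MAD), which is less distorted by
    the outliers themselves than a mean and standard deviation would be.
    """
    center = median(samples)
    mad = median([abs(value - center) for value in samples])
    if mad == 0:
        # More than half the samples equal the median; this simple
        # sketch cannot score deviations in that case.
        return []
    return [v for v in samples if 0.6745 * abs(v - center) / mad > threshold]


# Hypothetical response times in milliseconds.
latencies_ms = [120, 118, 125, 122, 119, 121, 480, 123, 117, 124]

print(find_outliers(latencies_ms))  # [480]
```

A single slow request like this may never trip a threshold on the average, which is what makes it a “weaker signal” worth amplifying.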
Set Aside 20% Tech Debt Reduction Time
If technical debt is the great destroyer of projects, products, and even companies, then the next logical question is: how do we prevent it? By addressing it, of course.
The number need not be 20%, although DevOps thought leaders recommend that specific amount because it’s tough to address existing debt and prevent it without investing that amount of time proactively. This equates to one day per week per developer. What’s important is to get started on this immediately. Operationally, what this usually looks like is that technical teams are given a set amount of time to take on work that is entirely unrelated to feature-work. The idea is that developers are closest to the symptoms of tech debt and can spot it with the least effort, and so giving them the freedom to address it helps reduce technical debt.
This can be adopted in many different ways. Some ideas are:
- Allocating a set portion of buffer time in the amount of work that teams take on and reserving those for tech debt
- Blocking off parts of developers’ calendars regularly to free them up to take care of this work
- Employing hack days specifically targeted to solve tech debt through innovation
- Running regular tech debt sprints (e.g., every fourth sprint)
The point is to build the habit of regularly reducing technical debt. Debt can’t be allowed to accumulate to the point where the development workflow halts and the influx of defects kills the ability to move forward.
This practice fits best with The First Way in that it’s all about improving the workflow and reducing waste. However, it often tends to be a characteristic of The Third Way in that mature DevOps organizations seem to be at the point where they can practice it.
Practice Trunk-Based Development
When attempting to get a continuous integration workflow in place, which is necessary for a continuous delivery workflow, you must adopt trunk-based development. The reason for this is that long-lived feature branches are seductively easy to create, but inevitably cause problems when they have to be merged into the trunk. Also, code drift is inevitable with multiple, long-lived feature branches. You have to resist this urge and look to other mechanisms to support CI / CD workflows. We’ll discuss feature flags shortly as a discrete item, but in addition to those, you can use branch by abstraction, where you commit unfinished code to the trunk without the key pieces that would cause it to run in production.
In a trunk-based development workflow, the only branches your team should have are short-lived feature branches, such as those behind pull requests. Vincent Driessen’s git-flow is a famous example of non-trunk-based development, so we would want to avoid it and instead consider GitHub flow. Vincent says as much in the “Note of reflection” he later added to his original git-flow post.
Employing this strategy plays well with The First Way and the concept of small batch sizes. Instead of large, coupled batches of code, we work in small batches of fully-functioning production code that expose us to less risk of regression, merge conflicts, and other waste in our workflow. This is an excellent way to improve throughput, but it requires the discipline to resist the urge to make the short-sighted trade-off of creating multiple long-lived branches.
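To show what branch by abstraction can look like, here is a minimal sketch. All of the class and function names are hypothetical, and the flat 20% and 8% rates are invented. The idea is that callers depend on an abstraction, the new implementation is merged to the trunk in small commits, and a single switch point keeps production on the old path until the new one is ready.

```python
from abc import ABC, abstractmethod


class TaxCalculator(ABC):
    """The abstraction that callers depend on."""

    @abstractmethod
    def tax_cents(self, amount_cents):
        ...


class LegacyTaxCalculator(TaxCalculator):
    """Existing behaviour: a flat 20% rate."""

    def tax_cents(self, amount_cents):
        return amount_cents * 20 // 100


class RegionalTaxCalculator(TaxCalculator):
    """Work in progress: lives on the trunk but is not wired in by default."""

    def __init__(self, rate_percent):
        self.rate_percent = rate_percent

    def tax_cents(self, amount_cents):
        return amount_cents * self.rate_percent // 100


def make_tax_calculator(use_regional=False):
    """Single switch point; the default keeps the legacy path in use."""
    if use_regional:
        return RegionalTaxCalculator(rate_percent=8)
    return LegacyTaxCalculator()


def checkout_total(amount_cents, calculator):
    return amount_cents + calculator.tax_cents(amount_cents)


print(checkout_total(1000, make_tax_calculator()))                   # 1200
print(checkout_total(1000, make_tax_calculator(use_regional=True)))  # 1080
```

Because checkout_total only knows about the abstraction, the new calculator can be built and tested on the trunk without a long-lived branch, and the old one can be deleted once the switch is flipped.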
Support Feature Flags
Feature flags are one of the riskiest practices mentioned in The Three Ways. This is because they introduce administration overhead and complexity with development and testing. There is also the cautionary tale of Knight Capital’s $460 million repurposed feature flag mistake. This doesn’t make feature flags inherently evil, but it does underscore the need for explicit management policies around feature flags.
However, this level of risk doesn’t make them any less critical because you can’t achieve maturity with continuous delivery or The Three Ways without them. The implication is that if you want to be a DevOps organization, you must become competent at them.
There are several easy workarounds to feature flag management. The first is to avoid the temptation to create your own system. There are several feature flag management tools out there — some are even free. This approach beats managing hard-coded files, which is what most people start with for some reason.
Another practice to adopt is to make sure that a feature flag gets deleted as soon as it isn’t needed. You don’t want to be the next Knight Capital.
Minimizing the number of feature flags is also essential. Feature flags naturally introduce complexity to your software. We want to reduce complexity, so avoiding long-lived, global feature flags is a good practice to follow.
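If you do end up looking at how flag tools work, the core idea is small. The sketch below is a toy in-memory registry, not the API of any real feature flag product; the flag names, owners, and dates are invented. It illustrates two of the policies discussed here: every flag has an owner and an expiry date, and stale flags can be listed so that they actually get removed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FeatureFlag:
    name: str
    owner: str
    enabled: bool
    expires: date  # date by which the flag and its old code path should be removed


class FlagRegistry:
    def __init__(self, flags):
        self._flags = {flag.name: flag for flag in flags}

    def is_enabled(self, name):
        # Unknown flags default to off rather than raising or defaulting to on.
        flag = self._flags.get(name)
        return flag is not None and flag.enabled

    def stale_flags(self, today):
        """Names of flags past their expiry date; candidates for deletion."""
        return sorted(f.name for f in self._flags.values() if f.expires < today)


registry = FlagRegistry([
    FeatureFlag("new-checkout", "payments-team", True, date(2030, 1, 31)),
    FeatureFlag("legacy-search", "search-team", False, date(2020, 6, 30)),
])

print(registry.is_enabled("new-checkout"))            # True
print(registry.is_enabled("does-not-exist"))          # False
print(registry.stale_flags(today=date(2025, 1, 1)))   # ['legacy-search']
```

A report of stale flags could be reviewed as part of regular tech debt time, which keeps the number of live flags, and the complexity they add, under control.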
This implies that there should be some centralized administration. Ideally, each team can manage its own flags. If many groups share code across a highly-coupled system like a monolith, this incurs more coordination overhead, as well as risk.
The gist is that feature flags are not for every team. Still, similar to other practices in this list, they are a critical aspect of The Three Ways and align to a continuum of competencies and values that lead towards DevOps maturity vis-à-vis The Third Way.
Perform Rapid, Short-Lived Experiments in Production
The Third Way is all about taking advantage of the speed and safety gained by achieving the first two Ways to enable rapid experimentation. This phase is where all of that work pays off, and this is how you bake innovation into your team’s daily work. It opens a whole new set of possibilities.
The reason that we list this practice as part of The Third Way is that the mechanisms specified in the previous Ways are needed to perform experiments:
- You must be able to deploy experiments quickly, so a CD pipeline is needed.
- You have to turn on and off experiments to avoid exposing your entire user base to an experimental feature, so feature flags are needed.
- Your developers need time to execute their experiments, so they must not be inundated with bugs, feature requests, or technical debt. The pace of development must support this.
- There needs to be a good relationship with the people coming up with experiments — whether they are other technical folks, user experience researchers, or business people.
At a practical level, we carry out experiments using mechanisms like A / B testing. This practice helps teams craft better user experiences and find ways to improve the product and business results. It is possibly the most reliable indicator of DevOps maturity.
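A common building block for A / B testing is deterministic assignment: the same user should always see the same variant without storing the assignment anywhere. The following is a minimal sketch of that idea, assuming string user IDs and a hypothetical experiment name; real experimentation platforms add much more, such as statistical analysis of the results.

```python
import hashlib


def assign_variant(user_id, experiment, treatment_percent=50):
    """Deterministically place a user in 'control' or 'treatment'."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return "treatment" if bucket < treatment_percent else "control"


# Expose a hypothetical experiment to roughly 10% of users.
counts = {"control": 0, "treatment": 0}
for n in range(10_000):
    variant = assign_variant(f"user-{n}", "new-onboarding", treatment_percent=10)
    counts[variant] += 1

print(counts)
```

Including the experiment name in the hash means a user who lands in the treatment group of one experiment is not automatically in the treatment group of the next. Setting treatment_percent to 0 acts as a kill switch, which is where this overlaps with feature flags.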
When to Adopt Practices
The following chart is intended to provide insight into the relationships between the various practices. It also emphasizes the alignment of purpose to each of The Three Ways. It’s interesting to see that there’s a rough order to the timing of adoption. It’s also clear that your organization can get started on adoption immediately.
Of course, the chart is not meant to be taken literally. You should adopt whatever makes sense for your organization at a given time, and it will not look the same for every organization.
DevOps is very broad and requires the alignment of many different proverbial stars in your organization to succeed. So it’s challenging to adopt. To overcome such an overwhelming set of challenges, sometimes you need an idea of where to start. Hopefully, I’ve managed to convey the value of using The Three Ways to establish the context for a few new goals for your organization and provided clarity on the sorts of activities your organization can take to meet those goals.
In case it’s not apparent, there are just as many philosophical and cultural changes that need to be in place to support a successful DevOps adoption. The best thing to do is to create a culture where there is space for these concepts and to start trying them out. The list here is by no means exhaustive, but every idea is something that should be in place in a successful, mature DevOps organization.
Best of luck with your DevOps journey!
The best way to understand The Three Ways is to read all of the books by the authors who introduced it. In order, The Phoenix Project, Beyond The Phoenix Project, The DevOps Handbook, Accelerate, and The Unicorn Project are all wonderfully engaging and relevant reads for any professional in the technology industry.
Gene Kim, one of the DevOps luminaries and author of the books mentioned above, has a great blog post that summarizes The Three Ways if you want to hear the description of the concept from one of the minds that conceived it.
This Atlassian article on peer code reviews does a great job of highlighting the benefits. Please read it to acquire a deeper understanding of the process and its advantages.
The Lean Production blog provides a great write-up that captures the essence of the ToC, albeit from a project management viewpoint. Andrew Fuqua has written a great article on applying the ToC to Agile software development that’s worth a read.
This Pete Hodgson article is probably the best feature flag article around.
Monitoring and alerting is an intense topic, and the area of DevOps concerns called Site Reliability Engineering (SRE) has strong foundations in this area. Check out the SRE Book to learn more. Also, The DevOps Handbook provides a great breakdown of some of the ways of applying these concepts.
I’m an employee of IBM. The views expressed in this blog are mine and don’t necessarily reflect the positions, strategies, or opinions of the company.