“I can’t believe I put off this book for so long… it’s awesome”
I wouldn’t classify myself as an avid reader, so I was shocked to find myself glued to “The Phoenix Project” over the course of 3 days! This book definitely caters to DevOps engineers, developers, IT professionals, and more, but it’s nothing like a technical textbook. Rather, it is the journey of an earnest engineer trying to do things the right way.
This fictional tale follows our protagonist, the VP of IT Operations at a fictional company. Don’t be fooled by that lavish, executive-sounding position: our protagonist has a very down-to-earth personality. Following him on his “adventures” to rectify the deep-seated issues in his organization felt reminiscent of the day-to-day things we sometimes see at work, like accumulated tech debt and the occasional incident.
Let’s talk about the three lessons that I learned from reading this book!
Lesson 1: Workflow problems in software development are not unique, and may already have been solved in our collective history as humans
In the 1980s, this plant was the beneficiary of three incredible scientifically-grounded management movements … Theory of Constraints, Lean Production, and Total Quality Management.
- Chapter 7
I wanted to quote the book on this particular portion, just to highlight how strongly I feel about this chapter. Software development workflows and processes cannot be seen as separate from the management issues found in any other industry. At the core of any workflow, we are dealing with people, and with work to be completed directly or indirectly by those very same people.
Take the Theory of Constraints in the quote above:
Every process has a constraint (bottleneck) and focusing improvement efforts on that constraint is the fastest and most effective path to improved profitability.
- Theory of Constraints (leanproduction.com)
Sound familiar? In our software development processes, we usually refer to these bottlenecks as “blockers”. The “blockers” can be a lack of front-end engineers to design the web applications, back-end engineers having to deal with complex deployment processes, or DevOps engineers spending more than 50% of their time on Ops issues and being unable to work on their core tasks.
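To make the Theory of Constraints a little more concrete, here is a minimal Python sketch. The stage names and weekly capacities are made-up numbers for illustration, not figures from the book. Under those assumptions, it shows that a pipeline can only deliver as fast as its slowest stage, so speeding up any other stage does not change the overall throughput.

```python
# Hypothetical weekly capacity (work items per week) for each stage of a
# delivery pipeline. These numbers are illustrative assumptions only.
stages = {
    "design": 12,
    "development": 10,
    "deployment": 4,
    "operations": 9,
}


def find_constraint(capacities):
    """Return the name of the stage with the lowest capacity."""
    return min(capacities, key=capacities.get)


def throughput(capacities):
    """The whole pipeline can only move as fast as its slowest stage."""
    return min(capacities.values())


bottleneck = find_constraint(stages)
baseline = throughput(stages)

# Doubling a stage that is NOT the constraint leaves throughput unchanged.
faster_dev = {**stages, "development": 20}
after_dev = throughput(faster_dev)

# Doubling the constraint itself raises throughput.
faster_deploy = {**stages, "deployment": 8}
after_deploy = throughput(faster_deploy)

print(bottleneck, baseline, after_dev, after_deploy)
```

In this toy model, the only improvement that pays off is the one aimed at the "deployment" stage, which is the point the quote above is making.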
To resolve these blockers, our protagonist is literally thrown into the heart of the solution: a factory floor that has faced the same problems before. There, his mentor challenges him to understand the 4 types of work that the factory floor has dealt with (in the book: business projects, internal IT projects, changes, and unplanned work), some of which illustrate why organizations get stuck in a downward spiral of not being able to keep up with their plans and deadlines.
Does the above sound familiar? It should, since it covers every type of work that I.T. companies are plagued with in today’s world.
Understanding these 4 categories of work allows us to appreciate how the first 3 can lead to the fourth. It usually follows the same “storyline”: we attempt to complete work in the first 3 categories but incur some “debt” due to poor decisions or temporary fixes (perhaps because of tight deadlines). The price to pay is the periodic occurrence of unplanned work, which at times jeopardizes other plans.
Having a good understanding of the above allows you and your organization to do these three things, for the core objective of reducing unplanned work:
- Analyze the past, to identify and backlog the tech debt that has consumed substantial resources in resolving unplanned work.
- Prioritize the completion of tech debt, and single out the unplanned work that is truly essential for the present (don’t just prioritize everything!). Try to resolve the tech debt that can either cause more tech debt (e.g. security protocols not followed) or consume the most resources in unplanned work.
- Set up a roadmap for the future, which could include investigating newer technologies and making better technical choices. This is to minimize or prevent similar mistakes from recurring.
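The prioritization step above can be sketched in a few lines of Python. This is only one possible reading of it: the backlog items, the hour counts, and the rule of putting debt that breeds more debt ahead of everything else are my own assumptions for illustration, not something prescribed by the book.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    name: str
    unplanned_hours: float   # hours of unplanned work traced back to this item
    creates_more_debt: bool  # e.g. security protocols not followed


def prioritize(items):
    """Order items: debt that breeds more debt first, then by hours consumed."""
    return sorted(
        items,
        key=lambda item: (not item.creates_more_debt, -item.unplanned_hours),
    )


# Hypothetical backlog built from analyzing past incidents.
backlog = [
    DebtItem("manual deployment script", 40, False),
    DebtItem("skipped security review", 15, True),
    DebtItem("flaky integration tests", 25, False),
]

ranked = [item.name for item in prioritize(backlog)]
print(ranked)
```

With these sample numbers, the skipped security review comes first even though it has consumed the fewest hours so far, because it is the item most likely to create further debt.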
To reduce and eliminate unplanned work, another lesson from the past is to use the “3 ways”, which I will cover in the next section.
Lesson 2: It is an arduous journey to a productive culture, through the “3 ways”
The predominant theme throughout the book is the introduction of the “3 ways”. Much like Sun Tzu’s The Art of War, the “3 ways” is a methodology, introduced by the protagonist’s mentor, for eliminating unplanned work and achieving the desired transformation of an organization.
Let’s understand what is covered under the “3 ways”.
The First Way
The First Way helps us understand how to create fast flow of work as it moves from Development into IT Operations, because that’s what’s between the business and the customer.
Under The First Way, organizations seek to shorten the cycle between development and delivery to customers. One of the ways this is achieved is by prioritizing the core goals of the organization over those of any individual team.
A theme in the book revolves around a security engineer who impedes other teams in order to meet his own team’s objectives, only to eventually realize that there were much easier alternatives to his efforts, some of which allowed the business to succeed even more. His realization paves the way towards understanding The First Way: acting in the interest of the business to build core solutions (such as monitoring for better visibility of the systems) over individualistic goals.
Another important point is the visibility of tasks that are in progress towards achieving the organizational goals. If those tasks are “invisible”, it becomes hard for leaders in the organization to deliver to customers on time and to create long-term plans for their products. Only when tasks are visible can priorities be set and tasks with dependencies be chained effectively, like a well-oiled machine. This should also apply to unplanned work, which we addressed in the previous section.
The Second Way
The Second Way shows us how to shorten and amplify feedback loops, so we can fix quality at the source and avoid rework.
In The Second Way, the key term to note is feedback loops. Following from The First Way where the fast flow of work is crucial and in a single direction, The Second Way focuses on optimizing this flow by introducing a reverse flow of information from I.T. Operations back to Development. This results in the formation of the feedback loop.
To execute The Second Way, we need to closely monitor each stage in the single flow between Development and I.T. Operations. This again builds on the visibility of tasks from The First Way, but with the aim of understanding how workflows can be improved (e.g. automation of manual tasks). Improving the workflow also encompasses better detection measures and feedback to the developers for recovery.
By practicing The Second Way effectively, we seek to eliminate certain sources of unplanned work, such as incidents resulting from bugs, by detecting and fixing them at earlier phases before the feature is released.
The Third Way
And the Third Way shows us how to create a culture that simultaneously fosters experimentation, learning from failure and understanding that repetition and practice are the prerequisites to mastery.
In The Third Way, the focus is to reward learning from failure, so as to encourage experimentation. The social aspect of an organization is crucial to the success of The Third Way, such as eliminating finger-pointing and implementing blameless postmortems. In doing so, we can encourage more information regarding the failure to surface, and look into preventing the re-occurrence of the same mistake in the long term.
Another aspect involves resilience engineering, such as Chaos Engineering, where we actively try to break the system in order to surface issues. We can pair this with measures from The Second Way to receive feedback and roll out actionable fixes.
All Three Ways
After talking about the three ways, it is quite clear that while the results are desirable, executing the measures is not going to be easy. Each of the three ways requires the organization and individual contributors to recognize that these are much needed changes for the longevity of the business and that resources to achieve them are acceptable costs. The book perfectly mirrors this journey from start to finish, showcasing the long discovery and implementation process along the way.
Lesson 3: What should I start doing, as an individual contributor?
While the book mostly follows our protagonist on his adventure to escape the downward spiral of debt and unmet deadlines, there are lessons that individual contributors can learn from the individual contributors who appear throughout the story:
1. Don’t silo information, and don’t be the single point of failure.
“Every time that we let Brent fix something that none of us can replicate, Brent gets a little smarter, and the entire system gets dumber.”
- Chapter 10
One of the running themes of “The Phoenix Project” is an engineer named Brent, whom everyone in the organization needs in order to solve their problems. This is because he is one of the few engineers with enough context about the company’s operations to effectively solve many of the problems arising from unplanned work.
In a way, Brent was the solution that they needed. However, he was at best a short-term solution, as there was only one of him in a company with thousands of employees. Throughout most arcs of the story, the book shows the dangers of being overly reliant on Brent whenever he is not readily available.
I would add that I am also guilty of having worn my numerous responsibilities as a “badge of honor” and put in more time at work. Over time, I came to understand the dangers this posed to the company, and actively shared my knowledge and responsibilities with my colleagues. This gave the company multiple responders for key systems and spread the risk.
2. Seek to support and not to impede, through listening and understanding. The business must win!
“I can’t believe John is grandstanding in front of the auditors. It’s times like this that make me wonder whose side he’s really on.”
- Chapter 5
We briefly met John, the security engineer, in the previous section. Within the book, he is one of the active elements obstructing the protagonist from making progress. This is something that some of us might be familiar with as working adults: perhaps a department or an engineer who insists on “their way or the highway” in getting things done. This severely contravenes the three ways, in that nobody wins, and the business definitely loses.
Communication is key here. On either side of a discussion, we may not have the full context of the other side, and in turn the full picture. Not only does the business lose, but people also start to distrust each other.
Instead, we should seek to listen to the other party and understand their position. Learn to offer suggestions that work for everyone, and foster a culture where people can trust each other to make the best choices instead of impeding each other.
3. Be open to change, such as a blameless postmortem culture, both as the observer and as the offender
“By removing blame, you remove fear; by removing fear, you enable honesty; and honesty enables prevention”
- Chapter 4
It is instinctive for us as humans to want to trace actions and results back to their origin. Society is built on that mentality, with law and order in place to identify individuals who commit crimes and exact the appropriate punishment. In the workplace, it can be the same at times. When mistakes are made, the person who made them is made known to everyone in the organization while punishment is exacted (e.g. a pay cut or getting fired).
Hence, it might be very counter-intuitive for us, both as the “offender” and “observer”, to act in a positive way. As the “offender”, we try to hide our mistakes, and as the “observer”, we frown upon the “offender” for making mistakes.
While counter-intuitive, we should not hide mistakes when we are the “offender”, but seek to fix the problem and participate actively in the postmortem process to prevent the issue from occurring again. Similarly, as the “observer”, we should treat the mistakes as a learning process for ourselves too.
I would very much recommend reading up on the blameless postmortem culture at Google and trying to introduce said culture to your organization, which focuses on educating rather than blaming, and fixing problems to prevent mistakes from being committed again.
To conclude, the above is a snapshot of the important lessons that I took from the book. After reading it, I tried to apply them, and to be honest, executing the same solutions in real life was a lot harder than it sounds. Props to the authors, who have done a great job of illustrating these experiences with realism (albeit in a fictional setting) and made it easier for engineers to identify the tasks that an organization should be undertaking.
To end off, whether you are an engineer, a product manager, or even in sales at any I.T. company, I would very much recommend this book to anyone! 10/10!
Have fun! Ciao~