Nurturing High-Performing Teams with DevOps Culture

Daniel Yokoyama
Mar 20, 2023

For Brazilian readers and others who prefer Portuguese, there is a translated version of this article here.

Have you ever asked yourself why DevOps is such a confusing topic, full of misdirection and massive tutorials about all the “Ops” stuff, as if all a “Dev” had to do to learn DevOps were to become a SysAdmin?

In my daily job, I’m in charge of a Platform-Engineering team, and one of my goals is to strengthen the DevOps culture throughout the organization. That can be challenging, depending on how unfamiliar the teams are with the principles and values that drive such a culture. And when people try to find out how to learn DevOps, they usually come across a huge amount of material concerning Operations.

I mean, try to put yourself in the shoes of a rookie and figure out how overwhelming that would be… as if it were not overwhelming already. I had a hard time trying to learn DevOps, and I have over 20 years of experience. And, just to be clear, I’m not saying that those things are not valuable to learn about. I’m just saying that DevOps is not about any of that.

What is DevOps?

The cover of the book “The DevOps Handbook: How to create world-class Agility, Reliability, and Security in Technology Organizations”, by Gene Kim, Jez Humble, Patrick Debois and John Willis.

So I decided to write about DevOps and the challenges of implementing a DevOps culture, and to discuss which topics should be considered when trying to implement DevOps across the organization.

Creating a DevOps culture is crucial for organizations that want to improve the speed, quality, and reliability of their software delivery. However, creating this culture can be challenging, especially when developers are unaware of the best practices and values that a DevOps culture is supposed to bring into the organization. In this article, we will explore the key principles and practices of DevOps, and how they can be applied to overcome these challenges.

Gene Kim’s Three Ways

The cover of the book “The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win”, by Gene Kim, Kevin Behr, and George Spafford

I believe the most fundamental DevOps concept comes from The DevOps Handbook (2016, by Gene Kim, Jez Humble, Patrick Debois, and John Willis). I’m talking about the Three Ways of DevOps, which were introduced in an earlier book, The Phoenix Project (2013, by Gene Kim, Kevin Behr, and George Spafford), but are explored in much more depth in the later one.

The First Way is about creating flow from development through to operations and delivery. It means smoothing the whole path the code takes from the developer’s machine into production, applying any checks needed along the way, and assessing how well it fits into a functioning new version of the product.

The Second Way is about creating feedback loops so everyone can learn and improve from their experiences. So, how well is this code working? Are there any new bugs or issues regarding unexpected behavior? How long does it take to fix them? How is the user experiencing the product? What else can we learn from the product that could give us insight into what changes we should make to improve it?
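One of those questions, “How long does it take to fix them?”, is easy to turn into a number a team can watch over time. Here is a minimal Python sketch of that idea. The `Incident` record and the `mean_time_to_fix` helper are hypothetical names invented for this illustration; in practice the data would come from whatever issue tracker or incident tool the team already uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """Hypothetical record of a bug or production issue."""
    opened_at: datetime
    resolved_at: Optional[datetime] = None  # None while the issue is still open


def mean_time_to_fix(incidents: list[Incident]) -> Optional[timedelta]:
    """Average time between opening and resolving, ignoring open incidents."""
    durations = [
        incident.resolved_at - incident.opened_at
        for incident in incidents
        if incident.resolved_at is not None
    ]
    if not durations:
        return None  # nothing resolved yet, so there is no average to report
    return sum(durations, timedelta()) / len(durations)


def open_incidents(incidents: list[Incident]) -> int:
    """How many issues are still waiting for a fix."""
    return sum(1 for incident in incidents if incident.resolved_at is None)
```

Tracking a figure like this sprint after sprint is one simple way to see whether the feedback loop is actually getting shorter.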

Finally, the Third Way is about creating a culture of experimentation and learning, by celebrating successes and failures and creating a safe environment where people feel comfortable taking risks and trying new things.

In summary, if I had to give a short answer about what the heck a DevOps Culture is, I would happily say that it is an organization that values all Three Ways and lets itself be driven by them.

Nurturing High-Performing Teams

The cover of the book “Accelerate: Building and Scaling High Performing Technology Organizations”, by Nicole Forsgren, PhD, Jez Humble, and Gene Kim

But if you need a more tactical approach to learning how to implement a DevOps Culture, I can also mention another book co-authored by Gene Kim: Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations (2018, by Nicole Forsgren, Ph.D., Jez Humble, and Gene Kim). The book describes how Kim, Humble, and Dr. Forsgren worked together to conduct the yearly survey for the State of DevOps Report by Puppet, and the conclusions they drew from the data they collected. It is an ode to all of the Good Practices we’ve been talking about for the last few decades, but now with concrete data, research methodology, and science.

According to the book, a DevOps culture is critical for high-performing technology organizations. It’s characterized by a shared understanding of the goals and objectives of the organization, as well as a willingness to experiment and learn from failures. Leaders play a critical role in creating a DevOps culture, by creating a safe environment where experimentation and learning are encouraged, and promoting a culture of collaboration and shared responsibility.

When developers are unaware of good practices, it can create bottlenecks and delays in the DevOps process, slow down delivery, and cause frustration for everyone involved. To address this, organizations should focus on educating their developers about the best practices of DevOps, and why they are important. Here are some best practices that Kim and his colleagues have identified:

  • Continuous Integration: Developers should frequently integrate their code into a shared repository, and use automated testing to catch errors early.
  • Continuous Delivery: Code changes should be automatically built and tested so that they are always ready to be deployed to production.
  • Monitoring and Logging: Organizations should have automated monitoring and logging systems in place to quickly identify and respond to issues.
  • Infrastructure as Code: Infrastructure should be treated as code, and managed through version control systems so that changes can be easily tracked and managed.
  • Collaboration: Teams should work collaboratively, and share responsibility for the delivery and support of their systems.

The Emergence of Platform-Engineering Teams

Another key trend in DevOps is the rise of Platform Engineering teams. These teams focus on building and maintaining the underlying infrastructure and platforms that enable software delivery teams to work more efficiently and effectively. They create standardization and automation around core infrastructure components, such as databases and networking, so that development teams can focus on building applications rather than worrying about the underlying infrastructure. By providing a solid foundation for delivery teams to build on, Platform Engineering teams help to improve the speed, quality, and reliability of software delivery.

Consider for a minute what any backend developer has to worry about when creating a service for a REST API, and imagine all the buzz regarding DevOps added on top of that to make this service operational. That’s the problem a Platform-Engineering team is supposed to solve. They take all the plumbing (quoting Armon Dadgar, Co-founder and CTO at HashiCorp) regarding networking, automation, security, load balancers, caching services, reverse proxies, firewalls, cloud services, and observability services (logs, metrics, tracing) and make it available as services for the Software Engineers. That way, the engineers don’t have to understand all of that stuff (containers, Kubernetes, service mesh, everything else) in order to take their services to production, while still benefiting from all the stability, security, and availability delivered by the platform.

SRE as an Engine for Feedback Loops

Site Reliability Engineering (SRE) is another methodology that combines software engineering and operations to create highly scalable and reliable software systems. SRE emphasizes the importance of automating repetitive tasks, reducing the impact of failures, and designing systems for resilience and scalability. SRE teams work closely with development teams to ensure that applications are designed to be highly available and that they can quickly recover from failures.

More than that, SRE relies on the premise that reliability can be engineered (no pun intended). I mean, how can you make a service more reliable? Well, SRE works toward understanding which indicators should be used to define reliability (SLIs), such as response time or error rates. Then the team, alongside the business experts and stakeholders, tries to figure out the acceptable level of failure for those indicators (SLOs), meaning the turning point where everyone agrees it’s not worth trying to prevent further failures (maybe because it would be too expensive, or maybe because users won’t notice any improvement in their experience). Finally, Error Budgets have a role to play by helping teams know when to stop deploying and review their issues for a while. Observability is also at the heart of it all: tools measure and monitor the service and alert on unexpected behavior.
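As a rough illustration of how SLIs, SLOs, and Error Budgets fit together, here is a minimal Python sketch. The 99.9% target, the request counts, and every name in it (`AvailabilitySLO`, `success_ratio`, `budget_remaining`, `should_freeze_deploys`) are assumptions made up for this example; it is not taken from any specific SRE tool, and real teams usually compute these numbers from their monitoring system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilitySLO:
    """Hypothetical SLO: the share of requests that must succeed in a window."""
    target: float  # e.g. 0.999 means 99.9% of requests should succeed

    def error_budget(self, total_requests: int) -> float:
        """Number of failed requests the window can absorb before the SLO is missed."""
        return (1.0 - self.target) * total_requests


def success_ratio(total_requests: int, failed_requests: int) -> float:
    """SLI: fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic means nothing has failed
    return (total_requests - failed_requests) / total_requests


def budget_remaining(slo: AvailabilitySLO, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once it is overspent)."""
    budget = slo.error_budget(total_requests)
    if budget <= 0:
        return 1.0 if failed_requests == 0 else 0.0
    return 1.0 - failed_requests / budget


def should_freeze_deploys(slo: AvailabilitySLO, total_requests: int, failed_requests: int) -> bool:
    """Assumed policy: pause feature deploys once the budget is used up."""
    return budget_remaining(slo, total_requests, failed_requests) <= 0.0


if __name__ == "__main__":
    slo = AvailabilitySLO(target=0.999)
    total, failed = 1_000_000, 400  # made-up traffic for one window
    print(f"SLI: {success_ratio(total, failed):.4%}")
    print(f"Budget remaining: {budget_remaining(slo, total, failed):.1%}")
    print(f"Freeze deploys? {should_freeze_deploys(slo, total, failed)}")
```

The freeze rule here is deliberately simple. What actually happens when the budget runs out is a decision for the team and its stakeholders, not something the code can settle.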

Conclusion

In summary, creating a DevOps culture is critical for organizations that want to deliver high-quality software quickly and reliably. However, creating this culture can be a challenge, especially when developers are not aware of the best practices and values that a DevOps culture should bring to the organization. By educating developers on DevOps best practices and encouraging a culture of experimentation and learning, organizations can overcome these challenges and create high-performing software delivery teams. The rise of Platform Engineering and Site Reliability Engineering teams can also help improve the efficiency and reliability of software systems.

P.S.: Special thanks to Frederico Vitorino for help with proofreading. You are amazing!
