Accelerate to Survive
At Trendyol Tech, we think about how to improve software delivery performance as much as how to improve software quality. As we improve our software delivery process, the book “Accelerate” is one of our most important guides.
In this article, I will first describe what Accelerate covers and then briefly explain what we are doing to accelerate our software delivery.
Measuring Software Delivery Performance
The performance of software development teams is measured in different ways, but many organizations use the wrong criteria. One of the most common is velocity. Each team can assign a different number of points to the same story, and teams tend to game their story points when velocity is used as a performance criterion. Velocity is therefore a misleading measure of performance.
Software delivery performance can be evaluated with four key metrics: delivery lead time, deployment frequency, time to restore, and change fail rate.
Lead time is the time it takes to satisfy an incoming request. When lead time is short, we get feedback from customers faster, so we can correct our applications faster. Problems also become easier to identify and solve. To reduce lead time, we must split stories into small pieces, which also reduces the error rate. So how can we make sure that our stories are small enough? This is where deployment frequency helps: if we deploy our applications frequently, it means we have managed to split our stories into small enough pieces.
Lead time and deployment frequency measure the tempo of software delivery. To measure stability, we need to focus on time to restore and change fail rate.
Time to restore is the time it takes to restore a service when an incident occurs. The change fail rate is a measure of how often deployments fail. In other words, it represents how often software, infrastructure, or configuration changes fail.
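To make the four metrics concrete, here is a minimal Python sketch of how they could be computed from a list of deployment records. The `Deployment` record, its field names, and the `four_key_metrics` function are hypothetical and exist only for illustration. Lead time is measured here from commit to production, which is an assumption; measuring from the moment a request arrives would work the same way.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did the change fail in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _hours(start: datetime, end: datetime) -> float:
    """Elapsed time between two timestamps, in hours."""
    return (end - start).total_seconds() / 3600


def four_key_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarize tempo (first two keys) and stability (last two keys)."""
    lead_times = [_hours(d.committed_at, d.deployed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        _hours(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "lead_time_hours": median(lead_times) if lead_times else None,
        "deployments_per_day": len(deployments) / period_days,
        "time_to_restore_hours": median(restore_times) if restore_times else None,
        "change_fail_rate": len(failures) / len(deployments) if deployments else 0.0,
    }
```

The median is used instead of the mean so that a single unusually slow change does not dominate the result; that is a design choice of this sketch, not something the metrics require.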
Team culture has a significant impact on software delivery performance. In a team with a good culture, people are more satisfied with their work. First of all, teams must have a culture that focuses on their goals. So how can organizations improve their team culture?
- Information flow should be provided quickly and reliably.
- People should trust each other and work collaboratively.
- A good culture leads to higher-quality decision making, and decisions are more easily reversed if they turn out to be wrong.
- One person should not be held responsible for failures.
- New ideas should be welcomed.
- Responsibilities should be shared.
Continuous delivery is a set of capabilities that enable a rapid and reliable software delivery process. Teams that implement CD practices experience less deployment pain and burnout. They also have a better culture and less rework and unplanned work. There are five primary principles of continuous delivery.
- Build quality in: Organizations should invest in teams and the tools they use. In this way, dealing with problems becomes faster and cheaper.
- Work in small batches: Teams generally tend to keep the scope of the work large. When we divide the work into small pieces, we can see the impact faster.
- Computers perform repetitive tasks; people solve problems: Repetitive work should be automated. Teams can then focus on solving problems and invest more in their system.
- Relentlessly pursue continuous improvement: The most remarkable feature of high-performing teams is that they are never satisfied. They constantly strive to improve their system.
- Everyone is responsible: Different people take part in different stages of the software delivery process. Everyone involved in the process should act in close cooperation.
We need to establish some basics well to be able to implement continuous delivery. Unfortunately, many software development teams work with long-running branches. Merging these branches can be tiring because of conflicts, so our branches should be short-lived. To achieve this, we must split stories into small pieces. Furthermore, every merge should trigger a pipeline that runs the tests, and developers should be notified immediately if tests fail.
Teams with reliable test automation can work more confidently. Tests also make it easier to determine where the errors in our code are. Developers should maintain and fix the acceptance tests themselves, which brings two important benefits. First, the code becomes more testable, which also shows how important TDD is. Second, when developers are responsible for the automated tests, they put more effort into maintaining and fixing them.
For a powerful architecture, we should focus on a loosely coupled system. When our system is loosely coupled, we can run our tests without requiring an integrated environment, and we can deploy each application independently of the applications it depends on.
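As an illustration of testing without an integrated environment, the following Python sketch has one service depend on a small interface rather than on another running application. All names here (`StockClient`, `OrderService`, `FakeStockClient`) are hypothetical and not taken from any real system.

```python
from typing import Protocol


class StockClient(Protocol):
    """The only thing the order service needs from the stock service."""

    def available(self, product_id: str) -> int: ...


class OrderService:
    """Depends on the StockClient interface, not on a concrete service."""

    def __init__(self, stock: StockClient) -> None:
        self._stock = stock

    def can_order(self, product_id: str, quantity: int) -> bool:
        # An order is possible for a positive quantity that is in stock.
        return quantity > 0 and self._stock.available(product_id) >= quantity


class FakeStockClient:
    """In-memory stand-in used in tests instead of the real stock service."""

    def __init__(self, levels: dict[str, int]) -> None:
        self._levels = levels

    def available(self, product_id: str) -> int:
        # Unknown products are treated as out of stock.
        return self._levels.get(product_id, 0)
```

A test can then build `OrderService(FakeStockClient({"sku-1": 3}))` and check `can_order` directly, with no network call and no deployed stock service.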
If an organization has a good architecture, less communication between teams is necessary. In other words, when the architecture is loosely coupled, the teams are loosely coupled too, and each team can work on its assignments independently of the others. This does not mean that there should be no communication between teams, but too much communication and too many meetings affect delivery performance negatively.
Teams should also not be forced to use a particular programming language or tool. They should be free to choose according to their needs. In this way, they can do their job with more satisfaction and fun.
The fear a team feels during a deployment tells us a lot about its delivery performance. Deployment pain indicates that the team needs to improve its culture and software delivery performance. Too much deployment pain indicates that software development and delivery are not sustainable.
We should have comprehensive test and deployment automation to reduce deployment pain. Applying trunk-based development practices and having a loosely coupled architecture also help to reduce it. We also need to pay attention to deployability while writing our code.
Burnout is defined as physical and mental fatigue caused by overwork and stress. Organizations must prevent team burnout to speed up the software delivery process, because burnout makes our work seem insignificant and creates a feeling of desperation.
Common Problems That Can Lead To Burnout
• Work overload: job demands exceed human limits.
• Lack of control: inability to influence decisions that affect your job.
• Insufficient rewards: insufficient financial, institutional, or social rewards.
• Absence of fairness: lack of fairness in decision-making processes.
• Value conflicts: mismatch between the organization’s values and the individual’s values.
How to Prevent Burnout
• Fostering a respectful, supportive work environment that emphasizes learning from failures rather than blaming.
• Communicating a strong sense of purpose.
• Investing in employee development.
• Asking employees what is preventing them from achieving their objectives and then fixing those things.
• Giving employees time, space, and resources to experiment and learn.
Leaders should help their teams to improve their delivery process performance. They should also inspire and motivate their employees.
How Can Leaders Invest in Their Teams?
• Create space and opportunities for learning and improving.
• Establish a dedicated training budget and make sure people know about it.
• Encourage staff to attend technical conferences at least once a year and summarize what they learned from that experience.
• Encourage teams to organize internal events where teams get together to work on technical debt.
• Set up internal hack days, where cross-functional teams can get together to work on a project.
What Are We Doing to Accelerate?
We imported all of our Git repositories into GitLab and started using GitLab CI/CD. When code is merged into the master branch, unit tests, automation tests, and Sonar analysis are triggered in the pipeline. When a step in the pipeline fails, a notification is sent to our Slack channel.
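The fail-fast behaviour of such a pipeline can be sketched in a few lines of Python. This is not GitLab CI configuration and not the Slack API: `run_pipeline`, the step functions, and the `notify` callback are hypothetical stand-ins that only model the idea of stopping at the first failing step and sending one notification.

```python
from typing import Callable

# A step is a (name, function) pair; the function returns True on success.
Step = tuple[str, Callable[[], bool]]


def run_pipeline(steps: list[Step], notify: Callable[[str], None]) -> bool:
    """Run steps in order; stop and notify at the first failure."""
    for name, step in steps:
        try:
            ok = step()
        except Exception as exc:  # a crashing step counts as a failure
            ok = False
            name = f"{name} ({exc})"
        if not ok:
            notify(f"Pipeline failed at step: {name}")
            return False
    return True
```

In a real setup the `notify` callback would post to a chat channel. Passing it in as a parameter keeps the sketch testable with a plain list, for example `run_pipeline(steps, messages.append)`.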
Lunch and Learn
We regularly organize lunch and learn events within the team. In this way, all of our team members help each other learn new things.
Detect Repetitive Tasks
We identified our repetitive tasks and started automating them. As a result, we have more time to solve problems.
In our code reviews, we discuss how to improve our code quality. In this way, we make our code more readable and maintainable.
Four Key Metrics
We are building a Grafana dashboard to track the four key metrics (delivery lead time, deployment frequency, time to restore, and change fail rate).
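A dashboard panel for deployment frequency needs one data point per day, including days without any deployment. The Python sketch below shows one way to prepare such a series. The `deployments_per_day` function and its input shape are assumptions for illustration and say nothing about how the data actually reaches Grafana.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def deployments_per_day(
    deployed_at: list[datetime], start: date, end: date
) -> list[tuple[date, int]]:
    """Return one (day, count) point per day from start to end, inclusive."""
    counts = Counter(ts.date() for ts in deployed_at)
    number_of_days = (end - start).days + 1
    series = []
    for offset in range(number_of_days):
        day = start + timedelta(days=offset)
        # Counter returns 0 for days that had no deployment.
        series.append((day, counts[day]))
    return series
```

Filling in the zero days explicitly keeps gaps visible on the chart instead of letting the line jump from one deployment day to the next.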