Putting the Toyota Kata into practice

Part 3 — Where we are, where we want to go and how to do it

Oct 18, 2017

In the previous article, I showed how we arrived at our focus area and a clear understanding of our challenge. The next step in the planning phase helps us understand where we stand with regard to our challenge: our current condition. We did not consider our development pipeline safe, and the level of automation was very low, close to zero.

To start with, we looked at what was obvious and easy to measure: lead times. We had a lot of historical data to help us understand the timing. The level of automation was a different matter. We didn’t even have a clear picture of what was going on during the release process. Every person we asked had a different view of things, and putting it all together was almost a job for CSI.

This was the result of our first investigation of the release process. 95% of the post-its represent a manual step.

Documenting and assessing all of these details took some time, but it was necessary and very valuable. The saying “the devil is in the details” was very applicable in this case. The initial investigation gave us a good picture of our current situation. At least we thought it did. The next step for Sergi was to find the current condition of our process with regard to the challenge. We often start with a vague description of the current condition. One point I found important is to express the current condition as explicitly and quantitatively as possible.

I asked Sergi, “If you look at our challenge, how would you describe the Current Condition?”

“Oh, it takes more than 24 hours to release the module, and we can only release it once a week,” he said.

“Wow, once a week? That doesn’t sound very agile. Where does that come from?” I wondered.

“I don’t know”, Sergi shrugged, “but I found a lot of interesting stuff we can look at to improve and by the way, I found out more about…”

We chatted about the details of how things were done and the sad truth was that the releases had more restrictions than we initially assumed.

“You know, when I spoke with the guys from production, they told me that releases could only be done on Tuesday, Wednesday and Thursday. And the releases needed to be done between 9 am and 2 pm.” Ok, we added all this to the definition of the current condition. Initially, it looked like this:

Release time >24 hours, once a week, on Tuesday, Wednesday or Thursday, between 9 am and 2 pm

That was a good start, and it gave a hint about the concrete things we were measuring: how often, how long and when. Those were the things we wanted to improve. But it did not feel complete. As with many things in life, the definition of the current condition needed to improve and evolve.

Where we want to be — Target Condition

Before we even started the initiative, I made the bold statement to senior management that we should be able to release working software within 15 minutes, and that we could make it happen. That is quite a step forward compared with the 24 hours we needed at that moment for the specific module we had picked as a guinea pig. For most of our other modules we were counting in weeks for a release, but that is another story…

So the discussion around the description of our Target Condition started with something like this:

Release time equal to or less than 15 minutes

Very soon we realised that we needed to add something about the release window too, since that was a real constraint (see the current condition). The next iteration of the Target Condition looked like this:

Release time equal to or less than 15 minutes. Release at any time during office hours (9 am to 5 pm).

Ok, we had some more detail now, but it still didn’t feel right with regard to our challenge: Provide a safe environment without human intervention. The part that said without human intervention gave a clear hint, and the third version of the Target Condition looked like this:

Fully automated release in less than or equal to 15 minutes at any time during office hours (9 am to 5 pm).

The point I want to make is that the trick is to spend just enough time on the definition of your target condition to get you going. It often means doing a couple of iterations in the improvement work to redefine or evolve your different definitions. A regular improvement kata coaching activity is the opportunity for this. We had this coaching activity weekly, and I’ll write about it in another article. The definition of our target condition led to a refinement of the current condition and the addition of the automation element.
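Because both conditions are now explicit and quantitative, they can be written down as plain data and compared dimension by dimension. The following is only a minimal Python sketch of that idea, not something we used in the initiative. The `Condition` class, its field names and the `gaps` helper are hypothetical, the 24 hours are treated as a lower bound, and Monday to Friday as working days is my assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Condition:
    """Hypothetical snapshot of a release process (field names are illustrative)."""
    release_minutes: int           # release duration in minutes (a bound, not a measurement)
    release_days: FrozenSet[str]   # weekdays on which a release is possible
    window_start_hour: int         # earliest hour of the day a release may start
    window_end_hour: int           # latest hour of the day a release may finish
    fully_automated: bool


def gaps(current: Condition, target: Condition) -> List[str]:
    """Name every dimension in which the current condition falls short of the target."""
    found = []
    if current.release_minutes > target.release_minutes:
        found.append("release time")
    if not target.release_days <= current.release_days:  # target days must all be allowed
        found.append("release days")
    if (current.window_start_hour > target.window_start_hour
            or current.window_end_hour < target.window_end_hour):
        found.append("release window")
    if target.fully_automated and not current.fully_automated:
        found.append("automation")
    return found


# Assumption: "office hours" in the target condition means Monday to Friday.
WORKING_DAYS = frozenset({"Mon", "Tue", "Wed", "Thu", "Fri"})

# Current condition: >24 hours, Tuesday to Thursday, 9 am to 2 pm, almost no automation.
current = Condition(24 * 60, frozenset({"Tue", "Wed", "Thu"}), 9, 14, False)

# Target condition: fully automated, at most 15 minutes, any time from 9 am to 5 pm.
target = Condition(15, WORKING_DAYS, 9, 17, True)

print(gaps(current, target))
```

Each entry in the returned list is a place to go looking for obstacles, which is where the conversation with Sergi went next.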

Things in our way

Ok, we had the general outline. We knew where and what we wanted to improve. We saw the current condition of our process, and we knew where we wanted to be in the future: the target condition.

“Where do we go from here?” Sergi was wondering how far this would take us. Until now it had not been obvious how this could lead him to a plan.

“Why do you think our current condition is so much worse than the target we have set ourselves?” I hoped that my question would open the door to the list of obstacles to tackle.

“Oh, I have some ideas about that…” Sergi said and started a short detour into the area of complaints about how things are handled around here. I wanted to steer him towards concrete things to tackle.

“Let’s have a look and compare the current and target condition and see if we can find something there.” I gave him a hint as a starting point. “Why do you think the releases are limited to Tuesday, Wednesday and Thursday?”

“That is easy”, Sergi said, “There are many manual steps in the process. There also seem to be some steps that need approval from a top manager.”

“Tell me more about these approvals.” The conversation went on like this for a while and resulted in a list of things Sergi wanted to look into. He thought that resolving these “things” would improve the situation but was not sure of it.

“These things are the obstacles to reaching your target condition.” I explained to him that they are like rocks lying in his way to the target condition. So Sergi took his first list of obstacles and put it in an area on his canvas that he called the Obstacle Parking Lot. The way to get rid of the obstacles is the real beauty of the Improvement Kata. Earlier I mentioned that at the core of the Improvement Kata lies the PDCA-cycle (Plan, Do, Check, Act). It is a scientific model to experiment your way towards improvement. So let’s look at how Sergi used the PDCA-cycle in practice.

He picked an obstacle from the parking lot that seemed suitable to start with and not too difficult to remove. That is very common when you start with the Improvement Kata. What happened is that Sergi found many obstacles that were no-brainers from an improvement point of view: things we just needed to do, change or get rid of. In other words, real process waste and obvious stuff, where Sergi already knew the outcome. By removing these, Sergi could get rid of the weekday constraint, which of course changed the current condition:

Release time >24 hours, once a week, on any working day, between 9 am and 2 pm

After this, things became a bit more difficult and were not just about the removal of waste. The outcome of an action was no longer clear. The obstacles became complicated or even complex. Let’s walk through a complete PDCA-cycle and have a look at how Sergi tackled the removal of an obstacle.

Plan

Obstacle: Manual approval is required after 17:00 or on Fridays

Hypothesis: We hypothesise that the development team, instead of the release managers, can release the CatalunyaAir module into the production environment themselves. That way, we can bypass the restriction of only three possible release days during the week.

Sergi’s idea was that the CatalunyaAir module is isolated enough that the development team has all the information it needs to release the module itself. They would be the ones who would need to do the fixes anyway if things went wrong.

Do

Experiment: The development team will check the percentage of traffic on the servers before pressing the release “button”. The development team will monitor the module in the production environment, and the KPI to monitor will be server load.

Expected Result: Release at any time during a working day. Expedited approval from upper management is no longer needed. Server load should not exceed 40%.

Check

Observed result: The team managed a release into the production environment without approval from release management or upper management. Server load did not exceed 40%. Interviews during the preparation of this experiment showed that there is no agreement to standardise the learnings for future releases.

Act

What we’ve learnt:
1) No clear system KPIs defined
2) There are concerns regarding the availability of people for monitoring. (e.g. Who will monitor this release after business hours?)
3) Rollbacks are manual and difficult and currently impossible to perform by the development team.
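The cycle above has a regular shape, so it can be kept as a small record per experiment. The following is a minimal Python sketch under my own assumptions, not a tool Sergi used. The `PdcaCycle` class, its fields and the `next_step` rule are hypothetical, and only the results that the Check step reports explicitly are recorded.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PdcaCycle:
    """Hypothetical record of one Plan-Do-Check-Act experiment."""
    obstacle: str
    hypothesis: str
    expected_results: List[str]
    observed: Dict[str, bool] = field(default_factory=dict)  # expected result -> was it observed?
    can_standardise: bool = False                            # can the new way of working become the standard?
    learnings: List[str] = field(default_factory=list)

    def expectations_met(self) -> bool:
        # An expected result that was never recorded counts as not observed.
        return all(self.observed.get(result, False) for result in self.expected_results)

    def next_step(self) -> str:
        # Illustrative rule: standardise only if the experiment worked AND can be adopted.
        if self.expectations_met() and self.can_standardise:
            return "standardise"
        return "plan next experiment"


cycle = PdcaCycle(
    obstacle="Manual approval is required after 17:00 or on Fridays",
    hypothesis="The development team can release the CatalunyaAir module themselves",
    expected_results=[
        "release without management approval",
        "server load at most 40%",
    ],
)

# Check: record what was actually observed.
cycle.observed["release without management approval"] = True
cycle.observed["server load at most 40%"] = True
cycle.can_standardise = False  # no agreement to standardise the learnings

# Act: write down what was learnt; each item is a candidate for the obstacle parking lot.
cycle.learnings += [
    "No clear system KPIs defined",
    "Concerns about availability of people for monitoring",
    "Rollbacks are manual and cannot be performed by the development team",
]

print(cycle.expectations_met(), cycle.next_step())
```

Keeping `can_standardise` separate from `expectations_met` reflects what happened here: the experiment delivered its expected results, yet the outcome still sent us back to planning.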

So the outcome of this first experiment was not entirely positive. Although the development team managed to do a release, the new process could not be standardised, for several reasons. The good thing is that we found new obstacles to remove. Again, some are simple no-brainers, and others need experimentation. My point here is that experiments very often fail. From a theoretical point of view, learning is optimal when half of your experiments fail. By failing, we mean that an experiment didn’t give the expected outcome. You then repeat the PDCA-cycle until you have removed the obstacle or have come to the conclusion that you cannot remove it. If that is the case, you might have to rethink the direction of your improvement.

Sergi has practised the Improvement Kata for more than a year now. He has become quite good at it, and only now and then does he fall into the trap of causality or reductionist thinking, a common pitfall for project managers. That is why it is important that the kata practitioner, also called the learner, has a kata coach and gets help through regular coaching sessions.

What happened to the fuzziness of the project? Sergi stopped using the word project; he talks about the 15-minutes initiative instead. The fuzziness, though, didn’t go away. In fact, Sergi likes it now. It makes the job much more exciting and also opens new doors all the time to run improvement experiments. From a management point of view, the initiative was a revolution for Sergi. At the time of writing there were 120 people engaged in the initiative, but nobody had been assigned to it, i.e. there is no dedicated team. All the participants run their local improvement experiments, driven by the intrinsic motivation to have a fearless development environment. I will write about Sergi’s new leadership style and the incredible success he has with it.

I hope Sergi’s adventure inspires you to try the Improvement Kata yourself. It doesn’t need to be an initiative that involves 100 people. I recommend that you start small, since you probably won’t have a Kata Coach around you. Good enough to try, safe enough to fail.

Good luck!