In our previous post, we introduced Monitoring and Observation as the main work axes of our Quality Assistance team. This post is about the former and, more specifically, about monitoring qualitative data from inputs gathered at our Team Health Check.
In a few words, the Squad/Team Health Check (from now on, THC) is a team-aimed self-evaluation framework for visualising what to improve next. It was first introduced by Spotify in 2014, and I strongly recommend reading the original article by Henrik Kniberg for a complete understanding. What you will find here is a first-hand account of our particular THC experience at Mercadona Tech: our setup, our first outcomes, our challenges and some tips for driving the sessions that we have gathered so far.
The outcome was somewhat uncertain, because THC success depends primarily on transparency, trust and acceptance of criticism within the organisation. After a few months here, we sensed that it might fit our teams' culture well. Mercadona Tech's lean approach to adopting new processes and tools was key to putting this into practice.
There were two core ideas to keep in mind before the kick-off, and this is how they were explained to our colleagues:
- Self-assessment means we are not comparing teams, because personal opinions are intrinsically subjective.
- This framework must primarily serve the team, although managers and people supporting the team might also benefit from the data gathered.
What is being evaluated?
We chose ten topics, almost identical to Spotify's. For our first run, in a totally lean approach, we decided NOT to buy the card set and used a thumbs 👍🏼/🤛🏼/👎🏼 voting method instead, with topic definitions displayed as slides on the board. However, for the second run we went fancy and bought them 😊
List of topics assessed: Delivering value, Releasing, Speed, Suitable Process, Health of code, Fun, Learning, Mission, Pawns or Players and Teamwork.
In our teams there are, unsurprisingly, engineers, product managers and designers. From the very beginning, we were convinced that THC sessions should run only with Engineering, with Product's attendance optional and Design excluded, as the topics looked far from their scope.
After three iterations, we have seen the benefits of inviting everyone, but we have also felt at times that product managers or designers, when attending, were a bit lost in some engineering conversations. We enabled abstention as a voting option for those cases, as well as for newcomers.
First round
- Teams: 5
- Timing: from 35 to 70 minutes.
- QA feeling: excellent, because the level of transparency exceeded our expectations. Debates on every topic were very constructive.
- Feedback: although no specific feedback was gathered, the general feeling about the sessions was very good and the experience was perceived as positive.
Things we learnt:
- Identifying improvement actions was delegated to the teams, so the THC ended up being a snapshot of health and nothing more.
- Sometimes it was hard to define what fitted into a topic and what didn't, or even whether votes reflected people's feelings within the team or within a wider context (the whole engineering department or even the organisation).
Second round (3 months later)
- Teams: 7
- Timing: from 60 to 100 minutes. This time we booked some time for teams to identify improvement actions related to their unhealthy topics, and we also dedicated a minute to clearly defining what each topic meant. This proved a good time investment for everyone. As you can see, like any other process at Mercadona Tech, we try to improve our THC iteratively by analysing what did or didn't work well after every round of sessions.
- QA feeling: bittersweet, depending on the team. Some teams were unable to identify improvement actions even though their self-perceived health was not great. We will come back to this later in the article.
- Feedback: we introduced a 5-star service rating at the end of each session and the global score was very good. We even received some ideas for improving the framework, which was great because it meant teams actually cared about the THC and took some ownership of it.
Something we learnt: some discussions went off-topic, and so did some votes. Again, more on this later in this post.
Third round (5 months later)
- Teams: 8
- Timing: from 60 to 100 minutes
- Exceptional setup: half of the team was working remotely because of COVID-19 policies, so we had to use Google Forms instead of a whiteboard.
- QA feeling: exactly the meh anticipated by Henrik Kniberg in his article. We had the impression that a new health map was being generated, but that little was coming from our teams in terms of improvement actions.
- Feedback: very good, but only partially received, which we interpreted as a warning sign.
Things we learnt:
- This activity does not work well remotely. A whiteboard and thumbs up/down are much more powerful than online forms when your goal is to bring the team together and enable debate.
- We (all) had to put more effort into the actions part, as some people commented in our feedback form.
We must honestly say that, generally speaking, teams have helped us run these sessions in a climate of transparency and goodwill. However, as mentioned above, some enemies can appear and ruin an exercise of this kind, so be prepared to identify and tackle them quickly:
- Off-topic or drifting discussions: they will waste everyone's time and it is your responsibility to stop them promptly.
- Shy or overly respectful people: all voices must be heard, and your job as facilitator is to give them the floor (which does not mean forcing them, either).
- Participants trying to convince disagreeing colleagues: make sure debates are constructive and respectful. The goal is not to reveal THE truth but to reach consensus and get the team aligned.
- System players: some colleagues may see this activity as an opportunity to be heard, since results are escalated and discussed with managers. If your organisation has other forums where these people can express their worries or complaints, try to kindly redirect them before the whole health map becomes unrealistic. Remember that your main goal is to facilitate the identification of measures or actions to improve teams, and this will definitely not happen for problems that remain outside the team's scope.
Analysing the results
In every session, one of the QA engineers acts as conductor/facilitator while the other takes notes. The goal is to use these notes as an anonymised, fine-grained source of information to understand what lies behind the health map. Obviously, this approach involves a significant workload for QA, but so far these efforts have proved valuable.
So, at this stage, we have a double analysis on our plates: qualitative and quantitative.
In our Quality Assistance role, having a quarterly picture of cross-team concerns or pains is of tremendous value. For the time being, we have based this analysis on a simple keyword extraction per THC topic, followed by a big brain squeeze where we look for common issues across teams. Knowing about them beforehand is an advantage for an accurate interpretation of this massive amount of information.
As a result, we are able to present a list of hot topics that partially explains our yellow and red spots.
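To make the qualitative side more concrete, here is a minimal sketch of what such a keyword extraction per topic could look like. The topic names match our list above, but the sample notes, stopword list and function names are hypothetical, not our actual tooling:

```python
from collections import Counter
import re

# Hypothetical anonymised session notes, grouped by THC topic.
notes_by_topic = {
    "Releasing": [
        "deploy pipeline slow on Fridays",
        "manual steps in deploy checklist",
    ],
    "Health of code": [
        "legacy module lacks tests",
        "flaky tests in the CI pipeline",
    ],
}

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"on", "in", "of", "the", "a", "an", "to"}

def keywords(notes, top_n=3):
    """Count word frequencies across a topic's notes, ignoring stopwords."""
    words = re.findall(r"[a-z]+", " ".join(notes).lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

for topic, notes in notes_by_topic.items():
    print(topic, keywords(notes))
```

Running this over all teams' notes surfaces repeated words per topic, which is the starting point for spotting common issues across teams; the "brain squeeze" still happens on top of that output.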
As with the session notes, we also keep track of anonymised votes so that a simple colour (green, yellow or red) does not hide the real feeling about each particular topic. So, basically, we enter these votes into a data model that allows us to generate some interesting graphs such as these:
It is only at this stage that we start exploring cross-team data and identifying patterns. Hopefully, trends will appear after several rounds, but any time-related pattern will be carefully analysed, considering staff changes and other events that may influence the global data.
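For the quantitative side, a data model like the one we describe can be sketched in a few lines. The record shape, thresholds and names below are illustrative assumptions, not our actual model; the point is that keeping the raw thumbs votes lets a topic's colour coexist with the vote split behind it:

```python
from dataclasses import dataclass
from collections import Counter

@dataclass(frozen=True)
class Vote:
    """One anonymised thumbs vote (abstentions are simply not recorded)."""
    team: str
    round: int
    topic: str
    value: str  # "up", "mid" or "down"

def topic_colour(votes):
    """Reduce a topic's votes to a traffic-light colour, keeping the split."""
    counts = Counter(v.value for v in votes)
    total = sum(counts.values())
    # Illustrative thresholds: a majority of thumbs-down is red,
    # more non-positive than positive votes is yellow, otherwise green.
    if counts["down"] / total > 0.5:
        colour = "red"
    elif counts["down"] + counts["mid"] > counts["up"]:
        colour = "yellow"
    else:
        colour = "green"
    return colour, dict(counts)

votes = [
    Vote("Team A", 2, "Releasing", "up"),
    Vote("Team A", 2, "Releasing", "mid"),
    Vote("Team A", 2, "Releasing", "down"),
]
print(topic_colour(votes))
```

Because the raw votes are stored per team, round and topic, the same records can later be grouped by round to look for the cross-team trends mentioned above.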
Are we any better than 3 months ago?
This is naturally the main question people ask after seeing you run several THC sessions and spend hours on the post-analysis. However, the answer is not easy, even when trying to be totally data-driven. Some events make your monitoring model imperfect. Some are visible, such as rotations, departures, arrivals or team splits, but others are hidden and difficult to anticipate: individual feelings, drifting discussions or even intentional votes meant to surface some discomfort.
From my point of view, monitoring remains a secondary benefit of the THC, although it may come in handy at some point in the future. Each snapshot is nevertheless a single piece of value that, along with all their other metrics, should complete the picture of a team's status.
Is this framework useful? Some takeaways.
Sometimes a model like this can be really helpful. Sometimes it’s more like “meh” — people dutifully do the workshops or surveys or whatever, and the data is then ignored. (Henrik Kniberg, 2014)
Although the framework is well perceived by our teams and broadly supported by managers, the benefits for the teams themselves are yet to be proven. I believe teams' engagement with a continuous-improvement philosophy is key to making the THC a success, but we also have to reinforce our facilitator role when it comes to identifying actions for improvement.
Separately, we (QA) will need to ensure that the data analysis (qualitative and quantitative) provides great value in an effective manner. A growing number of squads or teams means an increasing effort on our side, so we will have to keep watching this cost-value ratio.
Despite the latest meh feelings, after running a total of 20 sessions across three iterations, I am still convinced we can make this work at Mercadona Tech and, if we do, we will let you all know.