Can you develop a machine learning product without agile?

Going against the grain as a product owner

Aaron Croft
Humans of Xero
5 min read · Jun 26, 2020


Photo by Kevin Ku on Unsplash

The role of product owner (PO) was born out of agile methodology. So as a PO, I never thought twice about setting up Scrum and Kanban, and leading the agile rituals for teams. But when I was tapped on the shoulder to join Xero as a senior PO, I joined a young team of data scientists and machine learning (ML) engineers who hadn’t used an agile framework before.

And that made me wonder: did we really need to follow an agile framework to get the best outcomes for our users?

Questioning my approach

My new team was running a technology project that built artificial intelligence (AI) services for other Xero teams. In other words: really exciting stuff. Everyone was working furiously towards a beta that we had earmarked for a few months’ time, and I was ready to dive in with my beloved agile rituals. I spent the next few weeks observing the team, looking for evidence that their lack of traditional agile methods was somehow destructive. But what I found when I looked at the outcomes really surprised me.

This was a team that consistently hit their deadlines, and seemed to have a collective understanding of customer value and what it takes to deliver it. They guarded each other from going down rabbit holes and were one of the most productive teams I’d worked with. This team didn’t use any semblance of agile, but when I looked at outcomes, they were crushing it. How was this possible? And how could I justify introducing an agile approach when there didn’t seem to be any grounds to do so?

Picking my agile battles

That’s when I fell into an introspective hole. What is agile? Why Scrum? What outcomes did I want to achieve? Why did I want to introduce agile to this team, other than my opinion that it was new and cool and a core part of my role as a PO? Instead of throwing new agile methodologies at the team for the sake of it, I focused on answering a few important questions:

  • Was there a clear direction?
  • Were we solving real customer problems?
  • Was there a clear connection between each piece of work and the overall direction?

I armed myself with all the user research I could find, to understand what was important to users and how our service fit in with their needs. Fortunately, Xero has a fantastic centralised repository of user research studies available to all staff, so the information was readily available. With the research in mind, I defined the metrics that best represented our value to the user. As it happened, these mirrored the metrics that the team already obsessed over:

  1. Latency: our service needed to respond in X seconds 95% of the time
  2. Coverage: we needed to be confident on Y% of our predictions
  3. Accuracy: we needed to maintain Z% accuracy for all confident predictions
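To make the three metrics concrete, here’s a minimal sketch of how you might compute them from prediction logs. This is purely illustrative, not Xero’s actual measurement pipeline: the sample records, the 0.85 confidence cut-off, and the percentile arithmetic are hypothetical stand-ins for the X, Y, and Z targets above.

```python
# Each record is one prediction: (latency_seconds, confidence, was_correct).
records = [
    (0.8, 0.97, True),
    (1.1, 0.60, False),
    (0.9, 0.92, True),
    (2.5, 0.95, False),
    (0.7, 0.88, True),
]

CONFIDENT = 0.85  # hypothetical confidence cut-off (the article's Y threshold)

# Latency: the response time at the 95th percentile.
latencies = sorted(r[0] for r in records)
p95_latency = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]

# Coverage: the fraction of predictions we are confident about.
confident = [r for r in records if r[1] >= CONFIDENT]
coverage = len(confident) / len(records)

# Accuracy: correctness measured only over the confident predictions.
accuracy = sum(1 for r in confident if r[2]) / len(confident)

print(f"p95 latency: {p95_latency:.1f}s, "
      f"coverage: {coverage:.0%}, accuracy: {accuracy:.0%}")
```

Note the interaction between the last two metrics: accuracy is only measured on the predictions you’re confident enough to return, which is why coverage and accuracy have to be tracked as a pair.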

Then I set about aligning every new piece of work to these metrics. As a weapon in my crusade, I chose the Product Kata*, and the first victim would be our arbitrary beta.

* As a side note, if you haven’t used a Product Kata before, check it out. It’s a really great way to track your progress in what can be an otherwise fuzzy, unstructured experimentation phase. The artefact produced by the end is really amazing for communicating your progress and journey.

Switching to user-centric outcomes

Seeing as our beta deadline was self-imposed, the team came together and decided to change it from a point in time when we wanted a bunch of features built, to a time when we achieved a set of user-centric outcomes. Specifically: latency, coverage, and accuracy. We started by focusing on latency. To measure our end-to-end latency, we sprinted towards an API that returned anything at all, and found we were already twice as fast as our target. However, after a premature celebration, we also discovered that measuring straightforward metrics like latency isn’t always straightforward.

What we were measuring was throughput, and the problem was that this measure changed depending on how much we scaled out our pipeline. When we started recording the different classes of input and their combinations with the infrastructure parameters, it quickly became a massive headache. So what did we measure? Throughput or latency? Both? After some to-ing and fro-ing we decided to consult the ultimate authority — the customer. With the customer front of mind, our experimental conditions to achieve our latency target became:

  • end-to-end response time
  • under production load volumes
  • and with inputs as diverse as those we would see in production
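Those conditions can be approximated with a small load-test harness. This is a rough sketch under stated assumptions: `call_service` is a hypothetical stand-in for the real API (in practice you would fire actual HTTP requests at the beta endpoint), and the input-size mix is an invented proxy for production diversity.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def call_service(payload):
    # Hypothetical stand-in for the real API call; latency grows with input size.
    time.sleep(0.01 + 0.02 * len(payload) / 1000)
    return "ok"

def sample_production_input():
    # Mirror production diversity: a mix of small, typical, and large inputs.
    return "x" * random.choice([10, 100, 1000])

def measure_p95(n_requests=200, concurrency=20):
    """Measure end-to-end p95 response time under concurrent load."""
    def timed_call():
        payload = sample_production_input()
        start = time.perf_counter()
        call_service(payload)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed_call) for _ in range(n_requests)]
        latencies = sorted(f.result() for f in futures)
    return latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]

p95 = measure_p95()
print(f"p95 end-to-end latency: {p95:.3f}s")
```

The key design choice is that timing wraps the whole request from the caller’s side, so the number reflects what a customer actually experiences rather than internal pipeline throughput.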

Connecting the dots

Within a few weeks we had a functioning service with an acceptable latency. The next milestones to fall would be our ambitious accuracy and coverage targets. We approached these in the same way as our latency metric, with the Product Kata. We ruthlessly prioritised tickets that increased these metrics, deprioritised those that didn’t, and we stopped as soon as we reached our targets.

Before too long, our service was achieving the outcomes we had defined for our beta, two weeks before the arbitrary deadline. An honest retro revealed the team really liked the direction and the structure, although we noted that it got a bit complicated agreeing on the experimental conditions. The question was, would we have still made the deadline without the ‘fancy approach’ (my manager’s words)? It was impossible to say.

After thinking about it, we decided it was all about risk. We could have made our arbitrary deadline without a Product Kata, but by linking our deliverables directly to the expected outcomes, there was a much higher chance that our solution would solve our users’ problems. The risk when you work towards a list of features is that you might do things that don’t directly contribute to value. And if you don’t measure your outcomes, you might launch with a false start.

Going back to the fundamentals

I was initially brought into the team because of my experience with agile, but I ended up learning that sometimes the best thing you can do is not dive into an agile framework. I learned to be deliberate about change — because change is hard and changing things for the sake of it is foolish.

If that means I’m a PO who doesn’t practise within an agile framework like Scrum or Kanban, then that’s ok, because my job is to help the team bring as much value to users as possible. Everything else is a tool to get us there.

Focusing on outcomes is a fundamental principle in product management, but it took a new job and a great team to teach me that this principle transcends agile frameworks. As a result, we’ve made huge strides forward and now prioritise work based on customer value. We have short-term, tangible goals to shoot for and a really cool process (the Kata) to manage our work experiments.

Of course, my work isn’t done and it’s only the start of my product journey at Xero. In fact, I’m already starting to think about the next big step. But that’s a post for another day. In the meantime, I’d love to know how you would have approached the situation. Do you have any lessons to share? I’d love to learn from your experience.
