Making Agile work in Data Science

DoYoung Kim
Gousto Engineering & Data
Feb 22, 2021

Agile methodology is widely adopted in software engineering, but its strict methods make it hard to apply to data science. The uncertainty in data science means we need more room for creativity, flexibility and research. Yet the benefits of agile methods make them attractive to any team. Over time, our squad has found its own way of incorporating agile into our ways of working.

About the squad (Daikons)

Daikons are a team of 4 data scientists and 1 product manager. We’re responsible for developing data science products within the Menu Tribe at Gousto, one of which is the menu planning algorithm. This is an optimisation algorithm that helps the Food team plan the weekly menu in an automated, data-driven way. It also optimises for metrics such as costs and recipe diversity.

Short sprints

Our 1-week sprints may be one of the most striking aspects of our ways of working. I initially thought this wouldn’t be compatible with DS projects, as they tend to take longer than software engineering tasks, but it works well. In DS, there is a lot of uncertainty, which makes planning for longer sprints very difficult. This short cycle time helps us ensure that:

a) we break down any tasks into small outcomes we want to achieve

b) we quickly adapt to findings we come across, whilst keeping aligned with our longer-term goals and roadmap.

If we’re unsure of exactly how to approach a problem, we often create a ticket with the ‘Spike’ issue type. The outcome of a spike is a set of further tickets in which the problem is better defined. A short sprint with specific tasks helps us decide which tickets to prioritise in the next sprint.

Flexibility

In strict agile methods, nothing should be added to sprints once they’ve been set. We’ve adopted a more flexible approach! We largely adhere to the rules, but we might add more tickets once we’ve completed a spike or change our sprint goals or scope to adapt to new findings. We think this is okay! The key is to stick to the plan but be prepared to change if our requirements change. We don’t get upset about things not being perfect.

Teamwork and Pair Programming

We work collaboratively in the DS team and we share responsibility across projects. If a particular problem is harder to solve, we work in a group and brainstorm possible solutions. This is especially helpful for the abstract problems we tackle every day in Menu. After all, many heads are better than one when the solution to a problem is not obvious. We do a lot of pair programming, and we are able to pick up tasks in any project we’re working on, as we have good visibility over what’s happening in each project. Since many of us joined after lockdown, the increased face time has the added benefit of building stronger relationships.

Shared Responsibilities for Ceremonies

We share the responsibilities for agile ceremonies such as sprint planning, retrospectives, and refinement, rather than having one person lead everything. This holds each of us responsible and drives us to constantly think about improvements in our ways of working rather than depending on someone else.

Morning stand-up with relevant members only

Like many other scrum-based teams, we have a stand-up every morning to discuss what we’ve been up to, and what we would like to work on that day. As we are a small team of data scientists, all the work that’s going on is relevant and any of the tickets can be picked up by anyone. We’ve found this to be very effective. In previous companies, I’ve been in stand-ups that involved the entire tech team. I found that this didn’t work for me as a data scientist as my work was quite removed from the rest of the team. As a result, stand-ups felt very demoralising. In many companies where DS is often a relatively new function, this is something to keep in mind.

Discovery Epics

When we are about to start a new project, we begin with a Discovery phase. This involves brainstorming possible approaches, defining high-level outcomes, splitting the project up into epics and t-shirt sizing each epic to come up with an estimate for the project (see image below). Each epic should have clear acceptance criteria. In software engineering, and even more so in DS, the dependencies, risks, technical approach and estimates for larger projects are often unclear until we do additional research through spikes and workshops. We therefore have a separate ‘Discovery Epic’ per project to help us solidify these at the start. Its outcome is a set of refined epics that are ready for development. Having clearly defined epics also helps us set timelines and avoid the pitfall of never-ending analysis.

An example of t-shirt sizing, a way to give relative estimates to stories/epics. It enables us to think of the relative complexity without getting bogged down by providing numerical estimates.
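
As a rough illustration of how t-shirt sizes can be rolled up into a project-level estimate, here is a minimal Python sketch. The epic names, the size scale and the relative weights are all hypothetical and not taken from our actual boards; the weights only express relative complexity, not days or story points.

```python
from collections import Counter

# Hypothetical relative weights per t-shirt size (an assumption for illustration).
SIZE_WEIGHTS = {"XS": 1, "S": 2, "M": 3, "L": 5, "XL": 8}


def summarise_epics(epics):
    """Return (count of epics per size, total relative weight) for a dict of epic -> size."""
    unknown = {name: size for name, size in epics.items() if size not in SIZE_WEIGHTS}
    if unknown:
        raise ValueError(f"Unknown t-shirt sizes: {unknown}")
    counts = Counter(epics.values())
    total = sum(SIZE_WEIGHTS[size] for size in epics.values())
    return counts, total


# Hypothetical epics for an imaginary project.
epics = {
    "Discovery": "S",
    "Data pipeline": "M",
    "Model prototype": "L",
    "Evaluation": "M",
}

counts, total = summarise_epics(epics)
print(dict(counts), total)
```

The total is only useful for comparing projects sized on the same scale, which keeps the discussion on relative complexity rather than precise numbers.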

The Future

As the team grows, we’ll find that we have software engineering needs when creating and productionising well-polished data products. This means we’ll soon have software engineers and designers embedded in the squad, and we’ll be responsible for delivering end-to-end data projects. When we do, we’ll have to work out how to adapt our ways of working to suit this new team, for example by ensuring stand-ups stay relevant for everyone whilst keeping everyone aligned. Keep an eye out for future updates as we grow and adapt!
