How and Why to Use Agile for Machine Learning

Patrick Dougherty · Slalom Insights · May 16, 2019

One question our clients frequently ask is how to apply structure to their data science and machine learning initiatives to avoid taking up permanent residency in the “Trough of Disillusionment”.

Machine learning investments often take on the appearance of an academic research project, with flexible deadlines and open-ended budgets. “Failing slow” is the norm, and projects are prioritized not by potential business impact but by team preference.

Enter agile and scrum. Created to solve similar issues in software development, their principles remain relevant, but can they be applied to machine learning? In our experience, yes, and the results speak for themselves. Millions of words have been written about both frameworks, so we’ll stick to a high-level overview here. First, let’s see how some of the key concepts of scrum adapt to a machine learning initiative.

Product: The product is the machine learning model and any associated integrations of that model into downstream systems. The product is defined by the customer need it is being developed for and hinges on two components:

  1. Business Outcome: How will the model output be used? Is it a real-time consumption use case where another application will call the model endpoint via REST API? Or are batch predictions made on a regular interval (daily, weekly, monthly) sufficient? (See the sketch after this list for both modes.)
  2. Data Availability: What data attributes are available as inputs to the model? Do they have enough detail to be useful (e.g., if you are building a model to predict daily sales, do you have data that updates at least daily)? Are the source systems for the data compatible with your Business Outcome?
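To make the two consumption modes concrete, here is a minimal sketch in Python, assuming a trained model serialized to a hypothetical model.pkl and purely illustrative input files; Flask is used for the real-time endpoint only as one possible choice.

```python
import pickle

import pandas as pd
from flask import Flask, jsonify, request

# Load the trained model once at startup (hypothetical artifact path).
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    """Real-time consumption: another application calls this endpoint per request."""
    features = pd.DataFrame([request.get_json()])
    return jsonify({"prediction": float(model.predict(features)[0])})

def batch_score(input_path, output_path):
    """Batch consumption: score a full extract on a schedule (daily, weekly, monthly)."""
    df = pd.read_csv(input_path)           # hypothetical extract of new records
    df["prediction"] = model.predict(df)
    df.to_csv(output_path, index=False)    # downstream systems pick up the scored file
```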

Product Backlog: The Business Outcome and Data Availability needs form the user stories for a machine learning product. Data Availability can be further segmented into data integration (gaining access to raw data sources) and feature engineering (transforming raw data into useful predictors).
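As a rough illustration of how those two story types differ, here is a sketch using pandas with a hypothetical raw transactions extract: the data integration story lands the raw source, and the feature engineering story turns it into daily predictors.

```python
import pandas as pd

# Data integration story: gain access to the raw source and land it as-is.
raw = pd.read_csv("transactions.csv", parse_dates=["order_ts"])  # hypothetical extract

# Feature engineering story: transform raw records into useful predictors.
daily = (
    raw.assign(order_date=raw["order_ts"].dt.date)
       .groupby("order_date")
       .agg(
           daily_revenue=("amount", "sum"),
           order_count=("order_id", "count"),
           avg_order_value=("amount", "mean"),
       )
       .reset_index()
)

# Lagged and rolling features give the model context from prior days.
daily["revenue_lag_1"] = daily["daily_revenue"].shift(1)
daily["revenue_7d_avg"] = daily["daily_revenue"].rolling(7).mean()
```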

Sprint: A sprint is the key unit of iterative development in the machine learning process. It is typically 2–4 weeks long and encompasses all activities shown in Figure 1 below. In a typical sprint, the data scientist accesses raw data from a new or existing data source, performs feature ideation and engineering, and then re-trains and tests the machine learning model before deploying it to an endpoint or performing batch scoring. Model accuracy at the end of the sprint matters less than the improvement in accuracy relative to the previous sprint.

Figure 1: Structure of a machine learning sprint
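In code, one sprint’s modeling loop might look like the sketch below, assuming hypothetical per-sprint feature snapshots and a binary target column named target; what matters is the comparison against the previous sprint’s accuracy, not the absolute number.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def run_sprint(feature_file, previous_accuracy=None):
    """Re-train and test the model on this sprint's expanded feature set."""
    df = pd.read_csv(feature_file)                       # hypothetical sprint snapshot
    X, y = df.drop(columns=["target"]), df["target"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    model = GradientBoostingClassifier().fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    # The sprint is judged on the improvement over the previous sprint.
    if previous_accuracy is not None:
        print(f"Accuracy {accuracy:.3f} ({accuracy - previous_accuracy:+.3f} vs. last sprint)")
    return accuracy

acc_1 = run_sprint("features_sprint_1.csv")
acc_2 = run_sprint("features_sprint_2.csv", previous_accuracy=acc_1)
```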

Sprint 0: Sprint 0, or the first sprint, follows a slightly different structure as it focuses on laying the foundation for future sprints. Sprint 0 is when the data scientist creates the feature engineering and model training pipeline. This is significantly accelerated if the data scientist has access to a “machine learning workbench” of some kind, where access to data sources and the software and packages for model development already exist. Sprint 0 is also when the data scientist begins selecting the candidate algorithms most relevant to the data types and the business stakeholders.
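A Sprint 0 deliverable might be nothing more than a reusable preprocessing pipeline plus a quick cross-validated comparison of a few candidate algorithms. The sketch below assumes a scikit-learn-style workbench and hypothetical column names; it is one possible shape, not a prescription.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("sprint_0_features.csv")            # hypothetical Sprint 0 snapshot
X, y = df.drop(columns=["target"]), df["target"]

# Preprocessing that every later sprint can reuse (hypothetical column names).
preprocess = ColumnTransformer([
    ("numeric", StandardScaler(), ["daily_revenue", "order_count"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["region"]),
])

# Candidate algorithms chosen for the data types and the Business Outcome.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200),
}

for name, estimator in candidates.items():
    pipeline = Pipeline([("prep", preprocess), ("model", estimator)])
    scores = cross_val_score(pipeline, X, y, cv=5)
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")
```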

Sprint Review: In the sprint review meeting, the team shares an update with product stakeholders. Topics include recently engineered features, an update on model accuracy, and progress made toward the Business Outcome. This is also a great chance to review an updated feature importance ranking (which shows which inputs are most helpful in predicting the target variable) to begin building trust in the product.
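The importance ranking can come directly from the trained model. A minimal sketch, assuming a fitted tree-based scikit-learn model and a feature DataFrame X like the ones above:

```python
import pandas as pd

# Rank features by their contribution to the model's predictions.
# (feature_importances_ exists on tree-based scikit-learn estimators.)
importance = (
    pd.Series(model.feature_importances_, index=X.columns)
      .sort_values(ascending=False)
)
print(importance.head(10))  # top predictors to walk stakeholders through
```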

Backlog Refinement: Before each sprint, the team re-prioritizes the backlog based on findings from the most recent sprint. This is an opportunity to gather input from cross-functional stakeholders on the merits of candidate data sources and the obstacles to integrating them. These sessions democratize machine learning concepts and are not overly technical; the ideal outcome is participation from a broad audience and a set of hypotheses to test in the coming sprints.

Team Structure: The key roles to build an Agile ML product are Executive Sponsor, Product Owner, Data Subject Matter Expert, Data Engineer, Machine Learning Engineer, and Data Scientist. This core team can be augmented with Software Engineers, DevOps Engineers, and other roles as needed based on product definition.

Results

As with any framework, the concept of Agile ML is only as useful as the outcomes it creates. For one client in particular, a proof of concept for anomaly detection delivered via this framework was highly successful for three key reasons:

  1. Agile ML encouraged non-technical stakeholder involvement by introducing visibility into a process that can often feel like a black box. Project sponsors were able to participate in the feature ideation process by sharing their hypotheses of which input data would correlate with the outcome variable (e.g., average daily temperature as a predictor of daily sales revenue) and then re-calibrate each sprint in the backlog refinement sessions.
  2. Agile ML kept the data science team focused on outcomes by treating the model as a product and iteratively adding features. The build-and-deploy sprint structure integrates naturally with the continuous delivery DevOps mindset that characterizes best-in-class data science teams. By establishing our candidate models in Sprint 0, we avoided chasing accuracy through increased model complexity and instead prioritized data sources and features.
  3. Agile ML reduced the time-to-value to weeks instead of months. Because we tracked how the machine learning model iteratively improved over each sprint (Figure 2), we were able to make a more informed decision about when to stop model development and move on to the next initiative rather than chasing a perfect model with an open-ended timeline.

Figure 2: Model classification accuracy at the end of each sprint

Slalom successfully delivered a proof of concept model that enabled business stakeholders to predict in advance, and therefore mitigate, anomalies. Although in this case we did find meaningful predictors, we set the expectation that failure was possible and were able to time-box the potential failure using the sprint structure. If we had not achieved significant improvement in model accuracy between Sprint 1 and 2, and then again between Sprint 2 and 3, we would have stopped development on this product and moved to the next.
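That time-boxing logic can be made explicit: record accuracy at the end of each sprint and stop when the improvement falls below a threshold agreed with stakeholders. A sketch with illustrative numbers rather than the client's actual results:

```python
# Accuracy recorded at the end of each sprint (illustrative values only).
sprint_accuracy = [0.71, 0.79, 0.84, 0.85]

MIN_IMPROVEMENT = 0.02  # threshold agreed with stakeholders up front

for sprint, (prev, curr) in enumerate(zip(sprint_accuracy, sprint_accuracy[1:]), start=2):
    improvement = curr - prev
    print(f"Sprint {sprint}: accuracy {curr:.2f} ({improvement:+.2f} vs. previous sprint)")
    if improvement < MIN_IMPROVEMENT:
        print("Improvement below threshold; stop development and move to the next initiative.")
        break
```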

Next Steps

So, what’s next? How do you put some structure around your machine learning product development efforts but avoid wasting time and angering your team of well-compensated PhDs?

Working with Slalom can be a great start, but not in the way you may be thinking. We’ve found that the best way to adopt Agile ML is through a partnership model, where we work together to develop a machine learning minimum viable product (MVP). Ideally, Slalom fills some of the team roles and you fill the others. Your team ramps up on the Agile ML structure and personalizes it to their pre-existing team dynamics. Together we create a meaningful return on investment with an MVP that (for example) detects fraud, predicts sales, or optimizes marketing spend.

Patrick Dougherty is a Solution Principal of Data and Analytics in Slalom’s new Charlotte office. You can reach him via e-mail at Patrick.Dougherty@slalom.com.
