A simple strategy for the effective management of machine learning products, from research and development all the way to production systems

Juan Garcia
mm1 consulting
6 min read · Jul 20, 2022


It is no secret that modern enterprises across different industries apply machine learning-based technologies to create significant business value. For example, using machine learning (ML) models to predict order volume in e-commerce helps optimize daily operations, saving millions of euros annually compared to conventional prediction methods. But this article will not tell you about the many opportunities that AI offers for business. We assume you already know the benefits of ML and are looking for ways to professionalize and improve its operations.

Why managing machine learning products all the way from research up to production is a must

We often see a “fire and forget” approach when it comes to ML models: each new version of a model replaces the last, and everything done up to that point is “forgotten”. This approach is sufficient in the short run and with small teams. Yet take, for instance, the above-mentioned use case. During the research and development phase, hundreds of versions of the model are created using different data sources, features, architectures, etc. And even the best ML model today needs to be retrained tomorrow to accommodate data drift and other factors that might compromise its reliability and performance. In other words, hundreds to thousands of versions of an ML model will be trained over its life cycle. As the organization grows, it becomes crucial to professionalize ML development and operations (MLOps). This requires the implementation of an ML model management system to handle all models along the entire life cycle, from research and development to production environments and even the phase-out. In our experience, the lack of such a system leads to four common threats:

Rapid growth leads to inefficiency

The number of data use cases being implemented grows rapidly. This is actually great! However, the number of use cases per data scientist is also growing rapidly. For a while it works fine, as all questions from the operations team can be answered quickly. But very soon you realise that the data scientists are spending too much time answering questions rather than working on new use cases and their specific areas of expertise. The manual effort to find out things about “old” models is just too high, as there is no proper record of a model’s life cycle.
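To make this concrete: with a tracking tool such as MLflow in place, questions about “old” models become queries rather than archaeology. A minimal sketch, assuming runs were tracked against a central server (the URL, experiment ID, metric name and threshold below are hypothetical):

```python
import mlflow

# Point the client at the team's tracking server (hypothetical URL).
mlflow.set_tracking_uri("http://mlflow.example.internal:5000")

# Which past runs of the order-volume model hit a given error target,
# and with which parameters? One query instead of asking around.
runs = mlflow.search_runs(
    experiment_ids=["1"],                 # ID of the (hypothetical) experiment
    filter_string="metrics.mape < 0.1",   # hypothetical metric and threshold
    order_by=["metrics.mape ASC"],
)
print(runs[["run_id", "params.model_type", "metrics.mape"]].head())
```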

High risk of brain drain

Suddenly, part of your data science team leaves. No one can tell what some models do or how they work, as the code is usually not self-explanatory. Thus, any necessary changes to a model require a lot of manual effort. In the worst-case scenario, your team is forced to “re-do” the whole thing, meaning the organization has lost all the know-how created over time.

Unexpected results that no one can explain

A team working on a given model gets some unexpectedly good results. However, no one can tell you why: new data, other data, a different time period, new features, etc. As many people work on the same model, experiment tracking becomes sloppy to the point where no one can tell for sure what happened.
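Disciplined tracking prevents exactly this: if every training run logs its data sources, parameters and metrics, a surprising result can be compared against its predecessors. A minimal sketch with MLflow (experiment name, parameters and values are hypothetical):

```python
import mlflow

mlflow.set_experiment("order-volume-forecast")  # hypothetical experiment name

with mlflow.start_run(run_name="baseline-v2"):
    # Record everything needed to explain the result later.
    mlflow.log_params({
        "data_source": "orders_2022_q1",    # which data was used
        "train_window": "2021-01/2022-03",  # which time period
        "features": "weekday,promo,price",  # which features
        "model_type": "gradient_boosting",
    })
    # ... train the model here ...
    mlflow.log_metric("mape", 0.087)  # illustrative value
```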

Failure to conduct peer review (or audits)

To ensure quality standards and know-how transfer, organizations foster peer reviews before deploying anything into production. However, as there is no centralized model management system, this process requires too much effort and is often omitted in favour of a quick time to market. This, of course, endangers quality and the long-term anchoring of know-how in the organization. And don’t get me started on audits: without a professional model management system, they are practically impossible.

Do not worry if you are not using a professional model management system today. In the next section we will discuss an approach based on best practices for the implementation of such systems.

To what extent does machine learning management make economic sense?

Machine learning management can be understood as the discipline of maximizing business value and increasing the maturity level of ML operations step by step. That is, one does not only take model training into account, but manages the full ML life cycle and increases its degree of automation. To do this effectively, efficiently, and at scale, integrated ML management systems come in handy. However, we have seen a boom in the number of tools and systems on the market, making it difficult for organizations to keep pace. We at mm1 have long-term experience in establishing and improving ML management along both organizational and technical dimensions. If you are interested in earlier data points on this topic, please read this article on managing machine learning cycles.
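What “managing the full ML life cycle” can look like in practice: most management systems offer a model registry in which versions move through stages, from development to production to archive. A sketch using MLflow’s model registry (the run ID and registry name below are placeholders):

```python
import mlflow
from mlflow.tracking import MlflowClient

# Register the model from a finished training run as a new version
# (run ID and registry name are placeholders).
run_id = "abc123"
result = mlflow.register_model(
    model_uri=f"runs:/{run_id}/model",
    name="order-volume-forecast",
)

# Promote the new version; archive whatever was in production before.
# This is the life cycle from development to production to phase-out,
# recorded in one central place.
MlflowClient().transition_model_version_stage(
    name="order-volume-forecast",
    version=result.version,
    stage="Production",
    archive_existing_versions=True,
)
```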

Option space for decision-making and criteria for selecting the right system

As mentioned above, the number of tools and systems is booming as new options enter a highly competitive market, and selecting the right system for a given organization is crucial for its success. Stemming from different industry projects, mm1 knows the rapidly growing option space of these systems in detail. Drawing on project experience, a set of three to five systems is usually interesting in enterprise contexts, depending also on the IT infrastructure. Selecting and implementing a system should be done thoroughly, based on a framework covering seven dimensions.

  1. Business needs: The first step is all about understanding the real business needs of an organization for such a tool/system. These vary widely depending on the level of maturity, the ambition and the strategy. A typical example of a business need is that an ML management system must not significantly change a data scientist’s development process. There are, of course, strategic needs as well, such as future changes in the core IT infrastructure.
  2. User stories & prioritization: Business needs are written up as user stories in greater detail and prioritized accordingly. User stories are written from the perspective of the end user and include further information such as known dependencies and out-of-scope topics.
  3. Defining the option space: Depending on the user stories, a benchmark should be performed, leading to an option space of 3 to 5 suitable tools/systems for detailed analysis.
  4. Proof of concept: Each tool/system is tested with a real ML use case to ensure the promised features are implemented in the expected quality. In the best case, the tools are tested with data and algorithms already in use within the organization. We encourage an iterative process that involves the end user at each step to foster a feedback loop. In this step, we typically expect to reduce the option space to 2 or 3 tools/systems.
  5. Final assessment and decision: The remaining tools are assessed according to their economic value, fulfillment of the user stories, internal IT guidelines, operability, and organizational strategy. Based on this assessment, a decision for the implementation of one tool/system is made.
  6. Technical implementation: The tool/system is integrated into and deployed within the IT infrastructure (this includes the integration architecture, the creation of databases, the deployment of so-called tracking servers, etc.).
  7. Processes and adjustments: For a successful and sustainable use of the tool, it is not sufficient to be technically ready; the right processes must also be in place, both within a team and between teams of data scientists. In some cases this requires adjusting the tool itself.

Learnings from working with W&B and MLflow

In several mm1 projects, MLflow and Weights & Biases (W&B) have proven to be among the most complete tools when it comes to the fulfillment of business needs. MLflow is open-source software with the biggest community on the market, while W&B is, as of today, the most widely used paid independent tool among ML practitioners. Both are great options and starting points for establishing professional ML management, but whether they are the right fit for you is another story. Do you want to know what exactly an ML management system looks like? Just ask us for a demo. We will be happy to show you how a state-of-the-art ML management system could look in production.
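To give a feel for how similar the two APIs are at their core, here is a minimal sketch logging the same training run with each tool (project and metric names are hypothetical; both snippets assume the respective package is installed and configured):

```python
# --- MLflow: log one training run ---
import mlflow

with mlflow.start_run(run_name="demo"):
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("val_mape", 0.09)

# --- Weights & Biases: the same run ---
import wandb

run = wandb.init(project="order-volume-forecast",  # hypothetical project
                 config={"learning_rate": 0.01})
run.log({"val_mape": 0.09})
run.finish()
```

In both cases the payoff is the same: parameters and metrics end up in a central, queryable store instead of a notebook on someone’s laptop.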

About the authors

Dr. Michael Eble is an expert in complex innovation projects with international partners and suppliers in business-critical contexts. His technical expertise in connectivity and data products as well as cloud technologies is complemented by market and product knowledge in several industries. His clients also benefit from his proven conceptual, organizational and communication skills.

Juan Garcia-Sievert is an expert in the field of industrialized machine learning applications and operations. Covering the whole data value chain from infrastructure to life cycle management, Juan has helped organizations conceptualize, build, deploy, and operate data-driven business models and assets.
