How do you evaluate MLOps platforms?

Ryan Dawson · Published in MLOps.community · Dec 22, 2021 · 4 min read

Companies that pioneered the application of AI at scale did so using in-house ML platforms (Facebook, Uber, LinkedIn, etc.). These capabilities are now available in off-the-shelf products. The rush to MLOps has led to too much choice. There are hundreds of tools and at least 40 platforms available:

Image by Thoughtworks, from Guide to Evaluating MLOps Platforms

This is a very difficult landscape to navigate. Let’s understand the big challenges, and then we’ll introduce some new free material that aims to address them.

Challenge #1: Overwhelming Choice

Chip Huyen gathered data on the ML tool scene in 2020. Chip found 284 tools, and the number keeps growing. An evaluation doesn’t have to consider all of these tools in detail, as long as it’s possible to narrow them down to the most relevant ones to start with. That’s not easy, though, because narrowing down calls for clear categories, and MLOps categories are not clear…

Challenge #2: Blurred Categories

Typically we get a picture of which software does what by putting products into categories. There have been attempts to do this with the LFAI Landscape diagram, the MAD landscape and GitHub lists. But ML software often does more than one thing and could fit into any of several categories. Platforms are especially hard to categorise because they explicitly aim to do more than one thing.

Because software that does several things is hard to categorise, ML platforms all tend to end up in a single category of ‘platform’. This obscures the emphasis of each platform and loses all the nuance of how different platforms do similar things in different ways.
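
To make this concrete, here’s a minimal sketch of why a single ‘platform’ bucket throws away information that multi-category tagging keeps. The tools and tags are made up for illustration, not drawn from any real landscape:

```python
# Hypothetical tools, each tagged with the categories it covers.
# A platform naturally carries several tags at once.
tools = {
    "ToolA": {"experiment tracking", "model registry"},
    "ToolB": {"serving", "monitoring"},
    "PlatformX": {"experiment tracking", "pipelines", "serving", "monitoring"},
}

# Invert the mapping: which tools claim each category?
by_category = {}
for tool, categories in tools.items():
    for category in categories:
        by_category.setdefault(category, set()).add(tool)

# The platform shows up in several categories at once; collapsing it
# into one "platform" bucket would hide that overlap.
print(by_category["serving"])  # {'ToolB', 'PlatformX'}
```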

Challenge #3: Shifting Landscape

ML categories are hard to keep track of in part because new categories keep appearing. Feature stores, for example, haven’t been around very long but are now a significant category. This affects platforms too, as platforms introduce big new features and shift their emphasis (in part in response to the new tools that appear).

Challenge #4: Complex Problems

ML is complex. It’s also a big field. Platforms do a wide range of things, so evaluating them means understanding regression, classification, NLP, image processing, reinforcement learning, explainability, AutoML and a lot more.

Challenge #5: Range of Roles and Stakeholders

Not only are the problems varied and complex, there’s also a range of roles involved in the ML lifecycle: Data Scientists, ML Engineers, Data Engineers, SREs, Product Managers, Architects, Application Developers, Platform Developers, End Users and more. Different roles have different points of interaction with the software and different needs from it.

Challenge #6: Build vs Buy and other controversies

There’s a lot of discussion of build vs buy trade-offs in the ML space. How much control do organisations need over their stack? How much does that control cost? Build vs buy is often presented as an either-or, but it’s really more of a spectrum. And it’s just one of the confusing controversies in the ML space (consider how controversial AutoML is).

How do we get on top of all this?

Let’s look separately at specialist tools and platforms.

For specialist tools, the mlops.community website has some great profiles that you can drill into. Tools are broken down by category, and the site offers feature lists, introduction videos and a side-by-side comparison feature.

For understanding the platforms landscape and dealing with the trade-offs, Thoughtworks has launched a Guide to Evaluating MLOps Platforms:

https://www.thoughtworks.com/what-we-do/data-and-ai/cd4ml/guide-to-evaluating-mlops-platforms

This is available for free without any sign-up. It addresses how to distinguish MLOps platforms, looking at them through different lenses and offering categories to break them down. For example, here’s a picture from the guide showing the distinction between MLOps platforms and tools:

Image by Thoughtworks, from Guide to Evaluating MLOps Platforms

It’s a spectrum: some all-in-one platforms try to address the whole ML lifecycle, specialist tools address individual parts of the lifecycle, and in between there are many focused platforms that address multiple elements of the lifecycle but not the whole of it. Elsewhere in the guide we look at other questions, such as strategies for low-code and AutoML platforms, the roles and personas involved in MLOps, and how to structure an evaluation to suit the needs of your organisation.
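
To make that last point concrete, here’s a minimal weighted-scoring sketch for structuring an evaluation. This is my own illustration rather than the guide’s method, and the criteria, weights and scores are placeholders to be replaced with your organisation’s priorities:

```python
# Weight each criterion by how much it matters to your organisation
# (hypothetical criteria and weights).
weights = {
    "experiment tracking": 3,
    "deployment/serving": 5,
    "monitoring": 4,
    "AutoML": 1,
}

# Score each platform per criterion on a 0-5 scale (hypothetical values).
scores = {
    "Platform A": {"experiment tracking": 4, "deployment/serving": 3,
                   "monitoring": 2, "AutoML": 5},
    "Platform B": {"experiment tracking": 3, "deployment/serving": 5,
                   "monitoring": 4, "AutoML": 1},
}

# Weighted total per platform.
for platform, per_criterion in scores.items():
    total = sum(weights[c] * s for c, s in per_criterion.items())
    print(f"{platform}: {total}")
```

Notice that changing the weights can flip the ranking, which is why there’s no single ‘best’ platform, only a best fit for your needs.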

Understanding the high-level platforms landscape is just part of an evaluation. We also need to apply this knowledge and compare platforms against each other. For this we’ve released an open-source comparison matrix:

https://github.com/thoughtworks/mlops-platforms

The matrix is structured to highlight how vendors do things in their own ways and to point to more detail in the product documentation. We’ve also included in the repository a series of profiles that describe the product directions of popular platforms concisely and in a marketing-free way.
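
If you want to slice the matrix programmatically, something like the sketch below works, assuming you first export it to CSV. The file name and column names here are hypothetical, not the repository’s actual schema:

```python
import pandas as pd

# Hypothetical: assumes the comparison matrix has been exported to CSV
# with a "Platform" column and one column per capability.
df = pd.read_csv("mlops-platforms.csv")

# Narrow the field: keep only platforms that document a monitoring capability.
candidates = df[df["Monitoring"].notna()]
print(candidates["Platform"].tolist())
```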

We hope you find this material helpful and welcome contributions on GitHub. Feel free to ask any questions on the mlops.community Slack (and tag Ryan Dawson there so that I see it).


Ryan Dawson
Principal Data Consultant at Thoughtworks. Writing about making great software.