What does the future of MLOps have in store?

Chiara Gambarini
Inkef

--

Long gone are the days when MLOps was regarded as a castle-in-the-air concept. The hype around MLOps has turned into inevitable reality, and organizations are still heavily investing in the search for tools that enable them to sustainably put their AI capabilities into action. And here at Inkef, we share the excitement for innovative solutions in the MLOps space!

Much has been written about MLOps and its promise to revolutionize how enterprises leverage AI, but let’s rewind the clock a few ticks and quickly recap.

The rapid advancement of computing power and cloud-based infrastructure, combined with the increased availability of data and open-source toolkits, has driven the adoption of AI/ML across businesses of all sizes at a dramatic pace. In the past decade, we witnessed an explosion of AI applications, from supply chain optimization tools to fraud detection systems and ML-driven drug discovery and production. However, while organizations realized that AI is not a “nice to have” but rather a key ingredient for success, they also faced the reality that operationalizing and scaling machine learning to drive business value is no easy feat. Thus, with the wider adoption of AI/ML across businesses, implementing structured processes for developing, deploying and maintaining these solutions in a reliable and efficient manner became imperative.

MLOps entered the chat

There are a lot of pieces to the MLOps puzzle, from managing datasets to training and monitoring models to implementing scalable processes that can be repeated across an organization. MLOps represents the set of best practices that makes sure these pieces fit well together, allowing companies to effectively manage all the domains of the ML lifecycle.

Since its origins back in 2015, the concept of MLOps has been in the spotlight. Despite the numerous resources invested and the rapid growth achieved, MLOps is still a relatively new concept in the AI world, and there is still ample room for improvement. At the end of the day, the data speaks for itself: about 1 out of 2 AI projects never makes it into production, and for those that do, it takes 7 months on average to get a model to that stage, and an additional 7 months for it to start delivering business value. While there are multiple reasons behind these failures, it’s clear that several challenges still need to be addressed. Clearly the industry is still booming, and closing the gaps in MLOps represents a lucrative market opportunity, valued at USD 6B by 2028.

The market has been consistently growing at a strong pace, and the landscape of MLOps tools has become richer and increasingly complex. A lot of players have entered the arena, from big cloud providers to niche and specialized tools, competing in a global market and promising a better way to productize ML. To help navigate the space, we can distinguish between three main categories of players:

  1. Cloud-native stacks offered by the big cloud providers; think Amazon, Google or Microsoft. These players offer a nearly complete solution that covers most of the developer experience. One great benefit of these solutions is that they simplify integration challenges through a unified experience. However, the smoothness of these end-to-end experiences doesn’t always translate well to an on-premises ecosystem or a multi-cloud strategy.
  2. Platforms that offer an integrated, context-agnostic stack, such as DataRobot or MLflow. These players aggressively landed a strong market position by addressing some of the weaknesses of the cloud-native propositions and have since expanded, adding more domains of the model development lifecycle to their core proposition and now offering an almost complete experience.
  3. Specialized tools that provide best-of-breed, modular software that can be easily integrated into any workflow, such as BentoML for model deployment or Neptune.ai for experimentation and model registry. While adopting these tools might mean patching bits and pieces together, it also allows organizations to tailor and customize their ecosystem and adopt best-in-class solutions across the whole ML development lifecycle.

The bottom line is that although the competition is fierce and we foresee consolidation in the space, with larger players acquiring smaller ones, there is currently no single technology stack that outperforms the rest by offering a complete solution. As such, we believe there are still pockets of opportunity for specialized players to land a strong position in the market and bring about fundamental change. In this article we elaborate on one of the MLOps domains (more to follow in future articles!) that makes us excited and in which we are keen to explore future investment opportunities: model observability.

But before jumping straight into it, we want to highlight three main trends that we believe further stress how important it is for organizations to up their observability game:

  1. High-speed production — deploying AI has become highly competitive: the faster you get to market and the better the solution you provide, the easier it is to survive in a fast-paced and hypercompetitive environment where AI capabilities have become a clear differentiating factor.
  2. Need to reduce costs — putting AI into production can be very costly. In addition, models deteriorate quickly, and performance drops and drifts are to be expected, e.g. with new data coming into the pipeline. Thus, it is imperative to effectively monitor and maintain existing models in production.
  3. External pressures for transparency and compliance — AI solutions need to be not only accurate but also explainable, and explainability efforts continue after the model is built. It should always be clear how a model is behaving and why (see the sketch after this list).
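To make that last point concrete, here is a minimal explainability sketch using the open-source shap library, which attributes each prediction to the input features. The model and dataset are illustrative stand-ins, not taken from any of the vendors discussed in this article:

```python
# Minimal post-hoc explainability sketch with SHAP.
# Model and dataset are illustrative stand-ins only.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.Explainer(model)      # dispatches to TreeExplainer for tree models
explanation = explainer(X.iloc[:200])  # per-prediction feature attributions
shap.plots.beeswarm(explanation)       # which features drive predictions, and how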

Model Observability

With more and more models being deployed into production, we believe model observability to be paramount to successfully maintaining, scaling and operationalizing ML solutions. It comes as no surprise that building models in a research environment and deploying them in the real world are two considerably different exercises. Once put into production, models tend to degrade and performance worsens due to common challenges such as model or data drift, distribution changes, data quality issues and bias. Observability tools allow engineering and data science teams to better understand, monitor, troubleshoot and explain ML models across the different stages of the development lifecycle: from training to deployment and while in production. These tools not only answer questions about what went wrong and when, but also provide insights into the how and why, enabling tech teams not only to detect and resolve issues faster and more efficiently, but to fully comprehend and explain their models’ behavior and predictions.
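As an illustration of the kind of check these tools automate, here is a minimal drift-detection sketch using a two-sample Kolmogorov-Smirnov test from SciPy. The feature values and significance threshold are made up for the example and are not tied to any specific product:

```python
# Minimal data-drift check: compare a feature's training distribution
# against live production traffic. Data below is simulated for illustration.
import numpy as np
from scipy.stats import ks_2samp

def detect_feature_drift(train_values, prod_values, alpha=0.05):
    """Two-sample KS test: a small p-value suggests the production data
    no longer looks like the data the model was trained on."""
    statistic, p_value = ks_2samp(train_values, prod_values)
    return {"ks_statistic": statistic, "p_value": p_value, "drift": p_value < alpha}

rng = np.random.default_rng(seed=42)
train = rng.normal(loc=0.0, scale=1.0, size=10_000)  # training snapshot
prod = rng.normal(loc=0.5, scale=1.0, size=10_000)   # shifted live traffic

report = detect_feature_drift(train, prod)
if report["drift"]:
    print(f"Drift detected (KS={report['ks_statistic']:.3f}, p={report['p_value']:.2e})")
```

In practice, platforms in this space run comparable checks continuously, across every feature and prediction, and alert the team when a distribution shifts.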

To quote Aparna Dhinakaran, Co-Founder and CPO at Arize AI:

“Observability is the key difference between a team that flies blind after deploying a model and a team that can iterate and improve their models quickly.”

The watchlist includes: Aporia, Arize, Deepchecks, Superwise

Are you a founder, investor, operator, or anyone else who is as excited as we are about the future of MLOps? Look out for future blog posts on the space, and do reach out to chiara[at]inkef.com to have a chat.
