ML platform — How can you assess the distinct capabilities of platforms? (Part 2)

Bhaumik Pandya
6 min read · Dec 8, 2021


1. How do you know which platform best fits your organizational requirements?

As illustrated in the previous article, finding the ML platform that best meets your organizational requirements calls for a framework that helps assess the capabilities of ML platforms. Hence, we have come up with an assessment framework that encompasses the end-to-end ML lifecycle. To understand this framework, let’s first define the functional capabilities an ML platform should cover to enable continuous development, integration, deployment, and monitoring of an ML solution.

2. What are the functional capabilities you need to consider when assessing an ML platform?

To quote Ron Schmelzer: “There’s no such thing as the Machine Learning Platform”. In a world where lines are drawn through an already complicated software and service landscape, functional capabilities need to be defined to set a baseline for what is necessary to bring an ML solution to production. The result is a capability map that covers the end-to-end ML lifecycle and comprises five functional areas: Data Ingestion & Storage, Experimentation Zone, Continuous Integration, Industrialization Zone, and Data Presentation.

An ML platform should ideally cover these functional areas and components to enable an end-to-end ML lifecycle

Each functional area consists of components that fulfill the distinct requirements imposed on that area. Projects and use-case implementations usually start by ingesting the collected data (Data Ingestion & Storage), followed by experimenting with and developing solution algorithms (Experimentation Zone). ML models created in the process are then tested, integrated, and deployed to production (Continuous Integration). Those models are subsequently validated and continuously monitored (Industrialization Zone). The utilization of AI applications and ML solutions is made possible through different types of endpoints, such as REST APIs, batch services, and dashboards (Data Presentation). Let us take a closer look at each functional area.

I) Data Ingestion & Storage

Data is a first-class citizen in ML, and the availability of high-quality data for training and serving models is a fundamental capability of any ML platform. The data required for model training is provisioned through a data pipeline. An ML platform should ideally provide various connectors so that different data sources (batch and streaming, structured and unstructured) can be reached via the data pipeline. It should also provide functionalities for data quality testing, data transformation, and data versioning as part of data ingestion and storage.
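To make the last point concrete, here is a minimal, hypothetical sketch of an ingestion step (the function and column names are made up for illustration, not a real platform API): a batch of CSV data passes a simple quality check, and a content hash serves as the dataset version.

```python
import csv
import hashlib
import io

def ingest(raw_csv: str, required_columns: set) -> tuple:
    """Read a batch source, run a quality check, and version the snapshot."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    # Data-quality test: every required column must be present and non-empty.
    for row in rows:
        missing = [c for c in required_columns if not row.get(c)]
        if missing:
            raise ValueError(f"quality check failed, missing: {missing}")
    # Data versioning: a content hash identifies this exact data snapshot.
    version = hashlib.sha256(raw_csv.encode()).hexdigest()[:12]
    return rows, version

rows, version = ingest("id,amount\n1,10.5\n2,3.2\n", {"id", "amount"})
print(len(rows), version)
```

In a real pipeline, the same idea scales up: connectors pull from the source systems, quality rules gate the data, and the version identifier ties every training run back to the exact data it saw.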

II) Experimentation Zone

The development of ML models is an iterative process in which the first step is to experiment with different algorithms and configurations (e.g., different hyperparameters) so that the best model result can be achieved. To make this process traceable and, where needed, automated, functionalities to compare, share, reuse, and collaborate on model development must be supported. An ML platform should therefore provide an experiment management component that centrally manages this experimental development and stores, tracks, and evaluates model versions along with their associated artifacts and metadata. Supporting visualization options (e.g., for metrics) are advantageous here.
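What such an experiment management component does at its core can be sketched in a few lines. This is a hypothetical toy tracker, not the API of any real tool (MLflow, for example, offers this functionality in production-grade form): it records runs with their hyperparameters and metrics so they can later be compared.

```python
import time

class ExperimentTracker:
    """Toy experiment tracker: stores runs with params and metrics."""

    def __init__(self):
        self.runs = []

    def log_run(self, params: dict, metrics: dict) -> int:
        run_id = len(self.runs)
        self.runs.append({"id": run_id, "params": params,
                          "metrics": metrics, "ts": time.time()})
        return run_id

    def best_run(self, metric: str) -> dict:
        # Compare all tracked runs on a single metric.
        return max(self.runs, key=lambda r: r["metrics"][metric])

tracker = ExperimentTracker()
tracker.log_run({"lr": 0.1, "depth": 3}, {"f1": 0.81})
tracker.log_run({"lr": 0.01, "depth": 5}, {"f1": 0.87})
print(tracker.best_run("f1")["params"])
```

The value of central tracking is exactly this query: months later, anyone on the team can ask which configuration produced the best score and reproduce it.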

III) Continuous Integration

The training of models is usually carried out in programming languages such as Python (or sometimes R) and usually takes place in a different environment than the one in which the models are later used in production. This fact brings two challenges: on the one hand, it must be possible to persist the models; on the other hand, the portability of the models between different environments must be guaranteed. This calls for a uniform model format such as ONNX or PMML for storing models (including artifacts) and their dependencies. It must also be possible to link the stored models to the parameters and metrics of their training runs. These functionalities are often covered by a model store including a model registry. Just as relevant is a feature store, which facilitates the storage, reuse, and provision of features both during model training and during model serving (e.g., the generation of predictions in the production environment). Feature store and model store are crucial for the continuous automation, integration, and provision of models across the entire ML lifecycle, but they do not replace the entire ML CI/CD process. It is therefore important to check that the ML platform supports building pipelines, automating processes, and packaging as well as deploying models to a suitable environment. A crucial factor here is the ability to integrate external tools for orchestrating and automating CI/CD workflows.
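The link between persisted model artifacts and their training runs can be illustrated with a minimal sketch. All names here are hypothetical; a real registry would store ONNX or PMML files in object storage rather than bytes in memory, but the bookkeeping is the same.

```python
import hashlib

class ModelRegistry:
    """Toy model store plus registry: artifacts linked to run metadata."""

    def __init__(self):
        self.artifacts = {}   # content hash -> serialized model bytes
        self.registry = []    # versioned metadata entries

    def register(self, model_bytes: bytes, params: dict, metrics: dict) -> str:
        # Persist the artifact under a content-addressed identifier...
        artifact_id = hashlib.sha256(model_bytes).hexdigest()[:12]
        self.artifacts[artifact_id] = model_bytes
        # ...and record which training run (params, metrics) produced it.
        self.registry.append({"artifact": artifact_id,
                              "version": len(self.registry) + 1,
                              "params": params, "metrics": metrics})
        return artifact_id

    def latest(self) -> dict:
        return self.registry[-1]

reg = ModelRegistry()
reg.register(b"serialized-onnx-payload", {"lr": 0.01}, {"auc": 0.91})
print(reg.latest()["version"], reg.latest()["metrics"])
```

Because every registry entry carries the run's parameters and metrics, promoting "version 3 to production" is an auditable decision rather than a guess about which file on disk is the right one.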

IV) Industrialization Zone

For the long-term success of an ML solution, it is important that the performance of ML models in production does not decrease over time and that each model is at least monitored and validated against a baseline model. Ideally, models should be continuously retrained. An ML platform should therefore provide model monitoring to track relevant model KPIs and to trigger automatic model retraining via a predefined schedule or via certain triggers; ideally, the monitored KPIs themselves act as the retraining trigger. In addition, it should be checked whether the ML platform either provides its own visualization functionality or supports a simple connection to monitoring tools (e.g., Prometheus, Grafana).
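A KPI-based retraining trigger can be as simple as the following sketch (threshold logic and names are illustrative, not a specific platform's API): the live model's score is compared against the baseline, and a drop beyond a tolerated margin triggers retraining.

```python
def should_retrain(baseline_score: float, live_score: float,
                   tolerance: float = 0.05) -> bool:
    """Trigger retraining when the monitored KPI falls more than
    `tolerance` below the baseline model's score."""
    return live_score < baseline_score - tolerance

# A scheduled monitoring job would compute the live KPI from recent
# predictions and feed it into this check.
print(should_retrain(0.90, 0.82))  # degraded well beyond tolerance
print(should_retrain(0.90, 0.88))  # still within tolerance
```

In practice the trigger would feed a retraining pipeline rather than a print statement, and the tolerance itself is a business decision: how much degradation is acceptable before the cost of retraining is justified.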

V) Data Presentation

To finally serve the results or predictions of a model to consuming applications, an ML platform should provide model serving functionalities. This includes topics such as model orchestration and the testing of models (e.g., A/B tests). Another essential functionality for data presentation is model insights, i.e., the ability to share findings from the model training phase with different stakeholders. Visualization options in the ML platform, e.g., in the form of a dashboard or an integration with visualization tools such as Tableau or Power BI, are very helpful.
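The A/B-testing idea mentioned above can be sketched as a tiny, hypothetical traffic router (names and splitting scheme are our illustration, not a platform feature): each request is deterministically assigned to a model variant by hashing its user id, so a fixed share of traffic reaches the challenger model while each user always sees the same variant.

```python
import hashlib

def route(user_id: str, challenger_share: float = 0.2) -> str:
    """Deterministically bucket a user into 'champion' or 'challenger'."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100
    return "challenger" if bucket < challenger_share * 100 else "champion"

counts = {"champion": 0, "challenger": 0}
for i in range(1000):
    counts[route(f"user-{i}")] += 1
print(counts)  # roughly an 80/20 split across users
```

Behind such a router, the serving layer compares the two variants on the monitored KPIs, which closes the loop back to the Industrialization Zone.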

3. Okay, we have the framework, cool! How to use it?

To recapitulate, we introduced the five functional areas (Data Ingestion & Storage, Experimentation Zone, Continuous Integration, Industrialization Zone, and Data Presentation), including the components that fulfill the requirements of these areas. Moving forward, let us define the assessment indicators that enable the evaluation of different ML platforms by assessing those components.

ML platforms are assessed based on the coverage and maturity scores of each component. The overall coverage and maturity scores of a functional area are the averages of the scores of the components that comprise that area.

So, what are coverage and maturity?

Scoring of Component and Functional Area with respect to coverage and maturity

Scoring of a component

A component’s coverage score indicates the availability of the component on the ML platform under assessment: it is either covered or not covered. The coverage score does not provide any insight into a component’s efficiency or effectiveness; thus, an additional indicator is essential. As a second assessment indicator, we introduce the maturity score, which indicates the component’s capabilities, robustness, and readiness for production. As a logical consequence, a component with no coverage does not receive a maturity score.

Scoring of a functional area

To calculate the overall coverage and maturity scores of a functional area, we use the enclosed components and their respective coverage and maturity scores. The coverage score of a functional area states the percentage of covered components, whereas its maturity score indicates the overall capability and readiness of the underlying components.
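The aggregation can be sketched as follows. We assume here, purely for illustration, binary coverage per component and a 1-to-5 maturity scale; the component names and scores are made up.

```python
def score_area(components: dict) -> tuple:
    """components maps component name -> maturity score (1-5),
    or None if the component is not covered."""
    covered = [m for m in components.values() if m is not None]
    coverage = len(covered) / len(components)  # share of covered components
    maturity = sum(covered) / len(covered) if covered else None
    return coverage, maturity

# Illustrative scoring of one functional area.
experimentation_zone = {
    "experiment management": 4,
    "metric visualization": 3,
    "collaboration": None,  # not covered, hence no maturity score
}
cov, mat = score_area(experimentation_zone)
print(round(cov, 2), mat)  # coverage 0.67, average maturity 3.5
```

Note how an uncovered component lowers the area's coverage without dragging down its maturity average, which is exactly why the two indicators must stay separate.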

So why do we need two distinct assessment indicators: coverage and maturity?

Imagine a platform ‘A’ that covers only a few components but with high maturity, and another platform ‘B’ that covers every component but with little maturity. Without distinguishing coverage from maturity, both platforms might score the same, but from an operational perspective the difference is huge: with ‘A’ you would need additional tools to enable the full ML lifecycle, while with ‘B’ you would lack the capability to fully industrialize your workflow.

4. So, what’s next?

Now that you are aware of our ML platform assessment framework and the 5 functional areas, let us start assessing ML platforms of various vendors.

In the upcoming articles you will see how different platforms are assessed based on this framework.

Stay tuned!
