Introducing MLCube
MLCube is a set of best practices for creating ML software that can “plug-and-play” on many different systems.
About MLCube
The machine learning (ML) community has seen explosive growth and innovation in the last decade. New models emerge on a daily basis, but sharing those models remains an ad-hoc process. Often, when a researcher wants to use a model produced elsewhere, they must waste hours or days on a frustrating attempt to get the model to work. Similarly, an ML engineer may struggle to port and tune models between development and production environments, which can differ significantly from each other. The challenge is magnified when working with a set of models, such as when reproducing related work, running a performance benchmark suite like MLPerf, or building model management infrastructure. Reproducibility, transparency and consistent performance measurement are cornerstones of good science and engineering.
The industry needs simple, interchangeable building blocks that can be easily shared for experimentation and later composed into mature, robust workflows. Prior work in the MLOps space has produced a variety of tools and processes that simplify deploying and managing ML in various environments: management of models, datasets, and dependencies; tracking of metadata and experiments; deployment and management of ML lifecycles; automation of performance evaluation and analysis; and more.
MLCube isn’t a new framework or service; MLCube is a consistent interface to machine learning models in containers like Docker. Models published with the MLCube interface can be run on local machines, on a variety of major clouds, or in Kubernetes clusters — all using the same code. MLCommons provides simple open source “runners” for each of these environments that make training a model in an MLCube a single command, but MLCube is also designed to make it easy to build new infrastructure based on the interface.
In short, you can view MLCube as a wrapper for models that describes and standardizes a few things: dependencies, input and output formats, hosting, and so on.
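Concretely, an MLCube is just a directory laid out by convention. The tutorial below builds one step by step; the resulting structure looks roughly like this (contents as used in this tutorial):

```
matmul/
├── mlcube.yaml        # root file: name, author, list of tasks
├── tasks/             # one YAML file per task (parameter names and types)
├── run/               # run configurations (bind values to task parameters)
├── platforms/         # where and how to run, e.g. docker.yaml
├── build/             # Dockerfile, requirements.txt, model code
└── workspace/         # default location for input and output files
```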
MLCube status
MLCube is currently a pre-alpha project with an active development team. We invite experimentation and feedback, code contributions, and partnerships with ML infra efforts. Join our Mailing List and say hello.
MLCube examples
The repository referenced below contains a number of MLCube examples that can run in different environments using MLCube runners.
- MNIST MLCube downloads data and trains a simple neural network. This MLCube can run with Docker or Singularity locally and on remote hosts. The README file provides instructions on how to run it. MLCube documentation provides additional details.
- EMDenoise MLCube downloads data and trains a deep convolutional neural network for the Electron Microscopy Benchmark. This MLCube can only run in a Docker container. The README file provides instructions on how to run it.
- Matmul MLCube performs a simple matrix multiply.
Tutorial: Create an MLCube
Step 1: Setup
Get MLCube, the MLCube examples and MLCube templates, and create a Python environment.
# You can clone the MLCube examples and templates from GitHub
git clone https://github.com/mlcommons/mlcube_examples
# Create a python environment
virtualenv -p python3 ./env && source ./env/bin/activate
# Install mlcube, mlcube-docker and cookiecutter
pip install mlcube mlcube-docker cookiecutter
Step 2: Configure MLCube using the mlcube_cookiecutter
Let’s use the ‘matmul’ example that we downloaded in the previous step to illustrate how to make an MLCube. Matmul is a simple matrix multiply example written in Python with TensorFlow. When you create an MLCube for your own model, you will use your own code, data and Dockerfile.
cd mlcube_examples
# rename the matmul reference implementation from matmul to matmul_reference
mv ./matmul ./matmul_reference
# create an MLCube directory using the MLCube template (note: do not use quotes in your input to cookiecutter): name = matmul, author = MLPerf Best Practices Working Group
cookiecutter https://github.com/mlcommons/mlcube_cookiecutter.git
# copy matmul.py, Dockerfile and requirements.txt to your matmul/build directory
cp -R matmul_reference/build matmul
# copy the input file for matmul to the workspace directory
cp -R matmul_reference/workspace matmul
Edit the template files.
Start by looking at the mlcube.yaml file that has been generated by cookiecutter.
cd ./matmul
Cookiecutter has modified the lines shown in bold in the mlcube.yaml file shown here:
# This YAML file marks a directory to be an MLCube directory. When running MLCubes with runners, MLCube path is
# specified using `--mlcube` runner command line argument.
# The most important parameters that are defined here are (1) name, (2) author and (3) list of MLCube tasks.
schema_version: 1.0.0
schema_type: mlcube_root
# MLCube name (string). Replace it with your MLCube name (e.g. "matmul" as shown here).
name: matmul
# MLCube author (string). Replace it with your MLCube author name (e.g. "MLPerf Best Practices Working Group").
author: MLPerf Best Practices Working Group
version: 0.1.0
mlcube_spec_version: 0.1.0
# List of MLCube tasks supported by this MLCube (list of strings). Every task:
# - Has a unique name (e.g. "download").
# - Is defined in a YAML file in the `tasks` sub-folder (e.g. "tasks/download.yaml").
# - Task name is passed to an MLCube implementation file as the first argument (e.g. "python mnist.py download ...").
# Every task is described by lists of input and output parameters. Every parameter is a file system path (directory or
# file) characterized by two fields - name and value.
# By default, if a file system path is a relative path (i.e. does not start with `/`), it is considered to be relative
# to the `workspace` sub-folder.
# Once all tasks are listed below, create a YAML file for each task in the 'tasks' sub-folder and change them
# appropriately.
# NEXT: study `tasks/task_name.yaml`, note: in the case of matmul we only need one task.
tasks:
- tasks/matmul.yaml
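Matmul needs only one task, but an MLCube may declare several. For instance, the MNIST example mentioned earlier has both a download step and a training step; its task list would look something like the following (the exact file names here are illustrative):

```yaml
tasks:
  - tasks/download.yaml   # fetch the dataset
  - tasks/train.yaml      # train the model
```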
Now we will look at file ./matmul/tasks/matmul.yaml.
cd ./tasks
Cookiecutter has modified the lines shown in bold in the matmul.yaml file shown here:
# This YAML file defines the task that this MLCube supports. A task is a piece of functionality that MLCube can run. Task
# examples are `download data`, `pre-process data`, `train a model`, `test a model` etc. MLCube runtime invokes MLCube
# entry point and provides (1) task name as the first argument, (2) task input/output parameters (--name=value) in no
# particular order. Inputs, outputs or both can be empty lists. For instance, when MLCube runtime runs an MLCube task:
# python my_mlcube_entry_script.py download --data_dir=DATA_DIR_PATH --log_dir=LOG_DIR_PATH
# - `download` is the task name.
# - `data_dir` is the output parameter with value equal to DATA_DIR_PATH.
# - `log_dir` is the output parameter with value equal to LOG_DIR_PATH.
# This file only defines parameters; it does not provide parameter values. This is an internal MLCube file and is not
# exposed to users via the command line interface.
schema_version: 1.0.0
schema_type: mlcube_task
# List of input parameters (list of dictionaries).
inputs:
- name: parameters_file
type: file
# List of output parameters (list of dictionaries). Every parameter is a dictionary with two mandatory fields - `name`
# and `type`. The `name` must have a value that can be used as a command line parameter name (--data_dir, --log_dir). The
# `type` is a categorical parameter that can be either `directory` or `file`. Every input/output parameter is always
# a file system path.
# Only parameters with their types are defined in this file. Run configurations defined in the `run` sub-folder
# associate parameter names and their values. There can be multiple run configurations for one task. One example is
# 1-GPU and 8-GPU training configuration for some `train` task.
# NEXT: study `run/task_name.yaml`.
outputs:
- name: output_file
type: file
Our input file shapes.yaml, which we copied into the MLCube workspace earlier, contains input parameters that set the matrix dimensions. We need to remove the automatically generated parameters file.
rm ../workspace/parameters_file.yaml
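The contents of shapes.yaml are not shown in this tutorial; a minimal parameters file for a matrix multiply might look like the following sketch (the key names are hypothetical — check the actual file shipped with the example):

```yaml
# Hypothetical shapes.yaml: dimensions for multiplying A (n x m) by B (m x k)
n: 100
m: 250
k: 300
```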
Now we will edit file ./matmul/run/matmul.yaml.
cd ../run
The lines you need to edit are shown in bold in the matmul.yaml file shown here:
# A run configuration assigns values to task parameters. Since there can be multiple run configurations for one
# task (i.e., 1-GPU and 8-GPU training), run configuration files do not necessarily have to have the same name as their
# tasks. Three sections need to be updated in this file - `task_name`, `input_binding` and `output_binding`.
# Users use run configuration files to ask the MLCube runtime to run a specific task, via the `--task` command line argument.
schema_type: mlcube_invoke
schema_version: 1.0.0
# Name of a task.
# task_name: task_name
task_name: matmul
# Dictionary of input bindings (dictionary mapping strings to strings). Parameters must correspond to those in task
# file (`inputs` section). If no parameters are provided, the binding section must be an empty dictionary.
input_binding:
parameters_file: $WORKSPACE/shapes.yaml
# Dictionary of output bindings (dictionary mapping strings to strings). Parameters must correspond to those in task
# file (`outputs` section). Every parameter is a file system path (directory or a file name). Paths can be absolute
# (starting with `/`) or relative. Relative paths are assumed to be relative to MLCube `workspace` directory.
# Alternatively, a special variable `$WORKSPACE` can be used to explicitly refer to the MLCube `workspace` directory.
# MLCube root directory (`--mlcube`) and run configuration file (`--task`) define MLCube task to run. One step left is
# to specify where MLCube runs - on a local machine, remote machine in the cloud etc. This is done by providing platform
# configuration files located in the MLCube `platforms` sub-folder.
# NEXT: study `platforms/docker.yaml`.
output_binding:
output_file: $WORKSPACE/matmul_output.txt
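Putting the task file and this run configuration together, the runner ends up invoking the container entry point with the task name followed by the bound parameters, roughly like this (the exact paths depend on how the runner mounts the workspace into the container):

```
python3 /workspace/matmul.py matmul \
    --parameters_file=<workspace>/shapes.yaml \
    --output_file=<workspace>/matmul_output.txt
```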
Now we will edit file ./matmul/platforms/docker.yaml
cd ../platforms
Edit the Docker image name in docker.yaml: change `image: "mlcube/matmul:0.0.1"` to `image: "mlcommons/matmul:v1.0"`.
# Platform configuration files define where and how runners run MLCubes. This configuration file defines a Docker
# runtime for MLCubes. One field needs to be updated here - `container.image`. This platform file defines a local
# Docker execution environment.
# MLCube Docker runner uses image name to either `pull` or `build` a docker image. The rule is the following:
# - If the following file exists (`build/Dockerfile`), Docker image will be built.
# - Else, docker runner will pull a docker image with the specified name.
# Users provide platform files using `--platform` command line argument.
schema_type: mlcube_platform
schema_version: 0.1.0
platform:
name: "docker"
version: ">=18.01"
container:
image: "mlcommons/matmul:v1.0"
Step 3: Create a Dockerfile for your model container image
You will need a Docker image to create an MLCube. We will use the Dockerfile for ‘matmul’ to build the Docker container image:
Note: the last line of the Dockerfile must be
`ENTRYPOINT ["python3", "/workspace/your_mlcube_name.py"]` as shown below.
Now we will edit matmul/build/Dockerfile.
cd ../build
# Sample Dockerfile for matmul (Matrix Multiply)
FROM ubuntu:18.04
MAINTAINER MLPerf MLBox Working Group
WORKDIR /workspace
RUN apt-get update && \
apt-get install -y --no-install-recommends \
software-properties-common \
python3-dev \
curl && \
rm -rf /var/lib/apt/lists/*
RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
python3 get-pip.py && \
rm get-pip.py
COPY requirements.txt /requirements.txt
RUN pip3 install --no-cache-dir -r /requirements.txt
COPY matmul.py /workspace/matmul.py
ENTRYPOINT ["python3", "/workspace/matmul.py"]
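For reference, here is a minimal sketch of what an entry script like matmul.py could look like. This is not the actual example code (the real matmul uses TensorFlow); it only illustrates the calling convention: the task name arrives as the first positional argument, followed by the --parameters_file and --output_file flags bound in the run configuration. The fixed input matrices are placeholders for values a real implementation would read from the parameters file.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of an MLCube entry point (matmul.py)."""
import argparse


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def main(argv=None):
    # MLCube invokes the entry point as:
    #   python3 /workspace/matmul.py <task_name> --name=value ...
    parser = argparse.ArgumentParser(description="matmul MLCube entry point")
    parser.add_argument("task", help="MLCube task name, e.g. 'matmul'")
    parser.add_argument("--parameters_file", required=True)
    parser.add_argument("--output_file", required=True)
    args = parser.parse_args(argv)
    if args.task != "matmul":
        raise ValueError(f"unsupported task: {args.task}")

    # A real implementation would read the matrix shapes from
    # args.parameters_file (YAML) and build matrices accordingly; fixed
    # 2x2 inputs keep this sketch self-contained.
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    result = matmul(a, b)
    with open(args.output_file, "w") as f:
        f.write("\n".join(" ".join(str(v) for v in row) for row in result))


if __name__ == "__main__":
    main()
```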
Step 4: Build the Docker container image
cd ..
mlcube_docker configure --mlcube=. --platform=platforms/docker.yaml
Step 5: Test your MLCube
mlcube_docker run --mlcube=. --platform=platforms/docker.yaml --task=run/matmul.yaml
ls ./workspace
cat ./workspace/matmul_output.txt
Conclusion
MLCube is a set of common conventions for creating ML software that can “plug-and-play” on many different systems. MLCube makes it easier for researchers to share innovative ML models, for a developer to experiment with many different models, and for software companies to create infrastructure for models. It creates opportunities by putting ML in the hands of more people.