Kubeflow with MLFlow

Varun Mallya
Published in DKatalis
May 27, 2022

Kubeflow and MLFlow are very well-known tools in the MLOps circle.

Kubeflow is a comprehensive ML platform whose features range from AutoML to pipeline scheduling. MLFlow, on the other hand, works well as an artifact registry with a great experiment logging interface.

Kubeflow coupled with MLFlow is a marriage made in MLOps heaven.

Check out GitHub

To begin, we first deploy MLFlow on Kubernetes (k8s), just like any other deployment. We start with a Dockerfile:

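FROM python:3.7-slim-buster

# boto3 and google-cloud-storage cover the artifact stores; psycopg2-binary is the PostgreSQL driver
RUN pip3 install --upgrade pip && \
    pip3 install mlflow==1.20.2 boto3 google-cloud-storage psycopg2-binary

ENTRYPOINT ["mlflow", "server"]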

We are deploying our MLFlow on GKE with the GCS artifact store and Cloud SQL as the backend store. Please feel free to add any dependencies if you are deploying in a different environment.
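To make the two stores concrete, here is a minimal Python sketch of how the `mlflow server` command line is assembled from them. The variable names `MLFLOW_S3_ENDPOINT_URL` and `MLFLOW_TRACKING_URI` are the ones the deployment reads; the bucket name, database credentials, and the assumption that Cloud SQL runs PostgreSQL (suggested by `psycopg2-binary` in the Dockerfile) are hypothetical placeholders, not values from our setup.

```python
import shlex

# Hypothetical example values: replace with your own bucket and database.
EXAMPLE_ENV = {
    "MLFLOW_S3_ENDPOINT_URL": "gs://my-mlflow-bucket/artifacts",
    "MLFLOW_TRACKING_URI": "postgresql://mlflow:secret@10.0.0.5:5432/mlflow",
}


def build_server_command(env):
    """Return the `mlflow server` argv built from the two store settings."""
    artifact_root = env["MLFLOW_S3_ENDPOINT_URL"]
    backend_uri = env["MLFLOW_TRACKING_URI"]
    # Sanity checks for this GKE setup (assumed: GCS bucket + PostgreSQL).
    if not artifact_root.startswith("gs://"):
        raise ValueError("expected a gs:// artifact root for GCS")
    if not backend_uri.startswith("postgresql://"):
        raise ValueError("expected a postgresql:// backend store URI")
    return [
        "mlflow", "server",
        "--host", "0.0.0.0",
        "--default-artifact-root", artifact_root,
        "--backend-store-uri", backend_uri,
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted line.
    print(shlex.join(build_server_command(EXAMPLE_ENV)))
```

If you deploy somewhere else (for example S3 with MySQL), relax or change the two prefix checks accordingly.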

Create a deployment.yaml:

Ensure that the environment variables referenced in the container args (MLFLOW_S3_ENDPOINT_URL and MLFLOW_TRACKING_URI) are passed via a ConfigMap or through some other means.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: mlflow-sa
  namespace: mlflow
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-deployment
  namespace: mlflow
  labels:
    app: mlflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      serviceAccountName: mlflow-sa
      containers:
        - name: mlflow
          image: varunmallya/mlflow:latest
          imagePullPolicy: Always
          # Run through "bash -c" so the shell expands the ${...} variables
          command: ["/bin/bash", "-c"]
          args:
            - "mlflow server --host 0.0.0.0 --default-artifact-root ${MLFLOW_S3_ENDPOINT_URL} --backend-store-uri ${MLFLOW_TRACKING_URI}"
          ports:
            - containerPort: 5000

And then, its corresponding service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: mlflow-service
  namespace: mlflow
spec:
  selector:
    app: mlflow
  ports:
    - protocol: TCP
      port: 5000
      targetPort: 5000

MLFlow Deployed!
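Once the service is up, code running inside the cluster (a Kubeflow notebook or pipeline step, say) can log to it. The sketch below is one way to do that, assuming the `mlflow` package is installed in the client environment; the experiment name and the logged values are dummy placeholders, and only the service name, namespace, and port come from the manifests above.

```python
def tracking_uri(service="mlflow-service", namespace="mlflow", port=5000):
    """Build the in-cluster DNS URL of the MLFlow service."""
    return f"http://{service}.{namespace}.svc.cluster.local:{port}"


def log_demo_run(uri):
    """Log one dummy parameter and metric to the tracking server at `uri`."""
    # Imported lazily so tracking_uri() works even without mlflow installed.
    import mlflow

    mlflow.set_tracking_uri(uri)
    mlflow.set_experiment("kubeflow-demo")  # hypothetical experiment name
    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)  # dummy value
        mlflow.log_metric("accuracy", 0.93)      # dummy value


if __name__ == "__main__":
    # Only works from a pod that can resolve the cluster-local service name.
    log_demo_run(tracking_uri())
```

Logging artifacts (models, plots) additionally requires the client to have credentials for the artifact store, which this sketch leaves out.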

Now, we would like to have an MLFlow tab as part of the central dashboard in Kubeflow. To achieve this, we define an Istio VirtualService that exposes the MLFlow service through the Kubeflow Istio ingress gateway under the /mlflow/ path.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: mlflow
  namespace: mlflow
spec:
  gateways:
    - kubeflow/kubeflow-gateway
  hosts:
    - '*'
  http:
    - match:
        - uri:
            prefix: /mlflow/
      rewrite:
        uri: /
      route:
        - destination:
            host: mlflow-service.mlflow.svc.cluster.local
            port:
              number: 5000
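The match and rewrite pair is what lets MLFlow keep serving from its root path while living under /mlflow/ on the gateway. As a rough illustration, the Python sketch below mimics that behaviour: when a request matches the prefix, the matched prefix is replaced by the rewrite value. The function `rewrite_path` is a hypothetical helper written for this post, not part of Istio.

```python
from typing import Optional


def rewrite_path(path: str, prefix: str = "/mlflow/", rewrite: str = "/") -> Optional[str]:
    """Return the path forwarded to the backend, or None if the route does not match."""
    if not path.startswith(prefix):
        return None  # this route does not apply; other routes handle the request
    # Replace the matched prefix with the rewrite value.
    return rewrite + path[len(prefix):]


if __name__ == "__main__":
    for p in ["/mlflow/", "/mlflow/api/2.0/mlflow/experiments/list", "/mlflow", "/pipeline/"]:
        print(p, "->", rewrite_path(p))
```

Note that a request to `/mlflow` without the trailing slash does not match the `/mlflow/` prefix, which is why the dashboard link in the next step ends with a slash.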

Then we make a small change to the central dashboard config map:

kubectl edit cm centraldashboard-config -n kubeflow
# add this under the other menu items
{
  "type": "item",
  "link": "/mlflow/",
  "text": "MlFlow",
  "icon": "icons:cached"
}
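If you would rather script the change than edit by hand, the idea can be sketched in Python as below. This assumes the config map stores the menu as a JSON document with a `menuLinks` array; that key name is an assumption about your Kubeflow version, so check the actual content of `centraldashboard-config` before relying on it. The sample input is made up for illustration.

```python
import json

# The menu entry from this post.
MLFLOW_ITEM = {
    "type": "item",
    "link": "/mlflow/",
    "text": "MlFlow",
    "icon": "icons:cached",
}


def add_menu_item(links_json, item, key="menuLinks"):
    """Append `item` to the menu list in `links_json` unless its link is already present."""
    config = json.loads(links_json)
    menu = config.setdefault(key, [])  # `key` is an assumed field name
    if not any(entry.get("link") == item["link"] for entry in menu):
        menu.append(item)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Made-up sample of an existing menu with a single entry.
    sample = '{"menuLinks": [{"type": "item", "link": "/pipeline/", "text": "Pipelines", "icon": "kubeflow:pipeline-centered"}]}'
    print(add_menu_item(sample, MLFLOW_ITEM))
```

Running the helper twice leaves a single MLFlow entry, so it is safe to reapply.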

After that, we restart the central dashboard deployment so that it picks up the new config:

kubectl rollout restart deploy centraldashboard -n kubeflow

There you have it: MLFlow nested within Kubeflow!

Enjoy working, tinkering, and experimenting with MLFlow & Kubeflow? Perhaps you would make an excellent fit for the team!
