Kubeflow with MLFlow
Kubeflow and MLFlow are very well-known tools in the MLOps circle.
Kubeflow is a comprehensive ML platform with features ranging from AutoML to pipeline scheduling. MLFlow, on the other hand, works well as an artifact registry with a great experiment logging interface.
Kubeflow coupled with MLFlow is a marriage made in MLOps heaven.
To begin, we first deploy MLFlow on Kubernetes (k8s), which works like any other deployment. We start with a Dockerfile:
FROM python:3.7-slim-buster
RUN pip3 install --upgrade pip && \
    pip3 install mlflow==1.20.2 boto3 google-cloud-storage psycopg2-binary
ENTRYPOINT ["mlflow", "server"]
We are deploying our MLFlow on GKE with the GCS artifact store and Cloud SQL as the backend store. Please feel free to add any dependencies if you are deploying in a different environment.
Create a deployment.yaml, ensuring that the env variables referenced in the container args (MLFLOW_S3_ENDPOINT_URL and MLFLOW_TRACKING_URI) are passed in via a config map or through some other means:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mlflow-sa
  namespace: mlflow
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-deployment
  namespace: mlflow
  labels:
    app: mlflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      serviceAccountName: mlflow-sa
      containers:
        - name: mlflow
          image: varunmallya/mlflow:latest
          imagePullPolicy: Always
          # Run through bash -c so the ${...} variables are expanded from the container environment
          command: ["/bin/bash", "-c"]
          args:
            - "mlflow server --host 0.0.0.0 --default-artifact-root ${MLFLOW_S3_ENDPOINT_URL} --backend-store-uri ${MLFLOW_TRACKING_URI}"
          ports:
            - containerPort: 5000
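As a rough illustration of the config map mentioned above, here is a minimal Python sketch that builds a ConfigMap manifest carrying the two variables the container args expand. The name `mlflow-config`, the bucket, and the database URI are hypothetical placeholders, not values from this setup.

```python
import json


def build_mlflow_configmap(artifact_root: str, backend_store_uri: str,
                           name: str = "mlflow-config",
                           namespace: str = "mlflow") -> dict:
    """Return a ConfigMap manifest holding the variables used in the container args."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {
            # Expanded by bash as --default-artifact-root
            "MLFLOW_S3_ENDPOINT_URL": artifact_root,
            # Expanded by bash as --backend-store-uri
            "MLFLOW_TRACKING_URI": backend_store_uri,
        },
    }


# Placeholder values (hypothetical): substitute your own bucket and database.
manifest = build_mlflow_configmap(
    artifact_root="gs://example-bucket/mlflow-artifacts",
    backend_store_uri="postgresql://user:password@db-host:5432/mlflow",
)

# kubectl accepts JSON manifests as well as YAML.
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`. The container would also need an `envFrom` entry with a `configMapRef` pointing at this config map. Since the backend store URI usually contains a database password, a Secret is likely the better home for that value.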
And then, the corresponding service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: mlflow-service
  namespace: mlflow
spec:
  selector:
    app: mlflow
  ports:
    - protocol: TCP
      port: 5000
      targetPort: 5000
MLFlow Deployed!
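A quick way to check the deployment is to log a throwaway run from inside the cluster, for example from a Kubeflow notebook. The sketch below assumes the `mlflow` package is installed in that environment and that network policies allow traffic to the `mlflow` namespace. The experiment name and the logged values are made up for illustration.

```python
def tracking_uri(service: str = "mlflow-service",
                 namespace: str = "mlflow",
                 port: int = 5000) -> str:
    """Build the in-cluster URL of the MLFlow service defined in service.yaml."""
    return f"http://{service}.{namespace}.svc.cluster.local:{port}"


def log_smoke_test_run(experiment: str = "kubeflow-smoke-test") -> str:
    """Log one dummy run to the tracking server and return its run ID."""
    # Imported lazily so tracking_uri() stays usable without mlflow installed.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri())
    mlflow.set_experiment(experiment)  # hypothetical experiment name
    with mlflow.start_run() as run:
        mlflow.log_param("source", "kubeflow-notebook")
        mlflow.log_metric("dummy_metric", 1.0)
        return run.info.run_id
```

Calling `log_smoke_test_run()` from a notebook should make a new run appear under the chosen experiment in the MLFlow UI, provided the backend store is reachable.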
Now, we would like to have an MLFlow tab as part of the central dashboard in Kubeflow. To achieve this, we define a VirtualService that makes the MLFlow service available through the Istio ingress gateway:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: mlflow
  namespace: mlflow
spec:
  gateways:
    - kubeflow/kubeflow-gateway
  hosts:
    - '*'
  http:
    - match:
        - uri:
            prefix: /mlflow/
      rewrite:
        uri: /
      route:
        - destination:
            host: mlflow-service.mlflow.svc.cluster.local
            port:
              number: 5000
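To make the prefix match and rewrite in this route concrete, here is a small Python sketch of the path mapping. It is only a model of the rule for illustration, not Istio's implementation, and the example request paths are assumptions.

```python
from typing import Optional


def rewrite_path(path: str, prefix: str = "/mlflow/", rewrite: str = "/") -> Optional[str]:
    """Return the path the MLFlow service would see, or None if the route does not match."""
    if not path.startswith(prefix):
        return None
    # The matched prefix is swapped for the rewrite URI; the remainder is kept.
    return rewrite + path[len(prefix):]


for request_path in [
    "/mlflow/",
    "/mlflow/api/2.0/mlflow/experiments/list",
    "/mlflow",      # no trailing slash, so the prefix does not match
    "/pipeline/",   # belongs to another route
]:
    print(request_path, "->", rewrite_path(request_path))
```

Note that `/mlflow` without the trailing slash does not match this route, which is why the dashboard link should point at `/mlflow/`.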
Then we make a small change to the central dashboard config map:
kubectl edit cm centraldashboard-config -n kubeflow
# add this under the other menu items
{
  "type": "item",
  "link": "/mlflow/",
  "text": "MlFlow",
  "icon": "icons:cached"
}
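If you would rather script this change than edit by hand, the following Python sketch appends the item to the menu JSON. It assumes the menu items live in a JSON document under a `menuLinks` list, which may differ between Kubeflow versions. The sample input is hypothetical.

```python
import json

MLFLOW_ITEM = {
    "type": "item",
    "link": "/mlflow/",
    "text": "MlFlow",
    "icon": "icons:cached",
}


def add_menu_item(links_json: str, item: dict = MLFLOW_ITEM,
                  key: str = "menuLinks") -> str:
    """Append the item to the menu list unless an entry with the same link exists."""
    config = json.loads(links_json)
    menu = config.setdefault(key, [])
    if not any(entry.get("link") == item["link"] for entry in menu):
        menu.append(dict(item))
    return json.dumps(config, indent=2)


# Hypothetical stand-in for the JSON stored in the config map.
sample = json.dumps({
    "menuLinks": [
        {"type": "item", "link": "/pipeline/", "text": "Pipelines", "icon": "icons:list"},
    ]
})

updated = add_menu_item(sample)
print(updated)
```

The function is idempotent, so running it twice does not duplicate the tab. Because it goes through `json.loads`, it also rejects curly quotes, which is an easy mistake to make when pasting the snippet from a web page.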
Once the config map is saved, we restart the central dashboard deployment:
kubectl rollout restart deploy centraldashboard -n kubeflow
There you have it: MLFlow nested within Kubeflow!
Enjoy working, tinkering, and experimenting with MLFlow & Kubeflow? Perhaps you would make an excellent fit for the team!