ML Platform User Experience: Build for Data Scientists

Arpeet Kale
2 min read · Jan 6, 2022

Photo by Karolina Grabowska from Pexels

I’ve been part of building a large-scale ML platform twice, over multiple iterations and in a cross-functional capacity. Here are five key principles for giving your data scientists a seamless user experience.

Abstraction

Data scientists should not need to know where data is stored, how it is pulled, or how to deploy containers on Kubernetes. Provide an SDK or libraries that make it easy for them to create, configure, and deploy their experiments on your platform.
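As a rough illustration of what such an SDK could look like, here is a minimal sketch. Everything in it is hypothetical: the Experiment and PlatformClient names, the dataset catalog, the example bucket URI, and the shape of the job spec are assumptions made for the example, not an existing library. The point is only that the data scientist names a logical dataset and an entrypoint, while the platform resolves storage locations and builds whatever the scheduler needs.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """What the data scientist declares: no storage paths, no Kubernetes."""
    name: str
    dataset: str                 # logical dataset name, not a storage path
    entrypoint: str              # training script, e.g. "train.py"
    params: dict = field(default_factory=dict)
    gpus: int = 0


class PlatformClient:
    """Hypothetical SDK facade; the platform team owns everything below."""

    def __init__(self, dataset_catalog: dict):
        # Assumed catalog mapping logical dataset names to physical locations.
        self._catalog = dataset_catalog

    def build_job_spec(self, exp: Experiment) -> dict:
        """Translate a declarative experiment into a platform job spec."""
        if exp.dataset not in self._catalog:
            raise KeyError(f"unknown dataset: {exp.dataset}")
        env = {"DATA_URI": self._catalog[exp.dataset]}
        # Hyperparameters are passed to the job as environment variables.
        for key, value in exp.params.items():
            env[f"PARAM_{key.upper()}"] = str(value)
        return {
            "job_name": exp.name,
            "command": ["python", exp.entrypoint],
            "env": env,
            "resources": {"gpu": exp.gpus},
        }


# Usage: the catalog entry below is a made-up example location.
client = PlatformClient({"churn-v2": "s3://example-bucket/churn/v2/"})
spec = client.build_job_spec(
    Experiment(name="churn-baseline", dataset="churn-v2",
               entrypoint="train.py", params={"lr": 0.01}, gpus=1)
)
```

In a real platform the job spec would be handed to a scheduler; the sketch stops at building it so that the boundary between what the user declares and what the platform fills in stays visible.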

Interactive Experimentation

Data scientists need a space to test their code and inspect data and attributes before running a full training cycle. Provide Jupyter notebooks so they can work interactively with data and model algorithms.
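To make "look at data and attributes" concrete, here is a small sketch of the kind of helper a platform might expose inside a notebook. The peek function and the inline CSV are invented for this example; a real platform would more likely read from its own data layer and return a DataFrame.

```python
import csv
import io
from itertools import islice


def peek(csv_file, n_rows=5):
    """Read only the first n_rows and report columns and missing values."""
    reader = csv.DictReader(csv_file)
    rows = list(islice(reader, n_rows))
    columns = list(reader.fieldnames or [])
    # Count empty cells per column within the sampled rows only.
    missing = {c: sum(1 for r in rows if r[c] in ("", None)) for c in columns}
    return {"columns": columns, "rows": rows, "missing_in_sample": missing}


# Usage with a tiny in-memory file standing in for a real dataset.
sample = io.StringIO("age,plan\n34,pro\n,free\n51,pro\n")
preview = peek(sample, n_rows=2)
```

Because only the first few rows are read, a check like this stays cheap even when the full dataset is large, which is what makes it useful before committing to a full training run.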

Flexibility

Give data scientists the ability to author and run experiments in the framework of their choice. At the very least, provide first-class support for TensorFlow and PyTorch.

Research to Production

Create a pipeline that makes it easy to promote a model algorithm (e.g. a neural network) and its related business logic (pre- and post-processing) to production. Provide a CI/CD pipeline that packages the data scientist's code behind an API and deploys it as a container.
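The packaging step is easier when the model and its business logic travel together as one unit. Here is a minimal sketch, assuming a hypothetical ServableModel wrapper that bundles pre-processing, prediction, and post-processing behind a single JSON-in, JSON-out handler. The field names and the threshold "model" are made up for illustration; an HTTP server and a container image would sit around this in a real deployment.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ServableModel:
    """Bundles the model with its pre/post-processing (hypothetical)."""
    preprocess: Callable[[dict], Any]
    predict: Callable[[Any], Any]
    postprocess: Callable[[Any], dict]

    def handle(self, request_body: str) -> str:
        """Turn a JSON request into a JSON response."""
        payload = json.loads(request_body)
        features = self.preprocess(payload)
        raw_output = self.predict(features)
        return json.dumps(self.postprocess(raw_output))


# A stand-in "model": a simple threshold on one scaled feature.
servable = ServableModel(
    preprocess=lambda payload: [float(payload["age"]) / 100.0],
    predict=lambda features: 0.9 if features[0] > 0.5 else 0.1,
    postprocess=lambda score: {
        "label": "high" if score >= 0.5 else "low",
        "score": score,
    },
)

response = servable.handle('{"age": 72}')
```

With this shape, the CI/CD pipeline only needs to know how to serve a handle function, so the same build and deploy steps can be reused across models.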

Visibility

Capture metrics from the model prediction workflows and provide a way to visualize model behavior (e.g. by computing SHAP values). Data scientists need a feedback loop to measure how the model behaves in production and improve it iteratively.
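As a sketch of the capture side of that feedback loop, the wrapper below records each prediction and its latency in a bounded window and exposes a simple summary. PredictionMonitor is a hypothetical name, and the example deliberately does not compute SHAP values; it only shows where such records could be collected so that explanations and dashboards can be built on top of them.

```python
import statistics
import time
from collections import deque


class PredictionMonitor:
    """Wraps a predict function and keeps recent prediction records."""

    def __init__(self, predict_fn, window=1000):
        self._predict_fn = predict_fn
        # Bounded buffer: only the most recent `window` records are kept.
        self._records = deque(maxlen=window)

    def predict(self, features):
        start = time.perf_counter()
        prediction = self._predict_fn(features)
        latency_ms = (time.perf_counter() - start) * 1000.0
        self._records.append(
            {"features": features, "prediction": prediction,
             "latency_ms": latency_ms}
        )
        return prediction

    def summary(self):
        """Aggregate over the records currently in the window."""
        if not self._records:
            return {"count": 0}
        predictions = [r["prediction"] for r in self._records]
        latencies = [r["latency_ms"] for r in self._records]
        return {
            "count": len(predictions),
            "mean_prediction": statistics.fmean(predictions),
            "max_latency_ms": max(latencies),
        }


# Usage with a stand-in model that sums its input features.
monitor = PredictionMonitor(lambda features: sum(features))
monitor.predict([1, 2])
monitor.predict([3, 4])
stats = monitor.summary()
```

In production the records would typically be shipped to a metrics store rather than held in memory; the in-process buffer just keeps the example self-contained.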

Arpeet Kale

Founding Engineer @ Skiff, building usable privacy (https://www.skiff.org/). Previously: Deep Learning Platform @ Salesforce Einstein AI.