Introducing Seldon Core — Machine Learning Deployment for Kubernetes

We’re excited to announce the release of a new open-source platform that enables data science teams to easily run and manage models in production at scale.

Seldon Core focuses on the last step of any machine learning project: helping companies put models into production so they can solve real-world problems and maximise return on investment.

Data scientists are freed to focus on creating better models while devops teams are able to manage deployments more effectively using tools they understand.

The goals of the platform are to:

  • Enable data scientists to deploy models built with any machine learning toolkit or programming language. We plan to initially support Python-based tools and frameworks including TensorFlow, Scikit-learn, Spark and H2O.
  • Expose machine learning models via REST and gRPC automatically when deployed for easy integration into business apps and services that need predictions.
  • Handle full lifecycle management of the deployed model with no downtime, including updating the runtime graph, scaling, monitoring, and security.
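Since models are exposed over REST automatically, an application only needs to send a JSON-encoded feature batch to the deployment's prediction endpoint. As a hedged sketch (the endpoint path and the exact payload schema below are illustrative assumptions, not a guaranteed API contract), a client request might be built like this:

```python
import json
import urllib.request


def build_prediction_request(features, url="http://localhost:8080/api/v0.1/predictions"):
    """Build an HTTP POST request carrying a feature matrix as JSON.

    The payload shape ({"data": {"ndarray": ...}}) and the URL are
    illustrative assumptions for this sketch, not a fixed contract.
    """
    body = json.dumps({"data": {"ndarray": features}}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )


# Once a deployment is live, sending it is a single call (not executed here):
# response = urllib.request.urlopen(build_prediction_request([[1.0, 2.0]]))
```

The same payload can be sent over gRPC for lower-latency integrations; only the transport changes, not the model.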

Inference Graphs

Instead of just serving up single models behind an API endpoint, Seldon Core allows complex runtime inference graphs to be deployed in containers as microservices. These graphs can be composed of:

  • Models — runtime inference executables for one or more machine learning models.
  • Routers — route API requests to sub-graphs. E.g. to serve experiments such as A/B tests or more dynamic multi-armed bandits.
  • Combiners — combine the responses from sub-graphs. E.g. ensembles of models.
  • Transformers — transform requests or responses. E.g. transform feature requests.

Example Runtime Model Graph
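To give a feel for how a graph node is defined, here is a hedged sketch of a Model component in Python. Seldon Core wraps a plain class exposing a `predict` method into a containerised microservice; the class name, constructor, and exact `predict` signature below are illustrative of common usage rather than a definitive API.

```python
class MeanModel:
    """A toy inference node: predicts the mean of each feature row.

    A wrapper turns a class like this into a REST/gRPC microservice;
    the predict signature shown here (a feature batch plus optional
    feature names) is an assumption for this sketch.
    """

    def __init__(self):
        # A real component would load model weights/artifacts here.
        pass

    def predict(self, X, features_names=None):
        # X is a batch of feature rows; return one prediction per row.
        return [sum(row) / len(row) for row in X]
```

Because the node is just a class, it can be unit-tested locally before it is ever containerised: `MeanModel().predict([[1.0, 3.0]])` returns `[2.0]`.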

Why is this important?

Efficiency — traditional infrastructure stacks and devops processes don’t translate well to machine learning, and there is limited open-source innovation in this space, which forces companies to build their own at great expense or to use a proprietary service. Also, data engineers with the necessary multidisciplinary skillset spanning ML and ops are very scarce. These inefficiencies cause data scientists to get pulled into quality-of-service and performance-related challenges that take their focus away from where they can add the most value — building better models.

Innovation — only when a model is in production can you measure its performance on real-world problems. Seldon Core lets you decouple data science from application release cycles, so models can be iterated on independently. Experimentation processes such as A/B testing and multi-armed bandits are usually placed much higher in the stack, often using tools that were not designed for machine learning. Bringing experimentation into the inference graph itself shortens iteration cycles and creates an opportunity to build and optimise more use cases faster, accelerating innovation and ROI.
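To make the bandit idea concrete, here is a minimal epsilon-greedy sketch of the kind of logic a Router component could run. The class and method names are purely illustrative, not Seldon Core's actual API:

```python
import random


class EpsilonGreedyRouter:
    """Route each request to one of several child models.

    With probability epsilon we explore a random child; otherwise we
    exploit the child with the best observed average reward. This is
    an illustrative sketch of a bandit Router, not a real Seldon API.
    """

    def __init__(self, n_children, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_children      # feedback events per child
        self.totals = [0.0] * n_children    # cumulative reward per child

    def route(self, request=None):
        """Pick the index of the child model to serve this request."""
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))
        averages = [t / c if c else 0.0
                    for t, c in zip(self.totals, self.counts)]
        return max(range(len(averages)), key=averages.__getitem__)

    def send_feedback(self, child, reward):
        # A reward signal (e.g. click, conversion) updates the estimates.
        self.counts[child] += 1
        self.totals[child] += reward
```

Because routing lives in the graph rather than in the application layer, traffic shifts toward better-performing models without an application release.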

Freedom — according to research from IDC in 2017, 40% of European organisations stretch applications across clouds. They do so for commercial, backup, resilience or regulatory reasons. Having a single cloud-agnostic ML deployment platform will facilitate the use of multiple clouds for machine learning. Also, model-building tools like TensorFlow are continually evolving, so organisations need an elegant framework-agnostic solution to avoid the trap of monolithic architectures, and to facilitate agile data science teams and processes.

Getting Started

Seldon Core runs on any cloud platform, on bare-metal servers, and even on individual laptops. You can get started quickly by installing the official release via Helm:

helm install seldon-core --name seldon-core --repo

We’ve designed a Kubernetes Custom Resource for a Seldon Deployment, which means you can manage deployments directly with kubectl instead of having to learn a new CLI:

kubectl apply -f my_ml_deployment.yaml

Seldon Community

Our team of data scientists and engineers at Seldon are building and managing the Seldon Core project directly on GitHub, not in private company repos and project-management systems, so expect to see lots of activity from us.

Your contributions are very welcome — here are some ideas to get started and contributor guidelines. Big thanks to @errordeveloper from Weaveworks for submitting the first community pull request! Please don’t hesitate to create an issue if you have any feedback or questions.

Seldon Core is available under an Apache 2.0 license. You are free to use it for small-scale projects or to scale it to thousands of nodes running on your own infrastructure. We hope it helps you solve large and meaningful problems.