Real-Time Prediction Serving, Simplified


Why is Prediction Serving Harder Than Training?

  • Building pipelines is cumbersome and unfriendly to open source: if a pipeline’s models weren’t trained on AWS SageMaker, the user has to manually construct a Docker container for each model, along with a web server that wraps the model and handles requests.
  • Deploying pipelines often requires hacking around the systems’ supported APIs. None of these systems supports pipelines with parallelism, an important requirement for real-world pipelines that execute multiple models. A pipeline with, say, 3-way parallelism would require deploying three different SageMaker pipelines and building a separate proxy service to route requests correctly.
  • Managing pipelines is difficult because the infrastructure for these systems is fixed-deployment, meaning resource allocation must be managed manually. This is particularly hard on data scientists, many of whom may be neither interested in nor skilled at deploying and operating scalable online services.
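
To make the proxy-service workaround concrete, here is a minimal sketch of what such a hand-built router has to do for a 3-way parallel pipeline: fan one request out to three models and gather the results. The names `model_a`, `model_b`, `model_c`, and `fan_out_proxy` are hypothetical stand-ins, not part of any real service; in practice each model would be a network call to its own deployment.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for three separately deployed model endpoints.
# In a real deployment each would be an HTTP call to its own service.
def model_a(x):
    return x + 1

def model_b(x):
    return x * 2

def model_c(x):
    return x - 3

def fan_out_proxy(request, models):
    """Send one request to every model in parallel; return results in model order."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = [pool.submit(m, request) for m in models]
        # Collecting in submission order keeps outputs aligned with `models`.
        return [f.result() for f in futures]

if __name__ == "__main__":
    print(fan_out_proxy(10, [model_a, model_b, model_c]))
```

Even this toy version leaves out error handling, timeouts, and scaling, which is the operational burden the list above describes.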

Live Prediction Serving with Familiar Data Pipeline APIs

  • Added GPU support.
  • Added execute-many-pick-one semantics for competitive execution.
  • Added continuations, so pipelines with dynamic lookups can leverage Cloudburst’s locality-aware scheduling.
  • Added batching support, especially for GPUs.
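
Two of the items above, competitive execution and batching, can be illustrated with a small sketch in plain Python. This is not the system's own API: `pick_one`, `run_batched`, and the toy replicas are hypothetical names chosen for illustration, and threads stand in for separate model replicas.

```python
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def pick_one(replicas, request):
    """Execute-many-pick-one: run the request on every replica, keep the first result."""
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(r, request) for r in replicas]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    # Do not block on stragglers; their results are simply discarded.
    pool.shutdown(wait=False)
    return next(iter(done)).result()

def run_batched(model, requests, batch_size):
    """Group requests into batches so the model is invoked once per batch."""
    outputs = []
    for i in range(0, len(requests), batch_size):
        outputs.extend(model(requests[i:i + batch_size]))
    return outputs

# Toy replicas with different latencies (assumed for the demo).
def slow_replica(x):
    time.sleep(0.3)
    return ("slow", x)

def fast_replica(x):
    time.sleep(0.01)
    return ("fast", x)

# Toy model that processes a whole batch in one call.
def double_all(batch):
    return [2 * x for x in batch]

if __name__ == "__main__":
    print(pick_one([slow_replica, fast_replica], 7))
    print(run_batched(double_all, [1, 2, 3, 4, 5], batch_size=2))
```

Competitive execution trades extra compute for lower tail latency, while batching amortizes per-call overhead, which matters most on GPUs.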

Looking Forward




Working on distributed systems and serverless things in grad school @ Cal.

Vikram Sreekanti
