Real-time Machine Learning Inference at Scale Using Spark Structured Streaming

A complete demo for developing locally and deploying on Databricks

Advantages of Spark Structured Streaming

1. Achieve virtually unlimited scalability

2. Achieve optimal performance

3. Fast turnaround for automation, testing, and debugging

4. Save engineering time

Demo: Getting Started

Local Deployment

Deploying on Databricks

Logging, Autoscaling, and Monitoring

Logging

Autoscaling

Monitoring Kafka Consumer Lag

Monitoring Databricks Prometheus Metrics

Software Engineer with entrepreneurial spirit. Passionate about building Machine Learning applications at scale. PhD in ECE, Univ. Minnesota. Caltech Alumnus.
