Real-time Machine Learning Inference at Scale Using Spark Structured Streaming

A complete demo for developing locally and deploying on Databricks

Advantages of Spark Structured Streaming

1. Achieve virtually unlimited scalability

2. Achieve optimal performance

3. Fast turn-around for automation, testing, and debugging

4. Save engineering time

Demo: Getting Started

Local Deployment

Deploying on Databricks

Logging, Autoscaling, and Monitoring

Monitoring Kafka Consumer Lag
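
Consumer lag is the gap between the latest offset written to a partition and the offset the consumer group has committed for it. The sketch below shows only that arithmetic in plain Python. The function names `consumer_lag` and `total_lag` and the dictionary inputs are hypothetical, chosen for illustration; in a real setup the offsets would come from a Kafka client or an exporter, and treating a partition with no committed offset as starting from 0 is an assumption made here for simplicity.

```python
from typing import Dict


def consumer_lag(end_offsets: Dict[int, int],
                 committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Return per-partition lag: latest offset minus committed offset."""
    lag = {}
    for partition, end in end_offsets.items():
        # Assumption: a partition with no committed offset is read from 0.
        committed = committed_offsets.get(partition, 0)
        # Floor at 0 so a stale end-offset snapshot never yields negative lag.
        lag[partition] = max(end - committed, 0)
    return lag


def total_lag(end_offsets: Dict[int, int],
              committed_offsets: Dict[int, int]) -> int:
    """Sum the lag across all partitions of a topic."""
    return sum(consumer_lag(end_offsets, committed_offsets).values())
```

A total that keeps growing over time suggests the streaming job is not keeping up with the producers, which is the kind of signal an autoscaling or alerting rule could key on.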

Monitoring Databricks Prometheus Metrics

Software Engineer with entrepreneurial spirit. Passionate about building Machine Learning applications at scale. PhD in ECE, Univ. Minnesota. Caltech Alumnus.