Streaming Model inference using Flask and Kafka

Kafka made easy with Flask

Vatsal Saglani
Geek Culture


What is Kafka?

Apache Kafka is a highly fault-tolerant event streaming platform. In event streaming, data is captured in real time from different event sources: your web analytics, a thermostat, or even a database. Beyond capturing data, Kafka ships with a range of tools for manipulating and processing those streams, and it lets you allocate resources efficiently by prioritising highly critical processes over moderately impactful ones. That is Kafka and event streaming in short. To learn more about Kafka, check out this video.
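To make the "capture events from different sources" idea concrete, here is a minimal sketch of publishing a sensor event to a Kafka topic. It assumes the `kafka-python` package and a broker on `localhost:9092`; the topic name `sensor-events` and the `encode_event` helper are illustrative, not part of any standard API.

```python
import json


def encode_event(source: str, payload: dict) -> bytes:
    """Serialize an event (e.g. a thermostat reading) as UTF-8 JSON bytes."""
    return json.dumps({"source": source, **payload}).encode("utf-8")


def main():
    # Assumes kafka-python is installed and a broker is running locally.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    # Publish one thermostat reading to the (illustrative) sensor-events topic.
    producer.send("sensor-events", encode_event("thermostat", {"temp_c": 21.5}))
    producer.flush()


if __name__ == "__main__":
    main()
```

Any downstream consumer subscribed to the topic then receives these events in the order they were produced within a partition.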

How does Kafka help with streaming model inferences?

Most deep learning models are deployed with Flask behind REST API calls and then served with an application server such as gunicorn or uvicorn, tuned with different numbers of workers and threads. This works fine as long as you are serving a single model whose inference does not take much time. But if the output combines results from several models, or there are multiple processing steps after inference, it is better to use a stream processing pipeline.
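One way to decouple the API from a slow, multi-step inference pipeline is to have the Flask endpoint publish each request to a Kafka topic and return immediately, while separate consumer workers run the models. A minimal sketch, assuming Flask and `kafka-python` are installed; the endpoint, topic name, and `make_task` helper are all illustrative:

```python
import json
import uuid


def make_task(text: str) -> bytes:
    """Wrap an incoming request as a task message with a tracking id."""
    return json.dumps({"task_id": str(uuid.uuid4()), "text": text}).encode("utf-8")


def create_app():
    # Assumes Flask and kafka-python are installed and a broker runs locally.
    from flask import Flask, jsonify, request
    from kafka import KafkaProducer

    app = Flask(__name__)
    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    @app.route("/predict", methods=["POST"])
    def predict():
        msg = make_task(request.get_json()["text"])
        # Consumer workers subscribed to this topic run the model pipeline.
        producer.send("inference-requests", msg)
        return jsonify({"status": "queued",
                        "task_id": json.loads(msg)["task_id"]})

    return app
```

The client polls (or is notified) with the `task_id` once the workers finish, so a slow chain of models never blocks a web worker thread.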

Data Science Lead - GenAI. A Software Engineer, Programmer & Deep Learning professional. https://vatsalsaglani.vercel.app