The Fastest Way To Deploy AI Models

Vishal Rajput
Published in AIGuys · 7 min read · Jun 27, 2024


We all know that creating a model is one thing, but deploying it is a whole other level of work and effort. Even people who focus mainly on building models need to deploy them from time to time, especially for POCs or to pitch their ideas to others. So, in today’s blog, we are going to cover one of the fastest and easiest ways to deploy AI models: Modal. Modal takes care of a lot of infrastructure work that would otherwise cost hours of setup and debugging. Concretely, we are going to deploy two segmentation models behind a simple Streamlit app.

And no, it’s not a sponsored blog.

Introduction To Modal

Modal is designed to handle large-scale workloads efficiently, leveraging a custom-built container system in Rust for exceptionally fast cold-start times. This system design allows users to scale their applications to hundreds of GPUs and back down to zero within seconds, ensuring cost efficiency by paying only for the resources used. Modal supports rapid deployment of functions to the cloud with custom container images and hardware requirements, eliminating the need to write YAML configurations.
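To make that workflow concrete, here is a minimal sketch of what a Modal deployment looks like. This is an illustrative example, not the app built later in this post: the app name, function name, and pinned packages are placeholders, and the `gpu="T4"` choice is just one of the hardware options Modal accepts. Everything (container image, hardware, scaling) is declared in Python, with no YAML.

```python
import modal

# An App groups the functions you deploy together.
app = modal.App("segmentation-demo")  # placeholder name

# The container image is declared in code: base image + pip packages.
image = modal.Image.debian_slim().pip_install("torch", "torchvision")

# Attaching the image and a GPU type to a function is all the
# "infrastructure config" needed; Modal scales instances up and
# down (to zero) on demand.
@app.function(image=image, gpu="T4")
def predict(image_bytes: bytes) -> list:
    # Placeholder body: load your segmentation model and run inference here.
    ...
```

Running `modal deploy app.py` from the CLI then builds the image and pushes the function to the cloud, where it can be called remotely (e.g. with `predict.remote(...)`) from other code such as a Streamlit app.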

The platform excels in running machine learning models on GPUs, making it easy, cost-effective, and performant to execute large linear algebra tasks essential for contemporary ML models…
