MLOps: Serving AI apps to millions of users
16 min read · Jul 21, 2023
MLOps is the new kid in town, grabbing all the attention 🔥 right now. Because the field is still maturing, there is a lot of confusion about how to deploy models to a large user base. Spinning up a single EC2 instance and putting the model on it will only work for a few practical use cases.
In today’s blog, we will deploy an image segmentation model on Google Cloud and make our server auto-scale to millions of users 🚀. This blog is divided into the following parts:
- Creating a segmentation model and compressing it using quantization