ML-E7: Options for deploying machine learning & deep learning models on AWS

5 min read · Jun 22, 2023

There are many ways to deploy machine learning and deep learning models on Amazon Web Services (AWS). I’ve tried two or three so far.


A serverless architecture example. CREDIT | AWS

CPU-based options

For GPU-based options see the next section.

The most cost-effective option is #4, AWS Lambda, because you pay only for actual execution time per call. It is only practical for intermediate-size models, around 2 GB.
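To make the Lambda option concrete, here is a minimal sketch of what an inference handler might look like. It assumes the function sits behind an API Gateway proxy-style integration, so the request arrives as a JSON string in event["body"]. DummyModel and the "features" field are hypothetical placeholders of mine, not part of any AWS API; a real function would load actual weights from its deployment package or container image.

```python
import json


class DummyModel:
    """Hypothetical stand-in for a real model (assumption, not an AWS class)."""

    def predict(self, features):
        return sum(features)


_MODEL = None  # module-level cache, reused while the Lambda container stays warm


def load_model():
    """Load the model once per container (cold start) and reuse it afterwards."""
    global _MODEL
    if _MODEL is None:
        _MODEL = DummyModel()
    return _MODEL


def lambda_handler(event, context):
    """Entry point with the standard Lambda signature (event, context)."""
    body = json.loads(event.get("body") or "{}")
    features = body.get("features")
    if not isinstance(features, list):
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "expected a 'features' list"}),
        }
    prediction = load_model().predict(features)
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

Caching the model in a module-level variable means the loading cost is paid on a cold start only, which matters when you are billed for execution time.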

1. Amazon S3 and EC2: You can store your model on Amazon S3 (Simple Storage Service) and run it on EC2 (Elastic Compute Cloud) instances. You choose an EC2 instance type with the CPU capacity your model needs. You can then build an application (a web or mobile app, say) that sends requests through an API to the EC2 instance, which runs the model and returns predictions.
2. Amazon SageMaker: This is a fully managed machine learning service that helps you build, train, and deploy machine learning models. SageMaker offers instance types based on CPU, GPU, or even custom hardware, which you select according to your model’s needs. It also supports automatic scaling of instances to handle varying loads.
3. AWS Elastic Beanstalk: It’s an easy-to-use service for deploying and scaling web…
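The S3 and EC2 option above can be sketched roughly as follows, using only the Python standard library for the API plus boto3 for the S3 download. This is one possible layout under stated assumptions, not a production setup: MeanModel, the "features" field, the port, and the helper names are all hypothetical, and the bucket and key are whatever you pass in.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_model_file(bucket, key, local_path):
    """Download model weights from S3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    boto3.client("s3").download_file(bucket, key, local_path)
    return local_path


class MeanModel:
    """Hypothetical placeholder; replace with your deserialized model."""

    def predict(self, features):
        return sum(features) / len(features)


def handle_predict(raw_body, model):
    """Turn a JSON request body into (status_code, JSON response bytes)."""
    try:
        features = json.loads(raw_body)["features"]
        result = {"prediction": model.predict(features)}
        return 200, json.dumps(result).encode()
    except (ValueError, KeyError, TypeError, ZeroDivisionError) as exc:
        return 400, json.dumps({"error": str(exc)}).encode()


def make_handler(model):
    """Build a request handler class bound to one loaded model."""

    class PredictHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            status, payload = handle_predict(self.rfile.read(length), model)
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(payload)

    return PredictHandler


def serve(model, port=8080):
    """Block forever, serving predictions on the given port."""
    HTTPServer(("0.0.0.0", port), make_handler(model)).serve_forever()
```

On the instance you would call fetch_model_file(...) once at startup, load the weights into your own model object, and then call serve(model). Your web or mobile app POSTs JSON such as {"features": [1, 2, 3]} to the instance and reads the prediction from the response.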

Written by Paul Pallaghy, PhD