Published in Analytics Vidhya


Amazon AI/ML Stack

Amazon Web Services (AWS) offers a series of services optimized specifically for Artificial Intelligence and Machine Learning workloads. These fit into three major tiers, as follows:

Application Services — These are domain-specific services that let us very quickly generate predictions from pre-trained models using simple API calls.

Platform Services — Unlike Application Services, platform services let us build customized Machine Learning and AI solutions through optimized and scalable options. The SageMaker service that we will discuss in this article falls in this tier.

Frameworks and Hardware — The tiers mentioned above run on top of the frameworks and hardware tier. This layer provides a wide range of optimized deep learning tools such as TensorFlow, Keras, PyTorch and Apache MXNet. A range of compute options (GPU, CPU) is also available.
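
To make the Application tier concrete, here is a minimal sketch of what calling a pre-trained model through a simple API looks like. The function only assembles the request parameters; the commented-out boto3 call (using Amazon Comprehend's sentiment detection as an example service) would additionally need AWS credentials configured.

```python
def build_sentiment_request(text, language="en"):
    """Assemble keyword arguments for a Comprehend detect_sentiment call."""
    return {"Text": text, "LanguageCode": language}

# With credentials in place, the request would be sent roughly like this:
#   import boto3
#   comprehend = boto3.client("comprehend")
#   response = comprehend.detect_sentiment(**build_sentiment_request("Great product!"))
#   print(response["Sentiment"])
```

The point is that no model training is involved at this tier: one API call against a pre-trained model returns the prediction.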

Amazon SageMaker

Now that we know where Amazon's SageMaker service falls, let's delve a bit deeper into it.

A generic Machine Learning Pipeline has the following primary modules:

· Data Extraction
· Data Processing
· Data Analysis
· Feature Engineering
· Model Training and Tuning
· Prediction Generation
· Deployment to End-User
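
The modules above can be sketched as plain functions chained together, which makes the flow of a generic pipeline explicit. All names and the toy "model" below are illustrative, not SageMaker APIs.

```python
def extract(source):
    # Data Extraction: pull raw records from a source (here, an in-memory list).
    return list(source)

def process(records):
    # Data Processing: drop incomplete records.
    return [r for r in records if r is not None]

def engineer_features(records):
    # Feature Engineering: derive a simple feature from each record.
    return [{"value": r, "squared": r ** 2} for r in records]

def train(rows):
    # Model Training: a stand-in "model" that memorizes the mean value.
    mean = sum(row["value"] for row in rows) / len(rows)
    return {"mean": mean}

def predict(model, rows):
    # Prediction Generation: score each row against the trained model.
    return [row["value"] > model["mean"] for row in rows]

raw = [1, None, 2, 3, None, 4]
rows = engineer_features(process(extract(raw)))
model = train(rows)
predictions = predict(model, rows)
```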

SageMaker combines these modules and works with three major components:

  • Build — Involves extracting data from S3 or whichever storage option is in use; processing and feature engineering follow.
  • Train — This component combines model training and model tuning.
  • Deploy — This component lets us deploy the model and save its predictions to the preferred storage location.

These components are independent of each other and can be used separately or even in required combinations.

Managing SageMaker components is straightforward through the Amazon SageMaker Console, whose clean layout makes the options for the different components easily accessible and configurable.


The build phase is where data first enters the pipeline. The easiest way to start is to create a SageMaker Notebook Instance. This not only lets us integrate the required code but also facilitates clear documentation and visualization.
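
As a sketch of what creating a notebook instance involves, the function below assembles the core parameters that boto3's `create_notebook_instance` call expects. The instance name, type and role ARN are placeholders, not real resources, and a real call needs AWS credentials.

```python
def notebook_instance_config(name, role_arn, instance_type="ml.t2.medium"):
    """Build keyword arguments for the SageMaker create_notebook_instance API."""
    return {
        "NotebookInstanceName": name,   # must be unique within the account/region
        "InstanceType": instance_type,  # compute size backing the notebook
        "RoleArn": role_arn,            # IAM role SageMaker assumes for S3 access etc.
    }

# With credentials configured, this would be submitted roughly like:
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_notebook_instance(**notebook_instance_config(
#       "demo-notebook", "arn:aws:iam::123456789012:role/demo-role"))
```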

The other option available for code integration is the Spark SDK, which enables integration of a Spark pipeline through AWS's Elastic MapReduce (EMR) service.


Setting up the training module in SageMaker is straightforward. The primary attractions of the training component are as follows:

  • Minimal setup requirements — On creating a simple training job, SageMaker takes care of the hardware requirements and backend jobs like fetching storage and instances.
  • Dataset management — SageMaker takes care of streaming data and also helps manage distributed computing facilities, which can increase the speed of training.
  • Containerization — All models in SageMaker, whether built-in ones like XGBoost and K-Means clustering or custom models integrated by the user, are stored in Docker containers. SageMaker manages and integrates these containers without any external aid from the user.
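
The three points above correspond to concrete fields in a training-job specification. The sketch below assembles such a specification as a plain dict, mirroring the parameters of the SageMaker `create_training_job` API: the Docker image holding the model, the compute resources SageMaker provisions, and the S3 input/output locations. All URIs and names are placeholders.

```python
def training_job_spec(job_name, image_uri, role_arn, train_s3, output_s3,
                      instance_type="ml.m5.xlarge", instance_count=1):
    """Build a create_training_job-style specification."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,       # Docker container holding the model
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "ResourceConfig": {                   # hardware SageMaker provisions for us
            "InstanceType": instance_type,
            "InstanceCount": instance_count,  # >1 enables distributed training
            "VolumeSizeInGB": 10,
        },
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3,            # SageMaker streams data in from here
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3},
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```

A real job would pass this spec to `boto3.client("sagemaker").create_training_job(**spec)`; the higher-level SageMaker Python SDK wraps the same fields in its `Estimator` class.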


There are several deployment options in SageMaker. With SageMaker's UI, deployment is a one-step process that provides high reliability, scalability and throughput.

Several models can be deployed behind the same endpoint (the point of deployment), so that the models can undergo A/B testing, which SageMaker supports.
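
SageMaker implements this traffic split with weighted production variants on one endpoint. The pure-Python simulation below illustrates the idea (the weighting scheme is illustrative, not SageMaker's internal routing code): each incoming request is routed to one variant in proportion to its weight.

```python
import random

def route_request(variants, rng):
    """Pick a variant name according to its traffic weight."""
    names = list(variants)
    weights = [variants[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)                        # seeded for reproducibility
variants = {"model-A": 0.9, "model-B": 0.1}   # 90/10 A/B split
hits = {"model-A": 0, "model-B": 0}
for _ in range(1000):
    hits[route_request(variants, rng)] += 1
```

Comparing the live metrics of the two variants, and then shifting the weights, is what makes gradual A/B rollouts possible on a single endpoint.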

One major advantage of the deployment facility is that SageMaker allows upgrades, updates and other modifications with zero downtime, owing to blue-green deployment (in which two similar production environments are live, so that if one goes down, the other keeps the service up and running).
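
A minimal sketch of the blue-green idea, assuming two environments of which only one serves live traffic at a time: an update is staged on the idle environment and traffic is then flipped over in one step, so the swap itself involves no downtime. The class and version labels are illustrative.

```python
class BlueGreenEndpoint:
    def __init__(self, model_version):
        # Both environments start on the same version; "blue" serves traffic.
        self.envs = {"blue": model_version, "green": model_version}
        self.live = "blue"

    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy(self, new_version):
        # Stage the new model on the idle environment...
        self.envs[self.idle()] = new_version
        # ...then flip traffic over in one atomic step.
        self.live = self.idle()

    def serving(self):
        return self.envs[self.live]

endpoint = BlueGreenEndpoint("v1")
endpoint.deploy("v2")
```

After the flip, the previous environment still holds "v1", which is also what makes instant rollback cheap.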

Batch predictions, which are often required in production, can also be carried out using SageMaker with specific instances that stream data in from and out of S3 and distribute the tasks among GPU or CPU instances (as per the configuration).
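
The sketch below illustrates the distribution step: records streamed from storage are chunked into batches and spread across a fixed number of instances. The round-robin strategy here is illustrative only, not SageMaker's actual scheduling.

```python
def assign_batches(records, n_instances, batch_size):
    """Split records into batches and assign them round-robin to instances."""
    batches = [records[i:i + batch_size]
               for i in range(0, len(records), batch_size)]
    assignments = {i: [] for i in range(n_instances)}
    for idx, batch in enumerate(batches):
        assignments[idx % n_instances].append(batch)
    return assignments

# 10 records, batches of 3, spread over 2 instances
work = assign_batches(list(range(10)), n_instances=2, batch_size=3)
```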


With this, we have come to the end of the basic concepts of Amazon SageMaker. Watch this space for a demo on how to get started with SageMaker, which will be published soon.

For any questions or suggestions, you can drop a mail at

Or DM me on LinkedIn

Looking forward to connecting with you and your ideas!




Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem

Samadrita Ghosh

Building Censius AI Observability- bridge to the next AI Gen. Tech writer in my free time, experienced working with Forbes Cloud100 & CB Insights AI100 brands
