Start Your Machine Learning on AWS SageMaker

Guang X
Feb 13, 2019 · 7 min read

What will be covered in this article:

  • How SageMaker works
  • How to prepare a model for SageMaker
  • How to use AWS Lambda to trigger model training and deployment automatically

source code for this article can be found at

1. SageMaker Introduction

API levels

High level (python-sagemaker-sdk): you only define the functions for model training, prediction, data input and data output, then submit or deploy the source code with the necessary configuration (e.g. instance types). The SDK takes care of the rest (e.g. loading data from S3, creating the training job, publishing the model endpoint).

Mid level (boto3): besides defining the source code, you also upload it to S3 yourself, specify the S3 URL of the source code, and set up all other configuration explicitly.

Low level (your own Docker container): essentially, SageMaker does everything within a container. Users can create their own Docker container and make it do whatever they want. In SageMaker, these containers are called Algorithms.

In this article, I will cover the usage of python-sagemaker-sdk and boto3. Defining your own Docker container (low level API) is only necessary when your ML model is not based on any of the frameworks SageMaker supports: Scikit-Learn, TensorFlow, PyTorch, …

SageMaker Modules


SageMaker provides many modules. In this article, the following modules will be used:

  • Notebook instances: fully managed Jupyter notebook instances where you can test your machine learning code with access to all other AWS services (e.g. S3)
  • Training jobs: the place to manage model training jobs
  • Models: the place to manage trained models
  • Endpoints: fully managed web services that take requests (HTTP or others) as input and return predictions as responses
  • Endpoint configurations: configuration for endpoints

2. Prepare SageMaker Model

Main script

  • It will be run by SageMaker, with several command line arguments and environment variables passed to it.
  • It should load the training data from the --train directory and write the trained model (usually a binary file) into --model-dir
  • Example code looks like this:
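
A minimal sketch of such a main script could look like the following. It assumes a Scikit-learn classifier, CSV training files whose last column is the label, a model file called model.joblib and an n-estimators hyperparameter; all of these are illustrative choices rather than details from this article. SM_MODEL_DIR and SM_CHANNEL_TRAIN are the environment variables that the SageMaker framework containers set for the model directory and the train channel.

```python
import argparse
import glob
import os


def parse_args(argv=None):
    """Read the locations SageMaker passes as arguments or environment variables."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR"))
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAIN"))
    # Hypothetical hyperparameter, only here to show how one is received.
    parser.add_argument("--n-estimators", type=int, default=100)
    # Ignore any extra arguments the container may pass.
    args, _unknown = parser.parse_known_args(argv)
    return args


def train(train_dir, model_dir, n_estimators):
    """Fit a model on every CSV file in train_dir and save it into model_dir."""
    # Third-party imports are kept local so that parse_args() can be used
    # on a machine without the ML stack installed.
    import joblib
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    files = sorted(glob.glob(os.path.join(train_dir, "*.csv")))
    data = pd.concat(pd.read_csv(path) for path in files)
    # Assumption: the last column is the label, the other columns are features.
    features, labels = data.iloc[:, :-1], data.iloc[:, -1]

    model = RandomForestClassifier(n_estimators=n_estimators)
    model.fit(features, labels)
    joblib.dump(model, os.path.join(model_dir, "model.joblib"))


if __name__ == "__main__":
    args = parse_args()
    # Only train when both locations are known, which is the case inside SageMaker.
    if args.train and args.model_dir:
        train(args.train, args.model_dir, args.n_estimators)
```

Reading the defaults from environment variables means the same script can also be started by hand, e.g. with --train and --model-dir pointing at local folders.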

Besides the main script, the following functions are defined for model deployment:

  • model_fn: loads saved model binary file(s)
  • input_fn: parses input data coming from the calling interface (HTTP request or Python function call)
  • predict_fn: takes the parsed input and makes predictions with the model loaded by model_fn
  • output_fn: encodes the prediction made by predict_fn and returns it to the corresponding interface (HTTP response or Python function return)
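
As a sketch of the first of these functions, model_fn for a Scikit-learn model can be as small as the following. The file name model.joblib is an assumption and has to match whatever the training script actually saved.

```python
import os


def model_fn(model_dir):
    """Load the model that the training script saved into model_dir."""
    # joblib ships with the SageMaker Scikit-learn container.
    import joblib

    # Assumption: the training script saved the model as "model.joblib".
    return joblib.load(os.path.join(model_dir, "model.joblib"))
```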

Usually, input_fn should first check the type of the input data (request_content_type) before parsing it. You can also do data pre-processing here. Example code looks like this:
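
One possible shape for input_fn is sketched below. The JSON layout with an "instances" key and the conversion of every value to float are assumptions made for this illustration, not a format that SageMaker prescribes.

```python
import csv
import io
import json


def input_fn(request_body, request_content_type):
    """Parse the request payload into a list of feature rows."""
    if isinstance(request_body, bytes):
        request_body = request_body.decode("utf-8")

    if request_content_type == "application/json":
        # Assumption: the payload looks like {"instances": [[...], [...]]}.
        rows = json.loads(request_body)["instances"]
    elif request_content_type == "text/csv":
        rows = list(csv.reader(io.StringIO(request_body)))
    else:
        raise ValueError(f"Unsupported content type: {request_content_type}")

    # Simple pre-processing step: make sure every value is a float.
    return [[float(value) for value in row] for row in rows]
```

Rejecting unknown content types early gives the caller a clear error instead of a confusing failure inside the model.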


In predict_fn, we usually need to do some data transformation before prediction. Example code looks like this:
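
A hedged sketch of predict_fn, together with a matching output_fn, is shown below. The log1p transformation is purely hypothetical and stands in for whatever transformation was applied at training time; the "predictions" key in the response is likewise an assumption.

```python
import json
import math


def predict_fn(input_data, model):
    """Transform the parsed rows, then ask the model for predictions."""
    # Hypothetical transformation: apply log(1 + x) to every feature,
    # mirroring a transformation assumed to be used during training.
    rows = [[math.log1p(value) for value in row] for row in input_data]
    return model.predict(rows)


def output_fn(prediction, accept):
    """Encode the predictions for the caller."""
    if accept == "application/json":
        # tolist() turns a NumPy array into plain lists; other sequences are copied.
        values = prediction.tolist() if hasattr(prediction, "tolist") else list(prediction)
        return json.dumps({"predictions": values})
    raise ValueError(f"Unsupported accept type: {accept}")
```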


All above functions should be put into a python script (let’s say it is ), then we can use python-sagemaker-sdk to test our model for SageMaker in our local environment.

3. Model development with python-sagemaker-sdk

We use from sagemaker.sklearn import SKLearn if our model is based on Scikit-learn. If the model is based on TensorFlow, we can use from sagemaker.tensorflow import TensorFlow instead.

The meaning of these arguments can be found in the official SageMaker documentation for Scikit-learn, TensorFlow, and PyTorch.
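
To give an idea of what these arguments typically look like, here is a sketch using the argument names of version 2 of the SDK (older versions, including the one current when this article was written, used train_instance_type and train_instance_count instead). The script name train.py, the instance types, the framework version and the hyperparameter are all assumptions.

```python
def estimator_settings(role_arn):
    """Arguments for sagemaker.sklearn.SKLearn (SDK v2 argument names)."""
    return {
        "entry_point": "train.py",         # hypothetical script name
        "role": role_arn,                  # IAM role that SageMaker assumes
        "instance_type": "ml.m5.large",    # assumed training instance type
        "instance_count": 1,
        "framework_version": "1.2-1",      # assumed container version, adjust as needed
        "hyperparameters": {"n-estimators": 100},  # passed to the script as --n-estimators 100
    }


def train_and_deploy(role_arn, train_s3_uri):
    """Train on the data under train_s3_uri, then publish an endpoint."""
    from sagemaker.sklearn import SKLearn  # requires the sagemaker package

    estimator = SKLearn(**estimator_settings(role_arn))
    # The dictionary key becomes the channel name, i.e. the --train directory.
    estimator.fit({"train": train_s3_uri})
    return estimator.deploy(initial_instance_count=1, instance_type="ml.t2.medium")
```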

Debug Locally
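
One way to debug locally is the SDK's local mode, which runs the same container on your own machine. It needs Docker and the local extras of the SDK (pip install 'sagemaker[local]'). The sketch below only shows the two settings that change; the helper names are hypothetical.

```python
import os


def local_settings(settings):
    """Return a copy of the estimator arguments that runs in local mode."""
    local = dict(settings)
    local["instance_type"] = "local"  # run the container on this machine via Docker
    return local


def local_channel(path):
    """Local mode can read training data from a file:// URI instead of s3://."""
    return {"train": "file://" + os.path.abspath(path)}


# Usage, assuming `settings` holds the usual SKLearn arguments:
#   estimator = SKLearn(**local_settings(settings))
#   estimator.fit(local_channel("./data"))
#   predictor = estimator.deploy(initial_instance_count=1, instance_type="local")
```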

Test Published Endpoint
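
A published endpoint can be called with the invoke_endpoint API of the boto3 "sagemaker-runtime" client. In the sketch below, the client is passed in as an argument, and the JSON payload with an "instances" key is an assumption that has to match what your input_fn expects.

```python
import json


def predict_via_endpoint(runtime_client, endpoint_name, rows):
    """Send feature rows to a deployed endpoint and decode the JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps({"instances": rows}),
    )
    # The body is a streaming object, so it has to be read before decoding.
    return json.loads(response["Body"].read())


# Usage, with a hypothetical endpoint name:
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   predict_via_endpoint(runtime, "my-endpoint", [[5.1, 3.5, 1.4, 0.2]])
```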

4. Using Lambda Function to control SageMaker

In the real world, we typically receive new data every day and need to retrain the machine learning model periodically.

Moreover, model training usually takes a long time, and we need to make sure that once the training job is done, the trained model is automatically deployed to the existing endpoint (SageMaker does not do this automatically).

There are two solutions for this: Step Functions and Lambda

  • Solution 1: Step Functions. We can define a lambda function that checks the status of a training job, then use a step function to call it periodically (e.g. every hour). Once training is done, another lambda function is called to deploy the model. This solution has been well documented in this article
  • Solution 2: Lambda. Once a model training job is finished, the trained model is written into an S3 bucket. An S3 PUT event can be associated with a lambda function to trigger model deployment.

This article demonstrates solution 2: how to use a lambda function and S3 events to manage SageMaker.


The lambda_handler takes the argument event, which contains information about how the lambda was triggered. By interpreting the event, we can perform different actions within one lambda handler. The following is an example of the lambda function we use:

Three different events are handled here:

  • If the input event is an S3 event, it calls handle_s3_event(records['s3'])
  • Otherwise, if the event is a model re-train task, it calls retrain_the_model()
  • And if the event is a prediction task, it calls make_prediction() and passes the result back as the response
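
The dispatch logic described above can be sketched as follows. The three helper functions are placeholders that only report what they would do, and the "task" and "data" keys are assumptions about the JSON we choose to send to the lambda; only the "Records" and "s3" keys come from the S3 event format itself.

```python
def handle_s3_event(s3_info):
    """Placeholder: this is where the new model would be deployed."""
    return {"action": "deploy", "key": s3_info["object"]["key"]}


def retrain_the_model():
    """Placeholder: this is where a training job would be created."""
    return {"action": "retrain"}


def make_prediction(data):
    """Placeholder: this is where the endpoint would be invoked."""
    return {"action": "predict", "data": data}


def lambda_handler(event, context):
    """Decide what to do based on the shape of the incoming event."""
    if "Records" in event:
        # S3 notifications arrive as a list of records, each with an "s3" entry.
        return [handle_s3_event(record["s3"]) for record in event["Records"] if "s3" in record]

    task = event.get("task")  # assumed key, set by whoever sends the event
    if task == "retrain":
        return retrain_the_model()
    if task == "prediction":
        return make_prediction(event.get("data"))
    raise ValueError(f"Unsupported event: {event}")
```

Raising an error for unknown events makes a misconfigured trigger show up in the lambda logs instead of being silently ignored.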

Trigger model retrain task periodically


A Lambda function can be triggered periodically by a CloudWatch event.

We can add a CloudWatch Event trigger and set up a Rule.

  • Event Source should be Schedule, with a custom cron expression (very similar to crontab on Linux)
  • Set Targets to the lambda function we use to control SageMaker.
  • Configure input can be set to Constant (JSON text), so that the lambda_handler knows what to do with the event.
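
The same rule can also be created from code with the boto3 "events" client, roughly as sketched below. The rule name, the target ID, the schedule and the {"task": "retrain"} input are assumptions chosen for this illustration.

```python
import json


def schedule_retraining(events_client, lambda_arn, rule_name="retrain-model-daily"):
    """Create a scheduled rule that sends a constant JSON input to the lambda."""
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression="cron(0 2 * * ? *)",  # every day at 02:00 UTC
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "retrain-lambda",  # hypothetical target ID
                "Arn": lambda_arn,
                # Constant JSON text that lambda_handler receives as its event.
                "Input": json.dumps({"task": "retrain"}),
            }
        ],
    )
    return rule["RuleArn"]
```

When the rule is created this way instead of through the console, the lambda function also needs a resource-based permission that allows events.amazonaws.com to invoke it.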

Model Retrain

This function will create a model training job.
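
With boto3, creating the training job comes down to one create_training_job call. The sketch below separates building the request from sending it. The job name prefix, the script name train.py, the instance type and the limits are assumptions; the two sagemaker_* hyperparameters are how the framework containers are told where the source code is, and they are JSON-encoded here in the same way the SDK encodes them.

```python
import json
import time


def build_training_job_request(role_arn, image_uri, source_s3_uri, train_s3_uri, output_s3_uri):
    """Assemble the arguments for the SageMaker create_training_job API."""
    job_name = "sklearn-retrain-" + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,
        "AlgorithmSpecification": {"TrainingImage": image_uri, "TrainingInputMode": "File"},
        "HyperParameters": {
            "sagemaker_program": json.dumps("train.py"),  # hypothetical script name
            "sagemaker_submit_directory": json.dumps(source_s3_uri),
        },
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {"InstanceType": "ml.m5.large", "InstanceCount": 1, "VolumeSizeInGB": 10},
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def retrain_the_model(sm_client, **locations):
    """Start a training job and return its name."""
    request = build_training_job_request(**locations)
    sm_client.create_training_job(**request)
    return request["TrainingJobName"]
```

Here sm_client would be boto3.client("sagemaker"), and image_uri is the URI of the framework container image for your region.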

Please note that python-sagemaker-sdk is not available in the Lambda environment by default, so the simplest solution is to use boto3. That means we need to upload the source code to S3 as a gzipped tar package ourselves, and set up all other configuration explicitly, as done in the code above.
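
Packaging and uploading the source code can be done with the standard tarfile module and the boto3 S3 client. In this sketch the archive name sourcedir.tar.gz is an assumption (it happens to be the name the SDK uses), and the bucket and key are up to you.

```python
import os
import tarfile


def package_source(script_path, archive_path="sourcedir.tar.gz"):
    """Put the script into a gzipped tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # Store the script at the archive root so the container can find it.
        archive.add(script_path, arcname=os.path.basename(script_path))
    return archive_path


def upload_source(s3_client, archive_path, bucket, key):
    """Upload the archive and return the S3 URL to pass to the training job."""
    s3_client.upload_file(archive_path, bucket, key)
    return f"s3://{bucket}/{key}"
```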

Model Deployment

Once a training job is done, an S3 PUT event will be triggered, which can notify our lambda function that training is done and we can do deployment now.

Within an S3 PUT event, the key of the S3 object is provided, which is usually related to the unique ID of our training job, as is done in the following code:
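
A sketch of that lookup is shown below. It relies on the layout SageMaker uses for training output, <prefix>/<training job name>/output/model.tar.gz, and on the fact that object keys in S3 events are URL-encoded. The function name is hypothetical.

```python
from urllib.parse import unquote_plus


def training_job_from_s3_event(s3_info):
    """Return the training job name for a model artifact key, or None."""
    key = unquote_plus(s3_info["object"]["key"])
    parts = key.split("/")
    # Expected layout: <prefix>/<training job name>/output/model.tar.gz
    if len(parts) < 3 or parts[-2:] != ["output", "model.tar.gz"]:
        return None
    return parts[-3]
```

Returning None for every other key lets the handler ignore unrelated objects that are written to the same bucket.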

Once the model training job ID is known, we can then call the deploy_model function to deploy our model, which looks like this:

With that, everything is set up: the model is retrained periodically, and the endpoint is updated with the newly trained model each time.
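
A hedged sketch of such a deploy_model function with boto3 follows. It looks up the finished training job, registers a model and an endpoint configuration named after the job, and points the existing endpoint at the new configuration. The variant name, the instance type and the reuse of the job name for the model are assumptions; the two SAGEMAKER_* environment variables tell the framework container where the inference code is.

```python
def deploy_model(sm_client, training_job_name, endpoint_name, instance_type="ml.t2.medium"):
    """Deploy the artifact of a finished training job to an existing endpoint."""
    job = sm_client.describe_training_job(TrainingJobName=training_job_name)
    hyperparameters = job.get("HyperParameters", {})

    sm_client.create_model(
        ModelName=training_job_name,
        ExecutionRoleArn=job["RoleArn"],
        PrimaryContainer={
            "Image": job["AlgorithmSpecification"]["TrainingImage"],
            "ModelDataUrl": job["ModelArtifacts"]["S3ModelArtifacts"],
            "Environment": {
                # The values may be stored JSON-encoded, so strip the quotes.
                "SAGEMAKER_PROGRAM": hyperparameters.get("sagemaker_program", "").strip('"'),
                "SAGEMAKER_SUBMIT_DIRECTORY": hyperparameters.get(
                    "sagemaker_submit_directory", ""
                ).strip('"'),
            },
        },
    )
    sm_client.create_endpoint_config(
        EndpointConfigName=training_job_name,
        ProductionVariants=[
            {
                "VariantName": "AllTraffic",
                "ModelName": training_job_name,
                "InitialInstanceCount": 1,
                "InstanceType": instance_type,
            }
        ],
    )
    # The endpoint must already exist; it keeps serving while it is updated.
    sm_client.update_endpoint(EndpointName=endpoint_name, EndpointConfigName=training_job_name)
```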

Wrap Up

To use SageMaker for Machine Learning, the most important step is to prepare a script that defines the behaviours of your model. You can also take full control of the whole system by creating your own Docker container.

Other things to be aware of

  • S3 events are not guaranteed to be delivered 100% of the time. If model training is critical, Step Functions is a better choice
  • Make sure your AWS role has sufficient permissions to control the necessary resources
  • SageMaker can also run batch prediction jobs, and many other features remain to be explored.


Thanks to Marat Levit

Written by

Guang X



At Servian, we design, deliver and manage innovative data & analytics, digital, customer engagement and cloud solutions that help you sustain competitive advantage.
