Deploy Machine Learning Model using Amazon SageMaker (Part 3)

Olusegun Ajose
6 min read · May 9, 2022


In this article, we will continue our series on deploying a machine learning model using Amazon SageMaker.

This is part 3 of a series:

Set up SageMaker — Part 1

Data Preprocessing — Part 2

Train the Model — Part 3 (You are here)

So far, we have set up an Amazon SageMaker instance, preprocessed our data, and uploaded the data into Amazon S3. Now we are going to train our model.

Choose the Training Algorithm

Ideally, we would evaluate different models in order to find the one most suitable for our data. We can let SageMaker Autopilot find an appropriate model for this tabular dataset.

Amazon SageMaker Autopilot automates the machine learning workflow by automatically building, training, and tuning the best machine learning models based on your dataset.

Amazon SageMaker Autopilot

For simplicity, we won't be using Amazon SageMaker Autopilot; instead, we will use the built-in SageMaker XGBoost algorithm.

Run a model training job

The first thing we have to do is to run a model training job.

The Amazon SageMaker Python SDK provides framework estimators and generic estimators to train your model while orchestrating the machine learning (ML) lifecycle, accessing the SageMaker features for training and the AWS infrastructure it relies on, such as Amazon Elastic Container Registry (Amazon ECR), Amazon Elastic Compute Cloud (Amazon EC2), and Amazon Simple Storage Service (Amazon S3).

We start by importing the Amazon SageMaker Python SDK and using it to retrieve basic information from the current SageMaker session.
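
A minimal sketch of that cell (region and role are the names we'll refer to later):

```python
import sagemaker

# Current AWS Region where the SageMaker notebook instance is running
region = sagemaker.Session().boto_region_name
print("AWS Region: {}".format(region))

# IAM role attached to the notebook instance
role = sagemaker.get_execution_role()
print("RoleArn: {}".format(role))
```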

This returns two pieces of information. The first is the AWS Region where the SageMaker notebook instance is currently running; in this example, that is the eu-west-3 (Paris) Region.

The second output is the IAM role used by the notebook instance we created earlier.

We can check the SageMaker Python SDK version by running the following command.
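
For example, in a notebook cell:

```python
import sagemaker

# Print the installed SageMaker Python SDK version
print(sagemaker.__version__)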

The current version is 2.86.2. If the version is lower than 2.20, you will have to upgrade it with the following command.
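
A typical upgrade cell looks like this (restart the notebook kernel afterwards so the new version is picked up):

```python
# Upgrade the SageMaker Python SDK in place (quiet mode)
! pip install -qU sagemaker
```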

Next, we are going to create an XGBoost estimator using the sagemaker.estimator.Estimator class.
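
A sketch of that cell, assuming the bucket and prefix variables defined in Part 2 and XGBoost container version 1.2-1 (your container version may differ):

```python
from sagemaker.debugger import Rule, rule_configs

# S3 location where SageMaker stores the model artifact and training results
# (bucket and prefix were defined in Part 2)
s3_output_location = "s3://{}/{}/{}".format(bucket, prefix, "xgboost_model")

# Retrieve the URI of the built-in XGBoost training container for this Region
container = sagemaker.image_uris.retrieve("xgboost", region, "1.2-1")

xgb_model = sagemaker.estimator.Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size=5,
    output_path=s3_output_location,
    sagemaker_session=sagemaker.Session(),
    # Built-in Debugger rule that generates the XGBoost training report
    rules=[Rule.sagemaker(rule_configs.create_xgboost_report())],
)
```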

In the code above, the XGBoost estimator is named xgb_model. To construct it, we need to specify some parameters.

image_uri specifies the training container image. In this example, the SageMaker XGBoost training container URI is retrieved with the sagemaker.image_uris.retrieve function.

role is the AWS Identity and Access Management (IAM) role that SageMaker uses to perform tasks on your behalf. Some of these tasks include reading training results, fetching model artifacts from Amazon S3, and writing training results to Amazon S3.

instance_count and instance_type specify the number and type of Amazon EC2 ML compute instances to use for model training. For this training exercise, we use an ml.m5.xlarge instance (you can choose a different instance type if you like).

volume_size is the size (in GB) of the EBS storage volume to attach to the training instance, and this must be large enough to store training data if you use file mode. File mode is on by default, so we don’t have to worry about that.

output_path is the path to the S3 bucket where SageMaker stores the model artifact and training results.

sagemaker_session is the session object that manages interactions with SageMaker API operations, and AWS services that the training job uses.

rules specifies a list of SageMaker Debugger built-in rules. In this example, we use the create_xgboost_report rule, which generates an XGBoost report providing insights into the training progress and results. We will check this report out later.

Setting the hyperparameters

Next, we set the hyperparameters for the XGBoost algorithm by calling the set_hyperparameters method of the estimator. The complete list of XGBoost hyperparameters is available in the SageMaker documentation.
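
A sketch with illustrative values for a binary classification objective (your values may differ):

```python
xgb_model.set_hyperparameters(
    max_depth=5,
    eta=0.2,
    gamma=4,
    min_child_weight=6,
    subsample=0.7,
    objective="binary:logistic",
    num_round=1000,
)
```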

You can also tune hyperparameters using the SageMaker hyperparameter optimization features.

Configure data input flow for training

Next, we use the TrainingInput class to configure a data input flow for training.
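
A sketch, assuming the data/train.csv and data/validation.csv keys under the same bucket and prefix used when uploading in Part 2:

```python
from sagemaker.session import TrainingInput

# Point SageMaker at the CSV datasets uploaded to S3 in Part 2
train_input = TrainingInput(
    "s3://{}/{}/{}".format(bucket, prefix, "data/train.csv"), content_type="csv"
)
validation_input = TrainingInput(
    "s3://{}/{}/{}".format(bucket, prefix, "data/validation.csv"), content_type="csv"
)
```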

The code above shows how to configure TrainingInput objects to use the training and validation datasets you uploaded to Amazon S3 in the Split the Dataset into Train, Validation, and Test Datasets section in Part 2.

Train the model

Now that we have configured the training job, we are finally going to train the model.

Here, we call the estimator’s fit method with the training and validation datasets. By setting wait=True, the fit method displays progress logs and waits until training is complete.
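
A sketch of that call:

```python
# Launch the training job; wait=True streams progress logs to the notebook
xgb_model.fit({"train": train_input, "validation": validation_input}, wait=True)
```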

The output of the code above shows that the training job has started, which can take a while.

Once the training is done, the output shows that the job is complete: "Training job completed" is printed, along with the Training seconds and Billable seconds.

Download an XGBoost Training Report

After the training job has completed, you can download an XGBoost training report and a profiling report generated by SageMaker Debugger.

The XGBoost training report offers you insights into the training progress and results, such as the loss function with respect to iteration, feature importance, confusion matrix, accuracy curves, and other statistical results of training.

We run the following code to specify the S3 bucket URI where the Debugger training reports are generated and check whether the reports exist.
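
Something along these lines; Debugger writes rule output under the training job's output path:

```python
rule_output_path = (
    xgb_model.output_path
    + "/"
    + xgb_model.latest_training_job.job_name
    + "/rule-output"
)

# List the generated report files in S3
! aws s3 ls {rule_output_path} --recursive
```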

Next, we download the Debugger XGBoost training and profiling reports to the current workspace.
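
For example:

```python
# Copy the Debugger report folders from S3 into the notebook workspace
! aws s3 cp {rule_output_path} ./ --recursive
```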

Then, we run the following IPython script to get the file link of the XGBoost training report.
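
A sketch, assuming the report was downloaded into the CreateXgboostReport folder (the folder named after the Debugger rule):

```python
from IPython.display import FileLink

# Render a clickable link to the downloaded XGBoost training report
display(
    "Click the link below to view the XGBoost training report",
    FileLink("CreateXgboostReport/xgboost_report.html"),
)
```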

Now, the following script returns the file link of the Debugger profiling report, which shows summaries and details of the EC2 instance resource utilization, system bottleneck detection results, and Python operation profiling results.
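
A sketch that looks up the profiler rule name from the training job summary:

```python
from IPython.display import FileLink

# Find the profiler rule configuration name from the training job summary
profiler_report_name = [
    rule["RuleConfigurationName"]
    for rule in xgb_model.latest_training_job.rule_job_summary()
    if "Profiler" in rule["RuleConfigurationName"]
][0]

# Render a clickable link to the downloaded profiler report
display(
    "Click the link below to view the profiler report",
    FileLink(profiler_report_name + "/profiler-output/profiler-report.html"),
)
```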

Once you run this block of code, you can click the link to view the profiler report. The report shows a summary and statistics of the training job. Here, you have the rules summary, covering rules such as GPUMemoryIncrease, CPUBottleneck, MaxInitializationTime, LowGPUUtilization, and LoadBalancing. You can also see the start time, end time, and job duration.

View the location of the model artifact

Now that we have a trained XGBoost model, SageMaker stores the model artifact in your S3 bucket. To find the location of the model artifact, run the following code to print the model_data attribute of the xgb_model estimator.
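
For example:

```python
# S3 URI of the trained model artifact (model.tar.gz)
print(xgb_model.model_data)
```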

In the next part, we will Deploy the Model to Amazon EC2.

Here is a link to the notebook on GitHub.
