Classifying Images Using AWS SageMaker

Neuralyte
4 min read · Sep 6, 2020

As machine learning becomes more prevalent in daily life, more models are being built to solve emerging problems. AWS SageMaker provides many of the common machine learning algorithms (PCA, k-means clustering, linear regression) as built-in templates, so that you can simply select an algorithm, pass in your input data, and run the training in an automated fashion. This article walks through the steps needed to train a basic image classifier from an AWS SageMaker template.

Step 1: Log in to your AWS account and navigate to SageMaker Training Jobs.

Step 2: Create a training job and fill in basic information like the job name, IAM role, and the algorithm that you wish to use. Make sure the IAM role grants access to every AWS resource the job uses, so that training does not fail when it reads from or writes to your S3 buckets. We will be using one of Amazon SageMaker's built-in algorithms; however, it is also possible to bring your own algorithm as a container image stored in ECR or to subscribe to one from the AWS Marketplace.
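As a rough illustration of the S3 permissions mentioned in Step 2, the sketch below builds a minimal IAM policy document in Python. The bucket name `my-image-bucket` and the helper `s3_training_policy` are made up for this example, and a real role will usually need more than this (for instance CloudWatch Logs and ECR access).

```python
import json


def s3_training_policy(bucket):
    """Build a minimal IAM policy document for one S3 bucket (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading inputs and writing model artifacts is granted on objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# "my-image-bucket" is a placeholder name, not a real bucket.
policy = s3_training_policy("my-image-bucket")
print(json.dumps(policy, indent=2))
```

The resulting JSON can be attached to the role as an inline policy in the IAM console.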

Step 3: You can provide your training data using either File or Pipe input mode. Pipe mode streams the data from S3 directly to the algorithm (typically in RecordIO format), which avoids copying large datasets back and forth, while File mode first copies the data from S3 or EFS onto the training instance and is best suited to smaller datasets. Take note of the metrics that are listed, as their values will be reported in more detail once the model has finished training.

Next, you can specify the number of instances and the additional storage volume for each instance. You can also choose the instance type depending on your compute workload; since the built-in image classification algorithm trains on GPUs, pick an accelerated computing instance type (ml.p2, ml.p3, ml.g4dn). Encryption can be enabled using KMS; however, for this project, it is not necessary. Lastly, you can set a maximum runtime after which the training job is stopped; this prevents the job from running indefinitely.
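For readers who prefer the API to the console, the instance and timeout settings above correspond to the `ResourceConfig` and `StoppingCondition` fields of the `CreateTrainingJob` request. The sketch below only builds those two fields; the helper name, the default values, and the instance-family check are assumptions made for illustration.

```python
# GPU instance families named in this article; the check is illustrative only.
GPU_PREFIXES = ("ml.p2.", "ml.p3.", "ml.g4dn.")


def resource_settings(instance_type="ml.p3.2xlarge", instance_count=1,
                      volume_gb=50, max_hours=2):
    """Build the resource and timeout parts of a CreateTrainingJob request."""
    if not instance_type.startswith(GPU_PREFIXES):
        raise ValueError(f"{instance_type} is not one of the GPU families listed")
    return {
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": volume_gb,  # additional storage per instance
        },
        "StoppingCondition": {
            # The job is stopped once this many seconds have elapsed.
            "MaxRuntimeInSeconds": max_hours * 3600,
        },
    }


settings = resource_settings()
print(settings)
```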

Step 4: You can run your training job inside a private VPC so that it can access specific resources, or enable network isolation to block any outbound network calls from the training container; however, for this project, this customization isn't necessary.
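If you do need these options later, they map to the `VpcConfig` and `EnableNetworkIsolation` fields of the `CreateTrainingJob` request. A minimal sketch, assuming a hypothetical helper name and placeholder subnet and security group IDs:

```python
def network_settings(subnet_ids=None, security_group_ids=None, isolate=False):
    """Build the optional networking parts of a CreateTrainingJob request."""
    settings = {"EnableNetworkIsolation": isolate}
    # VpcConfig is only added when both lists are supplied.
    if subnet_ids and security_group_ids:
        settings["VpcConfig"] = {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        }
    return settings


# Default used in this article: no VPC, no isolation.
default_network = network_settings()

# Placeholder IDs, shown only to illustrate the shape of the request.
private_network = network_settings(["subnet-0123example"], ["sg-0123example"], isolate=True)
print(default_network, private_network)
```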

Step 5: Hyperparameters like the number of layers and epochs for the model can be changed in this section. For automated hyperparameter tuning, navigate to the hyperparameter tuning jobs section of the SageMaker console and create a tuning job with the parameters that you want to evaluate. You can specify a range for each parameter and the objective metric to optimize (e.g. validation accuracy). Comment below for a more detailed explanation of this process.
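When a training job is created through the API instead of the console, hyperparameters are passed as a map of strings. The sketch below assembles such a map using hyperparameter names from the built-in image classification algorithm (`num_layers`, `epochs`, `num_classes`, and so on); the helper name and the default values are illustrative assumptions, not recommendations.

```python
def image_classifier_hyperparameters(num_classes, num_training_samples,
                                     num_layers=18, epochs=10,
                                     image_shape=(3, 224, 224),
                                     learning_rate=0.001, mini_batch_size=32):
    """Return hyperparameters as the string-to-string map the API expects."""
    raw = {
        "num_classes": num_classes,
        "num_training_samples": num_training_samples,
        "num_layers": num_layers,
        "epochs": epochs,
        # Channels, height, width joined as "3,224,224".
        "image_shape": ",".join(str(d) for d in image_shape),
        "learning_rate": learning_rate,
        "mini_batch_size": mini_batch_size,
    }
    # Every value must be a string in the request.
    return {name: str(value) for name, value in raw.items()}


# Two classes (e.g. red and green) with a made-up sample count.
hyperparameters = image_classifier_hyperparameters(num_classes=2, num_training_samples=1000)
print(hyperparameters)
```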

Step 6: Specify the input data configuration by creating channels of data that the model will use to train. When training on raw image files, there are four main channels: train, validation, train_lst, and validation_lst.

In the S3 bucket, the input images should be organized into one folder per label that you wish to classify (e.g. if there are two classes, red and green, then there would be two folders, one for each class). Additionally, there should be two list (.lst) files, one for training and one for validation, that record which class each image belongs to. Each line of a list file has three tab-separated columns: the image index, the numeric class label, and the relative path to the image in the S3 bucket.
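To make the list format concrete, here is a small sketch that builds the lines of such a file from a mapping of class folders to image file names. The function name, the alphabetical label assignment, and the file names are all assumptions chosen for this example.

```python
def build_lst_lines(images_by_class):
    """Return tab-separated 'index, label, relative path' lines for a list file."""
    lines = []
    index = 0
    # Class folders are sorted so that label numbers are assigned deterministically.
    for label, class_name in enumerate(sorted(images_by_class)):
        for filename in sorted(images_by_class[class_name]):
            lines.append(f"{index}\t{label}\t{class_name}/{filename}")
            index += 1
    return lines


# Hypothetical folder contents for the red/green example.
lines = build_lst_lines({"red": ["a.jpg", "b.jpg"], "green": ["c.jpg"]})
print("\n".join(lines))
```

Writing these lines to a file and uploading it to S3 gives a list file that a list channel can point to.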

Make sure to specify the input mode as either File or Pipe and the content type as either application/x-recordio (RecordIO format) or application/x-image (image format). For this particular project, we will be using S3 instead of a file system like EFS or FSx for Lustre.
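In API terms, each channel is one entry of the request's `InputDataConfig` list. The sketch below builds the four entries for image-format input, using the channel names from the built-in algorithm's documentation (train, validation, train_lst, validation_lst). The helper names and the S3 layout under `s3://my-image-bucket/demo` are hypothetical.

```python
def image_channel(name, s3_uri, content_type="application/x-image", input_mode="File"):
    """Build one InputDataConfig entry that reads from an S3 prefix."""
    return {
        "ChannelName": name,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        "ContentType": content_type,
        "InputMode": input_mode,
    }


def image_channels(base_uri):
    """Build the four channels, assuming one sub-prefix per channel name."""
    names = ["train", "validation", "train_lst", "validation_lst"]
    return [image_channel(name, f"{base_uri}/{name}/") for name in names]


# Placeholder bucket and prefix.
channels = image_channels("s3://my-image-bucket/demo")
for channel in channels:
    print(channel["ChannelName"], channel["DataSource"]["S3DataSource"]["S3Uri"])
```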

Step 7: Make sure to choose the S3 location for your output model and for any checkpoints saved along the way. Checkpoints can be used to resume a training job if it runs on a spot instance that gets terminated. Managed spot training can be used to reduce the cost of training models; however, the job could take longer because spot instances can be interrupted and the job then has to wait for capacity to become available again.
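The output, checkpoint, and spot options correspond to the `OutputDataConfig`, `CheckpointConfig`, `EnableManagedSpotTraining`, and `StoppingCondition` fields of the `CreateTrainingJob` request. The sketch below is one way to assemble them; the helper name, bucket, prefix, and time limits are placeholders.

```python
def output_and_spot_settings(bucket, prefix, max_runtime_s=7200, max_wait_s=14400):
    """Build output, checkpoint, and managed spot settings for a training job."""
    # With spot training, the total wait time must cover at least the runtime limit.
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= max_runtime_s")
    return {
        "OutputDataConfig": {"S3OutputPath": f"s3://{bucket}/{prefix}/output"},
        "CheckpointConfig": {"S3Uri": f"s3://{bucket}/{prefix}/checkpoints"},
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            # Includes time spent waiting for spot capacity.
            "MaxWaitTimeInSeconds": max_wait_s,
        },
    }


# Placeholder bucket and prefix.
spot_settings = output_and_spot_settings("my-image-bucket", "demo")
print(spot_settings)
```

Keeping the wait limit larger than the runtime limit leaves room for interruptions, which is where the extra training time tends to come from.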

Step 8: Create your training job and wait for the model to train. Open the training job to check its event history and the relevant metrics.
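The same status and metrics can be read programmatically with boto3's `describe_training_job`. The sketch below separates the AWS call from the parsing so the parsing can be tried without credentials; the helper names are hypothetical, and the sample response is a hand-written stand-in with placeholder values, not real results.

```python
def fetch_training_job(job_name):
    """Call the SageMaker API (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without it
    return boto3.client("sagemaker").describe_training_job(TrainingJobName=job_name)


def summarize_training_job(description):
    """Extract the status and final metrics from a describe_training_job response."""
    metrics = {
        metric["MetricName"]: metric["Value"]
        for metric in description.get("FinalMetricDataList", [])
    }
    return {
        "status": description["TrainingJobStatus"],
        "secondary_status": description.get("SecondaryStatus"),
        "metrics": metrics,
    }


# Hand-written stand-in for an API response; the values are placeholders.
sample_response = {
    "TrainingJobStatus": "Completed",
    "SecondaryStatus": "Completed",
    "FinalMetricDataList": [
        {"MetricName": "train:accuracy", "Value": 0.5},
        {"MetricName": "validation:accuracy", "Value": 0.5},
    ],
}
summary = summarize_training_job(sample_response)
print(summary)
```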
