Getting started with AWS Batch and Serverless Framework

Paulo Carneiro
Published in birdie.ai
Jan 25, 2021

If you are a software engineer or a data scientist, you have probably worked for days on a script without thinking much about the engineering involved in deploying it. At some point in the process you may wonder: now what? How do I deploy and run this script in a scalable way?

This series of articles introduces AWS serverless services so that you can answer this question more easily, letting you focus only on the development of your scripts and leave all the infrastructure in the hands of AWS. The first serverless service that we will introduce is AWS Batch.

AWS Batch is a service that helps you run batch workloads on AWS infrastructure, providing great computational power without the need to maintain and configure the infrastructure yourself.

Here at Birdie, we use AWS Batch with GPUs to deploy our machine learning models: reading batches of data from S3, predicting, and saving the predictions back into S3. We also use AWS Batch for simpler tasks, but with Fargate as the compute environment.

We can break AWS Batch down into four components:

  • Jobs: A unit of work (such as a shell script, a Linux executable, or a Docker container image) that you submit to AWS Batch. It has a name and runs as a containerized application on AWS Fargate or Amazon EC2 resources in your compute environment, using parameters that you specify in a job definition.
  • Job Definition: A job definition specifies how jobs are to be run; you can think of it as a blueprint for the resources in your job. You can supply your job with an IAM role to provide programmatic access to other AWS resources, and you specify both memory and CPU requirements. The job definition can also control container properties, environment variables, and mount points for persistent storage. Many of the specifications in a job definition can be overridden by specifying new values when submitting individual Jobs.
  • Job Queues: When you submit an AWS Batch job, you submit it to a particular job queue, where it resides until it is scheduled onto a Compute environment. You associate one or more compute environments with a job queue, and you can assign priority values for these compute environments and even across job queues themselves. For example, you could have a high priority queue that you submit time-sensitive jobs to, and a low priority queue for jobs that can run anytime when compute resources are cheaper.
  • Compute Environment: A compute environment is a set of managed or unmanaged compute resources that are used to run jobs. Managed compute environments allow you to specify the desired compute type (Fargate or EC2) at several levels of detail.

Deploying a GPU job using AWS Batch

Now that we have explained all the components, let's configure our first GPU job using AWS Batch.

1. First of all, we will need to push our script's Docker image to AWS ECR (Elastic Container Registry).

For this example, I wrote a small script that reads a batch of data from S3, runs predictions on the GPU, and saves the results back to S3.
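As a minimal sketch, the entry point could look like this (the GPU check, environment variables, and S3 paths below are illustrative placeholders, not the exact original code):

# run.py - entry point of the container that AWS Batch will execute
import os

import boto3
import torch

def main():
    # Fail fast if the job definition did not request a GPU.
    assert torch.cuda.is_available(), "no GPU visible inside the container"

    s3 = boto3.client("s3")
    bucket = os.environ["INPUT_BUCKET"]
    key = os.environ["INPUT_KEY"]

    # Read a batch of data from S3, predict, and save the results back.
    s3.download_file(bucket, key, "/tmp/batch.csv")
    # ... load the model and run predictions over /tmp/batch.csv ...
    s3.upload_file("/tmp/predictions.csv", bucket, f"predictions/{key}")

if __name__ == "__main__":
    main()

Once the image containing this script is built, you can push it to ECR with the standard Docker workflow (the account ID, region, and repository name are placeholders):

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
docker build -t gpu-batch-job .
docker tag gpu-batch-job:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/gpu-batch-job:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/gpu-batch-job:latest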

2. Now let's configure our AWS Batch using the Serverless Framework. For this task, we will need to create every component that we previously explained using CloudFormation syntax, but first we need to create the serverless.yml file.

The goal of this article is to explain how to configure AWS Batch using Serverless; for more information on the serverless.yml file itself, see this link.
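As a minimal sketch, a serverless.yml for this setup could look like the following (the service name and region are assumptions); it mainly imports the Batch resources from a separate batch.yml file:

service: aws-batch-example

provider:
  name: aws
  region: us-east-1
  stage: ${opt:stage, 'dev'}

plugins:
  - serverless-pseudo-parameters

resources:
  - ${file(batch.yml)}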

Now let's start our main task: configuring the batch.yml file.

The full code is on Github.
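Since the full template is linked above, here is only a condensed sketch of the resources, assuming a managed EC2 compute environment with p3.2xlarge instances; the subnets, security groups, and IAM role ARNs are placeholders:

Resources:
  GPUComputeEnvironment:
    Type: AWS::Batch::ComputeEnvironment
    Properties:
      Type: MANAGED
      ServiceRole: <batch-service-role-arn>
      ComputeResources:
        Type: EC2
        MinvCpus: 0          # scale down to zero when the queue is empty
        MaxvCpus: 32         # up to 4 x p3.2xlarge (8 vCPUs each)
        InstanceTypes:
          - p3.2xlarge
        InstanceRole: <ecs-instance-profile-arn>
        Subnets:
          - <subnet-id>
        SecurityGroupIds:
          - <security-group-id>

  GPUJobQueue:
    Type: AWS::Batch::JobQueue
    Properties:
      Priority: 1
      ComputeEnvironmentOrder:
        - Order: 1
          ComputeEnvironment:
            Ref: GPUComputeEnvironment

  GPUJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      Type: container
      ContainerProperties:
        # the serverless-pseudo-parameters plugin resolves the
        # #{...} placeholders at deploy time
        Image: "#{AWS::AccountId}.dkr.ecr.#{AWS::Region}.amazonaws.com/gpu-batch-job:latest"
        Vcpus: 8
        Memory: 60000
        ResourceRequirements:
          - Type: GPU
            Value: "1"       # without this the container cannot see the GPU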

The YML file above sketches the components that we explained at the beginning of this article (the full template also defines their respective IAM roles). There are many small configurations that I will not explain here, but let's dig deeper into some of them.

First, let's see how the scaling process works in AWS Batch. In the compute environment configuration you will see MaxvCpus and MinvCpus; these parameters define the upper and lower limits between which your compute environment will scale. For example, we configured the lower bound to 0, meaning that if there are no jobs in the queue all resources are terminated; when you submit a new job, AWS has to initialize the resources again, so with 0 as the lower bound you will have to wait for your resources to scale up (a cold start). As the upper bound we defined 32 vCPUs; since we are using p3.2xlarge instances (a GPU instance with 8 vCPUs), our job can scale out to 4 instances (upper bound / number of vCPUs per instance).

The next parameter worth mentioning is ResourceRequirements in the job definition. It's a pretty straightforward argument, but you must not forget to pass it; otherwise your job will not be able to access the GPU, even if your EC2 instance has one available.

For details on the other configurations, you may refer to the AWS CloudFormation documentation.

3. Deploy:

Deploying our stack with Serverless is pretty simple. First, you need to install the Serverless Framework and the desired plugins. Then you just need to run the command "serverless deploy".

npm install -g serverless
npm init
npm install --save-dev serverless-pseudo-parameters
serverless deploy --stage dev --verbose

After running the commands above, your stack will be ready to be executed in the AWS cloud.

4. Now that we have our stack deployed, we need to ignite AWS Batch by submitting a new job to our job queue. This will make our infrastructure scale up so we can validate that everything is running as desired.

The following code implements an AWS Lambda that submits a job to our queue using the boto3 library.
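A minimal sketch of the handler (the queue and job definition names are assumptions and must match the resources deployed in batch.yml):

import boto3

batch = boto3.client("batch")

def submit_job(event, context):
    # Submitting a job is what triggers the compute environment
    # to scale up from zero.
    response = batch.submit_job(
        jobName="gpu-batch-job-test",
        jobQueue="GPUJobQueue",
        jobDefinition="GPUJobDefinition",
        containerOverrides={
            "environment": [
                {"name": "INPUT_BUCKET", "value": event.get("bucket", "")},
                {"name": "INPUT_KEY", "value": event.get("key", "")},
            ]
        },
    )
    return {"jobId": response["jobId"]}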

To deploy a Python Lambda using the Serverless Framework, you may refer to this link.

To monitor the logs of your task, log into the AWS console and navigate to the AWS Batch page. There you will see a dashboard with all job queues and the status of the latest jobs. If you click on a job, you can find its log stream and see the task logs in CloudWatch.
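If you prefer the command line, the AWS CLI gives the same view (the queue name is an assumption from the sketch above):

# List jobs currently running in the queue.
aws batch list-jobs --job-queue GPUJobQueue --job-status RUNNING
# describe-jobs returns the CloudWatch log stream name under container.logStreamName.
aws batch describe-jobs --jobs <job-id>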

Congratulations, you just deployed your first GPU job with AWS Batch. You now have a pretty scalable tool to run your batch workloads without needing to manage any infrastructure yourself.
