Create a scheduled (cron-like) task in AWS Fargate

Hengfeng Li
4 min readFeb 1, 2019

--

A scheduled (cron-like) task

This all begins with my need to schedule a script to crawl some stock data weekly. In a web service or application, we always have some needs to do a job at fixed times, dates, or intervals.

The most famous job scheduler is cron, which provides a utility for scheduling repetitive tasks on Linux/Unix systems. Cron Expression is commonly used to let you define when tasks should be run. You can check this website crontab.guru to configure the cron expression. There are different variants of Cron Expression used in systems, like Jenkins, Kubernetes CronJob, Fargate Scheduled Task etc. Make sure you check its instructions before use.

AWS Fargate

Fargate is a new managed service for container orchestration provided by AWS. In ECS, we always need to provision the EC2 cluster first before running services. Also, there is an operating cost to maintain the cluster. When there is a change, we need to change both the underlying cluster (the number of EC2 instances) and services (the number of tasks). In Fargate, you don’t need to operate the cluster. You can image Fargate as an unlimited cluster that you can use and you pay as you use.

Fargate also provides the ability to run scheduled tasks via CloudWatch Events.

There are a couple of other job schedulers:

  • Kubernetes CronJob
  • Airflow (using Celery scheduler)
  • Jenkins

Kubernetes CronJob needs to operate a Kubernetes cluster (I haven’t tried AWS EKS yet). Airflow needs to run a machine or a cluster and also is tied to Python. Jenkins can be used as a job scheduler but it is not designed for this.

The major advantages of using Fargate I think are:

  • No operation code and cost
  • Support Docker containers
  • Easy for scaling up

Plus, Fargate is an extension to ECS, which I am very familiar with.

Create a Fargate scheduled task

#1 Create an empty cluster and a VPC configured with two public subnets.

a. Install ECS-CLI first.

b. Create a cluster configuration

ecs-cli configure --cluster tutorial --region us-east-1 \
--default-launch-type FARGATE --config-name tutorial

c. Run ecs-cli up.

d. Check that a VPC has a default security group.

#2 Create log group, S3 bucket (optional, if you need one), ECS task execution role, e.g.,

 ---
AWSTemplateFormatVersion: '2010-09-09'
Description: >
Create S3 bucket.
Parameters:
BucketName:
Type: String
LogGroupName:
Type: String

Resources:
StockDataBucket:
Type: AWS::S3::Bucket
Properties:
BucketName: !Ref BucketName
EcsTaskExecutionRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
-
Effect: "Allow"
Principal:
Service:
- "ecs-tasks.amazonaws.com"
Action:
- "sts:AssumeRole"
ManagedPolicyArns:
- arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
Path: /
LogGroup:
Type: AWS::Logs::LogGroup
Properties:
LogGroupName: !Ref LogGroupName
RetentionInDays: 3

#3 Build & push docker image to AWS ECR.

File Makefile:

build:
docker build -t hengfeng/stock-crawler:latest .

push:
aws ecr get-login --region us-east-1 > ./ecrlogin.sh
sed -i -E 's,'"-e none"','""',' ./ecrlogin.sh
sh ./ecrlogin.sh && rm -rf ./ecrlogin.sh && rm -rf ./ecrlogin.sh-E
docker tag hengfeng/stock-crawler:latest XXX.dkr.ecr.us-east-1.amazonaws.com/stock-crawler:${TAG}
docker push XXX.dkr.ecr.us-east-1.amazonaws.com/stock-crawler:${TAG}

#4 Deploy ECS task definition.

File Makefile:

task-definition:
export TAG=${TAG}
ecs-cli compose --project-name stock-crawler --file ./deploy/docker-compose.yml --ecs-params ./deploy/ecs-params.yml --region us-east-1 create --launch-type FARGATE

File ./deploy/docker-compose.yml:

 version: '3'
services:
crawler:
image: XXX.dkr.ecr.us-east-1.amazonaws.com/stock-crawler:${TAG}
env_file:
- ../.env
logging:
driver: awslogs
options:
awslogs-region: us-east-1
awslogs-group: stock-crawler
awslogs-stream-prefix: ${TAG}

File ./deploy/ecs-params.yml:

 version: 1
task_definition:
task_execution_role: {ECS_TASK_EXECUTION_ROLE_NAME}
ecs_network_mode: awsvpc
task_size:
mem_limit: 0.5GB
cpu_limit: 256
run_params:
network_configuration:
awsvpc_configuration:
subnets:
- {SUBNET_A_NAME}
- {SUBNET_B_NAME}
security_groups:
- {SECURITY_GROUP_NAME}
assign_public_ip: ENABLED

Note: Replace ECS_TASK_EXECUTION_ROLE_NAME, SUBNET_A_NAME, SUBNET_B_NAME, and SECURITY_GROUP_NAME with your parameters.

#5 Create a scheduled Fargate task.

 ---
AWSTemplateFormatVersion: '2010-09-09'
Description: >
Scheduled ECS task.
Parameters:
ECSClusterArn:
Type: String
ScheduledWorkerTaskArn:
Type: String

Resources:
StockCrawlerTaskSchedule:
Type: AWS::Events::Rule
Properties:
Description: "Crawl stock data every 15 mins"
Name: "stock-crawler"
ScheduleExpression: "cron(0/15 * * * ? *)"
State: DISABLED
Targets:
- Id: stock-crawler-fargate-task
RoleArn: !GetAtt TaskSchedulerRole.Arn
Arn: !Ref ECSClusterArn
EcsParameters:
TaskCount: 1
TaskDefinitionArn: !Ref ScheduledWorkerTaskArn
TaskSchedulerRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
-
Effect: "Allow"
Principal:
Service:
- "events.amazonaws.com"
Action:
- "sts:AssumeRole"
Path: /
Policies:
- PolicyDocument:
Statement:
- Effect: "Allow"
Condition:
ArnEquals:
ecs:cluster: !Ref ECSClusterArn
Action: "ecs:RunTask"
Resource: "*"
- Effect: "Allow"
Condition:
ArnEquals:
ecs:cluster: !Ref ECSClusterArn
Action:
- "iam:ListInstanceProfiles"
- "iam:ListRoles"
- "iam:PassRole"
Resource: "*"
PolicyName: "TaskSchedulerPolicy"

Note: From EcsParameters, CloudFormation in Step 5 should support specify parameters, such as LaunchType, NetworkConfiguration, PlatformVersion, for Fargate tasks. However, currently, I don’t think it is finished yet because it throws an error saying keyword does not exist. This is the reason why I set the task to DISABLED.

#6 Manually update the task config in CloudWatch Event Rules.

a. Edit

  • Launch Type: Fargate
  • Platform Version: LATEST
  • Subnets: subnet-001, subnet-002
  • Security Group: sg-001
  • Auto-assign Public IP (MUST): ENABLED
  • Use existing role: TaskSchedulerRole

b. Enable the task

After the above steps, the task should be run at the schedule in Fargate. The CloudFormation template does not fully support it yet and some manual steps are required.

References

--

--

Hengfeng Li

PhD. Software engineer at Google. Living in Australia.