Self-Managed Clusters: ECS + Fargate as an Alternative to Kubernetes

Demian Sclausero
Flux IT Thoughts
12 min read · Feb 17, 2023


A few days ago, as a Solution Architect, it was my job to design an architecture for a client whose business activity is very dynamic and configurable, which led us to choose microservices. We had to split concepts apart and rearrange them so that they could interact with other internal and external services.

Due to certain characteristics of this client, we decided to work with AWS; and as we usually do, we chose Kubernetes with EKS (Amazon Elastic Kubernetes Service). But when I told Carlos (a great friend of mine) about this choice, he asked me whether the client had any employees who knew how to manage Kubernetes, and I replied that, in fact, they did not. He therefore warned me that service management should be as automated as possible, so that nobody would have to manage scaling by hand, adding or removing Pods (the basic Kubernetes unit of deployment and scaling).

This was the reason why we finally decided to switch from EKS to ECS (Elastic Container Service) and Fargate. And if this sounds new to you, keep reading and I will tell you more about it.

Let’s Check AWS Documentation

“AWS Fargate is a technology that you can use with Amazon ECS to run Docker containers without having to manage servers or clusters of Amazon EC2 instances. With AWS Fargate, you no longer have to provision, configure, or scale clusters of virtual machines to run containers. This removes the need to choose server types, decide when to scale your clusters, or optimize cluster packing.” [1]

The idea is that you do not manage resources, virtual machines, or memory on your own; instead, AWS manages them for you.

ECS is a cluster that works as a service and is consumed as if it were an API. You get all of its capacity (memory, CPU, and more) without having to manage it yourself.

But Then, What Is a Cluster?

An Amazon Elastic Container Service (ECS) cluster is a logical grouping of compute capacity (EC2 instances or Fargate capacity) used to run containers. A cluster provides the environment in which container tasks run; each task, in turn, runs one or more containers.

Each cluster can contain one or more groups of tasks which, in turn, may contain one or more containers. In addition, to automatically scale containers within a cluster, you must configure automatic scaling policies.

There are two types of ECS:

1. ECS with instances: a cluster in which you provision the instances yourself. These instances can be either On-Demand or Spot Instances.

2. Serverless (ECS Fargate): a cluster that is meant to be serverless (with no instances to manage), in which you define which services will run, and AWS takes care of provisioning and running the required compute behind the scenes.

As I mentioned earlier, a cluster is used to run tasks or services, and an ECS cluster is built exclusively for orchestrating Docker containers.

All your tasks and processes end up running in Docker containers. To define these containers, there is an object called the container definition, in which we specify all the parameters and properties a Docker container needs in order to work.

Likewise, a task definition is used: a template that specifies where a Docker image is located, how to access it, how to execute it, and whether it runs on EC2 or Fargate. A task definition can hold a set of container definitions: for example, a .NET Core server and a SQL Server instance running in the same environment and sharing data with each other.

Then there is the service object, which is the gateway to execution. In the service, you define how many tasks will run, which load balancer to use, and the memory and CPU limits.
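To make this concrete, here is a minimal sketch of what the container definitions for the .NET Core + SQL Server example above could look like, written as the Python structure you would later pass to the AWS SDK. Image names, ports, and environment variables are illustrative placeholders, not the real project's values:

```python
# Hypothetical containerDefinitions for a task that pairs an app with its database.
container_definitions = [
    {
        "name": "api",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-dotnet-api:latest",  # placeholder image
        "essential": True,
        "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        # Containers in the same Fargate task share the network namespace, so localhost works.
        "environment": [{"name": "ConnectionStrings__Default",
                         "value": "Server=localhost,1433;..."}],
    },
    {
        "name": "sqlserver",
        "image": "mcr.microsoft.com/mssql/server:2019-latest",
        "essential": True,
        "portMappings": [{"containerPort": 1433, "protocol": "tcp"}],
        "environment": [{"name": "ACCEPT_EULA", "value": "Y"},
                        {"name": "SA_PASSWORD", "value": "use-a-secret-instead"}],  # placeholder only
    },
]
```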

(Diagram source: www.serverless.com/blog/serverless-application-for-long-running-process-fargate-lambda)

How to Create an ECS Fargate Directly from the AWS Console

The same steps can also be automated with Terraform or an SDK; here, we will go through them in the console:

1. Create a Cluster

From the AWS Console, you will enter the Elastic Container Service and create a new cluster.

You will select the VPC on which the cluster will run or create a new one.

Once you get to the infrastructure section, you will choose the type of cluster (EC2 or Fargate) depending on whether you want a serverless ECS or EC2 with instances. Remember that if you select Fargate, all services must be of the same type.

Once created, in the console, you will be able to see the cluster and what state it is in, and you will have to wait for AWS to create and assign the cluster resources.
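If you prefer to script this step rather than click through the console, a minimal sketch with the AWS SDK for Python (boto3) might look like this; the cluster name and region are placeholders:

```python
import boto3

# Hypothetical cluster name and region; adjust to your account.
ecs = boto3.client("ecs", region_name="us-east-1")

response = ecs.create_cluster(
    clusterName="my-fargate-cluster",
    capacityProviders=["FARGATE", "FARGATE_SPOT"],
    defaultCapacityProviderStrategy=[
        {"capacityProvider": "FARGATE", "weight": 1},
    ],
)
print(response["cluster"]["status"])  # PROVISIONING, then ACTIVE
```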

2. Create a Task Definition

Once the cluster is created, you must let it know which image it is going to execute, and for that reason you will create your task definition.

What is a task definition? It is a JSON file that specifies the details of a container task, such as which containers to include, the resource configuration for those containers, and the network rules. A task definition may also specify data volumes to attach to the containers, environment variables, and authentication settings. The task definition is used to create a task that will later run on the ECS cluster.

How can you create a task definition? To create a Fargate task, you must:

  1. Select the “Task Definitions” option in the navigation menu.
  2. Press the “Create New Task Definition” button.
  3. Select “Fargate” as the launch type.
  4. Press the “Next Step” button.
  5. In the “Task Definition” tab, fill in the necessary details, such as the name, CPU units, and required memory. These are the baseline parameters; once the service is running, Auto Scaling will add or remove task copies as demand changes.
  6. Press the “Add Container” button to add a container to the task.
  7. Fill in the container details, such as name, image, ports, and so on.
  8. Press the “Create” button to create the task definition.

Once the task is created, you will be able to use the “Run Task” option to run it on a Fargate cluster.

Note that to run a task on Fargate, at least one service and network must be configured before you create a task.

Once the task is created, you will wait for ECS to provision the resources and start the task. In the cluster, you will be able to see the status of both services and tasks.
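The same task definition can be registered through the SDK instead of the console. A minimal boto3 sketch for a single-container Fargate task, where the family name, image URI, and role ARN are placeholders:

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

task_def = ecs.register_task_definition(
    family="my-api-task",                         # placeholder family name
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",                         # required for Fargate
    cpu="256",                                    # 0.25 vCPU
    memory="512",                                 # 512 MiB
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # placeholder
    containerDefinitions=[
        {
            "name": "api",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-api:latest",  # placeholder
            "essential": True,
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        }
    ],
)
print(task_def["taskDefinition"]["taskDefinitionArn"])
```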

3. Create Services

In order to coordinate all this and to keep the instances or tasks running, you need a service. So… what is a service?

A service in Amazon Elastic Container Service (ECS) is an abstraction used to manage a group of identical containers that run in a cluster. A service is responsible for ensuring that a specific number of container replicas are running at all times, and it also provides a mechanism to automatically scale containers based on demand.

When you create a service in ECS, you must specify a task definition that describes which container to run, and you need to configure the desired number of container replicas. ECS is responsible for providing the necessary resources in the cluster and for launching the tasks that correspond to the task definition.

A service can also be attached to an Elastic Load Balancer (ELB) or Application Load Balancer (ALB), making it possible to distribute the workload among the service’s tasks. The service also takes care of automatically restarting failed tasks and of scaling the number of tasks based on demand.

You can say that it is like the “gateway” to the cluster.

First, you will need to create the service and to set the environment.

Then, you will create the deployment configuration.

The deployment configuration is a feature that lets you control how service updates are rolled out. It specifies how transitions between different versions of a task definition are made, giving you precise control over the deployment process of a service.

When you create or update a service, you will be able to choose a deployment configuration that includes the following information:

  • The maximum number of tasks that can be stopped simultaneously during an update.
  • The maximum time a task is given to stop before it is forcibly terminated.
  • The maximum number of tasks that can be launched simultaneously during an update.

In this way, you will be able to control the deployment process to ensure a smooth transition between older and newer versions of the service. In addition, the deployment configuration will allow you to reduce downtime and disruption risks during deployment.
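As a rough idea of how these pieces come together in code, here is a hedged boto3 sketch that creates a Fargate service with a deployment configuration; the cluster, subnet, and security-group identifiers are placeholders (the load balancer wiring is shown further below):

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

service = ecs.create_service(
    cluster="my-fargate-cluster",            # placeholder cluster name
    serviceName="my-api-service",            # placeholder service name
    taskDefinition="my-api-task",            # latest revision of the task definition family
    desiredCount=2,                          # number of task replicas to keep running
    launchType="FARGATE",
    deploymentConfiguration={
        "maximumPercent": 200,               # up to 2x the desired count may run during an update
        "minimumHealthyPercent": 100,        # never drop below the desired count while updating
    },
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],        # placeholder subnet
            "securityGroups": ["sg-0123456789abcdef0"],     # placeholder security group
            "assignPublicIp": "ENABLED",
        }
    },
)
print(service["service"]["status"])
```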

Setting Up the Networking

Networking is mainly used to establish connections between containers, and to connect containers with each other and with other services within a cluster. This is accomplished by using Virtual Private Clouds (VPCs) and Amazon VPC routing and security options.

In addition, the Load Balancing service in ECS can implement an Elastic Load Balancer (ELB) to distribute the workload between service tasks and to ensure an equitable distribution of incoming workload. This makes it possible to automatically scale containers depending on demand and to increase service availability.

You can also control inbound and outbound traffic by using Amazon VPC security groups and network access control lists (ACLs) to restrict access to only specific services and to protect both containers and services from external attacks.
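For instance, a common pattern is a security group for the tasks that only accepts traffic from the load balancer’s own security group. A minimal boto3 sketch, with all identifiers as placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Security group for the service's tasks (VPC id and ALB security group id are placeholders).
sg = ec2.create_security_group(
    GroupName="my-api-tasks-sg",
    Description="Allow HTTP only from the ALB",
    VpcId="vpc-0123456789abcdef0",
)

ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 80,
        "ToPort": 80,
        # Only the load balancer's security group may reach the tasks.
        "UserIdGroupPairs": [{"GroupId": "sg-0aabbccddeeff0011"}],
    }],
)
```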

Once services are created, you will be able to see in the AWS console how the components that make up the cluster behave and to check if all components are up and running.

Once you have the cluster with its respective services and task definitions, you will be able to generate a load balancer to distribute the incoming traffic that gets to services.

What is a load balancer? A load balancer is a service that automatically distributes incoming traffic across multiple servers or instances, thus increasing application availability and scalability.

AWS offers two main types of load balancers:

Application Load Balancer (ALB): A load balancer that operates at layer 7, which is to say at the application level, distributing load based on criteria such as URLs or HTTP headers.

Network Load Balancer (NLB): A load balancer that operates at layer 4, distributing load based on network-level criteria such as IP addresses and ports.

Load balancers can be used to distribute incoming load across multiple EC2 instances, and they can also balance load between containers, routing traffic to workloads running on Amazon ECS or on a Kubernetes cluster.

In addition, a load balancer can have other features such as:

  • SSL Offloading
  • Health Checks
  • Sticky Sessions

With the load balancer, you can improve the availability and scalability of applications by automatically distributing the load among several servers or instances. This enables you to have only one access point and to use listeners to route traffic.

Now, what is a listener? In a load balancer, listeners are rules that are configured to control how incoming traffic to the load balancer is handled. Each listener is configured with a specific network protocol and port and is used to receive and handle incoming requests that get to that protocol and port.

In the case of the Application Load Balancer (ALB), listeners focus on the application layer, and they can be configured with rules based on URLs or HTTP headers to redirect incoming traffic towards different task groups or servers.

In the case of the Network Load Balancer (NLB), listeners focus on the network layer, and they can be configured with rules based on network protocols such as TCP or UDP and with the corresponding port to redirect traffic towards different task groups or servers.

For each listener, we can configure one or more rules, which are used to determine to which group of tasks the incoming traffic will be sent. These rules can be based on different criteria, such as URLs or HTTP headers, or IP addresses and ports.

Listeners are essential if you want the load balancer to work, since they are responsible for receiving and redirecting traffic to the right servers or instances, thus allowing for an equitable distribution of the load and improving both availability and scalability of applications.
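Putting the last few paragraphs together, here is a hedged boto3 sketch that creates an ALB, a target group for the Fargate tasks (target type "ip", since Fargate tasks register by IP), and an HTTP listener. Names, subnets, security groups, and the health-check path are placeholders:

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Application Load Balancer in front of the service (identifiers are placeholders).
alb = elbv2.create_load_balancer(
    Name="my-api-alb",
    Type="application",
    Scheme="internet-facing",
    Subnets=["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],
    SecurityGroups=["sg-0aabbccddeeff0011"],
)

# Fargate tasks register by IP, so the target group uses TargetType="ip".
tg = elbv2.create_target_group(
    Name="my-api-targets",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    TargetType="ip",
    HealthCheckPath="/health",   # hypothetical health endpoint
)

# Listener: receive HTTP on port 80 and forward everything to the target group.
elbv2.create_listener(
    LoadBalancerArn=alb["LoadBalancers"][0]["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{
        "Type": "forward",
        "TargetGroupArn": tg["TargetGroups"][0]["TargetGroupArn"],
    }],
)
```

The resulting target group ARN is what you would pass to the service (in the loadBalancers parameter of create_service, together with the container name and port) so that ECS registers each task with the balancer.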

The ECS Fargate architecture used in this example looks like this (the project includes a front end in React and a back end in Node.js):

In a specific branch of the version control system there is a deployment configuration that builds the Docker image and pushes it to ECR (Amazon Elastic Container Registry). From there, new revisions of the task definition are created and deployed so that the service can use them.
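The last step of such a pipeline, registering a new task definition revision and pointing the service at it, could be sketched with boto3 roughly like this; every identifier and image tag is a placeholder:

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# Register a new revision of the task definition with the freshly pushed image tag.
new_revision = ecs.register_task_definition(
    family="my-api-task",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[{
        "name": "api",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-api:build-42",  # tag produced by the pipeline
        "essential": True,
        "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
    }],
)

# Point the running service at the new revision; ECS performs a rolling deployment.
ecs.update_service(
    cluster="my-fargate-cluster",
    service="my-api-service",
    taskDefinition=new_revision["taskDefinition"]["taskDefinitionArn"],
)
```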

And all this happens without the intervention of users since all clusters have Auto Scaling!

Auto Scaling in ECS is used to automatically scale the containers of a service depending on demand. With Auto Scaling, there are always enough containers available to handle the load and to ensure high availability for your applications.

There are two main ways to configure Auto Scaling in ECS:

CloudWatch-Based Auto Scaling: This type of automatic scaling uses CloudWatch alarms to increase or reduce the number of tasks in a service based on CloudWatch metrics. For example, you can create rules to automatically scale tasks based on CPU or memory usage.

Schedule-Based Auto Scaling: In this case, the number of tasks in a service is scaled automatically according to a schedule. For example, you can configure the schedule to increase the number of tasks during peak hours and reduce them during off-peak hours.

Once configured, Auto Scaling automatically increases or reduces the number of tasks according to the rules or schedule previously defined, thus ensuring that there are always enough tasks available to handle the load and to keep the application highly available.
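To give a sense of what the metric-based variant looks like in code, here is a hedged boto3 sketch that keeps a service’s average CPU around 60% by adjusting its desired task count; the cluster and service names are placeholders:

```python
import boto3

autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

# The scalable target is the service's desired task count (cluster/service names are placeholders).
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/my-fargate-cluster/my-api-service",
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=1,
    MaxCapacity=6,
)

# Target-tracking policy: add or remove tasks to keep average CPU near 60%.
autoscaling.put_scaling_policy(
    PolicyName="cpu-target-tracking",
    ServiceNamespace="ecs",
    ResourceId="service/my-fargate-cluster/my-api-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
        },
        "ScaleInCooldown": 60,
        "ScaleOutCooldown": 60,
    },
)
```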

To Sum Up

When you need clusters to run services, APIs, or a front end, but you do not have the resources, time, or knowledge to manage them, it is convenient to choose self-managed clusters such as ECS + Fargate. They offer dynamism and the ability to scale up or down according to current demand, without a DevOps or SRE role constantly monitoring, starting and stopping instances, or managing cluster resources. For all this magic to happen, keep in mind that Auto Scaling [2] (AWS documentation) exists and is responsible for growing or shrinking the ECS cluster according to your requirements.

Now I’d like to ask for your opinion on these AWS serverless technologies and whether you have tried them before. If you have, I would like to know how that turned out. I will leave the comments section open to read your opinions. If you have any questions, you can write to me at demian.sclausero@fluxit.com.ar or leave a comment here, and together we can discover more of what this infinite AWS world has to offer. Thank you for reading!

References

[1] https://docs.aws.amazon.com/AmazonECS/latest/developerguide/AWS_Fargate.html

[2] https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html

Sources

https://docs.aws.amazon.com

https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesguide/application.html

https://docs.gitlab.com/ee/ci/

