Which AWS Compute Option to Choose?
Below are some of the considerations for choosing the right compute option:
- Processing power.
- Specialized hardware for certain use cases, such as GPU-accelerated or I/O-optimized instances.
- Capacity.
- Scaling.
- Cost.
The major compute options AWS offers:
1. EC2:
EC2 provides virtualized servers that are provisioned for you to run your workloads on. You manage the OS, patching, and all other administration.
Consider the features below, which AWS provides and which may not be easy to get in a conventional data center:
a. Burstable: Traditional Amazon EC2 instance types provide fixed performance, while burstable performance instances (T3, T3a, and T2) provide a baseline level of CPU performance with the ability to burst above that baseline. The baseline performance and the ability to burst are governed by CPU credits.
A CPU credit provides the performance of a full CPU core running at 100% utilization for one minute. Other combinations of number of vCPUs, utilization, and time can also equate to one CPU credit. For example, one CPU credit is equal to one vCPU running at 50% utilization for two minutes, or two vCPUs running at 25% utilization for two minutes.
b. FPGA (Field Programmable Gate Array):
FPGAs are reprogrammable hardware devices that can implement any logic function. This allows developers to create custom processors or accelerators tailored to a specific workload. In specific cases, this enables significant acceleration compared to fixed-function compute solutions like CPUs and GPUs. FPGAs are particularly useful for prototyping application-specific integrated circuits (ASICs) or processors. An FPGA can be reprogrammed until the ASIC or processor design is final and bug-free and manufacturing of the final ASIC begins. Intel itself uses FPGAs to prototype new chips. More on FPGA here.
AWS provides F1 instances for such use cases.
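The CPU credit arithmetic described for burstable instances above can be sketched in a few lines of Python. This only illustrates the accounting rule; the function name `cpu_credits` and its parameters are made up for this example and are not part of any AWS API.

```python
def cpu_credits(vcpus: int, utilization: float, minutes: float) -> float:
    """Credits consumed: one credit = one vCPU at 100% utilization for one minute.

    `utilization` is a fraction between 0.0 and 1.0.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return vcpus * utilization * minutes


# The three combinations mentioned above each cost exactly one credit.
print(cpu_credits(1, 1.0, 1))
print(cpu_credits(1, 0.5, 2))
print(cpu_credits(2, 0.25, 2))
```

The same function can be used to estimate how quickly a workload would drain a credit balance, for example two vCPUs at 50% utilization for an hour.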
Consider the instance types provided:
a. General purpose: These instances provide a balance of compute, memory and networking resources, and can be used for a variety of diverse workloads. These instances are ideal for applications that use these resources in equal proportions such as web servers and code repositories (e.g., A1, T2, M4).
b. Compute optimized: These instances are ideal for compute-bound applications that benefit from high-performance processors. Instances belonging to this family are well suited for batch processing workloads, media transcoding, high-performance web servers, high-performance computing (HPC), dedicated gaming servers and other compute-intensive applications (e.g., C4, C5).
c. Memory optimized: These instances are designed to deliver fast performance for workloads that process large data sets in memory (e.g., R4, R5, X1).
d. Accelerated computing: These instances use hardware accelerators, or co-processors, to perform functions, such as floating point number calculations, graphics processing, or data pattern matching, more efficiently than is possible in software running on CPUs (e.g., F1, G3, P2).
e. Storage optimized: These instances are designed for workloads that require high, sequential read and write access to very large data sets on local storage. They are optimized to deliver tens of thousands of low-latency, random I/O operations per second (IOPS) to applications (e.g., I3, H1).
f. Bare metal: These instances allow EC2 customers to run applications that benefit from deep performance analysis tools, specialized workloads that require direct access to bare metal infrastructure, legacy workloads not supported in virtual environments, and licensing-restricted Tier 1 business-critical applications. Bare metal instances also make it possible for customers to run virtualization-secured containers such as Clear Linux Containers. Workloads on bare metal instances continue to take advantage of all the comprehensive services and features of the AWS Cloud, such as Amazon Elastic Block Store (EBS), Elastic Load Balancing (ELB) and Amazon Virtual Private Cloud (VPC) (e.g., m5.metal, r5.metal, and z1d.metal).
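As a rough illustration of how the instance type names above are put together, the sketch below splits a name such as `m5.metal` into family, generation and size, and looks up the category. The helper `describe_instance_type` is hypothetical, and the lookup table only covers the family letters that appear in the examples in this list, so anything else is reported as "unknown".

```python
import re

# Family letter -> category, taken only from the examples listed above.
CATEGORY_BY_FAMILY = {
    "a": "general purpose", "t": "general purpose", "m": "general purpose",
    "c": "compute optimized",
    "r": "memory optimized", "x": "memory optimized",
    "f": "accelerated computing", "g": "accelerated computing",
    "p": "accelerated computing",
    "i": "storage optimized", "h": "storage optimized",
}

# <family letters><generation digits><optional attribute letters>.<size>
_PATTERN = re.compile(r"([a-z]+)(\d+)([a-z-]*)\.([a-z0-9-]+)")


def describe_instance_type(instance_type: str) -> dict:
    """Split a name like 't3a.micro' into its parts (hypothetical helper)."""
    match = _PATTERN.fullmatch(instance_type.lower())
    if match is None:
        raise ValueError(f"unrecognized instance type: {instance_type!r}")
    family, generation, attributes, size = match.groups()
    return {
        "family": family,
        "generation": int(generation),
        "attributes": attributes,
        "size": size,
        "category": CATEGORY_BY_FAMILY.get(family, "unknown"),
        "bare_metal": size.startswith("metal"),
    }


print(describe_instance_type("m5.metal"))
print(describe_instance_type("t3a.micro"))
```

Note that bare metal shows up as a size (`.metal`) of an existing family rather than as a family of its own.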
Scaling options to consider:
a. Metrics based: You can track different performance metrics and then scale out or in based on them.
b. Schedule based: This allows you to set your own scaling schedule for predictable load changes. For example, every week the traffic to your web application starts to increase on Wednesday, remains high on Thursday, and starts to decrease on Friday.
c. Health based: AWS monitors the health of your instances and replaces them if needed. There’s no scaling in or out in this case, but it ensures that capacity is maintained.
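The three scaling options above can be illustrated with a small, self-contained sketch. It is not how EC2 Auto Scaling is implemented or configured. The function names, the 50% CPU target, the instance counts and the Wednesday-to-Friday peak window are all assumptions chosen for this example.

```python
import math
from datetime import datetime
from typing import Sequence


def metric_based(current: int, avg_cpu_percent: float,
                 target_cpu_percent: float = 50.0) -> int:
    """Pick a capacity that moves average CPU toward the target (simplified)."""
    return max(1, math.ceil(current * avg_cpu_percent / target_cpu_percent))


def schedule_based(now: datetime, baseline: int = 2, peak: int = 6) -> int:
    """Use the peak capacity from Wednesday (2) through Friday (4)."""
    return peak if now.weekday() in (2, 3, 4) else baseline


def health_based(healthy_flags: Sequence[bool]) -> int:
    """Count instances to replace; total capacity stays the same."""
    return sum(1 for healthy in healthy_flags if not healthy)


print(metric_based(current=4, avg_cpu_percent=75.0))
print(schedule_based(datetime(2024, 1, 3)))
print(health_based([True, False, True]))
```

In practice these rules are expressed as scaling policies, scheduled actions and health checks on an Auto Scaling group rather than written by hand.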
2. ECS:
A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. Put simply, a container consists of an entire runtime environment: an application, plus all its dependencies, libraries and other binaries, and configuration files needed to run it, bundled into one package. By containerizing the application platform and its dependencies, differences in OS distributions and underlying infrastructure are abstracted away.
Containers are an abstraction at the app layer that packages code and dependencies together. Multiple containers can run on the same machine and share the OS kernel with other containers, each running as an isolated process in user space. Containers take up less space than VMs (container images are typically tens of MBs in size), can handle more applications and require fewer VMs and operating systems.
Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service. You can even choose to run your ECS clusters using AWS Fargate, which is serverless compute for containers. Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design. Amazon ECS lets you schedule long-running applications, services, and batch processes using Docker containers.
Here is an example where moving to ECS was quite beneficial.
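To make the ECS and Fargate description above more concrete, here is a minimal sketch of what a Fargate task definition looks like when built as a Python dictionary. The helper name `fargate_task_definition`, the family name and the image are placeholders for this example. A real task definition usually also needs settings such as an execution role and logging, which are left out here.

```python
def fargate_task_definition(family: str, image: str, cpu: str = "256",
                            memory: str = "512",
                            container_port: int = 80) -> dict:
    """Build a minimal task definition for a single container on Fargate."""
    return {
        "family": family,
        # Fargate tasks use the awsvpc network mode.
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        # Resources are declared per task: CPU units and memory in MiB.
        "cpu": cpu,
        "memory": memory,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "protocol": "tcp"}
                ],
            }
        ],
    }


task_def = fargate_task_definition("web-demo", "nginx:latest")
print(task_def["requiresCompatibilities"], task_def["cpu"], task_def["memory"])
```

A dictionary like this could then be passed as keyword arguments to the `register_task_definition` call of the boto3 ECS client, assuming credentials and the necessary IAM roles are in place.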
3. Lambda:
AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume.
S3 can trigger Lambda to process data immediately after an upload. For example, Lambda can be used to generate image thumbnails in real time. Lambda and Amazon Kinesis can be used to process real-time streaming data for social media analysis or for clickstream analysis. Lambda can also be used to perform data validation, filtering, sorting, or other transformations for every data change in a DynamoDB table and load the transformed data to another data store.
AWS Lambda removes the need for manual server provisioning, maintenance and upgrades. It can seamlessly scale depending on data volume, requests and traffic. It also supports multiple languages.
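As a minimal sketch of the S3 trigger described above, the handler below reads the bucket name and object key from each record of an S3 notification event. The actual thumbnail generation is left out, and the bucket and key in the sample event are invented for the example.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect the bucket and key of every object in an S3 event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append({"bucket": bucket, "key": key})
        # A real function would download the object here, create the
        # thumbnail, and write it to a destination bucket.
    return {"processed": uploaded}


# Local smoke test with a hand-written event (names are placeholders).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-uploads"},
                "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
print(lambda_handler(sample_event, None))
```

Because the handler is a plain function, it can be exercised locally with a sample event like this before being deployed.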
Considerations:
- Timeout: The amount of time that Lambda allows a function to run before stopping it. The default is 3 seconds and the maximum allowed value is 900 seconds (15 minutes). Long-running functions or complex processes aren’t a good fit for AWS Lambda.
- Deployment size: A Lambda function’s code consists of scripts or compiled programs and their dependencies. A .zip deployment package uploaded directly is limited to 50 MB (250 MB unzipped).
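The two considerations above can be turned into a quick sanity check before choosing Lambda for a workload. This is only a sketch: the function name `lambda_fit_issues` is hypothetical, and it checks nothing beyond the timeout and package size limits quoted in this list.

```python
LAMBDA_DEFAULT_TIMEOUT_S = 3
LAMBDA_MAX_TIMEOUT_S = 900
LAMBDA_MAX_PACKAGE_MB = 50


def lambda_fit_issues(expected_runtime_s: float, package_mb: float) -> list:
    """Return a list of reasons the workload may not fit Lambda as-is."""
    issues = []
    if expected_runtime_s > LAMBDA_MAX_TIMEOUT_S:
        issues.append("runtime exceeds the 900 second maximum timeout")
    elif expected_runtime_s > LAMBDA_DEFAULT_TIMEOUT_S:
        issues.append("timeout must be raised above the 3 second default")
    if package_mb > LAMBDA_MAX_PACKAGE_MB:
        issues.append("deployment package exceeds the 50 MB limit")
    return issues


print(lambda_fit_issues(expected_runtime_s=2, package_mb=10))
print(lambda_fit_issues(expected_runtime_s=1200, package_mb=80))
```

An empty list means neither limit is hit. A job that reports the maximum timeout issue is usually a better match for ECS or EC2.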
This article describes more such considerations.