What Is Elastic Compute Cloud (EC2), and What Should We Know About It?

Prabhu Rajendran
Everything at Once
Published in
11 min read · Oct 14, 2019

Amazon Web Services (AWS) is one of the most popular cloud infrastructure service providers (alongside Microsoft Azure, Google Cloud Platform, DigitalOcean, and Heroku). Its EC2 cloud servers have helped many resource-intensive business websites scale seamlessly across the globe.

What is AWS? AWS is a comprehensive cloud computing platform that allows entrepreneurs to power their business infrastructure and become more agile. It is an offshoot of the internal infrastructure Amazon built to manage its online retail business, and it launched in 2006.

What is AWS EC2, and what are its features?

  1. Elastic Compute Cloud (EC2) is a service that lets users run applications on virtual servers (called instances) on the AWS cloud infrastructure.
  2. With AWS EC2, we get instance types with different combinations of CPU, memory, storage, and networking capacity. Each type is available in different sizes so that it can cater to the workload as required.
  3. AWS EC2 is a core component of the Amazon Web Services platform, providing a scalable cloud-based computing platform.
  4. EC2 offers a wide variety of standard and custom preconfigured templates, stored as Amazon Machine Images (AMIs), with various Linux-based or Windows-based operating systems and software configurations.
  5. EC2 allows users to increase or decrease capacity within minutes and to launch instances in specific parts of the world (that is, in whichever region demand requires).
  6. EC2 integrates with other AWS components and features, including Auto Scaling, which automatically launches and terminates instances to meet the demand on your application. Instances running applications in Docker containers can be grouped into clusters and managed by Amazon Elastic Container Service (ECS) or Amazon Elastic Kubernetes Service (EKS).
  7. We can easily attach Amazon Elastic Block Store (EBS) volumes to instances for persistent block-level storage, and many instance types can be EBS-optimized for additional performance.
  8. Security groups act as a firewall that lets us specify the protocols, ports, and source IP ranges that can reach our instances.
  9. We can assign metadata (known as tags) to our resources.

Things to know before starting with Amazon EC2:

  1. Basics: instances, AMIs, regions and availability zones, instance types, tags
  2. Networking and security: EC2 key pairs, security groups, Elastic IP addresses, EC2 and VPC
  3. Storage: EBS and instance store
  4. Working with Linux instances: Systems Manager, Run Command
  5. Related services: Auto Scaling, CloudFormation, Elastic Beanstalk, OpsWorks
  6. How to access Amazon EC2: command line interface (CLI), PowerShell
  7. EC2 pricing model
  8. PCI DSS compliance

Let’s look at some of the above in detail:

  1. AMI: a template that contains a software configuration (for example, an operating system, an application server, and applications). From an AMI, we can launch any number of instances, each of which is a copy of the AMI running as a virtual server in the cloud. Instances keep running until we stop or terminate them.

2. Instances: an instance is a virtual server in the cloud.

i. We can launch different types of instances from a single AMI.
ii. The instance type essentially determines the hardware of the host computer used for the instance (each instance type offers different compute and memory capabilities).
iii. Each AWS account has a limit on the number of instances that can run at once.
iv. Storage for your instance:
a. The root device for your instance contains the image used to boot the instance.
b. An instance may include local storage volumes (instance store), which can be configured at launch time with block device mapping.
c. To keep important data safe, we should use a replication strategy across multiple instances, or store persistent data in Amazon S3 or Amazon EBS volumes.

What to look for in EC2 monitoring?

Regardless of the configuration of our individual instances, we want to monitor basic system-level metrics to keep an eye on the health of our core infrastructure (that is, how well our resource capacity matches demand).

Available EC2 metrics fall into three categories:

  1. Disk I/O
  2. Network
  3. CPU

In addition to these EC2 metrics, we also have access to binary status checks (which report the health of our instances and of the AWS systems they are hosted on). We can also track scheduled events (such as stopping, retiring, or shutting down instances) that might affect our instances’ status or availability.

Amazon CloudWatch is the easiest way to see most resource metrics for our EC2 instances and other AWS services.

  1. By default, CloudWatch uses basic monitoring, which only publishes metrics at five-minute intervals. Where available, we can enable detailed monitoring, which increases the resolution to one minute at additional cost.
  2. Some metrics have nuances specific to EC2 instances.
  3. AWS separates most resources by region, so we can generally view CloudWatch metrics for only a single region at a time.
  4. Finally, CloudWatch does not expose metrics related to instance memory.
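To make the difference between basic and detailed monitoring concrete, here is a minimal sketch that builds the parameters for a CloudWatch `GetMetricStatistics` request. It does not call AWS. The function names and the instance ID are hypothetical, and the assumption is that the resulting dictionary would be passed to boto3 as `client.get_metric_statistics(**params)` on a CloudWatch client created for one specific region.

```python
from datetime import datetime, timedelta, timezone


def build_cpu_query(instance_id, detailed=False, hours=1, now=None):
    """Build keyword arguments for a CloudWatch GetMetricStatistics request."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        # Basic monitoring publishes every 300 s, detailed monitoring every 60 s.
        "Period": 60 if detailed else 300,
        "Statistics": ["Average"],
    }


def expected_datapoints(params):
    """Number of datapoints the time window can hold at the chosen period."""
    window = (params["EndTime"] - params["StartTime"]).total_seconds()
    return int(window // params["Period"])


# Hypothetical instance ID, used only for illustration.
params = build_cpu_query("i-0123456789abcdef0")
print(params["Period"], expected_datapoints(params))
```

With basic monitoring, an hour of data holds at most 12 datapoints; with detailed monitoring, the same hour holds up to 60.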

Now, let’s jump to the EC2 metric categories:-

1. Disk I/O metrics :-

a. Two primary kinds of block-level storage volumes can be attached to EC2 instances: EBS volumes and instance store (ephemeral) volumes. Instance store volumes are physically attached to the host computer the instance runs on, which means their performance is more predictable than that of EBS volumes, which might share hardware resources among multiple tenants. But all data on instance store volumes is lost when the instance is stopped or if the disk fails (hence “ephemeral”). EBS volumes, on the other hand, provide persistent storage. (Many instance types do not support instance store.)

b. Both EBS and instance store volumes can be solid-state drives (SSD) or hard disk drives (HDD). The number, capacity, and performance of these disks differ based on the instance type and volume configuration.

So, monitoring EC2 disk I/O can help us ensure that our chosen instance type’s disk IOPS and throughput match our application’s needs.

CloudWatch’s main EC2 disk I/O metrics only collect data from instance store volumes. (CloudWatch does offer EBS disk I/O metrics within the EC2 namespace, but these are only available for C5 and M5 instance types.) For other instance types, disk I/O on EBS volumes must be monitored via CloudWatch’s EBS metrics.

Disk read/write operations :-

Because any data stored on instance store volumes is lost if the instance stops or fails, these types of volumes are best suited for I/O-intensive uses such as buffers, caches, and other cases where data is stored temporarily and changes frequently. This metric pair can help determine if degraded performance is the result of consistently high IOPS, causing bottlenecks as disk requests become queued. If your instance store volumes are HDD, you can consider a move to faster SSDs. Or you can upgrade to an instance type with more volumes attached to it.

Disk read/write bytes:-

Monitoring the amount of data being written to and read from disk can help reveal application-level problems. Too much data being read from disk might indicate that your application would benefit from adding a caching layer. Prolonged higher-than-anticipated disk read or write levels could also mean request queuing and slowdowns if your disk speed is not fast enough to match your use case.
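CloudWatch reports disk operations and bytes as totals over each reporting period, so comparing them against a volume’s IOPS or throughput rating means dividing by the period length first. The sketch below shows that arithmetic; the function names and sample numbers are hypothetical, and it assumes the inputs are the `Sum` statistic of the read and write metrics for one period.

```python
def per_second(total, period_seconds):
    """Convert a per-period total into a per-second rate."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return total / period_seconds


def disk_summary(read_ops, write_ops, read_bytes, write_bytes, period_seconds=300):
    """Summarize one reporting period of disk activity."""
    ops = read_ops + write_ops
    data = read_bytes + write_bytes
    return {
        "iops": per_second(ops, period_seconds),
        "bytes_per_sec": per_second(data, period_seconds),
        # Average request size hints at whether the workload is IOPS-bound
        # (small requests) or throughput-bound (large requests).
        "avg_io_bytes": data / ops if ops else 0.0,
    }


# Illustrative numbers for one five-minute period.
summary = disk_summary(30_000, 60_000, 300_000_000, 600_000_000)
print(summary)
```

Consistently high IOPS with a small average request size points toward faster disks, while a large volume of reads may point toward a caching layer.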

2. Network metrics:-

Network metrics are particularly important for cloud-based services like EC2 that rely on consistently strong network connections and that might be dispersed across various availability zones. This is especially true if you have attached EBS volumes to your instances, as they are networked drives.

Instance types provide different limits for both network bandwidth and maximum transmission unit (MTU), the largest amount of data that can be sent in a single packet.

Bandwidth limits range from 5 to 25 Gbps. The network MTU is a standard 1,500 bytes for most instance types, but many allow jumbo frames of as much as 9,001 bytes, increasing efficiency and reducing overhead for applications that transmit large amounts of data.
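A rough calculation shows why jumbo frames reduce overhead. This sketch assumes 40 bytes of headers per packet (a 20-byte IPv4 header plus a 20-byte TCP header with no options), which is a simplification; the function names are hypothetical.

```python
import math


def packets_needed(payload_bytes, mtu, header_bytes=40):
    """Packets required to carry a payload, given the MTU and per-packet headers."""
    per_packet = mtu - header_bytes
    if per_packet <= 0:
        raise ValueError("MTU must be larger than the header size")
    return math.ceil(payload_bytes / per_packet)


def header_overhead(mtu, header_bytes=40):
    """Fraction of each full-sized packet that is spent on headers."""
    return header_bytes / mtu


for mtu in (1500, 9001):
    count = packets_needed(1_000_000, mtu)
    print(f"MTU {mtu}: {count} packets, {header_overhead(mtu):.2%} header overhead")
```

Under these assumptions, sending 1 MB takes 685 packets at the standard MTU and 112 packets with jumbo frames.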

Selecting the right type and availability zone for your instances can improve network performance, as can configuration options such as placement groups and enhanced networking.

In addition to measuring network throughput in bytes, CloudWatch provides metrics for packets sent and received. Note, though, that packet metrics are only available in basic monitoring, at five-minute resolution.

Network in/out:-

These metrics report the network throughput, in bytes, of your instances. Drops or fluctuations can be correlated with other application-level metrics to pinpoint possible issues. It’s unlikely that your instances will approach their network throughput limits unless they are severely mismatched with your application’s needs, but it can still be helpful to keep an eye on possible network saturation to make sure your instances meet demand, for example if you want to restore or back up large amounts of data quickly. Or, if certain instances are receiving considerably more network traffic than others, you may wish to use a load balancer to distribute traffic more evenly.
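To judge how close an instance is to saturation, the byte totals can be converted to gigabits per second and compared with the instance type’s bandwidth limit. The sketch below is only illustrative: the function names are hypothetical, the input is assumed to be the per-period byte total for one direction, and the 10 Gbps limit is an example value rather than a figure for any particular instance type.

```python
def throughput_gbps(total_bytes, period_seconds):
    """Convert a per-period byte total into gigabits per second."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return total_bytes * 8 / period_seconds / 1e9


def saturation(total_bytes, period_seconds, limit_gbps):
    """Fraction of the bandwidth limit used during the period."""
    return throughput_gbps(total_bytes, period_seconds) / limit_gbps


# Illustrative numbers: 37.5 GB received in one minute against a 10 Gbps limit.
used = saturation(37_500_000_000, 60, limit_gbps=10)
print(f"{used:.0%} of the bandwidth limit")
```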

3. CPU metrics:-

EC2 instance types have a wide range of vCPU configurations, so tracking CPU usage can help ensure that our instances are appropriately sized for the workload.

  1. Note that CloudWatch measures the percent utilization of the virtualized processing capacity of the instance (which AWS labels as compute units).
  2. It does not report the CPU usage of the underlying hardware being used.
  3. T2 instances are capable of bursting, or providing processing power above a standard baseline level for short periods of time. This is ideal for applications that are not generally CPU-intensive but may occasionally benefit from higher CPU capacity.

CPU utilization :-

CPU usage is one of the prime host-level metrics to monitor. Depending on the application, consistently high utilization levels may be normal. But if performance is degraded, and the application is not constrained by disk I/O, memory, or network resources, then maxed-out CPU may indicate a resource bottleneck or application performance problems. You can dive into application-level metrics or request traces to diagnose the cause of CPU saturation, or switch to an instance type with more vCPUs.
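Since consistently high utilization matters more than a brief spike, one simple approach is to flag an instance only when several consecutive datapoints exceed a threshold. This is a minimal sketch of that idea, not CloudWatch’s own alarm logic; the function name, the 90 percent threshold, and the three-period window are all assumptions.

```python
def sustained_high_cpu(datapoints, threshold=90.0, periods=3):
    """Return True if the most recent `periods` datapoints all exceed `threshold`."""
    if len(datapoints) < periods:
        return False  # not enough data to judge yet
    return all(value > threshold for value in datapoints[-periods:])


# CPU utilization percentages, oldest first (illustrative values).
print(sustained_high_cpu([50.0, 95.0, 96.0, 97.0]))
print(sustained_high_cpu([95.0, 96.0, 50.0]))
```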

For T2 instances with bursting, the increased processing power comes at the cost of CPU credits. EC2’s CPU credit metrics help keep track of your available balance and usage so that you are aware of possible charges as a result of extended bursting.

CPU credit balance:-

For standard T2 instances with bursting, a burst can continue only as long as there are available CPU credits, so it’s important to monitor your instance’s balance. Credits are earned any time the instance is running below its baseline CPU performance level. The initial balance, accrual rate, and maximum possible balance are all dependent on the instance level.

CPU credit usage:-

One CPU credit is equivalent to one minute of 100 percent CPU utilization (or two minutes at 50 percent, etc.). Whenever an instance requires CPU performance above that instance type’s baseline, it will burst, consuming CPU credits until the demand lessens or the credit balance runs out. Keeping an eye on your instances’ credit usage will help you identify whether you might need to switch to an instance type that is optimized for CPU-intensive workloads. Or, you can create an alert for when your credit balance drops below a threshold while CPU usage remains above baseline.
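The credit arithmetic above can be sketched in a few lines. The function names are hypothetical, and the earn rate of 6 credits per hour is an example value; the actual rate depends on the instance size.

```python
def credits_used(utilization_pct, minutes, vcpus=1):
    """Credits consumed: one credit is one vCPU at 100 percent for one minute."""
    return vcpus * (utilization_pct / 100.0) * minutes


def minutes_until_exhausted(balance, utilization_pct, earn_per_hour, vcpus=1):
    """Minutes until the balance runs out, or None if it is not draining."""
    spend_per_minute = credits_used(utilization_pct, 1, vcpus)
    earn_per_minute = earn_per_hour / 60.0
    net_drain = spend_per_minute - earn_per_minute
    if net_drain <= 0:
        return None  # usage is at or below the rate at which credits are earned
    return balance / net_drain


print(credits_used(100, 1))  # one minute at 100 percent
print(credits_used(50, 2))   # two minutes at 50 percent
print(minutes_until_exhausted(balance=30, utilization_pct=100, earn_per_hour=6))
```

An alert on the estimated time to exhaustion can be more useful than one on the raw balance, because it accounts for how quickly credits are being spent.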

CPU surplus credit balance

In the case of T2 Unlimited instances, if the CPU credit balance is exhausted but burst performance is still required, the instance will consume additional credits to maintain greater CPU usage. This metric tracks the accumulated balance.

CPU surplus credits charged

This metric tracks the difference between the number of credits accumulated and the current credit balance that can be used to pay down the surplus balance. In other words, it is a measure of extra credits that will result in additional charges.

Status checks

EC2 status checks are, simply, checks on the status of an individual instance and of the AWS systems hosting it. Status checks are available at one-minute intervals. They provide a clear, high-level indication of an instance’s health and whether there is a problem with either the larger AWS infrastructure or with the software or network configuration of the instance itself.

Metric to watch: Status check failed — system

This status check reports whether there are problems detected with the system hosting the instance. Generally these are problems with the Amazon-administered computer on which your instance is hosted and are outside of your control — as an example, power loss. Possible resolutions include stopping and restarting an instance to switch it to a new host computer. (Keep in mind that instance store–backed volumes will be lost if the instance is stopped.) This check returns False (0) if an instance passes the system status check, and True (1) if it fails.

Metric to watch: Status check failed — instance

This check reports whether there are any problems detected with the instance itself and returns False (0) if an instance passes the status check and True (1) if it fails. Problems that might cause this check to fail include software or network configuration issues, a corrupted file system, etc. Amazon’s troubleshooting tips offer causes and possible solutions for common errors that result in a failed status check.
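Because both checks are binary, combining them gives a quick first diagnosis. The sketch below maps the two values to a label and to the kind of response described above; the labels, the function names, and the action text are hypothetical.

```python
def diagnose(system_failed, instance_failed):
    """Map the two status-check values (0 = passed, 1 = failed) to a label."""
    if system_failed and instance_failed:
        return "both"
    if system_failed:
        return "host-problem"
    if instance_failed:
        return "instance-problem"
    return "healthy"


SUGGESTED_ACTION = {
    "healthy": "No action needed.",
    "host-problem": "Stop and start the instance to move it to a new host.",
    "instance-problem": "Check software, network configuration, and the file system.",
    "both": "Address the host problem first, then recheck the instance.",
}


def suggest(system_failed, instance_failed):
    """Return a suggested next step for the given status-check values."""
    return SUGGESTED_ACTION[diagnose(system_failed, instance_failed)]


print(suggest(1, 0))
```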

Events

Events are scheduled changes in an instance’s lifecycle. AWS may initiate events if problems are detected or if standard maintenance is required on an instance’s host computer.

Events include:

  • Stopping an instance. This is only applicable to EBS-backed instances, which retain their data and can be restarted. If restarted, the instance will be hosted on a new computer.
  • Retiring an instance. This will terminate the instance and delete any attached volumes.
  • Rebooting either the instance (again, only applicable to EBS-backed instances) or the host computer.
  • System maintenance, possibly affecting the instance’s performance or availability.

AWS will inform users if an event has been scheduled for their instances. But you can also use CloudWatch’s events stream to track events and monitor upcoming changes to your EC2 infrastructure that might degrade performance or affect data availability. This is particularly important for any instance store volumes — even if they are connected to an EBS-backed instance — as all data stored on those volumes is lost. Keeping an eye on your EC2 events will help you determine if you need to migrate data to a new instance before the current one is terminated or stopped.
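As a sketch of how scheduled events might be triaged, the function below picks out events that would cost instance store data and that start within a chosen window. It assumes each event is a dictionary with `Code` and `NotBefore` keys, similar in shape to the event entries in an EC2 `DescribeInstanceStatus` response; the function name, the seven-day window, and the sample events are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Event codes after which instance store data does not survive.
DATA_LOSS_CODES = {"instance-stop", "instance-retirement"}


def urgent_events(events, now, horizon=timedelta(days=7)):
    """Return data-loss events starting within `horizon`, soonest first."""
    urgent = [
        event for event in events
        if event["Code"] in DATA_LOSS_CODES
        and now <= event["NotBefore"] <= now + horizon
    ]
    return sorted(urgent, key=lambda event: event["NotBefore"])


now = datetime(2019, 10, 14, tzinfo=timezone.utc)
sample = [
    {"Code": "system-maintenance", "NotBefore": now + timedelta(days=2)},
    {"Code": "instance-retirement", "NotBefore": now + timedelta(days=5)},
    {"Code": "instance-stop", "NotBefore": now + timedelta(days=1)},
    {"Code": "instance-stop", "NotBefore": now + timedelta(days=30)},
]
for event in urgent_events(sample, now):
    print(event["Code"], event["NotBefore"].date())
```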

Hope we learned something about EC2 and how to monitor it.

Thanks for reading!
