EC2 Storage — EBS, EFS, and Instance Store fundamentals
Chapter 6: EBS, EFS, and Instance Store Fundamentals for the AWS Solutions Architect Associate Certification
In the EC2 part of the course, we have already seen how EC2 works at a general level and how it scales with Load Balancers and Auto Scaling Groups, but we still need to know how to store information for EC2 instances. That is what this chapter covers, using Amazon Elastic Block Store (EBS), Amazon Elastic File System (EFS), and Instance Store. Let’s get started!
- Amazon Elastic Block Store (EBS)
- EBS Types
- EBS Snapshots
- RAID Options
- Instance Store
- Elastic File System (EFS)
- Typical Exam Questions
Amazon Elastic Block Store (EBS)
Amazon EBS is an easy-to-use, high-performance block-storage service for Amazon Elastic Compute Cloud (EC2), suited to both throughput-intensive and transaction-intensive workloads at any scale. An EBS volume is a network drive (not a physical one) that you attach to an EC2 instance so that its data persists. A volume is locked to a specific Availability Zone, and you pay for the capacity you provision, so a good approach is to start with a small amount and grow it if necessary.
Amazon EBS uses AWS KMS keys when creating encrypted volumes and snapshots. We’ll see AWS KMS later in this course; for now, all we need to know is that once encryption is enabled on a volume, EBS encrypts and decrypts the data for us transparently.
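As a minimal sketch of the points above, the helper below assembles the arguments for the `create_volume` call of the boto3 EC2 client. The function name `build_volume_request`, the example AZ, and the default values are illustrative assumptions and not part of the course. It only builds a dictionary, so it runs without AWS credentials.

```python
from typing import Any, Dict, Optional


def build_volume_request(
    availability_zone: str,
    size_gib: int,
    volume_type: str = "gp3",
    kms_key_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for an encrypted EBS volume (hypothetical helper)."""
    if size_gib < 1:
        raise ValueError("size_gib must be at least 1")
    params: Dict[str, Any] = {
        # A volume lives in exactly one Availability Zone.
        "AvailabilityZone": availability_zone,
        # You pay for the provisioned size, so start small.
        "Size": size_gib,
        "VolumeType": volume_type,
        # Encryption relies on a KMS key; without KmsKeyId the default key is used.
        "Encrypted": True,
    }
    if kms_key_id is not None:
        params["KmsKeyId"] = kms_key_id
    return params


request = build_volume_request("eu-west-1a", 20)
print(request)
```

With credentials configured, the result could be passed on as `boto3.client("ec2").create_volume(**request)`.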
EBS TYPES
- GP2/GP3 SSD → General Purpose SSD volumes. They balance price and performance, so they are a good default for most workloads.
- IO1/IO2 SSD → Provisioned IOPS SSD volumes. Highest performance, designed for critical, IOPS-intensive, and throughput-intensive workloads that require low latency. They are the most expensive type. They support EBS Multi-Attach, as we’ll see later.
- ST1 HDD → Throughput Optimized HDD. Ideal for frequently accessed, throughput-intensive workloads with large datasets and large I/O sizes, such as MapReduce, Kafka, log processing, data warehouse, and ETL workloads.
- SC1 HDD → Cold HDD. Lowest cost per GB of all EBS volume types. Used when a lot of infrequently accessed information has to be stored at the lowest price.
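The list above can be read as a small decision table. The sketch below encodes it in Python; `suggest_ebs_type` and its three boolean inputs are hypothetical simplifications for study purposes, and a real decision would compare concrete IOPS, throughput, and price figures.

```python
def suggest_ebs_type(
    needs_high_iops: bool,
    throughput_heavy: bool,
    frequently_accessed: bool,
) -> str:
    """Map rough workload traits to an EBS volume family (study aid only)."""
    if needs_high_iops:
        # Critical, low-latency, IOPS-intensive workloads.
        return "io1/io2"
    if throughput_heavy and frequently_accessed:
        # Large sequential I/O such as log processing or ETL.
        return "st1"
    if not frequently_accessed:
        # Lots of rarely read data at the lowest price per GB.
        return "sc1"
    # Balanced price and performance.
    return "gp2/gp3"


print(suggest_ebs_type(False, True, True))
```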
EBS SNAPSHOTS
You can back up the data on your Amazon EBS volumes to Amazon S3 by taking point-in-time snapshots. Snapshots are incremental backups, meaning only the blocks on the device that have changed after your most recent snapshot are saved.
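To make "incremental" concrete, here is a toy model in Python. It treats a volume as a dictionary of numbered blocks and is only an illustration of the idea, not how EBS stores snapshots internally; the names `take_incremental_snapshot` and `restore` are made up for this sketch, and deleted blocks are ignored for simplicity.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]


def take_incremental_snapshot(volume: Blocks, previous_state: Blocks) -> Blocks:
    """Keep only the blocks that changed since the previous state."""
    return {
        index: data
        for index, data in volume.items()
        if previous_state.get(index) != data
    }


def restore(snapshots: List[Blocks]) -> Blocks:
    """Rebuild the volume by replaying snapshots from oldest to newest."""
    state: Blocks = {}
    for snapshot in snapshots:
        state.update(snapshot)
    return state


volume = {0: b"boot", 1: b"logs-v1", 2: b"data"}
first = take_incremental_snapshot(volume, {})  # first snapshot saves every block
volume[1] = b"logs-v2"  # only block 1 changes afterwards
second = take_incremental_snapshot(volume, restore([first]))
print(len(first), len(second))
```

The first snapshot holds all three blocks, while the second holds only the block that changed.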
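If we want to migrate an EBS volume between AZs or Regions, we need to use snapshots, because a volume is tied to its AZ, while a snapshot can be restored in another AZ or copied to another Region. To copy an EBS volume to another Region:
- Take a snapshot of the EBS volume.
- Copy the snapshot to the destination Region (to move to another AZ in the same Region, skip this step).
- Create a volume from the snapshot in the AZ you need.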
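The migration steps above can be sketched with the boto3 EC2 client operations `create_snapshot`, `copy_snapshot`, `create_volume`, and the `snapshot_completed` waiter. Copying the snapshot into the destination Region is what makes the restore there possible. The function `copy_volume_to_region` and the `DryRunEC2` stand-in are assumptions made for this example; the stand-in only records calls so the sketch runs without AWS credentials.

```python
from typing import Any, Dict, List


def copy_volume_to_region(
    source_ec2: Any,
    dest_ec2: Any,
    volume_id: str,
    source_region: str,
    dest_az: str,
) -> str:
    """Snapshot a volume, copy the snapshot to another Region, restore it there."""
    # 1. Point-in-time snapshot in the source Region.
    snapshot_id = source_ec2.create_snapshot(VolumeId=volume_id)["SnapshotId"]
    source_ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # 2. The copy is requested from the destination Region's client.
    copy_id = dest_ec2.copy_snapshot(
        SourceRegion=source_region, SourceSnapshotId=snapshot_id
    )["SnapshotId"]
    dest_ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[copy_id])

    # 3. New volume from the copied snapshot, in the AZ we need.
    volume = dest_ec2.create_volume(SnapshotId=copy_id, AvailabilityZone=dest_az)
    return volume["VolumeId"]


class DryRunEC2:
    """Hypothetical stand-in for a boto3 EC2 client; records calls only."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def create_snapshot(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append("create_snapshot")
        return {"SnapshotId": "snap-source"}

    def copy_snapshot(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append("copy_snapshot")
        return {"SnapshotId": "snap-copy"}

    def create_volume(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append("create_volume")
        return {"VolumeId": "vol-new"}

    def get_waiter(self, name: str) -> "DryRunEC2":
        return self  # the stand-in doubles as its own waiter

    def wait(self, **kwargs: Any) -> None:
        return None


source, dest = DryRunEC2(), DryRunEC2()
new_volume = copy_volume_to_region(source, dest, "vol-original", "us-east-1", "eu-west-1a")
print(new_volume, source.calls, dest.calls)
```

In a real run, `source` and `dest` would be `boto3.client("ec2", region_name=...)` objects for the two Regions.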
You can also schedule snapshots with Amazon Data Lifecycle Manager. For example, we could schedule a snapshot at 7:00 pm of all the EBS volumes that carry a specific tag.
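As a rough sketch of that example, the helper below builds a policy document that targets volumes by tag and snapshots them once a day at 19:00. The layout follows my understanding of the `PolicyDetails` structure used by Data Lifecycle Manager and should be checked against the current documentation. The function name, schedule name, tag, and retention count are assumptions for illustration. Note that the schedule time is expressed in UTC.

```python
import time
from typing import Any, Dict


def build_snapshot_policy_details(
    tag_key: str,
    tag_value: str,
    utc_time: str = "19:00",
    snapshots_to_keep: int = 7,
) -> Dict[str, Any]:
    """Build a daily snapshot policy for tagged volumes (hypothetical helper)."""
    time.strptime(utc_time, "%H:%M")  # raises ValueError on a malformed time
    return {
        "ResourceTypes": ["VOLUME"],
        # Only volumes carrying this tag are snapshotted.
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [
            {
                "Name": "daily-7pm",
                "CreateRule": {
                    "Interval": 24,
                    "IntervalUnit": "HOURS",
                    "Times": [utc_time],
                },
                # Older snapshots beyond this count are removed.
                "RetainRule": {"Count": snapshots_to_keep},
            }
        ],
    }


details = build_snapshot_policy_details("Backup", "true")
print(details["Schedules"][0]["CreateRule"])
```

The resulting dictionary would be passed as `PolicyDetails` to `create_lifecycle_policy` on the boto3 `dlm` client, together with an execution role, a description, and a state.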
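We cannot encrypt an existing unencrypted volume or snapshot in place. To get an encrypted volume from an unencrypted one, follow these steps:
- Create a snapshot of the original volume (it will be unencrypted).
- Copy the snapshot with encryption enabled.
- Create a volume from the encrypted copy.
Alternatively, you can enable encryption at the moment you create the new volume from the unencrypted snapshot.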
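One way to obtain the encrypted snapshot in the steps above is to copy an existing snapshot with encryption turned on. The sketch below only assembles the arguments for the `copy_snapshot` call of the boto3 EC2 client; `build_encrypted_copy_request` and the example IDs are made up for illustration.

```python
from typing import Any, Dict, Optional


def build_encrypted_copy_request(
    snapshot_id: str,
    region: str,
    kms_key_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Arguments for copying a snapshot into an encrypted one (hypothetical helper)."""
    params: Dict[str, Any] = {
        "SourceSnapshotId": snapshot_id,
        "SourceRegion": region,
        # The copy is encrypted even if the source snapshot is not.
        "Encrypted": True,
    }
    if kms_key_id is not None:
        # Without this, the default KMS key for EBS is used.
        params["KmsKeyId"] = kms_key_id
    return params


copy_request = build_encrypted_copy_request("snap-0123456789abcdef0", "eu-west-1")
print(copy_request)
```

A volume created from the resulting snapshot is encrypted as well.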
RAID OPTIONS
Sometimes a single EBS volume is not enough, for example because you need more IOPS or more storage than one volume provides. That’s what RAID options are for.
- RAID 0 → Used when I/O performance is more important than fault tolerance. You increase performance by striping data across several EBS volumes that act as one. We gain performance but lose fault tolerance: if any one volume fails, the data in the whole array is lost.
- RAID 1 → Used when fault tolerance is more critical than I/O performance. You write the same data to several EBS volumes simultaneously instead of just one. We lose some performance since every write has to be made on each volume, but we gain fault tolerance in exchange.
RAID arrays are created from the operating system, not from the AWS console.
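The trade-off between the two RAID levels can be illustrated with a small in-memory model. This is a teaching sketch, not a way to build RAID on EC2 (that is done with operating system tools); each "volume" is a Python list, `None` stands for a failed volume, and all function names are invented for the example.

```python
from typing import List, Optional, Sequence

Volume = Optional[List[bytes]]  # None models a failed volume


def raid0_write(chunks: Sequence[bytes], n_volumes: int) -> List[Volume]:
    """Stripe chunks round-robin across the volumes."""
    volumes: List[List[bytes]] = [[] for _ in range(n_volumes)]
    for i, chunk in enumerate(chunks):
        volumes[i % n_volumes].append(chunk)
    return list(volumes)


def raid0_read(volumes: Sequence[Volume]) -> List[bytes]:
    """Reassemble the stripes; any failed volume makes the array unreadable."""
    if any(v is None for v in volumes):
        raise IOError("RAID 0: one volume failed, the whole array is lost")
    healthy = [v for v in volumes if v is not None]
    result: List[bytes] = []
    for row in range(max(len(v) for v in healthy)):
        for v in healthy:
            if row < len(v):
                result.append(v[row])
    return result


def raid1_write(chunks: Sequence[bytes], n_volumes: int) -> List[Volume]:
    """Write a full copy of the data to every volume."""
    return [list(chunks) for _ in range(n_volumes)]


def raid1_read(volumes: Sequence[Volume]) -> List[bytes]:
    """Read from the first surviving mirror."""
    for v in volumes:
        if v is not None:
            return list(v)
    raise IOError("RAID 1: every mirror failed")


data = [b"a", b"b", b"c", b"d", b"e"]
striped = raid0_write(data, 2)
mirrored = raid1_write(data, 2)
mirrored[0] = None  # one mirror fails, the data is still readable
print(raid0_read(striped) == data, raid1_read(mirrored) == data)
```

Setting one entry of `striped` to `None` and calling `raid0_read` raises an error, which is the loss of fault tolerance described for RAID 0.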
EBS Multi-Attach
Amazon EBS Multi-Attach enables you to attach a single Provisioned IOPS SSD (io1 or io2) volume to multiple EC2 instances in the same AZ. No other volume type supports it.
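The two conditions in that rule (volume type and Availability Zone) can be written as a quick check. `can_multi_attach` is a hypothetical helper that only encodes the rule as stated here; it does not query AWS.

```python
from typing import Sequence


def can_multi_attach(volume_type: str, volume_az: str, instance_azs: Sequence[str]) -> bool:
    """True when the volume is io1/io2 and every instance is in the volume's AZ."""
    is_provisioned_iops = volume_type in {"io1", "io2"}
    same_az = all(az == volume_az for az in instance_azs)
    return is_provisioned_iops and same_az


print(can_multi_attach("io2", "eu-west-1a", ["eu-west-1a", "eu-west-1a"]))
```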
INSTANCE STORE
An instance store provides temporary block-level storage for your instance. This storage is located on disks physically attached to the host computer. This last part is essential: EBS volumes are network drives, whereas the instance store is physically attached to the host, which makes its IOPS extremely high. What’s the problem? The storage is temporary. The data survives a reboot, but it is lost when the instance is stopped, hibernated, or terminated, or when the underlying disk fails.
If the exam asks you to choose between EBS and Instance Store, ask yourself whether it matters if the data is lost. If it doesn’t, and the question mentions high performance or a lot of IOPS, the answer will be Instance Store.
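That exam heuristic fits in a few lines of Python. The function `choose_block_storage` is an invented study aid; defaulting to EBS when neither condition applies is an assumption of this sketch, not something the exam guarantees.

```python
def choose_block_storage(data_must_persist: bool, needs_very_high_iops: bool) -> str:
    """Apply the EBS vs Instance Store rule of thumb."""
    if data_must_persist:
        # Instance store data does not outlive the instance.
        return "EBS"
    if needs_very_high_iops:
        # Disposable data plus very high IOPS points to instance store.
        return "Instance Store"
    # Nothing forces instance store, so fall back to EBS (assumed default).
    return "EBS"


print(choose_block_storage(data_must_persist=False, needs_very_high_iops=True))
```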
ELASTIC FILE SYSTEM (EFS)
Elastic File System lets you share file data without provisioning or managing storage. We can use it with AWS Cloud services and on-premises resources. Amazon EFS is designed to provide massively parallel shared access to thousands of Amazon EC2 instances, AWS containers, and serverless compute services, reaching up to 10 GB/s of throughput.
The main difference between EBS and EFS is that an EBS volume lives in a single AZ and is normally attached to one EC2 instance in that AZ, whereas an EFS file system can be mounted by many instances across multiple AZs at the same time.
One significant thing about EFS is that it only works with Linux instances.
To support a wide variety of cloud storage workloads, Amazon EFS offers two performance modes:
- General Purpose → Recommended for the majority of your Amazon EFS file systems. General Purpose is ideal for latency-sensitive use cases, like web serving environments, content management systems, home directories, and general file serving.
- Max I/O → It can scale to higher levels of aggregate throughput and operations per second. It is used for highly parallelized applications and workloads, such as big data analysis, media processing, and genomic analysis.
EFS storage classes:
- Standard → Frequently accessed files.
- Standard–IA (EFS-IA) → Reduces storage costs for files that are not accessed every day.
- There are also One Zone storage classes, which keep data in a single Availability Zone: EFS One Zone and EFS One Zone Infrequent Access.
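To tie the performance modes and storage classes together, the sketch below builds arguments for two boto3 EFS operations: `create_file_system` (performance mode, optional single AZ) and `put_lifecycle_configuration` (moving files to Infrequent Access). The helper names are hypothetical, and the set of allowed day counts is a deliberately short list of values I am confident about rather than the full set AWS offers.

```python
from typing import Any, Dict, List, Optional

_IA_DAY_OPTIONS = (7, 14, 30, 60, 90)


def build_file_system_request(
    creation_token: str,
    highly_parallel: bool = False,
    one_zone_az: Optional[str] = None,
) -> Dict[str, Any]:
    """Arguments for create_file_system (hypothetical helper)."""
    if highly_parallel and one_zone_az is not None:
        # AWS documents Max I/O as unavailable for One Zone file systems.
        raise ValueError("Max I/O is not available for One Zone file systems")
    params: Dict[str, Any] = {
        "CreationToken": creation_token,
        "PerformanceMode": "maxIO" if highly_parallel else "generalPurpose",
    }
    if one_zone_az is not None:
        # Naming an AZ keeps the data in a single Availability Zone.
        params["AvailabilityZoneName"] = one_zone_az
    return params


def build_ia_lifecycle(days_without_access: int = 30) -> List[Dict[str, str]]:
    """Lifecycle policy that moves idle files to the Infrequent Access class."""
    if days_without_access not in _IA_DAY_OPTIONS:
        raise ValueError(f"choose one of {_IA_DAY_OPTIONS}")
    return [{"TransitionToIA": f"AFTER_{days_without_access}_DAYS"}]


print(build_file_system_request("shared-data", highly_parallel=True))
print(build_ia_lifecycle(30))
```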
And that’s it! See you in the next chapter, where we’ll probably see the most famous AWS service!
TYPICAL EXAM QUESTIONS
A High-Performance Computing (HPC) application needs to provide 135,000 IOPS. The storage layer is replicated across all instances in a cluster. What is the most optimal and cost-effective storage solution that provides the required performance?
- Use Amazon EBS Provisioned IOPS volume with 135,000 IOPS.
- Use Amazon Instance Store.
- Use Amazon S3 with byte-range fetch.
- Use Amazon EC2 Enhanced Networking with an EBS HDD Throughput Optimized volume.
Solution: 2. Amazon EC2 instance store provides temporary block-level storage for instances. This storage is located on disks physically attached to the host computer, and data is lost if the instance is stopped or fails. It can provide very high IOPS (input/output operations per second) compared to EBS volumes. It also provides low latency, ideal for workloads requiring millions of transactions per second. However, it’s essential to remember that the data on Instance Store is ephemeral.
As we saw in the table from the previous section, none of the EBS volume types listed there provides 135,000 IOPS. If we don’t mind losing the data, or we implement a process to copy it to a different service such as Amazon S3, we can use Instance Store.
We want to launch an Amazon EC2 instance with multiple attached volumes by modifying the block device mapping. Which block device can be specified in a block device mapping to be used with an EC2 instance? (Select TWO)
- EBS volume.
- EFS volume.
- Instance store volume.
- Snapshot.
- S3 bucket.
Solution: 1, 3. Amazon Elastic Block Store (EBS) provides block-level storage volumes for Amazon EC2 instances. You can attach an EBS volume to any running instance in the same Availability Zone, and the storage persists independently of the instance.
Instance store provides temporary block-level storage, and the data persists only during the life of the associated Amazon EC2 instance; it survives a reboot, but if you stop, hibernate, or terminate the instance, all data on the instance store volume is lost.
EFS is not block-level storage and can’t be specified in a block device mapping. Amazon S3 is an object storage service, so this option is also incorrect. A snapshot is not a block device either; a mapping can only reference one as the source for an EBS volume.
A single volume requires 500 GiB in size and needs to support 20,000 IOPS. What EBS volume type should be selected?
- EBS General Purpose SSD
- EBS Provisioned IOPS SSD
- EBS General Purpose SSD in a RAID 1 configuration
- EBS Throughput Optimized HDD
Solution: 2. Amazon EBS Provisioned IOPS SSD volumes are designed to meet the needs of I/O-intensive workloads that require low latency and consistent performance. They offer a defined level of IOPS that you can provision with the volume, up to a maximum of 64,000 IOPS per volume. The other options are below this number.
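A quick calculation shows why a 500 GiB gp2 volume falls short. The sketch assumes the published gp2 baseline rule of 3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000; check the current EBS documentation before relying on these figures. RAID 1 only mirrors the data, so it does not raise the number.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 per GiB, clamped to 100..16,000 (assumed rule)."""
    return min(max(100, 3 * size_gib), 16_000)


required_iops = 20_000
baseline = gp2_baseline_iops(500)
print(baseline, baseline >= required_iops)
```

For 500 GiB the baseline is 1,500 IOPS, far below the 20,000 the question asks for.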
A company is deploying a big data and analytics workload from thousands of EC2 instances across multiple AZs. We must store the data on a shared storage layer that can be mounted and accessed concurrently by all EC2 instances. Extremely high throughput is required. What storage layer would be most suitable for this requirement?
- Amazon EFS in General Purpose mode.
- Amazon EBS PIOPS.
- Amazon EFS in Max I/O mode.
- Amazon S3.
Solution: 3. For use cases requiring high levels of throughput from many EC2 instances, you should use Amazon EFS in Max I/O mode, as it’s optimized to provide the highest possible throughput.
Thanks for Reading!