Ephemeral Volume emptyDir backed on EC2 Instance Store NVMe

Sandeep Kadyan
6 min read · May 11, 2024


One of the key features of Kubernetes is its ability to manage storage volumes efficiently. In this blog post, we’ll explore how to back the /tmp path of application pods with EC2 NVMe Instance Store volumes via the emptyDir volume type. This setup can significantly boost efficiency for workloads with intensive local I/O. For example, a data platform running on AWS EKS that deals with millions of small data files commonly needs to process or manipulate those files locally before passing them on. Let’s dive in.

Understanding NVMe Instance Store

NVMe (Non-Volatile Memory Express) is a protocol designed specifically for accessing storage media via the PCIe bus. AWS EC2 instances support NVMe Instance Store Volumes, which are high-performance, low-latency storage options ideal for temporary data storage and caching. The following command provides a list of instance types supporting the instance store, along with storage information.

aws ec2 describe-instance-types \
--filters "Name=instance-type,Values=*" \
"Name=instance-storage-supported,Values=true" \
"Name=instance-storage-info.nvme-support,Values=required" \
--query "InstanceTypes[].[InstanceType, InstanceStorageInfo.TotalSizeInGB, InstanceStorageInfo.Disks[0].Count]" \
--output table

Below is example output for all instance types starting with ‘r’ (i.e., with the filter changed to "Name=instance-type,Values=r*"):

------------------------------------------
|          DescribeInstanceTypes         |
+----------------+---------------+-------+
|  InstanceType  | TotalSizeInGB | Disks |
+----------------+---------------+-------+
| r5dn.8xlarge | 1200 | 2 |
| r5ad.8xlarge | 1200 | 2 |
| r5ad.24xlarge | 3600 | 4 |
| r5ad.12xlarge | 1800 | 2 |
| r5d.4xlarge | 600 | 2 |
| r5dn.metal | 3600 | 4 |
| r5d.2xlarge | 300 | 1 |
| r5d.16xlarge | 2400 | 4 |
| r5dn.xlarge | 150 | 1 |
| r5d.xlarge | 150 | 1 |
| r5d.large | 75 | 1 |
| r5dn.large | 75 | 1 |
| r5ad.large | 75 | 1 |
| r5d.24xlarge | 3600 | 4 |
| r5ad.16xlarge | 2400 | 4 |
| r5dn.12xlarge | 1800 | 2 |
| r5ad.2xlarge | 300 | 1 |
| r5d.8xlarge | 1200 | 2 |
| r5dn.16xlarge | 2400 | 4 |
| r5d.metal | 3600 | 4 |
| r5ad.xlarge | 150 | 1 |
| r5dn.2xlarge | 300 | 1 |
| r5ad.4xlarge | 600 | 2 |
| r5dn.4xlarge | 600 | 2 |
| r5d.12xlarge | 1800 | 2 |
| r5dn.24xlarge | 3600 | 4 |
+----------------+--------+----+

EC2 Instance Store Pricing: Instance store volumes are included as part of the instance’s usage cost.

The NVMe instance store provides temporary block-level storage located on disks that are physically attached to the host computer (where the EC2 instance runs). An instance store consists of one or more instance store volumes exposed as block devices. The number of available devices and their storage sizes depend on the EC2 instance type.

Source: https://docs.aws.amazon.com/images/AWSEC2/latest/UserGuide/images/instance_storage.png

These devices are ephemeral and are identified by labels such as ephemeral0, ephemeral1 (or nvme1n1, nvme2n1), and so on.
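To see what this looks like on a node, you can inspect the block devices directly (a quick sketch; device names vary by instance type):

```shell
# List block devices; on an instance-store-backed node the NVMe disks show
# up as nvme1n1, nvme2n1, ... (nvme0n1 is typically the EBS root volume).
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT

# The NVMe model string distinguishes instance store from EBS volumes
# ("Amazon EC2 NVMe Instance Storage" vs "Amazon Elastic Block Store").
cat /sys/class/nvme/nvme*/model 2>/dev/null || true
```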

Understanding emptyDir

emptyDir: empty at Pod startup, with storage coming locally from the kubelet base directory (usually the root disk) or RAM

In Kubernetes, an emptyDir volume is created and managed by Kubernetes itself. It exists for as long as the Pod that uses it exists. The emptyDir volume is useful for storing temporary data that needs to be shared between containers within the same Pod.

The default path on EKS EC2 nodes for storing pod-specific data files, including the storage backend for emptyDir volumes, is /var/lib/kubelet/pods/.
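As a side note, emptyDir also supports an optional sizeLimit, and a medium: Memory setting for RAM backing (both are standard Kubernetes API fields; the fragment below is illustrative only):

```yaml
volumes:
- name: scratch
  emptyDir:
    sizeLimit: 10Gi   # the pod is evicted if usage exceeds this limit
# or, for RAM-backed (tmpfs) storage:
- name: ram-scratch
  emptyDir:
    medium: Memory
```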

Mapping NVMe Instance Store Volume to emptyDir

By default, block storage devices are not usable and require formatting and mounting. To accomplish this, we need two prerequisite command-line tools:

  • nvme-cli: Used to query the available NVMe devices.
  • mdadm: A software RAID utility.

After formatting the devices, we can mount them to a local path visible on the EKS nodes using the standard Linux mount command.


Below is a shell script that performs these steps when an EC2 instance launches. It can be set as user data for the EKS nodes; how you set user data depends on how the EKS cluster is provisioned.

#!/bin/bash
#
# Install NVMe CLI, Software RAID Utility
#
yum install nvme-cli mdadm -y

#
# Get a list of instance-store NVMe drives. If none found, do not fail.
#
nvme_drives=$(nvme list | grep "Amazon EC2 NVMe Instance Storage" | cut -d " " -f 1 || true)

#
# Build an array of the device names
#
readarray -t nvme_drives <<< "$nvme_drives"

#
# Get the number of drives available; exit early if there are none.
#
num_drives=${#nvme_drives[@]}
if [ "$num_drives" -eq 0 ] || [ -z "${nvme_drives[0]}" ]; then
  echo "No NVMe instance store drives found; skipping RAID setup."
  exit 0
fi

# Create a RAID-0 array across the instance store NVMe SSDs
mdadm --create /dev/md0 --level=0 --name=md0 --raid-devices=$num_drives "${nvme_drives[@]}"

# Format the array with ext4
# Please note: THIS IS JUST AN EXAMPLE. You may want advanced formatting
# options. Refer to the mkfs.ext4 documentation:
#
# https://linux.die.net/man/8/mkfs.ext4
#
mkfs.ext4 /dev/md0

# Create a filesystem path to mount the disk.
# The location is CRITICAL here: it is the root path used by the kubelet
# to host the scratch directories requested by Pods.
mount_location="/var/lib/kubelet/pods"
mkdir -p $mount_location

#
# Finally, mount the RAID device
mount /dev/md0 $mount_location

#
# Have disk be mounted on reboot
#
mdadm --detail --scan >> /etc/mdadm.conf
echo /dev/md0 $mount_location ext4 defaults,noatime 0 2 >> /etc/fstab

Once these configurations are in place, you’re all set!
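To confirm everything worked, you can check the array and the mount on a node (a sketch; run on the node itself, e.g. via SSM or SSH):

```shell
# Inspect the RAID-0 array and the kubelet mount created by the user-data
# script. Guarded so it degrades gracefully on hosts without /dev/md0.
if [ -b /dev/md0 ]; then
    mdadm --detail /dev/md0
    df -h /var/lib/kubelet/pods
else
    echo "no /dev/md0 array present on this host"
fi
```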

How to use it

Below is a minimal pod specification example demonstrating how to mount the /tmp path in any pod to an emptyDir volume.

apiVersion: v1
kind: Pod
metadata:
  name: minimal-pod
  namespace: default
spec:
  volumes:
  - name: tmp-volume     # tmp volume
    emptyDir: {}         # emptyDir
  containers:
  - name: minimal-container
    image: alpine:latest
    command: ["sleep"]
    args: ["infinity"]
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp    # Default tmp location

> kubectl apply -f "minimal.yml"
pod/minimal-pod created

> kubectl exec -n default -it pod/minimal-pod -- sh
/ #
/ #
/ # df -h /tmp
Filesystem Size Used Available Use% Mounted on
/dev/md0 100.0G 13.5G 86.5G 13% /tmp
/ #
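As a quick sanity check from inside the pod, you can time a write to /tmp with dd (illustrative only; conv=fsync flushes the data to disk so the page cache doesn’t inflate the number):

```shell
# Write 64 MiB to /tmp, flush it to disk, report throughput, then clean up.
dd if=/dev/zero of=/tmp/dd-check.bin bs=1M count=64 conv=fsync
rm -f /tmp/dd-check.bin
```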

That’s it: the pod’s /tmp is now backed by the NVMe instance store.

Performance

There are several ways to measure disk performance (e.g., with dd or fio). In our case, we observed up to a 38x performance improvement.

When comparing performance, it’s important to note that the IOPS of EBS-backed storage depends on the provisioned disk size (larger disks get more IOPS). We observed 1,500 IOPS with a provisioned 20 GiB disk, which increased to 2,800 IOPS with a 100 GiB disk. With a 275 GB local instance store disk, however, IOPS reached 58,000: roughly ~21x faster than the 100 GiB EBS volume (58000/2800), and ~38x faster than the 20 GiB one (58000/1500).
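One common way to measure IOPS is fio. Below is a hypothetical 4k random-write job against /tmp using standard fio flags; fio must be installed first (e.g., yum install fio), so the snippet falls back gracefully when it’s absent:

```shell
# 4k random writes against /tmp with direct I/O, reporting aggregate IOPS.
if command -v fio >/dev/null 2>&1; then
    fio --name=randwrite --directory=/tmp --rw=randwrite --bs=4k \
        --size=256M --iodepth=32 --ioengine=libaio --direct=1 \
        --numjobs=1 --runtime=30 --time_based --group_reporting
else
    echo "fio not installed; skipping benchmark"
fi
```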

Final notes

We focused on improving the performance of the /tmp path, where sharing with other pods is not required. However, if you’re looking to use the instance store as a Kubernetes persistent volume, that is possible too: a persistent volume can be mounted by multiple containers, allowing you to share data across pods running on the same node (EC2 instance).

Hope you liked it!
