Harnessing Amazon S3 as NFS Volume for EC2 Instances: Achieving Cost Savings in Large-Scale Storage

Raj Shah
Opsnetic
Published Jan 9, 2024 · 8 min read

Introduction

In the ever-evolving landscape of cloud computing, the synergy between Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2) stands out as a powerful combination. This guide will delve into the intricacies of leveraging Amazon S3 as an NFS (Network File System) volume for EC2 instances over a private connection, unveiling a strategic approach to achieve substantial cost savings in large-scale storage scenarios.

The Foundation: Amazon S3 and EC2

Amazon S3, renowned for its scalability, durability, and secure object storage capabilities, meets its counterpart, Amazon EC2, a robust service providing scalable compute resources in the cloud. Together, they form the backbone of a versatile infrastructure, capable of handling diverse workloads efficiently.

The Need for NFS: Bridging Object Storage and File-Based Systems

While S3 excels at object storage, there arises a demand for seamlessly integrating file-based systems with cloud resources. The need to mount S3 as an NFS volume becomes evident, offering compatibility, flexibility, and ease of integration for applications that rely on traditional file systems.

Choosing the Right Solution: S3 as NFS vs. EFS

A critical decision point surfaces when to opt for S3 as NFS and when to consider Amazon Elastic File System (EFS). This guide navigates through the decision-making process, shedding light on scenarios where the cost-effectiveness of S3 as NFS becomes a strategic advantage over EFS, especially in the realm of large-scale storage.

Advantages of mounting S3 as an NFS volume include:

  • Compatibility: Many applications and tools are designed to work with file systems and may not natively support S3’s object storage interface. Mounting S3 as an NFS volume provides a familiar file system interface.
  • Ease of Integration: NFS enables seamless integration with existing applications and workflows that expect a traditional file system structure. This integration simplifies the migration of applications to the cloud without significant code changes.
  • Flexibility: NFS allows you to access S3 data as if it were a standard file system, providing flexibility for various use cases, such as sharing files across multiple instances, collaborating on data, or supporting legacy applications.
  • Uniformity of Access: Mounting S3 as an NFS volume allows for a unified access method across different storage types, making it easier to manage and maintain.

Solution Overview

Prerequisites

The deployment steps assume that:

  1. You have deployed the Amazon EC2 instance where you will mount Amazon S3 as an NFS volume.
    Note the security group ID of the instance as it will be required for permitting access to the NFS file share.
  2. You have created the S3 bucket that you will mount as an NFS volume in the same account and Region as the instance. The bucket and objects should not be public. I recommend enabling server-side encryption.
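The bucket prerequisite can also be scripted. Below is a minimal sketch using the AWS CLI, assuming the CLI is already configured with suitable credentials; the bucket name and Region passed to the function are placeholders:

```shell
# Sketch: create a private S3 bucket with server-side encryption enabled.
# The bucket name and Region are placeholders -- substitute your own values.
create_nfs_bucket() {
  local bucket="$1" region="$2"
  aws s3api create-bucket --bucket "$bucket" --region "$region" \
    --create-bucket-configuration LocationConstraint="$region"
  # Block all public access: the bucket and objects should not be public.
  aws s3api put-public-access-block --bucket "$bucket" \
    --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
  # Enable default server-side encryption (SSE-S3).
  aws s3api put-bucket-encryption --bucket "$bucket" \
    --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
}
```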

The figure below illustrates the solution architecture for mounting the Amazon S3 bucket to the Amazon EC2 instance as an NFS volume with private connections.

  1. This EC2 instance is the NFS client where the NFS file share is mounted. You would have set up this EC2 instance as a part of the prerequisites.
  2. This EC2 instance hosts the S3 File Gateway. You will create this instance by installing the S3 File Gateway Amazon Machine Image (AMI).
  3. This VPC interface endpoint provides private connectivity using SSH and HTTPS from your VPC to the AWS Storage Gateway service using AWS PrivateLink.
  4. The S3 File Gateway uses AWS PrivateLink to privately access AWS Storage Gateway, which is an AWS Regional service.
  5. This VPC gateway endpoint for S3 provides private access over HTTPS from your VPC to the Amazon S3 Regional service, without traffic traversing the public internet.
  6. The S3 File Gateway uses the VPC gateway endpoint to connect privately to the S3 service and your S3 bucket mounted to your EC2 instance.

Implementation

Step 1: Create the Amazon S3 File Gateway on the EC2 instance

  1. Go to Storage Gateway → Create Gateway
  2. Give the gateway a name and select the gateway time zone
  3. Select Amazon S3 File Gateway as the gateway option
  4. For the platform option, select Amazon EC2. In the Launch EC2 instance section, choose Customize your settings to launch the gateway EC2 instance in a private subnet. Click Launch Instance to be redirected to the EC2 configuration page.

Set up Gateway on Amazon EC2:

  1. For Instance type, we recommend selecting at least m5.xlarge.
  2. In Network settings, for VPC, select the VPC that you want your EC2 instance to run in.
  3. For Subnet, specify the private subnet that your EC2 instance should be launched in.
  4. For Auto-assign Public IP, select Disable.
  5. Create a security group. Amazon S3 File Gateway requires inbound TCP port 80 for one-time HTTP access during gateway activation; after activation, you can close this port. To create NFS file shares, you must also open TCP/UDP port 2049 for NFS, TCP/UDP port 111 for NFSv3 (rpcbind), and TCP/UDP port 20048 for NFSv3 (mountd). The source should be the CIDR range or security group of the NFS clients.

6. For Configure storage, choose Add new volume to add storage to your gateway instance. You must add at least one Amazon EBS volume for cache storage with a size of at least 150 GiB, in addition to the Root volume.
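The security group rules described above can also be scripted with the AWS CLI. Below is a minimal sketch, wrapped in a function so the gateway and client security group IDs are parameters (the IDs used anywhere with it are placeholders, not values from this walkthrough):

```shell
# Sketch: open the inbound ports the S3 File Gateway needs.
# gw_sg is the gateway's security group; client_sg is the NFS clients' group.
open_gateway_ports() {
  local gw_sg="$1" client_sg="$2"
  # TCP 80: one-time HTTP access during gateway activation (close afterwards).
  aws ec2 authorize-security-group-ingress --group-id "$gw_sg" \
    --protocol tcp --port 80 --source-group "$client_sg"
  # NFS ports: 2049 (nfsd), 111 (rpcbind), 20048 (mountd), both TCP and UDP.
  for port in 2049 111 20048; do
    for proto in tcp udp; do
      aws ec2 authorize-security-group-ingress --group-id "$gw_sg" \
        --protocol "$proto" --port "$port" --source-group "$client_sg"
    done
  done
}
```

Using a source security group instead of a CIDR keeps the rules valid even if client instances are replaced and get new private IPs.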

Now that our gateway server is ready, we can proceed to the Gateway Connection Options. Select the Activation Key based connection, as the launched instance doesn’t have a public IP address.

Step 2: Create the VPC endpoints

We will be creating 2 VPC Endpoints:

  1. For AWS Storage Gateway to allow private access to the AWS Storage Gateway service from your VPC
  2. S3 VPC Gateway endpoint to allow private access to Amazon S3 from your VPC

Let’s start with the creation of the first VPC endpoint:

Go to AWS VPC → choose Endpoints → Create Endpoint

  1. Give the endpoint an appropriate name and select AWS services as the Service category. For Services, select com.amazonaws.<aws-region>.storagegateway
  2. Select the VPC and verify that Enable Private DNS Name is not checked in Additional settings.
  3. Choose the Availability Zone and private subnet where the S3 File Gateway is deployed.
  4. Create a security group with the subnet CIDR range as the source and the following inbound rules: TCP 443, TCP 1026–1028, TCP 1031, and TCP 2222.

5. Attach the security group to the VPC Endpoint and then hit Create Endpoint.

6. When the endpoint status is Available, copy the first DNS name — the one that doesn’t specify an Availability Zone.

With the first endpoint created, let’s continue to the second VPC endpoint.

For this, again go to AWS VPC → choose Endpoints → Create Endpoint. For Service Name, choose com.amazonaws.<region>.s3 of type Gateway. Select the appropriate VPC and the route table in which the chosen private subnet is present.

At the end of this, you should have these two VPC endpoints listed:
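Both endpoints can alternatively be created from the AWS CLI. Below is a sketch; every ID passed to the function is a placeholder for your own VPC resources:

```shell
# Sketch: create the Storage Gateway interface endpoint and the S3 gateway
# endpoint. All IDs supplied to this function are placeholders.
create_endpoints() {
  local region="$1" vpc_id="$2" subnet_id="$3" endpoint_sg="$4" route_table_id="$5"
  # 1. Interface endpoint for AWS Storage Gateway, with private DNS disabled
  #    (matching the console setting above).
  aws ec2 create-vpc-endpoint --vpc-id "$vpc_id" \
    --vpc-endpoint-type Interface \
    --service-name "com.amazonaws.${region}.storagegateway" \
    --subnet-ids "$subnet_id" \
    --security-group-ids "$endpoint_sg" \
    --no-private-dns-enabled
  # 2. Gateway endpoint for Amazon S3, attached to the private subnet's
  #    route table.
  aws ec2 create-vpc-endpoint --vpc-id "$vpc_id" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.${region}.s3" \
    --route-table-ids "$route_table_id"
}
```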

Step 3: Get the Activation Key

Now, connect to the Storage Gateway EC2 instance (either via a bastion host or SSM Session Manager) that we launched, to extract the activation key.

Then provide the appropriate inputs to the prompted questions, making sure to supply the correct DNS name of the VPC endpoint that we copied in the earlier step.
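As an alternative to the interactive prompts, the Storage Gateway documentation also describes fetching the activation key with a plain HTTP request to the gateway: the no_redirect query parameter makes the gateway return the key in the response body instead of redirecting. A sketch, with the gateway IP, Region, and VPC endpoint DNS name all as placeholders:

```shell
# Sketch: fetch the activation key over HTTP (port 80 must still be open).
# All three arguments are placeholders: the gateway's private IP, the AWS
# Region, and the Storage Gateway VPC endpoint DNS name copied earlier.
get_activation_key() {
  local gateway_ip="$1" region="$2" vpc_endpoint_dns="$3"
  curl -s "http://${gateway_ip}/?gatewayType=FILE_S3&activationRegion=${region}&vpcEndpoint=${vpc_endpoint_dns}&no_redirect"
}
```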

Step 4: Deploying the Storage Gateway

Paste the activation key in the required field.

Once done, click Next to verify the configuration.

The Configure cache storage step automatically detects the additional disk suitable for cache allocation.

After some time, our Storage Gateway is in the Running state.

Step 5: Create a File Share

Now, we will create the NFS file share and mount it onto the EC2 instance

  1. Go to Storage Gateway → File Shares → Create File Share
  2. Select the gateway from the dropdown and select the S3 bucket

3. Next, we can name our file share and select the protocol used to access objects (NFS). Don’t forget to enable Automated cache refresh from S3 and set the minimum TTL.

4. Next, I prefer to choose S3 Intelligent-Tiering as the storage class and keep the remaining options at their defaults.

5. We can also restrict the file share connection to specific allowed client IPs, such as all the clients in the VPC CIDR range. Keep the other options at their defaults. Then, review and create your file share.

6. It takes a few minutes for the file share status to change to Available. Once it is in the Available state, copy the Linux mount command and run it on the NFS client.

For Linux:

sudo mount -t nfs -o nolock,hard 172.31.79.131:/images [MountPath]

For Windows:

mount -o nolock -o mtype=hard 172.31.79.131:/images [WindowsDriveLetter]:
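To make the Linux mount persist across reboots, you can also add an entry to /etc/fstab on the NFS client. A sketch, assuming the same gateway IP and share path as above and a hypothetical mount point /mnt/images; the _netdev option delays mounting until the network is up:

```
172.31.79.131:/images  /mnt/images  nfs  nolock,hard,_netdev  0  0
```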

Validation

  1. Upload an image on the S3 bucket

2. Let’s run ls on the mount directory in the NFS client EC2 instance.

3. Now, let’s create a text file from the EC2 instance and see whether it shows up in S3.

touch images/test.txt

4. In our S3 bucket, we see the file reflected!
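This round trip can also be checked from the command line. A sketch, with the bucket name and mount path as placeholders; note that the gateway uploads asynchronously, so the object may take a moment to appear:

```shell
# Sketch: verify that a file written through the NFS mount appears in S3.
# The bucket name and mount path are placeholders for your own values.
verify_roundtrip() {
  local bucket="$1" mount_path="$2"
  touch "${mount_path}/test.txt"          # write through the NFS mount...
  aws s3 ls "s3://${bucket}/test.txt"     # ...then look for the object in S3
}
```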

Conclusion

Use S3 as NFS When:

  • Data access patterns are sporadic or infrequent.
  • You want the flexibility of different storage classes.
  • You need to optimize costs for specific access patterns and data types.

Use EFS When:

  • You have a consistent need for file-based storage with dynamic scaling.
  • Data access patterns are frequent and consistent.
  • Simplified management is a priority.

Ultimately, if you need to create petabyte-scale NFS storage and are looking for cost savings, this implementation is your best bet, rather than the costlier AWS EFS service.

If you need help with DevOps practices, AWS, or Infrastructure Optimization at your company, feel free to reach out to us at Opsnetic.

Contributed By: Raj Shah
