AWS EFS — The Container Friendly File System

In April 2015 AWS announced a new cloud service: Elastic File System — a Shared File Storage for Amazon EC2. Fast forward to June 28th 2016, more than a year later, and AWS announced that EFS is finally available and production-ready in 3 regions (us-east-1, us-west-2 and eu-west-1).

A day later, Convox’s David Dollar opened a pull request integrating EFS into Convox Rack, the open-source cloud management platform built on top of AWS.

Let’s give it a try and run a Postgres container on EFS…

We’ll see that EFS is quite easy to get up and running, and that it works as advertised. A Postgres data directory is synchronized to every instance in the cluster. This unlocks a much needed challenge with containers. We can now run, kill and reschedule a Postgres container anywhere in the cluster and resume serving the persistent data.

We’ll also see that while possible, Postgres on EFS probably isn’t something we need or want to use outside of development or testing, but this experiment gives us confidence to add EFS to our infrastructure toolbox.

EFS and NFS Primer

EFS is an implementation of the Network File System (NFS) protocol.

NFS dates back to 1984, but EFS implements version 4.1 which was proposed as a standard in 2010.

With NFS, every computer uses an NFS client to connect to an NFS server, and synchronize file metadata and data over the network. The goal is to synchronize low level file modifications— locks and writes — across servers so that all clients have an eventually consistent filesystem to read from.

The NFS client/server protocol is designed to handle tough failures around network communication and mandatory and advisory file locking. Version 4.1 made big improvements around using multiple servers to separate the metadata paths from the data paths.

See these primers on NFS 4.0 and NFS 4.1 for more details.

Because NFS looks like a standard filesystem it is trivial to use an EFS volume with Docker containers.

EFS is free to provision and offers 5 GB of storage for free in the AWS free tier. Usage beyond that costs $0.30 per GB per month. See the launch announcement for more details.

1990 called and wants its technology back

Update Convox To Use EFS

An EFS volume is relatively easy to provision with CloudFormation and to mount into an EC2 instance with UserData.

Here I am demonstrating the simple, open-source Convox project and tools to write set up a new environment (or to update an existing environment) to use EFS and inspect the resulting instances and containers. See this Pull Request for the specifics of the CloudFormation and UserData changes.

As long as we install Convox into one of the 3 regions where EFS is supported, Convox instances now automatically have EFS mounted in /volumes.

# install convox in us-east-1, us-west-2 or eu-west-1
$ convox install --region=us-west-2
# update to the EFS release
$ convox rack update 20160629185452-efs
# check out the new /volumes path
$ convox instances ssh i-492ab0cf
$ mount | grep /volumes
us-east-1c.fs-f223e6bb.efs.us-east-1.amazonaws.com:/ on /volumes type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.2.139,local_lock=none,addr=10.0.2.95)
# modify something in the /volumes path
$ sudo su
$ echo hi > /volumes/hi
# see the changes on other instances
$ convox instances ssh i-cb979a57 cat /volumes/hi
hi
$ convox instances ssh i-f08a3660 cat /volumes/hi
hi

Sure enough we have a filesystem shared between three instances!

Update an App Database To Use an EFS Volume

Let’s run a database on this filesystem.

Convox takes an application with a Dockerfile and docker-compose.yml file and automates building images and pushing them to the EC2 Container Registry (ECR) and deploying the images to AWS via the EC2 Container Service (ECS).

We can take the convox-examples/rails app and add a persistent volume to the Postgres container by adding 2 new lines to the docker-compose.yml file that say we want a data directory in production:

  volumes:
- /var/lib/postgresql/data

Now I can deploy the app:

$ cd rails
$ cat docker-compose.yml
web:
build: .
labels:
- convox.port.443.protocol=tls
- convox.port.443.proxy=true
links:
- database
ports:
- 80:4000
- 443:4001
database:
image: convox/postgres
ports:
- 5432
volumes:
- /var/lib/postgresql/data
$ convox deploy
Deploying rails
...
RUNNING: docker pull convox/postgres
RUNNING: docker tag rails/database 132866487567.dkr.ecr.us-east-1.amazonaws.com/convox-rails-eefmdtclkf:database.BHEUKPGVHOB
RUNNING: docker push 132866487567.dkr.ecr.us-east-1.amazonaws.com/convox-rails-eefmdtclkf:database.BHEUKPGVHOB
database.BHEUKPGVHOB: digest: sha256:be8596b239b2cf9c139b93013898507ba23173a5238df1926e0ab93e465b342c size: 11181
Promoting RJVOZPCGYHE... UPDATING

And watch the logs:

$ convox logs
agent:0.69/i-e0117466 Starting database process 4c6cde98595c
database/4c6cde98595c The files belonging to this database system will be owned by user "postgres".
database/4c6cde98595c
database/4c6cde98595c This user must also own the server process.
database/4c6cde98595c
database/4c6cde98595c The database cluster will be initialized with locale "en_US.utf8".
database/4c6cde98595c The default database encoding has accordingly been set to "UTF8".
database/4c6cde98595c The default text search configuration will be set to "english".
database/4c6cde98595c
database/4c6cde98595c Data page checksums are disabled.
database/4c6cde98595c fixing permissions on existing directory /var/lib/postgresql/data ... ok
database/4c6cde98595c creating subdirectories ... ok
database/4c6cde98595c selecting default max_connections ... 100
database/4c6cde98595c selecting default shared_buffers ... 128MB
database/4c6cde98595c selecting dynamic shared memory implementation ... posix
database/4c6cde98595c creating configuration files ... ok
database/4c6cde98595c creating template1 database in /var/lib/postgresql/data/base/1 ... ok
database/4c6cde98595c initializing pg_authid ... ok
database/4c6cde98595c initializing dependencies ... ok
database/4c6cde98595c creating system views ... ok
database/4c6cde98595c loading system objects' descriptions ... ok
database/4c6cde98595c sh: locale: not found
database/4c6cde98595c creating collations ... ok
database/4c6cde98595c No usable system locales were found.
database/4c6cde98595c Use the option "--debug" to see details.
database/4c6cde98595c creating conversions ... ok
database/4c6cde98595c creating dictionaries ... ok
database/4c6cde98595c setting privileges on built-in objects ... ok
database/4c6cde98595c creating information schema ... ok
database/4c6cde98595c loading PL/pgSQL server-side language ... ok
database/4c6cde98595c vacuuming database template1 ... ok
database/4c6cde98595c copying template1 to template0 ... ok
database/4c6cde98595c copying template1 to postgres ... ok
database/4c6cde98595c syncing data to disk ... ok
database/4c6cde98595c
database/4c6cde98595c WARNING: enabling "trust" authentication for local connections
database/4c6cde98595c You can change this by editing pg_hba.conf or using the option -A, or
database/4c6cde98595c --auth-local and --auth-host, the next time you run initdb.
database/4c6cde98595c
database/4c6cde98595c
database/4c6cde98595c Success.
database/4c6cde98595c
database/4c6cde98595c
database/4c6cde98595c PostgreSQL stand-alone backend 9.4.6
database/4c6cde98595c backend> statement: CREATE DATABASE app;
database/4c6cde98595c
database/4c6cde98595c backend>
database/4c6cde98595c
database/4c6cde98595c PostgreSQL stand-alone backend 9.4.6
database/4c6cde98595c backend> statement: ALTER USER postgres WITH SUPERUSER PASSWORD 'password';
database/4c6cde98595c
database/4c6cde98595c backend>
database/4c6cde98595c LOG: database system was shut down at 2016-07-05 21:53:36 UTC
database/4c6cde98595c LOG: MultiXact member wraparound protections are now enabled
database/4c6cde98595c LOG: database system is ready to accept connections
database/4c6cde98595c LOG: autovacuum launcher startedTest Persistence

Postgres boots up though this looks just like how it worked before with ephemeral Docker volumes.

But… Pick a different instance and look again at the shared filesystem. It also sees the Postgres data directory!

$ convox instances ssh i-f08a3660 sudo ls /volumes/var/lib/postgresql/data
base  global  pg_clog  pg_dynshmem  pg_hba.conf  pg_ident.conf pg_logical  pg_multixact  pg_notify  pg_replslot  pg_serial  pg_snapshots  pg_stat  pg_stat_tmp  pg_subtrans  pg_tblspc  pg_twophase  PG_VERSION  pg_xlog  postgresql.auto.conf  postgresql.conf  postmaster.opts  postmaster.pid

The Postgres container is only accessible from inside the VPC so lets proxy in, create a table and insert a record:

$ convox apps info
Name rails
Status running
Release RJVOZPCGYHE
Processes database web
Endpoints internal-rails-database-T5LLUKS-i-1848066794.us-east-1.elb.amazonaws.com:5432 (database)
rails-web-WQDNFES-853899250.us-east-1.elb.amazonaws.com:80 (web)
rails-web-WQDNFES-853899250.us-east-1.elb.amazonaws.com:443 (web)
$ convox proxy internal-rails-database-T5LLUKS-i-1848066794.us-east-1.elb.amazonaws.com:5432
proxying 127.0.0.1:5432 to internal-rails-database-T5LLUKS-i-1848066794.us-east-1.elb.amazonaws.com:5432
$ psql -h 127.0.0.1 -U postgres -d app
Password for user postgres:
psql (9.4.5, server 9.4.6)
app=# CREATE TABLE users (
app(# name varchar(40)
app(# );
CREATE TABLE
app=# INSERT INTO users VALUES('foo');
INSERT 0 1

Now for the moment of truth, let’s kill the Postgres container while watching the logs:

$ convox ps stop 4c6cde98595c
Stopping 4c6cde98595c... OK
$ convox logs
agent:0.69/i-e0117466 Stopped database process 4c6cde98595c via SIGKILL
database/4c6cde98595c LOG: received smart shutdown request
database/4c6cde98595c LOG: autovacuum launcher shutting down
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
database/4c6cde98595c LOG: incomplete startup packet
agent:0.69/i-e0117466 Stopped database process 4c6cde98595c via SIGKILL
agent:0.69/i-e0117466 Dead database process 4c6cde98595c
agent:0.69/i-e0117466 Stopped database process 4c6cde98595c via SIGTERM
agent:0.69/i-492ab0cf Starting database process 533a014918b5
database/533a014918b5 LOG: incomplete startup packet
database/533a014918b5 LOG: incomplete startup packet
database/533a014918b5 LOG: database system was not properly shut down; automatic recovery in progress
database/533a014918b5 LOG: record with zero length at 0/16CB998
database/533a014918b5 LOG: redo is not required
database/533a014918b5 LOG: MultiXact member wraparound protections are now enabled
database/533a014918b5 LOG: database system is ready to accept connections
database/533a014918b5 LOG: autovacuum launcher started

The old container stopped and the new container started on a different instance and it sees and recovers the data!

$ psql -h 127.0.0.1 -U postgres -d app
Password for user postgres:
psql (9.4.5, server 9.4.6)
app=# SELECT * FROM users;
name
------
foo
(1 row)
IT WORKS

In Summary

  • Provision an EFS volume with a bit of CloudFormation
  • Mount EFS to every instance via UserData and standard Linux NFS utilities
  • Mount sub-directories of the EFS volume into Docker containers (ECS services) via volume mounts
  • Persist data across container stops and re-starts, independent of instances

All of this is working with a couple of hours of integrating EFS into an existing AWS / CloudFormation / VPC / ECS setup. Thanks Convox!.

I did notice some side-effects…

  • Rolling deploys can result in two processes trying to lock the database
  • Deleting an app leaves data around

And I didn’t push towards extreme usage scenarios…

  • High throughput
  • High volume
  • Network failures

Still…

Key Takeaways

Always Bet On AWS — A new standard has been set in cloud storage. Not only does EFS work, it promises all the things we want from the cloud. It’s cheap at $0.30/GB-month, you don’t provision storage up front, you pay for what you use, and it scales to petabytes.

No doubt there will be questions about the NFS protocol and horror stories of the dark days of EBS that cast a shadow over EFS. But you’d be silly to think AWS won’t operate this with the extremely high quality-of-service that is the reason they are taking over all of the world’s computing.

They will keep the system running, recover as fast as possible when it does have problems, and will answer support tickets when it’s not working as advertised.

Amazon continues the trend of turning cloud computing into utility services that get better and cheaper over time all while taking on massive infrastructure complexity so we don’t have to.

Filesystems Are Back — The experiment with Postgres gives me confidence that EFS this will fit into my architecture in some places. Now one of the tenants of 12 Factor is up for review:

VI. Processes
Execute the app as one or more stateless processes
Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.

All of a sudden Wordpress got a lot easier to run on AWS. What other software and use cases are possible again? How about:

  • Docker image caching
  • Home directories
  • User file uploads and processing

Integration Over Invention — The 12 Factor tenant came from avoiding the hard problems of stateful containers.

I have been working on container runtimes for more than 6 years between Heroku and Convox and have explicitly avoided putting engineering resources towards this problem until now.

We have been tempted with solutions like S3FS, GlusterFS, or Flocker. We have built contraptions around EBS volumes, snapshots and rsync. But these systems always bring on a lot of additional complexity which means more operational risk and more engineering time.

Tremendous engineering has gone into these systems and people have been extremely successful with the various solutions. But most of us have correctly put our own energy into architecting our applications to not need to solve tough problems with state, delegating everything to a Postgres service and S3.

Finally the tides have turned. We can integrate a service with a few lines of CloudFormation and build around shared state vs. invent, install, debug and maintain complex distributed systems software.

Specialized Utility Services Still Win —Even if EFS works for a Postgres container an RDS Postgres database starts at $12/mo. This includes more assurances about catastrophic failure like Postgres data replication and backups that would be be risky to ignore when running on any storage service.

So I still see no reason to take on the operational properties of a containerized data volume for a database outside of development or test purposes.

Likewise S3 isn’t going anywhere. It’s hard to beat the simplicity and maturity of a simple blob storage service in our application architecture.


What do you think?

Is EFS a new tool in your infrastructure toolbox?

What new or old use cases does this open up for our apps in the cloud?

What use cases will you still avoid using EFS for at all costs?

What becomes easier, cheaper and and more reliable now that AWS has taken on this tough challenge for us?