AWS ECS “no space left on device” solution

Daniel Smith
2 min read · Jul 11, 2018

Introduction

I have seen this issue manifest in several different ways, but it usually follows another failure that prevented a container from starting, with the exit code ultimately bubbled up to ECS by the agent. Even after the underlying problem is resolved, the task still won’t start. A message in the ECS console reveals the reason to be either CannotCreateContainerError or CannotPullContainerError, followed by “no space left on device.”

It appears the initial container failure results in a loop where ECS keeps attempting to start the task. Each attempt leaves dangling Docker volumes behind, which eventually fill the disk. Depending on the size of the images, this may take only a few minutes.

Another way this can happen is when the deploy process creates a new task on every deploy, but deploys happen more frequently than the agent cleans up old images. It is also worth reviewing the ECS container agent configuration to be sure automatic task and image cleanup hasn’t been disabled accidentally.
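For reference, cleanup behavior is controlled through environment variables in the agent configuration file (typically /etc/ecs/ecs.config on the instance). The values below are a sketch using illustrative settings; check the defaults for your AMI and agent version before changing anything.

# Example /etc/ecs/ecs.config (illustrative values)
# Keep automated image cleanup enabled (it is enabled by default)
ECS_DISABLE_IMAGE_CLEANUP=false
# How often the agent checks for images to remove
ECS_IMAGE_CLEANUP_INTERVAL=30m
# Minimum age before an image is eligible for removal
ECS_IMAGE_MINIMUM_CLEANUP_AGE=1h
# How long to wait before cleaning up a stopped task's containers
ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION=1h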

The Solution

I’ve set up a cronjob that runs every 5 minutes to remove old images and dangling volumes. It runs two simple commands: one to remove containers that have exited and one to remove unused volumes (a sketch of the script follows the cron entry below). If you need a more robust solution, I suggest taking a look at Brian Christner’s article on Docker cleanup scripts.

# Example Cron
*/5 * * * * /home/ec2-user/ecs-docker-cleanup.sh >> /home/ec2-user/docker-cleanup.log
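For completeness, here is a minimal sketch of what ecs-docker-cleanup.sh could look like, assuming the two commands described above; the filename and filters are illustrative and should be adjusted to your environment.

#!/bin/bash
# /home/ec2-user/ecs-docker-cleanup.sh (minimal sketch)
# Remove containers that have exited
docker rm $(docker ps -aq -f status=exited) 2>/dev/null
# Remove dangling volumes left behind by failed task starts
docker volume rm $(docker volume ls -qf dangling=true) 2>/dev/null

The stderr redirects keep the log quiet when there is nothing to clean up, since docker rm and docker volume rm complain when given no arguments.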
