Understanding Docker Volumes and Persistent Storage

Sep 2, 2023

In our ongoing series, we have explored how to dockerize Django and connect it to PostgreSQL. The next step is understanding data persistence. Here, we’ll explore Docker Volumes, the types of mounts, and how to ensure that your data doesn’t vanish when your container does.

Introduction to Docker Volumes

Docker containers are ephemeral by nature. This means any data created inside a container is lost once that container is removed. Volumes are the solution. They’re designed to persist data outside of the container’s lifecycle, enabling us to store and manage data that must survive when a container is removed or the image is rebuilt.

Two types of Mounts in Docker:

Docker provides two primary ways to persist data outside a container:

Named Volumes: Created and managed by Docker. Data is stored in a part of the host file system which is managed by Docker (/var/lib/docker/volumes on Linux). This is the most common way of handling persistent data and the way to go in most production scenarios.
Bind Mounts: Map a specific folder or file on the host machine into the container. They provide the most flexibility, but because they depend on the host’s directory layout and permissions, they can behave differently from one machine to another.
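
As a rough sketch of the difference, the two shell functions below write a file through a named volume and then list a host directory through a bind mount. The alpine image, the demo_data volume name, and the function names are assumptions chosen for illustration; they are not part of this series' setup.

```shell
# Named volume: Docker decides where the data lives on the host.
named_volume_demo() {
  docker volume create demo_data
  # The first container writes a file into the volume and is then removed (--rm).
  docker run --rm -v demo_data:/data alpine sh -c 'echo "hello from a volume" > /data/greeting.txt'
  # A second, brand-new container still sees the file.
  docker run --rm -v demo_data:/data alpine cat /data/greeting.txt
  # Clean up the illustrative volume.
  docker volume rm demo_data
}

# Bind mount: an absolute host path is mapped into the container (read-only here).
bind_mount_demo() {
  docker run --rm -v "$(pwd)":/work:ro alpine ls /work
}
```

After pasting these into a shell, calling named_volume_demo should print the greeting from the second container, while bind_mount_demo lists whatever is in your current directory.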

Persisting Data in Docker

The secret to persisting data lies in strategically placing it outside the container’s ephemeral filesystem. With the two types of mounts at our disposal, we have a lot of flexibility:

Use Bind Mounts when you want specific files on your host to be directly reflected within your container.
Use Named Volumes for most persistent data needs, like databases, as they offer the best balance of performance and convenience.

Practical Example: Persisting PostgreSQL Data

For our Django and PostgreSQL setup, ensuring PostgreSQL data persists across restarts is crucial.

Using Named Volumes with PostgreSQL:

First, create a named volume:

docker volume create pgdata

Now, when starting your PostgreSQL container, link it to the volume:

docker run --name my-postgres -e POSTGRES_PASSWORD=mypassword -v pgdata:/var/lib/postgresql/data -d postgres

Here, the -v pgdata:/var/lib/postgresql/data flag mounts our named volume at PostgreSQL’s data directory, so the database files are written to the volume rather than to the container’s own filesystem.
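
To see where Docker keeps this volume, the volume subcommands can help. This is a small sketch that assumes the pgdata volume from above exists; the function name show_volume is hypothetical.

```shell
# List a named volume and print where it lives on the host.
show_volume() {
  # List volumes whose name contains the given string.
  docker volume ls --filter "name=$1"
  # Print only the host path Docker assigned to the volume.
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

Calling show_volume pgdata on a Linux host usually prints a directory under /var/lib/docker/volumes.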

Integrating with docker-compose:

Include the named volume in the docker-compose.yml:

version: '3'

services:
  web:
    build: .
    command: ["python", "manage.py", "runserver", "0.0.0.0:8000"]
    volumes:
      - .:/app
    ports:
      - "8000:8000"
    depends_on:
      - my-postgres

  my-postgres:
    image: postgres
    environment:
      POSTGRES_PASSWORD: mypassword
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:

In our docker-compose.yml, we have a service configuration for both the web application and the PostgreSQL database. Let's take a closer look at how volumes are set up:

Web Service:

volumes: - .:/app: This is a bind mount. The current directory (represented by .) on the host is mounted to /app inside the web container. This is useful for development since any code change on the host is immediately reflected inside the container.

my-postgres Service:

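volumes: - pgdata:/var/lib/postgresql/data: This configures a named volume. The pgdata volume is mounted to /var/lib/postgresql/data inside the PostgreSQL container, which is the directory where PostgreSQL stores its data. By mounting a named volume at this path, we ensure that the data survives container restarts as well as container removal and recreation. Note that Compose creates and manages its own volume for this entry, by default prefixed with the project name (a name like <project>_pgdata), rather than reusing the pgdata volume we created earlier with docker volume create. To reuse that existing volume, mark it with external: true in the top-level volumes section.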

It’s important to note that for Docker Compose to set up named volumes like pgdata, they must be declared in a top-level volumes section of docker-compose.yml. This section is conventionally placed at the end of the file, though its position does not matter. If a service references a named volume that is not declared there, Compose reports an error. The declaration tells Docker Compose to create the named volume if it doesn't already exist.
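
A short sketch of how this plays out on the command line, assuming the Compose v2 CLI (docker compose; older installs use docker-compose) and that the commands run from the directory containing docker-compose.yml. The function names are hypothetical.

```shell
# Start the stack; Compose creates the declared volume if it does not exist yet.
compose_up() {
  docker compose up -d
  # The volume typically shows up with the project name as a prefix, e.g. <project>_pgdata.
  docker volume ls --filter "name=pgdata"
}

# Stop and remove the containers but keep the named volume and its data.
compose_down_keep_data() {
  docker compose down
}

# Stop and remove the containers AND the named volumes: the database is wiped.
compose_down_wipe_data() {
  docker compose down --volumes
}
```

The distinction between the last two functions matters in practice: a plain down leaves the database intact for the next up, while adding --volumes starts you from an empty database.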

Testing:

Make some changes that affect the database.
Stop and remove the PostgreSQL container.
Start a new PostgreSQL container using the same named volume.
Verify that the data remains intact despite the container removal and recreation.
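
The steps above can be sketched as a shell function for the standalone docker run setup. It assumes the my-postgres container from earlier is running with the pgdata volume, that the image's default postgres superuser is in use, and that the image tag you run stores its data under /var/lib/postgresql/data as this article assumes (check the image documentation for your tag). The persistence_check table and function name are hypothetical.

```shell
persistence_check() {
  # 1. Make a change that affects the database.
  docker exec my-postgres psql -U postgres -c "CREATE TABLE IF NOT EXISTS persistence_check (note text);"
  docker exec my-postgres psql -U postgres -c "INSERT INTO persistence_check VALUES ('still here');"

  # 2. Stop and remove the PostgreSQL container.
  docker stop my-postgres
  docker rm my-postgres

  # 3. Start a new container using the same named volume.
  docker run --name my-postgres -e POSTGRES_PASSWORD=mypassword \
    -v pgdata:/var/lib/postgresql/data -d postgres

  # Wait until the new server accepts connections.
  until docker exec my-postgres pg_isready -U postgres >/dev/null 2>&1; do
    sleep 1
  done

  # 4. Verify that the row written before the removal is still there.
  docker exec my-postgres psql -U postgres -c "SELECT note FROM persistence_check;"
}
```

If the final query returns the inserted row, the data outlived the container that wrote it.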

If your data remains unaffected, congratulations, you’ve successfully set up Docker Volumes.

Conclusion:

Docker Volumes play an essential role in any real-world Docker use case. Whether you’re dealing with databases, files, or caching, understanding the right way to manage persistent data is vital.

In our upcoming guides, we’ll dive deeper into Docker’s vast landscape. Until then, please share your experiences or questions below!