Data That Stays - Introduction to Docker Volumes

Welcome back, aspiring Docker master! So far, we’ve learned how to create, run, and manage containers. You’ve seen how powerful they are for packaging applications. But there’s a tiny “gotcha” we need to address: what happens to your data when a container gets removed? Poof! It’s gone. (Stopping a container keeps its files around, but the moment you remove it, they’re lost.) That’s not ideal for most real-world applications, right?

In this chapter, we’re going to tackle this challenge head-on by introducing Docker Volumes. You’ll discover how to make your containerized applications store data persistently, ensuring your important information survives even if your containers don’t. This is a fundamental concept for building robust, production-ready Docker applications, so get ready to make your data truly stay.

Before we dive in, make sure you’re comfortable with the basic Docker commands from previous chapters, especially docker run, docker ps, docker stop, and docker rm. You’ll be using them a lot today!

The Problem: Ephemeral Data

Imagine you have a container running a simple web server, and users upload files to it. Or maybe it’s a database container storing all your precious user information. What happens if that container crashes, or you decide to update it by removing the old one and starting a new one? By default, any data written inside the container’s writable layer is tied to that specific container instance. When the container is removed, that data is lost forever. Scary, right?

Let’s quickly illustrate this with a super simple example. Don’t worry, we’ll fix it right after!

First, let’s run a temporary alpine container, write a file, and then remove the container:

# Run a temporary Alpine container, create /data, write a file, then exit
# (the alpine image has no /data directory, so we create it first)
docker run --name temp-writer --rm alpine sh -c "mkdir -p /data && echo 'Ephemeral data!' > /data/my_file.txt && cat /data/my_file.txt"

What’s happening here?

docker run: Starts a new container.
--name temp-writer: Gives it a memorable name.
--rm: Crucially, this flag tells Docker to automatically remove the container as soon as it exits.
alpine: The image we’re using.
sh -c "...": Executes a shell command inside the container. We’re creating a file /data/my_file.txt and then immediately printing its content.

You should see “Ephemeral data!” printed to your console. Now, try to see if that file exists on your Docker host:

# Try to find the file (it won't be there!)
ls /data/my_file.txt

You’ll get an error like “No such file or directory”. This confirms that the file was created inside the container and vanished along with it. This is the “ephemeral data” problem we need to solve!
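To make the lifetime rule concrete, here is a minimal sketch. The function name show_container_data_lifetime and the container name lifetime-demo are made up for illustration. It writes a file into a container's writable layer, shows that the file survives a stop and restart, and then removes the container, at which point the data goes with it.

```shell
# Sketch: data in the writable layer survives stop/start, but not removal.
# The function and container names are illustrative, not from the chapter.
show_container_data_lifetime() {
  name="${1:-lifetime-demo}"

  # Start a long-running container so we can exec into it.
  docker run -d --name "$name" alpine sleep 300

  # Write a file into the container's writable layer.
  docker exec "$name" sh -c "mkdir -p /data && echo 'Ephemeral data!' > /data/my_file.txt"

  # Stopping and restarting keeps the writable layer intact.
  docker stop "$name"
  docker start "$name"
  docker exec "$name" cat /data/my_file.txt

  # Removing the container discards its writable layer, and the file with it.
  docker rm -f "$name"
}

# Example: show_container_data_lifetime lifetime-demo
```

The stop step may take around ten seconds, because sleep running as the container's main process does not react to the default stop signal.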

The Solution: Docker Volumes

Enter Docker Volumes! Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. They are completely managed by Docker, meaning Docker handles their creation, storage location, and lifecycle. This makes them much easier to back up and manage than simply writing files directly into a container’s writable layer.

Think of a Docker Volume like a special, sturdy briefcase that you can attach to any container. When the container needs to store something important, it puts it in the briefcase. If you decide to throw away the container (or it gets replaced during an update), the briefcase remains safe and sound. You can then attach the same briefcase to a new container, and all the contents will still be there! How cool is that?

Types of Mounts: Volumes vs. Bind Mounts

While Docker Volumes are the primary focus for persistence, it’s good to know there are two main ways to connect host storage to a container:

Volumes (our star for today):
- Managed entirely by Docker.
- Docker creates and manages the storage on your host machine (you usually don’t need to know the exact path).
- Best for persisting application data (e.g., database files, user uploads).
- Can be easily shared between multiple containers.
Bind Mounts:
- You, the user, control the exact mount point on the host filesystem.
- Docker doesn’t manage the host location; it just “binds” a specific directory or file from your host into the container.
- Very useful for development workflows (e.g., mounting source code into a container so changes are immediately reflected without rebuilding the image).
- Less portable than volumes.

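For most application data persistence, Volumes are the recommended choice. Because they don’t depend on a particular directory layout on the host, they’re more portable than bind mounts, and they keep containers from touching arbitrary host paths. We’ll focus on them primarily in this chapter.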
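The difference between the two mount types shows up clearly in the explicit --mount syntax of docker run. The sketch below only builds the argument strings. The helper names volume_mount_arg and bind_mount_arg, and the paths in the usage comments, are placeholders chosen for illustration. The type, source, and target keys are the ones Docker understands.

```shell
# Sketch: build the --mount argument for each mount type.
# Helper names are illustrative; the keys (type, source, target) are Docker's.
volume_mount_arg() {
  # $1 = volume name, $2 = path inside the container
  printf 'type=volume,source=%s,target=%s\n' "$1" "$2"
}

bind_mount_arg() {
  # $1 = absolute host path, $2 = path inside the container
  printf 'type=bind,source=%s,target=%s\n' "$1" "$2"
}

# Example usage (names and paths are placeholders):
#   docker run -d --mount "$(volume_mount_arg my-database-data /var/lib/data)" alpine sleep 300
#   docker run -d --mount "$(bind_mount_arg "$PWD/src" /app)" alpine sleep 300
```

Notice that the only things that change are the type and what source means: a volume name Docker manages, or a host path you manage.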

Named Volumes vs. Anonymous Volumes

When you create a volume, you can give it a name (a named volume) or let Docker assign a unique, long ID (an anonymous volume).

Named Volumes: These are what you’ll use 99% of the time for persistence. They have a human-readable name you specify (e.g., my-database-data). This makes them easy to refer to, inspect, and reuse.
Anonymous Volumes: Docker gives them a random name. They’re harder to refer to and manage. They’re generally used when you just need a temporary scratch space that doesn’t need to be explicitly managed.

We’ll be working with named volumes exclusively today.
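If you are curious which of your existing volumes are anonymous, a rough heuristic is to look at the name. The sketch below assumes that anonymous volumes are named with a 64-character hexadecimal ID, which is what Docker typically generates. The helpers looks_anonymous and label_volumes are hypothetical names, not Docker commands.

```shell
# Sketch: label each volume as named or anonymous, based on its name.
# Assumption: anonymous volumes carry a 64-character hex ID as their name.
looks_anonymous() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{64}$'
}

label_volumes() {
  # -q prints only the volume names, one per line.
  docker volume ls -q | while IFS= read -r name; do
    if looks_anonymous "$name"; then
      echo "anonymous  $name"
    else
      echo "named      $name"
    fi
  done
}

# Example: label_volumes
```

A named volume that happens to look like a 64-character hex string would be mislabeled, so treat the output as a hint rather than a guarantee.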

Step-by-Step Implementation

Let’s get our hands dirty and start using volumes! We’ll create a named volume, attach it to a container, write some data, then remove the container and prove the data persists when a new container uses the same volume.

Step 1: Create a Named Volume

First, let’s create our “briefcase” – a named volume. We’ll call it my-nginx-data.

# Create a named Docker volume
docker volume create my-nginx-data

You should see the name my-nginx-data printed, confirming its creation.

Now, let’s see what volumes Docker knows about:

# List all Docker volumes
docker volume ls

You should see my-nginx-data in the list, along with any other volumes Docker might have created (like anonymous ones from previous experiments).

To get more details about our new volume, we can inspect it:

# Inspect our new volume
docker volume inspect my-nginx-data

This command will output a JSON object with a lot of information. Look for the "Mountpoint" key. This is the actual path on your Docker host where Docker is storing the volume’s data. You usually don’t need to interact with this path directly, but it’s good to know it exists!

[
    {
        "CreatedAt": "2025-12-04T10:30:00Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/my-nginx-data/_data",
        "Name": "my-nginx-data",
        "Options": {},
        "Scope": "local"
    }
]

(Note: Your Mountpoint path might differ slightly based on your operating system and Docker installation, but it will generally be under /var/lib/docker/volumes/ on Linux, or in a Docker-managed VM/filesystem on macOS/Windows.)
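When you only care about one field, you don't have to read the whole JSON object. A small sketch, using the --format option of docker volume inspect with a Go template (volume_mountpoint is a hypothetical helper name):

```shell
# Sketch: print only the Mountpoint field of a volume.
volume_mountpoint() {
  # --format takes a Go template; .Mountpoint matches the JSON key shown above.
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Example: volume_mountpoint my-nginx-data
```

The same pattern should work for the other keys in the output, such as .Driver or .Name.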

Step 2: Use the Volume with a Container

Now, let’s run an Nginx web server container and tell it to use our my-nginx-data volume. We’ll mount it to /usr/share/nginx/html inside the container, which is Nginx’s default web root.

# Run an Nginx container and mount our volume
docker run -d \
  --name my-web-server \
  -p 8080:80 \
  -v my-nginx-data:/usr/share/nginx/html \
  nginx:latest

Let’s break down the new parts of this docker run command:

-d: Runs the container in detached mode (in the background).
-p 8080:80: Maps port 8080 on your host to port 80 inside the container, so we can access Nginx from our browser.
-v my-nginx-data:/usr/share/nginx/html: This is the magic!
- my-nginx-data: The name of the volume we just created.
- : (a colon): The separator between the volume name and the container path.
- /usr/share/nginx/html: The path inside the container where the volume will be mounted.

Give it a moment to start up. You can check its status with docker ps.

Now, open your web browser and navigate to http://localhost:8080. You should see the default Nginx welcome page. The nginx:latest image ships with a default index.html file in /usr/share/nginx/html, and when you mount an empty named volume onto a non-empty directory in a container, Docker copies that directory’s existing contents into the volume before the container starts. This is important to understand, because it only happens for empty volumes: if the volume already holds data, or if you use a bind mount instead, whatever you mount hides the files the image placed at that path.

Let’s put our own custom content into the volume. We’ll exec into the container and create a new index.html file.

# Execute a command inside the running container
docker exec -it my-web-server bash

You are now inside the my-web-server container’s shell. Let’s create our custom index.html:

# Inside the container, create a new index.html
echo "<h1>Hello from my Docker Volume!</h1>" > /usr/share/nginx/html/index.html

Now, exit the container’s shell:

# Exit the container's shell
exit

Go back to your browser and refresh http://localhost:8080. You should now see “Hello from my Docker Volume!”. Awesome! Our custom data is now being served by Nginx, and it’s stored in our my-nginx-data volume.
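The interactive steps above can also be scripted, which is handy once you want to repeat them. Here is a minimal sketch: publish_page is a hypothetical helper, and it assumes the container uses Nginx's default web root as in this chapter. It pipes the HTML into the container over stdin instead of opening a shell.

```shell
# Sketch: write a page into the running container without an interactive shell.
# publish_page is a made-up helper; the path is Nginx's default web root.
publish_page() {
  container="$1"
  html="$2"
  # -i keeps stdin open, so the HTML can be piped in and quoting stays simple.
  printf '%s\n' "$html" |
    docker exec -i "$container" sh -c 'cat > /usr/share/nginx/html/index.html'
}

# Example: publish_page my-web-server '<h1>Hello from my Docker Volume!</h1>'
```

Because the target path is backed by the volume, the file ends up in my-nginx-data exactly as it did with the interactive approach.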

Step 3: Verify Data Persistence

Here’s the moment of truth! We’ll stop and remove the my-web-server container. Remember, if we hadn’t used a volume, our index.html would be gone.

# Stop the container
docker stop my-web-server

# Remove the container
docker rm my-web-server

Confirm that the container is gone using docker ps -a.

Now, let’s run a brand new Nginx container, but crucially, we’ll attach the same my-nginx-data volume to it.

# Run a NEW Nginx container with the SAME volume
docker run -d \
  --name my-new-web-server \
  -p 8080:80 \
  -v my-nginx-data:/usr/share/nginx/html \
  nginx:latest

Once it’s up (docker ps), head back to http://localhost:8080 in your browser. What do you see?

You should still see “Hello from my Docker Volume!”. This proves that our data persisted in the my-nginx-data volume, even after the original container was completely removed. The new container picked up right where the old one left off regarding its data. This is the power of Docker Volumes!
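If you'd rather check this from the terminal than from a browser, the whole swap can be sketched as one function. recreate_and_check is a hypothetical name. It takes the old container name, the new container name, and the volume name, then reads index.html straight from the new container.

```shell
# Sketch: replace the container and confirm the page is still in the volume.
recreate_and_check() {
  old="$1"
  new="$2"
  volume="$3"

  # Get rid of the old container; the volume is left untouched.
  docker stop "$old"
  docker rm "$old"

  # Start a fresh container that mounts the same volume.
  docker run -d --name "$new" -p 8080:80 \
    -v "$volume":/usr/share/nginx/html nginx:latest

  # Read the file from inside the new container instead of using a browser.
  docker exec "$new" cat /usr/share/nginx/html/index.html
}

# Example: recreate_and_check my-web-server my-new-web-server my-nginx-data
```

If the volume did its job, the last command prints the custom page rather than the default Nginx welcome page.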

Step 4: Cleaning Up Volumes

Just like containers, volumes can accumulate if you don’t remove them. While a volume doesn’t take up much space until data is written to it, it’s good practice to clean up what you no longer need.

First, stop and remove our my-new-web-server container:

# Stop and remove the second container
docker stop my-new-web-server
docker rm my-new-web-server

Now, to remove the volume itself:

# Remove the named volume
docker volume rm my-nginx-data

Important: Removing a volume permanently deletes all data stored within it. Be absolutely sure you no longer need the data before running docker volume rm!
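If you're not completely sure, one common approach is to archive the volume's contents first. The sketch below is an illustration under assumptions rather than an official backup tool: backup_volume is a made-up helper, and the second argument must be an absolute path to an existing directory on your host. It mounts the volume read-only into a throwaway alpine container and writes a tarball next to it.

```shell
# Sketch: archive a volume's contents into a tarball in a host directory.
# backup_volume is a hypothetical helper; $2 must be an absolute host path.
backup_volume() {
  volume="$1"
  dest_dir="$2"
  # The volume is mounted read-only (:ro); the host directory receives the archive.
  docker run --rm \
    -v "$volume":/source:ro \
    -v "$dest_dir":/backup \
    alpine tar czf "/backup/$volume.tar.gz" -C /source .
}

# Example: backup_volume my-nginx-data "$PWD"
```

Note that the second -v here is a bind mount, which is a nice example of the two mount types working together.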

If you have many unused volumes, Docker provides a convenient command to clean them up:

# Remove all unused volumes
docker volume prune

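This command asks for confirmation and then deletes volumes that are not referenced by any container, whether running or stopped. Note that on Docker Engine 23.0 and later it removes only unused anonymous volumes by default; add the --all (-a) flag to include unused named volumes as well. This is a handy way to free up disk space.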
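Before removing a specific volume, it can help to check whether any container still references it. A minimal sketch, using the volume filter of docker ps (remove_volume_if_unused is a hypothetical helper name):

```shell
# Sketch: remove a volume only if no container, running or stopped, uses it.
remove_volume_if_unused() {
  volume="$1"
  # -a includes stopped containers; the filter keeps those that mount the volume.
  users="$(docker ps -a --filter "volume=$volume" --format '{{ .Names }}')"
  if [ -n "$users" ]; then
    echo "Volume $volume is still used by: $users" >&2
    return 1
  fi
  docker volume rm "$volume"
}

# Example: remove_volume_if_unused my-nginx-data
```

Docker itself refuses to remove a volume that is in use, so this is mostly about getting a friendlier message that names the containers involved.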

Mini-Challenge: Persisting Application Logs

Let’s solidify your understanding with a small challenge!

Challenge: You need to run a simple alpine container that continuously writes a timestamp to a log file. Ensure that this log file persists even if you stop and remove the container.

Create a new named volume called my-app-logs.
Run an alpine container in detached mode.
Mount my-app-logs to /var/log/app inside the container.
Make the container run a command that appends the current timestamp to /var/log/app/access.log every 2 seconds. (Hint: use watch or a simple while true loop with sleep).
Wait for about 10-15 seconds, then stop and remove the container.
Run a new alpine container, attach the same my-app-logs volume, and verify that access.log contains the timestamps from the previous run.
Clean up your container and volume.

Hint: For the command to run inside the container, something like sh -c 'while true; do echo "$(date): Hello from container!" >> /var/log/app/access.log; sleep 2; done' could work. Keep the single quotes around the whole loop: they pass it to sh as one argument and stop your host shell from expanding $(date) before the container even starts.

What to Observe/Learn: This challenge reinforces the concept of data persistence for application-generated data (like logs) and how volumes make it trivial to achieve. You’ll see that even if the process writing the logs dies with the container, the logs themselves are safe.
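Try it yourself first! If you get stuck, here is one possible solution sketch. The function name run_log_challenge, the container name log-writer, and the WAIT_SECONDS variable are made up for this example. Everything else follows the steps of the challenge.

```shell
# One possible solution sketch for the challenge; names are illustrative.
run_log_challenge() {
  # Steps 1-4: create the volume and start a container that logs every 2 seconds.
  docker volume create my-app-logs
  docker run -d --name log-writer -v my-app-logs:/var/log/app alpine \
    sh -c 'while true; do echo "$(date): Hello from container!" >> /var/log/app/access.log; sleep 2; done'

  # Step 5: let it write for a while, then stop and remove the container.
  sleep "${WAIT_SECONDS:-12}"
  docker stop log-writer
  docker rm log-writer

  # Step 6: a fresh container sees the lines written by the removed one.
  docker run --rm -v my-app-logs:/var/log/app alpine cat /var/log/app/access.log

  # Step 7: clean up the volume (the second container removed itself via --rm).
  docker volume rm my-app-logs
}

# Example: run_log_challenge
```

The docker stop step may take about ten seconds, since the shell loop does not react to the default stop signal and Docker waits before forcing it to exit.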

Common Pitfalls & Troubleshooting

Forgetting to Mount the Volume: The most common mistake! If you run a container, write data, and then remove it without a volume, your data is gone. Always double-check your -v flag when persistence is required.
- Fix: Ensure docker run includes -v <volume_name>:<container_path>.
Incorrect Volume Path: You might mount the volume, but to the wrong path inside the container, or the path in the container might not be where your application expects to write data.
- Fix: Inspect your application’s configuration or the container’s Dockerfile to find the correct path for data storage. Use docker exec -it <container_name> ls -l <path> to explore inside the container.
Permissions Issues: Sometimes, the user inside your container might not have the necessary permissions to write to the mounted volume path. This often happens if the container runs as a non-root user and the volume’s ownership on the host is set to root.
- Fix: Change the ownership or permissions of the mounted path as root (for example docker exec -u root ... chown ..., since a plain docker exec runs as the container’s non-root user), or configure your Dockerfile so the application user owns the target mount point. Loosening permissions broadly (such as chmod 777) may be tolerable as a temporary workaround in development, but for production, proper user management is key.
Volume Not Cleaned Up: If you remove containers but forget to remove their associated volumes, they can accumulate over time, consuming disk space.
- Fix: Regularly use docker volume ls to check for unused volumes, and docker volume prune to clean them up (on Docker Engine 23.0 and later, add --all to include unused named volumes).
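For the permissions pitfall above, one way to fix ownership without touching the application container is a throwaway container that runs as root and adjusts the volume's contents. This is a sketch under assumptions: chown_volume is a hypothetical helper, and 1000:1000 is only a placeholder for whatever UID and GID your application actually runs as.

```shell
# Sketch: hand ownership of a volume's contents to a non-root UID:GID.
# Assumption: 1000:1000 is a placeholder; use your application's real IDs.
chown_volume() {
  volume="$1"
  owner="${2:-1000:1000}"
  # A short-lived alpine container (root by default) changes ownership in the volume.
  docker run --rm -v "$volume":/data alpine chown -R "$owner" /data
}

# Example: chown_volume my-app-logs 1000:1000
```

Numeric IDs are used on purpose, because user names inside the alpine helper container will generally not match the ones in your application image.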

Summary

Phew! You’ve just unlocked a crucial piece of the Docker puzzle. Let’s recap what you’ve learned:

By default, data written inside a container lives in its writable layer and is lost when the container is removed.
Docker Volumes are the primary and recommended way to persist data generated by and used by Docker containers.
Volumes are managed by Docker and stored on the host machine.
We focused on named volumes, which are easy to create, manage, and reuse.
You learned how to create a named volume using docker volume create <name>.
You attached a volume to a container using the -v <volume_name>:<container_path> flag with docker run.
You successfully demonstrated that data written to a mounted volume persists even after the container is stopped and removed.
You know how to inspect volumes (docker volume inspect) and clean them up (docker volume rm, docker volume prune).
You also briefly touched upon bind mounts as another way to connect host storage, often used for development.

This understanding of volumes is absolutely essential for running stateful applications like databases or web servers with user-uploaded content in Docker. You’re building solid foundations!

Next up, we’ll take things a step further and learn how to manage multiple interconnected containers with Docker Compose, which makes defining, running, and scaling multi-container applications a breeze. Get ready to orchestrate your Docker applications like a pro!