Certain file operations in a Dockerfile can substantially inflate the resulting image size. In this post we look closely at why this happens and at ways around it.
Let’s say we want to add a file to our image:
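The original listing is not reproduced here, so what follows is only a minimal sketch. It assumes the official httpd base image (suggested by the image name myhttpd used in this post) and a hypothetical file of roughly 100MB called bigfile.bin in the build context; the destination path is also an assumption.

```dockerfile
# Hypothetical example: base image, file name and destination path are assumptions.
FROM httpd:2.4

# ADD creates one new layer that contains the file.
ADD bigfile.bin /usr/local/apache2/htdocs/bigfile.bin
```

Building it with `docker build -t myhttpd .` and listing it with `docker images myhttpd` shows the resulting size.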
The image size increased by the size of that file. No surprises here.
Now let’s say we want to add a file and also update the file owner/group:
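Again, the original listing is not reproduced here. A minimal sketch, assuming the official httpd base image, a hypothetical file of roughly 100MB called bigfile.bin, and the www-data user as the new owner, could look like this:

```dockerfile
# Hypothetical example: base image, file name, path and owner are assumptions.
FROM httpd:2.4

# Layer 1: the file is stored here, owned by root.
ADD bigfile.bin /usr/local/apache2/htdocs/bigfile.bin

# Layer 2: chown rewrites the file's metadata, so a full copy of the file
# is stored again in this new layer.
RUN chown www-data:www-data /usr/local/apache2/htdocs/bigfile.bin
```

Build it with `docker build -t myhttpd2 .` and list it with `docker images myhttpd2`.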
Compare the size of the new image (myhttpd2) with the one we built in the previous step (myhttpd). The new image somehow gained an extra 100MB in size!
Surprised? Let’s explain what’s going on here.
Every instruction in a Dockerfile runs in a separate (intermediate) container. The results are then stored as a new image layer on top of the existing ones. Adding a file in one layer and then removing, replacing or even moving it in another layer does not remove the original file from the underlying layer. Think of layers as Git commits: a file is preserved in the Git history even after you remove it from the repo.
Updating ownership on a file with chown effectively results in duplicating that file and storing its new copy in a new layer. The original copy is still there in the underlying layer. The same applies to the chmod and even mv commands. When using them in your Dockerfile, you may end up with an inflated image before you realize what’s going on.
As a workaround, and as a best practice, multiple RUN commands in a Dockerfile should be logically grouped together, so that any download, move/remove and permission-setting operations happen in the same intermediate container and are therefore committed in the same image layer.
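The grouping described above might look like the following sketch. The download URL (example.com), the file name, the destination path and the www-data owner are all placeholders chosen for illustration, not taken from a real project.

```dockerfile
# Hypothetical example: URL, file name, path and owner are assumptions.
FROM httpd:2.4

# Everything happens in one RUN, so only the final state of the filesystem
# is committed to a single layer: the file is stored once, already owned by
# www-data, and the apt package lists are removed before the layer is saved.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && curl -fsSL -o /usr/local/apache2/htdocs/bigfile.bin https://example.com/bigfile.bin \
 && chown www-data:www-data /usr/local/apache2/htdocs/bigfile.bin \
 && chmod 640 /usr/local/apache2/htdocs/bigfile.bin \
 && rm -rf /var/lib/apt/lists/*
```

Splitting the same commands across several RUN instructions would produce the same final filesystem, but each intermediate state would be kept in its own layer.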
Unfortunately, this does not help much when adding local files with ADD/COPY. There is currently no way to pass user:group ownership or to set the desired permissions for files copied from the host. These files end up owned by root, with permissions set to 644 (-rw-r--r--).
Here is a lengthy discussion on GitHub regarding the issue, which was eventually closed as “won’t fix”:
Dockerfile: ADD does not honor USER: files always owned by root · Issue #6119 · moby/moby
Is there a workaround for this?
Well, in theory, you could download the files from the host with curl instead of using ADD/COPY. This would allow you to perform additional operations on the files in the same RUN statement. While technically possible, this is highly impractical and not at all portable.
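To make the idea concrete, here is a rough sketch. It assumes that something on the build host serves the build directory over HTTP, and that the host is reachable from the build container at 172.17.0.1:8000; that address, the file name, the destination path and the www-data owner are all assumptions that will differ between setups, which is exactly the portability problem.

```dockerfile
# Hypothetical example: host address, port, file name, path and owner are assumptions.
FROM httpd:2.4

# URL of an HTTP server running on the build host. The address depends on
# the host's network configuration, so this default will not work everywhere.
ARG FILE_URL=http://172.17.0.1:8000/bigfile.bin

# Download and chown in the same RUN, so the file is stored in one layer only.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && curl -fsS -o /usr/local/apache2/htdocs/bigfile.bin "$FILE_URL" \
 && chown www-data:www-data /usr/local/apache2/htdocs/bigfile.bin \
 && rm -rf /var/lib/apt/lists/*
```

The URL can be overridden at build time with `docker build --build-arg FILE_URL=... .`, but every build environment still needs its own file server.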
Another option would be for Docker to replace the ADD/COPY statements with some mechanism that can be used inside RUN, or to allow ADD/COPY within a RUN statement.
Have you figured out a workaround for this? Share it with others in the comments!
Update: Further discussions of the issue and solutions are happening here:
Update: Starting with Docker 17.09.0-ce (2017-09-26), the ADD/COPY commands support the --chown flag in a Dockerfile:
COPY --chown=docker:docker source /path/to/destination
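As a sketch, the add-and-change-owner case from the beginning of this post could then be written with a single instruction. The httpd base image, the www-data owner, the file name bigfile.bin and the destination path are assumptions used for illustration.

```dockerfile
# Hypothetical example: base image, owner, file name and path are assumptions.
# Requires Docker 17.09.0-ce or later.
FROM httpd:2.4

# Ownership is set while the file is copied, so no separate RUN chown
# is needed and the file is stored in one layer only.
COPY --chown=www-data:www-data bigfile.bin /usr/local/apache2/htdocs/bigfile.bin
```

Because no second layer touches the file, the image should grow only by roughly the size of the file itself.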