Understanding How Docker Multi-arch Images Work

Jakub Czapliński
Published in Icetek, Apr 23, 2020
Everyone is building containers these days

With the steadily rising popularity of IoT and edge computing, the ARM architecture is gaining a lot of ground. ARM processors now power not only phones but also small single-board computers. The best-known example is of course the beloved Raspberry Pi, but there are many more devices designed with makers in mind, such as the NVIDIA Jetson family, Khadas VIM and Rock Pi, to name a few. Because of that, the need to build, test and run code on various architectures, both natively and in Docker containers, is bigger than ever.

Why should I bother?

In this article I want to show how to build and store our images for different architectures so that they behave like, for example, the python image, which you can run on almost any architecture just by running docker run -it python:3.8. The main focus is on how the registry handles multiple images and how those images work. I won’t use the built-in docker buildx command, as it has some restrictions and would make it harder to explain what is happening under the hood.

If you want to follow along, I have prepared a repository with the source code, all the commands, the Dockerfiles and everything else you need to run similar tests. You can work on your own hardware as long as you can run Docker on it and the architecture is supported by Python, which you can check in the picture below. The repository can be found here.

Mission objectives

Let’s start by checking what we can find out about the python image with the 3.8 tag on Docker Hub.

Python 3.8 supported architectures

As you can see, the image with this tag is available for multiple architectures. Thanks to that, I can run a simple Python script on all of my test hardware.

My laptop:

upgrade@ZeroOne ~ $ uname -a
Linux ZeroOne 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
upgrade@ZeroOne ~ $ docker run -it python:3.8 python -c 'print("Hello world!")'
Hello world!

Raspberry Pi 4:

upgrade@rpi4:~ $ uname -a
Linux rpi4 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
upgrade@rpi4:~ $ docker run -it python:3.8 python -c 'print("Hello world!")'
Hello world!

NVIDIA Jetson Nano:

upgrade@jetson:~$ uname -a
Linux jetson 4.9.140-tegra #1 SMP PREEMPT Mon Dec 9 22:47:42 PST 2019 aarch64 aarch64 aarch64 GNU/Linux
upgrade@jetson:~$ docker run -it python:3.8 python -c 'print("Hello world!")'
Hello world!

This portability is pretty awesome: we can develop Python scripts that run on a multitude of devices without a huge hassle. More importantly, we do not have to remember which tag runs where. This is critical if we want to develop an application that will be run by external users, because no one will want to dig through documentation to work out which tag to choose for their hardware. However, there is a catch when we try to make our own images work the same way.

Understanding the problem

I want to start by illustrating where the “catch” is, so that I can later show how we can correctly solve the problem.

First things first: we need a test subject. Time to build and test a simple API that returns the hostname and the equivalent of uname -i.
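For readers who do not want to open the repository, here is a rough sketch of what such an API could look like. It is an assumption, not the actual app.py: the real application runs on Flask, while this version sticks to the Python standard library so that it runs without extra packages. The names node_info, NodeInfoHandler and serve are hypothetical, and platform.machine() is my guess at what produces the "machine" field seen in the responses.

```python
import json
import platform
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer


def node_info():
    """Collect the two facts the test API reports."""
    return {
        # Inside a container this is, by default, the short container ID.
        "hostname": socket.gethostname(),
        # Kernel machine name, e.g. x86_64, aarch64 or armv7l.
        "machine": platform.machine(),
    }


class NodeInfoHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the node info as JSON."""

    def do_GET(self):
        body = json.dumps(node_info()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Listen on all interfaces; 8080 is the port the container publishes."""
    HTTPServer(("0.0.0.0", port), NodeInfoHandler).serve_forever()
```

Calling serve() starts the server in the foreground; the function is deliberately not invoked here so that the module can be imported without blocking.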

Docker build:

upgrade@ZeroOne ~/src/python/multi-arch-python $ docker build -t icetekio/nodeinfo:fixedarch . 
Sending build context to Docker daemon 108kB
Step 1/5 : FROM python:3.8
...
trimmed output
...
Digest: sha256:3df040cc8e804b731a9e98c82e2bc5cf3c979d78288c28df4f54bbdc18dbb521
...
trimmed output
...
Successfully built 9185b3dffa7a
Successfully tagged icetekio/nodeinfo:fixedarch

Docker push:

upgrade@ZeroOne ~/src/python/multi-arch-python $ docker push icetekio/nodeinfo:fixedarch
The push refers to repository [docker.io/icetekio/nodeinfo]
...
trimmed output
...
fixedarch: digest: sha256:486ec9f38ecfc476e7abe911031bf8ea4a7c605716b35c490b5cf524ef0c3d12 size: 2635

And let’s see how it looks on the Docker Hub.

Architecture of the “nodeinfo:fixedarch” image

Cool! We have the image, and as you can see, in my case it is linux/amd64. That makes sense, because, if you remember, the output of uname -a on my machine was

Linux ZeroOne 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

and we need to remember that x86_64 and AMD64 are two names commonly used for the same architecture.

The quote “The nice thing about standards is that you have so many to choose from”, which I think comes from Andrew S. Tanenbaum, fits here pretty nicely.
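Because the kernel and Docker use different names for the same hardware, a small lookup table can help when scripting. The sketch below is only illustrative: the table covers just the three machines used in this article, and the names UNAME_TO_DOCKER_ARCH and docker_arch are hypothetical.

```python
import platform

# Kernel machine names (as printed by uname) mapped to Docker architecture names.
# Only the three devices from this article are listed.
UNAME_TO_DOCKER_ARCH = {
    "x86_64": "amd64",   # laptop
    "aarch64": "arm64",  # NVIDIA Jetson Nano
    "armv7l": "arm",     # Raspberry Pi 4 running a 32-bit OS
}


def docker_arch(machine=None):
    """Translate a uname-style machine name into a Docker architecture name."""
    machine = machine or platform.machine()
    try:
        return UNAME_TO_DOCKER_ARCH[machine]
    except KeyError:
        raise ValueError(f"no known Docker architecture for {machine!r}")


for name in ("x86_64", "aarch64", "armv7l"):
    print(name, "->", docker_arch(name))
```

Raising an error for unknown names is a deliberate choice: silently guessing an architecture would only postpone the failure until the container starts.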

Now I can run the application

upgrade@ZeroOne ~ $ docker run --name nodeinfo -it -p 8080:8080 icetekio/nodeinfo:fixedarch
* Serving Flask app "app" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
* Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)

and in the second terminal check if the app is working

upgrade@ZeroOne ~ $ curl localhost:8080
{"hostname":"0b31ce6600f1","machine":"x86_64"}

OK, it seems everything is going according to plan.

As we know, Python is very portable, and we should be able to run our simple script on a multitude of architectures. Well, Python is, but our image isn’t. Remember what we saw in the Docker registry? When I run the same command on the other devices, I get this nasty error.

NVIDIA Jetson Nano:

upgrade@jetson:~$ docker run --name nodeinfo -it -p 8080:8080 icetekio/nodeinfo:fixedarch
standard_init_linux.go:211: exec user process caused "exec format error"

Raspberry Pi 4:

upgrade@rpi4:~ $ docker run --name nodeinfo -it -p 8080:8080 icetekio/nodeinfo:fixedarch
standard_init_linux.go:211: exec user process caused "exec format error"

To better understand what happened, we need to think about how Docker builds images. Simplifying, it takes the list of layers that make up the base image (the FROM python:3.8 in our example) and runs the commands from the Dockerfile on top of them. Every command adds a new layer, and at the end Docker creates a new image manifest with the full list of layers.

You can always review the layers on the Docker Hub UI. The Python image looks like this

Python 3.8 image for amd64 architecture details

From here you can learn how every image is built. As you can also see, there is an architecture dropdown, and images for different architectures may be built differently. That is why, when we built our image, it used a very specific image as its base. That base image was built for the AMD64 architecture and will not work on hardware that is not compatible with it. What we want is for our image to behave the same way as the python image: have a dropdown and be available on a multitude of platforms.
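The idea that an image inherits its architecture from its base can be shown with a toy model. This is a deliberately simplified sketch and not how Docker stores images internally: the Image class and the build function are hypothetical, and every Dockerfile instruction is counted as one layer, following the simplification used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    architecture: str
    layers: tuple


def build(base, instructions):
    """Stack one layer per Dockerfile instruction on top of the base image."""
    new_layers = base.layers + tuple(instructions)
    # The result keeps the architecture of the base it was built from.
    return Image(architecture=base.architecture, layers=new_layers)


# Two variants of the same base tag, one per architecture.
python_amd64 = Image("amd64", ("base layers of python:3.8 for amd64",))
python_arm64 = Image("arm64", ("base layers of python:3.8 for arm64",))

# The instructions that follow FROM python:3.8 in the test Dockerfile.
dockerfile = (
    "RUN pip install Flask==1.1.2",
    "EXPOSE 8080",
    "ADD app.py /app.py",
    'CMD ["python", "app.py"]',
)

laptop_image = build(python_amd64, dockerfile)
jetson_image = build(python_arm64, dockerfile)
print(laptop_image.architecture, len(laptop_image.layers))
print(jetson_image.architecture, len(jetson_image.layers))
```

The same Dockerfile gives two different results purely because the base differs, which is exactly why the image built on the laptop cannot start on an ARM board.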

Building on different architecture

It is time to build the same application on my Jetson Nano and push it to the registry.

Running the build on the Jetson Nano will create an image that is based on the ARM64 Python image and has the same architecture in its manifest.

Docker build:

upgrade@jetson:~/multi-arch-python$ docker build -t icetekio/nodeinfo:fixedarch . 
Sending build context to Docker daemon 108kB
Step 1/5 : FROM python:3.8
3.8: Pulling from library/python
Digest: sha256:3df040cc8e804b731a9e98c82e2bc5cf3c979d78288c28df4f54bbdc18dbb521
Status: Downloaded newer image for python:3.8
---> a42ce4e154a5
Step 2/5 : RUN pip install Flask==1.1.2
---> Using cache
---> e643716e3aae
Step 3/5 : EXPOSE 8080
---> Using cache
---> 74727b378afc
Step 4/5 : ADD app.py /app.py
---> 346527755dc5
Step 5/5 : CMD ["python", "app.py"]
---> Running in 47dbd9c82e10
Removing intermediate container 47dbd9c82e10
---> 2ef5c195cac4
Successfully built 2ef5c195cac4
Successfully tagged icetekio/nodeinfo:fixedarch

Docker push:

upgrade@jetson:~/multi-arch-python$ docker push icetekio/nodeinfo:fixedarch
The push refers to repository [docker.io/icetekio/nodeinfo]
...
trimmed output
...
fixedarch: digest: sha256:f7cf076161876b4339d08ada1db9f39f818b743ecb2a8276c4512688e179e83d size: 2635

I have run the same test with the curl command on my Jetson Nano and it works perfectly. Let's look at Docker Hub and see how the images are described now.

Architecture of the “nodeinfo:fixedarch” image

As you can see, Docker has overwritten the fixedarch tag, and now the only available image is for ARM64. If I try to run it on my laptop, which is AMD64, or on the Raspberry Pi, which is ARM32, I will get the familiar errors. So we can build images for different architectures, but how do we expose them under one tag?
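The overwrite is easy to picture if we think of a tag as a single key in a dictionary. The snippet below is a toy model under that assumption, not the registry's real data structure; the registry dictionary and the push function are hypothetical, while the digests are the ones printed by the two docker push commands above.

```python
# A toy registry: each tag maps to exactly one manifest.
registry = {}


def push(tag, architecture, digest):
    """Pushing to an existing tag replaces whatever the tag pointed to."""
    registry[tag] = {"architecture": architecture, "digest": digest}


# First push from the laptop, second push from the Jetson Nano.
push("fixedarch", "amd64",
     "sha256:486ec9f38ecfc476e7abe911031bf8ea4a7c605716b35c490b5cf524ef0c3d12")
push("fixedarch", "arm64",
     "sha256:f7cf076161876b4339d08ada1db9f39f818b743ecb2a8276c4512688e179e83d")

# Only the last push is reachable through the tag.
print(registry["fixedarch"]["architecture"])
```

A plain image manifest describes one architecture, so one tag pointing at one plain manifest can never serve three kinds of hardware.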

Manifest tool to the rescue!

There is a tool designed to solve this problem, written by Phil Estes. You can read more about it and download it from https://github.com/estesp/manifest-tool. What we need to do now is build all 3 images with different tags and then use this tool to create a multi-architecture manifest and publish it on Docker Hub.

The easiest way to do it is to push all 3 images with different tags and then merge them into a single tag. I chose multiarch as the tag and added an -arch suffix for every architecture that I build the image for.

Run the docker build on my laptop

upgrade@ZeroOne ~/src/python/multi-arch-python $ docker build -t icetekio/nodeinfo:multiarch-amd64 . 
...
trimmed output
...
Successfully tagged icetekio/nodeinfo:multiarch-amd64

and then docker push

upgrade@ZeroOne ~/src/python/multi-arch-python $ docker push icetekio/nodeinfo:multiarch-amd64
...
trimmed output
...
multiarch-amd64: digest: sha256:486ec9f38ecfc476e7abe911031bf8ea4a7c605716b35c490b5cf524ef0c3d12 size: 2635

I have repeated the process on the Jetson Nano and the Raspberry Pi 4, and now we can verify the images in the Docker registry.

Multiple images built for specific architectures

Now we need to create a spec file in YAML format that we will pass to manifest-tool to create a multi-architecture manifest gluing all images into one tag.

The YAML file should look like this:

image: icetekio/nodeinfo:multiarch
manifests:
  - image: icetekio/nodeinfo:multiarch-amd64
    platform:
      architecture: amd64
      os: linux
  - image: icetekio/nodeinfo:multiarch-arm32
    platform:
      architecture: arm
      os: linux
  - image: icetekio/nodeinfo:multiarch-arm64
    platform:
      architecture: arm64
      os: linux

Notice that the architecture for 32-bit ARM is actually arm. The values in the manifest need to correspond to the values found in the Docker registry.
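If you maintain more than a handful of architectures, you may prefer to generate the spec file instead of typing it by hand. The helper below is a hypothetical sketch (manifest_spec is not part of manifest-tool) that renders the same YAML from a mapping of tag suffixes to Docker architecture names. It builds the text by hand to avoid depending on a YAML library.

```python
def manifest_spec(repository, tag, architectures, os_name="linux"):
    """Render a manifest-tool YAML spec for per-architecture tags.

    `architectures` maps a tag suffix to a Docker architecture name,
    for example {"arm32": "arm"}.
    """
    lines = [f"image: {repository}:{tag}", "manifests:"]
    for suffix, arch in architectures.items():
        lines += [
            f"  - image: {repository}:{tag}-{suffix}",
            "    platform:",
            f"      architecture: {arch}",
            f"      os: {os_name}",
        ]
    return "\n".join(lines) + "\n"


spec = manifest_spec(
    "icetekio/nodeinfo",
    "multiarch",
    {"amd64": "amd64", "arm32": "arm", "arm64": "arm64"},
)
print(spec)
```

Keeping the suffix and the architecture as separate values makes the arm32/arm mismatch explicit instead of hiding it in string manipulation.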

And now for the final step!

upgrade@ZeroOne ~/src/python/multi-arch-python $ manifest-tool --debug push from-spec manifest.yaml 
DEBU[0000] endpoints: [{false https://registry-1.docker.io v2 false true true 0xc000346600}]
DEBU[0000] repoName: icetekio/nodeinfo
INFO[0000] Retrieving digests of images...
DEBU[0000] authConfig for docker.io: mrupgrade
DEBU[0000] endpoints: [{false https://registry-1.docker.io v2 false true true 0xc000346780}]
DEBU[0000] Trying to fetch image manifest of docker.io/icetekio/nodeinfo repository from https://registry-1.docker.io v2
INFO[0002] Image "icetekio/nodeinfo:multiarch-amd64" is digest sha256:486ec9f38ecfc476e7abe911031bf8ea4a7c605716b35c490b5cf524ef0c3d12; size: 2635
DEBU[0002] authConfig for docker.io: mrupgrade
DEBU[0002] endpoints: [{false https://registry-1.docker.io v2 false true true 0xc000542d80}]
DEBU[0002] Trying to fetch image manifest of docker.io/icetekio/nodeinfo repository from https://registry-1.docker.io v2
INFO[0005] Image "icetekio/nodeinfo:multiarch-arm32" is digest sha256:c9ab7b7cd3c7b89bc611ce56368d6a4cbc6c27040e485839410e9c1ae8d4c9bf; size: 2635
DEBU[0005] authConfig for docker.io: mrupgrade
DEBU[0005] endpoints: [{false https://registry-1.docker.io v2 false true true 0xc000346f00}]
DEBU[0005] Trying to fetch image manifest of docker.io/icetekio/nodeinfo repository from https://registry-1.docker.io v2
INFO[0008] Image "icetekio/nodeinfo:multiarch-arm64" is digest sha256:f7cf076161876b4339d08ada1db9f39f818b743ecb2a8276c4512688e179e83d; size: 2635
DEBU[0008] Manifest list push url: https://registry-1.docker.io/v2/icetekio/nodeinfo/manifests/multiarch
DEBU[0008] mediaType of manifestList: application/vnd.docker.distribution.manifest.list.v2+json
DEBU[0008] authConfig for docker.io: mrupgrade
Digest: sha256:7eb4667ada05d0bc64686080c7dbc80edb1adf2405f7061db386df61e5ad778a 1050

Success!

One image supporting all three architectures

So now we know how to build images that users can use seamlessly on multiple architectures. Now this small app can run on all 3 devices and there is no need to worry about which tag to use — simply use the multiarch tag!
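To make it clearer what the multiarch tag now points to, here is a simplified picture of a manifest list and of how a client could pick the right entry. Treat it as a sketch: the real document stored in the registry carries more fields per entry, and the resolve function is a hypothetical stand-in for the selection Docker performs on pull. The digests and sizes are the ones reported by manifest-tool above.

```python
# Simplified manifest list: one entry per architecture-specific image.
MANIFEST_LIST = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {
            "digest": "sha256:486ec9f38ecfc476e7abe911031bf8ea4a7c605716b35c490b5cf524ef0c3d12",
            "size": 2635,
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            "digest": "sha256:c9ab7b7cd3c7b89bc611ce56368d6a4cbc6c27040e485839410e9c1ae8d4c9bf",
            "size": 2635,
            "platform": {"architecture": "arm", "os": "linux"},
        },
        {
            "digest": "sha256:f7cf076161876b4339d08ada1db9f39f818b743ecb2a8276c4512688e179e83d",
            "size": 2635,
            "platform": {"architecture": "arm64", "os": "linux"},
        },
    ],
}


def resolve(manifest_list, architecture, os_name="linux"):
    """Return the digest of the image that matches the requested platform."""
    for entry in manifest_list["manifests"]:
        platform_info = entry["platform"]
        if (platform_info["architecture"] == architecture
                and platform_info["os"] == os_name):
            return entry["digest"]
    raise LookupError(f"no image for {os_name}/{architecture}")


# Each device asks for the same tag but ends up with a different image.
for arch in ("amd64", "arm", "arm64"):
    print(arch, resolve(MANIFEST_LIST, arch))
```

The tag itself no longer names an image at all; it names a small index, and the architecture-specific images stay reachable through their own digests.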

Final thoughts

There is a lot more to preparing applications to run on every platform. In this example we used Python, which is portable by itself. Most Python code can run anywhere without any porting, and usually the only thing the user has to worry about is having a Python interpreter available. Applications in other languages, especially compiled ones, need to be written with portability in mind and built on a given machine. Sometimes, as with Golang for example, the compiler offers cross-compilation. Building applications for multiple architectures is actually a massive subject, and I will try to cover more aspects of it in future articles, including how Docker can be used to help with this process.

If you are interested in reading about my failures and the lessons I learned when I first tried to port my application to the ARM architecture, you can read my Medium post found here.

Readout

Multi architecture announcement on Docker blog: https://www.docker.com/blog/docker-official-images-now-multi-platform/

Blog post about multi architecture functionalities in docker: https://www.docker.com/blog/multi-arch-all-the-things/

Manifest tool GitHub repository: https://github.com/estesp/manifest-tool
