Docker containers with Ansible

Oscar Oranagwa
14 min read · Mar 1, 2023


Before the recent reign of containers, Vagrant, VirtualBox and similar technologies were the ingredients for setting up a local development environment. Paired with automation tools like Ansible and Chef, they made for a decent, reproducible application environment. The rise of lightweight virtualisation options, ushered in by docker and continually simplified by various cloud-based innovations, has however led to the decline of these once very popular developer toolbox darlings. So much so that a sighting of them in the wild is often a sign of a code base from their era.

Recently I made such a sighting. More accurately, I’m now charged with a project that can be thus sighted. The setup required an installation of virtualBox running Debian, provisioned by vagrant and then configured via ansible. It works. Well, for the most part. And when it doesn’t, it’s no small pain to figure out why. Maintaining the coordination between vagrant and virtualBox was an irksome bit of black magic that particularly spurred me to contemplate cheaper, friendlier virtualisation alternatives.

By the end of my onboarding, I arrived at a three-phase plan:

  1. Replace vagrant and virtualBox with docker, i.e. use ansible to set up the project in docker containers.
  2. Replace ansible with docker-compose, accompanied by dedicated Dockerfile for each of the collaborating apps in the project.
  3. Extend the tooling change to other environments: staging and production.

This article is a documentation of the first phase of this plan.

Why?

Using ansible to configure and run docker containers is not a common scenario. In fact, one of the motivations for this article is that I failed to find a single resource that put together what I needed. This scarcity of resources is understandable: once you succeed in getting some automation around a project, going the extra mile with the many tools available becomes fairly trivial.

This case, unfortunately, is different. Given I’m fairly new to the project, it will take some time to track down all the necessary dependencies sprawled across multiple ansible roles and playbooks, figure out their interactions and correctly reproduce them in isolated containers.

However, since the requirements for these dependencies are already captured in the various ansible playbooks, I could make great progress by simply pointing ansible at an empty container and having it configure the container the same way it configures the virtualBox virtual machine (hereafter referred to as vm). After all, that is one of the awesome features of ansible: give it (secure shell, ssh) access to any machine, and it’ll give you your desired environment in return. Here, the machine will be a docker container, reached via ansible’s docker connection rather than ssh, and the desired environment is the development environment captured over time in the various ansible playbooks.
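To make the idea concrete, here is a minimal, hypothetical sketch (the container name, and the presence of Python inside the container, are assumptions): the same kind of play that configures a vm over ssh can target a running container simply by switching the connection plugin.

# illustrative only: point a play at a running container instead of a vm
- hosts: some_container                  # hypothetical container already in the inventory
  connection: community.docker.docker    # talks over `docker exec` instead of ssh
  gather_facts: false
  tasks:
    - name: prove ansible can reach the container
      ansible.builtin.ping:              # needs python inside the container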

The project

The target project of this effort is a fairly large Ruby-on-Rails application, attended by the usual retinue of dependencies: Sidekiq, Nginx, etc.

For obvious reasons, I’m unable to use the actual project for this article. In its place, we’ll be using the public project, docker-rails. I chose this project because its dependencies are fairly similar to those of the target project. It sports:

  • Sidekiq
  • Postgres
  • Redis
  • Opensearch

And accommodating these dependencies will provide the challenges I hope to cover in this walkthrough.

With that settled, the adventure begins with a clone of the project into a workspace.

mkdir -p ansidock
cd ansidock
git clone https://github.com/ledermann/docker-rails

The inventory

Ansible’s inventory manages the list of nodes/hosts that ansible should work with and how it should work with them. Everything from the connection plugin to use, to the location of the python interpreter on each host, is configurable through the inventory file. In this case, I’ll leverage it to lay out the topology of the local development environment.

At a high level, we’ll need to distinguish two different nodes:

  1. a parent node on which to create (and manage) docker containers
  2. a container node within which to run the desired application

To understand the reasoning further, let’s review the current ansible + vagrant + virtualBox setup.

Vagrant will provision a vm against virtualBox. When it’s done, it hands over details of the new vm to ansible. Ansible then configures the new box to run the requested application. In other words, vagrant operates with virtualBox on the actual host, the developer’s machine, to provide a vm which ansible then configures to run the target application.

Translating this gives us an inventory file like this:

# file: ansidock/dev
[dockerhost]
localhost

[app]
[postgres]
[redis]
[opensearch]

[containers:children]
app
postgres
redis
opensearch

[all:vars]
ansible_python_interpreter=/usr/bin/python3

[dockerhost:vars]
ansible_connection=local

[containers:vars]
ansible_connection=docker

The control node is specified under dockerhost, pointing at localhost with its connection set to local. The app, postgres, redis and opensearch hosts, grouped conveniently under containers, are the remote nodes to be reached via the docker connection.

Notice that none of the container hosts have any connection information like a URL or IP address. This is because they will be created on the fly, and one of the tricks I’ll be employing down the line is to fill in this information dynamically as the containers are created.
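As a preview of that trick, the gist is a single add_host task (shown here out of context; the group name is just an example) that slots a freshly created container into one of the empty groups above:

# a preview of the trick, shown out of context
- name: register the new container in the in-memory inventory
  ansible.builtin.add_host:
    name: "{{ container_name }}"   # e.g. ansidock_db
    groups:
      - postgres                   # fills the empty [postgres] group above

We’ll meet the real version of this task shortly, inside a reusable role.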

The network

In the simplified world of a virtualBox vm, all processes can talk to any neighbouring process. Hence, the very first playbook will ensure a similarly open network for the upcoming services.

Docker allows for various kinds of network configuration. For my goals, the basic bridge network suffices.

# file: ansidock/network.yml
---
- hosts: dockerhost
  tasks:
    - name: "Create docker network: {{ network_name }}"
      community.docker.docker_network:
        name: "{{ network_name }}"
        driver: bridge

This network would allow any container on it to address any other container within the network using the container’s IP address, name or alias.
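If you’d like ansible itself to confirm the result, an optional play along these lines (a sketch; it assumes the community.docker collection is installed) reads the network back after creation:

# optional sanity check: read the network back after creating it
- hosts: dockerhost
  tasks:
    - name: inspect the dev network
      community.docker.docker_network_info:
        name: "{{ network_name }}"
      register: net_info

    - name: show the driver and the number of attached containers
      ansible.builtin.debug:
        msg: "{{ net_info.network.Driver }}: {{ net_info.network.Containers | default({}) | length }} container(s) attached"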

The dependencies

There are three main dependencies required by our target application: Postgres, Redis and Opensearch.

In the actual project I’m working on, these dependencies are installed directly in the vm, and the ansible plays for those installations are readily available. Hence, we have the option to toe the same line and explicitly install each dependency in a bare-bones container.

However, I have phase 2 in sight, the phase in which I deprecate ansible. As such, we can already get started deploying dedicated containers for each of these services. Better still, official container images for these projects are readily available, making them easy to adopt. With that consideration, we’ll deploy these dependencies via their official docker images instead of running their installation playbooks.

Now, we’ll be doing some repeated work with containers, so this is as good a time as any to abstract these tasks into a reusable ansible role:

# file: ansidock/roles/container/tasks/main.yml
---
- name: "pull {{ image }}:{{ image_tag }}"
  community.docker.docker_image:
    name: "{{ image }}"
    tag: "{{ image_tag }}"
    source: pull

- name: "create {{ container_name }} container"
  community.docker.docker_container:
    name: "{{ container_name }}"
    image: "{{ image }}:{{ image_tag }}"
    command: "{{ container_command }}"
    auto_remove: yes
    detach: yes
    env: "{{ container_env }}"
    ports: "{{ container_ports }}"
    volumes: "{{ container_volumes }}"
    working_dir: "{{ container_workdir }}"
    networks:
      - name: "{{ network_name }}"

- name: "add {{ container_name }} container to host group: {{ container_host_group }}"
  ansible.builtin.add_host:
    name: "{{ container_name }}"
    groups:
      - "{{ container_host_group }}"
  changed_when: false
  when: container_host_group is defined

- name: "update {{ container_name }} package index"
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get update"'
  when: container_deps is defined

- name: install dependencies
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get install -y {{ container_deps | join(" ") }}"'
  when: container_deps is defined

Backed by the following default variables:

# file: ansidock/roles/container/defaults/main.yml
---
container_command:
container_env: {}
container_host_group:
container_ports: []
container_volumes: []
container_workdir:

This role has tasks to pull an image, create a container from it and attach the container to a docker network. It can also optionally install dependencies inside the container. As you’ll notice, there’s also an add_host task, defined to fill in those empty host sections in the inventory.

Using this role, we create a playbook for our three dependencies:

# file: ansidock/dependencies.yml
---
- name: Postgres database
  hosts: dockerhost
  vars:
    image: "{{ postgres_image }}"
    image_tag: "{{ postgres_version }}"
    container_name: "{{ postgres_container_name }}"
    container_env: "{{ postgres_env }}"
    container_ports: "{{ postgres_ports }}"
    container_host_group: postgres
  roles:
    - container

- name: Redis cache
  hosts: dockerhost
  vars:
    image: "{{ redis_image }}"
    image_tag: "{{ redis_version }}"
    container_name: "{{ redis_container_name }}"
    container_host_group: redis
  roles:
    - container

- name: Opensearch library
  hosts: dockerhost
  vars:
    image: "{{ opensearch_image }}"
    image_tag: "{{ opensearch_version }}"
    container_name: "{{ opensearch_container_name }}"
    container_env: "{{ opensearch_env }}"
    container_host_group: opensearch
  roles:
    - container

This brings us about 40% of the way to the goal.

Here’s our progress so far:

  • an inventory that will be updated dynamically as containers are created
  • a docker network to allow all containers to reach each other
  • the application dependencies available in their own containers

Let’s also confirm our work. For the variable values used in the plays, we provide:

# file: ansidock/group_vars/all.yml
---
network_name: ansidocknet
app_dir: /app/ansidock
app_ruby_version: 3.2.1
app_bundler_version: 2.4.6

# file: ansidock/group_vars/dockerhost.yml
---
postgres_image: postgres
postgres_version: 15-alpine
postgres_container_name: ansidock_db
postgres_ports:
  - "8765:5432"
postgres_env:
  POSTGRES_PASSWORD: password
  POSTGRES_USER: postgres
  POSTGRES_DB: ansidock

opensearch_image: opensearchproject/opensearch
opensearch_version: latest
opensearch_container_name: ansidock_search
opensearch_env:
  discovery.type: single-node
  plugins.security.disabled: "true"

redis_image: redis
redis_version: alpine
redis_container_name: ansidock_redis

then run:

ansible-playbook -i dev network.yml dependencies.yml

Once the play completes successfully, we perform the following checks:

docker container ls --format "{{.Names}}" 
# expect three containers: ansidock_search, ansidock_redis, ansidock_db
docker network inspect --format "{{range .Containers}}{{println .Name}}{{end}}" ansidocknet
# ansidock_search, ansidock_redis, ansidock_db

Nice. Onwards.

The application

Before delving into the application, take a look at the value of the hosts attribute in the playbooks.

You will note that so far, ansible has been operating on dockerhost, i.e. the control node. This is why, in the instances where we needed to perform an operation on a container, we did not use ansible directly within the container but rather had ansible run a shell command on the control node, which in turn executed the command through the docker cli.

For example, take another look at the install dependencies task from the container role above.

- name: install dependencies
  ansible.builtin.command:
    cmd: 'docker exec {{ container_name }} /bin/bash -c "apt-get install -y {{ container_deps | join(" ") }}"'
  when: container_deps is defined

If the target at this point had been the container node, the task could simply have been:

- name: install dependencies
  ansible.builtin.apt:
    package: "{{ container_deps }}"
  when: container_deps is defined

But since ansible, for this task, is operating off the control node, i.e. the machine hosting the containers, we have to execute the commands explicitly through the docker cli.

This distinction comes to a head in the application playbook, where in the first part we’ll need to create the container (using the control node as host) and in the second configure the container (switching to the container node as host). As such, keep an eye out for the switcheroo.

The first order of business is provisioning an empty container, analogous to the debian vm that vagrant makes available to ansible for setting up the project. We’ll approximate this with a base ubuntu image. Like the dependencies before it, we’ll use the container role to set it up.

However, before doing that, we’ll need to figure out what to do about the container’s main process. The lifecycle of a docker container revolves around its main process (PID 1), and handling it properly has been the focus of many teachings, learnings and heartbreaks in container management.

Our challenge here is that the intended main process, the rails server, won’t be available until much later, after ansible has had its time with the container. But for ansible to get to the container, the container needs to be running. And to run the container, we’d like its main process to be the rails server… The obvious solution is to relinquish PID 1 to another long-lived task (like sleep infinity) and start the rails server later, when it’s ready. This is the right direction, with the addition that we’ll need whatever takes the main process slot to also shoulder the management of the rails processes and any other child processes that might crop up.
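For the record, the naive variant would look something like this (a rough sketch reusing the container role from earlier; the image and tag are just for illustration):

# the naive placeholder-process approach (illustrative only)
- name: Application container held open by a dummy process
  hosts: dockerhost
  vars:
    image: ubuntu
    image_tag: "23.04"
    container_name: ansidock_app
    container_command: sleep infinity   # keeps PID 1 alive until the rails server exists
    container_host_group: app
  roles:
    - container

It works, but nothing would then be responsible for the rails and sidekiq processes we start later; we still want whatever sits at PID 1 to adopt and manage them.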

Fortunately, this is not a tall order. The Linux ecosystem is rich with applications written for this very purpose, and from the various options we’ll go with supervisord. In addition to the behaviour we want, supervisord allows child processes to be added (and removed) at any point in its lifetime. We’ll be leveraging this later to get our rails processes online.

With that settled, the next task is clear: cobble together a set of tasks that gives us a base image with supervisord and the option to reconfigure supervisord as needed.

# file: ansidock/roles/supervisor/tasks/build.yml
---
- name: create temp directory for build
  ansible.builtin.tempfile:
    state: directory
  register: build_dir

- name: generate dockerfile
  ansible.builtin.template:
    src: dockerfile.j2
    dest: '{{ build_dir.path }}/Dockerfile'

- name: generate supervisord conf
  ansible.builtin.template:
    src: supervisord.conf.j2
    dest: '{{ build_dir.path }}/supervisord.conf'

- name: build supervisord image
  community.docker.docker_image:
    name: "{{ image }}"
    tag: "{{ image_tag }}"
    source: build
    state: present
    force_source: true
    build:
      path: "{{ build_dir.path }}"
      pull: yes

The task file needs the following two templates:

a simple supervisord configuration

; file: ansidock/roles/supervisor/templates/supervisord.conf.j2
[supervisord]
logfile=/tmp/supervisord.log
loglevel=debug
nodaemon=true
user=root

and a docker image that uses it

# file: ansidock/roles/supervisor/templates/dockerfile.j2
# syntax=docker/dockerfile:1
FROM ubuntu:23.04

RUN apt-get update \
    && apt-get install -y supervisor \
    && mkdir -p /var/log/supervisor

COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf

CMD ["/usr/bin/supervisord", "-n"]

The end result of this will be an Ubuntu image with supervisord installed.
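If you want the play to verify its own work, an optional pair of tasks like these could be appended to build.yml (a sketch; it assumes the community.docker collection is installed):

# optional: confirm the freshly built base image exists locally
- name: inspect the supervisord base image
  community.docker.docker_image_info:
    name: "{{ image }}:{{ image_tag }}"
  register: base_image

- name: fail fast if the build produced nothing
  ansible.builtin.assert:
    that: base_image.images | length > 0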

And now that we have an image, we can hand it off to our container role from earlier to run. Hence, the first part of our application playbook:

# file: ansidock/application.yml
---
- name: Prepare application container image
  hosts: dockerhost
  vars:
    aim: build
    image: "{{ app_image }}"
    image_tag: "{{ app_image_version }}"
    container_name: "{{ app_container_name }}"
    container_env: "{{ app_env }}"
    container_ports: "{{ app_ports }}"
    container_host_group: app
    container_workdir: "{{ app_dir }}"
    container_volumes:
      - "{{ playbook_dir }}/{{ app_src }}:{{ app_dir }}"
    container_deps:
      - python3-apt
      - python3
      - python3-pip
  roles:
    - supervisor
    - container

Notice the extra python dependencies we install in the container. Those will allow us to use ansible modules directly within the container in the next part. Of course, we could (and should) have included these when we built the base image earlier, but then that would be no fun.

Also, in case you didn’t catch it, we’ve also sneaked our project into the container at this point, via the volume mount.

Now that we have a properly primed base image, everything for my real project was largely in place. All that was left to do was to point the existing ansible plays at the container host and voilà, I had a working environment similar to the existing vm solution. But since you stuck with me thus far, we’ll go on to finish this and get the docker-rails project running in our setup.

The final piece remaining is to configure the container to run a rails app. This involves installing ruby with all its dependencies, installing the popular ruby dependency manager Bundler, installing Node.js and its package manager Yarn, and finally preparing the database for the app.

# file: ansidock/roles/ruby/tasks/main.yml
---
- name: install rbenv and app dependencies
  ansible.builtin.apt:
    name:
      - autoconf
      - bison
      - build-essential
      - git
      - imagemagick
      - libdb-dev
      - libffi-dev
      - libgdbm-dev
      - libgdbm6
      - libgmp-dev
      - libncurses5-dev
      - libpq-dev
      - libreadline6-dev
      - libssl-dev
      - libyaml-dev
      - patch
      - rbenv
      - ruby-build
      - rustc
      - tzdata
      - uuid-dev
      - zlib1g-dev
    state: present
    update_cache: true

- name: register rbenv root
  ansible.builtin.command:
    cmd: rbenv root
  register: rbenv_root

- name: install ruby-build rbenv plugin
  ansible.builtin.git:
    repo: https://github.com/rbenv/ruby-build.git
    dest: "{{ rbenv_root.stdout }}/plugins/ruby-build"

- name: "install ruby {{ ruby_version }}"
  ansible.builtin.command:
    cmd: "rbenv install {{ ruby_version }}"
  args:
    creates: "{{ rbenv_root.stdout }}/versions/{{ ruby_version }}/bin/ruby"
  environment:
    CONFIGURE_OPTS: "--disable-install-doc"
    RBENV_ROOT: "{{ rbenv_root.stdout }}"
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: install bundler
  community.general.gem:
    name: bundler
    version: "{{ bundler_version }}"
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: install app gems
  community.general.bundler:
    state: present
    executable: "{{ rbenv_root.stdout }}/shims/bundle"

- name: remove conflicting yarn bin
  ansible.builtin.apt:
    package: cmdtest
    state: absent

- name: add yarn source key
  block:
    - name: yarn | apt key
      ansible.builtin.get_url:
        url: https://dl.yarnpkg.com/debian/pubkey.gpg
        dest: /etc/apt/trusted.gpg.d/yarn.asc

    - name: yarn | apt source
      ansible.builtin.apt_repository:
        repo: "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/yarn.asc] https://dl.yarnpkg.com/debian/ stable main"
        state: present
        update_cache: true

- name: install yarn
  ansible.builtin.apt:
    package: yarn

- name: install javascript packages
  ansible.builtin.command:
    cmd: yarn install --frozen-lockfile
  environment:
    NODE_OPTIONS: "--openssl-legacy-provider"

- name: prepare database
  ansible.builtin.command:
    cmd: bundle exec rails db:prepare
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"

- name: precompile assets
  ansible.builtin.command:
    cmd: bundle exec rails assets:precompile
  environment:
    PATH: "{{ rbenv_root.stdout }}/shims:{{ ansible_env.PATH }}"
    NODE_OPTIONS: "--openssl-legacy-provider"
Nothing out of the ordinary, just the standard tasks any good rubyist will have executed at some point in their career.

If it looks verbose, it’s because it wrestles with common issues: the yarn command on Ubuntu pointing to cmdtest, which has to be removed in favour of Yarn the JavaScript package manager; the ruby-build shipped in the apt repositories lagging behind the rbenv plugin; and so on. Anyway, fun stuff like that isn’t our focus at the moment. So we move on.

Now that we are ready to run the application, we need to instruct supervisord to help us with that.

# file: ansidock/roles/supervisor/tasks/reconfigure.yml
---
- name: generate supervisor conf
  ansible.builtin.template:
    src: program.conf.j2
    dest: "/etc/supervisor/conf.d/{{ filename }}"
  vars:
    command: "{{ item.value }}"
    program: "{{ item.key }}"
    filename: "{{ item.key }}.conf"
    workdir: "{{ container_workdir }}"
  with_dict: "{{ programs }}"

- name: restart supervisord
  community.general.supervisorctl:
    name: '{{ item.key }}'
    config: /etc/supervisor/supervisord.conf
    state: present
  with_dict: "{{ programs }}"

The task takes a map of program names to execution commands, generates a supervisord config for each from the template below and copies it over to the container.

; file: ansidock/roles/supervisor/templates/program.conf.j2
[program:{{ program }}]
command={{ command }}
directory={{ workdir }}
startretries=10
stdout_logfile={{ workdir }}/log/development.log
user=root

Yes, the task also restarts supervisord to use the new configuration(s).

Since this role serves both for building the base image and for re-configuring the supervisord process, let’s throw in a parent task file that conditionally switches between the two actions:

# file: ansidock/roles/supervisor/tasks/main.yml
---
- include_tasks: build.yml
  when: aim == "build"

- include_tasks: reconfigure.yml
  when: aim == "configure"

As you might have realised, we’ve paid no special attention to Sidekiq thus far. This is because it runs the very same rails app, just in a different process. Hence, everything we’ve done for the main application applies to it as well. We only get to single it out now as we complete our application playbook:

# file: ansidock/application.yml
---
- name: Prepare application container image
  hosts: dockerhost
  vars:
    aim: build
    image: "{{ app_image }}"
    image_tag: "{{ app_image_version }}"
    container_name: "{{ app_container_name }}"
    container_env: "{{ app_env }}"
    container_ports: "{{ app_ports }}"
    container_host_group: app
    container_workdir: "{{ app_dir }}"
    container_volumes:
      - "{{ playbook_dir }}/{{ app_src }}:{{ app_dir }}"
    container_deps:
      - python3-apt
      - python3
      - python3-pip
  roles:
    - supervisor
    - container

- name: Setup application container
  hosts: app
  vars:
    aim: configure
    container_workdir: "{{ app_dir }}"
    ruby_version: "{{ app_ruby_version }}"
    bundler_version: "{{ app_bundler_version }}"
    programs:
      app: "/root/.rbenv/shims/bundle exec puma -C config/puma.rb"
      worker: "/root/.rbenv/shims/bundle exec sidekiq"
  roles:
    - ruby
    - supervisor

And our work is done.

Wrap everything in one nice playbook

# file: ansidock/site.yml
---
- ansible.builtin.import_playbook: network.yml
- ansible.builtin.import_playbook: dependencies.yml
- ansible.builtin.import_playbook: application.yml

and an accompanying vars file

# file: ansidock/group_vars/dockerhost.yml
---
postgres_image: postgres
postgres_version: 15-alpine
postgres_container_name: ansidock_db
postgres_ports:
  - "8765:5432"
postgres_env:
  POSTGRES_PASSWORD: password
  POSTGRES_USER: postgres
  POSTGRES_DB: ansidock

opensearch_image: opensearchproject/opensearch
opensearch_version: latest
opensearch_container_name: ansidock_search
opensearch_env:
  discovery.type: single-node
  plugins.security.disabled: "true"

redis_image: redis
redis_version: alpine
redis_container_name: ansidock_redis

app_image: rails_supervisor
app_image_version: 2
app_container_name: ansidock_app
app_src: docker-rails
app_ports:
  - "7000:3000"
app_env:
  DB_HOST: "{{ postgres_container_name }}"
  DB_USER: "{{ postgres_env.POSTGRES_USER }}"
  DB_PASSWORD: "{{ postgres_env.POSTGRES_PASSWORD }}"
  OPENSEARCH_HOST: "{{ opensearch_container_name }}"
  REDIS_SIDEKIQ_URL: "redis://{{ redis_container_name }}:6379/0"
  REDIS_CABLE_URL: "redis://{{ redis_container_name }}:6379/1"
  REDIS_CACHE_URL: "redis://{{ redis_container_name }}:6379/2"
  SECRET_KEY_BASE: some-super-secret-from-ansible-vault
  RAILS_MASTER_KEY: another-super-secret-from-ansible-vault
  APP_ADMIN_EMAIL: admin@example.org
  APP_ADMIN_PASSWORD: secret
  APP_EMAIL: reply@example.org
  PLAUSIBLE_SCRIPT: https://plausible.example.com/js/script.js

Let’s take our work for a test drive:

ansible-playbook -i dev site.yml

If all went well, docker container ls should show our four containers chugging along happily. And on visiting localhost:7000, we should be greeted by Ledermann’s sample app, working in all its magnificence.
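And if the page doesn’t come up right away, one handy place to look (a sketch; the config path matches the supervisord setup above) is supervisord’s own view of the processes inside the app container:

# optional troubleshooting: ask supervisord how puma and sidekiq are doing
- hosts: app
  tasks:
    - name: list supervised programs
      ansible.builtin.command:
        cmd: supervisorctl -c /etc/supervisor/supervisord.conf status
      register: supervisor_status
      changed_when: false
      failed_when: false   # status exits non-zero while a program is still starting

    - name: show their states
      ansible.builtin.debug:
        var: supervisor_status.stdout_lines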

We did it.

Conclusion

This exercise helped answer the questions:

  • can I replace vagrant + virtualBox with docker?
  • if yes, how easily can it be done?

It doesn’t aim to be a final stop. With everything tucked into containers, they are now viable targets for lots of modern tools out there.

For starters, we could take a snapshot of the application container after it has been configured. Creating an image from the running container gives us a working base that can be used to circumvent the entire ansible business we just went through.
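As a rough sketch of that idea (the image tag is arbitrary), a small play on the docker host could commit the provisioned container to a reusable image:

# snapshot the provisioned app container into an image (illustrative only)
- hosts: dockerhost
  tasks:
    - name: commit the configured container
      ansible.builtin.command:
        cmd: docker commit ansidock_app ansidock_app:provisioned
      changed_when: true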

And armed with these new possibilities, we’re off to phase 2: docker-compose?
