Roles management with Ansible

This article mainly aims to compile a collection of good practices as explained in the official Ansible documentation. It originated from our difficulty in synthesizing all the information available on this topic in a simple, efficient and future-proof fashion.

Ansible roles: a good practice?

Ansible is a configuration management tool that is usually used for environment provisioning, software deployment and process orchestration.

Its agentless approach, simple YAML-based syntax and modularity make it a flexible and versatile tool. As a result, we can easily end up with several possible implementations of the same project, all of them correct from a technical standpoint. Our mission is to identify the “best” one according to the following principles:

  • Code reusability
  • Code and folder structure readability
  • Evolution/maintenance-friendly

Coming back to the topic at hand, an Ansible role is a set of tasks that work toward ensuring the presence or absence of a feature (e.g. a Linux user or a Kubernetes cluster of 100 nodes).

Each role must be able to operate in a standalone fashion (role dependencies aside) and, to this end, includes several additional elements in its folder structure:

├── README.md
├── defaults
│   └── main.yml
├── files
├── handlers
│   └── main.yml
├── meta
│   └── main.yml
├── tasks
│   └── main.yml
├── templates
└── vars
    └── main.yml

We have a README.md file used to document the role’s usage and a defaults folder where we define our default variables. Then there is the files folder, which contains the static files (e.g. configuration files or binaries) the role is going to need. handlers holds the Ansible event handlers, and meta contains the Ansible Galaxy metadata (e.g. the author’s name) as well as the role’s dependency list. Next comes the most important folder, tasks, which keeps the set of tasks to execute. The templates folder holds all the Jinja2 templates that allow us to generate all sorts of files programmatically. Finally, we declare the role’s variables in vars (these take precedence over the definitions in defaults).
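To illustrate how these folders interact, here is a minimal sketch of a tasks/main.yml and its matching handlers/main.yml. The role name (myapp), the variable myapp_config_path, the template name and the service name are all hypothetical and only serve the example.

```yaml
# tasks/main.yml (hypothetical "myapp" role)
---
- name: Render the application configuration from a Jinja2 template
  ansible.builtin.template:
    src: myapp.conf.j2               # looked up in the role's templates/ folder
    dest: "{{ myapp_config_path }}"  # assumed to have a default in defaults/main.yml
    mode: "0644"
  notify: Restart myapp              # runs the handler below only if the file changed

# handlers/main.yml (separate file, shown here for convenience)
---
- name: Restart myapp
  ansible.builtin.service:
    name: myapp
    state: restarted
```

The two documents above live in separate files; they are shown together only to make the link between notify and the handler name visible.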

The use of Ansible’s roles seems adequate to ensure that the code we produce is generic enough and reusable anytime, anywhere. All we need to do, with that in mind, is to follow some guidelines.

Basic guidelines

It doesn’t hurt to review the basics of role writing first.

Limiting each role to a single feature maximises its reusability in various scenarios while keeping its complexity to a bare minimum. It is then better to rely on role dependencies (as stated in the README file and the role’s metadata) than to include a whole new set of tasks each time we meet a new environment.

Similarly, in order to reach an adequate level of generalisation, we’ll need to move every variable that specifies the role’s runtime context out of the role and into the playbook (the users, their permissions, the number of nodes for a cluster, tool configuration, etc.). We’ll see in an upcoming section how to deal with variables outside of the role itself.
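As a rough sketch of this idea, the role below only ships a generic default, and the playbook supplies the value that matches its own context. The role name k8s_cluster and the variable k8s_cluster_node_count are hypothetical.

```yaml
# roles/k8s_cluster/defaults/main.yml (hypothetical role)
---
k8s_cluster_node_count: 3   # generic fallback, easy to override

# playbook.yml (separate file)
---
- hosts: all
  roles:
    - role: k8s_cluster
      vars:
        k8s_cluster_node_count: 100   # runtime context decided by the playbook
```

Because the value lives in defaults, any definition coming from the playbook or the inventory takes precedence over it, which keeps the role itself free of environment-specific details.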

Next, not every group of tasks can be converted into a role, or is even suited to it. If the feature can’t be generalized, or is simple enough that a single Ansible module can implement it, then it seems sensible to avoid using a role. Tasks and roles can perfectly coexist within the same Ansible project; the trick is to find a balance depending on the project’s specs.
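For instance, creating a single Linux user is already covered by one module call, so a plain task is probably enough. The sketch below mixes such a task with a role in the same play; the user name and the role name some_role are assumptions.

```yaml
# playbook.yml (sketch)
---
- hosts: all
  become: true
  roles:
    - role: some_role            # hypothetical role handling a reusable feature
  tasks:
    # Note: within a play, roles are executed before the entries under "tasks"
    - name: Ensure the deploy user exists
      ansible.builtin.user:
        name: deploy             # hypothetical user name
        state: present
```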

Finally, keeping a formal folder structure within each role is important in order to facilitate its comprehension and maintenance. For that, we rely on the structure given by Ansible on its webpage dedicated to good practices, already laid out in a previous section of this article. A skeleton structure can be generated automatically using the ansible-galaxy CLI:

ansible-galaxy init <role_name>

Reusability in practice

Now that we have laid the groundwork, we must find the best way to reuse those roles in various projects, whatever the versioning tool used (if any).

We then have several options to choose from:

  • Copy the roles in the target repository alongside the playbooks called by the CI/CD pipeline.
  • Use Ansible Galaxy.
  • Use a repository for each role on a host such as GitHub, GitLab, Bitbucket, etc.

The first one, obviously, isn’t right from a maintainability and role management perspective (the roles end up scattered across your projects). Ansible Galaxy is a good platform to make your roles accessible to all, but it involves setting up a new account on GitHub (if you’re using another versioning tool) just to store and manage the roles. Out of convenience, we decided to put our roles on the platform we already use for our source code (even though the second option is a good alternative).

Placing ourselves in the third scenario, all we have to do is use the ansible-galaxy CLI to achieve the expected outcome (as described in the official documentation):

ansible-galaxy install -r path/to/requirements.yml -p folder/to/install/role

This command is called before the playbook execution within our CI/CD pipeline.
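A pipeline job following this order could look like the sketch below. It assumes GitLab CI; the job name, the container image, the inventory path and the playbook name are all hypothetical and need to be adapted to your own setup.

```yaml
# .gitlab-ci.yml (sketch; names, image and paths are assumptions)
deploy:
  stage: deploy
  image: registry.example.com/ansible:latest   # hypothetical image with Ansible installed
  script:
    # 1. Fetch the roles listed in requirements.yml into the roles/ folder
    - ansible-galaxy install -r requirements.yml -p roles/
    # 2. Only then run the playbook that uses them
    - ansible-playbook -i inventory playbook.yml
```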

The requirements.yml file contains the arguments that are of interest to us. Among other things, it lets us import multiple roles from different sources. Here’s a basic example for a role we use to deploy a containerized Sphinx Search:

---
- name: sphinxse
  src: git@gitlab.com:repository/name/sphinxse.git
  scm: git
  version: origin/master

This role can be found here, and there is also a role to generate a SphinxSE configuration, for those who might be interested. Be careful, however: to pull a repository from an automated CI/CD job, you’ll have to add the SSH key of the host/server running the main playbook (or that of your custom Ansible Docker image, for that matter) to the deploy keys of the project concerned. Nearly all versioning tools have that option nowadays (e.g. GitLab and Bitbucket). In GitLab, the setting can be found under Settings > Repository > Deploy Keys.
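Since requirements.yml can mix sources, a file combining a Galaxy role with our git-hosted role might look like the following sketch. The Galaxy role name and its version are placeholders, not real references.

```yaml
# requirements.yml (sketch mixing two sources)
---
# Role hosted on Ansible Galaxy, in the username.role_name format
- src: some_author.some_role   # hypothetical role
  version: "1.2.0"             # hypothetical version

# Role pulled from a private git repository over SSH
- name: sphinxse
  src: git@gitlab.com:repository/name/sphinxse.git
  scm: git
  version: origin/master
```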

As explained here, it’s possible to define the role’s dependencies with the same syntax in the meta/main.yml file. Nothing too fancy: by diligently following the instructions of the documentation and those of the default file created by ansible-galaxy init, it becomes easy to establish a moderately complex and flexible dependency tree (i.e. multiple kinds of sources can be used, multiple dependencies can be defined, those dependencies can have their own dependencies, etc.):

dependencies: []
# List your role dependencies here, one per line. Be sure to remove
# the '[]' above, if you add dependencies to this list.

You specify role dependencies in the meta/main.yml file by providing a list of roles. If the source of a role is Galaxy, you can simply specify the role in the format username.role_name. The more complex format used in requirements.yml is also supported, allowing you to provide src, scm, version, and name.
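Put together, a filled-in meta/main.yml could resemble the sketch below, with one dependency in the short Galaxy format and one in the longer format. The Galaxy role name is hypothetical, and the git entry simply reuses the repository address from the requirements.yml example.

```yaml
# meta/main.yml (sketch)
---
dependencies:
  # Short format: a role published on Ansible Galaxy
  - src: some_author.some_role   # hypothetical role
  # Long format: same keys as in requirements.yml
  - src: git@gitlab.com:repository/name/sphinxse.git
    scm: git
    version: origin/master
    name: sphinxse
```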

Variables management

After taking a look at Ansible’s variables documentation page, it becomes clear that variable management is a sensitive and important topic. Indeed, between the various kinds of variables (facts, registered variables, manually defined variables) and the scope of manually defined variables, which ranges from the playbook level to the task level depending on where they are set, it can be a bit confusing. This is especially true once we start using roles, which introduce a new layer of complexity (e.g. naming variables carefully to avoid name collisions: having var_1 from role_1 and var_1 from role_2 is what we call a collision here, because they have the same name).

Basic strategy

Although a variable’s definition depends largely on its use, it is necessary to choose a strategy that is viable in most cases without adding unwanted complexity. Fortunately, the official Ansible page dedicated to good practices covers this topic in some depth. There, it is recommended to use a group_vars folder at the root of your Ansible project and then declare all your variables in two distinct files, vars and vault (the latter being encrypted using ansible-vault and containing all the sensitive information).

This practice introduces a new layer of complexity when you have to work with multiple environments. As a matter of fact, a role called in several environments would have to use different names for its input parameters in each of them, on top of adding duplicates to the variable definition files. There is also the question of how to organise the variables within those files. Should we sort them by environment? By feature? Or should we divide the variable files into smaller ones?

The right approach is then to split up our files and it is, once again, the Ansible documentation about variables that sheds some light on the matter:

Regional information might be defined in a group_vars/region variable.

The goal here is to split the variable definitions by region and by host (both are declared in the project’s Ansible inventory). Let’s take a look at a simple example of what this means:

group_vars
├── all
│   ├── vars
│   └── vault
├── region_1
│   ├── vars
│   └── vault
└── region_2
    ├── vars
    └── vault

host_vars
├── host_1
│   ├── vars
│   └── vault
└── host_2
    ├── vars
    └── vault

Please note that the more specific a group is (all being the least specific, then region, then host), the higher its priority when defining a variable. This means that a variable (called var_1) defined in the vars file of all will be superseded by its definition in the vars files of region_1 and region_2 (the latter two definitions can coexist without impacting each other). The same is true for a definition of var_1 in the vars files of host_1 and host_2, which supersedes the region-level one.
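The sketch below makes this precedence concrete. The inventory layout (host_1 and host_2 both belonging to region_1) and the values are assumptions chosen for the example.

```yaml
# inventory.yml (hypothetical layout)
---
all:
  children:
    region_1:
      hosts:
        host_1:
        host_2:

# group_vars/all/vars (separate file)
---
var_1: "defined for every host"

# group_vars/region_1/vars (separate file)
---
var_1: "defined for region_1"   # overrides the "all" definition for host_1 and host_2

# host_vars/host_1/vars (separate file)
---
var_1: "defined for host_1"     # overrides both group definitions, for host_1 only
```

With these files, host_1 resolves var_1 to its host-level value, while host_2, which has no host-level definition, falls back to the region_1 value.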

Writing conventions for variable files

To bring order to this slew of variables, it becomes essential to put some writing conventions in place for our variable files. Just a little disclaimer before moving on: these conventions are based on arbitrary rules that we use in our own development cycle. Please see the next paragraphs as examples of how to formalise things and not as strict rules set in stone.

We chose to name our variables roleName_varName and roleName_varName_vault, in order to distinguish non-sensitive information from secrets. This lets us avoid collisions between variables from different playbooks, roles or sets of tasks that have the same name. At the same time, we structure our files as follows:

---
playbook_var_1: "test playbook var 1"

# Role 1 variables
role_1_var_1: "test role 1 variable 1"

# Role 2 variables
role_2_var_1: "test role 2 variable 1"

Although this generates duplicate variables for services used in multiple places (connecting to the same Docker repository, for instance), it is imperative to keep the coupling between roles to a minimum in order to avoid side effects when changing a variable’s value (or name).
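One possible way to combine this naming convention with the vars/vault split is sketched below: the vars file stays readable and points to the secret stored in the vault file. The variable names and values are hypothetical, and the indirection itself is an assumption borrowed from Ansible’s good-practices page rather than something this convention requires.

```yaml
# group_vars/region_1/vars (plain text)
---
# Role 1 variables
role_1_db_user: "app"                                  # hypothetical variable
role_1_db_password: "{{ role_1_db_password_vault }}"   # points to the secret below

# group_vars/region_1/vault (separate file, shown decrypted;
# encrypted with ansible-vault in practice)
---
role_1_db_password_vault: "change-me"                  # hypothetical secret
```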

For the more specific cases

Let’s pretend that we use a role originating from Ansible Galaxy or from a given git repository. In this case, we have no control over the definition of the role’s variables. This may well contradict the convention we just established, and a middle ground must be found (we can’t just avoid using roles created and maintained by the community). Either we use the same name for our playbook’s variable as the one from the role and give up our convention for this particular role, or we map the role’s variable to our own. For instance:

roles:
  - { role: role_1, var_defined_by_role_1: "{{ role_1_var_redefined_vault }}" }
  - role: role_2
    var_defined_by_role_2: "{{ role_2_var_redefined }}"

To wrap up

I hope this article shed some light on how to use Ansible more effectively in a streamlined fashion (more specifically its role mechanism). I’ll update it should we change our role writing process or should Ansible introduce breaking changes in an upcoming version.

Impulse by INGENIANCE

Written by

We are a community of technology lovers based in Paris. We work on innovative projects related to Software Technology, DevOps, Big Data & Blockchain.
