5 Simple Ansible Tweaks for Better Playbooks

Tate Galbraith
Jan 14
Photo by Charles Deluvio on Unsplash

Ansible is fast, efficient and easy to use. On its own it can handle deployments of just about any size and lets you build out large-scale infrastructure with nothing more than a simple YAML interface. But Ansible playbooks, roles and modules can grow inefficient and unwieldy over time. The more complicated a role becomes and the more moving parts it gains, the more that once-elegant YAML starts to look and perform like a nightmare.

Iterating multiple times over the same collections, using filters inefficiently, or burying everything under a complex conditional rat's nest can leave Ansible sluggish and confusing. Taking the time to refactor monolithic playbooks is crucial to a happy, healthy Ansible ecosystem.

In this article we'll look at some of my personal favorite Ansible concepts that have helped me craft efficient, extensible and, most importantly, readable playbooks. Let's dive in.

Looping over a group of tasks is a pattern that comes up extremely often. The ability to do this exists in Ansible, just not in the way you might think. You might expect to be able to attach a loop directly to a block of tasks, but unfortunately that doesn't work. The most common way to accomplish this is with include_tasks.

Let’s take a look at how we can execute a group of tasks for a simple list of items:

---
# main.yml
- include_tasks: loop_me.yml
  loop:
    - one
    - two
    - three

Above we have our primary set of tasks in the main.yml file. Let's say we want to loop over a set of tasks here. To do this, we break the tasks out into their own YAML file and include them in this one with a loop defined. We've already defined our loop and specified include_tasks; now let's look at what's inside our tasks file, loop_me.yml:

---
# loop_me.yml
- name: print stuff
  debug:
    msg: "stuff: {{ item }}"

- name: print other stuff
  debug:
    msg: "other stuff: {{ item }}"

Inside loop_me.yml we have a set of tasks that will be looped over according to the loop defined back in main.yml. These two YAML files can be placed in the same directory, and inside loop_me.yml you simply use the item variable to reference each iteration. The item label can be changed using loop_control; more on that next.

Here’s what the output from our playbook looks like:

TASK [include_tasks]
included: /ansible/loop_me.yml for localhost => (item=one)
included: /ansible/loop_me.yml for localhost => (item=two)
included: /ansible/loop_me.yml for localhost => (item=three)

TASK [print stuff]
ok: [localhost] => {
    "msg": "stuff: one"
}

TASK [print other stuff]
ok: [localhost] => {
    "msg": "other stuff: one"
}

TASK [print stuff]
ok: [localhost] => {
    "msg": "stuff: two"
}

TASK [print other stuff]
ok: [localhost] => {
    "msg": "other stuff: two"
}

TASK [print stuff]
ok: [localhost] => {
    "msg": "stuff: three"
}

TASK [print other stuff]
ok: [localhost] => {
    "msg": "other stuff: three"
}

PLAY RECAP
localhost : ok=10 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

As you can see, the three items in the list were passed to the loop_me.yml tasks and each item was run through both tasks. Using include_tasks is incredibly handy when you have a more complicated procedure to execute for a list of items. Breaking the tasks out into their own dedicated file is also beneficial for readability and separation of concerns.
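One caveat worth noting: if the included file runs its own loop, the inner loop's item will overwrite the outer one. Here is a minimal sketch (the outer_item name is just an illustrative choice, not from the example above) of using loop_control's loop_var option to rename the outer variable and avoid the clash:

---
# main.yml
- include_tasks: loop_me.yml
  loop:
    - one
    - two
  loop_control:
    loop_var: outer_item

---
# loop_me.yml
- name: print the outer item
  debug:
    # the included tasks now reference outer_item instead of item
    msg: "outer: {{ outer_item }}"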

While we're on the subject of loops, there is a very simple change you can make that will drastically clean up the output of your plays. When you're working with a list of larger items (like a list of JSON objects), you can provide a label for the loop so that large blobs of data don't pollute the console. Here is a simple way to do this:

- name: test loop output
  debug:
    msg: "debug output: {{ item.name }}"
  loop_control:
    label: "{{ item.name }}"
  loop:
    - { name: "one", data: "1234" }
    - { name: "two", data: "5678" }
    - { name: "three", data: "9876" }

Notice that we've added one extra configuration to this task: loop_control. This lets you tweak how the loop behaves and what data gets written to the console. If you add the label key and specify an element to use as the item's label, you will get more concise output for each iteration. Instead of the entire item being written, you can simply output the name (or some other key) of the item. Let's look at the difference. Here is the output before using loop_control and label:

TASK [test loop output]
ok: [localhost] => (item={'name': 'one', 'data': '1234'}) => {
    "msg": "debug output: one"
}
ok: [localhost] => (item={'name': 'two', 'data': '5678'}) => {
    "msg": "debug output: two"
}
ok: [localhost] => (item={'name': 'three', 'data': '9876'}) => {
    "msg": "debug output: three"
}

Notice how each item is logged in full to the console. Now, here is the output after:

TASK [test loop output]
ok: [localhost] => (item=one) => {
    "msg": "debug output: one"
}
ok: [localhost] => (item=two) => {
    "msg": "debug output: two"
}
ok: [localhost] => (item=three) => {
    "msg": "debug output: three"
}

We no longer have to see the entire item displayed in the console. This becomes even more helpful with larger lists. Not only does this reduce the log overhead, but it also enables faster identification of items that were skipped or have changed/failed.
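To illustrate that last point, here is a minimal sketch (the package list and the skip_packages variable are hypothetical) showing how a labelled loop keeps conditionally skipped items easy to spot:

- name: install selected packages
  package:
    name: "{{ item.name }}"
    state: present
  # skip any package listed in the (hypothetical) skip_packages variable
  when: item.name not in (skip_packages | default([]))
  loop:
    - { name: "htop", note: "process monitoring" }
    - { name: "tmux", note: "terminal multiplexer" }
  loop_control:
    label: "{{ item.name }}"

A skipped package now shows up as skipping: [localhost] => (item=htop) rather than dumping the whole dictionary for every decision.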

The rejectattr filter is an exceptionally convenient one that will save you a lot of time when used correctly. Essentially, this filter drops dict items from a list based on their keys/values. Simple enough. Let's say we have the following list:

- set_fact:
    my_list:
      - { name: 'jeff', age: 24 }
      - { name: 'bill', age: 56 }
      - { name: 'jenny', age: 39 }

Now, let’s assume we want to drop anyone from this list whose age is equal to a certain value. We can do this easily with rejectattr:

- set_fact:
    my_new_list: "{{ my_list | rejectattr('age', 'equalto', 24) | list }}"

Now if we take a look at both lists we can see that jeff has been dropped from the new list:

TASK [debug]
ok: [localhost] => {
    "msg": [
        {
            "age": 24,
            "name": "jeff"
        },
        {
            "age": 56,
            "name": "bill"
        },
        {
            "age": 39,
            "name": "jenny"
        }
    ]
}

TASK [set_fact]
ok: [localhost]

TASK [debug]
ok: [localhost] => {
    "msg": [
        {
            "age": 56,
            "name": "bill"
        },
        {
            "age": 39,
            "name": "jenny"
        }
    ]
}

Using this filter prevents you from having to write complex iterative loops where you compare items in order to filter a list. If you’ve ever attempted to use when logic while looping over a list, this filter will be your new best friend.

There are a number of different conditions you can use with rejectattr. Check out the documentation for more info.
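For instance, here is a rough sketch (these expressions are illustrative, not from the original example) of a couple of other tests, plus rejectattr's counterpart selectattr, which keeps only the matching items:

- set_fact:
    # drop anyone whose name starts with "j"
    no_j_names: "{{ my_list | rejectattr('name', 'match', '^j') | list }}"
    # keep only people aged 40 or older
    older_folks: "{{ my_list | selectattr('age', 'ge', 40) | list }}"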

The next filter, default, is a simple one, but extremely important if you're working on roles that get a lot of reuse from other engineers. You never know when a variable might not be defined. Inventories can change, especially if they're dynamic, and host/group variables you rely on could disappear. For this reason, it is always important to provide default values. This is precisely what default does.

Using default is simple. Let's say we have some tasks that reference a group variable called num_copies. This variable will (for the sake of an example) specify the number of copies of a file that should be created. In our task we'll reference the group variable:

- name: show number of copies
  debug:
    msg: "there will be {{ num_copies }} copies of a file"

If the num_copies variable were to change or be removed we’d be left with the following:

TASK [debug]
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'num_copies' is undefined"}

To protect against this with a default value, you can simply pipe your variable reference to default like this:

- name: show number of copies
  debug:
    msg: "there will be {{ num_copies | default(1) }} copies of a file"

In the above task we reference the num_copies variable again, but this time if it is undefined we’ll use the value provided to default instead. Now if we run it again without num_copies defined we’ll see successful output using our default value:

TASK [debug]
ok: [localhost] => {
    "msg": "there will be 1 copies of a file"
}

If you're working with roles you always have the ability to specify default values via the role's defaults/main.yml file. This works fine for static values, but if you need to handle user input or variables that get loaded dynamically, default may save you a few headaches.
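As a small sketch of both approaches side by side (the role name and values here are made up), a role can ship a static fallback in its defaults while a task still guards the lookup with the filter; passing true as the second argument also covers the case where the variable is defined but empty:

# roles/copy_files/defaults/main.yml (hypothetical role)
num_copies: 1

# roles/copy_files/tasks/main.yml
- name: show number of copies
  debug:
    # falls back to 1 when num_copies is undefined or empty
    msg: "there will be {{ num_copies | default(1, true) }} copies of a file"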

Last up is another simple and straightforward option: delegate_to. If you've ever executed a playbook on a host, you know that when Ansible runs it first connects to the host via SSH and then runs the task. This is fine when you're actually making configuration changes on the target host, but what about when you don't need a particular task to run on that host? What if you need access to a local database or just have to make an API call? This is where delegate_to comes in handy.

With delegate_to you can specify a different host for a task to run on. Usually this is localhost, provided the Ansible controller has the access you need (internet, database, etc.). Let's look at how we can implement this on a per-task basis:

- name: scrape google via the local machine
  delegate_to: localhost
  uri:
    url: "https://google.com"
    return_content: true
  register: uri_output

- debug:
    msg: "{{ uri_output }}"

In the playbook above we are using the uri module to scrape Google's home page and display the output. In our uri task we specify delegate_to: localhost, which tells Ansible that this task should run on the local machine instead of our inventory targets. Now whenever we have tasks that don't rely on target hosts we can delegate them to the local machine.
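A related trick, not covered above but worth keeping in your back pocket: when a delegated task only needs to happen once per play rather than once per target host, you can pair delegate_to with run_once. A minimal sketch, assuming a hypothetical deployment-tracking endpoint:

- name: notify a deployment tracker once per play
  delegate_to: localhost
  run_once: true
  uri:
    url: "https://example.com/api/deployments"  # hypothetical endpoint
    method: POST
    body_format: json
    body:
      status: "started"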

Thank you for reading! I hope you have enjoyed learning about some of my favorite Ansible functionality. If you want to learn more, check out the latest official Ansible documentation available at: https://docs.ansible.com/ansible/latest/index.html.

Written by Tate Galbraith
Software Engineer @mixhalo & die-hard Rubyist. Amateur Radio operator with a love for old technology. Tweet at me: https://twitter.com/@Tate_Galbraith
