Perfecting fact gathering in Ansible

second edition

A long time ago (well before delegate_facts became a feature) I wrote about clever hacks for gathering the IP addresses of remote servers. Ansible has evolved a lot since then. Fewer clever hacks are needed now, so it’s time to update the recipe.

When and why is it needed?

Most Ansible playbooks I have seen use host facts to configure addresses. That means plays (roles, etc.) use ansible_default_ipv4 or other addresses of other hosts, which are discovered at runtime. This makes sense, as most configurations are designed to work with preconfigured network addresses controlled by some external entity (DHCP in cloud environments, allocations from an ISP, etc.). It’s an obvious and proper idea to use runtime information instead of ‘hardcoded’ values in the inventory, which would need to be updated every time the inventory changes.

Normally it’s done by this inelegant construction:

{% for item in groups[somegroup] %}
{{hostvars[item]["ansible_default_ipv4"]["address"]}}
{% endfor %}

Or even by this (multihome-aware) monster:

{% for item in groups[somegroup] %}
{{(hostvars[item]['ansible_all_ipv4_addresses']|ipaddr(my_preferable_net))[0]}}
{% endfor %}

Nothing special yet. (Some people may find the ‘ipaddr’ filter new here.)
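For context, here is a minimal sketch of how the multihome-aware loop might be used inside a task. The network value, the destination path, and the use of the copy module are assumptions made for illustration. On recent Ansible versions the ipaddr filter lives in the ansible.utils collection and needs the netaddr Python library on the control node.

```yaml
# Sketch only: the network, the destination path and the task are assumptions.
- hosts: somegroup
  vars:
    my_preferable_net: 10.0.0.0/8   # assumed preferred network
  tasks:
    - name: list the address of every group member on the preferred network
      copy:
        dest: /tmp/peers.txt        # hypothetical output file
        content: |
          {% for item in groups["somegroup"] %}
          {{ (hostvars[item]["ansible_all_ipv4_addresses"] | ipaddr(my_preferable_net))[0] }}
          {% endfor %}
```

Note that the `[0]` lookup fails for any host that has no address inside the chosen network, so this form assumes every member of the group is attached to it.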

The problem

The problem arises when we want to use the --limit option to restrict our play to a specific host or group that is narrower than ‘somegroup’ from the examples above. If at least one host in the group (somegroup) has not had its facts gathered (gather_facts never ran against it), it won’t have the ‘ansible_*’ set of variables, which causes a nasty and ugly error:

fatal: [host1]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'ansible_all_ipv4_addresses'\n\nThe error appears to have been in ...

And we can’t just suppress this message, because we really need this variable.

To solve this problem we can use a combination of delegate_to and delegate_facts, together with some optimizations to speed things up.

The solution

- name: update facts
  setup:
  delegate_to: '{{item}}'
  delegate_facts: yes
  when: hostvars[item]["ansible_all_ipv4_addresses"] is not defined
  with_items: '{{groups["somegroup"]}}'
  tags:
    - always

Notes:

  • The key trick here is to check whether a host already has fact variables. If it does, the task is skipped for that host. If not, Ansible will try to gather them.
  • delegate_facts makes life much easier: the gathered facts are assigned to the host the task was delegated to, not to the host running the task.
  • always is a special tag that forces the task into the task list regardless of the tags passed to --tags. You may want to replace ‘always’ with the list of tags assigned to all tasks that use the gathered facts.
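To show where the task might sit in a playbook, here is a rough sketch of a complete play. The play target, the template file names, the destination path, and the ‘config’ tag are all hypothetical. Only the fact-gathering task itself comes from the recipe above.

```yaml
# Sketch only: everything except the "update facts" task is an assumption.
- hosts: all
  gather_facts: yes
  pre_tasks:
    - name: update facts
      setup:
      delegate_to: '{{ item }}'
      delegate_facts: yes
      when: hostvars[item]["ansible_all_ipv4_addresses"] is not defined
      with_items: '{{ groups["somegroup"] }}'
      tags:
        - always
  tasks:
    - name: render a config that lists every member of somegroup
      template:
        src: peers.conf.j2           # hypothetical template using hostvars[...]
        dest: /etc/myapp/peers.conf  # hypothetical destination
      tags:
        - config

# Example invocation (playbook name assumed):
#   ansible-playbook site.yml --limit host1 --tags config
```

With this layout a run limited to host1 still collects facts for the rest of somegroup before the template is rendered, while hosts that already have facts are skipped.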

Further enhancements

In real scenarios it may be desirable to add an until/retries/delay construction so that Ansible tries a few times on each host before giving up.

Another solution may be to use service discovery tools (consul, etcd), but in many cases simple ‘fact gathering’ is enough. The key disadvantage of the fact-gathering approach is that all hosts must be up before their data can be used.
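A minimal sketch of such a retry construction follows. The retry count, the delay, and the registered variable name are assumed values, not recommendations.

```yaml
# Sketch only: facts_result, retries and delay are assumed for illustration.
- name: update facts (with retries)
  setup:
  delegate_to: '{{ item }}'
  delegate_facts: yes
  when: hostvars[item]["ansible_all_ipv4_addresses"] is not defined
  with_items: '{{ groups["somegroup"] }}'
  register: facts_result
  until: facts_result is succeeded   # evaluated per item
  retries: 3                         # assumed number of attempts
  delay: 5                           # assumed pause in seconds between attempts
  tags:
    - always
```

Be aware that until retries a failed module run. Whether a connection failure (an unreachable host) is retried as well may depend on the Ansible version, so it is worth testing this against a host that is actually down.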