Host cross-linking in Ansible

George Shuklin
Published in OpsOps
4 min read, Apr 2, 2020

In this article I discuss the problem of robustly linking two hosts in different groups. It’s not a beginner’s problem, so I think we need an example here.

Let’s say we have some kind of ‘consumers’ and ‘providers’. There is a group ‘consumers’ and a group ‘providers’, with an equal number of hosts. We want to configure providers to serve consumers, and we need to configure each consumer to use a specific provider. (The real nature of ‘consumer/provider’ is not important: it can be benchmarking, iSCSI configuration, pool management, etc.) The main point here is that each consumer is linked to one provider only, and one provider can serve only one consumer. It’s a 1-to-1 relation.

Our inventory looks like this:

[providers]
provider1 pkey=42
provider2 pkey=41
provider3 pkey=43
[consumers]
consumer1 ckey=99
consumer2 ckey=33
consumer3 ckey=19

When we configure each provider, it needs its own ‘pkey’ and the ‘ckey’ of the linked consumer.
(ckey/pkey are our imaginary settings reduced to the simplest form.)

Bad solutions

Before showing proper solutions, I’d like to discuss how not to do it.

❌ Host names as keys ❌

- name: extracting ckey
  set_fact:
    ckey: '{{ hostvars["consumer" + inventory_hostname.split("provider")[1]].ckey }}'

In this example we extract the ‘number’ from our inventory hostname and join it with the string ‘consumer’ to generate the consumer’s hostname. It’s terrible. There is parasitic linking, implicit assumptions, a lack of validation and a lot of ambiguity. And it’s hard to read.

Duplication of data

It’s not good but not terrible either. There are some reasonable cases when this approach may work.

[providers]
provider1 pkey=42 ckey=99
provider2 pkey=41 ckey=33
provider3 pkey=43 ckey=19
[consumers]
consumer1 ckey=99 pkey=42
consumer2 ckey=33 pkey=41
consumer3 ckey=19 pkey=43

This works well if you have a short, immutable set of data. If there are a lot of variables, or the inventory is updated often, it becomes hard to maintain. It’s also not suitable for dynamic data (like ansible_default_ipv4.address) of the ‘peer’ host.

Good solutions

✅ Common variables in common group

This method is good, but for static data only. It uses special groups to join pairs and provides both hosts of a pair with the same set of variables. (I’ve switched to YAML here as the inventory becomes more complex.)

providers:
  hosts:
    provider1:
    provider2:
    provider3:
consumers:
  hosts:
    consumer1:
    consumer2:
    consumer3:
group1:
  hosts:
    provider1:
    consumer1:
  vars:
    ckey: 99
    pkey: 42
group2:
  hosts:
    provider2:
    consumer2:
  vars:
    ckey: 33
    pkey: 41
group3:
  hosts:
    provider3:
    consumer3:
  vars:
    ckey: 19
    pkey: 43

Please note that in this example playbooks are run for the groups ‘providers’ and ‘consumers’, and the groups ‘group1,..,group3’ are so-called ‘groups for variables’. They have no playbooks associated with them.
It has all the benefits of DRY (you have every variable once) and clarity (it’s very easy to see who is connected to whom, and variables are in the same place as their pairs), it’s concise (you have no ‘hostvars’ anywhere in the code), and it’s completely free from any pathological coupling.
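
As a minimal sketch of how this might look from the playbook side (the play layout and task names are illustrative assumptions), both plays read ckey and pkey as ordinary host variables, with no hostvars lookup:

```yaml
# Sketch only: assumes the YAML inventory above, where each pair group
# (group1..group3) supplies ckey and pkey to both of its members.
- hosts: providers
  gather_facts: false
  tasks:
    - name: Show own key and the key of the paired consumer
      debug:
        msg: "pkey={{ pkey }}, serving the consumer with ckey={{ ckey }}"

- hosts: consumers
  gather_facts: false
  tasks:
    - name: Show own key and the key of the paired provider
      debug:
        msg: "ckey={{ ckey }}, using the provider with pkey={{ pkey }}"
```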

If you combine this trick with the fact that Ansible does support multiple inventories, you have an almost perfect solution.
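
One possible way to split things, sketched under the assumption of two hypothetical files: hosts.yaml keeps the ‘providers’ and ‘consumers’ groups, and pairs.yaml keeps only the groups for variables. Ansible merges the sources when several -i options are given.

```yaml
# pairs.yaml (hypothetical file name): only the "groups for variables".
# hosts.yaml (also hypothetical) would hold the providers/consumers groups.
# Both are passed together, for example:
#   ansible-playbook -i hosts.yaml -i pairs.yaml site.yaml
group1:
  hosts:
    provider1:
    consumer1:
  vars:
    ckey: 99
    pkey: 42
group2:
  hosts:
    provider2:
    consumer2:
  vars:
    ckey: 33
    pkey: 41
```

The pairing can then be changed, or swapped for another pairs file, without touching the host list.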

There are two downsides here:

  • This method does not support dynamic variables. setup, set_fact, register, etc: none of that is available for ‘peers’ in the group.
  • If both ‘providers’ and ‘consumers’ use the same variable names, it’s hard to keep them from overriding each other (think of ‘node_id’ for some cluster).

✅ Explicit cross-linking

This method just puts an explicit link from one host to another.

[providers]
provider1 pkey=42 link=consumer1
provider2 pkey=41 link=consumer2
provider3 pkey=43 link=consumer3
[consumers]
consumer1 ckey=99 link=provider1
consumer2 ckey=33 link=provider2
consumer3 ckey=19 link=provider3

To access the peer’s data, just use hostvars[link]:

- hosts: consumers
  tasks:
    - name: Printing pkey
      debug: var=hostvars[link].pkey

This method completely avoids parasitic coupling. We have an explicit double link. (If you need to, you can have a uni-directional link, which allows a one-to-many relation.)
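
For the uni-directional, one-to-many case, a provider can collect its consumers by filtering on their links. This is only a sketch: it assumes that consumers carry link=providerN while providers carry no link at all, and the fact name my_consumers is made up for the example.

```yaml
# Sketch only: assumes each consumer has "link" pointing at a provider,
# and several consumers may point at the same provider.
- hosts: providers
  gather_facts: false
  tasks:
    - name: Collect the consumers that point at this provider
      set_fact:
        my_consumers: >-
          {{ groups['consumers']
             | map('extract', hostvars)
             | selectattr('link', 'defined')
             | selectattr('link', 'equalto', inventory_hostname)
             | map(attribute='inventory_hostname')
             | list }}

    - name: Show them
      debug:
        var: my_consumers
```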

It works nicely with dynamic variables. Facts, results of calculations, etc. are all available.
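
A small sketch of reading a peer’s fact through the link. It assumes the providers’ link values from the inventory above, and that every host has a default IPv4 route so that ansible_default_ipv4 is populated; facts of the peer have to be gathered before they are read.

```yaml
# Sketch only: the first play exists just to gather facts on all hosts,
# so that hostvars[link] already contains facts in the second play.
- hosts: providers:consumers
  gather_facts: true

- hosts: providers
  gather_facts: false
  tasks:
    - name: Show the address of the linked consumer
      debug:
        msg: "{{ link }} is at {{ hostvars[link].ansible_default_ipv4.address }}"
```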

The single downside here is that you need to link each node manually and keep an eye on the link consistency.

The latter can be automated by a simple assertion:

- assert:
    that:
      - hostvars[link].link == inventory_hostname
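
A slightly more defensive variant, as a sketch: it also catches a missing link or a link to an unknown host before comparing both ends. The task name and the message are illustrative.

```yaml
# Sketch only: conditions are checked in order, so the later ones
# are not reached when "link" is missing or points nowhere.
- name: Check that the link is defined, known and mutual
  assert:
    that:
      - link is defined
      - link in groups['all']
      - hostvars[link].link is defined
      - hostvars[link].link == inventory_hostname
    fail_msg: "{{ inventory_hostname }}: broken or one-sided link"
```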

Nevertheless, if you have many consumers and many providers it’s a lot of typing.

Autoindexing

It’s not exactly the best method, as it casts a hidden hex on your code. In exchange for the soul of clean code, it provides some conciseness when you have a bunch of things to link together.

This method assumes you have an equal number of consumers and providers, and that you don’t care who serves whom as long as the pairing is fixed for a given inventory and the 1-to-1 rule is preserved.

This is actually my new invention for my playbooks for running benchmarks (from sources to targets).

We keep inventory intact:

[providers]
provider1 pkey=42
provider2 pkey=41
provider3 pkey=43
[consumers]
consumer1 ckey=99
consumer2 ckey=33
consumer3 ckey=19

We select ‘peer’ host by relative position in the group.

- name: selecting a peer
  set_fact:
    # "| int" keeps the lookup working if the templated index arrives as a string
    link: '{{ groups.providers[my_grp_pos | int] }}'
  vars:
    my_grp_pos: '{{ groups.consumers.index(inventory_hostname) }}'

There is no ‘Ansible magic’ here.

>>> help(list.index)
Help on method_descriptor:
index(self, value, start=0, stop=9223372036854775807, /)
Return first index of value.

groups.foo is a list. Even though the inventory is a dictionary, Ansible preserves the inventory order, so groups.foo is ordered. The index method gives us the index of an element. (All hosts in a group are unique, so ignore ‘first’ in the index description.) As soon as we have a stable and unique number for the position of ‘our host’ in its own group, we can use it to choose the ‘peer’ from another group.

And it’s symmetric. my_grp_pos is the same for a ‘consumer’ and its ‘provider’ (each one looks up its position in its own group), so they can link to each other with no problems.
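
The symmetry can be sketched as a single play that runs on both groups. The helper variables my_group and peer_group are made up for this example; the index is computed inline, inside the same expression that uses it.

```yaml
# Sketch only: every host finds its position in its own group
# and picks the host at the same position in the other group.
- hosts: providers:consumers
  gather_facts: false
  tasks:
    - name: Select the peer by position, from either side
      set_fact:
        link: "{{ groups[peer_group][groups[my_group].index(inventory_hostname)] }}"
      vars:
        my_group: "{{ 'providers' if inventory_hostname in groups['providers'] else 'consumers' }}"
        peer_group: "{{ 'consumers' if inventory_hostname in groups['providers'] else 'providers' }}"
```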

The downside of this method is that it’s implicit (explicit is better than implicit), and that it requires the same number of hosts in both groups.

Moreover, it’s very fragile with respect to inventory updates. Any change in the host order (sorting, insertion, deletion) changes the relation.
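
The size requirement, at least, can be checked up front. A minimal sketch (the task name and message are illustrative); note that it does nothing against reordering, which still silently changes the pairs.

```yaml
# Sketch only: fail early if the two groups cannot be paired 1-to-1.
- name: Refuse to auto-index groups of different size
  assert:
    that:
      - groups['providers'] | length == groups['consumers'] | length
    fail_msg: "providers and consumers must have the same number of hosts"
  run_once: true
```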

Conclusion

There is a way to write clean code in Ansible. Sometimes.


I work at Servers.com, most of my stories are about Ansible, Ceph, Python, Openstack and Linux. My hobby is Rust.