One of the first steps in an Ansible playbook run (unless you explicitly disable it) is the
gathering of facts via the
setup module. These facts are collected on each
machine and, before version 1.8, were kept only in memory for the duration of the playbook run before being discarded.
This meant that a task wanting to reference a host variable from a different machine would have to
talk to that machine at least once in the playbook in order for Ansible to have access to its facts, which sometimes means talking to a host even though we need only a teeny weeny bit of information from it.
One interesting feature of Ansible version 1.8 is called “fact caching”. It
allows us to build a cache of all facts for all hosts Ansible talks to. This
cache will be populated with all facts for hosts for which the
setup module (i.e.
gather_facts) runs. Optional expiry of cached entries, as well as enabling the
cache itself, is controlled by settings in ansible.cfg. By default,
fact_caching is set to
memory. Setting it to redis makes Ansible use a Redis
instance (on the local machine) as its cache. The timeout specifies when individual Redis keys
(i.e. facts on a per-machine basis) will expire. Setting this value to 0 effectively disables
expiry, and a positive value is a TTL in seconds.
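A minimal sketch of what a Redis-backed cache configuration in ansible.cfg might look like. The timeout of 86400 seconds (one day) is an assumed value chosen for illustration, not necessarily the one used in this experiment.

```ini
[defaults]
# Use the Redis cache plugin instead of the default in-memory cache.
fact_caching = redis
# Assumed TTL for illustration: cached facts expire after one day.
# A value of 0 disables expiry.
fact_caching_timeout = 86400
```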
The following small experiment will run over 246 machines.
Running my sample playbook gathers all facts on each run. This playbook took just over a minute to run (1m11). So, after the run, what’s in Redis?
If I configure
gather_facts = False, the
setup module is not invoked in the
playbook, and Ansible accesses the cache to obtain facts. Note, of course, that
the value of each fact variable will be that which was previously cached.
Also, because the fact gathering doesn’t take place, the playbook
runs a bit faster (which may be negligible depending on what tasks it’s set to
accomplish). In this particular case, the play ran in just under a minute (0m50) – a slight speedup.
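Instead of switching fact gathering off in each play, the behaviour can also be set globally through the gathering setting in ansible.cfg. The fragment below is a sketch of that alternative, not the configuration used for the timings above.

```ini
[defaults]
# implicit (the default): gather facts unless a play disables it.
# explicit: do not gather facts unless a play asks for it.
# smart: gather facts only for hosts whose facts are not already known.
gathering = smart
```

With smart gathering and a persistent cache, hosts that still have a valid cache entry are skipped by the setup step, while hosts with missing or expired entries are gathered again.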
A second caching mechanism exists at the time of this writing: it’s
jsonfile, and it allows me to use a directory of JSON files as the
cache. Expiry is supported as with Redis, although the JSON file remains on
disk after it has expired (the file’s mtime is used to calculate expiry). If I alter the caching
settings in ansible.cfg, I can activate it.
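A sketch of such a jsonfile configuration follows. The directory path and the timeout are hypothetical values picked for illustration; any directory writeable by the user running Ansible should do.

```ini
[defaults]
# Store one JSON file of facts per host in a directory.
fact_caching = jsonfile
# Hypothetical path: must be a writeable directory.
fact_caching_connection = /tmp/ansible_fact_cache
# Assumed TTL for illustration, in seconds (0 disables expiry).
fact_caching_timeout = 86400
```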
The “connection” setting must point to a writeable directory in which a file
containing facts in JSON format is stored for each host. A
for the cache also exists.
Any playbook which gathers facts effectively populates the cache for the
machines it speaks to.
The following playbook doesn’t
talk to the www01 machine, but it can access that machine’s facts from the cache. (The
fact isn’t default in Ansible: I set this up using
As soon as a cache entry expires, these fact variables will be undefined, and the play will fail.
Populating or rejuvenating the facts cache is trivial: I’ll be running the following playbook periodically in accordance with the cache timeout I’ve configured:
In case of doubt, clear the cache by invoking
ansible-playbook with the