So you want to do configuration management for systems that don't have special software installed on them? You don't want to spend a lot of time learning said management tool? You want to spend more time on stuff that matters? That is possible, with Ansible.

Ansible is a radically simple model-driven configuration management, deployment, and command execution framework. Other tools in this space have been too complicated for too long, require too much bootstrapping, and have too much learning curve.

Getting started with Ansible is simple:

  1. Choose a machine as your management system and install Ansible.
  2. Ensure you have an SSH key for the nodes you want to manage and that your management system can log onto those nodes.
  3. Create a hosts file containing an inventory of your nodes.
  4. Start using Ansible.

Managed nodes don't require any software installation: there is no need to install Ansible or any of its dependencies on them, and thus there are no daemons to manage.

[Figure: Ansible architecture]

Ansible doesn't require a PKI or specific transports for communicating between the manager and its nodes: it uses plain and simple SSH, authenticating with public keys (or passwords if you must). I recommend you create an SSH key specifically for use by Ansible and deploy its public portion to the .ssh/authorized_keys files of the root user on all your nodes. (The Ansible manager uses paramiko, an SSH2-protocol module for Python, which, unfortunately, doesn't (yet?) support GSSAPI or native SSH connections which do.)

Ansible uses an inventory file, called hosts, to determine which nodes you want to manage. This is either a plain-text file which lists individual nodes or groups of nodes (e.g. DNS servers, Web servers, etc.) or an executable program which outputs an inventory of (groups of) hosts and variables. A simple hosts file I started off with looks like this:


This defines a group of nodes I call dnsservers with the two specified hosts in it. The first node will be passed a specific configuration variable called domain.

I can now immediately launch Ansible to see if my setup works:

$ ansible dnsservers -m ping
 | success >> {
    "ping": "pong"
}
 | success >> {
    "ping": "pong"
}

Ansible connects to each individual node in the group of dnsservers, transmits the required module (here: ping), launches the module, and returns the module's output. Instead of addressing a group of nodes, I can also specify an individual node or a set of nodes with wildcards.

Ansible works.

Shall we copy a file to one of the DNS servers? I instruct ansible to use the copy module and specify a source and a destination path, as well as owner and permissions of the destination file:

$ ansible -m copy -a 'src=/etc/resolv.conf dest=/tmp/resolv.conf owner=jpm mode=0400'
 | success >> {
    "changed": true,
    "group": "root",
    "md5sum": "acd7408a5d2c64c531d2eea1a71513c3",
    "mode": "0400",
    "path": "/tmp/resolv.conf",
    "state": "file",
    "user": "jpm"
}

Almost all actions are idempotent, so if I repeat that, Ansible compares the MD5 sums of the source and target files and skips the copy if they are equal.
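
The checksum comparison can be illustrated with a minimal sketch. This is not Ansible's actual copy module; the function names md5sum and copy_if_changed are hypothetical, and the sketch only shows the general idea of skipping a copy when source and target already match.

```python
import hashlib
import os
import shutil
import tempfile


def md5sum(path):
    """Return the hex MD5 digest of a file, or None if the file does not exist."""
    if not os.path.exists(path):
        return None
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_if_changed(src, dest):
    """Copy src to dest only when their checksums differ (hypothetical helper)."""
    if md5sum(src) == md5sum(dest):
        return {"changed": False, "path": dest}
    shutil.copyfile(src, dest)
    return {"changed": True, "path": dest, "md5sum": md5sum(dest)}


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "resolv.conf.src")
    dest = os.path.join(workdir, "resolv.conf.dest")
    with open(src, "w") as handle:
        handle.write("nameserver 192.0.2.1\n")
    print(copy_if_changed(src, dest)["changed"])  # first run copies the file
    print(copy_if_changed(src, dest)["changed"])  # second run finds equal checksums
```

Running the operation twice is safe: the second call reports that nothing changed, which is what makes repeated runs harmless.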

Ansible currently contains a number of modules:

  • Manage packages and services with yum and apt and service.
  • Invoke remote commands with command and shell.
  • File storage with copy (to node) and fetch (from node) as well as file (sets attributes) and template.
  • Inventory with setup, and if they are installed, facter and ohai.
  • Software deployment with git, which checks out a Git repo onto a node.
  • User management with user and group.
  • Virtual machine management for systems that support libvirt with virt.

Instead of shooting off individual ansible commands, I can group these together into so-called playbooks which declare a specific configuration I want to apply to a node. Actions are processed in the order I specify (I emphasize this, because that is one thing I intensely dislike about how Puppet works). A playbook is expressed in YAML:

- hosts: dnsservers
  vars:
    name: JP Mens
  user: root
  tasks:
  - name: Deploy resolv.conf
    action: template src=/etc/ansible/ dest=/etc/resolv.conf owner=jpm mode=0444
  - name: get INVENTORY
    action: fetch src=/etc/ansible/setup dest=/tmp/ss

The format is very readable: hosts describes the group of nodes from the hosts file and vars the variables I want to pass to them. (These variables can be used within the playbook and within templates as {{ var }}.) I specify that Ansible should log in to nodes as the root user, but I can specify any user and have Ansible use sudo to gain privileges. Tasks have a name, which helps us track playbook progress, and an action. In the action I specify which module I want Ansible to invoke.

The ansible-playbook utility processes the playbook and instructs the nodes to perform the actions, starting with an implicit invocation of the setup module, which collects system information for Ansible and stores it on the node in /etc/ansible/setup. Actions are performed top-down, and an error causes Ansible to stop processing actions for that particular node.

$ ansible-playbook dns.yaml 

PLAY [dnsservers] ********************* 

SETUP PHASE ********************* 

ok: []

ok: []

TASK: [Deploy resolv.conf] ********************* 

ok: []

ok: []

TASK: [get INVENTORY] ********************* 

ok: []

ok: []

PLAY RECAP ********************* 

                 : ok=   3 changed=   1 unreachable=   0 failed=   0
                 : ok=   3 changed=   1 unreachable=   0 failed=   0

In the first action of the playbook above, I use the template module to construct a file's content using variables. Ansible uses the Jinja2 templating language:

# by {{ name | upper }}
# for  {{ ansible_eth0.ipv4.address }}
{% if domain is defined -%}
domain {{ domain }}
{% endif -%}
{% for ns in resolvers -%}
nameserver {{ ns }}
{% endfor %}

The resulting resolv.conf I find on the node contains (actually this is on the node k4; on ubu10 the domain isn't set):

# by JP MENS
# for

One of the actions I defined in the playbook above was to fetch the nodes' inventory file and store it on the management system. Ansible fetches files into the directory I specify and stores them in a file tree organized by hostname. I find the file obtained for me under /tmp/ss/; it contains JSON and lists the known facts of the node. (If these aren't sufficient for you, you can install facter and/or ohai on the nodes to collect more data.)

{
    "ansible_architecture": "i386",
    "ansible_bios_date": "01/01/2007",
    "ansible_bios_version": "0.5.1",
    "ansible_distribution": "CentOS",
    "ansible_distribution_release": "Final",
    "ansible_distribution_version": "6.2",
    "ansible_eth0": {
        "ipv4": {
            "address": "",
            "netmask": "",
            "network": ""
        },
        "ipv6": [
            {
                "address": REDACTED,
                "prefix": "64",
                "scope": "global"
            },
            {
                "address": "fe80::5054:ff:fe13:cfb2",
                "prefix": "64",
                "scope": "link"
            }
        ],
        "macaddress": "52:54:00:13:cf:b2"
    },
    "ansible_form_factor": "Other",
    "ansible_fqdn": "",
    "ansible_hostname": "k4",
    "ansible_interfaces": [],
    "ansible_kernel": "2.6.32-220.el6.i686",
    "ansible_lo": {
        "ipv4": {
            "address": "",
            "netmask": "",
            "network": ""
        },
        "ipv6": [
            {
                "address": "::1",
                "prefix": "128",
                "scope": "host"
            }
        ],
        "macaddress": "00:00:00:00:00:00"
    },
    "ansible_machine": "i686",
    "ansible_memfree_mb": 258,
    "ansible_memtotal_mb": 487,
    "ansible_processor": [
        "QEMU Virtual CPU version (cpu64-rhel6)"
    ],
    "ansible_processor_cores": "NA",
    "ansible_processor_count": 1,
    "ansible_product_name": "KVM",
    "ansible_product_serial": "NA",
    "ansible_product_uuid": "1A5ED075-12B4-280A-41C3-3D2BA7E529F0",
    "ansible_product_version": "RHEL 6.2.0 PC",
    "ansible_python_version": "2.6.6",
    "ansible_selinux": false,
    "ansible_ssh_host_key_dsa_public": REDACTED,
    "ansible_ssh_host_key_rsa_public": REDACTED,
    "ansible_swapfree_mb": 991,
    "ansible_swaptotal_mb": 991,
    "ansible_system": "Linux",
    "ansible_system_vendor": "Red Hat",
    "ansible_virtualization_role": "guest",
    "ansible_virtualization_type": "kvm",
    "color": "red",
    "domain": "",
    "group_names": [],
    "inventory_hostname": "",
    "metadata": "/etc/ansible/setup",
    "name": "JP Mens",
    "resolvers": []
}

(Note how the variables we specified in our playbook above (name and resolvers) have been added to the facts, as has the domain variable we specified in the hosts inventory file.)
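
Because the fetched file is plain JSON, it is easy to post-process on the management system. The following is a minimal sketch, assuming an abbreviated facts document whose values are copied from the listing above; the summarize function and the SAMPLE_FACTS name are hypothetical, and a real script would read the fetched file from disk instead.

```python
import json

# Abbreviated sample of a fetched facts file; values copied from the listing above.
SAMPLE_FACTS = """
{
    "ansible_hostname": "k4",
    "ansible_distribution": "CentOS",
    "ansible_distribution_version": "6.2",
    "ansible_memtotal_mb": 487,
    "ansible_memfree_mb": 258,
    "name": "JP Mens"
}
"""


def summarize(facts):
    """Build a one-line summary from a dictionary of facts (hypothetical helper)."""
    used = facts["ansible_memtotal_mb"] - facts["ansible_memfree_mb"]
    return "%s: %s %s, %d MB of %d MB in use" % (
        facts["ansible_hostname"],
        facts["ansible_distribution"],
        facts["ansible_distribution_version"],
        used,
        facts["ansible_memtotal_mb"],
    )


if __name__ == "__main__":
    print(summarize(json.loads(SAMPLE_FACTS)))
```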

Ansible can run operations on change using a notification system to, say, restart a service when one of its configuration files has changed. It notifies so-called handlers which are basically lists of tasks to perform on change, and Ansible performs those tasks once, irrespective of how many notifications it processes for a particular handler.
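
The run-once behaviour of handlers can be modelled in a few lines. This is only a toy simulation of the semantics described above, not how Ansible implements them; run_play, the task names, and the handler names are all hypothetical, and running handlers in order of first notification is an assumption of this sketch.

```python
def run_play(tasks, handlers):
    """Run tasks in order, then run each notified handler exactly once."""
    notified = []  # handler names, in order of first notification (assumed ordering)
    for task in tasks:
        changed = task["run"]()
        if changed:
            for name in task.get("notify", []):
                if name not in notified:
                    notified.append(name)
    # Handlers fire after the tasks, once each, however often they were notified.
    return [handlers[name]() for name in notified]


if __name__ == "__main__":
    tasks = [
        {"name": "Deploy resolv.conf", "run": lambda: True, "notify": ["restart named"]},
        {"name": "Deploy named.conf", "run": lambda: True, "notify": ["restart named"]},
        {"name": "Deploy motd", "run": lambda: False, "notify": ["restart syslog"]},
    ]
    handlers = {
        "restart named": lambda: "named restarted",
        "restart syslog": lambda: "syslog restarted",
    }
    print(run_play(tasks, handlers))
```

Two changed tasks notify the same handler, yet it runs only once; the third task reports no change, so its handler never fires.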

When Ansible runs on the management station, it sends the required modules over to the nodes on the fly, and those are processed on the node. You can create your own modules, but there are some caveats:

  • Modules are copied over to the managed node, so binary executables must match the architecture of the target system. On the other hand, there are so many scripting languages to choose from, that a binary program will seldom be needed.
  • Scripts must be self-contained. Any external dependencies in the scripts have to be available on the target node. Say, for example, you write a module in Perl which uses a particular CPAN module, you'll have to install that on the nodes or your module won't work.
  • Ansible drops arguments you pass to your module via a playbook or ansible -m module -a "args" into a file called arguments in a temporary directory. Your module is then invoked with a single argument containing the path to that file, and it must then parse the arguments out of that file.
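
To make the last point concrete, here is a minimal sketch of a module. It assumes the arguments file holds whitespace-separated key=value pairs (as passed with -a) and that the module reports its result as JSON on standard output, like the output shown earlier. The greeting behaviour and the parse_arguments helper are hypothetical.

```python
import json
import shlex
import sys


def parse_arguments(text):
    """Split a 'key=value key2="quoted value"' string into a dictionary."""
    params = {}
    for token in shlex.split(text):
        if "=" in token:
            key, value = token.split("=", 1)
            params[key] = value
    return params


def main(argv):
    # The single command-line argument is the path to the arguments file.
    with open(argv[1]) as handle:
        params = parse_arguments(handle.read())
    # Hypothetical module behaviour: report a greeting and change nothing.
    result = {"changed": False, "greeting": "hello %s" % params.get("name", "world")}
    print(json.dumps(result))


# Only run when invoked with exactly one argument (the arguments file).
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv)
```

The sketch uses only the Python standard library, so it stays self-contained in the sense of the second caveat.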

I mentioned above that Ansible's hosts file can be a program, a bit like Puppet's external node classification. The hosts program must emit a JSON hash of node groups to be managed when invoked with --list and a hash of variables for a particular node when invoked with --host nodename. There's an example program on integrating Ansible with Cobbler.
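
A minimal sketch of such a hosts program follows. The host names and variable values are hypothetical placeholders, and the data is hard-coded for illustration; a real program would look it up in Cobbler, a CMDB, or a similar source.

```python
import json
import sys

# Hypothetical inventory data, hard-coded for illustration only.
GROUPS = {
    "dnsservers": ["ns1.example.org", "ns2.example.org"],
}
HOST_VARS = {
    "ns1.example.org": {"domain": "example.org"},
}


def inventory(args):
    """Return the structure to emit for the given command-line arguments."""
    if args == ["--list"]:
        return GROUPS
    if len(args) == 2 and args[0] == "--host":
        # Unknown hosts simply have no variables.
        return HOST_VARS.get(args[1], {})
    raise ValueError("usage: hosts --list | --host nodename")


if __name__ == "__main__" and len(sys.argv) > 1:
    print(json.dumps(inventory(sys.argv[1:])))
```

Invoked with --list it prints the hash of groups; invoked with --host and a node name it prints that node's variables.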

Some thoughts regarding Ansible, knowing (the little) I know of Puppet:

  • No special software needed on the nodes apart from Python. Yeah!
  • All hosts have access to variables defined on other hosts, and there's no central database required to do so (Puppet: storeconfigs). For example, if I want a database server to know the primary IP address of my node k4, it can access that in a template with
{{ hostvars[''].ansible_eth0.ipv4.address }}
  • Setups don't need to be run as root.
  • I love the ad-hoc possibilities that Ansible gives me:
    • is the date on my servers correct? ansible clusterA -m command -a /bin/date
    • oh, those servers need rebooting: ansible dnsservers -m command -a "/sbin/reboot -t now"

    As long as I don't use Ansible to install a package on a particular node and forget to do so on another... That kind of ad-hoc change can cause chaos in an environment which should be carefully managed.

  • Ansible's learning curve is much lower than Puppet's; I had deployed my first config file and installed a package in minutes.
  • Ansible's package management forces me to decide whether the remote system is apt-based or yum-based (and unfortunately doesn't yet support zypper or others). While package management on Puppet seems to be more transparent, Puppet doesn't hide differing package names ("http-server" vs "httpd" vs "apache2") from me either. In other words it doesn't really matter. Nodes in a cluster will typically be very similar, so I create an Ansible playbook per cluster using either the yum or apt module.
  • Ansible is push-based (although pull-based should work as well).

The Ansible project compares itself to other systems.


Ansible and configuration :: 06 Jun 2012 :: e-mail

