At $WORK the other day somebody asked me for a bit of assistance with a rather basic shell script: he was trying to sudo into a number of remote systems, do something on those machines, gather output of that something, and bring it all back to a central location for further processing. He got it going (I think), and I left it at that.

Back at my desk I thought it would be neat to have something like Func, but $WORK doesn't have that; they use Ansible. Then it hit me, and I slapped my face: Ansible does that too. This can easily be done with the API provided by Ansible which, among other things, allows me to control nodes, run Ansible modules, and obtain any output they produce. For example, I could

  • obtain a list of zones on my authoritative servers (irrespective of their brands) and compare the list for completeness
  • check and possibly clean up e-mail queues of misbehaving mail servers and report back how many messages were deleted on each / in total
  • find out if there are 'root' users logged on somewhere, etc., etc.

As a small example, let's go out and determine how much free disk space we have on our nodes. (We might have this information in, say, Nagios or Icinga, but it could be a mite difficult to get at the raw data.)

We could do something like this, grab the output and massage that, but it'll get messy at some point.

$ ssh a1.ww.mens.de df -P
Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/mapper/vg_a1-lv_root   8813300   1427644   6937964      18% /
tmpfs                   251360         0    251360       0% /dev/shm
/dev/vda1               495844     32566    437678       7% /boot

Obtaining the information in a more structured way seems useful, so here is my-df, a small Python program which lists mount points and available space in kilobytes. In a few moments I'll be calling this program, unmodified, a "module". :-)

#!/usr/bin/env python

import subprocess
import json

space = []

df = subprocess.Popen(["df", "-P", "-k"], stdout=subprocess.PIPE)
output = df.communicate()[0]

for line in output.split("\n")[1:]:
    if len(line):
        try:
            device, size, used, available, percent, mountpoint = line.split()
            space.append(dict(mountpoint=mountpoint, available=available))
        except ValueError:
            # skip lines which don't have exactly six fields
            pass

print json.dumps(dict(space=space), indent=4)

When I run that on the same machine, I see the following output:

{
    "space": [
        {
            "available": "6937928", 
            "mountpoint": "/"
        }, 
        {
            "available": "251360", 
            "mountpoint": "/dev/shm"
        }, 
        {
            "available": "437678", 
            "mountpoint": "/boot"
        }
    ]
}
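One weakness of the parsing above is that a mount point containing a space yields more than six fields, so that line is silently skipped. The following is a minimal sketch, not part of my-df itself, of a variant that keeps the parsing in a function so it can be tried without running df. It assumes Python 3 (the listings in this post are Python 2), and the name parse_df and the SAMPLE text are made up for illustration; the sample is simply the df output shown earlier.

```python
import json


def parse_df(text):
    """Parse `df -P -k` style output into a list of mount point records."""
    space = []
    for line in text.splitlines()[1:]:          # skip the header line
        # Split into at most six fields so that a mount point
        # containing spaces stays in one piece.
        fields = line.strip().split(None, 5)
        if len(fields) != 6:
            continue                            # blank or malformed line
        device, size, used, available, percent, mountpoint = fields
        space.append({"mountpoint": mountpoint, "available": available})
    return space


# Sample input: the df output from earlier in this post.
SAMPLE = """\
Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/mapper/vg_a1-lv_root   8813300   1427644   6937964      18% /
tmpfs                   251360         0    251360       0% /dev/shm
/dev/vda1               495844     32566    437678       7% /boot
"""

print(json.dumps({"space": parse_df(SAMPLE)}, indent=4))
```

The JSON shape is the same as before, so a program consuming it would not need to change.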

Let me now apply a bit of Ansible power to obtain results from a number of systems in my "data center" (cough).

disks.py is a Python program which uses Ansible's Runner class, the low-level machinery which is used by /usr/bin/ansible and /usr/bin/ansible-playbook. Documentation for Runner is in the source, which you'll find in the file lib/ansible/runner/__init__.py.

#!/usr/bin/env python

import sys
import json
import ansible.runner

# specify hosts from inventory we should contact
hostlist = [ 'sushi.mens.de', 'devservers', '127.0.0.1' ]

def gigs(kibs):
    return float(kibs) / 1024.0 / 1024.0

runner = ansible.runner.Runner(
    module_name='my-df',
    module_args='',
    remote_user='jpm',
    sudo=False,
    pattern=':'.join(hostlist),
)

# Ansible now pops off to do its thing via SSH
response = runner.run()

# We're back.
# Check for failed hosts

if 'dark' in response:
    if len(response['dark']) > 0:
        print "Contact failures:"
        for host, reason in response['dark'].iteritems():
            print "  %s (%s)" % (host, reason['msg'])


total = 0.0
for host, res in response['contacted'].iteritems():
    print host
    for fs in res['space']:
        gb = gigs(fs['available'])
        total += gb
        print "  %-30s %10.2f" % (fs['mountpoint'], gb)

print "Total space over %d hosts: %.2f GB" % (len(response['contacted']), total)

When this program runs, Ansible's Runner copies the specified module my-df from the management machine to the nodes we query, executes it there, and reports the results back.

[Figure: Ansible Runner at work]

my-df lives in Ansible's library, which can mean either in the default library path or simply in a directory called library/ in my program's directory.

Looking at the response from the Runner we see the following data structure (truncated for brevity).

{
    "dark": {
        "sushi.mens.de": {
            "msg": "FAILED: Authentication failed.", 
            "failed": true
        }
    }, 
    "contacted": {
        "a1.ww.mens.de": {
            "invocation": {
                "module_name": "my-df", 
                "module_args": ""
            }, 
            "space": [
                {
                    "available": "6937824", 
                    "mountpoint": "/"
                }, 
                {
                    "available": "251360", 
                    "mountpoint": "/dev/shm"
                }, 
                {
                    "available": "437678", 
                    "mountpoint": "/boot"
                }
            ]
        }
    }
}
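Because the response is an ordinary dictionary, the processing can be tried out without contacting any hosts. Here is a rough sketch, assuming Python 3 and a response shaped like the one above; the function name summarize is hypothetical, and the sample dictionary is copied from the truncated structure shown here (without the invocation block).

```python
def gigs(kibs):
    """Convert a KiB count (string or number) to GiB."""
    return float(kibs) / 1024.0 / 1024.0


def summarize(response):
    """Return (failures, per_host, total_gb) for a Runner-style response."""
    # Hosts which could not be contacted, with the reason given.
    failures = {host: info.get("msg", "")
                for host, info in response.get("dark", {}).items()}
    per_host = {}
    total = 0.0
    for host, res in response.get("contacted", {}).items():
        # Map each mount point to its available space in GiB.
        per_host[host] = {fs["mountpoint"]: gigs(fs["available"])
                          for fs in res.get("space", [])}
        total += sum(per_host[host].values())
    return failures, per_host, total


# Sample data in the shape shown above.
response = {
    "dark": {
        "sushi.mens.de": {"msg": "FAILED: Authentication failed.", "failed": True},
    },
    "contacted": {
        "a1.ww.mens.de": {
            "space": [
                {"available": "6937824", "mountpoint": "/"},
                {"available": "251360", "mountpoint": "/dev/shm"},
                {"available": "437678", "mountpoint": "/boot"},
            ],
        },
    },
}

failures, per_host, total = summarize(response)
for host, reason in failures.items():
    print("failed: %s (%s)" % (host, reason))
for host, mounts in per_host.items():
    print(host)
    for mountpoint, gb in mounts.items():
        print("  %-30s %10.2f" % (mountpoint, gb))
print("Total: %.2f GB" % total)
```

Using .get() with defaults means a response without a "dark" key, or a host whose module returned no "space" list, is treated as empty rather than raising a KeyError.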

The rest of the program just "does something" with the data. In this case, I print it to stdout:

Contact failures:
  sushi.mens.de (FAILED: Authentication failed.)
k4.ww.mens.de
  /                                    2.19
  /dev/shm                             0.24
  /boot                                0.42
a1.ww.mens.de
  /                                    6.62
  /dev/shm                             0.24
  /boot                                0.42
127.0.0.1
  /                                    2.00
  /Volumes/lacie1timemach             28.52
  [...]
  /net/nv.ww.mens.de/c/home/jpm     1079.47
Total space over 3 hosts: 4199.85 GB

I can use any Ansible module with the Runner. If, for example, I use the command module to obtain a list of files in /tmp/, the data structure returned by Runner contains stdout (and stderr) as a single string, which I'd have to split into lines at the newlines myself.

{
    "dark": {}, 
    "contacted": {
        "a1.ww.mens.de": {
            "changed": true, 
            "end": "2012-12-13 10:51:34.170066", 
            "stdout": "aaa\nfl\ntest1\ntmux-0\ntmux-501\ntmux-902", 
            "cmd": [
                "ls", 
                "/tmp"
            ], 
            "rc": 0, 
            "start": "2012-12-13 10:51:34.165512", 
            "stderr": "", 
            "delta": "0:00:00.004554", 
            "invocation": {
                "module_name": "command", 
                "module_args": "ls /tmp"
            }
        }
    }
}

This is the reason I produced JSON in my-df -- it makes things a bit easier later on.
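For modules which return only text, the splitting mentioned above is not much work. As a small sketch, assuming Python 3 and a response shaped like the command module result shown here (the helper name stdout_lines is made up, and the sample keeps only a few of the keys):

```python
def stdout_lines(response):
    """Map each contacted host to the list of lines in its 'stdout' field."""
    return {host: res.get("stdout", "").splitlines()
            for host, res in response.get("contacted", {}).items()}


# Sample data: a cut-down copy of the command module response above.
response = {
    "dark": {},
    "contacted": {
        "a1.ww.mens.de": {
            "rc": 0,
            "stdout": "aaa\nfl\ntest1\ntmux-0\ntmux-501\ntmux-902",
            "stderr": "",
        },
    },
}

files = stdout_lines(response)
for host, names in files.items():
    print("%s: %d entries" % (host, len(names)))
```

splitlines() is used rather than split("\n") so that a trailing newline, if there is one, does not produce an empty last element.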

With Ansible in my environment, I can quite easily use its functionality from my own programs.

Ansible and SSH :: 13 Dec 2012 :: e-mail
