When I heard that the CentOS Project was going to publish official CentOS images on Amazon EC2 (official CentOS announcement), I thought the time was ripe to finally try things out there, and a cloudy weekend suited me perfectly for doing so. In terms of EC2 I’m a beginner, so bear with me if some of the terminology is incorrect: there’s a lot of it involved, so I started by reading this.

To start using EC2 you need an Amazon AWS account and a credit card. The good news is that they have a model in which you can create some smallish machines (called instances) free of charge. Check the fine print on the AWS site. (And by fine print I don’t mean it’s in a small font or hidden – their documentation is quite extensive :-)

What I’m going to discuss here is provisioning some CentOS images on EC2 instances with Ansible and, of course, doing that from a CentOS machine.

First off, you don’t need the AWS tools installed on your management system, and that saves you from having to install Java as well. We need the following components on our CentOS management system:

  • Ansible and its small list of dependencies. There’s help in getting started.
  • The Euca2ools, command-line utilities for interacting with Amazon’s EC2 and S3 services. I think of these as the answer to Amazon’s tools, but they’re written in Python. Furthermore, we’ll need this and its prerequisites for Ansible as well. Euca2ools are in EPEL so installation is easy.
  • A rather large collection of keys and authorization codes which you obtain from the AWS site. I won’t bore you with how to do that, but I will show you a list of variables which must be correctly set for things to work.

You’ve set up your AWS account, and you’ve obtained authorization secrets to interact with EC2. For everything we do from here onwards, we need the following variables in our shell’s environment:

export EC2_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxx"
export EC2_SECRET_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export EC2_URL=https://ec2.amazonaws.com
export S3_URL=https://s3.amazonaws.com:443
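
Since everything below depends on these four variables, a quick sanity check before calling any of the tools can save some head-scratching. This is only a sketch of mine: the function name check_ec2_env is made up, and it merely tests that the variables are non-empty, not that the keys are valid.

```shell
# Print the name of every required variable that is unset or empty.
# Returns 0 if all are set, 1 otherwise.
# (check_ec2_env is a hypothetical helper, not part of Euca2ools or Ansible.)
check_ec2_env() {
    ec2_env_missing=0
    for ec2_env_name in EC2_ACCESS_KEY EC2_SECRET_KEY EC2_URL S3_URL; do
        # Indirect lookup of the variable whose name is in $ec2_env_name.
        eval "ec2_env_value=\${$ec2_env_name:-}"
        if [ -z "$ec2_env_value" ]; then
            echo "$ec2_env_name"
            ec2_env_missing=1
        fi
    done
    return "$ec2_env_missing"
}
```

Running check_ec2_env in the shell in which the exports were done prints nothing when all four are set.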

If that environment is set up correctly, you should be able to use the euca-describe-images command to find out which images are available for creating instances. This program talks to EC2, so if that works, the rest should too. Note the ami-xxxx in the second column of the list which is returned: that’s the name we’ll use to choose which image we will instantiate.

In order to launch a new instance (i.e. a new machine) on EC2, we require an SSH key-pair. The private part we keep safely, and the public portion is injected into EC2. Upon creating an instance, EC2 automatically populates the root account of the machine we create with that public key so that we can log in.

$ euca-create-keypair jp1 > jp1.pem
$ euca-describe-keypairs
KEYPAIR jp1     02:69:34:2a:98:c2:f0:56:3f:3d:e1:6c:99:78:34:72:fc:75:70:83

We create as many different keypairs as we need, and note their names:

  1. EC2 needs to be told which key it should inject into the instance; we identify the key by this name.
  2. SSH needs to use the private key to connect to the instance, and we use the pem file for that.
  3. Ansible uses the SSH key to talk to the instance.

If you don’t have an SSH agent running yet, I recommend you start one now:

eval `ssh-agent`
chmod 400 jp1.pem
ssh-add jp1.pem

To make sure everything is working, let us manually set up an instance now from the command-line. I’ve chosen the AMI I want to use, and I select the SSH key I want EC2 to inject into the instance:

$ euca-run-instances -k jp1 ami-8a8932e3
RESERVATION	r-997ee9e0	410186602215	default
INSTANCE	i-df75cea0	ami-8a8932e3			pending	jp1	0		m1.small	2012-11-18T10:57:20.000Z	us-east-1a	aki-88aa75e1			monitoring-disabled					ebs									

Make note of the instance name (i-df75cea0): this is the handle into that machine, and we need that name to reboot or destroy it.

After a few moments, I use euca-describe-instances to check that the machine is actually booting; its output also shows the public hostname, which I use to SSH into the instance as the root user. I can also see (but not interact with) its console, using euca-get-console-output.

$ euca-describe-instances i-df75cea0
RESERVATION	r-997ee9e0	410186602215	default
INSTANCE	i-df75cea0	ami-8a8932e3	ec2-54-242-141-105.compute-1.amazonaws.com	ip-10-212-238-128.ec2.internal	running	jp1	0		m1.small	2012-11-18T10:57:20.000Z	us-east-1a	aki-88aa75e1			monitoring-disabled			ebs		

$ euca-get-console-output i-df75cea0 | tail -20
dracut: Mounted root filesystem /dev/xvde
dracut: Loading SELinux policy
type=1404 audit(1353239590.517:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
type=1403 audit(1353239591.119:3): policy loaded auid=4294967295 ses=4294967295
dracut: Switching root
udev: starting version 147
Initialising Xen virtual ethernet driver.
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.22.6-ioctl (2011-10-19) initialised: dm-devel@redhat.com
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
ip6_tables: (C) 2000-2006 Netfilter Core Team
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ip_tables: (C) 2000-2006 Netfilter Core Team

CentOS release 6.3 (Final)
Kernel 2.6.32-279.11.1.el6.x86_64 on an x86_64

ip-10-212-238-128 login:
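
The euca-describe-instances listing is tab-separated, so pulling out the bits I care about is a job for awk. The following is a rough sketch which assumes the column order seen in the listings above (instance ID in column 2, public hostname in column 4, state in column 6); instance_summary is a name I made up for illustration.

```shell
# Read euca-describe-instances output on stdin and print
# "<instance-id> <state> <public-hostname>" for every INSTANCE line.
# A "-" is printed while the instance has no public hostname yet.
instance_summary() {
    awk -F '\t' '$1 == "INSTANCE" { print $2, $6, ($4 == "" ? "-" : $4) }'
}
```

I would use it as: euca-describe-instances | instance_summary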

Ansible inventory

Ansible uses an inventory in which I describe the machines I want it to speak to, the groups they belong to and specific variables I want those machines to use. The inventory file defaults to /etc/ansible/hosts, but I can override that by setting $ANSIBLE_HOSTS to a different path. An inventory file can be as short as this, and believe it or not, this is actually the inventory we’re going to give to Ansible in order to launch EC2 instances:


On the other hand we need to provision EC2 instances running on the other side of the world (for me at least). How do we do that? How do we know their hostnames?

The Ansible ec2.py inventory script enumerates the EC2 instances we can access. If I launch this program, I see the following JSON output because I already have an instance running. (Compare the instance ID and the public hostname to what we saw earlier.)

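{
  "i-df75cea0": [
    "ec2-54-242-141-105.compute-1.amazonaws.com"
  ],
  "key_jp1": [
    "ec2-54-242-141-105.compute-1.amazonaws.com"
  ],
  "security_group_default": [
    "ec2-54-242-141-105.compute-1.amazonaws.com"
  ],
  "type_m1_small": [
    "ec2-54-242-141-105.compute-1.amazonaws.com"
  ],
  "us-east-1": [
    "ec2-54-242-141-105.compute-1.amazonaws.com"
  ],
  "us-east-1a": [
    "ec2-54-242-141-105.compute-1.amazonaws.com"
  ]
}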

Even though we have a single machine only, it shows up in different groups. These groups will allow us to target specific groups of instances when we use Ansible to provision them. (Note: the ec2.py program caches its output in configurable paths, so it may take a minute until the list is refreshed.) If I invoke the inventory program with a specific host, I get a list of variables particular to that instance: (I’m omitting lots of output for brevity)

$ ec2.py --host ec2-54-242-141-105.compute-1.amazonaws.com
{
  "ec2_architecture": "x86_64",
  "ec2_dns_name": "ec2-54-242-141-105.compute-1.amazonaws.com",
  "ec2_hypervisor": "xen",
  "ec2_id": "i-df75cea0",
  "ec2_image_id": "ami-8a8932e3",
  "ec2_instance_type": "m1.small",
  "ec2_ip_address": "",
  "ec2_key_name": "jp1",
  "ec2_launch_time": "2012-11-18T10:57:20.000Z",
  "ec2_monitored": false,
  "ec2_placement": "us-east-1a",
  "ec2_root_device_type": "ebs",
  "ec2_security_group_ids": "sg-29652b41",
  "ec2_security_group_names": "default",
  "ec2_virtualization_type": "paravirtual"
}

Using the ec2.py inventory script will allow Ansible to interact with instances on EC2. This happens either by installing the file as an executable /etc/ansible/hosts or by pointing $ANSIBLE_HOSTS to that executable.
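
The contract between Ansible and an executable inventory is small: called with --list, the program prints a JSON object mapping group names to lists of hosts, and called with --host and a hostname, it prints a JSON object with that host’s variables. To illustrate just that contract, here is a toy stand-in with hard-coded values copied from the output above. It is emphatically not how ec2.py works internally (that one queries EC2), and toy_inventory is a hypothetical name.

```shell
# Toy inventory with hard-coded data; it only illustrates the two calls
# Ansible makes to an executable inventory (--list and --host <hostname>).
toy_inventory() {
    toy_host="ec2-54-242-141-105.compute-1.amazonaws.com"
    case "${1:-}" in
        --list)
            # Group name -> list of hosts in that group.
            printf '{"key_jp1": ["%s"], "type_m1_small": ["%s"], "us-east-1a": ["%s"]}\n' \
                "$toy_host" "$toy_host" "$toy_host"
            ;;
        --host)
            # Variables for the host named in $2; empty object if unknown.
            if [ "${2:-}" = "$toy_host" ]; then
                printf '{"ec2_id": "i-df75cea0", "ec2_key_name": "jp1"}\n'
            else
                printf '{}\n'
            fi
            ;;
        *)
            echo "usage: toy_inventory --list | --host <hostname>" >&2
            return 1
            ;;
    esac
}
```

To use something like this for real, the body would have to live in a standalone executable file, because Ansible runs the inventory as a program.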

Will it “ping”?

$ export ANSIBLE_HOSTS=~/ec2.py
$ ansible -u root ec2-54-242-141-105.compute-1.amazonaws.com -m ping
ec2-54-242-141-105.compute-1.amazonaws.com | success >> {
    "changed": false, 
    "ping": "pong"
}

As this instance is costing money (not much but do check the AWS pricelist), I’ll terminate (kill, destroy, zap) the instance:

$ euca-terminate-instances i-df75cea0
INSTANCE	i-df75cea0

At this point I could leave you to it, and you could successfully use Ansible to install and configure your EC2 instances. But I won’t leave you to it: let’s do a bit of provisioning.


Ansible’s ec2 module creates EC2 instances. As Fabian Arrotin found out, it’s really easy to use.

However, I’ll admit that my poor, tired brain had some trouble in coupling two completely distinct and seemingly unrelated operations:

  1. The first is instance creation. Fine and dandy, but there is very little that associates the information returned on instance creation with a hostname we later require to access that host.
  2. For managing the host with Ansible we need a hostname (or IP address) but neither is obtainable at the time of instance creation. (Compare the output of euca-run-instances and euca-describe-instances above.)

Furthermore, EC2 hostnames and addresses are volatile: if I stop the instance and restart it, it gets a different IP address and hostname associated with it, meaning I’d “lose” touch. The problem I was facing was: how do I create a self-defined DNS hostname which resolves to a particular instance? (AWS does offer so-called Elastic IP addresses, but I wanted to try and solve the problem without using those.)

If that isn’t an issue for you, ignore the following bits which discuss DNS.

I learned that during instance creation on EC2, I can inject user-defined data into the instance. What I’ll do is ask the operator to enter a hostname (shortname) and inject that into user-data. Once the machine has booted and that data is available, I’ll obtain said hostname and create a DNS address record associating that name with the machine’s IP address.

(I see other people use user-data for all sorts of things, including automation of instances from the user-data, but we don’t need that using Ansible. :)

Ansible launches a CentOS instance on EC2

Ansible and the Cloud

Ansible’s ec2 module creates an instance on EC2 and optionally waits for that instance to become ready. (Note: ready doesn’t mean booted – that can take a few minutes.) Upon creating an instance, I specify the SSH keypair I want to use (we created a key called jp1 for that), the image name, and a few other parameters which are described in the module’s documentation. One parameter I want to point out is called group. This is a so-called security group which, as far as I’ve been able to determine, specifies e.g. firewall rules from the AWS point of view. By default only port 22 (SSH) is allowed into my instance, but I’m creating Web servers so I also want (at least) port 80.

To create an EC2 security group, I used the following commands:

$ euca-add-group -d "Web Servers" webs-a
$ euca-authorize -P tcp -p 80-80 -s webs-a
$ euca-authorize -P tcp -p 22-22 -s webs-a

The specified security group later on shows up as a group in the Ansible inventory (ec2.py).

- hosts:
  connection: local
  gather_facts: False
  vars:
    keypair: jp1
    instance_type: m1.small
    security_group: webs-a
    image: ami-8a8932e3
    mail_from: Ansible
    mail_to: charlie
  vars_prompt:
    shortname: "What is the shortname of this host to be?"
  tasks:
  - name: Launch new EC2 instance
    local_action: ec2
        user_data='{"shortname":"{{shortname}}","admin":"Jane Jolie", "hostname":"{{shortname}}.{{dnsdomain}}","mail":"{{mail_from}}"}'
    register: ec2
  - name: Send e-mail to admins
    local_action: mail
        subject="EC2 instance ({{shortname}}) {{ec2.instances[0].id}}"
        body="EC2 instance {{ec2.instances[0].id}} created on {{ec2.instances[0].public_ip}}"

Note how the ec2 module is given user_data: in this case, I simply push a JSON blob into it, but it could just as well be something else. I believe it must be less than 16KB. (The user_data parameter to the ec2 module is brand new in Ansible: you’re welcome. :-)
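
To make the shape of that blob a bit more tangible outside of a playbook, here is a small sketch which builds a subset of it (just shortname and hostname) and checks it against the size limit I mentioned. Both function names are made up, and the naive printf assumes the values contain no quotes or backslashes, since nothing is escaped.

```shell
# Build a minimal user-data JSON blob from a short name and a DNS domain.
# Assumes neither argument contains characters that need JSON escaping.
make_user_data() {
    printf '{"shortname":"%s","hostname":"%s.%s"}\n' "$1" "$1" "$2"
}

# Succeed if the given string is smaller than 16 KB (16384 bytes).
user_data_fits() {
    user_data_size=$(printf '%s' "$1" | wc -c | tr -d ' ')
    [ "$user_data_size" -lt 16384 ]
}
```

For example, make_user_data web31 ec.jpmens.org prints a blob in which hostname is web31.ec.jpmens.org.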

The variable ec2, obtained by registering the result of the ec2 module, contains the following values, which I use in the e-mail I fire off to the admins:

{
   "instances" : [
      {
         "public_ip" : "",
         "id" : "i-3d1ba042"
      }
   ],
   "changed" : true
}

Let me run Ansible on this playbook. Note that I use the simple inventory containing just localhost, because these modules run on my Ansible management machine and not remotely.

$ ansible-playbook newinstances.yml 
What is the shortname of this host to be?: : web31

PLAY [] ********************* 

TASK: [Launch new EC2 instance] ********************* 
changed: []

TASK: [Send e-mail to admins] ********************* 
ok: []

PLAY RECAP *********************                      : ok=2    changed=1    unreachable=0    failed=0    

To recapitulate: so far Ansible used modules locally (i.e. on our management machine) to remotely create a CentOS instance on EC2 and to send the e-mail. I could now use some of the euca- tools to look and see what is happening.

I’ll also reiterate that during the creation of the instance our user-data was injected into that machine, so it will be available to us as soon as we connect to it.

Also, have a bit of patience: it can take a few minutes for the EC2 instance to actually come alive.
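
Instead of staring at the clock, the waiting can be left to a small retry loop. This is a generic sketch rather than anything EC2-specific; wait_until is a hypothetical helper which runs a command until it succeeds or the attempts run out.

```shell
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
# Runs the command up to <attempts> times, sleeping <delay-seconds>
# between tries. Returns 0 as soon as the command succeeds, else 1.
wait_until() {
    wait_attempts=$1
    wait_delay=$2
    shift 2
    wait_try=1
    while [ "$wait_try" -le "$wait_attempts" ]; do
        if "$@"; then
            return 0
        fi
        wait_try=$((wait_try + 1))
        if [ "$wait_try" -le "$wait_attempts" ]; then
            sleep "$wait_delay"
        fi
    done
    return 1
}
```

Something along the lines of wait_until 30 10 ssh -o BatchMode=yes -o ConnectTimeout=5 root@HOSTNAME true would then poll the instance every ten seconds for up to five minutes, with HOSTNAME being the public hostname reported by euca-describe-instances.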

Oh, I have mail:

Date: Sun, 18 Nov 2012 12:51:07 +0100 (CET)
From: Ansible@c6.ww.mens.de
Subject: EC2 instance (web31) i-3d1ba042

EC2 instance i-3d1ba042 created on

As an aside, recall from above that we have the EC2 access codes in shell environment variables. The ec2 module allows specifying these as parameters so we could also use Ansible group_vars or host_vars to set these, instead of relying on Ansible’s run-time environment.

Ansible provisions CentOS instances

After a couple minutes of patience I use Ansible to actually provision the instance I just brought up. To illustrate, I’ll just install an Apache Web server and a template, so nothing special, at least not in the first part of the playbook:

- hosts:
  - security_group_webs-a
  user: root
  connection: paramiko
  gather_facts: false
  tasks:
  - name: Install | Apache
    action: yum pkg=httpd state=installed
  - name: Machine | Launch Apache service
    action: service name=httpd state=started enabled=true
  - name: Machine | Disable firewall (fixme)
    action: service name=iptables state=stopped enabled=false
  - name: Machine | Obtain user_data from EC2
    action: userdata
    register: ud
  - name: Web | Install Web templates
    action: template src=templates/index.j2 dest=/var/www/html/index.html
  - name: Local | DDNS update
    local_action: dnsupdate 
      txt="{{ec2_id}} {{ec2_public_dns_name}}"

The hosts Ansible should act upon are specified by a group name as obtained by ec2.py, and I’m connecting as root because that’s the account into which EC2 injected our jp1 SSH key. After the first portion of the playbook has run, I could use a Web browser to connect to the ec2-*.compute-1.amazonaws.com hostname or to its public IP address.

I wanted to somehow be able to pre-determine the DNS name by which an instance is reachable. As mentioned earlier, that isn’t as easy as it sounds because EC2 hostnames and addresses are volatile.

Fiddling with DNS

Recall we had injected user_data into the instance. If, at this point, I logged onto the instance, I could obtain that data from a specific URL (available to every instance, but only from within the instance itself).

$ curl -s http://instance-data.ec2.internal/latest/user-data
{"shortname":"web31","admin":"Jane Jolie", "hostname":"web31.ec.jpmens.org","mail":"Ansible"}
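
For a quick look at a single value from the shell, a crude sed filter is enough. Be warned that this is a sketch which only copes with flat JSON whose values are plain strings, as in the blob above; a real JSON parser is the better tool for anything more serious. The name json_field is hypothetical, and the field name must not contain regular expression metacharacters.

```shell
# Usage: json_field <name>
# Reads flat JSON on stdin and prints the string value of field <name>.
# Crude: no nesting, no escaped quotes, string values only.
json_field() {
    sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}
```

On the instance, curl -s http://instance-data.ec2.internal/latest/user-data | json_field hostname should then print the hostname I smuggled in.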

Because I was able to “smuggle” that hostname into the user-data, I can now have Ansible retrieve it via a module for further processing. I’ve created a custom module called userdata to read that data when on the instance. Its output is registered in the playbook as variable ud, and I use bits of that information in the template (for index.html) as well as to update the DNS with our hostname pointing to the instance’s public IP address. Ansible will do that for us, and it’ll use a brand-new dnsupdate Ansible module I wrote to fire off a dynamic DNS update.
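
I’m not showing the innards of the dnsupdate module here, but to give an idea of what a dynamic DNS update boils down to, this is roughly what the same request looks like when expressed as input for BIND’s nsupdate utility. Treat it as a sketch under assumptions: ddns_commands is a made-up helper, the TTL of 60 seconds is an arbitrary default, and the address in the usage note is a documentation address, not a real instance.

```shell
# Usage: ddns_commands <fqdn> <ip-address> [ttl]
# Prints nsupdate commands which replace the A record(s) of <fqdn>
# with a single record pointing at <ip-address>.
ddns_commands() {
    ddns_fqdn=$1
    ddns_ip=$2
    ddns_ttl=${3:-60}
    printf 'update delete %s. A\n' "$ddns_fqdn"
    printf 'update add %s. %s A %s\n' "$ddns_fqdn" "$ddns_ttl" "$ddns_ip"
    printf 'send\n'
}
```

Piping the output of ddns_commands web31.ec.jpmens.org 192.0.2.10 into nsupdate -k KEYFILE (where KEYFILE stands for whatever key your DNS server accepts for updates) would send the update.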

So let’s make this happen!

The instance is still waiting for us to do something with it. We tell Ansible to use a different inventory this time, i.e. the EC2 inventory, and launch the configuration playbook:

ANSIBLE_HOSTS=~/ec2.py ansible-playbook apache.yml 

PLAY [security_group_webs-a] ********************* 

TASK: [Install | Apache] ********************* 
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]

TASK: [Machine | Launch Apache service] ********************* 
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]

TASK: [Machine | Disable firewall (fixme)] ********************* 
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]

TASK: [Machine | Obtain user_data from EC2] ********************* 
ok: [ec2-107-22-159-172.compute-1.amazonaws.com]

TASK: [Web | Install Web templates] ********************* 
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]

TASK: [Local | DDNS update] ********************* 
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]

PLAY RECAP ********************* 
ec2-107-22-159-172.compute-1.amazonaws.com : ok=6    changed=5    unreachable=0    failed=0    

Provisioning is complete, services are running, and the dynamic DNS update has been performed. I can immediately connect to our new host via the name I chose upon first launching the instance!

Our new EC2 host


CentOS on EC2 is cool, and so is using Ansible to provision CentOS instances. There are many ways to accomplish this I suppose, and I chose my way.

One final note: pop over to Seth Vidal’s site and read on how he uses Ansible on cloud instances. I particularly recommend that because he knows a lot more about all this and has more experience with it than I do. He takes a different approach by creating a module which injects a host into Ansible’s in-memory inventory; that module is now in Ansible core. I could have copied and pasted and been done with it, but I wanted to do this differently.

CentOS, Ansible, and EC2 :: 21 Nov 2012 :: e-mail