When I heard that the CentOS Project was going to publish official CentOS images on Amazon EC2 (official CentOS announcement) I thought the time was ripe to finally try stuff there, and a cloudy weekend suited me perfectly for doing so. In terms of EC2 I’m a beginner, so bear with me if some of the terminology is incorrect: there’s a lot of terminology involved, so I started by reading this.
To start using EC2 you need an Amazon AWS account and a credit card. The good news is that they have a model in which you can create some smallish machines (called instances) free of charge. Check the fine print on the AWS site. (And by fine print I don’t mean it’s in a small font or hidden – their documentation is quite extensive :-)
What I’m going to discuss here is provisioning some CentOS images on EC2 instances with Ansible, and well, do that from a CentOS machine of course.
First off, you don’t need the AWS tools installed on your management system, and that saves you from having to install Java as well. We need the following components on our CentOS management system:
- Ansible and its small list of dependencies. There’s help in getting started.
- The Euca2ools, command-line utilities for interacting with Amazon’s EC2 and S3 services. I think of these as the answer to Amazon’s tools, but they’re written in Python. Furthermore, we’ll need this and its prerequisites for Ansible as well. Euca2ools are in EPEL so installation is easy.
- A rather large collection of keys and authorization codes which you obtain from the AWS site. I won’t bore you with how to do that, but I will show you a list of variables which must be correctly set for things to work.
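For reference, pulling that toolchain onto a CentOS 6 management machine looks roughly like this. This is a sketch: the epel-release URL and package names are the ones current at the time of writing, so adjust as needed.

```shell
# Sketch, assuming a CentOS 6 box: enable EPEL, then install euca2ools
# and python-boto (the Python EC2 library Ansible's ec2 module needs).
# The epel-release URL below is the one current at the time of writing.
rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum -y install euca2ools python-boto
# Ansible itself: follow the installation instructions in its documentation.
```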
You’ve set up your AWS account, and you’ve obtained authorization secrets to interact with EC2. For everything we do from here onwards, we need the following variables in our shell’s environment:
export EC2_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxx"
export EC2_SECRET_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export EC2_URL=https://ec2.amazonaws.com
export S3_URL=https://s3.amazonaws.com:443
export AWS_ACCESS_KEY_ID=${EC2_ACCESS_KEY}
export AWS_SECRET_ACCESS_KEY=${EC2_SECRET_KEY}
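A quick sanity check that nothing was forgotten doesn't hurt; this little loop (bash-specific variable indirection) just reports which of the variables, if any, are still empty:

```shell
# Report any required variable that is unset or empty (bash indirection).
missing=0
for v in EC2_ACCESS_KEY EC2_SECRET_KEY EC2_URL S3_URL \
         AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    if [ -z "${!v}" ]; then
        echo "$v is not set"
        missing=1
    fi
done
if [ "$missing" -eq 0 ]; then
    echo "environment looks complete"
fi
```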
If that environment is set up correctly, you should be able to use the euca-describe-images command to find out which images are available for creating instances. This program talks to EC2, so if it works, the rest should too. Note the ami-xxxx identifier in the second column of the list which is returned: that's the name we'll use to choose which image we will instantiate.
In order to launch a new instance (i.e. a new machine) on EC2, we require an SSH key-pair. The private part we keep safely, and the public portion is injected into EC2. Upon creating an instance, EC2 automatically populates the root account of the machine we create with that public key so that we can log in.
$ euca-create-keypair jp1 > jp1.pem
$ euca-describe-keypairs
KEYPAIR jp1 02:69:34:2a:98:c2:f0:56:3f:3d:e1:6c:99:78:34:72:fc:75:70:83
We create as many different keypairs as we need, and note their names:
- EC2 needs to be instructed which key it should inject into the instance. It does so with this name.
- SSH needs the private key to connect to the instance; we use the pem file for that.
- Ansible uses the SSH key to talk to the instance.
If you don’t have an SSH agent running yet, I recommend you do so now:
eval `ssh-agent`
chmod 400 jp1.pem
ssh-add jp1.pem
To make sure everything is running, let us manually set up an instance now from the command-line. I’ve chosen the ami image I want to use, and I select the SSH key I want EC2 to inject into the instance:
$ euca-run-instances -k jp1 ami-8a8932e3
RESERVATION r-997ee9e0 410186602215 default
INSTANCE i-df75cea0 ami-8a8932e3 pending jp1 0 m1.small 2012-11-18T10:57:20.000Z us-east-1a aki-88aa75e1 monitoring-disabled ebs
Make note of the instance name (i-df75cea0): this is the handle into that machine, and we need that name to reboot or destroy it.
After a few moments, I use euca-describe-instances to check that the machine is actually booting. I can also see (but not interact with) its console, using euca-get-console-output, and I'll see a public hostname which I use to SSH into the instance as the root user.
$ euca-describe-instances i-df75cea0
RESERVATION r-997ee9e0 410186602215 default
INSTANCE i-df75cea0 ami-8a8932e3 ec2-54-242-141-105.compute-1.amazonaws.com ip-10-212-238-128.ec2.internal running jp1 0 m1.small 2012-11-18T10:57:20.000Z us-east-1a aki-88aa75e1 monitoring-disabled 54.242.141.105 10.212.238.128 ebs
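For scripting, the public hostname sits in the fourth column of that INSTANCE line. A sketch, run against a canned (and abbreviated) copy of the line above; normally you'd pipe euca-describe-instances into the awk:

```shell
# Pick the public DNS name (4th field) out of euca-describe-instances output.
# Canned sample; the real invocation would be:
#   euca-describe-instances i-df75cea0 | awk '/^INSTANCE/ {print $4}'
cat <<'EOF' | awk '/^INSTANCE/ {print $4}'
INSTANCE i-df75cea0 ami-8a8932e3 ec2-54-242-141-105.compute-1.amazonaws.com ip-10-212-238-128.ec2.internal running jp1 0 m1.small
EOF
```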
$ euca-get-console-output i-df75cea0 | tail -20
dracut: Mounted root filesystem /dev/xvde
dracut: Loading SELinux policy
type=1404 audit(1353239590.517:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
type=1403 audit(1353239591.119:3): policy loaded auid=4294967295 ses=4294967295
dracut:
dracut: Switching root
udev: starting version 147
Initialising Xen virtual ethernet driver.
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.22.6-ioctl (2011-10-19) initialised: dm-devel@redhat.com
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
ip6_tables: (C) 2000-2006 Netfilter Core Team
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ip_tables: (C) 2000-2006 Netfilter Core Team
CentOS release 6.3 (Final)
Kernel 2.6.32-279.11.1.el6.x86_64 on an x86_64
ip-10-212-238-128 login:
Ansible inventory
Ansible uses an inventory in which I describe the machines I want it to speak to, the groups they belong to, and specific variables I want those machines to use. The inventory file defaults to /etc/ansible/hosts, but I can override that by setting $ANSIBLE_HOSTS to a different path. An inventory file can be as short as this, and believe it or not, this is actually the inventory we're going to give to Ansible in order to launch EC2 instances:
[local]
127.0.0.1
On the other hand we need to provision EC2 instances running on the other side of the world (for me at least). How do we do that? How do we know their hostnames?
The Ansible ec2.py inventory script enumerates the EC2 instances we can access. If I launch this program, I see the following JSON output because I already have an instance running. (Compare the instance ID and the public hostname to what we saw earlier.)
{
"i-df75cea0": [
"ec2-54-242-141-105.compute-1.amazonaws.com"
],
"key_jp1": [
"ec2-54-242-141-105.compute-1.amazonaws.com"
],
"security_group_default": [
"ec2-54-242-141-105.compute-1.amazonaws.com"
],
"type_m1_small": [
"ec2-54-242-141-105.compute-1.amazonaws.com"
],
"us-east-1": [
"ec2-54-242-141-105.compute-1.amazonaws.com"
],
"us-east-1a": [
"ec2-54-242-141-105.compute-1.amazonaws.com"
]
}
Even though we have a single machine only, it shows up in different groups. These groups will allow us to target specific groups of instances when we use Ansible to provision them. (Note: the ec2.py program caches its output in configurable paths, so it may take a minute until the list is refreshed.) If I invoke the inventory program with a specific host, I get a list of variables particular to that instance (I'm omitting lots of output for brevity):
ec2.py --host ec2-54-242-141-105.compute-1.amazonaws.com
{
"ec2_architecture": "x86_64",
"ec2_dns_name": "ec2-54-242-141-105.compute-1.amazonaws.com",
"ec2_hypervisor": "xen",
"ec2_id": "i-df75cea0",
"ec2_image_id": "ami-8a8932e3",
"ec2_instance_type": "m1.small",
"ec2_ip_address": "54.242.141.105",
"ec2_key_name": "jp1",
"ec2_launch_time": "2012-11-18T10:57:20.000Z",
"ec2_monitored": false,
"ec2_placement": "us-east-1a",
"ec2_root_device_type": "ebs",
"ec2_security_group_ids": "sg-29652b41",
"ec2_security_group_names": "default",
"ec2_virtualization_type": "paravirtual"
}
Using the ec2.py inventory script allows Ansible to interact with instances on EC2. This happens either by installing the file as an executable /etc/ansible/hosts or by pointing $ANSIBLE_HOSTS at that executable.
Will it “ping”?
$ export ANSIBLE_HOSTS=~/ec2.py
$ ansible -u root ec2-54-242-141-105.compute-1.amazonaws.com -m ping
ec2-54-242-141-105.compute-1.amazonaws.com | success >> {
"changed": false,
"ping": "pong"
}
As this instance is costing money (not much, but do check the AWS price list), I'll terminate (kill, destroy, zap) the instance:
$ euca-terminate-instances i-df75cea0
INSTANCE i-df75cea0
At this point I could leave you to it, and you could successfully use Ansible to install and configure your EC2 instances. But I won’t leave you to it: let’s do a bit of provisioning.
EC-catch-22
Ansible’s ec2 module creates EC2 instances. As Fabian Arrotin found out, it’s really easy to use.
However, I’ll admit that my poor, tired brain had some trouble coupling two completely distinct and seemingly unrelated operations:
- The first is instance creation. Fine and dandy, but there is very little that associates the information returned on instance creation with a hostname we later require to access that host.
- The second: for managing the host with Ansible we need a hostname (or IP address), but neither is obtainable at the time of instance creation. (Compare the output of euca-run-instances and euca-describe-instances above.)
Furthermore, EC2 hostnames and addresses are volatile: if I stop the instance and restart it, it gets a different IP address and hostname associated with it, meaning I’d “lose” touch. The problem I was facing was: how do I create a self-defined DNS hostname which resolves to a particular instance? (AWS does offer so-called Elastic IP addresses, but I wanted to try and solve the problem without using those.)
If that isn’t an issue for you, ignore the following bits which discuss DNS.
I learned that during instance creation on EC2, I can inject user-defined data into the instance. What I’ll be doing is ask the operator to enter a hostname (shortname) and inject that into user-data. Once the machine has booted and that data is available, I’ll obtain said hostname and create a DNS address record associating that name with the machine’s IP address. (I see other people use user-data for all sorts of things, including automation of instances from the user-data, but using Ansible we don’t need that. :)
Ansible launches a CentOS instance on EC2
Ansible’s ec2 module creates an instance on EC2 and optionally waits for that instance to become ready. (Note: ready doesn’t mean booted – that can take a few minutes.) Upon creating an instance, I specify the SSH keypair I want to use (we created a key called jp1 for that), the image name, and a few other parameters which are described in the module’s documentation.
One parameter I want to point out is called group. This is a so-called security group which, as far as I’ve been able to determine, specifies e.g. firewall rules from the AWS point of view. By default only port 22 (SSH) is allowed into my instance, but I’m creating Web servers so I also want (at least) port 80.
To create an EC2 security group, I used the following commands:
$ euca-add-group -d "Web Servers" webs-a
$ euca-authorize -P tcp -p 80-80 -s 0.0.0.0/0 webs-a
$ euca-authorize -P tcp -p 22-22 -s 0.0.0.0/0 webs-a
The specified security group later shows up as a group in the Ansible inventory (ec2.py).
---
- hosts:
  - 127.0.0.1
  connection: local
  gather_facts: False

  vars:
    keypair: jp1
    instance_type: m1.small
    security_group: webs-a
    image: ami-8a8932e3
    dnsdomain: ec.jpmens.org
    mail_from: Ansible
    mail_to: charlie

  vars_prompt:
    shortname: "What is the shortname of this host to be?"

  tasks:
  - name: Launch new EC2 instance
    local_action: ec2
        keypair={{keypair}}
        group={{security_group}}
        instance_type={{instance_type}}
        image={{image}}
        wait=true
        user_data='{"shortname":"{{shortname}}","admin":"Jane Jolie", "hostname":"{{shortname}}.{{dnsdomain}}","mail":"{{mail_from}}"}'
    register: ec2
  #
  - name: Send e-mail to admins
    local_action: mail
        from={{mail_from}}
        to={{mail_to}}
        subject="EC2 instance ({{shortname}}) {{ec2.instances[0].id}}"
        body="EC2 instance {{ec2.instances[0].id}} created on {{ec2.instances[0].public_ip}}"
Note how the ec2 module is given user_data: in this case, I simply push a JSON blob into it, but it could just as well be something else. I believe it must be less than 16KB. (The user_data parameter to the ec2 module is brand new in Ansible: you’re welcome. :-)
The variable ec2, obtained by registering the result of the ec2 module, contains the following values, which I use in the e-mail I fire off to the admins:
{
"instances" : [
{
"public_ip" : "107.22.159.172",
"id" : "i-3d1ba042"
}
],
"changed" : true
}
Let me run Ansible on this playbook. Note that I use the simple inventory containing just localhost, because these modules run on my Ansible management machine and not remotely.
$ ansible-playbook newinstances.yml
What is the shortname of this host to be?: : web31
PLAY [127.0.0.1] *********************
TASK: [Launch new EC2 instance] *********************
changed: [127.0.0.1]
TASK: [Send e-mail to admins] *********************
ok: [127.0.0.1]
PLAY RECAP *********************
127.0.0.1 : ok=2 changed=1 unreachable=0 failed=0
To recapitulate: so far Ansible has used modules locally (i.e. on our management machine) to remotely create a CentOS instance on EC2 and to send the e-mail. I could now use some of the euca-* tools to look and see what is happening.
I’ll also reiterate that during the creation of the instance our user-data was injected into that machine, so it will be available to us as soon as we connect to it.
Also: have a bit of patience: it can take a few minutes for the EC2 instance to actually come alive.
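If you’d rather script that patience than exercise it, a small polling loop does the trick. This is a sketch: instance_state here is a hypothetical stand-in (so the example is self-contained) for the real euca-describe-instances pipeline shown in the comment.

```shell
# Poll until the instance reports "running", giving up after 10 tries.
instance_state() {
    # Stand-in for the real thing, which would be:
    #   euca-describe-instances "$1" | awk '/^INSTANCE/ {print $6}'
    echo "running"
}

wait_for_instance() {
    n=0
    while [ "$n" -lt 10 ]; do
        state=$(instance_state "$1")
        if [ "$state" = "running" ]; then
            return 0
        fi
        sleep 10
        n=$((n + 1))
    done
    return 1
}

wait_for_instance i-3d1ba042 && echo "instance is running"
```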
Oh, I have mail:
Date: Sun, 18 Nov 2012 12:51:07 +0100 (CET)
From: Ansible@c6.ww.mens.de
Subject: EC2 instance (web31) i-3d1ba042
EC2 instance i-3d1ba042 created on 107.22.159.172
As an aside, recall from above that we have the EC2 access codes in shell environment variables. The ec2 module allows specifying these as parameters, so we could also use Ansible group_vars or host_vars to set them, instead of relying on Ansible’s run-time environment.
Ansible provisions CentOS instances
After a couple of minutes of patience I use Ansible to actually provision the instance I just brought up. To illustrate, I’ll just install an Apache Web server and a template, so nothing special, at least not in the first part of the playbook:
---
- hosts:
  - security_group_webs-a
  user: root
  connection: paramiko
  gather_facts: false

  tasks:
  - name: Install | Apache
    action: yum pkg=httpd state=installed

  - name: Machine | Launch Apache service
    action: service name=httpd state=started enabled=true

  - name: Machine | Disable firewall (fixme)
    action: service name=iptables state=stopped enabled=false
  #
  - name: Machine | Obtain user_data from EC2
    action: userdata
    register: ud
  #
  - name: Web | Install Web templates
    action: template src=templates/index.j2 dest=/var/www/html/index.html
  #
  - name: Local | DDNS update
    local_action: dnsupdate
        keyname="my-tsig-ec2centos"
        secret="xxxxxxxxxxxxxxxxxxxxxx=="
        mname=192.168.12.2
        zone="ec.jpmens.org"
        domain={{ud.user_data.shortname}}
        a={{ec2_ip_address}}
        txt="{{ec2_id}} {{ec2_public_dns_name}}"
The hosts Ansible should act upon are specified by a group name as obtained by ec2.py, and I’m connecting as root because that’s the user into whose account EC2 injected our jp1 SSH key. After the first portion of the playbook has run, I could use a Web browser to connect to the ec2-*.compute-1.amazonaws.com hostname or to its public IP address.
I wanted to somehow be able to pre-determine the DNS name by which an instance is reachable. As mentioned earlier, that isn’t as easy as it sounds because EC2 hostnames and addresses are volatile.
Fiddling with DNS
Recall we had injected user_data into the instance. If, at this point, I logged onto the instance, I could obtain that data from a specific URL (available only from within the instance itself):
$ curl -s http://instance-data.ec2.internal/latest/user-data
{"shortname":"web31","admin":"Jane Jolie", "hostname":"web31.ec.jpmens.org","mail":"Ansible"}
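On the instance you can pull individual fields out of that blob without any extra tooling; a crude sed sketch, good enough for this simple, flat JSON (it is not a general JSON parser). The curl output above is reproduced as a literal here so the example stands on its own:

```shell
# Pull the "shortname" field out of the user-data JSON with sed.
# (Literal copy of the curl output above, for illustration.)
ud='{"shortname":"web31","admin":"Jane Jolie", "hostname":"web31.ec.jpmens.org","mail":"Ansible"}'
shortname=$(printf '%s' "$ud" | sed -n 's/.*"shortname":"\([^"]*\)".*/\1/p')
echo "$shortname"
```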
Because I was able to “smuggle” that hostname into the user-data, I can now have Ansible retrieve it via a module for further processing. I’ve created a custom module called userdata to read that data when on the instance. Its output is registered in the playbook as the variable ud, and I use bits of that information in the template (for index.html) as well as to update the DNS with our hostname pointing to the instance’s public IP address. Ansible will do that for us, using a brand-new dnsupdate module I wrote to fire off a dynamic DNS update.
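For the curious, what happens there is essentially what an nsupdate(1) session would do by hand; roughly like this, where the TSIG key name and secret are the placeholders from the playbook, and server, zone and record values are the ones from this example. Do not run this as-is; it must match your own DNS setup.

```shell
# Dynamic DNS update by hand: add an A record (and a TXT with the
# instance details) for web31, authenticated with a TSIG key.
# Key name/secret, server, zone and addresses are illustrative.
nsupdate -y 'my-tsig-ec2centos:xxxxxxxxxxxxxxxxxxxxxx==' <<'EOF'
server 192.168.12.2
zone ec.jpmens.org
update add web31.ec.jpmens.org. 60 A 107.22.159.172
update add web31.ec.jpmens.org. 60 TXT "i-3d1ba042 ec2-107-22-159-172.compute-1.amazonaws.com"
send
EOF
```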
So let’s make this happen!
The instance is still waiting for us to do something with it. We tell Ansible to use a different inventory this time, i.e. the EC2 inventory, and launch the configuration playbook:
ANSIBLE_HOSTS=~/ec2.py ansible-playbook apache.yml
PLAY [security_group_webs-a] *********************
TASK: [Install | Apache] *********************
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]
TASK: [Machine | Launch Apache service] *********************
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]
TASK: [Machine | Disable firewall (fixme)] *********************
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]
TASK: [Machine | Obtain user_data from EC2] *********************
ok: [ec2-107-22-159-172.compute-1.amazonaws.com]
TASK: [Web | Install Web templates] *********************
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]
TASK: [Local | DDNS update] *********************
changed: [ec2-107-22-159-172.compute-1.amazonaws.com]
PLAY RECAP *********************
ec2-107-22-159-172.compute-1.amazonaws.com : ok=6 changed=5 unreachable=0 failed=0
Provisioning is complete, services are running, and the dynamic DNS update has been performed. I can immediately connect to our new host via the name I chose upon first launching the instance!
Lessons
CentOS on EC2 is cool, and using Ansible to provision CentOS instances is too. There are many ways to accomplish this, I suppose, and I chose mine.
One final note: pop over to Seth Vidal’s site and read about how he uses Ansible on cloud instances. I particularly recommend that because he knows a lot more about all this and has more experience with it than I do. He takes a different approach, creating a module which injects a host into Ansible’s in-memory inventory; that module is now in Ansible core. I could have copied and pasted and been done with it, but I wanted to do this differently.