Applying a Puppet configuration to a machine or group of machines (called nodes) entails defining classes which are applied to said nodes. Classes are defined in manifest files and nodes are often defined as a collection of classes to be applied to a particular machine. These definitions are typically also contained in files.

Instead of classifying nodes in files I can use an external node classifier or ENC, basically a program that outputs a YAML document when passed a node name. (Puppet can also retrieve classification data directly from an LDAP directory, but even then I'd use the ENC method as it is more flexible.) This is good because I can obtain a list of classes for a machine from any source I desire, e.g. a database, and Puppet will, as long as the classes exist of course, happily apply the configuration I create on the fly to the node.
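To make that contract concrete, here is a minimal sketch of the decision an ENC makes. It assumes a hypothetical in-memory table, NODE_DB, standing in for the database; the node names are made up for illustration. A real ENC would serialise the returned dictionary as YAML on standard output.

```python
# Hypothetical stand-in for a database: node name -> list of class names.
NODE_DB = {
    "web01.example.com": ["ssh", "dyn"],
    "db01.example.com": ["ssh"],
}


def classify(node_name, db=NODE_DB):
    """Return the structure an ENC is expected to print as YAML.

    Unknown nodes get an empty class list here; that is a choice made
    for this sketch, not something Puppet mandates.
    """
    classes = db.get(node_name, [])
    return {
        # Classes applied without parameters map to an empty value.
        "classes": dict((name, "") for name in classes),
        # Values under "parameters" are handed to Puppet as variables.
        "parameters": {"nodename": node_name},
    }


print(classify("web01.example.com"))
```

Keeping the lookup in a function of its own means the data source can be swapped (file, database, web service) without touching the part that prints the YAML.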

I'll also briefly discuss the generate and extlookup functions, because they too relate to obtaining information dynamically within a class.

Here's the YAML document my ENC outputs. I'm applying two classes called "dyn" and "ssh". The former has three parameters ("ports", "smtpmx", and "users"), and the latter none. Note that the values in the "parameters" element are passed into Puppet as variables, and that these are not associated with a particular class. (I trust this will become clear in a moment, when we see the class definition for dyn.)

classes:
  dyn:
    ports:
    - 25
    - 587
    smtpmx: ''
    users:
      alex:
        home: /nfs/alex
        uid: 502
      jane:
        home: /home/jane
        uid: 201
  ssh: ''
parameters:
  location: Chicago
  manager: Jane Doe
  my_memory: 216

As the YAML above shows, the ENC can pass parameters to classes (a.k.a. parametrized classes); here the "dyn" class receives the three parameters listed. If the ENC supplies a parameter that the class definition doesn't declare, the agent fails to apply the class with an error 400 ("Invalid parameter ... on node ..."), which is very useful in tracking down mistakes.
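One way to catch such a mismatch before the agent does is a small check inside the ENC itself. The sketch below assumes a hypothetical table, CLASS_PARAMS, listing the parameters each class accepts. Puppet does not provide this table, so it would have to be maintained by hand or extracted from the manifests. The "colour" parameter is deliberately bogus.

```python
# Hypothetical: parameters accepted by each class, maintained by hand.
CLASS_PARAMS = {
    "dyn": {"smtpmx", "ports", "users"},
    "ssh": set(),
}


def unknown_parameters(classes, class_params=CLASS_PARAMS):
    """Return {class name: sorted parameters the class does not declare}."""
    problems = {}
    for name, params in classes.items():
        # A class applied without parameters has an empty (non-dict) value.
        supplied = set(params) if isinstance(params, dict) else set()
        extra = supplied - class_params.get(name, set())
        if extra:
            problems[name] = sorted(extra)
    return problems


enc_classes = {
    "dyn": {"smtpmx": "", "ports": [25, 587], "colour": "red"},
    "ssh": "",
}
print(unknown_parameters(enc_classes))
```

The check only looks for surplus parameters; it makes no attempt to detect required parameters that are missing.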

Here is the class definition:

class dyn($smtpmx, $ports, $users) {

    file { "/tmp/d2":
        owner   => 'puppet',
        group   => 'root',
        mode    => 0400,
        content => template("${module_name}/tm.erb"),
    }

    # Use inline_template() to build an /etc/passwd-like list of
    # users "username:home:uid;". Entries are semicolon-terminated

    $res = inline_template("<% users.each do |key,val| -%>
<%= key %>:<%= users[key]['home'] %>:<%= val['uid'] %>;<% end -%>")

    # Now create an array from that result
    $users_array = split($res, '[ ;]')

    define one_user() {
        # $title is one "username:home:uid" entry
        $u = split($title, '[:]')

        $username = $u[0]
        $userhome = $u[1]
        $useruid  = $u[2]

        $msg = sprintf('user=%s, home=%s, uid=%d', $username, $userhome, $useruid)

        file { "/tmp/d2.${username}":
            # Two examples:
            # content   => "Hello: your home is in $userhome",
            # content   => generate('/tmp/jpgen')
            content     => inline_template(generate('/bin/jpgen'))
        }
    }

    # Declare one one_user resource per array element
    one_user{ $users_array: }

    $myname = extlookup('myname', 'NONAME')
    notice("*************************** $myname ")
}

This is the template called tm.erb. Note where its variables come from:

  • The location and my_memory variables are from the parameters section of the YAML produced by the ENC.
  • The rest are parameters from the parametrized class.

This is a template for [<%= location %>]
Memory: <%= has_variable?("my_memory") ? my_memory : '64' %>

Server passed == [<%= smtpmx %>]

<% users.each do |key,val| -%>
    Home for <%= key %> is <%= users[key]['home'] %>
    userid == <%= val['uid'] %>
<% end -%>

And here is the resulting file d2 as produced by Puppet applying the template:

This is a template for [Chicago]
Memory: 216

Server passed == []

    Home for alex is /nfs/alex
    userid == 502
    Home for jane is /home/jane
    userid == 201

My intention in experimenting with parametrized classes was to create a class to which I can pass a list of users as a parameter. I don't seem to be able to use control structures (i.e. loops) within a class definition (or at least I haven't found a way to do that), which is why I resort to inline_template() and split() to create an array of users and then process that array element by element. This is done by the one_user() defined type:

  • With an inline_template() I create a string containing a colon-separated list of values, with each "user" separated from the next with a semi-colon.
  • I then split() this into an array.
  • The one_user() resource is invoked with this array (note the trailing colon on the variable name) which appears to call my one_user() "function" once for each array element, setting $title to the content of the array element.
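The three steps above can be mirrored outside Puppet to see what each intermediate value looks like. This is only a Python sketch of the data flow, not what Puppet executes; the user data is the same as in the ENC output.

```python
import re

users = {
    "jane": {"uid": 201, "home": "/home/jane"},
    "alex": {"uid": 502, "home": "/nfs/alex"},
}

# Step 1: what the inline_template() produces: "name:home:uid;" per user.
# Keys are sorted here only to make the output predictable.
res = "".join(
    "%s:%s:%s;" % (name, users[name]["home"], users[name]["uid"])
    for name in sorted(users)
)

# Step 2: the equivalent of split($res, '[ ;]'). Python keeps the empty
# string after the final ';', so it is filtered out explicitly.
users_array = [item for item in re.split(r"[ ;]", res) if item]


# Step 3: what one_user() does with each $title.
def one_user(title):
    username, userhome, useruid = title.split(":")
    return "user=%s, home=%s, uid=%d" % (username, userhome, int(useruid))


for title in users_array:
    print(one_user(title))
```

Flattening each user into a single string is what makes the trick work: the array elements become resource titles, and a title is the only per-element value the defined type receives.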

More data from server: generate

Puppet's generate() function invokes an executable program on the server and returns that program's output. I call generate() here without arguments (the function reference says any additional arguments are passed on to the command, but I haven't used that), so I can't say "generate for this user" or "generate for this node". About the only thing the program can do is check its environment to determine for which node it is being invoked ($SSL_CLIENT_S_DN_CN). What I can do is use the program's output as a string I pass to inline_template(), as shown above. The jpgen program in this example is trivial:


#!/bin/sh

cat <<!
Hello, this is <%= nodename %>: how are
you, <%= username %>?
!

One of the files produced by the file resource in one_user() then contains:

Hello, this is how are
you, jane?
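The two-stage flow (run a program, then treat its output as a template) can be sketched in Python as follows. Everything here is a stand-in: the child process plays the role of jpgen, render() handles only simple <%= name %> tags rather than real ERB, and node.example.com is a made-up node name.

```python
import re
import subprocess
import sys


def generate(*command):
    """Run a program and return its standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def render(template, variables):
    """Replace <%= name %> tags only; real ERB can do far more."""
    return re.sub(
        r"<%=\s*(\w+)\s*%>",
        lambda match: str(variables[match.group(1)]),
        template,
    )


# Stand-in for jpgen: a child process that prints a template.
jpgen = [
    sys.executable,
    "-c",
    "print('Hello, this is <%= nodename %>: how are you, <%= username %>?')",
]

template = generate(*jpgen)
print(render(template, {"nodename": "node.example.com", "username": "jane"}))
```

The point of the indirection is that the program decides the shape of the text, while the variables are filled in later, in the scope where the template is evaluated.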

External data: extlookup

Looking up external data from within a class is possible with extlookup, a function that extracts data from a CSV file. (There seem to be multiple iterations of extlookup floating around with differing capabilities, including a pluggable version with JSON and YAML lookups.) I'm limiting this discussion to the extlookup function as it is available to me in a 2.7 Puppet release. (Puppet's quite idiotic release numbering (0.25, 2.6, 2.7) makes searching for solid information difficult at best. Tip: remove all leading zeroes and decimal points and read what's left as the release number. But I digress.)

Because it wasn't obvious to me when I first used this, I'd like to explicitly point out that the CSV files extlookup consults are on the Puppet master (i.e. the server) and not on the client node. In my site.pp I configure the order in which extlookup searches for CSV files:

$extlookup_datadir = "/etc/db/"
$extlookup_precedence = ["%{fqdn}", "common"]

Puppet will first look in /etc/db/ for a CSV file named after the node's fully qualified domain name and will fall back to /etc/db/common.csv. Suppose I have the following line in common.csv:

myname,John Doe <>

The notice() in my class (see above) will print the contact and e-mail address. If extlookup() doesn't find myname in any of the files, it returns the default value I specified, "NONAME".
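The search order is easier to see spelled out. The following Python sketch imitates the precedence lookup under stated assumptions: it is not Puppet's implementation, the data directory is a throwaway temporary one, and the fqdn is a made-up value.

```python
import csv
import os
import tempfile


def extlookup(key, default, datadir, precedence, facts):
    """Return the value for key from the first CSV file that defines it."""
    for entry in precedence:
        # Expand %{fact} placeholders such as %{fqdn}.
        name = entry
        for fact, value in facts.items():
            name = name.replace("%{" + fact + "}", value)
        path = os.path.join(datadir, name + ".csv")
        if not os.path.exists(path):
            continue
        with open(path, newline="") as handle:
            for row in csv.reader(handle):
                if row and row[0] == key:
                    # One value comes back as a string, several as a list.
                    return row[1] if len(row) == 2 else row[1:]
    return default


# Demonstration with a throwaway data directory and made-up values.
datadir = tempfile.mkdtemp()
with open(os.path.join(datadir, "common.csv"), "w") as handle:
    handle.write("myname,John Doe\n")

facts = {"fqdn": "node.example.com"}
precedence = ["%{fqdn}", "common"]
print(extlookup("myname", "NONAME", datadir, precedence, facts))
print(extlookup("missing", "NONAME", datadir, precedence, facts))
```

Because the node-specific file is consulted first, dropping a CSV file named after a node into the data directory overrides the value from common.csv for that node only.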

And here, for good measure, is the small Python program I used as the ENC above. Note how I read the node's facts from a file that Puppet populates before my ENC is invoked. The file contains all the node's facts, including special facts; they have been collected and deposited on the Puppet master in a file called /var/lib/puppet/yaml/facts/nodename.yaml. The ENC can read that YAML to find facts about the node it is classifying. (The YAML in this file contains a few Ruby objects which I remove with a text substitution instead of doing it properly.)

#!/usr/bin/env python

import re
import sys
from yaml import safe_load, dump

# Puppet passes the node name as the first argument
n = sys.argv[1]

# Too lazy to do it properly: drop every line carrying a Ruby object tag
with open("/var/lib/puppet/yaml/facts/" + n + ".yaml") as f:
    text = re.sub(r'.*!ruby/.*', '', f.read())
# safe_load() suffices once the Ruby tags are gone
facts = safe_load(text)['values']

classes = ['ssh', 'dyn']  # obtain from DB

# Convert list to dict with empty values
cdict = dict((x, "") for x in classes)

userlist = {
    'jane': {
        'uid':  201,
        'home': '/home/jane'
    },
    'alex': {
        'uid':  502,
        'home': '/nfs/alex'
    }
}

# The dyn class is the only one that takes parameters
cdict['dyn'] = {"smtpmx": "", "ports": [25, 587], "users": userlist}

node = {
    'classes': cdict,
    'parameters': {
        "my_memory": 216,
        "location":  facts['location'],    # "Hamburg",
        "manager":   "Jane Doe",
        "nodename":  facts['fqdn']
    }
}

# Print the classification as YAML for Puppet to consume
dump(node, sys.stdout,
    indent=10)

All in all, these features make for very powerful combinations when designing a Puppet infrastructure. And as I'm a beginner, I hope I haven't made any grave mistakes here.

