The Apache HTTP server allows a system administrator to configure how it should log requests. This is good in terms of flexibility, but it's horrid in terms of parsing: every installation can be different.

I was tasked with getting Apache logs into Graylog and discovered that $CUST has different Apache log formats even between Apache instances which run on a single machine. I certainly didn't want to have to write extractors for all of those, and I can't imagine people here wanting to maintain those ...

People have tried submitting JSON directly from Apache, but I find that a bit cumbersome to write, and I have the feeling it's brittle: an unexpected brace in the request (which is perfectly possible) could render the JSON invalid.

apache-logger

I settled on what I think is a much simpler and rather flexible format: a TAB-separated (\t) list of key=value pairs configured like this in httpd.conf:

LogFormat "clientaddr=%h\trequest=%r\tstatus=%s\toctets=%b\ttime=%t\truntime=%D\treferer=%{Referer}i\tuseragent=%{User-Agent}i\tinstance=nsd9" graylog
CustomLog "|/usr/local/apache-logger.py" graylog

The apache-logger program splits those up, adds fields required for GELF, and fires that off to a Graylog server configured with an appropriate GELF input.

#!/usr/bin/env python
# JPMens, March 2015 filter for special Apache log format to GELF

import sys
import json
import gelf    # https://github.com/jspaulding/gelf-python/blob/master/gelf.py
import socket
import fileinput
from geoip import open_database    # http://pythonhosted.org/python-geoip/

my_hostname = socket.gethostname()  # GELF "host" (i.e. source)

try:
    geodb = open_database('GeoLite2-City.mmdb')
except:
    sys.exit("Cannot open GeoLite2-City database")

c = gelf.Client(server='192.168.1.133', port=10002)

def isnumber(s):
    try:
        float(s)
        return True
    except ValueError:
        pass

    return False

for line in fileinput.input():
    parts = line.rstrip().split('\t')
    data = {}
    for p in parts:
        key, value = p.split('=', 1)

        if isnumber(value):
            try:
                value = int(value)
            except ValueError:
                value = float(value)

        if value != '' and value is not None:
            data[key] = value

    data['host']        = my_hostname    # overwrite with GELF source
    data['type']        = 'special'

    request = data.get('request', 'GET I dunno')
    method = request.split(' ', 1)[0]

    data['short_message']  = request
    data['method']         = method
    if 'request' in data:
        del data['request']

    try:
        g = geodb.lookup(data['clientaddr'])
        if g is not None:
            data['country_code'] = g.country
    except:
        pass

    try:
        c.log(json.dumps(data))
    except:
        pass
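
For what it's worth, the gelf module isn't strictly necessary: a Graylog GELF UDP input also accepts zlib-compressed JSON datagrams, so a hypothetical send_gelf() along these lines would do the same job (a sketch only; server and port are those used above):

import json
import socket
import zlib

def send_gelf(data, server='192.168.1.133', port=10002):
    msg = dict(data)
    msg.setdefault('version', '1.1')            # mandatory GELF field
    payload = zlib.compress(json.dumps(msg).encode('utf-8'))
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload, (server, port))
    sock.close()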

Graylog effectively receives something like this (the Geo-location having been added by apache-logger):

{
    "clientaddr": "62.x.x.x",
    "host": "tiggr",
    "instance": "nsd9",
    "method": "GET",
    "country_code": "GB",
    "octets": 282,
    "referer": "-",
    "runtime": 501,
    "short_message": "GET /barbo HTTP/1.1",
    "status": 404,
    "time": "[20/Mar/2015:06:41:36 +0000]",
    "type": "special",
    "useragent": "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.13.1.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2"
}

You'll have noted that the LogFormat allows me to specify any number of fields (e.g. instance) and values.

Graylog :: 20 Mar 2015

There's a rather busy Graylog installation next door which dissects messages given it via a few syslog inputs and pushes these into a stream to be viewed in a Graylog dashboard. In particular, it attempts to determine a user connecting via POP3 or IMAP, and this has been working rather well.

Graylog dashboard

I was given the task of thinking about how we could store a per-day count of unique users, and I started dabbling with the stream settings, only to find that I could export a stream via GELF, so I did just that.

The GELF forwarder pushes each message to a small Python program (prototype production version below) which sees this after unpacking the GELF:

{
    "_level": 6,
    "_timestamp": "2015-02-18T09:38:59.000+01:00",
    "_mail_loginuser": "jane.doe@example.net",
    "level": 6,
    "_gl2_source_node": "88e6b126-3e28-4464-84d9-7f10d0008a70",
    "timestamp": 1424248739.0,
    "_source": "m001",
    "_message": "m001 imapd: user=jane.doe@example.net, ip=[10.0.1.12], port=[1167]\n",
    "_gl2_source_input": "54003b3d498e65130f5803b0",
    "host": "m001",
    "version": "1.1",
    "full_message": "m001 imapd: user=jane.doe@example.net, ip=[10.0.1.12], port=[1167]\n",
    "_facility": "mail",
    "_id": "95939419-b749-11e4-a19a-9c8e992bcdc0",
    "_forwarder": "org.graylog2.outputs.GelfOutput",
    "short_message": "m001 imapd: user=jane.doe@example.net, ip=[10.0.1.12], port=[1167]\n"
}

The small utility opens a UDP datagram port and waits for messages to flow in, unpacking the GELF from each and processing the data.

#!/usr/bin/env python
import socket
import zlib
import json
import redis

UDP_PORT = 15005

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", UDP_PORT))

r = redis.StrictRedis(host='localhost', port=6379, db=0)

while True:
    data, addr = sock.recvfrom(8192)
    payload = zlib.decompress(data, 16+zlib.MAX_WBITS)  # GELF arrives gzip-compressed
    data = json.loads(payload[:-1])                     # -1 for NUL byte at end

    tstamp = data['_timestamp'][0:10]       # "2015-02-18"

    proto = 'pop3'
    if 'imap' in data['_message']:
        proto = 'imap'

    username = data.get('_mail_loginuser', 'nop').strip().lower()

    key = 'day-' + tstamp + '-' + proto     # per-day, per-protocol counter
    r.incr(key)

    key = 'user-' + username                # per-user counter
    r.incr(key)

This has been running for a couple of weeks and appears to be quite reliable, even with the volume of messages we're seeing.

Graylog

At the end of the day we can read the incremented values which are stored in redis.
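
Reading the counters back programmatically is trivial too; a small sketch with redis-py (the key names are the ones the program above creates):

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

for proto in ('pop3', 'imap'):
    value = r.get('day-2015-03-14-' + proto)    # None if the key doesn't exist
    print('%s: %d' % (proto, int(value or 0)))

Or interactively: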

$ redis-cli
127.0.0.1:6379> get "day-2015-03-14-pop3"
"902521"
Graylog :: 14 Mar 2015

Shortly after enduring immense pain in getting the 4-in-1 sensor going, as explained in the story of home automation with openHAB, Z-Wave, and MQTT, the sensor's batteries ran dry. I uttered a stream of unprintables and decided that I needed a more reliable method of finding out when sunrise and sunset are.

I settled on forecast.io and wrote up a little program which queries it a couple of times a day and publishes the values via MQTT to my broker, which is bridged to another Mosquitto broker running on the little openHAB box.

home/weather/today/sunsetTime 1425748577
home/weather/today/sunsetTime-iso 2015-03-07T18:16:17
home/weather/today/sunsetTime-hhmm 18:16
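
The little program itself is nothing to write home about; a sketch of what it boils down to, assuming the requests and paho-mqtt modules (APIKEY, LAT, and LON are placeholders, and the broker is assumed to be local):

import time
import requests
import paho.mqtt.publish as publish

url = 'https://api.forecast.io/forecast/APIKEY/LAT,LON'
daily = requests.get(url).json()['daily']['data'][0]

sunset = int(daily['sunsetTime'])       # Unix epoch seconds
local = time.localtime(sunset)

messages = {
    'sunsetTime':      str(sunset),
    'sunsetTime-iso':  time.strftime('%Y-%m-%dT%H:%M:%S', local),
    'sunsetTime-hhmm': time.strftime('%H:%M', local),
}
for topic, payload in messages.items():
    publish.single('home/weather/today/' + topic, payload,
                   hostname='localhost', retain=True)   # retained, so subscribers get the last value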

A rule fires at sunset minus a bit, and we rejoiced. (I in particular because the WAF returned to 100.)

For a few days now I've been getting heat because the lights came on in the late afternoon while it was still very light outside. I patiently explained that it's due to the "official" sunset time for our city, but when the lights switched on again today at 16:15, I thought I'd better investigate.

So I did. To cut a rather long story short, everything would have worked if the disk of our openHAB box hadn't filled up to 100% with log files... Who said #monitoringsucks? They were right.

Once cleaned up, all started working again.

openHAB sunset

Now I have to go and explain to the management that the "official" sunset time isn't actually 16:15 after all...

automation :: 07 Mar 2015

Somebody asked me whether it would be possible to be notified (no pun intended) of new DNSSEC keys as they are published in a zone, and I referred them to Stéphane Bortzmeyer's key-checker which I mentioned when I talked about RFC 5011 with OpenDNSSEC, BIND, and Unbound recently. They wanted something more "live", as in "tell me now!", so I thought I'd give it a shot.

OpenDNSSEC has a DelegationSignerSubmitCommand which is invoked when a new KSK is created. This command is fed a DNSKEY resource record (RR) on stdin, and it's intended to construct a DS record set and submit that to a parent zone. As far as I can tell, it's only invoked for new KSKs, so I had to go a different route because, say, revoked keys wouldn't be passed to the DelegationSignerSubmitCommand.

It occurred to me to re-use some of the ideas developed for automatic provisioning of slave DNS servers, in which I used a bit of Net::DNS::Nameserver to listen for DNS NOTIFY packets. Unfortunately dnspython doesn't have any of that, so I had to fall back on Perl.

The OpenDNSSEC signer uses DNS adapters to NOTIFY a bunch of NSD name servers that a zone is ready for transfer. These in turn slave the zone and make it available to the world. In addition, when NSD transfers a DNSSEC-signed zone, I've configured it to send a DNS NOTIFY to a small utility called key-listen which runs on port 5301 of 127.0.0.1 on one of the NSD hosts:

zone:
    name: "d1.aa"
    zonefile: "d1.aa.zone"
    allow-notify: 192.168.1.110 NOKEY
    request-xfr: 192.168.1.110 NOKEY
    notify: 127.0.0.1@5301 NOKEY       # for key-listen.pl

key-listen

What the key-listen utility does is to set up a handler for DNS NOTIFYs, which contain the source address and the zone being notified. When NSD is notified from OpenDNSSEC, it does its transfer thing, and then notifies key-listen. The program then "turns around" to query the DNSKEY RRset for the zone from NSD and dumps it into a file for logging purposes. This is what the program writes:

d1.aa. 300  IN DNSKEY 257 3 8 AwEAAc ... MjdPG6P4Ex8= ; Key-ID 9562
d1.aa. 300  IN DNSKEY 385 3 8 AwEAAc ... L5P8LKFEVKs= ; Key-ID 13008

In addition it records the key id (keytag) and the flags of the keys we're interested in.
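
key-listen itself is Perl, but purely as an illustration, the "turn around and query the DNSKEY RRset" step amounts to roughly this in dnspython terms (the zone name is hard-coded here, and NSD is assumed to answer on 127.0.0.1):

import dns.dnssec
import dns.message
import dns.query
import dns.rdatatype

zone = 'd1.aa.'
query = dns.message.make_query(zone, dns.rdatatype.DNSKEY)
response = dns.query.tcp(query, '127.0.0.1')

for rrset in response.answer:
    if rrset.rdtype != dns.rdatatype.DNSKEY:
        continue
    for key in rrset:
        # flags 257 = KSK (SEP bit), 385 = KSK with the REVOKE bit set
        print('%s flags=%d key-id=%d' % (zone, key.flags, dns.dnssec.key_id(key)))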

So far so good, but how to "notify" the user? I suppose any sort of monitoring solution would have done the job, but I thought this should be "flexibilized" (that's a word I recently invented), so, in the spirit of how do your servers talk to you? I wanted this to be easy. It is.

Twinotify

That tweet contains the keytag and the DNSKEY flags for each of the KSKs in the zone d1.aa. OK, it may be a little crazy, but it works really well, and it's not half as bonkers as tweeting via RFC 2136 dynamic updates, is it?

The tweet is here for illustrative purposes only. What the user really wants is a smartphone notification, but that's what mqttwarn is all about; it even has a dnsupdate plugin :-)

DNSSEC and MQTT :: 05 Mar 2015

The impromptu midnight DNSSEC tutorial I gave last Saturday during the Ansible meetup in Utrecht was quite successful: only one third of the participants fell asleep (a good turnout for me), but I attribute that solely to the Gin&Tonics which were consumed during the tutorial. Be that as it may, one of the participants was particularly attentive: Serge apparently really paid attention, because after I had extolled the benefits of potentially being able to do away with X.509 Certification Authorities by using DANE, the DNS-based Authentication of Named Entities, he actually had the nerve to ask me why this site wasn't using DANE.

It wasn't. It does now.

DANE validation

To be honest, I simply forgot, and I thought it would be a little complicated if not outright impossible, because this site lives on shared hosting. But I'm very lucky: the great people of Uberspace have a simple procedure for uploading any number of certificates, even self-signed ones.

So, after issuing an X.509 TLS certificate from my private Certification Authority (CA) and having the Ubernauten add that to my domain, I got cracking, and created a DNS TLSA record for it. I decided I'd use the following fields:

  • Certificate Usage 3, with which I specify that the certificate (or public key) the HTTP server presents in the TLS session must match this record.
  • Selector 0, which means the full certificate is matched.
  • Matching Type 1, for the SHA-256 hash of the binary (DER) certificate content.

Since I settled on certificate usage 3, I need the "fingerprint" of the binary certificate content. There are two ways I know of to obtain the fingerprint I need for building the record:

$ openssl x509 -in jpmens.net.crt -outform DER | shasum -a 256 | awk '{print $1;}'
2f32249d9c9bd8fa245dafc722c55f5087c3b2b8c236d194ebc912f30be20cb5

$ openssl x509 -in jpmens.net.crt -noout -fingerprint -sha256 | sed -e 's/^.*=//' -e 's/://g'
2F32249D9C9BD8FA245DAFC722C55F5087C3B2B8C236D194EBC912F30BE20CB5
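
If OpenSSL isn't handy, the same digest can be produced with Python's standard library (a sketch; jpmens.net.crt is the PEM file used above):

import hashlib
import ssl

pem = open('jpmens.net.crt').read()
der = ssl.PEM_cert_to_DER_cert(pem)     # strip the PEM armour, decode base64
print(hashlib.sha256(der).hexdigest())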

Other than upper/lowercase hex, there is no difference in the value. So, what I now need to do is to create the actual TLSA record in the DNS, and sign the zone. Here we go (and I'm very grateful to Florian for the combination of dig flags which show header and answer only -- I've been wanting that for donkey's ages :-)

$ dig +nocmd +noadditional +noauthority +nostats +noquestion _443._tcp.jpmens.net TLSA
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38211
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; ANSWER SECTION:
_443._tcp.jpmens.net.   79948   IN  TLSA    3 0 1 2F32249D9C9BD8FA245DAFC722C55F5087C3B2B8C236D194EBC912F3 0BE20CB5

Note how the 3 0 1 I chose (certificate usage, selector, and matching type) are part of the rdata of the record.

If you don't feel like doing all that OpenSSL stuff, Shumon Huque has created a marvellous TLSA Record Generator which does all this for you; you just copy/paste the result into your DNS zone and sign it.

TLSA Record Generator

So, I now have a TLS-protected Web site, and I have a TLSA record in the DNSSEC-signed zone. And now?

If tech life were really good, the large browser vendors would have implemented support for TLSA and that would be it. As it stands, none of them have, maybe because they're shy of doing away with a lot of cash they're getting from the hundreds of Certification Authorities whose CA root certificates they carry in the Web browsers we use? Peter tells me the reasons are actually technical in nature: the latency resulting from the out-of-band checks which have to be done via the DNS to obtain the TLSA record and signatures and verify them, and the fact that most customer-premises equipment (CPE) just won't handle DNSSEC properly. There's probably some truth in both.

Be that as it may, in order to test this, I've installed the TLSA Validator from the cz.nic team. This Web browser add-on is rather well documented, and its source code is available.

The result? It works, as you can see from the screenshot at the top of this page.

My impression of the validator is that it's a bit flaky: I couldn't get it to work at all on Firefox for OS X (or for Windows), but it worked out of the box with Chrome on OS X. I'd like to hear about your experiences with it. What I like a lot is that I can specify which resolver it should use, and I just pointed it at the DNSSEC-trigger I run anyway.

Before you try all this at home, do make sure you're actually using a DNSSEC-validating resolver (i.e. is the top right of this page green?).

I'm reminded in a comment below that there are other utilities, such as ldns-dane from the ldns project. ldns-dane can create a TLSA record from a TLS connection, and it can verify a TLSA setup:

$ ldns-dane -4 create jpmens.net 443
_443._tcp.jpmens.net.   3600    IN  TLSA    3 0 1 2f32249d9c9bd8fa245dafc722c55f5087c3b2b8c236d194ebc912f30be20cb5

$ ldns-dane -4 verify jpmens.net 443
95.143.172.28 dane-validated successfully


DANE and DNSSEC :: 04 Mar 2015
