I have a database which consists of individual records, where each “record” lives in an independent YAML file. Let’s assume something like this, although the content is irrelevant:

title: The Sound of Music
year: 1965
director: Robert Wise
cast:
 - Julie Andrews
 - Christopher Plummer
synopsis: >
  In 1930's Austria, a young woman named Maria is failing miserably in her
  attempts to become a nun. When the Navy captain Georg Von Trapp writes to the
  convent asking for a governess that can handle his seven mischievous
  children, Maria is given the job.
tags:
 - children
 - austria
 - nun

YAML is a rather easy format to type, and I use it for recording all sorts of information, but I do occasionally introduce an error into a file, for example a colon (:) where there shouldn’t be one. I’ve always wanted to be notified of such an error quickly, before the YAML is converted further (that conversion is out of scope for this post).
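
To illustrate the kind of mistake meant here, the following minimal sketch feeds two small records to PyYAML. The sample strings and the `check` helper are made up for illustration; the only assumption is that PyYAML is installed.

```python
import yaml

# Hypothetical sample records; only the second has a stray colon in a value.
GOOD = "title: The Sound of Music\nyear: 1965\n"
BAD = "title: The Sound of Music: Live\nyear: 1965\n"

def check(text):
    """Return None if text parses as YAML, else the parser's exception."""
    try:
        yaml.safe_load(text)
        return None
    except yaml.YAMLError as exc:
        return exc

if __name__ == "__main__":
    for name, text in (("good", GOOD), ("bad", BAD)):
        print(name, "->", check(text))
```

The second record fails because an unquoted value may not itself contain a colon followed by a space; quoting the title would make it valid again.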

I check files into git, so I could use git hooks, but I wanted more immediate feedback. Something like inotify could provide that, but it is Linux-only and so not portable across platforms.

While searching for a cross-platform method of watching files I stumbled upon watchdog, a portable Python library for monitoring filesystem events (it’s supposed to work on Windows too). The examples included in the documentation got me started quickly, and I now have something like this show up via Growl when I make a mistake:

[Image: Growl notification of YAML errors]

The program itself is rather simple:

#!/usr/bin/env python

import os, sys
import signal
import subprocess
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
import time
import yaml

# Ensure absolute path (incl. symlink expansion, which needs realpath)
DIR = os.path.realpath(os.path.expanduser("~/yamldb"))

def signal_handler(signum, frame):
    """ Bail out at the top level """
    sys.exit(0)

def notify(msg):
    """ Warn front-end """

    # universal_newlines=True lets communicate() accept text on Python 3 too
    proc = subprocess.Popen(['growlnotify', "-n", "YAMLwatch", "-t", "YAMLwatch"],
        stdin = subprocess.PIPE, universal_newlines=True)
    proc.communicate(msg)

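def validyaml(filename):
    """
    Try loading file as YAML; return exception error or None.
    """

    try:
        # The with-block closes the file even if parsing fails
        with open(filename) as f:
            # safe_load builds only plain Python objects, which is
            # all a syntax check needs
            yaml.safe_load(f)
        return None
    except Exception as e:
        return e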

class MyHandler(FileSystemEventHandler):
    """
    React to changes in YAML files, handling create and modify events
    explicitly. Ignore directories. Warning: move operations
    (mv f1.yaml f2.yaml) are not handled.
    """

    def catch_all(self, event, op):

        if event.is_directory:
            return

        filename = event.src_path
        extension = os.path.splitext(filename)[-1].lower()
        if extension == '.yaml':
            print("YAML: (%s) %s" % (op, filename))
            err = validyaml(filename)
            if err is not None:
                notify("%s\n\n%s" % (os.path.basename(filename), str(err)))
                print("ERROR in loading yaml (%s)" % err)

    def on_created(self, event):
        self.catch_all(event, 'NEW')

    def on_modified(self, event):
        self.catch_all(event, 'MOD')

def main():

    # SIGINT (Ctrl-C) is turned into sys.exit(0) by signal_handler
    signal.signal(signal.SIGINT, signal_handler)

    observer = Observer()
    event_handler = MyHandler()
    observer.schedule(event_handler, DIR, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    finally:
        # Runs on the way out: stop the watcher thread cleanly
        observer.stop()
        observer.join()

if __name__ == '__main__':
    main()
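
The handler's docstring warns that moves are not covered. Here is a minimal sketch of how that gap might be closed, assuming watchdog's `on_moved` hook and the `dest_path` attribute that move events carry. The class name `MoveAwareHandler` and the `check` callback are hypothetical and not part of the program above.

```python
import os
from watchdog.events import FileSystemEventHandler

class MoveAwareHandler(FileSystemEventHandler):
    """Hypothetical handler: runs a check on YAML files that were moved."""

    def __init__(self, check):
        super().__init__()
        self.check = check  # callable taking a path, e.g. a YAML validator

    def on_moved(self, event):
        # A move event has both src_path and dest_path; the file now
        # lives at dest_path, so that is the path worth validating.
        if event.is_directory:
            return
        dest = event.dest_path
        if os.path.splitext(dest)[-1].lower() == '.yaml':
            self.check(dest)
```

Something like `MoveAwareHandler(validyaml)` could then be scheduled on the observer in place of, or merged into, `MyHandler`.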

One thing to note when editing files with vim is that the editor saves the current file under a temporary name and then unlinks/links it to the original. To avoid that, I make sure vim modifies the original file directly in the particular directory I’m watching, by configuring this in my .vimrc:

autocmd BufNewFile,BufRead /Users/jpm/yamldb/* set nobackup nowritebackup
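
To see why a save-by-rename looks different to a file watcher, here is a rough sketch using only the standard library. It imitates the general write-to-a-temporary-file-then-rename strategy, not vim's exact behaviour, and the helper names are made up for illustration.

```python
import os
import tempfile

def save_in_place(path, text):
    """Overwrite the existing file; the file keeps its inode."""
    with open(path, "w") as handle:
        handle.write(text)

def save_via_rename(path, text):
    """Write a temp file next to path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.replace(tmp, path)  # path now refers to a brand-new file

if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "record.yaml")
    save_in_place(path, "title: first\n")
    inode = os.stat(path).st_ino

    save_in_place(path, "title: second\n")
    print("in place, same inode:", os.stat(path).st_ino == inode)

    save_via_rename(path, "title: third\n")
    print("via rename, same inode:", os.stat(path).st_ino == inode)
```

With the rename strategy the watched name ends up pointing at a different file, so a watcher tends to report create or move events for it rather than a plain modification.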

I could imagine using watchdog for other things, such as:

  • Checking that I’ve bumped a zone’s SOA serial number after editing a zone master file
  • Verifying that Ansible playbooks, which are also YAML, are valid
  • Continuous integration
  • LaTeX compilation
  • Creating a backup whenever something changes; for example, versioned backups on a per-file basis
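
As a sketch of the last idea in the list, a per-file versioned backup could be as small as the function below. The name `backup_version`, the timestamp format, and the directory layout are all assumptions made for illustration; only the standard library is used.

```python
import os
import shutil
import time

def backup_version(path, backup_dir):
    """Copy path into backup_dir under a timestamped name; return the copy's path."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    base = os.path.basename(path)
    target = os.path.join(backup_dir, "%s.%s" % (base, stamp))
    # Avoid clobbering an earlier copy made within the same second.
    counter = 1
    while os.path.exists(target):
        target = os.path.join(backup_dir, "%s.%s.%d" % (base, stamp, counter))
        counter += 1
    shutil.copy2(path, target)  # copy2 also preserves timestamps
    return target
```

Called from a handler's `on_modified` with `event.src_path`, this would leave one copy per detected change.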

Ideally the file system would reject modifications to a file containing errors, but that would mean a lot more work, for example implementing such a system atop a FUSE file system. For my use case, a warning suffices.

files and Monitoring :: 14 Jan 2013 :: e-mail