I have a database which consists of individual records, where each “record” lives in an independent YAML file. Let’s assume something like this, although the content is irrelevant:
title: The Sound of Music
year: 1965
director: Robert Wise
cast:
  - Julie Andrews
  - Christopher Plummer
synopsis: >
  In 1930's Austria, a young woman named Maria is failing miserably in her
  attempts to become a nun. When the Navy captain Georg Von Trapp writes to the
  convent asking for a governess that can handle his seven mischievous
  children, Maria is given the job.
tags:
  - children
  - austria
  - nun
YAML is a rather easy format to type, and I use it for recording all sorts of information, but I do occasionally introduce an error into a file, for example a colon (:) where there shouldn’t be one. I’ve always wanted to be notified of an error quickly (before any onward conversion of the YAML, which is out of scope for this post).
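To make the kind of mistake concrete, here is a minimal sketch, assuming PyYAML is installed. The two sample strings and the helper name `yaml_error` are made up for illustration. For a stray colon inside an unquoted value, PyYAML typically complains that "mapping values are not allowed here".

```python
import yaml

# Two hypothetical records: the second has a stray colon in the title.
GOOD = "title: The Sound of Music\nyear: 1965\n"
BAD = "title: The Sound: of Music\nyear: 1965\n"


def yaml_error(text):
    """Return None if text parses as YAML, otherwise the exception raised."""
    try:
        yaml.safe_load(text)
        return None
    except yaml.YAMLError as e:
        return e


print(yaml_error(GOOD))  # no error for the well-formed record
print(yaml_error(BAD))   # parser error pointing at the second colon
```

Quoting the value ("The Sound: of Music") makes the second record valid again.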
I check the files into git, so I could use git hooks, but I wanted more immediate feedback: something that, say, inotify could provide, except that inotify isn’t portable across platforms.
Upon searching for a cross-platform method of watching files I stumbled over watchdog, a portable Python library for monitoring filesystem events (it’s supposed to work on Windows too). The examples included in the documentation got me started quickly, and I now get a notification via Growl whenever I make a mistake.
The program itself is rather simple:
#!/usr/bin/env python
import os, sys
import signal
import subprocess
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
import time
import yaml
# Absolute path of the directory to watch (~ expanded; symlinks are not resolved)
DIR = os.path.abspath(os.path.expanduser("~/yamldb"))
def signal_handler(signum, frame):
    """ Bail out at the top level """
    sys.exit(0)
def notify(msg):
    """ Warn front-end """
    proc = subprocess.Popen(['growlnotify', "-n", "YAMLwatch", "-t", "YAMLwatch"],
                            stdin=subprocess.PIPE)
    # communicate() expects bytes on Python 3
    if not isinstance(msg, bytes):
        msg = msg.encode("utf-8")
    proc.communicate(msg)
def validyaml(filename):
    """
    Try loading file as YAML; return exception error or None.
    """
    try:
        with open(filename) as f:
            # safe_load parses plain YAML without constructing arbitrary objects
            yaml.safe_load(f)
        return None
    except Exception as e:
        return e
class MyHandler(FileSystemEventHandler):
    """
    React to changes in YAML files, handling create and update
    explicitly. Ignore directories. Warning: move operations
    (mv f1.yaml f2.yaml) are not handled.
    """
    def catch_all(self, event, op):
        if event.is_directory:
            return
        filename = event.src_path
        extension = os.path.splitext(filename)[-1].lower()
        if extension == '.yaml':
            print("YAML: (%s) %s" % (op, filename))
            err = validyaml(filename)
            if err is not None:
                notify("%s\n\n%s" % (os.path.basename(filename), str(err)))
                print("ERROR in loading yaml (%s)" % err)
    def on_created(self, event):
        self.catch_all(event, 'NEW')
    def on_modified(self, event):
        self.catch_all(event, 'MOD')
def main():
    signal.signal(signal.SIGINT, signal_handler)
    observer = Observer()
    event_handler = MyHandler()
    observer.schedule(event_handler, DIR, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    finally:
        # Also runs on Ctrl-C: signal_handler exits via SystemExit
        observer.stop()
        observer.join()
if __name__ == '__main__':
    main()
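The handler's docstring warns that renames go unnoticed. One way this might be covered is watchdog's `on_moved` callback, which receives an event carrying both `src_path` and `dest_path`. The following is only a sketch: `MoveAwareHandler` and its `check` callback are hypothetical names, with `check` standing in for whatever validation should run on the file.

```python
import os

from watchdog.events import FileSystemEventHandler


class MoveAwareHandler(FileSystemEventHandler):
    """Re-check a YAML file after it has been moved or renamed (sketch)."""

    def __init__(self, check):
        super().__init__()
        self.check = check  # hypothetical callback taking a filename

    def on_moved(self, event):
        if event.is_directory:
            return
        # After a move the file lives at dest_path; src_path is gone.
        filename = event.dest_path
        if os.path.splitext(filename)[-1].lower() == '.yaml':
            self.check(filename)
```

The same `on_moved` method could equally be added to the existing handler class, validating `event.dest_path` instead of `event.src_path`.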
One thing to note when using vim to edit files is that the editor saves the current file under a temporary name and then unlinks/links it to the original. To avoid that, I make sure vim modifies the original file directly for files in the directory I’m watching, by configuring this in my .vimrc:
autocmd BufNewFile,BufRead /Users/jpm/yamldb/* set nobackup nowritebackup
I could imagine using watchdog for other things, such as:
- Checking that I’ve bumped a zone’s SOA serial number after editing a zone master file
- Verifying that Ansible playbooks, which are also YAML, are valid
- Continuous integration
- LaTeX compilation
- Creating a backup whenever something changes; for example versioned backups on a per-file basis
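As an illustration of the last idea, a per-file versioned backup could be as small as the sketch below, which uses only the standard library. The function name `backup_version`, the timestamp naming scheme and the backup directory layout are assumptions of mine, not something I actually run.

```python
import os
import shutil
import time


def backup_version(path, backup_dir):
    """Copy path into backup_dir under a timestamped name; return the copy's path."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    candidate = os.path.join(backup_dir, "%s.%s" % (os.path.basename(path), stamp))
    # Avoid clobbering an earlier copy made within the same second.
    dest, n = candidate, 1
    while os.path.exists(dest):
        dest = "%s-%d" % (candidate, n)
        n += 1
    shutil.copy2(path, dest)
    return dest
```

Called from a watchdog handler's `on_modified` with `event.src_path`, this would leave one copy per detected change.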
Ideally the file system would reject modifications to a file containing errors, but that would mean a lot more work, for example implementing such a system atop a FUSE file system. For my use case, a warning suffices.