Whenever I'm asked "which platform should I use for a blog" I typically answer: WordPress, and I mean it: I've used WordPress for some years, and it is a very capable platform, with relatively modest system requirements (MySQL and PHP). WordPress boasts a huge number of plug-ins which add any amount of featuritis to the system, it has XML-RPC support which allows you to post articles using a variety of different clients on at least as many platforms, and there's an App for it.

In spite of all that, I moved away from WordPress, but why? First of all, I realized I didn't want to use GUI tools with their XML-RPC interfaces or what-have-you buttons. I tried a number of different client programs, the last being the one I liked best: MarsEdit. Even with that beautiful program, what I typically did was write Markdown syntax, run the result through Markdown.pl | pbcopy, and paste that into MarsEdit's window, finally pressing the Send to Weblog button. Second, I was getting tired of constantly having to update the back-end software because of issues in WordPress or some of the plug-ins I used.

Markdown is plain text, and it looks like this:

Whenever I'm asked _"what platform should I use for a blog"_ I typically
answer: [WordPress][wp], and I mean it: I've used WordPress for some years, and
it is a very capable platform, with relatively modest system requirements
(MySQL and PHP).  WordPress boasts a huge number of plug-ins which add any
amount of featuritis to the system, it has [XML-RPC support][3] which allows
you to post articles using a variety of different clients on at least as many
platforms, and [there's an App for it][2].

  [wp]: http://wordpress.org/
  [2]: http://ios.wordpress.org/
  [3]: http://codex.wordpress.org/XML-RPC_Support

I was using vi/vim to write Markdown syntax, so why not use something with fewer moving parts (i.e. no database back-end), less update hassle, and keep it nice and simple?

I can't imagine why people would want to keep stuff in databases instead of just using the file system. I think most people have just forgotten how files work. (via.)

I tripped over Jekyll a while ago, but I postponed looking at it more closely until recently. Jekyll is

a simple, blog aware, static site generator. It takes a template directory (representing the raw form of a website), runs it through Textile or Markdown and Liquid converters, and spits out a complete, static website suitable for serving with Apache or your favorite web server.

This means I have a directory on, say, my Mac, which contains files of posts, I run the jekyll command, and the result is a pile of HTML files that have to be transferred to the Web server? Sounds just like what I want. But what about the historic data I had in WordPress? How do I get that converted to Markdown syntax? Anybody interested in undertaking such an experiment must read Migrating from WordPress to Jekyll -- an excellent introduction to what Jekyll can accomplish -- as well as How To: WordPress to Jekyll. Both discuss how to get posts out of WordPress and into static files.

I wasn't comfortable with the suggested Ruby for blog migration, so I rolled my own set of tools, in which I basically did this:

  • Extract all posts and pages from the MySQL database into individual JSON files which I dropped into a directory. (I was expecting to have to massage a few of those manually, because I expected some contained (pasted) Latin-1 characters I needed to convert into UTF-8.)
  • A second utility processed these JSON files and produced files suitable for Jekyll, that is, the YAML front matter followed by the Markdown text. During this run, I did quite a bit of conversion:
    • Some of my older WordPress posts were written in Textile (because at the time I thought it would be a good idea -- it wasn't), most though in HTML. (Gawd, what a mess!) I attempted a simple recognition of Textile with a regular expression, and the rest I determined must be HTML.
    • Posts which contained pure HTML, e.g. those containing <object> were to be left intact.
    • Regular expressions helped me detect three (sigh) different methods of code-inclusion I used in WordPress in the past (pre/code tags, and two different plug-ins), and I massaged those into FIXME-xxxx, with xxxx being the name of the included file; these I later reviewed, and was able to mechanically fix most of them.
    • I translated fully-qualified URLs (into my own blog) to relative URLs, and detected media (.jpg, .png, etc.) and qualified those with a Liquid tag. (Liquid is a template engine integrated into Jekyll.)
    • I translated the result into Markdown with html2text. (Without this program I would have possibly given up the Jekyll undertaking.)
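
To make the second step a little more concrete, here is a minimal Python sketch of what such a converter could look like. It is not the tool I actually wrote: the JSON field names (title, date, slug, body), the blog host in BLOG_BASE, and the Textile regular expression are all assumptions made up for illustration, and the html2text conversion is left out so that the sketch needs nothing beyond the standard library.

```python
import re
from datetime import datetime

# Hypothetical blog host; substitute the real one.
BLOG_BASE = "http://blog.example.com"

# Rough Textile heuristic (an assumption, not the original expression):
# a line starting with "h1." .. "h6.", "bq." or "p." followed by whitespace,
# or a Textile-style link of the form "text":url.
TEXTILE_RE = re.compile(
    r'^(?:h[1-6]|bq|p)\.\s|"[^"\n]+":(?:https?://|/)\S+',
    re.MULTILINE,
)


def looks_like_textile(body):
    """Guess whether a post body is Textile; anything else is treated as HTML."""
    return bool(TEXTILE_RE.search(body))


def relativize_urls(body, base=BLOG_BASE):
    """Turn fully-qualified links into the blog itself into relative ones."""
    return re.sub(re.escape(base) + r"(?=/)", "", body)


def yaml_quote(text):
    """Single-quote a YAML scalar, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def render_jekyll_post(post):
    """Return (filename, file content) for one exported post dictionary."""
    date = datetime.strptime(post["date"], "%Y-%m-%d %H:%M:%S")
    # Jekyll expects post files to be named YYYY-MM-DD-title.markup
    filename = f"{date:%Y-%m-%d}-{post['slug']}.markdown"
    front_matter = "\n".join([
        "---",
        f"title: {yaml_quote(post['title'])}",
        f"date: {post['date']}",
        "layout: post",
        "---",
    ])
    body = relativize_urls(post["body"])
    return filename, front_matter + "\n\n" + body + "\n"
```

Each dictionary would come from json.load() on one of the extracted files, and the returned content would be written to the returned filename in Jekyll's _posts directory. The Textile check would decide which converter a body goes through before this final step.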

As a result, I had over two thousand files ready to be fed into Jekyll, but I wanted to ensure all those posts looked more or less sane, so I instructed Jekyll to build a single huge index.html file containing all posts (and a special kill button on each post I could press to record I wanted the post removed). I loaded that into a browser and leisurely scrolled through the whole site. I killed posts that

  • no longer worked; for example, embedded YouTube videos that had been deleted.
  • had "expired"; topics of no historic value.

Quite a number of my postings contain source code, and I wanted to ensure that it looked sane after having been massaged by Jekyll's use of Pygments. Most of my automatic conversion had worked, but I had to manually tweak about two dozen articles.

And now?

A new blogging platform (i.e. new technology) has its pros and cons, and it requires new methods and workflows. My experience, so far, has been good, but there are things I had to re-think.


Previewing, testing, re-design is now much easier: I use a local Web server to access generated content before I decide to push pages out to their final destination.


The page you are now reading is static: it is an HTML file which is being served by an Apache server directly from the file system. WordPress has its own full-text search engine that simply scans the database tables, but without a database at hand, Jekyll-based sites need alternatives. For the time being I've started using IndexTank for search, but this is activated only if your Web browser supports JavaScript. Otherwise, a custom search engine on Google handles finding stuff.


A trivial program creates a new file for a posting, adds a default YAML front matter (as well as custom YAML tags), and pushes the file into a Git repository before launching my preferred editor on the file. Simple, fast, and effective. Here's an example of the front matter for this posting:

title: 'Deliciously static: from WordPress to Jekyll'
date: 2011-06-12 13:00:39
published: false
expires: never
layout: post
 - WordPress
 - Jekyll
 - Site

This means: No blogging without my Mac. With WordPress I could blog from an iPhone or from its Web interface. I cannot do that any longer: I need my Jekyll installation. I have decided that is not a limitation.
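
For the curious, the new-post helper could be sketched in Python roughly as follows. This is an illustration rather than my actual program: the function names, the slug rule, the _posts default, and the choice of git add plus git commit are assumptions, and the editor is taken from the EDITOR environment variable with vi as a fallback.

```python
import os
import re
import subprocess
from datetime import datetime
from pathlib import Path


def slugify(title):
    """Lower-case the title and collapse every non-alphanumeric run to a dash."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def create_post(title, posts_dir="_posts", now=None):
    """Write a new post file containing only default front matter; return its path."""
    now = now or datetime.now()
    path = Path(posts_dir) / f"{now:%Y-%m-%d}-{slugify(title)}.markdown"
    quoted = "'" + title.replace("'", "''") + "'"  # YAML single-quoted scalar
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        "---\n"
        f"title: {quoted}\n"
        f"date: {now:%Y-%m-%d %H:%M:%S}\n"
        "published: false\n"
        "expires: never\n"
        "layout: post\n"
        "---\n\n",
        encoding="utf-8",
    )
    return path


def commit_and_edit(path):
    """Record the new file in Git, then open it in the preferred editor."""
    subprocess.run(["git", "add", str(path)], check=True)
    subprocess.run(["git", "commit", "-m", f"New post: {path.name}"], check=True)
    subprocess.run([os.environ.get("EDITOR", "vi"), str(path)], check=True)
```

Calling commit_and_edit(create_post("Some title")) from the top of the site's repository would do the whole thing in one go. Keeping file creation separate from the Git and editor calls makes the first half easy to try out without touching a repository.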


Most comments on articles I write on my blog I receive by e-mail. Even so, I had about a thousand comments on the WordPress blog. Here again, without a database, there is no way to have a commenting system. Initially I thought I'd forgo comments altogether, but I then decided to try Disqus and see how well that works. Disqus has an API, so I was able to import all comments. I'll see how well their anti-spam works before deciding whether or not I'll keep commenting alive.

Final thoughts

I'm very pleased with what Jekyll has to offer, and the way it works. It could be faster (the full site generation currently takes almost 3 minutes), but I can live with that. In spite of the static nature of the site, I can still add dynamic content by writing HTML/JavaScript in the Markdown files and have the Web browser do the dynamic bit. (I have an example of that here.)

All in all, I'm glad to be blogging like a hacker.

WordPress, Jekyll, and Site :: 14 Jun 2011 :: e-mail

