CouchDB is a document-oriented database, a non-relational datastore
that belongs to the group of NoSQL databases. (More on NoSQL
here.) Unlike a relational database (RDBMS), in which data is
held in tables, rows, and columns, CouchDB stores a collection of
JSON documents in a modified B-Tree file structure, each of which is
indexed by a single unique key. (The key is often a UUID, but it need
not be.) A CouchDB document can have any number of files attached to
it, each stored with its MIME type, and every document carries a
revision number which CouchDB uses to support multiversion concurrency control
(MVCC).
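To make the revision idea concrete, here is a toy in-memory model in Python. It is only a sketch of the principle (an update must name the revision it is based on, otherwise it is refused), not CouchDB's implementation. The class names, the `name` and `city` fields, and the way the digest is computed are all assumptions made for illustration.

```python
import hashlib
import json


class ConflictError(Exception):
    """Raised when an update does not name the document's current revision."""


class ToyStore:
    """In-memory stand-in for a document store with revision checks."""

    def __init__(self):
        self._docs = {}  # key -> (revision, body)

    @staticmethod
    def _next_rev(old_rev, body):
        # Revision = generation counter + content digest. The digest recipe
        # is an assumption for this sketch, not CouchDB's own algorithm.
        generation = int(old_rev.split("-")[0]) + 1 if old_rev else 1
        raw = json.dumps(body, sort_keys=True).encode("utf-8")
        return f"{generation}-{hashlib.sha256(raw).hexdigest()[:32]}"

    def put(self, key, body, rev=None):
        """Store body under key; rev must match the current revision."""
        current = self._docs.get(key)
        current_rev = current[0] if current else None
        if rev != current_rev:
            raise ConflictError(f"expected rev {current_rev!r}, got {rev!r}")
        new_rev = self._next_rev(current_rev, body)
        self._docs[key] = (new_rev, body)
        return new_rev

    def get(self, key):
        """Return (revision, body) for key."""
        return self._docs[key]


if __name__ == "__main__":
    store = ToyStore()
    first = store.put("1", {"name": "Jane"})
    second = store.put("1", {"name": "Jane", "city": "Paris"}, rev=first)
    print(first, second)
    try:
        store.put("1", {"name": "Janet"}, rev=first)  # based on a stale revision
    except ConflictError as err:
        print("conflict:", err)
```

The point of the design is that nobody takes a lock: a writer who worked from an outdated revision simply gets a conflict and has to fetch the current document and try again.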
CouchDB is accessed via HTTP only; it uses no
application-specific binary protocol, which makes it easy to
put behind load-balancers and proxies. As a result, one of the handiest tools
for quickly accessing CouchDB is curl. For example, to
create a database in CouchDB and add the document shown above, I do the
following:
H=http://127.0.0.1:5984
curl -X PUT $H/demo
{"ok":true}
curl -X PUT $H/demo/1 -d "`cat /tmp/jane.json`"
{"ok":true,"id":"1","rev":"1-2867008f6fa2e8e834745efc4da65b01"}
The first curl invocation creates a database called demo, and the
second creates a new document with the key 1, reading the document's content
from the specified JSON file. CouchDB responds with a JSON structure
indicating errors or success. CouchDB’s Futon utility, also accessible from a
Web browser, shows the result and lets me modify the document, attach a file,
etc. (CouchDB’s tagline, relax, is neatly carried on in the name
futon.)
According to its author, CouchDB is largely
inspired by the Lotus Notes back-end (i.e. Lotus Domino), and I see
a number of parallels:
- CouchDB detects and flags document conflicts when a document is updated on more than one server. However, contrary to Notes, it cannot merge individual portions of a document – that is an exercise left to the application.
- Both CouchDB and Lotus Notes are well suited to offline use with subsequent replication when online.
- I can set up CouchDB on a number of nodes (systems) and have a single master replicate to any number of slave nodes (on demand or continuously), and I can have an application write to any number of those nodes, effectively creating a multi-master system, just as with Lotus Notes. (The Lotus Notes client’s automatic failover to a different cluster node I would implement with a proxy, just as I’d have to do with a Notes Web application.) As with Lotus Domino, a cluster of CouchDB nodes need not be implemented on identical hardware or operating system – a great plus.
- CouchDB integrates a view model for aggregating and reporting on documents in a database. Views are built dynamically, and they are defined within special design documents. This is quite similar to the way I view Notes views.
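Two of the points above, replication and views, come down to ordinary HTTP requests with JSON bodies. The Python sketch below only builds those requests so that they can be inspected without a running server. The target host `backup.example.com`, the design document name `people`, the view name `by_name`, and the `name` field are made up for illustration; `/_replicate` and `_design/` documents are part of CouchDB's HTTP API.

```python
import json
import urllib.request

BASE = "http://127.0.0.1:5984"  # same local server as in the curl session


def json_request(method, path, payload):
    """Build (but do not send) a JSON request against the CouchDB server."""
    return urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def replicate(source, target, continuous=False):
    """POST /_replicate: ask the server to copy changes from source to target."""
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True
    return json_request("POST", "/_replicate", body)


def design_doc(db, name, view, map_source):
    """PUT /<db>/_design/<name>: store a view whose map function is JavaScript."""
    body = {"language": "javascript", "views": {view: {"map": map_source}}}
    return json_request("PUT", f"/{db}/_design/{name}", body)


# Hypothetical view: index documents by a "name" field, if they have one.
BY_NAME = "function (doc) { if (doc.name) { emit(doc.name, null); } }"

if __name__ == "__main__":
    pending = [
        replicate("demo", "http://backup.example.com:5984/demo", continuous=True),
        design_doc("demo", "people", "by_name", BY_NAME),
    ]
    for req in pending:
        print(req.get_method(), req.full_url)
        # urllib.request.urlopen(req) would send it to a running server.
```

Once such a design document is stored, the view would be read with a GET on `/demo/_design/people/_view/by_name`. I am assuming here that the target database already exists on the other node.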
There are, however, a number of things that Lotus Notes includes out of the box, which CouchDB doesn’t:
- Notes has a very high and very granular level of security. (Encrypted connections, encrypted databases, reader and author fields, etc.)
- Due to its “all-in-a-box” model, Lotus Notes integrates e-mail, calendaring and directory services, which CouchDB doesn’t do, of course, as it is a database layer only.
Whether these are required in Web applications is debatable.
CouchDB uses JavaScript as its internal language, although others are possible, whereas
Lotus Notes uses @Formula, LotusScript, Java, and the C API. Where CouchDB
certainly excels compared to Lotus Notes/Domino is in storing a large number
of documents. I remember creating a Notes application some years ago (on 7.x),
with a database containing a couple of hundred thousand small documents (about
200 – 400 bytes per document) and two or three views. Whenever that database
was opened on a Notes client, the whole system (client and server) crawled.
Now, this is a bit like comparing apples and pine trees, but a CouchDB
database with ten times the number of documents in the same application
doesn’t make the server sweat.
CouchDB is supported by a large number of
programming languages (including LotusScript). For PHP developers
out there, IBM has a nice document called CouchDB basics for PHP
developers on offer.
If I work with Lotus Notes, is it useful to
know (or even to migrate certain applications to) CouchDB? The answer is
yes. There are some resemblances, which make my life easy in a transition
phase. As an Open Source product, CouchDB has no licensing fees (and support
is available from CouchIO – the makers of CouchDB). I get to use
whichever programming language I desire, and I’m not restricted by the “IDE”
that the Domino Designer is. Complete applications called CouchApps can
be stored in and served by CouchDB.
Damien Katz, the man who rewrote the
Lotus Notes @Formula language, is the author of CouchDB. He gives an
interview which is well worth listening to if you have the time. I
recommend reading CouchDB: The Definitive Guide: I read it cover to
cover and liked it, although you need a bit of imagination here and there. A
good resource is also Matt Woodward’s Massive CouchDB Brain Dump – a
large collection of bullet points.