It is very difficult to work with UNIX/Linux-based systems without having heard of the ubiquitous dbm family of routines, and most people know there are a number of them, including the newer NDBM, GNU’s GDBM, SDBM, and the Berkeley DB abstraction functions, which also provide DBM compatibility.
There is a new kid on the block: Tokyo Cabinet. It has been designed to improve space efficiency (smaller databases), to provide faster processing and higher parallelism, and to support 64-bit architectures, thereby allowing enormous databases if required. Tokyo Cabinet really does appear to be fast, but it is probably better to conduct your own benchmarks, because others have obtained different numbers.
Tokyo Cabinet provides a library of routines to manage a database: a simple data file containing records, each of which consists of an arbitrary key and value, both of which can be strings or binary data. Records in the database are organized in a hash table, a B+ tree, or a fixed-length array. A variant of the hash database called a “table database” is also possible. Here, each record is identified by a unique key, and it has a set of named columns. (No, this has nothing to do with SQL.)
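To make the table-database idea concrete, here is a minimal in-memory sketch in Python. It is not Tokyo Cabinet's API: the `table` dictionary and the `put_record()` and `search_equal()` helpers are names I made up for illustration. It only models the shape of the data, with a unique key mapping to a set of named columns.

```python
# In-memory stand-in for a table database (illustration only, not the
# Tokyo Cabinet API): primary key -> {column name: value}.
table = {}


def put_record(pkey, columns):
    """Store a record, replacing any existing record with the same key."""
    table[pkey] = dict(columns)


def search_equal(column, value):
    """Return the sorted primary keys of records whose column equals value."""
    return sorted(k for k, cols in table.items() if cols.get(column) == value)


put_record("1", {"name": "alice", "lang": "en"})
put_record("2", {"name": "bob", "lang": "de"})
put_record("3", {"name": "carol"})  # records need not share the same columns
put_record("2", {"name": "bob", "lang": "en"})  # same key: record is replaced
```

Looking up `search_equal("lang", "en")` then walks every record and compares one column, which is the kind of query a plain key/value store cannot express directly.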
An add-on to Tokyo Cabinet called Tokyo Tyrant provides a versatile network interface to a Tokyo Cabinet database. Although your typical DBM-type database is a local-only affair, there are occasions where you will want to provide multiple readers/writers to a single database (not possible with Tokyo Cabinet alone), or you’ll want to access a database on a remote machine. Instead of leaving you to design your own network protocol for doing so, Tokyo Tyrant provides a simple API to accomplish the task.
Together with the API (i.e. the library itself), Tokyo Tyrant supplies a set of utilities that surface its API to the command line. For example, I launch the ttserver program in one window, specifying the name of a database I want to manage. (This hash database is created automatically if it doesn’t exist.) ttserver listens on port 1978 on all host addresses by default; you can change this with command-line switches.
I now add two records to the database from another window:
put adds a key, possibly overwriting an existing key. localhost is the name of the host on which ttserver is listening, and the third and fourth arguments are the key and value respectively.
I can list the keys on the remote database with:
retrieve a single value identified by a key
delete individual keys, etc. and also list the content of the database (all keys and their values):
As I mentioned above, these commands surface the API on the command line; what you see here (and much more) can of course be done from within your own application.
And there is more. Let me try to connect to ttserver via HTTP, using curl (note how I’m using the default port 1978):
Hmm. Disappointed? Don’t be: I attempted to retrieve a key called
doesn’t exist. If I do
there is the value we stored above for the key
jp. Wow? Yes: Wow!
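The same HTTP lookup can be done from a program rather than from curl. What follows is a minimal sketch using only Python's standard library. It assumes ttserver answers a GET on `/<key>` with the value in the response body, and with a 404 status when the key does not exist; the helper names `key_url()` and `http_get()` are mine, not part of Tokyo Tyrant.

```python
import urllib.error
import urllib.parse
import urllib.request


def key_url(key, host="localhost", port=1978):
    """Build the URL for a record; the key is percent-encoded into the path."""
    return "http://%s:%d/%s" % (host, port, urllib.parse.quote(key, safe=""))


def http_get(key, host="localhost", port=1978, timeout=5.0):
    """Return the value stored under key as bytes, or None on a 404 reply.

    Assumption: the server maps GET /<key> to a record lookup and
    reports a missing key with HTTP status 404.
    """
    try:
        url = key_url(key, host, port)
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # key not present
        raise  # anything else is a real error
```

With the server from earlier still running, `http_get("jp")` should return the stored value as bytes, and a key that was never stored should come back as `None`.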
Back to the Tokyo Cabinet API itself. The following program (a slightly modified bit of sample code) reads through our on-disk database, and enumerates key/value pairs:
Note how, when I open the database with tchdbopen(), I use the flags HDBOREADER | HDBONOLCK. The former defines my program as a reader, and the latter tells it to open the database without locking. Tokyo Cabinet defines its locking strategy as follows: I can have many readers or one writer, but not both together. This is the reason for Tokyo Tyrant: I can circumvent the issue by using the client-server model it provides.
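The many-readers-or-one-writer rule is easy to see in isolation. The sketch below is not Tokyo Cabinet code and makes no claim about how the library implements its locks; it assumes a POSIX system and uses Python's `fcntl.flock()` on a scratch file to show the same policy with shared and exclusive locks. The helper names `try_lock()` and `demo()` are hypothetical.

```python
import fcntl
import os
import tempfile


def try_lock(path, exclusive):
    """Open path and try to take a non-blocking flock on it.

    Returns the open file object on success (closing it releases the
    lock), or None if a conflicting lock is already held.
    """
    handle = open(path, "rb")
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(handle, mode | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def demo():
    """Return five booleans: which lock attempts succeeded, in order."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        reader1 = try_lock(path, exclusive=False)  # first reader
        reader2 = try_lock(path, exclusive=False)  # second reader
        blocked_writer = try_lock(path, exclusive=True)  # readers active
        reader1.close()
        reader2.close()
        writer = try_lock(path, exclusive=True)  # no readers left
        blocked_reader = try_lock(path, exclusive=False)  # writer active
        writer.close()
        return (reader1 is not None, reader2 is not None,
                blocked_writer is not None, writer is not None,
                blocked_reader is not None)
    finally:
        os.unlink(path)
```

Both readers get in together, the writer is refused while they hold the file, and once the writer holds it a new reader is refused in turn. A server process that owns the single database handle, as ttserver does, sidesteps this by serializing access on behalf of its clients.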
When the sample program runs, I see the list of key/values contained in the database:
Tokyo Tyrant can also be built with embedded Lua. The Lua extension allows the database server to read a Lua script file, and clients can call functions defined in those scripts. User-defined functions can then access all of Lua’s offerings, in addition to using routines exported by Tokyo Tyrant to log messages, store and retrieve records, etc.
Suppose I have the following Lua script, on the same database I used above:
ttserver with that script
I’ll now use the
ext command in the client to invoke my user-defined Lua
function say(); Tokyo Tyrant automatically passes it the key/value pair we
Thanks is what my Lua function returns. Has the record been inserted?
Yes, it has, because my
say() function uses the built-in
That is Wow!
In addition to all this, Tokyo Tyrant can replicate a database onto an additional server, allowing me to easily create fault-tolerant remote databases. Check the documentation for more on this. Here again, I recommend you start with the presentation, which gives a good overview of Tokyo Cabinet and Tyrant capabilities.