Twitter is fun, and you can play with it. In spite of the multitude of Twitter clients, some of which can keep some sort of archive of what I've tweeted, I want to collect my tweets and archive them independently of the client I'm using.

I've knocked up a small Perl program which uses the Twitter API to grab my so-called time line in JSON format. It then stores the individual tweets in a small SQLite database -- nice and lightweight.

Start off by creating the database, which I've called tweets.db:

$ sqlite3 tweets.db < tweets.sql

The database table is trivial; its definition is in tweets.sql:

         tstamp DATE,
         text VARCHAR(140),
         PRIMARY KEY(id));

Now comes the small program, which retrieves my time line from Twitter, parses through the JSON format and inserts my latest tweets into the database.

    use strict;
    use JSON;
    use DBI;
    use LWP::UserAgent;
    use Date::Parse;
    my $username = 'jpmens';
    my $dbh = DBI->connect("dbi:SQLite:dbname=tweets.db",
          "", "", {AutoCommit => 0,
               PrintError => 1} )
        or die $DBI::errstr;
    my $sth = $dbh->prepare("INSERT INTO tweets (id, tstamp, text) VALUES (?, ?, ?)");
    # find last used twitter ID in table
    my @row = $dbh->selectrow_array("SELECT MAX(id) FROM tweets");
    my $lastid = (defined(@row[0])) ? @row[0] : 1;
    my $url = "${username}.json?since_id=${lastid}";
    my $ua = LWP::UserAgent->new;
    $ua->agent("TweeterArchiver/0.1 " . $ua->agent);
    my $res = $ua->get($url);
    if (! $res->is_success) {
        die "Cannot retrieve URL ($url): " . $res->status_line . "\n";
    my $timeline = from_json($res->content);    # JSON string
    foreach my $tweet (@$timeline) {
        print $tweet->{id} . " " . $tweet->{text} . "\n";
            $tweet->{text}) or die $DBI::errstr;
    undef $sth;
    undef $dbh;

Each time the program runs, it determines the last tweet identifier I received, and asks Twitter to return tweets that were submitted since then. (I'm being nice to Twitter in doing so.)

Tweets are stored in the database table, and I can do something with those sometime later, for example:

SELECT  time(tstamp, 'unixepoch'), text 
        FROM tweets
        WHERE date(tstamp, 'unixepoch') = date('now');
08:50:37|@nmeth Ooohh, lucky you! Our postman hasn't rung twice yet...
    09:07:48|une pensée "philosophique"...
    13:07:09|The Missus doesn't want an iPhone. Grounds for a divorce?
    13:45:50|@vowe Cold drink nowhere to be seen. On your tummy behind the c't ? ; -)
    18:16:57|On my blog: New look, old feel
    18:53:54|Couscous. Good.

Remember, that you're not allowed to invoke this program terribly often: Twitter rate limits access to its servers. I run it once an hour from cron.

Flattr this
Database, CLI, twitter, sqlite, and Perl :: 29 Aug 2009 :: e-mail


blog comments powered by Disqus