I’m pondering a network of miniature nodes which have to “do something” upon request from a central master server. (Yes, I suppose this is a kind of botnet, but it is docile and won’t harm anybody – on the contrary.) The background: during a very pleasant dinner with Stéphane Bortzmeyer a while ago, we loosely discussed the creation of a network of looking glass systems. In Stéphane’s words:
A “looking glass”, among Internet engineers, typically refers to a server on one network which serves information seen from this network (two points of the Internet may see different things, that’s why looking glasses are important).
I’ve been toying with ideas, on and off, since then. This network of nodes – of which there may be dozens, hundreds, or even thousands – would be implemented on small low-cost Linux PCs, for example a Raspberry Pi.
Ideally, I could use some sort of lightweight message queue like 0mq (ZeroMQ) or something comparable to Redis’ PUB/SUB, but I’ve yet to find a lightweight queue-like system which offers a) secure transport (e.g. TLS) and b) strong authentication. As this network of nodes would be deployed over the Internet, both prerequisites are a MUST. (Stunnel & co might be possibilities, but that adds a moving part – something I wish to avoid: fewer moving parts mean less maintenance and fewer problems.)
I’ve been looking at Comet, an umbrella term for techniques which allow a client to take advantage of a persistent HTTP connection, either kept open until data is available (long polling) or kept open indefinitely while data is pushed down the pipe to the client (streaming); both are a.k.a. HTTP server push. There are some rather useful Open Source tools for that: Meteor and NGINX’ HttpPushStreamModule are the two I’m evaluating.
- Meteor is a Perl program. It doesn’t support HTTP over TLS, but can easily be placed behind an Apache or NGINX reverse proxy. It consumes two TCP ports: one is for HTTP subscribers, and the other is for firing commands at the server (e.g. ADDMESSAGE, LISTCHANNELS or SHOWSTATS) via telnet or equivalent.
- HttpPushStreamModule (by Wandenberg; there is at least one other similar module) is compiled into NGINX and as such it ought to be a bit faster. Because it becomes an integral part of the Web server, it can be instructed to use TLS out of the box. It differentiates publishers and subscribers via configurable URIs; a minimal configuration sketch follows this list.
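To give a feel for how those URIs are wired up, here is a minimal sketch of such an NGINX configuration with one publisher and one subscriber location. The host and certificate file names, and the channel addressing via $arg_id and the URI path, are my assumptions for the example, not gospel:

    # inside the http { } block
    push_stream_shared_memory_size 32M;

    server {
        listen 443 ssl;
        ssl_certificate     master.crt;   # server certificate (invented name)
        ssl_certificate_key master.key;

        # publisher: the master POSTs commands here, channel chosen via ?id=
        location /pub {
            push_stream_publisher;
            push_stream_channels_path $arg_id;
            push_stream_store_messages on;   # keep messages for late joiners
        }

        # subscriber: nodes keep a long-lived connection open here
        location ~ /sub/(.*) {
            push_stream_subscriber;
            push_stream_channels_path $1;
        }
    }

Client-certificate verification would slot into the same server block (ssl_client_certificate and ssl_verify_client), and injecting a command is then no more than a curl -d '{"cmd":"dnslookup","args":"jpmens.net"}' against the /pub location.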
Both support so-called “channels”, which I could use to send commands to groups of nodes, and both servers allow a subscriber to subscribe to multiple channels over a single HTTP connection. Both servers can store a few messages (in volatile storage), allowing clients that have gone deaf for a bit to catch up. The message format for clients is configurable in both; as an example for HttpPushStreamModule I’ve configured it to send out JSON text. (The tilde-delimited tokens are almost identical in both servers.)
    push_stream_message_template
        "{ \"id\":~id~,\"channel\":\"~channel~\",\"time\":\"~time~\",\"payload\":~text~ }";
which results in clients getting something like this:
    {
      "id" : 9372,
      "channel" : "v01",
      "time" : "Sun, 22 Jul 2012 23:30:24 GMT",
      "payload" : {
        "cmd" : "dnslookup",
        "args" : "jpmens.net"
      }
    }
The basic tool-set required on the nodes is lightweight enough (an HTTP client linked with libCurl, for example), HTTP over TLS would certainly be secure enough transport-wise, and authentication could be done with client certificates which can, if required, be revoked via a CRL. (I’m also considering Kerberos as a possibility for authentication. Disadvantage: moving parts.)
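To illustrate how slim that client side can be, here is a sketch in C of such a node subscriber: a single libCurl handle with a client certificate, kept open on a streaming channel. The URI, channel name and file names (master.example.net, node.crt, etc.) are placeholders I’ve invented for the example:

    #include <stdio.h>
    #include <curl/curl.h>

    /* Called by libCurl for every chunk the server pushes down the
     * long-lived connection; a real node would parse the JSON command here. */
    static size_t on_push(char *data, size_t size, size_t nmemb, void *userp)
    {
        (void)userp;
        fwrite(data, size, nmemb, stdout);
        fflush(stdout);
        return size * nmemb;
    }

    int main(void)
    {
        CURL *curl;

        curl_global_init(CURL_GLOBAL_DEFAULT);
        if ((curl = curl_easy_init()) == NULL)
            return 1;

        curl_easy_setopt(curl, CURLOPT_URL, "https://master.example.net/sub/v01");
        curl_easy_setopt(curl, CURLOPT_SSLCERT, "node.crt");      /* client certificate */
        curl_easy_setopt(curl, CURLOPT_SSLKEY,  "node.key");      /* its private key */
        curl_easy_setopt(curl, CURLOPT_CAINFO,  "master-ca.crt"); /* CA we trust */
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, on_push);

        /* blocks for as long as the server keeps the stream open */
        curl_easy_perform(curl);

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
    }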
The lab environment with both these servers and corresponding TLS-capable libCurl clients is ready, and the clients are busy slurping JSON requests which are being pumped through both servers – at an impressive rate, I may add.
Comet or HTTP push caters for instructing nodes on what they are to do. I’m thinking a node will asynchronously “do” whatever it’s supposed to do and then PUT or POST results back to a designated master server, also via HTTP over TLS.
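That return path could be as simple as this sketch (same invented host and certificate file names as above): the node serializes its result as JSON and POSTs it to a collection URI on the master:

    #include <curl/curl.h>

    /* POST a JSON result document back to the master; returns 0 on success. */
    int report_result(const char *json)
    {
        CURL *curl;
        struct curl_slist *hdrs = NULL;
        CURLcode rc;

        if ((curl = curl_easy_init()) == NULL)
            return -1;

        hdrs = curl_slist_append(hdrs, "Content-Type: application/json");

        curl_easy_setopt(curl, CURLOPT_URL, "https://master.example.net/results");
        curl_easy_setopt(curl, CURLOPT_SSLCERT, "node.crt");
        curl_easy_setopt(curl, CURLOPT_SSLKEY,  "node.key");
        curl_easy_setopt(curl, CURLOPT_CAINFO,  "master-ca.crt");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, json);   /* implies an HTTP POST */

        rc = curl_easy_perform(curl);

        curl_slist_free_all(hdrs);
        curl_easy_cleanup(curl);
        return (rc == CURLE_OK) ? 0 : -1;
    }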
Clarification: I want to try and clarify something, prompted by some of the comments below. As far as I can judge, most of the current systems (e.g. RIPE Atlas, of which I host a probe myself, or Project Bismark) monitor a fixed set of things (network latency, specific DNS requests, etc.) which is difficult to modify. My thinking is that the above concept could be driven by a group of people who would approve specific probes or tests for groups of operators to run on the nodes, in order to answer questions such as “is a particular HTTP server reachable from a group of countries?” or “how does DNSSEC validation for domain example.com behave on continent xyz?”.