There may be times when you want particular RSS feeds to not be loaded in your feed reader. Reasons may include:
- You don’t want to be distracted by news of a particular site during working hours.
- URLs of particular feeds shouldn’t show up in a customer’s logs.
- You are obliged to use a caching proxy, and the feed can’t be loaded through that proxy.
Whichever reason applies, the solution is quite simple to implement and allows great flexibility. (Most feed readers can be configured to load feeds at a particular interval, or to ignore a feed, but this involves too much fiddling and usually cannot be controlled centrally.)
You’ll need a Web server on your local machine (e.g. Apache) and a dynamic language (e.g. PHP). I’ve tested this on Mac OS X, which has both. Ensure your local Web server is running and copy the following PHP script to a suitable location, making a note of the URL of the script (e.g. `http://localhost/~user/get.php`). It is probably a good idea to protect access to the script with a `.htaccess` file, to ensure that only your local machine can access it.
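As a complement to the `.htaccess` file, the script itself could refuse requests that don't come from the local machine. The following is only a sketch: the helper name `is_local_request` is made up for this example, and it assumes your feed reader reaches the script via the loopback address (`127.0.0.1` or `::1`).

```php
<?php
// Hypothetical helper (not part of the original script):
// returns true only when the client address is a loopback address.
function is_local_request($remoteAddr)
{
    return in_array($remoteAddr, array('127.0.0.1', '::1'), true);
}

// REMOTE_ADDR is only set when running under a Web server,
// so this guard does nothing when the script is run from the command line.
if (isset($_SERVER['REMOTE_ADDR']) && !is_local_request($_SERVER['REMOTE_ADDR'])) {
    header("HTTP/1.0 403 Forbidden");
    exit;
}
```

Placed at the top of `get.php`, this would answer any non-local client with a 403 before the rest of the script runs.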
Here is the short script. The long part is a beautiful cURL function which I copied from Nadeau Software (the URL is in the function’s comment) because I was too lazy to write my own. :-)
<?php
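# Check the parameter before reading it, so that a missing
# "url" does not trigger an undefined-index notice.
if (!isset($_GET['url']) || $_GET['url'] === '') {
    die("URL missing");
}
$url = $_GET['url'];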
# Determine if it is OK to load this feed.
# See text.
$loadfeed = false;
if ($loadfeed == false) {
    header("HTTP/1.0 404 Not Found");
    print "Not just now, Josephine.";
    exit;
}
$result = get_web_page($url);
if ( $result['errno'] != 0 ) {
    header("HTTP/1.0 404 Not Found");
    exit;
}
if ( $result['http_code'] != 200 ) {
    header("HTTP/1.0 " . $result['http_code'] . " Problem");
    exit;
}
$page = $result['content'];
header("Content-type: " . $result['content_type']);
print $page;
exit;
/**
 * Get a web file (HTML, XHTML, XML, image, etc.) from a URL. Return an
 * array containing the HTTP server response header fields and content.
 * From: http://nadeausoftware.com/articles/2007/06/php_tip_how_get_web_page_using_curl
 */
function get_web_page( $url )
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,     // return web page
        CURLOPT_HEADER         => false,    // don't return headers
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle all encodings
        CURLOPT_USERAGENT      => "tis-me", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
    );

    $ch = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err     = curl_errno( $ch );
    $errmsg  = curl_error( $ch );
    $header  = curl_getinfo( $ch );
    curl_close( $ch );

    $header['errno']   = $err;
    $header['errmsg']  = $errmsg;
    $header['content'] = $content;
    return $header;
}
?>
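Because the script fetches whatever URL it is handed, one possible hardening step is to accept only `http` and `https` URLs before calling `get_web_page()`. This is a minimal sketch rather than part of the original script; the helper name `is_http_url` is an assumption made for illustration.

```php
<?php
// Hypothetical helper: accept only absolute http:// or https:// URLs.
function is_http_url($url)
{
    $parts = parse_url($url);
    // parse_url() returns false for seriously malformed URLs;
    // relative or scheme-less values lack 'scheme' and/or 'host'.
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }
    return in_array(strtolower($parts['scheme']), array('http', 'https'), true);
}
```

In `get.php` this could be called right after `$url` is read, answering with an error status and exiting when it returns false.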
The program sets `$loadfeed` to true or false to indicate whether or not the feed passed in as a GET parameter should be loaded. This is where you insert your own logic, specifying whether or not a feed should be loaded. Some things you may wish to check for are:
- The feed (i.e. `$url`) contains a specific value or matches a regular expression.
- The feed name is contained in a configuration file. This allows you to quickly alter the list of feeds.
- The feed shouldn’t be loaded during particular times.
- Your local machine has obtained a particular IP address. Check this with `$_SERVER['SERVER_ADDR']`.
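Two of the checks above (a regular expression match and a time window) might be combined roughly as follows. This is a sketch under assumptions: the function name `should_load_feed`, the example patterns and the 9-to-17 "quiet hours" are all invented for illustration.

```php
<?php
// Hypothetical helper: decide whether a feed may be loaded right now.
//   $blockedPatterns  PCRE patterns for feeds that are never loaded
//   $quietPatterns    PCRE patterns for feeds skipped during quiet hours
//   $hour             current hour (0-23)
//   $quietFrom/$quietTo  quiet hours run from $quietFrom up to, but not including, $quietTo
function should_load_feed($url, array $blockedPatterns, array $quietPatterns,
                          $hour, $quietFrom = 9, $quietTo = 17)
{
    foreach ($blockedPatterns as $pattern) {
        if (preg_match($pattern, $url) === 1) {
            return false;   // always blocked
        }
    }
    if ($hour >= $quietFrom && $hour < $quietTo) {
        foreach ($quietPatterns as $pattern) {
            if (preg_match($pattern, $url) === 1) {
                return false;   // blocked during quiet hours only
            }
        }
    }
    return true;
}

// Example values; in get.php these would come from your own configuration.
$blocked = array('#^https?://intranet\.example\.com/#');
$quiet   = array('#^https?://example\.org/#');

$loadfeed = should_load_feed('http://example.org/feed.rss', $blocked, $quiet, (int) date('G'));
```

To keep the patterns in a configuration file instead, the two arrays could be read with PHP’s `file()` function (one pattern per line), so the list can be changed without editing the script.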
Assume the feed you want to “filter” is `http://example.org/feed.rss`. This is the URL you’d normally enter into your feed reader so that it downloads the feed. Instead, configure your feed reader to load the above program from your local Web server, giving it the feed URL as a parameter:
http://localhost/~user/get.php?url=http://example.org/feed.rss
Your feed reader will thus invoke the `get.php` program, giving it the parameter `url` containing the name of the target feed (`http://example.org/feed.rss` in this example).
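The plain form shown above works for simple feed URLs. If the feed URL has its own query string (with `?` or `&`), it needs to be URL-encoded so that it arrives intact in the `url` parameter; PHP decodes `$_GET` values automatically. A small sketch of building such an address, where the helper name `proxied_feed_url` is made up for this example:

```php
<?php
// Hypothetical helper: build the address to enter into the feed reader.
// rawurlencode() escapes characters such as ':', '/', '?' and '&' in the
// feed URL so that they are not mistaken for part of get.php's own query string.
function proxied_feed_url($scriptUrl, $feedUrl)
{
    return $scriptUrl . '?url=' . rawurlencode($feedUrl);
}

print proxied_feed_url('http://localhost/~user/get.php', 'http://example.org/feed.rss');
// prints http://localhost/~user/get.php?url=http%3A%2F%2Fexample.org%2Ffeed.rss
```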
If `$loadfeed` evaluates to false, your feed reader is given an HTTP 404 code (not found), and it will typically ignore that until it next retries the feed. If, on the other hand, `$loadfeed` is true, the program will attempt to retrieve the URL and will return the feed to your reader.