We make heavy use of a Web caching or proxy server for outgoing HTTP
requests, and squid is our preferred tool. I implemented squid a
number of years ago, and its
squid.conf configuration file has historically
(or histerically) grown out of reasonable proportions with all manner of
access control lists (ACLs) strewn throughout. (If you need an overview of
squid's ACL language, it is documented here.)
We make heavy use of user authentication with LDAP, but for certain URLs or domains, we offer un-authenticated access. We had dozens of ACLs such as these:
acl Custom1 url_regex -i ^http://www\.example\.org/
http_access allow Custom1
acl MS_something url_regex -i ^http://www\.microsoft\.com/
http_access allow MS_something
acl goog_whatvr dstdomain google.com
http_access allow CONNECT goog_whatvr
...
You get the idea. It is hideous.
I forget whether this feature didn't exist at the time, or whether I'd simply overlooked it, but squid can (obviously -- it does almost anything you need it to do) read lists of "things" from files on the file system, using the following syntax:
acl acl_name acl_type "file"
acl is a keyword, acl_name is the name I give this particular
access control list, and acl_type is one of the allowed types of ACL, such as
dst for destination addresses,
dstdomain for DNS domains,
url_regex for matching URLs against regular expressions, etc. (See them all here.) The
last argument, file, must be in quotes and specifies the filename from which
squid reads the values.
So what I then did was painstakingly create three new files containing our ACLs for destination domains, addresses and regular expressions, making sure I retained the comments indicating why each particular ACL was entered, when, and by whom.
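To make the idea of a commented value file concrete, here is a minimal sketch. Everything in it is hypothetical: the RULES_DIR variable, the entries, and the names and dates in the comments are made up for illustration. It also assumes that squid skips lines beginning with # in these files; as far as I know it does, but squid -k parse is the way to confirm that on your version.

```shell
#!/bin/sh
# Sketch: a value file with comment lines kept next to the entries.
# RULES_DIR and every entry below are hypothetical; the real files
# described in this post live in /etc/squid.
RULES_DIR="${RULES_DIR:-$(mktemp -d)}"

# One item per line; each comment sits on its own line above the entry.
cat > "$RULES_DIR/dstdomain.rules" <<'EOF'
# 2009-03-02 (alice): vendor update servers
.example.com
# 2009-04-17 (bob): partner portal, needs HTTPS as well
portal.example.net
EOF

# Show only the effective entries, without comments or blank lines.
grep -v -e '^#' -e '^$' "$RULES_DIR/dstdomain.rules"
```

The grep at the end is just a convenient way to eyeball what squid will actually be given, without the bookkeeping comments.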
These value files, as I call them, I then reference in
squid.conf. The next
few lines of configuration replace an endless list of ACLs:
acl f_urlregex url_regex -i "/etc/squid/urlregex.rules"
http_access allow f_urlregex
acl f_dstaddress dst "/etc/squid/dstaddress.rules"
http_access allow f_dstaddress
## domain names:
# names in this list are allowed for HTTP *and* HTTPS
acl f_dstdomain dstdomain "/etc/squid/dstdomain.rules"
http_access allow f_dstdomain
http_access allow CONNECT f_dstdomain
The result is a much cleaner
squid.conf, which we don't have to touch.
Alterations (additions, usually) to the rules go into the rule files, and
we do this semi-mechanically using make.
The value files contain lists of things: IP address/netmask ranges,
destination domains, and URL regular expressions. For example, the file
dstdomain.rules looks like this:
jpmens.org
.microsoft.com
www.example.net
...
There are two things that you must keep in mind, if you organize your ACLs thusly:
- Make liberal use of squid -k parse to have squid check the contents of these files before you attempt to squid -k reconfigure (or restart) the service. This will detect dud DNS domains, illegal address ranges for dst-type ACLs, etc.
- Unfortunately (though probably just as well -- avoids fubar), squid doesn't reload these files automatically, so whenever you change one of the files, you'll still have to instruct squid to reload itself, manually.
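The two points above can be combined into a small helper. This is only a sketch, not what we actually run: the function name reload_squid_rules and the SQUID variable are my own inventions, and it assumes that squid -k parse exits with a non-zero status when it hits a problem it considers fatal. Some problems are only reported as warnings, so it is still worth reading the output.

```shell
#!/bin/sh
# Sketch: validate the configuration (and the value files it references)
# before asking the running squid to re-read it.
# SQUID is an assumption: override it if squid is not on your PATH.
SQUID="${SQUID:-squid}"

# Hypothetical helper: parse first, reconfigure only if parsing succeeded.
reload_squid_rules() {
    # Step 1: let squid parse squid.conf and the files it references.
    if ! "$SQUID" -k parse; then
        echo "squid -k parse reported errors; not reconfiguring" >&2
        return 1
    fi
    # Step 2: only now tell the running squid to re-read its configuration.
    "$SQUID" -k reconfigure
}
```

After editing one of the rule files, calling reload_squid_rules (as a user allowed to signal squid) leaves the running proxy untouched if the parse step fails.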
I reloaded the proxies early this morning, and almost everything went well. (I say "almost", because Michael was quick to report two sites that didn't work without authentication. I fixed that quickly enough, and all seems well.)
The copying, pasting and massaging of our ACLs into this structure was boring, but I'm pleased I did it. It'll make life easier, long-term. :-)