Friday, September 04, 2009

squid reverse proxy (aka website accelerator) on ubuntu hardy

We had a bunch of machines all hitting the same URLs on www.example.com, so we put up a squid reverse proxy on ourcache.example.com: a request for http://ourcache.example.com/foo/bar gets served out of the cache, with misses fetched from http://www.example.com/foo/bar.

sudo apt-get install squid squid-cgi # squid-cgi enables the cache manager web interface
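
To sanity-check the install before touching the config, squid -v prints the installed version along with the options it was compiled with:

squid -v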

Edit /etc/squid/squid.conf. It's very well documented, and we only had to modify a few lines from the default ubuntu hardy config:

Allow other local machines to use our cache:

acl our_networks src 10.42.42.0/24
http_access allow our_networks

instead of the default of:

http_access allow localhost
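
One thing to watch: squid evaluates http_access rules top to bottom, first match wins, and the stock config ends with a blanket deny. The allow line needs to sit above that final deny, so the relevant ordering looks roughly like this (a sketch, not the complete rule set):

acl our_networks src 10.42.42.0/24
http_access allow our_networks
# ... the rest of the default rules ...
http_access deny all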

Have the cache listen on port 80 and forward all requests to www.example.com:

http_port 80 defaultsite=www.example.com
cache_peer www.example.com parent 80 0 no-query originserver
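
For reference, here are the same two lines with comments spelling out each field (the comments are ours, not part of the default config):

# listen on port 80; requests that arrive without a Host header go to the default site
http_port 80 defaultsite=www.example.com
# parent: forward cache misses to this host; 80: its HTTP port; 0: its ICP port (unused);
# no-query: don't send it ICP queries; originserver: it's the real origin, not another cache
cache_peer www.example.com parent 80 0 no-query originserver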

In our case, we wanted to cache pages that included "GET parameters" in the URL, such as http://www.example.com/search?query=foo (something you should only do in special cases, since query URLs are often dynamic or user-specific):

# enable logging of the full URL, so you can see what's going on (though it's a potential privacy risk to your users)
strip_query_terms off
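
With full URLs in the log you can watch requests flow through the cache in real time; on ubuntu the access log lives at /var/log/squid/access.log by default:

tail -f /var/log/squid/access.log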

Comment out the lines that exclude cgi-bin and GET parameter URLs from being cached:

#acl QUERY urlpath_regex cgi-bin \?
#cache deny QUERY
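
Commenting these out only makes squid willing to cache query URLs; the origin still has to send cache-friendly headers (Expires or Cache-Control: max-age), or squid will fetch a fresh copy every time. A quick way to see what your origin sends, assuming curl is installed:

curl -sI 'http://www.example.com/search?query=foo' | grep -iE 'cache-control|expires'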

Then we went to: http://localhost/cgi-bin/cachemgr.cgi to see how well our cache was working (blank login and password by default).
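
If you'd rather check from the command line, the squidclient tool (which should come along with the squid package on hardy) can query the same cache manager stats; note the -p 80, since we moved squid off its default port 3128:

squidclient -p 80 mgr:info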

After an /etc/init.d/squid restart, we found that hitting http://ourcache.example.com/foo/bar.html returned http://www.example.com/foo/bar.html, as expected.
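
You can also confirm that repeat requests are real cache hits by looking at the X-Cache header squid adds to responses; the first fetch of a cacheable page should report a MISS and the second a HIT:

curl -sI http://ourcache.example.com/foo/bar.html | grep -i x-cache
# expect something like "X-Cache: MISS from ourcache.example.com" on the
# first run and "X-Cache: HIT from ourcache.example.com" on the second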
