22 May 2011

Redirect High Traffic Referrers to an External Cache with .htaccess and Coral Cache

Last week, out of the blue, a blog post of mine from 2009 started gaining popularity on the social bookmarking site StumbleUpon. While this was a pleasant surprise, traffic to this site went from ~50 visits per day to over 5,000. Given the bandwidth caps imposed by my ISP, this extra traffic began to worry me.

My first thought was to use .htaccess to redirect the traffic to the Google cache of the webpage, but the Google-cached copy doesn't render quite right and looks poor. I considered other caching services and settled on Coral Cache, since it caches webpages in a fairly transparent manner: appending .nyud.net to a site's hostname serves the cached copy of the same URL.

The next problem was that a simple URL rewrite rule caused loops: Coral Cache would request content from my webserver, and the webserver would redirect those requests straight back to Coral Cache. Excluding Coral Cache's own requests from the rule prevented these loops.
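
A minimal sketch of how such a loop arises, assuming the simple rule was an unconditional redirect (the first attempt is not shown here, so this first rule is hypothetical), followed by the same rule with Coral Cache's own requests excluded:

# Hypothetical simple rule: every request is redirected, including the
# requests Coral Cache itself makes to fetch the page, so it loops
RewriteRule ^(.*)$ http://www.seancarney.ca.nyud.net/$1 [R=302,L]

# Same rule, skipped when the request comes from Coral Cache, identified
# by its user agent or by the coral-no-serve query string
RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx
RewriteCond %{QUERY_STRING} !(^|&)coral-no-serve$
RewriteRule ^(.*)$ http://www.seancarney.ca.nyud.net/$1 [R=302,L]

Both fragments assume RewriteEngine On appears earlier in the .htaccess file. Only one of the two would be used at a time; they are shown together for comparison.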

Below is the finished rule I added to my .htaccess file:

# Redirect high traffic referrers to Coral Cache
# (assumes RewriteEngine On is set earlier in this file)
# Leave Googlebot alone
RewriteCond %{HTTP_USER_AGENT} !^Googlebot
# Skip Coral Cache's own requests to avoid a redirect loop
RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx
RewriteCond %{QUERY_STRING} !(^|&)coral-no-serve$
# Match visitors arriving from any of these sites
RewriteCond %{HTTP_REFERER} ^http://([^/]+\.)?digg\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://([^/]+\.)?reddit\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://([^/]+\.)?stumbleupon\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://([^/]+\.)?boingboing\.net [OR]
RewriteCond %{HTTP_REFERER} ^http://([^/]+\.)?del\.icio\.us
# Send them to the Coral Cache copy of the same path with a temporary redirect
RewriteRule ^(.*)$ http://www.seancarney.ca.nyud.net/$1 [R=302,L]

With this rule in place, any request referred by one of the listed websites is redirected to the Coral Cache copy of the page, while all other requests, including those from Googlebot and from Coral Cache itself, are served normally. You can see this in action by viewing my site through StumbleUpon.
