Improving Varnish hit rates for Drupal 7

I recently moved my site to a new VPS and decided to upgrade to Drupal 7 at the same time. It was an interesting move, since a lot of the modules I was using have not yet been ported to D7. One of the main reasons I moved over was native external caching support, meaning I no longer need to depend on Pressflow. I ran into a few issues getting this to work properly with Varnish, so I wanted to cover how I got it working.

Changes to Drupal's settings.php

Just like in Pressflow, you need to make sure the site is aware it is being accessed from behind a reverse proxy:

$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('127.0.0.1');

The next option needed is specific to Drupal 7, and there's a bug report to clarify the relation between external caching and page_cache_invoke_hooks. This was a confusing and frustrating part of getting Drupal to set the Cache-Control headers properly.

$conf['page_cache_invoke_hooks'] = FALSE;

When you set page_cache_invoke_hooks to FALSE, the cache settings in the UI are taken into account when Drupal sends its cache-related headers. Without this, you will see Cache-Control: max-age=0, which basically tells Varnish not to cache anything. The bug report explains in more depth why this happens.
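If you want to check a Cache-Control value programmatically rather than by eye, something like the following Python sketch can help. It is only a rough model: max_age is a hypothetical helper name, the 21600 value is just an example, and Varnish's real TTL calculation also looks at things such as Expires and its default TTL, which are ignored here.

```python
import re

def max_age(cache_control):
    """Return the lifetime in seconds advertised by a Cache-Control value.

    s-maxage wins over max-age because it is the directive aimed at shared
    caches. Returns None when neither directive is present.
    """
    found = {}
    for directive in cache_control.split(","):
        match = re.fullmatch(
            r"\s*(s-maxage|max-age)\s*=\s*(\d+)\s*", directive, re.IGNORECASE
        )
        if match:
            found[match.group(1).lower()] = int(match.group(2))
    if "s-maxage" in found:
        return found["s-maxage"]
    return found.get("max-age")

print(max_age("public, max-age=21600"))  # 21600, cacheable for six hours
print(max_age("public, max-age=0"))      # 0, nothing worth caching
```

A result of 0 or None for an anonymous page request is a hint that the settings above have not taken effect yet.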

A basic VCL for Drupal

Cookies are usually the biggest issue when it comes to properly caching a website. To find out what cookies were being sent to my backend from Varnish, I started with no custom configuration and picked a URL to test against.

varnishlog -b -o TxURL /content/scaling-your-application-cloud

Visit the URL path, and look for a line similar to the following:

11 TxHeader     b Cookie: __utmz=148462695.1292604478.3.2.utmcsr=linkedin.com|utmccn=(referral)|utmcmd=referral|utmcct=/profile/edit; Drupal.toolbar.collapsed=0; has_js=1; __utma=148462695.668390275.1288311613.1298230598.1298310610.28; __utmc=148462695; __utmb=148462695.11

These cookies will cause bad cache hit rates because of the Vary: Cookie header, which tells a compliant reverse proxy to vary the cache based on the Cookie header. You can see how this becomes an issue, because nearly every browser and client is going to send a different Cookie header. To remove them, you need a line similar to this in the vcl_recv portion of your configuration:

set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js|Drupal.toolbar.collapsed)=[^;]*", "");

Most of this comes from the default Pressflow VCL, but a new addition in Drupal 7 is the Drupal.toolbar.collapsed cookie. Per the module documentation, this cookie appears to control toolbar visibility.
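To see what the regex actually leaves behind, here is a small Python sketch that applies the same patterns to a few Cookie headers. Treat it as an approximation: Python's re module is not the regex engine Varnish uses, the clean_cookie helper is a hypothetical name, and the cookie values (including the SESSabc123 session cookie) are made up for illustration.

```python
import re

# Same pattern as the VCL line; Python's re stands in for Varnish's regex engine.
STRIP = re.compile(r"(^|;\s*)(__[a-z]+|has_js|Drupal.toolbar.collapsed)=[^;]*")
LEADING_SEMICOLON = re.compile(r"^;\s*")

def clean_cookie(cookie):
    """Strip the known client-side cookies, tidy up, and drop an empty header."""
    if cookie is None:
        return None
    cookie = STRIP.sub("", cookie)
    cookie = LEADING_SEMICOLON.sub("", cookie)
    if re.fullmatch(r"\s*", cookie):
        return None  # corresponds to unsetting the header entirely
    return cookie

# Made-up Cookie headers from three anonymous visitors.
visitors = [
    "__utma=1.2.3; __utmc=1; has_js=1",
    "has_js=1; __utmz=4.5.6.utmcsr=example.com; Drupal.toolbar.collapsed=0",
    "__utmb=7.8; has_js=1",
]

print(len(set(visitors)))                                   # 3 distinct variants
print(len({clean_cookie(c) for c in visitors}))             # 1 variant after cleaning
print(clean_cookie("has_js=1; SESSabc123=xyz; __utmc=1"))   # SESSabc123=xyz
```

The three anonymous visitors collapse into a single cache variant once the analytics and JavaScript cookies are gone, while a header that carries some other cookie keeps only that cookie.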

My full VCL is as follows:

backend default {
  .host = "127.0.0.1";
  .port = "8080";
}

sub vcl_recv {
  remove req.http.X-Forwarded-For;
  set req.http.X-Forwarded-For = client.ip;

  // from https://wiki.fourkitchens.com/display/PF/Configure+Varnish+for+Pressflow?focusedCommentId=13828160
  // Remove has_js, Google Analytics __* and Drupal.toolbar.collapsed cookies.
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js|Drupal.toolbar.collapsed)=[^;]*", "");
  // Remove a ";" prefix, if present.
  set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
  // Remove empty cookies.
  if (req.http.Cookie ~ "^\s*$") {
    unset req.http.Cookie;
  }

  // fix compression per http://www.varnish-cache.org/trac/wiki/FAQ/Compression
  if (req.http.Accept-Encoding) {
    if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
      # No point in compressing these
      remove req.http.Accept-Encoding;
    } elsif (req.http.Accept-Encoding ~ "gzip") {
      set req.http.Accept-Encoding = "gzip";
    } elsif (req.http.Accept-Encoding ~ "deflate" && req.http.user-agent !~ "MSIE") {
      set req.http.Accept-Encoding = "deflate";
    } else {
      # unknown algorithm
      remove req.http.Accept-Encoding;
    }
  }

}

sub vcl_hash {
  if (req.http.Cookie) {
    set req.hash += req.http.Cookie;
  }
}
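The Accept-Encoding branch in vcl_recv above is worth a second look, since backends commonly send Vary: Accept-Encoding and browsers spell that header in many different ways. The Python sketch below mirrors the branch so you can see how the variants collapse. It is a model of the logic, not something Varnish runs, and normalize_accept_encoding is a hypothetical name.

```python
import re

# File types that are already compressed, as listed in the VCL.
ALREADY_COMPRESSED = re.compile(r"\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$")

def normalize_accept_encoding(url, accept_encoding, user_agent=""):
    """Return the Accept-Encoding the backend would see, or None if removed."""
    if not accept_encoding:
        return None
    if ALREADY_COMPRESSED.search(url):
        return None  # no point in compressing these
    if "gzip" in accept_encoding:
        return "gzip"
    if "deflate" in accept_encoding and "MSIE" not in user_agent:
        return "deflate"
    return None  # unknown algorithm

print(normalize_accept_encoding("/node/1", "gzip, deflate"))      # gzip
print(normalize_accept_encoding("/node/1", "gzip,deflate,sdch"))  # gzip
print(normalize_accept_encoding("/logo.png", "gzip, deflate"))    # None
```

Two differently spelled headers end up as the same value, so Varnish only needs to store one compressed variant of the page.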

Once I dropped this VCL in, I started seeing much higher hit rates, which can be viewed on my Munin stats page. Prior to implementing the VCL, I was seeing low rates (< 10%) but didn't have the time to troubleshoot until this weekend. I learned a bunch about using varnishtop and varnishlog to find clues as to why my requests weren't being cached properly. I highly recommend checking out the Varnish documentation on achieving higher hit rates.
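If you don't have graphs set up, the hit rate can be worked out by hand from the cache_hit and cache_miss counters that varnishstat reports. A minimal Python sketch, with a hypothetical hit_rate helper and illustrative counter values rather than real measurements:

```python
def hit_rate(cache_hit, cache_miss):
    """Fraction of cache lookups answered from cache (0.0 when there were none)."""
    lookups = cache_hit + cache_miss
    if lookups == 0:
        return 0.0
    return cache_hit / lookups

# Illustrative counter values only.
print(f"{hit_rate(80, 920):.1%}")   # 8.0%
print(f"{hit_rate(900, 100):.1%}")  # 90.0%
```

Keep in mind the counters are cumulative since Varnish started, so compare two readings taken some time apart if you want the rate for a recent window.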
