
Using alias_method_chain to transparently hash memcache keys

Recently I started using Dalli for memcache in Rap Genius because memcached-northscale was being a dick

This caused some problems because memcached keys can’t contain spaces or control characters (and are capped at 250 bytes), and Dalli doesn’t do anything to protect you from this. At least it doesn’t in version 1.0.4, which is the most recent version compatible with Rails 2.3.x, which Rap Genius still uses

Cache keys with spaces come up because of the way Rails generates them. For example, this block:

cache [@song, 'meta description'] do
  meta = @song.lyrics_as_text.gsub(/\n+/, ' / ').strip[0,180]
end

generates the cache key:

258/258/songs/66266-20120328010114/258/meta description

Here’s what each element represents:

  • 258 – the “cache version”. This is included in every cache key across the application so that I can easily invalidate everything by incrementing it (it appears multiple times in the cache key because of the way Rails constructs it – i.e., for no particular reason)

  • songs – the model name

  • 66266 – the id of the current song

  • 20120328010114 – the current song’s updated_at

  • meta description – the name of the specific thing we’re caching

The idea is that you encode all this information in the cache key so that you don’t have to worry about invalidating the cache when something changes. For example, if I update the song, the cache key will change and the next time Rails hits this block it will look for a non-existent cache key and regenerate the song’s meta description. This technique is known as “generational caching”
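
To make that concrete, here is a minimal sketch in plain Ruby (no Rails). Song and generational_key are hypothetical stand-ins, not Rails APIs, and the key is simplified to include the cache version only once:

```ruby
# Hypothetical stand-in for the ActiveRecord model; only the fields the key needs.
Song = Struct.new(:id, :updated_at)

CACHE_VERSION = 258 # bump this to invalidate every key at once

# Simplified key: "<version>/songs/<id>-<updated_at>/<name>"
def generational_key(song, name)
  stamp = song.updated_at.utc.strftime('%Y%m%d%H%M%S')
  "#{CACHE_VERSION}/songs/#{song.id}-#{stamp}/#{name}"
end

song = Song.new(66266, Time.utc(2012, 3, 28, 1, 1, 14))
old_key = generational_key(song, 'meta description')

# Simulate saving the song: updated_at changes, so the key changes too.
song.updated_at = Time.utc(2012, 3, 29, 12, 0, 0)
new_key = generational_key(song, 'meta description')

puts old_key
puts new_key
```

Nothing ever deletes the old entry; it just stops being asked for and eventually falls out of memcache.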

BUT ANYWAY, Dalli throws up when you feed it a cache key that contains a space. How do we handle this?

One thought is to replace spaces with chill characters (e.g., dashes) in all cache keys. But this is fraught because then “Dan Berger” and “Dan-Berger” will appear as the same cache key to Dalli, which could cause collisions

What we need is a way to map every cache key to a (for all practical purposes) unique string consisting of permissible characters. And for this we use a hash function. E.g.:

> Digest::SHA1.hexdigest("Dan Berger")
=> "d86a8cfad42d11b58c5d369b235b78345e0eaa01"
> Digest::SHA1.hexdigest("Dan-Berger")
=> "9ea3c32819f1625b926937e5a8a59a23a4d9aae5"
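
Here is a small sketch comparing the two approaches side by side. It only uses Digest::SHA1 from Ruby's standard library; the variable names are just for illustration:

```ruby
require 'digest/sha1'

names = ['Dan Berger', 'Dan-Berger']

# Replacing spaces with dashes maps both names to the same key (a collision).
dashed = names.map { |n| n.tr(' ', '-') }

# Hashing keeps them distinct, and every digest is 40 lowercase hex characters.
hashed = names.map { |n| Digest::SHA1.hexdigest(n) }

puts dashed.uniq.length # 1 distinct key left
puts hashed.uniq.length # 2 distinct keys
puts hashed.all? { |h| h =~ /\A[0-9a-f]{40}\z/ }
```

A nice side effect is that the hashed key is always 40 characters, so it also stays under memcached's key length limit no matter how long the original key is.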

Now we need some code that sits between Dalli and Rap Genius, transparently hashing every cache key. I.e., when I write

Rails.cache.write("Dan Berger", "hilarious")

Dalli will see it as

Rails.cache.write("d86a8cfad42d11b58c5d369b235b78345e0eaa01", "hilarious")

This way, when I’m programming Rap Genius I don’t have to remember “okay, I gotta hash this string before I send it to Dalli!”. Instead, Dalli behaves as if it’s using a special upgraded memcache that accepts any cache key.

SO: how do we accomplish this? Well first we need to find the method that Dalli uses to actually interact with memcache. Fortunately, Dalli uses a single method for both reading and writing so it doesn’t have to duplicate its error checking code. That method is called perform:

module Dalli
  class Client
    # Chokepoint method for instrumentation
    def perform(op, key, *args)
      key = key.to_s
      validate_key(key)
      key = key_with_namespace(key)
      begin
        server = ring.server_for_key(key)
        server.request(op, key, *args)
      rescue NetworkError => e
        Dalli.logger.debug { e.message }
        Dalli.logger.debug { "retrying request with new server" }
        retry
      end
    end
  end
end

We need to make sure the key that we feed perform is always hashed. This is fairly simple to accomplish with alias_method:

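require 'digest/sha1'

module Dalli
  class Client
    # Hash the key, then hand everything to the original perform
    def perform_with_hash(op, key, *args)
      # key.to_s because hexdigest only accepts Strings (the original perform calls to_s too)
      perform_without_hash(op, Digest::SHA1.hexdigest(key.to_s), *args)
    end
    alias_method :perform_without_hash, :perform
    alias_method :perform, :perform_with_hash
  end
end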

The idea here is to tell Ruby:

  1. Let me refer to the original perform method with the name perform_without_hash
  2. Let me refer to this new perform_with_hash method with the name perform (thus, when the outside world THINKS they’re calling perform, they’re really calling perform_with_hash)
  3. Now, let’s define perform_with_hash to pass along its arguments unchanged to perform_without_hash (i.e., the original perform), except, let’s hash the key first
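
If you want to try those three steps without a memcache server, here is a self-contained sketch. FakeClient is a hypothetical stand-in for Dalli::Client that just stores values in a Hash; it is not Dalli's real implementation:

```ruby
require 'digest/sha1'

# Hypothetical stand-in for Dalli::Client: keeps values in an in-memory Hash.
class FakeClient
  attr_reader :store

  def initialize
    @store = {}
  end

  # Single chokepoint for reads and writes, like Dalli's perform.
  def perform(op, key, *args)
    case op
    when :set then @store[key] = args.first
    when :get then @store[key]
    end
  end
end

# Reopen the class and wrap perform using the three steps above.
class FakeClient
  def perform_with_hash(op, key, *args)
    perform_without_hash(op, Digest::SHA1.hexdigest(key.to_s), *args)
  end
  alias_method :perform_without_hash, :perform
  alias_method :perform, :perform_with_hash
end

client = FakeClient.new
client.perform(:set, 'Dan Berger', 'hilarious')
value = client.perform(:get, 'Dan Berger')
stored_keys = client.store.keys

puts value
puts stored_keys.inspect
```

The caller reads and writes with "Dan Berger", but the only key the store ever sees is the 40-character digest.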

Now, if that weren’t trippy enough already, Rails provides an abstraction for this common “method wrapper” pattern:

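require 'digest/sha1'

module Dalli
  class Client
    # Hash the key, then hand everything to the original perform
    def perform_with_hash(op, key, *args)
      # key.to_s because hexdigest only accepts Strings (the original perform calls to_s too)
      perform_without_hash(op, Digest::SHA1.hexdigest(key.to_s), *args)
    end
    # Same as the two alias_method calls in the previous version
    alias_method_chain :perform, :hash
  end
end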
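
There is no magic in alias_method_chain. Here is a simplified sketch of roughly what it does, in plain Ruby. SimpleChain, simple_alias_method_chain and Greeter are hypothetical names for illustration; this is not the Rails source, and as far as I know the real thing also deals with method names ending in ?, ! or = and with method visibility:

```ruby
# Simplified sketch of the idea behind alias_method_chain (not the Rails source).
module SimpleChain
  def simple_alias_method_chain(target, feature)
    # 1. Keep the original reachable as <target>_without_<feature>
    alias_method "#{target}_without_#{feature}", target
    # 2. Point <target> at <target>_with_<feature>
    alias_method target, "#{target}_with_#{feature}"
  end
end

class Greeter
  extend SimpleChain

  def greet(name)
    "hello #{name}"
  end

  # The wrapper: call the original, then upcase the result.
  def greet_with_shout(name)
    greet_without_shout(name).upcase
  end
  simple_alias_method_chain :greet, :shout
end

result = Greeter.new.greet('dan')
puts result
```

So alias_method_chain :perform, :hash is just the two alias_method lines from before, with the method names built from the two symbols.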

And that’s it – we’re done! Now we can use any cache key we want, as if there were never a character restriction. Why doesn’t Dalli do this out of the box? I have no clue lol!

Posted March 31st, 2012