Cache or Bust

Beginner

Ahh, the Rails cache. A funny thing about caching— it’s really important.

Rails has a fancy little cache mechanism that you need to know three things about. Caching is a big topic, and this will be a short post. By no means do I want to suggest this post is comprehensive— far from it. There’s lots more to learn: browser caching, fragment caching, Russian doll caching. I hope this simple introduction gives you a broad overview of what caching is and a brief taste of how it works.

First, you need to know what passing a block in Ruby is. Passing a block means handing a method a chunk of code, a closure, that the method can run later (think of it as roughly an anonymous function). What is important is whether or not the block gets evaluated.

Why is that important? I’ll get to that soon. A block is a piece of code passed from the caller into a method. Unlike an ordinary argument, it has not been evaluated yet at the time it is passed.
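Here is a small sketch of that idea in plain Ruby. The maybe_run helper is a made-up name for illustration, not anything Rails provides. It shows that a block which is passed but never yielded to simply never runs.

```ruby
# Hypothetical helper, for illustration only: it receives a block but
# only runs it when `run` is true.
def maybe_run(run)
  return :skipped unless run

  yield # the block's code executes here, and only here
end

calls = 0

# The block is passed but never evaluated, so `calls` stays at 0.
maybe_run(false) { calls += 1 }

# Here the block is evaluated exactly once.
maybe_run(true) { calls += 1 }

puts calls # prints 1
```

Keep this picture in mind: the whole point of caching is to be in the first situation as often as possible.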

Next: What does “cache” mean? “Cache” comes from the French word cacher (“to hide”). It refers to a storage space, or a “hiding space” (which does not really apply to Rails development, so think of it more as a storage space). It exists to store data that you can put in and pull out again quickly. You want that speed because your app has gotten slow from rendering views or JSON. You can use a caching mechanism (like Redis) to store things for specified periods of time or, more smartly, until the data is no longer relevant.

Figuring out when the data — for example, a view that has been rendered showing today’s news— is no longer relevant is the main point (and a major headache) of caching.

Perhaps you remember the programming adage: there are only two things difficult in software development: 1) Cache expiration, 2) Naming things, and 3) Off-by-one problems

Well, here you are ready to learn about #1: Cache expiration.

Resource-based Expiration Caching

Rails 4 – Rails 5.1

Important: To use this resource-based expiration strategy with Redis, you should have a Redis instance that is configured for Least-Recently-Used (LRU) eviction. That means that you let Redis fill up completely. When older (no longer relevant) cache keys fill it up, Redis evicts them automatically as they become ‘least recent’.

VERY Important: You can only use this strategy if your cache’s Redis instance does not also hold data that you need to be persistent. Cached data is ephemeral: with LRU eviction, Redis may throw any of it away. Most Rails apps treat their primary database as persistent and the Redis cache as ephemeral, so this is not a problem if that is you. If you do have persistent data in Redis, I’d recommend two separate Redis instances: one for ephemeral caching, set to allkeys-lru eviction, and another for persistent data storage, set to the noeviction policy.

Cache Basics

In this post from 2012, DHH explains what the resource-based caching strategy is. If you give Rails the object itself as the key parameter, Rails will automagically generate a specific cache key. This cache key follows this format:

  1. The name of your object (like “todos”)
  2. The id of the record that will be cached (like “62”)
  3. The updated_at timestamp of the object

Let’s say I have a Todo record with id 62. Its cache_key might be “todos/62-20200608172452254609”

When the updated_at timestamp changes on the record, for example, because you changed a value, the cache_key will change. (Notice that the numbers after 62- are the year, month, day, hour, minute, and second of the change, followed by the fraction of a second.) In that example, the todos/62- part stays the same, but the updated date changes. This expires the cache for that object, because when Rails then requests the object again, it will have a different cache key from the one it had before.

We presume that the block pulls its values from the object, so its result won’t need to change until the object is next updated. We also know that when we change the object’s values, the updated_at timestamp will update. The next time your app goes to request the object, the cache key will, therefore, be different from the last time.
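To make the key format concrete, here is a rough sketch in plain Ruby. The Todo struct below is a stand-in I made up so the example runs without Rails, and its cache_key method only imitates the pre-5.2 format described above. It is not the real ActiveRecord implementation.

```ruby
# Stand-in for an ActiveRecord model so this runs without Rails.
Todo = Struct.new(:id, :updated_at) do
  # Imitation of the pre-5.2 key format: name, id, then updated_at
  # down to the microsecond.
  def cache_key
    "todos/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S%6N')}"
  end
end

todo = Todo.new(62, Time.utc(2020, 6, 8, 17, 24, 52, 254_609))
old_key = todo.cache_key
puts old_key # prints todos/62-20200608172452254609

# Simulate an update: a new updated_at means a new key, so the old
# cache entry is simply never asked for again.
todo.updated_at = Time.utc(2020, 6, 9, 9, 0, 0)
new_key = todo.cache_key
puts new_key # prints todos/62-20200609090000000000
```

Nothing ever deletes the old entry here. It just sits unused until something like LRU eviction clears it out, which is why the Redis setup above matters.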

Rails 5.2+ Resource Cache

In Rails 5.2, this changes a little bit. To keep using the old (pre-Rails 5.2) way, do nothing and your cache will continue to work as expected.

To get your Rails 5.2 app ready for Rails 6 (while still on Rails 5.2), you can globally set:

ActiveRecord::Base.cache_versioning = true

And your Rails app will behave the following way:

The cache_key no longer contains the updated_at timestamp. Under the hood, however, Rails also implements another method called cache_version, and by combining the two the Rails cache operates basically the same way as before. Prior to Rails 5.2, the cache_key contained the record name, the id, and the updated_at timestamp. With Rails 5.2+ and versioning turned on, cache_key returns only the name and id, and cache_version carries the timestamp.

In Rails 5.2, the default for cache_versioning is false, which makes Rails behave as it did before Rails 5.2, storing the updated_at timestamp as part of the cache_key.

Starting in Rails 6, however, cache_versioning is set to true, which means cache_key returns only the name and id of the record and cache_version returns the updated_at information.
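If the split between key and version feels abstract, this plain-Ruby sketch may help. Both VersionedTodo and ToyVersionedStore are names I invented for illustration, and the store is a toy Hash, not how Rails or Redis is actually implemented. The idea it demonstrates is that the key stays stable while a changed version is treated as a cache miss.

```ruby
# Stand-in record: a stable key plus a separate version.
VersionedTodo = Struct.new(:id, :updated_at) do
  def cache_key
    "todos/#{id}"
  end

  def cache_version
    updated_at.utc.strftime('%Y%m%d%H%M%S%6N')
  end
end

# Toy store: keeps the version next to the value and treats a
# version mismatch as a miss.
class ToyVersionedStore
  def initialize
    @data = {}
  end

  def fetch(record)
    entry = @data[record.cache_key]
    return entry[:value] if entry && entry[:version] == record.cache_version

    value = yield
    @data[record.cache_key] = { version: record.cache_version, value: value }
    value
  end
end

store = ToyVersionedStore.new
todo = VersionedTodo.new(62, Time.utc(2020, 6, 8, 17, 24, 52))
renders = 0

store.fetch(todo) { renders += 1; "rendered" } # miss: block runs
store.fetch(todo) { renders += 1; "rendered" } # hit: block skipped
todo.updated_at = Time.utc(2020, 6, 9)
store.fetch(todo) { renders += 1; "rendered" } # new version: block runs again

puts renders # prints 2
```

One consequence of this design is that an updated record overwrites its old entry under the same key instead of leaving a stale key behind.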

If you use the wrapper methods that Rails provides (Rails.cache), you don’t actually have to think about this much.

Specified Expiration Caching

This is where you just specify expires_in: 3.days as a parameter in your Rails.cache.write or Rails.cache.fetch.

You will get mileage out of this caching strategy if you have data that is OK to change for the user on a consistent, periodic schedule (like hourly or daily). Media sites, publishing sites, and the like all do well with this kind of caching strategy.
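As a rough illustration of time-based expiry, here is a toy cache in plain Ruby. ToyTtlCache and its injectable clock are my own inventions so the example can show expiry without waiting three days. In a real app you would pass expires_in: 3.days to Rails.cache.fetch as described above.

```ruby
# Toy time-based cache. The clock is injectable so expiry can be
# demonstrated without actually waiting.
class ToyTtlCache
  def initialize(clock: -> { Time.now })
    @clock = clock
    @data = {}
  end

  def fetch(key, expires_in:)
    entry = @data[key]
    return entry[:value] if entry && @clock.call < entry[:expires_at]

    value = yield
    @data[key] = { value: value, expires_at: @clock.call + expires_in }
    value
  end
end

THREE_DAYS = 3 * 24 * 60 * 60 # in seconds; Rails lets you write 3.days

now = Time.utc(2020, 6, 8)
cache = ToyTtlCache.new(clock: -> { now })
builds = 0

cache.fetch("front_page", expires_in: THREE_DAYS) { builds += 1 } # miss
now += THREE_DAYS - 1
cache.fetch("front_page", expires_in: THREE_DAYS) { builds += 1 } # still fresh
now += 2
cache.fetch("front_page", expires_in: THREE_DAYS) { builds += 1 } # expired

puts builds # prints 2
```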

Aggressive Expiration Caching

This is where you programmatically tell the Rails cache to re-write. To do this you need to create hooks in your application, which can create bloat. Aggressive expiration is the most complicated form of cache expiration and is beyond the scope of this beginner tutorial. Just know that, unlike the other two mechanisms (where you sit back and let the timestamps expire things), with aggressive expiration you must programmatically expire your cache when it needs to change. There’s no sitting back and letting the updated_at timestamp take care of it.

The trick here is to learn all three cache expiration strategies (resource-based, specified expiry, and aggressive) and employ whichever makes your app fastest.

Read, Write, and Fetch out of the Cache

The three pillars of the Rails.cache library are read, write, and fetch. There’s more in the API that I won’t cover today, so be sure to visit the docs to see the complete set of methods you can call to work with your cache.

Rails Cache Write

Rails.cache.write

When you write to the cache, it’s pretty straightforward. You pass your value as an ordinary argument rather than as a block. (In Ruby the argument is a reference to the object, but the important point here is that it is evaluated before you pass it, meaning it always gets evaluated.) The cache simply stores the new value under the specified key, and if the key already had a value, that value is overwritten. Either way (whether or not the key already existed), the cache reports success, for example by returning “OK” from a Redis-backed store.

Rails Cache Read

Rails.cache.read

Here you are just telling the cache to look up the key and return its value if there is one. If there is no key, the cache will return nil, which means there is no cached version.
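Here is a tiny plain-Ruby stand-in for the write and read behavior just described. ToyCache is a made-up class backed by a Hash, and its return value from write is my assumption for the sketch. The real Rails.cache talks to whatever store you configured.

```ruby
# Toy stand-in for a cache store, backed by a plain Hash.
class ToyCache
  def initialize
    @data = {}
  end

  # The value is an ordinary argument, so it has already been computed
  # by the time write is called. Returning true is an assumption here.
  def write(key, value)
    @data[key] = value
    true
  end

  # Returns the stored value, or nil when nothing is cached for the key.
  def read(key)
    @data[key]
  end
end

cache = ToyCache.new
p cache.read("greeting")         # prints nil (nothing cached yet)
cache.write("greeting", "hello")
cache.write("greeting", "hi")    # replaces the earlier value
p cache.read("greeting")         # prints "hi"
```

Notice that write has no way to skip work: whatever you hand it was already computed before the call.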

Rails Cache Fetch

Rails.cache.fetch

Here’s where things get interesting. In a way, this little method is where all the magic happens.

Keep in mind that the purpose of this entire exercise is to speed things up. The slowest part of the process is the evaluation of the block. (If it isn’t, then you are adding cache overhead when you don’t need it.)

So when you examine this diagram below you see that what we’re really doing is trying to skip evaluation of the block and only evaluate it when we truly need to, which is when the key is missing or expired.
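The same idea as a plain-Ruby sketch: ToyFetchCache is a hypothetical Hash-backed stand-in, not the Rails implementation, and it leaves out expiry to stay short. It shows the one thing that matters about fetch, which is that the block only runs on a miss.

```ruby
# Toy stand-in for fetch: return the cached value on a hit, otherwise
# evaluate the block, store its result, and return it.
class ToyFetchCache
  def initialize
    @data = {}
  end

  def fetch(key)
    return @data[key] if @data.key?(key) # hit: the block is never run

    @data[key] = yield                   # miss: run the block, keep the result
  end
end

cache = ToyFetchCache.new
slow_calls = 0

first = cache.fetch("todos/62") do
  slow_calls += 1 # pretend this is an expensive render
  "<li>Buy milk</li>"
end

second = cache.fetch("todos/62") do
  slow_calls += 1 # never reached: the key is already cached
  "<li>Buy milk</li>"
end

puts slow_calls      # prints 1
puts first == second # prints true
```

If the block were cheap, the lookup itself would be pure overhead, which is the point made above about only caching slow work.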