Fragment Caching with Rails 5.1

There are three important caching mechanisms for a Ruby on Rails developer:

  • HTTP Caching
  • Page Caching
  • Fragment Caching

I cover all of them in depth in my Ruby on Rails 5.1 book. In this post I'd like to give a quick overview of Fragment Caching, which is the most used caching method. It is often the lowest-hanging fruit. But I've seen plenty of Rails projects where it was used in a way that actually decreased performance, most of the time because of overly long and complicated cache keys. In those cases it takes more time to calculate the cache key than it would take to render the cached content.

I offer consulting and training for Ruby on Rails, Phoenix and WebPerformance. Here’s my email in case you want to hire me: sw@wintermeyer-consulting.de or DM me on Twitter (@wintermeyer)

In case you prefer a video: I’ll add a recording and the slides of my “Cache = Cash” RailsConf talk at the end of this post.

What is Fragment Caching?

A fragment cache stores ERB output so that it can be reused later. The idea is that something time-consuming only has to be done once: the first time, the result of a specific piece of ERB code is saved in a key/value store, and on every following request the value is fetched from that store.
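
The mechanics can be sketched in plain Ruby (a toy stand-in, of course, not Rails' actual cache store):

```ruby
# Toy version of the fetch pattern behind fragment caching: compute a
# value once, store it under a key, and serve the stored copy on every
# later request. A Hash stands in for the real cache store.
STORE = {}
RUNS  = Hash.new(0)

def fetch(key)
  return STORE[key] if STORE.key?(key)  # cache hit: skip the block
  STORE[key] = yield                    # cache miss: run and remember
end

def render_fragment
  RUNS[:fragment] += 1   # count how often the slow path really runs
  "expensive HTML"       # stands in for slow ERB rendering
end

first  = fetch("foobar") { render_fragment }
second = fetch("foobar") { render_fragment }
# render_fragment ran only once; the second call was served from STORE
```

Rails wraps exactly this pattern for you: the `cache` view helper uses `Rails.cache.fetch` under the hood.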

A simple example with long-running code would look like this:

<% cache 'foobar' do %>
  <% for i in 0..5 %>
    <% sleep 1 %>
    Test
  <% end %>
<% end %>

This ERB code writes a cache entry with the key foobar. The first time this page gets fetched by a browser, the console log shows this:

Started GET "/page/index" for 127.0.0.1 at 2017-05-27 19:06:16 +0200
Processing by PageController#index as HTML
Rendering page/index.html.erb within layouts/application
Rendered page/index.html.erb within layouts/application (6019.8ms)
Completed 200 OK in 6094ms (Views: 6092.7ms)

The second time, the ERB code within the cache block doesn't get run. Rails fetches the result from the cache:

Started GET "/page/index" for 127.0.0.1 at 2017-05-27 19:11:20 +0200
Processing by PageController#index as HTML
Rendering page/index.html.erb within layouts/application
Rendered page/index.html.erb within layouts/application (2.9ms)
Completed 200 OK in 25ms (Views: 23.0ms)

That is a lot faster. Let me show you the difference in a little GIF:

Screencast of the 5 second cache example.

But this example doesn't make sense in a real-world application. In Rails we mainly use fragment caching in combination with some sort of data set: we fetch data from a database and display it with ERB. That ERB code can be slow. Whenever that happens it's a good idea to look into fragment caching. Let me show you ...

The Example Setup

Please follow me on your command line. We are going to use Rails 5.1.

$ rails new shop
$ cd shop
$ rails g scaffold Category name
$ rails g scaffold Product name stock:integer category:references
$ rails db:migrate

We need some data, which we create via db/seeds.rb.

db/seeds.rb

fruit = Category.create(name: 'Fruit')
vegetable = Category.create(name: 'Vegetable')
Product.create(name: 'Banana', stock: 10, category_id: fruit.id)
Product.create(name: 'Pineapple', stock: 2, category_id: fruit.id)
Product.create(name: 'Apple', stock: 23, category_id: fruit.id)
Product.create(name: 'Potato', stock: 5, category_id: vegetable.id)
Product.create(name: 'Salad', stock: 4, category_id: vegetable.id)

To create the database entries we have to run rails db:seed:

$ rails db:seed

Activate Caching

By default caching is turned off in the development environment (it is on by default in production). To work with it in development we have to activate it by creating the file tmp/caching-dev.txt. (Rails 5.1 also ships a rails dev:cache task that toggles this file for you.)

$ touch tmp/caching-dev.txt

The Models

I assume that you are familiar with has_many and belongs_to, so here is just the code for the two models.

app/models/category.rb

class Category < ApplicationRecord
  has_many :products

  def to_s
    name
  end
end

app/models/product.rb

class Product < ApplicationRecord
  belongs_to :category, touch: true

  def to_s
    name
  end
end

Please fire up your rails server now:

$ rails server

The Product#index View

Fragment caching happens in the view. First we are going to tackle the Product#index view (http://0.0.0.0:3000/products).

Screenshot of http://0.0.0.0:3000/products

On my MacBook Pro it takes about 45ms to render this view in the development environment:

Completed 200 OK in 45ms (Views: 43.1ms | ActiveRecord: 0.4ms)

ActiveRecord is fast. The performance bottleneck is the rendering of app/views/products/index.html.erb, which takes 43.1ms. The major part of that time is spent rendering the HTML of the table. Let's add a fragment cache for it by wrapping the table in <% cache @products do %>:

app/views/products/index.html.erb

<p id="notice"><%= notice %></p>

<h1>Products</h1>

<% cache @products do %>
  <table>
    <thead>
      <tr>
        <th>Name</th>
        <th>Stock</th>
        <th>Category</th>
        <th colspan="3"></th>
      </tr>
    </thead>

    <tbody>
      <% @products.each do |product| %>
        <tr>
          <td><%= product.name %></td>
          <td><%= product.stock %></td>
          <td><%= product.category %></td>
          <td><%= link_to 'Show', product %></td>
          <td><%= link_to 'Edit', edit_product_path(product) %></td>
          <td><%= link_to 'Destroy', product, method: :delete, data: { confirm: 'Are you sure?' } %></td>
        </tr>
      <% end %>
    </tbody>
  </table>
<% end %>

<br>

<%= link_to 'New Product', new_product_path %>

The first time you visit http://0.0.0.0:3000/products in your browser, it will take Rails a bit longer than before to deliver the result. That is because the table has to be written to the cache, which is empty at the start. The second time you visit the same URL you'll see a big speed improvement. Here's the log of those two visits:

Started GET "/products" for 127.0.0.1 at 2017-05-27 10:47:45 +0200
Processing by ProductsController#index as HTML
Rendering products/index.html.erb within layouts/application
(0.2ms) SELECT COUNT(*) AS "size", MAX("products"."updated_at") AS timestamp FROM "products"
Product Load (0.4ms) SELECT "products".* FROM "products"
Category Load (0.2ms) SELECT "categories".* FROM "categories" WHERE "categories"."id" = ? LIMIT ? [["id", 1], ["LIMIT", 1]]
CACHE Category Load (0.1ms) SELECT "categories".* FROM "categories" WHERE "categories"."id" = ? LIMIT ? [["id", 1], ["LIMIT", 1]]
CACHE Category Load (0.0ms) SELECT "categories".* FROM "categories" WHERE "categories"."id" = ? LIMIT ? [["id", 1], ["LIMIT", 1]]
CACHE Category Load (0.0ms) SELECT "categories".* FROM "categories" WHERE "categories"."id" = ? LIMIT ? [["id", 1], ["LIMIT", 1]]
CACHE Category Load (0.0ms) SELECT "categories".* FROM "categories" WHERE "categories"."id" = ? LIMIT ? [["id", 1], ["LIMIT", 1]]
Rendered products/index.html.erb within layouts/application (12.9ms)
Completed 200 OK in 33ms (Views: 29.9ms | ActiveRecord: 0.9ms)

Started GET "/products" for 127.0.0.1 at 2017-05-27 10:48:08 +0200
Processing by ProductsController#index as HTML
Rendering products/index.html.erb within layouts/application
(0.2ms) SELECT COUNT(*) AS "size", MAX("products"."updated_at") AS timestamp FROM "products"
Rendered products/index.html.erb within layouts/application (3.4ms)
Completed 200 OK in 24ms (Views: 21.6ms | ActiveRecord: 0.2ms)

The time difference with and without the cache is not huge in this example because we are using a very small data set. If you had 500 products it would be much more visible and impressive.

What did Rails do? Because we used @products as the cache key, Rails uses the result of this SQL statement as the key:

SELECT COUNT(*) AS "size", MAX("products"."updated_at") AS timestamp FROM "products"

Rails does that automatically; you don't have to think about it. Obviously you have to be sure that no external program changes your data outside of ActiveRecord. The caching only works if the updated_at field is maintained properly.
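
The shape of such a key can be imitated in plain Ruby (illustrative only; the exact format Rails produces is an internal detail):

```ruby
# Illustration of how a collection cache key can be derived from the
# COUNT(*) / MAX(updated_at) query above: the key folds in the row
# count and the newest timestamp, so any INSERT, DELETE or UPDATE
# (which bumps updated_at) yields a new key and expires the old entry.
require "time"

def collection_cache_key(table, size, max_updated_at)
  timestamp = max_updated_at.utc.strftime("%Y%m%d%H%M%S")
  "#{table}/collection-#{size}-#{timestamp}"
end

key_before = collection_cache_key("products", 5, Time.utc(2017, 5, 27, 10, 47))
# touching any product changes MAX(updated_at) and therefore the key:
key_after  = collection_cache_key("products", 5, Time.utc(2017, 5, 27, 11, 29))
```

The old cache entry is never deleted explicitly; it simply stops being read and eventually gets evicted by the cache store.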

Russian Doll Caching

A couple of years ago DHH introduced Russian Doll Caching during one of his famous RailsConf keynotes. The idea is to cache not just the whole table but also each table row in its own cache. As a result, when none of the rows has changed, we get the complete table from the cache very quickly. When a row has changed we have to write the table cache again, but at least we can recycle the caches of all the rows except the one that changed.

The name Russian Doll Caching is a bit misleading, because we stack several sibling elements inside each parent element, not just a single doll inside each one.

To implement Russian Doll Caching in our example we’d have to add a cache for each table row in addition to the cache of the whole table:

<% cache @products do %>
  <table>
    <thead>
      <tr>
        <th>Name</th>
        <th>Stock</th>
        <th>Category</th>
        <th colspan="3"></th>
      </tr>
    </thead>

    <tbody>
      <% @products.each do |product| %>
        <% cache product do %>
          <tr>
            <td><%= product.name %></td>
            <td><%= product.stock %></td>
            <td><%= product.category %></td>
            <td><%= link_to 'Show', product %></td>
            <td><%= link_to 'Edit', edit_product_path(product) %></td>
            <td><%= link_to 'Destroy', product, method: :delete, data: { confirm: 'Are you sure?' } %></td>
          </tr>
        <% end %>
      <% end %>
    </tbody>
  </table>
<% end %>

To show the effect open a second terminal and update a product in the console:

$ rails console
Running via Spring preloader in process 23211
Loading development environment (Rails 5.1.1)
>> Product.first.update_attribute(:stock, 9)
Product Load (0.1ms) SELECT "products".* FROM "products" ORDER BY "products"."id" ASC LIMIT ? [["LIMIT", 1]]
(0.1ms) begin transaction
SQL (0.6ms) UPDATE "products" SET "stock" = ?, "updated_at" = ? WHERE "products"."id" = ? [["stock", 9], ["updated_at", "2017-05-27 09:24:24.017200"], ["id", 1]]
Category Load (0.2ms) SELECT "categories".* FROM "categories" WHERE "categories"."id" = ? LIMIT ? [["id", 1], ["LIMIT", 1]]
SQL (0.2ms) UPDATE "categories" SET "updated_at" = '2017-05-27 09:24:24.051525' WHERE "categories"."id" = ? [["id", 1]]
(4.5ms) commit transaction
=> true
>> exit

When you reload http://0.0.0.0:3000/products in your browser you’ll see this log output:

Started GET "/products" for 127.0.0.1 at 2017-05-27 11:29:07 +0200
Processing by ProductsController#index as HTML
Rendering products/index.html.erb within layouts/application
(0.2ms) SELECT COUNT(*) AS "size", MAX("products"."updated_at") AS timestamp FROM "products"
Product Load (0.1ms) SELECT "products".* FROM "products"
Category Load (0.1ms) SELECT "categories".* FROM "categories" WHERE "categories"."id" = ? LIMIT ? [["id", 1], ["LIMIT", 1]]

Rendered products/index.html.erb within layouts/application (18.3ms)
Completed 200 OK in 37ms (Views: 34.7ms | ActiveRecord: 0.5ms)

It shows that the cache key of @products has changed and that the table cache therefore has to be written again. To do that, the one changed row has to be re-rendered and re-cached too. Because only one row had to be rendered, it is much quicker than the initial, uncached request.

Why should I touch my data?

The product model contains the following line:

belongs_to :category, touch: true

The touch option is important because it updates the category's updated_at whenever you change one of its products. Whenever you use belongs_to in your models, make sure to touch the parent model. Then you can use the parent model as a cache key too.
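
The effect of touch can be sketched with plain Ruby objects (a toy model, not ActiveRecord):

```ruby
# Toy sketch of what `touch: true` does: saving a child record also
# bumps the parent's updated_at, so every cache keyed on the parent
# (or on a collection whose key uses MAX(updated_at)) expires as well.
require "time"

Category = Struct.new(:updated_at)
Product  = Struct.new(:category, :updated_at) do
  def save(now = Time.now)
    self.updated_at     = now   # the product's own timestamp
    category.updated_at = now   # this is what touch: true adds
  end
end

epoch  = Time.at(0)
fruit  = Category.new(epoch)
banana = Product.new(fruit, epoch)
banana.save
# fruit.updated_at moved forward together with banana.updated_at
```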

But I have personalized data!

Many Rails developers tell me that they cannot cache because they are dealing with personalized data. My answer to this problem: caching can make sense in this case too. Assume you not only display products but also prices, and those prices are different for every user. Then you can use your @current_user instance variable in the cache key:

<% cache [@current_user, @products] do %>
...
<% end %>

But most of the time you can go one step further by analyzing your data and its structure. Probably you have groups of users who all get the same price. In that case you can use the group as part of the cache key, so the same cache serves many users:

<% cache [@current_user.price_group, @products] do %>
...
<% end %>

How big a Cache-Store do I need?

That's a tough one to answer. As a general rule it is safe to say that cache is cheaper than CPU power, so go big! But don't waste resources (and money): it is healthy to monitor how much of your cache is actually used. In many installations, 20% of the entries in Memcached (the most popular cache store) serve 80% of all cache hits.

What about those too complicated cache keys?

Whenever you use fragment caching you should keep an eye on the complexity of the cache key versus the cached content. Here's a negative example:

<% cache [@current_user, @product] do %>
<%= @product.name %>
<% end %>

It takes longer to check whether there is a cached entry for that key than it would take to simply print @product.name.

Because you can create your own cache keys, there is always a chance that calculating a given key takes too long.

Always make sure that a fragment cache actually makes sense.
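
A quick way to sanity-check a suspicious fragment cache is to benchmark the key computation against the rendering it is supposed to save. The following uses pure-Ruby stand-ins; in a real app you would time the actual key and the actual partial:

```ruby
# Compare the cost of computing a cache key with the cost of the
# "work" the cache would skip. If the key dominates, the cache hurts.
require "benchmark"
require "digest"

def cache_key(user_id, product_id)
  # deliberately heavyweight key, e.g. a digest over user + product
  Digest::SHA256.hexdigest("user/#{user_id}-product/#{product_id}")
end

def render_name
  "Banana"   # printing a single attribute is essentially free
end

key_time    = Benchmark.realtime { 10_000.times { cache_key(1, 2) } }
render_time = Benchmark.realtime { 10_000.times { render_name } }
# Here key_time dominates render_time: this cache slows the page down.
```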

Do you want to dive deeper?

In this post we just scratched the surface of fragment caching. Here are some links if you want to dive deeper:

Please share and like this post in case you want me to post additional how-tos for other caching mechanisms.

My “Cache = Cash” RailsConf Talk

Here is a recording of a RailsConf talk of mine about caching. It is a couple of years old, but the basic idea of Fragment Caching is still the same.

In this talk I show how a typical online shop can be optimized.

And here are the slides of the talk:
