Partial rendering performance in Rails
Some numbers about rendering times with different techniques
I often see that people underestimate how much the way views, and especially partials, are rendered matters for performance in Rails. Below I compare the different approaches with actual numbers, which are too often missing from blog posts on this topic.
Avoid N+1 Queries
That’s the first, very important topic: N+1 queries. Avoid them! Don’t let them slip into your code: they will inevitably make it slower, and every other performance optimisation is wasted as long as an N+1 query remains.
Let’s take a very simple example:
<% @users.each do |user| %>
<div class="post">
<%= user.post.title %>
</div>
<% end %>
We are iterating over a list of users, each of which has a post. Easy.
Now, I will run my tests with 1,000 users and assign the users with
assign(:users, User.all)
in my test. Result?
Warming up --------------------------------------
with N+1 1.000 i/100ms
Calculating -------------------------------------
with N+1 0.291 (± 0.0%) i/s - 2.000 in 7.158333s
We are able to render only 0.291 of those views per second, which means a single render takes roughly 3.4 seconds. That’s pretty bad, so the first thing to do is to fix the N+1 query generated by our view: on each iteration it performs a SELECT on the DB to retrieve the user’s post.
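To make the "1 + N" shape of the problem concrete, here is a minimal pure-Ruby sketch that counts queries. It does not use Rails or a real database: `FakeDB`, `all_users` and `post_for` are hypothetical names invented for this illustration.

```ruby
# Toy stand-in for a database that counts every query it receives.
# FakeDB, all_users and post_for are hypothetical, not Rails APIs.
class FakeDB
  attr_reader :query_count

  def initialize(user_count)
    @users = (1..user_count).map { |id| { id: id } }
    @posts = @users.map { |u| { user_id: u[:id], title: "Post #{u[:id]}" } }
    @query_count = 0
  end

  # One query, roughly: SELECT * FROM users
  def all_users
    @query_count += 1
    @users
  end

  # One query per call, roughly: SELECT * FROM posts WHERE user_id = ?
  def post_for(user)
    @query_count += 1
    @posts.find { |post| post[:user_id] == user[:id] }
  end
end

db = FakeDB.new(1_000)

# Same shape as the view: iterate over the users and ask for each post.
titles = db.all_users.map { |user| db.post_for(user)[:title] }

puts titles.size    # 1000
puts db.query_count # 1001: one query for the users, one per user for the posts
```

With 1,000 users the view triggers 1,001 queries, and that number keeps growing with the size of the list.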
To solve the N+1 problem now and, more importantly, to catch it from now on, we’ll introduce Bullet. I ❤️ Bullet and I prefer it to Goldiloader for three main reasons:
- Bullet doesn’t change your code automatically, without asking for permission
- Bullet doesn’t run in production
- Bullet helps you understand and identify the N+1 queries; it doesn’t hide them
Add Bullet to your Gemfile and configure it for your test environment to raise an exception in case it identifies an N+1 query, so that you are forced to solve it.
# config/environments/test.rb
config.after_initialize do
  Bullet.enable = true        # turn Bullet on
  Bullet.bullet_logger = true # write findings to log/bullet.log
  Bullet.raise = true         # raise an error when an N+1 query is detected
end
If we run the test again, Bullet will tell us everything we want to know and make our tests fail:
Bullet::Notification::UnoptimizedQueryError:
USE eager loading detected
User => [:post]
Add to your finder: :includes => [:post]
So, we can now change our assignment to
assign(:users, User.includes(:post))
and rendering gets ~7x faster!
Warming up --------------------------------------
with N+1 1.000 i/100ms
without N+1 1.000 i/100ms
Calculating -------------------------------------
with N+1 1.539 (± 0.0%) i/s - 8.000 in 5.305367s
without N+1 10.479 (± 9.5%) i/s - 52.000 in 5.057764s
Comparison:
without N+1: 10.5 i/s
with N+1: 1.5 i/s - 6.81x slower
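Why does `includes` help so much? Conceptually, eager loading replaces the per-user lookups with one extra query for all the posts, then matches posts to users in memory. The sketch below imitates that idea in plain Ruby. It is an illustration under stated assumptions, not Active Record's implementation: `USERS`, `POSTS`, `posts_for_users` and `users_with_posts` are hypothetical names.

```ruby
# Hypothetical in-memory "tables"; in Rails these rows would come from the DB.
USERS = (1..5).map { |id| { id: id } }
POSTS = USERS.map { |u| { user_id: u[:id], title: "Post #{u[:id]}" } }

# Stand-in for the second query of an eager load, roughly:
#   SELECT * FROM posts WHERE user_id IN (1, 2, 3, ...)
def posts_for_users(user_ids)
  POSTS.select { |post| user_ids.include?(post[:user_id]) }
end

# Two "queries" in total, then each post is attached to its user in memory.
def users_with_posts
  users = USERS                                      # query 1
  posts = posts_for_users(users.map { |u| u[:id] })  # query 2
  post_by_user_id = posts.map { |post| [post[:user_id], post] }.to_h
  users.map { |user| user.merge(post: post_by_user_id[user[:id]]) }
end

users_with_posts.each { |user| puts user[:post][:title] }
```

The number of queries stays at two no matter how many users are in the list, which is the property the benchmark above is rewarding.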
Partials rendering
Let’s get to the most important part of this post: the usage of partials. We decide to refactor our code and extract a partial to render each single post. I think that’s a great idea, but there is one way of doing it that performs really badly, and it is the following:
1. extract the partial:
<div class="post">
<%= user.post.title %>
</div>
2. render it:
<% @users.each do |user| %>
<%= render 'erb_partials/post', user: user %>
<% end %>
And here are the numbers:
Warming up --------------------------------------
inline 1.000 i/100ms
partial 1.000 i/100ms
Calculating -------------------------------------
inline 11.776 (± 8.5%) i/s - 59.000 in 5.085002s
partial 5.648 (±17.7%) i/s - 28.000 in 5.043322s
Comparison:
inline: 11.8 i/s
partial: 5.6 i/s - 2.09x slower
Great! We just made our code 2 times slower. That’s because we are iterating the rendering of the partial. For each user, Rails needs to “open” the partial and evaluate it. The solution? Use collections.
Change the view to the following:
<%= render partial: 'erb_partials/post', collection: @users, as: :user %>
and here are the results:
Warming up --------------------------------------
inline 9.000 i/100ms
partial 1.000 i/100ms
collection 6.000 i/100ms
Calculating -------------------------------------
inline 96.394 (±11.4%) i/s - 477.000 in 5.016304s
partial 8.989 (±22.2%) i/s - 43.000 in 5.108843s
collection 57.828 (±13.8%) i/s - 288.000 in 5.092763s
Comparison:
inline: 96.4 i/s
collection: 57.8 i/s - 1.67x slower
partial: 9.0 i/s - 10.72x slower
As you can see, we were able to extract our code into a partial and keep it performant. In this version of our code Rails evaluates the partial only once and renders it for each user. You can think of it as the same kind of optimisation we did for the N+1 query before. It also means that our code scales better as the list grows.
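The difference between the two styles can be pictured with a toy model built on Ruby's standard `erb` library. This is only an illustration of where repeated work can come from, not how Rails implements partial rendering: `TemplateStore`, `render_each` and `render_collection` are hypothetical names, and the model exaggerates the per-call cost by preparing the template from scratch on every fetch.

```ruby
require "erb"

# Toy model only: these names are hypothetical and Rails' internals differ.
class TemplateStore
  attr_reader :lookups

  def initialize(sources)
    @sources = sources
    @lookups = 0
  end

  # Stand-in for the per-render work: find the template and prepare it.
  def fetch(name)
    @lookups += 1
    ERB.new(@sources.fetch(name))
  end
end

# One "render" call per item: the template is fetched on every iteration.
def render_each(store, name, users)
  users.map { |user| store.fetch(name).result_with_hash(user: user) }.join
end

# Collection style: fetch the template once, then reuse it for every item.
def render_collection(store, name, users)
  template = store.fetch(name)
  users.map { |user| template.result_with_hash(user: user) }.join
end

sources = { "post" => %(<div class="post"><%= user[:title] %></div>\n) }
users = (1..1_000).map { |i| { title: "Post #{i}" } }

per_item = TemplateStore.new(sources)
collection = TemplateStore.new(sources)
same_html = render_each(per_item, "post", users) ==
            render_collection(collection, "post", users)

puts per_item.lookups   # 1000
puts collection.lookups # 1
puts same_html          # true
```

Both approaches produce the same HTML; only the amount of setup work per item differs, which mirrors the N+1 analogy above.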
Final note
Don’t show 1,000 users on the same page. Implement pagination.
You can find the code in my repository on GitHub.
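As for the pagination advice, here is a minimal sketch of the idea in plain Ruby. It is not taken from the repository: `paginate` is a hypothetical helper, and the default page size of 25 is an arbitrary assumption. In a Rails app the same slicing would normally happen in the database, for example with Active Record's `limit` and `offset`, or with a pagination gem.

```ruby
# Hypothetical helper: returns the records for a 1-based page number.
# With Active Record the equivalent would be something like
#   User.includes(:post).limit(per_page).offset((page - 1) * per_page)
def paginate(records, page:, per_page: 25)
  raise ArgumentError, "page must be >= 1" if page < 1
  raise ArgumentError, "per_page must be >= 1" if per_page < 1

  # Array#[] returns nil when the offset is past the end, hence the fallback.
  records[(page - 1) * per_page, per_page] || []
end

users = (1..1_000).map { |i| "user-#{i}" }
page = paginate(users, page: 3, per_page: 25)

puts page.first # user-51
puts page.size  # 25
```

The view then only ever renders `per_page` items, so its cost no longer depends on the total number of users.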
Bonus: other template engines
We saw numbers with erb. What about slim and haml? Here they are:
Warming up --------------------------------------
erb inline 8.000 i/100ms
erb partial 1.000 i/100ms
erb collection 5.000 i/100ms
slim inline 10.000 i/100ms
slim partial 1.000 i/100ms
slim collection 6.000 i/100ms
haml inline 9.000 i/100ms
haml partial 1.000 i/100ms
haml collection 4.000 i/100ms
Calculating -------------------------------------
erb inline 97.943 (±10.2%) i/s - 488.000 in 5.046691s
erb partial 9.438 (±21.2%) i/s - 46.000 in 5.026193s
erb collection 67.090 (± 6.0%) i/s - 335.000 in 5.015540s
slim inline 104.373 (± 8.6%) i/s - 530.000 in 5.122621s
slim partial 9.836 (±20.3%) i/s - 49.000 in 5.123851s
slim collection 69.146 (± 7.2%) i/s - 348.000 in 5.059607s
haml inline 85.732 (±11.7%) i/s - 432.000 in 5.111380s
haml partial 8.180 (±24.4%) i/s - 40.000 in 5.165770s
haml collection 41.069 (±21.9%) i/s - 196.000 in 5.084682s
Comparison:
slim inline: 104.4 i/s
erb inline: 97.9 i/s - same-ish: difference falls within error
haml inline: 85.7 i/s - same-ish: difference falls within error
slim collection: 69.1 i/s - 1.51x slower
erb collection: 67.1 i/s - 1.56x slower
haml collection: 41.1 i/s - 2.54x slower
slim partial: 9.8 i/s - 10.61x slower
erb partial: 9.4 i/s - 11.06x slower
haml partial: 8.2 i/s - 12.76x slower
Inline rendering always performs best. It seems that slim performs as fast as erb (even slightly better), while haml suffers a bit more when it is time to render partials.
Conclusion
Use partials. Don’t be afraid of doing that.
Inline rendering is faster, true, but code maintainability, legibility and testability matter too.
So: split your views into partials when needed.
But do it the right way by using collections.
And, of course, avoid N+1 queries.