Rails Template engines performance

Post from a previous blog


From time to time people ask me which view engine I use and why. After I reply, they typically say something like: “isn’t that one slow?”

When people bring up performance, I usually answer with something like “well, that depends on …”, and there you can insert your slow-performance reason of choice and whether it is bad or not that bad.

For a very long time I have been using HAML. I prefer it over ERB, and my reasons are plain and simple: I don’t like all the clutter ERB needs to write HTML, and almost every time I use it I forget to close a tag.

My choice of HAML was never based on performance; it was based on the feeling of being more productive when writing views. Being a Ruby developer, this may sound odd, but I do like that HAML forces you to indent your HTML code. It makes everything easier to read and follow.

HAML snippet

%section.container
  %h1= post.title
  %h2= post.subtitle
  .content
    = post.content

ERB snippet

<section class="container">
  <h1><%= post.title %></h1>
  <h2><%= post.subtitle %></h2>
  <div class="content">
    <%= post.content %>
  </div>
</section>

But then again, what about the performance issue?

View engines comparison

The following exercise is not intended to trash any of the engines. The procedure used to compare performance is quite simple and not scientific at all, so take it with a grain of salt.

For this exercise I will run a set of tests on ERB, HAML and SLIM template engines.

Test environment

My test environment is as follows:

  • Ruby 2.0.0-p195
  • HAML 4.0.3
  • SLIM 1.3.9
  • Rails 4.0.0

I haven’t tweaked the garbage collector or any other Ruby setting. It’s running out of the box on OS X 10.8.4.

Testing views outside Rails

In my first test I will use three very simple inline templates, together with a struct that supplies the data to render.

I will run the rendering method a thousand times using Ruby’s Benchmark library. In this first test I will instantiate a new Engine object every time. In the real world you should not do this; instead, instantiate an Engine once and reuse it as much as possible.

require 'erb'
require 'haml'
require 'slim'
require 'benchmark'
require 'ostruct'

# Sample data shared by all three templates
data = OpenStruct.new name: 'Sample', email: 'sample@mail.com', list: %w(one two three)
@output = ''

erb_template = <<-ERB_TEMPLATE
<p><%= data.name %></p>
<p><%= data.email %></p>
<ul>
<% data.list.each do |l| %>
<li><%= l %>
<% end %>
</ul>
ERB_TEMPLATE

# HAML and SLIM are indentation-sensitive, so nesting matters here
haml_template = <<-HAML_TEMPLATE
%p= data.name
%p= data.email
%ul
  - data.list.each do |l|
    %li= l
HAML_TEMPLATE

slim_template = <<-SLIM_TEMPLATE
p= data.name
p= data.email
ul
  - data.list.each do |l|
    li= l
SLIM_TEMPLATE

bind = binding                       # ERB and HAML read `data` through this binding
context = OpenStruct.new data: data  # SLIM renders against this scope object

# A new engine object is built on every iteration
Benchmark.bmbm(10) do |b|
  b.report(:erb_new) { (1..1000).each { ERB.new(erb_template, 0, '', '@output').result bind } }
  b.report(:haml_new) { (1..1000).each { @output = Haml::Engine.new(haml_template).render(bind) } }
  b.report(:haml_ugly_new) { (1..1000).each { @output = Haml::Engine.new(haml_template, ugly: true).render(bind) } }
  b.report(:slim_new) { (1..1000).each { @output = Slim::Template.new { slim_template }.render(context) } }
end

Statistics for this first test are as follows:

                     user     system      total        real
erb_new          0.200000   0.000000   0.200000 (  0.204660)
haml_new         0.700000   0.010000   0.710000 (  0.716864)
haml_ugly_new    0.770000   0.000000   0.770000 (  0.778206)
slim_new         2.350000   0.020000   2.370000 (  2.378874)

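Here I tried the HAML engine both with and without ugly: true. Benchmark reports totals in seconds for the 1,000 iterations, so each figure can also be read as milliseconds per render. As the stats show, ERB was the fastest, about 0.5 ms per render ahead of HAML, and SLIM was the slowest engine.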

Now let’s move to the second test where I will reuse the Engine instance and just call the render method. I will be using the same templates and data struct as before.

@erb = ERB.new(erb_template, 0, '', '@output')
@haml = Haml::Engine.new(haml_template)
@haml_ugly = Haml::Engine.new(haml_template, ugly: true)
@slim = Slim::Template.new { slim_template }

Benchmark.bmbm(10) do |b|
  b.report(:erb_reuse) { (1..1000).each { @erb.result bind } }
  b.report(:haml_reuse) { (1..1000).each { @output = @haml.render(bind) } }
  b.report(:haml_ugly_reuse) { (1..1000).each { @output = @haml_ugly.render(bind) } }
  b.report(:slim_reuse) { (1..1000).each { @output = @slim.render(context) } }
end

This time we read a whole different story from the stats:

                       user     system      total        real
erb_reuse          0.090000   0.000000   0.090000 (  0.102397)
haml_reuse         0.170000   0.010000   0.180000 (  0.178557)
haml_ugly_reuse    0.140000   0.000000   0.140000 (  0.139304)
slim_reuse         0.010000   0.000000   0.010000 (  0.015742)

ERB was twice as fast, HAML was about four times as fast, but SLIM, well, it was lightning-fast: it rendered roughly 150 times faster than when a new engine was created on every iteration.
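As a quick sanity check on those ratios, the sketch below divides the real column of the first table by the real column of the second. The names fresh, reused and speedups are only for this illustration. Since each figure is seconds per 1,000 renders, it can also be read as milliseconds per render.

```ruby
# Real times in seconds for 1,000 renders, copied from the two tables above.
fresh  = { erb: 0.204660, haml: 0.716864, haml_ugly: 0.778206, slim: 2.378874 }
reused = { erb: 0.102397, haml: 0.178557, haml_ugly: 0.139304, slim: 0.015742 }

# Speedup factor gained by reusing the engine instance.
speedups = {}
fresh.each do |engine, seconds|
  speedups[engine] = seconds / reused[engine]
  printf("%-10s %.3f ms -> %.3f ms per render (%.1fx faster)\n",
         engine, seconds, reused[engine], speedups[engine])
end
```

The SLIM ratio comes out at a little over 150, which is a factor, not a percentage.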

Before moving on to test them in Rails, I ran the last test under ruby-prof, just to get a rough idea of how the engines render the templates.

require 'ruby-prof'

# Profile 1,000 ERB renders
RubyProf.start
(1..1000).each { @erb.result bind }
result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)

# Profile 1,000 HAML renders
RubyProf.start
(1..1000).each { @output = @haml.render(bind) }
result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)

# Profile 1,000 SLIM renders
RubyProf.start
(1..1000).each { @output = @slim.render(context) }
result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)

The sample I got for ERB shows that more than half of the time is spent in Kernel#eval.

Total: 0.203420
 %self    total     self     wait    child    calls  name
 58.05    0.150    0.108    0.000    0.042     1002  Kernel#eval
 10.59    0.034    0.020    0.000    0.014     1179  *Array#each
  6.35    0.012    0.012    0.000    0.000    15000  String#concat
  2.01    0.157    0.004    0.000    0.154     1040  Proc#call
  1.73    0.003    0.003    0.000    0.000     5000  String#to_s
  1.68    0.158    0.003    0.000    0.155     1000  ERB#result
  0.86    0.160    0.002    0.000    0.158        1  Range#each
  0.80    0.001    0.001    0.000    0.000     1004  Kernel#proc
  0.66    0.001    0.001    0.000    0.000      159  Regexp#=~
  0.50    0.002    0.001    0.000    0.001        6  CodeRay::Scanners::Ruby#scan_tokens

Another chunk of time is spent at Array#each and String#concat.
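To see where that Kernel#eval comes from, here is a minimal sketch using only Ruby's standard library. ERB#src returns the Ruby code the template was compiled to, and ERB#result evaluates that code against a binding on every call. The template string and the name variable are made up for this illustration.

```ruby
require 'erb'

# Hypothetical one-line template, only for this illustration.
template = "<p><%= name %></p>"
erb = ERB.new(template)

# The template has already been compiled into plain Ruby source.
compiled_source = erb.src
puts compiled_source

# Each call to result evals that source against the given binding.
name = 'Sample'
html = erb.result(binding)
puts html
```

The compiled source is mostly string appends to an output buffer, which fits the String#concat calls in the profile.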

The same sample for HAML reveals that it also calls Kernel#eval, but two thousand more times than ERB, and that it spends a significant amount of time in Haml::Buffer and Haml::Engine.

Total: 0.455242
 %self    total     self     wait    child    calls  name
 32.62    0.241    0.149    0.000    0.092     3002  Kernel#eval
 10.52    0.162    0.048    0.000    0.114     2179  *Array#each
 10.50    0.048    0.048    0.000    0.000    10100  Hash#[]=
  8.48    0.050    0.039    0.000    0.012     5000  Haml::Buffer#format_script_false_true_false_false_false_true_false
  3.73    0.427    0.017    0.000    0.410     1000  Haml::Engine#render
  3.10    0.014    0.014    0.000    0.000     5000  Haml::Buffer#push_text
  1.50    0.014    0.007    0.000    0.007     1042  Array#map
  1.41    0.006    0.006    0.000    0.000     5000  Haml::Buffer#adjust_tabs
  1.36    0.078    0.006    0.000    0.072     1000  Haml::Engine#set_locals
  1.35    0.006    0.006    0.000    0.000     3000  Haml::Options#format
  1.26    0.011    0.006    0.000    0.005     1000  Haml::Buffer#initialize

Now it is time to inspect what SLIM is doing to render that fast. Notice that there are far fewer calls to everything: just two Kernel#eval calls and a hundred and seventy-two calls to Array#each!

Total: 0.023274
 %self    total     self     wait    child    calls  name
 11.52    0.008    0.003    0.000    0.006      172  *Array#each
  5.36    0.001    0.001    0.000    0.000      159  Regexp#=~
  3.69    0.002    0.001    0.000    0.001        5  CodeRay::Scanners::Ruby#scan_tokens
  3.06    0.001    0.001    0.000    0.000        8  BasicObject#method_missing
  2.98    0.001    0.001    0.000    0.000        2  Kernel#gem_original_require
  2.45    0.001    0.001    0.000    0.000      973  String#===
  2.39    0.001    0.001    0.000    0.000      180  <Class::File>#file?
  1.99    0.001    0.000    0.000    0.000        2  Kernel#eval
  1.93    0.001    0.000    0.000    0.000      107  Array#include?
  1.56    0.000    0.000    0.000    0.000       21  Binding#eval
  1.38    0.000    0.000    0.000    0.000      306  <Class::Regexp>#escape
  1.36    0.000    0.000    0.000    0.000      334  StringScanner#scan
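One plausible reading of this profile is that the template is turned into Ruby code once, and every later render just calls that code instead of going through Kernel#eval again. The sketch below is not SLIM's internal code. It only illustrates the compile-once idea with ERB#def_method from the standard library, and the UserView class and its template are hypothetical.

```ruby
require 'erb'

# Hypothetical view class, used only for this illustration.
class UserView
  def initialize(name, list)
    @name = name
    @list = list
  end
end

template = "<p><%= @name %></p><ul><% @list.each do |l| %><li><%= l %></li><% end %></ul>"

# Compile the template once into an instance method named render.
# Later calls run ordinary Ruby code with no further eval.
ERB.new(template).def_method(UserView, 'render', 'user_view.erb')

html = UserView.new('Sample', %w(one two three)).render
puts html
```

The eval happens a single time inside def_method, so a loop of 1,000 calls to render would pay only the cost of running the generated method.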

Testing views with Rails

Now it is time to test all three engines within Rails, so I have created a sample Rails application for this test.

This application has 2 models: User and Skill. The database was seeded with 1000 users, and each user was assigned 3 random skills. The model relationships are as follows:

class User < ActiveRecord::Base
  has_many :user_skills
  has_many :skills, through: :user_skills
end

class Skill < ActiveRecord::Base
  has_many :user_skills
  has_many :users, through: :user_skills
end

I also have 3 controllers, one each for ERB, HAML and SLIM. The code for the index action is the same in all of them; the only difference is the layout that each controller uses.

class UsersErbController < ApplicationController
  layout 'application.html.erb'

  def index
    @users = User.includes(:user_skills).limit(20)
  end
end

Templates for ERB are:

# layouts/application.html.erb
<!DOCTYPE html>
<html>
  <head>
    <title>Sample</title>
    <%= stylesheet_link_tag "application", media: "all", "data-turbolinks-track" => true %>
    <%= javascript_include_tag "application", "data-turbolinks-track" => true %>
    <%= csrf_meta_tags %>
  </head>
  <body>
    <%= yield %>
  </body>
</html>

# users_erb/index.html.erb
<h1>Users</h1>
<ul>
  <%= render partial: 'users_erb/user', collection: @users %>
</ul>

# users_erb/_user.html.erb
<li>
  <span><%= user.name %></span>
  <span><%= user.email %></span>
  <%= render partial: 'users_erb/skill', collection: user.skills %>
</li>

# users_erb/_skill.html.erb
<span><%= skill.name %></span>

The templates for HAML are:

# layouts/application.html.haml
!!!
%html
  %head
    %title Sample
    = stylesheet_link_tag "application", media: "all", "data-turbolinks-track" => true
    = javascript_include_tag "application", "data-turbolinks-track" => true
    = csrf_meta_tags
  %body
    = yield

# users_haml/index.html.haml
%h1 Users
%ul
  = render partial: 'users_haml/user', collection: @users

# users_haml/_user.html.haml
%li
  %span= user.name
  %span= user.email
  = render partial: 'users_haml/skill', collection: user.skills

# users_haml/_skill.html.haml
%span= skill.name

The templates for SLIM are:

# layouts/application.html.slim
doctype html
html
  head
    title Sample
    = stylesheet_link_tag "application", media: "all", "data-turbolinks-track" => true
    = javascript_include_tag "application", "data-turbolinks-track" => true
    = csrf_meta_tags
  body
    == yield

# users_slim/index.html.slim
h1 Users
ul
  = render partial: 'users_slim/user', collection: @users

# users_slim/_user.html.slim
li
  span= user.name
  span= user.email
  = render partial: 'users_slim/skill', collection: user.skills

# users_slim/_skill.html.slim
span= skill.name

So, for each of the three engines we will render a layout, a view, a partial for a collection and a nested partial for another collection.

To track the time that Rails spends rendering the view, we will subscribe through ActiveSupport::Notifications to the render_template.action_view event. For every notification we will store the duration, which is given in ms.

For the test, Rails was configured to run on a Puma server in production mode.

To exercise the HTTP calls for each engine's controller we will use the ab command from Apache, a tool for benchmarking HTTP server responses.

$ ab -n 500 -c 1 http://127.0.0.1:3000/erb
Requests per second:    21.02 [#/sec] (mean)
Time per request:       47.571 [ms] (mean)
Time per request:       47.571 [ms] (mean, across all concurrent requests)
Average:                41.211 [ms]
Fastest:                32.128 [ms]
Slowest:                135.773 [ms]

$ ab -n 500 -c 1 http://127.0.0.1:3000/haml
Requests per second:    19.52 [#/sec] (mean)
Time per request:       51.233 [ms] (mean)
Time per request:       51.233 [ms] (mean, across all concurrent requests)
Average:                43.642 [ms]
Fastest:                33.665 [ms]
Slowest:                100.419 [ms]

$ ab -n 500 -c 1 http://127.0.0.1:3000/slim
Requests per second:    21.37 [#/sec] (mean)
Time per request:       46.796 [ms] (mean)
Time per request:       46.796 [ms] (mean, across all concurrent requests)
Average:                40.435 [ms]
Fastest:                31.907 [ms]
Slowest:                121.193 [ms]

No surprise here: SLIM is still the fastest engine, but now the difference between all 3 is small, and the difference between ERB and SLIM is almost imperceptible. Keep in mind that ab’s statistics are based on the full request, which means they also include the database queries. We have, however, also collected timing information for the rendering process inside Rails.
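With -c 1, ab sends one request at a time, so requests per second is simply 1000 divided by the mean time per request in milliseconds. A small sketch of that arithmetic, using the figures ab reported above (the variable names are mine):

```ruby
# Mean time per request in milliseconds, copied from the ab output above.
time_per_request_ms = { erb: 47.571, haml: 51.233, slim: 46.796 }

# With a concurrency of 1, throughput is the inverse of the mean request time.
requests_per_second = {}
time_per_request_ms.each do |engine, ms|
  requests_per_second[engine] = (1000.0 / ms).round(2)
end

# Relative gap between each engine and the fastest one, in percent.
fastest = time_per_request_ms.values.min
gap_percent = {}
time_per_request_ms.each do |engine, ms|
  gap_percent[engine] = ((ms - fastest) / fastest * 100).round(1)
end

puts requests_per_second.inspect
puts gap_percent.inspect
```

By this measure ERB trails SLIM by under 2% of the full request time, and HAML by a little under 10%.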

The following graph compares all 3 engines. It’s a bit hard to see which one was faster, mostly because all three of them show spikes from time to time.

The data reported by Rails is easier to read as numbers. Again, it confirms that SLIM is the fastest template engine.

Erb render:     41.091 [ms] (average)
Haml render:    43.508 [ms] (average)
Slim render:    40.387 [ms] (average)

Erb render:     36.021 [ms] (median)
Haml render:    37.316 [ms] (median)
Slim render:    35.349 [ms] (median)
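The gap between average and median is what you would expect when a few requests spike: the average is pulled up while the median barely moves. Here is a minimal sketch of both calculations, with made-up durations rather than the data collected in this test:

```ruby
# Arithmetic mean of a list of numbers.
def average(values)
  values.inject(:+) / values.size.to_f
end

# Middle value of the sorted list (mean of the two middle values if even).
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Hypothetical render durations in ms, with one spike.
durations = [35.0, 36.0, 34.0, 120.0, 37.0]

puts "average: #{average(durations)} ms"
puts "median:  #{median(durations)} ms"
```

With these sample values the single 120 ms spike lifts the average well above the median.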

Conclusion

As we have seen here, SLIM is the fastest of the three engines compared. ERB does an OK job. It seems that, from the technical point of view, you will be OK if you use any of the 3 engines. Also, bear in mind that I didn’t include any caching or reverse proxy in these tests, which are typically used in production environments. If that’s your case, I think that these numbers become even more meaningful.

Now it is up to you to choose the templating engine that makes you more productive. It is also important to bear in mind that if you have frontend designers accustomed to coding in HTML only, the learning curve for them will be relevant.

Thanks for reading.


If you or your team need training on software development techniques or Ruby on Rails, reach out to Michelada.io at michelada.io.