Reading Arel of ActiveRecord

Takahiko Inayama
5 min readApr 29, 2019

--

From Ruby to SQL

What is the Arel?

Arel is an AST manager used in ActiveRecord to create SQL query. Every SQL statements are represented as ruby code in Arel. Because of this, it enable to write a complicated SQL query without using raw SQL string. But it’s not recommended to use Arel in the application because it’s a private API.

I’ll dive into the internal of Arel in this article. I’ll explain how it works when generating SQL with simple code example.

I’m not an expert of this subject. Please correct me if i’m wrong.

Overview

Arel is located in ActiveRecord gem. And has these kind of directory structure.

.
├── active_record
│ └── ...
├── arel
│ ├── attributes
│ ├── collectors
│ ├── nodes
│ └── visitors
└── rails
└── ...

As I said, Arel is an AST manager. So there are classes that represents each statements. They are located in nodes directory.

attributes

This contains classes for each table attributes with type such as STRING, TIME or BOOLEAN, etc…

visitors

This contains AST nodes visitor and SQL engine specific codes. (for MySQL, PostgreSQL, etc…)

Arel’s basic usage

product = Product.arel_table
# returns Arel::Table

select = Product.arel_table.from
# returns Arel::SelectManager

query = select.where(product[:name].eq("foo")).to_sql
# SELECT FROM "products" WHERE "products"."name" = 'foo'

Arel’s key components

Arel::Table

This represents table itself as name suggests. You can obtain this instance via arel_table method of model class.

Table class has methods to access Arel::SelectManager, Arel::Attribute and Nodes::TableAlias. And also has methods to construct query.(e.g. where, join, group, order, having, etc...)

Arel::Attribute

Attribute class represents attributes that belongs to the table.

Example

puts product[:name]
# #<struct Arel::Attributes::Attribute
# relation=
# #<Arel::Table:0x00007fcb1b0552d8
# @name="products",
# @type_caster=
# #<ActiveRecord::TypeCaster::Map:0x00007fcb1b055328
# @types=Product(id: integer, name: string)>,
# @table_alias=nil>,
# name=:name>

Arel::SelectManager

This seems to be a node forSELECT statement. But it's different there is another node called Arel::Nodes::SelectStatement which represents only SELECT. Arel::SelectManager has convenient methods to create AST tree.

There are another class like this.

  • Arel::InsertManager
  • Arel::UpdateManager

Arel::Nodes

A base class of each AST node classes.

Arel::Visitors::ToSql

Contain AST node visitors methods which used on creating SQL string. I’ll explain in later.

Arel::Nodes::NamedFunction

This allows to create a AST node for statement which not being declared in Arel by default. Here is a usage in Arel::FactoryMethods.

# activerecord/lib/arel/factory_methods.rb:40
# Create a LOWER() function
def lower(column)
Nodes::NamedFunction.new "LOWER", [Nodes.build_quoted(column)]
end

def coalesce(*exprs)
Nodes::NamedFunction.new "COALESCE", exprs
end

How Arel generate SQL query?

I’ll follow what happens inside Arel when running this example code.

select = Product.arel_table.from
# Arel::SelectManager

query = select.where(product[:name].eq("foo")).to_sql

First, where in select manager is defined in Arel::TreeManager.

# activerecord/lib/arel/tree_manager.rb

def where(expr)
if Arel::TreeManager === expr
expr = expr.ast
end
@ctx.wheres << expr
self
end

Simple enough. If the argument is Arel::TreeManager then fetch AST from it. And put into @ctx.wheres. Otherwise, put expr directly into @ctx.wheres. As Arel::TreeManager is included in Arel::SelectManager, this @ctx is same as it on Arel::TreeManager.

I called to_sql to generate SQL. This method is located in Arel::TreeManager too. This method has following operations.

def to_sql(engine = Table.engine)
collector = Arel::Collectors::SQLString.new
collector = engine.connection.visitor.accept @ast, collector
collector.value
end

‌Table.engine is initialized on ActiveRecord loaded. So this is a reference to ActiveRecord to the module.

# activerecord/lib/active_record.rb

ActiveSupport.on_load(:active_record) do
Arel::Table.engine = self
end

engine.connection is defined in ActiveRecord::ConnectionHandling. This is same method referenced with ActiveRecord::Base.connection. This method returns a connection from connection pool.

activerecord/lib/active_record/connection_handling.rb:191

def connection
retrieve_connection
end

If you are using MySQL, a connections is initialized in mysql2_adapter.rb. It means the engine.connection is ‌ConnectionAdapters::Mysql2Adapter

# activerecord/lib/active_record/connection_adapters/mysql2_adapter.rb:12

module ActiveRecord
module ConnectionHandling # :nodoc:
# Establishes a connection to the database that's used by all Active Record objects.
def mysql2_connection(config)
# ...
client = Mysql2::Client.new(config)
ConnectionAdapters::Mysql2Adapter.new(client, logger, nil, config)

visitor is defined in ActiveRecord::ConnectionAdapters::AbstractAdapter. It calls arel_visitor. In this case with MySQL engine, ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter#arel_visitor is used. I've reached engine.connection.visitor finally.

# activerecord/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:772

def arel_visitor
Arel::Visitors::MySQL.new(self)
end

Arel::Visitors::MySQL is a subclass of Arel::Visitors::ToSql. Major logics are written in Arel::Visitors::ToSql. Arel::Visitors::MySQL only has some engine specific code.

And Arel::Visitors::MySQL is a subclass of Arel::Visitors::Visitor. The accept method called in to_sql which is defined in this class.

# activerecord/lib/arel/visitors/visitor.rb:10

def accept(object, *args)
visit object, *args
end

# ...

attr_reader :dispatch

def self.dispatch_cache
Hash.new do |hash, klass|
hash[klass] = "visit_#{(klass.name || '').gsub('::', '_')}"
end
end

def get_dispatch_cache
self.class.dispatch_cache
end

def visit(object, *args)
dispatch_method = dispatch[object.class]
send dispatch_method, object, *args
rescue NoMethodError => e
raise e if respond_to?(dispatch_method, true)
superklass = object.class.ancestors.find { |klass|
respond_to?(dispatch[klass], true)
}
raise(TypeError, "Cannot visit #{object.class}") unless superklass
dispatch[object.class] = dispatch[superklass]
retry
end

What happens inside when you call visit?

collector = Arel::Collectors::SQLString.new
collector = engine.connection.visitor.accept @ast, collector

When you pass class name to dispatch_cache, it convert class name like Arel::Nodes::And to visit_Arel_Nodes_And. The name directly become a visitor method declared in Arel::Visitors::ToSql and other visitor classes.

Here is examples.

  • visit_Arel_Nodes_DeleteStatement
  • visit_Arel_Nodes_UpdateStatement
  • visit_Arel_Nodes_InsertStatement
  • visit_Arel_Nodes_Exists
  • visit_Arel_Nodes_Casted

How about with the example code?

select.where(product[:name].eq("foo")).ast

The @ast has following structure.

#<Arel::Nodes::SelectStatement:0x00007f8f1aba2830 @cores=[
#<Arel::Nodes::SelectCore:0x00007f8f1aba2808
@source=#<Arel::Nodes::JoinSource:0x00007f8f1aba26f0
@left=#<Arel::Table:0x00007f8f1aba2ad8
@name="products",
@type_caster=#<ActiveRecord::TypeCaster::Map:0x00007f8f1aba2bc8
@types=Product(id: integer, name: string)>,
@table_alias=nil>,
@right=[]>,
@top=nil,
@set_quantifier=nil,
@projections=[],
@wheres=[
(...)
], @orders=[], @limit=nil, @lock=nil, @offset=nil, @with=nil>

Also @wheres of the first @cores looks like this.

@wheres=
[
#<Arel::Nodes::Equality:0x00007f8f1ab918a0
@left=#<struct Arel::Attributes::Attribute
relation=#<Arel::Table:0x00007f8f1aba2ad8
@name="products",
@type_caster=#<ActiveRecord::TypeCaster::Map:0x00007f8f1aba2bc8
@types=Product(id: integer, name: string)>,
@table_alias=nil>, name=:name>,
@right=#<Arel::Nodes::Casted:0x00007f8f1ab918c8
@val="foo",
@attribute=#<struct Arel::Attributes::Attribute
relation=#<Arel::Table:0x00007f8f1aba2ad8
@name="products",
@type_caster=#<ActiveRecord::TypeCaster::Map:0x00007f8f1aba2bc8
@types=Product(id: integer, name: string)>,
@table_alias=nil>, name=:name>>>],
@groups=[],
@havings=[],
@windows=[]>
]

The root node is Arel::Nodes::SelectStatement. So visit_Arel_Nodes_SelectStatement must be called at first.

# activerecord/lib/arel/visitors/to_sql.rb:119

def visit_Arel_Nodes_SelectStatement(o, collector)
if o.with
collector = visit o.with, collector
collector << " "
end

collector = o.cores.inject(collector) { |c, x|
visit_Arel_Nodes_SelectCore(x, c)
}

unless o.orders.empty?
collector << " ORDER BY "
len = o.orders.length - 1
o.orders.each_with_index { |x, i|
collector = visit(x, collector)
collector << ", " unless len == i
}
end

visit_Arel_Nodes_SelectOptions(o, collector)
end

We know that o is an AST node of Arel::Nodes::SelectStatement in this case. And collector is Arel::Collectors::SQLString.

Nothing happens on if o.with block because it's empty.

o.cores is an array of Arel::Attributes::Attribute. Then collector is the appended result of visit_Arel_Nodes_SelectCore.

And visit_Arel_Nodes_SelectCore is this. I found a "SELECT"string finally!!

def visit_Arel_Nodes_SelectCore(o, collector)
collector << "SELECT"

collector = collect_optimizer_hints(o, collector)
collector = maybe_visit o.set_quantifier, collector

collect_nodes_for o.projections, collector, " "

if o.source && !o.source.empty?
collector << " FROM "
collector = visit o.source, collector
end

collect_nodes_for o.wheres, collector, " WHERE ", " AND "
collect_nodes_for o.groups, collector, " GROUP BY "
collect_nodes_for o.havings, collector, " HAVING ", " AND "
collect_nodes_for o.windows, collector, " WINDOW "

maybe_visit o.comment, collector
end

I think I’ve got a feeling of how Arel working. So I’m gonna finish this article.

Thanks for reading!!

References for Arel

Mastering ActiveRecord and Arel — GitHub Pages

Mastering ActiveRecord and Arel

danshultz/mastering_active_record_sample_code: Sample Code Used in my Talk at Codemash 2013

Creating Advanced Active Record DB Queries with Arel

Composable Query Builders in Rails with Arel

Build crazy queries with ActiveRecord and Arel

--

--