Reading Arel of ActiveRecord
From Ruby to SQL
What is the Arel?
Arel is an AST manager used in ActiveRecord to create SQL query. Every SQL statements are represented as ruby code in Arel. Because of this, it enable to write a complicated SQL query without using raw SQL string. But it’s not recommended to use Arel in the application because it’s a private API.
I’ll dive into the internal of Arel in this article. I’ll explain how it works when generating SQL with simple code example.
I’m not an expert of this subject. Please correct me if i’m wrong.
Overview
Arel is located in ActiveRecord gem. And has these kind of directory structure.
.
├── active_record
│ └── ...
├── arel
│ ├── attributes
│ ├── collectors
│ ├── nodes
│ └── visitors
└── rails
└── ...
As I said, Arel is an AST manager. So there are classes that represents each statements. They are located in nodes
directory.
attributes
This contains classes for each table attributes with type such as STRING
, TIME
or BOOLEAN
, etc…
visitors
This contains AST nodes visitor and SQL engine specific codes. (for MySQL, PostgreSQL, etc…)
Arel’s basic usage
product = Product.arel_table
# returns Arel::Table
select = Product.arel_table.from
# returns Arel::SelectManager
query = select.where(product[:name].eq("foo")).to_sql
# SELECT FROM "products" WHERE "products"."name" = 'foo'
Arel’s key components
Arel::Table
This represents table itself as name suggests. You can obtain this instance via arel_table
method of model class.
Table class has methods to access Arel::SelectManager
, Arel::Attribute
and Nodes::TableAlias
. And also has methods to construct query.(e.g. where
, join
, group
, order
, having
, etc...)
Arel::Attribute
Attribute class represents attributes that belongs to the table.
Example
puts product[:name]
# #<struct Arel::Attributes::Attribute
# relation=
# #<Arel::Table:0x00007fcb1b0552d8
# @name="products",
# @type_caster=
# #<ActiveRecord::TypeCaster::Map:0x00007fcb1b055328
# @types=Product(id: integer, name: string)>,
# @table_alias=nil>,
# name=:name>
Arel::SelectManager
This seems to be a node forSELECT
statement. But it's different there is another node called Arel::Nodes::SelectStatement
which represents only SELECT
. Arel::SelectManager
has convenient methods to create AST tree.
There are another class like this.
Arel::InsertManager
Arel::UpdateManager
Arel::Nodes
A base class of each AST node classes.
Arel::Visitors::ToSql
Contain AST node visitors methods which used on creating SQL string. I’ll explain in later.
Arel::Nodes::NamedFunction
This allows to create a AST node for statement which not being declared in Arel by default. Here is a usage in Arel::FactoryMethods
.
# activerecord/lib/arel/factory_methods.rb:40
# Create a LOWER() function
def lower(column)
Nodes::NamedFunction.new "LOWER", [Nodes.build_quoted(column)]
end
def coalesce(*exprs)
Nodes::NamedFunction.new "COALESCE", exprs
end
How Arel generate SQL query?
I’ll follow what happens inside Arel when running this example code.
select = Product.arel_table.from
# Arel::SelectManager
query = select.where(product[:name].eq("foo")).to_sql
First, where
in select manager is defined in Arel::TreeManager
.
# activerecord/lib/arel/tree_manager.rb
def where(expr)
if Arel::TreeManager === expr
expr = expr.ast
end
@ctx.wheres << expr
self
end
Simple enough. If the argument is Arel::TreeManager
then fetch AST from it. And put into @ctx.wheres
. Otherwise, put expr
directly into @ctx.wheres
. As Arel::TreeManager
is included in Arel::SelectManager
, this @ctx
is same as it on Arel::TreeManager
.
I called to_sql
to generate SQL. This method is located in Arel::TreeManager
too. This method has following operations.
def to_sql(engine = Table.engine)
collector = Arel::Collectors::SQLString.new
collector = engine.connection.visitor.accept @ast, collector
collector.value
end
Table.engine
is initialized on ActiveRecord loaded. So this is a reference to ActiveRecord
to the module.
# activerecord/lib/active_record.rb
ActiveSupport.on_load(:active_record) do
Arel::Table.engine = self
end
engine.connection
is defined in ActiveRecord::ConnectionHandling
. This is same method referenced with ActiveRecord::Base.connection
. This method returns a connection from connection pool.
activerecord/lib/active_record/connection_handling.rb:191
def connection
retrieve_connection
end
If you are using MySQL, a connections is initialized in mysql2_adapter.rb
. It means the engine.connection
is ConnectionAdapters::Mysql2Adapter
# activerecord/lib/active_record/connection_adapters/mysql2_adapter.rb:12
module ActiveRecord
module ConnectionHandling # :nodoc:
# Establishes a connection to the database that's used by all Active Record objects.
def mysql2_connection(config)
# ...
client = Mysql2::Client.new(config)
ConnectionAdapters::Mysql2Adapter.new(client, logger, nil, config)
visitor
is defined in ActiveRecord::ConnectionAdapters::AbstractAdapter
. It calls arel_visitor
. In this case with MySQL engine, ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter#arel_visitor
is used. I've reached engine.connection.visitor
finally.
# activerecord/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:772
def arel_visitor
Arel::Visitors::MySQL.new(self)
end
Arel::Visitors::MySQL
is a subclass of Arel::Visitors::ToSql
. Major logics are written in Arel::Visitors::ToSql
. Arel::Visitors::MySQL
only has some engine specific code.
And Arel::Visitors::MySQL
is a subclass of Arel::Visitors::Visitor
. The accept
method called in to_sql
which is defined in this class.
# activerecord/lib/arel/visitors/visitor.rb:10
def accept(object, *args)
visit object, *args
end
# ...
attr_reader :dispatch
def self.dispatch_cache
Hash.new do |hash, klass|
hash[klass] = "visit_#{(klass.name || '').gsub('::', '_')}"
end
end
def get_dispatch_cache
self.class.dispatch_cache
end
def visit(object, *args)
dispatch_method = dispatch[object.class]
send dispatch_method, object, *args
rescue NoMethodError => e
raise e if respond_to?(dispatch_method, true)
superklass = object.class.ancestors.find { |klass|
respond_to?(dispatch[klass], true)
}
raise(TypeError, "Cannot visit #{object.class}") unless superklass
dispatch[object.class] = dispatch[superklass]
retry
end
What happens inside when you call visit
?
collector = Arel::Collectors::SQLString.new
collector = engine.connection.visitor.accept @ast, collector
When you pass class name to dispatch_cache
, it convert class name like Arel::Nodes::And
to visit_Arel_Nodes_And
. The name directly become a visitor method declared in Arel::Visitors::ToSql
and other visitor classes.
Here is examples.
- visit_Arel_Nodes_DeleteStatement
- visit_Arel_Nodes_UpdateStatement
- visit_Arel_Nodes_InsertStatement
- visit_Arel_Nodes_Exists
- visit_Arel_Nodes_Casted
How about with the example code?
select.where(product[:name].eq("foo")).ast
The @ast
has following structure.
#<Arel::Nodes::SelectStatement:0x00007f8f1aba2830 @cores=[
#<Arel::Nodes::SelectCore:0x00007f8f1aba2808
@source=#<Arel::Nodes::JoinSource:0x00007f8f1aba26f0
@left=#<Arel::Table:0x00007f8f1aba2ad8
@name="products",
@type_caster=#<ActiveRecord::TypeCaster::Map:0x00007f8f1aba2bc8
@types=Product(id: integer, name: string)>,
@table_alias=nil>,
@right=[]>,
@top=nil,
@set_quantifier=nil,
@projections=[],
@wheres=[
(...)
], @orders=[], @limit=nil, @lock=nil, @offset=nil, @with=nil>
Also @wheres
of the first @cores
looks like this.
@wheres=
[
#<Arel::Nodes::Equality:0x00007f8f1ab918a0
@left=#<struct Arel::Attributes::Attribute
relation=#<Arel::Table:0x00007f8f1aba2ad8
@name="products",
@type_caster=#<ActiveRecord::TypeCaster::Map:0x00007f8f1aba2bc8
@types=Product(id: integer, name: string)>,
@table_alias=nil>, name=:name>,
@right=#<Arel::Nodes::Casted:0x00007f8f1ab918c8
@val="foo",
@attribute=#<struct Arel::Attributes::Attribute
relation=#<Arel::Table:0x00007f8f1aba2ad8
@name="products",
@type_caster=#<ActiveRecord::TypeCaster::Map:0x00007f8f1aba2bc8
@types=Product(id: integer, name: string)>,
@table_alias=nil>, name=:name>>>],
@groups=[],
@havings=[],
@windows=[]>
]
The root node is Arel::Nodes::SelectStatement
. So visit_Arel_Nodes_SelectStatement
must be called at first.
# activerecord/lib/arel/visitors/to_sql.rb:119
def visit_Arel_Nodes_SelectStatement(o, collector)
if o.with
collector = visit o.with, collector
collector << " "
end
collector = o.cores.inject(collector) { |c, x|
visit_Arel_Nodes_SelectCore(x, c)
}
unless o.orders.empty?
collector << " ORDER BY "
len = o.orders.length - 1
o.orders.each_with_index { |x, i|
collector = visit(x, collector)
collector << ", " unless len == i
}
end
visit_Arel_Nodes_SelectOptions(o, collector)
end
We know that o
is an AST node of Arel::Nodes::SelectStatement
in this case. And collector
is Arel::Collectors::SQLString
.
Nothing happens on if o.with
block because it's empty.
o.cores
is an array of Arel::Attributes::Attribute
. Then collector
is the appended result of visit_Arel_Nodes_SelectCore
.
And visit_Arel_Nodes_SelectCore
is this. I found a "SELECT"string finally!!
def visit_Arel_Nodes_SelectCore(o, collector)
collector << "SELECT"
collector = collect_optimizer_hints(o, collector)
collector = maybe_visit o.set_quantifier, collector
collect_nodes_for o.projections, collector, " "
if o.source && !o.source.empty?
collector << " FROM "
collector = visit o.source, collector
end
collect_nodes_for o.wheres, collector, " WHERE ", " AND "
collect_nodes_for o.groups, collector, " GROUP BY "
collect_nodes_for o.havings, collector, " HAVING ", " AND "
collect_nodes_for o.windows, collector, " WINDOW "
maybe_visit o.comment, collector
end
I think I’ve got a feeling of how Arel working. So I’m gonna finish this article.
Thanks for reading!!
References for Arel
Mastering ActiveRecord and Arel — GitHub Pages
Mastering ActiveRecord and Arel
danshultz/mastering_active_record_sample_code: Sample Code Used in my Talk at Codemash 2013
Creating Advanced Active Record DB Queries with Arel