How Does Bundle Install Work? — Part 1

This is a look at the technical details of how Bundler works that I wanted to share with everyone. I’m currently the newest member of the Bundler core team, where I mainly help out where I can. To better familiarize myself with the project, I decided to explore the codebase and learn the technical details and concepts inside Bundler. Hope you enjoy.


The bundle install command is the main interaction most users have with Bundler. The command reads a Gemfile which contains the gems your application depends on. An example Gemfile being:

source 'https://rubygems.org'
gem 'rack', '~> 2.0'
gem 'rspec', '~> 3.6'

Bundler then resolves these dependencies, fetches the needed gems from a remote source (https://rubygems.org) and installs them into your Ruby environment. But how does Bundler take the list of gems in the Gemfile and work out which of them, and which of their dependencies, need to be installed as a compatible set of versions?
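To make the version constraints in the example Gemfile concrete, here is a small sketch using the Gem::Requirement and Gem::Version classes that ship with RubyGems. The candidate version numbers are made up for illustration, and this only shows how a single requirement is checked against a version; it is not how Bundler resolves a whole dependency graph.

```ruby
require "rubygems"

# "~> 2.0" is the pessimistic operator: it allows >= 2.0 and < 3.0.
requirement = Gem::Requirement.new("~> 2.0")

# Hypothetical candidate versions, chosen only for illustration.
candidates = %w[1.6.8 2.0.0 2.0.3 2.1.0 3.0.0].map { |v| Gem::Version.new(v) }

# Keep only the versions that satisfy the requirement.
compatible = candidates.select { |version| requirement.satisfied_by?(version) }

puts compatible.map(&:to_s).join(", ")
```

Only the 2.x versions survive the filter, which is the kind of check that has to hold for every gem in the final set.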

This post assumes that you’re already familiar with both Bundler and Ruby. If not, I recommend reading bundler.io and ruby-doc.org beforehand to familiarize yourself.

This post uses Bundler version 1.15.1 as its reference. The line numbers and implementation details mentioned here will undoubtedly change in future releases.

Finally, the code samples are taken directly from the source code of Bundler which can be found at https://github.com/bundler/bundler. All file paths used will be relative to the root folder of the Bundler project.

After you press Enter

Once you have run bundle install, the first piece of code that runs is the bundle executable file located at exe/bundle. (There is also a bundler executable at exe/bundler which just runs the exe/bundle executable.)

This file does a couple of things like checking for legacy versions of Bundler. On line 35 we initialize the Bundler CLI.

Bundler::CLI.start(args, :debug => true)

Bundler uses the Thor gem for its command line interface. You can find the CLI implementation for all Bundler commands and their options in lib/bundler/cli.rb. On line 185 we’ll find the implementation of the install command.

def install
  require "bundler/cli/install"
  Bundler.settings.temporary(:no_install => false) do
    Install.new(options.dup).run
  end
end

The first thing the install method does is require bundler/cli/install.rb. Bundler has a convention of keeping the implementation of each command at bundler/cli/<command>.rb.

Then a temporary setting is set, which means the value is held in memory for the running Bundler process (here, while the block runs) and won’t be persisted on disk. Here we’re setting no_install to false. This is a setting for the bundle package command: if it is true, bundle package won’t install any gems, but for bundle install we do want Bundler to install our gems.

Bundler.settings.temporary(:no_install => false)
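The idea of a block-scoped, in-memory override can be sketched in a few lines. TinySettings below is a hypothetical stand-in written for this post, not Bundler's settings class, and it assumes the override should be undone once the block finishes.

```ruby
# Hypothetical stand-in for a settings object, for illustration only.
class TinySettings
  def initialize(persisted = {})
    @persisted = persisted   # pretend these were loaded from a config file
    @temporary = {}          # in-memory overrides, never written anywhere
  end

  # Temporary overrides win over persisted values.
  def [](key)
    @temporary.fetch(key) { @persisted[key] }
  end

  # Apply overrides while the block runs, then put the old overrides back.
  def temporary(overrides)
    previous = @temporary.dup
    @temporary.merge!(overrides)
    yield
  ensure
    @temporary = previous
  end
end

settings = TinySettings.new(no_install: true)
inside = settings.temporary(no_install: false) { settings[:no_install] }

puts inside.inspect                 # value seen inside the block
puts settings[:no_install].inspect  # value seen after the block
```

The persisted value is never touched, which is the property that matters here: nothing leaks into a later run.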

We then create a new instance of the Bundler::CLI::Install class with a copy of the options hash and then call the run method.

Install.new(options.dup).run

The options hash comes from Thor and is built from the method_option declarations for each command in lib/bundler/cli.rb. Here are a few example options for the install command:

method_option "quiet", :type => :boolean, :banner => "Only output warnings and errors."
method_option "without", :type => :array, :banner => "Exclude gems that are part of the specified named group."
method_option "with", :type => :array, :banner => "Include gems that are part of the specified named group."

Each option is defined with a name, a type (such as string, boolean or array) and a small description.

For example, if the user runs a command like

bundle install --with development

then the options hash will be:

{"with"=>["development"]}
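To get a feel for how command-line flags end up in a hash like this, here is a rough sketch. It uses Ruby's standard OptionParser as a stand-in for Thor, so the method name parse_install_options and the way array values are split are assumptions of this example rather than Thor's behaviour.

```ruby
require "optparse"

# Hypothetical helper: turns command-line arguments into an options hash
# with string keys. OptionParser is used here as a stand-in for Thor.
def parse_install_options(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.on("--quiet") { options["quiet"] = true }
    # The Array acceptor splits a comma-separated value into a list.
    opts.on("--with GROUPS", Array) { |groups| options["with"] = groups }
  end
  parser.parse(argv)
  options
end

p parse_install_options(%w[--with development])
```

Only the flags the user actually passed show up as keys, which is why the hash above contains nothing but "with".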

Looking inside bundler/cli/install.rb we’ll find both the initialize and run methods within the first few lines. When the Install class is initialized the options hash from Thor is put into an instance variable for later use.

attr_reader :options
def initialize(options)
  @options = options
end

Looking at the run method now, lines 12 to 65 are mostly validation and setup, such as:

  • Checking if the user is running as root or user id 0
  • Checking for conflicting install options that can’t be used together
  • Setting RubyGems trust policies
  • Triggering Bundler plugin hooks.

Line 65 is notable: this is where we initialize the Bundler Definition.

definition = Bundler.definition

Bundler Definition

The Bundler Definition holds the state of the Gemfile and the Gemfile.lock. It also orchestrates all the other parts of Bundler like: sources, platforms, dependencies and groups. The Definition can be found in lib/bundler/definition.rb.

We’re not instantiating the Definition class directly; instead we’re calling a helper method. The implementation of Bundler.definition is in lib/bundler.rb on line 126.

def definition(unlock = nil)
  @definition = nil if unlock
  @definition ||= begin
    configure
    Definition.build(default_gemfile, default_lockfile, unlock)
  end
end

This method is a helper to generate the Definition from the Gemfile and Gemfile.lock. This operation is quite expensive so it’s cached inside an instance variable unless unlock is truthy. The unlock argument is set to true when Bundler needs to make a change to the Gemfile.lock, which happens when a version of a gem is being updated but we won’t be getting into details about that right now.
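The cache-unless-unlocked behaviour is a common Ruby memoization pattern. The sketch below fakes the expensive work with a counter so the caching is observable; TinyBundler and build_count are hypothetical names for this example only.

```ruby
# Hypothetical module showing the memoize-and-reset pattern.
module TinyBundler
  class << self
    attr_reader :build_count

    def definition(unlock = nil)
      @definition = nil if unlock          # a truthy unlock discards the cache
      @definition ||= begin
        @build_count = (@build_count || 0) + 1
        { unlock: unlock || {} }           # stand-in for the real Definition
      end
    end
  end
end

first  = TinyBundler.definition
second = TinyBundler.definition            # served from the cache
third  = TinyBundler.definition(true)      # rebuilt because unlock is truthy

puts first.equal?(second)
puts first.equal?(third)
puts TinyBundler.build_count
```

The second call returns the very same object as the first, while the unlocked call forces a rebuild.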

The configure method sets up the GEM_PATH and GEM_HOME environment variables.

To create the Definition, Bundler calls the `build` class method, which we can find in lib/bundler/definition.rb on line 27.

def self.build(gemfile, lockfile, unlock)
  unlock ||= {}
  gemfile = Pathname.new(gemfile).expand_path
  raise GemfileNotFound, "#{gemfile} not found" unless gemfile.file?
  Dsl.evaluate(gemfile, lockfile, unlock)
end

The build method checks that the Gemfile exists and then calls evaluate on another class called Dsl.

We will have a look at the DSL and come back to more about the definition later on.

DSL

Before I explain the DSL, did you know that the Gemfile is actually just a Ruby script? To prove this, we can write a Gemfile that calls Ruby’s puts method:

source 'https://rubygems.org'
gem 'rack'
puts "Hello World!"

When we run bundle install we see that “Hello World!” is printed to the console.

$ bundle install
Hello World!
Hello World!
Hello World!
Using rack 2.0.3
Using bundler 1.15.1
Bundle complete! 1 Gemfile dependency, 2 gems now installed.
Gems in the group deployment were not installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.

With the Gemfile being a Ruby script, methods called source, gem and group must exist somewhere.

The DSL, or Domain Specific Language, is the class that implements the source, gem, group and other methods found in the Gemfile. It is located at lib/bundler/dsl.rb. Note that the Gemfile.lock is not parsed by the DSL; it’s handled elsewhere and will be discussed later on.

The job of the DSL is to evaluate the Gemfile and build a Definition from it. It performs this process on line 9 in a class method called evaluate.

def self.evaluate(gemfile, lockfile, unlock)
builder = new
builder.eval_gemfile(gemfile)
builder.to_definition(lockfile, unlock)
end

This method performs 3 distinct operations: it initializes a new instance of the Bundler::Dsl class, evaluates the Gemfile on the new instance and creates a new Definition instance from the DSL. The implementations of these methods can be found on lines 23, 39 and 200 respectively.

I was curious to see how Bundler evals the Gemfile, and we can see on line 44 that it uses instance_eval. This means the Gemfile is evaluated in the context of the receiver, which is the newly created instance of Bundler::Dsl, so calls such as source and gem in the Gemfile resolve to methods on that object.

instance_eval(contents.dup.untaint, gemfile.to_s, 1)
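Here is a minimal sketch of the same technique. TinyDsl is a hypothetical, heavily simplified class that only records what it sees; it is not Bundler's implementation, but it shows how instance_eval lets plain method calls in a string land on a specific object.

```ruby
# Hypothetical, heavily simplified DSL class; it only records what it sees.
class TinyDsl
  attr_reader :sources, :gems

  def initialize
    @sources = []
    @gems = []
  end

  def source(url)
    @sources << url
  end

  def gem(name, *requirements)
    @gems << [name, requirements]
  end

  # The file name and line number are passed so that errors raised inside
  # the evaluated code point back at the Gemfile.
  def eval_gemfile(contents, path = "Gemfile")
    instance_eval(contents, path, 1)
    self
  end
end

gemfile = <<~GEMFILE
  source 'https://rubygems.org'
  gem 'rack', '~> 2.0'
  gem 'rspec', '~> 3.6'
GEMFILE

dsl = TinyDsl.new.eval_gemfile(gemfile)
p dsl.sources
p dsl.gems
```

Because the string is evaluated with the TinyDsl instance as self, the bare source and gem calls need no receiver.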

Once the DSL has finished evaluating our Gemfile we now have a Definition.

What happens when Bundler evals the Gemfile?

Let’s take our example Gemfile and have a look at what happens in the DSL when it’s being evaluated.

source 'https://rubygems.org'
gem 'rack', '~> 2.0'
gem 'rspec', '~> 3.6'

DSL source method

The source method specifies the remote location Bundler will use to fetch your gems. This is typically set to the RubyGems public repository: https://rubygems.org. We can find the source method on line 129.

def source(source, *args, &blk)
  options = args.last.is_a?(Hash) ? args.pop.dup : {}
  options = normalize_hash(options)
  source = normalize_source(source)

We can only specify a single endpoint for each `source`. It does take a hash of options as well but it’s only used for Bundler plugins which I won’t be discussing in this post.

There are 2 ways to use the source method in the Gemfile. First is the source found at the top of most Gemfiles like:

source 'https://rubygems.org'

This is referred to as the global source. The second way is to wrap a group of gems that typically are from a private gem server like:

source 'https://mygemserver.private' do
  gem 'mygem'
end

Looking inside the source method on line 149, we find:

check_primary_source_safety(@sources)
@sources.global_rubygems_source = source

Bundler does some quick validation to see if you’re using more than one global source and then sets the source to a property on an instance variable @sources.

@sources is initialized on line 25 and it holds an instance of a new class called SourceList.

@sources = SourceList.new

The SourceList just keeps hold of the sources specified in the Gemfile. It doesn’t do much else, but it becomes important during the Index and when we need to download the gem.

When we specify a source with a block, the source is added to the SourceList again, but not as the global source. The result is then passed, along with the block, to a method called with_source on line 146:

elsif block_given?
  with_source(@sources.add_rubygems_source("remotes" => source), &blk)

Looking at the with_source method on line 318:

def with_source(source)
  old_source = @source
  if block_given?
    @source = source
    yield
  end
  source
ensure
  @source = old_source
end

We save the current value of the @source instance var, set @source to the new source and yield the block. When the block finishes, the ensure clause restores the old value. For each gem in the block, @source is copied into the gem’s options on line 395 (unless the gem specifies its own source).

opts["source"] ||= @source
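The save, set, yield and restore dance is easier to see in isolation. SourceScope below is a hypothetical mini DSL that only tracks which source is currently active; the class and its gem method are assumptions made for this sketch.

```ruby
# Hypothetical mini DSL that only tracks which source is currently active.
class SourceScope
  attr_reader :gems

  def initialize
    @source = nil   # nil stands for "use the global source"
    @gems = {}
  end

  def with_source(source)
    old_source = @source
    @source = source
    yield
  ensure
    @source = old_source   # restored even if the block raises
  end

  def gem(name, opts = {})
    opts = opts.dup
    opts["source"] ||= @source   # an explicit source on the gem wins
    @gems[name] = opts["source"]
  end
end

scope = SourceScope.new
scope.gem("rack")
scope.with_source("https://mygemserver.private") do
  scope.gem("mygem")
  scope.gem("other", { "source" => "https://gemserver" })
end
scope.gem("rspec")

p scope.gems
```

Gems declared before and after the block are unaffected, and a gem that names its own source keeps it.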

DSL gem method

The gem method specifies a single library that your application depends on. You can specify as many gems as you want, but you can’t specify the same gem twice with different requirements.

We can find the implementation of the gem method on line 91.

def gem(name, *args)
  options = args.last.is_a?(Hash) ? args.pop.dup : {}
  version = args || [">= 0"]

The method signature and the few following lines tell us that the gem method takes a name, a list of requirements and an option hash. This gives us the option to specify a gem such as:

gem 'rack', '>= 1.6', '< 3', source: 'https://gemserver'
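The trick of peeling a trailing Hash off a splat can be tried on its own. split_gem_args is a hypothetical helper for this post, and applying the ">= 0" default when no requirement is given is an assumption about intent rather than a copy of the quoted line.

```ruby
# Hypothetical helper mirroring how a trailing Hash is peeled off *args.
def split_gem_args(name, *args)
  # If the last argument is a Hash, treat it as the options.
  options = args.last.is_a?(Hash) ? args.pop.dup : {}
  # Whatever is left is the list of version requirements.
  requirements = args.empty? ? [">= 0"] : args
  [name, requirements, options]
end

p split_gem_args("rack", ">= 1.6", "< 3", source: "https://gemserver")
p split_gem_args("rspec")
```

Because the method takes no keyword arguments, Ruby hands the trailing source: pair over as an ordinary Hash at the end of args.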

On line 97 we see the DSL creates a new object from a class called Dependency with the name, version and options we supplied:

dep = Dependency.new(name, version, options)

Dependency is Bundler’s representation of a gem; the class inherits from RubyGems’ Gem::Dependency. We can find it in lib/bundler/dependency.rb. Each dependency in Bundler holds information such as the name, requirements, platforms, groups, source, require options etc.
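The RubyGems parent class can be explored without Bundler at all. This short sketch builds a plain Gem::Dependency with two requirements; the gem name and version numbers are just examples, and none of the Bundler-specific fields (groups, source, platforms) exist on this parent class.

```ruby
require "rubygems"

# Gem::Dependency is the RubyGems class; name and requirements are examples.
dep = Gem::Dependency.new("rack", ">= 1.6", "< 3")

puts dep.name
puts dep.type   # dependencies are :runtime unless told otherwise

# Both requirements must hold for a version to be acceptable.
puts dep.requirement.satisfied_by?(Gem::Version.new("2.0.3"))
puts dep.requirement.satisfied_by?(Gem::Version.new("3.0.0"))
```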

Bundler then does some quick validation, i.e. making sure you’re not specifying the same dependency multiple times with different requirements, or specifying a gem that comes from 2 different sources.

Finally, Bundler pushes the dep object into the @dependencies instance var, which is an array that gets passed to the Definition.

DSL group method

The group method specifies a set of gems that you can selectively include or exclude from installation with the --with or --without options. We can find the implementation of the group method on line 218.

def group(*args, &blk)
  options = args.last.is_a?(Hash) ? args.pop.dup : {}
  normalize_group_options(options, args)

The group method takes a list of names, a hash of options and a block.

An example being:

group :test, :development do
  gem 'my_optional_gem'
end

You may be wondering what options group has. You may have heard of optional groups: an optional group is a group of gems that won’t be installed unless you specify the with option.

group :test, :optional => true do

end

The implementation of group is pretty simple: we concatenate the list of groups onto an instance var called @groups and then yield the block. Afterwards we pop each arg from @groups.

If you specified the optional option, the groups are also concatenated onto another instance var called @optional_groups, which is passed to the Definition.

The way groups and gems interact is through the yield. When we call a gem in a group block a copy of the @groups var is saved into the dep object inside the gem method. We can find this on line 353.

groups = @groups.dup
opts["group"] = opts.delete("groups") || opts["group"]
groups.concat Array(opts.delete("group"))

Groups are not passed into the Definition; instead, each gem records the groups it’s assigned to. Every gem has at least 1 group: by default a gem is assigned to the default group.
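The interaction between group blocks and gem calls can be sketched as a small stack. GroupScope is a hypothetical class for illustration; it takes optional as a keyword argument, which is a simplification of the options hash described earlier.

```ruby
# Hypothetical mini DSL that only tracks group membership.
class GroupScope
  attr_reader :gems, :optional_groups

  def initialize
    @groups = []
    @optional_groups = []
    @gems = {}
  end

  def group(*names, optional: false)
    @groups.concat(names)
    @optional_groups.concat(names) if optional
    yield
  ensure
    names.each { @groups.pop }   # leave the stack as it was before the block
  end

  def gem(name)
    groups = @groups.dup          # each gem keeps its own copy
    groups << :default if groups.empty?
    @gems[name] = groups
  end
end

scope = GroupScope.new
scope.gem("rack")
scope.group(:test, :development) { scope.gem("rspec") }
scope.group(:docs, optional: true) { scope.gem("my_optional_gem") }

p scope.gems
p scope.optional_groups
```

Each gem walks away with a snapshot of whatever groups were active when it was declared, so nothing else needs to remember the group blocks afterwards.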

DSL to Definition

Once the DSL has eval’d the Gemfile, it can create an instance of Bundler::Definition with the dependencies, optional groups, sources and Ruby version.

On line 214, the to_definition method implements this functionality. You’ll remember this is called in the evaluate method.

def to_definition(lockfile, unlock)
  Definition.new(lockfile, @dependencies, @sources, unlock, @ruby_version, @optional_groups)
end

Definition Ahoy!

Now that we have a definition we can continue on with the installation process. We’re going to jump back to lib/bundler/cli/install.rb on line 66.

definition.validate_runtime!
installer = Installer.install(Bundler.root, definition, options)

The first operation we perform on the definition is call `validate_runtime!`.

This method checks that the machine you’re running `bundle install` on is compatible with any declared Ruby versions, engines, engine versions or platforms. For example, in the Gemfile I can declare:

ruby "2.4.1"

This tells Bundler that the application depends on MRI 2.4.1, and Bundler will refuse to install gems unless your Ruby environment is Ruby 2.4.1.

You can also declare specific engines as well such as JRuby.

ruby "1.8.7", :engine => "jruby", :engine_version => "1.6.7"

You can read more about this in the Gemfile documentation on bundler.io.
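The kind of check described above can be sketched as a plain comparison. validate_ruby! and RubyVersionMismatch are hypothetical names invented for this example, and the sketch only does exact matching on the version and engine; it is a rough illustration of the idea, not Bundler's validation code.

```ruby
# Hypothetical error class and check, for illustration only.
class RubyVersionMismatch < StandardError; end

def validate_ruby!(declared, engine: "ruby", running: RUBY_VERSION, running_engine: RUBY_ENGINE)
  unless declared == running
    raise RubyVersionMismatch, "Gemfile wants Ruby #{declared}, running #{running}"
  end
  unless engine == running_engine
    raise RubyVersionMismatch, "Gemfile wants engine #{engine}, running #{running_engine}"
  end
  true
end

# The running version is passed in explicitly so the example behaves the
# same on any machine.
puts validate_ruby!("2.4.1", running: "2.4.1", running_engine: "ruby")

begin
  validate_ruby!("2.4.1", running: "2.3.0", running_engine: "ruby")
rescue RubyVersionMismatch => e
  puts e.message
end
```

Failing early like this means an incompatible environment is reported before any gems are fetched or installed.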

On the next line we’re introduced to a new class called the Installer. This class encapsulates the operations of parsing the definition, resolving versions and installing our gems, which we’ll explore in part 2.

installer = Installer.install(Bundler.root, definition, options)

Hope you’re enjoying what I’ve written so far, feel free to leave any feedback 👋