
The Enumerable module in Ruby: Part II

RubyCademy · Published in Tech - RubyCademy · 3 min read · Aug 6, 2018

In this article we’re going to explore the following topics:

  • The Enumerator::Generator class
  • The Enumerator::Yielder class
  • The Enumerator::Lazy class

NB: feel free to have a look at Part I if you’re unfamiliar with the Enumerable module in Ruby.

The Enumerator::Generator class

Another way to instantiate a new enumerator is to use the Enumerator::new method.

The Enumerator::new method returns an enumerator with an instance of Enumerator::Generator as the data source and the each method as the default data consumer.
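The original example is not reproduced here, so the following is only a minimal sketch of what such an enumerator might look like. The variable name e matches the e.map call discussed later in this article, but the yielded values 1, 2 and 3 are assumptions.

```ruby
# A hypothetical enumerator built with Enumerator::new.
# The block receives a yielder and pushes values into it.
e = Enumerator.new do |yielder|
  yielder << 1
  yielder << 2
  yielder << 3
end

# The data source is an Enumerator::Generator, consumed via #each by default.
# The string looks like "#<Enumerator: #<Enumerator::Generator:0x...>:each>".
description = e.inspect

# Any Enumerable method can then act as the data consumer.
doubled = e.map { |n| n * 2 } # => [2, 4, 6]
```

Nothing is computed when the enumerator is created; the block only runs once a consumer such as map asks for values.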

The built-in Enumerator::Generator class is not supposed to be directly manipulated by the developer. It’s an internal class whose instances have a very particular job.

This job is to save the block passed to the Enumerator::new method — as a Proc — in the returned instance of the Enumerator class.

Ruby then executes this Proc to build the data source that is handed to the data consumer method. For a call such as e.map {|n| ...}, these are the values received as the n argument of the block.

Note that Enumerator::Generator includes the Enumerable module, so it has to implement its own each method.
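This can be checked from a Ruby console. The snippet below is a small sketch for illustration only; instantiating Enumerator::Generator by hand is not something application code normally does.

```ruby
# Enumerator::Generator mixes in Enumerable...
includes_enumerable = Enumerator::Generator.include?(Enumerable) # => true

# ...and defines its own #each, which the Enumerable methods rely on.
each_owner = Enumerator::Generator.instance_method(:each).owner # => Enumerator::Generator

# Direct instantiation, shown here purely for illustration.
generator = Enumerator::Generator.new { |yielder| yielder << 1 << 2 }
values = generator.to_a # => [1, 2]
```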

NB: feel free to read the Proc and Lambda article if you’re unfamiliar with the Proc class in Ruby.

Now, let’s pay attention to the yielder argument of the Enumerator::new block.

The Enumerator::Yielder class

The yielder argument is an instance of the built-in Enumerator::Yielder class.

This class is in charge of building and serving the data source using the Enumerator::Yielder#yield and Enumerator::Yielder#<< methods.

The data source is built and served on the fly, one value for each call to these methods.

Let’s describe step by step what happens during the e.map iteration.

During the first iteration, the yielder yields the value 1 as the n argument of the block passed to the e.map method call.

Then the yielder yields the value 2 as the n argument of the block passed to the e.map method call.

So the data is built and served to the map enumeration for each iteration.
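The interleaving described above can be made visible by logging each step. This is a sketch under assumptions: the log array and the two yielded values are illustrative and not taken from the original example.

```ruby
log = []

e = Enumerator.new do |yielder|
  log << "yielder serves 1"
  yielder << 1
  log << "yielder serves 2"
  yielder << 2
end

result = e.map do |n|
  log << "map block receives #{n}"
  n * 2
end

result # => [2, 4]
log    # => ["yielder serves 1", "map block receives 1",
       #     "yielder serves 2", "map block receives 2"]
```

Each value reaches the map block before the next one is produced, which is what "built and served on the fly" means here.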

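The main difference between the Enumerator::Yielder#yield and Enumerator::Yielder#<< methods is their return value: #yield returns whatever the consumer hands back (nil in many cases, for example when the enumerator is consumed with next), whereas #<< returns the yielder itself, so calls can be chained.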
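The return values can be observed directly. In this sketch the enumerator is consumed with each and a block, in which case #yield hands back the result of that block, while #<< always hands back the yielder. The returned and received arrays are illustrative names.

```ruby
returned = []

e = Enumerator.new do |yielder|
  returned << yielder.yield(1)    # result of the consuming block
  returned << (yielder << 2 << 3) # the yielder itself, hence the chaining
end

received = []
e.each do |n|
  received << n
  n * 10
end

received          # => [1, 2, 3]
returned[0]       # => 10
returned[1].class # => Enumerator::Yielder
```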

The Enumerator::Lazy class

What happens when we want to chain enumerations?

(1..100_000).map {|n| n * 2}.first(10)

In the above example, first(10) cannot start enumerating until map {|n| n * 2} has returned its complete collection.

In total, there are about 100,010 iterations (100,000 for map, then 10 for first) just to get the first 10 values of the map enumeration.

From a performance standpoint, this is wasteful.
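Counting the block calls makes the cost concrete. The map_calls counter below is an illustrative addition, not part of the original example.

```ruby
map_calls = 0

# Eager chain: map builds the full 100_000-element array before first(10) runs.
first_ten = (1..100_000).map do |n|
  map_calls += 1
  n * 2
end.first(10)

first_ten # => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
map_calls # => 100000
```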

So how do we solve this performance issue? By using the Enumerator::Lazy class:

(1..100_000).lazy.map {|n| n * 2}.first(10)

The Enumerable#lazy method returns an instance of the Enumerator::Lazy class.

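This class inherits from Enumerator (and therefore includes the Enumerable module), but it redefines many of the Enumerable methods, such as map and select, so that they return a new lazy enumerator instead of an array.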
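A quick way to see this is to inspect the classes involved. The variable names in this sketch are illustrative.

```ruby
lazy_enum = (1..100_000).lazy
lazy_class = lazy_enum.class # => Enumerator::Lazy

# Lazy is an Enumerator subclass that still includes Enumerable...
lazy_is_enumerable = Enumerator::Lazy.include?(Enumerable) # => true

# ...but map is redefined on Enumerator::Lazy itself.
map_owner = Enumerator::Lazy.instance_method(:map).owner # => Enumerator::Lazy

# Calling map does not run the block yet; it returns another lazy enumerator.
mapped = lazy_enum.map { |n| n * 2 }
mapped_class = mapped.class # => Enumerator::Lazy

# Values are only computed when a consumer such as first asks for them.
first_three = mapped.first(3) # => [2, 4, 6]
```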

Let’s have a look at the output of a benchmark comparing the two approaches before detailing the concept of lazy enumeration:

$> ruby benchmark.rb
Warming up --------------------------------------
        Enumerations    16.000  i/100ms
   Lazy Enumerations    13.679k i/100ms
Calculating -------------------------------------
        Enumerations    162.247  (± 1.2%) i/s -    816.000  in   5.030263s
   Lazy Enumerations    142.999k (± 0.8%) i/s -    724.987k in   5.070187s

Comparison:
   Lazy Enumerations:   142999.2 i/s
        Enumerations:      162.2 i/s - 881.37x slower

As we can see, using a lazy enumeration instead of a normal one can be a huge gain in speed.
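The benchmark script itself is not reproduced in this article, and the output format above suggests it relied on a benchmarking gem. To try a rough comparison locally without any dependency, a sketch along these lines could be used. It swaps in plain wall-clock timing via Process.clock_gettime; the time_it helper and the run count of 20 are assumptions, and the numbers will differ from those above.

```ruby
# Hypothetical helper: runs the given block `runs` times
# and returns the elapsed wall-clock time in seconds.
def time_it(runs)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Both chains produce the same ten values.
eager_result = (1..100_000).map { |n| n * 2 }.first(10)
lazy_result  = (1..100_000).lazy.map { |n| n * 2 }.first(10)

eager_seconds = time_it(20) { (1..100_000).map { |n| n * 2 }.first(10) }
lazy_seconds  = time_it(20) { (1..100_000).lazy.map { |n| n * 2 }.first(10) }

puts "eager: #{eager_seconds.round(4)}s, lazy: #{lazy_seconds.round(4)}s"
```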

So, how does Enumerator::Lazy work?

(1..100_000).lazy.map {|n| n * 2}.first(10)

The Enumerator::Lazy instance returned by Enumerable#lazy receives the map call, which is handled by the Enumerator::Lazy#map method.

Like almost all the methods of this class, it behaves in a particular way. Each iteration follows this execution flow:

  • 1/ It fetches the next value of the 1..100_000 data source (1 on the first iteration) and yields it to the {|n| n * 2} block
  • 2/ It gets the return value of the block and passes it to the next enumeration (first(10)) through a yielder

This mechanism is called the enumeration chain.
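The order of execution in an enumeration chain can be observed by tracing block calls. This sketch uses a small range and two chained map calls, chosen only to keep the trace short; the trace arrays are illustrative.

```ruby
# Eager: each step finishes over the whole collection before the next starts.
eager_trace = []
(1..3).map { |n| eager_trace << "double #{n}"; n * 2 }
      .map { |n| eager_trace << "increment #{n}"; n + 1 }
      .first(2)

# Lazy: each value travels through the whole chain before the next is fetched.
lazy_trace = []
(1..3).lazy
      .map { |n| lazy_trace << "double #{n}"; n * 2 }
      .map { |n| lazy_trace << "increment #{n}"; n + 1 }
      .first(2)

eager_trace # => ["double 1", "double 2", "double 3",
            #     "increment 2", "increment 4", "increment 6"]
lazy_trace  # => ["double 1", "increment 2", "double 2", "increment 4"]
```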

OK, but why does this allow Ruby to avoid iterating through the entire collection?

In lazy enumeration, the final enumeration (first(10) in our case) is in charge of controlling how long the enumeration runs.

So it’s smart enough to say: “I’ve got enough data, so we can stop the enumeration chain”.
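Counting the calls to the map block shows the early stop in practice. As before, the map_calls counter is an illustrative addition.

```ruby
map_calls = 0

first_ten = (1..100_000).lazy.map do |n|
  map_calls += 1
  n * 2
end.first(10)

first_ten # => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
map_calls # => 10

# Because the chain stops as soon as first has enough values,
# the data source can even be endless.
from_endless = (1..Float::INFINITY).lazy.map { |n| n * 2 }.first(3) # => [2, 4, 6]
```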
