Javarevisited
Published in

Javarevisited

Eager is Easy, Lazy is Labyrinthine

From initialization to iteration, learning eager is easier than learning lazy.

java.util.* interfaces and classes (yellow) and custom collection interfaces (cyan) with eager iteration methods

The difference between eager and lazy

An eager algorithm executes immediately and returns a result. A lazy algorithm defers computation until it is necessary to execute and then produces a result.

Eager and lazy algorithms both have pros and cons. Eager algorithms are easier to understand and debug. They can also be highly optimized for a single use case (e.g. filter). Lazy algorithms sometimes result in less computation and if there are multiple steps in the computation(e.g.filter, map, reduce), there will be less temporary garbage created.

I usually prefer using eager algorithms by default, and lazy algorithms when I see an opportunity for an optimization. Both eager and lazy algorithms are useful so I always want both available in my toolkit.

Here’s a simple example showing eager initialization and lazy initialization.

Eager Initialization

class Someclass
{
private final List<String> strings = new ArrayList<>();

public List<String> getStrings()
{
return this.strings;
}
}

In the eager case, the List named strings is initialized immediately when an instance of SomeClass is created. This allows the variable strings to be defined as final as it will only ever be initialized when the instance of the of the class is created.

Lazy Initialization

class Someclass
{
private List<String> strings;

public List<String> getStrings()
{
if (this.strings == null)
{
this.strings = new ArrayList<>();
}
return this.strings;
}
}

In the lazy case, the List named strings is initialized only if the method getStrings is called. The computation required is deferred until the method is called. If the method is never called, then the extra computation is never required.

It’s Hard Work Being Lazy

The eager implementation of initialization is slightly less complicated than the lazy implementation shown above. In the case of eager and lazy iteration, the difference in complexity is much more pronounced.

The following code example shows a how a filter can be applied to a collection using an eager and lazy implementation. The lazy implementation is using Java Streams. The eager implementation is using a proof of concept collections framework with a type called MutableList that was shown in the diagram above. The filter method on MutableList applies a Predicate to each element of the list and returns a MutableList. The code for the POC collections framework is linked at the bottom of this blog.

@Test
public void filter()
{
MutableList<Integer> list = MutableList.of(1, 2, 3, 4, 5);

// eager filter method on MutableList
MutableList<Integer> eagerFilter =
list.filter(each -> each % 2 == 0);

// lazy filter method on java.util.stream.Stream
List<Integer> lazyFilter = list.stream()
.filter(each -> each % 2 == 0)
.collect(Collectors.toList());

var expected = List.of(2, 4);
Assert.assertEquals(expected, eagerFilter);
Assert.assertEquals(expected, lazyFilter);
}

Both of these examples should be easy enough to read, and have the same result as the test code illustrates. Both examples take a list of integers from 1 to 5 and filter the evens out, resulting in a list with 2 and 4. There is only one method call (filter) required for the eager implementation, compared to the four method calls (stream, filter, collect, toList) required for the lazy implementation.

The real complexity lies beneath the implementation code here and can be seen during debugging if we put a break point in the Predicate.

I will start by debugging the eager filter method. I will put a breakpoint on the lambda that tests if an integers is even, which should pause the execution for each integer in the list.

Debugging eager filter

Breakpoint on the Predicate for eager filter

When I debug the code this is the stack trace I see.

The Lambda code is at the top of the stack trace

If I step into filter method on MutableList, this is the code I see.

Debugging the implementation of the filter method on MutableList

This is easy to reason about and I have only one method to look into to understand how the filter method on MutableList works. I can see that the lambda is turned into a Predicate and the test method of the Predicate interface is called in an if statement inside of the for loop that is iterating over the list.

Debugging lazy filter

Now let’s debug the lazy filter method on Stream.

Breakpoint on the Predicate for lazy filter

When I debug the code this is the stack trace I see.

The Lambda code is at the top of the stack trace

Similar to when I debugged the eager code, the lambda code is at the top of the stack trace. However, I do not see the filter method on Stream in the stack trace. This is because filter returns a new Stream but does not execute the code in the lambda. The execution happens in the collect method because it is a terminal operation. If I step into the collect method this is what I see.

Debugging the implementation of the collect method on Stream

If I want to find the loop that is iterating through the elements of the list, I will need to step into the forEachRemaining method in the stack trace which is on ArrayListSpliterator inside of the ArrayList class.

Debugging forEachRemaining on ArrayListSpliterator in ArrayList

Lazy iteration is harder to understand than eager iteration. You will need to navigate a labyrinth of methods to follow the path of execution with Java Streams. Developers looking to understand how an algorithm works will most likely find it easier to understand an eager implementation first.

Understanding the Order of Things

If we stack several operations together, and use the peek method to output the current value of something, we can trace the order in which things are completed with both eager and lazy execution.

Tracing the order of execution of filter, map, reduce using eager and lazy algorithms

Output for Eager

filter: 1
filter: 2
filter: 3
filter: 4
filter: 5
map: 2
map: 4
reduce: 2
reduce: 4

Output for Lazy

stream filter: 1
stream filter: 2
stream map: 2
stream reduce: 2
stream filter: 3
stream filter: 4
stream map: 4
stream reduce: 4
stream filter: 5

Notice how the order of eager matches the order of the method calls. It goes filter, followed by map, and then by reduce. In the case of lazy, the order of methods is determined by the data. For each element that is matches in filter, map and reduce then are executed for that element.

The total number of executions for both eager and lazy are the same here: 9. The eager approach although easier to understand and reason about is generating two temporary MutableList instances (one for filter and one for map), where the lazy approach does not generate any temporary collections. This is one place where there might be a performance gain using the lazy approach, especially if the source MutableList is large.

The area where lazy really shines, is when short-circuiting can happen, reducing the total amount of work necessary.

Understanding the Effects of Short-circuiting

If we use a short-circuiting method like anyMatch, we can see some real potential performance benefits of using lazy iteration.

Output for Eager

filter: 1
filter: 2
filter: 3
filter: 4
filter: 5
map: 2
map: 4
anyMatch: 2
anyMatch: 4

Output for Lazy

stream filter: 1
stream filter: 2
stream map: 2
stream anyMatch: 2

This illustrates nicely where lazy iteration shines. The lazy iteration does not have to visit the entire collection. After all the hard work of implementation and understanding, we can benefit from reducing the total amount of work necessary by using lazy iteration.

Eager or Lazy? Why not both?

It makes sense to have eager implementations of algorithms directly on collections interfaces and to also have symmetric lazy implementations available on Streams. Having the eager iteration methods directly on the collections will make the code easier to learn, to teach and to debug. This lowers the cost for developers to build an understanding of iteration pattern implementations. The lazy implementations on Streams are a great performance optimization and are easy to move to with symmetric APIs on the collections interfaces like filter, map, reduce, etc. As with all performance optimizations, the code for lazy can be harder to understand and debug.

Further Information

The following blog explains eager, lazy, serial and parallel in more depth and compares the performance of different algorithms against large data sets.

If you would like to see the code I used for the custom collections, check out the Deck of Cards Kata in the GitHub repo here.

If you would like to learn more about the potential benefits of having eager methods on collections in Java today, then check out the following blog.

Have fun programming!

I am a Project Lead and Committer for the Eclipse Collections OSS project at the Eclipse Foundation. Eclipse Collections is open for contributions. If you like the library, you can let us know by starring it on GitHub.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store