Blog Series: The missing Java data structures no one ever told you about

Donald Raab
Published in Javarevisited
5 min read · Aug 30, 2021

The three part Eclipse Collections series all in one convenient place.

Photo by Jakob Køhn on Unsplash

Engineering when you need it

Seventeen years have passed since I created the first Java classes that would eventually become Eclipse Collections. In technology time, this might as well have been 10,000 years ago. The dominant operating system in 2004 was Windows XP, the most popular programming language was PHP and it would be five years until Oracle purchased Sun and became the steward of Java.

In 2004, I was building data structures and algorithms in a Java library that was ready for lambdas from year one. I might as well have been building an ark and hoping for a little rain. I would have to wait a full decade for Java to finally get lambdas. I did feel a little bit like Noah at times. Many others had deemed it a fool’s errand to build a Java collections framework whose algorithms needed lambdas and had to make do with anonymous inner classes in the interim.

I started and continued building components in Eclipse Collections because no one provided exactly what I needed at the time I needed it. The first things I ever needed in Eclipse Collections were small, memory-efficient List, Set and Map classes. The first classes I built are still in the library and have interfaces named FixedSizeList, FixedSizeSet and FixedSizeMap. You will still find funny sounding classes like SingletonList, DoubletonList, TripletonList, QuadrupletonList, QuintupletonList if you go digging. The classes pre-date the interfaces, as happens when you follow an evolutionary design. These original mutable classes later became the basis for their immutable equivalents, which now go up to size ten: ImmutableDecapletonList.

No one else had anything quite like this in 2004, so I built what I needed and moved on. These classes were exactly what I needed, when I needed them.

A lot more has been needed and built since then.

I wasn’t alone building the things that I needed

There have been well over a hundred contributors to Eclipse Collections over the years. Folks have added things to the library as they needed them, and the library has continued to grow and evolve to meet the continual needs of its supportive development community.

What we have arrived at today with the help of many developers is a comprehensive set of Java data structures and algorithms. You can find ten things that help differentiate Eclipse Collections in the following blog.

The Blog Series

I wrote a series of blogs to capture some data structures I felt were not common knowledge in the Java development community. I intentionally focused on data structures, and not algorithms. Eclipse Collections is an object-oriented library, so the algorithms are methods on the data structures. I have already written quite a lot about many of the algorithms that exist in Eclipse Collections like select, reject, collect, detect, groupBy, countBy, etc. Each blog in this series has code examples that demonstrate various algorithms available on the data structures.

This is a picture of the data structures I cover in the three-part blog series.

Mind map of the types covered in the blog series

Part 1 — If your only tool is a Map, look out for nulls

In part 1 of the blog series, I wrote about the following data structures in Eclipse Collections.

  • Interval
  • Bag
  • Multimap
  • HashingStrategy
  • BiMap
  • Pool

Bag and Multimap are great examples of Map-like types where you don’t have to worry about null values as you would if you used a Map to simulate these structures in Java.
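To make the null problem concrete, here is a minimal JDK-only sketch of what simulating a Bag and a Multimap with a plain Map looks like. The class and method names (MapSimulationSketch, countItems, occurrencesOf, groupByFirstLetter, valuesFor) are hypothetical and exist only for this illustration; none of this is Eclipse Collections code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a Bag and a Multimap simulated with java.util.Map.
class MapSimulationSketch {

    // "Bag" simulated as a Map from item to its number of occurrences.
    static Map<String, Integer> countItems(List<String> items) {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    // The null guard every caller has to remember: a missing key means zero.
    static int occurrencesOf(Map<String, Integer> counts, String item) {
        return counts.getOrDefault(item, 0);
    }

    // "Multimap" simulated as a Map from key to a List of values.
    // Assumes every item is a non-empty String.
    static Map<String, List<String>> groupByFirstLetter(List<String> items) {
        Map<String, List<String>> groups = new HashMap<>();
        for (String item : items) {
            groups.computeIfAbsent(item.substring(0, 1), key -> new ArrayList<>()).add(item);
        }
        return groups;
    }

    // The same guard again: a missing key means an empty list, not null.
    static List<String> valuesFor(Map<String, List<String>> groups, String key) {
        return groups.getOrDefault(key, List.of());
    }

    public static void main(String[] args) {
        List<String> fruit = List.of("apple", "banana", "apple", "cherry");

        Map<String, Integer> counts = countItems(fruit);
        System.out.println(counts.get("durian"));            // prints null
        System.out.println(occurrencesOf(counts, "durian")); // prints 0

        Map<String, List<String>> groups = groupByFirstLetter(fruit);
        System.out.println(groups.get("d"));                 // prints null
        System.out.println(valuesFor(groups, "d"));          // prints []
    }
}
```

Every raw get on the simulated structures can return null, so the guard has to be repeated wherever the Map is read. A dedicated Bag or Multimap type moves that decision inside the data structure.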

Part 2 — Prefer Internal Iterators to External Iterators

If you want to understand why we “prefer internal iterators to external iterators” in Eclipse Collections, and why you should too, then I highly recommend reading this part of the blog series. In part 2 of the blog series, I wrote about the following data structures in Eclipse Collections.

  • Synchronized Collections
  • MultiReader Collections

The MultiReader collections are the types that keep us on the path of using internal iterators as much as possible. The crosswalk should stop the traffic, regardless of whether the pedestrian remembered to press the button first.
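The crosswalk idea can be sketched with nothing but the JDK. The class below is a hypothetical illustration (GuardedListSketch is not an Eclipse Collections type and is not how the MultiReader collections are implemented): it assumes a list guarded by a ReentrantReadWriteLock that exposes only internal iteration, so the lock is always taken on the caller's behalf.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical sketch: a list that only offers internal iteration,
// so callers cannot forget to acquire the lock.
class GuardedListSketch<T> {
    private final List<T> items = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive write lock.
    void add(T item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Internal iterator: the collection drives the loop while holding
    // the shared read lock, then releases it when the loop is done.
    // The action must not call add on this list, because a read lock
    // cannot be upgraded to a write lock.
    void forEach(Consumer<? super T> action) {
        lock.readLock().lock();
        try {
            items.forEach(action);
        } finally {
            lock.readLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return items.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Deliberately no iterator() method: an external iterator would
    // escape the lock and leave the locking up to each caller.

    public static void main(String[] args) {
        GuardedListSketch<Integer> numbers = new GuardedListSketch<>();
        numbers.add(1);
        numbers.add(2);
        numbers.add(3);

        int[] total = {0};
        numbers.forEach(each -> total[0] += each);
        System.out.println(total[0]); // prints 6
    }
}
```

With an external iterator, every call site would have to remember to lock before looping. With an internal iterator, the locking lives in one place and cannot be skipped.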

Part 3 — Waiting for Valhalla

The punch line of this blog is that we couldn’t wait for Project Valhalla, and as a result, you don’t need to wait either. You have the ability today to stop generating garbage in the form of unnecessary boxes. Convenience has a cost, and in this case, you, the developer, will pay for it in your Java heaps when you use boxed collections. In part 3 of the blog series, I wrote about the following primitive data structures available in Eclipse Collections.

  • Primitive List
  • Primitive Set
  • Primitive Bag
  • Primitive Stack
  • Primitive Map
  • Primitive LazyIterable
  • Primitive Synchronized Collections
  • Primitive Unmodifiable Collections
  • Primitive Strings
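The boxing cost described above can be illustrated with a small JDK-only sketch. IntListSketch is a hypothetical class written for this example and is not the Eclipse Collections implementation; it assumes the simplest possible design, a growable int[], so that values are stored directly instead of as Integer objects.

```java
import java.util.Arrays;

// Hypothetical sketch: a growable list of int values backed by an int[].
// No Integer objects are created when adding, reading or summing.
class IntListSketch {
    private int[] values = new int[8];
    private int size;

    // Stores the primitive directly, doubling the array when it is full.
    void add(int value) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size * 2);
        }
        values[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return values[index];
    }

    int size() {
        return size;
    }

    // Sums into a long so that large lists do not overflow an int.
    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        IntListSketch numbers = new IntListSketch();
        for (int i = 0; i < 100; i++) {
            numbers.add(i);
        }
        System.out.println(numbers.size()); // prints 100
        System.out.println(numbers.sum());  // prints 4950
    }
}
```

A List<Integer> holding the same hundred values stores a reference per element, each pointing to an Integer object, where this sketch stores one int per element.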

Summary

It was an interesting experience writing this blog series. I didn’t realize how much I had forgotten about decisions that were made over a decade ago in the library. These decisions continue to provide a lot of value today, and guide the continued evolution of the framework. I’m happy I was able to take some of the “tribal knowledge” about the framework and get it written down in this blog series. I hope this knowledge will now find its way into the hands of a new generation of talented developers who can help take Eclipse Collections and the Java programming language forward into the future.

I am a Project Lead and Committer for the Eclipse Collections OSS project at the Eclipse Foundation. Eclipse Collections is open for contributions. If you like the library, you can let us know by starring it on GitHub.

Java Champion. Creator of the Eclipse Collections OSS Java library (https://github.com/eclipse/eclipse-collections). Inspired by Smalltalk. Opinions are my own.