How to collect a Java Stream into a primitive Collection

Donald Raab
Published in Javarevisited
8 min read · May 31, 2024

Discover a primitive collection bridge for Java Stream to avoid boxing.

Photo by Arisa Chattasa on Unsplash

Take the boxing gloves off

Boxed collection types in Java are types like Set<Integer> or List<Double>, where the primitive values like int and double in a collection have to be stored in a boxed wrapper like Integer and Double. Boxed wrappers and collection types in Java quietly eat up memory, waste time and take energy. The cost of a single boxed collection is usually insignificant. Unfortunately, boxed collections quietly pollute millions of Java heaps, possibly helping melt the ice out from under one or two unsuspecting polar bears.

Photo by Peter Neumann on Unsplash

I’ve blogged previously why I believe boxing in Java is evil, and some options we have as Java developers to remove the memory and performance inefficiencies from our applications and libraries. Please read the following blog if you want to understand why I think boxing is evil.

So how can we bridge the wonderful world of Java Stream and the efficient world of primitive collections in Java?

Let’s take our boxing gloves off and take a look at some solutions.

Java primitive Streams

Java provides Stream support for three primitive types. There are IntStream, LongStream, and DoubleStream types. We can transform from an Object Stream to one of these primitive Stream types using mapToInt, mapToLong, or mapToDouble. If we want to transform a primitive Stream into a Collection, we have limited options.
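Here is a minimal sketch of that transformation using only the JDK. The class and method names (MapToPrimitiveSketch, lengths) are made up for illustration.

```java
import java.util.List;
import java.util.stream.IntStream;

final class MapToPrimitiveSketch
{
    // Hypothetical helper: maps each String to its length as an unboxed int.
    static IntStream lengths(List<String> words)
    {
        return words.stream().mapToInt(String::length);
    }

    public static void main(String[] args)
    {
        List<String> words = List.of("boxing", "is", "evil");
        // sum() works directly on int values, so no Integer wrappers are needed
        System.out.println(lengths(words).sum()); // prints 12
    }
}
```

Once we are in an IntStream, terminal operations like sum, min, max, and average all stay in primitive land.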

The default and most used option we are left with is to collect primitive Streams back into boxed collections like List<Integer>. This is exactly what we are looking to avoid.

Another option is to convert a primitive Stream into a primitive array. Unfortunately, arrays in Java have no useful behavior. They can tell us their length and give us mutable access to their elements so we can loop over them or change them. We can rewrap int[], long[], and double[] arrays into primitive Streams by using the overloaded Arrays.stream() method.
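The two JDK-only exits from a primitive Stream described above can be sketched as follows. The class and method names are hypothetical and exist only to label each option.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class PrimitiveStreamExits
{
    // Option A: box every int into an Integer so a Collector can be used.
    static List<Integer> toBoxedList(IntStream stream)
    {
        return stream.boxed().collect(Collectors.toList());
    }

    // Option B: stay primitive, but end up with a bare array.
    static int[] toPrimitiveArray(IntStream stream)
    {
        return stream.toArray();
    }

    // Rewrap the array in a primitive Stream to get some behavior back.
    static int sumOfArray(int[] values)
    {
        return Arrays.stream(values).sum();
    }
}
```

Option A gives us a real Collection at the price of one wrapper object per element. Option B avoids the wrappers, but every operation beyond indexing means going back through a Stream or writing a loop.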

Both of these solutions also only work for int, long, and double. What do we do for boolean, byte, char, short, float? Are there solutions to mapping a Java Stream into a primitive Collection type that will support all eight Java primitives?

Yes, there are.

Call collect!

For those that may not have experienced the joy of calling someone collect... In the good ol’ days when public pay phones were the only way of calling someone if you needed something, we used to make collect calls. This was preferable to carrying around a lot of quarters, which were much better spent playing arcade machines. Calling collect meant asking an operator to bill the call to the person who would pick up the phone on the receiving end. That person would then have to accept the charges and the call would be added to their phone bill. I’m really not sure how we learned to give up this amazing convenience and saddle ourselves with cell phones instead.

Java developers have had a decade to work with Java Stream, and most will have encountered the inability to use a Collector with a primitive Stream. There are no primitive Collection types in Java, so there has been no need for primitive Collection Collectors.

If we want to map from a Java Stream and collect a primitive Collection, we have to accept a dependency on Eclipse Collections or another primitive collection library which will give us access to primitive Collections for all eight Java primitive types. Before we add this dependency, let’s see what we can get.

Option 1: Call the Three Musketeer collect

Photo by Aleksei Ieshkin on Unsplash

There is a Three Musketeer version of collect available on IntStream, LongStream, DoubleStream. The names of the Three Musketeers in the primitive collect method are:

  • Supplier
  • Obj(Int/Long/Double)Consumer
  • BiConsumer

These three Functional Interfaces are stand-in actors for the missing collect method on primitive Streams that would take a primitive Collector. The Three Musketeer version of collect can be used to create primitive collections in Eclipse Collections as we can see in the following examples.

IntStream collect to IntList

@Test
public void intStreamToIntList()
{
IntStream intStream =
IntStream.of(1, 2, 3, 4, 5);

// Three Musketeer collect
IntList ints =
intStream.collect(
IntLists.mutable::empty, // Supplier
MutableIntList::add, // ObjIntConsumer
MutableIntList::addAll); // BiConsumer

IntList expected =
IntLists.mutable.of(1, 2, 3, 4, 5);
Assertions.assertEquals(expected, ints);
}

LongStream collect to LongSet

@Test
public void longStreamToLongSet()
{
LongStream longStream =
LongStream.rangeClosed(1, 5);

// Three Musketeer collect
LongSet longs =
longStream.collect(
LongSets.mutable::empty, // Supplier
MutableLongSet::add, // ObjLongConsumer
MutableLongSet::addAll); // BiConsumer

LongSet expected =
LongSets.mutable.of(1L, 2L, 3L, 4L, 5L);
Assertions.assertEquals(expected, longs);
}

This approach works with DoubleStream as well, and there are primitive Lists, Sets, Bags in Eclipse Collections that can work with the Three Musketeer collect.
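To show the DoubleStream shape without pulling in any dependency, here is a sketch that uses a tiny hand-rolled container. TinyDoubleList is a hypothetical stand-in written for this example, not an Eclipse Collections type. It only exists to make the three roles visible.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

// Hypothetical stand-in for a primitive list; NOT an Eclipse Collections type.
final class TinyDoubleList
{
    private double[] items = new double[8];
    private int size;

    // Plays the ObjDoubleConsumer role: adds one unboxed double.
    void add(double value)
    {
        if (this.size == this.items.length)
        {
            this.items = Arrays.copyOf(this.items, this.size * 2);
        }
        this.items[this.size++] = value;
    }

    // Plays the BiConsumer role: merges a partial result from another thread.
    void addAll(TinyDoubleList other)
    {
        for (int i = 0; i < other.size; i++)
        {
            this.add(other.items[i]);
        }
    }

    double[] toArray()
    {
        return Arrays.copyOf(this.items, this.size);
    }

    static TinyDoubleList from(DoubleStream stream)
    {
        // Three Musketeer collect
        return stream.collect(
                TinyDoubleList::new,     // Supplier
                TinyDoubleList::add,     // ObjDoubleConsumer
                TinyDoubleList::addAll); // BiConsumer
    }
}
```

The BiConsumer is only called when the Stream is parallel, but collect requires it either way. With Eclipse Collections on the classpath, a MutableDoubleList would take the place of this toy class.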

But how can we collect the other five primitive types from a Java Stream?

Option 2: Use a primitive Collector with an Object Stream

Photo by Denley Photography on Unsplash

Wait! There are no primitive collection Collectors in Java!

Correct. There are no primitive collection Collectors in Java, because there are no primitive collections in Java. There are, however, primitive Collector instances available in Eclipse Collections. There is a utility class named Collectors2 which can help us build a bridge between Java Stream and primitive collections. We will see how we can use Collectors2 to convert from an Object Stream to any primitive collection type (Set, List, Bag) for any of the eight primitive Java types (boolean, byte, char, short, int, float, long, double).
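To get a feel for how a Collector can unbox on the way in, here is a rough JDK-only sketch built with Collector.of. This is an illustration of the general idea and is not how Collectors2 is implemented. The names IntCollectorSketch and collectIntToBitSet are made up, and BitSet is used only because it is a JDK class that can act as a set of small non-negative ints.

```java
import java.util.BitSet;
import java.util.function.ToIntFunction;
import java.util.stream.Collector;
import java.util.stream.Stream;

final class IntCollectorSketch
{
    // Hypothetical helper, not the Collectors2 implementation.
    // Each element is converted to an int by the function and stored
    // unboxed in a BitSet (negative values would throw).
    static <T> Collector<T, BitSet, BitSet> collectIntToBitSet(
            ToIntFunction<? super T> function)
    {
        return Collector.of(
                BitSet::new,                                          // Supplier
                (bits, each) -> bits.set(function.applyAsInt(each)), // accumulator
                (left, right) ->                                     // combiner
                {
                    left.or(right);
                    return left;
                });
    }

    public static void main(String[] args)
    {
        BitSet lengths = Stream.of("a", "bb", "ccc", "dd")
                .collect(collectIntToBitSet(String::length));
        System.out.println(lengths); // prints {1, 2, 3}
    }
}
```

The shape is the same as the Collectors2 methods used in the examples that follow: a function that extracts a primitive from each Object, plus a target container to put the primitives into.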

Let’s start with a Stream of some Object type and see how we can convert that Stream into various primitive collections for all eight primitive types.

Stream<String> collect to BooleanBag

This example converts a Stream of String into a BooleanBag and then counts the occurrences of true and false.

@Test
public void streamToBooleanBag()
{
Stream<String> stream =
Stream.of("true", "false", "true", "false", "true");

BooleanBag booleans = stream.collect(
Collectors2.collectBoolean(
Boolean::parseBoolean,
BooleanBags.mutable::empty));

Assertions.assertEquals(3, booleans.occurrencesOf(true));
Assertions.assertEquals(2, booleans.occurrencesOf(false));
}

Stream<String> collect to ByteList

This example converts a Stream of String into a ByteList.

@Test
public void streamToByteList()
{
Stream<String> stream = Stream.of("1", "2", "3", "4", "5");

ByteList bytes = stream.collect(
Collectors2.collectByte(
Byte::parseByte,
ByteLists.mutable::empty));

ByteList expected = ByteLists.mutable.with(
(byte) 1,
(byte) 2,
(byte) 3,
(byte) 4,
(byte) 5);
Assertions.assertEquals(expected, bytes);
}

Stream<Character> collect to CharSet

This example converts a Stream of Character instances to a CharSet.

@Test
public void streamToCharSet()
{
Stream<Character> stream =
Stream.of('a', 'b', 'c', 'd', 'e');

CharSet characters = stream.collect(
Collectors2.collectChar(
Character::charValue,
CharSets.mutable::empty));

CharSet expected =
CharSets.mutable.with('a', 'b', 'c', 'd', 'e');
Assertions.assertEquals(expected, characters);
}

Stream<Sh> collect to ShortBag

I created a Java record for this example named Sh. Sh was shorter than Short. Sh holds onto a short, which is cast in the constructor from int. Like byte, there is no short literal in Java, so we have to cast an int literal to a short. The Sh record allowed me to cast in one place.

@Test
public void streamToShortBag()
{
record Sh(short value)
{
public Sh(int intValue)
{
this((short) intValue);
}
}
Stream<Sh> stream =
Stream.of(new Sh(1), new Sh(2), new Sh(3), new Sh(4), new Sh(5));

ShortBag shorts = stream.collect(
Collectors2.collectShort(
Sh::value,
ShortBags.mutable::empty));

ShortBag expected =
ShortBags.mutable.with(
(short) 1,
(short) 2,
(short) 3,
(short) 4,
(short) 5);
Assertions.assertEquals(expected, shorts);
}

Stream<BigInteger> collect to IntList

I didn’t know there were four singleton values for BigInteger until I wrote this example test. Now we all know.

@Test
public void streamToIntList()
{
Stream<BigInteger> stream =
Stream.of(
BigInteger.ZERO,
BigInteger.ONE,
BigInteger.TWO,
BigInteger.TEN);

IntList ints = stream.collect(
Collectors2.collectInt(
BigInteger::intValueExact,
IntLists.mutable::empty));

IntList expected =
IntLists.mutable.with(0, 1, 2, 10);
Assertions.assertEquals(expected, ints);
}
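One detail worth knowing about intValueExact, the method used in the test above: it throws an ArithmeticException when the BigInteger does not fit in an int, where intValue would silently keep only the low-order 32 bits. A small sketch of the difference follows. The class and method names are made up for illustration.

```java
import java.math.BigInteger;

final class IntValueExactSketch
{
    // Hypothetical helper: reports whether a BigInteger fits in an int,
    // using the ArithmeticException thrown by intValueExact as the signal.
    static boolean fitsInInt(BigInteger value)
    {
        try
        {
            value.intValueExact();
            return true;
        }
        catch (ArithmeticException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        BigInteger tooBig = BigInteger.valueOf(1L << 32);
        System.out.println(fitsInInt(BigInteger.TEN)); // prints true
        System.out.println(fitsInInt(tooBig));         // prints false
        System.out.println(tooBig.intValue());         // prints 0
    }
}
```

Using the exact variant means a value that is too large fails loudly during the collect instead of landing in the IntList as a wrong number.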

Stream<BigDecimal> collect to FloatSet

This example converts from a Stream of BigDecimal to a FloatSet.

@Test
public void streamToFloatSet()
{
Stream<BigDecimal> stream =
Stream.of(
BigDecimal.ZERO,
BigDecimal.ONE,
BigDecimal.TWO,
BigDecimal.TEN);

FloatSet floats = stream.collect(
Collectors2.collectFloat(
BigDecimal::floatValue,
FloatSets.mutable::empty));

FloatSet expected =
FloatSets.mutable.with(0.0f, 1.0f, 2.0f, 10.0f);
Assertions.assertEquals(expected, floats);
}

Stream<LocalDate> collect to LongBag

This example converts from a Stream of LocalDate to a LongBag containing the LocalDate instances converted to their epochDay.

@Test
public void streamToLongBag()
{
LocalDate now = LocalDate.of(2024, Month.MAY, 31);
Stream<LocalDate> stream =
Stream.of(
now,
now.plusDays(1L),
now.plusDays(2L),
now.plusDays(3L));

LongBag longs = stream.collect(
Collectors2.collectLong(
LocalDate::toEpochDay,
LongBags.mutable::empty));

LongBag expected =
LongBags.mutable.with(19874L, 19875L, 19876L, 19877L);
Assertions.assertEquals(expected, longs);
}
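For comparison, here is a JDK-only sketch of the same conversion that stops at a long[] instead of a LongBag. The class and method names are made up for illustration.

```java
import java.time.LocalDate;
import java.time.Month;
import java.util.stream.Stream;

final class EpochDaySketch
{
    // Hypothetical helper: converts dates to epoch days as an unboxed long[].
    static long[] epochDays(Stream<LocalDate> dates)
    {
        return dates.mapToLong(LocalDate::toEpochDay).toArray();
    }

    public static void main(String[] args)
    {
        LocalDate start = LocalDate.of(2024, Month.MAY, 31);
        long[] days = epochDays(Stream.of(start, start.plusDays(1L)));
        System.out.println(days[0] + ", " + days[1]); // prints 19874, 19875
    }
}
```

The array avoids boxing, but it cannot answer a question like how many times a given day occurs. That is the kind of behavior a LongBag provides through occurrencesOf.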

Stream<BigDecimal> collect to DoubleList

This example converts from a Stream of BigDecimal to a DoubleList.

@Test
public void streamToDoubleList()
{
Stream<BigDecimal> stream =
Stream.of(
BigDecimal.ZERO,
BigDecimal.ONE,
BigDecimal.TWO,
BigDecimal.TEN);

DoubleList doubles = stream.collect(
Collectors2.collectDouble(
BigDecimal::doubleValue,
DoubleLists.mutable::empty));

DoubleList expected =
DoubleLists.mutable.with(0.0d, 1.0d, 2.0d, 10.0d);
Assertions.assertEquals(expected, doubles);
}

Efficiency is still our problem

I love Java, and can’t wait for Project Valhalla to fully land in future versions of the language. I waited ten years for concise lambda expressions, and have now been waiting for ten years for Project Valhalla. The problem for me and the applications I have worked on in Financial Services over the past 20 years is that I didn’t have time to wait for the Java language and standard library to evolve to solve the problems I was faced with. Much of the work my colleagues and I did was driven by memory pressure, and some by performance pressure. I won’t claim the work we did was part of some greater philanthropy to save the planet by getting rid of all instances of HashSet<Integer>. I do occasionally dream about getting rid of all instances of HashSet<Integer> in the world though. :)

The good news is that we don’t have to wait for Valhalla if we have a need for primitive collections in Java today, or just want to take up a more efficient coding style to do our part to reduce our global computing footprint. There are solutions available to help us write more efficient Java code when we want to. Some of these solutions have been in development for 20 years. There is a cost to learning new types, methods and adding a dependency on a third-party library. If there’s no benefit, then there’s no point in incurring the cost of the dependency. But we should think about it every time we type new HashSet<Integer>. When we type or see this code in our applications, I want us to think about the polar bears.

When Valhalla arrives, I hope all of us will learn about it and take the time to leverage it in our applications so we can take advantage of new efficiencies and collectively reduce our footprint and increase our application performance. Efficiency is our problem. Whether we adopt solutions available today or wait for Project Valhalla to write more efficient Collections code, the problem is ours. Perhaps Valhalla will introduce some efficiencies that will magically deliver memory and performance savings without us having to do any work. That will of course be very welcome!

Thank you for reading, and I hope you learned something new and useful!

I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions.


Java Champion. Creator of the Eclipse Collections OSS Java library (https://github.com/eclipse/eclipse-collections). Inspired by Smalltalk. Opinions are my own.