Reducing in Java
Even if you are already familiar with functional programming concepts, I find that using Stream#reduce is a little complicated. There are three different reduce methods on the interface, so I want to walk through some simple examples of each.
As defined in the Java documentation, reduction is an operation over an input sequence which results in a single result. It’s one of the three fundamental sequence operation (along with map and filter). In some languages, this operation is called a fold. In most use cases, “reduction” is semantically correct, but it is also possible to “reduce” a collection into a bigger collection.
One way to think of reduction is that you are going to go over each element of a sequence and take an action against that element with only the previous result to compare against.
Just a warning, I’m going to use contrived and simplistic examples. Several of these examples reproduce JDK API methods, and those should be used in actual code.
For each of the reduce methods, you will need a BinaryOperator function. This is any function which takes two arguments of the same type and returns an object of that same type. So, you can take two Integer values and return an Integer. The BinaryOperator interface exposes two static methods which demonstrate common use cases: minBy and maxBy. These both take a Comparator to produce a BinaryOperator which will take two values, compare them, then return the lesser or greater of the two values.
So, for every reduction, you are going to use a BinaryOperator to reduce the values into a single, final result.
Let’s start with the two-argument reduce.
T reduce(T identity, BinaryOperator<T> accumulator)
Performs a reduction on the elements of this stream, using the provided identity value and an associative accumulation function, and returns the reduced value.
I find this to be the most straightforward to use, because it closely maps to several mathematical concepts you are already familiar with: sum, product, maximum. Yeah, there are built-in methods for (two of) these already, but it lays out the concept well.
The accumulator is a BinaryOperator, as discussed above. The other important concept here is the “identity”. An identity function is a function which always returns the input parameter. Think of this as your starting value, should the sequence be empty or only have one value. So, with addition, you can add zero to any number to get itself. For multiplication, this is the number one. In the maximum case, you need to choose a sensible default — you could set it to 0 or 1 if your use-case prohibits negative numbers.
The last point is important: a sensible default. This model should guide you away from having a null value. Even if you have an empty collection, there will be a result.
Optional<T> reduce(BinaryOperator<T> accumulator)
Performs a reduction on the elements of this stream, using an associative accumulation function, and returns an Optional describing the reduced value, if any.
The single-argument method has no “identity” value, so it has to return an Optional to indicate that potentially no value will be accumulated from the reduction. You can think of this version as setting the default based upon the first value in the collection. An empty collection will return an empty Optional. A collection with only one element will return that element. For this reason, the single-argument reduce may be a better approach to minimum and maximum.
As you can see from the first case, instead of encoding a default value as an identity, you can use the Optional to provide it.
And remember that a BinaryOperator is any function which takes two arguments of the same type and returns that type. It means you can do the following:
The first example gives you 3 to the power of 3 to the power of 3. Look at the second example and think about what it might do. This example is completely useless.
Before covering the three-argument reduce method, we should go over BiFunction. This interface takes two arguments of a different type and returns a value, potentially of a third type. The restriction laid out for reduce, however, requires that the return type be the same as the first argument. What this amounts to is that you must take as input the element type of the sequence, but can return a different kind of object at the end.
<U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner)
Performs a reduction on the elements of this stream, using the provided identity, accumulation and combining functions.
The presence of the BiFunction means you are going to reduce over the collection in order to produce a result of a different type. This can be a useful tool when you want to reduce over a collection to produce a different kind of collection.
This example is derived from some legacy code at work, minimally adapted to show the technique. He we have a collection of Throttles, and we use them (lines 13–16) to reduce over a collection of objects those throttles can be applied to. We begin (the “identity”) with the collection we wish to filter. The “accumulator” takes each throttle and uses it to divide the initial collection into allowed and blocked. The “combiner” just keeps the most recent results, passing it along to the next iteration of the accumulator. The result is that once something is blocked, it is excluded from the next round. At the end, we should be left with a collection of Potential objects which pass all the Throttles. Our final result includes those which are allowed and those which are blocked.
Like I warned at the beginning, these are contrived examples. This example for the three-argument reduce is definitely not to be reproduced for this exact situation. I say this because Functions (or in this case, Predicates) compose. You can already combine them with Function#andThen, Function#combine, Predicate#and, or Predicate#or. If you wanted to chain several filters together, you would just compose the Predicates and then use a Collectors#groupingBy to divide them into two collections.