Reading Apache Beam Programming Guide — 4. Transforms (Part 2)
We have discussed Transforms Part 1 in the previous blog post,
To continue our discussion about Core Beam Transforms, we are going to focus these three transforms:Combine, Flatten, Partition
this time.
Core Beam transforms
We are going to continue to use the Marvel dataset to get stream data.
4. Combine
Combine
is a Beam transform for combining collections of elements or values in your data.
The use of combine is to perform “reduce” like functionality. There are numeric combination operations such as sum, min, and max already provide by Beam, if you need to write some complex logic, you would need to extend the classCombineFn
.
Let’s try a simple example with Combine
. If we want to sum the average players’ SkillRate per fight, we can do something very straightforward.
SumInts
static class SumDoubles implements SerializableFunction<Iterable<Double>, Double> {
@Override
public Double apply(Iterable<Double> fights) {
double sum = 0.0;
for (Double fightPoint: fights){
sum += fightPoint;
}
return sum;
}
}
You can apply it by calling the following. Since we need to write out using custom windowing, since this is a non-global windowing function, we need to call .withoutDefaults()
explicitly.