Introduction to Java 8 Parallel Stream — Java2Blog

Arpit Mandliya
Published in Javarevisited
4 min read, Apr 23, 2019

In this post, we will look at parallel streams in Java.

Java 8 introduced parallel streams for parallel processing. Since multi-core CPUs are now common thanks to cheap hardware, parallel processing can be used to perform operations faster.

Let’s understand it with the help of a simple example.
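The original listing did not survive in this copy of the post. As a stand-in, here is a minimal sketch of a program that prints output in the same shape as the one shown in this section. The class name ParallelStreamDemo and the helpers describe and banner are assumptions made for illustration; this is not the author's original code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical reconstruction; names are illustrative, not from the original post.
class ParallelStreamDemo {

    // Formats an element together with the name of the thread that handled it.
    static String describe(int number) {
        return number + " " + Thread.currentThread().getName();
    }

    // Prints a section header in the same style as the sample output.
    static void banner(String title) {
        System.out.println("=================================");
        System.out.println(title);
        System.out.println("=================================");
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        banner("Using Sequential Stream");
        // Every element is handled by the calling thread, in encounter order.
        numbers.stream()
               .forEach(n -> System.out.println(describe(n)));

        banner("Using Parallel Stream");
        // Elements may be handled by different threads, in no fixed order.
        numbers.parallelStream()
               .forEach(n -> System.out.println(describe(n)));
    }
}
```

The order of lines and the worker names in the parallel section will likely differ from run to run and from machine to machine.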

When you run the program, you will get output similar to the following:

=================================
Using Sequential Stream
=================================
1 main
2 main
3 main
4 main
5 main
6 main
7 main
8 main
9 main
10 main
=================================
Using Parallel Stream
=================================
7 main
6 ForkJoinPool.commonPool-worker-3
3 ForkJoinPool.commonPool-worker-1
9 ForkJoinPool.commonPool-worker-2
2 ForkJoinPool.commonPool-worker-3
5 ForkJoinPool.commonPool-worker-1
10 ForkJoinPool.commonPool-worker-2
1 ForkJoinPool.commonPool-worker-3
8 ForkJoinPool.commonPool-worker-2
4 ForkJoinPool.commonPool-worker-1

As the output shows, the main thread does all the work in the sequential stream. It finishes processing one element and only then moves on to the next.

In the parallel stream, four threads share the work: the main thread and three worker threads. Internally, parallel streams use the Fork/Join framework to manage threads, and by default they run on the shared pool returned by the static ForkJoinPool.commonPool() method.

A parallel stream can take advantage of all available CPU cores and process tasks in parallel. If the number of tasks exceeds the number of cores, the remaining tasks wait until a currently running task completes.
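To see why a four-core machine shows three workers plus main, it can help to print the common pool's target parallelism. By default it is one less than the number of available processors, because the calling thread also takes part in the work; it can be overridden with the java.util.concurrent.ForkJoinPool.common.parallelism system property. The class name CommonPoolInfo below is an assumption for illustration.

```java
import java.util.concurrent.ForkJoinPool;

// Illustrative helper; the class name is hypothetical.
class CommonPoolInfo {

    // Summarises the core count and the common pool's target parallelism.
    static String describe() {
        int cores = Runtime.getRuntime().availableProcessors();
        int parallelism = ForkJoinPool.commonPool().getParallelism();
        return "cores=" + cores + ", commonPoolParallelism=" + parallelism;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The printed values depend on the machine and on any JVM options, so no fixed output is shown here.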

Parallel streams are cool, so should you always use them?

A big No!!
It is easy to convert a sequential stream to a parallel stream just by adding .parallel(), but that does not mean you should always do it.
There are many factors to consider when using parallel streams; ignore them and you may end up with worse performance or incorrect results.

A parallel stream has much higher overhead than a sequential stream, and coordinating between threads takes a good amount of time.
Consider a parallel stream only when the following conditions hold:

  • You have a large dataset to process.
  • Your source is easy to split. Java uses ForkJoinPool to achieve parallelism: the source is split into chunks, and the chunks are submitted to the pool for execution.
    For example:
    An ArrayList is very easy to split, because we can find the middle element by its index and split there. A LinkedList is much harder to split and does not perform well in most cases.
  • You are actually suffering from performance issues.
  • All resources shared between threads are properly synchronized; otherwise the stream might produce unexpected results.
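The last point is the easiest one to get wrong. As a minimal sketch (the class name SafeParallelCollect and the squares method are assumptions for illustration), the example below gathers results with collect instead of mutating a shared list, so no explicit synchronization is needed.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustrative sketch; names are hypothetical.
class SafeParallelCollect {

    // Unsafe pattern (do not use): a plain ArrayList mutated from forEach on a
    // parallel stream may lose elements or throw, because ArrayList is not
    // thread-safe.
    //   List<Integer> out = new ArrayList<>();
    //   source.parallelStream().forEach(n -> out.add(n * n));

    // Safe pattern: let the stream combine per-thread results itself.
    static List<Integer> squares(List<Integer> source) {
        return source.parallelStream()
                     .map(n -> n * n)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.rangeClosed(1, 1_000)
                                        .boxed()
                                        .collect(Collectors.toList());
        List<Integer> result = squares(source);
        System.out.println(result.size());    // prints 1000
        System.out.println(result.get(999));  // prints 1000000
    }
}
```

Because the source list is ordered, collect keeps the encounter order even though the mapping runs on several threads.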

A simple rule of thumb for judging whether parallelism is likely to pay off is the “NQ” model, as described by Brian Goetz in his presentation.

NQ Model:

N x Q > 10000

where,
N = number of items in the dataset
Q = amount of work per item

This means that if you have a large number of items and little work per item (for example, a sum), parallelism might help your program run faster. The reverse is also true: if you have a small number of items but a lot of work per item (some heavy computation), parallelism might still help you get results faster.
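The rule can be written down as a one-line check. This is only a sketch of the heuristic: the class name NqModel, the method name, and the idea of expressing Q in arbitrary relative cost units are assumptions for illustration, and real decisions should be backed by measurement.

```java
// Illustrative sketch of the NQ rule of thumb; names are hypothetical.
class NqModel {

    // Threshold quoted in the text; treat it as a rough guide, not a hard limit.
    static final long THRESHOLD = 10_000L;

    // n = number of items, q = estimated work per item in arbitrary relative units.
    static boolean mightBenefitFromParallel(long n, long q) {
        return n * q > THRESHOLD;
    }

    public static void main(String[] args) {
        // Many items, cheap operation such as a sum.
        System.out.println(mightBenefitFromParallel(1_000_000, 1)); // prints true
        // Few items, cheap operation.
        System.out.println(mightBenefitFromParallel(100, 1));       // prints false
        // Few items, expensive operation per item.
        System.out.println(mightBenefitFromParallel(100, 1_000));   // prints true
    }
}
```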

Let’s look at another example.

In this example, we will see how the CPU behaves when you perform long computations with a sequential stream and with a parallel stream. We do some arbitrary calculations to keep the CPU busy.
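The original listing for this example is also missing from this copy of the post. The sketch below follows the pipeline shown later in this section (sqrt, then performComputation, then a sum), but the body of performComputation, the data size, and the class name CpuBoundStreamDemo are assumptions. It is scaled down to finish quickly, so it will not reproduce the result or the timings reported in this post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical reconstruction; not the author's original program.
class CpuBoundStreamDemo {

    // Stand-in for the post's performComputation; the real body is not shown.
    // It only burns CPU cycles and returns a small deterministic value.
    static int performComputation(int number) {
        int sum = 0;
        for (int i = 1; i <= 20_000; i++) {
            sum += (number / i) % 7;
        }
        return sum;
    }

    // Runs the same pipeline either sequentially or in parallel.
    static long compute(List<Integer> data, boolean parallel) {
        Stream<Integer> stream = parallel ? data.parallelStream() : data.stream();
        return stream.map(i -> (int) Math.sqrt(i))
                     .map(CpuBoundStreamDemo::performComputation)
                     .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= 2_000; i++) {
            data.add(i);
        }

        long start = System.nanoTime();
        long sequential = compute(data, false);
        long sequentialMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        long parallel = compute(data, true);
        long parallelMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println(sequential + " (sequential, " + sequentialMs + " ms)");
        System.out.println(parallel + " (parallel, " + parallelMs + " ms)");
    }
}
```

Both calls return the same sum; only the elapsed time and the CPU usage differ. On a workload this small the parallel run may not be faster, which is consistent with the overhead discussed above.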

When you run the program with a sequential stream, you will get the following output.

117612733
Time taken to complete:6 minutes

Here we are not interested in the output itself, but in how the CPU behaved while the operation was running.

With the sequential stream, the CPU is not fully utilized.

Now let’s make the stream parallel by adding .parallel() to the pipeline, and run the program again.

long sum = data.stream()
        .parallel()
        .map(i -> (int) Math.sqrt(i))
        .map(number -> performComputation(number))
        .reduce(0, Integer::sum);

You will get the following output when you run the stream in parallel.

117612733
Time taken to complete:3 minutes

Let’s check the CPU history for the run that used a parallel stream.

This time, the parallel stream used all 4 CPU cores to perform the computation.

That’s all about parallel streams in Java.


Originally published at https://java2blog.com on April 23, 2019.
