Parallel processing in Scala — Monix vs Future vs standard parallel collections

6 min readOct 16, 2019

In this blog-post we will see how one can execute a collection of things in parallel, compare Monix with Scala Future and with the parallel collections from the standard Scala library in performance, in types clarity, and in ease of use. The examples given will not involve any IO calls (calling DBs, external APIs, etc.). From the other libraries I took a look at ZIO and Cats. For ZIO there was very limited information about parallel processing on their website, and for Cats I was not able to find a way of parallel processing a collection of elements, so I didn’t include the two in the comparison. I will first start with the Scala parallel collections.

Standard library parallel collections

In order to perform parallel execution on elements of a collection, it needs to extend Parallelisable, which has a method par. One such common collection is GenIterable. We can then do this:

val res: ParIterable[X] = iterable.par.map(timeConsumingOperation)

The par method returns ParIterable, so it is easier for us to reason when trying to optimise complex program for performance. For example if we want to map over res we can be sure that it will happen in parallel, or if we want to do it sequentially we can converting the ParIterable back to Iterable by calling res.seq.

In addition to applying an operation to all elements in parallel, among other things, we can also fold the elements in parallel, given that the folding operation is associative and there is a zero element (in other words the elements of the collection need to satisfy the laws of a Monoid). The syntax is as simple as was the mapping:

iterable.par.fold(zero)(_ + _)

Scala Future

Future is mainly used for asynchronous programming, but it can also serve the purpose for parallel processing. We can run a collection of elements in parallel by executing the processing of each element wrapped inside a Future, and then forking back each Future with the method seq:

iterable.map(x => Future(timeConsumingOperation(x))).seq

The order in the output sequence is preserved but the effects are not ordered.
Based on the same idea we can also fold elements in parallel:

iterable.map(x => Future(timeConsumingOperation(x)))
    .foldLeft(Future.successful(zero)){(accF, elemF) => {
         accF.zip(elemF).map(tuple => tuple._1 + tuple._2)  
    }
)

Monix Task

Now let’s take a look at Monix Task. Task is a data type mainly used for controlling possibly lazy & asynchronous computations, but it can also be used for parallel processing. In short, Task[A] represents a description of an execution (asynchronous, forked, or on the same thread) that either returns A or throws an error when executed (nothing happens until we call some of its methods for running it). Now let's say we have an Iterable and want to perform in parallel the same timeConsumingOperation as above. We can do that as follows:

val res: Task[Iterable[X]] = 
Task.wander(iterable)(_ => Task.eval(timeConsumingOperation()))

, where Task.eval is the equivalent of Function0, taking a function that will always be evaluated on runToFuture, possibly on the same thread. And Task.wander takes a Seq[A], f: A => Task[B] and returns a Task[Seq[B]], similar to what seq does for Future. It applies f to each element in the sequence transforming it into Task and then collecting results with preserved order.

It is certainly more difficult to achieve parallelism with Task or Future compared to Parallelisable, but we have better insight about how parallelism is achieved, and we have more power to customize the execution. For example we can provide a different Scheduler, or apply wanderN with which we can specify the number of tasks running in parallel. Another difference is that Task and Future lack in type expressiveness compared to ParIterable. The parallel execution with Task is described by Iterable[Task[X]], which is the same type used to express non-parallel execution. If we want to map over the tasks, we don't know if it will happen in parallel or sequentially, where the same was obvious that will happen in parallel with ParIterable. Future suffers from the same downside.

Then I explored the options for parallel folding. I was not able to find any method in the Task interface for that purpose, although there is a methods map2 and parMap2, so we could implement it ourselves using those, or simply start the tasks one by one, converting them into a sequence of futures, and then using the same folding as for futures.

Monix Observable

Monix provides one more way of applying parallel execution on elements of a collection, that is with Observable. Conceptually Observable represents an asynchronous stream of events. The syntax and the type are simpler than Task:

val res: Observable[X] = Observable.fromIterable(iterable)
    .mapParallelOrdered(8)(_ => Task.eval(timeConsumingOperation()))

Also for consecutive mappings it is simpler too:

res.mapParallelOrdered(16)(_ => Task.eval(timeConsumingOperation()))

Similarly to Task, the type doesn't tell us how the observable is executed, whether sequentially or in parallel. Unfortunately the performance is worse compared to the solution with Task, so I exclude it from the comparison. Unlike Task, Observable does have a method fold, but it only works sequentially.

Performance comparison

For evaluating the performance I used Scalameter. I configured scalameter to run 10 tests per configuration and to take the average of the 10 measurements. I used 24 different configurations derived by combining 5 collection sizes (10, 100, 1000, 10 000, and 50 000 elements) with 5 execution times (0.0001 ms, 0.004 ms, 0.03 ms, 0.1 ms, and 1 ms) of a monadic operation. The Scala version is 2.12.7 and the Monix version is 3.0.0.

The following diagram shows which of the three is most performant for each configuration:

These were the exact execution times:

The measurements tell us that Scala Future and Parallel collections don’t match the performance of Monix Task, which is most performant in 16 configurations. On average it was 39% better than Future and 82% better than ParIterable. If we examine the numbers close, we can see that ParIterable is optimized for huge collections, while Future is fastest for small collections. They both suffer for very short operations, and the three reach almost equal performance on long operations of 1 second.

Conclusion

Based on what was explained I would say it makes sense to use Monix Task for parallel processing relatively small collections and when the processing includes some IO operations. In those cases I think with Monix you could achieve better performance, as well as referentially transparency, which among other things make it easier to control from which point in your program the things start running. Otherwise I would choose the parallel collections from the standard Scala library for their simplicity and good performance. The following table summarizes the trade off:

+-----------------+----------------+----------------+--------------+
|                 | Monix Task     | ParIterable    | Future       |
+-----------------+----------------+----------------+--------------+
| Performance     | balanced and   | better for     | better for   |
|                 | in overall     | large          | small        |
|                 | fastest        | collections    | collections  |
+-----------------+----------------+----------------+--------------+
| Parallel fold   | not out of     | yes            | not out of   |
|                 | the box        |                | the box      |
+-----------------+----------------+----------------+--------------+
| Type safety     | yes            | yes            | yes          |
+-----------------+----------------+----------------+--------------+
| Parallelisation | no separate    | separate type  | no separate  |
| transparency    | type           |                | type         |
+-----------------+----------------+----------------+--------------+
| Execution       | scheduler,     | none           | execution    |
| customisation   | number of      |                | context      |
|                 | parallelism    |                |              |
+-----------------+----------------+----------------+--------------+
| Code complexity | semi-complex   | simple         | semi-complex |
+-----------------+----------------+----------------+--------------+
| Referentially   | lazy or eager, | eager,         | eager,       |
| transparency    | effect type    | no effect type | effect type  |
+-----------------+----------------+----------------+--------------+
| Asynchronous    | yes            | no             | yes          |
+-----------------+----------------+----------------+--------------+