MuleSoft — Scatter Gather tips and tricks

Arun Dutta
The Mule Blog
Published in
4 min read · May 24, 2020

Scatter Gather is a well-known routing event processor used to run multiple pieces of business logic in parallel on the same payload and then aggregate the individually processed Mule events into a new Mule event, which is passed on to the next component in the flow. It is an extremely useful component when you want to process a Mule event simultaneously across multiple processors and there is no inter-dependency between the responses coming from them.

While building several different workflows with the Scatter Gather component, I came across some unique issues, which I will discuss in this article.

Input Serialization issue

Problem

Usually, passing a Mule event to the Scatter Gather component is straightforward: the component sends a reference to the event into each of its parallel routes. Under certain situations, however, Scatter Gather may be unable to accept the incoming data from the previous processor, especially if that component is an SFDC or SAP connector pulling data from an external cloud SaaS provider.

The above situation arises because the retrieved data is in a non-serializable binary form, while Scatter Gather needs serialized or streamable data to get past it. The flow below will start throwing errors at runtime.

Raw SFDC binary data will throw error when directly passed to Scatter Gather component

The SFDC connector version used is 9.7.10.
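As a rough sketch, the failing setup looks like the following (the flow name, config reference, and query are illustrative, not from the original flow):

```xml
<flow name="sfdc-scatter-gather-flow">
    <!-- The Salesforce query operation returns the payload in a
         non-serializable binary/iterator form on older connector versions -->
    <salesforce:query config-ref="Salesforce_Config">
        <salesforce:salesforce-query>
            SELECT Id, Name FROM Account
        </salesforce:salesforce-query>
    </salesforce:query>
    <!-- Passing that payload straight into Scatter Gather fails at runtime -->
    <scatter-gather>
        <route>
            <!-- route 1 processors -->
        </route>
        <route>
            <!-- route 2 processors -->
        </route>
    </scatter-gather>
</flow>
```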

Solution

To solve this, you just need to add a DataWeave script that serializes the data before it enters the Scatter Gather, and then de-serialize it once the Mule event is inside a route (otherwise the downstream processors will not be able to parse the data, since they expect the binary format).

Corresponding DWL scripts

I have used JSON serialization, but any other MIME type should also work.

Serialization and De-serialization components added
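The two transformations can be sketched in DataWeave roughly as follows (the MIME types are illustrative; any serializable format should work, as noted above):

```dataweave
// Transform Message BEFORE the Scatter Gather:
// serialize the binary payload into JSON
%dw 2.0
output application/json
---
payload
```

```dataweave
// Transform Message at the start of EACH route:
// de-serialize back to Java objects for the downstream processors
%dw 2.0
output application/java
---
payload
```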

There is one drawback to the above technique: serializing the data also loads it entirely in memory, bypassing the very concept of repeatable concurrent streaming (we will cover streaming in a future tutorial). If the data volume is large, this may impact performance, so trade-offs may be required in terms of runtime memory requirements or the feasibility of using Scatter-Gather at all.

The above solution is tried and tested on Mule Runtimes 4.2.1, 4.2.2, and 4.3.0.

The issue I faced was also fixed by upgrading the SFDC connector to version 10.1, in which case the serialization transformation is no longer required, avoiding the drawback mentioned above.

There were also known issues with streaming related to the Scatter Gather component in Mule 4.2.1; check the references section.

Scatter Gather Concurrency control

There are certain situations in which we can make Scatter Gather bypass its normal concurrent processing and handle data serially. From personal experience, I found this useful when debugging Mule flows locally; you would not normally want this behavior deployed in production.

Solution 1

Set the maxConcurrency attribute to 1. This forces the Mule events to be executed across the routes sequentially.

Max Concurrency set to 1

Always ensure that you have at least two routes in your Scatter Gather block, or else Anypoint Studio will complain :)
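In the underlying XML, this is just the maxConcurrency attribute on the scatter-gather element (the logger routes here are illustrative placeholders):

```xml
<scatter-gather maxConcurrency="1">
    <route>
        <logger level="INFO" message="Route 1 processed: #[payload]" />
    </route>
    <route>
        <logger level="INFO" message="Route 2 processed: #[payload]" />
    </route>
</scatter-gather>
```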

Solution 2

Putting the Scatter Gather within a transaction block forces all the routes to process the Mule event sequentially. This is not how we should normally implement Scatter Gather, but rather a side effect of enforcing transactions over the component.

The above situation is a tricky one, where you need the processors to work within a transaction (here we use a Try scope, but it can be any transaction-compatible component, such as a JMS listener or a VM listener). In this case, no matter what the value of the maxConcurrency attribute is, the processing will be serial.
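A minimal sketch of this configuration, assuming a Try scope that starts a local transaction (the route contents and maxConcurrency value are illustrative):

```xml
<try transactionalAction="ALWAYS_BEGIN" transactionType="LOCAL">
    <!-- Even with maxConcurrency greater than 1, the routes
         run one after another inside the transaction -->
    <scatter-gather maxConcurrency="4">
        <route>
            <logger level="INFO" message="Route 1: #[payload]" />
        </route>
        <route>
            <logger level="INFO" message="Route 2: #[payload]" />
        </route>
    </scatter-gather>
</try>
```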

That's all for now. For a more detailed deep-dive into the component, you always have the official documentation.

