Query Pipeline Design Pattern — Function Composition
Recently, I was assigned a task to create a /search service that consumes data from the search engine. The service performs several other tasks before and after it calls the search engine. These include fetching the configured rules for a given filter query, query correction, intent recognition, cache lookup, and re-ranking the results returned by the search engine.
Here, I will talk about a few patterns that can be used to design a query pipeline, and then discuss in detail the pattern we used for our service (Function Composition).
Design Pattern Choices
1. Chain of Responsibility (CoR)
In this approach, you chain your processors, and each processor calls the next processor in the chain if the request has not been handled yet. Each processor holds an instance of the next processor in the chain.
The intent of CoR is for a processor to handle a request only if the previous processor was not able to handle it. Our intention, however, is to run all processors.
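To make the shape of CoR concrete, here is a minimal sketch. All class names (Processor, CacheProcessor, SearchEngineProcessor) and the String request/response types are placeholders I made up for illustration; they are not from our service.

```java
// Hypothetical names throughout; only the pattern itself is what matters here.
abstract class Processor {
    // Each processor holds the next processor in the chain (may be null at the end).
    private final Processor next;

    Processor(Processor next) {
        this.next = next;
    }

    // Returns a result if this processor can handle the request, otherwise null.
    protected abstract String tryHandle(String request);

    final String process(String request) {
        String result = tryHandle(request);
        if (result != null) {
            return result;                 // handled: the rest of the chain is skipped
        }
        if (next != null) {
            return next.process(request);  // not handled: delegate to the next processor
        }
        return null;                       // nobody in the chain could handle it
    }
}

// Answers only requests it already knows about (a stand-in for a cache lookup).
class CacheProcessor extends Processor {
    CacheProcessor(Processor next) {
        super(next);
    }

    @Override
    protected String tryHandle(String request) {
        return "cached".equals(request) ? "from-cache" : null;
    }
}

// Always answers (a stand-in for the call to the search engine).
class SearchEngineProcessor extends Processor {
    SearchEngineProcessor(Processor next) {
        super(next);
    }

    @Override
    protected String tryHandle(String request) {
        return "engine:" + request;
    }
}
```

Note how the search engine step is skipped whenever the cache step answers. That short-circuit is the point of CoR, and it is the opposite of running every step.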
Reasons we didn’t go for CoR:
- I personally don’t like the idea of a processor holding an instance of another (the next) processor. This complicates unit tests, because we would need a mocking framework to mock the next processor.
- All processors in the chain need to implement the same interface, so you can’t chain processors with incompatible interfaces.
However, if you want to return from an intermediate step, this pattern could be your ideal choice. With some tweaks (not a clean approach), the same thing can be achieved with the other patterns too.
2. List Of Processes
Another pattern is to maintain a list of the processor instances. Then invoke each processor in sequence to serve the request.
The implementation would look something like:
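A minimal sketch, assuming a String query and a list of String results. The fields of PipelineContextObject and the register method name are my placeholders; the real classes would carry whatever your pipeline needs.

```java
import java.util.ArrayList;
import java.util.List;

// Carries the input, the output and any other per-request state.
// The two fields below are illustrative assumptions.
class PipelineContextObject {
    String query;
    final List<String> results = new ArrayList<>();

    PipelineContextObject(String query) {
        this.query = query;
    }
}

// Every step in the pipeline implements this one interface.
interface RequestHandler {
    void handle(PipelineContextObject context);
}

class QueryPipeline {
    private final List<RequestHandler> handlers = new ArrayList<>();

    // Adds a handler to the end of the list; returns this so calls can be chained.
    QueryPipeline register(RequestHandler handler) {
        handlers.add(handler);
        return this;
    }

    // Invokes every registered handler, in registration order, on the same context.
    void run(PipelineContextObject context) {
        for (RequestHandler handler : handlers) {
            handler.handle(context);
        }
    }
}
```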
PipelineContextObject can have your input and output objects and other params that are needed in your pipeline flow to serve the request.
In your service class, you can inject this pipeline object and call the run method.
The RequestHandler object can be registered in the QueryPipeline during application start time.
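The service and the startup registration could be sketched as follows. SearchService, PipelineBootstrap and the two lambda handlers are hypothetical stand-ins, and the pipeline types are repeated so the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineContextObject {
    String query;
    final List<String> results = new ArrayList<>();

    PipelineContextObject(String query) {
        this.query = query;
    }
}

interface RequestHandler {
    void handle(PipelineContextObject context);
}

class QueryPipeline {
    private final List<RequestHandler> handlers = new ArrayList<>();

    void register(RequestHandler handler) {
        handlers.add(handler);
    }

    void run(PipelineContextObject context) {
        for (RequestHandler handler : handlers) {
            handler.handle(context);
        }
    }
}

// Hypothetical service: the pipeline is injected through the constructor.
class SearchService {
    private final QueryPipeline pipeline;

    SearchService(QueryPipeline pipeline) {
        this.pipeline = pipeline;
    }

    List<String> search(String query) {
        PipelineContextObject context = new PipelineContextObject(query);
        pipeline.run(context);      // every handler reads from and writes to the context
        return context.results;
    }
}

// Hypothetical startup wiring: handlers are registered once, in execution order.
class PipelineBootstrap {
    static SearchService createSearchService() {
        QueryPipeline pipeline = new QueryPipeline();
        // Stand-in for query correction.
        pipeline.register(context -> context.query = context.query.trim().toLowerCase());
        // Stand-in for the search engine call.
        pipeline.register(context -> context.results.add("hit for " + context.query));
        return new SearchService(pipeline);
    }
}
```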
This pattern is simple and easy to implement. However, there are certain limitations to it:
- There is no clean way to return a result from an intermediate step: you need to maintain state between steps and check that state inside the loop to break out of it.
- All steps need to implement the same interface.
3. Function Composition
The idea in Function Composition is to chain different functions by passing a function as a parameter to another function. Many modern languages support this. In Java 8 and later, you can use a functional interface to do so.
A functional interface has a single abstract method, so it represents functionality rather than data, and an instance of it can be passed to another function.
How do we create a pipeline using a Functional interface?
Create a custom interface (this will be our functional interface) that every step in the pipeline implements.
Each class that implements this interface represents one piece of functionality, and these are the functions that will be chained together.
The idea is to call these functions in sequence, where the output of one function is used as the input of the next.
1. Create a functional interface.
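One possible shape for this interface is sketched below. The handle() method name comes from this article; the generic type parameters and the default method name andThen are my assumptions (the name is borrowed from java.util.function.Function).

```java
// Generic over input (I) and output (O) so that steps with different types can be chained.
@FunctionalInterface
interface RequestHandler<I, O> {

    // The single abstract method: the work done by one pipeline step.
    O handle(I input);

    // Composition: returns a new handler that runs this handler first
    // and then feeds its output to `next`.
    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        return input -> next.handle(this.handle(input));
    }
}
```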
The default method is responsible for the function composition. Below is the expanded form of the default method; follow the comments to understand it:
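Written without a lambda, a composing default method might expand to something like this. As before, the generics and the name andThen are assumptions of this sketch.

```java
@FunctionalInterface
interface RequestHandler<I, O> {

    O handle(I input);

    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        // Keep a reference to the current step; inside the anonymous class
        // below, `this` would refer to the new composed handler instead.
        RequestHandler<I, O> current = this;

        // Build and return a brand-new handler that means "current, then next".
        return new RequestHandler<I, R>() {
            @Override
            public R handle(I input) {
                // 1. Run the current step on the original input.
                O intermediate = current.handle(input);
                // 2. Pass its output to the next step and return that result.
                return next.handle(intermediate);
            }
        };
    }
}
```

Nothing runs when andThen is called. It only builds the composed handler, and the steps execute later, when handle() is invoked on the result.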
2. Implement this interface to create your concrete functionalities.
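For illustration, two concrete steps might look like the sketch below. QueryCorrectionHandler and SearchEngineHandler are hypothetical names with toy logic, and the interface is repeated (with my assumed andThen method) so the snippet is self-contained.

```java
import java.util.Arrays;
import java.util.List;

@FunctionalInterface
interface RequestHandler<I, O> {
    O handle(I input);

    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        return input -> next.handle(this.handle(input));
    }
}

// Hypothetical step: normalizes the raw query text.
class QueryCorrectionHandler implements RequestHandler<String, String> {
    @Override
    public String handle(String query) {
        return query.trim().toLowerCase();
    }
}

// Hypothetical step: stands in for the call to the search engine.
class SearchEngineHandler implements RequestHandler<String, List<String>> {
    @Override
    public List<String> handle(String query) {
        return Arrays.asList(query + " #1", query + " #2");
    }
}
```

Each class knows nothing about the step that follows it, so each can be unit tested by calling handle() directly, without a mocking framework.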
3. Then construct the objects and create the function chain at application startup. If you are using the Spring Framework, you can create a bean in a Configuration class.
Below is a Spring Bean example:
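The following configuration is a sketch only. It needs Spring on the classpath, and the handler classes, the PipelineConfig class name and the assumed andThen method are placeholders; only the queryPipeline bean name comes from this article.

```java
import java.util.Arrays;
import java.util.List;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@FunctionalInterface
interface RequestHandler<I, O> {
    O handle(I input);

    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        return input -> next.handle(this.handle(input));
    }
}

// Hypothetical steps with toy logic.
class QueryCorrectionHandler implements RequestHandler<String, String> {
    @Override
    public String handle(String query) {
        return query.trim().toLowerCase();
    }
}

class SearchEngineHandler implements RequestHandler<String, List<String>> {
    @Override
    public List<String> handle(String query) {
        return Arrays.asList(query + " #1", query + " #2");
    }
}

@Configuration
class PipelineConfig {

    @Bean
    QueryCorrectionHandler queryCorrectionHandler() {
        return new QueryCorrectionHandler();
    }

    @Bean
    SearchEngineHandler searchEngineHandler() {
        return new SearchEngineHandler();
    }

    // Spring passes the two handler beans in as parameters; the chain is built once.
    @Bean
    RequestHandler<String, List<String>> queryPipeline(
            QueryCorrectionHandler queryCorrectionHandler,
            SearchEngineHandler searchEngineHandler) {
        return queryCorrectionHandler.andThen(searchEngineHandler);
    }
}
```

The handler beans are declared with their concrete types here so that the parameters of queryPipeline are unambiguous. If several beans share the pipeline's own type, the injection point may need a qualifier.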
If you are not using Spring, you can call the method below at application startup and inject the resulting pipeline object into your service class:
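A plain-Java version might be a static factory method along these lines. PipelineFactory, createQueryPipeline and the three handler classes are hypothetical, and the handlers contain toy logic only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

@FunctionalInterface
interface RequestHandler<I, O> {
    O handle(I input);

    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        return input -> next.handle(this.handle(input));
    }
}

class QueryCorrectionHandler implements RequestHandler<String, String> {
    @Override
    public String handle(String query) {
        return query.trim().toLowerCase();
    }
}

class SearchEngineHandler implements RequestHandler<String, List<String>> {
    @Override
    public List<String> handle(String query) {
        return Arrays.asList(query + " b", query + " a");
    }
}

// Toy re-ranking: sorts a copy of the results alphabetically.
class ReRankHandler implements RequestHandler<List<String>, List<String>> {
    @Override
    public List<String> handle(List<String> results) {
        List<String> sorted = new ArrayList<>(results);
        Collections.sort(sorted);
        return sorted;
    }
}

class PipelineFactory {
    // Call once at startup and hand the result to the service class.
    static RequestHandler<String, List<String>> createQueryPipeline() {
        return new QueryCorrectionHandler()
                .andThen(new SearchEngineHandler())
                .andThen(new ReRankHandler());
    }
}
```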
The queryPipeline bean chains all the beans/objects together. The response of one function is used as the request of the next function.
4. Finally, use the queryPipeline bean in the service class:
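The service side can stay very small. In this sketch SearchService and its search method are hypothetical, and the pipeline arrives through the constructor, which works the same way with or without Spring.

```java
import java.util.List;

@FunctionalInterface
interface RequestHandler<I, O> {
    O handle(I input);

    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        return input -> next.handle(this.handle(input));
    }
}

class SearchService {
    // The whole chain looks like a single handler from the service's point of view.
    private final RequestHandler<String, List<String>> queryPipeline;

    SearchService(RequestHandler<String, List<String>> queryPipeline) {
        this.queryPipeline = queryPipeline;
    }

    List<String> search(String query) {
        return queryPipeline.handle(query);
    }
}
```

Because the service depends only on the interface, a test can pass in a one-line lambda instead of the real pipeline.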
How do we chain functions if the function interfaces are not the same?
Say our ReRank handler implements ReRankStrategy instead of the RequestHandler interface.
The solution is simple. Wrap the ReRankStrategy implementation inside a RequestHandler object, and call the reRankStrategy.reRank() method inside the handle() method of that RequestHandler.
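A sketch of such a wrapper is shown below. The reRank() signature, the ReRankHandler class name and the toy AlphabeticalReRankStrategy are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

@FunctionalInterface
interface RequestHandler<I, O> {
    O handle(I input);

    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        return input -> next.handle(this.handle(input));
    }
}

// The incompatible interface; the signature is assumed for this sketch.
interface ReRankStrategy {
    List<String> reRank(List<String> results);
}

// Toy strategy: returns a sorted copy of the results.
class AlphabeticalReRankStrategy implements ReRankStrategy {
    @Override
    public List<String> reRank(List<String> results) {
        List<String> sorted = new ArrayList<>(results);
        Collections.sort(sorted);
        return sorted;
    }
}

// Adapter: exposes a ReRankStrategy as a RequestHandler so it can join the chain.
class ReRankHandler implements RequestHandler<List<String>, List<String>> {
    private final ReRankStrategy reRankStrategy;

    ReRankHandler(ReRankStrategy reRankStrategy) {
        this.reRankStrategy = reRankStrategy;
    }

    @Override
    public List<String> handle(List<String> results) {
        return reRankStrategy.reRank(results);
    }
}
```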
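Some other ways to wrap the reRankStrategy would be to:
- Create an anonymous class, or
- Use a lambda function, or
- Use a method reference such as reRankStrategy::reRank. RequestHandler is a functional interface, so this works whenever reRank() has a compatible signature; ReRankStrategy itself does not have to be a functional interface.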
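The three options can be put side by side as below. ReRankAdapters and its method names are hypothetical, and the reRank() signature is assumed; all three methods return a handler with the same behavior.

```java
import java.util.List;

@FunctionalInterface
interface RequestHandler<I, O> {
    O handle(I input);
}

interface ReRankStrategy {
    List<String> reRank(List<String> results);
}

class ReRankAdapters {

    // Option 1: anonymous class.
    static RequestHandler<List<String>, List<String>> viaAnonymousClass(ReRankStrategy strategy) {
        return new RequestHandler<List<String>, List<String>>() {
            @Override
            public List<String> handle(List<String> results) {
                return strategy.reRank(results);
            }
        };
    }

    // Option 2: lambda.
    static RequestHandler<List<String>, List<String>> viaLambda(ReRankStrategy strategy) {
        return results -> strategy.reRank(results);
    }

    // Option 3: method reference.
    static RequestHandler<List<String>, List<String>> viaMethodReference(ReRankStrategy strategy) {
        return strategy::reRank;
    }
}
```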
So, in this case the queryPipeline bean definition will look like this:
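Here is a sketch of that definition written as a plain factory method; with Spring, the same body would sit inside a @Bean method. PipelineFactory, the parameter names and the assumed andThen method are placeholders.

```java
import java.util.List;

@FunctionalInterface
interface RequestHandler<I, O> {
    O handle(I input);

    default <R> RequestHandler<I, R> andThen(RequestHandler<O, R> next) {
        return input -> next.handle(this.handle(input));
    }
}

interface ReRankStrategy {
    List<String> reRank(List<String> results);
}

class PipelineFactory {
    static RequestHandler<String, List<String>> queryPipeline(
            RequestHandler<String, String> queryCorrection,
            RequestHandler<String, List<String>> searchEngine,
            ReRankStrategy reRankStrategy) {
        return queryCorrection
                .andThen(searchEngine)
                // The method reference adapts ReRankStrategy to RequestHandler on the spot.
                .andThen(reRankStrategy::reRank);
    }
}
```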
You can always use Java 8’s Function interface (java.util.function.Function) instead of creating your own RequestHandler interface. The reason for creating a new interface is to maintain the naming convention. It is better to use relevant interface names rather than “Function” :)
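For comparison, the same kind of chain built only on java.util.function.Function and its built-in andThen method could look like this. FunctionPipelineExample and the two toy steps are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class FunctionPipelineExample {
    static Function<String, List<String>> build() {
        // Stand-in for query correction.
        Function<String, String> correctQuery = q -> q.trim().toLowerCase();
        // Stand-in for the search engine call.
        Function<String, List<String>> search = q -> Arrays.asList(q + " #1", q + " #2");
        // Function.andThen composes the two: correctQuery runs first, then search.
        return correctQuery.andThen(search);
    }
}
```

The pipeline is then invoked with apply() rather than handle(), which is exactly the naming trade-off mentioned above.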
Wrap up
Of these three patterns for implementing a query pipeline, CoR would be ideal if the intent is to return from an intermediate step. A “list of processes”, on the other hand, is easier to implement and less complex.
In our service, we implemented Function Composition because it is much cleaner: no processor holds an instance of the next processor in the chain, and it can tie incompatible interfaces together.