How we write our query filter engine on our REST API (part 1)

Adam Ben Aharon
Melio’s R&D blog

--

Have you ever had to filter data in your API for different data sets with different combinations? How do you solve this issue? Where do you start ? Let me present to you how I created a query filter engine.

Filtering data in API helps users make data filtering easier and makes your API more usable without the need to change it to a spastic scenario.

In this five part series, I will cover filtering problems and strategies to solve them. We also review the strategies we use in our own API and the accompanying implementation details.

Imagine that you have this employee data set:

employee: {
firstName: string,
lastName: string,
salary: number,
startDate: Date,
isOnVacation: bollean
numberOfDaysInOffice: number
}

The rest endpoint gets all employees https://demo/v1/employees that return an array of employees.

Now, let’s say you want to give all of the employees that are on vacation an easy task. All you have to do is get all the employees from the endpoint and use an IF statement, like this:

employees = getAllEmployees() // here we will preforme our api call to https://demo/v1/employees.
employees.filter((employee) => employee.isOnVacation);

Now, let’s say that you want to filter all of the employees whose first names start with “A” and last name with “B” or employees that earn more than 10k. It’s a little bit more complex but you still have to get all the employees and filter them using two “if” statements.

employees.filter(
(employee) => (employee.firstName.startsWith('A') && employee.lastName.startsWith('B'))
|| (employee.salary > 10000)
);

Now, it’s time to make it more complex. Let’s say you need to find all the employees that work 10 days from the office in January and 15 days in February or employees that work 10 days in February and earn less than 8k. This is going to be a much more complicated task than the first two — but we can solve it.

  1. We will get all our users (what else can we do 🙂 it’s not that we have a way to filter it in the API)
  2. We have to separate this filter into two conditions:
    a. Employees that work 15 days in January and 10 days in February
    b. Employees whose salaries are lower than 8k and work 10 days
// presuming that we implmenent firstConfition and secondCondition functions
employees.filter((employee) => firstCondition() || secondCondition());

Now, say that we have to combine more data sets in our filter. For example, in the employee group, it’s becoming increasingly complex. And if you are serving multiple clients, each one will have to implement it on its own over and over.

In this post, I will detail possible solutions. Let’s solve the filtering in API issue while taking care of combining data sets and keeping it simple for the clients using our API.

Knowing the issue, I begin to understand what options I have to solve this issue in our REST API service.

  1. Leave it like this and let the clients do the filter by themselves.
  2. Build a filter endpoint for each use case–for example, an endpoint that retrieves all users on vacation. I understand really quickly that this is not scalable (it solves the issue that only the service writes the logic but the clients are strict with those scenarios) and it breaks our REST API approach.
  3. Filter entities using the query string of the REST API call (in our case the “get all employee call”): This approach means that we don’t need to add additional API calls–we just need to create a layer above them that will construct the filters from the query. This solution solves the filtering issue and leaves the API as is.

After investing more time, I begin to understand that as a REST API that serves multiple clients, solution number three is best.

Let’s dive in and see what options we have to implement this approach:

  1. LHS Brackets using a qs package — Let’s look at the ins and outs of LHS brackets and encode each field with square brackets []. That means that after the field name you are using [] to indicate your operation. Lets look again at our employee API, to get all employees on vacation. We simply send the query https://demo/v1/employees?onVacation[eq]=true, for nested fields we could wrap each nested field with [] — for example firstField[nestedField][operator]=value, to implement this approach we could use the qs package. It decodes the query string as an object. For example, onVacation[eq]=true will transform to the object:
    { onVacation: { eq: true } }
    Pros of this approach:
    - Encoding and decoding are very easy using the qs package.
    - Supports nested field.
    - Supports the “and operator”.
    - Supports multiple operators such as equals, greater/lower than, etc.
    - Supports open API schema validation (not easy to maintain).
    Cons of this approach:
    - Doesn’t support the “or operator”.
    - Requires work to group the filter on the server side.
    - Ts support is limited.
  2. Filter by field name — In this solution, we will use each one of the fields that you have (in our example, firstName, lastName, onVacation etc) as a key and the value will be the value that we want to filter. Let’s implement the “get all employees on vacation” example: https://demo/v1/employees?filter='{"onVacation":true}’ This is pretty easy. The only issue with this approach is that it only supports equal operators but it’s easy to maintain the open API schema. Let’s check out the open API schema for this example.
parameters: 
- in: query
name: filter
Content:
application/json:
schema:
type: object
properties:
onVacation:
type: boolean
  • Pros of this approach:
    -This is the easiest solution to implement. There’s no need to encode/decode — you can use JSON.parse /JSON.stringify on the filter key. Each key (in the parse object) is a field and the value can be anything you want.
    - Support open api schema (easy to maintain), you get validation for free (in case you use open api validation).
    - Client side TS support, once you generate a client from the open api (TS version) you also get it for free.
  • Cons of this approach:
    - Doesn’t support the “or operator” and complex conditions.
    - Define the fields you need to support with each api entity.
    - Support nested — hard to implement nested connection.
    - Need to update the schema for each new field you add.
    - Doesn’t support greater/lower than, only supports equal operators.
    - Requires work to group the filter on the server side.

3. Build a query filter engine — what does “query filter engine” mean? It means that under the query key you choose (in my case it’s “search” but it can be another key), you build your condition language, you choose the fields, operator, and the values that you support and you start to build your own parser. Let’s have a look at our example: Say that we support “onVacation” and the operator is: equal that expects boolean as the value, our api will look like https://demo/v1/employees?filter=onVication:true This approach is the most robust solution because it supports everything that you want, you choose the query language that works for you, and you need to implement it according to your choice. This is the most flexible solution but is harder to implement.

  • Pros of this approach:
    - Supports nested fields.
    - Supports “And/Or operators”.
    - Supports multiple operators such as equal, greater/lower than etc.
    - Supports complex conditions using parentheses (in case you want to support).
    - The conditions look really simple compared to other solutions (in case you build easy query language 🙂).
    - Fully supports typescript (which depends on you).
  • Cons of this approach:
    - Time — thinking about the query language and implementing it takes time.
    - Open API support — you can only add the search key to the open API and the value will be string so you don’t really get value.
    - Document — need to be well documented so the user will be able to use it.
    - Implementation — need to build the parser and validation on your own.
    - Encoding and decoding — need to build it and publish a decode package (so the user will use it with ease).

After covering our solutions, we chose to use our API query to create our own query language. This is because it gives more flexibility, is a more robust solution, and we only need to implement it once.

Part 2 is here!

Visit our career website

--

--