Filtering query language: how we built our powerful transaction search

Published in

The Qonto Way

12 min readJan 14, 2022

As a product grows both in features and data, customers will eventually need more advanced tools to help them analyse and derive actionable insights. Regardless of the product, this will most likely take the form of filters as the prominent tool for in-app analytics.

This was the case with Qonto just a few months ago. From the beginning, we enabled our customers to filter and search for their transactions. We started off with simple, mono-dimensional filters that served our customers well.

Bumps in the road

However, this also created too much noise. A CFO would have to sift through a large number of transactions which is time consuming. We felt that this process could be improved. Transactions below a certain amount in France are not legally required to have a receipt. So a CFO, in addition to the missing receipts, is also looking for transactions above €50. Moreover, each transaction that does have a receipt should also have marked the appropriate VAT value.

If we were to imagine what such query would look like in pseudocode:

(receipts IS missing AND amount_min=50) OR (receipts IS NOT missing AND vat IS missing)

The above would give the CFOs all the transactions they need to review. However, our current filters can’t accommodate that. To be able to accomplish that, we would need to rethink our approach to filters.

The road to a unified query language

As our customers became more reliant on the tools we provide to satisfy business needs such as bookkeeping, the more evident it became that we need to rethink our filtering capabilities. Together with Product, we brainstormed what a North Star could look like for filters.

More operators are needed

The filters we had served customers well but as our customers grow their needs have also shifted. A simple and familiar solution would be to keep adding more filters to satisfy specific behaviours.

For example, the filters we had supported min and max amounts. In practical terms, they meant inclusion of min and max such as min_amount <= amount_value . To support a non inclusive operator such as min_amount < amount_value , we would need to add a new filter aptly called min amount excl. Same for the max value. One could argue such filter isn’t needed. A customer can simply change the amount value by one (depending on the direction) and can achieve the same effect. Although this is true, the behaviour and experience is cumbersome.

A similar example can be illustrated with dates. From and To filters would play the rule of inclusive dates. If you wanted to have exclusive dates, you could adjust your days again by one off. If you wanted to have results any for as specific date, you would need to apply the same date on both From and To.

Remove the AND restriction between the filters

The design of the filters enforces an implicit AND conditional between all active filters. For example, by adding the Min. Amount and the Member filter, we’re essentially creating an AND query between the two such as min_amount <= x AND member = y . Searching for transactions that were created between specific team members or that contain specific labels would not be possible.

Flexibility to slice data

Our customers range from freelancers to big businesses each with their own needs and wants. We should be able to serve the whole spectrum without overloading anyone with complexity. The optimal solution would be the intersection of a business analyst running raw SQL and the intuition of a UI.

To solve the above, we looked around at other tools that provide powerful data slicing such as Notion. We arrived at the conclusion that groups and subgroups would be the optimal solution for our use cases. Essentially, allowing our customers to start with simple queries with the same familiarity as today and slowly building up powerful queries that would allow them to slice their data however they want.

With the product requirements set, we had a few observations:

The mono-dimensional filters are wonderfully straight forward in terms of implementation. They map 1–1 with query params such as ?min_value=50 and so on. Just by looking at the query params, you can derive the state of the filters. However, they could not scale to our vision in that structure. For an operator, we would need a new filter such as min_amount_excl . Then, the client would need to do some string parsing to identify the property (amount) from the operator (min_excl).
The concept of filter became more expressive. Every filter could be represented by the same properties (more on this in a bit).
Not only filters but other types such as groups emerged which could be represented as graphs.

We set out to expand and model a query language that would describe these primitives. They will serve as the contract of querying for transaction between server and client.

Unified query language

You can find the definition on https://gist.github.com/PericlesTheo/534c2ed18c7ca2ed3e63672c1c3c64df as well

Before jumping onto the definition above, it can be worth discussing what the query language is in practical terms and what qualities come with it.

Query language isn’t an actual language such as Ruby or Go. It’s a data structure that serves as a contract between a client and a server. You can think of it as a type. It can easily be validated and parsed both from the client and backend.

Another important aspect of the query language, isn’t tied to particular client language or framework, nor a database. You’re free to use whatever you want as storage. One might ask at this point, why go through the hassle of creating your own language when you can use for instance an off the shelf solution such as ElasticSearch and simply use its language?

For our use case, we need a contract that can easily fit into a reasonable set of off the shelf products for both the FE/BE. Exposing the whole of ElasticSearch to the client would make it difficult for the backend to validate security and other rules.
Clients (mobile/FE) would all of sudden need to understand and learn ElasticSearch. That would be a huge burden for development teams.
Tying to a specific storage would not scale well for development. Transactions filtering currently retrieves data from ElasticSearch. However, our transfers filtering actually hits our Postgres DB. If we wanted to apply the new set of filtering capabilities to transfers, we would need to migrate them on ElasticSearch also. The query language on the other hand allows for the server to have multiple storages implementations (ElasticSearch, MySQL or even another API) through adapters.
The UI doesn’t often translate nicely with the actual queries. Our Receipt filter contains 3 values: missing, addedor lost/not required. Missing is actually a combination of receipt is NOT (added and lost/not_required). If the FE is reading the ElasticSearch query, it wouldn't be able to build a UI filter that has missing as selected since it can’t tell whether the user actually selected missing or the other two options together. We have more cases like this.

Definition of the query language

Let’s dissect the query language. We will start with the smaller primitive parts and then move upwards.

We’re using types to define the query language which could be foreign if you have never used them. For these examples, I will be using Typescript for convenience but the examples should be transferable to the typed language of your choice. The following examples just focus on basic data types. Indeed the language is mostly made up of such simple types.

When you see:

type Pagination = {
  page: number
}

it means that we have a Pagination type or object that has a property called page and it’s a number.

A union type means the property can be of any type in the definition. For example:

type User = {
   age: string | number
}

The above implies that we have a User type or object that has a property called age which can be either a string or a number. You can read more about union types in this Typescript documentation

Search

search is used as a text search across number of fields for our implementation purposes.

Sort

A client needs to be able to sort the results by a particular property (Created at) and a direction. In our case, we need to sort only by a single property. If you need to support sorting on multiple properties, you can convert the Sort into an array.

Pagination

Clients needs to be able to paginate through the results by setting the page they want and the number of results for that page.

Filter

Let’s take a look at a wireframe for visual association:

A Filter is composed of 3 parts:

The property we want to query. This might not correspond 100% to the field name in the backend storage but that's irrelevant at this point. In this example, the field is Category.
Operators include all the comparison and equality operators one encounters in the wild. The available operators can be dictated by the type of filter. For example, if we have a boolean filter the only operators that make sense would be Eq, NotEq and depending from the implementation maybe Exist and NotExist. In the wireframe, we see the NotEq operator.
The Value is the user’s input. The Value accepts and expects an array of values regardless if the particular filter UI does not support multi select. This reduces the API surface since we don’t need to check for .value or .values in our code.

FilterGroup

Let’s look at a wireframes for visual aid:

The highlighted part comprises the FilterGroup. The condition is set to Or for all the filters inside the FilterGroup.

Let’s look deeper in the definition of an expression. <Array Filter | FilterGroup> means that expressions are an array that could contain both of those types. Filter makes sense. But how does FilterGroup work? What we’re essentially saying here is that a group can be nested in another group (AKA recursive type). This allows for any level of nesting. Let's see an example:

type House = {
  rooms: <Array Room | House>
}type Room = {
   name = String
   type = String
}

If we wanted to express that we have a pool house within a house, the above can be expressed as:

let house = [
  {
    name: "Living Room",
    type: "hosting"
  },
  {
    name: "Bedroom",
    type: "rest"
  },
  {
    rooms: [
      {
        name: "Small kitchen",
        type: "food"
      }
    ]
  }
]

With the above definition, we have expressed that we can have another house within a house!

We now have the full query language as a type!

Bonus Content: ElasticSearch implementation

As mentioned earlier, the query language is not tied to any data storage. At Qonto, we have adapters for each storage allowing us to switch implementation without any need to change anything in or about the query language.

How easy could it be to write a query language to an ElasticSearch adapter?

As it turns out, easy enough! A Ruby example:

Let’s see an example. The following query:

Will result in:

Learnings when implementing the query language

When we initially talked about the idea of the query language, we were skeptical about its complexity. Going from query params to a graph felt like a huge leap.

As it turns out, implementing the query language both on the backend and frontend proved to be relatively uneventful. This can be attributed to the fact that both backend and frontend collaborated and agreed upon the structure of the query language prior to implementation. The exercise helped us focus on our needs today and how we see them scaling in the future.

An example of this was a discussion around Not as an operator such as NotEq vs Not as a conditional along side Andand Or. Based on product requirements, and what we felt was the simplest approach, we decided to make Not an operator. Here, we’ve broken the principle of keeping the query language as agnostic as possible from the UI, but we felt it was the right decision for us.

Here’s a few practical things to keep in mind if you decide to implement the query language for your product:

The query language now is a complex javascript object. It can still be encoded as query params but the encoding will be massive. Bear in mind you could hit a potential limit on the query string. In our case, we decided to go with a POST request instead for the reason mentioned above. Also, it's simpler to inspect the network call and see quickly what was sent to the server.
Your links will break! Regardless, if you go for the POST or GET solution, you will have to adjust how you would link from other pages into the filters one. For example, you can no longer do <a href="/reporting?min_amount=4>See transactions</a>. One way to tackle this is to keep a backwards compatible adapter where you can still read flat query params and convert them into the query language.
Your clients now will generate arbitrarily complex queries and your backend may not be able to run them or run into catastrophic performance issues. The ElasticSearch implementation above uses recursion to generate the queries. Imagine a bad actor sending an infinitely nested query. You will now be susceptible to a StackOverflow. Even if your implementations do not use recursion, validate the number of nested groups and filters to avoid any performance or memory issues.
On the same note as the last point, be careful to always add your authorisation rules at the top of the query. Do not rely on the clients to filter by a specific app_id or organization_id.

GraphQL

One might wonder if the same could have been achieved using GraphQL. Any type of graphing should work just fine. In our case, we didn’t explore much into GraphQL as we do not use it at Qonto. Adapting GraphQL just for the feature felt like a massive undertaking with small impact.

Conclusion

Our new filters have been in production for a couple months with great success. We were able to unblock our customers and offer them the flexibility they needed.

Having gone down the road of a unified query language has paid off massively as now developers are able to quickly iterate and add new filters.

Moving away from the simplicity of query parameters into a more complex solution should not be taken lightly. Speaking with customers and understanding fundamentally what they need is what’s important. In other cases, a more powerful export of data via APIs, where customers can manipulate their data in their own way, would be more appropriate.

If you’re planning to go down this path, think carefully of your requirements and look at how other tools are incorporating such functionalities.

About Qonto

Qonto is a finance solution designed for SMEs and freelancers founded in 2016 by Steve Anavi and Alexandre Prot. Since our launch in July 2017, Qonto has made business financing easy for more than 200,000 companies.

Business owners save time thanks to Qonto’s streamlined account set-up, an intuitive day-to-day user experience with unlimited transaction history, accounting exports, and a practical expense management feature.

They have more control, whilst being able to give their teams autonomy via real-time notifications and a user-rights management system.

They have improved visibility on cash-flows through tools such as smart dashboards, transaction auto-tagging, and cash-flow monitoring.

They also enjoy stellar customer support at a fair and transparent price.

Interested in joining a challenging and game-changing company? Consult our job offers!