Do you need ElasticSearch when you have PostgreSQL?

mdebelle
The Qonto Way
Published in
6 min readAug 1, 2022

At Qonto, we noticed that it is powerful for business owners to grant cards for one-off expenses to their employees. That’s why we created Flash cards. With this feature, a customer can create a temporary card with a specific budget and time validity that corresponds to a specific need. The Flash card expires automatically on the date set.

Sometimes building a feature seems easy but when you dive into it, it becomes hard.

Thankfully, before diving into a project, we, at Qonto, investigate the whole technical extent of the code changes that the feature requires.

When we did so for the new Flash card feature, we quickly figured out it was one of those times where you hit unexpected complexity.

Indeed, the cards list in the Qonto app showed off the expired cards first, delegating the active cards at the bottom. Flash cards being so short-lived, the number of expired cards would spike. People want to quickly access their active cards, which would be drowned below expired cards. We needed the active cards to appear first.

How are cards sorted? Our Web application calls the backend API which returns cards in the order intended for display. In turn, the backend calls ElasticSearch which does the sorting and filtering of cards.

However, editing the sorting logic in ElasticSearch could be brittle. Maintenance and production management matter.

When ElasticSearch is too much

A few months ago, we incurred an incident with ElasticSearch. We tried to index a transfer containing an abnormally long comment field with the wrong text analysis configuration. It led to a very long indexation degrading the performance of the whole cluster.

Live incident investigation on the ElasticSearch degraded performances

This incident has been a good opportunity to step back and ponder on the validity of some of our ElasticSearch usage.

ElasticSearch is a good solution when you have a large portion of text to analyze: it provides a very comprehensive syntax for full-text search and is both fast and accurate.

However, it’s subtle to use for 3 main reasons:

  1. ElasticSearch is optimized for indexing large static datasets in batch. Even if we attempted to force synchronous indexation, we would never get as strong a guarantee as you get in PostgreSQL; consistency anomalies are inevitable.
  2. Identical pieces of data live both in the PostgreSQL database and the ElasticSearch cluster. We don’t want to deal with synchronization issues inherent in writing in two data stores. (What if the write succeeds in PostgreSQL and fails in ElasticSearch?)
  3. Indexing the right fields in a robust way can be tricky; a generous index type can trigger unexpected performance spikes, as we saw during the incident.

As a result, we studied possible implementations of the new card sorting order. We weighed a few options:

  • Continue using only ElasticSearch for the feature;
  • Search directly from the PostgreSQL database.

The first one would have worked (with a bit of Lucene hackery) but we wanted to move to something more robust and maintainable.

We were left with the second solution. But, is it really a good option? Let’s dive into PostgreSQL as an alternative.

Boring Technology

ElasticSearch is not the only one with a few tricks up its sleeve! PostgreSQL can also search logarithmically throughout large datasets. By the way, using PostgreSQL in Ruby on Rails is a well-trodden cowpath compared with ElasticSearch.

Even if we hit some performance issues, PostgreSQL will have some index type to cover our use cases:

We allow filtering cards by the holder’s name. For this, we can use the GiST index with trigrams.

We can use the same trick to filter cards by a given substring of the name.

The main thing we must keep in mind is balancing the tradeoff between performance and index size. But it is nothing new: the same tradeoff impacts our choices for ElasticSearch indexing.

As you can see, PostgreSQL lets us do some of the simple operations we can find in ElasticSearch. Nowadays, sorting card statuses with ElasticSearch is like using a sledgehammer to crack a nut.

How The Sausage Gets Made

Do you remember what we tried to achieve? We are introducing a Flash card: you can set it to expire after just a couple of days.

Current model

The code below describes the schema of the cards table:

And the code for the cards’ statuses is specified in the model:

Let’s reorder cards to put the active ones first.

Ordering cards

How would we order this kind of enum? We could use a series of if. Or, and that’s what we did, we could use this neat trick array_position() function, to allow custom ordering of integer enumerations.

With this, we avoid using ElasticSearch scripts which additionally not to be elegant may be tedious to maintain. Look at how awkward it would be:

Well, this list is now ordered and it is going to be displayed in a nice way for the user.

Now, what about nailing it by combining it with the filters we need to build so the query planner will reuse the indexes we put in place?

Filtering the result

Let’s say one of our business owners is just looking for cards owned by John Doe, their employee. They search for John in the search bar. We would play a query very much like this one:

Maintaining the code is now a walkover, thanks to Rails’ interface to the SQL database.

And there’s still more!

It didn’t even make a dent

To make sure that the improvement didn’t degrade our performance, we monitored them with some tools like NewRelic.

  • The initial query with ElasticSearch took almost 20 ms per query.
  • After refactoring, we gain 80 % off duration (2~4 ms).

Where are we winning on this latency?

We are not even adding indexes! We simply gain time on the network hop that we don’t do to ElasticSearch.

The existing PostgreSQL indexes are enough: the database does one B-tree lookup through the organization’s foreign key, and the handful of cards held by a single organization is efficiently sifted through with a sequential scan.

For later, we know that PostgreSQL’s toolbox of index types is there if and when we need it.

Conclusion

Following the success of this deployment, we made an effort across the whole backend teams, and we wrote a standard helping to determine what best search/DB engine fits our needs.

When indexation is simple to implement with the existing SQL database, we don’t need to use external tools.

We now only use ElasticSearch for the cases where it is the best tool (mainly our logs and a few other standalone documents).

About Qonto

Qonto is a finance solution designed for SMEs and freelancers founded in 2016 by Steve Anavi and Alexandre Prot. Since our launch in July 2017, Qonto has made business financing easy for more than 250,000 companies.

Business owners save time thanks to Qonto’s streamlined account set-up, an intuitive day-to-day user experience with unlimited transaction history, accounting exports, and a practical expense management feature.

They have more control, whilst being able to give their teams autonomy via real-time notifications and a user-rights management system.

They have improved visibility on cash-flows through tools such as smart dashboards, transaction auto-tagging, and cash-flow monitoring.

They also enjoy stellar customer support at a fair and transparent price.

--

--