Row Locking Transaction in PostgreSQL

How I performed row locking for read-write transactions in Postgres. It took me two days to solve this, but the solution is actually very simple and elegant.

Iman Tumorang
Xendit Engineering
5 min read · Mar 30, 2021

Concurrent Access by Xendit

Today's problem was complex, with an obscure solution. I have been working on refactoring our core payment system, which handles the payment transactions to and from each bank that we support.

One of the services in this new core system is responsible for generating a payment code; let's call it the Payment Code Generator service. The payment_code needs to be random but also unique. In short, generating one requires a counter, which means we need to store that counter.

To simplify the scenario, let’s say I’m building a service that will generate a payment code.

In this service, I have 2 tables, master_counter and payment_code.

The master_counter table schema
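The schema itself was embedded in the original post; what follows is a minimal sketch of its shape, assuming a per-user counter (the column names here are illustrative):

    CREATE TABLE master_counter (
        user_id    UUID PRIMARY KEY,
        counter    BIGINT NOT NULL DEFAULT 0,
        updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );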

And the payment_code schema
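Again a sketch with illustrative column names; the one detail that matters for this story is the unique constraint on payment_code:

    CREATE TABLE payment_code (
        id           UUID PRIMARY KEY,
        payment_code TEXT NOT NULL UNIQUE, -- the constraint behind the later error
        user_id      UUID NOT NULL REFERENCES master_counter (user_id),
        created_at   TIMESTAMPTZ NOT NULL DEFAULT now()
    );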

In master_counter, there is a counter column that always increases as requests come in. Drawn as a diagram, the flow looks something like this:

Flow of generating a payment code
  1. When the user requests a new payment code, we get their current counter value.
  2. The counter value is increased by 1 and the algorithm is applied (simply put: the counter value is hashed into random characters).
  3. The hashed value is stored in the payment_code table, which has a unique constraint.
  4. The new counter value is written back to master_counter for that user ID.
  5. The hashed counter value is returned to the user. This is the payment code the user will use.

While there are many algorithms that could achieve this, the flow above sufficiently represents the simplified overall process.

Problems Started

Initially, when written in SQL, the query looked similar to this:
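The original snippet was embedded in the post; roughly, the transaction ran along these lines, using the illustrative schemas sketched above, with the increment and hashing done in the application between the statements:

    BEGIN;

    -- 1. read the current counter for this user
    SELECT counter FROM master_counter WHERE user_id = $1;

    -- 2./3. the application increments and hashes the counter, then
    -- stores the result; payment_code carries the unique constraint
    INSERT INTO payment_code (id, payment_code, user_id)
    VALUES ($2, $3, $1);

    -- 4. write the new counter value back
    UPDATE master_counter SET counter = $4 WHERE user_id = $1;

    COMMIT;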

However, I'm writing this in Go, whose database/sql package has native support for transactions. This can be seen in the example below:
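The original Go snippet was embedded as a gist; here is a minimal sketch using database/sql, assuming the schemas above. The Service type, hashCounter function, and the use of github.com/google/uuid are illustrative stand-ins, not the original code:

    package paymentcode

    import (
        "context"
        "crypto/sha256"
        "database/sql"
        "encoding/hex"
        "strconv"

        "github.com/google/uuid"
    )

    type Service struct {
        db *sql.DB
    }

    // hashCounter stands in for the real algorithm that turns the
    // counter into random-looking characters.
    func hashCounter(n int64) string {
        sum := sha256.Sum256([]byte(strconv.FormatInt(n, 10)))
        return hex.EncodeToString(sum[:])[:12]
    }

    func (s *Service) GeneratePaymentCode(ctx context.Context, userID string) (string, error) {
        tx, err := s.db.BeginTx(ctx, nil) // default isolation level (READ COMMITTED)
        if err != nil {
            return "", err
        }
        defer tx.Rollback() // no-op once the transaction has been committed

        // 1. read the current counter value for this user
        var counter int64
        if err := tx.QueryRowContext(ctx,
            `SELECT counter FROM master_counter WHERE user_id = $1`,
            userID,
        ).Scan(&counter); err != nil {
            return "", err
        }

        // 2. increase the counter and hash it into a payment code
        counter++
        code := hashCounter(counter)

        // 3. store the hashed value; payment_code has a unique constraint
        if _, err := tx.ExecContext(ctx,
            `INSERT INTO payment_code (id, payment_code, user_id) VALUES ($1, $2, $3)`,
            uuid.NewString(), code, userID,
        ); err != nil {
            return "", err
        }

        // 4. write the new counter value back to master_counter
        if _, err := tx.ExecContext(ctx,
            `UPDATE master_counter SET counter = $1 WHERE user_id = $2`,
            counter, userID,
        ); err != nil {
            return "", err
        }

        // 5. return the payment code to the user
        return code, tx.Commit()
    }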

This function works perfectly for a single request. However, load testing with just 2 concurrent users produced the following error:
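The logged output itself is not preserved in the post; it was PostgreSQL's duplicate key violation, which looks like this (the constraint name depends on the actual schema):

    ERROR:  duplicate key value violates unique constraint "payment_code_payment_code_key"
    DETAIL: Key (payment_code)=(...) already exists.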

The duplication error happens on payment_code because the generated payment_code must be unique.

While the application is behaving correctly by returning the error, from a business point of view this is a problem. Why should it return a duplication error when the user is requesting a new payment_code that has never been used?

My first assumption was that if there's duplication, there's a race condition. The race occurred between reading and updating the counter value: the counter is hashed into the payment code, and the payment code must be unique, but because of the race, two requests read the same counter value, which caused the duplicate key violation.

In short, the error occurs because of a race condition: the counter has not yet been updated when concurrent requests come in, so parallel requests generate the same payment code and the insert fails.

First Attempt! — Using Isolation Level

Knowing this, I took to the internet to search and learn about transactions, which brought me to several articles on transaction isolation levels.

I realized that transactions also have an isolation level. Reading all those articles reminded me of my college days. I remember learning this, but I had never hit a case where I really needed to understand or apply it in practice.

From this research, I concluded that the default isolation level was not strict enough to serialize the concurrent read-update-write on the counter in the database.

To address this, I set a stricter isolation level on the transaction in the application.
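In Go's database/sql this is done through TxOptions; a minimal sketch, assuming SERIALIZABLE (the strictest level) was the one chosen. Only the BeginTx call from the earlier sketch changes:

    // Open the transaction with an explicit, stricter isolation level
    // instead of the driver default.
    tx, err := s.db.BeginTx(ctx, &sql.TxOptions{
        Isolation: sql.LevelSerializable,
    })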

After this fix, I re-ran the load test, which returned a new error:
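The new error is also not preserved in the post; under a strict isolation level, PostgreSQL aborts one of the conflicting transactions with a serialization failure along these lines:

    ERROR:  could not serialize access due to concurrent update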

This fix avoided the duplication error, but every conflicting transaction now failed, so concurrent requests were effectively rejected. Handling many concurrent requests was no different from handling them one at a time.

Since, from a business standpoint, the system needs to handle requests concurrently, I broadened my search and asked for advice from peers: colleagues, friends outside of work, and even senior DBAs from other companies.

Thanks to this crowdsourced knowledge sharing, I realized that the solution was row-level locking. Since I only need to read and update the counter value, a row-level locking mechanism is all that's required.

Second Attempt! — Using SELECT FOR UPDATE

Since I just want row-level locking, I don't need a strict isolation level: an isolation level applies to the whole transaction, while all I need is to lock a single row.

So my second try is:

  • Isolation level: READ COMMITTED (Postgres default isolation level)
  • Row-level locking: SELECT FOR UPDATE

The only change is adding FOR UPDATE to the SELECT query that reads the counter value. The new SQL query looks like this:
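A sketch against the same illustrative schema; the transaction is otherwise unchanged from the first attempt:

    BEGIN;

    -- FOR UPDATE locks this user's row in master_counter until the
    -- transaction ends; a concurrent transaction blocks here instead
    -- of reading the same (stale) counter value.
    SELECT counter FROM master_counter WHERE user_id = $1 FOR UPDATE;

    INSERT INTO payment_code (id, payment_code, user_id)
    VALUES ($2, $3, $1);

    UPDATE master_counter SET counter = $4 WHERE user_id = $1;

    COMMIT;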

This worked successfully; in the end, all it took to solve the problem was adding FOR UPDATE. While this may not be the right end-state solution, for now at least it solves our current problem.

I discovered that there are a lot of things you need to understand about SELECT ... FOR UPDATE. I found this article the most helpful: http://shiroyasha.io/selecting-for-share-and-update-in-postgresql.html
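For intuition, here is how the lock behaves across two concurrent sessions (a sketch against the same schema, with 'u1' as a stand-in user ID):

    -- Session A
    BEGIN;
    SELECT counter FROM master_counter WHERE user_id = 'u1' FOR UPDATE;

    -- Session B, concurrently
    BEGIN;
    SELECT counter FROM master_counter WHERE user_id = 'u1' FOR UPDATE;
    -- ...blocks, waiting for session A to commit or roll back...

    -- Session A
    COMMIT;

    -- Session B now unblocks and sees the counter value A committed,
    -- so the two requests can never hash the same counter.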

It's amazing that this issue could be solved with a simple SELECT ... FOR UPDATE. Maybe a new issue will come up later, but we'll leave that for our future selves 😎


Iman Tumorang
Xendit Engineering

Software Architect @ Xendit | Independent Software Architect @ SoftwareArchitect.ID | Reach me at https://imantumorang.com for fast response :)