Database Lock in Elixir? Why and How? And the Alternative in Elixir

KaFai C
4 min readMar 18, 2019

--

I got a question in a recent interview about locking in Database. I found that I actually lack insufficient knowledge of this essential part of building a robust application. I would like to share my finding on this topic here.

Photo by Chris Barbalis on Unsplash

Why?

It’s all about race condition.

A big part of most web applications is used to store some data, probably in a database. Imagine in a simple case of updating a user credit in an online book shop. A user can spend this credit earned by the previous purchase.

defmodule BookOrdering do
def purchase_book_with_credit(user_id, book_id) do
with {:ok, book} <- get_book(book),
{:ok, new_user_credt_balance} <- UserCreditBalance.charge(user_id, book) do
publish_event(:purchase_book_with_credit_success)
:ok
end
end
end
defmodule UserCreditBalance do
def charge(user_id, book) do
credit_balance = get_credit_balance(user_id)
update_balance(credit_balance, credit_balance.amount - book.cost)
end
end

For any non-trivial web application, there will be some sort of concurrency supported. And assume we are using the Phoenix web framework in elixir, each HTTP request will start as a connection process in Ranch.

Now image a user open 10 browser windows and press purchase at the same time. What will happen could be treated as spawning multiple processes and calling the purchase_book_with_credit

Enum.each(1..10, fn _ ->
spawn(fn -> purchase_book_with_credit(user_id, book_id))
end)

This will create a possibility of race condition in which the new amount of a user credit might change after it gets read and before it gets committed. This bug-prone pattern could be generalized as read-modify-write.

Luckily, SQL database provides us lock which disallows two database sessions to update the same data.

Pessimistic Locking

Row Level Locking

For our above specific example, it could be easily solved by using row-level lock provided in SQL withSELECT FOR UPDATE .

In Ecto, you could do

User |> where(u.id == ^current_user) |> lock("FOR UPDATE")

Transaction Isolation Level

Depends on how you structure your code involving an Ecto query, it’s usually unnecessary and decreasing performance(Lock contention) to add a lock for all query.

In most cases, it will be more suitable to set the transaction isolation level to Repeatable read which looks something like this

defmodule UserCreditBalance do
def charge(user_id, book) do
Repo.transaction(fn ->
Repo.query!("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")
credit_balance = get_credit_balance(user_id)
update_balance(credit_balance, credit_balance.amount - book.cost)
end)
end
end

In this case, the SQL database will abort the update if it detects the row read has been updated and performs a rollback which essentially does the same locking for you.

Optimistic Locking

The biggest downside of using pessimistic locking is the decrease in performance. In reality, multiple transactions involving the same row can usually be complete with affecting others. In this case, optimistic locking will be a better solution and this will be implemented in application-side and Ecto also provides it.

And it’s implemented by adding an integer column and a changeset function optimistic_lock and Ecto can use the integer column to check if the data has been changed before committing it.

Using Process as an Alternative

Although database lock is powerful, they are not the best in performance and it might not able to help if the operation involves other data living outside of SQL database(maybe :ets or an external API).

For the above example of purchasing a book, we could generalize it as a problem of serializing access to some resources. We could use the power of Erlang process in which the mailbox is received and could be handled synchronously.

There could be lots of different implementation. My first naive implementation is to spawn a child GenServer work with a unique identifier under a dynamic supervisor and its sole purpose is to take a thunk and execute it synchronously and send the result back to the caller.

This approach works fine for the above case as we can use user_id as the unique identifier and pass &UserCreditBalance.Charge/1 as a thunk. We can achieve the same lock effect without a database lock. Also, there will not be a deadlock problem.

The downside of it will be using more memory due to copying data created in the worker process to the caller process. Also, if you are running more than one node, the nodes have to be connected and use some kind of global registry to achieve this.

There is a nicer library written in Erlang called Sleeplocks which also spawn a process with a unique identifier but the process will only keep the total number of slots, the processes currently holding references to the lock, and a queue of waiting processes trying to acquire a lock. So it avoids the data copying as described in the above naive approach.

Please let me know what’s your thought on these and comment if there are other better alternatives.

--

--