Database Locking with Ecto in Elixir
How to be sure tasks are executing once (and only once) across multiple servers
So before I go into database locking (what is it?) and implementing it, I want to first start off with a problem and how I came to learn about this myself.
We have an app that is in charge of sending out invoices when a set of criteria are met (X days before some date…). There is a task we developed that checks daily for pending invoices that need to be sent. The issue we found was that when we ran this process in production, we risked billing customers multiple times because we have multiple production servers running this task at the same time. Mo’ servers mo’ money right?
Wrong! It turns out, people do not like to be sent multiple invoices for the same thing. Who would have guessed.
In this post, we’ll walk through setting up — and solving — this double-billing problem together with Ecto locks.
Setting Up The Problem
Note: If you don’t want to go through the setup of installing Ecto and creating a migration, feel free to clone the demo repo and check out the create-invoices tag, then just skip down to the Feeling the Pain section in this post.
To start off here, I’m assuming that you already have Elixir and Postgres installed. Run
mix new — sup ecto_lock in your terminal to create a new project.
Open up the project and update the
deps function in your
And then also update the
application function in the same file:
Next, load the dependencies with
mix deps.get in the terminal and create the db config with
mix ecto.gen.repo -r EctoLock.Repo.
Let’s also add our repo module to our startup processes by adding the following to our
And then replace
config/config.ex file with:
Creating the Invoice Table
To create the migration, let’s run
mix ecto.gen.migration create_invoices in the terminal. This will create a migration file:
priv/repo/migrations/<some-numbers>_create_invoices.exs (note: the filename has a timestamp in it, so yours won’t be exactly the same as mine here). Then replace the contents of this file with the following:
This will create a very simple table that has one column,
pending. In our app, of course, the
invoices table has many more columns with useful information (balance, due_date, etc…), but we don’t need anything else to learn about lock :)
Finally, let’s create a basic schema file for an invoice in a new file,
lib/ecto_lock/invoice.ex, and fill it in with the following:
Now this should be enough for us to get going! Go ahead and run the migration with
mix ecto.create && mix ecto.migrate.
Feeling the Pain
Let’s write some code where we actually run into this locking issue. We’ll create a new file at
lib/ecto_lock/bill_pending_invoices.ex and fill it with the following:
This code adds some basic functions we can use to create, send, and update invoices. We want to grab all of our pending invoices with
bill_pending_invoices/0 and then send each one. For each one we send, we need to get the invoice, hit our billing API, and then update our own database so we know that it was sent. For the purposes of this blog post, we’re adding a one second delay (
:timer.sleep(1000)) where we are pretending that the process is going through an API request/response. Now, let’s give it a try!
Let’s also create a helper module for ourselves with some commands that we’ll be running frequently. Go ahead and create a file
lib/ecto_lock/helper.ex and put the following in it:
Start up the app by typing
iex -S mix in the terminal. This command gives us an interactive Elixir process:
Inside the Elixir process, let’s create a few pending invoices by running the following function:
Then, let’s send out the invoices by running
EctoLock.BillPendingInvoices.bill_pending_invoices(). You should see something that looks like this:
We can see that all three invoices were sent!
Go ahead, run it again, and you’ll see there are no more invoices left to send. We’ll get an immediate return, rather than the three second wait from “hitting the API”.
Now this all works well and good, but what if we had two servers running this process at the same time? What would that look like? To simulate this, we’re going to spin up two Elixir processes that will run at relatively the same time.
To help us do this, we’re going to add the following function to our
Note: Spawning an elixir process is just allowing some code to execute asynchronously. It’s a great topic for another blog post, but for the moment, I think that should be enough of an understanding :)
Go ahead and restart your
iex session (
ctrl + c twice and then
iex -S mix). Let’s create some more dummy invoices with
EctoLock.Helper.create_invoices(), and then let’s run
EctoLock.Helper.bill_from_two_servers(). This is what we get:
iex(2)> EctoLock.Helper.bill_from_two_servers()Sending invoice 4...Sending invoice 4...Invoice 4 sent!Invoice 4 sent!Sending invoice 5...Sending invoice 5...Invoice 5 sent!Invoice 5 sent!Sending invoice 6...Sending invoice 6...Invoice 6 sent!Invoice 6 sent!
Note: I removed all of the SQL queries from above and only left our
IO.puts to make this a bit more readable.
Now that’s a problem. We’ve sent out each invoice twice.
Note: You can skip up to this part of the app by cloning the feeling-the-pain tag.
There are, of course, many possible solutions to this problem. I’ll touch on a few of the ones we considered, but I won’t go too deep into any of them.
1. Only Run the Process on One Server
This was my first thought when we ran into this issue. I figured the easiest solution would be to just avoid this problem in the first place. After talking with the team though, we realized that:
- There are much larger operation concerns here with regards to how to do this. Is one server designated prime? What if it goes down? Do all the servers need to then talk to each other…
- When dealing with something like invoicing customers, we want to be sure that we’re not billing them twice or anything like that. Relying on each server to know if it should or should not run the task is brittle; it would be better to rely on something more stable and with fewer components.
2. Idempotence Keys
Given that we are trying to combat the same action occurring multiple times, another option here was to implement some sort of unique key and having some other service (e.g. the service in charge of the billing) be in charge of recognizing the same request being sent in twice. This option is highly variable because it depends on the specific service (API) that we’re using. While this seemed like a great option, it just wasn’t an available feature for us.
The idea here is that each server looks up an invoice to send it. When it retrieves the invoice from the database, it puts a lock on that row so that no other process can access it. This means that if Server A looks up invoice 7 and puts a lock on it, Server B would not be able to retrieve the same invoice and therefore would not be able to send it out.
What does it do?
Surprise surprise, we ended up going with database locking! Database locking is a very robust and well used tool. As explained above, database locking allows us to make a row inaccessible. We can use this tool to ensure that only one process is working on a piece of data at a time so we can avoid write conflicts (when two processes try and make updates at the same time).
How do you use it?
Let’s go over how we can use Ecto’s lock function.
Like all Ecto queries, there are two ways we can use this. We can either use the keyword syntax or the expression syntax. I’m going to go ahead and use the keyword syntax, but either would work.
First, let’s add the following function to our
Here we are querying for an invoice and then locking it. The string we’re passing to lock has to be a very specific string though. We can take a look at some options for postgres here. See the section on
FOR UPDATE. This ensures that this row is locked until we update it in this transaction. We are also adding the
NOWAIT option to ensure that other processes will simply fail when trying to retrieve the same row rather than waiting to perform their action. If you leave the
NOWAIT option off, our second process would still try to send out an invoice after the first completes. We know we don’t want this since the invoice is already being handled, so we’re just telling our second server not to wait.
Now let’s update our
get_invoice function in the
EctoLock.BillPendingInvoices module to look like:
This will now ensure that when we retrieve the invoice, we are also locking it. Let’s go ahead and test our code again to see how it works! Restart your interactive Elixir terminal and run
EctoLock.Helper.create_invoices() followed by
EctoLock.Helper.bill_from_two_servers(). This is what we get:
Note: I’ve removed some of the SQL messages here, but not all of them.
Alright! We can see that we’ve solved the problem with locking!
Now each invoice is only being sent one time :) Whichever server did not get the database lock ended up throwing the following error:
** (Postgrex.Error) ERROR 55P03 (lock_not_available) could not obtain lock on row in relation “invoices”. Failing to get the lock threw an error that ended up stopping this process from continuing.
We can now move forward with confidence that we’re only invoicing each customer once.
Wrap Up and Future Work
This is meant to be a very basic intro to locking. Obviously it isn’t ideal to throw an error and have that be an expected flow, but the way I would solve this goes into using
Ecto.Multi to create larger transactions. Hopefully we’ll have a blog post coming out about that soon!
In the meantime, I hope this gave a quick explanation of what database locking is, why you might want to use it, and how to implement it in Elixir with Ecto!
Thanks for reading! Want to work on a mission-driven team that loves Elixir and secure database transactions? We’re hiring!