Four different ways to handle SQLite concurrency

Gwendal Roué
5 min readApr 6, 2016

--

Illustrated by four Swift libraries

Let’s talk about multithreaded applications that store their data in an SQLite database. Most non-trivial topics have several answers, and database concurrency is no exception.

The four libraries SQLite.swift, FMDB, GRDB, and Core Data all provide Swift bindings to SQLite. They all offer various concurrency models that differ in their safety features, and in the amount of work and programming skills that is left to the host application. Among those, FMDB and GRDB will stand up as both the most safe, and the most easy to use properly.

But what are the problems that even require a “concurrency model” in the first place? Focusing on application development, let’s list three situations that need care:

  1. Concurrent writes: Two threads want to write in the same database as the same time. SQLite does not allow that.
  2. Isolation troubles: As two database queries run one after the other, a concurrent thread sneaks in and modifies the database in between. The two queries can thus perform inconsistent fetches or updates. Maybe we will display funny values on the screen. Or face a relational constraint error. Worse, we can have a silent application model corruption: for example, when a person should have only one avatar, two avatars are stored by mistake.
  3. Conflicts: The same piece of data is both edited by the application user, and refreshed from a network operation. What will eventually be stored in the database? Can any conflict be noticed?

When an application uses SQLite through a wrapper library, the question is at which level those situations are handled, if they are handled at all.

Let’s see in details:

The serialized model

In this model, used by SQLite.swift, the application user is given a single connection to the database, and all database accesses are serialized: this means that at any moment, only one thread can read and write in the database.

This model successfully addresses the “single-writer” trouble, since it is impossible for two threads to write at the same time.

However, it leaves the application naked in front of “isolation troubles”. In the sample code below, we see an application that checks if an avatar exists before trying to insert one. This code is dangerous, because some other thread may have modified both the users and avatars tables between the two queries:

let userAvatars = avatars.filter(userId == 1)
let insert = avatars.insert(userId <- 1, url <- avatarURL)
if db.scalar(userAvatars.count) == 0 {
try db.run(insert)
}

Of course, the application can handle this situation itself. By wrapping the code in an explicit transaction, the application gets the required isolation from other threads:

try db.transaction {
if db.scalar(userAvatars.count) == 0 {
try db.run(insert)
}
}

But if the application developer overlooks the potential issue, we are left with a subtle bug that just waits to occur in an indeterminate fashion.

Finally, this model does not handle the conflicts that may occur during our concurrent user edition vs. download operation example above. The last write always wins, and conflicts have to be handled at the application level.

  1. Concurrent writes: SOLVED
  2. Isolation: handled by the application
  3. Conflicts: handled by the application

The isolated and serialized model

This model is exemplified by FMDB. Identical to the plain serialized model, it adds a tiny API modification with great consequences: there is no direct access to the database. The developer has to wrap database statements together in blocks:

dbQueue.inDatabase { db in
if db.intForQuery("SELECT COUNT ...") == 0) {
db.executeUpdate("INSERT INTO avatars ...")
}
}

It is those blocks of statements that are serialized, not individual statements.

In such a model, isolation is just a consequence of serialization: this is brilliant API design!

  1. Concurrent writes: SOLVED
  2. Isolation: SOLVED
  3. Conflicts: handled by the application

The isolated and concurrent model

In this model, we still profit from the isolation guarantees we’ve just seen above, but the pessimistic locking of serialization is lifted. Now several threads can read and write in the database at the same time.

This is the model introduced by GRDB. It is a mix of SQLite’s write-ahead logging (aka WAL mode), with extra GRDB topping.

Writes are still serialized — because the SQLite single-writer policy remains active in the WAL mode:

try dbPool.write { db in
if try Int.fetchOne(db, "SELECT COUNT ...") == 0) {
try db.execute("INSERT INTO avatars ...")
}
}

Conversely, several threads can read at the same time. They are not blocked by writes, which means that an application can update the application screen even if some long database update is running in the background. And eventual concurrent writes are not visible inside a read block:

try dbPool.read { db in
// Those values are guaranteed to be equal:
let count1 = try User.fetchCount(db)
let count2 = try User.fetchCount(db)
}

As with FMDB, the direct access to the database is actively discouraged, in favor of a block-based API where the isolation unit is the block, not the individual statement.

The write and read methods are the main entry points. More advanced use cases are also supported, but are beyond the scope of this article. Check the GRDB Concurrency Guide for more information.

  1. Concurrent writes: SOLVED
  2. Isolation: SOLVED (with WAL mode support)
  3. Conflicts: handled by the application

Core Data’s Managed Object Contexts

Core Data does not describe itself as an SQLite wrapper library, but it often uses SQLite as a storage engine. It deserves its place here.

Core Data does handle all three kinds of situations we have identified at the beginning of the article: concurrent writes, isolation troubles, and conflicts.

To achieve such a tour de force, Core Data drastically enlarges the scope, and duration, of the context managed by the library.

What is a managed context? For SQLite.swift, the managed context is the individual SQL statement, during the time of its execution. For FMDB and GRDB, the managed context is the block of subsequent statements, during the time of its execution. For Core Data, the managed context is, well, the whole world of operations that happen both in the database and in memory, during the whole lifetime of an instance of the NSManagedObjectContext class.

Yes, that is quite a gap. And this is how Core Data can detect conflicts, unlike the previous concurrency models which accept any write as long as it does not break a relational constraint of the database.

Now conflict detection has a consequence: it poses a threat to all writes. Unfortunately, handling conflicts with Core Data is difficult. Controlling when, where, and how conflicts should be raised and handled is difficult. And in the end writing correct multithreaded applications with Core Data is difficult.

It is so difficult that since the introduction of Core Data ten years ago, many good books written by smart people have been published on the subject. It is all but a closed case.

The results:

  1. Concurrent writes: SOLVED
  2. Isolation: SOLVED (with WAL mode support)
  3. Conflicts: handled by the application, and it is difficult.

Conclusion

All SQLite wrappers out there provide a concurrency model that aims at helping multithreaded applications dealing safely with the database.

But not all concurrency models are equal. Too simple, it is unforgiving to developers mistakes, and paves the way for unpredictable data corruption and crashes. Too complex, it raises errors that are very difficult to handle properly. Too pessimistic, it throttles your application performance.

FMDB and GRDB are the most safe, and easy to use properly. On top of that, GRDB allows parallels reads that improve application responsiveness.

--

--