The Readers-Writers Problem (Swift Edition)

Bruno Rovea
Published in The Startup
Jul 10, 2020

The readers-writers problem is a concurrency problem that arises when a shared resource that is not thread-safe can be read and written by multiple threads at the same time. The problem occurs when one thread writes while another thread is reading or writing. In contrast, when the resource is only being read by multiple threads the problem doesn't appear, since nothing is mutating the resource and no reader can observe outdated or inconsistent data.

Personally, I studied the concepts in college and studied the iOS implementation for tech interviews, but luckily (and probably also irresponsibly) never had to use it professionally. Why luckily? Well, because we write thread-unsafe code all the time, but we normally don’t suffer from this problem because of the architecture and/or use case (or sometimes, just luck). If you doubt it, know that a simple mutable array is not thread-safe, and whenever one is shared between threads we’re susceptible to this problem 😱
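To see how little it takes, here is a tiny illustration (not from the original bug, and the `NSLock` is just for contrast) of unsynchronized array access. With the lock the final count is always 1000; delete the lock/unlock lines and the concurrent appends become a data race, so the count varies between runs and the program may even crash.

```swift
import Foundation

var items: [Int] = []
let lock = NSLock()

// 1000 appends from many threads at once. The lock makes each append
// exclusive; without it, Array's internal state gets corrupted.
DispatchQueue.concurrentPerform(iterations: 1000) { i in
    lock.lock()
    items.append(i)
    lock.unlock()
}

assert(items.count == 1000)
```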

And this data structure is exactly the reason I’m writing this.

One of these days a weird and very intermittent bug (I had to relaunch the app many, many times just to see it, since I didn’t know the cause) appeared in a list component that was fed by a shared resource. The bug consisted of one random cell duplicating the image from another cell.

The first suspect was cell reuse, but this component already handled that case, and a quick check of the code ruled it out.

Since it was a "dumb" component that only consumed data from the shared resource, the problem had to be in the resource itself. The shared resource had a mutable array that was only modified (written) internally by different functions, depending on some business rules, but was read by a few objects, which also observed the resource for changes so they could update themselves with the new data. One of these objects was the list component. The shared resource code looked something like this:

private(set) var items: [Item] = []

func fetchItems() {
    service.fetchItems { newItems in
        // Skip the update (and the notification) if nothing changed
        guard newItems != self.items else { return }
        self.items = newItems
        self.notifyUpdatedItems()
    }
}

func addItem(_ item: Item) {
    service.addItem(item) { addedItem in
        self.items.append(addedItem)
        self.notifyAddedItem()
    }
}

func removeItem(_ item: Item) {
    service.removeItem(item) {
        self.items.removeAll { $0 == item }
        self.notifyRemovedItem()
    }
}

As we can see, we have one function for fetching all items, one for adding an item and another for removing an item from the service, always mutating the local array and notifying the observers. The shared resource works as the single source of truth (and single point of failure) for this information throughout the app lifecycle.

Debugging the resource through breakpoints and inspection never showed any discrepancy, probably because of the asynchronicity: it’s impossible to debug without stepping instruction by instruction, which would take ages to catch the problem. Because of a recent business rule the functions were being called more often during app initialization, which caused the shared object to update itself every time one of them was called. The guard statement in the fetch was there to avoid unnecessary updates, and it was working without causing any problems, or at least that’s what we thought.

Since debugging without resorting to Instruments didn’t shed any light, before going for the “big guns” I tried the developer's best friend, the old print statement, and looked at the URL returned from the resource. It was firing in order, as expected, but then on a random new initialization the duplication happened. When I saw it the bell rang and I remembered the college classes. The readers-writers problem!

Since the functions were being called by different threads at almost the same time, eventually a read would happen at the same time as a write, causing a data race, which in turn caused the wrong URL to be fetched from the resource.

Fortunately it’s very simple to solve in Swift, since we have DispatchQueues, which can control any tasks we want to run, and when needed we also get a barrier flag essentially for free, which works like a semaphore around a single task.

The DispatchQueue can be either serial or concurrent.

let serialQueue = DispatchQueue(label: "serial.queue")
let concurrentQueue = DispatchQueue(label: "concurrent.queue", attributes: .concurrent)

As the names say, one of them executes tasks serially while the other executes them in parallel, which can bring much better performance. There's another important parameter, the qos (quality of service), and we should set the correct one for our use case; Apple's documentation describes each option.
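A quick sketch of the difference (the queue labels are illustrative): on a serial queue, tasks run one at a time in FIFO order, so the appends below never interleave and the result is deterministic; the qos parameter just hints how urgently the system should schedule the queue's work.

```swift
import Foundation

// A serial queue runs one task at a time, in submission order.
let serial = DispatchQueue(label: "demo.serial")
var order: [Int] = []
let group = DispatchGroup()
for i in 0..<3 {
    serial.async(group: group) { order.append(i) }
}
group.wait()
assert(order == [0, 1, 2])  // FIFO: never [0, 2, 1] etc.

// qos hints the scheduler; .utility suits long-running background work.
let background = DispatchQueue(label: "demo.cleanup",
                               qos: .utility,
                               attributes: .concurrent)
```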

We could use the serial queue, so only one write or one read is executed at a time.

private var _items: [Item] = []
private(set) var items: [Item] {
    get { serialQueue.sync { self._items } }
    set { serialQueue.async { self._items = newValue } }
}

We need an auxiliary array because the API for this object exposes an array of items to be accessed directly; that's why we have the queue call in the getter. But if our API provided the array through a function, we could keep the array private and embed the queue call in the function. Since we use the getter, we also need the setter, but with the function-based approach we could instead put the queue call in each function that writes to the array.
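That function-based alternative could look something like this (the `ItemCache` type and function names are illustrative, not from the article): the array stays fully private, no computed property or backing `_items` is needed, and each accessor hops onto the queue itself.

```swift
import Foundation

struct Item: Equatable { let id: Int }

final class ItemCache {
    private let serialQueue = DispatchQueue(label: "itemcache.serial")
    private var items: [Item] = []   // never touched off the queue

    // Readers call a function instead of reading a property directly.
    func currentItems() -> [Item] {
        serialQueue.sync { items }
    }

    func append(_ item: Item) {
        serialQueue.async { self.items.append(item) }
    }
}

let cache = ItemCache()
cache.append(Item(id: 1))
// The serial queue is FIFO, so the sync read sees the async append.
assert(cache.currentItems() == [Item(id: 1)])
```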

But since we said in the beginning that we only want to serialize the writing, we can use the barrier flag, which turns the concurrent queue into a serial one just for that task: the queue waits for all pending tasks to finish before running the barrier task, and tasks submitted afterwards wait for the barrier to finish. After switching to a concurrent DispatchQueue, the code looked like this.

private var _items: [Item] = []
private(set) var items: [Item] {
    get { concurrentQueue.sync { self._items } }
    set { concurrentQueue.async(flags: .barrier) { self._items = newValue } }
}

Neat, right?

And after that? Well, duplication no more!

Thanks for reading, and I hope this helps you in the future!
