sync.RWMutex

Solving readers-writers problems

Michał Łowicki
Published in golangspec · Mar 26, 2018


Readers-writers problems (plural, since there are several variations) occur when a shared piece of data needs to be accessed by multiple threads. There are two types of threads accessing the data: readers and writers. Readers only read the data while writers modify it. When a writer has access to the data, no other thread (reader or writer) can share that access. This constraint shows up in real life when e.g. a writer cannot modify the data (like a database) atomically, so reads of incomplete changes must be blocked to prevent loading corrupted data. There are many variations of the core problem, like:

  • writers cannot be starved (waiting indefinitely for their turn)
  • readers cannot be starved
  • no thread shall be allowed to starve

A concrete implementation of a multi-reader / single-writer mutual exclusion lock (like sync.RWMutex) solves one of the readers-writers problems. Let’s see how it’s done in Go and what kind of guarantees it gives…

As a bonus we’ll look into profiling contended mutexes.

This article has also been translated into Chinese.

Usage

Before diving into implementation details, let’s see sync.RWMutex in action. The program below uses a reader/writer mutex to protect the critical section, sleep(). To visualise the whole process, the critical section is decorated with counters tracking how many readers and writers are currently executing it (source code):

package main

import (
    "fmt"
    "math/rand"
    "strings"
    "sync"
    "time"
)

func init() {
    rand.Seed(time.Now().Unix())
}

func sleep() {
    time.Sleep(time.Duration(rand.Intn(1000)) * time.Millisecond)
}

func reader(c chan int, m *sync.RWMutex, wg *sync.WaitGroup) {
    sleep()
    m.RLock()
    c <- 1
    sleep()
    c <- -1
    m.RUnlock()
    wg.Done()
}

func writer(c chan int, m *sync.RWMutex, wg *sync.WaitGroup) {
    sleep()
    m.Lock()
    c <- 1
    sleep()
    c <- -1
    m.Unlock()
    wg.Done()
}

func main() {
    var m sync.RWMutex
    var rs, ws int
    rsCh := make(chan int)
    wsCh := make(chan int)
    go func() {
        for {
            select {
            case n := <-rsCh:
                rs += n
            case n := <-wsCh:
                ws += n
            }
            fmt.Printf("%s%s\n", strings.Repeat("R", rs), strings.Repeat("W", ws))
        }
    }()
    wg := sync.WaitGroup{}
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go reader(rsCh, &m, &wg)
    }
    for i := 0; i < 3; i++ {
        wg.Add(1)
        go writer(wsCh, &m, &wg)
    }
    wg.Wait()
}

The environment of programs launched on play.golang.org is deterministic (e.g. when time begins), so rand.Seed(time.Now().Unix()) will provide the same seed value and the program’s output will most likely be the same. Either supply a different seed each time or run the program on your machine.

Sample output:

WR
RR
RRR
RRRR
RRRRR
RRRR
RRR
RRRR
RRR
RR
R
WR
RR
RRR
RRRR
RRR
RR
R
W

A new line is printed each time the set of goroutines (readers and writers) executing the critical section changes. It’s visible that RWMutex allows either one or more readers, or exactly one writer, at a time.

What is also important, and will be discussed further, is that once a writer calls Lock(), new readers are blocked. The writer waits for the current set of readers to finish their jobs (departure), and only when they are done is the writer ready to go. This is visible as the number of readers decreases one at a time and afterwards the writer is displayed on a line of its own:

...
RRRRR
RRRR
RRR
RR
R
W...

Once the writer is done, previously blocked readers are resumed and then the next writer can start. It’s worth knowing that if a writer finishes and there are both readers and writers waiting, then the readers are unblocked first and the writer afterwards. This way a writer only has to wait for the set of readers that were active when it called Lock, so neither readers nor writers get starved.

Implementation

Be warned that the implementation being investigated (718d6c58) can change at any time.

RWMutex exposes two methods used by readers (RLock and RUnlock) and two dedicated to writers (Lock and Unlock).

RLock

For the sake of brevity, let’s skip the parts related to the race detector (they’ll be replaced by ...).

func (rw *RWMutex) RLock() {
    ...
    if atomic.AddInt32(&rw.readerCount, 1) < 0 {
        runtime_SemacquireMutex(&rw.readerSem, false)
    }
    ...
}

Field readerCount is an int32 value indicating the number of pending readers: those already reading the data or blocked by a writer. Essentially it’s the number of readers which have called RLock but not yet RUnlock.

atomic.AddInt32 is the atomic equivalent of:

*addr += delta
return *addr

where addr has type *int32 and delta has type int32. Because it’s an atomic operation, there is no risk that adding delta will interfere with other threads (more about fetch-and-add).

If writers don’t come into play then readerCount is always greater than or equal to 0, and readers have a fast, non-blocking path involving only a call to atomic.AddInt32.
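As a quick aside, the fetch-and-add behaviour of atomic.AddInt32 can be seen in isolation with a tiny standalone program (not part of the RWMutex code, just an illustration):

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    var counter int32
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            // Atomically add 1 and return the new value; concurrent
            // increments never interfere with each other.
            atomic.AddInt32(&counter, 1)
        }()
    }
    wg.Wait()
    fmt.Println(atomic.LoadInt32(&counter)) // always prints 100
}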

Semaphore

It’s a data structure invented by Edsger Dijkstra which is useful for solving many synchronization problems. It’s an integer with two operations:

  • acquire (known also as wait, decrement or P)
  • release (signal, increment or V)

The acquire operation decreases the semaphore by 1. If the result is negative then the thread blocks and cannot resume until another thread increments the semaphore. If the result is non-negative then the thread continues execution.

The release operation increases the semaphore by 1. If there are blocked threads then one of them gets unblocked.

Go’s runtime provides runtime_SemacquireMutex and runtime_Semrelease functions which are used e.g. to implement sync.RWMutex.
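The runtime’s semaphores aren’t exported, but to get a feel for the acquire / release semantics, here is a toy counting semaphore built on a buffered channel (purely illustrative; the runtime uses its own internal implementation):

package main

import "fmt"

// semaphore is a toy counting semaphore; the channel's buffered
// slots play the role of permits.
type semaphore chan struct{}

func newSemaphore(n int) semaphore {
    s := make(semaphore, n)
    for i := 0; i < n; i++ {
        s <- struct{}{} // start with n permits available
    }
    return s
}

// acquire takes a permit, blocking while none is available.
func (s semaphore) acquire() { <-s }

// release returns a permit, possibly unblocking a waiting goroutine.
func (s semaphore) release() { s <- struct{}{} }

func main() {
    s := newSemaphore(1)
    s.acquire()
    fmt.Println("inside critical section")
    s.release()
}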

Lock

func (rw *RWMutex) Lock() {
    ...
    rw.w.Lock()
    r := atomic.AddInt32(&rw.readerCount, -rwmutexMaxReaders) + rwmutexMaxReaders
    if r != 0 && atomic.AddInt32(&rw.readerWait, r) != 0 {
        runtime_SemacquireMutex(&rw.writerSem, false)
    }
    ...
}

The Lock method is used by writers to get exclusive access to the shared data. First it acquires the mutex w, which excludes other writers; this mutex is unlocked at the very end of the Unlock method. Next it decreases readerCount by rwmutexMaxReaders (1 << 30). When readerCount becomes negative, RLock will block upcoming readers:

if atomic.AddInt32(&rw.readerCount, 1) < 0 {
    // A writer is pending, wait for it.
    runtime_SemacquireMutex(&rw.readerSem, false)
}

Since new readers will be blocked, what happens with the readers that are already running? Field readerWait is used to remember the current number of active readers, and the writer blocks on a semaphore. This semaphore is released when the last of those readers unlocks the mutex in the RUnlock method discussed below.

If there are no active readers then the writer simply continues its execution.

rwmutexMaxReaders

In rwmutex.go there is a constant:

const rwmutexMaxReaders = 1 << 30

What does it do and what is the meaning of the number 1 << 30?

Field readerCount is of type int32, so it has the range:

[-1 << 31, (1 << 31) - 1] or [-2147483648, 2147483647]

RWMutex uses this field both for counting pending readers and to mark a pending writer. In the Lock method:

r := atomic.AddInt32(&rw.readerCount, -rwmutexMaxReaders) + rwmutexMaxReaders

field readerCount is decreased by 1 << 30. A negative value of readerCount means that a writer is pending, and readerCount + rwmutexMaxReaders is the current number of pending readers. This also limits how many readers can be pending: if there were rwmutexMaxReaders or more pending readers, then readerCount would stay non-negative, which would break the whole mechanism. So the actual limit on the number of readers is:

rwmutexMaxReaders - 1

which is still over a billion — 1073741823.
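To make the encoding concrete, here is a small standalone snippet (the three active readers are just an assumed example):

package main

import "fmt"

const rwmutexMaxReaders = 1 << 30

func main() {
    // Suppose 3 readers hold the lock when a writer calls Lock.
    readerCount := int32(3)

    // Lock subtracts rwmutexMaxReaders, which makes the counter negative.
    readerCount -= rwmutexMaxReaders

    fmt.Println(readerCount)                     // -1073741821
    fmt.Println(readerCount < 0)                 // true: a writer is pending
    fmt.Println(readerCount + rwmutexMaxReaders) // 3: readers still active
}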

RUnlock

func (rw *RWMutex) RUnlock() {
    ...
    if r := atomic.AddInt32(&rw.readerCount, -1); r < 0 {
        if r+1 == 0 || r+1 == -rwmutexMaxReaders {
            race.Enable()
            throw("sync: RUnlock of unlocked RWMutex")
        }
        // A writer is pending.
        if atomic.AddInt32(&rw.readerWait, -1) == 0 {
            // The last reader unblocks the writer.
            runtime_Semrelease(&rw.writerSem, false)
        }
    }
    ...
}

This method decrements readerCount (incremented by RLock). If readerCount is negative, it means a writer is either waiting or running; that’s because readerCount has been decreased by rwmutexMaxReaders in the Lock method. It then checks whether the number of departing readers (readerWait) has finally reached 0, because that means the writer can acquire the semaphore and proceed.

Unlock

func (rw *RWMutex) Unlock() {
    ...
    r := atomic.AddInt32(&rw.readerCount, rwmutexMaxReaders)
    if r >= rwmutexMaxReaders {
        race.Enable()
        throw("sync: Unlock of unlocked RWMutex")
    }
    for i := 0; i < int(r); i++ {
        runtime_Semrelease(&rw.readerSem, false)
    }
    rw.w.Unlock()
    ...
}

To unlock the mutex held by a writer, readerCount is first increased by rwmutexMaxReaders so it becomes a non-negative number again. If readerCount is greater than zero, it means some readers were blocked waiting for the writer to finish, so now it’s time to wake them up. Afterwards the writer mutex w is released to allow the next writer to lock the RWMutex.

The Unlock and RUnlock methods throw an error if a reader or writer tries to unlock an unlocked mutex (source code):

m := sync.RWMutex{}
m.Unlock()

Output:

fatal error: sync: Unlock of unlocked RWMutex
...

Recursive read locking

Documentation says:

If a goroutine holds a RWMutex for reading and another goroutine might call Lock, no goroutine should expect to be able to acquire a read lock until the initial read lock is released. In particular, this prohibits recursive read locking. This is to ensure that the lock eventually becomes available; a blocked Lock call excludes new readers from acquiring the lock.

RWMutex works in such a way that if there is a pending writer, then all new attempts to call RLock will block, no matter whether a read lock has already been acquired e.g. by the same goroutine (source code):

package main

import (
    "fmt"
    "sync"
    "time"
)

var m sync.RWMutex

func f(n int) int {
    if n < 1 {
        return 0
    }
    fmt.Println("RLock")
    m.RLock()
    defer func() {
        fmt.Println("RUnlock")
        m.RUnlock()
    }()
    time.Sleep(100 * time.Millisecond)
    return f(n-1) + n
}

func main() {
    done := make(chan int)
    go func() {
        time.Sleep(200 * time.Millisecond)
        fmt.Println("Lock")
        m.Lock()
        fmt.Println("Unlock")
        m.Unlock()
        done <- 1
    }()
    f(4)
    <-done
}

Output:

RLock
RLock
RLock
Lock
RLock
fatal error: all goroutines are asleep - deadlock!

Copying locks

go tool vet is able to detect when locks are copied by value, which can result in deadlocks. More about it in one of the previous stories.
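For illustration, here is a hypothetical snippet of the kind the copylocks check flags: a struct containing a sync.Mutex is passed, and therefore copied, by value (running vet on it reports that inc passes a lock by value):

package main

import "sync"

type container struct {
    mu    sync.Mutex
    count int
}

// inc receives container by value, so the mutex is copied together with
// the struct; the copy guards nothing shared and vet reports it.
func inc(c container) {
    c.mu.Lock()
    c.count++
    c.mu.Unlock()
}

func main() {
    c := container{}
    inc(c)
}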

Performance

Some time ago it was discovered that RWMutex’s performance degrades as the number of CPU cores increases: https://github.com/golang/go/issues/17973.

Contention

Go ≥ 1.8 allows profiling of contended mutexes (patch). Let’s see how it works:

package main

import (
    "net/http"
    _ "net/http/pprof"
    "runtime"
    "sync"
    "time"
)

func main() {
    var mu sync.Mutex
    runtime.SetMutexProfileFraction(5)
    for i := 0; i < 10; i++ {
        go func() {
            for {
                mu.Lock()
                time.Sleep(100 * time.Millisecond)
                mu.Unlock()
            }
        }()
    }
    http.ListenAndServe(":8888", nil)
}

> go build mutexcontention.go
> ./mutexcontention

While the program mutexcontention is running:

> go tool pprof mutexcontention http://localhost:8888/debug/pprof/mutex?debug=1
Fetching profile over HTTP from http://localhost:8888/debug/pprof/mutex?debug=1
Saved profile in /Users/mlowicki/pprof/pprof.mutexcontention.contentions.delay.003.pb.gz
File: mutexcontention
Type: delay
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) list main
Total: 57.28s
ROUTINE ======================== main.main.func1 in /Users/mlowicki/projects/golang/src/github.com/mlowicki/mutexcontention/mutexcontention.go
0 57.28s (flat, cum) 100% of Total
. . 14: for i := 0; i < 10; i++ {
. . 15: go func() {
. . 16: for {
. . 17: mu.Lock()
. . 18: time.Sleep(100 * time.Millisecond)
. 57.28s 19: mu.Unlock()
. . 20: }
. . 21: }()
. . 22: }
. . 23:
. . 24: http.ListenAndServe(":8888", nil)

What is this 57.28s and why does it point to mu.Unlock()?

When a goroutine blocks on a mutex by calling the Lock method, the exact time when it happens is recorded (acquiretime). Afterwards, when another goroutine unlocks that mutex and there is at least one goroutine waiting to acquire it, one of them is unblocked and the mutexevent function is called. Function mutexevent checks, based on the rate set by SetMutexProfileFraction, whether this particular event should be saved or discarded. Such an event contains the time spent waiting (current time minus acquiretime). In the listing above, the total wait time across all goroutines blocked on this particular mutex is gathered and displayed. That’s why the time is attributed to mu.Unlock(): the wait is recorded at the moment the mutex is released.

Contention on reader locks (RLock and RUnlock) will be added in Go 1.11 (patch).
