Race Conditions, Locks, Semaphores, and Deadlocks

Writing correct programs is hard; writing correct concurrent programs is even harder. This short article is a practical introduction to Race Conditions, Locks, Semaphores, Deadlocks, Livelock, and Starvation, with Java code examples.

Alex Prut
The Startup

--

Writing thread-safe code is at its core about managing access to state, and in particular to shared mutable state. By shared we mean that a variable could be accessed by multiple threads; by mutable we mean that its value could change during its lifetime.

If multiple threads access the same mutable state variable without appropriate synchronization, your program is broken.

There are three ways to fix it:

  1. Don’t share the state variable across threads
  2. Make the state variable immutable
  3. Use synchronization whenever accessing the state variable

Concurrency bugs are difficult to reproduce and debug. The benefit of a small performance gain on some infrequently used code path may well be dwarfed by the risk that the program will fail.

Stateless and immutable objects are always thread-safe.

A class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code. Thread-safe classes encapsulate any needed synchronization so that clients need not provide their own.

Race Condition

The possibility of incorrect results in the presence of unlucky timing is so important in concurrent programming that it has a name: race conditions. The most common type of race condition is check-then-act.

Example 1: Thread Unsafe Sequence
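A minimal sketch of such a non-thread-safe counter (the class and method names are illustrative) might look like this:

```java
public class Sequence {
    private int value;

    // Looks like one action, but value++ is really three steps:
    // read the current value, add one, write the new value back.
    public int getNext() {
        return value++;
    }
}
```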

Let’s consider the Sequence.java class in Example 1; it is susceptible to lost updates. The counter increment may look like a single action because of its compact syntax, but it is not atomic, meaning that it does not execute as a single indivisible operation. Instead, it is shorthand for a sequence of three discrete operations: fetch the current value, add one to it, and write the new value back. This is an example of a read-modify-write operation, in which the resulting state is derived from the previous state.

Your warning bell should ring whenever you see a compound operation such as read-modify-write or check-then-act: these are prone to race conditions.

Consider the case of two threads (A and B), as shown in Image 1. Due to unlucky timing, both threads may read the same value (say 9), each add one to it, and each write back 10, so one of the two increments is lost.

Image 1: Race condition in Sequence.java (Example 1)

We can solve this issue by making sure that only one thread at a time can perform the increment. As shown in Example 2, the problem is solved by adding the synchronized keyword to the method.

Example 2: Thread-Safe Sequence
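A sketch of the fixed version, assuming the same illustrative class: declaring the method synchronized makes the whole read-modify-write execute under the object’s intrinsic lock.

```java
public class Sequence {
    private int value;

    // Only one thread at a time can execute this method on a given
    // instance, so no increment can be lost.
    public synchronized int getNext() {
        return value++;
    }
}
```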

As another example, let’s consider lazy initialization, the tactic of delaying the creation of an object, the calculation of a value, or some other expensive process until the first time it is needed. A singleton implementation may use lazy initialization, creating the instance when the static accessor is first invoked. Example 3 shows a broken implementation of lazy initialization, and Example 4 shows how to fix it.

Example 3: Race condition in lazy initialization
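A sketch of a broken lazily initialized singleton (the names are illustrative): two threads can both observe instance == null and end up creating two different instances.

```java
public class LazySingleton {
    private static LazySingleton instance;

    private LazySingleton() {}

    // Unsynchronized check-then-act: the check (instance == null) and
    // the act (creating the instance) can interleave between threads.
    public static LazySingleton getInstance() {
        if (instance == null) {
            instance = new LazySingleton();
        }
        return instance;
    }
}
```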
Example 4: Safe lazy initialization
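A sketch of one possible fix: synchronizing the accessor so the check-then-act runs under the class’s intrinsic lock and at most one instance is ever created.

```java
public class LazySingleton {
    private static LazySingleton instance;

    private LazySingleton() {}

    // The class's intrinsic lock guards both the check and the act.
    public static synchronized LazySingleton getInstance() {
        if (instance == null) {
            instance = new LazySingleton();
        }
        return instance;
    }
}
```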

Lock, Monitor

In concurrent programming, a monitor is a synchronization construct that allows threads to have both mutual exclusion and the ability to wait (block) for a certain condition to become true. A lock or mutex (from mutual exclusion) is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy.

Java provides a built-in locking mechanism for enforcing atomicity: the synchronized block. These built-in locks are called intrinsic locks or monitor locks. Intrinsic locks in Java act as mutexes (mutual-exclusion locks), which means that at most one thread may own the lock at a time. When thread A attempts to acquire a lock held by thread B, A must wait, or block, until B releases it. If B never releases the lock, A waits forever. Every shared, mutable variable should be guarded by exactly one lock, and it should be clear to maintainers which lock that is. Example 5 shows the usage of a synchronized block.

Example 5: lock usage on the current object
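A sketch of a synchronized block that locks on the current object (the Counter class is illustrative); note that both the write and the read synchronize on the same lock, for visibility as well as atomicity.

```java
public class Counter {
    private int count;

    public void increment() {
        // The intrinsic lock of the current object (this) guards count.
        synchronized (this) {
            count++;
        }
    }

    public int get() {
        // Reads synchronize on the same lock to see up-to-date values.
        synchronized (this) {
            return count;
        }
    }
}
```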

Avoid holding locks during lengthy computations or operations at risk of not completing quickly, such as network or console I/O. Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock. Accessing shared mutable data requires synchronization; one way to avoid this requirement is to not share. If data is only accessed from a single thread, no synchronization is needed. If an object’s state cannot be modified, these risks and complexities go away: immutable objects are inherently thread-safe.

Thread confinement is one of the simplest ways to achieve thread-safety. When an object is confined to a thread, such usage is automatically thread-safe even if the confined object itself is not. A more formal means of maintaining thread confinement is ThreadLocal.
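As a small illustration, assuming a date-formatter use case, ThreadLocal gives each thread its own instance of a non-thread-safe object, so the object is never actually shared:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class Formatters {
    // Each thread lazily gets its own SimpleDateFormat, confining the
    // mutable formatter to a single thread.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    public static String format(Date date) {
        return FORMAT.get().format(date);
    }
}
```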

It is a good practice to make all fields final unless they need to be mutable.

The design process for a thread-safe class should include these three basic elements:

  1. Identify the variables that form the object’s state
  2. Identify the invariants that constrain the state variables
  3. Establish a policy for managing concurrent access to the object’s state

Making a class thread-safe means ensuring that its invariants hold under concurrent access; this requires reasoning about its state. Objects and variables have a state space: the range of possible states they can take on. The smaller this state space, the easier it is to reason about. By using final fields wherever practical, you make it simpler to analyze the possible states an object can be in. You cannot ensure thread safety without understanding an object’s invariants and post-conditions. Constraints on the valid values or state transitions of state variables can create atomicity and encapsulation requirements. Example 6 shows an immutable object.

Example 6: an immutable object
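A sketch of an immutable value class (the Point class is illustrative): the class is final, all fields are final and set once in the constructor, and "modifying" it produces a new instance instead of changing state.

```java
public final class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }

    // Returns a new Point rather than mutating this one.
    public Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }
}
```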

Semaphore

A semaphore is a variable or abstract data type used to control access to a common resource by multiple processes or threads in a concurrent system. It is used to solve critical-section problems and to achieve synchronization in multiprocessing environments. A trivial semaphore is a plain variable that is changed (for example, incremented or decremented, as shown in Example 7) depending on programmer-defined conditions.

Example 7: Simple Semaphore implementation
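A sketch of such a trivial semaphore built on wait/notify (the take/release names are illustrative): release increments the shared counter, while take blocks until the counter is positive and then decrements it.

```java
public class SimpleSemaphore {
    private int signals = 0;

    // Increment the counter and wake up one waiting thread.
    public synchronized void release() {
        signals++;
        notify();
    }

    // Block until a signal is available, then consume it.
    public synchronized void take() throws InterruptedException {
        while (signals == 0) {
            wait();
        }
        signals--;
    }
}
```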

Counting semaphores (see Example 8) are used to control the number of activities that can access a certain resource or perform a given action at the same time. Counting semaphores can be used to implement resource pools or to impose a bound on a collection. A semaphore manages a set of virtual permits. A binary semaphore (see Example 9) can be used as a mutex with non-reentrant locking semantics; whoever holds the sole permit holds the mutex. Semaphores are useful for implementing resource pools such as a database connection pool.

Example 8: Simple Bounded Semaphore implementation
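A sketch of a bounded (counting) semaphore along the same lines, starting with a fixed number of permits; at most that many threads can hold a permit at the same time (the names are illustrative).

```java
public class BoundedSemaphore {
    private final int bound;
    private int permits;

    public BoundedSemaphore(int bound) {
        this.bound = bound;
        this.permits = bound; // all permits initially available
    }

    // Block until a permit is available, then take it.
    public synchronized void take() throws InterruptedException {
        while (permits == 0) {
            wait();
        }
        permits--;
        notifyAll();
    }

    // Return a permit; the count can never exceed the bound.
    public synchronized void release() throws InterruptedException {
        while (permits == bound) {
            wait();
        }
        permits++;
        notifyAll();
    }
}
```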
Example 9: Using a Bounded Semaphore as a Lock
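A sketch of a lock built on the BoundedSemaphore above with a single permit: whoever holds the sole permit holds the mutex. Note that this lock is not reentrant, and nothing prevents a thread that never locked from calling unlock.

```java
public class SemaphoreLock {
    private final BoundedSemaphore semaphore = new BoundedSemaphore(1);

    // With one permit, at most one thread can be inside the lock.
    public void lock() throws InterruptedException {
        semaphore.take();
    }

    public void unlock() throws InterruptedException {
        semaphore.release();
    }
}
```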

Deadlock

A deadlock occurs when a thread enters a waiting state because a requested resource is held by another waiting thread, which in turn is waiting for a resource held by yet another waiting thread. If a thread is indefinitely unable to change its state because the resources it has requested are being used by other waiting threads, then the system is said to be in a deadlock.

Think of the threads as the nodes of a directed graph whose edges represent the relation "thread A is waiting for a resource held by thread B"; if this graph contains a cycle, there is a deadlock.

Example 10 illustrates an object that could end up in a deadlock due to bad timing, as shown in Image 2. This particular case is known as a lock-ordering deadlock. If we make sure that the locks are acquired in the same order in all threads, there can be no deadlock (as shown in Example 11).

Example 10: Deadlock
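A minimal sketch of a lock-ordering deadlock (the class and method names are illustrative): if thread A calls leftRight() while thread B calls rightLeft(), A can hold left while waiting for right, and B can hold right while waiting for left.

```java
public class Transfer {
    private final Object left = new Object();
    private final Object right = new Object();

    public void leftRight() {
        synchronized (left) {
            synchronized (right) {
                // ... work with both resources
            }
        }
    }

    public void rightLeft() {
        // Opposite acquisition order: this is the deadlock risk.
        synchronized (right) {
            synchronized (left) {
                // ... work with both resources
            }
        }
    }
}
```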
Image 2: Deadlock scenario in Example 10 due to bad timing
Example 11: Solving the Deadlock scenario from Example 10
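A sketch of the fix: both methods acquire the locks in the same global order (left, then right), so a cycle in the waits-for graph can never form.

```java
public class Transfer {
    private final Object left = new Object();
    private final Object right = new Object();

    public void leftRight() {
        synchronized (left) {
            synchronized (right) {
                // ... work with both resources
            }
        }
    }

    public void rightLeft() {
        // Same acquisition order as leftRight(): no deadlock is possible.
        synchronized (left) {
            synchronized (right) {
                // ... work with both resources
            }
        }
    }
}
```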

A program that never acquires more than one lock at a time cannot experience lock-ordering deadlock.

Starvation

Starvation occurs when a thread is perpetually denied access to resources it needs in order to make progress; the most commonly starved resource is CPU cycles. Starvation happens when greedy threads make shared resources unavailable for long periods.

Livelock

Livelock is a form of liveness failure in which a thread, while not blocked, still cannot make progress because it keeps retrying an operation that will always fail. It often occurs in transactional messaging applications, where the messaging infrastructure rolls back a transaction if a message cannot be processed successfully and puts it back at the head of the queue.

A livelock is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another, none progressing. Livelock is a special case of resource starvation; the general definition only states that a specific process is not progressing.


Last modified date 13/02/2019
