Deadlocks

Aka Mexican Standoffs

oceanO · 9 min read · Sep 18, 2023

The multi-threaded world of programming is much like the Wild West. The stakes are high; sometimes, threads (our bandits) find themselves in a classic standoff.


Imagine two bandits in a Wild West duel, each with a gun at his side. As tension rises, both bandits reach for the guns. To win the duel and avoid a deadlock-standoff, one bandit must grab not only his own gun but also the other bandit’s gun (in our code model).

The Code

#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <time.h>
#include <stdlib.h>
#include <unistd.h>

// Text attributes
#define RESET "\033[0m"
#define BOLD "\033[1m"

// Foreground colors
#define F_RED "\033[31m"
#define F_GREEN "\033[32m"
#define F_CYAN "\033[36m"


//GUNS
pthread_mutex_t ugly_gun = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t bad_gun = PTHREAD_MUTEX_INITIALIZER;


typedef struct s_cowboy
{
    char            *name;
    unsigned long   reaction_time;
    pthread_t       thread;
}   t_cowboy;


void *action(void *data)
{
    t_cowboy cowboy;

    cowboy = *(t_cowboy *)data;
    // How 🔥 the cowboy is
    printf("⏰%s reaction_time "BOLD F_CYAN"%lu⏰\n"RESET,
        cowboy.name,
        cowboy.reaction_time);
    usleep(cowboy.reaction_time);

    if (!strcmp(cowboy.name, "ugly"))
        pthread_mutex_lock(&ugly_gun);
    else
        pthread_mutex_lock(&bad_gun);
    printf("%s has taken his own gun\n", cowboy.name);

    // I wanna take the other gun
    // DEADLOCK
    if (!strcmp(cowboy.name, "ugly"))
        pthread_mutex_lock(&bad_gun);
    else
        pthread_mutex_lock(&ugly_gun);

    // The killer will reach this position
    printf(F_RED"\t%s killed the other ☠️☠️☠️☠️☠️\n\n"RESET, cowboy.name);
    exit(EXIT_SUCCESS);

    return NULL;
}


int main()
{
    srand(time(NULL) * getpid());
    t_cowboy ugly = {"ugly", rand() % 10000};
    t_cowboy bad = {"bad", rand() % 10000};

    pthread_create(&ugly.thread, NULL, action, &ugly);
    pthread_create(&bad.thread, NULL, action, &bad);

    pthread_join(ugly.thread, NULL);
    pthread_join(bad.thread, NULL);
}

Two threads, symbolizing our bandits, compete for two mutex locks, representing the guns. Each bandit:

  1. Waits for his reaction time.
  2. Grabs his own gun.
  3. Tries to grab the other bandit’s gun. If that gun is already taken, we get a Mexican deadlock-standoff.

The Deadlock

In certain scenarios, both bandits lock their own guns simultaneously, leading to a circular-wait impasse. This is a deadlock in programming terms: both threads are stuck, each waiting for a resource the other holds (the other’s gun).

There is no deadlock in my code if one bandit manages to take both guns: in that case exit is called and the process terminates. Beware: if I didn’t call exit, the program would never end, because the other thread would still be stuck trying to take its own gun, which the winning bandit never unlocks. So the exit and the need to take the other bandit’s gun are just a hacky way of modelling the standoff.

Try commenting out the exit(EXIT_SUCCESS) call, and you’ll see.
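Another way to resolve the standoff, instead of calling exit, is to let a bandit drop his own gun whenever he can’t grab the other one, so he never holds one gun while waiting for the other. The function below is only a sketch of that idea, not part of the original program: it reuses the ugly_gun/bad_gun mutexes and the t_cowboy struct from the code above and uses pthread_mutex_trylock, which returns 0 only if it acquired the lock without blocking.

void *action_trylock(void *data)
{
    t_cowboy cowboy = *(t_cowboy *)data;
    pthread_mutex_t *mine  = !strcmp(cowboy.name, "ugly") ? &ugly_gun : &bad_gun;
    pthread_mutex_t *other = !strcmp(cowboy.name, "ugly") ? &bad_gun : &ugly_gun;

    usleep(cowboy.reaction_time);
    while (1)
    {
        pthread_mutex_lock(mine);               // grab my own gun
        if (pthread_mutex_trylock(other) == 0)  // try the other gun without blocking
            break;                              // got both guns
        pthread_mutex_unlock(mine);             // back off: drop my own gun
        usleep(100);                            // small pause before retrying
    }
    printf("%s holds both guns\n", cowboy.name);
    pthread_mutex_unlock(other);
    pthread_mutex_unlock(mine);
    return NULL;
}

Because at least one bandit always lets go and retries, the two threads can never end up permanently waiting on each other.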

Let’s get more technical: Coffman conditions

Established by Coffman, these are the four conditions that must all hold for a deadlock to occur:

  • Mutual Exclusion: This is shown by the pthread_mutex_lock calls, which ensure that only one thread can hold a given mutex (ugly_gun or bad_gun) at any time. Once a bandit (thread) acquires a gun (mutex), no other bandit can acquire the same gun until the holding bandit releases it.
  • Hold and Wait (Resource Holding): After acquiring his own gun with pthread_mutex_lock, each bandit tries to acquire the other bandit’s gun. At that point he is holding one resource (his own gun) while waiting for another (the other bandit’s gun, which may already be taken depending on the reaction times). This models the “hold and wait” condition. And the tension rises.
  • No Preemption: Once a bandit has grabbed a gun (mutex), it cannot be taken away from him by force; it can only be released (unlocked) voluntarily. Nothing in the code forcibly takes a locked mutex away from a thread. Nobody will drop the gun voluntarily.
  • Circular Wait: If Bandit 1 grabs ugly_gun and, at the same time, Bandit 2 grabs bad_gun, then Bandit 1 waits for bad_gun while Bandit 2 waits for ugly_gun. Each bandit is waiting for a resource held by the other, closing the circle (a sketch of how a fixed lock order breaks this circle follows the list).
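Breaking any one of these four conditions prevents the deadlock. The classic fix is to break circular wait with a global lock order: every bandit grabs the guns in the same fixed order, whoever he is. Below is a minimal sketch of that idea; it reuses the mutexes and struct from the code above and is not the original action function.

void *action_ordered(void *data)
{
    t_cowboy cowboy = *(t_cowboy *)data;

    usleep(cowboy.reaction_time);
    // Global order: ugly_gun always first, bad_gun always second.
    pthread_mutex_lock(&ugly_gun);
    pthread_mutex_lock(&bad_gun);
    printf("%s holds both guns\n", cowboy.name);
    // Release in reverse order so the other bandit gets his turn.
    pthread_mutex_unlock(&bad_gun);
    pthread_mutex_unlock(&ugly_gun);
    return NULL;
}

With this ordering, the bandit who loses the race for ugly_gun simply waits; he never holds bad_gun while waiting, so no circle can form, and both threads finish without the exit trick.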

The usleep function is not reliable

The usleep function, as used in the code, is meant to pause the execution of the program for a specified number of microseconds. However, it's crucial to note that usleep does not guarantee a precise sleep interval. It merely ensures that the sleep will be at least the duration specified.

Reasons why usleep might not be precise:

  1. OS Scheduling: When a thread’s sleep duration ends, it is not guaranteed to run immediately; the thread merely changes from the sleeping to the ready state. The OS scheduler, which manages many tasks, decides when the thread actually runs next.
  2. Interruptions: The sleeping thread can be interrupted by signals, causing it to wake up before the intended duration.
  3. Resolution Limitation: The actual sleep duration might be rounded up to the granularity level of system clock ticks.

TL;DR → the function sleeps for at least the specified time, but there is no guarantee it will sleep for exactly that duration.
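As a side note on reason 2, the signal-interruption problem has a standard workaround: nanosleep reports the remaining time when it is interrupted, so you can simply go back to sleep for whatever is left. A small sketch (the sleep_us helper name is mine, not from the code below):

#include <time.h>
#include <errno.h>

// Sleep for the requested number of microseconds, resuming after signal
// interruptions. On EINTR, nanosleep fills `rem` with the unslept time.
static void sleep_us(long microseconds)
{
    struct timespec req;
    struct timespec rem;

    req.tv_sec = microseconds / 1000000;
    req.tv_nsec = (microseconds % 1000000) * 1000;
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem; // sleep again for the remaining time
}

The code below then measures how far a plain usleep can drift from the requested duration.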

#include <stdio.h>     // Required for printf
#include <sys/time.h> // Required for gettimeofday (time tracking)
#include <unistd.h> // Required for usleep

// The timeval structure provides a representation for storing time:
// tv_sec stores the number of seconds while tv_usec stores the number of microseconds.
// struct timeval {
// time_t tv_sec; // seconds
// suseconds_t tv_usec; // microseconds
// };

void usleep_glitch()
{
    // Declare two timeval structures to store the start and end timestamps.
    struct timeval start_time;
    struct timeval end_time;

    // Store the desired sleep time, in this case, 200 milliseconds or 200,000 microseconds.
    long requested_sleep_time = 200 * 1000;

    // Capture the current time and store it in start_time. This marks the beginning of our usleep.
    gettimeofday(&start_time, NULL);

    // usleep is used to pause the execution of the program.
    // Here, the intention is to pause for 200,000 microseconds or 200 milliseconds.
    usleep(requested_sleep_time);

    // Once usleep completes, immediately capture the current time and store it in end_time.
    gettimeofday(&end_time, NULL);

    // Calculate the actual time the program was asleep.
    // This is done by finding the difference between the start and end times.
    long actual_sleep_time = (end_time.tv_sec - start_time.tv_sec) * 1000000 + (end_time.tv_usec - start_time.tv_usec);

    // Print the desired sleep time and the actual sleep time.
    printf("Requested Sleep Time: \t\t%ld microseconds\n", requested_sleep_time);
    printf("Actual Sleep Time: \t\t%ld microseconds\n\n", actual_sleep_time);
}

int main()
{
    // Call the example function to demonstrate the potential inconsistency in usleep.
    usleep_glitch();

    return (0);
}

The actual sleep duration, in many cases, does not align perfectly with the requested 200 ms. The variability can range from a few microseconds to a few milliseconds.

NB → Overhead of gettimeofday: there is a tiny overhead associated with the call to gettimeofday() itself. While this overhead is minimal, it’s something to be aware of in systems that require high precision.

Run the code in a loop, e.g. “while true; do ./a.out; done” in bash, and you will see a different result every time.

Instructions-per-microsecond ratio?

Grace Hopper, a brilliant, brilliant woman. Fun fact: she found the first ever 🐞

The number of instructions a computer can execute in a microsecond depends on the clock speed of the processor (often measured in Hertz, which represents cycles per second) and how many instructions can be executed per clock cycle.

  1. Clock Speed: A 2 GHz (GigaHertz) processor can theoretically execute 2 billion cycles per second. So, in a microsecond (which is one millionth of a second), there would be 2,000 cycles.
  2. Instructions Per Cycle (IPC): Modern processors can execute multiple instructions per clock cycle, thanks to advancements like superscalar architecture, out-of-order execution, and pipelining. Conversely, some instructions might take multiple cycles to complete, reducing the effective IPC. The average IPC can vary significantly based on the type of workload, the specific architecture of the CPU, and other factors.
  3. Parallelism: Modern CPUs have multiple cores, and many have features like Hyper-Threading (in Intel CPUs) or Simultaneous Multi-Threading (in AMD CPUs) that allow each core to execute multiple threads simultaneously. This can further increase the number of instructions executed in parallel.
  4. Instruction Set: Different instructions might have different execution times. For example, a simple addition might take fewer cycles than a division or a floating-point operation.

Given the above complexities, a direct answer like “X instructions in a microsecond” is overly simplistic. The true number can vary based on the specific program being executed, the architecture and design of the CPU, and other system factors.

Let’s try to write down some numbers for this macOS laptop:

  • Clock Speed: The laptop has a base frequency of 3.1 GHz (3.1 billion cycles per second). Thus, in a microsecond (0.000001 seconds), the CPU experiences 3,100 cycles.
  • Instructions Per Cycle (IPC): The average IPC can vary. The typical IPC for Intel processors varies, often between 0.5 and 2 for many workloads. This means that, on average, each cycle might handle between 0.5 to 2 instructions.

Given the range for IPC:

  • On the low end (0.5 IPC), you’re looking at 3,100 × 0.5 = ~1,550 instructions per microsecond.
  • On the high end (2 IPC), it would be 3,100 × 2 = ~6,200 instructions per microsecond.

However, these are average figures and, in practice, the number of instructions executed in a microsecond can vary a lot based on the specific workload, how well it can be parallelized by the CPU’s pipelines, cache hits/misses, branch predictions, and other architectural and microarchitectural factors.
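If you want a rough feel for this on your own machine, you can time a simple busy loop. This is only a crude sketch, not a real benchmark: it counts loop iterations (each iteration is several instructions), the optimization level and CPU frequency scaling heavily affect the result, and it assumes clock_gettime with CLOCK_MONOTONIC is available (it is on Linux and on macOS 10.12+).

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec start;
    struct timespec end;
    volatile unsigned long counter = 0; // volatile so the loop isn't optimized away

    clock_gettime(CLOCK_MONOTONIC, &start);
    while (counter < 100000000UL) // 100 million iterations
        counter++;
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_us = (end.tv_sec - start.tv_sec) * 1e6
        + (end.tv_nsec - start.tv_nsec) / 1e3;
    printf("Elapsed: %.0f microseconds\n", elapsed_us);
    printf("Rough iterations per microsecond: %.1f\n", counter / elapsed_us);
    return 0;
}

Expect the number to change from run to run, for exactly the reasons discussed above.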

Should I lock the printf?

Both threads write to stdout, which can be considered a “shared resource”.

“Thread-Safe”

Before diving into printf, let's clarify what we mean by "thread-safe". A function (or code) is considered thread-safe if it can be invoked simultaneously from multiple threads without causing unintended interactions or data races. Essentially, if a piece of code is thread-safe, it promises stable behavior even when accessed concurrently.
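To make the idea concrete, here is the textbook example of code that is not thread-safe. It has nothing to do with printf itself; it’s just a hypothetical counter that two threads increment without a lock.

#include <pthread.h>
#include <stdio.h>

// NOT thread-safe: counter++ is a read-modify-write, so when the two
// threads interleave, some increments are silently lost.
long counter = 0;

void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000000; i++)
        counter++; // data race
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("Expected 2000000, got %ld\n", counter); // often less
    return 0;
}

Protecting counter++ with a mutex (or using an atomic type) would make this thread-safe; printf, as we’ll see, already carries its own protection.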

Atomic Writes:

When multiple threads call printf simultaneously, the C library locks the stream internally for the duration of each call (POSIX requires the stdio functions to behave this way). This means that each individual call to printf won't interleave its characters with those from other calls. So, "Hello" from Thread A and "World" from Thread B won't ever produce "HeWorllldo" – that's a relief!

Try this code with 3000 threads calling printf.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 3000
#define MESSAGE "Hello from Thread %ld\n"

void *print_message(void *thread_id)
{
    long tid = (long)thread_id;

    printf(MESSAGE, tid);
    return NULL;
}

int main()
{
    pthread_t threads[NUM_THREADS];

    for (long i = 0; i < NUM_THREADS; i++)
    {
        int ret = pthread_create(&threads[i], NULL, print_message, (void *)i);
        if (ret)
        {
            fprintf(stderr, "Error creating thread %ld\n", i);
            exit(1);
        }
    }

    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}

Real glitches:

While printf exhibits thread-safe behavior at a basic level, especially with modern libraries, there are higher-level concerns:

  1. Output Ordering: If Thread A calls printf before Thread B, there's no guarantee A's message will appear before B's. This unpredictability arises from the OS's thread scheduling and can lead to out-of-order logs or messages.
  2. Consistent Multi-Line Outputs: Imagine a scenario where you want to print multiple lines atomically as a single log entry. Even if individual printf calls are thread-safe, the entire multi-line log entry isn't guaranteed to be atomic without additional synchronization (see the sketch after this list).
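One way to keep a multi-line entry together, without inventing your own mutex, is the POSIX pair flockfile/funlockfile: they take the same internal lock that the stdio functions already use on the stream. The helper name and messages below are just an illustration, not part of the code above.

#include <stdio.h>

// Hold stdout's internal stdio lock around a group of printf calls,
// so no other thread can slip a line in between them.
void log_entry(const char *who, unsigned long reaction_time)
{
    flockfile(stdout);
    printf("[%s] reaction time: %lu us\n", who, reaction_time);
    printf("[%s] status: reaching for the other gun\n", who);
    funlockfile(stdout);
}

Every thread that logs this way goes through the same stream lock; a thread calling printf directly is still serialized per call, but its line can land between the two lines of the entry.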

TLDR

  • Basic Safety: For many typical use-cases, especially with short messages and modern C libraries, printf can be considered thread-safe in the sense that it won't produce garbled outputs or crash your program.
  • Consider Higher-Level Requirements: If you have specific needs like maintaining output order, ensuring atomic multi-line logs, or optimizing performance, additional synchronization or logging strategies might be necessary. Keep in mind the cost: continually locking and unlocking to ensure thread safety introduces performance overhead, especially when printf is called frequently.
  • Always Check Your Library: Thread safety guarantees can vary based on the C library and its version. It’s essential to consult the library’s documentation when building multi-threaded applications.
  • Testing: While theoretical knowledge is crucial, always test your application under conditions that mirror its production environment. Multi-threading issues can be elusive, and real-world testing can provide insights that theoretical analyses might miss.

While printf is a powerful tool that behaves well in many multi-threaded scenarios, it's essential to understand its behavior and limitations fully. Tailor your synchronization strategies based on both the function's inherent properties and your application's specific requirements.
