When I started studying C++ multithreading I felt confused and lost. The complexity of my programs was blooming (yes, like a beautiful flower), the non-deterministic behavior of concurrency was killing me, and everything was foggy. So I get you, and I’ve got you: here is a simple guide to learning C++ concurrency and multithreading without too many headaches (you can find the roadmap at the end of the article).
Today, let’s quickly refresh some basic concepts and then taste a bit of concurrent code.
1. What is a thread?
Upon creation, each process has a unique thread of execution, called the main thread. It can ask the Operating System to create other threads, which share the address space of the parent process (code section, data section, and other operating-system resources, such as open files and signals). On the other hand, each thread has its own thread ID, stack, register set, and program counter. Basically, a thread is a lightweight process: switching between threads is faster, and communication between threads is easier than IPC between processes.
2. What is concurrency?
The scheduler allocates, over time, the available cores to the different threads in a non-deterministic way. This is called hardware concurrency: multiple threads running in parallel on different cores, each of them taking care of a specific task of the program.
→ N.B. The std::thread::hardware_concurrency() function reports how many tasks the hardware can truly run concurrently. If the number of threads exceeds this limit, we will likely incur excessive task switching (switching between tasks many times per second to give an illusion of concurrency).
3. Basic thread operations with std::thread
- Header | <thread>
- Launching a thread |
std::thread t(callable_object, arg1, arg2, ...)
This creates a new thread of execution associated with t, which calls callable_object(arg1, arg2). The callable object (e.g. a function pointer, a lambda expression, or an instance of a class with a function call operator) is immediately invoked by the new thread, with the (optionally) passed arguments. By default the arguments are copied; if you want to pass by reference you have to wrap the argument in std::ref(arg). Also, remember that if you want to pass a unique_ptr you must move it (std::move(my_pointer)), since it cannot be copied.
- Thread life-cycle |
If the main thread exits, any secondary threads still running are abruptly terminated, without any possibility of recovery. To prevent this from happening, the parent thread has two options for each child:
→ Block and wait for the child’s termination, by invoking the join method on the child (t.join()).
→ Explicitly declare that the child can continue its execution even after the parent’s exit, using the detach method (t.detach()).
- To remember: a thread object is movable but not copyable.
Here you can find a small code example covering almost all of the above theory.
4. Why is synchronization needed?
With multiple threads sharing the same address space and resources, a lot of operations become critical: multithreading requires synchronization primitives. Here is why.
- Memory is a haunted house
Memory cannot be considered a “static warehouse” anymore: now it is haunted. Imagine: a thread is comfortably watching Netflix on a smart TV, it blinks and the TV is gone. Panicking, it dials 911 on the phone and… “Thank you for calling Pizza Hut”. What is going on? Simply, the house is full of ghosts (where the ghosts are other threads): they are all in the same room, interacting with the same objects (this is called a data race), but they are ghosts to one another.
In order to act safely, a thread should declare what it is using and check whether an object is in use before touching it. Is thread Green watching TV? Well, then no one else can touch the TV in any way (if anything, others can sit and watch it too). This can be done with a mutex.
- Atomic operations wanted!
Most operations are not atomic. If an operation is not atomic, it is not indivisible, so it is possible to observe it half-done. For example: writing 64 bits, 32 bits at a time. During this operation, another thread could observe 32 old bits and 32 new ones, obtaining a completely misleading result. For this reason, the effects of such operations must appear atomic, even if they are not.
→ N.B. Even just an increment is not an atomic operation!
int tmp = a; a = tmp + 1;
The simplest way to solve this is to use std::atomic. This class template enables atomic operations on many types.
- Cache coherence and out-of-order execution
Each core tries to save some effort by storing recently used values in a local cache. With multiple threads running on different cores, the values stored in a cache may no longer be valid and eventually have to be refreshed. At the same time, modifications are not visible to other cores until the cache is flushed. Mechanisms to propagate modifications and ensure correct memory visibility are needed.
Also, to increase efficiency, the CPU and/or the compiler can reorder the execution of instructions. In a concurrent program this can cause unpredictable behavior, so it is necessary to guarantee that sensitive instructions execute in the original order.
This job is done by synchronization primitives that imply memory barriers (a line in the code that certain operations can’t cross), ensuring consistency between cores and preventing harmful out-of-order execution (an instruction on one side of the barrier cannot be moved to the other side).
Let’s see some code, so you can test for yourself the non-deterministic behavior of multithreading.
A possible output:
Differently from a single-threaded implementation, each execution produces a different, unpredictable output (the only certainty is that the A lines are sorted in ascending order, and so are the B lines). This can cause problems when instruction ordering is important.
[ 12 ] value 0
[ 13 ] value 1
[ 14 ] value 0
[ 15 ] value 0
[ 16 ] value 0
[ 17 ] value 0
[ 18 ] value 1
[ 19 ] value 0
What happened here? After thread A evaluated “value” as true, thread B changed it. Now we are inside the if-block, even though its condition has been violated.
If two threads access the same data, one writing and one reading, there is no guarantee on which operation is executed first.
Accesses must be synchronized.
I know these are a lot of concepts. Keep in mind that you don’t need to understand everything right now, but it’s important to grasp the core ideas.
I suggest you play with the examples and see how concurrency takes action. Also, try to think of other situations where synchronization is needed and test them out (hint: what about multiple threads popping the front of a queue? Remember you must check that the queue is not empty before popping).
I have structured the series as follows:
- Theory + Basic examples
→ Low-level approaches
1. Mutex
2. Condition variable
→ High-level approaches
3. Future and async
5. Packaged task
- Practice + Tutorial & Challenges
(The C++11 library introduces a standard mechanism for synchronization, independent of the underlying platform, so I will not talk about native Linux and Windows threads. Anyway, the core concepts are very similar.)
I will publish one article per week and keep this roadmap updated.
In the next article we will see the mutex synchronization primitive and how to use it at its best.
If you want me to go deeper into some topic, let me know (you can find me on Instagram too: @valentina.codes).