Hands-On Multithreading with C++ 01—Overview

From toy examples to a real-world instance

Yu-Cheng (Morton) Kuo

Published in

Nerd For Tech

9 min read · Dec 23, 2023

Complete code: https://github.com/yu-cheng-kuo-28/multithreading-cpp

Hands-On Multithreading with C++ 02—Synchronization

Mutex, Atomic Operation, & Condition Variable

medium.com

You may refer to the technical article below for theoretical knowledge:

OS Walkthrough 01— Multi-threading & CPU Scheduling

PKU’s OS on Coursera: Week 01 ~ Week 04

yc-kuo.medium.com

Multithreading is one of the most significant concepts in operating systems.
We’ll use multithreading to update and display the time at the end.
All outputs shown were produced on my Ubuntu (Linux) VPS.

*Figure: Multiple threads in memory. Retrieved from* GeeksforGeeks

Let’s get started with an example that illustrates what multithreading does.

Q: What result will be generated after executing the following code?

// 3_multithreaded_02.cpp
#include <iostream>
#include <thread>
#include <chrono>

void function1() {
  for (int i = 0; i < 200; ++i) {
    std::cout << "A";
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

void function2() {
  for (int i = 0; i < 200; ++i) {
    std::cout << "B";
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

int main() {
  std::thread worker1(function1);
  std::thread worker2(function2);
  worker1.join();
  worker2.join();
  std::cout << "\n";
}

You can pause here to think about it for a while before moving on.

…

Ready?

root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./3_multithreaded_02 
BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABA
root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./3_multithreaded_02 
BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABAABBABAABABBABAABABBABAABBAABBAAB

Let’s focus on the last 11 characters.

The last 11 characters of the 1st output: ABABABABABA
The last 11 characters of the 2nd output: AABBAABBAAB

Only 5 of the 11 positions match (positions 1, 4, 5, 8, and 9). This is how multithreading works: we can’t know in advance the exact order in which the CPU runs the threads’ work.

Outline
(1) A Single-Threaded Program
(2) Multithreaded Programs
(3) A Real-World Example: Fetching time
(4) Threads & Mutex
(5) References

(1) A Single-Threaded Program

// 1_singleThreaded.cpp
#include <iostream>

void function1() 
{
    int i;
    for (i = 0; i < 200; ++i)
        std::cout << "A";
}

void function2() 
{
    int i;
    for (i = 0; i < 200; ++i)
        std::cout << "B";
}

int main() 
{
    function1();
    function2();
    std::cout << "\n";
}

Figure: Output of the single threaded program

root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./1_singleThreaded 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

The output is no surprise at all. Let’s move on to the multithreaded cases.

(2) Multithreaded Programs

2–1 The 1st Version

// 2_multithreaded_01.cpp
#include <iostream>
#include <thread>

void function1() {
    for (int i = 0; i < 200; ++i)
        std::cout << "A";
}

void function2() {
    for (int i = 0; i < 200; ++i)
        std::cout << "B";
}

int main() {
    std::thread worker1(function1);
    std::thread worker2(function2);
    worker1.join();
    worker2.join();
    std::cout << "\n";
}

Note that join() blocks the calling thread (here, main) until the corresponding worker thread finishes, so the program does not exit while the workers are still running.

The output:

root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./2_multithreaded_01 
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Here, the As and Bs are not interleaved, contrary to what I expected after reading the article BASICS OF MULTITHREADING IN C++. So I asked GPT-4, and here’s the speculation it provided:

[1] Thread Scheduling and Execution Speed:

(GPT-4) In your case, printing 200 characters is a very quick operation for modern CPUs, so it’s entirely possible for one thread to finish printing all its characters before the other thread starts or makes significant progress.

[2] Buffering of std::cout:

(GPT-4) The std::cout stream is buffered. This means that the output might not appear immediately on the screen but is collected in a buffer and then written out in larger chunks. This buffering can cause all of the output from one thread to be grouped together if that thread manages to fill up the buffer and trigger a flush to the console before the other thread starts writing.

[3] Lack of Explicit Synchronization:

(GPT-4) Your code does not have any mechanism (like mutexes or other synchronization primitives) to explicitly control the order of execution or interleaving of the threads. Without such synchronization, the threads are effectively running independently, and their relative progress depends on the whims of the OS scheduler.

(GPT-4) If you want to see a more mixed output, you could introduce some sort of delay or synchronization mechanism inside the loops of function1 and function2. However, even then, due to the nature of concurrent programming, you can't guarantee a perfect interleaving of outputs unless you carefully control the execution order, which typically defeats the purpose of using threads for parallel execution.

(GPT-4) In practice, the unpredictability of thread scheduling is why concurrent programming can be challenging, especially when threads need to interact or share resources.

2–2 The 2nd Version: Introducing delays

// 3_multithreaded_02.cpp
#include <iostream>
#include <thread>
#include <chrono>

void function1() {
    for (int i = 0; i < 200; ++i) {
        std::cout << "A";
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

void function2() {
    for (int i = 0; i < 200; ++i) {
        std::cout << "B";
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

int main() {
    std::thread worker1(function1);
    std::thread worker2(function2);
    worker1.join();
    worker2.join();
    std::cout << "\n";
}

The output:

root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./3_multithreaded_02 
BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABA
root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./3_multithreaded_02 
BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABAABBABAABABBABAABABBABAABBAABBAAB

This is exactly what we expected: the As and Bs are now interleaved, and the exact order differs between the two runs.

2–3 The 3rd Version: Passing parameters to functions that are run by threads

// 4_multithreaded_03.cpp
#include <iostream>
#include <thread>
#include <chrono>

void function1(char c) {
    for (int i = 0; i < 200; ++i) {
        std::cout << c;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

void function2() {
    for (int i = 0; i < 200; ++i) {
        std::cout << "B";
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

int main() {
    std::thread worker1(function1, 'o');
    std::thread worker2(function2);
    worker1.join();
    worker2.join();
    std::cout << std::endl;
}

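The output (this capture shows '-', 'B', and 'A', so it appears to come from a slightly different run than the listing above, which passes 'o' and starts only two threads):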

root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./4_multithreaded_03 
-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BA-BAB-A-BAB-A-BA-BAB-A-BAB-A-BAB-A-BA

(3) A Real-World Example: Fetching time

Now, we’ll use multithreading to update and display the time.

1. We fetch the current time in Japan from “https://worldtimeapi.org/api/timezone/Japan”.
2. We use std::mutex (from the <mutex> header) to synchronize access to sharedTimeData.
3. One thread updates sharedTimeData every 6 seconds, and another thread displays it every 3 seconds.

// 5_fetchTimeMultithreaded_01.cpp
#include <iostream>
#include <string>
#include <thread>
#include <chrono>
#include <curl/curl.h>
#include <mutex>
#include <nlohmann/json.hpp> // Make sure this library is included and set up

std::mutex dataMutex;          // Mutex for synchronizing access to sharedTimeData
std::string sharedTimeData;    // Global variable to store the shared time data

// This is a callback function used by libcurl for storing fetched data
size_t WriteCallback(char *contents, size_t size, size_t nmemb, void *userp) {
    static_cast<std::string*>(userp)->append(contents, size * nmemb);
    return size * nmemb;
}

// Function to fetch the current time from an API
std::string fetchCurrentTime() {
    std::string readBuffer;

    CURL *curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "https://worldtimeapi.org/api/timezone/Japan");
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
        CURLcode res = curl_easy_perform(curl);
        curl_easy_cleanup(curl);

        // Skip parsing if the request itself failed
        if (res != CURLE_OK) {
            std::cerr << "curl error: " << curl_easy_strerror(res) << '\n';
            return "";
        }

        // Parse the response to extract the time
        try {
            auto jsonResponse = nlohmann::json::parse(readBuffer);
            return jsonResponse["datetime"].get<std::string>();
        } catch(const std::exception& e) {
            std::cerr << "JSON error: " << e.what() << '\n';
            return "";
        }
    }
    return "";
}

void updateTimeData() {
    while (true) {
        std::string currentTime = fetchCurrentTime();

        // Critical section 01
        {
            std::lock_guard<std::mutex> guard(dataMutex);
            sharedTimeData = currentTime;
        }

        std::this_thread::sleep_for(std::chrono::seconds(6));
    }
}

// Simulated function to display time data
void displayTimeData() {
    while (true) {
        std::string currentTime;
        
        // Critical section 02
        {
            std::lock_guard<std::mutex> guard(dataMutex);
            currentTime = sharedTimeData;
        }

        if (!currentTime.empty()) {
            std::cout << "Current time is: " << currentTime << std::endl;
        }
        std::this_thread::sleep_for(std::chrono::seconds(3));
    }
}

int main() {
    std::thread timeUpdater(updateTimeData);
    std::thread timeDisplayer(displayTimeData);

    timeUpdater.join();
    timeDisplayer.join();

    return 0;
}

The output:

root@DemoYuChengKuo:~/3_multithreading_cpp/code_01_overview# ./5_fetchTimeMultithreaded_01 
Current time is: 2023-12-29T10:45:23.495905+09:00
Current time is: 2023-12-29T10:45:23.495905+09:00
Current time is: 2023-12-29T10:45:29.512626+09:00
Current time is: 2023-12-29T10:45:29.512626+09:00
Current time is: 2023-12-29T10:45:35.564417+09:00
Current time is: 2023-12-29T10:45:35.564417+09:00

For further explanation of the code above, GPT-4 gave a really outstanding answer, so I will just copy and paste its reply:

(GPT-4) Let’s analyze your questions based on the provided C++ code:

### 1. Where’s the critical section?

(GPT-4) A critical section in a multithreading context is a part of the code that accesses a shared resource (like data) and can potentially lead to a race condition if not synchronized. In your code, the critical sections are:

The block of code within `updateTimeData` that assigns a new value to `sharedTimeData`:

{
    std::lock_guard<std::mutex> guard(dataMutex);
    sharedTimeData = currentTime;
}

The block of code within `displayTimeData` that reads the value of `sharedTimeData`:

{
    std::lock_guard<std::mutex> guard(dataMutex);
    currentTime = sharedTimeData;
}

In both blocks, `sharedTimeData` is being accessed, which is a shared resource between the two threads.

### 2. Explanation of “std::lock_guard<std::mutex> guard(dataMutex);”

(GPT-4) - **`std::lock_guard`:** This is a mutex wrapper that provides a convenient RAII-style mechanism for owning a mutex for the duration of a scoped block. When `lock_guard` is created, it attempts to take ownership of the mutex it is given. When the `lock_guard` object is destroyed (which occurs at the end of its scope), it releases the mutex.

- **`std::mutex`:** This is a synchronization primitive that can be used to protect shared data from being simultaneously accessed by multiple threads.

- **`guard(dataMutex)`:** Here, `guard` is an instance of `std::lock_guard` that is being constructed with `dataMutex`. When the execution enters the scope, `guard` is created, and the mutex `dataMutex` is locked. At the end of the scope, `guard` is destroyed, and the mutex is automatically released.

### 3. Explanation of the Blocks with `std::lock_guard<std::mutex>`

(GPT-4) **`updateTimeData` Function Block:**

{
    std::lock_guard<std::mutex> guard(dataMutex);
    sharedTimeData = currentTime;
}

In this block, `std::lock_guard` is used to lock `dataMutex` at the start of the block and unlock it at the end. This ensures that when `sharedTimeData` is being updated, no other thread can access `sharedTimeData` at the same time. This is critical to prevent data races when one thread is writing to `sharedTimeData`.

**`displayTimeData` Function Block:**

{
    std::lock_guard<std::mutex> guard(dataMutex);
    currentTime = sharedTimeData;
}

Similar to the previous block, this block locks the mutex when reading from `sharedTimeData`. This ensures that the data is not being modified by another thread while it is being read, which is important to maintain data integrity and prevent race conditions.

In both cases, the use of `std::lock_guard` ensures that the mutex is properly released even if an exception is thrown within the block, thereby preventing potential deadlocks.

(4) Threads & Mutex

Refer to the OS notes I took before:

OS Walkthrough 01— Multi-threading & CPU Scheduling

PKU’s OS on Coursera: Week 01 ~ Week 04

yc-kuo.medium.com

OS Walkthrough 02 — Synchronization

PKU’s OS on Coursera: Week 05 ~ Week 06

yc-kuo.medium.com

Figures: User-level thread (Linux) VS. kernel-level thread (Windows) [4][5]

Figures: Concepts of threads, context switching, & POSIX PThread [5]

Figures: Process State Diagram [6][7][8]

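Mutex (mutual exclusion): In some OSs, a mutex is implemented as a semaphore with an initial value s = 1, i.e., a binary semaphore. In others, it differs slightly: typically a mutex has an owner and must be unlocked by the thread that locked it, whereas any thread may signal a semaphore. Here, we treat a mutex as equivalent to a binary semaphore.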