Let me detach those threads for you

Introduction to the std::thread, multithreading and a bit more

Vanand Gasparyan
Jan 16, 2019

Historically, multithreading was a big pain for C++ programmers, because the language did not support it for a long time. It was not until C++11 that threads became part of the standard. And now “multithreading” is easier than ever. Or is it?

The short answer is: “Absolutely yes, but only IF you use it correctly”. C++ tried very hard to keep its multithreading interface as minimalistic and clean as possible. To prove my point, let’s see what the std::thread class looks like. In some other languages the Thread classes are pretty big and have lots of member functions, which makes them hard to master. But this has a good side effect: it forces you to devote more time to understanding all these functions, and multithreading in general. Our std::thread class, however, is so simple that it can be completely described in 4 simple sentences to someone with a basic understanding of multithreading.

The std::thread class

I’ll first describe it as promised, in 4 sentences.

  • An object of the std::thread class can be instantiated with a callable and its parameters, and it will start running right away.
  • After that, the object is in joinable state, meaning it has to be joined or detached before going out of the scope.
  • If at any point the object is joined (by calling join()), the parent thread will wait there for the child to complete its job.
  • If it’s detached (by calling detach()), it will keep running in the background and the parent thread won’t be waiting.

That’s a pretty simple and quite complete description, except it doesn’t say what you shouldn’t do, or what the consequences are if you do it anyway. So, let’s “disassemble” these sentences.

An object of the std::thread class can be instantiated with a callable and its parameters, and it will start running right away.

There are three ways to construct an std::thread object. As stated above, one of the constructors takes a callable and its parameters (if any). A callable is basically anything that can be treated as a function, that is, pointers to functions, pointers to static and non-static member functions, functor objects, lambdas.

The “pointer to a non-static member function” case differs a bit and looks tricky. The &ClassName::functionName syntax is simply how a pointer to a member function is written, and the extra first parameter is a pointer to the object on which this function is to be called (the this). So, this constructor creates a new thread which runs our callable right away, and our object is a sort of handle to this thread. It can throw an std::system_error exception if the thread cannot be started (for example, because of OS limits).
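
To make the list of callables concrete, here is a minimal sketch that starts one thread per kind of callable and joins them all. Every name in it (Worker, record, runAllCallables and so on) is made up for illustration and does not come from the original listing.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared log, protected by a mutex because five threads write to it.
std::mutex g_mutex;
std::vector<std::string> g_log;

void record(const std::string& who) {
    std::lock_guard<std::mutex> lock(g_mutex);
    g_log.push_back(who);
}

void freeFunction(int) { record("free function"); }

struct Worker {
    static void staticMember() { record("static member"); }
    void member(int) { record("non-static member"); }
    void operator()() const { record("functor"); }
};

// Starts one thread per kind of callable, joins them all,
// and returns how many of them actually ran.
std::size_t runAllCallables() {
    g_log.clear();
    Worker w;

    std::thread t1(freeFunction, 42);          // pointer to function, then its argument
    std::thread t2(&Worker::staticMember);     // pointer to static member function
    std::thread t3(&Worker::member, &w, 42);   // pointer to member, the object ("this"), the argument
    std::thread t4(w);                         // functor object (copied into the thread)
    std::thread t5([] { record("lambda"); });  // lambda

    t1.join(); t2.join(); t3.join(); t4.join(); t5.join();
    assert(!t1.joinable());                    // a joined handle is no longer joinable
    return g_log.size();
}
```

Note that w has to stay alive until t3 is joined, because the thread only holds a pointer to it.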

Another available constructor is the move constructor.
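
A minimal sketch of such a move, assuming two handles named t1 and t2 as in the description that follows; MoveResult and moveThreadHandle are hypothetical names used only to report what happened.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical record of what both handles look like after the move.
struct MoveResult {
    bool t1JoinableAfterMove;
    bool t2JoinableAfterMove;
    bool sameThreadId;
};

MoveResult moveThreadHandle() {
    std::thread t1([] { /* some work */ });
    const std::thread::id original = t1.get_id();

    std::thread t2(std::move(t1));   // move constructor: t2 takes ownership from t1
    assert(!t1.joinable());          // t1 no longer represents any thread

    MoveResult result{t1.joinable(), t2.joinable(), t2.get_id() == original};
    t2.join();                       // t2 is now responsible for the thread
    return result;
}
```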

In this example, a new thread is created and t1 is the handle to it, and while it’s running, t2 takes the ownership from t1. Now t2 has to take care of the fate of our thread.

And the last, default constructor creates an object which doesn’t create a new thread and isn’t associated with any.

Note that the copy constructor is deleted, and that makes sense: there should be one and only one object representing a specific thread. Later we’ll see that the move constructor isn’t just a fancy option but a necessity in some cases.

After that, the object is in joinable state, meaning it has to be joined or detached before going out of the scope.

Whether an std::thread object is or is not joinable determines whether you may or may not join or detach it. At any point, you can check an object’s “joinability” by calling the bool joinable() function.

Simply put, if an object is joinable, you must do something about it. An object is joinable whenever it is associated with a thread which hasn’t been joined or detached yet. At the beginning of its lifetime, this depends on the constructor. If it’s constructed with a callable and successfully starts a new thread, it’s joinable; if it’s default constructed, it’s not joinable; and if it’s move constructed (or later move assigned), it depends on the object it takes the ownership from. Note that the moved-from object, for example t2 in t1 = std::move(t2), returns to a default constructed state, so it’s not joinable. However, the object you move assign into, t1 in the same statement, must not be joinable at that point, otherwise std::terminate() will be called.
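
These state changes can be observed directly with joinable(). The following is only a sketch; the function name joinableStates is hypothetical, and t1 and t2 are used the same way as in the paragraph above.

```cpp
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// Returns the joinable() values observed at each step.
std::vector<bool> joinableStates() {
    std::vector<bool> states;

    std::thread t1;                    // default constructed: no thread behind it
    std::thread t2([] {});             // constructed with a callable
    states.push_back(t1.joinable());   // false
    states.push_back(t2.joinable());   // true

    assert(!t1.joinable());            // required: moving into a joinable t1 would terminate
    t1 = std::move(t2);                // move assignment transfers the ownership
    states.push_back(t1.joinable());   // true: t1 now owns the thread
    states.push_back(t2.joinable());   // false: t2 is back to the default state

    t1.join();
    states.push_back(t1.joinable());   // false: a joined object is not joinable
    return states;
}
```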

The critical part is when the object goes out of scope and is destructed. At this point it must not be joinable, or else its destructor will call std::terminate().

If at any point the object is joined (by calling join()), the parent thread will wait there for the child to complete its job.

If it’s detached (by calling detach()), it will keep running in the background and the parent thread won’t wait for it to finish.

There isn’t much more to say about these two. It is important to note, though, that exactly one of these functions must be called on an object, and only while it is joinable; otherwise an std::system_error is thrown. Let’s consider the following situation to understand how and when the threads should be joined or detached.
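
The exception is easy to provoke. In the sketch below, joinThrows, detachThrows, Outcome and tryJoinAndDetach are hypothetical helpers written for this illustration; only join(), detach(), joinable() and std::system_error are the real library names.

```cpp
#include <cassert>
#include <system_error>
#include <thread>

// Returns true if join() on the given object throws std::system_error.
bool joinThrows(std::thread& t) {
    try { t.join(); return false; }
    catch (const std::system_error&) { return true; }
}

// Returns true if detach() on the given object throws std::system_error.
bool detachThrows(std::thread& t) {
    try { t.detach(); return false; }
    catch (const std::system_error&) { return true; }
}

struct Outcome {
    bool firstJoinThrew;
    bool secondJoinThrew;
    bool detachAfterJoinThrew;
    bool joinOnEmptyThrew;
};

Outcome tryJoinAndDetach() {
    std::thread worker([] {});
    assert(worker.joinable());

    Outcome o{};
    o.firstJoinThrew = joinThrows(worker);          // joinable, so this join succeeds
    o.secondJoinThrew = joinThrows(worker);         // already joined: throws
    o.detachAfterJoinThrew = detachThrows(worker);  // still not joinable: throws

    std::thread empty;                              // default constructed, never joinable
    o.joinOnEmptyThrew = joinThrows(empty);         // throws as well
    return o;
}
```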

The sleep(int seconds) function simulates code that works for the given number of seconds. We create two new threads; call them thread_0 and thread_1, with t1 and t2 as their std::thread objects. It takes 5 seconds for thread_0 to complete its job and 7 for thread_1, and once both objects are constructed, both threads are running. The main thread continues its own job and has 3 functions to call, taking 3, 6 and 1 seconds respectively. As mentioned above, before the end of the scope, i.e. before t1 and t2 go out of scope and are destructed, they must be joined or detached.

If you choose to detach a thread object, it doesn’t matter where you do it. In case of joining, however, the place is important. Joining right after the construction doesn’t make sense: there would be no concurrency. If we join t1 after the first function call, for example, the main thread will get there first, after 3 seconds, and wait 2 more seconds until thread_0 is finished. Technically, you can’t know precisely how long each function will run, and you shouldn’t even care. It doesn’t matter whether the main thread gets to the joining point sooner or later than thread_0 finishes. We just force the main thread to continue its execution from that point only when thread_0 has finished.
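
Here is one possible arrangement of the joins, as a sketch rather than the original listing. To keep it fast, one “second” is scaled down to 10 milliseconds, and the names sleepUnits, Timeline and runScenario are assumptions of this example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Stand-in for sleep(int seconds): one "second" lasts 10 ms here (an assumption).
void sleepUnits(int units) {
    std::this_thread::sleep_for(std::chrono::milliseconds(10 * units));
}

struct Timeline {
    bool thread0DoneAfterJoin;
    bool thread1DoneAfterJoin;
};

Timeline runScenario() {
    std::atomic<bool> done0{false};
    std::atomic<bool> done1{false};

    std::thread t1([&] { sleepUnits(5); done0 = true; });   // thread_0
    std::thread t2([&] { sleepUnits(7); done1 = true; });   // thread_1

    sleepUnits(3);   // first function of the main thread
    t1.join();       // main gets here after about 3 units and waits for thread_0
    const bool after0 = done0;

    sleepUnits(6);   // second function
    sleepUnits(1);   // third function
    t2.join();       // thread_1 has most likely finished already, so this returns quickly
    const bool after1 = done1;

    assert(!t1.joinable() && !t2.joinable());   // both objects may now be destructed
    return {after0, after1};
}
```

Capturing done0 and done1 by reference is safe here only because both threads are joined before the function returns.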

Now everything is fine, both std::thread objects are allowed to be destructed.

What’s the problem then?

It looks like everything is rather simple: a few easy-to-remember rules and you’ll be fine. Unfortunately, that’s not the case. All the rules we’ve discussed so far are “kind” in the sense that if you (accidentally) break them, you’ll know about it right away: the program will terminate or an exception will be thrown.

Multithreading is quite dangerous, and the mistakes here are very costly. When you find a problem, it’s hard or even impossible to debug, but that’s not the worst part. The worst is when your code has a mistake but still appears to work correctly. This can happen because once you use threads, the execution flow is not deterministic anymore. How long does it take to create a new thread? How many concurrent threads can there be? How is the CPU time distributed among the threads? All these factors affect the overall execution. Writing multithreaded code is about being prepared for anything, and that makes it both dangerous and exciting.

But in this article, we’ll be talking about joining and detaching only. We’ve already answered “how to do it?”, and it was quite easy. The question we’ll be answering now is “which one?”.

Scenario 1

Before being destructed, an std::thread object must be either joined or detached. We might say joining is more intuitive because we surely know when we need it. And as joining and detaching are mutually exclusive, you might think that you detach whenever you don’t join. Here is an example.
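
What follows is a reconstruction under assumptions, not the original listing: only the names sendEmail and topLevelResults appear in the text, while Results, calculateTopLevel, calculateTheRest and calculate are hypothetical stand-ins.

```cpp
#include <cassert>
#include <thread>
#include <vector>

using Results = std::vector<int>;   // hypothetical result type

// Takes its argument by value on purpose: the thread works on its own copy.
void sendEmail(Results results) {
    // ... format the report and hand it over to a mail service ...
    (void)results;
}

Results calculateTopLevel() { return {1, 2, 3}; }   // stand-in for the first part
int calculateTheRest() { return 42; }               // stand-in for the remaining work

int calculate() {
    Results topLevelResults = calculateTopLevel();

    std::thread reporter(sendEmail, topLevelResults);   // the argument is copied into the thread
    reporter.detach();                                  // nobody will wait for the email
    assert(!reporter.joinable());                       // so the object may be destructed

    return calculateTheRest();   // runs concurrently with sendEmail
}
```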

So, we’ve done the initial part of the calculation and we have enough results to do the report. We might want to do that reporting concurrently with the rest of the calculation, so we “send the email” in a new thread.

The last thing we need to decide is whether to join or detach it. We don’t really need to wait for it to finish, so we can simply detach. This approach is quite common, and the reason is that there’s a good chance it will work fine. The problem is that the sendEmail function can’t be independent of everything; it’s still a part of the application. It must depend on some other objects or resources, which have their own lifetimes. If we were to pass topLevelResults by reference, the main thread might leave this scope and destruct topLevelResults while the new thread is still using it (that is, when sending the email takes longer than the rest of the calculation). But even if it’s passed by value, all sorts of things might happen: the main thread (and the whole program) might finish early, or sending the email might take longer than expected.

Scenario 2

Imagine the same scenario, but this time instead of sending an email we have an extremely short operation (compared to the calculation). As we’re really sure it’ll finish shortly, instead of thinking about the correct place to join, we simply detach it. The danger here is theoretically the same, but less probable in practice.

Scenario 3

If you google “why or when should I detach a thread”, you’ll probably find the term “background thread”. These kinds of threads are usually created at the beginning and keep working in the background the whole time. An example is a logger. This scenario is the most common and with a little effort can be arranged safely, but, again, it’s not protected from the theoretical danger described above.

Solution

In all 3 scenarios above we should have asked one question: if not here, do these threads have to be joined somewhere else? If you think about it, in the first scenario the thread deals with the ReportManager singleton and must not outlive that instance. In the second scenario, we could have a singleton which keeps track of all such supposedly short tasks, just in case. In the third scenario, the background thread deals with some part of the code and therefore depends on it. Luckily, std::thread objects are movable, so we technically can join those threads anywhere else. It just may easily make the code uglier. The RAII idiom comes in handy here.
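
A minimal sketch of such an RAII helper, assuming it stores its threads in a std::vector. The names Detacher and createDetachedTask come from the text; the body, and the runTasks usage helper, are assumptions. This version is not safe to call from several threads at once.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

class Detacher {
public:
    Detacher() = default;
    Detacher(const Detacher&) = delete;              // non-copyable, like std::thread
    Detacher& operator=(const Detacher&) = delete;

    // Joins every thread it has started, so none of them outlives this object.
    ~Detacher() {
        for (std::thread& t : threads_) {
            assert(t.joinable());   // each stored thread was started and never joined
            t.join();
        }
    }

    // Takes the same arguments as the main constructor of std::thread.
    template <class Function, class... Args>
    void createDetachedTask(Function&& f, Args&&... args) {
        threads_.emplace_back(std::forward<Function>(f), std::forward<Args>(args)...);
    }

private:
    std::vector<std::thread> threads_;
};

// Hypothetical usage: start n small tasks and let the destructor wait for them.
int runTasks(int n) {
    std::atomic<int> counter{0};
    {
        Detacher detacher;
        for (int i = 0; i < n; ++i) {
            detacher.createDetachedTask([&counter] { ++counter; });
        }
    }   // ~Detacher joins all tasks here, so counter is still alive while they run
    return counter;
}
```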

The idea is very simple. Instead of creating an std::thread object and detaching it, you let a helper class (Detacher) do the job. Its most important function, createDetachedTask, has the same signature as the main constructor of std::thread. It creates a thread and stores it in a container, and the class joins all the stored threads when it is destructed. This can be helpful in all 3 scenarios discussed above.

In the first scenario, we don’t want our thread to outlive the ReportManager singleton object, so we can simply add a member of this type to ReportManager and add a new function called asyncSendEmail, which runs the old function in a new thread.
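
A sketch of that arrangement follows. To keep it short and checkable, ReportManager is an ordinary class here rather than a singleton, the Detacher is a compact hypothetical version, and g_emailsSent and sendReports exist only to show that every email thread has finished by the time the manager is gone.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Compact hypothetical Detacher: joins all of its threads on destruction.
class Detacher {
public:
    Detacher() = default;
    Detacher(const Detacher&) = delete;
    Detacher& operator=(const Detacher&) = delete;
    ~Detacher() {
        for (std::thread& t : threads_) t.join();
    }
    template <class Function, class... Args>
    void createDetachedTask(Function&& f, Args&&... args) {
        threads_.emplace_back(std::forward<Function>(f), std::forward<Args>(args)...);
    }
private:
    std::vector<std::thread> threads_;
};

std::atomic<int> g_emailsSent{0};   // illustration only: counts "sent" emails

class ReportManager {
public:
    // The old, synchronous function.
    void sendEmail(const std::string& report) {
        (void)report;               // ... talk to the mail service ...
        ++g_emailsSent;
    }

    // Runs sendEmail in a new thread that cannot outlive this object.
    void asyncSendEmail(std::string report) {
        tasks_.createDetachedTask(&ReportManager::sendEmail, this, std::move(report));
    }

private:
    // Declared last so that it is destroyed first: every thread is joined
    // before any other member of ReportManager goes away.
    Detacher tasks_;
};

int sendReports(int count) {
    g_emailsSent = 0;
    {
        ReportManager manager;
        for (int i = 0; i < count; ++i) manager.asyncSendEmail("report");
    }   // ~ReportManager joins every pending email thread
    assert(g_emailsSent == count);
    return g_emailsSent;
}
```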

Note that Detacher is non-copyable, which is a restriction we adopted from std::thread. When dealing with singletons, though, we can simply derive them from Detacher, with no need to add any function.

And it’s the same for the other scenarios; the key is to find the correct place.

Summary

  • Multithreading is much more than the classes and the functionality the language provides. It’s a whole other world with its quirks and challenges. Do take time to learn the theory.
  • Detaching is not a no-no. It’s legal, a lot of high-quality production code has it, it just might be dangerous. So, think carefully when choosing between join and detach.
