Parallel Programming in Python — Lesson 2. The Thread

Avner Ben
CodeX

--

This is the second in a series of lessons, covering the various facilities that the Python programming language offers for parallel programming and the motivation for using each of them. In the previous lesson, we explored the applicative need for parallel design, and used a Python thread to demonstrate it. In this lesson, we proceed to study the Python thread in detail.

Sections in this lesson:

  1. Thread: the solution domain
  2. A “File Watcher” thread function
  3. Encapsulating the File Watcher
  4. Elaborating on the Python Thread
  5. Exercise: “Tail utility”

1. Thread: the solution domain

The original model of computation involved a processor, fed with input (instructions and data) in discrete pieces, handling each piece and changing its internal state accordingly. The stored-program digital computer, in contrast, keeps the program (a sequence of instructions and data) in its own memory. It still reads input in discrete sequential pieces, but they come from within, using a program counter that points to the next instruction to be performed. While the two models appear similar, there is a world of difference — and opportunities — between them. There is a vast difference between being told directly from the outside what to do and being told (still from the outside) how to tell yourself what to do. The first machine can only be as smart as its operator. The second is a smart machine with the potential to learn.

Armed with a reliable mechanism of performing instructions in the correct sequence, and assuming that instructions sit in contiguous memory (and may, therefore, be associated with a unique memory address), we can, for example, concentrate a sequence of recurring logic in a function (once called a subroutine), which, if general enough, may be called from various places in the program. When a function is called, the program counter is changed to point to its entry-point. But first, the address of the next instruction in the calling code must be recorded (to be restored when the function returns), together with other information, such as where to put the result of the function call (if any). Since the called function may call still another function (or itself, for all we know) and so forth, additional such execution frames may be stacked in a dedicated structure known as the execution stack.

These function calls are described as blocking or synchronous, because once a function calls another function, its execution is “suspended” until the other function returns (because there is only one program counter and it is temporarily pointing elsewhere).

Modern computers allow programs to manage more than one stack, from which follows the capability to run several threads of control simultaneously. The various threads of control may interleave, either by a smart time-sharing scheme (on a single processor) or by being assigned to different processors (where available), or (most likely) — both. For the functions-and-function-call paradigm, the multi-threading option opens the opportunity to emulate non-blocking or asynchronous function calls. (And the applicative need for such a feature is discussed in detail elsewhere in these lessons). The calling function — instead of calling the other function synchronously and blocking itself (giving the program counter over to it) — now has the opportunity to launch the called function on a separate thread (letting the operating system worry about the deployment), and proceed to execute in parallel. But this is just the beginning. Parallel programming creates a host of minor problems to be solved (if we insist on still emulating some — or all — of the blocking-function-call use case, and why we should do that is another matter): what happens when the asynchronous “call” ends? (We, the caller, are not there when it does.) How do we retrieve the result of the asynchronous call (if any)? How do we monitor the progress of the parallel thread of control in the meantime? How do we intervene in the operation of the parallel thread? Etc.
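The contrast can be sketched in a few lines. This is a minimal illustration (the function and names are mine, not from the lesson's example): the same function, called synchronously and then launched on a separate thread.

```python
import threading
import time

def long_job(name):
    # Stands in for any lengthy function we would rather not block on
    time.sleep(1)
    print(name, "done")

# Synchronous (blocking) call: the program counter goes to long_job
# and the caller is suspended until it returns
long_job("sync")

# Asynchronous "call": launch the function on a separate thread of
# control and proceed to execute in parallel
worker = threading.Thread(target=long_job, args=("async",))
worker.start()
print("caller keeps running while the job is in progress")
worker.join()  # one answer to "what happens when the call ends": wait for it
```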

We summarize the particular solution to these (and many other) technical challenges by saying that the two functions must agree on a protocol.

In theory, this does not seem beyond normal comprehension, but in practice, it turns out to be anything but easy to learn, and its implementation is notoriously error-prone. In my humble opinion, what makes the concept of threading hard to acquire is the fallacy of reification (mistaking our understanding of the motivation behind what we see out there for a thing that is physically standing out there). We have a natural tendency to perceive a computer program as a thing, which it may be indeed — but only in the problem domain. In the solution domain (i.e. the computer that runs it), it does not exist. There are only so many discrete instructions, waiting silently to be performed, in a sequence yet to unfold. Since there is only one program counter (assuming one core), and given the execution stack mechanism, there can be only one function “running” at any given time. When a function “invokes” another function, it cannot continue to run, because it does not exist (the program counter has gone elsewhere). Actually, its state is preserved (on the execution stack), so that it may be brought back to life, if all goes well, in due time. In this context, the execution stack and multi-threading (the capability for multiple execution stacks) represent a major victory in the programmers' quest to implement the requirement for a program (or at least, a function) as a thing that does live in computer-land. (As long as one does not get carried away and mistake the metaphor for the real world, blaming someone else — e.g. the facility of multi-threading — for one’s own incompetence).

While an “asynchronous function call” that just performs a discrete job (in parallel) and walks away is possible and legitimate, the interesting (and quite frequent) multi-threading use case involves a thread that loops indefinitely, processing its input one discrete piece at a time, producing responses that somehow reach the sending thread(s) in one piece each, and on time. (This is also called an event loop, as discussed in the introductory lesson). The whole discontinuous process (composed of using and used threads) must be orchestrated carefully, to make sure that each thread of control does its job on time and in harmony with the others.

2. A “File Watcher” thread function

The problem: to monitor a disk file, logging an alert message when it is modified.

A solution: the File Watcher function uses operating system facilities to monitor the file’s latest known modification time and alerts its client when that changes.

Some design decisions:

  • The File Watcher is launched in a separate thread of control (in order not to block the using application from doing its job in the meantime).
  • The File Watcher is decoupled from the human interface by a callback that is responsible for doing the notification.

Here is a straightforward Python (3.9) implementation (notes below):
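The listing below is a reconstruction of such an implementation (the original was published as an image); names like `watch_file` and `console_alert` are my own. The comments carry the note numbers.

```python
import os
import sys
import time
import threading

end_watch = False  # (1) global end-of-watch flag

def watch_file(path, alert):  # (2) alert is a client-supplied callback
    # (3) validate the file and obtain its latest known modification time
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    last_mtime = os.path.getmtime(path)
    while not end_watch:
        time.sleep(2)
        mtime = os.path.getmtime(path)
        if mtime != last_mtime:  # alert of change, when detected
            last_mtime = mtime
            alert(path, mtime)

def console_alert(path, mtime):
    # (4) format the (binary) change time for display on the terminal
    print('file "{}" changed on {}'.format(path, time.ctime(mtime)))

if __name__ == "__main__":
    # (5) obtain the file name from the command line, or ask the user
    if len(sys.argv) > 1:
        filename = sys.argv[1]
    else:
        try:
            filename = input("File to watch: ")
        except EOFError:
            filename = "fileWatcherTest.txt"  # fallback for non-interactive runs
    # (6) write something to the file, to make sure it exists
    with open(filename, "w") as out:
        out.write("start\n")
    # (7) create the thread; it is not active yet!
    watcher = threading.Thread(target=watch_file,
                               args=(filename, console_alert))
    watcher.start()  # (8) now the thread is free to go
    # (9) append to the file five times, at five-second intervals
    for i in range(5):
        time.sleep(5)  # (10) plenty of time for the watcher to notice
        with open(filename, "a") as out:
            out.write("line {}\n".format(i + 1))
    end_watch = True  # (11) signal the File Watcher to end
    watcher.join()    # (12) wait for the thread to end
```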

  1. Global end-of-watch flag. This simplistic solution will do for the sake of this starter prototype. (We are going to consider a more robust solution in due time).
  2. The file watcher alerts its client using a client-supplied callback function. This decouples the producer of the information (the file watcher) from the human interface (which may be the terminal, a dialog window, a socket, a counter, etc.).
  3. After validating that the file to be watched indeed exists, the file watcher obtains its latest known modification time (“get-m-time”) and proceeds to compare it every two seconds, alerting of change when detected. The iteration is stopped when the global end-flag is signaled (for all we know, by the client).
  4. The simple alert function of this example outputs to the terminal, formatting the (binary) change time for display.
  5. The program attempts to obtain the name of the file to be watched from the command line. Otherwise it requests it from the user.
  6. The program writes something to the file, to make sure that it will exist (by the time the File Watcher accesses it).
  7. Now that we have a file to watch, the file watcher is launched in a separate thread of control, given the name of the file to watch and the console notification callback. The thread is not active yet!
  8. Now the thread is free to go! The file watcher function will be invoked any time soon (asynchronously!).
  9. The test program makes changes to the file, appending some text, five times in a row, at five-second intervals. It expects the File Watcher (launched previously in a thread of its own) to catch these changes on the fly. The test program opens and closes the file on each access (to update the file modification time).
  10. Five seconds is plenty of time for the File Watcher (that wakes up every two seconds) to detect the change. (It also keeps the progress accessible to the human eye).
  11. Test done, the test program signals the File Watcher to end.
  12. The test program waits for the thread to end (it “joins” it), to prevent the thread from continuing to run past the end of the main program. This precaution is not essential here (we can trust the File Watcher to end anyway, since it has just been signaled to do that). Still, this is good parallel programming style!

Output:

file "fileWatcherTest.txt" changed on Sat Dec 12 19:35:22 2015
file "fileWatcherTest.txt" changed on Sat Dec 12 19:35:27 2015
file "fileWatcherTest.txt" changed on Sat Dec 12 19:35:32 2015
file "fileWatcherTest.txt" changed on Sat Dec 12 19:35:37 2015
file "fileWatcherTest.txt" changed on Sat Dec 12 19:35:42 2015

What have we learned in this example?

  • A Python thread of control takes a function and, optionally, arguments to pass to it (when it eventually starts).
  • When requested to start, the thread invokes the function, but on a dedicated execution stack, so that it does not block the launcher of the thread.
  • A thread terminates when the thread function returns. (The thread object continues to exist, but does not have much use).
  • Threads should end for a reason: either the thread has finished its job, or the thread that launched it (the main program, in our example) does not need it anymore. Terminating a thread requires synchronization. Both the thread and its launcher must agree upon a proprietary protocol: the main program must be capable of signaling the thread to end, and the thread must be capable of catching the signal. This design pattern is common in parallel programming. The thread is in a loop, and the Boolean flag, accessible also to the client side, is the loop’s termination condition.
  • Due to programmatic negligence, a thread may continue to run (at least in Python) even after the main program was supposed to return. When the programmer is certain that the thread is indeed bound to finish (for example, it has just been signaled to that effect), but should be allowed some time to clean up, the main program joins the thread: it is blocked until the thread ends.

Other Python thread functionality. Although CPython seems to implement its thread as a wrapper over the host operating system thread, it does not expose the latter’s full functionality (and adds some minor attributes of its own).

  • You cannot kill a Python thread (i.e. terminate it by brute force). Although this limitation has great educational value (see discussion above), it may become a burden in those (not very rare) real-life cases when a thread gets stuck for reasons beyond the programmer’s control (for example, it is doing blocking I/O), leaving no other way but to kill the entire program from the outside!
  • Python threads never execute Python bytecode at the same time, due to the Global Interpreter Lock (“GIL”). This may be relaxed in some future release, but until then, it prevents multi-threaded Python programs from benefiting from the performance of multi-core platforms.
  • You cannot query thread termination status (by default). Implement this capability, if needed, on your own.
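One hypothetical way to implement that missing termination status: a Thread subclass that records the thread function's result or exception, to be inspected after join. All names here (StatusThread, result, error) are my own invention, not part of the threading library.

```python
import threading

class StatusThread(threading.Thread):
    """A Thread that records the outcome of its thread function."""

    def __init__(self, target, args=()):
        super().__init__()
        self._fn = target
        self._fn_args = args
        self.result = None  # the thread function's return value, if any
        self.error = None   # the exception that terminated it, if any

    def run(self):
        # Wrap the thread function, capturing its outcome either way
        try:
            self.result = self._fn(*self._fn_args)
        except Exception as exc:
            self.error = exc

t = StatusThread(target=lambda x: x * 2, args=(21,))
t.start()
t.join()
print(t.result, t.error)  # 42 None
```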

Additional capabilities (thread specific):

  • To mark the thread as “daemon”. (This term is dedicated, in the UNIX world, to servers that work silently in the background and are also to be removed silently). Daemon threads (at least in theory) end with the main program (and do not have to be joined).
  • To give the thread a user-defined name.
  • To tell if a thread is active (or has returned).

Additional capabilities (non-thread specific):

  • To get the current thread. This includes getting a wrapper to the main program (which is also a thread, actually).
  • To count and enumerate the active Python threads (in the current process)
  • To get the current thread id (Python-given and OS-given)
  • To get and set thread stack size (in the rare occasion that the default will not do).
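The capabilities above can be demonstrated in a few lines (a sketch; the worker function and its name are mine):

```python
import threading
import time

def idle():
    time.sleep(0.5)

t = threading.Thread(target=idle, name="worker-1", daemon=True)
t.start()

print(t.name)                      # the user-defined name: worker-1
print(t.daemon)                    # True: ends with the main program
print(t.is_alive())                # True: idle() has not returned yet
print(threading.current_thread())  # the main program is also a thread
print(threading.active_count())    # at least 2: main thread + worker
print([x.name for x in threading.enumerate()])
print(threading.get_ident())       # Python-given id of the current thread
print(threading.get_native_id())   # OS-given id (Python 3.8+)
print(threading.stack_size())      # 0 stands for the platform default

t.join()
print(t.is_alive())                # False: the thread function returned
```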

3. Encapsulating the File Watcher functionality

The problem: The File Watcher thread function does the job in this simple example, but, as a general purpose infrastructure, it leaves something to be desired. In particular, it would benefit from the following capabilities:

  • To watch multiple files, on several unrelated occasions. (By contrast, the current implementation, a function signaled by a global Boolean, limits the application to watching exactly one file!)
  • To separate the File Watcher into a library (i.e. real infrastructure). The global variable stands in the way.

A solution: File Watcher functionality deserves a class.

Some design decisions:

  • File Watchers are constructed using file name and alert function (but not the thread function).
  • The Python thread is created internally by the File Watcher.
  • The thread function is a (private) method of the File Watcher.
  • Starting the File Watcher thread is requested from the File Watcher.
  • The terminator flag is a private attribute of the File Watcher, accessible to the client through getter and setter.

Refactoring the starter prototype code, to meet the object-oriented design (notes follow, output remains the same):
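The refactored class may look as follows. This is a reconstruction (the original listing was an image); the attribute names are mine, and the comments carry the note numbers.

```python
import os
import time
import threading

class FileWatcher:  # (1) now the proud name of a class
    def __init__(self, path, alert):
        # (2) former arguments and local variables are now member data
        if not os.path.exists(path):
            raise FileNotFoundError(path)
        self._path = path
        self._alert = alert
        self._end = False  # (3) the terminator flag is now a member
        # (4) the internal thread is given the bound method self.run
        self._thread = threading.Thread(target=self.run)

    def start(self):
        # (5) starting is requested by the client
        self._thread.start()

    def stop(self):
        # (6) shelters the client from the Boolean implementation
        self._end = True
        self._thread.join()  # wait for the loop to notice the flag

    def run(self):
        # (7) the event loop, extracted into the thread method
        last_mtime = os.path.getmtime(self._path)
        while not self._end:
            time.sleep(2)
            mtime = os.path.getmtime(self._path)
            if mtime != last_mtime:
                last_mtime = mtime
                self._alert(self._path, mtime)
```

(8) The main program would now construct `FileWatcher(filename, alert_function)` and call `start()` on it; the rest of its logic remains the same.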

  1. “File Watcher” is now the proud name of a class.
  2. What used to be arguments and local variables of the file watcher function are now member data of the File Watcher class.
  3. The terminator flag is now a member of the File Watcher class.
  4. The Python thread is a member of the File Watcher class, constructed inside the latter’s constructor. For thread function, it is given a bound method, consisting of the File Watcher object (“self”) and the File Watcher method (“FileWatcher.run”).
  5. Starting the thread is now the responsibility of the File Watcher object (but requested by the client).
  6. The opposite, stopping the thread, is now also the responsibility of the File Watcher object, sheltering the client from the Boolean implementation.
  7. The event loop was extracted to make the thread method, appropriately called run.
  8. The main program initializes a File Watcher object, using file name and alert function (but not the thread function). The rest of the main program logic remains the same.

Now, with this object-oriented design, we can examine the parallel logic visually:

Notes to the sequence diagram:

  • The File Watcher is launched on a parallel thread of control. Here the ways of the main program and the File Watcher part, each proceeding on its own.
  • The two main loops (“to add line to file” and “to watch file”) occur independently, still in parallel.
  • “To notify change” is coupled (by data) with “to add line to file”. The input of the first (“file modify date [changed]”) is the output of the second (“file appended line [and closed]”), i.e. file changed. “Data coupling” is a frequent idiom in parallel design, replacing the temporal coupling (commands following in the order written) which you may know from procedural programming. Procedural programming goes like “first do A, then do B” (in this order). In event-driven design, this is replaced by “B consumes what A produces” (and therefore must follow it).
  • Stopping the File Watcher is synchronous for the main program: the job (signaling) is performed immediately by the File Watcher object, in the main program thread, non-blocking. But it is asynchronous for the File Watcher thread, running in the background: according to protocol, it is scheduled to stop the loop above on the next iteration.

A fact that is missed by many, and thus responsible for at least some parallel programming bugs, is that, in an object-oriented parallel design, a thread may be accessed both asynchronously and synchronously. (1) Asynchronous: first, the thread is launched to do its job on its own and then, it is occasionally signaled or fed with data at its own pace. (2) Synchronous: thread methods are invoked to change thread state, non-blocking. Note the ambiguity: signaling appears in both lists! The key to not losing one’s way around here is to recall that a thread of control — regardless of the programmatic idiom — is not an object! It is the state of execution of a function. Your implementation may add, on top of this, whatever you need (such as keeping the state available synchronously, signaling and sending data), as long as you remember that the thread in which the function is invoked is what matters! When you invoke a method of your thread synchronously in the context of the main program, your current thread is the main program (and not the thread object, which keeps running in the background)!

4. Wait a minute! Python already has a Thread class!

Containing a built-in threading.Thread in a user-defined thread class, such as our File Watcher, is overkill. We can achieve the same by inheriting from it. Refactoring is rather straightforward: the logic remains very much the same, since inheriting from a class and containing an instance of it is — at least in the present case — practically the same. (What we have here is object-orientation by reuse. The threading.Thread is not an abstract base class or an interface, and no substitutability is involved). Add to this the bonus of coding less and inheriting all the rest — data and methods. (And on the contrary, the punishment of being restricted to what the design of the base class takes for granted, if we can live with it). In this case, the restrictions (listed below) are such that we can easily live with them.

The only override-able Thread method is run — the thread function.

  • Consider that Python dispatches methods dynamically, by name. The base Thread start method invokes self.run(), whose default implementation does nothing (when no target was given). Therefore, the overriding method must be called “run” to the letter. Any other name (such as “Run”) would fail to be invoked, leaving us with the do-nothing default!
  • The base Python thread does not support an event loop. (And neither should it! Not all — or even most — threads are event-loop based). Whatever happens inside the run method is the derived class author’s responsibility. Consequently, the base Thread does not respond to stop. (Rather asymmetrically. As we have already seen, it does respond to start). Implement your own stop method, if you need one.

Refactoring the class-based prototype code, to meet the library-conformant design (notes follow, output and logic remain the same):
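Again a reconstruction (the original listing was an image); the comments carry the note numbers.

```python
import os
import time
import threading

class FileWatcher(threading.Thread):  # (1) inherits the built-in Thread
    def __init__(self, path, alert):
        # (2) no thread member; delegate to the base Thread constructor
        super().__init__()
        if not os.path.exists(path):
            raise FileNotFoundError(path)
        # (3) the rest of the members remain the same
        self._path = path
        self._alert = alert
        self._end = False

    def run(self):
        # (4) the name "run" designates this as the thread function
        last_mtime = os.path.getmtime(self._path)
        while not self._end:
            time.sleep(2)
            mtime = os.path.getmtime(self._path)
            if mtime != last_mtime:
                last_mtime = mtime
                self._alert(self._path, mtime)

    def stop(self):
        # (5) the only method we must add; the base Thread has no stop
        self._end = True
```

(6) The client code is unchanged: start is now inherited, and the client may still join the watcher after stopping it.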

  1. The “File Watcher” class inherits the built-in “threading.Thread”.
  2. There is no longer a built-in thread member inside this File Watcher (one is inherited, instead). The File Watcher constructor delegates to the base built-in thread constructor.
  3. The rest of the File Watcher members remain the same.
  4. The run method remains the same. Its name (“run”) designates it as the thread function.
  5. The only method we must add is stop. (The base thread does not support stopping, because it is not aware of the functionality of the thread method).
  6. The main program is not affected by the changes made internally to the File Watcher class.

5. Exercise: “Starter Tail utility”

“Tail” is a Unix utility whose logic closely resembles our “File Watcher” example, but, instead of announcing the fact of the file being modified, it outputs the very added lines, assuming that (1) the file is only modified by appending, and (2) the file contains plain text. A typical use for tail is to follow log files (as they are being written).

Modify the “File Watcher” example into a basic “Tail” utility. Consider the following: The file is read from the end (rather than from the start), which requires:

  • To open the file in binary mode — ‘rb’ — , and then…
  • To position to end — inp.seek(0, 2)
  • The next line (past the last-known end-of-file) is obtained by inp.readline(). Non-empty result indicates a valid line (to be displayed). Empty result means that we are still at end of the file (but this is not an error).
  • Writing to the file from the other side remains the same. Do not forget to open the file in append mode — ‘a+’!
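Not a solution, but the reading technique above can be sketched in isolation (the generator `follow` and its `stop` predicate are my own names, assumed for illustration):

```python
import time

def follow(path, stop):
    """Yield lines appended to the file at path, until stop() says otherwise."""
    with open(path, "rb") as inp:  # binary mode
        inp.seek(0, 2)             # position to end of file
        while not stop():
            line = inp.readline()  # the next line past the last-known end
            if line:
                yield line.decode()  # non-empty: a valid line to display
            else:
                time.sleep(1)      # empty: still at end of file; not an error
```

Feeding such a generator into a stoppable FileWatcher-style thread is the exercise itself.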

Try to do an honest job! You are going to refactor this simple starter as an exercise in the following lessons!

What next?

In the next lesson, we meet the classical Producer/Consumer example. We are going to consider some “bare-handed” implementations using Python’s synchronization primitives and try to evaluate what each of them is good for. Then, in the following lessons, we will implement the Producer/Consumer algorithm with advanced Python facilities: multi-processing and cooperative processing (synchronous and asynchronous).

  1. Introduction
  2. The Thread (you are here!)
  3. Synchronization Primitives (Multi-threading)
  4. Synchronization Primitives (Multi-processing)
  5. Cooperative Processing — synchronous
  6. Cooperative Processing — asynchronous

--


Avner Ben
CodeX

Born 1951. Active since 1983 as programmer, instructor, mentor in object-oriented design/programming in C++, Python etc. Author of DL/0 design language