Parallel Programming in Python Lesson 5. Cooperative programming — synchronous

Avner Ben
CodeX

--

Sections in this lesson:

  1. Introduction
  2. Pull iterator
  3. Iterator essentials
  4. Single Consumer adapted to pull from multiple Producers
  5. Push iterator
  6. Coroutine essentials
  7. Automatic Priming
  8. Single Producer, adapted to push to multiple Consumers (Multi-casting)
  9. Propagating the Data-sink
  10. Comparison of state-machine vs. coroutine based solutions
  11. Exercise: Cooperative Tail Server

1. Introduction

In the previous two lessons (three and four), we studied a simple “Producer/Consumer” (one-to-one) use case:

  • “The Producer sends enumerated text messages, which are received and promptly displayed by the Consumer, at two-second intervals. Both Producer and Consumer are created (and paired) by the main program, which shuts down the Producer after 20 seconds, causing the Consumer to shut down as well, resulting in exactly 10 messages displayed.”

We have experimented with various “conventional” implementations of the Producer/Consumer use-case: first with multi-threading and then with Python’s own multi-processing twist. All these “classical” solutions had these two principles in common:

  1. The Producer and Consumer are loosely coupled — they never meet in person, communicating through a third party. We considered the following means of communication: global variable, shared variable, point-to-point message-queue, pipe, publish/subscribe and socket.
  2. The protocol is dictated by the Producer (that manufactures the goods). The Consumer complies.

And now, to an alternative solution where the Producer and Consumer (1) loop in the same thread of control, and (2) are temporally coupled (either the Consumer pulls directly from the production loop, or the Producer pushes directly into the consumption loop). Of course, these solutions are restricted to the subset of use cases where the Producer and Consumer are never required to execute simultaneously (at exactly the same time — which is undeniably thread territory). Then again, this is not much to ask, considering that almost all Producer/Consumer use cases in our universe do not really require simultaneity! (By definition, sending the message must occur before receiving it. There is only scope for sending and receiving at the very same time in bizarre cases where, for example, the message is sent in packets, or multiple messages must be issued before being received, so that peripheral activity on both sides may happen to occur simultaneously.) Before questioning if and how this magic can be performed, there is good reason to try: (1) multi-threading/processing is resource-expensive, and (2) it introduces complexity to the design that makes the program (unnecessarily) hard to manage and extend.

The solution involves two programmatic paradigms:

  1. Cooperative processing (use of Coroutines). We shall consider two design patterns: (1) Pull iterator. In each iteration, the consumption loop pulls the next message from the production loop (which pauses), uses it and pauses, etc. (2) Push iterator. In each iteration, the production loop pushes the next message to the consumption loop (and pauses). The consumption loop takes over from there (and pauses), etc. While such a solution can be built with plain procedural programming, the result is likely to be too complicated to justify the effort (which explains why such designs are not frequent in practice). Here, the programmatic feature of coroutines (functions that pass control to each other, retaining execution state) comes in handy! The Python support for coroutines makes these solutions quite simple and readable (for those who are familiar with the paradigm, of course).
  2. Substitutability (or Polymorphism). These solutions seem to compromise the essential loose coupling (between Producer and Consumer) — there is no denying that this Producer explicitly calls a method of the Consumer (or vice versa). But here comes the twist: this Producer is unaware of exactly (or even vaguely) whom it is talking to. The Producer (or vice versa) can be given any object, as long as that object implements the Consumer interface (the part that is of interest to the Producer, and as the Producer expects to find it). As we shall see, the substitutability of the iteration argument opens some interesting extensions. For example, why stop at 1:1? The object behind the interface may be an adapter that makes it easy to implement such use cases as single Producer / multiple Consumers, multiple Producers / single Consumer and others. Or the object on the other side of the interface may be a proxy for the real object that sits in another thread (if simultaneity is really needed!).

2. Pull iterator

The following example couples the Consumer with the Producer explicitly, but with a twist, using Python’s support for structured iteration. If we constrain our Producer/Consumer use case to exactly one Consumer (which is quite realistic), then we are free to rephrase it in this simple form:

  • “The consumer iterates over messages from the Producer”. (”Iterates” in the same way that it would pull records from a file, characters from a string or objects from a list).

Here is an example (notes follow):
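A minimal sketch of such a pair, reconstructed around the notes below. The class and method names (Producer, Consumer, stop) and the delay parameter are assumptions; the demo shortens the lesson’s two-second interval and 20-second run so it finishes quickly.

```python
import threading
import time

class Producer:
    """The Producer is a plain object - not a thread."""

    def __init__(self, delay=2):
        self.delay = delay
        self.running = True

    def stop(self):
        self.running = False

    def __iter__(self):
        # Magic method, silently invoked by Python for "for" loops over
        # Producer objects.
        count = 0
        while self.running:
            count += 1
            yield f"Round {count}"  # suspend, feeding the calling for loop
            time.sleep(self.delay)

class Consumer(threading.Thread):
    """Initialized with any object complying with the iterable interface."""

    def __init__(self, source):
        super().__init__()
        self.source = source

    def run(self):
        for message in self.source:  # pull the messages "stored" in the source
            print(message)
        print("[End of input]")

if __name__ == "__main__":
    producer = Producer(delay=0.2)   # 2 seconds in the lesson
    consumer = Consumer(producer)
    consumer.start()
    time.sleep(1.9)                  # 20 seconds in the lesson
    producer.stop()
    consumer.join()
```

Note that the Consumer never names the Producer: it only requires something it can iterate over.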

Notes:

  1. In this design, the Producer is not a thread.
  2. A producer method that implements structured iteration. This magic function is silently invoked by Python to implement for loops on Producer objects. (More on this later).
  3. The yield statement suspends the loop, feeding the calling for-loop with the current element.
  4. The Consumer (a thread) is initialized with an object complying with the Iterable interface (such as our Producer).
  5. The Consumer iterates on the messages “stored” (for all it knows) in the Producer.
  6. The main program creates the Producer and initializes the Consumer with it.

Output:

Round 1 
Round 2
Round 3
Round 4
Round 5
Round 6
Round 7
Round 8
Round 9
Round 10
[End of input]

3. Iterator/generator essentials

This sequence diagram demonstrates the synchronization by data coupling. In each iteration in the consumption loop, “to pull next message from the Producer” (implicit in the for-loop header) is design-wise coupled by “message” (dotted arc) with (and programmatically blocked by) “to yield next message” in the production side, which loops in parallel.

Unlike other languages that support structured iteration only under some strict use cases (such as extracting records from files or objects from standard containers), Python’s iteration model is — in the Python spirit — as general as can be. Any object may be iterated over, provided its class complies with the “iterable” protocol, which consists of a single capability: the magic method __iter__. The built-in function iter, applied to an object, invokes the object’s magic method __iter__ (if it has one), which returns an iterator. The loop then proceeds by calls to the iterator’s magic method __next__. The loop is terminated when the iterator raises the special exception StopIteration.

The magic method __iter__ may be either a generator (yielding the next entry) or a normal function that returns an iterator. To be precise, the built-in iter always returns an iterator: when presented with a generator (a function that yields), it silently creates a default iterator over it.

Notes:

  1. First, an iterator is obtained from the object of iteration (using the built-in iter, which uses the producer’s magic __iter__ method). This gives a (default) iterator that “points” before the first message. (I.e. no message has yet been produced).
  2. In each iteration, the built-in next attempts to obtain the next message from the producer (using the iterator’s magic method __next__), which resumes the producer’s magic method __iter__ until the next yield statement, where it pauses, returning the next message.
  3. When the producer decides to stop producing messages (i.e. its magic method __iter__ returns “normally” — as distinct from yielding), the iterator raises StopIteration, which terminates the consumption loop.

While the typical usage of Python iterators is implicit within for loops (as demonstrated above), there are occasions where we would rather take matters into our own hands and advance the iterator explicitly. Take, for example, the following function, which gets the first entry (or none) from an arbitrary iterable.
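A sketch of such a function (the name first is an assumption):

```python
def first(iterable):
    """Return the first entry of an arbitrary iterable, or None if empty."""
    iterator = iter(iterable)   # obtain an iterator, whatever the argument is
    try:
        return next(iterator)   # the first next() fetches the first item
    except StopIteration:       # the iterable was empty
        return None

print(repr(first("abc")))  # 'a'
print(first([]))           # None
```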

Notes:

  1. The function obtains an iterator from its argument, whatever it may be, and returns the next item in it (actually resulting in the first item, because Python iterators are born uninitialized).
  2. In case the iterator is empty (next fails in the very first iteration), the None object is returned.

Output:

'a' 
None

4. Single Consumer adapted to pull from multiple Producers

The design decision to couple the Consumer to the Producer by interface opens the design, offering some interesting extended (and useful) use-cases. But first, a demonstration that the Consumer is indeed shielded from the true identity of the Producer:
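A sketch: the very same Consumer thread, handed a string instead of a Producer (the string literal is chosen to match the output below).

```python
import threading

class Consumer(threading.Thread):
    """The same Consumer as before - it only assumes its source is iterable."""

    def __init__(self, source):
        super().__init__()
        self.source = source

    def run(self):
        for message in self.source:
            print(message)
        print("[End of input]")

if __name__ == "__main__":
    consumer = Consumer("abcdefg")  # a string iterates on its characters
    consumer.start()
    consumer.join()
```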

Notes:
1. This consumer is initialized with a string (which iterates on characters).

Output:

a 
b
c
d
e
f
g
[End of input]

A more useful extension is to implement the case of single Consumer / multiple Producers. And this is done without any major change to the current implementation. Since the Consumer does not know what it really iterates on, we are going to provide it with an adapter that, in each iteration request from the Consumer, pulls the next message from an array of producers that it hides inside!

Incidentally, this Multi Producer Adapter gives the occasion to demonstrate “low-level” usage of iteration, as discussed above.
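A sketch of the adapter and the adjusted classes (names are assumptions; the delay is parameterized and shortened in the demo, where the lesson uses two seconds):

```python
import threading
import time

class Producer:
    def __init__(self, prefix):
        self.prefix = prefix  # distinguishes this producer's output
        self.running = True

    def stop(self):
        self.running = False

    def __iter__(self):
        count = 0
        while self.running:
            count += 1
            yield f"Round {self.prefix}{count}"  # prefixed output

class MultiProducer:
    """Adapter: poses as a single iterable, hiding a list of producers."""

    def __init__(self, producers, delay=2):
        self.iterators = [iter(producer) for producer in producers]
        self.delay = delay

    def __iter__(self):
        while True:
            for iterator in self.iterators:  # one (next) element from each
                try:
                    yield next(iterator)
                except StopIteration:
                    # the first producer to end terminates the iteration
                    return
            time.sleep(self.delay)  # one delay per round, not per producer

class Consumer(threading.Thread):
    def __init__(self, source):
        super().__init__()
        self.source = source

    def run(self):
        for message in self.source:  # the delay is no longer here
            print(message)
        print("[End of input]")

if __name__ == "__main__":
    producers = [Producer(prefix) for prefix in "abc"]
    consumer = Consumer(MultiProducer(producers, delay=0.2))
    consumer.start()
    time.sleep(0.9)
    for producer in producers:  # multiple producers to stop
        producer.stop()
    consumer.join()
```

The try/except around next is essential: since PEP 479, a StopIteration escaping inside a generator body would become a RuntimeError, so the adapter converts it into a plain return.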

Notes:

  1. The producers are provided with a unique prefix, to distinguish their output. Otherwise, there is no change to producer logic.
  2. Producer output is prefixed.
  3. The Multi Producer Adapter receives a list of producers and prepares to iterate on each of them.
  4. The Multi Producer Adapter iterates over each producer, one (next) element at a time.
  5. The delay is now under responsibility of the Multi Producer Adapter. (We do not want it to depend on the — arbitrary — number of producers).
  6. Here, the first producer to end terminates the entire iteration. (An alternative solution is to remove the expired producer from the list and defer the termination of the production loop to when the list becomes empty).
  7. The delay is no longer under responsibility of the consumer. Otherwise, there is no change to Consumer logic.
  8. The main program creates three producers, prefixed “a” to “c”.
  9. This time the Consumer is given an adapter over three prefixed Producers.
  10. There are now multiple Producers to stop.

Output:

Round a1 
Round b1
Round c1
Round a2
Round b2
Round c2
Round a3
Round b3
Round c3
Round a4
Round b4
Round c4
Round a5
Round b5
Round c5
[End of input]

5. Push Iterator

And now, to the opposite use case, where the Producer feeds the Consumer coroutine with messages, one at a time. As in the previous solution (Consumer iterates on the Producer), control is passed between them in the obvious (procedural) way, needing no synchronization facilities. Interestingly, Python allows us to use an iterator (over a generator function) here as well, but this particular iterator pushes to the data target (instead of pulling from the data source), using dedicated syntax.
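A sketch (names are assumptions; the delay is parameterized, where the lesson fixes it at two seconds):

```python
import threading
import time

class Producer(threading.Thread):
    def __init__(self, target):
        super().__init__()
        self.target = target  # a consumer coroutine
        self.running = True

    def stop(self):
        self.running = False

    def run(self):
        next(self.target)  # priming: position the coroutine at its first yield
        count = 0
        while self.running:
            count += 1
            self.target.send(f"Round {count}")  # push, blocking until yield
        self.target.close()  # the output stream must be closed explicitly

def consume(delay=2):
    try:
        while True:
            message = (yield)  # block, waiting for someone to send
            print(message)
            time.sleep(delay)  # the delay is managed by the Consumer
    except GeneratorExit:      # the legitimate end of the loop
        print("[End of input]")

if __name__ == "__main__":
    producer = Producer(consume(delay=0.2))  # a coroutine over consume
    producer.start()
    time.sleep(1.9)
    producer.stop()
    producer.join()
```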

Notes:

  1. The Producer primes the iterator (i.e. calls next on it). Since a Python iterator is born uninitialized, the first get-next (“priming”) is required to position it at the start. Otherwise, the following send will result in an error!
  2. The Producer sends the message to the consumption loop, blocking.
  3. Since send, unlike yield, is not terminated by the (end of the) loop on the other side, the output stream must be closed explicitly.
  4. The consumption loop blocks, waiting for someone (the Producer, in our case) to send the next message (implemented by yield giving the message). When the message is finally received, the Producer (that sent it) remains blocked until the next yield, or until consume exits. Enclosing the yield in parentheses is not essential, but is common Python practice.
  5. The delay is managed here by the Consumer.
  6. The decision of the Producer to stop sending (which, here, is not implicit) is delivered by close, which raises the GeneratorExit exception inside the consumption loop. The Consumer interprets this “exception” as the legitimate end of the loop.
  7. The main program calls consume, which returns a Python coroutine (generator) object over consume, and hands it over to the Producer. The rest of the main program is not affected by the change of protocol.

6. Coroutine essentials

The sequence diagram demonstrates the synchronization by data coupling. In each iteration of the consumption loop, “to receive next message in the Consumer” (yield) is design-wise coupled by “message” (dotted arc) with (and programmatically blocked by) “to push message to the Consumer” (send) on the production side, which is looping in parallel.

Note the difference!

  • The pull-iterator’s yield produces the argument, blocks the production loop and releases the consumption loop.
  • The push-iterator’s yield blocks the consumption loop, releases the production loop and (when released by the next send) consumes the argument.

“Calling” a dependent coroutine (like our consumer) does not yet execute the function. Instead, it returns an iterator, ready to run, somewhat resembling a thread “constructor” (accepting initialization parameters, where needed). When the coroutine is entered for the first time (when requested to advance to the next — that is, first — yield), it executes up to the first yield and pauses, a procedure known as priming. Then, each item sent from the other side releases the coroutine to consume the pending yield value.

Of course, this mechanism will work as long as the push-iterator coroutine is blocked at a yield point, which is guaranteed during the output loop, but not the first time, when the coroutine is not yet positioned anywhere useful. This is why priming is needed. Admittedly, the functionality and motivation of generator priming are somewhat obscure. To fully understand them, one must delve into the history of the language. The push iterator (generator coroutine) is a late addition to Python, and was implemented over the existing (pull) iterator. To demonstrate this, it is possible to write such code as:

inp = yield outp

Although the utility of such code is dubious, and it is strongly advised against by the textbooks, it will work. (The yield will happen first and the assignment — driven by the send — last.) It all begins with the Python iteration paradigm, which does not distinguish between pointing at the current element and returning (yielding) it, as does, for example, the C++ STL. (Python is not unique in this; so is C#, among others, and it makes better sense, design-wise.) A Python iterator is born “uninitialized” and will only “point” to the first element (and return it) when required to get the next one. The first next request will fetch the first element (if any). Incidentally, it will also run through the iteration initialization procedure, if present. This works well for the original pull iterator, because it controls the iteration (the other side is blocked until it yields). However, the push iterator cannot be born “uninitialized”, because it is controlled from outside! It is blocked until someone does it a favor and sends into it. When that happens, it must be positioned at some yield point, waiting for the send to come. To make that happen for the first time, we must use it in its (unwanted) capacity as a pull iterator and instruct it to retrieve the next element, i.e. get to the first yield point (where it retrieves nothing, because it is given nothing), and stand ready to receive. This is an admittedly lame excuse, but that is the way it works…

At the other end of the coroutine lifespan, while it is terminated by returning “normally” (as distinct from yielding), it is not a good idea for it to do so of its own accord. Typically, it is the pushing side that should decide when to stop pushing and terminate the loop. (In contrast with the pull iterator, which keeps sending back the goods, and whose natural duty is to stop the loop by raising StopIteration.) When the producer is through sending messages, it signals the consumer to stop receiving by raising GeneratorExit (via close). This gives the consumer the opportunity to do cleanup prior to exiting.

7. Automatic Priming

The burden of priming the generator may be generalized and delegated to a decorator, as in the example below. (Oddly enough, this is not part of the built-in Python library).
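A sketch of such a decorator (the use of functools.wraps is optional polish; the pattern follows the Beazley course listed in the references):

```python
import functools

def coroutine(function):  # takes the generator function to wrap
    @functools.wraps(function)
    def start(*args, **kwargs):  # the widest signature possible
        cr = function(*args, **kwargs)  # create the coroutine
        next(cr)   # prime it: advance to the first yield
        return cr  # 'function' is not a parameter of start: it is in its closure
    return start

@coroutine  # 'consume' now points at start; the real consume is in its closure
def consume():
    try:
        while True:
            message = (yield)
            print(message)
    except GeneratorExit:
        print("[End of input]")

consumer = consume()       # actually calls start: create, prime, return
consumer.send("Round 1")   # no explicit priming needed any more
consumer.close()
```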

Notes:

  1. The coroutine function takes a function.
  2. Inside the coroutine decorator hides another function, called start, that is meant to wrap the received function. Since we do not know in advance what parameters the wrapped function is going to take, start settles for the widest signature possible: any number of positional parameters and any number of keyword parameters.
  3. start creates a coroutine (by calling the wrapped function). It does not need the wrapped function as a formal parameter, because it is already in its closure.
  4. The Consumer is decorated as a coroutine. The function name consume now points at the function start (of coroutine). (consume is not lost; it is in the closure of start.) Calls to consume will call start instead (as a result of the decorator above), which will call the real consume, prime the resulting coroutine object and return it.
  5. The main program calls what it holds to be consume (actually, start of coroutine), which calls the real consume, obtaining a Python coroutine object, primes it and returns it to the main program, which hands it over to the Producer. The rest of the main program is not affected by the change of protocol.

8. Single Producer, adapted to push to multiple Consumers (Multi-cast)

This version “multi-casts” the messages of the Producer to multiple Consumers. A multicast coroutine is inserted between the Producer and its Consumers. The delay is moved to the multi-caster.
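A sketch (the priming decorator from the previous section is repeated, to keep the example self-contained; delays are shortened in the demo):

```python
import threading
import time

def coroutine(function):
    """The priming decorator from the previous section."""
    def start(*args, **kwargs):
        cr = function(*args, **kwargs)
        next(cr)
        return cr
    return start

class Producer(threading.Thread):
    def __init__(self, target):
        super().__init__()
        self.target = target  # already primed by the decorator
        self.running = True

    def stop(self):
        self.running = False

    def run(self):
        count = 0
        while self.running:
            count += 1
            self.target.send(f"Round {count}")
        self.target.close()

@coroutine
def consume(prefix):  # consumers are identified by prefix
    try:
        while True:
            message = (yield)
            print(f"{prefix}: {message}")  # the message is prefixed
    except GeneratorExit:
        print(f"{prefix}: [End of input]")

@coroutine
def multicast(consumers, delay=2):  # takes a list of (primed) generators
    try:
        while True:
            message = (yield)
            for consumer in consumers:  # dispatch to each consumer, in turn
                consumer.send(message)
            time.sleep(delay)  # one delay per message, not per consumer
    except GeneratorExit:
        for consumer in consumers:  # finally, close each consumer, in turn
            consumer.close()

if __name__ == "__main__":
    consumers = [consume(prefix) for prefix in "abc"]
    producer = Producer(multicast(consumers, delay=0.2))
    producer.start()
    time.sleep(0.9)
    producer.stop()
    producer.join()
```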

Notes:

  1. The different consumers are identified by prefix.
  2. The consumer’s message is prefixed accordingly.
  3. The multi-cast adapter takes a list of generators.
  4. The message is dispatched to each consumer, in turn.
  5. The delay is delegated to the multi-caster. (We would not like it to depend on the number of consumers, which is arbitrary.)
  6. Finally, the multi-cast adapter closes each consumer, in turn.
  7. The main program loads the multi-cast adapter with consumer generators, prefixed ‘a’ to ‘c’.

Output:

a: Round 1 
b: Round 1
c: Round 1
a: Round 2
b: Round 2
c: Round 2
a: Round 3
b: Round 3
c: Round 3
a: Round 4
b: Round 4
c: Round 4
a: Round 5
b: Round 5
c: Round 5
a: [End of input]
b: [End of input]
c: [End of input]

9. Propagating the Data-sink — coroutine pipeline

This version generalizes the display of the message by introducing a substitutable sink medium — yet another coroutine in the chain — that is propagated down the hierarchy, making a pipeline.
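A sketch of the pipeline (the priming decorator is repeated for self-containment; the producer-side delay parameter is an assumption, shortened in the demo):

```python
import threading
import time

def coroutine(function):
    """The priming decorator, repeated here for self-containment."""
    def start(*args, **kwargs):
        cr = function(*args, **kwargs)
        next(cr)
        return cr
    return start

class Producer(threading.Thread):
    def __init__(self, target, sink, delay=2):  # consumer and output medium
        super().__init__()
        self.target = target
        self.sink = sink
        self.delay = delay
        self.running = True

    def stop(self):
        self.running = False

    def run(self):
        count = 0
        while self.running:
            count += 1
            # the target medium travels with every message
            self.target.send((f"Round {count}", self.sink))
            time.sleep(self.delay)
        self.target.close()

@coroutine
def consume():
    try:
        while True:
            message, sink = (yield)  # the message and the target medium
            sink.send(message)       # forward the display to the medium
    except GeneratorExit:
        print("[End of input]")

@coroutine
def display():  # the output medium, a coroutine to conform with the design
    while True:
        message = (yield)
        print(message)

if __name__ == "__main__":
    producer = Producer(consume(), display(), delay=0.2)
    producer.start()
    time.sleep(0.9)
    producer.stop()
    producer.join()
```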

Notes:

  1. The producer takes two generators: the consumer and the output medium.
  2. Output is fed to the consumer, using the output medium.
  3. The consumer is fed by two arguments: the message and the target medium. (Note that, in this design, the target medium is sent again, in each message, stressing the independence of the consumer from the output medium. An alternative design would be to initialize the consumer using the output medium, assuming it does not change and does not depend upon the message).
  4. The consumer forwards the display to the given output medium.
  5. The output medium is implemented as coroutine to conform with the design. (It could as well be a function or a substitutable object)
  6. The main program initializes the producer with the output medium.

10. Comparison of state-machine vs. coroutine-based solutions

For the next discussion, we are going to switch to the use case of building a course structure, implemented as a dictionary of lists and loaded from XML. We shall use the procedure of loading the data from XML to compare a “conventional” state-machine solution with an alternative coroutine-based solution, which may seem unusual at first, but has its charm and certainly demonstrates the paradigm.

This is the data structure instance of our example:

XML input:
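The input may look like this (the tag and attribute names here are assumptions, inferred from the surrounding notes and output):

```xml
<Course name="Programming Languages">
    <Paradigm name="Procedural">
        <Language name="C"/>
        <Language name="COBOL"/>
    </Paradigm>
    <Paradigm name="OO">
        <Language name="Python"/>
        <Language name="C++"/>
        <Language name="C#"/>
        <Language name="Java"/>
    </Paradigm>
    <Paradigm name="Functional">
        <Language name="Clojure"/>
    </Paradigm>
</Course>
```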

A “conventional” version: to populate course according to tag, observing loading state
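A sketch, using the expat parser from the standard library (the XML is embedded as a string; class, tag and attribute names are assumptions):

```python
import xml.parsers.expat

COURSE_XML = """<Course name="Programming Languages">
    <Paradigm name="Procedural">
        <Language name="C"/>
        <Language name="COBOL"/>
    </Paradigm>
    <Paradigm name="OO">
        <Language name="Python"/>
        <Language name="C++"/>
        <Language name="C#"/>
        <Language name="Java"/>
    </Paradigm>
    <Paradigm name="Functional">
        <Language name="Clojure"/>
    </Paradigm>
</Course>"""

class CourseLoader:
    """State machine: populates a dictionary-of-lists according to tag."""

    def __init__(self):
        self.course = {}
        self.current = None  # language list of the current paradigm

    def begin_element(self, tag, attributes):
        if tag == "Course":      # resets the parsing
            self.course.clear()
            self.current = None
        elif tag == "Paradigm":  # opens a new paradigm with empty language list
            self.current = self.course[attributes["name"]] = []
        elif tag == "Language":  # adds a language to the current paradigm
            self.current.append(attributes["name"])

    def end_element(self, tag):
        pass  # registered but not implemented (reserved for future use)

    def load(self, text):
        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = self.begin_element
        parser.EndElementHandler = self.end_element
        parser.Parse(text, True)
        return self.course

course = CourseLoader().load(COURSE_XML)
for paradigm, languages in course.items():
    print(paradigm)
    for language in languages:
        print(language)
```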

Notes:

  1. The “Course” tag resets the parsing.
  2. The “Paradigm” tag opens a new paradigm in the course, with (for now) empty course list and sets both as current.
  3. The “language” tag adds a language to the current paradigm.
  4. The end element is registered but not implemented. (Reserved for future use).

Output:

Procedural 
C
COBOL
OO
Python
C++
C#
Java
Functional
Clojure

A coroutine-based solution: to populate course, iterating in data hierarchy

And here is a coroutine-based XML builder, using the same parser. The builder registers begin-tag and end-tag handlers with the parser, which feed the coroutine. The coroutine is built as a loop within a loop within a loop, true to the one-to-many-to-many course data structure.
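A sketch of the builder (the event encoding as (kind, tag, name) tuples, and the names build and load, are assumptions):

```python
import xml.parsers.expat

COURSE_XML = """<Course name="Programming Languages">
    <Paradigm name="Procedural">
        <Language name="C"/>
        <Language name="COBOL"/>
    </Paradigm>
    <Paradigm name="OO">
        <Language name="Python"/>
        <Language name="C++"/>
        <Language name="C#"/>
        <Language name="Java"/>
    </Paradigm>
    <Paradigm name="Functional">
        <Language name="Clojure"/>
    </Paradigm>
</Course>"""

def coroutine(function):
    """The priming decorator from the earlier section."""
    def start(*args, **kwargs):
        cr = function(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def build(course):
    while True:                               # course loop (normally once)
        kind, tag, name = (yield)
        if (kind, tag) == ("begin", "Course"):     # clear the course
            course.clear()
            while True:                       # paradigm loop
                kind, tag, name = (yield)
                if (kind, tag) == ("begin", "Paradigm"):
                    languages = course[name] = []  # open a new paradigm
                    while True:               # language loop
                        kind, tag, name = (yield)
                        if (kind, tag) == ("begin", "Language"):
                            languages.append(name)   # add a language
                        elif (kind, tag) == ("end", "Language"):
                            pass  # introduced by the parser; ignored
                        else:
                            break  # end of paradigm: next paradigm
                elif (kind, tag) == ("end", "Course"):
                    break          # next course (expected end tag)

def load(text):
    course = {}
    builder = build(course)  # primed by the decorator
    parser = xml.parsers.expat.ParserCreate()
    # the begin-tag and end-tag handlers feed the coroutine
    parser.StartElementHandler = (
        lambda tag, attrs: builder.send(("begin", tag, attrs.get("name"))))
    parser.EndElementHandler = lambda tag: builder.send(("end", tag, None))
    parser.Parse(text, True)
    return course

course = load(COURSE_XML)
for paradigm, languages in course.items():
    print(paradigm)
    for language in languages:
        print(language)
```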

Notes:

  1. Course loop (normally executed once).
  2. The begin-course tag clears the course and resets the current paradigm.
  3. Paradigm loop.
  4. The begin-paradigm tag opens a new paradigm as current and clears its language list.
  5. Language loop
  6. The language tag adds a language to the current paradigm.
  7. The end-language tag (introduced by the parser) is ignored.
  8. Next language.
  9. Next paradigm
  10. Next course (expected end-tag).

This little gem is methodically fascinating (although its actual utility is admittedly questionable). What we have here appears to the eye as traditional procedural code, but is emulated by functional programming! It is event-driven, but under the surface: the coroutine usage effectively hides the state machine from the programmer. As in the common procedural program, the control structure follows the data structure faithfully. Of course, this fidelity to the data structure accounts for the strong points of this design pattern, as well as its weak points. On the one hand, the code is crystal clear: any beginner Python programmer can understand what this code does, even if not familiar with exactly what yield does. (In contrast, the event-driven solution is as detached from the data structure as can be, and is hardly readable.) On the other hand, it is not open-closed/extensible: any change to the data structure requires physically changing the code (while the event-driven solution can do with just registering extra handlers). Still, I would recommend considering this design pattern, due to its superior readability, where the data structure is stable, or where extending the code when it does change is not an issue.

11. Exercise: Cooperative tail server

Refactor your solution of the multi-threaded Tail Server exercise presented in lesson three, to use coroutines.

Here is the official “schoolbook solution”:

Output:

File 1 to watch: file1.txt 
File 2 to watch: file2.txt
File 3 to watch: file3.txt
File 4 to watch:
[Tailing "file1.txt"]
[Tailing "file2.txt"]
[Tailing "file3.txt"]
file3.txt: 1. Mon Aug 23 23:02:22 2021
file1.txt: 1. Mon Aug 23 23:02:22 2021
file2.txt: 1. Mon Aug 23 23:02:22 2021
file3.txt: 2. Mon Aug 23 23:02:27 2021
file1.txt: 2. Mon Aug 23 23:02:27 2021
file2.txt: 2. Mon Aug 23 23:02:27 2021
file3.txt: 3. Mon Aug 23 23:02:32 2021
file1.txt: 3. Mon Aug 23 23:02:32 2021
file2.txt: 3. Mon Aug 23 23:02:32 2021
file3.txt: 4. Mon Aug 23 23:02:37 2021
file1.txt: 4. Mon Aug 23 23:02:37 2021
file2.txt: 4. Mon Aug 23 23:02:37 2021
file1.txt: 5. Mon Aug 23 23:02:42 2021
file3.txt: 5. Mon Aug 23 23:02:42 2021
file2.txt: 5. Mon Aug 23 23:02:42 2021
[stopped tailing "file1.txt"]
[stopped tailing "file2.txt"]
[stopped tailing "file3.txt"]

In this solution, each Tail Watcher, as well as the Tail Server itself, lives on a thread of its own. Refactor the solution to a Tail Server that still runs on a thread of its own, but iterates on its Tail Watchers synchronously, using the pull-iterator pattern. All in all, there are going to be two threads in the solution: the Tail Server and the file-touch function.
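One possible shape for the refactored core (a sketch, not the official solution; all names and the None-for-nothing-new convention are assumptions):

```python
import os
import threading
import time

class TailWatcher:
    """A pull iterator over lines appended to one file - not a thread."""

    def __init__(self, path):
        self.path = path
        self.running = True

    def stop(self):
        self.running = False

    def __iter__(self):
        print(f'[Tailing "{self.path}"]')
        with open(self.path) as file:
            file.seek(0, os.SEEK_END)  # start at the current end of the file
            while self.running:
                line = file.readline()
                # yield None when nothing is new, so the server can move on
                yield f"{self.path}: {line.rstrip()}" if line else None
        print(f'[stopped tailing "{self.path}"]')

class TailServer(threading.Thread):
    """The only watching thread: pulls from its watchers synchronously."""

    def __init__(self, paths, delay=0.5):
        super().__init__()
        self.watchers = [TailWatcher(path) for path in paths]
        self.delay = delay

    def stop(self):
        for watcher in self.watchers:
            watcher.stop()

    def run(self):
        iterators = [iter(watcher) for watcher in self.watchers]
        while any(watcher.running for watcher in self.watchers):
            for iterator in iterators:
                try:
                    line = next(iterator)
                    if line is not None:
                        print(line)
                except StopIteration:
                    pass
            time.sleep(self.delay)
        for iterator in iterators:
            for _ in iterator:  # let each watcher print its sign-off line
                pass
```

The file-touch function (appending timestamped lines to the watched files) remains a thread of its own, as in lesson three.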

References

  1. “A Curious Course on Coroutines and Concurrency”, David Beazley, 2009 (lecture foils)

What next?

One more lesson to go! In the next — and last — lesson, we are going to consider the dispatch-based cooperative processing alternative, called async execution.

Contents:

  1. Introduction
  2. The Thread
  3. Synchronization Primitives (Multi-threading)
  4. Synchronization Primitives (Multi-processing)
  5. Cooperative Processing — synchronous — (you are here!)
  6. Cooperative Processing — asynchronous

--


Born 1951. Active since 1983 as programmer, instructor, mentor in object-oriented design/programming in C++, Python etc. Author of DL/0 design language