Exception Propagation From Another Process

alex_ber
Published in Geek Culture
4 min read, Apr 11, 2021

It is very difficult, if not impossible, to reliably pickle exceptions back to the parent process.

Simple ones work, but many others don’t.

For example, CalledProcessError is not picklable (my guess is that this is because of its stdout and stderr data members).

This means that if a child process raises a CalledProcessError that is not caught, it will propagate to the parent process, but the propagation will fail, apparently because of a bug in Python itself. This causes pool.join() to hang forever, and thus leak memory! See https://stackoverflow.com/questions/15314189/python-multiprocessing-pool-hangs-at-join and https://bugs.python.org/issue9400
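To illustrate one common way a pickle round trip can fail, here is a minimal sketch. The JobFailedError class and the survives_round_trip() helper are hypothetical names made up for this example; this is not CalledProcessError, and the sketch does not claim to reproduce the exact reason CalledProcessError fails. It only shows that an exception can be serialized without complaint and still fail when the other side tries to rebuild it.

```python
import pickle


class JobFailedError(Exception):
    """Hypothetical exception whose __init__ requires two arguments."""

    def __init__(self, returncode, cmd):
        # Only the formatted message ends up in self.args, so unpickling
        # later calls JobFailedError(<message>) with one argument too few.
        super().__init__(f"{cmd!r} exited with code {returncode}")
        self.returncode = returncode
        self.cmd = cmd


def survives_round_trip(exc):
    """Return True if exc can be pickled and rebuilt again."""
    try:
        pickle.loads(pickle.dumps(exc))
    except Exception:
        return False
    return True


print(survives_round_trip(ValueError("simple")))     # True
print(survives_round_trip(JobFailedError(1, "ls")))  # False
```

A plain built-in exception with a string message survives the trip, which is why replacing an arbitrary exception with such a simple one is a reasonable mitigation.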

So, I have created the GuardedWorkerException context manager to mitigate this problem.

You can find the source code here. It is available as part of my AlexBerUtils project.

You can install AlexBerUtils from PyPi:

python3 -m pip install -U alex-ber-utils

See here for a more detailed explanation of how to install it.

Let’s look at a code stub for an application that does some multi-process computation.

Suppose that you wrote some script or application that is packaged and installed as YourApp (that is, it is installed into site-packages or a venv). Something like this:

Code of app.py:

Note: Here I’m using the init_app_conf module. You can read about it here.

Note: Typically, at the end of the script/application’s main() function there are the following lines:

See here (“Making relative path to file to work”) for my alternative setup. You will find the complete code in the appendix below.

The really interesting part happens inside compute_parallel() and prepare_parallel(). Note that prepare_parallel() runs in a single child process.

Basically, compute_parallel() and prepare_parallel() run in separate processes. So, if an exception happens there, it will be propagated to the main process and can potentially hang our application.

If we wrap the code of these functions inside the GuardedWorkerException context manager, it will prevent this hang.

How does GuardedWorkerException work?

Let’s look at the prepare_parallel() function first.

Besides the import, the whole body of the function is wrapped in

with GuardedWorkerException(logger=logger):
    #...

The logger parameter is optional. If it is not passed, then sys.stderr is used instead.

In my code above, logging is configured in the global_init() function. This function also initializes the global logger variable that is passed to GuardedWorkerException and is used for logging.

Basically, if any Exception (note: not BaseException) is raised, it will be logged and a new Exception(‘Worker failed’) will be raised in its place.

This is done to halt regular execution while propagating an exception that is “well-behaved” when it crosses the process boundary. I’m sure that the exception above will not hang the program or lead to a memory leak.

If the code in the with block returns normally, nothing interesting happens.

What flexibility do we have with GuardedWorkerException?

Let’s look at the compute_parallel() function.

Besides the import, the whole body of the function is wrapped in

with GuardedWorkerException(logger=logger, suppress=True):
    #...

Basically, if any Exception (note: not BaseException) is raised, it will be logged and the Exception itself will be suppressed; nothing will be raised, as if the block had exited normally.

The default value of the suppress parameter is False.

The default value of the default_exc_message parameter is ‘Worker failed’. If you supply another value here, then that value will be the message of the raised Exception.

Suppressing exceptions makes sense if you have some unrelated computations that run in parallel processes. If one computation raises an exception, you don’t want to abort the others; you want to finish everything (it is a good idea to have some indication that you have only a partial result, but you can do that after all the compute_parallel() computations have finished).
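The behaviour described so far can be sketched as a small context manager. To be clear, this is not the AlexBerUtils source code: GuardedSketch is a hypothetical name, and the body is only my reading of the description above (log the original exception, then either suppress it or replace it with a plain Exception). The three parameters mirror the ones named in this article.

```python
import sys
import traceback


class GuardedSketch:
    """Illustrative stand-in, NOT the real GuardedWorkerException."""

    def __init__(self, logger=None, suppress=False,
                 default_exc_message="Worker failed"):
        self.logger = logger
        self.suppress = suppress
        self.default_exc_message = default_exc_message

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None:
            return False  # the block finished normally, nothing to do
        if not issubclass(exc_type, Exception):
            return False  # BaseException (e.g. KeyboardInterrupt) passes through
        # Log the original exception where it happened, in the worker.
        if self.logger is not None:
            self.logger.error(self.default_exc_message,
                              exc_info=(exc_type, exc_value, tb))
        else:
            traceback.print_exception(exc_type, exc_value, tb, file=sys.stderr)
        if self.suppress:
            return True  # swallow: the with block appears to exit normally
        # Replace with a plain Exception that holds only a string message.
        raise Exception(self.default_exc_message) from None


try:
    with GuardedSketch():
        raise ValueError("details stay in the worker's log")
except Exception as exc:
    print(type(exc).__name__, exc)  # Exception Worker failed
```

The design choice worth noticing is that the replacement exception carries nothing but a string, so it is the kind of simple exception that crosses the process boundary without trouble; the details of the original failure are kept in the log instead.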

Note: You can also implement a retry mechanism. The simplest implementation is as follows:

Note: I have omitted the parameters that are passed (compute_parallel() will typically have some parameters that will be passed to the __init__() method of YourClass).

So, in lines 6–8 we instantiate and run YourClass. If this code block raises an Exception, we wait for a while and retry. Any failure in the retry will be suppressed.

Note: If we have some severe problem (a BaseException is thrown), it will not be caught; this is by design.
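The retry idea can be sketched in a self-contained way as follows. This is not the original listing that the line numbers above refer to; every name here is an assumption made for illustration: guarded_suppress() stands in for GuardedWorkerException(logger=logger, suppress=True), FlakyJob stands in for YourClass, and the delay value is arbitrary.

```python
import contextlib
import logging
import time

logger = logging.getLogger(__name__)


@contextlib.contextmanager
def guarded_suppress(log):
    """Hypothetical stand-in for GuardedWorkerException(logger=log, suppress=True)."""
    try:
        yield
    except Exception:
        # Log and swallow; BaseException is deliberately not caught.
        log.exception("Worker failed")


class FlakyJob:
    """Hypothetical stand-in for YourClass that fails a set number of times."""

    remaining_failures = 0

    def run(self):
        if FlakyJob.remaining_failures > 0:
            FlakyJob.remaining_failures -= 1
            raise RuntimeError("transient failure")
        return "done"


def compute_with_retry(job_factory, delay=0.01):
    """Run the job once; on Exception wait and retry exactly once."""
    result = None
    with guarded_suppress(logger):
        try:
            result = job_factory().run()
        except Exception:
            logger.warning("first attempt failed, retrying in %s s", delay)
            time.sleep(delay)
            # A failure here escapes to guarded_suppress and is swallowed.
            result = job_factory().run()
    return result
```

With this sketch, a job that fails once still returns its result on the second attempt, while a job that fails twice yields None, which the caller can treat as the indication of a partial result.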

Appendix

Some “external main”:
