Exception Propagation From Another Process
It is very difficult, if not impossible, to pickle arbitrary exceptions back to the parent process.
Simple ones work, but many others don’t.
For example, CalledProcessError is not picklable (my guess is that this is because of its stdout and stderr data members).
This means that if a child process raises a CalledProcessError that is not caught, it will propagate to the parent process, but the propagation will fail, apparently because of a bug in Python itself. This causes pool.join() to hang forever, and thus a memory leak! See https://stackoverflow.com/questions/15314189/python-multiprocessing-pool-hangs-at-join and https://bugs.python.org/issue9400
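To make the pickling problem concrete, here is a minimal sketch. AwkwardError and survives_pickle_round_trip are hypothetical names made up for this illustration; they are not part of any library. The exception stores only a formatted message in its args, so it can be pickled, but rebuilding it on the receiving side fails because its __init__() expects two arguments.

```python
import pickle


class AwkwardError(Exception):
    """Hypothetical exception: __init__ needs two arguments, but only one
    formatted message is handed to Exception.__init__ and stored in args."""

    def __init__(self, code, detail):
        super().__init__(f"failed with code {code}")
        self.code = code
        self.detail = detail


def survives_pickle_round_trip(exc):
    """Return True if exc can be pickled and then rebuilt from the pickle."""
    try:
        pickle.loads(pickle.dumps(exc))
        return True
    except Exception:
        return False


# A built-in exception travels fine; the hypothetical one does not.
simple_ok = survives_pickle_round_trip(ValueError("bad input"))
awkward_ok = survives_pickle_round_trip(AwkwardError(2, "disk full"))
print(simple_ok, awkward_ok)
```

Unpickling calls the class again with the stored args, which is only the single message here, so the second check reports a failure. This is the kind of exception that cannot safely cross a process boundary.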
So, I have created the GuardedWorkerException context manager to mitigate this problem.
You can find the source code here. It is available as part of my AlexBerUtils project.
You can install AlexBerUtils from PyPI:
python3 -m pip install -U alex-ber-utils
See here for a more detailed explanation of how to install it.
Let’s look at a code stub for an application that has some multi-process computation.
Suppose that you wrote a script or application that is packaged and installed as YourApp (that is, it is installed into site-packages or a venv). Something like this:
Code of app.py:
Note: Here I’m using the init_app_conf module. You can read about it here.
Note: Typically, at the end of the script/application’s main() function there are the following lines:
See here, Making more to relative path to file to work, for my alternative setup. You will find the complete code in the appendix below.
The really interesting part happens inside compute_parallel() and prepare_parallel(). Note that prepare_parallel() runs in a single child subprocess.
Basically, compute_parallel() and prepare_parallel() run in separate processes. So, if an exception happens there, it will be propagated to the main process and can potentially hang our application.
If we wrap the code of these functions inside the GuardedWorkerException context manager, it will prevent this hang.
How does GuardedWorkerException work?
Let’s look at the prepare_parallel() function first.
Besides the import, the whole body of the function is enclosed in
with GuardedWorkerException(logger=logger):
    # ...
The logger parameter is optional. If it is not passed, then sys.stderr is used instead.
In my code above, logging is configured in the global_init() function. This function also initializes the global logger variable that is passed to GuardedWorkerException and is used for the logging.
Basically, if any Exception (note: not BaseException) is raised, it will be logged and a new Exception('Worker failed') will be raised in its place.
This is done to halt regular execution while using an exception that is “well-behaved” when it is propagated between processes. I’m sure that the exception above will not hang the program or lead to a memory leak.
If the code in the with block returns normally, nothing interesting happens.
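The behavior described above can be sketched roughly as follows. This is not the AlexBerUtils source: guarded_worker_sketch and failing_worker are hypothetical names, and the real implementation may log differently. The sketch only shows the idea of logging the original error and replacing it with a plain Exception that pickles without trouble.

```python
import logging
import pickle
import sys
import traceback
from contextlib import contextmanager


@contextmanager
def guarded_worker_sketch(logger=None):
    """Illustrative stand-in, not the AlexBerUtils implementation."""
    try:
        yield
    except Exception:  # BaseException (e.g. KeyboardInterrupt) passes through
        if logger is not None:
            logger.error("Worker failed", exc_info=True)
        else:
            traceback.print_exc(file=sys.stderr)
        # Replace the original error with a plain, easily picklable one.
        raise Exception("Worker failed") from None


def failing_worker():
    """Hypothetical worker body that hits an error inside the guard."""
    with guarded_worker_sketch(logger=logging.getLogger("sketch")):
        raise KeyError("original problem")


try:
    failing_worker()
except Exception as exc:
    replaced = exc

# The replacement survives a pickle round trip, unlike many custom exceptions.
round_tripped = pickle.loads(pickle.dumps(replaced))
```

The original KeyError only appears in the log; what leaves the worker is the plain replacement.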
What flexibility do we have with GuardedWorkerException?
Let’s look at the compute_parallel() function.
Besides the import, the whole body of the function is enclosed in
with GuardedWorkerException(logger=logger, suppress=True):
    # ...
Basically, if any Exception (note: not BaseException) is raised, it will be logged and the Exception itself will be suppressed: nothing will be raised, as if the block had exited normally.
The default value of the suppress parameter is False.
The default value of the default_exc_message parameter is 'Worker failed'. If you supply another value here, then the message of the raised Exception will be default_exc_message.
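One way these two parameters could behave is sketched below as a class-based context manager. GuardSketch is a hypothetical name and this is not the library's code; only the parameter names logger, suppress and default_exc_message are taken from the text above. The key detail is that returning True from __exit__() tells Python to swallow the exception.

```python
import sys
import traceback


class GuardSketch:
    """Illustrative stand-in, not the AlexBerUtils implementation."""

    def __init__(self, logger=None, suppress=False,
                 default_exc_message="Worker failed"):
        self.logger = logger
        self.suppress = suppress
        self.default_exc_message = default_exc_message

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None or not issubclass(exc_type, Exception):
            return False  # normal exit, or a BaseException we leave alone
        if self.logger is not None:
            self.logger.error("Worker failed",
                              exc_info=(exc_type, exc_value, tb))
        else:
            traceback.print_exception(exc_type, exc_value, tb,
                                      file=sys.stderr)
        if self.suppress:
            return True  # swallow the error; the with block exits normally
        raise Exception(self.default_exc_message) from None


# suppress=True: the error is logged and execution simply continues.
with GuardSketch(suppress=True):
    raise ValueError("ignored")
reached_after_suppress = True

# A custom message replaces the default 'Worker failed'.
try:
    with GuardSketch(default_exc_message="Step 2 failed"):
        raise ValueError("hidden detail")
except Exception as exc:
    custom_message = str(exc)
```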
This makes sense if you have some unrelated computations that run in parallel processes. If one computation raises an exception, you don’t want to abort the others; you want to finish everything. (It is a good idea to have some indication that you only have a partial result, but you can do that after all the compute_parallel() computations have finished.)
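A minimal sketch of such a "partial result" indication follows. It uses contextlib.suppress from the standard library as a stand-in for suppress=True (unlike GuardedWorkerException, it does not log anything), and risky_square and compute_one are hypothetical names. The calls run sequentially here to keep the example short; in the real application each one would run in its own worker process.

```python
from contextlib import suppress


def risky_square(value):
    """Hypothetical computation that fails for negative input."""
    if value < 0:
        raise ValueError(f"cannot handle {value}")
    return value * value


def compute_one(value):
    """Hypothetical worker body: returns None when the computation failed."""
    result = None
    with suppress(Exception):  # stand-in for suppress=True; it does not log
        result = risky_square(value)
    return result


inputs = [1, -2, 3]
results = [compute_one(v) for v in inputs]

# After everything has finished, report which inputs produced no result.
failed_inputs = [v for v, r in zip(inputs, results) if r is None]
```

Using None as the failure marker is a design choice of this sketch; it only works if None is never a legitimate result.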
Note: you can also implement a retry mechanism. The simplest implementation is as follows:
Note: I’ve omitted the parameters that are passed (compute_parallel() will typically have some parameters that will be passed to the __init__() method of YourClass).
So, in lines 6–8 we instantiate and run YourClass.
If this code block raises an Exception, we wait for a while and retry. Any failure in the retry will be suppressed.
Note: If we have some severe problem (a BaseException is thrown), it will not be caught; this is done by design.
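The retry idea can be sketched in plain Python as shown here. This is not the original listing: run_with_one_retry, flaky and always_fails are hypothetical names, the delay is an arbitrary assumption, and contextlib.suppress stands in for a guard with suppress=True (so nothing is logged in this sketch).

```python
import time
from contextlib import suppress


def run_with_one_retry(task, delay_seconds=0.01):
    """Run task(); on Exception wait, then retry once and suppress failure."""
    try:
        return task()
    except Exception:  # BaseException is deliberately not caught
        time.sleep(delay_seconds)
    with suppress(Exception):  # any failure in the retry is swallowed
        return task()
    return None  # both attempts failed


calls = {"count": 0}


def flaky():
    """Hypothetical task that fails on the first call only."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("first attempt fails")
    return "ok"


def always_fails():
    """Hypothetical task that never succeeds."""
    raise RuntimeError("never works")


first = run_with_one_retry(flaky)
second = run_with_one_retry(always_fails)
```

Here the flaky task succeeds on its second attempt, while the task that always fails yields None without raising anything.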
Appendix
Some “external main”:
See also:
- Integrating Python’s logging and warnings packages.
- fixabscwd() function in mains module or Making relative path to file to work.
- fix_retry_env() function in mains module or Make path to file on Windows works on Linux.
- FixRelCwd() function in mains module or Making more to relative path to file to work.
- GuardedWorkerException() function in mains module or Exception Propagation from another Process.
- join_files() function in files module or Join Files.
- stdLogging module, or My stdLogging Module.