Running programs in parallel

George Shuklin
Published in OpsOps
Jun 29, 2023 · 3 min read

You may think I’m rehashing a Unix textbook from the 1990s. Oh, you can run programs in parallel. How nice! What an amazing UNIX you have here!

Nope. This is a real problem I was struggling to solve.

The problem:

My script (just a list of executables with parameters) has some room for parallelism, and I want to add it. There are ‘join points’ (where I need to wait for all programs before continuing). The problem which made me desperate at a certain point: I want my script to fail if any of the parallel programs fails.

For illustration’s sake, we will deal with a simple script:

stage1.1
stage1.2
stage1.3
(join)
stage2
stage3.1
stage3.2
stage3.3
(join)
stage4

90s bash: Use &

That’s it. Let’s do it. To make our lives spicier, stage3.2 is /bin/false, i.e. a failing task.

stage1.1 &
stage1.2 &
stage1.3 &
wait
stage2
stage3.1 &
/bin/false &
stage3.3 &
wait
stage4

Like a textbook, right? Just shove an & in every time you want to run something in parallel.

Unfortunately, if stage2 and stage4 are okay (return code 0), the script above will return 0, even though there is /bin/false running in parallel.

Turns out, those sweet 90s with their ‘garbage in, garbage out’ mindset decided that wait should simply succeed. Did the program end? Yep. Return code 0. The rest is not our problem.
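
Don’t believe me? A minimal demo (try it in any bash):

/bin/false &
wait
echo "wait returned: $?"  # prints 0; the failure of /bin/false is swallowed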

When I found this the first time, I was in the denial phase. But surely they must have thought about it, and there must be a magical option to wait. Nope, none. You can wait for individual pids, which turns the program into a true bash script:

#!/bin/bash
# No `set -e` here: it would abort the script on the first failing
# `wait`, before we finish collecting the statuses.

# Start your background jobs and remember their PIDs
job1 & pid1=$!
job2 & pid2=$!
job3 & pid3=$!

# An array to hold the PIDs
pids=("$pid1" "$pid2" "$pid3")

# An array to hold the exit statuses
statuses=()

# Wait for all jobs and collect their exit statuses
for pid in "${pids[@]}"; do
    wait "$pid"
    statuses+=($?)
done

# Check the exit statuses
for status in "${statuses[@]}"; do
    if [ "$status" -ne 0 ]; then
        echo "A job failed with exit status $status"
        exit 1
    fi
done

echo "All jobs succeeded"

(It’s not even my script, it’s ChatGPT’s; I avoid writing bash if I can.)

It does not look nice at all. How many bugs are there? I thought about using traps, but nope, traps didn’t work (maybe there is a way, but it will surely be complicated, so I abandoned it).

Moment of desperation

I just realized there is no easy way to do it. I can write in Bash, I can write in Rust, I can write in any language I want, and I’ll have to deal with all those nitty-gritty details of handling parallel processes. I’m not writing a program here, I’m running stuff in parallel!

Parallel

There is a tool I have skipped up to this point: xargs. It can run things in parallel, but it runs the same program with different arguments. It has a beefier cousin, parallel, which does the same but can also throw jobs onto multiple machines.
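
For the record, the xargs way looks like this (a sketch; some_cmd and the inputs are made up):

# run some_cmd once per input line, at most 3 jobs at a time
printf '%s\n' input1 input2 input3 | xargs -P 3 -I{} some_cmd {}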

But I need to run different programs, not the same program with different args.

That was until ChatGPT (again!) told me that parallel can do it. Yep. 20+ years in the industry and I didn’t know that.

Behold:

parallel ::: "cmd1 args args args" "cmd2 args args args"

Yes, like that. Quotes are important. What is that alien syntax with :::? I have no idea.

       ::: arguments
Use arguments on the command line as input source.

Obscure… But it does the job.

Our script becomes:

parallel ::: \
    "stage1.1" \
    "stage1.2" \
    "stage1.3"
stage2
parallel ::: \
    "stage3.1" \
    "stage3.2" \
    "stage3.3"
stage4

And yes, the quotes are essential, because IRL those ‘stages’ take arguments. And yes, there will be some tinkering with nested quotes, and maybe I’ll switch to reading commands from stdin (separated by what? huh). But it does the job.
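
(Answering my own “separated by what?”: by newlines. If parallel gets no command template and no :::, it reads commands from stdin, one per line. A sketch, with a made-up --some-arg:)

printf '%s\n' "stage1.1 --some-arg" "stage1.2" "stage1.3" | parallel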

The main thing it does: it returns a non-zero exit code when any of the programs it ran returned a non-zero code.
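
To be precise, GNU parallel’s exit status is the number of failed jobs (capped at 253), so a quick sanity check looks like this:

parallel ::: "true" "false" "false"
echo $?  # prints 2: two of the three jobs failed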

Conclusion

A trivial problem which took me half a day to solve, but I got a nice hammer for my tool belt. Turns out, I hadn’t paid enough attention to basic system utilities, and the simple parallel (which is not simple at all: 2914 lines in the man page) is actually way more useful than I thought.

George Shuklin

I work at Servers.com, most of my stories are about Ansible, Ceph, Python, Openstack and Linux. My hobby is Rust.