Bash Job Control

Cindy Sridharan
May 29

Job Control is one of the more advanced features of Bash, and one that, until recently, I hadn’t taken the time to learn properly. My general philosophy with scripting has been Python when I can, Bash when I must, to the point where for years I never wrote any Bash.

Taking the time to learn Bash better has always given rise to mixed feelings in me — if something is complicated enough that it can’t be done in rudimentary Bash, then it probably shouldn’t be done in Bash to begin with. What’s the point of investing time in learning the advanced features of something that’s best avoided altogether?

However, Bash is unsettlingly pragmatic. More often than not, I’ve found myself in situations where I’ve realized that it’d just be easier and faster to do something in Bash than in Python. So I decided to become conversant with the parts of Bash I’m not terribly familiar with — job control being one of them.

Foreground and Background Processes

I expect most folks to be aware of foreground and background processes, but it doesn’t hurt to revisit the topic.

In Bash, a pipeline is a sequence of one or more commands separated by one of the control operators | or |&. Each command in a pipeline is executed in its own subshell, which is a separate process from the shell process. These processes are, by default, started in the foreground, meaning that once they begin execution, the user can’t interact with the shell until the process completes or changes state.

It’s also possible to run a process in the background. Background processes don’t restrict access to the shell but execute in the background. They return control to the shell immediately upon start. Any command can be started in the background by appending an & to it.

For example, if a script defines a function foo and invokes it as foo &, the function is started in the background. The script exits immediately, while the function continues to execute in the background.
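A minimal sketch of what such a script might look like. The body of foo and the variable name foo_pid are assumptions made for illustration, not the original listing.

```bash
#!/usr/bin/env bash
# Hypothetical example: foo stands in for any long-running piece of work.
foo() {
    sleep 1
    echo "foo finished"
}

foo &           # start foo in the background; control returns immediately
foo_pid=$!      # $! holds the PID of the most recently started background command

# The script reaches this line without waiting for foo to finish.
echo "started foo in the background (pid ${foo_pid})"
```

Run as a script, the "started foo" line appears right away, and "foo finished" shows up about a second later, after the script itself has already exited.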

Jobs vs Processes?

Builtins like kill, disown and wait operate on both processes and jobs. However, a job isn’t quite the same as a process.

A job is something that’s tracked by the shell. The shell maintains a table of currently executing background processes and processes that have been suspended.

If I suspend a running emacs process with C-z, and then type jobs -l on the terminal (the -l option to the jobs builtin prints the pid of the job), I will see:

~/copyconstruct@bailey: jobs -l
[4]+ 38992 Suspended: 18           emacs .

However, if I open a new shell and type jobs -l, emacs isn’t listed as a suspended job, because the new shell isn’t tracking the suspended emacs process. The new shell can still see process 38992, though, since a process is tracked by the operating system and not the shell from which it was launched.
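The difference can be sketched in a script. This is only an illustration: sleep stands in for emacs, and the variable names are made up.

```bash
#!/usr/bin/env bash
sleep 30 &                        # becomes a job of *this* shell
pid=$!

here=$(jobs -l)                   # this shell's job table lists the sleep
there=$(bash -c 'jobs -l')        # a brand-new shell starts with an empty job table

echo "this shell: ${here}"
echo "new shell:  ${there:-<no jobs>}"

# The process is still visible to the new shell, because the kernel tracks it.
# kill -0 sends no signal; it only checks that the process exists.
if bash -c "kill -0 ${pid}"; then
    visible="yes"
    echo "the new shell can still see pid ${pid}"
fi

kill "${pid}"                     # clean up the background sleep
```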

Job Spec

A jobspec can be thought of as a job identifier or job number.

As mentioned previously, a job is purely a shell-level construct. The shell tracks all suspended and background processes. The jobspec is simply an identifier used by the shell to track the suspended or backgrounded process.

In the above example:

~/copyconstruct@bailey: jobs -l
[4]+ 38992 Suspended: 18           emacs .

[4] shows the job number. The jobspec is used by the job control builtins to operate on jobs. To refer to a job in the shell, its number needs to be prefixed with a %, as in %4.

Enabling Job Control

Job control is on by default in interactive shells. In scripts, where it’s off by default, it can be enabled using the set builtin.

set -m or set -o monitor
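One way to check whether job control is on is to look for the m flag in the special parameter $-. A small sketch, with illustrative variable names:

```bash
#!/usr/bin/env bash
# $- holds the shell's current option flags; it contains "m" when monitor mode is on.
case $- in
    *m*) before="on" ;;
    *)   before="off" ;;
esac

set -m                  # equivalent to: set -o monitor

case $- in
    *m*) after="on" ;;
    *)   after="off" ;;
esac

echo "job control before: ${before}, after: ${after}"
set +m                  # turn monitor mode back off
```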

Job Control Builtins

fg, bg and jobs are three job control commands that work purely on jobs.

However, the jobspec can also be used with process control commands like kill, wait and disown.

bg

Used to resume a suspended job in the background.

Usage: bg [jobspec]

If no jobspec is provided, the shell’s notion of the current job is used. Trying to use this with an invalid jobspec results in an error.

~/copyconstruct@bailey: bg %5
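In a script there is no C-z, so the sketch below uses kill -STOP as a stand-in for suspending a job, and then resumes it with bg. The polling loop and the variable names are assumptions for illustration.

```bash
#!/usr/bin/env bash
set -m                        # bg needs job control to be enabled

sleep 30 &                    # job 1, running in the background
pid=$!
kill -STOP %1                 # stand-in for pressing C-z on a foreground job

# Give the shell a moment to notice that the job has stopped.
for _ in 1 2 3 4 5 6 7 8 9 10; do
    [ -n "$(jobs -s)" ] && break
    sleep 0.1
done
stopped=$(jobs -s)            # the job is listed as stopped

bg %1                         # resume job 1 in the background
running=$(jobs -r)            # the job is listed as running again

echo "after STOP: ${stopped}"
echo "after bg:   ${running}"
kill "$pid"                   # clean up
```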

fg

Used to resume a job in the foreground, making it the current job.

Usage: fg [jobspec]

jobs

The jobs command lists the jobs tracked by the current shell.

Usage:

jobs [-lnprs] [jobspec]

jobs -x command [arguments]

Per the manual:

The first form lists the active jobs. The options have the following meanings:

-l : List process IDs in addition to the normal information.

-n : Display information only about jobs that have changed status since the user was last notified of their status.

-p : List only the process ID of the job’s process group leader.

-r : Display only running jobs.

-s : Display only stopped jobs.

If jobspec is given, output is restricted to information about that job. If jobspec is not supplied, the status of all jobs is listed.

If the -x option is supplied, jobs replaces any jobspec found in command or arguments with the corresponding process group ID, and executes command, passing it arguments, returning its exit status.

For instance, if a script starts a function bar in the background, has bar print a running counter, and then calls jobs, you’d see the background job listed amidst the output of bar.

...
14
Background jobs: [1]+  Running                 bar &
15
16
...
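A rough sketch of a script that produces output of this shape. The body of bar, the loop bounds and the timings are assumptions, not the original listing.

```bash
#!/usr/bin/env bash
# Hypothetical reconstruction: bar prints a counter with short pauses.
bar() {
    for i in 1 2 3 4 5; do
        echo "$i"
        sleep 0.2
    done
}

bar &                                 # start bar as a background job
sleep 0.5                             # let bar print a few numbers first
job_list=$(jobs)                      # snapshot of this shell's job table
echo "Background jobs: ${job_list}"
wait                                  # let bar finish before the script ends
```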

disown

disown works on both processes and jobs. When job control is enabled, the disown command can be used to remove jobs from the job table of the shell.

Usage: disown [-ar] [-h] [jobspec … | pid … ]

~/copyconstruct@bailey: disown %4

The -h option is used when we don’t want the job removed from the shell’s table, but do want to stop the shell from sending SIGHUP to the job when the shell itself receives one.

The -a option without a jobspec will remove all the jobs from the table, whereas the -r option will only remove currently running jobs.
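A small sketch of the effect of disown on the job table. It passes a pid, which the usage line above allows, and the variable names are illustrative.

```bash
#!/usr/bin/env bash
sleep 30 &
pid=$!

before=$(jobs -p)       # the job table contains the sleep's PID
disown "$pid"           # drop the job from the table; the process keeps running
after=$(jobs -p)        # the job table is now empty

echo "before disown: ${before:-<none>}"
echo "after disown:  ${after:-<none>}"

# kill -0 sends no signal; it only checks that the process still exists.
kill -0 "$pid" && echo "pid ${pid} is still alive"
kill "$pid"             # clean up the disowned sleep
```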

suspend

Used to suspend the shell. The shell’s parent process can resume it with a SIGCONT signal.

Usage: suspend [-f]

wait

wait, like disown, works on both processes and jobs. Job control mode needs to be enabled for wait to work with jobs.

Usage: wait [-fn] [jobspec or pid]

wait tells the shell to wait until the subprocess specified by the pid or the jobspec exits. The return code is that of the last command the shell waited for. When a jobspec is provided, the shell will wait until all the processes in the job exit.

A script that starts the function bar in the background and then calls wait, unlike the earlier script that exited immediately, waits for bar to complete before it exits.
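A minimal sketch of such a script. The body of bar and its exit status are assumptions, chosen so the propagated return code is easy to spot.

```bash
#!/usr/bin/env bash
# Hypothetical bar: does some work, then exits with a recognizable status.
bar() {
    sleep 0.3
    return 3
}

bar &
bar_pid=$!

wait "$bar_pid"         # block until bar exits
status=$?               # wait returns the exit status of the process it waited for
echo "bar exited with status ${status}"
```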

Invoking wait without any arguments causes the shell to wait for all currently active child processes. wait accepts the following options:

-n: wait waits for a single job to terminate and returns its exit status.

-f: In the job control mode, wait will return when the job changes state. The -f option causes wait to wait for each pid or jobspec to terminate before returning.

If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.

~/copyconstruct@bailey: wait %3
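The -n option and the 127 return status can be sketched as follows. This assumes Bash 4.3 or later, where wait -n is available, and the job bodies are made up for illustration.

```bash
#!/usr/bin/env bash
( sleep 0.2; exit 7 ) &     # fast job with a recognizable exit status
sleep 5 &                   # slow job
slow_pid=$!

wait -n                     # returns as soon as any one job finishes
first_status=$?
echo "first job to finish exited with ${first_status}"

kill "$slow_pid"            # the slow job is no longer needed
wait "$slow_pid" 2>/dev/null

# The shell is not a child of itself, so this is not an active child process.
wait "$$" 2>/dev/null
not_child_status=$?
echo "waiting on a non-child returned ${not_child_status}"
```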

kill

kill, like disown, works on both processes and jobs. The job control mode needs to be enabled for kill to work with jobs.

Usage:

kill [-s sigspec] [-n signum] [-sigspec] jobspec or pid

The kill builtin sends a signal to the process specified by the jobspec or pid. kill works with the following options:

-s: sigspec is either a case-insensitive signal name such as SIGINT (with or without the SIG prefix) or a signal number.

-n: signum is a signal number (kill -n 2 %1 will send an INT to the job with the jobspec %1)

If sigspec and signum are not present, SIGTERM is used.

For example, a script might start two background jobs and then kill one with an INT and the other with a TERM.
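A sketch along those lines, not the original listing. It turns job control on first, because a script’s background jobs otherwise ignore SIGINT; sleep stands in for real work.

```bash
#!/usr/bin/env bash
set -m                      # with job control on, background jobs don't ignore SIGINT

sleep 30 &                  # job 1
pid1=$!
sleep 30 &                  # job 2
pid2=$!

kill -s INT %1              # send SIGINT to job 1
kill %2                     # no signal given, so SIGTERM is sent to job 2

# A process killed by signal N is reported with exit status 128 + N.
wait "$pid1"; int_status=$?     # 128 + 2  (SIGINT)
wait "$pid2"; term_status=$?    # 128 + 15 (SIGTERM)
echo "job 1 status: ${int_status}, job 2 status: ${term_status}"
```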

Setting set -b causes the status of terminated background jobs to be reported immediately, rather than before printing the next primary prompt.

checkjobs

The shell prints a warning message when one tries to exit a shell that has suspended jobs, until a second exit is attempted at which point the shell actually exits without further ado. If the checkjobs option is enabled, the shell lists each job and its status the first time one tries to exit the shell.

checkjobs can be enabled with the shopt builtin.
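A short sketch of turning the option on and confirming it. This assumes Bash 4.0 or later, and the option only has a visible effect in interactive shells.

```bash
#!/usr/bin/env bash
shopt -s checkjobs              # list jobs and their status on the first exit attempt
state=$(shopt -p checkjobs)     # shopt -p prints the option in a reusable form
echo "$state"
```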

Conventions

So far we’ve only referred to jobs with %n, where n is the job number. There exist other ways to refer to jobs:

%% : “current” job (last foreground job stopped or last background job started)

%+ : “current” job (last foreground job stopped or last background job started)

% : current job

%- : previous job

A job can also be referred to using a prefix of the name used to start it, or using a substring that appears in its command line. If the prefix or substring matches more than one job, Bash reports an error.

%foo : Refers to the job whose command begins with the string foo

%?foo : Refers to the job whose command line contains the string foo

For example, C-z can be used to suspend emacs. To bring back the suspended emacs process, typing %emacs (shorthand for fg %emacs) will do the trick.
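These forms can be tried out in a script by asking jobs -p which process each jobspec resolves to. A sketch, with two sleep commands as stand-in jobs and illustrative variable names:

```bash
#!/usr/bin/env bash
set -m                            # enable job control for the script

sleep 30 &                        # job 1
pid1=$!
sleep 31 &                        # job 2, the most recently started background job
pid2=$!

current=$(jobs -p %%)             # %% (or %+): the current job
previous=$(jobs -p %-)            # %-: the previous job
by_substring=$(jobs -p '%?31')    # %?31: the job whose command line contains "31"

echo "current:      ${current} (job 2 is pid ${pid2})"
echo "previous:     ${previous} (job 1 is pid ${pid1})"
echo "by substring: ${by_substring}"

kill "$pid1" "$pid2"              # clean up both jobs
```

The prefix form %sleep would be ambiguous here, since both jobs start with sleep, so Bash would report an error for it.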

Conclusion

In long running scripts, it’s useful to be able to start jobs in the background and be able to control when and how they terminate.

Whether a script of even this modest level of complexity should be written in Bash as opposed to a real programming language like Go or Python is a matter of opinion. However, it’s also odds on that such a script might be a few lines of Bash as opposed to tens of lines in Go or Python. Furthermore, if this happens to be a script that needs to run in an environment that’s not one’s laptop, shipping a Go binary or setting up a Python environment along with all the dependencies might be non-trivial. Bash is worth learning, not least since it’s more ubiquitous than any other language.

@copyconstruct on Twitter. Views expressed on this blog are solely mine, not those of present or past employers.