An overview of how the BASH shell works

Beta Scribbles
Published in Geek Culture
8 min read, Aug 23, 2021
All Rights Reserved

In this article, we are going to look at the BASH (Bourne Again SHell) and how it interprets the command ls -l *.c

Before we rush ahead, let’s take the subject one step at a time. What is a shell? How is the bash shell unique? What is the difference between a shell and a terminal? How does a shell interpret commands? Let’s go through these questions one at a time.

What is a shell?

In computing, a shell is a computer program that exposes an operating system’s services to a human user or another program. Operating system shells use either a command-line interface (CLI) or a graphical user interface (GUI), depending on a computer’s role and particular operation. Either way, the shell presents the contents of a computer as a tree of directories and files.

Command-line shells require the user to be familiar with commands and their calling syntax and to understand concepts about the shell-specific scripting language (for example, bash).

Command-line shell. All Rights reserved: dwmkerr.com

Graphical shells place a low burden on beginning computer users and are known as being easy to use. Since they also come with certain disadvantages, most GUI-enabled operating systems also provide CLI shells.

Graphical shell. All Rights Reserved: wikipedia.org

How is the Bash shell unique?

On most Linux systems, a program called bash acts as the shell program. The name stands for Bourne Again SHell: bash is an enhanced version of the original Unix shell program, sh, written by Steve Bourne. Other shell programs are also available for Linux systems, including ksh, tcsh and zsh.

What is the Terminal?

The terminal, or more precisely the terminal emulator, is a program that opens a window and lets you interact with the shell. There are many different terminal emulators, and which ones you have depends on your Linux distribution. Some of them are gnome-terminal, konsole, xterm, rxvt, kvt, nxterm, and eterm.

What is the Kernel?

The kernel is a computer program at the core of a computer’s operating system and has complete control over everything in the system. It is the “portion of the operating system code that is always resident in memory” and facilitates interactions between hardware and software components. We use the shell to get access to the kernel and its abilities.

How does the shell interpret commands?

The shell is a type of program called an interpreter. An interpreter operates in a simple loop: It accepts a command, interprets the command, executes the command, and then waits for another command. The shell displays a “prompt,” to notify you that it is ready to accept your command.

Interpreter execution loop
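To make the loop concrete, here is a minimal sketch of a toy interpreter written in bash itself. The name mini_shell and the "mini> " prompt are made up for illustration, and the real bash loop does far more work than this; the sketch only shows the accept, interpret, execute, wait cycle.

```shell
# Toy read-interpret-execute loop (mini_shell is a hypothetical name).
mini_shell() {
    local line
    while true; do
        printf 'mini> ' >&2            # show a prompt (on stderr)
        IFS= read -r line || break     # accept a command; stop at end of input
        [ "$line" = "exit" ] && break  # leave the loop on "exit"
        eval "$line"                   # let bash interpret and execute it
    done
}

# Feed the loop three lines instead of typing them by hand.
output=$(printf 'echo hello\necho world\nexit\n' | mini_shell 2>/dev/null)
echo "$output"
```

Running mini_shell with no pipe lets you type commands at the mini> prompt until you enter exit.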

The shell recognizes a limited set of commands, and you must give commands to the shell in a way that it understands: Each shell command comprises a command name, followed by command options (if any are desired) and command arguments (if any are desired), all separated by blank spaces.
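As a rough illustration of that structure, the sketch below splits the line ls -l *.c on blank spaces and labels each word. It is a simplification: it assumes that any word starting with a dash is an option, which is a convention of most commands and not something the shell itself enforces.

```shell
line='ls -l *.c'

set -f            # temporarily turn off wildcard expansion so *.c stays literal
set -- $line      # unquoted on purpose: split the line on blank spaces
set +f

name=$1           # the first word is the command name
shift
echo "command name: $name"
for word in "$@"; do
    case $word in
        -*) echo "option:       $word" ;;
        *)  echo "argument:     $word" ;;
    esac
done
```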

When you type a command in the terminal, the shell needs to interpret it correctly in order to know exactly what to do. Maybe you have multiple options, or you redirect the output to a file. In any event, the shell goes through several steps to figure out what needs to be done, and the result is either the correct output or an error. Let’s see the interpretation process.

One question I had was, “In what order does everything get done?” We have shell variables to expand, maybe an alias or function to process, “real” commands, pipes and input/output redirection. There are a lot of things that the shell must consider when figuring out what to do and when.

For the most part, this is not very important. Commands do not get so complex that knowing the evaluation order becomes an issue. However, on a few occasions, I have run into situations in which things did not behave as I thought they should. By evaluating the command myself (as the shell would), it became clear what was happening. Let’s take a look.

The first thing that gets done is that the shell figures out how many commands there are on the line. (Remember, you can separate multiple commands on a single line with a semicolon.) This process determines how many tokens there are on the command line. In this context, a token could be an entire command or it could be a control word such as if. Here, too, the shell must deal with input/output redirection and pipes.

Once the shell determines how many tokens there are, it checks the syntax of each token. Should there be a syntax error, the shell will not try to start any of the commands. If the syntax is correct, it begins interpreting the tokens.
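A small experiment shows this behaviour. In the sketch below, a valid echo and a broken if sit on the same line, so bash rejects the whole line before running any of it. The command string is invented for the demonstration.

```shell
# "if then" is a syntax error, and it shares a line with a valid echo.
output=$(bash -c 'echo first; if then' 2>/dev/null)
status=$?

echo "captured output: '$output'"   # empty: echo never ran
echo "exit status: $status"         # non-zero because of the syntax error
```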

First, any alias you might have is expanded. Aliases are a way for some shells to allow you to define your own commands. If any token on the command line is actually an alias that you have defined, it is expanded before the shell proceeds. If it happens that an alias contains another alias, they are both expanded before continuing with the next step.
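Here is a minimal sketch of an alias that contains another alias. The alias names greet and greet_again are hypothetical. Because bash only expands aliases by default in interactive shells, the sketch starts a child bash with the expand_aliases option switched on and feeds it a three-line script.

```shell
# Three lines of script: two alias definitions and one use.
script="alias greet='echo hello'
alias greet_again='greet again'
greet_again world"

# -O expand_aliases enables alias expansion in a non-interactive bash.
output=$(printf '%s\n' "$script" | bash -O expand_aliases)
echo "$output"    # prints: hello again world
```

The shell first replaces greet_again with greet again, notices that greet is an alias too, and replaces that with echo hello before anything runs.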

The next thing the shell checks for is functions. Like the functions in programming languages such as C, a shell function can be thought of as a small subprogram.
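For comparison, this is what a small shell function looks like. The function count_args is a made-up example; it simply reports how many arguments it was called with.

```shell
# A tiny subprogram: report the number of arguments received.
count_args() {
    echo "received $# argument(s)"
}

result=$(count_args -l main.c parser.c)
kind=$(type -t count_args)     # ask bash what kind of command this is

echo "$result"                 # prints: received 3 argument(s)
echo "count_args is a $kind"   # prints: count_args is a function
```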

Once aliases and functions have all been completely expanded, the shell evaluates variables. Finally, it expands any wildcards into the file names that match them.
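The order matters, and a short experiment makes it visible. The sketch below stores a wildcard in a variable (the file names and the variable name pattern are invented for the example) and runs in a throwaway directory created with mktemp.

```shell
workdir=$(mktemp -d)
touch "$workdir/main.c" "$workdir/parser.c" "$workdir/README.md"

pattern='*.c'

# Unquoted: the variable is expanded first, then the wildcard it contained.
expanded=$(cd "$workdir" && echo $pattern)

# Quoted: the variable is expanded, but the result is not treated as a wildcard.
literal=$(cd "$workdir" && echo "$pattern")

echo "$expanded"   # prints: main.c parser.c
echo "$literal"    # prints: *.c

rm -rf "$workdir"
```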

After the shell has evaluated everything, it is still not ready to run the command. It first checks to see if the first token represents a command built into the shell or an external one. If it’s not internal, the shell searches the directories listed in $PATH for an executable with that name.
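You can ask bash which kind of command a name refers to with the type builtin. The sketch below assumes a typical system where ls is an external program found through $PATH and has not been replaced by an alias or function.

```shell
cd_kind=$(type -t cd)    # cd is built into the shell
ls_kind=$(type -t ls)    # ls is normally an external program
ls_path=$(type -P ls)    # -P forces a search of $PATH and prints the match

echo "cd is a $cd_kind"
echo "ls is a $ls_kind found at $ls_path"
```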

At this point, it sets up the redirection, including the pipes. These obviously must be ready before the command starts because the command may be getting its input from someplace other than the keyboard and may be sending its output somewhere other than the screen.
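As a small illustration, the sketch below connects two commands with a pipe and sends the result to a temporary file. The fruit names are arbitrary sample data. Neither printf nor sort knows anything about the file; the shell wires everything up before they start.

```shell
outfile=$(mktemp)

# printf writes into the pipe, sort reads from it and writes into the file.
printf 'pear\napple\nfig\n' | sort > "$outfile"

# head takes its input from the file instead of the keyboard.
first=$(head -n 1 < "$outfile")
echo "$first"     # prints: apple

rm -f "$outfile"
```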

This is an oversimplification. Things happen in this order, though many more things occur in and around the steps than I have listed here. What I am attempting to describe is the general process that occurs when the shell is trying to interpret your command.

Once the shell has determined what each command is, and the command is an executable binary program (not a shell script), the shell makes a copy of itself using the fork() system call. This copy is a child process of the shell. The copy then uses the execve() system call to overwrite itself with the binary it wants to execute. Keep in mind that while the child process is executing, the original shell is still in memory, waiting for the child to complete (assuming the command was not started in the background with &).
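The system calls themselves are not visible from the command line, but their effects are. The sketch below is a rough demonstration and assumes bash 4 or later, where the BASHPID variable holds the process ID of the current bash process. The exec builtin stands in for the overwrite step.

```shell
parent_pid=$BASHPID
child_pid=$( echo "$BASHPID" )   # command substitution runs in a forked copy
echo "parent: $parent_pid, child: $child_pid"

# exec replaces the forked copy with the echo program, so the second
# command in the subshell never gets a chance to run.
message=$( exec echo "replaced"; echo "never printed" )
echo "$message"                  # prints: replaced

# With &, the parent does not wait until it is asked to.
sleep 1 &
wait "$!"
bg_status=$?
echo "background job finished with status $bg_status"
```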

If the program that needs to be executed is a shell script, the program that is created with fork() and exec() is another shell. This new shell starts reading the shell script and interprets it, one line at a time. This is why a syntax error in a shell script is not discovered when the script is started, but rather when the erroneous line is first encountered.
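You can watch this happen with a throwaway script. The sketch below writes three lines to a temporary file (the contents are invented for the demonstration), with a deliberate syntax error on the second line.

```shell
script=$(mktemp)
printf '%s\n' 'echo "line 1 ran"' 'if then' 'echo "line 3 ran"' > "$script"

output=$(bash "$script" 2>/dev/null)
status=$?

echo "$output"                # prints: line 1 ran
echo "exit status: $status"   # non-zero: bash gave up at the broken line

rm -f "$script"
```

The first line runs before bash ever reads the broken one, and the third line is never reached.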

Understanding that a new process is created when you run a shell script helps to explain a very common misconception under UNIX. When you run a shell script and that script changes directories, your original shell knows nothing about the change. This confuses many people who are new to UNIX as they come from the DOS world, where changing the directory from within a batch file does change the original shell. This is because DOS does not have the same concept of a process as UNIX does.

Look at it this way: The subshell’s environment has been changed because the current directory is different. However, this is not passed back to the parent. Like “real” parent-child relationships, only the children can inherit characteristics from their parents, not the other way around. Therefore, any changes to the environment, including directory changes, are not noticed by the parent. Again, this is different from the behaviour of DOS .bat files.
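A quick way to convince yourself is to let a child shell change directory and then check where the parent is. This is a minimal sketch; the root directory is used only because it exists on every system.

```shell
start_dir=$PWD

# The child shell changes directory and reports where it ended up.
child_dir=$(bash -c 'cd / && pwd')

echo "child ended up in:  $child_dir"   # prints: child ended up in:  /
echo "parent is still in: $PWD"
```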

Using the tilde (~)

Under many shells, you can use the tilde as a shortcut to refer to a particular user’s home directory. For example, if I had a program in my personal bin, I could start it like this:

~firdaus/bin/mycommand

Note that if I am already logged in as the user firdaus, I do not need to specify my own username. Instead, I could have run the command like this:

~/bin/mycommand
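The sketch below shows what the shell does with the tilde before a command sees it. It assumes the HOME variable is set, as it is in a normal login session. The root account is used for the named form only because it exists on most systems.

```shell
my_home=~                    # a bare tilde becomes your own home directory
command_path=~/bin/mycommand # the tilde is replaced, the rest is kept
quoted='~'                   # inside quotes the tilde is left alone

echo "$my_home"
echo "$command_path"
echo "$quoted"               # prints: ~
echo ~root                   # home directory of the user root
```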

Some shells keep track of your last directory in the OLDPWD environment variable. Whenever you change directories, the shell saves your current directory in OLDPWD before it moves you to the new location.

You can use this by simply entering cd $OLDPWD. Because the variable $OLDPWD is expanded before the cd command is executed, you end up back in your previous directory.
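Here is a short sketch of that round trip. It uses two temporary directories created with mktemp so that it does not depend on any particular directory layout, and it returns to the starting directory at the end.

```shell
start_dir=$PWD
first=$(mktemp -d)
second=$(mktemp -d)

cd "$first"
cd "$second"
previous=$OLDPWD        # the shell saved the directory we just left

cd "$OLDPWD"            # expanded before cd runs, so this goes back
back=$PWD
echo "back in: $back"

cd "$start_dir"         # tidy up
rmdir "$first" "$second"
```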

A full breakdown of the command ls -l *.c

In relation to the shell operations as described above, this simple command can be interpreted as:

ls -l *.c

ls is a command used to list the files and directories in a specified directory.
-l is an option of the ls command that lists files in long format.
*.c is a wildcard pattern. The shell replaces it with the names of all files in the current directory that end with a .c extension before ls runs. The * matches any sequence of characters.

When you enter the command ls -l *.c in the terminal, the output looks like this:

firdaus@Firdaus-PC:~/$ dir
AUTHORS README.md extra_strings.c main.c parser.c strings.c
BAShell builtin.c extra_tools.c memory.c shell.h tools.c
firdaus@Firdaus-PC:~/$ ls -l *.c
-rwxrwxrwx 1 firdaus firdaus 2380 Aug 23 12:41 builtin.c
-rwxrwxrwx 1 firdaus firdaus 576 Aug 23 12:43 extra_strings.c
-rwxrwxrwx 1 firdaus firdaus 1851 Aug 23 12:43 extra_tools.c
-rwxrwxrwx 1 firdaus firdaus 3379 Aug 23 16:44 main.c
-rwxrwxrwx 1 firdaus firdaus 1533 Aug 23 12:37 memory.c
-rwxrwxrwx 1 firdaus firdaus 1392 Aug 23 12:43 parser.c
-rwxrwxrwx 1 firdaus firdaus 2026 Aug 23 12:44 strings.c
-rwxrwxrwx 1 firdaus firdaus 1503 Aug 23 12:44 tools.c
firdaus@Firdaus-PC:~/$
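To see what ls actually receives, you can swap it for a function that prints its arguments. This is a minimal sketch: show_args is a hypothetical helper, and the three files are created in a temporary directory so that the result does not depend on your own files.

```shell
# Print each argument on its own line, numbered from 0.
show_args() {
    local i=0 arg
    for arg in "$@"; do
        echo "argv[$i] = $arg"
        i=$((i + 1))
    done
}

workdir=$(mktemp -d)
touch "$workdir/main.c" "$workdir/parser.c" "$workdir/shell.h"

# The shell expands *.c before show_args (or ls) ever starts.
received=$(cd "$workdir" && show_args ls -l *.c)
echo "$received"

rm -rf "$workdir"
```

The pattern never reaches the command. By the time the program starts, *.c has already been replaced with main.c and parser.c, and shell.h has been left out.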

Congratulations on reading this far. Now test yourself: can you use the Feynman Technique to answer these questions? If you cannot, scroll up and reread. What is a shell? How is the bash shell unique? What is the difference between a shell and a terminal? How does a shell interpret commands? What happens when you type ls -l *.c in the terminal and press enter?
See this tutorial for more about getting started with Linux commands.

References

Wikipedia
Bash man page
linuxcommand.org
The Feynman Technique
Getting started with Linux commands
