“Linux Command, ls -l”

In software engineering, a shell is a command-line interface (CLI) through which a user accesses an operating system. It is a layer around the operating system’s core services, known as the kernel. Shells are most closely associated with UNIX and Linux, where they support a wide variety of tasks: a software engineer can write programs in the shell, tailor their environment, query other machines, connect to other devices, and more.

My coding partner and I have brainstormed about building our own Linux shell, following in the footsteps of Stephen Bourne (Bourne shell), Brian Fox (bash), David Korn (Korn shell), Bill Joy (csh), Paul Falstad (zsh), and others. Along the way, we have explored the low-level inner workings of a Unix/Linux shell in greater detail. Let’s start with a simple command.

$ ls -l

This is a common command: the user wants a listing of the files in the current directory, one file per line, with each file’s details.

The main interface starts at the prompt, “$”, where the user types their command. The prompt can be a simple dollar sign, or it can be customized in many ways through the $PS1 variable.

At the “$” prompt, the user types “ls -l”. After hitting the return key, you will see a long-format listing: one line per file, showing its permissions, link count, owner, group, size, modification time, and name.

What is the Shell Doing?

The Linux shell reads the user’s command from standard input (stdin); the line is complete once a newline character is read. The input is then parsed, or tokenized, using delimiters such as space, tab (\t), carriage return (\r), and newline (\n). The C library function strtok() is useful for tokenizing the entered command.
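
As a rough illustration of the tokenizing step, here is a minimal sketch in C. The function name tokenize() and the MAX_TOKENS limit are made up for this example; only strtok() comes from the description above.

```c
#include <string.h>

#define MAX_TOKENS 64            /* arbitrary limit chosen for this sketch */
#define DELIMITERS " \t\r\n"     /* space, tab, carriage return, newline */

/*
 * Split `line` in place into delimiter-separated tokens.
 * Stores at most MAX_TOKENS - 1 pointers in `tokens`, followed by a NULL
 * terminator, and returns the number of tokens stored.
 * strtok() writes '\0' bytes into `line`, so the buffer must be writable.
 */
size_t tokenize(char *line, char *tokens[MAX_TOKENS])
{
    size_t count = 0;
    char *token = strtok(line, DELIMITERS);

    while (token != NULL && count < MAX_TOKENS - 1) {
        tokens[count++] = token;
        token = strtok(NULL, DELIMITERS);   /* continue in the same string */
    }
    tokens[count] = NULL;
    return count;
}
```

For the input line “ls -l” followed by a newline, this yields two tokens, “ls” and “-l”. The NULL terminator is there because the exec family of functions expects an argument array in that shape.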

The first element or token is the command. Now what?

Is the Command An Alias?

An alias is a user-defined substitute for a command. You can think of an alias as an abbreviation that stands for a command and/or its arguments. In bash, aliases are often defined in the .bashrc or .bash_aliases file. If the first token matches an alias, the shell must replace it with the alias definition. Even then, the command may not be ready to execute: the result may refer to a built-in or to a program in another directory.
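
One simple way to picture the alias check is a table lookup. The sketch below is an assumption about how a small shell might do it, not how bash does; the struct, the function name lookup_alias(), and the two table entries are all invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: `name` is what the user types,
 * `value` is the text it expands to. */
struct alias {
    const char *name;
    const char *value;
};

/* Made-up example entries; a real shell would fill this table from
 * alias definitions in files such as .bashrc. */
static const struct alias alias_table[] = {
    { "ll", "ls -l" },
    { "la", "ls -a" },
};

/* Return the expansion for `command`, or NULL if it is not an alias. */
const char *lookup_alias(const char *command)
{
    size_t n = sizeof alias_table / sizeof alias_table[0];

    for (size_t i = 0; i < n; i++) {
        if (strcmp(alias_table[i].name, command) == 0)
            return alias_table[i].value;
    }
    return NULL;
}
```

When a match is found, the expansion still has to be tokenized like any other input, since it may contain arguments of its own.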

Is the Command A Built-In?

A shell has a set of commands built into it rather than stored as separate programs; cd and exit are common examples. If the command is a built-in, the shell must match it to the right internal function and pass the remaining tokens to that function as arguments. However, in our example, “ls” is not a built-in command.
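
Matching a command to “the right function” could be done with a table of function pointers. This is only a sketch under that assumption: the names builtin_fn, find_builtin(), builtin_cd() and builtin_pwd() are hypothetical, and the cd here is simplified (a real cd with no argument goes to the home directory).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Every built-in receives the full token array (argv[0] is its name). */
typedef int (*builtin_fn)(char **argv);

static int builtin_cd(char **argv)
{
    if (argv[1] == NULL) {              /* simplification: require a target */
        fprintf(stderr, "cd: missing argument\n");
        return 1;
    }
    if (chdir(argv[1]) != 0) {
        perror("cd");
        return 1;
    }
    return 0;
}

static int builtin_pwd(char **argv)
{
    (void)argv;                         /* unused */
    char cwd[4096];                     /* arbitrary buffer size */

    if (getcwd(cwd, sizeof cwd) == NULL) {
        perror("pwd");
        return 1;
    }
    puts(cwd);
    return 0;
}

struct builtin {
    const char *name;
    builtin_fn run;
};

static const struct builtin builtins[] = {
    { "cd",  builtin_cd  },
    { "pwd", builtin_pwd },
};

/* Return the function for built-in `name`, or NULL if there is none. */
builtin_fn find_builtin(const char *name)
{
    size_t n = sizeof builtins / sizeof builtins[0];

    for (size_t i = 0; i < n; i++) {
        if (strcmp(builtins[i].name, name) == 0)
            return builtins[i].run;
    }
    return NULL;
}
```

With this table, find_builtin("ls") returns NULL, so the shell moves on to searching for an external program. Built-ins such as cd have to run inside the shell process itself, because a child process changing its own directory would not affect the shell.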

Is the Command in the $PATH?

Every Linux process has an environment, and a shell inherits its environment, including the $PATH variable, from its parent process (often another shell). If you run “printenv” or “echo $PATH”, a user’s $PATH may resemble

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

To find the “ls” executable, the shell tokenizes the PATH value with “:” as the delimiter. It appends “/ls” to each directory in turn and checks whether that file exists, for example with the stat() or access() system call. The shell tests each directory and file combination until it finds a match or runs out of directories. In our example, the ls command is found at “/bin/ls”, the sixth directory tested.
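
The search can be sketched in C roughly as follows. The function name find_in_path() and its parameters are assumptions made for this example. It uses access() with X_OK to test each candidate, which is one of several reasonable checks; it does not verify that the match is a regular file, and it skips empty PATH entries, both of which a real shell would handle.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Search each directory in `path_value` (a colon-separated list such as
 * the value of $PATH) for an executable called `command`.
 * On success, writes the full path into `out` and returns 0.
 * Returns -1 if nothing is found; `out` is then unspecified.
 */
int find_in_path(const char *path_value, const char *command,
                 char *out, size_t out_size)
{
    /* strtok() modifies its input, so work on a private copy. */
    size_t len = strlen(path_value) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, path_value, len);

    int result = -1;
    for (char *dir = strtok(copy, ":"); dir != NULL; dir = strtok(NULL, ":")) {
        int n = snprintf(out, out_size, "%s/%s", dir, command);
        if (n < 0 || (size_t)n >= out_size)
            continue;                   /* candidate too long: skip it */
        if (access(out, X_OK) == 0) {   /* exists and is executable */
            result = 0;
            break;
        }
    }
    free(copy);
    return result;
}
```

A caller would typically pass getenv("PATH") as the first argument and the first token of the command line as the second.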

Execute Command With Arguments

The fork() system call spawns a child process that is a copy of the parent, including its stack and heap. As a result, the child has access to the tokenized “/bin/ls -l” command and receives most of the same environment and program information as the parent. In the child, the command is run with execve() (or a related exec function). execve() does not create a new process; it replaces the child’s program image with the ls program, which runs and then terminates. Meanwhile, the parent calls wait() to block until the child has finished.
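
Here is a minimal sketch of that fork, exec, and wait sequence. The function name run_command() is invented for this example, and it uses waitpid() rather than plain wait() so that it waits for this specific child; the exit code 127 for a failed exec follows a common shell convention.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;   /* the current process's environment */

/*
 * Run the program at `path` with the NULL-terminated argument array
 * `argv`, wait for it to finish, and return its exit status.
 * Returns -1 if the child could not be created or did not exit normally.
 */
int run_command(const char *path, char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execve(path, argv, environ);
        /* execve() returns only if it failed. */
        perror("execve");
        _exit(127);
    }

    /* Parent: block until the child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For our example, the shell would call this with the path “/bin/ls” and the token array holding “ls”, “-l”, and a terminating NULL.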

Print Prompt

The parent then prints the prompt again (as specified by the $PS1 variable, mentioned above).

Rinse and repeat

The command has been processed. The Linux shell now waits for the user to enter another command.
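
Put together, the cycle of prompt, read, and process is a loop. The sketch below shows only the loop itself; shell_loop() and the line_handler callback type are hypothetical names, and the handler stands in for all of the alias, built-in, PATH, and fork/exec steps described earlier. The fixed 1024-byte buffer is a simplification, so longer lines would be split.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: processes one command line, returns a status. */
typedef int (*line_handler)(const char *line);

/*
 * Print `prompt` to `out`, read one line from `in`, pass it to `handle`,
 * and repeat until end of input (Ctrl-D on a terminal).
 * Returns the number of non-empty lines handled.
 */
int shell_loop(FILE *in, FILE *out, const char *prompt, line_handler handle)
{
    char line[1024];
    int count = 0;

    for (;;) {
        fputs(prompt, out);
        fflush(out);                       /* prompt has no newline: flush */
        if (fgets(line, sizeof line, in) == NULL)
            break;                         /* end of input or read error */
        line[strcspn(line, "\n")] = '\0';  /* strip the trailing newline */
        if (line[0] == '\0')
            continue;                      /* empty line: just re-prompt */
        handle(line);
        count++;
    }
    return count;
}
```

A real shell would call something like shell_loop(stdin, stdout, "$ ", handler), where the handler tokenizes the line and runs it.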

The End

Stick a fork() in it. We’re done.