What happens when you type ‘ls -l’ in a shell program?


In this article, we’ll describe, step by step, the operations that take place when you type a simple command in a shell program.

The shell is just a program, although it plays an important role in the system. It usually displays a prompt and reads from its standard input, i.e. your keyboard, waiting for a command line.

Shell reads the user input

When you type ‘ls -l’ on your keyboard, the terminal displays ‘ls -l’ after the prompt, and the shell waits for a newline character, which is sent when you press Enter. The shell program then performs a series of operations to do what you asked: list the files and folders in the current working directory, with details.

Example of ls -l execution

The shell program first stores the input line in a buffer, which is a simple string.
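As a rough illustration of this first step, here is a minimal sketch in Python. The function name `read_command_line` and the default prompt are our own choices for the example, not part of any real shell.

```python
import sys

def read_command_line(stdin=sys.stdin, stdout=sys.stdout, prompt="$ "):
    """Display the prompt, then read one line of input into a string."""
    stdout.write(prompt)
    stdout.flush()
    # readline() returns once a newline arrives, i.e. when Enter is pressed.
    line = stdin.readline()
    if line == "":
        # An empty string means end of input (for example Ctrl-D).
        return None
    # Drop the trailing newline; the rest is our buffer.
    return line.rstrip("\n")
```

Passing the streams as parameters is only a convenience for trying the function without a real keyboard.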

Shell prepares the user input

  • Tokenization

The buffer is now split into a list of tokens, which can be words (‘ls’ or ‘-l’), control operators (like ‘;’ or ‘&&’), pipes and redirections (like ‘|’ or ‘>’), etc. This is usually done by scanning for the common delimiters, the space character being the most common one, alongside the horizontal tab.
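The splitting described above can be sketched as follows. This is a deliberately simplified tokenizer: the operator list is a small assumed subset, and quoting and escaping are ignored, unlike in a real shell.

```python
# Assumed subset of operators; two-character ones come first so that
# "&&" is not read as two separate "&" characters.
OPERATORS = ("&&", "||", ">>", ";", "|", ">", "<")
DELIMITERS = " \t\n"

def tokenize(line):
    """Split a command line into words and operators (no quoting support)."""
    tokens = []
    word = ""
    i = 0
    while i < len(line):
        # Is there an operator starting at position i?
        op = next((o for o in OPERATORS if line.startswith(o, i)), None)
        if op is not None or line[i] in DELIMITERS:
            # A delimiter or an operator ends the current word.
            if word:
                tokens.append(word)
                word = ""
            if op is not None:
                tokens.append(op)
                i += len(op)
            else:
                i += 1
        else:
            word += line[i]
            i += 1
    if word:
        tokens.append(word)
    return tokens
```

With this sketch, `tokenize("ls -l")` returns `['ls', '-l']`.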

The program then performs a grammatical analysis of the token list: each token is categorized as OPERATOR (is the token a control operator?), COMMAND (the first token of the line, or the first one after a control operator), ARG (the tokens that follow a command), etc.

If the grammatical analysis detects an incorrect pattern (for example, two consecutive control operators), the program displays an error message on the standard error output, which, unless redirected, is your terminal. It then displays a new prompt (PS1) and waits for new (hopefully correct) input.
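To make the categorization and the error case more concrete, here is a minimal sketch. The category names come from the description above; the set of control operators and the wording of the error message are assumptions made for the example.

```python
# Assumed set of control operators for this sketch.
CONTROL_OPERATORS = {";", "&&", "||", "|"}

def classify(tokens):
    """Tag each token as OPERATOR, COMMAND or ARG.

    Raises ValueError when two control operators follow each other.
    """
    result = []
    expect_command = True          # the next word starts a new command
    previous_was_operator = False
    for tok in tokens:
        if tok in CONTROL_OPERATORS:
            if previous_was_operator:
                raise ValueError(f"syntax error near unexpected token '{tok}'")
            result.append(("OPERATOR", tok))
            expect_command = True
            previous_was_operator = True
        elif expect_command:
            result.append(("COMMAND", tok))
            expect_command = False
            previous_was_operator = False
        else:
            result.append(("ARG", tok))
            previous_was_operator = False
    return result
```

A shell built this way would catch the `ValueError`, print its message on standard error, and go back to displaying the prompt.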

Example of syntax error

  • Check for alias

The program checks each token that has been categorized as a command: if the token is a defined alias, it is replaced by the alias value.
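The alias check can be sketched as a simple table lookup. The alias table below is hypothetical, and the sketch performs a single replacement on the command word only, which is simpler than what real shells do.

```python
def expand_alias(argv, aliases):
    """Replace the command word by its alias value, if one is defined."""
    if not argv:
        return []
    command, args = argv[0], argv[1:]
    if command in aliases:
        # The alias value may itself contain several words.
        return aliases[command].split() + args
    return list(argv)

# Hypothetical alias table, for illustration only.
aliases = {"ls": "ls --color=auto"}
print(expand_alias(["ls", "-l"], aliases))
```

Here the command word ‘ls’ is replaced by the two words of its alias value, and ‘-l’ is kept as the last argument.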

  • Expansion

After the command has been split into tokens, these tokens or words are expanded or resolved. This is done through multiple steps, although very few apply when you type the simple ‘ls -l’ command.

There are seven kinds of expansion performed: brace expansion, tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, word splitting, filename expansion.

The most familiar expansion is filename expansion, triggered by the character *. For example, if you type ‘ls *.c’, the shell program looks through your current working directory and replaces the *.c token by all the files whose names match the *.c pattern.
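Filename expansion can be approximated with Python's `fnmatch` module. This is only a sketch under simplifying assumptions: patterns apply to a single directory, hidden files are always skipped, and a pattern that matches nothing is left unchanged.

```python
import fnmatch
import os

def expand_filename(token, directory="."):
    """Replace a pattern token by the sorted list of matching file names."""
    # Tokens without wildcard characters are left untouched.
    if not any(c in token for c in "*?["):
        return [token]
    names = sorted(
        name for name in os.listdir(directory)
        if not name.startswith(".") and fnmatch.fnmatchcase(name, token)
    )
    # Simplification: with no match, keep the pattern as it was typed.
    return names or [token]
```

In a directory containing `a.c`, `b.c` and `notes.txt`, `expand_filename("*.c")` returns `['a.c', 'b.c']`, while a token such as ‘-l’ is returned as is.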

Simple variable expansion is pretty common as well: a token in the format $XXX (special cases like ‘$$’ or ‘$?’ aside) is replaced by the value of the variable XXX if it exists, or by an empty string if it does not.
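A minimal sketch of this simple form of variable expansion is shown below. It only handles the plain `$NAME` form; `${NAME}`, quoting, and special parameters such as ‘$$’ or ‘$?’ are outside its scope and are left untouched.

```python
import re

# A variable name: a letter or underscore, then letters, digits or underscores.
VARIABLE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

def expand_variables(token, env):
    """Replace each $NAME by env[NAME], or by an empty string if undefined."""
    return VARIABLE.sub(lambda match: env.get(match.group(1), ""), token)

print(expand_variables("$HOME/docs", {"HOME": "/home/me"}))  # prints /home/me/docs
```

The `env` argument is a plain dictionary here; a real shell would look the name up in its own variable table.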

Example of variable expansion

  • Pathing

At this point, the program still doesn’t know what to do with the simple command “ls”: it has to find a program called ls to be able to execute it (with the optional arguments that follow). To do so, the shell program uses the PATH environment variable, which is a string containing multiple paths, all separated by the ‘:’ character.

The program goes through this string, folder by folder, and checks whether the corresponding folder contains a file called “ls” that is executable. As soon as one is found, the token “ls” is replaced by the full path: in this example, the ls executable is located in the /bin folder, so “ls” is replaced by “/bin/ls”.
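The PATH lookup can be sketched as follows. The function name `find_in_path` is ours; Python's standard library offers `shutil.which` for the same job, and the sketch only spells out the loop described above.

```python
import os

def find_in_path(command, path=None):
    """Return the full path of `command`, or None if it cannot be found."""
    # A command that already contains a slash is used as is.
    if "/" in command:
        return command
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(":"):
        # An empty entry in PATH stands for the current directory.
        candidate = os.path.join(directory or ".", command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

On a system where ls lives in /bin and /bin comes first among the matching PATH entries, `find_in_path("ls")` would return “/bin/ls”.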

Content of the /bin folder

Shell executes the prepared command line

Now everything is ready to execute the command line. The shell program knows which program to run (“/bin/ls”) and with which arguments (“-l”).

The shell does not execute the file itself: doing so would replace the shell with the command. Instead, it performs a fork() system call: it asks the kernel to duplicate the shell process, environment included, into a new process that resumes execution right where the fork() call returns. This new process is called the child process and is given a process ID, or pid, that its parent (our shell) can use to track it.

The shell then makes a wait() system call, which tells the kernel that the shell will wait for its child process to finish before continuing.

The child process then makes an exec system call, specifically execve(), which takes the path of the command, its arguments, and a list of environment variables (in the form of an array of strings) for the new program to adopt. The execve() call discards the child process’s current program and data and replaces them with the executable file whose path is stored in our command token. The child process has now become the command, and it starts running.
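The fork, exec and wait sequence can be sketched with Python's `os` module, whose `fork`, `execve` and `waitpid` functions are thin wrappers around the corresponding system calls. The function name `run` and the exit code 127 used when the program cannot be executed are choices made for this example (127 mirrors a common shell convention).

```python
import os

def run(argv, env=None):
    """Run argv[0] with arguments argv in a child process; return its exit code."""
    env = dict(os.environ) if env is None else env
    pid = os.fork()
    if pid == 0:
        # Child process: replace the copied shell image with the command.
        try:
            os.execve(argv[0], argv, env)
        except OSError:
            # execve only returns on failure (for example, file not found).
            os._exit(127)
    # Parent process: wait for this child to finish.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # The child was terminated by a signal.
    return 128 + os.WTERMSIG(status)
```

Calling `run(["/bin/ls", "-l"])` on a system where /bin/ls exists would list the current directory and return the exit code of ls to the parent.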

Example of execution of ls -l

Back to square one

The last step, once the child process has finished executing the “ls -l” command, is to display the command line prompt again so that the user can enter a new command. The format of the prompt is usually stored in the PS1 variable. In the above example, the prompt is simply the character $.
