ls -l in the shell

Author(s): Lisa Leung and Joseph McDaniel

The Shell

The shell (i.e., the command-line interpreter) takes in a variety of commands. In short, the shell is simply a program that reads commands typed by a user and executes the appropriate programs in response to those commands. When the user types ls -l, the shell performs a series of operations before returning a beautiful list of files, folders, etc. in the current working directory, in long format.

Process Creation and Program Execution

Understanding processes is important for understanding how the shell works, since each time a user inputs a command, the program is executed in a separate process. Several calls are essential to creating processes and running executable files: fork(), exit(), wait(), and the exec() family.

The fork() system call allows a process (the parent process) to create a new process (the child process). The child process is a nearly exact duplicate of the parent process: it obtains a copy of the parent’s stack, data, heap, and text segments.

The exit(status) call terminates a process, making all of its resources (memory, etc.) available for subsequent reallocation by the kernel. The status argument is an integer that describes the termination status of the process. Note that the parent can retrieve this status integer by invoking the wait() system call. [Fun fact: the exit() library function is built on top of the _exit() system call.]

The wait(&status) system call suspends the execution of the calling process until one of its children has terminated. The termination status of the child is returned in the status argument of wait(&status).

The exec() Library Functions

execve() is the system call on which the exec() family of library functions is built; it loads and runs an executable file in the calling process. The parameters required depend on which of the exec functions is called. execve() takes three parameters: the pathname, the argument array, and the environment.

#include <unistd.h>

int execve(const char *filename, char *const argv[],
char *const envp[]);
  1. Pathname: the filename (or pathname) provided can be a relative or absolute path. execve() is “stricter” when it comes to path input: it does not search the PATH, so the pathname must identify the executable exactly. The executable for ls is located in the /bin directory on Linux systems. The execve() function will not know to execute /bin/ls if you simply pass in ‘ls’ (unless you have an ls executable in your current working directory). execvp(), however, searches the directories listed in the PATH environment variable, and if PATH is not set it assumes a default path list (.:/usr/bin:/bin according to Kerrisk; the exact default depends on the C library).
  2. Array: argv is an array of argument strings passed to the new program. By convention, the first of these strings (i.e., argv[0]) contains the filename associated with the file being executed. The array must end with a NULL pointer.
  3. The environment: envp is an array of strings, conventionally of the form key=value, which are passed as the environment to the new program. It must also end with a NULL pointer.
The Linux Programming Interface (Michael Kerrisk) page 568

How fork(), exit(), wait(), and execve() come together

The image below provides an excellent illustration of the relationship between a child process and its parent. The child is forked from the parent process and runs its program until the end. When the child exits with a status, that status is returned to the parent (if the parent is waiting for it). With wait(), the parent will not continue executing until the child process has finished.

The Linux Programming Interface (Michael Kerrisk) page 515

So why is this important? Remember that the shell is a program, and your commands tell the shell to run other programs.

The shell program does the following:

- Read from standard input
- Perform expansions
- Break the command into tokens
- Check for aliases
- Check for builtins
- Find the command in the PATH
- Call the program ls with -l as its argument
- When ls is done, print the prompt
- Wait for a new command to be entered

Step 1: Read from standard input: the shell prompts for user input and reads the line the user types, which may invoke system calls such as read().

Step 2: Expansion:

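The command text is expanded into something else before the shell acts upon it. There are pathname expansions, tilde expansions, arithmetic expansions, and so forth. For example, ~ expands to the user's home directory, and *.txt expands to the names of the matching files.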

Step 3: Break the command into tokens:

When the user types ls -l, the shell breaks the command into tokens, which are stored in an array. This array is passed as the second argument to one of the exec functions to run the executable file.

Step 4: Check for Alias

An alias is a shortcut created by the user, generally to avoid typing long commands. The shell checks the command token against its list of aliases and, if it finds a match, replaces the token with the alias's value.

Step 5: Check Builtins

Builtin commands are contained within the shell and are executed directly, without invoking another program. They exist either because the command has to run inside the shell's own process (cd, for example, changes the shell's working directory) or because starting a separate program would be inefficient. Even though ls is not a builtin command, the shell still checks it against its list of builtin commands. To see this step for yourself, use the type command to check whether commands like ls or cd are builtins.

Step 6: Find the command in the PATH

The first token of the array is the command name. The shell searches the directories listed in the PATH environment variable (/bin, /usr/bin, etc.) to locate the executable.

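Step 7: Call the program ls with -l as its argument: the shell forks a child process, the child calls one of the exec functions to run ls, and the parent shell waits for the child to finish.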

Step 8: Lastly, the shell prints the prompt and waits for a new command.

Resources:

Michael Kerrisk. The Linux Programming Interface: A Linux and UNIX System Programming Handbook. No Starch Press, 2012.

http://linuxcommand.org/lc3_lts0080.php