Ever wonder what’s going on in there?

What happens when you type ls -l in the shell?

The shell is a command-line interpreter: it takes commands (commonly typed into the terminal on your computer) and passes them to the operating system to be processed. While the GUI, or graphical user interface, has moved to a dominant position, the command line remains the method of choice for many programmers when interacting with their computers. The first Unix shell was written by Ken Thompson at Bell Laboratories and released in 1971. It may not be used much now, but it spawned a whole line of Unix/Linux shell programs, including the “Bourne shell,” authored by Stephen Bourne, and the “Bourne again shell,” developed by Brian Fox and commonly called “bash.” Bash is probably the most widely used of the shell programs, and it traces its lineage back to Thompson’s original shell.

When you open the shell, you will generally see a prompt (for example: “$”). When you type a command and press Enter, a series of steps is set in motion. Let’s walk through what happens when you type in a simple command: “ls -l.”

When you wish to list the files in the current directory with only their names, you might type in “ls.” But when you want more information about those files, you may add the flag “-l,” which signals that you want the long format (notice the letter ‘l’ for long, clever no?). The long format gives additional information on things like file permissions or the date the file was last modified, for example:

Example of output after ls -l
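
The extra fields in the long format come from metadata that the file system keeps for each file. As a rough illustration, here is a minimal Python sketch of how a program might collect a few of them. The function name long_format_fields is made up for this example, and the real ls prints more columns (link count, owner, group) than this does.

```python
import os
import stat
import time


def long_format_fields(path):
    """Return (permissions, size, modified) for one file, in the spirit of ls -l."""
    info = os.lstat(path)  # lstat describes a symlink itself, not its target
    permissions = stat.filemode(info.st_mode)  # a string such as '-rw-r--r--'
    modified = time.strftime("%b %d %H:%M", time.localtime(info.st_mtime))
    return permissions, info.st_size, modified


if __name__ == "__main__":
    for name in sorted(os.listdir(".")):
        if name.startswith("."):
            continue  # like plain ls, skip hidden entries
        permissions, size, modified = long_format_fields(name)
        print(permissions, size, modified, name)
```

Note that the timestamp used here is the last modification time (st_mtime), which is the date ls -l shows by default.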

The shell program captures what you type on the command line through what we call the standard input file stream. Everything is captured up to either a newline (entered when you press Enter) or what we call end of file (EOF, sent with Ctrl-D). If the program reaches EOF, it terminates. Otherwise, upon reaching the newline, the input is copied into a buffer and the program continues. At this point, the buffer is separated into tokens (you can think of tokens as the meaningful elements you typed on the command line). The string of characters is examined, and delimiters such as spaces or newlines are converted to what we call null bytes (which denote the end of a string of characters). A null-terminated array of pointers (memory addresses) is then created; each pointer points to the beginning of a token, for easy reference. In our “ls -l” example, we can visualize this process through the following representations of the computer’s memory.

Notice that in the chart below, the command “ls -l,” which was entered into standard input, has now been copied into a buffer.

During the tokenization process, whitespace characters are replaced with null bytes (“\0”), which break up the buffer into multiple strings, which we call tokens.

An array of pointers is created in memory. Each pointer holds the memory address of the beginning of the token as per the following chart.
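
To make the memory picture concrete, here is a small Python sketch that mimics the process on a bytearray. It is only a simulation under stated assumptions: Python has no raw pointers, so integer offsets into the buffer stand in for memory addresses, and None stands in for the terminating null pointer. The names tokenize and token_at are invented for this example.

```python
DELIMITERS = b" \t\n"


def tokenize(line):
    """Split a command line in place, roughly the way a C shell might.

    Returns the modified buffer and the offsets where each token starts.
    The offsets play the role of the pointers in the chart; the final
    None plays the role of the terminating null pointer.
    """
    buffer = bytearray(line, "utf-8") + b"\0"  # C strings end with a null byte
    offsets = []
    in_token = False
    for i, byte in enumerate(buffer[:-1]):
        if byte in DELIMITERS:
            buffer[i] = 0      # overwrite the delimiter with a null byte
            in_token = False
        elif not in_token:
            offsets.append(i)  # first byte of a new token
            in_token = True
    return buffer, offsets + [None]


def token_at(buffer, offset):
    """Read the null-terminated string that starts at the given offset."""
    end = buffer.index(0, offset)
    return buffer[offset:end].decode("utf-8")
```

For the input "ls -l\n", the space and the newline each become a null byte, and the two recorded offsets point at the "l" of "ls" and the "-" of "-l".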

The first token is what we call the command, for example “ls.” The shell will take the first token and compare it to a list of aliases. An alias can be compared to a nickname for a command. If a match is found, the shell will replace the alias with the corresponding command. Next, the shell will compare the token to a list of builtins, which are functions that are literally built in to the shell (even more clever, no?). In our case, we will assume that “ls” is neither an alias nor a builtin.

So what happens next? The shell will check whether the command was given as a path (for example, the absolute path “/bin/ls”). If it was, the shell can proceed to execute that file directly. In the case of “ls” there is no path, so the command cannot yet be executed. Instead, the shell will use what is known as the PATH environment variable to search for the command’s location in the file system. The PATH is a string of colon-separated directories that house the executable files for commands. The shell checks each directory in the PATH in the order in which it is listed, and stops at the first one that contains a file matching the command name, “ls” for example. Of course, the shell must also verify that the user has permission to execute that file.

Once the file is found, the shell can begin executing the command. The shell will fork a child process, that is, create a copy of its own process through the fork system call. The shell creates this child because executing a program replaces the process that runs it; running the command in a child keeps the shell itself alive. While the parent process waits, the child executes the command using the command’s absolute path, the array of tokens that holds the arguments, and a null-terminated array of environment variables.
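
The lookup order described above (alias, then builtin, then path, then PATH search) can be sketched in Python as follows. This is an illustrative sketch, not how any particular shell is implemented: the ALIASES and BUILTINS tables and the function names resolve and find_in_path are hypothetical, and a real shell would re-tokenize an alias expansion rather than simply return it.

```python
import os

# Hypothetical tables for illustration; a real shell maintains its own.
ALIASES = {"ll": "ls -l"}
BUILTINS = {"cd", "exit"}


def find_in_path(command, path_value):
    """Return the first executable match for command in a PATH-style string."""
    for directory in path_value.split(":"):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", command)
        # The file must exist, be a regular file, and be executable by us.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def resolve(command, path_value):
    """Classify the first token in the order the text describes."""
    if command in ALIASES:
        return ("alias", ALIASES[command])
    if command in BUILTINS:
        return ("builtin", command)
    if "/" in command:
        # Already a path, so no PATH search is needed.
        ok = os.path.isfile(command) and os.access(command, os.X_OK)
        return ("file", command if ok else None)
    return ("file", find_in_path(command, path_value))
```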
When the command finishes, the child process terminates. The waiting parent process then resumes. The shell uses the PS1 variable to print the prompt and awaits a new command on the command line.
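
The fork, execute, and wait cycle can be sketched with Python’s os module, which exposes the same system calls on Unix-like systems. This is a minimal sketch rather than a full shell: the function name run_command is invented here, and using exit status 127 when the program cannot be executed is a common shell convention, not something the steps above require.

```python
import os


def run_command(executable, argv, env):
    """Run a program in a child process and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execve(executable, argv, env)
        except OSError:
            os._exit(127)  # exec failed, so leave the child immediately
    # Parent: block until the child terminates, then report how it ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative value if a signal killed the child
```

For our example, the shell’s equivalent call would be something like run_command("/bin/ls", ["ls", "-l"], dict(os.environ)), assuming the PATH search found ls at /bin/ls. Only after run_command returns would the shell print its prompt again.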

This post was co-authored by Alexa Orrico and John Cottrell
