Simple Commands Aren’t So Simple

Sam Hermes
Feb 4 · 3 min read

Let’s say you’re a novice Linux user who’s just becoming familiar with the command line. It probably didn’t take you long to learn how to type something like ls *.c. You’d barely be able to navigate the command line at all without ls, and even Windows lets you refer to multiple files using *.c.

Since you’ll learn “what it does” so early on, I’m not going to explain that here. Instead, I’m going to explain “how it works” in depth.

I’m going to assume readers are using the “GNU Bourne-Again Shell”, also known as bash. (If you don’t know if you’re using it, you probably are.) I’m also going to make reference to the official bash and ls manuals as reproduced by the “man-pages project” on http://man7.org/linux/man-pages.

Parsing the Command

(Reference)

I’m going to skip the explanation of how your command got to bash in the first place. We’ll start at the point where it’s taken your command, ls *.c, and needs to know what to do with it.

Before bash can run your command, it needs to know what it means. In this case, the first step of that process is breaking the line down into words. This is easy: you typed one space, and you didn’t do anything special with that space, so there are two words: ls and *.c.
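
A quick way to see the effect of that space is to ask the shell how many words it handed over. The sketch below uses a hypothetical helper, count_args, which is not part of bash. The *.c is quoted only so that the result doesn’t depend on which files happen to be in your directory.

```shell
# count_args is a hypothetical helper: it prints how many words it received.
count_args() {
    echo "$#"
}

count_args ls '*.c'    # the unquoted space separates two words: prints 2
count_args 'ls *.c'    # the quoted space is part of a single word: prints 1
```

Quoting is one of the “special” things you can do with a space. Once it’s quoted, it no longer separates words.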

Once the line has been broken into words, bash performs “pathname expansion”. (The bash manual reserves the name “word splitting” for a separate expansion step, so I’ll avoid that term for the step above.) There are many types of “expansion”, and they all have the same purpose: to let you use shortcuts in your command that stand for more text than you’d want to type. The process of expansion can vary depending on options that you can toggle within bash. I’m going to assume you’re using all the default options.

Before I continue, let’s say your current directory contains these files:

$ ls
main.c numconvert.c numconvert.h specparse.c specparse.h

Since one of the words in your command contained an unquoted * character, bash treats that word as a “pattern” and is going to look through the current directory for files that match it. When used in a pattern, the asterisk matches any text at all, including no text, so your pattern is a very simple one: you asked for any file whose name ends with “.c”.
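
To get a feel for the pattern by itself, here is a small sketch using the shell’s case statement, which understands the same pattern syntax but applies it to any string rather than to file names. The matches helper is a name made up for this illustration.

```shell
# matches PATTERN NAME: succeed if NAME matches the shell pattern PATTERN.
# (Hypothetical helper; $1 is left unquoted so it is treated as a pattern.)
matches() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

for name in main.c numconvert.h specparse.c; do
    if matches '*.c' "$name"; then
        echo "$name matches"
    else
        echo "$name does not match"
    fi
done
```

Only the two names ending in “.c” are reported as matching.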

Now that bash understands your pattern, it searches the current directory for files whose names match that pattern. In this case, it finds 3. It sorts those names alphabetically, then replaces *.c with them, one word per file name. (If nothing had matched, bash would by default have left *.c in the command unchanged.)

When bash is done parsing your command, there are now 4 words: ls, main.c, numconvert.c, and specparse.c in that order.
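
You can watch this happen without running ls at all. The sketch below recreates the example directory in a temporary location (via mktemp, which I’m assuming is available, as it is on typical Linux systems) and then lets the shell expand the words. expand_demo is a hypothetical name.

```shell
# expand_demo: recreate the example files, then show the words the shell
# ends up with for "ls *.c". The body runs in a subshell, so the cd does
# not affect your own session.
expand_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch main.c numconvert.c numconvert.h specparse.c specparse.h

    set -- ls *.c              # the shell expands *.c before set sees it
    echo "$# words: $*"

    cd / && rm -r "$dir"       # clean up the temporary directory
)

expand_demo    # prints: 4 words: ls main.c numconvert.c specparse.c
```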

Executing the Command

(Reference)

Now that bash knows exactly what your command is supposed to mean, it’s time to delegate to the application that will actually perform the operation you want. In this case, it’s a program called ls.

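The first challenge is to find ls. Since ls isn’t the name of a shell function or a builtin, bash has to locate a program file. If this is the first time you’re running ls in this session, bash is going to search through a list of directories to find an executable file with that name, and it remembers where it found it so that later runs can skip the search. The directories bash looks through are initially stored in an “environment variable” called PATH, separated by colons. (These variables are an operating system concept, not a shell one.)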
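
The search itself is simple enough to imitate. This is only a rough sketch of the idea, not bash’s actual implementation: find_in_path is a hypothetical helper that walks a colon-separated list (PATH unless you pass another one) and prints the first executable file with the given name.

```shell
# find_in_path NAME [DIRLIST]: print the first executable called NAME found
# in the colon-separated DIRLIST (defaults to $PATH). Hypothetical helper.
find_in_path() {
    fip_name=$1
    fip_dirs=${2:-$PATH}
    while [ -n "$fip_dirs" ]; do
        fip_dir=${fip_dirs%%:*}                # first directory in the list
        case "$fip_dirs" in
            *:*) fip_dirs=${fip_dirs#*:} ;;    # drop it and keep the rest
            *)   fip_dirs= ;;                  # that was the last one
        esac
        fip_file=${fip_dir:-.}/$fip_name       # empty entry means "."
        if [ -f "$fip_file" ] && [ -x "$fip_file" ]; then
            echo "$fip_file"
            return 0
        fi
    done
    return 1
}

find_in_path ls || echo "ls not found in PATH"
```

On most systems this prints the same location that the standard command -v ls reports when ls isn’t aliased.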

After finding the ls executable, bash builds the argument list it’s going to give to the program. This list just consists of the word list it determined earlier. As a reminder, this list is ls, main.c, numconvert.c, then specparse.c.
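
To see what such a list looks like from the receiving end, the sketch below hands the same four words to a separate sh process standing in for ls. show_argv is a hypothetical helper. The first word plays the role of the program name and shows up as $0 in the child.

```shell
# show_argv WORD...: start a child shell with the given word list and have
# it print each word with its position. Hypothetical helper.
show_argv() {
    sh -c 'i=0
           for a in "$0" "$@"; do
               echo "argv[$i] = $a"
               i=$((i + 1))
           done' "$@"
}

show_argv ls main.c numconvert.c specparse.c
```

The child never sees *.c. By the time it starts, the pattern has already been replaced by the file names.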

Finally, bash can create a new ls process. How an application creates a new process in Linux and executes a program within it is a topic for another day.

In order for you to see the results of your ls command, its output has to end up in the same place as bash’s own output. This happens without any extra work: the new process inherits bash’s standard output, which is your terminal. As a result, any text produced by ls appears in your terminal just like the text bash produces on its own.
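
One way to convince yourself that ls simply writes wherever the shell’s output already points is to point the shell’s output somewhere else first. In this sketch (inherit_demo is a hypothetical name, and mktemp is assumed to be available), a subshell redirects its own output to a temporary file and then runs ls with no redirection at all.

```shell
# inherit_demo: show that a child process writes to whatever its parent
# shell's standard output is connected to. Hypothetical helper.
inherit_demo() {
    log=$(mktemp) || return 1
    (
        exec > "$log"            # redirect this subshell's own output
        echo "from the shell"
        ls -d /                  # no redirection here; ls just inherits
    )
    cat "$log"                   # both lines ended up in the file
    rm -f "$log"
}

inherit_demo
```

The file ends up holding the shell’s line followed by the single line / from ls.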
