What happens when you type ls *.c and hit Enter in a shell

100% Accurate Depiction Of A Hacker’s Desktop. Original source: reddit

This is the first in a series of two posts aimed at beginners on using a shell to navigate their OS environment.

What is a shell?

(skip this section if desired)

To start off with, if you don’t know what a shell is: it is an executable program like any other on your computer. On a Mac, you can open a shell by going to Spotlight or your Applications folder and searching for ‘Terminal’. You can do the same on a standard flavor of Ubuntu. Windows computers have different built-in shell programs: PowerShell, and the older cmd.exe.

If you bothered to open a shell just now, you should see a prompt followed by a highlighted cursor (a good hint that you can type something). This is called the command line.

What does this program do? You might have seen the likes of it in movies or TV shows that feature hackers, like that oldschool movie with Angelina Jolie or the more recent show Mr. Robot. You can enter commands on the aptly-named command line, which lets you navigate your computer’s files and directories, modify them, and display information about, and control, system resources.

In other words, you can use the shell to navigate your computer much as you would with graphical interfaces such as Finder on macOS or Windows Explorer on Windows. (Actually, according to Wikipedia, graphical user interfaces are also a kind of shell!) Now, why would you want to do that, instead of just using Finder or Explorer?

When you become adept at using the command line shell, you will find it’s actually a lot faster to type and read information than it is to click and scroll around to find something you’re looking for. Not only that, you’re very powerful in a shell: with all the built-in commands that are available, you can do much more than look in directories and open files. If you’re looking to become a developer or software engineer, you will find this is very handy.

Finally, did you know in the olden days when computers were new inventions, textual interfaces and command lines were the only way of interacting with the operating system? Isn’t it cool to pretend you were one of those early-day users of computers and know how to get around a computer like one of them? If I’m sensing uncertainty from you, that’s OK. I personally think it’s very cool. 😎

Let’s talk about an example command, ls.

In the rest of this article, we will be covering an example command run from Bash, a shell that ships with most Unix-like systems such as Linux and macOS, which are very widely used for software development.

If you’ve worked in a shell in a Unix environment before, you probably know ls is a handy command. You may also be familiar with wildcards (‘*’), and have used them to list all the files of a particular type.

Here are the files in a sample directory when using ls :

When I use ls *.c, I get all the files that end in the .c extension (files written in the C language):

If you’re not familiar with these commands, have no fear. We’ll break down what’s happening step-by-step to give a better picture of what the shell does when you type these words. You can also skip to the next section if you are a bit familiar with ls and just want to find out how it works. ;)

ls is a command that stands for “list directory contents” (abbreviated, “list”). Let’s take a look at the manual entry for ls. You can find this same content by entering man ls in your terminal. The excerpts below come from the macOS version of the manual; other systems word theirs a little differently.

 ls — list directory contents

So far, so good.

 ls [-ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1] [file …]

This gives you information about the usage of the command. The command itself, ls , is followed by the ‘arguments’ or ‘operands’ it takes, meaning the information you can feed to the command for it to work, in the form of words.

The first set of characters in square brackets are the possible options (flags preceded by a dash, such as ls -a or ls -l or ls -als ) you can use. Since we didn’t use any options in our example command ls *.c , we won’t be covering any of them here, but they are all listed in the manual entry if you would like to read about their meanings.

The second set of square brackets indicates that you can specify a file, and the ellipsis means you can specify more than one, so you can list many files, not just one. The square brackets mean that both kinds of argument, the options and the files, are optional.

 For each operand that names a file of a type other than directory, ls displays its name as well as any requested, associated information. For each operand that names a file of type directory, ls displays the names of files contained within that directory, as well as any requested, associated information.

 If no operands are given, the contents of the current directory are displayed. If more than one operand is given, non-directory operands are displayed first; directory and non-directory operands are sorted separately and in lexicographical order.

It’s kind of formal language, but if you take the time to read it, I don’t think it should be more difficult to understand than my own writing. Nonetheless, to paraphrase, ls will list the names of any files you specify. If you list any directories, it will list the files within those directories. Finally, it explains the result of our first example where we used ls by itself: if no arguments are given, it will list the files of the current directory (by default, your home folder if you just opened a shell). The listed results are alphabetized.
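If you would like to watch those rules in action without touching your own files, here is a small sketch. The folder and file names (notes.txt, src and so on) are made up for the example, and mktemp -d simply gives us a throwaway directory to play in.

```shell
# Build a throwaway directory so the example does not touch real files.
demo=$(mktemp -d)
mkdir "$demo/src"
touch "$demo/notes.txt" "$demo/src/main.c" "$demo/src/tree.h"
cd "$demo" || exit 1

# A file operand: ls prints the name of the file itself.
file_listing=$(ls notes.txt)
echo "$file_listing"

# A directory operand: ls prints the names of the files inside it.
dir_listing=$(ls src)
echo "$dir_listing"

# No operands at all: ls lists the current directory.
cwd_listing=$(ls)
echo "$cwd_listing"

# Clean up the throwaway directory.
cd / && rm -rf "$demo"
```

The first call prints notes.txt, the second prints main.c and tree.h, and the third prints notes.txt and src, each name on its own line because the output is being captured rather than shown in columns on a terminal.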

Okay, cool. But what really happens here??

Surely you didn’t think I went through all the trouble of explaining what typing ls does to explain something you could figure out from the first two screenshots. Nah. If you’re going to use a shell, you might as well know how it works.

I mentioned the term arguments before. Bash detects arguments much the same way you detect separate words: it looks at a string of words and splits them up by the spaces between the words.

In actuality, ls itself is an argument that we feed to the command line. That means in the example command ls example.txt , there are two arguments:

ls is arg 0, and example.txt is arg 1 (presumably a file).
In the computer field, we start numbering from 0 instead of 1 by convention.

The first argument, arg 0, is special: it names the command to run. For a command like ls, that means an executable file. Most of the time you type a word and hit Enter in Bash, you are actually running another program, starting it from the shell. (A few commands, such as cd, are built into Bash itself.) Wow, who knew, right?
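Here is one way to peek at that numbering yourself. This is only a sketch: nothing below actually runs ls, the words are just handed over as arguments so we can print them. It leans on the fact that bash -c treats the words after the quoted script as arg 0, arg 1 and so on.

```shell
# The words after the quoted script become $0, $1, ... inside it.
numbered=$(bash -c 'echo "arg 0: $0"; echo "arg 1: $1"' ls example.txt)
echo "$numbered"

# Extra spaces between words do not create extra arguments.
# "set --" only stores the words as arguments; it does not run them.
set -- ls      example.txt
word_count=$#
echo "words: $word_count"
```

The first echo prints "arg 0: ls" and "arg 1: example.txt", and the last line prints "words: 2" no matter how many spaces sit between the two words.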

You can find out where this program is located by using the command which (type man which for more information):

/ᐠ。.。ᐟ\ electra [n_trees] 11:05 PM $ which ls
/bin/ls

This means “ls” is a file located in a folder named “bin”. “bin” in turn is located in the root folder of your system (which ultimately houses all of your files and folders), symbolized by the leading slash. And of course, which itself is a program:

/ᐠ。.。ᐟ\ electra [n_trees] 11:29 PM $ which which

These are called absolute paths, because they give the path to the file starting from the root folder, instead of relative to your current directory. An example of the latter would be example.txt, which is interpreted as a file in your current directory (n_trees in the example). An easy way to remember is that absolute paths will start with a /.
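The "starts with a /" rule is simple enough to write down as a tiny test. The helper name is_absolute is made up for this sketch; it is not a built-in command.

```shell
# Hypothetical helper: succeeds when the path starts with a slash.
is_absolute() {
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

for candidate in /bin/ls example.txt src/main.c; do
  if is_absolute "$candidate"; then
    echo "$candidate is absolute"
  else
    echo "$candidate is relative"
  fi
done
```

Only /bin/ls is reported as absolute. The other two are resolved starting from whatever folder you happen to be in.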

You can run these commands directly by typing the absolute path to the program as the first argument:

/ᐠ。.。ᐟ\ electra [n_trees] 01:36 AM $ /bin/ls /
Applications etc
Library home
Network installer.failurerequests
System net
Users private
Volumes sbin
bin tmp
cores usr
dev var
/ᐠ。.。ᐟ\ electra [n_trees] 01:36 AM $ /bin/ls /bin
[ df launchctl pwd tcsh
bash domainname link rcp test
cat echo ln rm unlink
chmod ed ls rmdir wait4path
cp expr mkdir sh zsh
csh hostname mv sleep
date kill pax stty
dd ksh ps sync
/ᐠ。.。ᐟ\ electra [n_trees] 01:36 AM $

Which raises the question, how does Bash know where to find the program you are executing if you don’t give it the absolute path? In our case, it’s not like ls is located in the current folder.

The answer lies in a special environment variable called PATH .

Environment Variables

To sum it up quickly, environment variables hold information about the environment that Bash and the programs it starts run in. You can set your own shell variables and environment variables, but that’s a topic for another time (see further reading listed at the bottom).

You can print environment variables by using the command env (FMI: man env):

/ᐠ。.。ᐟ\ electra [n_trees] 11:53 PM $ env

Whoa, that’s a lot of information. We’re only concerned with PATH. To print the value of a single shell or environment variable, you can use the echo command (FMI: … you get the drill) with a dollar sign in front of the variable’s name to expand it:

/ᐠ。.。ᐟ\ electra [n_trees] 11:57 PM $ echo $PATH

Notice anything here? The folders where which and ls were located (as well as echo and env if you would like to check) are listed in the value for PATH, separated by colons along with some other absolute paths.
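A long PATH is easier to read with one folder per line. The sketch below uses a made-up value called sample_path so the result is predictable; your own PATH will contain different folders.

```shell
# A made-up PATH value for illustration; yours will differ.
sample_path="/usr/local/bin:/usr/bin:/bin"

# Swap every colon for a newline so each folder lands on its own line.
folders=$(printf '%s\n' "$sample_path" | tr ':' '\n')
echo "$folders"

# The same trick works on your real PATH.
printf '%s\n' "$PATH" | tr ':' '\n'
```

The order matters: the folders are searched from the top of that list to the bottom.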

What Bash does with the first argument you give it, if it is a bare name rather than a path, is check each of the directories listed in PATH, in order, for an executable file by that name. If it finds none, it prints this:

/ᐠ。.。ᐟ\ electra [n_trees] 12:07 AM $ sunshinepie
-bash: sunshinepie: command not found

This means that, even if we moved ls from /bin/ls to our current folder and tried to invoke it by typing just its name, we would get this:

/ᐠ。.。ᐟ\ electra [n_trees] 12:07 AM $ ls
-bash: ls: command not found

That’s because Bash only checks the folders listed in PATH and our current folder, n_trees, is not listed there. (Typing ./ls would work, because a name that contains a slash is treated as a path and is not looked up in PATH.) We could totally add n_trees to PATH, but like I said, I don’t want to stray from the topic too much, so I’ll move on.
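To make the search concrete, here is a rough imitation of it written as a shell function. This is a sketch of the idea, not Bash’s real implementation (Bash also remembers where it found commands, among other things). The name find_in_path, the folder /nonexistent and the sunshinepie script are all invented for the example.

```shell
# Hypothetical helper: look for an executable called $1 in the
# colon-separated folder list $2 (defaults to the real PATH).
find_in_path() {
  name=$1
  dirs=${2:-$PATH}

  # A name containing a slash is already a path; PATH is not consulted.
  case "$name" in
    */*)
      if [ -x "$name" ]; then echo "$name"; return 0; fi
      return 127 ;;
  esac

  old_ifs=$IFS
  IFS=:
  set -f          # keep any * in a folder name from being expanded
  set -- $dirs    # split the list on colons into separate words
  set +f
  IFS=$old_ifs

  for dir in "$@"; do
    [ -n "$dir" ] || dir=.   # an empty entry means the current folder
    if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
      echo "$dir/$name"
      return 0
    fi
  done
  echo "$name: command not found" >&2
  return 127
}

# Try it on a throwaway folder that holds one executable file.
demo=$(mktemp -d)
printf '#!/bin/sh\necho pie\n' > "$demo/sunshinepie"
chmod +x "$demo/sunshinepie"

found=$(find_in_path sunshinepie "/nonexistent:$demo")
echo "$found"

find_in_path sunshinepie "/nonexistent" || missing_status=$?
echo "exit status: $missing_status"

rm -rf "$demo"
```

The first search prints the full path to sunshinepie inside the throwaway folder. The second finds nothing, complains on the error stream, and reports exit status 127, which is the same status Bash uses for "command not found".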

What about wildcards?

The remaining arguments are passed to ls to deal with. But before Bash passes these arguments to the new program, it interprets any special characters.

The $ is one special character in Bash which translates the names of variables into their values. * is another one. Let’s use the echo command to find out how Bash interprets it:

/ᐠ。.。ᐟ\ electra [n_trees] 01:00 AM $ echo  *
a.out free_str_array.o main.c ntree_free.o ntree_insert.c ntree_print.o string_split.o tree.h tree.h.gch

These are all the files in our current directory. But as we saw earlier, you can add characters onto the wildcard:

/ᐠ。.。ᐟ\ electra [n_trees] 01:03 AM $ echo *.c
main.c ntree_insert.c

Bash looks at the files in your current directory and expands the pattern into the names of all the files that match it, where the * can stand for any sequence of characters (including none). You can also put characters in front of the *:

/ᐠ。.。ᐟ\ electra [n_trees] 01:06 AM $ echo m*.c
main.c

Earlier we saw ls can take multiple arguments, including multiple files. To be honest, Bash doesn’t really care what ls does. ls is the one in charge of taking the arguments that are passed to it, detecting if there are any options and what to do with those, and finding out if the files listed exist in the directory and listing them alphabetically.

On the other hand, ls does not want to bother figuring out what files match an asterisk and how to separate arguments. That’s Bash’s job, and he, she or they, whichever personification you prefer, comes earlier in the assembly line.
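You can check this division of labor with a stand-in program that only reports what it was handed. The function name count_args is made up for this sketch, and the file names are borrowed from the listings above.

```shell
# Hypothetical stand-in for a program like ls: it only reports
# how many arguments it received and what they were.
count_args() {
  echo "$# arguments: $*"
}

demo=$(mktemp -d)
cd "$demo" || exit 1
touch a.out main.c ntree_insert.c tree.h

# The shell expands the pattern first; count_args never sees the *.
all_c=$(count_args *.c)
echo "$all_c"

m_only=$(count_args m*.c)
echo "$m_only"

# Quoting the pattern switches the expansion off.
quoted=$(count_args '*.c')
echo "$quoted"

cd / && rm -rf "$demo"
```

This prints "2 arguments: main.c ntree_insert.c", then "1 arguments: main.c", then "1 arguments: *.c". Only in the quoted case does the program receive the asterisk itself.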

To summarize:

  1. We pass a string, “ls *.c”, to Bash via the command line
  2. Bash divides the string at the whitespace into the following list of words: “ls” and “*.c”
  3. Bash expands the special character, so the list becomes: “ls”, “main.c”, and “ntree_insert.c”
  4. Bash detects the first argument is not a path, and searches for a file by that name in the folders listed in $PATH
  5. Bash finds ls in /bin/ and executes the program, passing it the remaining arguments “main.c” and “ntree_insert.c”
  6. ls takes those arguments, checks that each file exists in the current directory, and prints their names
  7. Bash prints the prompt again and waits for your next command.
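The whole trip can be traced by hand in a short script. Treat this as a loose imitation under stated assumptions rather than what Bash does internally: it borrows the shell’s own splitting and wildcard expansion by leaving $line unquoted, and it uses command -v as a stand-in for the PATH search. The variable names and sample files are made up.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch main.c ntree_insert.c tree.h

line='ls *.c'            # the text typed at the prompt

set -f                   # wildcards off, so we can watch the splitting alone
set -- $line             # divide the line at the whitespace
set +f
words="$*"
echo "words: $words"

set -- $line             # wildcards on: the pattern becomes file names
expanded="$*"
echo "expanded: $expanded"

name=$1                  # arg 0 is the command name
shift                    # what is left are the arguments for ls
program=$(command -v "$name")   # look the name up in PATH
echo "program: $program"

listing=$("$program" "$@")      # run it with the remaining arguments
echo "$listing"

cd / && rm -rf "$demo"
```

The script prints "words: ls *.c", then "expanded: ls main.c ntree_insert.c", then the location of ls on your system, and finally the two file names that ls was asked to list.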

If you think about it, the shell is pretty smart (or dumb, depending on the way you look at it). It doesn’t memorize how to handle all of the possible input you may give it, but delegates the work to programs that it can find when you instruct it to carry out a command. And you, by knowing how it works, can call yourself smart too!

Further reading: