The Linux Filesystem: A Thorough Walk-through

Soufiane Ezzaim
ENSIAS IT
Apr 14, 2020

Back in ~2015 I learned how to install software on my new Kali Linux (yes, I did switch to Linux just to crack my neighbour’s wifi password) before even knowing what a file system was. Programs would just magically work even though I didn’t have a clue where the actual executable files landed.

Linux was just not as intuitive and user-friendly as I thought. Maybe the author is a grumpy old man and resents new kids and their pretty graphical tools, or maybe I was on edge because I didn’t crack that wifi password.

I soon realized /etc is not for miscellaneous files, /usr is not for user files, and /bin is not a trash can. That’s when I knew I needed to spend some time getting a handle on how the directories were organized and what all their exotic names meant.

This article will help you get up to speed faster than I did.

Every general-purpose computer needs to store data of various types on a hard disk drive (HDD) or something equivalent. There are a couple of reasons for this, but the main one is that RAM loses its contents when the computer is switched off. There are non-volatile types of RAM that keep their data after power is removed, but they are pretty expensive.

Therefore, we need a logical method of organizing and storing large amounts of information in a way that makes it easy to manage. All data in Unix is organized into files. All files are organized into directories. These directories are organized into a tree-like structure called the “Filesystem”.

Definition

When you look for an “official” definition of what a file system is, things can get pretty confusing. You may hear people talk about filesystems in a number of different ways. The word itself can have multiple meanings, and you may have to discern the correct meaning from the context of a discussion or document.

Here I will summarize three of the most used definitions in books and official documentation.

  1. A specific type of data storage format, such as EXT3, EXT4, BTRFS, XFS, and so on. Linux supports ~100 types of filesystems. Each of these filesystem types uses its own metadata structures to define how the data is stored and accessed.
  2. The entire Linux directory structure starting at the top (/) root directory.
  3. A partition or logical volume formatted with a specific type of filesystem that can be mounted on a specified mount point on a Linux filesystem.

Image courtesy of G4G
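
To make the three meanings a bit more concrete, here is a minimal shell sketch. It assumes a Linux system with GNU coreutils (stat and df); the helper names fs_type and fs_mount_point are made up for this example and are not standard commands.

```shell
# Meaning 1: the storage format backing a path (ext4, xfs, btrfs, tmpfs ...).
# fs_type is a hypothetical helper name; "stat -f -c %T" is GNU stat syntax.
fs_type() {
    stat -f -c %T "$1"
}

# Meaning 3: the mount point of the partition or volume that holds a path.
# "df -P" prints a header line, then one line for the filesystem; the mount
# point is the sixth column (this simple parse assumes it has no spaces).
fs_mount_point() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Meaning 2: both paths below live in the same directory tree rooted at /,
# even if they happen to be stored on different filesystems.
echo "/    is stored as $(fs_type /) and mounted on $(fs_mount_point /)"
echo "/tmp is stored as $(fs_type /tmp) and mounted on $(fs_mount_point /tmp)"
```

If the two lines report different types or mount points, you are looking at two filesystems (meanings 1 and 3) attached to one filesystem (meaning 2).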

Main functions of a filesystem

The filesystem is designed to provide space and structure for the storage of non-volatile data; that is its ultimate function. However, disk storage is a necessity that brings with it some interesting and inescapable details. There are many other important functions and requirements to be met.

1 ) Namespace :

All filesystems need to provide a “namespace”. This defines how files can be named, specifically the maximum length of a filename and the subset of characters that can be used in it.
It also defines the logical structure of the data on a disk, namely the use of directories for organizing files instead of just lumping them all together.
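
You can ask a filesystem about its naming rules directly. The sketch below assumes a Linux system where getconf, mktemp, head and tr are available; it works in a throwaway directory and the variable names are just for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d)

# Ask the filesystem holding $dir for its filename length limit, in bytes.
limit=$(getconf NAME_MAX "$dir")
echo "Longest filename allowed here: $limit bytes"

# Build a name that is one byte too long and try to create a file with it.
long_name=$(head -c "$((limit + 1))" /dev/zero | tr '\0' 'a')
if touch "$dir/$long_name" 2>/dev/null; then
    result="accepted"
else
    result="rejected"   # the usual error is "File name too long"
fi
echo "A $((limit + 1))-byte name was $result"

rm -rf "$dir"
```

As for the allowed characters, Linux filenames can contain any byte except the slash (/), which separates directories, and the NUL byte.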

2 ) Metadata Structure :

This is necessary to provide the logical foundation for the OS. It includes :

  1. Structures required to support a hierarchical directory structure
  2. Structures to determine which blocks of space on the disk are used and which are available
  3. Structures that allow for maintaining the names of the files and directories
  4. Information about the files such as their size and times they were created, modified or last accessed, and the location or locations of the data belonging to the file on the disk.
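
Most of this metadata can be inspected with stat. A small sketch, assuming GNU stat (the -c and -f options differ on other systems) and a temporary file created only for the demonstration:

```shell
file=$(mktemp)
printf 'hello\n' > "$file"

# Per-file metadata: size, inode number, allocated blocks and timestamps.
# %w (creation time) prints "-" where the filesystem does not record it.
stat -c 'size:     %s bytes
inode:    %i
blocks:   %b
created:  %w
modified: %y
accessed: %x' "$file"

# Filesystem-wide metadata: how many blocks exist and how many are free.
stat -f -c 'block size: %S bytes, total blocks: %b, free blocks: %f' "$file"

size=$(stat -c %s "$file")
rm -f "$file"
```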

Other metadata is used to store high-level information about the subdivisions of the disk, such as logical volumes and partitions. This higher-level metadata and the structures it represents contain the information describing the filesystem stored on the drive or partition, but is separate from and independent of the filesystem metadata.

3 ) API ( Application Programming Interface ) :

The API provides access to system function calls that manipulate filesystem objects like files and directories. It covers tasks such as creating, moving, and deleting files. The filesystem also provides algorithms that determine things like where a file is placed on disk (to optimise speed or minimize disk fragmentation).
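
Everyday commands are thin wrappers around those calls. The sketch below runs in a temporary directory; the system call named in each comment is a rough pointer to what the tool ends up calling, not an exact trace, and the file names are made up.

```shell
dir=$(mktemp -d)

touch "$dir/notes.txt"                   # create a file   (open with a create flag)
mv "$dir/notes.txt" "$dir/renamed.txt"   # rename it       (rename)
mkdir "$dir/archive"                     # new directory   (mkdir)
mv "$dir/renamed.txt" "$dir/archive/"    # move it inside  (rename again)
ls -R "$dir"                             # read directory contents
rm "$dir/archive/renamed.txt"            # delete the file (unlink)
rmdir "$dir/archive"                     # remove the empty directory (rmdir)

# Count what is left, then clean up the scratch directory itself.
remaining=$(ls -A "$dir" | wc -l)
rmdir "$dir"
```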

4 ) Security :

The filesystem defines access rights to files and directories. The Linux filesystem security model helps to ensure that users can only access the files they are permitted to: their own, but not those of other users or of the operating system itself, unless access has been granted.
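
Here is a minimal sketch of those access rights in action, assuming GNU stat and a temporary file that exists only for the demonstration:

```shell
file=$(mktemp)

chmod 640 "$file"      # owner: read + write, group: read, others: nothing
stat -c '%A %a %U:%G %n' "$file"
mode=$(stat -c %a "$file")

chmod go-rwx "$file"   # take away every permission from group and others
stat -c '%A %a %U:%G %n' "$file"
private_mode=$(stat -c %a "$file")

rm -f "$file"
```

The first column printed by stat is the same rwx string you see in ls -l, and the second is its octal form.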

Directory Structure

Files in a Unix system are organized into a multi-level hierarchy known as a directory tree. At the very top of the file system is a directory called “root”, which is represented by a “/”. All other files are “descendants” of that root.

And because the title promised a “Thorough Walk-through”, we’ll be exploring each of these directories and what their purpose is.

It only makes sense to explore the Linux filesystem from the terminal, so we’ll be using the “tree” package for that.
For Debian-based distros (Ubuntu, Kali, Elementary ..) :

$ sudo apt-get install tree

Or if you are a fellow intellectual, for Arch-like Distros :

$ sudo pacman -S tree

Then, run :

$ tree -L 1 /

You should see something like this :

If you were to run tree / it would show you the tree of your whole filesystem, every single file you have in your OS. -L 1 limits the output to level 1, and / tells tree to start at the root.
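
If you can’t install tree, a rough equivalent can be put together with find. This is only a sketch: top_level is a made-up helper name, and unlike tree it also lists hidden entries and draws no branches.

```shell
# top_level is a hypothetical helper: print the entries directly under a
# directory (default /), one per line, sorted by name.
top_level() {
    find "${1:-/}" -mindepth 1 -maxdepth 1 | sort
}

top_level /
```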

To get started, let’s look at what each directory is used for. While we go through each, you can peek at their contents using ls.

A brief description of all directories and files :

  • / : The slash / character alone denotes the root of the filesystem tree, the top-level directory of the filesystem. It must include all of the executables and libraries required to boot the remaining filesystems.
  • /bin : Stands for “binaries” and contains certain fundamental utilities, such as ls or cp, which are generally needed by all users, basic tools for making and removing files and directories, moving them around, and so on. There are more bin directories in other parts of the file system tree, but we’ll be talking about those in a minute.
  • /boot : Contains all the files that are required for a successful boot process: the static bootloader, the kernel executables, and the configuration files required to boot a Linux computer. DO NOT TOUCH! If you mess up one of the files in here, you may not be able to run your Linux and it is a pain to repair.
  • /dev : Stands for “devices”. Contains file representations of peripheral devices and pseudo-devices. These are not device drivers, rather they are files that represent each device. For example, if you plug a USB drive into your machine, a new device entry will automatically pop up here.
  • /etc : Contains system-wide configuration files and system databases. Gets its name from “et cetera” because it was the dumping ground for system files administrators were not sure where else to put. But now it contains most, if not all system-wide configuration files. For example, the files that contain the name of your system, the users and their passwords, the names of machines on your network and when and where the partitions on your hard disks should be mounted are all in here. Again, if you are new to Linux, it may be best if you don’t touch too much in here until you have a better understanding of how things work.
  • /home : Contains the home directories for the users. Each user has a subdirectory in /home. It’s where you will find your users’ personal directories.
  • /lib : Contains system libraries, and some critical files such as kernel modules or device drivers. Libraries are files containing code that your applications can use. Snippets of code that applications use to draw windows on your desktop, control peripherals, or send files to your hard disk …
    There are more lib directories scattered around the file system, but the one hanging directly off of / contains the all-important kernel modules. The kernel modules are drivers that make things like your video card, sound card, WiFi, printer, and so on, work.
  • /media : Default mount point for removable devices, such as USB sticks, media players, etc.
  • /mnt : Stands for “mount”. Contains filesystem mount points. These are used if the system uses multiple hard disks or hard disk partitions. It is also often used for remote (network) filesystems, CD-ROM/DVD drives, and so on. This is where you would manually mount storage devices or partitions. It is not used very often nowadays.
  • /opt : Optional files such as vendor-supplied application programs should be located here. It is also often where software you compile (that is, software you build yourself from source code instead of installing from your distribution repositories) lands. Applications will end up in the /opt/bin directory and libraries in the /opt/lib directory.
  • /proc : Like /dev, it’s a virtual directory. It contains information about your system, such as the CPU and the kernel your Linux system is running. As with /dev, the files and directories are generated when your computer starts, or on the fly, as your system is running and things change.
  • /root : This is not the root (/) filesystem. It is the home directory for the root user, the system administrator. It is separate from the rest of the users’ home directories because, again, YOU ARE NOT MEANT TO TOUCH IT.
    (It is there in case specific maintenance needs to be performed.)
  • /run : System processes use it to store temporary data. This is another one of those DO NOT TOUCH folders.
  • /sbin : Similar to /bin, but it contains applications that normally only the superuser will need (the s stands for “system”, as in system binaries). You can use these applications with the sudo command, which temporarily grants you superuser powers. /sbin typically contains tools that can install, delete and format stuff. As you can imagine, some of these instructions are lethal if you use them improperly, so handle with care. Needless to say, stay away if you are new to Linux.
  • /snap : This one shows up because the shot was captured on a Debian-based system that uses snaps. Snap is a relatively recent way of distributing software. The /snap directory contains all the files and the software installed from snaps.
  • /srv : Contains data for servers. If you are running a web server from your computer, your HTML files would go into /srv/http (or /srv/www). If you were running an FTP server, your files would go into /srv/ftp.
  • /sys : Another virtual directory, like /proc and /dev. It also contains information about the devices connected to your computer.
  • /tmp : A place for temporary files, usually placed there by applications that you are running. Many systems clear this directory upon startup; it might have tmpfs mounted atop it, in which case its contents do not survive a reboot. /tmp is one of the few directories hanging off / that you can actually interact with without becoming superuser.
  • /usr : Originally the directory holding user home directories, but its use has changed. Now /home is where users keep their stuff, as we saw above, and /usr holds executables, libraries, documentation, icons and shared resources that are not system critical. You will also find bin, sbin and lib directories in /usr. The question is, how do they differ from their root-hanging cousins?
    In the early UNIX days, /bin (hanging off of root) would contain basic system commands, like ls, mv and rm, while /usr/bin would contain stuff the users would install and run, like word processors, web browsers, and other apps.
    But nowadays, many modern Linux distributions just put everything into /usr/bin and have /bin point to /usr/bin (just in case erasing it completely would break something). Arch and its derivatives have one “real” directory for binaries, /usr/bin, and the rest of the bins are “fake” directories (symbolic links) that point to /usr/bin. Debian and its derivatives (Ubuntu, Mint ..) kept everything separate for a long time, but their newer releases have been moving to the same merged layout.
  1. /usr/bin : This directory stores the binary programs that were installed for users and are not basic for the OS.
  2. /usr/include : Stores the development headers used throughout the system. Header files are mostly used by the #include directive in C/C++ programming language.
  3. /usr/lib : Stores the required libraries and data files for programs stored within /usr or elsewhere.
  • /var : Short for “variable”. It was given its name because its contents were deemed variable, in that they change frequently. It holds MySQL and other database files, web server data files, email inboxes, and much more. /var also contains things like logs, in the /var/log subdirectories. Logs are files that register events that happen on the system. If something fails in the kernel, it will be logged in a file in /var/log; if someone tries to break into your computer from outside, your firewall will also log the attempt here.
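
Several of the directories above are not what they seem: some are symbolic links, and entries in /dev are device files rather than regular ones. The sketch below lets you check on your own machine. kind_of is a made-up helper name, and the list of paths is just a sample; any path that does not exist on your system is reported as missing.

```shell
# kind_of is a hypothetical helper that names what a path really is.
# The symlink test comes first because -d and -f follow symlinks.
kind_of() {
    if   [ -L "$1" ]; then echo "symlink -> $(readlink "$1")"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -c "$1" ]; then echo "character device"
    elif [ -b "$1" ]; then echo "block device"
    elif [ -f "$1" ]; then echo "regular file"
    elif [ -e "$1" ]; then echo "other"
    else                   echo "missing"
    fi
}

for path in /bin /sbin /lib /usr/bin /dev/null /proc/version /tmp; do
    printf '%-14s %s\n' "$path" "$(kind_of "$path")"
done
```

On a distribution with a merged /usr, /bin should show up as a symlink pointing at usr/bin; on an older layout it is a plain directory.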

Your system may have some more directories we haven’t mentioned above. Be sure to refer to the official Linux Filesystem Hierarchy Standard (FHS) web page for details about each of these directories and their many subdirectories. This standard should be followed as closely as possible to ensure operational and functional consistency. Regardless of the filesystem types used on a host, this hierarchical directory structure is the same.

We just covered level 1 of the root directory. To dig deeper, many of the subdirectories lead to their own set of files and subdirectories. The image below should give you an overall idea of what the basic file system tree looks like.

Courtesy of Paul Gardne, under a CC By-SA license

Conclusion

Although there are minor differences between Linux distributions, the layouts of their filesystems are mercifully similar. So once you know one, you should know them all. And the best way to know your filesystem is to explore it. So go ahead and experiment a little with tree, ls, and cd.

I hope that some of the possible confusion surrounding the term filesystem has been cleared up by this article. It took a long time and a very helpful mentor for me to truly understand and appreciate the complexity, elegance, and functionality of the Linux filesystem in all of its meanings.

If you have questions, please add them to the comments below and I will try to answer them.

And if you are new to UNIX and BASH, this 4-part course may interest you!
