File systems — An in-depth intro

Emmanuel Bakare
Published in Consonance Club
May 17, 2018

This isn’t a story about cabinets. File systems aren’t real cabinets, but you could think of them that way.

First off, what’s a filesystem?

In computing, a file system or filesystem is used to control how data is stored and retrieved. Without a file system, information placed in a storage medium would be one large body of data with no way to tell where one piece of information stops and the next begins. By separating the data into pieces and giving each piece a name, the information is easily isolated and identified. Taking its name from the way paper-based information systems are named, each group of data is called a “file”. The structure and logic rules used to manage the groups of information and their names is called a “file system”.

Source: https://en.wikipedia.org/wiki/File_system

Here’s what all the definitions mean in real-world terms:

We’ve all held flash drives, which are basically hardware. When we plug a flash / USB device into our computers, the OS / BIOS (for those who’ve tried booting OSes from devices) loads the drivers for the USB device, which lets us access it. At that point the device is loaded but not yet usable.

The file system is what then determines whether the device can be used by our OS / BIOS. File systems, though we might not think of them this way, are basically code that knows how to access storage.

Here’s an example of the Ext4 file system which is popular for those who use Linux:

That’s an example of a file system, and it’s possible to create a custom one. I’ll be doing that in a separate series, as this is an intro.

Before we get to the code part, we first need to understand some basics about what disks actually are.

On the hardware side, all disks consist of blocks, and these blocks are in turn composed of sectors.

A sector is a physical spot on a formatted disk that holds information. When a disk is formatted, tracks are defined (concentric rings from inside to the outside of the disk platter). Each track is divided into a slice, which is a sector. On hard drives and floppies, each sector can hold 512 bytes of data.
A block, on the other hand, is a group of sectors that the operating system can address (point to). A block might be one sector, or it might be several sectors (2, 4, 8, or even 16). The bigger the drive, the more sectors that a block will hold.

Source: http://www.alphaurax-computer.com/computer-tips/hard-drive-knowledge-blocks-vs-sectors

When we load a file system on a device, we basically tell the OS that each file we store on the device should have its parts stored in one or more blocks. Sectors always have a fixed size, which depends on the type of device we use (flash drive, floppy :-| etc.). The file system includes other features too, but this is the most basic of them.
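To make the block idea concrete, here is a small Python sketch of the arithmetic involved. The 512-byte sector size comes from the definition quoted above; the choice of 8 sectors per block and the helper names `blocks_needed` and `wasted_bytes` are assumptions made purely for illustration.

```python
SECTOR_SIZE = 512        # bytes per sector, as in the definition quoted above
SECTORS_PER_BLOCK = 8    # assumed for illustration; real file systems vary
BLOCK_SIZE = SECTOR_SIZE * SECTORS_PER_BLOCK  # 4096 bytes


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Return how many whole blocks a file of file_size bytes occupies."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    # Integer ceiling division: a partly filled block still counts as a full one.
    return -(-file_size // block_size)


def wasted_bytes(file_size, block_size=BLOCK_SIZE):
    """Return the bytes allocated to the file but left unused in its last block."""
    return blocks_needed(file_size, block_size) * block_size - file_size


if __name__ == "__main__":
    for size in (1, 4096, 10_000):
        print(size, "bytes ->", blocks_needed(size), "block(s),",
              wasted_bytes(size), "bytes unused")
```

With these assumed sizes, a 10,000-byte file takes three 4,096-byte blocks and leaves 2,288 bytes of the last block unused, which is why the block size a file system picks matters for drives full of small files.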

So for storage devices, our file system lies on a partition.

A definition of a partition is shown below:

A partition is a region or combination of two or more regions on a hard disk or other secondary storage, so that an operating system can manage information in each region separately. It is typically the first step of preparing a newly manufactured disk, before any files or directories have been created. The disk stores the information about the partitions’ locations and sizes in an area known as the partition table that the operating system reads before any other part of the disk. When a hard drive is installed in a computer, it must be partitioned before you can format and use it.

Source: https://en.wikipedia.org/wiki/Disk_partitioning
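As a rough illustration of what “the operating system reads the partition table” means, the sketch below parses a classic MBR-style partition table in Python. It assumes the traditional MBR layout (four 16-byte entries starting at byte 446 of the first sector, followed by the `0x55 0xAA` signature); newer disks often use GPT instead, which this sketch does not cover. The sample disk contents and the helper names `build_sample_mbr` and `read_partitions` are made up for the example.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # the partition table starts here in a classic MBR
ENTRY_SIZE = 16
# status, CHS start (ignored), type, CHS end (ignored), first LBA, sector count
ENTRY_FORMAT = "<B3sB3sII"


def build_sample_mbr():
    """Build a fake 512-byte MBR holding two partitions (made-up values)."""
    mbr = bytearray(SECTOR_SIZE)
    entries = [
        (0x80, 0x07, 2048, 204800),    # bootable, type 0x07 (NTFS/exFAT)
        (0x00, 0x83, 206848, 409600),  # not bootable, type 0x83 (Linux)
    ]
    for i, (status, ptype, first_lba, count) in enumerate(entries):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        mbr[offset:offset + ENTRY_SIZE] = struct.pack(
            ENTRY_FORMAT, status, b"\0\0\0", ptype, b"\0\0\0", first_lba, count
        )
    mbr[510:512] = b"\x55\xaa"  # boot signature that marks a valid MBR
    return bytes(mbr)


def read_partitions(mbr):
    """Return a list of dicts describing the used entries in the table."""
    if len(mbr) < SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        status, _, ptype, _, first_lba, count = struct.unpack_from(
            ENTRY_FORMAT, mbr, offset
        )
        if ptype == 0:  # type 0 means the slot is unused
            continue
        partitions.append({
            "bootable": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "size_bytes": count * SECTOR_SIZE,
        })
    return partitions


if __name__ == "__main__":
    for part in read_partitions(build_sample_mbr()):
        print(part)
```

The sketch works on an in-memory buffer rather than a real device, so it can be run safely without touching any disk.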

Here’s an example of the partitions and file systems on my own computer.

Local partitions and file systems (Contents in this case is the file system, and Volumes shows the partitions in use)

When we finally create a file system on a partition, we write data to a special area of it called the superblock. The superblock is the section of the partition that holds information about the file system currently on it.

A superblock is a record of the characteristics of a file system, including its size, the block size, the empty and the filled blocks and their respective counts, the size and location of the inode tables, the disk block map and usage information, and the size of the block groups.

Inodes are where details about the files on the file system get stored, so they’re kind of like a superblock, but for individual files.

Source: http://www.linfo.org/superblock
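To show what a few of those superblock fields look like in practice, here is a hedged Python sketch that reads them from an ext2/3/4-style superblock. It assumes the ext on-disk layout (primary superblock 1024 bytes into the partition, little-endian fields, magic number `0xEF53` at offset 56) and only reads the 32-bit counters at the start of the structure. The image is built in memory with made-up numbers, and `build_sample_image` and `read_superblock` are names invented for this example.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # ext2/3/4 keep the primary superblock 1024 bytes in
SUPERBLOCK_SIZE = 1024
MAGIC_OFFSET = 56         # position of the 16-bit magic number in the superblock
EXT_MAGIC = 0xEF53


def build_sample_image():
    """Build a fake partition image containing only a minimal superblock."""
    image = bytearray(SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE)
    # inodes, blocks, reserved blocks, free blocks, free inodes,
    # first data block, log2(block size) - 10   (all values are made up)
    struct.pack_into("<7I", image, SUPERBLOCK_OFFSET,
                     65536, 262144, 13107, 200000, 65000, 0, 2)
    struct.pack_into("<H", image, SUPERBLOCK_OFFSET + MAGIC_OFFSET, EXT_MAGIC)
    return bytes(image)


def read_superblock(image):
    """Return a dict with a few basic fields from an ext-style superblock."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE]
    if len(sb) < SUPERBLOCK_SIZE:
        raise ValueError("image too small to hold a superblock")
    (magic,) = struct.unpack_from("<H", sb, MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        raise ValueError("no ext superblock magic found")
    (inodes, blocks, _reserved, free_blocks,
     free_inodes, _first_data, log_block_size) = struct.unpack_from("<7I", sb, 0)
    return {
        "inodes": inodes,
        "blocks": blocks,
        "free_blocks": free_blocks,
        "free_inodes": free_inodes,
        "block_size": 1024 << log_block_size,  # stored as a power-of-two shift
    }


if __name__ == "__main__":
    info = read_superblock(build_sample_image())
    used = info["blocks"] - info["free_blocks"]
    print(info)
    print("used blocks:", used)
```

Other file systems keep their equivalent metadata in different places and formats, so this layout should not be read as universal.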

When we plan to use the drive, the OS looks at the superblock to tell what kind of file system the drive is using. Some of us might have seen types like FAT32 and NTFS (NT File System), which are very popular on Windows. If the OS reads the superblock and finds that a driver for that file system is available, it reads the device using the driver, which then allows us to mount the device and use it any way we like.

For the OS to be able to mount the flash drive, the file system driver must be installed on our OS, which is why file system support is limited depending on the OS.

For example, many of us using Linux might have wondered why we can’t see our Linux partition from Windows. This is because Windows doesn’t natively support the Ext4 file system.

For those willing to try it, download the drivers from here if you plan to mount your Linux drives on Windows:

https://sourceforge.net/projects/ext2fsd/files/Ext2fsd/
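The “identify the file system, then check for a driver” flow described above can be sketched in a few lines of Python. This is a toy model, not how any real OS is implemented: the signature checks assume the commonly documented markers (the `NTFS` OEM ID at byte 3, the `FAT32` label at byte 82 of the boot sector, and the ext magic `0xEF53` at byte 1080), while `SUPPORTED_DRIVERS`, `make_sample`, `detect_filesystem` and `can_mount` are hypothetical names invented for this example.

```python
import struct

# Hypothetical driver list for a Windows-like OS with no ext support installed.
SUPPORTED_DRIVERS = {"ntfs", "fat32"}

EXT_MAGIC_OFFSET = 1024 + 56  # superblock offset plus magic field offset
EXT_MAGIC = 0xEF53


def make_sample(kind):
    """Build a tiny fake partition image carrying one file system signature."""
    image = bytearray(2048)
    if kind == "ntfs":
        image[3:11] = b"NTFS" + b" " * 4
    elif kind == "fat32":
        image[82:90] = b"FAT32" + b" " * 3
    elif kind == "ext":
        struct.pack_into("<H", image, EXT_MAGIC_OFFSET, EXT_MAGIC)
    return bytes(image)


def detect_filesystem(image):
    """Guess the file system from well-known signatures, or return None."""
    if image[3:11] == b"NTFS" + b" " * 4:
        return "ntfs"
    if image[82:90] == b"FAT32" + b" " * 3:
        return "fat32"
    if len(image) >= EXT_MAGIC_OFFSET + 2:
        (magic,) = struct.unpack_from("<H", image, EXT_MAGIC_OFFSET)
        if magic == EXT_MAGIC:
            return "ext"
    return None


def can_mount(image, drivers=SUPPORTED_DRIVERS):
    """A partition is mountable only if its file system has a driver."""
    fs = detect_filesystem(image)
    return fs is not None and fs in drivers


if __name__ == "__main__":
    for kind in ("ntfs", "fat32", "ext"):
        print(kind, "mountable:", can_mount(make_sample(kind)))
    # Installing an ext driver is modelled here by adding it to the set.
    print("ext with driver:", can_mount(make_sample("ext"), SUPPORTED_DRIVERS | {"ext"}))
```

In this model, installing a third-party driver simply adds an entry to the driver set, which is why the same ext image goes from unmountable to mountable.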

In fact, file formats as we know them work a bit like file systems, since different kinds of files store their data in different layouts. We’ll cover more details about this later in the series.

Why use one file system over another then?

Most might not see the point of choosing one file system over another, since the main thing is being able to copy files on and off a flash drive.

Why is there so much fuss about which one is in use?

The reason for this is the purpose for which the file system was created.

As you might have noticed, I’ve mentioned file system compatibility with the operating system a lot. This is because file systems are / were each made to handle a specific task and accomplish it effectively.

For example, a file system called ZFS was built to work on NAS systems and allow data to be copied to multiple drives at the same time with little to no data corruption, so it wasn’t built to be the native file system for a laptop. It could be used on a laptop, but it would not be a good fit (from experience, ZFS is very IO-heavy and tends to increase your disk usage and drain your battery, since it constantly checks for disk errors and repairs them). Ext4, by contrast, doesn’t do what ZFS does; it was built to improve on the Ext2 file system found on older Linux PCs.

What’s next?

We’ll be covering a lot about file systems and Linux in detail over a couple of posts.

I do plan to create a file system. I’ll be doing this on a Linux computer, but I’ll try to see if it’s possible to do that on Windows.

We’ll also be learning about the different kinds of disks and SSDs. Consider this to be the full tutorial on all things storage and Linux related, leading towards cloud computing architecture.

Hope to see you soon!
