Pseudo File Systems in Linux

shashank Jain
5 min read · Aug 11, 2018

In the previous blog (https://medium.com/@jain.sm/a-primer-on-linux-filesystem-2228fafcd39b) we discussed file systems based on block devices. Today we touch upon some of the other filesystems provided by Linux that are not backed by a block device.

"Everything is a file" is the general philosophy of Linux. Working on that premise, some filesystems expose kernel resources through the file interface. We call them pseudo file systems. One such filesystem is procfs.

procfs is mounted on the rootfs under the /proc directory. The data under procfs is not persisted on disk; the kernel generates it in memory whenever a file is read.

Some of the structures exposed via procfs:

/proc/cpuinfo

Provides CPU details such as the number of cores, cache size, and model

/proc/meminfo

Provides information on physical memory

/proc/interrupts

Per-CPU interrupt counts and the handlers registered for each interrupt

/proc/vmstat

Virtual memory stats

/proc/filesystems

Filesystem types registered with the running kernel

/proc/mounts

Current mounts and their devices. The list is specific to the mount namespace of the reading process

/proc/uptime

Time in seconds since the system booted, along with the cumulative idle time

/proc/stat

System-wide statistics such as CPU time, context switches, and boot time

/proc/net

Network-related structures such as TCP and UDP socket tables
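Because these entries are ordinary text files, they can be read with normal file I/O. The following is a minimal sketch, assuming a Linux system with procfs mounted on /proc; the helper names parse_meminfo and parse_uptime are made up for this illustration.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most values are in kB; a few fields (for example HugePages_Total)
    are plain counts without a unit.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def parse_uptime(text):
    """Return (uptime_seconds, idle_seconds) from /proc/uptime-style text."""
    up, idle = text.split()[:2]
    return float(up), float(idle)


if __name__ == "__main__":
    mem = parse_meminfo(Path("/proc/meminfo").read_text())
    up, _idle = parse_uptime(Path("/proc/uptime").read_text())
    print(f"MemTotal: {mem['MemTotal']} kB, up for {up:.0f} s")
```

The parsing is kept separate from the file reads so that it can be tried on captured text as well as on a live system.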

procfs also exposes process-specific information through files under /proc/pid, where pid is the ID of the process:

/proc/pid/cmdline

Command line of the process (program name and arguments)

/proc/pid/environ

Captures environment variables of the process

/proc/pid/mem

Virtual memory of the process

/proc/pid/maps

Shows the virtual memory mappings of the process (address ranges, permissions, and backing files)

/proc/pid/fdinfo

Information about each open file descriptor of the process, such as file offset and flags

/proc/pid/task

Shows details of the threads (tasks) of the process, one subdirectory per thread ID
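The per-process files can be explored the same way. A small sketch, assuming Linux with /proc mounted; it uses /proc/self, which the kernel resolves to the calling process. cmdline and environ are NUL-separated, and /proc/pid/task holds one entry per thread. The function names are hypothetical.

```python
import os
from pathlib import Path


def split_nul(data):
    """Split NUL-separated bytes (the cmdline/environ format) into strings."""
    return [part.decode(errors="replace") for part in data.split(b"\0") if part]


def process_summary(pid="self"):
    """Collect a few per-process facts from /proc/<pid>."""
    base = Path("/proc") / str(pid)
    return {
        "cmdline": split_nul((base / "cmdline").read_bytes()),
        "env_count": len(split_nul((base / "environ").read_bytes())),
        "mappings": len((base / "maps").read_text().splitlines()),
        # One directory per thread; the first thread's ID equals the PID.
        "threads": sorted(int(t.name) for t in (base / "task").iterdir()),
    }


if __name__ == "__main__":
    for key, value in process_summary().items():
        print(key, "=", value)
```

Reading another process's environ or maps may require matching ownership or extra privileges, which is why the sketch defaults to the current process.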

Sysfs File System

Sysfs is also a pseudo filesystem; it exposes hardware and driver information to user space. Sysfs is mounted on the /sys directory of the rootfs. Some aspects of the drivers and hardware can also be controlled by writing to sysfs files.

Sysfs has the following subdirectories, among others:

1. devices: enumerates the devices registered with the kernel

2. bus: has one directory per bus type, each with two further subdirectories, one for devices and the other for drivers. The devices subdirectory represents the devices attached to the bus.

3. class: groups similar devices based on functionality

4. firmware: firmware-related interfaces

5. module: kernel modules and their parameters
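As an illustration of the class grouping, the sketch below lists the entries under /sys/class/net and reads one attribute from each. It assumes sysfs is mounted on /sys; list_class and read_attr are hypothetical helpers, and the sys_root parameter exists only so the code can be tried against a scratch directory.

```python
from pathlib import Path


def read_attr(path):
    """Read one sysfs attribute file and strip the trailing newline."""
    return Path(path).read_text().strip()


def list_class(class_name, sys_root="/sys"):
    """Return the device names grouped under <sys_root>/class/<class_name>."""
    class_dir = Path(sys_root) / "class" / class_name
    if not class_dir.is_dir():
        return []
    return sorted(entry.name for entry in class_dir.iterdir())


if __name__ == "__main__":
    for name in list_class("net"):
        mtu_file = Path("/sys/class/net") / name / "mtu"
        if mtu_file.exists():
            print(name, "mtu =", read_attr(mtu_file))
```

Each sysfs attribute file normally carries a single value, which is why a plain read and strip is enough here.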

Debugfs File System

This pseudo filesystem is conventionally mounted on the /sys/kernel/debug directory and is used by tracing utilities like ftrace to expose function traces.
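Whether and where debugfs is mounted on a given machine can be checked through /proc/mounts. A minimal sketch, assuming the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, options); mounts_of_type is a name invented for this example.

```python
def mounts_of_type(mounts_text, fstype):
    """Return the mount points of the given filesystem type."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Layout: device mountpoint fstype options dump pass
        if len(fields) >= 3 and fields[2] == fstype:
            points.append(fields[1])
    return points


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        found = mounts_of_type(f.read(), "debugfs")
    print(found or "debugfs is not mounted in this namespace")
```

On systems where it is mounted, the result is typically /sys/kernel/debug, and reading files below that directory usually requires root.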

Pipefs File System

This pseudo file system is used for creating the pipe IPC structure. As the name suggests, a pipe in Linux forms a one-way channel that allows inter-process communication.

Pipefs is registered with the VFS but cannot be mounted from user space; the kernel mounts it internally.

Each instance of the pipe is represented by a file structure and thereby has a corresponding inode in memory.
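The inode behind a pipe is visible from user space even though pipefs itself is not. A short sketch, assuming Linux with /proc mounted; describe_pipe is a hypothetical helper.

```python
import os
import stat


def describe_pipe():
    """Create a pipe and report how the kernel identifies it."""
    r, w = os.pipe()
    try:
        st = os.fstat(r)
        return {
            "is_fifo": stat.S_ISFIFO(st.st_mode),
            "inode": st.st_ino,
            # Both ends refer to the same in-memory inode.
            "same_inode": os.fstat(w).st_ino == st.st_ino,
            # procfs shows the descriptor as a link of the form "pipe:[<inode>]".
            "link": os.readlink(f"/proc/self/fd/{r}"),
        }
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(describe_pipe())
```

The link target is not a path that can be opened; it only names the pipefs inode.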

The inode structure (abridged) looks like this:

struct inode {
    umode_t i_mode;
    unsigned short i_opflags;
    kuid_t i_uid;
    kgid_t i_gid;
    unsigned int i_flags;
    /* ... other fields omitted ... */
    union {
        struct pipe_inode_info *i_pipe;
        struct block_device *i_bdev;
        struct cdev *i_cdev;
        char *i_link;
        unsigned i_dir_seq;
    };
    /* ... other fields omitted ... */
};

We can see that the inode carries a pointer, i_pipe, to a pipe_inode_info structure. This structure holds the memory buffers specific to the pipe:

struct pipe_inode_info {
    struct mutex mutex;
    wait_queue_head_t wait;
    unsigned int nrbufs, curbuf, buffers;
    unsigned int readers;
    unsigned int writers;
    unsigned int files;
    unsigned int waiting_writers;
    unsigned int r_counter;
    unsigned int w_counter;
    struct page *tmp_page;
    struct fasync_struct *fasync_readers;
    struct fasync_struct *fasync_writers;
    struct pipe_buffer *bufs;
    struct user_struct *user;
};

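As we can see, bufs points to the pipe_buffer entries that hold the data of the pipe instance. By default each pipe gets 16 buffers of one page each, which means 64 KB with 4 KB pages. The size is configurable: a process can change it with fcntl(F_SETPIPE_SZ), and the upper limit for unprivileged processes is set in /proc/sys/fs/pipe-max-size.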
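The capacity of a pipe can be queried and changed from user space. The sketch below assumes Linux; the numeric fallbacks for the fcntl commands are the values from the Linux headers, used only when the Python fcntl module does not export the names, and pipe_capacity and resize_pipe are hypothetical helper names.

```python
import fcntl
import os

# Linux-specific fcntl commands (F_LINUX_SPECIFIC_BASE + 7 and + 8).
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def pipe_capacity(fd):
    """Return the capacity of the pipe behind fd, in bytes."""
    return fcntl.fcntl(fd, F_GETPIPE_SZ)


def resize_pipe(fd, size):
    """Ask the kernel for a new capacity; returns the size actually set."""
    return fcntl.fcntl(fd, F_SETPIPE_SZ, size)


if __name__ == "__main__":
    r, w = os.pipe()
    print("default capacity:", pipe_capacity(r))
    print("after shrinking: ", resize_pipe(w, os.sysconf("SC_PAGE_SIZE")))
    os.close(r)
    os.close(w)
```

The kernel rounds the requested size up to whole pages, so the returned value can be larger than what was asked for.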

The buffer itself is a ring buffer, with each entry represented by a pipe_buffer structure. Diagrammatically, this looks like the following:

[Figure: the pipe ring buffer made up of pipe_buffer entries]
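The ring can also be modelled in a few lines. This is a toy model, not kernel code: it borrows the field names buffers, curbuf, and nrbufs from pipe_inode_info, treats every write as filling one whole slot, and returns instead of blocking when the ring is full or empty. The class name PipeRing is made up.

```python
class PipeRing:
    """Toy model of the pipe ring buffer.

    `buffers` slots exist in total; `nrbufs` of them are in use,
    starting at index `curbuf`.
    """

    def __init__(self, buffers=16):
        if buffers <= 0 or buffers & (buffers - 1):
            raise ValueError("buffers must be a power of two")
        self.buffers = buffers
        self.bufs = [None] * buffers  # stands in for the pipe_buffer array
        self.curbuf = 0
        self.nrbufs = 0

    def write(self, page):
        if self.nrbufs == self.buffers:
            return False  # full: a real writer would sleep here
        # Next free slot, wrapping around with a power-of-two mask.
        slot = (self.curbuf + self.nrbufs) & (self.buffers - 1)
        self.bufs[slot] = page
        self.nrbufs += 1
        return True

    def read(self):
        if self.nrbufs == 0:
            return None  # empty: a real reader would sleep here
        page = self.bufs[self.curbuf]
        self.bufs[self.curbuf] = None
        self.curbuf = (self.curbuf + 1) & (self.buffers - 1)
        self.nrbufs -= 1
        return page
```

Requiring a power-of-two slot count lets the wrap-around be a single bitwise AND rather than a modulo.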

Shared memory (tmpfs)

The kernel supports shared memory via POSIX interfaces through a filesystem known as tmpfs. This filesystem is mounted on /dev/shm of the rootfs. Since it is backed by a filesystem, each instance of shared memory is represented by a file and thereby an inode in the kernel.

The API to create an instance of shared memory is shm_open. It returns a file descriptor, which is a handle to the shared memory object.

mmap is the API provided by Linux to map this shared memory into the process address space. The process address space is a linked list of vm_area_struct, where each structure represents a segment of memory such as text, data, bss, heap, or stack. For a memory-mapped file, one of the vm_area_struct entries holds a reference to the file structure created for the shared memory.
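The create-then-map sequence can be tried from Python, whose multiprocessing.shared_memory module wraps shm_open and mmap on POSIX systems. A minimal sketch, assuming Python 3.8 or later and a system where POSIX shared memory is available; shm_roundtrip is a hypothetical name, and both handles live in one process only to keep the example short.

```python
from multiprocessing import shared_memory


def shm_roundtrip(payload=b"hello via tmpfs"):
    """Create a segment, write through one mapping, read through another."""
    creator = shared_memory.SharedMemory(create=True, size=4096)
    try:
        creator.buf[:len(payload)] = payload
        # A second handle opened by name maps the same pages; another
        # process would attach in the same way.
        peer = shared_memory.SharedMemory(name=creator.name)
        try:
            return bytes(peer.buf[:len(payload)])
        finally:
            peer.close()
    finally:
        creator.close()
        creator.unlink()  # remove the name, as shm_unlink does


if __name__ == "__main__":
    print(shm_roundtrip())
```

While the segment exists, a file with the segment's name is usually visible under /dev/shm.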

[Figure: a vm_area_struct mapped to the shared memory structure created via tmpfs]

The above diagram shows how the vm_area_struct is mapped to the shared memory structure created via tmpfs. If we observe carefully, the shared memory is represented by the address_space structure (page cache), which we talked about in the previous blog.

The difference between pipefs and tmpfs is that the memory buffers held by pipefs are not in the page cache, whereas shared memory created through tmpfs is part of the page cache.

Another aspect worth understanding is that memory-mapped files are accessed by touching memory areas, so no filesystem read or write call is involved on access; the page tables are populated via page faults. Unlike the block device case, where we dirtied a specific block representation (buffer_head), the currency of operation here is the page, so any update happens at page level.
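The same access pattern can be seen with any mapped file. In the sketch below, which assumes a non-empty regular file, the update is an ordinary memory store into the mapping rather than a write call; patch_through_mapping is a hypothetical helper.

```python
import mmap


def patch_through_mapping(path, offset, data):
    """Overwrite bytes of a file by storing into a shared mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; ACCESS_WRITE gives a shared,
        # writable mapping, so stores reach the file's pages.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as view:
            view[offset:offset + len(data)] = data  # memory store, no write()
            view.flush()  # ask the kernel to write dirty pages back
```

Only the pages that are actually touched get faulted in, so patching a few bytes in a large file does not read the rest of it.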

The munmap API is invoked to remove the mapping from the process address space. The shared memory here is backed by a file that represents memory pages, similar to how a normal file can be mapped into memory. This gives the programmer the flexibility to use shared memory as an IPC primitive. On a side note, a process that is not a child of the creator needs its own descriptor for the shared memory; one way to provide it is to transfer the fd through a mechanism like Unix domain sockets. The kernel mediates this transfer: it knows which file the fd refers to in the sending process and installs a corresponding fd in the receiving process, which can then create its own mapping to complete the IPC setup.

Disclaimer: The views expressed above are personal and not those of the company I work for.
