Linux Beyond the Basics: How Linux Redirects I/O Streams

File Descriptors and Tables

Dagang Wei
4 min read · May 24, 2024

This blog post is part of the series Linux Beyond the Basics.

Introduction

Ever wished your Linux terminal was a bit more flexible? Like, being able to save command outputs to files or even feed commands with data from somewhere other than your keyboard? That’s where I/O (Input/Output) redirection comes in. Let’s dive into how it works and how you can harness its power.

Files, Descriptors, and the File Descriptor Table

In Linux, almost everything is exposed as a file, including your keyboard, your screen, and network connections. A process accesses these “files” through small non-negative integers called file descriptors. The first three are open by convention when a program starts:

  • 0 (Standard Input — STDIN): Where a program gets its input (usually your keyboard).
  • 1 (Standard Output — STDOUT): Where a program sends its regular output (typically your terminal screen).
  • 2 (Standard Error — STDERR): Where a program sends error messages (also usually your terminal screen).

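Each process has its own file descriptor table, a kernel-maintained data structure that maps every descriptor number to the open file (or other I/O stream) it refers to. A child process created with fork starts with a copy of its parent’s table, which is why a program launched from a shell inherits the shell’s terminal as its STDIN, STDOUT, and STDERR.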
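To make the table idea concrete, here is a small Python sketch (Python's os module is a thin wrapper around the underlying system calls). It relies on the POSIX rule that open returns the lowest-numbered descriptor that is not currently in use. The function name is made up for illustration, and /dev/null is just a convenient file to open.

```python
import os

def lowest_free_descriptor_demo():
    """Open, close, and reopen a file to show how descriptor numbers are handed out."""
    first = os.open("/dev/null", os.O_RDONLY)   # kernel picks the lowest unused slot
    os.close(first)                             # that slot is free again
    second = os.open("/dev/null", os.O_RDONLY)  # so the same number is reused
    os.close(second)
    return first, second
```

In a process started from a terminal, where 0, 1, and 2 are already taken, both numbers will usually be 3.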

Redirecting the Flow: The Operators

Redirection operators let you change the source or destination of these I/O streams. Here’s the breakdown:

  • > (Output Redirection): Overwrites an existing file or creates a new one to hold a command’s output. Example: ls > directory_list.txt (Saves the output of the ls command into directory_list.txt)
  • >> (Append Output): Adds output to the end of a file, instead of overwriting. Example: date >> logfile.txt (Appends the current date to logfile.txt)
  • < (Input Redirection): Reads input from a file instead of the keyboard. Example: sort -n < unsorted_numbers.txt (Sorts the numbers in unsorted_numbers.txt numerically and prints the result to the terminal; without -n, sort compares lines as text, so 10 would come before 9)
  • 2> (Error Redirection): Redirects error messages to a file. Example: command_that_might_fail 2> error.log
  • &> (Combined Redirection): Redirects both standard output and standard error to the same file. This operator is a Bash extension; the portable equivalent is > file 2>&1. Example: command_that_might_fail &> output_and_errors.log
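The difference between > and >> comes down to the flags used when the target file is opened. The Python sketch below shows the idea. The flag combinations and the 0o666 mode are what shells such as Bash are generally understood to use, but treat them as an assumption here; the helper names are hypothetical.

```python
import os

def open_for_redirect(path, append=False):
    """Open path roughly the way `>` (truncate) or `>>` (append) is assumed to."""
    flags = os.O_WRONLY | os.O_CREAT
    flags |= os.O_APPEND if append else os.O_TRUNC
    # 0o666 is only the requested mode; the process umask removes bits from it.
    return os.open(path, flags, 0o666)

def write_redirected(path, data, append=False):
    """Write bytes to path as if they had been redirected with `>` or `>>`."""
    fd = open_for_redirect(path, append)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
```

Calling write_redirected twice on the same path replaces the earlier content, while passing append=True on the second call keeps it and adds the new bytes at the end.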

Under the Hood: The dup2 System Call

The core system call that powers I/O redirection is dup2. Here's how it works:

int dup2(int oldfd, int newfd);
  • oldfd: The existing file descriptor you want to duplicate.
  • newfd: The file descriptor you want the duplicate to have.

When you use a redirection operator, your shell (e.g., Bash) performs two steps before the command starts running, typically in a child process it has forked for that command:

  • File Opening: The file specified in the redirection is opened.
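  • dup2 Call: dup2(oldfd, newfd) then does two things as a single atomic step: 1) closes the file descriptor specified by newfd (if it was open); 2) makes newfd refer to the same underlying file or stream as oldfd. Afterwards both descriptors share the same file offset, so writes through either one continue where the other left off.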
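Both effects of dup2 can be observed inside a single process. The sketch below is an illustration only: it uses a pipe's write end as a stand-in for "some descriptor that is already open", and the function name is made up.

```python
import os

def dup2_demo(path):
    """Show that dup2 closes newfd and then makes it an alias of oldfd."""
    file_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    pipe_r, pipe_w = os.pipe()        # pipe_w is an open descriptor we will overwrite

    os.dup2(file_fd, pipe_w)          # closes the pipe's write end, then aliases the file

    os.write(file_fd, b"via oldfd\n")
    os.write(pipe_w, b"via newfd\n")  # same open file, shared offset: lands after line one

    # dup2 closed the pipe's only write end, so a read hits end-of-file immediately.
    pipe_write_end_closed = os.read(pipe_r, 1) == b""

    for fd in (file_fd, pipe_w, pipe_r):
        os.close(fd)
    with open(path, "rb") as f:
        return f.read(), pipe_write_end_closed
```

The file ends up with both lines in order, which is only possible because the two descriptors refer to the same open file rather than to two independent opens.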

For example, with ls > output.txt, the shell would roughly do this:

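  • Fork a child process, so that the shell's own STDOUT stays connected to the terminal.
  • In the child, open output.txt and get a new file descriptor (let's say it's 3).
  • Call dup2(3, 1) to make file descriptor 1 (STDOUT) point to the same file as file descriptor 3.
  • Close file descriptor 3, which is no longer needed, and exec ls.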

Now, any output written to STDOUT goes to output.txt!
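The sequence for ls > output.txt can be sketched in Python as follows. This is a simplified model rather than what any particular shell actually runs: run_with_stdout_to is a hypothetical helper, and the open flags are assumed.

```python
import os

def run_with_stdout_to(path, argv):
    """Run argv with its standard output sent to path, like `argv > path`."""
    pid = os.fork()
    if pid == 0:                          # child process
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
            if fd != 1:
                os.dup2(fd, 1)            # STDOUT now refers to the file
                os.close(fd)              # the extra descriptor is no longer needed
            os.execvp(argv[0], argv)      # replace the child with the command
        finally:
            os._exit(127)                 # reached only if open, dup2, or exec failed
    _, status = os.waitpid(pid, 0)        # parent: wait like the shell does
    return os.WEXITSTATUS(status)
```

For example, run_with_stdout_to("output.txt", ["ls"]) should leave the directory listing in output.txt. The command itself is unchanged; it simply writes to descriptor 1 as usual.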

Checking File Descriptors of a Process

If you want to see which files or streams a process is interacting with, you can inspect its file descriptor table. Here’s how:

1. Find the Process ID (PID):

ps aux | grep <process_name>

Look for the number in the second column — that’s the PID.

2. List File Descriptors:

ls -l /proc/<PID>/fd

This will show you the symbolic links to the files the process has open. You’ll often see entries like 0 -> /dev/pts/0 (standard input linked to your terminal) or 1 -> /dev/pts/0 (standard output linked to your terminal).

Important Note: You may need root permissions to inspect file descriptors of processes that aren’t owned by your user.
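The same information can be read programmatically. Here is a minimal sketch, assuming a Linux system with /proc mounted; list_fds is a made-up helper name, and "self" is the /proc shortcut for the calling process.

```python
import os

def list_fds(pid="self"):
    """Return {descriptor number: link target} for the given process."""
    fd_dir = f"/proc/{pid}/fd"
    table = {}
    for name in os.listdir(fd_dir):
        try:
            table[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed in the meantime (for example, the one
            # used to read the directory itself), so skip it.
            continue
    return table
```

Printing list_fds() from a script started in a terminal should show entries for 0, 1, and 2 that point at the terminal device, matching the ls -l output described above.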

Piping

The pipe (|) is a powerful tool for chaining commands, making the output of one command the input of the next. It's like a virtual data pipeline. For example:

ls | grep "txt"

Here’s the step-by-step process that happens under the hood when you execute this command:

  • Pipe Creation: The shell first creates a pipe, a one-way channel backed by a kernel memory buffer rather than by a file on disk. A pipe has two ends, each with its own file descriptor: a write end and a read end.
  • Forking: The shell then forks a child process for each command in the pipeline (ls and grep). Because the pipe already exists, both children inherit copies of its two file descriptors.
  • File Descriptor Manipulation: Each child uses dup2 (or a similar mechanism) to modify its own file descriptor table before running its command: 1) ls Process: Its STDOUT (file descriptor 1) is redirected to the write end of the pipe. 2) grep Process: Its STDIN (file descriptor 0) is redirected to the read end of the pipe. Every process, including the shell itself, then closes the pipe descriptors it doesn't need; grep only sees end-of-file once every copy of the write end has been closed.
  • Command Execution: The two processes run concurrently. 1) The ls process writes its output (the list of files) to the write end of the pipe. 2) The grep process reads its input from the read end of the pipe, keeping only the lines that contain "txt" anywhere (to match only names ending in .txt, use grep '\.txt$'). 3) grep writes its filtered output to the terminal (since its STDOUT isn't redirected).
  • Synchronization and Cleanup: The shell waits for both processes to complete. The kernel frees the pipe once every descriptor referring to it has been closed.
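The pipeline steps can be sketched in Python as follows. Note that the pipe is created before either fork, so that both children inherit its two ends. This is a simplified model with a hypothetical function name; it assumes descriptors 0 to 2 are already open in the calling process, and a real shell handles many more cases (longer pipelines, builtins, job control).

```python
import os

def run_pipeline(left_argv, right_argv):
    """Run `left | right` and return both exit statuses."""
    read_end, write_end = os.pipe()       # create the pipe first

    left = os.fork()                      # child for the producer (e.g. ls)
    if left == 0:
        try:
            os.dup2(write_end, 1)         # STDOUT -> write end of the pipe
            os.close(read_end)            # drop the descriptors it no longer needs
            os.close(write_end)
            os.execvp(left_argv[0], left_argv)
        finally:
            os._exit(127)                 # reached only on failure

    right = os.fork()                     # child for the consumer (e.g. grep)
    if right == 0:
        try:
            os.dup2(read_end, 0)          # STDIN <- read end of the pipe
            os.close(read_end)
            os.close(write_end)
            os.execvp(right_argv[0], right_argv)
        finally:
            os._exit(127)

    # The parent must close its own copies, or the reader never sees end-of-file.
    os.close(read_end)
    os.close(write_end)

    _, left_status = os.waitpid(left, 0)
    _, right_status = os.waitpid(right, 0)
    return os.WEXITSTATUS(left_status), os.WEXITSTATUS(right_status)
```

Calling run_pipeline(["ls"], ["grep", "txt"]) should behave like the shell command above, with grep's output appearing on the terminal because its STDOUT was left alone.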

In Conclusion

I/O redirection is one of those essential tools that can transform your Linux experience. It gives you fine-grained control over how your commands interact with files and other data streams. By mastering redirection, you’ll unlock a world of automation, debugging, and data manipulation possibilities. So go ahead and experiment. You’ll be surprised at how much more efficient you can become!
