Heredoc: A Deep Dive

Oduwole Dare
15 min read · Mar 19, 2024

Here-documents (or here-docs) are a secret weapon in the toolbox of any software engineer, especially when it comes to scripting and automation. They’re incredibly handy for 42 students tackling projects like Born2beroot, Mini$hell, Inception, and more. Here-docs really shine when you need to jot down a list of instructions or text on the fly, without the hassle of opening a text editor — perfect for those times when you’re automating tasks and working with scripts.

Imagine you’re writing a series of commands to a MySQL database, for instance. Think of here-docs as a quick, in-script text editor that doesn’t require real-time interaction. If this all seems a bit abstract, don’t worry — I’ll break it down with practical examples that’ll make everything click.

The Components of Heredoc:

This is how a typical heredoc is used.

[CMD - optional] << DELIMITER
Input lines
DELIMITER

The command above means “Take every line after `<< DELIMITER` until you find a line that contains only `DELIMITER`, and pass it as input to `[CMD]`”. In this case, `Input lines` will be passed as input to `[CMD]`.

[CMD]:
This represents the command where the here document (`<< DELIMITER`) will provide the input. It could be any command that reads from standard input, such as cat, sort, mysql, etc. Commands that ignore standard input, such as echo, gain nothing from a here document.

<<:
This is the here-document syntax, which tells the shell to take all input until it encounters the specified “DELIMITER”.

DELIMITER:
It’s a user-defined word that marks the end of the heredoc. You can use any word as the delimiter, as long as no line of the input consists of exactly that word: the here document ends at the first line that contains the delimiter and nothing else. The word may still appear in the middle of a line without ending the here document.

Input Lines:
This input stream/buffer will be provided as input to [CMD] via the here document.
So when the command below is executed:

line 1 ├── cat << EOF
line 2 ├── Hello
line 3 ├── World
line 4 └── EOF
  • Line 1: When you enter the first line into your terminal, the shell switches to a mode where it waits for the lines of the heredoc.
  • Lines 2 & 3: This mode accepts multiple lines of input. In this case, Hello and then World, until a line containing only EOF is encountered.
  • Line 4: Finally, the EOF line ends the heredoc. The shell then runs cat with the collected lines as its standard input.
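
To see how the closing delimiter is matched, here is a small sketch that captures what cat receives into a variable. The variable name `captured` and the sample text are only illustrative. It suggests that the heredoc ends at a line containing nothing but the delimiter, while the same word in the middle of a line is treated as ordinary text.

```shell
# Capture the text that cat receives from the heredoc.
captured=$(cat << EOF
Hello
World, with EOF in the middle of a line
EOF
)

# Print the captured text; the delimiter line itself is not part of it.
printf '%s\n' "$captured"
```

The closing parenthesis sits on its own line because the EOF line must contain only the delimiter.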

Heredoc Execution Order

The heredoc operates like a do-while loop that “cats” your input texts into a file until a certain condition is met. Therefore, you can view it as a multiline string or file literal for sending input streams to other commands and programs. Understanding this last statement is crucial to grasping the use of heredoc in practice. When a script or command block with a here-document is executed, the shell follows these steps:

  1. Parse the Command: The shell reads the command line and finds the << operator attached to the command (in this example, cat). Nothing is executed yet.
  2. Begin Here Document: The word immediately following << becomes the delimiter that marks the end of the here document.
  3. Input Collection: The shell collects all the lines that follow until it encounters a line containing only the DELIMITER. These lines can include variables, command substitutions, or any text you want to pass as input.
  4. Expansion: If the delimiter is unquoted, the shell expands variables and runs command substitutions ($(...)), like $(ls) for example, at this stage. Their output becomes part of the input to [CMD].
  5. Connect Input: The shell attaches the resulting block of text to the standard input of [CMD].
  6. Command [CMD] Execution: The shell starts [CMD] (i.e. any command specified), which reads the text from its standard input and processes it.
    For example, cat receives the list of files generated by $(ls), not the text "$(ls)".
  7. End of Input: Once [CMD] has read all of the text, it sees end-of-file on its standard input. The DELIMITER line itself is never passed to the command.
    The command [CMD] finishes processing the input from the here document.
  8. Continue Script Execution: Any further commands in the script or shell are executed as usual after the here document block.
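
The ordering can be checked with a minimal sketch. It assumes the standard wc and tr utilities, and the variable name `line_count` is made up for the example. The command substitution is expanded before the command reads anything, so wc counts three lines of text rather than one line containing a `$(...)` expression.

```shell
# wc -l counts the lines it receives on standard input.
# tr strips the padding that some wc implementations add.
line_count=$(wc -l << EOF | tr -d ' '
$(printf 'alpha\nbeta\ngamma')
EOF
)

echo "wc received $line_count lines"
```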

Variable and Command Expansion in HereDoc

Heredocs, like regular shell scripts, can expand variables and commands. By default the shell treats the body of a heredoc much like a double-quoted string: variables are replaced by their values and command substitutions are replaced by their output. The body itself is not executed as a series of commands; only the command substitutions inside it are run, each in its own subshell (a child process that inherits the variables of your script). This allows you to use variables and command output within the heredoc just as you would in the rest of your script.

Command Expansion:
Command expansion within a heredoc allows you to dynamically include the output of a command as part of the input text. This is incredibly useful for generating content on the fly, especially when the content depends on the result of a command. The two common ways to achieve command expansion within a heredoc are using `$()` or backticks ``.

  • Using $(): This syntax allows you to execute a command and capture its output. Here's an example
cat << EOF
Current PATH: $(echo $PATH)
Files in current directory: $(ls)
Current directory: $(pwd)
EOF
  • Using backticks ``: This is an older method for command substitution and achieves the same result but $() is generally preferred for readability and nesting commands.
cat << EOF
Current PATH: `echo $PATH`
Files in current directory: `ls`
Current directory: `pwd`
EOF

Variable and command expansion is what lets you adapt your scripts to different use cases. Let's explore two scenarios. In the first, you want a script that checks the state of a service such as Docker or Apache, which on Linux you can query with systemctl. In the second, you want the script to print detailed information about your system:

cat << EOF
Server Status:
$(if systemctl is-active --quiet apache2; then
  echo "Apache server is running"
else
  echo "Apache server is down"
fi)
EOF

cat << EOF
System Info:
- Hostname: $(hostname)
- Disk Usage: $(df -h)
EOF

Variable Expansion:
Variables inside the heredoc are replaced with the values they hold at the moment the heredoc is evaluated. This allows for dynamic content generation based on the script’s state or on the output of external commands.

host_name=$(hostname)
cat << EOF
My hostname is: $host_name
EOF

The combination of command and variable expansion allows for powerful and dynamic text generation within heredocs.
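
As a rough sketch of how these expansions combine, the snippet below mixes a variable, an arithmetic expansion, and a command substitution in one heredoc. The names `user_name`, `item_count`, and `summary` and their values are hypothetical, chosen only for the demonstration.

```shell
user_name="dare"   # hypothetical sample value
item_count=3       # hypothetical sample value

# Each expansion is resolved before cat sees the text.
summary=$(cat << EOF
User: $user_name
Items: $item_count
Doubled: $((item_count * 2))
Shouted: $(printf '%s' "$user_name" | tr '[:lower:]' '[:upper:]')
EOF
)

printf '%s\n' "$summary"
```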

Ignoring Variable And Command Expansion

As useful as command and variable expansion is, sometimes it isn’t what we want, and for this reason heredoc provides a way to turn it off. Ignoring expansion is useful when you want to preserve the exact text, including literal `$` symbols and backticks, without the shell interpreting them as variable or command expansions. To achieve this, quote the DELIMITER.
Here are a few reasons why you might want to ignore expansions:

Literal Text: Sometimes we want to include text that contains symbols like `$` or backticks without them being interpreted. This keeps the content of the heredoc static, unaffected by the current state of variables or by commands that could cause problems if they were run. It makes the heredoc more predictable and avoids unintended behavior, for example when you are writing documentation or a script template and want to show examples of commands.

cat << 'EOF'
This is a literal $ symbol.
Command: `ls -l`
EOF

Code Snippets: When we want to show code snippets or examples in a script or documentation, we often want to preserve the exact text to ensure accuracy.

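Escaping Special Characters: With a quoted delimiter nothing needs escaping, and a backslash (`\`) is kept as a literal character. The delimiter word itself may also appear inside a line of content, because the heredoc ends only at a line that contains the delimiter and nothing else.

cat << 'EOF'
This is an example of using the delimiter EOF inside the heredoc.
EOF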

By quoting the delimiter (`'EOF'`), we turn off variable and command expansion and ensure that the text within the heredoc remains exactly as we’ve written it, without any unexpected substitutions.
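
A side-by-side sketch may make the difference easier to see. The variable `price` and the result names are invented for the example. The third heredoc shows a middle ground that is assumed to be useful when only some `$` signs should stay literal: in an unquoted heredoc a single `$` can be escaped with a backslash.

```shell
price=5   # hypothetical sample value

# Unquoted delimiter: $price is expanded.
expanded=$(cat << EOF
Cost: $price
EOF
)

# Quoted delimiter: the text is kept exactly as written.
literal=$(cat << 'EOF'
Cost: $price
EOF
)

# Unquoted delimiter with one escaped dollar sign.
escaped=$(cat << EOF
\$price is $price
EOF
)

printf '%s\n' "$expanded" "$literal" "$escaped"
```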

How to use Heredoc with pipes and redirection:

In shell scripting, piping (|) and redirection (>, >>, <) are powerful features that allow us to manipulate the input and output streams of commands. Combined with here-documents (<<), they give us a versatile toolset for working with data in our scripts.

Piping (|)

Pipes take the output of one command and pass it as input to another command. For example:

# This command lists the files in our current dir and then filters for files
# ending with ".txt"
ls | grep '\.txt$'

Redirection (>, >>, <)

Redirection on the other hand allows us to control where the output of a command goes. Here are the basic redirection operators:

  • >: Redirects output to a file, and overwrites the file if it already exists.
  • >>: Redirects output to a file, appending the output to the end of the file if it exists.
  • <: Redirects input to a command, taking input from a file.

# Redirect output to a file
ls > files.txt

# Append output to a file
echo "Some text" >> files.txt

# Use a file as input
sort < files.txt

Here are examples of how piping and redirection work with heredocs:

Piping with Heredocs:

Input Piping
A pipe sends the standard output (stdout) of the command on its left (ls) to the standard input of the command on its right (grep). A heredoc cannot sit on the receiving end of a pipe, but a pipeline can run inside a command substitution in the heredoc body, and its output then becomes part of the heredoc.

cat << EOF
$(ls | grep '\.log$')
EOF

The command above lists the files whose names end in “.log”, as filtered by grep, and includes that list in the heredoc text that cat receives.

Output Piping:
We can also pipe the output of a command that reads a here document into another command. The pipe goes on the same line as the << operator, before the body of the here document.

cat << EOF | grep "grape"
apple
banana
grape
orange
EOF
  • The lines within the here document (apple, banana, grape, orange) are provided as input to cat
  • cat outputs these lines, which are then piped (|) to grep "grape".
  • grep "grape" searches for lines containing "grape" from the output of cat.
  • The final output will be only the line with “grape” in it, because grep filtered the output.

Redirection with Heredocs

Input Redirection:
We can redirect the content of a file into a here document.

cat << EOF
$(<your_filename)
EOF

This will include the contents of “your_filename” within the heredoc. Specifically, $(< filename) is a Bash construct that reads the contents of the named file and substitutes them in place. It’s a more concise and efficient way to read a file than $(cat filename), because cat is an external command that has to be started as a separate program.

Output Redirection:
This is a common and useful technique in shell scripting. Output redirection with a here document allows us to redirect the output of the here document block to a file. If the file does not exist, it will be created. If it does exist, the existing content will be replaced with the output of the here document. You should use the “>>” syntax instead of the “>” if you just want to append its content to the file.

cat << EOF > output.txt
Hello, world!
EOF

cat << EOF > output.txt: This redirects the output of the “here document” block to a file named output.txt. The above command will write “Hello, world!” into the `output.txt` file.
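
The difference between “>” and “>>” can be sketched as follows. The example writes to a temporary file created with mktemp so that nothing in the current directory is touched; the variable names are illustrative only.

```shell
# Work in a temporary file so no existing file is overwritten.
target_file=$(mktemp)

# > creates the file or replaces its content.
cat << EOF > "$target_file"
Hello, world!
EOF

# >> appends to what is already there.
cat << EOF >> "$target_file"
Hello again!
EOF
after_append=$(cat "$target_file")

# A second > replaces everything written so far.
cat << EOF > "$target_file"
Fresh start
EOF
after_overwrite=$(cat "$target_file")

rm -f "$target_file"
printf '%s\n---\n%s\n' "$after_append" "$after_overwrite"
```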

Combined Piping and Redirection:

Piping and redirection can also be combined to create more complex operations:

Input Redirection with Pipe:

grep "error" << EOF | tee error.log
This is an error message
Another error occurred
This is free of e***r
EOF

Here’s what each part of the above command does:
  • grep "error": This is the command that searches for lines containing “error” in the input.
  • << EOF: This starts a “here document” block, which provides input to the grep command.
  • |: The pipe sends the output of the grep command to another command.
  • tee error.log: The tee command reads from standard input and writes to standard output and files. In this case, it writes the output of grep to both the standard output (which will be displayed in the terminal) and the error.log file.
  • EOF: This marks the end of the “here document” block.

Output Redirection with Pipe:

cat << EOF | grep "apple" > apple_list.txt
apple
orange
banana
EOF

This will grep for lines containing “apple” from the input provided by the heredoc and write the filtered output to `apple_list.txt`.

In summary, piping and redirection with here documents allow you to manipulate input and output streams within a shell script, providing flexibility in processing and managing data.

Heredoc process

When using heredocs, it helps to know which parts run in which process. The body of a heredoc is data, not code: the shell collects it, expands it, and hands the result to the command as input. The only parts of the body that are executed are command substitutions, and each of those runs in a subshell, a child process created by the shell. What this means is:

Main Shell: Also referred to as the parent shell, this is the shell we are typing into and interacting with. It has its own process ID (PID) and manages the execution of the script or commands we are typing.

Subshell: When the shell finds a command substitution ($(...) or backticks) inside a heredoc, it creates a new child process known as a subshell to run it. A subshell can be imagined as a fork of the current shell process: it inherits the state of the parent shell but operates independently of it.

Illustration of what happens:

When the shell encounters a heredoc block (<< EOF), it knows that everything between << EOF and EOF should be treated as input to a command.
For each command substitution in the block, the shell creates a subshell, a separate process that inherits the environment (variables, functions, etc.) from the parent shell.
The subshell executes the commands inside $(...) as if you typed them directly into the terminal, and their output is inserted into the heredoc text.
Any changes in the environment (variables, etc.) inside the subshell do not affect the parent shell. Once the subshell completes execution, it exits, and the parent shell continues where it left off.
To illustrate this, let's write a script that shows the different environments of the main shell and the subshell:

#!/bin/bash

# The assignment lives inside $(...), so it runs in a subshell.
cat << EOF
Setting variable inside Heredoc.
$(my_pid=$$; echo "My PID is: $my_pid")
EOF

# Check if my_pid is accessible outside Heredoc
if [ -z "$my_pid" ]; then
echo "my_pid is undefined outside Heredoc"
else
echo "my_pid outside Heredoc: $my_pid"
fi

Purpose of the Subshell
Because command substitutions run in a subshell, commands embedded in a heredoc cannot change the state of the main shell. Variables assigned or directories changed inside $(...) are gone once the substitution finishes, so the heredoc contents are generated in an isolated environment.
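
One way to check what is and is not executed is to feed the same kind of body to two different commands. This is only a sketch: the variable `marker` and the temporary file are invented for the demonstration, and the delimiter is quoted so the body reaches each command untouched. cat merely copies the text, while sh (a separate child shell) runs it as a script; in neither case does the `marker` variable of the calling shell change.

```shell
marker="parent"
tmp_out=$(mktemp)

# cat only copies the body to its output; nothing in it is executed.
cat << 'EOF' > "$tmp_out"
marker="from cat"
EOF
cat_result=$(cat "$tmp_out")

# sh reads the body as a script and runs it in a child process.
sh << 'EOF' > "$tmp_out"
marker="from child shell"
echo "$marker"
EOF
sh_result=$(cat "$tmp_out")

rm -f "$tmp_out"
echo "cat printed: $cat_result"
echo "sh printed: $sh_result"
echo "marker in the calling shell: $marker"
```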

Heredocs vs HereStrings

Heredocs (`<<`) and HereStrings (`<<<`) are both ways to provide input to commands, but they have different use cases. HereStrings are more suitable for single-line input, while Heredocs are better for multi-line or formatted input.

# No match, so nothing is printed
grep "pattern" <<< "This is a single line input"

# Match, so the line is printed
grep "pattern" <<< "This is a single line input with pattern"

Nested Heredocs:

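It is possible to nest Heredocs within command substitutions inside a Heredoc block. The outer delimiter must be unquoted, otherwise the $(...) is printed literally instead of being run.

cat << EOF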
This is a command with a nested Heredoc:
$(cat << 'INNER_EOF'
Nested Heredoc content
INNER_EOF
)
EOF

This means you can have a command within a Heredoc block that also takes its input from another Heredoc.
Nested Heredocs can be useful when you have a command that requires multi-line input, and within that command, you need to provide another multi-line input.
Example:
Let’s say you have a script that needs to create a file with some content, and also concatenate the content of another file which itself has multi-line sections. Here is an example script demonstrating this concept:

#!/bin/bash
cat << EOF > file1.log
This is file1 nested heredoc content.
EOF

cat << EOF > file2.log
This is file2 nested heredoc content!!!
EOF

cat << EOF >> file3.log
This is the beginning of the file
$(cat << NESTED_EOF
$(< file1.log)
$(< file2.log)
NESTED_EOF
)

This is the end of the file outside the nesting.
EOF

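Closing a Heredoc Early:
A Heredoc cannot be closed earlier than its delimiter from inside the body. A command such as `exit` placed in a command substitution only ends the subshell that runs it; the Heredoc continues until the delimiter line.

cat << EOF
Content 1
$(exit 1)
Content 2
EOF

Inside the Heredoc above, the command $(exit 1) exits its own subshell with status 1 and produces no output, so cat prints Content 1, an empty line, and Content 2. If the delimiter were quoted ('EOF'), the text $(exit 1) would be printed literally. To stop the content early, move the delimiter line up.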

Debugging Heredocs

If you encounter issues with Heredocs, you can run the script with the shell's `-x` option to enable debugging output. With this flag the shell prints each command, with its arguments expanded, just before running it, so you can follow what is executed around the heredoc. The trace shows the command that receives the heredoc rather than the heredoc text itself.

bash -x sample_script.sh
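
To get a feel for what the trace looks like, the sketch below writes a tiny throwaway script (using a heredoc) and runs it twice with sh: once normally and once with -x, keeping only the trace from standard error. The script content and variable names are made up for the example, and the exact trace format can vary between shells.

```shell
# Create a small script to trace; the quoted delimiter keeps it literal.
script_file=$(mktemp)
cat << 'SCRIPT' > "$script_file"
greeting="hello"
cat << EOF
$greeting world
EOF
SCRIPT

# Normal run: capture what the script prints.
output=$(sh "$script_file")

# Traced run: keep standard error (the trace), discard standard output.
trace=$(sh -x "$script_file" 2>&1 >/dev/null)

rm -f "$script_file"
printf 'Output:\n%s\nTrace:\n%s\n' "$output" "$trace"
```

The trace should contain one line per executed command, each prefixed with "+", including the cat that reads the heredoc.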

Heredocs’ real-world usages

You will encounter interesting usages of Heredoc in your journey as a Software Engineer.
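1. We have seen its usage with “cat”. Here is its usage with no command. In Bash the text is read and then discarded, which makes this form (often written as `: << 'EOF'`) handy as a multi-line comment. Some shells, such as zsh, print the text instead.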

<< EOF
Hello
World
EOF

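2. echo: Used to display a line of text or a variable value. Note that echo does not read standard input, so a here document attached to it is ignored: the command below prints only Hello. Use cat when you want to output the text of a here document.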

echo "Hello" << EOF
World
EOF

3. sed: The stream editor filters and transforms text. Here documents can provide the text to be processed by `sed`.

sed 's/old/new/' << EOF
This is the old text.
EOF

4. SSH: Heredocs can send a series of commands to a remote shell over SSH, which saves logging into the remote machine and typing them by hand. With an unquoted delimiter, variables in the body are expanded on the local machine before being sent; quote the delimiter to have them expanded on the remote side.

ssh user@host << EOF
cd /path/to/directory
ls -l
EOF

5. FTP: The File Transfer Protocol can accept commands via here documents for automating file transfers.

ftp -n << EOF
open ftp.example.com
user username password
put localfile.txt
quit
EOF

6. MySQL: Heredocs can be useful for scripting MySQL queries, especially when you have multi-line queries or scripts to execute.

mysql -u username -p << EOF
USE db_name;
SELECT * FROM table_name;
EOF

7. sudo: The `sudo` command allows you to execute commands as another user, typically with elevated privileges. Here documents can be used with `sudo` to provide input to the command it runs, for example a shell:

sudo -u some_user bash << EOF
command_needing_elevated_privileges
EOF

8. bc: The `bc` command is a calculator that can be used with here documents to perform calculations.

bc << EOF
10 * 5
EOF

9. awk: The `awk` command is used for text processing and pattern matching. Here documents can be used to provide the text that an `awk` script processes.

awk 'BEGIN { print "Start of output:" } { print $0 }' << EOF
Line 1
Line 2
EOF

10. bash: You can use a here document to execute a series of Bash commands.

bash << EOF
echo "Hello from Bash"
ls -l
EOF

11. curl: The curl command transfers data to or from a server. Here documents can provide the data to be sent in the request.

curl -X POST http://example.com/api -d @- << EOF
{"key": "value"}
EOF

12. sort: The `sort` command is used to sort lines of text. Here documents can provide the text to be sorted.

sort << EOF
banana
apple
orange
EOF

13. zip: The `zip` command can read the names of the files to add from standard input when given the `-@` option, so a here document can supply the list.

zip myarchive.zip -@ << EOF
file1.txt
file2.txt
EOF

14. Sending Email Content: When sending email from a script or command line, heredocs can be used to specify the email content, including subject, body, and recipients.

sendmail user@example.com << END
Subject: Hello from Script
From: sender@example.com

Hello,
[Email content].
END

15. Generating Configurations: Heredocs can also generate configuration files on the fly. For example, creating an Apache virtual host configuration:

cat << CONFIG > /etc/apache2/sites-available/example.com.conf
<VirtualHost *:80>
ServerName example.com
DocumentRoot /var/www/html/example.com
<Directory /var/www/html/example.com>
AllowOverride All
Require all granted
</Directory>
</VirtualHost>
CONFIG

Heredocs Best Practices:

  1. Choose Descriptive Delimiters: When using heredocs, choose delimiters that are descriptive and unlikely to appear in the text you are enclosing. This helps prevent premature termination of the heredoc.
  2. Use Quoting to Prevent Expansion: If you want to prevent variable and command expansion within the heredoc, enclose the delimiter in single quotes (').
  3. Indentation: Maintain consistent indentation around the heredoc to improve readability. Remember that leading whitespace inside the body is passed to the command, and that the closing delimiter must start at the beginning of its line (with `<<-`, leading tabs are stripped from both).
  4. Avoid Trailing Whitespace: Ensure there is no trailing whitespace after the closing delimiter; a line such as “EOF ” is not recognised as the end of the heredoc.
  5. Error Handling: Check the exit status of the command that receives the heredoc, and of important commands run inside command substitutions, so that failures are not silently ignored.
  6. Avoid Mixing with Interactive Input: Heredocs are designed for non-interactive input. Avoid mixing heredocs with commands that require interactive input, as it can lead to unexpected behavior.
  7. Clear Documentation: If the heredoc contains complex or important information, explain its purpose in comments placed just above it. Anything written inside the body, including lines starting with #, becomes part of the text.
  8. Testing: When using heredocs in scripts, test them thoroughly to ensure they behave as expected, especially when dealing with complex or nested heredocs.
  9. Avoid Excessive Nesting: While heredocs can be nested, excessive nesting can reduce readability and make scripts harder to maintain. Use judiciously and consider refactoring if nesting becomes too complex.
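
For the error-handling point, one possible pattern is to test the exit status of the command that reads the heredoc, exactly as you would for any other command. The sketch below uses grep -q on made-up log lines; the variable name `log_status` is hypothetical.

```shell
# grep -q exits with a non-zero status when nothing matches.
if grep -q "error" << EOF
all services started
backup completed
EOF
then
  log_status="errors found"
else
  log_status="clean"
fi

echo "Log status: $log_status"
```

Note that the `then` keyword comes after the closing delimiter, since the heredoc body belongs to the grep command on the `if` line.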

Conclusion

Whether you’re scripting a series of commands, generating files, or automating tasks, here-docs are a powerful feature to have in your arsenal. With a bit of practice, you’ll find them indispensable in making your scripts more flexible and dynamic. Stay tuned for more practical examples that’ll show you just how versatile here-docs can be!
