Rust: Raw string literals
While working with Rust, you will often come across r#"something like this"#, especially when working with TOML files. It defines a raw string literal. When would you use a raw string literal, and what makes a valid raw string literal?
When would you use a raw string literal?
First, let’s understand what a string literal is. According to The Rust Reference¹, a string literal is a sequence of any Unicode characters enclosed within two U+0022 (double-quote) characters, with the exception of U+0022 itself². Escape characters in the string literal body are processed. The string body cannot contain an unescaped double-quote. If you need to insert one, you have to escape it like this:
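A minimal sketch of such an escape. The function name `escaped_quote` and the sentence itself are illustrative and do not come from the original post:

```rust
// Returns a regular string literal whose body contains double-quotes.
fn escaped_quote() -> &'static str {
    // \" inserts a double-quote without ending the literal.
    "She said \"hello\" to me."
}

fn main() {
    println!("{}", escaped_quote());
}
```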
Escaping double-quotes can be cumbersome in some cases such as writing regular expressions or defining a JSON object as a string literal. In these situations, raw string literals are helpful since they allow you to write the literal without requiring escapes.
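To make the difference concrete, the sketch below writes the same text once as a regular literal and once as a raw literal. The JSON object, the pattern, and the function names are made up for illustration; only the standard library is used, so the pattern is just a string here and is not compiled by any regex engine.

```rust
// The same JSON text, first with escapes, then as a raw string literal.
fn json_escaped() -> &'static str {
    "{\"name\": \"Ferris\", \"kind\": \"crab\"}"
}

fn json_raw() -> &'static str {
    r#"{"name": "Ferris", "kind": "crab"}"#
}

// A regex-style pattern: every backslash must be doubled in a regular literal.
fn pattern_escaped() -> &'static str {
    "\\d{4}-\\d{2}"
}

fn pattern_raw() -> &'static str {
    r"\d{4}-\d{2}"
}

fn main() {
    // Both spellings produce identical strings.
    assert_eq!(json_escaped(), json_raw());
    assert_eq!(pattern_escaped(), pattern_raw());
    println!("{}", json_raw());
    println!("{}", pattern_raw());
}
```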
Here is a snippet from the
Or another from
So, raw string literals are helpful, but what makes a valid one?
What makes a raw string literal?
The Rust Reference defines a raw string literal as starting with the character U+0072 (r), followed by zero or more of the character U+0023 (#) and a U+0022 (double-quote) character. The raw string body can contain any sequence of Unicode characters and is terminated only by another U+0022 (double-quote) character, followed by the same number of U+0023 (#) characters that preceded the opening U+0022 (double-quote) character⁵.
Escape characters in the raw string body are not processed.
Therefore the following raw string literals are all valid:
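A few examples that follow from the definition above. The contents and the function name `valid_raw_literals` are illustrative rather than taken from the original post:

```rust
// Each element is a valid raw string literal with a different delimiter.
fn valid_raw_literals() -> [&'static str; 4] {
    [
        // Zero hashes: fine as long as the body has no double-quote.
        r"no hashes",
        // One hash: the body may contain double-quotes.
        r#"one hash, with a "quote""#,
        // Two hashes: the body may even contain "# .
        r##"two hashes, with "# inside"##,
        // Backslashes are not escape characters in a raw string.
        r"C:\temp\new",
    ]
}

fn main() {
    for literal in valid_raw_literals() {
        println!("{}", literal);
    }
}
```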
If you need to include a double-quote character in a raw string, you must tag the start and end of the raw string with hash/pound signs (#).
With a single # on each side, the raw string body can contain any sequence of Unicode characters except "#, since that sequence would terminate the literal. If you want to include that particular sequence, you have to change the number of # characters that precede the opening double-quote. For instance:
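A small sketch of this case, with an illustrative function name and body text that are not from the original post:

```rust
// The body contains the sequence "# , so one hash per side is not enough.
fn contains_quote_hash() -> &'static str {
    // With r#"..."# the literal would end at the first "# in the body,
    // so two hashes are used on each side instead.
    r##"The sequence "# is part of this string"##
}

fn main() {
    println!("{}", contains_quote_hash());
}
```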
Similarly, if "## is to be included, you can add another # to the starting and ending delimiters.
Raw string literals are helpful when you need to avoid escaping characters within a literal. The characters in a raw string represent themselves. Informally, a raw string literal is an r, followed by N hashes (where N can be zero), a quote, any characters, then a quote followed by N hashes⁶.
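The informal rule can also be turned around: given some text, how many hashes does it need? The sketch below is one way to work that out. The helpers `min_hashes` and `to_raw_literal` are hypothetical names, not part of the standard library, and the sketch ignores any restrictions a real compiler places on the body or on the number of hashes.

```rust
/// Smallest N such that `body` fits between r, N hashes, a quote
/// and a quote followed by N hashes.
fn min_hashes(body: &str) -> usize {
    let mut needed: usize = 0;
    let mut chars = body.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '"' {
            // Count the hashes that directly follow this quote.
            let mut run: usize = 0;
            while chars.peek() == Some(&'#') {
                chars.next();
                run += 1;
            }
            // A quote followed by `run` hashes would close any literal
            // delimited by `run` or fewer hashes, so one more is needed.
            needed = needed.max(run + 1);
        }
    }
    needed
}

/// Formats `body` as raw string literal source text.
fn to_raw_literal(body: &str) -> String {
    let hashes = "#".repeat(min_hashes(body));
    format!("r{0}\"{1}\"{0}", hashes, body)
}

fn main() {
    for body in ["plain", "say \"hi\"", "a \"# b", "a \"## b"] {
        println!("{}", to_raw_literal(body));
    }
}
```

A body without any double-quote needs no hashes at all, which matches the N = 0 case in the rule above.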
Here’s how visualising⁷ raw string literals works for me:
That’s it for now!
Originally published at rahul-thakoor.github.io.