Modern C++ In-Depth — Is string_view Worth It?

Michael Kristofik
FactSet
5 min read · May 11, 2023

std::string_view makes it easier to write generic code that can accept read-only references to character sequences, regardless of the underlying container that holds that data. String parsing and tokenization workflows may improve performance by avoiding unnecessary copies. However, like normal references, there is potential for misuse. This post will examine what string_view is, and more importantly what it is not, so you can make the best choices for your programs.

A brief history of strings at FactSet

The history of string data types in C++ is long and winding. Prior to C++98, there was no standardized string class included in the language. In those days, C++ relied on the null-terminated char arrays it inherited from C. Compiler vendors and early C++ adopters filled in the gap with their own proprietary string classes to provide automatic memory management and richer interfaces for operating on string data.

At FactSet, we adopted C++ in the early 1990s during the pre-ISO-standard era. We initially used the string classes included with the compilers supplied by our OS vendors. Unfortunately, by the time a mature implementation of std::string became available to us, these string classes were heavily embedded throughout our multi-million-line code base. Code that deals with more than one string type must either incur the performance penalty that comes with converting between them, or regress to C-style const char* as the lingua franca.

Enter string_view

Introduced in C++17, std::string_view provides a better way to pass references to string data around in your program. Conceptually, a string_view is a reference to read-only string data. Just like a reference, a string_view does not take ownership of the data it refers to. The lifetime of that memory must be managed external to the string_view. It can be thought of as a const char* and length, or as a pair of begin/end const char* pointers.
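To make the "pointer and length" picture concrete, here is a minimal sketch, assuming a C++17 compiler. The function name demo_view_construction and the variable names are ours, chosen for illustration. It builds views over three kinds of storage and checks that no characters are copied along the way.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical demo: three views over three different kinds of storage.
void demo_view_construction()
{
    std::string owned = "hello world";
    const char raw[] = {'h', 'e', 'l', 'l', 'o'};   // deliberately not null-terminated

    std::string_view from_string = owned;            // refers to owned's buffer
    std::string_view from_literal = "hello world";   // length found by scanning for '\0'
    std::string_view from_buffer(raw, sizeof raw);   // explicit pointer and length

    assert(from_string.data() == owned.data());      // same memory, not a copy
    assert(from_string == from_literal);             // operator== compares contents
    assert(from_buffer == from_string.substr(0, 5)); // "hello" in both cases
    assert(from_buffer.size() == 5);
}
```

Each view here is valid only while the storage behind it (owned, raw, or the literal) is alive.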

Usage

Use string_view anywhere you would have previously used a non-owning pointer or reference to const string data. For example, string_view is often a good replacement for:

  • const char* and a length
  • const std::string&
  • begin/end pair of const_iterator from a string-like class (requires C++20)

As an illustration, consider a function that tests if a string begins with a given prefix. We might write it like this:

#include <cstring>

bool has_prefix(const char* str, const char* prefix)
{
    return std::strncmp(str, prefix, std::strlen(prefix)) == 0;
}

This code will work with most string types but has a few drawbacks:

  • the original strings must be converted to const char*
  • only works for null-terminated strings
  • slower than necessary for string types that store their length (strlen recomputes it)

To get around these issues, we might be tempted to write multiple overloads of this function for each string type. Or we might further complicate the interface by introducing a prefix_size parameter. With string_view, this function becomes simpler:

#include <string_view>

bool has_prefix(std::string_view str, std::string_view prefix)
{
    return str.substr(0, prefix.size()) == prefix;
}

The substr function creates a new string_view referring to a subset of the original string in constant time. The operator== then compares the contents referred to by the two views. Note: in C++20, this entire function could be replaced with string_view::starts_with().
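As a rough illustration of how one signature serves several string types, the sketch below repeats has_prefix so that it compiles on its own (C++17 assumed) and calls it with a std::string, a string literal, and a character buffer that has no null terminator. The demo function and its variable names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Repeated from the article so this sketch is self-contained.
bool has_prefix(std::string_view str, std::string_view prefix)
{
    return str.substr(0, prefix.size()) == prefix;
}

// Hypothetical demo of the call sites the single overload supports.
void demo_has_prefix()
{
    const std::string owned = "string_view";
    const char buffer[] = {'s', 't', 'r', 'i', 'n', 'g'};  // not null-terminated

    assert(has_prefix(owned, "str"));                       // std::string and literal
    assert(has_prefix(std::string_view(buffer, sizeof buffer), "str"));
    assert(!has_prefix(owned, "view"));

    // substr clamps its length to what is available, so a prefix longer than
    // the string simply compares unequal instead of reading out of bounds.
    assert(!has_prefix("ab", "abc"));
    assert(has_prefix(owned, ""));                          // empty prefix always matches
}
```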

When using string_view as a function parameter or return value, prefer to pass it by value rather than by reference. It is small, and is designed to mimic a reference:

void inspect_string(std::string_view s);        // DO THIS
void inspect_string(const std::string_view& s); // NOT THIS

Gotchas

There are a few gotchas to be aware of when using string_view.

Lifetime

std::string_view models a non-owning reference, so we must ensure the string data out-lives the string_view object. All the usual safety rules for references apply equally to string_view. For example, be careful not to return a string_view from a function if it refers to a function-local string object.

Also, be aware that string_view will not extend the lifetime of a temporary object like a normal reference to const will. Suppose we had a function get_name() that returned a string object:

// SAFE - lifetime extended to scope of 'longer_name'
const std::string& longer_name = get_name() + " foo";

// ERROR - compiles, but 'bad_name' refers to a temporary destroyed at the end of this line
const std::string_view bad_name = get_name() + " foo";
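One way to stay on the safe side is sketched below, assuming C++17. The get_name() body is a hypothetical stand-in for the function mentioned above, and first_word is a helper we made up for illustration. The sketch either gives the temporary a name so that it outlives the view, or copies out of the view before the temporary is destroyed.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for the get_name() mentioned in the article.
std::string get_name()
{
    return "Ada";
}

// Hypothetical helper: returns the first word of 'text'. The result points
// into the caller's storage, so it is valid only as long as that storage is.
std::string_view first_word(std::string_view text)
{
    return text.substr(0, text.find(' '));  // find() == npos keeps the whole text
}

void demo_lifetime()
{
    // Name the temporary so it outlives every view that refers to it.
    const std::string owner = get_name() + " foo";
    const std::string_view name = first_word(owner);  // safe: 'owner' is still alive
    assert(name == "Ada");

    // When no owner can be kept around, copy out of the view within the same
    // full expression, before the temporary std::string is destroyed.
    const std::string copy(first_word(get_name() + " foo"));
    assert(copy == "Ada");
}
```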

Nulls

std::string_view is not null terminated. It is not a generic wrapper around a proper string object. This gives it the flexibility to refer to a fragment of a larger string, enables efficient slicing operations (e.g., substr, remove_prefix, remove_suffix), and also allows for embedded null characters (which std::string supports).
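The slicing operations can be sketched with a small tokenizer, assuming C++17. The split function below is a hypothetical helper, not part of the standard library. Every token it returns is a fragment of the original character sequence, so no token is null-terminated and nothing is copied.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical helper: splits 'text' on 'delim' without copying any characters.
// The returned views are valid only while the original characters are alive.
std::vector<std::string_view> split(std::string_view text, char delim)
{
    std::vector<std::string_view> tokens;
    while (true) {
        const auto pos = text.find(delim);
        tokens.push_back(text.substr(0, pos));  // pos == npos takes the remainder
        if (pos == std::string_view::npos)
            break;
        text.remove_prefix(pos + 1);            // drop the token and its delimiter
    }
    return tokens;
}

void demo_split()
{
    const std::string_view line = "a,b,,c";
    const auto tokens = split(line, ',');

    assert(tokens.size() == 4);
    assert(tokens[0] == "a" && tokens[1] == "b");
    assert(tokens[2].empty());                  // empty field between the two commas
    assert(tokens[3] == "c");
    assert(tokens[0].data() == line.data());    // points into the original, not a copy
}
```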

This also explains why there is no c_str() function, only data() and size(). Any call to data() without a corresponding call to size() is likely a coding error.
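The following sketch, assuming C++17, illustrates why data() needs size(). The helper to_owned is hypothetical. The strlen calls are well-defined here only because the underlying literals happen to be null-terminated; in general, a view's data carries no such guarantee.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper: builds an owning, null-terminated copy for APIs that
// need a C string. Note that data() and size() are used together.
std::string to_owned(std::string_view sv)
{
    return std::string(sv.data(), sv.size());
}

void demo_data_and_size()
{
    const std::string_view full = "key=value";
    const std::string_view key = full.substr(0, 3);   // views just "key"

    // data() still points into "key=value"; nothing ends the view after 3 chars.
    assert(std::strlen(key.data()) == full.size());   // sees 9 characters, not 3

    // An embedded null is ordinary data to a view, but it stops C string functions.
    const std::string_view with_null("a\0b", 3);
    assert(with_null.size() == 3);
    assert(std::strlen(with_null.data()) == 1);

    // Copying with pointer and length gives a properly terminated string.
    const std::string owned = to_owned(key);
    assert(owned == "key");
    assert(std::strlen(owned.c_str()) == 3);
}
```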

Guidance

How should you get started using string_view? What if you’re modernizing an existing code base that already uses std::string references everywhere?

  • For functions that accept a const char* or const string& parameter (of any string type), consider replacing it with string_view unless:
    - you’re passing the argument on to a function that requires a const string& or a null-terminated string (e.g., fopen or printf)
    - you’re copying the data to a new string object (see below)
  • string_view knows how to print itself with operator<<
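  • Standard associative containers that use strings as keys can accept a string_view in their lookup functions, provided the container is declared with a transparent comparator such as std::less<>. Unordered containers gained equivalent support in C++20, where it requires a transparent hash and equality function.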
  • A string_view can be stored in a container, where a normal reference cannot. Be aware of the lifetime of the underlying character sequence.
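The container lookup mentioned in the list above can be sketched as follows, assuming C++17. The names WordCounts and count_of are hypothetical. Declaring the map with the transparent comparator std::less<> is what lets find() take a string_view directly instead of first building a temporary std::string key.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical alias: std::less<> (no template argument) is a transparent
// comparator, which enables lookup by any type comparable with std::string.
using WordCounts = std::map<std::string, int, std::less<>>;

// Hypothetical helper: returns the stored count, or 0 when the word is absent.
int count_of(const WordCounts& counts, std::string_view word)
{
    const auto it = counts.find(word);   // no temporary std::string is created
    return it != counts.end() ? it->second : 0;
}

void demo_lookup()
{
    const WordCounts counts{{"alpha", 2}, {"beta", 5}};
    const std::string_view line = "beta gamma";

    assert(count_of(counts, line.substr(0, 4)) == 5);   // "beta"
    assert(count_of(counts, line.substr(5)) == 0);      // "gamma" is not a key
}
```

With the default comparator std::less<std::string>, the same find() call would not compile, because string_view does not convert implicitly to std::string.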

The bit about copying string data requires some explanation. Yes, you’re only reading from it. But some string types (including std::string on certain platforms) have copy-on-write semantics. Or perhaps the caller already has a temporary object of the needed type and could have moved from it. Forcing an explicit copy operation in those cases would prevent such optimizations.

Instead, consider accepting the destination string type by value. This allows the caller access to the full set of constructors to efficiently perform the copy. You can then efficiently move it into place.

#include <string>
#include <utility>

struct Person
{
    std::string m_name;

    void set_name(std::string name)
    {
        m_name = std::move(name);
    }
};
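To show what the by-value parameter buys the caller, here is a small sketch assuming C++17. Person is repeated so the block compiles on its own; the demo function and the names in it are hypothetical. The caller decides at each call site whether the string is copied, moved, or built from a view.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Repeated from the article so this sketch is self-contained.
struct Person
{
    std::string m_name;

    void set_name(std::string name)
    {
        m_name = std::move(name);
    }
};

void demo_set_name()
{
    Person p;

    std::string kept = "Grace Hopper";
    p.set_name(kept);                      // lvalue: copied into 'name', then moved
    assert(p.m_name == kept);              // the caller's string is untouched

    std::string scratch = "Barbara Liskov";
    p.set_name(std::move(scratch));        // rvalue: moved into 'name', then moved again
    assert(p.m_name == "Barbara Liskov");

    const std::string_view view = "Ada Lovelace";
    p.set_name(std::string(view));         // a view must be converted explicitly
    assert(p.m_name == "Ada Lovelace");
}
```

The explicit std::string(view) in the last call makes the one unavoidable copy visible at the call site.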

Other posts in this series

The Modern C++ In-Depth series has explored some of the more technically challenging features of C++11 and beyond. Other topics we have covered previously:

Acknowledgments

Special thanks to all who contributed to this blog post:

Authors: Jim Arena and Michael Kristofik
Reviewers: James Abbatiello, Jennifer Ma, Jens Maurer, and Jason Wang

Michael Kristofik
FactSet

Principal Software Architect at FactSet. I post on behalf of our company's C++ Guidance Group.