Modern C++ In-Depth — Transparent Comparisons

Published in

FactSet

4 min readJun 30, 2023

Last time, we teased the idea of leveraging std::string_view to do lookups on containers of arbitrary string types. Creating and destroying many temporary strings can become a performance bottleneck, particularly if the string data has to be allocated on the heap. It would be nice if we could avoid that, but how?

Motivation

Searching for an element in a container normally requires constructing an
object of the container’s key_type. If the key is small, such as one you’d
normally pass by value, it isn’t a performance concern. But what if the key is large?

struct Foo
{
  int id;
  std::vector<std::string> data;
};

bool operator<(const Foo& lhs, const Foo& rhs)
{
  return lhs.id < rhs.id;
}

std::set<Foo> foo_collection;

Calls to foo_collection.find() require an object of type Foo as an argument. Constructing those objects could be expensive if there are many of them. But notice that the comparison only depends on the id variable.
Wouldn’t it be nice if we could call find() with just the integer idinstead of the full object? Sets compare their elements using operator<, so let’s write one to compare Foo with an integer.

bool operator<(const Foo& lhs, int rhs)
{
  return lhs.id < rhs;
}

bool operator<(int lhs, const Foo& rhs)
{
  return lhs < rhs.id;
}

auto iter = foo_collection.find(42);  // will this work?

Unfortunately, this fails to compile, with an error message similar to “no
matching function for call to std::set<Foo>::find(int).” It appears to only
accept a key of type Foo. There is still one more piece to this puzzle.

Introducing Transparent Comparisons

C++14 introduced two new overloads to the find() function, taking an argument that compares equivalent to the key but not necessarily of the same type. The footnote for them contains this cryptic bit of standardese:

This overload participates in overload resolution only if the qualified-id
Compare::is_transparent is valid and denotes a type. It allows calling
this function without constructing an instance of Key.

This appears to be what we want, but what does Compare::is_transparent mean? In short, it’s a typedef that signals a comparison functor will accept
arbitrary types with perfect forwarding. It promises to be fast by not spending time constructing any temporary objects. Aha, so that means we need to write our own comparator.

struct TransparentCompare
{
  using is_transparent = void;

  template <typename T, typename U>
  bool operator()(T&& lhs, U&& rhs) const
  {
    return std::forward<T>(lhs) < std::forward<U>(rhs);
  }
};

std::set<Foo, TransparentCompare> foo_collection;
auto iter = foo_collection.find(42);

Now this works. To avoid breaking existing code, the standards committee made this an opt-in feature — the is_transparent typedef is how you opt in. Any valid type will do, you can just use void.

At this point one might say, “This feels like a lot of work to write custom
comparators for all my containers.” Fortunately, also in C++14, all of the
standard function objects now have specializations that deduce their argument types. Their implementation is equivalent to our example here. Thus, the above example can be condensed to the following:

#include <functional>

std::set<Foo, std::less<>> foo_collection;
auto iter = foo_collection.find(42);

What About Unordered Containers?

The above feature works for all standard associative containers, but support for unordered containers wasn’t added until C++20. We will need both a transparent comparison and a transparent hash function. Let’s revisit the idea of using string_view for lookups.

#include <string_view>
using namespace std::literals;

struct StringHash : public std::hash<std::string_view>
{
  using is_transparent = void;
};

std::unordered_set<std::string, StringHash, std::equal_to<>> str_collection;

// Neither of these will construct a temporary string.
auto iter = str_collection.find("this is a test");
iter = str_collection.find("this is too"sv);

This technique can yield a performance improvement when doing lots of string lookups, particularly if the strings are large or you want to use substrings. It is more awkward than for associative containers, but it has to be done this way because the standard states that hash equivalence is a separate concept from object equality.

Measure, Measure, Measure

This is admittedly a neat feature, one that’s tempting to use everywhere now that you know how to use it. However, our advice remains consistent with other performance tricks. Don’t bother, unless your profiler shows you have a bottleneck in container lookups. In our experience, the slow parts are usually where you least expect them. Prefer simpler code where possible, only digging into the bag of tricks when necessary.

Acknowledgments

Special thanks to all who contributed to this blog post:

Author: Michael Kristofik
Reviewers: James Abbatiello and Jens Maurer