Modern C++ In-Depth — Lambdas, Part 2

Michael Kristofik
FactSet
Jan 6, 2023

This week, we continue our three-part series on one of the more popular features introduced in C++11: lambda expressions. Last time, we introduced lambdas and saw why one might want to use them. In this installment, we’ll take a closer look at how to write and use lambdas, and examine a common pitfall.

The Lambda Type

Let’s begin by examining the type produced by a lambda expression. Our previous post concluded with an example that invokes std::sort with a short lambda:

void lambda_demo(std::vector<std::string>& tickers)
{
    const auto& cache = get_cache();

    std::sort(
        std::begin(tickers), std::end(tickers),
        [&cache](const std::string& lhs, const std::string& rhs) {
            return cache.get_opening_price(lhs) < cache.get_opening_price(rhs);
        });
}

If the lambda expression above were substantially longer, we might want to assign it to an intermediate variable to aid readability. Doing so, however, raises an interesting question: what should the type of that variable be?

Recall from last time that lambda expressions allow us to declare function objects, and that these objects are instances of classes that define the function call operator. When the compiler encounters a lambda expression, it generates a new, unique class (the closure type) for that expression, even if two expressions are textually identical. The closure type has no name that the developer can write in source code. As a result, we’ll need to use the auto keyword to declare a variable that holds a lambda.

Here’s a simple example of what it might look like to assign a lambda expression to a variable:

const auto lambda = [&cache](const std::string& lhs, const std::string& rhs) {
    return cache.get_opening_price(lhs) < cache.get_opening_price(rhs);
};

std::sort(std::begin(tickers), std::end(tickers), lambda);
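To make the idea of a compiler-generated class more concrete, here is a rough sketch. The price_less struct is a hand-written approximation of what a compiler might produce for a capture-less lambda, not the actual generated code, and lookup_price is a hypothetical stand-in for the cache lookup used above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a cache lookup: a ticker's "price" is its length.
int lookup_price(const std::string& ticker)
{
    return static_cast<int>(ticker.size());
}

// Roughly the shape of class a compiler might generate for a capture-less
// comparison lambda. The real closure type has no name we can spell.
struct price_less
{
    bool operator()(const std::string& lhs, const std::string& rhs) const
    {
        return lookup_price(lhs) < lookup_price(rhs);
    }
};

// Two textually identical lambda expressions still have two distinct types.
bool identical_lambdas_share_a_type()
{
    auto first = [](int x) { return x + 1; };
    auto second = [](int x) { return x + 1; };
    static_cast<void>(first);
    static_cast<void>(second);
    return std::is_same<decltype(first), decltype(second)>::value;
}

// Sorting with the hand-written functor works just like sorting with a lambda.
std::vector<std::string> sort_by_price(std::vector<std::string> tickers)
{
    std::sort(std::begin(tickers), std::end(tickers), price_less{});
    return tickers;
}

void check_closure_types()
{
    assert(!identical_lambdas_share_a_type());
    assert(sort_by_price({"LONGER", "AB", "XYZ"}).front() == "AB");
}
```

Because the two lambdas have different types, neither could be assigned to a variable declared with the other's type, which is why auto is the natural choice.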

Capturing Variables

Each lambda expression begins with a capture clause: a comma-separated list of zero or more captures, enclosed in square brackets. This syntax allows us to capture or create variables as part of a lambda’s closure.

Variables that exist in the surrounding scope of a lambda expression can be captured either by value or reference. Any variable that is listed in the capture clause and prefixed by an ampersand (e.g., &foo) will be captured by (lvalue) reference, while any variable lacking such a prefix will be captured by value.

In the following example, cache will be captured by reference and date will be captured by value:

[&cache, date](const std::string& lhs, const std::string& rhs) {
    return cache.get_opening_price(lhs, date) <
           cache.get_opening_price(rhs, date);
};
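The practical difference between the two capture modes shows up when the original variable changes after the lambda is created. The following sketch uses a plain int named date (an assumption made for illustration; the real date type is not shown in this post) to contrast the two.

```cpp
#include <cassert>

// A by-value capture takes a snapshot when the lambda is created.
int observe_by_value()
{
    int date = 20230106;
    auto by_value = [date]() { return date; };
    date = 20230107;  // does not affect the copy stored in the closure
    return by_value();
}

// A by-reference capture aliases the original variable.
int observe_by_reference()
{
    int date = 20230106;
    auto by_reference = [&date]() { return date; };
    date = 20230107;  // visible through the captured reference
    return by_reference();
}

void check_capture_modes()
{
    assert(observe_by_value() == 20230106);
    assert(observe_by_reference() == 20230107);
}
```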

Instead of listing each variable explicitly, we can also use a capture-default to automatically capture any variable that is used in the body of the lambda expression. There are two capture-default modes: [&] and [=].

Using [&], we can automatically capture variables from our surrounding scope by reference. As mentioned earlier, the compiler will only capture those variables that are used within the body of the lambda. If the scope that encapsulates our lambda contains ten variables and we only use two of those in our lambda body, our closure will only contain references to those two variables.

The other capture-default mode, [=], exhibits similar behavior, but instead of capturing variables by reference, it will capture them by value, making copies of each variable used in the lambda body.
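As a small sketch of both modes, the first function below relies on the fact that only variables named in the lambda body are captured: the surrounding scope holds a non-copyable std::unique_ptr, yet [=] still compiles because the body never mentions it. All names here are made up for illustration.

```cpp
#include <cassert>
#include <memory>

int capture_default_by_value()
{
    int used = 1;
    std::unique_ptr<int> unused(new int(99));  // cannot be copied

    // Only `used` appears in the body, so only `used` is captured.
    // Naming `unused` in the body would require a copy and fail to compile.
    auto closure = [=]() { return used + 1; };

    used = 100;  // the closure keeps its own copy of the old value
    return closure();
}

int capture_default_by_reference()
{
    int total = 0;
    int step = 5;

    // Both `total` and `step` are captured by reference.
    auto add_step = [&]() { total += step; };

    add_step();  // total becomes 5
    step = 10;   // the change is visible inside the lambda
    add_step();  // total becomes 15
    return total;
}

void check_capture_defaults()
{
    assert(capture_default_by_value() == 2);
    assert(capture_default_by_reference() == 15);
}
```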

When a lambda expression is used within a non-static class member function, the this pointer can be explicitly captured to give the lambda access to the class’s member functions and member variables. Either capture-default mode will also implicitly capture the this pointer when the lambda body uses a member; in both cases, it is the pointer that is copied, not the object it points to.
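Here is a minimal sketch of an explicit this capture. The ticker_formatter class and its members are hypothetical. Since only the pointer is stored in the closure, a later change to the member is visible the next time the lambda runs.

```cpp
#include <cassert>
#include <string>
#include <utility>

class ticker_formatter
{
public:
    explicit ticker_formatter(std::string suffix) : m_suffix(std::move(suffix)) {}

    void set_suffix(std::string suffix) { m_suffix = std::move(suffix); }

    // The closure stores the pointer and reads m_suffix through it on each call.
    auto make_formatter()
    {
        return [this](const std::string& ticker) { return ticker + m_suffix; };
    }

private:
    std::string m_suffix;
};

// The formatter object outlives the lambda here, so the pointer stays valid.
std::string format_before_change()
{
    ticker_formatter formatter("-US");
    const auto format = formatter.make_formatter();
    return format("FDS");
}

std::string format_after_change()
{
    ticker_formatter formatter("-US");
    const auto format = formatter.make_formatter();
    formatter.set_suffix("-GB");  // seen by the lambda via the captured pointer
    return format("FDS");
}

void check_this_capture()
{
    assert(format_before_change() == "FDS-US");
    assert(format_after_change() == "FDS-GB");
}
```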

Capturing Move-Only Types (from C++14 onwards)

In addition to capturing existing variables, the capture clause can also be used to declare new variables for use within the lambda body. By supplying an initializer within the capture clause, we declare a new closure variable whose type is deduced from that initializer. This feature is known as “init-capture,” and is commonly used to transfer ownership of a move-only resource into a closure.

[cache = std::move(ticker_cache)](const std::string& lhs, const std::string& rhs) {
    return cache.get_opening_price(lhs) < cache.get_opening_price(rhs);
};

Note that while in the example above, the name of the variable from the surrounding scope (ticker_cache) is different from that of the closure’s variable (cache), it is possible to use the same name for both variables.
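A self-contained sketch of the same idea, using std::unique_ptr<int> as a stand-in for a move-only resource (the function and variable names are hypothetical):

```cpp
#include <cassert>
#include <memory>
#include <utility>

// The closure takes ownership of the integer. The same name is used on both
// sides of the init-capture, which is allowed.
auto make_price_reader()
{
    auto price = std::make_unique<int>(42);
    return [price = std::move(price)]() { return *price; };
}

// After the move, the original pointer is empty and the closure owns the value.
bool source_is_empty_after_move()
{
    auto source = std::make_unique<int>(7);
    auto reader = [owned = std::move(source)]() { return *owned; };
    return source == nullptr && reader() == 7;
}

void check_init_capture()
{
    assert(make_price_reader()() == 42);
    assert(source_is_empty_after_move());
}
```

Note that a closure holding a move-only member is itself move-only, so it can be returned or moved but not copied.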

Pitfalls

Despite being short and convenient, capture-default should be used with caution. Whenever a variable is accessed via pointer or reference, we need to be sure that the lifetime of the underlying object being referenced is at least as long as that of the aliasing variable. Consider this problematic example:

#include <iostream>
#include <string>

auto get_functor()  // Note the use of auto as the return type (C++14 onwards).
{
    const std::string data = "My lifetime is limited to this function.";
    return [&](){ std::cout << data << std::endl; };  // data is captured by reference
}

int main()
{
    const auto functor = get_functor();
    functor();  // data no longer exists at this point
    return 0;
}

The function object returned by get_functor() captures a variable data whose lifetime is limited to the scope of the function. When the lambda is finally invoked from within main(), the data variable will have already been destroyed. As a result, the captured reference is left dangling, and using it is undefined behavior: the program may crash, print garbage, or even appear to work.

There are a few ways to address the issue. In this case, we can either change [&] to [=] in order to capture data by value, or we can move data into the closure using init-capture (data would need to be non-const for this to be a true move rather than a copy). While using [=] will allow us to fix the snippet above, it is not a silver bullet. Consider this example:

#include <iostream>
#include <string>

class widget
{
public:
    auto get_functor()
    {
        return [=](){ std::cout << m_data << std::endl; };
    }

private:
    const std::string m_data = "My lifetime is limited to this class.";
};

int main()
{
    // The lifetime of the widget instance starts and ends on the line below:
    const auto functor = widget().get_functor();

    functor();
    return 0;
}

The code above suffers from the same lifetime issue as our first example, this time through a dangling pointer. At first glance, one might reasonably expect m_data to be a copy within the lambda body. However, since m_data is a member variable, our lambda expression is actually using an implicitly captured copy of the this pointer to access it. In fact, if we were to rewrite the lambda to explicitly capture the same pointer, we would end up with a function object that is semantically equivalent:

auto get_functor()
{
    return [this](){ std::cout << m_data << std::endl; };
}

Regardless of how we capture the pointer, we’re still left with the same issue. The memory belonging to the member variable m_data will be cleaned up before the lambda is invoked. We can rectify the problem in several ways. One option would be to create an explicit local copy of the data that we wish to capture:

auto get_functor()
{
    const auto copy = m_data;
    return [copy](){ std::cout << copy << std::endl; };
}

A more efficient alternative would be to use init-capture. Doing so would reduce the number of copies being made from two to one:

auto get_functor()
{
    return [copy = m_data](){ std::cout << copy << std::endl; };
}

Lastly, if we have access to C++17 or later, we might also consider copying the entire widget by dereferencing the this pointer in the capture clause:

auto get_functor()
{
    return [*this](){ std::cout << m_data << std::endl; };
}
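Putting the last two fixes into one compilable sketch: the member function names below are invented to tell the variants apart, and the lambdas return the string instead of printing it so that the result is easy to check. The [*this] variant assumes a C++17 compiler.

```cpp
#include <cassert>
#include <string>

class widget
{
public:
    // C++14: copy only the member into the closure.
    auto get_functor_member_copy() const
    {
        return [copy = m_data]() { return copy; };
    }

    // C++17: copy the entire widget into the closure.
    auto get_functor_object_copy() const
    {
        return [*this]() { return m_data; };
    }

private:
    const std::string m_data = "My lifetime is limited to this class.";
};

// In both functions the temporary widget is destroyed before the call,
// but each closure owns its own copy of the data.
std::string call_member_copy_after_widget_is_gone()
{
    const auto functor = widget().get_functor_member_copy();
    return functor();
}

std::string call_object_copy_after_widget_is_gone()
{
    const auto functor = widget().get_functor_object_copy();
    return functor();
}

void check_widget_fixes()
{
    assert(call_member_copy_after_widget_is_gone() == "My lifetime is limited to this class.");
    assert(call_object_copy_after_widget_is_gone() == "My lifetime is limited to this class.");
}
```

Copying the whole object is convenient when the lambda needs several members, at the cost of copying members it may never touch.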

C++20 addresses the confusion created by the implicit capture of the this pointer when using [=] by deprecating that behavior. Going forward, if we want to capture the current object by pointer alongside [=], we will have to do so explicitly by writing [=, this], a spelling that C++20 newly permits.
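One way to stay clear of the deprecated behavior on any language version from C++11 onwards is to skip the capture-default and list this next to the other captures. The scaler class below is a hypothetical example of that style.

```cpp
#include <cassert>

class scaler
{
public:
    explicit scaler(int factor) : m_factor(factor) {}

    // `this` and `offset` are both named explicitly, so a reader can see that
    // the closure holds a pointer to the scaler plus a copy of offset.
    auto make_scaler(int offset) const
    {
        return [this, offset](int x) { return x * m_factor + offset; };
    }

private:
    int m_factor;
};

int scale_example()
{
    const scaler s(3);  // s outlives the closure, so the pointer stays valid
    const auto scale = s.make_scaler(1);
    return scale(4);    // 4 * 3 + 1
}

void check_explicit_this()
{
    assert(scale_example() == 13);
}
```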

As a general rule, it is recommended to capture individual variables explicitly instead of relying on capture-default modes. If we force ourselves to consider each variable individually, there is a higher chance that we will catch lifetime issues.
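Applied to the first get_functor() example, explicit captures might look like the sketch below. The two function names are invented to distinguish the variants, and the lambdas return the string rather than printing it so the result can be checked.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Explicit by-value capture: the closure stores its own copy of data.
auto get_functor_by_copy()
{
    const std::string data = "Copied into the closure.";
    return [data]() { return data; };
}

// Init-capture with std::move: data is non-const here, so its contents are
// moved into the closure instead of being copied.
auto get_functor_by_move()
{
    std::string data = "Moved into the closure.";
    return [data = std::move(data)]() { return data; };
}

void check_explicit_captures()
{
    // Both closures remain valid after the functions that created them return.
    assert(get_functor_by_copy()() == "Copied into the closure.");
    assert(get_functor_by_move()() == "Moved into the closure.");
}
```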

For further guidance on lambda expressions, consult the C++ Core Guidelines. Also, the lambda page on cppreference.com contains a summary of the capture syntax (along with far more information than would fit here).

What’s Next?

In the next installment, we’ll examine std::function and how it can be used to store and pass functions and function objects.

Acknowledgments

Special thanks to all who contributed to this blog post:

Author: Tim Severeijns
Reviewers: James Abbatiello, Michael Kristofik, Jennifer Ma

Michael Kristofik is a Principal Software Architect at FactSet and posts on behalf of the company’s C++ Guidance Group.