C#: How I Stopped Worrying and Learned to Love the Closure


As a traditional C++ programmer, I was used to a simpler life. Any advanced constructs were written by a colleague, a third-party library, or by myself.

More often than not I could step into the source code while debugging if I experienced any unexpected behavior.

A grocery bag captures fruit much like a closure captures variables.

With C#, life has changed. Many complicated constructs are built into the language, and one of these fancy features is the closure. (C++11 also includes many advanced features, but this article is about C#.)

In the past, I implemented callbacks using function pointers, virtual methods, and function objects (aka functors).

These constructs are comparable to C# delegates and deal with state differently:

  1. A function pointer holds no state
  2. A functor is an object that holds state and implements the () operator
  3. A virtual method is a member of an object that holds state

To clarify the difference between a functor and a virtual method: with a virtual method, the calling code does not need to know the concrete type of what it is calling at compile time.

The above concepts handle state in a very clear way. Simpler times indeed.

Modern callbacks

In C#, delegates in general and anonymous functions in particular seem very popular. As a traditional C++ programmer, you will see some new terms:

  • Delegate (a type-safe reference to a method, typically used as a callback)
  • Anonymous Function (a nameless function, often assigned to a delegate)
  • Lambda (a concise syntax for writing anonymous functions, especially one-liners)
  • Closure (an implicit class that holds the state used by an anonymous function)
  • Capture (the concept of moving variables into the closure)

In code you might see standard concrete delegate types, such as:

  • Action (a delegate with no arguments and no return value)
  • Func<TResult> (a delegate with no arguments and a return value; Func<T, TResult> takes one argument, and so on)
  • MethodInvoker (like Action, but in the System.Windows.Forms namespace)
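To make the list above concrete, here is a minimal sketch of the standard delegate types in use. The class name DelegateTypesDemo and the helper Run are made up for illustration; only Action and Func come from the framework.

```csharp
using System;

// Hypothetical demo class; only Action and Func are framework types.
public static class DelegateTypesDemo
{
    public static string Run()
    {
        string log = "";

        // Action: no arguments, no return value.
        Action sayHello = delegate() { log += "hello;"; };

        // Func<TResult>: no arguments, returns a value.
        Func<int> getAnswer = delegate() { return 42; };

        // Func<T, TResult>: one argument, returns a value (written as a lambda).
        Func<int, int> square = x => x * x;

        sayHello();
        log += getAnswer() + ";";
        log += square(7);
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "hello;42;49"
    }
}
```

The anonymous-method form (delegate() { ... }) and the lambda form (x => x * x) are interchangeable here; the lambda is just shorter.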

Introducing The Closure

Let’s look at a simple delegate:

int myCounter = 0;
Action myAction = delegate(){ ++myCounter; };

When myAction is called, it needs to know about myCounter. The delegate thus “keeps a reference” to myCounter and uses that reference when incrementing.

This “keeping a reference” conceptually moves the variable myCounter out of local scope and into a closure (a hidden class holding state), where it will live until no delegate references the closure.
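One way to convince yourself that the delegate and the enclosing method really share one variable is to change it from both sides. This is a small sketch under that assumption; the wrapper class SharedVariableDemo and its Run method are hypothetical names.

```csharp
using System;

// Hypothetical wrapper around the myCounter example from the text.
public static class SharedVariableDemo
{
    public static int Run()
    {
        int myCounter = 0;
        Action myAction = delegate() { ++myCounter; };

        myAction();       // incremented through the delegate: 1
        myCounter += 10;  // changed directly in the enclosing method: 11
        myAction();       // the delegate sees that change and increments again: 12

        return myCounter;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 12
    }
}
```

If the delegate had received a copy of myCounter instead, the result would have been 10.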

To illustrate where the closure is created and the capture happens, let’s look at an example:

List<Action> l = new List<Action>();
string str = "";
for( int i = 0 ; i < 5 ; ++i )
    l.Add ( delegate() { str += i.ToString(); } );
foreach( Action d in l )
    d.Invoke();

What do you think the value of the string is at the end of the code above?

[drumroll]

The correct result is:

“55555”

The trick here is that the loop variable is only one variable. It kind of looks like we are creating a copy of “i” for each delegate we push on the list, but there is only one closure: the “i” variable was captured only once, before the loop started.

At the time of invocation, all delegates are referring to the same “i” variable, and its value is 5.

To get a unique closure for every iteration, each iteration of the loop needs to reference its own variable, like this:

for( int i = 0 ; i < 5 ; ++i )
{
    int j = i;
    l.Add ( delegate() { str += j.ToString(); } );
}

The string will now be:

“01234”
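For anyone who wants to run both variants side by side, here is a self-contained sketch. The class LoopCaptureDemo and its two method names are made up for illustration; the loop bodies are the ones from the snippets above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper that runs both loop variants from the text.
public static class LoopCaptureDemo
{
    // All five delegates capture the single loop variable i.
    public static string SharedLoopVariable()
    {
        List<Action> l = new List<Action>();
        string str = "";
        for (int i = 0; i < 5; ++i)
            l.Add(delegate() { str += i.ToString(); });
        foreach (Action d in l)
            d.Invoke();
        return str;
    }

    // Each iteration declares its own j, so each delegate gets its own closure.
    public static string CopyPerIteration()
    {
        List<Action> l = new List<Action>();
        string str = "";
        for (int i = 0; i < 5; ++i)
        {
            int j = i;
            l.Add(delegate() { str += j.ToString(); });
        }
        foreach (Action d in l)
            d.Invoke();
        return str;
    }

    public static void Main()
    {
        Console.WriteLine(SharedLoopVariable()); // prints "55555"
        Console.WriteLine(CopyPerIteration());   // prints "01234"
    }
}
```

Note that this applies to the for loop. Since C# 5, the foreach loop variable is a fresh variable per iteration, so capturing it directly no longer shows this behavior.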

Note that a closure extends the lifetime of the captured variables well beyond their scope, specifically until the closure holding the variables is garbage-collected.
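The lifetime extension is easiest to see when a delegate outlives the method that declared the captured variable. The following is a minimal sketch; MakeCounter and LifetimeDemo are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical demo: the local variable count outlives the call to MakeCounter.
public static class LifetimeDemo
{
    public static Func<int> MakeCounter()
    {
        int count = 0; // captured, so it lives in the closure rather than on the stack
        return delegate() { return ++count; };
    }

    public static int Run()
    {
        Func<int> next = MakeCounter(); // MakeCounter has returned, count is still alive
        next();                         // 1
        next();                         // 2
        return next();                  // 3
    }

    public static void Main()
    {
        Console.WriteLine(Run());           // prints 3
        Console.WriteLine(MakeCounter()()); // prints 1: a new call creates a new closure
    }
}
```

Each call to MakeCounter creates a separate closure, so two counters never interfere with each other.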

Reference to a Value type?

In the loop example we had five delegates all sharing the same closure. Since the use of the closure is not explicitly typed out, the body of the delegate becomes short, but it also looks a bit confusing to someone who cares about whether a variable is a value type or a reference type.

The anonymous function is written to look like we are holding a reference to a value type, something (I believe) not possible in C#:

str += i.ToString();

This is obviously not the case, and the underlying behavior becomes apparent if we add the hidden closure class reference for illustration:

str += myClosure.i.ToString();

Now it is easier to see that the loop variable is still a value type, but the closure is a class and is thus accessed through a reference.
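Taking the illustration one step further, the loop example can be rewritten with a hand-written closure class. This is only a simplified sketch of the idea: the names MyClosure and Body are made up, and the class the compiler really generates has different names and may split the captured variables across more than one class.

```csharp
using System;
using System.Collections.Generic;

// Hand-written stand-in for the hidden closure class (simplified, hypothetical names).
public sealed class MyClosure
{
    public int i;           // the captured loop variable, still a value type
    public string str = ""; // the captured string

    // The body of the anonymous function becomes a method on the closure.
    public void Body() { str += i.ToString(); }
}

public static class ExplicitClosureDemo
{
    public static string Run()
    {
        List<Action> l = new List<Action>();
        MyClosure myClosure = new MyClosure(); // one instance, created before the loop

        for (myClosure.i = 0; myClosure.i < 5; ++myClosure.i)
            l.Add(myClosure.Body); // every delegate refers to the same instance

        foreach (Action d in l)
            d.Invoke(); // myClosure.i is 5 by now

        return myClosure.str;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "55555"
    }
}
```

Written this way, the result is hardly surprising: there is one object, one field i, and five delegates that all read it after the loop has finished.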

Conclusion

Closures in C# are powerful, but not without pitfalls.

Thanks for your attention,
Jonas Norberg

PS.

The loop example is not contrived but something found in actual (would-have-been) production code. The example is obviously stripped down to highlight the closure issue.

After finding it, I realized that this is a common source of confusion, and Eric Lippert himself wrote about it five years ago. Read it here:

http://blogs.msdn.com/b/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx
