Fixing .NET middle-age crisis with Java ReferenceQueue and Cleaner
My colleague Kevin has just described how to implement Java ReferenceQueue in C# as a follow-up to Konrad Kokosa’s article on this Java class. Among the features discussed, one is still missing. This post explains how to deal with the “middle age crisis” scenario and how to control finalizer threading issues. I’m sure that my former Microsoft colleague Sebastien won’t be surprised by my interest in the subject.
When a class references both IDisposable instances and native resources, the usual C# pattern is to implement both IDisposable for explicit cleanup and a finalizer as a safety net for developers who forget the explicit cleanup. This pattern can have a side effect when these classes also reference a large object graph.
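As a reminder, here is a minimal sketch of that usual pattern. The LargeObject name comes from this post, but its fields (a MemoryStream and a buffer allocated with Marshal.AllocHGlobal) are assumptions chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical business object owning a managed IDisposable and a native buffer.
public class LargeObject : IDisposable
{
    private readonly MemoryStream _stream = new MemoryStream(); // managed IDisposable
    private IntPtr _buffer = Marshal.AllocHGlobal(1024);        // native resource
    private bool _disposed;

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        Dispose(disposing: true);
        // Explicit cleanup happened: the finalizer is no longer needed.
        GC.SuppressFinalize(this);
    }

    // Safety net for callers who forgot to call Dispose.
    ~LargeObject() => Dispose(disposing: false);

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;
        _disposed = true;

        if (disposing)
        {
            // Only touch other managed objects on the explicit path:
            // on the finalizer path they may already have been finalized.
            _stream.Dispose();
        }

        // Native resources are released on both paths.
        Marshal.FreeHGlobal(_buffer);
        _buffer = IntPtr.Zero;
    }
}
```

The finalizer is what keeps a forgotten instance, and everything it references, alive until the finalizer thread has run.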
Let’s take a minute to describe how finalizers are managed by the CLR.
This animation shows what happens at the end of a collection. The darkened objects are no longer referenced and should be collected. B, G and H do not implement finalizers, so they can be discarded. It is different for E, I and J because their classes implement a finalizer. First, a Finalization list has been holding a “weak” reference to them since they were created. Then, at the end of a collection, these references are moved to the FReachable queue. Later on, after the collection ends, the finalizer thread wakes up and calls the finalizer of every object referenced by the FReachable queue. This is the important part of the issue: even though those objects were no longer referenced, they could not be collected, nor could their memory be reclaimed, because the finalizer thread had not run yet. As they could not be reclaimed, they are promoted to the next generation just like other survivors. So if those objects were in generation 0, they now end up in generation 1, which extends their lifetime. It’s even worse if they get promoted from generation 1 to generation 2, as the next gen 2 collection might only happen far in the future. This artificially increases the memory consumption of the application.
To summarize, for business objects that hold both a large reference tree and native resources, it would be great to be able to:
- Allow explicit cleanup of resources with the IDisposable pattern
- Discard the managed memory when the objects are collected
- Automatically clean up native resources AFTER they are collected
- Have control over the thread that is cleaning up native resources
Mix a Phantom with IDisposable
Requirement #3 seems impossible to fulfill: how can you access a field of an object once its memory has been reclaimed? Maybe it is possible to cheat: what if the native resources, usually held in IntPtr fields, were copied while the object is still alive? That way, the cleanup code could be moved outside of the object itself. This is basically the Java PhantomReference idea that Kevin implemented in C#.
Let’s make it generic in terms of native payload:
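A minimal sketch of what such a generic phantom could look like. The class names follow the ones used in this post, but the members and signatures are assumptions based on the description given here, not Kevin’s exact code. The idea is that a ConditionalWeakTable attaches a finalizable companion to the tracked object, and that companion only carries the payload.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

// Finalizable companion that carries a copy of the native payload.
// It must not reference the tracked object, or it would resurrect it.
public sealed class PhantomObjectFinalizer<TState> where TState : class
{
    private readonly ConcurrentQueue<PhantomObjectFinalizer<TState>> _queue;

    public TState State { get; }

    internal PhantomObjectFinalizer(
        ConcurrentQueue<PhantomObjectFinalizer<TState>> queue, TState state)
    {
        _queue = queue;
        State = state;
    }

    // Runs on the finalizer thread once the tracked object is unreachable.
    // No cleanup here: the phantom only enqueues itself.
    ~PhantomObjectFinalizer() => _queue.Enqueue(this);
}

public sealed class ReferenceQueue<TState> where TState : class
{
    // The table keeps a phantom alive only as long as its key (the tracked object) is alive.
    private readonly ConditionalWeakTable<object, PhantomObjectFinalizer<TState>> _tracked =
        new ConditionalWeakTable<object, PhantomObjectFinalizer<TState>>();

    private readonly ConcurrentQueue<PhantomObjectFinalizer<TState>> _collected =
        new ConcurrentQueue<PhantomObjectFinalizer<TState>>();

    public void Track(object target, TState state)
        => _tracked.Add(target, new PhantomObjectFinalizer<TState>(_collected, state));

    // Returns the next finalized phantom, or null when nothing is pending.
    public PhantomObjectFinalizer<TState> Poll()
        => _collected.TryDequeue(out var phantom) ? phantom : null;
}
```

The tracked object itself has no finalizer in this design, so its memory can be reclaimed by the first collection that finds it unreachable; only the small phantom survives until it is polled.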
Also note that the cleaning method has been removed due to requirement #1: the LargeObject should be responsible for cleaning the resources because it will also implement IDisposable. The native cleanup code will obviously be shared by the explicit and automatic cleanup paths. LargeObject can be rewritten to use it, and the first step is to group the native resources in a state:
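Here is a minimal sketch of that first step. The NativeState and Cleanup names come from this post; the fields (a single buffer allocated with Marshal.AllocHGlobal and a Disposed flag) are assumptions. Unlike the naïve Cleanup discussed in this post, which throws an InvalidOperationException, this sketch simply frees the buffer.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class LargeObject : IDisposable
{
    // Everything needed to release the native resources,
    // with no reference back to the LargeObject itself.
    private sealed class NativeState
    {
        public IntPtr Buffer;
        public bool Disposed;
    }

    private readonly NativeState _state;
    private readonly byte[] _payload = new byte[64 * 1024]; // stands in for the large managed graph

    public LargeObject(int nativeSize)
    {
        _state = new NativeState { Buffer = Marshal.AllocHGlobal(nativeSize) };
    }

    public int ManagedSize => _payload.Length;
    public bool IsDisposed => _state.Disposed;

    // Explicit cleanup goes through the same code as the automatic one.
    public void Dispose() => Cleanup(_state);

    // Static on purpose: it works from the state alone, so it can still
    // run after the LargeObject has been collected.
    private static void Cleanup(NativeState state)
    {
        if (state.Disposed) return;
        state.Disposed = true;

        Marshal.FreeHGlobal(state.Buffer);
        state.Buffer = IntPtr.Zero;
    }
}
```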
The native payload is stored in a NativeState object that also contains the IDisposable status. This is required to know whether the object has been disposed explicitly when the static Cleanup method is called. This implementation fulfills requirement #1, even though the cleanup code throws an exception: we will see later how to deal with that.
Introducing the Cleaner à la Java
The next step is to focus on requirements #2 and #3: how can we ensure that the memory of our LargeObject gets reclaimed by the garbage collector while its native resources are still cleaned up automatically? This scenario is handled by the Cleaner class in Java, which I came across while reading Konrad’s article and got to know better through long discussions with Jean-Philippe, our team’s Java internals expert.
You can register an object, a state and a callback that will be called when the object is no longer referenced. It is a kind of secondary finalization mechanism.
Let’s see how I would like to use it in C#:
There will be a single Cleaner object for all LargeObject instances. Each one registers itself and its native state with the cleaner in its constructor. The Cleaner instance receives two static callbacks:
- Cleanup: this method is called by the cleaner after a tracked LargeObject instance has been collected. As you can see, there is no need to change its initial implementation: it was already static and receives the NativeState that stores the native state of a LargeObject. Since the NativeState type is a private inner class, the implementation details do not leak from LargeObject, as was the case in Kevin’s implementation.
- OnError: when an exception occurs during the cleanup (like my naïve implementation did by throwing an InvalidOperationException), this method gets called. This is a new feature compared to a .NET finalizer: you are notified if something goes wrong and you are able to log it. However, I would still recommend exiting the application, like the default CLR behavior when a finalizer throws an exception.
The LargeObject code is therefore responsible for cleaning up both IDisposable and native resources: its users do not need to know the gory details.
The high-level API of the Cleaner class has been defined; it is now time to see how to implement it. If you have read Kevin’s post, the first step should be obvious: a ReferenceQueue keeps track of the PhantomObjectFinalizer bound to each “business object” such as LargeObject. When the latter is collected, the phantom finalizer gets called and enqueues itself in the ReferenceQueue.
There is one big missing step: who will call the queue’s Poll method to get the finalized PhantomObjectFinalizer that contains the native state to clean up?
Stay in control of the cleaner job
The simple implementation I’ve chosen is to create a dedicated thread that polls the queue at a period of your choice and calls the cleanup callback. I did not want to add pressure on the ThreadPool, which is shared with the application. If an exception is raised, the error callback is called.
Since I’ve created the thread as a background thread, it won’t prevent .NET from exiting the process when the last foreground thread returns. However, you are free to follow the IDisposable pattern and call Dispose to explicitly stop the cleaning thread at the right time in your application lifecycle.
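The thread-based design described above could be sketched as follows. This is only an illustration under assumptions: the constructor parameters, the Action<Exception> shape of the error callback and the nested Phantom class are hypothetical, and the phantom tracking is inlined so that the sample stands on its own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;
using System.Threading;

public sealed class Cleaner<TState> : IDisposable where TState : class
{
    // Finalizable companion attached to each tracked object.
    private sealed class Phantom
    {
        private readonly ConcurrentQueue<Phantom> _queue;
        public TState State { get; }

        public Phantom(ConcurrentQueue<Phantom> queue, TState state)
        {
            _queue = queue;
            State = state;
        }

        // Runs on the finalizer thread: only enqueue, never clean up here.
        ~Phantom() => _queue.Enqueue(this);
    }

    private readonly ConditionalWeakTable<object, Phantom> _tracked =
        new ConditionalWeakTable<object, Phantom>();
    private readonly ConcurrentQueue<Phantom> _collected = new ConcurrentQueue<Phantom>();
    private readonly ManualResetEventSlim _stop = new ManualResetEventSlim(false);
    private readonly Action<TState> _cleanup;
    private readonly Action<Exception> _onError;
    private readonly TimeSpan _period;
    private readonly Thread _thread;

    public Cleaner(Action<TState> cleanup, Action<Exception> onError, TimeSpan period)
    {
        _cleanup = cleanup ?? throw new ArgumentNullException(nameof(cleanup));
        _onError = onError ?? throw new ArgumentNullException(nameof(onError));
        _period = period;

        // Dedicated background thread: no ThreadPool pressure, and it does not
        // keep the process alive once the last foreground thread has returned.
        _thread = new Thread(Run) { IsBackground = true, Name = "Cleaner" };
        _thread.Start();
    }

    public void Track(object target, TState state)
        => _tracked.Add(target, new Phantom(_collected, state));

    private void Run()
    {
        // Wait returns false on timeout and true once Dispose signals the stop event.
        while (!_stop.Wait(_period))
        {
            while (_collected.TryDequeue(out var phantom))
            {
                try { _cleanup(phantom.State); }
                catch (Exception ex) { _onError(ex); }
            }
        }
    }

    // Stops the polling thread; states still queued at that point are not cleaned up.
    public void Dispose()
    {
        _stop.Set();
        _thread.Join();
    }
}
```

Because the callbacks run on this dedicated thread, a slow or failing cleanup never blocks the CLR finalizer thread, which only enqueues phantoms.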
In the IDisposable/finalizer pattern, the GC class provides the SuppressFinalize static method to remove an object from finalization when it has been explicitly disposed: that way, the object won’t go to the FReachable queue nor be promoted into the next generation when it is collected. The Cleaner class provides the Untrack method to achieve the same effect: the object’s native payload won’t be cleaned up. I just had to update the ReferenceQueue to remove the object from the ConditionalWeakTable and remove the PhantomReference from the FinalizationList:
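A minimal sketch of such an update, assuming a simplified ReferenceQueue with a nested Phantom class (both shapes are hypothetical). Untrack looks up the phantom, suppresses its finalization and removes the entry from the table.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

public sealed class ReferenceQueue<TState> where TState : class
{
    public sealed class Phantom
    {
        private readonly ConcurrentQueue<Phantom> _queue;
        public TState State { get; }

        internal Phantom(ConcurrentQueue<Phantom> queue, TState state)
        {
            _queue = queue;
            State = state;
        }

        ~Phantom() => _queue.Enqueue(this);
    }

    private readonly ConditionalWeakTable<object, Phantom> _tracked =
        new ConditionalWeakTable<object, Phantom>();
    private readonly ConcurrentQueue<Phantom> _collected = new ConcurrentQueue<Phantom>();

    public void Track(object target, TState state)
        => _tracked.Add(target, new Phantom(_collected, state));

    // Returns false if the target was not tracked.
    public bool Untrack(object target)
    {
        if (!_tracked.TryGetValue(target, out var phantom)) return false;

        // Same idea as GC.SuppressFinalize(this) in Dispose: the phantom leaves
        // the finalization list, so it will never enqueue itself.
        GC.SuppressFinalize(phantom);
        return _tracked.Remove(target);
    }

    // Returns the next finalized phantom, or null when nothing is pending.
    public Phantom Poll()
        => _collected.TryDequeue(out var phantom) ? phantom : null;
}
```

A Dispose method on the business object would typically run its own cleanup first and then call Untrack, so that the cleaner never sees an already released state.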
Requirement #4 is now fulfilled. You are obviously free to pick another implementation more suitable to your needs than a thread-based periodic cleanup. I would like to mention that if the cleanup callback never returns, the effect is almost the same as in the case of a stuck finalizer: the native resources won’t be cleaned up anymore.
The following code shows how all this “complicated” code does not leak in a C# application:
And you get the expected output: