Criteo R&D Blog

Fixing .NET middle-age crisis with Java ReferenceQueue and Cleaner

My colleague Kevin has just described how to implement Java ReferenceQueue in C# as a follow-up to Konrad Kokosa’s article on this Java class. Among the features discussed, one is still missing. This post discusses how to deal with the “middle-age crisis” scenario and control finalizer threading issues. I’m sure that my former Microsoft colleague Sebastien won’t be surprised by my interest in the subject.

When a class references both managed instances and native resources, the usual C# pattern is to implement both IDisposable for explicit cleanup and a finalizer to cover developers who forget the explicit cleanup. This pattern might have a side effect when these classes also reference a large object graph.
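As a reminder of what this pattern usually looks like, here is a minimal sketch. The class name `ResourceHolder`, the `MemoryStream` field and the 64-byte native allocation are illustrative assumptions, not code from the classes discussed in this post.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical class: owns one managed IDisposable and one native allocation.
public class ResourceHolder : IDisposable
{
    private readonly MemoryStream _managed = new MemoryStream();
    private IntPtr _native = Marshal.AllocHGlobal(64);
    private bool _disposed;

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        Dispose(disposing: true);
        // Explicit cleanup is done: the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }

    // Safety net for callers who forgot to call Dispose.
    ~ResourceHolder() => Dispose(disposing: false);

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;
        _disposed = true;

        if (disposing)
        {
            // Other managed objects are only touched on the explicit path.
            _managed.Dispose();
        }

        // Native memory is released on both paths.
        Marshal.FreeHGlobal(_native);
        _native = IntPtr.Zero;
    }
}
```

With this shape, an instance whose `Dispose` was never called can only be released after its finalizer has run.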

Let’s take a minute to describe how finalizers are managed by the CLR.

This animation shows what happens at the end of a collection. The darkened objects are no longer referenced and should be collected. B, G and H do not implement finalizers, so they can be discarded. It is different for E, I and J because their classes implement a finalizer. First, a finalization list has been holding a “weak” reference to them since they were created. Then, at the end of a collection, these references are moved to the freachable queue and the collection ends. Later on, the finalizer thread wakes up and calls the finalizer of every object referenced by the freachable queue.

This is the important part of the issue: even though those objects were no longer referenced, neither they nor the objects they reference could be collected, and their memory could not be reclaimed, because the finalizer thread had not run yet. As they could not be reclaimed, they are promoted to the next generation just like other survivors. So if those objects were in generation 0, they now end up in generation 1, extending their lifetime. It’s even worse if they get promoted from generation 1 to generation 2, as the next gen 2 collection might happen only very far in the future. This artificially increases the memory consumption of the application.
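The extended lifetime can be observed with a long weak reference, that is a `WeakReference` created with `trackResurrection: true`. The sketch below is only an illustration: `Finalizable` and `FinalizationDemo` are hypothetical names, and the outcome depends on the runtime and its GC settings, so the two flags are an observation rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public sealed class Finalizable
{
    public static int FinalizedCount;

    // Stands in for the large object graph held by a business object.
    public readonly byte[] Payload = new byte[1024];

    ~Finalizable() => Interlocked.Increment(ref FinalizedCount);
}

public static class FinalizationDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // of the caller keeps the object alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Allocate() =>
        new WeakReference(new Finalizable(), trackResurrection: true);

    public static (bool AfterFirstGc, bool AfterSecondGc) Run()
    {
        WeakReference tracker = Allocate();

        // The object is found unreachable and handed over to the finalizer thread.
        GC.Collect();
        bool afterFirst = tracker.IsAlive;

        // Let the finalizer thread run, then collect again.
        GC.WaitForPendingFinalizers();
        GC.Collect();
        bool afterSecond = tracker.IsAlive;

        return (afterFirst, afterSecond);
    }
}
```

If the first flag is true, the object was still in memory after the collection that found it unreachable, which is the extra lifetime described above.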

To summarize, for business objects that hold both a large reference tree and native resources, it would be great to be able to:

  1. Allow explicit cleanup of resources with the IDisposable pattern
  2. Discard the managed memory when the objects are collected
  3. Automatically clean up native resources AFTER the objects are collected
  4. Have control over the thread that cleans up native resources

Mix a Phantom with IDisposable

Requirement #3 seems impossible to fulfill: how can you access the fields of an object once its memory has been reclaimed? Maybe it is possible to cheat: what if the native resources usually held as fields were copied while the object is still alive? That way, the cleanup code could be moved outside of the object itself. This is basically the Java idea that Kevin implemented in C# with his phantom reference:

Let’s make it generic in terms of native payload:
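One possible shape for such a generic phantom, as a sketch only: the names `PhantomReference<TState>`, `ReferenceQueue<TState>`, `Track` and `TryPoll` are assumptions made for illustration and may differ from the code in Kevin’s post. The sketch assumes that the phantom is attached to its owner through a `ConditionalWeakTable`, so that it becomes unreachable, and therefore finalizable, together with its owner.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

// Hypothetical sketch: a small finalizable object that carries a copy of the
// native payload so that cleanup can happen after the owner is gone.
public sealed class PhantomReference<TState>
{
    private readonly ConcurrentQueue<PhantomReference<TState>> _queue;

    public PhantomReference(TState state, ConcurrentQueue<PhantomReference<TState>> queue)
    {
        State = state;
        _queue = queue;
    }

    // Copy of the native payload, taken while the owner was still alive.
    public TState State { get; }

    // Runs on the finalizer thread; it only enqueues the phantom,
    // no native cleanup happens here.
    ~PhantomReference() => _queue.Enqueue(this);
}

public sealed class ReferenceQueue<TState>
{
    // Ties each phantom to its owner without keeping the owner alive.
    private readonly ConditionalWeakTable<object, PhantomReference<TState>> _phantoms =
        new ConditionalWeakTable<object, PhantomReference<TState>>();

    private readonly ConcurrentQueue<PhantomReference<TState>> _finalized =
        new ConcurrentQueue<PhantomReference<TState>>();

    public void Track(object owner, TState state) =>
        _phantoms.Add(owner, new PhantomReference<TState>(state, _finalized));

    // Returns false when no phantom has been finalized yet.
    public bool TryPoll(out PhantomReference<TState> phantom) =>
        _finalized.TryDequeue(out phantom);
}
```

Only the small phantom goes through finalization here; the owner itself has no finalizer, so its object graph can be reclaimed as soon as it is unreachable.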

Also note that the cleaning method has been removed because of requirement #1: the business class should be responsible for cleaning up the resources because it will also implement IDisposable. The native cleanup code will obviously be shared with the Dispose method.

The business class could be rewritten to use it, and the first step is to group the native resources in a state:

The native payload is stored in a state object that also contains the disposed status. This is required to know whether the object has already been disposed explicitly when the static Cleanup method is called. This implementation fulfills requirement #1 even though the cleanup code throws an exception: we will have to see how to control it.
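A minimal sketch of this state-object idea, with assumed names (`NativeBuffer`, `NativeState`) and an unmanaged buffer from `Marshal.AllocHGlobal` standing in for the real native resource. Unlike the implementation described above, this `Cleanup` frees the buffer instead of throwing.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public sealed class NativeBuffer : IDisposable
{
    // Everything the cleanup needs lives here, not in the business object,
    // so it can outlive the business object.
    private sealed class NativeState
    {
        public IntPtr Handle;
        public int Disposed; // 0 = live, 1 = already cleaned up
    }

    private readonly NativeState _state;

    public NativeBuffer(int size)
    {
        _state = new NativeState { Handle = Marshal.AllocHGlobal(size) };
    }

    public bool IsDisposed => Volatile.Read(ref _state.Disposed) == 1;

    // Explicit cleanup goes through the same static method.
    public void Dispose() => Cleanup(_state);

    // Static on purpose: it only needs the state, never the business object.
    private static void Cleanup(NativeState state)
    {
        // The flag makes a second call (explicit or automatic) a no-op.
        if (Interlocked.Exchange(ref state.Disposed, 1) == 1) return;

        Marshal.FreeHGlobal(state.Handle);
        state.Handle = IntPtr.Zero;
    }
}
```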

Introducing the Cleaner a la Java

The next step is to focus on requirements #2 and #3: how can we ensure that our memory gets reclaimed by the garbage collector while the native resources are still cleaned up automatically? This scenario is handled in Java by the Cleaner class, which I came across while reading Konrad’s article and got to know better through long discussions with Jean-Philippe, our team’s Java internals expert.

You can register an object, a state and a callback that will be called when the object is no longer referenced. It is a kind of secondary finalization mechanism.

Let’s see how I would like to use it in C#:

There will be a unique Cleaner object for all instances of the business class. Each instance will register itself and its native state in its constructor by calling the cleaner’s tracking method.

The Cleaner instance receives two static callbacks:

  • Cleanup: this method will be called by the cleaner after a tracked instance has been collected. As you can see, there is no need to change its initial implementation. It was already static and receives the state object that stores the native state of a business object. Since the state type is a private inner class, the implementation details do not leak from the business class, as was the case with Kevin’s implementation.
  • OnError: when an exception occurs during the cleanup (as my naïve implementation does by throwing an exception), the OnError method gets called. This is a new feature compared to a .NET finalizer: you are notified if something goes wrong and you are able to log it. However, I would still recommend exiting the application, as the CLR does by default when a finalizer throws an exception.

The business class code is therefore responsible for cleaning up both managed and native resources: its users do not need to know the gory details.

The high-level API of the Cleaner class has been defined; it is now time to see how to implement it. If you have read Kevin’s post, the first step should be obvious: a reference queue will keep track of the phantom reference bound to each “business object”. When the latter is collected, the phantom finalizer gets called and the phantom enqueues itself into the reference queue.

There is one big missing step: who will poll the queue to get the finalized phantom reference that contains the native state to clean up?

Stay in control of the cleaner job

The simple implementation I’ve chosen is to create a dedicated thread that polls the queue at the interval of your choice and calls the cleanup callback. I did not want to add pressure on the thread pool that is shared with the application. If an exception is raised, the error callback is called.

Since I’ve created the thread as a background thread, it won’t prevent .NET from exiting the process when the last foreground thread returns. However, you are free to follow the IDisposable pattern and call Dispose to explicitly stop the cleaning thread at the right time in your application lifecycle.
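The polling loop described above could be sketched as follows. `CleanerThread<TState>` and its constructor parameters are hypothetical; the queue is a plain `ConcurrentQueue<TState>` that stands in for the queue filled by the phantom finalizers.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class CleanerThread<TState> : IDisposable
{
    private readonly ConcurrentQueue<TState> _pending;
    private readonly Action<TState> _cleanup;
    private readonly Action<Exception> _onError;
    private readonly TimeSpan _period;
    private readonly ManualResetEventSlim _stop = new ManualResetEventSlim(false);
    private readonly Thread _thread;
    private int _disposed;

    public CleanerThread(ConcurrentQueue<TState> pending, Action<TState> cleanup,
                         Action<Exception> onError, TimeSpan period)
    {
        _pending = pending;
        _cleanup = cleanup;
        _onError = onError;
        _period = period;

        // Background thread: it does not keep the process alive on exit.
        _thread = new Thread(Run) { IsBackground = true, Name = "Cleaner" };
        _thread.Start();
    }

    private void Run()
    {
        // Wait returns true once Dispose signals the event, false on timeout.
        while (!_stop.Wait(_period))
        {
            while (_pending.TryDequeue(out TState state))
            {
                try { _cleanup(state); }
                catch (Exception e) { _onError(e); } // report instead of crashing the thread
            }
        }
    }

    // Stops the polling thread explicitly; safe to call more than once.
    public void Dispose()
    {
        if (Interlocked.Exchange(ref _disposed, 1) == 1) return;
        _stop.Set();
        _thread.Join();
        _stop.Dispose();
    }
}
```

A dedicated thread keeps the cleanup cadence independent of the application’s own work; the error callback is the place to log and, if you follow the recommendation above, to terminate the process.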

In the IDisposable/finalizer pattern, the GC class provides the static SuppressFinalize method to remove an object from the finalization list when it has been explicitly disposed: that way, the object won’t go to the freachable queue nor be promoted to the next generation when it is collected. The Cleaner class provides a method to achieve the same effect: the object’s native payload won’t be cleaned up by the cleaner. I just had to update the implementation so that the object is no longer tracked and its phantom reference is removed from the finalization list:
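A sketch of how tracking and untracking could fit together, assuming the phantoms are bound to their owners with a `ConditionalWeakTable`. `Cleaner<TState>`, `Track`, `Untrack`, `TryTakeFinalized` and `NativeBuffer` are illustrative names, not necessarily the ones used in the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;

public sealed class Cleaner<TState>
{
    // Small finalizable companion of a tracked owner; it only carries the state.
    private sealed class Phantom
    {
        private readonly TState _state;
        private readonly ConcurrentQueue<TState> _finalized;

        public Phantom(TState state, ConcurrentQueue<TState> finalized)
        {
            _state = state;
            _finalized = finalized;
        }

        // Runs once the owner is gone: hand the state over for later cleanup.
        ~Phantom() => _finalized.Enqueue(_state);
    }

    private readonly ConditionalWeakTable<object, Phantom> _tracked =
        new ConditionalWeakTable<object, Phantom>();
    private readonly ConcurrentQueue<TState> _finalized = new ConcurrentQueue<TState>();

    public void Track(object owner, TState state) =>
        _tracked.Add(owner, new Phantom(state, _finalized));

    // Counterpart of GC.SuppressFinalize for an explicitly disposed owner.
    public bool Untrack(object owner)
    {
        if (!_tracked.TryGetValue(owner, out Phantom phantom)) return false;
        _tracked.Remove(owner);
        GC.SuppressFinalize(phantom); // the phantom leaves the finalization list
        return true;
    }

    // Called by whoever polls for states whose owner has been collected.
    public bool TryTakeFinalized(out TState state) => _finalized.TryDequeue(out state);
}

public sealed class NativeBuffer : IDisposable
{
    private sealed class NativeState
    {
        public IntPtr Handle;
        public int Disposed;
    }

    private static readonly Cleaner<NativeState> s_cleaner = new Cleaner<NativeState>();
    private readonly NativeState _state;

    public NativeBuffer(int size)
    {
        _state = new NativeState { Handle = Marshal.AllocHGlobal(size) };
        s_cleaner.Track(this, _state);
    }

    public bool IsDisposed => Volatile.Read(ref _state.Disposed) == 1;

    public void Dispose()
    {
        Cleanup(_state);
        s_cleaner.Untrack(this); // nothing left for the cleaner to do
    }

    private static void Cleanup(NativeState state)
    {
        if (Interlocked.Exchange(ref state.Disposed, 1) == 1) return;
        Marshal.FreeHGlobal(state.Handle);
    }
}
```

In this sketch nothing drains `TryTakeFinalized`; that would be the job of the cleaning thread, which would pass each dequeued state to the cleanup callback.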

The requirement #4 is now fulfilled. You are obviously free to pick another implementation more suitable to your needs than a thread-based periodic cleanup. I would like to mention that if the cleanup callback never returns, the effect is almost the same as in the case of a stuck finalizer: the native resources won’t be cleaned up anymore.

The following code shows how all this “complicated” code does not leak into a C# application:

And you get the expected output:

Maybe Konrad will integrate a smarter Java-like feature within the CLR itself or Alexandre in his new .NET ;^)
