Tracing allocations with EventPipe part 2 — Reading call stacks without TraceLog

Olivier Coanet
5 min readJul 30, 2024

--

In the first post, I explained how to use EventPipe to detect and locate heap allocations in a target application. The solution was based on the following steps:

  1. Trace the target application with dotnet-trace using the GCAllObjectAllocation, Type, and GCHeapAndTypeNames provider keywords.
  2. Convert the output nettrace file to ETLX.
  3. Use TraceLog to parse the ETLX file and extract heap allocations with call stacks.
  4. Add a custom event source to ignore irrelevant allocations and add valuable context to captured allocations.

In this post, I will try to explore if the output nettrace file can be parsed directly with EventPipeEventSource.

A pull request was recently merged to add real time support for EventPipe in TraceLog. When it is released, it will probably be the recommended way to do real time profiling with call stack support. The content of this post should be mostly treated as educational, even though it is still useful if you do not want to use TraceLog.

The problem with EventPipeEventSource

EventPipeEventSource can be used to read the trace events from the nettrace file:

However, call stacks cannot be read, because the CallStack extension method is only valid for TraceLog-based events.

It turns out there is a great post by Christophe Nasarre explaining exactly how to read allocation call stacks without relying on TraceLog! Maybe I could just have used his code, et voilà?

Well, not so fast! His solution is based on the Stack keyword and the ClrStackWalkTraceData event, but the Stack keyword was not enabled in our current implementation. Would it be possible that dotnet-trace automatically enables the Stack keyword for us? A short analysis revealed that the trace file did not contain any ClrStackWalkTraceData event. Worse, I could not find any ClrStackWalkTraceData event even when I explicitly enabled the Stack keyword. The explanation was simple: Christophe Nasarre’s solution works for ETW but not for EventPipe, because the Stack keyword has no effect on EventPipe.

ETW is a tracing API that was created before EventPipe. ETW is more powerful and allows capturing machine-wide user and kernel events on the Windows platform. However, EventPipe is cross-platform and does not require admin privilege. ETW and EventPipe share many concepts (event sources, providers, parsers) and use common library types, which can sometimes make it difficult to identify specific features.

Loading the call stack addresses

So, I chose to investigate how call stacks were extracted, partly by reading the TraceLog and EventPipeEventSource source code, partly by debugging my program and stepping into the TraceEvent library. I discovered that call stacks were written by EventPipeEventSource inside the trace events in an internal eventRecord field and then read by TraceLog. Sadly, I could not find any direct or indirect way to read this field using the public TraceEvent API, so I needed a way to read an internal field with an internal field type. I considered multiple options.

UnsafeAccessorAttribute

At first glance UnsafeAccessorAttribute seemed appropriate to read an internal field. Yet, I could not use it because the field type is also internal and accessing members with internal types is not supported (see this issue).

Type-proxy pattern

The type-proxy pattern consists in defining a proxy type with the same memory layout as your target type, and then using Unsafe.As<TargetType, ProxyType>.

This option worked but TraceEvent has so many fields that using this pattern seemed brittle. However, it was my first effective solution, so I stuck with it for most of my experiments. You can find the code here.

IgnoresAccessChecksToAttribute + UnsafeAccessorAttribute

IgnoresAccessChecksToAttribute is the opposite of InternalsVisibleToAttribute: it allows accessing the internal members of another assembly. This attribute can be problematic to use because it is not declared in the BCL and has no compiler support, but the package IgnoresAccessChecksToGenerator solves both of these issues.

This option almost worked out-of-the-box, but the generator was unable to make TraceEvent members public. It was still very useful to get access to the TraceEventNativeMethods members, so I simply took advantage of UnsafeAccessorAttribute to read the field:

For the record, I initially used the excellent InlineIL.Fody library to read the field, but even though the code was quite simple, I believe UnsafeAccessorAttribute is more straightforward here.

Please note that all mentioned options are in the “unsafe” path. Any update of the TraceEvent library could break the code and require manual adjustments. There are alternative options, like opening an issue on the GitHub project to ask for a supported way to extract the stack traces, or build your own version of the library with a few changes.

Reading the extended data

Once I had access to the eventRecord field, it was easy to extract the call stack addresses:

The first step was completed, but the code was only loading unresolved call stacks, i.e.: lists of addresses. The next step was to find the related methods.

Resolving call stacks

Again, I mostly used a combination of source code reading and debugging to understand how TraceLog was able to resolve method names from addresses. TraceLog has multiple mechanisms to resolve method names, I will only describe the one that comes into play in our use case.

I discovered that methods are identified from the MethodLoad / MethodUnload events, and then saved in a list. The list is sorted by method start addresses to support binary-search-based lookups. I used a similar pattern in my code:

The method list could then be used to resolve method names from addresses:

I thought that my solution was complete, but I realized that I was only able to resolve the names of methods from my own program and not those of the BCL. At that point, I noticed that TraceLog was also using events from the CLR Rundown provider. I also spotted in the dotnet-trace source code that the rundown provider is enabled by default (and there is an unreleased explicit option to disable it).

After integrating the CLR Rundown events, I was able to resolve all the call stacks:

Although I was relatively satisfied at this point, I was slightly disappointed because most of the methods were identified in the MethodDCStopVerbose, which is only raised at the end of the profiling. Because of this behavior, I could not do real-time address resolving: I had to store the call stack addresses and resolve them at the end of the profiling. However, I could not find an alternative event to identify loaded methods.

Closing

In the end, I was able to extract resolved call stacks from trace events without relying on TraceLog. The method resolving part was implemented using standard trace events, but I could not find a way to load the call stack addresses using the TraceEvent public API and I had to rely on “unsafe” code to read an internal field.

I clearly underestimated the complexity of TraceLog and the huge amount of logic that is required to support the CallStack extension method of the TraceEvent library.

Still, my objective was completed, and I learned many things in the process. But reading the nettrace file with EventPipeEventSource was only a stepping stone toward my main goal, which was to do real-time tracing on a running application. Of course, that’s a story for the next post!

Many thanks @Lucas_Trz and @romainverdier for the reviews.

--

--