Tracing allocations with EventPipe part 2 — Reading call stacks without TraceLog
In the first post, I explained how to use EventPipe to detect and locate heap allocations in a target application. The solution was based on the following steps:
- Trace the target application with dotnet-trace using the
GCAllObjectAllocation
,Type
, andGCHeapAndTypeNames
provider keywords. - Convert the output nettrace file to ETLX.
- Use
TraceLog
to parse the ETLX file and extract heap allocations with call stacks. - Add a custom event source to ignore irrelevant allocations and add valuable context to captured allocations.
In this post, I will try to explore if the output nettrace file can be parsed directly with EventPipeEventSource
.
A pull request was recently merged to add real time support for EventPipe in
TraceLog
. When it is released, it will probably be the recommended way to do real time profiling with call stack support. The content of this post should be mostly treated as educational, even though it is still useful if you do not want to useTraceLog
.
The problem with EventPipeEventSource
EventPipeEventSource
can be used to read the trace events from the nettrace file:
However, call stacks cannot be read, because the CallStack
extension method is only valid for TraceLog-based events.
It turns out there is a great post by Christophe Nasarre explaining exactly how to read allocation call stacks without relying on TraceLog
! Maybe I could just have used his code, et voilà?
Well, not so fast! His solution is based on the Stack
keyword and the ClrStackWalkTraceData
event, but the Stack
keyword was not enabled in our current implementation. Would it be possible that dotnet-trace automatically enables the Stack
keyword for us? A short analysis revealed that the trace file did not contain any ClrStackWalkTraceData
event. Worse, I could not find any ClrStackWalkTraceData
event even when I explicitly enabled the Stack
keyword. The explanation was simple: Christophe Nasarre’s solution works for ETW but not for EventPipe, because the Stack
keyword has no effect on EventPipe.
ETW is a tracing API that was created before EventPipe. ETW is more powerful and allows capturing machine-wide user and kernel events on the Windows platform. However, EventPipe is cross-platform and does not require admin privilege. ETW and EventPipe share many concepts (event sources, providers, parsers) and use common library types, which can sometimes make it difficult to identify specific features.
Loading the call stack addresses
So, I chose to investigate how call stacks were extracted, partly by reading the TraceLog and EventPipeEventSource source code, partly by debugging my program and stepping into the TraceEvent library. I discovered that call stacks were written by EventPipeEventSource inside the trace events in an internal eventRecord
field and then read by TraceLog. Sadly, I could not find any direct or indirect way to read this field using the public TraceEvent API, so I needed a way to read an internal field with an internal field type. I considered multiple options.
UnsafeAccessorAttribute
At first glance UnsafeAccessorAttribute seemed appropriate to read an internal field. Yet, I could not use it because the field type is also internal and accessing members with internal types is not supported (see this issue).
Type-proxy pattern
The type-proxy pattern consists in defining a proxy type with the same memory layout as your target type, and then using Unsafe.As<TargetType, ProxyType>
.
This option worked but TraceEvent
has so many fields that using this pattern seemed brittle. However, it was my first effective solution, so I stuck with it for most of my experiments. You can find the code here.
IgnoresAccessChecksToAttribute + UnsafeAccessorAttribute
IgnoresAccessChecksToAttribute
is the opposite of InternalsVisibleToAttribute
: it allows accessing the internal members of another assembly. This attribute can be problematic to use because it is not declared in the BCL and has no compiler support, but the package IgnoresAccessChecksToGenerator solves both of these issues.
This option almost worked out-of-the-box, but the generator was unable to make TraceEvent
members public. It was still very useful to get access to the TraceEventNativeMethods
members, so I simply took advantage of UnsafeAccessorAttribute
to read the field:
For the record, I initially used the excellent InlineIL.Fody library to read the field, but even though the code was quite simple, I believe UnsafeAccessorAttribute
is more straightforward here.
Please note that all mentioned options are in the “unsafe” path. Any update of the TraceEvent library could break the code and require manual adjustments. There are alternative options, like opening an issue on the GitHub project to ask for a supported way to extract the stack traces, or build your own version of the library with a few changes.
Reading the extended data
Once I had access to the eventRecord
field, it was easy to extract the call stack addresses:
The first step was completed, but the code was only loading unresolved call stacks, i.e.: lists of addresses. The next step was to find the related methods.
Resolving call stacks
Again, I mostly used a combination of source code reading and debugging to understand how TraceLog
was able to resolve method names from addresses. TraceLog
has multiple mechanisms to resolve method names, I will only describe the one that comes into play in our use case.
I discovered that methods are identified from the MethodLoad
/ MethodUnload
events, and then saved in a list. The list is sorted by method start addresses to support binary-search-based lookups. I used a similar pattern in my code:
The method list could then be used to resolve method names from addresses:
I thought that my solution was complete, but I realized that I was only able to resolve the names of methods from my own program and not those of the BCL. At that point, I noticed that TraceLog
was also using events from the CLR Rundown provider. I also spotted in the dotnet-trace source code that the rundown provider is enabled by default (and there is an unreleased explicit option to disable it).
After integrating the CLR Rundown events, I was able to resolve all the call stacks:
Although I was relatively satisfied at this point, I was slightly disappointed because most of the methods were identified in the MethodDCStopVerbose
, which is only raised at the end of the profiling. Because of this behavior, I could not do real-time address resolving: I had to store the call stack addresses and resolve them at the end of the profiling. However, I could not find an alternative event to identify loaded methods.
Closing
In the end, I was able to extract resolved call stacks from trace events without relying on TraceLog
. The method resolving part was implemented using standard trace events, but I could not find a way to load the call stack addresses using the TraceEvent public API and I had to rely on “unsafe” code to read an internal field.
I clearly underestimated the complexity of TraceLog
and the huge amount of logic that is required to support the CallStack
extension method of the TraceEvent library.
Still, my objective was completed, and I learned many things in the process. But reading the nettrace file with EventPipeEventSource
was only a stepping stone toward my main goal, which was to do real-time tracing on a running application. Of course, that’s a story for the next post!
Many thanks @Lucas_Trz and @romainverdier for the reviews.