
Writing a GDB Frame Filter

Min-Yih Hsu
Aug 2, 2018

Showing a backtrace in GDB has saved many developers' lives while hunting bugs, but it becomes more of a distraction when you're facing a large project whose call stacks are at least 60 frames deep. Luckily, GDB has built-in Python infrastructure (unless you're using a custom build or a very old version) that lets developers customize how their stack traces look.

The first thing that surprised me is that the frame filter infrastructure inside GDB actually went through some breaking changes in the past (that is one of my primary motivations for writing this article, as most of the tutorial blogs online are outdated). So be sure to check your version before you start. I'm going to use GDB 7.11, whose frame filter interface aligns with the latest official documentation.
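As a minimal sketch of such a check, you can ask the embedded Python interpreter directly. gdb.VERSION and the gdb.frame_filters dictionary are part of GDB's Python API; the helper name has_frame_filters is made up for this example, and the ImportError fallback is only there so the file can also be loaded outside GDB. Note that this only tells you whether the API exists at all, not which revision of it you have.

```python
# check_frame_filter_support.py: load it with `source` inside GDB.
try:
    import gdb  # only importable inside GDB's embedded interpreter
except ImportError:
    gdb = None  # running under a plain Python interpreter


def has_frame_filters(gdb_module):
    """Return True if this GDB exposes the frame filter dictionary."""
    return gdb_module is not None and hasattr(gdb_module, "frame_filters")


if gdb is not None:
    print("GDB version: %s" % gdb.VERSION)
print("frame filters available: %s" % has_frame_filters(gdb))
```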

There are two components we're going to talk about here (both are Python APIs, of course):

  1. GDB Frame Filter API
  2. GDB Frame Decorator API

Let's go through them top-down, starting with the frame filter.

Frame Filter

A frame filter processes the incoming backtrace and may return a new one. Several filters can be active at the same time, and they are arranged as a pipeline:

[Frame]--> (Filter A) -[Frame']-> (Filter B) -[Frame'']->

Their running order is determined by priority (filters with a higher priority run first), one of the three attributes you must provide when writing your filter class. This brings us to our first frame filter prototype:

class UpperCaseLoverFilter:
    def __init__(self):
        self.name = "uppercase_lover"
        self.enabled = True
        self.priority = 100
        gdb.frame_filters[self.name] = self

    def filter(self, ...):
        (...)

Though there is some code (the "..." part) that I don't want to reveal yet, the basic structure turns out to be pretty simple: we don't even need to inherit from a parent class! The filter function is also self-explanatory: it receives the processed backtrace from the previous filter, and it returns our backtrace.
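To make the pipeline idea concrete, here is a small simulation that runs in plain Python, without GDB. It is only a sketch of the ordering rule (enabled filters, highest priority first, each one fed the previous one's output); LabelFilter, run_pipeline and registry are hypothetical names, and the "frames" are plain strings.

```python
class LabelFilter(object):
    """Hypothetical stand-in for a frame filter; frames are plain strings."""

    def __init__(self, name, priority, enabled=True):
        self.name = name
        self.priority = priority
        self.enabled = enabled

    def filter(self, frame_iter):
        # Wrap each incoming "frame" so the order of application is visible.
        return ("%s(%s)" % (self.name, frame) for frame in frame_iter)


def run_pipeline(filters, frames):
    """Chain the enabled filters, highest priority first."""
    active = [f for f in filters.values() if f.enabled]
    active.sort(key=lambda f: f.priority, reverse=True)
    frame_iter = iter(frames)
    for frame_filter in active:
        frame_iter = frame_filter.filter(frame_iter)
    return list(frame_iter)


registry = {}  # plays the role of gdb.frame_filters in this sketch
for f in (LabelFilter("A", 100), LabelFilter("B", 50),
          LabelFilter("C", 75, enabled=False)):
    registry[f.name] = f

print(run_pipeline(registry, ["main", "HandleMsg"]))
# prints ['B(A(main))', 'B(A(HandleMsg))']
```

Filter A has the highest priority, so it sees the frames first and B wraps A's output; C is disabled and never runs.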

Until now I've only used the term backtrace, but how exactly is a backtrace represented? You're right: as a series of frames, handed over as an iterator in the first (or second, if you count self) argument of the filter function.

def filter(self, frame_iter):
    (...)

But don't rush to get your hands dirty with the frames extracted from the iterator: you'll sadly find that they are not real frames, i.e. gdb.Frame objects. Instead, each one is literally a processed frame. What does that mean? GDB tries to preserve the integrity of the original frame data, so that no matter how badly a frame filter ruins a frame (and returns the ruined frame), filters at the next layer can still see the original. The frames extracted from frame_iter are therefore processed (i.e. ruined) frames returned by the previous filter, and you can get the original frame by calling the inferior_frame function, which we will talk about shortly, on the processed frame object.

As some of you might have noticed, the filter function is not passed a single frame but a series of frames. So, unlike filters you may have written before, it should also return an iterator rather than a single (processed) frame.

There are many ways to return an iterator carrying processed frames, but all of them face one common problem: how do I modify a frame? GDB doesn't explicitly expose its processed frame format. Instead, it provides a functor-like interface for you to implement: the frame decorator.

Frame Decorator

Before we dive in, let's take a step back and see how a decorator helps us build the iterator mentioned previously. A decorator is a functor-like object, so you can think of it and use it as a function. Therefore, the simplest way to create a new frame iterator is itertools.imap:

def filter(self, frame_iter):
    return itertools.imap(YourDecorator, frame_iter)

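itertools.imap creates a new iterator by applying a function, or any callable object, to each element of the given iterator, and YourDecorator is the callable it applies to frame_iter. Note that itertools.imap only exists in Python 2; if your GDB is built against Python 3, the built-in map does the same job.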
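If you want one script that works with either Python version, a small import shim is enough. The sketch below also shows that the mapping is lazy: a decorator is only built when the consumer actually asks for the next frame. TagDecorator, lazy_map and filter_frames are hypothetical names, and the frames are plain strings so the snippet runs outside GDB.

```python
try:
    from itertools import imap as lazy_map  # Python 2
except ImportError:
    lazy_map = map  # Python 3: the built-in map() is already lazy


class TagDecorator(object):
    """Hypothetical stand-in for a decorator class: wraps one frame."""

    created = 0  # counts how many decorators have been built so far

    def __init__(self, frame):
        self.frame = frame
        TagDecorator.created += 1


def filter_frames(frame_iter):
    # Calling the class on each element builds one decorator per frame.
    return lazy_map(TagDecorator, frame_iter)


decorated = filter_frames(iter(["frame0", "frame1", "frame2"]))
first = next(decorated)
# Only one decorator exists at this point; the other two frames
# have not been touched yet.
print("%s %d" % (first.frame, TagDecorator.created))
```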

Of course, there are other ways to return an iterator. For example, you can create a plain iterator from scratch, which is pretty useful if you would like to jump arbitrarily among frames.

Now let's get back to the frame decorator. To make modifying a frame easier, GDB provides a parent class for you to inherit from: the FrameDecorator class.

from gdb.FrameDecorator import FrameDecorator

class UpperCaseDecorator(FrameDecorator):
    def __init__(self, prev_decorated_frame):
        super(UpperCaseDecorator, self).__init__(prev_decorated_frame)
        self.prev_decorated_frame = prev_decorated_frame

    def function(self):
        (...Do something to the function name...)

    def line(self):
        (...Do something to the line number...)

    def elided(self):
        (Collapse several frames, we would cover this later)

It's pretty important to figure out what object is passed to the class constructor. As the name suggests, it is a decorated frame: a processed frame returned from the previous frame decorator in the frame filter pipeline. So don't confuse it with the frame at the previous level of the stack.

You can modify the components of a frame (function name and line number, to name a few) by overriding the corresponding member functions of FrameDecorator, just like in the snippet shown above. For example, say I want to change all function names to upper case:

def function(self):
    orig_frame = self.prev_decorated_frame.inferior_frame()
    return str(orig_frame.name()).upper()

Remember the inferior_frame function mentioned earlier? It returns the "unmodified" frame, as it was before any frame decorator was applied.

You should refer to the API documentation to check what each member function you're interested in is expected to return, and with which data type.

Finally, we can run this tiny filter in GDB: just source the script file and execute a Python statement to initialize and register the filter.

(gdb) source uppercase_lover_frame_filter.py
(gdb) python UpperCaseLoverFilter()

As you might have noticed, the last statement in UpperCaseLoverFilter's constructor, gdb.frame_filters[self.name] = self, registers the filter itself in the global frame filter dictionary. Now you get uppercase function names in your stack trace!

(gdb) bt
#0 0x123456 void REMTHEBEST(arg=...)
#1 0x234567 int SAYREMTHEBEST(...)
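For reference, here is how the pieces of this section could fit together in a single script file. Treat it as a sketch rather than the definitive version: the optional registry argument and the ImportError fallback are my additions so the classes can also be exercised outside GDB, and the stand-in FrameDecorator only imitates the small part of the real class that this example relies on.

```python
# uppercase_lover_frame_filter.py
try:
    import gdb
    from gdb.FrameDecorator import FrameDecorator
except ImportError:
    # Not running inside GDB: use a minimal stand-in (an assumption made
    # for this sketch, not GDB's actual code).
    gdb = None

    class FrameDecorator(object):
        def __init__(self, base):
            self._base = base

        def inferior_frame(self):
            # Unwrap nested decorators until the raw frame is reached.
            if hasattr(self._base, "inferior_frame"):
                return self._base.inferior_frame()
            return self._base


class UpperCaseDecorator(FrameDecorator):
    def __init__(self, prev_decorated_frame):
        super(UpperCaseDecorator, self).__init__(prev_decorated_frame)
        self.prev_decorated_frame = prev_decorated_frame

    def function(self):
        # Always start from the original frame, not from whatever an
        # earlier decorator reported.
        orig_frame = self.prev_decorated_frame.inferior_frame()
        return str(orig_frame.name()).upper()


class UpperCaseLoverFilter(object):
    def __init__(self, registry=None):
        self.name = "uppercase_lover"
        self.enabled = True
        self.priority = 100
        # Inside GDB, register in the global dictionary by default.
        if registry is None and gdb is not None:
            registry = gdb.frame_filters
        if registry is not None:
            registry[self.name] = self

    def filter(self, frame_iter):
        # A generator expression is lazy on both Python 2 and Python 3.
        return (UpperCaseDecorator(frame) for frame in frame_iter)
```

Inside GDB the two commands shown earlier (source, then python UpperCaseLoverFilter()) work unchanged with this file.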

Now let's talk about something more advanced: coalescing several frames. Recall the very first section, where we talked about being driven crazy by a really deep stack trace. Sometimes the depth is caused by many layers of wrapper functions, for example:

(gdb) bt
#0 0x121212 void doPrintImpl(str="Rem the best", ...)
#1 0x343434 void doPrint(str="...")
#2 0x565656 void print(str="...")
#3 0x787878 int HandleMsg(...)

We all know that frames #0 to #2 are basically doing the same thing. So when you want to trace who calls the print function, i.e. the HandleMsg function above, it is more useful to merge frames #0 and #1 into #2:

(gdb) bt
#2 0x565656 void print(str="...")
    #0 0x121212 void doPrintImpl(str="Rem the best", ...)
    #1 0x343434 void doPrint(str="...")
#3 0x787878 int HandleMsg(...)

To do this, you need to override the FrameDecorator.elided function so that it returns the frames you wish to collapse. Overriding that function is not difficult, though; the problem people usually bump into is how to manipulate the iterator.

Recall that FrameDecorator is a functor-like (callable) object, not the iterator we return from the filter function of a frame filter. We should either use an iterator tool like the itertools.imap mentioned earlier or create a new iterator. The one we're going to talk about falls into the latter category. Here is a first peek:

class SqueezerIterator:
    def __init__(self, input_frame_iterator):
        self.input_frame_iter = input_frame_iterator

    def __iter__(self):
        return self

    def getOrigName(self, frame):
        return str(frame.inferior_frame().name())

    def next(self):
        # next() raises StopIteration once the input is exhausted,
        # which also ends this iterator.
        frame = next(self.input_frame_iter)
        elided_frames = []
        while self.getOrigName(frame).startswith('doPrint'):
            # Wrapper frame: remember it and move on to its caller.
            elided_frames.append(frame)
            frame = next(self.input_frame_iter)

        if len(elided_frames) > 0:
            return SqueezerDecorator(frame, elided_frames)
        else:
            return frame

    __next__ = next  # Python 3 looks for __next__ instead of next

The idea behind this code is pretty simple: create an iterator that skips the next item if it is one of the wrapper functions. The job of the FrameDecorator (i.e. SqueezerDecorator above) is to record the elided frames. Here is what SqueezerDecorator might look like:

class SqueezerDecorator(FrameDecorator):
    def __init__(self, processed_frame, elided_frames):
        super(SqueezerDecorator, self).__init__(processed_frame)
        self.elided_frames = elided_frames

    def elided(self):
        return self.elided_frames

All we need to do when overriding FrameDecorator.elided is return the frames we want to elide. The processed_frame passed to the constructor, which is also passed on to the parent constructor, is the next frame we're going to return from the iterator. It is left untouched in this case, since we're only interested in the frames that get eliminated.
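To see the grouping logic without starting a debugger, here is a sketch that replays the same idea on made-up frame objects, this time written as a generator. FakeFrame, Squeezed and squeeze are hypothetical names for this example only; in a real filter the elements would be decorator objects, and the grouping would be carried by a FrameDecorator subclass like the one above.

```python
class FakeFrame(object):
    """Stand-in for a decorated frame; it is its own 'inferior frame'."""

    def __init__(self, name):
        self._name = name

    def inferior_frame(self):
        return self

    def name(self):
        return self._name


class Squeezed(object):
    """Stand-in for SqueezerDecorator: a frame plus the frames it elides."""

    def __init__(self, frame, elided_frames):
        self.frame = frame
        self.elided_frames = elided_frames

    def elided(self):
        return self.elided_frames


def squeeze(frame_iter, prefix="doPrint"):
    """Yield frames, folding runs of wrapper frames into their caller."""
    pending = []
    for frame in frame_iter:
        if str(frame.inferior_frame().name()).startswith(prefix):
            pending.append(frame)  # wrapper: hold it back for now
        elif pending:
            yield Squeezed(frame, pending)
            pending = []
        else:
            yield frame
    # Wrappers with no caller left are simply dropped in this sketch.


stack = [FakeFrame(n) for n in ("doPrintImpl", "doPrint", "print", "HandleMsg")]
for item in squeeze(iter(stack)):
    if isinstance(item, Squeezed):
        names = [f.name() for f in item.elided()]
        print("%s elides %s" % (item.frame.name(), names))
    else:
        print(item.name())
# prints:
# print elides ['doPrintImpl', 'doPrint']
# HandleMsg
```

A generator keeps the bookkeeping in local variables, so there is no need to write __iter__ and next by hand; whether that reads better than an explicit iterator class is mostly a matter of taste.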

By default, the frames returned from the elided function are shown indented in the backtrace, just like in the demo backtrace above. You can hide them from the backtrace view with the command backtrace hide. However, only later versions support this hiding feature, so if you're an old-fashioned guy (just kidding), you can always return frame unconditionally instead of choosing between SqueezerDecorator(...) and frame; that is, get rid of the frame decorator and simply remove those frames from the iterator.
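That simpler variant might look like the sketch below: no decorator at all, just a filter method that skips the wrapper frames. WrapperDropFilter and its prefix parameter are hypothetical names, and registration is left out so the class can be tried in plain Python; inside GDB you would add the instance to gdb.frame_filters under its name, as shown earlier.

```python
class WrapperDropFilter(object):
    """Hypothetical filter that removes wrapper frames instead of eliding them."""

    def __init__(self, prefix="doPrint"):
        self.name = "wrapper_drop"
        self.enabled = True
        self.priority = 100
        self.prefix = prefix

    def filter(self, frame_iter):
        # Writing filter() as a generator gives us the iterator for free.
        for frame in frame_iter:
            name = str(frame.inferior_frame().name())
            if not name.startswith(self.prefix):
                yield frame
```

The trade-off is that the dropped frames are gone from the listing entirely, so you lose the hint that something was folded away.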

There is still a lot to say about frame filters, such as which code should go in the filter and which in the decorator (spoiler: you should put most of your filtering logic in the decorator). To sum up, frame filters can save you lots of time browsing endless backtrace logs. It's even cooler to automate things in combination with GDB scripts (actually that's my initial motivation for using frame filters, and it's doing pretty well so far).
