Finding Memory Leaks in C - 2.0
The place matters
This article describes an approach to finding memory leaks in C code on macOS. It covers one of the possible options and provides a skeleton that may be extended if necessary.
In contrast to the previous article, this approach is not entirely POSIX compatible, since it uses the backtrace and backtrace_symbols functions, which are not part of the IEEE 1003.1 standard. If your operating system does not provide these functions, you will need to find alternatives on your own.
The approach was tested only on macOS Catalina 10.15.6 with the clang compiler, version 11.0.3.
This article is more practical than theoretical. It touches on some deep and exciting system mechanisms, but it only mentions them and does not dive into the system runtime. Though that could be quite an exciting journey, it is beyond the scope of this article.
The previous article considered a basic approach to detecting memory leaks. All that approach does is tell the developer whether any memory leaks occurred while the application ran, and how many. Generally speaking, that is not enough in most cases. We usually want not just to know that the application leaks, but also to fix the leaks. To fix them, we need to know where they happened. More precisely, we want to know the exact place where each leaked piece of memory was allocated.
Below we will consider one possible way to achieve that.
Just run
The process is quite the same as in the previous article:
1. Download the file with leak checking code here.
2. Place it anywhere in your project. Just make sure that it is compiled and linked with the rest of the project, so add it to your Makefile, project file, etc.
3. In your main.c file, add the declaration of the check_leaks function at the beginning of the file.
4. Add a call to the check_leaks function just before main() returns.
5. Run your application.
6. Check the output.
If you have any memory leaks, you should see detailed information about them: their addresses, leaked memory sizes, and call stacks.
Dig a little bit deeper
If you just need to run the code to detect memory leaks, you may stop reading here. This section contains some technical details about the main concepts behind our solution. Let’s consider the main idea of the approach that we used. It is based on two key concepts:
- Intercept malloc/free function calls.
- Obtain the call stack info to get more details about where a leak happened.
First of all, let’s define the terms. What is a memory leak? In a narrow sense, it is a piece of memory that was allocated and was never freed. So our first task is to detect all memory allocations and all memory deallocations in our application. After that, we can compare the allocation and deallocation counts, and if they differ, we have memory leaks.
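The counting idea can be sketched in a few lines. This is only an illustration: the helper names leaked_blocks and report_leaks are hypothetical and do not come from the article's sources, and the message format is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: how many allocations were never matched by a free. */
static size_t leaked_blocks(size_t malloc_count, size_t free_count) {
    return malloc_count > free_count ? malloc_count - free_count : 0;
}

/* Hypothetical helper: print a one-line summary and return the leak count. */
static size_t report_leaks(FILE *out, size_t malloc_count, size_t free_count) {
    size_t leaks = leaked_blocks(malloc_count, free_count);
    if (leaks > 0) {
        fprintf(out, "Memory leaks detected: %zu\n", leaks);
    } else {
        fprintf(out, "No memory leaks detected\n");
    }
    return leaks;
}
```

A function such as check_leaks could call report_leaks with the two counters collected by the intercepted functions. Equal counters only show that the numbers match; they do not say which blocks were released.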
The other task is to detect, as precisely as possible, the places where the leaked pieces were allocated.
Let’s look at these tasks one by one.
System functions interception
To intercept the malloc/free functions, we use the dlsym function. dlsym is part of the POSIX standard, so you may easily find its description in man dlsym or in the POSIX standard document (see the References section). Here we will just demonstrate how it may be used for our goals.
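A minimal sketch of this kind of interception is given here as an illustration; it is not the article's exact code. The numbered comments (1) to (7) are placed to match the walkthrough, with (3) assumed to mark malloc_init. Passing RTLD_NEXT to dlsym is also an assumption, since the text only says that dlsym returns the real function. The _GNU_SOURCE define is needed only on glibc and is harmless on macOS.

```c
#define _GNU_SOURCE   /* glibc exposes RTLD_NEXT only with this define */
#include <dlfcn.h>
#include <stddef.h>
#include <stdlib.h>   /* standard prototype that our malloc must match */

/* (1) Pointer to the real malloc, resolved on first use. */
static void *(*real_malloc)(size_t) = NULL;

/* (2) Number of intercepted malloc calls. Not thread-safe in this sketch. */
static size_t malloc_counter = 0;

/* (3) Assumed marker: look up the next definition of malloc after ours. */
static void malloc_init(void) {
    real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
}

/* (4) Same signature as the system malloc, so callers end up here. */
void *malloc(size_t size) {
    if (real_malloc == NULL) {      /* (5) lazy initialization */
        malloc_init();
    }
    if (real_malloc == NULL) {
        return NULL;                /* lookup failed: report as out of memory */
    }
    malloc_counter++;               /* (6) count the allocation */
    return real_malloc(size);       /* (7) delegate to the real malloc */
}
```

Whether calls made from inside system libraries also reach this wrapper depends on the platform's symbol lookup rules, so the counter may include allocations that your own code never made.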
The code snippet above demonstrates the main idea of system function interception. We will follow it in calling order rather than line by line.
In (4) we placed a function that has exactly the same signature as the system malloc function. When we call malloc anywhere in our application, it is this implementation that will be called. Next (5) we check whether we have already initialized our memory leak detection logic, and if we haven’t, we call malloc_init. malloc_init does the second trick: it calls the dlsym function, which returns a pointer to the real malloc function, and stores it in the real_malloc static variable declared at (1).
Then in (6) we increment the malloc_counter variable declared at (2), which lets us count memory leaks at the end of our program.
As the final step (7), we call the real_malloc function, which does the real memory allocation, and return the resulting pointer to the caller.
In the same way, we intercept the free function, and other functions may be intercepted too.
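The free wrapper can be sketched along the same lines. The names real_free and free_counter are assumptions modeled on the malloc side, and RTLD_NEXT is again an assumed choice of handle. One design decision here is not stated in the article: free(NULL) is a no-op in C, so this sketch does not count it, which keeps the comparison with the allocation counter meaningful.

```c
#define _GNU_SOURCE   /* glibc exposes RTLD_NEXT only with this define */
#include <dlfcn.h>
#include <stddef.h>
#include <stdlib.h>   /* standard prototype that our free must match */

/* Pointer to the real free, resolved on first use (assumed name). */
static void (*real_free)(void *) = NULL;

/* Number of intercepted non-NULL free calls (assumed name). */
static size_t free_counter = 0;

/* Same signature as the system free, so callers end up here. */
void free(void *ptr) {
    if (real_free == NULL) {
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
    }
    if (ptr == NULL) {
        return;                 /* free(NULL) releases nothing: do not count it */
    }
    free_counter++;
    if (real_free != NULL) {
        real_free(ptr);         /* delegate to the real free */
    }
}
```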
Note that, in principle, any dynamically linked library function may be intercepted in this way, not only the memory functions. This gives developers a lot of room to experiment with and tune their applications.
Obtaining call stack
Knowing where a malloc call was made lets the developer find and fix the leak quickly. A call stack is one way to get that information. It is not 100% precise, since it does not give the file name and line where the call was made, but it still provides a lot of information and makes it easier to find the leaking call.
Here is the example from man backtrace. It gives the main idea of how we can obtain call stack information.
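The sketch below follows the same three steps as the man page example, but it is written for illustration and is not a verbatim copy. The wrapper function print_call_stack and the MAX_FRAMES limit are hypothetical names; the markers (1) to (3) correspond to the declaration of the array, the backtrace call, and the backtrace_symbols call.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 128  /* assumed upper bound on the captured stack depth */

/* Hypothetical helper: print the current call stack, return the frame count. */
static int print_call_stack(FILE *out) {
    void *callstack[MAX_FRAMES];                          /* (1) */
    int frames = backtrace(callstack, MAX_FRAMES);        /* (2) */
    char **strs = backtrace_symbols(callstack, frames);   /* (3) */
    if (strs == NULL) {
        return 0;                /* symbol lookup failed, nothing to print */
    }
    for (int i = 0; i < frames; ++i) {
        fprintf(out, "%s\n", strs[i]);
    }
    free(strs);                  /* the array is one malloc'ed block */
    return frames;
}
```

Function names appear in the output only when the symbols are available to the dynamic linker, so some frames may be shown as bare addresses.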
In (1) we declare an array of pointers to void, where the addresses from the call stack will be stored. Then we call the backtrace function (2), which fills the array with the return addresses of the functions on the call stack and returns the number of entries written. In (3) we call backtrace_symbols, which ‘converts’ those addresses to function names. As a result, we obtain an array of C strings with information about the functions on the stack, including their names.
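Note that, according to the man page, the array returned by backtrace_symbols is obtained from malloc, so the caller should release it with a single call to free once it is no longer needed. The individual strings live inside that same block and must not be freed separately. Because backtrace_symbols allocates memory itself, be careful when calling it from inside an intercepted malloc: if that internal allocation is routed back to the wrapper, the wrapper will call itself endlessly and overflow the stack, so a re-entrancy guard is a sensible precaution.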
After obtaining the call stack info, we may parse it and use it in any way. See the full implementation for details. You may well want to do something different, so feel free to write your own implementation that meets your needs.
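One possible use of the captured stack is to store it next to each live allocation so that it can be printed later for the blocks that were never freed. The sketch below is an assumption about how such bookkeeping could look, not the layout used in the full implementation; alloc_record, record_alloc, record_free, live_records and both limits are hypothetical. A fixed-size static table is used so that the bookkeeping itself never calls malloc.

```c
#include <stddef.h>
#include <string.h>

#define MAX_RECORDS 1024   /* assumed limit on tracked live allocations */
#define MAX_FRAMES  16     /* assumed limit on stored stack depth */

typedef struct {
    void  *address;             /* NULL marks an unused slot */
    size_t size;
    void  *frames[MAX_FRAMES];  /* addresses captured by backtrace */
    int    frame_count;
} alloc_record;

static alloc_record records[MAX_RECORDS];

/* Remember an allocation; returns 1 on success, 0 if the table is full. */
static int record_alloc(void *address, size_t size,
                        void *const *frames, int frame_count) {
    if (address == NULL) return 0;
    if (frame_count < 0) frame_count = 0;
    if (frame_count > MAX_FRAMES) frame_count = MAX_FRAMES;
    for (size_t i = 0; i < MAX_RECORDS; ++i) {
        if (records[i].address == NULL) {
            records[i].address = address;
            records[i].size = size;
            records[i].frame_count = frame_count;
            if (frame_count > 0) {
                memcpy(records[i].frames, frames,
                       (size_t)frame_count * sizeof(void *));
            }
            return 1;
        }
    }
    return 0;
}

/* Forget an allocation; returns 1 if it was tracked, 0 otherwise. */
static int record_free(void *address) {
    if (address == NULL) return 0;
    for (size_t i = 0; i < MAX_RECORDS; ++i) {
        if (records[i].address == address) {
            records[i].address = NULL;
            return 1;
        }
    }
    return 0;
}

/* Slots still occupied at exit correspond to leaked blocks. */
static size_t live_records(void) {
    size_t count = 0;
    for (size_t i = 0; i < MAX_RECORDS; ++i) {
        if (records[i].address != NULL) count++;
    }
    return count;
}
```

The linear scan keeps the sketch short; a real tool would more likely use a hash table keyed by address.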
What’s next
Of course, not all allocation functions have been considered in this article. There are still calloc, realloc and others. You may intercept them too if you use them in your project. You may also write logs to a file instead of stderr, group leaks that share the same call stack, and so on.
This approach gives you full control over memory allocation management. If you are brave enough, you may experiment with the code and implement almost any logic to play with memory leaks.
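Grouping leaks by call stack needs a way to decide whether two captured stacks are the same. A minimal sketch, assuming each stack is kept as the array of addresses filled in by backtrace; the function name same_call_stack is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if both stacks hold the same addresses in the
 * same order, 0 otherwise. */
static int same_call_stack(void *const *a, int a_count,
                           void *const *b, int b_count) {
    if (a_count != b_count) {
        return 0;
    }
    for (int i = 0; i < a_count; ++i) {
        if (a[i] != b[i]) {
            return 0;
        }
    }
    return 1;
}
```

Since the entries are return addresses, two calls made from different lines of the same function count as different stacks, which is usually what you want when looking for the exact allocation site.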
Happy coding.
References
- Sources for this solution on gist.
- POSIX IEEE 1003.1-2017 document, available for free download.
- man dlsym
- man backtrace: documentation for both the backtrace and backtrace_symbols functions.