The Null Pointer Strikes Again: How a Simple Mistake Led to a Segfault Odyssey
From Missing Core Dumps to Mismatched Headers in Embedded Linux
As we approach the end of 2023, I want to reflect on a particularly vexing bug that challenged me while integrating an application onto new hardware running Embedded Linux. Remarkably, it turned out to be the final hurdle I overcame as an Embedded Software Engineer before embarking on a week-long vacation!
1. Missing Core Dumps
The story starts when I took on a task to integrate a camera application onto new hardware running a custom Yocto-based Linux. Step by step, the pieces came together, and just when I thought the whole app would integrate seamlessly, it crashed with the intimidating Segmentation fault.
When the segmentation fault occurred, my immediate thought was to get the core dump file for analysis. But guess what? No core dump file was generated!
Even after going through all the hints in the Linux man-pages about missing core dump files, the issue was still unresolved. So, my next step was to … turn off the laptop and call it a day!
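One of the checks described in core(5) is the core file size limit: if RLIMIT_CORE is 0, no core file is written. The sketch below shows how an application could inspect and raise that limit from inside the process. The function names are hypothetical and not part of the camera application, and this check was not what resolved the issue in this story.

```cpp
#include <sys/resource.h>
#include <cstdio>

// Hypothetical helper: returns true if the soft RLIMIT_CORE limit
// is non-zero, i.e. the limit alone does not suppress core files.
bool core_dumps_allowed_by_rlimit()
{
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        std::perror("getrlimit");
        return false;
    }
    return lim.rlim_cur != 0;
}

// Hypothetical helper: raises the soft limit up to the hard limit.
// An unprivileged process may always do this; raising the hard
// limit itself would need extra privileges.
bool raise_core_limit()
{
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling raise_core_limit() early in main() and then logging core_dumps_allowed_by_rlimit() is one way to rule this cause in or out on the target.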
While travelling on the train, the term "signal handler" came to mind, which gave me a big hint about where to look next. The next day, I searched through the application code and found that a custom signal handler for SIGSEGV had been set up, overriding the default action that generates the core dump file.
With the custom sigaction disabled, the core dump file was finally generated on the segmentation fault. So, the moral of the story: a relaxed mind is the greatest tool for debugging! :D
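For anyone who wants to keep a custom SIGSEGV handler and still get a core dump, one common pattern is to let the handler do its logging, restore the default disposition, and re-raise the signal. The sketch below is only an illustration of that pattern. The names on_sigsegv and install_crash_handler are hypothetical, and this is not the handler from the application in this story.

```cpp
#include <signal.h>
#include <unistd.h>

// Hypothetical crash handler: logs a message, then hands the signal
// back to the default action so the kernel can still write a core dump.
extern "C" void on_sigsegv(int signo)
{
    static const char msg[] = "SIGSEGV caught, re-raising for core dump\n";
    // write() is async-signal-safe; printf() is not.
    ssize_t written = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)written;

    // Restore the default disposition and re-raise. The signal is
    // blocked while this handler runs, so it is delivered with the
    // default action as soon as the handler returns.
    signal(signo, SIG_DFL);
    raise(signo);
}

// Hypothetical installer: registers on_sigsegv for SIGSEGV.
bool install_crash_handler()
{
    struct sigaction sa{};
    sa.sa_handler = on_sigsegv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

A handler that simply logs and calls exit() replaces the default action entirely, which is why no core file appears in that case.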
2. Unveiling the Mystery of Mismatched Headers
Analysis with a cross-platform GDB showed that a pointer to an object had been overwritten with NULL. I then found that the pointer variable was placed right after a v4l2_plane variable in a class.
class Imager
{
    ...
    ...
private:
    struct v4l2_plane plane;  // filled in by the VIDIOC_QUERYBUF ioctl
    void *pointer;            // sits right after plane in the class
};
Further debugging showed that a buffer overflow occurred on the __u32 reserved[11] array of the v4l2_plane during the ioctl system call with the VIDIOC_QUERYBUF command: a total of 256 bytes were set to 0, including the 44-byte reserved[11] array. Upon realising that 256 bytes precisely matches the size of an array of uint32_t with 64 elements, I promptly delved into the source code of the V4L2 driver. After a thorough investigation, I found that the reserved array of the v4l2_plane structure defined in the driver's videodev2.h file had 64 elements, contrary to the expected 11 defined in the videodev2.h file in the SDK.
// videodev2.h from SDK
struct v4l2_plane {
    __u32 bytesused;
    __u32 length;
    union {
        __u32 mem_offset;
        unsigned long userptr;
        __s32 fd;
    } m;
    __u32 data_offset;
    __u32 reserved[11];
};
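To make the size of the mismatch concrete, here is a small sketch that models the two layouts side by side. PlaneSdk and PlaneDriver are hypothetical stand-ins for the SDK and driver versions of v4l2_plane, written with fixed-width types so the example builds without the kernel headers. The 64-element array reflects what I found in the driver's header, not the upstream definition.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical copy of the SDK (userspace) layout: reserved[11].
struct PlaneSdk {
    std::uint32_t bytesused;
    std::uint32_t length;
    union {
        std::uint32_t mem_offset;
        unsigned long userptr;
        std::int32_t  fd;
    } m;
    std::uint32_t data_offset;
    std::uint32_t reserved[11];
};

// Hypothetical copy of the driver (kernel) layout: reserved[64].
struct PlaneDriver {
    std::uint32_t bytesused;
    std::uint32_t length;
    union {
        std::uint32_t mem_offset;
        unsigned long userptr;
        std::int32_t  fd;
    } m;
    std::uint32_t data_offset;
    std::uint32_t reserved[64];
};

// Both layouts agree up to the start of the reserved array...
static_assert(offsetof(PlaneSdk, reserved) == offsetof(PlaneDriver, reserved),
              "fields before reserved[] must line up");

// ...so zeroing the driver-sized array through a pointer to the
// SDK-sized struct writes this many bytes past its end.
constexpr std::size_t reserved_overflow_bytes()
{
    return sizeof(PlaneDriver::reserved) - sizeof(PlaneSdk::reserved);
}

static_assert(sizeof(PlaneDriver::reserved) == 256, "64 * 4 bytes");
static_assert(reserved_overflow_bytes() == 212, "256 - 44 bytes");
```

Those 212 extra bytes land on whatever follows the plane member in the enclosing object, which in the Imager class is the pointer.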
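At this point, it became evident that the buffer overflow was caused by mismatched headers between userspace and the kernel: the driver zeroed a 64-element reserved array, while the application had only set aside room for 11 elements, so the write spilled into the adjacent pointer. Fortunately, after a somewhat challenging journey, this issue was easily rectified by a small update.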
It’s a Wrap!
Undoubtedly, it was satisfying to have the final issue resolved before calling it a wrap for 2023 and heading off to a well-deserved vacation. I look forward to embracing new challenges in the upcoming year. Being prepared and motivated, I am ready to tackle whatever comes my way, eager to continue growing and advancing in my endeavors.