A debugger from scratch — part 1

Using ptrace to set a breakpoint in an executable

Have you ever wondered how debuggers work? What happens when you set a breakpoint? How does the debugger control the flow of your program, or change values in variables? Let’s find out by writing a basic debugger in Go!

In part 1 we’ll start by using the ptrace system call to get control of a target program.

The ptrace system call

There’s a very powerful system call called ptrace that lets one process inspect and modify the memory and register states of another process. Here’s an extract from the man page:

The ptrace() system call provides a means by which one process (the
"tracer") may observe and control the execution of another process
(the "tracee"), and examine and change the tracee's memory and
registers. It is primarily used to implement breakpoint debugging
and system call tracing.

I’ve previously shown how to use ptrace() to trace system calls; now let’s look at how it’s used to implement a breakpoint debugger.

Although ptrace() is a single system call, it performs many different actions depending on the request parameter that’s passed to it. The Go syscall package includes a whole set of Ptrace* functions, which give an idea of the range of things you can do with it:

func PtraceAttach(pid int) (err error)
func PtraceCont(pid int, signal int) (err error)
func PtraceDetach(pid int) (err error)
func PtraceGetEventMsg(pid int) (msg uint, err error)
func PtraceGetRegs(pid int, regsout *PtraceRegs) (err error)
func PtracePeekData(pid int, addr uintptr, out []byte) (count int, err error)
func PtracePeekText(pid int, addr uintptr, out []byte) (count int, err error)
func PtracePokeData(pid int, addr uintptr, data []byte) (count int, err error)
func PtracePokeText(pid int, addr uintptr, data []byte) (count int, err error)
func PtraceSetOptions(pid int, options int) (err error)
func PtraceSetRegs(pid int, regs *PtraceRegs) (err error)
func PtraceSingleStep(pid int) (err error)
func PtraceSyscall(pid int, signal int) (err error)

What do these functions do? We can figure out quite a few things without even looking at the docs.

They all take a process ID as their first parameter — in our context, that’s the target process that we’re going to debug. We can surmise that PtraceAttach() is how we can start examining that target process, and PtraceDetach() is how we let go of that target.

My Commodore 64 was probably responsible for a lot of my subsequent career choices

I remember “peeking” and “poking” data back when I had a Commodore 64 as a kid — it’s reading or writing values directly into memory. So those are the functions to use if we want to manipulate information in memory.

PtraceGetRegs() and PtraceSetRegs() let us look at or set CPU register values. (If you’re not familiar with registers, don’t worry — we’ll get to that.)

And if we want the target process to run its executable, we have a few options:

  • PtraceCont() tells the stopped target process to resume execution
  • PtraceSingleStep() lets it run only the next machine code instruction before stopping again
  • PtraceSyscall() tells it to resume and keep running until the next system call. (We won’t use this for now in the debugger, but this is the key to doing system call tracing with ptrace.)
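
To make the three options concrete, here is a minimal sketch that wraps them in one helper. The resumeMode type and the resume function are names I’ve made up for illustration; they are not part of the syscall package. The code assumes Linux.

```go
package main

import (
	"fmt"
	"syscall"
)

// resumeMode is a hypothetical type for this sketch; it is not part of syscall.
type resumeMode int

const (
	runFree     resumeMode = iota // run until the next signal, breakpoint or exit
	stepOne                       // run exactly one machine code instruction
	nextSyscall                   // run until the next system call
)

// resume restarts a stopped tracee in the requested mode.
// It has to be called from the OS thread that is tracing pid.
func resume(pid int, mode resumeMode) error {
	switch mode {
	case runFree:
		return syscall.PtraceCont(pid, 0) // 0: don't deliver a signal to the target
	case stepOne:
		return syscall.PtraceSingleStep(pid)
	case nextSyscall:
		return syscall.PtraceSyscall(pid, 0)
	default:
		return fmt.Errorf("unknown resume mode %d", mode)
	}
}

func main() {
	// There is no tracee in this standalone sketch, so only the
	// error path for an unsupported mode is exercised here.
	if err := resume(-1, resumeMode(42)); err != nil {
		fmt.Println("error:", err)
	}
}
```

In a real debugger the pid would be the stopped target, and the mode would come from whatever command the user typed.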

Execute a process with ptrace enabled

Let’s say we have an executable we want to debug called “hello”. We can start a process and run our target process like this:

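cmd := exec.Command("hello")
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
// Ask for the child process to be started under ptrace control
cmd.SysProcAttr = &syscall.SysProcAttr{Ptrace: true}
cmd.Start()
cmd.Wait() // returns once the target stops with SIGTRAP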

Passing Ptrace: true in the SysProcAttr structure indicates that we’d like to have ptrace control over this target. When the kernel starts the process for this target, which happens during cmd.Start(), we get a ptrace attachment to it straight away.

We call cmd.Wait() to wait for the target process to change state. It returns almost immediately, because the kernel stops a traced process with SIGTRAP, the breakpoint trap signal, as soon as it starts executing the target program. If you were to print out the error that cmd.Wait() returns, this is what you’d see:

stop signal: trace/breakpoint trap

At this point the target process execution is stopped, waiting for us to allow it to continue running with one of PtraceCont(), PtraceSingleStep() or PtraceSyscall().
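
Putting the launch steps together with error handling might look like the sketch below. A few things here are my own assumptions rather than anything from the snippets above: the launchTraced name, the "./hello" placeholder path, and the choice to call syscall.Wait4() directly instead of cmd.Wait() so that the raw wait status can be inspected. The sketch also pins the goroutine to its OS thread with runtime.LockOSThread(), because on Linux ptrace requests for a tracee have to come from the thread that is tracing it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"runtime"
	"syscall"
)

// launchTraced starts path under ptrace and waits for its initial stop.
// The name and signature are invented for this sketch. The caller should
// already have locked the goroutine to its OS thread.
func launchTraced(path string) (int, error) {
	cmd := exec.Command(path)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{Ptrace: true}
	if err := cmd.Start(); err != nil {
		return 0, fmt.Errorf("start %s: %w", path, err)
	}
	pid := cmd.Process.Pid

	// Wait for the target to stop, keeping the raw status for inspection.
	var ws syscall.WaitStatus
	if _, err := syscall.Wait4(pid, &ws, 0, nil); err != nil {
		return 0, fmt.Errorf("wait for %d: %w", pid, err)
	}
	if !ws.Stopped() || ws.StopSignal() != syscall.SIGTRAP {
		return 0, fmt.Errorf("expected a SIGTRAP stop, got status %#x", uint32(ws))
	}
	return pid, nil
}

func main() {
	// Keep every ptrace call on the same OS thread.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	pid, err := launchTraced("./hello") // placeholder path to the target
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Println("target is stopped, pid", pid)

	// Let the target run to completion and collect its exit.
	if err := syscall.PtraceCont(pid, 0); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	syscall.Wait4(pid, nil, 0, nil)
}
```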

We can easily get the process ID for the target:

pid := cmd.Process.Pid

If we want to allow the target process to run to completion, that’s easy:

syscall.PtraceCont(pid, 0)

But it’s not very interesting to let it run to completion; instead we want to stop execution at a particular point in the program.

Setting a breakpoint

SIGTRAP gets generated when the CPU encounters INT 3

When you’re running an executable there’s a CPU register called the Program Counter (aka the Instruction Pointer) which contains the address of the next instruction to run.

As each instruction runs, the Program Counter is updated to point to the next instruction. If the CPU finds the instruction code 0xCC, it raises a breakpoint exception; the kernel then stops the process and reports a SIGTRAP signal to the debugger that is tracing it.

Setting a breakpoint simply involves writing 0xCC (aka INT 3) into the address where we want to stop.

syscall.PtracePokeData(pid, breakAddress, []byte{0xCC})
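
Writing 0xCC overwrites whatever instruction byte was at that address, so a debugger normally remembers the original byte in order to put it back later. Here is a minimal sketch of that idea. The setBreakpoint and clearBreakpoint helpers are hypothetical names of mine, and 0x401000 is an arbitrary address used only to exercise the error path.

```go
package main

import (
	"fmt"
	"syscall"
)

const int3 = 0xCC // x86 opcode for INT 3

// setBreakpoint writes INT 3 at addr in the tracee's memory and returns
// the byte it replaced. Hypothetical helper for this sketch.
func setBreakpoint(pid int, addr uintptr) (byte, error) {
	orig := make([]byte, 1)
	if _, err := syscall.PtracePeekData(pid, addr, orig); err != nil {
		return 0, fmt.Errorf("peek %#x: %w", addr, err)
	}
	if _, err := syscall.PtracePokeData(pid, addr, []byte{int3}); err != nil {
		return 0, fmt.Errorf("poke %#x: %w", addr, err)
	}
	return orig[0], nil
}

// clearBreakpoint restores the byte that setBreakpoint saved.
func clearBreakpoint(pid int, addr uintptr, orig byte) error {
	_, err := syscall.PtracePokeData(pid, addr, []byte{orig})
	return err
}

func main() {
	// No tracee exists in this standalone sketch, so the call fails.
	// With a real stopped target, pid and addr would come from the debugger.
	if _, err := setBreakpoint(-1, 0x401000); err != nil {
		fmt.Println("error:", err)
	}
}
```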

Waiting for the breakpoint

We set the breakpoint before calling PtraceCont(), which starts the target running again. The debugger then needs to wait for the target to reach the breakpoint. The Wait4() function does this: it blocks until the target changes state, for example by stopping on a signal.

var ws syscall.WaitStatus
syscall.Wait4(pid, &ws, 0, nil) // options 0: block until the target changes state
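
Wait4() returns for several reasons, not only breakpoints: the target may have exited, or stopped on some other signal. A sketch of how the returned status could be sorted out is below. The isTrapStop and waitForStop names are mine, and note that a SIGTRAP stop is also what you get after a single step, so it doesn’t prove a breakpoint was hit.

```go
package main

import (
	"fmt"
	"syscall"
)

// isTrapStop reports whether a wait status describes a stop caused by SIGTRAP.
func isTrapStop(ws syscall.WaitStatus) bool {
	return ws.Stopped() && ws.StopSignal() == syscall.SIGTRAP
}

// waitForStop blocks until pid changes state and describes what happened.
// Hypothetical helper for this sketch.
func waitForStop(pid int) (string, error) {
	var ws syscall.WaitStatus
	if _, err := syscall.Wait4(pid, &ws, 0, nil); err != nil {
		return "", err
	}
	switch {
	case isTrapStop(ws):
		return "stopped on SIGTRAP", nil
	case ws.Stopped():
		return fmt.Sprintf("stopped on signal %d", ws.StopSignal()), nil
	case ws.Exited():
		return fmt.Sprintf("exited with status %d", ws.ExitStatus()), nil
	case ws.Signaled():
		return fmt.Sprintf("killed by signal %d", ws.Signal()), nil
	default:
		return fmt.Sprintf("unrecognised status %#x", uint32(ws)), nil
	}
}

func main() {
	// 0x057f is how Linux encodes "stopped by signal 5 (SIGTRAP)"
	// in a wait status; it stands in for a real Wait4 result here.
	fmt.Println(isTrapStop(0x057f))
}
```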

When this returns, we can look at the state of the CPU registers in the target process:

var regs syscall.PtraceRegs
syscall.PtraceGetRegs(pid, &regs)

We could also read or write the target process’s memory with PtracePeekData() and PtracePokeData(), and we can even set registers with syscall.PtraceSetRegs().
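
As a small taste of what the registers tell us: on x86, when the target stops on INT 3 the Program Counter already points one byte past the 0xCC. The sketch below reads the registers and works back to the breakpoint address. The breakpointAddr and stoppedAt helpers are hypothetical names of mine, and the one-byte adjustment is specific to x86.

```go
package main

import (
	"fmt"
	"syscall"
)

// breakpointAddr returns the address of the INT 3 byte that caused a stop,
// given the program counter read after the stop. INT 3 is one byte long
// on x86, so the program counter is one past it.
func breakpointAddr(pc uint64) uint64 {
	return pc - 1
}

// stoppedAt reads the tracee's registers and returns the address of the
// breakpoint it just hit. Hypothetical helper for this sketch.
func stoppedAt(pid int) (uint64, error) {
	var regs syscall.PtraceRegs
	if err := syscall.PtraceGetRegs(pid, &regs); err != nil {
		return 0, fmt.Errorf("get registers for %d: %w", pid, err)
	}
	// PC() returns the instruction pointer (Rip on amd64).
	return breakpointAddr(regs.PC()), nil
}

func main() {
	// No tracee exists in this standalone sketch, so the call fails.
	if _, err := stoppedAt(-1); err != nil {
		fmt.Println("error:", err)
	}
}
```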

But before we do any of that, there’s an open question: how do you know which address to write the 0xCC byte to when you want to set a breakpoint? I’ll answer this in Part 2 of this story, and in future parts we’ll look at interesting things we can find out from the contents of the CPU registers.

If you can’t wait to find out, all is revealed in the video included below, and in the accompanying repo on GitHub.


This series of posts is based on a talk that I first gave at dotGo Paris and then extended for GopherCon UK last week. Here’s the video of the former, and hopefully the videos from GopherCon will be out in time for part 2!

Notes

  • This series assumes an x86 processor. Other processors will behave in a similar way but the registers, interrupt codes and so on will likely be different.