Extracting Ghidra Decompiler Output with Python

James Sebree
Tenable TechBlog
Published in
4 min readJul 28, 2022

Ghidra’s decompiler, while not perfect, is pretty darn handy. Ghidra’s user interface, however, leaves a lot to be desired. I often find myself wishing there was a way to extract all the decompiler output to be able to explore it a bit easier in a text editor or at least run other tools against it.

At the time of this writing, there is no built-in functionality to export decompiler output from Ghidra. There are a handful of community made scripts available that get the job done (such as Haruspex and ExportToX64dbg), but none of these tools are as flexible as I’d like. For one, Ghidra’s scripting interface is not the easiest to work with. And two, resorting to Java or the limitations of Jython just doesn’t cut it. Essentially, I want to be able to access Ghidra’s scripting engine and API while retaining the power and flexibility of a local, fully-featured Python3 environment.

This blog will walk you through setting up a Ghidra to Python bridge and running an example script to export Ghidra’s decompiler output.

Prepping Ghidra

First and foremost, make sure you have a working installation of Ghidra on your system. Official downloads can be obtained from https://ghidra-sre.org/.

Next, you’ll want to download and install the Ghidra to Python Bridge. Steps for setting up the bridge are demonstrated below, but it is recommended to follow the official installation guide in the event that the Ghidra Bridge project changes over time and breaks these instructions.

The Ghidra to Python bridge is a local Python RPC proxy that allows you to access Ghidra objects from outside the application. A word of caution here: Using this bridge is essentially allowing arbitrary code execution on your machine. Be sure to shutdown the bridge when not in use.

In your preferred python environment, install the ghidra bridge:

$ pip install ghidra_bridge

Create a directory on your system to store Ghidra scripts in. In this example, we’ll create and use “~/ghidra_scripts.”

$ mkdir ~/ghidra_scripts

Launch Ghidra and create a new project. Create a Code Browser window (click the dragon icon in the tool chest bar) and open the Script Manager window. This can be opened by selecting “Window > Script Manager.” Press the “Manage Script Directories” in the Script Manager’s toolbar.

In the window that pops up, add and enable “$USER_HOME/ghidra_scripts” to the list of script directories.

Back in your terminal or python environment, run the Ghidra Bridge installation process.

$ python -m ghidra_bridge.install_server ~/ghidra_scripts

This will automatically copy over the scripts necessary for your system to run the Ghidra Bridge.

Finally, back in Ghidra, click the “Refresh Script List” button in the toolbar and filter the results to “bridge.”. Check the boxes next to “In Toolbar” for the Server Start and Server Shutdown scripts as pictured below. This will allow you to access the bridge’s start/stop commands from the Tools menu item.

Go ahead and start the bridge by selecting “Run in Background.” If all goes according to plan, you should see monitor output in the console window at the bottom of the window similar to the following:

Using the Ghidra Bridge

Now that you’ve got the full power and flexibility of Python, let’s put it to some good use. As mentioned earlier, the example use-case being provided in this blog is the export of Ghidra’s decompiler output.

Source code for this example is available here: https://github.com/tenable/ghidra_tools/tree/main/extract_decomps

We’ll be using an extremely simple application to demonstrate this script’s functionality, which is available in the “example” folder of the “extract_decomps” directory. All the application does is grab some input from the user and say hello.

Build and run the test application.

$ gcc test.c
$ ./a.out
What is your name?
# dino
Hello, dino!

Import the test binary into Ghidra and run an auto-analysis on it. Once complete, simply run the extraction script.

$ python extract.py
INFO:root:Program Name: a.out
INFO:root:Creation Date: Tue Jul 26 13:51:21 EDT 2022
INFO:root:Language ID: AARCH64:LE:64:AppleSilicon
INFO:root:Compiler Spec ID: default
INFO:root:Using 'a.out_extraction' as output directory…
INFO:root:Extracting decompiled functions…
INFO:root:Extracted 7 out of 7 functions
$ tree a.out_extraction
a.out_extraction
├── ___stack_chk_fail@100003f6c.c
├── ___stack_chk_fail@10000c000.c
├── _printf@100003f78.c
├── _printf@10000c010.c
├── _scanf@100003f84.c
├── _scanf@10000c018.c
└── entry@100003edc.c

From here, you’re free to browse the source code in the text editor or IDE of your choice and run any other tools you see fit against this output. Please keep in mind, however, that the decompiler output from Ghidra is intended as pseudo code and won’t necessarily conform to the syntax expected by many static analysis tools.

--

--