De-Obfuscating GoLang Functions

Jason Reaves
Jan 12 · 10 min read

GoLang[8] presents an excellent opportunity for malware developers to create tools that challenge malware researchers and reverse engineers while also being able, out of the box, to cross-compile for various architectures.

A number of people have since released tools to help with statically reverse engineering these binaries, and in turn other tools have been developed to demonstrate how to defeat those techniques through various obfuscation methods. Most of the current tooling for statically reversing GoLang binaries relies on the existence of the .gopclntab structure, which is present even in stripped binaries and lists the function names and their offsets. You can therefore enumerate this structure and rename all the stripped functions with their standard library names. The function names it holds are not needed for the program to run, however, so they can be zeroed out, or the structure can be enumerated by another process that mangles the names to cause extra confusion. In fact there are two existing methods for obfuscating GoLang compiled binaries: one manipulates the source code before it is compiled, and the other enumerates the gopclntab structure of an already compiled binary and manipulates it.

My initial idea for de-obfuscating GoLang binaries is to build a system that matches standard library functions based on their bytecode sequences. This would let you identify what the functions really are regardless of their names, and could eventually lead to rebuilding a gopclntab structure if needed. To do this we need a way to catalog standard library functions by GoLang version and architecture. To prove out the idea we can focus on a small set of targets to begin with, in this case ‘windows_386’ and ‘windows_amd64’.

Cataloging Standard Functions

What we need to do is compile the entire GoLang standard library into object files, parse those object files, and store the relevant data. You can force GoLang to build the standard library for a given operating system and architecture:

env GOOS=<os> GOARCH=<arch> go install -v std
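To repeat this for several targets, the command can be driven from a script. The following is only a minimal sketch and not the tooling used for this post: the helper names `std_build_invocation`, `std_pkg_dir` and `build_std` are hypothetical, and the `pkg/<os>_<arch>` output directory is an assumption that holds for older toolchains such as the Go 1.10 release used later in this post.

```python
import os
import subprocess


def std_build_invocation(goos, goarch, base_env=None):
    """Return (argv, env) for building the standard library for one target."""
    env = dict(os.environ if base_env is None else base_env)
    env["GOOS"] = goos
    env["GOARCH"] = goarch
    return ["go", "install", "-v", "std"], env


def std_pkg_dir(goroot, goos, goarch):
    """Assumed location of the compiled package archives for a target."""
    return os.path.join(goroot, "pkg", "{}_{}".format(goos, goarch))


def build_std(targets):
    """Run the build for each (goos, goarch) pair; needs `go` on the PATH."""
    for goos, goarch in targets:
        argv, env = std_build_invocation(goos, goarch)
        subprocess.run(argv, env=env, check=True)
```

Calling `build_std([("windows", "386"), ("windows", "amd64")])` would cover the two targets this post focuses on.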

To parse the object files we just need to know their structure, and luckily it is well documented. GoLang object files are basically lists of SymID structures, and the structures involved are described in the standard GoLang documentation[10,11]. We can also inspect them ourselves by building some simple GoLang programs that print the data, based on code using the goobj package[5] from the Go source tree (cmd/internal/goobj).

f,_ := os.Open(*file);

Output snippet:

goobj.SymID{Name:"type..importpath.math.", Version: 0}, goobj.SymID{Name:"type..importpath.os.", Version: 0}, goobj.SymID{Name:"type..importpath.reflect.", Version: 0}, goobj.SymID{Name:"type..importpath.strconv.", Version: 0}, goobj.SymID{Name:"type..importpath.sync.", Version: 0}, goobj.SymID{Name:"type..importpath.unicode/utf8.", Version: 0}}, Syms:[]*goobj.Sym{(*goobj.Sym)( 0xc420112000), (*goobj.Sym)( 0xc420112070), (*goobj.Sym)( 0xc420112150), (*goobj.Sym)( 0xc4201121c0), (*goobj.Sym)( 0xc420112230), (*goobj.Sym)( 0xc4201122a0), (*goobj.Sym)( 0xc420112310), (*goobj.Sym)( 0xc420112380), (*goobj.Sym)

Inside the structure returned is a list of Sym objects which are structures that we can access:

type Sym struct {

To retrieve the functions we need to loop through this list and store the name which is inside the substructure SymID.

type SymID struct {

We also need the function body itself, which can be located through the substructure Data, as it contains the offset and size of the function.

type Data struct {

Now we can enumerate these object files and store the relevant data, such as the entire function and its name. We also have the advantage that, because the object files have not been linked yet, memory addresses and offsets inside each function’s bytecode are still null bytes. Next we need to be able to parse GoLang compiled binaries to enumerate their functions.
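The cataloging step can be sketched in Python as follows. This is an illustration under assumptions rather than the code behind this post: it presumes that a small Go helper built on goobj has already dumped each function's name, offset and size, and both `FuncRecord` and `build_catalog` are hypothetical names.

```python
import collections

# Hypothetical record for one function symbol from a parsed object file:
# its name plus the offset and size reported by the Data substructure.
FuncRecord = collections.namedtuple("FuncRecord", ["name", "offset", "size"])


def build_catalog(obj_path, records):
    """Map function name -> raw bytes by slicing the object file."""
    with open(obj_path, "rb") as fh:
        blob = fh.read()
    catalog = {}
    for rec in records:
        body = blob[rec.offset:rec.offset + rec.size]
        # Skip empty symbols and records that point past the end of the file.
        if rec.size > 0 and len(body) == rec.size:
            catalog[rec.name] = body
    return catalog
```

Keeping one such catalog per GoLang version and target (for example ‘windows_386’) keeps lookups for a given sample small.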

Parsing Compiled Binaries

Parsing compiled GoLang binaries is straightforward, and the process is already documented along with example code from tools designed to obfuscate the table[4] and to parse it for static analysis[3]. The major drawback here is that the function length is not contained in the structure we will be processing, so we will have to account for that in our process.

lookup = binascii.unhexlify("FBFFFFFF0000")

Using the code we modified from an existing script[3] we can find the gopclntab structure. After finding it we will need to parse the structure[4]:

struct gopclntab {

After parsing each function entry we store the first 200 bytes starting at the function address in order to have a decent sample. Since we have the exact function length from the compiled standard library objects, we can use that size to decide where to stop scanning for a match. This number of bytes, together with a focus on a few libraries, should give us enough evidence to judge whether the proposed method is effective.
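A rough sketch of locating the table, walking its function entries and taking the 200-byte samples is shown below. It assumes the table layout used by the GoLang versions discussed here (magic, two zero bytes, instruction quantum, pointer size, function count, then pairs of entry address and offset); newer GoLang releases use a different layout. The function names are hypothetical, and `va_to_offset` stands in for whatever PE section mapping the caller supplies.

```python
import struct

MAGIC = bytes.fromhex("FBFFFFFF0000")  # same byte pattern as the lookup above
SAMPLE_LEN = 200


def find_pclntab(data):
    """Return the offset of the first plausible table header, or -1."""
    pos = data.find(MAGIC)
    while pos != -1:
        if pos + 8 <= len(data):
            quantum, ptrsize = data[pos + 6], data[pos + 7]
            if quantum in (1, 2, 4) and ptrsize in (4, 8):
                return pos
        pos = data.find(MAGIC, pos + 1)
    return -1


def iter_functions(data, tab):
    """Yield (name, entry_address) for each entry in the function table."""
    ptrsize = data[tab + 7]
    one, pair = ("<I", "<II") if ptrsize == 4 else ("<Q", "<QQ")
    (count,) = struct.unpack_from(one, data, tab + 8)
    base = tab + 8 + ptrsize
    for i in range(count):
        entry, funcoff = struct.unpack_from(pair, data, base + i * 2 * ptrsize)
        # The per-function record starts with the entry address, then a
        # 32-bit offset (relative to the table start) of the function name.
        (nameoff,) = struct.unpack_from("<i", data, tab + funcoff + ptrsize)
        end = data.index(b"\x00", tab + nameoff)
        yield data[tab + nameoff:end].decode("utf-8", "replace"), entry


def sample_functions(data, va_to_offset):
    """Map function name -> first SAMPLE_LEN bytes at its file offset."""
    samples = {}
    tab = find_pclntab(data)
    if tab == -1:
        return samples
    for name, entry in iter_functions(data, tab):
        off = va_to_offset(entry)  # caller-supplied address translation
        if off is not None:
            samples[name] = data[off:off + SAMPLE_LEN]
    return samples
```

Because the table does not store function lengths, the sample may run past the end of a short function; the library-side length is what bounds the comparison later.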

Compare Functions

To compare the bytecode we walk both sequences up to the length of the shorter one, measuring how close the function from the binary is to each function in our database. The measure is a simple count of mismatched bytes, ‘miss_bytes’. For our initial testing, however, we set the partial-match logic aside and look only for exact matches, checking every byte while skipping null bytes. The reasoning behind skipping them is that null bytes in the bytecode from the compiled standard library objects are possible locations for memory offsets that get filled in later.

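def compare_bytecode(s1, s2):
    # s1: bytes of the library function, s2: bytes sampled from the binary
    good_bytes = 0
    checked_bytes = 0
    for i in range(min(len(s1), len(s2))):
        # Null bytes in the library copy may be unfilled offsets, so skip them
        if s1[i:i+1] != b'\x00':
            checked_bytes += 1
            if s1[i] == s2[i]:
                good_bytes += 1
    if good_bytes == 0 or checked_bytes < 10:
        # Too little evidence to call this a match
        p = 0
        miss_bytes = 100
    else:
        miss_bytes = checked_bytes - good_bytes
        # Percentage of checked bytes that matched
        p = int(float(good_bytes) / checked_bytes * 100)
    return (miss_bytes, p)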

Using this function we loop over all the functions in a GoLang binary, compare each one against every function in our database, build a list of these comparisons, and sort it. This would allow for finding the closest match, but for the purposes of this blog we simply look for exact matches by checking whether the returned miss_bytes count is 0.
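The loop described above might look like the following sketch. It assumes that both the samples from the binary and the library catalog are dicts mapping a function name to a `bytes` object. `count_misses`, `rank_matches` and `exact_matches` are hypothetical names, and `count_misses` is a simplified stand-in for the comparison step that returns `None` when too few bytes were checked.

```python
def count_misses(lib_bytes, bin_bytes, min_checked=10):
    """Count mismatches over the non-null bytes of the library function."""
    checked = misses = 0
    for lib_b, bin_b in zip(lib_bytes, bin_bytes):
        if lib_b == 0:  # possible unfilled offset, skip it
            continue
        checked += 1
        if lib_b != bin_b:
            misses += 1
    return None if checked < min_checked else misses


def rank_matches(bin_funcs, catalog):
    """For each binary function, list (misses, library_name), best first."""
    ranked = {}
    for bin_name, bin_bytes in bin_funcs.items():
        scored = []
        for lib_name, lib_bytes in catalog.items():
            misses = count_misses(lib_bytes, bin_bytes)
            if misses is not None:
                scored.append((misses, lib_name))
        ranked[bin_name] = sorted(scored)
    return ranked


def exact_matches(ranked):
    """Keep only the library names whose miss count is 0."""
    return {
        bin_name: [lib_name for misses, lib_name in scored if misses == 0]
        for bin_name, scored in ranked.items()
    }
```

Sorting by miss count keeps the door open for closest-match logic later, while `exact_matches` mirrors the stricter check used for the results in this post. Comparing every function against every catalog entry is quadratic, which is acceptable for a single package but becomes slow as the catalog grows.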

Initial Results

For the purpose of proof-of-concept testing we focus on the functions from the standard library package ‘fmt’[12]. We then perform three categories of checks: a check against an unobfuscated sample compiled on the same system used to harvest the library object bytecode sequences, a check against an obfuscated sample, and checks against samples compiled on other systems but using the same version of GoLang.

The first test worked flawlessly which is good news but should be expected since the sample is from the same system as the compiled library opcodes.

Found match for :fmt.newPrinter with library function: fmt.newPrinter

The next test is purely demonstrative as the sample is simply now obfuscated by mangling function names. It is informative however because this is the ultimate goal of de-obfuscating the function names.

Found match for :wSvJxDLJ9eeZ17 with library function: fmt.(*pp).free

For the next test I use a sample downloaded from VirusTotal that reports the same version of GoLang as the one on my system but was compiled elsewhere.

Found match for :fmt.(*fmt).fmtS with library function: fmt.(*fmt).fmt_s

This single example shows that it is possible to match the functions, but we need to test more samples to verify. Our sample set consisted of four more 32-bit GoLang 1.10 compiled binaries harvested from VirusTotal.

Focusing purely on the 32-bit ‘fmt’ package, our test fully matched the major functions in the sample set without needing to use partial match logic.

Sample SHA256: 970ccf5abed19b5499afd57864709817b8f8d44350a7d38220776121313b8cbc

Found match for :fmt.(*fmt).truncate with library function: fmt.(*fmt).truncate

Sample SHA256: dc3b0de17ae3d83763025174fd152aeea0b614bd4316af30510e842dca806ebe

Found match for :fmt.(*buffer).Write with library function: fmt.(*buffer).Write

Sample SHA256: ed3ce1b2022d702f837c255283df13a7396b2ba5db0fecd5a81da83e0c56c5e9

Found match for :fmt.newPrinter with library function: fmt.newPrinter

Sample SHA256: fec50384f57656e47844403a38b77f9ac27e99c5662757e96f500e1de6315c5a

Found match for :fmt.(*fmt).fmtBoolean with library function: fmt.(*fmt).fmt_boolean

Testing on 64-bit Windows samples was equally successful but required cataloging 64-bit functions from the standard library.

Sample SHA256: 58f40eb3f9c7b8ccbb6482b58ab9eb4c84a3f50e19c50e22aac6593e56144779

Found match for :fmt.(*fmt).fmt_q with library function: fmt.(*fmt).fmt_q

Sample SHA256: e1eb18ae92fe10d4cb1e1df2db65b2fe0e08e282746f2b5365522cfa52460328

Found match for :fmt.(*pp).catchPanic with library function: fmt.(*pp).catchPanic

The initial results suggest that de-obfuscating GoLang binaries is possible, but there are some problems to overcome as well. The biggest problem encountered so far is the number of functions you would need to catalog for each architecture and GoLang version you are interested in, which will cause speed issues. A possible avenue for overcoming this problem would be to catalog the functions into YARA[13] rules so they can be used to speed up the process. This approach would also make rebuilding the gopclntab structure easier, since it would no longer rely on that structure to enumerate the functions in the binary.
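As a sketch of what cataloging into YARA rules could look like, the function below renders one cataloged function as a rule whose hex string uses `??` wildcards where the library bytecode has null bytes. This is an untested idea rather than something evaluated in this post: `to_yara_rule` is a hypothetical helper, the 200-byte cap simply mirrors the sample size used earlier, and the generated rules would still need to be validated with YARA itself.

```python
import re


def to_yara_rule(func_name, lib_bytes, max_len=200):
    """Render a cataloged function as YARA rule text, or None if unusable."""
    tokens = ["??" if b == 0 else "{:02X}".format(b) for b in lib_bytes[:max_len]]
    # Trailing wildcards add nothing to the match, so drop them.
    while tokens and tokens[-1] == "??":
        tokens.pop()
    if not tokens:
        return None
    # Rule identifiers may only contain letters, digits and underscores.
    ident = "go_" + re.sub(r"[^0-9A-Za-z_]", "_", func_name)
    safe_name = func_name.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "rule " + ident,
        "{",
        "    meta:",
        '        function = "' + safe_name + '"',
        "    strings:",
        "        $body = { " + " ".join(tokens) + " }",
        "    condition:",
        "        $body",
        "}",
    ]
    return "\n".join(lines) + "\n"
```

Two different function names can collapse to the same identifier after sanitizing, so a real generator would need to make identifiers unique, for example by appending a counter. The original name is kept in the `meta` section so a match can be mapped back to the library function.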

References

1: STRAZZERE, Tim. Bsides GO Forth and Reverse, 2017, github.com/strazzere/golang_loader_assist/blob/master/Bsides-GO-Forth-And-Reverse.pdf.

2: Sibears. “Sibears/IDAGolangHelper.” GitHub, github.com/sibears/IDAGolangHelper.

3: Sysopfb. “Sysopfb/GoMang.” GitHub, github.com/sysopfb/GoMang.

4: “Golang Internals, Part 3: The Linker, Object Files, and Relocations.” Altoros, 27 Nov. 2019, www.altoros.com/blog/golang-internals-part-3-the-linker-object-files-and-relocations/.

5: S-Matyukevich. “s-Matyukevich/goobj_explorer.” GitHub, github.com/s-matyukevich/goobj_explorer.

6: Unixpickle. “Unixpickle/Gobfuscate.” GitHub, github.com/unixpickle/gobfuscate.

7: “Managing Go Installations.” Go, golang.org/doc/manage-install.

8: “Go Is an Open Source Programming Language That Makes It Easy to Build Simple, Reliable, and Efficient Software.” Go, golang.org/.

9: “SmartAssembly 6.” Obfuscating Code with Name Mangling — SmartAssembly 6 — Product Documentation, documentation.red-gate.com/sa6/obfuscating-your-code-with-smartassembly/obfuscating-code-with-name-mangling.

10: “Package Goobj.” Go, golang.org/pkg/cmd/internal/goobj/#Sym.

11: “Package Goobj.” Go, golang.org/pkg/cmd/internal/goobj/#SymID.

12: “Package Fmt.” Go, golang.org/pkg/fmt/.

13: YARA — The Pattern Matching Swiss Knife for Malware Researchers, virustotal.github.io/yara/.

Walmart Global Tech Blog

We’re powering the next great retail disruption.


Written by Jason Reaves

Malware Researcher, Crimeware Threat Intel, Reverse Engineer @Walmart
