Fun building shared libraries in Go

Tony Walker
6 min read · Jan 15, 2018


(Image: a classic from the talented Ashley McNamara)

I was working on a problem recently when I thought to myself, “what if I could cause read operations on file X to actually come from file Y?”

I’d seen examples of overriding system calls using C and LD_PRELOAD before and, ever the keen Gopher, I wondered whether I could do something similar in Go. As luck would have it, I found that as of version 1.5, the new ‘buildmode’ option enables the creation of C shared libraries! I hadn’t used cgo at all before, but with a little Googling I was able to establish the basic building blocks for creating ‘custom’ functions. This post will describe my journey to working code.

The basics

My goal was simple: I wanted a command like ‘cat foo’ to actually return the result as if I had run ‘cat bar’. Before I could do this, I had to determine which system call needed to be overridden. I was already pretty sure, but for the sake of this post, let’s take a look at how we can easily do this, starting with the ever-useful strace.

First, some sample files:

$ echo foo > foo
$ echo bar > bar

Now we run strace:

$ strace cat foo 2>&1 | grep foo
execve("/bin/cat", ["cat", "foo"], [/* 60 vars */]) = 0
open("foo", O_RDONLY) = 3
read(3, "foo\n", 131072) = 4
write(1, "foo\n", 4foo

OK, so we can see that cat uses the ‘open’ syscall which, according to the man page, has the following signature:

int open(const char *pathname, int flags)

So, I would need to write my own open function which examines the path, modifies it if appropriate and then returns an integer which is either the file descriptor or an error (-1). For the sake of testing, I would use environment variables to set the ‘from’ and ‘to’ path names.
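The redirection rule itself can be kept separate from any cgo plumbing. Here is a minimal sketch in plain Go, assuming the ‘from’ and ‘to’ paths live in the LD_FROM and LD_BAR environment variables used in the test run later in this post. The helper names rewritePath and rewriteFromEnv are hypothetical, and the exact-string match is an assumption about the intended behaviour rather than a description of the original code.

```go
package main

import (
	"fmt"
	"os"
)

// rewritePath returns to when path is exactly from; otherwise it returns
// path unchanged. An empty from or to disables the redirection.
// (Hypothetical helper: exact matching is an assumption.)
func rewritePath(path, from, to string) string {
	if from != "" && to != "" && path == from {
		return to
	}
	return path
}

// rewriteFromEnv applies rewritePath using the LD_FROM and LD_BAR
// environment variables (names taken from the test run in this post).
func rewriteFromEnv(path string) string {
	return rewritePath(path, os.Getenv("LD_FROM"), os.Getenv("LD_BAR"))
}

func main() {
	os.Setenv("LD_FROM", "foo")
	os.Setenv("LD_BAR", "bar")

	fmt.Println(rewriteFromEnv("foo")) // prints "bar": the path is redirected
	fmt.Println(rewriteFromEnv("baz")) // prints "baz": other paths are left alone
}
```

An exported ‘open’ would call something like rewriteFromEnv on the incoming path before handing it to the real open.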

A first attempt

After some Googling I found some useful blog posts which pointed me in the right direction (see here for example). Ultimately I discovered I’d need to:

  • import “C” before anything else
  • convert a *C.char into a Go string with C.GoString before I can operate on it
  • place a ‘//export’ comment naming the function directly above it for it to be recognized by C
  • maintain the same signature as the underlying syscall

With this in mind, here was my starting point:

I should point out here that the syscall.Open function requires a third argument: the mode (permission bits). For the sake of this example I’m setting it to 0600, which is superfluous since we’re only reading the file.

To build this into a shared library:

$ go build -o preload -buildmode=c-shared

And now the test. First, here’s what it looks like without any preload:

$ cat foo
foo

Now, we set our ‘from’ and ‘to’ paths and preload our shared library:

$ export LD_FROM=foo
$ export LD_BAR=bar
$ LD_PRELOAD=./preload cat foo
bar

Success!

When open isn’t open

Now, let’s test it with a simple Python script which opens a file for reading and prints the first line:

$ cat test.py
import sys
print open(sys.argv[1]).readline().strip()

And now our test:

$ LD_PRELOAD=./preload python test.py foo
foo

Sad face. I had assumed this would work in exactly the same way! Let’s check it with strace:

$ strace python test.py foo 2>&1 | grep foo
execve("/usr/bin/python", ["python", "test.py", "foo"], [/* 60 vars */]) = 0
open("foo", O_RDONLY) = 3
read(3, "foo\n", 4096) = 4
write(1, "foo\n", 4foo

As you can see, this looks just like the output from our trace of the cat command. So, why wasn’t my open function being called!?

A little more Googling revealed that ‘open’ is both a syscall and a library function. This led me to try using ltrace which traces library calls:

$ ltrace python test.py foo 2>&1 | grep foo | grep open
fopen64("foo", "r") = 0x5599a95db140

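Ah! So Python’s ‘open’ actually calls the library function fopen64 underneath, not the libc ‘open’ function that my library overrides. (fopen64 still ends up making the open syscall, which is why the strace output looked the same, but it does so from inside libc without going through the ‘open’ symbol that LD_PRELOAD replaces.) Here’s the signature for fopen64: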

FILE *fopen64(const char *pathname, const char *mode)

This complicated things. With ‘open’, I could simply hand off to syscall.Open and return the file descriptor integer. Now I needed to return a pointer to a FILE object. The Go syscall package has no equivalent of fopen64, and certainly nothing that returns a FILE pointer. Some more Googling revealed that I should be able to refer to the FILE type and call the underlying fopen64 library function from C itself. A key difference here is that I would have to convert the new path name from a Go string back into a C string (with C.CString, freeing it afterwards). In order to refer to fopen64, I would need to include stdio.h as well. Here’s what the next iteration looked like:

This didn’t quite do what I wanted:

$ go build -o preload -buildmode=c-shared
./main.go:34:35: could not determine kind of name for C.FILE
./main.go:38:12: could not determine kind of name for C.fopen64
./main.go:37:8: could not determine kind of name for C.free

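Despite including the correct header, none of the C types could be found. After some trial and error, it seems that cgo is very particular about formatting: the comment holding the #include lines has to sit directly above ‘import "C"’, with no blank line in between. So I changed my imports to look like this: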

package main

// #include <stdio.h>
// #include <stdlib.h>
import "C"

import (
	"os"
	"syscall"
)

This improved things slightly:

$ go build -o preload -buildmode=c-shared
./main.go:39:12: could not determine kind of name for C.fopen64

Hmm, so it found the references to ‘free’ and ‘FILE’ but not ‘fopen64’. In the name of desperation, I changed the reference to ‘C.fopen’ which compiled OK:

$ go build -o preload -buildmode=c-shared
$ LD_PRELOAD=./preload python test.py foo
bar

As you can see, the test was successful. Here’s the working code:

So, this appears to work as intended but I’m concerned about the need to use fopen instead of fopen64 as I believe this would cause problems attempting to read a ‘large file’ (>2GB).

Aiming for completeness

Although I’m not using it, I figured I should replicate fopen as well (in case other tools use it instead of fopen64). To do this I simply copied my fopen64 function and removed the ‘64’. This, however, produced two strange results!

First, the python test now produces an error:

$ LD_PRELOAD=./preload python test.py foo 
fatal: morestack on g0
zsh: trace trap LD_PRELOAD=./preload python test.py foo

Second, running ‘ls’ under preload causes the process to hang:

$ LD_PRELOAD=./preload ls
<hangs here>

Simply removing the fopen function and rebuilding the shared library fixes this.

Using ltrace, I don’t see any calls to fopen at all, so I’m unsure as to why this is happening or how to debug it further. I’m hoping my question on the ever-useful golang-nuts board will turn up something useful.

When cgo can’t quite make it

While I was writing this post, I had a helpful reply to my golang-nuts question pointing out that fopen64 is only declared if the preprocessor macro _LARGEFILE64_SOURCE is defined. To define it, you add a CFLAGS directive to the cgo comment block above ‘import "C"’, like so:

// #cgo CFLAGS: -D_LARGEFILE64_SOURCE=1

Now, things really start to break:

$ go build -o preload -buildmode=c-shared
In file included from _cgo_export.c:3:0:
cgo-gcc-export-header-prolog:44:14: error: conflicting types for ‘fopen64’
In file included from ./main.go:5:0,
from _cgo_export.c:3:
/usr/include/stdio.h:298:14: note: previous declaration of ‘fopen64’ was here
extern FILE *fopen64 (const char *__restrict __filename,
^~~~~~~
_cgo_export.c:37:7: error: conflicting types for ‘fopen64’
FILE* fopen64(char* p0, char* p1)
^~~~~~~
In file included from ./main.go:5:0,
from _cgo_export.c:3:
/usr/include/stdio.h:298:14: note: previous declaration of ‘fopen64’ was here
extern FILE *fopen64 (const char *__restrict __filename,
^~~~~~~

Remember what I said about making sure the function signatures match? Well, as you can see here, cgo has ultimately created a signature with ‘char*’ but the declaration of the function in stdio.h uses ‘const char*’ (Go doesn’t have a const modifier). After some back and forth on golang-nuts with workarounds involving writing an fopen64 function in C and calling into my Go code, alas I wasn’t able to get further than this.

What about writes?

Remember when I said I wanted to impact all read operations? Well… Not all calls to ‘open’ are read-only. The code above would need to be modified to inspect the flags (or mode in the case of fopen) and only modify the file path for non-write calls.

Conclusion

Well, this was a useful learning exercise. It taught me a little more about Go, much more about C and ultimately produced the behaviour I was after.

I would however love to be able to call C.fopen64 — if you know of a way please let me know!
