Use Fuse to Inject Failure to I/O

siddontang
Jun 24, 2018

In our previous article, chaos-tools-and-techniques-for-testing-the-tidb-distributed-newsql-database/, I mentioned that we use SystemTap to inject failures into I/O. SystemTap is a powerful tool, but it still has some limitations:

  • If we want to delay I/O and the delay is too long (1s+), SystemTap may be considered hung and return an error.
  • It doesn’t support dynamic changes: to change the delay from 100 ms to 200 ms, I have to restart the SystemTap script.
  • It doesn’t support fine-grained control: for example, if I want to limit the write speed for one file but inject an I/O error for another, I can’t find an easy way to do it with SystemTap.

So sometimes we need another failure injection mechanism. Luckily, there are some, such as Namazu. Following Namazu’s approach, we also use Fuse to inject failures into I/O.

What is Fuse?

Fuse is a user-space filesystem framework. Thanks to its simple API, many other filesystems have been built on it, and we can likewise develop our own I/O failure injection filesystem. (Note: although Fuse is available on many operating systems, we focus on Linux here.)

The following picture shows the architecture of Fuse; it comes from the paper To FUSE or Not to FUSE: Performance of User-Space File Systems.

Fuse contains two parts: a kernel part and a user daemon. The kernel part is a Linux kernel module that registers a Fuse filesystem driver with the Linux VFS. The Fuse driver can be thought of as a proxy that redirects requests to the backend user daemon.

The Fuse kernel module also registers a character device, /dev/fuse, which is the interface between the kernel and the user daemon. The daemon reads Fuse requests from /dev/fuse and writes the replies back to /dev/fuse. A simple Fuse flow looks like this:

  • The application does I/O operations on the mounted Fuse filesystem.
  • VFS redirects the operations to the Fuse kernel driver.
  • The driver creates a request and submits the request to the Fuse queue.
  • The Fuse user daemon reads the request from the queue through /dev/fuse and handles it. Note that handling the request may enter the kernel again, e.g. when the daemon forwards the request to Ext4.
  • After the request is finished, the user daemon writes the result back to /dev/fuse.
  • The Fuse kernel driver marks the request as finished and wakes up the application.
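The flow above can be modeled with a toy sketch in plain Go. This is only an illustration of the request/reply round trip, not the real Fuse protocol: the request and response types, the channel standing in for /dev/fuse, and the in-memory map standing in for the backend filesystem are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// response and request are toy stand-ins; the real Fuse wire format differs.
type response struct {
	data string
	err  error
}

type request struct {
	path string
	done chan response // the caller blocks here until the daemon answers
}

// daemon plays the user daemon: it reads requests from the queue
// (standing in for /dev/fuse), handles them, and writes the result back.
func daemon(queue <-chan request, backend map[string]string) {
	for req := range queue {
		data, ok := backend[req.path]
		if !ok {
			req.done <- response{err: errors.New("no such file")}
			continue
		}
		req.done <- response{data: data}
	}
}

// read plays the kernel driver on behalf of the application: it creates a
// request, submits it to the queue, and sleeps until the reply arrives.
func read(queue chan<- request, path string) (string, error) {
	req := request{path: path, done: make(chan response, 1)}
	queue <- req
	resp := <-req.done
	return resp.data, resp.err
}

func main() {
	queue := make(chan request)
	go daemon(queue, map[string]string{"/b.log": "123"})

	data, err := read(queue, "/b.log")
	fmt.Println(data, err) // 123 <nil>

	_, err = read(queue, "/missing")
	fmt.Println(err) // no such file
}
```

Because every operation passes through the daemon before the application is woken up, the daemon is a natural place to delay or fail a request.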

Go Fuse

As we can see, to implement our own filesystem, the main task is to implement our own user daemon. To do that, the daemon has to speak the Fuse user-kernel protocol. Fortunately, many projects already provide simple APIs for building a user daemon.

I won’t go into the protocol here; I will only show how to use Fuse. There are two well-known Fuse projects in Go, go-fuse and fuse. I will show you how to use go-fuse to build a zip filesystem.

First, we need to generate a zip file. The original uncompressed directory looks like this:

a/
a/a.log
b.log

Both a.log and b.log have the same content “123”.

We will mount the zip file on a directory and then operate on it as if it were an ordinary directory tree. For example, we can mount the zip file on directory m, enter m, and run the following commands:

➜  m ls
a b.log
➜ m cat b.log
123
➜ m cat a/a.log
123

Amazing? We are now accessing the zip file as if it were an ordinary directory. How can we do this with go-fuse? It is very easy.

First, we create a zip filesystem and mount it on the directory. The code is:

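root, err := zipfs.NewArchiveFileSystem("test.zip")
if err != nil {
	log.Fatalf("open archive: %v", err)
}
opts := &nodefs.Options{
	AttrTimeout:  time.Second,
	EntryTimeout: time.Second,
	Debug:        false,
}
state, _, err := nodefs.MountRoot("m", root, opts)
if err != nil {
	log.Fatalf("mount: %v", err)
}
state.Serve()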

In the code above, we use zipfs.NewArchiveFileSystem to create the filesystem and then nodefs.MountRoot to mount it on directory m.

The next question is how to build the zip filesystem itself. We can use the Go standard library package archive/zip to read the zip file. A simple example:

r, err := zip.OpenReader("./test.zip")
if err != nil {
	log.Fatal(err)
}
defer r.Close()

for _, f := range r.File {
	log.Printf("name: %s, is dir %v", f.Name, f.FileInfo().IsDir())
}

Following the directory structure in the zip file, we first use nodefs.NewDefaultNode() to create a root node and then call NewChild repeatedly to create the sub-nodes. The detailed code is in zipfs; we won’t go into it here.
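To make the tree-building idea concrete without depending on go-fuse, here is a minimal sketch using only the standard library. The node type and its child, insert, and list methods are hypothetical stand-ins for go-fuse inodes and NewChild; this is not how zipfs is actually written.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// node is a hypothetical stand-in for a go-fuse inode; it is not a go-fuse type.
type node struct {
	name     string
	isDir    bool
	children map[string]*node
}

func newNode(name string, isDir bool) *node {
	return &node{name: name, isDir: isDir, children: map[string]*node{}}
}

// child returns the named child, creating it if needed
// (the role NewChild plays in the description above).
func (n *node) child(name string, isDir bool) *node {
	c, ok := n.children[name]
	if !ok {
		c = newNode(name, isDir)
		n.children[name] = c
	}
	return c
}

// insert walks the components of a zip entry name such as "a/a.log",
// creating intermediate directories on the way.
func (n *node) insert(entry string) {
	isDir := strings.HasSuffix(entry, "/")
	parts := strings.Split(strings.Trim(entry, "/"), "/")
	cur := n
	for i, p := range parts {
		last := i == len(parts)-1
		cur = cur.child(p, !last || isDir)
	}
}

// list returns the sorted child names, like ls would.
func (n *node) list() []string {
	names := make([]string, 0, len(n.children))
	for name := range n.children {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	root := newNode("", true)
	for _, entry := range []string{"a/", "a/a.log", "b.log"} {
		root.insert(entry)
	}
	fmt.Println(root.list())               // [a b.log]
	fmt.Println(root.children["a"].list()) // [a.log]
}
```

In a real filesystem the entry names would come from r.File in the archive/zip example, and each leaf would also keep a reference to its zip entry so that reads can return the file content.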

Hook I/O

Now that we can use go-fuse to build our own filesystem, the next step is to build an I/O failure injection filesystem. This is easy too: all we need to do is hook the I/O operations and inject failures. We can create a Loopback filesystem with go-fuse. The Loopback filesystem acts like a mirror: all I/O operations are redirected to the real backend filesystem. Before redirecting, we can hook the operations on the Loopback filesystem.

Namazu already provides a library, hookfs, that does both: it creates the Loopback filesystem and provides a Hook API for injecting failures.

Delay Example

Here I will show how to use hookfs for a simple injection: delaying every Read I/O operation by 1s. Of course, we can also change the delay dynamically.

We need to implement the HookOnRead interface:

type HookOnRead interface {
	// if hooked is true, the real read() would not be called
	PreRead(path string, length int64, offset int64) (buf []byte, err error, hooked bool, ctx HookContext)
	PostRead(realRetCode int32, realBuf []byte, prehookCtx HookContext) (buf []byte, err error, hooked bool)
}

The code is:

type MyHookContext struct{}

// MyHook delays every read by dur.
type MyHook struct {
	dur time.Duration
}

// PreRead sleeps and then returns hooked=false so the real read() still runs.
func (h *MyHook) PreRead(path string, length int64, offset int64) ([]byte, error, bool, hookfs.HookContext) {
	time.Sleep(h.dur)
	return nil, nil, false, MyHookContext{}
}

// PostRead passes the real result through unchanged.
func (h *MyHook) PostRead(realRetCode int32, realBuf []byte, prehookCtx hookfs.HookContext) (buf []byte, err error, hooked bool) {
	return realBuf, nil, false
}

We call time.Sleep in PreRead to delay for the configured duration, and then we start hookfs:

h := &MyHook{
	dur: time.Second,
}
fs, err := hookfs.NewHookFs(originPath, mountPath, h)
if err != nil {
	log.Fatal(err)
}
fs.Serve()

The default delay is 1s. We can add a simple HTTP service to change it dynamically (please ignore the data race here):

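go func() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Reject invalid input instead of silently resetting the delay to 0.
		d, err := time.ParseDuration(r.FormValue("dur"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		h.dur = d
	})

	http.ListenAndServe("127.0.0.1:8080", nil)
}()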

Finally, let’s build and run the example. We mount the /tmp/b directory on /tmp/a, and /tmp/b contains a file “a.log” with the content “124”.

time cat /tmp/a/a.log
124
cat /tmp/a/a.log 0.00s user 0.00s system 0% cpu 1.002 total

As we can see, cat takes 1s. Now let’s change the delay and try again:

curl http://127.0.0.1:8080\?dur\=2s
time cat /tmp/a/a.log
124
cat /tmp/a/a.log 0.00s user 0.00s system 0% cpu 2.002 total

The cat now takes 2s. Simple and crazy, right? :-)

Note: when you finish the test, run fusermount -u /tmp/a to unmount the directory.

Epilogue

Using Fuse, we can inject failures into I/O easily. The code above is just a simple example; in fact, we have built a powerful tool that supports read/write speed limiting, error injection, and data modification, and even embeds Lua for complicated control. If you are interested, please join us. My email is tl@pingcap.com.

siddontang

VP of Engineering / Chief Architect at PingCAP. Author of TiDB, TiKV, Chaos Mesh, etc. Contact me: https://www.linkedin.com/in/siddontang/