Angry monkeys pounding on keyboards, aka Fuzzy Testing

Victor Haffreingue
6 min read · May 30, 2018

Principle

Fuzzing is an automated technique for finding bugs in programs. Unlike a brute-force attack, the fuzzer’s payload is constructed to follow a logic that the program can understand. For example, if the application expects a JSON input, a valid request would contain something like:

{
  "request": "refresh_token",
  "token": "d2ba0523-53f7-43d6-9740-5af74ef52380",
  "version": 4
}

The fuzzer tests various payloads, looking for unintended crashes or leaks. Making a program 100% bug-free is a hard task, and defining a formal specification takes a lot of time. Fuzzing is a good compromise: it covers a large number of test cases and gives a respectable result in a predictable amount of time.
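To make the idea concrete, here is a minimal mutation-based fuzzer for this kind of JSON payload. It is only a sketch: the mutation strategies and helper names are illustrative choices, not taken from any particular tool.

```python
import json
import random
import string

def mutate(value):
    """Randomly mutate a single JSON value while keeping its type."""
    if isinstance(value, str):
        choice = random.choice(["flip", "drop", "append"])
        if choice == "flip" and value:
            i = random.randrange(len(value))
            return value[:i] + random.choice(string.printable) + value[i + 1:]
        if choice == "drop" and value:
            i = random.randrange(len(value))
            return value[:i] + value[i + 1:]
        return value + random.choice(string.printable)
    if isinstance(value, int):
        # Boundary and overflow-prone integers.
        return random.choice([0, -1, value + 1, 2**63 - 1, -2**63])
    return value

def fuzz_payload(payload):
    """Return a copy of the payload with one randomly chosen field mutated."""
    fuzzed = dict(payload)
    key = random.choice(list(fuzzed))
    fuzzed[key] = mutate(fuzzed[key])
    return fuzzed

seed = {
    "request": "refresh_token",
    "token": "d2ba0523-53f7-43d6-9740-5af74ef52380",
    "version": 4,
}
print(json.dumps(fuzz_payload(seed)))
```

Each fuzzed payload keeps the shape the application expects, so it gets past the first layer of parsing and exercises the logic behind it.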

History

Fuzzing dates back to the times when punched cards were the main programming medium. By feeding malformed cards (like cards coming from the trash) to the computer, the “trash-deck technique” produced random results and interesting crashes.

In 1981, fuzzing was studied by Duran and Ntafos, who showed that random testing is a cost-effective alternative to systematic testing.

In 1983, Steve Capps created “The Monkey” (a reference to the Infinite Monkey Theorem): it was the first application to fuzz the Macintosh.

In 1988, the word “fuzzing” was coined by Barton Miller during a Unix class. The original “fuzz” project was able to crash a third of the Unix utilities tested.

By 1995, fuzzing had been extended to GUIs, network protocols and many system APIs.

Since 2000, fuzzing has grown considerably (especially in the open-source community) and is now a well-known, automated and appreciated way of testing applications.

Difference

Where classic tests are defined by the developer, fuzzing is more impromptu: the aim is to find unexpected results and errors, outside the scope of what is covered. Because of this, a fuzzer can run for a long time without interruption to provide more and more results.

In unit testing, numerous tests are used to cover as many cases as possible. These tests can act as a safeguard, but will rarely discover new errors; you can’t predict all possible errors. Fuzzing is a great way to explore the unknown, and to extend the scope of the existing tests.

End-to-end tests, on the other hand, follow a track and are a very good way of ensuring that common flows are working. But this is a blessing and a curse: because there is never any deviation from the track, a lot of possible flows are left unexplored. Fuzzing with multiple steps can get rid of this limitation and produce better coverage.

Categories/Complexity

There are two main axes we can use to define a fuzzer: information and complexity. The information axis describes how “aware” the fuzzer is of the program’s structure. A black-box fuzzer knows nothing about the application, whereas a white-box one can see all of its inner workings. A black-box fuzzer offers simplicity and effortless integration but can be limited in terms of coverage. A white-box fuzzer, on the other hand, broadens the scope but requires more work. The distinction is not completely binary, and a balance needs to be found in real-life implementations.

Complexity is related to the level of detail put into the fuzzer’s work. For example, a simple fuzzer would act randomly without considering the application’s reactions. A more complex fuzzer can determine which path to follow depending on the results.

In terms of development time, a simple black-box fuzzer is the quickest one to make. No extra knowledge or configuration is required; the difference from a brute-force tool is very small.

By increasing the information the fuzzer has about the application, it can use the type of data the application accepts as its starting point: imagine a PNG image fuzzed so that the header stays valid but the content is gibberish.
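The PNG idea can be sketched with the standard library alone. The chunk layout below follows the PNG format; the fuzzing strategy (valid signature and IHDR, random image data) is just one illustrative choice:

```python
import os
import struct
import zlib

# The 8-byte PNG signature; a parser that checks only this will accept the file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def chunk(kind, payload):
    """Build a PNG chunk: length, type, payload, CRC over type + payload."""
    return (struct.pack(">I", len(payload)) + kind + payload
            + struct.pack(">I", zlib.crc32(kind + payload)))

def fuzzed_png(body_size=64):
    """Valid signature and header, but gibberish image data."""
    # IHDR: 16x16 image, 8-bit depth, RGBA, default compression/filter/interlace.
    ihdr = struct.pack(">IIBBBBB", 16, 16, 8, 6, 0, 0, 0)
    garbage = os.urandom(body_size)  # random, almost certainly invalid, IDAT
    return (PNG_SIGNATURE + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", garbage) + chunk(b"IEND", b""))

data = fuzzed_png()
```

A decoder that trusts the header will start processing the file and only then hit the malformed image data, which is exactly the code path we want to stress.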

Finally, a complex structure such as an HTTP request gives a good base for more subtle fuzzing: any part of the HTTP definition is subject to fuzzing, and a web application needs to handle every possible case, valid or not.
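As a small illustration, the sketch below builds a raw HTTP/1.1 request with one header value replaced by random noise. The header set and the GET-only shape are assumptions made for brevity:

```python
import random
import string

def fuzzed_http_request(host="example.com", path="/"):
    """Build a raw HTTP/1.1 request with one header value fuzzed."""
    headers = {
        "Host": host,
        "User-Agent": "fuzzer/0.1",
        "Accept": "*/*",
    }
    # Replace one header value with random printable, non-whitespace noise,
    # so the request stays parseable and only the value is malformed.
    target = random.choice(list(headers))
    headers[target] = "".join(random.choices(string.printable.strip(), k=64))
    lines = [f"GET {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"

request = fuzzed_http_request()
```

From here, the same trick can be applied to the request line, the header names, or the framing itself, each exercising a different layer of the server's parser.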

Usage at N26

At N26, we started implementing fuzzing on the REST APIs, in the Continuous Deployment pipeline. Testing a web application requires an extra step: checking that the service is up and running, and constantly monitoring for status changes.

We integrate the fuzzer like any other test, so the deployment pipeline can look something like: build, unit and integration tests, deployment to a staging environment, fuzzing, then deployment to production.

During deployment, any failing step can (and should) stop the process. If fuzzing is introduced to an existing project, it will most probably block deployments for a short to medium amount of time. As safe as this would be, it doesn’t fit well into an established flow. The middle ground we decided on is to have the fuzzing step report without blocking; once the project no longer yields errors, the step becomes mandatory.

Swagger

Fuzzing the REST endpoints is built around an OpenAPI configuration, a JSON file that looks something like:

{
  "paths": {
    "/": {
      "post": {
        "parameters": [
          {"in": "body", "name": "first_name", "type": "string"}
        ],
        "responses": {
          "200": {"description": "Success"},
          "400": {"description": "Invalid input"}
        }
      }
    }
  }
}

This says that a POST request on / needs to have a first_name value in the body of the request, and that it can return either a 200 or a 400. Of course, a real file contains many paths, verbs, parameters and responses.
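A fuzzer can read that information straight from the file. The two helpers below are made up for this example and only handle the subset of the spec shown above:

```python
import json

# The same definition as above, parsed from its JSON form.
spec = json.loads("""
{"paths": {"/": {"post": {
  "parameters": [{"in": "body", "name": "first_name", "type": "string"}],
  "responses": {"200": {"description": "Success"},
                "400": {"description": "Invalid input"}}}}}}
""")

def body_parameters(spec, path, verb):
    """Names and types of the body parameters declared for a path/verb."""
    return {p["name"]: p["type"]
            for p in spec["paths"][path][verb]["parameters"]
            if p["in"] == "body"}

def declared_codes(spec, path, verb):
    """Status codes the API claims it can return."""
    return {int(code) for code in spec["paths"][path][verb]["responses"]}
```

The first helper tells the fuzzer what to generate; the second defines the oracle, since any response code outside this set is worth reporting.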

Side Note: Error Waterfall

Once your application is live, you want to reveal as little as possible about your internal configuration. Every case, success or failure, should be handled, and specific messages should be defined. For example, in python-ish:

def main():
    try:
        return do_the_thing()
    except ValueError:
        return "There was a value error."
    except NameError:
        return "There was a name error."
    except:
        return "There was an error."

The key idea here is that every success and failure is handled, and the last except catches every trailing error, returning a non-informative message to the user. If it were not there, an uncaught error could crash the program and return the stack trace to the user.

Flow

The fuzzing flow is as follows:

for each path:
    for each verb:
        repeat n times:
            send a fuzzed request
            if the return code is not among the defined ones:
                report

The logic is very straightforward and the resulting code is minimal, but it is nonetheless a strong foundation for later improvements. To determine whether a request is valid, the return code is compared to the defined possible ones; any deviation from this list implies a potential information leak.
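The pseudocode above can be fleshed out into a minimal Python sketch. The `send` callable is an assumption standing in for whatever HTTP client actually performs the request and returns the status code, and value generation is reduced to a placeholder to keep the loop readable:

```python
# Minimal spec, same shape as the Swagger example above.
SPEC = {
    "paths": {
        "/": {
            "post": {
                "parameters": [
                    {"in": "body", "name": "first_name", "type": "string"}
                ],
                "responses": {
                    "200": {"description": "Success"},
                    "400": {"description": "Invalid input"},
                },
            }
        }
    }
}

def fuzz(spec, send, repetitions=10):
    """Run the fuzzing loop over every path and verb in the spec.

    `send(path, verb, body)` performs one request and returns the HTTP
    status code; any code the spec does not declare is reported.
    """
    reports = []
    for path, verbs in spec["paths"].items():
        for verb, definition in verbs.items():
            allowed = {int(code) for code in definition["responses"]}
            for _ in range(repetitions):
                # Value generation is simplified to a placeholder here.
                body = {p["name"]: "<fuzzed>"
                        for p in definition["parameters"] if p["in"] == "body"}
                status = send(path, verb, body)
                if status not in allowed:
                    reports.append((path, verb, body, status))
    return reports

# A stub transport that always answers with an undeclared 500:
reports = fuzz(SPEC, lambda path, verb, body: 500)
```

Keeping the transport injectable also makes the fuzzer itself easy to test: a stub that returns a declared code should produce no reports, and one that returns an undeclared code should produce one per request.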

To generate a request, the script looks at the type of data each field expects (string, int, float, …) and generates a valid value. For int, the values can be simple (-1, 0, 42, …), large (-9223372036854775808, 9e100, …) or small (1e-200, -1e-200, …).
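A hypothetical set of such “interesting” values, grouped by type, might look like this (the lists are illustrative; a real fuzzer would mix them with random generation):

```python
def interesting_ints():
    """Boundary and overflow-prone integer values worth trying."""
    return [
        0, -1, 1, 42,          # simple values
        2**31 - 1, -2**31,     # 32-bit signed boundaries
        2**63 - 1, -2**63,     # 64-bit signed boundaries
    ]

def interesting_floats():
    """Very large, very small, and special floating-point values."""
    return [0.0, -0.0, 9e100, -9e100, 1e-200, -1e-200,
            float("inf"), float("nan")]

def interesting_strings():
    """Empty, long, and syntax-breaking strings."""
    return ["", "a" * 10_000, "0", "null", "\x00", "' OR '1'='1", "{}"]
```

Values like these tend to expose integer overflows, float-parsing edge cases and missing input validation much faster than uniformly random data.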

Further

The presented fuzzer is a very simple one, and many improvements can be made:

  • All verbs on all endpoints: simple to implement, and it extends the coverage of the fuzzer.
  • Change the data types: using various types for a field makes a good parsing and input-validation test.
  • Invalid HTTP requests: one level lower, changing the raw content of the HTTP request can yield interesting results.
  • Invalid TCP packets: same principle, another level down; we can mess with the TCP headers. At this point, the fuzzer might have a harder time getting through to the application.

Those changes are only applicable to a single-step fuzzing tool, so many scenarios are not covered here. Still, this process is very useful in terms of coverage and visibility.

Conclusion

Creating a secure application is a long and intricate task. A fuzzing tool is only another step in the right direction and shouldn’t be viewed as a solution to all problems. It is a simple and reliable way to get an idea of your project’s resilience to bugs and crashes, and one you can implement against many different types of applications and services.

Interested in testing the limits of N26’s services?

Constantly improving the way we write and test code is one of our core principles at N26. We are looking for people to join us in building the digital bank the world loves to use. Check out our open positions at http://n26.com/careers.
