The Undefined Behavior Sanitizer

Luciano Almeida
Jan 27, 2018


In a recent post, "A little bit about Thread Sanitizer", we talked about the Thread Sanitizer tool and how it works. If you haven't read it yet, I recommend doing so before reading this one, since it introduces some of the concepts and references used here.

In today's article we are going to talk about the Undefined Behavior Sanitizer or UBSan.

So… let's start with the question:

What is UBSan?

From Apple's Undefined Behavior Sanitizer Documentation

The Undefined Behavior Sanitizer, or UBSan, is an LLVM tool for C languages that detects undefined behavior at runtime. Undefined behavior describes the result of any operation with unspecified semantics, such as dividing by zero, loading memory from a misaligned pointer, or dereferencing a null pointer.

UBSan can detect many other kinds of undefined behavior, such as out-of-bounds array accesses, signed integer overflow, and out-of-range casts to, from, or between floating-point types.
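To make a couple of these categories concrete, here is a small sketch of the guards that keep such operations defined. The function names safe_div and safe_double_to_int are hypothetical, and the range check assumes a 32-bit int so that the bounds are exactly representable as a double.

```c
#include <limits.h>
#include <stdbool.h>

// Integer division is undefined for a zero divisor and for INT_MIN / -1
// (the mathematical result does not fit in an int).
bool safe_div(int a, int b, int *out) {
    if (b == 0 || (a == INT_MIN && b == -1)) {
        return false;
    }
    *out = a / b;
    return true;
}

// Converting a double to int is undefined when the truncated value does not
// fit in an int. The negated comparison also rejects NaN.
bool safe_double_to_int(double x, int *out) {
    if (!(x > (double)INT_MIN - 1.0 && x < (double)INT_MAX + 1.0)) {
        return false;
    }
    *out = (int)x;
    return true;
}
```

Without guards like these, the same operations are the kind of thing UBSan reports at runtime.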

Undefined behavior is one of the hardest kinds of error to debug because you never know what it will do: it may cause a crash, it may return garbage data from a dangling pointer, or, if you are lucky, it may just work as if nothing were wrong.

How it works

Let's use a simple C code example
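A minimal sketch of such an example, assuming the overflow comes from adding 1 to INT_MAX and that the file is saved as ubsan.c (the name used by the commands in this article). The variable name is hypothetical.

```c
#include <limits.h>
#include <stdio.h>

int main(void) {
    int k = INT_MAX;
    k += 1;  // signed integer overflow: undefined behavior in C
    printf("%d\n", k);
    return 0;
}
```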

In the code above, we deliberately cause an integer overflow to demonstrate UBSan.

So first let’s compile and run the code without UBSan to see the results:

// Compiling
clang ubsan.c -o ubsan
// Running
./ubsan

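The result looks a little weird, right? The maximum value of a 32-bit int (INT_MAX) is 2147483647, so why is the result equal to INT_MIN (-2147483648)? It comes from how binary addition works: in two's complement, adding 1 to the bit pattern 0x7FFFFFFF produces 0x80000000, which is the representation of INT_MIN. Because signed overflow is undefined, this wraparound is only what happened to occur here, not something the language guarantees.

The more important point is that there was no runtime error: the program just prints the "wrong" output.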
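To make the binary-addition explanation concrete, here is a small sketch that performs the same addition with unsigned arithmetic, where wraparound is well defined, and then reinterprets the bits as a signed value. The function name wrapping_add_i32 is hypothetical and is not part of the original example.

```c
#include <stdint.h>
#include <string.h>

// Adds two 32-bit signed integers with explicit two's complement wraparound.
int32_t wrapping_add_i32(int32_t a, int32_t b) {
    // Unsigned arithmetic wraps modulo 2^32, which is well defined in C.
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    // Copy the bit pattern into a signed integer of the same width.
    int32_t result;
    memcpy(&result, &sum, sizeof result);
    return result;
}
```

With this definition, wrapping_add_i32(INT32_MAX, 1) yields INT32_MIN, the same value the uninstrumented program printed.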

Now let's see how it works with UBSan enabled.

To compile with UBSan, we just pass the -fsanitize=undefined flag:

clang ubsan.c -o ubsan -fsanitize=undefined

Note: You can enable UBSan for specific checks only. For example, to check only for signed integer overflow, change the flag to -fsanitize=signed-integer-overflow. See more in: Enabling the Undefined Behavior Sanitizer

Now UBSan detects the integer overflow and reports it in the program's output. That's so cool \o/

UBSan on Xcode

Setting up UBSan in Xcode is very simple. Let's set up an Objective-C project and see how it works.

First, we have to enable the UBSan:

Go to Product > Scheme > Edit Scheme and, in the Diagnostics section of the Run action, check Undefined Behavior Sanitizer.

If we set up a project, add the same code as before, enable UBSan, and run the project in Xcode, we get a runtime error like this:

Figure: Xcode undefined behavior sanitizer runtime error

Important

One important point from Apple's Code Diagnostics documentation:

The performance impact of the Undefined Behavior Sanitizer is minimal, with an average 20% CPU overhead in the Debug configuration.

Unlike TSan, UBSan has only a small performance impact, and it can run on devices.

Under the Hood

You may have noticed that Xcode's Runtime Sanitization section says it needs recompilation. That's because the Undefined Behavior Sanitizer instruments the code at compile time: the checks are emitted by the compiler itself while it generates code.

The resulting instrumentation is visible at the LLVM IR level.

From Apple’s Undefined Behavior Sanitizer Documentation

The Undefined Behavior Sanitizer works by using Clang to emit checks into your code during compilation. The nature of the inserted code depends on the kind of undefined behavior being checked for.

So, let's see that …

First, let's look at the LLVM IR without UBSan by compiling the code with:

clang ubsan.c -S -emit-llvm -o llvm

Now let's enable UBSan and see what happens to the IR output:

clang ubsan.c -S -fsanitize=undefined -emit-llvm -o llvm

In this output we can see the check that UBSan emitted: a call to __ubsan_handle_add_overflow inserted into the LLVM IR.
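In C terms, the instrumented addition behaves roughly like the sketch below. It is an illustration under assumptions, not the code Clang emits: report_add_overflow is a hypothetical stand-in for the runtime's __ubsan_handle_add_overflow (whose real signature differs), and the Clang/GCC builtin __builtin_add_overflow plays the role of the overflow-checking intrinsic (typically llvm.sadd.with.overflow) that appears in the IR.

```c
#include <stdio.h>

// Hypothetical stand-in for the UBSan runtime handler. The real handler lives
// in the sanitizer runtime library and takes different arguments.
static void report_add_overflow(int lhs, int rhs) {
    fprintf(stderr,
            "runtime error: signed integer overflow: %d + %d cannot be "
            "represented in type 'int'\n",
            lhs, rhs);
}

// Sketch of what "a + b" turns into once the check is inserted.
int instrumented_add(int lhs, int rhs) {
    int result;
    // Stores the wrapped sum in result and returns true if it overflowed.
    if (__builtin_add_overflow(lhs, rhs, &result)) {
        report_add_overflow(lhs, rhs);
    }
    // By default execution continues after the report, using the wrapped value.
    return result;
}
```

The design point to notice is that the check is inline in the compiled function, which is why the sanitizer requires recompilation rather than attaching to an existing binary.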

Conclusion

Undefined behavior errors, like data races, are really hard to detect. In the same way that TSan is a fantastic tool for debugging data races, UBSan helps us detect this kind of undefined or unexpected behavior.
Personally, I think compiler sanitization tools such as TSan, ASan, and UBSan are amazing, and the people who work on them are doing a great job.

That’s all for this article. I hope you liked it.

If I got something wrong, or if you have a comment or question, please let me know. I'd be really happy to receive your feedback.

Thanks for reading :)

References

  1. Undefined Behavior Sanitizer | Apple Developer Documentation. https://developer.apple.com/documentation/code_diagnostics/undefined_behavior_sanitizer
  2. Getting Started: Building and Running Clang. https://clang.llvm.org/get_started.html
  3. Serebryany, K., Potapenko, A., Iskhodzhanov, T., Vyukov, D.: Dynamic Race Detection with LLVM Compiler: Compile-time instrumentation for ThreadSanitizer. https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/data-race-test/ThreadSanitizerLLVM.pdf
  4. Swift — Child of the LLVM, http://yaunch.io/llvm-and-swift/
