A little bit about Thread Sanitizer

Luciano Almeida
Nov 26, 2017

Thread Sanitizer (TSan), available since Xcode 8, is a tool that helps us debug data races: situations where multiple threads access the same memory location without synchronization and at least one of those accesses is a write. These data races are really hard to debug because they are extremely unpredictable: sometimes the code works and sometimes it doesn't, and you never know which thread got there first.

In this article, we will try to understand a little bit more about what it can do and how it works.

How it works

Let's look at a piece of code that contains a data race between two queues and see how Thread Sanitizer helps us debug it.
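A minimal sketch of such a race, assuming Swift 5 language mode (Swift 6 mode's strict concurrency checking is expected to reject the unsynchronized global at compile time). The queue labels and the `runRace` helper are hypothetical names chosen for illustration, and the group wait is an addition that only keeps the process alive until both queues have run.

```swift
import Dispatch

// Shared mutable state with no synchronization: this is what the two queues race on.
var text = ""

// Hypothetical helper: schedules two unsynchronized writes and then reads the result.
func runRace() -> String {
    let firstQueue = DispatchQueue(label: "example.first")    // hypothetical label
    let secondQueue = DispatchQueue(label: "example.second")  // hypothetical label
    let group = DispatchGroup()

    // Both closures write to `text` from different threads with no ordering between them.
    firstQueue.async(group: group) { text = "Hello" }
    secondQueue.async(group: group) { text = "World" }

    // Waiting keeps the process alive until both writes ran; it does not order them.
    group.wait()
    return text
}

let result = runRace()
print(result)  // which of the two values is printed depends on scheduling
```

Because the two writes are not ordered relative to each other, Thread Sanitizer should still flag them even though the final read happens after the wait.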

In the code above, we write to the same global variable text from two different queues, and after that we read the variable. But just by reading the code, can you tell what value will be printed? Which queue will write first? It's unpredictable, and really hard to debug.

That's where Thread Sanitizer comes in.

First, we have to enable the Thread Sanitizer in Xcode:

Go to Product > Scheme > Edit Scheme and, in the Diagnostics section of the Run action, check Thread Sanitizer.

If you are building from the command line, you can enable it with the corresponding flag for each of the following commands:

  • clang -fsanitize=thread
  • swiftc -sanitize=thread
  • xcodebuild -enableThreadSanitizer YES

You can find these options in the Apple documentation on enabling Thread Sanitizer [1].

Important

Apple's Code Diagnostics documentation [1] points out two very important things. One is platform support:

Thread Sanitizer is supported only for 64-bit macOS and 64-bit iOS and tvOS simulators (watchOS is not supported). You cannot use it when running apps on a device.

The other is the performance impact of using this tool:

Running your code with Thread Sanitizer checks enabled can result in CPU slowdown of 2⨉ to 20⨉, and an increase in memory usage by 5⨉ to 10⨉. You can improve memory utilization and CPU overhead by compiling at the -O1 optimization level.

So, now that we’ve enabled the Thread Sanitizer, let’s run the code that we made earlier.

The first thing that we can see at runtime is a warning from Thread Sanitizer telling us that we have a data race here and that the result can be unpredictable.

The same warning also shows up in the issues section of Xcode's sidebar.

There's also an output on the console with the same warning.

Once Thread Sanitizer has done the work of showing where the race can happen, it's much easier for us to fix it.
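One common way to fix a race like this, sketched here under the assumption that the shared value can be wrapped in a small class, is to funnel every read and write through a single serial queue. `SynchronizedText` and its queue label are hypothetical names for illustration, not something from the original code.

```swift
import Dispatch

// Hypothetical wrapper: every access to `value` goes through one serial queue,
// so no two threads touch it at the same time.
final class SynchronizedText: @unchecked Sendable {
    private var value = ""
    private let accessQueue = DispatchQueue(label: "example.text.access")  // hypothetical label

    func set(_ newValue: String) {
        accessQueue.sync { value = newValue }
    }

    func get() -> String {
        accessQueue.sync { value }
    }
}

let sharedText = SynchronizedText()
let writers = DispatchGroup()

// The two writes still happen in an unspecified order, but they no longer overlap.
DispatchQueue.global().async(group: writers) { sharedText.set("Hello") }
DispatchQueue.global().async(group: writers) { sharedText.set("World") }

writers.wait()
let finalText = sharedText.get()
print(finalText)
```

Note that this removes the data race, not the nondeterminism: which write lands last still depends on scheduling, so if the order matters it has to be enforced separately.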

Under the Hood

To better understand it, let's recall the flow of the Swift compiler.

The Swift compiler takes your Swift code and parses it to generate an Abstract Syntax Tree (AST). Next comes semantic analysis, where the compiler takes the AST produced by the parser, type-checks it, and checks it for semantic issues. Then the Swift Intermediate Language generation (SILGen) phase transforms the type-checked AST into what is called raw SIL. After some optimisations on the SIL, such as generic specialisation and ARC optimisations, the optimised SIL is passed to IRGen, which generates the LLVM Intermediate Representation (IR). LLVM then continues the job and generates the object file.

Well, let's get back to the Thread Sanitizer.

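You may have noticed in the section on enabling Thread Sanitizer that it requires recompilation. That's because the instrumentation is added at compile time: Thread Sanitizer is implemented as a pass in the LLVM compiler, paired with a runtime library that performs the checks while the program runs.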

All the compiler instrumentation is done at the LLVM IR level [3].

Basically, Thread Sanitizer inserts code at the LLVM IR level that records information about each memory access. With this information, as it receives an event for each memory access, it can report a potential race based on a state machine algorithm (described in more detail in [3]).

Apple's Thread Sanitizer documentation [1] illustrates this instrumentation with a pseudocode example.
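To make the idea of recording each access concrete, here is a deliberately simplified toy model. It is not Thread Sanitizer's real algorithm: it ignores synchronization entirely, and it represents threads and addresses as plain integers. `ToyRaceDetector` and `recordAndCheck` are hypothetical names invented for this sketch.

```swift
// One remembered access to a memory location.
struct Access {
    let threadID: Int
    let isWrite: Bool
}

// Toy model only: remembers the last access per address and flags a conflict
// when a different thread touches the same address and at least one side writes.
struct ToyRaceDetector {
    private var lastAccess: [Int: Access] = [:]  // keyed by a pretend address
    private(set) var reports: [String] = []

    // Stands in for the call that instrumentation would insert before each access.
    mutating func recordAndCheck(address: Int, threadID: Int, isWrite: Bool) {
        if let previous = lastAccess[address],
           previous.threadID != threadID,
           previous.isWrite || isWrite {
            reports.append("possible race at \(address): thread \(previous.threadID) vs thread \(threadID)")
        }
        lastAccess[address] = Access(threadID: threadID, isWrite: isWrite)
    }
}

var detector = ToyRaceDetector()

// Two threads writing the same address: flagged.
detector.recordAndCheck(address: 0x10, threadID: 1, isWrite: true)
detector.recordAndCheck(address: 0x10, threadID: 2, isWrite: true)

// Two threads only reading the same address: not flagged.
detector.recordAndCheck(address: 0x20, threadID: 1, isWrite: false)
detector.recordAndCheck(address: 0x20, threadID: 2, isWrite: false)

print(detector.reports.count)
```

Because this model has no notion of locks or queues, it would also flag accesses that are correctly synchronized; tracking that ordering is the job of the state machine algorithm described in [3].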

You can see the actual code that Thread Sanitizer inserts into your Swift code by first compiling it with -emit-ir and the sanitizer enabled:

  • swiftc -emit-ir -sanitize=thread your-code.swift

And then emitting IR again, this time with Thread Sanitizer disabled:

  • swiftc -emit-ir your-code.swift

This way you can diff the two outputs and see what was inserted in the version compiled with Thread Sanitizer.

You can find a good, detailed explanation of the compiler instrumentation and the state machine algorithm in the paper Dynamic Race Detection with LLVM Compiler: Compile-time instrumentation for ThreadSanitizer [3].

Conclusion

Data races are among the most difficult things to debug in real multithreaded applications, and Thread Sanitizer is a really great tool that helps us a lot with that hard task. Although it has its trade-offs in terms of performance, it is really easy to enable when you need it and disable when you don't.

That's all for this article. I hope you liked it.

If I got something wrong, or if you have any comments or questions, please let me know. I will be really happy to receive your feedback.

Thanks for reading :)

References

  1. Thread Sanitizer | Apple Developer Documentation. https://developer.apple.com/documentation/code_diagnostics/thread_sanitizer
  2. Data Races | Apple Developer Documentation. https://developer.apple.com/documentation/code_diagnostics/thread_sanitizer/data_races
  3. Serebryany, K., Potapenko, A., Iskhodzhanov, T., Vyukov, D.: Dynamic Race Detection with LLVM Compiler: Compile-time instrumentation for ThreadSanitizer. https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/data-race-test/ThreadSanitizerLLVM.pdf
  4. Swift — Child of the LLVM, http://yaunch.io/llvm-and-swift/
  5. Compiler Sanitizers for Fun and Profit. Talk by Greg Heo at iOS Conf SG 2017.
