Open Source Code Obfuscation Tool for Protecting iOS Apps

Polidea
Jul 19, 2018

We are happy to announce the launch of Polidea’s open-source project Sirius, a security tool for obfuscating code in iOS apps. Developed by Polidea’s team, it is now fully accessible and ready for use, and we offer commercial support for the tool as well.

Its creation echoes our past fintech project with one of the leading banks in Poland. While there was a broad choice of “guarding” tools for Android, iOS lacked them, so we developed our own (more on this version here). After some time, the tool became outdated and our open-source library was no longer supported. Now, seeing the strong market need, we’ve decided to release a new, updated version that supports Apple’s newest products as well as the Swift language.

Mobile apps’ security through code obfuscation

Security is a crucial aspect of mobile app development. There are multiple tools in the security toolbox, like SSL pinning, database encryption, two-factor authentication or end-to-end encryption between devices. All these techniques focus on preventing user data from being intercepted or stolen, and that’s what matters most! When working with users’ data, any possibility of an attack that compromises their privacy must be prevented. However, there’s one other thing that we might want to protect as well: the logic of our app, the actual algorithms that we programmed. While it’s less important than the users’ data, for some apps (especially in the financial sector) it’s still a part of the overall security net for the whole system. And while it’s impossible to completely prevent attackers from understanding the logic of your app, sometimes it’s worth slowing them down. Doing so can also hinder automated bots that search through your code for easily discoverable patterns. A common technique that helps with that is code obfuscation.

One particular code obfuscation technique is symbol renaming. The idea is to replace the names of symbols in the app (class names, method names etc.) with random strings. It strips them of any meaning and makes it harder to identify their role in the application. The attacker, while reading the decompiled app, can no longer leverage the information carried by the names to deduce the logic of the app.
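To make the idea concrete, here is a minimal sketch of generating random replacement names. It is an illustration only and not Sirius code: the `SeededGenerator`, `randomIdentifier` and `makeRenames` names are our own assumptions, and the seeded generator is used just to keep the example reproducible.

```swift
import Foundation

// A tiny deterministic generator (SplitMix64) so that runs are reproducible.
// Assumption for the sketch: a real tool would likely prefer a secure random source.
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Characters allowed at the start of the generated identifier vs. in the rest of it.
let headCharacters = Array("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
let tailCharacters = headCharacters + Array("0123456789_")

/// Returns a random string that is a syntactically valid Swift identifier.
func randomIdentifier<G: RandomNumberGenerator>(length: Int, using generator: inout G) -> String {
    precondition(length > 0, "identifier must not be empty")
    var result = String(headCharacters.randomElement(using: &generator)!)
    for _ in 1..<length {
        result.append(tailCharacters.randomElement(using: &generator)!)
    }
    return result
}

/// Assigns a fresh random name to every original name, never reusing one.
func makeRenames<G: RandomNumberGenerator>(for names: [String],
                                           length: Int,
                                           using generator: inout G) -> [String: String] {
    var renames: [String: String] = [:]
    var used = Set<String>()
    for name in names where renames[name] == nil {
        var candidate = randomIdentifier(length: length, using: &generator)
        // Retry on the (unlikely) collision with another new or original name.
        while used.contains(candidate) || names.contains(candidate) {
            candidate = randomIdentifier(length: length, using: &generator)
        }
        used.insert(candidate)
        renames[name] = candidate
    }
    return renames
}

var generator = SeededGenerator(seed: 42)
let renames = makeRenames(for: ["Sample", "value", "configure"], length: 32, using: &generator)
for (original, obfuscated) in renames {
    print("\(original) -> \(obfuscated)")
}
```

The first character is drawn from letters only because a Swift identifier cannot start with a digit.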

That’s one of the reasons why on the Android platform there is a widely used tool called Proguard that does precisely that. It has become a standard part of the build process. However, on iOS, there’s no free and open source solution that would be as widely accepted and used as Proguard. The scarcity of obfuscating tools is especially visible when working with a Swift codebase, as most of the obfuscation solutions have been designed to work with Objective-C, and for a good reason.

When developing in Objective-C, code obfuscation by symbol renaming is particularly important. The reason is that all the symbols must be stored in the binary and visible to the runtime so that they can be referenced by strings. This is what allows message passing using selectors or checking protocol conformance by the protocol’s name. Swift doesn’t use the Objective-C runtime by default, and the compiler strips the symbols and applies optimizations that make it harder for the attacker to read the decompiled app. Even so, obfuscation might come in handy: every time we derive from NSObject or use the @objc attribute, we voluntarily participate in the Objective-C runtime and must obey its rules. Therefore, those symbols are not stripped.
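A short sketch shows what “referenced by strings” means in practice. The `PaymentValidator` class is hypothetical and exists only for this illustration. The code assumes an Apple platform where the Objective-C runtime is available, hence the `canImport` guard.

```swift
#if canImport(ObjectiveC)
import Foundation

// Hypothetical class used only for this illustration.
class PaymentValidator: NSObject {
    @objc func validateCard() -> Bool { return true }
}

// The runtime can turn types and selectors into strings...
let className = NSStringFromClass(PaymentValidator.self)
let selectorName = NSStringFromSelector(#selector(PaymentValidator.validateCard))

// ...and strings back into types and selectors. This only works
// because the names are kept in the binary.
let resolvedClass: AnyClass? = NSClassFromString(className)
let resolvedSelector = NSSelectorFromString("validateCard")
let responds = PaymentValidator().responds(to: resolvedSelector)

print(className, selectorName, responds)
#endif
```

Anyone inspecting the binary can read the same names that the runtime uses for these lookups, which is why renaming them removes useful hints.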

Presenting Sirius Obfuscator — code obfuscation tool

That’s why we’ve developed Sirius obfuscator, an open source tool that can take your Swift code and transform it from this:

class Sample: UIViewController {
    var value = 42
    override func viewDidLoad() {
        super.viewDidLoad()
        configure()
        foo(bar: 1)
    }
    func foo(bar baz: Int) {
        value += baz
    }
}
protocol Configurable {
    func configure()
}
extension Sample: Configurable {
    func configure() { }
}

to this:

class aqoxMfcPUXffEuurviH_ZPMDW2hCmXDR: UIViewController {
    var a0vLRcFFAQ1Lvw2sf4ZIigWKjXjpJpug = 42
    override func viewDidLoad() {
        super.viewDidLoad()
        A6PP2E5mcmjEsgOvTeXwy2G44vzYLa6H()
        xG1qrXIMEJC1Eoma2Qbp_ZWJ5y2lrGYX(KuT5vOLIISvSJyju6bYxsHO_vlWUU589: 1)
    }
    func xG1qrXIMEJC1Eoma2Qbp_ZWJ5y2lrGYX(KuT5vOLIISvSJyju6bYxsHO_vlWUU589 vjCKgTT7Cf0ZlEi9giLZstzgdC9XLQcd: Int) {
        a0vLRcFFAQ1Lvw2sf4ZIigWKjXjpJpug += vjCKgTT7Cf0ZlEi9giLZstzgdC9XLQcd
    }
}
protocol dVUt_HSz_a1q1JsbyTJVfk0KeXej8a4z {
    func A6PP2E5mcmjEsgOvTeXwy2G44vzYLa6H()
}
extension aqoxMfcPUXffEuurviH_ZPMDW2hCmXDR: dVUt_HSz_a1q1JsbyTJVfk0KeXej8a4z {
    func A6PP2E5mcmjEsgOvTeXwy2G44vzYLa6H() { }
}

Sirius obfuscator works with Xcode 9.2 and Swift 4.0. It’s easy to use (one command line tool execution) and a pleasure to integrate with your development pipeline, including the CI/CD server. It was built as an extension of the Swift compiler and uses the Swift AST directly. And it’s completely free!

In this blogpost, we will discuss the motivation behind creating the Sirius tool, the challenges that we’ve encountered, the solution we’ve chosen, the resulting obfuscation process and the plans for the future.

Why did we develop our own code obfuscation tool?

From the technical perspective, the obfuscator should make it harder for the attacker to understand the logic after the app is decompiled. But how can one decompile an app, exactly? Let’s quickly go through the process.

The story of the app starts with the source code. Readable by design, it expresses the business logic. Then the source code is compiled into the binary file. If the app is written in Swift, most of the symbols might be stripped during this step and the execution flow is transformed by the compiler optimizations. Then the binary file is uploaded to the App Store. It is encrypted using the FairPlay DRM technology, the same that Apple uses for the iTunes Store.

Here comes the attacker. They download the app from the App Store onto a jailbroken device. When the app is launched and loaded into memory for execution, the system decrypts it. Because the device is jailbroken, the attacker can dump the memory containing your app back into a binary file. Then it can be disassembled and read in a specialized tool like Hopper. The attacker can now read either the assembly code or the pseudo-code, navigate the symbols and follow the execution paths.

Code obfuscation prevents none of those steps except the last one: the attacker’s understanding of the decompiled code. Although it’s applied early in the process, its results should be noticeable only during the last step of reading the decompiled binary. It must not cause any change in app behavior that is visible to the user.

There are three basic ways of applying code obfuscation: before compilation (source to source), during compilation (source to binary) and after compilation (binary to binary). They correspond to different places in the diagram above. If the obfuscation is done before compilation, it works on the textual source files and its output consists of the modified source files. You might see it as a strange kind of refactoring. If obfuscation is done during compilation, it becomes a step in the compilation process. This is how Proguard works in the Android build system (although it can also be seen as a binary-to-binary obfuscator, since it works on the Java .class files). On the iOS side there’s the Obfuscator-LLVM project that provides a forked compiler that has this additional step baked in. If the obfuscation is done after compilation, it consumes the binary file and spits out a transformed binary file. In principle it might be applied any time after the compilation and before the app is loaded into the device memory. However, since everything that happens to the binary file after uploading to the App Store is in the sole control of Apple, in practice the obfuscation is done before publishing. A commercial tool doing exactly that is called iXGuard.

For the Sirius tool we’ve chosen the source to source method: reading the source files and producing transformed source files. We’ve decided that it’s the most user-friendly method that doesn’t require the user to trust that we’ve not injected anything malicious into the binary file. Since we’re producing the source code, everyone can easily make a diff and visualize all the changes.

The challenges of simple renaming

Having chosen to apply obfuscation before the compilation, we found ourselves in need of a tool that would help us with proper renaming. Although it might seem like an easy task, there are a number of cases that require a lot of information to be handled properly. Here’s one illustrative example:

struct foo {}
protocol SomeProtocol {
    associatedtype foo
    func foo(foo: foo) -> foo
}
protocol AnotherProtocol {
    func foo(foo: foo) -> foo
}
class SomeBaseClass<foo> {
    func foo(foo: foo) -> foo {
        return foo
    }
}
class SomeClass: SomeBaseClass<foo>, SomeProtocol {
    override func foo(foo: foo) -> foo {
        return foo
    }
}
SomeClass().foo(foo: foo())

There’s a lot going on here, but two things are worth pointing out. The first one is the abundance of the foo symbol name, which, depending on context, stands for the type name, associated type name, function name, parameter name, generic type name, variable name or initializer call. We aim at introducing as much entropy to the symbol names as possible, so we should change each foo variant but the last one to a different random name (the initializer call must use the same name as the type it constructs). To do it, we need a way of distinguishing between these usages. It’s impossible without a comprehensive analysis.

The second interesting thing is the relation between the foo functions in the SomeProtocol, AnotherProtocol, SomeBaseClass and SomeClass types. They’re named the same and therefore we might suspect that they express similar meaning. Since our job is to obfuscate the meaning, we should change foo to different names in each scope. However, we must not cause any compilation or runtime error, so we must honor the relationship between these foo functions originating from the fact that there’s a point in code that binds some of them together. Namely, SomeClass, by both inheriting from SomeBaseClass and conforming to SomeProtocol, creates the connection between these three types. AnotherProtocol, on the other hand, is never bound to either SomeProtocol or SomeBaseClass. The symbol renaming should therefore change the foo function name in AnotherProtocol to a different name than the foo functions in the other scopes. The most challenging part is that it’s only after we process all the source code that we can identify these relationships, as there’s nothing in SomeProtocol and SomeBaseClass themselves that indicates that they are connected in any way.

The resulting code might look similar to:

struct gU9OIWUVMN {}
protocol c54MiljGWM {
    associatedtype YAxsJ4FEM9
    func wuxWC0Z9ic(mBeXqpr5Ec: YAxsJ4FEM9) -> YAxsJ4FEM9
}
protocol RamC43kstx {
    func oWjESxSMiK(aPZTcAZeLr: gU9OIWUVMN) -> gU9OIWUVMN
}
class H1eoOLdD2V<gnl5fOVXT6> {
    func wuxWC0Z9ic(mBeXqpr5Ec: gnl5fOVXT6) -> gnl5fOVXT6 {
        return mBeXqpr5Ec
    }
}
class I6ScLpZfQF: H1eoOLdD2V<gU9OIWUVMN>, c54MiljGWM {
    override func wuxWC0Z9ic(mBeXqpr5Ec: gU9OIWUVMN) -> gU9OIWUVMN {
        return mBeXqpr5Ec
    }
}
I6ScLpZfQF().wuxWC0Z9ic(mBeXqpr5Ec: gU9OIWUVMN())

The above example shows that the obfuscator must be able to read the source code, parse it, analyze it, typecheck it and then transform it, all with thorough knowledge of the Swift language.

There are a number of possible approaches to this problem, starting from a set of regular expressions and bash commands, through a lightweight homemade parser, an industry standard tool like ANTLR, using one of the already available parts of the Swift toolchain like SourceKit or libSyntax, up to forking the compiler and using it directly. After some research, we’ve landed on the last solution.
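The “bound together” relationship between the foo functions can be modeled as groups of declarations that must share one new name. The sketch below uses a disjoint-set (union-find) structure for that. This is our illustrative assumption about one possible model, not a description of how Sirius tracks it internally. The "Type.function" string keys and the `RenameGroups` type are hypothetical.

```swift
// Disjoint-set (union-find) over function declarations keyed by "Type.function".
struct RenameGroups {
    private var parent: [String: String] = [:]

    /// Returns the representative of the group that `declaration` belongs to.
    mutating func find(_ declaration: String) -> String {
        guard let p = parent[declaration] else {
            parent[declaration] = declaration // first time seen: its own group
            return declaration
        }
        if p == declaration { return p }
        let root = find(p)
        parent[declaration] = root // path compression
        return root
    }

    /// Records that two declarations must end up with the same name.
    mutating func bind(_ a: String, _ b: String) {
        let rootA = find(a), rootB = find(b)
        if rootA != rootB { parent[rootA] = rootB }
    }
}

var groups = RenameGroups()
let declarations = ["SomeProtocol.foo", "AnotherProtocol.foo",
                    "SomeBaseClass.foo", "SomeClass.foo"]
for declaration in declarations { _ = groups.find(declaration) }

// These bindings become known only once SomeClass has been analyzed.
groups.bind("SomeClass.foo", "SomeBaseClass.foo") // override of the base class method
groups.bind("SomeClass.foo", "SomeProtocol.foo")  // satisfies the protocol requirement

// One new name per group. "name0", "name1" are placeholders for random strings.
var nameForGroup: [String: String] = [:]
var renamed: [String: String] = [:]
for declaration in declarations {
    let root = groups.find(declaration)
    if nameForGroup[root] == nil {
        nameForGroup[root] = "name\(nameForGroup.count)"
    }
    renamed[declaration] = nameForGroup[root]
}
print(renamed)
```

Because groups are merged as bindings are discovered, names can only be assigned after the whole codebase has been processed, which matches the observation above.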

Technical choices and their consequences

Using the compiler directly enables us to work with the Swift AST that carries the most information about the app. We could also, if needed, expand the lexer or parser to handle the special cases. What’s more, since the Swift compiler contains the refactoring engine used by Xcode, it may help to perform the actual changes in the source files. Basically, it ticks a lot of boxes for us.

The main issue originating from using the forked compiler is that none of the internal libraries, modules and APIs that we use is public. Therefore there’s no versioning, no promise of stability, no compatibility guarantees. Any commit to the open-source Swift project might force a major rewrite on the obfuscator’s side. Also, the parts of the compiler that we use are not designed for our needs. For example, it is sometimes tricky to get the information about the symbol name from the AST. It might be buried deep down in the hierarchy of nodes that capture the information that is crucial for the efficient type checking or SIL generation, but that we do not care about.

The good thing, however, is that using the existing compiler infrastructure is surprisingly easy. The build system already supports defining the libraries and the dependency relationships between them. It means that creating a swiftObfuscation library that uses swiftAST or swiftDriver was trivial. Also, the command line tools like obfuscator-renamer that use swiftObfuscation under the hood could be added in a few dozen lines of code.

While the details on how to integrate with Swift compiler infrastructure are beyond the scope of this blogpost, more information is provided by this talk.

Another hoop to jump through is that the Sirius obfuscator must be versioned according to the public Swift versions. With each Xcode release that contains a new Swift version we must provide a separate obfuscator binary so that the language versions match. The current version supports Xcode 9.2 and Swift 4.0, but we’re finishing enabling support for Xcode 9.3 and 9.4 (Swift 4.1 and 4.1.2, respectively).

The code obfuscation process

Having all the tools in place, we’ve designed the obfuscation process following two principles. Firstly, the tool must be easy to use and integrate into the build process. Secondly, the tool must be modular to enable extensibility and customization. We’ve been coming back to these principles time and again during the development.

To make the tool easy to use, we’ve aimed for the single execution of a single binary with as few parameters as possible. As the target user of Sirius obfuscator is an iOS / macOS developer, we’ve decided that the input should be provided in the form of the path to the project directory that contains either .xcodeproj or .xcworkspace. The actual paths to the Swift source files and the frameworks that they are importing, as well as the build settings (including architecture and SDK version), will all be identified by the tool, not by the user. The output of the single execution is a copy of the Xcode project that contains all the transformed source code files and resources (like .storyboard and .xib files, assets, configuration plists etc.). There’s also a possibility of obfuscating in place, without creating a copy, which is useful for CI/CD integration.

The resulting interface is:

$ bin/sirius -projectrootpath <path-to-xcode-project> -obfuscatedproject <path-for-obfuscated-project> [-namemappingstrategy <name-mapping-strategy>] [-keepintermediates] [-inplace] [-verbose]

where all the parameters in square brackets are optional and backed by good defaults.

The modularization principle led us to splitting the whole process into small standalone steps and creating a separate command line tool for each one of these steps. The process therefore is as follows:

First, the Xcode project is parsed in search of all the information necessary for performing obfuscation. It’s done by a tool called Files Extractor, which is written in Ruby and uses CocoaPods’ Xcodeproj under the hood. The output is a JSON file in the Files.json format.

Then the Files.json file is consumed by the tool that parses the source code in search of all the symbols that should be renamed. It’s called Symbol Extractor and it uses the Swift compiler to perform the analysis and walk the AST. The output is a JSON file in the Symbols.json format.

The third step is the generation of new names. It’s done by a tool called Name Mapper that uses the Swift compiler infrastructure, although only for defining the CLI and for JSON parsing. It consumes the Symbols.json file and outputs a JSON file in the Renames.json format. At this point we have already gathered all the information required to perform the renaming.

The last step is to finally do it! The tool responsible for that is called Renamer and it takes Files.json and Renames.json as input. It relies on the Swift compiler for making the actual changes in the source files. The output is the obfuscated project.
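To illustrate how one of these steps consumes one JSON file and produces the next, here is a rough sketch of the name-mapping step. The JSON layout and all field names below are hypothetical: the actual Symbols.json and Renames.json formats are defined in the project documentation and are not reproduced here. The counter-based names stand in for random identifiers.

```swift
import Foundation

// HYPOTHETICAL schemas, invented for this sketch only.
struct SymbolEntry: Codable {
    let identifier: String // unique key of the declaration
    let name: String       // original name in the source
}
struct SymbolsFile: Codable { let symbols: [SymbolEntry] }

struct RenameEntry: Codable, Equatable {
    let identifier: String
    let originalName: String
    let obfuscatedName: String
}
struct RenamesFile: Codable { let renames: [RenameEntry] }

/// Decodes the symbols, maps each to a new name via `makeName`, encodes the renames.
func mapNames(in symbolsJSON: Data, makeName: (SymbolEntry) -> String) throws -> Data {
    let input = try JSONDecoder().decode(SymbolsFile.self, from: symbolsJSON)
    let renames = input.symbols.map {
        RenameEntry(identifier: $0.identifier,
                    originalName: $0.name,
                    obfuscatedName: makeName($0))
    }
    let encoder = JSONEncoder()
    encoder.outputFormatting = .prettyPrinted
    return try encoder.encode(RenamesFile(renames: renames))
}

let symbolsJSON = Data("""
{ "symbols": [
    { "identifier": "Sample", "name": "Sample" },
    { "identifier": "Sample.value", "name": "value" }
] }
""".utf8)

var counter = 0
let renamesJSON = try mapNames(in: symbolsJSON) { _ in
    counter += 1
    return "obf\(counter)" // stand-in for a random identifier
}
let decoded = try JSONDecoder().decode(RenamesFile.self, from: renamesJSON)
print(String(decoding: renamesJSON, as: UTF8.self))
```

Passing the name generator in as a closure mirrors the idea of a configurable name mapping strategy, though the real strategies are selected through the CLI parameter.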

The single-execution tool just calls all the small components one by one. There’s also a simple tool for verifying the result of the process by comparing the symbols before and after obfuscation.

As the information passing between these tools is done via JSON files of a well-defined format, it allows for additional scripting and customization. For example, if you’d like a particular symbol to be renamed to some special name, it’s enough to change the data in the Renames.json file before it is read by the Renamer.
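Such a customization could be scripted along these lines. This is a sketch under assumptions: the "renames", "originalName" and "obfuscatedName" keys are hypothetical placeholders, so check the documented Renames.json format before adapting it.

```swift
import Foundation

enum RenamesError: Error { case unexpectedLayout }

/// Forces `original` to be renamed to `newName` in a (hypothetically shaped) renames JSON.
func overrideRename(in renamesJSON: Data, original: String, newName: String) throws -> Data {
    guard var root = try JSONSerialization.jsonObject(with: renamesJSON) as? [String: Any],
          var entries = root["renames"] as? [[String: Any]] else {
        throw RenamesError.unexpectedLayout
    }
    for index in entries.indices
    where (entries[index]["originalName"] as? String) == original {
        entries[index]["obfuscatedName"] = newName
    }
    root["renames"] = entries
    return try JSONSerialization.data(withJSONObject: root, options: [.prettyPrinted])
}

// In a real script the data would be read from and written back to the file on disk.
let renamesBefore = Data("""
{ "renames": [
    { "originalName": "Sample", "obfuscatedName": "aqoxMfcP" },
    { "originalName": "value", "obfuscatedName": "a0vLRcFF" }
] }
""".utf8)

let renamesAfter = try overrideRename(in: renamesBefore, original: "Sample", newName: "Screen1")
let parsed = try JSONSerialization.jsonObject(with: renamesAfter) as? [String: Any]
let editedEntries = (parsed?["renames"] as? [[String: Any]]) ?? []
print(editedEntries)
```

Working on the untyped JSON object keeps any fields the script does not know about intact when the file is written back.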

Where to go from here?

The Sirius obfuscator is not a finished tool by any means, but it’s already quite powerful. It supports a lot of Swift language constructs, a few Xcode project configurations (including basic CocoaPods integration), renaming in XIB / Storyboard files and the additional configuration that might be used, for example, to exclude the Core Data models from renaming.

It has a number of limitations too, most notably no dSYM transformation (so your crash reports will contain obfuscated names), no multiple targets obfuscation (so your unit tests won’t run after obfuscation) and no support for mixed Swift / Objective-C projects (although we’re working on it). There are also some particular Swift language constructs that we haven’t yet provided support for.

The good news is that the project is open source and Polidea is very happy to help the aspiring contributors with getting up to speed with the development. So, if you’re interested in contributing to any of the components of Sirius obfuscator, please take a look into the dedicated repository for description, documentation and all the other required information.

If you’re interested in the technical details, further explanation of the design decisions, challenges and tradeoffs, please check the Documentation folder in each dedicated repository. We’ve put a lot of effort into documenting the development process, as you might see in the example.

Security for your next mobile app

If you’re interested in using the Sirius obfuscator for your app, please check the repository on GitHub for links to the executables and / or the build steps. If you find the tool useful enough to integrate it into your app release process, but the current version doesn’t cover your Xcode project configuration, a particular Swift language construct or any other need, Polidea would be happy to provide consulting and support. Also, if your company is interested in developing a solution requiring a high level of security protection, our team is ready to help.

We hope you’ll find the Sirius obfuscator useful! If so, please let us know at hello@polidea.com.

By Krzysztof Siejkowski, originally published at www.polidea.com.
