This is going to be a low-level article, but I guess you already knew that since you landed here, right?
I wanted to talk about this mysterious term Mach-O…
What is it?
How does it work?
Really, what is it!?
To answer all of that, we’ll have to dig deep and get our hands dirty…
When we build an application in Xcode, a lot of things happen. One of them is converting all the source code into an executable. This executable contains the machine code that will run on the CPU: the ARM processor on an iOS device, or the Intel processor on a Mac.
This executable is stored in a format called Mach-O.
Well, this was easy and fun, Goodbye!
Unless you want to stick around to know about the internals 😈
Mach-O is a binary stream of bytes grouped into meaningful chunks of data. These chunks carry metadata such as byte order, CPU type, size of the chunk and so on.
There are various Mach-O types. The ones you’ve most likely seen are:
- Executable: main app binary, like Example.app/Example
- Dylib: dynamic library, like libswiftCore.dylib
Yeah, I know right, lurking right under our nose!
So, Mach-O files are divided into several segments, but before diving into the segments, let’s look into something else.
Every Mach-O file begins with a header structure that defines the structure of the file. It also contains information about file type and target architecture (armv7, armv7s, i386, etc).
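To make the header less abstract, here is a minimal sketch that decodes the fixed-size 64-bit header with Python's struct module. The constants are the values from <mach-o/loader.h> and <mach/machine.h>; the sample bytes are built by hand for illustration and do not come from a real binary, and the sketch assumes a thin, little-endian, 64-bit file.

```python
import struct

# Values as defined in <mach-o/loader.h> and <mach/machine.h>.
MH_MAGIC_64 = 0xFEEDFACF
CPU_TYPE_ARM64 = 0x0100000C
MH_EXECUTE = 0x2
MH_DYLIB = 0x6

# mach_header_64: magic, cputype, cpusubtype, filetype,
#                 ncmds, sizeofcmds, flags, reserved (32 bytes in total)
HEADER_64 = struct.Struct("<IiiIIIII")

FILE_TYPES = {MH_EXECUTE: "executable", MH_DYLIB: "dylib"}


def parse_header(data):
    """Decode the 64-bit Mach-O header at the start of `data`."""
    (magic, cputype, cpusubtype, filetype,
     ncmds, sizeofcmds, flags, _reserved) = HEADER_64.unpack_from(data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    return {
        "cputype": cputype,
        "filetype": FILE_TYPES.get(filetype, hex(filetype)),
        "ncmds": ncmds,            # number of load commands that follow
        "sizeofcmds": sizeofcmds,  # total size of those load commands
    }


# Hypothetical header: an arm64 dylib with 12 load commands.
sample = HEADER_64.pack(MH_MAGIC_64, CPU_TYPE_ARM64, 0, MH_DYLIB, 12, 1024, 0, 0)
header = parse_header(sample)
print(header)
```

On a Mac you could feed the first 32 bytes of a real binary to parse_header instead of the hand-built sample.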
Just below the header structure are a bunch of load commands, which help in the layout and linking of the file. Load commands can specify:
- The initial layout of the file in the virtual memory (we’ll come back to this)
- Section names and addresses
- dylibs to be loaded
- “main” function address
- Code Signature
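Every load command starts with the same two 32-bit fields, a command type and a size, which is what lets a reader hop from one command to the next. The walk can be sketched roughly like this. The LC_* values are the ones from <mach-o/loader.h>; the fake_command helper and the zero-filled payloads are hypothetical stand-ins for real command bodies.

```python
import struct

# Command types as defined in <mach-o/loader.h>.
LC_SEGMENT_64 = 0x19
LC_LOAD_DYLIB = 0xC
LC_CODE_SIGNATURE = 0x1D
LC_MAIN = 0x80000028  # 0x28 combined with the LC_REQ_DYLD flag

NAMES = {
    LC_SEGMENT_64: "LC_SEGMENT_64",
    LC_LOAD_DYLIB: "LC_LOAD_DYLIB",
    LC_CODE_SIGNATURE: "LC_CODE_SIGNATURE",
    LC_MAIN: "LC_MAIN",
}


def walk_load_commands(data, offset, ncmds):
    """Yield (cmd, cmdsize) for `ncmds` load commands starting at `offset`."""
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command")
        yield cmd, cmdsize
        offset += cmdsize  # cmdsize covers the whole command, fields included


def fake_command(cmd, payload_size):
    """Hypothetical helper: a command whose body is just zero bytes."""
    return struct.pack("<II", cmd, 8 + payload_size) + bytes(payload_size)


blob = (fake_command(LC_SEGMENT_64, 64)
        + fake_command(LC_LOAD_DYLIB, 48)
        + fake_command(LC_MAIN, 16))

found = [NAMES.get(cmd, hex(cmd)) for cmd, _ in walk_load_commands(blob, 0, 3)]
print(found)
```

In a real file the walk would start right after the header and run for the ncmds count stored in it.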
Put together, the Mach-O header is followed by a bunch of load commands, which define the addresses of the sections, the main function, and the dependent binaries to be loaded.
The addresses mentioned above are not absolute: they are relative to the memory address where your Mach-O is loaded. This is done because the starting memory address is randomized every time your app launches, using a nifty technique called Address Space Layout Randomization or, as we lovingly call it, ASLR.
What this means is that when your app process starts, you do not know beforehand which address it will start from.
Let’s imagine the implications. Assume you have a global variable that occupies some memory address. Since you don’t know where your process started from, you cannot determine the memory address of this global variable ahead of time!
As you might have guessed, this is done for security purposes, otherwise, it would become very easy to hack the binary if everything has the same address on every launch!
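The arithmetic behind this is small enough to sketch. All addresses and slide values below are made up for illustration; the point is only that every launch shifts everything by the same random amount, so absolute addresses change while distances inside the binary do not.

```python
def runtime_address(slide, static_address):
    """Where something really lives: its link-time address plus this launch's slide."""
    return slide + static_address


# Hypothetical link-time addresses inside one binary.
MAIN_STATIC = 0x100003F00
GLOBAL_STATIC = 0x100008000

# Two hypothetical launches with different random slides.
for slide in (0x4000, 0x2A8000):
    main_addr = runtime_address(slide, MAIN_STATIC)
    global_addr = runtime_address(slide, GLOBAL_STATIC)
    # The absolute addresses differ per launch, the distance does not.
    print(hex(main_addr), hex(global_addr), hex(global_addr - main_addr))
```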
Let’s look at the individual segments of the Mach-O file
__PAGEZERO is the first segment of an executable file. It has no data inside, so it takes up no space in the file. It reserves the lowest addresses of the process with no access permissions, in order to catch NULL pointer dereferences. You might have faced an EXC_BAD_ACCESS crash; a common cause is that something in your code tried to access data from here, which is not allowed.
As an aside, this segment can be a good place to hide malicious code 😉
The __TEXT segment contains executable code and read-only data. It is made read-only to allow the sharing of the segment when it is mapped into memory. This is primarily used with frameworks, bundles and shared libraries.
And, since the __TEXT segment is read-only, there are no changes that need to be saved back to the disk. If the kernel needs to free up memory, it can simply discard the __TEXT pages and re-read them from disk when needed.
This is why iOS and macOS can cache their dynamic libraries so aggressively.
The __DATA segment contains writable data (e.g. globals, static variables, etc.). Because it is writable, the __DATA segment of a framework or other shared library is logically copied for each process linking with the library.
If you have any experience with Swift, you must be familiar with copy-on-write, which essentially means: do not create a copy until the thing being referenced is edited. Similarly, the __DATA segment isn’t really copied up front. Its pages stay shared until some process modifies one, and that process then receives its own private copy of the page.
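As a toy model of that behaviour (the real work is done by the kernel's virtual memory system, not by code like this), the hypothetical Process class below shares one page between two processes and copies it only on the first write.

```python
class Process:
    """Toy process that maps a library's __DATA page copy-on-write."""

    def __init__(self, shared_page):
        self._shared = shared_page  # page shared with every other process
        self._private = None        # private copy, created on first write

    def read(self, index):
        page = self._private if self._private is not None else self._shared
        return page[index]

    def write(self, index, value):
        if self._private is None:
            # First write: this is the moment the copy actually happens.
            self._private = bytearray(self._shared)
        self._private[index] = value

    @property
    def has_private_copy(self):
        return self._private is not None


library_data = bytearray([1, 2, 3, 4])  # hypothetical __DATA page contents
app_a = Process(library_data)
app_b = Process(library_data)

app_a.write(0, 99)  # only app_a pays for a copy

print(app_a.read(0), app_b.read(0))
print(app_a.has_private_copy, app_b.has_private_copy)
```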
The __OBJC segment is optional and contains data used by the Objective-C language runtime support library.
The __IMPORT segment is also optional and contains symbol stubs and non-lazy pointers to symbols not defined in the executable. This segment is generated only for executables targeted for the IA-32 architecture.
The __LINKEDIT segment contains raw data for the linker (link editor), such as symbol and string tables, compressed dynamic linking info, code signing info, and the indirect symbol table, all of which occupy regions as specified by the load commands.
So, now that we have an understanding of the individual segments, let’s try to look at the bigger picture and see how it all fits together.
The Big Picture — DYLD
So far, we know:
- How a Mach-O file is generated and how its load commands are used to link dependencies in various ways.
- Load commands are used to map the segments into memory.
- Execution of the file begins from the “main” function address given in the load commands.
Well, this is only information and this information requires a brain to process it.
And this brain is called Dyld!
Let me tell you a secret! 🤫
Well, it’s not really a secret
When you launch your app by tapping the app icon, instead of launching your app directly, the kernel launches dyld.
I know right! This guy is a big deal around here.
The kernel actually loads dyld at some random address, and dyld itself has its own __TEXT segment, __DATA segment… well, you get the idea.
It’s the job of dyld to load and set up all the dependent dylibs for us.
To do this, dyld reads the Mach-O header and load commands to find out about the dependent dylibs. It then finds each library file on the file system and parses it.
This process is done recursively because a dylib A can depend on dylib B, which can depend on dylib C, so dyld has to resolve this whole graph of dependencies and finally memory-map the segments of all these dylibs into the process’s address space.
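The recursive walk over dependencies can be sketched as a depth-first traversal. The dependency graph and library names below are hypothetical, and the order real dyld loads things in may well differ; the sketch only shows why a shared dependency is loaded once even when several libraries need it.

```python
# Hypothetical graph: each image lists the dylibs named in its load commands.
DEPENDENCIES = {
    "Example": ["libA.dylib", "libB.dylib"],
    "libA.dylib": ["libC.dylib"],
    "libB.dylib": ["libC.dylib"],
    "libC.dylib": [],
}


def load_recursively(image, loaded=None):
    """Return every image reachable from `image`, each one loaded only once."""
    if loaded is None:
        loaded = []
    if image in loaded:
        return loaded  # already mapped, nothing to do
    loaded.append(image)
    for dependency in DEPENDENCIES[image]:
        load_recursively(dependency, loaded)
    return loaded


load_order = load_recursively("Example")
print(load_order)
```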
Now, remember we talked about ASLR and how you cannot know in advance which address will be assigned to the variables in your app? This is something that dyld has to fix using the techniques below.
Rebasing fixes the pointers that point inside your own binary. The __LINKEDIT segment contains the locations of all the pointers that need to be shifted. Dyld goes through all these pointers and shifts them based on your application’s actual start address.
Notice that to do this, dyld has to read and write the data pages, which makes those pages dirty and triggers copy-on-write. This is why rebasing is expensive in I/O.
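In miniature, the shifting looks like the sketch below. The data words, the list of locations and the slide value are all hypothetical; real rebase information is stored in a compact encoded form rather than as a plain list of indexes.

```python
def rebase(data_words, rebase_locations, slide):
    """Add `slide` to every word that the rebase info marks as a pointer."""
    for index in rebase_locations:
        data_words[index] += slide  # writing here is what dirties the page
    return data_words


# Hypothetical __DATA contents: two internal pointers and one plain integer.
data = [0x100004000, 42, 0x100008010]
pointer_locations = [0, 2]  # which words are pointers
slide = 0x2000              # actual load address minus link-time address

rebase(data, pointer_locations, slide)
print([hex(word) for word in data])
```

Note that the plain integer is left alone: without the list of locations, dyld would have no way to tell a pointer from a number that merely looks like one.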
Binding fixes references to functions in other dylibs, like NSLog, malloc, etc.
Once dyld loads the dependent libraries, it needs to search their symbol tables and find the implementations of these symbols. So, there’s actually a string named _NSLog inside your binary that is unresolved, and what dyld does is look it up in the symbol tables of the dependent libraries and fill in the pointer with the address of the function.
This lookup is computationally expensive.
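A simplified picture of binding is a lookup by name across the loaded libraries. The export tables and addresses below are invented, and the flat search is an assumption made for brevity: real binaries usually also record which library each symbol is expected to come from.

```python
# Hypothetical export tables of two already-loaded libraries.
EXPORTS = {
    "Foundation": {"_NSLog": 0x1800A1000},
    "libSystem": {"_malloc": 0x180020000, "_free": 0x180020400},
}


def bind(undefined_symbols, libraries):
    """Resolve each undefined symbol name to an address exported by a library."""
    bound = {}
    for symbol in undefined_symbols:
        for exports in libraries.values():
            if symbol in exports:
                bound[symbol] = exports[symbol]
                break
        else:
            # No library exports this name.
            raise LookupError("Symbol not found: " + symbol)
    return bound


pointers = bind(["_NSLog", "_malloc"], EXPORTS)
print({name: hex(address) for name, address in pointers.items()})
```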
Next comes the Objective-C runtime setup. All Objective-C class definitions need to be registered. Why? Because you can look up an Objective-C class from a string at runtime by calling NSClassFromString.
So, dyld has to build this table of class names before the app can launch.
Categories are added to method lists. What this means is, if you have created a category over UIView and added a bunch of new functions, those new functions will be added to the method list of UIView.
It also ensures selectors are unique.
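As a loose analogy only (the Objective-C runtime is written in C and is far more involved), class registration and category attachment behave like a dictionary keyed by class name. Every function name and method name in this sketch is hypothetical.

```python
class_table = {}  # class name -> list of method names


def register_class(name, methods):
    """Make a class discoverable by name, as registration does at launch."""
    class_table[name] = list(methods)


def attach_category(class_name, extra_methods):
    """Append a category's methods to the method list of an existing class."""
    class_table[class_name].extend(extra_methods)


def class_from_string(name):
    """Look a class up by name; returns None for unknown names."""
    return class_table.get(name)


register_class("UIView", ["layoutSubviews", "setNeedsDisplay"])
attach_category("UIView", ["roundCorners", "addShadow"])  # made-up category methods

print(class_from_string("UIView"))
print(class_from_string("NotRegistered"))
```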
Finally, the initializers run:
- +load methods are called at this point
- C++ static initializers
This happens in a bottom-up fashion, so the libraries your app depends on are initialized before the app itself.
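Bottom-up here amounts to a post-order walk of the dependency graph: an image's initializers run only after those of everything it depends on. A minimal sketch, using a hypothetical graph and library names:

```python
# Hypothetical dependency graph: image -> images it depends on.
DEPS = {
    "Example": ["libA.dylib", "libB.dylib"],
    "libA.dylib": ["libC.dylib"],
    "libB.dylib": ["libC.dylib"],
    "libC.dylib": [],
}


def initialization_order(root, deps):
    """Return images in the order their initializers would run (leaves first)."""
    order = []
    visited = set()

    def visit(image):
        if image in visited:
            return
        visited.add(image)
        for dependency in deps[image]:
            visit(dependency)
        order.append(image)  # appended only after all of its dependencies

    visit(root)
    return order


init_order = initialization_order("Example", DEPS)
print(init_order)
```

The app binary itself comes last, which matches main() being the final step.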
Whew!! After all of this is done, finally, your main() will execute.
And, this is the story behind the elusive Mach-O file.
- Binaries use Mach-O format with __TEXT, __DATA and __LINKEDIT segments.
- Dyld needs to parse and load all dynamic library dependencies.
- Dyld needs to fix all pointers, both internal and external (rebase, bind, set up the runtime).
- Run static initializers and +load methods.
- AND THEN main() runs.
- Dyld runs before your app starts.
- Avoid using +load methods. They are discouraged, and they are called before your main(), increasing the app’s launch time.
- For app libraries, dylibs incur a lot of startup overhead, as dyld has to do all the parsing, loading, and fixing up for every dylib. Use static libraries instead.