Understanding Swift Performance

rozeri dilar · Published in iOS App Mastery · May 27, 2023

Understand the implementation to understand the performance

Dimensions of Performance

  1. Allocation: when you are building an abstraction and choosing an abstraction mechanism, you should be asking yourself, “Is my instance going to be allocated on the stack or the heap?” (Heap or Stack)
  2. Reference counting: when I pass this instance around, how much reference-counting overhead am I going to incur? (Less or More)
  3. Method dispatch: when I call a method on this instance, is it going to be statically or dynamically dispatched? (Static or Dynamic)

If we want to write fast Swift code, we’re going to need to avoid paying for dynamism and runtime that we are not taking advantage of. And we’re going to need to learn when and how we can trade between these 3 different dimensions for better performance.

Swift allocates and deallocates memory on your behalf; some of that memory is allocated on the stack, and some on the heap.
- The stack is a really simple data structure: you can push onto the end of the stack and pop off the end of the stack. Because you can only add or remove at the end, push and pop can be implemented just by keeping a pointer to the end of the stack (a.k.a. the stack pointer).

Allocation

Stack

  • Decrement stack pointer to allocate. (When we call into a function, we can allocate the memory we need just by trivially decrementing the stack pointer to make space.)
  • Increment stack pointer to deallocate. (Then, after executing the function, we deallocate that memory just by incrementing the stack pointer back up to where it was before we called the function.)
  • Allocating and deallocating from the stack is fast. It is literally the cost of assigning an integer.

Heap

  • Advanced data structure. (The heap is more dynamic but less efficient than the stack. It lets you do things the stack can’t, like allocate memory with a dynamic lifetime, but that requires a more advanced data structure.)
  • Search for an unused block of memory to allocate.
  • Reinsert the block of memory to deallocate.
  • Heap allocation costs more than just assigning an integer, as stack allocation does.
  • Thread-safety overhead. (Because multiple threads can be allocating memory on the heap at the same time, the heap needs to protect its integrity by using locking or other synchronization mechanisms.)

Examples:

  • There is a struct Point. Whenever I create an instance of Point, since it is a value type, it is allocated on the stack (value semantics).
  • There is a class Point. Whenever I create an instance of Point, since it is a reference type, it is allocated on the heap: Swift locks the heap, finds an unused block, and creates the instance there, which might lead to unintended sharing of state (reference semantics). Deallocation likewise means locking the heap and returning the unused block to the appropriate position.

🚀 Classes are more expensive to construct than structs because classes require a heap allocation. Because classes are allocated on the heap and have reference semantics, classes have some powerful characteristics like identity and indirect storage. But, if we don’t need those characteristics for abstraction, we’re going to better use a struct.

🚀 Structs aren’t prone to the unintended sharing of state like classes are.
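The contrast above can be shown in a few lines. This is a minimal sketch with two made-up point types, one a struct and one a class:

```swift
// A minimal sketch contrasting value semantics (struct) with reference semantics (class).
struct PointStruct { var x, y: Double }

final class PointClass {
    var x, y: Double
    init(x: Double, y: Double) { (self.x, self.y) = (x, y) }
}

let s1 = PointStruct(x: 0, y: 0)
var s2 = s1            // copies the whole value
s2.x = 5               // s1 is untouched

let c1 = PointClass(x: 0, y: 0)
let c2 = c1            // copies only the reference
c2.x = 5               // c1.x is now 5 as well: shared state
```

After these assignments, `s1.x` is still 0 while `c1.x` has silently become 5, which is exactly the unintended sharing the struct avoids.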

🚀 Let’s say I have to cache UI images by name, so that the program wouldn’t have to create the same images over and over. It would not be smart to cache them with a String key like this: let key = "\(color):\(orientation):\(name)"

String isn’t particularly a strong type for this key. I’m using it to represent this configuration space, but I could just as easily put the name of my dog in that key. So, not a lot of safety there. Also, String can represent so many things because it actually stores the contents of its characters indirectly on the heap. So, that means every time we’re calling into this function, even if we have a cache hit, we’re incurring a heap allocation.

In Swift, we can represent this configuration space of color, orientation, and name just by using a struct. This is a much safer way to represent this configuration space than a String. And because structs are first-class types in Swift, they can be used as the key in our dictionary.

struct Attributes: Hashable {
    var color: Color             // some enum type
    var orientation: Orientation // some enum type
    var name: Name               // some custom enum type
}

Now, because structs are first-class types in Swift and this one is Hashable, it can be used as the key in our dictionary:

let key = Attributes(color: color, orientation: orientation, name: name)

Now, when we call the above function, if we have a cache hit there’s no allocation overhead, because constructing a struct like this Attributes one doesn’t require any heap allocation; it can be allocated on the stack. So, this is a lot safer and it’s going to be a lot faster.
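Putting the pieces together, here is a hypothetical, self-contained sketch of that cache. Color, Orientation, and Name are stand-in enums, and a String stands in for the real cached image:

```swift
// Hypothetical sketch of an image cache keyed by an Attributes struct.
enum Color { case blue, green }
enum Orientation { case left, right }
enum Name { case balloon, star }

struct Attributes: Hashable {
    var color: Color
    var orientation: Orientation
    var name: Name
}

var cache = [Attributes: String]()

func cachedImage(color: Color, orientation: Orientation, name: Name) -> String {
    let key = Attributes(color: color, orientation: orientation, name: name)
    if let hit = cache[key] { return hit }   // cache hit: no heap allocation for the key
    let image = "image:\(color):\(orientation):\(name)"  // stand-in for expensive image creation
    cache[key] = image
    return image
}
```

Because every stored property of Attributes is Hashable, Swift synthesizes the Hashable conformance the dictionary key needs, and constructing the key never touches the heap.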

Reference Counting

Swift keeps a count of the total number of references to any instance on the heap. And it keeps it on the instance itself. When you add a reference or remove a reference, that reference count is incremented or decremented. When that count hits zero, Swift knows no one is pointing to this instance on the heap anymore and it’s safe to deallocate that memory.

The key thing to keep in mind with reference counting is this is a really frequent operation and there’s actually more to it than just incrementing and decrementing an integer. First, there are a couple of levels of indirection involved to just go and execute the increment and decrement. But, more importantly, just like with heap allocation, there is thread safety to take into consideration because references can be added or removed to any heap instance on multiple threads at the same time, we actually have to atomically increment and decrement the reference count. And because of the frequency of reference counting operations, this cost can add up.

There’s more to reference counting than incrementing, decrementing

  • Indirection
  • Thread safety overhead
Examples:

  • There is a class Point. Let’s assume I created an instance of Point called point1. point1 now has an additional property, refCount, and we see that Swift has added a couple of calls to retain and a couple of calls to release. Retain atomically increments our reference count, and release atomically decrements it.
  • In this way, Swift is able to keep track of how many references to our point are alive on the heap. If we trace through this quickly, we can see that after constructing our point on the heap, it’s initialized with a reference count of one, because we have one live reference to that point. As we go through our program and assign point1 to point2, we now have two references, and so Swift has added a call to atomically increment the reference count of our point instance. As we keep executing, once we’ve finished using point1, Swift has added a call to atomically decrement the reference count, because point1 is no longer really a living reference as far as it’s concerned. Similarly, once we’re done using point2, Swift has added another atomic decrement of the reference count. At this point, no more references are making use of our point instance, so Swift knows it’s safe to lock the heap and return that block of memory to it.
  • There is a struct Point. When we constructed our point struct, there was no heap allocation involved. When we copied it, there was no heap allocation involved. There were no references involved in any of this. So, there’s no reference-counting overhead for our point struct.
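The class walkthrough above can be observed directly: deinit runs exactly when the last reference goes away, i.e. when the count the retain/release calls maintain hits zero. A small demo, with a flag as the observable signal:

```swift
// Demo of reference counting at work: deinit fires when the refcount hits zero.
var deallocated = false

final class PointRef {
    var x = 0.0, y = 0.0
    deinit { deallocated = true }    // the reference count has reached zero
}

var point1: PointRef? = PointRef()   // refcount 1
var point2 = point1                  // retain: refcount 2
point1 = nil                         // release: refcount 1, instance still alive
let aliveAfterFirstRelease = !deallocated
point2 = nil                         // release: refcount 0, deinit runs here
```

After the first `nil` assignment the instance is still alive; only the second release deallocates it.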

What about a more complicated struct, though?

Let’s assume I have a Label struct that contains text, which is of type String, and font, of type UIFont. String, as we heard earlier, actually stores the contents of its characters on the heap, so that needs to be reference counted. And the font is a class, so that also needs to be reference counted. If we look at our memory representation, the label contains two references. When we make a copy of it, we’re actually adding two more references: another one to the text storage and another one to the font. The way Swift tracks these heap allocations is by adding calls to retain and release.

So, with the instances of label struct, we see the label is actually going to be incurring twice the reference counting overhead that a class would have.
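A sketch of that Label type, with a stand-in Font class so the example stays self-contained (the talk uses UIFont):

```swift
// Sketch of the Label example: a struct whose stored properties are themselves
// backed by heap references.
final class Font {}     // stand-in for UIFont, a class (reference type)

struct Label {
    var text: String    // String keeps its character storage on the heap: reference counted
    var font: Font      // a class instance: reference counted
}

let label1 = Label(text: "Hello", font: Font())
let label2 = label1     // copying the struct retains both the text storage and the font
// Two copies of the struct now hold two references each to the text storage and the font.
```

Each copy of the struct is cheap to make, but passing copies around still triggers retain/release traffic for both underlying references.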

In summary, because classes are allocated on the heap, Swift has to manage the lifetime of that heap allocation. It does so with reference counting. This is nontrivial because reference counting operations are relatively frequent and because of the atomicity of the reference counting.

If structs contain references, they’re going to be paying reference counting overhead as well.

🚀 In fact, structs are going to be paying reference-counting overhead proportional to the number of references that they contain. So, if they contain more than one reference, they’re going to incur more reference-counting overhead than a class.

Examples:

struct Attachment {
    let fileURL: URL
    let uuid: String
    let mimeType: String

    init?(fileURL: URL, uuid: String, mimeType: String) {
        guard mimeType.isMimeType else { return nil }
        self.fileURL = fileURL
        self.uuid = uuid
        self.mimeType = mimeType
    }
}

Above, there is a lot of reference-counting overhead: if we look at the memory representation of this struct, all three of its properties incur reference-counting overhead when we pass the struct around, because each of them is backed by a reference to a heap allocation.

A better version:

enum MimeType: String {
    case png, jpeg, gif
}

struct Attachment {
    let fileURL: URL
    let uuid: UUID         // now a struct
    let mimeType: MimeType // now an enum

    init?(fileURL: URL, uuid: UUID, mimeType: String) {
        guard let mimeType = MimeType(rawValue: mimeType) else { return nil }
        self.fileURL = fileURL
        self.uuid = uuid
        self.mimeType = mimeType
    }
}

🚀 UUID is really great here because it stores those 128 bits inline, directly in the struct. So let’s use that. This eliminates all the reference-counting overhead we were paying for the uuid field when it was a String, and we get much more type safety, because I can’t just put anything in here; I can only put a UUID. That’s fantastic. Now let’s look at mimeType and at how I’ve implemented this isMimeType check. I’m actually only supporting a closed set of mime types: JPEG, PNG, and GIF.

🚀 Swift has a great abstraction mechanism for representing a fixed set of things: an enumeration. So, I’m going to take that switch statement, put it inside a failable initializer, and map those mime types to the appropriate case in my enum. Now I’ve got more type safety with this MimeType enum, and I’ve also got more performance, because I don’t need to store these different cases indirectly on the heap. Swift actually has a really compact and convenient way of writing this exact code: an enum backed by a raw String value. This is effectively the same code, with the same performance characteristics, but it’s way more convenient to write. If we look at our Attachment struct now, it’s way more type-safe, with strongly typed uuid and mimeType fields, and we’re not paying nearly as much reference-counting overhead, because UUID and MimeType don’t need to be reference counted or heap allocated.
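To make the failable initializer concrete, here is a self-contained restatement of the improved type with a quick validity check (assuming, as above, that the mime type arrives as a raw String and is validated by the enum):

```swift
import Foundation

// Restatement of the improved Attachment for a usage check.
enum MimeType: String { case png, jpeg, gif }

struct Attachment {
    let fileURL: URL
    let uuid: UUID
    let mimeType: MimeType

    init?(fileURL: URL, uuid: UUID, mimeType: String) {
        // Initialization fails for any string outside the closed set of cases.
        guard let mimeType = MimeType(rawValue: mimeType) else { return nil }
        self.fileURL = fileURL
        self.uuid = uuid
        self.mimeType = mimeType
    }
}

let good = Attachment(fileURL: URL(fileURLWithPath: "/tmp/a.png"), uuid: UUID(), mimeType: "png")
let bad  = Attachment(fileURL: URL(fileURLWithPath: "/tmp/a.mp3"), uuid: UUID(), mimeType: "mp3")
// good succeeds; bad is nil, because "mp3" maps to no MimeType case
```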

Method Dispatch

When you call a method at runtime, Swift needs to execute the correct implementation.

If it can determine the implementation to execute at compile time, that’s known as static dispatch. At runtime, we’re just going to jump directly to the correct implementation. This is really cool because the compiler is actually going to have visibility into which implementations are going to be executed, and so it’s going to be able to optimize this code pretty aggressively, including things like inlining.

Static Dispatch

  • Jump directly to implementation at runtime.
  • Candidate for inlining and other optimizations.

Dynamic Dispatch

  • Look up implementation in a table at runtime.
  • Then jump to implementation.
  • Prevents inlining and other optimizations.

With dynamic dispatch, we’re not going to be able to determine at compile time which implementation to jump to. So at runtime, we’re actually going to look up the implementation and then jump to it. On its own, a dynamic dispatch is not that much more expensive than a static dispatch: there’s just one level of indirection, and none of the thread-synchronization overhead we had with reference counting and heap allocation.

🚀 But this dynamic dispatch blocks the visibility of the compiler. So while the compiler could do all these really cool optimizations for our static dispatches, at a dynamic dispatch, the compiler is not going to be able to reason through it.

What is inlining?

Let’s return to our familiar struct point.

struct Point {
    var x, y: Double
    func draw() {
        // Point.draw implementation
    }
}

func drawAPoint(_ param: Point) {
    param.draw()
}

let point = Point(x: 0, y: 0)

// When we call drawAPoint, the compiler knows exactly which implementation
// is going to be executed, so it can replace the drawAPoint dispatch
// with the implementation of drawAPoint itself.
drawAPoint(point)

// How the compiler sees the above:
let point = Point(x: 0, y: 0)
// Point.draw implementation // inlined directly, no calls at all

The drawAPoint function and the point.draw() method are both statically dispatched. What this means is that the compiler knows exactly which implementations are going to be executed and so it's actually going to take our drawAPoint dispatch and it's just going to replace that with the implementation of drawAPoint.

And then, it’s going to take our point.draw() method and, because that's a static dispatch, it can replace that with the actual implementation of the draw() function. So, when we go and execute this code at runtime, we're going to be able to just construct our point, run the implementation, and we're done. We didn't need the overhead of those two static dispatches and the associated setting up and tearing down of the call stack. So, this is really cool, and it gets to why and how static dispatches are faster than dynamic dispatches.

🚀 Between a single static dispatch and a single dynamic dispatch, there isn’t that much of a difference. But in a whole chain of static dispatches, the compiler has visibility through the entire chain, whereas a chain of dynamic dispatches blocks it at every single step from reasoning at a higher level. So the compiler can collapse a chain of static method dispatches into a single implementation with no call-stack overhead. That’s really cool.

So, why do we have this dynamic dispatch thing at all?

It enables really powerful things like polymorphism. If we look at a traditional object-oriented program here with a drawable abstract superclass, I could define a point subclass and a line subclass that overrides draw with their own custom implementation. And then I have a program that can polymorphically create an array of drawables. Might contain lines. Might contain points. And it can call draw on each of them.

class Drawable { func draw() {} }

class Point: Drawable {
    var x, y: Double
    init(x: Double, y: Double) { (self.x, self.y) = (x, y) }
    override func draw() {
        // ...
    }
}

class Line: Drawable {
    var x1, y1, x2, y2: Double
    init(x1: Double, y1: Double, x2: Double, y2: Double) {
        (self.x1, self.y1, self.x2, self.y2) = (x1, y1, x2, y2)
    }
    override func draw() {
        // ...
    }
}

let drawables: [Drawable] = [Point(x: 0, y: 0), Line(x1: 0, y1: 0, x2: 1, y2: 1)]
for d in drawables {
    d.draw()
}

Because drawable, point, and line are all classes, we can create an array of these things and they’re all the same size because we’re storing them by reference in the array. And then when we go through each of them, we’re going to call draw on them. So, we can understand why the compiler can’t determine at compile time which is the correct implementation to execute.

Because this d.draw, it could be a point, it could be a line. They are different code paths.

So, how does it determine which one to call? The compiler adds another field to classes: a pointer to the type information of that class, stored in static memory. When we call draw, what the compiler actually generates on our behalf is a lookup through that type pointer to something called the virtual method table, which lives on the type in static memory and contains a pointer to the correct implementation to execute. If we rewrite d.draw() as what the compiler is doing on our behalf, we see it’s actually looking up the correct draw implementation through the virtual method table and then passing the actual instance as the implicit self parameter.
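The lookup can be modeled in plain Swift. This is a toy model, not the real runtime: each "instance" carries a pointer to its type's method table, and a call goes through that table at runtime:

```swift
// Toy model of dynamic dispatch: a call looks the implementation up at runtime.
struct VTable {
    let draw: (Instance) -> String   // one slot of the “virtual method table”
}

struct Instance {
    let type: VTable                 // the compiler-added type pointer
    // stored properties would live here
}

let pointVTable = VTable(draw: { _ in "Point.draw" })
let lineVTable  = VTable(draw: { _ in "Line.draw" })

let drawables = [Instance(type: pointVTable), Instance(type: lineVTable)]
var calls: [String] = []
for d in drawables {
    // What `d.draw()` conceptually becomes:
    calls.append(d.type.draw(d))     // look up the implementation, pass d as self
}
```

Every iteration goes through `d.type` to find the implementation, which is exactly the level of indirection the compiler cannot see through.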

So, what have we seen here?

Classes by default, dynamically dispatch their methods. This doesn’t make a big difference on its own, but when it comes to method chaining and other things, it can prevent optimizations like inlining and that can add up.

🚀 Not all classes, though, require dynamic dispatch. If you never intend for a class to be subclassed, you can mark it as final to convey to your fellow teammates and to your future self that that was your intention. The compiler will pick up on this and it’s going to statically dispatch those methods. Furthermore, if the compiler can reason and prove that you’re never going to be subclassing a class in your application, it’ll opportunistically turn those dynamic dispatches into static dispatches on your behalf.
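A quick sketch of `final` in practice (the class and method names here are made up for illustration):

```swift
// Opting out of dynamic dispatch with `final`.
final class ImageRenderer {
    // No subclass can ever override this, so the compiler dispatches it statically.
    func render() -> String { "rendered" }
}

class Renderer {
    // `final` also works per member, when only part of a class should be closed.
    final func setup() -> String { "setup" }   // statically dispatched
    func draw() -> String { "draw" }           // still dynamically dispatched
}
```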

Whenever you’re reading and writing Swift code, you should be looking at it and thinking,

  • Is this instance going to be allocated on the stack or the heap?
  • When I pass this instance around, how much reference-counting overhead am I going to incur?
  • When I call a method on this instance, is it going to be statically or dynamically dispatched?

If we’re paying for dynamism we don’t need, it’s going to hurt our performance.

ref: https://developer.apple.com/videos/play/wwdc2016/416/
