A few years back I joined the number one online classifieds company in the world. One of the first tasks I was given was to bring the app launch time below the 2-second mark. Users were leaving comments on the App Store because the app was “slow”.
The app, as you can imagine, had a mix of Swift and legacy Objective-C code: 4000+ files distributed across 8 repositories (the main app and 7 internal libraries). On top of that, there were 30+ third-party libraries, everything linked using CocoaPods.
pre-main vs. post-main
When talking about app launch optimizations, there are 2 areas to tackle, each with its own challenges and possible solutions.
There are many articles (1, 2, 3) already covering what to do with the pre-main. It usually comes down to converting dynamic libraries into static ones (so they are linked at build time instead of being loaded at launch), or at least reducing the number of dynamic libraries.
But when I did the profiling, although there was some room for improvement in the pre-main section, the slowest part was the post-main (starting when `didFinishLaunchingWithOptions` is called).
I don’t need to mention that several third-party libraries were initialized, internal services were set up, network calls were being made… it wasn’t an easy task.
The following code was responsible for the slow post-main:
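A minimal sketch of the shape of that code, reconstructed from the description further down in this post. The `Category` model, its fields, and the `storeCategoryTree` function name are assumptions made for illustration, not the actual production code:

```swift
import RealmSwift

// Hypothetical model: the real category schema is not shown in this post.
final class Category: Object {
    @Persisted(primaryKey: true) var id: Int
    @Persisted var name: String
}

// Assumed function name. Replaces the stored tree with a freshly downloaded one.
func storeCategoryTree(_ categories: [Category], in realm: Realm) throws {
    // First transaction: wipe the existing data.
    try realm.write {
        realm.deleteAll()
    }
    // Second write, placed inside the loop: a transaction is opened
    // and committed for every single category.
    try categories.forEach { category in
        try realm.write {
            realm.add(category)
        }
    }
}
```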
Can you spot the issue?
Since I had just joined the company and my knowledge of the codebase was limited, I decided to use Instruments to see if I could find a hint of where to look.
Using the Time Profiler, the heaviest stack trace revealed that the problem had something to do with storing data in Realm.
After asking my colleagues, I discovered we were using Realm to persist the complete category tree (5000+ rows, because car makes and models were included). You can find comparisons between Realm and Core Data here, here and here.
On the first install, and every time there was an update on the tree (more often than you would think), the full tree was downloaded from the server and stored using the code shared above. In case of failure to download, a JSON containing the latest tree embedded in the app was used as a fallback.
I didn’t have experience with Realm when I joined that company, so I decided to dig into the documentation, and found out that `Realm.write()` is synchronous and blocking. That’s fine, because we want the delete operation to finish completely before starting the insertions.
But also the documentation states:
Since write transactions incur non-negligible overhead, you should architect your code to minimize the number of write transactions.
Looking back at the first snippet, it’s now obvious that there is a write transaction for every category. Just imagine the overhead of 5000+ database transactions running one after another while the user stares cluelessly at the loading screen.
Refactoring the code resulted in:
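A minimal sketch of this refactor, again with a hypothetical `Category` model and an assumed function name. The delete keeps its own transaction, while all the insertions now share a single second one:

```swift
import RealmSwift

// Hypothetical model: the real category schema is not shown in this post.
final class Category: Object {
    @Persisted(primaryKey: true) var id: Int
    @Persisted var name: String
}

// Assumed function name. Two transactions in total, regardless of tree size.
func storeCategoryTree(_ categories: [Category], in realm: Realm) throws {
    // First transaction: wipe the existing data.
    try realm.write {
        realm.deleteAll()
    }
    // Second transaction: the loop now runs inside one write block,
    // so every category is added within the same commit.
    try realm.write {
        categories.forEach { category in
            realm.add(category)
        }
    }
}
```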
This simple change of moving the second `realm.write` outside the `forEach` improved the efficiency by 90%.
But wait, there’s more!
There is still another potential error to solve. The second snippet still violates the concept of atomicity by using 2 different transactions, one for the delete and one for the write.
Imagine the case in which the delete succeeds but the write fails: the old tree is gone and the new one was never stored.
To avoid inconsistency in the database, both the delete and the write operations have to be grouped within the same transaction, letting Realm handle the rollback in case of error.
The code was refactored one more time as follows:
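A sketch of the single-transaction version, under the same assumptions as before (hypothetical `Category` model, assumed function name). In Realm Swift, `write` cancels the transaction when its block throws, so the delete and the inserts are either committed together or not at all:

```swift
import RealmSwift

// Hypothetical model: the real category schema is not shown in this post.
final class Category: Object {
    @Persisted(primaryKey: true) var id: Int
    @Persisted var name: String
}

// Assumed function name. Delete and insert share one atomic transaction.
func storeCategoryTree(_ categories: [Category], in realm: Realm) throws {
    try realm.write {
        // Nothing is persisted until the block finishes without throwing.
        realm.deleteAll()
        // add(_:) also accepts a sequence, so no explicit loop is needed.
        realm.add(categories)
    }
}
```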
Realm was used to store Categories only, so using the `deleteAll()` command didn’t present any immediate risk. However, it would require further refactoring to make sure only the desired objects are deleted and the entire database is not flushed by mistake, like so:
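A sketch of that narrower delete, assuming the same hypothetical `Category` model and function name. Instead of `deleteAll()`, it queries the stored `Category` objects and deletes only those, so any other object type later added to the same Realm is left untouched:

```swift
import RealmSwift

// Hypothetical model: the real category schema is not shown in this post.
final class Category: Object {
    @Persisted(primaryKey: true) var id: Int
    @Persisted var name: String
}

// Assumed function name. Only Category objects are replaced.
func storeCategoryTree(_ categories: [Category], in realm: Realm) throws {
    try realm.write {
        // Delete just the stored categories, not the whole database.
        realm.delete(realm.objects(Category.self))
        realm.add(categories)
    }
}
```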
One simple line out of place can have a tremendous negative impact on the user, hurting your app rating and costing you potential new users. The first impression upon first install is key.
We can always rely on tools like Instruments to analyze the app from different angles: time profiling, memory allocations, network usage, etc.
However, I believe logic errors like the one described above should be caught during local testing or during code review, and it’s important to understand how this one managed to get merged into the main branch in order to avoid similar scenarios.