This is a story about Xcode, two developers, one MBP, one analytics tool, and a whole bunch of statistics.
Once upon a time, there were two iOS engineers, John Doe and Jane Roe. They were working on a legacy Swift project. At some point they became curious about how much time they spent waiting for Xcode to build the project. They found a tool written by some unknown iOS engineer. The good thing about the tool was that it worked silently, with no need to reinstall it, no interruptions, and no unexpected failures. The tool just worked. John and Jane looked at the results the tool provided and decided whether they needed to improve something or not.
A year passed, and John and Jane decided to take a look at all the data the tool had collected…
Now the story begins
John and Jane looked at the results the tool had collected, and they saw the following image:
Not everything was clear on those charts, but after some investigation the brave iOS engineers were able to figure out what data the charts represented.
It seemed that all these charts represented Xcode activities such as: Build Failed, Build Succeeded, Clean Failed, Clean Succeeded, Tests Failed, Tests Succeeded.
In fact, the engineers found that the tool had also tracked Run Completed events. This information was collected but filtered out of all the charts, mostly because that type of event doesn’t correctly represent ‘Run Times’.
The first chart showed all events that happened during a day. Looking at this graph, the developers realized that even on a very productive day they triggered about 200 events at most.
The second chart looked similar to the first one, but it tracked intentions instead of events. Each intention results in one or more events. So, for example, if the original intention was to test a project, the valid event sequences produced by that intention would be:
- [Build Failed]
- [Build Succeeded, Tests Failed]
- [Build Succeeded, Tests Succeeded]
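The relationship between events and intentions can be sketched in a few lines of Python. This is only an illustration, not the tool's actual logic: the function name `group_into_intentions` and the pairing rule (a Build Succeeded immediately followed by a Tests event forms one test intention, every other event stands alone) are assumptions made for the example.

```python
from typing import List

# Event names as listed in the post; the grouping rule itself is an assumption.
TEST_EVENTS = {"Tests Failed", "Tests Succeeded"}


def group_into_intentions(events: List[str]) -> List[List[str]]:
    """Group a flat, time-ordered list of events into intentions."""
    intentions = []
    i = 0
    while i < len(events):
        current = [events[i]]
        # A successful build directly followed by a test result is treated
        # as one "test" intention (hypothetical rule for this sketch).
        next_is_test = i + 1 < len(events) and events[i + 1] in TEST_EVENTS
        if events[i] == "Build Succeeded" and next_is_test:
            current.append(events[i + 1])
            i += 1
        intentions.append(current)
        i += 1
    return intentions


sample = [
    "Build Failed",
    "Build Succeeded", "Tests Failed",
    "Build Succeeded", "Tests Succeeded",
]
intentions = group_into_intentions(sample)
```

With this rule, the five sample events collapse into three intentions, which is why the second chart would always show counts no higher than the first.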
The third chart showed the number of intentions, grouped by the related pods.
John and Jane found that there was one main project and five pods they had been working on over the last year. Each pod was represented by a nice, bright, unique color.
The last chart showed the running time of the corresponding events. This is actually the chart the iOS developers were originally interested in: it showed how much time they spent waiting for build results.
John also found an old-looking note from an unknown developer:
I’m working on a tool that would allow users to update the code of the application without recompilation (unsupported at the moment). Currently I’m not sure whether it is actually saving that much time. I need to start collecting some data.
Let’s now see what our developers were able to get from those charts.
This is what a real vacation should look like. John didn’t take a laptop on vacation. He really wanted to, but he didn’t. As a result, there were no events for more than 2 weeks.
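A chart like this boils down to summing event durations per day. Here is a minimal sketch, assuming each record is a hypothetical `(day, event, seconds)` tuple; the tool's real storage format isn't described in this post, so both the record shape and the function name are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (day, event name, duration in seconds).
Record = Tuple[str, str, float]


def waiting_hours_by_day(records: Iterable[Record]) -> Dict[str, float]:
    """Sum event durations per day and convert the totals to hours."""
    seconds = defaultdict(float)
    for day, _event, duration in records:
        seconds[day] += duration
    return {day: total / 3600 for day, total in seconds.items()}


# Made-up sample data, only to show the shape of the input.
sample = [
    ("day-1", "Build Succeeded", 3600.0),
    ("day-1", "Tests Failed", 1800.0),
    ("day-2", "Build Failed", 900.0),
]
hours = waiting_hours_by_day(sample)
```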
Refactoring and rewriting
Looking closely at the chart, John and Jane discovered that there were several days when the build failure rate was just enormous: up to 90% of all events.
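For the curious, a failure rate like this could be computed roughly as follows. It is a sketch under assumptions: the `(day, event)` record shape and the function name are hypothetical, and the rate is taken as the share of Build Failed events among all events of the day, matching the wording above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def build_failure_rate_by_day(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return, per day, the fraction of all events that were 'Build Failed'."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for day, event in records:
        totals[day] += 1
        if event == "Build Failed":
            failures[day] += 1
    return {day: failures[day] / totals[day] for day in totals}


# Made-up sample: nine failed builds and one successful build on the same day.
sample = [("refactoring-day", "Build Failed")] * 9 + [("refactoring-day", "Build Succeeded")]
rates = build_failure_rate_by_day(sample)
```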
Those were the days and weeks when they experienced mostly failing builds. There are a few reasons behind it. The first reason is, obviously, refactoring or restructuring. These were the days when they were changing the project structure (separating code into pods, etc.). On those days they were just trying to get the project running, and they spent the whole day trying.
The second potential reason is TDD. Jane and John were trying TDD for small new features. Test Driven Development increased the number of failed builds. As expected. By design. If you’re doing TDD, then before you get runnable code you’ll go through a few rounds of build errors. (We’re not even counting failed tests here. Those would also increase failure events significantly. But here we’re talking about builds only.)
The third reason is local code complexity. In general, it means that our developers weren’t able to change the code in one try. They needed to change it in multiple locations, making sure that all the types were correct. This caused a lot of build failures as well.
Isn’t it obvious that a Continuous Integration setup on an external server will save you a lot of time during the day? Of course, it depends on the project size, so try to guess at what point they set up Jenkins to perform per-commit and release builds.
Up to 3 hours per day were spent just waiting for source code to compile. Sick, right? They knew that they were spending a lot of time simply waiting, but they didn’t know how big the problem was.
Of course, it doesn’t mean that while the project was building they were staring at the screen and waiting until the build finished. But they definitely weren’t writing code at that moment. It’s almost impossible to do something like that while the project is building. At least, not on their MBPs.
By simply moving release and test builds to Jenkins, they decreased build times to under 1 hour per day.
The spike on the right was the day when Jane was playing with compilation flags, trying to decrease build times for the main project. So she had a lot of full clean builds that day. You can see it in the small light-green area on the top graph (which represents clean events).
Splitting to pods (Failure)
Right after vacation, John was pretty sure that they needed to split the project to avoid the full-project recompilation problem, which was happening very often. Somehow the full-project recompilation happened only when certain files were edited. John and Jane spent a lot of time trying to understand what was causing it, but unfortunately they failed. It happened with some files while others worked fine, and they never found a valid answer. But they tried a lot. About once a month they returned to this question, but as soon as they moved on to files that didn’t trigger full recompilation, they hoped it would never happen again.
If you think that this was easy, I assure you it wasn’t. The project was in a state where it was really hard to separate anything into a module. They failed on the first try.
They tried to move some part of the project to a pod (“Awesome Feature”), and failed. They tried to move one part and then realized that they needed to move its dependencies as well… Finally they realized that they would need to move everything to “Awesome Feature”. While that was potentially a solution they could live with, they weren’t able to test that the whole migration went fine, because too many lines of code depended on resources in the main bundle.
So they stopped and decided to try another pod, “Cool Feature”. And then they failed again.
Splitting to pods (Success)
A week later they changed their approach and tried again. After almost a week of trying, they knew which parts of the project could be separated and which parts were really hard to separate (the BFS, or Big “Functional” Storyboard).
They created two pods, Utils and Models, and started to move files one by one. If they weren’t able to move some files, they added //TODO: comments to come back to those files later. A lot of tests were added to cover the moved files, just to freeze the functionality before the next rounds of refactoring.
These pods weren’t that big.
The Utils pod was simply a container of helpers: functions and extensions they were using everywhere.
The Models pod contained the base domain models and the code that transformed JSON to models and back. All logic in the domain models was covered with tests during the transition.
What was the result? They were able to move only about 10% of the code to the submodules, but that was enough to decrease compilation time (in the worst-case scenario) by almost 25%.
Also, moving domain objects and related parsing to the Models pod allowed them to make a pretty fast transition to the new API, almost without touching the main project.
Splitting to pods (Success again)
Next time, when a totally new “Gorgeous Feature” came in, John and Jane already knew what to do. They immediately started a new pod specifically for this feature. They created a separate example project that allowed them to jump right into the new feature without even running whatever was in the main project. By doing this, our iOS engineers were able to achieve blazingly fast development speed, and they spent about 10x less time waiting for builds.
The end of the story
This is where the story ends. What can we get from it?
- CI is good. If you can move repetitive tasks from your computer to CI, do it. The longer the project lasts, the more time you’ll save for the tasks that matter.
- Working with a big monolithic project can be slow, and working on a new feature in a separate subproject can significantly increase productive time.
- It can be really hard to split a monolithic project into subprojects.
- Data is King and Charts are Cool
It took two days to write the Xcode time tracker tool. It took almost a year to gather the data and five long evenings to write this post. Gathering data is fun, even if it is data from a single user working on one specific project. Looking at the data, you see the problems and solutions, rises and falls, good and bad decisions. Of course, Xcode events don’t show the whole picture of the project status; they are only one small piece of the puzzle. The main goal of the tool is to see whether waiting for a build is a problem or not. How much time are we, as developers, allowed to spend daily waiting for source code to compile? Two hours? An hour? 10 minutes? Everyone can decide for themselves. But in order to answer this question, we need to know how much time we are actually spending on this type of “activity”.
P.S. Even More Charts
In the process of writing this post, I spent about 10 hours playing with the R language. I just wasn’t able to stop playing with the data. Here are some awesome charts.
Compare build times of the main project and isolated features. Okay, okay, let’s zoom in on this chart a bit.
Now you should be able to compare build times of the main project and isolated features.
What about event types?
And the last one, just because it looks cool.