Why I wrote my own UIKit

And the end of the platform wars

Most software today is very much like an Egyptian pyramid with millions of bricks piled on top of each other, with no structural integrity, but just done by brute force and thousands of slaves.
- Alan Kay

This is the story of the web, as it exists today. By many definitions, it is one of the wonders of the world: Wikipedia, Facebook, Google. And yet it began humbly, as a medium for sending around and connecting hypertext documents, and only when slapped together with JavaScript did it suddenly become a platform for creating applications.

Fast forward twenty years and you have multiple billion-dollar companies (Facebook and LinkedIn, to name a couple) pouring endless engineering resources into figuring out how to build state-of-the-art applications in HTML. My company (bubbli) fell into this trap as well. “Use CSS transforms,” they said! “Optimize your code with Closure,” they said! But it was all a fool’s errand: we were trying to build bridges across abstractions so deep (web browsers) that it was like trying to build a house that spanned and sat atop ten skyscrapers.

We all eventually came to our senses and started building native applications for each individual platform because each platform provided “the best tool for the job,” in the sense that they had been designed to build apps from the ground up. Furthermore, the general mobile computing experience has greatly improved in the past couple of years: Apple and Google have continually pushed out a lot of great additions to their respective platforms. Along these lines, I remember sitting at a UI Dynamics session at WWDC 2013 and thinking to myself:

Wow, this is amazing.

Followed shortly by:

Wait a second, why do we have to be spoon-fed UI innovation from Apple? Did we all forget that computers are Turing complete and we can do whatever we want with them?

And thus began my descent down the rabbit hole, writing my own UI framework in OpenGL ES along the way.

The White Rabbit

Pro-tip: the text alignment is off because I use native text rendering libraries on each platform and I haven’t worked out the text metrics for OS X where I recorded this GIF. On iOS everything looks much better!

If you’ve never played around with our app, here’s the gist: spherical photos, called bubbles, that you take with your phone and always view as windows into other worlds.

We think of them as a medium unto themselves, deserving their own visual representation. Just like you never look at a film as a thousand pictures side-by-side, you never see a bubble stretched out into a flat image.

The challenge here is that we’re presenting a lot more data than a flat photo, in a non-linear fashion, at 60fps, in a way that is not remotely CPU friendly. I first prototyped this in CoreAnimation, but as the guys who built Facebook Paper discovered, loading a lot of resources concurrently in UIKit is extremely difficult. Very quickly I switched to drawing bubbles in OpenGL.

Down the Rabbit Hole

The biggest challenge at this point was that OpenGL views did not cope well with moving on and off the screen, resizing, and animating along with the rest of CoreAnimation. I had managed to piece together enough glue with UICollectionViews that things pretty much worked, but after we saw iOS 7 at WWDC we decided the bar had been raised: transitions should be contextual when possible. We quickly began conceptualizing new interactions in this spirit. One thing that we struggled with was an interface for changing the size of the bubble views continuously, without snapping you between grid sizes, which we felt was disorienting on a small screen.

What we settled on is what you see to the left. As you slide your finger along the bottom of the view, the grid zooms in and out like a rope that is being wrapped tighter and tighter around a pole.

Because the gif on the left was recorded on my computer, these bubbles are static, but on an iPhone, you can imagine each little window orienting along with the gyroscope on the device. It’s pretty amazing to watch (the app is free, so go download it to see).
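One way to read the rope-around-a-pole analogy is as a layout where every cell sits at some distance along a single long strip, and the strip wraps each time it reaches the width of the view. The sketch below is only an illustration of that idea, not the app's actual layout code; `CellFrame`, `ropeLayout`, and their parameters are hypothetical names. The key assumption is that the column count is allowed to be fractional, which is what lets the grid change size continuously instead of snapping.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types and names, for illustration only.
struct CellFrame {
    float x;     // left edge of the cell, in points
    float y;     // top edge of the cell, in points
    float size;  // side length of the square cell
};

// Place cell `index` along one long "rope" that wraps every `viewWidth` points.
// `columns` may be fractional, so a sliding finger can change it smoothly.
CellFrame ropeLayout(int index, float columns, float viewWidth) {
    assert(index >= 0 && columns > 0.0f && viewWidth > 0.0f);
    const float size = viewWidth / columns;  // cell side at this zoom level
    const float distance = index * size;     // how far along the rope this cell starts
    const float row = std::floor(distance / viewWidth);  // completed wraps so far
    return CellFrame{distance - row * viewWidth,  // offset within the current wrap
                     row * size,                  // each wrap drops one cell height
                     size};
}
```

With a 400-point-wide view, cell 5 at 4 columns lands at (100, 100), and cell 3 at 2.5 columns lands at (80, 160). At fractional column counts a cell can straddle the right edge, so a real renderer would also have to draw the overflowing part at the start of the next row or clip it.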

I prototyped this in pure OpenGL, and to my surprise, found I could easily get hundreds of bubbles on screen at once whereas in a highly tuned UICollectionView, I could maybe get twenty.

There was no going back.

All-in

One of the biggest benefits of being the sole engineer at a startup is that I don’t really need to justify my gut intuitions to anyone. Conveniently, it was also clear that, although we were about to launch, Apple wouldn’t promote an app that looked like an iOS 6 app, so we would need to adapt our app to iOS 7 anyway. So I did what you should never do and rewrote virtually everything from scratch, using OpenGL ES.

To prepare for the future, I made sure that what I wrote would run on essentially any modern platform with minimal modifications. I switched from CoreData to SQLite. I stopped writing in Objective-C wherever I could and wrote in C++11 instead. I wrote my own network stack, touch handling, scroll views, table views, physics, Gaussian blur, resource loading, animations, etc.

Over the four-month period between WWDC and our launch I rewrote all of our code and ended up with a codebase that was actually a few thousand lines smaller than our previous app while still including an entire clone of UIKit! We launched and, despite my fears, virtually no one pointed out any weirdness about how our app “didn’t feel right.”


Beyond Parlor Tricks

While duplicating all of iOS 7's details was a fun intellectual exercise, the real payoff of writing your own UI framework is what happens next. To some degree, this approach fulfills the promise of HTML5—write once, run [almost] everywhere—with none of the compromises in performance or development time.

Other platforms? Piece of cake.

I have a fully functional Mac app that I threw together over Christmas weekend. In fact, the screencasts above were all recorded from it (not the iOS Simulator). I’ve already got the app bootstrapped on a ton of other platforms (well, except for Windows Phone) and even on things like emscripten in the browser! Will we have to drop some of the Apple style scrolling on other platforms? Yes, but it’s just a few if statements/ifdefs.
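To make the "few if statements/ifdefs" concrete, here is a minimal sketch of how a platform-specific scrolling choice could be isolated in one place. It is an assumption about the shape of such code, not an excerpt from the framework: `ScrollStyle`, `platformScrollStyle`, `resolveOffset`, and the 0.5 resistance factor are all hypothetical. The only real identifier is `__APPLE__`, the macro that compilers predefine when targeting Apple platforms.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-platform tuning; field and function names are illustrative.
struct ScrollStyle {
    bool rubberBand;  // may content stretch past its edge and spring back?
};

// The single place where the build target decides the scrolling feel.
ScrollStyle platformScrollStyle() {
#if defined(__APPLE__)
    return ScrollStyle{true};   // Apple-style bounce at the edges
#else
    return ScrollStyle{false};  // hard stop everywhere else
#endif
}

// Resolve a raw finger-driven offset against the content bounds [0, maxOffset].
float resolveOffset(float rawOffset, float maxOffset, ScrollStyle style) {
    assert(maxOffset >= 0.0f);
    const float clamped = std::min(std::max(rawOffset, 0.0f), maxOffset);
    if (!style.rubberBand) {
        return clamped;  // never leave the content bounds
    }
    // Placeholder resistance: content follows the finger at half speed past the edge.
    return clamped + (rawOffset - clamped) * 0.5f;
}
```

Calling `resolveOffset(rawOffset, maxOffset, platformScrollStyle())` keeps the rest of the scroll view code identical across targets. Dragging 40 points past the top yields an offset of -20 with rubber banding and 0 without it.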

No more waiting on fixes in closed source libraries you depend on.

CoreData contexts firing off a million KVO notifications in the middle of merging contexts? Nope. When you’ve built the world from scratch, you can solve virtually any of the bugs you come across through just brute will and persistence. Compare this to weird behavior you might find in IE11 — have fun waiting a year for the next version!

Additionally, I would wager that for competent programmers, getting up to speed is significantly easier than on a closed platform because you can read the code and understand how everything works; if you get the abstractions right this is only a few thousand lines of code.

Insane performance by embracing modern hardware.

Virtually every CPU that connects to a screen ships with a GPU. Why are we still doing so much drawing on a CPU? Consumers expect increasingly immersive experiences, and GPUs are incredibly efficient at rendering them. For example, if you take a look at the camera button in our app, you’ll notice it “refracts” the rest of the screen as you scroll around. iOS 7 improves screenshotting performance considerably, but it is still nowhere near as fast as a view rendered to a texture.

A unified code-base.

As we move to the future, we won’t have to deal with the constant battle of keeping different platforms at feature parity (notwithstanding the Apple approval process). If I didn’t love Go(lang) so much, I would probably write our backend using the same DB abstractions/code we use in the app. This obviously has huge benefits from a personnel point of view in terms of specialization and just sheer development team size.


Towards the end of the platform wars

Nothing in this post is new: game developers have been embracing this philosophy for years. The magic trick that platform vendors have managed to pull off is convincing developers that user interfaces are fundamentally tied to the operating systems they run on. However, as is the trend in recent years, you can run a surprising amount of code in user space without any special privileges, and you can get very, very far with just POSIX and Khronos standards and interfaces.

Furthermore, something I’m very excited about is the potential to run these sorts of unified platforms on emscripten. If emscripten continues to improve, JS will become less and less important, and at some point browser vendors will be able to swap in a backend that doesn’t need to pretend to compile to JS. We will then no longer have to rely on so many pre-determined browser choices when engineering our applications for the web.

If I were Apple and/or Google, I would be very concerned that if this philosophy progresses, the developer silos between platforms will fall and market share will become much more fluid. If I weren’t either of these companies, I would be pouring as many resources as I could into pushing this philosophy.

Alan Kay has often remarked that our perceived complexity of modern software systems is something we accept to be true when it’s actually not, stating that if we rewrote all of our abstractions with the retrospective of our past learnings, we would end up with a drastically smaller corpus of core code. I’m inclined to believe him and am incredibly excited about the innovation that is possible if we do.


Until next time…

In the future, I hope to expand upon some of the key abstractions that made development so enjoyable and some of the key drawbacks and challenges in building a cross platform UI framework. Until then, feel free to reach me @newhouseb on twitter.

Written by Ben Newhouse.