The 10th Anniversary of the iPad — a Perspective From Office

Terry Crowley
Published in The Startup, Mar 6, 2020 (19 min read)

There have been a number of interesting and thoughtful retrospectives on the 10th anniversary of the iPad’s announcement. Steven Sinofsky used a Twitter thread (gathered into this post) to talk about the Windows team response to the iPad announcement. John Gruber, Jean-Louis Gassée and Ben Thompson also had interesting takes. Those last posts express disappointment, especially with recent attempts to expose a (rather unintuitive) multi-window, multi-tasking user interface on the iPad.

I thought I’d reflect a bit on my experience from the Microsoft Office side. I had been running development in Office for about four years and was working on finishing up Office 2010 and starting planning for the next big release when Apple made the iPad announcement.

My first real response came not at the announcement but a few months later when I walked into the Bellevue Apple store and actually got my hands on a device. More than anything else, the thing that blew me away was the responsiveness of the device. Yes, weight, battery life, overall performance and the quality of the display were all impressive. But the thing that stood out for me was how smooth and responsive all the applications felt.

The emotion I felt was mostly frustration and anger. This will require a bit of an aside to explain where all this emotion was bubbling from.

I had been building complex graphical applications since the early eighties — almost 30 years by that point. The challenge in any non-trivial application is staying responsive — providing immediate feedback even as the user edits a large, complex artifact like a long document, complex web page, spreadsheet or CAD model. The approach I had taken in virtually all the applications I had written was one that is familiar to any developer trying to do something complicated in a browser application today. Design a data model that makes most user actions quick to execute. Break any large synchronous computation into small pieces so you do not block the user interface for long periods of time. Figure out how to provide intermediate feedback during that computation so the user knows progress is being made on whatever action they took that launched the computation. For any interaction over the network, use asynchronous designs, since the performance is virtually guaranteed to be highly variable with much higher error rates than is typical for purely local interactions. Blocking on a network request, which inherently has to have a long timeout, is a guaranteed hang.
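As a rough illustration of the "break it into small pieces and report progress" approach described above, here is a minimal Python sketch using `asyncio`. The function name `word_count_in_chunks`, the `on_progress` callback and the demo data are all hypothetical, chosen only for illustration; this is not how any Office application was actually written.

```python
import asyncio


async def word_count_in_chunks(lines, chunk_size, on_progress):
    """Count words in small slices, yielding to the event loop between slices.

    Hypothetical example: the 'large computation' is a word count, and
    on_progress stands in for whatever intermediate feedback a UI would show.
    """
    total = 0
    for start in range(0, len(lines), chunk_size):
        chunk = lines[start:start + chunk_size]
        total += sum(len(line.split()) for line in chunk)
        # Intermediate feedback so the user can see the work is advancing.
        on_progress(min(start + chunk_size, len(lines)), len(lines))
        # Yield control so other tasks (for example input handling) can run.
        await asyncio.sleep(0)
    return total


def run_demo():
    """Run the chunked count on ten four-word lines, recording progress."""
    lines = ["the quick brown fox"] * 10
    progress = []
    total = asyncio.run(
        word_count_in_chunks(lines, 4, lambda done, n: progress.append((done, n)))
    )
    return total, progress
```

The design choice is the one the paragraph describes: no single step runs long enough to block everything else, and each slice ends with a progress report before control is handed back.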

Windows (really the Win32 application programming interface or API) had gone down a different path. Win32 generally used synchronous, blocking APIs, even for interfaces that interact with the network. There were thousands of such APIs, developed during the explosion of client-server computing over the previous 20 years. Synchronous APIs allow a developer to write simpler sequential code rather than requiring they use asynchronous techniques (with callbacks, state machines, etc.). The approved way to write responsive applications in this environment was to “wrap a thread” around any API that might block on a network interaction. (This actually might involve wrapping a thread around a larger sub-component that calls lots of such APIs rather than a separate thread per-API. And other things block besides just network interactions.) An operating system thread is separate from the special main thread used for user interactions and allows computations to proceed in the background (or in practice, do blocking waits in the background) while the main thread keeps the application responsive.
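The "wrap a thread" pattern can be sketched in a few lines of Python. This is only an analogy under stated assumptions: `blocking_lookup` is a hypothetical stand-in for a synchronous API that may block on the network (simulated here with a sleep), and the polling loop stands in for a main thread that keeps processing user input. It is not Win32 code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_lookup(name):
    """Hypothetical synchronous API that may block on the network."""
    time.sleep(0.2)  # simulated network wait
    return f"resolved:{name}"


def lookup_without_hanging(name, tick=0.02):
    """Run the blocking call on a worker thread while the caller stays live.

    Returns the result plus the number of 'ticks' the calling thread was
    free to do other work while the blocking wait happened in the background.
    """
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(blocking_lookup, name)
        while not future.done():
            ticks += 1        # a real application would handle UI events here
            time.sleep(tick)
        return future.result(), ticks
```

Even this toy version hints at the problems discussed next: the caller has to know in advance that `blocking_lookup` blocks, and every such call site needs its own coordination between the worker and the main thread.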

There were a lot of practical problems with this approach. I remember being stunned by the obliviousness of a Windows kernel engineer at an internal workshop when I talked about application hangs and he said “everybody knows you’re not supposed to call a blocking API on the main thread!”

The first and most obvious problem is that there is nothing in either the structure of an API or its documentation that tells you whether an API blocks! A Windows kernel engineer knows the six APIs that he cares about that block. The Office developer is faced with thousands of such APIs.

As new features like network printers or network file systems were added, APIs that would not block in one release became blocking APIs in the next. This allowed new operating system features to be “slipped in” without requiring applications be re-written — but those applications would occasionally start hanging instead.

The Windows driver model, where a single API was layered on top of code provided by lots of third parties (for printers and other devices) meant that an API might block when talking to one kind of printer or other device and not block when talking to another.

These APIs were especially pernicious because they were often designed so they might display UI, therefore requiring that they be called on the user interface thread as well as sometimes block — these APIs were impossible to use without exposing potential (actually certain) hangs.

An API might cache the result of some network call locally to speed up repeated calls. A side-effect was that the performance of the API varied wildly — sometimes local and fast and sometimes interacting with a remote service with lots of variation in latency. This made it hard to reproduce and recognize these problems.

The multi-threaded programming necessary to use these techniques effectively has large aspects of “rocket science” to it and there were no generally accepted composable patterns; mostly it was roll your own. So while a single sequential algorithm might be easier to write using this blocking API design, orchestrating a large complex application was much, much harder.

Early development of these APIs (and the overall approach) was on hard-wired machines, reliably connected to local departmental servers. As the operating environment became dominated by intermittently connected laptops with flaky wireless connecting to more and more remote services, the practical network environment changed to one where any application designed in a way that assumed network communication was fast and reliable was going to suffer.

On top of these structural challenges, laptops, and especially inexpensive netbooks, had become “unbalanced”. Memory had grown in capacity, processors were continuing to improve and the capacity of magnetic hard drives had expanded dramatically. But IOPS (Input/Output Operations per Second) had trailed significantly behind these other improvements. That meant there was space for all those large programs on disk and space for them in dynamic memory, but getting from disk to memory was painfully slow, especially on boot and resuming from sleep. Even programs that tried to protect themselves from hangs on network interactions would be slow and glitchy while competing with other applications for precious reads or writes to the disk. This came on top of problems introduced by invasive anti-virus and anti-malware software (as well as actual viruses and malware) that plagued the platform and introduced pervasive performance anomalies.

I had been beating the drum about hangs and responsiveness inside Microsoft for around 5 years at that point, including writing a paper with another engineer for one of Bill Gates’s famed “Think Weeks”. Inside Office, we were getting data from the remote crash reporting system (Windows Error Reporting, known internally as “Watson”). Those reports were showing that hangs (where the user got so frustrated they actually killed the program) were a significantly larger source of abnormal program exits than crashes. This was on top of all the other little glitches and temporary hangs that could so impact user experience and satisfaction. We were starting to invest a lot more effort in addressing these hangs, which typically required significant redesign because of the deeper underlying issues I alluded to above. Crashes could often be fixed with a small code change; hangs might require redesign of a large component.

I was standing in that Apple store holding a device that seemed to have none of these problems. The iPad used a fast SSD drive for storage so did not suffer from the “unbalanced PC” problems. It was built on top of the iPhone’s iOS operating system. Many of the most interesting operating system innovations in iOS were focused on guaranteeing application responsiveness as well as the good battery life that made it possible to build such a light-weight device that could run for a day on a single charge. We had seen these innovations in applications on the iPhone and I was now seeing this for applications that more directly competed with our Office apps.

According to Apple press, the iWork applications (Keynote, Pages, Numbers) had been “rewritten from the ground up” for the iPad. I thought that dubious, but they certainly felt smoothly responsive in quick testing that I did on the store device. (In fact they had slashed and cut these products, introducing file and experience incompatibilities with the Mac versions that they would struggle with over the next few years.)

Microsoft had been working on Tablet PCs for over 10 years at that point. They were generally clunky devices, typically heavier than normal laptops with odd displays that could fold over to hide the keyboard and present a tablet-like form factor with a special pen for input. Most hardware, OS and application work had been on supporting the pen for both UI gestures and ink content creation. They were built on Intel processors like any other PC and had no special innovations for general responsiveness and battery life. Office had done work to support ink content (including targeting a new application, OneNote, for inking scenarios), but had done no extensive architectural work to target this type of device. There was no way to simply iterate to the device I was holding in my hands.

Was this the future of personal computing?

Let me frame the challenge facing Windows, since that drove much of the early Office response.

Windows had remained committed to Intel which had struggled to deliver power-efficient processors for the phone and nascent tablet market. Apple had just shown that the efficient ARM processors used in 100’s of millions of phones could scale up to the needs of a light-weight tablet. Less obviously, Apple was turning the tables on Microsoft in how power and influence drove other aspects of the device hardware business. The high-volume PC market had put Microsoft in a position of power for almost three decades when dealing with makers of screens, memory and other critical graphics and communication chips used in these devices. But Apple had just linked the phone and tablet market which was one to two orders of magnitude larger than the PC market and would give them key leverage in driving innovation in these adjacent supplier markets.

Apple’s strategy (which was controversial both before and shortly after the iPad release) was to move up from the phone rather than down from the Mac in designing a tablet. This was a decision about hardware choices (the iPhone ARM processor vs. the Mac Intel processor) but also the OS and applications. They immediately got access to millions of iPhone apps. This not only meant the immediate body of applications but also meant leveraging and reinforcing the growth and momentum of the iOS API. This was only two years after the opening of the App Store, but it was already clear that Apple was replaying the “API moat” strategy that Microsoft had used so effectively with the Windows Win32 API. A compelling device attracted users. Users attracted developers programming unique applications to the device API, attracting yet more users in a virtuous cycle. It was a strategy Microsoft was very familiar with and understood how powerful and long-lasting it could be. By betting on the iOS API for this new device category, Apple was reinforcing rather than diluting that advantage. Building on the phone’s processor power management advantages and the OS and application changes for performance and responsiveness allowed them to build that light-weight long-running device that Microsoft’s tablet efforts had failed to do.

Perhaps less clear immediately, as so much attention was paid to the special features of the iOS API, was that the Unix OS and the lowest level OS APIs underlying iOS were the same low-level APIs used by MacOS. This meant that much low-level open-source code that was used by third-party developers could move over to the new device platform with few changes. This ease of transfer also applied to large developers (who often program to the lowest level of the OS API) moving their code bases, as our Mac Office team would find when looking to build the iPad version of Office.

On the Windows side, Microsoft’s mobile efforts were based on Windows CE, an independent OS kernel from the Windows PC. Some APIs were shared but in an ad-hoc way. Microsoft was in the midst of rebooting the phone platform with Windows Phone 7 (which wouldn’t release until September 2010) and it had a different OS and API strategy. The cacophony in Microsoft’s OS and API strategy was also reflected in organizational strategy, with Windows Phone in a separate division from the Windows PC business and with a distinct strategy and mission to compete with iPhone.

The App Store (and its lack in Windows) was also proving to play a much larger role in the ecosystem than had been obvious when it was first announced. Initially, it looked to be simply a way to discover, install new applications and update existing ones. By itself, this was a significant improvement over the situation in Windows where there was no central way to discover new applications, applications had ad-hoc and custom installation processes and often would have special updating tools that would run in the background and each interact in unique ways with the user in order to update their applications.

More importantly, the App Store was proving to be a key point of leverage for Apple in driving and controlling the ecosystem. The App Store approval and update process allowed them to vet applications, especially around security, performance and responsiveness. On a mobile device, one poorly performing application can tank battery life, so this approval process was clearly in the users’ interest. It also greatly increased user trust in the safety of trying new applications (the typical PC application installation process at the time was just as likely to install some malware). Importantly, over time, it also enabled Apple to require that developers move to new APIs and abandon obsolete ones. Any platform has the challenge that as they build up that body of applications that serve as the critical platform moat, it becomes harder and harder to move the platform and the applications forward together in concert and effectively deliver new capabilities and innovations to the user base. The App Store would be a critical asset in that process.

The strengths of the Windows position were just as critical to its response. 2011 would prove to be a high-water mark in PC sales, with 365M devices sold. Windows 7 was unlocking a wave of corporate deployments that had been blocked by the challenges with Windows Vista and the slow response of device makers to deep changes in the Vista driver model. Internally, a dysfunctional Windows engineering organization had been transformed by Steven Sinofsky and was riding the success of delivering Windows 7.

And we had Office. A key part of Microsoft’s internal myth was around the synergistic roles Office and Windows had played in each other’s success. Ballmer’s preferred key metric of Office business success was typically measured as “PC attach” — what percentage of those hundreds of millions of PCs were sold with a copy of Office on it?

When Windows was 90–95% of the “personal computing device” market (1995 through 2008 or so), most of the discordance between Windows and Office was about whether to support older versions of Windows or have new versions of Office require the latest Windows. These arguments would generate all kinds of internal drama, but ultimately had little consequence. But as these powerful phones and tablets came to market, the tension between the Windows business model and a distinct Office business model became more explicit. It’s also worth a reminder that when your business model is working really well, it becomes like water to a fish — you’re living in it but it is oddly almost transparent. When you have to fight and scratch and focus on your business model, it becomes much better understood by the overall organization. We had been fish for over a decade.

Getting back to the tablet market, how much would it be additive or distinct from the PC market? How much would it be disruptive? How should Office take advantage of this new class of device? Could Office be a competitive differentiator for a Windows tablet? How should we think about the value to Windows of limiting Office to a Windows tablet vs. the cost to the Office business?

The Windows team made a number of hugely ambitious bets as they launched into Windows 8 development. They would deliver a new version of Windows for ARM tablets, targeting a device Microsoft would design and manufacture itself. They would also build an Intel-based tablet. These would be explicitly targeted as productivity devices, with integrated cover/keyboards. They would design a new API, WinRT (Windows Runtime) that would support building a new class of managed, secure, responsive applications. These would be delivered through a new Windows Store. They would bet on touch devices, radically redesigning the Windows shell and much Windows UI for touch.

Explicit in this was a decision to address the tablet market by moving down from the PC rather than up from the phone. In some sense, this was an easy decision because although Microsoft had been in the mobile phone business for a decade and had a significant share of the feature phone market, we were scrambling to deliver our first real smartphone (Windows Phone 7 would not ship till September 2010) and were seeing iOS and Android building a rapidly accelerating installed base in smartphones.

There were a number of direct consequences for Office. We would build an ARM version. We would start work on new WinRT versions of Office. Office would partner actively with Windows in ensuring the new API met our demanding needs. We would start early work on iOS versions, but we would not set a clear delivery target or make a final decision on when to ship them. We would run the experiment of what impact Office exclusivity could have on the tablet market.

There was some early waffling about how to deliver on ARM (“just port” an earlier version? use a version of our web apps in some fashion?) but ultimately we settled on building these as simply a separate build target off the main branch of our new Intel development. This was a pretty easy decision in reality. The other options were trying to reduce impact on Office, but all guaranteed throwaway or off-strategy work that inevitably would have grown over time. In practice, virtually all the interfaces we needed were available in Windows ARM version since this was a complete version of Windows. Almost all the actual code changes were in support of performance and battery work that were just as valuable on Intel devices as they were for ARM. The changes for touch also applied equally well for all versions. It made no sense to strive for a false economy with a separate branch of development.

Perhaps most importantly, we were seeing the burst of new devices and OS platforms arriving. I knew our only long-term strategy for managing this complexity had to be to simplify our engineering processes and unify wherever possible. I was trying to avoid any off-strategy work that wasn’t clearly contributing to a larger goal and which would introduce complexity and impossible trade-offs at the small team level.

As we got deeper into development, the discordance grew. The phone market was exploding, with Android and iOS both gaining large share. We were just dipping our toes into supporting these platforms, with ports of our Office Mobile applications. The Office Mobile applications had originally been built for our Windows Mobile product (with small screens and physical keyboards) and were completely different code bases from the desktop apps, with significant feature incompatibility. That might have been acceptable for feature phones where use was limited but was significantly off strategy in the context of platforms that were expanding in capability for productivity work. In addition, at least in the back of our minds (and in front of our eyes in various envisioning videos) was whether a single powerful device that you could carry around and wirelessly connect to larger screens and keyboards might be the future of productivity devices. In that scenario you would want to make sure you were positioned with the ability to deliver your most complete offering on these devices.

At the same time we were barely investing in these exploding platforms, we were spending a tremendous amount of effort on porting to the new WinRT APIs. Windows had completely bought into all the arguments about asynchrony (I’ll take responsibility there), but the implications were pervasive and were forcing significant internal design changes in the applications. The sandboxing and process lifetime changes (necessary to move to a model where it was safe to install and run a new application from the store) were also pervasive and had rippling consequences for all the code Office used from other parts of the company. I would reread Arthur C. Clarke’s short story, Superiority, at regular intervals during this period and wonder whether we were making a classic mistake.

Windows leadership was not tremendously helpful in thinking through the strategy here despite the fact that they were friends and close colleagues. They didn’t want us to “just port” the applications — “think different about what this new platform makes possible”! At the same time, I was struggling with the question — “is this really the new Windows API? Where are the devices where this API will be the only effective way to program to it?” For other devices and platforms there was a direct relationship between the device and the APIs you needed to use to run on that device (let’s leave browser-based apps out of the discussion for now). These new Windows devices would still run the old Win32 APIs as well as the new WinRT APIs. Windows really wanted applications to move over to these new APIs or Windows, as an ecosystem, would not see the benefits necessary to deliver the power, responsiveness and security improvements to compete in the tablet space. But they wanted to also bring the Win32 APIs along to benefit from that huge installed base of applications.

Over the past twenty-five years, Office had had a love-hate relationship with other parts of the company bearing new APIs. There was the Office-Windows origin story where our internal teams had bet early on Windows and rode a wave to success there as we beat out Lotus and WordPerfect. But the Windows API had been the only way to deliver graphical applications on those machines. We had seen that committing to where the OS was putting their investment was important in the long-term, but we had also seen a bunch of fits and starts and dead-ends where buying into Windows’ story of their direction would have been a colossal mistake. The whole managed code diversion was the best example of that but there were many others (my first manager at Microsoft, Chris Peters, had told me, probably partly in jest, that Office’s secret advantage wasn’t that it got early insight into what Windows was going to do but that our knowledge of the people involved allowed us to predict whether they were actually going to be successful or were overselling and we should avoid using what they were building).

Windows got stuck in the middle. They wanted the benefits of that market of 100’s of millions of PCs to help them push into this new world (attracting developers who could attract new users) but that meant they ended up building a device that worked much more like a lightweight laptop. In fact, the laptop market has proved to be a much more resilient use case for productivity work than was feared when the tablet first appeared. And that laptop-like device could be programmed using the old Win32 APIs. So why do all the work to learn and use the new APIs? And if you really were going to target an application for a new tablet use case, why not target the market-leading device?

One of the biggest benefits for Office ended up being that the radical changes in API motivated a much more disciplined and rigorous approach to cross-platform (the largest divergence in API design across the platforms we were seeing was between Win32 and WinRT). In fact our Mac Office team working on the iPad version found that the commonality in the lowest levels of the API allowed them to bring much code directly over.

As we were struggling with the goals here and a viable Windows strategy, our long-term opportunity in the productivity market provided a very different perspective. The movement to services looked like an historic inflection point in the market. Google was attempting to ride this inflection point to dethrone Office from its position as the leader in productivity tools. From our perspective, we had a unique opportunity to bring together our server assets (Exchange, SharePoint, Lync/Skype/Teams) with our rich client applications into a compelling service productivity offering with a long-term sustainable and defensible position. In this world, the rich clients project the service on to the devices customers own. Limiting ourselves to Windows devices created doubt in customers’ minds about whether we were the right productivity service to commit to and project on to all their devices.

Trying to defend Windows’ collapsing position was directly putting the more robust Office opportunity at risk. This perspective is what ultimately unblocked the release of a version of Office for iPad, but it was after the release of Windows 8 and the Surface tablets demonstrated that Office exclusivity would not prove decisive. We had taken such a rigorous approach with cross-platform in the WinRT work, that we were also able to release an Android version in less than a year once we made the decision to do so.

I want to return to those iPad retrospective posts by other writers that I linked to above. Except for Sinofsky’s post, which was mostly a retrospective on the immediate context and response to the iPad announcement, all the articles express disappointment in what the iPad has *not* become, especially ridiculing the complexity of Apple’s efforts at a multi-window, multi-tasking interface. Thompson focuses on the lack of support in Apple’s store policies for business models that could support the kind of on-going R&D that more complex productivity applications typically require and how this has prevented innovation in tablet-specific productivity applications.

I come from a different perspective. I lived through the period from 2000 on when Windows continued to scramble to somehow define a new use model for the PC by more and more complex APIs and application building blocks and language environments. In fact, the major use cases for the PC were all supported by 2000. Every single thing about the PC could continue to be improved: weight, battery life, connectivity, processor speed, memory and storage capacity, screen quality (as well as security, reliability, performance, etc.). But the major use cases were defined by a large screen, a keyboard, a precision pointing device and a high-speed Internet connection. “Sufficiency” was anathema to the internal thinking, but such a view would have put focus much more on channeling continuing hardware improvements than thrashing on software strategies.

The iPad reached a level of sufficiency faster than any other computing device category. That first one was great and subsequent improvements in screen, processor, memory and camera have been important as well as obvious (not necessarily easy, but obvious). It seems the worst thing they could do (which you could argue the thrashing on multi-window interface is an example of) is to refuse to accept that the device is great at what it does.

Terry Crowley
Programmer, Ex-Microsoft Technical Fellow, Sometime Tech Blogger, Passionate Ultimate Frisbee Player