Apple’s HomePod: Threading a Needle

To understand HomePod is to understand Apple’s structural disadvantages

Anthony Bardaro
Jul 6, 2017

Apple is charting a brilliant strategic path with its HomePod release: it’s positioning HomePod as a premium product, with superior specs, that uses music as its defining value proposition. But, despite the marketing department’s best-laid plans, analyzing HomePod’s fate requires an understanding of the uphill battle Apple has ahead.

First, it’s important to note that HomePod is competing more against Amazon Echo than Google Home. That’s not just because of the premium price tags, but also because of the complements required as entrées to truly leverage each device: Apple ID and Amazon Prime are prerequisites, and both require hard dollars and target premium markets that have a significantly higher marginal propensity to consume than Android’s.

So, Siri and Alexa are competing to win the same demographic, the high end, on which they can layer supplemental services. That leaves Google to inherit the mass market below.

Of course, in a couple of years, this battle could look a lot like the mobile market: Google wins the overwhelming majority of users, and Apple wins the overwhelming majority of first-order revenue. But, in the intermediate term, the slight edge from Google’s scale advantage should compound into an ever larger lead.

That’s because Google’s business and business model are uniquely suited to the alchemy of the voice interface. In this case, as in most others, Google is the low-cost provider of product and it gives away its services for free, all to achieve maximum scale. That means omnipresence to generate omniscient data, which in turn means superior personalization to entice more users, more usage, and again more scale.

That’s a virtuous cycle, and it’s more important in the voice era than ever before. In mobile, your phone was your portal to everything your heart desired, so you had it with you everywhere you went. Yet, voice is crucially screenless, hands-free.

Voice abstracts away the physical device, and consequently, whichever device you choose to use depends on the situation. You might have a phone in your hand when you’re watching TV; you might have a smartwatch on your wrist when you’re running; headphones when you’re at work; a sound system in the car; or a home speaker in your kitchen. No single device can handle all of those use cases. It’s a matter of which gets the job done with the least friction in each context.

That adds up to a lot of separate devices for a lot of different circumstances. We don’t even interact directly with these devices — they’re just a back-end, delivering and receiving the audio that’s really our primary touch-point.

With the voice assistant as the common thread woven through our every interaction, it’s hard to imagine a premium-priced provider being able to justify the ever larger wallet share necessary to accommodate all of these increasingly imperceptible devices… especially when there’s a low-cost provider who’s spiraling up the aforementioned virtuous cycle.

Clearly, the battle for the voice interface is different from that for mobile, and that difference makes the winner-take-all incentive even stronger this time around. (At least, winner-take-all in each market segment.)

With respect to HomePod et al., a killer home speaker is not enough. For example, in the voice interface, competitors need…

  1. Ecosystem: Devices like a smartwatch or phone to provide a complementary screen;
  2. Integration: Interoperable apps, APIs, and IoT components to perform value-added tasks;
  3. Ubiquity: Omnipresence to provide pervasive utility;
  4. User Experience: Data to personalize an optimized service;
  5. Scale: A massive userbase to incentivize the virtuous cycle among all of these interdependent factors

Again, those five items are bound by a common thread: the virtual assistant. Unfortunately for Apple, it’s almost impossible for Siri to surpass Google Assistant. Apple’s hard-line privacy stance means they will never have the comprehensive data to train a top-notch assistant, and that could contribute to its failure in the next era.

Don’t forget, User Experience has a different connotation when we’re talking voice, as opposed to visual. Apple’s peerless industrial design was a critical competitive advantage that helped accelerate adoption of its graphical user interface, mobile UX, and even its physical devices themselves. Yet, that’s a non-factor when the visual experience is abstracted away. Narrowly defined, that User Experience is now predicated on the quality of a virtual assistant.

Base case scenario, Apple currently has an Ecosystem lead — and has the chops to lead Integration and Ubiquity — but it structurally cannot win the next competition for User Experience and Scale.

That all said, what’s the best case for Apple’s HomePod?

Considering Siri’s inferiority and the other systemic disadvantages, it’s rather remarkable that Apple still has a legitimate opportunity to compete in this new era. That’s a real testament to its Ecosystem — a brand so strong that it can overwhelm even the most fundamental flaws. Its obsessive control over hardware and software has always assured that its Integrations are best in class. Plus, AirPods are a sneaky secret ingredient to deliver unassailable Ubiquity.

It’s not hopeless. If we assume Apple will maintain its prohibitive privacy policy, its only chance to win market share in the voice interface depends on outclassing the competition across all three of those components (Ecosystem/Integrations/Ubiquity) by an order of magnitude akin to the iPhone vs the Nokia brick.

Apple Watch and AirPods are already category killers that substantially differentiate Apple from competitors. Those are a good start; in fact, a huge head start. Apple must start leveraging them now to earn the developer buy-in necessary to close the loop on that virtuous cycle. To accomplish that, HomePod must coordinate seamlessly and contextually with iPhones, iPads, MacBooks, Watches, AirPods, and TVs. That’s the kind of value-add that gets people talking.

Remember, it’s hard for consumers to justify a $2,000 ante. That’s the minimum buy-in to accumulate enough Apple devices to enjoy a pleasing voice experience, and it excludes the $120 annual Apple Music subscription that’s really going to bring Apple’s installed userbase to the table. But it goes without saying that Apple’s premium demographics will pay to be wowed.

This framework would be incomplete without ARKit, another successful attempt by Apple to “wow” its userbase. With ARKit, Apple has accelerated the flywheel again, as it has historically been wont to do. Not only can augmented reality move the conversation away from voice and toward the next next big thing, but it can also leverage Apple’s device advantage. It further warrants that $2,000 buy-in and tries to assure that screens still have a place in a screenless epoch. But it doesn’t change the fact that audio is the next big thing. Like MP3s beating video to the punch a generation ago, culminating in iTunes’ proliferation (more on that in a moment), we’re just not equipped for a visual-first experience today. Audio already has the pieces in place to enable a more frictionless experience; Google Glass, Snap Spectacles, and iPhones aren’t sufficiently simple vehicles for video delivery. (Never mind the heaviness of video relative to text and audio in terms of storage, compute, and power consumption.)

In other words, Google Glass and Snap Spectacles aren’t sufficiently ergonomic hardware yet, so the value proposition of a discreet headphone as a transmission mechanism for media (delivering to your ear and from your lips) currently beats that of any visual element. After all, the headphone leaves your hands and eyes free to multitask; the screen doesn’t. That very value proposition is already driving consumer adoption of voice. We already employ voice as our first resort when walking around town or driving in the car. It has a foothold in those relatively small use cases, from which it will proliferate.

Tech will eventually converge upon the richest of sensory experiences, the visual, but it’s not there yet. In the meantime, we’re graduating from the text-based era into audio, an incrementally richer experience. Perhaps video/AR/VR thereafter. Perhaps engaging other senses (smell/taste) after that…?

Steve Jobs himself learned this lesson the hard way. He made a big bet on video (specifically, iMovie leveraging FireWire) because it was the ultimate user experience, but the marketplace wasn’t equipped for it. His focus on video distracted him from the burgeoning audio revolution of CDs and Walkmans, and Apple missed that entire epoch. (I’m sure that drove him headlong into the MP3 movement and iPods, NBD!)

That was the last generation, featuring media output, which began with text, then graduated to audio, and now video. Web 2.0 started a new progression in input (“read-write” and UGC) as well, so history will likely repeat its text/audio/video cycle.

Again, using music as HomePod’s value proposition is the right wedge, since music is a core competency for Apple. Nevertheless, Apple lacks the full stack required to win the voice interface race. Head-to-head-to-head, Google should “win” the voice interface outright. And that’s why the window is closing for Apple to stake its claim by explicitly playing to its strengths.

Adventures in Consumer Technology