Amazon’s Alexa Presentation Language and the future of the smart home

Published in WillowTree®, Nov 29, 2018

Amazon just bought its first house. Its first house company, to be more specific. Plant Prefab, a California-based custom modular home company, received a significant investment from the tech behemoth last month. The idea behind this investment is that Plant Prefab will now allow buyers to customize their homes with Alexa-enabled devices.

This is merely the most obvious example of Amazon’s ongoing plan for smart home domination, but it’s hardly the most significant. At their hardware announcement event in September, they announced 15 new Alexa-enabled smart devices, ranging from smart plugs to microwaves.

In addition to this onslaught of new hardware, Amazon introduced a subtler but no less exciting piece of their voice strategy. The Alexa Presentation Language (APL) will allow third-party developers to integrate visual content into voice apps for use on screen-based Alexa products like Echo Show, Fire TV, Fire Tablet, Alexa alarm clock, and Echo Spot, as well as third-party Alexa devices soon to come with the release of the Alexa Smart Screen and TV SDK.

So why is this so significant? It opens the door to voice experiences that are not only richer visually, but also contextually; brands will now be able to bring a wider utility and specificity to their interactions with their customers, based on where that person is, what they’re doing, and what device they’re using at that moment.

The Future of Voice is Multi-Modal

It’s always been our belief that the core advantage of voice technology is its ability to process requests faster and with more versatility across multiple devices and contexts. But even more powerful than voice alone is voice combined with visual display capabilities. That is to say, we think the best voice experiences are multi-modal. Amazon’s announcement stands out most significantly to us in that it shares this vision of the future of voice.

By not only diversifying their lineup of Alexa-first products but also expanding Alexa’s domain to a growing array of third-party products, Amazon is doing everything they can to extend their device leadership over competitors. The Alexa Presentation Language is the glue that will connect all of these devices and experiences.

Whereas the web let people view content across different sized screens, APL (and likely future competitors) aims to support interactions across an even wider range of contexts, including those where there is no screen at all. On a laptop? Great — we’ll show you movie times. In the car? We’ll read them to you — but just the few we think will be most relevant.
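To make the idea concrete, here is a minimal sketch of a skill response that adapts to the device it is talking to. It assumes the standard Alexa request envelope, where APL-capable devices list `Alexa.Presentation.APL` under `supportedInterfaces`, and it uses a deliberately bare APL document. The function names, the showtimes data, and the three-item spoken limit are illustrative assumptions, and an Echo Show and a screenless speaker stand in for the laptop and the car.

```python
from typing import Dict, List

APL_INTERFACE = "Alexa.Presentation.APL"


def supports_apl(envelope: Dict) -> bool:
    """Return True if the requesting device reports APL (screen) support."""
    device = envelope.get("context", {}).get("System", {}).get("device", {})
    return APL_INTERFACE in device.get("supportedInterfaces", {})


def build_response(envelope: Dict, showtimes: List[str], max_spoken: int = 3) -> Dict:
    """Show every showtime on a screen, or read out only the first few."""
    response: Dict = {}
    if supports_apl(envelope):
        speech = f"Here are {len(showtimes)} showtimes."
        # Minimal illustrative APL document: one Text item per showtime.
        response["directives"] = [{
            "type": "Alexa.Presentation.APL.RenderDocument",
            "token": "showtimes",  # hypothetical token name
            "document": {
                "type": "APL",
                "version": "1.0",
                "mainTemplate": {
                    "items": [{
                        "type": "Container",
                        "items": [{"type": "Text", "text": t} for t in showtimes],
                    }]
                },
            },
        }]
    else:
        # No screen: keep the spoken answer short.
        speech = "The next showtimes are " + ", ".join(showtimes[:max_spoken]) + "."
    response["outputSpeech"] = {"type": "PlainText", "text": speech}
    return {"version": "1.0", "response": response}
```

The same request handler serves both devices; only the shape of the answer changes, which is the point of designing for multi-modal voice rather than for a single screen size.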

Getting Your Brand Ready for Multi-Modal Voice Proficiency

2018 so far has shown that a multi-device ecosystem utilizing screens and voice, with mobile as the hub, is the future of personal and enterprise technology. The primary challenge this presents to brands is scalability. We see two key areas in which to prepare:

Content Strategy. Does your content strategy take both voice and screen into consideration? Do you have a plan for how, and where, to respond to voice queries in a multi-modal environment? Is your brand voice clearly defined internally to ensure continuity when new devices and contexts come to market?

API Services that let you move quickly. As connected devices continue to grow in adoption, the variety of front-end contexts calling on your existing services can become equally complex. You’ll need to talk to Siri on smartphones, and Alexa on microwaves. The key is having robust, flexible APIs that let you interact with your customers in different contexts. A comprehensive, dedicated API layer in your product infrastructure can take this burden from your backend, allowing for increased flexibility and efficiency, meaning you’ll be able to get to market faster and more reliably when introducing a new device into your brand’s ecosystem.
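One way to picture a dedicated API layer is a single backend call wrapped by thin, per-channel adapters. The sketch below is only an illustration under assumed names: `get_store_hours`, the `StoreHours` type, the channel keys, and the hard-coded data are all hypothetical, not part of any real product.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class StoreHours:
    store: str
    opens: str
    closes: str


def get_store_hours(store: str) -> StoreHours:
    """Stand-in for the existing backend service (hard-coded for the sketch)."""
    return StoreHours(store=store, opens="9 AM", closes="9 PM")


def to_speech(hours: StoreHours) -> str:
    """Adapter for voice-only devices: one short sentence."""
    return f"{hours.store} is open from {hours.opens} to {hours.closes}."


def to_screen(hours: StoreHours) -> Dict[str, Any]:
    """Adapter for screens: structured rows a front end can lay out."""
    return {"title": hours.store,
            "rows": [["Opens", hours.opens], ["Closes", hours.closes]]}


# Supporting a new device type means registering one more adapter here;
# the backend call above does not change.
ADAPTERS: Dict[str, Callable[[StoreHours], Any]] = {
    "voice": to_speech,
    "screen": to_screen,
}


def handle(channel: str, store: str) -> Any:
    """API-layer entry point: fetch once, render for the requesting channel."""
    if channel not in ADAPTERS:
        raise ValueError(f"unsupported channel: {channel}")
    return ADAPTERS[channel](get_store_hours(store))
```

Calling `handle("voice", "Downtown")` and `handle("screen", "Downtown")` returns the same facts in two shapes, so bringing a new device into the ecosystem touches only the adapter table.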

Mobile in the Middle

At the end of 2017, we claimed that 2018 would be the year voice went mainstream. What surprised us was just how quickly it happened. By March, 20% of US homes had access to a voice-enabled smart speaker of some kind. A new report from Adobe shows that the current number is now closer to 32%, with a target of 48% after the holiday season.

Companies who have chosen to jump into voice in 2018 with both feet have found great success so far. Perhaps the most publicized success story in recent months is Erica, Bank of America’s virtual assistant, which launched in June and amassed 3 million users in its first three months. (It’s worth noting that actual user reviews of Erica are mixed, though that seems due largely to how it was rolled out.)

On the hardware side, nobody has invested as deeply and broadly as Amazon, who is handily outpacing the competition for adoption of in-home voice-enabled devices.

The key piece Amazon is missing, of course, is an Alexa-native mobile device to control all of these peripherals. (The failed Amazon Fire Phone was pulled over three years ago, with no replacement in sight.) It’s important to remember that the smartphone is still the most widely owned, widely used voice-enabled device there is, and Google and Apple are the world leaders in smart assistant penetration (because of all the Android and iOS devices out there which have Google Assistant or Siri installed).

Considering the continued primacy of mobile devices as the hubs of our digital lives, it’s difficult to imagine a strategy composed exclusively of standalone voice assistants winning the coming “voice wars” over ecosystems with a phone at the center. Can Amazon overcome the major disadvantage they face in comparison to Google and Apple: not having a foothold in the smartphone OS market?

Time will tell. If 2018 was the year voice became a mainstream topic of conversation, 2019 will be the year it becomes a foregone conclusion, when it becomes clear which brands have taken the lead and which got caught flat-footed.

If you want to talk further on how to bring your brand into the age of voice, don’t hesitate to reach out!

Originally published at willowtreeapps.com.

Written by Tobias Dengel