Amazon Alexa. The next big…fad
Every 10 years or so, Silicon Valley comes out with what it calls “The Next Big Thing.” Very few of those things ever become a “big thing.” Most fade into obscurity after technologists worldwide have sunk hours, during and after work, paid and unpaid, into learning and building to align with this “big thing.”
Amazon’s Alexa platform is one of these “big things.” It is destined to create promises that never resolve, business deals that fall apart, and lofty KPIs and goals that are never reached. Overall, it is a minor tech bubble that will cost the industry more than it ever produces, and cause developers across the world to stretch to meet unrealistic goals under the guise of “innovation” and “growth.”
I will explain why this is the case by covering the two major players that are “hyping” Alexa.
First, I will explain why the technology behind Alexa is not innovative, but rather just a crippled, marketing-centric abstraction of technology that already exists, where Amazon’s business goals have marred any semblance of truly original or ground-breaking work.
Second, I will explain how the design choices that permeate the Alexa platform prevent it from being anything more than an over-hyped Bluetooth speaker with a locked down ecosystem meant to generate revenue for Amazon, and nothing more.
The Technology Behind Alexa
To begin, I’d like to outline from a conceptual view the way the Alexa platform works.
- A developer creates an Alexa “app” through Amazon’s convoluted user interface
- The developer designs questions and keywords that people would be asking through the voice service API
- The developer connects the Alexa app to an endpoint, which receives the specific triggers that were matched and runs its search based on that match
- The endpoint sends back either more questions, or a response
Upon initial examination, this seems perfectly fine. What’s wrong? Someone asks Alexa a question, and the developer has the ability to answer that question.
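To make the four steps above concrete, here is a minimal sketch of what the developer's endpoint might look like. The intent name `FindShoeIntent` and the slot name `activity` are hypothetical, invented for this illustration, and the request and response dictionaries are a simplified approximation of the JSON a custom skill exchanges with Alexa, so treat the exact field names as assumptions rather than a reference.

```python
from typing import Any, Dict

# Hypothetical intent name, invented for this illustration.
SHOE_INTENT = "FindShoeIntent"


def speak(text: str, end_session: bool = True) -> Dict[str, Any]:
    """Build a minimal spoken reply (simplified, assumed response shape)."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_request(event: Dict[str, Any]) -> Dict[str, Any]:
    """Turn one incoming request into either an answer or another question."""
    request = event.get("request", {})
    if request.get("type") != "IntentRequest":
        # No intent was matched yet, so prompt the user and keep listening.
        return speak("Ask me to find a shoe.", end_session=False)

    intent = request.get("intent", {})
    if intent.get("name") != SHOE_INTENT:
        return speak("Sorry, I can't help with that.")

    # Only the matched slot values arrive here, not the user's full sentence.
    slots = intent.get("slots", {})
    activity = slots.get("activity", {}).get("value")
    if activity is None:
        return speak("What activity do you need shoes for?", end_session=False)
    return speak(f"Here are our shoes for {activity}.")
```

Note that the handler only ever reads an intent name and slot values. Nothing in this sketch carries the sentence the user actually spoke.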
There are a few immediate issues. First, the developer never has full-text access to what the user is asking. It always must be in the form of a properly formatted question, and the developer only has access to specific “intent schemas,” which are essentially the voice equivalent to HTML forms. This means that not only must we rely on the user to ask a very specific question, but never do we have access to the full text to do our own processing, or allow more free form input.
This means that at its core, the Alexa platform is basically a customizable HTML form with built-in NLP for similar keywords. Alexa pre-populates some “intent schemas” for you, but in the end, any action your “voice skill” takes requires the user to follow a very specific pattern.
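The form analogy can be shown with a toy matcher. This is not how Amazon's speech pipeline works internally (the real platform is more forgiving about phrasing); it is a deliberately strict stand-in, with a hypothetical intent name, pattern, and slot values, meant only to show what the developer does and does not get to see.

```python
import re
from typing import Dict, Optional

# Hypothetical "form": one intent, one sentence pattern, one constrained slot.
INTENT_NAME = "NapLengthIntent"
PATTERN = re.compile(r"^how long should (?:a|my) (?P<nap_type>\w+) nap be$")
ALLOWED_NAP_TYPES = {"power", "afternoon", "recovery"}


def match_intent(utterance: str) -> Optional[Dict[str, str]]:
    """Return the filled-in form fields, or None if the utterance does not fit."""
    normalized = utterance.lower().strip(" ?.!")
    match = PATTERN.match(normalized)
    if match is None:
        return None
    nap_type = match.group("nap_type")
    if nap_type not in ALLOWED_NAP_TYPES:
        return None
    # The endpoint receives only this dictionary, never the original utterance.
    return {"intent": INTENT_NAME, "nap_type": nap_type}
```

In this sketch, “How long should a power nap be?” fills the form, while “I'm exhausted, any advice on napping?” produces nothing at all, and the text of that second request is simply gone by the time the developer's code runs.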
Technologically, the Alexa platform offers the following to developers:
- Vendor Lock-In
- Voice based user input that can be passed to your endpoint
- An endpoint from which you can pass back a voice response, an audio file, or another question posed to the user
What features would give developers more power to make more innovative Alexa skills?
- A notification system. Any time someone wants to use Alexa, they are required to ask it a question or to perform an action.
- An integration with Amazon Payments. Let users buy something using their built-in Amazon payment methods.
- A conversation-based API. Every interaction with Alexa is very structured and based around an input and output. It would be extremely difficult as a developer to build a piece of software that continuously worked with and learned from its users, as we have no access to the full text of what a user is communicating, preventing any type of open technological exploration.
Alexa as a Business Tool
As we’ve seen, the Alexa platform is highly limited in what it promises to deliver. Yet brands fail to understand this and, because it is “trending,” believe that jumping on the bandwagon with little to no actual discovery or planning will lead them to higher sales by being “ahead of the curve.”
“How would a shoe company use Alexa?” This is the kind of question that advertising agencies across the country are trying to answer. I would argue that this question can be applied to any company that sells products, and the potential answers all fall into three categories:
- Notifications, e.g. let’s use Alexa to notify our consumer about something that our product does in a new way.
As we’ve seen, this doesn’t work, as the Alexa platform doesn’t support notifications.
- Education, e.g. let’s teach our consumer about our product. For example, how to use it, why it’s better than competing brands, etc.
Ah, yes. The next time I go to buy shoes, I am certainly likely to:
1. Install the (Insert Shoe Company Here) Alexa Skill
2. Ask it very specific questions like “What makes the (insert specific shoe model here) better than the (insert specific competing shoe model here)?”
3. Take notes on the answer for further analysis later.
…As you can see, the consumer gets no benefit from using Alexa. It would be easier to pull their phone out and search Google.
- Some skill meant to be haphazardly related to the product, but actually useful to the consumer, e.g. let’s teach our consumer the best nap length to take.
Again, this leads to a difficult consumer journey. First, I have to go onto my phone and install an Alexa skill that will answer this question. Maybe the skill title is “Best nap lengths, by (insert pillow company here).”
At this point, it could actually be useful. But what makes it successful as an advertising utility? Because I saw the brand name? If I ask how long a nap should take, and you finish it with “…and don’t forget to check out our best selling blah blah blah,” I will uninstall your application. Sure, I may not be your target demographic, but is your target demographic going to have gone through all of those steps just to figure out how long they should nap?
We can even brainstorm more ideas that are unlikely to work. Imagine a product selector. Alexa asks you questions about your lifestyle, and you answer, eventually coming up with a product for you. First: this will be an agonizingly long process. Because your answers need to fill pre-built buckets, it would go something like:
“What activity would you need your shoes for? Your choices are hiking, climbing, backpacking, biking, everyday walking, and running.”
“uh, everyday walking”
“Are you a male or female?”
“Male”
“What color do you prefer? Your choices are red, orange, yellow, green, indigo, blue, violet, brown, black, or white. Please choose one.”
“Brown?”
“Would you like the shoe to be waterproof? Please answer yes, no, or it doesn’t matter”
“It doesn’t matter.”
“Do you prefer slip-ons or laces?”
“It doesn’t matter.”
“Great! I have 5 options available. Please tell me your cell phone number and I will text you a link so you can view them in our e-commerce system.”
The issues here are obvious. First, the range of choices is highly limited. Second, the time it takes a user to get to the end is going to feel longer with each question, resulting in an enormous number of uncompleted journeys. And third, at the end, they are just forced to go onto the website anyway.
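The same flow, sketched as code, makes the cost easier to see. The slot names below are hypothetical and the buckets are copied from the dialogue above. The exact-match check is a simplification I chose for the sketch, stricter than what a real voice platform would do, but the shape is the point: one full spoken round trip per slot, and any answer outside the bucket costs another one.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical slot names; the choices mirror the example dialogue.
QUESTIONS: List[Tuple[str, List[str]]] = [
    ("activity", ["hiking", "climbing", "backpacking", "biking",
                  "everyday walking", "running"]),
    ("gender", ["male", "female"]),
    ("color", ["red", "orange", "yellow", "green", "indigo", "blue",
               "violet", "brown", "black", "white"]),
    ("waterproof", ["yes", "no", "it doesn't matter"]),
    ("closure", ["slip-ons", "laces", "it doesn't matter"]),
]


def next_question(answers: Dict[str, str]) -> Optional[str]:
    """Return the first unanswered slot, or None once every slot is filled."""
    for slot, _choices in QUESTIONS:
        if slot not in answers:
            return slot
    return None


def record_answer(answers: Dict[str, str], slot: str, spoken: str) -> bool:
    """Store the answer only if it falls inside the slot's pre-built bucket."""
    choices = dict(QUESTIONS)[slot]
    normalized = spoken.lower().strip(" ?.!")
    if normalized not in choices:
        return False  # outside the bucket: the user must be asked again
    answers[slot] = normalized
    return True
```

Five slots means at least five question-and-answer turns before the skill can say anything useful, and a preference like “teal” cannot be expressed at all.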
Wouldn’t this be better suited for a chatbot or an online experience? Definitely, and that’s the point I’m trying to make. As it stands right now, Alexa is highly limited. Just because something “can” be done on a voice platform doesn’t mean it should. Will the product improve? We can hope so, but in the meantime, we should be solving problems, not throwing time at fads because they are a trending Twitter topic.