iOS 12 Shortcuts, and the future with voice

Yunwenyao (Johnny) Zhou
5 min read · Oct 24, 2018

“black portable speaker” by Przemyslaw Marczynski on Unsplash

Apple recently introduced iOS 12 with Shortcuts, providing “a quick way to get things done with your apps, with just a tap or by asking Siri.” This caught my attention because I am an iPhone user, and I never liked Siri.

In the meantime…

Globally, the smart speaker market grew 187% in Q2 2018 to a total of 16.8 million units, up from 9 million in Q1. This market is estimated to be worth $30 billion by 2024.

image from Venture Beat

There is a clear trend of voice becoming the new battleground among tech giants. But what are they fighting for?

Time.

Time is valuable and people are lazy.

We only have 24 hours a day. The competition for time is only getting tougher, as every extra minute becomes more expensive.

Humans are a vision-dominant species. Almost all UI/UX design is centered around vision; it is rare to find an app you cannot use when your phone is on silent mode. Thus, with visual attention becoming scarcer, companies are exploring other senses through which they can capture user attention.

Hearing became the next area of interest.

The explosive growth of smart speakers reflects the growing importance of voice. Powered by AI, these voice assistants are becoming better at understanding us and helping us with various things.

Let’s take a look at Alexa Skills first.

Every skill is essentially an “intent + slot” pair: the intent names what the user wants done, and the slots carry the details. This type of schema is flexible because it allows an effectively unlimited number of meaningful combinations.

screenshot from YouTube
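To make the “intent + slot” idea concrete, here is a minimal sketch in Python. It is not how Alexa actually resolves requests (the real platform uses trained language models); it only assumes that an intent can be described by sample phrasings with placeholders. The names `Intent`, `parse`, `PlayMusic` and `GetWeather` are all made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Intent:
    # Hypothetical model: an intent name plus sample phrasings
    # that contain {slot} placeholders.
    name: str
    templates: List[str]


def template_to_regex(template: str):
    # "play {song} by {artist}" becomes a regex with one named group per slot.
    # re.split with a capture group alternates literal text and slot names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)       # literal text
        else:
            pattern += f"(?P<{part}>.+?)"    # slot value
    return re.compile(f"^{pattern}$", re.IGNORECASE)


def parse(utterance: str, intents: List[Intent]) -> Optional[Dict[str, object]]:
    # Return the first intent whose template matches, with its slot values.
    text = utterance.strip()
    for intent in intents:
        for template in intent.templates:
            match = template_to_regex(template).match(text)
            if match:
                return {"intent": intent.name, "slots": match.groupdict()}
    return None  # nothing understood


# Illustrative intents only; not taken from any real skill.
INTENTS = [
    Intent("PlayMusic", ["play {song} by {artist}"]),
    Intent("GetWeather", ["what is the weather in {city}"]),
]

print(parse("Play Yesterday by The Beatles", INTENTS))
```

The same `PlayMusic` intent covers every song and artist a user might name, which is where the flexibility of the schema comes from.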

Alexa now has over 30,000 skills available. Similar to the App Store, Amazon has built a platform that connects developers and users. However, as the platform grows, it becomes more challenging for users to choose the skill that best fits their needs on the fly.

Moreover, one of the biggest challenges in voice-only interaction is the limit on available information. When we use our phone (with vision), we can easily direct our attention to the most relevant information. The UI also guides us through every step, allowing us to review and undo actions. In contrast, voice can only communicate a few seconds' worth of information at a time, and it requires the user’s full attention. Thus, voice is usually best at one-step actions, such as checking the weather and playing music.

To make voice interaction better, Amazon decided to combine screen and voice, a.k.a. the Echo Show. By adding a display, it can build deeper user engagement with more information at its disposal.

Imagine you are following a recipe on Alexa, and it tells you precisely when to perform each step and how much of each ingredient to use. Things are going great and you are ready to plate your amazing salmon fillet dish. Alexa will probably tell you to put the salmon fillet on top of the asparagus in the middle of the plate. But how exactly should you position the salmon and the asparagus? With a screen, you can see an example of a well-plated dish, without struggling over the orientation of your salmon.

image from epicurious

However, if screen time is becoming more scarce and expensive, will this be a solution for the future?

With Shortcuts, Apple took a different approach.

image from Gizmodo

Shortcuts are a series of steps that apps take to help users save time and get things done. What makes Shortcuts different is that users create them, rather than developers.

Every Shortcut consists of a list of actions, the smallest unit of work. For example, you can ask Siri to run a Shortcut that takes a picture with the back camera (RIP selfie sticks). You can also ask Siri to tell you your ETA to work. Unlike many Alexa Skills, Shortcuts are meant to be fast. They are also highly configurable, which gives them two key advantages:

  • Individuality. Each Shortcut can differ from the next, based on the user’s individual habits. For instance, some users might prefer to stop by their favorite coffee shop before going to work. Letting users define individual preferences creates a much more personalized experience.
  • Context. It is challenging to express complex requests with voice alone. Shortcuts solve that problem by having users explicitly configure each step that the Shortcut takes.
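The “list of actions” idea, and the way it yields both individuality and context, can be sketched in a few lines of Python. This is only a toy model, not Apple's implementation: it assumes each action is a function that takes a context and returns an updated one, and every name here (`Shortcut`, `add_stop`, `estimate_eta`, `speak_eta`, the 15 minutes per leg) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Context = Dict[str, object]
Action = Callable[[Context], Context]


@dataclass
class Shortcut:
    # A trigger phrase plus an ordered list of user-chosen actions.
    phrase: str
    actions: List[Action]

    def run(self, context: Optional[Context] = None) -> Context:
        # Run the actions in order; each one sees what the previous ones produced.
        current: Context = dict(context or {})
        for action in self.actions:
            current = action(current)
        return current


def add_stop(place: str) -> Action:
    # Action factory: append a stop to the planned route.
    def action(ctx: Context) -> Context:
        stops = list(ctx.get("stops", [])) + [place]
        return {**ctx, "stops": stops}
    return action


def estimate_eta(minutes_per_leg: int) -> Action:
    # Toy estimate (an assumption): every stop adds a fixed number of minutes.
    def action(ctx: Context) -> Context:
        legs = len(ctx.get("stops", []))
        return {**ctx, "eta_minutes": legs * minutes_per_leg}
    return action


def speak_eta(ctx: Context) -> Context:
    # Stand-in for a spoken reply; here it just stores the sentence.
    message = f"You will arrive in about {ctx['eta_minutes']} minutes."
    return {**ctx, "spoken": message}


# Two users, same phrase, different configurations.
with_coffee = Shortcut(
    "What's my ETA to work?",
    [add_stop("coffee shop"), add_stop("work"), estimate_eta(15), speak_eta],
)
direct = Shortcut(
    "What's my ETA to work?",
    [add_stop("work"), estimate_eta(15), speak_eta],
)

print(with_coffee.run()["spoken"])
print(direct.run()["spoken"])
```

Both users say the same phrase, but each gets a result shaped by the steps they configured, and the assistant never has to guess whether a coffee stop is part of the commute.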

Sure, this is nowhere near as cool as an AI-powered voice assistant, and it requires users to learn how Shortcuts work. But ultimately, users do not just want better and cooler technology; they want an accessible product that understands and fits them. Today's voice assistants are good, but they can, and need to, be better, and getting there may take years.

Shortcuts, though less sexy, unlock Apple’s ability to understand natural user behaviors in the present, by giving users the freedom to create functionalities the way they want.

Shortcuts is for the future, too.

If voice were to become the next popular human-computer interface, there is not much to prevent iPhone users from adopting new behaviors, since they already spend hours daily with their mobile devices.

However, the space beyond mobile is even more exciting. Almost every tech company is trying to create the best assistant and the smartest home possible. What will this mean for mobile if everything in your home is connected to the internet? Will we still need a mobile device as the control hub?

The future with voice is near, but it is not yet clear what it will look like or what problems voice is going to help people solve.

Thanks for reading, and I hope you found this interesting. Questions and comments are very welcome.

Written by Yunwenyao (Johnny) Zhou

Building at Alloy, making financial products safe and seamless. Had fun in e-commerce, real estate and now in fintech.