Texting Siri in the Era of Conversational Interfaces

It’s the middle of the night. I’m upstairs and looking out a window trying to figure out if a deer or a skunk might be rampaging through the yard. I pull up my phone and ask Siri to turn on the porch lights. In the British accent I gave her to seem smarter, she yells at full speaker-phone level volume:

“The porch lights is turned on.”

Great! The lights are now on, but Siri just woke up the kid with a grammatically incorrect confirmation.

Why can’t I just “text” Siri?

We’re seeing a growing trend of platforms acting as a centralized control point for a number of independent applications and services, many using voice as the primary input mechanism (Siri, Alexa, Cortana). The goal is a true conversational user interface, a huge milestone in human-machine interaction. I can’t wait for a future where I can talk to my car like K.I.T.T. and my house (or super suit) like J.A.R.V.I.S.

J.A.R.V.I.S

Even when we get there, having a primarily voice-driven interface without “silent accessibility options” (yes, I just made that a thing) is a failure. You don’t hear Iron Man saying, “Hey Jarvis? Target that guy, that guy, that guy, and remind me to pick up ointment for this rash.” You didn’t hear Michael Knight build up a jump with “Okay K.I.T.T. I’m going to need you to Turbo Boost……………..NOW!”

There need to be well-thought-out alternatives.

Turbo Boost

In the real world, there are just too many scenarios where either the platform or you need to STFU. It’s easy to dismiss this and say, “Just use the app,” but that’s not the future and not the point. Sure, I could:

  1. Unlock my phone
  2. Find the folder where my Shazam app is buried
  3. Open the app
  4. Tap the button so it can listen and tell me what sweet jam is rocking the coffee shop right now

But why? I have a perfectly efficient workflow that would keep me from announcing to the patrons of the coffee shop that I don’t know the full Taylor Swift catalog:

  1. Hold the home button
  2. Type discreetly to Siri: “What song is this?”

It just needed a slight modification from talking to typing.

On a more serious accessibility note, there are a number of users for whom a voice interface isn’t helpful: people with severe speech impairments, people who can’t speak, people with thick accents. People from Staten Island. They can still use a conversational interface via typing, if one is made available to them.

This should be a thing that’s already here, don’t you think?

If you can say it, you should be able to easily type it.

Chat bots: Everything Old is New Again

On the text-based side of things, we have an explosion in chat interfaces, largely due to the renewed interest in bot applications.

Typing:

/giphy fail

…into Slack and getting a randomized (and often bad) animated GIF isn’t innovation. An animated GIF that is contextually relevant to the conversation thread and is randomly inserted into the channel by the Giphy bot with great comedic timing is.

Giphy bot fail!

The real magic happens when you can wire these bots up to understand natural speech the way the voice-driven applications do. Even if the interface is primarily text-based, a conversational interface will make interacting with all manner of devices more human.

Tell me you wouldn’t rather type:

“Hey TV, remember to record that new series ‘Roadies’ on Showtime.”

…like you’d message your significant other, instead of dealing with your DVR app or, worse, the UI of the cable box?


It’ll be nice to be able to have a conversation with our future robot overlords.

Use your preferred interface to communicate your thoughts.
