Digital assistant showdown

Mutual Mobile · Aug 26, 2015

By: James Ayvaz

Apple revolutionized hands-free user interactions when they introduced Siri in 2011. Since then, we’ve seen the introduction of Google Now (2012), Hound from SoundHound, and most recently, a “leaked” copy of Microsoft’s Cortana for Android. Below is my biased and totally unscientific comparison of the current crop of personal assistant apps. Your results may vary.

Note: Hound is in private beta release and Microsoft Cortana is a “leaked” copy. Extra leeway should be given to both, since neither is an officially released product. Also, all of these products are backed by cloud services, so they are continuously improving.

Contenders

Siri

Siri, the original personal assistant acquired by Apple in 2010, was the first voice-controlled user interface to utilize conversational speech and cloud intelligence. Continuing the Apple tradition of making technology approachable, Apple gave Siri a “personality”.

Distinguishing features:

  • Accessible via physical button on device, making hands-free operation easier.
  • Can learn your name and how to pronounce it correctly.
  • Accessible through the lock screen to help identify the owner of the device.
  • Only available on iPhone and iPad devices, although rumors of an OS X version have persisted for years.

Google Now

Personal assistant powered by Google’s industry leading search engine and natural language processing technology.

Distinguishing features:

  • Aggregates data using Google’s knowledge graph to proactively surface contextually relevant information on “cards”: package delivery status, flight status, weather, traffic, and limited support for third-party information.
  • First personal assistant to use always-listening technology to trigger interactions with just the phrase “OK Google.”
  • Available across platforms, on mobile and desktop: iOS, Android, OS X, Windows, and Linux.

Hound

Created by SoundHound and in development for over a decade, Hound touts mind-boggling logic processing and speed. Hound also works amazingly well when using recommended queries, but becomes less useful when you depart from suggested questions. Hound is currently in private beta for Android only, but that will likely change as the digital personal assistant race heats up.

Distinguishing features:

  • Correctly interprets convoluted logic queries (e.g. “Show me all hotels in San Francisco for Friday night that are less than $200”).

Microsoft Cortana

Cortana originally debuted on Windows phones, and is now being ported to Android and, eventually, iOS. As I said earlier, I only tested the “leaked” version on Android, and will need to revisit once the official version has been released. Like Siri, Cortana attempts to have a personality, along with some other novel features: geofencing, emoji story.

Distinguishing features:

  • Geofence-aware reminders that send notifications based on location. For example, say “remind me to buy milk the next time I am at the store” and Cortana will remind you when you are geographically close to the store you select. I tried this and it works to an extent; I got reminders when I drove past stores on the freeway.
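Cortana’s actual geofencing thresholds aren’t public, but the mechanism can be sketched as a simple distance test against a trigger radius. Everything below is an illustration of mine, not Cortana’s implementation; the `radius_km` value in particular is a guess, and a radius that generous would explain firing from a nearby freeway:

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def inside_geofence(user, store, radius_km=1.0):
    """Fire the reminder when the user is within radius_km of the store.

    radius_km is a made-up value -- a fence this wide would trigger from
    a freeway passing near the store, matching the behavior I observed.
    """
    return haversine_km(*user, *store) <= radius_km
```

With a 1 km fence, a freeway a few hundred meters from the store is “at the store”; a tighter radius would cut the false positives at the cost of sometimes missing a genuine visit.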

Rules

Like all good competitions, I had to devise a point system to keep my decision as scientific as possible. I decided to award points as follows:

Speech recognition (10 points)

To test each assistant’s speech recognition, I ran it through a series of ten quick questions just to see if it even understood me. Each time the assistant responded, I gave it a point. If it didn’t, I gave it a goose egg.

Speed (4 points)

Speed is hard to judge fairly, since it depends on many factors outside the app’s control (network speed, data signal, etc.). Additionally, some apps take longer because they perform logical analysis of the query, while others simply plug the text into a search engine. So rather than time individual queries, I graded on a curve and assigned a cumulative score based on each assistant’s overall speed.

Accuracy (5 points per query)

The true measure of a digital assistant is how well it answers your questions and requests. To test this, I asked each service the same ten questions and scored them accordingly. Below is a breakdown of how they were judged on a per-question basis:

0 = wrong answer or fail
1 = query was submitted to a search engine and the search results were displayed
2 = answer is available but requires touch interaction
3 = answer or operation is correct and the result is reported audibly
+1 = bonus point for exceptional behavior
+1 = extra bonus if context is preserved on follow-up questions (e.g. “Show me pictures of the Eiffel Tower. How tall is it?”)
-1 = deducted for shady behavior (e.g. directing users to pay services)
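The rubric above amounts to a small tally function. Here’s a sketch of how I add things up (the outcome names and helper functions are mine, invented for illustration):

```python
# Base score per question, from the rubric above.
BASE = {"fail": 0, "web_search": 1, "touch_required": 2, "audible": 3}

def score_question(outcome, exceptional=False, kept_context=False, shady=False):
    """Score one query: base 0-3, +1 exceptional, +1 context kept, -1 shady."""
    s = BASE[outcome]
    if exceptional:
        s += 1
    if kept_context:
        s += 1
    if shady:
        s -= 1
    return s

def total(speech, speed, accuracy_scores):
    """Overall tally: speech (max 10) + speed (max 4) + ten accuracy scores."""
    return speech + speed + sum(accuracy_scores)

# Google Now's winning line from the results below:
google_now = total(9, 3, [3, 5, 1, 3, 3, 4, 3, 0, 4, 1])  # -> 39
```

Ten questions at up to 5 points each, plus 10 for speech recognition and 4 for speed, gives the 64-point maximum quoted in the final tallies.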

Results

After a hard-fought software battle, here’s how each digital assistant netted out.

Event 1: Speech Recognition

Below are the 10 questions I asked each assistant:

“When was the pyramid of Cheops built?”
“What year did Arnold Schwarzenegger win Mr. Olympia?”
“How tall is the Matterhorn?”
“Give me directions to Hopdoddy?”
“Give me directions to Manchaca Road?” (Austin pronunciation)
“Give me directions to Manchaca Road?” (Spanish pronunciation)
“How do you pronounce Guadalupe?” (Austin pronunciation)
“How do you pronounce Guadalupe?” (Spanish pronunciation)
“What time is it in Tallahassee?”
“Who is the mayor of Tehran?”

And here are the results:

Siri
1 0 1 1 0 0 0 0 1 1 = 5

Google Now
1 1 1 1 1 1 0 1 1 1 = 9

Hound
0 1 1 0 0 0 0 1 1 1 = 5

Cortana
1 1 1 1 1 1 0 1 1 0 = 8

Event 2: Speed

Siri = 2
Google Now = 3
Hound = 4
Cortana = 1

Event 3: Accuracy

Since this is the most important event in the Digital Assistant Olympics, I’ll break down the results for each question/command individually.

1) “Play ‘You Know You Like It.’”

Siri
Siri played the correct song from my music library, along with full media controls: Next, Skip, Previous, Pause, Stop, Resume.

Score = 5
(3 points for carrying out the command, 1 point for full control of music library, 1 point for context aware questions — “Who sings this song?” will report the artist for the currently playing track)

Google Now
Google Now played the correct song from my music library, but lacked Siri’s full media controls. All other actions must be performed by touch. When the song or artist cannot be identified in the music library, Google Now will launch Google Play Radio.

Score = 3
(3 points for carrying out the command)

Hound
Hound performed a web search and displayed the results from SoundHound instead of my music library. I also had to touch the screen to play the song.

Score = 2
(2 points for displaying the correct search results but requiring touch interaction to play)

Cortana
Cortana performed a web search, resulting in a series of YouTube videos I had to tap to activate. However, it should be noted that Cortana’s documentation says this works properly on Windows phones.

Score = 1
(1 point for performing a web search that led me to a bunch of videos instead of the song itself)

SUMMARY:

Siri = 5
Google Now = 3
Hound = 2
Cortana = 1

2) “What time does Torchy’s close? Show me their menu.”

Siri
Siri successfully answered the question, but required additional effort to dig up the menu.

Score = 3
(3 points for answering the question properly.)

Google Now
Google Now answered the question and brought up the menu when asked.

Score = 5
(3 points for answering the question, +1 for reporting the results without follow up questions, +1 for bringing up their menu upon voice command.)

Hound
Hound performed a web search, but didn’t even narrow it down according to my location.

Score = 1
(1 point for performing a web search)

Cortana
Cortana answered the question, as well as the audible follow-ups, but I still had to tap for the menu.

Score = 4
(3 points for answering the question, +1 for reporting the results without follow up questions)

SUMMARY:

Siri = 3
Google Now = 5
Hound = 1
Cortana = 4

3) “Make dinner reservations for tomorrow night.”

Siri
Siri worked thanks to its OpenTable integration; however, I had to use the touch screen to answer follow-up questions.
Score = 4
(3 points answering command, +1 for seamless OpenTable integration)

Google Now
Google Now performed a Google search.
Score = 1
(1 point for performing a search)

Hound
Hound performed a search.
Score = 1
(1 point for performing a search)

Cortana
Cortana displayed a list of nearby restaurants based on my geographical location.
Score = 2
(1 point for performing a search, +1 point for using my location to tailor the search results)

SUMMARY:

Siri = 4
Google Now = 1
Hound = 1
Cortana = 2

4) “Wake me up at 10:00 AM.”

Siri
Siri created an alarm and allowed me to verbally cancel or adjust the time.
Score = 4
(3 points for performing the task, +1 for being able to adjust times)

Google Now
Google Now successfully created the alarm, but nothing else.
Score = 3
(3 points for performing the task)

Hound
Hound successfully created the alarm, but nothing else.
Score = 3
(3 points for performing the task)

Cortana
Cortana successfully created the alarm, but nothing else.
Score = 3
(3 points for performing the task)

SUMMARY:

Siri = 4
Google Now = 3
Hound = 3
Cortana = 3

5) “Remind me to {do something} at {time}.”

Siri
Siri created the reminder, but it seemed to get stuck in a loop and required several attempts.
Score = 2
(3 points for performing the task, -1 for being buggy)

Google Now
Google Now created the reminder.
Score = 3
(3 points for performing the task)

Hound
Hound created the reminder.
Score = 3
(3 points for performing the task)

Cortana
Cortana created the reminder and added a location to it.
Score = 4
(3 points for performing the task, +1 for geofencing)

SUMMARY:

Siri = 2
Google Now = 3
Hound = 3
Cortana = 4

6) “Is it safe to take acetaminophen with alcohol? What about aspirin?”

Siri
Siri performed a web search.
Score = 1
(1 point for performing a search)

Google Now
Google Now audibly returned the correct result and answered the follow up question.
Score = 4
(3 points for performing request, +1 for maintaining context during follow-up)

Hound
Hound performed a web search.
Score = 1
(1 point for performing a search)

Cortana
Cortana performed a web search.
Score = 1
(1 point for performing a search)

SUMMARY:

Siri = 1
Google Now = 4
Hound = 1
Cortana = 1

7) “Who was president of the United States in 1993? How long was he president?”

Siri
Siri audibly answered Bill Clinton, but failed to acknowledge the end of George Bush’s term.
Score = 2
(3 points for performing request, -1 for leaving out a president)

Google Now
Google Now audibly mentioned both presidents.
Score = 3
(3 points for performing request)

Hound
Hound performed a web search.
Score = 1
(1 point for performing a search)

Cortana
Cortana performed a web search.
Score = 1
(1 point for performing a search)

SUMMARY:

Siri = 2
Google Now = 3
Hound = 1
Cortana = 1

8) “Who was president of Germany in 1987?”

Siri
Siri audibly returned the correct result: Richard von Weizsäcker.
Score = 3
(3 points for performing request)

Google Now
Google Now audibly reported a completely unrelated anecdote about President Reagan’s “Tear down this wall” speech. (I asked this question on a historically significant day, which might have confused Google.)
Score = 0
(+1 point for answering audibly with confidence, -1 for giving the wrong answer)

Hound
Hound performed a web search.
Score = 1
(1 point for performing a search)

Cortana
Cortana performed a web search.
Score = 1
(1 point for performing a search)

SUMMARY:

Siri = 3
Google Now = 0
Hound = 1
Cortana = 1

9) “What is a calico cat? Why are they always female?”

Siri
Siri performed a web search.
Score = 1
(1 point for performing a search)

Google Now
Google Now audibly returned the correct results sourced from Wikipedia.
Score = 4
(3 points for performing request, +1 for preserving context during follow-up)

Hound
Hound performed a web search.
Score = 1
(1 point for performing a search)

Cortana
Cortana returned the correct results sourced from Wikipedia, but not audibly.
Score = 2
(3 points for performing request, -1 for staying quiet)

SUMMARY:

Siri = 1
Google Now = 4
Hound = 1
Cortana = 2

10) “Find a hotel in San Francisco for Friday night that is less than $200, and has a free breakfast”

Siri
Siri performed a web search for the city of San Francisco, with nothing about hotels.
Score = 0
(0 points for performing the wrong search)

Google Now
Google Now performed a web search for San Francisco hotels.
Score = 1
(1 point for performing a search)

Hound
Hound returned correct results with prices and links to book the hotel. This is one of the featured query types that Hound excels at.
Score = 5
(3 points for performing an accurate search, +1 for providing the right links, +1 for speed)

Cortana
Cortana performed a web search for San Francisco hotels.
Score = 1
(1 point for performing a search)

SUMMARY:

Siri = 0
Google Now = 1
Hound = 5
Cortana = 1

Winners

Once the checkered flag dropped, it was Google Now who put the pedal to the metal and crossed the finish line just in time to gain the gold medal. Between its unrivaled contextual awareness and search prowess, Google Now was just a little too much for its competitors. Siri wasn’t far behind, and could become a true contender in the near future. Coming in third was Cortana, which may perform even better once it is officially released. And last, and unfortunately least, was Hound. Hound put up a good fight, but it’s just too small to truly compete with superpowers like Google, Apple, and Microsoft. If you don’t feel like adding up the scores yourself, I’ve added the final tallies below. As the old adage goes, numbers don’t lie.

Siri = 32
Google Now = 39
Hound = 28
Cortana = 29

Max Score (including bonus points): 64

Originally published at www.mutualmobile.com on August 26, 2015.
