The Rise of Intelligent Digital Assistants

How mobile phones, personal data, and the cloud are creating a new way to interact with computing devices.

Sasank Reddy
Mobile, Apps and Future

--

The idea of having an intelligent digital assistant isn’t new. Hollywood has already shown us what a future digital assistant could look like, from HAL in 2001: A Space Odyssey to Jarvis in Iron Man and Samantha in Her. But we’re still a ways off from Hollywood’s representation.

Clippy helped you with common tasks that you would do in Microsoft Office — like writing a letter.

Our practical history with digital assistants is much more modest. Many of us remember the venerable Clippy, an interactive animated character that helped users navigate Microsoft Office.

An even earlier encounter came from Microsoft Bob, released in 1995. This simplified interface for Windows contained a series of “personal guides” that included a friendly dog, a French cat, an energetic monkey, a turtle, a rat, a gargoyle, and even William Shakespeare. These digital guides helped users with everyday tasks such as finding programs, opening files, or editing documents.

Microsoft Bob had a series of guides that you could interact with.

With the advent of mobile phones, we’re seeing a new era of digital assistants that are much more sophisticated. They combine machine learning technologies from the fields of speech, natural language processing, and document analysis to provide a new way to interface with our personal computing devices. The three most popular of these digital assistants come from the mobile operating system leaders: Google (Now), Apple (Siri), and Microsoft (Cortana). Each takes a different approach to interacting with the end user, relying on various forms of predictive intelligence and personalization.

All three assistants let you query for information using voice, such as performing web searches, finding points of interest nearby, or getting weather information. They also interface with the basic functions of a phone, like making calls, setting reminders, or invoking apps. But the directions in which they are evolving differ.

Google Now

Google Now takes advantage of its knowledge graph and personal data streams to display relevant information through the use of “cards”. The system leverages activities you do on your phone and information gathered through the vast array of services it provides (search, email, calendar, maps) as inputs to its “anticipation” engine. The idea is that instead of you querying for information all the time, Google Now anticipates what information would be useful to you and delivers it “just in time”.

Google Now is designed to get you the “right information at just the right time”.

For instance, if you are near an airport, it will bring up your boarding pass automatically. Or if you have an appointment in your calendar, it will suggest a route to get to your event. This “anticipatory” nature of Google Now is what makes it great and also sometimes frustrating — once you swipe a card away, you can’t get it back and some cards seem redundant over time (like everyday transportation information).
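The anticipation idea described above can be pictured as a set of rules that turn the current context into cards. What follows is only a minimal sketch under stated assumptions, not how Google Now is actually built: the `Context` fields, the `anticipate` function, and the two-hour window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Set, Tuple


@dataclass
class Context:
    """Hypothetical snapshot of what the assistant knows right now."""
    now: datetime
    near_airport: bool = False
    boarding_pass: Optional[str] = None
    next_event: Optional[Tuple[str, datetime]] = None  # (title, start time)
    dismissed: Set[str] = field(default_factory=set)   # cards swiped away


def anticipate(ctx: Context) -> List[str]:
    """Return the cards worth showing for this context (illustrative rules)."""
    cards = []

    # Rule 1: near an airport with a boarding pass on file.
    if ctx.near_airport and ctx.boarding_pass:
        cards.append(f"Boarding pass: {ctx.boarding_pass}")

    # Rule 2: a calendar event starts within the next two hours (assumed window).
    if ctx.next_event:
        title, start = ctx.next_event
        if timedelta(0) <= start - ctx.now <= timedelta(hours=2):
            cards.append(f"Route to: {title}")

    # Cards the user has swiped away are not shown again.
    return [card for card in cards if card not in ctx.dismissed]


ctx = Context(
    now=datetime(2014, 6, 1, 9, 0),
    near_airport=True,
    boarding_pass="UA 123",
    next_event=("Lunch", datetime(2014, 6, 1, 10, 0)),
)
print(anticipate(ctx))
```

Tracking dismissed cards in the context mirrors the frustration mentioned above: once a card is in that set, nothing in this sketch brings it back.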

Siri

Apple’s Siri takes a much more human approach to being your digital assistant. Instead of running in the background and anticipating your needs, it waits for you to make a query and responds when asked. This goes back to Steve Jobs’s idea of the computer as a “bicycle for the mind”. Siri essentially serves as a more effective tool for people to accomplish the tasks they want to undertake.

Siri is built to be personable: you can even ask it for a joke.

Siri also has a personality: you can make idle chitchat, such as asking it for a joke or inquiring about its origins. This playfulness makes it feel okay when the app makes mistakes. Overall, the conversational approach is what makes Siri stand out from Google Now.

Cortana

Microsoft’s Cortana combines the anticipation capabilities of Google Now with some of the personality of Siri. The key difference between Cortana and Google Now is user control. Cortana has a virtual “notebook” that stores all the personal information it knows about you. You can remove information from this notebook if you don’t feel comfortable sharing it, or deny Cortana access to a source altogether, such as your email.

With Cortana, you can also share your interests, like the types of movies you like or your food preferences, which helps it suggest content, places, and events you might be interested in. The language and voice engine seems a bit more advanced than Siri’s, a testament to the long line of research Microsoft has done in this space.
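To make the notebook idea concrete, here is a rough sketch of a store in which the user can remove individual entries or switch off a whole data source. This is an assumption-laden illustration rather than Cortana’s real design: the `Notebook` class and its `remember`, `deny`, `forget`, and `known` methods are hypothetical names.

```python
class Notebook:
    """Hypothetical store of what an assistant knows about a user."""

    def __init__(self):
        self._entries = {}    # topic -> (source, value)
        self._denied = set()  # data sources the user has switched off

    def remember(self, source, topic, value):
        """Record a fact unless its source has been denied. Returns True if stored."""
        if source in self._denied:
            return False
        self._entries[topic] = (source, value)
        return True

    def deny(self, source):
        """Block a source and drop everything already learned from it."""
        self._denied.add(source)
        self._entries = {
            topic: (src, value)
            for topic, (src, value) in self._entries.items()
            if src != source
        }

    def forget(self, topic):
        """Remove a single entry; a missing topic is ignored."""
        self._entries.pop(topic, None)

    def known(self):
        """Return the facts currently held, as topic -> value."""
        return {topic: value for topic, (_, value) in self._entries.items()}


notebook = Notebook()
notebook.remember("user", "cuisine", "Thai")     # interest shared by the user
notebook.remember("email", "flight", "UA 123")   # fact inferred from email
notebook.deny("email")                           # user revokes email access
print(notebook.known())
```

Keeping the source alongside each value is the design choice that makes revoking access meaningful: denying a source can also clear what was already learned from it.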

You can interact with Cortana in a number of ways; a few are illustrated here.

Where are we going from here?

We are just getting started with these intelligent personal assistants. The algorithms that power them will only get better. As we progress to wearable computing devices, these personal assistants will become more important as they provide an easier interface to access information on the go. But where will this technology go next? Here are some ideas that might be interesting for digital assistants to explore.

  • Interacting Through Conversations

Currently, you interact with most digital assistants through voice and the search box. Voice works great in certain situations, such as when you are in the car or at home, but it feels less comfortable as an input mechanism when you are with other people. In that case, a text conversation with your assistant might be more appropriate. The rise of messaging apps has already shown that mobile users are comfortable with quick text conversations. Digital assistants that can respond to text conversations will be useful in many situations.

  • Humans in the Loop

Digital assistants are great at taking care of simple tasks right now. You can imagine they will get better as they link with other apps, context information, and even more data about you. But sometimes, you want a true human touch. Perhaps we will see these services offer a way to reach a real person when needed.

The Mayday service from Amazon connects you to a live person to help troubleshoot your device.

Both Amazon’s Mayday platform and Airbnb’s local companion service show the value of having a human in the loop. It might be that dynamic situations with tight interaction require humans to be involved, while more straightforward queries that can be broken down into logical steps can be handled by computers.
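One way to picture that split is a router that answers requests it has a scripted handler for and hands everything else to a person. This is only a sketch under assumptions, not a description of how Mayday or any real assistant works: the `HANDLERS` table, the intent names, and the `route` function are hypothetical, and the replies are placeholders.

```python
# Hypothetical table of intents the assistant can handle on its own.
# Each handler stands in for a task that breaks down into logical steps.
HANDLERS = {
    "weather": lambda query: "Forecast lookup (placeholder reply)",
    "reminder": lambda query: "Reminder set (placeholder reply)",
}


def route(intent, query):
    """Answer automatically when a handler exists; otherwise escalate to a human."""
    handler = HANDLERS.get(intent)
    if handler is None:
        # No scripted steps for this kind of request, so pass it to a person.
        return ("human", query)
    return ("assistant", handler(query))


print(route("reminder", "Remind me to call Sam at 5"))
print(route("troubleshoot", "My screen keeps flickering after the update"))
```

A real system would need a far richer notion of when to escalate; the table lookup here simply marks where that decision would sit.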

  • Domain Specific Knowledge

Most digital assistants revolve around social, navigation, and search-type applications. But what about applications related to productivity, relationships, shopping, or finance? In the future, we can imagine more specific modules built into these assistants for each of these domains.

For instance, in the enterprise, there might be a “sales assistant” that helps you accomplish your sales goals by having access to all of your relationships and interactions. Or a “money management assistant” that suggests investment opportunities you should be aware of and helps manage your expenses.

We’re already seeing initial incarnations of this in apps like RelateIQ, Refresh, and Check, which rely on data science to make better recommendations. But I imagine that this functionality will soon be built into these assistants as part of their overall utility.

  • Pretargeting-Based Advertisements

We all know about re-targeting. This is why you see Zappos ads follow you around the Internet when you search for shoes on Google. But as these digital assistants are able to get more data about you, a new type of advertising will occur: “pretargeting”.

Instead of reacting to your customers after they have shown intent, you pretarget them based on what you think they will want to do or buy.

The idea is that the advertising will be based on anticipating your future actions.

  • Is your anniversary coming up? How about an ad that suggests places nearby where you might want to go?
  • Leaving your favorite restaurant? Maybe you’ll get an advertisement for a nearby bar that has your favorite beer on tap.

As we give up more data to these systems, one can imagine advertising only becoming more predictive.

We are just getting started with digital assistants. It is safe to say that they will become the main way we interact with computers on a day-to-day basis. It will be exciting to see how their interfaces evolve over the next few years.

Sasank Reddy is the co-founder of Kullect. His interests are mobile messaging, sensors, and context-aware computing.
