“Do we ask for a what and a where, or simply present them with a search box?” — it was one of the burning questions the team faced as we were getting Google Maps for mobile ready to launch in 2005. I was the product manager, and this was my first product launch at Google, so I felt we had to get every detail right. Oh, and remember: this was before every phone had a built-in GPS unit.
We ended up showing users two input fields: a what, where the user could enter restaurants, Starbucks, or whatever they were looking for, and a where, where the user could enter the location to search in. We were wrong.
People didn’t quite understand the difference between the two fields, and often filled out the first one but not the second, or mixed them up and put “coffee” as the location. That was OK; we had a simple solution: merge them into a single box. Yet again, we were wrong.
What was missing? Turns out that most people wanted to find things nearby, and had assumed the mobile phone would simply know its context: the current location. It was the automatically assumed “zero-input” part of the search. Luckily, the feedback loop was pretty short, so most users quickly realized what was going on and recovered.
Even early users of Google Maps for mobile automatically assumed location as the “zero-input” part of the search.
Context is a powerful thing, especially for mobile applications, and this little anecdote shows how bad the user experience can be if it’s not seamless and automatic. When people talk about context, they often focus on one specific dimension or another, and more specifically, location is often thought of as the strongest contextual indicator. In reality, it’s like that story about the blind men and the elephant: focus too much on one aspect, and you’ll get the wrong picture.
Understanding context can sometimes feel like the story of the blind men and the elephant: rely too much on one part and you’ll get the wrong picture.
What exactly is context, and how can mobile applications take advantage of it? What follows is a high-level overview of the various signals that make up context (location, time, and so on), which can then be combined into a holistic picture — towards the idealistic “perfect context” — and in turn used to make products better at adapting to our needs.
Side-note: I am going to focus on context as it can be derived from a mobile phone and periphery technologies. For a much more in-depth look at context and its impact on technology today, definitely read Robert Scoble and Shel Israel’s book Age of Context. Reading that book has inspired a lot of my thinking in this area.
Context of time
Most of us wake up sometime in the morning and go to bed sometime in the evening, and that can be correlated with the time of day. By paying attention to what time it is, a surprising number of fairly good predictions can be made, even in the absence of other information. Aviate, a contextual home screen for Android, does a surprisingly good job of using time of day to morph the experience in the evening, as you are heading to bed. Unlock the phone, and you’ll see the alarm clock, Kindle (for your nighttime reading) and a host of other useful bed-time apps.
Time also has longer cycles: day of the week, weekdays vs weekends, seasons, and so on, all provide interesting contextual signals in terms of what might or might not be relevant. Retailers have been scrutinizing this data for decades, with the most pronounced seasonality showing up at the end of the year as people buy presents for each other.
Offering directions to work when you are vacationing in the Bahamas is pretty likely to irritate people.
Using time as an absolute indicator of context can lead to some interesting failure cases: when you’re out partying past your bedtime and need an Uber to get home, you shouldn’t be presented with an alarm clock and a book to read. Similarly, offering directions to work when you are vacationing in the Bahamas is pretty likely to irritate people.
Measuring time is easy and yields precise results, yet too many applications ignore its usefulness as a signal for improving the user experience, especially when the various phases of time are combined. For example, 7 am on a Sunday morning probably means something different to most people than 7 am on a Wednesday morning.
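As a toy illustration, combining hour-of-day with day-of-week takes only a few lines. The buckets and cutoffs below are invented assumptions for the sketch, not anything a real product has published:

```python
from datetime import datetime

def time_context(now: datetime) -> str:
    """Guess a coarse context bucket from time alone.

    The bucket names and hour cutoffs are illustrative assumptions.
    """
    hour, weekday = now.hour, now.weekday()  # weekday(): Mon=0 .. Sun=6
    weekend = weekday >= 5
    if hour < 6:
        return "sleeping"
    if hour < 9:
        return "weekend morning" if weekend else "commute"
    if hour >= 22:
        return "winding down"
    return "leisure" if weekend else "workday"

# 7 am on a Sunday vs 7 am on a Wednesday give different guesses:
print(time_context(datetime(2014, 3, 2, 7)))  # March 2, 2014 was a Sunday
print(time_context(datetime(2014, 3, 5, 7)))  # March 5, 2014 was a Wednesday
```

Even this crude rule table captures the Aviate-style behavior described above: the same clock reading maps to different experiences depending on the longer cycle it falls in.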
Context of location
An amazing host of services have sprung up which depend on knowing the user’s location: Yelp, Foursquare, Uber and the list goes on and on. Other interesting uses of location are also starting to surface: the built-in Photos app on iOS 7 clusters photos into “Moments” based on where they were taken, making use of the location context. All of these are great examples of how useful location can be to build products and features we could only dream of before.
As exciting as location is, it’s also often hard to use correctly. If two people are close to each other, what do we know about them? Are they friends? Are they even in the same place? Location data is inherently inaccurate, and because accuracy costs energy (read: your phone’s battery life), the shortcuts taken can lead to suboptimal results. A favorite personal example: I used a mobile app to track my morning jogs, and according to the app I was quite happily running a few miles every morning. It wasn’t until I took a close look at the map that I realized it frequently tracked me off the course, and the actual distance was quite a bit shorter than I thought. No wonder I wasn’t getting into better shape.
Google Now popularized the idea of a passive, observant assistant that will help you when appropriate. In addition to location and time, Google Now makes heavy use of search history, which is a reasonably good proxy for a user’s online activity. For example, if you search for “Giants game schedule”, you probably care about the San Francisco Giants to some degree, and Google Now can bring you relevant information based on that contextual knowledge.
Search history is a reasonable proxy for a user’s online activity, but can’t be assumed to give the full picture.
The search giant has built a nice business on this simple premise: knowing what you’re looking for online says a lot about your intent right now, and with some clever algorithms, the right results (and ads) can be served.
The risk, as always, is that you might be holding the trunk thinking it is a snake. When I searched for “Giants game schedule” back in the early days of Google Now, I wasn’t signaling that I’m a fan of the Giants, but rather that I wanted to know their schedule so I could avoid traffic when I went to the city. I wasn’t too happy with the repeated notifications about game results over the following days and weeks. But then again, how could Google have known that?
Putting things together: the “Perfect Context”
One of the most interesting, and hardest to achieve, parts of context is trying to understand what the user is actually doing: whether someone is driving, sitting at a desk, talking to another person, or simply in deep thought. There is no direct measurement for this, but many proxies can be found by combining various other indicators.
Driving, walking, hopscotch-ing?
Two of the most annoying failures of early versions of Google Now were showing me the driving time home at 11 am, or the schedule for Caltrain. Why are these annoying? Because I’ve never left the office at 11 am (I wish!) and have taken Caltrain only once, ever. But why was Google Now offering these suggestions? Presumably because I work close to a Caltrain station, and my commute has a highly variable traffic pattern. In the absence of other information, it sounds pretty reasonable to offer driving time when traffic is better than usual, or to give the train schedule to people close to train stations. That’s also why it’s important for technology to have a broader mind and correlate things.
By correlating location data with mapping information (is this person going 45 MPH on a road or a railroad track?), time information (does this look like a commute or something else?), and external schedules (does that look like she’s waiting for a train every day around the same time as the train arrives?), a reasonably good picture of a person’s movements can be derived.
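These correlations can be pictured as a handful of rules layered on top of each other. The thresholds and signal names in this sketch are invented for illustration, not how any real activity-recognition system is implemented:

```python
def classify_movement(speed_mph, on_road, on_rail, near_station,
                      minutes_to_next_train=None):
    """Guess what a person is doing from a few correlated signals.

    All thresholds and signal names here are illustrative assumptions.
    """
    if speed_mph > 25:
        # Too fast to be on foot: does the track snap to a road or a railway?
        if on_road:
            return "driving"
        if on_rail:
            return "on a train"
        return "moving fast"
    if (near_station and speed_mph < 2
            and minutes_to_next_train is not None
            and minutes_to_next_train <= 15):
        # Standing still near a station shortly before a scheduled arrival.
        return "waiting for a train"
    if speed_mph >= 2:
        return "walking"
    return "stationary"
```

Each rule alone is weak — 45 MPH could be a car or a train — but intersecting it with mapping and schedule data narrows the guess considerably, which is exactly the kind of correlation described above.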
Once a user’s movement patterns are well understood, things that were never possible before suddenly are:
- You take the train every Monday, Wednesday and Thursday at 7:45am and 5:13pm or 5:27pm — you should get notifications about any changes in schedule, but only on the appropriate days and only for those trains.
- I’m driving on the highway, and get a text — the sender should be notified that I’m driving, and shouldn’t expect an answer immediately.
- You are on your way home from work, and right before you get to the intersection where your dry cleaner is, you get a reminder to pick up the things you left there.
Many of these things can even be automated: if the system detects that you stopped by the dry cleaner three days earlier and haven’t been back since, the reminder is timely and appropriate.
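The dry-cleaner reminder reduces to a small rule over visit history. The two-day turnaround below is an invented assumption, as are the function and parameter names:

```python
from datetime import date, timedelta

def should_remind_dry_cleaner(last_dropoff, visits_since_dropoff, today,
                              turnaround_days=2):
    """Fire the reminder once the cleaning is plausibly ready and the
    user hasn't been back since dropping clothes off.

    turnaround_days is an illustrative assumption, not real data.
    """
    ready = (today - last_dropoff) >= timedelta(days=turnaround_days)
    return ready and visits_since_dropoff == 0

# Dropped off three days ago and haven't been back: time to remind.
should_remind_dry_cleaner(date(2014, 3, 3), 0, date(2014, 3, 6))
```

The hard part, of course, is not this rule but the inputs to it: reliably detecting "stopped by the dry cleaner" is exactly the movement-classification problem from the previous section.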
Free, busy? No, I just want to be left alone
Every time I invite someone to a meeting, coffee or whatever, I have no idea whether they can make it or not. How is it that, when pretty much every one of my friends uses Google Calendar, I still can’t see whether they are busy or not? Turns out that the notion of being free or busy is much more nuanced than what is captured in a calendar.
The calendar keeps track of meetings I have, but it doesn’t keep track of everything else: whether I’m working on a big chunk of code and need to be left alone, having a fun conversation with friends, sporadically on the phone, my mood… the list goes on. Add on top of that the social norms of wanting to keep a lot of this information private — “wow, the Joneses are free every night this month, they must be boring!” — and it becomes pretty clear that not only is the calendar inadequate for providing this information, it will never show the full picture.
Trying to figure out whether someone is available is a notoriously hard task:
- Understanding the strength and relevance of a social connection can be important: most people don’t want to be bothered by co-workers while at home, whereas getting a call from a friend might be more than welcome.
- A lot of work is conducted on computers, so knowing that I have Sublime Text or Google Docs in the foreground while at the office is a pretty good indicator that I don’t want to be bothered unless it’s urgent.
- Speaking of urgency, if I have a problem that needs your attention, I make the judgment call pretty much in the absence of any information about whether you want to be interrupted or not, and that can be a problem. If there were a reliable way to automatically put up a digital “Do not disturb” sign, and take it down at other times, it would be much easier to be considerate of others (ah, just imagine that!).
- Simple tricks can also be used: if I set my phone to silent in the office, and then disconnect my computer and leave the office, I probably want the phone back on ringer.
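The heuristics in this list can be sketched as a couple of crude rules. The app names come from the bullets above; everything else — rule structure, function names, thresholds — is an invented assumption:

```python
# Apps mentioned above as proxies for focused work; the set itself
# is an illustrative assumption.
FOCUS_APPS = {"Sublime Text", "Google Docs"}

def wants_to_be_left_alone(at_office, foreground_app, caller_is_coworker):
    """Crude availability guess from a few weak signals (all illustrative)."""
    if not at_office and caller_is_coworker:
        return True   # most people don't want co-workers calling at home
    if at_office and foreground_app in FOCUS_APPS:
        return True   # focused work in the foreground
    return False

def ringer_should_come_back_on(silenced_at_office, just_left_office):
    """The 'simple trick': restore the ringer when leaving the office."""
    return silenced_at_office and just_left_office
```

None of these rules is right all the time, which is the point of the section: availability is a judgment built from many weak, private signals, not a single bit in a calendar.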
Just as we can get pretty darn close to drawing a perfect circle, with technology, we can get pretty close to the perfect context.
Understanding the context of any person relies on a lot of indirect measurements and signals. Fortunately, technology is getting better every day at interpreting these signals, and we’re getting more signals as well — and just as we can get pretty darn close to drawing a perfect circle, with technology, we can get pretty close to the perfect context.