AI’s tipping point…?
Last night I watched my wife have a conversation with a computer. Not a particularly sophisticated conversation, as conversations go, but it felt like a conversation to her and that’s perhaps more important. She had just received a phone upgrade, an iPhone 4S, and while she was playing around with it on the sofa next to me, I casually asked her whether she realised the phone was intelligent.
“What do you mean?” she said.
“It has some new software in it called Siri that will listen to you and understand what you’re saying. You can use it to do things for you, you know — phone-related things like looking something up or adding things to your calendar and stuff.”
She looked at me a bit dubiously.
“Just hold it to your ear and talk to it as if you were on the phone to someone”, I said.
“What shall I ask it to do?”
“Um, why not tell it you’re going to meet me for lunch tomorrow?”
She lifted the handset to the side of her head, and a soft chime sounded to indicate that Siri was listening. “What can I help you with?” it said.
She paused for a second or two and then said in her best ‘talking to a machine’ voice: “Meet Chris for lunch at one o’clock tomorrow.”
The voice in her ear started saying something unintelligible to me, and then she shot me a cheeky smile and said “Hmm…” as if she was giving it serious thought, “I meant Chris Dymond”.
The voice then said a few more things and her eyes widened. “Er, no, um… push the meeting to 2 o’clock instead”.
The voice carried on for a few seconds, then she said “Yes please”, took the phone away from her ear, rapidly unlocked it and flicked to her calendar.
There it was. Lunch with me at two o’clock. Just as she’d asked for. She looked at me in amazement, trying to fathom what had just happened, then said “It didn’t sound like Stephen Hawking… at one point it said ‘okaay..’ slowly, as if it was thinking about something…”
Artificial intelligence has been with us for a while without us really noticing, of course: figuring out what’s relevant to us in Google search results, LoveFilm recommendations, personalised radio stations and the like. It’s used in smart recognition systems like Shazam and Google Goggles, and in content surfacing systems like Hunch and, more recently, Trapit, which uses technology from the same DARPA-funded research as that used by Siri.
But I think Siri represents something new — a difference in how visible the intelligence is and in how people perceive it and interact with it.
At the beginning of this year, geeks around the world were amazed by IBM’s ‘Watson’ computer, as it played a televised game of Jeopardy against human champions and beat them handsomely. It was an eye-opening demonstration that machines can now do things cognitively that were formerly the preserve only of humans. I think Siri will open the eyes of non-geeks to the new reality in a similar way.
And so, as I wonder what impact this will have, three things stand out for me initially:
Firstly, people will very soon realise that low-end ‘knowledge work’ is under significant threat. IBM didn’t build Watson to win lots of money at Jeopardy; they built it to provide a cheaper alternative to hiring and training thousands of call-centre staff (amongst a zillion other potential uses, but that one seems the most obvious). Automated online assistants have been around for ages, of course, but just like Sir Lancelot storming Swamp Castle, you see it coming for ages, then suddenly it arrives…
Secondly, I think the importance of Application Programming Interfaces, or APIs, will become more obvious to people. Siri works out what you’re trying to do by processing enormous amounts of data from hundreds of services across the Internet, which it accesses via such APIs, and if you want your information to be part of Siri’s answers you need to be able to plug your service into it.
Thirdly, Siri’s user interface is highly refined, from detecting when you move the phone to your ear to its tone of voice, talking speed, pauses and word choice. Soon it may well dynamically modify these things based on the action it’s performing, the urgency it detects in the voice of the user and the need to keep the user engaged. It may not be perfect now, but it’s already impressively good. As such voice interfaces become more ubiquitous, and more nuanced, this new field of interaction design will gain in importance, with new skills, considerations, constraints and practices.
The great American author F. Scott Fitzgerald once wrote: “The test of a first-rate intelligence is the ability to hold two opposed ideas in the mind at the same time, and still retain the ability to function.”
We may still be some way from having machines that cope well with high levels of ambiguity, but, when we look back in a few years’ time, we may well identify 2011 as the year in which artificial intelligence first started to really feel like intelligence to us.
Originally published at suspendedjudgement.net.