The Test of Artificial Intelligence

We traditionally gauge an artificially intelligent agent by its ability to perform benchmark tasks.

Paul Abrahams
Pensées
Dec 6, 2022

It was Alan Turing who originally proposed the criterion for a thinking machine: if the machine could carry on a conversation with a human and convince the human that it was itself human, then that machine was intelligent. We now have programs, most notably GPT-3, that arguably meet this criterion and can, for instance, write essays that appear to have been written by a person.

A second criterion has been the ability to win at various games, starting with checkers and culminating in the board game Go. In 2016 a program, AlphaGo, won a five-game match against Lee Sedol, one of the world’s best Go players.

Driving a car is another classic challenge. We do indeed have self-driving cars, but they sometimes get into accidents. However, even the best human drivers get into accidents, so this is not really a fair test.

So what, then, would be a good measure of artificial intelligence? We really need to distinguish between two broad kinds of challenges: those that are purely symbolic (like playing Go) and those that actually require interaction with the physical world (like driving a car).

As a symbolic task, I propose the ability to write a legal brief that can actually win a court case. Court cases are often decided on the basis of legal briefs, which exist, of course, as texts. And a court case is a competitive situation: which side can present the more compelling argument? In a way, this test of artificial intelligence is akin to the early test of whether a computer can win at checkers. No physical abilities are required.

My favorite example of a task that requires physical ability is making up a hotel room, which is essentially the job of a robotic housekeeper. A housekeeping machine would need a number of physical abilities and would also have to be sufficiently compact and autonomous to move around the hotel room in the first place. A more difficult task of the same genre is to find and repair a plumbing leak in a house. No existing robot comes anywhere near being able to do that.

There’s a whole range of tasks that people do that are often poorly paid and so far defy automation. Nursing homes are overwhelmed by tasks like that. What a boon it would be to have robots able to perform them!

The era of thinking machines can be a boon to humankind, but it will demand a major restructuring of society. That will have to be the subject of a different essay.

Paul Abrahams is a retired computer scientist living in Deerfield, Massachusetts. President of ACM from 1986 to 1988, he now writes philosophical essays.