Winged pig for scale

Book Review

Artificial Intelligence

Nick Santos
Uncritical Criticism
3 min read · Dec 31, 2016


by Stuart Russell and Peter Norvig

One of my favorite books is “The Princess Bride,” mostly for its framing device. In the novel, William Goldman is a father trying to find a gift for his son. He remembers a fantasy adventure his dad read aloud to him long ago, titled “The Princess Bride.” He calls around at New York book stores trying to find a copy. Then he finds it, and discovers that the fantasy adventure is not the main focus of the book. It’s a dry tome on politics with an allegory about a princess to illustrate its points. His father skipped the political bits, leaving only “the good parts.”

That illustrates how I think about “Artificial Intelligence” by Russell and Norvig. Yes, it’s a 1,000-page math textbook. I keep it in my kitchen to press tofu. This summer, I set myself a goal to read this tome cover to cover.

I learned a lot of math. The book begins with an overview of graph search algorithms (breadth-first search, depth-first search, A*, etc.) and their pros and cons. It introduces the idea that all machine learning is fundamentally graph search on massive state spaces. We have to think creatively about how to decide which paths to explore in the state space. And we have to think creatively about how we define the state space: considering both explicit spaces (i.e., “which path should I take next?”) and meta-reasoning spaces (i.e., “which algorithm should I use to decide which path I should take next?”).
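
To make that concrete, here is a minimal sketch of A* search in Python (my own illustration, not code from the book). The graph, edge costs, and heuristic values are invented; the heuristic is what decides which path gets explored next.

    import heapq

    def a_star(graph, heuristic, start, goal):
        # A* search: always expand the node with the lowest
        # cost-so-far + heuristic estimate of the remaining cost.
        frontier = [(heuristic[start], 0, start, [start])]
        best_cost = {start: 0}
        while frontier:
            _, cost, node, path = heapq.heappop(frontier)
            if node == goal:
                return path, cost
            for neighbor, step_cost in graph[node]:
                new_cost = cost + step_cost
                if new_cost < best_cost.get(neighbor, float("inf")):
                    best_cost[neighbor] = new_cost
                    priority = new_cost + heuristic[neighbor]
                    heapq.heappush(frontier, (priority, new_cost, neighbor, path + [neighbor]))
        return None, float("inf")

    # A toy state space: edges with costs, plus a guess at the distance to the goal.
    graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)], "D": []}
    heuristic = {"A": 3, "B": 2, "C": 1, "D": 0}
    print(a_star(graph, heuristic, "A", "D"))  # (['A', 'B', 'C', 'D'], 4)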

There’s a tour through propositional logic (X ∨ Y ⇒ Z) versus first-order logic (∀x ∃y : Loves(x, y)), how we can chain logical statements together to infer new ones, and how we can apply graph search to find those inferences.
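
As a toy illustration of that chaining (mine, not the book’s), forward chaining over propositional rules keeps firing any rule whose premises are all known until nothing new can be inferred. The rules below are invented:

    def forward_chain(facts, rules):
        # Fire every rule (premises -> conclusion) whose premises are all
        # known, and repeat until no new facts appear.
        known = set(facts)
        changed = True
        while changed:
            changed = False
            for premises, conclusion in rules:
                if conclusion not in known and all(p in known for p in premises):
                    known.add(conclusion)
                    changed = True
        return known

    rules = [
        ({"rain", "no_umbrella"}, "wet"),
        ({"wet"}, "cold"),
    ]
    print(forward_chain({"rain", "no_umbrella"}, rules))
    # {'rain', 'no_umbrella', 'wet', 'cold'} (set order may vary)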

The most complex math involves a deep dive into probability. Machine learning algorithms use probability to judge how likely a path is to be fruitful, Bayes’ rule to update those probabilities as new information arrives, and derivatives over probability distributions to maximize how much we learn from new information.
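
For example (the numbers are mine, purely illustrative), here is Bayes’ rule updating a belief that a search path is fruitful after one promising signal:

    def bayes_update(prior, likelihood, evidence_prob):
        # P(H | E) = P(E | H) * P(H) / P(E)
        return likelihood * prior / evidence_prob

    prior = 0.2                    # P(fruitful) before seeing anything
    p_signal_if_fruitful = 0.9     # P(signal | fruitful)
    p_signal_if_not = 0.3          # P(signal | not fruitful)
    p_signal = p_signal_if_fruitful * prior + p_signal_if_not * (1 - prior)

    posterior = bayes_update(prior, p_signal_if_fruitful, p_signal)
    print(round(posterior, 3))  # 0.429: one promising signal roughly doubles our confidence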

Then a few chapters have no math at all, but are entirely philosophical. What is intelligence? How does the human brain work? Does human intelligence need to be the same as artificial intelligence? Where do they share commonalities, and where should they diverge? I wondered if there should be an edition called “Artificial Intelligence: The Non-Math Parts.”

The book introduces the conundrum of “The Nixon Diamond.” Richard Nixon was a Quaker, and most Quakers are pacifists. He was also a Republican, and most Republicans are not pacifists. How should an intelligent agent resolve this? Does it make no assumptions at all? Does it assign different probabilities to “not-pacifist, given Republican” and “pacifist, given Quaker”? How should it incorporate new information based on Nixon’s actions?
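
One way to read the “different probabilities” option: treat the two memberships as pieces of evidence and combine them, say with a naive independence assumption. A toy sketch, with numbers I invented purely for illustration:

    # All numbers below are invented; the point is the mechanics, not the values.
    prior_pacifist = 0.5

    # Likelihood of each observed membership under each hypothesis.
    p_quaker_if_pacifist, p_quaker_if_not = 0.30, 0.05
    p_repub_if_pacifist, p_repub_if_not = 0.20, 0.50

    # Unnormalized posteriors, assuming the two memberships are independent evidence.
    score_pacifist = prior_pacifist * p_quaker_if_pacifist * p_repub_if_pacifist
    score_not = (1 - prior_pacifist) * p_quaker_if_not * p_repub_if_not

    posterior = score_pacifist / (score_pacifist + score_not)
    print(round(posterior, 3))  # 0.706: here the Quaker evidence outweighs the Republican evidence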

Does 100% perfect knowledge exist? The authors point out that even the simplest first-order propositions like “all cars have 4 wheels” run into problems. If one of the wheels is stolen, does it cease being a car? What makes it a “car?” How much would you have to take away to make it a “Not Car?”

When is it OK to act on imperfect knowledge? If the AI waits for 100% confidence, then it’d be useless. So what is the right confidence threshold to act on a logical inference? What kinds of thresholds allow the system to make more effective decisions, but are morally treacherous?

The math and philosophy parts of the book converge towards a vision of AI where nothing is certain. Instead, the AI algorithms (sketched in miniature after this list):

  1. keep huge probability tables about statements that might describe the world, and
  2. seek out lots and lots of new data to refine those probability tables, attempting to validate or disprove those statements.
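
Here is that loop in miniature (my own toy sketch, not the book’s notation): a single entry of such a probability table, the belief that a biased coin lands heads, refined as new flips come in:

    import random

    random.seed(0)

    # One "table entry": a Beta(alpha, beta) belief that the statement
    # "this coin lands heads" holds on any given flip.
    alpha, beta = 1.0, 1.0   # uniform prior: we know nothing yet
    true_heads_rate = 0.7    # the hidden truth the agent is trying to learn

    for _ in range(100):
        heads = random.random() < true_heads_rate  # seek out new data
        if heads:
            alpha += 1   # evidence for the statement
        else:
            beta += 1    # evidence against it

    print(round(alpha / (alpha + beta), 2))  # posterior mean, drifting toward 0.7 as data accumulates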

But when you put it that way, it doesn’t sound that much different than human intelligence.
