Farewell, Marvin Minsky (1927–2016)

Published in

Backchannel

13 min readJan 27, 2016

I think it was 1979 when I first met Marvin Minsky, while I was still a teenager working on physics at Caltech. It was a weekend, and I’d arranged to see Richard Feynman to discuss some physics. But Feynman had another visitor that day as well, who didn’t just want to talk about physics, but instead enthusiastically brought up one unexpected topic after another.

That afternoon we were driving through Pasadena, California — and with no apparent concern to the actual process of driving, Feynman’s visitor was energetically pointing out all sorts of things an AI would have to figure if it was to be able to do the driving. I was a bit relieved when we arrived at our destination, but soon the visitor was on to another topic, talking about how brains work, and then saying that as soon as he’d finished his next book he’d be happy to let someone open up his brain and put electrodes inside, if they had a good plan to figure out how it worked.

Feynman often had eccentric visitors, but I was really wondering who this one was. It took a couple more encounters, but then I got to know that eccentric visitor as Marvin Minsky, pioneer of computation and AI — and was pleased to count him as a friend for more than three decades.

Just a few days ago I was talking about visiting Marvin — and I was so sad when I heard he died. I started reminiscing about all the ways we interacted over the years, and all the interests we shared. Every major project of my life I discussed with Marvin, from SMP, my first big software system back in 1981, through Mathematica, A New Kind of Science, Wolfram|Alpha and most recently the Wolfram Language.

This picture is from one of the last times I saw Marvin. His health was failing, but he was keen to talk. Having watched more than 35 years of my life, he wanted to tell me his assessment: “You really did it, Steve.” Well, so did you, Marvin! (I’m always “Stephen”, but somehow Americans of a certain age have a habit of calling me “Steve”.)

The Marvin that I knew was a wonderful mixture of serious and quirky. About almost any subject he’d have something to say, most often quite unusual. Sometimes it’d be really interesting; sometimes it’d just be unusual. I’m reminded of a time in the early 1980s when I was visiting Boston and subletting an apartment from Marvin’s daughter Margaret (who was in Japan at the time). Margaret had a large and elaborate collection of plants, and one day I noticed that some of them had developed nasty-looking spots on their leaves.

Being no expert on such things (and without the web to look anything up!), I called Marvin to ask what to do. What ensued was a long discussion about the possibility of developing microrobots that could chase mealybugs away. Fascinating though it was, at the end of it I still had to ask, “But what should I actually do about Margaret’s plants?” Marvin replied, “Oh, I guess you’d better talk to my wife.”

For many decades, Marvin was perhaps the world’s greatest energy source for artificial intelligence research. He was a fount of ideas, which he fed to his long sequence of students at MIT. And though the details changed, he always kept true to his goal of figuring out how thinking works, and how to make machines do it.

Marvin the Computation Theorist

By the time I knew Marvin, he tended to talk mostly about theories where things could be figured out by what amounts to common sense, perhaps based on psychological or philosophical reasoning. But earlier in his life, Marvin had taken a different approach. His 1954 PhD thesis from Princeton was about artificial neural networks (“Theory of Neural-Analog Reinforcement Systems and Its Application to the Brain Model Problem”) and it was a mathematics thesis, full of technical math. And in 1956, for example, Marvin published a paper entitled “Some Universal Elements for Finite Automata”, in which he talked about how “complicated machinery can be constructed from a small number of basic elements”.

This particular paper considered only essentially finite machines, based directly on specific models of artificial neural networks. But soon Marvin was looking at more general computational systems, and trying to see what they could do. In a sense, Marvin was beginning just the kind of exploration of the computational universe that years later I would also do, and eventually write A New Kind of Science about. And in fact, as early as 1960, Marvin came extremely close to discovering the same core phenomenon I eventually did.

In 1960, as now, Turing machines were used as a standard basic model of computation. And in his quest to understand what computation — and potentially brains — could be built from, Marvin started looking at the very simplest Turing machines (with just 2 states and 2 colors) and using a computer to find out what all 4096 of them actually do. Most he discovered just have repetitive behavior, and a few have what we’d now call nested or fractal behavior. But none do anything more complicated, and indeed Marvin based the final exercise in his classic 1967 book Computation: Finite and Infinite Machines on this, noting that “D. G. Bobrow and the author did this for all (2,2) machines [1961, unpublished] by a tedious reduction to thirty-odd cases (unpublishable).”

Years later, Marvin told me that after all the effort he’d spent on the (2,2) Turing machines he wasn’t inclined to go further. But as I finally discovered in 1991, if one just looks at (2,3) Turing machines, then among the 3 million or so of them, there are a few that don’t just show simple behavior any more — and instead generate immense complexity even from their very simple rules.

Back in the early 1960s, even though he didn’t find complexity just by searching simple “naturally occuring” Turing machines, Marvin still wanted to construct the simplest one he could that would exhibit it. And through painstaking work, he came up in 1962 with a (7,4) Turing machine that he proved was universal (and so, in a sense, capable of arbitrarily complex behavior).

At the time, Marvin’s (7,4) Turing machine was the simplest known universal Turing machine. And it kept that record essentially unbroken for 40 years — until I finally published a (2,5) universal Turing machine in A New Kind of Science. I felt a little guilty taking the record away from Marvin’s machine after so long. But Marvin was very nice about it. And a few years later he enthusiastically agreed to be on the committee for a prize I put up to establish whether a (2,3) Turing machine that I had identified as the simplest possible candidate for universality was in fact universal.

It didn’t take long for a proof of universality to be submitted, and Marvin got quite involved in some of the technical details of validating it, noting that perhaps we should all have known something like this was possible, given the complexity that Emil Post had observed with the simple rules of what he called a tag system — back in 1921, before Marvin was even born.

Marvin and Neural Networks

When it came to science, it sometimes seemed as if there were two Marvins. One was the Marvin trained in mathematics who could give precise proofs of theorems. The other was the Marvin who talked about big and often quirky ideas far away from anything like mathematical formalization.

I think Marvin was ultimately disappointed with what could be achieved by mathematics and formalization. In his early years he had thought that with simple artificial neural networks — and maybe things like Turing machines — it would be easy to build systems that worked like brains. But it never seemed to happen. And in 1969, with his long-time mathematician collaborator Seymour Papert, Marvin wrote a book that proved that a certain simple class of neural networks known as perceptrons couldn’t (in Marvin’s words) “do anything interesting”.

To Marvin’s later chagrin, people took the book to show that no neural network of any kind could ever do anything interesting, and research on neural networks all but stopped. But a bit like with the (2,2) Turing machines, much richer behavior was actually lurking just out of sight. It started being noticed in the 1980s, but it’s only been in the last couple of years — with computers able to handle almost-brain-scale networks — that the richness of what neural networks can do has begun to become clear.

And although I don’t think anyone could have known it then, we now know that the neural networks Marvin was investigating as early as 1951 were actually on a path that would ultimately lead to just the kind of impressive AI capabilities he was hoping for. It’s a pity it took so long, and Marvin barely got to see it. (When we released our neural-network-based image identifier last year, I sent Marvin a pointer saying “I never thought neural networks would actually work… but…” Sadly, I never ended up talking to Marvin about it.)

Marvin and Symbolic AI

Marvin’s earliest approaches to AI were through things like neural networks. But perhaps through the influence of John McCarthy, the inventor of LISP, with whom Marvin started the MIT AI Lab, Marvin began to consider more “symbolic” approaches to AI as well. And in 1961 Marvin got a student of his to write a program in LISP to do symbolic integration. Marvin told me that he wanted the program to be as “human like” as possible — so every so often it would stop and say “Give me a cookie”, and the user would have to respond “A cookie”.

By the standards of Mathematica or Wolfram|Alpha, the 1961 integration program was very primitive. But I’m certainly glad Marvin had it built. Because it started a sequence of projects at MIT that led to the MACSYMA system that I ended up using in the 1970s — that in many ways launched my efforts on SMP and eventually Mathematica.

Marvin himself, though, didn’t go on thinking about using computers to do mathematics, but instead started working on how they might do the kind of tasks that all humans — including children — routinely do. Marvin’s collaborator Seymour Papert, who had worked with developmental psychologist Jean Piaget, was interested in how children learn, and Marvin got quite involved in Seymour’s project of developing a computer language for children. The result was Logo — a direct precursor of Scratch — and for a brief while in the 1970s Marvin and Seymour had a company that tried to market Logo and a hardware “turtle” to schools.

For me there was always a certain mystique around Marvin’s theories about AI. In some ways they seemed like psychology, and in some ways philosophy. But occasionally there’d actually be pieces of software — or hardware — that claimed to implement them, often in ways that I didn’t understand very well.

Probably the most spectacular example was the Connection Machine, developed by Marvin’s student Danny Hillis and his company Thinking Machines (for which Richard Feynman and I were both consultants). It was always in the air that the Connection Machine was built to implement one of Marvin’s theories about the brain, and might be seen one day as like the “transistor of artificial intelligence”. But I, for example, ended up using its massively parallel architecture to implement cellular automaton models of fluids, and not anything AI-ish at all.

Marvin was always having new ideas and theories. And even as the Connection Machine was being built, he was giving me drafts of his book The Society of Mind, which talked about new and different approaches to AI. Ever one to do the unusual, Marvin told me he thought about writing the book in verse. But instead the book is structured a bit like so many conversations I had with Marvin: with one idea on each page, often good, but sometimes not — yet always lively.

I think Marvin viewed The Society of Mind as his magnum opus, and I think he was disappointed that more people didn’t understand and appreciate it. It probably didn’t help that the book came out in the 1980s, when AI was at its lowest ebb. But somehow I think to really appreciate what’s in the book one would need Marvin there, presenting his ideas with his characteristic personal energy and responding to any objections one might have about them.

Marvin and Cellular Automata

Marvin was used to having theories about thinking that could be figured out just by thinking — a bit like the ancient philosophers had done. But Marvin was interested in everything, including physics. He wasn’t an expert on the formalism of physics, though he did make contributions to physics topics (notably patenting a confocal microscope). And through his long-time friend Ed Fredkin, he had already been introduced to cellular automata in the early 1960s. He really liked the philosophy of having physics based on them — and ended up for example writing a paper entitled “Nature Abhors an Empty Vacuum” that talked about how one might in effect engineer certain features of physics from cellular automata.

Marvin didn’t do terribly much with cellular automata, though in 1970 he and Fredkin used something like them in the Triadex Muse digital music synthesizer that they patented and marketed — an early precursor of cellular-automaton-based music composition.

Marvin was very supportive of my work on cellular automata and other simple programs, though I think he found my orientation towards natural science a bit alien. During the decade that I worked on A New Kind of Science I interacted with Marvin with some regularity. He was starting work on a book then too, about emotions, that he told me in 1992 he hoped “might reform how people think about themselves”. I talked to him occasionally about his book, trying I suppose to understand the epistemological character of it (I once asked if it was a bit like Freud in this respect, and he said yes). It took 15 years for Marvin to finish what became The Emotion Machine. I know he had other books planned too; in 2006, for example, he told me he was working on a book on theology that was “a couple of years away” — but which sadly never saw the light of day.

Marvin in Person

It was always a pleasure to see Marvin. Often it would be at his big house in Brookline, Massachusetts. As soon as one entered, Marvin would start saying something unusual. It could be, “What would we conclude if the sun didn’t set today?” Or, “You’ve got to come see the actual binary tree in my greenhouse.” Once someone told me that Marvin could give a talk about almost anything, but if one wanted it to be good, one should ask him an interesting question just before he started, and then that’d be what he would talk about. I realized this was how to handle conversations with Marvin too: bring up a topic and then he could be counted on to say something unusual and often interesting about it.

I remember a few years ago bringing up the topic of teaching programming, and how I was hoping the Wolfram Language would be relevant to it. Marvin immediately launched into talking about how programming languages are the only ones that people are expected to learn to write before they can read. He said he’d been trying to convince Seymour Papert that the best way to teach programming was to start by showing people good code. He gave the example of teaching music by giving people Eine kleine Nachtmusik, and asking them to transpose it to a different rhythm and see what bugs occur. (Marvin was a long-time enthusiast of classical music.) In just this vein, one way the Wolfram Programming Lab that we launched just last week lets people learn programming is by starting with good code, and then having them modify it.

There was always a certain warmth to Marvin. He liked and supported people; he connected with all sorts of interesting people; he enjoyed telling nice stories about people. His house always seemed to buzz with activity, even as, over the years, it piled up with stuff to the point where the only free space was a tiny part of a kitchen table.

Marvin also had a great love of ideas. Ones that seemed important. Ones that were strange and unusual. But I think in the end Marvin’s greatest pleasure was in connecting ideas with people. He was a hacker of ideas, but I think the ideas became meaningful to him when he used them as a way to connect with people.

I shall miss all those conversations about ideas — both ones I thought made sense and ones I thought didn’t. Of course, Marvin was always a great enthusiast of cryonics, so perhaps this isn’t the end of the story. But at least for now, farewell, Marvin, and thank you.