The Lisp approach to AI (Part 1)

Common Lisp code that creates a one-layer perceptron with n inputs and m units, taken from the code accompanying AIMA, a classic textbook in Artificial Intelligence.

If you are a programmer who reads about the history and trivia of this lovely craft, and you practice it ad honorem (just for fun), you have probably found yourself reading about a programming language called Lisp. Some praise it as a software miracle, the best tool for programming. Some even dare to call Lisp one of the best programming languages ever invented (even if that doesn’t make sense at all). After all, before Python, Scala, and Haskell there was programming, and before Deep Learning there was Artificial Intelligence. Here are some great hackers who love Lisp:

  • Paul Graham, co-founder of Y Combinator, is a big Lisp evangelist. He wrote his startup’s code in Lisp. Viaweb, the startup, was co-founded with Robert Tappan Morris, a legendary hacker who released one of the first Internet worms, allegedly by accident. After his rehabilitation as a worm writer (the worm was written in C), Morris wrote the server code of the small company in Common Lisp. Viaweb was sold to Yahoo! in 1998 for about $49 million. Of course, there’s not enough evidence yet to say that C leads you to jail while Lisp makes you a millionaire.
  • Alan Kay, a pioneer in the practical aspects of OOP and the lead developer of the original Smalltalk (another software miracle?), has called Lisp the greatest single programming language ever designed. He has also compared Lisp with Maxwell’s equations.
  • Edsger Wybe Dijkstra, a pioneer in the fields of formal verification and specification, concurrency theory, and operating systems design, whose most famous work is a greedy one, once said:
“Lisp has jokingly been called the most intelligent way to misuse a computer. I think that description is a great compliment because it transmits the full flavor of liberation: it has assisted a number of our most gifted fellow humans in thinking previously impossible thoughts.”
  • Robert Floyd, in his Turing Award lecture, said:
“Although my own previous enthusiasm has been for syntactically rich languages, like the Algol family, I now see clearly and concretely the force of Minsky’s 1970 Turing Lecture, in which he argued that Lisp’s uniformity of structure and power of self reference gave the programmer capabilities whose content was well worth the sacrifice of visual form.”
Some CS celebrities who have treated Lisp as a miracle (sometimes): a venture capitalist, a musician, the inventor of Dijkstra’s algorithm, and Robert Floyd (who was highly regarded by Donald Knuth).

Three of our luminaries, along with Marvin Minsky (the one Floyd refers to) and John McCarthy (the inventor of Lisp), received the Turing Award. So why do so many CS celebrities speak so highly of a simple programming language? Lisp is famous nowadays because of what others have said about it, but in the early days of AI, Lisp was the de facto language for expressing ideas related to natural language processing, computer-assisted geometry, text generation, AI planning, and automated theorem proving. Yes, there was AI before Machine Learning; indeed, there was an AI winter before the boom of neural networks and statistical approaches to AI, but that’s a topic that deserves an entire post of its own.

The Lisp approach to AI

John McCarthy, the inventor of the term “Artificial Intelligence”, the inventor of garbage collection, and the inventor of Lisp. Marvin Minsky, the founder of the AI lab at MIT.

The progress, development, and evolution of Lisp were tightly linked to the early progress, development, and evolution of Artificial Intelligence. Two of the people mentioned above were pioneers in AI: John McCarthy, the creator of Lisp, coined the term Artificial Intelligence, while Marvin Minsky shaped the content of the new field by founding the AI lab at MIT. Many of their students developed the first milestones of artificial intelligence.

Programs for natural language understanding and generation, game playing (the area in which the term Machine Learning was first introduced), theorem proving, early computer vision, symbolic mathematics (especially integration), problem solving, and knowledge representation were produced at Stanford and MIT, using different dialects of Lisp as the tool to express those ideas. Was it just a coincidence, or is there something special about the idea (not just the language) of Lisp? Here is a list of some classic AI programs that were expressed in Lisp.

A typical conversation between a human and ELIZA. The paper that introduced the program is called ELIZA — A Computer Program for the Study of Natural Language Communication between Man and Machine
A very simple session in MACSYMA.
  • SHRDLU was the dissertation project of Terry Winograd (later the PhD advisor of Larry Page at Stanford University) at MIT. It was written in the AI lab created by Minsky to demonstrate a dialog in which a human’s instructions led to actions taken by the machine in a virtual environment that both agents (the human and the machine) were capable of understanding. Like MACSYMA, SHRDLU was written in MacLisp.
A sample session in SHRDLU. The program was supposed to understand and execute actions told by a human in natural language.

The progress of AI in its early days was not because of Lisp; I do think CS subjects should be agnostic of the language they express their ideas in. Lisp was used in the early days of AI because it was flexible enough to allow quick experimentation and prototyping (through the REPL), and because it introduced or popularized fundamental ideas that were cool and fresh at the time (the IF-THEN-ELSE construct, recursion, and garbage collection). Those features proved useful for expressing the kind of ideas AI people needed to express. This innovation, and the rapid adoption of Lisp for AI (in labs and projects), helped the language grow and become a standard AI language.
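
The constructs mentioned above are easy to see in a few lines of Scheme, the dialect this series intends to use. This is only a minimal sketch: the procedure names (sign, factorial, range, sum) are made up for illustration and are not taken from any of the historical programs.

```scheme
;; A conditional is an expression: (if test then else) returns a value.
(define (sign x)
  (if (< x 0)
      -1
      (if (= x 0) 0 1)))

;; Recursion: a procedure defined in terms of itself.
(define (factorial n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))

;; Lists are allocated with cons and never freed by hand; the garbage
;; collector reclaims them once nothing refers to them anymore.
(define (range a b)
  (if (>= a b)
      '()
      (cons a (range (+ a 1) b))))

(define (sum xs)
  (if (null? xs)
      0
      (+ (car xs) (sum (cdr xs)))))
```

Typed at a REPL, (factorial 5) evaluates to 120 and (sum (range 0 5)) evaluates to 10; the intermediate list built by range simply becomes garbage afterwards.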

Of course, all these programs could have been written in other languages, but Lisp was an accepted and highly praised vehicle for exploring and implementing these kinds of ideas at the time.

Lisp in the real world

At this point, you may think that Lisp was just an academic invention for teaching and implementing symbolic AI programs. But the rapid adoption of Lisp in academia led to a substantial effort to use Lisp (or one of its descendants) in real-world, production-ready software. The following is a collection of some of those programs; most of them are still running in production environments, while the rest once backed large pieces of software in well-known projects or companies.

You need to know that Lisp and its dialects have evolved a lot since McCarthy first defined it, but most of the original idea of Lisp has been preserved in its descendants. Most of Lisp’s semantics have remained invariant across the implementations that have powered or supported, in one way or another, the following projects.

Some of the projects/companies whose stack has included Lisp.
  • After the pain Bernie Greenberg and Richard Stallman experienced while implementing a language to manipulate text in the TECO text editor, Greenberg decided to implement a whole new editor (written in MacLisp) with an interpreter that let users manipulate the text being edited. Because MacLisp offered poor portability, Stallman decided to reimplement the editor in C, keeping an interpreted language to customize both the text and the editor itself. That language is called Emacs Lisp.
  • Douglas Lenat, a persistent believer in the power of symbolic AI, worked on three famous AI programs over his career. The first one, Automated Mathematician (AM), made heavy use of Lisp’s ability to represent programs as data (you’ll understand this later) to define a bunch of mathematical concepts that could serve as a basis for solving math problems. Its sequel, Eurisko, written in RLL-1, a language written in Lisp, sought to extend AM’s potential to other fields by working with heuristics (an abstract concept that’s hard to define using a programming language). Lenat’s frustration while working on Eurisko led him to start Cyc in 1984, a project to give machines common sense that was supposed to enable computers to perform human-like reasoning, and later his own company, Cycorp, Inc., to continue it. This effort, often described as impractical, was launched in 2014.
  • Two of our favorite sources of information, Reddit and Hacker News, were or are powered by Lisp. Reddit was originally written in Common Lisp, the “standard” Lisp dialect, but it was rewritten in Python in late 2005. Hacker News is itself powered by Arc, a programming language written by Paul Graham using the Racket programming language (another descendant of Lisp).
  • Planning and logistics are hard problems because of the size and number of variables involved. AI has been able to deal with those problems by finding “clever” ways to optimize search in complex data structures. With this in mind, the U.S. military chose to simulate the feasibility of strategies for supply or personnel transportation using the DART program, written in Common Lisp; DART was used in the Gulf War, where it produced large budget savings.
  • Besides military usage, planning and scheduling with AI have found a place in industrial software. Routific is a route-optimization-as-a-service startup whose routing engine, written entirely in Common Lisp, plans optimal routes for delivery companies, minimizing time and fuel. ITA Software, a company acquired by Google in 2011, offers its customers a simple travel search engine that finds cheap flights while taking several variables into account. ITA Software makes use of sophisticated algorithms expressed in Common Lisp.
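
The “programs as data” property mentioned in the list above (in the entry about AM) can be sketched in a few lines of Scheme. This is an illustration only, not how AM worked: the names expr, evaluate, and swap-operators are hypothetical, and the tiny evaluator handles nothing beyond numbers and the binary operators + and *.

```scheme
;; A quoted expression is an ordinary list, not a running computation.
(define expr '(+ 1 (* 2 3)))

;; Walk the list structure and compute its value.
;; Only numbers and binary + and * are handled in this sketch.
(define (evaluate e)
  (cond ((number? e) e)
        ((eq? (car e) '+) (+ (evaluate (cadr e)) (evaluate (caddr e))))
        ((eq? (car e) '*) (* (evaluate (cadr e)) (evaluate (caddr e))))
        (else (error "Unknown expression:" e))))

;; Because the program is data, another program can rewrite it:
;; here every + becomes * and every * becomes +.
(define (swap-operators e)
  (cond ((number? e) e)
        ((eq? (car e) '+) (cons '* (map swap-operators (cdr e))))
        ((eq? (car e) '*) (cons '+ (map swap-operators (cdr e))))
        (else e)))
```

Here (car expr) is the symbol +, (evaluate expr) returns 7, and (swap-operators expr) returns the new program (* 1 (+ 2 3)), which evaluates to 5.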

One of Lisp’s main virtues is that it enables a programmer to create new linguistic abstractions with ease. So it should be no surprise that Lisp has influenced many popular programming languages. Two of them, both very close to the AI/Data Science/ML community, are R and Julia.

  1. R : Past and Future History
  2. Back to the Future: Lisp as a Base for a Statistical Computing System
  3. R: A Language for Data Analysis and Graphics
  • Julia’s development was heavily inspired by the same Lisp dialect that inspired R. That influence was so big that the language developers decided to write some parts of the language pipeline in it. The Julia parser is written entirely in Scheme, and it is evaluated using a Lisp dialect written by one of the language designers (femtolisp).
  • Another language that’s worth mentioning is Lush, a scientific object-oriented programming language designed to prototype numerical analysis, computer vision and machine learning programs. It was designed and implemented by Yann LeCun, the man behind the introduction of Convolutional Neural Networks to Computer Vision (along with Kunihiko Fukushima), and the current director of Facebook’s AI lab.
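
The earlier claim that Lisp lets a programmer create new linguistic abstractions with ease can be made concrete with macros. Below is a minimal sketch in Scheme using define-syntax and syntax-rules; the names my-unless, my-while, and sum-below are invented for this example.

```scheme
;; A new conditional form: run the body only when the condition is false.
(define-syntax my-unless
  (syntax-rules ()
    ((_ condition body ...)
     (if condition #f (begin body ...)))))

;; A new looping form, built on top of a named let (that is, recursion).
(define-syntax my-while
  (syntax-rules ()
    ((_ condition body ...)
     (let loop ()
       (if condition
           (begin body ... (loop))
           #f)))))

;; The new forms are used exactly like built-in syntax.
(define (sum-below n)
  (let ((i 0) (total 0))
    (my-while (< i n)
      (set! total (+ total i))
      (set! i (+ i 1)))
    total))
```

With these definitions, (sum-below 5) evaluates to 10 and (my-unless (> 1 2) 'ran) evaluates to the symbol ran. Neither form needed any change to the language implementation, which is the point of the argument.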

If Lisp is so great, why isn’t TensorFlow’s main language Lisp?

Most of the programs mentioned earlier made heavy use of symbolic manipulation. As mentioned by Carlos E. Perez in his post The Many Tribes of Artificial Intelligence, before ML and the neural network boom there were symbolic approaches to AI, which manipulated symbols according to a collection of rules designed to capture the behavior of an intelligent system. The problem in those days was not the efficient computation of numerical problems, but the manipulation and synthesis of symbols.

Just as C, C++, and Fortran shine in numerical computation, where performance matters the most, Lisp shines in symbolic manipulation. One of Lisp’s greatest strengths is its ability to handle symbols and lists efficiently.
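
To give a flavor of what symbolic manipulation means, here is a minimal sketch of symbolic differentiation in Scheme. It is an assumption-laden toy, not code from MACSYMA or any program mentioned above: expressions are limited to numbers, symbols, and binary sums and products, and the helper names (make-sum, make-product, deriv) are chosen for illustration.

```scheme
;; Build a sum, simplifying the obvious cases.
(define (make-sum a b)
  (cond ((and (number? a) (number? b)) (+ a b))
        ((and (number? a) (= a 0)) b)
        ((and (number? b) (= b 0)) a)
        (else (list '+ a b))))

;; Build a product, simplifying multiplication by 0 and 1.
(define (make-product a b)
  (cond ((and (number? a) (number? b)) (* a b))
        ((or (and (number? a) (= a 0))
             (and (number? b) (= b 0))) 0)
        ((and (number? a) (= a 1)) b)
        ((and (number? b) (= b 1)) a)
        (else (list '* a b))))

;; Differentiate expression e with respect to the symbol var.
(define (deriv e var)
  (cond ((number? e) 0)
        ((symbol? e) (if (eq? e var) 1 0))
        ((eq? (car e) '+)                       ; sum rule
         (make-sum (deriv (cadr e) var)
                   (deriv (caddr e) var)))
        ((eq? (car e) '*)                       ; product rule
         (make-sum (make-product (cadr e) (deriv (caddr e) var))
                   (make-product (deriv (cadr e) var) (caddr e))))
        (else (error "Unknown expression:" e))))
```

For example, (deriv '(+ (* x x) 3) 'x) returns the list (+ x x), and (deriv '(* 3 x) 'x) returns 3. The input and the output are both plain lists of symbols, so no numeric computation is needed until a value for x is supplied.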

Lisp is not a perfect language. It has many flaws (lots of dialects, a lack of well-known libraries, a weird syntax that does not help attract people, dynamic typing, etc.), but it was a well-suited tool for the problems AI pioneers were trying to tackle in those days, in the same way that C/C++ or Fortran are a good choice for implementing the underpinnings of a Deep Learning system (TensorFlow is implemented in both C++ and Python). There is no Swiss Army knife among programming languages; we need to pick the language that best suits the particular task at hand.

Exploring AI with Lisp

The whole idea of this series is to use Lisp, more specifically its dialect Scheme, to explore ideas related to Artificial Intelligence (AI is much more than programming, and AI programming is much more than Lisp). The goal is to learn together about classical AI concepts such as general problem solving, text generation, symbolic mathematics, knowledge representation, expert systems, search, NLP, logical and stochastic reasoning, game playing, and even “contemporary” stuff such as neural networks, using the Scheme programming language to express those ideas.

Let’s begin our journey exploring Artificial Intelligence using Lisp. Your homework for the next post in the series is to install MIT-Scheme on your machine.