On Lisp in 15 minutes

Iwo Herka
Iwo Herka
May 24, 2019 · 16 min read
Image for post
Image for post

LISP is simple and difficult, elegant and ad hoc; it is a beautiful blend of foresight and fortuity.¹

Whether you agree with this statement or not, Lisp’s historical importance is indisputable. Even though a beautiful language in its own right, Lisp turned to be one of the most influential models of computer programming. It also gave birth to many widespread concepts in the industry today and, as mainstream is steadily shifting from von Neumann style to λ-calculus, we can suspect many to be adopted in the future. In effect, Lisp is of interest to both a historian and a futurist. Paul Graham suggested that what John McCarthy did for programming was something like what Euclid did for geometry². What is so important that McCarthy discovered? To answer this question we have to put aside decades’ worth of Internet mythos and take a look at Lisp’s humble beginnings — Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I³ by John McCarthy. In this article, I will follow his steps in order to explain how Lisp was constructed.

I’m writing this article to practice my understanding and appreciation of McCarthy’s work. The goal of this article is to explain how Lisp came to be and discuss properties that make it special. Hopefully, someone will find it useful. I highly recommend reading the original paper, and Paul Graham’s take on it, after this short introduction (see References).

1. Preliminaries

1.1. Partial functions

def one_nth(n):
return 1 / n

or, a bit more involved:

def query_purchases(client: BillingClient) -> List[Purchase]:
if client.is_initialized:
return client.query_purchases()
raise Exception('Billing client is not set up')

The behavior of these functions (in a mathematical sense) is undefined for certain inputs, as passing zero to one_nth or uninitialized client to query_purchases will result in an exception. Another possibility is that functions may not terminate, e.g. when caught in infinite recursion. It derives from the fact that the type systems are often too weak to constrain the type as much as the programmer would like or the penalty of assigning more precise type is too high. In mathematics, we don’t have this problem. In case of one_nth, we could simply say that the domain of the function is ℝ \ {0}.

One example of a more “expressive type” system is Idris’, which supports dependent types. Although dependent types are a story for another time, in short, it simply means that types themselves are dynamic and first-class, in the sense that they possess properties related to values they take. For example, a function taking argument N may return the type of list which is no longer than N.

In general, any function that can result in an error, or does not terminate for some of its inputs, is considered partial.

In some languages, such as Haskell, a computation which never completes successfully is referred to as “bottom” and denoted with “⊥”. Therefore, when the evaluation of a partial function results in a partial behavior (such as a crash), its value is ⊥.

In Racket, Lisp dialect, the programmer can explicitly mark a function as partial.

Sometimes, when it is desirable to suppress errors and exceptions, a special value might be returned acting as a bottom, such as IEEE floating point’s not-a-number (NaN).

Analogically, when a function is total, we can say that it produces a valid result for all possible inputs. For example, the function not is obviously defined for both True and False, which are the only possible values of the bool type. Total functions allow a programmer to reason about programs using the types alone; a function with the type A B implies that it is always possible to get a B when you have an A.

1.2. Predicates

1.3. Pure functions

1.4. Conditional Expressions

Image for post
Image for post

The function above is a well-known absolute value. Because its value depends on the condition (whether x is greater-or-equal to zero), we can say that it is conditional in its behavior. However, this notation is not suited for expressing symbolically the dependence of quantities on truth values — it has to rely on English words.

A better notation for representing conditional expressions was proposed by McCarthy (now referred to as McCarthy formalism):

(p₁ → e₁, … , pₙ → eₙ)

Here, the p’s are propositional expressions and the e’s are expressions of any kind. It may be read as:

If p₁ then e₁ otherwise if p₂ then e₂, … , otherwise, if pₙ then eₙ.

Notice how we are assuming here the evaluation order — we consider pᵢ from left to right. With this notation under our belt, we can express the absolute value function simply as:

(x ≥ 0 → x, x < 0 → -x)

More importantly, however, this notation is sufficient to express any if-then statement. For example:

if n % 3 == 0 and n % 5 == 0:
return 'FizzBuzz'
elif n % 3 == 0:
return 'Fizz'
elif n % 5 == 0:
return 'Buzz'
return n

…can be expressed as:

(3 ∣ n ∧ 5 ∣ n → 'Fizzbuzz',
3 ∣ n → 'Fizz',
5 ∣ n → 'Buzz',
t → n)

m | n means that n is divisible by m, or m divides n. is standard logical conjunction, or AND in Boolean algebra.

Notice how by exploiting the left-to-right evaluation order we can encode the else clause using the t value.

Conditionals were introduced by Lisp. In effect, if statements, as we know them today, are thanks to McCarthy. As of 1958, Fortran only had a conditional goto which was based on the branch instruction in the underlying hardware. McCarthy also got conditionals into Algol 60, whence they spread to most other languages⁸.

1.5. Recursive functions and label

n! = (n = 0 → 1, t → n · (n − 1)!)

Note, however, that we are making here one very important assumption — the body of the function can reference the function itself by its name (or symbol — ! in this case).

In most actual languages this is not a problem, but remember — we do not have any language yet!

In general, we can solve this problem with something called Y combinator, but that’s a topic for another time. As McCarthy most likely was not aware of Y combinator at the time of writing his paper, he introduced his notation instead.

Image for post
Image for post
Fig 1. McCarthy in his AI laboratory at Stanford

label(a, e) denotes the expression e, with the special property that all occurrences of a within e are to be interpreted as referring to the expression e itself. Therefore, by combining label with conditional expressions, we can define Euclid’s algorithm for calculating the greatest common divisor as follows:

label(gcd, (m > n → gcd(n, m),
m | n → m,
t → gcd(n % m, m)))

It is important to note that although recursion existed as a mathematical concept before Lisp, Lisp was the first programming language to support it.

2. S-expressions

  • an atom, which is a sequence of characters, e.g. spam, ham, eggs or f*&o$o. An atom cannot start with (, which is reserved for lists.
  • a list of zero or more s-expressions which are separated by whitespace and enclosed by parentheses. For example: (abc) , (), (a (bc)).

In simplified extended Backus-Naur form (commonly used for describing the syntax of a language), s-expressions can be represented as:

Image for post
Image for post
Fig. 2 Simplified EBNF for s-expressions

— which is the exact restatement of the definition given above (except that atoms are assumed to be alphanumeric).

The above definition is recursive, as we are referring to s-expressions while defining lists of s-expressions. Some examples of valid s-expressions include:

(foo bar)
((spam) (ham ()) eggs ())

— but also:

(+ 1 2 3 5 8 13)
(> 24 512 12)
(and true false true)

3. Abstract machine

Image for post
Image for post
Fig. 3 ST µA741 op-amp, or: imaginary Lisp machine

3.1. Elementary functions

To keep our notation simple, we represent operators using atoms and lists — if an s-expression is a list, the first element is an operator, and remaining ones are arguments. This means that functions on s-expressions are encoded using s-expressions themselves. For example, the car operator expects one argument, and so its invocation will take the form:

(car x)

Atom “car" on its own refers to the function itself. If car expected an argument which is a function (it doesn’t), we could write (car car).


> (atom f)
> (atom foo)
> (atom atom)


> (quote x)
> (quote (atom x))
(atom x)
> (quote (1 2 3 4 5))
(1 2 3 4 5)

It is crucial to understand how quote behaves and why is useful. The second example — (quote (atom x)) —will reduce to the list containing atom and x. (atom x) is not “evaluated” further (at this point we do not know what evaluation even is, but let us ignore that for a second). In essence, quote is used to stop the evaluation — normally in(1 2 3), 1 would be assumed to be an operator but we want to treat the list as a regular list. This is why we would use quote. It is usually abbreviated with ', which gives us:

> '(1 2 3)
(1 2 3)

Now that we have quote , we can go back to atom for one more example:

> (atom '(foo bar baz))

If the code would eventually be evaluated by some kind of program capable of doing that, quote would prevent evaluation of any s-expression given as its argument, which is very important when passing data around.


Some languages, such as Scheme, use eq?. Clojure uses =.


> (car '(1 2 3 4))

Certain languages, such as Clojure, rename car with something more intuitive, e.g. first. The “car” and “cdr” names are derived from the machine architecture that Lisp was designed for and nowadays languages use more familiar names.


> (cdr '(1 2 3 4))
(2 3 4)

In Clojure, cdr behavior is provided viarest.


> (cons 1 '(2 3))
(1 2 3)

Why would we prepend to lists instead of appending them is a topic on its own. In short, lists constructed as the head and rest are simpler to manipulate recursively, which is a huge benefit for a language such as Lisp.


(cond (p₁ e₁) ... (pₙ eₙ))

The value is evaluated as follows. We evaluate p expressions from left to right. If any predicate evaluates to t, the value of the cond expression is taken to be the value of the corresponding expression e. For example:

> (cond (t 1) (f 2))

Moreover, because we evaluate from left to right:

> (cond (t 1) (t 2))

3.2. Denoting non-elementary functions

(lambda (p₁ … pₙ) e)

where p₁ -pₙ are called parameters and e is an expression. If you ever tinkered with lambda calculus (or with lambdas/anonymous functions in pretty much any modern programming language), you will recognize this notation.

An expression whose first element is a lambda, i.e., is of the form:

Image for post
Image for post

— is called a function call and its value is computed as follows. Each expression aᵢ is evaluated. Then e is evaluated. During the evaluation of e, the value of any occurrence of one of the pᵢ is the value of the corresponding aᵢ in the most recent function call.

Do you remember label notation? We can now translate it to s-expression form and use it on our abstract machine to obtain a general method for defining functions. The label s-expression will have the following form:

(label name lambda)

— where name is an atom representing the name of the function and lambda is the function body, also called to form.

Using label and the Magnificent Seven our expressive power is limitless.

For example, we can define the Boolean operator NOT:

(label not (lambda (x)
(cond (x f) (t t))))
> (not t)
> (not f)

As you can see, NOT expects one argument — x. In the body of the function, we perform a conditional check of x. If the value is equal to t, we return f and otherwise t (t in the last positional of the cond acts as the else clause in modern languages, since the condition is always true).

Next, we can define a bit more involved Boolean AND:

(label and (lambda (x y)
(cond (x (cond (y t) (t f)))
(t f))))
> (and t f)
> (and t t)

This body can be summarized as: If x is true, check if y is true and if so, return t. If y is not true, or x is not true, return f (we have two else clauses).

We can write functions to manipulate lists, Boolean expressions, pair elements, rewrite expressions, and so on. For example:

  • (not x) returns t if its argument returns f and t otherwise.
  • (pair x y) takes two lists of the same length and returns a third one containing two-element lists of corresponding elements in argument lits. Nowadays we would more likely call this function zip.
  • (append x y) takes two lists and concatenates them.
  • (assoc x y) takes an atom x and a y of the form return by pair and returns a second-element from a pair in y whose first element is x. This function emulates a lookup into a dictionary (alternatively called a hash table or associative array).

It is important to note that in Lisp, functions are first class objects — they’re data, just like numbers and strings. Therefore, functions can be stored in variables, passed around as arguments, and so on. The idea of a first-class function was pioneered by Lisp.

3.3. Evaluation

That is a bit of mouthful. If you want a step by step description, which is beyond the scope of this article, I recommend reading Recursive Functions… by McCarthy³ and Paul Graham’s The Roots of Lisp², which take a thorough look at eval‘s inner-workings.

The important thing is this: eval is a Lisp interpreter, and thus, implements the language. Any valid Lisp code can be passed to it for evaluation and we don’t need any other infrastructure. With only seven functions and a bit of notation, we were able to implement the whole language and allow it to define new functions. That’s neat and very elegant, but so what? Now that we understand how Lisp operates, let’s step back and consider what it means.

If you’re curious how Lisp can be implemented in a few hundred lines of code, check out my toy Lisp implementation: Kite.

4. Chemical X: Homoiconicity

The big deal, as it turns out, that when the representation of the program (code) is the same as the representation for that data, you can manipulate code like you would manipulate data. This allows, for example, for syntactic extensibility, which is the ability to extend (or completely) change the syntax of the language. For example, in a normal language, if you wished that you had the for-each construct, you would have to wait until creators of the language add it in the next version. In Lisp, you can write a macro in a matter of minutes. In many Lisp languages, program transformation is an explicit part of the evaluation strategy of the language.

Moreover, in Lisp, programs composed of expressions. Lisp programs are trees of expressions, each of which returns a value. At the time, this was a revolutionary idea, in contrast to languages such as Fortran which distinguish between expressions and statements. This was largely dictated by the limitations of the hardware available in the late 1950s.

Finally, what McCarthy discovered is not just a programming language but a very elegant model of computation, just a notch above lambda calculus. We can say that Lisp was discovered, and not invented, because McCarthy built it from axioms so fundamental that it would be very hard to make an argument for the otherwise.

Core Lisp corresponds to an untyped, call-by-value lambda calculus extended with constants. As it turned out, lambda calculus* corresponds to natural deduction in logic. Specifically, by Curry-Howard isomorphism, types in a program are propositions, programs are proofs and evaluation of programs is the simplification of proofs. This discovery is phenomenal. Among others, it means that we can reason about programs using logic and that we get theorems in logic for free in programming (and vice versa).

The big difference between Lips and lambda calculus is that lambda calculus doesn’t contain any non-functional data. In lambda calculus, everything is a function, even a number (if you’re curious how this is achieved see Church encodings or Mogensen-Scott encodings). Lisp, on the other hand, operates on atoms and numbers (hence, constants). This is most likely what makes Lisp readable to mortal humans — a property that lambda calculus lacks.

The adherence to mathematically sound principles of computation allows Lisp to reap the benefits in the form of simple semantics, clarity, consistency, and other nice things such as referential transparency (original Lisp formulation is completely value-based). In contrast, almost all imperative languages are “just” tools (sometimes very good tools nonetheless) with undefined formal semantics, which makes reasoning about them rigorously very hard or almost impossible.

Thank you for reading. Please check out the references and maybe my other stuff. If you have any questions, please feel free to send me an e-mail at hi@iwoherka.eu or leave a comment.

*Specifically, typed lambda calculus. For more details, I highly recommend great talk by Philip Wadler, co-creator of Haskell — Propositions as Types.

Makimo Tech Blog

Makimo, a bespoke software development house @ Łódź, PL

Thanks to Mateusz Papiernik, Michał Moroz, and Marcin Struś

Iwo Herka

Written by

Iwo Herka

Senior software developer at makimo.pl. Functional programming enthusiast. You can find me on github.com/iwoherka.

Makimo Tech Blog

Makimo is a bespoke software development house located in Lodz, Poland with main focus on web development with Python & Clojure. Here we share what we know, learn, discover or struggle with when we craft our software.

Iwo Herka

Written by

Iwo Herka

Senior software developer at makimo.pl. Functional programming enthusiast. You can find me on github.com/iwoherka.

Makimo Tech Blog

Makimo is a bespoke software development house located in Lodz, Poland with main focus on web development with Python & Clojure. Here we share what we know, learn, discover or struggle with when we craft our software.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store