The spectra of programming languages

Hong Jiang
Ruminations on Programming
11 min read · Mar 14, 2019

People use many different criteria to classify programming languages, and I want to talk about a few here. They are often treated as different axes in the space where languages reside. However, it is important to realize that those axes are not orthogonal to each other; for example, a lazy language is probably also a functional language. Nor do most languages occupy a single point in the space. I'd like to think of these criteria as features of programming languages. Most modern languages have multiple (even opposing) features and are sometimes called multi-paradigm languages.

Imperative versus functional

Most people think of imperative and functional when talking about types of languages.

Imperative programs work by executing statements that change a program's state: the contents of the memory locations that store its data. Imperative programming is based on the Turing machine model, on which most computer hardware is also based. Most procedural languages, including C and Pascal, and object-oriented languages such as C++ and Java are considered imperative. Classes and objects are just constructs used to group together data and the procedures that access or modify it.
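
To make the idea concrete, here is a minimal sketch in Rust (not tied to any of the languages above): the program is a sequence of statements, and each one updates state held in memory.

fn main() {
    let mut total = 0;         // a memory location holding program state
    for i in 1..=5 {
        total += i;            // each statement changes the state
    }
    println!("{}", total);     // prints 15
}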

The Turing machine was invented by Alan Turing as a model of computation. It is an abstract machine which reads and writes symbols on a strip of infinite tape according to a finite set of rules. Despite its simplicity, the Turing machine characterizes the power and limitations of computation: it has been shown that any task which can be accomplished by any computing device can be simulated by a Turing machine. Therefore, if a device can simulate a Turing machine, it is as powerful as any computer in terms of computability, though not necessarily in terms of efficiency.

Functional programming treats computation as the evaluation of mathematical functions. A mathematical function is pure: its result depends only on the arguments passed to it, so calling it with the same arguments always yields the same result. A function should also be free of side effects, meaning it should not have any observable effect (such as modifying a non-local or static variable or doing IO) besides returning a value. The model of computation underlying these languages is the λ-calculus (lambda-calculus). Because of the disparity with the underlying hardware and the need to interact with external systems, most languages have some non-functional elements. Languages that attempt to be purely functional use concepts like monads to model state and IO.
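
As a small illustration, here is a sketch in Rust (the names are made up): the first function is pure, the second is not.

// Pure: the result depends only on the argument, and nothing else is observable.
fn square(x: i64) -> i64 {
    x * x
}

// Not pure: besides returning a value, it performs IO, an observable side effect.
fn square_logged(x: i64) -> i64 {
    println!("squaring {}", x);
    x * x
}

fn main() {
    println!("{}", square(3));        // always 9, no matter how many times we call it
    println!("{}", square_logged(3)); // also 9, but it prints along the way
}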

λ-calculus was invented by Alonzo Church. It expresses computation by constructing terms and performing reductions on them. Terms are constructed from variables, function definitions, and function applications, and the reduction operations include renaming function parameters (α-conversion) and substituting arguments for function parameters (β-reduction). The λ-calculus can simulate any Turing machine, so it is also a universal model of computation.
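
As a tiny illustration, in informal notation (arithmetic is not primitive in the pure λ-calculus, but it keeps the example readable), applying the function λx. x + 1 to the argument 2 β-reduces by substituting 2 for x:

(λx. x + 1) 2  →  2 + 1  →  3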

My first introduction to functional programming was the book The Haskell School of Expression by Professor Paul Hudak, one of the principal designers of the purely functional language Haskell. Whenever I had the opportunity to talk to him about another language, he would immediately say with a smile "it's not functional" or "it's not pure", depending on how functional that language was.

Many imperative languages borrow tools from functional languages to make it easier to reason about manipulating complex data structures like lists, maps, and trees. You can also do functional programming in any language as long as you stick to the rules that functions should avoid side effects and results should only depend on arguments, even if the language does not enforce it.

In real-world projects, no matter what language is used, it is good practice to write most of the application logic in functional style and establish a clear boundary with the non-functional parts. This makes it easier to write tests and to track down bugs.

Strict versus lazy

Lazy evaluation is an evaluation strategy that delays the evaluation of expressions until their values are needed. Values that are not used later are never computed, and the same expression is not evaluated repeatedly. In contrast, strict evaluation means the values of function arguments are always computed before the function is applied, even if some arguments are never used by the function.
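
For example, Rust is strict. In the following sketch (the function names are made up), expensive() is evaluated and prints its message even though first() never looks at its second argument:

fn first(a: i32, _b: i32) -> i32 {
    a
}

fn expensive() -> i32 {
    println!("doing expensive work"); // evidence that the call actually happened
    42
}

fn main() {
    // Arguments are evaluated before first() is applied, so the message is printed.
    println!("{}", first(1, expensive()));
}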

Lazy evaluation can lead to elegant and efficient definitions of infinite data structures. For example, you're probably familiar with the Fibonacci sequence:

  • fib[0] = 0
  • fib[1] = 1
  • fib[n] = fib[n-1] + fib[n-2]

If we write a function in Rust to compute the nth Fibonacci number according to the recursive definition above:

fn fib(n: i32) -> i32 {
    match n {
        0 => 0,
        1 => 1,
        n => fib(n - 1) + fib(n - 2),
    }
}

It's straightforward and pretty close to the definition, but it's very expensive. For n > 1, each call to fib causes two more calls to itself, with the argument decremented by only 1 and 2 respectively. I'll try not to get into too much math here, but it's easy to see that the number of function calls grows much faster than n. Worse, most of the calls are repeated computation. You can insert println!("fib({})", n); as the first line of fib() to see how many times, and with what arguments, it's called. fib(5) causes 15 calls but there are only 6 unique arguments; fib(10) causes 177 calls with 11 unique arguments. To get rid of the unnecessary computation, one has to replace the recursion with a clumsier loop.
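
Such a loop might look like the following sketch. It performs only n additions, but it no longer reads like the mathematical definition:

fn fib_iter(n: u32) -> u64 {
    let (mut a, mut b) = (0u64, 1u64); // fib[0] and fib[1]
    for _ in 0..n {
        let next = a + b;
        a = b;    // slide the window forward by one
        b = next;
    }
    a             // after n steps, a holds fib[n]
}

fn main() {
    println!("{}", fib_iter(10)); // prints 55
}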

In Clojure, the Fibonacci sequence can be defined as:

(def fib (lazy-cat [0 1] (map + fib (rest fib))))

It defines a sequence named fib. lazy-cat produces a lazy sequence which is the concatenation of its arguments; its first argument [0 1] supplies the first two Fibonacci numbers. map applies its first argument, a function, element-wise to the items of the remaining arguments, which are sequences. rest returns the sequence starting from the second item of its argument, which should be a sequence. So the expression (map + fib (rest fib)) evaluates to the infinite sequence fib[0] + fib[1], fib[1] + fib[2], fib[2] + fib[3], ... Notice that fib is a sequence instead of a function, and it refers to itself in its own definition. Only in a language that supports lazy evaluation can such an infinite, self-referential data structure be defined.

To get fib[10], just use:

(nth fib 10)

Besides its elegance, the Clojure version is also efficient. Each item in fib is only computed once. This is true across multiple calls to (nth fib ...). To compute (nth fib 10), fib[0] through fib[10] are computed, then when we call (nth fib 15), only fib[11] through fib[15] are computed, because the earlier items are already available.

Every powerful language feature gives programmers new ways to shoot themselves in the foot, and lazy evaluation is no exception. Suppose we are doing some computation with the first 10 Fibonacci numbers:

(let [x (take 10 fib)]
  (compute-intensively x))

Here (take 10 fib) just returns the first 10 items of fib. Suppose we didn't get the expected result from compute-intensively. We suspect something's wrong with the argument, so we want to print out each item of x. We just learned about a neat map function that applies another function to a sequence; let's put it to good use. After all, real programmers use higher-order functions. For-loops are for newbies.

(let [x (take 10 fib)]
  (map println x) ; <- added
  (compute-intensively x))

But when the program runs, we don't see anything on the console. Why? Remember that map returns a lazy sequence. Because that sequence is not used anywhere, it is simply discarded: its items never get evaluated, and println is never called.

You might think that it is enough to make sure everything you intend to happen gets evaluated, but that’s not all. Clojure has a range function that generates sequences of natural numbers. (range 100) generates the first 100 natural numbers. What does the following code do?

(take 5 (map println (range 100)))

This evaluates and returns the first 5 items of (map println (range 100)), so we will definitely see 0 to 4 printed, right? On my computer, evaluating the above expression prints 0 to 31. Because items close to each other are usually accessed within a short time span, the implementation may decide to compute a lazy sequence in chunks if it determines that doing so is more efficient than evaluating each item separately. Whether and when chunking happens, and the chunk size, are usually not part of the language specification and are left to the implementers. They could depend on how the sequence is generated, the compiler version, the memory and cache size, or the hardware architecture.

In general, the only assumption one should make about lazy evaluation is that expressions whose values are used are evaluated at or before the moment they are needed. There is no guarantee about whether, or when, unused expressions are evaluated. Therefore it is best to use only pure functions with lazy evaluation. If you need to perform actions with side effects, use imperative iteration, such as Clojure's doseq.

If a language uses lazy evaluation everywhere by default, it is called a lazy language. Haskell is the most well-known lazy language. Some strict languages such as Clojure provide support for lazy data structures but use strict evaluation everywhere else.

With these examples, I want to highlight the point that the design and implementation of programming languages is like other complex engineering problems: it's about making the right trade-offs for the target audience and problem domain. Each feature or tool may make some things easier but usually brings other complications into the picture. Some new languages and frameworks can make certain kinds of programming easier, but they don't change the fact that, in general, writing correct programs is hard.

Compiled versus interpreted

Many people consider languages like C, C++, and Rust to be compiled languages, because programs written in those languages are compiled to machine code by a compiler. They consider languages like Python, Ruby, and Perl to be interpreted languages, because programs written in those languages can be executed with an interpreter without first going through a compilation phase.

The distinction is actually not well-defined. In theory, any language can be compiled or interpreted. What people usually mean by compiled languages or interpreted languages has more to do with how the languages happen to be implemented than with the languages themselves. Even at the implementation level, there is no clear boundary. Interpreters for Ruby and Python actually do compile the program dynamically at runtime to improve performance. Many compilers emit lower-level code for platforms such as the JVM, LLVM, or WebAssembly to preserve portability, and that lower-level code is then interpreted, or compiled further, at runtime.

Type systems: dynamic versus static, strong versus weak

Almost every practical programming language has a type system that specifies how to assign types to the various constructs in the language and how constructs of those types interact with each other. Most programmers characterize type systems with two sets of properties. One has to do with when the rules of the type system are enforced (aka type checking): dynamic or static. The other has to do with how much of a safety guarantee the type system provides: strong or weak.

This is a confusing topic. I've run into articles on multiple otherwise reputable websites claiming that static typing means variables need to be declared before use, and dynamic typing means otherwise. This is very misleading. Haskell is a statically-typed language, yet programmers don't need to declare the type of each name; types are inferred from how the names are used. Its optional type annotations are mostly for the benefit of human readers rather than the compiler. I've also seen articles that equate dynamic with weak and static with strong. This is also wrong. A dynamically-typed language can be more strongly-typed than a statically-typed language.

To understand the distinction, we need to separate names and values. A name is the identifier you use in a program to refer to entities like objects and functions. Values are the entities themselves. A name can refer to different values at different times, and a value can be referred to by different names.

In statically-typed languages, names have types, and the type of a name typically cannot change within its scope. A name of a certain type can only refer to values of that type. In a statement like a = sum(b, c), the names a, b, c, and sum all have types. The compiler checks whether the types of b and c match the parameter types of sum and whether the result type of sum matches the type of a. If not, it emits a type error and refuses to compile the program.
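
For example, a Rust version of that statement (sum, a, b, and c are made-up names) is rejected before the program ever runs if the types don't line up:

fn sum(x: i32, y: i32) -> i32 {
    x + y
}

fn main() {
    let b = 1;
    let c = "two";     // c refers to a string, not an i32
    let a = sum(b, c); // compile-time error: mismatched types
    println!("{}", a);
}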

In dynamically-typed languages, names themselves do not have types, but the values they refer to do. In the last example, a = sum(b, c), when the program runs, the language runtime looks at the values that b and c refer to and makes sure they are of types to which sum can be applied. For example, if sum(b, c) is implemented as b + c, the runtime checks whether b and c refer to values of types for which the + operator is defined. If not, it throws an exception.

Let’s turn to the strength of type safety by looking at some examples.

Both C and Rust are statically-typed, but they provide different levels of type safety. Consider the following C program:

#include <stdio.h>

int main() {
    int numbers[] = {0, 1, 2};
    printf("%d", numbers[6]);
    return 0;
}

It has an obvious problem but can be compiled successfully. Depending on what compiler you use, you might see a warning, but that's just a heuristic added by the compiler to assist programmers; the C language standard allows the program to be compiled. When I run this program on my Mac, it prints 32766, which is just whatever gibberish happened to be at that memory location. If this were a complex program, this would likely turn into a frustrating bug.

The following Rust program attempts to do the same thing:

fn main() {
    let numbers = [0, 1, 2];
    println!("{}", numbers[5]);
}

But when it is compiled, the following error is emitted. It won’t get a chance to run.

error: index out of bounds: the len is 3 but the index is 5
--> array.rs:3:20
|
3 | println!("{}", numbers[5]);

This is because in Rust the length is part of an array's type, so [0, 1] and [0, 1, 2] actually have different types. In this case the compiler can detect the illegal access just by looking at the type of numbers. To verify this, add a line to the Rust code:

fn main() {
    let numbers = [0, 1, 2];
    let foo: () = numbers; // <- add
    println!("{}", numbers[5]);
}

You will see the following error from the compiler:

error[E0308]: mismatched types
--> array.rs:3:19
|
3 | let foo: () = numbers;
| ^^^^^^^ expected (), found array of 3 elements
|
= note: expected type `()`
found type `[{integer}; 3]`

This is an intentional type mismatch we added just to make the compiler reveal the type of numbers. The last line says it's [{integer}; 3], in which 3 is the length of the array. For variable-size vectors, whose length is not part of the type, an out-of-bounds index is only caught at run time; the get method returns an Option so the programmer is forced to deal with the possibility that the element isn't there.
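
For example, the same out-of-bounds access on a Vec compiles fine and is only caught at run time; a sketch using get looks like this:

fn main() {
    let numbers = vec![0, 1, 2];
    // numbers[5] would compile, then panic at run time with an index-out-of-bounds error.
    match numbers.get(5) {
        Some(n) => println!("{}", n),
        None => println!("no element at index 5"),
    }
}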

The infamous type systems of Perl and PHP have tormented countless programming souls. The following is copied verbatim from the official PHP manual:

$foo = 1 + "10.5";                // $foo is float (11.5)
$foo = 1 + "-1.3e3"; // $foo is float (-1299)
$foo = 1 + "bob-1.3e3"; // $foo is integer (1)
$foo = 1 + "bob3"; // $foo is integer (1)
$foo = 1 + "10 Small Pigs"; // $foo is integer (11)
$foo = 4 + "10.2 Little Piggies"; // $foo is float (14.2)
$foo = "10.0 pigs " + 1; // $foo is float (11)
$foo = "10.0 pigs " + 1.0; // $foo is float (11)

PHP and Perl are extremely lenient about type mismatches and go to great lengths to massage arguments into whatever types are required. For quick-and-dirty scripts, this may let the programmer get things done with as little code as possible, but in large projects it is good at burying bugs. Most other languages in wide use today require explicit conversion between unrelated types, whether they are dynamically typed or statically typed. For example, in Clojure, you'd need to call Integer/parseInt to parse a string into an integer.
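
Rust is similarly explicit: a string has to be parsed into a number, and the possibility of failure has to be handled. A small sketch:

fn main() {
    let n: i32 = "42".parse().expect("not a valid integer"); // explicit conversion
    println!("{}", n + 1);
    // "10 Small Pigs".parse::<i32>() would return an Err instead of silently producing 10.
}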

Both static type checking and a strong type system help to uncover bugs as early as possible, usually at the expense of making programs more verbose, but they are different things.
