APL in the eyes of a Python programmer

7 min readMar 11, 2018

I recently ran into the APL phenomena. Well, it wasn’t the first time I heard about it, but this time I didn’t dismiss it as some kind of legacy mistake from the 70:s when you paid for code by the character. If you haven’t seen APL before here’s a (slightly modified) example from Wikipedia that computes all prime numbers between 1 and 10:

(~R∊R∘.×R)/R←1↓ι10

Only 19 characters, but completely unreadable right? So why even bother? What happened is that I ran into a thread on hacker news where a proponent of such languages got a chance to tell his story. This person was Aaron W. Hs under the nickname arcfide and he was responding to a comment about his work. The work in question was a compiler written in the programming language Dyalog, a language similar to APL and the comment provided a short simple from the code base and just the word “No”.

I can highly recommend reading the thread that followed if you got the time. What I got out from it was the lesson that sometime fitting a lot of code in a small space has advantages that can, at least for some people, outweigh the drawbacks of having code that if far from obvious for the untrained eye.

Obviously terseness is far from always the best choice. There are preconditions that have to be fulfilled for it to be workable.

The developers on the project should have deep enough understanding about the problem domain that they can recognise patterns without human readable names explaining them. This is similar to how we would expect people to know what the + sign means and to recognise common constructs such as the for loop in C style languages. Languages like APL introduces more signs that you are expected to know. For instance, APL defines the operator ↓ as “drop the first element in the array”.

The pattern of using short variable names also requires that the reader can recognise them by context or convention. In math this is common, f means a function, x is the variable, y is the dependent variable, etc.

There are however benefits with terse code. You can now much more quickly see the full scope of the program. In a complex codebase with a lot of abstractions and short functions you have to spend a lot of time just browsing trough the code to understand what it does. The fact is that even perfectly named self documenting code will have details that won’t be obvious from the function name. If your code is terse enough you could skip defining a separate function and just include the code in place.

This also means that it can be easier to change the code, since you can fit more of the code base in a single screen you can also review the implications of a change more quickly.

This compares quite well to how we use notation in mathematics to compress the information to be dense enough that we can reason about the theoretical implications without being hindered by verbose language constructs.

We can do this in Python as well

So it started with me learning about the arguments for terse code and terse programming languages. But what about them makes them so terse? To find out I went deeper and translated the prime example to Python.

Step one was to do it the naive way. If you read the Wikipedia explanation you’d see that it’s based on the idea that we can find a prime by identifying all numbers that does not exist in the multiplication table for numbers 2 to R (10 in my example to make the example self contained).

The long version

So let’s do this in the simplest way possible and with somewhat self documenting code (9 lines, 336 characters):

potential_prime_range = range(2,11)
for potential_prime in potential_prime_range:
    prime = True
    for first_factor in potential_prime_range:
        for second_factor in potential_prime_range:
            if first_factor * second_factor == potential_prime:
                prime = False
    if prime:
        print potential_prime,

The inner loops are a bit opaque, so lets refactor it using a generator function that returns all the nonprime products. This adds slightly more code (14 lines, 492 characters), but readability is more important than terseness.

def PossibleProducts(highest_number): 
    factors = range(2,highest_number+1)
    for first_factor in factors:
        for second_factor in factors:
            yield first_factor * second_factorhighest_number = 10
potential_prime_range = range(2,highest_number+1)
for potential_prime in potential_prime_range:
    prime = True
    for product in PossibleProducts(highest_number):
        if potential_prime == product:
            prime = False
    if prime:
        print potential_prime,

This new version is more self documenting and the implementation of PossibleProducts has been abstracted away, this means that it can be reused and that we can improve on it’s performance in isolation.

However, it might not be obvious what the PossibleProducts should return, and you can’t be sure what it will return without reading it. If you are solving a bug in the code, you will probably need to find the definition and read it to ensure it’s correct. Any reuse of the code also risks creating dependencies on the exact behaviour which might actually make it harder to simplify.

This is where the APL formulation might have an advantage the terseness of the code makes it easier to keep everything in front of you.

The short version

I recommend going trough the explanation in the Wikipedia APL prime numbers example before reading further. It will serve as basis for comparison.

Assuming that we like the idea of APL, do we need an APL class language? It turns out that we can achieve almost the same level of terseness in Python. Using pure Python we could write this with list comprehensions and short variable names: (2 lines, 64 characters)

R=r(2,11)
[x for x in R if x not in [j*k for j in R for k in R]]

All the repetition and the existing verbosity of the language makes this quite hard to follow. If you read the Wikipedia example you can note that a lot of the power comes from operations specifically designed to operate on numerical arrays.

So lets introduce Numpy, the de facto standard for numerical computing in Python. With Numpy and some definitions to compensate for the inherent verbosity of the language and the lack of an element wise in operator, we can write an equivalent expression in Python using only 26 characters:

# Setup to match APL library
import numpy
def element_wise_in(elems, container):
 return numpy.array([x in container for x in elems])r=numpy.arange
a=numpy.array
o=numpy.outer
e=element_wise_in# Calculate primes
R=r(2,11);R[~e(R,o(R,R))]

Now, this is actually not so different to what lots of scientific engineers are already writing using Numpy daily. The big issue here is that we have to define the short names here, reducing the benefit somewhat, but of course we could just create a helper module making these redefinitions once. Another issue is that when we use single char names, we will soon run out of characters. APL use special characters such as ↓ and × to avoid this issue. This is something Python can’t match.

The Julia version

This is where I realised the usefulness of a peculiar language feature in the quite interesting new scientific computing language Julia. Julia allows using unicode characters in identifiers. Julia also have functionality equivalent to Numpy built in from the start. (There are more advantages to Julia, I recommend looking into it if you do a lot of scientific computing) Going straight to the prime example, using multiplication with the transposed array to do an outer product (R*R’) it could be done as this:

# Element wise in operator
function ∈(elems, container)
    [x in container for x in elems]
end
R=2:10;R[.~∈(R,R*R')]

This uses only 22 characters for the final line, compared to 18 in the APL example. If we don’t like the added helper we skip it and still be quite compact using map and an anonymous function: (32 characters)

R=2:10;R[.~map(x->x in R*R',R)]

Conclusion

This post was mostly an excuse for me to show the results of some fun code golfing I did, but I think there are lessons to be had. One, is the one that is repeated in the hacker new thread a few times. Maybe you shouldn’t be so quick to dismiss what looks odd, sometimes it’s just like the difference between speaking different languages. R°.×R might look incomprehensible to me, but the equivalent R*R’ is perfectly clear as I’m used to Matlab syntax.

Sometimes it might be worth optimising for different things. The long verbose version of the prime example might be easier to read for someone who don’t know about the specific syntax and outer products etc. However, the short version is perfectly readable for someone well versed in the expected function of a prime number generator and the concepts used. It is much quicker to read, if not just by being less verbose, at least by requiring less jumping trough the code.

The other lesson is that you don’t actually have to switch to APL, Dyalog, J, or any of the languages in that family. Modern programming languages such as Julia and Python with the right modules can provide similar compactness if you choose to use them that way, with a slight edge for Julia, due to allowing unicode identifiers.