Lambda Calculus in Pharo

Yes, the Y Combinator is useful in normal programs

It started with a 3 line shell script implementation of Quicksort and it ended with an implementation of the mythical Y Combinator fixed point operator of the Lambda Calculus. Block closures in Pharo are the real deal.


I came across this little shell script implementation of Quicksort.

#!/bin/sh

# qsort N1 [N2] [N3] [N4] ...
qsort()
{
local L=""; local G=""; [ $# -eq 1 ] && echo $1 && return;
P=$1; shift; for i in $@; do [ $i -lt $P ] && L="$L $i" || G="$G $i"; done
[ -z "$L" ] || L=`qsort $L`; [ -z "$G" ] || G=`qsort $G`; echo "$L $P $G"
}

qsort $@

OK, like all shell scripts this is not very readable but it is not very difficult to understand either. Anyway, the actual syntax is not that important.

The algorithm is pretty standard. The base case of the recursion is a one element input. Else, the first element is chosen as pivot and the input is divided in two groups: those less than the pivot and those greater than the pivot. Both of these are then sorted using a recursive call. Finally, the sorted partitions are concatenated with the pivot in between them.

The label, ‘in 3 lines’ is not very accurate either, but is is quite short, somewhat elegant even.

Pharo

My first reaction was: I wonder how this could be written in Pharo, my favorite programming environment.

What follows is standalone scripting code, this is not how you would normally design and structure real programs.
| quicksort |
quicksort := nil.
quicksort := [ :data |
data size <= 1
ifTrue: [ data ]
ifFalse: [ | pivot rest left right |
pivot := data first.
rest := data allButFirst.
left := rest select: [ :each | each <= pivot ].
right := rest select: [ :each | each > pivot ].
(quicksort value: left), { pivot }, (quicksort value: right) ] ]

You can clearly see the implementation elements of the algorithm described above. What gets assigned to the quicksort variable is a BlockClosure, often called a block. It is a proper lambda and closure.

Copy the code in a Workspace or Playground and you can run some tests.

quicksort value: #(7 12 3 20 5 8 2).  =>  #(2 3 5 7 8 12 20)
quicksort value: #(). => #()
quicksort value: #(1). => #(1)
quicksort value: #(1 2 3). => #(1 2 3)
quicksort value: #(3 2 1). => #(1 2 3)

So we are using 9 properly written lines of code, not bad.

But what is with the

quicksort := nil

line ? This is needed to silence a compiler warning. The recursive call is implemented as an evalation (invocation) of the quicksort block. But at that point, the compiler is not 100% sure that the quicksort variable is not uninitialised (read before written). Our block actually closes over the quicksort variable and uses it to reference itself.

How can a block refer to itself ?

Here is a little detour. It turns out that this is something that Pharo can do, through the special variable thisContext (one of the 6 reserved names, along with self, super, true, false and nil). The thisContext variable refers to the execution context of the code being executed. Normally this is a method, but inside a block it is a closure.

Here is a version of our algorithm that uses this trick to recursively call itself.

| quicksort |
quicksort := [ :data |
data size <= 1
ifTrue: [ data ]
ifFalse: [ | pivot rest left right |
pivot := data first.
rest := data allButFirst.
left := rest select: [ :each | each <= pivot ].
right := rest select: [ :each | each > pivot ].
(thisContext closure value: left), { pivot }, (thisContext closure value: right) ] ].

Neat, but maybe a bit too esoteric.

Are there any other ways a block can refer to itself ?

Yes, in Lambda Calculus there something called the Y Combinator fixed point operator, which is used for exactly that. The mythical Y Combinator is a function that, when given another recursive function, computes its fixed point. The function given to the Y Combinator accepts one argument, a reference to itself.

Is your head spinning already ? Hold on, it will get worse.

Let’s use the classic recursive factorial function as an example.

[ :recur |
[ :n |
n < 2
ifTrue: [ 1 ]
ifFalse: [ n * (recur value: n — 1) ] ] ]

This is what we pass to the Y Combinator. A function of one argument that will return a function that will compute the factorial of a number. The argument of the outer block, recur, it used for the self referencing. Let’s call this block or function F.

In pseudo code, usage of the Y Combinator looks as follows.

(Y F)(5)  =>  120

Y turns our partial recursive F into a proper function that implements factorial — without the need of F to refer to itself !

What would Y look like ? That is a long and complicated story. Many people have been fascinated by the Y Combinator and have tried to explain it as well as they could. Look it up. A nice one, in my opinion, is ‘Yet Another Y Combinator Tutorial’. There exist explanations in many programming languages.

Here is a standalone version as a block. Ready ?

ycombinator := [ :f | 
[ :g | g value: g ] value: [ :g |
f value: [ :x |
(g value: g) value: x ] ] ]

Absolutely beautiful and frightening at the same time. Magical !

You would then use it as follows.

(ycombinator value: F) value: 5  =>  120

We can make this more elegant by adding a method called fixedPoint to the class BlockClosure.

BlockClosure>>#fixedPoint
^ [ :g | g value: g ]
value: [ :g | self value: [ :x | (g value: g) value: x ] ]

Now, we can write our code in a very compact way.

[ :recur |
[ :n |
n < 2
ifTrue: [ 1 ]
ifFalse: [ n * (recur value: n — 1) ] ] ] fixedPoint value: 5

Note again how no closed over variables are used for the self referencing.

We are now ready to revisit our Quicksort example, this time using the Y Combinator.

[ :recur |
[ :data |
data size <= 1
ifTrue: [ data ]
ifFalse: [ | pivot rest left right |
pivot := data first.
rest := data allButFirst.
left := rest select: [ :each | each <= pivot ].
right := rest select: [ :each | each > pivot ].
(recur value: left), { pivot }, (recur value: right) ] ] ]
fixedPoint value: #(7 12 3 20 5 8 2)

Pretty neat, right ?

Efficiency

Our Quicksort implementations are not very efficient because they create lots of intermediate datastructures. Here is what an efficient, in-place version would look like. The original data array is never copied and only modified in-place — notice the elegant #swap:with: message.

| quicksort |
quicksort := nil.
quicksort := [ :data :from :to |
from < to
ifTrue: [ | pivot index |
pivot := data at: from.
data swap: from with: to.
index := from.
from to: to — 1 do: [ :each |
(data at: each) < pivot
ifTrue: [
data swap: each with: index.
index := index + 1 ] ].
data swap: to with: index.
quicksort value: data value: from value: index — 1.
quicksort value: data value: index + 1 value: to ].
data ].
quicksort value: #(7 12 3 20 5 8 2) copy value: 1 value: 7

Note: A limitation of the way we implemented the Y Combinator is that it is can only generate functions that accept just one argument. One way around this, as Werner Kassens suggested, is to use a collection holding multiple arguments.

Show your support

Clapping shows how much you appreciated Sven VC’s story.