Musings On Gleam

This story is brought to you by a random, but haunting and now lost, tweet by bodil in which she berated lovers of testing and dynamic languages by pointing out that types are a form of compile-time testing. You make declarations about your intent and the compiler asserts them at write time.

If compiling wasn’t a form of testing we wouldn’t have ‘ignore warnings’ flags, right? There would and could be no ‘proceed with caution’.

Gleam is the new baby in the BEAM family — a nappy-wearing ML typed language. Let’s talk types-as-testing.

It was a truism that Phil Wadler failed to write a type system for Erlang and therefore the BEAM would be type-free and that was it, settled.

The reality is slightly subtler. Fred Hébert sums it up in Learn You Some Erlang For Great Good and quotes Joe Armstrong:

One day Phil phoned me up and announced that a) Erlang needed a type system, b) he had written a small prototype of a type system and c) he had a one year’s sabbatical and was going to write a type system for Erlang and “were we interested?” Answer — “Yes.”

Phil Wadler and Simon Marlow worked on a type system for over a year and the results were published in [20]. The results of the project were somewhat disappointing. To start with, only a subset of the language was type-checkable, the major omission being the lack of process types and of type checking inter-process messages.

The type system could only be made to cover code that runs in-process — but couldn’t handle interprocess comms — whether the process runs locally on the same Erlang VM or some remote VM.

This makes sense because you cannot assert at write time what you cannot possibly know until run time. And with hot code loading, you don’t know what will be where at run time.

The same is true of microservices talking over URLs. When I write my invocation code on one side, how do I know what will be running on the other? Even if both ends are written in Haskell.

Story over, or is it?

The knight in chess gives its name to the Knight’s Move in psychiatry — a sudden jump in conversation. So let’s talk about distributed systems and the CAP theorem. (There is method in my seeming madness.)

The CAP theorem states that a distributed system can be available in the presence of partition or consistent in the presence of partition — but not both.

When you look at the development of distributed systems you see many approaches to try and mitigate this fundamental (and unprogrammable) issue.

AP systems use eventual consistency with CRDTs — the system is inconsistent during partition but returns to consistency afterwards.
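A minimal sketch of the idea in plain Erlang, using a grow-only counter (the simplest CRDT); the node names are illustrative. Each replica increments only its own slot, and merging takes the per-node maximum, so replicas that diverged during a partition converge once they exchange state:

```erlang
%% G-counter sketch: one map slot per node.
A0 = #{node_a => 0, node_b => 0},
%% During a partition each side increments only its own slot...
A1 = maps:update_with(node_a, fun(V) -> V + 1 end, A0),
B1 = maps:update_with(node_b, fun(V) -> V + 1 end, A0),
%% ...and after the partition heals, merge by per-slot maximum.
Merged = maps:merge_with(fun(_K, X, Y) -> max(X, Y) end, A1, B1),
2 = lists:sum(maps:values(Merged)).
```

(`maps:merge_with/3` needs OTP 24 or later.)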

CP systems use sharded consensus so that the system remains partially available in the presence of partition — a partition only affects a portion of the name space.

In this fuzzy world we start seeing things like harvest and yield. When I queried X, how much of the data did I get back; what was my harvest? We have moved away from hard guarantees of behaviour.

This would suggest that there might be a blended solution to the unprogrammable problem of distributed type-checking: a blended type system that enforces compile-time types on the pure functional parts of the code base, and generates run-time checks on inter-process side effects; checks that codify, standardise and operationalise run-time partition and other failure modes for all programmes written in the language.

Every chess player has two knights — Kubernetes, let’s go.

Like all Erlangists, I maintain two uneasily-conjoined attitudes towards K8s.

On the one hand it brings such goodness as process isolation, hot reloading and supervision to the masses and that gives me all the peeing-in-the-wetsuit.

But the ‘process isolation’ overhead and all the confusing boilerplate and invocation guff and endless dockerfiles gives me the absolute dry boaks.

Where it outshines Erlang though is that it has at its heart an explicit Directed Acyclic Graph and not an implicit one.

Every Erlang process can send a message to every other one. But every non-trivial Erlang application has a designed and architected business-process message flow — always directed, (almost) always acyclic.

Every Erlang developer joining a new team and reading the code then has to learn, plot and write out the expected call flow — and a non-trivial number of hard distributed-systems bugs are pushed to production because of poor understanding and/or poor implementation of this implicit architecture.

In K8s ‘processes’ (by which I mean docker containers) have one-way holes punched in their firewalls which (in theory) allows you to construct the DAG from declarative configuration. (If you can’t in practice, don’t @ me bro.)

It is worth reviewing the history of Erlang/OTP to see what else we can learn about how we could design such a blended type system.

What is interesting about the transition from Erlang to Erlang/OTP (and many other programming language version bumps) is that some critical primitives are occluded.

At the heart of Erlang is the bang operator ! which is used to send a message between two processes in a network transparent manner.

And yet as an Erlang/OTP programmer I have used it maybe three times in 16 years. OTP takes a primitive and hoists it into the standard library — you use gen_server call and cast instead.

One way of handling type checks would be to mark all modules in Gleam as either pure functional or dirty and then type check the former and ignore the latter.

This is daft tho. If we hoist side effects from primitives on the BEAM into the Gleam language itself, the compiler would ‘know’ which modules were pure functional. And because we have hoisted message passing, we can then implement communication by contract and let the types flow across the contract.

(Pleasant side effect: statistics like unit-test code coverage could split into coverage for pure functional code and coverage for dirty code.)

We know that type checks cannot deterministically flow across the contract — but they can flow fuzzily, using blended type checks that are part compile time and part run time.

Let’s see how.

Consider the three Erlang functions:

add_one(X) -> X + 1.

add_1(X) when is_number(X) -> X + 1.

plus_1(X) when is_number(X) -> {ok, X + 1};
plus_1(_) -> {error, "not a number"}.

In the second one there is a trivial run-time assertion in the code that causes a crash.

In the third one the run-time assertion is turned into a well behaved error message.
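The difference is easy to see at the shell; a quick sketch with the unguarded and fully-guarded versions redefined as funs so it runs standalone:

```erlang
AddOne = fun(X) -> X + 1 end,
Plus1  = fun(X) when is_number(X) -> {ok, X + 1};
            (_) -> {error, "not a number"}
         end,
2 = AddOne(1),
{ok, 2} = Plus1(1),
{error, "not a number"} = Plus1(atom),          % error as a value: no crash
{'EXIT', {badarith, _}} = (catch AddOne(atom)). % unguarded: crashes
```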

By hoisting gen_server calls into Gleam and generating a hashed nonce from a mixture of module, function and argument types, it would be possible to have run-time interprocess calls that compile to Erlang that looks like this:

gen_server:call(Pid, {<<"abcd1234">>, MyTerm}),

The nonce <<"abcd1234">> is the generated hash. It is effectively a pointer to a type specification, and by matching on it you are doing a run-time type check.
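How the compiler derives that nonce is an open design choice; one plausible sketch is to hash a canonical form of the module, function and argument types and keep a short printable prefix. The signature term below is invented for illustration:

```erlang
%% Hypothetical type signature the compiler would have at hand.
TypeSig = {my_server, plus_1, [{args, [int]}, {return, {result, int, string}}]},
%% Hash it and keep a short, printable prefix as the nonce.
Hash = crypto:hash(sha256, term_to_binary(TypeSig)),
Nonce = binary:encode_hex(binary:part(Hash, 0, 4)),
8 = byte_size(Nonce).
```

Any change to the types changes the hash, so a stale caller and a freshly upgraded callee disagree on the nonce and the mismatch is caught at the boundary.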

The compiler would generate remote handling code like:

handle_call({<<"abcd1234">>, Term}, _From, S = #state{}) ->
    %% generated dispatch into the typed implementation goes here,
    %% producing MyReply and NewState from Term and S
    {reply, MyReply, NewState};
handle_call({Nonce, _Term}, _From, S = #state{}) ->
    {reply, {error, {type_error, {got, Nonce}, {expected, <<"abcd1234">>}}}, S};
handle_call(_Msg, _From, S = #state{}) ->
    {reply, {error, 'WTActualF'}, S}.

So the normal side-effect free type check asserts the functions are invoked with the correct types.

This new blended type check asserts that if the remote function has been invoked, it has been invoked with the correct types, and also that if something has gone wrong, the calling module correctly handles the complete error set.

(Bear in mind that the error set the calling code has to handle is a superset of the error types in this example. It has to deal with ‘I tried to call it, and it timed out, and I don’t know if it crashed or wasn’t available or there was a partition or what’. The compiler needs to generate code on the invocation side as well.)
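A sketch of what that invocation-side code might look like: the remote call is wrapped so that timeouts and dead processes come back as values in the same error set, rather than as exits the caller has to remember to trap. The shape of the wrapper is an assumption, not real Gleam output:

```erlang
%% Hypothetical generated wrapper around a typed remote call.
TypedCall = fun(Pid, Nonce, Term, TimeoutMs) ->
    try gen_server:call(Pid, {Nonce, Term}, TimeoutMs) of
        {error, Reason} -> {error, Reason};     % remote type/sequence errors
        Reply           -> {ok, Reply}
    catch
        exit:{timeout, _} -> {error, timeout};  % crash? partition? unknown
        exit:{noproc, _}  -> {error, noproc}    % callee not running
    end
end,
%% Calling a non-existent process yields a value, not an exit:
{error, noproc} = TypedCall(list_to_pid("<0.99999.0>"), <<"abcd1234">>, 41, 100).
```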

Normally type signatures are symmetric.

The callee function declares: if you call me with arguments of type A, then I will return a value of type B.

The caller function declares: I will call you with arguments of type A and I expect you to return a value of type B.

This is no longer the case. The callee’s type spec is the same, but the caller’s type spec is extended: I will call you with arguments of type A, and my expectation at write time is that you will return a value of type B, but you may also return one of this list of error types at run time.

We have moved from hard guarantees to fuzzy partial ones. You can think of this as happy-path typing.

Defining a language that naturalistically blends types and interprocess calling contracts is a non-trivial business, with some sharp aesthetic choices to make (and kinda out of scope, but interesting).

It is trivial to craft a message call in another BEAM language that invokes the function with incorrect types. This type system is no defence against Byzantine failure with an untrusted, distributed and malicious partner.

But there is a wider set of failure types — let us call them Brexit failures. One party is reliable and trustworthy and the other party is a chaotic, slapdash, arrogant and ignorant, bloated, self-important, pompous klown kar kavalkade of krap. Normal for software development errors then. Happy-path typing will prevent some of these and make diagnosis of the rest simpler and cleaner.

We move from ‘I am in error and my message causes failure for you’ to ‘I am in error, and the type error is returned to me’.

The code fingerprints and error handling routines have been standardised and hoisted from being a normal write-time task of the developer (implement, re-implement, re-re-re-re-implement) into the write-time task of the compiler writer.

Joe Armstrong’s original conception of writing programmes in Erlang was that you divided the app into an Error Kernel which can’t go wrong, and bare worker code. OTP took that Error Kernel concept and built a standard library where all applications share the same one. My Error Kernel and yours have the same battle-tested goodness.

But OTP was developed before Networked Erlang.

This proposal would take the Error Kernels of distributed systems built in Erlang/OTP and incorporate them into a larger Distributed Error Kernel.

By hoisting key libraries (like a Paxos or Raft implementation) you could then expand the OTP call profiles from call and cast to include consensus.

Because your remote function calls are declarative you can build a call graph of inter-process communications — the explicit DAG — wowzer!

But this is just a sketch — there are other things the compiler-as-test-writer might want to do.

You could turn the plain nonce into a sequential nonce {<<"abcd1234">>, N} to enforce call sequencing on an interface. How you represent that in a type contract, gawd only knows.
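A sketch of the callee-side check such a sequential nonce would enable; the contract shape is invented for illustration:

```erlang
%% Accept a call only if its sequence number matches the one the
%% callee expects; otherwise report it as a (typed) sequencing error.
Check = fun({_Nonce, N}, N)      -> {ok, N + 1};
           ({_Nonce, Got}, Want) -> {error, {out_of_sequence, {got, Got}, {expected, Want}}}
        end,
{ok, 1} = Check({<<"abcd1234">>, 0}, 0),
{error, {out_of_sequence, {got, 5}, {expected, 1}}} = Check({<<"abcd1234">>, 5}, 1).
```

The repeated variable `N` in the first fun clause is doing the work: it pattern-matches the incoming sequence number against the expected one.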

(Basically two state machines running on different machines and communicating by a defined contract can be considered a vector of state machines where some transition events are hard-wired across and between the two concurrent states (provided you recognise that errors are also valid state transition events). The vector is not capped at a length of two, obvs, but Jeez Louise.)

You could address the operational processes for handling node-by-node cluster upgrade, for instance — this implies that programmes contain multiple releases as a language-level concept. You can get a flavour of the scope of that in this old Basho design review checklist.

The compiler knows which exported functions in which modules are pure functional and can therefore emit a skeleton test suite — and possibly append to and update it.

The compiler knows how to emit generators for property-based testing with QuickCheck and generators for fuzz testing.

Fun things galore!

If you like this, follow me on Twitter.


Addendum 1

I always said if I ever wrote a new compiler for the BEAM I would use a homoiconic AST format which I dubbed Liffey.

Liffey’s defining characteristic is that it would be transpilable on a statement-by-statement basis to LFE (Lisp Flavoured Erlang).

Addendum 2

New BEAM languages really need to have source maps generated and expose a native language wrapper for tracing in their REPL.
