Rust and Haskell: The Languages of Concordium

Thomas Dinsdale-Young
Published in Concordium
6 min read · Jun 16, 2020
Rust and Haskell code at the heart of the Concordium blockchain.

At Concordium, we are developing a next-generation blockchain. This is a complex piece of software, and it needs to be correct and reliable. We use two cutting-edge programming languages — Haskell and Rust — which allow us to be productive while writing high-quality code. In this article, I will discuss why and how we use these languages, and examine the benefits and costs of our choices.

A common feature of both Haskell and Rust is that they have powerful and expressive static type systems. A type system gives types to values: in Haskell, for instance, 2 has type Integer, while "Cat" has type String. The type system prevents us from performing operations that don't make sense, such as adding 2 and "Cat". A dynamic type system prevents this when a program is run; this is common in interpreted languages such as Python. A static type system prevents this when a program is compiled; a Haskell compiler would reject 2 + "Cat" as ill-typed, ruling out the possibility of a run-time failure.

Of course, Haskell and Rust’s type systems go far beyond preventing us from adding a string to an integer. For instance, Haskell’s type system allows us to be explicit about the side effects of code: it can prevent us from writing to a database in code that is only supposed to query it. Rust’s type system enforces memory discipline: it can prevent us from accessing memory outside its lifetime (for example, after it has been freed), or in other unsafe ways that could otherwise crash the program.

If both Haskell and Rust have powerful type systems, then why do we use both? The answer is that they have different strengths in other areas.

Haskell

Haskell is a pure, lazy, functional language. A functional language is one where functions are first-class values: functions can be passed to other functions as arguments and returned as results. A function can even be partially applied to obtain a new function: (+) is the function that adds together two integers, while (1+) is the function that adds 1 to another integer. The higher-order function map applies a given function to each element of a list: map (1+) [1..4] evaluates to [2,3,4,5].

Laziness means that an expression is only evaluated when its result is actually needed. For example, consider the expression take 2 (map (1+) [1..4]), which takes the first two elements of the list produced by map (1+) [1..4], resulting in [2,3]. Laziness means that only 1+1 and 1+2 end up being computed, but not 1+3 and 1+4. This makes it possible to write Haskell code in an elegant, mathematical style that expresses the intention of the code, without concern for the process of executing the code.

Haskell’s laziness means that it is often difficult to predict when (and if) code will be evaluated, since it depends on the wider context. This is a problem for code that has side effects, such as writing to a file or communicating over a network, since we care about the order of these operations.

A pure function is one that has no side effects. In Haskell, functions are pure: since evaluating a function has no side effects, the order of evaluation cannot affect the order of side effects. Side effects are made possible in Haskell by using monads. A monad captures a notion of computation that can be sequenced. For instance, the IO monad encapsulates side effects that interact with the outside world, such as printing to the console. The >> operator sequences two actions in a monad: print 2 >> print 4 first prints out 2 and then prints 4. (I won't go into much more detail about monads here. The concept has a reputation for being difficult for novice Haskellers to understand, which is exacerbated by descriptions such as “a monad is like a burrito”, or “a monad is just a monoid in the category of endofunctors”. I recommend studying examples of monads and then seeing how they fit the generalization.)

Haskell, like many high-level languages, relies on a garbage collector for memory management. Indeed, memory management is largely hidden from the programmer: memory is allocated as needed for structures (such as lists) that are created during evaluation. The garbage collector periodically deallocates structures that are no longer accessible.

While it is convenient not to be concerned with explicitly allocating and freeing memory (as in lower-level languages such as C and C++), Haskell programs are still prone to using more memory than necessary, a problem known as a space leak. This is often exacerbated by Haskell’s laziness. For instance, suppose that we want to calculate the total of a list of numbers; laziness could mean that the program delays calculating the total, so the list (which could be quite big) cannot be garbage-collected. If we expect to need the total, but not the list, then it can be helpful to force evaluation earlier so that the list can be garbage-collected. This is a particular concern for long-running programs (such as a blockchain node), where cumulative space leaks can cause the program to run out of memory. The Glasgow Haskell Compiler (GHC) has good support for profiling, which can be helpful in identifying and tracking down space leaks. Often, however, the solution is simply to force evaluation of all long-lived data.

Rust

Rust prioritises performance over convenience. While features such as laziness and garbage collection offer convenience to Haskell programmers, they also have a run-time performance cost. Rust eschews such features, instead adopting ‘zero-cost abstractions’: conveniences that have little or no run-time cost.
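Iterators are a commonly cited example of such an abstraction. The sketch below is illustrative only and is not taken from Concordium's code; the function names are hypothetical. It writes the same computation twice, once with iterator adaptors and once as a hand-written loop. The two produce the same result, and the compiler is generally able to optimise the iterator version into code comparable to the loop, although that is not something this example measures.

```rust
/// Sum of the squares of the even numbers, written with iterator adaptors.
/// (Hypothetical function, for illustration only.)
fn sum_even_squares_iter(values: &[u64]) -> u64 {
    values
        .iter()
        .filter(|v| **v % 2 == 0) // keep only the even numbers
        .map(|v| v * v)           // square each one
        .sum()                    // add them up
}

/// The same computation as an explicit loop with a mutable accumulator.
fn sum_even_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    for &v in values {
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}

fn main() {
    let values = [1, 2, 3, 4];
    println!("iterator version: {}", sum_even_squares_iter(&values));
    println!("loop version:     {}", sum_even_squares_loop(&values));
}
```

The iterator version reads closer to a description of the intent, while the loop spells out the steps; the point of a zero-cost abstraction is that choosing the former should not cost anything significant at run time.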

Rust’s approach to memory management is a prime example of this. While Rust does not use a garbage collector, it still largely absolves the programmer from explicit memory management through the use of ownership and lifetimes. Rust’s type system tracks which part of the program owns each data structure, and the compiler automatically generates the code to free the structure when its owner goes out of scope. For instance, if a vector is passed by value as an argument to a function, that function takes ownership of the vector; if it does not pass on ownership, then the vector is freed when the function returns. While the ownership and lifetime analysis is not free, the cost is entirely borne at compile time.
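The vector example can be sketched as follows. This is a minimal illustration, not Concordium code: the Tracked type, its event log and the function names are all hypothetical, and exist only so that the moment at which the vector is freed becomes observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical wrapper around a vector that records when it is freed.
struct Tracked {
    data: Vec<u64>,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Tracked {
    // Runs automatically when the owner of a `Tracked` goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push("freed");
    }
}

/// Borrows the value: the caller keeps ownership, so nothing is freed here.
fn total_borrowed(t: &Tracked) -> u64 {
    t.data.iter().sum()
}

/// Takes ownership of the value. Ownership is not passed on, so the
/// value (and its vector) is freed when this function returns.
fn total_owned(t: Tracked) -> u64 {
    let sum: u64 = t.data.iter().sum();
    t.log.borrow_mut().push("summed");
    sum
}

/// Runs both calls and returns the combined total and the recorded events.
fn run() -> (u64, Vec<&'static str>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let t = Tracked { data: vec![1, 2, 3], log: Rc::clone(&log) };

    let a = total_borrowed(&t); // `t` is still usable afterwards
    log.borrow_mut().push("borrowed");

    let b = total_owned(t); // `t` is moved; using it after this line would not compile
    log.borrow_mut().push("returned");

    let events = log.borrow().clone();
    (a + b, events)
}

fn main() {
    let (sum, events) = run();
    println!("sum = {}, events = {:?}", sum, events);
}
```

The "freed" event is recorded between "summed" and "returned": the compiler inserted the call that frees the vector at the end of total_owned, with no explicit free in the source and no garbage collector at run time.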

The Concordium Combination

Rust and Haskell’s different philosophies give them different strengths: Rust is great for low-level performance-critical code, while Haskell is great for high-level correctness-critical code. At Concordium, we try to play to the strengths of each. We use Rust for our cryptographic primitives and identity layer, which are computationally expensive and performance critical, and for the peer-to-peer networking layer, which requires high performance and low resource usage. We use Haskell for the consensus, transaction scheduler and smart contract language implementation, where the convenience of a high-level language helps us to write correct code quickly.

While using two programming languages allows us to take advantage of the strengths of each, the downside is that the interface between Haskell and Rust is not seamless. A language’s foreign function interface (FFI) provides a means to interoperate with code written in other languages. The FFI is generally designed with C in mind as a kind of lowest common denominator, which limits how data can be passed from one language to another. (A tricky example of this is that Rust pointers are not always the same size as C pointers. We found this out the hard way with some code that would crash on Windows but not on Linux.) Using two languages also complicates the build process, since the output of multiple compilers needs to be linked together. This is further complicated when targeting multiple operating systems, since linking works differently on each.
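To make the C-shaped interface concrete, here is a minimal sketch of what the Rust side of such a boundary can look like. It is not Concordium's actual interface: the function sum_u64 is hypothetical, and the size comparison in main shows only one way in which a Rust pointer can differ from a C pointer, which is not necessarily the cause of the crash mentioned above.

```rust
use std::mem::size_of;
use std::slice;

/// Hypothetical function exposed with the C calling convention.
/// A Rust slice cannot cross the boundary directly, so the caller passes
/// a plain pointer and a separate length, as a C API would.
/// A real export would also carry a no-mangle attribute so that the symbol
/// keeps this name; it is omitted here because its spelling depends on the
/// Rust edition in use.
pub extern "C" fn sum_u64(ptr: *const u64, len: usize) -> u64 {
    if ptr.is_null() {
        return 0;
    }
    // SAFETY: the caller must guarantee that `ptr` points to `len`
    // readable, properly aligned u64 values.
    let values = unsafe { slice::from_raw_parts(ptr, len) };
    values.iter().fold(0u64, |acc, v| acc.wrapping_add(*v))
}

fn main() {
    let data = [1u64, 2, 3];
    println!("sum = {}", sum_u64(data.as_ptr(), data.len()));

    // A raw pointer is a single machine word, like a C pointer.
    // A reference to a slice also carries the length, so it is wider.
    println!("raw pointer:     {} bytes", size_of::<*const u64>());
    println!("slice reference: {} bytes", size_of::<&[u64]>());
}
```

The foreign caller (Haskell, in Concordium's case) sees only the pointer and the length, so any invariant that Rust's type system would normally check has to be upheld by convention on both sides.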

Ultimately, Rust and Haskell are both cutting-edge languages with strong and expressive type systems, powerful abstraction mechanisms (traits in Rust and type classes in Haskell), typed metaprogramming (via Rust macros and Template Haskell) and excellent compilers (rustc and ghc). They enable Concordium to be at the forefront of developing efficient, reliable and trustworthy blockchain software.
