Blockchain Implementation From Scratch and in a Functional Programming Style

Lance Galletti
Coinmonks
11 min readOct 26, 2021

--

This article aims to take you step by step through the implementation of a blockchain and a smart contract language, from scratch and in a functional programming style. We will be using the programming language ATS.

The full code can be found here.

What are Smart Contracts?

Smart contracts allow for circumventing the need for third party entities whose role is to ensure and enforce that contracts are respected. For example consider user A wants to create a slot machine on the blockchain so that users can gamble with their cryptocurrency. Without smart contracts, taking part in this gamble is risky beyond the odds set by the machine. Consider that a slot machine is simply a random number generator (RNG). User B must trust that sending a coin over to A will result in A running a well-calibrated (i.e. calibrated as specified in the unspoken contract made by playing the game) RNG. And, should B win, that the appropriate reward (if any) will be sent over.

User A can write what is called a smart contract to specify the calibration and semantics of the slot machine for user B to review before deciding to partake in the game. To ensure that these contracts are upheld, they are written in a programming language that miners of the blockchain execute. The effect and result of this code persists in the blockchain (as would ordinary transactions) for all to view and verify. In this case, user B is no longer required to trust any third party and can simply read the smart contract (as any person would read any ordinary contract) to understand the terms and/or conditions.

Defining the Blockchain Type

Tuples in ATS are immutable and are statically defined by their size. By defining a block as a tuple of various types, if we wish to extend our blockchain datatype later (by adding some elements to the tuple), the typechecker will easily catch functions we forgot to update. This will make future enhancements of our code very productive!

A single block will then look something like this:

Interfacing with the Blockchain

To interface with the blockchain we will create a simple CLI. We want our CLI to be easily extendable to include new ways of interacting with the blockchain. For now, we would like our CLI to work as such:

  1. define miners/users
  2. make transactions (valid or otherwise)
  3. write/execute smart contract
  4. decide who will mine the next block
  5. mine the above (which executes the transactions and code)
  6. view the balance of users and results of transactions
  7. Repeat from 1/2

We thus implement our CLI as follows:

User input is treated as a linear stream and hence our CLI is very memory efficient and guarantees not to leak any user input once the CLI terminates. You can read more about linear streams here.

Walking through the code we see the bulk of the work is done in the read_loop function. If the input is “exit” then we call cli_stop() and free from memory the resource that is the stream of input. Otherwise, we call cli_do() followed by a recursive call to read_loop(). The cli_do() function is deliberately simple — we want adding functionalities to be as easy as possible. Each of these “do_” functions is expected to either write locally or send a message to a peer.

Mining

Mining involves validating transactions and finding the nonce that makes the hash of the header valid.

Notice that we filter out the invalid transactions using list0_filter before recursively mining the block. The is_valid_transact() function has the intended side effect of executing the transactions as it validates them. This is necessary for the following scenario:

User A has 2 dollars and makes two transactions:

  1. A sends 1 dollar to B
  2. A sends 2 dollars to C

Each transaction is valid but both transactions cannot be executed together — otherwise A’s account would have a negative balance. The side effect of is_valid_transact() allows for transaction 2 to be invalidated once transaction 1 has been executed. We can easily modify the order in which these transactions are executed to minimize or maximize a desired quantity. For example, a miner can decide to minimize the number of invalid transactions, or maximize the total amount transferred, or even maximize the sum of the fees of the transactions.

The tail-recursive implementation of mine() is inline with the memory efficiency of the CLI. Also, notice that the target hash that miners aim to meet is, here, a hash starting with 4 zeros but is easily adjustable.

Smart Contracts

Lambda Calculus

The programming language we will design to create smart contracts will be based on untyped lambda calculus. As such, our language will be turing complete but, as you will see, it will be extremely impractical since defining complex functions will be tedious.

To start, let us define a datatype called “term” to represent lambda calculus terms.

To facilitate the parsing of our language we decided to give it the following lisp-like syntax:

Integers
1 is represented as (TMint 1), 2 as (TMint 2) …etc.

Bool
True is (TMint 1) and False is (TMint 0)

String
“hello, world!” is defined as (TMstr hello, world!)

Variable
a variable x is declared as (TMvar x)

Tuples
a tuple (t0, t1, t2) for example is defined as (TMtup t0 t1 t2)

Operations
2 + 3 + 4 is defined as

Lambda Function
λx.x + x can be defined as

Function application
Applying the above lambda function to 1 can be done as such:

Fixed Point
the fibonacci function can be written as

Sequential operations

can be encoded as

As you can see, this language is not fit for encoding complex functions — although it is possible because of turing completeness. It is hence discouraged to write directly in this language and instead we recommend generating lambda code from ATS. This allows for extending the lambda language to allow for more succinct verbosity, in addition to which one can utilize the typechecker and compiler of ATS for static syntactic debugging. For example, we can overload the + symbol to make our syntax for adding two integers more practical:

In ATS, we can now write TMint(1) + TMint(1) instead of (TMopr + (TMint 1) (TMint 1)).

Parser

Our parser is also very memory efficiency since it treats the contents of the file to be parsed as a linear stream of the characters. A tokenize() function maps the stream of characters into a stream of strings which can be in turn mapped to a term (via our parse_tokens() function) that will be interpreted into a value through an interp function we will implement below.

You can view the full parsing implementations here.

Interpreter

Before we design an interpreter for our lambda language, we need to define a value datatype — that is the value of the interpreted term. Similarly to the way we defined the lambda term datatype we can define the value datatype as follows:

The above should be intuitive in that an integer term is expected to evaluate to an integer value which has the type int in ATS.

At a high level, our interpreter will need to evaluate a term to a value but also keep track of terms defined in the environment (so that you can define variables and functions). Again using the pattern matching feature of ATS, we can implement the interp function as follows:

Where interp_app will need to add to the environment. Here is an example of the interpreter for TMopr, interp_oper:

Our interpreter for the TMopr term can easily be extended in a way to include more operators (“*”, “%” etc.) and transactional operations on the blockchain for example or to facilitate writing code in this language.

The next step for this interp function is to interpret terms into an Option(value) where Option is a datatype with two constructors: None() and Some(value). In this way, we can implement error messages for our lambda language. That is, if the term provided cannot be interpreted into a value, without the Option datatype, the program will report a runtime type error (because some function that was expecting a value will be given a term). In this case, if we expect an Option type, we can use pattern matching to handle the case where the Option type is None() (which is when the interpreter has failed to convert a term into a value).

If you are thinking you can also use exception handling for this, you are correct. But in this case it is not recommended because the same error in ATS can be produced from different lambda errors. This means either the code will increase in complexity, or the quality of the error messages recieved from the lambda language will diminish.

Code Generation

For fun, we implemented, in this lambda language, various functions such as the fibonacci, the factorial, 8-queen… etc. Each of the .dats files here will generate a .txt file with code written in our lambda language that can be parsed and interpreted to produce a value. These files can be accessed in the CLI through the do_execute() function. Currently, only the return values of files executed by do_execute() are stored on the blockchain. In the future, we would like to store the compiled byte code onto the blockchain along with the result so that validation can be done.

Conclusion

A lot of very interesting research is currently being conducted on programming languages for the blockchain. In general, but especially in blockchain applications, we want to be absolutely sure that the code will do exactly what we expect it to do — and on the first time it is executed. This is due to

  1. the immutability of the blockchain and hence the infinite persistence of your code (and its potential bugs)
  2. the difficulty of performing runtime tests on a fully decentralized network

Going back to the slot machine example, if you miscalibrate the slot machine, there is nothing you can do but weep for your lost coins — the bug will forever persist on the blockchain for users to freely exploit. To solve this issue, a number of languages try to combine theorem proving with programming. ATS, for example, uses types (more precisely dependent types) to flush out potential bugs at compile time (i.e. before running the code). This is enhanced by an ability to construct mathematical proofs about your code via special dynamic functions that exist only as compile-time helpers. In this way, provable guarantees can be provided without the need for runtime testing, and without impacting runtime efficiency. You can find some introductory examples on dependent types in ATS here and on theorem proving in ATS here.

Join Coinmonks Telegram Channel and Youtube Channel learn about crypto trading and investing

Also Read

--

--

Lance Galletti
Coinmonks

Lecturer @Boston University. Principal Software Engineer @Red Hat. Watch my DS videos https://youtube.com/@howithinkabout Always refining.