Archetype, a DSL for Tezos

Aug 1 · 7 min read

Archetype is a domain-specific language (DSL) to develop smart contracts on the Tezos blockchain, with a specific focus on the formal verification of the contract.

Archetype is funded by the Tezos Foundation and developed by edukera.

Introduction

Why a new smart contract language?

There are two categories of smart contract languages :

  • Generic languages, usually with familiar syntaxes (for example OCaml or Pascal in the case of LIGO, and Python in the case of smartpy) with system-specific builtins/libraries. They give the impression you can develop a smart contract as you would do for any program.
  • Domain-Specific Languages (DSL) which system-specific algorithms, carried by a tailored syntax, and limited generic algorithmic capacity

What is specific to smart contracts that would require a DSL?

A blockchain is not designed for computational intensive calculus (nor for data storage). As a consequence, any required computation should be as much as possible processed off-chain and only checked on-chain. For example, say a process requires the square root of 2; rather than having the smart contract implement the Babylonian algorithm, it should better be computed off-chain and provided to the smart contract; the smart contract just checks that the provided value multiplied by itself is close enough to 2...

A smart contract is a highly critical piece of software, especially when dealing with currencies. Bugs in smart contracts have lead to the loss of millions of dollars. Being provided with the full algorithmic capacity of a standard language increases the potential complexity of the program and the risk of bugs. Conversely, limiting the algorithmic capacity of the language provides a form of security by design. For example in Archetype, recursive calls are not authorized.

Why not just Michelson?

Michelson is the native smart contract language of Tezos. Any other language is converted somehow to Michelson.

Tezos’ contract execution machine is a stack machine, and the Michelson language provides instructions to change the stack (push, pop, …). It is way superior to ethereum’s stack machine thanks to its strongly typed approach and its standard memory management, which makes it an excellent backend language.

The main drawback is that it makes contracts hard to write and read because you have to follow and imagine the evolution of the stack in order to figure out what the contract is about. If “Code is Law”, we propose a more explicit smart contract language, easier to read, maintain and formally verify.

Why formal verification?

The main issue is that a bug in a smart contract breaks the entire business process. This questions the whole “Blockchain + smart Contracts” as a trustless system, since you need to trust the developer to use a contract.

Formal verification is the golden standard when it comes to a trustless guarantee of program correctness: say you have a program and a specific property that the program is supposed to have; say this property is written in formal logic. Program verification consists in figuring out a mathematical proof that the program has the property.

A formal mathematical proof is the perfect trustless guarantee because it can be verified automatically!

Are there limits to formal verification?‌ Yes:

First, the guarantee you get is limited to the properties you have been able to identify. A critical property may still be forgotten!

Second, you need to be skilled in formal methods to formalize the contract properties, and even more to prove them, especially if they are complex.

Archetype provides the possibility to write the properties the contract is supposed to have. The archetype compiler then generates the contract in whyml format for verification in Why3.

At last, regarding the second limitation point mentioned above, note that the use of a DSL, which limits the algorithmic complexity of the contract, facilitates the verification process (see “Contract verification” section below).

Some Archetype features

In this section, we cover some of the main features of the language with some basic examples. A more extensive presentation of the language may be found at https://docs.archetype-lang.org/

Contract as a state machine

It is possible to design the contract as a state machine by declaring the list of states and transitions between these states.

In the example below, the contract declares some roles identified by their addresses, a tez price value to exchange in an escrow process, 5 states, and transitions from one state to another.

The advantage of the state machine design is the ease to get a map of the process; it is for example straightforward to derive the graph of transitions from the above code, as illustrated below:

Collection of assets

Data is modeled as collections of assets, in a way very similar to the modeling in Hyperledger.

Say you want to set up a contract which manages miles for a marketing loyalty program. Each customer collects a set of miles which can be consumed before the expiration date (each mile has an expiration date).

This process is modeled with two assets, a “mile” asset and an “owner” of miles asset:

asset declaration

The “amount” field (line 3) is for storage optimization; it enables to aggregate miles with the same expiration date. For example, 10 miles with the same expiration date are merged into one mile with the amount value set to 10.

An owner possesses a collection of miles through the “miles” partition (line 9). A partition means that any mile is possessed by one and only one owner.

The collection of assets is referred to by the name of the asset. For example, the following code declares a contract entry called “add” to add a mile asset to an owner:

Line 4 of the above snippet demonstrates the “get” method on the collection of owners. The argument is the address of the owner. It returns the owner asset associated with the “ow” address.

Line 6 adds a new owner to the collection (it fails if an owner with “ow” id already exists).

The field “miles” being a partition, it prevents from straightforwardly adding a mile to the collection of miles. It has to be added (and removed) through an “owner” asset.

Asset invariant

We have used the “int” type for the “amount” field of the mile asset (line 3 above). The issue here is that it only makes sense for this field to be strictly positive.

A solution is to formulate the positivity property in the declaration of the asset in order to ensure the property throughout the life of the contract:

The property “i”, declared line 6 specifies that the “amount” value always has to be strictly positive. Such a property is called an asset invariant.

This property is now supposed to hold before calling any contract entry and after any contract entry.

It is the role of the verification framework to verify that asset invariants hold after any contract entry. In this example, the asset invariant would not be verified after the “add” action (presented above). Indeed the positivity of the new mile’s “amount” field is not ensured by any mean.

The solution is to add a “require” instruction on the new mile’s “amount” field:

There is no unsigned integer type in Archetype. Whenever you want a value to be positive, add an asset (or variable) invariant.

Action postcondition

We saw the “add” action above. Now consider the “consume” action which has to delete a certain number of miles. The process is a bit more complex than it sounds because of the storage optimization via the “amount” field.

Miles consumption should remove oldest miles first til the accumulated amount of removed miles equals the right amount. It may be necessary to reduce the amount value of the last mile to consider.

Here is a proposition of implementation:

How to decide whether this implementation is correct?

Typically we would like to make sure that the “consume” action actually consumes quantity miles.

A solution is to formulate this property as a postcondition, that is a formula about the effect the action is supposed to have over the assets:

This formula states that the sum of miles amounts after the execution of “consume” is equal to quantity subtracted from the sum of miles amount before the execution of the action.

One of the tasks of the verification framework would be to verify whether the implementation has this property.

It turns out the implementation above does not verify the postcondition! The verification framework is able to isolate the issue in case line 9, when the remainder is strictly less than the current mile amount: the remainder should be set to zero only after the current “m” mile amount is decreased by remainder …

A correct version of the “consume” action may be found here, in the archetype contract library.

Work in progress

Archetype is currently under development. We hope to release the first version in September. This version will provide:

  • compiler
  • transcoders to CamLIGO and Whyml (for verification)
  • VS Code integration

Do not hesitate to visit the github project and participate!

For more details on the archetype language, please visit the documentation site:

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade