Introducing SmartPy and SmartPy.io, an Intuitive and Effective Language and Development Platform for Tezos Smart Contracts

Overview

Smart contracts are scripts stored on a blockchain that enable users to define custom rules and protocols for various purposes. Some typical examples include escrow accounts, insurance-like contracts, decentralized applications — frequently called dApps, multi-signatures (multi-sig) contracts, security tokens, and virtual or physical tokenized assets.

While the Ethereum Virtual Machine was branded as a “World Computer”, Tezos’ virtual machine, Michelson, was not designed to comfortably execute any arbitrary computation but to provide a solid foundation for secure smart contracts on a public blockchain and it is doing so quite successfully. Despite its obvious qualities, Michelson’s adoption suffers from a high barrier of entry.

SmartPy is a high-level smart contracts library and comes with related tools in the form of SmartPy.io to greatly ease the accessibility, understandability and provability of smart contracts on Tezos. It combines several popular aspects of software engineering to create an attractive language for smart contract development on Tezos:

  • Use of the widely popular programming language Python, with its simple syntax and fast and elegant scripting possibilities.
  • Functional programming for clean and effective formal manipulations in OCaml as is already done in Michelson.
  • Static analysis and automatic proofs. The need for proofs is well understood in mission-critical industries and highly recognized in the Tezos community. A specific focus in the larger blockchain ecosystem is performed here with fully automatic proofs, especially through abstract interpretation.

We believe that the combination of these three features will lower the barrier of entry for smart contracts and empower a large crowd of potential developers of dApps and other smart contracts to contribute to the Tezos ecosystem even when they do not already have prior experience with functional programming. It is our primary goal that SmartPy creates the necessary bridge between a development environment which is as simple and friendly as possible without sacrificing any of the robustness and security which are at the core identity of Tezos. Moreover, we’re excited by the specific possibilities brought by analytics on smart contracts’ development and adoption.

Motivations

Smart contracts have a mixed history, simultaneously generating a huge interest from programmers and enthusiasts on one hand but also showing some limitations and shortcomings on the other: an inability to fully understand and analyze what will or may happen, unexpected bugs that yield to forks and disrupt network effects, obscure compilation phases to low level virtual machines, decompilation limitations, etc. These shortcomings limit the trust developers can have in smart contracts and ultimately the adoption of this technology by end-users.

Michelson partly addresses these issues by defining a reasonably simple and quite well designed virtual machine, but it is not a complete solution: direct programming in Michelson is difficult and there is a lack of perfect alternatives for developers who wish to build applications on Tezos.

Translating a custom built high-level language into Michelson partly solves this problem. However, some important considerations remain: is the new syntax popular and obvious for programmers? Are the constructions as natural as they should be? Is the language as powerful as it should be? What about meta-programming? How do we build confidence in the language?

SmartPy answers these questions by incorporating the following elements:

  • A set of high-level primitives, SmartML, written in OCaml for a new smart contracts virtual machine that can be translated into Michelson. These primitives also exist on their own and can be executed, simulated and analyzed.
  • A Python library, SmartPy, providing a very intuitive and effective syntax that programmers can fully understand and use to directly access SmartML definitions. Today, Python is one of, if not the most, popular languages to do meta-programming in numerical analysis and machine learning with the multitude of Python frameworks: NumPy, TensorFlow, PyTorch, Scikit-learn, etc. SmartPy functions in a similar fashion: a programmer has access to the full power of Python and describes computational models with a regular Python library. These models are then evaluated elsewhere: on Tezos with SmartPy, in the cloud, on other CPUs or GPUs, etc. for numerical analysis or machine learning.
  • A compiler to translate SmartML contracts to Michelson.
  • Analytics — Analytics are elements of the user interface that provide some automatic procedures in SmartPy.io to analyze and prove properties of smart contracts. They are not required to define smart contracts, but we believe that they are absolutely necessary for smart contracts to gain mainstream adoption by building trust among both developers and end-users.

Timeline

SmartPy.io is an in-browser editor based on SmartPy that brings simulation, debugging, tests, and analytics capabilities straight to developers. It is under active development and a first public version is expected to be released in summer 2019. Analytics are expected to be publicly available in a second version.

The development of SmartPy is driven by Smart Chain Arena LLC, a New York based company, and the project is made possible with the great help of the Tocqueville Group and the Tezos Foundation.

All software will be released open source.


SmartPy.io

SmartPy.io is an in-browser development platform with advanced capabilities to develop, test, and prove properties of smart contracts.

All development takes place in a browser — from Python programming to Michelson contracts generation, testing, and analysis. Python and OCaml code is executed in JavaScript thanks to Brython and js_of_ocaml.

Technical overview

The SmartPy definition library is a Python library that has a single purpose: offer a simple, intuitive, and powerful syntax to define Tezos smart contracts.

Being a Python library means that by defining smart contracts, users benefit from the full power of Python to iterate over data structures, call other libraries, pre-compute parameters, etc. while ultimately targeting a powerful yet limited language such as Michelson.

Once built, SmartPy contracts become SmartML contracts and are handled in an OCaml library called SmartEngine, which is seamlessly manipulated from a Python script.

Overview of SmartPy.io: from SmartPy to Michelson, Tests, and Analytics

SmartPy

An example

We can start with an example: a very simple contract StoreValue that stores some value that we call storedValue and enable its users to either replace it by calling replace, double it by calling double or divide it by calling divide.

import smartpy as sp
class StoreValue(sp.Contract):
def __init__(self, value):
self.init(storedValue = value)
    @sp.message
def replace(self, data, params):
sp.set(data.storedValue, params.value)
    @sp.message
def double(self, data, params):
sp.set(data.storedValue, data.storedValue * 2)
    @sp.message
def divide(self, data, params):
sp.check(params.divisor != 0)
sp.set(data.storedValue, data.storedValue / params.divisor)

A few SmartPy constructions

Here are some typical constructions in SmartPy.

  • Storage and parameters are readily available:
@sp.message
def messageName(self, data, params):
...
  • Some usual operators:
a = 3 * b + 2 # operations and affectations
a <= b # comparisons
  • Access to record fields, dictionary elements, arrays, etc.:
# Retrieving
a.b
a[element]
# Setting
sp.set(target, value)

Types

SmartML contracts are fully typed but initial SmartPy contracts need not be. The typing occurs in the OCaml world and type inference is used to determine the type of every element, parameters, and storage of a contract for every type including on records, constructors, arrays, etc.

This is quite useful when defining complex contracts. For example, here is a pretty-print of a type inferred for a multi-sig contract.

[Rec(groups = [Rec(contractWeight = int,
ok = bool,
participants = [Rec(hasVoted = bool,
id = address,
weight = int)],
thresholdVoters = int,
thresholdWeight = int,
voters = int,
weight = int)],
groupsOK = int,
name = string,
ok = bool,
thresholdGroupsOK = int,
thresholdWeight = int,
weight = int)]

In this pretty-print, the conventions are the following:

[...]                     # list/array
Rec(...) # record type
int, bool address, string # basic types

Sometimes, types inference is not able to determine an exact type for all expressions because the developer of the contract is not explicit enough. SmartML notifies the developer if this is the case and he/she can add a type annotation such as:

# for arrays
Array(t, [v1, v2 , v3])
# for any expression
sp.setType(group.weight, int)

Evaluation of SmartPy

The evaluation of these constructions is not completely standard: SmartPy creates a SmartML value which represents some computation that will execute in a simulation or on Tezos at a later stage.

At execution, commands are evaluated in their natural order. Something that must be noted is that evaluation is atomic in the sense that if something fails, e.g. a sp.check does not validate, then the whole computation is reversed. This is also the case with Michelson.

An elementary example

The following elementary example represents part of a contract from the game, Nim.

def remove(self, data, params):
cell = params.cell
k = params.k
sp.check(0 <= cell)
sp.check(cell < data.size)
sp.check(1 <= k)
sp.check(k <= data.deck[cell])
sp.set(data.deck[cell], data.deck[cell] - k)
sp.set(data.nextPlayer, 3 - data.nextPlayer)

This part of the contract gets the data and params and accesses two parameter fields: 1) cell representing a cell from an array; and 2) k, a quantity to remove from this cell. It then performs a few checks (cell points to an element in the deck, k is non-zero and less that the quantity at deck[cell]) before actually updating the deck.

A little further with meta-programming

We can add the following code to the previous example to see the two stages of programming at play:

        if self.bound is not None:
sp.check(k <= self.bound)

These lines add a check in the abstract syntax tree depending on the value of self.bound.

These kinds of meta-programming constructions are quite powerful and allow the definitions of general libraries of contracts without paying gas for unused features in particular cases.

A complete example

Here is a more complete Nim game example.

import smartpy as sp
class NimGame(sp.Contract):
def __init__(self, size, bound = None, winnerIsLast = False):
self.bound = bound
self.winnerIsLast = winnerIsLast
self.init(deck = sp.range(1, size + 1),
size = size,
nextPlayer = 1,
claimed = False,
winner = 0)
    @sp.message
def remove(self, data, params):
cell = params.cell
k = params.k
sp.check(0 <= cell)
sp.check(cell < data.size)
sp.check(1 <= k)
if self.bound is not None:
sp.check(k <= self.bound)
sp.check(k <= data.deck[cell])
sp.set(data.deck[cell], data.deck[cell] - k)
sp.set(data.nextPlayer, 3 - data.nextPlayer)
    @sp.message
def claim(self, data, params):
sp.check(sp.sum(data.deck) == 0)
sp.set(data.claimed, True)
if self.winnerIsLast:
sp.set(data.winner, 3 - data.nextPlayer)
else:
sp.set(data.winner, data.nextPlayer)

A regular Python class NimGame is defined which inherits a classSmartpy.
A NimGame has two points of entry, called messages: remove and claim.

Tests

To interact with and test this contract, we can write a test. Here we simply create an html document that is directly visible in SmartPy.io. This can be done in the same editor where we defined the contract. It is also regular Python code.

For the StoreValue contract, we can write.

@addTest(name = "Test Store Value")
def test():
c1 = StoreValue(12)
html = c1.fullHtml()
html += c1.execMessage('replace', value = 15).html()
html += c1.execMessage('replace', value = 25).html()
html += c1.execMessage('double').html()
html += c1.execMessage('divide', divisor = 2).html()
html += c1.execMessage('divide', divisor = 0).html()
html += c1.execMessage('divide', divisor = 3).html()
setOutput(html)

And for the Nim game, we write.

@addTest(name = "Test Nim Game")
def test():
# Define of a contract
c1 = NimGame(size = 5, bound = 2)
# Show its representation
html = h2("Contract")
html += c1.fullHtml()
# Interact with the contract
html += h2("Message execution")
html += h3("A first move")
result = c1.execMessage("remove", cell = 2, k = 1)
# and update the html document we're creating
html += result.html()
html += h3("A second move")
result = c1.execMessage("remove", cell = 2, k = 2).html()
html += h3("An illegal move")
result = c1.execMessage("remove", cell = 2, k = 1).html()
html += h3("Another illegal move")
result = c1.execMessage("claim").html()
html += h3("A third move")
result = c1.execMessage("remove", cell = 1, k = 2).html()
html += h3("More moves")
html += c1.execMessage("remove", cell = 0, k = 1).html()
html += c1.execMessage("remove", cell = 3, k = 1).html()
html += c1.execMessage("remove", cell = 3, k = 1).html()
html += c1.execMessage("remove", cell = 3, k = 2).html()
html += c1.execMessage("remove", cell = 4, k = 1).html()
html += c1.execMessage("remove", cell = 4, k = 2).html()
html += h3("A failed attempt to claim")
html += c1.execMessage("claim").html()
html += h3("A last removal")
html += c1.execMessage("remove", cell = 4, k = 2).html()
html += h3("And a final claim")
html += c1.execMessage("claim").html()
setOutput(html)

At each step, when computing a step such as:

c1.execMessage("remove", cell = 4, k = 1).html()

We get a nicely formatted message such as

Executing: remove by [] at time [0].
Params:
cell = 0
k = 1
Balance:0
Result: OK
Effects:
Data:
Claimed Deck            NextPlayer Size Winner
False [0, 0, 0, 4, 5] 1 5 0

Being able to go through a complete list of interactions is paramount for understandability and establishes confidence in developers to build secure smart contracts on Tezos.

Taking this example further

The NimGame example can be taken further in several directions:

  • Adding security verification to verify the identity of each player. This is typically done with some constructions like the following:
sp.check(data.id == sp.sender.identity)
  • Defining a contract to play several Nim games in parallel where users can freely add new decks. Doing so helps reduce the cost of gas and allows for a more uniform landscape of smart contracts and dApps.
    This can be done by using a big_map for games.

Here are possible remove and claim methods. This code is very similar to the initial example with only one game per contract.

    @sp.message
def remove(self, data, params):
gameId = params.gameId
game = data.games[gameId]
cell = params.cell
k = params.k
sp.check(0 <= cell)
sp.check(cell < game.size)
sp.check(1 <= k)
sp.check(k <= game.bound)
sp.check(k <= game.deck[cell])
sp.set(game.deck[cell], game.deck[cell] - k)
sp.set(game.nextPlayer, 3 - game.nextPlayer)
    @sp.message
def claim(self, data, params):
gameId = params.gameId
game = data.games[gameId]
sp.check(sp.sum(game.deck) == 0)
sp.set(game.claimed, True)
sp.if game.lastWins:
sp.set(game.winner, 3 - game.nextPlayer)
sp.else:
sp.set(game.winner, game.nextPlayer)

On can notice the use of sp.if instead of if. This is necessary because the condition is evaluated on the blockchain. There is a native Python implementation:

with sp.ifBlock(...):
...

and sp.if serves as syntactic sugar for sp.ifBlock in SmartPy.io.

As seen for if, control operators for (to loop over some list-like data-structure) and while (to loop over some condition) have their own native versions in SmartPy, using forBlock and whileBlock or the syntactic sugar sp.for and sp.while.

Translating SmartPy to SmartML

The Python interpreter evaluates Python code containing SmartPy definitions. It creates a very simple expression, called a S-expression, that is parsed and typed while some properties are verified in the SmartEngine backend.

The S-expression sent between SmartPy and SmartML has the following form, which is not supposed to be human-readable.

(storage (record (claimed (bool False)) (deck (array "int" (int 1) (int 2) (int 3) (int 4) (int 5))) (nextPlayer (int 1)) (size (int 5)) (winner (int 0))) messages ((claim ((check (eq (sum (attrData "deck")) (int 0))) (set (attrData "claimed") (bool True)) (set (attrData "winner") (attrData "nextPlayer")))) (remove ((define "cell" (unknown 1)) (define "k" (unknown 2)) (check (ge (getParam cell (unknown 1)) (int 0))) (check (lt (getParam cell (unknown 1)) (attrData "size"))) (check (ge (getParam k (unknown 2)) (int 1))) (check (le (getParam k (unknown 2)) (int 2))) (check (le (getParam k (unknown 2)) (getItem (attrData "deck") (getParam cell (unknown 1))))) (set (getItem (attrData "deck") (getParam cell (unknown 1))) (sub (getItem (attrData "deck") (getParam cell (unknown 1))) (getParam k (unknown 2)))) (set (attrData "nextPlayer") (sub (int 3) (attrData "nextPlayer")))))))

The rationale behind this design is to be able to apply the same methodology of building this kind of simple S-expression for other languages and have something like SmartJS instead of SmartPy, etc.

Something that must be noted is that SmartPy source code contains both regular Python code to do any “meta” computation (constant parameters, checks, etc.) and Python expressions that represent SmartPy types, values, expressions or commands. This interleaving is quite powerful but must be understood. This is similar to what happens in NumPy, TensorFlow, PyTorch, Scikit-learn, etc., to represent computations.

Viewing SmartML

Smart contracts in SmartML are abstract syntax trees as typically done in OCaml with values, expressions, commands, etc. Users of SmartPy.io can view such contracts with a simple trick: they are pretty-printed in SmartPy. This is quite efficient and gives great confidence in the meta-programming ability of SmartPy: you see what you get in a simple and understandable form.

Here is an example for the Nim game of such a pretty-printing.

@sp.message
def claim(self, data, params):
sp.check(sum(data.deck) == 0)
sp.set(data.claimed, True)
sp.set(data.winner, data.nextPlayer)
@sp.message
def remove(self, data, params):
cell = params.cell
k = params.k
sp.check(cell >= 0)
sp.check(cell < data.size)
sp.check(k >= 1)
sp.check(k <= 2)
sp.check(k <= data.deck[cell])
sp.set(data.deck[cell], data.deck[cell] - k)
sp.set(data.nextPlayer, 3 - data.nextPlayer)

This looks like the same code with some arguments inlined.


Analytics

Several analytics are necessary to help understand and prove Tezos smart contracts. We can view them as shedding light on several orthogonal dimensions of smart contracts.

  • Value domain analysis: which studies possible values taken by variables. This is the most important analysis and other analyses depend on it.
    Integers in Michelson use arbitrary precision so there is no overflow issues but this analysis still offers bound checking which is useful for gas and storage cost estimation and control.
  • Effects: determination of possible effects such as Payments.
  • Ownership: determination of who can change data in storage. A typical output of this analysis will be to determine a signature or a set of signatures that can change some storage. It would prevent bugs like the so-called MultiSig Parity Bug by construction and will be available for all smart contracts and not only the most important ones where an ad-hoc proof can be developed.
  • Workflow: graph of possible executions. This analysis is concerned with the sequencing of messages: ordering of messages, multiple calls, etc. It works by building a finite automaton approximation of the contract. Having the intended workflow for a contract helps prevent reentrancy or similar attacks.
  • Contract simplification: reduction of gas costs and complexity of contracts and storage.
  • Token conservation: verification of token attribution in a contract so that users know that their tokens cannot be misappropriated by others in contracts. The alternative could be to compute this property at runtime but it would cost gas and it would require a proof as well.
  • Time analysis. There are several notions of time in a blockchain. There is the real time used by every human activity with dates, hours, minutes, seconds, time zones, etc. And then there is “blockchain time” with the block time resulting from the sequence of blocks. Whatever time is used, it is important to check that a message to a contract can be made ‘on time’ and that a future event doesn’t happen either too early or too late in the future. This analysis studies contract execution timing.

Discussion about analytics vs ad-hoc proofs

Automatic analyses as presented in analytics and ad-hoc proofs of a given contract, in e.g. Coq, serve the common purpose to understand what a contract does with the ultimate goal to avoid costly mistakes or bugs.

We must first remark that we don’t prove smart contracts, we prove properties about some smart contracts: that they do something, serve some purpose, conserve some invariants, etc.

Automatic and ad-hoc proofs are not mutually exclusive, and it is important to compare them and understand what they can and cannot do.

  • Automatic analyses can be sound in the sense that they don’t claim something false but they cannot be complete in the sense that they cannot in general always answer the interesting questions they’re trying to answer.
  • Building an ad-hoc proof for a given contract in Coq is probably doable for every natural question we try to answer on every reasonable contract, but it’s probably quite costly and cannot be reasonably expected outside of a select small list of very important contracts.
  • Some analytics can present a result and may even be able to produce the ad-hoc proof in, e.g., Coq or part of it. We strongly think that this is a very nice research project.
  • These two approaches are non-exclusive in the sense that we can pragmatically start with an automatic analysis and see what it can or cannot do and continue from there to build an ad-hoc proof if deemed necessary.
  • The way analytics are designed in SmartPy.io is really to focus on different dimensions (values, effects, ownership, workflow, etc.). Depending on the contracts, these questions will be useful or not. If a question appears repeatedly and is not answered by one of these analytics, it will probably be reasonable to add a new analysis to study it.

Next steps and tasks

Here are a few items on the roadmap.

Next steps: usable version of SmartPy.io

  • Code base improvements
  • Editor improvements
  • Full determination of the list of constructions in SmartPy and implementation of all constructions of Michelson
  • Implementation and tests of the compiler to Michelson
  • Documentation and implementation of several example contracts
  • First public version of SmartPy.io
  • Implementation of the analytics framework
  • Open source all software

Unscheduled tasks

  • Adaptation of the editor to possibly use a Javascript SmartJS library instead of the current Python SmartPy, the rest staying the same
  • Miscellaneous improvements of SmartPy.io
  • Study and proof or verification of the compiler to Michelson
  • Optimization of compiler to Michelson possibly through Michelson to Michelson optimizations

Tasks for a later step

  • Implementation of all the analytics
  • Use of SmartPy / SmartML to research potential improvements to Michelson