Motoko, a Programming Language Designed for the Internet Computer, Is Now Open Source

Andreas Rossberg
The Internet Computer Review
Jun 10, 2021

Motoko seamlessly supports the building of applications on the Internet Computer while leveraging the blockchain’s unique advantages.

The Motoko team at the DFINITY Foundation is thrilled to announce the open-sourcing of the Motoko compiler, interpreter, test suite, and documentation. After three years of development and a year and a half since its public introduction, Motoko’s full sources are now available (along with its previously open-sourced base library) under the Apache 2.0 License.

Motoko is a programming language designed to seamlessly support the programming model of the Internet Computer, making it easier to build applications efficiently and to take advantage of the platform’s distinctive features. Motoko is strongly typed, actor-based, and has built-in support for orthogonal persistence and asynchronous message passing. Productivity and safety features include automatic memory management, generics, type inference, pattern matching, and both arbitrary- and fixed-precision arithmetic. Messaging transparently employs the Internet Computer’s Candid interface definition language and wire format for typed, high-level, and cross-language interoperability.

We hope that this code release fosters collaboration with and contributions from the wider community, whether they involve improving documentation, polishing error messages, or producing entirely new tools such as additional IDE integration, debugger support, and code-formatting tools.

Our intention is to provide the same development experience to both internal and external contributors. Currently, our test infrastructure still partly relies on internal services, but we are working to replace them with publicly accessible ones. External and internal developers will be on equal footing. No doubt we will encounter some growing pains as we move into the open. Please bear with us.

For those not yet familiar with Motoko, here is a recap of what it is, why we developed it, and how it works…

WebAssembly

To understand Motoko, we first have to talk briefly about WebAssembly — aka Wasm (yes, correctly spelled without all caps). As you may be aware, Wasm is a newish low-level code format that aims to be portable, safe, and efficient. Its initial use case has been the web, but the name actually is a misnomer: when we designed Wasm in the W3C Working Group, we carefully did it as an open standard and a universal platform. That is, it is not aimed at any specific programming language, paradigm, computing environment, or platform, and we made sure that it is not at all tied to the web. So it is absolutely no accident that Wasm is seeing adoption in many other environments, such as cloud computing, edge computing, mobile, embedded systems, IoT, and blockchains.

There were many, many design considerations that went into Wasm, some obvious and some rather subtle. Too many to go into here. A fairly comprehensive discussion of Wasm’s technical goals, design choices, formal semantics, and implementation techniques can be found in a scientific article that we published in Communications of the ACM (an older and more technical version of this article is freely accessible here).

Wasm’s main difference compared to other virtual machines is that it is not optimized for any specific programming language but merely abstracts the underlying hardware, with a bytecode directly corresponding to the instructions and memory model of modern CPUs. On top of that, Wasm supports sandboxing through strong modularity and a rigorous mathematical specification that ensures that execution is safe, free of undefined behavior, and (almost) entirely deterministic. Moreover, these properties actually have a machine-verified mathematical proof!

Altogether, these properties were intended to make Wasm an attractive choice for a wide range of environments and use cases that have high expectations for portability, safety, generality, and performance — such as the Internet Computer.

Wasm’s properties made it an obvious choice for representing programs running on the Internet Computer. But in practice, porting an existing programming language to Wasm is not entirely trivial. Obviously, it requires implementing a new compiler backend. That’s fun, but the effort doesn’t end there: it also requires porting the language’s runtime system and library primitives. And there are still a few features, especially ones relevant to more high-level languages, that cannot currently be implemented in Wasm easily — for example: threads, coroutines, exceptions, and tail calls. While various proposals to enrich Wasm with respective functionalities are on the horizon, they have not yet been finalized for standardization.

Although there are many experimental language implementations targeting Wasm already, most are not yet ready for prime time. The ones that are primarily include low-level systems languages like C/C++ and Rust. These are certainly great for their use cases, but they are less-than-ideal tools for developing high-level applications for the Internet Computer, where accessibility, productivity, and high assurance tend to be more desirable than manual meddling with memory management.

At the same time, a language for the Internet Computer needs to provide access to the platform’s main concepts: a distributed programming model with asynchronous message passing, notions of resources like cycles (a.k.a. gas), and a few other idiosyncrasies. Sure, they could all be made available as libraries, but a language that natively includes appropriate constructs can deliver a much more seamless programming experience.

So if we have to do work anyway to get off the ground, why not apply it to creating something that can deliver an optimal user experience and convey our vision for how to program the Internet Computer?

Motoko

That is why — despite all the risks of creating yet another language — we decided to create Motoko. We wanted a language that is safe, easy to use, and seamlessly exposes the concepts of the platform, as well as one that looks sufficiently friendly and accessible to most programmers. Currently, that latter goal makes it practically inevitable that it’s firmly in the semicolons-and-curly-braces camp of languages. And no suitable language existed in this camp.

But Motoko’s rather conventional skin is only superficial: its interior is that of a modern language. For example, every construct is an expression, it has closures, it has variant types and statically checked pattern matching, it has garbage collection, and of course it has a flexible type system that is actually sound, i.e., it really guarantees the absence of certain errors like crashes, undefined behavior, misinterpreting data, or simply missing a case in a switch. No holes!
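To make a few of these features concrete, here is a minimal sketch in Motoko. The `Shape` type and the functions `describe` and `adder` are hypothetical names invented for illustration; they are not taken from the Motoko sources or base library.

```motoko
// Hypothetical variant type for illustration: a value is exactly one of the cases.
type Shape = {
  #circle : Nat;        // radius
  #rect : (Nat, Nat);   // width, height
};

// `switch` and `if` are expressions, so they directly produce the function's result.
// The compiler statically checks the cases against the variants of Shape.
func describe(s : Shape) : Text {
  switch (s) {
    case (#circle r) { if (r == 0) "point" else "circle" };
    case (#rect(w, h)) { if (w == h) "square" else "rectangle" };
  }
};

// A closure: the returned anonymous function captures `n`.
func adder(n : Nat) : Nat -> Nat {
  func (m : Nat) : Nat { m + n }
};

let addTwo = adder(2);
assert (describe(#rect(3, 3)) == "square");
assert (addTwo(40) == 42);
```

Removing one of the cases from the `switch` would be reported by the type checker rather than discovered at run time.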

At the same time, we intentionally tried not to be fancy or reinvent the wheel, but rather built on a wealth of history, both practical and theoretical, and acknowledged the lessons that have been learned over decades in this field. Besides putting together a coherent mix of well-understood features, Motoko’s design incorporates many small decisions to minimize foot guns and err on the side of safety, e.g., numbers cannot overflow by default, locals are immutable by default, concurrent execution is atomic by default, null cannot occur by default, fields are private by default, and so on. Oh, and there is no inheritance, only subtyping.
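Some of these defaults can be illustrated with a short sketch. All names below (`big`, `limit`, `nameLength`, `account`) are hypothetical and chosen only for this example.

```motoko
// Nat is arbitrary precision by default: large results do not wrap around.
let big : Nat = 2 ** 100;
assert (big > 18_446_744_073_709_551_615); // exceeds the largest 64-bit value

// Fixed-width types do not wrap silently; wrapping has to be requested
// explicitly with a dedicated operator such as +%.
let byte : Nat8 = 255;
let wrapped : Nat8 = byte +% 1;
assert (wrapped == 0);

// `let` bindings are immutable; mutation requires `var` and `:=`.
let limit = 10;
var count = 0;
count := count + 1;
assert (count < limit);

// null can only occur in option types (?T) and has to be handled explicitly.
func nameLength(name : ?Text) : Nat {
  switch (name) {
    case null { 0 };
    case (?n) { n.size() };
  }
};
assert (nameLength(null) == 0);
assert (nameLength(?"Motoko") == 6);

// Fields are private unless marked `public`.
object account {
  var balance : Nat = 0; // not accessible from outside the object
  public func deposit(n : Nat) { balance += n };
  public func get() : Nat { balance };
};
account.deposit(5);
assert (account.get() == 5);
```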

Implementing these parts of Motoko and compiling them to Wasm is conventional compiler craft. The Motoko compiler, written in OCaml, uses a typed intermediate representation, a few transformation passes, and spits out Wasm bytecode. The generated Wasm module includes a small runtime system, written in C and Rust, that mainly implements a simple garbage collector using the Wasm memory as its heap. That wasn’t hard, but surely there is much potential for improvement here, and we are working on that.

Actors

The central feature of Motoko, however, is its direct support for actors, in both syntax and type system. The actor model is a well-known concept that is 40+ years old, but sadly, it has barely made it into mainstream languages. An actor is like an object (and in Motoko, even looks like one), in that it encapsulates private state along with a set of methods to process messages that can be sent to it. But all message sends are asynchronous. Consequently, unlike conventional methods in OO, actor methods do not have results. Moreover, all messages are received sequentially by an actor — that is, it has an implicit message queue and methods execute atomically, even when messages are sent concurrently.

Actors are a great model for concurrent programming because they automatically prevent race conditions (thanks to atomicity and encapsulated state) and deadlocks (because execution never blocks), and hence rule out many concurrency bugs. All that without requiring programmers to ever define a lock. Actors are also a great model for distributed programming, because asynchrony naturally deals with the latency involved in sending a message to a potentially remote receiver. And finally, actors are a great fit for DFINITY’s Internet Computer, where applications are deployed in the form of so-called canisters: essentially, actors represented as Wasm modules that can communicate across subnetworks. So a Motoko actor compiles to a Wasm module, where the methods become exported Wasm functions with special conventions defined by the platform.

In short, an application in Motoko is an actor (or several), which in turn is a big asynchronous object compiled into a Wasm module. With Wasm’s notion of memory, such an actor can immediately manage up to 4 GiB of internal state, although this can be enlarged further by linking multiple Wasm modules that each have their own memory.
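As a rough illustration of what such an actor looks like, here is a minimal sketch. The actor name `Counter` and its methods are hypothetical. Both methods are one-way messages without results, matching the description above; returning a value to the caller requires futures, which the next section covers.

```motoko
actor Counter {
  // Private state: reachable only through the messages below.
  var count : Nat = 0;

  // One-way message: the sender does not wait and gets no result.
  public func inc() : () {
    count += 1;
  };

  // Another one-way message. Each message runs atomically, so `reset`
  // never observes a half-finished `inc`.
  public func reset() : () {
    count := 0;
  };
};
```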

Futures

To make asynchronous programming more convenient and allow expressing it in sequential “direct style,” Motoko adopts another 40+-year-old idea from the annals of programming language research, though one that fortunately became a bit more popular recently: futures (also called promises in some communities). In Motoko, they materialize in the form of “async values,” values of type async T that are produced by expressions prefixed with the async keyword. In particular, a function body can be an async expression, thereby naturally replacing and generalizing the more monolithic concept of an “async function” that exists in some other languages.

With that, actor methods are allowed to have results after all — as long as those are futures. Futures can be awaited to get their value, but only inside another async expression, akin to async/await monads as known from other modern languages.
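A minimal sketch of methods with future results, assuming a hypothetical actor `Counter` with methods `inc` and `incTwice` (names invented for this example):

```motoko
actor Counter {
  var count : Nat = 0;

  // The result type `async Nat` is a future: the caller receives it
  // immediately and can await the Nat it will eventually hold.
  public func inc() : async Nat {
    count += 1;
    count
  };

  // `await` is only allowed inside an async context, such as the body
  // of a method with an `async` result type.
  public func incTwice() : async Nat {
    ignore await inc(); // first message to ourselves; result discarded
    await inc()         // second message; its result becomes ours
  };
};
```

Each `await` is a point where the actor may process other messages, so the state seen after an `await` is not guaranteed to be the state seen before it.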

The Motoko compiler implements this via a traditional CPS (continuation passing style) transformation, turning each await point into a separate Wasm function (plus some closure environment) representing the continuation of the await. In fact, it’s double-barreled CPS, because every message can also have a failure reply with a respective failure continuation. By convention, a method with an async result is one that sends a reply message carrying the result values as arguments. This message is received by the created continuation function, which can then resume the execution it has captured. Waiting for a reply doesn’t block an actor — it can freely receive other messages in the meantime.
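The idea behind double-barreled continuation passing can be sketched by hand with ordinary functions. This is only an illustration of the style, not the code the compiler generates; `divideCPS`, `onReply`, and `onReject` are hypothetical names.

```motoko
// Instead of returning a result, the function receives two continuations:
// one for the success reply and one for the failure reply.
func divideCPS(a : Nat, b : Nat, onReply : Nat -> (), onReject : Text -> ()) {
  if (b == 0) { onReject("division by zero") } else { onReply(a / b) }
};

var result : Text = "";

// The two anonymous functions play the role of "the rest of the program"
// after the await point; they capture `result` in their closure environment.
divideCPS(
  84,
  2,
  func (q : Nat) { result := "reply: " # debug_show(q) },
  func (e : Text) { result := "reject: " # e },
);
assert (result == "reply: 42");

divideCPS(
  1,
  0,
  func (q : Nat) { result := "reply: " # debug_show(q) },
  func (e : Text) { result := "reject: " # e },
);
assert (result == "reject: division by zero");
```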

Persistence

Another important consideration for Motoko was allowing developers to utilize blockchain technology without having to learn an entirely new type of computing. So we took out most of the special knowledge that you might need in the current breed of blockchain programming languages. For example, there is no observable notion of block or block height, there are no explicit constructs for updating state on the blockchain, and there is no other API for writing data to persistent storage, like files or databases (although that could be emulated as a library). Instead, the Internet Computer implements orthogonal persistence, yet another old idea, in which a program has the illusion of running “forever” and its memory simply stays alive (at least until it is explicitly taken down). In Motoko, this means that developers do not have to worry about explicitly saving their data between messages or bother with files or an external database: whatever values or data structures are stored in program variables will still be there when the next message arrives, even if that is months later.
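From the programmer’s point of view, this can look as simple as the following sketch. The actor `Guestbook` and its members are hypothetical names; note that the code contains no calls for loading or saving data.

```motoko
actor Guestbook {
  // Ordinary variables: their contents are still there when the next
  // message arrives, however much later that is.
  var visits : Nat = 0;
  var lastVisitor : ?Text = null;

  public func sign(name : Text) : async Text {
    let greeting = switch (lastVisitor) {
      case null { "You are the first visitor, " # name };
      case (?prev) { "Welcome, " # name # "; the previous visitor was " # prev };
    };
    visits += 1;
    lastVisitor := ?name;
    greeting
  };

  public func total() : async Nat { visits };
};
```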

The platform takes care of transparently saving and restoring the private state of a canister between method invocations. That was relatively easy to retrofit onto a Wasm engine, because the state of a Wasm module is clearly isolated in a module’s memory, globals, and tables. For the most part, it is sufficient to watch Wasm memories with the use of virtual memory techniques exposed by operating systems. This way, the platform knows when pages in such a memory have been modified and can take whatever measures are necessary to persist the dirty pages, as well as hashing them for the distributed consensus protocol.

Beyond Motoko: Interface definitions with Candid

Because the Internet Computer runs Wasm, Motoko is just one option for creating an application — and intentionally so. We also provide Rust, and we are looking forward to making other language choices available. Even then, because each language will uniformly compile to canisters represented in Wasm, these canisters can freely communicate with each other through message sends regardless of their source language.

To make such interoperability well-defined, we have introduced a generic interface definition language (IDL) named Candid. It is the lingua franca of communication on the Internet Computer, and entirely independent from Motoko. It describes the set of messages understood by a canister and what type of data is sent along. Data is described in Candid by a combination of canonical data types (numbers, text, arrays, records, variants, functions, references to other canisters) that are independent from the Motoko type system or that of any other programming language.

Phew, yet another type system? Well, programmers will probably be pleased that the Motoko compiler can automatically consume and produce such interface descriptions for actor exports and imports and map them from and to corresponding Motoko types. It also automatically generates the right Wasm code to serialize and deserialize the argument data for each message, transparently interconverting Motoko’s internal representation with the binary format that Candid specifies.

This way, Motoko programs can communicate with external canisters in a typeful manner and express remote invocations as if they were local objects in the program. And that is regardless of whether the remote canisters are written in Motoko or, say, Rust; the interface description of a canister is enough as type information. Besides mere convenience, interfaces also provide a strong form of modularity, where programs can be type-checked against other actors/canisters without having access to their implementation.
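As a sketch of what this can look like, assume a remote canister whose Candid interface is the one shown in the comment below. The interface, the type name `RemoteCounter`, and the actor `Client` are all hypothetical; the remote canister could be written in any language.

```motoko
// Assumed Candid interface of the remote canister (hypothetical):
//
//   service : {
//     inc : () -> (nat);
//     get : () -> (nat) query;
//   }

actor Client {
  // The corresponding Motoko actor type. Only the interface is needed;
  // the remote implementation is never seen by the type checker.
  type RemoteCounter = actor {
    inc : shared () -> async Nat;
    get : shared query () -> async Nat;
  };

  // The remote canister is passed in as a typed actor reference and is
  // invoked like a local object; arguments and results are serialized
  // to and from the Candid wire format automatically.
  public func bump(remote : RemoteCounter) : async Nat {
    await remote.inc()
  };

  public func read(remote : RemoteCounter) : async Nat {
    await remote.get()
  };
};
```

Calling a method that the interface does not declare, or using a result at the wrong type, is rejected when `Client` is compiled.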

Conclusion

Our goal is for all languages to have equal rights on the Internet Computer, all compiling to canisters in Wasm and all communicating seamlessly through Candid. This is important to making the Internet Computer open. Motoko is just one choice among many, but is intended to be a particularly good one for a wide range of applications to be developed on the Internet Computer.

Start building at smartcontracts.org and join our developer community at forum.dfinity.org.