Library Driven Development in Solidity

A comprehensive review of how to develop more modular, reusable and elegant smart contract systems on top of the Ethereum Virtual Machine by using libraries

Solidity is a limited language

Coming to Ethereum mainly from the lands of Swift and JavaScript, I find developing in Solidity a definite step back in terms of what the language allows the programmer to do and how expressive it is.

Solidity, and in general languages that compile to bytecode intended to be executed in the EVM, are limited because:

  • When executed, your code will run on every node of the network. Once a node receives a new block, it will verify its integrity. In Ethereum this also means verifying that all the computations that happened on that block were performed correctly and the new state of contracts is correct.
  • As a result, even though the EVM is Turing-complete, heavy computations are expensive (or outright impossible under the current gas limit), because every node has to perform them, which slows down the network.
  • A standard library hasn’t really been developed yet. Arrays and strings are especially painful: I have personally had to implement my own ASCII encoding and decoding, and an algorithm to lowercase strings by hand, which are tasks I never had to even think about in other languages/platforms.
  • You cannot get data from the outside world (outside of the EVM) unless it gets in via a transaction (an oracle), and once a contract is deployed it is not upgradable (you can plan for migrations or pure storage contracts, though).

Some of these limitations are needed for the existence of the Ethereum computing platform (you will never be able to store a backup of your Google Photos and perform image recognition purely on-chain, and that is just fine). Other limitations are here just because it is a really young technology (though evolving blazingly fast) and it will keep improving over time.

That being said, it is very possible to build interesting projects on top of Ethereum today. I recently discovered the use of libraries as a way to keep code clean and organized.

What is a library

In Solidity, a library is a different type of contract, that doesn’t have any storage and cannot hold ether. Sometimes it is helpful to think of a library as a singleton in the EVM, a piece of code that can be called from any contract without the need to deploy it again.

This has the obvious benefit of saving substantial amounts of gas (and therefore not contaminating the blockchain with repetitive code), because the same code doesn’t have to be deployed over and over, and different contracts can just rely on the same already deployed library.

The fact that multiple contracts depend on the exact same piece of code can make for a more secure environment. Imagine not only having well audited code for common endeavors (like the tremendous job the guys at Zeppelin are doing), but relying on the same deployed library code that other contracts are already using. It would certainly have helped in this case, where all balances of an ERC20 token (nothing too fancy), that was intended to raise a maximum of $50M, were wiped out.

Disclaimer: Everything below was written for Solidity v0.4.8. Given the current rate at which it is evolving, it might be outdated soon.

Enough buzz words, what is a library

A library is a type of contract that doesn’t allow payable functions and cannot have a fallback function (these limitations are enforced at compile time, therefore making it impossible for a library to hold funds). A library is defined with the keyword library (library C {}) in the same way a contract is defined (contract A {}).

Calling a function of a library will use a special instruction (DELEGATECALL), that will cause the calling context to be passed to the library, as if it was code running in the contract itself. I really like this angle from the Solidity documentation, “Libraries can be seen as implicit base contracts of the contracts that use them”.
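
The snippet discussed next is not reproduced in this copy of the article. What follows is a minimal sketch of what it appears to describe, assuming Solidity v0.4.8 syntax; the function bodies are my assumption, and only the names A, C and a() come from the text.

```solidity
pragma solidity ^0.4.8;

library C {
    // Runs via DELEGATECALL, so `this` refers to the calling contract.
    function a() returns (address) {
        return address(this);
    }
}

contract A {
    // Expected to return the address of A, not the address where C is deployed.
    function a() constant returns (address) {
        return C.a();
    }
}
```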

In this snippet, when function a() of contract A is called, the address of the contract will be returned and not the library’s. This appears to be the same for all msg properties: msg.sender, msg.value, msg.sig, and msg.gas. (The Solidity documentation related to this indicates otherwise, but after doing some testing it looks like the msg context is maintained.)

How libraries are linked

Unlike explicit base contract inheritance (contract A is B {}), it is not that clear how a contract that depends on a library gets linked with it. In the above case, contract A uses library C in its function a(), but there is no mention of which address of the library to use, and C won’t get compiled inside A’s bytecode.

Library linking happens at the bytecode level. When contract A is compiled, it leaves a placeholder for the library address, like this: 0073__C_____________________________________630dbe671f (0dbe671f is the function selector for a()). If we were to deploy contract A untouched, the deployment would fail because the bytecode is invalid.

Library linking is as simple as replacing all occurrences of the library placeholder in the contract bytecode with the address of the deployed library in the blockchain. Once the contract is linked to the library, it can be deployed.

How libraries are upgraded

Original (Feb 2017): They are not, in the same way contracts aren’t either. As stated in the previous section, the reference to the library is made at the bytecode level rather than at the storage level. Changing the bytecode of a contract is not allowed once deployed, therefore the reference to the library will live as long as the contract does.

UPDATE (March 2017): Since the publication of this article, we have spent the last few weeks working on the library upgradeability problem together with our friends at Zeppelin, and have published an article about it:

With this new method, instead of linking a contract against a library address, it can be linked with the dispatcher, which will allow for updating the underlying library later on and upgrading the business logic of the contract.

‘Using’ structs and methods

Even though libraries do not have storage, they can modify their linked contract’s storage. When a storage reference is passed as an argument to a library call, any modifications the library makes will be saved in the contract’s own storage. It is helpful to think of it as passing a C pointer to a function, only that in this case the library may have been deployed by someone else and lives on the blockchain.

Also, one piece of syntax sugar that makes for easily understandable code is using using. Using this keyword, a function in the library can be called as a method of its first parameter, making it look like it is a proper method.
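
To make both ideas concrete, here is a minimal sketch, assuming Solidity v0.4.8 syntax. CounterLib and Counter are the names used in the text; the field i, the function incremented() and the contract CounterContract are hypothetical names chosen for illustration.

```solidity
pragma solidity ^0.4.8;

library CounterLib {
    struct Counter { uint i; }

    // `self` is a storage reference, so the write lands in the caller's storage.
    function incremented(Counter storage self) returns (uint) {
        return ++self.i;
    }
}

contract CounterContract {
    // Attach the functions of CounterLib to the Counter struct type.
    using CounterLib for CounterLib.Counter;

    CounterLib.Counter counter;

    function increment() returns (uint) {
        // Equivalent to CounterLib.incremented(counter).
        return counter.incremented();
    }
}
```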

The using keyword allows every function in CounterLib that takes a Counter as its first argument to be called as if it were a method of the struct.

This construct is pretty similar to how you can execute methods on Go structs, without them being fully-fledged objects.

Events and libraries

In the same way that libraries don’t have storage, they don’t have an event log. But they can dispatch events. Let me explain:

As stated above, a library can be thought of as an implicit base contract. In the same way that an event dispatched by an explicit base contract appears in the main contract’s event log, events dispatched by a library are saved in the event log of the contract that calls the event-emitting function in the library.

Only problem is, as of right now, the contract ABI does not reflect the events that the libraries it uses may emit. This confuses clients such as web3, that won’t be able to decode what event was called or figure out how to decode its arguments.

There is a quick hack for this: defining the event both in the contract and in the library will trick clients into thinking that it was actually the main contract that sent the event and not the library.

Here is a small example illustrating this. Even though the Emit event is emitted by the library, by listening on EventEmitterContract.Emit we will be able to get the events. In contrast, listening on EventEmitterLib.Emit will never get any events.
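
The example itself is not reproduced in this copy, so the following is a minimal sketch of the pattern, assuming Solidity v0.4.8 syntax (events are fired by calling them directly, without a dedicated keyword). EventEmitterLib, EventEmitterContract and Emit are the names used in the text; the function name emitString is hypothetical.

```solidity
pragma solidity ^0.4.8;

library EventEmitterLib {
    event Emit(string s);

    function emitString(string s) {
        // Logged in the event log of whichever contract delegate-calls this function.
        Emit(s);
    }
}

contract EventEmitterContract {
    // Redeclared only so that the event shows up in this contract's ABI.
    event Emit(string s);

    function emitString(string s) {
        EventEmitterLib.emitString(s);
    }
}
```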

Implementing ERC20Lib

As a real-world example of developing with libraries, I’m going to refactor Zeppelin’s ERC20 StandardToken to be built using libraries.

The first step will be to rewrite SafeMath as a library, since its current design, which is meant to be used as a base contract, won’t work: libraries aren’t allowed to inherit. This refactor will also make using SafeMath clearer (safeMul(2, 3) vs 2.times(3)).
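
A minimal sketch of what such a library could look like, assuming Solidity v0.4.8 syntax (overflow checks abort with throw). SafeMathLib and times come from the text; minus, plus and the Calculator contract are hypothetical names used only for illustration, and this is not the code from the actual refactor.

```solidity
pragma solidity ^0.4.8;

library SafeMathLib {
    function times(uint a, uint b) returns (uint) {
        uint c = a * b;
        // Abort if the multiplication overflowed.
        if (a != 0 && c / a != b) throw;
        return c;
    }

    function minus(uint a, uint b) returns (uint) {
        // Abort instead of underflowing.
        if (b > a) throw;
        return a - b;
    }

    function plus(uint a, uint b) returns (uint) {
        uint c = a + b;
        // Abort if the addition overflowed.
        if (c < a) throw;
        return c;
    }
}

contract Calculator {
    using SafeMathLib for uint;

    function twice(uint x) constant returns (uint) {
        // Reads as a method on uint; equivalent to SafeMathLib.times(x, 2).
        return x.times(2);
    }
}
```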

Even though libraries cannot directly inherit, they can be linked with other libraries and use them in the same way a contract would, but with the natural limitations of libraries.

Now for the real work: ERC20Lib is the library that contains all the business logic related to managing an ERC20 token. It defines the TokenStorage struct, which holds all the storage a token needs, and all its functions.
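
As a rough sketch of that shape, assuming Solidity v0.4.8 syntax: ERC20Lib, TokenStorage and SafeMathLib are the names used in the text, while the struct fields, function signatures and the reduced SafeMathLib below are my assumptions, not the code from the actual refactor.

```solidity
pragma solidity ^0.4.8;

// Reduced SafeMathLib, repeated here so the sketch compiles on its own.
library SafeMathLib {
    function minus(uint a, uint b) returns (uint) {
        if (b > a) throw;
        return a - b;
    }

    function plus(uint a, uint b) returns (uint) {
        uint c = a + b;
        if (c < a) throw;
        return c;
    }
}

library ERC20Lib {
    using SafeMathLib for uint;

    // All the state a token needs; it lives in the storage of the calling contract.
    struct TokenStorage {
        mapping (address => uint) balances;
        mapping (address => mapping (address => uint)) allowed;
        uint totalSupply;
    }

    event Transfer(address indexed from, address indexed to, uint value);
    event Approval(address indexed owner, address indexed spender, uint value);

    // Assigns the whole initial supply to the caller of the linked contract.
    function init(TokenStorage storage self, uint _initialSupply) {
        self.totalSupply = _initialSupply;
        self.balances[msg.sender] = _initialSupply;
    }

    function transfer(TokenStorage storage self, address _to, uint _value) returns (bool) {
        self.balances[msg.sender] = self.balances[msg.sender].minus(_value);
        self.balances[_to] = self.balances[_to].plus(_value);
        Transfer(msg.sender, _to, _value);
        return true;
    }

    function transferFrom(TokenStorage storage self, address _from, address _to, uint _value) returns (bool) {
        uint remaining = self.allowed[_from][msg.sender];
        self.balances[_from] = self.balances[_from].minus(_value);
        self.balances[_to] = self.balances[_to].plus(_value);
        // Throws if the spender tries to move more than it was approved for.
        self.allowed[_from][msg.sender] = remaining.minus(_value);
        Transfer(_from, _to, _value);
        return true;
    }

    function approve(TokenStorage storage self, address _spender, uint _value) returns (bool) {
        self.allowed[msg.sender][_spender] = _value;
        Approval(msg.sender, _spender, _value);
        return true;
    }

    function balanceOf(TokenStorage storage self, address _owner) constant returns (uint) {
        return self.balances[_owner];
    }

    function allowance(TokenStorage storage self, address _owner, address _spender) constant returns (uint) {
        return self.allowed[_owner][_spender];
    }
}
```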

Now that all the required logic is encapsulated in the library, implementing StandardToken is trivial and will only contain code specific to that token and accessor functions that directly call methods on the library using TokenStorage (and the event declaration as we explained above).
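
A minimal sketch of such a token, assuming Solidity v0.4.8 syntax (the constructor is the function named after the contract). The ERC20Lib included here is a reduced stand-in so that the example compiles on its own; the token name, the supply and every function body are hypothetical.

```solidity
pragma solidity ^0.4.8;

// Reduced stand-in for ERC20Lib so that the sketch compiles on its own.
library ERC20Lib {
    struct TokenStorage {
        mapping (address => uint) balances;
        uint totalSupply;
    }

    event Transfer(address indexed from, address indexed to, uint value);

    function init(TokenStorage storage self, uint _initialSupply) {
        self.totalSupply = _initialSupply;
        self.balances[msg.sender] = _initialSupply;
    }

    function transfer(TokenStorage storage self, address _to, uint _value) returns (bool) {
        // Abort on insufficient balance or on overflow of the receiver's balance.
        if (self.balances[msg.sender] < _value) throw;
        if (self.balances[_to] + _value < self.balances[_to]) throw;
        self.balances[msg.sender] -= _value;
        self.balances[_to] += _value;
        Transfer(msg.sender, _to, _value);
        return true;
    }

    function balanceOf(TokenStorage storage self, address _owner) constant returns (uint) {
        return self.balances[_owner];
    }
}

contract StandardToken {
    using ERC20Lib for ERC20Lib.TokenStorage;

    // The only state of the token; the library reads and writes it by reference.
    ERC20Lib.TokenStorage token;

    // Token-specific configuration (hypothetical values).
    string public name = "SimpleToken";
    uint public INITIAL_SUPPLY = 10000;

    // Redeclared so that the event shows up in this contract's ABI.
    event Transfer(address indexed from, address indexed to, uint value);

    function StandardToken() {
        token.init(INITIAL_SUPPLY);
    }

    function totalSupply() constant returns (uint) {
        return token.totalSupply;
    }

    function balanceOf(address who) constant returns (uint) {
        return token.balanceOf(who);
    }

    function transfer(address to, uint value) returns (bool) {
        return token.transfer(to, value);
    }
}
```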

The interesting part of this approach is that both ERC20Lib and SafeMathLib only need to be deployed once and all the contracts that link ERC20Lib will be using the same, secure, audited code.

The full refactor is in Aragon’s Zeppelin fork, and all the tests related to StandardToken are still passing even though its internal architecture has fundamentally changed.

Wrapping up

As the first lines of the article said, Solidity still has a long way to go in terms of programmer productivity and language expressiveness. In my opinion, libraries are a very good way to achieve code reusability.

For us at Aragon, developing with libraries is very important, as we plan to deploy the same code many times with slight modifications or none at all. Using this architecture will allow our clients to save on transaction fees and also have proof that the software their company is running is the same one that powers other successful organizations.

Aragon is a platform for building blockchain companies (DAOs) on top of Ethereum.

We will be in EDCON this week, so please come say hi. If this article seems like interesting work, join our Slack community or come work with us!