On Babylon2.0.1

Published in METASTATE, Sep 20, 2019

Babylon2.0 is the proposal undergoing the testing phase of the Tezos on-chain governance process. In parallel, Nomadic and Cryptium Labs have not only been working on upcoming potential features, but also continuously inspecting and testing the ones contained in the current proposal. About a week ago, Nomadic Labs’ additional testing led to the discovery of an efficiency loss in the Multiple Big Maps (multimaps) feature for Michelson and the lack of logs when calling the trace_code RPC.

This article contains the description of the found issues and their fixes; presents possible paths; and shares with the broader community some crucial lessons learnt by both core development teams.

Multimaps, Issue Description, and Fix

multimaps is one of the Michelson improvements included in Babylon2.0. It lifts the restriction that a contract can create, store, and transmit only one big_map. big_maps are an alternative to regular maps that provide more gas-efficient access to large datasets, so allowing contracts to contain more than one big_map is a desirable feature for current and future smart contract developers.

During Nomadic’s additional testing, it was found that the state machine returns a regular map instead of a big_map, even when a big_map was deployed. The efficiency loss becomes palpable when accessing storage: with a map, any access to storage loads all keys into memory, and deserialising the entire map results in high gas costs; with a big_map, only the keys of interest are loaded into memory whenever storage is accessed.
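The difference in access cost can be illustrated with a toy model. This is a minimal sketch, not the real Tezos gas accounting: it assumes one cost unit per deserialised binding, and the names make_storage, map_lookup_cost, and big_map_lookup_cost are hypothetical.

```ocaml
(* Toy cost model, NOT the real Tezos gas accounting: each binding that has
   to be deserialised costs one unit. All names are hypothetical. *)
module StringMap = Map.Make (String)

(* Build a storage holding [n] bindings: key0 .. key(n-1). *)
let make_storage (n : int) : int StringMap.t =
  List.fold_left
    (fun acc i -> StringMap.add (Printf.sprintf "key%d" i) i acc)
    StringMap.empty
    (List.init n (fun i -> i))

(* A regular map is deserialised as a whole, so reading one key pays for
   every binding in the map. *)
let map_lookup_cost (storage : int StringMap.t) (_key : string) : int =
  StringMap.cardinal storage

(* A big_map only loads the key of interest, so the cost does not depend on
   the size of the storage. *)
let big_map_lookup_cost (_storage : int StringMap.t) (_key : string) : int = 1

let () =
  let storage = make_storage 1000 in
  Printf.printf "map: %d unit(s), big_map: %d unit(s)\n"
    (map_lookup_cost storage "key42")
    (big_map_lookup_cost storage "key42")
```

Under these assumptions, the cost of a map lookup grows with the number of bindings, while the cost of a big_map lookup stays constant.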

This issue is caused by a line in script_ir_translator.ml, more precisely in the function has_big_map, which currently (in Babylon2.0) always returns false, even when a big_map has been supplied. It can be solved by changing that line in the protocol source file:

Tezos Protocol fix to the efficiency loss to Multiple Big Maps feature — Diff of script_ir_translator.ml showing the fix (highlighted in green)
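Since the diff itself is an image, the shape of the problem can only be sketched here. The following is a hypothetical reconstruction over a toy type, not the protocol's actual code: the type ty and the functions has_big_map_buggy and has_big_map_fixed are assumptions made for illustration.

```ocaml
(* Toy model of a few Michelson types. The real protocol uses a much richer
   typed representation; every name below is hypothetical. *)
type ty =
  | T_int
  | T_string
  | T_map of ty * ty
  | T_big_map of ty * ty
  | T_pair of ty * ty

(* Behaviour described in the article: the check answers [false] for every
   type, so a big_map is never detected. *)
let has_big_map_buggy (_ : ty) : bool = false

(* Illustrative corrected check: walk the type and report any big_map. *)
let rec has_big_map_fixed : ty -> bool = function
  | T_big_map _ -> true
  | T_pair (l, r) -> has_big_map_fixed l || has_big_map_fixed r
  | T_map (_, v) -> has_big_map_fixed v
  | T_int | T_string -> false

let () =
  let storage = T_pair (T_big_map (T_string, T_int), T_int) in
  Printf.printf "buggy: %b, fixed: %b\n"
    (has_big_map_buggy storage)
    (has_big_map_fixed storage)
```

With a check that always answers false, a caller has no way to tell that the storage holds a big_map, which is consistent with the fallback to a regular map described above.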

Trace Code RPC Call, Issue Description, and Fix

The trace_code RPC call is used to produce execution traces when running scripts from the client. It was found that it currently does not produce logs. This issue is fixed by adding ?log to script_interpreter.ml (source file):

Tezos Protocol Fix of Code Trace RPC Call to produce logs — Diff of script_interpreter.ml showing the fix (highlighted in green)
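In OCaml, ?log denotes an optional argument, and an optional argument that is not forwarded silently defaults to None. The sketch below shows how that can leave a trace empty. It is a toy interpreter written for illustration: step, run_without_log, run_with_log, and the trace type are assumptions, not the signatures used in script_interpreter.ml.

```ocaml
(* Hypothetical toy interpreter. [step] and the two [run_*] functions stand
   in for the protocol's interpreter entry points; they are not the real
   signatures. *)
type trace = (string * int) list ref

(* Execute one instruction and record a trace entry if a log was supplied. *)
let step ?log (acc : int) (instr : string) : int =
  let acc' =
    match instr with "INCR" -> acc + 1 | "DOUBLE" -> acc * 2 | _ -> acc
  in
  (match log with
   | Some (l : trace) -> l := (instr, acc') :: !l
   | None -> ());
  acc'

(* The optional argument is accepted but never forwarded to [step], so the
   trace stays empty. *)
let run_without_log ?log (code : string list) : int =
  ignore (log : trace option);
  List.fold_left (fun acc instr -> step acc instr) 0 code

(* Forwarding [?log] lets every step record its trace entry. *)
let run_with_log ?log (code : string list) : int =
  List.fold_left (fun acc instr -> step ?log acc instr) 0 code

let () =
  let code = [ "INCR"; "DOUBLE"; "INCR" ] in
  let silent : trace = ref [] and traced : trace = ref [] in
  let r1 = run_without_log ~log:silent code in
  let r2 = run_with_log ~log:traced code in
  Printf.printf "results: %d and %d, trace entries: %d vs %d\n" r1 r2
    (List.length !silent) (List.length !traced)
```

Both variants compute the same result; only the one that forwards ?log produces trace entries, which is why this kind of omission is easy to miss in functional tests.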

Possible Paths

Considering that the fixes need to be implemented in two of the protocol source files of Babylon2.0, we identify two possible paths that cause the least friction to the Tezos on-chain governance process. The execution of either path depends entirely on stakeholders’ votes, which will be cast in a few cycles, as soon as Babylon2.0 enters the promotion phase (participation requirement of ~74.7%).

A) Injecting Babylon2.0.1 in the Next Proposal Period

Under this option, Tezos stakeholders could choose to vote no on Babylon2.0 in the upcoming promotion phase. If the proposal does not succeed in the promotion phase, the current proto_005 will not be activated and a new proposal phase will start. In this scenario, we intend to inject Babylon2.0.1 as a new proposal, which will be subject to a new iteration of the governance process.

This option is the simplest and most natural path, as the purpose of the testing and promotion phases is to not only provide a cooldown period for the protocol, but also the possibility of halting the upgrade, in the event a critical issue is found. The downside is that the features of Babylon will inevitably be delayed by ~3 months, which might result in delays in adoption.

B) Babylon2.0.1 as Conditional Hotfixes

Simply put, this approach enables the Tezos shell to deploy Babylon2.0 with the fixes if, and only if, Babylon2.0 succeeds in the promotion phase, meaning that quorum (~74.7%) is reached and the majority have voted yay.
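The success condition can be written down as a small check. This is a sketch under stated assumptions rather than the protocol's voting code: the ballots record and promotion_succeeds are hypothetical names, participation is assumed to count yay, nay, and pass ballots, and the acceptance threshold is left as a parameter because the text above only says "the majority". The figures in the example, including the 8000 basis-point threshold, are made up for illustration.

```ocaml
(* Ballot totals, counted in rolls. Hypothetical record for illustration. *)
type ballots = { yay : int; nay : int; pass : int }

(* [quorum_bp] and [threshold_bp] are in basis points (1/100 of a percent)
   so that the check stays in integer arithmetic. Both are parameters: the
   article gives the quorum as roughly 74.7% and does not give an exact
   acceptance threshold. *)
let promotion_succeeds ~total_rolls ~quorum_bp ~threshold_bp (b : ballots) :
    bool =
  (* Assumption: every ballot cast, including pass, counts as participation. *)
  let cast = b.yay + b.nay + b.pass in
  let quorum_reached = cast * 10_000 >= quorum_bp * total_rolls in
  (* Assumption: pass ballots do not count for or against the proposal. *)
  let decisive = b.yay + b.nay in
  let yay_wins = decisive > 0 && b.yay * 10_000 >= threshold_bp * decisive in
  quorum_reached && yay_wins

let () =
  (* Made-up numbers: 1000 rolls in total, 770 of them voting. *)
  let outcome =
    promotion_succeeds ~total_rolls:1000 ~quorum_bp:7470 ~threshold_bp:8000
      { yay = 700; nay = 50; pass = 20 }
  in
  Printf.printf "promotion succeeds: %b\n" outcome
```

Only when this kind of check passes would the conditional hotfixes come into play; if it fails, nothing is activated.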

This path allows the fixes to be integrated without resubmitting the proposal to a new iteration of the Tezos on-chain governance process. While it does require coordination through a shell upgrade, it is fully conditional on the governance vote and thus adds little overhead.

On the other hand, the conditional hotfixes require a modification of the Tezos shell (source code) and that bakers and full nodes pull the newest release before Babylon2.0 is activated, should the latter succeed in the last voting phase. Additionally, even if the difference between Babylon2.0 and 2.0.1 is in a few lines of code, this will change the protocol hash from PsBABY5HQ to e.g. PsBabyM1, which might cause confusion in the community.
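Conceptually, the shell modification amounts to a lookup performed when a voted protocol is about to be activated. The sketch below is an assumption about the general shape of such a mechanism, not the actual shell code: overrides and protocol_to_activate are hypothetical names, and the hashes are the shortened forms quoted above (the second one being only an example).

```ocaml
(* Hypothetical shell-side table: hash of the voted protocol -> hash of the
   protocol to activate instead. The hashes are the shortened forms quoted
   in the article, not complete protocol hashes. *)
let overrides : (string * string) list = [ ("PsBABY5HQ", "PsBabyM1") ]

(* Meant to be called only once a protocol has been accepted by the vote.
   Returns the patched hash if an override exists, and the voted hash
   unchanged otherwise. *)
let protocol_to_activate (voted : string) : string =
  match List.assoc_opt voted overrides with
  | Some patched -> patched
  | None -> voted

let () =
  Printf.printf "PsBABY5HQ -> %s\n" (protocol_to_activate "PsBABY5HQ");
  Printf.printf "other -> %s\n" (protocol_to_activate "other")
```

Because the lookup is keyed on the voted hash, it has no effect unless Babylon2.0 itself is accepted, and nodes running an older shell without the table would activate the unpatched protocol, hence the need for bakers and full nodes to upgrade in time.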

Lessons Learnt and Final Remarks

First, the discovery of these efficiency and RPC issues demonstrates the importance of the testing phase, or cooldown period, in the Tezos on-chain governance process, during which core developers and the community have additional time for deeper code inspections. Moreover, the subsequent promotion phase is designed to be a final checkpoint that gives Tezos stakeholders the ability to halt the protocol upgrade if a critical issue is found during the governance process.

Nevertheless, considering that these are an efficiency issue and a minor RPC call issue, and that the Tezos protocol is still in its infancy, ensuring a steady evolution of the protocol is more desirable. This is our motivation for designing the conditional hotfixes and communicating this possibility (option B) as an alternative to waiting for 3 months (option A).

Second, Babylon is substantially larger than Athens in terms of features. While developing Babylon, we unexpectedly received more time, which motivated us to include more features than previously planned. However motivated core developers are to push as many meaningful changes as possible every ~3–4 months, this reminds us to prioritise sufficient testing time over additional features when crafting a protocol proposal.

To Sum Up

About a week ago, Nomadic Labs discovered two issues: one that limits the efficiency of the multimaps Michelson feature included in Babylon2.0, and another that prevents the trace_code RPC call from producing logs. The fixes require changes in two of the protocol source files. After careful consideration, we conclude that there are two possible paths for integrating them, both dependent on the results of the upcoming vote during the promotion phase for Babylon2.0:

A) The regular route: wait for the next proposal phase and inject Babylon2.0.1, which will be subject to a new iteration of the governance process.

B) Enable the Tezos shell to deploy Babylon2.0.1 (Babylon2.0 with the hotfixes) with the condition that Babylon2.0 is accepted in the promotion phase.

Considering that both issues are minor and that evolving the Tezos protocol at a steady pace is of higher priority, we introduce option B) as an alternative to A), as it would avoid delaying the features of Babylon by another 3–4 months.

Lastly, the key takeaways for the core development teams are:

The testing and promotion phases are important: they are designed to offer a way to halt the protocol in question, should a critical issue be found during the governance process.
Lengthier testing periods should take priority over adding new features before injecting a proposal, even when unexpected circumstances grant the core developers more time.
