On Babylon2.0.1
Babylon2.0 is the proposal currently in the testing phase of the Tezos on-chain governance process. In parallel, Nomadic Labs and Cryptium Labs have not only been working on potential upcoming features, but also continuously inspecting and testing the ones contained in the current proposal. About a week ago, Nomadic Labs’ additional testing led to the discovery of an efficiency loss in the Multiple Big Maps (multimaps) feature for Michelson and a lack of logs when calling the trace_code RPC.
This article describes the issues found and their fixes, presents the possible paths forward, and shares with the broader community some crucial lessons learnt by both core development teams.
Multimaps, Issue Description, and Fix
multimaps is one of the Michelson improvements included in Babylon2.0. It removes the restriction that limits contracts to creating, storing, and transmitting only one big_map. big_maps are an alternative to regular maps that provide more gas-efficient access to large datasets. Allowing contracts to contain more than one big_map is a desirable feature for current and future smart contract developers.
Nomadic’s additional testing found that the state machine returns a regular map instead of a big_map, even when the latter was deployed. The efficiency loss becomes palpable when accessing storage: with maps, any access to storage triggers all keys to be loaded into memory, resulting in high gas costs as the entire map is deserialised; with big_maps, only the keys of interest are loaded into memory whenever storage is accessed.
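The difference in access cost can be illustrated with a toy model. The sketch below is not Tezos code: the class names EagerMap and LazyBigMap and the entries_loaded counter are hypothetical, and "entries loaded" merely stands in for the deserialisation work that gas is charged for.

```python
# Toy cost model (not Tezos code): count how many bindings a single
# lookup has to load into memory.

class EagerMap:
    """Models a regular map: the whole structure is deserialised on access."""

    def __init__(self, entries):
        self._serialised = dict(entries)
        self.entries_loaded = 0

    def get(self, key):
        # Every binding is loaded before the lookup can happen.
        in_memory = dict(self._serialised)
        self.entries_loaded += len(in_memory)
        return in_memory.get(key)


class LazyBigMap:
    """Models a big_map: bindings are fetched individually by key."""

    def __init__(self, entries):
        self._serialised = dict(entries)
        self.entries_loaded = 0

    def get(self, key):
        # Only the binding of interest is loaded.
        if key in self._serialised:
            self.entries_loaded += 1
            return self._serialised[key]
        return None


ledger = {f"account_{i}": i for i in range(10_000)}
regular, big = EagerMap(ledger), LazyBigMap(ledger)
assert regular.get("account_42") == big.get("account_42") == 42
print(regular.entries_loaded, big.entries_loaded)  # 10000 1
```

Both lookups return the same value, but the regular map touches every binding while the big_map touches one. This gap is what is lost when a big_map is handled as a regular map.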
This issue is caused by a line in script_ir_translator.ml, more precisely in the function has_big_map, which currently (in Babylon2.0) always returns false even when a big_map has been passed in. The issue can be solved by changing that line in the protocol source file.
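As a toy illustration only (this is neither the protocol's has_big_map nor the actual patch), the sketch below shows how a predicate that always answers false can cause a big_map to be handled like a regular map. The tuple-based type representation, the two predicate variants, and storage_strategy are all hypothetical names introduced for this example.

```python
# Hypothetical type representation: a tuple whose first element is the
# constructor name, followed by its type arguments.

def has_big_map_buggy(ty):
    # Mirrors the behaviour described above: the answer is always False.
    return False


def has_big_map_fixed(ty):
    # Walk the type and report whether a big_map occurs anywhere in it.
    name, *args = ty
    if name == "big_map":
        return True
    return any(has_big_map_fixed(arg) for arg in args)


def storage_strategy(ty, has_big_map):
    # Assumption: the caller picks lazy, per-key storage only when the
    # predicate reports a big_map.
    return "lazy" if has_big_map(ty) else "eager"


storage_ty = ("pair", ("big_map", ("string",), ("nat",)), ("nat",))
print(storage_strategy(storage_ty, has_big_map_buggy))  # eager
print(storage_strategy(storage_ty, has_big_map_fixed))  # lazy
```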
Trace Code RPC Call, Issue Description, and Fix
The trace_code RPC call is used to produce execution traces when running scripts from the client. It was found that it currently does not produce logs. This issue is fixed by adding ?log to script_interpreter.ml (source file).
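In OCaml, ?log denotes an optional argument. The general pattern, where a trace stays empty because an optional logging argument is never passed on, can be sketched as follows. This is an assumption about the shape of the problem, not the protocol code: the toy interpreter, its two instructions, and both trace_code_* helpers are hypothetical, and a Python keyword argument stands in for the OCaml optional argument.

```python
def interp(code, stack, log=None):
    """Toy stack interpreter; records a trace only when `log` is supplied."""
    for instr in code:
        if instr[0] == "PUSH":
            stack = [instr[1]] + stack
        elif instr[0] == "ADD":
            a, b, *rest = stack
            stack = [a + b] + rest
        else:
            raise ValueError(f"unknown instruction: {instr[0]}")
        if log is not None:
            # Record the instruction and a copy of the resulting stack.
            log.append((instr[0], list(stack)))
    return stack


def trace_code_without_log(code):
    trace = []
    result = interp(code, [])  # log is not forwarded, so trace stays empty
    return result, trace


def trace_code_with_log(code):
    trace = []
    result = interp(code, [], log=trace)  # log is forwarded
    return result, trace


program = [("PUSH", 2), ("PUSH", 3), ("ADD",)]
print(trace_code_without_log(program))  # ([5], [])
print(trace_code_with_log(program))
```

Both variants compute the same result; only the second returns a non-empty execution trace.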
Possible Paths
Considering that the fixes need to be implemented in two of the protocol source files of Babylon2.0, we identify two possible paths that cause the least amount of friction to the Tezos on-chain governance process. The execution of either path is entirely dependent on stakeholders’ votes, which will be cast in a few cycles, as soon as Babylon2.0 enters the promotion phase (participation requirement of ~74.7%).
A) Injecting Babylon2.0.1 in the Next Proposal Period
In other words, Tezos stakeholders could choose to vote no on Babylon2.0 in the upcoming promotion phase. If the proposal does not succeed in the promotion phase, the current proto_005 will not be activated and a new proposal phase will start. In this scenario, we intend to inject Babylon2.0.1 as a new proposal, which will be subject to a new iteration of the governance process.
This option is the simplest and most natural path, as the purpose of the testing and promotion phases is to not only provide a cooldown period for the protocol, but also the possibility of halting the upgrade, in the event a critical issue is found. The downside is that the features of Babylon will inevitably be delayed by ~3 months, which might result in delays in adoption.
B) Babylon2.0.1 as Conditional Hotfixes
Simply put, this approach enables the Tezos shell to deploy Babylon2.0 with the fixes if, and only if, Babylon2.0 succeeds in the promotion phase, meaning that quorum (~74.7%) is reached and the majority have voted yay.
This path allows the integration of the fixes without the need to resubmit the proposal to a new iteration of the Tezos on-chain governance process. While it does require coordination through a shell upgrade, it is fully conditional on the governance vote and thus causes little added overhead.
On the other hand, the conditional hotfixes require a modification of the Tezos shell (source code) and that bakers and full nodes pull the newest release before Babylon2.0 is activated, should the latter succeed in the last voting phase. Additionally, even if the difference between Babylon2.0 and 2.0.1 is in a few lines of code, this will change the protocol hash from PsBABY5HQ to e.g. PsBabyM1, which might cause confusion in the community.
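The conditional nature of option B can be summarised in a short sketch. The ~74.7% quorum comes from this article; the 80% supermajority threshold, the counting of pass votes towards participation, and the function names are assumptions made for illustration (the article itself only speaks of a majority voting yay).

```python
def promotion_passes(yay, nay, passes, total_rolls,
                     quorum=0.747, supermajority=0.8):
    """Return True if the promotion vote succeeds under the assumed rules."""
    if total_rolls == 0 or yay + nay == 0:
        return False
    # Assumption: pass votes count towards participation but not approval.
    participation = (yay + nay + passes) / total_rolls
    approval = yay / (yay + nay)
    return participation >= quorum and approval >= supermajority


def outcome_option_b(passed):
    # Option B: the hotfixed protocol is activated only if Babylon2.0
    # itself is accepted; otherwise a new proposal period starts.
    return "activate Babylon2.0.1" if passed else "new proposal period"


accepted = promotion_passes(yay=700, nay=50, passes=50, total_rolls=1000)
rejected = promotion_passes(yay=500, nay=100, passes=100, total_rolls=1000)
print(outcome_option_b(accepted))  # activate Babylon2.0.1
print(outcome_option_b(rejected))  # new proposal period
```

In the second case the vote fails on participation (70% of rolls), so the outcome is the same as under option A: a new proposal period in which Babylon2.0.1 can be injected.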
Lessons Learnt and Final Remarks
First, the discovery of these efficiency and RPC issues demonstrates the importance of the testing phase, or cooldown period, in the Tezos on-chain governance process, during which core developers and the community have supplementary time for deeper code inspections. Moreover, the subsequent promotion phase is designed to be a final checkpoint, giving Tezos stakeholders the ability to halt the protocol upgrade in the event of a critical issue being found during the governance process.
Nevertheless, considering that these are an efficiency issue and a minor RPC call issue, and that the Tezos protocol is still in its infancy, ensuring a steady evolution of the protocol is more desirable. This is our motivation for designing the conditional hotfixes and communicating this possibility (option B) as an alternative to waiting for 3 months (option A).
Second, Babylon is feature-wise many orders of magnitude larger than Athens. While developing Babylon, we unexpectedly received more time, which motivated us to include more features than previously planned. Regardless of the motivation of core developers to push as many meaningful changes every ~3–4 months as possible, this reminds us to prioritise enough testing time over further features when crafting a protocol proposal.
To Sum Up
About a week ago, Nomadic Labs discovered two issues: one that limits the efficiency of the multimaps Michelson feature included in Babylon2.0, and another that prevents the trace_code RPC call from producing logs. The fixes require changes in two of the protocol source files. After careful consideration, we conclude that there are two possible paths for integrating them, both dependent on the results of the upcoming vote during the promotion phase for Babylon2.0:
A) The regular route: wait for the next proposal phase and inject Babylon2.0.1, which will be subject to a new iteration of the governance process.
B) Enable the Tezos shell to deploy Babylon2.0.1 (Babylon2.0 with the hotfixes) with the condition that Babylon2.0 is accepted in the promotion phase.
Considering that both issues are minor and that evolving the Tezos protocol at a steady pace is of higher priority, we introduce option B) as an alternative to A), as it would prevent delaying the features of Babylon by another 3–4 months.
Lastly, the key takeaways for the core development teams are:
- A reminder of the importance of the testing and promotion phases, which are designed to provide a way to halt the protocol in question, should a critical issue be found during the governance process.
- To prioritise lengthier testing periods over adding new features before injecting a proposal, regardless of unexpected circumstances that grant the core developers more time.