CosmWasm 1.4.1 fixes regression

Simon Warta
Published in CosmWasm, Oct 11, 2023

On September 28th, multiple community members made us aware that the Chihuahua chain had halted after upgrading to wasmvm 1.4.0. This was resolved by rolling back to 1.3. A few days later, we recognized similar issues in wasmd upgrade tests. The common error:

Wasmer runtime error: RuntimeError: out of bounds memory access

This error reverts the transactions with an error code and does not harm the node process. However, it is something that a contract is unlikely to trigger by accident.

By comparing the cases in which the error occurred with those in which it did not, we identified the code path inside cosmwasm-vm that was involved. It turned out that, after upgrading Wasmer from version 2.3 to 4.1, we were misusing the new Wasmer API in some cases. Once the spotlight was on the right code path, we could reproduce the bug in unit tests and fix it.

1.4.1 released

cosmwasm-vm and wasmvm 1.4.1 were released on October 9th. The update is included in the latest wasmd release, 0.43.0.

1.4.0 retracted

Version 1.4.0 of cosmwasm-vm (embedded in wasmvm 1.4.0) contains the bug and will cause failing transactions in almost any setup. It has been yanked/retracted. The same goes for wasmd 0.42.0. Systems using any of these versions should upgrade to 1.4.1 (included in wasmd 0.43.0) quickly.

Improved test coverage

The bug revealed a blind spot in our test coverage. In all caching scenarios, we tested getting an instance and checked caching statistics. However, we did not execute the resulting instances. This has now changed, so the codebase has become more resilient as a result of this issue. We are also evaluating options for automated node upgrade testing to get more high-level coverage.
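
To illustrate the kind of gap described above, here is a minimal sketch of a test that checks caching statistics and also executes the instances it gets back. It does not use the cosmwasm-vm API: `ToyCache`, `Instance`, `Stats`, and `exercise_cache_paths` are hypothetical names made up for this example, and the "contract" simply doubles its input.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a compiled contract instance.
#[derive(Clone)]
struct Instance {
    multiplier: u64,
}

impl Instance {
    /// Pretend execution: multiplies the input, failing on overflow.
    fn execute(&self, input: u64) -> Result<u64, String> {
        input
            .checked_mul(self.multiplier)
            .ok_or_else(|| "overflow".to_string())
    }
}

/// Hypothetical cache statistics.
#[derive(Default, Debug, PartialEq)]
struct Stats {
    hits: u32,
    misses: u32,
}

/// Toy cache keyed by a checksum string (an assumption for this sketch).
#[derive(Default)]
struct ToyCache {
    entries: HashMap<String, Instance>,
    stats: Stats,
}

impl ToyCache {
    fn get_instance(&mut self, checksum: &str) -> Instance {
        if let Some(instance) = self.entries.get(checksum) {
            self.stats.hits += 1;
            return instance.clone();
        }
        // Cache miss: "compile" a fresh instance and store it.
        self.stats.misses += 1;
        let instance = Instance { multiplier: 2 };
        self.entries.insert(checksum.to_string(), instance.clone());
        instance
    }
}

/// Goes through both the miss path and the hit path, and executes
/// the instance returned by each instead of only counting cache events.
fn exercise_cache_paths() -> (Stats, Vec<Result<u64, String>>) {
    let mut cache = ToyCache::default();
    let from_miss = cache.get_instance("abc");
    let from_hit = cache.get_instance("abc");
    let results = vec![from_miss.execute(21), from_hit.execute(21)];
    (cache.stats, results)
}

fn main() {
    let (stats, results) = exercise_cache_paths();
    // The statistics check alone would pass even if an instance were broken.
    assert_eq!(stats, Stats { hits: 1, misses: 1 });
    // Executing each instance covers what the statistics cannot.
    assert_eq!(results, vec![Ok(42), Ok(42)]);
}
```

The point of the sketch is the second assertion: a test that stops after the statistics check cannot notice an instance that fails only when it is run.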

Big shout out to the team

While part of the team was at Cosmoverse, Alex, Mauro, and Chris from Confio debugged the issue and gained valuable insights into its behavior, which helped narrow down the problem. The bug could only be identified so quickly thanks to this work. Once the code path in question was found, it was relatively easy to reproduce and fix the bug.
