What I think about what KEVM people think about EVM

Yoichi Hirai
Aug 2, 2017

I stumbled upon an interesting document by the KEVM project. The KEVM project is an attempt at defining the Ethereum Virtual Machine, and its goal overlaps with that of the eth-isabelle project. The nature of these projects involves typing in some of the contents of the yellow paper. I’ve done that to some extent. It’s nice to find somebody down the same road.

In section 9.4.2, exceptions are described as if they are all catchable before an opcode is executed

In one of the coredev calls, I heard about the history. The idea was to separate gas calculation from actual execution. Moreover, the gas calculation should be computationally lightweight. Otherwise, there could be a spamming attack where the attacker sends transactions for which it is hard to judge whether they run out of gas. In hindsight, I agree that exceptions could be detected during instruction execution. It’s interesting that the costs of these design choices are made visible with the K framework.

if the memory is overflown, then the existing semantics doesn’t do anything.

That’s true. The memory addresses should just overflow to a small number.

Some operators which access data of other accounts don’t specify explicitly what to do if the other account doesn’t exist. EXTCODESIZE and EXTCODECOPY examples

I agree. These are specification errors. I filed an issue.

What about contracts that have “junk bytes” in them?

Regarding jump destination detection (the definition of $D_J$), junk bytes are treated in the same way as one-byte instructions. Regarding execution, junk bytes cause an exceptional halt when the control flow reaches them: the definition of the exception detector $Z$ includes the condition $\delta_w = \varnothing$.
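
To make the jump destination point concrete, here is a minimal Python sketch of a $D_J$-style scan. It is my own illustration, not code from KEVM or eth-isabelle, and the function name `valid_jump_destinations` is hypothetical. It assumes only the opcode values `JUMPDEST = 0x5b` and `PUSH1..PUSH32 = 0x60..0x7f`. Any other byte, defined or junk, advances the scan by exactly one position.

```python
from typing import Set

JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def valid_jump_destinations(code: bytes) -> Set[int]:
    """Return the positions of JUMPDEST bytes that are not PUSH immediates.

    Hypothetical helper for illustration: junk (undefined) bytes are
    stepped over like any other one-byte instruction.
    """
    dests = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.add(pc)
        if PUSH1 <= op <= PUSH32:
            # Skip the 1..32 bytes of immediate data following the PUSH.
            pc += op - PUSH1 + 1
        pc += 1
    return dests


# PUSH1 0x5b, JUMPDEST, junk byte 0x0c, JUMPDEST
example = bytes([0x60, 0x5B, 0x5B, 0x0C, 0x5B])
print(sorted(valid_jump_destinations(example)))
```

In this example the `0x5b` at position 1 is PUSH data and is skipped, while the junk byte at position 3 does not hide the JUMPDEST at position 4.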

Precompiled contracts: Why are there 4 precompiled contracts?

I don’t know much. I heard about plans to replace the precompiled contracts with usual contracts once we have a more performant virtual machine.

The byte-aligned local memory makes reasoning about EVM programs much more difficult

I totally, wholeheartedly agree. The symbolic states become painfully longer because of this choice.
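
A small Python sketch may show why. The helpers `mstore` and `mload` below are hypothetical simplifications (no gas accounting, memory grown in 32-byte steps) and are not taken from either project. They store and load 32-byte big-endian words at arbitrary byte offsets, so one load can observe parts of two different stores.

```python
def _expand(memory: bytearray, end: int) -> None:
    """Grow memory with zero bytes so that it covers [0, end), in 32-byte steps."""
    if len(memory) < end:
        new_len = (end + 31) // 32 * 32
        memory.extend(bytes(new_len - len(memory)))


def mstore(memory: bytearray, offset: int, value: int) -> None:
    """Write a 32-byte big-endian word starting at any byte offset."""
    _expand(memory, offset + 32)
    memory[offset:offset + 32] = value.to_bytes(32, "big")


def mload(memory: bytearray, offset: int) -> int:
    """Read a 32-byte big-endian word starting at any byte offset."""
    _expand(memory, offset + 32)
    return int.from_bytes(memory[offset:offset + 32], "big")


mem = bytearray()
mstore(mem, 0, 2**256 - 1)  # bytes 0..31 become 0xff
mstore(mem, 1, 0)           # bytes 1..32 become 0x00, overlapping the first word
print(hex(mload(mem, 0)))   # first byte from the first store, the rest from the second
```

With concrete values this is harmless. With symbolic values, the result of the load has to be expressed as a concatenation of slices of two earlier stores, and that is where the states grow.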

*CODECOPY opcodes allow regarding program pieces as data, meaning that a translation back must always be maintained.

That’s true. I didn’t think much about it when I was implementing instruction-to-byte functions, but it’s very true that this prevents us from abstracting away the code representation. I sense an experienced semantics engineer behind this remark.
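
As a rough illustration, assuming a tiny hypothetical instruction representation (pairs of mnemonic and immediate bytes, with only a handful of opcodes listed), a CODECOPY-like read can only be answered by going back through the instruction-to-byte function. The names `encode` and `codecopy` are mine, not those of eth-isabelle or KEVM.

```python
from typing import List, Tuple

Instr = Tuple[str, bytes]  # (mnemonic, immediate data); hypothetical representation

# A small subset of opcodes, enough for the sketch.
OPCODES = {"STOP": 0x00, "ADD": 0x01, "CODECOPY": 0x39, "JUMPDEST": 0x5B}


def encode(program: List[Instr]) -> bytes:
    """Translate structured instructions back into raw code bytes."""
    out = bytearray()
    for name, imm in program:
        if name.startswith("PUSH"):
            n = int(name[4:])
            assert 1 <= n <= 32 and len(imm) == n
            out.append(0x60 + n - 1)  # PUSH1 is 0x60, PUSH32 is 0x7f
            out += imm
        else:
            out.append(OPCODES[name])
    return bytes(out)


def codecopy(program: List[Instr], code_offset: int, length: int) -> bytes:
    """Return the bytes a CODECOPY-like read would see; reads past the end give zeros."""
    chunk = encode(program)[code_offset:code_offset + length]
    return chunk + bytes(length - len(chunk))


program = [("PUSH2", b"\x01\x02"), ("ADD", b""), ("STOP", b"")]
print(codecopy(program, 1, 6).hex())
```

The read starts in the middle of the PUSH2 instruction, so the structured view of the program is not enough on its own; the byte-level encoding has to stay available.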

The rest of the document (on nondeterminism and language independence) gets very interesting, but I haven’t gathered my own thoughts on those topics yet.
