UTXO Chains. Part 2

State and determinism

Evaldas
15 min readDec 17, 2022

(previous: Introduction)

This chapter is an abstract philosophical detour on the topic of state, the ubiquitous term in DLT, what does it mean and what properties it has. We are aiming to develop useful analogies, metaphors and intuition rather than precise definitions. The reason is that, when we are leaving the blockchain realm for the DAG/UTXO, the common intuition, based on one-blockchain assumption, often fails, especially when talking about what is shared state and what is global state.

By state or atomic mutable state we mean a state of the system. The concept is commonly used in DLT reasoning. With ‘mutable’ we want to say, that the state is changing over time.

We will use state of the physical system as analogy. The example may be a particle. (Note, however, that such real world analogies are just convenient metaphors, which help us to develop intuition in the reasoning about DLT, nothing more. The ledger is not a physical world)

The state of the system in physics is a function state: time->PhaseSpace, which gives a point state(T) in the phase space for each moment of time T. For example, the state of the physical particle is its speed, velocity and other values, which are changing over time.

By running the time T along the range of system’s lifetime, {state(T)} gives us the trajectory of the system, a curve in the phase space. This is a mathematical model of the evolution of the physical system over time. In the discreet time of DLT it is a sequence of states: state(1), state(2), … state(n), …

In the DLT context, the state in each moment of time is some atomic data set, usually a collection of key/value pairs. We call it atomic, because it changes (mutates) as a single indivisible unit each time tick.

In the DLT context, state and ledger state usually means the same thing, synonyms. We will also call it a data state.

Each participant in the system (think node) perceives the state(T) (or a particular version of it) the same way, i.e. as exactly the same data set. In other words, the (atomic) state cannot be updated somehow partially, not fully, it is always a whole, either its is state(T) or state(T+1), nothing in between, not even temporarily.

The state is also called objective and deterministic state. Objective it is because all participants (nodes) perceives it the same way. Deterministic because if somebody analyzes part of the data set of the state(T), the remaining part will be equal on every node.

Two or more different states, say state1(T) and state2(T), will be parallel, because their trajectories do not intersect in the phase space.

If a data is perceived differently by nodes, is is not objective and not deterministic, so it is not an atomic state strictly speaking, by our definition. For convenience we will sometimes call it a non-deterministic state. For example, the ledger state of the pure UTXO ledger is non-deterministic due to parallelism of state updates. We will discuss this aspect below.

The state of the physical particle gives us a good intuition about the data state in the distributed ledger. It explains concepts of atomicity, objectivity, determinism and parallelism.

Two physical particles are parallel states in the sense, that their trajectories are independently described by their current state and the external forces.

The state in the DLT is an atomic data set, which is mutating according to the flow of asynchronous incoming requests/calls from the user to change the state. The requests will be analogs of external forces to the particle. Each request/call makes the data state mutate.

The analogy of the trajectory leads us to another important observation: the objective data state can only be updated sequentially. If two external requests (‘forces’) compete to update state(T) within the same time window, they must be sequenced: somebody in the system must impose the order on them to be equally perceived by all nodes. Otherwise the state(T) will become non-deterministic, i.e. will be different in different nodes.

For example, if state(T) means balance asset 10 tokens, request R1 asks to withdraw 5 tokens, and R2 asks to withdraw 6 tokens, the final outcome will depend on the order in which the R1 and R2 will be processed. This order must be the same for all nodes, otherwise the state will be perceived differently, i.e. will not be objective.

Meanwhile different, therefore parallel, states can be updated independently. The two requests to different states won’t compete: state1(T) -> state1(T+1) and state2(T)->state2(T+1): there’s no need of sequencing them, because the order is irrelevant for the final result. This is the case if withdrawal assets from two independent accounts.

Ledger state. Accounts. Smart contracts

If we think a ledger as a data structure, we naturally want it to be objective and deterministic. That is what we call a ledger state. The ledger is a shared model of economic activity, so to be shared and objective is the very purpose of its existence.

We naturally talk about sub-states of the ledger state. The account is a sub-state inside the ledger: a collection of assets which is controlled by some entity (e.g. address). It changes over time when assets are coming/leaving the account. Another example of sub-state is a smart contract state: a sub-state controlled by a smart contract program. It changes over time upon requests/calls from users.

Sub-state of a deterministic and objective state is also deterministic and objective. In simple words, all nodes in the system see the balance of the account the same way. Same applies to smart contracts: when we execute a smart contract on a given state, the inputs to the program will be exactly the same on each node. Hence, determinism. This way we can talk about sequence (“trajectory”) of each sub-state just like any deterministic state.

The opposite is not true: the non-deterministic state may consist of many deterministic sub-states. That is how the pure UTXO ledger works.

The non-determinism of an account is normally ok for the user’s account, a wallet, for the user it is normal that account balance changes over time. The account on the pure UTXO ledger is non-deterministic, because it consists on many parallel assets and it is fine.

However non-determinism in the smart contract state is unacceptable. That would mean, two nodes may run the same program on different input data. This will result into different succeeding states and nodes won’t be able to come to consensus on the next state: all properties of the atomic state will be broken. This is the race condition. Race conditions appear when several parallel agents try to access same data set in parallel. Race conditions and non-determinism always goes together, that is what we have to prevent it in the smart contract system.

The above means that the account on the pure UTXO ledger cannot be the state of the smart contract.

Let’s imagine two smart contracts SC1 and SC2. Option (a): both SCs (their states) are parts of the same global state, i.e. they both perceive the global state the same way. Another option (b), the two are on different parallel, independent states.

If we request SC1 and SC2 with requests R1 and R2 in the option (a), the R1 and R2 must be ordered (sequenced) in order to keep the state deterministic. Meanwhile, in option (b) R1 and R2 can be run in any order, because the states are independent.

In the option (a), if SC1 will query the state of SC2, it will always receive the same result on every node. The two SCs on the same state are deterministic with respect to each other. For SC1 and SC2 it is the shared state or global state. This makes SC1 and SC2 synchronously (or atomically) composable by calling each other, because the two SCs “live” on the same blockchain.

Meanwhile, two SCs on two parallel states cannot deterministically query the state of each other. Hence, they cannot be composed by calling each other, they “live” in parallel worlds, for example two blockchains, they do not share the state.

Objectivity and consensus

Objectivity is achieved by agreement of nodes on the data, hence the consensus.

In order to have a equal perception of state(T), the nodes must have agreement on all the preceding ‘trajectory’ too: …, state(T-2), state(T-1)

Let’s look at the stream of incoming requests to mutate the state (‘forces’). Each node may receive requests in a different order: for one it is (A,B,C), for another it is (C,A,B), so it is non-deterministic. It cannot be an input for a transition of the state(T) because it will ruin its determinism. Only after many nodes come to agreement what the order is, say (C,B,A), the sequence of requests and the resulting state become same for all nodes i.e. deterministic. Process of ordering incoming requests is known as sequencing.

An important observation: in order to make a non-deterministic data an objective and deterministic state we need the (global) consensus on the order. This observation is valid for any non-deterministic data sets. In order to avoid race conditions, you must reach an agreement on who is the first.

(Until there’s a consensus on the state of the Schrödinger’s cat among group of people, is it dead or alive, it would be unwise to base your judgment on any assumption)

State of the blockchain

The parallel atomic states are common in models of physics. However, parallel states are not that common in DLTs. For a reason.

On a blockchain the ledger state is always updated by a sequence of blocks, a deterministic data structure. The chain of blocks determines the “trajectory” of the ledger state. Each block contains deterministically placed requests (transactions) on it, so each node in the system perceives the state represented by the specific block exactly the same way. It is impossible to update the part of the block or of the whole blockchain state in parallel. It is either one block which mutates the state in its entirety or another (two blocks with the same set of transactions but differently ordered are two different blocks).

So, the entire blockchain represents one single atomic state, it cannot be split into parallel independent states, it must be updated as an atomic unit, at once. This is also called global state, because perceived the same way among all nodes. This also implies that the ledger, represented by the blockchain and it sub-states (accounts, smart contracts) is strictly deterministic. It is a shared state among all smart contacts on the blockchain. Concept of ‘smart contract’ and ‘account’ become almost synonyms on the blockchain due to its determinism.

Note, that we didn’t even mention here what exactly kind of ledger is put into the state of the blockchain. It can be an account based ledger, like Ethereum, or it can be an UTXO ledger transactions in the block, like in Bitcoin, Cardano or Fuel. UTXOs being parallel does not make the blockchain account parallel, the whole UTXO ledger on the blockchain become one atomic global state anyway.

What’s good and bad in the global shared state?

The global shared state is awesome! The whole crypto industry is based on this concept. It makes many things simple. It is responsible for the synchronous (atomic) composability of smart contracts on Ethereum and other blockchains, so it enables, for example, Ethereum DeFi ecosystem, with ERC-xx token ledgers, DeX-es and other dApps. It also makes PoS and staking delegation simple, because every node then knows exactly how much skin-in-the-game this or that entity has. Global state makes distributed Oracles (e.g. Chainlink) a simple on-chain callable service for smart contracts. It also enables such things as collaterals, proof of reserves, liquidity pools and many others.

Global and objective state is a synchronous paradise shared by many players on the same blockchain. Nobody wants to be expelled.

However, there’s one problem: it is not scalable. The sequential updates of a global state of the blockchain is a blessing but also a curse of it. To search for a solution, one must leave the comfort zone to parallel ledger state architectures, the sharded ones or UTXO. As Albert Einstein once convinced us, there’s no such thing as an objective shared state in the physical world. So, if we want to build a model of real world phenomena in the form of the shared distributed ledger, the comforting assumption of the global objective state is not realistic.

Parallelism of the UTXO ledger

Things quickly become much less comfortable when we move from global state assumption to the world of many parallel states, for scalability or other reasons. For example, states of two blockchains, say Ethereum and Bitcoin are parallel, i.e. they are non-deterministic and non-objective with respect to each other. The Ethereum smart contract cannot directly affect state on Bitcoin, nor even query it (this is also known as the Oracle problem). When we want to transfer a value between the two, it is difficult to do that without a bridge, a system in between, which requires additional trust in it.

The UTXO ledger model, however, offers nice parallel properties combined with it’s determinism which may be exploited. Let’s dig deeper.

The ledger state of the UTXO ledger consists of outputs or UTXOs (Unspent TransaXtion Output), each with unique global identity on the ledger. The UTXO transaction Tx consumes specific outputs IN1, … INn and produces new outputs OUT1, … OUTm.

Each output represents an asset on the ledger. The output usually contains balances of fungible tokens on it (a ‘liquidity’) or any other properties which are subject to the ledger constraints (aka ledger invariant). The valid state transition is only possible when transaction satisfies the ledger constraints, such as total number of tokens or immutability of the NFT. More details on UTXO transaction model can be found here, here, here and here.

The UTXO transaction is an atomic unit: it either is added to the ledger in its entirety, or it is rejected. When Tx is added to the ledger, consumed outputs IN1, … INn are deleted from the ledger state, the produced outputs OUT1, … OUTm are added to the ledger state: a ledger state transition occurs.

The UTXO transaction is a deterministic unit: each transaction is specific about identities of outputs it is consuming and it is producing. It means, the UTXO transaction will be interpreted exactly the same way by any node in the system: all side effects of adding the transaction to the ledger will be exactly the same on each node, independently on any random factors. In this aspect, the UTXO transaction is different from the non-deterministic request, which may be interpreted in the context of the unknown state.

To avoid confusion: ‘requests’ are called ‘transactions’ in Ethereum and in most other smart contract platforms, however, strictly speaking, they become real transactions (state transitions) only after sequencing them on the blockchain.

The requests in the Fuel blockchain are also called UTXO transactions, because they are specific about dependencies on smart contracts they are about to consume. However, they are not specific about particular state of the SCs they are consuming, therefore they are non-deterministic in general. For this reason requests (transactions) with overlapping dependencies can’t be executed in parallel and must be sequenced. In out opinion, the Fuel transactions should be called quasi-UTXO or similar, to avoid confusion. Meanwhile, the Bitcoin and Cardano uses strictly deterministic UTXO transactions, the real ones.

As it was mentioned before, here we assume UTXO ledger to be a pure UTXO ledger state, without any enwrapping data structures around, such as blocks. Also, let’s forget for now it being distributed. The useful metaphor would be a database for UTXOs (in EasyUTXO it is called utxodb). Users can write into that ‘database’ in parallel as long as mutated data does not overlap.

The UTXO transactions in the ledger make a directed acyclic graph (DAG) of transactions, the UTXO DAG. The UTXO ledger state consist only of unspent UTXOs, it does not keep history.

Each UTXO on the ledger state can be consumed by an independent transaction, deterministically. The transaction contains all the information needed to produce a valid state transition. It means UTXO ledger state can be updated without knowing the whole global state of the ledger, just a local fragment of it, which is fully defined in the inputs IN1, … INn of the transaction.

Moreover, the transaction can be validated completely independently from the rest of the state and from other transactions. This is property of UTXO is used for example by Fuel to execute/validate independent transactions in parallel, on separate processors/cores.

Being able to execute transaction in parallel is a powerful optimization, however it is not the most interesting property of UTXO model for our purposes, because it does not make ledger scalable by itself.

The most exciting fact is that the UTXO transaction consumes just a deterministic fragment of the UTXO ledger, a local state. It means, many parties can write to the UTXO ledger in parallel without interfering with each other: as long as they do not collide by consuming same input(s). The party which updates its fragment of the ledger state is completely independent from another party, which updates another fragment of the ledger state.

Another important observation about UTXO ledger and transaction model is that each UTXO (output) represents a parallel and independent state: because each UTXO can be consumed by transaction in parallel.

There are beautiful analogies with particle physics. Let’s imagine each UTXO is a particle (a parallel state). Many particles (UTXO inputs) “collide” in the transaction to produce new set of particles (also UTXOs).

The analogy becomes even more inspiring when we remember that particle collision always preserve physical invariants, such as energy, momentum or spin, that are all laws of nature.

Meanwhile, the transaction preserves ledger invariants, expressed as ledger constraints: both global (‘the laws’) and the ones brought by inputs (constraints). These are all laws of the ledger. See also the Constraint-based UTXO ledger.

Note, that UTXO transaction (the ‘collision’) destroys parallel states of consumed inputs and produces a set of new states in produced outputs. Previously existing states of inputs cease to exists. In the particle physics it probably is a bit more complicated, but in the UTXO ledger is as simple as that: each UTXO transaction is producing a set of parallel states by destroying (consuming) a set of input states, while preserving ledger invariants. Each “collision” (UTXO transaction), like an act of God (the user), makes independent states to meet (“collide”) in the specific point of the time and space to produce new state of the world, consisting of many new UTXOs/parallel states (‘particles’).

Non-determinism of the UTXO ledger

The fact, that the pure UTXO ledger consists of many parallel states, each for one UTXO, makes the UTXO ledger state as a whole non-deterministic and non-objective (not an atomic state).

That brings about problems. The above implies that any account on UTXO is non-deterministic. It also means that the state of the smart contract cannot be modelled as a collection of UTXOs. That looks weird for any blockchainer, how could we live without having a proper ledger state?

Yes, that is the cost we pay for introducing parallelism into the ledger, as intended. In order to preserve parallelism, strategically we have to embrace the following:

  • we have to put the whole smart contract state into the single output (UTXO). The deterministic state cannot span several outputs
  • if we want to keep the identity of the state contained in the output from one transaction to another, we must introduce a concept of chain of UTXOs, a kind of ‘chain identity preservation law’ on our ledger
  • the act of production of the deterministic transaction by consuming parallel outputs naturally becomes an act of the sequencing: making deterministic from non-deterministic.
    Whenever a wallet wants to produce a transaction, it takes UTXOs from the account and sorts them. It is trivial because wallet is centralized (controlled by one user).
    However, if account is controlled by a distributed entity, a committee, it means the entities in the committee first have to come to consensus on the order of UTXOs in the account, only then produce the transaction. The latter would be the act of the distributed sequencing.

Those sequencing problems due to non-determinism of the account are unknown to blockchains, because blockchain enforces global order anyway. Each blockchain is a sequencer, it orders even UTXO transactions, which otherwise looks parallel.

However, the pure UTXO ledger does not have a sequencer in it, by the very assumption.

(a funny physical analogy of sequencing a pure UTXO state on the blockchain. For the innocent UTXO transaction being placed into the blockchain is kind of the hand of the global God, about existence of which the UTXO outputs are unaware. That makes two outputs dependent (ordered deterministically in every node) without them knowing that. Someone may see it similar to the quantum entanglement between the two particles with the same origin: it looks like miracle because the real mechanism is hidden from the eyes beneath the surface of the perceivable reality)

--

--