How to make IOTA actually work
What is IOTA
IOTA is a cryptocurrency protocol that promises some beautiful properties:
- There’s no mining. The ledger is maintained by transactions themselves. No mining means no transaction fees;
- Throughput primarily bounded by the network;
- Simplicity — the protocol can be explained on a whiteboard in few minutes, and its basic properties (such as the fact that the doublespending is hard if the majority of actors follow the protocol) are somewhat obvious from the protocol explanation itself.
For the rest of this article understanding IOTA protocol is necessary, so let’s take advantage of (3) and quickly get familiarised with it.
In IOTA the ledger is a DAG, not a linked list like in blockchains. The DAG consists of all the transactions ever attempted, including the conflicting ones, as vertexes, and each transaction can directly approve between 1 and 2 other transactions. A transaction that directly approves some other transaction indirectly also approves all transactions that are reachable from them. There’s a genesis transaction that creates all the tokens and approves nothing, and all the transactions indirectly approve the genesis transaction. No transaction can indirectly approve two conflicting transactions.
When one wants to send a transaction, they need to perform two random walks starting at the genesis block, and on each step go to one of the transactions that directly approves the one they are currently at. Let’s say the algorithm is presently at transaction x
, and x
is approved by transactions a
and b
. Let va
be the total number of transactions that approve a
(directly or indirectly), and vb
be the same number for b
. The probability of going to a
will be proportional to the exponent of alpha * va
, and the probability of going to b
will be proportional to the exponent of alpha * vb
. In the reference implementation alpha
is equal to 0.001. The random walk ends at so called Tips, transactions that do not have any other transactions approving them yet. The new transaction shall then approve the two Tips that were found during such random walks.
There are some optimisations and tweaks (such as during the random walks there’s a probability to backtrack, and the random walks do not start at the genesis for the efficiency purposes) which are not important for the sake of this article. The general idea is that some Tips are more likely than others to be found, and if one approves such Tips, their transaction becomes a Tip that is also likely to be chosen in the future by others. If one approves unlikely Tips, they are likely not to be chosen in the future, and over time the probability of being chosen decreases further.
From here it is somewhat intuitive that a transaction x
that was posted by following the protocol properly is likely to be quickly approved by other transactions, and once that happens, most of the random walks will go through x
. From perspective of a person who received money in x
once many transactions approve x
, the double spending is unlikely, since any transaction that conflicts with x
needs to be in a different sub-DAG, and since most of the random walks are likely to go through x
, they are unlikely to go through that conflicting transaction.
When one posts a transaction, they need to post some proof of work, so attempting to build a competing subgraph with more transactions is expensive.
For as long as all participants (or rather more than 50% of them in terms of compute power they control) behave as prescribed, everything should be fine. And with that we have few problems.
What’s wrong with IOTA
1. Proof of Work at transaction time is not feasible long term
There’s a major problem with Proof of Work at transaction time and at transaction time only. Let’s say at the present moment the proof of work takes ~5 minutes per transaction (I get this value from here). Let’s also assume that IOTA reaches 1000 transactions per second. Right now they are at ~10 tps, so this would require increasing their adoption by two orders of magnitude. This means that in one second all the nodes that posted transactions on that second will perform 1000 * 5 * 60
CPU seconds worth of work. So if an attacker wants to have more compute power, they just need a cluster with 300,000 CPUs. That is by no means unreachable. Note that 5 minutes above was referring to slower CPUs, so the attacker actually needs even less than that. And note that the attacker also doesn’t need to perform random walks, that are also not free.
And 5 minutes proof of work per transaction is a complete show stopper, because that is effectively time that is necessary to submit a transaction. It completely nullifies all the benefits of fast transaction confirmations. But lowering that number would mean reducing further the cost of attacking the system, and increasing it would make posting transactions less practical.
So even long term their proof of work will not be sufficient to withstand a not that expensive attack. Short term, while they still have only ~10tps, their proof of work is so weak that they actually have to maintain a centralised node called a Coordinator that submits transactions that everyone is encouraged to trust (and if you think that this makes the system centralised, IOTA CTO has some strong words for you, see his response to point B).
2. Users are actually strongly incentivised not to perform random walks
IOTA developers published a whitepaper in which they try to prove that performing random walks achieves a (“almost symmetric” as they call it) Nash Equilibrium for all the participants. That would mean that deviating from the prescribed behavior would be disadvantageous for any participant for as long as all other participants continue behaving as expected.
The whitepaper is rather complex, however it a) operates under lots of simplifying assumptions, b) doesn’t provide a full formal proof due to complexity of it, and uses simulation instead and c) actually proves that it might be beneficial for other nodes to deviate slightly.
Coincidentally, proving that something reaches a Nash Equilibrium is way harder that proving that something doesn’t. To prove the latter it is sufficient to just show a strategy that provides huge benefits for the participant, even if other participants continue behaving properly. Let’s try to do just that.
Recall that a transaction is likely to get approved if it approves two other transactions that had a high chance of being approved. That means that any transaction that enters circulation is actually very likely to be approved for as long as most of the nodes in the system are honest. That means that to find two likely Tips one can just choose two random recently published transactions. Considering that IOTA targets IoT devices a strategy of maintaining a list of 100 or 1000 (or 2) recent transactions, and always approving them is superior to maintaining an expensive graph in memory, and performing any random walks on it.
With present IOTA implementation such behavior is completely unnoticeable, since the honest nodes are also very likely to choose some recently published transactions, and the system continues to operate as expected even if the number of nodes following the lazy strategy increases to scary percentages. Even if the majority of nodes follow the lazy strategy, the system continues to operate properly (since a transaction posted in a system with majority of nodes approving random recent transactions is likely to be quickly approved). However once the majority of nodes switch to this lazy strategy, the system becomes vulnerable to a double spend attack: an attacker creates a transaction, waits for it to be approved by a large number of participants, then gets the goods from the merchant, creates a double spend transaction, and uses just enough compute power to create 51–501 approvals for it. The nodes that approve a random transaction from among the last 100 or 1000 will be more likely to pick up such transaction than the transaction from the bigger graph, and the double spend will be executed successfully.
3. There’s no intuitive way to say if a transaction is finalised
This point is less critical than the previous two, but it is still a problem in my view. In blockchain the transaction is considered finalised if it is in one of the block in the longest chain, and sufficient number of blocks are built on top. Most of the time the “longest chain” is well defined (and when it is not, most of the block among the competing longest chains are shared), and checking if a certain transaction belongs to them is easy. In IOTA there are many Tips, some of which have been abandoned for a long time. To confirm that a transaction is actually approved, one needs to make sure that most of the likely Tips approve it, but what a likely Tip is is not well defined. So it is not clear how one shall define for themselves if their transaction is actually finalised.
Note that at the present time there’s a way to say if a transaction is finalised with certainty, since there’s an Coordinator, and everything that is approved by the Coordinator is considered to be finalised. But it is not a decentralised way of approving things (it’s like having a DPOS implementation in which there’s only one validator node, and it never changes).
How to make IOTA actually work
I think I have a solution that addresses some of the problems described above.
First let me present the idea in its easiest form, and then talk about how it can be improved.
Imagine a new kind of transactions, call it “Finalisation Transaction”, that requires significantly more work that regular transactions. Such transaction is similar to a regular IOTA transaction in that it directly approves two other transactions, but it has few differences:
- It requires to solve a way more complex puzzle, ideally of adjusting complexity, such that a new finalisation transaction appears on average every two minutes
- Such transaction actually emits few new IOTA tokens.
- Such transaction must approve all the previous finalisation transactions it is aware of (if it is not aware of some recently published, it creates a fork, that is later resolved by the next finalisation transaction either indirectly approving both if it doesn’t create conflict, or approving one randomly).
- Such transaction must traverse the DAG all the way from the genesis block, and must include the seed that was used for the traversal so that everyone could verify the traversals were done properly.
We assume all transactions approved by a sufficient number of such transactions to be finalised. This effectively makes it similar to a blockchain, with the difference that one doesn’t need to wait for a block to consider their transaction finalised (though one should for extra security, and ideally for more than one).
With some benefits of blockchain it also brings some disadvantages (e.g. forks are possible, work is wasted, new tokens are emitted).
It also has the following weaknesses for which I do not have a complete solution yet, but all seem to be very solvable:
- If the node that mines a finalisation transaction decides on the seed, it can approve some transaction that is orphaned, by trying many different seeds until it finds one that suits it best. Ideally the seed shall be generated by some decentralised random number generator.
- If the random walk starts at genesis, it might not include the previous finalisation transaction. It could be solved by either restarting random walks until at least one walk includes the previous finalisation transaction, or starting the walks from the last finalisation transaction itself.
- It still takes few such finalisation transactions before one can be reasonably certain their transaction is approved. A better approach (that also replaces proof of work with proof of stake, and as such saves some trees) would be to adopt Casper or a variation of it to make IOTA somehow elect witnesses who reach a byzantine consensus on the finalisation transactions. Note that Casper only works if the consensus needs to be reached relatively infrequently, and most of the nodes agree on what the consensus shall be (since once a node casts a vote at a certain height, it can’t revote, and if many nodes have casted votes on conflicting outcomes, the consensus can’t be reached at the given height, and the height is skipped). IOTA at 1000 transactions per second will be a terrible fit for vanilla Casper, so some modifications would be necessary.
Unfortunately, ideas above bring all the complexity of decentralised random numbers generation, validators election and Byzantine consensuses. All the complexity that IOTA attempts to solve. Introducing such complexity into IOTA would eliminate its biggest advantage — its beautiful simplicity.
At this point all the above has more of a theoretical interest, since, if the history is any indication, IOTA developers would never accept that their system has any issues to begin with, and others are unlikely to attempt to build a fork of IOTA.