Split Scale: Scaling Bitcoin by Partitioning the UTXO Space

James Ovenden
Primalbase
Apr 5, 2019

This is a transcript of Kazım Rıfat Özyılmaz’s presentation at Master Workshop: Layer 1 solutions, which took place at Primalbase AMS.

On-chain scaling is easy, right? If we make Bitcoin blocks one gigabyte then we have no problem: you almost match the transaction throughput of Visa. But we all know that that’s not the case. We have to keep the network decentralised. Furthermore, the bandwidth requirements of running a full Bitcoin node are actually a bit high.

A Bitcoin full node needs about 400 kilobits per second of upload. You need to have at least 700 kilobits per second in total bandwidth to achieve a maximum of seven transactions per second. To achieve 2000 transactions per second, we would need 200 megabits per second connections. This is very high. So will the future of full nodes be in the data centre?
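The jump from 700 kilobits per second to 200 megabits per second follows if bandwidth is assumed to grow linearly with throughput. That linear model is an assumption made here for illustration; the two baseline figures are the ones quoted in the talk.

```python
# Baseline figures quoted in the talk.
BASELINE_KBPS = 700   # total bandwidth needed today, in kilobits per second
BASELINE_TPS = 7      # maximum transactions per second today


def required_bandwidth_kbps(target_tps: float) -> float:
    """Scale the baseline bandwidth linearly with throughput (an assumption)."""
    kbps_per_tps = BASELINE_KBPS / BASELINE_TPS
    return kbps_per_tps * target_tps


mbps = required_bandwidth_kbps(2000) / 1000
print(f"{mbps:.0f} Mbit/s")  # prints "200 Mbit/s"
```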

It is the same story when it comes to storage. Bitcoin is at 190 gigabytes right now and growing by over one gigabyte per week. If we consider full SegWit blocks, it’s going to increase by four gigabytes per week. It’s also going to require a lot of storage in the future.
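The weekly growth figures can be checked with a rough calculation. The sketch below assumes a 10-minute block interval and block sizes of about 1 MB and about 4 MB (completely full blocks); neither number is stated in the talk, so treat them as illustrative inputs.

```python
BLOCK_INTERVAL_MINUTES = 10  # assumed target block interval


def weekly_growth_gb(block_size_mb: float) -> float:
    """Chain growth per week in gigabytes if every block has the given size."""
    blocks_per_week = 7 * 24 * 60 // BLOCK_INTERVAL_MINUTES  # 1008 blocks
    return blocks_per_week * block_size_mb / 1000


print(f"{weekly_growth_gb(1):.1f} GB per week")  # prints "1.0 GB per week"
print(f"{weekly_growth_gb(4):.1f} GB per week")  # prints "4.0 GB per week"
```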

So is there hope?

Split-Scale

We want to create multiple blocks in the same block interval for Bitcoin by creating multiple sub-chains. We are going to introduce another chain called the eigenchain, which keeps track of the sub-chain headers, and we are also going to introduce a new type of Bitcoin node, called a half node. Half nodes enable a user to track one sub-chain and the eigenchain, instead of getting all the block messages.

In order to do that we have to look a bit deeper into UTXOs.

Bitcoin transactions consist of inputs and outputs. In order to make a transaction you have to reference a previous transaction that sent money to you. Currently, there are 57 million UTXOs and a good number of them are dust.

Basically, this is how it looks in the implementation. There are three classes, and the most important one in my opinion is CTxOut, which is the transaction output. The most important part there is scriptPubKey, which defines where we are sending our money.
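The slide referred to here is not reproduced in the transcript. As a rough stand-in, the sketch below mirrors in Python what appear to be the three relevant Bitcoin Core classes (COutPoint, CTxIn and CTxOut). The field names are Python paraphrases chosen for this example, not the C++ member names, and serialization is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class COutPoint:
    txid: bytes  # hash of the transaction that created the output
    n: int       # index of the output inside that transaction


@dataclass(frozen=True)
class CTxIn:
    prevout: COutPoint       # which earlier output is being spent
    script_sig: bytes        # data that satisfies the output's script
    sequence: int = 0xFFFFFFFF


@dataclass(frozen=True)
class CTxOut:
    value: int               # amount in satoshis
    script_pubkey: bytes     # locking script: defines where the money goes


# A received output, and an input that references it in order to spend it.
# The txid and script bytes are placeholders, not real chain data.
received = CTxOut(value=50_000, script_pubkey=b"\x76\xa9")
spend = CTxIn(prevout=COutPoint(txid=b"\x11" * 32, n=0), script_sig=b"")
```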

These are the two most widely used types of scriptPubKey: pay-to-public-key-hash and pay-to-script-hash. The public key hash you see is basically your address, and the script hash is likewise a hash of your script. These two types of scriptPubKey make up 90% of the total Bitcoin transactions. So after that background knowledge, is there hope?

To solve this, we divide the UTXO set by calculating the hash of the scriptPubKey. If you have one address and a lot of transactions have been sent to that specific address, the hash values of all these transaction outputs will be the same. So we calculate hash values for all the unspent transaction outputs and then we divide them.
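A minimal sketch of that hashing step follows. The talk does not say which hash function is used, so SHA-256 is an assumption here, and `TxOut`, `p2pkh_script` and `partition_key` are hypothetical names invented for this example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TxOut:
    value_sat: int
    script_pubkey: bytes


def p2pkh_script(pubkey_hash: bytes) -> bytes:
    """Standard pay-to-public-key-hash script for a 20-byte key hash:
    OP_DUP OP_HASH160 <push 20 bytes> OP_EQUALVERIFY OP_CHECKSIG."""
    return bytes([0x76, 0xA9, 0x14]) + pubkey_hash + bytes([0x88, 0xAC])


def partition_key(out: TxOut) -> bytes:
    """Hash of the locking script; SHA-256 is an assumed choice."""
    return hashlib.sha256(out.script_pubkey).digest()


# Placeholder 20-byte key hashes standing in for two different addresses.
alice = p2pkh_script(b"\x01" * 20)
bob = p2pkh_script(b"\x02" * 20)

# Two payments to the same address share a key regardless of amount,
# so they always fall on the same side of any split.
same = partition_key(TxOut(1_000, alice)) == partition_key(TxOut(9_999, alice))
different = partition_key(TxOut(1_000, alice)) != partition_key(TxOut(1_000, bob))
```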

Dividing Chains

There are a couple of strategies to divide these UTXO hashes. You can start by accumulating the amount of Bitcoin, and when you reach half of the Bitcoin in circulation you can say that this is the threshold. Everything before it will be Bitcoin number 1 and everything after it will be Bitcoin number 2.
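The accumulation strategy might look like the sketch below: sort the outputs by hash value, add up their amounts, and take the hash at which the running total reaches half of the total as the threshold. The function names and the one-byte keys are hypothetical, and sending the threshold key itself to sub-chain 1 is an arbitrary choice made for this example.

```python
from typing import List, Tuple


def find_threshold(utxos: List[Tuple[bytes, int]]) -> bytes:
    """Return the smallest hash key at which the cumulative value,
    taken in ascending key order, reaches half of the total value."""
    total = sum(value for _, value in utxos)
    running = 0
    for key, value in sorted(utxos):
        running += value
        if 2 * running >= total:
            return key
    raise ValueError("empty UTXO set")


def sub_chain_for(key: bytes, threshold: bytes) -> int:
    """Keys up to and including the threshold go to sub-chain 1."""
    return 1 if key <= threshold else 2


# Toy UTXO set of (hash key, amount); one-byte keys keep it readable.
utxos = [(b"\x10", 5), (b"\x80", 3), (b"\xf0", 2), (b"\x40", 10)]
threshold = find_threshold(utxos)  # b"\x40": running total 15 of 20
```

With a skewed distribution like this one the two halves are not exactly equal (15 versus 5), because an output cannot be split at the threshold.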

We are effectively dividing the spendable money into two separate groups, and by doing that we are actually dividing the chainstate DB, which is the database that the Bitcoin implementation uses to keep track of UTXOs. We are also dividing the mempool, which is the pool of transactions that nodes receive. So every sub-chain will accept only a subset of UTXOs and a subset of transactions, and we effectively have chainstate DB 1, chainstate DB 2, mempool 1 and mempool 2. Throughput will increase because in one block interval you are now able to create multiple blocks, and they will not clash with each other; they don’t have to query each other because the UTXOs are already separated. So you effectively have independent sub-chains as well. No cross-sub-chain transactions are possible. We are going to revisit that, but let’s say that is the case for now.
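One way to picture the split mempool is a router that looks at the UTXOs a transaction spends and places it in the mempool of the sub-chain that owns them. This is only a sketch: the integer keys, the `Tx` type and the rule of routing by inputs are assumptions made here, and the talk does not specify how newly created outputs are assigned.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Tx:
    name: str
    input_keys: List[int]  # partition keys of the UTXOs being spent


def sub_chain_of(key: int, threshold: int) -> int:
    return 1 if key <= threshold else 2


def route(tx: Tx, threshold: int, mempools: Dict[int, List[Tx]]) -> Optional[int]:
    """Place tx in the mempool of the single sub-chain owning all of its
    inputs; return None if the inputs span more than one sub-chain."""
    chains = {sub_chain_of(key, threshold) for key in tx.input_keys}
    if len(chains) != 1:
        return None  # would be a cross-sub-chain transaction: not accepted
    chain = chains.pop()
    mempools[chain].append(tx)
    return chain


mempools: Dict[int, List[Tx]] = {1: [], 2: []}
THRESHOLD = 100  # hypothetical split point

r1 = route(Tx("a", [10, 20]), THRESHOLD, mempools)   # 1
r2 = route(Tx("b", [150]), THRESHOLD, mempools)      # 2
r3 = route(Tx("c", [10, 150]), THRESHOLD, mempools)  # None
```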

So what is the mining going to look like? Well, we are not addressing mining pools. The mining will be relatively the same: the miners should create blocks for all the sub-chains plus one block for the eigenchain, which should contain the hashes of these sub-chain blocks. So mining will still be one big monolithic thing: miners have to calculate everything, and the eigenchain will encapsulate the hash values of all these sub-chain blocks. The actual difficulty level will be determined by the eigenchain, so there will be proof of work for sub-chain blocks but it will be relatively very low.
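The relationship between sub-chain headers and the eigenchain block can be sketched as follows. The header layout (previous hash, then the concatenated sub-chain header hashes, then a nonce), the double SHA-256 and the toy target are all assumptions made for this example; the lighter proof of work on the sub-chain blocks themselves is left out.

```python
import hashlib
from typing import List

TOY_TARGET = 1 << (256 - 12)  # about 12 leading zero bits, so it runs quickly


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def commitment(sub_headers: List[bytes]) -> bytes:
    """Concatenated hashes of every sub-chain block header."""
    return b"".join(dsha256(h) for h in sub_headers)


def mine_eigen_block(prev_hash: bytes, sub_headers: List[bytes], target: int) -> bytes:
    """Search for a nonce so the eigenchain header hash falls below target."""
    body = prev_hash + commitment(sub_headers)
    nonce = 0
    while True:
        header = body + nonce.to_bytes(8, "little")
        if int.from_bytes(dsha256(header), "big") < target:
            return header
        nonce += 1


def is_valid_eigen_block(header: bytes, sub_headers: List[bytes], target: int) -> bool:
    """The header must commit to exactly these sub-chain headers and meet target."""
    expected = commitment(sub_headers)
    commits = header[32:32 + len(expected)] == expected
    return commits and int.from_bytes(dsha256(header), "big") < target


headers = [b"sub-chain 1 header", b"sub-chain 2 header"]  # placeholder bytes
eigen = mine_eigen_block(b"\x00" * 32, headers, TOY_TARGET)
```

Because the proof of work covers the hashes of all sub-chain headers, swapping any one header after the fact invalidates the eigenchain block.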

What Is the Network Going to Look Like?

Now we are able to actually partition the network. We have full nodes. Miners are also full nodes because they have to listen to all mempool messages, since they have to create sub-chain blocks for all the sub-chains. Full nodes send complete block messages to each other, but the sub-chain networks only receive the required sub-chain block plus the eigenchain block, so it’s possible to conserve bandwidth, because full nodes are not going to send the whole block message there. All the sub-chain networks are able to verify independently because they have a subset of UTXOs. The sub-chain nodes will also relay messages to full nodes.
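The relay rule described above can be sketched as a filter on the block message. `BlockMessage` and `message_for_peer` are hypothetical names, and the byte sizes are toy values chosen only to show the saving.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BlockMessage:
    eigen_block: bytes
    sub_blocks: Dict[int, bytes]  # sub-chain id -> serialized sub-chain block


def message_for_peer(full: BlockMessage, tracked: Optional[int]) -> BlockMessage:
    """Full-node peers (tracked=None) get everything; a half node gets the
    eigenchain block plus only the sub-chain block it tracks."""
    if tracked is None:
        return full
    return BlockMessage(full.eigen_block, {tracked: full.sub_blocks[tracked]})


def size(msg: BlockMessage) -> int:
    return len(msg.eigen_block) + sum(len(b) for b in msg.sub_blocks.values())


# Toy sizes: a 200-byte eigenchain block and four 1000-byte sub-chain blocks.
full_msg = BlockMessage(b"e" * 200, {i: b"s" * 1000 for i in range(1, 5)})
half_msg = message_for_peer(full_msg, tracked=2)

full_size = size(message_for_peer(full_msg, None))  # 4200 bytes
half_size = size(half_msg)                          # 1200 bytes
```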

One of the problems that is going to arise at this stage is that if you don’t have enough money in a specific sub-chain to send to another peer, but your total balance is sufficient, you have to make multiple transactions, one on each sub-chain involved. In order to achieve that in a secure and successful manner you have to use hash time-lock contracts.

You have to create some data, take the hash of it, use that hash to lock your transactions, and in the end reveal the actual data to your merchant or your peer so they can cash out. If all the transactions don’t happen in a timely manner on all the sub-chains, then of course this money will be claimed back by you, so you are not going to lose anything. This is one of the mechanisms that should be taken care of, but all these things can be handled by wallet implementations, so it is, relatively speaking, transparent for users.
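The flow can be simulated with a toy contract object. This is a simplification and not Bitcoin Script: the `HTLC` class, the block heights and the rule that the receiver may only claim before the timeout are assumptions made for this sketch.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class HTLC:
    amount: int
    hash_lock: bytes       # hash of the sender's secret data
    timeout_height: int    # from this height on, the sender can take it back
    spent_by: Optional[str] = None

    def claim(self, preimage: bytes, height: int) -> bool:
        """Receiver cashes out by revealing the data behind the hash."""
        ok = (self.spent_by is None
              and height < self.timeout_height
              and hashlib.sha256(preimage).digest() == self.hash_lock)
        if ok:
            self.spent_by = "receiver"
        return ok

    def refund(self, height: int) -> bool:
        """Sender reclaims the money once the timeout has passed."""
        ok = self.spent_by is None and height >= self.timeout_height
        if ok:
            self.spent_by = "sender"
        return ok


secret = b"hypothetical secret data"
lock = hashlib.sha256(secret).digest()

# One payment split over two sub-chains, both locked to the same hash.
part1 = HTLC(amount=3, hash_lock=lock, timeout_height=110)
part2 = HTLC(amount=2, hash_lock=lock, timeout_height=110)
paid = part1.claim(secret, height=105) and part2.claim(secret, height=105)

# A payment that is never claimed goes back to the sender after the timeout.
stale = HTLC(amount=5, hash_lock=lock, timeout_height=110)
too_early = stale.refund(height=109)
refunded = stale.refund(height=110)
```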

We talked about Bitcoin-NG a bit. There is the failed SegWit2x, which is only a block size increase, and there is Bitcoin-NG. There are key blocks that miners introduce, and then there is a 10-second microblock interval. The leader, the party that introduced the key block, is able to create microblocks, which increases the throughput a great deal but doesn’t help you conserve any bandwidth or storage. With Split-Scale, in terms of miners, you have to do all the mining: you are not allowed to mine a specific sub-chain, for security reasons.

Basically, full nodes will receive the big blocks, so they are also unable to conserve anything. But the half nodes will only receive the specific sub-chain and the eigenchain, so their storage requirements will be relatively low. Of course full nodes will store everything, like in regular Bitcoin. The bandwidth requirements will also decrease for half nodes because they are only getting the block hashes and the specific sub-chain block messages. Therefore they are able to conserve bandwidth, they are still able to check transactions, and they are able to make transactions in a secure manner.

To sum up, we want to increase the throughput by making it possible to create multiple blocks in the same block interval by separating the UTXO set, and this new type of Bitcoin node, the half node, makes it possible to run in a bandwidth-constrained environment. So thank you very much.

Q&A

I was wondering how you force or restrict miners to mine on all chains and not on specific chains?

The only option is broadcasting the block message. If they don’t broadcast the block message, they don’t get anything.

But as an adversary, I can just mine on one of the sub chains to do that?

And you also have to create an eigenchain block, which carries the actual difficulty. So you have to mine the eigenchain block as well, not only the sub-chain.

Maybe some honest node mines for me and includes a reference to my chain? I’m not getting how this works: how do I pick the chain to mine, or what can I do as an adversary to selectively mine only particular chains?

If you selectively mine a particular chain, you have a block header for that chain but you don’t have the other ones, so you just fill them with garbage data. Then the collective hash of these headers should also be lower than a difficulty value, which is the actual mining operation. That’s what you didn’t do, so you are unable to broadcast a valid block message to the network, and so you are unable to mine on a specific sub-chain. The key part is the eigenchain: it stores the hash values of the sub-chain block headers and carries the difficulty; this is the actual proof of work. So you have to create an eigenchain block.

How bad are the hash time-lock transactions? How do they affect the throughput, given that you now need some kind of synchrony?

The reason we have to use hash time-lock contracts is that we are not 100% sure whether these multiple transactions are mined, but HTLCs give us the capability that if they are not mined within, for example, a ten-block interval, I can claim the money back as the sender. It’s not wasted and it’s not lost.

James Ovenden
Primalbase

Editor-in-chief @ Luno, blockchain enthusiast, crypto dweeb, eats mustard with a spoon