Bitcoin: transactions, malleability, SegWit and scaling

Published in LightningTo.Me, Aug 24, 2017

As of today, the bitcoin community has finally achieved its long-awaited goal: SegWit has been activated. The panic is over (at least for the moment), the price has climbed… It’s time to figure out: what is this SegWit, why is everyone talking about it, and why do we need it in the first place?

Disclaimer. I consciously omit or simplify some of the technical details. I am also not an expert and may simply be wrong.

Segregated Witness (SegWit) is a collective name for a bunch of changes to the bitcoin protocol. Many of these changes deserve attention of their own, but I will focus only on the first (and most important) one: a change to how the transaction identifier is computed.

Although recent talk about SegWit is mostly tied to the scaling debate, the initial proposal was solving a very different problem: transaction malleability.

But let’s move one step at a time: what is a bitcoin transaction?

Transaction

For simplicity, let’s consider a transaction from an address A to an address B with a single input and a single output. An example of such a transaction:

Input:
Previous tx: f5d8ee39a430901c91a5917b9f2dc19d6d1a0e9cea205b009ca73dd04470b9a6
Index: 0
scriptSig: 304502206e21798a42fae0e854281abd38bacd1aeed3ee3738d9e1446618c4571d1090db022100e2ac980643b0b82c0e88ffdfec6b64e3e6ba35e7ba5fdd7d5d6cc8d25c6b241501
Output:
Value: 5000000000
scriptPubKey: OP_DUP OP_HASH160 404371705fa9bd789a2fcd52d2c580b65d35549d OP_EQUALVERIFY OP_CHECKSIG

And here is what it means:

Previous tx: the identifier of the previous transaction, the one that sent funds to address A;
Index: the number of the output of that previous transaction that is being spent (here it is output number 0);
scriptSig: first part of the validation script (more about it below);
Value: the number of bitcoins to send, in satoshi (one bitcoin = 100 million satoshi), which is 50 bitcoins in the example;
scriptPubKey: second part of the validation script, which also contains the receiver address B.

Transactions are linked in chains (well, in trees in fact, but this does not matter for now). To spend bitcoins from address A, one needs to spend the output of some earlier transaction to address A (except for coinbase transactions, which reward miners for generating a block and do not spend any previous output).

To verify that one can spend the output of the previous transaction, one needs to:

Combine scriptSig of a new transaction with the scriptPubKey of the previous transaction,
Check that the resulting script (written in a special language called, you guessed it, Script) is valid and returns True.

In the simplest case (like in this example), scriptSig contains a transaction signature (signed with a private key from this address), and scriptPubKey simply checks the equality of the public key and the validity of the signature (the main operation here is OP_CHECKSIG).
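The two verification steps above can be sketched as a toy stack machine. This is only an illustration under loud assumptions: `toy_hash160`, `toy_sign` and `toy_checksig` are made-up stand-ins (real bitcoin uses RIPEMD160(SHA256(x)) and ECDSA signatures), and opcodes are written as strings instead of bytes. Note that for this script type the scriptSig carries the public key along with the signature.

```python
import hashlib

def toy_hash160(data: bytes) -> bytes:
    # Stand-in for the real HASH160 = RIPEMD160(SHA256(data)).
    return hashlib.sha256(data).digest()[:20]

def toy_sign(pubkey: bytes, message: bytes) -> bytes:
    # NOT real cryptography: a placeholder "signature" anybody could forge.
    return hashlib.sha256(b"sig" + pubkey + message).digest()

def toy_checksig(sig: bytes, pubkey: bytes, message: bytes) -> bool:
    return sig == toy_sign(pubkey, message)

def run_script(script, message: bytes) -> bool:
    """Run scriptSig + scriptPubKey on one stack; True means 'may spend'."""
    stack = []
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)                      # plain data push
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(toy_hash160(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False                        # wrong public key
        elif item == "OP_CHECKSIG":
            pubkey = stack.pop()
            sig = stack.pop()
            stack.append(toy_checksig(sig, pubkey, message))
        else:
            raise ValueError(f"unsupported opcode: {item}")
    return bool(stack) and stack[-1] is True

pubkey = b"public key of A"
message = b"the new transaction being signed"

# scriptPubKey comes from the previous transaction, scriptSig from the new one.
script_pubkey = ["OP_DUP", "OP_HASH160", toy_hash160(pubkey),
                 "OP_EQUALVERIFY", "OP_CHECKSIG"]
script_sig = [toy_sign(pubkey, message), pubkey]

print(run_script(script_sig + script_pubkey, message))  # True
```

Swapping in a different key makes OP_EQUALVERIFY fail, so the script returns False and the output cannot be spent.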

Of course, one does not need a special programming language just to handle these simple cases. But Script enables bitcoin to do much more. For example, using this mechanism, one can create a multisignature address: an address from which funds can be spent only if the transaction is signed by multiple keys belonging to different parties.

Transaction malleability

To explain what transaction malleability is, we need to clarify one point: what exactly is a transaction identifier, or simply txid?

txid is a sha256d hash (SHA-256 applied twice) of all the fields of the transaction data.

What matters in this definition is that the txid depends on scriptSig.

Every full node in the bitcoin network (not only miners) helps to collect and distribute information about transactions. In the process, a node can mutate scriptSig in such a way that the signature stays valid and the transaction has the same effect, but the txid changes.

For example, one can add an OP_NOP operation (that does nothing). Or, for some sophistication, one can add two operations: OP_DUP OP_DROP (the first one duplicates the signature on the stack, and the second one removes the copy again). The signature is still valid, but the txid changes.
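A minimal sketch of this effect, assuming a toy serialization that simply glues the fields together (the real wire format is more involved) and placeholder bytes for the scripts. `sha256d` is the real double SHA-256, and 0x61 is the byte that encodes OP_NOP; everything else here is hypothetical.

```python
import hashlib

OP_NOP = b"\x61"  # the byte that encodes OP_NOP in Script

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash used for txid."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def toy_txid(prev_tx: bytes, index: int, script_sig: bytes,
             value: int, script_pubkey: bytes) -> str:
    # Toy serialization (an assumption of this sketch): concatenate the fields.
    blob = (prev_tx + index.to_bytes(4, "little") + script_sig
            + value.to_bytes(8, "little") + script_pubkey)
    return sha256d(blob).hex()

prev_tx = bytes.fromhex(
    "f5d8ee39a430901c91a5917b9f2dc19d6d1a0e9cea205b009ca73dd04470b9a6")
script_sig = b"<signature bytes>"        # placeholder, not a real signature
script_pubkey = b"<pay to address B>"    # placeholder, not a real script

original = toy_txid(prev_tx, 0, script_sig, 5_000_000_000, script_pubkey)
# A relaying node appends a do-nothing opcode to scriptSig.
mutated = toy_txid(prev_tx, 0, script_sig + OP_NOP, 5_000_000_000, script_pubkey)

print(original == mutated)  # False
```

The inputs, outputs and amounts are identical in both versions; only the identifier differs.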

And now we can see a problem. Two problems, in fact.

1. txid is a bad identifier

This mutability does not permit the use of txid as it was intended: as an identifier. An attacker can intercept a normal transaction and distribute a modified version through the network. With some probability the miners will include this modified transaction instead of the original one.

You may ask: why would an attacker do that? In an ideal world there is indeed not much point in doing that. But in the real world people like using identifiers for unique identification. And this is reportedly what happened to the MtGox exchange, which blamed transaction malleability when it halted withdrawals in early 2014 (this is only one of the versions; they could have simply stolen the funds themselves).

An attacker was withdrawing funds from the exchange, intercepting the transaction, and changing its txid. The transaction was still valid and the attacker got the bitcoins. But the exchange saw that the original txid was never included in a block and did not decrease the attacker's balance.
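The bookkeeping mistake can be sketched like this. `ToyExchange` and the txid strings are entirely hypothetical; nothing here is taken from any real exchange's code, it only illustrates what goes wrong when a withdrawal is tracked by txid alone.

```python
class ToyExchange:
    """Hypothetical exchange that tracks withdrawals only by txid."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.pending = {}  # txid -> (user, amount)

    def withdraw(self, user, amount, txid):
        # The exchange broadcasts a transaction and remembers the txid
        # it computed itself.
        self.pending[txid] = (user, amount)

    def on_block(self, txids_in_block):
        # The user is debited only when "our" txid shows up in a block.
        for txid in txids_in_block:
            if txid in self.pending:
                user, amount = self.pending.pop(txid)
                self.balances[user] -= amount

exchange = ToyExchange({"mallory": 10})
exchange.withdraw("mallory", 10, "txid-original")

# The mutated version of the same transaction gets mined instead.
exchange.on_block({"txid-mutated"})

print(exchange.balances["mallory"])  # 10: the coins are gone, the balance is not
```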

2. Transaction chains in one block

The second problem is not so much a problem as a missed opportunity.

Theoretically, bitcoin allows spending the outputs of transactions even before they are included in a block. One can transfer funds from A to B and then, without waiting for any confirmations, transfer bitcoins from B to C. Both transactions could be validated and included in the next block, if not for transaction malleability.

If the txid of the first transaction can change, the Previous tx field of the second transaction has to change as well. This means that the second transaction can only be safely created once the first transaction is included in a block (and its txid is therefore fixed).
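Here is a small sketch of why the second transaction breaks. It assumes a toy txid (double SHA-256 over scriptSig plus the remaining fields) and models the set of spendable outputs as a plain Python set of (txid, index) pairs; all names are made up for illustration.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def toy_txid(script_sig: bytes, other_fields: bytes) -> str:
    # Toy model: the txid covers scriptSig together with everything else.
    return sha256d(script_sig + other_fields).hex()

# A -> B, not yet confirmed.
parent_fields = b"A pays B"
parent_txid = toy_txid(b"sig-of-A", parent_fields)

# B -> C is created right away and points at output 0 of the parent.
child_input = (parent_txid, 0)

# Before mining, somebody appends OP_NOP (0x61) to the parent's scriptSig.
mined_parent_txid = toy_txid(b"sig-of-A" + b"\x61", parent_fields)

# After the block, only the output of the *mined* parent exists.
spendable_outputs = {(mined_parent_txid, 0)}

print(child_input in spendable_outputs)  # False: the child spends nothing
```

The child transaction refers to an output that never appeared on the chain, so it is invalid and has to be created and signed again.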

And this problem is bigger than one might think, but we will come to it later.

Segregated Witness

And now it becomes easy to explain what the SegWit proposal is all about.

SegWit suggests: let’s move all the malleable information out of the transaction into separate “witness data”. The txid will be computed without it, the identifier will never be able to change, and the problems will be fixed.

Transaction schema before and after SegWit. scriptSig is now empty, the data moves to a separate “witness data”. Everything else stays the same: *txid and transaction size* are still computed from the same fields as before.
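In the same toy model as before, the idea looks roughly like this. The serialization is an assumption of the sketch; the only real-world facts relied on are that the txid no longer covers the witness, and that SegWit defines a second identifier (wtxid) that does.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def toy_txid(base_fields: bytes) -> str:
    # After SegWit: the witness is not part of the hashed data.
    return sha256d(base_fields).hex()

def toy_wtxid(base_fields: bytes, witness: bytes) -> str:
    # A second identifier that does cover the witness.
    return sha256d(base_fields + witness).hex()

base_fields = b"inputs (empty scriptSig), outputs, amounts"
witness = b"<signature bytes>"
tampered_witness = witness + b"\x61"  # same kind of mutation as before

print(toy_txid(base_fields) == toy_txid(base_fields))   # True: txid is stable
print(toy_wtxid(base_fields, witness)
      == toy_wtxid(base_fields, tampered_witness))      # False
```

Whatever a relaying node does to the witness, a transaction that spends this one can keep pointing at the same txid.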

Easy, right? Yes, but it changes the transaction definition (which has not changed much since the beginning), changes the validation mechanisms, and requires huge code changes. On top of that, it would be nice to make SegWit compatible with the previous protocol versions.

Because of all that (and also because of some political games, which I am not going to talk about), SegWit activation took so long.

What does it have to do with scaling?

It turns out that this fix also has a huge impact on the scaling of bitcoin, for two reasons.

1. More transactions in a block

Look once again at the transaction example above. More than half of the data is scriptSig. By separating this data, we essentially allow transactions to be smaller and effectively enlarge the block size.

Huge lego blocks in front of The Lego Group headquarters

Of course this is cheating, because we still need to store all these signatures. But now this data counts towards the block size limit only at a discount (and old nodes do not see it at all). Why? Without going into details: because this was the only way to implement SegWit as a soft fork rather than a hard fork. And nobody likes hard forks.

Theoretically, new SegWit blocks can include up to four times more transactions. In practice, about two times more. Quite a tangible scaling improvement.
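The "four times" and "two times" figures can be checked with a little arithmetic. Under the SegWit rules (BIP141) a block is limited to 4,000,000 weight units, where every non-witness byte costs 4 units and every witness byte costs 1. The function below is a back-of-the-envelope sketch; `capacity_gain` and its parameters are names invented here, and it assumes all transactions in the block look alike.

```python
LEGACY_MAX_BLOCK_SIZE = 1_000_000  # bytes, the old limit
MAX_BLOCK_WEIGHT = 4_000_000       # weight units, the SegWit limit

def capacity_gain(tx_size: int, witness_fraction: float) -> float:
    """How many times more transactions fit in a block if `witness_fraction`
    of each transaction's bytes lives in the witness."""
    witness_bytes = tx_size * witness_fraction
    base_bytes = tx_size - witness_bytes
    weight = 4 * base_bytes + 1 * witness_bytes
    legacy_count = LEGACY_MAX_BLOCK_SIZE / tx_size
    segwit_count = MAX_BLOCK_WEIGHT / weight
    return segwit_count / legacy_count

for fraction in (0.0, 0.5, 2 / 3, 1.0):
    print(f"{fraction:.2f} of the bytes in the witness -> "
          f"x{capacity_gain(300, fraction):.2f}")
```

A transaction with no witness data gains nothing, one that consisted of witness data only would gain the full factor of four, and a transaction whose signatures make up about two thirds of its bytes lands at the factor of two.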

2. Off-chain transactions and the Lightning Network

Currently, every bitcoin transaction ever made is stored in the bitcoin blockchain. Thousands of computers around the world are storing information about me buying coffee for 0.001 bitcoins.

This is expensive for the nodes of the network. This is expensive and annoying for me: I have to pay large fees and wait for ten minutes for the confirmation of my transaction.

One of many possible solutions to this problem is to have some small transactions executed off the main chain, and then sometimes sync the balance. The common name for these approaches is “second layer network”.

One close-to-working implementation of this idea is the Lightning Network. It works approximately like this:

two people open a micropayment channel between each other, and both lock some amount of funds in it;
they execute a large number of small transactions between each other;
honest behaviour is guaranteed by some very clever crypto-magic, and conflicts are automatically resolved from the locked funds;
as soon as the channel is closed, the locked funds are unlocked and the balances are synced to the main blockchain.
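The life cycle above can be sketched as a tiny class. `ToyChannel` is a made-up name and the sketch deliberately leaves out the crypto-magic (signatures, penalties, timeouts); it only shows that many payments happen off-chain while a single final balance reaches the blockchain.

```python
class ToyChannel:
    """Hypothetical two-party channel; no real cryptography involved."""

    def __init__(self, deposit_a: int, deposit_b: int):
        # Opening the channel: both parties lock funds (one on-chain transaction).
        self.balances = {"A": deposit_a, "B": deposit_b}
        self.updates = 0
        self.is_open = True

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        # An off-chain payment: only the two parties see it.
        if not self.is_open:
            raise RuntimeError("channel is closed")
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid amount")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.updates += 1

    def close(self) -> dict:
        # Closing: the final balances are settled on-chain (one more transaction).
        self.is_open = False
        return dict(self.balances)

channel = ToyChannel(deposit_a=100_000, deposit_b=100_000)  # amounts in satoshi
for _ in range(50):
    channel.pay("A", "B", 1_000)   # fifty coffees
channel.pay("B", "A", 20_000)      # one refund

print(channel.updates)   # 51 off-chain updates
print(channel.close())   # {'A': 70000, 'B': 130000}
```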

The real power of payment channels is achieved when there are many of them. Channels form the edges of a graph between people. If two people are connected, even indirectly, they can transact with each other.
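Finding such an indirect connection is a plain graph search. The sketch below uses breadth-first search over a list of channels; `find_route` and the names in the example are hypothetical, and real routing also has to account for channel capacity and fees, which are ignored here.

```python
from collections import deque

def find_route(channels, source, target):
    """Return the shortest chain of people from source to target, or None."""
    graph = {}
    for a, b in channels:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)  # channels work in both directions

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

channels = [("alice", "bob"), ("bob", "carol"), ("dave", "erin")]
print(find_route(channels, "alice", "carol"))  # ['alice', 'bob', 'carol']
print(find_route(channels, "alice", "erin"))   # None
```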

What does SegWit have to do with that? It is not strictly necessary, but it makes the implementation much easier.

Without going into too many details… Micropayment channels rely on double-signed (2-of-2 multisignature) transactions to lock the initial deposits. In the beginning, the funds from both parties are sent to one double-signed address. To prevent cheating, the transaction that spends from this address back to the parties should be signed by both of them before any funds are actually sent there.

But to do that, one needs to spend the output of a transaction that is not yet in the main blockchain. And this is exactly the scenario that transaction malleability makes unsafe (described above in “2. Transaction chains in one block”). This is where SegWit comes to the rescue.

Conclusion

TL;DR. SegWit is activated. It solves the problem of transaction malleability, but it is also useful for scaling. In the short term it will permit more transactions in a single block. In the long term it will open new opportunities for off-chain transactions. Looking forward to the future!

Please do comment, criticize, ask questions, subscribe. More about myself: http://laptev.ch.

This article in Russian (статья на русском).