SegWit: the solution to the problem

This is the eighth in a series of posts where we discuss the core concepts behind the Blockchain, Bitcoin and Ethereum. At Verify, we’re building a reputation protocol on the Ethereum blockchain, and we are publishing these posts to share our knowledge with the wider crypto community.


We have seen in previous posts that transactions live in blocks and that miners add them to a block. As bitcoin became more popular over the years, more and more transactions were generated and queued for inclusion in a block, resulting in a growing backlog of transactions. Recall that a block can hold only a limited number of transactions, because the block size is currently capped at 1MB. The size used to be unlimited at one point, but that would have allowed attacks such as clogging the system with void transactions.

The more transactions are generated, the longer it takes for a transaction paying the regular transaction fee to be added. Consider the example of Alice sending 2 BTC to Bob at the regular transaction fee. While that transaction sits in the memory pool waiting to be added by some miner, Alice decides to send Clair 3 BTC with a higher transaction fee than the first one (assuming she has enough balance); the second transaction would be confirmed before the first. The transaction fee is the incentive miners use to prioritize which transactions to add to the block. Now assume that instead of sending 3 BTC to Clair, Alice decides to replace the transaction to Bob; she can do that (and we will see how this could happen later in the post).
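As a rough illustration of that prioritization, here is a minimal Python sketch. The fee amounts, the dictionary layout and the `pick_for_block` helper are all made up for this example, and real miners typically rank transactions by fee per byte rather than by the absolute fee.

```python
def pick_for_block(mempool, max_txs):
    """Return the highest-fee transactions first, up to max_txs of them."""
    ranked = sorted(mempool, key=lambda tx: tx["fee_satoshis"], reverse=True)
    return ranked[:max_txs]

# Hypothetical memory pool holding Alice's two transactions (fees are invented).
mempool = [
    {"label": "Alice -> Bob, 2 BTC", "fee_satoshis": 10_000},
    {"label": "Alice -> Clair, 3 BTC", "fee_satoshis": 50_000},
]

# With room for only one more transaction, the higher-fee payment is chosen.
chosen = pick_for_block(mempool, max_txs=1)
print([tx["label"] for tx in chosen])
```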

You can check block size increase visually at blockchain.info.

One way to solve such a problem is to increase the block size. However, some researchers, notably Pieter Wuille [link here], proposed optimizing the current block first. As he examined the components of the block, he realized that the signature takes up a large share of a given transaction. Since the signature (carried in the input script, scriptSig) is only used to check the transaction's validity, he proposed moving that signature data out of the transaction into another structure in the block called the witness field.

Doing this means there would be more room for transactions, because you are basically freeing a large portion of the used space. It also means that transaction hashes no longer contain the signature data. Signatures are by nature proofs, or in other words “witnesses”, for the transactions. That is how the name SegWit came about: it stands for “Segregated Witness”.

Recall that I mentioned earlier that it is possible to simply increase the block size. The reason this was not the best way to go is that it requires a hard fork: new blocks that are bigger in size would not be supported by older nodes, which would reject them as invalid.

To fully understand what is going on, let us study the transaction data.

The transaction data has the following fields:

- Version of 4 bytes
- In counter, positive integer of 1–9 bytes
- List of Inputs
- Out counter of 1–9 bytes
- List of outputs
- Lock time of 4 bytes

List of inputs:

- Prev transaction hash of 32 bytes (remember that inputs spend the outputs of other transactions)
- Prev Txout Index of 4 bytes
- TxIn script len of 1–9 bytes
- ScriptSig of variable length (given by TxIn script len)
- Sequence No of 4 bytes

List of Outputs:

- Value of 8 bytes
- Txout Script len of 1–9 bytes
- ScriptPubKey of variable length (given by Txout Script len)
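To keep the three lists above straight, it can help to picture them as nested records. The following is only a sketch: the class and attribute names are our own invention, and the sample values are made up to show the shape rather than taken from a real transaction.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TxIn:
    prev_tx_hash: bytes    # 32 bytes
    prev_txout_index: int  # 4 bytes
    script_sig: bytes      # variable length, preceded by a 1-9 byte length
    sequence_no: int       # 4 bytes

@dataclass
class TxOut:
    value: int             # 8 bytes, amount in satoshis
    script_pubkey: bytes   # variable length, preceded by a 1-9 byte length

@dataclass
class Transaction:
    version: int           # 4 bytes
    inputs: List[TxIn]     # preceded by the In counter (1-9 bytes)
    outputs: List[TxOut]   # preceded by the Out counter (1-9 bytes)
    lock_time: int         # 4 bytes

# A made-up transaction with one input and one output, just to show the shape.
example = Transaction(
    version=1,
    inputs=[TxIn(prev_tx_hash=bytes(32), prev_txout_index=0,
                 script_sig=b"", sequence_no=0xFFFFFFFF)],
    outputs=[TxOut(value=100_000_000, script_pubkey=b"")],
    lock_time=0,
)
print(len(example.inputs), len(example.outputs))
```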

With that in mind, let us take a look at a random block, say block #300000.

There is a transactions section. Let us choose the second transaction:

7301b595279ece985f0c415e420e425451fcf7f684fcce087ba14d10ffec1121

We need its raw format in hexadecimal to be able to fully understand it. To get the hex format of the transaction you just add ?format=hex to the url like this:

https://blockchain.info/tx/7301b595279ece985f0c415e420e425451fcf7f684fcce087ba14d10ffec1121?format=hex

You will get something like this:

01000000014dff4050dcee16672e48d755c6dd25d324492b5ea306f85a3ab23b4df26e16e9000000008c493046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f82bc2b80a909022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75286036df8bfabc3ea5014104b70574006425b61867d2cbb8de7c26095fbc00ba4041b061cf75b85699cb2b449c6758741f640adffa356406632610efb267cb1efa0442c207059dd7fd652eeaffffffff020049d971020000001976a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac30601c0c060000001976a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac00000000

Now we can break the transaction down into its various components:

We start with the version number (think of it as the consensus version). It is a 4-byte field, so it takes up the first 8 hex characters (remember that each hex character is 4 bits, or half a byte).

01000000
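Reading the version field can be sketched in a couple of lines of Python. The variable names are ours; the only assumption beyond the text is that the field is stored in little-endian byte order, which is how Bitcoin serializes its integers.

```python
# First 8 hex characters of the raw transaction shown above.
version_hex = "01000000"

version_bytes = bytes.fromhex(version_hex)         # 8 hex characters -> 4 bytes
version = int.from_bytes(version_bytes, "little")  # least significant byte first
print(len(version_bytes), version)
```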

Next comes the input count (In counter). It is usually 1 byte, unless that byte is one of the prefixes fd, fe or ff. If the prefix is fd, the next 2 bytes hold the input count; if it is fe, the next 4 bytes do; and if it is ff, the next 8 bytes do.

01

In our example 01 is basically saying we are working with one input. If instead of 01 we had fd2221, then fd would say that the next 2 bytes represent the input count, so 2221 is the input count (read as a little-endian number, that is 0x2122, or 8482 in decimal). Now we will get into the input list of the transaction.
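The counter rule can be sketched as a small decoder. The function name `read_varint` is our own; the sketch assumes the prefix scheme described above (fd, fe, ff followed by 2, 4 or 8 bytes) and little-endian byte order for the multi-byte forms.

```python
def read_varint(data, offset=0):
    """Decode a variable-length counter; return (value, bytes_consumed)."""
    prefix = data[offset]
    if prefix < 0xFD:
        # Any value below fd is the count itself, stored in a single byte.
        return prefix, 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[prefix]
    start = offset + 1
    value = int.from_bytes(data[start:start + width], "little")
    return value, 1 + width

print(read_varint(bytes.fromhex("01")))      # (1, 1)
print(read_varint(bytes.fromhex("fd2221")))  # (8482, 3)
```

The second element of the result tells the caller how far to advance before reading the next field.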

Recall from our post “Bitcoin Transactions: How they work” that inputs are basically the outputs of previous transactions. In this example we know there is only one input. The next field is the previous transaction hash of 32 bytes (Prev Tx).

4dff4050dcee16672e48d755c6dd25d324492b5ea306f85a3ab23b4df26e16e9

It is followed by 4 bytes that represent the output index within the referenced transaction. Remember that transactions can have several outputs; if we were spending the third output of the previous transaction, the index would be 2. In our example the index is 0, which refers to the first output of the previous transaction.

00000000
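Here is a small sketch that reads these two fields from the 36 bytes following the input count. The variable names are ours. One detail not covered in the text: block explorers display transaction ids with the bytes in reverse order compared to the raw data, so the sketch reverses the hash before printing it.

```python
# The 36 bytes that follow the input count in the example transaction.
outpoint_hex = (
    "4dff4050dcee16672e48d755c6dd25d324492b5ea306f85a3ab23b4df26e16e9"  # Prev Tx
    "00000000"                                                          # Prev Txout Index
)
outpoint = bytes.fromhex(outpoint_hex)

prev_tx_hash = outpoint[:32]
prev_index = int.from_bytes(outpoint[32:36], "little")

# Reverse the byte order to get the id in the form explorers show it.
prev_txid_as_displayed = prev_tx_hash[::-1].hex()
print(prev_txid_as_displayed, prev_index)
```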

The next field is the script length (TxIn script len). In this example it is 8c, which says that the next 140 bytes (0x8c converted to decimal) hold the signature script (scriptSig). Keep in mind that this length varies from one transaction to another; it is not fixed.

8c

493046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f82bc2b80a909022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75286036df8bfabc3ea5014104b70574006425b61867d2cbb8de7c26095fbc00ba4041b061cf75b85699cb2b449c6758741f640adffa356406632610efb267cb1efa0442c207059dd7fd652eea
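To see what sits inside those 140 bytes, here is a sketch that splits the scriptSig into its pieces. It assumes the script consists only of direct data pushes, where a byte between 0x01 and 0x4b says how many bytes of data follow; the `split_pushes` helper is our own and does not handle any other opcode.

```python
script_len = 0x8C  # the TxIn script len byte from the example

script_sig = bytes.fromhex(
    "493046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f82bc2b80a909"
    "022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75286036df8bfabc3ea501"
    "4104b70574006425b61867d2cbb8de7c26095fbc00ba4041b061cf75b85699cb2b44"
    "9c6758741f640adffa356406632610efb267cb1efa0442c207059dd7fd652eea"
)

def split_pushes(script):
    """Split a script made only of direct data pushes (opcodes 0x01-0x4b)."""
    items, i = [], 0
    while i < len(script):
        n = script[i]
        if not 1 <= n <= 0x4B:
            raise ValueError("only direct data pushes are handled in this sketch")
        items.append(script[i + 1:i + 1 + n])
        i += 1 + n
    return items

# The first item is the signature, the second the sender's public key.
signature, public_key = split_pushes(script_sig)
print(len(script_sig), len(signature), len(public_key))
```

In this transaction the scriptSig accounts for 140 of the 259 bytes, which gives a feel for how much space the signature data takes.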

The 4 bytes (ffffffff) that follow are called the sequence number. It is intended for replacing a transaction that has been sent to the memory pool. In this case all of the bytes are f, which says the transaction did not signal replaceability. The sequence number has to be lower than (0xffffffff-1) to indicate replaceability.
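The replaceability rule can be written as a one-line check. The function name is ours, and the sketch only covers the rule as stated above for a single input.

```python
MAX_SEQUENCE = 0xFFFFFFFF

def signals_replaceability(sequence_no):
    """True when the sequence number is lower than 0xffffffff - 1."""
    return sequence_no < MAX_SEQUENCE - 1

# The sequence number from our example transaction.
sequence_no = int.from_bytes(bytes.fromhex("ffffffff"), "little")
print(signals_replaceability(sequence_no))

# A hypothetical input that does opt in to being replaced.
print(signals_replaceability(0xFFFFFFFD))
```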

Now we are left with the outputs:

020049d971020000001976a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac30601c0c060000001976a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac00000000

The leading 02 is the output count, which indicates that we have 2 outputs. Recall that each output should contain the following fields:

- Value of 8 bytes
- Txout Script len of 1–9 bytes
- ScriptPubKey of variable length (given by Txout Script len)

Value is the amount to transfer. It is stored in little-endian byte order, so to obtain the value itself we need to swap the byte order and then convert the result to decimal. The number we get is the number of satoshis, which is then converted to bitcoins to get the bitcoin value.

Let us see that conversion in action. We have 0049d97102000000 in hex; we swap the byte order using this handy Python script (hex_val is the hex value to swap):

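import codecs

# hex_val is the little-endian hex value to swap
hex_val = '0049d97102000000'
# decode the hex to bytes, reverse the byte order, then encode back to hex
swapped_hex = codecs.encode(codecs.decode(hex_val, 'hex')[::-1], 'hex').decode()
print("the endian swap for the hex value is {}".format(swapped_hex))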

We will get 0000000271d94900

Let us convert that to Decimal and we get:

10500000000

That value in satoshis equals 105 BTC (1 BTC is 100,000,000 satoshis).
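The swap and the decimal conversion can also be done in one step, since Python can read an integer directly from little-endian bytes. This is only an alternative sketch; `read_value` and `SATOSHIS_PER_BTC` are names we made up.

```python
from decimal import Decimal

SATOSHIS_PER_BTC = 100_000_000

def read_value(hex_val):
    """Decode an 8-byte little-endian output value into satoshis."""
    return int.from_bytes(bytes.fromhex(hex_val), "little")

satoshis = read_value("0049d97102000000")
btc = Decimal(satoshis) / SATOSHIS_PER_BTC
print(satoshis, btc)
```

Using `Decimal` rather than a float avoids rounding surprises when the amount is not a whole number of bitcoins.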

Now we are left with

1976a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac30601c0c060000001976a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac00000000

The 19 is the scriptPubKey size in hex. In our case that is 25 bytes, so we take the next 25 bytes to be the scriptPubKey.

76a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac
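For the curious, those 25 bytes follow the common pay-to-public-key-hash layout: a few opcodes wrapped around a 20-byte hash of the receiver's public key. The sketch below checks for that layout and pulls out the hash. The helper name is ours, and the sketch recognizes only this one script pattern.

```python
script_pubkey = bytes.fromhex("76a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac")

# Opcode values used by the pay-to-public-key-hash script.
OP_DUP, OP_HASH160, OP_EQUALVERIFY, OP_CHECKSIG = 0x76, 0xA9, 0x88, 0xAC

def p2pkh_pubkey_hash(script):
    """Return the 20-byte public key hash if the script matches, else None."""
    matches = (
        len(script) == 25
        and script[0] == OP_DUP
        and script[1] == OP_HASH160
        and script[2] == 20              # push the next 20 bytes
        and script[23] == OP_EQUALVERIFY
        and script[24] == OP_CHECKSIG
    )
    return script[3:23] if matches else None

pubkey_hash = p2pkh_pubkey_hash(script_pubkey)
print(pubkey_hash.hex())
```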

We will be left with:

30601c0c060000001976a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac00000000

We take the next 8 bytes to be the value of the second output (remember that the output count is 2), so we have 30601c0c06000000, and we do the endian swapping to get 000000060c1c6030. We convert that to decimal to get 25972990000 satoshis, which equals 259.7299 BTC.

Similarly, the 19 that follows is the scriptPubKey size, followed by the scriptPubKey itself:

1976a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac

We will be left with

00000000

That is called the locktime. It is mostly used to postpone a transaction: if it is set, the transaction cannot be added to a block before the given block height or the given time. The default is no lock time set (all zeros, as in our example).
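How a node tells a block height from a time can be sketched as follows. Values below 500,000,000 are read as a block height and values at or above it as a Unix timestamp; the function name and the wording of the messages are ours.

```python
LOCKTIME_THRESHOLD = 500_000_000  # below: block height, at or above: Unix time

def describe_locktime(lock_time):
    """Explain in words what a lock time value means."""
    if lock_time == 0:
        return "no lock time"
    if lock_time < LOCKTIME_THRESHOLD:
        return "not valid before block height {}".format(lock_time)
    return "not valid before Unix time {}".format(lock_time)

# The last 4 bytes of our example transaction.
lock_time = int.from_bytes(bytes.fromhex("00000000"), "little")
print(describe_locktime(lock_time))
```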

Clearly the transaction hash depends on all of this information, including the scripts. Every input carries an unlocking script (scriptSig) that holds the signature, and every output carries a locking script (scriptPubKey). With SegWit the idea is to move the signature data of the inputs out of the part of the transaction that gets hashed; this means that the transaction hash no longer depends on the signatures.

What do you get by doing this?

Changing the structure of the block means that the way a block is measured can change too. So instead of measuring a block by its size in bytes and limiting it that way, a new unit was introduced, called block weight.

The maximum block weight is 4 million weight units, often quoted as 4MB (don’t let the 4MB confuse you). Each byte of regular transaction data weighs 4 units, while a byte in the witness field weighs 1 unit. This is basically saying that the old block size of 1MB is multiplied by 4 to get the maximum weight. For every byte in the transaction you multiply by 4, but for the witness data you multiply by 1, so a witness byte costs ¾ (75%) less than a regular transaction byte. The reason for this design, rather than simply raising the size limit, is to maintain backwards compatibility (resulting in a soft fork vs. what would otherwise have required a hard fork). A SegWit node understands the witness field of a block, while an older node that does not support SegWit can still accept the block, but it sees it without the witness data.
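The weight arithmetic can be sketched in a few lines. The 259 bytes are the size of the example transaction in this post. The second calculation assumes, purely for illustration, that its 140 bytes of scriptSig moved into the witness field unchanged; a real SegWit transaction also carries a couple of extra marker bytes and uses different script templates, so treat these numbers as a rough sketch.

```python
MAX_BLOCK_WEIGHT = 4_000_000  # weight units

def tx_weight(non_witness_bytes, witness_bytes):
    """Regular bytes count 4 units each, witness bytes count 1 unit each."""
    return 4 * non_witness_bytes + 1 * witness_bytes

# Our example transaction as it is: everything counts as regular data.
legacy = tx_weight(non_witness_bytes=259, witness_bytes=0)

# Hypothetical variant with the 140 signature bytes moved to the witness.
segwit_like = tx_weight(non_witness_bytes=259 - 140, witness_bytes=140)

print(legacy, segwit_like)
# Roughly how many of each would fit under the block weight limit.
print(MAX_BLOCK_WEIGHT // legacy, MAX_BLOCK_WEIGHT // segwit_like)
```

Note that a block filled with 1MB of regular data and no witness data weighs exactly the maximum, which is why older nodes still see blocks that respect their 1MB limit.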

That is not all that SegWit did. It also fixed a “bug” known as the malleability bug: OpenSSL did not enforce a strict form of the encoding used for signatures, called DER-encoded ASN.1 octet representation. OpenSSL tolerates small variations in the encoding, which basically means a node or miner can relay the same transaction with a re-encoded, but still valid, signature. Since the signature bytes are different, the transaction hash is different too; the original sender may keep waiting forever for a confirmation of the hash of the first, unchanged transaction, while the transaction has actually been added under a different hash.

What should be noted here is that the transaction’s outputs are not changed; the transaction is still executed, but under a different hash. No changes can happen to the sender, amount or receiver; the amount sent is delivered to the receiver, and the only change is in the transaction hash. This means your coins are spent to the intended address, but the confirmation appears under a different transaction id. Now with SegWit the scripts (signatures) are separated from the transaction data, so the transaction hash is not affected by the signature encoding or by any other change that could happen to the signatures. More details can be found here: Bitcoin Improvement Proposal (BIP 141).
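To make the dependency concrete, here is a sketch that hashes the example transaction from earlier in this post, then changes a single bit inside the scriptSig and hashes it again. It assumes the pre-SegWit rule that the transaction id is the double SHA-256 of the whole serialized transaction, shown in reversed byte order. The bit flip is only an illustration: it would not produce a valid signature, whereas a real malleation re-encodes the signature so that it stays valid.

```python
import hashlib

raw_tx = bytes.fromhex(
    "01000000"                                                          # version
    "01"                                                                # In counter
    "4dff4050dcee16672e48d755c6dd25d324492b5ea306f85a3ab23b4df26e16e9"  # Prev Tx
    "00000000"                                                          # Prev Txout Index
    "8c"                                                                # TxIn script len
    "493046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f82bc2b80a909"
    "022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75286036df8bfabc3ea501"
    "4104b70574006425b61867d2cbb8de7c26095fbc00ba4041b061cf75b85699cb2b44"
    "9c6758741f640adffa356406632610efb267cb1efa0442c207059dd7fd652eea"
    "ffffffff"                                                          # Sequence No
    "02"                                                                # Out counter
    "0049d97102000000" "19" "76a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac"
    "30601c0c06000000" "19" "76a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac"
    "00000000"                                                          # Lock time
)

def txid(serialized_tx):
    """Double SHA-256 of the serialized transaction, byte-reversed, as hex."""
    digest = hashlib.sha256(hashlib.sha256(serialized_tx).digest()).digest()
    return digest[::-1].hex()

# The scriptSig starts after version, In counter, Prev Tx, index and script len.
SCRIPT_SIG_START = 4 + 1 + 32 + 4 + 1
SCRIPT_SIG_END = SCRIPT_SIG_START + 0x8C

altered = bytearray(raw_tx)
altered[SCRIPT_SIG_START + 10] ^= 0x01  # flip one bit inside the scriptSig only
altered_tx = bytes(altered)

print(txid(raw_tx))
print(txid(altered_tx))
# Sequence number, outputs and lock time are byte-for-byte identical.
print(raw_tx[SCRIPT_SIG_END:] == altered_tx[SCRIPT_SIG_END:])
```

The two printed ids differ even though nothing about who pays whom has changed, which is exactly the situation SegWit avoids by keeping the signatures out of the hashed data.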