Bitcoin's scalability is one of its main problems and a focus of active development. One proposed solution is the Lightning Network, but its implementation has been held back by several vulnerabilities. Another solution, Segregated Witness, also aims to improve scalability, and it solves a number of other problems as well, including the vulnerability that stands in the way of the Lightning Network. In this article, we look at the advantages of Segregated Witness and describe how it works.
Segregated Witness, or SegWit, is a soft fork described in a series of BIPs (141, 142, 143, 144 and 145). Its main aim is to optimise the structure of transactions and blocks by moving signatures (the ‘scriptSig’, also called the ‘witness’ or ‘unlocking script’) out of the transaction into a separate structure. This not only reduces the size of transactions, leaving more room in blocks, but also solves the problem of transaction malleability (the vulnerability mentioned above), which is crucial for technologies such as payment channels and Lightning that are built on the bitcoin transaction structure.
How it works
Before we begin
To start with, let us briefly recall how the bitcoin payment system works. It has nothing like the list of balances that a bank would keep. Instead, the balance of every address is represented by the set of transactions sent to that address, where a transaction is a structure whose principal parts are inputs and outputs. The inputs are the transactions we refer to when spending funds (more precisely, they are not whole transactions but specific outputs of those transactions, because one transaction may transfer funds to several addresses), while the outputs are the addresses to which we want to transfer the funds. This is how the structure of a bitcoin transaction looks:
The PubKey Script field (hereinafter scriptPubKey) in the outputs is known as the locking script. It ensures that only the owner of the recipient address can spend this output. The Signature Script field (hereinafter scriptSig) is also called the unlocking script because it “unlocks” the locking script by providing proof of address ownership.
More details about the transactions and about the functions of the locking script and of the unlocking script are available here.
In fact, Segregated Witness changes not only the transaction structure but also its outputs. This does not mean that traditional UTXOs (unspent transaction outputs) and SegWit UTXOs cannot be spent in the same transaction: in that case, a traditional UTXO expects its proof inside the input (the scriptSig field), while a SegWit UTXO expects it outside.
Because Segregated Witness is a soft fork, the upgrade can be ignored, so older systems must still be able to process SegWit outputs. Old nodes and wallets see such outputs as spendable by anyone, meaning that they can be spent with an empty signature and still be considered valid. Updated nodes and wallets instead look for signatures outside the inputs, in the special ‘witness’ field.
Let us now look at some example transactions and at how Segregated Witness changes them. We will start with a standard Pay-to-Public-Key-Hash (P2PKH) transaction.
We are interested in the outputs, especially in their “scriptPubKey” fields. Let us consider a typical locking script:
OP_DUP OP_HASH160 <PubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
That’s how it will look with the Segregated Witness:
As you can see, the SegWit output is much simpler than the traditional one: it consists of two values that are pushed onto the script execution stack. As mentioned above, old versions of the bitcoin client see this output as spendable by anyone because it does not require a signature. A new client, however, interprets the first value as the version number and the second as the counterpart of a locking script (the witness program). In practice, the hash of the compressed public key must be used here; we return to this point a little later.
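To make the comparison concrete, the following minimal sketch encodes both kinds of locking script as raw bytes. The opcode values are the standard Bitcoin Script ones; the helper names (`push`, `p2pkh_script`, `p2wpkh_script`) are made up for illustration, and the 20-byte hash is simply the example value used later in this article.

```python
# Standard Bitcoin Script opcode values.
OP_0, OP_DUP, OP_HASH160 = 0x00, 0x76, 0xA9
OP_EQUALVERIFY, OP_CHECKSIG = 0x88, 0xAC


def push(data: bytes) -> bytes:
    """Direct push of 1..75 bytes: one length byte followed by the data."""
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles short direct pushes")
    return bytes([len(data)]) + data


def p2pkh_script(pubkey_hash: bytes) -> bytes:
    """OP_DUP OP_HASH160 <PubKeyHash> OP_EQUALVERIFY OP_CHECKSIG"""
    return (bytes([OP_DUP, OP_HASH160]) + push(pubkey_hash)
            + bytes([OP_EQUALVERIFY, OP_CHECKSIG]))


def p2wpkh_script(pubkey_hash: bytes) -> bytes:
    """Version 0 followed by the 20-byte witness program."""
    return bytes([OP_0]) + push(pubkey_hash)


# Example hash value taken from later in the article.
pubkey_hash = bytes.fromhex("ab68025513c3dbd2f7b92a94e0581f5d50f654e7")
legacy = p2pkh_script(pubkey_hash)
segwit = p2wpkh_script(pubkey_hash)
print(legacy.hex(), len(legacy))
print(segwit.hex(), len(segwit))
```

The SegWit script contains no opcodes other than the version push, which is why an old client sees nothing in it that demands a signature.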
Now let us consider a transaction in which this output will be spent. That is how it would have looked in a traditional case:
"Vin" : [
"scriptSig": "<our scriptSig>"
However, to spend a SegWit output, the transaction should have an empty scriptSig field and contain all the signatures in a separate place:
"Vin" : [
"witness": "<Witness data>"
While traditional clients can process SegWit transactions (recall that they see SegWit outputs as spendable by anyone), they cannot spend those outputs because they simply do not know how: an old wallet would try to spend a SegWit output with an empty signature, but such a transaction is not actually valid (nodes that have adopted Segregated Witness will reject it). This means that the sender has to know whether the recipient's wallet supports Segregated Witness in order to create outputs of the right type.
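The two points of view can be sketched in a few lines of Python. This is a simplified model rather than consensus code: scripts are represented as lists of pushed byte strings, both function names are hypothetical, and the upgraded-node check only covers the structure of a P2WPKH spend (the comparison of the public key hash and the signature check itself are omitted).

```python
def old_node_accepts(script_sig, script_pubkey):
    """Pre-SegWit view: push every item, succeed if the top item is non-zero.

    Simplification: only push-only scripts are modelled, and the
    'negative zero' encoding is not treated specially.
    """
    stack = list(script_sig) + list(script_pubkey)
    return bool(stack) and any(stack[-1])


def upgraded_node_structure_ok(script_sig, script_pubkey, witness):
    """SegWit view of a P2WPKH spend, structure only.

    The scriptSig must be empty and the witness must hold exactly
    two items: a signature and a public key.
    """
    version, program = script_pubkey
    if version != b"" or len(program) != 20:
        raise ValueError("this sketch only handles version-0 P2WPKH outputs")
    return len(script_sig) == 0 and len(witness) == 2


# OP_0 pushes an empty byte string, followed by the 20-byte hash.
pubkey_hash = bytes.fromhex("ab68025513c3dbd2f7b92a94e0581f5d50f654e7")
segwit_output = [b"", pubkey_hash]

# Placeholder witness items, not a real signature or key.
fake_sig, fake_pubkey = b"\x30" * 71, b"\x02" * 33

print(old_node_accepts([], segwit_output))
print(upgraded_node_structure_ok([], segwit_output, []))
print(upgraded_node_structure_ok([], segwit_output, [fake_sig, fake_pubkey]))
```

An empty-signature spend passes the old rule because the hash left on top of the stack is non-zero, but it fails the upgraded rule because the witness is missing.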
Starting from BIP 143, outputs should be created with the hash of the compressed public key. If you create an output using a traditional address or an uncompressed public key, that output will be unspendable.
Another crucial type of transaction is P2SH. It makes it possible to send funds to a script hash instead of a public key hash (the ordinary bitcoin address). To spend the output of a P2SH transaction, the recipient must provide a script (called the redeem script) whose hash matches the script hash to which the funds were sent, together with the signatures, passwords or whatever else the script requires. This approach makes it possible to send bitcoins to an address protected by a method completely unknown to the sender, and it saves a lot of space: in the case of a multisignature wallet, for instance, the locking script would be very long if it had to hold all the “locks” in full instead of just their hash.
Let us consider an example: a multisignature wallet that requires two signatures out of five. In the traditional approach, the locking script of the P2SH transaction output looks like this:
HASH160 54c557e07dde5bb6cb791c7a540e0a4796f5e97e EQUAL
To spend it, the recipient must provide a redeem script that defines the condition (multisignature, two out of five) together with two signatures, and all of this must be placed in the input of the transaction:
"Vin" : [
"scriptSig": "<SigA> <SigB> <2 PubA PubB PubC PubD PubE 5 CHECKMULTISIG>",
Now let us see how it all would look if both the sender and the recipient used the Segregated Witness. The locking script of the output:
Again, as in the case of the P2PKH transaction, the output script is much simpler. The first value is the version number, and the second is the 32-byte SHA256 hash of the redeem script (the witness program). The different hash lengths make it possible to distinguish a P2WPKH witness program from a P2WSH witness program: 20 bytes of RIPEMD160(SHA256(public key)) versus 32 bytes of SHA256(script).
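As a rough illustration, the sketch below builds a two-of-five multisig script, derives the P2WSH locking script from it, and tells the two version-0 program types apart by length. The five “public keys” are placeholder byte strings rather than real keys, and the helper names are hypothetical.

```python
import hashlib

# Standard Bitcoin Script opcode values.
OP_0, OP_2, OP_5, OP_CHECKMULTISIG = 0x00, 0x52, 0x55, 0xAE


def push(data: bytes) -> bytes:
    """Direct push of 1..75 bytes: one length byte followed by the data."""
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles short direct pushes")
    return bytes([len(data)]) + data


# Five placeholder 33-byte values standing in for compressed public keys.
pubkeys = [bytes([0x02]) + bytes([i]) * 32 for i in range(1, 6)]

# 2 PubA PubB PubC PubD PubE 5 CHECKMULTISIG
multisig_script = (bytes([OP_2]) + b"".join(push(pk) for pk in pubkeys)
                   + bytes([OP_5, OP_CHECKMULTISIG]))

# P2WSH locking script: version 0 followed by SHA256(script).
script_hash = hashlib.sha256(multisig_script).digest()
p2wsh_script = bytes([OP_0]) + push(script_hash)


def witness_program_type(script_pubkey: bytes) -> str:
    """Classify a version-0 witness program by the length of its hash."""
    if len(script_pubkey) < 2 or script_pubkey[0] != OP_0:
        return "not a version-0 witness program"
    program = script_pubkey[2:]
    if script_pubkey[1] != len(program):
        return "not a version-0 witness program"
    return {20: "P2WPKH", 32: "P2WSH"}.get(len(program), "invalid length")


print(len(multisig_script), len(p2wsh_script))
print(witness_program_type(p2wsh_script))
```

However long the multisig script becomes, the locking script stays at 34 bytes, because only the hash is stored in the output.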
A transaction using such an output:
"Vin" : [
"witness": "<SigA> <SigB> <2 PubA PubB PubC PubD PubE 5 CHECKMULTISIG>"
Embedding Segregated Witness inside P2SH
We have seen that Segregated Witness has its advantages. However, in the cases described above both the sender and the recipient must run updated bitcoin software, which is not always possible. Consider, for instance, the following situation:
Alice wants to send bitcoins to Bob, but she does not have a SegWit wallet while Bob does. They can certainly use a standard transaction, but Bob wants to use SegWit to reduce the transaction fee.
In such a case, Bob may create a P2SH address containing a SegWit script. Alice will see it as an ordinary P2SH address and will be able to transfer funds to it without any problem, while Bob will be able to spend this output with a SegWit transaction and pay a lower fee (the new fee calculation for SegWit transactions is described below).
That is how both types of SegWit transactions, P2WSH and P2WPKH, can be implemented within the P2SH.
To use a P2WPKH transaction within a P2SH, Bob should create a witness program out of his public key. Then the result should be hashed and transformed into an address:
Creating a witness program:
As always, the first value is the version number and the second is the 20-byte public key hash. The resulting script is then hashed with SHA256 and then with RIPEMD160, producing a new 20-byte hash.
HASH160 from the P2WPKH witness program:
Transformed into the address:
The locking script of an output sent to this address looks like the script for an ordinary P2SH address:
HASH160 3e0547268b3b19288b3adef9719ec8659f4b2b0b EQUAL
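The step from the 20-byte script hash to a P2SH address is Base58Check encoding with the mainnet P2SH version byte 0x05. A minimal sketch, using only the standard library and hypothetical helper names, could look like this:

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def sha256d(data: bytes) -> bytes:
    """Double SHA256, used here for the 4-byte checksum."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58check_encode(version: int, payload: bytes) -> str:
    data = bytes([version]) + payload
    data += sha256d(data)[:4]  # append the checksum
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def base58check_decode(address: str):
    """Return (version, payload); raise ValueError on a bad checksum."""
    n = 0
    for ch in address:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(address) - len(address.lstrip("1"))
    data = b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")
    body, checksum = data[:-4], data[-4:]
    if sha256d(body)[:4] != checksum:
        raise ValueError("checksum mismatch")
    return body[0], body[1:]


# Script hash from the locking script above.
script_hash = bytes.fromhex("3e0547268b3b19288b3adef9719ec8659f4b2b0b")
address = base58check_encode(0x05, script_hash)
print(address)
```

Because of the 0x05 version byte, the resulting address starts with “3”, so to Alice's wallet it is indistinguishable from any other P2SH address.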
Let us now see how Bob can spend this output:
"Vin" : [
"scriptSig": "0 ab68025513c3dbd2f7b92a94e0581f5d50f654e7"
"witness": "<Witness data>"
First, the redeem script we provided (in our case, the witness program) is hashed; if the result matches the hash in the locking script, the redeem script is executed and the signatures in the witness field are verified.
In the same way, any P2WSH script can be implemented within a P2SH. Let us consider the multisig script two out of five, analyzed above. All the steps will be practically identical to the case of P2SH(P2WPKH):
First of all, once again, we create a witness program:
The first value is the version number; the second is the 32-byte SHA256 hash of our multisignature script. Then the steps are repeated: we take the HASH160 of the witness program and transform it into an ordinary P2SH address. To spend an output sent to this address, we once again place the witness program in the scriptSig, and the signatures together with the whole multisig script in the witness field.
Segregated Witness benefits
Now, having sorted out the technical part, let us examine the principal advantages of the Segregated Witness.
One of the crucial problems solved by SegWit is the malleability of transactions or, more accurately, of their IDs, which are hashes. Let us look at this in more detail.
In the traditional case, the signatures, which sit in the inputs inside the transaction, can be changed by a third party without being invalidated. This makes it possible to change the ID of the transaction, which is its hash, without changing any ‘fundamental’ fields such as the inputs, the outputs or the amounts. The transaction is still valid but has a different ID, which enables various kinds of attacks, for instance denial of service.
SegWit solves this problem because all signatures are kept outside the part of the transaction that is hashed to produce the ID, so changing them does not affect the transaction ID in any way. A separate identifier, named wtxid, is introduced as well: it hashes the transaction together with its witness data, so if a transaction carries no witness data, its txid is equal to its wtxid.
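The relationship between the two identifiers can be sketched as follows. The serialization follows the layout described in BIP 144 (marker and flag bytes after the version, witness stacks before the locktime), but the transaction itself is made up: the previous transaction hash, scripts, signature and key are placeholder bytes, and the dictionary layout is an assumption of this sketch.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def varint(n: int) -> bytes:
    """Bitcoin's variable-length integer encoding."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def serialize(tx: dict, with_witness: bool) -> bytes:
    has_witness = with_witness and any(tx["witnesses"])
    out = struct.pack("<i", tx["version"])
    if has_witness:
        out += b"\x00\x01"  # marker and flag
    out += varint(len(tx["inputs"]))
    for prev_hash, index, script_sig, sequence in tx["inputs"]:
        out += prev_hash + struct.pack("<I", index)
        out += varint(len(script_sig)) + script_sig
        out += struct.pack("<I", sequence)
    out += varint(len(tx["outputs"]))
    for value, script_pubkey in tx["outputs"]:
        out += struct.pack("<q", value)
        out += varint(len(script_pubkey)) + script_pubkey
    if has_witness:
        for stack in tx["witnesses"]:  # one witness stack per input
            out += varint(len(stack))
            for item in stack:
                out += varint(len(item)) + item
    return out + struct.pack("<I", tx["locktime"])


def txid(tx: dict) -> str:
    return sha256d(serialize(tx, with_witness=False))[::-1].hex()


def wtxid(tx: dict) -> str:
    return sha256d(serialize(tx, with_witness=True))[::-1].hex()


# Placeholder transaction: one SegWit input (empty scriptSig), one output.
script_pubkey = bytes.fromhex("0014ab68025513c3dbd2f7b92a94e0581f5d50f654e7")
tx = {
    "version": 1,
    "inputs": [(b"\x11" * 32, 0, b"", 0xFFFFFFFF)],
    "outputs": [(50_000, script_pubkey)],
    "witnesses": [[b"\x30" * 71, b"\x02" * 33]],
    "locktime": 0,
}
# The same transaction with a different (placeholder) signature encoding.
mutated = dict(tx, witnesses=[[b"\x31" * 71, b"\x02" * 33]])
# The same transaction with no witness data at all.
no_witness = dict(tx, witnesses=[[]])

print(txid(tx) == txid(mutated))
print(wtxid(tx) == wtxid(mutated))
print(txid(no_witness) == wtxid(no_witness))
```

Changing the witness changes the wtxid but leaves the txid untouched, which is the property that protocols chaining unconfirmed transactions rely on.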
With this problem solved, chains of unconfirmed transactions can be created without this risk, which is very important for protocols such as the Lightning Network.
Network and Storage Scaling
Witness data often accounts for the largest part of a transaction. In scripts such as multisig, it may take up as much as 75% of the space used by a transaction. With SegWit, transferring signatures becomes optional: a node requests them only if it intends to validate the transaction. SPV (simple payment verification) clients and nodes that do not support SegWit therefore do not have to download the additional data, which saves disk space.
Block size increase and lower transaction fees
SegWit transactions are cheaper than traditional ones because witness data is counted at a discount. More precisely, the very notion of ‘size’ has changed for SegWit transactions. Instead of the ordinary size, the notion of ‘virtual size’ has been introduced: all data stored in the witness is counted with a coefficient of 0.25, which allows more transactions to fit in a block. Let us consider an example.
Suppose we have a traditional transaction of 200 bytes. A 1 MB block has room for 5,000 such transactions. Now take its SegWit equivalent, in which the witness data takes up about 120 bytes. Its virtual size is vsize = 80 + 0.25 · 120 = 110 bytes, so one block may contain 9,090 such transactions. With a fee of 40 satoshis per byte, the total comes to 8,000 satoshis for the first transaction and 4,400 for the second, a little over half the cost.
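The same arithmetic can be written out using the weight formula from BIP 141, where weight is three times the base size plus the total size and the virtual size is the weight divided by four, rounded up. The function names and the simplification of ignoring block header overhead are assumptions of this sketch.

```python
import math

MAX_BLOCK_WEIGHT = 4_000_000  # BIP 141 block weight limit
FEE_RATE = 40                 # satoshis per (virtual) byte, from the example


def weight(base_size: int, witness_size: int) -> int:
    """BIP 141 weight: base_size * 3 + total_size."""
    total_size = base_size + witness_size
    return base_size * 3 + total_size


def vsize(base_size: int, witness_size: int) -> int:
    """Virtual size: weight / 4, rounded up."""
    return math.ceil(weight(base_size, witness_size) / 4)


# Traditional transaction: 200 bytes, no witness data.
legacy_vsize = vsize(200, 0)
# SegWit equivalent: 80 bytes of base data and 120 bytes of witness data.
segwit_vsize = vsize(80, 120)

# Transactions per block, ignoring the block header and coinbase.
legacy_per_block = MAX_BLOCK_WEIGHT // weight(200, 0)
segwit_per_block = MAX_BLOCK_WEIGHT // weight(80, 120)

print(legacy_vsize, segwit_vsize)
print(legacy_per_block, segwit_per_block)
print(legacy_vsize * FEE_RATE, segwit_vsize * FEE_RATE)
```

Counting in weight units gives the same figures as the 0.25 coefficient, since a witness byte weighs one unit and every other byte weighs four.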
As you may have noticed, every SegWit locking script begins with one byte that gives the script version. Using different versions makes it possible to introduce additions and changes (syntax changes, new operators and so on) as soft forks.
Signature Verification Optimization
Segregated Witness also optimises the signature-checking operations (CHECKSIG, CHECKMULTISIG etc.). Before SegWit, the number of hash computations grew in proportion to the square of the number of signatures, whereas the update reduces the complexity of the algorithm to O(n).
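A back-of-the-envelope model shows where the difference comes from. In the traditional scheme every signature check hashes a copy of almost the whole transaction, while the BIP 143 scheme hashes the transaction-wide parts once and then a fixed-size message per signature. The byte counts below are rough assumptions chosen for illustration, not measured values.

```python
# Rough, assumed sizes in bytes (illustrative only).
INPUT_BYTES = 150          # one serialized input in the traditional scheme
OUTPUTS_BYTES = 34         # all outputs of the transaction
OVERHEAD_BYTES = 10        # version, counts, locktime
SHARED_PER_INPUT = 40      # outpoint (36) + sequence (4), hashed once overall
PREIMAGE_BYTES = 182       # fixed-size message hashed per signature


def legacy_hashed_bytes(n_inputs: int) -> int:
    """Each of the n signatures hashes roughly the whole transaction."""
    tx_size = n_inputs * INPUT_BYTES + OUTPUTS_BYTES + OVERHEAD_BYTES
    return n_inputs * tx_size


def segwit_hashed_bytes(n_inputs: int) -> int:
    """Shared parts are hashed once, plus a fixed-size message per signature."""
    shared = n_inputs * SHARED_PER_INPUT + OUTPUTS_BYTES
    return shared + n_inputs * PREIMAGE_BYTES


for n in (10, 100, 1000, 2000):
    print(n, legacy_hashed_bytes(n), segwit_hashed_bytes(n))
```

In this model, doubling the number of inputs roughly quadruples the bytes hashed in the traditional scheme but only doubles them in the SegWit scheme.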
So what is the problem?
If everything is so great, what is the problem? The update has many opponents in the bitcoin community because, despite its evident advantages, it has weak points. Let us consider several arguments against it.
- Because SegWit is a soft fork, many clients will not be updated, so two types of UTXO will coexist in the network. Important changes such as the fix for transaction ID malleability and hashing time that grows linearly will not apply to non-SegWit outputs, so the network will remain vulnerable to attacks based on transaction ID malleability and to quadratically growing hashing time.
- SegWit may decrease the security of the network. The number of nodes that perform full validation will decrease significantly, because only those that have adopted SegWit can verify the witness part of transactions.
- SegWit cannot be repealed. If it is repealed and all changes are rolled back, all SegWit outputs will become spendable by anyone.
- SegWit tries to solve all the problems at once and, as a result, changes an enormous amount of code. This complicates future work and increases the likelihood of bugs that will be harder to get rid of.
While it is quite probable that some of the issues solved by SegWit have more elegant solutions, we still believe that at present it is an excellent way to increase the scalability of the network and to enable technologies such as the Lightning Network, which we will analyse in detail in the following articles.