Bitcoin transaction internals explained

By l2xl, published in Coinmonks, Apr 6, 2022

Picture by my daughter, Polina ©

Introduction

From the engineering point of view, all transactions initially had the same single structure. All Bitcoin transactions have inputs and outputs. The inputs refer to the outputs of other transactions. The outputs manage the distribution of funds. That’s it. Funds are assigned to one or more outputs of every transaction, and the inputs of a later transaction spend the funds of a previous transaction. If an output has no corresponding input in another transaction that spends its funds, then it is an Unspent Transaction Output (UTXO). The UTXO is the cornerstone of the so-called UTXO-based electronic currency blockchain protocol. When a new transaction is created, its inputs always correspond to unspent outputs of previous transactions. One input corresponds to exactly one output of a previous transaction. The sum of the amounts referred to by the transaction’s inputs must not be less than the sum of the transaction’s outputs; in practice it is slightly greater. The difference between the two sums forms the transaction fee for the miner who wins the block. If Bitcoin accepts the new transaction, the corresponding outputs of the previous transactions become spent, and the funds now correspond to the new UTXOs. Therefore any assets accounted for by the consensus protocol always correspond to the UTXO set, which Bitcoin nodes track and enumerate.
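The fee rule can be made concrete with a small sketch. The `Output` type and all amounts here are made up for illustration; they do not describe a real transaction or any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    """A simplified, hypothetical transaction output: just an amount in satoshis."""
    value_sat: int


def transaction_fee(spent_outputs, new_outputs):
    """Fee = sum of the outputs being spent minus the sum of the newly created outputs."""
    total_in = sum(o.value_sat for o in spent_outputs)
    total_out = sum(o.value_sat for o in new_outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs: the transaction is invalid")
    return total_in - total_out


# Two UTXOs are spent; one payment output and one change output are created.
spent = [Output(60_000), Output(45_000)]
created = [Output(100_000), Output(4_500)]
fee = transaction_fee(spent, created)
```

Here the inputs reference 105,000 satoshis and the outputs assign 104,500, so the remaining 500 satoshis go to the miner.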

In order to accept a transaction and assign funds to new outputs, Bitcoin needs to execute a script. This script is split between the output of the previous transaction and the input of the next transaction. Usually, the two parts are named scriptPubKey and scriptSig (or witness). That is the trick: an output of a previous transaction contains the first part of the script, which holds the values needed to authenticate the spender, and an input of the next transaction contains the second part. Bitcoin accepts a transaction only if the second part matches the first part and the combined script executes successfully. In simple words, scriptPubKey contains the part of the script with public keys (and maybe some logic), and scriptSig contains the signature(s), which are verified against the public key(s). In fact, scriptPubKey and scriptSig may contain no keys or signatures at all, and more often hold more complex data such as public key hashes, secret hashes, and their corresponding preimages.
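As a rough illustration of how the two script parts meet on one stack, here is a toy interpreter for a hash-locked output, where the spender must reveal a secret preimage. The op-code names are real Bitcoin Script op-codes, but the token-list representation and the functions `run` and `spend_is_valid` are simplifications invented for this sketch; a real node works on serialized bytes and supports many more op-codes.

```python
import hashlib


def run(script, stack):
    """Execute a list of tokens: bytes are pushed, strings are op-codes."""
    for token in script:
        if isinstance(token, bytes):
            stack.append(token)
        elif token == "OP_DUP":
            stack.append(stack[-1])
        elif token == "OP_SHA256":
            stack.append(hashlib.sha256(stack.pop()).digest())
        elif token == "OP_EQUAL":
            stack.append(b"\x01" if stack.pop() == stack.pop() else b"")
        elif token == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False
        else:
            raise ValueError(f"unsupported op-code: {token}")
    return True


def spend_is_valid(script_sig, script_pubkey):
    """Run scriptSig, then scriptPubKey, on the same stack.

    Simplification: the spend succeeds if the top stack element is non-empty.
    """
    stack = []
    if not run(script_sig, stack) or not run(script_pubkey, stack):
        return False
    return bool(stack) and stack[-1] != b""


secret = b"correct horse"
# The output commits to the hash of the secret.
script_pubkey = ["OP_SHA256", hashlib.sha256(secret).digest(), "OP_EQUAL"]
# The spending input supplies the preimage.
ok = spend_is_valid([secret], script_pubkey)
bad = spend_is_valid([b"wrong guess"], script_pubkey)
```

The output only stores a hash, and the input that spends it supplies the matching preimage, which is the same split as between a public key hash and the public key in the formats below.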

Now let’s look at the most popular transaction types, which can be constructed in the described way.

Legacy formats

P2PKH — Pay to Public Key Hash

Pay to Public Key Hash is the most popular type of transaction in Bitcoin, where scriptPubKey contains instructions for verifying the hash of a public key. Such an output can be spent by providing a public key that corresponds to the hash and a signature that matches the public key:

scriptPubKey:
OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
scriptSig:
<sig> <pubKey>

The P2PKH transaction supersedes the “Pay to Public Key” transaction, where the public key in scriptPubKey was not hidden behind a hash:

scriptPubKey:
<pubKey> OP_CHECKSIG
scriptSig:
<sig>

Bitcoin wallets support Pay to Public Key Hash transactions using a standard Bitcoin address. The address is the 160-bit public key hash encoded with Base58Check, which results in the prefix “1”. This kind of transaction was the common way to move funds from one hand to another. Such transactions and addresses remain valid, but have largely given way to SegWit transactions and addresses.
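The serialization of the P2PKH scriptPubKey and of the address can be sketched in a few lines. This is a minimal illustration that starts from an already computed 20-byte public key hash; the function names are made up for this example, and the mainnet version byte 0x00 is assumed.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(payload: bytes) -> str:
    """Append a 4-byte double-SHA256 checksum and encode the result in Base58."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    data = payload + checksum
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = BASE58_ALPHABET[rem] + encoded
    # Every leading zero byte is represented by a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def p2pkh_script_pubkey(pubkey_hash: bytes) -> bytes:
    """OP_DUP OP_HASH160 <20-byte push> OP_EQUALVERIFY OP_CHECKSIG."""
    if len(pubkey_hash) != 20:
        raise ValueError("a public key hash is 20 bytes long")
    return bytes([0x76, 0xA9, 0x14]) + pubkey_hash + bytes([0x88, 0xAC])


def p2pkh_address(pubkey_hash: bytes) -> str:
    """Mainnet P2PKH address: version byte 0x00 followed by the hash."""
    if len(pubkey_hash) != 20:
        raise ValueError("a public key hash is 20 bytes long")
    return base58check(b"\x00" + pubkey_hash)


# An all-zero hash is used purely as a placeholder value.
example_script = p2pkh_script_pubkey(bytes(20))
example_address = p2pkh_address(bytes(20))
```

The version byte 0x00 is what produces the leading “1” of the address.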

Other useful transaction types

Using Bitcoin Script op-codes, as in P2PKH, it is possible to construct more complex logic to control Bitcoin transfers.

For example, it is possible to create a MultiSig output, which needs two or more signatures to spend the funds. It is even possible to use threshold multi-signature logic like 2-of-3 signatures, where the required number of signatures comes first and the total number of public keys comes last:

scriptPubKey:
OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG
scriptSig:
0 <sig1> <sig2>

Another widely used example is a non-spendable output using the OP_RETURN op-code. A transaction with such a scriptPubKey creates a provably unspendable output, which Bitcoin nodes and wallets do not add to the UTXO set. It is used to store arbitrary data or to burn bitcoin. Such an output can never have a corresponding scriptSig, since it can never be spent.

scriptPubKey:
OP_RETURN <data>
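For illustration, such a scriptPubKey could be serialized as follows. This is a sketch that only handles direct pushes of 1 to 75 bytes; the function name is made up for this example.

```python
def op_return_script(data: bytes) -> bytes:
    """Serialize OP_RETURN <data> with a direct push (1..75 bytes only)."""
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles direct pushes of 1..75 bytes")
    # 0x6a is OP_RETURN; the next byte is the push length.
    return bytes([0x6A, len(data)]) + data


script = op_return_script(b"hello")
```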

P2SH — Pay to Script Hash

Although Bitcoin allows complex smart contracts to be created in exactly this way, placing such scripts directly in scriptPubKey has largely given way to Pay to Script Hash (P2SH) transactions, which keep the script private until it is spent. A Bitcoin node recognizes this type of transaction as a distinct form, in contrast to a common scriptPubKey/scriptSig pair, and executes it in two phases:

  1. It executes the common scriptPubKey with scriptSig, which checks the script hash provided in scriptPubKey against the hash of the redeem script provided as the last scriptSig element;
  2. If these match, the redeem script is executed with the remaining scriptSig elements as its parameters.

scriptPubKey:
OP_HASH160 <scriptHash> OP_EQUAL
scriptSig:
<scriptParam>… <redeemScript>

For example, the same MultiSig as above looks like this:

scriptPubKey:
OP_HASH160 <multisig_scriptHash> OP_EQUAL
scriptSig:
0 <sig1> <sig2>
<OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG>

Bitcoin wallets support this class of transactions using an address similar to the P2PKH one: the 160-bit hash of the script encoded with Base58Check, which results in the prefix “3”.

SegWit v0 — Segregated Witness

SegWit is a Bitcoin soft fork activated in 2017. It is intended in particular to overcome the block size problem and other problems like transaction malleability, and to decrease transaction fees. The idea of segregated witness is to separate the witness data, or simply the Witness (the signatures and scripts formerly carried in scriptSig), from the common Bitcoin block data.

Since SegWit is a soft fork, legacy pre-SegWit transactions and SegWit transactions may co-exist in Bitcoin simultaneously. There are two ways to tell a Bitcoin node to apply SegWit logic when validating a particular transaction: a witness program placed directly in scriptPubKey (“native”), or a witness program nested in a P2SH redeem script.

SegWit version 0 defines two “native” transaction types:

  • Pay-to-Witness-Public-Key-Hash (P2WPKH)
  • Pay-to-Witness-Script-Hash (P2WSH)

These transaction formats replace pre-SegWit P2PKH and P2SH. Additionally, there are two types of transactions defined for compatibility with Bitcoin wallets which do not support SegWit yet:

  • Pay-to-Witness-Public-Key-Hash nested in P2SH (P2WPKH-P2SH)
  • Pay-to-Witness-Script-Hash nested in P2SH (P2WSH-P2SH)

With the new specification, the scriptPubKey of a native SegWit output contains two elements:

  • the SegWit version: 0;
  • the witness program: a 20-byte public key hash or a 32-byte script hash.

It is essential that the witness program size is the only way to differentiate the kind of Bitcoin transfer: to a public key hash or a script hash. The same principle works in SegWit Bech32 encoded addresses used by wallets.
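A node or wallet can therefore classify a version 0 output just by looking at the program length. The sketch below assumes the scriptPubKey is given as raw bytes; the function name and the returned labels are illustrative.

```python
def classify_v0_witness_program(script_pubkey: bytes) -> str:
    """Classify a native SegWit v0 scriptPubKey by the size of its witness program."""
    # Layout: <version op-code 0x00> <push length> <program bytes>
    if len(script_pubkey) < 2 or script_pubkey[0] != 0x00:
        return "not a v0 witness program"
    program = script_pubkey[2:]
    if script_pubkey[1] != len(program):
        return "not a v0 witness program"
    if len(program) == 20:
        return "P2WPKH"
    if len(program) == 32:
        return "P2WSH"
    return "invalid v0 program length"
```

For example, `classify_v0_witness_program(bytes([0x00, 0x14]) + bytes(20))` yields "P2WPKH", while the same call with a 32-byte program yields "P2WSH".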

P2WPKH — Pay to Witness Public Key Hash

The P2WPKH witness program has a size of 20 bytes and contains a public key hash.

The corresponding scriptSig is empty, and the witness contains exactly two elements: a signature and a public key.

The witness elements are placed on the execution stack. Then the same script as used for a P2PKH transaction is evaluated on the stack: first the public key is verified against the public key hash from the witness program, and then the signature is verified against the public key and the transaction.

scriptPubKey:
0 <pubKeyHash>
scriptSig:
-empty-
Witness:
<sig> <pubKey>

Bitcoin wallets support P2WPKH transactions using a SegWit address: the witness version and the witness program encoded with Bech32 and prefixed with the human-readable part (HRP) “bc”, which is separated from the encoded part by “1”. For P2WPKH it is

bc1||<Bech32(<0>||<PubKeyHash>)>
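The address construction above can be sketched as follows. The checksum routine follows the Bech32 algorithm from BIP-173; the function names `to_5bit` and `segwit_v0_address` are chosen for this example only, and the sketch covers witness version 0 exclusively.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values):
    """Bech32 checksum polynomial over 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    """Expand the human-readable part for checksum computation."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def to_5bit(data: bytes):
    """Regroup 8-bit bytes into 5-bit values, zero-padding the last group."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def segwit_v0_address(program: bytes, hrp: str = "bc") -> str:
    """Encode a version 0 witness program (20 or 32 bytes) as a Bech32 address."""
    if len(program) not in (20, 32):
        raise ValueError("a v0 witness program is 20 or 32 bytes long")
    data = [0] + to_5bit(program)  # witness version 0 comes first
    check = polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
    checksum = [(check >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


# Public key hash taken from the BIP-173 example addresses.
address = segwit_v0_address(bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6"))
```

The “q” right after “bc1” is simply the Bech32 character for the witness version 0.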

P2WPKH-P2SH — Pay to Witness Public Key Hash nested in P2SH

Same as P2WPKH, but:

  • scriptPubKey contains the standard P2SH script;
  • scriptSig contains 0 and the 20-byte public key hash as a single element;
  • the corresponding witness elements are the same as for common P2WPKH.

The single scriptSig element is hashed with the 160-bit hash, and the result should be equal to the scriptHash from scriptPubKey.

scriptPubKey:
OP_HASH160 <scriptHash> OP_EQUAL
scriptSig:
<0 <pubKeyHash>>
Witness:
<sig> <pubKey>

This kind of witness transaction (along with P2WSH-P2SH) lets a wallet that is not SegWit-compatible send funds to a SegWit output, since the recipient’s address is an ordinary P2SH address.

P2WSH — Pay to Witness Script Hash

The witness program has a size of 32 bytes and contains a script hash. The hash algorithm is a single SHA256 instead of the usual Bitcoin 160-bit hash.

The corresponding scriptSig is empty, and the witness contains the parameters for a script and the script itself, like scriptSig for P2SH.

Similarly to P2SH transaction processing, the Bitcoin node first verifies that the script hash from the witness program matches the hash of the script from the last witness element. Then the script itself is executed with the remaining witness elements (all except the last one) placed on the execution stack.

scriptPubKey:
0 <scriptHash>
scriptSig:
-empty-
Witness:
<scriptParam>… <redeemScript>

Let’s look at a MultiSig transaction as an example:

scriptPubKey:
0 <multisig_scriptHash>
scriptSig:
-empty-
Witness:
0 <sig1> <sig2>
<OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG>

As with P2WPKH, Bitcoin wallets support P2WSH transactions using a SegWit address:

bc1||<Bech32(<0>||<scriptHash>)>

P2WPKH and P2WSH addresses are distinguished by the length of a hash encoded inside the address.
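To show where the 32-byte witness program comes from, here is a sketch that serializes a 2-of-3 multisig script (threshold first: OP_2, then the keys, then OP_3) and derives the P2WSH scriptPubKey from it with a single SHA256, as BIP-141 specifies. The public keys are placeholder byte strings rather than valid curve points, and the helper names are made up for this example.

```python
import hashlib


def push(data: bytes) -> bytes:
    """Direct push of up to 75 bytes: a length byte followed by the data."""
    if len(data) > 75:
        raise ValueError("this sketch only handles direct pushes")
    return bytes([len(data)]) + data


def multisig_2_of_3_script(pubkeys) -> bytes:
    """OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG."""
    if len(pubkeys) != 3:
        raise ValueError("exactly three public keys are expected")
    return bytes([0x52]) + b"".join(push(pk) for pk in pubkeys) + bytes([0x53, 0xAE])


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """Version 0 followed by a 32-byte push of SHA256(witness script)."""
    program = hashlib.sha256(witness_script).digest()
    return bytes([0x00, 0x20]) + program


# Placeholder 33-byte "public keys", for illustration only.
pubkeys = [bytes([0x02]) + bytes([i]) * 32 for i in (1, 2, 3)]
witness_script = multisig_2_of_3_script(pubkeys)
script_pubkey = p2wsh_script_pubkey(witness_script)
```

The resulting scriptPubKey is always 34 bytes long regardless of the script size, whereas a P2WPKH scriptPubKey is 22 bytes long.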

P2WSH-P2SH — Pay to Witness Script Hash nested in P2SH

Same as P2WSH, but:

  • scriptPubKey contains the standard P2SH script;
  • scriptSig contains 0 and the 32-byte script hash as a single element;
  • the corresponding witness elements are the same as for common P2WSH.

The single scriptSig element is hashed with the 160-bit hash, and the result should be equal to the scriptHash from scriptPubKey.

scriptPubKey:
OP_HASH160 <scriptHash> OP_EQUAL
scriptSig:
<0 <redeemScriptHash>>
Witness:
<scriptParam>… <redeemScript>

Let’s again look at a MultiSig transaction as an example:

scriptPubKey:
OP_HASH160 <scriptHash> OP_EQUAL
scriptSig:
<0 <multisig_scriptHash>>
Witness:
0 <sig1> <sig2>
<OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG>

SegWit v1 — TapRoot

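TapRoot is the latest update for Bitcoin, rolled out in 2021. It contains some significant improvements for Bitcoin transactions (though not limited to transactions), such as Schnorr signatures (BIP-340), TapRoot outputs with scripts organized in a Merkle tree (BIP-341), and the updated script rules of Tapscript (BIP-342).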

Much information about the Schnorr signature scheme is publicly available. In the context of this article, it is essential that:

  • Schnorr signatures allow aggregation of a number of public keys and signatures, built on the same message, into a single public key and signature. The resulting public key and signature are indistinguishable from a common public key and signature. In other words, a third-party observer cannot distinguish a signature created by a single signer with a single key pair from one aggregated and created by multiple signers with their separate keys. Indeed, anyone can verify the aggregated signature using the aggregated public key, which in turn is indistinguishable from a common public key. Multi-signature schemes such as MuSig, which can be used with TapRoot, rely on this property of Schnorr signatures.
  • A tweak can easily be applied to a Schnorr key, which allows committing to additional information inside the public key. TapTweak, as one of the core mechanisms of TapRoot, uses this to hide behind the TapRoot public key the existence of advanced transaction logic encoded by scripts organized in a Merkle tree.

Contrary to previous Bitcoin transaction formats, a TapRoot output does not distinguish between a public key spend path and a script spend path. It always looks like a public key path: scriptPubKey contains two elements:

  • the SegWit version: 1;
  • a 32-byte witness program: the TapRoot public key encoded according to BIP-340 rules.

Thus any TapRoot transaction is Pay-to-TapRoot (P2TR). Bitcoin wallets support P2TR transactions using an updated SegWit address, which is independent of the spending path:

bc1||<Bech32m(<1>||<PubKey>)>

The update of the address format consists in the new Bech32m encoding.

There are a number of legitimate ways to spend a TapRoot output, described below.

Public Key Path Spending

If there is just a single party who controls a TapRoot output, then the public key spend path may be used. This is the simplest way to spend a TapRoot output.

Another case for the public key spend path is a mutual collaborative agreement of all controlling parties. In this case, all parties collaboratively create an aggregated transaction signature using their independent private keys, whose public keys were previously used to create the aggregated public key defined in the output’s witness program.

The witness should contain a single element, which is a transaction signature valid for the public key from the witness program:

scriptPubKey:
1 <pubKey>
scriptSig:
-empty-
Witness:
<sig>

Script Path Spending

Regardless of the possibility of using the public key path, a P2TR output may have another way to be spent: a script path. It may be used as a fallback when collaborative spending is impossible. A single script, or many scripts organized in a Merkle tree, may be used to cover a lot of different scenarios.

With the same scriptPubKey of the UTXO, the witness looks like this (the script parameters, if any, come before the script itself):

scriptPubKey:
1 <pubKey>
scriptSig:
-empty-
Witness:
<scriptParam>… <script> <controlBlock>

The controlBlock has the following structure:

  • the first byte encodes the leaf version (which is always even) and the tap tweak parity flag (see the has_even_y(P) function definition in BIP-340):
leafVersion = controlBlock[0] & 0xfe
tweakParity = controlBlock[0] & 0x1

The current Bitcoin implementation defines only one leaf version: 0xc0;

  • the next 32 bytes encode the internal public key P;
  • all the remaining data is the path to the script in the Merkle tree.
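The layout above translates directly into a small parser. This is a sketch; the `ControlBlock` type and the function name are invented for this example, and only the length rule (33 bytes plus a whole number of 32-byte path elements) is checked.

```python
from typing import List, NamedTuple


class ControlBlock(NamedTuple):
    leaf_version: int        # controlBlock[0] & 0xfe
    tweak_parity: int        # controlBlock[0] & 0x1
    internal_key: bytes      # 32-byte internal public key P
    merkle_path: List[bytes]  # 32-byte neighbor hashes, from the leaf up to the root


def parse_control_block(raw: bytes) -> ControlBlock:
    """Split a raw control block into its fields."""
    if len(raw) < 33 or (len(raw) - 33) % 32 != 0:
        raise ValueError("a control block is 33 + 32*m bytes long")
    path = [raw[i:i + 32] for i in range(33, len(raw), 32)]
    return ControlBlock(raw[0] & 0xFE, raw[0] & 0x01, raw[1:33], path)
```

For a script in a tree with four leaves, such as the example that follows, the path holds two neighbor hashes, so the control block is 33 + 64 = 97 bytes long.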

In order to create or interpret a control block, the TapRoot specification uses the so-called tagged hash function, which is defined as

hash_{tag}(x) = Sha256(Sha256(tag)||Sha256(tag)||x)
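In Python the tagged hash is a two-liner. The function name is arbitrary; the tag is assumed to be an ASCII string such as "TapLeaf", "TapBranch" or "TapTweak".

```python
import hashlib


def tagged_hash(tag: str, data: bytes) -> bytes:
    """SHA256(SHA256(tag) || SHA256(tag) || data)."""
    tag_hash = hashlib.sha256(tag.encode("utf-8")).digest()
    return hashlib.sha256(tag_hash + tag_hash + data).digest()
```

Using a different tag for every purpose means that a hash computed for a leaf can never be reinterpreted as a branch hash or a tweak.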

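The picture below shows a schematic example of a script Merkle tree:

[Figure: TapRoot transaction script Merkle tree example. The leaf hashes a, b, c and d are combined into the branches ab and cd, and then into the root abcd.]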

Let’s go through the process of verifying Script B from the picture.

  1. Calculate the TapLeaf hash for Script B:
b = hash_{TapLeaf}(0xc0||compactSize(Script B)||Script B)

  2. Go through the Merkle tree branch from the leaf b to the root abcd. Every next 32 bytes of the Merkle tree path are the hash of the neighbor of the previously calculated node.

  • Here the first 32 bytes are the value of the hash a, which is the neighbor of the hash b of Script B. Then
ab = hash_{TapBranch}(a||b)

Note that the order of concatenation under every hash_{TapBranch}(…) depends on a lexicographical comparison of the two hashes: they have to be concatenated in lexicographical order, the smaller one first.

  • The following 32 bytes are ab’s neighbor, which is the hash cd. Then
abcd = hash_{TapBranch}(ab||cd)

which is the root of the Merkle tree. Do not forget about the lexicographical order.

  3. Now it is time to calculate the TapTweak, used to verify the whole construct against the public key Q of the previous output:

t = hash_{TapTweak}(P||abcd)

  4. The last step is applying the TapTweak to the internal public key P. The result should be the public key Q = P + t·G, with the parity of its y coordinate matching the tweak parity flag from the control block. If it is not, the TapRoot transaction is invalid, and Script B is never executed.
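The whole walk-through can be sketched in plain Python. This is an illustrative, unoptimized sketch and not production code: the elliptic-curve helpers follow the style of the BIP-340 reference code, scripts are assumed to be shorter than 253 bytes so that compactSize is a single byte, and all function names, the placeholder scripts, and the choice of the generator's x coordinate as the internal key are assumptions made for this example.

```python
import hashlib

# secp256k1 parameters: field prime, group order, generator point.
FIELD_P = 2**256 - 2**32 - 977
ORDER_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def tagged_hash(tag, data):
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + data).digest()


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and a[1] != b[1]:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], FIELD_P - 2, FIELD_P) % FIELD_P
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], FIELD_P - 2, FIELD_P) % FIELD_P
    x = (lam * lam - a[0] - b[0]) % FIELD_P
    return (x, (lam * (a[0] - x) - a[1]) % FIELD_P)


def point_mul(point, k):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def lift_x(x):
    """Return the curve point with this x and an even y, or None if there is none."""
    if x >= FIELD_P:
        return None
    y_sq = (pow(x, 3, FIELD_P) + 7) % FIELD_P
    y = pow(y_sq, (FIELD_P + 1) // 4, FIELD_P)
    if y * y % FIELD_P != y_sq:
        return None
    return (x, y if y % 2 == 0 else FIELD_P - y)


def tap_leaf(script, leaf_version=0xC0):
    if len(script) >= 253:
        raise ValueError("this sketch assumes a one-byte compactSize")
    return tagged_hash("TapLeaf", bytes([leaf_version, len(script)]) + script)


def tap_branch(left, right):
    # The two hashes are concatenated in lexicographical order.
    return tagged_hash("TapBranch", min(left, right) + max(left, right))


def verify_script_path(output_key, script, control_block):
    """Check that the script is committed to by the 32-byte output key Q."""
    leaf_version, parity = control_block[0] & 0xFE, control_block[0] & 0x01
    internal_key, path = control_block[1:33], control_block[33:]
    node = tap_leaf(script, leaf_version)             # step 1
    for i in range(0, len(path), 32):                 # step 2
        node = tap_branch(node, path[i:i + 32])
    t = int.from_bytes(tagged_hash("TapTweak", internal_key + node), "big")  # step 3
    p_point = lift_x(int.from_bytes(internal_key, "big"))
    if t >= ORDER_N or p_point is None:
        return False
    q = point_add(p_point, point_mul(G, t))           # step 4: Q = P + t*G
    return q is not None and q[0].to_bytes(32, "big") == output_key and q[1] % 2 == parity


# Usage: a four-leaf tree of placeholder one-byte scripts (OP_1 .. OP_4).
scripts = [bytes([0x51 + i]) for i in range(4)]
a, b, c, d = (tap_leaf(s) for s in scripts)
root = tap_branch(tap_branch(a, b), tap_branch(c, d))
p_bytes = G[0].to_bytes(32, "big")
tweak = int.from_bytes(tagged_hash("TapTweak", p_bytes + root), "big")
q_point = point_add(G, point_mul(G, tweak))
control = bytes([0xC0 | (q_point[1] & 1)]) + p_bytes + a + tap_branch(c, d)
script_b_ok = verify_script_path(q_point[0].to_bytes(32, "big"), scripts[1], control)
```

The control block for the second script carries exactly the two neighbors from the walk-through, a and cd, so the verifier can rebuild the root without ever seeing the other scripts.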
