Bitcoin raw transactions : the hard way

In this post I will give a complete description of how to build a non-segwit Bitcoin transaction by hand. Python is used for the demonstration, but the examples can be adapted to any language.

What has to be done ?

Let’s overview the process. A Bitcoin transaction spends one or more outputs from other transactions (which become the inputs of this one) and creates one or more new outputs. In order to keep hard things simple, we will use a transaction spending one input and creating one output. P2PKH will be used : the standard locking script (scriptpubkey) for a P2PKH output is :

OP_DUP OP_HASH160 f256f3f62388e17b66e881f80b17a69dfc55b7e4 OP_EQUALVERIFY OP_CHECKSIG

with f256f3f6... being a random public key hash. In order to spend an output of that kind, we have to satisfy this locking script, so let’s go through it to understand what it does and how to fulfill its conditions. OP_DUP duplicates the top item of the stack. OP_HASH160 replaces the top item of the stack with its ripemd160(sha256(item)) hash. OP_EQUALVERIFY compares the two top items of the stack and stops the execution of the script if they are not equal. OP_CHECKSIG takes the two top items from the stack (the public key on top, the signature below it) and verifies that the signature is valid for the given public key. So, how do we make the script evaluate to True, meaning we can spend the output locked by this script ? We have to provide an unlocking script (scriptsig) which, once the locking script is appended to it, fulfills these conditions : concretely, a script which pushes a signature and then a public key, so that the public key (top item) hashes to the value specified in the locking script and the signature (bottom item) is valid for that public key when both are passed to OP_CHECKSIG. The unlocking script should be :

<a DER-encoded signature><a public key>

but not just any signature or public key : the signature must match the public key (meaning we hold the corresponding private key) and the hash of this public key must match the one specified in the locking script. This hash is actually the address the bitcoins have been sent to. That’s why there is no concept of balance in the blockchain : the balance of an address is the sum of the unspent outputs referencing it. If you want to know how to get an address from a private key, you can check out this post.
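
To make the evaluation concrete, here is a minimal sketch of how the unlocking and locking scripts run on a single stack. It is not a real Script interpreter : run_p2pkh, toy_hash160 and toy_check_sig are names made up for this illustration, and the hash and the signature check are deliberately simple stand-ins for ripemd160(sha256(item)) and for ECDSA verification.

```python
import hashlib

def toy_hash160(item):
    # Stand-in for ripemd160(sha256(item)): a truncated double SHA-256 (NOT the real hash160).
    return hashlib.sha256(hashlib.sha256(item).digest()).digest()[:20]

def toy_check_sig(signature, pubkey):
    # Stand-in for ECDSA verification: a "signature" is valid if it is b'signed-by-' + pubkey.
    return signature == b'signed-by-' + pubkey

def run_p2pkh(signature, pubkey, pubkey_hash, hash_fn=toy_hash160, check_sig=toy_check_sig):
    stack = [signature, pubkey]          # the unlocking script pushes the signature, then the public key
    stack.append(stack[-1])              # OP_DUP
    stack.append(hash_fn(stack.pop()))   # OP_HASH160
    stack.append(pubkey_hash)            # the locking script pushes the expected hash
    if stack.pop() != stack.pop():       # OP_EQUALVERIFY
        return False
    top_pubkey = stack.pop()             # OP_CHECKSIG pops the public key first...
    below_signature = stack.pop()        # ...then the signature
    return check_sig(below_signature, top_pubkey)

pubkey = b'\x02' + b'\x11' * 32
other_pubkey = b'\x03' + b'\x22' * 32
locked_to = toy_hash160(pubkey)
print(run_p2pkh(b'signed-by-' + pubkey, pubkey, locked_to))
print(run_p2pkh(b'signed-by-' + other_pubkey, other_pubkey, locked_to))
```

The first call prints True. The second prints False, because the public key provided does not hash to the value stored in the locking script, even though its toy signature is "valid".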

The transaction structure

Let’s see how a transaction is serialized in order to be sent in a network message.

[Figure : layout of a serialized transaction, taken from a website explaining the block structure (can’t remember the name)]

Since we build a transaction with one input and one output, and given the default values, it results in :

version : 01000000 (4 bytes, little endian)
input_count : 01
prev_hash : 32 bytes (little endian, i.e. the txid byte-reversed)
index : 4 bytes (little endian)
script_length : 1 byte (a variable-length integer, which fits in one byte since our script is < 253 bytes long)
scriptsig : script_length bytes
sequence : ffffffff
output_count : 01
value : 8 bytes (little endian)
script_length : 1 byte (same remark as above)
scriptpubkey : script_length bytes
locktime : 00000000
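
The counts and script lengths above are Bitcoin variable-length integers, which is why a single byte is enough here. As a small sketch (encode_varint is a name I made up for the illustration, it is not used in the rest of the post), the general encoding looks like this :

```python
def encode_varint(n):
    """Encode n as a Bitcoin variable-length integer."""
    if n < 0xfd:
        return n.to_bytes(1, 'little')            # values below 253 fit in a single byte
    if n <= 0xffff:
        return b'\xfd' + n.to_bytes(2, 'little')  # 0xfd marker + 2 bytes, little endian
    if n <= 0xffffffff:
        return b'\xfe' + n.to_bytes(4, 'little')  # 0xfe marker + 4 bytes, little endian
    return b'\xff' + n.to_bytes(8, 'little')      # 0xff marker + 8 bytes, little endian

print(encode_varint(25).hex())   # length of a P2PKH scriptpubkey
print(encode_varint(300).hex())
```

The first call prints 19 and the second fd2c01. Since every script in this post is far shorter than 253 bytes, we will simply write the length on one byte.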

Let’s code a function which, given the necessary variables as bytes, creates a valid serialized transaction :

import binascii

def serialize(prev_hash, index, scriptsig, value, scriptpubkey):
    """
    Serialize a transaction with one input and one output. All parameters are bytes.

    :param prev_hash: The id of the transaction from which the output is spent.
    :param index: The position of the output in the list of outputs of that transaction.
    :param scriptsig: The unlocking script.
    :param value: The value IN SATOSHIS to spend from the output.
    :param scriptpubkey: The script setting the condition to spend the output we create with this transaction.
    :return: The serialized transaction, hex-encoded.
    """
    tx = b'\x01\x00\x00\x00'  # version
    tx += b'\x01'  # input count
    tx += prev_hash[::-1]  # the txid is serialized byte-reversed
    tx += index
    script_length = len(scriptsig)
    tx += script_length.to_bytes(1, 'big')  # one byte is enough for scripts < 253 bytes
    tx += scriptsig
    tx += b'\xff\xff\xff\xff'  # sequence
    tx += b'\x01'  # output count
    tx += value
    script_length = len(scriptpubkey)
    tx += script_length.to_bytes(1, 'big')
    tx += scriptpubkey
    tx += b'\x00\x00\x00\x00'  # locktime
    return binascii.hexlify(tx)

Ok great ! Now we just have to specify the variables !

Actually, this is where it gets tougher. Let’s say the tx we want to spend has the following structure :

ic getrawtransaction d06f1560e668908a0e9c8710d85a148412ea20b531a75fe4e4240146fd1d6e4b 1
{
"hex" : "0100000001a105781c91e77375fb9cd94c23211abe1576c9c331908f30babb32308e26b71c000000006b483045022100f1e67c42ffe7b7a317bfae89c4e32ceef27b248d8355663d95fbc0f5e27959e1022059215830921f9c27e4d6adf96e3a54975ba1f73b0e3da55b8d57c97655a2606d012102bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feffffffffff01805cd705000000001976a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088ac00000000",
"txid" : "d06f1560e668908a0e9c8710d85a148412ea20b531a75fe4e4240146fd1d6e4b",
"version" : 1,
"locktime" : 0,
"vin" : [
{
"txid" : "1cb7268e3032bbba308f9031c3c97615be1a21234cd99cfb7573e7911c7805a1",
"vout" : 0,
"scriptSig" : {
"asm" : "3045022100f1e67c42ffe7b7a317bfae89c4e32ceef27b248d8355663d95fbc0f5e27959e1022059215830921f9c27e4d6adf96e3a54975ba1f73b0e3da55b8d57c97655a2606d01 02bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feff",
"hex" : "483045022100f1e67c42ffe7b7a317bfae89c4e32ceef27b248d8355663d95fbc0f5e27959e1022059215830921f9c27e4d6adf96e3a54975ba1f73b0e3da55b8d57c97655a2606d012102bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feff"
},
"sequence" : 4294967295
}
],
"vout" : [
{
"value" : 0.98000000,
"n" : 0,
"scriptPubKey" : {
"asm" : "OP_DUP OP_HASH160 a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e0 OP_EQUALVERIFY OP_CHECKSIG",
"hex" : "76a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088ac",
"reqSigs" : 1,
"type" : "pubkeyhash",
"addresses" : [
"iJq4io6SKdS9ueBwsGr9HpTNCv4niGHdCY"
]
}
}
],
"blockhash" : "e5eb7260adb96099f54522c2cfa855ab2f07173cfc2b7bf7ff1f699ec34a494e",
"confirmations" : 9,
"time" : 1544400075,
"blocktime" : 1544400075
}

(You won’t be able to see this transaction in the Bitcoin block chain, because I took it from the Insacoin chain, which is a fork of Bitcoin v0.10 : you can check it out here)

Given the preceding output list, we can see there is just one output, thus our index will be 0.
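
As a sanity check of the layout described earlier, here is a small sketch which cuts the "hex" field of that transaction back into its fields. parse_single_io_tx is a hypothetical helper written for this post only : it assumes exactly one input, one output and one-byte script lengths, so it is not a general parser.

```python
RAW_TX = bytes.fromhex(
    "01000000"                                                          # version
    "01"                                                                # input count
    "a105781c91e77375fb9cd94c23211abe1576c9c331908f30babb32308e26b71c"  # prev_hash
    "00000000"                                                          # index
    "6b"                                                                # scriptsig length
    "483045022100f1e67c42ffe7b7a317bfae89c4e32ceef27b248d8355663d95fbc0f5e27959e1"
    "022059215830921f9c27e4d6adf96e3a54975ba1f73b0e3da55b8d57c97655a2606d01"
    "2102bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feff"
    "ffffffff"                                                          # sequence
    "01"                                                                # output count
    "805cd70500000000"                                                  # value
    "19"                                                                # scriptpubkey length
    "76a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088ac"                # scriptpubkey
    "00000000"                                                          # locktime
)

def parse_single_io_tx(raw):
    pos = 0

    def take(n):
        # Return the next n bytes and move the cursor forward.
        nonlocal pos
        chunk = raw[pos:pos + n]
        pos += n
        return chunk

    tx = {'version': int.from_bytes(take(4), 'little')}
    if take(1) != b'\x01':
        raise ValueError('this sketch only handles one input')
    tx['prev_txid'] = take(32)[::-1].hex()           # reversed back to the usual txid form
    tx['prev_index'] = int.from_bytes(take(4), 'little')
    scriptsig_length = take(1)[0]
    tx['scriptsig'] = take(scriptsig_length).hex()
    tx['sequence'] = int.from_bytes(take(4), 'little')
    if take(1) != b'\x01':
        raise ValueError('this sketch only handles one output')
    tx['value'] = int.from_bytes(take(8), 'little')  # in satoshis
    scriptpubkey_length = take(1)[0]
    tx['scriptpubkey'] = take(scriptpubkey_length).hex()
    tx['locktime'] = int.from_bytes(take(4), 'little')
    if pos != len(raw):
        raise ValueError('unexpected trailing bytes')
    return tx

for field, content in parse_single_io_tx(RAW_TX).items():
    print(field, ':', content)
```

The fields printed should match the decoded JSON above, with the value expressed in satoshis (98000000) instead of coins.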

Let’s get a keypair : we’ll send the coins to the corresponding address (I will use functions defined here, but you can generate it the way you want) :

>>> binascii.hexlify(gen_privkey())
b'3fc60583d1b58237cf4dd60270b0c0c9275297bd037a3a71849f96e4295799a901'

Notice that the private key ends with 01, which means it is meant to derive a compressed public key. When we sign the message, we’ll have to drop this suffix.

>>> binascii.hexlify(get_pubkey(binascii.unhexlify(b'3fc60583d1b58237cf4dd60270b0c0c9275297bd037a3a71849f96e4295799a901')))
b'02934015f373002667d7f9d3846905559751fc3e33375cece9c060484257e70ae0'
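
If you are curious about what a function like get_pubkey has to do, here is a pure Python sketch of the derivation of a compressed public key on the secp256k1 curve. This is not the code from the repo : point_add, scalar_mult and compressed_pubkey are names I chose for the illustration, and this naive implementation is not constant-time, so don’t use it with keys holding real value.

```python
# secp256k1 parameters: field prime, group order and generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    # None stands for the point at infinity.
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        slope = 3 * a[0] * a[0] * pow(2 * a[1] % P, -1, P) % P      # tangent (doubling)
    else:
        slope = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P   # chord (addition)
    x = (slope * slope - a[0] - b[0]) % P
    y = (slope * (a[0] - x) - a[1]) % P
    return (x, y)

def scalar_mult(k, point):
    # Double-and-add, from the least significant bit of k.
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def compressed_pubkey(secret):
    x, y = scalar_mult(secret, G)
    prefix = b'\x02' if y % 2 == 0 else b'\x03'   # the prefix encodes the parity of y
    return prefix + x.to_bytes(32, 'big')

privkey = bytes.fromhex('3fc60583d1b58237cf4dd60270b0c0c9275297bd037a3a71849f96e4295799a901')
secret = int.from_bytes(privkey[:32], 'big')      # drop the trailing 01 "compressed" flag
print(compressed_pubkey(secret).hex())
```

Assuming this sketch does the same job as get_pubkey, the last line should print the public key shown above. It needs Python 3.8 or later for the modular inverse computed with pow(x, -1, P).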

So our address is :

>>> hash160(binascii.unhexlify(b'02934015f373002667d7f9d3846905559751fc3e33375cece9c060484257e70ae0'))
'9b1aba939d4f4b958cede48aa42e38668337afa7'

This is the raw address, not yet encoded in base58_check, which is an encoding meant for the end user.

“Sending bitcoins to that address” concretely means forming a locking script such that whoever wants to spend the output we create will have to provide the public key corresponding to that address and a valid signature for this public key, which proves they hold the corresponding private key. Thus we build the locking script this way :

OP_HASH160 9b1aba939d4f4b958cede48aa42e38668337afa7 OP_EQUALVERIFY

In order to verify the public key possession, and

OP_DUP OP_HASH160 9b1aba939d4f4b958cede48aa42e38668337afa7 OP_EQUALVERIFY OP_CHECKSIG

In order to check the private key possession (here we use the OP_DUP opcode because the public key is “consumed” two times : once by OP_HASH160 and once by OP_CHECKSIG ).

If we want to spend the whole output minus a fee of 0.01 coin (which is not a good fee in practice), we’ll form an output with a value of 0.97 coins. At this stage we have :

prev_hash : d06f1560e668908a0e9c8710d85a148412ea20b531a75fe4e4240146fd1d6e4b
index : 0
value : 0.97
scriptpubkey : "OP_DUP OP_HASH160 9b1aba939d4f4b958cede48aa42e38668337afa7 OP_EQUALVERIFY OP_CHECKSIG"

We now have to convert this human-readable script to bytes. As this post is already long enough, I’ll just give the result, but you can check out how I achieved it using the functions available here and the opcodes available here (taken from pycoind).

scriptpubkey : 76a9149b1aba939d4f4b958cede48aa42e38668337afa788ac
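
For a P2PKH script the conversion is short enough to be sketched here. p2pkh_scriptpubkey is a hypothetical helper written for this post (not the function from the repo) ; the four opcode values are the standard ones for these opcodes.

```python
OP_DUP = 0x76
OP_HASH160 = 0xa9
OP_EQUALVERIFY = 0x88
OP_CHECKSIG = 0xac

def p2pkh_scriptpubkey(pubkey_hash):
    """Build OP_DUP OP_HASH160 <pubkey_hash> OP_EQUALVERIFY OP_CHECKSIG as bytes."""
    if len(pubkey_hash) != 20:
        raise ValueError('a public key hash is 20 bytes long')
    script = bytes([OP_DUP, OP_HASH160])
    script += bytes([len(pubkey_hash)])  # pushing 20 bytes of data is the single byte 0x14
    script += pubkey_hash
    script += bytes([OP_EQUALVERIFY, OP_CHECKSIG])
    return script

pubkey_hash = bytes.fromhex('9b1aba939d4f4b958cede48aa42e38668337afa7')
print(p2pkh_scriptpubkey(pubkey_hash).hex())
```

This prints 76a9149b1aba939d4f4b958cede48aa42e38668337afa788ac, the same bytes as the result given above.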

There is just the scriptsig left...

Scriptsig, the dark side

Now, we will form the scriptsig, composed of a signature and public key. If we take the previous transaction, the public key corresponding to the address

iJq4io6SKdS9ueBwsGr9HpTNCv4niGHdCY

is

02bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feff

Here, the address starts with an i because it is an Insacoin address, base58 check encoded with a version prefix of 102. But once again, the encoding is something for the end user : the address is in fact just ripemd160(sha256(pubkey)), which is valid on almost every network. For example, the same address is 1GMYHih3uEAnVRy7NUsdq2h81TnaBRR8sH on Bitcoin and LaaVYvzsytQqkEfGYcrw73ktDg9rJXPcSn on Litecoin.
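
To illustrate how the same hash ends up as different addresses, here is a sketch of the base58 check encoding. base58check is a name chosen for this example ; the version bytes used are 102 for Insacoin (as stated above), 0 for Bitcoin and 48 for Litecoin.

```python
import hashlib

ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def base58check(version, payload):
    data = bytes([version]) + payload
    # The checksum is the first 4 bytes of the double SHA-256 of version + payload.
    data += hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    number = int.from_bytes(data, 'big')
    encoded = ''
    while number:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b'\x00'))
    return '1' * leading_zeros + encoded

pubkey_hash = bytes.fromhex('a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e0')
for network, version in (('Insacoin', 102), ('Bitcoin', 0), ('Litecoin', 48)):
    print(network, ':', base58check(version, pubkey_hash))
```

If the version bytes are the right ones, the three lines printed should be the three addresses quoted above : only the prefix changes, the 20-byte hash inside is the same.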

The private key from which this public key was derived is (I can give it to you, insacoins are not worth anything) :

ced12060f684b088abd332190b100d7220f63768162f66b59bd0011ed8a53ef4

Now that we have the private key “owning” the coins, we can create the signature. A big question I had when I first took a closer look at Bitcoin transactions was which message we should sign with our private key. The answer is the serialized transaction we are creating itself, but with the scriptsig field filled with the scriptpubkey of the output we are spending. A discussion about the reason behind this choice can be found here.

The scriptpubkey of the previous output can be found in the vout[0]['scriptPubKey']['hex'] entry of the getrawtransaction output above : it’s

76a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088ac

Now we have everything needed to build the message to sign :

>>> import binascii
>>> prev_hash = binascii.unhexlify('d06f1560e668908a0e9c8710d85a148412ea20b531a75fe4e4240146fd1d6e4b')
>>> index = b'\x00\x00\x00\x00'
>>> scriptsig = binascii.unhexlify('76a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088ac')
>>> value = int(97000000).to_bytes(8, 'little')
>>> scriptpubkey = binascii.unhexlify('76a9149b1aba939d4f4b958cede48aa42e38668337afa788ac')
>>> serialize(prev_hash, index, scriptsig, value, scriptpubkey)
b'01000000014b6e1dfd460124e4e45fa731b520ea1284145ad810879c0e8a9068e660156fd0000000001976a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088acffffffff01401ac805000000001976a9149b1aba939d4f4b958cede48aa42e38668337afa788ac00000000'

Now we append the 4-byte hash type, also called hashcode (here 1 for SIGHASH_ALL, in little endian) :

message = b'01000000014b6e1dfd460124e4e45fa731b520ea1284145ad810879c0e8a9068e660156fd0000000001976a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088acffffffff01401ac805000000001976a9149b1aba939d4f4b958cede48aa42e38668337afa788ac00000000' + b'01000000' 
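
What actually gets signed is the double SHA-256 of this message. Here is a sketch of that step using only hashlib as a stand-in for the double_sha256 helper of my repo ; signing_digest is a name made up for this example.

```python
import hashlib

SIGHASH_ALL = 1

# The transaction serialized above, with the previous scriptpubkey in place of the scriptsig.
TX_TO_SIGN = bytes.fromhex(
    "01000000"                                                          # version
    "01"                                                                # input count
    "4b6e1dfd460124e4e45fa731b520ea1284145ad810879c0e8a9068e660156fd0"  # prev_hash
    "00000000"                                                          # index
    "19"                                                                # script length
    "76a914a86c52f90b0e2ae853d8e9ea4403a4b68de7a7e088ac"                # previous scriptpubkey
    "ffffffff"                                                          # sequence
    "01"                                                                # output count
    "401ac80500000000"                                                  # value (97000000 satoshis)
    "19"                                                                # script length
    "76a9149b1aba939d4f4b958cede48aa42e38668337afa788ac"                # new scriptpubkey
    "00000000"                                                          # locktime
)

def signing_digest(tx_with_prev_script, hash_type=SIGHASH_ALL):
    # Append the hash type on 4 bytes (little endian), then hash twice with SHA-256.
    message = tx_with_prev_script + hash_type.to_bytes(4, 'little')
    return hashlib.sha256(hashlib.sha256(message).digest()).digest()

print(signing_digest(TX_TO_SIGN).hex())
```

The 32 bytes printed are the digest which is passed to the signing function.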

Let’s sign this message using the ecdsa module (double_sha256 is one of the helper functions mentioned earlier) :

>>> import ecdsa
>>> privkey = binascii.unhexlify(b'ced12060f684b088abd332190b100d7220f63768162f66b59bd0011ed8a53ef4')
>>> secexp = int.from_bytes(privkey, 'big')
>>> msg_hash = double_sha256(binascii.unhexlify(message), bin=True)
>>> sk = ecdsa.SigningKey.from_secret_exponent(secexp, curve=ecdsa.SECP256k1)
>>> sig = sk.sign_digest(msg_hash, sigencode=ecdsa.util.sigencode_der_canonize)

To this signature we append a one-byte hashcode (signifying that we use SIGHASH_ALL) :

>>> binascii.hexlify(sig + b'\x01')
b'304502210092af6f1f2db5ec3504575ca7e25340e164d2281678d72342f96b0eecb73cfa65022009bd182dbc969499919718c31db278f34e5c1310f5923e68eb90b3ddff14d5bd01'

We now have our signature :)
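
If you wonder what is inside these bytes, here is a sketch which unpacks the DER structure (a 0x30 sequence holding two 0x02 integers, r and s) and checks that s is in the lower half of the curve order, which is what sigencode_der_canonize is there for. parse_der_signature is a hypothetical helper, and it only handles short signatures like this one.

```python
# Order of the secp256k1 group.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def parse_der_signature(der):
    """Return (r, s) from a DER-encoded ECDSA signature (without the hashcode byte)."""
    if der[0] != 0x30 or der[1] != len(der) - 2:
        raise ValueError('not a DER sequence of the expected length')
    if der[2] != 0x02:
        raise ValueError('r is not a DER integer')
    r_length = der[3]
    r = int.from_bytes(der[4:4 + r_length], 'big')
    s_start = 4 + r_length
    if der[s_start] != 0x02:
        raise ValueError('s is not a DER integer')
    s_length = der[s_start + 1]
    if s_start + 2 + s_length != len(der):
        raise ValueError('unexpected trailing bytes')
    s = int.from_bytes(der[s_start + 2:], 'big')
    return r, s

signature = bytes.fromhex(
    '304502210092af6f1f2db5ec3504575ca7e25340e164d2281678d72342f96b0eecb73cfa65'
    '022009bd182dbc969499919718c31db278f34e5c1310f5923e68eb90b3ddff14d5bd01'
)
der, hashcode = signature[:-1], signature[-1]   # the last byte is the hashcode
r, s = parse_der_signature(der)
print(hex(r))
print(hex(s))
print('hashcode :', hashcode, '| low s :', s <= N // 2)
```

The r value is encoded on 33 bytes because its first byte is above 0x7f : DER integers are signed, so a 00 byte is prepended to keep it positive.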

Final steps

Let’s construct our scriptsig and serialize the final transaction. The format for the serialized scriptsig is

signature length + signature (including its hashcode byte) + pubkey length + pubkey

So :

>>> sig += b'\x01'  # the signature pushed in the scriptsig carries the SIGHASH_ALL byte
>>> pubkey = binascii.unhexlify('02bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feff')
>>> sig_len = len(sig)
>>> pubkey_len = len(pubkey)
>>> scriptsig = sig_len.to_bytes(1, 'big') + sig + pubkey_len.to_bytes(1, 'big') + pubkey
>>> binascii.hexlify(scriptsig)
b'48304502210092af6f1f2db5ec3504575ca7e25340e164d2281678d72342f96b0eecb73cfa65022009bd182dbc969499919718c31db278f34e5c1310f5923e68eb90b3ddff14d5bd012102bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feff'

And we just have to serialize the transaction with the new scriptsig :

>>> serialize(prev_hash, index, scriptsig, value, scriptpubkey)
b'01000000014b6e1dfd460124e4e45fa731b520ea1284145ad810879c0e8a9068e660156fd0000000006b48304502210092af6f1f2db5ec3504575ca7e25340e164d2281678d72342f96b0eecb73cfa65022009bd182dbc969499919718c31db278f34e5c1310f5923e68eb90b3ddff14d5bd012102bbaee114cfc6e00934cca94eae156f8a005bfd727dffb7b770d0d9d26761feffffffffff01401ac805000000001976a9149b1aba939d4f4b958cede48aa42e38668337afa788ac00000000'

Here we go ! You can now send it to your local node with the sendrawtransaction method.

Footnotes

I could not detail every function I used in this post, but you can check out this repo where I made a basic implementation of keys and raw transactions. I documented all the code and detailed most of the things done in order to make it understandable.

Any remark or criticism is welcome.

Here are some helpful links :