Building a Blockchain From Scratch (Part 2)

Scott Hickmann
9 min readMay 20, 2022

--

This tutorial is a continuation of the previous part, which you can find here:

In this tutorial, we will see how to ensure our blockchain is secure by introducing blocks and chains. After reading this article, you should have all the tools needed to implement your own blockchain.

Double Send Attacks

The Problem

To ensure security, all honest nodes on our network must store their own ledger (the history of all transactions). Ledgers must agree on past transactions. Otherwise, there may be disagreements on what money has been spent or not. How can Alice prove to Bob that she paid for coffee if not all honest nodes say so?

Using the model we have developed in the past tutorial, the consensus among ledgers is not guaranteed. The reason for this is that information gossiped across the network is not transmitted instantly. There will be a network delay corresponding to the maximum time it takes for information to be received by all nodes on the network.

As a result of the network delay, an adversary could create two transactions tx1 and tx2 spending the same transaction output and paying two different people. Then, it could send each to two different nodes, and these nodes would not agree as to what the valid ledger of transactions is. Every node on the network which receives tx1 first will add tx1 to its ledger and reject tx2, while every node which receives tx2 first will add tx2 to its ledger and reject tx1. There will be disagreement.

No Simple Solution

Let’s see why simple solutions to this double-spend problem don’t work.

If we just cancel double spends, since no honest node would ever double spend, we would be canceling both tx1 and tx2. However, consider the scenario where an adversary first buys an item in tx1, gets the item, then sends a double spend with tx2. In that case, both tx1 and tx2 would be canceled, meaning that the adversary would get its money back and keep the item it purchased, at the loss of the seller.

If instead, we set up a time window, where if both tx1 and tx2 are received within the window, both are canceled. Otherwise, whichever transaction arrives last gets canceled. However, consider the scenario where an adversary sends tx1, then sends a double spend with tx2 just at the edge of the end of the time window, so that some nodes reject tx2 because it’s outside the window due to network delay and others reject both tx1 and tx2 because they’re both inside the window. In that case, nodes’ ledgers will again disagree.

Virtues of Ledgers

Before we can come up with a good solution to the double-spend problem, we need to introduce two virtues of ledgers that should be satisfied to solve the double-spend problem:

  • Safety: All honest parties should agree on spent transactions.
  • Liveness: Transactions created by honest parties should be confirmed by the rest of the network quickly.

As we will see, getting both of these conditions is difficult and will require tradeoffs.

Blocks and Chains

Blocks

One way to satisfy both safety and liveness is to group multiple transactions together into what we’ll call blocks. Using blocks, double-spends can only occur if tx1 and tx2 are in two different blocks, as otherwise the block would be rejected. Luckily, if the two consecutive blocks are spaced out by at least the network delay, then everyone will receive the blocks in the same order, and everyone will agree that the transaction in the first block will be valid and not the one in the second. But how do we make sure that the creation of blocks is rare?

Proof of Work (Mining)

To make sure blocks are created rarely, we can design a challenging task that can only be solved through computing power and time. More specifically, in addition to containing transactions, each block will contain a number, called a nonce. This number is arbitrary but has to be chosen so that when the entire block is hashed, the resulting hash is smaller than a network chosen target. Hashing is the process of taking any arbitrarily long string and passing it through a function to get a fixed-length number. Hash functions are easy to compute given the input, but hard to reverse. Hence the only way to find the nonce that makes the hash of the block below the target is to randomly choose a nonce and keep incrementing it until the hash is below the target. This process is called mining. The miner will keep track of a list of all valid transactions that are valid but not yet in the longest chain. This list is called a mempool. The transactions included by the miner in the block correspond to the mempool.

Chains

The miner will also include the hash of the previous block inside the block it is trying to mine. This forces the block to be created after the previous block, as it would be impossible to mine it without being aware of the previous block. This prevents people from mining lots of blocks in the past and releasing them all at once in the future, which would reintroduce the issue of double-spends. Since all blocks are now linked together, we call that a blockchain.

The Genesis Block

If all blocks need to contain the hash of a previous block, what happens to the very first block mined on the network? That block, called genesis, is allowed to not contain a previous block hash, but instead has a note containing information about a real-world event, to anchor it through time. For example, Bitcoin’s genesis block quotes The Times headline from January 3, 2009. This prevents an adversary or platform designer from pre-mining blocks before the blockchain becomes public.

The Randomness of Proof of Work

The process of proof of work is essentially random and expected to take one network delay. Sometimes, only one block might be created within more than two network delays. Other times, many miners might get lucky and multiple blocks will be mined within a network delay. We call the prior a convergence opportunity, while the latter an interval of conflicting blocks. When conflicting blocks occur, we might have double-spends again. How do we avoid that?

The Longest Chain Rule

To avoid conflicting blocks and eventually make everyone agree on which block is valid, honest parties will follow the longest chain rule. All honest parties will adopt whichever chain appears to them as the longest. Assuming there are enough convergence opportunities, honest nodes will eventually converge on the same chain, resolving any conflict that might occur. As a result, a transaction cannot be deemed fully confirmed as soon as it enters a block. We will need to wait for a few more blocks to be convinced that this block is now truly in the longest chain. On Bitcoin, we are usually required to wait 4 blocks before trusting that a transaction reached the longest chain.

Chain Reorganization

When a miner receives a new longest chain, it must update its mempool to reflect which transactions still need to be included in new blocks. In the next section, we will see how to update the UTXO and mempool every time a node receives a new block.

Updating the UTXO

For each received transaction, if the transaction is valid, apply the transaction to the previous UTXO to get the new UTXO.

For each received block, apply each transaction of the block to the previous UTXO. If at any point a transaction is invalid, reject the entire block and revert to the previous UTXO.

For all transactions not yet in a block, add them to a mempool with a temporary UTXO which starts as the same UTXO as the current longest chain tip. These transactions are added in the order in which they are received. If the transaction is invalid with respect to the current UTXO, reject it. When mining a block, use all of the transactions in the mempool to create this block.

Upon receiving a block at the tip of the current longest chain, validate the block and update the UTXO. Also, update the mempool and temporary UTXO by applying transactions in the block or removing any transaction that is now either invalid or already in the newly received block.

Upon receiving a block that changes the longest chain to another chain, set the UTXO to the fork between the current longest chain and this new chain (by undoing all transactions after the fork in the current chain). Then, go through each block until the tip in the new longest chain and update the UTXO. If at any point, a block is invalid, reject the new chain and revert to the previous chain and previous UTXO. If all blocks after the fork in the new longest chain are valid, update the mempool by starting with the UTXO at the tip of the new longest chain, applying every valid transaction in the previous longest chain that is not present in the new longest chain, then applying all the still-valid transactions that were previously in the mempool.

Updating the UTXO upon chain reorganization

Coinbase Transactions

In our system, we are still missing a way to create money. One way to do so is to introduce the notion of coinbase transactions, which are generated by miners at the top of the block they are trying to mine. If the miner is successful in mining the block and that block reaches the longest chain, the miner will gain that money. The amount of money that a coinbase transaction is allowed to create is equal to a fixed amount determined by the network added to the sum of all transaction that has not been fully spent inside the blocks (transaction fees). For example, if Alice creates a transaction that consumes a transaction output of value 10 and pays Bob 8, the miner is allowed to add 2 to their coinbase transaction.

Validating Objects

As we’ve introduced a lot of new objects, let’s summarize how to validate them.

Validating a Block

To validate a block:

  1. Verify the proof of work. (This is done first to avoid spam attacks.)
  2. Validate the parent blocks recursively (and if it does not exist locally, request it from the network) until a known valid block is reached (or genesis).
  3. Validate the transactions inside the block, including the coinbase transaction (and if they are not present locally, request them from the network).

Validating a Chain

To validate a chain:

  1. Validate every block in the chain.
  2. Ensure that the chain starts with the genesis block.
  3. If the chain is longer than the current longest chain, adopt it.

Validating a Regular Transaction

To learn how to validate a regular transaction, check out the previous tutorial:

Validating a Coinbase Transaction

To validate a coinbase transaction:

  1. Make sure the coinbase is unique in the block.
  2. Verify that it appears first.
  3. Verify that it has exactly no input and one output
  4. Verify that its value is at most the sum of the fixed block reward and the difference between the input and output amounts of all other transactions in the block (transaction fees).
  5. Verify that it is not spent in the same block.

Conclusion

You now know everything necessary to build a secure blockchain! There are many improvements that can be done such as using Merkel trees to have light clients or using proof of stake to reduce energy use. However, we’ve covered everything necessary for a working blockchain in these two articles. Please reach out if you have any questions.

References

Credits to EE 374, a class offered at Stanford which helped me understand blockchains and develop a theoretical background to prove their security.

--

--