Ethereum tutorial #01 — What is a blockchain and its purpose?


Hi there! I am planning to write a series of Ethereum tutorials in the coming months to share what I have learnt. Today, I would like to start the series with the most fundamental topic: an introduction to blockchain!

This tutorial is for people who have little background knowledge of blockchain, or who want a refresher. I will cover the topics listed below. So, without further ado, let’s begin!

  1. What is a blockchain and a blockchain network?
  2. The characteristics of blockchain and its network
  3. How does a blockchain network achieve pseudonymity?
  4. How does a blockchain achieve immutability?
  5. The ultimate goal of blockchain and its network

Blockchain = Data structure

To understand blockchain, we can split the word into two parts: block and chain. A block is a data structure that consists of a block header and a list of transactions. A chain, on the other hand, is a series of things connected together. Combining the two definitions, a blockchain is a series of blocks, each consisting of a block header and a list of transactions, linked in chronological order (see Figure 1).

Figure 1
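To make the definition a bit more concrete, here is a minimal Python sketch of the data structure. The class and field names (Transaction, BlockHeader, Block, number, timestamp and so on) are my own assumptions for illustration; real blockchains such as Bitcoin and Ethereum use different and much richer block formats.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Transaction:
    # Hypothetical fields, chosen only for illustration.
    sender: str
    recipient: str
    amount: int


@dataclass
class BlockHeader:
    number: int      # position of the block in the chain
    timestamp: int   # creation time (Unix time)


@dataclass
class Block:
    header: BlockHeader
    transactions: List[Transaction] = field(default_factory=list)


# A blockchain is an ordered series of blocks, oldest first.
blockchain: List[Block] = [
    Block(BlockHeader(0, 1_700_000_000), []),
    Block(BlockHeader(1, 1_700_000_600), [Transaction("A", "B", 10)]),
    Block(BlockHeader(2, 1_700_001_200), [Transaction("B", "C", 4)]),
]

block_numbers = [block.header.number for block in blockchain]
print(block_numbers)
```

How the blocks are actually linked together (by hashes rather than by list position) is covered later in this post.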

Transactions are records of changes, much like server logs. These changes can mean different things on different types of blockchain. For example, in the Bitcoin blockchain, a transaction refers to the transfer of digital assets (called bitcoin) from A to B. In the Ethereum blockchain, a transaction could mean the transfer of digital assets (called Ether) from A to B, or the execution record of a small program (called a smart contract) on the blockchain.

Since transactions are records of changes, stored chronologically in a series of blocks, we can deduce the latest state of the blockchain by replaying all transactions from the earliest to the latest.
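The idea of deducing the latest state by replaying transactions can be sketched in a few lines of Python. This is a simplified sketch under my own assumptions: each transaction is a plain (sender, recipient, amount) tuple, and "MINT" is a made-up sender that I use to create the initial coins. Real networks track state in more elaborate ways.

```python
from collections import defaultdict

# Each block is a list of (sender, recipient, amount) transfers, oldest block first.
# "MINT" is a hypothetical sender used here only to create the initial coins.
blocks = [
    [("MINT", "A", 50)],
    [("A", "B", 10)],
    [("B", "C", 4), ("A", "C", 5)],
]


def replay(blocks):
    """Derive the current balances by applying every transaction in order."""
    balances = defaultdict(int)
    for block in blocks:
        for sender, recipient, amount in block:
            if sender != "MINT":
                balances[sender] -= amount
            balances[recipient] += amount
    return dict(balances)


balances = replay(blocks)
print(balances)  # {'A': 35, 'B': 6, 'C': 9}
```

Any node that holds the same blocks and replays them in the same order ends up with the same balances, which is why the ordering of transactions matters so much.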

Blockchain network = Distributed and decentralized network

However, when people talk about blockchain, they usually mean a blockchain network, not the data structure. A blockchain network is a distributed and decentralized network in which every node synchronizes with the others to store the same blockchain data structure and execute the same transactions in the same order.

To keep things unambiguous: when we say “blockchain network”, we mean a network of nodes, like the one in Figure 2 below; when we say “blockchain”, we mean the data structure that organizes a series of blocks into a chain, like the one illustrated in Figure 1 above.

A blockchain network is peer-to-peer, meaning that network nodes connect to one another directly, without a middleman, a proxy, or a centralized server. This design makes the network distributed and decentralized. Unlike the typical modern web application architecture, where applications are centralized and controlled by one party, a blockchain network has no central point of control. Every node on the network has the right and opportunity to make changes to the blockchain, which is why we call it “permissionless”.

Everyone owns the same data

From time to time, blocks are created by different network nodes and linked to the existing blockchain. The new blocks are then copied and synchronized among all peers on the network. As a result, every network node will eventually hold exactly the same copy of the blocks. To put it another way, the whole blockchain (including all transactions and block data inside) is replicated among all peers in a distributed manner.

For example, in Figure 2, node A creates a new block (#03) and propagates it to the nodes it is connected to, and those nodes synchronize with node A by saving the block. Now imagine a new joiner, node D: it synchronizes itself with the network by downloading all existing blocks from some of the nodes on the network.

Figure 2
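The scenario in Figure 2 can be imitated with a toy simulation. This is only a sketch under my own assumptions: the Node class and its methods are hypothetical, blocks are plain labels, and everything runs inside one Python process instead of over a real network.

```python
class Node:
    """A toy network node that stores its own copy of the chain (a list of block labels)."""

    def __init__(self, name):
        self.name = name
        self.chain = []
        self.peers = []

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def receive(self, block):
        # Ignore blocks we already have, so that propagation eventually stops.
        if block in self.chain:
            return
        self.chain.append(block)
        for peer in self.peers:
            peer.receive(block)

    def create_block(self, block):
        # Creating a block means saving it locally and passing it on to peers.
        self.receive(block)

    def sync_from(self, peer):
        # A new joiner downloads every existing block from a peer.
        self.chain = list(peer.chain)
        self.connect(peer)


a, b, c = Node("A"), Node("B"), Node("C")
a.connect(b)
b.connect(c)
for node in (a, b, c):
    node.chain = ["#01", "#02"]

a.create_block("#03")  # propagates from A to B, then from B to C

d = Node("D")          # the new joiner
d.sync_from(c)

print(a.chain, b.chain, c.chain, d.chain)
```

Note that node A is not connected to node C directly here; the block still reaches C because B passes it on.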

Highly available

Since every network node has a copy of the blockchain data, and the nodes are distributed and decentralized, the blockchain network has practically no downtime and no single point of failure. As long as one or more network nodes are online, the blockchain network keeps functioning and users are able to retrieve data from the blockchain. Furthermore, the decentralized design makes the blockchain network highly resistant to DDoS (i.e. distributed denial of service) attacks.

Open to everyone and fully transparent

As you can see from Figure 2, the size of the blockchain network can grow or shrink, depending on the number of network nodes. In general, everyone can join a blockchain network or leave it at any time. This characteristic makes a blockchain network open. Since everyone can join, these newcomers can create transactions and blocks without asking for permission.

Because everyone can join the blockchain network and get a copy of the blockchain, every transaction and block stored in the blockchain is fully transparent. That is, anyone on the planet can view the details of the blockchain.

Pseudonymity: no one discloses their true name

Although the network is open for users to come and go, it does not require them to register before using it. Users on the blockchain network are said to enjoy the benefits of pseudonymity. That is, they do not have to disclose their real name or credentials on the network to communicate or make transactions with others.

To use the blockchain network, every user needs at least one pair of public and private keys (based on public-key cryptography, see Figure 3), which is generated randomly on the user’s local machine. The public key (from which the user’s address is derived) acts as an identity (like a username), and the private key acts as proof of ownership of that identity (like a password).

Figure 3

When users make a transaction on the blockchain network, they refer to the people involved by their addresses. For example, imagine we are on the ABC blockchain network and Alice wants to send 10 digital assets to Bob. Instead of making a transaction that states “Transfer 10 digital assets from Alice to Bob”, she will make a transaction that states “Transfer 10 digital assets from address e38b0c45f01455bc4e9f344d66…. to 7a2cb67eaebd651ffcff8d0a7ce….”. Unless Bob knows that the address “e38b0c45f01455bc4e9f344d66….” is owned by Alice, he cannot figure out who paid him the 10 digital assets (see Figure 4).

Figure 4

Transactions are secured by digital signatures

To ensure no one can forge transactions and publish them into the blockchain, every transaction needs to be digitally signed by the originator to prove its authenticity.

For example, imagine there is a malicious person, Marco, who wants to forge a transaction that claims, “Transfer X amount of digital assets from e38b0c45f01455bc4e9f344d66…. (Alice’s address) to 5544427a02284717c9eaefb039…… (Marco’s address)”. He will not succeed, because such a transaction is not valid without Alice’s digital signature, which can only be generated with Alice’s private key, and Alice keeps that key secret.

When Marco tries to include this invalid transaction in a block and publish it to other network nodes, these nodes will verify the correctness and authenticity of the block’s transactions using Alice’s public key. Once the invalid transaction is spotted, the nodes will immediately discard the block, and the fraud will fail (see Figure 5).

Figure 5
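Here is a small runnable sketch of the sign-and-verify idea. Please note the assumption: Bitcoin and Ethereum use ECDSA signatures, which are not available in Python's standard library, so I swap in a Lamport one-time signature, a much simpler hash-based scheme that needs nothing but hashlib. The function names and the address format (a truncated hash of the public key) are my own inventions for illustration, and a Lamport key pair must only ever sign a single message.

```python
import hashlib
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Lamport one-time key pair: 256 pairs of random secrets, and their hashes."""
    private_key = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public_key = [(sha256(a), sha256(b)) for a, b in private_key]
    return private_key, public_key


def message_bits(message: bytes):
    digest = int.from_bytes(sha256(message), "big")
    return [(digest >> i) & 1 for i in range(256)]


def sign(message: bytes, private_key):
    # Reveal one secret from each pair, selected by the bits of the message hash.
    return [private_key[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(message: bytes, signature, public_key) -> bool:
    # Each revealed secret must hash to the matching half of the public key.
    return all(
        sha256(signature[i]) == public_key[i][bit]
        for i, bit in enumerate(message_bits(message))
    )


def address_of(public_key) -> str:
    # Hypothetical address format: a truncated hash of the public key.
    flat = b"".join(a + b for a, b in public_key)
    return sha256(flat).hex()[:40]


alice_private, alice_public = generate_keypair()
marco_private, marco_public = generate_keypair()

genuine = f"Transfer 10 from {address_of(alice_public)} to {address_of(marco_public)}".encode()
genuine_ok = verify(genuine, sign(genuine, alice_private), alice_public)

# Marco does not have Alice's private key, so he can only sign with his own.
forged = f"Transfer 999 from {address_of(alice_public)} to {address_of(marco_public)}".encode()
forged_ok = verify(forged, sign(forged, marco_private), alice_public)

print(genuine_ok, forged_ok)
```

The nodes in the story play the role of verify: they only need Alice's public key to reject Marco's transaction, and they never see her private key.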

Immutability: no one can change or delete once published

One of the breakthroughs of blockchain technology is the immutability of data. That is, once a transaction has been executed and stored, practically no one can alter or delete the data, or undo the operation. This makes the blockchain work like an append-only log, similar to a ledger in financial accounting. As such, a blockchain network is also known as a type of distributed ledger system.

To understand how a blockchain network ensures the immutability of its data, let’s review Figure 1. It shows blocks chained together by arrows, or pointers. However, that is just an oversimplified explanation.

In fact, each block is connected to its predecessor by the cryptographic hash of the previous block, as illustrated in Figure 6. A cryptographic hash is like a fingerprint, or DNA, of the block: a long, random-looking, fixed-length token that is generated from the block data and is, for all practical purposes, unique.

Figure 6

As you can see, each block references its previous block by that block’s cryptographic hash. The hash value of a block is generated from its transaction data and the content of its block header, which contains the hash value of the previous block.
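To get a feel for what a block hash looks like, here is a minimal sketch using SHA-256 from Python's standard library. The block_hash function, its parameters and the JSON layout are assumptions of mine for illustration; real blockchains serialize and hash their blocks in their own specific ways.

```python
import hashlib
import json


def block_hash(previous_hash: str, transactions: list, nonce: int = 0) -> str:
    """Hash the block's contents, including the hash of the previous block."""
    payload = json.dumps(
        {"previous_hash": previous_hash, "transactions": transactions, "nonce": nonce},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


h1 = block_hash("00" * 32, ["A pays B 10"])
h2 = block_hash("00" * 32, ["A pays B 11"])  # only one character differs

print(h1)
print(h2)
```

Both hashes have the same length (64 hexadecimal characters), yet they look completely unrelated even though the inputs differ by a single character.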

Mechanism #1: Linked previous hash values

The cryptographic hash of a block N is used to compute the cryptographic hash of its next block N+1, and the cryptographic hash of block N+1 is used to compute the cryptographic hash of its next block N+2, and so on. Changing any piece of data (even just a single character) in block N changes its block hash, and with it the hashes of all blocks that come after block N. Therefore, if someone, like a hacker, tries to modify the data in block N, he/she needs to update the “previous hash value” in blocks N+1, N+2, N+3, and all the way to the latest block in order to keep the blockchain valid.
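The ripple effect described above can be demonstrated with a short sketch. Again, the helper names (build_chain, first_invalid_block) and the dictionary layout are hypothetical, and each block's data is just a string to keep things short.

```python
import hashlib


def block_hash(previous_hash: str, data: str) -> str:
    return hashlib.sha256((previous_hash + data).encode()).hexdigest()


def build_chain(data_items):
    chain, previous_hash = [], "0" * 64
    for data in data_items:
        current = block_hash(previous_hash, data)
        chain.append({"previous_hash": previous_hash, "data": data, "hash": current})
        previous_hash = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails validation, or None if all pass."""
    previous_hash = "0" * 64
    for index, block in enumerate(chain):
        if block["previous_hash"] != previous_hash:
            return index
        if block_hash(block["previous_hash"], block["data"]) != block["hash"]:
            return index
        previous_hash = block["hash"]
    return None


chain = build_chain(["A pays B 10", "B pays C 4", "C pays D 1"])
valid_before = first_invalid_block(chain)

# Tamper with the first block: it no longer matches its stored hash.
chain[0]["data"] = "A pays B 1000"
after_edit = first_invalid_block(chain)

# Recomputing only that block's hash just moves the break to the next block.
chain[0]["hash"] = block_hash(chain[0]["previous_hash"], chain[0]["data"])
after_partial_fix = first_invalid_block(chain)

print(valid_before, after_edit, after_partial_fix)  # None 0 1
```

To hide the change, the attacker would have to keep going and rewrite every later block as well.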

Mechanism #2: Nonce values

Moreover, the malicious person needs to re-calculate the nonce value (i.e. a number found by trial and error, see Figure 6) so that each affected block meets certain criteria, through a process called mining. Mining is much like a lottery: you have to try many times in order to succeed. This process is extremely time-consuming. In Bitcoin, for example, the difficulty is adjusted so that the whole network, working together, needs about ten minutes on average to find a valid nonce for a SINGLE block; a lone attacker with only a small share of the network’s computing power would need far longer. (Remember, the hacker needs to re-calculate all affected blocks, not just one…)
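The lottery-like search for a nonce can be sketched as follows. I am assuming a simplified criterion here: the SHA-256 hash of the block data plus the nonce must start with a given number of zero hex digits. The mine function and its difficulty parameter are hypothetical, and real networks define their targets differently and use vastly higher difficulty.

```python
import hashlib


def mine(block_data: str, difficulty: int):
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("previous_hash|A pays B 10", difficulty=4)
print(nonce, digest)
```

Each extra zero digit makes the search about 16 times longer on average, while checking a proposed nonce still takes a single hash. That asymmetry is what makes re-mining many blocks so expensive for an attacker and so cheap for everyone else to verify.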

Mechanism #3: The longest chain rule

Well… that is not the end of the story! After such a long process of re-computing the hashes of all affected blocks, the hacker still needs to get the modified chain accepted by the other network nodes before we can call the attack successful. How? The only way is to produce a longer chain than the genuine one. This is the so-called “longest chain rule”, which is enforced between network nodes. The rule states: “If you have two or more valid chains, always select the longest one”. Unless the hacker controls over half of the computing power on the network, he/she is very unlikely to ever produce a longer chain, because the genuine chain keeps growing as time goes by and the hacker keeps falling behind in the race (see Figure 7).

Figure 7
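The selection rule itself is simple enough to sketch in a few lines. The helpers below (is_valid, select_chain, make_chain) are hypothetical, and the validity check is only a placeholder that looks at the links between blocks. As far as I know, Bitcoin in practice compares the total amount of work behind each chain rather than the raw number of blocks, but the idea is the same.

```python
def is_valid(chain):
    """Placeholder validity check: every block must point at its predecessor."""
    return all(chain[i]["prev"] == chain[i - 1]["id"] for i in range(1, len(chain)))


def select_chain(candidates):
    """Among the valid candidate chains, pick the longest one."""
    valid = [chain for chain in candidates if is_valid(chain)]
    return max(valid, key=len)


def make_chain(ids):
    return [
        {"id": block_id, "prev": ids[i - 1] if i else None}
        for i, block_id in enumerate(ids)
    ]


genuine = make_chain(["g0", "g1", "g2", "g3", "g4"])
attacker = make_chain(["g0", "g1", "x2", "x3"])  # forked after g1, but shorter
broken = make_chain(["g0", "g1", "g2", "g3", "g4", "g5"])
broken[3]["prev"] = "bogus"                      # longer, but not valid

chosen = select_chain([attacker, broken, genuine])
print([block["id"] for block in chosen])
```

The attacker's fork loses because it is shorter, and the tampered chain loses because it is not valid, no matter how long it is.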

These designs make hacking or faking a blockchain extremely difficult. Therefore, in practice, data cannot be changed once it is published to the blockchain, and that is what we mean by immutability.

Ultimate goal

  • Middlemen-free: The decentralization of the blockchain network removes the need for a middleman, like a bank or an agent. Transfers of digital assets can be done directly in a peer-to-peer manner. Furthermore, users can interact with each other through a small programmable application called a smart contract, which contains custom rules or logic and acts as a virtual middleman.
  • Censorship-free: The blockchain network is not controlled by a single party but runs with every network node’s participation. If someone tries to shut down the network, he/she will need to hack into every network node to succeed, which is nearly impossible and extremely expensive. Furthermore, if someone tries to suppress some information on the blockchain by modifying it, he/she will fail due to the immutability of the blockchain. And, since the blockchain is replicated across all network nodes, censored material can easily be identified by comparing it with the genuine chain.
  • Trustless: It means the blockchain network does not require people to explicitly know or trust each other for the system to function. Since the network is open, people can join or leave freely without having to earn anyone’s trust. Pseudonymity lets users hide their real-life identity, while at the same time transactions are backed by modern cryptography that ensures no one can forge them. More importantly, the immutability of the blockchain allows users to trace all transactions back to day one. As a result, people can rely on the technologies provided by the blockchain network to transact freely and securely.

Thanks for reading! I hope this article explains what I intended and is able to help you. Please let me know your comments and opinions (either good or bad) below for discussion, and it would definitely be great if you could give some claps 👏👏👏👏👏!

In the coming tutorials, I will be sharing more technical material, like how to use the Ethereum console, how to write Solidity smart contracts, and walkthroughs of some examples. Stay tuned, and have a nice day :)