An Introductory Guide to Blockchain Technology

Shubham
Published in Analytics Vidhya
17 min read · Oct 27, 2020

“The practical consequence […is…] for the first time, a way for one Internet user to transfer a unique piece of digital property to another Internet user, such that the transfer is guaranteed to be safe and secure, everyone knows that the transfer has taken place, and nobody can challenge the legitimacy of the transfer. The consequences of this breakthrough are hard to overstate.” (Marc Andreessen)

THE TECHNICAL SUMMARY
________________________________________
People use the term ‘Blockchain Technology’ to mean different things, and it can be confusing. Sometimes they are talking about ‘The Bitcoin Blockchain’, other times it’s ‘The Ethereum Blockchain’, other times it’s other virtual currencies or digital tokens, and sometimes it’s just smart contracts. Most of the time, though, they are talking about distributed ledgers, i.e. a list of transactions that is replicated across a number of computers instead of being stored on a central server.
The common themes seem to be a data store which:
• usually contains financial transactions
• is replicated across many systems in almost real-time
• usually runs over a peer-to-peer network
• uses cryptography and digital signatures to prove identity and authenticity and to enforce read/write access rights
• can be written to by certain participants
• can be read by certain participants, maybe a wider audience, and
• has mechanisms to make it difficult to change historical records, or at least make it easier to detect when someone is trying to do so.

I see “Blockchain Technology (BT)” as a set of technologies, a bit like a bag of Lego. From the bag, you can take out different bricks and put them together in different ways to produce different results.

What’s the difference between a blockchain and a normal database?

In layman’s terms, a blockchain system is a package which contains a normal database plus some software that adds new rows, validates that new rows conform to pre-agreed rules, and listens for and broadcasts new rows to its peers across a network, ensuring that every peer has the same data in its database.
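As a rough illustration of that description, the sketch below models each peer as a plain list (the ‘database’) plus a little logic that validates a new row and passes it on to its peers. The `Node` class, the row format and the rule inside `is_valid` are all invented for this example; real blockchain clients are far more involved.

```python
class Node:
    """A toy peer: a local 'database' plus validate-and-broadcast logic."""

    def __init__(self, name):
        self.name = name
        self.rows = []    # the 'normal database': an append-only list of rows
        self.peers = []   # other Node objects this peer talks to

    def is_valid(self, row):
        # Hypothetical pre-agreed rule: a row needs a sender, a receiver
        # and a positive amount.
        return (
            isinstance(row, dict)
            and bool(row.get("from"))
            and bool(row.get("to"))
            and isinstance(row.get("amount"), (int, float))
            and row["amount"] > 0
        )

    def receive(self, row):
        # Ignore rows that break the rules or that we already hold.
        # The duplicate check also stops the gossip from looping forever
        # (a simplification: two identical rows count as one).
        if not self.is_valid(row) or row in self.rows:
            return False
        self.rows.append(row)
        for peer in self.peers:
            peer.receive(row)
        return True


def connect(a, b):
    """Link two nodes in both directions."""
    a.peers.append(b)
    b.peers.append(a)


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
connect(alice, bob)
connect(bob, carol)

alice.receive({"from": "x", "to": "y", "amount": 5})   # valid: reaches every peer
carol.receive({"from": "x", "to": "y", "amount": -1})  # invalid: rejected
```

After these two calls, all three nodes hold the same single row, even though carol is not directly connected to alice.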

INTRODUCING BITCOIN’S BLOCKCHAIN

The Bitcoin Blockchain ecosystem :-
The Bitcoin Blockchain ecosystem is quite a complex system because of its dual aims: firstly, that anyone should be able to write to The Bitcoin Blockchain; secondly, that there shouldn’t be any centralized power or control.

Relax these aims, and you don’t need many of Bitcoin’s convoluted mechanisms.

Replicated databases : The Bitcoin Blockchain ecosystem acts like a network of replicated databases, each containing the same list of past bitcoin transactions. Important members of the network are called validators or nodes, which pass around transaction data (payments) and block data (additions to the ledger). Each validator independently checks the payment and block data being passed around. There are rules in place to make the network operate as intended.

How Does the Bitcoin Blockchain Operate?

Bitcoin’s Complexity : The aim of bitcoin was to be decentralised, i.e. to have no central point of control, and to be relatively anonymous. This has influenced how bitcoin has developed. Not all blockchain ecosystems need the same mechanisms, especially if participants can be identified and trusted to behave.
Here’s how bitcoin approaches some of the decisions:

(I) Public vs Private blockchains :-
— — — — —

There is a big difference in which technologies you need, depending on whether you allow anyone to write to your blockchain or only known, vetted participants. Bitcoin in theory allows anyone to write to its ledger (but in practice, only about 20 people or groups actually do).

Public blockchains → Ledgers can be ‘public’ in two senses:
1. Anyone can write data, without needing permission from another authority
2. Anyone can read data, again without needing permission from another authority
Usually, when people mention public blockchains, they mean anyone-can-write.
Because bitcoin is designed as an ‘anyone-can-write’ blockchain, where participants aren’t vetted and can add to the ledger without needing approval, it needs ways of arbitrating discrepancies (there is no ‘boss’ to decide) and defence mechanisms against attacks (anyone can misbehave with relative impunity if there is a financial incentive to do so). These add cost and complexity to running this blockchain.

Private blockchains → A ‘private’ blockchain network is one where the participants are known and trusted: for example, an industry group, or a group of companies owned by an umbrella company. Many of the mechanisms aren’t needed, or rather they are replaced with legal contracts: “You’ll behave because you’ve signed this piece of paper.” This changes the technical decisions about which bricks are used to build the solution.
Another way of describing public/private could be permissionless vs permissioned, or pseudonymous vs identified participants.
See the pros and cons of internal blockchains or the difference between a distributed ledger and a blockchain for more on this subject.

DIVING DEEPER
________________________________________
Warning: this section isn’t so gentle, because it goes into detail on each of the topics discussed above. I recommend getting a cup of tea.

DATA STORAGE → What is a blockchain?
A blockchain is just a file. A blockchain by itself is simply a data structure, i.e. a way in which data is logically put together and stored. Other data structures are databases (rows, columns, tables), text files, comma separated values (csv), images, dictionaries, lists, and so on. You can think of a blockchain as competing most closely with a database.

Blocks in a chain = pages in a book
As an analogy, a book is a chain of pages. Each page in a book contains:
• the text: for example the story
• information about itself: at the top of the page there is usually the title of the book and sometimes the chapter number or title; at the bottom is usually the page number, which tells you where you are in the book. This ‘data about data’ is called metadata.

Similarly, each block in a blockchain has:
• the contents of the block: in bitcoin, these are the bitcoin transactions and the miner’s block reward (6.25 BTC as of late 2020).
• a ‘header’ which contains the data describing the block.

In bitcoin, the header contains some technical information about the block, a reference to the previous block, and a fingerprint (hash) of the data contained in this block, among other things. This hash is important for ordering.

Blocks in a chain refer to previous blocks, like page numbers in a book.

Block ordering in a blockchain :-
Page by page → With books, predictable page numbers make it easy to know the order of the pages. If you ripped out all the pages and shuffled them, it would be easy to put them back into the right order so that the story makes sense.

Block by block → In a blockchain, each block references the previous block by its fingerprint rather than by a ‘block number’, which is cleverer than a page number because the fingerprint itself is determined by the contents of the block.

Book Ordering Like Block Ordering

The reference to previous blocks creates a chain of blocks: a blockchain!
Internal consistency → By using a fingerprint rather than a timestamp or a numerical sequence, you also get a nice way of validating the data. In any blockchain, you can generate the block fingerprints yourself using the same hashing algorithm. If the fingerprints are consistent with the data, and the fingerprints join up in a chain, then you can be sure that the blockchain is internally consistent. If anyone wants to meddle with any of the data, they have to regenerate all the fingerprints from that point forwards, and the blockchain will look different.
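The internal consistency check can be sketched in a few lines. This is a simplified model, not Bitcoin’s actual block format: here a block is a small dictionary and the fingerprint is a single SHA-256 over its JSON form, whereas Bitcoin hashes a compact binary header. The names `fingerprint`, `make_block` and `is_consistent` are made up for the example.

```python
import hashlib
import json


def fingerprint(block):
    """SHA-256 over the block's data and its link to the previous block."""
    payload = json.dumps(
        {"prev": block["prev"], "data": block["data"]}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(prev_hash, data):
    block = {"prev": prev_hash, "data": data}
    block["hash"] = fingerprint(block)
    return block


def is_consistent(chain):
    """Check every fingerprint matches its data and joins up with the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != fingerprint(block):
            return False   # fingerprint does not match the contents
        if i > 0 and block["prev"] != chain[i - 1]["hash"]:
            return False   # link to the previous block is broken
    return True


genesis = make_block("0" * 64, ["reward -> alice"])
b1 = make_block(genesis["hash"], ["alice -> bob: 5"])
b2 = make_block(b1["hash"], ["bob -> carol: 2"])
chain = [genesis, b1, b2]
print(is_consistent(chain))            # True

b1["data"] = ["alice -> mallory: 5"]   # tamper with history
print(is_consistent(chain))            # False: b1's fingerprint no longer fits

b1["hash"] = fingerprint(b1)           # recompute b1's fingerprint...
print(is_consistent(chain))            # False: b2 still points at the old one
```

The last step shows why a change has to ripple forwards: fixing up the tampered block’s own fingerprint is not enough, because every later block still refers to the old fingerprint.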

A look inside a blockchain block shows that the fingerprints are unique to the block’s contents.
This means that if it is difficult or slow to create this fingerprint, then it is also difficult or slow to rewrite a blockchain.
The logic in bitcoin is:
• Make it hard to generate a fingerprint that satisfies the rules of The Bitcoin Blockchain
• Therefore, if someone wants to rewrite parts of The Bitcoin Blockchain, it will take them a long time, and they have to catch up with and overtake the rest of the honest network
This is why The Bitcoin Blockchain is said to be immutable (i.e. in practice it is extremely hard to change).

DATA DISTRIBUTION → How is new data communicated?
Peer-to-peer is one way of distributing data in a network. Another way is client-server. You may have heard of peer-to-peer file sharing on the BitTorrent network, where files are shared between users without a central server controlling the data. This is why BitTorrent has remained resilient as a network: there is no central server to shut down.

CLIENT-SERVER
In an office environment, data is often held on servers, and wherever you log in, you can access the data. The server holds 100% of the data, and the clients trust that the data is definitive. Most of the web is client-server: the website is held on the server, and you are the client when you access it. This is very efficient, and a traditional model in computing.

PEER-TO-PEER
In peer-to-peer models, it is more like a gossip network where each peer has 100% of the data (or as close to it as possible), and updates are shared around. Peer-to-peer is in some ways less efficient than client-server, as data is replicated many times, once per machine, and each change or addition to the data creates a lot of noisy gossip. However, each peer is more independent, and can continue operating to some extent if it loses connectivity to the rest of the network. Peer-to-peer networks are also more robust, as there is no central server that can be controlled, so shutting down a peer-to-peer network is harder.

Client-Server vs Peer-To-Peer Model

The problems with peer-to-peer →
With peer-to-peer models, even if all peers are ‘trusted’, there can be a problem of agreement or consensus: if each peer is updating at a different speed and has a slightly different state, how do you determine the “real” or “true” state of the data?

CONSENSUS: How does one resolve conflicts?
A common conflict is when multiple miners create blocks at roughly the same time. Because blocks take time to be shared across the network, which one should count as the legitimate block?

For example, let’s say all the nodes on the network have synchronised their blockchains, and they are all on block number 80.
If three miners across the world create ‘Block 81’ at roughly the same time, which ‘Block 81’ should be considered valid? Remember that each ‘Block 81’ will look slightly different: they will certainly contain a different payment address for the block reward, and they may contain a different set of transactions. Let’s call them 81a, 81b, 81c.

Which block should count as the legitimate one?

Which is the Legitimate Block? How do you resolve this?

The First Block You See Is the Legitimate One

Longest chain rule → In bitcoin, the conflict is resolved by a rule called the “longest chain rule”.
In the example above, you would assume that the first ‘Block 81’ you see is valid. Let’s say you see 81a first. You can start building the next block on it, trying to create 82a:

Treat the first block you see as legitimate → However, a few seconds later you may see 81b. If you see this, you keep an eye on it. If you later see 82b, the “longest chain rule” says that you should regard the longer ‘b’ chain as the valid one (…80, 81b, 82b) and ignore the shorter chain (…80, 81a). So you stop trying to make 82a and instead start trying to make 83b.

The longest chain rule states that : → If you see competing chains, treat the longest chain as legitimate.

The “longest chain rule” is the rule that the bitcoin blockchain ecosystem uses to resolve these conflicts, which are common in distributed networks. (Strictly speaking, bitcoin nodes follow the chain with the most accumulated proof of work, which is usually also the longest.)
However, in a more centralised or trusted blockchain network, you can make decisions by using a trusted or senior validator to arbitrate in these cases.
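To make the longest chain rule concrete, here is a small sketch that replays the 81a / 81b / 82b example. It is a toy model under stated assumptions: blocks are just labels, `parent_of` and `arrival_order` are invented bookkeeping structures, and chains are compared by block count rather than by accumulated work.

```python
def chain_of(tip, parent_of):
    """Walk back from a tip block to the start; return the chain, oldest first."""
    chain = []
    block = tip
    while block is not None:
        chain.append(block)
        block = parent_of.get(block)   # None once we run out of known parents
    return list(reversed(chain))


def best_tip(parent_of, arrival_order):
    """Pick the tip of the longest chain; on a tie, keep the tip seen first."""
    parents = set(parent_of.values())
    tips = [b for b in arrival_order if b not in parents]
    # max() returns the first maximal element, so earlier arrivals win ties.
    return max(tips, key=lambda b: len(chain_of(b, parent_of)))


# Two competing versions of block 81, both built on block 80.
parent_of = {"81a": "80", "81b": "80"}
arrival_order = ["80", "81a", "81b"]
print(best_tip(parent_of, arrival_order))    # 81a (seen first, same length)

# Someone extends the 'b' branch.
parent_of["82b"] = "81b"
arrival_order.append("82b")
print(best_tip(parent_of, arrival_order))    # 82b
print(chain_of("82b", parent_of))            # ['80', '81b', '82b']
```

Once 82b arrives, block 81a is left on the shorter branch and is simply ignored.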

UPGRADES: How does one change the rules?
As a network as a whole, you must agree up front what kind of data is valid to be passed around, and what is not. With bitcoin, there are technical rules for transactions (Have you filled in all the required data fields? Is it in the right format? etc.), and there are some business rules (Are you trying to spend more bitcoins than you have? Are you trying to spend the same bitcoins twice?).
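The two kinds of rule can be sketched as a single checking function. Everything here is hypothetical: the transaction format, the field names and the `coins_owned` lookup are invented for illustration and do not match bitcoin’s real transaction structure.

```python
REQUIRED_FIELDS = {"sender", "receiver", "amount", "coin_ids"}


def check_transaction(tx, coins_owned, spent_coin_ids):
    """Return a list of rule violations; an empty list means the transaction passes.

    tx             -- dict describing the payment (made-up format)
    coins_owned    -- dict: coin id -> (owner, value)
    spent_coin_ids -- set of coin ids already used by earlier transactions
    """
    # Technical rules: are all fields present and in the right format?
    missing = REQUIRED_FIELDS - set(tx)
    if missing:
        return ["missing fields: " + ", ".join(sorted(missing))]
    if not isinstance(tx["amount"], int) or tx["amount"] <= 0:
        return ["amount must be a positive integer"]

    errors = []
    available = 0
    for coin_id in tx["coin_ids"]:
        owner, value = coins_owned.get(coin_id, (None, 0))
        if owner != tx["sender"]:
            # Business rule: you can only spend coins that exist and are yours.
            errors.append(f"{coin_id} is not owned by {tx['sender']}")
        elif coin_id in spent_coin_ids:
            # Business rule: the same coins cannot be spent twice.
            errors.append(f"{coin_id} has already been spent")
        else:
            available += value
    if available < tx["amount"]:
        # Business rule: you cannot spend more than you have.
        errors.append("trying to spend more than you have")
    return errors


coins = {"c1": ("alice", 6), "c2": ("alice", 4)}
spent = {"c2"}

good = {"sender": "alice", "receiver": "bob", "amount": 5, "coin_ids": ["c1"]}
bad = {"sender": "alice", "receiver": "bob", "amount": 8, "coin_ids": ["c1", "c2"]}

print(check_transaction(good, coins, spent))   # []
print(check_transaction(bad, coins, spent))
```

The second call reports two problems: coin `c2` has already been spent, and the remaining coin is not enough to cover the payment.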

Rules change:- As these rules evolve over time, how will the network participants agree on the changes? Will there be a situation where half the network thinks a transaction is valid, and the other half thinks it isn’t, because of differences in logic?
In a private, controlled network where someone has control over upgrades, this is an easy problem to solve: “Everyone must upgrade to the new logic by 31 July”.
However, in a public, uncontrolled network, it is a harder problem.

With bitcoin, there are two parts to upgrades : →
1. Suggest the change (BIPs): First, there’s the proposal stage where improvements are proposed, discussed, and written up. A proposal is referred to as a “BIP”, a “Bitcoin Improvement Proposal”.
If it gets written into the Bitcoin Core software on GitHub, it can then form part of an upgrade: the next version of “Bitcoin Core”, which is the most common “reference implementation” of the protocol.

2. Adopt the change (miners): The upgrade can be downloaded by nodes and block makers (miners) and run, but only if they want to (you could imagine a change which cuts the mining reward to 0 BTC per block. We’ll see just how many miners choose to run that!).
If the majority of the network (in bitcoin, the majority is determined by computational power) chooses to run a new version of the software, then new-style blocks will be created faster than the minority can create old-style blocks, and the minority will be forced to switch or become irrelevant in a “blockchain fork”. So miners with lots of computational power have a good deal of “say” in what gets implemented.

WRITE ACCESS: How does one control who can write data?
In the bitcoin network, theoretically anyone can download or write some software and start validating transactions and creating blocks. Simply go to https://bitcoin.org/en/download and run the “Bitcoin Core” software.
Your computer will act as a full node, which means:
• Connecting to the bitcoin network
• Downloading the blockchain
• Storing the blockchain
• Listening for transactions
• Validating transactions
• Passing on valid transactions
• Listening for blocks
• Validating blocks
• Passing on valid blocks
• Creating blocks
• ‘Mining’ the blocks



Permissionless : →
Note that you don’t need to sign up, log in, or apply to join the network. You can just go ahead and take part. Compare this with the SWIFT network, where you can’t just download some software and start listening to SWIFT messages. In this way, some call bitcoin ‘permissionless’, whereas SWIFT would be ‘permissioned’.

Permissionless isn’t the only way, though. You may want to use blockchain technology in a trusted, private network. You may not want to publish all the rules of what a valid transaction or block looks like. You may want to control how the network rules are changed. It is easier to control a trusted private network than an untrusted, public free-for-all like bitcoin.

DEFENCE: How does one make it hard for baddies?
A problem with permissionless, or open, networks is that they can be attacked by anyone. So there needs to be a way of making the network as a whole trustworthy, even if specific actors aren’t.

What can and can’t miscreants do?
A dishonest miner can:
1. Refuse to relay valid transactions to other nodes
2. Attempt to create blocks that include or exclude specific transactions of his choosing
3. Attempt to create a ‘longer chain’ of blocks that makes previously accepted blocks become ‘orphans’ and not part of the main chain

He can’t:
1. Create bitcoins out of thin air*
2. Steal bitcoins from your account
3. Make payments on your behalf or pretend to be you
That’s a relief.
*Well, he can, but only his version of the ledger will have this transaction. Other nodes will reject it, which is why it’s important to confirm a transaction across a number of nodes.

With transactions, the effect a dishonest miner can have is very limited. If the rest of the network is honest, they will reject any invalid transactions coming from him, and they will hear about valid transactions from other honest nodes, even if he is refusing to pass them on.
With blocks, if the miscreant has sufficient block creation power (and this is what it all hinges on), he can delay your transaction by refusing to include it in his blocks. However, your transaction will still be known by other honest nodes as an ‘unconfirmed transaction’, and they will include it in their blocks.
Worse, though, is if the miscreant can create a longer chain of blocks than the rest of the network and invoke the “longest chain rule” to kick out the shorter chains. This lets him unwind a transaction.

Here is how such an attack works:
1. Create two payments with the same bitcoins: one to an online retailer, the other to yourself (another address you control)
2. Only broadcast the payment to the retailer
3. When the payment gets added to an honest block, the retailer sends you the goods
4. Secretly create a longer chain of blocks which excludes the payment to the retailer, and includes the payment to yourself
5. Publish the longer chain. If the other nodes are playing by the “longest chain rule”, then they will ignore the honest block with the retailer payment, and continue to build on your longer chain. The honest block is said to be ‘orphaned’ and does not exist to all intents and purposes.
6. The original payment to the retailer will be deemed invalid by the honest nodes because those bitcoins have already been spent (in your longer chain)

THE “DOUBLE SPEND” ATTACK : →
It is called a “double spend” because the same bitcoins were spent twice, but the second payment was the one that became part of the eventual blockchain, and the first one was eventually rejected.

How does one make it hard for dishonest miners to make blocks?
Remember, this is only a problem for ledgers where block-makers aren’t trusted. Essentially, you want to make it hard, or expensive, for baddies to add blocks. In bitcoin, this is done by making it computationally expensive to add blocks. Computationally expensive means “takes a lot of computer processing power” and translates to financially expensive (as computers need to be bought, then run and maintained).

The computation itself is a guessing game where block-makers need to guess a number which, when crunched with the rest of the block data, results in a hash / fingerprint that is smaller than a certain number. That number is related to the ‘difficulty’ of mining, which is related to the total network processing power. The more computers join in to process blocks, the harder it gets, in a self-regulating cycle.

Every 2,016 blocks (roughly every 2 weeks), the bitcoin network adjusts the difficulty of the game based on the speed at which the blocks have been created.
This game is called “proof of work”. By publishing the block with a fingerprint that is smaller than the target number, you are proving that you did enough guesswork to satisfy the network at that point in time.
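The guessing game and the periodic adjustment can be sketched as follows. This is a simplified model: the block is a plain string, the fingerprint is a single SHA-256, and `retarget` ignores details of the real implementation. The 600-second (10 minute) block interval and the limit of a factor of 4 per adjustment are bitcoin parameters that the text above does not spell out; the function names are invented.

```python
import hashlib


def block_hash(block_data, nonce):
    """Fingerprint of the block contents combined with a guessed number (nonce)."""
    payload = f"{block_data}|{nonce}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def mine(block_data, target):
    """Keep guessing nonces until the fingerprint is smaller than the target."""
    nonce = 0
    while block_hash(block_data, nonce) >= target:
        nonce += 1
    return nonce


def retarget(old_target, actual_seconds, blocks=2016, seconds_per_block=600):
    """Scale the target by how fast the last batch of blocks was found.

    Blocks found too quickly -> smaller target -> harder game.
    Each adjustment is limited to a factor of 4 in either direction.
    """
    expected = blocks * seconds_per_block   # two weeks, in seconds
    actual = min(max(actual_seconds, expected // 4), expected * 4)
    return old_target * actual // expected


# A deliberately easy target: roughly 1 guess in 2**16 succeeds.
target = 2 ** (256 - 16)
data = "block 81: alice -> bob: 5"
nonce = mine(data, target)
print(block_hash(data, nonce) < target)   # True: anyone can verify with one hash

# If the last 2,016 blocks took one week instead of two, the target halves.
print(retarget(target, 2016 * 300) == target // 2)   # True
```

Finding the nonce takes many guesses, but checking it takes a single hash, which is what makes the proof cheap for everyone else to verify.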

INCENTIVES: How does one pay validators?
Transaction and block validation is cheap and fast, unless you choose to make it slow and expensive (a la bitcoin).
If you control the validators in your own network, or they are trusted, then
• you don’t need to make it expensive to add blocks, and
• therefore you can reduce the need to incentivise them

You can use other methods like “We’ll pay people to run validators” or “People sign a contract to run validators and behave”.
Because of bitcoin’s ‘public’ structure, it needs a defence against miscreants, and so uses “proof of work” to make it computationally difficult to add a block (see the Defence section). This has created a cost (equipment and running costs) of mining and therefore a need for incentivisation.

Just as the price of gold determines how much you can spend on equipment for a gold mine, bitcoin’s price determines how much mining power is used to secure the network. The higher the price, the more mining there is, and the more a miscreant has to spend to bully the network. So, miners do lots of mining, increasing the difficulty and raising the walls against network attacks. They are rewarded in bitcoin according to a schedule, and in time, as the block rewards reduce, transaction fees become the incentive that miners collect.
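The reward schedule mentioned above can be written down in a few lines. The starting reward of 50 BTC and the halving every 210,000 blocks are bitcoin’s published parameters rather than something stated in this article; the function name is made up, and amounts are kept in satoshis (1 BTC = 100,000,000 satoshis) to avoid fractions.

```python
HALVING_INTERVAL = 210_000                 # blocks between reward halvings
INITIAL_REWARD_SATS = 50 * 100_000_000     # 50 BTC, expressed in satoshis


def block_reward_sats(height):
    """Block reward (excluding transaction fees) at a given block height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_REWARD_SATS >> halvings   # halve once per interval


for height in (0, 210_000, 420_000, 630_000):
    print(height, block_reward_sats(height) / 100_000_000)
# 0 50.0
# 210000 25.0
# 420000 12.5
# 630000 6.25
```

Because the reward keeps halving, it eventually rounds down to zero, at which point transaction fees are the only thing left for miners to collect.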

The idealised situation in Bitcoin is one where block rewards are replaced by transaction fees. This is all very well in theory, but the more you look into this, the more interesting it gets, and with the bitcoin solution, the incentives may not quite have worked as expected.

UTXO

What are UTXOs? UTXO stands for “Unspent Transaction Output”. The term refers to the amount of digital money someone has left after executing a cryptocurrency transaction such as a bitcoin payment. Each bitcoin transaction consumes existing UTXOs as its inputs and creates new UTXOs as its outputs, so UTXOs start and conclude every transaction. Although confirming a transaction removes the spent outputs from the UTXO database, a record of them still exists on the ledger.

Thus:

  • A UTXO is the amount of digital money remaining after a cryptocurrency transaction is executed.
  • UTXOs are processed continuously and are responsible for starting and ending each transaction.
  • When a transaction is completed, any unspent outputs are recorded in a database, where they can be used later as inputs for a new transaction.

How Does a UTXO Work?

UTXO transactions seem complicated, but they are really fairly simple. UTXOs, or unspent transaction outputs, are used in cryptocurrency transactions. They are the outputs that are left unspent after someone completes a transaction, similar to the change someone receives after paying with cash at a store.

The UTXO database starts out empty. As transactions multiply, the database becomes populated with change records from various transactions. When a transaction is completed and there are outputs that aren’t spent, they are recorded in the database, where they can be used later as inputs for a new transaction. However, a payment does not have to be funded from a single output. Instead, several unspent outputs can be combined to fulfil a spending request.

For example, a purchase worth 10 bitcoin may draw 6 BTC from one unspent output and 4 BTC from another. If the inputs add up to more than the purchase, the change is recorded in the UTXO database as a new output, to be spent at a later date.
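A purchase like the one above can be sketched with a dictionary standing in for the UTXO database. This is a minimal model under stated assumptions: the output ids, the `spend` function and the first-come coin selection are invented for illustration, whole numbers stand in for BTC amounts, and transaction fees are ignored. The inputs here are 6 and 5 rather than 6 and 4, so that there is some change to show.

```python
def spend(utxos, sender, receiver, amount):
    """Spend `amount` from the sender's unspent outputs; return the outputs created.

    utxos maps an output id to (owner, value). Spent outputs are removed, and
    the payment plus any change are added as fresh unspent outputs.
    """
    chosen, total = [], 0
    for out_id, (owner, value) in utxos.items():
        if owner == sender:
            chosen.append(out_id)
            total += value
            if total >= amount:
                break
    if total < amount:
        raise ValueError("insufficient funds")

    for out_id in chosen:
        del utxos[out_id]                  # these outputs are now spent

    new_id = "+".join(chosen)              # made-up id for the new transaction
    created = {f"{new_id}:0": (receiver, amount)}
    if total > amount:
        created[f"{new_id}:1"] = (sender, total - amount)   # change
    utxos.update(created)
    return created


utxos = {"a:0": ("alice", 6), "b:0": ("alice", 5), "c:0": ("bob", 3)}
created = spend(utxos, "alice", "shop", 10)
print(created)
# {'a:0+b:0:0': ('shop', 10), 'a:0+b:0:1': ('alice', 1)}
print(utxos)
# {'c:0': ('bob', 3), 'a:0+b:0:0': ('shop', 10), 'a:0+b:0:1': ('alice', 1)}
```

Alice’s two old outputs disappear from the database, the shop gains an output worth 10, and alice gets a new output worth 1 as change. The total value in the database is unchanged.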

CONCLUSION

It is useful to understand blockchains in the context of bitcoin, but you should not assume that all blockchain ecosystems need bitcoin mechanisms such as tokens, proof of work mining, the longest chain rule, etc.

Bitcoin is the first attempt at maintaining a decentralised, public ledger with no formal control or governance. Ethereum is the next iteration of a blockchain with smart contracts. There are significant challenges involved.

On the other hand, private or internal distributed ledgers and blockchains can be deployed to solve other sets of problems. As ever, there are tradeoffs and pros and cons to each solution, and you need to consider these individually for each use case.

If you have a specific business problem which you think may be solvable with a blockchain, I would love to hear about it: please contact me.
__________________________________________________________________

