This is the second article in a series aimed at demystifying blockchain for those who’ve heard the term and the industry optimism around it one too many times and are seeking to understand just what all the fuss is about. If you haven’t read part 1 yet, now is the perfect time to do so!
The previous article delved into the world of blockchain: what blockchain is and isn’t, how it functions and what it has enabled in terms of decentralized, self-regulating digital economic systems. None of that, however, may have explained why blockchain has suddenly become a major point of interest for many industries. So what explains this surge in industry interest?
1. Where Businesses Come In
The answer lies in separating the underlying blockchain network infrastructure from the applications that are deployed and running on it: since a cryptocurrency is ultimately a software program deployed on a blockchain’s network infrastructure, one may think of the possibility of deploying other programs on blockchain networks as well. These could take the form of code to carry out business functions such as handling records or tracking asset ownership, for example. This opens up a world of possibilities for the enterprise space, as any custom-developed business logic could theoretically be deployed on a blockchain network: you would simply need to pair up with the relevant parties, set up and configure the blockchain network and then develop and deploy your business logic code atop it.
The good news is such blockchain networks are not just theoretical and they do indeed exist: the cryptocurrency system Ethereum was the first to open its underlying blockchain network to independent developers allowing them to deploy their own custom applications atop the Ethereum blockchain. Such applications are referred to as ‘decentralized applications’, or dApps for short, a term popularized by Ethereum. Such a decentralized application at its core consists of a “smart contract”, i.e. code that executes transactions and handles asset ownership amongst participants. In this regard, Ethereum and other cryptocurrencies themselves can be viewed as currency-centric smart contracts deployed on blockchain networks.
Deploying and running a dApp is not free, as nodes on the Ethereum network must maintain it. This cost takes the form of a volatile ‘gas price’ that changes with network congestion and other parameters. The gas fee is borne by transacting parties for each transaction, whereas the developer of the dApp incurs a one-time deployment fee. While dApps can implement their own currency for internal transactions, the gas fee must be paid in Ether. Thus, all transactions on the Ethereum blockchain necessitate the use of Ether.
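To make the fee arithmetic concrete, here is a minimal sketch under a simplified model: the fee is the amount of gas a transaction consumes multiplied by the price per unit of gas. The function name and the two gas prices are made-up illustrations; only the unit conversions and the 21,000 gas cost of a plain Ether transfer are standard Ethereum figures.

```python
WEI_PER_GWEI = 10**9      # gas prices are usually quoted in gwei
WEI_PER_ETHER = 10**18    # 1 Ether = 10^18 wei


def transaction_fee_wei(gas_used: int, gas_price_gwei: int) -> int:
    """Simplified model: fee (in wei) = gas consumed x price per unit of gas."""
    return gas_used * gas_price_gwei * WEI_PER_GWEI


# Hypothetical example: the same plain transfer (21,000 gas) on a quiet
# network and on a congested one. The gas prices are invented values.
quiet_fee = transaction_fee_wei(21_000, 20)
busy_fee = transaction_fee_wei(21_000, 100)

print(quiet_fee / WEI_PER_ETHER, "ETH on a quiet network")
print(busy_fee / WEI_PER_ETHER, "ETH on a congested network")
```

The work done is identical in both cases; only the price per unit of gas differs, which is why the cost of using a dApp is hard to predict in advance.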
Ethereum may have proudly ushered in the ‘Cryptocurrency 2.0’ Era, wherein cryptocurrencies are no longer just economic systems but rather decentralized platforms on which applications may be deployed, but its setup is evidently not the best fit for all enterprise usecases: industry participants wouldn’t want to be forced into using a cryptocoin and would probably have a host of other constraints making public networks such as Ethereum less than ideal candidates for their business-critical operational functions.
2. Bringing Blockchain to the Enterprise
A viable blockchain system for the enterprise space then should fulfill the following criteria:
1. Stay Permissioned
Unlike permissionless networks such as those of Ethereum and Bitcoin, an enterprise network must permit only authorized parties to join and participate in the network and must provide a mechanism for authorizing and verifying membership
2. No mandatory coins or assets
An enterprise-grade blockchain system must ideally not be based on digital coins or other assets that need to be obtained, stored and managed by business participants and should provide a means for transactions and consensus regardless
3. Infrastructure limitations
An enterprise system should not require business participants to run and manage large amounts of infrastructure to maintain the network, which would add significant overhead in terms of networking and infrastructure management expertise
4. Facilitate Custom Business Logic
A blockchain network system targeting business usecases must facilitate the deployment of any custom business logic code on the network along with providing simple mechanisms to upgrade and manage said code
3. The Hyperledger Fabric Blockchain Framework
Hyperledger Fabric is one such open source, enterprise-grade, permissioned blockchain framework that does not rely on any inherent currencies or assets. It’s also available as a managed cloud offering via the Oracle Blockchain Platform and the IBM Blockchain Platform, and users opting for these are spared most infrastructure-related concerns.
Elaborating on the name: Fabric is just one framework within the Hyperledger project. The Hyperledger project itself is an effort to build enterprise focused blockchain frameworks and is maintained by the Linux Foundation. Fabric happens to be the first and thus the oldest framework under this umbrella. Initially developed by IBM, Fabric was donated to the Hyperledger initiative for future development and upkeep.
Fabric has since gained significant popularity thanks in large part to the cloud offerings mentioned earlier that are based entirely on it. This warrants an in-depth look at the Hyperledger Fabric framework, so let’s dive in:
3.1 Deployment Architecture Details
1. Participants on a Hyperledger Fabric based blockchain network work together to validate transactions and to make changes to the blockchain ledger
2. Organizations on a blockchain network may wish to keep certain transactions private within a subset of members on the network, and can commission channels to do so:
- Each channel maintains its own blockchain ledger private to only the channel’s participants
- Nodes part of multiple channels/networks maintain multiple exclusive ledgers
- Participants on a channel may further wish to keep specific transactions (rather than entire ledgers) private within just a subset of members on the channel, and can commission a Private Data Collection (PDC) to do so:
i. Members in a PDC can independently verify and commit new transactions, while sharing only a hash of the transaction data with the remaining channel members
ii. The shared hash enables easy verification of a transaction’s time and data by all members of the channel while allowing only the PDC members to actually see the data
iii. The actual data is shared between PDC members via a gossip protocol and stored by them in a private database, dubbed a SideDB
3. Participants’ identities and their memberships of networks, channels etc. are established by verifying a digital, cryptographic X.509 certificate:
- This certificate may be issued by a well-known Root Certificate Authority (CA) or Intermediate CA (such as Symantec, GoDaddy, etc.) or via the built-in Fabric-CA
- A Membership Service Provider (MSP) identifies which CAs are authorized to issue certificates and further identifies the specific roles of participants (such as admins, members, etc.) and their access privileges (readers, writers, etc.)
4. Each channel has its own independent smart contract code deployed on it, referred to as chaincode in HLF (the equivalent of smart contracts in Ethereum) and most commonly written in Go (Node.js and Java chaincode are also supported)
5. In addition to a blockchain ledger, each channel maintains a corresponding world state database to keep track of the latest values of assets tracked by the ledger
All these functions are carried out by multiple nodes, each fulfilling a distinct function and working together to keep the network operational. These nodes are described below:
3.2 Hyperledger Fabric Nodes
Each member on a blockchain network runs several nodes, with each fulfilling a different role:
1. Peer nodes
- Endorsing peer nodes
2. Ordering Service nodes
3. Fabric-CA node
These nodes are detailed below:
3.2.1 Peer Nodes
1. Peer nodes carry out three major functions:
i. Store a copy of the blockchain ledger
ii. Install and activate (instantiate) chaincode
iii. Participate in transaction verification
2. An Endorsing peer node is a special type of peer node:
- These are the first to receive proposals for new transactions
- Endorsing peers simulate the results of these proposed transactions on their local ledgers via the installed chaincode
- If all goes well, they generate an endorsed transaction proposal which is then sent to the ordering service
3. Once the endorsed transactions have been ordered into blocks by the ordering service, peer nodes validate each transaction and subsequently append the new blocks to their ledgers
4. Transactions are validated for the appropriate endorsements & to confirm that any proposed changes haven’t already been invalidated by more recent transactions
5. If invalidated transactions are found, they’re marked as such before the block is committed to the ledger and they do not affect the world state
3.2.2 Certificate Authority Node
- This node issues cryptographic identity certificates to participants on the network
- These certificates may then be used by participants to identify themselves during transactions
- The use of this node is optional and an external third-party root CA or Intermediate CA may be used instead
- Certificates are a vital part of the Membership Service Provider (MSP) which uses them to ensure that participants are duly authorized
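The relationship between certificates and the MSP can be sketched roughly as follows. This is a toy model, not Fabric's MSP implementation: the `Certificate` class is a stand-in that only records who issued it (a real X.509 certificate carries keys and a signature chain that must be cryptographically verified), and every name, CA and action below is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    """Stand-in for an X.509 certificate (no keys or signatures here)."""
    subject: str
    issuer: str
    role: str  # e.g. "admin" or "member"


@dataclass
class MembershipServiceProvider:
    """Toy MSP: knows which CAs are trusted and which actions need an admin."""
    trusted_cas: set
    admin_only_actions: set

    def is_member(self, cert: Certificate) -> bool:
        # Assumption: comparing issuer names replaces real chain validation.
        return cert.issuer in self.trusted_cas

    def authorize(self, cert: Certificate, action: str) -> bool:
        if not self.is_member(cert):
            return False
        if action in self.admin_only_actions:
            return cert.role == "admin"
        return True


msp = MembershipServiceProvider(
    trusted_cas={"ca.org1.example.com"},
    admin_only_actions={"install_chaincode"},
)
alice = Certificate("alice", "ca.org1.example.com", "admin")
bob = Certificate("bob", "ca.org1.example.com", "member")
mallory = Certificate("mallory", "ca.unknown.example.com", "admin")

for cert in (alice, bob, mallory):
    print(cert.subject, msp.authorize(cert, "install_chaincode"))
```

The point of the sketch is the division of labour: the CA only issues identities, while the MSP decides which issuers count and what each role is allowed to do.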
3.2.3 Ordering Service Nodes
- The ordering service accepts endorsed transactions and orders them into blocks which are then broadcast to all committing peer nodes for addition to their blockchain ledgers
- The ordering service makes no judgement as to the validity of transactions and simply orders them into blocks which are then verified by peer nodes
- HLF offers a choice of three ordering services:
1. Solo Ordering Service:
As the name implies, this ordering service consists of a single node and thus can never be fault tolerant. Fault tolerance is a crucial requirement of any blockchain network (read all about it in Part 1), let alone one targeted at enterprise usecases. Why does this option even exist then? For simplicity: use a solo ordering service for quick Proof-of-Concept (PoC) development activities. However, if this PoC will eventually move to production, it would be better to make use of one of the following ordering services, which can be made up of a single node initially and scaled up when it’s time to go to production. NEVER use a single-node ordering service in production though, you have been warned!
2. Raft Ordering Service:
Raft is the native, go-to ordering service of HLF and works via a leader and follower model: one node in the cluster is dynamically elected the leader and carries out the actual ordering, while follower nodes replicate the leader’s results. Follower nodes listen for periodic “heartbeat” messages from the leader to ascertain that it’s online; if no heartbeat arrives within a predefined timeout, they begin the process of electing a new leader. Raft is “Crash Fault Tolerant” and can withstand the loss of a minority of nodes while staying functional (i.e. if you have 5 nodes, a minimum of 3 must be online). Lastly, Raft is also touted as HLF’s first step towards Byzantine Fault Tolerance (refer to part 1).
3. Apache Kafka based Ordering Service:
Apache Kafka clusters may be configured and used to provide an ordering service along with a ZooKeeper ensemble for the administration of this cluster. Kafka follows a similar leader & follower model to Raft, but with significant additional administrative overhead to its deployment. Not a native part of HLF, users opting for this are presumed to have prior expertise with the deployment and administration of Kafka clusters. Designed for CFT within tight groups, Kafka & ZooKeeper are not designed to run across large networks either. Kafka is notorious for deployment headaches and Raft should be the preferred choice.
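The majority rule behind Raft's crash fault tolerance, described above, comes down to simple arithmetic. The sketch below only illustrates that rule; the function names are made up and it is not part of any Fabric or Raft library.

```python
def raft_quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_crashes(cluster_size: int) -> int:
    """How many nodes can be lost while a majority remains online."""
    return cluster_size - raft_quorum(cluster_size)


def can_order(cluster_size: int, nodes_online: int) -> bool:
    """The ordering service keeps working only while a majority is online."""
    return nodes_online >= raft_quorum(cluster_size)


for size in (1, 3, 5, 7):
    print(f"{size} orderers: quorum {raft_quorum(size)}, "
          f"survives {tolerated_crashes(size)} crash(es)")
```

Note that a single-node cluster tolerates zero crashes, which is the arithmetic behind the warning against single-node ordering services in production, and that going from 3 to 4 nodes buys no extra tolerance, which is why odd cluster sizes are the usual choice.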
3.3 Data Stores of Hyperledger Fabric
A Hyperledger Fabric network consists of two primary and one optional data store:
1. The World State Database
2. The Blockchain itself
3. Private Data Collections, aka PDCs (optional)
The world state database and the blockchain together make up the complete ledger and though they are related, they are very distinct:
3.3.1 The World State Database
The world state database holds the current values of assets on the blockchain. For example, in a blockchain network used to track car ownership, this database would hold the name of the current owner for a given vehicle. The world state database holds this data in key-value pairs. The HLF default, LevelDB, doesn’t support rich queries against the stored values (it is limited to key-based lookups and range queries); HLF may be configured to use CouchDB instead for this functionality, provided the values are stored as JSON.
The world state database is extremely useful because programs will often only require the current value of assets and the ledger state, and these requests are all fulfilled quickly via the world state database instead of traversing the ledger.
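The benefit can be illustrated with a small sketch using the car ownership example. It assumes a drastically simplified ledger (a plain list of key-value updates) and a plain dictionary as the world state; the names `ledger`, `world_state`, `commit` and `latest_by_traversal` are purely illustrative.

```python
ledger = []        # append-only history of every update
world_state = {}   # current value per key


def commit(key: str, value: str) -> None:
    """Record the update in the history and refresh the current value."""
    ledger.append((key, value))
    world_state[key] = value


def latest_by_traversal(key: str) -> str:
    """The slow alternative: walk the history backwards until the key appears."""
    for past_key, past_value in reversed(ledger):
        if past_key == key:
            return past_value
    raise KeyError(key)


commit("CAR1", "Alice")
commit("CAR2", "Carol")
commit("CAR1", "Bob")     # Alice sells CAR1 to Bob

print(world_state["CAR1"])          # direct key lookup
print(latest_by_traversal("CAR1"))  # same answer, found by scanning history
```

Both calls return the same owner, but the dictionary lookup does not get slower as the history grows, whereas the traversal does.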
3.3.2 The Blockchain
The blockchain itself is stored on every peer node that’s part of the network and consists of blocks which in turn each contain a certain number of transaction records. Blocks have a block header which holds the block number, the root hash value of all transactions on the block and the hash of the previous block, thus linking all blocks into a chain. In addition to the transactions (which make up the body) and the header, the block also holds metadata that specifies when the block was written along with the identifying information of the block endorsers and validators.
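The blockchain itself is immutable. The world state database is derived from it: each valid transaction committed to the chain updates the world state, which then serves the latest values without the chain having to be traversed for every query.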
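How the header fields link blocks together, and why that makes tampering evident, can be sketched as below. This is an illustration only, not Fabric's actual block format: it hashes a JSON serialization with SHA-256, uses one flat hash over the transactions rather than a hash tree, and all function and field names are assumptions.

```python
import hashlib
import json


def sha256_hex(obj) -> str:
    """Hash any JSON-serialisable object deterministically."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Build a header that commits to the transactions and to the previous block."""
    previous_hash = sha256_hex(chain[-1]["header"]) if chain else None
    header = {
        "number": len(chain),
        "data_hash": sha256_hex(transactions),
        "previous_hash": previous_hash,
    }
    chain.append({"header": header, "transactions": transactions})


def chain_is_valid(chain: list) -> bool:
    """Recompute every hash and compare it with what the headers claim."""
    for index, block in enumerate(chain):
        if block["header"]["data_hash"] != sha256_hex(block["transactions"]):
            return False
        if index > 0 and block["header"]["previous_hash"] != sha256_hex(chain[index - 1]["header"]):
            return False
    return True


chain = []
append_block(chain, [{"car": "CAR1", "owner": "Alice"}])
append_block(chain, [{"car": "CAR1", "owner": "Bob"}])
print(chain_is_valid(chain))

chain[0]["transactions"][0]["owner"] = "Mallory"  # rewrite history
print(chain_is_valid(chain))
```

After the tampering the check fails because block 0's data hash no longer matches its transactions. Had the attacker also recomputed that data hash, block 0's header would change and block 1's stored previous hash would no longer match, so every later block would have to be rewritten too.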
3.3.3 Private Data Collections (PDCs)
A Private Data Collection refers to a subgroup of participants on a channel that need to keep certain transactions private amongst them. While channels are the primary route for ensuring privacy and separation of concerns, a Private Data Collection is the preferred route when only some transactions need to be kept private between a subset of members rather than entire ledgers. This has two advantages:
1. Avoiding the additional administrative overhead involved in setting up new channels for every communication that must be kept private between a group of participants
2. All participants on a channel are made aware of a transaction while the actual data itself is kept private within a subgroup, thus aiding future verification during conflicts
For this reason, a PDC is a collection of two elements:
1. The actual data which will be kept private within the subgroup
- This data will not go through an ordering service, but will be communicated via a gossip protocol between the peers in the subgroup and stored in a SideDB
2. A hash of the data which will be shared with all members on the channel
- All members on the channel will receive a hash of the data which serves as a proof of the transaction and can be used for audit purposes
- This hash will go through an ordering service and be visible to all members
If a dispute occurs later the collection members can choose to share their data with a third party, which can then compute the hash and compare it with the hash stored in the main channel state, thus proving that the transaction involving the data did indeed occur between the collection members at that point in time.
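The dispute scenario above can be sketched in a few lines. The sketch assumes SHA-256 over a canonical JSON serialization as the hashing scheme and models the SideDB and the channel state as plain dictionaries; these choices and all names are illustrative, not Fabric's internal format.

```python
import hashlib
import json

side_db = {}        # private data, held only by collection members
channel_state = {}  # hashes, visible to every member of the channel


def private_data_hash(private_data: dict) -> str:
    """Hash a canonical serialization so equal data always gives an equal hash."""
    canonical = json.dumps(private_data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def commit_private(key: str, private_data: dict) -> None:
    """Members keep the data; the rest of the channel only sees its hash."""
    side_db[key] = private_data
    channel_state[key] = private_data_hash(private_data)


def auditor_verifies(key: str, disclosed_data: dict) -> bool:
    """A third party recomputes the hash and compares it with the channel's copy."""
    return channel_state.get(key) == private_data_hash(disclosed_data)


commit_private("deal-42", {"buyer": "Org1", "seller": "Org2", "price": 950})

print(auditor_verifies("deal-42", side_db["deal-42"]))
print(auditor_verifies("deal-42", {"buyer": "Org1", "seller": "Org2", "price": 900}))
```

The first check succeeds and the second fails: a collection member cannot later present altered data, because it would no longer match the hash everyone on the channel already holds.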
3.4 The Life-cycle of a Transaction
Updates to the ledger occur in three phases: endorsement, ordering and finally commitment to the blockchain ledger on each peer node. These phases are detailed below:
3.4.1 Proposal & Endorsement
- A front facing client application begins the process by sending a transaction proposal to several endorsing peers via the Fabric-SDK (Node & Java are native SDKs, currently)
- Endorsing nodes receive the transaction proposal and simulate the proposed transaction on their local ledgers, and all going well generate a ledger update proposal
- Do note that the endorsing peers do NOT actually update their ledgers at this stage
- The now endorsed transaction proposal is returned to the client application
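The endorsement phase described above can be sketched as a simulation. This is not the Fabric SDK: the chaincode is an ordinary Python function, endorsers return unsigned dictionaries where real peers would return signed responses, and the "at least N matching responses" rule stands in for an endorsement policy. All names are hypothetical.

```python
from dataclasses import dataclass


def transfer_car(state: dict, car_id: str, new_owner: str):
    """Hypothetical chaincode: simulate the change, never apply it."""
    _current_owner, version = state[car_id]
    read_set = {car_id: version}      # what was read, and at which version
    write_set = {car_id: new_owner}   # what would be written if committed
    return read_set, write_set


@dataclass
class EndorsingPeer:
    name: str
    state: dict  # key -> (value, version)

    def endorse(self, car_id: str, new_owner: str) -> dict:
        read_set, write_set = transfer_car(self.state, car_id, new_owner)
        return {"endorser": self.name, "read_set": read_set, "write_set": write_set}


def collect_endorsements(peers, car_id: str, new_owner: str, required: int) -> dict:
    """Client side: gather responses and check that enough of them agree."""
    responses = [peer.endorse(car_id, new_owner) for peer in peers]
    first = responses[0]
    matching = [r for r in responses
                if r["read_set"] == first["read_set"] and r["write_set"] == first["write_set"]]
    if len(matching) < required:
        raise ValueError("endorsement policy not satisfied")
    return {
        "read_set": first["read_set"],
        "write_set": first["write_set"],
        "endorsers": [r["endorser"] for r in matching],
    }


peers = [
    EndorsingPeer("peer0.org1", {"CAR1": ("Alice", 1)}),
    EndorsingPeer("peer0.org2", {"CAR1": ("Alice", 1)}),
]
endorsed_tx = collect_endorsements(peers, "CAR1", "Bob", required=2)
print(endorsed_tx)
print(peers[0].state)  # still shows Alice: nothing was committed
```

The read set records the version each endorser saw, which is what later allows committing peers to detect that a proposal has gone stale.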
3.4.2 Ordering
- In this phase, the client applications submit the endorsed transaction proposals to an ordering service, whose role is to appropriately order these transactions into blocks
- The sequence of transactions in a block may differ from the order in which they were received & the number of transactions in a block can be changed by the orderer admin
- However, once a block is generated by the orderer, the sequence remains immutable and all committing peers must append the block as-is to their local ledgers
- This finality in the ordering of transactions in a block prevents the formation of forked chains that must eventually be resolved (refer to the PoW process detailed in part 1)
- Nodes that form a part of the ordering service do NOT execute smart contracts and in fact make no judgement whatsoever as to the actual content or actions of the proposed transactions, instead leaving that role to the committing peer
- Generated blocks are now sent to peers for validation and subsequent addition to their blockchain ledgers
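The ordering step can be sketched as simple batching. The sketch assumes a single limit on the number of transactions per block (the hypothetical `max_message_count` parameter) and keeps transactions in arrival order for simplicity; a real ordering service may also cut a block after a timeout or a size limit.

```python
def cut_blocks(transactions: list, max_message_count: int, first_block_number: int = 0) -> list:
    """Group already-endorsed transactions into numbered blocks.

    The transactions are treated as opaque items: nothing here inspects or
    executes them, mirroring the fact that orderers pass no judgement on content.
    """
    if max_message_count < 1:
        raise ValueError("max_message_count must be at least 1")
    blocks = []
    for start in range(0, len(transactions), max_message_count):
        blocks.append({
            "number": first_block_number + len(blocks),
            "transactions": transactions[start:start + max_message_count],
        })
    return blocks


pending = ["tx-a", "tx-b", "tx-c", "tx-d", "tx-e"]
for block in cut_blocks(pending, max_message_count=2):
    print(block)
```

Once a block has been cut, its contents and position are fixed, which is what gives every peer the same sequence to validate.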
3.4.3 Transaction Commitment
- This is the final stage wherein peers receive and validate newly generated blocks of appropriately ordered, endorsed transactions and finally commit them to their ledgers
- Each peer validates each transaction for the appropriate endorsement & to confirm that any proposed changes haven’t already been invalidated by more recent transactions
- If invalidated transactions are found, they’re marked as such before the block is committed to the ledger and they do not affect the world state
- Though validation is carried out by each peer individually, the process is still distributed in nature as each validating peer checks for the same consistencies in endorsement & transaction validity
- Peers that were offline during this process can receive the blocks they’ve missed by connecting to an ordering service node upon returning online, or by gossiping with other peers via the appropriately named Gossip protocol
- With the ledger successfully updated the process is now complete!
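The validation carried out by each committing peer can be sketched as follows. This is a simplified model under stated assumptions: a counter stands in for the version of each key, the endorsement policy is reduced to a minimum number of distinct endorsers, and the function and field names are hypothetical.

```python
def validate_and_commit(block: dict, world_state: dict, required_endorsements: int) -> list:
    """Validate each transaction in block order and apply only the valid ones.

    world_state maps key -> (value, version). Returns one validity flag per
    transaction; invalid transactions stay in the block but change nothing.
    """
    flags = []
    for tx in block["transactions"]:
        endorsed = len(set(tx["endorsers"])) >= required_endorsements
        # Stale if any key read during endorsement has since changed version.
        fresh = all(world_state.get(key, (None, 0))[1] == version
                    for key, version in tx["read_set"].items())
        valid = endorsed and fresh
        flags.append(valid)
        if valid:
            for key, value in tx["write_set"].items():
                _old_value, version = world_state.get(key, (None, 0))
                world_state[key] = (value, version + 1)
    return flags


state = {"CAR1": ("Alice", 1)}
block = {"number": 5, "transactions": [
    # Both proposals were endorsed while CAR1 was still at version 1.
    {"endorsers": ["peer0.org1", "peer0.org2"], "read_set": {"CAR1": 1}, "write_set": {"CAR1": "Bob"}},
    {"endorsers": ["peer0.org1", "peer0.org2"], "read_set": {"CAR1": 1}, "write_set": {"CAR1": "Carol"}},
]}

print(validate_and_commit(block, state, required_endorsements=2))
print(state)
```

The first transfer is valid and bumps CAR1 to version 2, so the second, which was simulated against version 1, is flagged invalid and leaves the world state untouched. Because every peer runs the same deterministic checks on the same block, they all reach the same result independently.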
4. Concluding Our Discussion: What We’ve Seen So Far
The first part of this series was aimed at shedding light on the innards of blockchain and its consensus mechanisms. To do so, we recruited the help of cryptocurrencies: after all, what better way to learn of a technology than through its most successful application? We thus explored the underlying nature of blockchain, its accompanying consensus mechanisms and the fault tolerant operations they enable.
Now at the end of part 2, we’ve explored what makes blockchain appealing to enterprise users and what specific features a blockchain network would need to provide to further fan this appeal. In doing so, we’ve explored an enterprise grade blockchain framework, the Hyperledger Fabric framework, in considerable depth. We’ve explored its architectural details, the nodes & datastores that form its operational backbone and have looked at the typical lifecycle of a transaction. We’re thus at an ideal point to conclude our discussion for now:
- Enterprise interest towards blockchain technology stems from the possibility of deploying any decentralized application (dApp) atop it
- These decentralized applications could, for instance, handle asset transfers or track transactions and services, such as goods through a supply chain
- The core of a dApp is code in the form of a smart contract
- Ethereum was the first cryptocurrency to open up its underlying blockchain for application developers, and in doing so ushered in the Cryptocurrency 2.0 movement
- Transactions on the Ethereum blockchain incur a gas fee that must be paid to the underlying network in the native token Ether
- That fact in addition to the public nature of the Ethereum blockchain network could make enterprise users hesitant to deploy their business-critical applications atop it
- Ideally an enterprise-focused blockchain network must not necessitate the use of currencies, and in addition should allow for easy yet powerful identity and code management and as a bonus provide means to work around infrastructure hurdles
- Hyperledger Fabric is an enterprise-grade, permissioned blockchain framework that allows for easy deployment of custom-logic in the form of chaincode (smart contracts) atop it and does not involve any native cryptocurrency-based operations
- HLF is further available in cloud-based offerings from Oracle & IBM, mitigating infrastructure concerns for opting users
- HLF networks are comprised of nodes that work together to endorse, order and verify transaction requests (in that order)
- HLF provides a choice of three ordering services: solo, Raft, and Apache Kafka
- Solo should only be used for quick PoC work and even then, a single-node based Raft or Kafka orderer is preferable if future production deployment is on the cards
- Raft is the native orderer and is crash fault tolerant: a Raft-based ordering service may continue to function as long as a majority of its nodes are online
- Raft is also the first step towards Byzantine fault tolerance for HLF
- Kafka is notorious for deployment complexities and involves significant administrative overhead: the user is presumed to have prior expertise with Kafka if deploying it
- Both Kafka and Raft follow a leader-follower model
- A membership service provider relies on identity certificates to ensure that all participants are duly authorized for the actions they’re attempting to perform
- These X.509 cryptographic identity certificates may be obtained via external Certificate Authorities or via the default Fabric-CA
- Channels are the primary means of privacy and separation of concern on an HLF-based blockchain network
- All transactions within a channel are private to the members of that channel alone
- Participants on a channel may further wish to keep certain transactions (as opposed to entire ledgers) private between them and can commission Private Data Collections (PDCs) to do so
- Data within a PDC is stored in a separate SideDB and is exchanged directly between peers via a gossip protocol, i.e., it does not go through the channel’s ordering service
- A hash of the data is shared with the rest of the channel, which can then be used to verify the time and contents of a transaction if a dispute arises in the future
- Three datastores exist within HLF: a world state database, the blockchain ledger and an optional SideDB for PDCs
- The blockchain ledger is immutable and stores a record of all prior transactions
- The world state database is updated as transactions are committed to the ledger and so keeps track of the latest values of assets; queries are quickly fulfilled via this world state database
- Data in the world state database is stored in key-value pairs with LevelDB serving as the default world state database
- LevelDB however does not support a JSON-rich query syntax and if this support is desired CouchDB can be used instead
- A front-end application needs to use the Fabric SDK to interact with the network
- Official Node and Java SDKs are available along with unofficial Python, Go and REST SDKs
- Chaincode in HLF is most commonly written in Go, with Node.js and Java also supported