Hyperledger Fabric Private Data Collections
Hyperledger Fabric took a big step forward in version 1.2 by adding Private Data Collections. The concept makes a lot of sense: creating a new channel every time a few organizations want to exchange private data within a larger channel quickly becomes an administrative nightmare.
Private Data Collections
Private data collections allow a portion of the data on a channel to be kept private, so that only a subset of the channel’s organizations can view it.
Update 10/26/18
David Enyeart (a maintainer of the Hyperledger Fabric project) at IBM reached out to me with a clarification regarding the issue I found in the first version of this article:
Unfortunately, the initial community sample did not demonstrate use of transient field for keeping data private — Dave Enyeart
Dave said the latest 1.3 documentation does a better job at specifying the transient field as shown below:
A special field in the chaincode proposal called the transient field can be used to pass private data from the client (or data that chaincode will use to generate private data), to chaincode invocation on the peer.
This article is now about what happens when you don’t use the transient field. Thanks Dave!
How I Found the Issue
I have been building a database adapter to store Hyperledger private data in any standard database, e.g. MySQL, MongoDB, Oracle or Postgres.
Part of ensuring the security of my database adapter is inspecting how Hyperledger stores and purges the private data in the first place.
BlockToLive
Each collection defined in collections_config.json has a blockToLive property that specifies how many blocks the private data lives for before it is purged.
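To make the shape of that file concrete, here is a minimal Go sketch that builds two collection entries and prints them as JSON. The collection names, policies and peer counts are assumptions modeled on the marbles sample, not values taken from this article; only the blockToLive of 3 matches the setup described here. In Fabric, a blockToLive of 0 means the data is never purged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Collection mirrors one entry of collections_config.json.
// Field names follow the JSON keys used by Fabric 1.2/1.3.
type Collection struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
}

// sampleCollections returns illustrative entries (assumed values):
// one collection shared by both orgs, and one that only Org1 may
// read, purged after 3 blocks.
func sampleCollections() []Collection {
	return []Collection{
		{
			Name:              "collectionMarbles",
			Policy:            "OR('Org1MSP.member', 'Org2MSP.member')",
			RequiredPeerCount: 0,
			MaxPeerCount:      3,
			BlockToLive:       1000000,
		},
		{
			Name:              "collectionMarblePrivateDetails",
			Policy:            "OR('Org1MSP.member')",
			RequiredPeerCount: 0,
			MaxPeerCount:      3,
			BlockToLive:       3,
		},
	}
}

func main() {
	out, err := json.MarshalIndent(sampleCollections(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The policy string is what decides which organizations may hold the private data, so in this sketch Org2 is excluded from the second collection.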
The Purge
Purging does work in the sense that the data can no longer be queried once blockToLive more blocks have been added, 3 blocks in this case.
Here is an example of a private data query done pre-purge.
Here is the same private data query done post-purge.
The Problem
The problem is that, although the private data is purged, if you don’t invoke the chaincode with the transient field, the private data supplied at initialization (a marble in this case, with a price of $0.99) is actually written to ALL of the blockfiles, even on peers of organizations that were specifically NOT supposed to have ANY access to it. The collections_config.json that specifies the blockToLive property also specifies the policy that indicates which organizations can access the private data.
The private data was created by the initMarble function in the Golang file marbles_chaincode_private.go.
Here is the marblePrivateDetails struct from marbles_chaincode_private.go, where it is clear that the Price integer is private.
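For comparison, here is a hedged sketch of what reading the price from the transient field could look like on the chaincode side. It is not the sample's actual code. The struct fields are assumed to mirror the sample's marblePrivateDetails, the "price" transient key, the collection name, storePrice and fakeStub are hypothetical, and privateStub is a cut-down stand-in for the real chaincode stub. GetTransient and PutPrivateData use the same signatures as the Fabric shim's stub interface.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// privateStub is the small subset of the chaincode stub used here.
type privateStub interface {
	GetTransient() (map[string][]byte, error)
	PutPrivateData(collection string, key string, value []byte) error
}

// marblePrivateDetails is assumed to mirror the struct in the sample.
type marblePrivateDetails struct {
	ObjectType string `json:"docType"`
	Name       string `json:"name"`
	Price      int    `json:"price"`
}

// storePrice takes the price from the transient map instead of the
// regular invocation arguments, so it is not part of the transaction
// that is written to every peer's blockfile.
func storePrice(stub privateStub, name string) error {
	transient, err := stub.GetTransient()
	if err != nil {
		return err
	}
	raw, ok := transient["price"] // hypothetical key name
	if !ok {
		return errors.New("price must be passed in the transient field")
	}
	var price int
	if err := json.Unmarshal(raw, &price); err != nil {
		return fmt.Errorf("invalid price: %v", err)
	}
	value, err := json.Marshal(marblePrivateDetails{
		ObjectType: "marblePrivateDetails",
		Name:       name,
		Price:      price,
	})
	if err != nil {
		return err
	}
	return stub.PutPrivateData("collectionMarblePrivateDetails", name, value)
}

// fakeStub is an in-memory stand-in so the sketch runs without a peer.
type fakeStub struct {
	transient map[string][]byte
	private   map[string][]byte
}

func (f *fakeStub) GetTransient() (map[string][]byte, error) {
	return f.transient, nil
}

func (f *fakeStub) PutPrivateData(collection, key string, value []byte) error {
	f.private[collection+"/"+key] = value
	return nil
}

func main() {
	stub := &fakeStub{
		transient: map[string][]byte{"price": []byte("99")},
		private:   map[string][]byte{},
	}
	if err := storePrice(stub, "marble1"); err != nil {
		panic(err)
	}
	fmt.Println(string(stub.private["collectionMarblePrivateDetails/marble1"]))
}
```

Rejecting the call when the transient key is missing is a deliberate choice in this sketch: it stops a client from quietly falling back to plain arguments, which is exactly the case that leaks.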
Here is the post-purge blockfile for a peer in Org2, where you can see the private data that should NEVER have been available to Org2, i.e. neither pre- nor post-purge.
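This kind of check is easy to script. Below is a minimal Go sketch, not the tooling used for this article, that reads a blockfile and reports which known plaintext values appear in it. findLeaks and the command-line usage are hypothetical. It only catches values stored as raw, unencoded bytes, which is how plain string arguments tend to show up.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// findLeaks returns every needle that occurs verbatim in data.
func findLeaks(data []byte, needles []string) []string {
	var found []string
	for _, n := range needles {
		if bytes.Contains(data, []byte(n)) {
			found = append(found, n)
		}
	}
	return found
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: leakscan <blockfile> <needle>...")
		return
	}
	// Reads the whole file into memory, which is fine for a sketch.
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("read error:", err)
		os.Exit(1)
	}
	leaks := findLeaks(data, os.Args[2:])
	if len(leaks) == 0 {
		fmt.Println("no plaintext matches")
		return
	}
	fmt.Println("plaintext found:", leaks)
}
```

On a default peer the blockfiles typically sit under a path like /var/hyperledger/production/ledgersData/chains/chains/<channel>/, but that location depends on the deployment and is an assumption here. Running the scan on a peer of an organization outside the collection policy, with the private values as needles, should find nothing if the transient field was used.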
Conclusion
I believe Hyperledger will be the dominant blockchain solution for enterprises, and this issue does not turn me off Hyperledger at all. But the technology is new, and companies will need to be vigilant with development and iterative testing to ensure private data remains private by using the transient field.