Hyperledger Fabric Private Data Collections

Hyperledger Fabric took a big step forward by adding Private Data Collections in version 1.2. The concept makes a lot of sense: creating a new channel every time a few organizations want to exchange private data within a larger channel quickly becomes an administrative nightmare.

Private Data Collections

Private data collections make it possible to keep portions of the data private, so that only a subset of the organizations on a channel can view the private portion while everyone still shares the same channel.

Update 10/26/18

David Enyeart (a maintainer of the Hyperledger Fabric project) at IBM reached out to me with a clarification regarding the issue I found in the first version of this article:

Unfortunately, the initial community sample did not demonstrate use of transient field for keeping data private — Dave Enyeart

Dave said the latest 1.3 documentation does a better job of describing the transient field, as shown below:

A special field in the chaincode proposal called the transient field can be used to pass private data from the client (or data that chaincode will use to generate private data), to chaincode invocation on the peer.

This article is now about what happens when you don’t use the transient field. Thanks Dave!

How I Found the Issue

I have been building a database adapter to store Hyperledger private data in any standard database, e.g. MySQL, MongoDB, Oracle or Postgres.

Part of ensuring the security of my database adapter is to inspect how Hyperledger is storing and purging the private data in the first place.

BlockToLive

Hyperledger has a blockToLive property in collections_config.json that specifies how many blocks the private data is kept for before it is purged.
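To make that concrete, here is a minimal Go sketch that prints one collection definition with a three-block lifetime. The JSON field names follow the collection definition format used by Fabric's marbles private data sample; the collection name, policy string and peer counts are assumptions modeled on that sample, not values copied from my network.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CollectionConfig mirrors one entry of collections_config.json.
type CollectionConfig struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
}

// exampleCollections returns a single collection that only Org1 may read
// and whose private data is purged after three blocks.
// All values here are illustrative assumptions.
func exampleCollections() []CollectionConfig {
	return []CollectionConfig{{
		Name:              "collectionMarblePrivateDetails",
		Policy:            "OR('Org1MSP.member')",
		RequiredPeerCount: 0,
		MaxPeerCount:      3,
		BlockToLive:       3,
	}}
}

// renderConfig marshals the collections into indented JSON.
func renderConfig(collections []CollectionConfig) (string, error) {
	out, err := json.MarshalIndent(collections, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	rendered, err := renderConfig(exampleCollections())
	if err != nil {
		panic(err)
	}
	fmt.Println(rendered)
}
```

The policy line is what is supposed to keep Org2 away from the data, and blockToLive is what triggers the purge.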

The Purge

The purging does work in the sense that the data is no longer available to query after blockToLive more blocks have been added, 3 blocks in this case.

Here is an example of a private data query done pre-purge.

Here is the same private data query done post-purge.
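The arithmetic behind the purge can be sketched in a few lines of Go. This is my own model, not Fabric code: it assumes data committed in block N with blockToLive B survives B further blocks and becomes unavailable once block N + B + 1 is committed, and that a blockToLive of 0 means the data is never purged. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// expiryBlock returns the first block number at which the private data is
// assumed to be purged. A blockToLive of 0 is treated as "never purge".
func expiryBlock(committedAt, blockToLive uint64) uint64 {
	if blockToLive == 0 {
		return math.MaxUint64
	}
	expiry := committedAt + blockToLive + 1
	if expiry <= committedAt { // the addition overflowed uint64
		return math.MaxUint64
	}
	return expiry
}

// isQueryable reports whether the data is still expected to be available
// once block `current` has been committed.
func isQueryable(committedAt, blockToLive, current uint64) bool {
	return current < expiryBlock(committedAt, blockToLive)
}

func main() {
	const committedAt, blockToLive = 10, 3
	for block := uint64(10); block <= 14; block++ {
		fmt.Printf("block %d: queryable=%t\n", block, isQueryable(committedAt, blockToLive, block))
	}
}
```

Under this model, a marble committed in block 10 with a blockToLive of 3 stays queryable through block 13 and is gone once block 14 is committed.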

The Problem

The problem is that the private data is purged, BUT if you don’t pass it through the transient field when you invoke the chaincode, the private data we initialized (a marble in this case, with a price of $0.99) is actually written to ALL of the blockfiles, even on peers of organizations that were specifically NOT supposed to have ANY access to the private data. The same collections_config.json that specifies the blockToLive property also specifies the policy that indicates which organizations can access the private data.
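Why the arguments leak can be shown with a toy model in Go. This is not Fabric code: Proposal and ledgerPayload are hypothetical stand-ins that only capture one fact, namely that the invocation arguments become part of the transaction recorded in every peer's blocks, while the transient map does not.

```go
package main

import (
	"bytes"
	"fmt"
)

// Proposal is a simplified, hypothetical stand-in for a chaincode proposal.
type Proposal struct {
	Args      [][]byte          // function name and arguments
	Transient map[string][]byte // transient field, never written to a block
}

// ledgerPayload models the part of the proposal that ends up in the
// transaction, and therefore in the blockfile: the arguments only.
func ledgerPayload(p Proposal) []byte {
	return bytes.Join(p.Args, []byte{0})
}

// leaks reports whether the secret would be visible in the block payload.
func leaks(p Proposal, secret []byte) bool {
	return bytes.Contains(ledgerPayload(p), secret)
}

func main() {
	secret := []byte("price=99")

	// Private value passed as a plain argument.
	viaArgs := Proposal{
		Args: [][]byte{[]byte("initMarble"), []byte("marble1"), secret},
	}

	// Private value passed through the transient field instead.
	viaTransient := Proposal{
		Args:      [][]byte{[]byte("initMarble"), []byte("marble1")},
		Transient: map[string][]byte{"marble": secret},
	}

	fmt.Println("leaks via args:", leaks(viaArgs, secret))
	fmt.Println("leaks via transient:", leaks(viaTransient, secret))
}
```

The collection policy controls who receives the private data itself, but it has no say over what the client chose to put in the arguments.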

The private data was created by the initMarble function in the Go file marbles_chaincode_private.go.

Here is the marblePrivateDetails struct from marbles_chaincode_private.go, where it is clear that the integer Price field is private.

Here is the post-purge blockfile for a peer on Org2, where you can see the private data that should NEVER have been available to Org2, either before or after the purge.
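As a rough sketch of the safer pattern, the Go below reads the private details from the transient map instead of the arguments. Only the Price field is confirmed by this article; the other struct fields, the "marble" transient key, the collection name and the initMarblePrivate function are assumptions for illustration. The privateStub interface lists just the two methods needed, with the same signatures as GetTransient and PutPrivateData on Fabric's shim.ChaincodeStubInterface, and fakeStub is an in-memory stand-in so the example runs without a peer.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// privateStub is the subset of the chaincode stub used by this sketch.
type privateStub interface {
	GetTransient() (map[string][]byte, error)
	PutPrivateData(collection, key string, value []byte) error
}

// marblePrivateDetails holds the private portion of a marble.
// Fields other than Price are assumed for this example.
type marblePrivateDetails struct {
	ObjectType string `json:"docType"`
	Name       string `json:"name"`
	Price      int    `json:"price"`
}

// initMarblePrivate stores the private details taken from the transient
// map, so the price never travels in the invocation arguments.
func initMarblePrivate(stub privateStub) error {
	transient, err := stub.GetTransient()
	if err != nil {
		return err
	}
	raw, ok := transient["marble"] // hypothetical transient key
	if !ok {
		return errors.New("marble must be passed in the transient map")
	}
	var details marblePrivateDetails
	if err := json.Unmarshal(raw, &details); err != nil {
		return err
	}
	if details.Name == "" {
		return errors.New("marble name is required")
	}
	details.ObjectType = "marblePrivateDetails"
	value, err := json.Marshal(details)
	if err != nil {
		return err
	}
	return stub.PutPrivateData("collectionMarblePrivateDetails", details.Name, value)
}

// fakeStub is an in-memory test double, not part of Fabric.
type fakeStub struct {
	transient map[string][]byte
	private   map[string][]byte
}

func (f *fakeStub) GetTransient() (map[string][]byte, error) { return f.transient, nil }

func (f *fakeStub) PutPrivateData(collection, key string, value []byte) error {
	f.private[collection+"/"+key] = value
	return nil
}

func main() {
	stub := &fakeStub{
		transient: map[string][]byte{"marble": []byte(`{"name":"marble1","price":99}`)},
		private:   map[string][]byte{},
	}
	if err := initMarblePrivate(stub); err != nil {
		panic(err)
	}
	fmt.Println(string(stub.private["collectionMarblePrivateDetails/marble1"]))
}
```

The client then has to put the marble JSON in the transient field of the proposal rather than in the argument list.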
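If you want to repeat this check on your own peers, a small Go program along these lines will do. It is a sketch: findAll is a hypothetical helper, and both the blockfile path and the byte string to search for are supplied on the command line because they depend on your deployment.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// findAll returns the byte offsets of every occurrence of needle in data.
func findAll(data, needle []byte) []int {
	var offsets []int
	if len(needle) == 0 {
		return offsets
	}
	for start := 0; ; {
		i := bytes.Index(data[start:], needle)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, start+i)
		start += i + 1 // keep searching just past the start of this match
	}
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: scan <blockfile> <text-to-find>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("cannot read blockfile:", err)
		return
	}
	offsets := findAll(data, []byte(os.Args[2]))
	if len(offsets) == 0 {
		fmt.Println("not found")
		return
	}
	fmt.Printf("found %d occurrence(s) at offsets %v\n", len(offsets), offsets)
}
```

Run it against a blockfile copied from a peer of an organization that is outside the collection policy. Any hit for the private value means it reached a ledger it was never supposed to touch.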

Conclusion

I believe Hyperledger will be the dominant blockchain solution for enterprises, and this issue does not turn me off Hyperledger at all. But the technology is new, and companies will need to be vigilant with development and iterative testing to ensure private data remains private by using the transient field.

DarkBlock.io is an enterprise blockchain development company and we’re always taking on new clients. Reach out to me at sheffield@darkblock.io or visit our website at DarkBlock.io!