Improving transparency of cryptocurrency storage with publicly auditable proofs.

7 min readJul 18, 2017

This is part one of a three part series (part two is available here) exploring what publicly auditable proofs are and how the improved transparency they provide is beneficial for users who rely on third party hosts such as exchanges to store their cryptocurrency. Code snippets are written in Haskell although other language implementations can be found at olalonde’s github, who along with Wilcox et al , did most of the early work in this area.

On February 2014 the largest bitcoin exchange at the time, Mt Gox, halted all bitcoin withdrawals after months of uncertainty over its bitcoin reserves and numerous security issues. At its peak, Mt Gox accounted for over 70% of global bitcoin transactions and was receiving between $5 million to $20 million in daily transfers. By the end of the month, Mt Gox went offline and began filing for bankruptcy protection having lost around 850,000 bitcoins worth nearly $500 million. Most worrying was the rising speculation that, unbeknownst to their users, Mt Gox had been insolvent for many months leading up the bankruptcy filing.

Despite the threat of another “goxxing” and various exchange hacks, many people continue to store coins in exchanges because of the convenience and user-friendliness. Exchanges have tried to mitigate the risk and be more transparent with their users, with some now undergoing regular financial audits to reassure users their coins remain safe and in reserve. However, many of these audits occur privately with the results released periodically at the discretion of exchanges — reducing their effectiveness.

The goal here is a public proof of solvency (one that can be examined by any user and at any time) and it consists of two main sub proofs: A proof of liabilities and a proof of assets. This post will focus on constructing the proof of liabilities.

Note: There is a way to construct these proofs in a zero-knowledge way, but that’s for a future post.

The main concept underpinning the proof of liabilities is the widely-used merkle proof — an efficient proof of set membership. A merkle structure is a tamper-evident data structure that uses hashes to ensure items in the data structure aren’t modified post-construction. They can be layered over common data structures such as binary trees (Bitcoin’s merkle tree) or radix trees (Ethereum’s merkle-patricia tree). A common characteristic of these data structures is that they can be condensed or reduced to a single hash (commonly referred to as the merkle root) which represents a cryptographic fingerprint of the entire data structure.

To prove that Tx 2 is a member of the set, we provide the nodes in bold which form the merkle proof

In the proof of liabilities, we construct a merkle binary tree where the leaf nodes are user accounts and the merkle proof is then the path from a leaf node to the root, with the sibling node hashes provided to allow for a user to recreate the merkle root.

Let’s begin with defining our user account:

{-# LANGUAGE DeriveGeneric   #-}
{-# LANGUAGE TemplateHaskell #-}data User = User {
    _email   :: EmailAddress,
    _balance :: Double,
    _nonce   :: ByteString
} deriving (Generic)instance Binary EmailAddress
instance Binary UsermakeLenses ''User

We will use the email addresses as the unique customer identifiers and include a nonce to ensure the same user hashes differently with each release of a proof. Binary instances will be useful in encoding our users before passing them into our hashing function. Our tree definition is a very simple binary tree with values only at the leaf nodes. We will also define a foldable instance as well for the tree which will be helpful later.

Basic binary tree with user information at leaf nodes

data Tree a = Leaf a | Node (Tree a) (Tree a)instance Foldable Tree where
    foldr f acc (Leaf a) = f a acc
    foldr f acc (Node l r) = foldr f (foldr f acc r) l

For any tree we will thread a function, f, down the left and right subtrees until we arrive at a leaf node, accumulating the result of f on each leaf node. We’ll also add helpful functions like singleton and fromList as well as a simple insert function to make creating a tree easy. For added privacy, it is advisable to use an insert that randomly structures the tree so that no additional information about neighbouring nodes can be ascertained. For example, an always balanced tree will reveal the total number of users ( 2^n where n is the number of nodes in the merkle proof ).

singleton :: User -> Tree User
singleton = LeaffromList :: [User] -> Tree User
fromList (x:xs) = foldr insert (singleton x) xs-- insert adds to the left sub tree
insert :: User -> Tree User -> Tree User
insert user r@(Leaf a) = Node (Leaf user) r
insert user (Node l r) = Node (insert user l) r

With our standard binary tree completed we can now work on the proof itself. As described by Wilcox et al, the proof of liabilities is a regular merkle proof with an additional field- the sum of balances of each subtree. So now let’s describe our new internal node as well as the two functions to build it.

-- Defining hash algorithm here makes it easy to change in future
type Hash = Digest SHA256data LiabilityNode = LiabilityNode {
    sumBalance :: Double,
    nodeHash   :: Hash
}-- recursively add left and right subtrees
sumSubTree :: Tree User -> Double
sumSubTree (Leaf x)   = x ^. balance
sumSubTree (Node l r) = sumSubTree l + sumSubTree r-- use foldr described earlier to recursively hash subtrees
hashSubTree :: Tree User -> Hash
hashSubTree t = hashFinalize $ foldr updateHash hashInit t
    where
        updateHash = flip hashUpdate . toStrict . encode-- Helper function to make a proof of liability node
mkLiabilityNode :: Tree User -> LiabilityNode
mkLiabilityNode t = LiabilityNode balance nodeHash
    where
        balance = sumSubTree t
        nodeHash = hashSubTree t

Now we have a way to create an internal node as described by Wilcox, however, nowhere in our tree have we described a way to store these internal nodes (data is only stored in the leaf nodes). Turns out we don’t have to with the magic of zippers . In fact keeping the tree in this “more polymorphic” state affords us increased flexibility.

Liability proof for user 2 with the path to root highlighted in red and relevant internal nodes displayed (Privacy is maintained by only revealing sibling hashes). User 2 is able to reconstruct their own merkle root (which contains the overall liabilities) and compare it against the published root.

By using zippers we can create our desired proof “ad-hoc” as we traverse down our data structure. Using the differentiation technique, we find that the appropriate zipper for our data structure has the form.

data Dir = Left | Right deriving (Show)
type Dirs = [Dir]
type LiabilityZipper = [(Dir,LiabilityNode)]

We add a directional data type so that when we return our proof the user knows how to recreate the internal nodes at each join (since hashing is non-commutative). We can now write out the code to construct our proof.

liabilityProof :: EmailAddress -> Tree User -> LiabilityZipper -> Maybe LiabilityZipperliabilityProof usrEmail (Leaf x) lzs
    | usrEmail == (x ^. email) = Just lzs
    | otherwise = NothingliabilityProof usrEmail (Node l r) lzs = lookLeft <|> lookRight
  where
    lookLeft = 
      liabilityProof usrEmail l $ (Left, mkLiabilityNode r):lzs    lookRight = 
       liabilityProof usrEmail r $ (Right, mkLiabilityNode l):lzs

Given an email address, we run a breadth-first search down the tree adding the direction we went down and sibling liability node until we find a leaf node whose email address matches the one provided. Here we can use the handy alternative class to reduce a lot of the boilerplate code for the search. The <|> operator is also referred to as the choice operator and in this case if lookLeft fails (i.e return Nothing) it will try lookRight. Provided our email addresses are unique and the email address requested exists, our liability proof will be a list of directions specifying the path from node to root, and the relevant internal nodes.

This use of zippers is a powerful concept because it allows us to abstract away the construction of the data structure with the act of traversing and manipulating its contents. For example, if we also wanted to have vanilla merkle proofs we could easily define a new zipper and function.

type MerkleZipper = [(Dir,Hash)]merkleProof :: EmailAddress -> Tree User -> MerkleZipper -> Maybe MerkleZippermerkleProof usrEmail (Leaf x) mzs
    | usrEmail == (x ^. email) = Just mzs
    | otherwise = NothingmerkleProof usrEmail n@(Node l r) mzs = lookLeft <|> lookRight
  where
    lookLeft = merkleProof usrEmail l $ (Left,hashSubTree r):mzs
    lookRight = merkleProof usrEmail r $ (Right,hashSubTree l):mzs

Our tree definition does not have to change, only the zipper we use to navigate it. This makes layering differently functionality over an existing or simple data structure easier.

This proof system works best given a large number of people check their balances regularly. Otherwise, it is easy for exchanges or coin stores to hide balances of users that are inactive for long periods of time. This proof also does not protect against exit scams, since a fraudulent third party can just run away with your coins. However, if users begin noticing an increase in liabilities without a corresponding increase in reserves it might be a sign that something could be wrong.

Part two will cover proof of reserves as we get a step closer to a public proof of solvency.

Improving transparency of cryptocurrency storage with publicly auditable proofs.

Written by Ilyas Ridhuan