Bitcoin, Blockchain, and Building Blocks

We’ve all heard plenty of hype about Bitcoin, and now blockchain, and how they’re going to revolutionize finance and commerce. I’d like to dig a little deeper into how they actually work; pop the hood and poke around a bit. A bit of a spoiler: they’re not magical, perfect, or flawless. They have problems, some of which are technical, others at the tricky and unavoidable intersection of people and technology.

First, so we have our terms straight: “Bitcoin” is used to refer to both the digital currency (the unit of value) and the software that allows you to exchange this currency. I’ll use “BTC” when referring to the currency. Mostly, we’re going to be talking about the software.

Blockchain is part of Bitcoin, or that’s how it started out. It’s a distributed database specifically designed for storing transactions, like an account book. Bitcoin wraps software around it to control how transactions get added. Bitcoin was designed to be open and anonymous, for people who don’t know or trust each other. For transactions within a trusted group, like an organization or a consortium of banks, you can still use blockchain, but you may want different controls around it.

Bitcoin builds on several well-understood cryptographic technologies. While it combines and uses them in new ways, this is tech you use every time you go to a secure web site.

At the bottom of that stack of tech are two fundamental tools: public key cryptography and hash functions. Everything else builds on them. The math behind them is really complicated, way over my head, but you don’t have to understand the math to understand how they function; in particular, their limitations, how they could fail, and what happens when they do.

We’ll need a bit of background before we get to public key cryptography.

Symmetric Cryptography

In the oldest, simplest forms of cryptography, you take a message and replace each letter in it with a different letter according to some formula. In short, you run the message through some math to get the encrypted message.

Here, the math is the secret part. If you know the math, you can decrypt the message. An example is Rot-13 encryption, where you “rotate” each letter of the alphabet by 13 places, so “a” becomes “n”, “b” becomes “o”, and so on. To decrypt the message, you just rotate each letter back. It’s a pretty dumb cipher. Modern encryption is way more sophisticated, but fundamentally it’s still the same recipe: message plus math gives you the encrypted message.
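
Rot-13 is small enough to write out. Here is a minimal Ruby sketch (the method name rot13 is my own choice) which leans on String#tr to do the letter substitution. It also shows that rotating twice gets you back where you started, since 13 + 13 = 26.

```ruby
# ROT-13: shift every letter 13 places, leaving punctuation and spaces alone.
def rot13(text)
  text.tr('A-Za-z', 'N-ZA-Mn-za-m')
end

secret = rot13('Hello, world')
puts secret         # the scrambled text
puts rot13(secret)  # rotating again restores the original
```

Anyone who knows the rule can read the message, which is exactly why this counts as "the math is the secret".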

One important innovation is the concept of a “key”. It’s a small piece of information used by the encryption algorithm, sort of like a password. Encrypting the same data with different keys would produce different encrypted output.

Here the math part is not the secret. In fact, the more public scrutiny it gets, the more trustworthy it is. What’s secret is the key.

You use the same key for encrypting and decrypting, like with real-world keys. A key that locks a door can unlock it. This is called “symmetric key cryptography”, to distinguish it from what we’ll see in a minute.
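
To make the idea of a key concrete, here is a rough sketch using Ruby’s bundled OpenSSL library. AES-256-CBC is my choice of algorithm for illustration, and the method names encrypt and decrypt are my own; the point is only that the algorithm is public while the key is what you keep secret.

```ruby
require 'openssl'

# Encrypt with AES-256-CBC: a public, well-studied algorithm. Only the key is secret.
def encrypt(message, key, iv)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv
  cipher.update(message) + cipher.final
end

# Decrypting uses the very same key (and the same initialization vector).
def decrypt(ciphertext, key, iv)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.update(ciphertext) + cipher.final
end

key_a = OpenSSL::Random.random_bytes(32)  # 256-bit key
key_b = OpenSSL::Random.random_bytes(32)  # a different key
iv    = OpenSSL::Random.random_bytes(16)  # per-message initialization vector

sealed = encrypt('meet me at noon', key_a, iv)
puts sealed != encrypt('meet me at noon', key_b, iv)  # different key, different output
puts decrypt(sealed, key_a, iv)                       # the same key unlocks it
```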

There’s a problem, though: The reason you’re encrypting the message is that you think someone might intercept it. So how do you send someone the key in a way that won’t also be intercepted? Maybe we need a different kind of encryption…

Public Key Cryptography

In public key cryptography, there are two keys. One of them is private, one is public. The private key, you have to keep safe and secret. Really, really safe. The public key, you give out to anyone. Publish it. Register it with your email address in a public registry.

You can use either key to encrypt, but anything you encrypt with one can only be decrypted with the other.

You can send someone a secret message by encrypting it with their public key. As long as only they have their private key, only they can decrypt it. If they lose their private key, they can’t decrypt it; and anyone who steals their private key can.
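
Sending a secret message this way can be sketched with an RSA key pair from Ruby’s OpenSSL library. RSA is just one public key scheme, picked here because it ships with Ruby; the method names seal and open_sealed and the sample message are made up for the example.

```ruby
require 'openssl'

# Anyone holding the public key can seal a message...
def seal(message, public_key)
  public_key.public_encrypt(message)
end

# ...but only the holder of the matching private key can open it.
def open_sealed(ciphertext, private_key)
  private_key.private_decrypt(ciphertext)
end

bob_key    = OpenSSL::PKey::RSA.new(2048)  # Bob keeps this object secret
bob_public = bob_key.public_key            # this half can be published

sealed = seal('The eagle lands at midnight', bob_public)
puts open_sealed(sealed, bob_key)
puts bob_public.private?  # false: the published half holds no private key
```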

The other thing you can do is encrypt a message with your private key. Anyone can decrypt it with your public key, so it’s not a secret, but they know that you encrypted it.

We can do both of these in combination. If I encrypt something with my private key and your public key, only you can read it, and you know it came from me.

Ok, that’s our first big building block. On to the second…

Cryptographic Hash Functions

A hash function is math which takes in data and spits out a single number. The data can be anything from a single byte to a high-def movie file.

There are a few important aspects of hash functions:

  • The same data always generates the same number
  • It’s not reversible: you can’t reconstruct the input data from the output number
  • Different data can generate the same number; this is called a “collision”

Cryptographic hash functions use sophisticated math to guarantee a few more things:

  • Collisions are vanishingly unlikely (for a 256-bit hash, two given inputs collide with probability of about 1 in 2^256, a number often compared to the count of atoms in the universe)
  • Modifying the data slightly generates a completely different output
  • There’s no way to predictably modify the data to generate a specific output (to cause a collision)
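
A few of these properties are easy to poke at yourself. This is a small sketch using SHA-256 from Ruby’s Digest library; the helper name fingerprint and the sample strings are mine.

```ruby
require 'digest'

# SHA-256 turns any input into a 256-bit number, shown here as 64 hex digits.
def fingerprint(data)
  Digest::SHA256.hexdigest(data)
end

a = fingerprint('Pay Alice 5 BTC')
b = fingerprint('Pay Alice 5 BTC')
c = fingerprint('Pay Alice 6 BTC')

puts a
puts c
puts a == b                           # same data, same hash
puts a == c                           # one character changed, unrelated hash
puts fingerprint('x' * 1_000_000).length  # still 64 hex digits for a big input
```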

Hashing is particularly useful for dealing with two kinds of data: secret things and big things.

For example, an authentication system could store the hash of a password rather than the password itself. When someone enters a password, its hash is calculated and compared to the stored value. If someone steals the stored value, all it lets them do is guess at the password and know if they guessed right. (They can try a lot of guesses really fast, which is why you need long passwords.)

Hashing also gives you a shortcut for comparing data files: rather than going through both files byte by byte, you can calculate the hash of each of them, and compare the hashes. This also means that if you want to track whether you’ve seen a particular document before, you only have to store its hash.

For developers, a familiar use of hashes is in Git. It keeps hashes of files so it knows if they’ve changed, and each commit is identified by a hash of everything in it.

As another example, I wrote a tiny Ruby script to go through all my MP3 files to look for duplicates. It reads each file, calculates a hash for it, and keeps a look-up table of hashes to file paths. If the hash is already in the table, it prints out a message with the old and new file paths.

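require 'find'
require 'digest/md5'

dir = ARGV[0] || '.'  # directory to scan (taking it from the command line is an assumption)
digests = {}          # hash of file contents => first path seen with that hash
Find.find(dir) do |f|
  next unless File.file?(f) && File.size?(f)  # skip directories and empty files
  d = Digest::MD5.file(f).hexdigest           # hash of file
  if digests[d]
    puts "Duplicates: #{digests[d]} and #{f}"
  else
    digests[d] = f
  end
end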

Digital Signatures

As mentioned, I could encrypt a document with my private key, and anyone with my public key could decrypt it and verify that it came from me. But once you decrypt it, it’s hard to re-verify it. You’d have to keep around the encrypted copy, re-decrypt it, and compare the decrypted versions.

A much better option is to run the unencrypted document through a hash function, then encrypt just the hash value with your private key. That’s a digital signature. To verify the signature, decrypt it with the public key, re-calculate the hash for the document, and compare the two.
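
Ruby’s OpenSSL library wraps the hash-then-sign steps into a single call, so a sketch of signing and verifying is short. RSA with SHA-256 is my choice for illustration, and the method names sign_document and signature_valid? are made up.

```ruby
require 'openssl'

# Hash the document with SHA-256 and sign that hash with the private key.
def sign_document(document, private_key)
  private_key.sign(OpenSSL::Digest.new('SHA256'), document)
end

# Re-hash the document and check it against the signature using the public key.
def signature_valid?(document, signature, public_key)
  public_key.verify(OpenSSL::Digest.new('SHA256'), signature, document)
end

my_key    = OpenSSL::PKey::RSA.new(2048)
contract  = 'I owe you one coffee.'
signature = sign_document(contract, my_key)

puts signature_valid?(contract, signature, my_key.public_key)
puts signature_valid?('I owe you ten coffees.', signature, my_key.public_key)
```

The signature stays the same size no matter how big the document is, and the document itself stays readable.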

[Figures: signing and verification]

Transactions

Ok, now that we’ve assembled our building blocks, we can get into the actual Bitcoin and Blockchain part of this.

As I said, blockchain is like a big account book. It’s a registry of all of the BTC in existence, and who owns them. What it actually records are transfers of BTC, not balances. It doesn’t record “Alice has 5 BTC”; it records “Bob gave 2 BTC to Alice” and “Chris gave 3 BTC to Alice”. Alice has to add up all those transfers to figure out how much she has.
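
As a toy illustration of "transfers, not balances", here is a sketch in Ruby. The Transfer struct and the balance method are my own invention, and the payment to Dana is a made-up third entry so there is something to subtract; real Bitcoin tracks unspent outputs rather than names, as we’ll see next.

```ruby
# Each ledger entry is a transfer; nobody's balance is stored anywhere.
Transfer = Struct.new(:from, :to, :amount)

LEDGER = [
  Transfer.new('Bob',   'Alice', 2),
  Transfer.new('Chris', 'Alice', 3),
  Transfer.new('Alice', 'Dana',  1)   # hypothetical extra entry
].freeze

# A balance has to be worked out by adding up everything received and sent.
def balance(ledger, who)
  received = ledger.select { |t| t.to == who }.sum(&:amount)
  sent     = ledger.select { |t| t.from == who }.sum(&:amount)
  received - sent
end

puts balance(LEDGER, 'Alice')
```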

Actually, Alice isn’t Alice, she’s 43b46ef2e61a3d6a725fe70fe2b3adaadbca7348 or something. This is what makes blockchain anonymous: everyone is only identified by their public key. It’s a shared database; anyone who uses it can see the whole history of every transaction. Alice can also use multiple keys to make it harder to figure out that her transactions belong to the same person.

A transaction, as it’s recorded in blockchain, defines a set of inputs and outputs. Its inputs are the previous transactions which it takes BTC from. Its outputs have the amount being transferred and the hash of the public key they’re going to. Every input is the output from a previous transaction.

So when you make a payment, you don’t have a pool of money to pay it out of; you have a bunch of individual transactions. You have to say something like “take that 5 BTC from transaction 13a16… and give it to key 72fc3….” (I’m abbreviating the ids here.)

A transaction output is either spent or not spent: when you spend one, the value in it all has to go somewhere. If all of the transactions you’ve received are bigger than the payment you want to make, what you can do is split one up and pay some of it back to yourself.

You can also do many-to-one or many-to-many transactions. You can take a bunch of little payments you’ve received, combine them into one bigger payment to someone else, and pay the difference back to yourself.

Or just collect them all into a single transaction.
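
Combining inputs and paying yourself the change can be sketched like this. Everything here (the Output struct, build_payment, the id 9c2e0, the owner names) is hypothetical and much simpler than Bitcoin’s real transaction format.

```ruby
# An unspent output: the transaction that created it, its owner, and its value.
Output = Struct.new(:txid, :owner, :amount)

# Gather enough of the payer's unspent outputs to cover the payment,
# then send any leftover value back to the payer as change.
def build_payment(unspent, payer, payee, amount)
  inputs = []
  unspent.select { |o| o.owner == payer }.each do |o|
    break if inputs.sum(&:amount) >= amount
    inputs << o
  end
  total = inputs.sum(&:amount)
  raise ArgumentError, 'insufficient funds' if total < amount

  outputs = [{ to: payee, amount: amount }]
  outputs << { to: payer, amount: total - amount } if total > amount
  { inputs: inputs.map(&:txid), outputs: outputs }
end

unspent = [Output.new('13a16', 'alice', 5), Output.new('9c2e0', 'alice', 3)]
p build_payment(unspent, 'alice', 'bob', 6)
```

Both of Alice’s outputs get consumed whole: 6 goes to Bob and the remaining 2 comes back to her as a new output.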

Validation

So how is ownership of transactions enforced? What stops you from pretending to be someone else? This is the crux of blockchain. Any database can store transactions. How can we prove, mathematically, that transactions are authorized and the record hasn’t been tampered with?

Each transaction output also includes a little executable script which is used to verify any claims made to it. To claim the output as an input to a new transaction, you have to provide credentials: a public key and a signature of the input transaction. The script takes these as parameters. It checks that the public key is the one expected, uses that public key to decrypt the signature provided, and compares the result to the hash of its own transaction. If it matches, that proves that the claimant has the private key matching the required public key.

In pseudocode, that whole validation process looks something like this:

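function validate(signature, publicKey) {
  // the key must hash to the address this output was paid to,
  // and the signature must decrypt to the hash of this transaction
  return hash(publicKey) == "43b46ef2e61a3d6a725fe70fe2b3adaadbca7348"
      && decrypt(signature, publicKey) == hash(inputTransactionBytes())
}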
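
If you’d like something you can actually run, here is a rough Ruby analogue. It is a sketch under loud assumptions: RSA and SHA-256 stand in for the signature scheme and hashes Bitcoin really uses (ECDSA and its own hash combination), and lock_to, validate, and the sample transaction string are names I made up.

```ruby
require 'openssl'
require 'digest'

# Lock an output to the hash of the recipient's public key.
def lock_to(public_key)
  Digest::SHA256.hexdigest(public_key.to_der)
end

# A claim is valid if the presented key hashes to the locked value and
# the signature over the transaction bytes verifies under that key.
def validate(locked_hash, tx_bytes, signature, public_key)
  Digest::SHA256.hexdigest(public_key.to_der) == locked_hash &&
    public_key.verify(OpenSSL::Digest.new('SHA256'), signature, tx_bytes)
end

alice  = OpenSSL::PKey::RSA.new(2048)
locked = lock_to(alice.public_key)
tx     = 'spend output 0 of 13a16'
sig    = alice.sign(OpenSSL::Digest.new('SHA256'), tx)

puts validate(locked, tx, sig, alice.public_key)          # the rightful owner
puts validate(locked, 'tampered', sig, alice.public_key)  # signature no longer fits
```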

Here’s a full example transaction from the Bitcoin wiki.

Input:
  Previous tx:
    f5d8ee39a430901c91a5917b9f2dc19d6d1a0e9cea205b009ca73dd04470b9a6
  Index: 0
  scriptSig:
    304502206e21798a42fae0e854281abd38bacd1aeed3ee3738d9e1446618c4571d10
    90db022100e2ac980643b0b82c0e88ffdfec6b64e3e6ba35e7ba5fdd7d5d6cc8d25c6b241501
Output:
  Value: 5000000000
  scriptPubKey: OP_DUP OP_HASH160 404371705fa9bd789a2fcd52d2c580b65d35549d OP_EQUALVERIFY OP_CHECKSIG

You see it has the id of the input transaction (Previous tx). Index says which output we’re claiming (since there can be multiple). scriptSig is the signature and public key (the credentials) which will be fed into the script from that input. Note that the input section doesn’t have an amount; that comes from the previous transaction. At the bottom, scriptPubKey is the script. It uses its own weird little programming language: OP_DUP and such are commands.

Since each transaction is linked to previous transactions, and each of those is linked to others, the transactions form a chain stretching back through the whole history.

Ok, so I can create a new transaction which takes BTC from previous transactions and transfers it to someone else. And we’ve got a mechanism which lets someone verify that I’m allowed to do that. But part of that verification requires them to look up the input transactions and check that they’re valid. Which means that all of their inputs have to have been validated. And so on and so on.

Is there some way to shortcut that process, so we don’t have to re-validate the whole chain of previous transactions every time?

Blockchain

That brings us to the actual blockchain. Rather than being validated and stored individually, transactions are grouped into blocks. Each block also has some header data, which includes a hash of this block’s transactions and the hash of the previous block’s header. Modifying an earlier block would change its hash and break the link from the block after it, making any tampering evident. However, this only tells you that there was a change, not what it was.
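
The chaining is easy to demonstrate with a toy version. This Ruby sketch is heavily simplified: real Bitcoin headers carry more fields and summarize the transactions differently, and Block, build_chain, and chain_intact? are names of my own.

```ruby
require 'digest'

# A toy block: the previous block's hash plus a list of transactions.
Block = Struct.new(:prev_hash, :transactions) do
  # The block's hash covers both the link to its predecessor and its contents.
  def hash_hex
    Digest::SHA256.hexdigest(prev_hash + transactions.join('|'))
  end
end

# Link batches of transactions into a chain, each block pointing at the last.
def build_chain(batches)
  prev = '0' * 64  # placeholder "previous hash" for the first block
  batches.map do |txs|
    block = Block.new(prev, txs)
    prev = block.hash_hex
    block
  end
end

# True when every block still points at the actual hash of its predecessor.
def chain_intact?(chain)
  chain.each_cons(2).all? { |a, b| b.prev_hash == a.hash_hex }
end

chain = build_chain([['Bob pays Alice 2'], ['Chris pays Alice 3'], ['Alice pays Dana 1']])
puts chain_intact?(chain)
chain[0].transactions[0] = 'Bob pays Alice 200'  # try to rewrite history
puts chain_intact?(chain)
```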

As new transactions are added to the chain, the software also updates an index (like a database index) of unspent transaction outputs, adding new ones and removing spent ones. Removing spent ones is important. If I have a transaction output which gives me 2 BTC, I can create two new transactions which transfer it to different people. Each of those looks valid on its own, but we can’t allow both; that’s double-spending.
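
The index idea fits in a few lines. This is only a sketch: UtxoIndex and the "txid:index" style ids are hypothetical, not how the Bitcoin software stores things.

```ruby
require 'set'

# Tracks which transaction outputs are still unspent.
class UtxoIndex
  def initialize
    @unspent = Set.new
  end

  def add(output_id)
    @unspent.add(output_id)
  end

  # Spending removes the output, so a second attempt finds nothing to spend.
  def spend(output_id)
    !@unspent.delete?(output_id).nil?
  end
end

index = UtxoIndex.new
index.add('13a16:0')
puts index.spend('13a16:0')  # first spend goes through
puts index.spend('13a16:0')  # second spend of the same output is refused
```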

Proof of Work

So far, there’s no reason blockchain couldn’t be a centralized database. All transactions get sent to it, it validates them and adds them to the chain. Simple. And for some of the uses people talk about, they could totally do that.

But the entire point of Bitcoin is to avoid having any central authority. All Bitcoin users can create transactions, add them to blocks, validate them, and add new blocks to the blockchain. But the chain needs to be consistent: everyone needs to agree on what transactions have happened, and in what order. With thousands of people trying to add new blocks to the chain all the time, how do you decide which one is next? You could choose one at random, but how do you do that if nobody is in charge to do the choosing?

The solution to this problem is what makes Bitcoin, Bitcoin.

If you calculate the hash for a transaction block, you’ll get a number which is effectively random. When you look at it as a binary number, there’s a 50% chance that the first bit will be zero, a 25% chance that the first two bits will be zero, and so on. By the time you get to 40 bits, you’re talking about one in a trillion.

So what Bitcoin does is add a sort of filler, called a Nonce field, to the transaction block. It has no effect except to change the hash value of the block. For the block itself to be valid, its hash has to have at least a certain number of leading zeros.

To find a filler value which will make the block validate is a matter of brute force guesswork: set a new random number, calculate the hash, see if it matches, try again.
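
Here’s that guesswork as a Ruby sketch. It is a simplification: it counts up from zero instead of picking random numbers, hashes a plain string with one round of SHA-256 rather than a real block header, and uses a tiny difficulty of 16 bits so it finishes quickly. The method names are mine.

```ruby
require 'digest'

# True if the hash, read as a 256-bit number, starts with at least `bits` zero bits.
def leading_zero_bits?(hex_digest, bits)
  hex_digest.to_i(16) < (1 << (256 - bits))
end

# Keep trying nonce values until the block's hash meets the difficulty.
def mine(block_data, bits)
  nonce = 0
  nonce += 1 until leading_zero_bits?(Digest::SHA256.hexdigest("#{block_data}#{nonce}"), bits)
  nonce
end

nonce = mine('block with some transactions', 16)
puts nonce
puts Digest::SHA256.hexdigest("block with some transactions#{nonce}")
```

Each extra bit of difficulty doubles the expected number of guesses, while checking someone else’s answer takes a single hash.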

This is mining. That’s what it is: lots of people, all over the world, burning massive compute power to be the first to find a number that works.

The way Bitcoin miners get paid for all that hard work is with what’s called a Generation transaction or Coinbase. When they assemble a block of transactions, they add one with an output to themselves but no inputs. The value of that transaction (their reward) is set by the Bitcoin software. If they tried to claim more, everyone else would reject the block as invalid. The reward is currently 12.5 BTC, about $53,000. It’s cut in half roughly every 4 years (every 210,000 blocks) and will eventually go to zero. At that point, there will be just under 21 million BTC in existence.
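
You can check where the 21 million figure comes from by adding up the rewards. This sketch relies on details not stated above: the reward started at 50 BTC, halves every 210,000 blocks, and is counted in whole units of one hundred-millionth of a BTC (I’m calling those satoshi here), which is why the halving eventually rounds down to zero.

```ruby
SATOSHI_PER_BTC  = 100_000_000
HALVING_INTERVAL = 210_000               # blocks between halvings
INITIAL_REWARD   = 50 * SATOSHI_PER_BTC  # reward for the earliest blocks

# Add up every block reward until halving rounds the reward down to zero.
def total_supply_satoshi
  total = 0
  reward = INITIAL_REWARD
  while reward > 0
    total += reward * HALVING_INTERVAL
    reward /= 2  # integer division, so the reward eventually hits zero
  end
  total
end

puts total_supply_satoshi / SATOSHI_PER_BTC.to_f
```

It’s a geometric series: 210,000 blocks times 50 BTC, times a factor that approaches 2, lands a hair under 21 million.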

An important feature is that nobody has to agree in advance on which transactions will go in the next block. (Since it’s a distributed network, everyone will receive transactions in a slightly different order.) Everyone can work on validating their own block, and the first one to generate a valid block broadcasts it. Everyone else checks that it’s valid and adds it to their chain. They stop work on their block, discard any transactions which were added in the new block, gather up more transactions, and start validating a new block.

What if there’s a tie, where someone else finds an answer at the same time, before most people on the network have received the new block? That can cause a temporary split, where there are two competing versions of the chain. The first chain to add another new block wins. In theory, there could be multiple ties in a row, but this is unlikely enough that it’s not a serious problem.

To minimize the chance of ties, Bitcoin deliberately introduces a delay by making the solution hard to find. Given the total compute power of the Bitcoin network, you can estimate how long it would take to find a hash with a certain number of leading zeros. The Bitcoin software adjusts the number of leading zeros required so that the time required to find a valid block remains fairly constant, at around ten minutes.
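
The adjustment can be sketched as a simple feedback rule. The specifics here are assumptions beyond what’s described above: I’m treating difficulty as a plain number rather than a count of zeros, recalculating over a window of 2016 blocks, and capping any single change at a factor of four.

```ruby
TARGET_SECONDS_PER_BLOCK = 600   # ten minutes, as described above
BLOCKS_PER_ADJUSTMENT    = 2016  # assumed size of the measurement window

# If the last window of blocks arrived too quickly, raise the difficulty in
# proportion; if too slowly, lower it. Changes are capped at 4x either way.
def adjust_difficulty(difficulty, actual_seconds)
  expected = TARGET_SECONDS_PER_BLOCK * BLOCKS_PER_ADJUSTMENT
  ratio = (expected.to_f / actual_seconds).clamp(0.25, 4.0)
  difficulty * ratio
end

one_week = 7 * 24 * 3600
puts adjust_difficulty(100.0, one_week)      # blocks came twice as fast as intended
puts adjust_difficulty(100.0, 4 * one_week)  # blocks came half as fast as intended
```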

Private Blockchains

Now let’s step away from Bitcoin, because a lot of the interest in the commercial world is in separating out blockchain and defining different rules to manage it. Again, Bitcoin’s focus is on anonymity and untrusted participants.

The delay introduced by the proof-of-work serves two purposes: to randomize the selection of the next block, and to reduce the chance of more than one valid block being sent out at the same time. Within a smaller network of trusted participants, say a couple hundred banks, you could use a much more lightweight consensus protocol, reducing latency and increasing transaction volume. That’s important if you want to get to “major credit card scale”. Bitcoin’s proof-of-work limits it to about 7 transactions per second, compared to 115 for a large payments processor and 2000–4000 for a major credit card.

Scaling up blockchain to large transaction volumes may make it less democratic. Remember that it’s an ever-growing database of every transaction ever. As of this writing, Bitcoin’s blockchain takes up 107 GB and contains 207 million transactions, a bit more than VISA handles every day. At a large credit card scale, it would grow by 25 TB a year. Just transferring that data would eat up an average of 6.5 Mbps, 24/7. It wouldn’t take a large business to support that, but it would cost more than average users or even serious hobbyists could afford.
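
The back-of-envelope math is easy to reproduce. In this sketch I’m assuming decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and a 365-day year, so the results land near, not exactly on, the figures quoted above.

```ruby
SECONDS_PER_YEAR = 365 * 24 * 3600

# Average size of one transaction, given the chain's total size and count.
def bytes_per_transaction(total_bytes, transaction_count)
  total_bytes.to_f / transaction_count
end

# Sustained bandwidth, in megabits per second, to move a year's growth.
def average_mbps(bytes_per_year)
  bytes_per_year * 8.0 / SECONDS_PER_YEAR / 1_000_000
end

puts bytes_per_transaction(107 * 10**9, 207 * 10**6).round  # bytes per transaction
puts average_mbps(25 * 10**12).round(1)                     # Mbps for 25 TB a year
```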

You could also use blockchain to record transfers of something other than BTC. While we may think of it as a payment system, it’s actually a record of the ownership and exchange of property. In the Bitcoin blockchain, that’s BTC, but it could be titles to cars, diamonds, artwork, or land. That opens up a lot of possibilities.

Smart Contracts

When we were looking at the example Bitcoin transaction, you may have been wondering why transaction outputs include this complicated validation script, rather than just a public key. The answer is that using a script allows other conditions for validating the transaction to be defined.

In a simple example, the script could require two out of three signatures: a buyer, a seller, and an arbitrator. The seller can’t claim the payment on their own. If the buyer is happy, they sign it. If not, the arbitrator can sign or not sign, depending on their judgement. The buyer and seller still need a real-world trust that the arbitrator is honest.
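
Here’s a rough Ruby sketch of the two-out-of-three rule. It is an illustration only, not Bitcoin’s script language: RSA signatures stand in for Bitcoin’s own scheme, and the method names, role names, and payment string are all made up.

```ruby
require 'openssl'

# Count the roles that supplied a valid signature over the payment.
def approvals(payment, signatures, public_keys)
  signatures.count do |role, sig|
    key = public_keys[role]
    key && key.verify(OpenSSL::Digest.new('SHA256'), sig, payment)
  end
end

# The payment can be claimed once enough of the parties have signed.
def releasable?(payment, signatures, public_keys, required = 2)
  approvals(payment, signatures, public_keys) >= required
end

payment = 'release 10 BTC to seller'
keys = %i[buyer seller arbitrator].map { |r| [r, OpenSSL::PKey::RSA.new(2048)] }.to_h
public_keys = keys.transform_values(&:public_key)
sign = ->(role) { keys[role].sign(OpenSSL::Digest.new('SHA256'), payment) }

puts releasable?(payment, { seller: sign.call(:seller) }, public_keys)
puts releasable?(payment, { seller: sign.call(:seller), buyer: sign.call(:buyer) }, public_keys)
```

The seller alone gets nowhere; add the buyer’s (or the arbitrator’s) signature and the payment goes through.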

If, instead, it’s a bet on the outcome of a sports game or the price of a stock, the arbitrator could be a program which uses web services to detect if the right conditions are met. That is more efficient, but again, the participants are trusting that program.

Another example is a Kickstarter-style fundraiser. Create a transaction which pays you 100 BTC, and allow anyone to add inputs to it. Until they add up to 100 BTC, the transaction won’t validate, and you can only spend it once it passes validation.

In theory, you could also have future payments: transactions which are immediately valid, but whose outputs can only be claimed after a certain date. This seems to be possible in Bitcoin, but it’s experimental and lacks good tooling. This would allow escrow transactions and returnable deposits.

Despite all the cool capabilities, there is an inherent limitation of Smart Contracts: the only way to mathematically guarantee that money can be paid in the future is to lock it up now. A transaction which pays someone in the future takes the money away from you today. Once it has been added to the blockchain, you’ve spent those inputs; you can’t double-spend them. The validation contract just keeps the recipient from spending the outputs until the time limit is up. If you wanted your rent for the next year to be paid automatically every month with a Smart Contract, you’d effectively be paying the full year up front.

Humans

There’s a lot of potential here, but there are also a few concerns. The mathematics of blockchain may be flawless and elegant, but people aren’t.

One serious challenge is key management, especially for consumer-facing applications. In Bitcoin, you are your private key: someone who steals your key becomes you; if you lose your key, you cease to exist, and everything you own is lost. Even within the Bitcoin community, which is mostly pretty tech-savvy, there are plenty of horror stories of private keys lost in discarded hard drives, rendering thousands or millions of dollars worth of BTC unspendable.

People have suggested some ways to mitigate that risk. You could set up your transactions so that all payments require one of two or more public keys, the others being backups stored offline or with some trusted third party. You could rely entirely on a third party to manage your key for you, and use some conventional scheme to authenticate with them. Those protect against loss, but open up possibilities for theft.

Another option for non-Bitcoin blockchains is to use a different validation mechanism. In a more traditional financial setting, you would have a registry of accounts. Transactions would be owned by an account number rather than a public key. A transaction could be signed by any key associated with that account.

I suspect that fully solving this problem will involve creating multi-layered authentication systems much like we use for forgotten passwords: mother’s maiden name, childhood pet, and so on, ultimately backed up by government-issued photo id.

The second big issue is that a blockchain transaction is only one side of the deal: goods and/or services are also exchanged in the real world. Bitcoin has no formal way to contest charges should anything go wrong with that. With smart contracts, blockchain transactions could have an arbitrator, as described above, but then you have an external human system which needs to be established and trusted.

Aside from theft, vandalism, and fraud, there are plain mistakes: typo an amount, or send it to the wrong address. Once it’s gone, it’s gone, and there’s no way to undo a transaction. You can ask nicely for the recipient to transfer the money back, but only if you know who they are. Without traceable identities and a governing authority, there’s no way to force them.

After we’ve ironed out how everything should work, we still need to allow for flaws in implementation. Bitcoin has a history of such errors resulting in significant losses. The Ethereum DAO hack demonstrated the need for external, human processes to deal with software failures.

It’s fair to ask how much of the “efficiency” of Bitcoin comes from cutting out all the consumer protections and resiliency mechanisms which traditional financial services provide. Put another way, to make a blockchain which is suitable for typical consumers, how much of that infrastructure — customer support, fraud investigation, etc. — would you have to re-invent?

Conclusion

Ultimately, blockchain is not some unique piece of technological magic. It’s a distributed database with some unusual characteristics. If you’re thinking of using it, you should be able to explain why it fits your business requirements better than something like Kafka, or a plain relational database.

There’s a general caveat to automation (both software and mechanical): taking humans out of the loop may make the system more efficient, less error-prone, and less vulnerable to attack; but it also tends to make blunders and successful attacks more damaging and harder to recover from. We need to plan for those sorts of “black swan” events, and figure out how to mitigate them.

Accountants don’t use erasers, but they do strike things out. It’s good that blockchain transactions can’t be edited, but bad if they can’t be cancelled or revoked. Any real commerce system — almost any real information system — will need to have some way to handle that. More broadly, it needs to let people make mistakes and recover from them.

To its credit, blockchain is a brilliant solution to its original problem: a trusted system of record maintained by an untrusted, distributed, global network with unreliable connectivity and latency. Many things which most databases take as rare failure modes, it treats as normal behavior. While it has limitations which may prevent it from being used as-is in other contexts, there is much to be learned from it, and some version of it may well become a foundational technology with a wide range of applications.

Disclaimer: Capital One is not affiliated with the companies mentioned in this blog post. All trademarks used are the property of their respective owners.