Dr. Stuart Haber is a computer scientist and cryptographer widely regarded as a co-inventor of blockchain together with Dr. Scott Stornetta. Their work in cryptographic time-stamping was adopted by Satoshi Nakamoto as the basic mechanism for data integrity in Bitcoin. In this discussion, we talk about their thought process behind inventing the blockchain-like data structure as well as his views on the cryptocurrency space. You can follow Dr. Haber at LinkedIn.
Let’s start with how it all began. You and Mr. Stornetta are considered to be inventors of blockchain. How did you two meet?
We met at Bellcore, or Bell Communications Research, it was the newly formed Bell Labs — the local phone companies in the US, after the monopoly AT&T was broken up by an antitrust court decision in the mid-’80s. And the local phone companies, colloquially known as the ‘Baby-Bells’, were encouraged to set up their own version of the legendary Bell Labs and that entity was Bellcore.
There you published much of the groundwork for the emergence of blockchain technologies. Can you elaborate more on what was your research focused on at the time?
Well, Scott and I have been retelling this story in recent years, but it’s true, I was a young researcher at Bellcore, when Scott arrived I’d been there for a couple of years already working as a researcher in cryptography. Scott was a new arrival at the lab, he had just finished his Ph.D. — and this was fall of ’89 — and he came to me with what he thought was a very important problem, and it was, in fact, the title of our first paper — “How to timestamp a digital document”.
And by a digital document, we meant, of course, any bit-string whatsoever — a digital record of any kind. We weren’t limiting ourselves at all to particular kinds of records. It was clear that all of the world’s records were going online from the physical world.
We weren’t there yet, of course, back in the ’89; but it was clear that this was the direction things were going quickly. Scott was especially worried about being able to verify the integrity of digital records. So we came up with a reasonable solution that is the data structure now commonly called the blockchain.
At first, we thought about the trusted-agent solution, that is having an agent in the network that is trusted to certify the time of a record, that’s what we always called the ‘hash and sign’ solution, in cryptographic terms.
So, there’s a trusted agent who simply returns a digital signature of what amounts to the assertion: ‘I saw these bits at this time’. Where these bits might be, rather than the entire record, just the cryptographic hash value computed from the record.
But of course, an agent that signs such an assertion can sign any assertion whatsoever — that has nothing to do with the time when it is being computed. So that was a trusted-agent solution, the hash-and-sign solution was a pretty straightforward solution to the problem, whereas the tools for doing that weren’t widely available at the time, but it was clear that that would be an easy thing to build.
But it was unsatisfactory because it required trusting the agent. And we were knocking our head against the wall trying to do better, and in fact, we spent a while unable to come up with a better solution.
I believe, having the mathematical background of a theoretical computer scientist which was my training, I was the one who suggested to Scott — ‘Well, let’s prove that in fact, you cannot do better!”. That is, in a proper mathematical model of the algorithmic task that we set ourselves, it’s impossible to have a general proof of integrity of records that doesn’t require trusting somebody.
To prove something is impossible, in general, is very hard to do in computer science. It requires a very careful model, if you want to prove something mathematically, you need a model of what operations are possible and so on. Proper modeling of what it means to say that “this record existed in this world at this time” requires modeling the social process.
The social notion of time in human society — that’s hard to do mathematically and steps outside of our mathematical models. We realized that using cryptographic hash functions and their properties, and taking account of the wide distribution of records, we could in fact achieve what we wanted to do.
In our first couple of papers, we described the data structure — a linked chain of routes of Merkle trees — to enable verification of the integrity of records of any kind. That was already there in the paper we delivered in Crypto 1990, the annual conference on Cryptology, at the University of California in Santa Barbara. And then we had another paper a year later at another conference where we already described the data structure.
We managed to convince the Bellcore management to let us spin out a commercial enterprise to offer timestamping services. That enterprise was called Surety and the service was deployed commercially starting in 1995.
You have co-founded Surety with Mr. Stornetta, and the company was focused on providing timestamping services. Lots of people perceived it as a game-changer in the world of cryptography. Can you elaborate on what was the business model of the company?
The business model was offering timestamping services. We were not a wildly successful business, but let me say something technical about how it worked. We would receive timestamp requests from customers, which of course were just hash values of records that customers wanted to authenticate.
So a timestamp request (or an authentication request) is the hash value of the record in question computed locally on the customer’s machine or in a customer’s network and sent to us along with some authenticating information about the customer.
The Surety server would group requests into what nowadays I’ll call a block, even though we never used the word block. We’d group a set of request into a block and a Merkle tree, and link the roots and Merkle trees at a pretty frequent time interval.
And we would send back a ‘timestamp certificate’ for that request — which was just the sibling hash values enabling anybody to check it given the certificate and the record in question, and access to the Surety “blockchain” — the list of block hash values that were linked in the Surety chain. So anybody could check that that record was indeed linked to that chain.
Remember, this is 1995 when the internet was relatively new and there was much less interest in cryptographic verifiability of authentication of records.
So what we did, in order to enable worldwide agreement on values of our chain, we’d once a week build another Merkle tree out of all the block hash values, and publish that hash value as a classified ad in the national edition of the Sunday New York Times.
And for those customers who asked for an extended certificate, we provided their block hash to the New York Times value. So that any record that had been registered with Surety could be verified against the hash value that appeared in the New York Times.
Could you imagine back then what are the other ways that your work could be used for?
Well, we certainly spoke back then of authenticating all kinds of records, including of course business records and financial records. But the interest did not take off, for one reason or another — it simply might’ve been before its time.
But when Satoshi Nakamoto needed a data structure in order to enable authentication of digital records — in this case, Bitcoin transactions — Satoshi chose the right solution to the problem, which was our data structure. He beautifully linked it the same hash functions to the mining game, to provide an incentive for everybody to agree on what came to be called the Ledger of Bitcoin transaction.
It’s been a couple of decades since your invention — what are you professionally focused on now in these days?
I am advising several projects that use cryptographic tools in what’s now called the blockchain world, and in some other areas as well. I wrote my Ph.D. thesis back in the early days of multiparty cryptographic computation, which is something that nowadays many people have heard of but back then was brand new when I finished grad school in 1987.
Throughout the ’90s lots of engineers and cryptographers were part of the Cypherpunk mailing list. Did you participate in those discussions as well?
Yes, I did. I didn’t read the list every day, but I saw the Bitcoin paper fairly early on. I also went to the financial cryptography conferences that were at that time. There were lots of people trying to do digital coins or digital money of various sorts.
Maybe the founding name to go back to is David Chaum who was also one of the founders of the IACR (International Association for Cryptologic Research), which is the technical organization that runs the main technical conferences in the field of cryptography, including Crypto in the US, EuroCrypt in Europe, and AsiaCrypt in Asia.
What was your take on the Cypherpunk movement in general?
I was enthusiastic about the possibilities for what cryptographic techniques might enable people to do in the real world.
Were you aligned with the philosophy of the Cypherpunk movement?
Yes, I would say so. I think I appreciated early on the difference between the cryptographic tools as mathematical abstractions and cryptographic tools as used in the real world.
For example, what Scott and I didn’t like about trusted agent hash-and-sign timestamping is that trusting it depends on trusting the fact that the private signing key of this agent has been protected from the moment of the asserted signature until now.
But you had to also trust that the agent has not lied about the time when the signature was produced. And trusting that the key has been properly protected for decades is, to be charitable, unwise. And to build a whole system around that seems to be unwise, which is why we look for other ways of doing that and for embedding records into the social structure.
We also worried, though, about reducing reliance as much as possible. We didn’t want to rely only on the honesty of the trusted agent, about what time it is now. We didn’t want to rely on proper engineering measures about protecting keys either.
We also wanted to minimize as much as possible our reliance on, what is known in the cryptographic community as cryptographic assumptions. This includes mathematical problems that are hard. For example, that factoring the product of two large integers into it for prime factors is hard. That’s the assumption underlying the RSA cryptosystem.
We were pretty proud of the fact that our time-stamping system did not rely on protecting keys whatsoever. But we couldn’t get away without using cryptographic assumptions at all, of course, we used the properties of cryptographic hash functions. We were worried, already in our first paper, about practical attacks on widely deployed hash functions.
So every use of the hash function in Surety’s system was, in fact, an invocation of two different hash functions. The digital fingerprint of your record was the concatenation of its hash value using one hash function and its hash value using another hash function. And we built our Merkle trees that way and our hash chain as well. The idea being, if there is an attack on one hash function, it is not going to be a simultaneous attack on both hash functions.
Did you follow other digital currency projects of that time? Digicash, Hashcash, B-money, and others?
Yes, I did. I knew about them. I didn’t contribute to that literature myself, but I certainly knew the papers and some of the projects.
Is there something particular in your perspective about digital currencies that changed or evolved over time?
Well, I was amazed to watch Bitcoin to take off the way it did and I encouraged all this interest in this new area. I was also very pleased to see something that I started off with Scott Stornetta — the use of a cryptographic hash function to solve the problem, enabling the authentication of digital records of all kinds — to see that all taking off.
I was also pleased that people suddenly got more interested in blockchain and digital ledger technology in order to carefully enable authentication, registration, and validation of records. And you know, blockchain techniques have been suggested as a solution for all kinds of problems, many of them are fairly silly, but some of them aren’t. And those that are interesting I very much encourage and it’s amazing to watch what’s happened.
Stornetta and I didn’t at all foresee being able to walk into a room of bankers and say something in a couple of words and they would all know what we’re talking about 30 years ago.
Have you been also following the development of the Bitcoin protocol specifically?
Yes. I don’t run a BTC node myself and I haven’t exchanged a lot of money using Lightning channels for example, but I do follow what’s going on.
In the blockchain world, consensus algorithms are quite an important topic. What is your take on this kind of a battle between Proof of Work and Proof of stake?
Satoshi had the generosity to give proper references to Scott’s and my papers on timestamping. Satoshi did not point to the even older literature in the computer world on consensus mechanisms and consensus problems — the Byzantine General Problem — and invented what’s now called, the Nakamoto consensus, which showed the distributive computing community that there is a different way to regard thing.
So, the Proof of work is, of course, a great waste of energy. If there are ways to achieve consensus without it, I applaud all the research that’s going on to avoid that. I’m personally not an expert in consensus mechanisms, so I’m not going to weigh in here on which one’s going to win in what way. But that’s another flourishing area of work and research.
There has been lots of research going on in this field. Do you see some other interesting alternative consensus algorithms, apart from those that you mentioned?
Well, there are a number of different approaches, I’m not going to list them right now.
Yet another different area, of course, is the variations on the data structure used.
You can connect hash functions, hash values, using other data structures than just a directed chain or directed line as it were.
Any directed acyclic graph of hash value invocations can work the same way to authenticate records. Scott and I have pointed out that in our third timestamping paper, though we didn’t implement that ourselves.
Have you been following the development of the Ethereum protocol? Any takes on that?
Distributing computation, as well as just hash values to authenticate records, was a great innovation. It’s a little bit dangerous though, to run unchangeable code on a blockchain, as we have seen in the Ethereum world.
Do you see potentially some serious competition for Ethereum in the realm of smart contract platforms?
Yes. Though there I’m just sitting on a sideline and watching.
In the blockchain space there’s been lots of work done on the various privacy-preserving techniques and protocols. Are there any that you personally find exciting?
In terms of the use of privacy-preserving and privacy-enabling technologies, some of those go back to my early interest in the use of zero-knowledge proofs and techniques of multiparty computation to keep data private while computing useful answers to questions.
Bringing some of those techniques together with blockchain techniques for enabling verification of data is quite exciting. We’ll see where things go there.
Another general sub-world of the blockchain space is DeFi (Dececntralized Financec). Do you follow this ecosystem as well? Are there maybe some particular projects that you’re find interesting or impactful?
There are all kinds of acronyms being thrown around for things… You know, how is De-Fi different from other crypto projects? Which cryptographic projects count as De-Fi and which don’t? I’m not sure of that.
Perhaps De-Fi projects are more on the application level?
Could be. But just to make things concrete — one reason why Surety was not a wild raging success perhaps might’ve been that we did not have tools that made it easy to weave our technology into the business systems that companies, government bodies, and enterprises were using to keep track of their data.
And I think there’s plenty of opportunity for business systems that need authentication of data now with greater and stricter regulatory and compliance requirements. There’s a great need for building systems that enable plain, ordinary authentication of data using variations on Scott’s and my old stuff.
Not only in DeFi but in digital currencies, in general, regulation is a big topic. It often tends to dwarf the potential of the digital currencies. How do you see the issue of regulation in crypto?
I think the sorts of Bitcoin maximalists, who think that they’re just going to avoid all governments and regulations — just because of mathematics and code — are being silly.
Government, regulation, and money — these are all deeply embedded in social systems and you can’t avoid social systems if you want to build something that people use.
Therefore, regulation is something that people building blockchain or digital ledger or DeFi projects need to deal with. You can’t just say: ‘well that’s got nothing to do with me’.
Speaking of Bitcoin maximalists, they are also often vocal about blockchain not being good for any other use-cases apart from cryptocurrencies. What’s your take on that?
Well, that’s just silly. If somebody can do something else with a design that you invent, or even with your code that you wrote — why not let them? Just because you like things one way doesn’t mean somebody else might not have a different preference.
To pick another example, about Bitcoin. The Satoshi-given fact that there’ll only be so many million bitcoins ever — that is not God-given, that is not a law of nature. It’s a constant written somewhere, specified in a particular place in the Bitcoin source code.
That’s not a law of nature at all. It can be changed. And changing is the social process, just like the head of a central bank manufacturing money or squeezing down on money supply.
The crypto community is also usually quite skeptical towards permissioned blockchains, as they’re not deemed to be ‘decentralized enough’. Do you see some use cases for private and permissioned blockchains as well?
Yes, sure. There are groups of entities that just want to agree among themselves on something. They don’t have to be open to all participants at all.
In fact, very early on when Surety was trying to sell its technology to some government agencies in the US, they were not amendable. And perhaps they didn’t even understand, but they were not letting any of their own information leave their own walls. Even if the only thing that was leaving their walls were just occasional pairs of cryptographic hash values. But where it makes sense there is no reason not to build permissioned blockchains.
The last question I have for you is a little bit of a prediction. In 10 years from now, what do you think will be the role of the Bitcoin protocol in our society?
Well as one great writer said — “it’s hard to make good predictions, especially about the future”. I expect to see blockchain techniques used for many more kinds of data that they’re actually used for now. There’ll be also many projects that people are excitedly writing about now, that’ll have failed or be forgotten, a few years from now.
Regarding Bitcoin itself — it’s been over eleven years now — will it last another ten years? Could be. That depends, among other things, on the longevity of hash function designs. One challenge that has not been faced by much of the blockchain technical community is how to design the algorithmic agility — the ability to swap in a new hash function into a deployed system.
Yes, this has not been much talked about in the community yet.
Let me say something more about it. All of the blockchain world is built on the assumption that you can’t find two different documents, two different digital records, two different strings of bits that have the same hash value. But of course, that’s not true. It’s certainly not true that there are no pairs of documents with the same hash value, with the same fingerprint. That’s just a simple combinatorial argument.
The fact is, it might be computationally difficult to find two meaningful documents with the same hash value, it’s a hard problem. But as we see now in the history of attacks on hash functions, the most widely used hash function in systems everywhere is MD5 — Ron Rivest’s system, the Ron Rivest hash function design. But it’s now easy to compute collisions for MD5.
If you’re building a system that’s meant to last a long time, you need to think about how to swap in a new hash function. Stornetta and I posed and solved that problem for pure digital timestamping systems in our first two papers. People haven’t thought carefully about how to do that right in widely deployed blockchain systems.
Do you have any other suggestions or proposals how to do that? Have you seen some in the blockchain space?
We solved it in our second paper. I haven’t seen it addressed by somebody else’s system.
Thank you very much!
Coin Story brings you in-depth interviews with the brightest minds in the crypto space. If you like this interview, be sure to check out our past ones too, and Sign up to not miss out on the future ones, and to get a regular digest of news and trends in the crypto world. Explore more interviews and educational resources on cryptocurrencies at coinstory.tech