Blockchains and Databases Aren’t the Same Thing. Yet.
When the Ethereum mainnet reaches block 10 million, they will be. Here’s why.
PegaSys is the protocol engineering team at ConsenSys.
In an ideal world, enterprises would be able to reuse a traditional enterprise database to analyze and transact with permissioned and public blockchains far and wide. They would also be able to practise safe use of cryptographic schemes and token economics, while requiring nothing more than commonly-understood semantics and common sense.
Blockchains are often described as databases, and there are many valid arguments for this. Even the most naive blockchains allow participants to share and transact with state, often through smart contracts. Just as enterprise databases track business processes, blockchains can track them for a consortium of authorized business entities. And so, with a snap of the fingers, blockchains are perceived as an enterprise technology.
With decades' worth of databases currently running in production, it's fairly easy to find suitable use cases where blockchains could replace databases entirely, though not so simply in our current reality. With the confetti of blockchain predictions around hype cycles, markets, and production deployments still sprinkling down on us, my own prediction is that when the Ethereum mainnet reaches 10 million blocks, blockchains will become indistinguishable from traditional databases, driving key innovations in both databases and blockchains.
Proof that the line is blurring
My prediction starts from an observation hiding in plain sight: a key-value store wrapped in smart contract capability is, by definition, another form of database. Moreover, different databases are better suited to particular situations, and we're still in a phase of learning to find, as John Wolpert writes, notStupid use cases for blockchains that have the right intentions. Even as cloud providers capitalize on hardening the distinction between blockchains and databases, for example by calling a cryptographically verifiable ledger service a database designed for blockchain analytics, it's only a matter of time before any remaining distinctions between blockchains and databases are eroded as well.
And so, amidst the pencil-thin boundaries between blockchains and databases, I see rainbow-coloured ink blots of (new, and new-again) cryptographic schemes, smart contracts, and decentralization permeating into enterprise database scenarios. Of course, as there are more than a few abstractions and arguments that integrate blockchains and databases, perhaps my ex-post prediction is moot:
- Blockchains are not centralized databases, nor are they just another shared database
- ETL libraries for blockchain data
- Storage APIs that support databases
- Decentralized data processing framework, and decentralized database
- Distributed database for public and private data
- Sharing data while preventing collusion
What blockchains can learn from databases
Yet, there’s still just so much missing from blockchains. While databases allow for performant queries and transactions by using indexes and statistics, blockchains generally lack support for standardized query languages, like SQL and SPARQL, and instead depend on RPC calls that aren’t optimized. Applying well-understood database concepts to blockchain paradigms can simplify protocol development. For example, the Ethereum Query Language (EQL) is an SQL wrapper around RPC calls allowing users to query an Ethereum blockchain. However, since the underlying blockchain data is not closely integrated with the query language, building optimizations is a challenge.
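To make the idea of an SQL wrapper around RPC calls concrete, here is a minimal sketch in Python. It assumes a made-up mini-grammar (a single SELECT over a "blocks" table filtered by block number), which is not EQL's actual syntax, and the helper names parse_query, to_rpc_request, and select_fields are hypothetical. The only real interface it relies on is the standard Ethereum JSON-RPC method eth_getBlockByNumber. It builds the request payload and projects fields from a response, without making any network call.

```python
import json
import re

# Hypothetical mini-grammar for illustration only (NOT real EQL syntax):
#   SELECT <field>[, <field>...] FROM blocks WHERE number = <decimal>
QUERY_RE = re.compile(
    r"^\s*SELECT\s+(?P<fields>[\w\s,]+?)\s+FROM\s+blocks"
    r"\s+WHERE\s+number\s*=\s*(?P<number>\d+)\s*$",
    re.IGNORECASE,
)


def parse_query(query):
    """Split a query into (list of field names, block number)."""
    match = QUERY_RE.match(query)
    if match is None:
        raise ValueError("unsupported query: %r" % query)
    fields = [f.strip() for f in match.group("fields").split(",") if f.strip()]
    return fields, int(match.group("number"))


def to_rpc_request(query, request_id=1):
    """Translate the query into a JSON-RPC 2.0 payload for eth_getBlockByNumber."""
    _, number = parse_query(query)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getBlockByNumber",
        # Block number as a hex quantity; False = return tx hashes, not full txs.
        "params": [hex(number), False],
    }


def select_fields(query, block):
    """Apply the SELECT list to a block object returned by the node."""
    fields, _ = parse_query(query)
    return {name: block.get(name) for name in fields}


if __name__ == "__main__":
    q = "SELECT hash, miner FROM blocks WHERE number = 10000000"
    print(json.dumps(to_rpc_request(q)))
```

Note that the wrapper only rewrites text into a remote call: the node never sees the query as a whole, which is why optimizing across the query is hard in this design.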
Overstating the division between blockchains and databases limits the opportunity to engineer expressiveness into blockchain protocols. A potential solution, then, might be a familiar yet innovative integration of blockchain and database paradigms, such as optimizing the query and execution plans of SQL-RPC wrappers. To be clear, we absolutely need to build these atoms of protocol, but we also need good ways to juggle their complexity.
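As one illustration of what optimizing an execution plan could mean for an SQL-RPC wrapper, the sketch below compares two plans for a range filter such as "number BETWEEN first AND last". It is a toy under stated assumptions: the function names naive_plan and batched_plan are hypothetical, the batch size is arbitrary, and it assumes the target node accepts JSON-RPC 2.0 batch requests. Each plan is a list of round trips, and each round trip is a list of eth_getBlockByNumber payloads.

```python
from typing import Dict, List


def block_request(number: int) -> Dict:
    """One JSON-RPC 2.0 payload for a single block (tx hashes only)."""
    return {
        "jsonrpc": "2.0",
        "id": number,  # reuse the block number as the request id
        "method": "eth_getBlockByNumber",
        "params": [hex(number), False],
    }


def naive_plan(first: int, last: int) -> List[List[Dict]]:
    """One round trip per block: simple, but latency grows with the range."""
    return [[block_request(n)] for n in range(first, last + 1)]


def batched_plan(first: int, last: int, batch_size: int = 100) -> List[List[Dict]]:
    """Group requests into JSON-RPC batches to cut the number of round trips.

    Assumes the node supports batch requests; batch_size is an arbitrary
    illustrative default, not a recommended value.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    numbers = list(range(first, last + 1))
    return [
        [block_request(n) for n in numbers[i:i + batch_size]]
        for i in range(0, len(numbers), batch_size)
    ]


if __name__ == "__main__":
    print(len(naive_plan(1, 250)), "round trips without batching")
    print(len(batched_plan(1, 250)), "round trips with batching")
```

A real planner would have more to weigh, such as indexes, statistics, and node-side limits, but even this small rewrite shows the kind of decision a database optimizer makes and a plain RPC wrapper does not.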
Just as the difference between public and private blockchains will dissolve, so will the difference between blockchains and traditional databases. While the tech is maturing, developers and enterprises should not let the hard distinctions of today prevent them from innovating with enterprise-grade blockchain protocols that help to manage and share data in a secure, decentralized way.
In a future post, I'll talk about some new projects that PegaSys is working on that make blockchains and databases harder to tell apart: modular and pluggable technologies for the integration of Ethereum into enterprise technology stacks.
Disclaimer: The views expressed by the author above do not necessarily represent the views of Consensys AG. ConsenSys is a decentralized community with ConsenSys Media being a platform for members to freely express their diverse ideas and perspectives.