A Simple Explanation of How Devvio’s Blockchain Can Process Millions of Transactions Per Second
I’ve gotten a lot of questions about Devvio’s ability to handle millions of transactions per second. Devvio was the first company to prove that a blockchain can process millions of transactions per second, on-chain on a global public network. Devvio has developed a solution to the sharding problem, which is a big area of research in blockchain. Sharding, simply put, means parallelizing the processing of transactions.
Many people are skeptical of our scalability given all the claims out there in the blockchain space. Admittedly, many projects claim capabilities they have not proven, and the crypto space has therefore generally developed a “guilty until proven innocent” mentality. Our team comes from the traditional tech space, though, rather than the newer crypto space, and things are done differently there. We proved internally that our solution worked before we announced what we can do. Rather than make claims about what we hoped or thought we might be able to do, we took a more responsible approach and only made claims about what we had internally proven. We implemented a fully working system and benchmarked our algorithms at 8 million transactions per second. This benchmarking represents on-chain transactions on a global public network.
Our efforts then shifted from software research to software engineering, and software engineering is a matter of execution. We have experienced technical team leaders who are able to build the technology, and there is no doubt at this point that our system works at millions of transactions per second. I’m writing this blog post for those who want to understand how our scaling solution works, as well as for those who doubt our claims… our scaling solution is actually pretty simple at its core.
There are two primary concepts in understanding how our blockchain scaling works:
1. Devvio’s Solution Scales Horizontally
If one of our blockchain networks can handle 3,000 transactions per second (which is where we were when we did our benchmarking, though we’re closer to 8,000 TPS per shard now), how would one process 6,000 transactions per second? Easy: you simply add another blockchain network. Each of the blockchains we add into our system is an independent blockchain running its own consensus algorithms. 10 blockchains can handle 30,000 transactions per second of throughput. 1,000 blockchains can handle 3,000,000 transactions per second.
It should be intuitively obvious that independent blockchains can scale as large as needed, purely in terms of throughput, by simply adding more of them. The blockchain networks are called shards (referred to as T2 networks in our whitepaper), and this type of scaling we refer to as horizontal scaling.
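The horizontal-scaling arithmetic above can be sketched in a few lines. This is just the linear throughput calculation the post describes; the per-shard figure reflects the ~3,000 TPS from the original benchmarking.

```python
# Horizontal scaling: aggregate throughput grows linearly with shard count.
TPS_PER_SHARD = 3_000  # per-shard rate at benchmark time (~8,000 today)

def total_throughput(num_shards: int, tps_per_shard: int = TPS_PER_SHARD) -> int:
    """Each shard is an independent blockchain, so throughput simply adds up."""
    return num_shards * tps_per_shard

print(total_throughput(10))    # 10 shards  -> 30000 TPS
print(total_throughput(1000))  # 1000 shards -> 3000000 TPS
```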
However, there is a problem in scaling this way. What if a wallet in one shard needs to send a transaction to a wallet in a different shard? That leads to the second fundamental concept in understanding how our scaling works.
2. A Single Additional Shard Handles All Cross-Shard Transactions
There is one blockchain (referred to as the T1 network in our whitepaper) that handles transactions that go from one of our shards to another shard.
There are two key concepts that make our cross-shard mechanism work:
- We assign each wallet in our system to one and only one shard.
- We separate payment and settlement (or sending and receiving, in the broader case).
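The first concept, a deterministic wallet-to-shard mapping, can be sketched as follows. Note that hash-based assignment is an illustrative assumption here; the post only states that each wallet belongs to one and only one shard, not how the assignment is computed.

```python
import hashlib

NUM_SHARDS = 1000  # hypothetical shard count for illustration

def shard_for_wallet(wallet_address: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a wallet to exactly one shard, deterministically.

    Hashing the address is an assumption for this sketch; any scheme
    works as long as every wallet maps to one and only one shard.
    """
    digest = hashlib.sha256(wallet_address.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# The same wallet always resolves to the same shard.
assert shard_for_wallet("alice-wallet") == shard_for_wallet("alice-wallet")
```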
All the transactions that go from one shard to another shard are first summarized by the sending-wallet’s shard. Every shard processes its blocks in its blockchain, and all of the blocks from all of the T2s are sent to the T1 network as its inputs. The T1 network processes those T2 blocks and reorganizes them into its blocks. Then every shard reads the T1 blocks to get the settlement/receipt portion of the transaction.
For example, if Alice sends Bob 10 Devv, Alice’s blockchain (the blockchain that Alice’s wallet is assigned to) first records the transaction in a block in its blockchain, and then that block is sent to the T1 network. The T1 network processes the block from Alice’s blockchain that contains that transaction (along with many other blocks from other shards), and then adds those transactions to its blockchain. Then Bob’s blockchain reads the block from the T1 blockchain, sees an incoming amount for Bob, and processes it. There is a small delay between sending and receiving (i.e. payment and settlement) of a few seconds, but settlement is always guaranteed because the blockchains from all of the shards are public and immutable.
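The Alice-to-Bob flow above can be modeled as a toy simulation: the sending shard (T2) records the payment, the T1 network reorganizes all T2 blocks into its own block, and every shard reads the T1 block to settle incoming amounts. All names and data structures here are illustrative, not Devvio’s actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Tx:
    sender: str
    receiver: str
    amount: int

@dataclass
class Shard:  # a T2 network
    name: str
    balances: dict = field(default_factory=dict)
    blocks: list = field(default_factory=list)

    def send(self, tx: Tx) -> None:
        # Payment: debit the sender and record the tx in a local block.
        self.balances[tx.sender] -= tx.amount
        self.blocks.append([tx])

    def settle_from_t1(self, t1_block: list) -> None:
        # Settlement: credit any incoming transfers found in the T1 block.
        for tx in t1_block:
            if tx.receiver in self.balances:
                self.balances[tx.receiver] += tx.amount

def t1_aggregate(shards: list) -> list:
    # The T1 network takes all T2 blocks and reorganizes them into its block.
    block = []
    for shard in shards:
        for b in shard.blocks:
            block.extend(b)
        shard.blocks.clear()
    return block

alice_shard = Shard("shard-A", balances={"Alice": 100})
bob_shard = Shard("shard-B", balances={"Bob": 0})

alice_shard.send(Tx("Alice", "Bob", 10))          # payment on Alice's shard
t1_block = t1_aggregate([alice_shard, bob_shard]) # T1 processes T2 blocks
for shard in (alice_shard, bob_shard):            # every shard reads T1
    shard.settle_from_t1(t1_block)

print(alice_shard.balances["Alice"], bob_shard.balances["Bob"])  # 90 10
```

The small delay between `send` and `settle_from_t1` corresponds to the few-second gap between payment and settlement described above.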
That’s it! Scale by adding blockchain networks, and handle cross-shard transactions with a single blockchain network.
A final point I’ll make is that to prove our solution works, one needs to show that the T1 network can process all of the cross-shard transactions and is not a bottleneck for the horizontal scaling. Again, it should be intuitively obvious that one can scale as large as needed simply by adding new blockchains, so long as the T1 network can handle all of the cross-shard transactions. That is what we showed in our benchmarking: we created a T1 network that processed 8 million transactions per second. To process that many transactions per second, the computers running that network (i.e. the Validator nodes) need to be near each other so that they have low latency; our T1 network needs its nodes in the same city, for example, to reach that level of throughput. Even so, we can relax that constraint and scale vertically as well as horizontally by adding multiple cross-shard networks, or we can handle cross-shard transactions in other ways, but exploring those concepts is beyond the scope of this blog.
There really shouldn’t be any doubts that our system can scale to millions of transactions per second. We implemented our T1 and showed what it can do, and again, at its core, our sharding solution is a pretty straightforward algorithm. One can analyze the straightforward way in which our system works, see our benchmarking videos, and even analyze and test the code itself, if one wants.
Our shards can be located globally, and low latency is not required between T2 and T1, nor among the T2 computers (nodes) themselves. Our shards can process smart contracts as well. Even with a global blockchain, and even while processing smart contracts, our system can handle millions of transactions per second. We can scale our algorithms as large as we will ever need, and I have no doubt we could process tens or even hundreds of millions of transactions per second if ever needed, but we have proven 8 million transactions per second.
Why would one need millions of transactions per second? Here are some examples.
- Internet of Things (IoT) applications need security, which blockchain can provide, but IoT at scale needs millions of TPS.
- A global identity solution tied to payments and many other uses could utilize millions of TPS.
- A national mail tracking system can use millions of TPS.
- Global processing of Forex transactions could utilize millions of TPS.
- Global supply chain processing can utilize millions of TPS.
- Global processing of mobility transactions can utilize millions of TPS.
For a more in-depth analysis, you can see a more detailed explanation of our sharding algorithm in our Greenpaper, which you can download at
In particular, I would recommend having a look at Appendix A, which shows how a transaction works its way through our system.
You can see videos of the benchmarking tests that our VP of Software, Nick Williams, ran.
Here is a shortened video that shows the 8 million TPS benchmarking run.
In the benchmark, we start the processes that run the T1 network and the T2 networks, as well as the process that takes all of the pre-computed T2 blocks and sends them out over ZMQ networking. It takes around 5 minutes to prepare the pre-computed transaction blocks for sending, so this shorter video omits that step in order to show the results more succinctly.
Here is a longer version of the video with the full run itself, uncut.