Zaptos: Reducing Blockchain Latency to the Absolute Minimum

Aptos Labs · Published in Aptos · 9 min read · Jan 22, 2025

By Zhuolun Xiang and Alexander Spiegelman

TL;DR: Zaptos is a novel parallel, pipelined blockchain architecture designed to minimize end-to-end latency while maintaining the high throughput of pipelined blockchains. On a geo-distributed network of 100 validators, Zaptos achieves sub-second, end-to-end blockchain latency at a throughput of 20,000 transactions per second (TPS).

For more details, please check out the Zaptos paper.

End-to-end blockchain latency, measured from the moment a transaction is submitted to the point of receiving confirmation that it has been committed, has become a critical topic of interest. End-to-end blockchain latency under high throughput is particularly crucial for the mass adoption of latency-sensitive blockchain applications, including payments, DeFi, and gaming.

Most research and innovation in academia and the Web3 industry centers on enhancing the performance of Byzantine Fault Tolerant (BFT) consensus mechanisms, such as the recent works Shoal, Shoal++, and Mysticeti. However, the transaction lifecycle encompasses more stages than consensus alone: communication between clients, full nodes, and validators; block execution; certification of the final execution state; persisting the results to storage; and communicating the outcome back to the client. Modern consensus systems typically require around 300–400 milliseconds to order a transaction under low load, yet the end-to-end latency of even the fastest blockchains is around 1 second and increases substantially as the load rises.

To address the challenge of reducing end-to-end blockchain latency, we focused our research on architectural improvements. Today we’re introducing Zaptos, a parallel, pipelined architecture designed to minimize end-to-end latency while maintaining high throughput.

Zaptos shadows the block execution, state certification, and storage stages under the consensus latency in the common case. This means that by the time a block is ordered, it has already been executed and its final state has been certified and persisted. Specifically, the end-to-end latency of Zaptos in this case is equal to

Client-validator communication latency + Consensus latency

As client-validator communication latency is unavoidable, Zaptos achieves optimal end-to-end blockchain latency when the consensus latency is optimal.
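The formula above can be illustrated with a small latency model. This is only a sketch: the class and function names are hypothetical, and the stage durations are made-up round numbers chosen for illustration, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass
class StageLatencies:
    """Per-stage latencies in milliseconds (illustrative values only)."""
    client_validator_comm: int
    consensus: int
    execution: int
    certification: int
    commit: int


def sequential_latency(s: StageLatencies) -> int:
    """Latency when every stage waits for the previous one to finish."""
    return (s.client_validator_comm + s.consensus
            + s.execution + s.certification + s.commit)


def shadowed_latency(s: StageLatencies) -> int:
    """Common case described above: execution, certification, and commit
    are already done by the time the block is ordered."""
    return s.client_validator_comm + s.consensus


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
example = StageLatencies(client_validator_comm=100, consensus=400,
                         execution=150, certification=80, commit=60)
saved_ms = sequential_latency(example) - shadowed_latency(example)
```

In this toy model the saving equals the sum of the three shadowed stages, which is the intuition behind the claim that only communication and consensus remain on the critical path.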

Existing Blockchain Architectures

Existing blockchain pipeline architectures can be classified into three main categories based on the interaction between their pipeline stages: coupled-consensus-execution, execution-then-consensus, and consensus-then-execution. In the latter two categories, consensus and execution are decoupled as separate stages.

Figure 1. Illustration of the coupled-consensus-execution pipeline architecture.

In the coupled-consensus-execution architecture, the consensus stage is tightly integrated with block execution, which determines the new blockchain state during consensus. For instance, in leader-based protocols, validators execute a block after the leader’s proposal and vote on the resulting new blockchain state as part of the consensus protocol. The output of the consensus stage includes the finalized state. Representative chains that use this architecture include Bitcoin, Ethereum PoS, Solana, Algorand, Cosmos, Redbelly, NEAR, XRP, and Stellar.

Figure 2. Illustration of the execution-then-consensus pipeline architecture.

The execution-then-consensus architecture was first introduced in Hyperledger Fabric. Validators first execute a list of transactions locally, producing execution outputs. These outputs are then subjected to a consensus process to agree on their ordering and, consequently, the new blockchain state.

Figure 3. Illustration of the consensus-then-execution pipeline architecture. When integrated with a pipelined consensus protocol, this architecture effectively becomes the pipelined architecture of Aptos.

In the consensus-then-execution architecture, validators first reach consensus on a new block extending the blockchain. Execution of the ordered block follows, producing the updated blockchain state. To produce a publicly verifiable proof of the new blockchain state and avoid safety violations caused by non-deterministic execution, a certification stage is introduced prior to committing to storage. Representative chains that use this architecture include Aptos and Sui, with Avalanche currently implementing it.

Pipelined Architecture of Aptos Blockchain

Aptos was the first blockchain to employ a pipelined architecture and has done so since 2021, allowing different stages of different blocks to execute in parallel. This design improves blockchain performance by maximizing resource utilization.

The Architecture

Figure 4. Illustration of the pipelined architecture of Aptos. The figure shows client C_i, full node F_i, and validator V_i. Each box represents a stage that a block of transactions goes through, from left to right. The pipeline consists of four stages: consensus (which consists of dissemination and ordering), execution, certification, and commit.

We describe the current Aptos architecture by tracing the lifecycle of a transaction (txn). A client can submit a txn to the full node it connects to (for DDoS protection). The full node, upon receiving the txn, will forward it to the validator it connects to.

  • Consensus stage: The validators first run a consensus protocol to agree on a block that includes txn. This process usually includes two sub-stages: a dissemination stage, where validators distribute payload batches, and an ordering stage, where they reach consensus on the order of blocks that contain the metadata for these payload batches. This stage is network bandwidth intensive.
  • Execution stage: The validator executes the block when there exists an ordered unexecuted block such that its parent block has been executed. This stage is CPU intensive.
  • Certification stage: After execution, the validator signs the cryptographic digest of the execution state and broadcasts the signature. When receiving a quorum of signatures on the same state, the validator aggregates the signatures to certify the state. This stage uses little computation or bandwidth resources, but takes one round to receive the signature from a quorum of validators.
  • Commit stage: If the new certified block is the next in height to be committed, the validator updates the highest committed height and blockchain state, then saves both to storage. This stage is storage IO intensive. When the commit finishes, the validator sends the newly committed block to the full node.
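The certification stage above can be sketched as follows. This is a minimal illustration rather than the Aptos implementation: the class and method names are hypothetical, signature verification is omitted, and the quorum size of 2f + 1 out of n = 3f + 1 validators is the standard BFT assumption, not something stated in this post.

```python
from collections import defaultdict


def quorum_size(num_validators: int) -> int:
    """Assumed quorum: 2f + 1, where f = (n - 1) // 3 is the fault bound."""
    f = (num_validators - 1) // 3
    return 2 * f + 1


class StateCertifier:
    """Collects signatures on execution-state digests for one block."""

    def __init__(self, num_validators: int):
        self.quorum = quorum_size(num_validators)
        # state digest -> {validator_id: signature}
        self.signatures = defaultdict(dict)

    def add_signature(self, validator_id, state_digest, signature):
        """Record one signature. Returns (digest, sorted signer ids) once a
        quorum has signed the same digest, otherwise None."""
        signers = self.signatures[state_digest]
        signers[validator_id] = signature  # a repeated signer is counted once
        if len(signers) >= self.quorum:
            return state_digest, sorted(signers)
        return None
```

Keying the signatures by digest reflects the requirement that the quorum must agree on the same state; signatures on diverging digests (for example, after non-deterministic execution) never combine into a certificate.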

The full node, upon receiving the committed block from validators, ensures the state is certified and adds the block to its pipeline. The pipeline of a full node is similar to that of a validator, but without the consensus or certification stages. The client can query whether txn is committed on the blockchain. The full node, upon receiving the client’s query on txn, responds with the inclusion proof of txn if it is committed at some position according to the latest blockchain state. If the client receives the response within a timeout, it verifies the proof for txn and returns success if the proof is valid and failure otherwise. The client may resubmit the transaction upon failure or timeout.

The Pipelining

Figure 5. Illustration of the pipelining of consecutive blocks in the Aptos pipelined architecture.

As illustrated, the pipelined design achieves high blockchain throughput by fully utilizing the different resources of the validators and full nodes. On Aptos, a validator can pipeline different stages of consecutive blocks. For example, for blocks B_1, B_2, and B_3, the validator can perform the following in parallel: the commit stage (IO-intensive) of B_1, the certification stage of B_2, the execution stage (CPU-intensive) of B_3, and the consensus stage (network-intensive) of subsequent blocks. In practice, the durations of the stages may vary, but as long as the parallel stages utilize distinct resources, the pipeline improves throughput by maximizing resource utilization compared to non-pipelined designs.
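To make the overlap concrete, here is an idealized schedule in which every stage takes exactly one time step. That equal-duration assumption, along with the function name, is a simplification introduced for this sketch; as noted above, real stage durations vary.

```python
STAGES = ["consensus", "execution", "certification", "commit"]


def pipeline_schedule(num_blocks: int, num_steps: int) -> list:
    """For each time step, map every busy stage to the block occupying it.

    Block B_1 enters consensus at step 0 and each block advances one stage
    per step, so stage number `depth` holds block t - depth + 1 at step t.
    """
    schedule = []
    for t in range(num_steps):
        row = {}
        for depth, stage in enumerate(STAGES):
            block = t - depth + 1
            if 1 <= block <= num_blocks:
                row[stage] = f"B_{block}"
        schedule.append(row)
    return schedule


steps = pipeline_schedule(num_blocks=4, num_steps=4)
```

At the last step of this example, `steps[3]` has B_1 in commit, B_2 in certification, B_3 in execution, and B_4 in consensus, which is the situation described in the paragraph above.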

Zaptos

Zaptos significantly reduces the end-to-end latency of Aptos’s pipelined architecture through three key optimizations while maintaining the property of maximizing resource utilization to achieve high throughput.

Figure 6. Illustration of Zaptos.
  • Optimistic Execution: This optimization improves the pipeline latency of both validators and full nodes by optimistically running the execution stage. When a validator receives a block proposal in consensus, it adds the block to the pipeline immediately rather than waiting for it to be ordered. The validator can then speculatively execute the block once the parent block has been executed. The validator also sends the proposal to the full nodes that subscribe to it. Similarly, the full node executes the block optimistically so that it can verify the state proof received from the validator.
  • Optimistic Commit: The second optimization reduces the commit stage latency for both validators and full nodes by allowing blocks to be optimistically committed to storage as soon as the execution stage completes, before the state is certified. When validators certify the state, only a minimal update is needed to complete the commit stage. If an opt-committed block is not eventually ordered by consensus, its opt-committed state is reverted from storage to keep the data consistent.
  • Piggybacking State Certification on Consensus: The final optimization further improves the pipeline latency of the validators by allowing them to start the certification stage of an executed block earlier, rather than waiting for the block to be ordered. This enables the validators to run the certification stage in parallel with the last round of consensus, effectively reducing the pipeline latency by one round in the common case.
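The optimistic commit and its revert path can be sketched with a toy in-memory store. Everything here is hypothetical (class name, method names, the dictionary layout); it only illustrates the idea of writing results early, finalizing with a small update, and rolling back a block that consensus never orders.

```python
class OptimisticStore:
    """Toy storage with opt-commit, finalize, and revert operations."""

    def __init__(self):
        self.blocks = {}            # block_id -> {"state": ..., "certified": bool}
        self.committed_height = 0   # highest height whose state is certified

    def opt_commit(self, block_id, state):
        # Persist the execution result right after execution completes,
        # before the state has been certified.
        self.blocks[block_id] = {"state": state, "certified": False}

    def finalize(self, block_id, height):
        # The "minimal update" once the state is certified: flip a flag
        # and advance the committed height; the bulk write is already done.
        self.blocks[block_id]["certified"] = True
        self.committed_height = height

    def revert(self, block_id):
        # The block was not ordered by consensus: drop its speculative
        # state. Certified blocks are never reverted in this sketch.
        entry = self.blocks.get(block_id)
        if entry is not None and not entry["certified"]:
            del self.blocks[block_id]


store = OptimisticStore()
store.opt_commit("B_1", {"alice": 10})
store.opt_commit("B_1_alt", {"alice": 7})   # competing proposal, never ordered
store.finalize("B_1", height=1)
store.revert("B_1_alt")
```

The design point the sketch tries to capture is that the expensive write happens off the critical path, while the step that remains after certification is cheap.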

With these key optimizations, Zaptos significantly reduces the pipeline latency while maximizing resource utilization to achieve high throughput.

Figure 7. Illustration of the pipelining of consecutive blocks in Zaptos. The left figure illustrates the pipelining, where a validator can also pipeline different stages of consecutive blocks. The right figure illustrates the dependencies between stages of consecutive blocks, e.g., the execution and commit stages of block B_2 depend on the execution and commit stages of its parent block B_1, respectively.
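The parent dependencies in the caption can be expressed as a small timing calculation. This is a sketch under simplifying assumptions that are not from the post: fixed execution and commit durations for every block, and a hypothetical function name. It models only the two dependencies named above (execution waits for the parent's execution, commit waits for the parent's commit).

```python
def zaptos_stage_times(proposal_times, exec_duration, commit_duration):
    """Return finish times of opt-execution and opt-commit for a chain of
    blocks, where block i is the parent of block i + 1.

    Opt-execution starts once the proposal has arrived and the parent has
    been executed; opt-commit starts once the block has been executed and
    the parent's commit has finished.
    """
    exec_done, commit_done = [], []
    for i, proposed_at in enumerate(proposal_times):
        parent_exec = exec_done[i - 1] if i > 0 else 0
        parent_commit = commit_done[i - 1] if i > 0 else 0
        finished_exec = max(proposed_at, parent_exec) + exec_duration
        finished_commit = max(finished_exec, parent_commit) + commit_duration
        exec_done.append(finished_exec)
        commit_done.append(finished_commit)
    return exec_done, commit_done


# Hypothetical timings in milliseconds for blocks B_1, B_2, B_3.
exec_done, commit_done = zaptos_stage_times(
    proposal_times=[0, 100, 200], exec_duration=150, commit_duration=50)
```

With these made-up numbers, execution is slower than the proposal rate, so each block's execution ends up waiting on its parent rather than on its own proposal.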

Evaluation

We evaluate the end-to-end performance of Zaptos with geo-distributed experiments, using Aptos as a high-performance baseline. More evaluation details can be found in the paper.

We used Google Cloud to mimic the deployment of a globally decentralized network. Our testbed consists of 100 validators and 30 full nodes across 10 global regions, with machine specs similar to those used by Aptos to qualify as commodity-grade.

Throughput-Latency

Figure 8. Common case performance of Zaptos and baseline (Aptos Blockchain).

The figure above plots end-to-end blockchain latency against throughput for Zaptos and Aptos. As depicted, both systems show a gradual increase in latency as the system load grows, and a sharp spike when reaching maximum capacity. Compared to the baseline, Zaptos reduces latency by 160 ms under low load and by over 500 ms under high load.

Notably, Zaptos achieves sub-second, end-to-end blockchain latency at 20k TPS with a production-grade implementation tested in a realistic, mainnet-like environment. This is a breakthrough combination rarely seen in existing blockchain systems that unlocks the potential of blockchain for real-world applications demanding both speed and scalability.

Latency Breakdown

Figure 9. Latency breakdown of common case: Aptos Blockchain.
Figure 10. Latency breakdown of common case: Zaptos.

The latency breakdown graph depicts the duration of each pipeline stage for both validators and full nodes. To further analyze system performance, we provide a detailed latency breakdown of the data points from the throughput-latency graph:

  • The end-to-end blockchain latency of Zaptos is approximately equal to the consensus latency up to 10k TPS. In this range, the opt-execution, certification, and opt-commit stages for both validators and full nodes are effectively “shadowed” within the consensus stage. This validates the Zaptos design as a step toward achieving optimal blockchain latency.
  • As TPS increases, the non-consensus stages are no longer fully shadowed within the consensus stage. This is primarily due to the increased execution preparation required to wait for or fetch larger blocks and the longer duration of the opt-execution stage. Although the stages only partially overlap at maximum throughput, Zaptos still significantly reduces latency by shadowing the majority of the stage durations. For instance, at 20k TPS, Aptos exhibits a total latency of 1.32s (consensus latency: 0.68s; other stages: 0.64s), whereas Zaptos reduces this to 0.78s (consensus latency: 0.67s; other stages: 0.11s).
  • The consensus dissemination latency remains a bottleneck of the end-to-end blockchain latency under high load. Further improving the dissemination latency presents an interesting challenge for future work.
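The 20k TPS figures quoted above can be checked with a few lines of arithmetic. The numbers are the ones reported in the breakdown; the variable names and the derived percentages are only a convenience for the reader.

```python
# Latencies in seconds at 20k TPS, as reported in the breakdown above.
aptos = {"consensus": 0.68, "other": 0.64}
zaptos = {"consensus": 0.67, "other": 0.11}

aptos_total = round(sum(aptos.values()), 2)    # total reported for Aptos
zaptos_total = round(sum(zaptos.values()), 2)  # total reported for Zaptos

# Absolute and relative end-to-end reduction.
reduction_s = round(aptos_total - zaptos_total, 2)
reduction_pct = round(100 * reduction_s / aptos_total, 1)

# Share of the non-consensus stage time that no longer shows up
# in the end-to-end latency.
other_shadowed_pct = round(100 * (1 - zaptos["other"] / aptos["other"]), 1)
```

The totals come out to 1.32s and 0.78s, a reduction of 0.54s (about 41%), and roughly 83% of the non-consensus stage time disappears from the end-to-end latency, while the consensus latency itself is essentially unchanged.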

Conclusion

Zaptos is a novel blockchain pipeline architecture designed to achieve low latency and high throughput by maximizing resource utilization through effective pipelining. For more details, please check out the Zaptos paper.


Written by Aptos Labs

Aptos Labs is a premier Web3 studio of engineers, researchers, strategists, designers, and dreamers building on Aptos, the Layer 1 blockchain.
