Scalar DL: Scalable and Practical Byzantine Fault Detection Middleware for Transactional Database Systems

Hiroyuki Yamada
Published in Scalar Engineering
Apr 13, 2022

This blog post briefly introduces Scalar DL, a scalable and practical Byzantine fault detection middleware for transactional database systems. For the challenges it addresses, please see the previous post. (Japanese version)

Overview

Scalar DL (Figure 1) is middleware that runs on top of databases and provides Byzantine fault detection capability without modifying the underlying databases.

Scalar DL provides a view of a single-instance database system to users and internally runs two types of database servers in separate administrative domains*1 (ADs):

  • Primary database servers (called Ledger servers) that manage a primary database replica holding an application’s data and make all the commit decisions
  • Secondary database servers (called Auditor servers) that manage a secondary database replica holding the same data as the primary database replica for auditing purposes.

Both servers separately manage the same set of deterministic functions (called Contracts) to derive states and results on the basis of given inputs.
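To make the role of determinism concrete, here is a minimal sketch in Python. It is not the Scalar DL Contract API; the function `transfer`, the account names, and the dictionary-based state are all hypothetical, chosen only to show that a function depending solely on its inputs yields the same state and result on both replicas.

```python
def transfer(state, args):
    """Hypothetical contract: move `amount` from one account to another.

    It is deterministic because the output depends only on `state` and
    `args`; it uses no clock, randomness, or external I/O.
    """
    src, dst, amount = args["from"], args["to"], args["amount"]
    if state.get(src, 0) < amount:
        # Reject without changing the state.
        return state, {"ok": False, "reason": "insufficient funds"}
    new_state = dict(state)
    new_state[src] = new_state[src] - amount
    new_state[dst] = new_state.get(dst, 0) + amount
    return new_state, {"ok": True}


initial = {"alice": 100, "bob": 20}
args = {"from": "alice", "to": "bob", "amount": 30}

# Each replica runs its own copy of the same contract on the same input.
ledger_state, ledger_result = transfer(dict(initial), args)
auditor_state, auditor_result = transfer(dict(initial), args)

# With honest replicas, states and results are identical.
replicas_agree = (ledger_state == auditor_state
                  and ledger_result == auditor_result)
```

If a contract read the wall clock or a random number, two honest replicas could legitimately disagree, and a divergence would no longer indicate a fault.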

Figure 1. Scalar DL Architecture

The key to Scalar DL's Byzantine fault detection protocol is that Ledger and Auditor agree on a partial ordering of transactions in a decentralized and concurrent way. The protocol has three phases:

  • Ordering phase: Auditor pre-orders a transaction received from a client, ordering it only partially on the basis of conflicts.
  • Commit phase: Ledger executes and commits the transaction in the order given by Auditor.
  • Validation phase: Auditor validates the ordering result returned by Ledger and then executes the transaction.

As long as both ADs are honest, this three-phase protocol makes both databases derive the same correct (strict serializable) states and results. If either AD is Byzantine-faulty, their states or results diverge, so clients can observe the divergence and detect the fault in the database system. I will explain the protocol in more detail in an upcoming post.
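The flow of the three phases can be sketched with a toy, single-process model. This is only an illustration under strong assumptions, not the actual protocol or API: the classes `Auditor`, `Ledger`, and `TamperingLedger`, the per-key version counters used to represent the partial order, and the `add` contract are all hypothetical, and networking, signatures, and concurrency are left out.

```python
def add(state, args):
    """Hypothetical deterministic contract: add `amount` to `key`."""
    new_state = dict(state)
    new_state[args["key"]] = new_state.get(args["key"], 0) + args["amount"]
    return new_state, {"balance": new_state[args["key"]]}


class Auditor:
    def __init__(self, contract):
        self.contract = contract
        self.state = {}
        self.next_version = {}  # key -> next version number to hand out
        self.pending = {}       # tx_id -> ordering handed out for it

    def order(self, tx_id, keys):
        """Ordering phase: order a transaction only relative to
        transactions that touch the same keys (a partial order)."""
        ordering = {}
        for key in sorted(keys):
            ordering[key] = self.next_version.get(key, 0)
            self.next_version[key] = ordering[key] + 1
        self.pending[tx_id] = ordering
        return ordering

    def validate(self, tx_id, args, committed_ordering):
        """Validation phase: check Ledger's ordering, then execute."""
        if self.pending.pop(tx_id) != committed_ordering:
            raise ValueError("ordering reported by Ledger does not match")
        self.state, result = self.contract(self.state, args)
        return result


class Ledger:
    def __init__(self, contract):
        self.contract = contract
        self.state = {}
        self.committed_versions = {}  # key -> number of committed versions

    def commit(self, tx_id, args, ordering):
        """Commit phase: execute and commit in the Auditor's order."""
        for key, version in ordering.items():
            if self.committed_versions.get(key, 0) != version:
                raise ValueError("transaction arrived out of order")
        self.state, result = self.contract(self.state, args)
        for key, version in ordering.items():
            self.committed_versions[key] = version + 1
        return result, dict(ordering)


class TamperingLedger(Ledger):
    """A Byzantine-faulty Ledger that reports a falsified result."""

    def commit(self, tx_id, args, ordering):
        result, committed = super().commit(tx_id, args, ordering)
        return {"balance": result["balance"] + 1000}, committed


def run_transaction(tx_id, args, ledger, auditor):
    """Client side: drive the three phases and collect both results."""
    ordering = auditor.order(tx_id, [args["key"]])
    ledger_result, committed = ledger.commit(tx_id, args, ordering)
    auditor_result = auditor.validate(tx_id, args, committed)
    return ledger_result, auditor_result


tx = {"key": "a", "amount": 5}
honest = run_transaction("t1", tx, Ledger(add), Auditor(add))
faulty = run_transaction("t1", tx, TamperingLedger(add), Auditor(add))

honest_diverged = honest[0] != honest[1]  # False: results match
faulty_diverged = faulty[0] != faulty[1]  # True: the client sees a mismatch
```

In this model, transactions on different keys receive independent version numbers, so they are never ordered against each other. That is the sense in which the ordering is partial and can be produced concurrently.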

*1: An administrative domain (AD) is a collection of nodes and networks operated by a single organization or administrative authority.

Design Goals

The design goals of Scalar DL are as follows:

Tamper-evident

Scalar DL makes a database system tamper-evident by detecting Byzantine faults in the database system. Scalar DL can detect Byzantine faults as long as at least one of the two ADs is honest (not faulty).
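The condition "at least one AD is honest" can be illustrated with a small sketch. It assumes, purely for illustration, that a client compares a SHA-256 digest of the state reported by each AD; the `digest` and `diverged` helpers and the sample states are hypothetical and do not describe how Scalar DL itself compares replicas.

```python
import hashlib
import json


def digest(state):
    """Hash a canonical JSON encoding of a state (hypothetical helper)."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def diverged(ledger_state, auditor_state):
    """True if the two replicas report different states."""
    return digest(ledger_state) != digest(auditor_state)


correct = {"alice": 70, "bob": 50}
tampered = {"alice": 700, "bob": 50}

both_honest = diverged(correct, correct)        # no divergence
ledger_faulty = diverged(tampered, correct)     # divergence is visible
auditor_faulty = diverged(correct, tampered)    # divergence is visible
# Both ADs reporting the same tampered state is outside the guarantee:
# in this toy model nothing diverges, so nothing can be detected.
both_colluding = diverged(tampered, tampered)
```

This is why the two server types run in separate administrative domains: tampering by one AD shows up as a divergence only if the other AD does not make the same change.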

Scalable

Scalar DL achieves near-linear scalability when the number of nodes composing each database replica increases. Note that each database replica can be a multi-node distributed database that uses data partitioning and replication for high performance and crash fault tolerance.

Correct

Scalar DL guarantees correctness while exploiting the parallelism of the transactions of the database system. Its safety guarantee is that the database system provides strict serializability as long as both ADs are honest (not faulty), and that Byzantine faults can be detected if one of the ADs is faulty. I will explain the correctness guarantee in more detail in an upcoming post.

Database-agnostic

Scalar DL is based on Scalar DB; thus, it is database-agnostic. It achieves the detection capability in a database system without either modifying the databases or using database-specific mechanisms. Scalar DL can currently run on PostgreSQL, MySQL, Oracle Database, Microsoft SQL Server, Apache Cassandra, Apache HBase, Amazon DynamoDB, Amazon Aurora, Azure Cosmos DB, and their compatible databases.

Cloud-agnostic

Scalar DL is cloud-agnostic since Ledger and Auditor are containerized middleware. Scalar DL has been deployed to AWS Cloud and Azure Cloud in production systems and has been confirmed to run on Oracle Cloud and Google Cloud.

Summary

This blog post explained the overview and design goals of Scalar DL. The primary benefit of Scalar DL is that it provides Byzantine fault detection capability without sacrificing scalability and practicality. Future posts will cover more details and other interesting properties of Scalar DL.

Hiroyuki Yamada
Scalar Engineering

CTO of Scalar, Inc. Passionate about parallel and distributed data management systems.