Scaling blockchain data with Proxima Immutable Streams and Verifiable Audits
As blockchains grow larger and process a greater number of transactions, decentralized applications are forced to deal with a veritable firehose of information. This strains the data layer: blockchains and DApps must build the infrastructure needed to construct and query descriptive data for their users.
At Proxima, we believe that the most efficient way to manage and manipulate this vast quantity of blockchain data is through streams and materialized views. Framing our data system in this manner lets us improve data-layer performance through parallelization, and the same machinery can merge blockchain information, transform and filter events, aggregate accumulated data, and materialize views and documents for DApp-specific schemas.
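As a minimal sketch of this framing, the pipeline below filters a stream of raw events, transforms the matches, and folds them into a materialized balance view. The event shapes and field names are illustrative assumptions, not Proxima's actual schema.

```python
from collections import defaultdict

# Hypothetical raw event stream (field names are illustrative).
events = [
    {"type": "transfer", "from": "a", "to": "b", "value": 5},
    {"type": "mint", "to": "a", "value": 10},
    {"type": "transfer", "from": "b", "to": "a", "value": 2},
]

# Filter stage: keep only transfer events.
transfers = (e for e in events if e["type"] == "transfer")

# Aggregate stage: materialized view mapping address -> net balance delta.
balances = defaultdict(int)
for e in transfers:
    balances[e["from"]] -= e["value"]
    balances[e["to"]] += e["value"]

print(dict(balances))  # → {'a': -3, 'b': 3}
```

Because each stage only consumes the output of the previous one, stages like these can be partitioned and run in parallel across keys or stream shards.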
Audit and Streaming Infrastructure
Proxima streams are built to give users the ability to process and audit streams of data in a fast, efficient manner. Streams are processed by the Proxima streaming server, where input events are consumed, processed, and produced to output streams. These output streams can represent new events or schema-specific document updates, and can be pushed to a variety of data stores.
Processing Data Streams
In the Proxima implementation, stream functions are represented as WASM byte code. They are executed on the streaming server, their results are produced to the corresponding output streams, and those streams are pushed to any data stores used to track events.
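The server loop can be sketched as follows. A real Proxima server would execute WASM byte code; here a plain Python callable stands in for the compiled stream function, and all names are assumptions for illustration.

```python
# Stand-in for a compiled WASM stream function: normalizes a raw event
# into a generic form. Field names are illustrative.
def stream_function(event):
    return {"kind": event["type"].upper(), "payload": event["data"]}

def process(input_stream, fn, output_sinks):
    """Consume input events, execute the stream function, and push
    each result to every registered output sink (data store)."""
    for event in input_stream:
        out = fn(event)            # execute the (WASM) function
        for sink in output_sinks:  # fan out to each data store
            sink.append(out)

store = []  # toy data store standing in for a real sink
process([{"type": "transfer", "data": 42}], stream_function, [store])
```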
Event Stream Audits
An audit represents a path in the stream-processing graph that can be traced back to validate the origins of a specific piece of data. The audit reference is written alongside the event data; it combines the byte code of the stream function, the locations and IDs of the corresponding input events, and the locations of the state transitions produced by evaluating the function.
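One way to represent such an audit reference is sketched below. The field names and types are assumptions based on the description above, not Proxima's actual wire format.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AuditReference:
    """Illustrative audit reference written alongside event data."""
    function_bytecode_hash: str          # identifies the WASM stream function
    input_events: List[Tuple[str, int]]  # (event id, stream location) pairs
    state_transitions: List[int]         # locations of produced transitions

# Hypothetical reference for one evaluation of a stream function.
ref = AuditReference(
    function_bytecode_hash="0xabc",
    input_events=[("evt-1", 10), ("evt-2", 11)],
    state_transitions=[12],
)
```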
Authenticating Audit Paths
In addition to processing data and creating the corresponding event streams and materialized views, stream functions, external data, and events are used to audit and verify any piece of data in the Proxima network. Auditing in the Proxima protocol rests on the ability to produce, transform, and consume streams of data, and then store and retrieve them in a trustless manner. For audits to be used in practice, the client must be able to order, retrieve, and execute the WebAssembly code on its side. In Proxima, this audit path is fed into a specific process structure and can be authenticated according to any number of rules.
Each audit link incorporates several parts:
- Authentication of the aggregation function(s): retrieving the function byte code and verifying its witness proof.
- Retrieval of the external data/event streams that serve as inputs, including receiving and verifying their witness proofs.
- Validation and execution of the aggregation function using the external data and events.
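The three steps above can be sketched as a single link-verification routine. `verify_witness` here is a stand-in for a real Merkle witness check (it just recomputes a hash commitment), and every name and signature is an assumption for illustration.

```python
import hashlib

def verify_witness(data: bytes, commitment: str) -> bool:
    """Toy witness check: recompute the hash commitment over the data.
    A real system would verify a Merkle witness proof instead."""
    return hashlib.sha256(data).hexdigest() == commitment

def verify_audit_link(fn_bytecode, fn_commitment,
                      inputs, input_commitments,
                      expected_output, execute):
    # 1. Authenticate the aggregation function byte code.
    if not verify_witness(fn_bytecode, fn_commitment):
        return False
    # 2. Retrieve and verify the external data / event inputs.
    for data, commitment in zip(inputs, input_commitments):
        if not verify_witness(data, commitment):
            return False
    # 3. Re-execute the function and compare with the recorded output.
    return execute(fn_bytecode, inputs) == expected_output
```

A link only verifies when all three checks pass, so a forged function, a tampered input, or a wrong recorded output each cause the whole link to fail.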
To verify the path between two specific events, it is necessary to do the following: check the correctness of the data and check its consistency. Correctness is checked by performing the state transitions between the events and data in the path; consistency is checked by verifying each piece of data against its witness proof.
Example
To understand this method in context, let's look at a simple audit path that transforms dummy blockchain transfer contract events from raw data into a generalized form. The audit link performs a validation operation between the Raw Transfer event and the function, to determine the correctness of the event function handler.
To determine that the audit metadata and functions are correct, there must be a way to authenticate the data and the requests for it. This includes a data check for both the Transfer and the Raw Transfer. To prove the correctness and consistency of the data, it must be checked with a Merkle witness proof; this can be done for every piece of data used in the function handlers.
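A minimal Merkle witness (inclusion proof) check is sketched below. The tree shape and hashing scheme are assumptions for illustration, not Proxima's exact format.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_merkle_witness(leaf: bytes, proof, root: bytes) -> bool:
    """Walk from the leaf to the root, combining with each sibling hash.
    `proof` is a list of (sibling_hash, sibling_is_left) pairs."""
    node = h(leaf)
    for sibling, is_left in proof:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == root

# Two-leaf example: the Raw Transfer leaf is proven against the root
# using its sibling's hash as the witness.
root = h(h(b"raw_transfer") + h(b"transfer"))
proof = [(h(b"transfer"), False)]
print(verify_merkle_witness(b"raw_transfer", proof, root))  # → True
```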
In the end, we check the audit by iterating over the nodes in the audit path and verifying the input and expected output values, the node metadata, and the function state transition.
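That final iteration might look like the sketch below, which replays each node's state transition and checks that consecutive nodes chain together (metadata verification is omitted for brevity). The node layout and the `execute` stand-in are hypothetical.

```python
def check_audit_path(path, execute):
    """Verify every node's state transition, then verify that each
    node's output feeds the next node's input."""
    for node in path:
        if execute(node["function"], node["input"]) != node["output"]:
            return False  # recorded transition does not replay
    for prev, nxt in zip(path, path[1:]):
        if prev["output"] != nxt["input"]:
            return False  # path does not chain consistently
    return True

# Toy two-node path: normalize a raw transfer, then tag it.
path = [
    {"function": "upper", "input": "transfer", "output": "TRANSFER"},
    {"function": "tag", "input": "TRANSFER", "output": "TRANSFER:v1"},
]
execute = lambda fn, x: x.upper() if fn == "upper" else x + ":v1"
print(check_audit_path(path, execute))  # → True
```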