The power and extensibility of streams: a case study of a generalized DEX
Getting data out of a blockchain is hard: the node API is not suited for arbitrary read queries, and the sheer volume of data makes reading it all from full nodes impractical. Loading and transforming blockchain data is becoming a difficult problem because of the quantity of data, the transaction speed, and block generation times. Decentralized applications, most notably exchanges, feel this problem acutely. In this case study, we pursue a formalization of a decentralized exchange data layer.
Goal
- Aggregate data across several DEXes on Ethereum and BSC.
- Use external exchange rates to provide multi-currency support.
- Provide a GraphQL endpoint for queries.
This process is broken down into several stages, sketched in code after the list:
- Creation of the source streams
- Development of schemas for the blockchains
- Pre-processing of the streams into a generalized set of events
- Aggregation and transformation of events that result in updates to the DEX schema
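To make the stages concrete, here is a minimal sketch of how they could fit together. All of the type names below are illustrative, not part of any existing library:

```typescript
// Illustrative stage signatures for the pipeline; all names are hypothetical.

// Stage 1: source streams carry raw per-chain block events.
interface BlockEvent {
  chain: "ethereum" | "bsc";
  blockNumber: number;
  logs: unknown[]; // raw, undecoded contract logs
}

// Stage 2: a per-chain schema describes how raw logs are decoded.
interface RawDexLog { exchange: string; pairAddress: string; payload: unknown }
type Decoder = (block: BlockEvent) => RawDexLog[];

// Stage 3: pre-processing turns decoded logs into a generalized set of DEX events.
interface DexEvent {
  exchange: string;
  pair: string;
  kind: "swap" | "mint" | "burn";
  amountIn: bigint;
  amountOut: bigint;
}
type PreProcessor = (log: RawDexLog) => DexEvent[];

// Stage 4: aggregation folds generalized events into document updates for the DEX schema.
interface DocumentUpdate { collection: string; id: string; patch: Record<string, unknown> }
type Aggregator = (event: DexEvent) => DocumentUpdate[];
```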
This approach combines multiple blockchains, DEXes, and versions, and remains extensible to the inclusion of any additional set of streams.
Article Highlights
- Versioning of updates across chains, smart contracts, and exchanges
- Enables fast switching and integration of different exchanges into a single stream of processed data, or into a materialized view.
- Multiple streams can be maintained in parallel to A/B test different processing approaches.
- In this case study, we construct a generalized DEX across PancakeSwap, SushiSwap, and Uniswap.
Generalized DEX
This generalized exchange incorporates three different exchanges on two separate blockchains by pre-processing their events into a common event schema. This reduces the issues that arise from different variable encodings and different event structures. With our system, it is possible to query exchange rates, accumulate volume, and merge token data across decentralized exchanges.
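As a rough illustration of what the common schema buys us, the following sketch assumes swaps have already been normalized to a shared shape (all field names are illustrative); cross-DEX volume then becomes a plain fold over a single stream:

```typescript
// Minimal sketch: a common, normalized swap shape shared by all exchanges.
interface DexSwap {
  chain: string;        // e.g. "ethereum" | "bsc"
  exchange: string;     // e.g. "uniswap" | "sushiswap" | "pancakeswap"
  tokenIn: string;      // token address or symbol
  tokenOut: string;
  amountInUsd: number;  // converted using an external exchange-rate feed
  timestamp: number;    // unix seconds
}

// Because every exchange emits the same shape, cross-DEX aggregation is trivial.
function volumeByToken(swaps: DexSwap[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const s of swaps) {
    totals.set(s.tokenIn, (totals.get(s.tokenIn) ?? 0) + s.amountInUsd);
  }
  return totals;
}
```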
Using this technique, new chains and exchanges can be integrated with low overhead: all that is required is creating the stream and pre-processing its events. Since content updates are represented as document changes, it is also possible to choose different data stores to load the documents into. This gives developers the ability to load the schema into whatever data store they want and run it on their own infrastructure.
Usage for DEX Data Streams
The data streams that result from the decentralized exchange can be used in a variety of ways:
- Document-database or relational index
- Source stream for additional stream processing
- Analytics
- Client-based trigger
Using streams to represent data
Events are the natural way to represent change in any system. Processing updates as a stream of events allows us to reconstruct the state at any point in time. Furthermore, an event stream is immutable: once added to the stream, events remain there forever.
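A minimal event-sourcing sketch of this idea, with illustrative event and state shapes: the state at any timestamp is simply a fold over all events appended up to that point.

```typescript
// Illustrative event and state shapes for reconstructing state from a stream.
interface PriceEvent { pair: string; price: number; timestamp: number }
type PriceState = Map<string, number>; // latest price per pair

// Replays the immutable stream up to a point in time to rebuild the state.
function stateAt(events: PriceEvent[], atTimestamp: number): PriceState {
  const state: PriceState = new Map();
  for (const e of events) {
    if (e.timestamp > atTimestamp) break; // assumes events are appended in timestamp order
    state.set(e.pair, e.price);
  }
  return state;
}
```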
Stream Processing
The infrastructure must be able to use third-party services while ensuring that every event is processed exactly once. The stream processor can take one or more streams as input and transform them into any number of output streams. Additionally, the processor can introduce intermediate streams and use third-party services when processing them. The basic operations are:
- Merge
- Split
- Filter
- Transform
When merging streams, it is important to have a deterministic method for ordering events across them, as sketched below.
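Here is a minimal sketch of such a deterministic merge, assuming each event carries a block number and a log index (field names are illustrative). Any total ordering works, as long as repeated runs always produce the same output order:

```typescript
// Illustrative event shape carrying enough fields to define a total ordering.
interface ChainEvent { chainId: number; blockNumber: number; logIndex: number }

// Total ordering: block number, then log index, then chain id as a tie-breaker.
function compare(a: ChainEvent, b: ChainEvent): number {
  return a.blockNumber - b.blockNumber
    || a.logIndex - b.logIndex
    || a.chainId - b.chainId;
}

// Deterministic merge of two already-ordered streams.
function mergeStreams(left: ChainEvent[], right: ChainEvent[]): ChainEvent[] {
  const out: ChainEvent[] = [];
  let i = 0, j = 0;
  while (i < left.length && j < right.length) {
    out.push(compare(left[i], right[j]) <= 0 ? left[i++] : right[j++]);
  }
  return out.concat(left.slice(i), right.slice(j));
}
```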
Building the Decentralized Exchange
Build source streams and their schema (blockchains)
The source streams are the entry point to any further processing. The most natural event in any blockchain is “new block mined”, so having all “new block mined” events in the stream allows us to build any state based on that blockchain's data. A blockchain is nominally a chain, but dealing with live events requires handling chain reorganizations (forks). To support forks, we have to introduce an “undo mined block” event.
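The following sketch shows the two source events and a consumer that keeps a fork-aware view of the canonical chain; the event and field names are illustrative, and a real payload would carry full block data:

```typescript
// Illustrative source events: a block was mined, or a previously mined block was undone.
type BlockStreamEvent =
  | { type: "block_mined"; blockNumber: number; blockHash: string }
  | { type: "undo_block"; blockNumber: number; blockHash: string };

// Rebuilds the canonical chain: append on "block_mined", pop on "undo_block".
function applyEvent(chain: string[], event: BlockStreamEvent): string[] {
  if (event.type === "block_mined") {
    return [...chain, event.blockHash];
  }
  // "undo_block": a reorg made the most recently appended block non-canonical.
  return chain.slice(0, -1);
}
```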
Build the DEX schema
This means that we need to construct the schema required by the consumer. We cannot solve these problems without building our own indices, since we cannot rely on the blockchain node API. A consumer is interested in a specific subset of blockchain data and a specific set of queries, so we have to provide a way to index only the data that is needed.
The most important point about building another representation of the data is ensuring that it is eventually consistent with the source. Indices should allow users to run any kind of query, and they should be highly scalable in order to provide the required read performance.
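As an illustration, the index could expose documents along these lines; the exact fields are assumptions about what a typical DEX consumer queries, not a fixed schema:

```typescript
// Illustrative documents for the consumer-facing DEX index.
interface PairDocument {
  id: string;            // pair contract address
  exchange: string;      // "uniswap" | "sushiswap" | "pancakeswap"
  chain: string;         // "ethereum" | "bsc"
  token0: string;
  token1: string;
  lastPrice: number;     // token1 per token0
  volume24hUsd: number;  // rolling aggregate maintained by the stream processor
}

interface TokenDocument {
  id: string;            // token address
  symbol: string;
  priceUsd: number;      // derived from external exchange rates for multi-currency support
}
```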
Processing DEX Event Stream
While processing the DEX event stream, we both aggregate and transform the data. In the case of PancakeSwap and Uniswap, the raw swap events from each pair are normalized into the generalized DEX events and then aggregated into exchange rates and volumes.
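A minimal sketch of this step, assuming Uniswap-V2-style pair contracts (a structure PancakeSwap also follows); the type and function names are illustrative:

```typescript
// Raw Swap log fields as emitted by Uniswap-V2-style pair contracts (sketch).
interface RawSwapLog {
  pairAddress: string;
  amount0In: bigint; amount1In: bigint;
  amount0Out: bigint; amount1Out: bigint;
}

// The generalized event shared by all exchanges.
interface GeneralizedSwap {
  exchange: string;
  pair: string;
  amountIn: bigint;
  amountOut: bigint;
}

// Transform: normalize a raw log into the generalized event.
function toGeneralizedSwap(exchange: string, log: RawSwapLog): GeneralizedSwap {
  return {
    exchange,
    pair: log.pairAddress,
    amountIn: log.amount0In + log.amount1In,
    amountOut: log.amount0Out + log.amount1Out,
  };
}

// Aggregate: accumulate per-pair volume regardless of which exchange emitted the swap.
function addToVolume(volumes: Map<string, bigint>, swap: GeneralizedSwap): void {
  volumes.set(swap.pair, (volumes.get(swap.pair) ?? 0n) + swap.amountIn);
}
```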
Managing data
The generalized approach to stream management allows developers to set up their own streams and push document updates to their own data layer, so they can run the infrastructure themselves.
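One way to sketch that contract: the stream processor emits document updates, and any store that implements the same small interface can consume them. The interface and class names below are hypothetical:

```typescript
// The change emitted by the stream processor (illustrative shape).
interface DocumentUpdate {
  collection: string;
  id: string;
  patch: Record<string, unknown>;
}

// Anything that can apply updates can serve as the data layer.
interface DocumentStore {
  apply(update: DocumentUpdate): Promise<void>;
}

// In-memory store for local development; a MongoDB- or Postgres-backed
// implementation would satisfy the same interface.
class InMemoryStore implements DocumentStore {
  private data = new Map<string, Record<string, unknown>>();
  async apply(u: DocumentUpdate): Promise<void> {
    const key = `${u.collection}/${u.id}`;
    this.data.set(key, { ...(this.data.get(key) ?? {}), ...u.patch });
  }
}
```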
Extending the DEX
It is possible to include new versions, new DEXes, or new blockchains in this generalized approach. Since the generalized DEX relies on a common set of events, it is only necessary to transform the new stream's events into the generalized DEX events. Once transformed, these events simply need to be merged into the stream of generalized DEX events, and they will be processed like any other. This means that handler functions do not need to be rewritten when updating versions or blockchains, or when adding decentralized exchanges; new handlers are only needed when the schema itself changes.
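As a sketch of what such an extension looks like, the adapter below maps a hypothetical new exchange's raw trade events into the existing generalized event type; everything downstream stays untouched. All names are illustrative:

```typescript
// The generalized event already consumed by the existing handlers.
interface GeneralizedSwap { exchange: string; pair: string; amountIn: bigint; amountOut: bigint }

// An adapter is the only new code required for a new exchange.
type Adapter<Raw> = (raw: Raw) => GeneralizedSwap[];

// Hypothetical raw trade shape for a newly added exchange.
interface NewDexTrade { pool: string; sold: bigint; bought: bigint }

const newDexAdapter: Adapter<NewDexTrade> = (t) => [{
  exchange: "newdex",
  pair: t.pool,
  amountIn: t.sold,
  amountOut: t.bought,
}];

// Merge the adapted stream into the generalized stream; existing handlers keep working.
function mergeInto(generalized: GeneralizedSwap[], adapted: GeneralizedSwap[]): GeneralizedSwap[] {
  return generalized.concat(adapted); // a real merge would apply the deterministic ordering above
}
```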