Building Blocks for DEX Router Construction & Analysis

Valve Finance
12 min readAug 9, 2021

--

by Alex Carreira & Prabhaav Bhardwaj

Overview

Exchanging one asset for another is a foundational concept of financial markets. In cryptocurrency markets, this commonly occurs where tokens or currencies are swapped or traded for others. Uniswap is an automated liquidity protocol that facilitates this type of swapping. It uses pairs or pools (henceforth pairs), pooled reserves of two assets[1], to allow a user to swap one asset for another.

Figure 1.0: A Uniswap pool of tokens A and B along with example swap and deposit interactions by a Liquidity Provider (LP) and Trader, respectively. The LP receives Pool Tokens for providing liquidity. Image credit [2].

What happens if someone wants an asset that is not available in a pair with the asset they wish to trade with? In this situation, a series of swaps are made across multiple pairs to get the desired asset — the pairs used to facilitate this trade are called a route.

Figure 2.0: A route involving multiple pairs to trade DAI for USDC. Image credit [3].

Routes are a path from one asset to another consisting of zero or more segments between pairs. A route can be formed from multiple paths to reduce slippage by incorporating more liquidity from other pairs if the amount being traded is large enough or if the liquidity of a pair is low enough.

Figure 3.0: A multiple path route illustrated by the Paraswap Router
Figure 4.0: A multiple path route, optimizing for maximum return from the 1inch Router

Slippage is the difference between the price a person expects to pay for an asset and the actual amount paid due to factors such as price movement between the time the order enters the market and the execution of a trade or low volume and liquidity [4].

Another consideration is the number of segments in a path. This will contribute to the cost of the trade in the form of gas fees for operations on the Ethereum blockchain. The more segments in a path, the higher the resulting gas fees. Similarly, if there are multiple paths, as shown above in figures 3 and 4, gas fees will also be higher.

A router must take these factors into consideration to produce suitable routes for the amount being traded. Furthermore, because market conditions are frequently changing which affects gas fees and pool liquidity, generated routes will also be dynamic and an excellent route now, will not necessarily perform well an hour later or the next day.

The remainder of this document discusses building blocks that can be used in the construction and analysis of a router for the Uniswap V2 protocol and beyond.

Data Modelling

Note: The Graph protocol mentioned herein differs from the graph data type discussed in the following section. The Graph protocol is an index of blockchain transactions for one or more smart contracts, whereas the graph data type refers to a representation of data using mathematical graph theory.

Just as a map is used to navigate between points, a graph data type [5] can be used to navigate the available liquidity pairs to produce routes that can be evaluated for improved returns. In modeling Uniswap pairs, a number of implementation choices need to be made at the outset:

  • Pairs, symbols, or asset identifiers as vertices or edges?
  • Directed or undirected?
  • Simple or multigraph?

To make the implementation choices outlined above, it’s important to understand the properties of Uniswap pairs:

  • Each pair has a unique ID
  • Each pair contains the following data for the pairs 2 tokens: Symbol, Name, ID
  • Token symbols are not unique — for instance, the symbol BOND represents many different assets or different IDs.
  • Token names are also not necessarily unique.
  • Token IDs are unique and are the token’s ERC-20 contract address

These properties of Uniswap pairs suggest the use of the Token IDs as vertices in the graph data type. It follows that the edges of the graph represent the unique pair ids. In this form, the graph may be undirected or directed with each edge representing the pair id and token price and reserve. However, the constant need to update token price and reserve information, suggests storing this data in a caching structure with appropriate time-to-live settings may be more efficient and scalable, especially for applications in real-time trading as opposed to static analysis.

Going forward, it is desirable to route between Uniswap V2 and V3 protocol pairs. In this scenario, it may be that multiple pairs exist for the same Token IDs. While it’s possible to add an additional edge between pair IDs, another solution is grouping the different pair IDs on the same edge, avoiding the performance costs of traversing a multigraph. A partial example of the grouped pair ID undirected simple graph structure appears below with symbol IDs substituted for symbol names:

Figure 5.0: Modelling Uniswap V2 and V3 protocol pairs in an undirected simple graph (note that the actual structure uses token IDs instead of token symbols, which would have been cumbersome here).

Initially, a Depth First Search (DFS) has proven sufficient for traversing the graph, limited to a depth of 4 in the general case. An exception to this depth is routes starting from WETH, where the number of connected nodes exceeds 30,000. When a route starts from WETH, the DFS has been limited to a depth of 2, to reduce traversal times.

A number of tools exist for working with graph data types including the graph databases Neo4J and RedisGraph. A discussion of these exceeds the scope of this document and the current project requirements can be met with the Javascript library Graphlib [6]. However, should the scale of the routing problem reach a similar magnitude to that of LinkedIn or other large networks, then the aforementioned graph databases reportedly scale to meet those demands, trading off cost and development complexity.

Constraints

Constraints are useful when computing routes in a graph data structure. For example, they facilitate determining routes that only traverse a limited number of pairs or can be used to ignore pairs containing certain assets.

Existing Uniswap V2 routing is dominated by routes through six assets that can be likened to airport hubs. These six assets are:

  • WETH
  • DAI
  • USDC
  • USDT
  • COMP
  • MKR

The six assets are useful because they are in common use and when combined with other assets in a pair, do not impose liquidity limitations (i.e. they’re not scarce and do not pose the same risk as new, unproven crypto assets). However, their use can pose efficiency problems documented here [7], where it is stated:

“Uniswap is not routing swaps in a decentralized way.”

Using constraints, such as ignoring the six hub assets mentioned above, allows exploration of potentially more efficient routes than what the current Uniswap V2 interface offers users. Constraints can also be extended to other criteria, such as:

  • Routing through pools with X liquidity in USD or Y% of the trade in USD.
  • Routing through specific pools with specific pricing. (Useful for Uniswap V3 Protocol)

It’s also worth noting that constraints are composable — that is they can be combined so that routing can be limited to a maximum of 2 pools, each with liquidity exceeding X. The current data model splits data between the graph data type and lookup tables which means that routes can be pruned both during the graph traversal and afterward when pair data is being discovered.

Scaling

Scaling is largely considered for exposing a public API to the router for computing routes based on current market data. Performance is a function of:

  • the time to compute a route
  • the time to get updated pair data if needed (token prices and reserves)
  • the time needed to compute the impact of a route request

The diagram below illustrates one initial system architecture, featuring caches for the most commonly requested routes and cached pair data for calculating route request impact. This architecture is flexible and can be scaled horizontally in multiple ways. For example, replicating the graph data structure and route caches as well as the request aggregator and pair caches completely, or simply by replicating the caches and distributing route requests among the caches.

Figure 6.0: Routing service architecture featuring scalable components, caching, and periodic updates.

Another potential scalability modification could be a complete route solution cache that considers the route request and amount; if the amount is within a certain tolerance a recently calculated result can be reused. The recently calculated result can also be used as a temporary result while a more accurate one is computed for a user, depending upon the needs of the UX and application.

Route Cache

The route cache consists of results of the most commonly requested routes with a Time-To-Live (TTL) related to the periodic update frequency from the data source. For example, if a route between WETH and DAI has been previously requested, the results of the graph traversal can be found within the route cache as an array of possible routes: [WETH -> USDC -> DAI, WETH -> WBTC -> DAI, …]. Unlike pair data, such as token prices and reserves, the route possibilities — specifically the presence of pairs — change less frequently, so this cache’s TTL is expected to be significantly larger than the pair data cache. Additionally, this component may expand to incorporate heuristics for dead routes (i.e. pairs that are expired or have superseded token addresses or illiquid pairs).

Pair Data Cache

The pair data cache contains the token price and reserve data for specific pairs, facilitating computing of the impact of a route request using a specified amount. This data changes more frequently than the available routing options and will have a TTL that is approximately one ethereum block (~ 15s).

When data requested is not in the cache, those requests will be bundled together and fetched from the data source, currently the Uniswap Subgraph on The Graph Protocol (see the Data Sources section below), in groups up to 1000 pair data queries. The wait time before sending a bundled pair data request will be tuned based on traffic — in low traffic scenarios, it doesn’t make sense to make users wait until additional requests are made.

Additionally, for improved UX, pre-results based on old data may be presented while updated data is being fetched and computed.

Static Analysis

Route performance will be evaluated by comparing the yield of generated routes against the existing Uniswap V2 router’s routes. Specifically by computing the yield for a given amount of the source token. For example, in a trade from DAI to COMP, the performance of the router will be compared by calculating how much COMP is received from 1 million DAI tokens and examining the same result from the Uniswap V2 router’s suggested route. The performance will be measured for a variety of varying inputs, for example, different initial amounts, different limitations to the search depth, etc.

The aforementioned pair data cache and bundled requests of figure 6.0 are not needed when the static analysis is being performed. Static analysis is the computation of trades at a specific block time in the blockchain. It facilitates consistency and repeatability of results for comparison purposes. The scope of the initial work in designing a new router for Uniswap V2 is aided by static analysis where an ensemble of trades can be evaluated at a block time and compared to the existing routing algorithm or variations. If the underlying pair data was changing, it would be unclear if improvements or degradations in trading results were due to algorithmic changes or pair liquidity and pricing.

Data Sources

As blockchain technologies mature, a plethora of services have surfaced to provide current and historical blockchain data. The choice of a data source involves the following considerations:

  • Budget
  • Development Effort & Cost
  • Latency
  • Data Accuracy

Development effort and cost can be reduced by using pre-digested or indexed data like that found in the Graph Protocol’s numerous sub-graphs.

For initial swap router design and implementation, the Uniswap Subgraph of the Graph protocol will be used. This data source provides excellent ease of use, the ability to analyze contract data from the past that would otherwise necessitate a more expensive archival Ethereum Node, and the ability to update the data for 1000 pairs in a single HTTP request when the static analysis is not being performed. Data in the Graph Protocol has much lower latency than the aforementioned solution and is only bested by running an Ethereum node directly or modeling the mempool (or perhaps with services like Blocknative). The Graph Protocol’s latency is observably about 1 block, which can change dynamically under certain indexing scenarios. Notably, the data can also change when indexed and mapped blocks prove to be invalid[8][9].

Once the swap router design has been evaluated, this data source would prove unsuitable for generating routes because of the need for real-time data to be competitive. In such a scenario, data directly from the current state of the blockchain would be required, for example, Ethereum nodes from Alchemy, Infura, or another source.

Future

The system outlined above provides flexibility and extensibility to analyze the performance of existing Uniswap systems as well as building new systems atop the protocol including a full-fledged trading solution. Similar to Coinbase vs Coinbase Pro or Synthetix vs Kwenta, there are advanced features essential for professional traders, and we have outlined a few below.

Trade Generator

By employing constraints that avoid the aforementioned 6 hub tokens, the system described herein can be used to check for alternate routes and their efficiency between certain tokens. This can be done on a regular basis to construct a heuristic-based list that can be employed by existing systems/traders to improve their recommended routing for those pairs or allow them to make other changes.

Cross-Protocol Routes

By augmenting the graph edges or alternately making the graph data structure a multi-graph and adding extra data sources, it is possible to extend this system to furnish users with routes between assets that cross from between the Uniswap V2 and V3 protocols. Depending upon the objectives, this can serve to reduce slippage in trades or alternately work towards managing fragmented liquidity.

Cross-Layer Routes

Similar to cross-protocol, the router can be extended to generate cross-layer routes. The promise of reduced gas fees and improved transaction bandwidth suggest that layer 2 solutions will help define a significant portion of the future of Ethereum. Crossing between a layer 2 and layer 1 asset presents a new routing challenge with complexity that exceeds protocol crossings. However, the same fundamental building blocks can generate solutions that allow for trades across layers, taking advantage of upcoming protocols for this purpose, for example, the Hop Protocol [10], Nova [11], and others.

MEV (Maximally Extracted Value)

By combining route solutions with technologies to prevent MEV, such as Flashbots [12], the routing system herein can be used to protect large trades from attacks. Heuristics or other inputs can determine if the value of a trade is sufficient to present such a risk and then protective solutions can be automatically incorporated into the trade solution presented from the determined best routes.

Low Latency Data Sources & Predictive Routing

The Graph Protocol data source while convenient and fast does have at least 1 block of latency typically. This may be insufficient for some applications in trading. It’s possible to substitute other data sources that are more performant in this regard, either by tying directly into an ethereum node and filtering completed transactions of interest to update the data structures of the routing system. Going a step further, it is also possible to construct a limited depiction of the mempool and apply similar transaction filtering in an effort to provide predicted routing results given the state of unexecuted known transactions.

References

  1. Hayden Adams, Noah Zinsmeister, and Dan Robinson. (March 2020). Uniswap v2 Core. [Online]. Available: https://uniswap.org/whitepaper.pdf
  2. “Core Concepts Pools”. Uniswap.org. https://uniswap.org/docs/v2/core-concepts/pools/ (accessed Jun. 22, 2021)
  3. Michiel Mulders. “What is Uniswap in Simple Words?” Cryptopotato.com. https://cryptopotato.com/what-is-uniswap-in-simple-words/ (accessed Jun. 22, 2021)
  4. “What is Slippage” Coinmarketcap.com. https://coinmarketcap.com/alexandria/glossary/slippage (accessed Jun. 22, 2021)
  5. “Graph (abstract data type)” Wikipedia.org. https://en.wikipedia.org/wiki/Graph_(abstract_data_type) (accessed Jun. 22, 2021)
  6. Graphlib. (2021). [Online]. Available: https://github.com/dagrejs/graphlib
  7. Alessandro Mario Lagana Toschi. (October 2020). “A Black Hole in Uniswap V2’s Front-End Router Is Draining the Value of Tokens” https://medium.com/dfohub/a-black-hole-in-uniswap-v2s-front-end-router-is-draining-the-value-of-tokens-26f5a459b5d7 (accessed Jun. 22, 2021)
  8. “Address certain corner cases of time-travel queries by block hash“ Github.com. https://github.com/graphprotocol/graph-node/issues/1405 (accessed Jun. 22, 2021)
  9. “Detect reorgs during execution of a GraphQL query” Github.com. https://github.com/graphprotocol/graph-node/pull/1801 (accessed Jun. 22, 2021)
  10. Chris Whinfrey. (January 2021). Hop: Send Tokens Across Rollups. [Online]. Available: https://hop.exchange/whitepaper.pdf
  11. Nova. (2021). Accessed Jun. 22, 2021. [Online] Available: https://docs.rari.capital/nova/#l2-novaregistry
  12. Hasu. (June 2021). How mistX uses Flashbots bundles to provide frontrunning protection. [Online]. Available: https://www.paradigm.xyz/2021/06/how-mistx-uses-flashbots-bundles-to-provide-frontrunning-protection/

--

--