Relaying The Message: A Deep Dive Into IBC Relayer Operations
The team at Interchain GmbH will shortly release ICS-29 (relayer fee incentivization) for the ibc-go module. This marks an important milestone on the roadmap towards scalability of the IBC (Inter-Blockchain Communication) protocol.
Relayer operators are a crucial part of the IBC infrastructure, yet sometimes their presence is taken for granted. In this blog post, we will shine a spotlight on relayers and consider ways to ensure smooth and reliable relayer coverage for IBC enabled (or aspiring) chains. The content and suggestions below are derived from a number of interviews we’ve conducted with relayer operators, kickstarting an effort to monitor the relayer experience and plan additional support where it may be necessary. We hope the conclusions from the first iteration of interviews will be helpful for those interested in starting to relay, for new chains who want to enable IBC or simply for anyone interested in knowing more about IBC and how it’s made possible.
Reading guide: this article is intended to provide information for multiple audiences. Chain representatives interested in a broad overview may want to skip the sections on Relayer software and Technical setup. Aspiring relayer operators on the other hand may want to focus on precisely these sections as a starting point.
Relayers’ Role in IBC
Let’s briefly recap what relaying is, why it is important, and how it relates to Inter-Blockchain Communication (IBC). IBC gives blockchains a protocol for the reliable, secure and permissionless transfer of packets of data. The protocol is agnostic with respect to the data, paving the way for application developers to build a range of Interchain services (fungible and non-fungible token transfers are obvious candidates, but so is arbitrary cross-chain messaging via Interchain Accounts or Interchain Security).
At a high level, this works as follows. A module on a source chain wants to send a packet to a destination chain. It submits a message to the source chain, which stores a commitment to the packet on-chain and logs an event with the packet information. With this information and a proof of the commitment, a message can be submitted to the destination chain. There, the IBC client verifies the proof and, if verification succeeds, the chain stores a receipt and the receiving module executes the required actions according to the packet data. For simplicity, we will leave out the acknowledgement and timeout functionality, which you can find more information on in the documentation.
There are two important considerations to take into account in this flow. First, the receiving chain needs to verify the proof of the commitment made on the source chain. This is why a light client is required to track the state of the counterparty chain (in an efficient way). Again, we refer to the documentation to find out more. Second, blockchains cannot directly communicate with one another, so how do the proof and packet data arrive at the destination chain to continue the flow described above?
This is where relayer operators come into the picture: they ensure the relaying of packets over network infrastructure. Relayers have access to full nodes of both source and destination chains, where they can query and submit messages. On the channels they relay for, they listen for events that indicate an IBC packet has been sent. They run relayer software that enables them to rebuild the packet along with the proof and submit both to the destination chain. A similar process then happens once the receipt is stored on the destination chain, triggering the acknowledgement message to be sent back to the source.
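To make the flow above more concrete, here is a minimal sketch of a toy model in Python. It is not how ibc-go encodes commitments or verifies proofs: the hash format, the `Chain` and `Packet` classes and the `relay` function are all assumptions made for illustration, and the light client is reduced to a trusted view of the source chain's commitments. It does show the key idea that the destination chain checks what the relayer delivers against state committed on the source chain.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    sequence: int
    source_channel: str
    dest_channel: str
    data: bytes


def commitment(packet: Packet) -> str:
    """Toy commitment: a hash over the packet fields (not the real IBC encoding)."""
    h = hashlib.sha256()
    h.update(f"{packet.sequence}/{packet.source_channel}/{packet.dest_channel}/".encode())
    h.update(packet.data)
    return h.hexdigest()


@dataclass
class Chain:
    name: str
    commitments: dict = field(default_factory=dict)  # sequence -> commitment
    receipts: set = field(default_factory=set)       # sequences already received
    events: list = field(default_factory=list)       # emitted "send packet" events

    def send_packet(self, packet: Packet) -> None:
        # The source chain stores a commitment and logs an event.
        self.commitments[packet.sequence] = commitment(packet)
        self.events.append(packet)

    def recv_packet(self, packet: Packet, counterparty_state: dict) -> bool:
        # Stand-in for light client verification: the delivered packet must
        # match the commitment recorded in the trusted counterparty state.
        expected = counterparty_state.get(packet.sequence)
        if expected is None or expected != commitment(packet):
            return False
        if packet.sequence in self.receipts:
            return False  # already delivered, nothing to do
        self.receipts.add(packet.sequence)
        return True


def relay(source: Chain, dest: Chain) -> int:
    """The relayer's job: pick up events on the source, submit them to the destination."""
    delivered = 0
    for packet in source.events:
        if dest.recv_packet(packet, source.commitments):
            delivered += 1
    return delivered
```

In this model a relayer that tampers with the packet data cannot get it accepted, because the recomputed commitment no longer matches the one stored on the source chain. The relayer is needed for liveness, not for security.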
Who are relayer operators?
Relaying is permissionless. The IBC protocol is designed to function securely even in the presence of faulty or malicious relayers. The permissionless nature of relaying, as well as the permissionless creation of clients, connections and channels on an IBC-enabled chain, is a key value proposition of the IBC vision of interoperability. Unlike most other interoperability solutions, which require a trusted validator set as an intermediary for a token transfer bridge, in IBC we trust the chains themselves, not the bridge. The implication is that the management, security and upkeep of the bridges in a scaling Interchain ecosystem should not be bottlenecked by a limited number of trusted parties.
Permissionless relaying means anyone can, theoretically, start relaying. In reality, however, relaying between chains involves significant hardware requirements and know-how of running node infrastructure. Because relaying puts a lot of RPC (remote procedure call) pressure on a node (refer to the technical section below for more detail), it is recommended that relayers run their own full nodes for the chains they want to relay for. Due to this, and the lack of (in-protocol) incentivization for relayers before the introduction of ICS-29, relaying is mostly done by teams already validating IBC-enabled chains (typically, relayer operators service only a few chains, if any, that they are not validating on). We will revisit the topic of cost and incentivization later on, to investigate ways in which relayers are (or can be) rewarded for their efforts.
For now, the take-away for chains that want to implement IBC and need to secure relayer operator support is: engage with your own validator community or those of the counterparty chains you would like to connect to. For example, most if not all of the current top relayer operators started relaying when a large chain decided to adopt IBC (Osmosis, Juno, Secret, …). Another option is to contact the main relayer operators and inquire if they offer relaying-as-a-service (check out the Osmosis IBC relayer list to find out which teams offer this service). It will be interesting to monitor how the situation changes with relayer fee incentivization, but as of the time of writing, this is the best bet.
Relayer software
Once you have decided to begin relaying, the next step is to look at the different relayer software implementations that are available. At the moment these are:
- Hermes relayer. Developed by Informal Systems in Rust.
- Golang relayer. Developed by Strangelove Ventures in Go.
- TypeScript relayer. Developed by Confio in TypeScript.
Both Hermes and the Go relayer offer some enticing features to get started relaying. The Hermes documentation is quite extensive, providing an overview of a range of functionality (please refer to the feature matrix in the documentation to find out what is currently supported), but it may require a more thorough knowledge of the IBC protocol. The Go relayer focuses on making it easy to get up and running, so it could be the easier option to start with. Both the Hermes and Golang relayers offer tutorials to test with two local chains. This allows for experimentation with the software and configuration files before moving on to testnets and finally mainnets. A desire for more incentivized testnets became apparent during relayer operator interviews and might be something to consider when implementing IBC for a new chain, to onboard new or existing relayers.
Relayer software uses config files where all important configuration parameters are grouped, including chain parameters (among others, the RPC, gRPC and websocket endpoints) and path information. For Hermes, you can find the explanation for different parameters here. The Go relayer has functionality to fetch config data from the chain registry, as you can follow on their GitHub. You can always check out the Cosmos chain registry for Interchain chain and asset metadata. Recently, an IBC data schema was added to improve the user experience for relayers. We are constantly trying to improve the available data and keep it up to date.
Most relayer operators we spoke to during the interviews are either exclusively running Hermes or mainly running Hermes, complemented with the Golang relayer as backup or for specific use cases (for example, when a channel is congested and packets get stuck, some operators found that the Golang relayer works a little better to clear the channel). The most cited argument for this setup is performance in a production environment. The team responsible for maintaining the Go relayer told us, however, that they are working hard on improving performance as well as logging and debugging tools for a future release. Additionally, they are focusing on using a provider interface for the relayer software to facilitate integration with non-SDK chains, an important feature as the IBC protocol gains more traction outside of the Cosmos ecosystem where it originated.
We believe it’s definitely worth familiarizing yourself with both applications to find out the pros and cons and perhaps find hybrid workflows to optimize your setup. For example, the team at Notional provides an overview of a hybrid setup here (it should be noted that this was valid for the Golang relayer v1, so this should be revised when the final v2 release is cut).
When support is required to set up relaying operations or if you have questions or issues during the process, expert support is available from experienced relayers and the relayer software development teams in the IBC Gang Discord. Additionally, we are investigating the possibility of setting up a best practices repository where operators can share knowledge about their setup. More information on that will follow.
Some additional points we’d like to raise, and common pitfalls to note, include:
- It should be stressed that not every relayer operator needs to initialize their own channel when starting to relay a certain path. This is a common misconception among new relayer operators.
As of today, IBC token transfers via ICS-20 are the main IBC application, and these require only one canonical channel to be opened between two chains. In fact, tokens sent over a different channel will not be fungible with those sent over the canonical channel, as a result of the way IBC denoms work, which is to be avoided. Unless you are helping a chain set up IBC, there likely is a canonical channel already established. When in doubt, you can check Map of Zones, Mintscan or query channels with the relayer software to see if there is already a canonical channel. Recently, a new schema for IBC data was added to the chain registry, and it will be expanded in the near future. This will allow for definitive info on the canonical channels.
- Note: As more chains implement Interchain Accounts (which currently do require a separate channel for each Interchain Account), it will become more commonplace for relayer operators to set up a channel, but it is still recommended to do some research first and see if there’s a canonical channel established.
- Relaying packets often requires that the relayer pays fees when submitting messages (more on this later). Keep in mind that you will need to have a wallet set up with funds available to successfully relay packets. This is true in local testing setups (you can easily grant addresses funds through the config files), on public testnets (look to get testnet tokens via a faucet) and on mainnet.
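To illustrate why tokens sent over different channels are not interchangeable, here is a small sketch of how ICS-20 voucher denoms are derived in ibc-go: the denom on the receiving chain is `ibc/` followed by the uppercase SHA-256 hash of the full trace path (port, channel and base denom). The function name `ibc_denom` is ours, and `channel-42` is a hypothetical second channel used only for comparison.

```python
import hashlib


def ibc_denom(port: str, channel: str, base_denom: str) -> str:
    """Derive the voucher denom for a token received over the given port and channel."""
    trace = f"{port}/{channel}/{base_denom}"
    return "ibc/" + hashlib.sha256(trace.encode()).hexdigest().upper()


# Same base token, two different channels on the receiving chain.
canonical = ibc_denom("transfer", "channel-0", "uatom")
duplicate = ibc_denom("transfer", "channel-42", "uatom")  # hypothetical extra channel

print(canonical)
print(duplicate)
print(canonical == duplicate)  # False: the two vouchers are distinct assets
```

Because the channel identifier is part of the hashed path, opening a second transfer channel between the same two chains yields a second, separate asset for the same underlying token, which fragments liquidity.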
Technical setup
In the previous section, we introduced the different implementations of relayer software and identified the config file as the place to group configuration parameters. In this section, we refer to a more concrete example for the technical setup and identify the biggest current technical bottleneck.
As a large user of IBC, Osmosis depends on good relayer coverage for all connected chains. They provide a guide to start relaying that goes in depth on the hardware requirements, installations and configuration setup to start relaying. When going through the guide, you will notice how it’s important to configure the endpoints for RPC, gRPC and websocket servers and to make sure the relayer config (Hermes in this example) includes the correct endpoints.
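As a rough illustration of the endpoint checks described above, the sketch below validates a chain entry before it goes into a relayer config. The key names (`rpc_addr`, `grpc_addr`, `websocket_addr`) and the dictionary layout are assumptions for this example; check the documentation of your relayer version for the actual configuration keys, which have changed over time.

```python
from urllib.parse import urlparse

# Assumed key names and the URL schemes we expect for each endpoint.
REQUIRED_ENDPOINTS = {
    "rpc_addr": {"http", "https"},
    "grpc_addr": {"http", "https"},
    "websocket_addr": {"ws", "wss"},
}


def check_chain_endpoints(chain: dict) -> list:
    """Return a list of human-readable problems found in one chain entry."""
    chain_id = chain.get("id", "?")
    problems = []
    for key, schemes in REQUIRED_ENDPOINTS.items():
        value = chain.get(key)
        if not value:
            problems.append(f"{chain_id}: missing {key}")
            continue
        parsed = urlparse(value)
        if parsed.scheme not in schemes or not parsed.hostname:
            problems.append(f"{chain_id}: {key} has unexpected URL {value!r}")
    return problems


# Hypothetical local test chain with all three endpoints configured.
example_chain = {
    "id": "chain-a",
    "rpc_addr": "http://127.0.0.1:26657",
    "grpc_addr": "http://127.0.0.1:9090",
    "websocket_addr": "ws://127.0.0.1:26657/websocket",
}
print(check_chain_endpoints(example_chain))  # []
```

A check like this only catches missing or malformed entries. Whether the endpoints are reachable and belong to the right chain still has to be verified against the running nodes.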
The Osmosis relayer guide mentions a few commands to clear packets. A more exhaustive list can be found in the Hermes documentation (for the Go relayer, you can run rly -h from the CLI to find a list of commands). These include commands to set up new clients, connections and channels (only in case no canonical channel is present) including handshake messages, packet messages to deliver Receive and Acknowledge messages, update and upgrade client messages, clearing packets and a number of queries. Most of these are executed automatically according to need when using the `hermes start` command, but they can also be used more granularly.
As we mentioned earlier, relayers have access to full nodes of both source and destination chains, where they can query and submit messages. They are “listening” in on the chains they relay for, watching for events that indicate an IBC packet has been sent on the channels they service (filters can be configured to allow or block certain channels). Currently, relayers use the Tendermint RPC endpoint to query for the commitment proofs that are required to verify IBC transactions on the counterparty chain. The volume of these queries can put significant pressure on the RPC endpoints of the nodes, which is one of the main bottlenecks currently in production. Because the Tendermint RPC is single-threaded, large amounts of relayer queries may cause the node to fall out of sync, requiring regular resets. This is not a result of the relayer software, but rather of the node architecture, and will likely improve along with the ongoing work on gRPC support.
Call to action: both the teams at Informal and Strangelove Ventures who maintain the relayer software would like to invite relayer operators to provide feedback if there are issues or struggles. They actively monitor the respective #hermes and #rly channels in the IBC Gang Discord and can potentially provide live debugging sessions when the need arises. Both teams are determined to improve the design of the software to suit the needs of relayer operators in production.
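The channel filtering mentioned above can be sketched as follows. This is a hypothetical `PacketFilter` class written for illustration, not the API of any relayer: it assumes a policy that is either an allowlist, a denylist or "relay everything", applied to (port, channel) pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketFilter:
    policy: str          # "allow", "deny" or "allowall" (assumed policy names)
    channels: frozenset  # set of (port_id, channel_id) pairs the policy applies to

    def should_relay(self, port_id: str, channel_id: str) -> bool:
        listed = (port_id, channel_id) in self.channels
        if self.policy == "allow":
            return listed        # relay only what is explicitly listed
        if self.policy == "deny":
            return not listed    # relay everything except what is listed
        if self.policy == "allowall":
            return True
        raise ValueError(f"unknown policy: {self.policy}")


# Relay only the canonical transfer channel and ignore everything else.
canonical_only = PacketFilter("allow", frozenset({("transfer", "channel-0")}))
print(canonical_only.should_relay("transfer", "channel-0"))   # True
print(canonical_only.should_relay("transfer", "channel-42"))  # False
```

An allowlist keeps the number of events and proof queries bounded to the channels an operator has committed to, which also limits the RPC load discussed above.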
Cost and Scaling
Recall from above that relayers pick up events and submit messages to the counterparty chain. Submitting these messages requires a transaction fee (unless chains decide to adopt fee-less IBC messages; check this with the chain community). Until now, without any on-chain fee incentivization, this meant that relayers needed to pay these fees out of pocket.
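For a feel of the amounts involved, here is a back-of-the-envelope sketch. On Cosmos SDK chains the fee for a transaction is the gas limit multiplied by the gas price; we round up to be conservative. All figures used below (gas per message, gas price, message count) are made-up placeholders, not measurements from any chain.

```python
from decimal import Decimal, ROUND_CEILING


def tx_fee(gas_limit: int, gas_price: str) -> int:
    """Fee in the chain's smallest denomination: gas limit x gas price, rounded up."""
    fee = Decimal(gas_limit) * Decimal(gas_price)
    return int(fee.to_integral_value(rounding=ROUND_CEILING))


def daily_cost(messages_per_day: int, gas_per_message: int, gas_price: str) -> int:
    """Total fees paid per day on one chain, assuming one message per transaction."""
    return messages_per_day * tx_fee(gas_per_message, gas_price)


# Hypothetical numbers: 1,000 messages per day at 200,000 gas each, gas price 0.025.
print(tx_fee(200_000, "0.025"))             # 5000
print(daily_cost(1_000, 200_000, "0.025"))  # 5000000
```

The same calculation has to be repeated per chain, since each chain charges fees in its own token, so a relayer wallet needs funding on every chain it submits messages to.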
Now, consider an overview of current solutions to cover fees and potentially provide additional incentives for relayers:
- Team delegation: team delegation is a way for relayers to indirectly benefit from the relaying services they offer, given that most relayer operators are also validators for the chains they relay for.
- Direct funding: chains directly fund the relayers’ wallets.
- Relayer (community) pools: a portion of community funds are pooled and distributed among relayer operators. From anecdotal evidence, this is not always easy to execute in a fair way.
- Fee grant: Cosmos SDK chains have the fee grant functionality that allows an account to pay the fees for another account. This way, a chain operated account can pay the fees for the relayers, for example taking inspiration from this config file used by the Omniflix team. However, this functionality only covers relaying costs and does not provide incentives on top.
- Relayer fee middleware (ICS-29): the fee middleware allows end-users or chains to provide fees for relayers per individual packet. Please refer to this blog post for more information on how this works.
Chains wanting to implement relayer incentives should investigate which of the above solutions, or a hybrid of them, best fits their needs and vision. For example, we can envision very different approaches from chains that fully embrace multichain functionality as core to their business model versus chains that view transfer of assets over IBC as more of an extra. They may have different takes on whether to put the costs of relaying on the end user or take them on themselves, e.g. by using the community pool or another chain-owned account to pay the ICS-29 incentives as an integrated part of the IBC transaction flow.
An interesting observation coming out of the interviews concerns scaling. Many relayer operators are currently relaying for around 10–15 chains. In principle, once you know how, it is straightforward to add additional chains (channels) to the configuration file and provide relaying on those paths. However, once you approach that number of chains, the complexity of the infrastructure setup increases, along with the workload and the engineering capacity needed to manage it. The result is that, as the number of chains serviced grows, an increasing proportion of total costs goes to operational costs rather than fees. For many teams this could be a blocker, as they are often only a few people and finding profiles with the requisite DevOps skills to handle the additional complexity might prove difficult.
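To make the per-packet incentive in the last option more tangible, here is a simplified sketch of how an ICS-29 packet fee could be settled. The ICS-29 specification distinguishes a receive fee, an acknowledgement fee and a timeout fee. The settlement rules below follow our reading of the specification for the initial ibc-go release and are reduced to a single token with integer amounts; function and account names are hypothetical, and details such as payee registration are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketFee:
    recv_fee: int     # for the relayer delivering the packet to the destination
    ack_fee: int      # for the relayer delivering the acknowledgement to the source
    timeout_fee: int  # for the relayer submitting a timeout instead

    def total(self) -> int:
        # Assumption: all three fees are escrowed up front.
        return self.recv_fee + self.ack_fee + self.timeout_fee


def _credit(payouts: dict, account: str, amount: int) -> None:
    if amount:
        payouts[account] = payouts.get(account, 0) + amount


def settle_on_ack(fee: PacketFee, payer: str, forward_relayer: str, reverse_relayer: str) -> dict:
    """Packet was received and acknowledged: pay both relayers, refund the timeout fee."""
    payouts = {}
    _credit(payouts, forward_relayer, fee.recv_fee)
    _credit(payouts, reverse_relayer, fee.ack_fee)
    _credit(payouts, payer, fee.timeout_fee)
    return payouts


def settle_on_timeout(fee: PacketFee, payer: str, timeout_relayer: str) -> dict:
    """Packet timed out: pay the timeout relayer, refund the unused fees."""
    payouts = {}
    _credit(payouts, timeout_relayer, fee.timeout_fee)
    _credit(payouts, payer, fee.recv_fee + fee.ack_fee)
    return payouts


fee = PacketFee(recv_fee=100, ack_fee=50, timeout_fee=30)
print(settle_on_ack(fee, "user", "relayer-a", "relayer-b"))
print(settle_on_timeout(fee, "user", "relayer-c"))
```

In both outcomes the payouts add up to the escrowed total, so the payer only ever loses the part that was actually earned by a relayer.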
Along with cost and profitability considerations, the added complexity when scaling could lead to a natural ceiling to the number of chains relayer operators decide to relay for. It will be interesting to monitor how this develops over time as the Interchain scales. Will we see many smaller relayer operator teams relaying for a dozen chains or will we see a few dominant players overcoming the initial manpower and technical setup constraints and servicing most of the available chains?
For IBC to work, we assume the availability of relayer operators who provide the relaying of packets. Relaying is permissionless and can be incentivized. However, the permissionless nature should not obscure the need for quality of service guarantees. Tools must be available to ensure sufficient coverage for the paths IBC applications are using. At the time of writing, initiatives have been organically set up to track the coverage or redundancy of operators off-chain in spreadsheets initiated by the community of relayer operators or the chains offering IBC functionality.
For example, in the IBC Gang Discord, there are pinned spreadsheets tracking relayer coverage for Osmosis, Juno and Secret (feel free to add additional resources like this). This type of solution offers some immediate benefits: it’s relatively quick to set up and gives a first impression of which chains or channels the operators are servicing. Yet this approach also has downsides: all relayer operators must be made aware of the existence of these tracking files, and keeping them accurate over time takes a lot of effort. On top of that, the introduction of Interchain Accounts and of CosmWasm contracts using IBC is each expected to significantly increase the number of available channels. In the future, we may see a more dynamic landscape where relayers use channel filtering more prominently. Up until now, with IBC transactions limited to token transfers, there was the expectation that all canonical ICS-20 channels would be picked up when relaying for a particular pair of chains.
Additionally, there’s the ibc-status bot, originally developed by the Imperator team and now open source, opening the door for improvements by the community. It has proven to be a beloved tool that is widely used in the community, highlighting pending packets in need of clearing and allowing the relayer operator community to coordinate on solutions. For example, when a relayer operator on a less covered path goes offline or a client expires, the status bot can help the community respond. We can envision, however, additional monitoring features revealing more information about redundancy (how many extra relayers could have picked up the packet if the one that did had failed to do so).
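The coverage and redundancy tracking discussed above boils down to a simple computation once the data is available. The sketch below assumes a hypothetical input format (operator name mapped to the channels they say they relay) of the kind a tracking spreadsheet might export; operator and channel names are placeholders.

```python
from collections import defaultdict


def relayer_counts(operators: dict, required_channels: list) -> dict:
    """Count how many operators cover each channel that needs relaying."""
    coverage = defaultdict(set)
    for operator, channels in operators.items():
        for channel in channels:
            coverage[channel].add(operator)
    return {channel: len(coverage[channel]) for channel in required_channels}


def redundancy(operators: dict, required_channels: list) -> dict:
    """Extra relayers per channel beyond the first (0 means a single point of failure or none)."""
    counts = relayer_counts(operators, required_channels)
    return {channel: max(count - 1, 0) for channel, count in counts.items()}


def uncovered(operators: dict, required_channels: list) -> list:
    """Channels that no operator is relaying at all."""
    counts = relayer_counts(operators, required_channels)
    return [channel for channel, count in counts.items() if count == 0]


# Hypothetical data for three channels on one chain.
operators = {
    "operator-a": ["chain-a/channel-0", "chain-a/channel-1"],
    "operator-b": ["chain-a/channel-0"],
}
required = ["chain-a/channel-0", "chain-a/channel-1", "chain-a/channel-2"]

print(relayer_counts(operators, required))
print(redundancy(operators, required))
print(uncovered(operators, required))  # ['chain-a/channel-2']
```

Self-reported data like this only says who intends to relay a channel. An on-chain or bot-based monitor could instead derive the same counts from who actually submitted messages.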
These initiatives are a good start and have contributed to the successful launch of IBC (refer to Map of Zones for updated transaction volume and success rates). Further improvements, however, are necessary to ensure the Interchain vision is achieved. The team at Interchain GmbH continues to set out a roadmap and develop features that will aid the IBC community to achieve this vision. Furthermore, we facilitate discussion between the major stakeholders and keep them informed about the different strategies and technologies that are available in the relayer landscape, as well as provide educational materials for the community. The future progress of the community and interaction between different stakeholders will determine the most fitting solutions to ensure coverage, whether it be on the level of IBC-enabled chains, more sophisticated monitoring via a dashboard from relayer operators (especially those offering relaying-as-a-service) or even an on-chain monitoring protocol.
We would like to thank the relayer operator teams we interviewed for their collaboration and invaluable insights into the relayer experience: Lavender Five Nodes, CryptoCrew Validators, Cros-nest, Imperator, Notional and Cephalopod. Thanks also to the Hermes and Go relayer teams for providing us with valuable insights into their roadmaps: Adi from Informal Systems and Justin at Strangelove Ventures.
We are open to conducting more interviews with relayer operators and will be reaching out to teams. If you’re a relayer operator and would like to get in touch, you can reach out to me at firstname.lastname@example.org
Special thanks go out to Charly Fei, for supporting me during the research and providing great constructive feedback. Many thanks as well to Alan, Carlos and Susannah for reviewing and providing feedback.