QUIC as an optimal base transport for peer-to-peer systems

Published in Lumen Engineering Blog, Jun 14, 2023

Context

For a long time, we have used WebRTC as the basis for our Mesh Delivery solutions. The main motivation was that it is the only peer-to-peer protocol usable on web-based systems, and our technology covers a wide array of platforms including web, SmartTVs, Android, and iOS.

Recently we pivoted our focus to native platforms such as native SmartTV applications, embedded set-top boxes, and consoles (PlayStation, Xbox). The goal is to cover both video streaming and purely native use cases, such as mesh-powered delivery of files/data objects on native Windows/OSX/Linux.

This opened us up to options other than WebRTC for transporting peer-to-peer data and control messages, and allowed us to look at performant and flexible protocols that are more suitable for raw data exchange than complex WebRTC Data Channels.

What is QUIC and why did we choose it as the future for our P2P transport?

Before describing why QUIC is an optimal base transport for our P2P data transport system, we need to define the requirements and criteria against which we weighed QUIC and alternatives such as SCTP, uTP, and UDT4 as a potential base protocol.

What is P2P transport and what does it need?

First, we outline what we considered as P2P transport and our requirements for what needs to be usable/implemented on top of it. For our use case, we needed:

A reliable, industry-proven network transport protocol that is well documented, well researched and has well-understood technical characteristics.
Compatibility with P2P techniques like STUN or equivalents and NAT traversal (UDP hole punching, etc.), with simple-to-establish connections between two non-dedicated server nodes on the internet.
Minimal usage of machine and network-level resources: mapped/hole-punched ports, UDP sockets per connection, etc.
A channel that is secured by design, preferably with simple usage requirements.

Another nice-to-have would be for the transport to provide as many features and out-of-the-box support as possible for the required logic at the application level:

Unique UUID-like identifiers to consistently identify network nodes. This is very useful for multiple technical reasons:
– Identifying application-level states across NAT/DHCP-induced discrepancies.
– Authentication and verification of application-level network nodes especially with trusted endpoints (e.g. our activation backends).
Application-level multiplexing of data streams, allowing one connection to be used both as a control channel (heartbeats, information/stats propagation, peer exchange, DHT, etc.) and potentially multiple independent bulk data channels.
Resilience to network-level changes, e.g. being able to resume an “application-level” session when the raw transport level changes. For example, this could be a switch from Wi-Fi to a mobile network, a temporary Wi-Fi disconnection, etc.

How does QUIC perform as a P2P transport?

What is QUIC?

In May 2021, the IETF standardized QUIC in RFC 9000, supported by RFC 8999, RFC 9001 and RFC 9002. Although its name was initially proposed as the acronym for “Quick UDP Internet Connections,” IETF’s use of the word QUIC is not an acronym; it is simply the name of the protocol.

Standardization and wide deployment qualify QUIC as an industry-proven transport protocol: it powers an estimated 25% of the internet, with many big users like Google, Microsoft, and Facebook using it to deliver their services in production. It has also been battle-tested from its inception in Google Chrome in 2012 through its evolution into an IETF standard.

QUIC 1/0-RTT connection setup

One key improvement of QUIC is that it removes the redundancy between ACK-based reliable delivery and stream encryption on a network connection. To recap:

TCP-like protocols base their retransmission on a SYN -> SYN-ACK handshake, which determines a starting sequence number for the packets that follow.
Stream encryption for a network channel relies on a negotiated per-channel symmetric key, an initialization vector (IV) and a sequence number from which the “key stream” is calculated.
Error detection is based on both a data checksum (CRC32, Adler-32, etc.) and whether the data is decrypted with the correct parameters (key, IV and/or sequence number). Should the packet be tampered with or unintentionally modified, the decrypted output would be heavily corrupted which, in this case, could be easily detected even with weak checksum algorithms.
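To make the shared-parameter idea concrete, here is a minimal sketch of how QUIC's packet protection (RFC 9001) derives the per-packet AEAD nonce from the negotiated IV and the packet number, the same number that drives acknowledgements and retransmission. The function name and the example IV are made up for illustration; this is not a full packet-protection implementation.

```python
def aead_nonce(iv: bytes, packet_number: int) -> bytes:
    """Per-packet AEAD nonce: the IV XORed with the left-padded packet number."""
    if not 0 <= packet_number < 2**62:
        raise ValueError("QUIC packet numbers are 62-bit unsigned integers")
    # Left-pad the packet number with zeros to the size of the IV.
    padded = packet_number.to_bytes(len(iv), "big")
    return bytes(a ^ b for a, b in zip(iv, padded))


# Illustrative 12-byte IV (an assumption, not taken from any real handshake).
example_iv = bytes.fromhex("000102030405060708090a0b")
print(aead_nonce(example_iv, 0).hex())
print(aead_nonce(example_iv, 1).hex())
```

Because every packet number is used only once per key, each packet gets a distinct nonce without any extra sequence counter at the encryption layer.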

In short, certain connection parameters can be shared between retransmission/ACKs and channel encryption/decryption. This is the first key breakthrough of QUIC compared to the classic layered solution, in which an encryption channel sits on top of a raw transport channel. Sharing these parameters reduces connection setup to at most 1-RTT for new connections and 0-RTT for resumed connections.

Figure: Combining transport and security parameters and simplifying connection establishment
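As a rough worked example of what the saved round trips mean in practice, the sketch below compares the delay before a client can send its first application data. The round-trip counts are the standard ones (one for the TCP handshake plus one for TLS 1.3, versus one for a new QUIC connection and zero for 0-RTT resumption); the 50 ms RTT and all names are illustrative assumptions, and packet loss and processing time are ignored.

```python
# Round trips needed before the client can send its first application data.
HANDSHAKE_ROUND_TRIPS = {
    "TCP + TLS 1.3": 2,            # 1 RTT for SYN/SYN-ACK + 1 RTT for TLS 1.3
    "QUIC (new connection)": 1,    # transport and crypto handshake combined
    "QUIC (0-RTT resumption)": 0,  # data rides along with the first flight
}


def setup_delays_ms(rtt_ms: float) -> dict:
    """Return the idealized setup delay in milliseconds for each option."""
    return {name: trips * rtt_ms for name, trips in HANDSHAKE_ROUND_TRIPS.items()}


for name, delay in setup_delays_ms(50.0).items():
    print(f"{name}: {delay:.0f} ms")
```

On a peer-to-peer link between two residential networks, where RTTs are often higher than to a nearby CDN node, each saved round trip matters proportionally more.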

Application-level oriented network-resilient endpoint identification

Another lesser-known but useful design feature of QUIC is the detachment of a “connection” from its “network path.” Other protocols identify the remote endpoint solely by its IP/UDP source/destination address, which can be rewritten by NAT and is susceptible to DHCP remapping and other network-level changes.

QUIC handles this inconsistency by leveraging a very flexible “Connection ID”-based system. For each QUIC endpoint, the connection parameters (sequence number, key, IV, etc.) are tied to a pair of free-form Connection IDs (CIDs), one chosen by each endpoint participating in the connection. In QUIC version 1, each CID is a byte string of up to 20 bytes.
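For illustration, the connection IDs can be read from the version-independent part of a QUIC long header (RFC 8999): a first byte with the high bit set, a 4-byte version, then a length-prefixed Destination CID and a length-prefixed Source CID. The sketch below parses only those fields; the type and function names and the sample packet are assumptions, and header protection, tokens and payload are ignored.

```python
from typing import NamedTuple


class LongHeader(NamedTuple):
    version: int
    dcid: bytes
    scid: bytes


def parse_long_header(packet: bytes) -> LongHeader:
    """Extract version and connection IDs from a QUIC long-header packet."""
    if len(packet) < 7 or not packet[0] & 0x80:
        raise ValueError("not a QUIC long-header packet")
    version = int.from_bytes(packet[1:5], "big")
    pos = 5
    dcid_len = packet[pos]
    pos += 1
    dcid = packet[pos:pos + dcid_len]
    pos += dcid_len
    if len(dcid) != dcid_len or pos >= len(packet):
        raise ValueError("truncated destination connection ID")
    scid_len = packet[pos]
    pos += 1
    scid = packet[pos:pos + scid_len]
    if len(scid) != scid_len:
        raise ValueError("truncated source connection ID")
    # QUIC version 1 limits connection IDs to 20 bytes (RFC 9000).
    if version == 1 and max(dcid_len, scid_len) > 20:
        raise ValueError("connection ID too long for QUIC version 1")
    return LongHeader(version, dcid, scid)


# Hand-built sample: long header, version 1, DCID "BBBB", SCID "AAAA".
sample = bytes([0xC0]) + (1).to_bytes(4, "big") + b"\x04BBBB" + b"\x04AAAA" + b"..."
print(parse_long_header(sample))
```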

To keep the explanation simple, suppose that a connecting endpoint A (CID=CIDofA) establishes a connection to a listening endpoint B (CID=CIDofB):

If this is the first time A is talking to B:
– A will initiate a 1-RTT handshake with DestinationCID=CIDofB and SourceCID=CIDofA.
– A would then associate the connection parameters to the tuple of <SCID=CIDofA, DCID=CIDofB>.
– B would then associate the connection parameters to the tuple of <SCID=CIDofB, DCID=CIDofA>.
If A reconnects to B later:
– A will just send packets with <SCID=CIDofA, DCID=CIDofB>.
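Packet handling then amounts to looking up the connection parameters by connection ID, regardless of IP-level addressing. (After the handshake, QUIC version 1 packets carry only the Destination CID, so that is the practical lookup key.)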
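The lookup described above can be sketched as a table keyed by connection ID rather than by IP address and port. This is a simplified model under stated assumptions: the class and field names are hypothetical, the table is keyed on the destination CID of incoming packets, and a real stack would validate a new network path (PATH_CHALLENGE/PATH_RESPONSE in RFC 9000) before switching to it.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]


@dataclass
class Connection:
    local_cid: bytes       # CID the peer puts in the destination field
    remote_cid: bytes      # CID we put in packets sent to the peer
    peer_address: Address  # last known network path, may change over time
    packets_received: int = 0


class ConnectionTable:
    def __init__(self) -> None:
        self._by_local_cid: Dict[bytes, Connection] = {}

    def add(self, conn: Connection) -> None:
        self._by_local_cid[conn.local_cid] = conn

    def on_packet(self, dcid: bytes, source: Address) -> Optional[Connection]:
        """Find the connection by CID; the source address is only a hint."""
        conn = self._by_local_cid.get(dcid)
        if conn is None:
            return None
        if conn.peer_address != source:
            # Simplification: accept the new path immediately.
            conn.peer_address = source
        conn.packets_received += 1
        return conn


# Endpoint B's view: A first talks over Wi-Fi, then over a mobile network.
table = ConnectionTable()
table.add(Connection(b"CIDofB", b"CIDofA", ("192.0.2.10", 40000)))
first = table.on_packet(b"CIDofB", ("192.0.2.10", 40000))
second = table.on_packet(b"CIDofB", ("198.51.100.7", 51000))
print(first is second, second.peer_address, second.packets_received)
```

The application-level session object stays the same across the address change, which is the property the list above relies on.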

This may seem like a complication compared with simply using the IP-level source/destination address to map packets to connection parameters. It provides multiple technical advantages, however:

On a peer-to-peer system, we must deal with potential IP-level address manipulation (NATs, firewalls, bad actors, etc.) and unexpected network changes (Wi-Fi/mobile network switch, DHCP remapping, etc.). Connection IDs allow us to handle those transparently to the application-level logic.
Coupled with an authenticated P2P tracker and a fixed-per-endpoint-CID scheme, we can ensure the application-level authenticity of a connection statelessly, even between peers (e.g. using a tracker-issued, CID-based JWT). This improves the security and robustness of the whole P2P network.
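As a rough illustration of the tracker-issued, CID-bound token idea, the sketch below signs a CID and an expiry time and lets a verifier check them without any per-peer state. It is deliberately simplified and is an assumption throughout: it uses an HMAC from the Python standard library instead of a real JWT, and all names are hypothetical. In a real peer-to-peer deployment the tracker would more likely sign with an asymmetric key so that peers hold only its public key.

```python
import base64
import hashlib
import hmac
import json


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(secret: bytes, cid: bytes, expires_at: int) -> str:
    """Tracker side: bind a CID to an expiry time and sign both."""
    payload = json.dumps({"cid": cid.hex(), "exp": expires_at}, sort_keys=True).encode()
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(tag)


def verify_token(secret: bytes, token: str, presented_cid: bytes, now: int) -> bool:
    """Verifier side: no lookup needed, only the key and the token itself."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    claims = json.loads(payload)
    return claims.get("cid") == presented_cid.hex() and now < claims.get("exp", 0)


tracker_key = b"demo-key-not-for-production"
token = issue_token(tracker_key, b"CIDofA", expires_at=2_000)
print(verify_token(tracker_key, token, b"CIDofA", now=1_000))
print(verify_token(tracker_key, token, b"CIDofX", now=1_000))
```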

One UDP port — multiple protocols/streams

Another impressive feature of QUIC is its multiplexing capability. Using only one listening UDP port, we can perform multiple levels of multiplexing:

Because each connection uses its own CID pair, a pair of IP endpoints can open multiple independent QUIC connections for different application protocols, each with its own negotiated ALPN identifier. It is therefore technically possible to run HTTP/3 and other QUIC-based application protocols on the same UDP socket in parallel.
Inside an established QUIC connection, it is possible to have multiple independent uni/bi-directional TCP-like data streams and, with a recent extension (RFC 9221), UDP-like best-effort datagrams, without having to open multiple UDP sockets/TCP-like connections.
Since HTTP/3 over QUIC is transparent at the backend/service level, we can use the same protocol and socket to communicate with the standardized RESTful APIs that already exist.
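Stream multiplexing inside one connection is visible in the stream ID itself: in RFC 9000, the lowest bit of a stream ID says which side opened the stream and the second-lowest bit says whether it is unidirectional. The sketch below decodes and builds stream IDs; the split into one control stream and several bulk streams is a hypothetical layout for illustration, not something QUIC mandates.

```python
from typing import Tuple


def stream_type(stream_id: int) -> Tuple[str, str]:
    """Decode initiator and direction from a QUIC stream ID."""
    if not 0 <= stream_id < 2**62:
        raise ValueError("stream IDs are 62-bit unsigned integers")
    initiator = "server" if stream_id & 0x01 else "client"
    direction = "unidirectional" if stream_id & 0x02 else "bidirectional"
    return initiator, direction


def nth_stream_id(n: int, initiator: str, direction: str) -> int:
    """Return the ID of the n-th stream (starting at 0) of the given type."""
    type_bits = (0x01 if initiator == "server" else 0x00) | (
        0x02 if direction == "unidirectional" else 0x00
    )
    return (n << 2) | type_bits


# Hypothetical layout: first client bidirectional stream for control traffic,
# the next three for independent bulk transfers.
CONTROL_STREAM = nth_stream_id(0, "client", "bidirectional")
BULK_STREAMS = [nth_stream_id(n, "client", "bidirectional") for n in range(1, 4)]
print(CONTROL_STREAM, BULK_STREAMS, stream_type(3))
```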

Those alone provide a very powerful protocol framework that can be tapped into by application-level protocols without any custom development efforts on the high-level application logic.

The fact that the protocol is UDP-based, combined with the ability to use a single UDP port, also opens up several technical possibilities:

The listening endpoint can perform STUN practically for free; it just needs to tell the remote endpoint what that endpoint's source IP address and port look like from the listener's point of view. This also avoids the potential privacy issue of using a public STUN server.
Any NAT traversal technique applicable to UDP is usable with QUIC, with the added perk that it is performed on the same UDP socket that carries the actual post-hole-punch communication, making it trivial and free to keep the mapping alive.
Strain on the upstream network is reduced: an endpoint occupies only one NAT mapping for all its application-level protocol needs, so UPnP/STUN NAT traversal needs to be done only once, on one mapped NAT port.
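The “STUN for free” point can be sketched with plain UDP sockets: the listener reads the source address of an incoming datagram and echoes it back, so the sender learns how it appears from the outside. Everything here is an illustrative assumption, including the text wire format and the function names; in a QUIC-based system the observed address would travel as an application-level message inside the encrypted connection. The demo runs on loopback and handles IPv4 only, so the observed address simply equals the local one.

```python
import socket
from typing import Tuple

Address = Tuple[str, int]


def encode_observed(addr: Address) -> bytes:
    """Hypothetical wire format: 'host:port' as ASCII text."""
    host, port = addr
    return f"{host}:{port}".encode("ascii")


def decode_observed(data: bytes) -> Address:
    host, _, port = data.decode("ascii").rpartition(":")
    return host, int(port)


def loopback_demo() -> Tuple[Address, Address]:
    """Return (address as seen by the listener, client's own local address)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        listener.bind(("127.0.0.1", 0))
        client.bind(("127.0.0.1", 0))
        listener.settimeout(2.0)
        client.settimeout(2.0)
        client.sendto(b"hello", listener.getsockname())
        _, source = listener.recvfrom(1500)
        # The reply goes out on the same socket that received the packet,
        # which is what keeps a NAT mapping usable in the real case.
        listener.sendto(encode_observed(source), source)
        reply, _ = client.recvfrom(1500)
        return decode_observed(reply), client.getsockname()
    finally:
        listener.close()
        client.close()
```

Calling `loopback_demo()` on a machine with a loopback interface should return two identical addresses; behind a NAT, the first would be the public mapping and the second the private address.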

Conclusion

Given the technical possibilities QUIC provides, we implemented a prototype QUIC stack for our Mesh Delivery for File Downloads project, with successful results. Our vision is to also use it to power native-to-native P2P connections in our streaming video technology.

This document is provided for informational purposes only and may require additional research and substantiation by the end user. In addition, the information is provided “as is” without any warranty or condition of any kind, either express or implied. Use of this information is at the end user’s own risk. Lumen does not warrant that the information will meet the end user’s requirements or that the implementation or usage of this information will result in the desired outcome of the end user. All third-party company and product or service names referenced in this article are for identification purposes only and do not imply endorsement or affiliation with Lumen. © 2023 Lumen Technologies
