On DDoS & QUIC

Recently, an internet friend and I were talking about QUIC, Google’s new protocol to replace TCP as an underlying transport for the internet. QUIC itself runs over UDP, typically on port 443. Given traditional experiences with protocols over UDP, carrollr protested:

< carrollr> Sargun: How does QUIC avoid a) spoofing of source, b) overloading congested links on the path

It turns out the latter is still in flux, but Google has taken an interesting approach to the former, and there’s a lot we can learn from it.

On UDP

For a few years now, the number of UDP reflection attacks on the internet has been slowly increasing. We first saw this at scale in 2013, with the deliberate misuse of misconfigured NTP servers on many devices. By sending a command from a spoofed source address to an NTP server, an attacker could elicit a much larger response, not only amplifying the attack but also hiding its original source(s). The response to this kind of attack amounted to largely futile effort on the part of network engineers, as filtering traffic from hundreds, if not thousands, of ASes with misconfigured servers was impossible. We’ve seen these attacks historically plague DNS, and now we’re even seeing misconfigured SNMP devices used in the same way. This has generated a massive distaste for UDP-based servers.

TCP 3-way Handshake

TCP combats this problem via two mechanisms. First, in order to bring up a TCP connection, the endpoints must complete a three-way handshake. This is a very low-cost way to ensure bidirectional communication over the underlying channel before exchanging potentially large amounts of application data. It introduces another problem, though: the server has to store state while it waits for the client to respond to its SYN+ACK, which opens up a new type of attack, the “SYN flood”. Second, we’ve developed defences against SYN floods in the form of SYN cookies, which make the first step stateless: the server encodes the state it would otherwise have stored into its SYN+ACK, and restores it from the client’s response.
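
A minimal sketch of the SYN-cookie idea, assuming an HMAC over the connection 4-tuple and a coarse timestamp (real TCP SYN cookies use a more compact construction and pack details like the MSS into the sequence number); the `SECRET` key and field layout here are illustrative:

```python
# Sketch of the SYN-cookie idea: derive the SYN+ACK sequence number from a
# keyed hash of the connection 4-tuple and a coarse timestamp instead of
# storing per-connection state on the server.
import hashlib
import hmac
import time

SECRET = b"server-local secret"  # illustrative; rotate in practice


def make_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
    """Return a 32-bit value usable as the server's initial sequence number."""
    minute = int(time.time()) // 60  # coarse timestamp so cookies expire
    msg = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}:{minute}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def check_ack(src_ip, src_port, dst_ip, dst_port, ack_number: int) -> bool:
    """Validate the client's ACK: it should acknowledge cookie + 1."""
    for skew in (0, 1):  # also accept the previous minute's cookie
        minute = int(time.time()) // 60 - skew
        msg = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}:{minute}".encode()
        digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
        cookie = int.from_bytes(digest[:4], "big")
        if ack_number == (cookie + 1) % 2**32:
            return True
    return False
```

Because nothing is stored between the SYN and the ACK, a flood of spoofed SYNs costs the server CPU rather than memory.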

QUIC, being designed from the ground up, takes an alternative approach. But first, a quick background on QUIC. Rather than following traditional TCP, QUIC takes something closer to the SCTP approach: each connection is multiplexed into a number of streams, which map to SPDY streams. This enables long-lived state across what would traditionally be multiple transactions over different TCP connections. In exchange, there can be slightly higher overhead to connect, but connections are few and far between.
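
As a toy illustration of that stream model (not QUIC’s or SPDY’s actual wire format), here’s how several logical streams might share one connection, with every frame tagged by a hypothetical stream ID:

```python
# Toy illustration of stream multiplexing: several logical streams share one
# connection, and each frame carries a stream ID so the receiver can
# reassemble them independently. This is not QUIC's real framing.
from collections import defaultdict


def frame(stream_id: int, payload: bytes) -> bytes:
    # 4-byte stream ID, 2-byte length, then the payload.
    return stream_id.to_bytes(4, "big") + len(payload).to_bytes(2, "big") + payload


def demux(data: bytes) -> dict[int, bytes]:
    streams = defaultdict(bytes)
    offset = 0
    while offset < len(data):
        stream_id = int.from_bytes(data[offset:offset + 4], "big")
        length = int.from_bytes(data[offset + 4:offset + 6], "big")
        streams[stream_id] += data[offset + 6:offset + 6 + length]
        offset += 6 + length
    return dict(streams)


# Two requests interleaved on one connection.
wire = frame(1, b"GET /a") + frame(3, b"GET /b") + frame(1, b" HTTP/1.1")
assert demux(wire) == {1: b"GET /a HTTP/1.1", 3: b"GET /b"}
```

Loss on one stream need not stall the others, which is part of what makes multiplexing inside a single connection attractive.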

First, QUIC defends against reflection attacks by making it more expensive for the client to initiate a connection: initial hello packets are padded to the full size of a packet, typically somewhere between 1280 bytes (the minimum MTU for IPv6) and 1500 bytes (the typical MTU on the internet). This approach is incredibly simple, because bandwidth has an inherent cost and always will. Given the padding, the invariant holds that the client hello is always at least as large as the server’s response, which immediately rules out naive reflection attacks.
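
A server-side sketch of that invariant, assuming a 1280-byte floor and an illustrative `build_reply` handler rather than anything from a real QUIC implementation:

```python
# Sketch of the padding invariant: drop any "client hello" datagram that
# isn't padded to the minimum size, and never reply with more bytes than
# were received. The 1280-byte floor mirrors the IPv6 minimum MTU above.
MIN_HELLO_SIZE = 1280


def handle_initial(datagram: bytes, build_reply) -> bytes | None:
    if len(datagram) < MIN_HELLO_SIZE:
        return None  # unpadded hello: too cheap to send, possibly spoofed
    reply = build_reply(datagram)
    # Keep the invariant: never send more bytes than we received, so a
    # spoofed source gains no amplification from reflecting our reply.
    if len(reply) > len(datagram):
        reply = reply[:len(datagram)]  # or defer the rest until validated
    return reply
```

An attacker spoofing a victim’s address gains nothing: every reflected byte cost them at least one byte to send.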

On resumption of the connection at a later point, information exchanged in the hello sequence can be used to make reconnection cheaper. The server doesn’t need to store all of the state itself, because it can hand the state off to the client after signing it with a key only the server holds. When the server is stressed, whether due to high load or a DDoS, it can require the client to regenerate this proof of address ownership, thereby mitigating the attack.
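
A minimal sketch of that hand-off, assuming an HMAC-based token rather than the exact construction QUIC uses on the wire; `issue_token` and `redeem_token` are hypothetical names:

```python
# Minimal sketch of handing connection state back to the client as a
# self-authenticating token. The server keeps only the key, not the state.
import hashlib
import hmac
import json
import time

TOKEN_KEY = b"another server-held key"  # illustrative


def issue_token(client_ip: str, state: dict) -> bytes:
    body = json.dumps({"ip": client_ip, "ts": int(time.time()), **state}).encode()
    tag = hmac.new(TOKEN_KEY, body, hashlib.sha256).digest()
    return body + b"." + tag.hex().encode()


def redeem_token(client_ip: str, token: bytes, max_age: int = 86400) -> dict | None:
    try:
        body, tag_hex = token.rsplit(b".", 1)
    except ValueError:
        return None
    expected = hmac.new(TOKEN_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected.hex().encode(), tag_hex):
        return None
    state = json.loads(body)
    # Reject tokens presented from a different address or after they expire.
    if state["ip"] != client_ip or time.time() - state["ts"] > max_age:
        return None
    return state
```

Under stress, the server can refuse clients that don’t present a valid token, pushing strangers back through the full, padded hello.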

Generalized Anti-DDoS

This got me thinking about what other mechanisms we could employ to prevent DDoS attacks in protocols, such as leveraging proof-of-work systems to fight attacks. As machines interact more and more across organizational boundaries, in environments like the internet of things, in machine-to-machine traffic on mobile networks, and through greater integration of cloud services, we need a better anti-DDoS mechanism.

Proof-of-Work

It turns out that cryptographers have been thinking about these problems far longer than I have. In fact, some of the first proof-of-work papers date back to the early 90s, proposing it as a mechanism to fight spam. More recently, proof-of-work has become popular in cryptocurrencies as a mechanism to distribute work and democratize the advancement of the blockchain.

The way these protocols typically work is that there is some computationally hard problem, like generating a hash with a certain number of leading 0s, whose solutions can be quickly verified (as solutions to problems in NP can) yet cannot easily be found any faster than by brute force. In addition, to keep up with Moore’s Law, these problems have an adjustable difficulty: as time goes on, they become more and more difficult to solve, negating the speed increases of processors. This is complicated by the massive delta in bandwidth, memory, and compute capability between mobile devices and traditional desktops.
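
A hashcash-style sketch of the idea, as a toy rather than any specific protocol: the server names a challenge and a difficulty (in leading zero bits), the client grinds nonces, and verification costs a single hash. The function names and the 8-byte nonce encoding are assumptions:

```python
# Hashcash-style proof-of-work: the client searches for a nonce whose hash
# over (challenge, nonce) has `difficulty` leading zero bits; the server
# verifies with one hash. Difficulty is the adjustable knob described above.
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: bytes, difficulty: int) -> int:
    for nonce in count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


nonce = solve(b"server-issued-challenge", difficulty=16)  # ~65k hashes on average
assert verify(b"server-issued-challenge", nonce, difficulty=16)
```

Raising the difficulty by one bit doubles the client’s expected work while the server’s verification cost stays constant.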

Part of the proof-of-work research has been into problems that are not in fact computationally hard by nature. One example of this is ReCAPTCHA. When ReCAPTCHA was invented, computer-vision systems hadn’t developed the capability to solve the captchas it presented. ReCAPTCHA could then present an already-solved but obfuscated captcha alongside an actual unsolved one. By forcing the user to solve both, and using only the known half to verify the authenticity of the user, it solved a multi-million-human-hour problem.

The Future

I’d love to find a useful NP-hard problem to solve. One that’s been particularly interesting as of late is protein folding. There has been quite a bit of work lately showing that it’s NP-hard, and it’s a well-understood problem to distribute, thanks to the folks at Folding@Home. Could future web browsers be required to solve a protein-folding problem before you’re able to log in? Could we accelerate the cure for a disease? Perhaps. There are unanswered questions. Which protocols actually need to be secured against a DDoS? What are the ethics of using a person’s computer to solve math problems without their explicit consent? I feel these questions could lead us down a path to better security and greater democratization of computing power.