I was inspired to start a series of articles on the early history of DDoS by a few recent events. Rik Farrow interviewed me for a forthcoming issue (Fall 2019, Vol. 44, No. 3) of USENIX ;login: magazine while I was also writing up a history of the early days of the Honeynet Project, which refreshed my memory on a number of events in 1999-2000. I also read this MIT Technology Review article on the 20th anniversary of the “first DDoS attack” on the University of Minnesota:
The first DDoS attack was 20 years ago. This is what we've learned since.
July 22, 1999, is an ominous date in the history of computing. On that day, a computer at the University of Minnesota…
It took me a little while to remember that July 22 was not the first of the three days that the University of Minnesota spent off-line from persistent flooding. That happened almost a month later. Nor was July 22 even the start of the build-up to that event. Now seemed like a good time to clarify this history.
So this is the first of what will be a series of articles laying out the real history, told from the inside of the events, including some things that have not been said publicly before now. I’ll publish other parts of the story on their 20th anniversaries. I hope you follow along and enjoy them!
The squall before the storm
Twenty years ago — July 22, 1999 — CERT/CC at Carnegie Mellon University issued Incident Note IN-99-04, warning of a large number of ongoing reports to them about the widespread exploitation of vulnerable Remote Procedure Call (RPC) service daemons (cmsd and the ToolTalk ttdbserverd) on Sun Solaris 2.x computer systems.
The UW had thousands of such systems on its network, and I had already been helping UW staff and faculty who found them compromised (or were reported to us as being involved in abuse of other sites). On any given day, we were aware of a dozen or more compromised Solaris systems (though the real number would later prove to be an order of magnitude larger).
In 1999, a set of compromised computers, called Trin00, took down a network at the University of Minnesota; and with this first documented case, botnet volumetric Distributed Denial of Service (DDoS) attacks were undeniably born. While earlier attacks against infrastructure exist in anecdotes and recollections, it is with this documented case that we can archivally establish the lower bound of 20 years. Many changes, enhancements, and evolutions to our mitigation technologies have happened since then, but are we demonstrably better off today (now 20 years later)? Trin00 used hundreds (and actually may have been composed of thousands) of compromised machines (“bots”).
— Osterweil, Stavrou, and Zhang, arXiv:1904.02739
The reference cited in the above quote is CERT/CC’s IN-99-04. A more relevant CERT/CC document in terms of Distributed Denial of Service (DDoS) tools would actually be IN-99-07, released in November 1999 (long after the University of Minnesota DDoS), which mentions both the trinoo tool associated with the University of Minnesota DDoS and another tool (known as “Tribe Flood Network,” or TFN) that was discovered a bit later in 1999. Even more directly relevant to DDoS would be CERT/CC Advisory CA-99-17 (Denial-of-Service Tools) or CA-2000-01 (Denial-of-Service Developments).
In terms of the MIT Technology Review article and its primary source paper quoted above, the final version of IN-99-04 is actually a tertiary source, as it references IN-99-07, which in turn cites other research and reports (though not in as much detail as CA-99-17 or CA-2000-01). The primary source behind all of these documents — how CERT/CC learned the full details about the tools, not just observables and indicators from reports of compromised systems — was a pair of reports I had submitted to them in late September of that year, with more to follow.
Each Incident Note has a release history. If you look carefully at these release histories, you can see this evolution in thinking as we collectively combined our knowledge behind the scenes. But this historical realignment is getting off track a bit, so let’s get back to the original July 22 release of IN-99-04.
A simple rootkit, a distributed sniffer, or something else?
IN-99-04 described the components of a standard “rootkit” (a backdoor, an IRC redirector, a network sniffer, and Trojan horse replacements of programs like ps to hide malicious processes) and some other artifacts, including a file named leaf.tgz. Remember that file name for later. (Spoiler alert: I did at the time.)
Years earlier I had learned how to do forensic analysis of these kinds of compromised hosts (and of accounts whose passwords had been sniffed and were then being used as caches or distribution points for compromising other hosts). I had written extensive documents on network sniffers, rootkits, Unix forensics, and responding to security incidents that I regularly sent out to faculty and staff on campus (and shared with anyone else on the internet where I thought they could be of use).
I was seeing the same intrusion activity at the UW that CERT/CC described. It could have been a standard rootkit backdoor, but there was another possibility: This could have been a distributed sniffer!
The previous year, one of the University of Washington University Computing Services (UCS) system engineers had found a program running on almost all of the servers in two different clusters of several dozen hosts each (the cluster serving student email accounts, and the cluster serving faculty and staff email accounts). These were IBM AIX systems. The system engineer had the foresight to get a memory dump of one of the processes from each cluster before shutting them all down, and had provided it to me. The processes had been started months earlier; the few servers that had been rebooted as part of normal operation were no longer running the daemon, but the ones that still were had been up for a long time.
I ran xxd on the dump and confirmed that the program appeared to act like a backdoor, with a listening port and obvious calls to networking API functions. I identified what appeared to be an array of structures that included what looked to me like Unix epoch timestamps (monotonically increasing as you went deeper into the array, which made them stand out even more), what looked like user names, and another related string.
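To make that concrete, here is a minimal sketch of the kind of sweep I mean (this is not my original tooling, and the file name dump.bin is hypothetical): it walks a raw memory dump looking for 32-bit big-endian integers that land in a plausible late-1990s date range, which is essentially how the timestamps stood out.

```python
import datetime
import struct

# Hypothetical file name for a raw memory dump of the suspect process.
DUMP = "dump.bin"

# Plausible window for mid-to-late-1990s Unix epoch timestamps.
LO = int(datetime.datetime(1997, 1, 1).timestamp())
HI = int(datetime.datetime(2000, 1, 1).timestamp())

data = open(DUMP, "rb").read()

# The AIX hosts were big-endian, so a 32-bit time_t appears in the dump as a
# 4-byte big-endian integer. Slide a 4-byte window over the dump and report
# every value that falls inside the plausible date window.
for off in range(len(data) - 3):
    (val,) = struct.unpack_from(">I", data, off)
    if LO <= val <= HI:
        when = datetime.datetime.fromtimestamp(val, datetime.timezone.utc)
        print(f"offset {off:#x}: {val} -> {when:%Y-%m-%d %H:%M:%S} UTC")
```

A cluster of hits at regular intervals, with values that only increase as you go deeper, is a strong hint that you are looking at an array of records rather than random bytes.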
I then pulled the last login records for the host from which the memory dump was taken, matched the user names from the end of the array to the most recent last login records, converted the suspected Unix epoch timestamps to date strings, and voilà! A match! The timestamps (minus the timezone difference from UTC to US/Pacific time) matched the logins, and I was able to work out how to deobfuscate the last string in each structure to get the password. I contacted one person whose account I had identified to ask whether the string I recovered was in fact their password. It was. A sniffer that kept its log of compromised passwords only in memory had been active on two clusters serving a total of maybe 50,000 users! For months! Since there was no record of network connections to the listening service, and nothing suspicious could be found in the accounts that were running the program, there was no way of knowing whether the passwords had ever been retrieved or whether any of the compromised accounts had been used. I wrote up my analysis and reported it to CERT/CC.
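The correlation step can be sketched the same way. Assuming the recovered structures have already been parsed into (timestamp, user name) pairs (the pairs below are made up for illustration) and that last is available on the host, something like the following reproduces the comparison against the most recent login records:

```python
import datetime
import subprocess

# Hypothetical values: (epoch seconds, user name) pairs recovered from the
# array of structures in the memory dump.
recovered = [
    (932212800, "alice"),
    (932216400, "bob"),
]

# The dump held UTC epoch values, while `last` prints local wall-clock time,
# so apply the US/Pacific offset (PDT in July 1999) before comparing.
UTC_OFFSET = datetime.timedelta(hours=-7)

def parse_last(year=1999):
    """Parse `last` output into (user, naive local login time) pairs.

    Column layout varies between Unix flavors; this assumes the common
    "user tty host Day Mon DD HH:MM ..." form and is only a sketch.
    """
    out = subprocess.run(["last"], capture_output=True, text=True).stdout
    logins = []
    for line in out.splitlines():
        parts = line.split()
        if len(parts) < 7:
            continue
        stamp = " ".join(parts[3:7]) + f" {year}"
        try:
            when = datetime.datetime.strptime(stamp, "%a %b %d %H:%M %Y")
        except ValueError:
            continue
        logins.append((parts[0], when))
    return logins

logins = parse_last()
for epoch, user in recovered:
    utc = datetime.datetime.fromtimestamp(epoch, datetime.timezone.utc)
    local = utc.replace(tzinfo=None) + UTC_OFFSET
    for luser, lwhen in logins:
        if luser == user and abs((lwhen - local).total_seconds()) < 120:
            print(f"{user}: timestamp in dump ({local:%b %d %H:%M}) "
                  f"matches login record ({lwhen:%b %d %H:%M})")
```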
I was told later by someone at CERT/CC that this was the first and most detailed report of this distributed sniffer on AIX that they had received. Others had found a similar program, but were not able to determine what it was. The sniffer was enabled by a little-known bug in the AIX network interface driver that allowed an unprivileged user to access packets on the interface. This was the reason the program was running on every host: to sniff internally, as opposed to the other standard SunOS, Solaris, and Linux sniffers of the day that required elevated privileges to put the network interface in promiscuous mode to sniff traffic between other hosts on the local area network. A similar distributed sniffer found on Linux systems was later described in IN-99-06 in October 1999.
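The privilege point is easy to demonstrate on a modern system. On an ordinary Linux host (AIX's interfaces were different, so this is only an analogy), an unprivileged attempt to open a raw packet-capture socket fails, and that failure is exactly the barrier the buggy AIX driver failed to enforce:

```python
import socket

# On most Unix-like systems an ordinary user cannot capture raw packets from a
# network interface. This Linux-flavored sketch shows the failure an
# unprivileged process normally hits.
ETH_P_ALL = 0x0003  # ask for every protocol

try:
    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    print("raw capture socket opened; this process is privileged")
    s.close()
except PermissionError:
    print("permission denied; unprivileged users normally cannot sniff")
```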
DDoS is not an “attack.” It’s more of a “life style” (“life cycle?”)
The computer security industry loves to use the word “attack.” It’s really hard to avoid using it, to be honest. The problem is that “attack” is often a terrible choice of words, one that causes more confusion than clarity. In this case, “attack” being a singular noun implies something happening as a discrete event. It often isn’t that simple at all. At least not the flooding part of DDoS, which typically ebbs and flows, repeats, or changes in type or volume based on what it takes to bring a site, service, or entire network off-line. So at best, a “DDoS attack” is actually two parts (or two different kinds of attack? see what I mean?) and the flooding part isn’t even the start.
Just think about the name for a minute: “Distributed Denial of Service.” It starts with a word that embodies complexity and multiplicity: Distributed.
Between 1995 and 1999, large DoS attacks (ick, that word!) amplified their strength by reflecting ICMP Echo Request packets with forged source addresses off of networks that would respond with ICMP Echo Reply packets sent to a host that never asked for them (known as the “Smurf” attack). Paul Ferguson and Daniel Senie had published a document in 1996 describing how to mitigate the problem that enabled Smurf to work (and any use of source address forgery, for that matter), “Network Ingress Filtering: Defending Against IP Source Address Spoofing,” and would later publish “Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing” (as RFC 2827 and BCP 38). Only a small number of computers — even just one on a fast network connection — could use Smurf to take down a server or network.
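The arithmetic behind that last sentence is simple. As a back-of-the-envelope sketch (the numbers are assumptions, not measurements from any particular incident):

```python
# Smurf amplification, roughly: one attacker forges ICMP Echo Requests whose
# source address is the victim and whose destination is the directed-broadcast
# address of a network where most hosts answer pings.
attacker_rate_mbps = 1.0   # forged Echo Request traffic from the attacker (assumed)
responding_hosts = 200     # hosts on the amplifier network that reply (assumed)

# Every forged Request draws an Echo Reply of roughly the same size from each
# responding host, and all of those Replies go to the forged (victim) address.
victim_rate_mbps = attacker_rate_mbps * responding_hosts
print(f"~{victim_rate_mbps:.0f} Mbit/s arrives at the victim")  # ~200 Mbit/s
```

That multiplication is why one well-connected host was enough, and why dropping packets with forged source addresses at the network edge, as BCP 38 describes, takes away the attacker's leverage.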
The new handler/agent style DDoS tools were different. While they could flood using ICMP Echo Request packets with forged source addresses, they didn’t need to. Because of the number of hosts being controlled, they could also flood using large UDP packets, packets with invalid IP protocols (like IP protocol 255), or completed TCP three-way handshakes followed by valid HTTP requests, among a number of other methods chosen from a menu on the handler system and in turn relayed to hundreds or thousands of distributed agents.
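For readers who have never looked at such floods on the wire, here is a purely illustrative sketch of the packet shapes named above, built with scapy (my choice for illustration, not something these tools used). The packets are only constructed and summarized, never sent, and the target address comes from the RFC 5737 documentation range:

```python
from scapy.all import IP, TCP, UDP, Raw

victim = "192.0.2.10"  # documentation address, not a real target

large_udp = IP(dst=victim) / UDP(dport=9) / Raw(b"\x00" * 1400)
bad_proto = IP(dst=victim, proto=255) / Raw(b"\x00" * 1400)  # reserved IP protocol number
tcp_syn   = IP(dst=victim) / TCP(dport=80, flags="S")        # first step of a real handshake

for pkt in (large_udp, bad_proto, tcp_syn):
    print(pkt.summary())
```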
What CERT/CC Incident Notes IN-99-04 and IN-99-05 (and the advisories they cite, including CA-98.11, CA-99-05, CA-99-08, and CA-99-12) were describing was the activity of building up a very large inventory of compromised hosts running an agent daemon (a “leaf,” if you will? 😉).
It took a while to build up this inventory. Because of the Incident Notes, the Advisories, and the activity of people like me, the inventory was slowly but constantly shrinking due to attrition as compromised systems were being cleaned up. The same attrition phenomenon occurs (at an even greater rate, for obvious reasons) when these hosts are used for flooding, leading to a tactic miscreants use when flooding that I will describe another time.
It was the automation of scanning and exploit delivery that increased the scale and efficiency of compromising computers, and with it the ability to build the inventory of flooding resources necessary to create a significant and extended network outage.
That is the next part of this story…