Learning from California’s Data Breaches

Trends found within every CA breach notification in 2017.

Ryan McGeehan
Starting Up Security
Dec 22, 2017


Data breach notifications offer hints about what caused a breach to begin with. Understanding these events can help us forecast the risks we should prepare for, or identify criminal trends that could make us a target.

How this data was analyzed:

I read every single data breach notification submitted to the California Attorney General. I surfaced the available root causes or forensic hints that were mentioned in each submission, and categorized them below.

Incident Categories in 2017 CA Data Breach Notifications

Where I thought a notification was criminally vague, I did a second pass and looked for public journalism or public comment from the victim to infer what happened. Fewer than ten had no discernible lesson or insight. I removed eight duplicate notifications that only carried updates on the same breach, which left an even 200. Several notices appear in multiple categories. This took roughly one long work day of effort, coffee, and blaring metal, and it might contain small errors, which I am happy to correct.

I am also publishing this a few days before the end of the year, and will update the data if any breaches come through this week.

A single vendor created an entire category. (18.5%)

There were thirty-seven total notifications related to a vendor’s security. A breach at Sabre showed up in twenty-six breach notifications, which largely inflated both this category and the total number of notifications this year. A reported issue in Schoolzilla’s product triggered four notifications. Seven vendor issues could not be categorized easily.

See: Answering the Security Questionnaire

“Account Takeover” (ATO) for IT apps and consumer websites. (14.5%)

There were twenty-nine cases suggesting that a victim’s compromised credential (usually a password) resulted in a remote login to an account.

See: Investigating Account Takeover and Preventing Account Takeover

eCommerce websites with hijacked “Checkout” processing. (12.5%)

There were twenty-five cases where an attacker modified a website’s checkout system or payment processing code to siphon customers’ credit card numbers. Very often the website operators were notified because downstream fraud identified them as the source of the leaked cards.

There was a trend in targeting small tax preparation shops. (9.5%)

There were nineteen notifications describing attacks that eventually contributed to IRS fraud. Attacks focused on tax documents took many forms. Many began with open remote management access into CPA firms, and malicious RDP into a victim’s network was mentioned repeatedly. Portals that manage tax returns on behalf of CPAs’ clients appeared especially attractive. Outside of CPA firms, HR teams at corporations were often successfully targeted with social engineering attacks, either by spoofing an executive or by first gaining access to an internal employee’s email account to be extra convincing. In those cases the attacker would explicitly ask the victim to send over all W-2 information.

Attackers that leveraged remote access to a victim’s network. (8.5%)

There were eighteen cases that mentioned some form of remote access to a victim’s network. A disproportionate number of these involved CPA firms, where the goal was IRS fraud using breached tax information, as discussed above.

Adversaries intercepting payment information from swipe. (8.5%)

There were seventeen notifications related to the interception of payment cards. Nearly all of these specifically mentioned that malware was present on a point of sale device. Only a small number of disclosures were related to ATM skimmers, a topic Brian Krebs has written about extensively.

Employee email compromises were frequently cited. (8.5%)

There were seventeen cases where an employee’s email was compromised (a subset of the larger “ATO” issue mentioned elsewhere). These were largely attributed to either credential reuse or credential theft in a phishing attack.

Inadvertent attachments on emails or incorrect recipient. (7.5%)

There were fifteen cases that included very simple explanations like “We sent the wrong attachment and it had PII”. There were a few botched CC’d recipients. Some cases involved an inadvertent response to a law firm which included discovery from a subpoena, which was far from a casual mistake.

Stolen or lost devices and hardware. (6.5%)

There were thirteen cases triggered by stolen or lost devices. A few of these were confrontational robberies, and many were home, car, and office break-ins. Many of the notices said there was no reason to believe the theft was targeting private information. In one instance, an entire safe was stolen from an office.

Ransomware hitting networked file systems and databases. (5%)

Ransomware caused ten notifications. A few of them clearly described database ransoms, and the rest described ransomware that propagated to network attached drives. In nearly all of these cases, the victim could not describe any exfiltration, but the sensitivity of the encrypted data was still at issue. One case alluded, without much detail, to further network intrusion by the ransomware adversary.

Security researchers were notably present in notifications. (3%)

There were six notifications involving security researchers. Four were related to the Schoolzilla disclosure. The investigation in all six breach notifications made a specific point that no data was accessed beyond the researcher’s activities and suggested they collaborated positively (🎉) with the vendor or researcher.

Applications with bugs that produced the wrong information. (2%)

There were four notifications caused by a software bug in an application hosting sensitive data, where the bug displayed the wrong information to the people using it.
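For anyone who wants to check or rerun the arithmetic, here is a rough sketch of how the percentages fall out. It assumes the counts spelled out in the prose of each category and the 200 unique notifications as the denominator. The short category labels are my own shorthand, not official names, and a computed share can differ slightly from a heading where the heading and the prose count do not line up.

```python
# Counts as spelled out in the prose of each category.
# The labels are shorthand chosen for this sketch.
TOTAL_NOTIFICATIONS = 200  # unique notifications after removing duplicates

category_counts = {
    "Vendor": 37,
    "Account takeover": 29,
    "Hijacked checkout": 25,
    "Tax preparation and W-2 fraud": 19,
    "Remote network access": 18,
    "Payment card interception": 17,
    "Employee email compromise": 17,
    "Misdirected email or attachment": 15,
    "Stolen or lost devices": 13,
    "Ransomware": 10,
    "Security researchers": 6,
    "Application bugs": 4,
}


def share(count, total=TOTAL_NOTIFICATIONS):
    """Return the percentage of all notifications that a count represents."""
    return 100 * count / total


# Print categories from largest to smallest.
for name, count in sorted(category_counts.items(), key=lambda kv: -kv[1]):
    print(f"{name:<32} {count:>3}  {share(count):>5.1f}%")

# Notices can appear in several categories, so the per-category counts
# are not expected to sum to the number of unique notifications.
total_assignments = sum(category_counts.values())
print(f"Category assignments: {total_assignments} across {TOTAL_NOTIFICATIONS} notices")
```

Because notices can land in more than one category, the shares are best read as "percent of all notices that touched this category" and not as slices of a pie.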

OK, so now what?

It’s important not to overvalue new information. That’s all this is: one source of information and one way of interpreting it. I wrote something similar about the breaches I worked on personally this year, and that picture is very different from the California data. Trend data carries biases that may not match your own situation and risks.

There are strategic needs to make strong authentication easier for anyone to use. That isn’t new. Issues around poor authentication were rampant in these reports, and are an annual issue. Additionally, I think we need to get serious about pushing ephemeral communication and short retention policies whenever we can. A large chunk of these issues involved simple inbox exposures with years of data.

More tactically, I think it’s probably smart to send our HR teams a heads up about social engineers prowling for W-2s, and to contact our tax preparation friends and let them know about the threats to their office networks.

Other than that, keep fighting the good fight.

Ryan McGeehan writes about security on Medium.



Writing about risk, security, and startups at scrty.io