The chain of data breach incidents

Hannah Yan Han
2 min read · Apr 18, 2018

Are data breaches spiralling out of control? I looked into the data compiled by Information is Beautiful to find the biggest and most severe data breaches.

Data breaches in the past few years

Bubbles are sized by the number of records lost in each incident and colored by data sensitivity/severity. A border around a bubble indicates accounts leaked via hacking.

Since the Yahoo breach in 2013, which allegedly affected 1 billion users, several big incidents have been reported in the past two years. The most recent big cases include the leak of 1.1 billion profiles in 2018 from Aadhaar, India’s national database of ID and biometric information, and the 700 million accounts affected by Spambot in 2017.

Because some breaches are unearthed only in retrospect, and the scale of a breach tends to be discovered gradually, it is unknown whether other earlier breaches remain undiscovered.

Most affected industries

Number of leaked accounts and severity by industry

Most incidents occur in web companies or government. Those affecting governments can have a high level of severity due to the amount of personal information collected.

How did the breaches happen?

Top causes of data breaches

Among the reported incidents, hacking is the main cause of breaches, followed by accidental publishing.
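For anyone who wants to reproduce this kind of ranking, here is a minimal sketch in Python. The rows and the field names (`org`, `method`, `records_lost`) are hypothetical placeholders I made up for illustration, not the actual Information is Beautiful dataset, and `rank_causes` is just an illustrative helper.

```python
from collections import Counter

# Hypothetical placeholder rows; these are NOT real breach records.
incidents = [
    {"org": "Org A", "method": "hacked", "records_lost": 500},
    {"org": "Org B", "method": "accidentally published", "records_lost": 200},
    {"org": "Org C", "method": "hacked", "records_lost": 300},
    {"org": "Org D", "method": "lost device", "records_lost": 50},
    {"org": "Org E", "method": "hacked", "records_lost": 100},
    {"org": "Org F", "method": "accidentally published", "records_lost": 75},
]


def rank_causes(rows):
    """Return (method, incident_count) pairs, most frequent first."""
    return Counter(row["method"] for row in rows).most_common()


cause_ranking = rank_causes(incidents)
for method, count in cause_ranking:
    print(f"{method}: {count}")
```

Counting incidents rather than summing `records_lost` matches the "top causes" framing above; summing records instead would let a single very large breach dominate the ranking.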

Number of incidents and severity by company from late 2014 to early 2018

Repeated incidents

AOL, Citibank and Yahoo each had three reported data breaches in the past five years. Dropbox and the US military were also breached more than once.
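Spotting repeat incidents comes down to counting rows per organization and keeping those that appear more than once. A rough sketch, assuming each incident is an (organization, year) pair; the names and years below are invented placeholders, and `repeated_orgs` is a hypothetical helper rather than anything from the original analysis.

```python
from collections import Counter

# Hypothetical placeholder incidents as (organization, year) pairs.
incidents = [
    ("Org A", 2014),
    ("Org B", 2015),
    ("Org A", 2016),
    ("Org C", 2016),
    ("Org A", 2017),
    ("Org B", 2018),
]


def repeated_orgs(rows, min_incidents=2):
    """Map each organization with at least `min_incidents` rows to its count."""
    counts = Counter(org for org, _year in rows)
    return {org: n for org, n in counts.items() if n >= min_incidents}


repeats = repeated_orgs(incidents)
print(repeats)
```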

This is #day85 of my #100dayprojects on data science and visual storytelling. If you like it, please share it. Suggestions for new topics and feedback are always welcome.


