1.4 Billion Clear Text Credentials Discovered in a Single Database

Julio Casal
@4iQ
Published in
5 min readDec 9, 2017

A Massive Resource for Cybercriminals Makes it Easy to Access Billions of Credentials.

Now even unsophisticated and newbie hackers can access the largest trove ever of sensitive credentials in an underground community forum. Is the cyber crime epidemic about become an exponentially worse?

While scanning the deep and dark web for stolen, leaked or lost data, 4iQ discovered a single file with a database of 1.4 billion clear text credentials — the largest aggregate database found in the dark web to date.

None of the passwords are encrypted, and what’s scary is the we’ve tested a subset of these passwords and most of the have been verified to be true.

The breach is almost two times larger than the previous largest credential exposure, the Exploit.in combo list that exposed 797 million records. This dump aggregates 252 previous breaches, including known credential lists such as Anti Public and Exploit.in, decrypted passwords of known breaches like LinkedIn as well as smaller breaches like Bitcoin and Pastebin sites.

This is not just a list. It is an aggregated, interactive database that allows for fast (one second response) searches and new breach imports. Given the fact that people reuse passwords across their email, social media, e-commerce, banking and work accounts, hackers can automate account hijacking or account takeover.

This database makes finding passwords faster and easier than ever before. As an example searching for “admin,” “administrator” and “root” returned 226,631 passwords of admin users in a few seconds.

The data is organized alphabetically, offering examples of trends in how people set passwords, reuse them and create repetitive patterns over time. The breach offers concrete insights into password trends, cementing the need for recommendations, such as the NIST Cybersecurity Framework.

While we are still processing the data, below are the technical details of our initial findings, including:

  • Sources of the Data
  • Details about the Dump File
  • Data Freshness
  • Discoveries regarding Credential Stuffing and Password Reuse

Source of the Data

The dump includes a file called “imported.log” with 256 corpuses listed, including and with added data from all those in the Exploit.in and Anti Public dumps as well as 133 addition or new breaches. Some examples of the breaches listed the file we found:

Last breaches added to the database

About the Dump File

The 41GB dump was found on 5th December 2017 in an underground community forum. The database was recently updated with the last set of data inserted on 11/29/2017. The total amount of credentials (usernames/clear text password pairs) is 1,400,553,869.

There is not indication of the author of the database and tools, although Bitcoin and Dogecoin wallets are included for donation.

The data is structured in an alphabetic directory tree fragmented in 1,981 pieces to allow fast searches.

Data is fragmented and sorted in two and three level directories

The dump includes search tools and insert scripts explained in a README file.

Freshness

We’ve found that although the majority of these breaches are known within the Breach and Hacker community, 14% of exposed username/passwords pairs had not previously been decrypted by the community and are now available in clear text.

We compared the data with the combination of two larger clear text exposures, aggregating the data from Exploit.in and Anti Public. This new breach adds 385 million new credential pairs, 318 million unique users, and 147 million passwords pertaining to those previous dumps.

Data comparison with Exploit.in and Anti Public breaches

Credential Stuffing and Password Reuse

Since the data is alphabetically organized, the massive problem of password reuse — — same or very similar passwords for different accounts — — appears constantly and is easily detectable.

A couple of the constant examples of password reuse that can be found:

password reuse examples discovered

And how password patterns changes over time:

password patterns discovered

Top Passwords

The list of top 40 Passwords and volume found:

More Analysis, Stay Tuned

This experience of searching and finding passwords within this database is as scary as it is shocking. Almost all of the users we’ve checked have verified the passwords we found were true. Most reactions were

but that’s an old password…

commonly followed by an

Oh my god! I still use that password in <this> site…

a few seconds later.

4iQ’s mission is to protect your digital identity in the new data breach era by scanning the surface, social and deep and dark web.

We will be following up with more information soon and will provide solutions to protect consumers and companies from this and other alarming exposures.

UPDATE — 12/12/2017

Some answers to a number of requests we’ve received:

Can you provide a link to the database?

Quite a few people have asked for a link to the database, but we cannot do that. Our policy, is not to share links or details open resources that can spread such sensitive information.

--

--