Yet another enumeration of subdomains with statistics

zzzteph
Feb 15, 2022

Or how to collect a million bug bounty subdomains in order to make a few wordlists.

I tried to collect more than a million subdomains for nearly 3000 domains taken from bug bounty scopes. Among them were Google, PayPal, Apple, and many others. I used https://github.com/arkadiyt/bounty-targets-data to gather the data required for further analysis.

I put all the information in a repository; its contents are described below.

Summary

When gathering information about a particular scope, whether for a bug bounty or a private assessment, you always need to find as much information as possible. There are many examples of companies being hacked through critical vulnerabilities on servers that were found through subdomain enumeration.

The great thing about subdomains is that most of them carry some meaning or legend hidden in the name, like:

  • dev.services
  • api
  • staging
  • ds1-eu-central.portal
  • us-vpn-poc
  • blo01–01m01-sw01 — even this nasty one(!)

So if there were a way to find out how these names are generated, they might be easier to find. For example, you could generate a massive wordlist with all possible combinations for us-vpn-poc. But how big would that wordlist be? Counting only its 8 letters, there are at least 26⁸, or 208827064576, different combinations. Do you need to iterate over all of them? I’m not sure, and it would take about 8 months (roughly 242 days) even at 10000 subdomains per second. And for ds1-eu-central it would take millennia upon millennia.
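The estimate above can be checked with a few lines of Python. This is only a rough sketch: it assumes that just the 8 letters of us-vpn-poc vary while both hyphens stay at fixed positions, and the helper name brute_force_days is hypothetical. The rate of 10000 checks per second is the one used in the text.

```python
# Rough cost of a letters-only brute force for a name shaped like "us-vpn-poc".
SECONDS_PER_DAY = 24 * 60 * 60


def brute_force_days(alphabet_size, length, rate_per_second):
    """Days needed to try every string of `length` symbols at the given rate."""
    combinations = alphabet_size ** length
    return combinations / rate_per_second / SECONDS_PER_DAY


# Assumption: 8 lowercase letters, hyphens fixed, 10000 checks per second.
combos = 26 ** 8
days = brute_force_days(26, 8, 10_000)
print(combos, round(days))
```

Under these assumptions the script reports 208827064576 combinations and roughly 242 days. Allowing digits and hyphens in every position, or moving to a longer name, makes the number grow exponentially.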

To make finding subdomains easier, there are a lot of different wordlists that contain popular subdomain names and allow researchers to find targets quickly.

To enumerate the subdomains, you can also find many excellent tools, and each of them has a built-in wordlist as well.

The idea was to collect all domains from every public bug bounty scope, find all their subdomains with amass, analyze the results, and generate a few wordlists that may be helpful.

On GitHub, you can find some statistics for the subdomains https://github.com/zzzteph/substats

You can download the full lists from the following links:

all: all collected subdomains that were validated, with the root domain removed.

all_unchecked: all collected subdomains, with the root domain removed.

complex: the words used in complex subdomain names. For example, mon01-dev-test contributes the words mon01, dev, and test.
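To illustrate how a list like complex could be built, here is a minimal sketch in Python. It is not the script used for the repository: the function names split_words and word_counts are hypothetical, and treating hyphens and underscores as the word separators is an assumption.

```python
import re
from collections import Counter


def split_words(label):
    """Split one subdomain label into words on hyphens and underscores."""
    return [word for word in re.split(r"[-_]+", label.lower()) if word]


def word_counts(labels):
    """Count words, looking only at labels made of more than one word."""
    counts = Counter()
    for label in labels:
        words = split_words(label)
        if len(words) > 1:  # single-word labels are not "complex"
            counts.update(words)
    return counts


sample = ["mon01-dev-test", "api", "us-vpn-poc", "dev-api"]
print(split_words("mon01-dev-test"))
print(word_counts(sample).most_common(3))
```

Sorting the words by frequency, as most_common does, puts the most reusable building blocks at the top of the wordlist.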

In the wordlists folder, you can find lists for each subdomain level, from 1 to even 9*.

*Numbering begins from 0
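A per-level split could be produced along these lines. This is a hedged sketch rather than the repository's actual code: strip_root and labels_by_level are hypothetical names, and it assumes that level 0 is the label closest to the root domain, which the text does not spell out.

```python
from collections import defaultdict


def strip_root(hostname, root):
    """Remove the root domain, e.g. dev.services.example.com -> dev.services."""
    hostname = hostname.lower().rstrip(".")
    root = root.lower().rstrip(".")
    if hostname == root:
        return ""
    suffix = "." + root
    # Hostnames outside the root domain are returned unchanged.
    return hostname[: -len(suffix)] if hostname.endswith(suffix) else hostname


def labels_by_level(prefixes):
    """Group labels by level; level 0 is nearest the root (an assumption)."""
    levels = defaultdict(set)
    for prefix in prefixes:
        if not prefix:
            continue
        for level, label in enumerate(reversed(prefix.split("."))):
            levels[level].add(label)
    return levels


prefix = strip_root("dev.services.example.com", "example.com")
levels = labels_by_level([prefix, "api"])
print(prefix, dict(levels))
```

Here example.com is a placeholder root domain. Each resulting set can be written out as its own wordlist, one file per level.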

So, this is it! Thank you for reading, and I hope this will help in your information gathering!
