How to discover up to 10,000 subdomains with your own tool

_Y000_ · Published in Nerd For Tech
7 min read · Feb 26, 2021

This time you will learn how to create your own tool for discovering the subdomains of websites. If you spend your free time reporting vulnerabilities, this can be very helpful for you.

To create this tool we are going to use bash to program the tasks we need, and we will go through them part by part.

First we need to know what a subdomain is:

A subdomain is a way to have a (web) site attached, as an annex, to a main website.

Subdomains are of the form http://subdominio.dominio.com. They actually point to a folder of the hosting you have contracted, but serve its content from the subdomain.

For example, if instead of displaying your blog at http://www.tudominio.com/blog/ you want to display it from a subdomain, you could create a subdomain such as http://blog.tudominio.com

https://www.dondominio.com/help/es/116/que-es-subdominio/
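In bash terms, the subdomain is simply the left-most label of the full hostname; a quick sketch (blog.tudominio.com is just a sample name, we are not querying anything):

```shell
# The subdomain is the left-most label of the full hostname (sample data)
fqdn="blog.tudominio.com"
# "%%.*" removes everything from the first dot onward, leaving "blog"
echo "${fqdn%%.*}"
```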

With that idea of what a subdomain is in mind, let's move on to the tool itself.

If you have never used Linux, let alone written your own bash scripts, here is what bash is:

Bash is a command interpreter that usually runs in a text window where the user types commands in text mode. Bash can also read and execute commands from a file, called a script.

https://es.wikipedia.org/wiki/Bash
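As a minimal illustration (a hypothetical hola.sh, not part of the tool we are building), a bash script is just commands saved in a file:

```shell
#!/bin/bash
# hola.sh - a minimal bash script (hypothetical example)
saludo="Hola"            # assign a variable (no spaces around "=")
echo "$saludo, mundo"    # prints: Hola, mundo
```

You would run it with bash hola.sh, or mark it executable with chmod +x hola.sh and run ./hola.sh.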

Well, let’s get down to business.

For this tool we are going to use curl, which will make the requests for us. Our tool is based on a web page that performs this job of discovering the subdomains of a website, and what I like most is that it is very fast.

The website that will help us is the following. It is not mine, so I would like to thank its creators for such a great tool!

https://sonar.omnisint.io/subdomains/ejemplo.com

When we have the result of the page we will see something like this:

What we are looking at are the subdomains of the domain we used in the page URL.

In this example the result is shown in more detail, and only a few subdomains were found.

Getting into the creation of our tool, the first thing we have to know is that the same request the sonar.omnisint.io page makes can be made from a terminal, using curl.

We do this with the following command line:

curl "https://sonar.omnisint.io/subdomains/ejemplo.com"

As you can see, we get the same result executing this command as we do visiting the page in a browser. So far it may seem pointless to simply change where the request is made from, but let's not forget that we are working from Linux, and we have the option of creating a bash script.
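The endpoint appears to return its subdomains as a JSON array on a single line. As a sketch of how that shape can be split into one subdomain per line (using invented sample data, not real output from the service):

```shell
# Sample of the JSON-array shape the endpoint appears to return (invented data)
respuesta='["www.ejemplo.com","mail.ejemplo.com","blog.ejemplo.com"]'
# Delete the brackets and quotes, then turn each comma into a newline,
# leaving one subdomain per line
echo "$respuesta" | tr -d '[]"' | tr ',' '\n'
```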

Taking advantage of this possibility, we are going to create a script that helps us automate the process and show a few more details. In this case we will handle these 2 simple things:

  • Create a variable where we store the domain we want to scan
  • Create another variable to store the number of subdomains found.

But let’s go step by step.

The first thing we are going to do is redirect the output of the previous command so that it is saved to a text file:

curl "https://sonar.omnisint.io/subdomains/ejemplo.com" > subdominios.txt
cat subdominios.txt

There is something we have to keep in mind when creating a script: keeping our working environment clean. We should always try to eliminate what we do not need, and in this case we have the following:

What is inside the red box is curl's progress meter, which shows how the request is going over time, but this does not interest us. To keep the output as clean as possible, we have the following options:

curl --silent --insecure "https://sonar.omnisint.io/subdomains/ejemplo.com" > subdominios.txt && cat subdominios.txt

With this we have a much cleaner result. Now, since we have already defined what we want to do, and one of those things is counting the number of subdomains, let's do that with the "wc" command.

cat subdominios.txt | wc -l

What the wc -l command does is count the lines contained in a file. Returning to the point that we have to remove the parts we do not need, we can see that the output of cat subdominios.txt has 2 lines too many:

The lines in red are of no use to us, and the ones marked in white are the ones that really interest us, because they are the lines containing the subdomains. Once we drop the red ones, our line count will no longer carry that 2-line margin of error.
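A quick self-contained check of how wc -l behaves (subdominios_demo.txt is just a throwaway file for the demo):

```shell
# Write three lines to a throwaway demo file
printf 'uno\ndos\ntres\n' > subdominios_demo.txt
# wc -l counts the newlines in the file, so it outputs 3
wc -l < subdominios_demo.txt
# Clean up, keeping the environment tidy
rm subdominios_demo.txt
```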

For that we are going to use REGEX.

grep -oE "[a-zA-Z0-9._-]+\.ejemplo\.com"

What this regex does is filter the result and display only sequences made up of:

  • letters a-z
  • letters A-Z
  • numbers 0-9
  • and the characters . _ -

that appear immediately before the text ".ejemplo.com" (the \. escapes each dot so it matches a literal period).
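Here is the filter in action on a sample line (invented data shaped like the service's output):

```shell
# Invented input mixing subdomains with JSON punctuation
linea='["www.ejemplo.com","mail.ejemplo.com","ejemplo.com"]'
# -o prints only the matching part, -E enables extended regex;
# the bare "ejemplo.com" entry does not match because nothing
# precedes ".ejemplo.com" in it, so only the two subdomains print
echo "$linea" | grep -oE '[a-zA-Z0-9._-]+\.ejemplo\.com'
```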

The complete command line:

curl --silent --insecure "https://sonar.omnisint.io/subdomains/ejemplo.com" | grep -oE "[a-zA-Z0-9._-]+\.ejemplo\.com" > subdominios1.txt

I decided to add a comparison of the output without the regex and with the regex so you can see the difference: the first time we applied wc -l it reported 28 lines found, and now with the regex only 25. That is because the output without the regex also returns the domain itself and other extra lines, which the regex removes, so now we have the exact number of subdomains.
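The script below also uses sort -u, which sorts the lines and removes duplicates, so each subdomain is counted only once; a quick sketch with invented lines:

```shell
# Three lines, one of them duplicated; sort -u sorts them and
# keeps only the unique ones, so two lines come out
printf 'b.ejemplo.com\na.ejemplo.com\na.ejemplo.com\n' | sort -u
# Counting them confirms the duplicate was dropped: the count is 2
printf 'b.ejemplo.com\na.ejemplo.com\na.ejemplo.com\n' | sort -u | wc -l
```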

Now we have everything ready to start creating the script, which will look like this:

#!/bin/bash
clear
read -p "Ingresa un host: " HOST
host=($HOST)
for hosts in "${host[@]}"
do
curl --silent --insecure "https://sonar.omnisint.io/subdomains/$HOST" > subdominios1.txt
cat subdominios1.txt | grep -oE "[a-zA-Z0-9._-]+\.$HOST" | sort -u > subdominios2.txt
numero_subdominios=$(cat subdominios2.txt | wc -l)
echo "Se encontraron: $numero_subdominios subdominios"
done

Now let’s take it one step at a time:

#!/bin/bash

This line (the "shebang") starts every bash script.

clear
read -p "Ingresa un host: " HOST

"clear" is used to clear the screen.

"read -p" shows the prompt "Ingresa un host: " ("Enter a host") and stores the user's answer in the HOST variable.

host=($HOST)

"host" is an array variable where the chosen domain is stored, so the for loop below can iterate over it.

for hosts in "${host[@]}"
do
curl --silent --insecure "https://sonar.omnisint.io/subdomains/$HOST" > subdominios1.txt
cat subdominios1.txt | grep -oE "[a-zA-Z0-9._-]+\.$HOST" | sort -u > subdominios2.txt

Here we enter the body of the script. This part is in charge of making the request to the page from which we obtain the subdomains; the commands used were already explained above.

numero_subdominios=$(cat subdominios2.txt | wc -l)
echo "Se encontraron: $numero_subdominios subdominios"
done

Finally we count the lines of subdominios2.txt to obtain the number of subdomains found; note that we do this inside the loop, after the file has been written, so the count is up to date. The echo then prints the variable numero_subdominios on screen (the message means "Found: N subdomains").

Finally we have this result:

To finish, I would like to clarify that this is just a small example of everything we can achieve. I personally took this from one of my tools published on my GitHub account, which you can find here:

Example:

The domain yandex.com has 9,802 subdomains, and with this method you can get them all in 26 seconds.

Thank you very much for giving this write-up a chance! I hope you find it useful!
