Writing a scanner to find reflected XSS vulnerabilities — Part 1
In 2016, I worked on a web application scanner project that works much like Burp Suite: it proxies HTTP(S) requests from a browser or Selenium automation tools and sends them to different modules/plugins for vulnerability scanning. The architecture we built was fairly modular, and most of the application was written in Python, with the back end built on Django and Celery for asynchronous tasks.
I am writing this blog post to help security engineers write vulnerability scanners for themselves or for the community.
Give them a scanner, and only they will benefit.
Teach them how to write a scanner, and the whole community will benefit.
While building anything, we first need to stick to the basics and figure out the following things:
- How does it function? Create a simple flow chart of how you want the scanner to work: what input it takes, what it parses, and finally what it outputs.
- What technology are we going to use? Selecting the right technology is very important. While selecting a technology, you should be aware of the libraries you will need and of how to scale it up as your needs grow. More than anything else, though, what matters is how comfortable you are with the technology. I might spend 20 hours writing code in Golang and get output that is only slightly better than a Python project I completed in just 5 hours. If that’s the case, I would stick with Python.
Let’s get started. Since my previous project was in Python, I’ll stick with it. First, let’s create a function diagram. For that, we need to understand what a reflected cross-site scripting vulnerability is and how to identify it.
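To make the definition concrete: a reflected XSS bug exists when a value from the request is copied into the response without being encoded. The sketch below is only an illustration. `render_search_page`, its `escape` flag, and `is_reflected` are hypothetical names made up for this example and are not part of the scanner.

```python
import html


def render_search_page(query, escape=False):
    """Hypothetical server-side handler that echoes the search term back."""
    shown = html.escape(query) if escape else query
    return "<html><body><h2>Results for: {}</h2></body></html>".format(shown)


def is_reflected(probe, response_body):
    """A probe counts as reflected if it appears verbatim in the response."""
    return probe in response_body


payload = "<script>alert(1)</script>"
vulnerable = render_search_page(payload)         # payload echoed as-is
safe = render_search_page(payload, escape=True)  # payload HTML-encoded

print(is_reflected(payload, vulnerable))  # True
print(is_reflected(payload, safe))        # False
```

Identifying the bug therefore comes down to two questions: does our input come back in the response, and does it come back unencoded?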
How does it function
The whole scanner can be divided into the following modules
- Raw HTTP request parser
- Initial prober
- Context analyzer
- Payload generator
- Payload confirmer
Let’s start by creating each module and then, at the end, patch them together.
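One way to picture how the modules fit together is as a chain in which each stage consumes the output of the previous one. The skeleton below is only a sketch of that flow. `run_pipeline` and `placeholder` are hypothetical names, and the stages do nothing except record that they ran; the real modules are written in the sections that follow.

```python
def run_pipeline(state, stages):
    """Pass the state through each (name, function) stage in order."""
    for _name, stage in stages:
        state = stage(state)
    return state


def placeholder(name):
    """Return a stand-in stage that only records that it ran."""
    def stage(state):
        return dict(state, done=state["done"] + [name])
    return stage


MODULES = [
    "raw HTTP request parser",
    "initial prober",
    "context analyzer",
    "payload generator",
    "payload confirmer",
]

stages = [(name, placeholder(name)) for name in MODULES]
state = run_pipeline({"raw": b"GET / HTTP/1.1\r\n\r\n", "done": []}, stages)
print(state["done"])
```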
Create a python virtualenv
pip3 install virtualenv
python3 -m virtualenv xss_env
Activate the virtualenv (the command below is for Windows; on Linux or macOS, run source xss_env/bin/activate instead)
cd xss_env/Scripts && activate
Create a new folder outside the virtualenv folder
mkdir rxss
1 - Raw HTTP request parser
Now that we have set up our environment, let’s start writing some code. The first module is a raw HTTP request parser that reads a request from a file and converts it into a request object. For this, we are going to use the http library that ships with Python 3.
from __future__ import absolute_import, unicode_literals
from http.server import BaseHTTPRequestHandler
from io import BytesIO


class HTTPRequest(BaseHTTPRequestHandler):
    def __init__(self, request_text):
        # Parse from an in-memory buffer instead of a socket
        self.rfile = BytesIO(request_text)
        self.raw_requestline = self.rfile.readline()
        self.error_code = self.error_message = None
        self.parse_request()

    def send_error(self, code, message=None, explain=None):
        # Record parse errors instead of writing a response
        self.error_code = code
        self.error_message = message
The above class takes a raw HTTP request (as bytes) and converts it into a request object. Take the following request as an example:
POST /search.php?test=query HTTP/1.1
Host: testphp.vulnweb.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 27
Origin: http://testphp.vulnweb.com
Connection: close
Referer: http://testphp.vulnweb.com/search.php?test=query
Upgrade-Insecure-Requests: 1

searchFor=asdas&goButton=go
Save the above request as requests.txt
from __future__ import absolute_import, unicode_literals
from http.server import BaseHTTPRequestHandler
from io import BytesIO


class HTTPRequest(BaseHTTPRequestHandler):
    def __init__(self, request_text):
        self.rfile = BytesIO(request_text)
        self.raw_requestline = self.rfile.readline()
        self.error_code = self.error_message = None
        self.parse_request()

    def send_error(self, code, message=None, explain=None):
        self.error_code = code
        self.error_message = message


with open("requests.txt", "rb") as f:
    request = HTTPRequest(f.read())

if not request.error_code:
    print(request.command)  # prints the request method
    print(request.path)  # prints the request path
    print(request.headers.keys())  # prints the request header names
    print(request.headers['host'])  # prints the Host header
    # Default to 0 so a request without a body does not raise
    content_len = int(request.headers.get('Content-Length', 0))
    print(request.rfile.read(content_len))  # prints the request body
Save the above code as request_parser.py and execute it using the following command:
python3 request_parser.py
POST
/search.php?test=query
['Host', 'User-Agent', 'Accept', 'Accept-Language', 'Accept-Encoding', 'Content-Type', 'Content-Length', 'Origin', 'Connection', 'Referer', 'Upgrade-Insecure-Requests']
testphp.vulnweb.com
b'searchFor=asdas&goButton=go'
We have successfully parsed an HTTP request. One final thing is left: converting the body and request params into a dict so that we can easily parse them and add our own payloads.
The full request parser code can be found here:
https://gist.github.com/akhil-reni/5c20f40729179858570ad1ffdf4502f3
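As a rough idea of what that conversion involves, the sketch below uses the standard library's `urllib.parse` to turn the query string and a form-encoded body into plain dicts. It is not the code from the gist. `split_params` is a hypothetical helper, and keeping only the first value of each parameter is a simplifying assumption (repeated parameters are dropped).

```python
from urllib.parse import parse_qs, urlsplit


def split_params(path, body, content_type=""):
    """Return (query_params, body_params) as dicts of first values."""
    query = parse_qs(urlsplit(path).query, keep_blank_values=True)
    params = {key: values[0] for key, values in query.items()}

    data = {}
    # Only form-encoded bodies are handled in this sketch
    if content_type.startswith("application/x-www-form-urlencoded"):
        form = parse_qs(body.decode("utf-8"), keep_blank_values=True)
        data = {key: values[0] for key, values in form.items()}
    return params, data


params, data = split_params(
    "/search.php?test=query",
    b"searchFor=asdas&goButton=go",
    "application/x-www-form-urlencoded",
)
print(params)  # {'test': 'query'}
print(data)    # {'searchFor': 'asdas', 'goButton': 'go'}
```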
2 - Initial prober
Once we are done parsing, we need a way to insert a probe into the request params as well as the POST body and check whether it is reflected in the response. To do this, we are going to use the requests package for Python.
pip3 install requests
Create a new file called create_insertions.py with the following code
https://gist.github.com/akhil-reni/ed890e7fb7d90a7581c3ce380744b609
The above code parses the params and body to create a list of request objects with the probe as the payload.
python3 create_insertions.py
[<__main__.HTTPRequest object at 0x0000021E8AD34A30>, <__main__.HTTPRequest object at 0x0000021E8AD34B80>, <__main__.HTTPRequest object at 0x0000021E8AD34BB0>]
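The idea behind insertion points can be sketched with plain dicts: for every parameter, build one copy of the request in which only that parameter carries the probe. This is an illustration of the technique and not the gist's implementation. `insertion_points` is a hypothetical name, and the probe string is the one searched for in the responses in this post.

```python
PROBE = "teyascan"  # probe string looked for in the responses


def insertion_points(params, data, probe=PROBE):
    """Yield one (name, params, data) variant per parameter, with only
    that parameter's value replaced by the probe."""
    for name in params:
        yield name, dict(params, **{name: probe}), dict(data)
    for name in data:
        yield name, dict(params), dict(data, **{name: probe})


variants = list(insertion_points(
    {"test": "query"},
    {"searchFor": "asdas", "goButton": "go"},
))
for name, probed_params, probed_data in variants:
    print(name, probed_params, probed_data)
```

Probing one parameter per request keeps the attribution simple: if the probe shows up in a response, the parameter that carried it is the one being reflected.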
Now let’s send each request and check which parameter value is reflected in the response.
import requests

# RequestParser and GetInsertionPoints come from the gists linked above


def send_request(request, scheme):
    url = "{}://{}{}".format(scheme, request.headers.get("host"), request.path)
    req = requests.Request(request.method, url, params=request.params, data=request.data, headers=request.headers)
    r = req.prepare()
    s = requests.Session()
    response = s.send(r, allow_redirects=False, verify=False)
    return response


with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())

i_p = GetInsertionPoints(parser.request)
for request in i_p.requests:
    response = send_request(request, "http")
    if "teyascan" in response.text:
        print("probe reflection found in " + request.insertion)
The output would be something like this
python .\test.py
probe reflection found in searchFor