Writing a scanner to find reflected XSS vulnerabilities — Part 1

Hungrysoul
4 min read · Apr 26, 2020


In 2016, I worked on a web application scanner that works much like Burp Suite: it proxies HTTP(S) requests from a browser or a Selenium automation tool and sends them to different modules/plugins for vulnerability scanning. The architecture we built was modular, and most of the application was written in Python, with the back end built on Django and Celery for asynchronous tasks.

I am writing this blog post to help security engineers write vulnerability scanners for themselves or for the community.

Give them a scanner, and only they will benefit.
Teach them how to write a scanner, and the whole community will benefit.

When building anything, we first need to stick to the basics and figure out the following:

  • How does it function? Create a simple flow chart of how you want the scanner to work: what input it takes, what it parses, and finally what it outputs.
  • What technology are we going to use? Selecting the right technology is very important. When choosing, you should know which libraries you will need and how the project can scale as your needs grow. More than anything else, though, what matters is how comfortable you are with the technology. If I spend 20 hours writing code in Golang and the result is only slightly better than a Python project I could finish in 5 hours, I would stick with Python.

Let’s get started. Since my previous project was in Python, I’ll stick with it. First, let’s create a function diagram. For that, we need to understand what a reflected cross-site scripting vulnerability is and how to identify it.

Reflected XSS Flow Chart
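
To make the idea of "reflection" concrete, here is a minimal offline sketch. The page renderer below is a hypothetical stand-in for a vulnerable server (it is not part of the scanner), and the function names are my own. It only illustrates that reflected XSS becomes possible when user input comes back in the response without being escaped.

```python
import html


def render_search_page(query, escape):
    """Hypothetical server-side handler that echoes the search term back."""
    shown = html.escape(query) if escape else query
    return "<h2>Results for: {}</h2>".format(shown)


def is_reflected(body, marker):
    """A marker counts as reflected when it appears verbatim in the body."""
    return marker in body


payload = "<script>alert(1)</script>"
unsafe_page = render_search_page(payload, escape=False)
safe_page = render_search_page(payload, escape=True)

print(is_reflected(unsafe_page, payload))  # the raw payload comes back untouched
print(is_reflected(safe_page, payload))    # the escaped page no longer contains it
```

A scanner does the same check from the outside: it sends a marker and looks for it in the response.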

How does it function

The whole scanner can be divided into the following modules

  • Raw HTTP request parser
  • Initial prober
  • Context analyzer
  • Payload generator
  • Payload confirmer

Let’s build each module and then finally stitch them together.

Create a Python virtualenv

pip3 install virtualenv
python3 -m virtualenv xss_env

Activate the virtualenv (the command below is for Windows; on Linux or macOS, run source xss_env/bin/activate instead)

cd xss_env/Scripts && activate

Create a new folder outside the virtualenv folder

mkdir rxss

1 - Raw HTTP request parser

Now that we have set up our environment, let’s start writing some code. The first module is a raw HTTP request parser that takes input from a file and converts it into a request object. For this, we are going to use the http module from the Python 3 standard library.

from __future__ import absolute_import, unicode_literals

from http.server import BaseHTTPRequestHandler
from io import BytesIO


class HTTPRequest(BaseHTTPRequestHandler):
    def __init__(self, request_text):
        # Parse from an in-memory buffer instead of a socket
        self.rfile = BytesIO(request_text)
        self.raw_requestline = self.rfile.readline()
        self.error_code = self.error_message = None
        self.parse_request()

    def send_error(self, code, message=None, explain=None):
        # Record parse errors instead of writing an HTTP response
        self.error_code = code
        self.error_message = message

The above class takes a raw HTTP request (as bytes) and converts it into a request object.
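
As a quick sanity check, the class can be exercised on inline bytes before involving any file. This is a small sketch: the class is repeated so the snippet runs on its own, and the two sample inputs are made up for illustration.

```python
from http.server import BaseHTTPRequestHandler
from io import BytesIO


class HTTPRequest(BaseHTTPRequestHandler):
    def __init__(self, request_text):
        self.rfile = BytesIO(request_text)
        self.raw_requestline = self.rfile.readline()
        self.error_code = self.error_message = None
        self.parse_request()

    def send_error(self, code, message=None, explain=None):
        # Store the error instead of sending a response
        self.error_code = code
        self.error_message = message


# A well-formed request: request line, one header, blank line
raw = (
    b"GET /search.php?test=query HTTP/1.1\r\n"
    b"Host: testphp.vulnweb.com\r\n"
    b"\r\n"
)
ok = HTTPRequest(raw)
print(ok.command, ok.path, ok.headers["Host"])

# A malformed request line: parse_request() reports it through send_error()
bad = HTTPRequest(b"GARBAGE\r\n")
print(bad.error_code, bad.error_message)
```

Checking error_code before touching command or path avoids attribute errors on input that is not a valid request.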

POST /search.php?test=query HTTP/1.1
Host: testphp.vulnweb.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 27
Origin: http://testphp.vulnweb.com
Connection: close
Referer: http://testphp.vulnweb.com/search.php?test=query
Upgrade-Insecure-Requests: 1

searchFor=asdas&goButton=go

Save the above request as requests.txt

from __future__ import absolute_import, unicode_literals

from http.server import BaseHTTPRequestHandler
from io import BytesIO


class HTTPRequest(BaseHTTPRequestHandler):
    def __init__(self, request_text):
        # Parse from an in-memory buffer instead of a socket
        self.rfile = BytesIO(request_text)
        self.raw_requestline = self.rfile.readline()
        self.error_code = self.error_message = None
        self.parse_request()

    def send_error(self, code, message=None, explain=None):
        # Record parse errors instead of writing an HTTP response
        self.error_code = code
        self.error_message = message


with open("requests.txt", "rb") as f:
    request = HTTPRequest(f.read())

if not request.error_code:
    print(request.command)  # prints method
    print(request.path)  # prints request path
    print(request.headers.keys())  # prints request header names
    print(request.headers['host'])  # prints request host
    # Default to 0 so a request without a body does not raise
    content_len = int(request.headers.get('Content-Length', 0))
    print(request.rfile.read(content_len))  # prints request body

Save the above code as request_parser.py and run it with the following command:

python3 request_parser.py
POST
/search.php?test=query
['Host', 'User-Agent', 'Accept', 'Accept-Language', 'Accept-Encoding', 'Content-Type', 'Content-Length', 'Origin', 'Connection', 'Referer', 'Upgrade-Insecure-Requests']
testphp.vulnweb.com
b'searchFor=asdas&goButton=go'

We have successfully parsed an HTTP request. The one thing left is to convert the body and the request parameters into dicts so that we can easily parse them and add our own payloads.

The full request parser code can be found here:
https://gist.github.com/akhil-reni/5c20f40729179858570ad1ffdf4502f3
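
For readers who want to see the dict conversion without opening the gist, here is a minimal sketch using urllib.parse from the standard library. It is not the gist's code. The helper name split_params is hypothetical, and it assumes the body is application/x-www-form-urlencoded, as in the sample request.

```python
from urllib.parse import parse_qs, urlsplit


def split_params(path, body):
    """Return (bare_path, query_params, body_params).

    Assumes a form-urlencoded body; JSON or multipart bodies would
    need their own parsers.
    """
    parts = urlsplit(path)
    params = parse_qs(parts.query, keep_blank_values=True)
    data = parse_qs(body.decode("utf-8"), keep_blank_values=True)
    return parts.path, params, data


path, params, data = split_params(
    "/search.php?test=query", b"searchFor=asdas&goButton=go"
)
print(path)    # /search.php
print(params)  # {'test': ['query']}
print(data)    # {'searchFor': ['asdas'], 'goButton': ['go']}
```

parse_qs maps each name to a list because a parameter may repeat. keep_blank_values=True keeps empty parameters, which are still valid insertion points.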

2 - Initial prober

Once we are done parsing, we need a way to insert a probe into the request parameters as well as the POST body, and check whether it is reflected in the response. To do this, we are going to use the requests package.

pip3 install requests

Create a new file called create_insertions.py with the following code

https://gist.github.com/akhil-reni/ed890e7fb7d90a7581c3ce380744b609

The above code parses the params and body to create a list of request objects, each carrying the probe as a payload.

python3 create_insertions.py
[<__main__.HTTPRequest object at 0x0000021E8AD34A30>, <__main__.HTTPRequest object at 0x0000021E8AD34B80>, <__main__.HTTPRequest object at 0x0000021E8AD34BB0>]
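
The three objects correspond to the three parameters in the sample request (test, searchFor and goButton). The idea can be sketched with plain dicts. This is an illustration rather than the gist's implementation: the function name and the returned dict layout are assumptions, and only the probe string "teyascan" comes from this post.

```python
import copy

PROBE = "teyascan"  # probe string used in this post


def create_insertions(params, data, probe=PROBE):
    """Build one mutated copy of the request per parameter.

    Each copy has exactly one value replaced by the probe, so a
    reflection can be traced back to a single insertion point.
    """
    insertions = []
    for location, source in (("params", params), ("data", data)):
        for name in source:
            mutated = {
                "params": copy.deepcopy(params),
                "data": copy.deepcopy(data),
            }
            mutated[location][name] = probe
            insertions.append(
                {"insertion": name, "location": location, **mutated}
            )
    return insertions


params = {"test": "query"}
data = {"searchFor": "asdas", "goButton": "go"}
for item in create_insertions(params, data):
    print(item["insertion"], item["params"], item["data"])
```

Mutating one parameter per request costs more requests than probing all of them at once, but it makes the result unambiguous.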

Now let’s send each request and check which parameter value is reflected in the response.

import requests

# RequestParser and GetInsertionPoints come from the code in the gists linked above


def send_request(request, scheme):
    url = "{}://{}{}".format(scheme, request.headers.get("host"), request.path)
    req = requests.Request(request.method, url, params=request.params, data=request.data, headers=request.headers)
    r = req.prepare()
    s = requests.Session()
    # verify=False skips TLS certificate checks; use it only against test targets
    response = s.send(r, allow_redirects=False, verify=False)
    return response


with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())

i_p = GetInsertionPoints(parser.request)

for request in i_p.requests:
    response = send_request(request, "http")
    if "teyascan" in response.text:
        print("probe reflection found in " + request.insertion)

The output would be something like this

python .\test.py
probe reflection found in searchFor
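
The membership test only tells us that the probe came back, not where. As a small step toward the context analyzer, a helper along these lines could report each reflection with some surrounding markup. This is a sketch: the function name is hypothetical and the sample HTML is made up, not the real response from the test site.

```python
def find_reflections(body, probe, window=20):
    """Return (offset, snippet) for every occurrence of probe in body."""
    hits = []
    start = body.find(probe)
    while start != -1:
        left = max(0, start - window)
        right = start + len(probe) + window
        hits.append((start, body[left:right]))
        # Continue searching after the current match
        start = body.find(probe, start + len(probe))
    return hits


# Made-up response body with one reflection in text and one in an attribute
sample = (
    '<h2>searched for: teyascan</h2>'
    '<input name="searchFor" value="teyascan">'
)
for offset, snippet in find_reflections(sample, "teyascan"):
    print(offset, snippet)
```

The snippet around each hit shows whether the probe landed in text, inside an attribute, or inside a script block, and that decides which payloads are worth trying.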

That worked like a charm. I will end this post here and finish the rest of the scanner in part two of this series.


Hungrysoul

Python Dev, Part time Bug Bounty Hunter & a Full time entrepreneur.