Detecting Cross-Site Scripting Vulnerabilities

MRunal
Jun 9 · 8 min read

Abstract

The best practice to prevent Cross Site Scripting
(XSS) attacks is to apply encoders to sanitize untrusted data. To
balance security and functionality, encoders should be applied to
match the web page context, such as HTML body, JavaScript,
and style sheets. A common programming error is the use of a
wrong encoder to sanitize untrusted data, leaving the application
vulnerable. We present a security unit testing approach to detect
XSS vulnerabilities caused by improper encoding of untrusted
data. Unit tests for the XSS vulnerability are automatically
constructed out of each web page and then evaluated by a unit
test execution framework. A grammar-based attack generator is
used to automatically generate test inputs. We evaluate our
approach on a large open source medical records application,
demonstrating that we can detect many 0-day XSS vulnerabilities
with very few false positives, and that the grammar-based attack
generator achieves better test coverage than industry best practices.

I. INTRODUCTION

Cross Site Scripting (XSS) is one of the most common
security vulnerabilities in web applications. Cross Site
Scripting attacks occur when an attacker successfully injects a
malicious JavaScript payload into a web page to be executed
by users requesting such a web page. The advised best practice to
prevent XSS attacks is to encode untrusted program variables
with dynamic content before their values are sent to the
browser. While one can prevent all XSS attacks by using the
strictest encoder, doing so also takes away many useful web site
functions. To balance security and functionality, developers
must therefore choose the appropriate encoder depending on
the context of the content, such as HTML or JavaScript. Static
analysis [15] techniques are widely used to ensure a web
application uses encoding functions to sanitize untrusted data.
However, static analysis cannot verify whether the correct
encoding function is used. Acunetix Web Application
Vulnerability Report [1] shows that nearly 38% and 33% of
web sites were vulnerable to XSS attacks in 2015 and 2016
respectively.

In this paper we present a unit-testing based approach to
automatically detect XSS vulnerabilities due to incorrect
encoding function usage. We have built a proof-of-concept
implementation for web applications written in Java and JSP.
This approach can be extended to other server-side web
programming languages as well (e.g. PHP and ASP). For the
rest of this section, we provide a brief background on encoding
and our approach.

Well tested encoding functions have been written for
content placed in the following contexts: HTML body, HTML
attribute, CSS, URL, and JavaScript. Consider the fragment of
a JSP program shown in Fig 1. Native Java code is enclosed in
<% %>.

1. <% String pid = (String) request.getParameter("pid"); %>
2. <% String addr = (String) request.getParameter("addr"); %>
3. <a href="javascript:void(0)" onclick="action('<%= escapeHtml(pid) %>')"> mylink </a>
4. <p> <%= escapeHtml(addr) %>

This example has two user provided inputs: pid and addr.
Variable pid is used as part of rendering an HTML anchor
element on line 3, and addr is displayed in the HTML body on
line 4. A maliciously supplied input for addr might be
<script> atk(); </script>.
If the encoding function, escapeHtml(), were not applied,
this would cause the execution of the JavaScript function atk()
on line 4. Encoding function escapeHtml() replaces < and >
characters with &lt; and &gt; respectively and transforms the
malicious input into the following string, preventing atk();
from being interpreted as a JavaScript program by the browser:
&lt;script&gt; atk(); &lt;/script&gt;
However, the same encoding function does not work for
the case on line 3. A malicious input for pid might be the
following:
'); atk(); //
It passes through escapeHtml() unchanged. The rendered anchor
element would be as follows.
<a href="javascript:void(0)"
onclick="action(''); atk(); //')"> mylink </a>

JavaScript function atk() will be executed when the link is
clicked. The correct JavaScript encoder would, in this case,
replace the single quote character with \' to prevent this attack.
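The contrast between the two contexts can be reproduced with a minimal sketch. The two encoders below are hypothetical simplifications written for this illustration (they are not the application's or any library's encoders); escapeHtml() handles only a few HTML metacharacters, and escapeJs() handles only backslashes and quotes.

```java
class EncoderContextDemo {
    // Hypothetical HTML-body encoder: escapes &, <, > and the double quote only.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Hypothetical JavaScript-string encoder: backslash-escapes backslashes and
    // single quotes, and hex-escapes double quotes so the result can sit inside
    // a double-quoted HTML attribute.
    static String escapeJs(String s) {
        return s.replace("\\", "\\\\")
                .replace("'", "\\'")
                .replace("\"", "\\x22");
    }

    // Renders the anchor element from line 3 of the JSP fragment.
    static String renderAnchor(String encodedPid) {
        return "<a href=\"javascript:void(0)\" onclick=\"action('"
                + encodedPid + "')\"> mylink </a>";
    }

    public static void main(String[] args) {
        String addr = "<script> atk(); </script>";
        String pid = "'); atk(); //";

        // HTML body context: the script tags are neutralized.
        System.out.println(escapeHtml(addr));

        // JavaScript context with the wrong encoder: the quote survives,
        // so the string literal is closed and atk() becomes a statement.
        System.out.println(renderAnchor(escapeHtml(pid)));

        // JavaScript context with a JavaScript encoder: the quote is escaped
        // and the payload stays inside the string literal.
        System.out.println(renderAnchor(escapeJs(pid)));
    }
}
```

A production encoder covers many more characters than this sketch; the point is only that the same input is safe or unsafe depending on which encoder matches the output context.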

II. UNIT TEST CONSTRUCTION

To ensure test path coverage, we construct a set of unit tests
automatically based on each JSP file with the goal that if the
original JSP file has an XSS vulnerability due to incorrect
encoder usage, at least one of the constructed unit tests will be
similarly vulnerable as well. We refer to the JSP file in the
application as the original unit test and each unit test JSP file
generated as the XSS unit test. The inputs for XSS unit test
construction are (1) the source code, (2) the untrusted sources,
and (3) the sinks. Untrusted sources are Java functions or
statements from which malicious data can be brought into the
web application, such as request.getParameter(). Sinks are
statements used to generate the HTML outputs to be rendered
by browsers. There are a number of sinks in the context of JSP
applications: out.write() , out.print(), out.println(),
out.append(), or <%= %>. We illustrate the unit test
generation using Figure 2 as the original code and Figure 3 as
one of the constructed XSS unit tests.

  <% String ordID = request.getParameter("order");
     ordID = escapeHtml(ordID);
     if (editMode) { %>
  <a onclick="edit('<%= ordID %>')" href="#"> Edit Order </a>
  <% } else { %>
  <span> Order: <%= ordID %> </span>
  <% } %>
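As a rough illustration of how sinks might be located in a JSP source, the following sketch scans each line for the sink forms listed above and records the line number. The class name SinkFinder and the regular expression are assumptions made for this example; a real implementation would parse the JSP rather than pattern-match text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SinkFinder {
    // Sinks named in the text: <%= %>, out.write(), out.print(),
    // out.println(), out.append(). Purely textual matching (an assumption).
    private static final Pattern SINK = Pattern.compile(
            "<%=.*?%>|\\bout\\.(?:write|println|print|append)\\s*\\(");

    // Returns one "lineNumber: matchedText" entry per sink occurrence.
    static List<String> findSinks(String jspSource) {
        List<String> hits = new ArrayList<>();
        String[] lines = jspSource.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            Matcher m = SINK.matcher(lines[i]);
            while (m.find()) {
                hits.add((i + 1) + ": " + m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The order-editing fragment, one statement per line.
        String jsp = String.join("\n",
                "<% String ordID = request.getParameter(\"order\");",
                "ordID = escapeHtml(ordID);",
                "if (editMode) { %>",
                "<a onclick=\"edit('<%= ordID %>')\" href=\"#\"> Edit Order </a>",
                "<% } else { %>",
                "<span> Order: <%= ordID %> </span>",
                "<% } %>");
        // Reports the two <%= ordID %> sinks, one on each branch.
        for (String hit : findSinks(jsp)) {
            System.out.println(hit);
        }
    }
}
```

The two sinks sit on different branches of the if statement, which is why a branch-free unit test is needed for each path.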

III. ATTACK GENERATION

Because our test evaluation is based on execution of attack
strings, we must make sure attack strings are syntactically
correct. Furthermore, we want to include all possible types of
attack scenarios. Related work in generating XSS attacks relies
on either expert input [21], or on reported attacks [22, 37]. It is
difficult to show that all possible attack scenarios are included
using these approaches.
Our approach consists of two components. First we use
grammars to model how JavaScript payloads are interpreted by
a typical browser. Assuming this grammar is accurate, then a
successful attack must follow the rules of this grammar.
Second, we devise an algorithm to derive attack strings
systematically based on the grammar. Assuming the grammar
accurately models the way the browser interprets JavaScript,
and assuming that the attack derivation algorithm can generate
at least one attack string for every type of attack, then our
approach would cover all possible attack scenarios. It is
possible that either the grammar may have missed a way by
which a browser interprets JavaScript, or the attack
enumeration algorithm failed to consider a possible derivation
path based on the grammar. Through peer review, we can
improve both components in a way similar to how crypto
algorithms are revised.

Injection Points
We assume that taint sources are specified as a set of Java
methods, such as user forms and database queries. Taint flow
analysis is used to identify injection points in the program.
Injection points are the places in the program where the variable
containing the attack string (an input parameter of the unit test)
should be used. This variable takes the place of the tainted
variable passed as the argument of the first encoder function. Since
an XSS unit test contains no branching logic, detection of such
injection points is straightforward. Figure 5 shows part of an
original source code. Untrusted variable fName is used in a
sink on line 5 after being sanitized using two encoders on lines
3 and 4. This variable originated from variable prf as the result of a
database call, searchProfile(), a tainted source on line 2. In the
corresponding unit test in Figure 6, the variable atk, which contains
the attack string, is used as the input parameter of the first
application of the escapeHtml() encoder, on line 3.
We also instrument each XSS unit test so that it reports the
line number in the source code if a vulnerability is found as
shown in line 6 of Figure 6. We identify the line number of
each sink statement in the original JSP file. Suppose the line
number of a sink in the original JSP file is L1:
L1: <%= escapeHTML(x)%>

Original Code with Injection Point

A. Attack Grammars
A typical web browser contains multiple interpreters:
HTML, CSS, URI and JavaScript. The browser behavior can
be modeled as one interpreter passing control to another upon
parsing specific input tokens while rendering HTML
documents. We refer to the event of interpreter switching as
context switching. For example, the URI parser transfers the
control to the JavaScript parser if it detects input javascript: (if
supported by the browser) as in the case:
<img src="javascript: atk();">
A successful XSS attack induces the JavaScript
interpreter to execute the attack payload. We use a set of
context free grammar (CFG) rules to specify possible input
strings that cause the browser to activate the JavaScript
interpreter to execute an attack payload. Portners et al. [31]
observed that a successful XSS attack must either call a
JavaScript function (e.g. an API), or make an assignment (e.g.
change the DOM). In JavaScript, wherever an assignment
operation can be executed, a function call can also be made.
Therefore, without loss of generality, we assume the attack
payload is a function call atk() that changes the title of the
webpage.

  1. URI context:
    URI (Uniform Resource Identifier) strings identify
    locations of resources such as images or script files. Based on
    RFC 3986, they have the following generic syntax:
    scheme:[//[user:password@]host[:port]][/]path[?query][#fragment]
    Here, the scheme represents the type of protocol (such as
    ftp or http) used to access a resource, and the rest of the string
    expresses the authority and path information required to
    identify the resource. To cause the URI interpreter to switch to
    the JavaScript interpreter, the scheme must be equal to the
    keyword javascript, followed by JavaScript statements. Other
    possible schemes include http, ftp, and https. Since no
    JavaScript can be injected into schemes other than the
    javascript scheme, we concentrate on describing URIs that contain the
    javascript scheme [11]. A URI can be properly interpreted by
    a browser only as a value of an expected attribute of a host
    context. We continue with the example of

<img src="javascript: atk();">,
where src is the source attribute of the HTML img tag. It is
referred to as URIHOST. Figure 11 represents the grammar
for URI. Rule URIATRIB specifies a URI attribute consisting
of a URIHOST name and the URIVAL. Rule URIHOST lists
all possible URI host contexts in an HTML document. Again,
for the purpose of generating attack strings, we only consider a
URI of the JavaScript scheme. PAYLOAD is a special
nonterminal representing a JavaScript attack payload. It signals
to our attack generator that a context switch to JavaScript is
possible at this point.

<URIATRIB> ::= URIHOST eq URIVAL
<URIHOST>  ::= src | href | codebase | cite | action |
               background | data | classid | longdesc | profile | usemap |
               formaction | icon | manifest | poster | srcset | archive
<URIVAL>   ::= sq URI sq | dq URI dq | URI
<URI>      ::= javascript: PAYLOAD
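
To make the derivation idea concrete, the following sketch exhaustively expands the URI grammar above into attribute strings. It is a stand-in for illustration, not the attack derivation algorithm itself: it assumes that eq is the = character, that sq and dq are the single and double quote characters, and that PAYLOAD is instantiated with the call atk();. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UriAttackGenerator {
    // Alternatives of the URIHOST rule.
    static final String[] URI_HOSTS = {
        "src", "href", "codebase", "cite", "action", "background", "data",
        "classid", "longdesc", "profile", "usemap", "formaction", "icon",
        "manifest", "poster", "srcset", "archive"
    };

    // Assumed instantiation of the PAYLOAD nonterminal.
    static final String PAYLOAD = "atk();";

    // URIVAL ::= sq URI sq | dq URI dq | URI, with URI ::= javascript: PAYLOAD
    static List<String> uriValues() {
        String uri = "javascript:" + PAYLOAD;
        return Arrays.asList("'" + uri + "'", "\"" + uri + "\"", uri);
    }

    // URIATRIB ::= URIHOST eq URIVAL, expanded over every combination.
    static List<String> uriAttributes() {
        List<String> result = new ArrayList<>();
        for (String host : URI_HOSTS) {
            for (String value : uriValues()) {
                result.add(host + "=" + value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> attacks = uriAttributes();
        // 17 host attributes times 3 quoting styles.
        System.out.println(attacks.size() + " candidate attribute strings");
        for (String attack : attacks) {
            System.out.println(attack);
        }
    }
}
```

Exhaustive expansion is feasible here because this fragment of the grammar is finite; a grammar with recursive rules would need a bounded or sampled derivation instead.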

Attack generation
We compare our grammar-based attack generation with two
well regarded open source XSS attack repositories: ZAP and the
HTML5 Security web site [21]. We applied these attack
repositories to the XSS unit tests we constructed. The ZAP
and HTML5 Security attack repositories found the same
vulnerabilities as each other. However, our generated attacks found
several vulnerabilities that neither the ZAP nor the HTML5 Security
cheat sheet attack repository can detect. One example is shown below.

<div style="height: <%= escapeHtml(input) %>px; ">
</div>
The following attack string generated by our approach can
detect this vulnerability.

;background-image:url('javascript:attack()');
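
The following sketch shows why an HTML encoder does not help in this style context. Both encoders are hypothetical simplifications written for this example: escapeHtml() escapes only a few HTML metacharacters, and escapeCss() hex-escapes every character that is not an ASCII letter or digit, which is one common way to encode for CSS. The check covers only whether the attack string reaches the style attribute intact; whether a given browser then executes it is outside the sketch.

```java
class StyleContextDemo {
    // Hypothetical HTML encoder: escapes &, <, > and the double quote only.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Hypothetical CSS encoder: every character other than an ASCII letter
    // or digit becomes a backslash, its hex code point, and a space.
    static String escapeCss(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric) {
                sb.append(c);
            } else {
                sb.append('\\').append(Integer.toHexString(c)).append(' ');
            }
        }
        return sb.toString();
    }

    // Renders the div element with the encoded value in the style attribute.
    static String renderDiv(String encodedInput) {
        return "<div style=\"height: " + encodedInput + "px; \"></div>";
    }

    public static void main(String[] args) {
        String attack = ";background-image:url('javascript:attack()');";

        // The HTML encoder leaves the string untouched, so a new CSS
        // declaration is injected into the style attribute.
        System.out.println(renderDiv(escapeHtml(attack)));

        // The CSS encoder escapes ';', ':', '(' and the quotes, so the
        // value can no longer terminate the height declaration.
        System.out.println(renderDiv(escapeCss(attack)));
    }
}
```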