SQL injection vs XSS

This post comes from a discussion on Stack Overflow (since deleted) about SQL injection and XSS. I won't describe the vulnerabilities themselves here, only clarify a few points. If you aren't familiar with them, search for them and you will find plenty of explanations.

A lot of developers know about these vulnerabilities, but I find that the ways to protect against them are often misunderstood.

Input vs Output

Even if it seems obvious, I was asked to explain what input and output are.

Basically, input is the data coming in to the server. It can be data from a submitted form or any other data sent by a client. In PHP it's everything you get in $_GET, $_POST, … In Express.js, if you use body-parser, it's what you get in req.body, …

On the other hand, output is what you send back to the client. It can be what you render with your template engine, a simple JSON response from a REST API, etc.

Verification and Sanitization

Verification and sanitization mean checking that input data matches your requirements and cleaning it up.

Does the email value have a valid email format? Is the password shorter than 8 characters? Strip all HTML tags or special characters, keep the integer part of a number, etc.

None of this is about security. If it happens to help, that's a side effect :)
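To make the two ideas concrete, here is a minimal sketch in PHP. The function names (verifySignup, sanitizeComment) and the field names are hypothetical, chosen only for illustration; the 8-character rule is the one mentioned above.

```php
<?php
// Verification: return the list of fields that fail the requirements.
// An empty list means the input is acceptable.
function verifySignup(array $input): array
{
    $errors = [];
    if (filter_var($input['email'] ?? '', FILTER_VALIDATE_EMAIL) === false) {
        $errors[] = 'email';
    }
    if (strlen($input['password'] ?? '') < 8) {
        $errors[] = 'password';
    }
    return $errors;
}

// Sanitization: clean a value up (here: drop HTML tags and surrounding spaces).
function sanitizeComment(string $comment): string
{
    return trim(strip_tags($comment));
}

// Sanitization: keep only the integer part of a numeric string.
$age = (int) '42.7';
```

Verification only tells you whether the data fits your business rules, and sanitization only tidies it up. Neither replaces the escaping and encoding steps described in the rest of this post.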

Escaping

Escaping prevents certain characters from being interpreted as part of a query or of a language's syntax. Most RDBMSs use the single quote as the string delimiter. If a value contains a single quote, you have to escape it (MySQL accepts a backslash in front of it, standard SQL doubles the quote) so that the RDBMS treats it as a literal character.

$data = "Let's rock"; or $data = 'Let\'s rock';

$query = "INSERT INTO table SET field = 'Let\'s rock';";

Without escaping the data, your query is open to SQL injection. This is an input vulnerability.
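In practice you rarely escape by hand. The sketch below uses a different technique, a PDO prepared statement, which sends the value separately from the SQL text so quotes in the data cannot change the query. It assumes the pdo_sqlite extension is available and uses an in-memory database; the songs table and its title column are hypothetical.

```php
<?php
// Assumption: pdo_sqlite is installed. The in-memory database keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE songs (title TEXT)'); // hypothetical table

// One harmless value with a quote, and one that tries to break out of the string.
$values = ["Let's rock", "x'); DROP TABLE songs; --"];

// The placeholder keeps each value out of the SQL text, so no manual escaping is needed.
$stmt = $pdo->prepare('INSERT INTO songs (title) VALUES (:title)');
foreach ($values as $value) {
    $stmt->execute([':title' => $value]);
}

// Both values come back exactly as they were given.
$titles = $pdo->query('SELECT title FROM songs ORDER BY rowid')->fetchAll(PDO::FETCH_COLUMN);
```

The same idea applies with other drivers: the query text stays constant and the data travels as parameters.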

Encoding/Converting

Most of the time you won't change the encoding/charset (ISO-8859-1, UTF-8, …) in your own application. The only case I can think of is when you work with a third party and have to match its requirements.

When you output your content on a web page, you need to convert some characters to HTML entities. Most template engines do this for you by default.

Converting to HTML entities prevents your pages from being vulnerable to XSS.

Suppose you have this variable to print:

$content = '<script>alert(document.cookie)</script>';

Converted to HTML entities, it will output '&lt;script&gt;alert(document.cookie)&lt;/script&gt;', which the browser displays as a literal string.

This is an output vulnerability, because the risk exists only when you display the content in a specific context.
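If you print from plain PHP without a template engine, the conversion can be sketched with htmlspecialchars. The $safe variable name is only for illustration.

```php
<?php
$content = '<script>alert(document.cookie)</script>';

// ENT_QUOTES also converts single and double quotes, which matters inside HTML attributes.
$safe = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');

echo $safe; // prints &lt;script&gt;alert(document.cookie)&lt;/script&gt;
```

Note that this protects a value placed in HTML text or in a quoted attribute. Other contexts, such as inline JavaScript or URLs, need their own encoding.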

Should I encode data before storing it?

Because you can never assume how data will be used, you shouldn't apply any encoder before storing it. Suppose you need to display your content in several formats, such as HTML, XML and/or JSON: you won't use the same encoder for each.

You must store your business value in the database, that is, stateless data with no context. There are always exceptions, but if you know them you are probably not reading this post.
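A small sketch of why encoding belongs at output time: the same raw value is encoded differently for HTML and for JSON, and a value that was HTML-encoded before storage leaks entities into the JSON. The sample string and variable names are made up for illustration.

```php
<?php
// Raw business value, exactly as it would be stored in the database.
$stored = 'Tom & Jerry <3 "cheese"';

// HTML output: convert special characters to entities.
$html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
// Tom &amp; Jerry &lt;3 &quot;cheese&quot;

// JSON output: json_encode applies JSON's own escaping rules instead.
$json = json_encode(['title' => $stored]);
// {"title":"Tom & Jerry <3 \"cheese\""}

// If the HTML-encoded version had been stored, the API client would receive entities.
$wrong = json_encode(['title' => $html]);
// {"title":"Tom &amp; Jerry &lt;3 &quot;cheese&quot;"}
```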