Basic Web Security

Quan Thoi Minh Nguyen
20 min read · Mar 28, 2022

Basic Security, Browsers, Attackers, Cookies, Same-Origin Policy, Cross Site Request Forgery (CSRF), Cross Site Scripting (XSS), Sandbox Domain, Password Management, SQL Injection, Transport Layer Security (TLS)

Abstract

This article introduces basic web security, attacks, and defenses for engineers who don't have a security background but are interested in security. We will explain the thinking process of security engineers and why things work the way they do.

Basic Security

Let's start with a trivial case: a web server running on your laptop with no internet access, hosting a single file index.html that only you have access to. We'll introduce basic security by conducting a simple threat-modeling exercise, the kind security engineers do all the time in their work.

As the web server has no internet access, remote attack is not possible, so we eliminate a wide range of attack vectors. The only option for attackers is physical attack, i.e., stealing your laptop to attack your web server. This requires significant effort and is extremely risky for attackers, as they can easily get caught. Furthermore, physical attack does not scale: the attackers have to travel to each target. This is not to say that physical attack is unimportant, but in web security we typically have to prioritize protecting against remote attack.

Let's extend the above scenario a bit and assume your website is accessible through Bluetooth. This is remote access, but it requires the attackers to be within Bluetooth's ~10-meter range. The web server's risk increases, but it's not as bad as remote attacks through the internet. This is why security engineers don't lose sleep over Bluetooth security bugs, although we are concerned about them. It's worth mentioning that removing direct internet access doesn't make attacks impossible. For instance, the Stuxnet worm compromised an isolated private network in 2 steps: it first infected a USB flash drive, and when a user plugged the infected drive into the private network, the worm infected the network. As you can see, while the attack is possible, it is more complicated and takes more effort.

In general, the goal of security is not to create a foolproof system, but to make attacks as hard as possible, i.e., to increase the attackers' cost. The Stuxnet worm shows that security is a never-ending cat-and-mouse game between attackers and defenders, so security engineers have job security :) One last point: while removing internet access from web servers is infeasible in practice, removing unnecessary internet access from electricity, water, and temperature control systems is a powerful defense mechanism.

What will happen if the attacker manages to get your laptop and unlock it to access your web server? Unfortunately for the attacker, your web server has no valuable data; it only contains the file index.html, which hurts the attacker's brain just by reading your terrible index.html code :) Security is only worth it if we have valuable assets to protect. In general, security is a tradeoff between the cost of protection and the value of the protected assets. That's the reason most early-stage startups do not hire security engineers: they have nothing valuable to protect. Furthermore, if the attacker has physical access to your laptop and can unlock it, the security consequences may be far worse than stealing data from your web server. This is a subtle point. Once we assume certain attacker capabilities, the attackers can cause more damage than we anticipated. It means that certain attack vectors don't make sense, not because they're infeasible or illogical, but because under certain security assumptions, the attacker can attack us in different, cheaper, and easier ways and can target more valuable assets.

Now, we're ready to make our web server accessible on the internet through https://example.com. A website that returns index.html is boring, so let's make it an interesting shopping website. The website authenticates users with a username and password, has a forum for users to post comments, and allows users to make purchases. First, nowadays it's standard to use https instead of http; we'll briefly discuss the TLS protocol in the last section. Second, the moment your server is exposed to the internet, you have to make sure that you don't expose anything else. Have you ever heard about a default root SSH password vulnerability that allows everyone in the world to control your server? What it means is that if you don't use SSH, you should shut it down, and in general, you should shut down all unnecessary services. What we've done is called reducing the attack surface, which is a common security practice.

Browsers

If this is the first time you've dealt with web security, it's not clear which parties are involved. It's natural to assume that there are 2 parties: users and the web server. However, that's not the case. Browsers such as Google Chrome are another party that plays a significant role in web security. The fundamental reason is that while JavaScript code comes from the server, the browser is the one that interprets and executes it. If this isn't obvious to you, open Developer Tools in Google Chrome, choose the Console tab, and type alert(document.domain); Google Chrome will interpret and execute it to show a popup. Furthermore, browsers enforce security mechanisms, sometimes only on explicit instructions from the web server. It gets more complicated when multiple parties are involved. As a consequence, while it's possible to make equivalent HTTP GET/POST requests to the web server using the "curl" command, from a browser, or from iOS/Android, they have different security properties. When the "curl" command receives the response from the server, it treats the response as text/data; in contrast to a browser, curl doesn't execute any JavaScript code from the response.

It's worth mentioning that browsers such as Chrome, Safari, and Microsoft Edge are complicated and don't strictly follow any specification. There are many exceptions and edge cases, to the point that it's almost impossible to make a blanket security statement that is correct for all browsers (see [1]), so we will only discuss basic security rules without the details and nuances.

Attackers

Before discussing security, let’s talk about attackers. Attacks on the web are wide and sometimes it’s confusing who the attackers are and what attackers’ capabilities and goals are.

We’ll consider 3 types of basic attackers:

  1. A malicious user of example.com.

  2. A malicious website attacker.com. Note that tricking users into visiting the attacker-controlled malicious website attacker.com is not a difficult task, e.g., through advertisements.

  3. A man-in-the-middle (MITM) attacker who can monitor the traffic between the user and example.com. The attacker can be a compromised public Wi-Fi access point or a malicious internet service provider.

The above types of attackers implicitly assume certain attacker capabilities. For instance, a malicious user can send arbitrary malicious content to the server. Assuming the user's content is trustworthy and safe is the most common and devastating mistake in security. Repeat after me: a user can be an attacker, and user-supplied content shouldn't be trusted. Again and again :) A malicious website attacker.com controls the JavaScript code and HTML in its domain attacker.com, but it does not have the capability to observe users' traffic to example.com. A MITM attacker can monitor traffic from users to example.com, but it can't easily force users to visit a malicious website attacker.com.

The attackers’ goals vary. For instance, a malicious user may attack another victim user or try to steal all users’ data stored in the server. A malicious website attacker.com may want to force another victim user to make purchases at example.com and ship them to the attacker’s address. A MITM attacker may want to read the user’s message to the server or change the user’s message sent to the server.

While reading about attacks, it’s worth keeping track of who attackers are and their goals and capabilities. Furthermore, it’s important to understand what type of attacks our defense mechanisms are designed for.

Cookies

Have you ever wondered why a user only enters a username and password once at google.com and is automatically authenticated after that? The reason is that after checking the user's username and password:

  • The server response contains Set-Cookie headers with name=value pairs containing information about the logged-in user, i.e., the cookie allows the server to authenticate the user in later requests.
  • The browser stores the name=value pairs in its cookie jar. Using Chrome Developer Tools, you can see that each cookie has a Name, a Value, and a Domain such as .google.com.
  • For all future requests to google.com, the browser automatically attaches the cookies belonging to the domain google.com in the requests' Cookie headers. The server checks the received cookies to authenticate the user.

Note that the server can instruct the browser to enforce 2 important cookie security attributes. The first is the "HttpOnly" attribute, which prevents JavaScript code from reading the cookie via document.cookie. This doesn't prevent attacks, but it mitigates the consequences in case the page's JavaScript code is compromised: the compromised code can't steal the user's cookie and run away. The second is the "Secure" attribute, which asks the browser to only send the cookie over https, not http.
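To make this concrete, here is a minimal sketch in Go of a login handler that sets a session cookie with both attributes. The cookie name, value, and handler are illustrative assumptions, not code from a real server.

package main

import "net/http"

func loginHandler(w http.ResponseWriter, r *http.Request) {
	// ... after verifying the username and password ...
	http.SetCookie(w, &http.Cookie{
		Name:     "session",
		Value:    "opaque-session-id", // placeholder; the server maps it to the logged-in user
		Path:     "/",
		HttpOnly: true, // JavaScript's document.cookie can't read it
		Secure:   true, // only sent over https, never plain http
	})
}

func main() {
	http.HandleFunc("/login", loginHandler)
	http.ListenAndServe(":8443", nil) // in production, serve over TLS
}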

Same-Origin Policy

There are 2 everyday situations on the web that we will take a closer look at.

The first is that users often visit multiple websites, say https://example.com:443 and https://attacker.com:443, at the same time.

The second is that the webpage https://attacker.com:443 can always embed example.com in an iframe as follows:
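<iframe src="https://example.com"></iframe>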

In both cases, it's obviously a security disaster if JavaScript code from the webpage https://attacker.com:443 can access the content of the webpage https://example.com:443. For concreteness, let's see how browsers protect the Document Object Model (DOM) of https://example.com:443 from malicious JavaScript code in https://attacker.com:443. The security mechanism that browsers implement for this task is called the same-origin policy.

An origin is defined as the tuple (protocol, host, port). For instance, for the URL https://example.com:443, the origin is (https, example.com, 443). Therefore, the following have different origins from https://example.com:443:

  • http://example.com:443 (different protocol)
  • https://example.com:8443 (different port)
  • https://sub.example.com:443 and https://attacker.com:443 (different hosts)

The same-origin policy says that JavaScript code in one webpage can access the DOM of another webpage if and only if their origins are the same. As https://attacker.com:443 and https://example.com:443 have different origins, neither can access the other's DOM.

While we're here, it's noteworthy that the security rule for cookies is a bit more relaxed (it is based on domains rather than full origins). In any case, the domain attacker.com is independent from example.com, so attacker.com can't access example.com's cookies.

One last note related to the same-origin policy: it is possible for https://example.com:443 to include JavaScript code from https://attacker.com/code.js as follows
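<script src="https://attacker.com/code.js"></script>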

However, the attacker's code.js is then executed under the origin of https://example.com:443. That is, the attacker's code.js has full access to example.com's DOM and its cookies. This is dangerous; do not do this! It may sound like trivial advice, but in practice the attacker doesn't name their domain attacker.com; they give it an attractive name like awesomejavascript.com, and if we aren't careful, we can fall for it.

Cross Site Request Forgery (CSRF)

Let's say the user has already logged in to example.com and the browser stores the user's cookies in its cookie jar. The attacker tricks the user (e.g., through an advertisement) into visiting the attacker's website attacker.com. The webpage https://attacker.com transparently and automatically (i.e., without user interaction) makes a request to https://example.com/buy using a hidden form submission with the attacker's shipping address, as sketched below. The effect is that the browser on the user's laptop makes a request to the server example.com and automatically attaches the user's cookie for the domain example.com. The web server example.com sees the authenticated user's cookie in the header, so it allows the request to go through, i.e., it will charge the user but ship the bought Tesla to the attacker's address. One small note: to reproduce this attack, we have to set example.com's cookie SameSite attribute to None, due to a recent change in Google Chrome.
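A minimal sketch of such a hidden, auto-submitting form on attacker.com; the form field names and values are hypothetical.

<form action="https://example.com/buy" method="POST" id="csrf">
  <input type="hidden" name="item" value="tesla">
  <input type="hidden" name="shipping_address" value="attacker's address">
</form>
<script>document.getElementById("csrf").submit();</script>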

This attack is fascinating and strange because you don't see any vulnerable code. All actions by the webpage attacker.com, the browser, and the server example.com are legitimate and everything works as intended, yet the attack succeeds. The hidden truth is that for sensitive state-changing requests like purchases, cookies alone are not enough.

Let's slow down a little bit to see what's going on. As we've learned from the same-origin policy section, there are 2 things that the webpage attacker.com can't access: example.com's cookies and example.com's DOM. In the attack, the webpage attacker.com makes a request to the server example.com, and although the webpage attacker.com can't read example.com's cookies, the browser "helps" the attacker by automatically attaching the user's example.com cookies to the request. Therefore, when the server example.com receives the request, it looks exactly as if the webpage example.com sent it. In other words, to prevent CSRF, the server must have the capability to differentiate requests coming from example.com and attacker.com. There are a few options, such as using a custom request header or the Origin header, but we won't discuss them further. Instead, we'll discuss a solution where the webpage example.com has an unspoofable CSRF token in its document (DOM). Similar to cookies, the webpage attacker.com can't read it. However, if the webpage attacker.com makes a request to the server example.com, unlike cookies, the browser does not automatically attach that token to the request. On the other hand, if the webpage example.com makes a request, it can read its own DOM containing the CSRF token and attach it to the request. Therefore, the server example.com knows whether a request comes from the webpage example.com or attacker.com by checking the CSRF token, and it can reject any request that doesn't come from the webpage example.com.

Let's dig deeper into how to design CSRF tokens. Obviously, a CSRF token must be unpredictable; otherwise attacker.com can guess it and attach it to the request to the web server example.com. So is a random token, such as a 128-bit random number, enough to prevent CSRF? No, because it's still vulnerable to the following attack. The attacker first accesses example.com and receives its own valid random CSRF token. Now, in the webpage attacker.com, the attacker adds

<input type="hidden" name="csrf" value="attacker csrf token">

to the form submission to the server example.com. The browser automatically attaches the user's cookie to the request. The server example.com receives the user's cookie and the attacker's CSRF token, and both are valid. What's wrong? The core issue is that the above CSRF token is not bound to a specific user, i.e., the server can't differentiate the user's CSRF token from the attacker's CSRF token. Therefore, to correctly defend against CSRF, the CSRF token must be unpredictable and bound to a specific user. A simple CSRF token as follows suffices:

HMAC(server-secret-key, user_id || session_id || timestamp)

The described attack stops working because when the server example.com receives the user's cookie and the attacker's CSRF token, it can verify whether the user_id inside the cookie matches the user_id in the CSRF token; in this case, they don't match.
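To make the construction concrete, here is a minimal sketch in Go of generating and verifying such a token. The key, IDs, and timestamp handling are illustrative assumptions; a real implementation also needs token expiry and key rotation.

package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// csrfToken computes HMAC(server-secret-key, user_id || session_id || timestamp).
func csrfToken(serverKey []byte, userID, sessionID string, ts time.Time) string {
	mac := hmac.New(sha256.New, serverKey)
	fmt.Fprintf(mac, "%s|%s|%d", userID, sessionID, ts.Unix())
	return hex.EncodeToString(mac.Sum(nil))
}

// validCSRFToken recomputes the token from the user_id and session_id bound
// to the request's cookie and compares in constant time. A token minted for
// the attacker's own account never verifies for the victim's user_id.
func validCSRFToken(serverKey []byte, userID, sessionID string, ts time.Time, token string) bool {
	expected := csrfToken(serverKey, userID, sessionID, ts)
	return hmac.Equal([]byte(expected), []byte(token))
}

func main() {
	key := []byte("server-secret-key") // placeholder
	now := time.Now()
	t := csrfToken(key, "user42", "sess1", now)
	fmt.Println(validCSRFToken(key, "user42", "sess1", now, t)) // true
	fmt.Println(validCSRFToken(key, "victim", "sess2", now, t)) // false: wrong user
}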

Cross Site Scripting (XSS)

In the CSRF section, while the attacker controls attacker.com's code, that code can't access example.com's DOM and cookies. Therefore, the attack is limited and preventable by using a CSRF token. What if the attacker wants to access example.com's DOM and/or cookies? In particular, let's consider what a malicious user can do. The idea is that the attacker tries to inject malicious code into the server example.com, and when the victim visits example.com, the server delivers the attacker's code to the webpage example.com running in the victim's browser. As the attacker's code is executed in the context of the webpage example.com, it can access example.com's DOM and cookies. This explains why the attack is called cross-site scripting (XSS): the attacker's code crosses the server example.com before reaching the user's browser. Note that in an XSS attack, the attacker has access to the victim's example.com DOM and cookies, so once there is an XSS bug, a CSRF attack is not needed and the CSRF protection, if any, is useless.

It's time to discuss how the attacker can inject JavaScript code into the server example.com in the first place. One common example is through comments in example.com's forum. Let's say example.com/forum allows users to post comments.

Instead of posting a plain data comment, the attacker posts a comment containing JavaScript code, as follows:

Good job!<script>alert(document.cookie)</script>

If the server example.com isn't careful in sanitizing the user's comment, the attacker's JavaScript code "<script>alert(document.cookie)</script>" will be stored on the server, and when the victim visits example.com/forum, that comment's JavaScript code is delivered through the template {{.Comment}} and executed. You may wonder why the attack sounds so trivial: aren't the browser and the server supposed to differentiate between code and data? How can data and JavaScript code be mixed up so easily, causing this kind of dangerous security bug? It has a lot to do with parsers, but answering this question, and how to prevent XSS, is out of the scope of this introductory article. Just note that, more than 20 years after its discovery, XSS is still one of the most popular bugs on the web.
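To illustrate the difference one rendering decision makes, here is a minimal sketch in Go; the page template and comment value are assumptions for illustration. Go's html/template contextually escapes {{.Comment}}, so the script tag is rendered as inert text, whereas text/template would emit it verbatim and reintroduce the XSS above.

package main

import (
	"html/template"
	"os"
)

func main() {
	// html/template escapes {{.Comment}} for the HTML context it appears in.
	page := template.Must(template.New("forum").Parse(
		"<html><body><p>{{.Comment}}</p></body></html>"))

	comment := "Good job!<script>alert(document.cookie)</script>" // attacker-controlled

	// The output contains &lt;script&gt;alert(document.cookie)&lt;/script&gt;,
	// which browsers display as text and never execute. Swapping in
	// text/template would perform no escaping and re-enable the attack.
	page.Execute(os.Stdout, struct{ Comment string }{comment})
}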

Sandbox Domain

The XSS in the previous section is dangerous because the attacker's code has access to the DOM and cookies of the sensitive main domain example.com. One way to mitigate this is to host the user(attacker)-controlled code on a different sandbox domain such as usercontentexample.com. What is a sandbox domain and how does it prevent XSS attacks? A sandbox domain is a domain that doesn't have important authenticated cookies and doesn't serve anything except the attacker-controlled JavaScript code. Note that by the same-origin policy, code in usercontentexample.com can't access example.com's DOM and cookies, so user(attacker)-controlled code can't cause damage to example.com. In that sense, the sandbox domain does not prevent XSS; it only mitigates the consequences of a potential XSS. In other words, even if there is an XSS, there isn't anything valuable in the domain usercontentexample.com for the attacker to steal or damage. It's worth mentioning that sandboxing is a great idea in security that is applied beyond web security, e.g., you can run malicious binary code in a sandboxed process with restricted permissions and security privileges so that it can't harm other processes.

Password Management

Undoubtedly, users' passwords are among the most sensitive data that your server keeps. Therefore, let's discuss how to manage passwords on the server. The simplest approach is for the server to create a SQL "users" table where each row is the following:

username, password

When a user enters a username and password, the server checks whether the user-supplied password matches the stored password. From a software engineering perspective, it works perfectly. However, security engineers would ask a follow-up question: what happens if your SQL database is compromised? All your users' passwords would be leaked to the world. The job of security engineers is not only to take care of the current security task but also to prepare for the worst. We want to make sure that when the server's SQL database is compromised, the security consequences are mitigated.

A safer approach is the server never stores the plaintext password, but stores

username, SHA-256(password)

When a user enters a username and password, the server computes SHA-256(user-supplied-password) and checks whether the computed hash matches the stored hash. This is safer because if the SQL database is leaked, it only leaks a list of SHA-256(password) values; it never leaks the passwords themselves. Given SHA-256(password), as SHA-256 is a "one-way function", it's hard to find the password, so we're perfectly safe? Not quite. The main problem is that the majority of passwords have low entropy, i.e., they're predictable. An attacker can guess a list of common passwords such as 123456, supersecret, iamsafe, quan0230 (my name and birthday), etc., pre-compute the hashes of those passwords, and store them in a table as follows:

123456, SHA-256(123456)

supersecret, SHA-256(supersecret)

iamsafe, SHA-256(iamsafe)

quan0230, SHA-256(quan0230)

After that, the attacker checks whether each leaked SHA-256(password) appears in the 2nd column of the above table. Note that the attacker only computes this single table once for all users. This attack is fast and effective. To defend better, we can hash the password with a random number per user (called a salt), i.e., the server stores the following table:

username[1], salt[1], SHA-256(salt[1], password[1])

username[2], salt[2], SHA-256(salt[2], password[2])

username[3], salt[3], SHA-256(salt[3], password[3])

Why is this approach safer? The main advantage is that salt[1], salt[2], salt[3], etc. are independent random numbers per user, so the attackers can't precompute a single table of hashed passwords as above. The attackers are forced to work with 1 random salt at a time, i.e., for each salt[i] the attackers must compute a separate table of SHA-256(salt[i], guessed-password). This significantly increases the attackers' cost.

While the above approach is pretty good, it still has weaknesses. The main issue is that the attackers can build ASIC chips that compute SHA-256 quickly and in parallel. A better approach is to use the scrypt key derivation function, which forces the attackers to use a large amount of memory. It's harder to parallelize scrypt computation, which raises the attackers' cost. In summary, the following solution is pretty good:

username[1], salt[1], scrypt(salt[1], password[1])

username[2], salt[2], scrypt(salt[2], password[2])

username[3], salt[3], scrypt(salt[3], password[3])
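A minimal sketch in Go of this scheme, using the golang.org/x/crypto/scrypt package; the scrypt parameters (N=32768, r=8, p=1) and the 16-byte salt are illustrative choices, not a recommendation tuned for your hardware.

package main

import (
	"crypto/rand"
	"crypto/subtle"
	"fmt"

	"golang.org/x/crypto/scrypt"
)

// hashPassword returns a fresh random salt and scrypt(salt, password).
// The server stores (username, salt, hash) instead of the plaintext password.
func hashPassword(password string) (salt, hash []byte, err error) {
	salt = make([]byte, 16)
	if _, err = rand.Read(salt); err != nil {
		return nil, nil, err
	}
	hash, err = scrypt.Key([]byte(password), salt, 32768, 8, 1, 32)
	return salt, hash, err
}

// verifyPassword recomputes the hash with the stored salt and compares in
// constant time to avoid leaking information through timing.
func verifyPassword(password string, salt, storedHash []byte) bool {
	h, err := scrypt.Key([]byte(password), salt, 32768, 8, 1, 32)
	return err == nil && subtle.ConstantTimeCompare(h, storedHash) == 1
}

func main() {
	salt, hash, err := hashPassword("correct horse battery staple")
	if err != nil {
		panic(err)
	}
	fmt.Println(verifyPassword("correct horse battery staple", salt, hash)) // true
	fmt.Println(verifyPassword("123456", salt, hash))                       // false
}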

SQL Injection

In the previous section, we discussed how to manage passwords in SQL to prepare for the case where the SQL database is compromised. In this section, let's try to prevent the SQL compromise in the first place. Note that to check the password, the server example.com has to retrieve an entry from its SQL database using a statement such as

statement = "select * from users where username = '" + user + "';"

where the variable user is supposed to be a valid username that the server receives from the web browser. However, recall that the user may be an attacker, so the user can supply a malicious username as follows:

user = "a'; drop table users; #"

The final statement will become

"select * from users where username = '" + "a'; drop table users; #" + "';"

= "select * from users where username = 'a'; drop table users; #';"

In SQL, ";" is a separator between commands, i.e., in the above attack there are 2 commands, and the 2nd command is "drop table users" (note that # makes everything after it a comment). The effect is that your SQL table "users" is deleted. What's going on? The server's intention was to select a row using the SQL select command, but the attacker managed to turn it into 2 commands: select and drop table. If you pay closer attention, you will realize that this has the same root cause as XSS: command/code is mixed up with string/data. Fortunately, for SQL, there is a robust defense mechanism called prepared statements. It happens in 3 steps:

  • Prepare: create a statement template such as

select * from users where username = ?

  • Compile: the database compiles the above statement and stores it without executing it.
  • Execute: send the username the server received to the stored compiled statement, and the database substitutes the user-supplied username for ?.

Why does the above prepared statement prevent the attack? The main observation is that the prepare and compile steps fix the semantics of the SQL command forever, i.e., it's a select command with specific conditions, and no matter what the username string is, it can't change the semantics of the SQL statement. In other words, no matter what the user-supplied name is, the statement can't become 2 commands or a select command with a different condition. It's worth mentioning that using prepared statements is not a replacement for sanitizing the username, e.g., allowing only alphanumeric characters.
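The following minimal Go sketch uses database/sql, whose ? placeholders follow exactly the prepare/compile/execute steps above. It assumes a *sql.DB handle opened elsewhere with any database/sql driver, and the count(*) query is an illustrative choice.

package storage

import "database/sql"

// userExists looks up a username with a prepared statement. The query's
// structure is fixed at Prepare time, so whatever string `user` contains is
// bound purely as data and can never add a second SQL command.
func userExists(db *sql.DB, user string) (bool, error) {
	stmt, err := db.Prepare("select count(*) from users where username = ?")
	if err != nil {
		return false, err
	}
	defer stmt.Close()

	var n int
	if err := stmt.QueryRow(user).Scan(&n); err != nil {
		return false, err
	}
	return n > 0, nil
}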

Basic Transport Layer Security (TLS)

In the previous sections, we ignored the issue of how to protect the communication between the webpage http://example.com and the server example.com. Note that the HTTP protocol doesn't have encryption, so a MITM attacker can observe the traffic from the browser to the server, and it can impersonate example.com, i.e., it can send its own JavaScript code instead of example.com's code to the browser. Nowadays, it's standard practice to use https (i.e., HTTP over Transport Layer Security (TLS)), where the s is an abbreviation for secure. Note that switching to the "secure" protocol https://example.com only protects the communication against a MITM attacker; it does not mean that the server example.com is secure, nor that it's safe against CSRF, XSS, etc.

It's out of the scope of this article to explain the TLS protocol, but I will try to explain the basics. Supporting encryption from the browser to the server is a nontrivial task because, for encryption to work, both sides need to agree on a common key, but how do we deliver the common key in the first place? Remember that the MITM attacker always listens to the communication between the browser and the server. Fortunately, the Elliptic Curve Diffie-Hellman (ECDH) protocol achieves this task. Let G be a base point on the elliptic curve. For the purpose of this article, you don't need to know what G is; just think of it as a "number" that you can add to itself, i.e., 2G = G + G, 3G = G + G + G, …, aG = G + G + … + G (a times). The "number" G has a special security property: given aG = G + G + … + G (a times), the attacker can't find a. A = aG is called the public key and a is the private key.

The server generates its private key b and public key B = bG; the browser generates its private key a and public key A = aG. The server sends its public key B = bG to the browser, and the browser sends its public key A = aG to the server. The browser computes key = aB = a(bG) = abG and the server computes key = bA = b(aG) = baG. As you can see, both the server and the browser compute the same key abG. The MITM attacker who sees A = aG and B = bG doesn't know a or b, so it can't compute key = abG.
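This exchange can be sketched in a few lines of Go using the standard crypto/ecdh package (Go 1.20+); the choice of curve P-256 and the ignored errors are simplifications for illustration.

package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

func main() {
	curve := ecdh.P256()

	serverPriv, _ := curve.GenerateKey(rand.Reader)  // private b, public B = bG
	browserPriv, _ := curve.GenerateKey(rand.Reader) // private a, public A = aG

	// Each side combines its own private key with the other's public key.
	serverKey, _ := serverPriv.ECDH(browserPriv.PublicKey())  // b(aG) = baG
	browserKey, _ := browserPriv.ECDH(serverPriv.PublicKey()) // a(bG) = abG

	fmt.Println(bytes.Equal(serverKey, browserKey)) // true: both sides share abG
}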

The described protocol is vulnerable to the following attack. The attacker generates its own private key e and public key E = eG. To the browser, the attacker pretends to be example.com, and to the server example.com, it pretends to be the browser. The end effect is that the attacker establishes a shared key1 = aeG with the browser and a shared key2 = ebG with the server example.com. Therefore, the attacker can read all the encrypted communications between the browser and the server. The core issue is that the browser has no way to know whether the public key E = eG belongs to the attacker or to the legitimate example.com server.

Preventing the above attack is a non-trivial job. The idea is that we need central authorities who certify that the public key B belongs to the server example.com. These central authorities are called certificate authorities (CAs). A CA uses a digital signature to sign the server's certificate, which binds the public key B to the domain example.com. The question is: where does the browser get the public key to verify the CA's signature? The way it works is that the browser ships with a list of trusted CAs in its source code and binary, or it uses the list from the operating system it runs on, so the browser knows which CAs to trust.

Acknowledgement

We thank Winston Howes for the feedback on this article.

References

[1] Michał Zalewski. The Tangled Web: A Guide to Securing Modern Web Applications.

[2] Dafydd Stuttard, Marcus Pinto. The Web Application Hacker's Handbook: Finding and Exploiting Security Flaws.

[3] Wenliang (Kevin) Du. Computer & Internet Security: A Hands-on Approach.

[4] https://owasp.org/www-community/attacks/ and https://cheatsheetseries.owasp.org/IndexTopTen.html

[5] https://bughunters.google.com/learn/

[6] https://github.com/cryptosubtlety/intuitive-advanced-cryptography/
