Introduction to hacking web applications

May 9, 2019

In this article we explore the common types of attack on a web application and the core defence mechanisms. By starting with the defences commonly employed, we can think of ways to manoeuvre around them and begin mapping out our own attacks, which will be covered in future posts.

Introduction

The internet and the content accessible on the web are very different now than they were 10 years ago. Static websites have evolved into web 2.0 applications that do a lot more than just serve text on a web page. These applications allow users to shop online, talk with friends and even do their banking. Because these web applications now handle important data about the user, such as login details, private messages and banking details, hackers have increased their efforts to attack them and steal user credentials.

Some of the most common forms of attacks:

Cross site scripting (XSS): this vulnerability allows the attacker to target other users of the application to gain their user data or perform unauthorised actions on their behalf.
Cross site request forgery (CSRF): an attack that forces a user to perform an unintended action, e.g. getting them to change their password or transfer funds to the attacker's account.
Information leakage: as the name suggests, this vulnerability allows the attacker to gain information about an application or its underlying data. This could come from developers leaving comments in the source code or from poor error handling.
Broken access controls: the web application fails to protect data or functionality reserved for privileged users such as admins, enabling an attacker to access data or perform actions they shouldn't be allowed to.
Broken authentication: this includes vulnerabilities such as allowing users to set weak passwords, using weak or easily guessable password recovery questions, and permitting brute force login attempts.
SQL injection: user input that forms part of a SQL query string is not sanitised, letting an attacker change what the query does. The popularity of this attack has fallen in recent years thanks to functions and libraries that help developers protect their applications. Increased awareness, with SQL injection now taught in schools and universities, means that even junior developers tend to know the implications of not sanitising user input that ends up in a query.
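To make the SQL injection item concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the column names and the two helper functions are assumptions made up for illustration; the point is only the difference between building the query string from user input and passing the input as a bound parameter.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, pwd TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: the user input becomes part of the SQL text itself.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # The input is passed as a bound parameter and is never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


conn = build_demo_db()
payload = "nobody' OR '1'='1"

print(find_user_unsafe(conn, payload))  # every row matches the injected OR clause
print(find_user_safe(conn, payload))    # no user has this literal name
```

With the unsafe version the payload closes the quoted string and appends a condition that is always true, so every row is returned. The parameterised version treats the whole payload as a name and finds nothing.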

Photo cred: https://www.pexels.com/photo/forced-perspective-photography-of-cars-running-on-road-below-smartphone-799443/ by Matheus Bertelli

Core defence mechanisms

The main defences used to protect a web application can be separated into 4 types:

Handling user access
Handling user input
Handling attackers: using suitable defensive and offensive measures to frustrate an attacker when the application is targeted, e.g. limiting login attempts and imposing a longer timeout after each successive failed attempt
Monitoring: configuring the application so that admins can monitor it and its functionality, allowing them to identify suspicious activity which could indicate an attack
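The idea of a timeout that grows with each failed login attempt can be sketched as follows. The LoginThrottle class, its method names and the doubling delay are assumptions chosen for illustration, not a description of any particular framework.

```python
import time


class LoginThrottle:
    """Hypothetical helper: the wait doubles after each consecutive failed login."""

    def __init__(self, base_delay: float = 1.0, max_delay: float = 300.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        # username -> (consecutive failure count, time of the last failure)
        self._failures = {}

    def seconds_until_allowed(self, username: str, now: float = None) -> float:
        """Return how long the user must still wait before trying again."""
        now = time.monotonic() if now is None else now
        count, last_failure = self._failures.get(username, (0, 0.0))
        if count == 0:
            return 0.0
        delay = min(self.base_delay * 2 ** (count - 1), self.max_delay)
        return max(0.0, last_failure + delay - now)

    def record_failure(self, username: str, now: float = None) -> None:
        now = time.monotonic() if now is None else now
        count, _ = self._failures.get(username, (0, 0.0))
        self._failures[username] = (count + 1, now)

    def record_success(self, username: str) -> None:
        # A successful login clears the failure history.
        self._failures.pop(username, None)


throttle = LoginThrottle(base_delay=1.0)
throttle.record_failure("alice", now=100.0)
throttle.record_failure("alice", now=101.0)
print(throttle.seconds_until_allowed("alice", now=101.0))  # 2.0
```

The now parameter exists only so the example is deterministic. A real application would also need to decide whether to throttle per account, per IP address, or both.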

Handling user access

Web applications typically have different user roles, each with its own access to data and its own set of permitted actions. On a shopping website, for example, an anonymous user can browse items, while logged in users have access to the shopping basket and can make transactions. Business users can advertise their products on the website and perhaps place adverts on certain pages of the web application. Lastly, the admin has access to the full application and can remove existing users, escalate privileges and monitor sessions.

The defence of handling user access can be broken down into 3 sections:

Authentication
Session management
Access control

Authentication is the process of verifying that a user is who they claim to be. Without authentication a web application couldn't tell one user from another and would have to assume the lowest level of trust for everyone, i.e. treat them as an anonymous user. The most common form of authentication is a username (typically an email address) and a password. Some sites add extra security precautions using multi-factor authentication. Think of banking websites that require you to enter a PIN, your account number and a password, and to confirm your identity with a linked phone number.

After handling the initial authentication step, you have to keep the user signed in throughout their journey on the web application. This is known as the session. Since HTTP is stateless by nature, other methods are used to maintain the session so that the user stays logged in whilst browsing and interacting with the website.

These methods include issuing a session token when the user signs in, which is stored either as a cookie or in local storage. When the user attempts to use a part of the site that is only available to authenticated users, the session token is sent to the server (automatically, in the case of a cookie) so that the server can validate the user and maintain the session.

Most of the attacks associated with session management try to break the security of the session token by finding patterns in how the token is generated. If this is possible then the attacker may be able to forge a valid token and effectively take control as another user. Other attacks involve stealing the session cookie with injected JavaScript, i.e. through XSS.
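The difference between a token with a discoverable pattern and an unpredictable one can be sketched like this. The weak_token scheme (an MD5 hash of the user id) is a deliberately flawed, hypothetical example, and create_session and the SESSIONS dictionary are likewise made up for illustration; secrets.token_urlsafe is the standard library call intended for this kind of value.

```python
import hashlib
import secrets

# Hypothetical in-memory session store: token -> username.
SESSIONS = {}


def weak_token(user_id: int) -> str:
    # Flawed on purpose: anyone who guesses the scheme can compute
    # the token for any other user id.
    return hashlib.md5(str(user_id).encode()).hexdigest()


def create_session(username: str) -> str:
    # 32 random bytes from the OS's cryptographic random source.
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = username
    return token


# An attacker holding the weak token for user 41 can derive the one for user 42.
forged = weak_token(42)
print(forged == weak_token(42))  # True: the token is fully predictable

token = create_session("alice")
print(SESSIONS.get(token))       # alice
```

A random token still has to be protected in transit and in the browser, since an unpredictable token can be stolen even though it cannot be guessed.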

The last defence for handling user access is protecting the HTTP requests that a user can make. Even though functionality may be hidden from unauthorised users, an attacker who can guess the right HTTP request may be given access to data or be able to perform actions outside their access control level. Protecting against this type of attack is difficult in practice because developers often forget to add role checks to every route, having made flawed assumptions about how users will interact with the web application.
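One way to make role checks harder to forget is to attach them to each handler rather than relying on the UI to hide functionality. The sketch below is framework-agnostic; require_role, Forbidden and the dictionary used to represent a user are assumptions for illustration only.

```python
from functools import wraps


class Forbidden(Exception):
    """Raised when a user lacks the role required by a handler."""


def require_role(*allowed_roles):
    """Hypothetical decorator that checks the caller's role on the server side."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            if user.get("role") not in allowed_roles:
                raise Forbidden(f"{handler.__name__} requires one of {allowed_roles}")
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin")
def delete_user(user, target):
    # Only reachable if the check above passed, no matter how the request was crafted.
    return f"{target} deleted by {user['name']}"


print(delete_user({"name": "root", "role": "admin"}, "bob"))  # bob deleted by root
```

Calling delete_user as a user with any other role raises Forbidden, even if that user guessed the request that the admin interface would normally send.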

Handling user input

Web applications generally restrict what a user can enter on a website to prevent the user from maliciously attacking or breaking it. However, there are times when you should allow the user to type anything they want. For instance, on a blogging application a user writing about internet security may need to show code that is used in attacks. Depending on how strict we want the input validation to be, there are several different ways to handle the input.

The first method of input handling is to reject all known bad words or characters associated with hacking. This approach normally consists of a blacklist which the server checks the input against to make sure it doesn't contain any of those words. It is a weak approach because new attacks appear constantly, so the blacklist has to be regularly updated. Another concern is that a blacklist can be easily bypassed by changing the letter case of the attack, using NULL bytes or breaking up the expression with comments.

E.g. If <script> is blocked then try <scRiPt> instead.

E.g. SELECT /*bla*/ user, pwd FROM/*abc*/ usersTbl.
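The two bypasses above can be reproduced against a toy blacklist. The blacklist entries and both checking functions are assumptions made up for this sketch.

```python
# Hypothetical blacklist of "known bad" substrings.
BLACKLIST = ["<script>", "select user"]


def naive_blacklist_ok(value: str) -> bool:
    """Accept the input unless it contains a blacklisted string verbatim."""
    return not any(bad in value for bad in BLACKLIST)


def case_insensitive_blacklist_ok(value: str) -> bool:
    """Same check, but lower-cases the input first."""
    lowered = value.lower()
    return not any(bad in lowered for bad in BLACKLIST)


print(naive_blacklist_ok("<script>alert(1)</script>"))  # False: caught
print(naive_blacklist_ok("<scRiPt>alert(1)</scRiPt>"))  # True: mixed case slips through

sql = "SELECT /*bla*/ user, pwd FROM/*abc*/ usersTbl"
print(case_insensitive_blacklist_ok("<scRiPt>"))        # False: case trick now caught
print(case_insensitive_blacklist_ok(sql))               # True: comments break up the match
```

Each fix to the blacklist closes one trick and leaves the next one open, which is why this approach needs constant updating.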

The opposite of rejecting all the bad inputs is to define a set of rules that acceptable input has to follow, i.e. a whitelist. For example, an age field should accept only numbers. If we do need to accept potentially harmful input then sanitization can be used. This is where the harmful data is removed from the input or encoded to make it safe.
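Both ideas can be sketched in a few lines of Python. The function names, the three digit limit and the 0 to 150 range are assumptions for illustration; html.escape is the standard library function for encoding HTML special characters.

```python
import html
import re

# Whitelist rule: an age is one to three ASCII digits and nothing else.
AGE_PATTERN = re.compile(r"[0-9]{1,3}")


def parse_age(raw: str) -> int:
    """Accept only input that matches the rule; reject everything else."""
    if not AGE_PATTERN.fullmatch(raw):
        raise ValueError("age must be 1 to 3 digits")
    age = int(raw)
    if not 0 <= age <= 150:
        raise ValueError("age out of range")
    return age


def sanitize_for_html(raw: str) -> str:
    """Encode HTML special characters so the text is displayed, not executed."""
    return html.escape(raw, quote=True)


print(parse_age("42"))
print(sanitize_for_html("<script>alert('XSS')</script>"))
# &lt;script&gt;alert(&#x27;XSS&#x27;)&lt;/script&gt;
```

Encoding rather than removing the characters suits the blogging example above, since the author's code sample is still shown to readers exactly as typed.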

When handling input we also need to consider values that are not intended to be entered or changed by the user. For instance, a banking web application may contain a hidden form field with the user's account number. An attacker can change this value on the client side, which could affect how the system behaves. Therefore we need to do semantic checks on the server to ensure that the account number submitted in the form actually belongs to the logged in user.
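A semantic check of this kind compares the submitted value with what the server already knows about the session's user. The account data and the function name below are hypothetical.

```python
# Hypothetical server-side record of which accounts belong to which user.
ACCOUNTS_BY_USER = {
    "alice": {"1001", "1002"},
    "bob": {"2001"},
}


def can_use_account(session_user: str, submitted_account: str) -> bool:
    """Trust the session for identity, never the hidden form field."""
    return submitted_account in ACCOUNTS_BY_USER.get(session_user, set())


print(can_use_account("alice", "1001"))  # True: alice owns this account
print(can_use_account("alice", "2001"))  # False: tampered field points at bob's account
```

The value in the form may be perfectly well formed, so a syntax check would pass it; only the ownership lookup reveals the tampering.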

Since a user can simply bypass client-side validation, it's important that we validate on the server side as well. An effective form of protection is known as boundary validation. This is where every individual component treats its input as potentially malicious and does its own validation checks. This is done because it can be very difficult or even impossible to defend against every attack at the external boundary. Applications often chain components together, so that the output of one becomes the input of the next. If we validate only the first input, an attacker may be able to craft a value that passes the check but is transformed into something undesirable by a later component, causing the application to act in unexpected ways.

This is common in multi-step login forms, where an attacker can slip a payload past the validation step. For example, consider a filter that removes the string “<script>” from any input but applies the removal only once rather than repeating it until nothing is left to remove. If we enter “<scr<script>ipt>”, the filter finds the inner <script> and removes it. Because the result is not checked again, the surrounding fragments “<scr” and “ipt>” join up, and “<scr<script>ipt>” becomes <script>.
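The single-pass filter described above is easy to reproduce. Both functions are illustrative sketches; the second repeats the removal until the input stops changing, which closes this particular hole but is still a blacklist with the weaknesses discussed earlier.

```python
def strip_once(value: str) -> str:
    """Flawed filter: removes the string in one pass and never re-checks."""
    return value.replace("<script>", "")


def strip_until_stable(value: str) -> str:
    """Repeat the removal until the value no longer changes."""
    previous = None
    while previous != value:
        previous = value
        value = value.replace("<script>", "")
    return value


payload = "<scr<script>ipt>"
print(strip_once(payload))          # <script>  (the filter rebuilt the attack string)
print(strip_until_stable(payload))  # prints an empty line
```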

The same weakness, checking the input at one step and transforming it again at a later one, can be used to perform a double encoding attack.

“This attack technique consists of encoding user request parameters twice in hexadecimal format in order to bypass security controls or cause unexpected behavior from the application. It’s possible because the webserver accepts and processes client requests in many encoded forms.” — https://www.owasp.org/index.php/Double_Encoding

An example:

<script>alert('XSS')</script>

Encoded once it looks like this:

%3Cscript%3Ealert(%27XSS%27)%3C%2Fscript%3E

Double encoded:

%253Cscript%253Ealert(%2527XSS%2527)%253C%252Fscript%253E
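The encoded strings above can be reproduced with Python's urllib.parse, and the same module shows why they matter. The filter_then_decode function is a hypothetical flawed pipeline, assumed for illustration: it decodes once, checks for the attack string, and then a later stage decodes again.

```python
from urllib.parse import quote, unquote

payload = "<script>alert('XSS')</script>"

# safe="()" keeps the parentheses unencoded, matching the strings shown above.
encoded_once = quote(payload, safe="()")
encoded_twice = quote(encoded_once, safe="()")

print(encoded_once)   # %3Cscript%3Ealert(%27XSS%27)%3C%2Fscript%3E
print(encoded_twice)  # %253Cscript%253Ealert(%2527XSS%2527)%253C%252Fscript%253E


def filter_then_decode(value: str) -> str:
    """Hypothetical flawed pipeline: one decode, one check, then another decode."""
    decoded = unquote(value)
    if "<script>" in decoded.lower():
        raise ValueError("blocked")
    # A later component decodes again without re-validating.
    return unquote(decoded)


print(filter_then_decode(encoded_twice))  # <script>alert('XSS')</script>
```

The singly encoded payload is blocked because the check sees <script> after the first decode. The doubly encoded one still looks harmless at that point and only becomes the attack string after the second decode.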

Handling attackers

Web applications should employ the following mechanisms to handle attackers:

Maintaining audit logs: an effective log allows you to understand when a breach happened and how it was done, and can even help uncover the intruder's identity (IP address). Logs can also indicate a potential attack and be used to alert administrators to investigate in further detail. Beware that logs are themselves a potential target for attackers, as they contain valuable authentication records.
Blocking IP addresses: once the audit log has identified an attacker, an admin can block the attacker's IP address to stop successive attacks. However, this approach will not stop a determined or smart hacker who is capable of changing or hiding their IP address, for example by routing requests through proxies.
Intrusion detection systems and firewalls: these can be used to identify suspicious activity, e.g. large volumes of automated requests, requests containing attack strings or requests containing unexpected data.
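As a rough sketch of how an audit log can feed the other two mechanisms, the function below counts failed logins per IP address and returns the addresses that cross a threshold. The event format, the field names and the threshold are all assumptions; the addresses come from ranges reserved for documentation.

```python
from collections import Counter


def suspicious_ips(events, threshold=5):
    """Return IPs with at least `threshold` failed logins in the given events."""
    failures = Counter(e["ip"] for e in events if e["event"] == "login_failed")
    return sorted(ip for ip, count in failures.items() if count >= threshold)


# Hypothetical audit log entries.
audit_log = (
    [{"ip": "203.0.113.7", "event": "login_failed"} for _ in range(6)]
    + [{"ip": "198.51.100.2", "event": "login_failed"}]
    + [{"ip": "198.51.100.2", "event": "login_ok"}]
)

print(suspicious_ips(audit_log))  # ['203.0.113.7']
```

The result could be used to alert an administrator or to populate a block list, with the caveat from the list above that blocking by IP address only slows a determined attacker down.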

Summary

We have seen that most attacks stem from vulnerabilities in these core defence mechanisms. In the upcoming blog posts we will explore in detail the techniques and tools used by attackers to map out potential vulnerabilities and break into systems.