So, they are not developers, but for me EVERYBODY SHOULD LEARN HOW TO WRITE SOME CODE and obviously WRITE HTML CODE!
One of my students asked, “What is sanitization?” I explained what it is about, how important it can be in a real-world project, and the difference between sanitizing and validating user input.
- Sanitizing will remove any illegal character from the data.
- Validating will determine if the data is in proper form.
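To make the difference concrete, here is a minimal sketch in Python. The function names and the two patterns are my own assumptions for illustration (the email check in particular is deliberately loose), not a standard API.

```python
import re

def sanitize_username(raw: str) -> str:
    """Sanitize: remove every character outside a small allow-list."""
    # Hypothetical rule: keep only letters, digits and underscores.
    return re.sub(r"[^A-Za-z0-9_]", "", raw)

def is_valid_email(raw: str) -> bool:
    """Validate: report whether the value has the expected shape, change nothing."""
    # Deliberately loose pattern, only meant to show the idea.
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", raw) is not None

print(sanitize_username("bob<script>"))   # the angle brackets are stripped
print(is_valid_email("bob@example.com"))  # well formed
print(is_valid_email("not an email"))     # rejected, the caller can show an error
```

Note that the sanitizer returns a changed value, while the validator only answers yes or no and leaves the decision to the caller.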
I would like to share with you the definition of the word:
verb (used with object), sanitized, sanitizing.
1. to free from dirt, germs, etc., as by cleaning or sterilizing.
2. to make less offensive by eliminating anything unwholesome, objectionable, incriminating, etc.:
to sanitize a document before releasing it to the press.
In the real world, to sanitize is to “clean” something of “bad things”. In computer science it means the same thing: mostly for security purposes, we protect the system from malicious data.
For example, a user can type anything into a form field and submit it. The value may be accepted by the form, yet be dangerous on the server side: it might contain malicious escape codes, as in an SQL injection attack. Validation, by contrast, applies to checking whether a field is well formed, in which case it can return an error to the user.
Writing a post on a blog is a good example. The user enters some HTML code, directly or via a WYSIWYG editor, and we store it in the database and later display it. So, what if the user copies and pastes some code from the internet that contains a <script> tag with malicious code? In this case, we do HTML sanitization.
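Here is a small sketch of the SQL injection problem using Python's built-in sqlite3 module. The table, the data and the function names are made up for the example. One common defense, shown here, is to pass the user value as a query parameter instead of pasting it into the SQL text.

```python
import sqlite3

# In-memory database with a hypothetical "users" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a1"), ("bob", "b2")])

def find_user_unsafe(name: str) -> list:
    # DANGEROUS: the user value becomes part of the SQL text itself.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(name: str) -> list:
    # The ? placeholder sends the value as data, never as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?",
                        (name,)).fetchall()

payload = "x' OR '1'='1"
print(len(find_user_unsafe(payload)))  # every row matches the injected OR clause
print(len(find_user_safe(payload)))    # no user has that literal name
```

The form happily accepts the payload as a plain string. Only the way the server builds the query decides whether it is harmless or not.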
HTML sanitization is the process of examining an HTML document and producing a new HTML document that preserves only whatever tags are designated “safe” and desired. HTML sanitization can be used to protect against cross-site scripting (XSS) attacks by sanitizing any HTML code submitted by a user.
Basic tags for changing fonts are often allowed, such as <strong>, while more advanced tags such as <link> are removed by the sanitization process. Also, potentially dangerous attributes such as the onclick attribute are removed in order to prevent malicious code from being injected.
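The idea of keeping only “safe” tags can be sketched with Python's standard html.parser module. The allow-list below and the choice to drop every attribute are assumptions I made for the example. It is a teaching sketch, not a hardened sanitizer, so in a real project I would rather rely on a maintained, well-tested sanitization library.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"strong", "em", "b", "i", "p"}  # hypothetical allow-list
DROP_WITH_CONTENT = {"script", "style"}         # remove these and their content

class AllowListSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the cleaned document
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            # Attributes are ignored on purpose, so onclick never survives.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text so it can never turn back into markup.
            self.out.append(escape(data))

def sanitize_html(dirty: str) -> str:
    parser = AllowListSanitizer()
    parser.feed(dirty)
    parser.close()  # flush any buffered text
    return "".join(parser.out)

dirty = ('<strong onclick="evil()">Hi</strong>'
         '<link rel="stylesheet" href="x.css">'
         '<script>alert(1)</script> there')
print(sanitize_html(dirty))  # <strong>Hi</strong> there
```

The important design choice is that this is an allow-list: anything not explicitly permitted is dropped, instead of trying to list every dangerous tag.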
Depending on the context, sanitization will take on a few different forms. It can range from something as simple as removing vulgarities and odd symbols from text to removing SQL injection attempts and other malicious code.
Never trust user input; trusting it is a very naive approach. Always validate forms, and check them on the frontend as well as the backend. Sanitizing data and URLs is a MUST.
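For URLs, one simple check is to accept only the schemes you expect. The sketch below uses Python's urllib.parse. The function name and the allowed schemes are my assumptions, and a real application would probably need more checks (for example on the host).

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # hypothetical policy for this example

def is_safe_url(raw: str) -> bool:
    """Accept only absolute http(s) URLs that have a host part."""
    try:
        parts = urlsplit(raw.strip())
    except ValueError:
        return False  # malformed input is treated as unsafe
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)

print(is_safe_url("https://example.com/page"))  # expected scheme and a host
print(is_safe_url("javascript:alert(1)"))       # script URL, rejected
```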