6 things that you probably don’t validate

But you should.

Published in grabek.io · 3 min read · Sep 7, 2017

Input validation is a huge topic. A lot has been written and said about it. I just want to add a little bit to the topic and point out a few areas that are often forgotten.

#1 Data from trustworthy external services

It’s not hard to imagine an application that integrates with big, established services like Google, Facebook etc. Those companies do their best to make sure that the content users put there is not harmful (to those services). The problem is that it may still be harmful to yours.

Let’s say we have an application that allows users to import their contacts from Google Contacts. It’s entirely possible that the notes field of one of the contacts contains data that is problematic for your system: SQL injection payloads, HTML, JavaScript, strange Unicode characters etc. All of this needs to be validated and sanitized before it reaches the more delicate parts of your system.
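As a rough illustration, here is a minimal sketch of cleaning up such a note before it goes any further. The function name `sanitize_note` and the `MAX_NOTE_LENGTH` limit are made up for this example, and the right rules depend on where the text ends up. It covers only Unicode cleanup and HTML escaping; SQL injection is better handled with parameterized queries than with string cleanup.

```python
import html
import unicodedata

MAX_NOTE_LENGTH = 1000  # hypothetical limit, pick one that fits your data


def sanitize_note(raw: str) -> str:
    """Normalize a note, strip invisible characters, cap its length, escape HTML."""
    if not isinstance(raw, str):
        raise TypeError("note must be a string")
    # Normalize Unicode so equivalent sequences get the same representation.
    text = unicodedata.normalize("NFC", raw)
    # Drop control (Cc) and format (Cf) characters, but keep newlines and tabs.
    text = "".join(
        ch
        for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    text = text[:MAX_NOTE_LENGTH]
    # Escape HTML so the note cannot inject markup when it is rendered.
    return html.escape(text)
```

Escaping at this point assumes the note will be rendered as HTML; if it is stored first and rendered later, it is usually safer to escape at render time and only validate here.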

#2 Data from your own database

This one can be controversial, but for important systems it’s a must-have. Most developers trust the connection to the database without a second thought, and in most cases this trust is never put to the test. But data from the database is still external data that is fed into your system.

There are two basic scenarios in which this data can be harmful:

You are not connecting to your database at all. The fact that you connected to a database doesn’t necessarily mean that it’s the one you wanted to connect to. Your infra guys should prevent this kind of situation, but don’t rely on someone else; secure your system as if everyone were evil and incompetent.
Harmful data was introduced to the database a long time ago, when it wasn’t harmful yet, but new features can behave in unexpected ways when they meet that old data.
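One way to act on this is to validate every row on its way into the application, exactly as you would validate user input. The sketch below assumes a hypothetical `users` row with `id`, `email` and `display_name` columns; the `User` class, the loose email pattern and the length limit are illustrative only.

```python
import re
from dataclasses import dataclass
from typing import Any, Mapping

# Deliberately loose pattern: something@something.something, no whitespace.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
MAX_NAME_LENGTH = 100  # hypothetical limit


@dataclass(frozen=True)
class User:
    id: int
    email: str
    display_name: str


def user_from_row(row: Mapping[str, Any]) -> User:
    """Build a User from a database row, rejecting anything unexpected."""
    user_id = row.get("id")
    email = row.get("email")
    name = row.get("display_name")
    # bool is a subclass of int, so it has to be excluded explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        raise ValueError(f"invalid id: {user_id!r}")
    if not isinstance(email, str) or not EMAIL_RE.fullmatch(email):
        raise ValueError(f"invalid email for user {user_id}")
    if not isinstance(name, str) or not 0 < len(name) <= MAX_NAME_LENGTH:
        raise ValueError(f"invalid display name for user {user_id}")
    return User(user_id, email, name)
```

A row that fails validation is a signal worth investigating: it may be old data that predates the current rules, or it may mean that you are talking to the wrong database.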

#3 External API

Practically the same principles as in the case of databases apply to any external API. You don’t know whether you are actually connecting to the API that you want to connect to. Additionally, you are dealing with many more possible execution paths, and all of them need to be handled. It’s not only successful responses that need to be validated: if you log errors or do basically anything else with them, validate error responses as well.
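To make the point about error responses concrete, here is a small sketch that turns an arbitrary error body into something safe to log. It assumes a hypothetical API that reports errors as JSON with an `error` string field; the function name and the length limit are invented for the example.

```python
import json

MAX_LOGGED_ERROR_LENGTH = 200  # hypothetical limit


def safe_error_message(status: int, body: str) -> str:
    """Return a bounded, single-line description of an error response."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all, e.g. an HTML error page from a proxy.
        payload = None
    message = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(message, str):
        message = "<unparseable error body>"
    # Replace non-printable characters (including newlines) so the remote
    # side cannot forge extra lines in our log.
    cleaned = "".join(ch if ch.isprintable() else "?" for ch in message)
    return f"status={status} error={cleaned[:MAX_LOGGED_ERROR_LENGTH]}"
```

For example, `safe_error_message(502, "<html>Bad Gateway</html>")` yields `status=502 error=<unparseable error body>` instead of writing the raw HTML to the log.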

#4 Time

A common practice is to use the current date and time from the server as the creation time for entities or other objects that need to be timestamped. Again, in an overwhelming majority of cases this works fine. The problem arises when the timestamp is an important piece of information (for example, when it is used to determine what happened first). You should check whether the time provided by the system makes sense. If there are other entities that were seemingly created in the future, you should probably do something about it.
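A simple version of this check compares the current time with the newest timestamp already stored. The sketch below is only one possible approach; `checked_timestamp`, `MAX_CLOCK_SKEW` and the idea of passing in the latest known timestamp are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

MAX_CLOCK_SKEW = timedelta(seconds=5)  # hypothetical tolerance


def checked_timestamp(now: datetime, latest_known: datetime) -> datetime:
    """Return `now` if it is consistent with the newest stored timestamp."""
    if now.tzinfo is None or latest_known.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    # If existing records are noticeably newer than "now", the clock is suspect.
    if latest_known - now > MAX_CLOCK_SKEW:
        raise ValueError("existing records appear to be from the future")
    return now


# Usage: pass the current UTC time and the newest timestamp from storage.
created_at = checked_timestamp(
    datetime.now(timezone.utc),
    datetime(2017, 9, 7, 12, 0, tzinfo=timezone.utc),
)
```

What "do something" means is up to the system: refusing the write, raising an alert, or falling back to another time source are all reasonable options.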

#5 Environment variables

Checking env variables may seem even more paranoid than validating data from the database, but it’s not that crazy when you think about it.

Environment variables are generally easy to set and retrieve. It may be very unlikely that your infra guy has set something dangerous in one of them, but some other software might.
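In practice this can be as small as parsing each variable strictly at startup instead of using it as-is. The following sketch reads a port number; the variable name `APP_PORT` and the function `read_port` are hypothetical.

```python
import os
from typing import Mapping


def read_port(env: Mapping[str, str] = os.environ, name: str = "APP_PORT") -> int:
    """Read a TCP port from the environment, rejecting anything suspicious."""
    raw = env.get(name)
    if raw is None:
        raise ValueError(f"{name} is not set")
    # Accept plain ASCII digits only: no signs, spaces or exotic numerals.
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"{name} is not a number: {raw!r}")
    port = int(raw)
    if not 1 <= port <= 65535:
        raise ValueError(f"{name} is out of range: {port}")
    return port
```

Taking the environment as a parameter keeps the function easy to test with a plain dictionary.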

#6 Headers

Basic filtering and validation of HTTP headers will probably be done by the framework that you have chosen. Of course, if you use data from headers in any kind of custom logic, you should validate it again according to the rules that this logic expects.
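For instance, suppose custom logic uses a request ID taken from a header to correlate log entries. A minimal sketch of validating it could look like this; the `X-Request-Id` header, the allowed character set and the length limit are assumptions for the example, not rules from any particular framework.

```python
import re
from typing import Mapping

# Hypothetical rule: 1 to 64 ASCII letters, digits or hyphens.
REQUEST_ID_RE = re.compile(r"[A-Za-z0-9-]{1,64}")


def request_id_from_headers(headers: Mapping[str, str]) -> str:
    """Extract the request ID header, rejecting missing or malformed values."""
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("x-request-id")
    if value is None or not REQUEST_ID_RE.fullmatch(value):
        raise ValueError("missing or malformed X-Request-Id header")
    return value
```

Using `fullmatch` means the whole value has to satisfy the pattern, so a value with an embedded line break or extra characters is rejected rather than partially accepted.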