The pattern described in this article is available as an iOS framework on my GitHub.
Whenever a program gets its data from an untrusted source, such as a user or an external web service, it is particularly important that this data be thoroughly validated before the program starts working with it. Otherwise the program risks producing errors, corrupting data, or, worse, being vulnerable to a whole array of injection attacks.
Fortunately, those risks are now well known and development teams usually try to make sure that external inputs are systematically validated before they are used. And while such efforts are a good start, they can easily get harder to enforce as the code base grows. For instance, consider the following code:
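A hypothetical handler of this kind might look like the sketch below. The names `handleSignup`, `saveUser`, and `savedEmails`, as well as the deliberately simplistic email rule, are illustrative assumptions and not taken from a real project.

```swift
// Hypothetical storage standing in for a database (assumption for illustration).
var savedEmails: [String] = []

func saveUser(email: String) {
    savedEmails.append(email)
}

func handleSignup(form: [String: String]) {
    // ... imagine dozens of unrelated lines here ...
    guard let email = form["email"] else { return }

    // The only thing protecting `saveUser` is this check, buried mid-function.
    // The rule itself is a toy example, not a real email validation.
    guard email.contains("@"), !email.contains(" ") else { return }

    // ... and dozens more lines here ...
    saveUser(email: email)
}
```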
The necessary validation is indeed performed before the external data is used, but it sits somewhere in a possibly large chunk of code. It could easily be modified or deleted in the future, and nothing guarantees that the mistake would be spotted during code review.
Based on this example, it becomes clear that we need to lay down some kind of pattern to validate external data in a way that makes code review easy and reliable.
Model-driven security
A good starting point is to find a way to clearly differentiate untrusted data from trusted data. Model-driven security is a sound approach to reach this goal. The idea behind it is to define business objects that wrap trusted data. This way, untrusted data is stored using primitive types and trusted data using those business objects. The line between them is clear and easy to enforce:
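As a minimal sketch, assuming a hypothetical `Email` business object with a deliberately simplistic validation rule, the split could look like this:

```swift
// Hypothetical business object: an `Email` exists only if validation succeeded.
struct Email {
    let value: String

    // Failable initializer taking the raw, untrusted primitive.
    init?(rawValue: String) {
        // Toy rule for illustration only, not a real email validation.
        guard rawValue.contains("@"), !rawValue.contains(" ") else { return nil }
        self.value = rawValue
    }
}

// Business logic accepts the business object, never a raw String.
func welcomeMessage(for email: Email) -> String {
    return "Welcome, \(email.value)!"
}
```

Functions such as `welcomeMessage(for:)` accept only `Email`, so a raw `String` cannot reach them without going through the initializer. Note, however, that the initializer itself receives the raw string directly.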
The bad part is that the initializer of a business object has access to the external data regardless of whether it passes the validation. While this concern might sound a bit extreme, remember that initializers tend to be written hastily, sometimes relying heavily on copying and pasting. So there definitely is enough room for mistakes to happen.
A container for untrusted data
To address the issue, we are going to define a generic struct called Untrusted.
Its role is to act as a wrapper around external data. Whenever external data is handled, it should be systematically and immediately stored in an Untrusted container.
Notice that the value is stored with the visibility fileprivate, which means that it cannot be read outside of the file where the struct is defined. It might sound weird and impractical at first, but it actually is the cornerstone of the pattern.
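A minimal sketch of such a container follows. Only the struct's name and the `fileprivate` visibility come from the text; the generic parameter name `T` and the unlabeled initializer are assumptions.

```swift
// Untrusted.swift
public struct Untrusted<T> {
    // `fileprivate`: readable only by code declared in this same source file.
    fileprivate let value: T

    // Wrapping is always allowed; unwrapping is not exposed here.
    public init(_ value: T) {
        self.value = value
    }
}

// Wrapping a raw input as soon as it enters the program.
let wrapped = Untrusted("alice@example.com")
```

Code in any other file can create and pass around an `Untrusted` value, but a line such as `wrapped.value` would not compile there.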
In the same file, we now declare a protocol called Validator.
This protocol expects conforming types to provide a validation(value:) function that tells whether the data is correct. To actually get the data, the developer has to call the function validate(untrusted:), which returns the value if it passes the validation, and nil otherwise.
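The sketch below shows one way the protocol could be written. The function names `validation(value:)` and `validate(untrusted:)` come from the text; making them `static`, using an associated type named `T`, and returning an optional are assumptions. The container is repeated so that the block is self-contained, and `NonEmptyValidator` is a hypothetical conforming type.

```swift
// Untrusted.swift (container and protocol share one file)
public struct Untrusted<T> {
    fileprivate let value: T
    public init(_ value: T) { self.value = value }
}

public protocol Validator {
    associatedtype T
    // Conforming types only decide whether a value is acceptable.
    static func validation(value: T) -> Bool
}

public extension Validator {
    // The single gateway out of the container: it can read `value`
    // because it is declared in the same file as `Untrusted`.
    static func validate(untrusted: Untrusted<T>) -> T? {
        return validation(value: untrusted.value) ? untrusted.value : nil
    }
}

// Hypothetical validator, for illustration only.
enum NonEmptyValidator: Validator {
    static func validation(value: String) -> Bool {
        return !value.isEmpty
    }
}
```

Because `validate(untrusted:)` lives in a protocol extension, conforming types never touch the wrapped value themselves; they only see it as the argument of `validation(value:)`.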
Putting things together
We now have a pattern that looks pretty tight. Let’s have a look at it in action:
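Here is a single-file sketch of the whole pattern under the same assumptions as before: `EmailValidator` and `Email` are hypothetical names, and the email rule is a toy. In a real project the first part would sit in its own source file, since that separation is what makes `fileprivate` effective.

```swift
// --- Untrusted.swift (its own file in a real project) ---
public struct Untrusted<T> {
    fileprivate let value: T
    public init(_ value: T) { self.value = value }
}

public protocol Validator {
    associatedtype T
    static func validation(value: T) -> Bool
}

public extension Validator {
    static func validate(untrusted: Untrusted<T>) -> T? {
        return validation(value: untrusted.value) ? untrusted.value : nil
    }
}

// --- Email.swift (hypothetical business object) ---
enum EmailValidator: Validator {
    // Toy rule for illustration only, not a real email validation.
    static func validation(value: String) -> Bool {
        return value.contains("@") && !value.contains(" ")
    }
}

struct Email {
    let value: String

    // The initializer receives a container, not a raw String, so the only
    // way to obtain the string is to go through the validator.
    init?(_ untrusted: Untrusted<String>) {
        guard let validated = EmailValidator.validate(untrusted: untrusted) else {
            return nil
        }
        self.value = validated
    }
}

// External input is wrapped the moment it enters the program.
let good = Email(Untrusted("alice@example.com"))
let bad = Email(Untrusted("not an email"))
```

With this input, `good` holds an `Email` and `bad` is `nil`.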
This time, there is no room for mistakes: when all the external data is wrapped in Untrusted containers, the initializer cannot extract the actual value unless it passes the validation test.
Now let’s see if our initial goal of making the validation of external data systematic and easy to enforce has been reached:
- Whenever external data is retrieved, it must be immediately stored in an Untrusted container => easy to enforce in code review ✅
- Initializers of business objects can take as arguments either other business objects or Untrusted containers, never primitive types => easy to enforce in code review ✅
- Business logic can only operate with business objects, never with primitive types => easy to enforce in code review ✅
The goal is indeed achieved: by using the Untrusted+Validator pattern along with the three rules above, we can guarantee that, as long as the validation functions are correct, the business logic will only deal with safe and validated data.
For more information around the topic of secure coding, I recommend this video from the 2016 GOTO conference: https://www.youtube.com/watch?v=oqd9bxy5Hvc
Originally published at gist.github.com.