Back-end validation on AWS Lambda and NodeJS

Pedro Fernando Marquez Soto
A man with no server
5 min read · Mar 18, 2017

JavaScript is a great language: it provides flexibility and transparency. You can always know for sure what a variable contains without knowing what type it is because, well, there are no types.

Sure, there are Strings, Objects and Functions, but that’s it. Print an object and you will see its contents. Simple.

That same flexibility brings a big challenge when working with Lambda and API Gateway (and with almost any web app that has NodeJS in the back end): how can we let the function’s client know whether the data we are receiving is correct?

We could manually check each one of the fields we expect (the TacoGallery project we have used before more or less does that), but that would make our code extremely complex.
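To make that cost concrete, here is a rough sketch of what hand-written checks could look like for a single small object. The function name `validateTacoByHand` is made up for this illustration, and the rules are assumptions based on the taco used in this post: `name` is required, `description` is optional, both are strings, and nothing else is allowed.

```javascript
// Illustration only: hand-written checks for one small object.
// Assumed rules: `name` is required, `description` is optional,
// both must be strings, and no other fields are allowed.
function validateTacoByHand(taco) {
  const errors = [];
  const allowed = ['name', 'description'];
  const has = (key) => Object.prototype.hasOwnProperty.call(taco, key);

  if (taco === null || typeof taco !== 'object' || Array.isArray(taco)) {
    return ['taco must be an object'];
  }

  // Required field, with a type check when it is present
  if (!has('name')) {
    errors.push('name is required');
  } else if (typeof taco.name !== 'string') {
    errors.push('name must be a string');
  }

  // Optional field, checked only when present
  if (has('description') && typeof taco.description !== 'string') {
    errors.push('description must be a string');
  }

  // Reject anything we did not expect
  Object.keys(taco).forEach((key) => {
    if (allowed.indexOf(key) === -1) {
      errors.push('unexpected field: ' + key);
    }
  });

  return errors;
}

console.log(validateTacoByHand({ name: 'Al pastor', description: 'A delicious taco' })); // no errors
console.log(validateTacoByHand({ description: 42, x: 'x' })); // three errors
```

Every new field adds another branch, and every new model needs its own function, which is the complexity a schema library is meant to absorb.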

Fortunately, the world of NodeJS is vast and full of modules.


I like jsonschema, a JavaScript library which allows us to create a schema for our objects and use it to validate them.

Among the things this library validates are format, presence of required fields and the structure of composite types. And, if the validation fails, it returns a user-friendly error, along with specific data about what caused it.

What about TypeScript? I mean, TypeScript addresses this problem natively, and we are using ES6 anyway, so why not use it? While it’s really popular (even Angular 2 uses it by default), it’s still a superset on top of regular JavaScript. If you feel comfortable using it, go ahead; it’s a perfectly good choice too.

In our current Taco Gallery app, we only save a single kind of object, a new taco:

{
    "id": "...", // Auto generated
    "name": "Al pastor",
    "description": "A delicious taco"
}

Let’s take a look at our saveTaco function:

saveTaco(tacoName, tacoDescription){
    let id = uuid.v4();
    var params = {
        TableName: 'TacoGallery',
        Item: {
            "id": id,
            "name": tacoName,
            "description": tacoDescription
        }
    };

    return this.db.put(params).promise().then(data => {
        data = Object.assign({id: id}, data);
        return data;
    });
}

We are sending the properties of the entity we want to save one by one. This is good for an example, but as our application grows, so do our models. Keeping this approach would result in functions with too many parameters.

Let’s send an object instead:

saveTaco(taco){
    let id = uuid.v4();
    var params = {
        TableName: 'TacoGallery',
        Item: Object.assign({
            "id": id
        }, taco)
    };

    return this.db.put(params).promise().then(data => {
        data = Object.assign({id: id}, data);
        return data;
    });
}

This brings us back to our original problem: How can we trust that all the required fields are included? How do we know that we are not getting more data than we expect?

We can create a schema to define our Taco object with jsonschema, and use it for validation:

const schema = {
    "id": "/Taco",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "description": {"type": "string"}
    },
    "required": ["name"],
    "additionalProperties": false
};

This schema guarantees the following about the objects it validates:

  • The object includes the required field name. If it’s missing, validation fails with an error.
  • If name and description are present, both have to be strings. Numbers, booleans or objects fail validation.
  • No fields other than name and description are allowed.

Taking a look at our Lambda function, where is the best place to put this validation? Since all our Lambda-specific code is in handler.js, that looks like a good place to put it. It would decouple our validation logic from the business logic.

However, what happens if tomorrow you want to move your application to a non-Serverless architecture, like an Express app? With that in mind, it makes more sense to keep the validation on the application side of the split between Lambda-specific code and application-specific code, so let’s put it in TacoGallery.js.

Another thought: Today I like jsonschema, but in the future it might make more sense to use a different library. To help our future selves, let’s also decouple this library from our application by wrapping it in a new class, decorator style.

For now, we define the schema in the class constructor, but you might as well load it from a static JSON file to structure your project better.

Also, notice that our validation function returns a Promise. This will allow us to plug and chain our validation function into other promises, like the one we are using to save in DynamoDB.
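The wrapper class itself is not reproduced in this text, so here is a minimal sketch of what such a wrapper could look like. The class name `SchemaValidator` and the idea of passing the validation engine into the constructor are assumptions made for this illustration, not the code from the project. The only thing it expects from the engine is a `validate(instance, schema)` method that returns an object with an `errors` array, which is what jsonschema’s `Validator` provides.

```javascript
// Sketch only: the class name and the injected `engine` are assumptions,
// not the code from the original project.
class SchemaValidator {
  // `engine` must expose validate(instance, schema) and return an object
  // with an `errors` array, as jsonschema's Validator does.
  constructor(engine) {
    this.engine = engine;
    // Schemas are defined inline for now; they could be loaded from JSON files.
    this.schemas = {
      '/Taco': {
        id: '/Taco',
        type: 'object',
        properties: {
          name: { type: 'string' },
          description: { type: 'string' }
        },
        required: ['name'],
        additionalProperties: false
      }
    };
  }

  // Resolves with the instance when it is valid and rejects with the list of
  // validation errors otherwise, so it can be chained with other promises.
  validate(instance, schemaId) {
    return new Promise((resolve, reject) => {
      const schema = this.schemas[schemaId];
      if (!schema) {
        reject(new Error('Unknown schema: ' + schemaId));
        return;
      }
      const result = this.engine.validate(instance, schema);
      if (result.errors.length > 0) {
        reject(result.errors);
      } else {
        resolve(instance);
      }
    });
  }
}

// Wiring it to the real library (requires `npm install jsonschema`):
//   const { Validator } = require('jsonschema');
//   const v = new SchemaValidator(new Validator());
//   v.validate({ description: 'A good taco' }, '/Taco')
//     .catch(errors => console.log(errors));
```

Because the rest of the application only sees `validate(instance, schemaId)` and a promise, swapping jsonschema for another library later would mean changing this one class.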

saveTaco(taco){
    let id = uuid.v4();
    var params = {
        TableName: 'TacoGallery',
        Item: Object.assign({
            "id": id
        }, taco)
    };

    // Validate first; the item is only stored if the validation promise resolves
    return this.v.validate(taco, "/Taco").then(() => {
        return this.db.put(params).promise().then(data => {
            data = Object.assign({id: id}, data);
            return data;
        });
    });
}

The complete version of the file is available in the GitHub repository linked at the end of this post.

Let’s give it a try. Create a valid taco:

curl -d '{"name":"Al pastor","description":"A good taco"}' -H "Content-Type: application/json" -X POST http://localhost:3000/taco

The console prints:

Serverless: POST /taco (λ: saveTaco)
Serverless: The first request might take a few extra seconds
Serverless:
Serverless: [200] {"statusCode":200,"body":"{\"id\":\"6e608c59-8015-4afe-9de5-af64c4b2eedb\"}"}

Now, run the same command, but without the required field name:

curl -d '{"description":"A good taco"}' -H "Content-Type: application/json" -X POST http://localhost:3000/taco

The console prints:

Serverless: [409] {"statusCode":409,"body":{"message":"Could not save the taco","stack":[{"property":"instance","message":"requires property \"name\"","schema":"/Taco","instance":{"description":"A good taco"},"name":"required","argument":"name","stack":"instance requires property \"name\""}]}}

The request fails, and even returns a user-friendly message! Now, let’s try to create a record with extra, unexpected data:

curl -d '{"name":"Al pastor","description":"A good taco", "x":"x"}' -H "Content-Type: application/json" -X POST http://localhost:3000/taco

The console prints:

Serverless: [409] {"statusCode":409,"body":{"message":"Could not save the taco","stack":[{"property":"instance","message":"additionalProperty \"x\" exists in instance when not allowed","schema":"/Taco","instance":{"name":"Al pastor","description":"A good taco","x":"x"},"name":"additionalProperties","argument":"x","stack":"instance additionalProperty \"x\" exists in instance when not allowed"}]}}

Again, the request fails, and lets us know which field is violating the validation.
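The 409 responses above presumably come from the Lambda handler turning a rejected validation promise into an HTTP response. The handler is not shown in this post, so the following is only a sketch of how that mapping could look; `buildResponse`, `saveTacoHandler` and the `gallery` argument are hypothetical names. Unlike the logged output, it stringifies the error body too, since API Gateway’s Lambda proxy integration expects `body` to be a string.

```javascript
// Sketch only: the function names are hypothetical, and `gallery` stands in
// for the TacoGallery instance (anything with a promise-returning saveTaco).
function buildResponse(statusCode, payload) {
  // API Gateway's Lambda proxy integration expects `body` to be a string.
  return { statusCode: statusCode, body: JSON.stringify(payload) };
}

function saveTacoHandler(gallery, event, callback) {
  return Promise.resolve()
    // Parsing inside the chain sends malformed JSON down the error path too
    .then(() => gallery.saveTaco(JSON.parse(event.body)))
    .then(
      data => callback(null, buildResponse(200, data)),
      err => callback(null, buildResponse(409, {
        message: 'Could not save the taco',
        // Validation rejects with an array of errors; anything else
        // is reduced to a single entry holding its message.
        stack: Array.isArray(err)
          ? err
          : [{ message: String((err && err.message) || err) }]
      }))
    );
}
```

Keeping this translation in the handler leaves TacoGallery.js free of any Lambda-specific response format.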

Using jsonschema allows us to validate our data objects while keeping the flexibility of JavaScript objects.

This model also allows us to create schemas as they are needed, and use them in our business logic to make sure all the data that’s stored in DynamoDB is valid. Also, we make sure to let the consumers of our API know which data is failing and why.

Today’s post is not really Serverless specific. Any NodeJS application can take advantage of it. The takeaway is that, while building Serverless applications, we still have to care about the common concerns of a web application.

Don’t forget that you can find the full code at my GitHub account: https://github.com/pfernandom/taco-gallery
