Building a Node.js REST API with Express

Part 2: Making your API more robust

Jeff Andersen
Jan 31, 2015 · 18 min read

We have the foundation.

Let’s build on top of what we have learned in Part 1.

Our API has very basic functionality. We’re able to define routes, read information sent from a client, perform database queries and respond with a resource object.

How do we deal with malicious input? How do we handle returning subsets of our data? How do we accept multi-part forms (e.g. file uploads)? When we’re ready to publish our API, how do we ensure that we maintain existing functionality when making changes or adding new features?

The internet can be a scary place

Validation

We always want to make sure that data our API is processing is validated.

Data validation helps ensure that our handlers operate using only clean, correct and useful data. This helps maintain the health of your dataset, and improve the user experience of your API. Most importantly, it helps prevent the processing of malicious requests. We cannot trust the source of the request because any request sent to your server could be spoofed or modified in-flight before it gets to us.

What’s the worst that could happen?

  • Wrong type of data stored in database, breaking future requests
  • Unintentional or malicious access to data other than what was requested
  • Malicious code runs on your server exposing your database and codebase

How do we implement validation in Express?

Validation can be done in the routes’ URI (against parameters), directly in each of the routes’ handlers (before using the data), or by using middleware that can be applied to multiple routes (less duplication of code).

For our tutorial we’ll be implementing parameter validation in the URI, and data validation middleware with help from a validation library made specifically for Express.

Parameter validation

Data currently enters our application in two different ways: POST bodies providing payloads of data, and URI parameters defined in our routes (e.g. /photos/:id).

Express provides native support for Regular Expressions (referred to as regexes) in URIs in order to do basic validation on a parameter. Regexes can become complex very quickly, but we can keep them simple for parameter validation.

For our Photo resource we allow users to take actions based on providing an ID in the URI. In our database, this ID value is an integer, so for our routes we want to ensure that the ID parameter we accept is numeric. Let’s first review our existing route, which would also match /photo/random-text:

photoRouter.get('/:id', lookupPhoto, function(req, res) {

Express allows us to add a regex directly after the name of the parameter. We want to allow any length of number:

([0-9]+)

Where + indicates one or more of the preceding pattern (a digit 0-9).

photoRouter.get('/:id([0-9]+)', lookupPhoto, function(req, res) {

Now if someone were to request /photo/random-text, the request will not be captured by this route, and Express will proceed to check the rest of the router for a match (resulting in Cannot GET /photo/random-text).

Data validation

Our API has two different routes which currently accept data: create (POST) and update (PATCH) of a resource. Both data objects should accept values for fields found in our Postgres database.

By using placeholders in the creation of our SQL queries (e.g. WHERE id = $1) we are protecting ourselves against SQL injection. But what if we weren’t using a library that took care of this for us? What if we were constructing the SQL manually and executing it?

var sql = "SELECT * FROM photo WHERE id=" + photoId;

What if someone were to pass the following in place of an actual ID?

0%20OR%20id%20>%200

Example of SQL injection while constructing SQL queries manually without validation

I was able to return a record (in the screenshot above) because the data I supplied evaluated to an additional expression, circumventing the logic:

SELECT * FROM photo WHERE id=0 OR id > 0

This particular attack would be mitigated by our parameter validation.
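For contrast, here is a minimal sketch of the same lookup using a placeholder, following the postgres.client.query pattern our existing routes already use (error handling is omitted here, as it is elsewhere in this tutorial):

// The ID is passed as data alongside the SQL text, so Postgres
// never interprets it as part of the query itself
var sql = 'SELECT * FROM photo WHERE id = $1';
postgres.client.query(sql, [photoId], function(err, result) {
  // result.rows contains at most the single matching record
});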

It’s not all about security

Data validation plays a large role in the user experience of your API, and health of your dataset. By validating incoming data you’re able to catch typos or mistakes by a client in constructing a request. This means that your final data set is predictable and clean.

Perhaps a client attempts to create a Photo resource object for an Album which does not exist yet? Or supplies a description which is longer than your application can support? You want to prevent these records from being stored in the database.

The data validation library we’ll be using is express-validator.
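It can be installed from npm like any other dependency:

npm install express-validator --save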

Why not write our own validation library?

When you’re just learning about validation it can be good to attempt to write the validation yourself and learn the basics, but when you’re building a product you want to spend as much time as possible enhancing your product and not building bricks for the foundation.

Open-source software projects also carry the benefit of generally being built by developers who focus in that particular area of expertise. This results in their implementation being more comprehensive than something you may write yourself.

What does our inbound request look like?

For this section we’ll be using the POST /photo route as our guinea pig. When a client attempts to create a new Photo resource object there are three parameters which are required: description, filepath, and album_id.

What are our expectations of these values?

  • description — should be text, of a certain length
  • album_id — should be numeric
  • filepath — we will return to this value later in the tutorial

Let’s start by loading the Express Validator middleware to our index.js file:

var express = require('express');
var bodyParser = require('body-parser');
var expressValidator = require('express-validator');

var app = express();
app.use(bodyParser.json({ type: 'application/json' }));

// We add the middleware after we load the body parser
app.use(expressValidator());

What this is doing is extending the standard Express request object with additional helper functions for validating post body, query or parameter data.

To enable us to re-use the same validation on the PATCH route, we need to make our own middleware that leverages the functions which Express Validator gives us.

function validatePhoto(req, res, next) {
  req.checkBody('description', 'Invalid description').notEmpty();
  req.checkBody('album_id', 'Invalid album_id').isNumeric();

  var errors = req.validationErrors();
  if (errors) {
    var response = { errors: [] };
    errors.forEach(function(err) {
      response.errors.push(err.msg);
    });
    res.statusCode = 400;
    return res.json(response);
  }
  return next();
}

So what’s happening here?

First we are using the checkBody helper function that is supplied from Express Validator to test the description (not empty), and the album_id (numeric). checkBody works by supplying the field name, and an error message. Then on the resulting object we can chain specific tests (in this case, notEmpty and isNumeric) on the values for those fields.

req.checkBody('description', 'Invalid description').notEmpty();
req.checkBody('album_id', 'Invalid album_id').isNumeric();

Hold on, what is method chaining (cascading)?

Method chaining is the process of calling one method after another on a particular object. In our example: we first created a validation object using the Express-Validator library (checkBody) then we chained the applicable tests. Here’s an example of chaining methods to affect a string:

function MyString(value) {
  this.value = value;
  return this;
}

MyString.prototype.upperCase = function() {
  this.value = this.value.toUpperCase();
  // Because we return the appropriate context: this
  // We're able to continue calling methods
  return this;
};

MyString.prototype.addQuotes = function() {
  // Our mutated value can be updated again
  this.value = '"' + this.value + '"';
  return this;
};

var str = new MyString('Hey, this is awesome!');
str.upperCase().addQuotes();
console.log(str.value); // "HEY, THIS IS AWESOME!"

Pretty cool, right?

Next in our example we are checking whether the validation library detected any issues with the tests which were performed:

var errors = req.validationErrors();

From the documentation of Express Validator, we know that errors is an array when something is wrong, and falsey if all tests pass. This allows us to return a specific error response (status code 400, Bad Request), or continue to the next middleware (the route handler).

To actually use this middleware we simply inject it before our route handler (just like our lookup function from Part 1):

photoRouter.post('/', validatePhoto, function(req, res) {

Request sent to our API without a description value

Pagination

When dealing with large amounts of data, it’s important to paginate. Pagination is useful from a performance perspective in that it decreases the amount of large queries against your data set, but it is also useful from the client’s perspective as they can request only the data they need.

Pagination is done in our queries by using the concepts of OFFSET and LIMIT. Offset specifies the record in the list of results at which we want to start, and Limit denotes the total number of records to be returned.

SELECT * FROM table LIMIT 10;

In this example, we are asking for 10 rows from the beginning of our table.

SELECT * FROM table OFFSET 10;

In this example, we ask for all rows from the table starting at record 10.

Combining these two we can retrieve any subset of records. If we wanted to retrieve 10 records after the first 10 we specify it as such:

SELECT * FROM table OFFSET 10 LIMIT 10;

How do we apply this to our API?

First we need the client to be able to tell us which records they want. This is implemented with query strings. We can allow the user to specify two parameters: Page and Limit.

Limit is the number of records to be returned, on which we will set a maximum and default. Page will begin at 1 and increase the offset of records returned.

Validation should be performed on the query parameters passed, and default values should be supplied (our default limit will be 10, maximum 50):

// parseInt attempts to parse the value to an integer
// it returns a special "NaN" value when it is Not a Number.
var page = parseInt(req.query.page, 10);
if (isNaN(page) || page < 1) {
  page = 1;
}

var limit = parseInt(req.query.limit, 10);
if (isNaN(limit)) {
  limit = 10;
} else if (limit > 50) {
  limit = 50;
} else if (limit < 1) {
  limit = 1;
}

To apply these settings we first need to perform a query which tells us how many records there are. This sets us up to calculate the total number of pages.

For our tutorial, we have no WHERE clause so this query is simply counting all of the records in the table, but if you allowed filtering you would need to also specify those criteria here:

var sql = 'SELECT count(1) FROM photo';
postgres.client.query(sql, function(err, result) {
  var count = parseInt(result.rows[0].count, 10);
});

Because the count returned is a string, we must parse it to an integer in order to perform some basic calculations.
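With the count parsed, the total number of pages is a one-line calculation (a small sketch; the variable name is just for illustration):

// Round up so a partially-filled final page still counts as a page
var pages = Math.ceil(count / limit);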

We now calculate the offset. The page number is reduced by one, and multiplied by the limit per page. This gives us the starting record for our query.

var offset = (page - 1) * limit;

We have our lower and upper bounds for our final query (offset and limit), let’s build the SQL to return our paginated results:

sql = 'SELECT * FROM photo OFFSET $1 LIMIT $2';
postgres.client.query(sql, [offset, limit], function(err, result) {
  res.json(result.rows);
});
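Putting the pieces together, the list route could look roughly like the sketch below. The X-Total-Count header and the exact nesting of the two queries are assumptions for illustration, not necessarily how the sample repository lays it out:

photoRouter.get('/', function(req, res) {
  var page = parseInt(req.query.page, 10);
  if (isNaN(page) || page < 1) {
    page = 1;
  }
  var limit = parseInt(req.query.limit, 10);
  if (isNaN(limit)) {
    limit = 10;
  } else if (limit > 50) {
    limit = 50;
  } else if (limit < 1) {
    limit = 1;
  }
  var offset = (page - 1) * limit;

  postgres.client.query('SELECT count(1) FROM photo', function(err, countResult) {
    var count = parseInt(countResult.rows[0].count, 10);
    var sql = 'SELECT * FROM photo OFFSET $1 LIMIT $2';
    postgres.client.query(sql, [offset, limit], function(err, result) {
      // Expose the total so clients can work out how many pages exist
      res.setHeader('X-Total-Count', count);
      res.json(result.rows);
    });
  });
});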

So now we can start our server and test some requests. If you receive an empty result, simply create some sample data in your database.

If you look carefully, our page number has changed and a different result is returned

Multi-part forms

While not actually a concept of REST APIs, it’s important to understand the difference between types of forms and requests which are pointed at your API.

So far we have looked at sending data in the POST body of an HTTP request, and we have seen how to retrieve data from a URI by using parameters, but one thing we haven’t touched on is how someone might send a file to our server.

Enter multi-part forms.

A multi-part form is a method of sending data from your browser to a remote server via the POST method in HTTP. To enable multi-part in a particular HTML form we simply specify an enctype (encoding type) attribute. Without this attribute the browser will not attempt to read in data from the local system to send to the server.

<form action="/photos" method="post"
  enctype="multipart/form-data">

By adding the enctype attribute to our form the browser now knows to look for inputs with the file type, read the contents of the file specified and send it unencoded to the server.

The unencoded part is key. If we were to submit our form, including file inputs, with no enctype attribute the values in the form are converted to ASCII (using the default application/x-www-form-urlencoded enctype). This will not allow your server to store the file in its original state.

We can modify our POST / route to receive the file upload. For testing purposes, we’ll include an HTML route in our app so we can try the upload (this route would be removed later).

var uploadRouter = express.Router();

app.set('views', './views');
app.set('view engine', 'ejs');

uploadRouter.get('/', function(req, res) {
  res.render('form');
});

app.use('/upload', uploadRouter);
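The res.render('form') call assumes a views/form.ejs template. A minimal sketch of what it could contain is below; the markup is an assumption, but the field names (description, album_id, photo) match what our validation and Multer code expect, and the action assumes photoRouter is mounted at /photo as in Part 1:

<!-- views/form.ejs -->
<form action="/photo" method="post" enctype="multipart/form-data">
  <input type="text" name="description" placeholder="Description">
  <input type="text" name="album_id" placeholder="Album ID">
  <input type="file" name="photo">
  <button type="submit">Upload</button>
</form>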

A great module to use for parsing the multipart data is Express’ Multer middleware. You simply supply some options to its middleware and it takes care of all of the parsing and saving of the files.
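Multer is installed from npm like the other dependencies:

npm install multer --save

It then needs to be required near the top of the file that defines photoRouter (an assumption about the Part 1 file layout), e.g. var multer = require('multer');, so the middleware calls below resolve.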

To receive/parse the file data we need to modify the photoRouter route for POST / which will allow us to save the file, then store it in the database.

We simply add the Multer middleware function before our photo validation function in the route definition:

photoRouter.post('/', multer(), validatePhoto, function(req, res) {

Why isn’t the Multer middleware the same as our validation one?

There are different ways of approaching middleware, but the basic concept is ultimately the same: A function is returned which accepts a Request and Response object.

For our validation function we are simply allowing Express to invoke the function with the aforementioned arguments.

What Multer is doing is accepting an options argument which affects the function that is ultimately called by Express. Here is an example of a closure, using an option to deny a request to a particular URL:

function (opts) {
  var denyUrl = opts.denyUrl;
  return function(req, res, next) {
    if (req.url === denyUrl) {
      return next(new Error('Bad URL'));
    }
    return next();
  };
}
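As a hedged usage sketch, if we assigned that factory to a variable (denyRequests and the URL below are purely illustrative names), registering the configured middleware looks just like calling multer(options):

var denyRequests = function(opts) {
  var denyUrl = opts.denyUrl;
  return function(req, res, next) {
    if (req.url === denyUrl) {
      return next(new Error('Bad URL'));
    }
    return next();
  };
};

// The options are captured by the closure; Express only ever
// sees the inner function(req, res, next)
app.use(denyRequests({ denyUrl: '/secret' }));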

To give you an idea: Multer is essentially receiving your options and then making assertions on the request object (such as the sizes of files being uploaded) and then continuing to the route handler if all checks pass.

By default the middleware accepts any number of files, of any size, and stores them in its default destination directory: an uploads folder in your project’s root.

We learned about validation for request data earlier, but how can we apply this to file uploads?

We definitely don’t want to allow uploading of just any files, and we want to be able to limit the file sizes so it doesn’t fill up our hard drive.

By looking at the documentation for the Multer module we can see that they accept options by passing an object to the middleware function. The first thing we want to define is our destination directory, using the dest property:

photoRouter.post('/', multer({
  dest: './uploads/'
}), validatePhoto, function(req, res) {

The file that we receive from the client could be named anything, so Multer provides a rename option which accepts a function that receives two arguments: the field name (of the file input), and the original filename. We simply return the filename we ultimately want to store the file under:

photoRouter.post('/', multer({
  dest: './uploads/',
  rename: function(field, filename) {
    filename = filename.replace(/\W+/g, '-').toLowerCase();
    return filename + '_' + Date.now();
  }
}), validatePhoto, function(req, res) {

We are able to remove any non-word characters using a regex, normalize the filename to lowercase, and append the date stamp for when the file was uploaded.

To prevent clients from eating up our server’s disk space we can specify a limit on both the number of files, and the individual file size:

photoRouter.post('/', multer({
  dest: './uploads/',
  rename: function(field, filename) {
    filename = filename.replace(/\W+/g, '-').toLowerCase();
    return filename + '_' + Date.now();
  },
  limits: {
    files: 1,
    fileSize: 2 * 1024 * 1024 // 2mb, in bytes
  }
}), validatePhoto, function(req, res) {

One last modification we want to make is to ensure that the file has not been truncated. The truncated property will be true if the file violated the file size limit. We can check this value, as well as ensure that the file field we’re looking for exists, in our validatePhoto middleware.

function validatePhoto(req, res, next) {
  // The file does not exist, something bad happened
  if (!req.files.photo) {
    res.statusCode = 400;
    return res.json({
      errors: ['File failed to upload']
    });
  }

  // The file was determined to be too large
  if (req.files.photo.truncated) {
    res.statusCode = 400;
    return res.json({
      errors: ['File too large']
    });
  }

At this point the file is uploaded to our liking, but how do we track the file in our database? Multer appends a files attribute to the Request object, which is an object of the files uploaded (keyed by form field name).

Knowing our form fields, and that our field name is photo we’re able to modify the database query to now store the real filepath value in the database.

var data = [
  req.body.description,
  req.files.photo.path,
  req.body.album_id
];

postgres.client.query(sql, data, function(err, result) {
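The sql variable itself is the INSERT statement from Part 1; something along these lines (the exact column list and RETURNING clause are assumptions about that code):

// Assumed shape of the Part 1 insert; columns follow the data array above
var sql = 'INSERT INTO photo (description, filepath, album_id) VALUES ($1, $2, $3) RETURNING *';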

So let’s restart our server, load up http://localhost:3000/upload and attempt to upload a file. If you inspect your project files after attempting to upload you should see an uploads directory which contains your files.

Integration testing

So you’ve written some basic REST API endpoints, and you’re going to keep building on top of them. How do you prevent yourself from accidentally breaking what you already have?

One way is to completely ignore it and hope for the best: you don’t make mistakes, right? (Not recommended.) Another way is to exhaustively test each possible scenario manually whenever you change something (feasible, but prone to error).

Integration testing is a way for us to run checks against/through our API without having to actually test it manually. We can make changes to the code, run a set of defined tests and quickly know whether our changes had an effect on other areas of our codebase. The test is the same each time it runs, unlike a manual test which can have slight variations in order of operations and parameters.

Code testing itself can be daunting. It comes in many forms, and there are many tools available to help you with it, which can make it confusing to get started.

The most popular testing framework for Node.js is Mocha.

Mocha works by using describe statements to outline the functionality you’re testing, and then specifying assertions in each of the callbacks to actually test your expected outcomes. Let’s look at an example of verifying 1+1=2.

First we install Mocha globally with npm; this allows us to use the mocha command in the CLI to run our test files:

npm install -g mocha

Then we create a test file, let’s put ours under the test directory as example.js:

// We're testing Math right now
describe('Math', function() {
  // And in Math, we have Addition
  describe('Addition', function() {
    // Then we provide specific test cases
    it('should return 2 when 1 is added to 1', function() {
    });
  });
});

So we’ve now created the structure for a test about Math, specifically Addition, which verifies that when 1 is added to 1 it equals 2.

To actually verify this is the case, we use a built-in library called Assert.

We require the module at the top of our file, then add assertions to the it blocks (our specific test cases):

it('should return 2 when 1 is added to 1', function() {
  // Typically we would be running part of your application's code
  var result = 1 + 1;
  // Equal takes your test case, and the expected outcome
  assert.equal(result, 2);
});

We can now run our tests from the command line using Mocha:

> mocha test/example.js

  Math
    Addition
      ✓ should return 2 when 1 is added to 1

  1 passing (3ms)

Of course, you don’t actually need to test basic Math functionality.

And we can introduce other sub-sections, or assertions (test cases) to expand our example test file.

In this example we’ve added a Subtraction section along with an extra test for Addition.
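Since that screenshot isn’t reproduced here, a sketch of what the expanded test file might look like (the extra cases are illustrative):

var assert = require('assert');

describe('Math', function() {
  describe('Addition', function() {
    it('should return 2 when 1 is added to 1', function() {
      assert.equal(1 + 1, 2);
    });
    // The extra Addition test case
    it('should return 0 when -1 is added to 1', function() {
      assert.equal(-1 + 1, 0);
    });
  });
  // The new Subtraction section
  describe('Subtraction', function() {
    it('should return 1 when 1 is subtracted from 2', function() {
      assert.equal(2 - 1, 1);
    });
  });
});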

With integration testing you’re verifying that when pre-determined inputs enter your application, a specific outcome is achieved on the other end.

In our basic example we were testing code which only existed in the test file itself. Our REST API is an actual server; the code needs to be running for us to test particular outcomes.

We’re able to leverage a library built specifically for testing Express applications called Supertest. This library wraps an Express application, evaluating what happens when certain inputs (requests) are sent to the application.
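Like Mocha, Supertest comes from npm (typically installed as a development dependency):

npm install supertest --save-dev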

We will still leverage Mocha for running our tests, but instead of doing simple math we will have Supertest activate our server and test a particular request.

Let’s start by outlining what we actually want to test in describe statements:

var assert = require('assert');
var request = require('supertest');
var app = require('../index');

describe('Tutorial REST API', function() {
  describe('Create photo', function() {
    it('returns the created resource on success', function(done) {
    });
    it('returns 400, with error message on bad request', function(done) {
    });
  });
});

The goal is to cover all of the potential outcomes; the more coverage you have, the more robust and resilient your API will be (less prone to failure).

You’ll notice that the callback for our it statements now has an argument called done. Done is the callback function used by Mocha to know when you’ve completed running your test subject’s code. In this case, when Supertest has finished running its requests.

Now we can specify the Supertest itself:

it('returns the created resource on success', function(done) {
  var validPhotoResource = {
    description: 'Photo created on ' + Date.now(),
    filepath: '/path/to/photo.jpg',
    album_id: 1
  };

  request(app)
    .post('/photo')
    .field('description', validPhotoResource.description)
    .field('album_id', validPhotoResource.album_id)
    .attach('photo', __dirname + '/abe.jpg')
    .expect(201)
    .end(function(err, res) {
      if (err) {
        return done(err);
      }
      assert.equal(res.body.description, validPhotoResource.description);
      assert.equal(res.body.album_id, validPhotoResource.album_id);
      done();
    });
});

So what is happening? First we are specifying what we believe is a valid POST body for creating a Photo resource object.

We then pass our app object to Supertest (named request). This allows Supertest to actually run our app code when we run our tests. Supertest then provides a number of chainable functions to both specify what the request should be, and what we expect the response to be.

Our first example shows us making a POST request to /photo with a description and an album_id property in the POST body.

request(app)
  .post('/photo')
  .field('description', validPhotoResource.description)
  .field('album_id', validPhotoResource.album_id)

Our API expects a multi-part file to be uploaded along with the description and the album id fields. How do we simulate uploading a file?

Supertest has an attach method which allows us to simulate someone attaching a file to their form upload.

request(app)
  .post('/photo')
  .field('description', validPhotoResource.description)
  .field('album_id', validPhotoResource.album_id)
  .attach('photo', __dirname + '/abe.jpg')

The file has to exist inside of our test directory in order for the test to pass, since it needs a test subject to attempt to upload.

We then expect that the response will include all of our values supplied, an id property, and the HTTP status code 201. This signifies that the resource was created properly.

.expect(201)
.end(function(err, res) {
  if (err) {
    return done(err);
  }
  assert.ok(res.body.id);
  assert.equal(res.body.description, validPhotoResource.description);
  assert.equal(res.body.album_id, validPhotoResource.album_id);
  done();
});

If we go to the command line and run it we find that our test fails. We get a really long stacktrace and some red text. Why is that?

The problem is that when Supertest starts up our server it’s unable to connect to the database, because the database itself is initialized using the bin/www script in our repository.

We can solve this a few different ways, but the simplest for a beginner is to create a test database, and initialize the Postgres client before the tests run.

Mocha provides a number of helpful interfaces for setting up the environment before the tests actually run. One of these interfaces is before. Any code specified in a before block runs, you guessed it, before the tests in that context.

All we have to do is import our Postgres client code from lib/postgres.js and initialize it before the tests run:

var pg = require('../lib/postgres');

var DATABASE_URL = 'postgres://username:password@localhost/test';

describe('Tutorial REST API', function() {
  before(function(done) {
    pg.initialize(DATABASE_URL, done);
  });

This ensures that there is a database connection available for the test cases to run against. (You can create your test database by just copying the existing api sample database).
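With Postgres, copying an existing database can be done with the template option; the database names below are assumptions, so substitute whatever you used in Part 1:

CREATE DATABASE test TEMPLATE api;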

Now when we run our tests we should see something like this:

> mocha -R spec test/api.js

  Tutorial REST API
    Create photo
      ✓ returns the created resource on success

  1 passing (35ms)
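With the happy path covered, the second case from our describe block, returning 400 with an error message on a bad request, can be fleshed out along the same lines. Here is a sketch that omits album_id so the isNumeric() check fails:

it('returns 400, with error message on bad request', function(done) {
  request(app)
    .post('/photo')
    .field('description', 'A photo with no album')
    // album_id intentionally omitted so validation rejects the request
    .attach('photo', __dirname + '/abe.jpg')
    .expect(400)
    .end(function(err, res) {
      if (err) {
        return done(err);
      }
      // Our validation middleware responds with an errors array
      assert.ok(res.body.errors.length > 0);
      done();
    });
});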

We can develop further with confidence knowing that we will be notified of any breaking changes via running our tests.

Where to go from here?

Modern web APIs offer much more than we’ve accomplished so far. What are some things we’re missing that would be good to add?

To view the code for our API thus far visit the part-two branch in the sample app’s repository: https://github.com/jeffandersen/node-express-sample/tree/part-two

If you have any questions or comments don’t hesitate to contact me directly via Twitter @jeffandersen, or simply comment on this post.

This tutorial has been built specifically for the purpose of exploring the creation of a Node.js REST API with UIT Startup immersion students. Check out UIT at http://uitstartup.org or on Twitter @UITStartup
