Express Middlewares, Demystified

Viral Shah
8 min read · Nov 1, 2018

Express application: Middleware Layer Stack

In the last few years, Express has been under the microscope for its position as the best web framework in JavaScript. Many worthy competitors have come up, namely Koa, Hapi, Fastify and Restify. While all these frameworks may have some benefits over Express, they are still far behind in the popularity race.

Web frameworks download trends (courtesy NPM trends)

I’ve used Express in many of my Node.js applications. Express is the oldest web framework on the block, and a simple “Hello World” Express application does look very clean and easy:

Hello World Express App

However, the moment you start making a real production app, things get messy. You have to add a bunch of best-practice middlewares, some routers, the actual API routes, error handlers, etc. Even assuming you did your usual code cleanup, refactoring and all the good stuff, your app will still look something like this:

Real world Express Production app

If you pay close attention to the code above, you will realize that the .use() method in the Express app is heavily overloaded.

What’s up with the “.use()” method?

Let us look at all the variations of the .use() method:

  • It is used for registering Routers, Middlewares, Routes & Error handlers.
  • It can be used with or without a path as the first param.
  • Both an App and a Router have a .use() method.

Great!
It means we just have to use this .use() method everywhere and everything will just magically work, right?
Well, in most cases yes, but not necessarily.
If you jumble the order of all these calls and test your app again:

Best case: Your code will not work
Worst case: Your code will work in unexpected ways! 😮

I speak from experience! 😓
One fine day, I got frustrated with all the magic and decided to look behind the curtain and dive deeper into the source of Express.js. After hours of reading & debugging the source, I finally understood it. This post shares some of what I learned.

App == Router

Yes, every Express app at its core is a Router. When we create an app using express(), we are essentially creating a root-level Router. The reverse is true as well: every Router is like a mini-app in itself. In fact, that is why the .use() method of both Router and App looks so similar.

So that raises the question: what is really in this Router?

Well, a Router mainly consists of two things:

  1. handle() function
    It is the function that processes all the requests received by the Router
  2. Layer-stack
    It is a stack of Layers registered on the Router. I will soon get into the details of a Layer, but for now just understand that every Layer has a path and its own handle function. Every time we call the .use() method on an Express app or Router, we are basically creating a new Layer in the Router’s stack.
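
To make the Layer-stack concrete, here is a minimal sketch in plain JavaScript. It is not Express's actual source: MiniRouter and the shape of its Layer objects are hypothetical names chosen for illustration, and the sketch leaves out the handle() function entirely. It only shows that every .use() call, with or without a path, appends one Layer to the stack.

```javascript
// A toy model of a Router's Layer-stack. MiniRouter is a hypothetical name;
// Express's real Router does considerably more than this.
class MiniRouter {
  constructor() {
    this.stack = []; // Layers, kept in registration order
  }

  // use(handler) or use(path, handler): every call adds exactly one Layer.
  use(pathOrHandler, maybeHandler) {
    const hasPath = typeof pathOrHandler === 'string';
    this.stack.push({
      path: hasPath ? pathOrHandler : '/', // no path given: default to the root path
      handle: hasPath ? maybeHandler : pathOrHandler,
    });
    return this; // allow chaining
  }
}

const router = new MiniRouter();
router.use(function logger(req, res, next) { next(); });
router.use('/admin', function adminOnly(req, res, next) { next(); });

console.log(router.stack.map((layer) => `${layer.path} -> ${layer.handle.name}`));
// [ '/ -> logger', '/admin -> adminOnly' ]
```

The order of the array is the order of registration, which is why the order of .use() calls matters so much later on.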

Layers

A Layer can be one of the following things,

  1. Middleware
    A function with signature func(req, res, next) . A Middleware usually runs some piece of code, optionally modifies the request or response and at the end, either sends the response or calls the next Layer.
  2. Route
    A Route consists of the actual request handlers for processing one or more HTTP method types (GET, PUT, POST…). A Route method’s handler has the same signature as a middleware, func(req, res, next). Typically, it contains the business logic to process the request and send a response. In case of an unexpected error, it can throw the error or call the next() function, passing the error as its first param.
  3. Error handler
    These are the functions responsible for handling the errors thrown by any previous Layer, or passed on by a previous Layer via next(). They have a signature of func(error, req, res, next). When defining them, it is absolutely necessary to have all four params in the signature. This is the only way the app can differentiate between errorHandlers and other middlewares.
  4. Another Router
    As I mentioned before, a Router is like a mini-app. The only difference is that a Router will usually be registered on the main Express app using a path. It is both contained in a Layer and has its own stack of Layers. This kind of nested structure of Routers allows us to create modular mini-apps within an Express app. Routers are created by invoking the Router() method on the express object.
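
The four-param rule is easier to remember once you see what it relies on: in Express 4 the distinction is made by looking at the number of parameters the function declares (its length property). The sketch below illustrates that idea in plain JavaScript; classify() is a hypothetical helper written for this post, not an Express API.

```javascript
// JavaScript exposes the number of declared parameters as fn.length.
// classify() is a hypothetical helper, not part of Express.
function classify(fn) {
  return fn.length === 4 ? 'error handler' : 'middleware or route handler';
}

function middleware(req, res, next) { next(); }
function errorHandler(err, req, res, next) { res.statusCode = 500; }
// Dropping the unused `next` leaves three params, so this one would NOT be
// recognised as an error handler.
function brokenErrorHandler(err, req, res) { res.statusCode = 500; }

console.log(classify(middleware));         // middleware or route handler
console.log(classify(errorHandler));       // error handler
console.log(classify(brokenErrorHandler)); // middleware or route handler
```

This is why an error handler that omits the unused next param silently stops behaving like an error handler.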

Request Handling

Okay. Now that we understand the basic structure of our App, let us see how a request is actually handled.

  • Iterating the Layer stack
    When the handle method of the Router receives a new request, it starts processing the request by looping through the Layer-stack. The Router will loop through the Layers and call the handle function on every Layer with a matching path.
  • Path Matching
    Path matching refers to matching the path provided for the Layer in .use(path, handler) against the request URL. When we do not provide a path in the .use() method, the Layer defaults to the root path of the Router, which means the Layer will match all requests passing through the Router.
  • Nested Layers
    To understand nested Layers, let us look at code that creates a new Router.
Express Router
    Here, we are defining a new adminRouter and registering it on the path /admin. This creates a new Layer on the app’s root Router, with path /admin. After that, any middlewares or routes registered on the adminRouter will create a new Layer inside the Layer-stack of the adminRouter.
  • Error Handling
    The errorHandlers are responsible for all the error handling logic in an Express app. They are part of the same stack as the middlewares, routers & routes. However, as I said before, they have a different signature, with error as an additional first param: func(error, req, res, next).
  • handle_request vs handle_error
    The handle function we define for any Layer is actually called through a wrapper function, either handle_request or handle_error. While iterating through a Layer-stack, the Router keeps track of a variable called LayerError, which is initialized as null, so the Request starts out in a non-errored state. While iterating through the stack, if any Layer throws an Error or passes some object via the next function, like next(someObject), the LayerError will store that error/object and the Request is now considered to be in an errored state*

* A small exception is when someObject is the string “route” or “router”.
That is a special instruction to skip the remaining Layers of the current route or router.

  • So, as long as the Request is in a non-errored state, the Router will keep calling the handle_request method for all Layers. Here, if the underlying Layer is an error handler, its handle method will not be called. Similarly, when the Request is in an errored state, the Router will switch to calling the handle_error method, which will only call the underlying handle method of the Layer if it is an errorHandler.
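
The switching between the two wrappers can be sketched in a few lines of plain JavaScript. This is a simplified model written for this post, not Express's source: handleRequest, handleError and handle are illustrative names, path matching is reduced to a plain prefix check, and the “route”/“router” strings only clear the error state here rather than skipping anything.

```javascript
// Toy versions of the two wrappers around a Layer's handle function.
function handleRequest(layer, req, res, next) {
  if (layer.handle.length > 3) return next(); // an error handler: skip it in the normal state
  try { layer.handle(req, res, next); } catch (err) { next(err); }
}

function handleError(layer, error, req, res, next) {
  if (layer.handle.length !== 4) return next(error); // not an error handler: skip it
  try { layer.handle(error, req, res, next); } catch (err) { next(err); }
}

// Walk the stack once, carrying the error state from Layer to Layer.
function handle(stack, req, res) {
  let index = 0;
  function next(err) {
    // 'route' and 'router' are skip instructions, not errors (simplified here).
    const layerError = err === 'route' || err === 'router' ? null : err;
    let layer;
    // Find the next Layer whose path is a prefix of the request url (simplified matching).
    do { layer = stack[index++]; } while (layer && !req.url.startsWith(layer.path));
    if (!layer) return; // end of the stack
    if (layerError) handleError(layer, layerError, req, res, next);
    else handleRequest(layer, req, res, next);
  }
  next();
}

const calls = [];
const stack = [
  { path: '/', handle: (req, res, next) => { calls.push('first'); next(new Error('boom')); } },
  { path: '/', handle: (req, res, next) => { calls.push('second'); next(); } },
  { path: '/', handle: (err, req, res, next) => { calls.push(`error handler: ${err.message}`); } },
];
handle(stack, { url: '/users' }, {});
console.log(calls); // [ 'first', 'error handler: boom' ]
```

The second Layer matches the path but is never run, because by the time the Router reaches it the Request is already in an errored state.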

Phew!! I know that was too much to “handle”.
But if you are with me till now,
you are close to understanding all the Express magic!

Request Process Flow

With all that we have learned about Express internals, let us now understand how a Request is actually processed. Referring to the same code above, let us say we wanted an admin to get the details of all users. To do so,

  1. We will make an HTTP GET request by concatenating the path of Route’s API and the path of all its parent Routers i.e. GET /admin/users
  2. When the Express app receives the request, it will pass it to the root Router’s handle method.
  3. The root Router will first pass the Request through the app-level middlewares defined on top, namely Helmet and Compression.
  4. Next, this request will try to match the adminRouter’s path /admin. Since our request matches the /admin path, it will go inside the Layer stack of the adminRouter.
  5. It will run through the verifyAdminMiddleware() defined on the adminRouter. This middleware will verify if the client requesting is indeed an admin.
    If admin — it will simply call next() method without any param.
    If non-admin — it can either send a 401 or 403 error response, and end the Request-Response cycle here. Optionally, it can continue the cycle by throwing an Error or passing an Error via next(error). Either way, the error passed is stored in the LayerError.
  6. If there is no LayerError yet, the app will invoke the getUsers() function. The function fetches the user details and sends them to the client with a 200 success Response, thus ending the Request-Response cycle here.
  7. However, if there is a LayerError from the previous step, the API handler getUsers() will be bypassed, even though its path matches.
  8. If any of the previous steps have passed down a LayerError, the next Layer, i.e. notifyErrorHandler, will be invoked. This happens irrespective of whether the Response has been sent to the client.
  9. The notifyErrorHandler will probably log the error & send alerts. If the Response is not yet sent, it can choose to send it. It can optionally pass the error to the next Layer by calling next(error).
  10. Our globalErrorHandler is like a final catch-all error handler. It will be called if an error was passed on from the previous middlewares and error handlers. It will send a Response to the client, if not already sent. Strictly speaking, it is not needed, since the Express app also adds its own finalHandler at the end of the Layer stack.
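
The steps above can be replayed with a toy router in plain JavaScript. Everything here is a sketch under assumptions: createRouter is a hypothetical stand-in for Express, helmet and compression are empty placeholder functions rather than the real packages, the route is registered with .use() instead of .get(), the two error handlers are assumed to be registered on the app after the adminRouter, and the response values are made up.

```javascript
// Toy router, NOT Express: just enough to replay the steps above.
function createRouter() {
  const stack = [];

  // A router is itself a (req, res, next) handler, so it can sit inside a Layer.
  function router(req, res, done) {
    let index = 0;
    function next(err) {
      const layer = stack[index++];
      if (!layer) return done(err); // stack exhausted: hand control back to the parent
      if (!req.url.startsWith(layer.path)) return next(err); // simplified prefix match
      const isErrorHandler = layer.handle.length === 4;
      if (Boolean(err) !== isErrorHandler) return next(err); // wrong kind of Layer for this state
      const saved = req.url;
      // Handlers mounted on a path see the url with that path stripped.
      if (layer.path !== '/') req.url = req.url.slice(layer.path.length) || '/';
      const resume = (e) => { req.url = saved; next(e); };
      try {
        if (err) layer.handle(err, req, res, resume);
        else layer.handle(req, res, resume);
      } catch (thrown) {
        resume(thrown);
      }
    }
    next();
  }

  router.use = (path, handle) => {
    if (typeof path === 'function') { handle = path; path = '/'; }
    stack.push({ path, handle });
  };
  return router;
}

const trace = [];
const helmet = (req, res, next) => { trace.push('helmet'); next(); };           // placeholder
const compression = (req, res, next) => { trace.push('compression'); next(); }; // placeholder
const verifyAdminMiddleware = (req, res, next) => {
  trace.push('verifyAdmin');
  if (req.isAdmin) return next();
  next(new Error('not an admin'));
};
const getUsers = (req, res, next) => { trace.push('getUsers'); res.status = 200; };
const notifyErrorHandler = (err, req, res, next) => { trace.push('notify'); next(err); };
const globalErrorHandler = (err, req, res, next) => { trace.push('global'); res.status = 403; };

const adminRouter = createRouter();
adminRouter.use(verifyAdminMiddleware);
adminRouter.use('/users', getUsers); // stands in for a GET route on /users

const app = createRouter();
app.use(helmet);
app.use(compression);
app.use('/admin', adminRouter);
app.use(notifyErrorHandler);
app.use(globalErrorHandler);

function request(isAdmin) {
  trace.length = 0;
  const res = {};
  app({ url: '/admin/users', isAdmin }, res, () => {});
  return { res, trace: trace.slice() };
}

const ok = request(true);
const denied = request(false);
console.log(ok.trace);     // [ 'helmet', 'compression', 'verifyAdmin', 'getUsers' ]
console.log(denied.trace); // [ 'helmet', 'compression', 'verifyAdmin', 'notify', 'global' ]
```

In the denied case getUsers never appears in the trace, and both error handlers run in registration order because notifyErrorHandler passes the error along with next(err).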

With the internals of Express and its Request handling understood, the following guidelines from the Express.js documentation become quite intuitive:

The order of middleware loading is important: middleware functions that are loaded first are also executed first.

You define error-handling middleware last, after other app.use() and routes calls.

— Because the middlewares are registered & called in order on the Layer-stack. The errorHandlers are expected to handle error thrown from other middlewares and routes, hence they should go at the end of the stack.

If the current middleware function does not end the request-response cycle, it must call next() to pass control. Otherwise, the request will be left hanging.

— Because calling next() function is the only way of telling Router that I am done and you can pass Request to next Layer.

Calls to next() and next(err) indicate that the current handler is complete and in what state. next(err) will skip all remaining handlers in the chain except for those that are set up to handle errors.

— Because passing error in next() will register a LayerError and only handle_error function will be invoked on all the matching Layers after that.

You must provide four arguments to identify a middleware as an error-handling middleware function, even if you don’t need to use all the arguments

— Because that is the only way express understands difference between ‘middleware’ and ‘errorHandlers’, which are both registered via ‘.use()’ but called by the Request in different states.

For errors returned from asynchronous functions invoked by route handlers and middleware, you must pass them to the next() function, where Express will catch and process them

Express was not built to await handlers or to handle the promises they return. An async handler function returns a promise, and when we throw an error inside it, we are simply rejecting that promise, which nobody is listening to. So the only way to pass an error is via the ‘next(err)’ function. Or is it?
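
One common way around this, sketched here in plain JavaScript, is to wrap the async handler so that a rejected promise is forwarded to next(). The asyncHandler name is a hypothetical helper, not something Express provides, and the failing getUsers below is made up for the demonstration. This assumes Express 4 behaviour, which is what this post describes.

```javascript
// Wrap an async handler so that a rejected promise ends up in next(err).
// asyncHandler is a hypothetical helper name, not part of Express.
function asyncHandler(fn) {
  // Keep exactly three params so the wrapper still looks like a normal middleware.
  return function wrapped(req, res, next) {
    Promise.resolve(fn(req, res, next)).catch(next);
  };
}

// An async route handler that fails. Without the wrapper, this rejection
// would never reach any error handler.
async function getUsers(req, res) {
  throw new Error('database unavailable');
}

// Call the wrapped handler the way a Router would, with a next() we can observe.
const forwarded = new Promise((resolve) => {
  asyncHandler(getUsers)({}, {}, (err) => resolve(err));
});

forwarded.then((err) => console.log('next received:', err.message));
```

With the wrapper in place, throwing inside an async handler puts the Request into an errored state just like a synchronous throw or an explicit next(err) would.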

For more details, I encourage you to go through the source code. If the source code is too much for you, you can also check out this amazing blog post by Soham Kamani, explaining some of the source code.

Viral Shah

Passionate programmer, Wishful writer, Rookie gamer, Potential philosopher