Optimizing HapiJS for Benchmarks

In the past year or so, our team at @WalmartLabs developed and standardized Electrode, a platform for building and deploying enterprise applications with Node.js.

Among the many architectural decisions we made, one was to use Hapi as the framework powering our web servers. With that choice came constant questions about performance, along with references to the many benchmarks showing that express is X times more performant than Hapi.

As with any architectural decision, performance is only one piece of the puzzle. Hapi’s creator Eran Hammer has written a very good blog post about this. Nonetheless, instead of just explaining the other important qualities Hapi brings besides performance, such as security checks and validations, I decided to dig into the details, look at exactly what Hapi trades for performance, and see whether there is any opportunity for optimization.

Some of the reasons that we chose Hapi:

  • Well defined conventions and established app structure with validations
  • It has a plugin system that’s separated from the request lifecycle
  • It has security checks and configuration validations
  • It comes with default features in the request lifecycle such as cache and auth
  • It has a system of logging events built-in
  • There’s general familiarity with Hapi at @WalmartLabs, given that it was developed here for a while

Benchmark performance is not one of Hapi’s priorities, as its creator has explained and as various benchmarks show.

Obviously, the difference between any framework and a raw Node sample comes down to how much code is inserted between receiving the request and sending back the response. At a minimum, for any framework to be useful, the first thing required is routing by URL matching.

I am going to trace through Hapi’s request lifecycle code path to see what it does before the response is sent, and then run V8 profiling on it to see where the hot spots are.

First, some quick baseline numbers from running the simple benchmark from here for express and Hapi on our commodity servers running CentOS 7 and Node 6.10.0:

The rough numbers I got are 1550 req/sec for express and 550 req/sec for Hapi. This overhead in Hapi is constant; it shows up as a 3X difference only because there’s absolutely no business logic. If your logic takes a big chunk of the response time, then Hapi’s overhead becomes a very small percentage, and express would be only marginally faster in overall response time.
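
To see why the overhead is constant rather than proportional, convert the throughput numbers into per-request latency (the 20 ms of business logic below is a hypothetical figure, not from the benchmark):

```javascript
// Per-request latency implied by throughput, assuming back-to-back requests.
const msPerReq = (reqPerSec) => 1000 / reqPerSec;

const expressMs = msPerReq(1550); // ~0.65 ms
const hapiMs = msPerReq(550);     // ~1.82 ms
const overheadMs = hapiMs - expressMs; // ~1.17 ms of constant framework overhead

// Add a hypothetical 20 ms of business logic to both frameworks:
const logicMs = 20;
const relativeDiff =
  (hapiMs + logicMs - (expressMs + logicMs)) / (expressMs + logicMs);
// relativeDiff is ~0.057: express ends up only ~6% faster overall
```

The same ~1.17 ms of overhead that produced a 3X gap on an empty handler shrinks to single-digit percentages once real work dominates the response time.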

Next I traced through Hapi’s code when a request is received. Here is an outline of the interesting points Hapi goes through in the request lifecycle:

  • First Node’s http server emits the request event that is handled by Hapi’s Connection in _dispatch.
  • Hapi creates its own Request object, wrapping the http request.
  • Hapi has overload protection so it does load check.
  • Hapi executes request lifecycle, protecting it with Node’s Domains.
  • The first step in the request is checking URL routing.
  • Set up the response timeout.
  • Execute lifecycle steps.

The lifecycle steps are fairly straightforward:

  • First check cookies in hapi/lib/route.js
  • Check auth in hapi/lib/auth.js
  • Finally call route handler
  • The route handler does some initialization to measure timing
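
The steps above can be sketched as a short continuation-passing chain (an illustrative paraphrase of the flow, not Hapi’s actual code):

```javascript
// Each step either calls next() to continue or errors out of the chain.
function runLifecycle(steps, request, respond) {
  let i = 0;
  const next = (err) => {
    if (err) return respond(err);
    const step = steps[i++];
    if (!step) return respond(null, request.response);
    step(request, next);
  };
  next();
}

// Stub steps mirroring the outline: cookies, auth, then the handler.
const parseCookies = (req, next) => { req.state = {}; next(); };
const authenticate = (req, next) => next(); // no auth configured: skipped
const handler = (req, next) => { req.response = 'hello'; next(); };

let result;
runLifecycle([parseCookies, authenticate, handler], {}, (err, res) => {
  result = res; // 'hello'
});
```

Because the sample routes configure no cookies or auth, the first two steps are effectively no-ops, as noted below.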

Since the samples are very simple, there are no cookies or auth so those are skipped.

Two things I noticed are:

  • Hapi defaults to use Node’s Domains to protect the request lifecycle execution
  • Hapi has a debug setting on by default so it subscribes to server log events

Both can be turned off with options when creating the server; however, doing so didn’t yield any meaningful change in performance.
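
For reference, the two options look like this when creating the server (option names as of the hapi v16 line used here; check the API docs for the version you run):

```javascript
const Hapi = require('hapi');

const server = new Hapi.Server({
  debug: false,     // stop subscribing to and printing server log events
  useDomains: false // skip Domain-based protection of the request lifecycle
});
```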

Nothing else stands out in the request lifecycle code path so I turned to V8’s profiler.

  • Running the sample with NODE_ENV=production node --prof
  • Querying the server with ab
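
Concretely, that looks something like the following (flag spellings for Node 6.x; the entry file name and ab parameters are placeholders):

```shell
# Start the sample under the V8 profiler
NODE_ENV=production node --prof server.js &

# Generate load with ApacheBench
ab -n 10000 -c 50 http://127.0.0.1:3000/

# Stop the server, then turn the V8 tick log into a readable report
kill %1
node --prof-process isolate-*.log > profile.txt
```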

The results from the profiler show that the top hits are a lot of Joi validations and Podium event processing.

Setting breakpoints at joi/lib/any.js:442, I traced the calls back to Podium; they point to two lines of code in podium/lib/index.js that call Joi.attempt to validate the event object.

To verify this finding, I commented out those two lines and ran the timing check again. The result: Hapi can now do more than 1000 req/sec — double the benchmark performance, and now only a fraction behind express.

Running the profiler again, only Podium event processing now shows up in the top hits. Given that Hapi’s internal systems rely heavily on it, this is not surprising.

I decided to stop here because Podium is clearly written to be a feature-rich event emitter that is not constrained by performance requirements, and it probably contributes most of the remaining overhead compared to express. It would definitely be interesting to dig further and see how that overhead breaks down.

An opportunity for optimization may be to avoid revalidating all events in Podium. For example, in hapi/lib/request.js, when a request is created, it immediately calls Podium with an array of three static events that are unnecessarily validated with Joi for every incoming request.
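
A hedged sketch of that idea: validate each distinct static event object once and reuse the result, rather than re-running the check per request (`validateOnce` and the stand-in validator below are hypothetical, not Podium’s API):

```javascript
// Cache keyed by object identity: static event objects are shared,
// so the same reference never needs to be validated twice.
const validatedCache = new Map();

function validateOnce(event, validate) {
  if (!validatedCache.has(event)) {
    validatedCache.set(event, validate(event));
  }
  return validatedCache.get(event);
}

// Stand-in for Podium's Joi-based check; counts invocations.
let calls = 0;
const validate = (e) => { calls += 1; return e; };

const staticEvent = { name: 'request', channels: ['error'] };
validateOnce(staticEvent, validate);
validateOnce(staticEvent, validate); // cache hit: validator not re-run
// calls is 1
```

Since the three events registered per request never change shape, an identity-keyed cache like this would move the Joi cost from per-request to once per process.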

Conclusion: Hapi is a framework focused on providing an infrastructure for developing solid business applications. It trades “benchmark performance” for features such as strict checks and validations that are very useful when hundreds of developers are working on the apps. While most of these are configuration validations performed during app startup that don’t contribute to the hot code path, the event validation that Hapi’s custom event emitter Podium performs at run time adds a lot of redundant overhead, even for static event objects. In a way, this is almost like the V8 engine revalidating the same JS code every time it’s executed. Even though doing well in “benchmark performance” is not very meaningful, it would help shave a few ms off Hapi’s request lifecycle if there were a way to reuse these validations or simply turn them off in production.

Thanks to Dave Stevens for reviewing this post.