Making a Rails Health Check that doesn’t hit the database

In a production application you usually have many servers, and each of them is checked periodically to make sure it’s still healthy and working as expected. When it is, your load balancer can route requests to it. If a server doesn’t respond to the healthcheck, it is presumed to be dead or unhealthy, and requests are diverted to the healthy servers instead. If you’ve got an autoscaling solution set up, unhealthy servers can be killed, rebooted, and re-added to the load balancer’s pool of healthy servers.

In a Rails app, a health check controller for this process might look like:

class HealthcheckController < ApplicationController
  def alive
    User.first
    render json: { ok: true, node: "It's alive!" }
  end
end

You’d write a route so that requests matching GET https://your-app.com/healthcheck are directed to that controller action. When the healthcheck is called, it hits the DB by fetching the first User record and, if that works, returns a successful response. This is a sensible idea because, for many apps, a server that can’t talk to the database isn’t going to be much use to your users!
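For reference, the route could look something like the fragment below. It assumes the controller and action names from the example above; the rest of your routes file will obviously differ.

```ruby
# config/routes.rb (fragment)
Rails.application.routes.draw do
  # GET /healthcheck is handled by HealthcheckController#alive
  get '/healthcheck', to: 'healthcheck#alive'
end
```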

But if your database goes down, all of your servers will go down with it: none of them can reach the database, so every healthcheck fails and they’re all considered unhealthy. If you have autoscaling, all of your servers will be continuously torn down and started up. That’s what happened to the team at Buildkite, who have a really good postmortem on this issue.

“Due to an error in how we bootstrapped these new servers, the health checks failed which meant no new servers could come online to replace the ones that were removed.
Over the course of a few hours, this caused a cycle of new servers to launch and terminate instantly due to failing health checks.”
- Keith Pitt from Buildkite

Let’s not talk to the database!

You might think that all you have to do is remove the database call, and then your healthcheck will pass. Unfortunately that’s not true, because of Rails’ default Middleware stack. Middleware are bits of code that each request is passed through before it hits your app proper. You can see a list of Ruby on Rails’ default Middleware stack here.

Have a go! Remove your database call from the healthcheck action, shut down your local database, and see what happens when you hit your healthcheck. For me on Postgres I get:

PG::ConnectionBad
could not connect to server: Connection refused
    Is the server running locally and accepting connections on
    Unix domain socket "/tmp/.s.PGSQL.5432"?

Even though your healthcheck doesn’t have a database call anymore, the request still fails!

Here’s why: one of those pieces of Middleware is ActiveRecord::QueryCache (view the source here). The QueryCache middleware is executed on every request, before the request even makes it to your controller action. It connects to the database to enable a caching layer in ActiveRecord. So if a request comes in and your database is down, this middleware will throw a big fat error! Since your healthcheck sits further along the stack, in a controller action, it never gets a chance to run.

To get around this, you’ll need to create your own middleware which comes before ActiveRecord::QueryCache.
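You can see why the ordering matters without booting Rails at all. Here’s a rough sketch in plain Ruby. FakeQueryCache is a made-up stand-in for any middleware that needs the database (it is not the real ActiveRecord::QueryCache), and EarlyHealthcheck is an equally made-up middleware that answers the healthcheck itself.

```ruby
# A Rack-style middleware is an object that responds to call(env) and
# returns [status, headers, body]. All names here are invented for the sketch.

# Hypothetical stand-in for a middleware that needs the database on every request.
class FakeQueryCache
  def initialize(app, db_up:)
    @app = app
    @db_up = db_up
  end

  def call(env)
    raise IOError, 'could not connect to database' unless @db_up

    @app.call(env)
  end
end

# Hypothetical middleware that answers /healthcheck without going any deeper.
class EarlyHealthcheck
  def initialize(app)
    @app = app
  end

  def call(env)
    if env['PATH_INFO'] == '/healthcheck'
      return [200, { 'Content-Type' => 'text/plain' }, ["It's alive!"]]
    end

    @app.call(env)
  end
end

# Innermost "application", standing in for the controller action.
controller = ->(_env) { [200, { 'Content-Type' => 'text/plain' }, ['from the controller']] }

# Healthcheck behind the database-dependent middleware: it is never reached.
late_stack = FakeQueryCache.new(EarlyHealthcheck.new(controller), db_up: false)

# Healthcheck in front of the database-dependent middleware: it answers first.
early_stack = EarlyHealthcheck.new(FakeQueryCache.new(controller, db_up: false))

env = { 'PATH_INFO' => '/healthcheck' }

late_result =
  begin
    late_stack.call(env)
  rescue IOError => e
    "error: #{e.message}"
  end

early_result = early_stack.call(env)

puts late_result.inspect
puts early_result.inspect
```

With the simulated database down, the first stack raises before the healthcheck is consulted, while the second returns a 200 straight away.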

Side note: if you have a separate marketing site with its own healthcheck, it should hopefully stay up. This is still a valuable exercise if you want more granular dependency healthchecks.

Implementing a Ruby on Rails middleware healthcheck

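class MiddlewareHealthcheck
  # Built once and reused for every healthcheck request.
  OK_RESPONSE = [ 200, { 'Content-Type' => 'text/plain' }, ["It's alive!".freeze] ]

  def initialize(app)
    @app = app
  end

  def call(env)
    if env['PATH_INFO'.freeze] == '/healthcheck'.freeze
      # Answer here, before the request reaches any later middleware.
      return OK_RESPONSE
    else
      # Everything else continues down the stack as normal.
      @app.call(env)
    end
  end
end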

Save that in your app to /app/middleware/middleware_healthcheck.rb. In your /config/application.rb file, add the following line:

config.middleware.insert_after "Rails::Rack::Logger", "MiddlewareHealthcheck"

Now when your app starts, Rails will insert our new healthcheck middleware directly after the logger middleware, which also happens to be ahead of QueryCache. The middleware will “capture” any requests to /healthcheck and immediately return a 200 text response with “It’s alive!” in the response body. No more database calls!
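One caveat if you’re on a newer Rails: as far as I’m aware, Rails 5.0 deprecated passing class names as strings to the middleware stack, and later versions expect the classes themselves. A sketch of what that might look like is below. It assumes the file location used above, YourApp is a placeholder module name, and the explicit require is there because classes under app/ generally aren’t loaded yet when this file is evaluated.

```ruby
# config/application.rb (fragment; `YourApp` is a placeholder module name)
require_relative '../app/middleware/middleware_healthcheck'

module YourApp
  class Application < Rails::Application
    # Pass the classes themselves rather than their names as strings.
    config.middleware.insert_after Rails::Rack::Logger, MiddlewareHealthcheck
  end
end
```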

In the code we’ve frozen some strings to reduce object allocation, which is handy for frequently called bits of code. I’ve called it MiddlewareHealthcheck because, well, it only checks that a request reaches the Middleware layer, and nothing else. You should also add a comment to your routes.rb file, so that there’s less chance for another developer to get confused and try to add something at that request path.

So there we have it! A simple middleware for Ruby on Rails which intercepts healthcheck requests so that you don’t have to hit the database. You could, and probably should, go ahead and add more Healthchecks which test for the availability of your other dependencies, to get a more complete picture of your system’s status:

  • Is the app responding? (example above)
  • Can the app talk to the database?
  • Can the app talk to your cache layer?
  • Does the homepage load? (a more traditional end-to-end healthcheck)
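As a starting point for those extra checks, here’s a minimal sketch of a second middleware that runs a set of named checks and reports each one. Everything in it is hypothetical: the class name, the /healthcheck/dependencies path, and the stand-in checks. Real checks would be whatever “can I reach this dependency?” means in your app. You’d likely keep your load balancer pointed at the simple healthcheck above and use something like this for monitoring.

```ruby
require 'json'

# Hypothetical middleware: runs named checks and reports each result as JSON.
class DependencyHealthcheck
  def initialize(app, path:, checks:)
    @app = app
    @path = path
    @checks = checks # e.g. { 'database' => -> { ... } }
  end

  def call(env)
    return @app.call(env) unless env['PATH_INFO'] == @path

    results = @checks.transform_values { |check| passing?(check) }
    status = results.values.all? ? 200 : 503
    [status, { 'Content-Type' => 'application/json' }, [JSON.generate(results)]]
  end

  private

  # A check passes when it returns a truthy value without raising.
  def passing?(check)
    check.call ? true : false
  rescue StandardError
    false
  end
end

# Usage with stand-in checks (a real app would talk to its own dependencies).
app = ->(_env) { [404, { 'Content-Type' => 'text/plain' }, ['not found']] }
checks = {
  'database' => -> { raise IOError, 'connection refused' }, # simulated outage
  'cache' => -> { true }
}
stack = DependencyHealthcheck.new(app, path: '/healthcheck/dependencies', checks: checks)

status, _headers, body = stack.call({ 'PATH_INFO' => '/healthcheck/dependencies' })
puts status
puts body.first
```

A failing or raising check turns the whole response into a 503, and the JSON body tells you which dependency is the culprit.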



Originally published at thisdata.com on August 31, 2016.