Deploying a self-contained maintenance page on AWS

Background story and approach

Nino Ulsamer
StashAway Engineering
Feb 27, 2018

Some of our deployments, especially those for larger features that involve database migrations or infrastructure components, require us to take a core service offline for some period. In monolithic times, you would usually have run something like php artisan down (oh, those fond memories of PHP…), which shows a maintenance page for any request that hits your application. However, when you have a React web application, several mobile apps, API services, and other connected infrastructure components, life becomes a bit more complicated. And what if you want to be able to enable your maintenance page even when your main services are down (e.g. during a region-wide AWS outage)?

The solution we have implemented at StashAway is entirely separate from our core application services and relies on only two AWS components: S3 and CloudFront.

Serving StashAway’s maintenance page

In the above illustration you can see that we use AWS CloudFront to serve our React web-app (app.stashaway.com) as well as our Node.js API (api.stashaway.com). Each distribution forwards requests to the respective core service running inside AWS EC2.

Whenever maintenance mode is enabled, both CloudFront distributions are modified to serve from the Maintenance Origin instead, and the following happens:

  • app.stashaway.com will return a static HTML maintenance page
  • api.stashaway.com will return 503 Service Unavailable for any request
  • A health-check endpoint on the API is periodically called by the React web-app. Once the endpoint returns 503 (see previous step), the web-app reloads itself, which causes the maintenance page to be served (see first step). This is necessary because our web-app is a Single Page Application, which would otherwise not know that maintenance mode has been activated.
  • The mobile apps also periodically call the same health-check endpoint on the API, and show a nicely designed maintenance page once it returns 503.
  • The maintenance pages on both the web-app and the mobile apps refresh themselves every minute, so that they disappear once the CloudFront distributions have been switched back to normal operations.
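The client-side half of this flow boils down to a small decision made after each health-check poll. Below is a minimal sketch of that decision, written in Python purely for illustration. The real clients are a React web-app and mobile apps, and the names `Action` and `next_action` as well as the `showing_maintenance` flag are assumptions, not StashAway's actual code.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                # nothing changed, keep going
    ENTER_MAINTENANCE = "enter"  # web: reload the page; mobile: show the maintenance screen
    LEAVE_MAINTENANCE = "leave"  # go back to the normal application


def next_action(status_code: int, showing_maintenance: bool) -> Action:
    """Decide what a client should do after one health-check poll.

    Only a 503 is treated as "maintenance mode"; any other status
    (including other errors) is treated as "not in maintenance".
    """
    api_in_maintenance = status_code == 503
    if api_in_maintenance and not showing_maintenance:
        return Action.ENTER_MAINTENANCE
    if not api_in_maintenance and showing_maintenance:
        return Action.LEAVE_MAINTENANCE
    return Action.NONE


# A client that is running normally and sees a 503 should switch over.
print(next_action(503, showing_maintenance=False))
```

On the web, leaving maintenance mode is handled by the static page refreshing itself rather than by an explicit check like this, so the `LEAVE_MAINTENANCE` branch mostly applies to the mobile apps.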

We will now examine a few more details on the above configuration.

Serving the maintenance page

The first part of this setup will have to ensure that app.stashaway.com serves our maintenance page instead of the regular (React) web-app. An additional requirement is that all sub-pages (e.g. app.stashaway.com/assets) should serve the maintenance page as well.

In order to achieve this, we will create a new S3 bucket in AWS that holds the (static) HTML/JS components of our maintenance page. Make sure that you allow public (everybody) access to the files, by clicking the “Make public” button once you’ve selected all files and folders that you have uploaded, and by adding “Public access” for “Everyone” under the bucket’s permissions.
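If you prefer to grant public read access in code rather than through the console, a bucket policy is one alternative to the “Make public” button. The sketch below only builds the policy document; the bucket name is a made-up placeholder, and this is not how the access described above was actually configured.

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Return a bucket policy (as a JSON string) that lets anyone read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # Applies to every object in the bucket, not to the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical bucket name, used only for illustration.
print(public_read_policy("example-maintenance-page"))
```

With boto3 installed, the resulting string could be passed as the `Policy` argument of the S3 client's `put_bucket_policy` call.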

We then need to enable static website hosting on our S3 bucket, so that it can easily be used as a CloudFront origin. You can do so under the tab “Properties > Static Website Hosting”. Set an index document that exists in your bucket and contains the actual maintenance page’s HTML code.

We will also add a special rule here that will redirect all traffic leading to a 404 error back to app.stashaway.com. This is so that a request for example to app.stashaway.com/assets (which does not exist in the maintenance bucket) will be redirected to app.stashaway.com (which will instead serve the default page from the maintenance bucket).
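For reference, the website hosting settings and the 404 redirect rule can be expressed as a single configuration object. The following is a rough sketch of what that could look like in the shape boto3's `put_bucket_website` expects. The index document name, the use of HTTPS, and the empty replacement key (intended to send visitors to the site root) are assumptions, so verify them against your own setup.

```python
def maintenance_website_config(
    app_host: str,
    index_document: str = "index.html",  # assumed file name
    redirect_key: str = "",              # assumed: empty key targets the site root
) -> dict:
    """Build an S3 website configuration that redirects every 404 to app_host."""
    return {
        "IndexDocument": {"Suffix": index_document},
        "RoutingRules": [
            {
                # Only requests that would otherwise end in a 404 are redirected.
                "Condition": {"HttpErrorCodeReturnedEquals": "404"},
                "Redirect": {
                    "HostName": app_host,
                    "Protocol": "https",
                    # Replace the whole requested key so /assets does not
                    # redirect to /assets again and loop.
                    "ReplaceKeyWith": redirect_key,
                },
            }
        ],
    }


config = maintenance_website_config("app.stashaway.com")
print(config["RoutingRules"][0]["Redirect"]["HostName"])

# With boto3 installed, this could be applied roughly like so
# (the bucket name is a placeholder):
#   boto3.client("s3").put_bucket_website(
#       Bucket="example-maintenance-page", WebsiteConfiguration=config)
```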

Next, we need to make sure the maintenance page will be served by CloudFront. For that we create an additional Origin and a corresponding Behavior in app.stashaway.com’s CloudFront distribution that routes all traffic to the maintenance S3 bucket that we just created. The path pattern should be set to * to match any request, but of course this must only be in effect while the maintenance page should be displayed. So during regular operations we simply change this path pattern to something like /disabled/*, so that it does not match any real traffic. Caching should be set to minimum values; otherwise it will be difficult to “get rid of” your maintenance page once maintenance mode is over.

Behavior for showing maintenance page
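Switching the path pattern back and forth is the only change needed to toggle the web-app's maintenance page, so it is easy to picture as a small function over the distribution config. The sketch below works on a plain dictionary shaped like the `DistributionConfig` returned by CloudFront's API. The function name and the origin ID `maintenance-s3` are hypothetical, and this is an illustration, not our actual tooling.

```python
import copy

ENABLED_PATTERN = "*"             # matches every request
DISABLED_PATTERN = "/disabled/*"  # matches no real traffic


def set_maintenance(distribution_config: dict, maintenance_origin_id: str, enabled: bool) -> dict:
    """Return a copy of the config with the maintenance behavior switched on or off."""
    updated = copy.deepcopy(distribution_config)
    behaviors = updated.get("CacheBehaviors", {}).get("Items", [])
    matched = False
    for behavior in behaviors:
        if behavior.get("TargetOriginId") == maintenance_origin_id:
            behavior["PathPattern"] = ENABLED_PATTERN if enabled else DISABLED_PATTERN
            matched = True
    if not matched:
        raise ValueError(f"no behavior targets origin {maintenance_origin_id!r}")
    return updated


# Trimmed-down example config; a real one has many more fields.
config = {
    "CacheBehaviors": {
        "Quantity": 1,
        "Items": [{"PathPattern": "/disabled/*", "TargetOriginId": "maintenance-s3"}],
    }
}
print(set_maintenance(config, "maintenance-s3", enabled=True)["CacheBehaviors"]["Items"][0])
```

With boto3, the input would come from `get_distribution_config` and the result would go back through `update_distribution`, passing the returned `ETag` as `IfMatch`.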

Returning 503 for all API requests

To return a 503 for all API requests we use a little trick in the CloudFront config. First of all, we create another S3 bucket that contains only a single file called down.json, which holds a specific JSON object to be returned by the health-check endpoint whenever maintenance mode is enabled.

The remaining config should be the same as for the first bucket (allow public access, enable static website hosting, this time without the redirect rule).

We then add another Origin and Behavior to the API’s CloudFront distribution, which routes all traffic to this new bucket (configuration equivalent to the web-app’s config). What happens now is that all requests, e.g. for api.stashaway.com/foo/bar, are sent to our new (mostly empty) bucket and therefore result in a 404 error.

Because we want our API to return a 503 error though, we add the following CloudFront Error Pages config:

This instructs CloudFront to return a 503 error code whenever a 404 is encountered (from our empty S3 bucket), and to serve the content of down.json as the response body. (Under normal operations we change the rule’s HTTP Error Code from 404 to 503, so that CloudFront does not rewrite ordinary 404 responses from the API.)
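Expressed as data, the error page rule is a single entry in the distribution's `CustomErrorResponses` section. Here is a minimal sketch that builds that entry for both modes. The function name is hypothetical, and the zero error-caching TTL is an assumption chosen to match the "minimum caching" advice from earlier.

```python
def api_error_responses(maintenance: bool) -> dict:
    """Build the CustomErrorResponses section for the API distribution."""
    return {
        "Quantity": 1,
        "Items": [
            {
                # In maintenance mode, rewrite the 404s coming from the empty bucket.
                # Otherwise match 503 instead, so regular 404s pass through untouched.
                "ErrorCode": 404 if maintenance else 503,
                "ResponsePagePath": "/down.json",
                "ResponseCode": "503",    # the CloudFront API expects a string here
                "ErrorCachingMinTTL": 0,  # assumed: do not cache the error response
            }
        ],
    }


print(api_error_responses(maintenance=True)["Items"][0])
```

The result slots into the `CustomErrorResponses` key of the same distribution config that holds the cache behaviors.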

And that’s it! It takes us just a few minutes to activate the maintenance page for all of our applications, and we could do so even if no EC2 service or instance is accessible for some reason.

Where to go from here?

At StashAway we aim to codify all of our infrastructure operations, and as you have noticed, getting the maintenance page up or down still involves a few manual steps. We want to automate these steps as well, which will be the topic of a future blog post.

We are constantly on the lookout for great tech talent to join our engineering team — visit our website to learn more and feel free to reach out to us!
