Using Hexo and AWS to build a fast, massively scalable website for pennies

TL;DR: I built myself a website that can handle front-page-of-Reddit traffic. I’ve paid 50 cents a month in hosting so far. Here’s how.

Image: One of the mind maps for the site

Last summer, I realized I needed a website. I really wanted to learn AWS and work on scalable code, so I ruled out prefab options like WordPress. I loved React and considered the MERN stack, but I didn’t want to pay for EC2 instances, and definitely didn’t want to set up auto scaling groups. I could’ve used React to build a static client side SPA using a JSON bundle for content, but that seemed like overkill.

That’s when my friend James told me about Hexo, a modular static site generator built in NodeJS. With Hexo, I write page templates in the language of my choice, and write content in YAML + Markdown files. Hexo then generates static HTML that I can upload to S3 and CloudFront to get an inexpensive, massively scalable website.

There was one obvious disadvantage: no server side code. No dynamically generated pages or content. No emailing scripts. Nada.

As it turns out, this wasn’t a problem.


10,000-Foot Summary

  1. Register for AWS Free Tier.
  2. Register a domain name. I recommend Route 53 because it’s cheap and will save you extra work. If you have another provider, configure it to work with Route 53.
  3. Create an S3 bucket to hold your website. Configure it to use your custom domain.
  4. (Optional) If you want your site to be really fast and are willing to pay a bit more, create a CloudFront distribution and connect it to your S3 bucket. This is not strictly necessary.
  5. Create a local directory for your website. Run npm i --save hexo hexo-deployer-s3 or npm i --save hexo hexo-deployer-s3-cloudfront depending on whether you set up CloudFront.
  6. Build your website using Hexo.
  7. Use AWS Lambda + API Gateway for any server side code you can’t live without.
  8. Deploy: hexo clean && hexo g && hexo d
  9. Enjoy!

If you do this with AWS Free Tier, you’ll pay around 50¢ a month during your first year, plus the cost of registering a domain name. That’s it. Once your free trial expires, you’ll still be paying peanuts for S3 unless you’re hosting a lot of data.

And this site is fast. I mean, really fast. With Amazon’s CDN, it seemed to load quicker from the internet than it did from my own computer, where there was a slight lag as Node served the content.

And that’s with little optimization other than using Hexo Filter Cleanup to compress and minify content. I could probably boost the numbers further if I spent some time with Lighthouse.


Step 1: Set Up AWS

If you don’t already have an AWS account, get one now. Most of it’s free for one year, and AWS is arguably the best cloud infrastructure service on the market. (If you already have AWS, don’t worry: this is still inexpensive.)

Once your account is ready, log in to your console. If you want a somewhat technical overview of what AWS can do, Amazon has a nice Getting Started Tutorial.

Protip: you can get to any AWS service without going through the main interface by going to https://console.aws.amazon.com/your-service-name-here, e.g. https://console.aws.amazon.com/s3 for S3 or https://console.aws.amazon.com/cloudfront for CloudFront.

Step 2: Register a Domain Name

Protip: Register your domain name with Route 53. Seriously.

I registered my domain (TedY.io) with GoDaddy because I didn’t know better. I paid more and now must manage my site in two places. No fun!

In any case, if you either already have a domain name or don’t like Amazon’s no-frills interface, here’s how to use an external domain name registrar.

Step 3: Create an S3 Bucket, Configure it for Static Hosting

S3 is the Amazon Simple Storage Service. Here’s how it works: you upload files to a bucket (think directory), and each file is associated with a key (think full path name). You set permissions on the files. Amazon handles the rest, ensuring that they’re distributed around their network and delivered quickly to clients. For example, the bucket for my website is TedY.io and a key might be index.html.
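To make the bucket/key idea concrete, here is a small sketch of how a file in a local output folder could map to an S3 key and then to a static website URL. The function names and the example.com bucket are made up for illustration. The URL pattern follows the dash-style endpoint used by the us-east-1 address shown later in this post; some regions use a dot before the region name instead.

```javascript
const path = require('path');

// Turn a file inside the local output folder into the S3 key it would get.
// Keys always use forward slashes, whatever separator the local OS uses.
function toS3Key(publicDir, filePath) {
  return path.relative(publicDir, filePath).split(path.sep).join('/');
}

// Build the static-website URL for a key, assuming the
// <bucket>.s3-website-<region>.amazonaws.com endpoint pattern.
function websiteUrl(bucket, region, key) {
  return `http://${bucket}.s3-website-${region}.amazonaws.com/${key}`;
}

// Hypothetical bucket and paths, for illustration only.
const key = toS3Key('public', 'public/demos/index.html');
console.log(key); // demos/index.html
console.log(websiteUrl('example.com', 'us-east-1', key));
```

There are no real folders in S3: demos/index.html is simply a key that happens to contain a slash.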

A common use case of S3 is static website hosting. They have a pretty straightforward tutorial on how to do this, so I won’t reiterate it here. Follow their lead.

Protip: If you get a permissions or access denied error, check that you configured permissions properly. It’s easy to skip that step. AWS does not enable public access to your S3 buckets by default.
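For reference, the permissions step usually comes down to attaching a public-read bucket policy. The sketch below builds one as a plain JavaScript object; the helper name and the example.com bucket are placeholders, and depending on your account settings you may also need to relax the bucket’s Block Public Access options before S3 will accept it.

```javascript
// Build a bucket policy that lets anyone read (but not write) objects.
// `bucket` is a placeholder; use your own bucket name.
function publicReadPolicy(bucket) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'PublicReadGetObject',
        Effect: 'Allow',
        Principal: '*',
        Action: ['s3:GetObject'],
        // The trailing /* applies the rule to every object in the bucket.
        Resource: [`arn:aws:s3:::${bucket}/*`],
      },
    ],
  };
}

// Print JSON that can be pasted into the bucket's policy editor.
console.log(JSON.stringify(publicReadPolicy('example.com'), null, 2));
```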

Step 4: (Optional) Enable CloudFront

CloudFront is Amazon’s Content Delivery Network (CDN). It distributes your website content around the world on Amazon’s various servers, making it fast to download from any location. There are many CDNs out there, but CloudFront is the simplest to use with AWS.

You do not have to use CloudFront; S3 is fast on its own. However, CloudFront will upgrade your site to ludicrous speed. You will be able to handle Reddit Front Page traffic without a hiccup (though you will pay extra for a massive spike). Though it’s free the first year, in the long run CloudFront is usually more expensive than S3, unless you’re handling a lot of traffic (TL;DR: more than 10TB).

Follow Amazon’s instructions and create a CloudFront distribution for your website. After creating the distribution, AWS will give you a URL that you can use to access your distribution. Your site will still be accessible from the old S3 static URL, but the CloudFront URL will use the CDN. Update your DNS records in Route 53 accordingly, so they point to the CloudFront distribution URL instead of to your S3 bucket.

For example, my S3 static website URL is http://tedy.io.s3-website-us-east-1.amazonaws.com/. My CloudFront URL is http://d2ya1wjaraby7p.cloudfront.net/. These are the relevant Route 53 settings:

Protip: If you want, you can enable CloudFront after you’ve built your website. This will make it easier to check that updates are live, because you do not have to wait for them to propagate through the CDN.

Step 5: Create a Directory for Your Website

If you don’t have NodeJS and NPM installed: 
Follow the instructions here. NodeJS is a JavaScript runtime—it lets your computer run JavaScript code outside of a web browser. This is great because you can build full featured applications in JavaScript.

NPM is Node’s package manager. Node’s base installation is very spartan; it provides a basic set of capabilities and allows developers to extend it by writing their own modules. NPM manages all of these extensions so that you don’t have to manually keep track of them.

Installing Node will install NPM. Once you’ve installed it, create a directory for your website, open your terminal in that directory, and run the following commands:

npm init: this sets up the directory as an NPM package and allows you to install other packages, run scripts, etc. Follow the instructions, using defaults whenever you don’t know the answer to a question.

npm i --save hexo: this installs Hexo, which we will use to generate our website.

If you used CloudFront:

npm i --save hexo-deployer-s3-cloudfront

Otherwise:

npm i --save hexo-deployer-s3

These plugins tell Hexo how to upload your website to your S3 bucket / CloudFront distribution automatically when you run the hexo deploy or hexo d commands. Very handy!

Step 6: Build Your Website Using Hexo

This is obviously the longest step here, and not one I can cover in this post. There are many tutorials on how to use Hexo to create a blog. There are also many tutorials on how to create your own theme for Hexo.

Protip: If you want to keep things simple, use one of the prefab Hexo themes.

Hexo has extensive documentation, but it can be a bit confusing until you really wrap your head around how Hexo works.

Here’s what I wish I knew:

  • Anything you put in themes/your-theme/_config.yml is available in any Hexo template as theme.variable-name-here. Similarly, anything you put in your-site-directory/_config.yml is available as config.variable-name-here (the site variable holds your posts, pages, tags, and categories instead). Lastly, any values you set in the front matter at the top of a post’s Markdown file are available as page.variable-name-here. This is a good way to set values that your whole website or individual posts use.
Protip: If you’re going to redistribute your theme, keep site content related variables out of your theme’s config file. Example: it’s cool to configure a default banner in your theme config, but leave the intro text from your homepage out.
  • If you want to write JavaScript code that runs in your visitor’s browser, put it in your source/js or themes/your-theme/source/js folder. If you put it in the themes/your-theme/scripts folder, it will be executed by Hexo while building your page and will not be available in the browser.
  • Conversely, any scripts you want Hexo to run when building your site should be put in the themes/your-theme/scripts folder. Any variables or functions in these scripts will not be available for use in your templates unless you add them to Hexo as a helper. See my files getters.js and register.js for an example. Since I was building a small site, I encapsulated all the code I needed to help me generate my page in a function called _get().
  • Lodash is your friend. Learn to use it :-) It’s available to use in any Hexo template as the _ global variable.
  • If you want to create a static HTML page and have Hexo copy it to your public folder untouched, list it under skip_render in your site’s _config.yml file. For example, I have a folder called demos that Hexo is instructed to copy as-is. In my site’s _config.yml file I have skip_render: demos/**
  • You can include HTML in Markdown files. Mind blown.
  • If you run hexo generate and things look funny, try running hexo clean first. If you want to include this all in one command, you can do hexo clean && hexo g && hexo s or hexo clean && hexo s --generate. These commands will clean your public folder and serve the website so you can test it.

My Build Process:

I decided to build my own theme based on a Bootstrap template. My first decision was what templating language I’d use. Templates are the way I tell Hexo where to put information in the various pages in my site. Rather than writing static HTML that must be rewritten for each page, I specify where elements go and what goes inside of them.

By default, Hexo uses EJS (Embedded JavaScript). I found the EJS syntax ugly. The simplicity of Pug (formerly Jade) appealed to me, plus it supported embedding plain JavaScript in my templates (though I avoided this for reasons I detail below).

I built Pug components for the various elements in my site—headers, inserts, listings, sections, etc. You can see them here. Rather than building big, monolithic pages which would be hard to maintain, I opted to build little components as Pug mixins (think: functions that output HTML or other Pug mixins) and includes. This allowed me to reuse code.

While building, I noticed I was repeating lots of JavaScript code. For example, many templates needed code to conditionally apply HTML data attributes, choose text to display, or render a tag description on a post. I could accomplish many of these effects through logic in Pug mixins, but many became ugly with long blocks of code, especially if I had to choose one value among many possibilities.

For example, here is the code that pulls the proper description for a tag and renders it in a hidden div underneath.

I could just insert this in any template that uses tags. Following the Don’t Repeat Yourself (DRY) principle, however, it’s best to factor code like this out into a common location. I could place it directly in my tagStub.pug mixin, but that feels like too much logic for a view.

Thus, I created a universal getter helper that I called _get(). Then, any time I encountered code like this, I would just add it as a property of _get(). (Note: at first, _get was an object, not a function, but Hexo only permits helpers to be functions.)

Now I just call:

And here’s the code for _get():
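The embedded snippets from the original post are not reproduced here. As a stand-in, the following is a minimal sketch of what a getter-style helper could look like, not the actual code behind this site. The getter names (tagDescription, dataAttrs), their arguments, and the fallback text are all hypothetical. The only Hexo API used is hexo.extend.helper.register, which makes a function callable from templates.

```javascript
// Hypothetical getters; names and data shapes are illustrative only.
const getters = {
  // Pick a tag's description from a lookup table, with a generic fallback.
  tagDescription(tagName, descriptions = {}) {
    const key = String(tagName).toLowerCase();
    return descriptions[key] || `Posts tagged "${tagName}"`;
  },

  // Build an object of data-* attributes, skipping empty values.
  dataAttrs(values = {}) {
    const attrs = {};
    for (const [name, value] of Object.entries(values)) {
      if (value !== undefined && value !== null && value !== '') {
        attrs[`data-${name}`] = String(value);
      }
    }
    return attrs;
  },
};

// Hexo helpers must be functions, so expose the object through one.
function _get() {
  return getters;
}

// Inside a Hexo script (themes/your-theme/scripts/*.js) the global `hexo`
// exists; the guard lets this file also run under plain Node for unit tests.
if (typeof hexo !== 'undefined') {
  hexo.extend.helper.register('_get', _get);
}

module.exports = { _get };
```

With something like this in place, a template could call _get().tagDescription(tag.name, theme.tagDescriptions), where tagDescriptions is an assumed key in the theme config, and stay free of branching logic.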

Much cleaner! And it has the benefit of keeping my Pug templates dumb. They’re not doing complicated logic—just displaying the information I pass to them. This conforms better to the Single Responsibility Principle. Plus, I was then able to write unit tests for the _get() code using Mocha and Chai.

I used many client side JavaScript effects plugins as well: ParticlesJS, JQuery, Waypoints, SparkleJS and probably others. Some of these were fairly convoluted to integrate with my site, and unnecessary for a simple website (but I’m a perfectionist) :-)

Step 7: Use AWS Lambda + API Gateway for Server Side Code

Static sites are all fun and games until you need to run code that won’t work in a browser. For example, I didn’t want to have a vanilla mailto link to contact me; I wanted a styleable contact form that would send me email directly! Of course, there has to be a server to do this.

Or at least, there used to be. AWS API Gateway and Lambda to the rescue. API Gateway is a service that sets up a REST endpoint, essentially a URL where you can make HTTP requests and have something happen. That’s intentionally vague: REST endpoints can do everything from querying databases, to running data analysis code, to fetching Wikipedia articles.

Lambda is a “serverless compute” system. On a traditional server, you write code and then the server waits for people to request it. Even when no code is running, you’re still maintaining the server and paying for it. Lambda, on the other hand, uses a pay-per-use model. You write code and Lambda keeps it on file. You pay only when your code is called and runs. This can result in dramatic cost savings. (1 million requests per month are free.)

Thus you can run code without a server by setting up a REST endpoint on API Gateway and using that endpoint to trigger your Lambda code. Boom: pay-per-use server.

Here’s how I used this to send email: I set up an API Gateway Endpoint and Lambda Function (link for tutorial). The Lambda function was linked to my Mailgun account, a freemium service used to send/receive email programmatically.

(For the curious: I didn’t trigger Mailgun directly from the contact form because their API doesn’t support CORS).

This is far too involved to explain here, so I’ll do another writeup on how I wrote this code later. If you want it sooner, send me an email via the form on my site!

Protip: This technique can be used to implement any kind of server side code you want on your S3 + CloudFront website.

Step 8: Deploy

If you’ve installed the Hexo deployment plugins I recommended and configured them properly in your _config.yml file, you’re ready to deploy. Run:

hexo clean && hexo g && hexo d

This will clean your public folder, generate your website, then deploy it.

If everything is set up properly, you should see a progress bar and then see that deployment is complete. Wait a bit and go to your website url.

Protip: if you’re using CloudFront, you may need to wait about 15 minutes to see your updates go live.

Step 9: Enjoy

Congrats! You’ve built a fault-tolerant, extremely fast website for cheaper than the cost of a Starbucks Latté. Pat yourself on the back, and follow me here or on Twitter. Cheers!

Here’s the GitHub repo if you want to see the code for my site :-)