Using EC2, NGINX, and Prerender.io as a proxy for a Single Page App

Single Page Apps (SPAs) built in JavaScript seem to be the hotness right now. One common issue, however, is that many web crawlers do not execute JavaScript, so they are unable to scrape the rendered page. This is an even bigger issue if you want to have social sharing on your site, since link previews come from those crawlers. I’m going to walk you through how to set up a proxy server for your SPA using EC2, NGINX, and Prerender.io (note: you can also host Prerender’s software yourself for free; check out the README here).

In my particular case I had an Ember app hosted via S3, but this will work for any JavaScript app that is hosted statically.

Prerequisites:

  1. You have a SPA hosted somewhere, e.g. S3, GitHub Pages, or Surge
  2. You have an AWS account

That’s it!

Okay, first thing we gotta do is create our EC2 instance. To do so, go to the EC2 dashboard and click the ‘Launch Instance’ button.

In step 1 you will be given a list of machine images to choose from. Select the first one, which should be an Amazon Linux AMI.

Once you have selected your Amazon Linux instance, select step 6 from the top to configure your security group (this is the only thing we will customize). Here you will want to add an HTTP rule that allows inbound traffic on port 80.

Next, click ‘Review and Launch’, which will take you to step 7.

Go ahead and launch! It should take about 3 minutes for your EC2 instance to be active.

Once it is active, you will need to ssh into your instance (on Amazon Linux the default user is ec2-user).

Once you have connected to your instance using ssh, update it:

    sudo yum update

Then install NGINX:

    sudo yum install nginx

Now start your NGINX server:

    sudo service nginx start

You should now be able to reach your NGINX server in a browser via your EC2 instance’s Public DNS, which will look something like this: ec2-00-111-22-333.us-west-2.compute.amazonaws.com

You should see the default NGINX test page.

Okay, now we know that our NGINX server is working like it should, and we can see that configuration should go in the /etc/nginx/nginx.conf file. This is the only file we will touch.

Replace that file’s contents with the following:

# For more information on configuration, see:
# * Official English Documentation: http://nginx.org/en/docs/
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

# Load dynamic modules. See /usr/share/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
        listen 80;
        server_name example.com;

        location / {
            try_files $uri @prerender;
        }

        location @prerender {
            proxy_set_header X-Prerender-Token YOUR_PRERENDER_TOKEN_GOES_HERE;

            set $prerender 0;
            if ($http_user_agent ~* "baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator") {
                set $prerender 1;
            }
            if ($args ~ "_escaped_fragment_") {
                set $prerender 1;
            }
            if ($http_user_agent ~ "Prerender") {
                set $prerender 0;
            }
            if ($uri ~* "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff|svg|eot)") {
                set $prerender 0;
            }

            # Resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
            resolver 8.8.8.8;

            if ($prerender = 1) {
                # Setting prerender as a variable forces DNS resolution, since nginx caches IPs and doesn't play well with load balancing
                set $prerender "service.prerender.io";
                rewrite .* /$scheme://$host$request_uri? break;
                proxy_pass http://$prerender;
            }
            if ($prerender = 0) {
                # Throw away the path because this is a single page web app with only an index.html
                rewrite .* /index.html break;
                proxy_pass YOUR_SITE_URL_GOES_HERE;
            }
        }
    }
}
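If the chain of if checks in the @prerender block is hard to follow, here is a rough Python model of it. This is only my sketch of how I read the config, not something NGINX or Prerender ships: the function names should_prerender and upstream_url are made up for illustration, and the two patterns are copied from the config above.

```python
import re

# Patterns copied from the nginx config above.
BOT_AGENTS = re.compile(
    r"baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|"
    r"quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator",
    re.IGNORECASE,  # nginx "~*" is a case-insensitive match
)
STATIC_FILE = re.compile(
    r"\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|"
    r"doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|"
    r"torrent|ttf|woff|svg|eot)",
    re.IGNORECASE,
)


def should_prerender(user_agent: str, uri: str, args: str = "") -> bool:
    """Apply the checks in the same order as the config; later checks win."""
    prerender = False
    if BOT_AGENTS.search(user_agent):
        prerender = True
    if "_escaped_fragment_" in args:
        prerender = True
    if "Prerender" in user_agent:  # nginx "~" is case-sensitive
        prerender = False
    if STATIC_FILE.search(uri):
        prerender = False
    return prerender


def upstream_url(scheme: str, host: str, request_uri: str) -> str:
    """Approximate URL requested from Prerender after the rewrite and proxy_pass."""
    return f"http://service.prerender.io/{scheme}://{host}{request_uri}"
```

So a request for /about from Twitterbot goes to Prerender, while the same bot asking for /assets/app.js does not, because the file extension check runs last. Note that a crawler whose user agent is not in the list is only prerendered when it sends the _escaped_fragment_ query parameter.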

Things you must do:

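  1. Change example.com to your site’s domain
  2. Replace YOUR_PRERENDER_TOKEN_GOES_HERE with your Prerender.io token
  3. Replace YOUR_SITE_URL_GOES_HERE with your static site’s URL, e.g. on S3, GitHub Pages, or Surge. Use the scheme and host only, with no trailing slash or path, because NGINX does not allow a URI part on proxy_pass inside an if block or a named location
  4. MOST IMPORTANT: WHENEVER you make changes to this file, you must run sudo service nginx reload for those changes to take effect (running sudo nginx -t first will check the file for syntax errors)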

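If you want to force HTTP to HTTPS using Elastic Load Balancing:

  1. Add a Classic Load Balancer in front of your EC2 instance with HTTP and HTTPS listeners
  2. Place this block inside the server block, above the first location block:

    if ($http_x_forwarded_proto != 'https') {
        rewrite ^ https://$host$request_uri? permanent;
    }

  3. In the rewrite line that builds the Prerender URL, change $scheme to https. The load balancer terminates HTTPS and talks to NGINX over plain HTTP, so $scheme would otherwise always be http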
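To make the redirect block a little more concrete, here is a tiny Python model of what it decides. It is just an illustration of my reading of the block; the function name https_redirect is hypothetical.

```python
from typing import Optional


def https_redirect(forwarded_proto: str, host: str, request_uri: str) -> Optional[str]:
    """Return the redirect target, or None when no redirect is issued.

    forwarded_proto stands in for $http_x_forwarded_proto, which is an
    empty string when the X-Forwarded-Proto header is missing.
    """
    if forwarded_proto != "https":
        # $request_uri already carries the query string; the trailing "?"
        # in the nginx rewrite stops the arguments being appended again.
        return f"https://{host}{request_uri}"
    return None
```

One thing this makes visible: a request that arrives without the X-Forwarded-Proto header (for example one sent straight to the instance rather than through the load balancer) is redirected as well, because an empty value is not equal to https.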

And… that’s it! Now you should see your own static site instead of the default NGINX page. Try scraping your site with a tool like Facebook’s sharing debugger to make sure that social sharing is now working. If it is, you have properly set up your proxy! Hopefully this helps save you some time!
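If you would rather check from your own machine than from a sharing debugger, a small script can request the same page once as a bot and once as a regular browser and compare what comes back. This is a minimal sketch using only the Python standard library; the user agent strings and the helper names build_request, fetch, and compare are my own picks for illustration, and you need to pass in your own site URL.

```python
import urllib.request

# Assumed example user agents: the first matches the bot list in the
# nginx config, the second does not.
BOT_UA = "facebookexternalhit/1.1"
BROWSER_UA = "Mozilla/5.0"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str) -> str:
    """Download the page body as text."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def compare(url: str) -> None:
    """Print the size of the bot and browser responses and whether they differ."""
    bot_html = fetch(url, BOT_UA)
    browser_html = fetch(url, BROWSER_UA)
    print(f"bot response:     {len(bot_html)} characters")
    print(f"browser response: {len(browser_html)} characters")
    print("responses differ:", bot_html != browser_html)
```

Call compare with your site URL. If the proxy is doing its job, the bot response should be the rendered HTML from Prerender and will usually be noticeably larger than the bare index.html that the browser request gets.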