A Webpack built, Elastic Beanstalk deployment leveraging AWS S3

Web applications typically begin their life serving their own static assets (JavaScript, CSS, etc.) to the browser. An alternative strategy is to host these assets on a dedicated asset hosting service, e.g. Amazon Simple Storage Service (S3).

This post demonstrates a set of deployment tasks that automate configuring a web application to delegate its static asset hosting to another service. Specifically, we will upload Webpack-generated JavaScript assets to an S3 bucket and then deploy the associated web application (with references to the versioned, immutable assets) to an AWS Elastic Beanstalk (EB) environment.

TL;DR

Why use an asset hosting service?

A solution

  1. Webpack configuration targeting S3
  2. Upload assets to S3
  3. Include built backend assets in the deployment package

Make a script out of it

Build and deploy locally

Check out the codes!

This post was peer reviewed by Matt Shwery
Author’s Note (6 September 2017): This post was originally titled as ‘A Webpack built, Elastic Beanstalk deployment leveraging AWS S3 as a CDN’. As was pointed out on Hacker News, I was inaccurately referring to AWS S3 as a CDN. While the major ideas here are mostly unaffected, references to CDNs have been corrected. Sorry if this caused an issue!

Why use an asset hosting service?

To be clear, we’re talking about two different static asset hosting strategies.

For example, when a web browser makes a request for a web application, the application’s web server responds with a string of HTML. This markup includes references to static assets, including the web application’s JavaScript and external CSS. These references instruct the browser to make additional HTTP requests to fetch each asset. But from where should the browser request these static assets?

  1. the application’s web server
  2. a separate service for hosting/serving static assets

As with many software architectural decisions, there are several pros/cons associated with each choice. Let’s enumerate these:

App webserver Pros:

  • Can provide a simpler application architecture. Node.js web server frameworks (express.js, hapi.js, etc.) support this configuration out of the box. For prototyping an idea, this provides the least amount of development friction.

App webserver Cons:

  • Static asset requests compete with the rest of the web server’s work. JavaScript static assets are often large files (> 1 MB). A Node.js web server runs its JavaScript on a single thread, so although file I/O is non-blocking, every asset request still consumes event loop time, bandwidth, and connections on the same process. Under load, this can delay the server’s responses to other HTTP requests, including the other APIs it is responsible for (e.g. initial requests of page markup, REST/GraphQL data fetching, backend-for-frontend interfaces, etc.).
  • Coupled instance scaling. Horizontally scaling the web server instances to meet static asset request demand could prove relatively expensive (and less efficient) compared to an asset server’s optimized auto-scaling.

Asset server Pros:

  • Your app server can spend time doing more important things than serving static assets. Removing static asset handling from the application web server could result in the web server responding to its other requests more quickly.
  • Asset servers provide automatic scaling for static asset hosting. Asset servers are usually configured to handle many parallel requests of static assets.
  • Decoupled instance scaling. As application traffic increases, the web server and the asset server can each scale independently. This is a more efficient scaling strategy because one cannot guarantee that, as traffic increases, the web server APIs and the static asset serving loads will need to scale at the same rate. This decoupling could prove to be more cost and hardware efficient.

Asset server Cons:

  • Adds complexity to an application’s deployment procedure. This post describes a set of deployment tasks that, while relatively simple to implement, are by definition an increase in the complexity of an application’s deployment.
  • More expensive initially. This strategy has additional upfront costs due to the use of an additional third-party service (e.g. S3). However, as traffic increases, this decoupled scaling strategy could prove more cost efficient.

Each project should weigh these pros and cons before deciding to implement the changes described in this post. I would argue that this architecture is best suited for production applications, especially global ones, as the complexity tradeoff may not be worth it otherwise.

A solution

We desire an automated process that will:

  • use Webpack to generate an application’s static assets with versionable filenames;
  • upload these assets to S3;
  • have Webpack add the S3 asset URL references to the web application;
  • version, bundle, and deploy the web application.

Webpack configuration targeting S3

We desire a Webpack configuration that can perform two significant tasks:

  • Name the output bundles with unique hashes and expose these filenames in a manifest file. Since we will be storing the static assets on a remote server, we will need a way to uniquely identify the assets associated with a particular application version.
  • Include the proper URLs to target the static assets. At build time, the static assets will not have been uploaded to the asset server yet. However, S3 makes it easy to anticipate the URL that these assets will eventually be assigned. We will want to include these references in our Webpack build.
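
As a minimal sketch of the second task, the URL an asset will receive can be computed before anything is uploaded. The bucket name, object key, and helper names below are hypothetical, and the virtual-hosted-style URL form assumes the bucket’s objects are publicly readable:

```javascript
// Hypothetical helpers for illustration; not taken from the original project.

// Base URL of a bucket in S3's virtual-hosted style.
function bucketBaseUrl(bucket) {
  return `https://${bucket}.s3.amazonaws.com/`;
}

// URL an object will have once it is uploaded under `key`.
function anticipatedAssetUrl(bucket, key) {
  // Encode each path segment but keep the "/" separators intact.
  const encodedKey = key.split('/').map(encodeURIComponent).join('/');
  return bucketBaseUrl(bucket) + encodedKey;
}

console.log(anticipatedAssetUrl('my-app-assets', 'main.3b1f9c.js'));
// prints "https://my-app-assets.s3.amazonaws.com/main.3b1f9c.js"
```

Because the URL depends only on the bucket name and the object key, the build can reference assets that do not exist on S3 yet.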

Interesting parts:

  • Generate bundles named with unique hashes (line 20). By simply including [hash] in the output.filename property of the Webpack config, the generated assets will have unique hashes injected into their filenames. Unique filenames allow us to treat the uploaded assets as immutable (i.e. any change to the content produces an entirely new bundle). This is good because older deployments can be redeployed without having to rebuild or re-upload the static assets.
  • assets-webpack-plugin (line 44). This plugin generates a JSON file that maps the bundles declared in the Webpack configuration’s entry property (in our case, a single bundle) to the hashed filenames generated for each.
  • output.publicPath (line 21). We’ve included the fully qualified URL of the AWS S3 bucket to which the static assets will eventually be uploaded. Webpack will use this URL whenever it references the static assets (e.g. in HtmlWebpackPlugin).
  • HtmlWebpackPlugin (line 45). This plugin uses a template to generate an HTML file with the appropriate <script>, <link>, etc. references to the static assets that Webpack generates (using the URL provided above). This is the base file of the client application, and it is what the web server returns in response to requests for the web application.
  • Source maps (line 26). Simply including these in the same S3 bucket as the associated front-end assets will automatically provide source map information for first-class production debugging (e.g. non-minified code and proper line numbers in Chrome DevTools, third-party error tracking services, etc.).
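
For orientation, here is a rough sketch of only the output-related settings discussed above. It is not the original configuration: the bucket name and entry path are assumptions, and the two plugins are left out because they require installed packages (they would be listed in a plugins array).

```javascript
const path = require('path');

// Hypothetical bucket name; replace with the project's asset bucket.
const ASSET_BUCKET = 'my-app-assets';

const config = {
  // Assumed entry point for the single client bundle.
  entry: { main: './src/client/index.js' },
  output: {
    // Webpack writes its output to the "public" subdirectory.
    path: path.resolve('public'),
    // [hash] is replaced with the build hash, giving each build unique filenames.
    filename: '[name].[hash].js',
    // Prefix Webpack uses whenever it emits a reference to an asset.
    publicPath: `https://${ASSET_BUCKET}.s3.amazonaws.com/`,
  },
  // Emit separate .map files next to each bundle.
  devtool: 'source-map',
};

module.exports = config;
```

Newer Webpack versions also offer [contenthash], which changes only when a file’s own content changes; either placeholder supports treating uploaded assets as immutable.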

Upload assets to S3

We need a script that can locate and upload the static assets to the project’s S3 asset bucket:

Interesting parts:

  • Apply metadata to files of certain file types (lines 38–44). Browsers can be given hints about how to handle downloaded files via HTTP response headers. S3 allows these to be set with what it calls metadata. Here we are associating content-identifying headers with any files that have .gz (gzipped files; our Webpack config doesn’t generate these, but it could!) or .css extensions, respectively.
  • Skip backend assets (lines 46–48). In addition to client assets, Webpack configurations sometimes generate assets that are intended to be used directly by the application web server (e.g. Babel-transpiled backend code, index.html, etc.). These should not be uploaded to S3, so the script skips them.
  • Stream upload via s3-upload-stream (line 84). Quick, low-memory data transfer is achieved with streams.
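
The first two points can be sketched without touching the network. The helper names, the skip list, and the exact header values below are assumptions rather than the original script; ContentType and ContentEncoding are the parameter names the AWS SDK for JavaScript uses for these object headers.

```javascript
const path = require('path');

// Assumed list of backend assets that should stay out of S3.
const BACKEND_ASSETS = new Set(['index.html']);

// Extra upload parameters for a file, chosen by its extension.
function uploadParamsFor(filename) {
  switch (path.extname(filename)) {
    case '.gz':
      return { ContentEncoding: 'gzip' };
    case '.css':
      return { ContentType: 'text/css' };
    default:
      return {};
  }
}

// True when the file is a client asset that belongs on S3.
function shouldUpload(filename) {
  return !BACKEND_ASSETS.has(path.basename(filename));
}

// Example build output (hypothetical filenames).
const built = ['main.3b1f9c.js', 'main.3b1f9c.css', 'index.html'];

// One parameter object per file that would be handed to the uploader.
const plan = built
  .filter(shouldUpload)
  .map((file) => ({ Key: file, ...uploadParamsFor(file) }));

console.log(plan);
```

Each entry in plan would then be merged with the bucket name and a read stream of the file before being passed to whichever upload client the project uses.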

Include built backend assets in the deployment package

A project can include a file named .ebignore in its root directory to have the Elastic Beanstalk CLI bypass the project’s .gitignore during deployments. Normally, the Elastic Beanstalk CLI deploys only those files tracked by git (so anything filtered by .gitignore is left out). In our case, we want the deploy to include the backend assets that result from the Webpack build step (specifically, index.html), which reference the uploaded frontend assets.

We use an .ebignore file because we don’t want to version control the Webpack output, but we do want to include those files in the deployment package. Including these assets, specifically index.html, is important because it ties the references to the S3 assets to the deployment package. This enables us to redeploy earlier deployed packages (e.g. conveniently via the Elastic Beanstalk web app) and correctly target the static client assets without having to rerun the Webpack build.

Compare this .ebignore:

to the project’s .gitignore:

There are two notable differences:

  • .ebignore does not mention the Webpack output subdirectory public
  • it lists the project’s .git subdirectory, which keeps the project’s entire git history out of the deployment bundle

Make a script out of it

We can connect the above steps with an npm script. In the package.json's scripts property:

Notice the deploy:minor and deploy:patch scripts. These commands encapsulate all of the steps mentioned in this post:

npm version {patch|minor}

To make the version updating as painless as possible, one can use npm:

npm version {major|minor|patch}

This will:

  • update the version property listed in the project’s package.json according to the chosen semantic version flavor (major, minor, or patch)
  • generate a git commit and tag which will allow us to align an application’s git history with the Elastic Beanstalk deploy (see below)
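
To make the version arithmetic concrete, here is a simplified illustration of how each flavor changes a version string. It is not npm’s implementation and ignores prerelease and build metadata; the function names are hypothetical.

```javascript
// Simplified semantic version bump for plain "major.minor.patch" strings.
function bumpVersion(version, flavor) {
  const [major, minor, patch] = version.split('.').map(Number);
  switch (flavor) {
    case 'major':
      return `${major + 1}.0.0`;
    case 'minor':
      return `${major}.${minor + 1}.0`;
    case 'patch':
      return `${major}.${minor}.${patch + 1}`;
    default:
      throw new Error(`Unknown flavor: ${flavor}`);
  }
}

// By default npm names the git tag with a "v" prefix.
function tagFor(version) {
  return `v${version}`;
}

console.log(tagFor(bumpVersion('1.0.0', 'patch'))); // prints "v1.0.1"
```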

npm run build

This runs the Webpack configuration from earlier.

npm run upload

This runs the upload script from earlier.

eb deploy -l `git describe`

The Elastic Beanstalk CLI command eb deploy will do a couple of things:

  • generate and upload a zip file of the project files that are not excluded by .ebignore (or, when no .ebignore is present, the files tracked by git)
  • trigger the Elastic Beanstalk deployment process associated with the EB environment targeted by the project’s EB CLI configuration (.elasticbeanstalk/config.yml)

To generate this configuration for a project, invoke eb init in the project root and answer the questions.

The -l `git describe` option will label the deployed package with the git tag generated via npm version (e.g. v1.0.1). This is an easy technique that versions the deployment with a reference logged in the project’s git history.
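
Putting the four commands together, the sketch below composes the command line a deploy script would run for a given version flavor. The function names are hypothetical, and chaining with && is an assumption about how the steps are connected (it stops the chain as soon as one step fails).

```javascript
// The commands described above, in the order they run.
function deploySteps(flavor) {
  return [
    `npm version ${flavor}`,
    'npm run build',
    'npm run upload',
    'eb deploy -l `git describe`',
  ];
}

// Join with "&&" so that a failing step prevents the later ones from running.
function deployScript(flavor) {
  return deploySteps(flavor).join(' && ');
}

console.log(deployScript('patch'));
// prints "npm version patch && npm run build && npm run upload && eb deploy -l `git describe`"
```

The resulting string is the kind of value one might place under a deploy:patch entry in the package.json scripts property.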

Build and deploy locally

This deploy script is intended to be invoked on a developer workstation, after all code has been reviewed and approved for a deployment. One wouldn’t want to continuously deploy assets to the asset server as part of the development process.

Since the Webpack build happens locally and the generated files are both uploaded to S3 and bundled with the deployment package, neither the current Elastic Beanstalk deployment nor any future redeploy of a deployment bundle requires an additional Webpack invocation. This is significant.

It is a common practice to include the Webpack build step as part of the Elastic Beanstalk deployment. It’s a subtle difference, but the strategy described in this post performs the Webpack build before the deployment. If the Webpack build occurs during the deploy, each and every deploy (including rollbacks to previous deploys, horizontal scaling during increased traffic, etc) will execute a Webpack build. Avoiding this removes significant overhead from each deploy, speeding up deployments and avoiding unnecessary production downtime. 🙂