Automating Your Staging Environment Generation for Web Development

Or Kaplan
Remitly Israel (formerly Rewire)
Jan 25, 2021

If you’re looking to provide your developers with an instant, publicly available staging environment for every app-related change, you’ve reached the right place. Leveraging our CI system and Cloudflare Workers, we at Rewire were able to deploy a fully working environment by simply typing a Git commit command.

Our previous post, about serving a single page application using Cloudflare Workers, demonstrated how to deliver a single page application from a Google Storage bucket. Once the implementation process was completed, we came to the conclusion that this technique can probably be used to solve other challenges such as testing features during the development process.

Speeding Development Cycles

As our main focus at Rewire is on a high delivery pace, deploying multiple app releases a day, it’s important that we enable our developers to share their deliverables before they reach the pre-production environment, in order to review and improve quality. This is especially true in these two scenarios:

1. Product Sign-Off

Our engineering process includes final sign-off by a product manager and a UX designer. In order to finish the process, we should have a running environment with the new code that simulates the production environment.

At first, we had 5 static staging environments to which every developer on our team could deploy the final feature. However, having a limited number of staging environments created a lot of tension within the team, as we had to continuously check who was using each environment. In addition, we wanted to find a way to showcase a demo of partial features while still in development, in order to get feedback from other team members.

2. End to End and Regression Testing

The testing pyramid

Before merging to the master branch (which is our pre-prod env), our CI pipeline needs to successfully complete running a set of automated tests in order to ensure we don’t introduce any regression, and that everything works as expected.

The well-known testing pyramid includes end-to-end (E2E) testing: this methodology runs the application and the API together while validating visual and functional elements on basic user journey flows.

To develop E2E tests, we decided to use a product called testim.io, an AI-based E2E framework that runs a Selenium grid at an external location and accesses the web app as a regular user. Therefore, we had to give testim.io access to our web app. Potentially, we could have solved this issue by using a local Nginx server and exposing it using an HTTP/2 tunnel. However, the solution you’re about to see seemed much more elegant.

Back to the Drawing Board

The production solution we created, which was demonstrated in the previous post in this series, includes a Docker image with all the static files relevant to Rewire’s web app. Our GitLab CI pipeline then copies those files to the production bucket. Every time a developer opens a new pull request, we build a similar Docker image in order to run automated UI tests on the application. This means that we have all the relevant files on hand.

To build a staging environment, we first had to figure out how to serve the application to our users (this time, I’m referring to our developers). We could try to use the worker as described before, but the worker was set to serve a specific environment during the build time. So, we figured that our GitLab CI pipeline should be able to load the files to a dedicated Google Storage folder for the pull request. We used the pull request built-in environment variables to generate a unique folder name to which we upload relevant files.
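As a rough illustration of the folder naming step, the sketch below derives an upload prefix from a CI variable. It assumes the pull request number comes from GitLab's predefined `CI_MERGE_REQUEST_IID` variable, which the post does not name, and the `previewPrefix` helper is hypothetical. The bucket name and the `assets` folder follow the examples used in this post.

```typescript
// Hypothetical helper: derive the per-pull-request upload prefix from CI variables.
// CI_MERGE_REQUEST_IID is a GitLab predefined variable; relying on it here is an
// assumption, since the post does not say which variable is used.
const PREVIEW_BUCKET = "gs://rewire-pr-assets"; // bucket name from the examples in this post

function previewPrefix(env: Record<string, string | undefined>): string {
  const id = env["CI_MERGE_REQUEST_IID"];
  // Only plain numeric IDs are accepted, so the prefix always stays inside the bucket.
  if (!id || !/^\d+$/.test(id)) {
    throw new Error("not running in a pull request pipeline");
  }
  return `${PREVIEW_BUCKET}/${id}/assets/`;
}
```

In a CI job this could be called as `previewPrefix(process.env)`, with the result passed to whatever upload command the pipeline already uses.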

Our CI Pipeline Illustration

At this point, we had to decide how to serve the files. While we could generate a new subdomain DNS record for every instance of the staging environment, we wanted to avoid the housekeeping of DNS records that were no longer in use. Hence, we decided to pass the folder name in a query string. That way, we were able to maintain one subdomain for all staging environments. A dedicated worker on that subdomain acts as a router with the following process:

  • On deployment, the index file for the app version (the pull request number) is loaded into the worker's key-value store

Example: load the index file for PR 35740 from gs://rewire-pr-assets/35740/assets/index.html

  • A user enters the router worker with the relevant version query parameter

Example: https://preview.rewire.internal/?version=35740, where 35740 is the pull request number

  • The worker stores the version in a cookie (restricted to the router subdomain) and serves the index page from the relevant folder

Example: Set-Cookie: rewire-preview-version=35740

  • Each time a user requests a resource, the browser automatically sends the resource version using the cookie
  • The worker determines which resource to fetch from the version ID: it takes the version ID from the cookie, uses it as the base folder under the bucket’s base path, and appends the resource name

Example: to fetch the file bundle.js when the version is 35740, the worker will load the resource from gs://rewire-pr-assets/35740/assets/bundle.js

  • To switch to another version, the user sends a new request with a new query param
  • If there is neither a query parameter nor a cookie, the worker falls back to our pre-prod environment
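The routing rules above can be sketched as a few pure functions. This is a minimal illustration, not Rewire's actual worker: the helper names are hypothetical, the key-value store lookup for the index file is left out, and a `null` version simply stands for "serve the pre-prod environment", since the post does not describe how that fallback is wired.

```typescript
// Illustrative sketch of the router's decisions; all function names are assumptions.
const COOKIE_NAME = "rewire-preview-version";
const BUCKET_BASE = "gs://rewire-pr-assets";

// Read a single cookie value from a Cookie request header.
function readCookie(header: string | null, name: string): string | null {
  if (!header) return null;
  for (const part of header.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === name) return rest.join("=");
  }
  return null;
}

// The query parameter wins over the cookie. With neither, return null,
// which the caller treats as "fall back to pre-prod".
function resolveVersion(requestUrl: string, cookieHeader: string | null): string | null {
  const fromQuery = new URL(requestUrl).searchParams.get("version");
  const candidate = fromQuery ?? readCookie(cookieHeader, COOKIE_NAME);
  // Accept only numeric pull request numbers so the value cannot point outside its folder.
  return candidate !== null && /^\d+$/.test(candidate) ? candidate : null;
}

// Map a request path to the object that backs it; "/" serves the index page.
function objectPath(version: string, pathname: string): string {
  const resource = pathname === "/" ? "index.html" : pathname.replace(/^\/+/, "");
  return `${BUCKET_BASE}/${version}/assets/${resource}`;
}

// Cookie that pins later resource requests to the chosen version.
// Omitting the Domain attribute keeps the cookie on the router's own host.
function versionCookie(version: string): string {
  return `${COOKIE_NAME}=${version}; Path=/; Secure`;
}
```

A worker's fetch handler would call `resolveVersion` with the request URL and its `Cookie` header, fetch `objectPath(...)` for the result, and attach `versionCookie(...)` as a `Set-Cookie` header whenever the version came from the query string.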

Using this process, a developer who wants to work in a staging environment just needs to open a pull request and wait for the build to finish. That way, they get a dedicated environment within a few minutes. Subsequently, each push to this branch updates the staging environment and invalidates the cache instead of creating a new environment.

Bonus — Select the API server

Now that the web application staging environment is up and running, we understand that it might not be enough. Some big features will require us to change both the API and the web application. One way to tackle this is delivering the API prior to the web application. However, this method is not possible with every feature. In addition, in order to complete the End to End tests, we would like to run the API of the current code change. To achieve this goal, we would need to build a staging environment for the API (stay tuned for our next post) and direct the application to access it.

Up until today, we used to set the API address during the bundling time. Therefore, we had to find a way to manipulate the application so that it would connect to another API server. Again, we had two potential paths to take:

  1. Manipulate the JS code on the worker, which is a bit complex as the code is minified.
  2. Use the same technique as before: send the API version as a query param and set it as a cookie that is accessible from JavaScript. Afterward, our client reads the cookie and selects the correct server to access. If there is no cookie, it falls back to the bundled version.

We selected the second approach and enabled it only on staging builds.

For example, to use a specific API staging environment, just pass it as a query param:

https://preview.rewire.internal/?version=35740&api_version=apistaging
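On the client side, the selection could look roughly like the sketch below. The cookie name, the `selectApiBase` helper, and the map of known staging servers are all assumptions made for illustration. The post only states that the client reads a cookie and falls back to the bundled address.

```typescript
// Hypothetical client-side helper; the cookie name and server map are assumptions.
const API_COOKIE = "rewire-preview-api-version";

function selectApiBase(
  cookieString: string,
  bundledApiBase: string,
  stagingServers: ReadonlyMap<string, string>,
  isStagingBuild: boolean,
): string {
  // The override is enabled only on staging builds.
  if (!isStagingBuild) return bundledApiBase;
  const prefix = `${API_COOKIE}=`;
  const entry = cookieString
    .split(";")
    .map((part) => part.trim())
    .find((part) => part.startsWith(prefix));
  if (!entry) return bundledApiBase; // no cookie: keep the bundled address
  const apiVersion = entry.slice(prefix.length);
  // Only names present in the map are honored, so a cookie cannot
  // redirect the app to an arbitrary host.
  return stagingServers.get(apiVersion) ?? bundledApiBase;
}
```

In the browser, `document.cookie` would be passed as `cookieString`. Looking the value up in a fixed map rather than building a URL from it is a design choice of this sketch, not something the post prescribes.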

Keep It Secured

The staging environment might contain secret features that we do not want to leak. Therefore, we had to find a way to hide it and make the staging environment private. As part of its zero-trust package, Cloudflare provides an Access proxy that enables access control to the application. Using this feature, we were able to restrict access to team members who sign in through our authentication service. Cloudflare Access is basically an application access gateway that we can attach to a specific route, and Cloudflare makes sure only authorized users are able to reach it. The administration is very simple and supports roles for access control.

Securing our Staging environment using Cloudflare Access

Cost Awareness

One important tip I acquired at a previous workplace is to store only the necessary data.

As a startup, we have to optimize our costs in order to minimize our burn rate. Every pull request loads our bundle to the object storage, which inflates our storage costs. Hence, we enabled retention rules on the staging bucket, which delete all files created more than X days ago. That way, we keep only what we need while saving a few bucks.

If a pull request has been pending for a long time and its files have been deleted, the developer may either rebase or rebuild, and the staging environment becomes live again in a matter of minutes.
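One way to express such a rule, assuming it maps to what Google Cloud Storage calls Object Lifecycle Management, is a small configuration document with a delete action and an age condition. The `deleteAfterDays` helper below is hypothetical, and the retention period stays a parameter because the post leaves it as "X days".

```typescript
// Build a Cloud Storage lifecycle configuration that deletes objects once they
// are older than maxAgeDays. The helper name is an assumption for illustration.
interface LifecycleConfig {
  rule: Array<{ action: { type: "Delete" }; condition: { age: number } }>;
}

function deleteAfterDays(maxAgeDays: number): LifecycleConfig {
  if (!Number.isInteger(maxAgeDays) || maxAgeDays < 1) {
    throw new RangeError("maxAgeDays must be a positive integer");
  }
  // "age" is measured in days since the object was created.
  return { rule: [{ action: { type: "Delete" }, condition: { age: maxAgeDays } }] };
}
```

Serializing the result with `JSON.stringify` gives a file that could be applied to the staging bucket, for example with `gsutil lifecycle set`.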

Wait, There’s More

As you can see, using Cloudflare Workers helped us automate the generation of staging environments and implement scenarios like automated E2E tests, which can otherwise be expensive and tedious.

All we had to do was identify the right storage pattern and find a way to signal the resource version. One thing that made the development very quick was hooking into the existing CI process.

Want to know how we built our end-to-end test framework using Cloudflare Workers, GitLab CI, and testim.io? Stay tuned for our next post in the series. We’ll tell you all about the cool tricks that make our job much easier and can make yours easier as well.
