Serverless landing page optimization: implementing rapid experimentation at Opendoor.
This is the first in a series highlighting Opendoor’s use of Cloudflare Workers. If at any point you want to jump straight to working code, it is available here (and in the implementation section).
opendoor.com/w/* to our internal WordPress instance. Our worker allows for bundling of required npm packages, has full test coverage, and is actively iterated on by developers at Opendoor every day. (Bonus! It’s written in TypeScript, and Opendoor ❤s TypeScript.) Over time, the utility, ease of use, and continually expanding capabilities of Workers led to an explosion of use cases; they now run before every request to an Opendoor web property.
Cloudflare Workers power our landing page infrastructure and enabled us to:
- Run an A/B test across old and new infrastructure
- Run an A/B test across two separate landing page designs
- Run multiple A/B tests on different variations of our new design
- Simplify developer experience on the new infrastructure
All with a fairly small amount of experimentation and routing code. We hope you are as excited as we are about this approach and the power of Workers.
At the beginning of 2019, our landing pages were starting to show their age. They were difficult to iterate and experiment on, loaded slowly (especially on mobile devices), and needed a design refresh. We set out to address these issues and boost our conversion (while measuring everything along the way). After our initial investigation, we decided that we were going to:
- Rewrite our landing pages on a new infrastructure and framework optimized for page speed (in this case, Next.js), keeping the existing design, and roll this out while ensuring conversion stays flat or improves
- A/B test the new design vs. the old on the new infrastructure
- Run multiple experiments on the winning variation to boost conversion
All of these steps were ordered such that we would first validate our new infrastructure, then our new landing page design, and finally, experiment on variations of the winning design.
There are a couple of high-level steps when a request comes in:
- Identify the user (or generate an identity if they’re a new user)
- Choose the experiment group the user is assigned to (or was previously assigned to), and satisfy their request for that content
- Log the user and variation for use in conversion analysis
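As a rough illustration of the first step, here is a minimal sketch of identifying a user from a first-party cookie. The cookie name `anonymous_id` is the one the worker steps later in this post refer to; the helper names (`parseCookies`, `identify`, `identityCookie`), the one-year lifetime, and the cookie attributes are assumptions made for the example, not the code Opendoor runs.

```typescript
// Hypothetical helpers; only the "anonymous_id" cookie name comes from the post.
const COOKIE_NAME = "anonymous_id";

interface Identity {
  id: string;
  isNew: boolean; // true when the id was generated for this request
}

// Parse a raw Cookie header ("a=1; b=2") into a name -> value lookup.
function parseCookies(header: string | null): Record<string, string> {
  const cookies: Record<string, string> = {};
  if (!header) return cookies;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // skip malformed pairs
    const name = part.slice(0, eq).trim();
    const value = part.slice(eq + 1).trim();
    if (name) cookies[name] = value;
  }
  return cookies;
}

// Reuse the id from the cookie, or mint a new one with the supplied generator.
function identify(cookieHeader: string | null, generateId: () => string): Identity {
  const existing = parseCookies(cookieHeader)[COOKIE_NAME];
  if (existing) return { id: existing, isNew: false };
  return { id: generateId(), isNew: true };
}

// Build a Set-Cookie value so a newly generated id persists across requests.
// The lifetime and attributes here are assumptions, not taken from the post.
function identityCookie(id: string, domain: string): string {
  const oneYearSeconds = 60 * 60 * 24 * 365;
  return `${COOKIE_NAME}=${id}; Domain=${domain}; Path=/; Max-Age=${oneYearSeconds}; Secure; SameSite=Lax`;
}
```

The id generator is passed in so the logic stays testable; inside a Worker it could plausibly be something like `() => crypto.randomUUID()`. When `isNew` is true, the response would carry the `identityCookie(...)` value in a `Set-Cookie` header.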
The two commonalities across all experiments are the experimentation platform (Optimizely) and a distinct user ID (held in a first-party cookie on the opendoor.com domain). The three A/B tested domains are infrastructure, design, and variation. Each of these needs to be rolled out and measured, with either A or B declared the winner. We cannot mix and match these three categories, or there would be too many variables to isolate in analysis. To accomplish these goals, we have to decide where the shared infrastructure lives.
Our first experiment domain is the old vs. new infrastructure. These are two separately deployed services: the first is served from our Heroku-deployed Rails monolith, and the second is a Kubernetes-hosted Node service running server-side rendered Next.js. The two differ by host (e.g. k8s.opendoor.com). Next, our design test happens on the selected serving infrastructure. Each design variant will be on the same host but won’t share much code. Finally, we’ll test variations on the winning design (headlines, calls to action, and the like). These will be mostly the same code with conditional variations. Given these requirements, we have two options.
Option 1: Experiment logic in application code
If the experiment logic lives in the application code, then an infrastructure test would not work, as we’d have to wait until a backend was fulfilling the request before determining which infrastructure to use to fulfill it. If we’re on new infrastructure, we could assign in application code, but then we couldn’t cache requests to our backend, because we’d have to hit our upstream to determine the experiment group. Developers would also have to fake user information and experiment groups while developing.
Option 2: Experiment logic before hitting application code
If the experiment logic runs before application code, we can cache requests to our backend and determine which infrastructure to hit before we reach application code. This introduces the requirement that the request URL is the only data dependency for the backend. This requirement also simplifies the developer experience, as there’s no need to mock or fake experiment groups while developing. The Next.js backend doesn’t even know who the user is; it renders solely based on the request URL.
Decision: Option 2
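To make the "request URL is the only data dependency" requirement concrete, here is a minimal sketch of turning an experiment assignment into an upstream URL. Everything specific in it is an assumption for illustration: the `Assignment` shape, the `rails.example.com` placeholder host, and the `design` and `variation` query parameters are hypothetical; only k8s.opendoor.com appears in the post, as an example host.

```typescript
// Hypothetical types and names; hosts and query parameters are placeholders.
type Infrastructure = "rails" | "nextjs";
type Design = "old" | "new";

interface Assignment {
  infrastructure: Infrastructure;
  design: Design;
  variation: string | null; // e.g. a headline variant; null means control
}

const UPSTREAM_HOSTS: Record<Infrastructure, string> = {
  rails: "rails.example.com", // placeholder; the real host is not given in the post
  nextjs: "k8s.opendoor.com", // example host mentioned in the post
};

// Encode the whole assignment in the URL, so the backend never needs to know
// who the user is and identical assignments produce identical URLs.
function upstreamUrl(path: string, assignment: Assignment): string {
  const host = UPSTREAM_HOSTS[assignment.infrastructure];
  const params: string[] = ["design=" + encodeURIComponent(assignment.design)];
  if (assignment.variation !== null) {
    params.push("variation=" + encodeURIComponent(assignment.variation));
  }
  return "https://" + host + path + "?" + params.join("&");
}
```

Because two users in the same groups resolve to the same URL, the response can be cached by URL alone, and a developer can preview any combination locally just by editing the URL.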
On each request for a landing page, our worker does the following:
- Match the request against the landing page path (GET + $landing_page_path)
- Parse the “anonymous_id” cookie (generate a random UUID if it isn’t present)
- Check what experiment group the user is in (this is a stable assignment based on the user ID)
- Fetch the content for the URL associated with that group (and return the response to the user, usually cached)
- Log the experiment assignment
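The steps above can be sketched as a pure decision function that a worker would call before fetching. This is a sketch under stated assumptions, not Opendoor's worker: the post relies on Optimizely for assignment, whereas this example substitutes a simple FNV-1a hash of the user ID to show what "stable assignment" means. `LANDING_PAGE_PATH`, `Experiment`, `Plan`, and `planRequest` are hypothetical names, the example URLs are placeholders, and an equal traffic split is assumed.

```typescript
// Hypothetical names throughout. Hash-based bucketing stands in for the
// Optimizely assignment that the post actually uses.
const LANDING_PAGE_PATH = "/landing"; // placeholder for $landing_page_path

interface Experiment {
  key: string;
  variations: { name: string; url: string }[]; // equal split assumed
}

interface Plan {
  variation: string;
  upstreamUrl: string; // what the worker would fetch and return to the user
  logRecord: { userId: string; experiment: string; variation: string };
}

// 32-bit FNV-1a over UTF-16 code units (equivalent to bytes for ASCII ids).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, modulo 2^32.
    hash = (hash + ((hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24))) >>> 0;
  }
  return hash;
}

// Step 1: only GET requests for the landing page path are experimented on.
function matchesLandingPage(method: string, path: string): boolean {
  return method === "GET" && path === LANDING_PAGE_PATH;
}

// Step 3: the same user id and experiment key always land in the same bucket.
function assign(userId: string, experiment: Experiment): { name: string; url: string } {
  if (experiment.variations.length === 0) {
    throw new Error("experiment has no variations");
  }
  const bucket = fnv1a(experiment.key + ":" + userId) % experiment.variations.length;
  return experiment.variations[bucket];
}

// Steps 1, 3, 4, and 5 combined; returns null when the request should pass through.
// Step 2 (reading or generating the user id from the cookie) happens before this call.
function planRequest(method: string, path: string, userId: string, experiment: Experiment): Plan | null {
  if (!matchesLandingPage(method, path)) return null;
  const chosen = assign(userId, experiment);
  return {
    variation: chosen.name,
    upstreamUrl: chosen.url,
    logRecord: { userId: userId, experiment: experiment.key, variation: chosen.name },
  };
}
```

In a worker, the caller would fetch `plan.upstreamUrl`, return that response (setting the identity cookie for new users), and send `plan.logRecord` to whatever analytics pipeline feeds conversion analysis. Keeping the decision separate from the fetch makes the routing logic easy to unit test.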
A full working example of the concepts in this blog post (including sample infrastructure, design, and variation tests) is available as a Cloudflare Workers template using the following commands. This includes all the configuration necessary to integrate Optimizely with a worker (including a working webpack config, bundled libraries, and cookie management):
A simplified example of the worker code for this is as follows:
If you have any questions or feedback, feel free to open up an issue on the linked repository or reach out on twitter (@defjosiah).
In the end, we successfully ran all of our intended A/B tests on this new landing page infrastructure. The Cloudflare Worker logic has been rock solid (partially due to our testing strategy), and we haven’t run into any production issues. Developers love developing against this service, and adding new features and experiments is easy (mostly due to the request URL requirement). The focus of this post is not the actual conversion results, but we did boost conversion with the new infrastructure (due to page-speed improvement), release a new landing page design, and run a number of successful variation experiments. Compared to implementing a one-off proxy server (nginx or similar) with these features, Cloudflare Workers are a breeze.
Our overall Workers architecture, testing, deployment, and implementation deserve their own blog post. We also use Workers for serving organic content, first-party cookie management, user-agent differentiated responses, “serverless” lambda functions, localization, and frontend application serving. We’d like to share these use cases with you all. We love Workers and think you will too!
If you’re interested in this type of work, Opendoor is hiring engineers! Head to our careers page to learn more.