No Monkey Business Static Progressive Web Apps (part 1)

or; How and why Interactive Investor uses decoupled Drupal, Gatsby.js, ReactJS and AWS to deliver rich content without making Google cry

Introduction

Back in June 2017, on completion of our acquisition of the European business of TD Direct Investing (branded as TD Waterhouse), we, the new Interactive Investor (ii), were presented with a technology challenge:

How do we bring two technology stacks together, in a new modern stack, to benefit all, from our customers through to our developers?

This series of posts will cover the technology and challenges we faced relaunching our public site (free research, news and analysis, discussion and product marketing site) — focusing on “the front-end” User Interface (UI), UX and Content Management System (CMS) layers.

In this first post, I, Dominic Fallows — Apps Technical Lead (web, mobile and content) at ii, will go further into the challenges we faced and the solution that enabled us to deliver our modern UI architecture.

In the next post in this series, we will discuss our CMS architecture with a deep dive by Elliot Ward — CMS Technical Lead at ii.

Our challenge… is our opportunity!

How do we bring two technology stacks together, in a new modern stack, to benefit all, from developers through to our customers?

Bringing together two technology stacks

Our Product + Engineering teams believe in taking a user-centred and product-led approach. Our opportunity was to:

Modernise our solution, learning from the best and worst parts of our past technology stacks, whilst improving our experiences, using the latest technologies.

What makes a great UI?

There are many things that make up a great User Interface (UI) from a user, product and technology point of view. I have shared some of the qualities we identified and how we interpreted these to help form our UI architecture solution. From a technology perspective, a great UI should be:

  • Intuitive — we need a platform and technology that would enable us to build a high-quality UI allowing us to deliver on strong UX strategy and design
  • Attractive, clear and consistent — we need to be in full control of the UI templates and content production workflow allowing us to deliver on strong UI design and content strategy
  • Responsive and adaptive — deliver the same great experience across all devices
  • Fast and efficient — we need to create a modern web app and architecture that delivers great performance for users on their browsers and devices and also for search engines, like Google

What makes a great solution?

In this post, I’m going to focus on three aspects of what makes a great solution:

Experiences
Of course, that includes UX and a great UI, but also the experiences of our teams. Many teams work on our solution, including Web App developers, Mobile App developers, API + Services developers, DevOps engineers, Analysts and Testers, Product teams, and Marketing and Editorial content producers.

Happy and engaged teams make great products.

We needed a stack that would be attractive to work with for our teams, current and future. There was a huge appetite from our teams to develop their skills on a modern stack made up of the latest technologies.

Security
As a FinTech company, we operate in responsible ways to protect our customers. A great solution to us includes security, from Information Security (InfoSec) to DevSecOps — the philosophy of integrating security practices throughout the DevOps process.

Discoverability 
A great solution enables people to find our product and services, from our free research, news and analysis, discussion platform through to our product sign-up journeys.

Shaping our solution

Following on from our discovery and analysis of all the aspects above (and many more requirements from across the business) we started to shape a solution:

  • AWS Cloud Platform as our cloud computing provider
  • Micro-service architecture to deliver our scalable business and data APIs
  • Decoupled CMS to provide an enterprise content production experience
  • Familiar UI language and frameworks considering developers across API + Services (Java) and App (web, mobile) teams
  • An interactive, performant and scalable web app that is Search Engine Optimised (SEO)

UI language and frameworks for an interactive, performant and scalable web app that is Search Engine Optimised (SEO)

Working with web browsers is hard. There are many browsers actively used by our customers, such as Internet Explorer, Chrome, Safari and Firefox, each with its own unique way of working. Added to this is the wide variety of devices that is now commonplace (low-powered and high-powered smartphones and tablets, laptops and desktops), which adds further complexity.

We needed a UI language and framework that would make working with web browsers efficient, allowing us to focus on our objectives (such as UX/UI, Security and Discoverability, including SEO).

We needed to enable developers across teams to use their skills and work together on a large codebase. Our choice of UI language and framework should also enable our Java developers to up-skill in modern UI development.

After much research, and considering all of the popular web app options back in 2017, we selected the following for our UI language and framework:

  1. UI language: TypeScript and JavaScript
    TypeScript is a superset of JavaScript (JS) enabling us to create a large robust codebase through aspects like static typing, classes and interfaces — familiar to JavaScript and Java developers.
  2. UI library: React
    React enables developers to work with a virtual representation of the page (the virtual DOM), which acts as an agent between the developer and the real browser, smoothing over many of the differences in how each browser works with web apps. As the React team themselves say, “React makes it painless to create interactive UIs”.
  3. UI framework: GatsbyJS (v0-v1)
    In 2017, Gatsby v0-v1 was a JavaScript framework, single-page application (SPA) and static site generator. Gatsby is powered by React (and other technologies, more below) but essentially gave us the ability to create a directory of static HTML and JavaScript files, which enhances SEO value.
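
To illustrate the point about static typing, classes and interfaces, here is a minimal sketch. The `Instrument` type, the `PriceFormatter` class and the sample values are hypothetical and not taken from our codebase; they only show the kind of structure that tends to feel familiar to Java developers.

```typescript
// Hypothetical domain type, for illustration only.
interface Instrument {
  readonly symbol: string;
  readonly name: string;
  readonly priceInCents: number;
}

// A generic interface, much like one a Java developer would write.
interface Formatter<T> {
  format(value: T): string;
}

class PriceFormatter implements Formatter<Instrument> {
  constructor(private readonly currencySymbol: string) {}

  format(instrument: Instrument): string {
    const amount = (instrument.priceInCents / 100).toFixed(2);
    return `${instrument.name} (${instrument.symbol}): ${this.currencySymbol}${amount}`;
  }
}

const formatter = new PriceFormatter("$");
const label = formatter.format({ symbol: "MSFT", name: "Microsoft", priceInCents: 12345 });
// label is "Microsoft (MSFT): $123.45"

// formatter.format({ symbol: "MSFT" }) would be rejected by the compiler,
// because `name` and `priceInCents` are missing.
```

The mistake in the final comment is caught at compile time rather than in a customer's browser, which is a large part of what makes a big shared codebase manageable.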

We chose Drupal as our headless CMS, which we will cover in the next post in this series.

We were very happy we had found the perfect solution to deliver our objectives, so we got head-down and kicked off the development of our new public site in October 2017.

Google and JavaScript Web Apps

During our selection of UI language and framework, we did a deep dive into SEO and Google’s handling of JavaScript (JS) web apps. It’s important for search engines, especially Google, to be able to render websites and apps so they can be included in Search results. Search results are one of the ways of enabling customers to find our products and services.

We created prototypes in React and Gatsby (v0–v1) to prove our analysis and found that Google could indeed render and index both our statically generated pages and dynamic JavaScript web pages.

Before 2018, Google was public about its efforts in the JavaScript ecosystem. It had its own JS framework (Angular) and developers were excited by the Google search engine’s increasing ability to render JS apps.

By the end of 2017, we were ready to launch a beta version of our public site. This beta site was small but helped us start to get customer opinion and to test our new web app with Google.

We started to release more pages to our beta public site, like our CMS-powered marketing pages and our Editorial news and analysis articles. By mid-2018, we had several hundred marketing pages and upwards of 8000 news and analysis articles on our web app.

Example of a CMS page — Analysis & Commentary Article (2019)

We already knew that we wanted to increase the SEO value of these pages, full of useful information about our products and our analysis and commentary. We used the power of Gatsby v0-v1 to create SEO-optimised static versions of these pages.

We also launched our research pages (like our research page for Microsoft). We had approximately 24,000 research pages which we started to test with users and Google. Given the number of research pages, and backed by our prototypes giving us confidence at the time in Google’s ability to render JavaScript, we decided to leave the research pages dynamic. Dynamic pages are those that populate data on request after the first load, rather than having static HTML delivered on the first load.
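
The difference between the two kinds of page can be sketched as follows. This is a simplified illustration, not our production code: the `ResearchData` shape, the `/app.js` path and the helper names are assumptions, and HTML escaping is left out for brevity. `visibleText` approximates what a client that does not execute JavaScript can read from the first response.

```typescript
// Hypothetical data shape for a research page.
interface ResearchData {
  symbol: string;
  name: string;
  summary: string;
}

// Static page: the content is already in the HTML when it is first delivered.
function renderStaticPage(data: ResearchData): string {
  return [
    "<html><body>",
    `<h1>${data.name} (${data.symbol})</h1>`,
    `<p>${data.summary}</p>`,
    "</body></html>",
  ].join("");
}

// Dynamic page: the first response is an empty shell that a script fills in later.
function renderDynamicShell(symbol: string): string {
  return [
    "<html><body>",
    `<div id="root" data-symbol="${symbol}"></div>`,
    '<script src="/app.js"></script>',
    "</body></html>",
  ].join("");
}

// Text readable from the first response without running any JavaScript.
function visibleText(html: string): string {
  return html
    .replace(/<script[\s\S]*?<\/script>/g, "")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

const sample: ResearchData = {
  symbol: "MSFT",
  name: "Microsoft",
  summary: "Example summary text.",
};
const staticText = visibleText(renderStaticPage(sample));
const dynamicText = visibleText(renderDynamicShell(sample.symbol));
```

With the static page, the heading and summary are readable straight away. With the dynamic shell, there is no readable text until the script has run, so indexing depends on the crawler rendering JavaScript.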

Example of a hybrid dynamic/static page — a research page for Microsoft (2019)

All went well, and we were happy with our solution choices.

We had an expected drop in organic traffic (we were changing our entire stack and URL structure, so we expected Google would take a little time to fully index our new site), but Google seemed to be indexing both our static and our dynamic pages well. We knew that our JavaScript pages would take longer than our static pages to index, based on our understanding of how Google renders and indexes JavaScript.

Google’s process to render and index JavaScript pages

However, by Q3/Q4 2018, our organic traffic wasn’t recovering as fast as we thought it should, and we started to experience problems with getting our pages indexed in Google. Some pages were taking months to appear; some didn’t appear at all. That was bad enough, but our scariest finding was that some pages started to get deindexed from Google’s search results.

What was going on?

After much research internally and with our SEO consultants, we found that, if a JavaScript page hadn’t been indexed on a second wave of indexing (orange section of the diagram above) BEFORE a new crawler came back to the page (blue section) then the crawler was marking pages for deindexation.

After seeing positive signs that Google knew about all of our pages, even if they were taking time to include them in search results, we now saw our pages disappearing from the indexing process altogether.

At the same time, Google started to adapt its recommendations for creating JS web apps (see this Google I/O ’18 conference talk) and began to recommend a concept called Dynamic Rendering.

“Currently, it’s difficult to process JavaScript and not all search engine crawlers are able to process it successfully or immediately. In the future, we hope that this problem can be fixed, but in the meantime, we recommend dynamic rendering as a workaround solution to this problem. Dynamic rendering means switching between client-side rendered and pre-rendered content for specific user agents.”
Quote from Get started with dynamic rendering

Dynamic Rendering

To put it another way, Dynamic Rendering means delivering a different rendering and experience of the web app to users and to Google. This concept was not new; it was something we had considered in our early planning days.

However, we felt there must be a better way to recover our SEO and indexing without reverting to more traditional methods of web app development.

The part of Dynamic Rendering whereby we set up a pipeline to render our JavaScript pages into static HTML was interesting. After all, we had the power of Gatsby (v0-v1) already doing that for our CMS pages.

We were not comfortable with the idea of switching our output based on who was requesting it. That approach is fraught with complications (we’d need to detect Google, Bing, Twitter, Facebook, LinkedIn, etc., all of which are search engines in their own right) and adds more moving parts to maintain.
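
To show the kind of switching we wanted to avoid, here is a minimal sketch of user-agent based Dynamic Rendering. It is not something we built. The pattern list is an illustrative, incomplete assumption, and every crawler missing from it would silently receive the client-side version.

```typescript
// Illustrative, incomplete list of crawler user-agent fragments.
// This is an assumption for the sketch, not an official or exhaustive list.
const BOT_PATTERNS: RegExp[] = [
  /googlebot/i,
  /bingbot/i,
  /twitterbot/i,
  /facebookexternalhit/i,
  /linkedinbot/i,
];

type Rendering = "prerendered" | "client-side";

// Decide which version of a page to serve, based only on the User-Agent header.
function chooseRendering(userAgent: string | undefined): Rendering {
  if (!userAgent) {
    return "client-side";
  }
  const isBot = BOT_PATTERNS.some((pattern) => pattern.test(userAgent));
  return isBot ? "prerendered" : "client-side";
}
```

Every new crawler, or change to an existing user-agent string, means another entry to maintain, and two renderings of each page that have to be kept in step.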

At the time, Gatsby (v0-v1) would not have been able to process the ~25,000 pages that we had selected for static versions. It simply required too much expensive cloud resource (GBs of RAM and lots of processor power) and took far too long to deploy (over 20 mins for our CMS content alone, not including the 24,000 research pages). Our content workflow requires us to publish, and therefore build and deploy, our web app many times a day.

Gatsby (v0-v1) was a powerful single-page app (SPA) and static-site generator, but we were asking too much of it by expecting it to handle our hybrid static/dynamic pages as well.

Gatsby v2 to the rescue!

Static Progressive Web App and Gatsby v2

The JS ecosystem and Gatsby v2 started to develop the idea of Static Progressive Web Apps (SPWA).

A Static Progressive Web App (SPWA) is an evolution and hybrid of a:

  • Static Site
    A site made up of static web pages (sometimes called a flat site) that is delivered to the user exactly as stored.
  • Single-Page Application (SPA)
    An app made up of dynamic web pages which are generated by a web application on a user’s browser. An SPA delivers all necessary code (HTML, JavaScript, and CSS) either with a single page load or dynamically loaded and added to the page as necessary, in response to user actions. Critically, the page content is not delivered to Google as static HTML.
  • Progressive Web Application (PWA)
    A PWA enhances SPAs with offline functionality, push notifications, and device hardware access traditionally available only to native applications. PWAs combine the power and flexibility of the web with the experience of a native application.

An SPWA meant that we could deliver the same great experience, optimised for robots and humans alike.

Gatsby v2 was now a powerful SPWA generator. The Gatsby team had made huge improvements to its internal codebase. This gave us confidence that, by combining the concept of an SPWA with Gatsby v2, we could start to process large numbers of pages (~25,000) with acceptable amounts of cloud resource and acceptable deploy times.
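
As a rough sketch of how static pages are registered at build time, Gatsby exposes a `createPages` hook in `gatsby-node.js`, which receives `actions` including `createPage`. The example below mirrors that shape with small local stand-in types so that it is self-contained. The `ResearchInstrument` type, the template path and the URL structure are hypothetical, not our actual setup.

```typescript
// Local stand-ins for the parts of Gatsby's Node API used here.
// In a real project these come from Gatsby itself.
interface PageInput {
  path: string;
  component: string;
  context: Record<string, unknown>;
}

interface Actions {
  createPage(page: PageInput): void;
}

// Hypothetical instrument record and template location.
interface ResearchInstrument {
  symbol: string;
  name: string;
}

const RESEARCH_TEMPLATE = "src/templates/research.tsx";

// Register one static page per instrument and return how many were created.
function createResearchPages(actions: Actions, instruments: ResearchInstrument[]): number {
  for (const instrument of instruments) {
    actions.createPage({
      path: `/research/${instrument.symbol.toLowerCase()}/`,
      component: RESEARCH_TEMPLATE,
      // The context is passed to the template so it can render this instrument.
      context: { symbol: instrument.symbol, name: instrument.name },
    });
  }
  return instruments.length;
}
```

In a Gatsby project, a function like this would be called from the `createPages` export with the list of instruments fetched at build time. Each registered page is written out as static HTML and then takes over as a React app in the browser.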

Summary and future

By November 2018, we had delivered our rebuilt site in Gatsby v2. We now statically generate ~25,000 web pages in approximately 5–6 mins. We are seeing fantastic SEO improvements: pages now get indexed in Google search results in hours and days (not weeks and months) and update in minutes and hours, and our organic traffic is ever increasing.

What’s next?

  • We continue to learn new ways of optimising Gatsby v2 and our React application
  • We are working on further ways to increase the efficiency of our SPWA
  • We will keep on improving the Product, UX and UI for our customers
  • We will continue to grow our modern stack

This is the first post in our series: No Monkey Business Static Progressive Web Apps or; How and why Interactive Investor uses decoupled Drupal, Gatsby.js, ReactJS and AWS to deliver rich content without making Google cry

In the next post, we will dig further into the Headless CMS (Drupal) architecture.

You can find our public site (free research, news and analysis, discussion and product marketing site) at https://www.ii.co.uk

Update 03/04/2019: added links to Elliot Ward’s post, No Monkey Business Static Progressive Web Apps (part 2)