Building NOS’s new Google AMP implementation

Laurens Hoek
NOS Digital
8 min read · Sep 24, 2019

NOS is the largest Dutch public news and sports broadcasting organization. At NOS, we care about informing the public on what’s happening around the world, so that people are better equipped to make decisions. We’re a broadcasting organization by origin, and digital plays a pivotal role in our future. This post is about digital: how we work with Google’s Accelerated Mobile Pages (AMP), how that began, and how we are enabling it at NOS today. We hope you enjoy it! Please let us know what you think.

It can be hard to understand our website as a product. Some of our visitors start their daily routine by checking the news. Some only visit us on Sundays to check the football scores. Others just visit us because one of their friends shared one of our stories with them.

The cohorts of people visiting our website are always changing, but the big patterns are fairly stable nowadays. One of these big patterns is the way we receive visitors from Google Search. Hundreds of thousands of visitors find us every day through Google Search. This number tells us something about Google’s enormous scale, but it also tells us how news as a service fits into search. If you want to verify something you talked about with a friend, for example, you could use a search engine to dig into it further.

First steps with Google AMP

In February of 2016, Google introduced Accelerated Mobile Pages, a web standard that Google prefers over regular web pages. If your news story is powered by AMP technology, it’s typically ranked higher in Google Search results than regular web pages with the same story. It’s even ranked higher than Instant apps.

Example of a Google search resulting in our AMP article.

Shortly after the launch of Google AMP, we started publishing our stories in the AMP format, mainly to preserve and build on our existing Google traffic. Nowadays other platforms, like Twitter, also prefer AMP for its speed. In our first implementation, we extended our regular website story format to the AMP format. A decision which would come back to bite us later on…

We serve out hundreds of thousands of AMP articles daily.

Visitor intentions

Coming from Google, visitors usually read a single article and leave right away. They want to learn something specific and then move on. It’s notoriously hard to convince these visitors to do anything besides reading the story they came for. We experimented with some new features to increase engagement, but these haven’t been very effective. Instead, we now focus on making it as easy and comfortable as possible to access a story from search. However, as volumes are very high and we’d like to retain some of these visitors, we still keep a close eye on increasing engagement.

GDPR as our breaking point

Initially we built our AMP pages as an extension to our website. This worked well for a while, but this architectural decision started breaking down at the end of 2018.

When working on compliance with the European GDPR (privacy) guidelines, we decided to handle social media embeds in our news stories differently. Unfortunately, we could not guarantee GDPR compliance for our embed technology. Not all embeddable platforms ask for consent when processing user data, and we could not ask for this consent ourselves, as we do not have clear insight into what exactly these platforms are doing.

As a result, we decided to stop using this embed technology. Instead, we built a window that displays the content ourselves. So a Twitter post on our website now looks like a post on Twitter, but it’s not an embed. It’s a piece of content we render ourselves, built with the data we acquire from Twitter. This decision is a reminder of how one of the cool benefits of web technology, embedding, has powered the flow of privacy-related data.
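To make the idea of rendering a post ourselves a bit more concrete, here is a minimal sketch in Python. It is an illustration, not our production code: the `SocialPost` fields, the `render_social_post` function and the CSS class name are all hypothetical, and the real data model is richer. The point is only that the output is plain, escaped markup with no third-party script or iframe.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class SocialPost:
    """Hypothetical, minimal shape of post data fetched server-side from a platform."""
    author_name: str
    author_handle: str
    text: str
    permalink: str


def render_social_post(post: SocialPost) -> str:
    """Render a post as static HTML: no third-party script, iframe, or tracker."""
    return (
        '<blockquote class="social-post">'
        # Escape every field: the text comes from an external platform.
        f"<p>{escape(post.text)}</p>"
        f"<footer>{escape(post.author_name)} (@{escape(post.author_handle)}) "
        f'<a href="{escape(post.permalink)}" rel="noopener">View original post</a>'
        "</footer></blockquote>"
    )
```

Because nothing is loaded from the platform in the visitor's browser, no visitor data reaches the platform unless the visitor chooses to follow the link.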

Sadly, our new handling of social embeds didn’t work in our AMP stories, as AMP is a separate set of web guidelines. It’s like HTML, but different, which is the main reason why some frontend engineers dislike AMP: it fragments the web. But for us it’s not an option to just drop AMP, or to leave out social embeds, as the latter would override editorial decisions. In short: we discovered it was no longer feasible to handle the stories on our regular website the same way as our AMP stories.

Isolating AMP

As mentioned earlier, our AMP application was initially built as an extension of our website. This was a logical decision at the time, as AMP was still an experimental product and the AMP standard was still mostly regarded as a subset of HTML and CSS standards.

As it turned out, however, the latter became less true over time. AMP and its standards have increasingly diverged from the HTML and CSS standards. There is currently some overlap between the respective standards, but they are truly distinct. Furthermore, there is no guarantee as to how much of this overlap will be retained in the future. On top of that, we noticed that our website became increasingly invested in rich client-side rendering, while AMP retains a heavy focus on high-speed server-side rendering.

These notions prompted our first challenge: we had to regard and treat our AMP application as a distinct product. Just as we would treat our website and smartphone apps as separate products. The solution to our challenge? We had to isolate AMP from our website.

The schematic displaying our architectural problem.

Our platform

In order to fully appreciate what isolating AMP means, it is important to get a very basic overview of our platform. By platform, we mean the totality of our products and the internal back-end services that they rely on.

All our products are provided with content through our back-end APIs. These APIs can be tailored towards the specific needs of a specific product, but they generally share a common content model.

We strongly prefer to separate content from presentation logic. Products therefore have sole responsibility over how the content is presented to the user. This can consist of two parts, being client-side components and server-side components, depending on the needs and the infrastructure of the product involved.
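A small Python sketch may help to show what separating content from presentation looks like in practice. The block types and function names below are assumptions made for illustration and do not reflect our actual content model. What the sketch does rely on is a property of AMP itself: images are written as `<amp-img>` elements with explicit dimensions instead of plain `<img>` tags.

```python
from dataclasses import dataclass
from html import escape
from typing import Union


@dataclass(frozen=True)
class Paragraph:
    text: str


@dataclass(frozen=True)
class Image:
    url: str
    width: int
    height: int
    alt: str


# Hypothetical shared content model: the API only describes *what* the content is.
Block = Union[Paragraph, Image]


def render_web(block: Block) -> str:
    """Presentation logic owned by the website."""
    if isinstance(block, Paragraph):
        return f"<p>{escape(block.text)}</p>"
    return f'<img src="{escape(block.url)}" alt="{escape(block.alt)}" loading="lazy">'


def render_amp(block: Block) -> str:
    """Presentation logic owned by the AMP product."""
    if isinstance(block, Paragraph):
        return f"<p>{escape(block.text)}</p>"
    # AMP uses <amp-img>, which needs explicit dimensions to reserve layout space.
    return (
        f'<amp-img src="{escape(block.url)}" alt="{escape(block.alt)}" '
        f'width="{block.width}" height="{block.height}" layout="responsive"></amp-img>'
    )
```

The same list of blocks can be passed to either renderer, so each product can change its presentation without touching the content or the other product.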

Technical redesign

Considering the platform architecture, we knew that we had to take care of a number of things:

  1. Creating a back-end API that could serve content for our new AMP application.
  2. Creating a server-side application that could consume the API and render content in the AMP standard.
  3. Implementing an operational infrastructure focused on very high speed and constant availability.

The API

For the API, we quickly discovered that the needs of our AMP product would not fundamentally differ from the needs of our website. Nonetheless, we decided to create a separate API. The rationale was that it would be worthwhile to be able to tailor the content model and tweak performance in isolation from our website API. Just as the standards diverge, we expect the needs to diverge as well, especially in the area of embeddable content and rich media, as mentioned earlier.

The AMP application

As we started work on the AMP application, we had two main concerns. The first was performance: as AMP is all about speed, our application had to be fast. The second was that our team had to either be familiar with the technology stack or be able to become familiar with it quickly.

We were very lucky, as Symfony 4 had been released just a year earlier. Symfony 4 has been designed as a micro-framework, so it’s all about building fast and lightweight applications. Our team was already very familiar with the Symfony ecosystem from applying it in other projects. As it fit our needs perfectly, it was an easy choice to make.

With Symfony 4 being around a year old, we had enough confidence in its stability. The Symfony ecosystem was able to provide us with most of the general purpose packages that we would need in order to effectively develop the application.

The specifics of the application are pretty straightforward. It exposes a single dynamic route, whose purpose is to serve our news articles in the AMP format. This is done through a very simple responsibility chain.

The content is consumed from the aforementioned API using a wrapped Guzzle client. We assumed that our content generally remains valid for at least one minute, so we cache the consumed content in a Redis instance to increase speed and decrease the load on our back-end services. Subsequently, the content is rendered through Twig, applying templates and extensions specifically developed for rendering our content in the AMP format. The rendered content is then served to the client.
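The chain described above (look in the cache, fall back to the API, store the result for a minute, render) can be sketched as follows. This is a Python illustration under stated assumptions, not our PHP code: `TtlCache` is an in-memory stand-in for Redis, and `serve_article`, `fetch` and `render` are hypothetical names standing in for the route handler, the Guzzle-based API client and the Twig rendering step.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory stand-in for a Redis instance with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            return None  # missing or expired
        return entry[1]

    def set(self, key: str, value: dict, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)


def serve_article(
    article_id: str,
    fetch: Callable[[str], dict],
    cache: TtlCache,
    render: Callable[[dict], str],
    ttl_seconds: float = 60.0,  # "valid for at least one minute"
) -> str:
    """Cache-aside lookup followed by rendering."""
    key = f"article:{article_id}"
    content = cache.get(key)
    if content is None:
        content = fetch(article_id)  # only hit the back-end on a cache miss
        cache.set(key, content, ttl_seconds)
    return render(content)
```

With a one-minute lifetime, the back-end sees at most roughly one request per article per minute from each cache, no matter how many visitors request that article.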

Our infrastructure

As lucky as we were with the release of Symfony 4, we were also very lucky with the OpenShift cluster that our hosting partner NPO had just provided us with. This allowed us to containerize the application, create an automated deployment pipeline, and implement automatic scaling, providing us with high performance whenever we need it. To increase speed and availability even further, we also implemented response caching and stale caching using an NGINX reverse proxy.
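For readers unfamiliar with stale caching, the behaviour can be modelled in a few lines of Python. This is a conceptual sketch, not our proxy configuration: in NGINX this kind of behaviour is typically configured with directives such as `proxy_cache` and `proxy_cache_use_stale`, and the class name and the freshness window below are assumptions chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class StaleOnErrorCache:
    """Serve fresh responses from cache; fall back to stale ones if upstream fails."""

    def __init__(
        self,
        upstream: Callable[[str], str],
        fresh_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._upstream = upstream
        self._fresh_seconds = fresh_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # path -> (stored_at, body)

    def get(self, path: str) -> str:
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] < self._fresh_seconds:
            return entry[1]  # fresh hit: upstream is not contacted
        try:
            body = self._upstream(path)
        except Exception:
            if entry is not None:
                return entry[1]  # stale, but better than an error page
            raise  # nothing cached yet, so the error has to surface
        self._entries[path] = (now, body)
        return body
```

The trade-off is that visitors may briefly see a slightly outdated article while the application is unavailable, which for a news site is usually preferable to seeing no article at all.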

Great success!

The rebuild of our AMP application proved to be a success. We now have a modern application, built on modern technology, using a modern infrastructure. It is highly maintainable and very fast. Serving our content in the AMP format now consistently takes mere milliseconds. We are very happy with this outcome, as it is exactly what we had aimed for.

The number of indexed articles in the AMP format is now stable and increasing.
The number of validation errors has dropped to virtually zero.

This story was co-authored by Tom van den Broek. Please let us know what you think!


Laurens Hoek
NOS Digital

Platform Architect, Software Developer, working on high volume applications and complex domains.