Bringing SEO to Angular Applications

Andres Rutnik
Published in Slalom Build · 12 min read · Aug 9, 2018

When writing single-page applications, it is easy and natural to get caught up in trying to create the ideal experience for the most common type of visitor: other humans like ourselves. This aggressive focus on one kind of visitor to our site can often leave another important group out in the cold: the crawlers and bots used by search engines such as Google. This guide will show how some easy-to-implement best practices and a move to server-side rendering can give your application the best of both worlds when it comes to SPA user experience and SEO.

Prerequisites

A working knowledge of Angular 5+ is assumed. Some parts of the guide deal with Angular 6 but knowledge of it is not strictly required.

Web site vs Web application

A lot of the inadvertent SEO mistakes we make come from the mindset that we are building web applications, not web sites. What's the difference? It's a subjective distinction, but in terms of where the effort is focused, I would say:

  • Web applications focus on natural and intuitive interactions for users
  • Web sites focus on making information generally available

But these two concepts don’t need to be mutually exclusive! By simply returning to the roots of web site development rules, we can maintain the slick look and feel of SPAs and put information in all the right places to make an ideal web site for crawlers.

Don’t hide content behind interactions

One principle to keep in mind when designing components is that crawlers are sort of dumb. They will click on your anchors, but they are not going to randomly swipe over elements or click on a div just because its content says "Read More". This is at odds with Angular, where a common practice for hiding information is to "*ngIf it out". And a lot of the time this makes sense! We use this practice to improve application performance by not having potentially heavyweight components just sitting in a non-visible part of the page.

However, this means that if you hide content on your page through clever interactions, chances are that a crawler is never going to see that content. You can mitigate this by simply using CSS rather than *ngIf to hide this kind of content. Of course, smart crawlers will notice that the text is hidden, and it will likely be weighted as less important than visible text. But this is a better outcome than the text not being accessible in the DOM at all. An example of this approach might look like the sketch below.
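Here the component and class names are hypothetical; the collapsed text stays in the DOM and is simply hidden with a CSS class:

import { Component } from '@angular/core';

@Component({
  selector: 'app-read-more',
  template: `
    <p>Here is a short summary of the widget.</p>

    <!-- Crawler-unfriendly: *ngIf would keep this out of the DOM entirely.
    <div *ngIf="expanded">{{ fullText }}</div>
    -->

    <!-- Crawler-friendly: the full text is always in the DOM, just hidden. -->
    <div [class.collapsed]="!expanded">{{ fullText }}</div>

    <button (click)="expanded = !expanded">Read More</button>
  `,
  styles: ['.collapsed { display: none; }']
})
export class ReadMoreComponent {
  expanded = false;
  fullText = 'The full, crawlable description of the widget lives here.';
}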

Don’t create “Virtual Anchors”

The component below shows an anti-pattern that I see a lot in Angular applications, something I call a 'virtual anchor'.
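Boiled down, and with hypothetical component and route names, it might look something like this:

import { Component } from '@angular/core';
import { Router } from '@angular/router';

@Component({
  selector: 'app-widget-card',
  template: `
    <!-- Looks and acts like a link to a human, but is not an anchor. -->
    <div class="widget-card" (click)="openDetails()">
      See widget details
    </div>
  `
})
export class WidgetCardComponent {
  widgetId = 42; // hypothetical

  constructor(private router: Router) {}

  openDetails(): void {
    // ...some extra logic, e.g. analytics tracking, runs here...
    this.router.navigate(['/widgets', this.widgetId]);
  }
}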

Basically, what's happening is that a click handler is attached to something like a <button> or <div> tag, and that handler performs some logic and then uses the injected Angular Router to navigate to another page. This is problematic for two reasons:

  1. Crawlers will likely not click on these kinds of elements, and even if they do, they will not establish a link between the source and destination page.
  2. This prevents the very convenient ‘Open in new tab’ function that browsers provide natively to actual anchor tags.

Instead of using Virtual Anchors, use an actual <a> tag with the routerLink directive. If you need to perform extra logic before navigating, you can still add a click handler to the anchor tag.
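Reworked as a real anchor, the same hypothetical widget card might look like this:

import { Component } from '@angular/core';

@Component({
  selector: 'app-widget-card',
  template: `
    <!-- Crawlers see a real link, and "Open in new tab" works again. -->
    <a [routerLink]="['/widgets', widgetId]" (click)="trackClick()">
      See widget details
    </a>
  `
})
export class WidgetCardComponent {
  widgetId = 42; // hypothetical

  trackClick(): void {
    // Any extra logic (analytics, etc.) can still run here; routerLink
    // takes care of the navigation itself.
  }
}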

Don’t forget about headings

One of the principles of good SEO is establishing the relative importance of different text on a page. An important tool for this in the web developer's kit is headings. It's common to completely forget about headings when designing the component hierarchy of an Angular application; whether or not they are included makes no visual difference in the final product. But this is something you need to consider to make sure crawlers focus on the right parts of your information. So use heading tags where it makes sense, and make sure that components which include heading tags cannot be arranged in such a way that an <h1> ends up inside an <h2>.
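As a rough sketch of what this means at the component level (the names are made up): page-level components own the <h1>, while reusable children stick to lower-level headings.

import { Component, Input } from '@angular/core';

// The routed page component owns the single <h1> for the page.
@Component({
  selector: 'app-widget-page',
  template: `
    <h1>{{ widgetName }}</h1>
    <app-widget-reviews [widgetName]="widgetName"></app-widget-reviews>
  `
})
export class WidgetPageComponent {
  widgetName = 'SuperWidget 3000';
}

// A reusable child component never goes above <h2>, so it can't outrank
// the page title it is nested under.
@Component({
  selector: 'app-widget-reviews',
  template: `
    <h2>Reviews for {{ widgetName }}</h2>
    <!-- review list here -->
  `
})
export class WidgetReviewsComponent {
  @Input() widgetName: string;
}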

Make “Search Result Pages” linkable

Returning to the principle that crawlers are dumb: consider a search page for a widget company. A crawler is not going to see a text input on a form and type in something like "Toronto widgets". Conceptually, to make search results available to crawlers, the following needs to happen:

  1. A search page needs to be set up that accepts search parameters through the path and/or the query.
  2. Links to specific searches you think the crawler might find interesting must be added to the sitemap or as anchor links on other pages of the site.

The strategy around point #2 is outside the scope of this article (Some helpful resources are https://yoast.com/internal-linking-for-seo-why-and-how/ and https://moz.com/learn/seo/internal-link). What’s important is that search components and pages should be designed with point #1 in mind so that you have the flexibility to create a link to any kind of search possible, allowing it to be injected wherever you want. This means importing the ActivatedRoute and reacting to its changes in path and query parameters to drive the search results on your page, instead of relying solely on your on-page query and filtering components.
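A sketch of a search page wired up this way, with a hypothetical WidgetSearchService doing the actual fetching:

import { Component, OnInit } from '@angular/core';
import { ActivatedRoute } from '@angular/router';
import { Observable } from 'rxjs';
import { switchMap } from 'rxjs/operators';

// Hypothetical service and model for this example.
import { Widget, WidgetSearchService } from './widget-search.service';

@Component({
  selector: 'app-search-page',
  template: `
    <!-- app-search-form and app-widget-result are hypothetical child components. -->
    <app-search-form></app-search-form>
    <app-widget-result *ngFor="let widget of results$ | async" [widget]="widget">
    </app-widget-result>
  `
})
export class SearchPageComponent implements OnInit {
  results$: Observable<Widget[]>;

  constructor(
    private route: ActivatedRoute,
    private searchService: WidgetSearchService
  ) {}

  ngOnInit(): void {
    // The URL is the source of truth: /search?q=Toronto+widgets&page=2 is
    // linkable from a sitemap or another page, and any change to the query
    // parameters re-runs the search.
    this.results$ = this.route.queryParamMap.pipe(
      switchMap(params =>
        this.searchService.search(params.get('q') || '', +(params.get('page') || 1))
      )
    );
  }
}

The on-page search form then simply navigates (updating the query parameters) rather than triggering the search itself, so every possible search has a URL.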

Make pagination linkable

While on the subject of search pages, it’s important to make sure that pagination is handled correctly so that crawlers can access every single page of your search results if they so choose. There are a couple of best practices you can follow to ensure this.

To reiterate earlier points: do not use "Virtual Anchors" for your "next", "previous" and "page number" links. If a crawler can't see these as anchors, it may never look at anything past your first page. Use actual <a> tags with RouterLink for these. Also, include pagination as an optional part of your linkable search URLs; this often comes in the form of a page= query parameter.

You can provide additional hints to crawlers about the pagination of your site by adding rel="prev" and rel="next" <link> tags. An explanation of why these can be useful can be found at: https://webmasters.googleblog.com/2011/09/pagination-with-relnext-and-relprev.html. A service can manage these <link> tags for you automatically in an Angular-friendly way.
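A minimal sketch of such a service, written against the injected DOCUMENT token so it also behaves during a server render (the class and method names are illustrative):

import { Inject, Injectable } from '@angular/core';
import { DOCUMENT } from '@angular/common';

@Injectable()
export class PaginationLinksService {
  constructor(@Inject(DOCUMENT) private document: Document) {}

  // Call this whenever the current page of results changes.
  setLinks(prevUrl: string | null, nextUrl: string | null): void {
    this.setLink('prev', prevUrl);
    this.setLink('next', nextUrl);
  }

  private setLink(rel: 'prev' | 'next', url: string | null): void {
    // Drop any tag left over from the previously rendered page.
    const existing = this.document.head.querySelector(`link[rel="${rel}"]`);
    if (existing) {
      this.document.head.removeChild(existing);
    }
    if (!url) {
      return;
    }
    const link = this.document.createElement('link');
    link.setAttribute('rel', rel);
    link.setAttribute('href', url);
    this.document.head.appendChild(link);
  }
}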

Include dynamic metadata

One of the first things we do to a new Angular application is make adjustments to the index.html file — setting the favicon, adding responsive meta tags and most likely setting the content of the <title> and <meta name="description"> tags to some sensible defaults for your application. But if you care about how your pages show up in search results, you can't stop there. On every route for your application you should dynamically set the title and description tags to match the page content. Not only will this help crawlers, it will help users as they will be able to see informative browser tab titles, bookmarks, and preview information when they share a link on social media. The snippet below shows how you can update these in an Angular-friendly way using the Meta and Title classes.
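For example, a details page might do something like this (the specific values would normally come from route data or an API response):

import { Component, OnInit } from '@angular/core';
import { Meta, Title } from '@angular/platform-browser';

@Component({
  selector: 'app-widget-details',
  template: `<h1>{{ pageTitle }}</h1>`
})
export class WidgetDetailsComponent implements OnInit {
  pageTitle = 'SuperWidget 3000'; // hypothetical page data

  constructor(private title: Title, private meta: Meta) {}

  ngOnInit(): void {
    // Set a page-specific tab title and search snippet description.
    this.title.setTitle(`${this.pageTitle} | Widgets R Us`);
    this.meta.updateTag({
      name: 'description',
      content: 'Pricing, specs and reviews for the SuperWidget 3000.'
    });
  }
}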

Test for crawlers breaking your code

Some third-party libraries or SDKs either shut themselves down or refuse to load from their hosting provider when they detect user agents belonging to search engine crawlers. If part of your functionality relies on these dependencies, provide a fallback for the ones that disallow crawlers. At the very least, your application should degrade gracefully in these cases rather than crash the client rendering process. A great tool for testing your code's interaction with crawlers is the Google Mobile-Friendly Test: https://search.google.com/test/mobile-friendly. Look in its output for errors signifying that the crawler was blocked from accessing an SDK.

Reducing bundle size with Angular 6

Bundle size in Angular applications is a well-known issue, but there are many optimizations a developer can make to mitigate it, including using AOT builds and being conservative with inclusion of third party libraries. However, to get the smallest possible Angular bundles today requires upgrading to Angular 6. The reason for this is the required parallel upgrade to RXJS 6, which offers significant improvements to its tree shaking capability. To actually get this improvement, there are some hard requirements for your application:

  • Remove the rxjs-compat library (which is added by default in the Angular 6 upgrade process) — this library makes your code backwards compatible with RXJS 5 but defeats the tree-shaking improvements.
  • Ensure all dependencies are referencing Angular 6 and do not use the rxjs-compat library.
  • Import RXJS operators one at a time instead of wholesale to ensure tree shaking can do its job. See https://github.com/ReactiveX/rxjs/blob/master/docs_app/content/guide/v6/migration.md for a full guide on migrating.
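To illustrate the last point, here is a toy before-and-after of the import styles:

// RxJS 5 patch-style imports pull in and mutate Observable globally,
// which defeats tree shaking:
//   import 'rxjs/add/observable/interval';
//   import 'rxjs/add/operator/map';

// RxJS 6 style: import only what you use and compose with pipe().
import { interval } from 'rxjs';
import { filter, map, take } from 'rxjs/operators';

const evenSquares$ = interval(1000).pipe(
  filter(n => n % 2 === 0),
  map(n => n * n),
  take(3)
);

evenSquares$.subscribe(value => console.log(value)); // logs 0, 4, 16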

Server Rendering

Even after following all of the preceding best practices you may find that your Angular web site is not ranked as high as you’d like it to be. One possible reason for this is one of the fundamental flaws with SPA frameworks in the context of SEO — they rely on Javascript to render the page. This problem can manifest in two ways:

  1. While Googlebot can execute Javascript, not every crawler will. For the ones that do not, all of your pages will look essentially empty to them.
  2. For a page to show useful content, the crawler has to wait for the Javascript bundles to download, the engine to parse them, the code to run, and any external XHRs to return; only then is there content in the DOM. Compared to more traditional server-rendered languages, where information is available in the DOM as soon as the document hits the browser, an SPA is likely to be penalized somewhat here.

Luckily, Angular has a solution to this problem that allows serving an application in a server rendered form: Angular Universal (https://github.com/angular/universal). A typical implementation using this solution looks like:

  1. A client makes a request for a particular url to your application server.
  2. The server proxies the request to a rendering service which is your Angular application running in a Node.js container. This service could be (but is not necessarily) on the same machine as the application server.
  3. The server version of the application renders complete HTML and CSS for the path and query requested, including <script> tags to download the client Angular application.
  4. The browser receives the page and can show content immediately. The client application loads asynchronously and once ready, re-renders the current page and replaces the static HTML the server rendered. Now the web site behaves like an SPA for any interaction going forward. This process should be seamless to a user browsing the site.
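As a rough sketch of steps 2 and 3, the Express server in a typical Universal project looks something like this (paths and bundle names vary by project; the universal-starter repository linked in the next section contains a complete version):

// server.ts (condensed sketch)
import 'zone.js/dist/zone-node';

import * as express from 'express';
import { join } from 'path';
import { ngExpressEngine } from '@nguniversal/express-engine';

// The AoT-compiled server bundle produced by the server build.
const { AppServerModuleNgFactory } = require('./dist/server/main');

const app = express();
const BROWSER_DIST = join(process.cwd(), 'dist', 'browser');

// Render incoming requests through the Angular app running in Node.
app.engine('html', ngExpressEngine({ bootstrap: AppServerModuleNgFactory }));
app.set('view engine', 'html');
app.set('views', BROWSER_DIST);

// Static assets (the client bundles referenced by the rendered HTML).
app.get('*.*', express.static(BROWSER_DIST));

// Everything else is server-rendered, then taken over by the client app.
app.get('*', (req, res) => {
  res.render('index', { req });
});

app.listen(4000, () => console.log('Listening on http://localhost:4000'));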

This magic does not come for free, however. A couple of times in this guide I've mentioned doing things in an 'Angular-friendly' way. What I really meant was 'Angular server-rendering-friendly'. All of the best practices you read about for Angular, such as not touching the DOM directly or limiting the use of setTimeout, will come back to bite you if you have not followed them, in the form of slow-loading or even totally broken pages. An extensive list of the Universal 'gotchas' can be found at: https://github.com/angular/universal/blob/master/docs/gotchas.md

Hello Server

There are a couple different options to get a project running with Universal:

  • For Angular 5 projects you can run the following command in an existing project:
    ng generate universal server
  • For Angular 6 projects there is not an official CLI command yet for creating a working Universal project with a client and server. You can run the following third-party command in an existing project:
    ng add @ng-toolkit/universal
  • You can also clone this repository to use as a starting point for your project or to merge into an existing one: https://github.com/angular/universal-starter

Dependency injection is your (server’s) friend

In a typical Angular Universal setup you will have three different application modules: a browser-only module, a server-only module, and a shared module. We can use this to our advantage by creating abstract services that our components inject, then providing client- and server-specific implementations in each module. Consider the example of a service that sets focus to an element: we define an abstract service, a client implementation, and a server implementation, provide each in its respective module, and inject the abstract service into components.
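A sketch of that pattern for a hypothetical FocusService:

import { Injectable } from '@angular/core';

// The abstract class doubles as the injection token that components use.
export abstract class FocusService {
  abstract focusElement(element: HTMLElement): void;
}

// Provided in the browser module: actually moves focus.
@Injectable()
export class BrowserFocusService extends FocusService {
  focusElement(element: HTMLElement): void {
    element.focus();
  }
}

// Provided in the server module: focus is meaningless during a server
// render, so this implementation does nothing.
@Injectable()
export class ServerFocusService extends FocusService {
  focusElement(element: HTMLElement): void {}
}

// app.browser.module.ts:  providers: [{ provide: FocusService, useClass: BrowserFocusService }]
// app.server.module.ts:   providers: [{ provide: FocusService, useClass: ServerFocusService }]
// Components simply inject the abstract FocusService and never know the difference.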

Fixing server-hostile dependencies

Any third-party component that does not follow Angular best practices (i.e. uses document or window) is going to crash the server rendering of any page that uses that component. The best option is to find a Universal-compatible alternative to the library. Sometimes this is not possible, or time constraints prevent replacing the dependency. In these cases there are two main options to prevent the library from interfering.

You can *ngIf out offending components on the server. An easy way to do this is to create a directive that can decide if an element will be rendered depending on the current platform:
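A minimal sketch of such a directive (the directive and selector names are made up for this example):

import {
  Directive, Inject, OnInit, PLATFORM_ID, TemplateRef, ViewContainerRef
} from '@angular/core';
import { isPlatformBrowser } from '@angular/common';

@Directive({ selector: '[appBrowserOnly]' })
export class BrowserOnlyDirective implements OnInit {
  constructor(
    @Inject(PLATFORM_ID) private platformId: Object,
    private templateRef: TemplateRef<any>,
    private viewContainer: ViewContainerRef
  ) {}

  ngOnInit(): void {
    if (isPlatformBrowser(this.platformId)) {
      // In the browser, stamp out the wrapped component as normal.
      this.viewContainer.createEmbeddedView(this.templateRef);
    } else {
      // On the server, never instantiate the offending component.
      this.viewContainer.clear();
    }
  }
}

Usage then looks like <fun-widget *appBrowserOnly></fun-widget>, which keeps the component out of the server render entirely.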

Some libraries are more problematic; the very act of importing the code may attempt to use browser-only dependencies that will crash the server render. An example is any library that imports jquery as an npm dependency rather than expecting the consumer to have jquery available in global scope. To make sure these libraries don't break the server, we must both *ngIf out the offending component and strip the dependent library out of the server's webpack build. Assuming the library that imports jquery is called 'jquery-funpicker', we can adjust the server's webpack configuration as shown below to strip it out of the server build.
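One way to do this, consistent with the webpack/empty.json file described just below, is a resolve alias rather than a module rule; 'jquery-funpicker' is of course a stand-in name:

// webpack.server.config.js (excerpt)
const path = require('path');

module.exports = {
  // ...the rest of your server-side webpack configuration...
  resolve: {
    alias: {
      // Any `import ... from 'jquery-funpicker'` in the server bundle now
      // resolves to an empty object instead of code that touches `window`.
      'jquery-funpicker': path.resolve(__dirname, 'webpack/empty.json')
    }
  }
};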

This also requires placing a file with the contents {} at webpack/empty.json in your project structure. The result is that the server bundle gets an empty implementation for its 'jquery-funpicker' import statement, which doesn't matter because we have already removed that component everywhere in the server application with our new directive.

Improve browser performance — don’t repeat your XHRs

Part of the design of Universal is that the client version of the application re-runs all of the logic that ran on the server to create the client view, including making the same XHR calls to your back end that the server render already made! This creates extra load on your back end and gives crawlers the impression that the page is still loading content, even though it will likely show the same information after those XHRs return. Unless there is a concern about data staleness, you should prevent the client application from duplicating XHRs the server already made. The TransferHttpCacheModule from Angular is a handy module that can help with this: https://github.com/angular/universal/blob/master/docs/transfer-http.md

Under the hood, the TransferHttpCacheModule uses the TransferState class, which can be used for any general-purpose state transfer from server to client.
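A sketch of using it directly in a data service (the key and endpoint names are illustrative):

import { Injectable } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { TransferState, makeStateKey } from '@angular/platform-browser';
import { Observable, of } from 'rxjs';
import { tap } from 'rxjs/operators';

const WIDGETS_KEY = makeStateKey<any[]>('widgets');

@Injectable()
export class WidgetService {
  constructor(private http: HttpClient, private transferState: TransferState) {}

  getWidgets(): Observable<any[]> {
    // On the client, the state serialized during the server render is
    // available immediately: no second XHR required.
    if (this.transferState.hasKey(WIDGETS_KEY)) {
      const widgets = this.transferState.get<any[]>(WIDGETS_KEY, []);
      this.transferState.remove(WIDGETS_KEY);
      return of(widgets);
    }
    // On the server (or on a cache miss), fetch normally and stash the
    // result so the client can reuse it.
    return this.http.get<any[]>('/api/widgets').pipe(
      tap(widgets => this.transferState.set(WIDGETS_KEY, widgets))
    );
  }
}

For this to work, the server module imports ServerTransferStateModule and the browser module imports BrowserTransferStateModule, so that the state gets serialized into the rendered HTML and read back on the client.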

Pre-render to move time-to-first-byte towards zero

One thing to consider when using Universal (or even a third-party rendering service like https://prerender.io/) is that a server-rendered page will have a longer time before the first byte hits the browser than a client-rendered page. This should make sense when you consider that to deliver a client-rendered page, the server essentially just has to hand over a static index.html. Universal, by contrast, won't complete a render until the application is considered 'stable'. Stability in the context of Angular is complicated, but the two biggest contributors to delayed stability will likely be:

  • Outstanding XHRs
  • Outstanding setTimeout calls

If you have no way to optimize the above any further, an option for reducing your time-to-first-byte is to simply pre-render some or all of the pages of your application and serve them from a cache. The Angular Universal starter repo linked earlier in this guide comes with an implementation for pre-rendering. Once you have your pre-rendered pages, depending on your architecture, a caching solution could be something like Varnish, Redis, a CDN, or a combination of technologies. By removing the rendering time from the server-to-client response path, you can provide extremely fast initial page loads to crawlers and to the human users of your application.

Conclusion

Many of the techniques in this article aren't just good for search engine crawlers; they create a more familiar web site experience for your users as well. Something as simple as having informative tab titles for different pages makes a world of difference for a relatively low implementation cost. And by embracing server-side rendering, you won't be caught out by unexpected production gaps, such as people sharing your site on social media and getting a blank thumbnail.

As the web evolves, I hope we will see a day when crawlers and screen-capture servers interact with web sites in a way that is more in line with how users interact on their devices, differentiating web applications from the web sites of old that they are forced to emulate. For now though, as developers we must continue to support the old world.
