Why Google web cache is not as important for SEO as you think

Justin George
Jul 18, 2019


Since the inception of Single Page Applications (SPAs), people using Angular, React and similar client-side JS libraries and frameworks have worried about the visibility of their websites on the web. This write-up looks at one such case and at where we stand on SEO for client-side JavaScript web applications.

In 2009, Google proposed a scheme for crawling AJAX content on websites. It was aimed at reading content that comes from JS scripts or AJAX calls. In 2015, Google announced in another blog post that it planned to deprecate the AJAX crawling scheme, since we have come a long way from that scenario. In Google's words:

In 2009, we made a proposal to make AJAX pages crawlable. Back then, our systems were not able to render and understand pages that use JavaScript to present content to users. Because “crawlers … [were] not able to see any content … created dynamically,” we proposed a set of practices that webmasters can follow in order to ensure that their AJAX-based applications are indexed by search engines.

Fast forward to 2019

I came across a similar issue while building a website using ReactJS. To do well in SEO, all the necessary steps were taken, such as setting the title, description, og tags, etc. All was good until we checked Google web cache. All the sub-pages of the website were falling back to the homepage. For instance, the ‘contact-us’ URL would show the homepage in Google web cache.
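For reference, the kind of per-page metadata mentioned above can be sketched roughly as follows. This is only an illustration: the post does not say how the tags were set, and the `PageMeta` shape, the `buildHeadTags` helper and the sample values are all hypothetical.

```typescript
// Hypothetical per-page metadata; field names and values are illustrative.
interface PageMeta {
  title: string;
  description: string;
  ogImage: string;
}

// Escape characters that would break an HTML attribute or text node.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the head tags for one page: title, description and Open Graph tags.
function buildHeadTags(meta: PageMeta): string[] {
  return [
    `<title>${escapeHtml(meta.title)}</title>`,
    `<meta name="description" content="${escapeHtml(meta.description)}">`,
    `<meta property="og:title" content="${escapeHtml(meta.title)}">`,
    `<meta property="og:description" content="${escapeHtml(meta.description)}">`,
    `<meta property="og:image" content="${escapeHtml(meta.ogImage)}">`,
  ];
}

// Example values for a contact page (assumed, not taken from the real site).
const contactMeta: PageMeta = {
  title: "Contact us",
  description: "How to get in touch with the team",
  ogImage: "https://example.com/contact.png",
};

console.log(buildHeadTags(contactMeta).join("\n"));
```

In a client-rendered app these tags are usually applied only after the JavaScript bundle has run, which is why the raw HTML and the rendered page can differ.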

I was told that this was a major hit for SEO and should be fixed ASAP. It was not that I was reluctant to address the issue; I wanted to know how much real impact it has on SEO.

Below are a few points I came across which suggest that Google web cache content does not have much of an impact in the modern JS world.

  1. Google caches the HTML source code of a page, not an HTML snapshot after JavaScript has been executed. If your site is built with AngularJS, or any other JavaScript framework, it’s likely that the cached page from Google doesn’t represent the actual page as seen by users. The explanation in this case is quite simple. The routing system of AngularJS, when using the HTML5 mode (pushState), relies on the URL itself to make the AJAX call in order to serve the right content on the right page. Read More
  2. There are cases which show that what you see in the cache is sometimes completely different from what will eventually be indexed. Read more about the Hulu case study.
  3. If you search for the importance of Google web cache, you might come across articles like this one. But it was written before 2015 and, as Google mentioned, we have come a long way since then.
  4. John Mueller: https://twitter.com/JohnMu/status/1151080532203266053
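The behaviour described in point 1 can be sketched in a few lines. This is a simplified model, not the real routing code of AngularJS or React: the route table, the `rawHtml` and `resolveView` helpers, and the `/search` path used for the cached copy are all assumptions made for illustration.

```typescript
// Hypothetical route table; paths and page names are illustrative only.
const routes = new Map<string, string>([
  ["/", "Home page"],
  ["/contact-us", "Contact page"],
]);

// The raw HTML a typical SPA host returns for every path: an empty shell.
// This is the kind of un-rendered source a cache of the HTML would hold.
function rawHtml(_requestedPath: string): string {
  return '<div id="root"></div><script src="/bundle.js"></script>';
}

// A pushState-style router picks the view from the path the browser is
// currently on, falling back to the home page when nothing matches.
function resolveView(currentPath: string): string {
  const view = routes.get(currentPath);
  return view !== undefined ? view : "Home page";
}

// The shell is identical whichever page was requested.
console.log(rawHtml("/") === rawHtml("/contact-us")); // prints true

// On the live site the browser path matches the route.
console.log(resolveView("/contact-us")); // prints "Contact page"

// A cached copy is opened from a different address (assumed here to have
// the path "/search"), so the router no longer sees "/contact-us".
console.log(resolveView("/search")); // prints "Home page"
```

Under these assumptions, the cached copy of a sub-page shows the homepage because the router runs against the cache address, not because the live page is broken.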

With Fetch & Render, Google (the one we care about the most) can read the content of Single Page Applications. The Google webmaster console lets you see how your website is rendered and viewed. If your Fetch & Render output looks good, that is mostly sufficient, irrespective of what Google Cache shows.
