A case of Server Side Rendering (SSR) and Googlebot not being able to crawl

Dominic PrawnStar Bauer
Oct 17 · 5 min read

It was Wednesday. Thursday and Friday promised to be sunny: perfect beach days. But then, of course, our client found a bug, as clients do on optimal beach days. The problem: Google wasn’t indexing certain pages.

I was so close!

Now a few things to note before we move on:

  1. The pages were loading correctly in browser
  2. The pages were returning a 5xx server error to the Googlebot
  3. We had no idea WTF to do
[Graph of pages not being indexable, from Google Search Console]

As with all things you have no idea what to do about, we started throwing sh*t against the wall and hoping something would stick.

First port of call was to check whether other crawlers were experiencing the same issue. I tried this one, by Robert Hammond (thanks for the free online crawler), which was pretty useful. It showed me that only Googlebot was experiencing the issue.

So what was the actual error? I found another useful resource, Websniffer, which also shows the headers the server returns.

GOOGLEBOT FAIL

Before we get ahead of ourselves let me give you some background into the architecture of the website:

  1. We use the Vue framework for the frontend
  2. Nuxt is used for Server Side Rendering (SSR)
  3. AWS is used to host the website
  4. Elastic Beanstalk handles provisioning AWS resources to get the website up and running (servers, load balancers, etc.)
  5. CloudFront caches the website at edge locations so users can access it faster from different parts of the world.
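To make the setup above concrete, here is a minimal sketch of what the Nuxt server config might look like (hypothetical, not our actual config; the `ssr` flag name is from Nuxt 2.13+, older versions used `mode: 'universal'`). The key point is that every page is rendered on the server first, so an exception thrown during that render becomes a 5xx response before CloudFront or the browser ever sees HTML.

```javascript
// nuxt.config.js — illustrative sketch only
export default {
  ssr: true, // render each page on the server before it is cached/served

  server: {
    port: process.env.PORT || 3000, // Elastic Beanstalk injects PORT
    host: '0.0.0.0'                 // listen on all interfaces behind the load balancer
  }
}
```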

Okay so back to the header.

It must be CloudFront right…?

As you can see from the above, I was led to believe that CloudFront was in fact the issue. The initial hypothesis was that Googlebot was hitting a CloudFront edge location that had cached pages with 500s.

To cut a long (LONG) story short, I was wrong. As I have begun to realize at this embryonic stage of my career: nine times out of ten, CloudFront is NOT the problem.

Thanks to the help of a colleague aptly nicknamed Bane (yeah, he looks like Batman’s nemesis), I got unstuck. He pointed out that Googlebot wasn’t picking up the page even when using the URL of the original Elastic Beanstalk instance, which ruled out a CloudFront issue.

(Bane also showed me another really cool resource, probably the only one you’d need when testing Googlebot issues. Thanks, Technical SEO, for the free resource.)

Okay so if CloudFront is not the issue THEN WHAT IS!?

Again Bane came to the rescue (yeah, Batman, up your game bro) and decided to take a look at the Elastic Beanstalk logs. Looking back, this was the most logical thing to have done; hindsight is always 20/20.

3:08:56  ERROR  Cannot read property 'map' of undefined

AHA! The error was in the function checkAndChangeBreadcrumbString. Oh yes, and guess who git blame pointed at… Yep, it was me.

Mmmmmm I could almost smell the solution.

checkAndChangeBreadcrumbString () {
  return this.post.breadcrumbs.map(breadcrumb =>
    breadcrumb.name === 'Why go'
      ? { ...breadcrumb, name: 'Tours & Safaris' }
      : breadcrumb
  )
},

checkAndChangeBreadcrumbString was meant to find any breadcrumb named ‘Why go’ and rename it to ‘Tours & Safaris’. An important detail is that it was a computed property, which means it runs on the initial load and then re-runs whenever one of its reactive dependencies changes.
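As a rough plain-JavaScript analogy of that behaviour (this is not Vue’s actual implementation, just a sketch of the caching idea): a computed property evaluates its getter once, caches the result, and only re-evaluates after a dependency has been marked dirty. In real Vue the invalidation happens automatically through the reactivity system.

```javascript
// Sketch of a computed property: cached getter + manual invalidation.
function computed (getter) {
  let cached
  let dirty = true
  return {
    get value () {
      if (dirty) {          // re-evaluate only when a dependency changed
        cached = getter()
        dirty = false
      }
      return cached         // otherwise serve the cached result
    },
    invalidate () { dirty = true } // Vue does this for you via reactivity
  }
}

const post = { breadcrumbs: [{ name: 'Why go' }] }
const crumbs = computed(() =>
  post.breadcrumbs.map(b =>
    b.name === 'Why go' ? { ...b, name: 'Tours & Safaris' } : b
  )
)

console.log(crumbs.value[0].name) // 'Tours & Safaris'
```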

The fatal flaw, however, was that it assumed this.post.breadcrumbs already existed. Since this.post is data fetched from an API, it doesn’t exist until the API request completes.

There was a race condition: when Googlebot requested the site, the server-side render ran the computed property before this.post had finished loading, the .map call threw, and the server returned a 5xx error.
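The crash is easy to reproduce outside Vue. Here the computed property is pulled out as a plain function (a hypothetical standalone version for illustration), and the post object stands in for this.post before the API response has arrived, so breadcrumbs is still undefined:

```javascript
// Minimal reproduction of the SSR crash: breadcrumbs is undefined until the
// API request resolves, so calling .map on it throws a TypeError.
function checkAndChangeBreadcrumbString (post) {
  return post.breadcrumbs.map(breadcrumb =>
    breadcrumb.name === 'Why go'
      ? { ...breadcrumb, name: 'Tours & Safaris' }
      : breadcrumb
  )
}

const post = {} // API hasn't resolved yet: post.breadcrumbs is undefined

try {
  checkAndChangeBreadcrumbString(post)
} catch (err) {
  console.log(err instanceof TypeError) // true
  console.log(err.message)              // mentions reading 'map' of undefined
}
```

During SSR there is no try/catch around the render, so this exception bubbles up and Nuxt answers with a 500.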

So, the fix was as follows:

checkAndChangeBreadcrumbString () {
  if (this.post.breadcrumbs && this.post.breadcrumbs.length) {
    return this.post.breadcrumbs.map(breadcrumb =>
      breadcrumb.name === 'Why go'
        ? { ...breadcrumb, name: 'Tours & Safaris' }
        : breadcrumb
    )
  }
  return []
},

Adding the if (this.post.breadcrumbs && this.post.breadcrumbs.length) guard solved the issue: the map only runs once this.post.breadcrumbs exists and is non-empty, and until then the function returns an empty array instead of throwing.
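Pulled out as a plain function again (a standalone sketch of the fixed computed property, with an extra guard on post itself), the guarded version degrades gracefully whether or not the API data is there:

```javascript
// Guarded version: returns [] until the API data arrives, maps it afterwards.
function checkAndChangeBreadcrumbString (post) {
  if (post && post.breadcrumbs && post.breadcrumbs.length) {
    return post.breadcrumbs.map(breadcrumb =>
      breadcrumb.name === 'Why go'
        ? { ...breadcrumb, name: 'Tours & Safaris' }
        : breadcrumb
    )
  }
  return [] // safe fallback while the API request is still in flight
}

console.log(checkAndChangeBreadcrumbString({}))
// []
console.log(checkAndChangeBreadcrumbString({ breadcrumbs: [{ name: 'Why go' }] }))
// [ { name: 'Tours & Safaris' } ]
```

If your Node version or Babel setup supports it, optional chaining expresses the same idea more tersely: `this.post?.breadcrumbs?.map(...) ?? []`.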

SUCCESS

Now Googlebot could finally crawl the page because there was no longer an undefined object blowing up the render; the data simply loaded once it was ready. The page was happy, Googlebot was happy, so we were happy, the client was happy, and everyone was happy… THE END (but not really, because #reflection)

Lessons I learned:

  1. Check search errors in Google Search Console regularly to detect issues early
  2. A page loading in the browser DOES NOT mean Googlebot can index it
  3. The problem is probably not CloudFront even though it may seem like it
  4. You are probably the culprit (well maybe that just applies to me)
  5. Looking at logs is the way to solve problems, because you find out what the problem actually is
  6. You can’t assume the data will always be available when Googlebot initially crawls the site
  7. HINDSIGHT IS ALWAYS 20/20
  8. Bane > Batman
  9. Everyone is never happy

NONA

A high-end custom software development studio focused on long-term partnership. Get in touch: studio@nona.digital

Dominic PrawnStar Bauer

Written by

Slowly trying to piece the world together one idea at a time :).
