Server-side rendering with Puppeteer and headless Chrome: a practical guide
TL;DR: serve JS to users and server-side rendered content to bots (the source code is available here: http://bit.ly/2m6HN8w)
While building Binge, my framework of choice was Vue.js, built as a single-page application. I'm happy with both choices.
If you (assuming you're a dev) view the source code, you'll only see a couple of lines: some meta tags and an almost empty <body> tag containing a root div and a bundled JS file. For regular visitors it's a typical website: the browser downloads the bundle and renders the page.
When Google (or another search engine or social network) parses the page, all it sees is that (almost) empty <body />.
Google does understand JS; it just needs a couple of days to process and index it (as this Google Webmasters video explains).
The solution I used: server-side rendering with Puppeteer and Google Chrome. My goals for implementing SSR were:
- When people share a link on a social media platform, the HTML should be pre-rendered so a pretty preview (you know, text and an image) is shown instead of just plain text or a bare link. If Facebook can't parse the HTML, it won't make the link look pretty.
- Search engines understand HTML. When bots visit the site, pre-render it and serve the rendered version.
- The setup should be resource-friendly and fast to respond.
The setup we will use:
- a Node server (with Express)
- Puppeteer with a reusable headless Chrome instance
- yarn as the package manager (instead of npm) in the examples
Let's use the existing GitHub repo to speed up the setup and go over the important bits.
# clone the repo
git clone https://github.com/dblazeski/express-chrome-ssr.git

# install the dependencies
cd express-chrome-ssr
yarn install
Our entry file is ./src/index.js
Importing express for our server and puppeteer for managing our headless Chrome instance and rendering the URL, then booting our app using express, looks like this:
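Here's a minimal sketch of what that entry file could look like (the real version is in ./src/index.js in the repo; the port number and variable names here are illustrative):

// src/index.js (sketch)
const express = require('express')
const puppeteer = require('puppeteer')

const app = express()
const PORT = process.env.PORT || 8000

// WebSocket endpoint of the long-lived headless Chrome instance.
// It starts out as null and is filled in on the first request,
// so the same browser can be reused across requests.
let browserWSEndpoint = null

app.listen(PORT, () => {
  console.log(`SSR server listening on port ${PORT}`)
})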
That’s the minimum setup we need for the server.
Let's add a route that will accept a url parameter: /ssr?url=http://google.com
Once we start the server, we can pass it our URLs and get the rendered HTML as the response.
Here’s the code:
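Below is a sketch of that route, continuing from the entry-file snippet above (the exact launch flags and error handling in the repo may differ):

// /ssr route registered on the app from the previous snippet (sketch)
const { ssr } = require('./ssr')

app.get('/ssr', async (req, res) => {
  const { url } = req.query
  if (!url) {
    return res.status(400).send('Missing ?url= parameter')
  }

  // Launch headless Chrome only once and remember its WebSocket endpoint,
  // so later requests reuse the same instance.
  if (!browserWSEndpoint) {
    const browser = await puppeteer.launch({ headless: true })
    browserWSEndpoint = browser.wsEndpoint()
  }

  const html = await ssr(url, browserWSEndpoint)
  res.send(html)
})

You can then try it with something like curl "http://localhost:8000/ssr?url=http://google.com" and you should get the fully rendered HTML back.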
What happens?
- We're registering our /ssr?url route.
- browserWSEndpoint (initialized to null in the first file above) is our headless Chrome instance. We talked about performance: we reuse the same instance because it's a lot faster, and just manage new "pages" (or tabs, if you will) on it. In my tests, this saves more than 0.5s out of a roughly 2s total response time. Chrome is only initialized when it crashes or the first time we ping our server.
- We're calling the async function ssr, which we imported above. We'll go over this function next.
Let’s take a look at the ssr function that actually does the render:
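The snippet below is a simplified sketch of that function; the line numbers referenced in the explanation that follows correspond to the actual file in the repo, not to this sketch:

// ssr.js (sketch): render a URL inside the shared headless Chrome instance
const puppeteer = require('puppeteer')

async function ssr(url, browserWSEndpoint) {
  // Connect to the already-running Chrome instance and open a new page (tab)
  const browser = await puppeteer.connect({ browserWSEndpoint })
  const page = await browser.newPage()

  // Wait for Chrome to fetch the URL and finish rendering;
  // networkidle0 resolves once there are no more in-flight network requests
  await page.goto(url, { waitUntil: 'networkidle0' })

  // Add a <base> tag so relative links in the pre-rendered HTML keep working
  await page.evaluate(pageUrl => {
    const base = document.createElement('base')
    base.href = pageUrl
    document.head.prepend(base)
  }, url)

  // Remove all <script> tags; they have already been executed by Chrome
  await page.evaluate(() => {
    document.querySelectorAll('script').forEach(script => script.remove())
  })

  // Grab the rendered HTML and close the page (tab), keeping the browser alive
  const html = await page.content()
  await page.close()

  return html
}

module.exports = { ssr }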
What happens?
- #1 Our async function accepts two params: the url and the existing browserWSEndpoint Chrome instance.
- #3 We init our browser and open a new page (or tab).
- #[6–10] We wait for Chrome to fetch the URL and render the page. networkidle0 comes in handy for SPAs that load resources with fetch requests; networkidle2 comes in handy for pages that do long-polling or any other ongoing side activity.
- #[13–18] We add the base tag to make sure relative links work.
- #[20–24] We remove all scripts, as they have already been executed.
- #[26–29] We get the page content and close the page (tab).
Once the render is complete, the result is passed back to our index.js file and sent back as the server's response (#[18–20] in ssr-2.js). That's the HTML we're after!
The content can then be displayed in the browser, and bots can parse it 🎉
Sending the content to the browser
This final step can vary depending on your programming language / framework. I use Laravel, so the example will be in Laravel / PHP, but I'm sure it's easy to follow.
- Check if the visitor is a bot
- Ping our Node.js server with the URL and get back the HTML
- Output the HTML directly to the browser
A PHP package that's good for user-agent detection is CrawlerDetect (and it has support for all the popular frameworks).
Pseudo code example:
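Here's a rough sketch of what that could look like in a Laravel controller. The controller name, route, view name, and SSR server address are illustrative assumptions; CrawlerDetect's isCrawler() is the real API:

<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use Jaybizzle\CrawlerDetect\CrawlerDetect;

class PageController extends Controller
{
    public function show(Request $request)
    {
        $crawlerDetect = new CrawlerDetect;

        // 1. Check if the visitor is a bot
        if ($crawlerDetect->isCrawler($request->userAgent())) {
            // 2. Ping our Node.js server with the current URL and get the rendered HTML
            $ssrUrl = 'http://localhost:8000/ssr?url=' . urlencode($request->fullUrl());
            $html = file_get_contents($ssrUrl);

            // 3. Output the HTML directly to the browser (the bot)
            return response($html);
        }

        // Regular visitors get the usual SPA shell
        return view('app');
    }
}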
The source code with examples you can use is available at http://bit.ly/2m6HN8w. The repo also has ready server scripts (see package.json) you can use with nodemon or pm2.
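For example, assuming the entry file sketched above, you could keep a production process running with pm2 (the actual script names in the repo's package.json may differ):

pm2 start src/index.js --name ssr-server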
I started using SSR in an attempt to serve the content bots require for rich link previews. If you're an avid movie fan, check out my latest project, Binge.
Thanks for reading.
—
On an unrelated note, are your MacBook Pro animations lagging? You should try switching to the dedicated (more powerful) graphics card when on power. You can automate it with this app: https://gum.co/mac-auto-gpu