Deep dive into SSG, SSR and how a CDN and HTTP caching can increase your performance by 10x
First, we dive into the difference between static site generation and server-side rendering. Then we look at how old-fashioned HTTP caching can increase your performance by 10 times without breaking a sweat!
Static site generation.
So it’s you and your computer. You have got your website and it’s time to build. So you say: yarn build or npm run build.
Let’s see what happens. Well, your computer’s fans spin up first of all, and then you’re probably going to make some network requests. You’re going to go off to some database somewhere, or a CMS: something that’s got the data that you’re going to use to build the page. There is work happening over on that server as well when you’re building, so it’s not just your computer that’s doing work here. There is stuff out in the cloud, stuff out on another server somewhere. Of course, you may have all this stuff locally on your machine.
But in most situations, you’re going off to some API somewhere and getting the data to build the static site, so it sends the data back to you. And now you have created a document. Then you do it again because you’ve got to make your next document, so we’re going back and forth and building documents. We do it again and again and again until we build every single document, every single page of your website.
Both of these things cost money. It costs you time, sitting there waiting for the build to finish. And it also costs money over on the servers, for them to process those requests so you can build the pages. There’s compute happening locally, or maybe you are doing the build in the cloud somewhere, and you are paying for that too.
Whenever there’s compute happening while you’re building a static site, whether it’s on some server or on your local machine, that costs money.
Now you’ve got a CDN and you upload everything to the CDN. What’s really great about static site generation is that all those documents are pre-rendered. They’re pre-built. They’re just static, sitting there on a CDN waiting for somebody to come and visit your website to get one of those documents. So they make a request and the CDN doesn’t have to do any work at all: it doesn’t have to build the page, it doesn’t have to render anything. It can just send the document right on back to the user, and the user is super happy.
It’s a fast response and it was a cached response so the user is happy because it’s super snappy. Now let’s say you edit some of the data, what does that mean for static site generation?
Well, something in your database changed but your CDN still has all those documents from the last deploy. So if a user visits the page, they’re going to get a fast response but it’s stale and it’s not up to date yet. Sometimes that’s fine.
In order for us to turn that stale content into fresh content, we have to rebuild the website again. So it’s you, your computer and the server that’s got all your data. We’ve only got one little changed document in there, but the way static site generation works is that you’ve got to rebuild every single one of those pages. That’s going to cost money, and then you upload everything again to the CDN. This means that you build every page, for every deploy and for any edit.
Let’s see how server-side rendering differs, with no CDN involved. We don’t have a big build step. There’s no build step at all. You just upload your website to the internet and then you build the pages on demand.
So when somebody asks for a page, you build the page on your server and then you send that page back to the user. This is kind of slow, because the user has to wait for the page to build. That’s why static sites are nice: you built everything before. You sat there and waited for the build so that your users don’t have to.
Every single visit costs you money. You’re running compute on every single one of those requests. However, if you have some pages that are never visited, you never build them!
With static site generation, you’re building every single page whether people visit them or not. With the server, you only build the pages that people visit. If no one visits, then the server never builds it and never sends it.
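To make that trade-off concrete, here is a rough sketch that simply counts page builds under each model. The function names and the numbers are made up for illustration, and real costs also depend on how expensive each page is to build.

```typescript
// Static site generation: every page is built on every deploy,
// whether anybody visits it or not.
function ssgBuilds(totalPages: number, deploys: number): number {
  return totalPages * deploys;
}

// Server-side rendering without a CDN: one build per visit,
// and pages nobody visits are never built.
function ssrBuilds(visits: number): number {
  return visits;
}

// Hypothetical site: 1,000 pages, 5 deploys (one per edit), 2,000 visits.
const ssg = ssgBuilds(1000, 5); // 5000 builds, all done ahead of time
const ssr = ssrBuilds(2000); // 2000 builds, but every visitor waits for one
console.log({ ssg, ssr });
```

Neither number is "better" on its own: the static builds are paid up front, the server builds are paid while a user is waiting.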
Content delivery network
Let’s throw a CDN into the middle here. The first visitor shows up. They actually make a request to the CDN, not to your server.
- CDN goes: “Oh I don’t have that document yet”
- CDN goes over to your origin server. You may have heard that term before: the origin server is your actual web server :)
- The user is still waiting for your origin server to build the page.
- The origin server is ready and sends the page to the CDN.
- CDN caches the page
- CDN then sends it to the user
This still takes time because we had to wait for the whole cycle. This is just the first person ever to visit this URL.
It costs us money to build that page and send it to them. However, with a CDN, the second visitor flow is:
- They request the page from the CDN
- CDN goes: “ah I know this! I’ve already gotten this from the origin server so I can leave the origin server alone”
- CDN sends cached response right back to the user
We got an immediate response: it was fast because it was cached, and it was still fresh and accurate. This saves you money too, because your server doesn’t do anything. You’re not building the page on the fly. And this isn’t just for the second visitor. It’s the same for the third, the fourth, the fifth, the hundredth, the millionth visitor.
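The first-visitor and second-visitor flows can be sketched with a toy in-memory "CDN". This is only a model of the idea, not how any real CDN is implemented; buildOnOrigin, requestViaCdn and the URL are hypothetical names.

```typescript
// Hypothetical origin server: building a page is the expensive step.
let originBuilds = 0;
function buildOnOrigin(url: string): string {
  originBuilds += 1;
  return "<html>page for " + url + "</html>";
}

// A toy CDN: just a URL -> document cache sitting in front of the origin.
const cdnCache = new Map<string, string>();

function requestViaCdn(url: string): { body: string; cached: boolean } {
  const hit = cdnCache.get(url);
  if (hit !== undefined) {
    // Second visitor onwards: the origin is left alone.
    return { body: hit, cached: true };
  }
  // First visitor: the CDN goes to the origin, waits, then caches the page.
  const body = buildOnOrigin(url);
  cdnCache.set(url, body);
  return { body: body, cached: false };
}

const first = requestViaCdn("/blog/hello"); // slow: built on the origin
const second = requestViaCdn("/blog/hello"); // fast: served from the cache
const third = requestViaCdn("/blog/hello"); // fast again, still only one build
```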
Control your cache
Then there’s this idea of max-age when you have a CDN. There is a Cache-Control header, and max-age is one of the values in there. It says:
“Hey, this is how long you should cache this thing. It’s in seconds so you can say, cache it for 60 seconds, cache it for a day or cache it for a year.”
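As a small sketch of what that looks like in practice, here is a helper that builds the header value. The helper name cacheFor and the constants are my own; the max-age directive itself is standard and always takes whole seconds.

```typescript
const MINUTE = 60;
const DAY = 24 * 60 * MINUTE;
const YEAR = 365 * DAY;

// Build a Cache-Control value. max-age is a whole number of seconds.
function cacheFor(seconds: number): string {
  return "public, max-age=" + Math.floor(seconds);
}

const oneMinute = cacheFor(MINUTE); // "public, max-age=60"
const oneDay = cacheFor(DAY); // "public, max-age=86400"
const oneYear = cacheFor(YEAR); // "public, max-age=31536000"

// The value goes on the response from your origin server, for example:
const responseHeaders = { "Cache-Control": oneDay };
```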
Let’s say that we were caching this for like a day and that time has passed. Well, what’s going to happen is a visitor shows up and says:
- User: Hey, I want that page
- CDN: “Oops, this page is stale…”
- CDN: “This is old. I was told to go to the origin server and have it rebuild this after a day, and it’s been a day.”
- CDN goes to the origin server and says: “hey I want the page.”
- You build the page on your origin server
- It comes back to the CDN
- CDN puts the new fresh one in the cache
- CDN then sends it to the user.
And of course, we got a “slow” response and then we had to pay money for our server to build that page. But every visitor after that, they now have a fast and fresh page.
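The expiry flow above boils down to one check: is the cached copy younger than its max-age? A minimal sketch, with time passed in explicitly as seconds so it is easy to follow (all names here are hypothetical):

```typescript
interface CachedPage {
  body: string;
  storedAt: number; // when the CDN cached it, in seconds
  maxAge: number; // how long it stays fresh, in seconds
}

const DAY = 86400;
let builds = 0;
const cache = new Map<string, CachedPage>();

// A copy is fresh while its age is strictly less than max-age.
function isFresh(page: CachedPage, now: number): boolean {
  return now - page.storedAt < page.maxAge;
}

function buildOnOrigin(url: string): string {
  builds += 1;
  return "page " + url + " v" + builds;
}

function request(url: string, now: number): string {
  const cached = cache.get(url);
  if (cached !== undefined && isFresh(cached, now)) {
    return cached.body; // fast response, origin untouched
  }
  // Missing or expired: go to the origin, then cache the new copy.
  const body = buildOnOrigin(url);
  cache.set(url, { body: body, storedAt: now, maxAge: DAY });
  return body;
}

const atDeploy = request("/a", 0); // "page /a v1" (first visitor waits)
const anHourLater = request("/a", 3600); // "page /a v1" (cached)
const aDayLater = request("/a", DAY); // "page /a v2" (expired, rebuilt)
```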
What’s interesting about this is that when we have an edit, we don’t need to rebuild the whole website like with static site generation. Only the pages that change need to get rebuilt, and with max-age you can just kind of put it on a timer and say:
“You know what, cache it for a day. When someone comes to this page after that, rebuild it, but for the rest of the day just give everyone else that built version”
Or maybe you decide this page is good for a week. You can make an edit anytime during that week and it’s not going to show up on your website yet. But after that, the CDN is like: “Oh hey, this thing’s expired. I’m going to rebuild it, then everyone will get the fresh one”
You kind of get to decide how often you want each page to get built, rather than having to build every single page.
Let's visualize how this works
🚢 Deploy website
📝 Data changed
🥱 User waits for the page
😄 Cached response
😅 Cached but stale
🕐 Max-age expired
If you don't have any caching and you don't have a CDN, you ship the site and everybody is a yawner, but everybody gets fresh data too. This is what we want to avoid. It's expensive for you, because you are building a page for every single request. But what's good about it is that everybody gets something fresh!
Now take max-age for a year. You ship the website. The first person ever to visit a URL on that website gets a yawner, because they have to wait for the page to build. But then everybody after that gets a smiler until you edit. Once you edit, you get the sweater, because we said this thing lasts for a year. So even if you edit it, the CDN is not going back to get a fresh copy. You can even deploy your website again and nothing is going to change, because you told the CDN this thing is good for a year. So no one will get a new version of that document until the year is up.
Now say this is good for 60 seconds, 1 minute. You ship the website. You get a yawner for the first visitor, and then for the next 60 seconds everyone is a smiler: fast cached responses. You edit the page and you get the sweater, because it's fast but it's stale. That's fine, it’s only going to be stale for another few seconds, maybe another 10 seconds. Once the timer hits, the next visitor is going to cause the CDN to go to the origin server and come back with the fresh copy. They get the fresh copy of the page, but they are yawning because it took a little while. Then everyone for the next 60 seconds after that person is going to get fast responses. And so on.
Every 60 seconds, somebody is going to have a longer response time than everyone else.
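If the pictures are hard to imagine, the scenarios above can be reproduced with a tiny simulation that prints one emoji per visitor. This is a sketch under simple assumptions: a single URL, a single edit, times in seconds, and a hypothetical timeline function.

```typescript
type Face = "🥱" | "😄" | "😅";

// visitTimes: when each visitor shows up; editTime: when the data changed.
function timeline(visitTimes: number[], maxAge: number, editTime: number): Face[] {
  let storedAt: number | null = null; // when the CDN last fetched the page
  const faces: Face[] = [];
  for (const now of visitTimes) {
    if (storedAt === null || now - storedAt >= maxAge) {
      storedAt = now; // miss or expired: wait for the origin, then cache
      faces.push("🥱");
    } else if (storedAt < editTime && editTime <= now) {
      faces.push("😅"); // fast, but cached before the edit
    } else {
      faces.push("😄"); // fast and up to date
    }
  }
  return faces;
}

const visits = [0, 10, 30, 50, 55, 65, 70];
const YEAR = 31536000;

// max-age of 60 seconds, edit at second 45:
console.log(timeline(visits, 60, 45).join("")); // 🥱😄😄😅😅🥱😄
// max-age of a year, same edit: stale for everyone after the edit.
console.log(timeline(visits, YEAR, 45).join("")); // 🥱😄😄😅😅😅😅
```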
Targeted purge + s-maxage=31536000
So one strategy to mitigate having so many yawners in there is to say: let's set a really long max-age. There is another Cache-Control value called s-maxage. The s stands for shared: it applies to shared caches, which means the CDN. So far we have been talking about max-age, which is for the browser, although the CDN will use it too if nothing else is set. But if you set s-maxage, it tells the CDN: hey, don't cache this for max-age, cache it for s-maxage. This is great because you can now tell the CDN to cache differently than the browser, which allows you to go in and purge (the red stop sign). What you can do now is have targeted purges.
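Here is a sketch of how a cache might pick its lifetime from such a header. It is a simplified parser that only looks at numeric directives, not a full Cache-Control implementation, and the function names are made up. Note that header values are whole seconds, so a year is written as 31536000.

```typescript
// Parse "max-age=60, s-maxage=31536000" into { "max-age": 60, "s-maxage": 31536000 }.
function parseDirectives(header: string): { [name: string]: number } {
  const out: { [name: string]: number } = {};
  for (const part of header.split(",")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // skip valueless directives such as "public"
    const name = part.slice(0, eq).trim().toLowerCase();
    out[name] = parseInt(part.slice(eq + 1).trim(), 10);
  }
  return out;
}

// Shared caches (a CDN) prefer s-maxage; browsers only look at max-age.
function ttlSeconds(header: string, sharedCache: boolean): number {
  const directives = parseDirectives(header);
  const shared = directives["s-maxage"];
  const priv = directives["max-age"];
  if (sharedCache && shared !== undefined) return shared;
  return priv !== undefined ? priv : 0;
}

const header = "public, max-age=60, s-maxage=31536000";
const browserTtl = ttlSeconds(header, false); // 60
const cdnTtl = ttlSeconds(header, true); // 31536000
```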
You can now say: on my CMS, whenever I change one of these articles, I know what the URL is. So now I can go over and purge that document from my CDN. You make an edit, and a little script you wrote says: this URL, we gotta purge that from the CDN. Now the next visitor is a yawner, but everyone else is happy after that. This is a really cool strategy. It takes more time and more code, but if you can automate it, it's a great approach!
You don't want a long max-age here, because you cannot purge the user’s browser. If you don't use s-maxage with this strategy and a visitor goes to your page, they are going to have that page for a year. When your CMS edits a page, there is no way to go to all your users' browsers and say: purge this thing. That's why we set a small max-age for the browser, like 60 seconds, and a big s-maxage for the CDN. Then you can set up targeted purges for the pages that you are changing.
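A targeted purge can be sketched like this. Every CDN has its own purge API, so the cdn.delete call below is only a stand-in for whatever your provider offers; cms, visit and onArticleEdited are hypothetical names too.

```typescript
// Stand-ins for the CMS data and the CDN cache.
const cms = new Map<string, string>([["/blog/hello", "Hello"]]);
const cdn = new Map<string, string>();
let originBuilds = 0;

function buildOnOrigin(url: string): string {
  originBuilds += 1;
  return "<h1>" + (cms.get(url) || "Not found") + "</h1>";
}

// The CDN caches "forever" here, mimicking a very long s-maxage.
function visit(url: string): string {
  const cached = cdn.get(url);
  if (cached !== undefined) return cached;
  const fresh = buildOnOrigin(url);
  cdn.set(url, fresh);
  return fresh;
}

// Hypothetical CMS hook: save the edit, then purge only the URL that changed.
function onArticleEdited(url: string, newTitle: string): void {
  cms.set(url, newTitle);
  cdn.delete(url); // in production: a call to your CDN's purge API
}

const beforeEdit = visit("/blog/hello"); // "<h1>Hello</h1>" (yawner)
onArticleEdited("/blog/hello", "Hello, world");
const afterEdit = visit("/blog/hello"); // "<h1>Hello, world</h1>" (one yawner)
const nextVisitor = visit("/blog/hello"); // same fresh page, served from cache
```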
SSR Stale While Revalidate
How does static site generation fit into this?
Static Site Generation
Well, we’ve got a new emoji: the tools icon. That’s the big build, where you’ve got to create every single page on your website before you can deploy it.
We’ve got the big build first, but now everybody’s a smiler because you’re getting superfast fresh responses. You edit the page, now everybody’s getting stale stuff until you do the big fat build again and you deploy. After that everybody gets happy stuff again. It basically comes down to any time any data changes you have to redeploy.
For some websites that's fine, other times it can be just way too expensive. Remember it’s not just your machine. You might offload it to some cloud machine. Either way, some machine somewhere is doing that work whether it’s your origin server or your build server, you’re paying for a computer to build those pages one way or the other.
Incremental Static Regeneration
You may have heard of Next.js’s incremental static regeneration? The idea here is that it’s kind of a bummer that in static site generation we are forcing dynamic data to be static data. To get fast responses we kind of have to pretend that our data doesn't change, or at least not very often. But data does change often. Next.js’s solution is incremental static regeneration.
What happens is, we do the big build. So you’re still going to build the whole site, every single page of the website, and then you’re going to deploy it. Everyone gets fast fresh responses. Then we’ve got the single tool icon in there, which represents rebuilding just a single page. The way incremental static regeneration works is that when someone visits a page (once that page's revalidate interval has passed), it triggers a build of just that page. But it still sends the visitor the current, possibly stale, version of that page. It just sends them what is already built and then builds a new one in the background.
Now the person after that first sweaty emoji, they’re gonna get the result of the last incremental build. Then someone else visits the page and it kicks off the regeneration again in the background. It sends them the current version of the page, builds the new one. Now that new page is sitting waiting there for the next visitor. Next visitor shows up, gives them the current version of the page and then sets up a build in the background again.
This way everyone’s getting fast responses. Everyone’s getting a pre-rendered document. And everyone’s getting something pretty fresh. It’s probably just off by a little bit. Everybody is just off by one, and by shifting it by one, everybody gets a fast response. And you only build the parts that you need instead of the whole thing.
You're always off by one, but it is always fast
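In Next.js (Pages Router) this is switched on by returning a revalidate value from getStaticProps. A minimal sketch follows; loadPosts is a hypothetical stand-in for your CMS or database call, and 60 seconds is just an example interval.

```typescript
// Hypothetical data source; in a real site this would call your CMS or database.
async function loadPosts(): Promise<string[]> {
  return ["first post", "second post"];
}

// Lives in a page file such as pages/posts.tsx.
export async function getStaticProps() {
  const posts = await loadPosts();
  return {
    props: { posts },
    // Serve the built page, and re-generate it in the background
    // at most once every 60 seconds when a request comes in.
    revalidate: 60,
  };
}
```

Pages without revalidate stay as they were at build time until the next deploy.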
Stale While Revalidate
Incremental static regeneration from Next.js is inspired by a Cache-Control value called stale-while-revalidate. Let’s look at what stale-while-revalidate is.
It has nothing to do with static site generation, or with Next.js, or any of that kind of stuff. It’s just a part of HTTP caching, honored by the browsers and CDNs that support it.
SSR with stale while revalidate.
This is our first visitor after the max-age is up. Remember what happened last time the max-age was up:
- Visitor asks for the page.
- It’s expired, so the CDN would go to the origin server to build it.
Not with stale-while-revalidate.
Stale while revalidate says:
- “Hey, you know what, I’m just going to give you what I've got right now. This document is expired but I really want to just give you what you want right now”
- It’s stale but it’s fast
- Then the CDN, once it’s done with that request, says: “Okay, I’m going to go over to the origin server. I'm going to revalidate this page”
- It starts building on your origin server. If somebody shows up in the meantime, maybe you’ve got one visitor right after another while your origin server is still building, the CDN is going to go: “Hey, yup, here, have the stale one again. I’m still figuring this thing out”.
- Finally, our origin server is done. Sends the new document to the CDN.
- CDN caches that new fresh one
- Now anybody who shows up afterwards is gonna get the new fresh one.
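The steps above can be sketched as a small simulation. To keep it readable, the "background" build is finished by an explicit call instead of real asynchronous work, the stale-while-revalidate time window is ignored, and all names are hypothetical.

```typescript
interface Entry {
  body: string;
  storedAt: number; // seconds
}

const MAX_AGE = 1; // seconds
let version = 0;
let entry: Entry | null = null;
let revalidating = false;

function buildOnOrigin(): string {
  version += 1;
  return "page v" + version;
}

// Called when the origin finishes the background build.
function finishRevalidation(now: number): void {
  entry = { body: buildOnOrigin(), storedAt: now };
  revalidating = false;
}

function request(now: number): { body: string; waited: boolean } {
  const current = entry;
  if (current === null) {
    // Very first visitor ever: nothing to serve, so they wait for the build.
    const built: Entry = { body: buildOnOrigin(), storedAt: now };
    entry = built;
    return { body: built.body, waited: true };
  }
  if (now - current.storedAt >= MAX_AGE && !revalidating) {
    revalidating = true; // kick off a rebuild, but do not wait for it
  }
  return { body: current.body, waited: false }; // fast, possibly stale
}

const r1 = request(0); // { body: "page v1", waited: true }
const r2 = request(2); // stale v1, served immediately; rebuild starts
const r3 = request(2.1); // still v1 while the origin is building
finishRevalidation(2.2); // origin is done, CDN caches v2
const r4 = request(2.5); // { body: "page v2", waited: false }
```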
If we set a max-age of one, we are saying: “Hey, this thing expires every second, because our data might be changing every second.”
We think that this page is highly dynamic, but we don’t want the user to have to wait for the server to build that page every time. We want to give them a fast response, so we just shifted it by one.
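As a sketch of what that header could look like: stale-while-revalidate takes a number of seconds, the window after expiry during which a cache may still hand out the stale copy while it refreshes in the background. The 59 below is my own example value, and classify is a hypothetical helper, not part of any library.

```typescript
// Example header: fresh for 1 second, then servable-while-stale for 59 more.
const header = "max-age=1, stale-while-revalidate=59";

type CacheState = "fresh" | "serve-stale-and-revalidate" | "must-refetch";

// Decide what a cache may do with a stored copy of a given age (in seconds).
function classify(ageSeconds: number, maxAge: number, staleWhileRevalidate: number): CacheState {
  if (ageSeconds < maxAge) return "fresh";
  if (ageSeconds < maxAge + staleWhileRevalidate) return "serve-stale-and-revalidate";
  return "must-refetch"; // too old: the visitor waits for the origin again
}

const halfSecondOld = classify(0.5, 1, 59); // "fresh"
const thirtySecondsOld = classify(30, 1, 59); // "serve-stale-and-revalidate"
const twoMinutesOld = classify(120, 1, 59); // "must-refetch"
```

So the "never a yawner again" behaviour only holds while visits keep arriving inside the stale-while-revalidate window.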
After the deploy, we first get a yawner. That’s just the very first time ever. Never again: even after we re-deploy, we’re not going to get a yawner at this URL. When the max-age hits, we get a sweater because it’s fast but it’s stale, and we’re revalidating. Now the next person who shows up is going to get a fast cached response, so we get a smiler. Then the clock ticks again, and we send something that’s stale. We could deploy in the middle and it doesn’t affect anything. If we deploy in the middle, we don’t have to wait for that big static build either. We don’t have to build all those pages again, we’re only building the pages that people visit!
You can kind of look at the two, incremental static regeneration and stale-while-revalidate, and see that they are actually the same for the user. Everybody gets fast responses, except for that very first person to ever visit this URL after the initial deploy.
We don't want a whole bunch of our requests to be slow. We want them all to be fast. So incremental static regeneration is saying why don't we give them the one from a second ago.
If we’re at the 60 second mark and this page is probably stale now, there’s no reason we can’t give just one more person that old copy. They are only one second off. If they had been there a second earlier, they would have gotten the same thing that we’re giving them a second later.
Incremental static regeneration (Next.js)
You can see the clock hits and instead of waiting to build we just send them what we’ve got. New stuff shows up after that request and now everybody gets new fresh responses. Everyone’s smiling no one’s yawning.
Now look at max-age of 1 and stale-while-revalidate without incremental static regeneration: just server-side rendering and a CDN. A URL comes in, and the server just builds that page. Then you place a CDN in front of it. So you are still building static pages, just on demand instead of all at once. You can see it's the very same picture, where everybody gets a fast response. We are just shifting it over by one. We are trading: we send someone something that's only a second stale, so that everyone gets fast responses.
Really it comes down to this question, if you are using cache headers, a CDN and server-side rendering:
How often do you want to rebuild?