API 101: Lowering the load with pagination

Johanne Andersen
5 min readNov 10, 2019

--

Pagination is very much like eating one piece of cake at a time because it’s impossible (or very difficult) for you to eat the whole cake at once. So we eat the cake in bite-size pieces instead of eating the whole thing in one sitting.

The same thing happens on many websites as well, the information is split into digestible pieces. Pagination can be twofold since the UI can paginate data to lower the cognitive load on the user. The backend can also paginate data to lower the load on the server.

You may have noticed two ways of doing pagination. One where you physically have to change pages, like Google. Or another one with infinite scrolling like Twitter. These are the two types of pagination called, offset-based pagination and cursor-based pagination.

Pagination is useful when you’re dealing with a lot of data or resource-intensive operations. In order to not overload your memory or CPU, pagination can be used to cut the data into smaller pieces that can be consumed more efficiently by the server.

This can result in lower wait times for the client since they don’t have to wait for the whole data set to be processed, but the client can start displaying a subset. Then only if the client is interested can it fetch more.

In some cases, it might not even make sense to implement pagination. If your data set is small and won’t grow over time adding pagination might just add overhead. So consider the dataset you want to paginate before deciding to use pagination.

  • Does your dataset have less than 2000 entries?
  • Do you have simple data processing, as in minimal transformations, joins and/or calculations.

If you can say yes to both of these, then you should consider not using pagination.

However, if you do decide to use pagination, these are the things to consider for offset-based pagination and cursor-based pagination.

Offset-based pagination

Offset-based pagination is the simplest one to implement. It’s like flipping through a book, you consume one page at a time.

Every time you send a request to the server, you specify which page you want and you get the content of that “page” back. This means getting the data that’s between the page offset and the size of the page.

So if you have a page size of 20 items, the first page would contain the first 20 items and the next page would contain the next 20 items.

The response usually looks something like this:

{
"data": {
...
}
"page": {
"number": 0,
"size" : 1000,
"totalElements": 4000,
"totalPages": 4
}
}

Implementation is pretty simple since you can use the offset and limit functionalities of your database.

SELECT * FROM data ORDER BY id DESC LIMIT 20 OFFSET 20

It does have it’s drawbacks though.

As your dataset grows and you have to fetch later pages, your database still has to load all the data before it to get the next page worth of data.

Also, if your data changes often, you might skip over data or display the same data twice.

To summarize:

  • PRO: Good for static content
  • PRO: Simple to implement
  • CON: Not good when items move between pages frequently (you can end up skipping data or displaying the same data twice)
  • CON: Gets slower as you start fetching later pages.

Cursor-based pagination

To deal with the drawback of offset-based pagination, you have cursor-based pagination, which deals both with large datasets and real-time data.

Instead of using offsets, we can use unique identifiers to determine where we are in our dataset. With this location, we can take the next elements after this location. This guarantees that the elements won’t change as long as the data is ordered.

This cursor should ideally be a timestamp since it won’t break if the record is deleted.

Then instead of providing a page number in your request, you provide a cursor. This cursor is provided by the API as you start paginating.

So the response would look like this:

{
"data": {
...
}
"nextCursor": "12345"
}

Fetching the data from the database looks something like this:

SELECT * FROM data WHERE id <= %cursor ORDER BY id DESC LIMIT 20

As you can see, we’re no longer doing an offset, so we’re not loading all the previous data. Instead, we’re using a WHERE clause, which will scale much better than OFFSET.

I suggest looking at this blog post, to ensure you avoid leaking backend implementation details to the frontend.

To summarize:

  • PRO: Can deal with dynamic data
  • PRO: Scaled well as you go further into the dataset
  • CON: More complex
  • CON: Harder to do bidirectional pagination

Bringing it all together

In most cases, using cursor-based pagination is recommended, since it leads to better user experience as users are less likely to click through actual pages and most likely to keep scrolling. How often do you go to the second page of google? And how often do you end up scrolling for a long time on facebook or medium?

Another argument for cursor-based pagination is the performance gain since it doesn’t get slower as you get further into the pages.

However, for small and relatively static data sets, offset-based pagination might be a sufficient solution. The added complexity of cursor-based pagination might not be worth the performance gain.

Thank you for reading. If you are interested in diving a bit deeper, I suggest looking at this excellent post from slack.

Previous blog post: Restricting access with authorization

--

--

Johanne Andersen

Software Developer, learning enthusiast and plant lover.