Offset vs Cursor-Based Pagination: Which is the Right Choice for Your Project?

Olga
5 min readAug 17, 2023

Pagination is a way to prevent your browser from crashing or giving users a bad experience when dealing with large amounts of data. Instead of loading all the data at once, it delivers it in small, manageable chunks.

There are two types of pagination that are most commonly used:

offset-based pagination and cursor-based pagination.

Offset is easier to implement but can be inefficient when dealing with large data sets.

Cursor is more efficient for large datasets but may require more complex implementation.

Pagination addresses several challenges:

  • A large dataset may cause your browser to crash. Pagination delivers data to the client in manageable parts, preventing browser crashes.
  • Even if your dataset is not large enough to crash the browser, showing hundreds or thousands of rows at once is not a good user experience.
  • Serving large datasets can be taxing on the resources on both the server and the client.
  • Fetching large datasets can cause timeouts and increase latency.

What is pagination?

also known as paging, is the process of dividing a document into discrete pages, either electronic pages or printed pages. Wikipedia

Offset-based pagination works as follows.

The client tells the server how many rows it wants by providing page_size, and also provides the offset or page number.

Let’s say we have 1000 rows of data in our database.

The client provides the following parameters in the request to the server:

page_size: 50
page: 1

The server sends back the first 50 results.

On the next request, the client might provide the following

page_size: 50
page: 5

The server sends back fifty rows that come after the first 200 rows.

When implementing a SQL query, it is important to always use ORDER BY to ensure consistent results and avoid data duplication when implementing cursor-based pagination.

Query plans may vary depending on the LIMIT and OFFSET values provided. Inconsistent results can occur when different LIMIT and OFFSET values are used to select subsets of a query result unless a predictable result ordering is enforced with ORDER BY.

See more on this here: https://www.postgresql.org/docs/current/queries-limit.html

Pros of the Offset-based pagination

  • Generally very easy to implement.
  • Works well for datasets with data that doesn’t change much.
  • Easy to implement the logic that lets the user go to specific pages in the dataset.

Cons of the Offset-based pagination

  • Does not scale. Performance degrades since the database has to scan all the rows from the beginning to get to the requested rows.
  • If your dataset changes often, it can affect the pagination results causing duplicate or missing rows.

Cursor-based pagination works as follows

A client requests the first 50 rows.

page_size: 50
cursor: None

The server returns those 50 rows and also a cursor.

{
items: [
// 50 rows of data
],
cursor: "some opaque string"
}

When the client requests the next set of 50 rows, it sends the server the cursor.

page_size: 50
cursor: "some opaque string"

Cursor is like a special pointer that shows where the next set of results can be found. The server uses the cursor to determine where the data that the client needs is located.

The cursor is represented by an opaque string that is designed to prevent the client from manipulating or reading the value of the cursor as well as ensuring that the details of the implementation are hidden.

See this article for a brief but good explanation about why it is an opaque string: https://graphql-ruby.org/pagination/cursors.html

In order to build an effective cursor-based pagination solution, it is important that the dataset is sorted.

The core idea of cursor-based pagination relies on having a continuously sortable field or a set of fields, such as timestamps, UUIDs, or PKs.

This ensures reliable sorting, even with inserts or deletions. It’s important to add a tie-breaker to the continuously sortable field in cases where it ends up being non-unique for some of the rows.

For example, if you have two rows that were created at exactly the same time, they will both have the same created_at timestamp. If we are using the timestamp as the sortable field we would need a tie-breaker to ensure a consistent order. We can use a unique field such as a primary key as a tiebreaker.

Pros of the Cursor-based pagination

  • Scales well. It is very efficient for large datasets because it can quickly seek to the cursor and return results without scanning through previous rows.
  • Good for cases where the data is frequently changing, for example, if we are dealing with live-feed data for a social media platform
  • Cursor-based pagination offers more consistent results than offset-based pagination. Even if the dataset changes, because the cursor always points to a specific location in the dataset.

Cons of the Cursor-based pagination

  • Implementation is typically more complex
  • Setting up logic to navigate between pages and jump to a specific page number is more complex.

Initial and subsequent queries Cursor-Based vs Offset-based

Cursor-Based Pagination

Having a cursor relies on an indexed column for efficiency.

On the initial fetch to retrieve the first 10 posts based on the created_at date, the database executes a SELECT statement with a WHERE condition and an ORDER BY clause.

SELECT * FROM posts ORDER BY created_at DESC LIMIT 10;

For subsequent fetches in cursor-based pagination, you can use the last known value of a column, such as signup_date, as a cursor for the next fetch. The query would look like

SELECT * FROM posts WHERE created_at < 'last_known_created_at' ORDER BY created_at DESC LIMIT 10;

Using cursor-based pagination with an indexed column is very efficient. The database can efficiently locate the index that corresponds to the cursor value and then retrieve the rows from there.

Offset-Based Pagination

Offset-based pagination is a method that relies on the limit clause to tell the database to skip a certain number of rows before returning a specific number of rows.

On an initial fetch, similar to the cursor approach, where a SELECT statement with a WHERE condition and an ORDER BY clause is executed to retrieve the first set of data.

SELECT * FROM posts ORDER BY created_at DESC LIMIT 10;

On subsequent fetches, use the OFFSET keyword to skip the preceding rows and fetch the next set of data.

SELECT * FROM posts ORDER BY created_at DESC LIMIT 10 OFFSET 10;

This skips the first 10 records and fetches the next 10.

Sources used:

https://en.wikipedia.org/wiki/Pagination

https://slack.engineering/evolving-api-pagination-at-slack/

https://www.postgresql.org/docs/current/queries-limit.html

--

--