Faster SQLite Lookup Using Without Rowid Optimization

Tam H. Doan
Feb 18 · 5 min read

TL;DR: A WITHOUT ROWID table can reduce DB size and perform query faster than an ordinary rowid table.

A good strategy is to simply not worry about WITHOUT ROWID optimization until near the end of development, then go back and run tests to see if adding it helpul or not.

SQLite rowid

By default, every row in SQLite has a special column — usually called rowid.

rowid can be used to uniquely identifies a row within the table.

For example, create a new table:

CREATE TABLE TopRatedFilm(
movie TEXT,
studio TEXT,
rating REAL
);

As a result, SQLite creates rowid beginning with 1 and increasing by one with each added row. But if rows are deleted, gaps can appear.

Image for post
Image for post
TopRatedFilm table.

SQLite allows us to control the rowid assigned, so that rows can be inserted at the top of the table by using negative rowid.

The rowid(s) are always unique. They may not be consecutive but always in strictly ascending order.

SQLite searching

Normal lookup

Suppose you want to know the rating of Starwars. The query would be:

SELECT rating FROM TopRatedFilm WHERE movie='Starwars';

SQLite reads every row out of the table, checks to see if the movie column has the value “Starwars”, and if so, outputs the rating from that row.

Image for post
Image for post
Normal lookup.

This algorithm is called a Full Table Scan.

In a full scan, the entire content of the table must be read and examined to find needed row(s).

If your table contained a million rows, SQLite might read MBs of content just to find a single value.

Lookup by rowid

One technique for avoiding full scans is to lookup by rowid. For example, to know the rating of Starwars, we query for the entry with rowid of 2:

SELECT rating FROM TopRatedFilm WHERE rowid=2;

Since rows are stored in order of increasing rowid, SQLite can find using a Binary Search, so the time complexity is reduced to O(logN) (rather than O(N)).

Image for post
Image for post
Lookup by rowid.

The problem is: you actually don’t want what the 2nd entry is — you want to know the rating of Starwars!

Lookup by index

An index is another table similar to the original one, but only store rowid & indexed column(s).

SQLite creates a B-Tree to hold the indexed data.

Let’s create an index by specifying the primary key:

CREATE TABLE TopRatedFilm(
movie TEXT PRIMARY KEY,
studio TEXT,
rating REAL
);

Or use the explicit CREATE INDEX statement:

CREATE INDEX movie_indexed ON TopRatedFilm(movie);

After that, TopRatedFilm table is implemented as two separate B-Trees:

  • The main table uses the rowid as the key and stores the (movie, studio, rating) columns as data.
  • The index table uses movie and rowid as the key and stores no data at all. Note that: the movie column is sorted to support binary search.
Image for post
Image for post
An index on the movie column.

This index can be used to implement a faster lookup algorithm:

SELECT rating FROM TopRatedFilm WHERE movie='Starwars';

The key used to look up is still the rowid:

  • The query starts by doing a binary search on the movie_indexed for entries that have movie='Starwars'.
  • Having found rows in the movie_indexed, SQLite extracts their rowid.
  • Then, it does a second binary search on the original TopRatedFilm table using the extracted rowids to find the original rows.
Image for post
Image for post
Lookup by index.

With index, SQLite still has to do two binary searches. But for a large table, this’s still much faster than doing a full scan.

The WITHOUT ROWID Optimization

If the phrase WITHOUT ROWID is added to the end of a CREATE TABLE statement, then the special rowid column will be omitted.

Benefits

The WITHOUT ROWID syntax is an optimization

A WITHOUT ROWID table can use about half the amount of disk space and can operate nearly twice as fast.

Consider TopRatedFilm, as an ordinary rowid table, two separate binary searches are required when doing a lookup.

Image for post
Image for post
An index on the movie column.

On the other hand, a WITHOUT ROWID table uses a different data design:

CREATE TABLE TopRatedFilm(
movie TEXT PRIMARY KEY,
studio TEXT,
rating REAL
) WITHOUT ROWID;

In this table, there’s only a single B-Tree that uses the movie column as key and the rating column as data. (Technicality: the low-level implementation stores both movie and rating in the “key” area of the B-Tree).

Because there’s only a single B-Tree, lookup for value only involves a single binary search.

Entries of the movie column are now only stored once. Querying the rating value only involves a single binary search (into the main B-Tree), since the rating can be retrieved directly from the record found by that first search.

Differences from ordinary rowid tables

The WITHOUT ROWID syntax provides no new capabilities. Anything that can be done using it, can also be done in exactly the same way with an ordinary rowid table.

They both generate the same answers given the same SQL statements.

There’re only some additional restrictions on WITHOUT ROWID tables:

  • It must have a PRIMARY KEY
  • AUTOINCREMENT does not work.
  • NOT NULL is enforced on every column of the PRIMARY KEY

When To Use?

The WITHOUT ROWID optimization is helpful for tables that:

  • Have non-integer or multi-column PRIMARY KEYs: it’ll work correctly for tables with a single INTEGER PRIMARY KEY, however, ordinary rowid tables will run faster.
  • Do not store large strings or BLOBs: it works best when individual rows are not too large. A good rule-of-thumb is that the average size of a single row in a table should be less than about 1/20th the size of a DB page. This’s because rowid tables implemented as B*-Trees (all content stored in the leaves), whereas WITHOUT ROWID tables implemented using normal B-Trees (with content stored on both leaves & intermediate nodes). Storing content in intermediate nodes makes them take up more space and increasing the search cost.

A good strategy is to simply not worry about WITHOUT ROWID optimization until near the end of development, then go back and run tests to see if adding it helps or hurts performance, and retaining the WITHOUT ROWID only in those cases where it helps.

Happy coding~

The Startup

Medium's largest active publication, followed by +771K people. Follow to join our community.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store