The Startup
Published in

The Startup

Faster SQLite Lookup Using Without Rowid Optimization

Photo by Campaign Creators on Unsplash

TL;DR: A WITHOUT ROWID table can reduce DB size and perform query faster than an ordinary rowid table.

A good strategy is to simply not worry about WITHOUT ROWID optimization until near the end of development, then go back and run tests to see if adding it helpul or not.

SQLite rowid

By default, every row in SQLite has a special column — usually called rowid.

rowid can be used to uniquely identifies a row within the table.

For example, create a new table:

movie TEXT,
studio TEXT,
rating REAL

As a result, SQLite creates rowid beginning with 1 and increasing by one with each added row. But if rows are deleted, gaps can appear.

TopRatedFilm table.

SQLite allows us to control the rowid assigned, so that rows can be inserted at the top of the table by using negative rowid.

The rowid(s) are always unique. They may not be consecutive but always in strictly ascending order.

SQLite searching

Normal lookup

Suppose you want to know the rating of Starwars. The query would be:

SELECT rating FROM TopRatedFilm WHERE movie='Starwars';

SQLite reads every row out of the table, checks to see if the movie column has the value “Starwars”, and if so, outputs the rating from that row.

Normal lookup.

This algorithm is called a Full Table Scan.

In a full scan, the entire content of the table must be read and examined to find needed row(s).

If your table contained a million rows, SQLite might read MBs of content just to find a single value.

Lookup by rowid

One technique for avoiding full scans is to lookup by rowid. For example, to know the rating of Starwars, we query for the entry with rowid of 2:

SELECT rating FROM TopRatedFilm WHERE rowid=2;

Since rows are stored in order of increasing rowid, SQLite can find using a Binary Search, so the time complexity is reduced to O(logN) (rather than O(N)).

Lookup by rowid.

The problem is: you actually don’t want what the 2nd entry is — you want to know the rating of Starwars!

Lookup by index

An index is another table similar to the original one, but only store rowid & indexed column(s).

SQLite creates a B-Tree to hold the indexed data.

Let’s create an index by specifying the primary key:

studio TEXT,
rating REAL

Or use the explicit CREATE INDEX statement:

CREATE INDEX movie_indexed ON TopRatedFilm(movie);

After that, TopRatedFilm table is implemented as two separate B-Trees:

  • The main table uses the rowid as the key and stores the (movie, studio, rating) columns as data.
  • The index table uses movie and rowid as the key and stores no data at all. Note that: the movie column is sorted to support binary search.
An index on the movie column.

This index can be used to implement a faster lookup algorithm:

SELECT rating FROM TopRatedFilm WHERE movie='Starwars';

The key used to look up is still the rowid:

  • The query starts by doing a binary search on the movie_indexed for entries that have movie='Starwars'.
  • Having found rows in the movie_indexed, SQLite extracts their rowid.
  • Then, it does a second binary search on the original TopRatedFilm table using the extracted rowids to find the original rows.
Lookup by index.

With index, SQLite still has to do two binary searches. But for a large table, this’s still much faster than doing a full scan.

The WITHOUT ROWID Optimization

If the phrase WITHOUT ROWID is added to the end of a CREATE TABLE statement, then the special rowid column will be omitted.


The WITHOUT ROWID syntax is an optimization

A WITHOUT ROWID table can use about half the amount of disk space and can operate nearly twice as fast.

Consider TopRatedFilm, as an ordinary rowid table, two separate binary searches are required when doing a lookup.

An index on the movie column.

On the other hand, a WITHOUT ROWID table uses a different data design:

studio TEXT,
rating REAL

In this table, there’s only a single B-Tree that uses the movie column as key and the rating column as data. (Technicality: the low-level implementation stores both movie and rating in the “key” area of the B-Tree).

Because there’s only a single B-Tree, lookup for value only involves a single binary search.

Entries of the movie column are now only stored once. Querying the rating value only involves a single binary search (into the main B-Tree), since the rating can be retrieved directly from the record found by that first search.

Differences from ordinary rowid tables

The WITHOUT ROWID syntax provides no new capabilities. Anything that can be done using it, can also be done in exactly the same way with an ordinary rowid table.

They both generate the same answers given the same SQL statements.

There’re only some additional restrictions on WITHOUT ROWID tables:

  • It must have a PRIMARY KEY
  • AUTOINCREMENT does not work.
  • NOT NULL is enforced on every column of the PRIMARY KEY

When To Use?

The WITHOUT ROWID optimization is helpful for tables that:

  • Have non-integer or multi-column PRIMARY KEYs: it’ll work correctly for tables with a single INTEGER PRIMARY KEY, however, ordinary rowid tables will run faster.
  • Do not store large strings or BLOBs: it works best when individual rows are not too large. A good rule-of-thumb is that the average size of a single row in a table should be less than about 1/20th the size of a DB page. This’s because rowid tables implemented as B*-Trees (all content stored in the leaves), whereas WITHOUT ROWID tables implemented using normal B-Trees (with content stored on both leaves & intermediate nodes). Storing content in intermediate nodes makes them take up more space and increasing the search cost.

A good strategy is to simply not worry about WITHOUT ROWID optimization until near the end of development, then go back and run tests to see if adding it helps or hurts performance, and retaining the WITHOUT ROWID only in those cases where it helps.

Happy coding~




Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +756K followers.

Recommended from Medium

C Static Libraries

Preparing for AWS Dev Associate Certification

AWS CDK ( Python ) HTTP API + Cognito

Containerize a basic HTML/CSS/JS app with nerdctl & Rancher Desktop

Reverse proxy using Netflix’s Zuul in Spring Boot

Python Basics For Beginners

Basic Breakdown of the Faker::Gem

Mainframe Application Portfolio Analysis

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Tam H. Doan

Tam H. Doan

Software Engineer, also a Runner and Guitarist.

More from Medium

Generating SDKs for your API

How to Setup a Video Streaming Service by aaPanel

How Feature Flags Reduce Log4j Web and App Vulnerability

Exploring the Salesforce Mobile SDK Using React Native

science beakers