Squeezing Performance from SQLite: EXPLAINing the Virtual Machine

Jason Feinstein
Apr 23, 2017 · 12 min read

The virtual machine? Yep, but not the Java Virtual Machine. In this post I will focus on providing you with a basic understanding SQLite’s “Virtual DataBase Engine” or VDBE.

My “Squeezing Performance from SQLite” series is primarily meant for Android engineers, but this post in particular will dive into SQLite itself and the topics discussed will hold true for all developers who use it.


SQLite is a Virtual Machine

I used to be under the impression that SQLite parsed and ran statements like an interpreter, but it turns out that’s not quite the case. While looking for SQLite’s version of MySQL’s EXPLAIN, I stumbled across documentation that described the way SQLite operates:

SQLite works by translating SQL statements into bytecode and then running that bytecode in a virtual machine. — SQLite.org

The SQLite virtual machine is called the “Virtual DataBase Engine”, or the “VDBE” for short.

Incredibly simplified depiction of SQLite executing a Statement.

You might already know what a bytecode program in SQLite more commonly goes by: “prepared statement”. Also, just like most programs: prepared statements can take input (? variables).

The bytecode program is a binary listing of instructions which each consist of an opcode and parameter values. Each opcode corresponds to a particular command that the VDBE knows how to process, and when being processed can operate on data contained within a bank of registers within the virtual machine. According to the official documentation, the number of registers is finite but can be quite large and depends on how SQLite was configured at compile time.

In the remainder of this article we’ll explore how SQL statements are processed by SQLite into bytecode programs, and then I’ll show you how you can examine the bytecode your statements get compiled into.


Preparing a Statement

When you ask SQLite to prepare a statement, your painstakingly handcrafted SQL is dissected (parsed), analyzed (query planned), and boiled down (compiled) into a bytecode program SQLite’s VDBE is capable of executing.

Tokenizing & Parsing

Just like any programming language, SQL starts as a bunch of text. To get from a string of text into something SQLite can understand, that text needs to be broken down and understood. This is what we call parsing.

SQLite’s parsing approach is made abundantly clear throughout the official documentation. For example, if you’ve ever ended up at sqlite.org while looking for how you are supposed to write an INSERT statement — you’ve likely seen a diagram that explains a part of how SQLite’s parser works:

INSERT syntax diagram (from sqlite.org)

Diagrams like the one above are a way of visualizing the Backus-Naur Form (BNF) description of the SQL grammar that SQLite understands. Reading the syntax diagrams is a pretty straightforward process when you know what to look for:

Query Planning

After the statement has been parsed into its component parts, SQLite needs to decide how to approach executing the statement.

For any given SQL statement, there might be hundreds or thousands or even millions of different algorithms of performing the operation. All of these algorithms will get the correct answer, though some will run faster than others. The query planner is an AI that tries to pick the fastest and most efficient algorithm for each SQL statement. — SQLite.org

Understanding how SQLite determines the best way to execute your statements is a big enough topic to warrant its own post. However, for this article it’s just important to know that there is an optimization step between parsing and compilation.

Compilation

Finally: after SQLite has determined how to best approach running your statement, it puts together a list of low-level bytecode instructions that describe the whole operation. That list of instructions is, quite literally, a program which will be run on the VDBE. A more popular name for the program compiled by SQLite is “prepared statement”.

Each bytecode instruction consists of an opcode (name of the instruction) and up to 5 parameters (input values or references to registers). In modern SQLite versions, there are over a hundred different types of instructions. They run the gamut between simple control-flow instructions like Eq: “jump to an instruction if two registers have the same value” and more database-specific instructions like ResultRow: which provides data to the database cursor at the current position, pointing at values which have been loaded into the VDBE’s registers.

Once compiled into bytecode, a prepared statement does not need to be parsed or to go through the query planning process again. This is precisely what makes re-using prepared statements so fast when compared with opting not to re-use them.


“EXPLAIN”

Now that you know how your raw statement becomes a prepared statement, you may be wondering if it’s possible to examine the bytecode program that was compiled.

SQLite provides a mechanism you can use to inspect the bytecode it would generate for any statement. In order to see the bytecode, just prepend your statement with EXPLAIN. From the documentation:

When the EXPLAIN keyword appears … it causes the statement to behave as a query that returns the sequence of virtual machine instructions it would have used to execute the command had the EXPLAIN keyword not been present.
SQLite.org

EXPLAIN's output is a series of rows where each row is an instruction in the prepared statement bytecode and each has 8 columns:

That last column: comment will likely be empty unless you compile SQLite yourself and set the -DSQLITE_ENABLE_EXPLAIN_COMMENTS option.

Explaining “EXPLAIN”, by Example

Let’s use EXPLAIN to take a look at the bytecode for a few statements. Because my install didn’t have the comments turned on, I’ll leave that out. Each example starts with the output you receive when you use EXPLAIN, followed by a walk-through of the flow that the VDBE takes through the bytecode program.

(Note: all bytecode shown here is from SQLite 3.16.2, and may not be the same in other versions.)

“Hello World”

sqlite> EXPLAIN SELECT "hello world";
addr opcode p1 p2 p3 p4 p5
---- ------------- ---- ---- ---- ------------- --
0 Init 0 1 0 00
1 String8 0 1 0 hello world 00
2 ResultRow 1 1 0 00
3 Halt 0 0 0 00

Sorry, I had to.

CREATE TABLE

sqlite> EXPLAIN CREATE TABLE blog (
title TEXT NOT NULL,
author TEXT NOT NULL,
pub_date INTEGER,
body TEXT
);
addr opcode p1 p2 p3 p4 p5
---- ------------- ---- ---- ---- ------------- --
0 Init 0 27 0 00
1 ReadCookie 0 3 2 00
2 If 3 5 0 00
3 SetCookie 0 2 4 00
4 SetCookie 0 5 1 00
5 CreateTable 0 2 0 00
6 OpenWrite 0 1 0 5 00
7 NewRowid 0 1 0 00
8 Blob 6 3 0 00
9 Insert 0 3 1 08
10 Close 0 0 0 00
11 Close 0 0 0 00
12 Null 0 4 5 00
13 OpenWrite 1 1 0 5 00
14 SeekRowid 1 16 1 00
15 Rowid 1 5 0 00
16 IsNull 5 24 0 00
17 String8 0 6 0 table 00
18 String8 0 7 0 blog 00
19 String8 0 8 0 blog 00
20 Copy 2 9 0 00
21 String8 0 10 0 CREATE TABLE blog (
title TEXT NOT NULL,
author TEXT NOT NULL,
pub_date INTEGER,
body TEXT
) 00
22 MakeRecord 6 5 11 BBBDB 00
23 Insert 1 11 5 00
24 SetCookie 0 1 1 00
25 ParseSchema 0 0 0 tbl_name='blog' AND type!=...
26 Halt 0 0 0 00
27 Transaction 0 1 0 0 01
28 Goto 0 1 0 00

INSERT

Lastly, let’s see what it takes to add a row to our newly created blog table.

sqlite> EXPLAIN INSERT INTO blog (title, author, pub_date, body) VALUES ('Winter is Coming', 'Ned Stark', date('now'), 'It comes in season 7.');
addr opcode p1 p2 p3 p4 p5
---- ------------- ---- ---- ---- --------------------- --
0 Init 0 12 0 00
1 OpenWrite 0 2 0 4 00
2 NewRowid 0 1 0 00
3 String8 0 2 0 Winter is Coming 00
4 String8 0 3 0 Ned Stark 00
5 Function0 1 6 4 date(-1) 01
6 String8 0 5 0 It comes in season 7. 00
7 HaltIfNull 1299 2 2 blog.title 01
8 HaltIfNull 1299 2 3 blog.author 01
9 MakeRecord 2 4 7 BBDB 00
10 Insert 0 7 1 blog 1b
11 Halt 0 0 0 00
12 Transaction 0 1 1 0 01
13 String8 0 6 0 now 00
14 Goto 0 1 0 00

Try it yourself!

If you’re still with me, nice work! If you’d like to learn even more about SQLite’s VDBE and bytecode, try looking through the EXPLAIN results for following on your own:

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store