Experimenting With SQLite in iOS

What We Can Learn From Facebook’s Project Lightspeed

Jason Jobe
Capital One Tech
14 min readMay 27, 2020

--

black road with orange and white flash of light along the side and red and white circular sign with “50” written inside it

The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka” but “That’s funny…” Isaac Asimov (1920–1992)

Facebook Engineering recently published an article on Project Lightspeed, their Messenger rewrite, and it suggests they found good reasons to lean in on SQLite. They report that, compared with the previous iOS version, the rewrite reduced code by 84%, from 1.7 millions lines to 360,000, resulting in a 75% reduction in application size and a 50% reduction in launch time.

Most interestingly absent from the discussion on Project Lightspeed were React Native and GraphQL, two large and prominent Facebook projects used for mobile development. What has Facebook (re)discovered about SQLite to make them favor a traditional Relational Database Management System (RDMS) over these other patterns and technologies?

That question caused me to pause with a puzzled, “That’s funny. That can’t possibly be that efficient nor simple. Can it? “But what if, pardon the pun, we take this at face value?

There’s a lot at stake here; and from an engineering standpoint, rapid, efficient development and application performance are key drivers for Facebook. So if Facebook Engineering says SQLite is a key component in that change then we should be curious and take a look.

I am very familiar with RDMSs and the variety of Object Relational Managers/Mappers (ORMs) used in many environments ranging from Hibernate with Java, NeXT’s Enterprise Object Framework, to Apple’s Core Data. And I’ve done a fair share of tinkering with SQLite. Despite being the most widely installed SQL database, and possibly the second most widely used piece of software on the planet, it has always played a quiet, often unnoticed role in our operating systems and applications. Despite my experience, I have frequently steered away from a RDBM/SQLite solution thinking it’s a sledgehammer when I just need a ball peen. But I’d like to think that a good “old dog” should recognize a new trick when he sees one. Time for a second look to see what new things can be learned from SQLite.

The Experiment

In a nutshell, it was a before and after graphic and comments from Facebook’s article that illustrates the fundamental question I wanted to answer. “What does it mean to remove the intermediate controller layer in terms of complexity and coding patterns and how well does it perform?”

figure of jumbled web of black arrows and rectangles pointing to a touchscreen phone with white and black screen
Figure 1a: Before Lightspeed Configuration
figure of black arrows and rectangles pointing to a touchscreen phone with white and black screen
Figure 1b: Lightspeed Configuration

My (less animated) interpretation emphasizes the before (Figure 1a) and after (Figure 1b) architecture suggested by Facebook’s.

So I decided to do an experiment to test the performance of SQLite in driving the UI of a simple application. Of particular interest will be the visual responsiveness of the User Experience and the impact on developer productivity; is it faster and less complicated?

The Application

We want our test application to be simple, but it should exercise the database in terms of manipulating a decent number of records (on the order of thousands, at least). Since the typical mobile application doesn’t do a lot of data I/O, and form filling is a low bandwidth operation, a mostly read-only application should be more than sufficient for our testing. We also want our domain schema to be representative of a real-world application. In addition to our Swift API to the SQLite library we will be writing some SQL along with a small amount of networking and Binding “glue” to integrate with iOS’s MVC architecture. To connect these pieces together we’re going to use a well known and widely adopted pattern, the Model-View-ViewModel (MVVM) and implement a ViewModel and SQL Driven Binding framework to encapsulate our database with the thinnest veneer possible. Fewer, if any, domain Classes or Types, no notifications, no Subject-Observers, no KeyValueObserving.

ViewModel and SQL Bindings (in green) flow chart with black arrows and blue SQLite logo in corner
Figure 2: Our Design inserts a ViewModel and SQL Bindings (in green) into the traditional iOS MVC pattern with SQLite backing the Model.

Importantly — The coin of the realm of service APIs is JSON — nested objects and arrays — and they do not easily align with the tables and rows of a RDBMS and this can often be a barrier to adoption. Developers live in an object-oriented world and mapping from object graphs to RDMS schemas is no fun, easy, or efficient task. This is why many noSQL, graphQL, and the variety of ORM@ solutions are popular. But big name databases, including Oracle, MySQL, MSSQL, and Postgres, have included extensions to not only store JSON documents but also to read and edit the JSON’s object graph in place. And, of course, SQLite also includes a robust implementation (JSON1) to ease our ORM issues as well. We’ll take a look at some details on that in a bit.

To stress test a database we give it more records to operate over and drill down through any nested structures. On the UI side this means we want to display large lists in different ways. For our application we will be using UITableViews and MKMapViews. In addition, we will add a search/filter capability, given this is a core capability of the database we would hope this should be an easy and uncomplicated capability to integrate into our application.

The Hypothesis

By removing the vast majority of the Controller mediating logic, we should see a User Experience that is certainly no worse than the traditional solution, and ideally even more responsive and performant. We also want to lean into SQLite, preferring to use the database whenever possible. For example, values are fetched directly from the DB on demand; eschewing the need for domain-specific Types. Likewise, caching outside the database would be considered bad form.

Most importantly, we would expect to see the Developer Experience improve in the following ways.

  • Simpler, Less Complex — Cognitive overload is reduced with fewer interacting components, less code, fewer files, and clearer contracts and narrower APIs between components.
  • More Efficient — Being “easy” and near-at-hand means less bouncing between a larger number of files. It should be easier to discern the intent of the implementation, more obvious where to look in solving a problem.
  • Low “Change Amplification — Changes in the data schema or layout should have less impact on other components requiring fewer, if any code changes. A better design reduces the amount of code that is affected by each design decision.
  • Error Reduction — Declarative forms are easier to understand and validate. Simpler modular designs should be easier to test.

We will hit on these points as we go along and review them in our conclusion. So with these thoughts in mind let’s jump in and see how this plays out.

The Setup

The data we’re going to be using comes from Nessie, Capital One’s Hackathon Mock API. if you’ve attended a Capital One hackathon you may have used this mock API on your project. Nessie provides access to some real public-facing data such as Capital One ATM and bank branch locations, along with some mock customer account data, and was designed for use in experiments like this. This will offer a sufficient variety of data options for our experiment.

As a point of reference and conversation, the application is implemented using the MVVM pattern. It was chosen for its simplicity and close alignment with Cocoa’s MVC frameworks which is also evolving towards an MVVM design with the introduction of SwiftUI.

  • Model — In our example, the Model is our Swift bridge to the lower level SQLite C API. Importantly, it does NOT overlay any form of ORM or requirements to map a database record into and/or out of a traditional Object data structure. I selected a SQift fork (a Nike open source framework) because of its small footprint (3,754 loc) and unencumbered access to SQL. Interestingly, since we are cutting out the ORM and heavy Controller intercession we have virtually no need for any Swift classes or structs to hold any domain data.
  • View — The View specifies the layout; the placement, size, and rendering of values from our Model. For example, we configure a UITextField, specifying its size, font, color, borders and so on, but the actual content such as the user’s first name comes from some table and column in our database.
  • Binding with SQL — Most binding implementations associate values of instantiated objects leading to a more rigid and “baked in” component interdependence. But at its core a binding is simply a data structure or expression that associates a particular value in our Model to a specific UI rendering component. In terms of SQL, imagine someTextField.text = SELECT first_name FROM users WHERE id = 1. This is a binding by “name” and NOT by “reference”. Without this bindings variant, a new UIViewController subclass would be needed to hold references to the UI and the logic to map model values to each view. Each ViewController would depend both on the Type/Class of model value and the composition of the views. Using “names” allows us to eliminate the need for new distinct classes for every rendition baking in as constants the references to the Views along with the desired Model data type and at least parts of its schema.
  • ViewModel — Sitting in between the Model and View is the ViewModel. It performs the simplest, most mundane tasks: 1) Lookup a value in the Database and give it to a view, and 2) Dispatch User and System input events to functions that mutate the Database and/or output data from the application. Notably, the View is a function of the state and so is not considered “output” in this sense.

Using SQL, Not Objects

The use of SQL is not totally foreign to mobile developers, but neither is it something we readily embrace or utilize in our daily work. Normally, any use of SQL or relational databases is abstracted away in our world of Object-Oriented Programming.

But in our investigation we are keenly interested in what the database is actually doing so we won’t shy away from using SQL directly in our examples. We can add convenience methods later in our actual implementation that will abstract away the construction of the text and error handling, etc.

The reification of specific data structures (Objects) in code requires, in turn, explicit memory management, passing of object references (memory pointers) and adherence to special handling based on the Type and/or Swift Protocols (aka “Interfaces” in Java).

Following the semantics of the design of the Domain Model and its Objects, each Type typically carries more information than may be required by individual components. For example, a particular text label in the UI would only display a Person’s name. The Person’s address, while part of the definition of a Person, is instantiated in code but not displayed at all in some views. In this sense, an object can be “heavier” than needed in some contexts.

In contrast, database records have primary keys (identifiers) that are used instead of pointers and SQL gives us direct, explicit access to individual properties (e.g. database columns) of our data structures. To display the Person’s name, the SQL request is “SELECT name FROM people WHERE id = <id>”. In this context, the component knows nothing of the rest of the structure of a Person and, consequently, is not impacted by changes in our model unless it specifically impacts the “name” property of a Person.

The Application

Our application will be simple and plain (since I am not a UI designer). The interface layout will be a simple tabbed view — one for a table view and another for a map. Each will handle their collections differently and so will help us examine different data manipulations and retrieval strategies. The table view being a scrollable list is a good place to demonstrate the handling of a large number of records interactively (fast scrolling should NOT be impaired). We will also introduce a keyword search feature to challenge the database and UI responsiveness in continuously rebuilding and rerendering a larger number of records.

2 phones — screen on one on left is white with lines of black text and screen of one on right showing a tan and green map
Figure 3: Two ways to display an array of “locations” in our Demo.

The interface is actually easy, but first we need our data to display. Let’s start by looking at the raw JSON we get from our Nessie mock API and how it relates to the data needed in other modules, namely the database and the MapViewController.

3 logos (nessie, SQLite, Google Maps) with lines of code under each one
Figure 4: Illustrating the schemas as specified in each layer; RESTful, Database/Model and View.

Notice that SQLite lets us pack the arrays and nested substructures, address and geocode, of the inbound JSON into a single database column. We will need to use the JSON extension syntax to retrieve the nested values, but this way we don’t have to map out every detail before leaving the gate. This is where “database views” come into play. The observant reader may have noticed the “_” prefix on the table name. This is a convention I adopted to indicate the “raw” underlying data tables. The database VIEW above vends a subset of the _branches columns as renamed columns in the “locations”. Note that we take advantage of the SQLite JSON extensions to extract selected properties of the address and geocode.

Now that we have seen how our database is set up and our domain schema defined it’s time to move to the UI. Each of our main views, table and map, display a collection of locations with individual cells (small views) for each location record. This is a row in the Table View and and the pin Annotation in the Map View. Each text label and field is programmatically annotated with a “Model Key” binding (as per our MVVM design). In our example, each cell has labels for the “title” and “subtitle” of the branch. This binding is a string property of the view specifying the type/table and property/column value to be displayed. It can be set programmatically or in Xcode’s Interface Builder. Each binding translates to a SQL “SELECT <property> FROM <table> WHERE id = <id>” for each value to be displayed.

screen grab of a red pin on a map with arrows pointing to white table cells with black text

After coding up the application I found it became surprisingly easy to put things together. Dare I say “delightful” in some ways? So much so that I decided to try my hand at a video demonstration. Time from project creation to build and run is measured in hours, not days or weeks.

This video is a quick (less than 2 minutes) light hearted step-by-step guide to creating the example application. Only 78 lines of Swift, 68 lines of SQL and 1 Storyboard.

Video Illustrating the steps in creating “Fred’s Bank”

Conclusions — Discoveries, Observations, and Opinions

So how does it look? A great deal of the value I found lies in what I was able to remove, not what I needed to add. In addition to the actual application performance, the ability to quickly and intuitively map data structures from JSON to UI using SQL resulted in a significant speed increase in the design -> code -> build cycle.

For the sake of discussion, the following diagram illustrates a typical component configuration when using applying the MVC/MVVM patterns. It will serve to highlight the differences of our design from the more usual implementations.

flow chart made of white rectangles and black text and arrows and pointed finger clicking on each main section
Figure 6: Typical MVVM Configuration

The main point of note above is that each component is “Location” specific. In other words, each relies on the structure and any associated methods of the “Location” data structure and Model (e.g. Type). Any change in the Location Type, by design, implies a potential cascade of required changes throughout this system of interacting components. The degree to which additional changes are required defines the magnitude of such a “Change Amplification”.

Had we followed the same pattern, we would be creating (and maintaining) boilerplate constructs like this.

flow chart of lines of black text and black arrows leading to empty form fill screengrab
Figure 7: Note the significant number of hard-wired, baked-in dependencies that would be required for our application with even a small model using the traditional MVC pattern.

Instead, we get a much simpler, more data-driven design, and, consequently, more reusable framework illustrated below.

flow chart of lines of black text and black arrows leading to filled in form screengrab
Figure 8: Note the direct relationship between the Database schema and the SQL Binding annotations (indicated in red) for each View in our Demo.
  • Simpler, Less Complex — Our ViewModel required very little code. Note that the actual application code is the only place where any domain knowledge is baked in. The only dependencies are the SQift and SQiftViewModel frameworks. Both are general purpose (and open source) but code metrics for them are included for comparison.
Module            Language     files      code
--------------------------------------------------------------------
Fred's Bank Swift 3 108
SQL 1 68
--------------------------------------------------------------------
ViewModel Swift 6 695
Framework
--------------------------------------------------------------------
SQift Swift 27 2,814

Figure 9: File Counts and Lines-of-Code for the modules or our Demo application.

Cognitive overload is reduced with an exceptionally small amount of code in the application. There was no need to subclass the TableViewController nor the MapViewController provided in the ViewModel framework. And with the ability to use SQL to remap and compute columns no intervening Domain Classes were required. All concise without being confusing.

Changes to individual bindings are completely independent from one another and do not require any changes to mediating controllers.

  • More Efficient — There is a clear separation of concerns for developer; schema, bindings, and layout are independently specified or implemented. All schema operations (mapping, extracting, filtering, and such) are handled with SQL. The use of SQL based bindings at the leafs of the View hierarchy make it trivial to discern the provenance of the data displayed. Mediating objects only handle identifiers, not references to objects; only the rendering of the view carries the responsibility of validating the type and value. Developers can more easily locate relevant files and code (SQL, Swift, and Annotations) and discern its intent.
  • Low Change Amplification — A good design reduces the amount of code that is affected by each design decision. Changes in the inbound JSON from the server which would have a significant impact (as illustrated in Figure 8) can almost always be addressed directly with SQL in the database (as highlighted in Figure 9). Changes to individual bindings are completely independent from one another and do not require any changes to mediating controllers.
  • Error Reduction — The declarative form of the SQL Bindings make the intent clear and easy to validate. Looking at a UILabel in a table cell, for example, the binding, “location/title” is syntactic sugar for “SELECT title FROM location WHERE id = <id>”. With the “id” one can retrieve the exact value. Normally, the developer would have to research (aka “read and understand code”) to identify the “chain-of-custody” of an object with the value as it was passed from component to component. The more intermediate components handling the object, the more complex and error prone this process becomes. This collocation and concise declarative form of the SQL table and column make it easier to understand and validate the relationships between the domain data structures.

In addition to the improvements in the User and Developer experience, I would like to call out some additional advantages that are realized with this implementation.

  • Relying on the database as the Source of Truth for both domain and application state makes analysis and debugging significantly easier.
  • SQLite + JSON easily keeps up with the demands of the UI, handling thousands of records with regular expression searches and reading of JSON structures.
  • Database Views are a very simple and effective way to combine different data types for presentation. They are also very useful in transforming data structures to fulfill the “contracts” of other modules.

I found this experiment most illuminating. While more detailed and comprehensive benchmarking would certainly be worthwhile, I am reasonably confident that SQLite is a great choice for many use cases. I encourage readers to think out-of-the-box and ideate on the many possibilities when SQLite is a given in your environment.

If you have found this interesting I would be most interested in hearing of your experiences.

DISCLOSURE STATEMENT: © 2020 Capital One. Opinions are those of the individual author. Unless noted otherwise in this post, Capital One is not affiliated with, nor endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are property of their respective owners.

--

--

Jason Jobe
Capital One Tech

Tinker, Maker, Smith. Professional Software Architect. Amateur economic and behavioral scientist. Papa Bear.