Offline Full-Text Search in Android & iOS

Pablo Pallocchi
AndroidPub
Published in
6 min readSep 22, 2018

--

Sometimes your app uses data that doesn’t change very often, so you could keep a copy in the local storage, turning your app in an offline-first solution, faster and always available for your beloved users. Alright, that sounds good, but obviously you want to do something with that data. Suppose you want to let the user search in thousands of documents, somebody could say “ok, forget about the offline-first approach and move the data to an Elasticsearch” but it isn’t me, because the great news are that SQLite supports a native full-text search (FTS) and it’s available both for Android and iOS. Let’s do it.

Why full-text search?

Every time you run a query like this:

SELECT * FROM movies WHERE title LIKE '%pulp%'

the database engine has to scan the whole table, because there is no way to know which movie contains that string in its title without check every single row, and it’s worse if we need to find in multiple fields at the same time.

A full-text search approach has two steps: indexing and searching. During the indexing all the documents are scanned, generating a list of search terms (index). Then, when we run the query, that index is used to perform the search.

Implementing FTS in SQLite using FTS5 module

Fortunately we can implement a full-text search in a mobile application without setting up and maintaining an expensive search engine in server side, just using the FTS5 module from SQLite, which could be good enough for our needs.

Let’s create an app to find movies in a local database. The first thing is creating the table for our movies, but it won’t be a common table, it has to be a FTS table:

CREATE VIRTUAL TABLE movies
USING FTS5(title, overview, year UNINDEXED);

We’ve just created our FTS table! now we can perform a fast search over title and overview on thousands of movies, and using just a single query. Notice the unindexed option: this means year is not part of the index, so we won’t be able to search on that field (and makes sense).

Inserting data is as simple as in any other table. For this post I’ve created a SQL script with 3.000 movies. But, to give you an example, this is how a single insert looks like:

INSERT INTO movies ('title', 'overview', 'poster', 'year') 
VALUES ('Pulp Fiction', 'A burger-loving hit man...', '1994');

Now let’s run a query to find all movies matching the term “pulp”:

SELECT * FROM movies WHERE movies MATCH 'pulp';

Notice you don’t have to specify each field in the where clause, since all the indexed fields (title and overview) are evaluated with the match operation.

Now in our case we want to create an autocomplete, so we need to show all the movies matching with the user input as soon as it changes. In this scenario, if the user types “pul”, we should find all the movies containing a term starting like that. So our query changes a bit:

SELECT * FROM movies WHERE movies MATCH 'pul*';

At this point, our application is able to perform a full-text search, finding all movies matching with an user input. But if you run the last query, you’ll realize the movies are not ordered in terms of best-matching result. Here comes the last and very important step: ranking.

Rank the results

Now we need to rank the results found in our SQLite database. To do that, we need a ranking function, and one of the most used functions in our context is Okapi BM25. In previous versions of the FTS module (FTS3/FTS4), you have to implement this function by your own, and do the ranking in the application side. But now with FTS5 this is part of the module, and is as simple as this:

SELECT * FROM movies WHERE movies MATCH 'pul*' ORDER BY rank;

so under the covers SQLite creates this hidden column called rank, containing the score, for each movie, which was calculated using Okapi BM25.

Weights for fields

We assumed that all the fields had the same weights, so it was the same if the term searched by the user was present in the title or the overview. But in most cases, and this is not the exception, we want to prioritize certain fields. So let’s improve our autocomplete, and give the title field a weight of 10:1 over overview.

SELECT * FROM movies WHERE movies MATCH 'pul*'
AND rank MATCH 'bm25(10.0, 1.0)' ORDER BY rank

What happened here? if you don’t specify a ranking function when you order by rank, SQLite uses the bm25() function under the covers, without any parameters (all fields have the same weight), but as we want to change it, we have to explicitly use the function, defining the weights we want in the same order we declared the fields during table creation.

FTS5 in action

Finally, this is how FTS5 works. Notice when I typed “Jedi”, all the Star Wars movies were returned (because that term is present in the overview), but the first result was “Return of the Jedi” since the title has a higher weight.

FTS5 in action (iOS)

FTS4 alternative for older devices

As I mentioned before, the previous approach was implemented using the FTS5 module, but depending on the target that your Android / iOS application has, you might not be able to use it, but with a bit more work you can implement full-text search using FTS4 instead.

Which module should I use?

Based on the official documentation, FTS5 was released with SQLite 3.9.0, so it should be included in all devices running:

  • iOS ≥ 10.0
  • Android ≥ API 24

Looking at the iOS adoption chart, at this moment, the 95% of the users are running iOS 10 or iOS 11, so maybe doesn’t make sense to implement the FTS4 approach just for a 5%. It’s up to you.

But since in the Android world just a 30% have an API ≥ 24, you should consider using FTS4 instead.

Working with FTS4

First, the syntax to create a FTS table changes a bit:

CREATE VIRTUAL TABLE movies
USING FTS4(title, overview, year, year=NONINDEXED);

as you can see, you have to use FTS4 keyword instead, and the unindexed fields are declared as nonindexed.

It was easy, right? now the big change comes when we want to perform the search. As I mentioned before, FTS4 doesn’t support ordering by a ranking function, so you have to implement by your own. The good thing is FTS4 has a function which you can include as a part of your select, and it returns a set of values that can be used by our Okapi BM25 implementation. This function is called matchinfo and is used as follows:

SELECT *, MATCHINFO(movies, 'pcnalx') FROM movies 
WHERE movies MATCH 'pul*';

This function receives the table name as the first parameter, and a set of characters, each one representing a calculated field, as the second one. The values we need in order to use the Okapi BM25 function is represented by pcnalx. Now, let’s see how to rank the results in the application using those values.

Rank the results in Swift

OkapiBM25 usage in Swift

Rank the results in Kotlin

OkapiBM25 usage in Kotlin

Conclusion

The FTS module of SQLite could be a great way to implement a full-text search both in iOS or Android, without any external dependency nor server running backwards. I’ve already implemented this approach in one of my applications that I have in the stores, and since it’s working really good I wrote this post to share with all of you. You can find the working example which I made for this post in my Github repo. Happy coding!

Source code

branch for FTS5 implementation (iOS)

branch for FTS4 implementation (iOS & Android)

--

--