Full Text Search on Ubuntu with Synapse and Recoll

(originally published Feb 2013)

I’ve recently switched from using Ubuntu’s Unity to XFCE as my main desktop environment at home. I generally like Unity, but hate its performance. I found Synapse to be an adequate replacement for the Dash as an app-launching and file-locating tool.

One thing that’s missing from Synapse (as well as Unity) is the ability to find documents based on their contents. That is, when I’m trying to find that one lecture that covers “Hoeffding bounds” in a directory of PDFs called 01Lec.pdf, 02Lec.pdf, etc., I want to be able to do it quickly without having to resort to opening each one up individually and searching using the PDF reader. Out of the box, Synapse is unhelpful for this use case and offers to run locate using my search term:

Synapse has a plugin architecture, so I decided to hack one together to integrate it with Recoll, a Unix/Linux desktop search engine. Writing a new plugin was fairly simple: the process is pretty well documented on the Synapse wiki, and plugins can be written in Vala. The majority of the work involved implementing the search method, which has this interface:

Since Recoll only supports a Python API, I ended up having to parse the output of the command-line tool, which looks something like this:

The following code runs the recoll process and captures its output:
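The plugin itself is written in Vala. As a language-neutral illustration of the same step, here is a minimal Python sketch that launches recoll in command-line mode and collects its standard output as a list of lines. The helper names build_recoll_command and capture_output are hypothetical, and the use of the -t (text mode) and -q (query) options is an assumption about how the tool is invoked rather than a copy of the plugin's code.

```python
import subprocess
import sys


def build_recoll_command(query, binary="recoll"):
    """Build the argument list for a command-line Recoll query.

    Assumption: -t selects command-line (non-GUI) mode and -q marks
    the remaining argument as the query string.
    """
    return [binary, "-t", "-q", query]


def capture_output(command, timeout=10.0):
    """Run a command and return its stdout as a list of lines.

    Returns an empty list if the command cannot be started, times out,
    or exits with a non-zero status.
    """
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        print(f"could not run {command[0]}: {exc}", file=sys.stderr)
        return []
    if completed.returncode != 0:
        return []
    return completed.stdout.splitlines()
```

Passing the arguments as a list (rather than a single shell string) means a query containing spaces or quotes reaches recoll unchanged, with no shell escaping needed. A call would look like capture_output(build_recoll_command("Hoeffding bounds")).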

This block parses the output and builds a list of result objects:
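To illustrate the parsing step outside of Vala, here is a rough Python sketch that turns captured output lines into result objects. It assumes each result line is tab-separated in the form MIME type, [url], [title], followed by size information, and that header lines (such as the echoed query and the result count) do not match that shape. That layout is an assumption made for this example, and the SearchResult class and parse_results function are hypothetical names, not part of Recoll or Synapse.

```python
import re
from dataclasses import dataclass
from urllib.parse import unquote, urlparse


@dataclass
class SearchResult:
    mime_type: str
    uri: str
    title: str
    path: str


# Assumed line shape: "<mime>\t[<url>]\t[<title>]\t<size info...>"
RESULT_LINE = re.compile(
    r"^(?P<mime>\S+/\S+)\t\[(?P<uri>[^\]]*)\]\t\[(?P<title>.*)\]\t"
)


def parse_results(lines):
    """Convert raw output lines into a list of SearchResult objects.

    Lines that do not look like a result (headers, blanks) are skipped.
    """
    results = []
    for line in lines:
        match = RESULT_LINE.match(line)
        if match is None:
            continue  # not a result line under the assumed format
        uri = match.group("uri")
        # Turn a file:// URL into a plain filesystem path.
        path = unquote(urlparse(uri).path)
        results.append(
            SearchResult(
                mime_type=match.group("mime"),
                uri=uri,
                title=match.group("title"),
                path=path,
            )
        )
    return results
```

Skipping non-matching lines instead of raising keeps the parser tolerant of extra header or status lines, which matters when scraping a tool's human-readable output rather than calling an API.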

The full code is available on Launchpad. I’m still working on getting this merged upstream, but it has served me well in my personal use. Now when I search for “Hoeffding bounds”, I can find the PDF from the class where it was discussed: