Deleting specific type of terms from a Xapian document

Gaurav Arora
Jun 19, 2017

Recently I was working on a problem where I needed to delete all the terms with a specific prefix, let's say NA, from a Xapian document.

The Xapian document model provides us with nice APIs to work with the index and with individual indexed documents.

  1. It provides an iterator over the terms in a particular document: TermIterator, termlist_begin(did), termlist_end(did).
  2. It provides an API to delete a specific term from a document: remove_term(term).

So why even blog about it? It looks simple:

  1. Iterate through the termlist of the document.
  2. Check if the term starts with NA.
  3. Call remove_term(term) if the term starts with NA.

Yes, this looks like the perfect approach. It even worked for the initial simple example. But once the data started getting complex, the strategy went haywire.

It was failing to delete a lot of the prefixed terms. I tried a lot of things to find out what could be causing that. The main issue was that deleting terms changed the underlying data, which in turn changed the termlist that was being iterated. Hence we missed deleting a lot of terms. This was only detected after a lot of debugging.
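To make the failure mode concrete, here is a small illustration of how deleting while iterating can skip entries. It is only an analogy on a plain std::vector and makes no claim about how Xapian stores or walks its termlist internally. The function name naive_remove is made up for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Deliberately buggy: removes matching terms from the same container that
// the loop is walking. Hypothetical helper, not part of Xapian.
std::vector<std::string> naive_remove(std::vector<std::string> terms,
                                      const std::string& prefix) {
    for (std::size_t i = 0; i < terms.size(); ++i) {
        if (terms[i].compare(0, prefix.size(), prefix) == 0) {
            // Erasing shifts every later element one slot to the left...
            terms.erase(terms.begin() + static_cast<std::ptrdiff_t>(i));
            // ...so the ++i above now steps over the element that just
            // moved into slot i without ever checking it.
        }
    }
    return terms;
}
```

Called with {"NAa", "NAb", "NAc", "hello"} and the prefix "NA", this returns {"NAb", "hello"}: the term "NAb" survives because it slid into the slot that had just been examined.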

Correct algorithm to do it:

  1. Create an empty list of strings.
  2. Iterate through the termlist of the document.
  3. Check if the term starts with NA.
  4. Add the term to the list<string> if it does start with NA.
  5. After the document iterator finishes, iterate through the string list, which contains the terms to be deleted.
  6. Delete every term in this list from the document via remove_term(term).
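The steps above can be sketched in C++ roughly as follows. This is a minimal sketch, not the exact code from the project: FakeDocument is a hypothetical stand-in so the example compiles without Xapian installed, and starts_with and remove_terms_with_prefix are names invented here. The template only relies on termlist_begin(), termlist_end() and remove_term(), which Xapian::Document also offers, so it should work with a real document too, but treat that as an assumption to verify against your Xapian version.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for Xapian::Document. It mirrors only the member
// functions the sketch needs and keeps terms sorted and unique.
class FakeDocument {
public:
    using TermIterator = std::set<std::string>::const_iterator;

    void add_term(const std::string& term) { terms_.insert(term); }
    void remove_term(const std::string& term) { terms_.erase(term); }
    TermIterator termlist_begin() const { return terms_.begin(); }
    TermIterator termlist_end() const { return terms_.end(); }

private:
    std::set<std::string> terms_;
};

// Returns true if `term` begins with `prefix`.
inline bool starts_with(const std::string& term, const std::string& prefix) {
    return term.compare(0, prefix.size(), prefix) == 0;
}

// Two-pass removal: first collect the matching terms, then delete them,
// so the termlist is never modified while an iterator is walking it.
// Returns the number of terms removed.
template <typename Doc>
std::size_t remove_terms_with_prefix(Doc& doc, const std::string& prefix) {
    std::vector<std::string> to_delete;  // steps 1-4: collect candidates
    for (auto it = doc.termlist_begin(); it != doc.termlist_end(); ++it) {
        const std::string term = *it;
        if (starts_with(term, prefix)) {
            to_delete.push_back(term);
        }
    }
    for (const std::string& term : to_delete) {  // steps 5-6: delete
        doc.remove_term(term);
    }
    return to_delete.size();
}
```

With a real index, the assumed flow would be to fetch the document with get_document(did), call remove_terms_with_prefix(doc, "NA"), and write it back with replace_document(did, doc) on a writable database. The extra list costs a little memory, but it keeps reading and writing strictly separate.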

Just something to share: this reminded me of a game. In it, too, we iterate through all the blocks, trying to find a loose block that can be removed without making the building fall, until at some point the building becomes unstable and falls over.

Though the process is similar, we want to delete based on a condition while iterating through the blocks.

I recently discovered this game and I am liking it :)
