Image for post
Image for post

Dev Notes: Fuzzy Search on PostgreSQL

Ben Orozco
May 6, 2016 · 3 min read

My previous experience with search in Rails was 5+ years ago, then I used Sphinx as a Full-Text external search engine, which seems to have gone out of fashion.

At Nimbo X we use pg_search rails gem for searching purposes, with Full-Text already enabled in PostgreSQL itself, which has worked for us so far.

Nevertheless, with the increase of usage among practicians, along with our recent introduction of Medispan, support tickets related to mismatched search queries have become a cause of pain for the team. We knew our users deserve better.

Image for post
Image for post
#MayThe4thBeWithYou

First we needed to understand the current landscape on psql search. Luckily PgSearch documentation pointed us in the right direction: along with full-text search, other algorithms are supported through PostgreSQL contrib packages: fuzzystrmatch (Levenshtein, Double Metaphone) & pg_trgm (Trigrams).

Algorithms

Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other.

https://en.wikipedia.org/wiki/Levenshtein_distance

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

https://en.wikipedia.org/wiki/Soundex

Metaphone is a phonetic algorithm… It fundamentally improves on the Soundex algorithm… which does a better job of matching words and names which sound similar.

https://en.wikipedia.org/wiki/Metaphone

Trigrams are a special case of the n-gram, where n is 3. They are often used in natural language processing for doing statistical analysis of texts.

https://en.wikipedia.org/wiki/Trigram

Implementation

PgSearch makes fuzzystrmatch straightforward to implement:

$ rails g pg_search:migration:dmetaphone
$ rake db:migrate
class Word < ActiveRecord::Base
include PgSearch
pg_search_scope :that_sounds_like,
:against => :spelling,
:using => :dmetaphone
end


four = Word.create! :spelling => 'four'
far = Word.create! :spelling => 'far'
fur = Word.create! :spelling => 'fur'
five = Word.create! :spelling => 'five'

Word.that_sounds_like("fir") # => [four, far, fur]

On the other side, pg_trgm queries are more expensive computationally. Creating a separate GIST (Generalized Index Search Tree) can be a palliative by increasing performance, although creating the index itself is expensive as well, and might involve some downtime. PgSearch makes pg_trgm a breeze to implement too:

class Website < ActiveRecord::Base
include PgSearch
pg_search_scope :kinda_spelled_like,
:against => :name,
:using => :trigram
end

yahooo = Website.create! :name => "Yahooo!"
yohoo = Website.create! :name => "Yohoo!"
gogle = Website.create! :name => "Gogle"
facebook = Website.create! :name => "Facebook"

Website.kinda_spelled_like("Yahoo!") # => [yahooo, yohoo]

Rails Migrations

It is nice to have rails migrations that activate the contrib packages in Postgres:

Resources

https://gist.github.com/benoror/e78f6bdff1b02de7ceed28c9b08fe435

The Backlog by Ecaresoft

Sharing our journey: from software development to company…

Ben Orozco

Written by

Healthtech Hacker — Full Stack Dev — Open Source & Crypto Enthusiast — CTO 🌳 @HealthTreeNet — Previously @ecaresoft & @Nimbo_X

The Backlog by Ecaresoft

Sharing our journey: from software development to company culture and productivity hacks.

Ben Orozco

Written by

Healthtech Hacker — Full Stack Dev — Open Source & Crypto Enthusiast — CTO 🌳 @HealthTreeNet — Previously @ecaresoft & @Nimbo_X

The Backlog by Ecaresoft

Sharing our journey: from software development to company culture and productivity hacks.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store