In Search of a Better Search (v.2.1.0)
When CenterlineScores v.2.0 launched late last year, one of the many things that changed was the underlying technology used to search our database of Horses and Riders. The old system was based on a “hacked-together” brute-force approach that searched for character sequences in the text of each database record. While it worked reasonably well, it had a few unintended “side-effects”, such as displaying both old and new names (for example, when a Rider changes her last name) and showing “duplicates” in the search result lists.
A Better (but Misconfigured) Search Engine
The new Search Engine under the covers in CenterlineScores.com is industry-standard search server software used all over the world by companies as large as eBay. It can search on multiple criteria and calculate “relevance” scores based on the relative uniqueness of the terms being searched (for example, very common words like “a”, “and” and “the” are weighted less). Unfortunately, because CLS 2.0 represented a significant rewrite of almost every part of the system, we didn’t put as much effort into configuring the underlying search engine to work the way our users had come to expect. As a result, the default CLS Search yielded results that varied WIDELY in quality and relevance.
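To illustrate how common words can be weighted less, here is a minimal Python sketch of an inverse-document-frequency style weight. It is not the formula our search engine actually uses; the function name `idf_weights` and the sample horse names are made up for this example.

```python
import math


def idf_weights(documents):
    """Return a weight per word: words found in fewer documents score higher."""
    total = len(documents)
    doc_freq = {}
    for doc in documents:
        # Count each word once per document, ignoring case.
        for word in set(doc.lower().split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    # Simplified, smoothed inverse document frequency (an assumption for illustration).
    return {word: math.log(1 + total / count) for word, count in doc_freq.items()}


# Hypothetical horse names, for illustration only.
names = ["The Black Pearl", "The Duke", "Ashley and the Moon", "Minea"]
weights = idf_weights(names)

for word in ("the", "minea"):
    print(word, round(weights[word], 3))
```

In this sketch, “the” appears in three of the four names and ends up with a lower weight than “minea”, which appears in only one.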
Tweaked and Tuned
When a long-time supporter and friend of the site got in touch to share her growing frustration with the new version of CLS, we got on the phone to try to sort things out. Within a few minutes, we identified the Search Engine as the primary source of her frustration: she couldn’t find any of the horses/riders she was looking for.
We’ve now updated the configuration and tuning of the Search Algorithm in these areas:
- Rider results are now weighted slightly higher than Horse results: The high prevalence of “human” names given to horses as well as the fact that there are more horses than riders in the database was causing the search results to be flooded with horses named “Jessica” when you’re actually searching for a rider.
- Search terms are now automatically combined into phrases: with the search term “Ashley Minea”, we now search for “Ashley” and “Minea” separately, and also search for the phrase “Ashley Minea” as a whole.
- Recent show activity is weighted slightly higher: Because the CLS database covers 25+ years and millions of records, older (long inactive) records were getting ranked as high as (or higher than) records with more recent activity. This tweak adds a slight bump for more recent show activity and slightly lowers the relevance of Riders/Horses who have never shown.
- “Fuzzy” search and misspellings: We have also improved the search engine’s ability to locate results when the search term contains inadvertent misspellings or letter transpositions. For example, when a search term misspells the name “Ashley” as “Asley”, the system uses the context to recognize that “Ashley” is the more likely intended name and scores the results accordingly.
All of these components work together to calculate a “relevance” score, which estimates how closely a record matches the search criteria. The score components are combined into a final score, which is used to rank the results that are presented.
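As a rough illustration of how these components could be combined, here is a small Python sketch. It is not our production configuration: the `Record` class, the weights (`RIDER_BOOST`, `PHRASE_BONUS`, `FUZZY_THRESHOLD`), the recency formula and the sample records are all assumptions made up for this example, and the fuzzy matching uses Python's standard `difflib` rather than the real search engine.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Record:
    name: str
    kind: str                   # "rider" or "horse"
    last_shown: Optional[date]  # None if the horse/rider has never shown


# Hypothetical tuning values, chosen only for illustration.
RIDER_BOOST = 1.2      # riders rank slightly above horses
PHRASE_BONUS = 2.0     # extra credit when the whole phrase matches
FUZZY_THRESHOLD = 0.8  # minimum similarity to count a misspelled term


def term_score(term, words):
    """Best similarity between a search term and any word in the name."""
    best = max((SequenceMatcher(None, term, w).ratio() for w in words), default=0.0)
    return best if best >= FUZZY_THRESHOLD else 0.0


def recency_factor(last_shown, today):
    """Slight bump for recent activity, slight penalty for never having shown."""
    if last_shown is None:
        return 0.9
    years = (today - last_shown).days / 365.25
    return 1.0 + 0.2 / (1.0 + years)


def relevance(query, record, today):
    terms = query.lower().split()
    name = record.name.lower()
    # Each term is scored separately (with fuzzy matching)...
    score = sum(term_score(t, name.split()) for t in terms)
    # ...and the full phrase earns a bonus when it appears as-is.
    if len(terms) > 1 and " ".join(terms) in name:
        score += PHRASE_BONUS
    if record.kind == "rider":
        score *= RIDER_BOOST
    return score * recency_factor(record.last_shown, today)


# Hypothetical records, for illustration only.
today = date(2019, 3, 1)
records = [
    Record("Ashley", "horse", date(2000, 1, 1)),
    Record("Ashley Minea", "rider", date(2019, 2, 1)),
    Record("Jessica", "horse", None),
]
ranked = sorted(records, key=lambda r: relevance("Asley Minea", r, today), reverse=True)
print([r.name for r in ranked])
```

Even with “Ashley” misspelled as “Asley”, the rider record ranks first in this sketch because it matches both terms, gets the rider boost and has recent show activity.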
Updated Results Format
In response to feedback on the format of the Search Results, we have adjusted, and will continue to fine-tune, the data that is returned in the search results list. One group of users told us that they used our search as a quick lookup to view/verify USDF/USEF membership numbers. We broke this use case when we removed those membership numbers from the results, so we’ve added them back. Horses now include level, breed name and (where available) coat/color.
All of the improvements to search outlined above shipped in our latest version, v.2.1.0, which was deployed last week on Feb 27, 2019.
Moving Forward
At CenterlineScores.com, we are committed to building out a platform that provides useful and usable ways to Find and Interact with Dressage Horses, Riders and Shows data, so we want to know what you want to see next with regard to Search.
We can make the search do whatever you need it to do. Just tell us what works for you in the way you use CLS. Want to narrow the results down to just horses by including the word “horse”? Want to be able to search by owner’s name or breeder’s name? Want to search by Sire name? We can make it do anything, but we want it to be useful. So let us know!
Go to https://ideas.centerlinescores.com/ to submit an idea and/or vote on existing ideas. As always, thanks for continuing to use CenterlineScores.com!