Automated Solution for Query Elevation using Solr

Kais Hassan
Jul 30, 2016

Introduction

When building search solutions, you may encounter situations where you want to bypass the relevancy ranking for a particular query and promote certain documents to the top of the search results. This is usually referred to as sponsored search. It is popular in search advertising and resembles the way Google AdWords works, where paid ads are elevated to the top, as shown in Figure 1.

Figure 1: Google Paid Ads

The Query Elevation Component

Apache Solr provides the QueryElevationComponent, which lets you configure the top results for a given query regardless of their score. The component offers several useful features, including:

  • Configurable FieldType for the query: you can define a custom analysis pipeline for the query text, for example to lowercase it or apply a stemmer.
  • Forced elevation on sorted results: you can choose whether elevation respects or overrides the sort parameter.
  • Enable/disable elevation parameter: you can switch elevation off via a query parameter, which is very useful for testing.
  • Marking elevated documents in the results: this is useful if you want to apply a different UI style to the elevated documents.
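
As a rough illustration of the last three features, the sketch below builds a search URL that sets the elevation switches explicitly. The enableElevation and forceElevation parameters and the [elevated] document transformer are the ones documented for the QueryElevationComponent. The host, the core name (products), the /elevate handler path, and the name and price fields are assumptions made up for this example.

```python
from urllib.parse import urlencode

# Assumed for illustration: a local Solr, a core named "products", and a
# request handler at /elevate that includes the elevation component.
SOLR_BASE = "http://localhost:8983/solr/products/elevate"


def build_elevation_url(query, enable=True, force=False):
    """Return a search URL with the elevation switches set explicitly."""
    params = {
        "q": query,
        # Turn elevation on or off for this request (handy when testing).
        "enableElevation": str(enable).lower(),
        # Ask Solr to keep elevated documents on top despite the sort.
        "forceElevation": str(force).lower(),
        # Hypothetical sort field, only here to make forceElevation relevant.
        "sort": "price asc",
        # [elevated] adds a flag to each document that was elevated.
        "fl": "id,name,[elevated]",
    }
    return SOLR_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_elevation_url("ipod", enable=True, force=True))
```

Calling build_elevation_url("ipod", enable=False) gives the plain relevancy-ranked results for comparison.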

As with many Solr components, you configure the QueryElevationComponent by providing a file that lists the queries along with the ids of the promoted documents, as shown in the following listing. However, editing XML files by hand is not practical for query elevation, since most elevation data is volatile and only valid for a short time. A typical example is a timed campaign that promotes certain products in an e-commerce application.

<?xml version="1.0" encoding="UTF-8" ?>
<elevate>
  <query text="foo bar">
    <doc id="1" />
    <doc id="2" />
    <doc id="3" />
  </query>

  <query text="ipod">
    <doc id="MA147LL/A" /> <!-- put the actual ipod at the top -->
    <doc id="IW-02" exclude="true" /> <!-- exclude this cable -->
  </query>

</elevate>

Configuration Automation

In one of the search systems I am currently working on, we needed the functionality of the QueryElevationComponent without having to edit the elevate.xml configuration file by hand. The situation is as follows:

  • Elevation data should be stored in a DB managed by the CRM.
  • We are using Solr along with the DataImportHandler (DIH) for indexing.
  • The solution must be automated and must integrate with the DIH scheduler we have in place.

Solution

The best solution was to create a new Solr plugin that can be invoked from the DIH event listeners, onImportStart or onImportEnd. The Solr Elevate Creator Plugin begins by reading a configuration file containing the SQL statement required to retrieve the promoted documents. It then generates elevate.xml and stores it in the Solr core's data directory; when DIH commits its changes to the index, elevate.xml is refreshed. Figure 2 illustrates the process flow.
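
The core of this flow (run the SQL, group the rows by query, emit the XML) can be sketched in a few lines. The plugin itself is a Solr plugin that reads from the database over JDBC; the Python below is only an illustration of the idea, with an in-memory SQLite database standing in for the CRM database. The promotions table, its product_id column, and the function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def fetch_promotions(conn, sql):
    """Run the configured SQL; it must return id and elevation_query_text."""
    cur = conn.execute(sql)
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


def build_elevate_xml(rows):
    """Group document ids by query text and render an <elevate> document."""
    by_query = {}
    for row in rows:
        by_query.setdefault(row["elevation_query_text"], []).append(str(row["id"]))
    root = ET.Element("elevate")
    for text, ids in by_query.items():
        query = ET.SubElement(root, "query", {"text": text})
        for doc_id in ids:
            ET.SubElement(query, "doc", {"id": doc_id})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Stand-in for the CRM database (hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE promotions (product_id TEXT, elevation_query_text TEXT)")
    conn.executemany(
        "INSERT INTO promotions VALUES (?, ?)",
        [("1", "foo bar"), ("2", "foo bar"), ("MA147LL/A", "ipod")],
    )
    rows = fetch_promotions(
        conn, "SELECT product_id AS id, elevation_query_text FROM promotions"
    )
    print(build_elevate_xml(rows))
```

A real run would write the resulting string to the elevate.xml file in the core's data directory instead of printing it.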

Figure 2: Elevate Plugin Process Flow

Plugin Options

Besides the JDBC connection configuration, the following are some of the options for the plugin:

  • elevateOutputFile (string, default: elevate.xml): Name of the output file. It should match the config-file setting of the QueryElevationComponent.
  • dataDir (boolean, default: true): If true, elevateOutputFile is stored in the core's data folder; if false, it is stored in the core's conf folder. Set it to false if you are using SolrCloud. In standalone mode it is better to keep it true, since the generated file is then refreshed after each commit.
  • sql (string): The SQL statement that retrieves the id and elevation_query_text columns. Both columns must exist in the result; you can use an alias such as 'select myID as id' if the names in your DB differ.
  • splitRegex (string): To elevate a single document for several queries, store the queries in the elevation_query_text column separated by commas or another delimiter, and set this option to a regex that matches the delimiter.
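
To make the splitRegex idea concrete, here is a small sketch of how one row with several comma-separated queries could be expanded into one (query, document id) pair per query. It is an assumption about the general approach, not the plugin's actual code; the function name and the default pattern are made up for the example, and the plugin may trim or filter values differently.

```python
import re


def expand_queries(rows, split_regex=r"\s*,\s*"):
    """Return one (query_text, doc_id) pair per query found in each row."""
    pairs = []
    for row in rows:
        # Split the stored column value on the configured delimiter.
        for text in re.split(split_regex, row["elevation_query_text"].strip()):
            if text:  # skip empty pieces, e.g. from a trailing delimiter
                pairs.append((text, row["id"]))
    return pairs


if __name__ == "__main__":
    rows = [{"id": "MA147LL/A", "elevation_query_text": "ipod, mp3 player ,apple"}]
    for text, doc_id in expand_queries(rows):
        print(text, "->", doc_id)
```

Each resulting pair then becomes a doc entry under the query element with the matching text.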

Check the Solr Elevate Creator Plugin GitHub repository for complete usage information. Also see the Solr Elevate Example, a minimal, ready-to-use Solr distribution (based on 6.1) that contains the plugin along with sample data and configuration.
