Understanding and controlling the order of search results in Oracle Commerce Cloud

Published in Oracle Developers, Jan 8, 2019

In Oracle Commerce Cloud, we give merchandisers and our partners a large degree of control and influence over how search results are processed and returned. There are, however, a number of interrelated concepts that can be confusing.

This article strives to help you understand how search results are ordered, along with tips and tricks for further controlling results.

Matching versus Relevancy Ranking

When a shopper submits a keyword search, the search engine first attempts to match a set of records (in this case, Product records). A number of configurations determine which products match the search terms.

Search Interfaces

A search interface is the collection of fields that are searched. Out of the box, only a few fields have been added to the search interface named “All”. These typically include the product display name and category.

Search interfaces are configured in OCC Admin under Search -> Searchable Field Ranking. The “Add Field” drop-down box lists the fields that are enabled for keyword search but are not yet part of the search interface.

While this screen is entitled “Searchable Field Ranking”, it’s important to understand that all of the fields in a given search interface are searched. For instance, the engine does not first try to match product.description and then move on to product.category only if no matches are found. It searches for matches in all of the fields.

Care should be taken as to which fields are added to a search interface. The long description might be tempting, but if your descriptions are very long, matching against them can yield garbage results. Adding numeric fields (such as prices or ratings) is mostly a waste of time, since no shopper searches for literal strings like “$19.98”. (Instead, they would navigate or filter on price.)
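To make the matching behavior concrete, here is a minimal Python sketch. It is not how Oracle Commerce Cloud implements search: the `SEARCH_INTERFACE` list, the `matches` function and the sample product are all hypothetical, and matching is simplified to a case-insensitive substring test. It only illustrates that a product matches when the term is found in any field of the interface, which is why a long description can pull in unwanted results.

```python
# Hypothetical search interface: the list of field names that get searched.
SEARCH_INTERFACE = ["product.displayName", "product.category", "product.description"]

def matches(product: dict, term: str, fields=SEARCH_INTERFACE) -> list:
    """Return every field in the interface whose value contains the term."""
    term = term.lower()
    return [f for f in fields if term in str(product.get(f, "")).lower()]

# Hypothetical product: a t-shirt that only mentions "HDTV" in its description.
tshirt = {
    "product.displayName": "Retro T-Shirt",
    "product.category": "Apparel",
    "product.description": "Cotton shirt with a picture of an HDTV",
}

matched_fields = matches(tshirt, "hdtv")
print(matched_fields)  # ['product.description']
```

The t-shirt matches a search for “hdtv” purely because of its description, even though name and category are checked as well.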

(Further down, we’ll discuss the “Ranking” and “Cross-field matching” aspects of this screen.)

Thesaurus

Oracle Commerce Cloud ships with a built-in thesaurus containing many common words for different languages. For instance, it will handle “shirt” == “shirts”. However, it only handles basic common-language synonyms. It would not contain specialized values for “iphone” == “iphones”, or “tv == hdtv == hdtvs == lcd tvs”, etc.

Merchandisers can configure additional thesaurus entries. When the shopper performs a keyword search, the synonyms are considered and will increase the number of matching products returned.
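As a rough illustration of why synonyms increase the number of matches, the sketch below expands a search term into its synonym group before matching. The `THESAURUS` structure and the `expand` function are hypothetical and assume simple equivalent-term groups; the real thesaurus format in Oracle Commerce Cloud may differ.

```python
# Hypothetical merchandiser-defined synonym groups (all terms are equivalent).
THESAURUS = [
    {"tv", "hdtv", "hdtvs", "lcd tvs"},
    {"iphone", "iphones"},
]

def expand(term: str) -> set:
    """Return the term plus every synonym that shares a group with it."""
    term = term.lower()
    expanded = {term}
    for group in THESAURUS:
        if term in group:
            expanded |= group
    return expanded

print(sorted(expand("tv")))       # ['hdtv', 'hdtvs', 'lcd tvs', 'tv']
print(sorted(expand("blender")))  # ['blender']
```

A search for “tv” now has four chances to match a product instead of one, while a term with no entry is searched as-is.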

Matching Conclusion

So the search interfaces and thesaurus are the two primary ways for merchandisers to control which products in their catalog are returned. But that configuration doesn’t determine the ordering of the results. That is where Relevancy Ranking comes in.

Relevancy Ranking

Relevancy ranking is the process by which the set of matching products is placed in some sort of order. Let’s do a quick example. Say our catalog has 10,000 products. A shopper searches for “hdtv”. Keyword search processing (using search interfaces and thesaurus) matches 400 of those 10,000. Now it is relevancy ranking’s turn to order those 400 “hdtv” products.

As of the 18D (18.6) release, the ordering considers the following:

Boost and bury: This allows the merchandiser to select specific products or navigation states to push to the very top or bottom of the list of results.
Relevancy ranking modules: These examine the set of search results and try to figure out which are the most relevant to the shopper’s search terms.
Dynamic Curation: Discussed at length in this article (https://medium.com/oracledevs/dynamic-curation-of-product-listings-with-oracle-commerce-cloud-3a6de6f01450), Dynamic Curation allows you to further influence relevant results based on product attributes such as view count, whether the product is on sale or in stock, etc.
Static ranking: This is a final sort of the results when the above returns ties. For instance, if two different products exactly match the shopper’s search terms, a static ranking would sort them (perhaps by price descending, alphabetically, or by some other criterion).

Boost and Bury

The boost and bury feature gives merchandisers fine-tuned control over results. For instance, perhaps there is an exciting new HDTV being sold, and merchandisers want to place this result first. The boost and bury feature allows them to do just that.

It also allows the burying of products to the end of the list. Perhaps the catalog contains a t-shirt with a picture of an HDTV, and “hdtv” appears in the product description. Merchandisers want actual HDTVs to show up first, not this t-shirt. They can bury this t-shirt to the end of the list.

It is our hope that merchandisers don’t feel the need to use boost/bury for every search term. In the following sections, we’ll discuss relevancy ranking modules and Dynamic Curation. Between those two configuration options, you can hopefully get good search results naturally and resort to boost/bury only in exceptional circumstances.

Relevancy Ranking Modules

Relevancy ranking modules will order the bulk of the results. These modules use various rankers to compute a numeric score for each matching product. They can look at criteria like:

“Did the search terms occur in the same field, or across fields?”
“Did the search terms exactly match a field?”
“Did the terms match in a high-ranked field like product name, or a low-ranked field like long description?”
“Did this record match all of the shopper’s terms, or only a subset of them?”

Oracle Commerce Search ships with a number of relevancy ranking modules that can be configured via REST. A complete set of documentation on this subject can be viewed at https://docs.oracle.com/cd/E97801_02/Cloud.18D/ExtendingCC/html/s4324understandhowrelevancerankingmod01.html

Understanding Searchable Field Ranking

Remember how the UI in OCC Admin is entitled “Searchable Field Ranking”? There is a reason for that specific name: two of the relevancy ranking modules, Maxfield and Field (plus a few others if you use the advanced “considerFieldRanks” configuration), rank records based on which fields the terms are found in. The order of fields in Searchable Field Ranking is the order that Maxfield and Field use. So if your “All” search interface had product name, short description, category and long description (in that order), Maxfield and Field would rank matches in product name ahead of matches in long description.

Remember, too, the selector in “Searchable Field Ranking” for cross-field matches? It is used in conjunction with the Field relrank module, and it lets you rank a cross-field match ahead of matches within a single field.

Suppose a shopper searches for “lcd sony TV”, and our search interface has the four fields above (product name, short description, category and long description). Even if those three terms are spread out across name, description and category, we might still consider that record more relevant than one where all three are found in long description. In that case, we would move the “Cross-field matching” selector to just above long description.
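The example above can be sketched in a few lines of Python. This is a simplified model written for illustration, not the actual Field module: the `CROSS_FIELD` marker, the short field names and the `field_rank` function are hypothetical, and term matching is reduced to a substring test. A lower position means a better rank.

```python
CROSS_FIELD = "<cross-field match>"  # hypothetical marker for the selector

# Order from the example: the selector sits just above long description.
FIELD_ORDER = ["name", "short_description", "category", CROSS_FIELD, "long_description"]

def field_rank(product: dict, terms: list, order=FIELD_ORDER):
    """Position in `order` of the best way this product matches all terms."""
    terms = [t.lower() for t in terms]
    values = {f: str(product.get(f, "")).lower() for f in order if f != CROSS_FIELD}
    # Best case: a single field contains every term.
    for position, entry in enumerate(order):
        if entry != CROSS_FIELD and all(t in values[entry] for t in terms):
            return position
    # Otherwise the terms may be spread across several fields.
    if all(any(t in v for v in values.values()) for t in terms):
        return order.index(CROSS_FIELD)
    return None  # the product does not match all terms

spread_out = {"name": "Sony Bravia", "short_description": "55 inch LCD", "category": "TV"}
all_in_long = {"name": "Wall Mount", "long_description": "Fits any Sony LCD TV"}

terms = ["lcd", "sony", "TV"]
print(field_rank(spread_out, terms))   # 3
print(field_rank(all_in_long, terms))  # 4
```

With the selector above long description, the product whose terms are spread across name, description and category (position 3) outranks the one that only matches in its long description (position 4).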

Dynamic Curation

Dynamic Curation allows merchandisers to influence the ordering of the set of search results based on fields unrelated to the search terms entered by the shopper. For instance, merchandisers might wish to nudge products that are not on sale towards the back of the list, or to highlight new products.

It’s important to note, however, that as configured out-of-the-box, Dynamic Curation is included as part of the relevancy ranking modules. What this means is that Dynamic Curation only takes effect AFTER relevancy ranking has started ordering results. It’s essentially just another relevancy ranking module, and as such, it can’t override relevancy the way boost/bury does. The influence of Dynamic Curation can thus be quite subtle.

Let’s use an example: Let’s say we’ve searched for “Sony 1080p lcd hdtv”, and there are 10 search results. The first result might be a Brand=Sony HDTV, with all 4 of the search terms appearing in the product name. Maybe the 10th result is a Vizio 1080p LCD HDTV. It matched most of the search terms (but not all). No matter what, using Dynamic Curation there is no way to push that Vizio TV ahead of the Sony TV (since the Sony TV is the more relevant result).

But let’s say results 1 through 8 are all Sony TVs, and all of them matched the terms with equal scores by the relevancy ranking modules. That’s where Dynamic Curation can influence the results: Merchandisers might push New, not-on-sale TVs to the front. The Vizio would still be 10th, but the first search result would be the newest Sony HDTV that is not-on-sale.

If a merchandiser wanted that Vizio to be the first result, they’d have to use Boost/bury to do so.
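One way to picture this is as a sort with two keys, where the curation key only matters when the relevancy key is tied. The sketch below is an assumption-laden model, not Oracle's implementation: the `Result` class, the numeric relevancy values and the `curation_score` rule are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Result:
    name: str
    relevancy: int  # hypothetical score from the relevancy ranking modules
    is_new: bool
    on_sale: bool

def curation_score(r: Result) -> int:
    """Hypothetical curation rule: favor new products that are not on sale."""
    return int(r.is_new) + int(not r.on_sale)

results = [
    Result("Sony A (older, on sale)", relevancy=100, is_new=False, on_sale=True),
    Result("Vizio (new, full price)", relevancy=75, is_new=True, on_sale=False),
    Result("Sony B (new, full price)", relevancy=100, is_new=True, on_sale=False),
]

# Relevancy is compared first; curation only separates equal relevancy scores.
ordered = sorted(results, key=lambda r: (-r.relevancy, -curation_score(r)))
print([r.name for r in ordered])
```

The Vizio has the best curation score here, but it still sorts last because its relevancy is lower; curation only reorders the two Sony results.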

Static ranking

So at this point, the list of matching products has gone through boost/bury, relevancy ranking modules and Dynamic Curation. If there are products that have gone through all of that and still have the same relevancy scores, static ranking will determine the very final ordering.

Oracle Commerce Cloud ships with a default static ranking rule that will sort products by their display name in ascending order.
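Putting the stages together, the overall ordering can be modeled as one sort over a composite key, with the display name as the final tie-breaker. This is only a sketch under stated assumptions: `order_results`, the product dictionaries and the precomputed `relevancy` and `curation` numbers are hypothetical, and boost/bury is reduced to three buckets (boosted, normal, buried).

```python
def order_results(products, boosted=(), buried=()):
    """Sort by boost/bury bucket, then relevancy, then curation, then display name."""
    def key(p):
        bucket = 0 if p["id"] in boosted else 2 if p["id"] in buried else 1
        return (bucket, -p["relevancy"], -p["curation"], p["displayName"])
    return sorted(products, key=key)

# Hypothetical matches for "hdtv" with made-up scores.
products = [
    {"id": "tshirt", "displayName": "HDTV Logo T-Shirt", "relevancy": 90, "curation": 1},
    {"id": "tv-b", "displayName": "Beta HDTV", "relevancy": 80, "curation": 1},
    {"id": "tv-a", "displayName": "Alpha HDTV", "relevancy": 80, "curation": 1},
    {"id": "tv-new", "displayName": "Zeta HDTV", "relevancy": 60, "curation": 0},
]

ordered = order_results(products, boosted={"tv-new"}, buried={"tshirt"})
print([p["id"] for p in ordered])  # ['tv-new', 'tv-a', 'tv-b', 'tshirt']
```

The two products with identical relevancy and curation scores ("tv-a" and "tv-b") are separated only by the ascending display-name sort, which is the role static ranking plays.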

Controlling Relevancy Ranking via REST

Search interfaces, thesaurus, boost/bury and Dynamic Curation can all be controlled in OCC Admin via nice UI tools.

However, the relevancy ranking modules and static ranking can only be configured via REST. Also, starting in 18C or so, the configuration location has changed (and become a bit more complex).

Since boost/bury, Dynamic Curation, relevancy ranking and static ranking can each be configured independently of each other, each one of those has a separate series of REST endpoints.

Changing the default Relevancy ranking via REST

Relevancy ranking is configured at /gsadmin/v1/cloud/content/system/rankingRules/defaultStandardSearch

First, do a GET against that endpoint. Out of the box (as of 18D), it is configured with the nterms, maxfield and exact modules.

Make your modifications to that JSON, and then PUT to the same endpoint.
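The GET, modify, PUT cycle could be scripted roughly as follows, using only the Python standard library. Treat this as a sketch: the host name in the usage note, the bearer-token header and the helper names (`build_request`, `send`, `update_rule`) are assumptions for illustration, and the shape of the rule JSON is deliberately left to a caller-supplied `modify` function because it depends on your environment.

```python
import json
import urllib.request

# Endpoint path taken from the article.
RULE_PATH = "/gsadmin/v1/cloud/content/system/rankingRules/defaultStandardSearch"

def build_request(base_url, token, path=RULE_PATH, body=None):
    """Build a GET (no body) or PUT (with body) request for a ranking rule."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        method="GET" if body is None else "PUT",
    )
    # Assumption: the admin API accepts a bearer token; adjust to your setup.
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request

def send(request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

def update_rule(base_url, token, modify, path=RULE_PATH):
    """GET the current rule, apply `modify` to the parsed JSON, PUT it back."""
    current = send(build_request(base_url, token, path))
    return send(build_request(base_url, token, path, body=modify(current)))
```

A call might look like `update_rule("https://admin.example.com", token, my_changes)`, where the host is a placeholder and `my_changes` is your own function that takes the retrieved JSON and returns the edited version. Passing a different `path` lets the same helper work for other ranking rule endpoints.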

Changing the default static ranking via REST

Static ranking is configured at /gsadmin/v1/cloud/content/system/rankingRules/defaultStandardStatic

Again, first do a GET to retrieve the current JSON. Make your modifications and PUT it back to the same endpoint.

The out-of-the-box configuration has a static sort on product.displayName and sku.listingOptionIndex. (In reality, since no two products have exactly the same display name, the static sort on sku.listingOptionIndex probably does nothing.) You would want to replace those existing strings with your custom static ordering.

Understanding the folder structure of rules

If you perform a GET against /gsadmin/v1/cloud/content/system/rankingRules/ you would see a series of folders, such as:

defaultMerchBlend
defaultMerchBoostBury
defaultRecsBoost
defaultStandardSearch: default relrank (covered above)
defaultStandardStatic: default static sort (covered above)

For the most part, you should not touch /content/system except for configuring relrank or static. The exception might be “defaultMerchBlend”. You might wish to create a default Dynamic Curation rule that applies to search results, but do NOT want it to appear under the Dynamic Curation tool in OCC Admin (so that merchandisers can’t fiddle with it and break things).

As merchandisers create Dynamic Curation or Boost/Bury rules, those will get created in a slightly different location:

/gsadmin/v1/cloud/content/rankingRules

A rule named “Default” is created for you out-of-the-box. That rule is useful for configuring a default Dynamic Curation rule that applies to all searches (unless you create a location-specific rule).

Determining Why Results are ordered the way they are

You perform a keyword search, and the results are ordered by relevancy ranking. But you’re curious: How exactly did the set of products you’re seeing get ordered that way?

The search index provides meta-data about this. The two options are called WhyRank and WhyMatch. This information is only available in your Preview storefront (since it can be potentially expensive to compute, and we don’t want to slow down live Storefronts which shoppers are using).

WhyRank and WhyMatch have been turned on by default in all preview storefronts starting in 18C or so. The information is returned in the JSON response from the /ccstoreui/v1/search endpoint, and it is returned for each product.

In my local application, I searched for “movie” by performing a GET /ccstoreui/v1/search?Ntt=movie

On the first product record, I have this for WhyMatch:

"DGraph.WhyMatch": [
 "[{\"fields\":[\"product.category@locale:en\"], \"terms\":[ {\"term\":\"movie\", \"expansions\":[ {\"type\": \"Substring/Phrase\"}]}]},{\"fields\":[\"product.displayName@locale:en\"], \"terms\":[ {\"term\":\"movie\", \"expansions\":[ {\"type\": \"Substring/Phrase\"}]}]}]"
],

… and this for WhyRank:

"DGraph.WhyRank": [
 "[ { \"nterms\" : { \"evaluationTime\" : \"0.0009765625\",   \"stratumRank\" : \"100\", \"stratumDesc\" : \"Matched 1 of 1 terms\" }}, 
       { \"maxfield\" : { \"evaluationTime\" : \"0.0009765625\", \"stratumRank\" : \"3\", \"stratumDesc\" : \"field match\", \"rankedField\" : \"product.displayName@locale:en\" }}, 
       { \"exact\" : { \"evaluationTime\" : \"0.011962890625\", \"stratumRank\" : \"2\", \"stratumDesc\" : \"subphrase match\", \"rankedField\" : \"product.category@locale:en\" }}, 
       { \"static\" : { \"sortedBy\" : [{\"fieldName\":\"product.displayName@locale:en\",\"fieldType\":\"String\",\"directionCompared\":\"ascending\"}] }}, 
       { \"static\" : { \"sortedBy\" : [{\"fieldName\":\"sku.listingOptionIndex\",\"fieldType\":\"Int\",\"directionCompared\":\"ascending\"}] }} ]"
              ],

Okay, so that’s not terribly user friendly. But looking in that JSON-within-JSON, for WhyMatch we see the fields it matched (product.category and product.displayName). We see @locale:en which matches the language we searched in.

The WhyRank information is even less user-friendly, but there is still some useful information in it. For Nterms, we see it matched “1 of 1 terms”. For Maxfield, the field used for ranking was product.displayName. For Exact, it only found a subphrase match, which means that neither product.displayName nor product.category had a value that was 100% exactly “movie”, only that our search for “movie” matched a subphrase. (For example, maybe the Category name was “Horror movies”.)
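Because the value is a JSON string embedded inside the JSON response, it has to be decoded a second time before it becomes readable. Here is a small Python sketch that does so for an abbreviated copy of the WhyRank value shown above. The `summarize_why_rank` helper is hypothetical, and the sketch assumes you have already pulled the record's attributes out of the search response into a dictionary (where exactly they sit in the response is not covered here).

```python
import json

# Abbreviated from the WhyRank value above: a list holding one JSON string.
record_attributes = {
    "DGraph.WhyRank": [
        '[ { "nterms" : { "stratumRank" : "100", "stratumDesc" : "Matched 1 of 1 terms" }},'
        ' { "maxfield" : { "stratumRank" : "3", "stratumDesc" : "field match",'
        ' "rankedField" : "product.displayName@locale:en" }},'
        ' { "exact" : { "stratumRank" : "2", "stratumDesc" : "subphrase match",'
        ' "rankedField" : "product.category@locale:en" }} ]'
    ]
}

def summarize_why_rank(attributes: dict) -> list:
    """Decode the embedded JSON and return (module, description, field) tuples."""
    summary = []
    for encoded in attributes.get("DGraph.WhyRank", []):
        for entry in json.loads(encoded):  # second decode: JSON inside JSON
            for module, details in entry.items():
                summary.append(
                    (module, details.get("stratumDesc"), details.get("rankedField"))
                )
    return summary

for row in summarize_why_rank(record_attributes):
    print(row)
```

Printing one row per module makes it much easier to compare several products side by side than reading the escaped string.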

If you collect that information for a few products, you’ll see differences. Maybe the second product matched on a lower-ranked field.

Search Tuning and the Carpet-bubble effect

When configuring relevancy ranking or Dynamic Curation, it is best to start with a list of important search terms that represent a variety of searching styles.

For instance, your list might include searches that shoppers perform against category names, or product or brand names. Some searches should have multiple search terms (like our “Sony 1080p LCD HDTV” example), while others might be a single term (“hdtv”).

Before starting search tuning, perform searches with that list of terms and give the results for each term a personal score. After changing relevancy ranking or the default Dynamic Curation rules, re-test each search term and compare the new scores with the old ones.

What sometimes happens is you improve one set of searches, but mistakenly ruin others. This is what I refer to as the carpet-bubble effect. If you have a carpet in your living room and a bubble forms, you might fix it right HERE, but then you just moved it over THERE, and you’re no better off.
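A lightweight way to catch the carpet bubble is to keep your scores in a simple structure and compare them after every change. The sketch below assumes you record one personal score per search term before and after tuning; the 1 to 5 scale, the sample scores and the `find_regressions` helper are all hypothetical.

```python
def find_regressions(before: dict, after: dict) -> dict:
    """Return terms whose score dropped, mapped to (old score, new score)."""
    return {
        term: (before[term], after[term])
        for term in before
        if term in after and after[term] < before[term]
    }

# Hypothetical personal scores (1 = poor, 5 = great) for the same term list.
before = {"hdtv": 3, "sony 1080p lcd hdtv": 2, "horror movies": 4}
after = {"hdtv": 4, "sony 1080p lcd hdtv": 4, "horror movies": 2}

print(find_regressions(before, after))  # {'horror movies': (4, 2)}
```

Here two searches improved, but “horror movies” got worse: the bubble moved rather than disappeared.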

So don’t test search tuning changes in isolation. And also (between you and me, so shhhhh), beware the CEO Effect. This happens when the CEO or CTO or CIO calls you and says “Hey, I just searched for X and it looks like garbage! Fix this ASAP!” You might make changes to fix that exact search, but inadvertently ruin other searches that actual shoppers are performing. Instead, add those terms to the list of search terms above. When making changes for your CEO, re-assess the other terms as well.

Finally, beware of over-using Boost and Bury. If you find yourself configuring hundreds of boost/bury rules, that probably signifies that your default relevancy ranking modules need improvement. Or it might mean adding (or removing!) fields from your search interface. Your goal should be to develop a good relevancy ranking configuration that covers most searches. Boost/bury should be used situationally, not for every keyword a shopper enters.

Conclusion

Well, I threw a lot of information at you. If I had to summarize what you should do, it would be:

Gather a list of important search terms that reflect search behavior from your customers
Modify the “All” search interface to include relevant fields. The out-of-the-box configuration is inadequate.
Modify the default relevancy ranking strategy and the static ordering. Your goal should be good search results for nearly all of your search terms. (But maybe not all)
Define a default dynamic curation rule
Add business-specific thesaurus entries
Test and re-test your search configuration. After any big changes (changes to your product types, search interfaces, etc) re-test your list of search terms.
Employ boost/bury situationally to improve search results or to make sure your top searches have great results to improve conversion