Enhancing AEM Lucene Search: Advanced Techniques for Improved Search Functionality : Final Part

Kiran Mayee N
Published in Activate AEM
8 min read · Aug 13, 2024

Welcome back, AEM enthusiasts!

We’ve tackled the challenges of indexing management and crafted user-friendly search results with excerpts and highlighting. But what if users struggle to find the perfect search terms, or encounter those inevitable typos?

This article dives into the realm of advanced AEM search functionalities — synonyms, filters, and spell check. We’ll explore how these features empower users to refine their searches and navigate content with greater ease. Imagine a search experience that understands synonyms, helps filter through specific criteria, and even corrects typos on the fly — all leading to a more efficient and frustration-free journey for your users.

Building on a Strong Foundation

This article assumes you’ve read the previous installments in our AEM search optimization series. If you haven’t yet, we recommend starting with the earlier parts of the “Enhancing AEM Lucene Search: Advanced Techniques for Improved Search Functionality” series so that your search setup has a solid foundation.

Previously Covered in this series:

  • Search Suggestion Servlets
  • Optimizing Search with AEM Indexing Management
  • Craft Compelling Search Results: Highlighting Keywords & Controlled Excerpts

Now, let’s unlock the full potential of AEM search with these powerful functionalities!

Empower User Searches: Synonyms, Filters & Spell Check

Imagine searching for “sneakers” and finding results not just for that exact term, but also for “trainers” and “athletic shoes.” This is the magic of synonyms in search.

Synonyms: Expanding the Search Net

When you enter a search query, the system doesn’t just look for that specific word. It also considers synonyms, that is, words with similar meanings. This broadens the search scope, so you find content even if you don’t use the exact terminology that appears in the content itself. These synonyms are defined in the “Synonyms.txt” file in the index, as shown in the Index Management article (Part 2, Section 4.3.3 of this series).

For example:

Here, my full-text search query term is “Pharmaceuticals”, yet the results also include content containing the word “Pharmacy”, because the two are configured as synonyms in my Synonyms.txt file.
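To make the idea concrete, here is a minimal plain-Java sketch of what synonym expansion does conceptually. In AEM the expansion happens inside the index analyzer rather than in application code, so treat this purely as an illustration. The class name, the `expand` method, and the synonym groups are all hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustration only: AEM applies synonyms in the index analyzer.
// The synonym groups below are made up for this example.
final class SynonymExpansionSketch {

    // Each term maps to the other terms it should also match.
    private static final Map<String, List<String>> SYNONYMS = Map.of(
            "pharmaceuticals", List.of("pharmacy"),
            "pharmacy", List.of("pharmaceuticals"),
            "sneakers", List.of("trainers", "athletic shoes"));

    // Returns the lower-cased term followed by its configured synonyms.
    static Set<String> expand(String term) {
        String key = term.toLowerCase(Locale.ROOT);
        Set<String> expanded = new LinkedHashSet<>();
        expanded.add(key);
        expanded.addAll(SYNONYMS.getOrDefault(key, List.of()));
        return expanded;
    }

    public static void main(String[] args) {
        System.out.println(expand("Pharmaceuticals")); // [pharmaceuticals, pharmacy]
        System.out.println(expand("hiking"));          // [hiking]
    }
}
```

A term with no configured synonyms simply matches itself, which is why “hiking” comes back unchanged.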

Dynamic Filters: Refining Your Search Journey

Search systems don’t stop at synonyms. They can also dynamically extract filters based on your query. This means the search itself suggests potential ways to refine your results and pinpoint exactly what you’re looking for.

Here’s a simplified illustration:

Imagine searching for “hiking trails.” The system might automatically identify filters like “region,” “difficulty level,” or “length” based on the query and the content available. You can then use these filters to narrow down your search and find the perfect hiking trail for your needs.

Learn more about: Lucene Faceted Search

AEM allows you to extract filters dynamically based on search queries, enhancing the user experience by suggesting relevant ways to refine their search. Here’s a breakdown of how you might achieve this using Java code:

Identifying Filter Categories:

The first step involves identifying potential filter categories based on your search query and content structure. This can be achieved through:

  • Predefined Rules: Define rules that map specific terms or patterns within the query to filter categories. For example, terms like “location” or “city” might trigger the “region” filter category.
  • Taxonomy Analysis: If you utilize taxonomies within your content, analyze the query against the taxonomy structure to identify relevant categories.
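The predefined-rules option can be sketched in a few lines of plain Java. This is only an assumption about how such rules might look: the class name, the `categoriesFor` method, and the keyword-to-category table are hypothetical and would come from your own content model.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical rule table: query keyword -> filter category.
final class FilterCategoryRules {

    private static final Map<String, String> RULES = Map.of(
            "location", "region",
            "city", "region",
            "easy", "difficulty level",
            "hard", "difficulty level",
            "miles", "length");

    // Returns the filter categories triggered by the words of the query,
    // in the order they first appear.
    static Set<String> categoriesFor(String query) {
        Set<String> categories = new LinkedHashSet<>();
        for (String word : query.toLowerCase(Locale.ROOT).split("\\s+")) {
            String category = RULES.get(word);
            if (category != null) {
                categories.add(category);
            }
        }
        return categories;
    }

    public static void main(String[] args) {
        // "easy" triggers "difficulty level", "city" triggers "region"
        System.out.println(categoriesFor("easy hiking trails near city"));
    }
}
```

Exact keyword matching keeps the sketch short; a real rule set would likely need stemming or pattern matching as well.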

Bucket-based Approach:

One common approach for filter extraction involves using buckets:

  • Bucket Creation: As the search results are processed, create buckets for each identified filter category. Each bucket will hold information about the specific filter options within that category (e.g., different regions or topics).
  • Populating Buckets: Iterate through the search results and analyze the content based on your predefined rules or taxonomy mapping. When a relevant term is encountered, add it to the corresponding bucket. For example, if the content mentions “California,” it would be added to the “region” bucket with the value “California.”

Processing Bucket Data:

Once you have populated the buckets with relevant filter options:

  • Create Filter Data Structures: Iterate through each bucket and construct the data structure representing the filter category and its available options. This might involve creating JSON objects or other data structures suitable for your chosen UI framework.
  • Handle Duplicates: You might want to remove duplicate entries within each filter category to provide a cleaner user experience. Utilize helper methods to identify and eliminate duplicates based on your specific needs.
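The bucket steps above can be sketched in plain Java as follows. This is a minimal illustration, not AEM’s facet API: each search hit is reduced to a hypothetical map of property names to values, and the class and method names are assumptions. Keying each bucket by option value takes care of duplicates and gives a hit count per option for free.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: a hit is modelled as a simple property map.
final class FilterBucketSketch {

    // Result shape: category -> (option value -> number of hits with that value).
    static Map<String, Map<String, Integer>> buildBuckets(
            List<Map<String, String>> hits, List<String> categories) {
        Map<String, Map<String, Integer>> buckets = new TreeMap<>();
        // Bucket creation: one empty bucket per filter category.
        for (String category : categories) {
            buckets.put(category, new TreeMap<>());
        }
        // Populating buckets: repeated values only increase the count,
        // so each option appears once per category.
        for (Map<String, String> hit : hits) {
            for (String category : categories) {
                String value = hit.get(category);
                if (value != null) {
                    buckets.get(category).merge(value, 1, Integer::sum);
                }
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Map<String, String>> hits = List.of(
                Map.of("region", "California", "difficulty", "easy"),
                Map.of("region", "California", "difficulty", "hard"),
                Map.of("region", "Oregon"));
        // {difficulty={easy=1, hard=1}, region={California=2, Oregon=1}}
        System.out.println(buildBuckets(hits, List.of("region", "difficulty")));
    }
}
```

The resulting nested map is straightforward to turn into JSON for whichever UI framework renders the filters.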

Code Example 1:

// Assumes Bucket and ContentNode classes exist in your project to hold filter options
// and search hits; they are placeholders, not AEM API types.
public void extractFilters(SearchResult result, List<String> filterCategories, Map<String, Bucket> filterMap) {
    // Create one bucket per filter category identified from the query and content
    for (String category : filterCategories) {
        Bucket bucket = new Bucket(category);
        // (Logic to populate bucket based on predefined rules or taxonomy analysis)
        filterMap.put(category, bucket);
    }

    // Process content within search results
    List<ContentNode> contentNodes = result.getContentNodes();
    for (ContentNode node : contentNodes) {
        // (Logic to analyze content and add relevant information to buckets)
    }

    // Construct filter data structures and handle duplicates (implementation details omitted)
}

Code Example 2: where “1_group.1_property” is a predicate used in the Query

// Inside the method that runs the query; getFacets() can throw RepositoryException
Map<String, Facet> facets = result.getFacets();

if (facets.containsKey("1_group.1_property")) {
    // Each bucket is one distinct value of the faceted property
    facets.get("1_group.1_property").getBuckets().forEach(bucket ->
            getFacetData(bucket, topicData, difficultyLevel, filterData, heading, filterData1, heading1));
}

private void getFacetData(Bucket bucket, List<Tag> topicData, List<Tag> difficultyLevel,
        JSONArray filterData, JSONObject heading, JSONArray filterData1, JSONObject heading1) {
    try {
        // The bucket value is a tag ID; resolve it to a Tag to read its title
        Tag tagName = tagManager.resolve(bucket.getValue());
        // (Logic to sort the tag into the right filter list and build the JSON output)
    } catch (JSONException jsonException) {
        LOG.error("JSON exception when adding to Json Object, {}", jsonException.getMessage());
    }
}

Important Note:

This is a simplified illustration, and the actual implementation will depend on your specific AEM setup, content structure, and desired filter options. The key aspects involve identifying filter categories, populating buckets, and constructing the final filter data structure.

Benefits of Synonyms and Dynamic Filters:

  • More Comprehensive Results: Synonyms prevent you from missing relevant content simply because you didn’t use the exact keyword.
  • Enhanced Search Accuracy: Dynamic filters help you refine your search journey, leading to more precise and focused results.
  • Improved User Experience: By understanding your intent and suggesting relevant filters, the search system empowers you to navigate information efficiently.

Fun Fact!

We can also see the facets that could be extracted for a particular set of results in the Query Builder Debugger tool, by selecting the ‘Extract facets’ checkbox as shown in the video below. Reference: AEM QueryBuilder Demo Part 2, Facets

Spell Check Activation:

Ever typed a search query and unexpectedly ended up with zero results? Typos happen, and they can be frustrating when navigating search systems. Thankfully, AEM’s spell check functionality can come to the rescue!

What is Spell Check Activation?

Spell Check activation is a feature that helps users overcome typos and misspellings in their search queries. When enabled, the search system leverages a spellchecker to analyze the query for potential errors. If a misspelling is detected, the system suggests corrected terms, allowing users to refine their search and find the information they need more efficiently.
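To show what “suggesting a corrected term” means in practice, here is a small plain-Java sketch that picks the closest dictionary word by edit distance. This is not how AEM implements spell check internally; the class name, the `suggest` method, the dictionary, and the distance threshold are assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;

// Illustration only: a toy "did you mean" based on Levenshtein distance.
final class SpellSuggestionSketch {

    // Number of single-character insertions, deletions, or substitutions
    // needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the closest dictionary word within maxDistance edits, if any.
    static Optional<String> suggest(String word, List<String> dictionary, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String candidate : dictionary) {
            int distance = editDistance(word, candidate);
            if (distance < bestDistance) {
                best = candidate;
                bestDistance = distance;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<String> dictionary = List.of("pharmacy", "pharmaceuticals", "hiking");
        System.out.println(suggest("pharmcy", dictionary, 2)); // Optional[pharmacy]
        System.out.println(suggest("xyz", dictionary, 2));     // Optional.empty
    }
}
```

A correctly spelled word has distance zero to itself, so it is returned unchanged when it is in the dictionary.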

Code Example 1:

import java.io.IOException;

import org.apache.lucene.search.spell.SpellChecker;

public class SearchService {

    private final SpellChecker spellChecker;

    public SearchService(SpellChecker spellChecker) {
        this.spellChecker = spellChecker;
    }

    public SearchResult executeSearch(String query, boolean enableSpellcheck) throws IOException {
        String correctedQuery = query;
        if (enableSpellcheck) {
            // suggestSimilar returns an array of up to 5 candidate corrections
            String[] suggestions = spellChecker.suggestSimilar(query, 5);
            if (suggestions.length > 0) {
                // Use the first suggestion as the corrected query
                correctedQuery = suggestions[0];
            }
        }
        // Perform search using the corrected query
        return performSearch(correctedQuery);
    }

    // Implementation for performSearch(...) to execute the search with the corrected query
}

Code Example 2:

import java.io.IOException;
import java.util.StringJoiner;

import javax.jcr.Session;

import com.day.cq.search.suggest.Suggester;

@Inject
private Suggester suggester;

// isSpellCheckEnabled() and getResults(...) are helpers assumed to exist in this class
public String searchWithSpellcheck(ResourceResolver resolver, String fullTextQuery, int limit, int offset) throws IOException {
    String queryToRun = fullTextQuery;
    if (isSpellCheckEnabled()) {
        // The Suggester needs a JCR session; obtain it from the resolver
        Session session = resolver.adaptTo(Session.class);
        queryToRun = spellCheckQuery(session, fullTextQuery);
    }
    return getResults(limit, offset, queryToRun);
}

private String spellCheckQuery(Session session, String fullTextQuery) {
    StringJoiner correctedWords = new StringJoiner(" ");
    // Check each word separately and keep the original when there is no suggestion
    for (String word : fullTextQuery.trim().split("\\s+")) {
        String suggestion = suggester.spellCheck(session, word);
        correctedWords.add(suggestion != null ? suggestion : word);
    }
    return correctedWords.toString();
}

Important Note:

This is a simplified illustration, and the actual implementation will depend on your specific AEM setup, content structure, and existing code.

Congratulations! You’ve reached the final chapter in our exploration of AEM Lucene Search optimization. Throughout this series, we’ve covered a range of techniques to empower users and elevate the overall search experience:

  • Search Suggestion Servlets: We briefly introduced the concept of search suggestion servlets, highlighting their ability to guide users towards relevant content through real-time suggestions.
  • Optimizing Search with AEM Indexing Management: We laid the groundwork by understanding the importance of efficient indexing for accurate and speedy search results.
  • Craft Compelling Search Results: Highlighting Keywords & Controlled Excerpts: We explored how keyword highlighting and excerpts can transform search results into user-friendly snippets, saving users valuable time.
  • Empower User Searches: Synonyms, Filters & Spell Check: We unveiled advanced functionalities like synonyms, filters, and spell check, allowing users to refine their searches with greater precision.

By leveraging these enhancements for AEM Lucene Search, organizations can significantly optimize search functionality and deliver superior user experiences. As AEM developers and administrators, we can tailor search experiences that perfectly align with the unique needs of our digital properties.

Remember, the journey towards an exceptional AEM search experience is a continuous one. AEM Lucene Search remains a powerful tool that can evolve alongside your website, fostering content discoverability, user engagement, and ultimately, a successful digital presence.

Happy searching, and keep exploring the possibilities!
