Solr Anti-patterns — Part2

Published in

Walmart Global Tech Blog

5 min read · Jun 30, 2021

An anti-pattern is a common response to a recurring problem that is usually ineffective and risks being highly counterproductive.

This is a continuation of Solr Anti-patterns — part1.

The common anti-patterns are:

1. Excessive logging

Solr logs are a key way to know what’s happening in the system. There are several ways to adjust the default logging configuration.

Logging everything can hurt the performance of your cluster, because writing every debug/info message takes additional resources and threads.

Solution: Set the log level to WARN or ERROR, and enable slow query logging to capture queries that run longer than a specified threshold.

While debugging an issue, you can change the log level temporarily and revert it once you have captured the logs.
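As a rough sketch of the temporary log-level change, the snippet below builds request URLs for Solr's logging admin endpoint (`/solr/admin/info/logging` with a `set=<logger>:<level>` parameter). The host, port, and the choice of the `root` logger are assumptions for illustration. The code only constructs the URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Assumed base URL of a single Solr node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def log_level_url(logger, level, base=SOLR_BASE):
    """Build the URL that asks one Solr node to change a logger's level."""
    return f"{base}/admin/info/logging?" + urlencode({"set": f"{logger}:{level}"})


# Raise verbosity while reproducing an issue, then revert afterwards.
debug_url = log_level_url("root", "DEBUG")
revert_url = log_level_url("root", "WARN")
print(debug_url)
print(revert_url)
```

The slow query threshold itself is normally set in solrconfig.xml through the `slowQueryThresholdMillis` element. The value to use depends on your latency targets.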

2. Excessive caching

To obtain maximum query performance, Solr stores several different pieces of information using in-memory caches. Result sets, filters, and document fields are all cached so that subsequent, similar searches can be handled quickly.

“Large cache” is just a fancy word for “garbage”. If you let objects accumulate in the caches, the Java Virtual Machine’s garbage collector is eventually going to have to clean it all up. Having lots of garbage increases the duration of garbage collections and hurts your application’s responsiveness.

Solution:

Do not put clauses/criteria that match huge record sets in a filter query. Put them in the main query instead, so that they restrict the overall result count.
Do not put dynamic clauses (ones whose values change from request to request) in filter queries. Their cache entries are never reused, so they only fill up the cache.
Use static clauses that match a stable subset of documents in filter queries, so that repeated queries are served from the cache.
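The difference between static and dynamic filter clauses can be illustrated with a small sketch that assembles query parameters. All field names (`in_stock`, `category`, `updated_at`, `session_id`) are hypothetical; the date-math rounding (`NOW/DAY`) and the `{!cache=false}` local parameter are standard Solr query syntax.

```python
from urllib.parse import urlencode


def build_select_params(main_query, filters):
    """Encode q plus one fq parameter per filter clause."""
    pairs = [("q", main_query)] + [("fq", f) for f in filters]
    return urlencode(pairs)


# Static, reusable filters: each distinct fq clause is cached separately.
static_filters = ["in_stock:true", "category:electronics"]

# A raw NOW resolves to a new timestamp on every request, so the cached
# entry for this clause is effectively never reused.
dynamic_filter = "updated_at:[NOW-7DAYS TO NOW]"
# Rounding to the day keeps the resolved range stable for a whole day.
rounded_filter = "updated_at:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]"
# For a genuinely one-off filter, cache=false skips the filter cache.
uncached_filter = "{!cache=false}session_id:abc123"

params = build_select_params("title:laptop", static_filters + [rounded_filter])
print(params)
```

Keeping each static clause in its own `fq` parameter lets Solr cache and reuse them independently across queries.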

3. Frequent commits

There are two types of commits in Solr:

Hard commit: This is governed by the <autoCommit> option in solrconfig.xml or by explicit calls from a client (SolrJ or HTTP via the browser, cURL, or similar). A hard commit flushes pending updates to disk, closing the current index segment and starting a new one.
Soft commit: A less expensive operation than a hard commit with openSearcher=true that also makes documents visible to search. Soft commits do not truncate the transaction log.

Soft commits are “less expensive”, but they still aren’t free. You should make the soft commit interval as long as is reasonable for best performance!

Solution: Avoid explicit commits from clients and define the commit settings (autoCommit and autoSoftCommit) in solrconfig.xml.
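One way to picture server-side commit settings is the sketch below, which prepares a request body for Solr's Config API `set-property` command. The collection name and all interval values are assumptions chosen for illustration, not recommendations. The same values can instead be written directly into the `<autoCommit>` and `<autoSoftCommit>` sections of solrconfig.xml.

```python
import json

# Hypothetical collection name and intervals; tune them for your workload.
COLLECTION = "products"
config_url = f"http://localhost:8983/solr/{COLLECTION}/config"

commit_settings = {
    "set-property": {
        # Hard commit every 60 s for durability, without opening a searcher.
        "updateHandler.autoCommit.maxTime": 60000,
        "updateHandler.autoCommit.openSearcher": False,
        # Soft commit every 5 min to make new documents searchable.
        "updateHandler.autoSoftCommit.maxTime": 300000,
    }
}

body = json.dumps(commit_settings, indent=2)
print("POST", config_url)
print(body)
```

With intervals owned by the server, indexing clients can simply send documents and never call commit themselves.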

4. Heavy warming queries

A cache-warming query is a pre-configured query (in solrconfig.xml) that gets executed against a new searcher in order to populate the new searcher’s caches.

Having complex or many warming queries can lead to long delays in opening new searchers. Read more about warming searchers here.

Solution:

Use this feature very carefully and only after thorough testing. Solr ships with the warming queries section commented out in solrconfig.xml, and it is best to keep it that way.

Warming queries should be ones that your application uses frequently and that are fast enough not to delay new searchers.

5. Boolean queries

Solr limits the number of clauses in a boolean query (the maxBooleanClauses setting, 1024 by default). You can increase the limit, but performance can suffer because of the large number of terms and the scoring overhead. Please read here for more details.

Solution: Use the Terms Query Parser instead if your boolean query has more than 1000 terms.
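A minimal sketch of the Terms Query Parser syntax follows. It builds a `{!terms f=<field>}` filter from a list of values using the parser's default comma separator. The helper name `terms_filter` and the `id` field are illustrative assumptions.

```python
def terms_filter(field, values):
    """Build a Terms Query Parser filter: {!terms f=<field>}v1,v2,..."""
    cleaned = [str(v) for v in values]
    if any("," in v for v in cleaned):
        raise ValueError("values must not contain the default ',' separator")
    return "{!terms f=" + field + "}" + ",".join(cleaned)


# The boolean equivalent would be id:(1000 OR 1001 OR ...), one clause per value.
ids = range(1000, 3000)
fq = terms_filter("id", ids)
print(fq[:40], "...", len(ids), "terms")
```

Because this parser does not score matches by default, it fits best in an `fq` parameter where only membership matters.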

6. DBQ with nested docs

Avoid DBQ (delete-by-query) if your application uses nested documents.

A nested document can simply be replaced by adding a new document with more or fewer child documents, as the application requires. This is no different from updating any normal document, except that Solr takes care to ensure that all related child documents of the existing version get deleted.

Do not add a root document that has the same ID as a child document. This will violate integrity assumptions that Solr expects.

Solution: To delete a nested document, delete it by the ID of the root document. If you try to use the ID of a child document, nothing will happen, since only root document IDs are considered. If you use Solr’s delete-by-query APIs, you have to be very careful to ensure that no children remain of any documents that are being deleted. Doing otherwise will violate integrity assumptions that Solr expects.
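The two delete styles can be sketched as JSON update bodies. The first deletes whole nested documents by root ID. The second is one cautious form of delete-by-query that targets the `_root_` field, which assumes your schema defines `_root_` so that every document in a block carries its root's ID. The IDs and helper names are hypothetical.

```python
import json


def delete_by_root_ids(root_ids):
    """JSON update body that deletes whole nested documents by root ID."""
    return json.dumps({"delete": list(root_ids)})


def delete_block_by_query(root_id):
    """Delete-by-query body matching a root and every child in its block."""
    # Assumes root_id contains no double quotes, since it is quoted as-is.
    return json.dumps({"delete": {"query": f'_root_:"{root_id}"'}})


print(delete_by_root_ids(["order-1", "order-2"]))
print(delete_block_by_query("order-1"))
```

Either body would be posted to the collection's update handler. A query that matches only parents, or only some children, is exactly the case that leaves orphans behind.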

7. Multi-level pivoting

Solr pivot faceting (also called decision tree faceting or sub-faceting) provides a more detailed view of index data. You can calculate sub-facets of a parent facet, or generate a tree-like structure of the data and display it in the application to support better decisions.

In analytics applications where users want deep insight into the data, you can pivot on more than two or three fields. This does not come free: multi-level pivoting needs more CPU and memory for the calculations.

Solution: Restrict the size of the result set that pivots are computed over, facet/pivot on docValues-enabled fields, and use the JSON Facet API instead of old-style faceting.
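As a sketch of the JSON Facet API alternative, the snippet below builds a nested terms facet with an explicit `limit` at every level, which bounds how many buckets each level computes. The field names (`category`, `brand`), the limits, and the helper name are assumptions for illustration.

```python
import json


def nested_terms_facet(fields, limits):
    """Nest one terms facet per field, each capped by its own limit."""
    facet = None
    # Build from the innermost level outwards.
    for field, limit in reversed(list(zip(fields, limits))):
        level = {"type": "terms", "field": field, "limit": limit}
        if facet is not None:
            level["facet"] = facet
        facet = {f"by_{field}": level}
    return facet


request = {
    "query": "*:*",
    "limit": 0,  # facet counts only, no documents returned
    "facet": nested_terms_facet(["category", "brand"], [10, 5]),
}
print(json.dumps(request, indent=2))
```

Keeping the per-level limits small matters most at the deepest level, because the number of buckets multiplies with each level.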

8. Schemaless mode

Schemaless mode removes the need to design a schema before using Solr. This helps you get started more quickly, but schemaless mode is less efficient and less effective than a deliberately designed schema.

It can end up indexing every field (even non-searchable ones), which can lead to bad performance.

Schemaless mode is actually a guess-field-and-add-to-schema mode. So when Solr sees a new field, it tries to guess the field type and adds it to its schema. This guess is often not optimized for each field.

Solution:

Stick with an explicitly defined schema (non-schemaless mode).

Use dynamic fields/mapping as an alternative.

Dynamic fields allow Solr to index fields not explicitly defined in your schema.
They act as a safety net when you forget to define one or more fields.
They make applications less brittle by giving some flexibility in the documents you can add to Solr.
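A dynamic field rule can be sketched as a Schema API `add-dynamic-field` command, together with a simplified matcher that shows which incoming field names the rule would cover. The `*_s` pattern and the `string` type are illustrative and assume a `string` field type exists in your schema. The matcher is a simplification written for this example, not Solr's own matching code.

```python
import json

# Schema API command that adds one dynamic field rule.
add_rule = {
    "add-dynamic-field": {
        "name": "*_s",
        "type": "string",
        "indexed": True,
        "stored": True,
    }
}


def matches_rule(field_name, pattern):
    """Simplified check: the pattern has a single '*' at its start or end."""
    if pattern.startswith("*"):
        return field_name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return field_name.startswith(pattern[:-1])
    return field_name == pattern


print(json.dumps(add_rule))
print(matches_rule("color_s", "*_s"), matches_rule("price", "*_s"))
```

Unlike schemaless guessing, the field type here is chosen deliberately, so unknown fields are indexed only when they follow a naming convention you control.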

9. No throttling for write-heavy systems

Solr takes on things other databases do not, like secondary indexes, unstructured text searches, facets, pivots, aggregations, streaming queries, lemmatizations, phrase searches, etc. This doesn’t come free.

The indexes providing those capabilities must be maintained and changed as data comes into your cluster. Updates in particular can be an expensive operation, putting additional load on the index maintenance operations. Because of this, Solr can be choked easily if bursty update traffic is not throttled before ingestion.

Solution: Throttle your ingestion by placing a message queue in front of Solr, and avoid large batch sizes.
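The throttling idea can be sketched in-process as a consumer that sends small batches at a capped rate. This is a stand-in for a real message queue consumer, not a replacement for one. The `send` callable, the batch size, and the rate are all hypothetical.

```python
import time


def batches(docs, batch_size):
    """Yield consecutive slices of at most batch_size documents."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


def throttled_ingest(docs, send, batch_size=500, max_batches_per_sec=2.0):
    """Send small batches, sleeping so the rate stays under the cap."""
    min_interval = 1.0 / max_batches_per_sec
    sent = 0
    for batch in batches(docs, batch_size):
        started = time.monotonic()
        send(batch)  # hypothetical callable that posts one batch to Solr
        sent += 1
        elapsed = time.monotonic() - started
        if elapsed < min_interval:
            time.sleep(min_interval - elapsed)
    return sent


# Stand-in sender that only records batch sizes instead of calling Solr.
sizes = []
count = throttled_ingest(list(range(1200)), lambda b: sizes.append(len(b)),
                         batch_size=500, max_batches_per_sec=100.0)
print(count, sizes)
```

In a real pipeline the documents would come from the queue, so bursts accumulate there while Solr sees a steady rate.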

10. Config reload/upload during high ingestion

Avoid config uploads or reloads on a live prod cluster, as replicas can go into recovery (and recovery can take quite some time, depending on index size).

Solution: Stop writes to the collection whose config is being uploaded/reloaded, or do this outside business hours when write traffic is low.

Conclusion

Avoiding these anti-patterns has helped us manage many critical clusters of different sizes and use cases efficiently, by reducing the utilization of resources such as CPU, memory, and I/O. By implementing these solutions, you will not only save on operational costs but also improve cluster performance and stability.


Written by Dinesh Naik