Tips for improving performance with Azure Cognitive Search

Madushi Sarathchandra
5 min read · Apr 24, 2023


What is Azure Cognitive Search?

Azure Cognitive Search is a Microsoft cloud-based search service that lets developers add search capabilities to their applications. Search is an important feature for almost any application, and Azure Cognitive Search provides it as a fully managed, scalable, and secure experience.

Azure Cognitive Search offers various integration options with other Azure services to extend its functionality. Indexers and skillsets are two of the main ones. Indexers automate the process of ingesting data from Azure data sources such as Azure SQL Database, Blob Storage, Cosmos DB, and so on. A skillset is another integration option that lets you incorporate AI capabilities, such as enriching content during indexing.

In this article, I'm going to explain some tips and best practices for boosting the performance of your application with Azure Cognitive Search.

Before improving performance, it is important to know which factors are most likely to affect it. Index schema design, service capacity, and query design are the key factors, and optimizing them helps avoid inefficiencies.

Let's look at some tips and best practices for optimization.

1. Use the search.in function for filters that contain a large number of values.

When you are working with a large data set or complex filtering requirements, it is often better to use the search.in function instead of overloading the filter expression with long chains of comparisons; otherwise, search performance will degrade.

When you use search.in, Azure Cognitive Search can optimize query execution with an algorithm designed to handle large sets of values. This leads to faster query execution times and better overall performance.

In the following example, the filter expression uses eq and or operators to check whether any value of the group_ids collection in each document is equal to one of many possible values.

group_ids/any(g: g eq '123' or g eq '456' or g eq '789')

A more efficient way to execute a filter with a large number of values is to use the search.in function:

group_ids/any(g: search.in(g, '123, 456, 789'))

The search.in function tests whether a given string field or range variable is equal to one of a given list of values. Equality between the variable and each value in the list is determined in a case-sensitive fashion, the same way as for the eq operator.
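
If you are calling the service from application code, the same filter string can be passed through the SDK. Below is a minimal sketch using the azure-search-documents Python SDK; the endpoint, key, index name, and field names are placeholders for illustration, not values from this article.

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Placeholder service details; replace with your own.
client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="products",
    credential=AzureKeyCredential("<query-key>"),
)

# The search.in function is passed as part of the OData filter string.
results = client.search(
    search_text="*",
    filter="group_ids/any(g: search.in(g, '123, 456, 789'))",
)

for doc in results:
    print(doc["id"])  # 'id' is a placeholder key field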

2. Limit the fields being searched at query time using the “searchFields” parameter.

Having a large number of searchable fields increases the workload of the search service, because each additional field adds complexity to the search index and increases the amount of data that must be processed during a query. To mitigate this, limit the fields being searched at query time by specifying only the fields that are relevant to the query in the ‘searchFields’ parameter.

GET https://<your-service>.search.windows.net/indexes/products/docs?search=apple&searchFields=name,description&$orderby=price&api-version=2020-06-30

In the above example, we are searching for the term ‘apple’, but only within the ‘name’ and ‘description’ fields, by specifying them in the ‘searchFields’ parameter. This helps improve performance and reduces the workload on the search service.
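
A roughly equivalent call with the azure-search-documents Python SDK (again a sketch with placeholder service details and field names) uses the search_fields parameter:

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="products",
    credential=AzureKeyCredential("<query-key>"),
)

# Restrict full-text matching to the fields that matter for this query.
results = client.search(
    search_text="apple",
    search_fields=["name", "description"],
    order_by=["price"],
)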

3. Select only the field attributes that are needed.

This is a common mistake developers make when creating a search index: enabling every available attribute on every field instead of only the ones that are needed. For example, if a field doesn’t need to be full-text searchable, don’t mark it as searchable when creating that field. First identify the attributes each field requires and drop the ones it doesn’t. Then choose the right data type and the right analyzer for each field; analyzers are used to tokenize and normalize text data. Designing a proper schema reduces the size of the index and improves search performance.
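
As an illustration, here is a sketch of an index definition with the azure-search-documents Python SDK in which each field enables only the attributes it actually needs. The index name, field names, and analyzer are assumptions made for the example.

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchableField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
)

index = SearchIndex(
    name="products",
    fields=[
        # Key field: no full-text search needed.
        SimpleField(name="id", type=SearchFieldDataType.String, key=True),
        # Full-text searchable fields with an explicit language analyzer.
        SearchableField(name="name", analyzer_name="en.lucene"),
        SearchableField(name="description", analyzer_name="en.lucene"),
        # Used only for filtering and sorting, so it is not searchable.
        SimpleField(name="price", type=SearchFieldDataType.Double,
                    filterable=True, sortable=True),
    ],
)

index_client = SearchIndexClient(
    endpoint="https://<your-service>.search.windows.net",
    credential=AzureKeyCredential("<admin-key>"),
)
index_client.create_or_update_index(index)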

4. Consider alternatives to complex types.

When designing a search index, it’s important to consider alternatives to complex types in order to improve performance. Complex data types are useful when the data has a complicated nested structure, but they require extra storage and additional resources. In some cases you can substitute simpler data types for a complex type and avoid that tradeoff. This simplifies the index structure and reduces the amount of nested query evaluation required to perform the search.
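
For example, if you only ever filter or facet on one sub-field, a flat string collection can stand in for a complex collection. The sketch below (Python SDK, with a hypothetical "addresses" structure invented for the example) shows both options side by side.

from azure.search.documents.indexes.models import (
    ComplexField,
    SearchableField,
    SearchFieldDataType,
    SimpleField,
)

# Complex-type approach: nested sub-fields, extra storage, heavier queries.
addresses_complex = ComplexField(
    name="addresses",
    collection=True,
    fields=[
        SearchableField(name="city"),
        SearchableField(name="country"),
    ],
)

# Flattened alternative: if only the city is filtered or faceted on,
# a plain string collection is cheaper and simpler to query.
address_cities_flat = SimpleField(
    name="address_cities",
    type=SearchFieldDataType.Collection(SearchFieldDataType.String),
    filterable=True,
    facetable=True,
)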

5. Reduce the amount of data being returned.

Retrieving a large amount of content per query slows down the search process and reduces the overall performance of the system. It is recommended to structure queries so that they return only the fields needed to render the results. Once a user selects a specific result, you can use the Lookup API to retrieve the remaining fields.

In Azure Cognitive Search, the Lookup API is a REST endpoint that retrieves a single document from your search index by its key.
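
A minimal sketch of this pattern with the Python SDK (placeholder service details, field names, and document key): return only summary fields in the result list, then fetch the full document by key once it is selected.

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="products",
    credential=AzureKeyCredential("<query-key>"),
)

# Result list: only the fields needed to render a summary.
results = client.search(
    search_text="apple",
    select=["id", "name", "price"],
)

# Lookup: retrieve the full document for the item the user selected.
selected_key = "123"  # placeholder document key
full_document = client.get_document(key=selected_key)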

6. Avoid partial-term searches.

It is generally not recommended to use partial-term searches (prefix, wildcard, and regex queries) in Azure Cognitive Search. They are more computationally expensive than typical keyword searches because the engine may have to scan large parts of the index to produce results.
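
To make the contrast concrete, here is a hedged sketch with the Python SDK (placeholder setup): the regex query is a partial-term search and is far more expensive than the plain keyword query.

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="products",
    credential=AzureKeyCredential("<query-key>"),
)

# Partial-term search: matching term fragments forces the engine to
# examine many terms in the index (full Lucene syntax required).
expensive = client.search(search_text="/.*ppl.*/", query_type="full")

# Whole-term search: resolved directly against the inverted index.
cheap = client.search(search_text="apple")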

7. Add partitions for slow individual queries.

Adding partitions can be a useful technique when individual queries are running slowly. Adding a partition splits your index into smaller, more manageable chunks, with each partition holding a subset of the index data, so a single query can be executed against the partitions in parallel. This reduces query execution time. (Replicas, by contrast, help with high query volume and availability rather than with slow individual queries.)

However, it is important to know that adding partitions has some drawbacks. It adds complexity (and cost) to your service configuration and may not be effective for all types of queries, so you should carefully weigh the improved query performance against the added complexity.

By following these tips and best practices, you can optimize your Azure Cognitive Search performance and provide a better search experience for your users.

Ultimately, the best approach depends on your specific use case and on the size and complexity of your dataset, so it’s important to experiment with different optimization techniques to find what works best for your particular application.
