Configuring RestHighLevelClient for Elastic

Sunny Makhija
Dec 13, 2019 · 6 min read

Elasticsearch is one of the most popular search engines today. We have been working with Elastic for a year now and have used many of its search features, such as fuzzy search, substring search, aggregations, and sorting. Elastic is not limited to search, though: a machine learning module has been added that can be used extensively for analytics.

We have also integrated the Elastic-Fluentd-Kibana stack into our application: all logs are pushed to Elastic by Fluentd via Kafka, and Kibana presents them in aggregated, graphical form, which aids debugging and monitoring and provides actionable business metrics. The following sections explain how we have used Elastic in our use case.

In this blog, you will learn:

  1. How to configure the RestClient to integrate with an Elastic cluster.
  2. How to use the RestClient to perform operations on Elastic.
  3. How to apply basic authentication.
  4. How to use a Sniffer with the RestClient.

Let’s start with how we set up and use Elastic in our microservices.

Our journey with Elastic started a year back, when we began using it for one of our search use cases. We set up our local environment with 5 nodes, of which 3 are master-eligible nodes, 2 are data nodes shared with the master-eligible nodes, and 1 is an ingest node for indexing. Node configuration depends entirely on the application’s needs. I will not dig into setting up the Elastic cluster here, because the prime objective of this blog is to explain how we integrate our Elastic cluster with our microservices and how we execute search queries on it. For installation and node specification, please refer to the link below:

https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html

The next step is to integrate the Elastic cluster with our microservices.

We need a client to do this. The following clients are available:

  1. Spring Data Elasticsearch
  2. RestHighLevelClient

We chose RestHighLevelClient for our microservices because of its loose coupling, which allows us to upgrade Elastic easily at any time. Spring Data Elasticsearch, on the other hand, is tightly coupled to Elastic and, at the time we evaluated it, supported only Elastic version 5.5.0. Please refer to the Spring Data Elasticsearch compatibility matrix for details.

To use the RestHighLevelClient, add the dependency below:

compile("org.elasticsearch.client:elasticsearch-rest-high-level-client:${project.elastic_version}")
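The Sniffer used later in this post ships as a separate artifact, so you will most likely also need the dependency below (the version property is reused from above; please verify the artifact against your Elastic version):

compile("org.elasticsearch.client:elasticsearch-rest-client-sniffer:${project.elastic_version}")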

RestHighLevelClient wraps the low-level REST client and its connection pool, so we need to create only a single RestHighLevelClient instance; it takes care of the low-level connections for you. It picks an Elastic node in round-robin fashion and performs the operation on that node through the low-level client. It is therefore a best practice to create the RestHighLevelClient as a singleton, built once in the application lifetime.
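As a rough illustration of that best practice, here is a minimal sketch of holding the client as a singleton. The ElasticClientProvider class name and the inline host settings are our own assumptions, used only to keep the example self-contained; in our services the client is actually created by the buildClient() method shown in the next snippet.

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

// Sketch only: holds one RestHighLevelClient for the whole application lifetime.
public class ElasticClientProvider {

    private static volatile RestHighLevelClient client;

    public static RestHighLevelClient get() {
        if (client == null) {
            synchronized (ElasticClientProvider.class) {
                if (client == null) {
                    // Illustrative host/port/scheme; in this post the client is built by buildClient()
                    client = new RestHighLevelClient(
                            RestClient.builder(new HttpHost("localhost", 9200, "http")));
                }
            }
        }
        return client;
    }
}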

The following code snippet creates the RestHighLevelClient:

private RestHighLevelClient buildClient() {
    // Listener that triggers a sniff round as soon as a node failure is detected
    SniffOnFailureListener sniffOnFailureListener = new SniffOnFailureListener();
    RestClientBuilder restClientBuilder = restClientBuilder();
    restClientBuilder.setFailureListener(sniffOnFailureListener);
    // Apply basic authentication only when credentials are configured
    if (!StringUtils.isEmpty(user) && !StringUtils.isEmpty(password)) {
        applyAuthentication(restClientBuilder, user, password);
    }
    restHighLevelClient = new RestHighLevelClient(restClientBuilder);
    // Sniff the cluster every 30 seconds to keep the active node list up to date
    Sniffer sniffer = Sniffer.builder(restHighLevelClient.getLowLevelClient())
            .setSniffIntervalMillis(30000)
            .build();
    sniffOnFailureListener.setSniffer(sniffer);
    return restHighLevelClient;
}

In the above snippet, we have used a Sniffer to make our services fault tolerant. The Sniffer periodically sniffs the Elastic cluster, and if any node is down it removes that node from the list of active nodes. After that, the low-level client no longer picks that node for any operation, so requests do not fail. For further details, please check the link below:

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/sniffer.html
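One related point: when the service shuts down, the Sniffer should be closed before the client it wraps. A minimal sketch, assuming the sniffer is kept in a field next to restHighLevelClient; the shutdown() method name is ours:

// Sketch: close the Sniffer first, then the high-level client, on application shutdown.
public void shutdown() {
    try {
        sniffer.close();
        restHighLevelClient.close();
    } catch (IOException e) {
        // log and ignore; the application is going down anyway
    }
}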

Problem: We have faced an intermittent issue with the Sniffer. It is supposed to keep sniffing for active nodes and update the active node list, but in some unknown situation it set the number of active nodes to -1, after which the client could not perform any operation on Elastic and we had to restart our service. So configure the Sniffer carefully.

You may also require basic authentication. It can be added using the following snippet.

public void applyAuthentication(RestClientBuilder restClientBuilder, String user, String password) {
    final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
    credentialsProvider.setCredentials(AuthScope.ANY,
            new UsernamePasswordCredentials(user, password));
    // Attach the credentials to every request made by the underlying HTTP client
    restClientBuilder.setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
        @Override
        public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
            return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
        }
    });
}

Use the snippet below to create the RestClientBuilder instance:

return RestClient.builder(hostsArr)
        .setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
                .setConnectTimeout(connectionTimeout)
                .setSocketTimeout(socketTimeout))
        .setMaxRetryTimeoutMillis(retryTimeout);

Please adjust the timeouts according to your needs. The RestHighLevelClient setup is now done, and we are ready to perform operations on the cluster.
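The snippet above assumes a hostsArr array of HttpHost instances plus a few timeout fields read from configuration. A hedged sketch of how that array could be built from a comma-separated host property (the property format, port, and scheme are our assumptions; uses java.util.Arrays and org.apache.http.HttpHost):

// Sketch: turn "es-node1,es-node2" into the HttpHost[] expected by RestClient.builder(...)
private HttpHost[] buildHosts(String commaSeparatedHosts, int port, String scheme) {
    return Arrays.stream(commaSeparatedHosts.split(","))
            .map(String::trim)
            .map(host -> new HttpHost(host, port, scheme))
            .toArray(HttpHost[]::new);
}

Also note that setMaxRetryTimeoutMillis was deprecated and later removed in newer client versions, so whether it is available depends on the client version you use.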

In the next section, we will learn how to operate on elastic using this client.

Before I explain how to perform a search operation on Elastic, let me first introduce indices and mappings.

Index: An index can be thought of as an optimized collection of documents and each document is a collection of fields, which are the key-value pairs that contain your data. By default, Elasticsearch indexes all data in every field and each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices, and numeric and geo fields are stored in BKD trees.

Let’s understand it through a sample index definition:

Index Setting:

{
  "number_of_shards": 1,
  "number_of_replicas": 1,
  "analysis": {
    "normalizer": {
      "lowercase_normalizer": {
        "type": "custom",
        "filter": ["lowercase", "asciifolding"]
      }
    },
    "analyzer": {
      "tokenized_lowercase_analyzer": {
        "type": "custom",
        "tokenizer": "whitespace",
        "filter": ["lowercase", "asciifolding"]
      }
    }
  }
}

In the index settings, the following properties are important to take care of when creating an index:

number_of_shards: defines the number of primary shards.

number_of_replicas: defines the number of replica shards, which are used when a primary shard fails.

In our local dev environment, we use only 1 primary shard with 0 replica shards.

Normalizer & Analyzer: These are field-level properties that are referenced by fields in the mapping and used by different search queries. For example, "lowercase_normalizer" is used to perform case-insensitive search, and "tokenized_lowercase_analyzer" is used for case-insensitive, tokenized search: if you have indexed "I am elastic", it lets you search on each word, such as "am" or "elastic".

Please note that a normalizer works with the keyword datatype and an analyzer works with the text datatype.

A mapping defines the schema of an index: it is a collection of fields with their types and other properties. For example:

{
  "stud_doc": {
    "properties": {
      "aadhar_no": { "type": "keyword" },
      "Name": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "registration_no": { "type": "long" },
      "branch_code": { "type": "keyword", "normalizer": "lowercase_normalizer" },
      "branch_name": { "type": "keyword", "normalizer": "lowercase_normalizer" }
    }
  }
}

In the above mapping, we have defined a few fields with their types and normalizers. Generally, use "keyword" where you want an exact match, use "text" for substring matches, fuzzy queries, and wildcard queries, and use a normalizer such as "lowercase_normalizer" for case-insensitive search. There are many built-in normalizers and analyzers to solve specific search needs.
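To tie the settings and mapping back to the client, here is a hedged sketch of creating such an index programmatically. The index name "student", the abbreviated JSON strings, and the surrounding method are our assumptions; the CreateIndexRequest API shown is the 6.x flavour that still takes a mapping type, matching the rest of this post.

// Sketch: create a "student" index with a trimmed-down version of the settings and mapping above.
private void createStudentIndex(RestHighLevelClient client) throws IOException {
    String settings = "{ \"number_of_shards\": 1, \"number_of_replicas\": 1, "
            + "\"analysis\": { \"normalizer\": { \"lowercase_normalizer\": "
            + "{ \"type\": \"custom\", \"filter\": [\"lowercase\", \"asciifolding\"] } } } }";
    String mapping = "{ \"properties\": { \"branch_code\": "
            + "{ \"type\": \"keyword\", \"normalizer\": \"lowercase_normalizer\" } } }";

    CreateIndexRequest request = new CreateIndexRequest("student");
    request.settings(settings, XContentType.JSON);
    request.mapping("stud_doc", mapping, XContentType.JSON); // "stud_doc" is the mapping type used above
    client.indices().create(request, RequestOptions.DEFAULT);
}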

After that, we indexed test data into this index; for now, I will not dig into it further. You can use Elastic bulk indexing to index data in the local environment, but not for production.
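Since bulk indexing was just mentioned, here is a hedged sketch of pushing a couple of test documents with a BulkRequest; the document values are made up, while the index and type names come from the mapping above.

// Sketch: bulk-index two illustrative documents into the "student" index.
private void indexTestData(RestHighLevelClient client) throws IOException {
    BulkRequest bulkRequest = new BulkRequest();
    bulkRequest.add(new IndexRequest("student", "stud_doc", "1")
            .source("{\"Name\":\"Asha\",\"registration_no\":101,\"branch_code\":\"cs01\"}", XContentType.JSON));
    bulkRequest.add(new IndexRequest("student", "stud_doc", "2")
            .source("{\"Name\":\"Ravi\",\"registration_no\":102,\"branch_code\":\"ec02\"}", XContentType.JSON));

    BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);
    if (bulkResponse.hasFailures()) {
        // inspect bulkResponse.buildFailureMessage() and retry or log as needed
    }
}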

Summarizing the work done so far: we have set up an Elastic cluster and a RestHighLevelClient, and we have indexed some sample data into the index. We are now ready to perform a search operation on Elastic:

  1. Create a SearchSourceBuilder instance using a QueryBuilder instance.
  2. Using this SearchSourceBuilder instance, create a SearchRequest instance; you are then ready to run the operation on Elastic with your client.

Please refer to the code snippet below to create a SearchRequest:

QueryBuilder qb = QueryBuilders.matchAllQuery(); // fetch all documents from the index
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(qb);
SearchRequest searchRequest = new SearchRequest(index);
searchRequest.types(indexType);
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = null;
try {
    searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
} catch (IOException e) {
    // handle or log the failure
}

In the above code snippet, we created a SearchRequest and passed the index and mapping (type) names to it; index and mapping were already explained in the section above.
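To actually read the results, iterate over the hits in the SearchResponse. A short sketch (the branch_code field comes from the sample mapping; the rest is standard client API):

// Sketch: read each hit's id and source from the response.
if (searchResponse != null) {
    for (SearchHit hit : searchResponse.getHits()) {
        Map<String, Object> source = hit.getSourceAsMap();
        System.out.println(hit.getId() + " -> " + source.get("branch_code"));
    }
}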

Wrapping up

In this post, we configured a RestHighLevelClient with a Sniffer and basic authentication, created an index with custom settings and mappings, indexed some sample data, and ran a search query against the cluster using the client.
