Hi Jose,
Anagha Joshi

Hey Anagha,

  1. In this particular scenario I’m not paginating the results, since I’m displaying a fixed list of 10 candidates, but Elasticsearch supports pagination through the from and size params. For example, from: 30, size: 10 skips the first 30 hits and returns the next 10, which is effectively the 4th page. More on this in the Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/5.1/search-request-from-size.html
  2. Again, in this scenario I’m returning the whole search document, but it should be possible to retrieve only the fields you need by specifying the _source param in the top_hits aggregation. An example is shown here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html#_example (note how _source limits each hit to just the few fields you care about).
  3. We haven’t experienced a performance hit with this approach, but our results are roughly 70% unique emails, with the rest being duplicates. So at least for us it’s been working well, but if your emails are close to 100% unique most of the time, you should probably run a benchmark to measure the performance impact.
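
To make points 1 and 2 a bit more concrete, here is a rough sketch in Python that only builds the request bodies as plain dicts, without calling a cluster. The helper names (page_params, dedup_request), the aggregation names (by_email, best_hit) and the email field are made up for illustration and are not taken from my actual code, so adapt them to your own mapping.

```python
def page_params(page, page_size=10):
    """Translate a 1-based page number into Elasticsearch from/size values."""
    if page < 1:
        raise ValueError("page must be >= 1")
    return {"from": (page - 1) * page_size, "size": page_size}


def dedup_request(query, source_fields, bucket_field="email", buckets=10):
    """Build a search body that groups hits by a field and keeps one hit per group.

    bucket_field and the aggregation names are hypothetical placeholders.
    """
    return {
        "query": query,
        "size": 0,  # skip the regular hits; only the aggregation is needed
        "aggs": {
            "by_email": {
                "terms": {"field": bucket_field, "size": buckets},
                "aggs": {
                    "best_hit": {
                        # _source filtering: return only the listed fields
                        "top_hits": {
                            "size": 1,
                            "_source": {"includes": source_fields},
                        }
                    }
                },
            }
        },
    }


fourth_page = page_params(4)  # {"from": 30, "size": 10}
body = dedup_request({"match_all": {}}, ["name", "email"])
```

Keep in mind that the top-level from and size page through regular hits, not through aggregation buckets, so paginating the deduplicated list itself would need a different approach.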

