Benchmarking Performance: Elasticsearch vs Competitors

Gigasearch Engineering
Published in gigasearch
3 min read, Aug 16, 2021

There are some exciting new players in the open-source search engine field. We decided to look closely at a few of them to find out how they stack up against Elasticsearch, both in feature set and in performance.

Candidates: Elasticsearch, RediSearch, PostgreSQL, Typesense and Meilisearch.

Features

Known limitations

Elasticsearch

  • Becomes unstable above ~1000 indices (or 20k shards) per cluster

Typesense

  • Storage size limited by available RAM

Source: https://typesense.org/typesense-vs-algolia-vs-elasticsearch-vs-meilisearch/

Meilisearch

  • The maximum number of terms taken into account for each search query is 10
  • Maximum database size is 100GiB (can be changed per instance)
  • Up to 200 indexes
  • Maximum of 1000 words per field

Source: https://docs.meilisearch.com/reference/features/known_limitations.html#design-limitations

Benchmark

Dataset

Name: enwiki-20210720-abstract.xml

Description and Source: Date: July 20, 2021

Docs: 6.3M

XML size: 6.0 GB
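At 6.0 GB, the dump is easier to stream than to load into memory in one go. Below is a minimal sketch of a streaming reader, not the indexer used for this benchmark. It assumes the usual layout of Wikipedia abstract dumps (a `feed` root with `doc` elements containing `title`, `url` and `abstract`); the function name `iter_abstract_docs` and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_abstract_docs(source):
    """Yield one dict per <doc> element, parsing the input incrementally."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "doc":
            continue
        yield {
            "title": (elem.findtext("title") or "").strip(),
            "url": (elem.findtext("url") or "").strip(),
            "abstract": (elem.findtext("abstract") or "").strip(),
        }
        # Drop the element's text and children once the caller is done with it.
        elem.clear()


# Tiny in-memory stand-in for the real dump (assumed structure).
SAMPLE = b"""<feed>
<doc><title>Wikipedia: Anarchism</title><url>https://en.wikipedia.org/wiki/Anarchism</url><abstract>Anarchism is a political philosophy.</abstract></doc>
<doc><title>Wikipedia: Albedo</title><url>https://en.wikipedia.org/wiki/Albedo</url><abstract>Albedo is a measure of reflection.</abstract></doc>
</feed>"""

docs = list(iter_abstract_docs(io.BytesIO(SAMPLE)))
```

For the real file, passing the path (or an open binary file object) instead of the `BytesIO` wrapper should work the same way.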

Query words are chosen randomly from the 1000 most popular English words dataset.
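A rough sketch of how such queries could be drawn from a word list follows. It is an illustration under assumptions, not the generator used for this benchmark: `SAMPLE_WORDS` is a stand-in for the 1000-word dataset, and `make_typo` and `make_queries` are hypothetical helpers. Exact phrase and AND queries would reuse the same three words with backend-specific syntax.

```python
import random

# Stand-in for the "1000 most popular English words" dataset (assumption).
SAMPLE_WORDS = ["water", "people", "house", "number", "little", "through"]


def make_typo(word):
    """Swap the last pair of adjacent, differing letters, e.g. 'water' -> 'watre'."""
    for i in range(len(word) - 2, -1, -1):
        if word[i] != word[i + 1]:
            return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    return word + word[-1:]  # fallback for words made of a single repeated letter


def make_queries(words, rng):
    """Build one query of each basic type from randomly chosen words."""
    one = rng.choice(words)
    three = rng.sample(words, 3)
    return {
        "one_word": one,
        "three_words": " ".join(three),
        "prefix": one[: max(1, len(one) // 2)],
        "typo": make_typo(one),
    }


# A seeded generator keeps the query set reproducible across backends.
queries = make_queries(SAMPLE_WORDS, random.Random(42))
```

Seeding the generator means every backend sees the same sequence of queries, which keeps the comparison fair.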

Environment

2x General Purpose / 32 GB / 8 vCPUs DigitalOcean droplets (one for load generation + one for storage).

Results

Indexing time

For indexing, we counted only the time our indexer spent in requests to the search backend. Elasticsearch, PostgreSQL and Typesense show very similar performance here, while RediSearch is ~2x slower. That result contradicts the RedisLabs benchmark results, so our setup may be suboptimal. Meilisearch, on the other hand, really shines, indexing almost 7 times faster than the others.
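Counting only the time spent in backend requests can be sketched as below. This is a hedged illustration rather than the actual indexer: `RequestTimer`, `batches` and `index_all` are hypothetical names, the batch size is arbitrary, and `send` stands for whatever function posts a batch to a given backend.

```python
import time


class RequestTimer:
    """Accumulates wall-clock time spent inside backend calls only."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.total = 0.0
        self.requests = 0

    def timed(self, send, batch):
        start = self.clock()
        try:
            return send(batch)
        finally:
            # Parsing and batching happen outside this window, so they are not counted.
            self.total += self.clock() - start
            self.requests += 1


def batches(docs, size):
    """Group an iterable of documents into lists of at most `size` items."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_all(docs, send, batch_size=1000, timer=None):
    """Send all documents in batches and return the timer with the totals."""
    if timer is None:
        timer = RequestTimer()
    for batch in batches(docs, batch_size):
        timer.timed(send, batch)
    return timer


# Usage with a fake backend that accepts a batch and does nothing.
result = index_all(range(2500), send=lambda batch: None, batch_size=1000)
```

Here `result.total` holds the seconds spent in `send` calls and `result.requests` the number of batches sent.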

Query latency

Again, RediSearch is the slow outlier for all queries, and again RedisLabs got different results. Another surprising outlier is the “three-word” query on Typesense, which for some reason takes an enormous amount of time on average. Meilisearch displayed pretty solid performance, especially for prefix and typo queries.

We recorded zeroes for unsupported query types. Note, though, that RediSearch's timings for the “exact phrase” and “three word AND” queries are real measurements that came in under 1 ms (!).
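Because a zero placeholder and a genuine sub-millisecond result can look alike, one option is to keep them apart in the raw data. The sketch below is a variation on the approach described above, not what this benchmark did: it stores `None` for unsupported query types and only turns it into a label when formatting. The helper names are hypothetical.

```python
import statistics
import time


def mean_latency_ms(run_query, queries, supported=True, clock=time.perf_counter):
    """Return the mean latency in milliseconds, or None if the type is unsupported."""
    if not supported:
        return None
    samples = []
    for query in queries:
        start = clock()
        run_query(query)
        samples.append((clock() - start) * 1000.0)
    return statistics.mean(samples)  # raises StatisticsError if `queries` is empty


def format_cell(value):
    """Render a result so that 'unsupported' cannot be mistaken for a fast query."""
    return "unsupported" if value is None else f"{value:.2f} ms"


# Usage with a no-op query function standing in for a real backend call.
measured = mean_latency_ms(lambda q: None, ["water", "house"])
skipped = mean_latency_ms(lambda q: None, ["water"], supported=False)
```

With this split, a chart can still plot unsupported types as zero while the raw numbers stay unambiguous.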

Raw numbers

Takeaways

  • Elasticsearch is still the king, offering solid performance for indexing and all types of queries.
  • RediSearch has so-so indexing performance, and its documentation is subpar because RedisLabs tries hard to upsell its cloud solution. Still, it can deliver sub-millisecond latency for some types of queries.
  • PostgreSQL has a weird latency spike on simple one-word queries, and its interface is quite complex, though it might be a decent solution if you already have a Postgres database.
  • Typesense has a good feature set and generally good performance, but with a strange spike on multi-word queries.
  • Meilisearch has absolutely awesome performance (both indexing and querying), though at a cost: feature set and data set size are limited, and these are hard limits since there is no distributed option available.

Gigasearch is a team of Elasticsearch consultants and engineers with experience deploying and tuning petabyte-scale clusters. Contact us today!
