Solr: The alternative to Elasticsearch you should know about

Ayush Yadav
Jan 6, 2024


Solr is an open-source search platform built on Apache Lucene. It's similar to Elasticsearch. What is Lucene, you ask?

Lucene is a Java library that provides indexing and searching capabilities: you index your data, then search on it.

The fundamentals of an Information Retrieval (IR) system:

1. Give it information (indexing)

2. Search that information (querying)

You may have studied Information Retrieval as a subject. IR is the foundation search engines are built on, so Lucene is an important technology to know as a software engineer. Lucene also powers Elasticsearch.

How does IR work?

When you index your data, an inverted index is generated from the terms in it: a mapping from each term to the documents that contain it. When you search for a string, the engine never scans all the docs; it looks the terms up in the generated index. In Solr, such an index (together with its configuration) is called a core.
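To make the idea concrete, here is a minimal sketch of an inverted index in plain Python. It is only an illustration, not how Lucene implements it: the sample documents and the function names are made up, and the "analysis" is just lowercasing and splitting on whitespace (so "cats" and "cat" stay different terms).

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased term to the set of doc ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, term):
    """Look the term up in the index instead of scanning every document."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical sample documents, keyed by doc id
docs = {
    1: "the cat sat on the mat",
    2: "the dog chased the cat",
    3: "dogs and cats are pets",
}

index = build_inverted_index(docs)
print(search(index, "cat"))  # [1, 2]
print(search(index, "dog"))  # [2]
```

The search cost depends on the index lookup, not on how many documents there are, which is why search engines index first and query later.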

As I said before, Lucene is Java-based. Solr is a web platform running on a Jetty server, which itself runs inside the Java VM (JVM). We index and search data via HTTP requests.
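As a rough sketch of what those HTTP requests look like, the snippet below only builds them with Python's standard library and sends nothing. It assumes Solr is reachable on localhost at the default port 8983 and that a core named "mycore" exists; the core name and the sample document are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

SOLR_URL = "http://localhost:8983/solr"  # assumption: local Solr on the default port
CORE = "mycore"                          # hypothetical core name

def select_url(query, rows=10):
    """Build the URL for a search (GET) against the core's /select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_URL}/{CORE}/select?{params}"

def update_request(docs):
    """Build a POST request that sends JSON documents to the /update handler."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        f"{SOLR_URL}/{CORE}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = update_request([{"id": "1", "name": "Cat"}])
print(req.get_method(), req.full_url)
print(select_url("name:Cat"))
```

With a Solr server actually running, the request object and the URL could each be passed to urllib.request.urlopen to index the document and then search for it.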

Running Solr:

It's pretty simple: you can either download and install Solr, or just run the official Docker image:

docker run -p 8983:8983 solr:latest

There are several client libraries for talking to the HTTP server, e.g. pysolr for Python.

Schema:

Solr introduces the concept of a schema, which lets you define what your index should look like (Lucene itself has no such thing): for example, which fields it should have, such as name, class, etc. Later you can run queries against particular fields.
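One way to define fields is Solr's Schema API, which accepts JSON commands such as add-field. The sketch below only builds those commands for the name and class fields mentioned above; the helper function is hypothetical, and the field types "text_general" and "string" are assumed to be available, as they are in Solr's default configset.

```python
import json

def add_field_command(name, field_type, stored=True, indexed=True):
    """Build a Schema API 'add-field' command for one field."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,  # assumption: this field type exists in the schema
            "stored": stored,    # keep the original value so it can be returned
            "indexed": indexed,  # make the field searchable
        }
    }

commands = [
    add_field_command("name", "text_general"),  # tokenized full-text field
    add_field_command("class", "string"),       # exact-match field
]

for command in commands:
    print(json.dumps(command))
```

Each command would be sent as the JSON body of a POST to the core's /schema endpoint.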

Searching in Solr:

There are various ways to search.

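Fundamentally, we can do Boolean retrieval by using the AND and OR operators between terms. We can also do wildcard queries, fuzzy search, etc. E.g. Name:(Cat OR Dog) will get us all docs whose Name contains Cat or Dog. The parentheses matter: without them, only Cat is searched in the Name field and Dog falls back to the default field.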
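Boolean retrieval maps naturally onto set operations over an inverted index: OR is a union of the matching doc-id sets, AND is an intersection. The sketch below uses a tiny hand-written index with made-up doc ids; it illustrates the idea and is not Solr's actual query execution.

```python
# Hypothetical inverted index: term -> set of doc ids containing it
index = {
    "cat": {1, 2},
    "dog": {2, 3},
    "fish": {4},
}

def boolean_or(index, *terms):
    """Docs containing at least one of the terms (set union)."""
    result = set()
    for term in terms:
        result |= index.get(term, set())
    return sorted(result)

def boolean_and(index, *terms):
    """Docs containing every one of the terms (set intersection)."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

print(boolean_or(index, "cat", "dog"))   # [1, 2, 3]
print(boolean_and(index, "cat", "dog"))  # [2]
```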

Learning resources (YouTube & docs):

1. Getting started

2. Solr vs Elasticsearch

3. Full Playlist

4. Docs

Thanks for reading!!
