OpenSearch With Rails

Swapnil Kant
Published in Javarevisited
6 min read · Jun 24, 2024

Introduction

Hello folks! While working with OpenSearch, I learned about its architecture and how it manages nodes, indexes, and document indexing behind the scenes to deliver fast and accurate search results. On a project where I had to integrate OpenSearch with Rails, I worked through some interesting configuration code to set up OpenSearch in my Rails application. In this article, I share my experience of using OpenSearch with Rails.

What is OpenSearch?

The first question that comes to mind on hearing about OpenSearch is

“Is OpenSearch a database?”

Well, OpenSearch can loosely be described as a database, but to be more specific, it is a distributed search engine.

It lets you search and analyse almost any data once you have ingested it into OpenSearch. The term distributed means that it is capable of running across multiple nodes.

History of OpenSearch

In 2015, Amazon took advantage of Elasticsearch’s open-source licence to launch Amazon Elasticsearch Service, a cloud-based managed service that allows users to launch scalable Elasticsearch clusters, connect data sources to cluster endpoints, and load, process, analyse, or visualise data in the cloud.

But Elastic N.V., the company behind Elasticsearch, eventually objected to Amazon’s use of its products and trademarks and filed a suit against the tech giant. Two developments followed:

  1. In January 2021, Elastic N.V. announced that Elasticsearch would be licensed under the Server Side Public License (SSPL), which prevents Amazon and other organisations from providing Elasticsearch as a service without directly collaborating with Elastic.
  2. In April 2021, Amazon announced OpenSearch, a new open-source distributed search engine forked from the open-source version of Elasticsearch.

OpenSearch Components

OpenSearch has various components that work together to make it fast and cost-efficient. Let’s discuss some of its main components:

  1. Documents in OpenSearch are the units that store your data, and they store it in JSON format. Each document represents a single record made up of fields and values. In simple words, documents in OpenSearch are similar in structure to the documents in a MongoDB database.
  2. Indices in OpenSearch are collections of documents. In simple words, an index can be compared to a database table, with the documents playing the role of rows in that table.
  3. Clusters in OpenSearch are collections of nodes. Running across multiple nodes is what makes OpenSearch a distributed search engine, and the cluster as a whole carries out the search and analysis operations on your data.
  4. Each cluster has a Cluster Manager Node, which is responsible for cluster-wide operations such as creating and deleting the indices that hold your data. The documents within these indices are stored in shards placed on the nodes, and the node that receives a request combines the data from multiple nodes and sends it back as a single response.
  5. Shards in OpenSearch are portions of an index. OpenSearch splits each index into multiple shards, which can be placed on different nodes.
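
To make the first two components concrete, here is a minimal Ruby sketch of what a document looks like as JSON. The flight fields and values are hypothetical, chosen only for illustration, and the `flights_index` hash is just a local stand-in for an index, not an OpenSearch API.

```ruby
require 'json'

# A hypothetical flight document: a single record made up of fields and values.
flight = {
  flight_number: 'AI-202',
  origin: 'DEL',
  destination: 'BOM',
  seats_available: 42
}

# OpenSearch stores and returns documents as JSON.
document_json = JSON.generate(flight)
puts document_json

# An index is a named collection of such documents, each addressed by an ID.
# This plain hash only models that idea locally.
flights_index = { '1' => flight }
puts flights_index.keys.inspect
```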

How Does Sharding Help?

Suppose you have an 800 GB index in your OpenSearch cluster. Serving every request for that index from a single node is not feasible: it hurts the system’s performance and scalability and results in delayed responses. Dividing the index into shards of, say, 80 GB each (ten shards in total) spreads the work across nodes and makes the system faster and more efficient. In this way, shards contribute to the scalability and maintainability of the system.

With sharding, a cluster ends up holding multiple shards of each index, spread across its nodes.
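
The arithmetic above can be sketched in Ruby as follows. The helper names `shard_count` and `index_settings` are my own, introduced for illustration; `number_of_shards` is the OpenSearch index setting that controls how many primary shards an index gets when it is created.

```ruby
# Number of primary shards needed so that no shard exceeds the target size.
def shard_count(index_size_gb, target_shard_size_gb)
  (index_size_gb.to_f / target_shard_size_gb).ceil
end

# Request body for creating an index with an explicit shard count.
def index_settings(index_size_gb:, target_shard_size_gb:)
  {
    settings: {
      index: {
        number_of_shards: shard_count(index_size_gb, target_shard_size_gb)
      }
    }
  }
end

# The 800 GB index from the example, split into 80 GB shards.
body = index_settings(index_size_gb: 800, target_shard_size_gb: 80)
puts body[:settings][:index][:number_of_shards] # prints 10
```

With the opensearch-ruby client, a body like this could be passed as `client.indices.create(index: 'flights', body: body)`, where `flights` is an assumed index name.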

Why OpenSearch?

OpenSearch, particularly when run as a managed service on AWS, is known for three main features:

  1. Scalability is a system’s capability to increase or decrease its capacity or performance in response to changes in the application’s and the system’s processing demands.
    In the sharding example above, we divided an 800 GB index into multiple shards of 80 GB each, which improved the system’s performance, made it scalable, and reduced the cost compared with computing everything on a single node.
  2. Cost efficiency comes from AWS’s pay-as-you-go pricing: users pay only for the services they use, whether that is a specific amount of storage, the number of EC2 instances running, or the amount of data transferred. This can be more cost-efficient than comparable providers and lets users manage their cost and usage.
  3. Security is supported by the AWS IAM (Identity and Access Management) service, which lets an admin set up rules and permissions for accessing AWS services. These permissions can be assigned to a group containing several users or to a single user.

Setting Up OpenSearch

For setting up OpenSearch you can follow the link here

Configuration and Connection

To test OpenSearch with Rails, I created a basic flight app, a simple CRUD app on Rails, and used it to show how to perform basic CRUD operations with OpenSearch.

Add the following gems to your Gemfile to integrate the OpenSearch service into your application:

# Opensearch client for ruby
gem 'opensearch-ruby'
gem 'opensearch-aws-sigv4'

To connect OpenSearch to your Rails application, you can refer to the code I used in my application (see the method open_search_client).

openseatch_helper.rb
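
The helper's code is not reproduced here, so what follows is only a minimal sketch of what a method like open_search_client could look like with the two gems above, not the original implementation. The environment variable names and the `open_search_settings` helper are assumptions made for this example; `Aws::Sigv4::Signer` and `OpenSearch::Aws::Sigv4Client` come from the aws-sigv4 and opensearch-aws-sigv4 gems.

```ruby
# Reads connection settings from environment variables.
# The variable names are assumptions made for this sketch.
def open_search_settings(env = ENV)
  {
    host: env.fetch('OPENSEARCH_HOST'),
    region: env.fetch('AWS_REGION'),
    access_key_id: env.fetch('AWS_ACCESS_KEY_ID'),
    secret_access_key: env.fetch('AWS_SECRET_ACCESS_KEY')
  }
end

# Builds a SigV4-signed client for an Amazon OpenSearch Service domain.
def open_search_client(settings = open_search_settings)
  # Required here so the settings helper above works without the gems loaded.
  require 'opensearch-aws-sigv4'
  require 'aws-sigv4'

  signer = Aws::Sigv4::Signer.new(
    service: 'es', # service name used when signing requests to managed domains
    region: settings[:region],
    access_key_id: settings[:access_key_id],
    secret_access_key: settings[:secret_access_key]
  )

  OpenSearch::Aws::Sigv4Client.new({ host: settings[:host] }, signer)
end
```

Using `fetch` rather than `[]` means a missing variable raises a `KeyError` at startup instead of producing a client with a `nil` host.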

followed by your environment variables in the .env file:

.env file
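
Once a client is available, the basic CRUD operations of the flight app could be wrapped along these lines. This is a sketch under assumptions: the `FlightStore` class and the `flights` index name are hypothetical, and `client` is expected to be an opensearch-ruby client such as the one returned by open_search_client. The `index`, `get`, `update`, `delete`, and `search` calls are methods of that client.

```ruby
# Thin wrapper around the document APIs for a hypothetical "flights" index.
class FlightStore
  INDEX = 'flights'

  def initialize(client)
    @client = client
  end

  # Create (or overwrite) a document under the given ID.
  def create(id, attributes)
    @client.index(index: INDEX, id: id, body: attributes)
  end

  # Read a document by ID.
  def find(id)
    @client.get(index: INDEX, id: id)
  end

  # Partial update: only the fields in `changes` are modified.
  def update(id, changes)
    @client.update(index: INDEX, id: id, body: { doc: changes })
  end

  # Delete a document by ID.
  def destroy(id)
    @client.delete(index: INDEX, id: id)
  end

  # Full-text match on a single field.
  def search(field, value)
    @client.search(index: INDEX, body: { query: { match: { field => value } } })
  end
end
```

For example, `FlightStore.new(open_search_client).create(1, { origin: 'DEL', destination: 'BOM' })` would index one flight document.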

OpenSearch Vs ElasticSearch

  • High Performance: According to benchmarks published by Elastic, the Elasticsearch engine is 40–140% faster than OpenSearch while consuming fewer compute resources.
  • Pricing: Elastic Cloud pricing starts at $95+/month for a Standard subscription, while AWS customers can start using OpenSearch Service for free if they remain under the AWS Free Tier usage limits.
  • Community Support: GitHub reveals that the Elasticsearch codebase has had a greater number of commits over the past year compared to OpenSearch.

Keep learning and keep growing and also keep exploring more!

All the very best!

For more exciting and informative articles and tips, follow me on Medium and LinkedIn.

Hi, I am Swapnil Kant, an avid programmer, and a full-time learner! One who is highly interested in Algorithm Optimization and Development