Hands-On Elasticsearch

About Me: Application Architect at Oildex, a service of Transzap, Inc.

Many of us have seen companies struggle with searching, sorting, aggregation, and performance on both relational and NoSQL databases. Elasticsearch comes to the rescue in those scenarios. You have probably heard of the ELK stack (Elasticsearch, Logstash, and Kibana) around the industry for a while. Elasticsearch is the heart of the Elastic company's product portfolio. One of the best things for developers is that Elastic has built excellent documentation that anyone can easily digest.

Components of the Elastic Stack

  • Elasticsearch: Distributed, fast, Highly scalable Document data store
Source: https://www.elastic.co/products/elasticsearch
  • Kibana: Node.js based front-end application for visualization
  • Logstash: A tool to collect, parse, and store data from and to a variety of sources. In other words, it covers the Extract, Transform, and Load (ETL) fundamentals.
  • Beats: Lightweight utilities that read data from different sources in predefined formats
  • ES-Hadoop
  • Plugins (X-Pack)
Source: https://www.elastic.co/products/x-pack

You can also create custom plugins.


What Is Elasticsearch? Features and Benefits:

  • Built using Apache Lucene
  • Open Source
  • RESTful (Simple Rest API)
  • Schema-Free
  • JSON
  • Scalability
  • High Availability
  • Near real-time distributed search and analytics engine
  • Multi-tenancy
  • Per operation persistence
  • Good Documentation
  • Every field is indexed
  • Distributed document store
  • Analytics API
  • Near real-time distributed indexing
  • Native Full-Text search/Google Like search
  • Aggregation
  • Basic stats: min, max, sum, avg, std dev, term counts
  • Significant terms, Percentiles, Cardinality estimations
  • Huge community
  • Connectors to other technologies
  • Sharding and Replication
  • Automatic discovery of nodes within cluster and electing master node
  • Extensibility: Plugins and scripts
  • Snapshot and Restore module

Elasticsearch Use Cases

  • Application Search
  • Business Analysis
  • Enterprise Search
  • Metric Analysis
  • Operational Log Analytics
  • Security Analytics
  • Much More …

Elasticsearch Terminology

  • Index (Database)
  • Type (Table)
  • Document (Row)
  • Fields (Column)
  • Mapping (Schema)
  • Analysis
  • Cluster
  • Node
  • Id
  • Primary shard
  • Replica shard
  • Routing
  • Shard
  • Source field
  • Term
  • Text
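
To make the Routing, Id, and Primary shard terms a little more concrete: Elasticsearch decides which primary shard stores a document with a formula of the form shard = hash(routing) % number_of_primary_shards, where the routing value defaults to the document's Id. The sketch below only illustrates that idea. The function name `pick_shard` is hypothetical, and `zlib.crc32` is a stand-in assumption for the hash (Elasticsearch uses its own Murmur3-based hash), so the shard numbers it prints will not match a real cluster.

```python
import zlib


def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Illustrative only: map a routing value to a primary shard number."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash (assumption); Elasticsearch uses a Murmur3-based hash.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


if __name__ == "__main__":
    for doc_id in ["jsmith", "jdoe"]:
        print(doc_id, "-> shard", pick_shard(doc_id, 5))
```

The same routing value always lands on the same shard, which is one reason the number of primary shards is chosen when the index is created.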

Downloading and Running Elasticsearch

From an extracted download, start a node with:

bin/elasticsearch

Or run it with Docker:

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:5.6.3

Using the REST API

*Note: Make sure to secure the Elasticsearch REST endpoint within your network.

  • Kibana Dev Tools: Lets you explore the full Elasticsearch REST API without any third-party tools.
  • Postman and others: Postman or any other REST client can be used to explore the REST APIs.

Creating Index

Use Kibana Dev Tools or a tool of your choice to create the index and perform CRUD operations.

PUT /user
{
  "mappings": {
    "userprimaryinfo": {
      "properties": {
        "firstName": {
          "type": "text"
        },
        "lastName": {
          "type": "text"
        },
        "userId": {
          "type": "text"
        }
      }
    },
    "usercontactinfo": {
      "properties": {
        "email": {
          "type": "text"
        },
        "phone": {
          "type": "text"
        },
        "userId": {
          "type": "text"
        }
      }
    }
  }
}

Create/Insert User information

POST /user/userprimaryinfo
{
  "firstName": "John",
  "lastName": "Smith",
  "userId": "jsmith"
}

POST /user/usercontactinfo
{
  "email": "jsmith@hello.com",
  "phone": "1112223333",
  "userId": "jsmith"
}

POST /user/userprimaryinfo
{
  "firstName": "John",
  "lastName": "Doe",
  "userId": "jdoe"
}

POST /user/usercontactinfo
{
  "email": "jdoe@hello.com",
  "phone": "2223334444",
  "userId": "jdoe"
}
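
The same calls can be issued from application code. Here is a minimal Python sketch using only the standard library. The base URL `http://localhost:9200` is an assumption about your setup, and the helper names `build_request` and `send` are hypothetical, not part of any Elasticsearch client. If your node has security enabled (for example through X-Pack), the request would also need credentials, which this sketch leaves out.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local single-node instance


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send the request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Same document as the first POST above.
john = build_request(
    "POST",
    "/user/userprimaryinfo",
    {"firstName": "John", "lastName": "Smith", "userId": "jsmith"},
)

if __name__ == "__main__":
    print(john.get_method(), john.full_url)
    # print(send(john))  # uncomment once Elasticsearch is reachable
```

Building the request separately from sending it keeps the sketch easy to inspect without a running cluster.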

Searching

  • Search request body
GET /user/userprimaryinfo/_search
{
  "query": {
    "term": { "userId": "jdoe" }
  }
}
  • Elasticsearch’s Query DSL
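
One Query DSL detail worth knowing early: a term query looks up the exact value in the index without analyzing it, while a match query runs the query text through the field's analyzer first. On a text field, "John" is indexed as the lowercase term "john", so the two queries can behave differently. The toy model below is a sketch of that difference, not Elasticsearch code: `analyze` is a simplified stand-in for the standard analyzer, and all function names are hypothetical.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^a-z0-9]+", text.lower()) if token]


def build_index(docs, field):
    """Map each analyzed term of `field` to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in analyze(doc[field]):
            index[term].add(doc_id)
    return index


def term_query(index, value):
    """Look the value up as-is; the query text is not analyzed."""
    return index.get(value, set())


def match_query(index, text):
    """Analyze the query text first, then match any resulting term."""
    result = set()
    for term in analyze(text):
        result |= index.get(term, set())
    return result


DOCS = {
    1: {"firstName": "John", "userId": "jsmith"},
    2: {"firstName": "John", "userId": "jdoe"},
}
FIRST_NAMES = build_index(DOCS, "firstName")

if __name__ == "__main__":
    print(term_query(FIRST_NAMES, "John"))   # empty: the index holds "john"
    print(match_query(FIRST_NAMES, "John"))  # both documents
```

This is also why the term query on "jdoe" above finds its document: the value is already a single lowercase token.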

Understanding the Response

Source: ElasticSearch 101 – a getting started tutorial By Joel Abrahamsson
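
As a rough guide to reading a search response: `took` is the time in milliseconds, `timed_out` says whether the search timed out, `_shards` reports how many shards were queried, and `hits` holds the total count plus the matching documents, each with its `_index`, `_type`, `_id`, `_score`, and original `_source`. The Python sketch below walks a hand-written sample shaped like a 5.x response. The values, including the `_id` and scores, are illustrative rather than real output, and `summarize` is a hypothetical helper.

```python
import json

# Hand-written sample shaped like a 5.x search response; values are illustrative.
SAMPLE_RESPONSE = json.loads("""
{
  "took": 3,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "user",
        "_type": "userprimaryinfo",
        "_id": "example-id",
        "_score": 0.2876821,
        "_source": {"firstName": "John", "lastName": "Doe", "userId": "jdoe"}
      }
    ]
  }
}
""")


def summarize(response):
    """Return (total hit count, list of original documents) from a search response."""
    hits = response["hits"]
    documents = [hit["_source"] for hit in hits["hits"]]
    return hits["total"], documents


if __name__ == "__main__":
    total, docs = summarize(SAMPLE_RESPONSE)
    print("matched:", total)
    for doc in docs:
        print(doc["userId"], doc["firstName"], doc["lastName"])
```

Note that `hits.total` is a plain number in the 5.x line used in this post; later major versions report it as an object, so the helper would need adjusting there.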

Searching Examples:

  • Basic free text search
GET /user/userprimaryinfo/_search?q=John
GET /user/usercontactinfo/_search?q=*333
GET /user/userprimaryinfo,usercontactinfo/_search?q=jdoe

OR

GET /user/userprimaryinfo,usercontactinfo/_search
{
  "query": {
    "match": {
      "userId": "jdoe"
    }
  },
  "sort": [
    {
      "_type": {
        "order": "asc"
      }
    }
  ]
}

Elasticsearch Awesome Features

Elasticsearch Analyzers:

  • Standard Analyzer: The standard analyzer divides text into terms on word boundaries, as defined by the Unicode Text Segmentation algorithm. It removes most punctuation, lowercases terms, and supports removing stop words.
  • Simple Analyzer: The simple analyzer divides text into terms whenever it encounters a character which is not a letter. It lowercases all terms.
  • Whitespace Analyzer: The whitespace analyzer divides text into terms whenever it encounters any whitespace character. It does not lowercase terms.
  • Stop Analyzer: The stop analyzer is like the simple analyzer, but also supports removal of stop words.
  • Keyword Analyzer: The keyword analyzer is a “noop” analyzer that accepts whatever text it is given and outputs the exact same text as a single term.
  • Pattern Analyzer: The pattern analyzer uses a regular expression to split the text into terms. It supports lower-casing and stop words.
  • Language Analyzers: Elasticsearch provides many language-specific analyzers like english or french.
  • Fingerprint Analyzer: The fingerprint analyzer is a specialist analyzer which creates a fingerprint which can be used for duplicate detection.

You can also create custom analyzers for your own use cases.
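
To get a feel for how the analyzers above differ, here is a small Python approximation of three of them, written from the descriptions in the list. These functions are hypothetical stand-ins, not Elasticsearch's implementation, and they ignore details such as token positions and offsets.

```python
import re


def whitespace_analyzer(text):
    """Split on whitespace; keep case and punctuation."""
    return text.split()


def simple_analyzer(text):
    """Split on any non-letter character and lowercase every term."""
    # [^\W\d_] matches Unicode letters only (word chars minus digits and underscore).
    return [term.lower() for term in re.findall(r"[^\W\d_]+", text)]


def keyword_analyzer(text):
    """Emit the whole input unchanged as a single term."""
    return [text]


if __name__ == "__main__":
    sample = "The QUICK brown-fox 2 jumps"
    for analyzer in (whitespace_analyzer, simple_analyzer, keyword_analyzer):
        print(analyzer.__name__, analyzer(sample))
```

On a running node, the `_analyze` API reports the tokens a real analyzer produces, which is the better way to verify behavior before choosing one.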




I hope this post has helped you. If you enjoyed this article, please don’t forget to clap 👏! I would love to hear your thoughts on this topic. You can also follow me on Medium, GitHub, and Twitter for more updates.