I have been getting hands-on with the ELK stack lately, so I thought I would write this article, both for my own understanding and for anyone else trying to do the same.
I hope to see your constructive thoughts in the comments!
Let’s install it quickly (macOS):
$ brew tap caskroom/cask
$ brew install brew-cask
$ brew cask install java
Install Elasticsearch, Logstash and Kibana:
$ brew install elasticsearch
$ brew install kibana
$ brew install logstash
Elasticsearch can be used both as a search engine and as a data store.
Every record stored in Elasticsearch must be a JSON object.
The main data container is called an index (plural: indices), and it can be thought of as a database in the traditional SQL world.
Within an index, the data is grouped into data types, which Elasticsearch calls mappings. A mapping describes how the records are composed, that is, which fields they have.
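To make the index / mapping / record vocabulary concrete, here is a minimal sketch in Python that only builds the JSON bodies and does not talk to a server. The index name `blog-posts` and all field names are made up for illustration, and the shape of the mapping body assumes a recent (7.x or later) Elasticsearch.

```python
import json

# Hypothetical index name, chosen only for illustration.
INDEX_NAME = "blog-posts"

# A mapping describes the fields of the records in the index.
# Assumption: this body shape matches Elasticsearch 7.x or later.
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "author": {"type": "keyword"},
            "published": {"type": "date"},
            "views": {"type": "integer"},
        }
    }
}

# A record is a plain JSON object whose fields match the mapping.
document = {
    "title": "Getting started with the ELK stack",
    "author": "someone",
    "published": "2019-01-15",
    "views": 42,
}


def undeclared_fields(doc, index_mapping):
    """Return the document fields that the mapping does not declare."""
    declared = index_mapping["mappings"]["properties"]
    return sorted(set(doc) - set(declared))


if __name__ == "__main__":
    # Print the request bodies one would send to create the index and a record.
    print(f"PUT /{INDEX_NAME}")
    print(json.dumps(mapping, indent=2))
    print(f"POST /{INDEX_NAME}/_doc")
    print(json.dumps(document, indent=2))
    print("undeclared fields:", undeclared_fields(document, mapping))
```

The helper `undeclared_fields` is not part of Elasticsearch; it is only there to show that a mapping is essentially a list of expected fields.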
Every instance of Elasticsearch is called a node, and several nodes make up a cluster.
What happens when a node starts?
The configuration is read from the environment variables and the elasticsearch.yml configuration file.
The node name is set by the configuration file or, if none is given, chosen at random from a built-in list of names.
Internally, the Elasticsearch engine initializes all the modules and plugins that are available in the current installation.
Many required services are started automatically, such as:
- Cluster services, Indexing Service, Mapping Service, Network Services, Plugin Service, River Service, Language Scripting Services.
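The name-selection step described above can be pictured with a small Python sketch. This is a toy model and not Elasticsearch's actual code: `BUILTIN_NAMES` is a made-up stand-in for the built-in list, and `settings` stands for whatever was read from the environment and `elasticsearch.yml` (where the setting is called `node.name`).

```python
import random

# Hypothetical stand-in for the built-in name list; not the real one.
BUILTIN_NAMES = ["node-alpha", "node-beta", "node-gamma"]


def resolve_node_name(settings, rng=random):
    """Use the configured name if there is one, otherwise pick a random one."""
    configured = settings.get("node.name")
    if configured:
        return configured
    return rng.choice(BUILTIN_NAMES)


if __name__ == "__main__":
    print(resolve_node_name({"node.name": "es-node-1"}))  # configured name wins
    print(resolve_node_name({}))                          # random fallback
```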
There are two important behaviors in an Elasticsearch node:
The non-data node (or arbiter):
Processes REST requests and all other search operations. It is responsible for distributing the actions to the underlying shards (map) and collecting/aggregating the shard results (reduce) so that it can send a final response.
The data container behavior:
Stores the data. It contains the index shards that store the indexed documents as Lucene (the internal Elasticsearch engine) indices.
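The split between the two behaviors can be sketched as a tiny scatter/gather in Python. This is only an illustration of the idea: shards are modelled as plain lists, the scoring (counting term occurrences) is made up, and the function names `search_shard` and `coordinate` are hypothetical.

```python
# Each shard is modelled as a list of (doc_id, text) pairs.
# Real shards are Lucene indices; this is only a toy stand-in.
SHARDS = [
    [("a1", "kibana dashboard tips"), ("a2", "logstash pipeline basics")],
    [("b1", "elasticsearch cluster health"), ("b2", "kibana and elasticsearch")],
    [("c1", "elasticsearch index mapping")],
]


def search_shard(shard, term):
    """Map step, run per shard: score = number of occurrences of the term."""
    hits = []
    for doc_id, text in shard:
        score = text.split().count(term)
        if score > 0:
            hits.append((score, doc_id))
    return hits


def coordinate(shards, term, size=10):
    """Reduce step: merge the per-shard hits into one ranked response."""
    partial_results = [search_shard(shard, term) for shard in shards]
    merged = [hit for hits in partial_results for hit in hits]
    # Highest score first; doc_id breaks ties so the order is deterministic.
    top = sorted(merged, key=lambda hit: (-hit[0], hit[1]))[:size]
    return [doc_id for _, doc_id in top]


if __name__ == "__main__":
    print(coordinate(SHARDS, "elasticsearch"))
    print(coordinate(SHARDS, "kibana", size=1))
```

Here `search_shard` plays the role of the data container behavior, and `coordinate` plays the role of the non-data node that fans the query out and merges the answers.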
The following schema compares ElasticSearch with SQL and MongoDB:
Solving the yellow status…
A yellow status mainly means that some shards (typically replicas) are not allocated.
If your cluster is in the recovery state (meaning that it’s starting up and checking the shards before they come online), you need to wait until the shards’ startup process ends.
Solving the red status…
This means you are experiencing data loss because one or more shards are missing. To fix this, try to restore the missing node(s).
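The yellow and red cases above can be summarised in a small Python helper that reads a cluster-health response. This is a sketch under assumptions: the sample dictionary is hand-written, shaped like what `GET _cluster/health` returns, and `diagnose` is a hypothetical helper, not part of any Elasticsearch client.

```python
def diagnose(health):
    """Turn a cluster-health response into a short, human-readable hint."""
    status = health["status"]
    if status == "green":
        return "all primary and replica shards are allocated"
    if status == "yellow":
        if health.get("initializing_shards", 0) > 0:
            return "shards are still starting up; wait for recovery to finish"
        return f"{health.get('unassigned_shards', 0)} shard(s) not allocated"
    if status == "red":
        return "one or more shards are missing; restore the missing node(s)"
    raise ValueError(f"unknown status: {status!r}")


# Hand-written sample for illustration, not real server output.
sample = {
    "status": "yellow",
    "number_of_nodes": 1,
    "unassigned_shards": 5,
    "initializing_shards": 0,
}

if __name__ == "__main__":
    print(diagnose(sample))
```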
Most visualization tools handle the analytical processing themselves, whereas Kibana is just a web application that renders the analytical processing done by Elasticsearch. It doesn’t load data from Elasticsearch and then process it; instead, it leverages the power of Elasticsearch to do all the heavy lifting. This is what allows real-time visualization at scale.
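To illustrate this division of labour, here is a rough Python sketch of the kind of request a chart could be built from. Elasticsearch does the counting, and the client only reshapes the buckets for drawing. The aggregation name `events_per_hour`, the `@timestamp` field and the sample response are assumptions for illustration, and `fixed_interval` assumes Elasticsearch 7.2 or later. This is not what Kibana literally sends.

```python
import json


def hourly_count_query(time_field="@timestamp"):
    """Ask Elasticsearch to count documents per hour; return no raw hits."""
    return {
        "size": 0,  # no documents are shipped back, only the aggregation
        "aggs": {
            "events_per_hour": {
                "date_histogram": {"field": time_field, "fixed_interval": "1h"}
            }
        },
    }


def to_chart_points(response):
    """Reshape the aggregation buckets into (label, count) pairs for a chart."""
    buckets = response["aggregations"]["events_per_hour"]["buckets"]
    return [(bucket["key_as_string"], bucket["doc_count"]) for bucket in buckets]


# Hand-written sample response for illustration, not real server output.
sample_response = {
    "aggregations": {
        "events_per_hour": {
            "buckets": [
                {"key_as_string": "2019-01-15T10:00:00.000Z", "doc_count": 12},
                {"key_as_string": "2019-01-15T11:00:00.000Z", "doc_count": 7},
            ]
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(hourly_count_query(), indent=2))
    print(to_chart_points(sample_response))
```

Note that the client never sees the individual documents: with `"size": 0` only the per-hour counts come back, which is why this approach scales.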
This serves as the base for the visuals you want to generate. It’s a JSON file for your visualizations (a config file for creating a Kibana dashboard).
As the name suggests, this is for specifying the index pattern.
Logstash is basically a data collector that holds the data and later ingests the data of interest into Elasticsearch when required.
The design of the .conf file for Logstash is crucial for getting the visualization you want.