Getting Started with Splunk

Sumit Srivastava
Published in Globant
Apr 27, 2022

Splunk is a software platform widely used for monitoring, searching, analyzing, and visualizing machine-generated data in real time. It captures, indexes, and correlates real-time data in a searchable repository and produces graphs, alerts, dashboards, and visualizations. Splunk makes data easily accessible across the whole organization, which helps with diagnostics and with solving various business problems.

Why Splunk:

  • Splunk collects data in real time from multiple systems
  • It accepts data in any form, for example log files, .csv, JSON, config files, etc.
  • Splunk can pull data from databases, the cloud, and any operating system
  • It analyzes and visualizes the data for better performance
  • Splunk gives alerts and event notifications
  • It provides real-time visibility
  • It satisfies industry needs such as horizontal scalability (using many systems in parallel)

Splunk Components:

  1. Processing Components
  2. Management Components

Processing Components :

Forwarder: Collects data from remote machines and forwards it to the Indexer.

Indexer: Processes the incoming data in real time. It also stores and indexes the data on disk.

Search Head: End users interact with Splunk through the Search Head. It lets users search, analyze, and visualize the data.

Management Components :

  • Deployment Server
  • Indexer Cluster (Master Node)
  • Search Head Cluster (Deployer)
  • License Master
  • Monitoring Console

Deployment Server: acts as a centralized configuration manager for any number of other instances, called “deployment clients”. … Deployment clients can be forwarders, indexers, or search heads. Each deployment client belongs to one or more server classes.
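The relationship between deployment clients and server classes can be pictured with a small Python model. This is only a conceptual sketch, not how Splunk implements it (in a real deployment this mapping lives in the deployment server's configuration); the server class names, host-name patterns, and app names are all made up for illustration.

```python
from fnmatch import fnmatch

# Hypothetical server classes: each maps a host-name pattern to the apps it deploys.
SERVER_CLASSES = {
    "all_forwarders": {"pattern": "fwd-*", "apps": ["outputs_base"]},
    "web_forwarders": {"pattern": "fwd-web-*", "apps": ["web_inputs"]},
    "indexers": {"pattern": "idx-*", "apps": ["indexes_base"]},
}

def classes_for(client: str) -> list[str]:
    """Return every server class whose pattern matches the client host name."""
    return [name for name, sc in SERVER_CLASSES.items() if fnmatch(client, sc["pattern"])]

def apps_for(client: str) -> list[str]:
    """Collect the apps a client receives from all of its server classes."""
    return [app for name in classes_for(client) for app in SERVER_CLASSES[name]["apps"]]

print(classes_for("fwd-web-01"))  # one client can belong to several server classes
print(apps_for("fwd-web-01"))
```

Here the client `fwd-web-01` matches two patterns, which mirrors the point that a deployment client can belong to more than one server class.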

Indexer Cluster (Master Node): The master node manages the cluster. It coordinates replication among the peer nodes and tells the search head where to find data. It also helps manage the configuration of peer nodes and orchestrates remedial activities if a peer goes down.

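Search Head Cluster (Deployer): A search head runs searches across the set of peer nodes, and you must use a search head to manage searches across indexer clusters. In a search head cluster, the deployer is the instance that distributes apps and configuration updates to the cluster members.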

License Master: The license is based on volume and usage, for example 50 GB per day. Splunk regularly checks the licensing details.
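The idea of a daily volume quota can be illustrated with a few lines of Python. This is a conceptual sketch only; the function name and the per-indexer volumes are hypothetical, and it says nothing about how Splunk actually measures usage or reacts when a quota is exceeded.

```python
GB = 1024 ** 3

def license_usage(daily_quota_gb: float, ingested_bytes: list[int]) -> tuple[float, bool]:
    """Return (GB ingested today, whether the daily quota was exceeded)."""
    used_gb = sum(ingested_bytes) / GB
    return used_gb, used_gb > daily_quota_gb

# Hypothetical per-indexer volumes for one day, checked against a 50 GB/day license.
used, exceeded = license_usage(50, [20 * GB, 25 * GB, 10 * GB])
print(f"{used:.1f} GB used, exceeded={exceeded}")
```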

Monitoring Console : The monitoring console is a set of dashboards, platform alerts, and health checks included in Splunk Enterprise.

Splunk Architecture

Splunk’s architecture comprises several components, each with its own function. The image below gives a consolidated view of the components involved in the process:

Forwarding Data:

The forwarder can track the data, make a copy of it, and perform load balancing on it before sending it to the indexer.

Cloning creates duplicate copies of an event at the data source, whereas load balancing ensures that even if one instance fails, the data can be sent to another instance that hosts the indexer.
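The difference between load balancing and cloning can be sketched in Python. This is a toy model under stated assumptions: the `Forwarder` class, its method names, and the indexer names are hypothetical, the methods return the chosen target names instead of sending anything, and switching target on every event is a simplification of how a real forwarder balances load.

```python
from itertools import cycle

class Forwarder:
    """Toy forwarder: load-balances across indexers, or clones to all of them."""

    def __init__(self, indexers: dict[str, bool]):
        self.indexers = indexers        # name -> is the indexer reachable?
        self._ring = cycle(indexers)    # round-robin order over indexer names

    def send_balanced(self, event: str) -> str:
        """Pick the next reachable indexer, skipping any that are down."""
        for _ in range(len(self.indexers)):
            target = next(self._ring)
            if self.indexers[target]:
                return target
        raise RuntimeError("no indexer reachable")

    def send_cloned(self, event: str) -> list[str]:
        """Pick every reachable indexer, so each one gets a copy of the event."""
        return [name for name, up in self.indexers.items() if up]

fwd = Forwarder({"idx-1": True, "idx-2": False, "idx-3": True})
print([fwd.send_balanced(f"event {i}") for i in range(4)])
print(fwd.send_cloned("event 4"))
```

With `idx-2` marked as down, the balanced sends alternate between the two remaining indexers, which is the failover behaviour described above.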

Storing Data:

When data arrives from the forwarder, it is stored in an Indexer component. In the Indexer, the data is split into logical data stores (indexes), and on each data store you can set permissions that control what each user can view and access.
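As a rough illustration of per-data-store permissions, the Python sketch below filters events by the role of the person searching. The index names, role names, and events are invented for the example, and the model is far simpler than Splunk's real access controls.

```python
# Hypothetical indexes (logical data stores) and the roles allowed to search each.
INDEX_ACCESS = {
    "web": {"admin", "web_team"},
    "security": {"admin", "sec_team"},
}

EVENTS = [
    {"index": "web", "raw": "GET /index.html 200"},
    {"index": "security", "raw": "failed login for root"},
]

def visible_events(role: str) -> list[str]:
    """Return the raw events stored in indexes the given role may search."""
    return [e["raw"] for e in EVENTS if role in INDEX_ACCESS[e["index"]]]

print(visible_events("web_team"))
print(visible_events("admin"))
```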

Once the data is in the Indexer, you can search it. Searches are distributed to different search peers, and the results from all peers are merged and sent back to the Search Head.
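This fan-out-and-merge pattern can be sketched in Python. The peer names and events are hypothetical, a thread pool stands in for the network calls to each peer, and the real search head does considerably more than concatenate results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search peers (indexers), each holding a slice of the events.
PEERS = {
    "peer-1": ["error disk full", "info started"],
    "peer-2": ["error timeout", "warn slow query"],
}

def search_peer(peer: str, term: str) -> list[str]:
    """Run the search on a single peer and return its matching events."""
    return [event for event in PEERS[peer] if term in event]

def distributed_search(term: str) -> list[str]:
    """Fan the search out to every peer, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda peer: search_peer(peer, term), PEERS))
    return sorted(event for part in partials for event in part)

print(distributed_search("error"))
```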

Searching Data:

You can also schedule searches and create alerts, which are triggered when certain conditions match the saved searches.
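A saved search with an alert condition can be modelled roughly as follows. This Python sketch is illustrative only: the `SavedSearch` class, the threshold condition, and the sample events are assumptions, and a plain Python function stands in for the SPL query.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class SavedSearch:
    name: str
    query: Callable[[list[dict]], list[dict]]  # stands in for an SPL search
    threshold: int                             # alert when matches exceed this

def run_scheduled(search: SavedSearch, events: list[dict]) -> Optional[str]:
    """Run one scheduled execution; return an alert message if the condition holds."""
    matches = search.query(events)
    if len(matches) > search.threshold:
        return f"ALERT {search.name}: {len(matches)} matching events"
    return None

# Hypothetical saved search: alert when more than 2 server errors are found.
too_many_errors = SavedSearch(
    name="too_many_errors",
    query=lambda evs: [e for e in evs if e["status"] >= 500],
    threshold=2,
)
sample = [{"status": 500}, {"status": 200}, {"status": 503}, {"status": 502}]
print(run_scheduled(too_many_errors, sample))
```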

You can also use knowledge objects to enrich the existing unstructured data (data that does not have a fixed format).

Search heads and knowledge objects can be accessed from the Splunk CLI or the Splunk Web interface. This communication happens over a REST API connection.
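To give a feel for what a REST call might look like, the Python sketch below builds, but does not send, a request that would create a search job. It assumes a default installation where the management port is 8089 and the search job endpoint is `/services/search/jobs`; the host name, credentials, and helper function are placeholders, so check the REST API reference for your version before relying on it.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

def build_search_request(host: str, user: str, password: str, spl: str) -> Request:
    """Build (but do not send) a POST request that would create a search job."""
    # The search string sent over REST starts with the `search` command.
    body = urlencode({"search": f"search {spl}", "output_mode": "json"}).encode()
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        url=f"https://{host}:8089/services/search/jobs",  # assumed default port
        data=body,
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )

# Placeholder host and credentials, for illustration only.
req = build_search_request("splunk.example.com", "admin", "changeme", "index=main error | head 3")
print(req.get_method(), req.full_url)
```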

Search Processing Language (SPL) :

SPL is a query language made up of commands, functions, arguments, and clauses that you combine to get the desired results from your datasets.

Components of SPL

  • Search Terms − The keywords or phrases you are looking for.
  • Commands − The action you want to take on the result set, such as formatting the results or counting them.
  • Functions − The computations you want to apply to the results, such as sum or average.
  • Clauses − How to group or rename the fields in the result set.

Search Terms: In the example below, we search for records that contain the two highlighted terms.
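An SPL search with two terms, such as `error payment` (hypothetical terms chosen for this illustration), returns only the events that contain both. The Python sketch below mimics that behaviour on a few made-up events; it uses simple case-insensitive substring matching, which is a simplification of how Splunk matches terms.

```python
# Hypothetical sample events.
EVENTS = [
    "2022-04-27 ERROR payment timeout for order 17",
    "2022-04-27 INFO payment accepted for order 18",
    "2022-04-27 ERROR disk quota reached",
]

def search_terms(events: list[str], *terms: str) -> list[str]:
    """Keep events that contain every term (case-insensitive, implicit AND)."""
    return [e for e in events if all(t.lower() in e.lower() for t in terms)]

print(search_terms(EVENTS, "error", "payment"))
```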

Commands: In the example below, we use the head command to keep only the first 3 results of a search.
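In SPL this kind of search ends with something like `... | head 3`. The Python sketch below shows the equivalent operation on a made-up result list; the function is a stand-in for illustration, not Splunk code.

```python
def head(results: list[str], n: int = 10) -> list[str]:
    """Keep only the first n results, like `... | head n` in SPL."""
    return results[:n]

# Hypothetical search results, in the order the search returned them.
results = ["event 5", "event 4", "event 3", "event 2", "event 1"]
print(head(results, 3))
```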

Functions: In the example below, we use the avg() function of the stats command, which calculates the average value of the numeric field passed to it.
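In SPL this would look something like `... | stats avg(bytes)`, where `bytes` is an assumed field name. The Python sketch below computes the same kind of average over a few invented events, skipping events that lack the field.

```python
from statistics import mean
from typing import Optional

def stats_avg(events: list[dict], field: str) -> Optional[float]:
    """Average a numeric field over the events that contain it."""
    values = [float(e[field]) for e in events if field in e]
    return mean(values) if values else None

# Hypothetical events; the last one has no `bytes` field and is ignored.
events = [{"bytes": 100}, {"bytes": 200}, {"bytes": 600}, {"status": 404}]
print(stats_avg(events, "bytes"))
```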

Clauses: In the example below, we get the average number of bytes for each file in the web_application log. The result shows the name of each file along with its average bytes.
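A search of this kind typically ends with a `by` clause, along the lines of `... | stats avg(bytes) by file` (the field names `bytes` and `file` are assumptions here). The Python sketch below reproduces the grouping on a handful of invented events.

```python
from collections import defaultdict

def stats_avg_by(events: list[dict], field: str, by: str) -> dict[str, float]:
    """Average `field` separately for each distinct value of the `by` field."""
    groups: dict[str, list[float]] = defaultdict(list)
    for e in events:
        groups[e[by]].append(e[field])
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}

# Hypothetical web access events.
events = [
    {"file": "index.html", "bytes": 100},
    {"file": "cart.do", "bytes": 250},
    {"file": "index.html", "bytes": 300},
]
print(stats_avg_by(events, "bytes", "file"))
```

Each key in the returned dictionary corresponds to one row of the result table: the file name and its average bytes.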

Conclusion: Splunk is well suited to monitoring infrastructure performance, troubleshooting issues, and easily creating dashboards, reports, and alerts. With all logs stored and indexed in one place, it serves as a complete tool for managing a system.
