Understanding Splunk : Data Ingestion

By Ashish
Published in Analytics Vidhya, Dec 26, 2019

Introduction:

Organisations are generating more data today than at any earlier point in their existence. Predictions are that zettabytes of data will be generated in the next two years. Supporting this prediction is the fact that every new cloud-based application and every cloud-connected IoT device generates a continuous stream of data. A large part of this collected data is also ignored and goes unused. There is huge business value in this data, and we need tools to tap into it and extract meaningful information from it. Big data analysts popularly call this dark data. Dark data mainly takes the form of web traffic, log files, streaming analysis data, unstructured data, etc.

How can an organisation take advantage of this dark data and convert it into actionable insights? Splunk is the answer. Splunk allows you to investigate this data in its raw, unstructured format, monitor it as it streams through your business systems, analyse its trends and take action, so that you can turn your dark data into actionable insights.

Understanding the Four Vs of Big Data:

Image source: https://twitter.com/BigDataBlock/status/1001523633733488641/photo/1

The massive volume of data being generated by organisations is very diverse in its use and location. The focus of Splunk's Data Fabric Search (DFS) is to address the first three Vs, i.e. Volume, Velocity and Variety. Historically, data platforms have been built to optimise one of these at the expense of the others.

What Splunk can Index:

Image source: https://answers.splunk.com/answers/671980/what-are-the-different-types-of-data-ingestion.html

Demo: Data Ingestion in Splunk (with screenshots):

Below are the steps to ingest a data file through the Splunk dashboard.

Step 1: Start the Splunk server using Splunk CLI
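For readers who would rather script this step, here is a minimal sketch rather than an official Splunk recipe. It assumes Splunk Enterprise is installed on Linux or macOS and that the SPLUNK_HOME environment variable points at the install directory (the /opt/splunk fallback is only a guess at a common default). The helper names `splunk_command` and `start_splunk` are hypothetical; the only Splunk-specific piece is the `splunk start` CLI command itself.

```python
import os
import subprocess
from pathlib import PurePosixPath


def splunk_command(*args, splunk_home=None):
    """Build an argument list for the Splunk CLI (hypothetical helper)."""
    # Assumption: SPLUNK_HOME names the install directory; /opt/splunk is
    # only a fallback guess for a default Linux install.
    home = splunk_home or os.environ.get("SPLUNK_HOME", "/opt/splunk")
    return [str(PurePosixPath(home) / "bin" / "splunk"), *args]


def start_splunk(splunk_home=None):
    """Run `splunk start`, raising CalledProcessError if the CLI reports failure."""
    # Same effect as typing `$SPLUNK_HOME/bin/splunk start` in a terminal.
    subprocess.run(splunk_command("start", splunk_home=splunk_home), check=True)
```

Calling `start_splunk()` launches the server; `splunk_command("status")` builds the matching command for checking whether it is already running.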

Step 2: Login with Splunk credentials
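The login in this step happens in the web UI. For scripted access, Splunk also exposes a REST endpoint for logging in on its management port. The sketch below is illustrative only: it assumes the management port is the default 8089 on localhost, and the function names are hypothetical. It builds the request without sending it, so it can be inspected safely.

```python
import json
import urllib.parse
import urllib.request


def build_login_request(username, password, base_url="https://localhost:8089"):
    """Build (but do not send) a POST request to the auth/login endpoint."""
    # Assumption: splunkd listens on the default management port 8089.
    form = urllib.parse.urlencode(
        {"username": username, "password": password, "output_mode": "json"}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/services/auth/login", data=form, method="POST"
    )


def parse_session_key(response_body):
    """Extract the session key from the JSON body of a successful login."""
    return json.loads(response_body)["sessionKey"]
```

To actually log in, the request would be sent with `urllib.request.urlopen`; a default install typically presents a self-signed certificate, so certificate verification needs to be configured for your environment. The returned session key is then passed on later requests in an `Authorization: Splunk <sessionKey>` header.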

Step 3: Dashboard after logging in successfully

Step 4: Select “Add Data” from the Settings menu.

Step 5: Choose “Upload” from the dashboard

Step 6: Select Source of Data

Select the file to be uploaded

Here is a sample data file available for download: https://docs.splunk.com/Documentation/Splunk/8.0.1/SearchTutorial/Systemrequirements#Download_the_tutorial_data_files

Step 7: This page allows you to configure the data input settings, so that the data is indexed according to the settings you specify.
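The same kind of one-time upload can also be scripted with the Splunk CLI's `add oneshot` command, where the index and source type are passed as options instead of being chosen on this page. The following is a rough sketch under stated assumptions: the binary path, the file path `/tmp/access.log` and the source type `access_combined` are example values, not taken from this walkthrough, and the helper name `oneshot_command` is hypothetical. The file must be readable on the machine where Splunk runs.

```python
def oneshot_command(file_path, index="main", sourcetype=None,
                    splunk_bin="/opt/splunk/bin/splunk"):
    """Build the CLI arguments for a one-time file upload (hypothetical helper)."""
    # Assumption: splunk_bin is where the Splunk CLI lives on this machine.
    cmd = [splunk_bin, "add", "oneshot", file_path, "-index", index]
    if sourcetype:
        # Without this option Splunk tries to detect the source type itself.
        cmd += ["-sourcetype", sourcetype]
    return cmd


# Example values only: adjust the path and source type to your own data.
example = oneshot_command("/tmp/access.log", sourcetype="access_combined")
print(" ".join(example))
```

Passing the resulting list to `subprocess.run(..., check=True)` would perform the upload; the CLI asks for Splunk credentials if you are not already logged in.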

Step 8: The Review Page

Step 9: Data Upload

Step 10: After the file is uploaded successfully

Step 11: Search Results
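The search shown in the UI can also be run over Splunk's REST API. As a hedged sketch (not the only way to do this), the code below builds a request for the `search/jobs/export` endpoint, which streams results back as one JSON object per line. It assumes the default management port 8089 on localhost and a session key obtained from a prior login; the function names and the example query are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(session_key, query, base_url="https://localhost:8089"):
    """Build (but do not send) a streaming search request."""
    # Searches sent over REST must begin with a generating command such as
    # `search`, so add it when the caller left it out.
    if not query.lstrip().startswith(("search ", "|")):
        query = "search " + query
    form = urllib.parse.urlencode(
        {"search": query, "output_mode": "json"}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/services/search/jobs/export",
        data=form,
        headers={"Authorization": f"Splunk {session_key}"},
        method="POST",
    )


def parse_export_results(body):
    """Collect the `result` objects from a line-delimited JSON response."""
    results = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "result" in record:  # skip lines that carry only status metadata
            results.append(record["result"])
    return results
```

For example, `build_search_request(key, "index=main | head 5")` asks for the first five events in the default `main` index. As with any request to a default install, the server typically presents a self-signed certificate, so certificate handling has to be set up before sending.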

Conclusion:

Splunk is now an industry standard for analysing real-time data and triggering follow-up actions. It is used all over the world by government agencies, commercial service providers and universities to analyse and understand business and customer behaviour in real time. It can trigger alerts in case of cyber security incidents or fraud, and it can help improve the performance of the service being provided while reducing the cost of day-to-day operations in an organisation.

My name is Ashish (@ashish_fagna). I am a software consultant. If you enjoyed this article, please recommend and share it! Thanks for your time.

You can also contact me on my email ashish [dot] fagna [at] gmail.com

Software Developer. Machine Learning, Artificial Intelligence Learner. YouTube channel: https://www.youtube.com/channel/UCP1S6HNw4_zze3jAH-Dt08g/videos