Next Steps With Splunk

Part 2. Importing And Working With Data

Published in Splunk User Developer Administrator · 8 min read · Feb 21, 2019

This is part 2 of our “Getting Back to Splunk Basics” series. In part 1, we covered the basics of running Splunk in a docker instance on our desktop or laptop, giving us a practice environment where we can get familiar with the application. We logged into our environment, got familiar with administering the application and even created a new user. In this post, we are going to extend our knowledge further and start using Splunk for what it was created for. We are going to start importing data into our environment, start searching over our data, and finally make our data a little more persistent. As you may have noticed, every time you stop your docker instance, all of your changes are removed as well. So why wait any longer, let’s get stuck in.

If you’re looking for ways to install Splunk using Ansible, check out our latest book on the subject.

I know this sometimes seems like we are doing a docker tutorial, but bear with us, as all of the information will come together nicely and give you a good, hands-on understanding of Splunk and administering the application.

Keeping Our Changes With A Volume

If you have moved directly on from our first post, you may not have realised that when you shut down the Splunk docker instance, any changes you have made are lost. Not to worry, there are some easy changes we can make to keep the important parts of our Splunk installation so we can use them for future reference.

As you continue to work with Splunk, and if you continue to test and run your installation on docker, there may be a number of different volumes you want to mount, but today we will simply mount the etc directory so we can keep all our configuration changes.

If you have your docker Splunk instance running, stop it by killing the running container and we will start it up shortly.

Create a directory on our system to keep our data safe. This will usually be in our home directory. I am running on a Mac, so I will create my directory as /Users/<homedir>/testsplunk, but if you are using Linux, you could do something similar in your home directory.
We then need to run our docker command, point it to the new directory, and map it to the corresponding directory on our Splunk system. In our example we will use the /opt/splunk/etc/ directory, as this will keep all our user and configuration data. In the command below, you will notice our docker command now uses the -v option to mount the volume /Users/vincesesto/testsplunk to the /opt/splunk/etc/ directory on our docker container.

We can achieve the above with the following commands:

# 1. Create our new directory (replace vincesesto with your own user name)
mkdir -p /Users/vincesesto/testsplunk
# 2. Run the new docker command
docker run -d -p 8000:8000 -e 'SPLUNK_START_ARGS=--accept-license' -e 'SPLUNK_PASSWORD=changeme' -v /Users/vincesesto/testsplunk:/opt/splunk/etc/ splunk/splunk
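
Since the host path differs between Mac and Linux, the two steps above can be wrapped in one small shell function. This is only a sketch: start_splunk is a hypothetical helper name, and the host directory is passed in as an argument instead of being hard-coded.

```shell
#!/bin/sh
# Hypothetical helper (not part of Splunk or docker): create the host
# directory given as $1 and start Splunk with it mounted over
# /opt/splunk/etc/ inside the container.
start_splunk() {
    host_dir="$1"
    # Make sure the host directory exists before docker mounts it.
    mkdir -p "$host_dir" || return 1
    docker run -d -p 8000:8000 \
        -e 'SPLUNK_START_ARGS=--accept-license' \
        -e 'SPLUNK_PASSWORD=changeme' \
        -v "$host_dir:/opt/splunk/etc/" \
        splunk/splunk
}

# Example call (uncomment to run):
# start_splunk "$HOME/testsplunk"
```

Using $HOME means the same call works whether your home directory lives under /Users or /home.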

Directory Structure In Splunk

This is probably the best time to introduce the Splunk directory structure, as it will help you make decisions like the one above, as well as help you troubleshoot issues and make configuration changes when needed. Most installations will run in the /opt/splunk directory, with the following being the main structure of the application.

var: If you’re familiar with the Linux file system, this is where an application would usually store all data that changes, and within Splunk this can be huge. You need to remember that if you are indexing data, the actual index files will be located in the var directory. So if you are setting up storage for an indexer, or mounting the file system on your laptop, you need to make sure you have enough room. This also holds all of the Splunk application logs, so if you are troubleshooting, the var directory is the best place to start.
bin: Yep, this is where we have all the applications and binaries that run within the Splunk application. For example, if you need to restart the server for some reason, you would run /opt/splunk/bin/splunk restart.
etc: This is where we have all the configuration data for our system. It is broken up into three subdirectories. The users directory holds all relevant details for the users we have set up in our system. The apps directory holds all the code for our Splunk apps, and the system directory holds all the relevant information for the system configurations.
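
To see this layout for yourself, a small loop along these lines reports which of the directories described above exist under a given Splunk home. The function name check_splunk_dirs is made up for illustration, and /opt/splunk is assumed as the default location.

```shell
#!/bin/sh
# Hypothetical helper: report which of the main Splunk directories exist.
# $1 is the Splunk home directory; it defaults to /opt/splunk.
check_splunk_dirs() {
    splunk_home="${1:-/opt/splunk}"
    for d in var bin etc etc/users etc/apps etc/system; do
        if [ -d "$splunk_home/$d" ]; then
            echo "found   $splunk_home/$d"
        else
            echo "missing $splunk_home/$d"
        fi
    done
}

# Example call, from a shell inside the container (uncomment to run):
# check_splunk_dirs /opt/splunk
```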

In the example above, we have mounted the /opt/splunk/etc directory to our own laptop or desktop, so every time we start up our environment this way, our user data, app data and system data should still be available and unchanged.

We can test this out pretty easily. Simply create a new user in your Splunk web interface like we did in part 1 of this series, kill your docker container, and then see if the user is still there when you start it up again.
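
The stop-and-check part of that test could be scripted roughly as follows. Both function names are hypothetical, and the check rests on an assumption worth verifying on your own install: that Splunk stores locally created users in etc/passwd, one per line, starting with a colon followed by the user name.

```shell
#!/bin/sh
# Stop every running container that was started from the splunk/splunk image.
stop_splunk() {
    ids=$(docker ps -q --filter ancestor=splunk/splunk)
    if [ -z "$ids" ]; then
        echo "no running splunk/splunk container"
        return 0
    fi
    # $ids is deliberately unquoted so each container ID is its own argument.
    docker stop $ids
}

# Assumption: local users are stored in etc/passwd as lines beginning
# with ":<username>:". Returns 0 if the user is found, non-zero otherwise.
user_persisted() {
    host_etc="$1"   # host directory mounted over /opt/splunk/etc/
    user="$2"       # user name created in the web interface
    grep -q "^:$user:" "$host_etc/passwd"
}

# Example calls (uncomment to run):
# stop_splunk
# user_persisted "$HOME/testsplunk" testuser && echo "user survived"
```

If the check fails even though the user shows up in the web interface after a restart, the assumption about the file format is the first thing to revisit.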

Forwarding Data To Your Splunk Installation

In a real work environment we would use Splunk with an Indexer and a Search Head, and have a Universal Forwarder send the data from our environment to the Indexer, which then indexes it. We only have a test environment at the moment, so we don’t need to worry too much about Indexers and Search Heads. We could import sample data into our Splunk search environment, but we would be limited to the data already on the docker instance. Importing data onto the container would not be very efficient.

In the next section of this post, we will set up a forwarder on our laptop and start forwarding data onto our docker instance. But first we need to make sure we can forward data to our Splunk host.

Splunk dedicates specific ports for specific services. As we already know, port 8000 is where Splunk serves its web interface. Our docker command uses the -p option to map port 8000 on our docker instance to 8000 on our host machine to make sure it is available in our web browser. We can do the same thing with forwarding. In this case Splunk uses port 9997 for data forwarding, so we can run our docker instance with an extra -p option to allow forwarding.

If you still have your docker instance running, stop it as you will need to run it with the new option listed below:

# Run the new docker command
docker run -d -p 8000:8000 -p 9997:9997 -e 'SPLUNK_START_ARGS=--accept-license' -e 'SPLUNK_PASSWORD=changeme' -v /Users/vincesesto/testsplunk:/opt/splunk/etc/ splunk/splunk
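
To confirm that both mappings took effect, docker can list the published ports of the running container. A sketch, with show_splunk_ports as a hypothetical helper name:

```shell
#!/bin/sh
# Print the published ports of every running splunk/splunk container.
show_splunk_ports() {
    for id in $(docker ps -q --filter ancestor=splunk/splunk); do
        docker port "$id"
    done
}

# Example call (uncomment to run):
# show_splunk_ports
```

Each published port should appear as a line of the form 9997/tcp -> 0.0.0.0:9997. If 9997 is missing, the container was started without the extra -p option.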

Forward Data With A SplunkForwarder

As an example, we are going to install a forwarder on a laptop, which is a Mac. Installing on Linux or Windows is similar; you just need to remember the directory structure will be a little different. You can download the forwarder directly from the Splunk website. When you install the Splunk forwarder for the first time, it should start automatically, but we are going to perform the following steps:

Move into our SplunkForwarder directory. On a Mac, it will be in the Applications directory, but in a Linux environment it will be in the opt directory. The forwarder has the same directory structure as a regular Splunk installation, so to start up the forwarder and perform our configuration changes, we will be working in the bin directory.
Start the SplunkForwarder from the command line. This will use the splunk script in the bin directory.
Tell the forwarder where to forward the data. In our case we will be forwarding to our docker container. We have set up port 9997 to map to our local host on IP address 0.0.0.0. We will once again use the splunk script in the bin directory, with the “add forward-server” option.
Lastly, we have to tell the forwarder what directories or files we need to forward to our indexer. In our case, we will simply forward the contents of our /var/log directory. Once again we use the splunk script in the bin directory, with the “add monitor” option.

Below are all the commands you need to perform these changes.

# 1. Move into the bin directory
cd /Applications/SplunkForwarder/bin/
# 2. Check if the SplunkForwarder is running
./splunk status
splunkd is not running.
# 3. Start up the forwarder
./splunk start
Splunk> Be an IT superhero. Go home early.
Checking prerequisites...
Checking mgmt port [8089]: open
Checking conf files for problems...
Done
Checking default conf files for edits...
Validating installed files against hashes from '/Applications/SplunkForwarder/splunkforwarder-7.2.4-8a94541dcfac-darwin-64-manifest'
All installed files intact.
Done
All preliminary checks passed.
Starting splunk server daemon (splunkd)...
Done

In the listing above, the lines starting with cd and ./splunk are the commands we typed; everything else is the output received on our system. To finish off forwarding, you can now perform the following from the command line.

# 4. Tell the forwarder where to send the data to be indexed by Splunk
./splunk add forward-server 0.0.0.0:9997 -auth admin:changeme
Added forwarding to: 0.0.0.0:9997.
# 5. Tell the forwarder what to forward
./splunk add monitor /var/log/
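
Before switching to the web interface, it can help to ask the forwarder what it thinks it is configured to do. The sketch below wraps the forwarder's list forward-server and list monitor commands; show_forwarder_config is a hypothetical helper name, and the Mac install path and the admin:changeme credentials are assumptions carried over from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: show where the forwarder sends data and what it monitors.
# $1 is the forwarder's bin directory; it defaults to the Mac location.
show_forwarder_config() {
    fwd_bin="${1:-/Applications/SplunkForwarder/bin}"
    # Forwarding destinations configured with "add forward-server".
    "$fwd_bin/splunk" list forward-server -auth admin:changeme
    # Files and directories configured with "add monitor".
    "$fwd_bin/splunk" list monitor -auth admin:changeme
}

# Example call (uncomment to run):
# show_forwarder_config /Applications/SplunkForwarder/bin
```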

If all goes well, you should start seeing data coming through to your Splunk web interface. If you haven’t used the Splunk search interface before, we have simply opened up a search and looked for all data in index="main", and we are starting to see data coming from my MacBook. If you’re interested in further details on searching in Splunk, check out our post on the subject.

Splunk Port Numbers

I’m sure by now you’ve noticed Splunk uses different ports for different services. We’ve been using port 8000 to access our web interface, but we thought we would quickly clarify all the important ports assigned to Splunk.

8000: You’ve been using this for the web application, and this is the dedicated default web port used to access Splunk in your browser.
9997: We’ve also used this one in the post above, forwarding our data into our index over this port.
8089: This is the management interface where you can connect to the Splunk API.
8080: This is the port Splunk uses for index replication and allows communication between index servers.

You’ve done a lot of work if you’ve been following along. We’ve set up our Splunk docker container to accept information from a forwarder by opening ports as well as mounting the etc filesystem on our laptop to make some of our configuration data more persistent. We’ve set up a forwarder on our local host and started forwarding data through to our Splunk instance so we can start to search over the data.


About The Author

DevOps Engineer, Endurance Athlete and Author. As a DevOps Engineer I specialize in Linux and Open Source Applications. I am particularly interested in Search Marketing and Analytics, and am currently developing my skills in DevOps, continuous integration, security, Splunk (UI and Reporting) and development (Java).