osquery For Security

Introduction to osquery — Part 1

Jan 19, 2016

Osquery is a tool developed at Facebook that lets you query security, reliability, and compliance information about the Linux and OSX systems in your environment. When it comes to securing a Linux and/or OSX network environment, it’s hard to beat a tool that’s easy to install, open source, and completely free.

If you’re an IT or security engineer working with a fleet of Linux and/or OSX systems, osquery should be your first choice of software to install on those hosts. In my experience, it’s rare to find a monitoring or security tool that both engineers and security engineers can agree upon. Osquery fits the bill because it provides resources for both engineering and security teams.

Osquery isn’t like other vendor security tools. It doesn’t try to hide itself deep in the operating system or prevent itself from being uninstalled. It also doesn’t place any new restrictions on the users or system. In fact, it does the complete opposite by enabling users to easily gain more information about the system.

Osquery includes an interactive query console/shell (osqueryi) and a daemon (osqueryd) that allows osquery to run in the background, schedule queries, and aggregate logs.

If you’re fairly new to osquery or wondering what an enterprise deployment might look like, keep on reading. This post contains an overview of how to create an osquery config, centralize the log output, and start creating effective searches and alerts.

Prerequisites

This post assumes you already have:

osquery installed
Centralized logging infrastructure (Splunk, ELK, etc.)
Log forwarders installed on your hosts (Splunk UniversalForwarder, Logstash, Fluentd)

Building a Config

Osquery’s configuration file (often named osquery.conf) contains the configuration options and queries that osqueryd uses when it runs.

We’ll use the code sample below as our basic config. By naming it osquery.conf and placing it in /var/osquery (the default location on OSX), we ensure that it will be picked up by the default config_path setting.

As you can see in lines 3–7, we’ve only scheduled a single query under the “schedule” heading. That query is scheduled to run every 60 seconds. That’s quite often, but it will help us populate our log file. If you’d like to see what information that query will log when it runs, open up the interactive osqueryi application in a terminal:

Although we added only one query to our configuration file, we also included three query packs (lines 9–13). Query packs are just JSON config files that contain additional queries. Think of them like imported software libraries. Query packs make it easy to categorize your queries and keep your main config short and tidy. If you want to see all of the queries that are scheduled to run from our config (including the packs), you can run “SELECT name FROM osquery_schedule;” in osqueryi.
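As a rough illustration of the structure described above, here is a minimal Python sketch that assembles a config with one scheduled query and three packs, then prints it as JSON. It is not the original sample, so its line numbers won’t match the ones referenced in the text. The scheduled query (usb_devices) and the build_config helper are assumptions made for illustration. The pack keys are chosen to match the pack_* query names used in the Splunk searches later in this post, and the pack file names are assumed to match the packs shipped with osquery.

```python
import json

PACK_DIR = "/var/osquery/packs"  # pack location mentioned in this post

# Assumed mapping of pack key -> pack file name (see lead-in).
PACKS = {
    "osx_attacks": "osx-attacks.conf",
    "it_compliance": "it-compliance.conf",
    "incident_response": "incident-response.conf",
}


def build_config():
    """Return an osquery-style config dict: one scheduled query, three packs."""
    return {
        "schedule": {
            # Hypothetical example query, scheduled to run every 60 seconds.
            "usb_devices": {
                "query": "SELECT * FROM usb_devices;",
                "interval": 60,
            }
        },
        "packs": {key: f"{PACK_DIR}/{name}" for key, name in PACKS.items()},
    }


if __name__ == "__main__":
    # Print the JSON; redirect it into osquery.conf to try it out.
    print(json.dumps(build_config(), indent=2))
```

Generating the file from code is optional; writing the JSON by hand works just as well. The point is only to show where the “schedule” and “packs” headings sit relative to each other.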

If you want to view or change the queries that run from the packs, you’ll find them in /var/osquery/packs. Within each pack, you can change the interval that controls how often a query runs and disable any queries you don’t want. For the sake of this post, we’re going to leave them all enabled.

Debugging Your Config

Osqueryd won’t run correctly if there are problems with your configuration file. Here are a few ways to debug it:

1. Launch osqueryi in verbose mode and point it to your config using the config_path argument. If you see any initialization lines containing “Error reading config”, you’ve got a problem.

osqueryi --config_path=/var/osquery/osquery.conf --verbose

2. Check the osquery_info table. The “config_valid” column should be set to “1” if it’s valid.

osquery> select config_hash, config_valid from osquery_info;
+----------------------------------+--------------+
| config_hash                      | config_valid |
+----------------------------------+--------------+
| 984b6e1c688c1b2bf126a7e812adcac2 | 1            |
+----------------------------------+--------------+

3. Use osqueryctl to check the config:

$ sudo osqueryctl config-check
Error reading config: Error parsing the config JSON

4. If you’re encountering JSON parsing errors, use a JSON linting tool to debug it.
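If you don’t have a linter handy, Python’s built-in json module can point you at the offending line. The sketch below is one possible approach, not an osquery feature; the lint_json helper name and the sample string are made up for illustration, and it checks strict JSON only.

```python
import json


def lint_json(text):
    """Return None if text parses as strict JSON, else a short error string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno show where the parser gave up.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


if __name__ == "__main__":
    # Hypothetical broken config: note the trailing comma after 60.
    broken = '{"schedule": {"usb_devices": {"interval": 60,}}}'
    print(lint_json(broken))
```

To check a real file, read its contents and pass them to lint_json. The exact error message varies between Python versions, but the line and column should get you close to the problem.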

Starting osqueryd

Now that we have a valid config, it’s time to start the osquery daemon. osqueryctl is a helper script included with osquery that allows you to easily start/stop/restart the osqueryd service.

$ sudo osqueryctl
Usage: /usr/local/bin/osqueryctl {clean|config-check|start|stop|status|restart}

Start osqueryd by running $ sudo osqueryctl start.

Logging to Splunk

After the Splunk UniversalForwarder is installed, we have to configure the inputs and outputs.

Configuring SplunkForwarder

1. Make a directory for the new configs. You should never edit the configs in the “default” directory.

$ mkdir /Applications/SplunkForwarder/etc/apps/SplunkUniversalForwarder/local

2. Copy inputs.conf and outputs.conf from the “default” folder into that directory:

$ cp /Applications/SplunkForwarder/etc/apps/SplunkUniversalForwarder/default/inputs.conf \
  /Applications/SplunkForwarder/etc/apps/SplunkUniversalForwarder/default/outputs.conf \
  /Applications/SplunkForwarder/etc/apps/SplunkUniversalForwarder/local

Configuring Outputs

Edit the newly copied outputs.conf and add the “defaultGroup” and “server” directives so that Splunk knows where to forward the logs.

# Version 6.3.2
[tcpout]
defaultGroup = splunk

[tcpout:splunk]
server = 192.168.x.x:9997
forwardedindex.0.whitelist = .*
forwardedindex.1.blacklist = _.*
forwardedindex.2.whitelist = (_audit|_introspection)
forwardedindex.filter.disable = false

Configuring Inputs

We want to collect the file that contains the query results, which is located at /var/log/osquery/osqueryd.results.log. Optionally, to capture error-related information, you could also monitor the osqueryd.WARNING and osqueryd.ERROR logs. For now, we’ll stick with results.

In /Applications/SplunkForwarder/etc/apps/SplunkUniversalForwarder/local, append the following lines to inputs.conf:

[monitor:///var/log/osquery/osqueryd.results.log]
index = osquery
sourcetype = osquery:results

Note: This assumes you’ve created a separate index for osquery logs in Splunk already.

Restart

Now that we’ve updated the config, we need to restart the Splunk forwarder for these changes to take effect:

$ /Applications/SplunkForwarder/bin/splunk restart

Server-side

Don’t forget to configure a listening TCP port (9997) on your Splunk server!

Grokking osquery Logs

The #1 thing to understand about the way osquery writes logs is that it writes differential logs.

To understand this more clearly, let’s use the usb_devices table as an example. If we have a query that logs all usb_devices like so:

SELECT * from usb_devices;

you might expect that query to log all present USB devices each time it runs. That will not be the case. After osquery has recorded all of the initial USB devices from the first time the query executes, it will only log devices that have been added or removed from the table since the last execution.

However, if you encounter a situation where you want to see the full set of results each time a query is run, take a look at snapshot logs.
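To make the differential idea concrete, here is a small Python sketch that compares two runs of the same query and reports only the rows that were added or removed. This is a conceptual model, not osquery’s actual implementation; the diff_results helper and the sample device rows are hypothetical.

```python
def diff_results(previous, current):
    """Compare two result sets (lists of row dicts); return (added, removed)."""
    def key(row):
        # Rows are dicts, so build a hashable fingerprint for set lookups.
        return tuple(sorted(row.items()))

    prev_keys = {key(row) for row in previous}
    curr_keys = {key(row) for row in current}
    added = [row for row in current if key(row) not in prev_keys]
    removed = [row for row in previous if key(row) not in curr_keys]
    return added, removed


if __name__ == "__main__":
    # Hypothetical usb_devices rows from two consecutive runs.
    first_run = [{"vendor": "Apple", "model": "Keyboard"}]
    second_run = first_run + [{"vendor": "SanDisk", "model": "Cruzer"}]
    added, removed = diff_results(first_run, second_run)
    print("added:", added)
    print("removed:", removed)
```

In this model, the keyboard that was present in both runs produces no new log entry on the second run; only the newly plugged-in device is reported as added.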

osquery + Splunk = ❤

Let’s look at some of the most efficient ways to dig through this data. Since the data is already in JSON, we don’t have to worry about field extractions.

Let’s start by examining all of the queries that we have results for, using the “name” field.

We can recreate a nice table view by using Splunk’s built-in table command:

Let’s get down to what this post is mainly supposed to be about: security. Our config includes a query pack called “osx-attacks”, so let’s see how it works in practice by triggering it. I went ahead and created a launchd item called “com.genieo.completer.download.plist”, which is an artifact of the Genieo adware. If you take a look at the osx-attacks query pack, you’ll notice a query exists to detect this specific adware artifact. Take a look at that query — it looks in the launchd table for items with specific labels.

Now that we’ve simulated the presence of malware on our system, we can run the following query to ensure that osquery detected it:

At this point, you might be able to think of some straightforward queries, alerts, and reports that can be set up:

1. Identify malware on OSX systems:

index=osquery name=pack_osx_attacks* | table unixTime, action, hostIdentifier, columns.label, columns.program_arguments

2. Keep a whitelist of legitimate launchd items in a Splunk lookup table titled "whitelisted_launchd.csv" and generate an alert whenever a new launchd item is added:

index=osquery action=added name=pack_it_compliance_launchd NOT [|inputlookup whitelisted_launchd.csv] | table unixTime, action, hostIdentifier, columns.label, columns.program_arguments

3. Generate a report of systems with unencrypted hard drives:

index=osquery name=pack_incident_response_disk_encryption columns.encrypted=0 columns.name="/dev/disk1" | table unixTime, hostIdentifier, columns.name, columns.encrypted
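If you’d like to prototype the whitelist idea outside of Splunk, the same logic can be sketched in Python against the raw results log. This assumes each log line is a JSON object carrying the fields used in the searches above (name, action, hostIdentifier, and a columns object with a label). The new_launchd_items helper, the sample labels, and the decision to match on label alone are assumptions for illustration.

```python
import json

QUERY_NAME = "pack_it_compliance_launchd"  # query name used in the search above


def new_launchd_items(log_lines, whitelist):
    """Yield (hostIdentifier, label) for added launchd items not whitelisted."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("name") != QUERY_NAME or event.get("action") != "added":
            continue
        label = event.get("columns", {}).get("label")
        if label not in whitelist:
            yield event.get("hostIdentifier"), label


if __name__ == "__main__":
    # Hypothetical log lines and whitelist entries.
    sample = [
        json.dumps({"name": QUERY_NAME, "action": "added",
                    "hostIdentifier": "host-1",
                    "columns": {"label": "com.example.unknown"}}),
        json.dumps({"name": QUERY_NAME, "action": "added",
                    "hostIdentifier": "host-1",
                    "columns": {"label": "com.example.known"}}),
    ]
    for host, label in new_launchd_items(sample, {"com.example.known"}):
        print(host, label)
```

This is only meant for experimenting with the logic; for a fleet of hosts, the Splunk alert remains the more practical place to run it.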

Take a look at the tables provided by osquery and the existing queries in query packs. They don’t all directly provide value, but they often provide data that can be used to aid detection in some way.

The biggest limit here is your creativity and the skills you have to transform the data available to you into something meaningful.

I’ll follow up in the near future with a part two, geared entirely around creating custom queries and detecting malicious activity.

In the meantime, if you’re hungry for more osquery knowledge, check out “Responding @ Scale — osquery for Mass Incident Detection & Response” by @sroberts and @bfist.