Bootstrapping StreamingPhish to Bro DNS Logs in RockNSM

Late last week, I encountered a scenario where I needed to collect Bro IDS logs from a small lab environment. Installing Bro from source isn’t the most trivial exercise so I opted for RockNSM, a network security monitoring and collection platform generously open-sourced by members of the Missouri National Guard Cyber Team (MOCYBER) that includes Bro IDS. It also gave me a good excuse to bootstrap it with StreamingPhish, a Python-based utility I open-sourced a few months ago that uses machine learning to detect fully-qualified domain names potentially used in phishing attacks. In this post, I’ll detail the steps I took to enrich every single DNS query with a “PhishScore”, map source IP addresses to client device names, build out a few Kibana dashboards, and configure real-time alerting to Slack!

Analytics dashboard in RockNSM showing the highest scoring FQDNs, along with client IP addresses and device names.

I did an offline ISO install of RockNSM v2.1, which was announced almost a month ago on August 24th, 2018. I had previously configured a SPAN session from my switch to copy all north-south traffic into the monitoring port on my server, so I had the whole system up and running in about 20 minutes. The plumbing begins by essentially hijacking/man-in-the-middling the Kafka topic that Bro uses to send all of the logs it generates downstream to logstash (see the obnoxious red arrow below):

Intercepting Bro logs in Kafka and enriching DNS logs with a “phish_score” value. Source: http://rocknsm.io/

Instead of letting Bro push logs straight to the default “bro-raw” topic that logstash consumes from, I’ll send them down a different topic: “bro-raw-orig”. I’ll configure the StreamingPhish application to consume from that topic, filter for DNS logs, evaluate each query, append the score to each DNS message, and forward everything on to “bro-raw”. Step one is to modify /usr/share/bro/site/scripts/rock/plugins/kafka.bro to write logs to a topic named “bro-raw-orig”:

Taking a detour to the “bro-raw-orig” topic.

Run sudo broctl deploy so that Bro picks up the configuration change.

Next, clone the StreamingPhish repository if you haven’t already. I’m going to make a handful of changes here:

1. Add redis as an additional container in the StreamingPhish project.

Key lookups in a redis cache run ~40x faster than StreamingPhish can evaluate a DNS query, and they’re computationally cheaper too. Double win. There’s no need to evaluate the same DNS query more than once with the same classifier, so we’ll only evaluate DNS queries that don’t exist in the cache (then add them to the cache immediately after producing a score). If we train a new classifier, we’ll manually flush the cache so everything gets re-analyzed. Let’s add a whopping 5 lines to streamingphish/docker-compose.yml:

This container gets downloaded and run when we start the StreamingPhish application.
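The cache-first idea from this step can be sketched in a few lines of Python. This is a minimal illustration rather than the code in the repository: the function name cached_score, the “score:” key prefix, and the evaluate callable (a stand-in for the classifier) are all hypothetical. The cache object only needs get and set methods, which a redis-py client provides.

```python
def cached_score(cache, fqdn, evaluate):
    """Return a score for fqdn, calling the classifier only on a cache miss.

    cache    -- any object with redis-style get(key) and set(key, value)
    evaluate -- hypothetical stand-in for the StreamingPhish classifier call
    """
    key = "score:" + fqdn.lower()      # hypothetical key naming scheme
    hit = cache.get(key)               # redis-py returns None for a missing key
    if hit is not None:
        # redis-py hands values back as bytes unless told otherwise
        text = hit.decode() if isinstance(hit, bytes) else hit
        return float(text)
    score = float(evaluate(fqdn))      # the expensive path, taken once per FQDN
    cache.set(key, score)
    return score
```

With redis-py, the cache could be created with redis.Redis(host="localhost", port=6379), and calling flushdb() on it after a retrain would force every query to be re-scored.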

2. Modify the “db” and “cli” containers to run in the “host” network mode.

The Kafka broker is running locally on the server on port 9092, so the “cli” container where our application code lives can’t connect to it if it’s in the default docker network. Add a field under the “db” and “cli” services named “network_mode” and set it to “host”:

Configure the StreamingPhish containers to run on the host network.

3. Update the location of the database in the application to point to localhost instead of “db”.

Again, since we’re not using the docker network, the “cli” container can’t communicate with the “db” container via its hostname. The database is running locally on the host server, so update the DB_HOST variable in streamingphish/cli/streamingphish/streamingphish/database.py to point to localhost:

Point to localhost instead of “db” — it’s still accessible on port 27017.

4. Add kafka-python as a requirement for the “cli” container.

Our Python-happy application will use a library called “kafka-python” to consume from the “bro-raw-orig” topic in the Kafka broker that’s running locally on port 9092. Add a one-liner, kafka-python==1.4.3, to the file at streamingphish/cli/requirements.txt. Docker-compose will install this new requirement when we run the application.
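For reference, wiring up kafka-python against the local broker might look roughly like the sketch below. The constant and function names are hypothetical; KafkaConsumer and KafkaProducer are the classes kafka-python provides. The import sits inside the function so the snippet can be read and loaded without a broker running.

```python
SOURCE_TOPIC = "bro-raw-orig"   # the detour topic Bro now writes to
SINK_TOPIC = "bro-raw"          # the topic logstash already consumes from
BROKER = "localhost:9092"       # local Kafka broker on the RockNSM host


def build_clients(broker=BROKER):
    """Create a consumer on the detour topic and a producer for the original one."""
    # Imported lazily; both classes ship with the kafka-python package.
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(SOURCE_TOPIC, bootstrap_servers=broker)
    producer = KafkaProducer(bootstrap_servers=broker)
    return consumer, producer
```

Iterating over the consumer yields records whose value attribute holds the raw message bytes, and producer.send(SINK_TOPIC, value) forwards a message downstream.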

5. Rewrite the entry-point of the application.

Instead of driving the application via the command-line interface offered by the PhishCLI() class, we’re going to admittedly hand-jam this a little bit. The entry point in our application will:

  • Train a classifier if one doesn’t exist, and load said classifier.
  • Connect to the local Kafka broker, create a consumer for the “bro-raw-orig” topic, and create a producer for the “bro-raw” topic.
  • Consume each Bro log and filter for DNS messages.
  • Check DNS queries to see if they exist in the cache.
  • Evaluate queries, append these messages with scores, and add them to the cache.
  • (Bonus, I’m also using the redis cache to enrich Bro messages with hostnames — more details below).
  • Send the messages from Bro on their way downstream to logstash!
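The consume, filter, score, and forward steps in this list can be sketched as follows. This is a simplified illustration under a few assumptions, not the actual entry point: it assumes each Kafka message is a JSON object, that a top-level “@stream” field names the Bro log type, and that the DNS query lives in the “query” field. The function names and the score_fn parameter (a stand-in for the cached classifier call) are hypothetical.

```python
import json


def enrich(raw, score_fn):
    """Add a phish_score to DNS messages; return everything else unchanged."""
    try:
        msg = json.loads(raw)
    except ValueError:
        return raw                       # not JSON: forward untouched
    if not isinstance(msg, dict):
        return raw
    # Assumption: "@stream" identifies the Bro log type in each message.
    if msg.get("@stream") == "dns" and msg.get("query"):
        msg["phish_score"] = score_fn(msg["query"])
        return json.dumps(msg).encode("utf-8")
    return raw


def relay(consumer, producer, score_fn, sink_topic="bro-raw"):
    """Read every Bro log, enrich DNS logs, and send all of them downstream."""
    for record in consumer:              # kafka-python records expose .value
        producer.send(sink_topic, enrich(record.value, score_fn))
```

Non-DNS logs pass through byte-for-byte, so logstash sees the same stream it always did, plus one extra field on DNS messages.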

Per the bonus exercise, you’ll notice I wrote some logic to keep track of IP address to MAC address mappings (dhcp.log), and MAC address mappings to hostnames (known_devices.log). The benefit is that now I can add the hostname to any log that has a source IP address (“id_orig_h” in RockNSM). And because the DHCP lease time is included in the Bro logs too, I can set my keys in redis to expire in lockstep with the DHCP leases. This is strictly optional, but triaging alerts is much more burdensome if they can’t be associated with workstations or users (easier said than done). Fortunately, I have all the data I need right in front of me.
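A rough sketch of that hostname bookkeeping is below. It assumes the messages have already been parsed into dicts, that “@stream” names the log type, and that the dhcp and known_devices logs carry the Bro 2.5 field names assigned_ip, mac, lease_time, and dhcp_host_name. The function names and the “ip:” and “mac:” key prefixes are hypothetical. The cache needs redis-style get and set methods, where set accepts an ex argument for expiry in seconds.

```python
def learn(cache, msg):
    """Remember IP->MAC from dhcp logs and MAC->hostname from known_devices logs."""
    stream = msg.get("@stream")          # assumption: log type lives here
    if stream == "dhcp" and msg.get("assigned_ip") and msg.get("mac"):
        lease = int(msg.get("lease_time") or 0)
        # Expire the mapping when the DHCP lease does; skip expiry if unknown.
        cache.set("ip:" + msg["assigned_ip"], msg["mac"], ex=lease or None)
    elif stream == "known_devices" and msg.get("mac") and msg.get("dhcp_host_name"):
        cache.set("mac:" + msg["mac"], msg["dhcp_host_name"])


def _text(value):
    """redis-py returns bytes by default; normalize to str (or None)."""
    return value.decode() if isinstance(value, bytes) else value


def add_workstation(cache, msg):
    """Attach a workstation name to any log that has a source IP address."""
    ip = msg.get("id_orig_h")
    if not ip:
        return msg
    mac = _text(cache.get("ip:" + ip))
    host = _text(cache.get("mac:" + mac)) if mac else None
    if host:
        msg["workstation"] = host
    return msg
```

Logs whose source IP has no known lease are simply forwarded without the extra field.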

SIDENOTE: The standard CLI tool is still available in the event you want to do retrains. Simply run sudo docker-compose exec cli python3, run from streamingphish.cli import PhishCLI, and instantiate the class with PhishCLI(). The new __main__.py entrypoint will pick up whichever classifier is activated in the configuration.

6. Load the known-devices-and-hostnames script and reload Bro.

Update /usr/share/bro/site/local.bro with a one-liner, “@load protocols/dhcp/known-devices-and-hostnames”, and run sudo broctl deploy afterwards so that Bro picks up the change. This script comes packaged with Bro 2.5 by default and activates the known_devices.log that we need for identifying hostnames.

7. Make a minor change to logstash to support the new workstation field.

Edit /etc/logstash/conf.d/logstash-500-filter-bro.conf and rename “workstation” to “[@meta][workstation]”:

Map “workstation” to “[@meta][workstation]”.

8. Cross your fingers you didn’t fat finger one of these steps, restart all RockNSM services, and run the application!

With all the changes we’ve made, it’s a good idea to restart services. This can be done with sudo rock_stop followed by sudo rock_start.

To run the application, navigate to where you checked out the streamingphish repository, invoke sudo docker-compose up -d --build and finally sudo docker-compose exec cli streamingphish. If you copy/pasted my code above, hopefully your output will look something like this:

At this point, visit the Kibana UI and refresh your indexes for Bro. You should see at least two new fields: dns.phish_score and @meta.workstation. Navigate to the Discover tab and you should see a “@meta.workstation” field with each log, and a “phish_score” field with each DNS log.

Building visualizations and adding them to a dashboard in Kibana should be pretty straightforward, so I’ll omit those procedures from this post. I’m happy to share the basic one I built if you’d like it.

The last step, if applicable, is configuring alerts to be sent to Slack/HipChat/Email/PagerDuty/whatever if, for example, a high “phish_score” is observed. This capability is provided courtesy of the Elastic X-Pack features, which are available as a 30-day trial if you don’t have a paid license. I’ve got my alerting configured to run every 30 seconds and send messages like the one below if a “phish_score” above 90% is seen. In this example, I pulled my phone out of my pocket and visited a fully-qualified domain name that didn’t exist but that I expected my classifier to flag. Even though the query didn’t resolve, I still got this alert in Slack 7 seconds later:

I even included a facepalm emoji on the alert :)

Having the ability to correlate the connection ID associated with my alert across multiple protocol analyzer logs with a single click is pretty powerful in my opinion, so I dynamically built the URL in the alert to do just that. Again, the extent of this activity was just an NXDOMAIN response in the DNS logs, but if it were a successful HTTP or SSL connection, I might be able to triage the activity faster than you could swipe left on Tinder:

I’ll omit specific instructions for configuring Slack alerts from this post too. The Elastic documentation below has some good pointers. If you get stuck, feel free to reach out to me and I’ll try to help.

Thanks for reading!