Linux Threat Hunting Primer — Part I

VerintCyberSec
Verint Cyber Engineering
Dec 8, 2019

By Shachar Roitman

Introduction

This post will discuss the main dilemmas regarding Linux threat hunting, the methodology of performing threat hunting for Linux systems and how to decide on the hunting vectors.

When we discuss threat hunting, we assume there is an attacker in the network and we’d like to actively search for them. To be efficient, this search should revolve around the system objects that are most likely to be abused. The main challenge in the threat hunting process is sifting through the huge amounts of data that we collect. Most Linux computers in a network are usually servers, which does not make this any easier: a server usually generates a lot of data.

The sheer amount of data makes finding anomalies hard. This is why in-depth knowledge of the system’s normal activity and of OS internals is essential.

At every stage of the hunting process, you need to stop and think about how to filter out the “known good” behaviors of the OS. For example, you can use knowledge about the standard paths for a binary or about what init runs by default. Another way is to baseline the normal behavior of a specific server in the network based on its role, common users and other distinctive features.
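Both filtering ideas can be sketched in a few lines of Python. This is only an illustration: the event fields (`host`, `role`, `exe`), the list of standard directories and the per-role baseline are assumptions made for the example, and in practice they would come from your own telemetry and from knowledge of your servers.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Assumption: directories where packaged binaries normally live on this fleet.
STANDARD_BIN_DIRS = ("/bin", "/sbin", "/usr/bin", "/usr/sbin", "/usr/local/bin")

# Assumption: a hand-built baseline of process names expected for each server role.
ROLE_BASELINE = {
    "web": {"nginx", "sshd", "cron"},
    "db": {"postgres", "sshd", "cron"},
}


def is_known_good(event: dict) -> bool:
    """True when the binary sits in a standard directory AND is expected for the role."""
    exe = PurePosixPath(event["exe"])
    in_standard_dir = str(exe.parent) in STANDARD_BIN_DIRS
    expected_for_role = exe.name in ROLE_BASELINE.get(event["role"], set())
    return in_standard_dir and expected_for_role


def filter_known_good(events: list[dict]) -> list[dict]:
    """Drop known-good events and keep the ones that still need a look."""
    return [e for e in events if not is_known_good(e)]


# Hypothetical process events, invented for the example.
events = [
    {"host": "web01", "role": "web", "exe": "/usr/sbin/nginx"},
    {"host": "web01", "role": "web", "exe": "/tmp/.cache/nginx"},
    {"host": "db01", "role": "db", "exe": "/usr/bin/nc"},
]
suspicious = filter_known_good(events)
```

Here the real nginx is filtered out, while a binary named nginx running from /tmp and a netcat process on a database server both remain for investigation. Requiring both conditions is a deliberately strict choice; a looser filter would hide more noise but also more attackers.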

Choosing a threat hunting vector

You have decided that you want to start hunting. You know that you want to focus on the (often neglected, yet targeted) Linux servers in your network.

How do you decide what to hunt for?

A good starting point is to read blogs and articles about Linux-based attacks and, of course, the MITRE ATT&CK matrix, which dedicates a section to Linux threats, mapped to the cyber-attack kill chain.

During our research, we combined data from the MITRE ATT&CK matrix with other statistical reports about Linux malware published in recent years. Combining this data allowed us to visualize the statistical distribution of Linux-based attacks.

First, we’ll look at how the MITRE ATT&CK techniques for Linux connect to the different stages of the attack kill chain.

Figure 1: MITRE ATT&CK techniques by tactics graph

This graph presents the “world” of the MITRE ATT&CK matrix, including all of the techniques and tactics and how they are connected. We’d like to find the minimum number of tactics (red circles) that covers the maximum number of techniques (blue circles). The connections between the entities in the graph show that hunting for a small group of tactics is enough to achieve good coverage.
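The coverage question above can be sketched as a greedy selection: repeatedly pick the tactic that adds the most techniques not yet covered. This is a standard approximation for coverage problems and not necessarily how the graph was analyzed in our research. The mapping below is a small illustrative subset built from technique names that appear in this post, not the full ATT&CK data.

```python
def greedy_tactic_cover(tactic_to_techniques, max_tactics):
    """Pick up to max_tactics tactics, each time taking the largest coverage gain."""
    covered = set()
    chosen = []
    remaining = dict(tactic_to_techniques)
    while remaining and len(chosen) < max_tactics:
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        gain = remaining.pop(best) - covered
        if not gain:
            break  # no remaining tactic adds anything new
        chosen.append(best)
        covered |= gain
    return chosen, covered


# Illustrative subset only (assumption): a few techniques per tactic.
tactic_to_techniques = {
    "Persistence": {
        "Hidden Files and Directories", "Create Account", "Valid Accounts",
        "Local Job Scheduling", "Web Shell",
    },
    "Privilege Escalation": {
        "Process Injection", "Valid Accounts", "Web Shell", "Sudo Caching",
    },
    "Defense Evasion": {"File Deletion", "Process Injection", "Masquerading"},
    "Execution": {"Scripting", "Local Job Scheduling"},
}

chosen, covered = greedy_tactic_cover(tactic_to_techniques, max_tactics=2)
all_techniques = set().union(*tactic_to_techniques.values())
print(chosen, f"{len(covered)}/{len(all_techniques)} techniques covered")
```

Because techniques such as Valid Accounts and Process Injection belong to more than one tactic, a second or third tactic often adds less than its size suggests, which is why a small group of tactics can already give good coverage.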

The next graph depicts techniques that were used in cyberattacks in 2018 against Linux-based systems, divided into MITRE ATT&CK tactics.

Figure 2: Techniques used in cyberattacks in 2018 against Linux-based systems divided into MITRE ATT&CK tactics

From the statistical distribution in both graphs (Figure 1, Figure 2), and considering our focus on servers, we can conclude that hunting for three tactics, (1) Persistence, (2) Defense Evasion and (3) Privilege Escalation, gives good coverage of the commonly used techniques.

Our hunt will focus on the tactics with the highest ROI, that is, tactics that are most likely to be used by malicious tools and malware and that require the least investment to hunt for.
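One simple way to make the ROI idea concrete is to score each candidate by expected value and by required effort, then sort by the ratio. The sketch below uses made-up scores on a 1 to 5 scale purely for illustration; they are not the numbers from our research, and `HuntCandidate` is a hypothetical helper type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HuntCandidate:
    name: str
    value: int   # expected detection value, 1 (low) to 5 (high); made-up scale
    effort: int  # research and testing effort, 1 (low) to 5 (high); made-up scale

    @property
    def roi(self) -> float:
        return self.value / self.effort


def rank_by_roi(candidates):
    """Highest value-per-effort first; ties are broken alphabetically by name."""
    return sorted(candidates, key=lambda c: (-c.roi, c.name))


# Hypothetical scores, invented for the example.
candidates = [
    HuntCandidate("Exfiltration", value=3, effort=4),
    HuntCandidate("Persistence", value=5, effort=2),
    HuntCandidate("Defense Evasion", value=4, effort=2),
]
ranked = [c.name for c in rank_by_roi(candidates)]
```

The scores themselves are a judgment call; the point of writing them down is that the ordering becomes explicit and can be revisited as you learn more about your environment.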

Most common attacks and techniques

The following lists describe the most common attacks, techniques and tactics. The techniques are prioritized by the research and testing effort that is required to hunt for them versus the value they give. To those, we also added: common auto start locations, commonly targeted directories and living off the land binaries (using data from https://gtfobins.github.io). Please note that all of the technique and tactic names below are taken from the MITRE ATT&CK framework.

Most common techniques (Ordered by priority):

  1. Scripting
  2. Credential Dumping
  3. User Execution
  4. Obfuscated Files or Information
  5. Spearphishing Attachment
  6. Command-Line Interface
  7. Standard Application Layer Protocol
  8. Remote File Copy
  9. Valid Accounts
  10. File Deletion

Linux persistence techniques:

  1. Hidden Files and Directories
  2. Create Account
  3. Valid Accounts
  4. Local Job Scheduling
  5. Web Shell
  6. Bootkit
  7. Port Knocking
  8. .bash_profile and .bashrc
  9. Systemd Service
  10. Setuid and Setgid
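Two entries from this list, Hidden Files and Directories and Setuid and Setgid, can be hunted for with a plain file system walk. The sketch below uses only the Python standard library; the function names are ours, and scanning a live server this way is slow compared with querying data already collected in a data lake or EDR, so treat it as an illustration of what to look for.

```python
import os
import stat


def is_setid(mode: int) -> bool:
    """True for a regular file that has the setuid or setgid bit set."""
    return stat.S_ISREG(mode) and bool(mode & (stat.S_ISUID | stat.S_ISGID))


def find_setuid_setgid(root: str) -> list:
    """Return regular files under root with the setuid or setgid bit set."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is not readable
            if is_setid(mode):
                hits.append(path)
    return sorted(hits)


def find_hidden(root: str) -> list:
    """Return files and directories under root whose name starts with a dot."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name.startswith("."):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Both functions will return plenty of legitimate results (passwd and sudo are setuid, and home directories are full of dotfiles), so the output only becomes useful after comparing it with a baseline of what is expected on that server.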

Linux privilege escalation techniques:

  1. Process Injection
  2. Valid Accounts
  3. Exploitation for Privilege Escalation
  4. Web Shell
  5. Sudo Caching

Linux execution techniques:

  1. Command-Line Interface
  2. Scripting
  3. User Execution
  4. Exploitation for Client Execution
  5. Local Job Scheduling

Linux exfiltration techniques:

  1. Data Encrypted
  2. Data Compressed
  3. Exfiltration Over Command and Control Channel
  4. Exfiltration Over Alternative Protocol
  5. Scheduled Transfer

Linux defense evasion techniques:

  1. File Deletion
  2. Process Injection
  3. Masquerading
  4. Disabling Security Tools
  5. Timestomp
  6. Indicator Removal on Host
  7. Binary Padding
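As an example of hunting for File Deletion, a process that keeps running after its executable was removed from disk can be spotted on Linux through the `/proc/<pid>/exe` symbolic link: when the file has been unlinked, the link target has ` (deleted)` appended. The sketch below is a minimal local check; the function names are ours, and the `proc_root` parameter exists only so the logic can be exercised against a fake directory tree.

```python
import os

DELETED_SUFFIX = " (deleted)"


def exe_was_deleted(link_target: str) -> bool:
    """True when a /proc/<pid>/exe link target is marked as unlinked."""
    return link_target.endswith(DELETED_SUFFIX)


def processes_with_deleted_exe(proc_root: str = "/proc") -> list:
    """Return (pid, link_target) pairs for processes whose binary was deleted."""
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "cpuinfo"
        try:
            target = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            continue  # kernel thread, exited process, or insufficient privileges
        if exe_was_deleted(target):
            hits.append((int(entry), target))
    return sorted(hits)
```

Expect false positives: a service that is still running after a package upgrade replaced its binary shows up the same way, so results should be checked against recent update activity before being treated as suspicious.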

A Healthy Threat Hunting Process

After choosing the hunt vector and a specific tactic, we will perform the following steps for each technique:

  1. Familiarize ourselves with the organization’s network
  2. Research the standard behavior of Linux internals
  3. Learn how to filter out the standard activity specific to the organization (generated by known software, scripts, etc.)
  4. Research the anomalous behavior that we’d like to hunt for
  5. Search for the anomalous behavior in the network
  6. Create fine-tuned threat-hunting queries and investigate the query output

It’s important to note that this is an iterative process, so we might need to revisit the research steps several times until the query output gives the precise amount of data needed (specific enough to avoid very large result sets, but wide enough not to miss any suspicious activity).
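One common way to tune the size of a result set, offered here as an illustration rather than as the method used in this series, is frequency analysis (sometimes called stacking): count on how many hosts each value appears and surface only the rare ones. The `max_hosts` threshold is the knob that gets adjusted between iterations. The records below are hypothetical (host, scheduled command) pairs invented for the example.

```python
from collections import defaultdict


def rare_values(records, max_hosts):
    """Return values that appear on at most max_hosts distinct hosts."""
    hosts_per_value = defaultdict(set)
    for host, value in records:
        hosts_per_value[value].add(host)
    return sorted(v for v, hosts in hosts_per_value.items() if len(hosts) <= max_hosts)


# Hypothetical scheduled-job data, invented for the example.
records = [
    ("web01", "/usr/sbin/logrotate"),
    ("web02", "/usr/sbin/logrotate"),
    ("db01", "/usr/sbin/logrotate"),
    ("web01", "/usr/bin/certbot renew"),
    ("web02", "/usr/bin/certbot renew"),
    ("db01", "/dev/shm/.x/run.sh"),
]

# Start narrow, then widen and watch how the result set grows.
for threshold in (1, 2, 3):
    print(threshold, rare_values(records, max_hosts=threshold))
```

With a threshold of 1 only the job running from /dev/shm is returned; raising it to 2 also pulls in the certbot job, which exists only on the web servers. Each iteration is a trade-off between a result set small enough to review and one wide enough not to miss activity that an attacker has spread across a few hosts.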

In the next post (Linux Threat Hunting Part II), I’m going to illustrate and explain the process of threat hunting using an example of a specific technique. Threat hunting queries should be implemented and run on top of the data lake or EDR solution that you already use for monitoring the network. Be aware that without the ability to query all of the data, it will be hard or even impossible to perform threat hunting across a full network.

I hope that this post helped you understand how to begin the process of threat hunting and how to decide which initial investment will have the best returns.

Thanks to Oren Biderman and Michael Gendelman for reviewing this post and providing useful suggestions.
