Expiring IOCs in Entity Graph

14 min read · Mar 20, 2024

In this post, I explore how to expire Indicators of Compromise (IOCs) within Chronicle SIEM’s Entity Graph. The focus is on YARA-L detection rules that match against your Cyber Threat Intelligence (CTI) feeds, and on techniques such as time-based expiration and score-based thresholds that improve detection accuracy.

⏩ If you’re familiar with Entity Graph and IOCs you can skip ahead to the test harness section.

Entity Graph

Fundamentals

Within Chronicle SIEM’s Unified Data Model (UDM) there are two sub-models that represent:

i) Event data
- data parsed from logs
ii) Entity data
- context parsed from CMDBs or LDAPs

Examples of Event versus Entity data represented in UDM

Context sources within Entity Graph are defined as either:

ENTITY_CONTEXT
- User provided context sources, e.g., Azure AD, OKTA
DERIVED_CONTEXT
- Generated by Chronicle based upon your data, e.g., Prevalence, First Seen
GLOBAL_CONTEXT
- Added by Google, e.g., SafeBrowsing, WHOIS, ToR, etc…

📝 If programmatically creating Entities via the Ingestion API, do not set the context_type. While doing so will pass validation in the API, and the record will be indexed, it will not be loaded into Entity Graph.

Entities in the Entity Graph must be of a valid Entity Type:

|------------- |----------- |
| Entity Type  | Entity ID  |
|------------- |----------- |
| ASSET        | 1          |
| DOMAIN_NAME  | 5          |
| FILE         | 4          |
| GROUP        | 10001      |
| IP_ADDRESS   | 3          |
| METRIC       | 8          |
| MUTEX        | 7          |
| RESOURCE     | 2          |
| URL          | 6          |
| USER         | 10000      |
|------------- |----------- |

Entity Graph supports the following Entity Types for IOCs:

|-------------- |------------------- |
| Entity Types  | Sub Types          |
|-------------- |------------------- |
| DOMAIN_NAME   |                    |
|-------------- |------------------- |
| FILE          | - MD5              |
|               | - SHA1             |
|               | - SHA256           |
|               | - File Name        |
|-------------- |------------------- |
| IP_ADDRESS    |                    |
|-------------- |------------------- |
| RESOURCE      | - Registry Keys    |
|               | - User Agents      |
|-------------- |------------------- |
| URL           |                    |
|-------------- |------------------- |
| USER          | - Email Addresses  |
|-------------- |------------------- |

IOCs stored in Entity Graph will be of either type:

ENTITY_CONTEXT
- user provided CTI sources, such as MISP, Anomali, etc…, or
GLOBAL_CONTEXT
- Google SecOps provided CTI sources, such as Mandiant Fusion

Global context types cannot be searched via Raw Log Search (legacy) or the new native Dashboards (preview), nor are they exported to the Chronicle Datalake (aka BigQuery). This is by design as otherwise the entirety of these feeds would be exportable, and Google would be giving them away for free. You can however see the matching Entity Graph IOC in Curated Detections, e.g., Applied Intel rule packs.

Expiration of Entity Graph entities

Within Entity Graph there are in effect two sub-types of Entities, although these are not documented (as far as I know):

User, Asset or Resource Entities, aka non IOC Entities
- expire after +/- 5 days from date of Ingestion
IOC Entities
- require the population of at least one metadata.threat key
- expire based on the metadata.interval.end_time key

User, Asset, or Resource Entities (ENTITY_CONTEXT or DERIVED_CONTEXT) are created from Context sources, such as Azure AD, OKTA, Cloud Identity, etc… and will automatically expire after +/- 5 days from the date of ingestion (metadata.ingestion_timestamp).

However, if your Entity Graph context record includes any metadata.threat field, then that context record will not age out after +/- 5 days, but rather will use the metadata.interval.start_time and metadata.interval.end_time values. If you do not specify a metadata.interval.end_time then the Entity will be active until the data ages out, i.e., your online retention duration.
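The two expiry behaviours described above can be modelled in a few lines of Python. This is only a sketch of the observed behaviour, not how Chronicle implements it: the function names, the fixed five-day constant (the text only says "+/- 5 days"), and the use of None to mean "active until data retention ages the record out" are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the article quotes "+/- 5 days"; the exact value is not documented.
NON_IOC_TTL = timedelta(days=5)
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp such as 2024-03-18T10:05:43.599027Z."""
    return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)


def effective_expiry(entity: dict, ingested_at: datetime) -> Optional[datetime]:
    """Return when an entity stops being active, or None if it has no expiry
    of its own (it then lives until the tenant's data retention removes it)."""
    metadata = entity.get("metadata", {})
    if metadata.get("threat"):
        # IOC entity: governed by metadata.interval.end_time, if present.
        end_time = metadata.get("interval", {}).get("end_time")
        return parse_ts(end_time) if end_time else None
    # Non-IOC entity (user, asset, resource): ages out roughly five days
    # after ingestion.
    return ingested_at + NON_IOC_TTL
```

Calling effective_expiry() with a USER entity ingested on 18 March returns 23 March, while an IOC entity without an end_time returns None.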

📝 I’ve written on the topic of Entity Graph in prior posts, such as IOC matching in Chronicle SIEM, Aliasing in Chronicle SIEM, and Enrichment in Chronicle SIEM. The Entity Graph is also touched upon in the official Google SecOps documentation, Create context-aware analytics, and How Chronicle enriches event and entity data.

A Test!

In this section I run a series of tests, document the results, and provide my findings on how IOC entities in Entity Graph are observed to work.

Results & Findings

Before presenting the tests and results, the summary findings are as follows:

A metadata.threat field is required in order to create an IOC entity in Entity Graph
A multi-event YARA-L Detection Rule will only match against an IOC entity where the UDM Event data is within the metadata.interval.start_time and metadata.interval.end_time
Neither metadata.product_entity_id nor metadata.threat.threat_id are mandatory fields for creating an IOC entity
- this is important as omitting both of these can lead to creating non-expiring IOC entities that cannot be invalidated
metadata.interval.start_time and metadata.collected_timestamp are mandatory fields, and omitting these will result in an error submitting an IOC entity via API
If an IOC Entity does not include a metadata.interval.end_time then the IOC will not expire
If an IOC Entity includes neither metadata.product_entity_id nor metadata.threat.threat_id then it is not possible to invalidate the IOC
- however, once the IOC entity goes beyond the tenant’s data retention period it will be deleted, e.g., usually 12 months
Updates to IOC entities in Entity Graph can take in excess of 12 hours to become active
- if you need to immediately invalidate a matching IOC in a YARA-L Detection Alert you need to use an exclusion Reference List
- the Chronicle SIEM product road-map does include work to reduce this latency
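The findings above lend themselves to a pre-flight check run before submitting an IOC entity. The following Python sketch is a hypothetical helper (check_ioc_entity is not part of any Chronicle SDK) that encodes those findings as errors and warnings. It uses the product_entity_id field name as it appears in the sample payloads in this post.

```python
def check_ioc_entity(entity: dict) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for an IOC entity dict, per the findings above."""
    metadata = entity.get("metadata", {})
    interval = metadata.get("interval", {})
    threat = metadata.get("threat", {})
    errors: list[str] = []
    warnings: list[str] = []

    # Conditions that stop the record from becoming an IOC entity at all.
    if not threat:
        errors.append("no metadata.threat field: not treated as an IOC entity")
    if "start_time" not in interval:
        errors.append("metadata.interval.start_time is mandatory")
    if "collected_timestamp" not in metadata:
        errors.append("metadata.collected_timestamp is mandatory")

    # Conditions that are accepted, but make the IOC hard to manage later.
    if "end_time" not in interval:
        warnings.append("no metadata.interval.end_time: the IOC will not expire")
    if "product_entity_id" not in metadata and "threat_id" not in threat:
        warnings.append("no product_entity_id or threat_id: the IOC cannot be invalidated")
    return errors, warnings


# Example: an entity with a threat_name only, and no end_time.
errors, warnings = check_ioc_entity({
    "metadata": {
        "collected_timestamp": "2024-03-19T09:16:37.927301Z",
        "interval": {"start_time": "2024-03-18T09:16:37.927332Z"},
        "threat": {"threat_name": "ACME Test IOC"},
    },
    "entity": {"ip": ["203.0.113.170"]},
})
```

For that example the check produces no errors and two warnings, which is the non-expiring, non-invalidatable case to avoid.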

The Test Harness

A variety of IOCs will be ingested into Entity Graph with each IOC having unique identifiers to test the requirement of specific fields for creating IOC entities.

Matching YARA-L Detection rules will be created, and UDM events replayed for each IOC using the Chronicle Ingestion API.

The results of this process will be recorded to understand the specific conditions that trigger a Detection Alert based on the presence of an IOC, and when they don’t.

Creating the IOC Entities

1. metadata.product_entity_id and metadata.threat.threat_id

{
    "metadata": {
        "product_entity_id": "88bfa102-aba2-47a5-96c6-0aa9ab2de593",
        "vendor_name": "ACME",
        "product_name": "CTI",
        "entity_type": "IP_ADDRESS",
        "description": "test_run_3a_ip",
        "collected_timestamp": "2024-03-18T10:05:43.599027Z",
        "interval": {
            "start_time": "2024-03-17T10:05:43.599064Z",
            "end_time": "2024-03-28T10:05:43.600188Z"
        },
        "threat": {
            "threat_id": "1ff6e018-ae7a-4991-bb57-67dc5909e581"
        }
    },
    "entity": {
        "ip": [
            "203.0.113.130"
        ]
    }
}

2. metadata.product_entity_id only


{
    "metadata": {
        "product_entity_id": "7721e148-e2c6-4931-8583-afed0bdf99ab",
        "vendor_name": "ACME",
        "product_name": "CTI",
        "entity_type": "IP_ADDRESS",
        "description": "test_run_3b_ip",
        "collected_timestamp": "2024-03-18T10:08:19.123167Z",
        "interval": {
            "start_time": "2024-03-17T10:08:19.123210Z",
            "end_time": "2024-03-28T10:08:19.123318Z"
        },
        "threat": {
            "threat_name": "ACME Test IOC"
        }
    },
    "entity": {
        "ip": [
            "203.0.113.131"
        ]
    }
}

3. metadata.threat.threat_id only

{
    "metadata": {
        "vendor_name": "ACME",
        "product_name": "CTI",
        "entity_type": "IP_ADDRESS",
        "description": "test_run_3b_ip",
        "collected_timestamp": "2024-03-18T10:08:38.352372Z",
        "interval": {
            "start_time": "2024-03-17T10:08:38.352397Z",
            "end_time": "2024-03-28T10:08:38.352471Z"
        },
        "threat": {
            "threat_id": "64192c24-d8bc-4b85-bff6-f97889ced840"
        }
    },
    "entity": {
        "ip": [
            "203.0.113.132"
        ]
    }
}

4. No metadata.interval.end_time, metadata.product_entity_id, or metadata.threat.threat_id

{
    "metadata": {
        "vendor_name": "ACME",
        "product_name": "CTI",
        "entity_type": "IP_ADDRESS",
        "description": "test_run_3e_ip",
        "collected_timestamp": "2024-03-19T09:16:37.927301Z",
        "interval": {
            "start_time": "2024-03-18T09:16:37.927332Z"
        },
        "threat": {
            "threat_name": "ACME Test IOC"
        }
    },
    "entity": {
        "ip": [
            "203.0.113.170"
        ]
    }
}
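Payloads like the four above are tedious to write by hand, so a small generator helps when repeating the tests. The Python sketch below builds the same shape of entity with a start time of now minus one day and an optional end time. The function name and its parameters are hypothetical, and it only builds the dictionary: submitting it to the Ingestion API is deliberately left out.

```python
import json
import uuid
from datetime import datetime, timedelta, timezone
from typing import Optional

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def build_ioc_entity(ip: str, description: str, *,
                     with_entity_id: bool = True,
                     with_threat_id: bool = True,
                     valid_days: Optional[int] = 10,
                     now: Optional[datetime] = None) -> dict:
    """Build an IP_ADDRESS IOC entity shaped like the test payloads above."""
    now = now or datetime.now(timezone.utc)
    interval = {"start_time": (now - timedelta(days=1)).strftime(TS_FORMAT)}
    if valid_days is not None:
        # Omit end_time entirely to create a non-expiring IOC entity.
        interval["end_time"] = (now + timedelta(days=valid_days)).strftime(TS_FORMAT)

    # At least one metadata.threat field is needed for an IOC entity.
    if with_threat_id:
        threat = {"threat_id": str(uuid.uuid4())}
    else:
        threat = {"threat_name": "ACME Test IOC"}

    metadata = {
        "vendor_name": "ACME",
        "product_name": "CTI",
        "entity_type": "IP_ADDRESS",
        "description": description,
        "collected_timestamp": now.strftime(TS_FORMAT),
        "interval": interval,
        "threat": threat,
    }
    if with_entity_id:
        metadata["product_entity_id"] = str(uuid.uuid4())
    return {"metadata": metadata, "entity": {"ip": [ip]}}


if __name__ == "__main__":
    # Scenario 4: no end_time, no product_entity_id, no threat_id.
    print(json.dumps(build_ioc_entity("203.0.113.170", "test_run_3e_ip",
                                      with_entity_id=False, with_threat_id=False,
                                      valid_days=None), indent=4))
```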

Creating the YARA-L Detection Rules

For each IOC Entity a matching YARA-L Detection Engine rule will be created. For testing purposes the $e event section of the YARA-L rule is hard coded to match specific values.

rule ioc_invalidation_test_3a {
  meta:
    author = "cmmartin@"
    description = "Test harness for invalidating IOCs via Entity Graph"
    severity = "INFORMATIONAL"

  events:
    //$e.metadata.description = "test_run_3a_ip"
    $e.metadata.event_type = "NETWORK_CONNECTION"
    $e.metadata.product_name = "EDR"
    $e.metadata.vendor_name = "ACME"
    $e.principal.ip = "203.0.113.10"
    $e.principal.ip = $asset    
    $e.target.ip = "203.0.113.130"
    $e.target.ip = $ip

    $ioc.graph.metadata.vendor_name = "ACME"
    $ioc.graph.metadata.product_name = "CTI"
    $ioc.graph.metadata.entity_type = "IP_ADDRESS"
    $ioc.graph.entity.ip = $ip

  match:
    $asset over 1m

  outcome:
    $risk_score = 0

  condition:
    $e and $ioc
}

Note, the invalidation part will be covered in more detail later on in the article.

Creating the trigger UDM Events

Corresponding UDM Events will be replayed via the Ingestion API, each with a metadata.event_timestamp that falls either within or outside the Entity’s metadata.interval.start_time and metadata.interval.end_time. This tests how Intervals within Entity Graph affect the generation of a YARA-L Detection Alert.

Example UDM Event replayed to match the IOC Entities, and generated a Detection Alert
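To make the replay step concrete, here is a Python sketch that builds a trigger event carrying only the fields the test rule filters on, plus a check for whether its timestamp sits inside an IOC entity's interval. The field values come from the rule above. The helper names are hypothetical, and treating both interval boundaries as inclusive is an assumption, since the exact boundary behaviour was not tested.

```python
from datetime import datetime, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp such as 2024-03-18T10:05:43.599027Z."""
    return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)


def build_test_event(event_time: datetime, target_ip: str) -> dict:
    """Build a minimal UDM event that the ioc_invalidation_test rule filters on."""
    return {
        "metadata": {
            "event_timestamp": event_time.strftime(TS_FORMAT),
            "event_type": "NETWORK_CONNECTION",
            "vendor_name": "ACME",
            "product_name": "EDR",
        },
        "principal": {"ip": ["203.0.113.10"]},
        "target": {"ip": [target_ip]},
    }


def within_interval(event: dict, entity: dict) -> bool:
    """True if the event timestamp falls inside the entity's interval.
    A missing end_time is treated as an open-ended (non-expiring) interval."""
    ts = parse_ts(event["metadata"]["event_timestamp"])
    interval = entity["metadata"]["interval"]
    start = parse_ts(interval["start_time"])
    end = interval.get("end_time")
    return start <= ts and (end is None or ts <= parse_ts(end))
```

An event built for a time inside the interval is expected to raise a Detection Alert, and one built for a time after end_time is not.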

Running the tests… what happened?

For the initial test, a YARA-L Detection Alert was generated where the UDM event timestamp is within the IOC entity metadata.interval.start_time and metadata.interval.end_time. This is the expected result.

YARA-L Detection Alerts matching the IOC Entities during the metadata.interval time range

Replaying UDM events outside of the IOC Entity metadata.interval.end_time does not result in a YARA-L Detection Alert being generated. This is the expected result.

No YARA-L Detection Alerts generated as the UDM Events fall outside the valid metadata.interval time range

For the next stage of the test the same IOC entities are replayed but with a greater interval range, now() - 1 day to now() + 10 days. These IOC entities will be observed to see if they supersede the prior versions, specifically how metadata.product_entity_id and metadata.threat.threat_id behave.

{
    "metadata": {
        "product_entity_id": "88bfa102-aba2-47a5-96c6-0aa9ab2de593",
        "vendor_name": "ACME",
        "product_name": "CTI",
        "entity_type": "IP_ADDRESS",
        "description": "test_run_3a_ip",
        "collected_timestamp": "2024-03-18T10:05:43.599027Z",
        "interval": {
            "start_time": "2024-03-17T10:05:43.599064Z",
            "end_time": "2024-03-28T10:05:43.600188Z"
        },
        "threat": {
            "threat_id": "1ff6e018-ae7a-4991-bb57-67dc5909e581"
        }
    },
    "entity": {
        "ip": [
            "203.0.113.130"
        ]
    }
}

Matching UDM events are replayed, and YARA-L Detection Alert generation is verified using the newer IOC entity record. This is the expected result.

YARA-L Detection Alerts matching the updated IOC Entities with a 10 day interval time range

This then leads onto the topic of what happens if an IOC doesn’t have an end date?

Non-expiring IOCs

The metadata.interval.end_time does not appear to be a mandatory field, which means you can create an IOC entity without an end date.

{
    "metadata": {
        "vendor_name": "ACME",
        "product_name": "CTI",
        "entity_type": "IP_ADDRESS",
        "description": "test_run_3d_ip",
        "collected_timestamp": "2024-03-18T21:40:47.822121Z",
        "interval": {
            "start_time": "2024-03-17T21:40:47.822167Z"
        },
        "threat": {
            "threat_id": "cbaf34b3-24fb-4eb7-8210-5e467280c3da"
        }
    },
    "entity": {
        "ip": [
            "203.0.113.160"
        ]
    }
}

If metadata.interval.end_time is omitted then Chronicle SIEM will automatically set the end date to December 31st 9999.

There are scenarios where having non-expiring IOCs makes sense, such as File Hashes or Registry Keys which, once known to be malicious, will always be malicious. IP addresses and Domains, on the other hand, are often used temporarily as part of a campaign, i.e., they get cleaned up and are no longer malicious or active a short time after discovery.

So what happens if you have a non-expiring IOC, without a metadata.interval.end_time, generating Detection Alerts that you wish to prevent?

Invalidating IOCs

To invalidate an IOC in Entity Graph is straightforward in concept: ingest a newer version of that IOC with an end date of now.

Example of updating an IOC entity to invalidate the prior version
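As a sketch of that update step, the Python below copies an existing IOC entity and sets its end time to now, keeping the identifying fields unchanged so that the new record supersedes the old one. The function name is hypothetical, refreshing collected_timestamp on the new version is an assumption, and the guard reflects the finding that an entity without product_entity_id or threat_id cannot be invalidated.

```python
import copy
from datetime import datetime, timezone
from typing import Optional

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def invalidate_ioc(entity: dict, now: Optional[datetime] = None) -> dict:
    """Return a new version of the entity whose interval ends now."""
    metadata = entity["metadata"]
    threat = metadata.get("threat", {})
    if "product_entity_id" not in metadata and "threat_id" not in threat:
        # Without an identifying field there is nothing to tie the new
        # record to the old one, so an update would not supersede it.
        raise ValueError("entity has no product_entity_id or threat_id")

    stamp = (now or datetime.now(timezone.utc)).strftime(TS_FORMAT)
    updated = copy.deepcopy(entity)  # leave the caller's record untouched
    updated["metadata"]["collected_timestamp"] = stamp  # assumption: refresh this too
    updated["metadata"]["interval"]["end_time"] = stamp
    return updated
```

The returned dictionary would then be ingested in the same way as the original record.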

From observation it took upwards of 12 hours before an updated version of an IOC entity record was ‘loaded’ into Entity Graph, and evaluated by a YARA-L Detection rule.

After then replaying matching UDM events, Detection Alerts were not observed for scenarios a, b and c, and were observed for scenarios d and e. Both outcomes were as expected.

|------|-------------------|------------------|----------|
| Test | product_entity_id | threat.threat_id | end_time |
|------|-------------------|------------------|----------|
| a    | x                 | x                | x        |
|------|-------------------|------------------|----------|
| b    | x                 |                  | x        |
|------|-------------------|------------------|----------|
| c    |                   | x                | x        |
|------|-------------------|------------------|----------|
| d    |                   | x                |          |
|------|-------------------|------------------|----------|
| e    |                   |                  |          |
|------|-------------------|------------------|----------|

This leads to the conclusion that metadata.product_entity_id and metadata.threat.threat_id act as primary keys, and can be used to create versions of an entity (a, b and c).

For scenario (d) and (e) no end date was specified in the IOC entity, i.e., a non-expiring IOC entity, and so Detection Alerts were expected.

How do you then expire an IOC entity without a primary key field?

As far as I can tell, you cannot expire an IOC entity in Entity Graph that does not have a primary key; however, this is not an expected scenario as TIP providers provide a unique identifier per IOC, and the Chronicle SIEM default integrations apply that as either metadata.product_entity_id or metadata.threat.threat_id.

📝 If you have or find a parser instance that does not populate these values then this should be raised back to Chronicle via Support.

If however you find yourself in this scenario then you can look to use Reference Lists as an exclusion list within a YARA-L Detection rule, e.g.,


rule misp_ioc_target_ip_observed_connection {

  meta:
    author = "thatsiemguy@"
    owner = "secops-t3@" 
    description = "Matches Network Target IP event data against MISP IP IOCs, specifically looking for observed connections.  If you wish to evaluate IOAs then consider duplicating and using the UDM SRC field."
    response = "Evaluate the IOC match in MISP to determine the Feed source, MISP Tags, and Severity "
    severity = "LOW"
    priority = "LOW"
    // Optional, but desired meta fields
    mitre_mitigation = "M1019"
    misp_types = "ip-src, ip-dst"

  events:
    (
        $event.metadata.event_type = "NETWORK_CONNECTION" or
        $event.metadata.event_type = "NETWORK_HTTP" or
        $event.metadata.event_type = "NETWORK_FTP" or
        $event.metadata.event_type = "NETWORK_SMTP" or
        $event.metadata.event_type = "NETWORK_FLOW" or        
        $event.metadata.event_type = "NETWORK_UNCATEGORIZED"
    )
    $event.target.ip != "" and $event.target.ip = $ip
    $asset = strings.coalesce($event.principal.hostname, $event.principal.ip)

    $misp.graph.metadata.vendor_name = "misp-project.org"
    $misp.graph.metadata.product_name = "MISP Threat Sharing"
    $misp.graph.metadata.entity_type = "IP_ADDRESS"
    $misp.graph.entity.ip = $ip
    // used to not alert on tagged false positives in MISP TIP
    not any $misp.graph.metadata.threat.category_details = "false_positive"
    // exclusion filtering, aka short term fix
    not $misp.graph.entity.ip in %misp_ioc_target_ip_observed_connection_exclusions

  match:
    $asset over 1m

...

Alternatively, as IOC entities honor your data retention period, they’ll become inactive in one year…

What about the automated IOC matching in Chronicle SIEM?

Chronicle SIEM includes automated IOC matching that can be used for artifacts of type Hash, IP Address, or Domain Name. The automated IOC matching does honor the start and end times, if available, of an IOC.

However, just as above, this is also an important consideration when it comes to non-expiring IOCs, e.g., if you ingest an errant IOC without an expiration date then the automated matching is going to keep matching it over and over.

Get an alert every time you visit Google.com…

Approach to expiring IOCs

Not all IOCs are of equal Detection value; certain Indicator types have a short Time To Live (TTL) and should expire (decay) at a quicker rate.

Common approaches for aging out IOCs include:

using a numerical score threshold
using an end date
using a combination of both

The approach for decaying IOCs (and the terminology) varies between TIPs. For example, Mandiant uses a confidence and threat score approach where, as the TIP provider, they reduce the threat score over time based on observations, whereas MISP uses a base_score with decay modelling.

IOCs can include a first seen and last seen value, and within UDM an IOC entity should populate metadata.threat.first_discovered_time and metadata.threat.last_discovered_time from these values. While using the first seen value as the metadata.interval.start_time can be a suitable approach, it is not recommended to use the last seen value for the metadata.interval.end_time as this will be a date in the past, and may prevent a Detection or Detection Alert from being generated.

If using a Chronicle parser for normalizing CTI into an IOC entity there is no date function, so you can’t perform a calculated operation such as now() + x days. This creates a challenge in that you have to either a) use the last seen date, accepting that this may prevent raising a Detection if the artifact is seen after this time, or b) leave the IOC entity as a non-expiring entity.

If using a custom integration you can use a programming language to perform calculations and set a programmatic end_time, e.g., I wrote a custom MATI integration that provides an example of this here:

IP_ADDRESS_VALID_FROM=SIEM_DATA_RETENTION
IP_ADDRESS_EXPIRES_AFTER=90

DOMAIN_NAME_VALID_FROM=SIEM_DATA_RETENTION
DOMAIN_NAME_EXPIRES_AFTER=30

#TODO(): URLs could be made more fine grained depending on the category, e.g, Phishing is 14 but Malware is 90
URL_VALID_FROM=SIEM_DATA_RETENTION
URL_EXPIRES_AFTER=30

# skip ahead a bit...

  match indicator['type']:
    case "fqdn":
      entity['hostname'] = indicator['value']
      metadata['entity_type'] = 'DOMAIN_NAME'
      interval['start_time'] = subtract_offset(now(),DOMAIN_NAME_VALID_FROM)
      interval['end_time'] = subtract_offset(now(),DOMAIN_NAME_EXPIRES_AFTER)

    case "ipv4":
      entity['ip'] = indicator['value']
      metadata['entity_type'] = 'IP_ADDRESS'
      interval['start_time'] = subtract_offset(now(),IP_ADDRESS_VALID_FROM)
      interval['end_time'] = subtract_offset(now(),IP_ADDRESS_EXPIRES_AFTER)
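The excerpt above relies on helpers that are not shown, so here is a self-contained Python sketch of the same idea using only the standard library. It assumes the intent is a start time that reaches back across the SIEM retention period and an end time that sits EXPIRES_AFTER days in the future. The 365-day retention value and the ioc_interval name are assumptions for illustration, not taken from the integration itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

# Assumption: a 12-month online retention period.
SIEM_DATA_RETENTION_DAYS = 365

# Days until expiry per entity type, mirroring the settings above.
EXPIRES_AFTER_DAYS = {
    "IP_ADDRESS": 90,
    "DOMAIN_NAME": 30,
    "URL": 30,
}


def ioc_interval(entity_type: str, now: Optional[datetime] = None) -> dict:
    """Return a metadata.interval dict for the given IOC entity type."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=SIEM_DATA_RETENTION_DAYS)
    end = now + timedelta(days=EXPIRES_AFTER_DAYS[entity_type])
    return {
        "start_time": start.strftime(TS_FORMAT),
        "end_time": end.strftime(TS_FORMAT),
    }
```

Starting the interval at the retention boundary means historical events can still match the IOC, while the per-type end time lets short-lived indicators such as domains age out sooner than IP addresses.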

This then leads onto the use of a score, e.g., a threat_score. Within Chronicle UDM the numeric fields metadata.threat.risk_score and metadata.threat.confidence_score should be used for IOC entities. This enables you to keep non-expiring IOC entities but, as the risk score decays over time, apply a threshold for alerting in your YARA-L Detection rule, e.g.,

rule anomali_ioc_ip_mandiant {

  meta:
    author = "Google SecOps"
    description = "Match Anomali IOC IP indicators from Mandiant only, against UDM event data."
    severity = "MEDIUM"
    priority = "MEDIUM"
    notes = "Only evaluates against log sources that populate an external IP address into target.ip"

  events:
    // filter the UDM event types, i.e., only log sources that will populate an external IP address into target.ip
    (
      $e.metadata.event_type = "NETWORK_CONNECTION" or
      $e.metadata.event_type = "NETWORK_HTTP" or
      $e.metadata.event_type = "NETWORK_FLOW" or
      $e.metadata.event_type = "NETWORK_SMTP" or
      $e.metadata.event_type = "NETWORK_FTP" or
      $e.metadata.event_type = "NETWORK_DNS"      
    )
    $e.target.ip = $ip
    $e.principal.ip = $host

    $g.graph.metadata.vendor_name = "ANOMALI_IOC"
    (
      $g.graph.metadata.threat.threat_name = "Mandiant - Indicators" or
      $g.graph.metadata.threat.threat_name = "Mandiant - Fusion Intelligence"
    )
    $g.graph.metadata.entity_type = "IP_ADDRESS"
    $g.graph.entity.ip = $ip

    // only match against active IOCs, i.e., not CLEARED
    $g.graph.metadata.threat.threat_status = "ACTIVE"

    // only consider IOCs with a risk_score of X or above
    $g.graph.metadata.threat.risk_score >= 60

    // only consider IOCs with a confidence score of Y or above
    // - note, this value can be a negative integer, so verify before using
    //$g.graph.metadata.threat.confidence_score >= 60

  match:
    $host over 1m

If using this approach you can also filter CTI on API ingestion into Chronicle, e.g., only ingest IOCs with a threat score above 40, generate a Detection above 60, and a Detection Alert above 80 (this requires two YARA-L rules). However, you need to verify that the TIP doesn’t drop an individual IOC score too much in one go, as otherwise you will filter out the updated IOC record on ingestion and the older, higher-scoring record will remain active, e.g., an IOC entity had a threat score of 50 and now has 39.
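The tiered thresholds, and the pitfall with large score drops, can be illustrated with a short Python sketch. The 40/60/80 values come from the example above. Treating each threshold as inclusive is an assumption, and triage() and apply_update() are hypothetical names, with a plain dictionary standing in for the IOC entities already ingested.

```python
INGEST_MIN = 40     # below this the IOC is filtered out before ingestion
DETECTION_MIN = 60  # at or above this a Detection is generated
ALERT_MIN = 80      # at or above this a Detection Alert is generated


def triage(score: int) -> str:
    """Map a threat score to the tier it falls into."""
    if score < INGEST_MIN:
        return "drop"
    if score < DETECTION_MIN:
        return "ingest"
    if score < ALERT_MIN:
        return "detection"
    return "alert"


def apply_update(store: dict, ioc_id: str, score: int) -> bool:
    """Mimic an ingestion filter: updates scoring below INGEST_MIN never
    reach the store, so the previous record is left in place."""
    if triage(score) == "drop":
        return False
    store[ioc_id] = score
    return True


store: dict = {}
apply_update(store, "ioc-1", 50)  # ingested with a score of 50
apply_update(store, "ioc-1", 39)  # filtered out: the stored score stays at 50
```

After both calls the store still holds a score of 50 for ioc-1, which is exactly the stale record described above. One way around it is to let updates for already-ingested IOCs bypass the ingestion filter.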

While not technically expiring an IOC, an alternate approach for IOCs where you can’t use an end date, and a score isn’t available, is to use tagging in the source event, e.g., here’s an example within MISP of using a custom label which is then applied as an exclude filter in the YARA-L rule.

Using custom tagging to suppress an IOC from generating YARA-L Detection Alert

IOC expiration isn’t complex, but it’s not easy or straightforward either.

Conclusions

Within your TIP it is critical to pre-curate and filter your IOCs before deploying to production. Ideally your TIP includes Warning Lists, or similar functionality, to notify you of low quality IOC matches so that you can vet them before deployment into production.

Another key consideration when it comes to the Chronicle SecOps platform is that Chronicle SIEM’s Detection Engine can generate a very large number of Detection Alerts, very quickly. While Chronicle SOAR includes Case Overflow functionality, you can generate thousands of cases quickly which, apart from annoying your SOC, can cause backlogs and prevent playbooks from executing in a timely manner.

For all of the reasons above, this is where one of the Enterprise+ features of Chronicle SecOps, Applied Threat Intel, is of great value as the curation, vetting, and decay of IOCs, from sources such as Google and Mandiant, is applied automatically on your data in Chronicle SIEM.

However, when using your own CTI, hopefully the above helps to demystify the important yet mysterious Entity Graph, and how IOC matching in Chronicle SIEM works.