Indexed fields in UDM Search

Chris Martin (@thatsiemguy)
8 min read · Apr 13, 2024


UDM Search is the core investigation point within Chronicle, Google SecOps’ SIEM component. Understanding how to use indexed fields is critical for returning results quickly. This post explores indexed fields in Google SecOps UDM Search — what they are and how they can dramatically accelerate your incident response. These specific fields can also help you access the most recent UDM event data in near real-time.

By the end of this post, you’ll be able to run blisteringly fast UDM searches.

Overview

If you’re an existing SecOps user you may have noticed improvements in UDM Search response speeds and data availability. These enhancements allow normalized and enriched event data to be searchable within minutes. Additionally, changes have been made to optimize the indexing of late-arriving data, ensuring it’s also quickly accessible in UDM Search.

In the following parts of this post, I will list the indexed fields I’ve identified and run a couple of tests to demonstrate the impact of indexed fields.

Note: Since indexed fields are not documented, it’s important to remember that my understanding might have some inaccuracies. It’s advisable to test these findings yourself within your own environment before relying on them heavily.

Indexed UDM Fields

The UDM fields I have identified as indexed, meaning they can be used to retrieve data in near real-time and return query results faster, are as follows:

#ABOUT
about.file.md5
about.file.sha1
about.file.sha256

#INTERMEDIARY
intermediary.hostname
intermediary.ip

#NETWORK
network.dns.questions.name
network.email.from
network.email.to

#OBSERVER
observer.hostname
observer.ip

#PRINCIPAL
principal.asset.hostname
principal.asset.ip
principal.asset.mac
principal.file.md5
principal.file.sha1
principal.file.sha256
principal.hostname
principal.ip
principal.mac
principal.process.file.md5
principal.process.file.sha1
principal.process.file.sha256
principal.user.email_addresses
principal.user.product_object_id
principal.user.userid
principal.user.windows_sid

#SOURCE
src.user.userid
src.asset.hostname
src.hostname
src.ip

#TARGET
target.asset.hostname
target.file.md5
target.file.sha1
target.file.sha256
target.hostname
target.ip
target.process.file.md5
target.process.file.sha1
target.process.file.sha256
target.user.email_addresses
target.user.product_object_id
target.user.userid
target.user.windows_sid

Important: Metadata fields are not indexed fields; I explore this in more detail below.

Note: This list reflects my findings as of the time of writing, April 2024, and may change.
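
As a rough illustration of how the list above could be used, here is a minimal Python sketch that checks whether a UDM query references at least one indexed field. The INDEXED_FIELDS set is only a subset of the list above, included for brevity, and the helper name indexed_fields_in and the regular expression are my own assumptions rather than anything provided by Google SecOps.

```python
import re

# A subset of the indexed fields listed in this post (April 2024).
# This is not an official list; extend it with the remaining entries as needed.
INDEXED_FIELDS = frozenset({
    "principal.hostname",
    "principal.ip",
    "principal.user.userid",
    "principal.user.email_addresses",
    "target.hostname",
    "target.ip",
    "target.user.userid",
    "target.file.sha256",
    "network.dns.questions.name",
    "src.ip",
})

# Matches dotted, lowercase field paths such as "principal.user.userid".
# This is a simple heuristic, not a real UDM query parser.
_FIELD_PATH = re.compile(r"[a-z_]+(?:\.[a-z0-9_]+)+")


def indexed_fields_in(query: str) -> set:
    """Return the known indexed fields that appear in the query text."""
    return {path for path in _FIELD_PATH.findall(query) if path in INDEXED_FIELDS}


if __name__ == "__main__":
    query = 'principal.ip = "10.0.0.1" and target.resource.name = "example"'
    print(indexed_fields_in(query))
```

An empty result suggests the query relies only on non-indexed fields, so very recent events may not be returned yet.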

Using UDM Search for Near Real-Time Event Monitoring

Let’s test how quickly UDM Search can surface events from a low-latency log source:

  • Log Source: Choose a log source with known fast ingestion (e.g., the native Google Workspace to Chronicle SIEM integration).
  • Event Generation: Rename a Google Doc to the current time (in CET) to create a unique, easily identifiable event.
  • Search: Use UDM Search to query for the recently renamed Google Doc.
  • Evaluation: Measure the time between the event creation and its appearance in search results.

A simple test harness by renaming a document

Keep in mind that the manual tests via the GUI may introduce slight delays due to user or computer error, so the results are approximate rather than millisecond-precise.
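
If you want to generate the document name consistently, a small sketch like the following could help. The function name make_doc_name is hypothetical, and the default offset of +2 hours is an assumption that matches the gap between the UTC log timestamps and the document names used in the tests later in this post.

```python
from datetime import datetime, timedelta, timezone


def make_doc_name(now_utc: datetime, utc_offset_hours: int = 2) -> str:
    """Build a unique test document name from a timezone-aware UTC timestamp."""
    # Convert to the local offset used for naming (assumed: UTC+2).
    local = now_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return f"Test Doc at {local:%H:%M} CET"


if __name__ == "__main__":
    print(make_doc_name(datetime.now(timezone.utc)))
```

Passing the timestamp in explicitly keeps the helper easy to check against a known event time.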

Raw Log Search Results

Chronicle SIEM indexes data into Raw Log Search in near real-time under normal circumstances. This enables you to quickly find even very recent events.

Note: While I used the preview Raw Log Search within UDM Search, similar if not lower latency results are expected in the legacy interface.

Test Search: To verify the near real-time capabilities, I performed the following Raw Log search, matching the unique filename of the Google Doc I created earlier:

raw = /Test Doc at 21:56 CET/ log_source in ["Workspace Activities"]

The event was immediately available in Chronicle SIEM’s Raw Log Search results.

Test event at 21:56 and results in Raw Log search at 21:57

While there’s no dedicated “Now” button in the DateTime selector, you can approximate a real-time query using the relative ranges under the “Range” tab (e.g., “Last 5 Minutes”).

The DateTime selector in UDM Search has a quirk: the “Run Search” button won’t re-enable if you select a relative range ending in the current minute. Workarounds include:

  • Manual Adjustment: Slightly modify the start time of your range.
  • Wait a minute: Literally wait until the next minute begins, then select the desired relative range.

The intricacies of the DateTime dialogue

UDM Search Results

UDM Search typically indexes all enriched data within minutes. In this test, I explore how you can leverage indexed fields to access data in near real-time. Before we dive in, let’s clarify some terminology that I’ll be using throughout this post:

  • near real-time: In the context of Google SecOps, data is available for search within a few minutes due to collection, processing, and indexing delays. This differs from true real-time systems where data is available instantly.
  • indexed fields: Specific UDM fields optimized for fast lookup, enhancing search efficiency.
  • non-indexed fields: All other UDM fields, i.e., any field that does not appear in the indexed list above.

To test how indexed fields can improve search speed, we’ll conduct a test using two methods:

  • UDM Search via the GUI: This is the graphical user interface for searching.
  • UDM Search API endpoint with a Python script: This allows automated queries. Our script will repeat the same query every 30 seconds until it gets a result from the expected test event.
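
The polling approach described above can be sketched roughly as follows. This is not the script used for the tests in this post: search_fn is a hypothetical callable that wraps whichever UDM Search API client you use and returns a list of events, and the other parameter names are my own assumptions.

```python
import time
from datetime import datetime, timezone


def poll_until_results(search_fn, query, interval_seconds=30.0,
                       max_attempts=40, sleep_fn=time.sleep, log=print):
    """Re-run the same query at a fixed interval until it returns events.

    search_fn is a hypothetical callable: it takes the query string and
    returns a (possibly empty) list of events.
    """
    for attempt in range(1, max_attempts + 1):
        events = search_fn(query)
        stamp = datetime.now(timezone.utc).isoformat()
        if events:
            log(f"{stamp}: attempt: {attempt}")
            log(f"{stamp}: number of results: {len(events)}")
            return events
        log(f"{stamp}: attempt: {attempt}. No results found.")
        if attempt < max_attempts:
            # Wait before the next attempt.
            sleep_fn(interval_seconds)
    # Gave up without seeing the expected event.
    return []
```

Injecting search_fn, sleep_fn, and log keeps the loop independent of any particular API client and makes it simple to dry-run without waiting.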

Now that we have our terminology established, let’s start exploring.

Test 1: Searching non-indexed fields via the API

This test examines the UDM Search API’s behavior when querying non-indexed fields.

UDM Query:

target.resource.name = /Test Doc at 16:34 CET/

Test Results

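  • The time for the test event to become searchable via the UDM Search API endpoint varied between approximately 10 and 15 minutes during testing. In the run below, results first appeared about 10 minutes after the event was created.

2024-04-13T14:35:05.361044Z: attempt: 1. No results found.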
2024-04-13T14:35:37.287846Z: attempt: 2. No results found.
2024-04-13T14:36:09.278386Z: attempt: 3. No results found.
2024-04-13T14:36:41.425666Z: attempt: 4. No results found.
2024-04-13T14:37:13.529023Z: attempt: 5. No results found.
2024-04-13T14:37:45.558487Z: attempt: 6. No results found.
2024-04-13T14:38:17.285079Z: attempt: 7. No results found.
2024-04-13T14:38:49.986647Z: attempt: 8. No results found.
2024-04-13T14:39:21.336291Z: attempt: 9. No results found.
2024-04-13T14:39:53.053474Z: attempt: 10. No results found.
2024-04-13T14:40:24.726820Z: attempt: 11. No results found.
2024-04-13T14:40:57.012009Z: attempt: 12. No results found.
2024-04-13T14:41:28.847092Z: attempt: 13. No results found.
2024-04-13T14:42:00.771291Z: attempt: 14. No results found.
2024-04-13T14:42:32.834385Z: attempt: 15. No results found.
2024-04-13T14:43:04.672392Z: attempt: 16. No results found.
2024-04-13T14:43:36.686217Z: attempt: 17. No results found.
2024-04-13T14:44:08.953811Z: attempt: 18. No results found.
2024-04-13T14:44:41.547433Z: attempt: 19
2024-04-13T14:44:41.548454Z: number of results: 2
event created at: 2024-04-13T14:34:36.563Z, ingested at: 2024-04-13T14:34:40.200455Z, and with an ingestion latency of 0:00:03.637455.
event created at: 2024-04-13T14:34:36.563Z, ingested at: 2024-04-13T14:34:40.200455Z, and with an ingestion latency of 0:00:03.637455.
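
To make the gap between ingestion and searchability explicit, the timestamps in the log above can be compared with a few lines of Python. This is a minimal sketch; the helper names parse_utc and elapsed are my own.

```python
from datetime import datetime, timedelta


def parse_utc(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that ends in 'Z'."""
    # Older Python versions of fromisoformat() reject a trailing "Z".
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def elapsed(start: str, end: str) -> timedelta:
    """Return the time between two ISO 8601 timestamps."""
    return parse_utc(end) - parse_utc(start)


# Timestamps copied from the Test 1 log output above.
created = "2024-04-13T14:34:36.563Z"
ingested = "2024-04-13T14:34:40.200455Z"
first_hit = "2024-04-13T14:44:41.547433Z"

ingestion_latency = elapsed(created, ingested)
search_delay = elapsed(created, first_hit)

print(ingestion_latency)  # 0:00:03.637455
print(search_delay)       # 0:10:04.984433
```

The event was ingested in under four seconds, yet it took roughly ten minutes to become searchable through a non-indexed field.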

Conclusion

Using only non-indexed fields in your UDM query means you can’t access UDM event data in near real-time via the API endpoint.

Test 2: Searching indexed fields via the API

This test examines the UDM Search API’s behavior when querying indexed fields, in this example the principal.user.email_addresses indexed field specifically.

UDM Query:

principal.user.email_addresses = "foo@bar.altostrat.com" 
and target.resource.name = "Test Doc at 18:31 CET"

Test Results

  • The UDM events were returned via the API response immediately.
2024-04-13T16:32:13.553792Z: attempt: 1
2024-04-13T16:32:13.556446Z: number of results: 2
event created at: 2024-04-13T16:31:42.498Z, ingested at: 2024-04-13T16:31:46.116769Z, and with an ingestion latency of 0:00:03.618769.
event created at: 2024-04-13T16:31:42.498Z, ingested at: 2024-04-13T16:31:46.116769Z, and with an ingestion latency of 0:00:03.618769.

Conclusion

  • Indexing is crucial for near real-time results in UDM Search. Use indexed fields in your UDM queries whenever possible.
  • To optimize searches further, you can combine non-indexed fields with indexed ones in the same query using “and”, as in the query above.
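
For integrations that build queries programmatically, a small guard can enforce this habit. The sketch below is an assumption on my part rather than part of any SecOps SDK: build_udm_query is a hypothetical helper, the caller supplies the set of indexed fields, and values are wrapped in double quotes without any escaping.

```python
def build_udm_query(filters: dict, indexed_fields: set) -> str:
    """Join equality filters with 'and', requiring at least one indexed field.

    filters maps a UDM field path to the exact string value to match.
    Quotes inside values are not escaped in this sketch.
    """
    if not any(field in indexed_fields for field in filters):
        raise ValueError("query must include at least one indexed field")
    return " and ".join(
        f'{field} = "{value}"' for field, value in filters.items()
    )


if __name__ == "__main__":
    query = build_udm_query(
        {
            "principal.user.email_addresses": "foo@bar.altostrat.com",
            "target.resource.name": "Test Doc at 18:31 CET",
        },
        indexed_fields={"principal.user.email_addresses"},
    )
    print(query)
```

Failing early when no indexed field is present makes it harder for an automated workflow to silently miss very recent events.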

This image illustrates typical UDM Search behavior for indexed vs. non-indexed fields. Please note: Timings are approximate and not guaranteed. Indexed data may not always be available in one minute, and non-indexed data might sometimes take longer than ten minutes.

Test 3: Searching using indexed fields via the GUI

This test examines UDM Search behavior via the Google SecOps UI when querying indexed fields.

UDM Query:

metadata.event_type = "USER_RESOURCE_UPDATE_CONTENT" 
and principal.user.email_addresses = "foo@bar.altostrat.com"

Test Results

  • The UDM events were returned via the UDM Search GUI immediately.

Test event at 21:56 and search results at 21:58

Conclusion

This reinforces the finding from the prior test: indexed fields can be used to return UDM events in near real-time.

Test 4: Searching using non-indexed fields via the GUI

This test examines UDM Search behavior via the Google SecOps UI when querying non-indexed fields.

UDM Query:

target.resource.name = /Test Doc at 20:17 CET/

Test Results

  • The UDM events were not returned via the UDM Search GUI immediately.

The event via a non-indexed field was not immediately available

  • The event was visible within several minutes, however, and with lower latency than via the UDM Search API endpoint.

The event via a non-indexed field was not immediately available, but was available within ~6 minutes

Conclusion

  • This test confirms that querying only non-indexed fields delays when recent events become searchable in UDM Search (similar to Test 1).
  • Interestingly, the Google SecOps GUI appears to surface events queried via non-indexed fields with lower latency than the API.

Q&A

Q: What if my data only exists in non-indexed fields, and there are no other searchable fields (like user, IP address, etc.)?

A: This is an uncommon scenario, and it presents some challenges. Here are some options though:

  • Refactor the Parser (if possible): Ideally, modify how your data is processed so at least one indexed field is populated.
  • Accept Delays: If refactoring isn’t possible, accept that searching this data in UDM Search won’t be in near real-time.
  • Consider Raw Log Search: While you can access data quickly in Raw Log Search, it may be less convenient for complex analysis.

Q: Are grouped fields counted as indexed fields?

A: No.

Q: Does UDM Search require enriched data to work?

A: No, UDM Search can return results even if data enrichment hasn’t happened yet. While most searches will show enriched events, queries on very recent data might initially return un-enriched events marked with a ‘U’. These events will typically become enriched within a few minutes. Re-running your search later will likely show the updated version.

Example of an indexed search returning un-enriched UDM events

Q: Can I add my own custom indexed fields?

A: No.

Summary

  • Google SecOps UDM Search provides a powerful tool for investigating UDM event data, both enriched and un-enriched, with results typically available within minutes.
  • Understanding and utilizing indexed fields allows you to speed up searches when working with near real-time data, which can be especially valuable during active security incidents.
  • When using the UDM Search API endpoint via a SOAR or custom integration, be aware of which fields are indexed and which are not; otherwise queries on very recent data may not return the events you expect.
