Threat Knowledge Graphs using Generative AI

Venkat Pothamsetty
Published in securitygpt
3 min read · Sep 23, 2023

Threat graphs are traditionally used to trace and visualize attacks as a sequence of events. Many security vendors offer threat graphs in some form, from XDR vendors such as CrowdStrike, to general security vendors such as Sophos, to cloud vendors such as AWS, whose Detective service includes behavior graphs.

Knowledge graphs are related, but serve a different purpose. They are a useful tool for understanding a topic from a large amount of text. They resemble mind maps, but are built for a specific purpose: understanding a topic with a particular objective in mind. A few companies, such as LinkedIn, have used them for some time. With the rise of graph databases, and with graphs becoming a common way to access APIs, we can expect their usage to grow.

As security operators, we are flooded with information on threats and vulnerabilities: threat intelligence feeds, vulnerability advisories, and write-ups from various vendors on how they detected and thwarted a threat. Much of that information is long-form text that is hard to digest quickly.

What if we built Threat Knowledge Graphs: parse the stream of threat feeds and vulnerability reports and turn it into knowledge graphs, so that we can see at a glance how a threat or a vulnerability works?

Architecture and Components

LLMs can parse large amounts of unstructured text and return structured text, and a graph is a structure. The architecture is simple: extract the relevant text from the threat feed or vulnerability URL, define the graph structure we want, pass both the text and the structure to the LLM, and direct it to return a graph that serves our knowledge objective.
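The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SecurityGPT implementation: `extract_text`, `build_graph`, and `fake_llm` are hypothetical names, the JSON shape is an assumed one, and the LLM is a stand-in function so that the sketch runs offline. In practice, `llm` would wrap a call to a real model API.

```python
import json
from html.parser import HTMLParser
from typing import Callable


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Step 1: reduce a fetched page to plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_graph(html: str, objective: str, llm: Callable[[str], str]) -> dict:
    """Steps 2-4: state the structure, pass text + structure to the LLM, parse the reply."""
    text = extract_text(html)
    prompt = (
        f"Objective: {objective}\n"
        'Return JSON of the form {"nodes": [...], "edges": [...]}.\n'
        f"Text:\n{text}"
    )
    return json.loads(llm(prompt))


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; always returns a fixed graph.
    return json.dumps({"nodes": [{"id": "n1", "label": "Vulnerability"}], "edges": []})


if __name__ == "__main__":
    page = "<p>Patch <b>now</b></p><script>var x = 1;</script>"
    print(build_graph(page, "understand the fix", fake_llm))
```

Passing the LLM in as a plain callable keeps the parsing and prompting logic testable without network access or an API key.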

The main components are:

  • Define the structure we want, essentially the nodes and edges of the graph - source gist here
  • Write a detailed prompt for the LLM with all its components: role, task, objectives, examples, and so on - source gist here
  • The LLM returns the graph; write a function to visualize it - source gist here
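To make the three components concrete, here is a hedged sketch of what each could look like. It is not the code from the gists: the `Node`, `Edge`, and `ThreatGraph` classes, the field names, the prompt wording, and the helper functions are all assumptions made for illustration. The visualization step emits Graphviz DOT text, which any DOT renderer can turn into an image.

```python
import json
from dataclasses import dataclass, field


# Component 1: the graph structure (hypothetical field names).
@dataclass
class Node:
    id: str
    label: str
    kind: str = "entity"  # e.g. "vulnerability", "fix", "workaround"


@dataclass
class Edge:
    source: str
    target: str
    label: str = ""


@dataclass
class ThreatGraph:
    nodes: list[Node] = field(default_factory=list)
    edges: list[Edge] = field(default_factory=list)


# Component 2: a prompt with role, task, objective, and the expected structure.
def build_prompt(text: str, objective: str) -> str:
    schema = {
        "nodes": [{"id": "string", "label": "string", "kind": "string"}],
        "edges": [{"source": "node id", "target": "node id", "label": "string"}],
    }
    return "\n".join([
        "Role: you are a security analyst.",
        "Task: summarize the text below as a knowledge graph.",
        f"Objective: {objective}",
        "Respond with JSON only, matching this structure:",
        json.dumps(schema),
        "Text:",
        text,
    ])


def parse_graph(raw: str) -> ThreatGraph:
    """Turn the LLM's JSON reply into a ThreatGraph (assumes no extra keys)."""
    data = json.loads(raw)
    graph = ThreatGraph(
        nodes=[Node(**n) for n in data.get("nodes", [])],
        edges=[Edge(**e) for e in data.get("edges", [])],
    )
    # Drop edges that point at nodes the model never defined.
    known = {n.id for n in graph.nodes}
    graph.edges = [e for e in graph.edges if e.source in known and e.target in known]
    return graph


# Component 3: visualization, here as Graphviz DOT text.
def _quote(value: str) -> str:
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(graph: ThreatGraph) -> str:
    lines = ["digraph threat {"]
    for n in graph.nodes:
        lines.append(f"  {_quote(n.id)} [label={_quote(n.label)}];")
    for e in graph.edges:
        lines.append(f"  {_quote(e.source)} -> {_quote(e.target)} [label={_quote(e.label)}];")
    lines.append("}")
    return "\n".join(lines)
```

Validating the reply before drawing it matters because a model can return edges that reference nodes it never listed; the sketch simply drops those edges, though logging or retrying would be reasonable alternatives.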

Let's see how it worked. Fortunately for testing purposes, there has been no shortage of high-profile threats, attacks, and vulnerabilities: the Storm-0558 attack on Microsoft, the MOVEit Transfer vulnerability, and a product-process lapse at Microsoft that accidentally exposed private data.

Threat Knowledge Graph of Vulnerability - MOVEit

The graph clearly shows how the vulnerability works, which versions are vulnerable, and what the fixes and workarounds are.

Threat Knowledge Graph of Attack - Storm-0558

The graph shows how the attack propagated, from obtaining keys from the dump to accessing the production infrastructure.

Threat Knowledge Graph of Product and Process Security - Microsoft's Git Exposure of Tokens

The graph shows how the product security process broke down, from exposure of keys in Git, to misconfigured storage tokens, to SAS tokens with excessive scope.

Future Work

  • Extending the prompts and structure to parse and highlight indicators of compromise (IOCs)
  • Extending the prompts and structure to expand specific parts, such as remediation, into subgraphs (remediation knowledge graphs)

How to Use

Threat knowledge graphs are included in the latest version of SecurityGPT. A demo is hosted at https://threat-knowledge-graph.streamlit.app/ (it uses the OpenAI API in the backend).

import securitygpt
from securitygpt.vulngpt.graphgpt import draw_threat_graph

# Article to analyze and the knowledge objective that guides the graph
url = "https://thehackernews.com/2023/09/financially-motivated-unc3944-threat.html"
objective = "understand the attack details and remediations"

# Build the threat knowledge graph for the article
dot = draw_threat_graph(url, objective)
