Graphing Cyber: Leveraging Math Theory to Strengthen Security

Uri Itai
Coinmonks
6 min read · Sep 2, 2024


A year ago, while developing a course on data science and cybersecurity, I consulted with several experts in the field. During one such conversation, I met a Chief Information Security Officer (CISO) who shared that his strong foundation in mathematics had significantly benefited him in his role. This insight further piqued my interest.

The CISO in action

As we delved deeper, I was captivated by the methodology he described. Curious, I asked, ‘So, what’s the secret tool?’

‘There are no secrets,’ he replied with a smile. ‘I’m using old-fashioned graph theory.’

‘Graph theory is incredibly effective for analyzing computer systems. It’s a well-established approach, yet you’d be surprised how many companies overlook it,’ he continued, highlighting the method’s simplicity and power.

Intrigued by his approach, I pressed further, ‘So, how do you construct the company graph?’

‘It’s not just one graph, but multiple,’ he explained. ‘Each graph serves a distinct purpose, capturing different dimensions of the organization’s operations.’

Communication graph

To fully understand this concept, let’s revisit the basics of a graph. At its core, a graph is composed of nodes (or vertices) connected by edges (or links). These edges can be weighted or unweighted, directed or undirected, depending on the context. In cybersecurity, the nodes might represent employees, computers, servers, or even connected devices like printers and IoT equipment. The edges, on the other hand, signify various interactions or relationships, such as communication links, data flows, or access permissions.
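As a minimal sketch of these basics, here is one way to store such a graph in plain Python. The class name, node names, and weights are all hypothetical and chosen only for illustration; a real deployment would more likely rely on a graph library or database.

```python
class Graph:
    """Minimal weighted graph stored as a dict of dicts: adj[u][v] = weight."""

    def __init__(self):
        self.adj = {}

    def add_edge(self, u, v, weight=1.0, directed=True):
        # Register both endpoints so isolated targets still appear as nodes.
        self.adj.setdefault(u, {})
        self.adj.setdefault(v, {})
        self.adj[u][v] = self.adj[u].get(v, 0.0) + weight
        if not directed:
            self.adj[v][u] = self.adj[v].get(u, 0.0) + weight


# Hypothetical nodes: employees, a server, and a printer.
g = Graph()
g.add_edge("alice", "file-server", weight=3)   # directed: three data transfers
g.add_edge("alice", "bob", directed=False)      # undirected: a chat link
g.add_edge("printer", "file-server")            # directed, default weight

print(sorted(g.adj))           # all nodes
print(g.adj["alice"])          # alice's outgoing edges with weights
```

The dict-of-dicts layout keeps edge lookups cheap and makes it easy to attach a weight to each edge; directed and undirected edges differ only in whether the reverse entry is written.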

A modeled company graph is far simpler than the real network it represents.

For example, graphs can model communication traffic within a company — whether it’s Slack messages, emails, or data transfers between devices. This traffic can be visualized as a single graph or as a collection of interconnected graphs, where the edges carry temporal data that varies by time of day, day of the week, and whether employees are working in-office or remotely. Such graphs are dynamic, reflecting the ebb and flow of data throughout the organization. Identifying anomalies in this traffic, while accounting for these temporal and contextual variables, plays a critical role in cybersecurity defense. An unexpected spike in communication at an odd hour, or unusual data transfer patterns on a holiday, could be early indicators of a cyberattack.
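To make the idea of time-aware anomaly detection concrete, here is a small sketch that compares an observed message count on one edge against the history for the same hour of day. The z-score rule, the threshold of 3, and the sample counts are assumptions made for illustration, not the method the CISO described.

```python
from statistics import mean, pstdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag `observed` if it lies more than `threshold` standard
    deviations above the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # No variation in the history: anything above the constant level is flagged.
        return observed > mu
    return (observed - mu) / sigma > threshold


# Hypothetical message counts on one edge, keyed by hour of day,
# with one value per day over the past week.
baseline = {
    3: [0, 1, 0, 0, 1, 0, 0],           # 03:00, normally almost silent
    14: [40, 52, 47, 45, 50, 43, 49],   # 14:00, normally busy
}

print(is_anomalous(baseline[14], 55))   # 55 messages at 14:00
print(is_anomalous(baseline[3], 55))    # 55 messages at 03:00
```

The same count of 55 messages is unremarkable in the afternoon and highly unusual at 03:00, which is exactly why the baseline has to be conditioned on time. A fuller version would also condition on weekday and on remote versus in-office work.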

But communication traffic is just one aspect. Consider the company hierarchy graph, which might depict the organizational structure — who reports to whom, and how different departments are connected. Interestingly, this graph can be quite different from the communication graph. For instance, the IT department may have more frequent interactions with various teams than the formal hierarchy would suggest. By analyzing these differences, one can gain a deeper understanding of how the organization truly operates on a day-to-day basis, potentially identifying bottlenecks, inefficiencies, or even shadow networks that bypass official channels.
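One simple way to surface these differences is to compare the two edge sets directly. The sketch below uses made-up role names and treats both graphs as undirected; it is only meant to show the comparison, not a full analysis.

```python
def undirected(edges):
    """Normalize (u, v) pairs so that (u, v) and (v, u) compare equal."""
    return {frozenset(edge) for edge in edges}


# Hypothetical reporting lines (the formal hierarchy).
hierarchy = undirected([
    ("ceo", "cto"), ("ceo", "cfo"),
    ("cto", "dev_lead"), ("cto", "it_admin"),
])

# Hypothetical pairs that actually exchange messages regularly.
communication = undirected([
    ("ceo", "cto"), ("cto", "dev_lead"),
    ("it_admin", "cfo"), ("it_admin", "dev_lead"), ("it_admin", "ceo"),
])

informal_links = communication - hierarchy   # talk, but no reporting line
silent_links = hierarchy - communication     # reporting line, but no talk

print(sorted(sorted(e) for e in informal_links))
print(sorted(sorted(e) for e in silent_links))
```

In this toy data, every informal link involves the IT administrator, which mirrors the observation that IT often talks to far more teams than the org chart suggests.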

Another crucial graph emerges from the version control software used by the company. This graph doesn’t just map the codebase, showing how different branches and commits are related, but also reveals collaborations between developers working on similar issues. It can expose patterns of cooperation and isolation, highlighting key contributors or potential risks. Interestingly, version control systems are often assumed to be inherently secure — after all, they’re critical tools in software development. However, the CISO hinted at vulnerabilities that could be exploited, a topic worthy of its own exploration.
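As a rough sketch of how a collaboration graph could be derived from commit history, the snippet below links two developers whenever they have touched the same file. The commit records are invented, and the "shared file" rule is an assumption; a real pipeline would parse the version control log and might weight by recency or by lines changed.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit records: (author, set of files touched).
commits = [
    ("alice", {"auth.py", "db.py"}),
    ("bob", {"auth.py"}),
    ("carol", {"ui.js"}),
    ("bob", {"db.py"}),
]


def collaboration_edges(commits):
    """Count, for each pair of authors, how many files both have touched."""
    authors_by_file = {}
    for author, files in commits:
        for path in files:
            authors_by_file.setdefault(path, set()).add(author)
    weights = Counter()
    for authors in authors_by_file.values():
        for a, b in combinations(sorted(authors), 2):
            weights[(a, b)] += 1
    return weights


weights = collaboration_edges(commits)
connected = {name for pair in weights for name in pair}
isolated = {author for author, _ in commits} - connected

print(dict(weights))   # edge weights between collaborating developers
print(isolated)        # developers with no shared files
```

Developers who end up isolated in such a graph are worth a second look: they may own code that nobody else reviews.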

When you think of these graphs collectively, they resemble a complex social network within the company. Just like any social network, they exhibit certain properties, such as a power-law distribution in the number of interactions. Often, a few nodes (or individuals) handle most of the traffic, acting as communication hubs. Identifying these hubs provides valuable insights, as they often represent informal leaders or key influencers within the organization. These individuals may not hold formal managerial roles, yet they wield significant influence. However, this dynamic can shift dramatically during a cyberattack. In many cases, the compromised node in an attack is a communication hub. If you haven’t anticipated this possibility, you could easily mistake a surge in traffic for a sign of popularity rather than an indicator of a breach, and the attack goes undetected: a costly false negative.
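A first pass at finding hubs is simply to rank nodes by weighted degree. The sketch below uses invented names and message counts; degree is only one of several centrality measures that could be used here.

```python
from collections import Counter


def weighted_degree(edges):
    """Sum the edge weights incident to each node (undirected view)."""
    degree = Counter()
    for u, v, weight in edges:
        degree[u] += weight
        degree[v] += weight
    return degree


# Hypothetical weekly message counts between pairs of employees.
edges = [
    ("dana", "alice", 120),
    ("dana", "bob", 95),
    ("dana", "carol", 80),
    ("alice", "bob", 10),
    ("carol", "erin", 5),
]

degree = weighted_degree(edges)
hubs = [node for node, _ in degree.most_common(1)]
hub_share = degree[hubs[0]] / sum(degree.values())

print(hubs)
print(round(hub_share, 2))   # fraction of all edge endpoints landing on the hub
```

Knowing in advance who the hubs are gives you a baseline: a surge on a known hub can then be judged against that node's own history rather than dismissed as ordinary popularity.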

Graphs also prove invaluable for segmenting the company into clusters. For instance, using graph theory, you can identify natural divisions within the company — different departments, teams, or even geographical locations. This segmentation allows for the strategic placement of firewalls and other security measures, effectively creating digital borders that can contain an attack and prevent it from spreading across the entire organization. In the event of a breach, this ability to quarantine affected sections can be the difference between a minor incident and a catastrophic failure.

Cluster graph

One powerful approach involves leveraging spectral graph theory. By analyzing the Laplacian matrix of a graph, we can uncover crucial structural properties of the network. Notably, the Fiedler vector, associated with the second-smallest eigenvalue of the Laplacian, is instrumental in partitioning the network into two distinct components. This method, known as spectral clustering, can be further extended to divide the network into multiple clusters, allowing for more nuanced and effective segmentation. For more details, see this article.
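Here is a minimal, dependency-free sketch of the two-way split. It approximates the Fiedler vector by power iteration on c·I − L with the constant eigenvector projected out, then partitions nodes by the sign of their entry. The choice of power iteration (instead of a library eigensolver), the function name, and the toy graph are my assumptions; the sketch also assumes a connected, undirected, unweighted graph.

```python
def fiedler_vector(nodes, edges, iterations=500):
    """Approximate the Fiedler vector of an undirected, connected graph."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    neighbors = [set() for _ in range(n)]
    for u, v in edges:
        neighbors[index[u]].add(index[v])
        neighbors[index[v]].add(index[u])

    # 2 * max degree is an upper bound on the largest Laplacian eigenvalue,
    # so every eigenvalue of (c*I - L) is non-negative.
    c = 2 * max(len(nb) for nb in neighbors)

    x = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        # Project out the constant vector (the eigenvector for eigenvalue 0).
        avg = sum(x) / n
        x = [xi - avg for xi in x]
        # y = (c*I - L) x, with (L x)_i = deg_i * x_i - sum of neighbor values.
        y = [
            c * x[i] - (len(neighbors[i]) * x[i] - sum(x[j] for j in neighbors[i]))
            for i in range(n)
        ]
        norm = sum(value * value for value in y) ** 0.5
        x = [value / norm for value in y]
    return dict(zip(nodes, x))


# Hypothetical network: two tightly knit teams joined by a single link.
nodes = ["a1", "a2", "a3", "b1", "b2", "b3"]
edges = [
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
    ("a3", "b1"),
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"),
]

vector = fiedler_vector(nodes, edges)
side_one = {name for name, value in vector.items() if value < 0}
side_two = set(nodes) - side_one

print(sorted(side_one), sorted(side_two))
```

The sign of the vector is arbitrary, so only the grouping matters, not which side is negative. For large networks, a sparse eigensolver from a numerical library would be the more practical choice.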

Graph cliques

Another approach is to treat communication as a flow network. With this view, one can apply the max-flow min-cut theorem: the maximum flow from a source to a sink equals the total capacity of the smallest set of edges whose removal disconnects them. That minimum cut is a strategic location for installing firewalls. Moreover, knowing the flow provides information on how quickly an attack could spread.
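The sketch below computes a maximum flow with the Edmonds-Karp algorithm (breadth-first augmenting paths) and then reads off a minimum cut from the residual graph. The algorithm choice, the function name, and the toy network with its capacities are assumptions for illustration.

```python
from collections import deque


def min_cut(capacity, source, sink):
    """Return (max_flow_value, sorted list of edges in a minimum cut)."""
    # Build residual capacities, adding reverse edges with capacity 0.
    residual = {source: {}, sink: {}}
    for u, targets in capacity.items():
        for v, cap in targets.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    def reachable_from_source():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if sink not in parent:
            break
        # Walk back from the sink to recover the augmenting path.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

    # Original edges leaving the source side of the final residual graph.
    source_side = set(parent)
    cut = [
        (u, v)
        for u in source_side
        for v in capacity.get(u, {})
        if v not in source_side
    ]
    return flow, sorted(cut)


# Hypothetical network; capacities stand in for link bandwidth.
capacity = {
    "internet": {"web": 10, "vpn": 5},
    "web": {"app": 4},
    "vpn": {"app": 3},
    "app": {"db": 20},
}

flow, cut = min_cut(capacity, "internet", "db")
print(flow)   # maximum flow from internet to db
print(cut)    # candidate firewall locations
```

In this toy network, the two links into the application server form the narrowest point between the internet and the database, so guarding them covers every path with the fewest controls.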

Graph cut

And of course, neural networks are indispensable in data science. Graph Neural Networks (GNNs), which learn by passing messages between neighboring nodes, yield fascinating results here. They have been used to detect anomalies at the node, edge, and path levels, for instance flagging malicious processes at the node level and lateral movement at the edge level. By analyzing the relationships between nodes within a graph, GNNs can reveal suspicious network behaviors that might escape detection by conventional security tools. They are particularly adept at identifying lateral movement, where an attacker moves from one compromised node to another to evade detection.
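To give a feel for what message passing means, here is a single untrained round of neighbor aggregation in plain Python. This is not a working GNN: a real model would add learned weight matrices, nonlinearities, several layers, and training, usually through a dedicated library. The feature vectors and node names are hypothetical.

```python
def message_passing_round(features, neighbors):
    """One round of mean aggregation: each node's new vector is the average
    of its own vector and the mean of its neighbors' vectors."""
    updated = {}
    for node, own in features.items():
        adjacent = neighbors.get(node, [])
        if adjacent:
            aggregated = [
                sum(features[other][k] for other in adjacent) / len(adjacent)
                for k in range(len(own))
            ]
        else:
            aggregated = own  # isolated node keeps its own features
        updated[node] = [(o + a) / 2 for o, a in zip(own, aggregated)]
    return updated


# Hypothetical two-dimensional features per node,
# e.g. [failed-login signal, privileged-access signal].
features = {
    "ws1": [1.0, 0.0],
    "ws2": [0.0, 0.0],
    "srv": [0.0, 1.0],
}
neighbors = {
    "ws1": ["srv"],
    "ws2": ["srv"],
    "srv": ["ws1", "ws2"],
}

updated = message_passing_round(features, neighbors)
print(updated)
```

After one round, each node's representation already mixes in information from its neighbors. Stacking several rounds lets signals travel along multi-hop paths, which is what makes this family of models a natural fit for spotting lateral movement.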

Moreover, GNNs are not just limited to anomaly detection; they are also powerful tools for threat intelligence and vulnerability assessment. By scrutinizing the structure of the network graph, GNNs can pinpoint potential vulnerabilities and weaknesses, enabling security teams to address these risks proactively before attackers can exploit them.

In summary, by leveraging graph theory, we can gain a comprehensive and multi-dimensional view of a company’s operations, allowing us to identify vulnerabilities, optimize defenses, and ultimately strengthen cybersecurity. This approach, though rooted in mathematical theory, offers practical solutions that can be adapted to the unique challenges of the healthcare sector and beyond.

I left the meeting with a firm understanding of why he emphasized that his mathematical background is essential. During the course, I showed my students the importance of using theory to solve practical issues — a connection often missing in many academic studies. I hope this blog post will inspire readers to recognize and adopt this approach.


Mathematician in exile, researching algorithms and machine learning, applying data science, and expanding my ideas.