Neo4j + Akamai — Mapping your External attack surface
TL;DR: This post describes an approach to mapping and visualizing software and infrastructure, specifically Akamai endpoints (aka FQDNs) and how they relate to vulnerabilities (e.g. the OpenSSL vulnerability announced last week). Code snippets showing how to do this are here: https://github.com/ccloes/neo4j-akamai
We have used Akamai for many years, but one persistent challenge has been our inability to “map” our infrastructure and how it is routed through Akamai. It is not uncommon for us to have hundreds of Akamai property configs, each with thousands of lines of config and thousands of endpoints. This makes finding and mapping the routes our traffic can take through Akamai challenging, manual, and time consuming. We can now visualize those routes using Neo4j and Bloom.
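As a rough illustration of the mapping problem, the sketch below flattens one property config into hostname-to-origin edges that could then be loaded into a graph. The input shape (a `hostnames` list plus a nested `rules` tree whose `origin` behaviors carry a `hostname` option) is a simplified assumption modeled on Akamai's rule-tree JSON, and every property and host name in it is made up. The code in the linked repository may be structured quite differently.

```python
from typing import Iterator


def iter_origins(rule: dict) -> Iterator[str]:
    """Yield every origin hostname found in a rule and its nested child rules."""
    for behavior in rule.get("behaviors", []):
        if behavior.get("name") == "origin":
            hostname = behavior.get("options", {}).get("hostname")
            if hostname:
                yield hostname
    for child in rule.get("children", []):
        yield from iter_origins(child)


def property_edges(prop: dict) -> list[tuple[str, str, str]]:
    """Return sorted (fqdn, property_name, origin) triples for one property config."""
    origins = sorted(set(iter_origins(prop.get("rules", {}))))
    return [
        (fqdn, prop["name"], origin)
        for fqdn in sorted(prop.get("hostnames", []))
        for origin in origins
    ]


# Hypothetical, heavily trimmed property config for illustration only.
example_property = {
    "name": "www-prod",
    "hostnames": ["www.example.com", "shop.example.com"],
    "rules": {
        "name": "default",
        "behaviors": [{"name": "origin", "options": {"hostname": "origin.example.com"}}],
        "children": [
            {
                "name": "api",
                "behaviors": [{"name": "origin", "options": {"hostname": "api-origin.example.com"}}],
                "children": [],
            }
        ],
    },
}

for fqdn, prop_name, origin in property_edges(example_property):
    print(f"{fqdn} -> {prop_name} -> {origin}")
```

Note that this pairs every hostname with every origin in the property, so it over-approximates: it ignores the match criteria that decide which rule actually applies to a given request. That is acceptable for a first "where could this traffic go" map, but a real ingest would likely want to keep the rule path on each edge.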
Over the past year, we have developed a system that lets us visualize our infrastructure. It is built on Neo4j, a graph database. These visualizations have uncovered insights that were previously difficult to see. Using a graph database has also allowed us to connect related data sets and create what Neo4j calls “Knowledge Graphs”. These graphs let us draw powerful inferences from the relationships between the data sets. From a security perspective, this helps us in many ways:
- Team Attribution (who is responsible for this infrastructure)
- Prioritization (which vulnerabilities are public facing)
- Hygiene (which infrastructure is no longer being used)
Some of the data sets we have are:
- Source code repos and committers
- Akamai property configs and redirect rules
- Cloud resources (e.g. AWS, GCP)
- DNS zones and entries
- Vulnerabilities from security scans
- Org chart data (employee data)
- Compliance frameworks (e.g. NIST, ISO, INDOR, PCI)
- Generic datasets from various datastores (e.g. traditional databases, online storage, flat files)
NOTE: Some of these data sets use ingest methods that we hope to showcase in future Open Source projects in the coming months. Stay tuned.
Tied together with graph relationships, these data sets allow powerful inferences that unlock insights once hidden in siloed data, specifically how our traffic is routed through Akamai and where unused infrastructure might be hiding. Beyond what we have done with Akamai, we can now tie CVEs to source code, and source code to frontend endpoints, to find potential exposures. This is all in service of prioritizing the most critical vulnerabilities first and identifying who should be working on them. We can now do in seconds what previously took engineers hours or days to figure out manually.
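To give a feel for what such a question can look like, here is a minimal sketch of a Cypher query wrapped in a small Python helper. The node labels (`CVE`, `Library`, `Repository`, `Endpoint`, `Team`), relationship types, and property names are all hypothetical and are not taken from our actual schema. The `session` argument is assumed to behave like a session from the official `neo4j` Python driver, whose `run` method accepts a query and a parameter dictionary.

```python
# Hypothetical schema: labels, relationship types and property names are
# illustrative assumptions, not the schema used in the linked repository.
EXPOSURE_QUERY = """
MATCH (c:CVE {id: $cve_id})<-[:HAS_VULNERABILITY]-(lib:Library)
      <-[:DEPENDS_ON]-(repo:Repository)-[:SERVES]->(ep:Endpoint)
OPTIONAL MATCH (repo)-[:OWNED_BY]->(team:Team)
RETURN ep.fqdn AS fqdn, repo.name AS repo, team.name AS team
ORDER BY fqdn
"""


def exposures_for_cve(session, cve_id: str) -> list[dict]:
    """Return one dict per public endpoint served by code that uses a library with the CVE.

    `session` is anything with a neo4j-style `run(query, parameters)` method
    that yields records exposing `.data()`.
    """
    result = session.run(EXPOSURE_QUERY, {"cve_id": cve_id})
    return [record.data() for record in result]
```

With the driver installed, `session` would typically come from `GraphDatabase.driver(uri, auth=(user, password)).session()`. The `OPTIONAL MATCH` keeps endpoints in the result even when no owning team is recorded, which is itself a useful signal for the attribution problem mentioned earlier.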
Part of the reason for this post is to share a small sample of how to do this ingestion and visualization, in case others are in a similar situation. It is also a small taste of some of the future open source contributions we will be sharing using Neo4j. With issues like Heartbleed and the more recent OpenSSL vulnerabilities (CVE-2022-3786, CVE-2022-3602) published this month, it seems even more important to share this in the hope that it makes it easier for others to find where these issues may lie inside their infrastructure.
One of the powerful insights we can now infer is which infrastructure or code may be vulnerable to a CVE. For instance, the following visualization shows all the dependencies of a software library which may have a CVE (e.g. example CVE-20) according to our security scan tools.
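The idea behind that kind of view can be sketched without a database at all. The snippet below assumes the question being asked is "what depends, directly or transitively, on the flagged library", and answers it with a breadth-first walk over a small, entirely made-up dependency map. It is an illustration of the traversal, not the code that produced our visualization.

```python
from collections import deque


def affected_by(vulnerable: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Return every component that depends, directly or transitively, on `vulnerable`."""
    # Invert the edges so we can walk from a library to the things that use it.
    dependents: dict[str, set[str]] = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    seen: set[str] = set()
    queue = deque([vulnerable])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            # Skip anything already visited so cycles cannot loop forever.
            if parent not in seen and parent != vulnerable:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical dependency map: component -> libraries it depends on.
depends_on = {
    "checkout-service": {"http-client", "payments-sdk"},
    "http-client": {"openssl"},
    "payments-sdk": {"http-client"},
    "docs-site": {"markdown-lib"},
}

print(sorted(affected_by("openssl", depends_on)))
```

In a graph database the same walk is usually a single variable-length pattern match (in Cypher, something along the lines of `-[:DEPENDS_ON*]->` with a hypothetical relationship type), which is a large part of why this kind of question becomes quick to ask.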
You can find the Neo4j and Akamai code snippets here: https://github.com/ccloes/neo4j-akamai (Pull Requests welcome)
Come to the San Diego Graph DB User Group session Thursday Nov 10 (Sign up here).
Authors:
Chad Cloes, Zach Probst, Bryan Norman, Gabe Gallagher, Erick Lee
Special thanks to:
Daniel Kang, Nima Imani, Arshad Husain (PMP, CSM, Cybersecurity, Ethical Hacking), Michael Astua Mora