Climbing the DIKW Pyramid — Securing Knowledge in the AI Age

DNX Ventures
Published in DNX Ventures Blog · 5 min read · Jul 15, 2024

The rapid advancement of artificial intelligence, especially large language models (LLMs), has ushered in a new era, redefining how we think about cybersecurity and data privacy. At DNX Ventures’ Japan-US Cybersecurity Summit, Kris Harms, our cybersecurity-focused Principal, hosted Sounil Yu, Chief AI Security Officer and co-founder of Knostic. Sounil provided invaluable insights into this ongoing evolution, introduced a knowledge-centric (versus data-centric) mental model, and offered guideposts for navigating this paradigm shift to harness the power of AI in the enterprise.

Sounil Yu’s cybersecurity credentials are nothing short of impressive. The Cyber Defense Matrix and the DIE Triad, which Sounil created, are reshaping approaches to cybersecurity. He’s a board member of the FAIR Institute, a fellow at GMU Scalia Law School’s National Security Institute, a guest lecturer at Carnegie Mellon, and an advisor to many startups. Sounil previously served as the CISO at JupiterOne, CISO-in-Residence at YL Ventures, and Chief Security Scientist at Bank of America.

Sounil has a knack for creating and interpreting models that help us understand and mitigate risks inherent in new technology. In this session, Sounil discussed emerging security risks within generative AI and LLMs through the lens of the DIKW (Data, Information, Knowledge, Wisdom) pyramid.

The DIKW Pyramid

The DIKW pyramid represents a hierarchy of understanding, with each level adding increasing value and context. Data, the foundational layer, represents raw facts and figures. Information describes that data within a particular context. Knowledge emerges when information is synthesized, patterns are recognized, and principles are applied. Finally, Wisdom represents the ability to understand fundamental truths and make sound judgments.
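
To make the hierarchy concrete, here is a minimal sketch in Python. The medical-readings scenario, threshold, and variable names are illustrative assumptions, not examples from Sounil’s talk.

```python
from statistics import mean

# Data: raw facts and figures with no context.
readings = [38.5, 39.1, 38.8]

# Information: the same data described within a particular context.
information = {"patient": "A-101", "metric": "body temperature (C)", "values": readings}

# Knowledge: information synthesized, a pattern recognized, a principle applied.
FEVER_THRESHOLD_C = 38.0  # illustrative rule of thumb, not clinical guidance
has_fever = mean(information["values"]) > FEVER_THRESHOLD_C

# Wisdom (judging what to actually do about the fever) remains the hard part.
print(f"Patient {information['patient']} shows fever: {has_fever}")
```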

AI Safety vs. AI Security

Sounil shared the insight that future AI safety and AI security challenges may vastly overshadow our current cybersecurity issues. While the current discourse is mostly about AI security, the real priority must be building safe AI systems. He argued that it’s better to have “AI that is safe but insecure” than “AI that is secure but unsafe”: a secure system means little if the AI itself is unsafe, which highlights the critical need for a foundational focus on AI safety as we advance these technologies.

In the context of Safe AI, Sounil emphasized that the critical aspects of this tectonic technology transformation revolve around newer concepts: Knowledge Quality, Knowledge Security, and Knowledge Privacy. If we replaced “Knowledge” with “Data”, we’d know those terms well, but Sounil helped us take them one step further for the age of LLMs.

Data Quality vs Knowledge Quality

We, as industry experts, are pretty familiar with the concept of Data Quality and data quality issues, but Knowledge Quality is a new term, applying the same discipline one layer up the DIKW pyramid.

Knowledge Quality refers to the accuracy, reliability, and integrity of the insights generated by AI systems. As AI models process and analyze vast datasets, ensuring high-quality knowledge is crucial for making informed and trustworthy decisions. Poor knowledge quality can lead to misinformation and flawed conclusions (often called hallucinations), which can have serious implications. Maintaining Knowledge Quality involves implementing rigorous validation processes, ensuring sufficient data quality, and applying appropriate governance. But high knowledge quality is only part of the challenge.
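
As one illustration of what such a validation process might look like, here is a minimal sketch of a grounding gate that refuses to surface an answer unless each of its sentences overlaps the cited source material. The token-overlap heuristic, the 0.6 threshold, and the function names are assumptions for illustration, not a production grounding check.

```python
def token_overlap(claim: str, source: str) -> float:
    """Fraction of the claim's tokens that also appear in the source."""
    claim_tokens = set(claim.lower().split())
    source_tokens = set(source.lower().split())
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & source_tokens) / len(claim_tokens)

def is_grounded(answer: str, sources: list[str], threshold: float = 0.6) -> bool:
    # Every sentence in the answer must be supported by at least one source.
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    return all(
        any(token_overlap(sentence, src) >= threshold for src in sources)
        for sentence in sentences
    )

sources = ["Q3 revenue grew 12% year over year, driven by the cloud unit."]
print(is_grounded("Q3 revenue grew 12% year over year.", sources))        # True
print(is_grounded("Q3 revenue declined sharply due to churn.", sources))  # False
```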

Knowledge Privacy, an Introduction

Drawing from his extensive experience at Bank of America and in the VC world, Sounil advocated for another mindset shift. Rather than myopically focusing on data privacy, we need to prioritize something he called “Knowledge Privacy”.

Knowledge privacy refers to the protection of aggregated insights and inferences drawn from data, rather than just the individual data points themselves. This concept is particularly important in the context of AI, where systems can analyze and predict complex behaviors and preferences. The introduction of Facebook’s social graph years ago foreshadowed this shift: it enabled the discovery of deeply personal insights (voting preferences, sexual orientation, buying habits) that extend far beyond traditional data privacy concerns like national IDs, email addresses, or birthdays.
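
A toy sketch makes the point that the aggregate, not any single record, is the exposure. The purchase data and the co-occurrence rule below are hypothetical illustrations, not anyone’s real model.

```python
# Each purchase below is individually innocuous, and none would trip a
# data-level privacy control (no name, no national ID, no birthday).
purchases = ["prenatal vitamins", "unscented lotion", "cotton balls"]

# Yet a simple co-occurrence rule infers something deeply personal.
# This inferred attribute is the knowledge that needs protecting.
SIGNAL_ITEMS = {"prenatal vitamins", "unscented lotion", "cotton balls"}

likely_pregnant = len(SIGNAL_ITEMS.intersection(purchases)) >= 3
print(f"Inferred sensitive attribute: pregnancy={likely_pregnant}")
```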

A breach of knowledge privacy can be far more intrusive than mere data privacy, touching on the core of our personal experiences and relationships. Our current laws and focus are misaligned, targeting data privacy when the more critical issue lies at the knowledge level. This realization calls for a reevaluation of privacy regulations to address the deeper and more sensitive aspects of knowledge privacy in the AI landscape.

Climbing the DIKW Pyramid for Knowledge Security

So how do we climb the DIKW pyramid securely? Sounil stressed the importance of contextual understanding — knowing what information a particular person or entity truly needs access to, rather than relying solely on rigid access controls. Enterprises must develop “same-layer controls” that can intelligently filter and fragment knowledge, not information, at a granular level for different audiences.

Sounil explored the concept of “knowledge security” and the necessity for controls at the knowledge level. Enterprise search, for instance, lets users efficiently find information about ongoing projects or initiatives. But it can also expose sensitive insights, such as potential layoffs, mergers, or personal details about colleagues, either through direct discovery (like stumbling on a spreadsheet full of sensitive data you didn’t know you had access to) or through inference across a collection of individually innocuous documents. This is a knowledge security problem: not everyone has a need to know certain information.

“The prevailing approach of restricting the data provided to LLMs fails to address the deeper issue of securing knowledge itself.”

Restricting the data provided to LLMs, the prevailing approach today, fails to address the deeper issue of securing knowledge itself. Instead, we need robust knowledge-level controls to manage who can access specific types of insight, ensuring that valuable knowledge can still be leveraged without compromising what is sensitive. This nuanced control is essential for maintaining both the utility and security of AI systems in enterprise environments.
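
To make a knowledge-level control concrete, here is a minimal sketch that screens answers against a requester’s need-to-know topics rather than document permissions alone. The roles, topic labels, and keyword classifier are hypothetical stand-ins; a real deployment would use a trained topic classifier and a policy engine.

```python
# Topics each role legitimately needs to know (hypothetical policy).
NEED_TO_KNOW = {
    "engineer": {"projects", "architecture"},
    "hr_partner": {"projects", "compensation", "reorg"},
}

# Stand-in for a real topic classifier: naive keyword matching.
SENSITIVE_KEYWORDS = {
    "reorg": ["layoff", "restructuring", "headcount reduction"],
    "compensation": ["salary", "bonus", "equity grant"],
}

def infer_topics(answer: str) -> set[str]:
    text = answer.lower()
    return {
        topic
        for topic, words in SENSITIVE_KEYWORDS.items()
        if any(word in text for word in words)
    }

def screen_answer(answer: str, role: str) -> str:
    # Withhold the answer if it touches topics outside the role's need-to-know.
    blocked = infer_topics(answer) - NEED_TO_KNOW.get(role, set())
    if blocked:
        return f"[Withheld: requires need-to-know for {sorted(blocked)}]"
    return answer

draft = "The Q4 plan includes a headcount reduction in the platform org."
print(screen_answer(draft, "engineer"))    # withheld at the knowledge layer
print(screen_answer(draft, "hr_partner"))  # returned; within need-to-know
```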

In essence, the DIKW model can be an effective template to help us understand the new dimensions of risk associated with AI. By prioritizing AI safety and knowledge security, innovative enterprises and startups can stay ahead of the curve and harness AI safely and responsibly.

As we rapidly scale the DIKW pyramid powered by AI, Sounil’s insights serve as a critical guidepost. The ascent will be challenging, but the view from the top — a world with advanced, secure, and trustworthy AI capabilities — makes it all worthwhile.
