Exploiting LLMs: Unpacking Excessive Agency in a 6-Step Guide

Zeev Kalyuzhner
Wix Engineering
Published in
4 min readMay 16, 2024

Welcome to the first article in our series about security vulnerabilities in Large Language Models (LLMs).

In this series, we explore different ways attackers can exploit LLMs. Our first article focuses on how attackers can abuse LLM APIs to gain too much control over systems, which can cause problems.

As organizations use LLMs more, the risk of attackers taking advantage of their access increases. In this article, we uncover how attackers exploit LLM APIs, learn their tactics, and understand how to protect against them.

We’ll use insights from PortSwigger Labs and other sources to help you stay informed about LLM security.

Exploiting LLM APIs with Excessive Agency

LLMs offer powerful linguistic capabilities and access to various APIs, functions, and plugins, extending their reach into organizational ecosystems and posing both a risk and an opportunity for innovation.

Photo by Unsplash

Understanding LLM API Dynamics

The integration of LLMs with APIs hinges on the architecture and functionality of the API itself. Typically, organizations grant LLMs access to specific functionalities by delineating local APIs tailored to the model’s requirements. For instance, a customer support LLM might be endowed with APIs for managing user profiles, orders, and inventory.

The workflow for LLM-API integration follows a structured pattern:

1. Client Initiation: The client initiates communication with the LLM by submitting a prompt.

2. LLM Processing: Recognizing the need for API interaction, the LLM responds with a JSON object containing arguments conforming to the API’s schema.

3. Function Invocation: The client invokes the specified function, utilizing the provided arguments.

4. Response Processing: Upon receiving the function’s response, the client processes the data accordingly.

5. Iterative Interaction: Subsequently, the client resumes interaction with the LLM, appending the function response as a new message.

6. External API Invocation: Acting as an intermediary, the LLM calls the external API with the processed data.

7. Result Summarization: Finally, the LLM consolidates the API call results, presenting a summary to the user.

Unveiling the Attack Surface

The term “excessive agency” refers to scenarios in which LLMs have access to APIs that may access private data that malicious actors might use. Through the use of this access, attackers may force LLMs to do tasks outside their intended scope, opening up opportunities for manipulative actions through the model’s APIs.

To illustrate the potency of exploiting LLM APIs with excessive agency, consider the following scenario based on a lab environment:

1. Identify accessible APIs: Begin by querying the LLM to ascertain the APIs it can access. Employ tactics such as providing misleading context if necessary to elicit a comprehensive list of accessible APIs.

2. Probe for vulnerabilities: Dive deeper to discern specific arguments and functionalities of accessible APIs. Query the LLM about API arguments and capabilities to uncover potential avenues for exploitation.

3. Engage the LLM: Initiate a conversation with the LLM, querying it about the APIs at its disposal.

4. Investigate API capabilities: Focus on specific APIs, such as the Debug SQL API, to understand their potential impact. Note functionalities like executing raw SQL commands on the underlying database.

5. Execute malicious commands: Exploit vulnerable APIs by commanding the LLM to perform actions like selecting sensitive data or executing destructive commands.

6. Achieve objective: Through manipulative interactions with the LLM, accomplish the desired outcome, whether it involves data extraction or system compromise.

Exploiting LLM APIs with too much freedom shows how linguistic skill and digital manipulation can be used together. This shows how important strong defenses are in the face of changing threats.

Photo by Unsplash

Acknowledging the Educational Nature of Attack Examples

It is imperative to recognize that all attack examples presented herein are solely for educational purposes. While these scenarios provide valuable insights into potential vulnerabilities and attack vectors, they are not intended for malicious exploitation. Organizations and individuals should make responsible use of the knowledge gained from such examples, adhering to ethical guidelines and principles.

In conclusion, our examination of exploiting LLM APIs with excessive agency illuminates the inherent vulnerabilities within organizational ecosystems. By probing for weaknesses and executing manipulative commands, attackers can exploit LLMs to compromise system integrity and extract sensitive data.

To mitigate these risks, organizations must prioritize robust access controls, comprehensive risk assessments, and ongoing security awareness initiatives. By fortifying defenses and fostering a culture of cybersecurity resilience, organizations can defend against emerging threats and safeguard their digital ecosystems.

Stay tuned for our next installment as we dive deeper into the dynamic realm of LLM security, equipping organizations to adeptly navigate the intricate landscape of modern cybersecurity challenges.

--

--

Zeev Kalyuzhner
Wix Engineering

Ph.D. candidate bridging AI, nanophotonics & cybersecurity. Lecturer @OpenU, Data Scientist @Wix.com. Passionate about practical learning & AI-driven security.