June 13, 2026
AI / LLM Software Security: Part 5
This is the fifth (and final) post in my “AI / LLM Software Security Series”.
Robert Broeckelmann
LLM09:2025 Misinformation
LLM09:2025 "Misinformation" in the OWASP Top 10 for LLM Applications 2025 focuses on a core (if not primary) weakness of generative AI systems. LLMs can produce false, misleading, or fabricated information that sounds authoritative and believable, but is utter horse sh*t.
OWASP treats this as both:
- A security problem; and
- A business reliability problem
The danger is not just that the model is wrong; it is that users, systems, and organizations may trust and act on the incorrect output.
LLMs generate text probabilistically.
They do not:
- Verify truth
- Understand facts
- Reason like humans
- Inherently distinguish reality from fiction
Instead, they predict statistically likely sequences of words.
That means that while sounding extremely confident, they can generate:
- Fabricated citations
- Fake legal cases
- Incorrect medical advice
- Invented APIs
- Nonexistent code libraries
- Misleading summaries
I'll continue to refer to this as "utter horse sh*t".
OWASP refers to this broadly as misinformation. Hallucinations are one major cause of this misinformation. OWASP distinguishes between:
- Hallucinations: Fabricated or incorrect content.
- Misinformation: The broader operational risk created when false outputs are consumed as truth.
This is an important distinction. Misinformation does not require malicious intent and it does not necessarily require an attacker. Sometimes the system simply fails because:
- The model guessed wrong
- The user trusted it; and
- The application lacked safeguards
This becomes a security problem because incorrect AI output can lead to:
- Legal liability
- Compliance failures
- Financial loss
- Unsafe medical guidance
- Reputational damage
- Insecure code deployment
- Operational failures
- Dangerous business decisions
OWASP emphasizes that the vulnerability often exists in the surrounding application architecture, not just the model itself. The real problem arises when systems automatically trust, store, display, or execute LLM-generated content without verification.
Common Examples
There are some absolutely hilarious examples from the headlines. We've already seen a couple of those in this series.
Fabricated Facts
Fabricated facts are statements that are presented as factual but are actually invented by the model. This phenomenon is commonly called a hallucination.
A fabricated fact can range from a minor error to a completely fictional piece of information, such as:
- Inventing a research paper that does not exist.
- Claiming a person held a job they never held.
- Generating fake legal citations or court cases.
- Creating nonexistent APIs, software functions, or configuration options.
- Providing incorrect historical dates, statistics, or references with high confidence.
The model confidently invents:
- Statistics
- Events
- Legal precedents
- Technical details
- References
The key point is that an LLM is not inherently a database of verified knowledge. It is a probabilistic text-generation system that predicts what text is likely to come next. When generating a response, the model may:
- Recognize a pattern in its training data.
- Attempt to complete the pattern.
- Produce something that looks plausible.
- Fail to distinguish between "plausible" and "true."
A famous example involved lawyers submitting fake AI-generated legal citations to court.
No matter how many times people are left looking ridiculous in these situations, there's always a next, lazy person who couldn't bother to check the information. Don't put yourself in that situation.
Incorrect Customer Guidance
Incorrect Customer Guidance refers to situations where an LLM provides advice, instructions, recommendations, or decisions that are wrong, misleading, incomplete, unsafe, or inappropriate for the user's actual circumstances. Unlike a simple factual hallucination, incorrect customer guidance focuses on the action the user takes as a result of the model's output.
This is different from fabricated facts. A fabricated fact is stating "The filing deadline is April 30." when the actual deadline is different. Incorrect customer guidance is "You don't need to file this form at all." even if all of the underlying facts were technically correct. The key issue is that the model's recommendation leads the user toward a bad outcome.
This happens because LLMs are optimized to generate plausible responses, not to:
- Understand legal liability
- Verify current policies
- Assess risk
- Consider organizational procedures
- Account for all relevant context
As a result, they may provide guidance that sounds reasonable but is operationally incorrect.
AI customer-service systems may:
- Invent refund policies
- Provide false travel rules
- Misstate account information
Recall the well-known case where an airline chatbot provided incorrect refund information, contributing to legal consequences.
Unsafe Code Generation
Unsafe code generation occurs when an LLM produces software code that contains security vulnerabilities, insecure design patterns, weak configurations, or dangerous operational practices.
The code may compile, pass tests, and appear correct, yet still introduce exploitable weaknesses into an application or infrastructure environment.
This is one of the most significant security risks associated with AI-assisted software development because developers often trust code that looks professional and functional.
Coding assistants may:
- Invent nonexistent packages
- Suggest insecure libraries
- Generate vulnerable code
- Recommend dangerous configurations
That vulnerable code may come in the form of SQL injection, command injection, hard-coded secrets, weak authentication, insecure cryptography, missing authorization checks, deserialization vulnerabilities, cross-site scripting (XSS), etc. We looked at many of these, briefly, earlier in this series. Some I've covered in other blog posts, and maybe I'll get to the rest in the future. The point of this post is not to detail every possible way an LLM can produce insecure / unsafe code.
The LLM can produce vulnerable code because:
- Training data contains vulnerable code: The internet contains old tutorials, vulnerable examples, Stack Overflow snippets, and legacy codebases. The model learns from both good and bad patterns. This just reinforces the notion that the internet has a lot of crap.
- Functional Correctness vs Security: LLMs are optimized to generate code that appears to solve a problem. Security is often secondary.
- Missing Threat Modeling: The model does not naturally think like an attacker. A human security engineer might ask: What if the input is malicious? What if authorization fails? What if this API is abused? The model often does not.
- Outdated Knowledge: Security best practices evolve rapidly. Cryptographic recommendations change. Authentication standards evolve. Framework protections improve. The model may reproduce practices that were acceptable years ago but are now discouraged.
Researchers have repeatedly found LLM-generated code containing:
- SQL injection vulnerabilities
- Cross-site scripting flaws
- Weak cryptography
- Authentication bypasses
- Insecure cloud configurations
Several studies have shown that developers frequently accept insecure AI-generated code because:
- It looks polished.
- It compiles successfully.
- It solves the immediate problem.
This can create a false sense of confidence.
Unsafe code generation can lead to:
- Data breaches
- Remote code execution
- Account compromise
- Privilege escalation
- Cloud compromise
- Compliance violations
At scale, an organization using AI coding assistants may unintentionally propagate the same insecure pattern across dozens or hundreds of applications.
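To make one of the vulnerability classes above concrete, here is a minimal sketch of SQL injection using Python's built-in sqlite3 module. The schema and function names are hypothetical, made up for illustration. The first lookup function is the kind of code that looks polished and works on the happy path; the second uses a parameterized query.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # User input is pasted straight into the SQL text,
    # so the input can rewrite the query itself.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the value is passed separately from the SQL text.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"
leaked = find_user_unsafe(conn, payload)   # matches every row in the table
blocked = find_user_safe(conn, payload)    # matches nothing
```

Both functions behave identically for a normal name like "alice", which is exactly why the unsafe version survives a casual review.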
Medical and Healthcare Risks
Medical and healthcare applications are among the highest-risk uses of Large Language Models because errors can directly affect patient health, treatment decisions, and clinical outcomes. Unlike many business applications where a mistake might be inconvenient or costly, mistakes in healthcare can lead to injury or death.
Healthcare information is different. Healthcare combines several challenging characteristics:
- High consequences for errors
- Complex and evolving medical knowledge
- Significant regulatory requirements
- Patient-specific decision making
- Large amounts of sensitive information
An LLM may sound knowledgeable and authoritative while lacking the ability to verify whether its recommendations are medically appropriate for a particular patient. The LLM may:
- Overstate certainty
- Provide unsupported treatments
- Misrepresent medical consensus
- Fabricate clinical information
In high-risk domains, misinformation can directly impact safety.
One of the biggest dangers is that the LLM's confident tone creates perceived authority. Users often assume that fluent answers mean accurate answers. OWASP calls this over-reliance.
OWASP repeatedly emphasizes the human factor: people trust confident AI output too easily. This creates a "trust trap":
- The model sounds competent
- Users stop verifying
- Organizations automate decisions
- Errors propagate into real systems
Misinformation therefore becomes:
- Partly a technical issue
- Partly a UX/governance issue
Agentic AI Systems Increase the Risk
As AI systems become more autonomous, misinformation can feed directly into:
- Workflows
- Business logic
- APIs
- Code pipelines
- Infrastructure systems
- Operational decisions
A hallucinated answer is bad, but a hallucinated answer automatically acted upon by software is much worse.
And, bam, we have all the necessary preconditions for Skynet. Joke. Or, is it?
Recommended Mitigations
As always, layered defenses are recommended.
Retrieval-Augmented Generation (RAG)
We've already discussed RAG in previous posts in this series.
Retrieval-Augmented Generation (RAG) is often presented as a solution to hallucinations, but in practice it introduces an entirely new attack surface. A poorly secured RAG system can become more dangerous than a standalone LLM because attackers can influence what information the model retrieves and trusts.
A useful mindset is to treat your retrieval layer as an untrusted input source, even when the data originated inside your organization. Always:
Curate and Trust Your Data Sources: The most important security control is deciding what gets into the knowledge base. Otherwise, you are creating opportunities for data poisoning.
Defend Against Prompt Injection in Retrieved Content: One of the biggest RAG risks is indirect prompt injection. The model should be instructed to use retrieved content as reference material, not executable instructions. Many modern agent frameworks now implement this separation because indirect prompt injection has become one of the most common LLM attack vectors.
Enforce Authorization Before Retrieval: A common mistake is to retrieve documents and filter results afterward. Instead, determine user permissions, apply permissions during retrieval, and only retrieve documents the user may access. Otherwise you risk data leakage, cross-tenant exposure, and confidential document disclosure.
Use Metadata Filtering: Strong metadata filtering significantly reduces retrieval risk. For example, every document should be tagged with department, classification level, data owner, project, customer account, and region. Then, search queries should include specific values for each of those tags rather than generic document searches.
Minimize Retrieved Context: More context is not always better. Large context windows increase attack surface, prompt injection opportunities, information disclosure risk, and cost. Retrieve only what is necessary.
Validate Retrieved Content: Treat retrieved documents like user input. Perform content scanning, classification, Data Loss Prevention (DLP) checks, malware scanning, and pattern detection. Look for prompt injection attempts, embedded instructions, secrets, credentials, tokens, PII, among other things, before content reaches the model.
Protect Sensitive Information: Many organizations accidentally build "Ask me anything about our company" systems that expose confidential information. Classify content before indexing; you know, that whole data classification standard most organizations have, but never seem to get around to implementing. For example, use Public, Confidential, and Restricted data classifications.
Log Retrieval Events: Most organizations log prompts but forget retrieval activity. Your system should log the user, query, retrieved documents, similarity scores, and the generated response. This allows incident investigation, compliance review, and abuse detection. Without retrieval logs, debugging a bad answer can be nearly impossible.
Monitor for Data Poisoning: Data poisoning occurs when malicious content enters the corpus (love that word). This could come in the form of compromised wiki pages, malicious pull requests, altered documentation, or insider threats. An organization should monitor for sudden content changes, unusual embedding updates, new instruction-like language, and unauthorized document additions.
Use Source Attribution: Every answer should ideally include source document, document version, timestamp, and confidence indicators. Users should know where the document came from. This improves trust and simplifies verification.
Limit Agent Capabilities: A dangerous pattern is RAG + Tool Access + Autonomous Actions. If retrieved content can influence database updates, cloud administration, email sending, or financial transactions, prompt injection becomes much more severe. Apply least privilege to all connected tools.
Separate Public and Private Knowledge Bases: Avoid combining public internet content, internal documentation, and customer data into one vector database. Instead use separate retrieval domains (a logical boundary that defines which collection of knowledge can be searched for a particular query). For example, have a separate retrieval domain for "Public Knowledge Base", "Internal Knowledge Base", and "Customer-Specific Knowledge Base"— each with independent access controls.
Continuously Test the System: Perform security testing such as prompt injection testing, data poisoning exercises, access control validation, cross-tenant leakage testing, and retrieval manipulation testing. Many organizations red-team the model but never test the retrieval layer, even though the retrieval layer often presents the larger risk.
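Two of the points above, enforcing authorization before retrieval and using metadata filtering, can be sketched together. This is a toy, in-memory illustration and not how any particular vector database works. The Doc and User types, the tag names, and the keyword-overlap "relevance" score are all assumptions made up for the example. The part that matters is the order of operations: permissions are applied first, and ranking only ever sees documents the user may access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    department: str
    classification: str  # e.g. "public", "confidential", "restricted"


@dataclass(frozen=True)
class User:
    name: str
    departments: frozenset
    clearances: frozenset


def retrieve(user: User, query: str, corpus: list, top_k: int = 3) -> list:
    """Apply permissions first, then rank only what the user may see."""
    allowed = [
        d for d in corpus
        if d.department in user.departments
        and d.classification in user.clearances
    ]
    terms = set(query.lower().split())
    # Toy relevance score: number of query terms present in the document.
    scored = [(len(terms & set(d.text.lower().split())), d) for d in allowed]
    ranked = sorted(
        (pair for pair in scored if pair[0] > 0),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [d for _, d in ranked[:top_k]]


corpus = [
    Doc("hr-1", "salary bands for engineering", "hr", "restricted"),
    Doc("eng-1", "engineering onboarding guide", "engineering", "public"),
]
dev = User("dana", frozenset({"engineering"}), frozenset({"public"}))
results = retrieve(dev, "engineering salary", corpus)
```

The HR document is the better match for the query, but it never reaches the ranking step for this user, so it cannot leak into the model's context.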
To do RAG safely, ground outputs in:
- Verified documents
- Trusted databases
- Authoritative sources
This reduces hallucination risk.
Not everything belongs in a vector database.
Citation and Source Verification
Citation and source verification are among the most important controls for reducing misinformation and hallucinations in LLM applications. While modern models can generate highly convincing responses, they often cannot inherently distinguish between information that is verified and information that is merely plausible. Citation and source verification mechanisms help bridge this gap by grounding outputs in identifiable evidence.
Citation and source verification refers to the process of:
- Identifying the sources used to generate an answer.
- Presenting those sources to users.
- Validating that cited sources actually exist.
- Confirming that the cited sources support the claims being made.
- Detecting fabricated or misleading citations.
The goal is to answer not only what-did-the-model-say?, but also how-do-we-know-this-is-true?
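The "validate that cited sources exist" and "confirm the source supports the claim" steps above can be sketched roughly as follows. This assumes the application asks the model to return each citation as a source ID plus a quoted passage; that structure, the Citation type, and the status strings are my assumptions, not something OWASP prescribes. Exact substring matching is also a deliberate simplification, since it will not catch paraphrased claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    source_id: str
    quote: str  # the passage the answer claims to rely on


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return " ".join(text.lower().split())


def verify_citations(citations: list, sources: dict) -> dict:
    """Return a status per citation: 'ok', 'unknown_source', or 'quote_not_found'."""
    report = {}
    for c in citations:
        if c.source_id not in sources:
            report[c.source_id] = "unknown_source"     # possibly fabricated
        elif normalize(c.quote) not in normalize(sources[c.source_id]):
            report[c.source_id] = "quote_not_found"    # source exists, claim unsupported
        else:
            report[c.source_id] = "ok"
    return report


sources = {"policy-7": "Refunds are available within 30 days of purchase."}
citations = [
    Citation("policy-7", "refunds are available within 30 days"),
    Citation("policy-7b", "refunds are available within 90 days"),
]
report = verify_citations(citations, sources)
```

Here the second citation points at a source ID that was never retrieved, so it gets flagged instead of being shown to the user as if it were evidence.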
Traditional search engines provide links that allow users to inspect the underlying information.
By contrast, without showing the supporting evidence, LLMs often produce:
- Synthesized answers
- Summaries
- Recommendations
- Explanations
This creates several risks:
- Hallucinated facts
- Fabricated references
- Outdated information
- Unsupported conclusions
- User overreliance on model outputs
Source verification introduces accountability into the generation process. Require:
- Citations
- Provenance
- Source attribution
- Factual grounding
Users should be able to verify claims.
The Machine Learning — Bill of Materials (ML-BOM) and the Artificial Intelligence — Bill of Materials (AI-BOM) specs should be able to help with this. Though, that requires a lot of work that many organizations just aren't going to put in.
Human Oversight
Keep humans involved for anything important, including, but not limited to:
- Medical
- Legal
- Financial
- Safety-critical decisions
OWASP strongly discourages fully autonomous trust in high-impact contexts. Which is a rather fancy way of saying keep humans in the loop as a sanity check for important things. Or, anything else where being wrong leaves you looking like an idiot.
Confidence and Validation Layers
A confidence layer is a system that sits between the LLM and the end user, evaluating whether a response is trustworthy enough to return, whether it needs qualification, or whether it should be escalated to another source.
A model can be extremely confident and completely wrong.
Therefore, a confidence layer should measure evidence quality, retrieval quality, and verification success, not merely the model's token probabilities.
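As a rough sketch of that idea, the function below decides what to do with a response based on retrieval quality and citation verification results, and ignores the model's own confidence entirely. The Evidence fields, the thresholds, and the three outcome names are assumptions chosen for illustration; a real system would need to calibrate them against its own data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    top_similarity: float      # best retrieval score, assumed to be in [0, 1]
    citations_checked: int
    citations_verified: int


def decide(e: Evidence, answer_floor: float = 0.75, hedge_floor: float = 0.5) -> str:
    """Return 'answer', 'answer_with_caveat', or 'escalate'."""
    if e.citations_checked == 0:
        return "escalate"  # nothing backs the response
    verified_ratio = e.citations_verified / e.citations_checked
    # Take the weaker of the two signals: strong retrieval cannot
    # compensate for citations that failed verification, and vice versa.
    score = min(e.top_similarity, verified_ratio)
    if score >= answer_floor:
        return "answer"
    if score >= hedge_floor:
        return "answer_with_caveat"
    return "escalate"
```

For example, `decide(Evidence(0.9, 4, 2))` lands on "answer_with_caveat": retrieval looked good, but only half of the citations checked out.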
The validation layer is one of the most important — and often overlooked — components of a production LLM application. It acts as a security and quality control boundary between the model and the outside world. Without validation, an LLM application is effectively trusting probabilistic text generation to drive business processes, make decisions, access data, and invoke tools. The validation layer is responsible for ensuring that:
- Inputs are safe
- Outputs meet expectations
- Retrieved content is trustworthy
- Tool calls are appropriate
- Business rules are enforced
- Responses satisfy security requirements
Applications should:
- Validate outputs
- Apply business rules
- Cross-check facts
- Detect contradictions
- Reject low-confidence responses
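Here is a minimal sketch of the "validate outputs" and "apply business rules" steps, assuming the application asks the model for a JSON object describing a customer-service action. The action names and the refund ceiling are hypothetical business rules invented for the example.

```python
import json

MAX_REFUND = 200.00                                   # hypothetical business rule
ALLOWED_ACTIONS = {"refund", "escalate", "no_action"}


def validate_model_output(raw: str) -> dict:
    """Parse and check a model response; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    action = data.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected action: {action!r}")
    if action == "refund":
        amount = data.get("amount")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            raise ValueError("refund amount must be a number")
        if not 0 < amount <= MAX_REFUND:
            raise ValueError("refund amount violates business rule")
    return data
```

The point of the design is that the model only proposes an action. Deterministic code, not the model, decides whether that action is allowed to happen.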
Clear Risk Communication
UI design should:
- Indicate uncertainty
- Label AI-generated content
- Discourage blind trust
- Encourage verification
Give the user the opportunity to understand that only an idiot would trust this information.
Monitoring and Evaluation
Continuously measure:
- Hallucination rates
- Factual accuracy
- Citation correctness
- Unsafe outputs
- Regression over time
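A small sketch of what measuring these could look like, assuming you already have a labeled evaluation set and some way (human or automated) to grade each answer. The EvalResult fields, the metric names, and the tolerance are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    correct: bool            # answer matched the reference
    citations_total: int
    citations_valid: int


def summarize(results: list) -> dict:
    """Aggregate accuracy and citation correctness over one evaluation run."""
    total = len(results)
    cited = sum(r.citations_total for r in results)
    valid = sum(r.citations_valid for r in results)
    return {
        "accuracy": sum(r.correct for r in results) / total if total else 0.0,
        "citation_correctness": valid / cited if cited else 0.0,
    }


def regressed(baseline: dict, current: dict, tolerance: float = 0.02) -> list:
    """List the metrics that dropped by more than `tolerance` since the baseline."""
    return [k for k in baseline if baseline[k] - current.get(k, 0.0) > tolerance]


last_week = summarize([EvalResult(True, 2, 2), EvalResult(True, 1, 1)])
this_week = summarize([EvalResult(True, 2, 1), EvalResult(False, 2, 2)])
dropped = regressed(last_week, this_week)
```

Running the same evaluation set after every model, prompt, or knowledge-base change is what turns "regression over time" from a bullet point into something you can alert on.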
LLM09 Summary
LLM09 highlights a fundamental reality of generative AI: fluency is not the same as truth. Traditional software usually failed visibly:
- Crashes
- Errors
- Exceptions
LLMs fail differently:
- Smoothly
- Persuasively
- Confidently
That makes misinformation uniquely dangerous because users often cannot easily distinguish accurate output from fabricated output.
LLM10:2025 Unbounded Consumption
LLM10:2025 "Unbounded Consumption" in the OWASP Top 10 for LLM Applications 2025 focuses on a rapidly growing class of AI security and operational risks: Attackers — or poorly designed systems — can force AI systems to consume excessive compute, tokens, memory, API calls, time, or money.
This is essentially the LLM-era evolution of:
- Denial of Service (DoS)
- Resource exhaustion
- Abuse of metered cloud infrastructure
But AI systems introduce new dimensions:
- Token costs
- Inference costs
- Recursive agents
- Tool chains
- Vector retrieval
- Model extraction attacks
LLM inference is computationally expensive. Every request may consume:
- GPU time
- API quotas
- Cloud compute
- Vector searches
- Tool invocations
- Memory
- Tokens
If applications fail to enforce limits, attackers can trigger:
- Runaway costs
- Degraded performance
- Service outages
- Infrastructure instability
OWASP defines this as uncontrolled and excessive inference consumption.
This Is Different from Traditional Denial of Service (DoS)
Traditional DoS attacks mostly targeted:
- Bandwidth
- CPU
- Memory
- Network sockets
LLM systems introduce entirely new resource dimensions:
- Token generation
- Context window expansion
- Retrieval depth
- Agent recursion
- Tool-call chains
- Reasoning loops
- Embedding operations
- Pay-per-token billing
An attacker may not even need huge traffic volumes.
A few carefully crafted prompts can become extremely expensive.
Common Attack Types
Denial of Wallet (DoW)
Denial of Wallet (DoW) is one of the most important new concepts.
Attackers intentionally generate excessive API costs by:
- Triggering long outputs
- Repeatedly invoking expensive models
- Causing recursive workflows
- Maximizing token usage
This results in:
- Enormous cloud bills
- Exhausted API quotas
- Financial damage
A Denial of Wallet (DoW) attack is the financial equivalent of a traditional Denial of Service (DoS) attack. Instead of trying to make a system unavailable, the attacker attempts to drive up operational costs by forcing an application to consume excessive billable resources.
In the context of LLM applications, Denial of Wallet has become one of the most significant emerging threats because AI workloads are often metered by:
- Tokens
- API requests
- Model invocations
- Embedding generation
- Vector searches
- Tool executions
- GPU time
- Agent actions
The attacker doesn't necessarily care if the application crashes. The goal is to make the victim pay.
Traditional web applications have relatively predictable costs.
An LLM application may perform:
1 User Request
↓
Multiple LLM Calls
↓
Vector Searches
↓
Embedding Generation
↓
Tool Calls
↓
Web Searches
↓
Agent Loops
↓
Response
A single prompt can generate substantial cloud spend.
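Some back-of-the-envelope arithmetic shows how quickly that fan-out adds up. Every price and count below is a made-up illustration value, not any provider's real pricing, and the cost model (tokens plus a flat fee per tool call) is a simplification.

```python
# All prices and counts here are made-up illustration values, not real pricing.
PRICE_PER_1K_INPUT_TOKENS = 0.003
PRICE_PER_1K_OUTPUT_TOKENS = 0.015
PRICE_PER_TOOL_CALL = 0.01


def request_cost(llm_calls: int, input_tokens_per_call: int,
                 output_tokens_per_call: int, tool_calls: int) -> float:
    """Estimate the spend triggered by one user request."""
    token_cost = llm_calls * (
        input_tokens_per_call / 1000 * PRICE_PER_1K_INPUT_TOKENS
        + output_tokens_per_call / 1000 * PRICE_PER_1K_OUTPUT_TOKENS
    )
    return token_cost + tool_calls * PRICE_PER_TOOL_CALL


# One plain chat turn versus one request that fans out through an agent.
simple = request_cost(llm_calls=1, input_tokens_per_call=500,
                      output_tokens_per_call=300, tool_calls=0)
agentic = request_cost(llm_calls=40, input_tokens_per_call=20_000,
                       output_tokens_per_call=2_000, tool_calls=100)
```

With these made-up numbers, the simple request comes to roughly $0.006 and the agentic one to roughly $4.60. Same "one user request" from the outside, several hundred times the bill.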
Common attack techniques include:
- Oversized Prompts: Attackers submit extremely large inputs: 50K words, 100K words, entire books, large codebases. This results in more input tokens (obviously), higher processing costs, and longer inference times.
- Output Expansion Attacks: The attacker intentionally maximizes output token consumption. Send in a prompt: Generate a million-word report.
- Recursive Agent Loops: Search. Analyze. Search again. Analyze Again. Repeat. Agent systems are particularly vulnerable. Without safeguards, a single request can trigger dozens or hundreds of model calls.
- Tool Invocation Abuse: Modern AI agents may have access to search APIs, cloud APIs, databases, and external services. Attackers can induce repeated tool usage. This may trigger thousands of billable operations.
- Embedding Abuse: Many RAG systems automatically generate embeddings for submitted content. An attacker uploads a 100MB document or 10K files. The organization pays for embedding generation and storage.
- Vector Database Abuse: Queries designed to trigger large retrieval sets, expensive hybrid searches, or re-ranking operations can substantially increase infrastructure costs.
- Multi-Agent Explosions: A single user request can generate dozens of LLM calls. Poorly designed agent architectures (a planner agent, research agent, analysis agent, writer agent, and reviewer agent, for example) multiply costs quickly.
OWASP emphasizes that AI systems can be attacked economically, not just technically.
Variable-Length Input Flooding
Variable-Length Input Flooding is a specific form of resource exhaustion attack against LLM applications in which an attacker intentionally submits extremely large, complex, or computationally expensive inputs to drive excessive consumption of tokens, memory, compute, retrieval operations, or downstream services.
It is one of the most common paths to Denial of Wallet. Traditional web applications often process requests of relatively predictable size, such as "GET /user/12311". The cost of handling that request is usually fairly constant. LLM applications are different: "Prompt size = Cost". A request containing 100 words and a request containing 100,000 words can differ dramatically in:
- Token processing cost
- Latency
- Memory usage
- Retrieval operations
- Context window consumption
- GPU utilization
The attacker exploits this asymmetry by submitting:
- Massive prompts
- Huge context windows
- Oversized files
- Intentionally pathological inputs
This forces:
- Expensive tokenization
- Long inference times
- Memory pressure
- Degraded service availability
Then, you, as the system owner, get the bill.
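The cheapest defense is to reject oversized input before it ever reaches the tokenizer or the model. Below is a minimal sketch; the limits are hypothetical, and the four-characters-per-token estimate is a crude heuristic standing in for a real tokenizer.

```python
MAX_INPUT_BYTES = 32_000        # hypothetical limits; tune per application
MAX_INPUT_TOKENS = 4_000
APPROX_CHARS_PER_TOKEN = 4      # crude heuristic, not a real tokenizer


class InputTooLarge(Exception):
    pass


def check_prompt_size(prompt: str) -> int:
    """Reject oversized prompts up front; return the estimated token count."""
    size = len(prompt.encode("utf-8"))
    if size > MAX_INPUT_BYTES:
        raise InputTooLarge(f"{size} bytes exceeds {MAX_INPUT_BYTES}")
    estimated_tokens = len(prompt) // APPROX_CHARS_PER_TOKEN + 1
    if estimated_tokens > MAX_INPUT_TOKENS:
        raise InputTooLarge(f"~{estimated_tokens} tokens exceeds {MAX_INPUT_TOKENS}")
    return estimated_tokens
```

The byte check comes first because it costs almost nothing. The same idea applies to uploaded files: check the size before parsing, chunking, or embedding anything.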
Recursive Agent Loops
Recursive Agent Loops occur when an AI agent repeatedly invokes itself, other agents, tools, or workflows in a cycle that continues indefinitely — or for far longer than intended — consuming excessive resources without making meaningful progress.
Agentic systems may:
- Call tools repeatedly
- Re-query themselves
- Invoke nested workflows
- Loop indefinitely
Example:
- Agent asks search tool
- Search output triggers another query
- Loop continues indefinitely
This can silently consume massive resources.
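The search example above can be sketched as a loop with two exits the agent does not control: a hard step cap and a check for queries it has already issued. The `search` and `next_query` callables are hypothetical stand-ins for a real tool and for the model deciding what to ask next.

```python
def run_search_agent(first_query: str, search, next_query, max_steps: int = 5) -> list:
    """Follow search results until done, a repeat is detected, or the cap is hit.

    `search` and `next_query` are hypothetical callables standing in for a
    real tool and for the model's decision about what to ask next.
    """
    seen = set()
    history = []
    query = first_query
    while query is not None and len(history) < max_steps:
        if query in seen:
            break                      # the agent is going in circles
        seen.add(query)
        result = search(query)
        history.append((query, result))
        query = next_query(result)     # None means the agent is finished
    return history


# A toy tool whose output keeps sending the agent back to where it started.
cycle = {"a": "b", "b": "a"}
history = run_search_agent("a", search=lambda q: cycle[q], next_query=lambda r: r)
```

Exact-match detection only catches the simplest loops. An agent that rephrases its query slightly each time sails past it, which is why the step cap is still needed.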
Resource-Intensive Queries
Resource-Intensive Queries are prompts or requests that consume disproportionately large amounts of compute, memory, tokens, retrieval operations, tool executions, or agent activity compared to a typical user request. Attackers intentionally craft prompts that maximize computation. For example:
- Extremely long chain-of-thought reasoning
- Complex symbolic problems
- Repeated tool orchestration
- Adversarial token sequences
The goal of this attack is to:
- Increase latency
- Exhaust GPUs
- Reduce system availability
Model Extraction Attacks
Model Extraction Attacks (sometimes called Model Stealing Attacks) occur when an attacker systematically interacts with an LLM or AI application to reconstruct, replicate, approximate, or learn sensitive information about the underlying model.
The goal is usually one of the following:
- Create a competing model
- Steal proprietary capabilities
- Learn hidden prompts or policies
- Infer training data
- Discover model behavior and guardrails
- Avoid paying for API access
Attackers repeatedly query APIs to:
- Study outputs
- Infer behaviors
- Recreate capabilities
- Build shadow models
This becomes:
- Intellectual property theft
- Competitive abuse
- Unauthorized model replication
OWASP notes that excessive inference access can enable practical model cloning.
In traditional cybersecurity terms, model extraction is the AI equivalent of stealing intellectual property.
Retrieval and Tool Explosion
Retrieval and Tool Explosion is a class of resource-consumption problems in LLM applications where a single user request triggers an excessive number of retrieval operations, tool invocations, agent actions, or downstream API calls. It is a common cause of LLM10:2025 Unbounded Consumption and can lead to significant performance degradation and Denial of Wallet scenarios.
The problem is that one request unexpectedly becomes hundreds or thousands of operations.
RAG and agent systems may:
- Trigger excessive vector searches
- Chain multiple APIs
- Invoke numerous tools
- Repeatedly expand retrieval windows
This multiplies:
- Cost
- Latency
- Backend load
Agentic AI Makes This Worse
Modern AI agents increasingly:
- Plan tasks
- Orchestrate workflows
- Use tools
- Maintain memory
- Act autonomously
That creates a new failure mode wherein the AI may accidentally DoS itself.
Without constraints:
- Recursion grows
- Actions multiply
- Costs explode
- Queues fill
- Systems degrade
LLM10 Is a Security Risk
This is a security risk because attackers can intentionally weaponize:
- Costs
- Latency
- Infrastructure usage
- Model availability
The consequences of these attacks include:
- Denial of service
- Financial exhaustion
- Cloud budget overruns
- Degraded customer experience
- API quota exhaustion
- GPU starvation
- Infrastructure instability
- Model theft
The goal isn't to cause a data breach; it's just to cause chaos for the application owner by costing them excessive, unplanned hosting fees. There may not necessarily be a reason behind it beyond that chaos. Or, consider the situation described in "AI Agent Bankrupted Their Operator While Trying To Scan DN42". Much of the article is sarcastic in nature, but someone let an agent (some type of OpenClaw-like thing) attempt to register and interact with the DN42 project, complete with giving it a credit card and the ability / permissions to spin up AWS resources. The maintainers of the DN42 project intentionally did things to / with the agent that would cost the agent's operator money by causing the agent to consume greater resources than expected / necessary. At the same time, the agent declared its intention to engage in activities that would potentially harm the DN42 project and cost its members money, time, and resources. I'm not going to judge who is in the wrong; there's a profound lack of common sense at multiple steps in that story.
It is a fascinating area to explore: intentionally causing unbounded consumption of resources and Denial of Wallet attacks against an agent (LLM system) that is essentially attacking you first (even if out of pure naive idiocy on the part of the original agent operator). Topics for future blog posts.
Recommended Mitigations
OWASP strongly recommends layered resource governance. This means that no single control is sufficient to protect an LLM application from excessive consumption, Denial-of-Wallet attacks, runaway agents, or resource exhaustion. Instead, organizations should implement multiple independent controls at different layers of the application stack, with each layer enforcing its own limits and budgets.
The same philosophy exists in traditional security as defense in depth.
Rate Limiting
Limit:
- Requests per user
- Requests per tenant
- Requests per API key
- Concurrent sessions
When we covered rate limiting in the mitigations section of an earlier part of this series, I mentioned this sounds like a job for an API Gateway. Still the case.
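If you want to see the mechanics, or need a limiter inside the application itself, a token bucket keyed by caller identity is one common approach. This is a minimal single-process sketch, not what any particular API Gateway does internally; the class names are mine, and the injectable clock exists only to make the example testable.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill based on elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """One bucket per caller identity (user, tenant, or API key)."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate, self.capacity, self.clock = rate, capacity, clock
        self.buckets = {}

    def allow(self, key: str) -> bool:
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(self.rate, self.capacity, self.clock)
        return bucket.allow()
```

A production deployment would need shared state across instances (and eviction of idle buckets), which is one more reason to push this into the gateway when you can.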
Token Budgets
A token budget is a predefined limit on how many tokens an LLM application is allowed to consume during a request, session, workflow, or time period. The goal is to prevent excessive resource usage, runaway costs, and denial-of-wallet attacks by treating tokens as a finite resource.
Put a cap / limit on:
- Input tokens
- Output tokens
- Context size
- Retrieval depth
- Reasoning length
Never let users fully control:
- max_tokens (maximum number of output tokens the model is allowed to generate)
- n (number of completions to generate)
- Expensive inference parameters
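Both halves of this, clamping caller-supplied parameters and tracking a budget, fit in a short sketch. The ceilings and the SessionBudget class are hypothetical; `max_tokens` and `n` are the parameters named above.

```python
from dataclasses import dataclass

SERVER_MAX_OUTPUT_TOKENS = 1_024   # hypothetical server-side ceilings
SERVER_MAX_COMPLETIONS = 1


def clamp_params(requested: dict) -> dict:
    """Keep caller-supplied generation parameters inside server-side limits."""
    # Non-integer values are not handled in this sketch.
    max_tokens = int(requested.get("max_tokens", SERVER_MAX_OUTPUT_TOKENS))
    n = int(requested.get("n", 1))
    return {
        "max_tokens": max(1, min(max_tokens, SERVER_MAX_OUTPUT_TOKENS)),
        "n": max(1, min(n, SERVER_MAX_COMPLETIONS)),
    }


@dataclass
class SessionBudget:
    limit: int
    used: int = 0

    def reserve(self, tokens: int) -> bool:
        """Reserve tokens before the call; refuse if it would exceed the limit."""
        if tokens < 0 or self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True
```

A caller asking for `{"max_tokens": 1_000_000, "n": 50}` gets back `{"max_tokens": 1024, "n": 1}`. Reserving before the call, and not just recording usage afterward, means the budget cannot be blown by a single oversized request.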
Tool and Agent Limits
Tool and Agent Limits are controls that place hard boundaries on how many actions an AI agent is allowed to perform before execution is stopped. Their purpose is to prevent runaway workflows, recursive loops, excessive tool usage, and Denial-of-Wallet attacks.
Without these limits, a single user request can result in escalating costs, latency, and resource consumption.
To avoid this, restrict:
- Recursion depth
- Tool-call chains
- Queued actions
- Workflow loops
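One way to enforce these restrictions is a single limits object that is created per user request and passed to everything the agent does, so that sub-tasks and sub-agents all draw from the same budget. The sketch below uses a deliberately badly behaved toy agent; the class names and default limits are assumptions for illustration.

```python
from dataclasses import dataclass


class LimitExceeded(Exception):
    pass


@dataclass
class AgentLimits:
    max_tool_calls: int = 10
    max_depth: int = 3
    tool_calls: int = 0

    def charge_tool_call(self) -> None:
        """Call before every tool invocation."""
        if self.tool_calls >= self.max_tool_calls:
            raise LimitExceeded("tool-call budget exhausted")
        self.tool_calls += 1

    def check_depth(self, depth: int) -> None:
        """Call before starting a sub-task or sub-agent."""
        if depth > self.max_depth:
            raise LimitExceeded("recursion depth exceeded")


def run_task(task: str, limits: AgentLimits, depth: int = 0) -> int:
    """Toy agent: every task uses one tool call and spawns two sub-tasks, forever."""
    limits.check_depth(depth)
    limits.charge_tool_call()
    done = 1
    for i in range(2):
        done += run_task(f"{task}.{i}", limits, depth + 1)
    return done
```

Left alone, `run_task` would recurse until the interpreter gave up. With a shared `AgentLimits`, it is stopped with a `LimitExceeded` after a handful of tool calls, and the caller can decide whether to return a partial result or an error.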
Timeouts and Throttling
This is another security constraint that is likely best implemented with an API Gateway. At least, partially. The API Gateway can return a standardized error to the caller; however, the model will have to have an internal capability to kill long-running requests. It must be able to terminate:
- Long-running requests
- Runaway workflows
- Excessive generations
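On the application side, a deadline around the model call is the simplest version of this. The sketch below uses Python's asyncio, with a sleep standing in for the model call; the function names and the fallback message are made up for the example.

```python
import asyncio


async def slow_generation(seconds: float) -> str:
    """Stand-in for a model call; a real client call would go here."""
    await asyncio.sleep(seconds)
    return "generated text"


async def generate_with_deadline(seconds: float, deadline: float) -> str:
    try:
        # wait_for cancels the inner task when the deadline passes.
        return await asyncio.wait_for(slow_generation(seconds), timeout=deadline)
    except asyncio.TimeoutError:
        return "Request took too long and was cancelled."


fast = asyncio.run(generate_with_deadline(seconds=0.01, deadline=1.0))
slow = asyncio.run(generate_with_deadline(seconds=1.0, deadline=0.05))
```

Whether cancelling the client-side task also stops the work (and the billing) on the provider side depends on the provider, so that is worth verifying for whatever you are calling.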
Resource Monitoring
We talked about monitoring and logging in Part 1 in this series. Track:
- Token consumption
- Latency
- GPU usage
- Retrieval counts
- API costs
- Tool invocations
- Anomaly spikes
OWASP stresses that sudden cost spikes are security signals, not just finance metrics.
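A very simple way to turn a cost metric into a signal is to compare the latest value against recent history. The sketch below uses a plain standard-deviation threshold; that choice, the threshold value, and the sample numbers are all assumptions, and real traffic with daily or weekly patterns would need something smarter.

```python
from statistics import mean, stdev


def is_cost_spike(history: list, latest: float, threshold: float = 3.0) -> bool:
    """Flag `latest` if it sits more than `threshold` standard deviations above the mean."""
    if len(history) < 2:
        return False                   # not enough data to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest > mu             # flat history: any increase stands out
    return (latest - mu) / sigma > threshold


hourly_cost = [4.1, 3.9, 4.3, 4.0, 4.2]   # made-up dollars per hour
normal_hour = is_cost_spike(hourly_cost, 4.4)
bad_hour = is_cost_spike(hourly_cost, 40.0)
```

The same check works for token consumption, tool invocations, or retrieval counts. Tracking it per tenant or per API key makes it easier to see who is driving the spike.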
Graceful Degradation
Graceful Degradation means designing an LLM application to continue providing a reduced level of service when resources become constrained, rather than failing completely. Instead of failing catastrophically, systems should:
- Shed load
- Reduce functionality
- Limit expensive features
- Prioritize critical requests
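As a sketch of what that can look like in code, the function below maps current load and request priority to a service level. The load thresholds, model names, feature flags, and retry interval are all hypothetical.

```python
def choose_service_level(load: float, priority: str) -> dict:
    """Map current load (0.0 to 1.0) and request priority to a service level.

    Thresholds, model names, and feature flags are hypothetical.
    """
    if load < 0.7:
        return {"accept": True, "model": "large", "tools": True, "max_tokens": 1024}
    if load < 0.9:
        # Limit expensive features before refusing anyone.
        return {"accept": True, "model": "small", "tools": False, "max_tokens": 512}
    if priority == "critical":
        # Near capacity: only critical requests get through, at reduced quality.
        return {"accept": True, "model": "small", "tools": False, "max_tokens": 256}
    return {"accept": False, "retry_after_seconds": 30}
```

The ordering is the design choice here: expensive features go first, then non-critical traffic, and an outright refusal comes last and tells the caller when to retry.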
Access Controls and Isolation
We talked about Access Controls and Isolation mitigations in previous posts in this series. Always use:
- Tenant isolation
- RBAC
- Scoped API keys
- Separated quotas
- Provider-side governance
LLM10 Summary
LLM10 highlights a major shift in security thinking in that traditional software security focused heavily on:
- Unauthorized access
- Code execution
- Data theft
LLM systems introduce a new reality because computation itself becomes an attack surface. In AI systems:
- Every token costs money
- Every inference consumes resources
- Every tool call expands risk
- Every autonomous loop can amplify consumption
Per OWASP, AI systems must enforce explicit economic and computational boundaries or attackers will find them for you.
Notes
- AI / GenAI / ChatGPT / etc were not used to generate the text of this article.
- ChatGPT was used to generate the images.
- I used em dashes in my writing before the current GenAI wave was a thing. Not planning on changing now.
- Names have been changed to protect the guilty.
- None of the hostnames or users used in examples actually exist.
- Feel free to post any comments or suggestions below.