A hands-on, locally hosted exploitation environment covering the OWASP LLM Top 10: no cloud API keys, no cost, just your machine and a running Ollama instance.

What is LLMGoat?

LLMGoat is an intentionally vulnerable LLM application built for security practitioners, red teamers, and developers who want to understand how AI systems get attacked — not just read about it.

It runs entirely on your local machine using Ollama + Mistral. You interact with ten realistic AI chatbot scenarios, each hiding a specific OWASP LLM Top 10 vulnerability. Your job is to exploit it.

Every challenge is a real scenario: a restaurant chatbot, a university enrollment assistant, a corporate HR self-service bot, a customer support agent. None of them are obviously broken — until you start probing.


Quick Setup

Prerequisites: Ollama installed, Docker installed.

# Pull the model
ollama pull mistral

# Clone and run
git clone https://github.com/LiteshGhute/LLMGoat

cd LLMGoat
docker compose up --build

Open http://localhost:8000 and you're in.

No Docker? You can also run it manually with Python 3.12 — see the README.

Demo — LLM06: Excessive Agency

One of the most impactful challenges. HRAssist is NovaTech Corp's HR chatbot — designed to answer employee policy questions. But it was given unrestricted write access to the live employee database with no authorization checks and no confirmation step.

The UI shows the live database panel on the right. Every change lands instantly.


A simple, natural-sounding chat message is enough to update salaries, delete employee records, or change job titles — no hacking, no code, just a sentence.


Taken further, a single message wipes the entire workforce.


This is not prompt injection. There's nothing to "hack." The vulnerability is architectural — the LLM was given more power than it ever needed.

For an HR FAQ bot, read-only SELECT on a few columns was the only capability ever required.
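The anti-pattern is easy to sketch. Below is a minimal, hypothetical illustration of the excessive-agency setup (the names and schema are invented for the example, not LLMGoat's actual code): the tool handed to the LLM executes whatever SQL the model emits against a live database, with no authorization check and no confirmation step.

```python
import sqlite3

def run_llm_sql(conn, sql):
    """The vulnerable tool: executes ANY SQL the model produced,
    reads and writes alike, with no authorization check."""
    cur = conn.execute(sql)
    conn.commit()
    return cur.fetchall()

# A toy "live" employee database for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employees VALUES ('alice', 90000)")
conn.commit()

# A polite chat message ("please update Alice's salary...") becomes
# model-generated SQL, and the tool runs it without question:
run_llm_sql(conn, "UPDATE employees SET salary = 999999 WHERE name = 'alice'")
print(run_llm_sql(conn, "SELECT name, salary FROM employees"))
```

Nothing in that pipeline distinguishes answering a policy question from rewriting payroll; the model holds the full power of the connection.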

Fix: Principle of Least Privilege. Human-in-the-loop for write operations. Authorization check before every action.
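One way to enforce the first of those fixes is at the database layer rather than in the prompt, so the model cannot talk its way around it. A minimal sketch using SQLite's `query_only` pragma (again illustrative, not LLMGoat's code): the connection itself refuses writes, so any destructive SQL the model emits fails loudly instead of landing in the live data.

```python
import sqlite3

# Toy employee database for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employees VALUES ('alice', 90000)")
conn.commit()

# Least privilege enforced by the database, not by the prompt:
# this connection can now only read.
conn.execute("PRAGMA query_only = ON")

def run_llm_sql(conn, sql):
    """Reads succeed; any write the model attempts now raises."""
    return conn.execute(sql).fetchall()

print(run_llm_sql(conn, "SELECT name, salary FROM employees"))

try:
    run_llm_sql(conn, "DELETE FROM employees")
except sqlite3.OperationalError as exc:
    print("blocked:", exc)  # SQLite rejects the write
```

In production the same idea means a database role with SELECT on only the columns the bot needs, with writes routed through a separate, human-approved path.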

9 More Challenges Inside

  • LLM01 — Prompt Injection
  • LLM02 — Sensitive Information Disclosure
  • LLM03 — Supply Chain Compromise
  • LLM04 — Data and Model Poisoning
  • LLM05 — Improper Output Handling (XSS)
  • LLM07 — System Prompt Leakage
  • LLM08 — Vector and Embedding Weaknesses
  • LLM09 — Misinformation
  • LLM10 — Unbounded Consumption

🔗 Try it yourself → github.com/LiteshGhute/LLMGoat

#LLMGoat #LLMSecurity #CyberSecurity #AIHacking #OWASP #LLM #PromptInjection #AISecurity #RedTeam #MachineLearning #InfoSec #EthicalHacking #OpenSource

Built for security practitioners, red teamers, and anyone curious about LLM attack surfaces.