I just open-sourced vibe-guard-skills — a free set of Claude Code slash commands that run a three-pass security and reliability audit on your AI-generated code before every commit and push. Nothing leaves your machine. No external API calls. No subscription.

GitHub: https://github.com/codecoincognition/vibe-guard-skills

Install it in one command:

curl -fsSL https://raw.githubusercontent.com/codecoincognition/vibe-guard-skills/main/install.sh | bash

Here's why it exists, what it catches, and how to wire it into your workflow in about two minutes.

In March 2026, security researchers confirmed 35 CVEs (Common Vulnerabilities and Exposures — formally tracked exploitable security flaws) traced back to AI-generated code. That's not a full quarter's worth of findings. That's a single month. And it's ten times the monthly rate from the second half of 2025.

This isn't a panic story. AI-assisted development is genuinely faster and the code quality is often good. But there's a specific, reproducible failure profile that AI code has — one that's different from human-written bugs — and most developers aren't accounting for it.

The Specific Failure Profile of AI-Generated Code

Human developers make certain kinds of mistakes: typos, off-by-one errors, missed edge cases from unfamiliarity with a library. These are well-understood. Static analysis tools catch many of them. Code review catches most of the rest.

AI-generated code has a different failure mode. It's pattern-complete and locally coherent. It looks right. It passes tests at dev scale. It survives code review because reviewers are pattern-matching too — the code follows conventions, uses the right abstractions, and has sensible names.

The failure mode shows up where review doesn't look:

  • Production scale. The N+1 query pattern is a perfect example. An AI will write posts.map(p => db.query("SELECT * FROM comments WHERE post_id=?", p.id)) because it's syntactically correct and semantically obvious. It works fine on 50 records in your dev environment. At 50,000 records, you're making 50,000 round trips to the database.
  • Security by omission. According to Veracode's analysis, 45% of AI-generated code introduces OWASP Top 10 vulnerabilities. The failure isn't usually a wrong implementation — it's a missing one. No auth check before returning user data. No ownership verification on a resource endpoint. No rate limiting on a login route. The code is functionally complete except for the parts the model didn't know to add.
  • Secrets and defaults. The Moltbook breach in early 2026 is the textbook example: hardcoded Supabase credentials committed to a repo, combined with Row Level Security disabled (which is actually a common AI default when setting up a demo schema quickly). Result: 1.5M API tokens burned, 35K emails exposed, 4.75M records accessible. GitGuardian found 28.6 million secrets exposed in public repos in 2025 — a 34% year-over-year increase.
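To make the N+1 failure mode concrete, here's a minimal sketch of the loop-per-record shape and its batched replacement. The `db.query` helper is hypothetical (a generic parameterized-query API, not any specific library):

```javascript
// N+1: one query per post -- 50,000 posts means 50,000 round trips.
async function commentsNPlusOne(db, posts) {
  return Promise.all(
    posts.map(p => db.query("SELECT * FROM comments WHERE post_id = ?", [p.id]))
  );
}

// Batched: one round trip regardless of how many posts there are.
async function commentsBatched(db, posts) {
  const ids = posts.map(p => p.id);
  const placeholders = ids.map(() => "?").join(", ");
  const rows = await db.query(
    `SELECT * FROM comments WHERE post_id IN (${placeholders})`,
    ids
  );
  // Group the flat result set back by post_id for the caller.
  const byPost = new Map(ids.map(id => [id, []]));
  for (const row of rows) byPost.get(row.post_id).push(row);
  return byPost;
}
```

Both versions return the same data; only the round-trip count differs, which is exactly why the pattern survives dev-scale testing.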

The cost differential is brutal. Catching a hardcoded credential during code review costs you 15 minutes. Finding it after it's in production, after someone's scraped your repo, costs somewhere between $3,000 and $10,000+ depending on the scope of the breach response.

The core problem isn't that AI writes bad code. It's that it writes code that's confidently incomplete — and that completeness gap is invisible until production.

Stack Overflow's 2025 developer survey found that 70% of developers accept AI suggestions without modification, and 56% rarely review the generated code line by line. At those acceptance rates, the failure profile of AI code becomes the failure profile of your codebase.

What vibe-guard-skills Is

vibe-guard-skills is a set of Claude Code skills (slash commands) that live in ~/.claude/skills/. They run inside your existing Claude Code session against your git diff. No external API, no data exfiltration, no new accounts.

There are three individual passes and one master command:

/vibe-check — Production Resilience

Audits for runtime failures that survive tests and code review:

  • N+1 query patterns
  • Missing error handling on async/IO operations
  • Null/undefined edge cases at scale
  • Resource leaks (unclosed connections, uncleared timers, unbounded memory growth)
  • Race conditions in concurrent code
  • Data integrity gaps (missing transactions, partial write scenarios)
  • Scale failures (algorithms that degrade non-linearly with data volume)
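To illustrate one of these categories, here's a sketch of the resource-leak shape /vibe-check is aimed at. The pool API (`acquire`/`release`) is assumed for illustration, not taken from any particular library:

```javascript
// Leaky: if the query throws, the connection is never released and the
// pool eventually starves under load -- tests with one request pass fine.
async function getUserLeaky(pool, id) {
  const conn = await pool.acquire();
  const user = await conn.query("SELECT * FROM users WHERE id = ?", [id]);
  conn.release(); // skipped entirely when query() throws
  return user;
}

// Fixed: release in `finally` so the connection returns to the pool on
// both the success and failure paths.
async function getUserSafe(pool, id) {
  const conn = await pool.acquire();
  try {
    return await conn.query("SELECT * FROM users WHERE id = ?", [id]);
  } finally {
    conn.release();
  }
}
```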

/vibe-secure — Security

Audits for security gaps:

  • Hardcoded secrets, tokens, and credentials
  • Injection surfaces (SQL, command, template, LDAP)
  • Missing or incomplete auth checks
  • Insecure defaults (disabled security features, permissive CORS, cleartext storage)
  • Business logic race conditions (TOCTOU, double-spend, parallel request abuse)
  • Supply chain risks (pinning, integrity verification)

/vibe-explain — Comprehension

This one is different. Instead of looking for bugs, it makes sure you understand what was generated:

  • Plain-English summary of what each module does and what it assumes
  • Module contracts: what goes in, what comes out, what side effects exist
  • Debt score: how much of this code you'd be comfortable modifying without running tests
  • Ownership check: are there sections that would be hard to debug if they broke?

The philosophy behind /vibe-explain is that code you don't understand is code you can't safely maintain. If you're shipping 200 lines of AI-generated database abstraction and couldn't explain the connection pooling strategy to a coworker, that's a risk.

/vibe-guard — The Master Command

Runs all three passes. Takes a flag to control scope:

  • --quick — Critical issues only, roughly 10 seconds. Good for mid-edit spot checks.
  • (default) — Full three-pass audit against git diff HEAD.
  • --full — Scans the entire repository, not just the diff. Use this for initial setup or after a large refactor.

The Architecture: Two Passes in One

The design behind each audit pass isn't just a checklist. There are two layers running:

Layer 1: Fixed pattern library. A known set of AI failure patterns compiled from CVE databases, security research, and production post-mortems. N+1 queries, hardcoded secrets, missing ownership checks — these show up repeatedly because AI models learn from codebases that have these patterns, and they reproduce them. This layer is deterministic. The same code will always produce the same findings.

Layer 2: Context-aware analysis. This is where Claude Code reads your specific code and reasons about it in context. A route that looks like it's missing an auth check might be intentionally public — or it might be a data leak. A database query that looks like an N+1 might be operating on a batched dataset that makes it fine. Layer 2 distinguishes between these cases by understanding your actual application logic, not just matching syntax patterns.

You get repeatability from Layer 1 and accuracy from Layer 2. A pure checklist would generate too many false positives on context-dependent patterns. A pure LLM analysis would miss well-known patterns it doesn't happen to reason about in a given run.
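For intuition, a Layer 1 check can be as simple as a fixed table of regex patterns applied to the diff. This sketch is illustrative only (it is not vibe-guard's actual pattern library, and real checks are more robust than two regexes):

```javascript
// Each pattern is a regex plus a finding template; the same input
// always yields the same findings -- that's the deterministic layer.
const PATTERNS = [
  {
    id: "hardcoded-secret",
    re: /(?:api[_-]?key|secret|token|password)\s*[:=]\s*["'][^"']{8,}["']/i,
    message: "Possible hardcoded credential"
  },
  {
    id: "n-plus-one",
    re: /\.map\s*\(\s*\w+\s*=>\s*(?:await\s+)?\w+\.query\(/,
    message: "Possible N+1 query inside a map()"
  }
];

function layerOneScan(diffLines) {
  const findings = [];
  diffLines.forEach((line, i) => {
    for (const p of PATTERNS) {
      if (p.re.test(line)) {
        findings.push({ id: p.id, line: i + 1, message: p.message });
      }
    }
  });
  return findings;
}
```

Layer 2's job is then to take these candidates and reason about them in context: the same match is a finding in one codebase and a false positive in another.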

Try This: Setup in Two Minutes

Step 1: Install

curl -fsSL https://raw.githubusercontent.com/codecoincognition/vibe-guard-skills/main/install.sh | bash

This installs the skills to ~/.claude/skills/ globally. They'll be available in any Claude Code session from that point forward. Restart Claude Code if you have a session open.

Step 2: Add the auto-invoke block to your CLAUDE.md

Open the CLAUDE.md at the root of your project (create one if it doesn't exist) and add:

## When to auto-invoke the audits
- Before git commit - run /vibe-guard --quick. If CRITICAL, stop and surface it.
- Before git push - run full /vibe-guard. If critical, do not push; wait for user.
- After editing auth/payments/billing/user input - run /vibe-secure.
- After 50+ lines generated in one edit - run /vibe-explain.
- After touching SQL/ORM/database - run /vibe-check.
## Rules
- Never auto-apply fixes. Show report, ask user which to fix.
- Do not run /vibe-guard --full autonomously.
- If git diff HEAD is empty, announce "clean working tree."

This makes the audits part of your Claude Code workflow without requiring you to remember to run them manually. Claude Code reads CLAUDE.md as its operating instructions — the auto-invoke rules become built-in behaviors.

Step 3 (optional): Install the pre-push git hook

bash setup-hooks.sh

This sets core.hooksPath = .githooks in your local git config and installs a pre-push hook that runs /vibe-guard before every push. If any CRITICAL findings come back, the push is blocked and the findings are printed to your terminal.

The hook silently skips if the claude CLI isn't installed (so it's safe to add to a shared repo). If you need to bypass it for a specific push: git push --no-verify.

What the Output Looks Like

Reports are structured and actionable. Here's a realistic /vibe-secure finding:

/vibe-secure scan — 2 findings
══════════════════════════════════════
🔴 CRITICAL — Hardcoded credential
  [vibe-secure] config.js:12
  SUPABASE_KEY = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
  Fix: Move to process.env.SUPABASE_KEY, add to .gitignore
🔴 CRITICAL — Missing auth check
  [vibe-secure] api/users.js:45  GET /users/:id
  Route returns user data without verifying the requesting user
  owns the id. Add ownership check before returning.
══════════════════════════════════════
SUMMARY: 2 critical · 0 warnings
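The fixes for both findings are small. Here's a hedged sketch, assuming an Express-style handler and a `db.getUser` helper (both names are illustrative, not from the report):

```javascript
// Finding 1: read the key from the environment instead of hardcoding it,
// and fail fast at startup when it's missing.
function loadSupabaseKey(env = process.env) {
  const key = env.SUPABASE_KEY;
  if (!key) throw new Error("SUPABASE_KEY is not set");
  return key;
}

// Finding 2: verify the requester owns the resource before returning it.
function getUserHandler(db) {
  return async (req, res) => {
    if (!req.user || req.user.id !== req.params.id) {
      return res.status(403).json({ error: "forbidden" });
    }
    const user = await db.getUser(req.params.id);
    res.json(user);
  };
}
```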

And a /vibe-check finding for the N+1 pattern:

/vibe-check scan — 1 finding
══════════════════════════════════════
🔴 CRITICAL — N+1 query pattern
  [vibe-check] models/posts.js:88
  posts.map(p => db.query("SELECT * FROM comments WHERE post_id=?", p.id))
  Fix: Replace with single JOIN or batch query with WHERE IN
══════════════════════════════════════
SUMMARY: 1 critical · 0 warnings

Each finding has: severity (CRITICAL or warning), the exact file and line, what the problem is, and a specific fix recommendation. Nothing is applied automatically. The report is feedback — you decide what to fix, in what order, and how.

When to Use Which Command

The three modes give you different trade-offs between speed and coverage. Here's how I think about the right one for each situation:

--quick during active editing

When you're in the middle of implementing a feature and want a fast sanity check, --quick is the right call. It's roughly 10 seconds and only surfaces CRITICAL findings — the ones that would be a serious problem if they shipped. It's not a full audit; it's a "did I just accidentally hardcode my API key or write an obvious N+1" check.

Run it after any significant chunk of AI generation, especially if you're in a flow state and haven't been reviewing line by line.

Default /vibe-guard before commit

The default run against git diff HEAD is the right pre-commit check. It covers the full three passes at normal depth on only the changed code, which keeps it fast enough to run every time without becoming a friction point.

This is also where the /vibe-explain pass pays off. Before you commit a 150-line AI-generated module, make sure you can understand the explanation it gives you. If the debt score comes back high and you can't easily answer "what does this module assume about its inputs" — that's a signal to spend 10 more minutes with it before committing.

Individual passes for domain-specific work

If you just wrote a new database query layer, run /vibe-check directly. If you just added a payment webhook, run /vibe-secure. The individual passes are faster because they're focused, and they're more useful when you know exactly which category of risk you're introducing.

--full for initial setup or major refactors

Run /vibe-guard --full when you first install vibe-guard-skills on an existing codebase. You'll likely find patterns that accumulated before you had the tooling in place. Also run it after a large refactor that touched many files — the diff-only default won't catch issues in files you modified indirectly.

Don't run --full autonomously or on a timer. It's a deliberate, human-initiated full scan. That's intentional.

The Honest Limits

The pre-push hook is feedback, not enforcement. git push --no-verify bypasses it. A determined or distracted developer can ignore every finding. If your threat model includes developers who will bypass local tooling under deadline pressure, you need CI-level enforcement as well — a pipeline that blocks merges on critical findings, not just a local hook.

vibe-guard is positioned as the developer side of a two-layer strategy:

  1. vibe-guard (local) — fast feedback loop, catches issues before they're in the branch, zero friction to run, educates you about the risk as you're writing the code
  2. CI enforcement — pipeline-level gates, runs on the server where it can't be bypassed, triggers on every PR

vibe-guard is developer feedback; CI is enforcement. You need both: the local layer exists to catch the bulk of issues before they ever become a CI problem.

There's also a false positive rate to account for. Context-aware analysis means fewer false positives than a pure static analysis tool, but you'll still see findings that don't apply to your specific situation. The rule is: the report is a starting point, not a verdict. Read each finding, understand it, and decide whether it applies. "Dismiss because it's not relevant" is a valid outcome. "Dismiss because I don't have time to think about it" is not.

Finally, vibe-guard is not a replacement for security-oriented code review on high-risk surfaces. Auth systems, payment flows, and data export pipelines warrant human review from someone thinking adversarially about the code — not just an automated audit pass. Use vibe-guard to catch the obvious problems; save the deep review for the high-stakes surfaces.

Getting It Into Your Workflow

The install takes two minutes. The CLAUDE.md block takes another two. The git hook is optional but worth it for any project you care about.

The workflow shift is small: Claude Code generates code, vibe-guard audits it, you see the report and decide what to fix. The vibe generates fast; the guard makes sure what ships is what you intended to ship.

A stat worth internalizing: Veracode found 86% of AI-generated code fails XSS protection checks, and 88% fails log injection checks. Those aren't obscure categories. Those are the first two things an attacker tries. If you're shipping AI-generated web code without an audit pass, you're in that cohort.

The alternative to running the audit is trusting that your AI assistant happened to write secure, production-ready code this time. Given the March 2026 CVE numbers, that's not a bet I'd take.

GitHub: https://github.com/codecoincognition/vibe-guard-skills

If you're already running vibe-guard-skills or a similar audit layer in your workflow, what patterns are you catching most often? I'm tracking which findings show up repeatedly across different codebases to improve the pattern library — drop a comment with what you're seeing. Also, please suggest enhancements/improvements or feel free to contribute!