Security Review of AI-Generated Code

AI happily writes code with classic vulnerabilities. Learn the OWASP-aligned review checklist for AI output, the prompts that catch issues early, and the tools that automate the rest.

13 min · Reviewed 2026

AI Has Read Every Vulnerability, And Sometimes Writes One

Models train on public code, and public code is full of vulnerabilities. The model learns common patterns — including the insecure ones. SQL strings concatenated with user input, secrets hardcoded for examples, `eval` of user data, missing auth checks. AI will reproduce all of these on demand if you don't tell it not to.

The top eight AI-generated vulnerabilities to watch for

Vuln	What AI does	Fix
SQL injection	f-string queries with user input	Parameterized queries, prepared statements
Command injection	`subprocess.call(f"...{user}...")`	`subprocess.run([...], shell=False)`
Path traversal	`open(user_input)` with no allowlist	Resolve, normalize, check inside allowed dir
Hardcoded secrets	`API_KEY = "sk-..."` in committed file	Read from env, use a secrets manager
Missing auth	Endpoint without role/permission check	Decorator/middleware mandatory on all routes
Open redirect	`redirect(request.args.get("next"))`	Allowlist of safe redirect targets
XSS	`dangerouslySetInnerHTML` with user data	Escape, or use safe APIs
Insecure deserialization	`pickle.loads(user_input)`	Use JSON; if pickle, sign and verify

The pre-flight prompt addition

# Append to any code-generation prompt that touches:
# - user input
# - filesystems
# - shells
# - databases
# - HTTP
# - secrets

"Apply OWASP best practices. No string-concatenated SQL — use parameterized queries.
No shell=True in subprocess calls. No eval/exec. Read all secrets from env vars,
never hardcode. Validate all user input with explicit schemas (Zod, Pydantic, etc.).
If you generate code with any of these, flag it and explain why you couldn't avoid it."A 70-word boilerplate that stops 80% of AI-introduced vulns at the source.

The 5-minute review pass

Search the diff for `f"`, `"" + `, or template strings near `query`, `execute`, `exec`, `subprocess`
Search for hardcoded strings starting with `sk-`, `pk-`, `AKIA`, `ghp_`, `xoxb-`, `Bearer ` followed by characters
Confirm every endpoint has explicit auth — `@require_auth`, `requireUser()`, etc.
Search for `eval(`, `exec(`, `pickle.loads(`, `Function(`, `setTimeout("`, `innerHTML`
Verify error messages don't leak internals — check `except: return str(e)` patterns

Tools that automate the boring parts

Tool	What it catches	Setup
Semgrep	Pattern-based vulns (SQLi, command inj, etc.)	`semgrep --config=auto .`
Bandit (Python)	Common Python security smells	`pip install bandit && bandit -r .`
ESLint security plugins	JS/TS security patterns	`eslint-plugin-security`
GitHub Code Scanning	CodeQL queries on every push	Free for public repos
Trivy / Grype	Vulnerable dependencies in lockfiles	CI step or local scan
GitGuardian / TruffleHog	Hardcoded secrets in commits	Pre-commit hook + CI

The adversarial review prompt

# After AI writes code, run this in a fresh chat:

"Act as a security auditor. The following code was just generated by AI.
List every realistic security issue you can find, ranked by severity (critical/high/med/low).
For each: the line numbers, the threat model (who attacks how), and the fix.
Do not be polite. Assume hostile users.

<paste code>"

# Use a *different* model than wrote the code if possible.
# A second pair of eyes from a different family catches what the first missed.Cross-family adversarial review catches more than same-family review. Mix Claude with GPT for coverage.

What no AI catches

Business-logic flaws (a logged-in user accessing another tenant's data via the wrong route param)
Race conditions in concurrent code
Subtle privilege escalation through chained features
Side channels (timing attacks on `==` comparison of secrets)

AI knows the patterns. Humans know the threats.
— An application security engineer

The big idea: AI accelerates code, but it does not understand attackers. Apply the OWASP-aligned prompt boilerplate, run static analyzers automatically, and budget a human security review for anything user-facing. Speed without security is a CVE in waiting.

End-of-lesson check

15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-coding-debug-security-review-creators

What is the core idea behind "Security Review of AI-Generated Code"?
1. AI happily writes code with classic vulnerabilities. Learn the OWASP-aligned review checklist for AI output, the prompts that catch issues early, and the tools that automate the rest.
2. Tasks under 30 minutes — coordination overhead exceeds parallelism win
3. Logs: redact tokens, secrets, headers before exposing through MCP
4. branch-per-agent
Which term best describes a foundational idea in "Security Review of AI-Generated Code"?
1. SQL injection
2. OWASP
3. secret scanning
4. Semgrep
A learner studying Security Review of AI-Generated Code would need to understand which concept?
1. OWASP
2. secret scanning
3. SQL injection
4. Semgrep
Which of these is directly relevant to Security Review of AI-Generated Code?
1. OWASP
2. SQL injection
3. Semgrep
4. secret scanning
Which of the following is a key point about Security Review of AI-Generated Code?
1. Search the diff for `f"`, `"" + `, or template strings near `query`, `execute`, `exec`, `subprocess`
2. Search for hardcoded strings starting with `sk-`, `pk-`, `AKIA`, `ghp_`, `xoxb-`, `Bearer ` followed…
3. Confirm every endpoint has explicit auth — `@require_auth`, `requireUser()`, etc.
4. Search for `eval(`, `exec(`, `pickle.loads(`, `Function(`, `setTimeout("`, `innerHTML`
Which of these does NOT belong in a discussion of Security Review of AI-Generated Code?
1. Tasks under 30 minutes — coordination overhead exceeds parallelism win
2. Confirm every endpoint has explicit auth — `@require_auth`, `requireUser()`, etc.
3. Search for hardcoded strings starting with `sk-`, `pk-`, `AKIA`, `ghp_`, `xoxb-`, `Bearer ` followed…
4. Search the diff for `f"`, `"" + `, or template strings near `query`, `execute`, `exec`, `subprocess`
Which statement is accurate regarding Security Review of AI-Generated Code?
1. Race conditions in concurrent code
2. Subtle privilege escalation through chained features
3. Business-logic flaws (a logged-in user accessing another tenant's data via the wrong route param)
4. Side channels (timing attacks on `==` comparison of secrets)
Which of these does NOT belong in a discussion of Security Review of AI-Generated Code?
1. Tasks under 30 minutes — coordination overhead exceeds parallelism win
2. Business-logic flaws (a logged-in user accessing another tenant's data via the wrong route param)
3. Subtle privilege escalation through chained features
4. Race conditions in concurrent code
What is the key insight about "AI's favorite vuln: the helpful error message" in the context of Security Review of AI-Generated Code?
1. AI loves returning detailed errors to the client. "Database query failed: table users does not exist on 10.0.1.
2. Tasks under 30 minutes — coordination overhead exceeds parallelism win
3. Logs: redact tokens, secrets, headers before exposing through MCP
4. branch-per-agent
What is the key insight about "Defense in depth still applies" in the context of Security Review of AI-Generated Code?
1. Tasks under 30 minutes — coordination overhead exceeds parallelism win
2. AI review and Semgrep are layers, not replacements for security training.
3. Logs: redact tokens, secrets, headers before exposing through MCP
4. branch-per-agent
Which statement accurately describes an aspect of Security Review of AI-Generated Code?
1. Tasks under 30 minutes — coordination overhead exceeds parallelism win
2. Logs: redact tokens, secrets, headers before exposing through MCP
3. Models train on public code, and public code is full of vulnerabilities. The model learns common patterns — including the insecure ones.
4. branch-per-agent
What does working with Security Review of AI-Generated Code typically involve?
1. Tasks under 30 minutes — coordination overhead exceeds parallelism win
2. Logs: redact tokens, secrets, headers before exposing through MCP
3. branch-per-agent
4. The big idea: AI accelerates code, but it does not understand attackers.
Which best describes the scope of "Security Review of AI-Generated Code"?
1. It focuses on AI happily writes code with classic vulnerabilities. Learn the OWASP-aligned review checklist for AI
2. It is unrelated to ai-coding workflows
3. It applies only to the opposite beginner tier
4. It was deprecated in 2024 and no longer relevant
Which section heading best belongs in a lesson about Security Review of AI-Generated Code?
1. Tasks under 30 minutes — coordination overhead exceeds parallelism win
2. The top eight AI-generated vulnerabilities to watch for
3. Logs: redact tokens, secrets, headers before exposing through MCP
4. branch-per-agent
Which section heading best belongs in a lesson about Security Review of AI-Generated Code?
1. Tasks under 30 minutes — coordination overhead exceeds parallelism win
2. Logs: redact tokens, secrets, headers before exposing through MCP
3. The pre-flight prompt addition
4. branch-per-agent

← Back to interactive lesson

Tendril · Creators · AI-Assisted Coding