TL;DR
- Simon Willison's May 6, 2026 essay on "vibe coding and agentic engineering" hit the Hacker News front page (548 points, 584 comments at scan) and named the right problem — the convergence of LLM-assisted and agent-driven coding into a single practice with an accountability gap. Security shows up as one example of that gap, not the central one.
- The empirical case for treating it as the central one is already public. Between June 2025 and April 2026, AI coding agents shipped six confirmed credential-exfiltration or RCE patterns: three Claude Code CVEs (one Critical 9.9, two High 8.8 + 5.3), one Cursor sandbox escape (Critical 9.9), one Windsurf zero-click prompt-injection RCE (High 8.0), and the Comment and Control class against Claude Code Security Review, Gemini CLI Action, and Copilot Agent.
- The architectural pattern is consistent across all six. A privileged agent runtime with broad tool access ingests adversary-controllable content (workspace files, MCP servers, PR metadata) into the same context window where its system prompt lives, and the trust boundary fails at the seam.
- Three of the patches retroactively encoded the boundary that should have been there pre-shipping: pre-trust execution audits, MCP consent gates, restricted tool permissions in CI. The April 2026 Comment and Control disclosure shows the boundary still doesn't exist on three more agent surfaces, with no CVEs assigned by any of the vendors involved.
- Willison is right that something has normalized. What's normalized is running an agent runtime as a privileged identity processing adversary-controllable content without input segregation. That's a security pattern, not an ethical one. The receipts are below.
Simon Willison's latest essay is one of the better statements of where AI-assisted coding has actually arrived. He names the convergence between vibe coding — the practice he defines as "the thing where you're not looking at the code at all. You might not even know how to program... if the thing works, then great!" — and agentic engineering, where professional engineers use AI tools while keeping focus on "security and maintainability and operations and performance." He notes that the lines he previously held as bright between these two are blurring in his own work as agents get more capable. He's right.
His framing is craft-first. The accountability gap he names — "Claude Code does not have a professional reputation! It can't take accountability for what it's done" — is real, and his use of "normalization of deviance" verbatim is exactly the right phrase. Security shows up in the essay primarily as a representative example: "Other people get hurt by your stupid bugs." That's not wrong. It's just not the same shape as the actual security record from the same eleven months.
This post is what happens when you take Willison's argument and substitute the empirical record for the anecdotal example. The conclusion comes out roughly the same; the urgency comes out very different.
What Willison's essay names, and what it leaves out#
The essay's central claim is that the practical gap between "I asked Claude Code to fix this and it did, more or less" and "the agent shipped a feature to production this week without me reading the diffs" has closed faster than the discourse around it. Willison frames the problem as accountability — when the developer in the loop becomes a reviewer rather than an author, and then a rubber-stamp reviewer, and then absent — the chain of responsibility for what got shipped breaks. His closing analogy is that of the homeowner who could plumb his house if he watched enough YouTube videos, but would rather hire a plumber. It's a lament with resigned pragmatism, not a call to action.
What Willison treats as one example is, in the security record, the example. The accountability gap doesn't only express itself as bugs that should have been caught. It expresses itself as structural vulnerabilities — patterns where the agent's runtime architecture cannot be made safe by closer review, because the failure happens before the human review step exists.
The CVE list says this directly.
The 2025–2026 CVE record#
What follows are confirmed CVEs and coordinated disclosures from June 2025 through April 2026. Each one is a vulnerability in a production AI coding agent — not the model, the runtime — that resulted in credential theft, code execution, or both.
Claude Code#
-
CVE-2025-52882 (CVSS 8.8, published June 23, 2025). Claude Code IDE extensions — VSCode 0.2.116–1.0.23 and its forks Cursor, Windsurf, and VSCodium, plus JetBrains beta 0.1.1–0.1.8 — accepted unauthorized WebSocket connections from attacker-controlled webpages. The attacker could read files, list open files, and capture selection / diagnostics events from the developer's IDE simply by getting them to visit a webpage. Patched in VSCode 1.0.24 and JetBrains 0.1.9.
-
CVE-2025-59536 (CVSS 8.8, fixed in Claude Code v1.0.111, October 2025). The NVD wording: "Claude Code could be tricked to execute code contained in a project before the user accepted the startup trust dialog." A malicious
.mcp.json,.claude/settings.json, or hook configuration in an untrusted repository enabled remote code execution and Anthropic API-key exfiltration before Claude Code's trust dialog fired. The trust boundary existed in the UI but not in the code path that ran first. -
CVE-2026-21852 (CVSS 5.3, fixed in Claude Code v2.0.65, January 2026). The advisory wording: "Claude Code would issue API requests before showing the trust prompt, including potentially leaking the user's API keys." The attacker controls a malicious repo's
ANTHROPIC_BASE_URL; Claude Code dutifully sends the user's authenticated API requests to it; the API key leaks. Same pre-trust pattern as CVE-2025-59536, different surface. -
Comment and Control (publicly disclosed mid-April 2026 by The Register and Aonan Guan, no CVE assigned). PR title injection against Claude Code Security Review's GitHub Action. Anthropic rated it Critical (CVSS 9.3, later 9.4), paid the researcher $100, and downgraded the severity to None on April 20. Detailed teardown in our prior post.
Cursor#
- CVE-2026-26268 (CVSS 9.9 Critical, published February 13, 2026, fixed in Cursor 2.5). The advisory wording: "Sandbox escape via writing .git configuration was possible in versions prior to 2.5. A malicious agent could write to improperly protected .git settings, including git hooks, which may cause out-of-sandbox RCE." The pattern: agents were allowed to write
.git/artifacts that the surrounding system would later execute (git hooks fire on next git operation), turning a sandboxed agent into a full-host RCE. Same general theme as the Claude Code pre-trust family — the trust boundary exists in the sandbox UI but not in the filesystem write path that bypasses it.
Windsurf#
- CVE-2026-30615 (CVSS 8.0, published April 15, 2026). The advisory wording: "A prompt injection vulnerability in Windsurf 1.9544.26 allows remote attackers to execute arbitrary commands on a victim system." The attack chain: attacker-controlled HTML rendered by Windsurf's IDE silently writes an MCP server config; the STDIO server then executes attacker-controlled commands. Zero-click — the developer doesn't have to approve anything beyond having visited the page that triggered the rendering. Vendor guidance is "upgrade past 1.9544.26"; specific patched version not published.
Google Gemini CLI Action#
- Comment and Control (April 20, 2026, no CVE). Fake "Trusted Content Section" payload in an issue or issue comment overrode the agent's safety instructions and posted the
GEMINI_API_KEYas a public issue comment. Bounty: $1,337 per the researcher's writeup. No public advisory.
GitHub Copilot Agent#
- Comment and Control (April 20, 2026, no CVE). HTML-comment payload invisible to human reviewers but visible to the agent. Bypassed three runtime defenses simultaneously — env-filter, secret-scanner, network-firewall — and exfiltrated
GITHUB_TOKEN+GITHUB_COPILOT_API_TOKEN+GITHUB_PERSONAL_ACCESS_TOKEN+COPILOT_JOB_NONCEvia a base64-encoded committed file. GitHub initially closed the report as "Informative," reopened, and resolved as a "previously identified architectural limitation."
The pattern across all six#
These aren't independent bugs. They're the same architectural shape, executed against six different surfaces.
The agent runtime is a privileged identity. It holds API keys, GitHub tokens, network egress, and (on Claude Code, Cursor, Windsurf) write access to the developer's local filesystem. The privilege exists because the agent needs it to do the work the developer is paying for.
The agent ingests adversary-controllable content into its context window. Workspace files (Claude Code, Cursor), MCP server configurations (Windsurf, Claude Code), pull request metadata (the entire Comment and Control trio). The content is adversary-controllable because the trust boundary on agent input has not been built yet — that's the actual gap.
The boundary failure happens before the human reviews anything. Pre-trust execution on Claude Code, autorun on Cursor, zero-click on Windsurf, prompt-template injection on the Comment and Control three. By the time the developer is in a position to "review what the agent did," the credential is gone.
This is the architectural feature Willison is pointing at when he names the accountability gap. It just isn't a question of whether the developer should have read the diff more carefully. It's that no amount of reading the diff would have prevented any of these six. The CVEs are not failures of attention. They're failures of input segregation in the agent runtime — a security pattern, retroactively patched after disclosure, six times in eleven months.
What changes when you swap the framing#
Willison's essay leaves the reader with a useful intuition (something has normalized) and a useful piece of vocabulary ("normalization of deviance" is exactly the phrase). It does not leave the reader with an action.
The CVE-grounded framing does. If the failure mode is agent runtime processing adversary-controlled content without input segregation, the actions are:
- Treat every adversary-controllable carrier as untrusted in the prompt template, explicitly. Workspace files, MCP server descriptions, PR metadata, web-fetched pages, skill marketplace files — all of them carry the same risk shape.
- Constrain the agent's tool permission set per task, not per session. A code-review agent does not need filesystem write outside the diff. An MCP-enabled agent does not need network egress to non-allowlisted hosts. Default-deny everything; allowlist back to the minimum.
- Audit pre-trust execution paths on every workspace-aware coding agent. The CVE-2025-59536 / CVE-2026-21852 / CVE-2026-26268 family is not the last of these.
- Strip HTML comments from any text that reaches an agent's context window. This is the cheapest single mitigation in the entire CVE list.
- Run the Comment and Control payload battery against your CI agent stack. We have it. So do half a dozen other red-team teams. Every CI-resident coding agent should be tested before each release.
These actions aren't ethical positions. They're operational changes to how the agent runtime is built and deployed. The reason they aren't yet standard is that the security pattern is still being absorbed — by vendors, by their bug bounty programs, by the practitioners running these agents in CI. Willison's essay is a useful pulse-check on the cultural absorption. The CVE list is a less useful but more accurate pulse-check on the operational absorption.
Both are true at once.
The deeper read#
What's normalized isn't closing your eyes when the agent commits. What's normalized is running an agent runtime as a privileged identity, processing adversary-controllable content, without input segregation. That's not a craft problem. It's a security architecture problem, and the CVE list says it's been a security architecture problem for at least eleven months.
The relationship between vibe coding as an authoring practice and vibe coding as a security exposure is what Willison's essay almost names but doesn't quite. The exposure is structural and pre-dates whether any individual developer is reviewing carefully. Closing it requires changes to the agent runtime, not changes to the developer's attention.
Willison is mostly right. The empirical record makes him more right than the essay does, and considerably more urgent.
Frequently asked questions#
What is Simon Willison's vibe coding essay about?
Willison's May 6, 2026 essay "Vibe coding and agentic engineering are getting closer than I'd like" argues that the lines he previously held between vibe coding (where you're not looking at the code at all) and agentic engineering (using AI to the highest of your own ability while keeping focus on security, maintainability, and operations) are blurring as agents get more capable. He calls out the normalization-of-deviance pattern that creeps in as developers stop reviewing AI-written diffs, names Claude Code as the central tool, and treats security primarily as one example of the accountability problem rather than as the central concern. The tone is lament with resigned pragmatism — closing analogy: he could plumb his house if he watched enough YouTube videos, but he'd rather hire a plumber.
What is the empirical security case against vibe coding?
The case is the CVE record. Between June 2025 and April 2026, Claude Code shipped CVE-2025-52882 (CVSS 8.8, WebSocket origin issue in IDE extensions including Cursor and Windsurf forks), CVE-2025-59536 (CVSS 8.8, pre-trust code execution before the user accepted the trust dialog), and CVE-2026-21852 (CVSS 5.3, ANTHROPIC_BASE_URL parsing pre-trust). Cursor shipped CVE-2026-26268 (CVSS 9.9, sandbox escape via .git config writes). Windsurf shipped CVE-2026-30615 (CVSS 8.0, prompt-injection-driven zero-click RCE via attacker-controlled HTML writing an MCP config). And the mid-April 2026 Comment and Control disclosure showed Claude Code Security Review, Gemini CLI Action, and Copilot Agent all vulnerable to a single class of indirect prompt injection — with no CVEs assigned by any of the three vendors. Six confirmed credential-exfil or RCE patterns in eleven months.
Is Simon Willison wrong about vibe coding?
Not wrong — incomplete. Willison's framing is correct as far as it goes: there is an accountability gap, the deviance does normalize, and the practice has converged with agentic engineering. What the framing does not include is the empirical security record from the same eleven months, which turns the ethical argument into an operational one. The CVE list is not a hypothetical concern; it's what's already shipped and patched. A reader who takes Willison's essay as the canonical statement of the problem will leave with the right intuitions but the wrong urgency.
Which AI coding agents have shipped CVEs in 2025-2026?
Confirmed CVEs or coordinated disclosures, June 2025 through April 2026: Claude Code (CVE-2025-52882, CVE-2025-59536, CVE-2026-21852, plus the Comment and Control disclosure with no CVE assigned), Cursor (CVE-2026-26268, sandbox escape via .git config), Windsurf (CVE-2026-30615, zero-click MCP RCE via prompt injection), Google Gemini CLI Action (Comment and Control variant, no CVE), GitHub Copilot Agent (Comment and Control variant, no CVE).
What is the right framing for vibe coding security?
Treat the agent runtime as a privileged identity with broad tool access processing adversary-controllable content. Every CVE in the 2025–2026 list maps to that same architectural pattern. The mitigations are also consistent: explicit input segregation, restricted tool permissions in CI, pre-trust execution audits, HTML-comment scrubbing. None of this is theoretical; it's what the CVE patches actually did, retroactively.
Where Repello fits#
Repello's workstation agent security cluster covers exactly the architectural pattern this post describes — agent runtimes processing adversary-controllable content, the resulting credential exposure, and the input segregation work that closes it. Our ARTEMIS red-teaming framework carries payload batteries for the full CVE list above plus the Comment and Control variants. If your team runs Claude Code, Cursor, Windsurf, Copilot Agent, Gemini CLI, or any AI coding agent with workspace or CI access, the test suite is what we run.



