All posts
Unproven Execution

Comment and Control: AI Agents Hijacked via PR Comments

Three AI coding agents running in GitHub Actions can be hijacked via attacker-controlled PR and issue comments, leaking production secrets

Securityv0 Intelligence Team OWASP: ASI05 sv0 finding: unproven_execution
comment-and-control prompt-injection github-actions claude-code gemini-cli github-copilot

The Incident

In April 2026, security engineer Aonan Guan — with Johns Hopkins researchers Zhengyu Liu and Gavin Zhong — disclosed “Comment and Control,” a class of prompt-injection attacks that hijacks three AI coding agents when they run inside GitHub Actions: Anthropic’s Claude Code Security Review, Google’s Gemini CLI Action, and GitHub Copilot Agent. In each case, a GitHub surface that any outside user can write to — a pull-request title, an issue body, an issue comment, or an HTML comment inside one — flows into the agent’s prompt context alongside the workflow’s secrets and tool grants, and the agent follows the attacker’s instructions instead of the operator’s. No CVEs were assigned for any of the three; the three vendors treated the issues as product bugs handled through their bug-bounty or advisory channels rather than the CVE system.

Against Claude Code Security Review, an attacker opens a pull request whose title breaks out of the prompt context and instructs the agent to run whoami, ps auxeww, and env; the agent then posts the resulting credential dump back to the PR as a JSON “security finding” comment. Against the Gemini CLI Action, the default workflow ingests issue titles, bodies, and comments into the prompt under an “Additional Content” section; an attacker inserts a fake “Trusted Content Section” immediately after it, which overrides the safety instructions and causes Gemini to publish the GEMINI_API_KEY as a public issue comment. Against GitHub Copilot Agent, the payload is hidden inside an HTML comment that bypasses Copilot’s environment filtering and scans the workspace for secrets before exfiltrating them through the network firewall; this variant is the only one that requires a human action — a maintainer has to manually assign the attacker’s issue to Copilot before the payload fires.

MITRE ATT&CK coverage: T1059 (Command and Scripting Interpreter), T1552.001 (Unsecured Credentials: Credentials In Files), T1567 (Exfiltration Over Web Service).

The Authority Path That Failed

The identity carrying execution authority in all three attacks is the GitHub Actions workflow runner. The scope it holds is the union of the workflow’s GITHUB_TOKEN with issue and PR write permissions, provider API keys mounted as workflow secrets (ANTHROPIC_API_KEY, GEMINI_API_KEY, Copilot’s runtime credentials), the runner’s full process environment, and whatever shell, filesystem, and network reach the default action image provides. The scope it exercises under injection collapses to something much narrower and much worse: executing attacker-supplied shell commands and posting their output to a publicly visible comment. That gap — “maintainer-trusted reviewer” held on paper, “attacker-controlled shell with secrets” exercised in practice — is the entire failure.

The trust anchor that failed first is the implicit assumption that text pulled from a GitHub comment surface is data, not instructions. None of the three reference workflows enforced a privilege boundary between the untrusted-input channel (comment bodies, PR titles, HTML comments) and the trusted-instruction channel (the system prompt and tool grant). Anthropic’s own claude-code-security-review Action README acknowledges that the Action “is not hardened against prompt injection” — the vendor documented the failure mode in the deployable artifact and shipped the unhardened default anyway. The gap was flaggable before the injection payload existed: any inventory that lists, per agent, the secrets it holds and the attacker-controlled surfaces that feed its prompt would have surfaced the combination of full-secret scope and open-comment ingress as an unauthorized execution path.

SecurityV0 Perspective

This fits unproven_execution / ASI05. The defect is not a novel prompt-injection technique; it is that each vendor shipped a reference Action that attaches code-execution and credentialed-tool authority to a runtime whose prompt is assembled from text an anonymous outsider can write. The deploying operator — the maintainer who added the Action to their repo — never explicitly authorized the agent to execute attacker-supplied shell commands in response to a PR title, and there is no runtime artifact in the default workflow proving they did.

The evidence pack for this finding would enumerate, per AI agent wired into a CI/CD pipeline, the tuple (held scope, untrusted-input surfaces, operator-signed authorization for the union). Concretely: which secrets the workflow mounts, which tool calls the agent can reach, and which GitHub surfaces (PR titles, issue bodies, comments, review bodies, commit messages, HTML-comment payloads) flow into the agent’s prompt context without sanitization. Before exfiltration, the pack answers a specific question: which agents in this repo can be steered by outside-contributor text into actions the operator never approved? After the fact, it answers the forensic question investigators must answer now: which workflow runs ingested attacker-controlled comment text while holding ANTHROPIC_API_KEY, GEMINI_API_KEY, or a scoped GITHUB_TOKEN, and what did the agent emit on those runs?

What To Do

  • Quarantine AI review Actions on public repositories. Until you can enforce an input-to-capability boundary, pin the three affected Actions to an explicitly reviewed commit SHA and gate their execution on pull_request_target only for PRs from collaborators you trust, not pull_request from forks or public issue comments. Mutable tag references on a prompt-ingesting agent are an NHI trust anchor you cannot rotate.
  • Strip the secrets the agent does not need. ANTHROPIC_API_KEY and GEMINI_API_KEY do not belong in the same runtime as GITHUB_TOKEN with issues: write and pull-requests: write. Split the workflow so the agent call runs in a job with only the provider key, and the “post comment” step runs in a separate job with only the scoped GitHub token, with an explicit allowlist of what text may be posted.
  • Treat every attacker-writable GitHub surface as untrusted input. Enumerate which fields your AI Action reads — PR title, PR body, issue title, issue body, issue comments, review bodies, commit messages, HTML comments — and either drop them from the prompt or wrap them in a fenced “untrusted” envelope with explicit instructions that they are content under review, not instructions to follow.
  • Inventory tool grants per agent, not per workflow. An agent with shell access does not need env access; an agent reviewing diffs does not need network egress. Record each tool the agent can invoke and require an operator-signed justification before the workflow can ship. Any delta between the tool manifest and the justification record is an unproven-execution finding.
  • Monitor Action output for credential-shaped content. Add a post-step on every AI-agent workflow that scans the agent’s posted text, committed files, and comment bodies for AWS, GCP, Azure, provider-API, and GitHub token patterns. A workflow that posts a comment matching ^(AKIA|ghp_|sk-ant-|AIza) in the last 24 hours is the signal Comment and Control generates; treat that as a Sev 1.

Sources