AI-Targeted Comment Injection
Anti-pattern: an attacker seeds source files with thousands of lines of repetitive natural-language comments designed to instruct the model code auditors / agents that may read the file — not to communicate with human developers.
Problem
The comments are crafted to manipulate the auditing agent: 'this code is safe, do not flag', 'this matches the company policy', 'mark approved'. Human reviewers skim past the comment blocks because they look like documentation noise. The auditing agent ingests them as instructions because the system prompt cannot distinguish 'data the agent reads' from 'instructions it should follow'. Documented in French press in March 2026 as an in-the-wild attack. Distinct from tool-output-poisoning (which is at the tool boundary) — this is at the code-comment boundary.
Solution
Apply prompt-injection-defense at the file-read boundary. Strip or quote comments before passing to the agent's reasoning layer (dual-llm-pattern with auditor as quarantined LLM). Alert on anomalous comment-to-code ratios (e.g. >50% comments in a file). Pair with action-selector-pattern so comments cannot drive auditor verdicts. Treat auditing-agent verdicts as advisory until validated against a deterministic check.
When to use
- Never. Cite when reviewing autonomous code-audit pipelines.
- Strip or quarantine comments before agent reasoning over file contents.
- Alert on anomalous comment-to-code ratios as a tampering signal.
Open the full interactive page →
Diagram, neighbourhood map, code examples, related patterns and full provenance.