Phantom Action Completion
also known as Execution Hallucination, Claimed-Not-Done, Says-Done-Did-Nothing
Anti-pattern: the agent reports a side-effecting action as complete from its own narration, when the tool call silently failed or never ran and nothing checked that the effect occurred.
Context
An agent runs tasks that mutate the outside world: filing a ticket, sending an email, updating a record, writing a file, charging a card. The action is delegated to a tool, and the agent then composes a natural-language reply to the user that describes what it did. The loop that decides what to say to the user is the same loop that issued the tool call, so the agent infers success from its own intent rather than from a confirmed effect.
Problem
A model generates the most plausible continuation, and after issuing an action the most plausible next sentence is a confident confirmation that the action succeeded. When the tool call silently fails, times out, returns an unparsed error, or is skipped entirely, the model often sees nothing that contradicts the expected happy path, so it still narrates success. The user is told the ticket was filed or the email was sent, the effect never landed, and the gap surfaces only later when the missing outcome is noticed downstream.
Forces
- The statistically most plausible continuation after an action is a confirmation, so the model drifts toward claiming success regardless of what the tool returned.
- A side-effecting call can fail in ways that raise no exception the agent sees: a swallowed error, a timeout, a no-op response, or a call the model narrated but never actually emitted.
- Adding an independent post-action check of the effect costs an extra read and slows the turn, so it is tempting to trust the call return instead.
- Effects often land in a different system than the one the agent called, so confirming them requires querying that downstream system, not the tool response.
Example
A user asks a support agent to open a refund ticket. The agent calls the ticketing tool, the call times out without surfacing an error to the agent, and the agent replies 'Your refund request has been successfully submitted.' No ticket exists. Days later the user follows up, the agent again assures them the request was filed, and only when a human checks the queue does anyone discover that nothing was ever created.
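The failure in this example can be reproduced in a few lines. The following is a minimal sketch, not a real ticketing integration: `create_ticket`, `naive_agent_reply`, and the in-memory `TICKET_QUEUE` are hypothetical names invented for illustration, and the timeout is simulated.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the ticket queue (the system of record).
TICKET_QUEUE: Dict[str, dict] = {}


def create_ticket(summary: str) -> Optional[str]:
    """Hypothetical ticketing tool whose wrapper swallows a timeout."""
    try:
        # Simulated backend timeout; nothing is written to TICKET_QUEUE.
        raise TimeoutError("ticketing backend did not respond")
    except TimeoutError:
        # The error is swallowed here, so the agent loop never sees it.
        return None


def naive_agent_reply(summary: str) -> str:
    """Anti-pattern: the reply is composed from intent, not from an observed effect."""
    create_ticket(summary)  # return value ignored, queue never consulted
    return "Your refund request has been successfully submitted."


reply = naive_agent_reply("Refund request")
print(reply)
print("tickets in queue:", len(TICKET_QUEUE))
```

The reply claims success while the queue stays empty. Nothing in `naive_agent_reply` could have produced a different sentence, because the reply never depends on the state of the queue.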
Solution
Therefore:
Treat an action as complete only when an independent check observes its effect, not when the agent says so. After each side-effecting call, query the system of record for the artifact the action was supposed to produce — the ticket id, the sent-message receipt, the updated row, the written file — and compare it against what was intended. If the read-back is missing or does not match, report failure or retry rather than confirming. Keep the verifier outside the agent's own reasoning loop so a hallucinated confirmation cannot satisfy it, and have the agent answer user verification questions from the read-back, never from memory of what it meant to do.
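One possible shape for the read-back check is sketched below. It assumes the action returns an artifact id and that the system of record can be queried by that id. `run_verified`, `ActionResult`, and the in-memory `store` are hypothetical names for illustration, not part of any particular agent framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ActionResult:
    confirmed: bool
    detail: str


def run_verified(
    act: Callable[[], Optional[str]],
    read_back: Callable[[str], Optional[dict]],
    expected: dict,
    max_attempts: int = 1,
) -> ActionResult:
    """Run a side-effecting action; confirm only if the read-back matches."""
    for attempt in range(1, max_attempts + 1):
        artifact_id = act()
        if artifact_id is None:
            continue  # the tool returned nothing usable
        # Query the system of record instead of trusting the tool return.
        record = read_back(artifact_id)
        if record is not None and all(record.get(k) == v for k, v in expected.items()):
            return ActionResult(True, f"verified {artifact_id} on attempt {attempt}")
    return ActionResult(False, f"no matching record after {max_attempts} attempt(s)")


# Hypothetical system of record and a tool that silently fails on its first call.
store: Dict[str, dict] = {}
calls = {"n": 0}


def flaky_create() -> Optional[str]:
    calls["n"] += 1
    if calls["n"] == 1:
        return None  # silent failure: no exception, no record
    store["T-1"] = {"type": "refund", "status": "open"}
    return "T-1"


result = run_verified(flaky_create, store.get, {"type": "refund"}, max_attempts=2)
# A claimed id with no record behind it must not be confirmed.
phantom = run_verified(lambda: "T-999", store.get, {"type": "refund"})
print(result)
print(phantom)
```

The user-facing reply would then be derived from `ActionResult`, so a confirmation sentence can only be produced when `confirmed` is true. Retrying with `max_attempts` greater than one is safe only if the underlying action is idempotent; otherwise a retry after a partial success can duplicate the effect.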
What this pattern forbids. A side-effecting action is never reported as complete from the agent's own narration or from the tool-call return alone; success must not be claimed until an independent check has read the effect back from the system of record.
The patterns that counter or replace it:
- alternative-to: Planner-Executor-Verifier (PEV) ★. Triadic specialization where a planner produces the plan, an executor runs it, and a separate verifier checks each step's effects against the original goal.
- complements: Deception Manipulation ✕. Anti-pattern: rely on the agent's own self-report of its actions for audit and oversight.
- complements: Missing Idempotency on Agent Calls ✕. Anti-pattern: retry state-mutating agent tool calls without idempotency keys, so retries multiply real-world side effects.
- complements: Dry-Run Harness ★. Simulate planned actions (and their projected side effects) without committing them, surfacing a reviewable diff before any commit.
- complements: Workflow-Success vs Business-Validity Gap ✕. Anti-pattern: a terminal success status from the agent or its workflow engine is read as proof the deliverable is business-correct, when it certifies only technical completion.
- complements: Silent Hypotheses in Generated Code ✕. Anti-pattern: model-written code rests on an unstated runtime premise that passing tests and code review never surface, so the hidden assumption travels into production and fails there.
- complements: Silent External-Source Rot ✕. Anti-pattern: an agent keeps reporting success while a wrapped external source has silently changed structure, so its tool returns valid-but-empty or degraded output that nothing watches.