Anthropic Computer Use
Type: full-code · Vendor: Anthropic · Language: API · License: proprietary · Status: active · Status in practice: emerging · First released: 2024-10-22
Anthropic API beta tool that lets Claude see a desktop via screenshots and drive mouse/keyboard, running inside a developer-supplied sandbox driven by an agent loop the application implements.
Description. Computer Use is a beta API capability where Claude returns abstract tool calls (screenshot, click, type, key, scroll) for a virtual display and the developer's program executes them in a sandboxed environment (the reference implementation runs Xvfb + Mutter + Tint2 inside Docker). The model is paired with text_editor and bash tools for end-to-end automation. Anthropic ships a reference quickstart that bundles the container, the tool implementations, the agent loop, and a web UI. The feature is gated behind a beta header and Anthropic recommends running with classifiers, allowlisted domains, and human confirmation for consequential actions.
Agent loop shape. Stateful multi-turn agent loop driven by the developer. Each turn the model emits one or more tool_use blocks (computer / text_editor / bash); the application executes them in a sandboxed environment, captures results (a screenshot, command output, file diff), and returns them as tool_result blocks. The loop continues until Claude responds without further tool_use blocks or a max-iteration safeguard fires.
Primary use cases
- desktop GUI automation via Claude controlling mouse, keyboard and screenshots
- browser navigation and form filling inside a sandboxed VM
- agentic workflows combining the computer, text_editor and bash tools
- research and prototyping of GUI agents on benchmarks like WebArena
Key concepts
- computer tool → computer-use (docs) — Screenshot capture plus mouse and keyboard control over a virtual display.
- text_editor and bash tools → code-execution (docs) — Companion tools for file editing and shell commands inside the same sandbox.
- Agent loop → react (docs) — Developer-owned cycle that ships tool_use results back to Claude until the task ends.
- Sandboxed computing environment → sandbox-isolation (docs) — Xvfb virtual display, Mutter window manager and Tint2 panel inside a Docker container.
- Prompt-injection classifier → approval-queue (docs) — Server-side classifier flags screenshot prompt injections and steers Claude to ask for confirmation.
Patterns this full-code implements —
- ★Computer Use
Headline capability: screenshot, mouse, keyboard via the computer tool.
- ★★ReAct
Documented agent loop is the canonical model-acts-then-observes-result shape.
- ★★Tool Use
Three first-class tools: computer_*, text_editor_*, bash_*.
- ★★Code Execution
bash and text_editor tools execute commands and edit files in the sandbox.
- ★★Sandbox Isolation
Reference implementation runs Xvfb + Mutter + Tint2 inside Docker; Anthropic recommends a dedicated VM/container.
- ★★Approval Queue
Prompt-injection classifier steers the model to ask for confirmation before risky next actions; Anthropic recommends human confirmation for consequential actions.
- ★★Human-in-the-Loop
Anthropic encourages low-risk tasks and explicit user consent; not a built-in workflow primitive.
- ★Dual-System GUI Agent
The computer tool alone is single-system (vision + action by one model). Anthropic's own docs explicitly frame computer use as a single-agent capability, and separation of planner/executor is left to…
Neighbourhood
Click any neighbour to follow the lineage. Scroll to zoom, drag to pan.