Full-Code · Browser & Computer-Useactive

Anthropic Computer Use

Type: full-code · Vendor: Anthropic · Language: API · License: proprietary · Status: active · Status in practice: emerging · First released: 2024-10-22

Links: homepage docs

Anthropic API beta tool that lets Claude see a desktop via screenshots and drive mouse/keyboard, running inside a developer-supplied sandbox driven by an agent loop the application implements.

Description. Computer Use is a beta API capability where Claude returns abstract tool calls (screenshot, click, type, key, scroll) for a virtual display and the developer's program executes them in a sandboxed environment (the reference implementation runs Xvfb + Mutter + Tint2 inside Docker). The model is paired with text_editor and bash tools for end-to-end automation. Anthropic ships a reference quickstart that bundles the container, the tool implementations, the agent loop, and a web UI. The feature is gated behind a beta header and Anthropic recommends running with classifiers, allowlisted domains, and human confirmation for consequential actions.

Agent loop shape. Stateful multi-turn agent loop driven by the developer. Each turn the model emits one or more tool_use blocks (computer / text_editor / bash); the application executes them in a sandboxed environment, captures results (a screenshot, command output, file diff), and returns them as tool_result blocks. The loop continues until Claude responds without further tool_use blocks or a max-iteration safeguard fires.

Primary use cases

desktop GUI automation via Claude controlling mouse, keyboard and screenshots
browser navigation and form filling inside a sandboxed VM
agentic workflows combining the computer, text_editor and bash tools
research and prototyping of GUI agents on benchmarks like WebArena

flowchart TD user[User task] --> claude[Claude with computer + text_editor + bash tools] claude --> req[tool_use request] req --> classify[Prompt-injection classifier] classify -->|flag| confirm[Ask user to confirm] classify -->|clear| exec[Application executes in sandbox] confirm --> exec exec --> shot[Screenshot or command output] shot --> result[tool_result back to Claude] result --> claude claude --> done[Final assistant text]

Key concepts

computer tool → computer-use (docs) — Screenshot capture plus mouse and keyboard control over a virtual display.
text_editor and bash tools → code-execution (docs) — Companion tools for file editing and shell commands inside the same sandbox.
Agent loop → react (docs) — Developer-owned cycle that ships tool_use results back to Claude until the task ends.
Sandboxed computing environment → sandbox-isolation (docs) — Xvfb virtual display, Mutter window manager and Tint2 panel inside a Docker container.
Prompt-injection classifier → approval-queue (docs) — Server-side classifier flags screenshot prompt injections and steers Claude to ask for confirmation.

Anthropic Computer Use

Neighbourhood

Alternatives & relatives

Listed as alternative by (8)

References

Provenance