Anthropic Computer Use
Anthropic API beta tool that lets Claude see a desktop via screenshots and drive mouse/keyboard, running inside a developer-supplied sandbox driven by an agent loop the application implements.
Description
Computer Use is a beta API capability where Claude returns abstract tool calls (screenshot, click, type, key, scroll) for a virtual display and the developer's program executes them in a sandboxed environment (the reference implementation runs Xvfb + Mutter + Tint2 inside Docker). The model is paired with text_editor and bash tools for end-to-end automation. Anthropic ships a reference quickstart that bundles the container, the tool implementations, the agent loop, and a web UI. The feature is gated behind a beta header and Anthropic recommends running with classifiers, allowlisted domains, and human confirmation for consequential actions.
Solution
Stateful multi-turn agent loop driven by the developer. Each turn the model emits one or more tool_use blocks (computer / text_editor / bash); the application executes them in a sandboxed environment, captures results (a screenshot, command output, file diff), and returns them as tool_result blocks. The loop continues until Claude responds without further tool_use blocks or a max-iteration safeguard fires.
Primary use cases
- desktop GUI automation via Claude controlling mouse, keyboard and screenshots
- browser navigation and form filling inside a sandboxed VM
- agentic workflows combining the computer, text_editor and bash tools
- research and prototyping of GUI agents on benchmarks like WebArena
Open the full interactive page →
Diagram, neighbourhood map, code examples, related patterns and full provenance.