Framework · Browser & Computer-Use

Anthropic Computer Use

Anthropic API beta tool that lets Claude see a desktop via screenshots and drive mouse/keyboard, running inside a developer-supplied sandbox driven by an agent loop the application implements.

Description

Computer Use is a beta API capability where Claude returns abstract tool calls (screenshot, click, type, key, scroll) for a virtual display and the developer's program executes them in a sandboxed environment (the reference implementation runs Xvfb + Mutter + Tint2 inside Docker). The model is paired with text_editor and bash tools for end-to-end automation. Anthropic ships a reference quickstart that bundles the container, the tool implementations, the agent loop, and a web UI. The feature is gated behind a beta header and Anthropic recommends running with classifiers, allowlisted domains, and human confirmation for consequential actions.

Solution

Stateful multi-turn agent loop driven by the developer. Each turn the model emits one or more tool_use blocks (computer / text_editor / bash); the application executes them in a sandboxed environment, captures results (a screenshot, command output, file diff), and returns them as tool_result blocks. The loop continues until Claude responds without further tool_use blocks or a max-iteration safeguard fires.

Primary use cases

  • desktop GUI automation via Claude controlling mouse, keyboard and screenshots
  • browser navigation and form filling inside a sandboxed VM
  • agentic workflows combining the computer, text_editor and bash tools
  • research and prototyping of GUI agents on benchmarks like WebArena

Open the full interactive page

Diagram, neighbourhood map, code examples, related patterns and full provenance.