Framework · Browser & Computer-Use

ChatGPT agent

ChatGPT agent carries out multi-step tasks for a user by operating its own virtual computer with a browser, terminal, and connected tools, pausing for permission before consequential actions.

Description

ChatGPT agent is OpenAI's agentic mode inside ChatGPT. Rather than only answering questions, it completes end-to-end tasks on a virtual computer. The agent plans steps, navigates the web with a visual browser and a text browser, runs code in a terminal, manipulates files, and reads from connectors such as Gmail and Google Drive. It is trained to request user permission before actions with real-world consequences, such as making purchases or sending email, and the user can interrupt the run or take over the browser at any point. It combines the earlier Operator browser agent, deep research's web synthesis, and ChatGPT's conversational interface in one system.

Solution

The agent runs inside an isolated virtual computer that preserves task context as it switches between tools. It reasons about the goal, drives a visual or text browser and a terminal, reads files and connector data, and iterates step by step. Before any action with real-world consequences it pauses to ask the user for permission, and the user can interrupt, stop, or take over the browser at any point.
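
The control flow described above can be illustrated with a minimal sketch. This is not OpenAI's implementation, which is not public in this form. It only shows the general shape of a step-by-step loop that pauses for permission before consequential actions and honours a user stop request. Every name here (Action, RunLog, run_agent, ask_permission, should_stop, CONSEQUENTIAL) is hypothetical, and the set of "consequential" action kinds is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

# Assumption: action kinds treated as having real-world consequences.
CONSEQUENTIAL = {"purchase", "send_email"}


@dataclass
class Action:
    kind: str    # e.g. "browse", "run_code", "purchase"
    detail: str  # human-readable description of the step


@dataclass
class RunLog:
    executed: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)
    stopped: bool = False


def run_agent(
    plan: Iterable[Action],
    ask_permission: Callable[[Action], bool],
    should_stop: Callable[[], bool] = lambda: False,
) -> RunLog:
    """Walk the plan one step at a time with a permission gate."""
    log = RunLog()
    for action in plan:
        # The user can interrupt the run before any step.
        if should_stop():
            log.stopped = True
            break
        # Pause for permission before consequential actions.
        if action.kind in CONSEQUENTIAL and not ask_permission(action):
            log.skipped.append(action.detail)
            continue
        # Stand-in for real tool execution (browser, terminal, connector).
        log.executed.append(action.detail)
    return log
```

For example, given a plan that browses a store page, runs a price comparison, attempts a purchase, and sends an email, a user who approves only the email would see the purchase recorded in skipped while the other three steps appear in executed. A real agent would choose each step dynamically rather than follow a fixed plan. The fixed list just keeps the sketch short.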

Primary use cases

  • autonomous web research and synthesis
  • multi-application task completion on a virtual computer
  • form filling and web navigation on the user's behalf
  • spreadsheet and presentation generation from gathered data
