ChatGPT agent
Type: app · Vendor: OpenAI · Language: proprietary · License: proprietary · Status: active · Status in practice: mature · First released: 2025-07-17
ChatGPT agent carries out multi-step tasks for a user by operating its own virtual computer with a browser, terminal, and connected tools, pausing for permission before consequential actions.
Description. ChatGPT agent is OpenAI's agentic mode inside ChatGPT that completes end-to-end tasks on a virtual computer rather than only answering questions. The agent plans steps, navigates the web with a visual browser and a text browser, runs code in a terminal, manipulates files, and reads from connectors such as Gmail and Google Drive. It is trained to request user permission before actions with real-world consequences, such as making purchases or sending email, and the user can interrupt the run or take over the browser at any point. It combines the earlier Operator browser agent, deep research web synthesis, and ChatGPT conversation in one system.
Agent loop shape. The agent runs inside an isolated virtual computer that keeps context across tools. It reasons about the goal, drives a visual or text browser and a terminal, reads files and connector data, and iterates step by step. Before any action with real-world consequences it pauses to ask the user for permission, and the user can interrupt, stop, or take over the browser at any point.
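The loop described above can be sketched in a few lines. This is an illustrative toy, not OpenAI's implementation: the names `Step`, `RunLog`, `run_agent`, `ask_permission`, and `should_stop` are all hypothetical, and real steps would call a browser or terminal rather than append to a log.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    tool: str                    # e.g. "browser" or "terminal" (hypothetical labels)
    action: str                  # human-readable description of the step
    consequential: bool = False  # True for purchases, sending email, etc.


@dataclass
class RunLog:
    executed: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def run_agent(
    plan: List[Step],
    ask_permission: Callable[[Step], bool],
    should_stop: Callable[[], bool] = lambda: False,
) -> RunLog:
    """Walk the plan step by step, pausing before consequential actions."""
    log = RunLog()
    for step in plan:
        # The user can interrupt the run between any two steps.
        if should_stop():
            break
        # Consequential steps run only with explicit user approval.
        if step.consequential and not ask_permission(step):
            log.skipped.append(step.action)
            continue
        # Stand-in for actually driving the tool.
        log.executed.append(f"{step.tool}:{step.action}")
    return log
```

Passing `ask_permission=lambda step: False` models a user who declines every consequential action; the non-consequential steps still run, and the declined ones are recorded in `skipped`.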
Primary use cases
- autonomous web research and synthesis
- multi-application task completion on a virtual computer
- form filling and web navigation on the user's behalf
- spreadsheet and presentation generation from gathered data
Key concepts
- Virtual computer → computer-use (docs) — An isolated environment with a browser, terminal, and file system whose state is shared across tools, so the agent can carry intermediate results from one step to the next within a single task.
- Takeover mode → human-in-the-loop (docs) — A control in which the user takes over the visual browser to input sensitive information directly, leaving the agent out of the loop for that step.
- Connectors → tool-use (docs) — Integrations that let the agent read from connected apps such as Gmail, Google Drive, and GitHub so it can use information from those services in a task.
- Deep research (docs) — The web-research capability, carried over from OpenAI's Deep Research, that conducts extensive multi-step searches and synthesises the findings into a report or structured output.
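As a rough illustration of the "state shared across tools" idea in the virtual computer concept, the sketch below has two tools operate on one workspace object, so a file saved by one is visible to the other. `Workspace`, `BrowserTool`, and `TerminalTool` are hypothetical names invented for this example; the actual environment is a sandboxed machine, not an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Workspace:
    """Hypothetical stand-in for the virtual computer's file system."""
    files: Dict[str, str] = field(default_factory=dict)


class BrowserTool:
    def __init__(self, workspace: Workspace) -> None:
        self.workspace = workspace

    def save_page(self, name: str, text: str) -> None:
        # Persist page text so later steps can pick it up.
        self.workspace.files[name] = text


class TerminalTool:
    def __init__(self, workspace: Workspace) -> None:
        self.workspace = workspace

    def word_count(self, name: str) -> int:
        # Reads the same file the browser tool wrote earlier.
        return len(self.workspace.files[name].split())


# Both tools receive the same Workspace instance, which is what lets
# intermediate results carry over from one step to the next.
shared = Workspace()
browser = BrowserTool(shared)
terminal = TerminalTool(shared)
browser.save_page("notes.txt", "quarterly revenue rose sharply")
```

After the last line, `terminal.word_count("notes.txt")` reads the text the browser tool stored, without any explicit hand-off between the two.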
Patterns this app implements
- ★Computer Use
ChatGPT agent operates its own virtual computer end-to-end, driving a visual browser, a terminal, and file operations rather than calling bespoke per-application APIs.
- ★★Human-in-the-Loop
The agent is trained to stop and request explicit user permission before taking consequential actions such as purchases or sending email, and the user can interrupt or take over at any point.
- ★Browser Agent
The agent navigates the web with both a text browser that scans page content and a visual browser that clicks, scrolls, fills forms, and navigates UI elements, so it completes tasks on sites that hav…
- ★★Code Execution
A terminal in the virtual computer lets the agent run code, manipulate files, do data analysis, and run scripts to produce artefacts such as spreadsheets and presentations.
- ★★Tool Use
Through ChatGPT connectors the agent reads from external applications — Gmail, Google Drive, GitHub, Google Calendar, SharePoint — pulling task-relevant data from connected services rather than only…
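The split between the two browsers in the Browser Agent pattern can be pictured as a simple routing decision. The sketch below is an assumption-laden illustration: the action names and the `pick_browser` function are hypothetical, and how the real agent chooses between its browsers is not documented here.

```python
# Hypothetical action vocabularies, derived from the description above:
# the text browser scans page content; the visual browser interacts with UI.
TEXT_ACTIONS = {"scan"}
VISUAL_ACTIONS = {"click", "scroll", "fill_form", "navigate_ui"}


def pick_browser(action: str) -> str:
    """Return which browser a given action would be routed to."""
    if action in TEXT_ACTIONS:
        return "text"
    if action in VISUAL_ACTIONS:
        return "visual"
    raise ValueError(f"unknown browser action: {action!r}")
```

Raising on unknown actions, rather than defaulting to one browser, keeps the routing table explicit; this is a design choice of the sketch only.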