TaskWeaver
Type: full-code · Vendor: Microsoft Research Asia (Beijing) · Language: Python · License: MIT · Status: active · Status in practice: mature
Code-first agent framework that converts user requests into executable Python code, preserves in-memory state (variables, DataFrames) across turns, and orchestrates user-defined plugins as callable functions.
Description. TaskWeaver (arXiv 2311.17541) is Microsoft Research Asia's code-first agent framework for data analytics. Its distinguishing design choice is to preserve the Python interpreter's in-memory state across conversational turns: DataFrames, intermediate variables, and other rich data structures persist, so later turns can refer to them by name instead of reloading or re-serialising them. A planner decomposes the user request into tasks, a code generator emits Python that calls user-defined plugins, and an executor runs that code against the persisted state. Generated code is verified before execution, and execution errors are fed back so the next generation attempt can correct them.
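The state-preservation idea described above can be illustrated with a minimal sketch. It is not TaskWeaver's implementation: `StatefulSession` is a hypothetical class that simply reuses one namespace dictionary across calls to Python's built-in `exec`, which is enough to show why a later turn can refer to an earlier variable by name.

```python
import contextlib
import io


class StatefulSession:
    """Hypothetical stand-in for a persistent interpreter session."""

    def __init__(self) -> None:
        # One namespace shared by every turn; this is what keeps state alive.
        self.namespace: dict = {}

    def run(self, code: str) -> str:
        """Execute code in the shared namespace and return captured stdout."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


session = StatefulSession()

# Turn 1: load data into a named variable.
session.run("sales = [120, 95, 143]")

# Turn 2: refer to the variable by name; nothing is reloaded or re-serialised.
output = session.run("total = sum(sales)\nprint(total)")
```

After the second turn, `session.namespace` holds both `sales` and `total`. A real system would run the session in a separate process or kernel rather than in-process `exec`; the shared namespace is the only point this sketch is meant to make.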
Agent loop shape. The Planner emits a typed plan for the user request. The CodeInterpreter generates Python that invokes the user-defined plugins the plan calls for (each plugin is a typed Python callable). The Executor runs the code in a persistent Python interpreter session, so state from prior turns remains available. Code is checked before execution, and the agent reflects on execution errors to adjust subsequent code generation.
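The loop above can be sketched roughly as follows. Every name here (`plan`, `execute`, `run_task`, `fake_generator`, `max_attempts`) is an assumption made for illustration, and the language model is replaced by a stub whose first draft deliberately fails so the reflection step has something to correct. It shows the shape of the loop, not TaskWeaver's actual classes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExecutionResult:
    ok: bool
    error: Optional[str] = None


def plan(request: str) -> List[str]:
    """Hypothetical planner: split the request into ordered sub-tasks."""
    return [step.strip() for step in request.split(" then ") if step.strip()]


def execute(code: str, namespace: dict) -> ExecutionResult:
    """Run code in the shared namespace and report success or the error text."""
    try:
        exec(code, namespace)
        return ExecutionResult(ok=True)
    except Exception as exc:
        return ExecutionResult(ok=False, error=f"{type(exc).__name__}: {exc}")


def run_task(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    namespace: dict,
    max_attempts: int = 3,
) -> ExecutionResult:
    """Generate, execute, and on failure feed the error back to the generator."""
    feedback: Optional[str] = None
    result = ExecutionResult(ok=False, error="not attempted")
    for _ in range(max_attempts):
        code = generate(task, feedback)
        result = execute(code, namespace)
        if result.ok:
            break
        feedback = result.error  # reflection input for the next attempt
    return result


def fake_generator(task: str, feedback: Optional[str]) -> str:
    """Stub for the code generator: the first draft of 'average' has a bug."""
    if task == "load":
        return "values = [1, 2, 3]"
    if feedback is None:
        return "mean = sum(values) / count"  # fails: 'count' is undefined
    return "mean = sum(values) / len(values)"


namespace: dict = {}
results = [run_task(t, fake_generator, namespace) for t in plan("load then average")]
```

The `max_attempts` cap doubles as a simple step budget: the loop stops after a fixed number of generate-and-execute rounds even if the code never succeeds.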
Primary use cases
- multi-turn data analytics where in-memory state (DataFrames) carries across turns
- domain adaptation via plugins encapsulating proprietary algorithms
- verifiable code-execution flows (pre-execution check + post-execution reflection)
- research baseline for code-first vs. text-first agent architectures
Key concepts
- Code-first interface → code-as-action — User requests become executable Python rather than text-only plans.
- Stateful interpreter session — Variables, DataFrames, and in-memory state persist across turns.
- Plugin functions → tool-use — User-defined typed callables the code generator can invoke.
- Pre-execution code verification → sandbox-isolation — Generated code is checked for safety and correctness before running.
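Two of the concepts above, plugin functions and pre-execution verification, can be sketched together. This is an illustrative assumption, not TaskWeaver's plugin API or its verifier: `top_n` is a hypothetical typed plugin, `PLUGINS` a hypothetical registry, and `verify` a static check built on Python's standard `ast` module with a made-up allowlist and blocklist.

```python
import ast
from typing import List

ALLOWED_IMPORTS = {"math", "statistics"}  # hypothetical allowlist
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}  # hypothetical blocklist


def top_n(values: List[float], n: int) -> List[float]:
    """Hypothetical plugin: return the n largest values, largest first."""
    return sorted(values, reverse=True)[:n]


# Plugins are exposed to generated code by name.
PLUGINS = {"top_n": top_n}


def verify(code: str) -> List[str]:
    """Return a list of violations; an empty list means the code may run."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations: List[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    violations.append(f"import not allowed: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            if module.split(".")[0] not in ALLOWED_IMPORTS:
                violations.append(f"import not allowed: {module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call not allowed: {node.func.id}")
    return violations


safe_code = "best = top_n([3.0, 9.5, 1.2], 2)"
unsafe_code = "import os\nopen('/etc/passwd')"

namespace = dict(PLUGINS)
if not verify(safe_code):
    exec(safe_code, namespace)  # runs only because verification passed

rejected = verify(unsafe_code)  # non-empty, so this code would not be run
```

A static check like this only rejects patterns it knows about; it is not a sandbox, and it would normally be paired with process-level isolation for untrusted code.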
Patterns this full-code framework implements
- ★Code-as-Action Agent
Core architecture is code-first.
- ★★Code Execution
- ★★Plan-and-Execute
Planner → CodeInterpreter → Executor.
- ★★Hierarchical Agents
Planner over CodeInterpreter over Executor.
- ★★Tool Use
Plugins as typed callables.
- ★★Self-Refine
Reflection on execution errors adjusts next code.
- ★★Structured Output
Typed plan and typed plugins.
- ★★Sandbox Isolation
- ★★Step Budget