LLMCompiler
also known as LLM Compiler, Parallel ReWOO
Take ReWOO's plan-as-DAG and run independent steps in parallel through a task-fetching dispatcher.
This pattern helps complete certain larger patterns —
- specialises ReWOO ·: Plan a complete dependency DAG with placeholder variables before any tool runs, then execute and substitute observations into the plan.
- used-by Control-Flow Integrity ★: Treat the agent's planned step sequence as a trusted control-flow graph that tool outputs, retrieved content, and user-supplied data cannot redirect at runtime.
Context
A team runs an agent whose work consists of many tool calls — fetching prices for nine tickers, summarising five documents, querying three APIs — and most of those calls are independent of each other. The deployment is latency-sensitive: a user is waiting for an answer or a downstream system has a deadline. The team is already using a plan-then-execute style architecture such as ReWOO (Reasoning Without Observation), where the planner emits a directed acyclic graph of tool calls before any tool runs.
Problem
A sequential executor walks the plan one tool call at a time, so end-to-end latency is the sum of every call even when the calls have no mutual dependency. Naive parallel-tool-calling (firing them all at once from a single chat turn) ignores the dependency graph and breaks when later calls reference earlier results. A bespoke parallel runner without bounded concurrency and a join step blows past provider rate limits, leaks errors across branches, and assembles results out of order. The team needs a runner that respects the dependency graph while overlapping independent work.
Forces
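- Concurrency control: per-provider concurrency limits, rate limits, and fan-out cost all bound how many calls can be in flight at once.
- Failure isolation: a failure in one branch should not abort the others.
- Joiner correctness: results arrive out of order and must still be combined correctly.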
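The three forces above can be sketched together with Python's asyncio. This is a minimal illustration, not a reference implementation: `fetch_price`, `run_branches`, and the ticker names are hypothetical stand-ins for real tool calls, and the limit of five is an assumed value.

```python
import asyncio


async def fetch_price(ticker: str) -> float:
    # Hypothetical stand-in for a real tool call.
    await asyncio.sleep(0.01)
    if ticker == "BAD":
        raise ValueError(f"unknown ticker: {ticker}")
    return float(len(ticker))


async def run_branches(tickers: list[str], limit: int = 5) -> dict[str, object]:
    # Concurrency control: at most `limit` calls are in flight at once.
    sem = asyncio.Semaphore(limit)

    async def guarded(ticker: str) -> float:
        async with sem:
            return await fetch_price(ticker)

    # Failure isolation: return_exceptions=True records each failure as a
    # result value instead of raising on the first one, so the caller still
    # sees the outcome of every branch.
    # Joiner correctness: gather returns results in input order, whatever
    # order the calls actually finish in.
    results = await asyncio.gather(
        *(guarded(t) for t in tickers), return_exceptions=True
    )
    return dict(zip(tickers, results))


outcome = asyncio.run(run_branches(["AAPL", "BAD", "MSFT"]))
```

Here the failed branch shows up as an exception object under its own key, and the joiner can decide whether to retry it, omit it, or report it.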
Example
An agent that builds a daily portfolio brief makes nine independent tool calls (one price fetch per ticker) strictly in sequence, taking 18 seconds where fully parallel execution would take about two. The team rebuilds the loop as LLMCompiler: the planner emits the call DAG up front, the task-fetching unit dispatches each fetch as soon as its dependencies (none, in this case) resolve, with concurrency capped at five, and the joiner assembles the brief. With the cap at five the nine fetches run in two waves, so the brief returns in roughly four seconds instead of 18, and the planner can still express genuine cross-step dependencies when they exist.
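The latency figures in this example can be checked with a simple wave model. The sketch below assumes every fetch takes the same two seconds (18 seconds divided by nine calls; the per-call timing is an assumption, not something the example states) and that the calls are fully independent. The function names are illustrative.

```python
import math


def sequential_latency(n_calls: int, seconds_per_call: float) -> float:
    # A sequential executor pays for every call in turn.
    return n_calls * seconds_per_call


def capped_parallel_latency(n_calls: int, seconds_per_call: float, cap: int) -> float:
    # With equal-duration independent calls, a concurrency cap yields
    # ceil(n_calls / cap) back-to-back waves.
    waves = math.ceil(n_calls / cap)
    return waves * seconds_per_call


print(sequential_latency(9, 2.0))              # 18.0
print(capped_parallel_latency(9, 2.0, cap=5))  # 4.0
print(capped_parallel_latency(9, 2.0, cap=9))  # 2.0
```

Under these assumptions a cap of five gives about four seconds, and the two-second floor is only reached when the cap is at least nine, so the cap is a direct trade between latency and pressure on provider rate limits.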
Solution
Therefore:
The pattern has three roles. The Planner builds the dependency DAG. The Task-Fetching Unit dispatches each step as soon as its inputs become available, under a bounded concurrency limit. The Joiner assembles the final answer from the resolved DAG.
What this pattern forbids. No step may start before every upstream variable it references has been resolved.
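The three roles and the dependency rule can be sketched in a few dozen lines of asyncio. This is an illustration under stated assumptions, not the LLMCompiler reference implementation: the plan format (a dict of step id to tool name and arguments), the `$step_id` placeholder syntax, and the names `run_plan`, `join`, `price`, and `total` are all hypothetical, and the plan is hand-written where a real planner would be an LLM call. The plan is assumed to be acyclic; a cycle would leave the steps waiting on each other forever.

```python
import asyncio
import re
from typing import Any, Awaitable, Callable

Tool = Callable[..., Awaitable[Any]]
Plan = dict[str, tuple[str, list[Any]]]  # step id -> (tool name, arguments)
PLACEHOLDER = re.compile(r"^\$(\w+)$")   # "$p1" refers to the result of step p1


async def run_plan(plan: Plan, tools: dict[str, Tool], limit: int = 5) -> dict[str, Any]:
    """Task-fetching unit: run each step once its placeholders are resolved."""
    sem = asyncio.Semaphore(limit)
    tasks: dict[str, asyncio.Task] = {}

    async def run_step(step_id: str) -> Any:
        tool_name, args = plan[step_id]
        resolved = []
        for arg in args:
            match = PLACEHOLDER.match(arg) if isinstance(arg, str) else None
            # Wait for the upstream step and substitute its observation.
            resolved.append(await tasks[match.group(1)] if match else arg)
        # Take a concurrency slot only after inputs are ready, so steps that
        # are merely waiting never occupy a slot.
        async with sem:
            return await tools[tool_name](*resolved)

    # All tasks are registered before any of them starts running.
    for step_id in plan:
        tasks[step_id] = asyncio.create_task(run_step(step_id))
    results = await asyncio.gather(*tasks.values())
    return dict(zip(tasks, results))


async def price(ticker: str) -> float:
    await asyncio.sleep(0.01)                 # hypothetical tool call
    return {"AAA": 10.0, "BBB": 20.0}[ticker]


async def total(*prices: float) -> float:
    return sum(prices)


def join(results: dict[str, Any]) -> str:
    # Joiner: assemble the final answer from the resolved DAG.
    return f"Portfolio total: {results['sum']}"


# Planner output, hand-written for this sketch.
plan: Plan = {
    "p1": ("price", ["AAA"]),
    "p2": ("price", ["BBB"]),
    "sum": ("total", ["$p1", "$p2"]),
}
tools: dict[str, Tool] = {"price": price, "total": total}

results = asyncio.run(run_plan(plan, tools))
brief = join(results)
```

The two price steps have no placeholders and overlap immediately, while `sum` starts only after both have resolved. In this sketch a failing step raises through to its dependents and to the caller; a production runner would likely add per-branch error handling.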
The smaller patterns that complete this one —
- uses Parallelization ★★: Run independent LLM calls concurrently and combine results.
And the patterns that stand alongside it, or against it —
- alternative-to Parallel Tool Calls ★★: Allow the model to emit several independent tool calls in one assistant turn; the host executes them in parallel.
- composes-with Subagent Isolation ★: Run subagents in isolated workspaces so their writes do not collide and parallelism is safe.
- complements Graph of Thoughts ·: Model reasoning as an arbitrary DAG so thoughts can be merged, refined, and aggregated across branches.