Routing & Composition

Parallel Tool Calls

Allow the model to emit several independent tool calls in one assistant turn; the host executes them in parallel.

Problem

When a single step needs several tool calls and the agent issues them sequentially, wall-clock latency is the sum of all the calls even though none depends on another, so the product feels sluggish for no good reason. A full directed-acyclic-graph planner that schedules tool calls and tracks dependencies is heavyweight for the simple case where the model already knows which calls are independent. The team needs a lighter way to run independent calls concurrently without standing up a planner.

Solution

The provider's API allows the assistant turn to contain multiple tool calls. The host fans them out concurrently, bounding concurrency and handling rate limits. Each result comes back as its own tool message, and the next assistant turn sees all of them.
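The host-side fan-out can be sketched roughly as follows. This is a minimal illustration, not any provider's actual API: the tool functions, the `TOOLS` registry, the `run_tool_calls` helper, and the message shape (`role`, `tool_call_id`, `content`) are all assumptions chosen for the example. It uses `asyncio.Semaphore` to bound concurrency and `asyncio.gather` to return results in the same order as the calls.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable


# Hypothetical tools; real ones would call external services.
async def get_weather(city: str) -> dict:
    await asyncio.sleep(0.05)  # stand-in for network latency
    return {"city": city, "temp_c": 21}


async def get_time(tz: str) -> dict:
    await asyncio.sleep(0.05)  # stand-in for network latency
    return {"tz": tz, "time": "12:00"}


# Hypothetical registry mapping tool names to async callables.
TOOLS: dict[str, Callable[..., Awaitable[Any]]] = {
    "get_weather": get_weather,
    "get_time": get_time,
}


async def run_tool_calls(tool_calls: list[dict], max_concurrency: int = 4) -> list[dict]:
    """Run every tool call from one assistant turn, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(call: dict) -> dict:
        async with semaphore:
            try:
                result = await TOOLS[call["name"]](**call["arguments"])
                content = json.dumps(result)
            except Exception as exc:
                # Report the failure to the model instead of aborting sibling calls.
                content = json.dumps({"error": str(exc)})
        return {"role": "tool", "tool_call_id": call["id"], "content": content}

    # gather preserves input order, so each message lines up with its call.
    return await asyncio.gather(*(run_one(c) for c in tool_calls))


if __name__ == "__main__":
    calls = [
        {"id": "call_1", "name": "get_weather", "arguments": {"city": "Oslo"}},
        {"id": "call_2", "name": "get_time", "arguments": {"tz": "UTC"}},
    ]
    for message in asyncio.run(run_tool_calls(calls)):
        print(message)
```

Catching exceptions per call is a design choice in this sketch: one failing tool becomes an error message for the model rather than cancelling the whole batch. With both simulated tools sleeping 0.05 s, the batch takes roughly the time of the slowest call rather than the sum.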

When to use

The model frequently issues multiple independent tool calls per turn.
The provider's API supports multiple tool calls in one assistant message.
The host can fan out concurrent calls with bounded concurrency and rate-limit handling.
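One possible shape for the rate-limit handling mentioned above is a retry wrapper with exponential backoff and jitter around each tool call. This is a sketch under stated assumptions: `RateLimitError` and `call_with_backoff` are hypothetical names, and a real client would raise its own exception type when a service signals throttling.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical error raised by a tool client when the service throttles a request."""


async def call_with_backoff(fn, *args, max_attempts=4, base_delay=0.5, **kwargs):
    """Await fn(*args, **kwargs), retrying on RateLimitError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return await fn(*args, **kwargs)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller decide what to do
            # Full jitter: wait a random time up to base_delay * 2^attempt,
            # so parallel calls that were throttled together do not retry in lockstep.
            delay = random.uniform(0, base_delay * 2 ** attempt)
            await asyncio.sleep(delay)
```

Jitter matters more than usual here because the calls were launched together and are likely to be throttled together. The wrapper can be combined with a concurrency limit by awaiting it inside whatever bounded section the host already uses.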
