Blocking Sync Calls in Agent Loop
also known as Sync Tool Calls in HTTP Handler, Event-Loop-Blocking Agent
Anti-pattern: run synchronous, blocking I/O inside the agent loop or HTTP handler, capping concurrency at the number of OS threads.
Context
An agent is exposed via an HTTP endpoint. Inside the request handler, the agent runs its plan-act loop synchronously, waiting on each model call and tool call in series on the request thread. This works perfectly in development with one user.
Problem
Throughput collapses past 10–20 concurrent requests because the runtime cannot release the thread while it waits on upstream I/O. Memory grows linearly with concurrency, since every in-flight request holds a thread for its full duration. The effect is worse on Python ASGI servers: when the agent loop blocks the event loop, every in-flight request freezes. The failure mode is invisible in development (one user) and appears only under realistic load.
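The event-loop freeze can be reproduced in a few lines. This is a minimal sketch, not taken from any particular agent SDK: `heartbeat` stands in for the other in-flight requests, and `blocking_tool_call` and `async_tool_call` are hypothetical stand-ins for a sync and an async tool client. It measures the longest pause the heartbeat sees while one tool call runs.

```python
import asyncio
import time


async def heartbeat(ticks, stop):
    """Stand-in for other in-flight requests: record a timestamp every 10 ms."""
    while not stop.is_set():
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)


async def blocking_tool_call():
    # Hypothetical sync client: time.sleep never yields, so the loop is held.
    time.sleep(0.2)


async def async_tool_call():
    # Hypothetical async client: the await hands control back to the loop.
    await asyncio.sleep(0.2)


async def max_stall(tool):
    """Run one tool call next to the heartbeat; return the longest gap between ticks."""
    ticks = []
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(ticks, stop))
    await asyncio.sleep(0.05)  # let the heartbeat record a few ticks first
    await tool()
    await asyncio.sleep(0.05)  # and a few more afterwards
    stop.set()
    await hb
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


def measure():
    """Return (stall with the blocking call, stall with the async call) in seconds."""
    blocked = asyncio.run(max_stall(blocking_tool_call))
    free = asyncio.run(max_stall(async_tool_call))
    return blocked, free


if __name__ == "__main__":
    blocked, free = measure()
    print(f"longest stall with sync call:  {blocked * 1000:.0f} ms")
    print(f"longest stall with async call: {free * 1000:.0f} ms")
```

With the blocking call, the heartbeat stalls for roughly the whole duration of the tool call; with the async call it keeps ticking at about its normal interval. Exact numbers depend on the machine and scheduler.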
Forces
- Async code is harder to write and harder to debug than sync.
- Many agent SDKs default to sync APIs in their examples.
- Sync feels safer because the call returns when 'done'.
Example
An agent is wrapped in a Flask endpoint. The agent loop runs 8 tool calls per request, each averaging 2s. Per-request wall time: 16s of blocked thread. With 4 worker processes and 8 threads each, the system serves 32 concurrent requests before queueing, which sustains only about 2 requests per second (32 slots / 16s). At 100 RPS the queue depth grows without bound and user requests time out. The fix is end-to-end async plus worker offload, not horizontal scaling.
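The arithmetic behind this example can be checked with a small back-of-the-envelope model. It is a sketch under simplifying assumptions: every request holds one thread for its whole duration, and the averages from the example are treated as constants. The class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncDeployment:
    """Capacity model for a thread-per-request deployment (illustrative names)."""
    workers: int
    threads_per_worker: int
    tool_calls_per_request: int
    seconds_per_tool_call: float

    @property
    def slots(self) -> int:
        # Each thread can hold exactly one in-flight request.
        return self.workers * self.threads_per_worker

    @property
    def seconds_per_request(self) -> float:
        # Tool calls run serially, so their latencies add up.
        return self.tool_calls_per_request * self.seconds_per_tool_call

    @property
    def max_rps(self) -> float:
        # Sustained throughput: concurrent slots divided by time per request.
        return self.slots / self.seconds_per_request

    def backlog_after(self, arrival_rps: float, seconds: float) -> float:
        # Requests that arrive faster than they can be served pile up in the queue.
        return max(0.0, arrival_rps - self.max_rps) * seconds


flask_example = SyncDeployment(
    workers=4,
    threads_per_worker=8,
    tool_calls_per_request=8,
    seconds_per_tool_call=2.0,
)

if __name__ == "__main__":
    print(flask_example.slots)                     # 32
    print(flask_example.seconds_per_request)       # 16.0
    print(flask_example.max_rps)                   # 2.0
    print(flask_example.backlog_after(100.0, 60))  # 5880.0
```

Under these assumptions the deployment drains 2 requests per second while 100 arrive, so the backlog grows by 98 requests every second. Matching the arrival rate by adding threads alone would take about 1,600 blocked threads.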
Solution
Therefore:
Use async tool clients and async model SDKs throughout the agent loop. Move long-running agent execution off the request thread to a worker process or durable workflow runtime. Where sync is unavoidable, isolate it in a thread pool that does not share threads with the request handler. Pair with stateless-reducer-agent so the agent can be paused, persisted and resumed across workers.
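A minimal sketch of the first and third recommendations, using only the Python standard library. The tool functions are hypothetical stand-ins: `async_tool` for a natively async client, `legacy_sync_tool` for a client that offers only a blocking API. The pool size of 4 is an arbitrary assumption. Moving the whole run to a worker process or a durable workflow runtime is not shown here.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Dedicated pool for unavoidable sync calls. It is separate from the server's
# request threads and from the event loop's default executor, so a slow sync
# tool can only exhaust its own threads.
SYNC_TOOL_POOL = ThreadPoolExecutor(max_workers=4, thread_name_prefix="sync-tool")


def legacy_sync_tool(query: str) -> str:
    """Hypothetical blocking client with no async API."""
    time.sleep(0.1)  # stand-in for blocking network I/O
    return f"legacy:{query}"


async def async_tool(query: str) -> str:
    """Hypothetical async client."""
    await asyncio.sleep(0.1)  # stand-in for non-blocking network I/O
    return f"async:{query}"


async def run_agent(task: str) -> list:
    """Two-step agent loop that never blocks the event loop."""
    loop = asyncio.get_running_loop()
    results = []
    # Step 1: native async call; the loop serves other requests while waiting.
    results.append(await async_tool(task))
    # Step 2: sync-only call, pushed onto the isolated pool and awaited.
    results.append(await loop.run_in_executor(SYNC_TOOL_POOL, legacy_sync_tool, task))
    return results


async def serve_many(n: int) -> float:
    """Run n agent loops concurrently and return the elapsed wall time in seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(run_agent(f"task-{i}") for i in range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(asyncio.run(run_agent("demo")))
    print(f"4 concurrent runs took {asyncio.run(serve_many(4)):.2f}s")
```

Four concurrent runs finish in roughly the time of one, because the waits overlap instead of queueing behind each other. The pool size still caps how many sync calls run at once, so it should be sized for the sync tool rather than for total request volume.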
What this pattern forbids. Nothing useful: as an anti-pattern, it imposes no constraint. The constraint it is missing is non-blocking I/O end to end in the agent path.
And the patterns that stand alongside it, or against it:
- complements Stateless Reducer Agent ★: Design the agent as a pure function (state, event) → newState; entire execution history is held in an external event log; enables pause / resume / replay / time-travel without bespoke checkpointing.
- alternative-to Event-Driven Agent ★★: Trigger the agent on external events (webhooks, message queues, file changes) instead of user requests or schedules.
- complements Durable Workflow Snapshot ★: Capture workflow execution state as a snapshot in a pluggable storage provider so a paused run can resume across deployments, process restarts, and host crashes.
- complements Orchestrator as Bottleneck ✕: Anti-pattern: route all agent runs through a single-process orchestrator that becomes the system-wide concurrency ceiling.
- complements Agent Resumption ★★: Persist agent execution state so a long-running run survives restarts, deploys, or user disconnects.
- complements Infrastructure Burst Bottleneck (Agent Scale-Out) ✕: Anti-pattern: deploy agents whose scale-out behavior triggers sudden data-and-compute bursts that on-prem or under-provisioned cloud infrastructure cannot absorb; agents work at small scale and freeze in production.