JSON-Only Action Schema

also known as JSON-Dict Tool Calls Only, No Code-as-Action, Function-Argument JSON as Action Language

Anti-pattern: restrict the agent's action language to JSON tool-call dictionaries even for tasks where code-as-action (function composition, loops, conditionals over results) would be the natural shape.

This pattern helps complete certain larger patterns —

used-by: LLM as Periphery. Invert the typical LLM-in-the-middle architecture: a deterministic state machine and event store form the core; the LLM is restricted to edge tasks (input interpretation and output synthesis only).

Context

A team is building an agent on a framework that standardised early on the provider's function-calling contract: the model emits one tool call per turn as a JSON dictionary with flat arguments, the host executes it, and the result comes back as another turn. As tasks grow more sophisticated — data wrangling, multi-step reductions, conditional branching on intermediate results — the team keeps the JSON-only action language and expresses composition by issuing more turns. The option of letting the agent write a short code snippet that calls tools as functions inside a sandbox is dismissed as too risky or out of scope.

Problem

A JSON tool call cannot directly express a loop, a conditional over an intermediate value, or the reuse of one tool's output as another tool's argument. To compose three tools the agent must take three or more turns, ship each intermediate result back through the model as a string, and reconstruct any structured object on each side. Token cost is dominated by these round-tripped intermediates, latency is dominated by the turn count, and the action language drifts further from the code-shaped composition that is far better represented in the model's training data than JSON tool-call traces.
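The cost described above can be made concrete with a small simulation. This is a minimal sketch, not a real agent loop: the tools `query`, `transform`, and `chart`, the `PLAN` list that stands in for the model, and the character count used as a rough proxy for tokens are all assumptions made for illustration.

```python
import json

# Hypothetical stand-in tools; real ones would call a database or service.
def query(table: str) -> list:
    return [
        {"region": "north", "sales": 120},
        {"region": "south", "sales": 80},
        {"region": "north", "sales": 30},
    ]

def transform(rows_json: str) -> dict:
    # Rows arrive as a string because they were round-tripped through the model.
    totals = {}
    for row in json.loads(rows_json):
        totals[row["region"]] = totals.get(row["region"], 0) + row["sales"]
    return totals

def chart(totals_json: str) -> str:
    totals = json.loads(totals_json)
    return "\n".join(
        f"{region}: {'#' * (value // 10)}" for region, value in sorted(totals.items())
    )

TOOLS = {"query": query, "transform": transform, "chart": chart}

# A fixed plan stands in for the model: (tool name, name of its single flat argument).
PLAN = [("query", "table"), ("transform", "rows_json"), ("chart", "totals_json")]

def run_json_only(first_arg: str):
    """Run one tool call per turn; return (turns, round_tripped_chars, final_result)."""
    turns, round_tripped_chars, payload = 0, 0, first_arg
    for tool_name, arg_name in PLAN:
        call = {"name": tool_name, "arguments": {arg_name: payload}}
        result = TOOLS[call["name"]](**call["arguments"])
        # Every result is serialised to a string before it re-enters the model context.
        payload = result if isinstance(result, str) else json.dumps(result)
        turns += 1
        if turns < len(PLAN):
            # An intermediate is read once as a tool result and emitted again as an argument.
            round_tripped_chars += 2 * len(payload)
    return turns, round_tripped_chars, payload

turns, round_tripped_chars, final = run_json_only("sales")
print(turns, round_tripped_chars)
print(final)
```

With three toy rows the overhead is small, but it scales with the size of every intermediate table, while the turn count scales with the number of tools composed.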

Forces

JSON tool calls are the dominant industry contract and the easiest to log, validate, and rate-limit.
Code-as-action requires a sandboxed interpreter (Python, JS) with its own security envelope.
The CodeAct paper (Executable Code Actions Elicit Better LLM Agents) reports that LLMs solve composition-heavy tasks better when allowed to emit code.
Code is over-represented in LLM training corpora compared to JSON tool-call traces.

Example

A data-investigation agent has tools for query, transform, and chart. Under JSON-only it must call query, return the rows as a JSON blob through the model, call transform with that blob inlined, return another blob, then call chart. Token cost is dominated by the round-tripped tables; latency is dominated by turn count. The team switches to code-as-action: tools are exposed as Python functions, the agent writes a five-line script that pipes query into transform into chart, the interpreter executes it, and the agent receives the chart object back. One turn replaces the five steps above: three tool calls and the two round-tripped tables between them.
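The five-line script in this example might look like the sketch below. The tool bodies, the `AGENT_SCRIPT` text, and the convention that the script leaves its output in a variable called `result` are all hypothetical. The bare `exec` call is only a stand-in for the interpreter and is not a sandbox.

```python
# Hypothetical tools exposed as ordinary Python functions with native types.
def query(table: str) -> list:
    return [
        {"region": "north", "sales": 120},
        {"region": "south", "sales": 80},
        {"region": "north", "sales": 30},
    ]

def transform(rows: list) -> dict:
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["sales"]
    return totals

def chart(totals: dict) -> str:
    return "\n".join(
        f"{region}: {'#' * (value // 10)}" for region, value in sorted(totals.items())
    )

# What the agent might emit as its single action: composition plus one conditional.
AGENT_SCRIPT = """
rows = query("sales")
totals = transform(rows)
if not totals:
    raise ValueError("no rows to chart")
result = chart(totals)
"""

# Assumption: plain exec stands in for the interpreter. It is NOT isolation;
# a real deployment would run this inside a properly sandboxed process.
namespace = {"query": query, "transform": transform, "chart": chart}
exec(AGENT_SCRIPT, namespace)
print(namespace["result"])
```

The rows and totals stay inside the interpreter as native objects, so only the final chart travels back to the model.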

Diagram

flowchart TD
    T[Composition-heavy task] --> J{Action language?}
    J -- JSON only --> T1[Turn 1: call A]
    T1 --> T2[Round-trip result]
    T2 --> T3[Turn 2: call B with inlined result]
    T3 --> Tn[...many turns...]
    J -- code-as-action --> Code[Agent emits code]
    Code --> Run[Sandbox runs script]
    Run --> One[Return composed result in one turn]

Solution

Therefore:

Don't insist on JSON-only when the task needs composition. For composition-heavy work, swap to code-as-action: expose tools as ordinary functions in a sandboxed interpreter and let the agent write the glue. Keep JSON for simple one-tool one-arg actions where the contract genuinely fits. See code-as-action, agent-computer-interface, sandbox-isolation.
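One way a host could accept both action languages is sketched below. Everything here is an assumption made for illustration: the `lookup` and `restock` tools, the `type` field on the action, the `result` variable convention, and the allow-list of builtins. The AST check and the restricted builtins only reduce the obvious attack surface; they are not a security boundary, and real isolation still needs the process-level controls described under sandbox-isolation.

```python
import ast

# Hypothetical tools, exposed both as JSON-callable entries and as plain functions.
def lookup(sku: str) -> dict:
    catalog = {"A1": {"sku": "A1", "stock": 4}, "B2": {"sku": "B2", "stock": 0}}
    return catalog.get(sku, {"sku": sku, "stock": None})

def restock(sku: str, qty: int) -> str:
    return f"restock {sku} x{qty}"

TOOLS = {"lookup": lookup, "restock": restock}

# Small allow-list of builtins; anything else (open, __import__, ...) is unavailable.
SAFE_BUILTINS = {"len": len, "range": range, "sum": sum, "min": min, "max": max}

def run_code_action(source: str):
    """Run an agent-emitted snippet with tools in scope; return its `result` variable."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed in code actions")
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            raise ValueError("dunder attribute access is not allowed")
    namespace = {"__builtins__": SAFE_BUILTINS, **TOOLS}
    exec(compile(tree, "<agent-action>", "exec"), namespace)
    return namespace.get("result")

def dispatch(action: dict):
    """Keep JSON for single calls; route composition to the code path."""
    if action["type"] == "tool_call":
        return TOOLS[action["name"]](**action["arguments"])
    if action["type"] == "code":
        return run_code_action(action["source"])
    raise ValueError(f"unknown action type: {action['type']!r}")

# A one-tool action still fits the JSON contract.
single = dispatch({"type": "tool_call", "name": "lookup", "arguments": {"sku": "A1"}})

# A loop with a conditional over intermediate results goes through the code path.
composed = dispatch({
    "type": "code",
    "source": (
        "result = []\n"
        "for sku in ('A1', 'B2'):\n"
        "    if lookup(sku)['stock'] == 0:\n"
        "        result.append(restock(sku, 10))\n"
    ),
})
print(single, composed)
```

Both paths share one tool registry, so logging, validation, and rate limiting can still hang off the tool functions themselves rather than off the action format.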

What this pattern forbids. Avoiding it constrains the action-language choice: JSON-only dictionaries may serve narrow one-tool-per-turn flows, but composition-heavy tasks must not be forced through them when code-as-action in a sandbox is the natural shape.

The patterns that counter or replace it —

alternative-to: Code-as-Action Agent ★. Have the agent emit a code snippet as its action each step, executed in a constrained interpreter, instead of emitting JSON tool calls; tool composition becomes function nesting and control flow inside the snippet.
alternative-to: Tool Use ★★. Let the LLM produce typed calls against an external toolkit instead of producing free-form text the surrounding system has to parse.
complements: Sandbox Isolation ★★. Run agent-emitted code or actions in a contained environment with restricted filesystem, network, and process privileges.
complements: Agent-Computer Interface ★. Design the tool surface for an LLM agent specifically, with affordances different from human-facing CLIs.
complements: Deterministic Control Flow, Not Prompt ★. Branching decisions live in deterministic application code while the LLM is invoked at strategic points to produce structured signals that the code branches on.
complements: Canonical-Entity Grounding ★. Require the agent to resolve every business identifier it uses (SKU, account, supplier, customer) through an authoritative lookup against the system of record, rather than emitting the identifier from the model's parametric memory.

Provenance

Source: patterns/json-only-action-schema.md on GitHub · commit 7965435
Added to catalog: 2026-05-20
Last updated: 2026-05-22
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.