II · Planning & Control Flow · Experimental

Two-Rate Cloud-Brain / Edge-Controller Split

also known as Fast-Slow VLA Split, Two-Clock Brain-Controller, Asynchronous Dual-System Robot Policy

Run a slow planner at low frequency that emits a compact latent plan, and a small on-device controller that tracks it at the robot's native control rate without ever blocking on the planner.

Context

An embodied agent has to keep a physical body stable and on-task. The body needs new motor commands tens of times a second to stay balanced and to track a moving target, but the large model that understands the scene, follows the instruction, and chooses what to do next takes far longer than one control period to produce an output. The big model often runs off-board on a cloud or workstation accelerator, while only a small accelerator sits on the robot.

Problem

A single model cannot be both the deliberate planner and the real-time motor loop. If the high-rate controller waits for each new plan from the slow planner, its effective rate collapses to the planner's inference speed and the body falls out of balance or overshoots its target. If the slow planner is forced to run fast enough for control, it must shrink until it can no longer reason about the scene or the instruction. The agent needs deliberation and real-time actuation at the same time on the same body.
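The rate collapse described above can be checked with simple arithmetic. The sketch below is illustrative only: the function name `command_rate_hz` and the numbers (a 100 Hz control loop, a planner that takes 0.25 s per inference) are assumptions chosen to match the rough magnitudes in this entry, not measurements.

```python
def command_rate_hz(control_hz: float, planner_latency_s: float, blocking: bool) -> float:
    """Rate at which the body receives new motor commands.

    blocking=True models a controller that waits for one full planner
    inference before every control step; blocking=False models a controller
    that keeps its own clock and reuses the last posted plan.
    """
    control_period_s = 1.0 / control_hz
    if blocking:
        # Each command costs one planner inference plus one control step.
        return 1.0 / (planner_latency_s + control_period_s)
    return control_hz


if __name__ == "__main__":
    # Hypothetical numbers: 100 Hz controller, 0.25 s planner inference.
    print(round(command_rate_hz(100.0, 0.25, blocking=True), 2))  # 3.85
    print(command_rate_hz(100.0, 0.25, blocking=False))           # 100.0
```

Under these assumed numbers, the blocking design delivers fewer than four commands a second to a body that needs a hundred.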

Forces

Real-time stability needs a fixed high control rate; a missed deadline is a physical failure, not a slow response.
Scene understanding and instruction following need a large model whose inference is far slower than one control period.
On-board compute and power are limited, so the large model often runs off-board and reaches the body over a link with variable latency.
The two parts must agree on what to do, yet they update on two different clocks.

Example

A humanoid robot is told to carry a tray across a moving crowd. A large planner on a nearby workstation looks at the scene a few times a second and posts a short latent goal such as 'step left, keep the tray level'. A small model on the robot reads that goal and the robot's balance sensors a hundred times a second and sends the leg and arm commands that keep it upright. When a person steps in and the next plan is late, the small model keeps tracking the last goal so the robot does not stumble.

Solution

Therefore:

Separate the agent into two loops that run on two clocks and communicate through a small shared latent. The slow planner — a large model, often off-board — reads the instruction and recent observations and emits a compact latent plan or goal at low frequency, for example a few hertz. The fast controller — a small model on the robot — takes that latent plan plus the latest proprioception and sensor readings and produces motor commands at the native control rate, for example fifty to a hundred hertz. The fast loop never waits for the slow loop: it reads whichever latent plan is currently posted and keeps tracking it, and the slow loop overwrites that plan asynchronously whenever its next inference finishes. The interface between them is the latent plan, so the planner can be retrained or moved across the link without changing the controller's deadline.
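One way to picture the two clocks is a single-slot mailbox that the planner overwrites and the controller reads without waiting. The following is a minimal sketch under stated assumptions, not a reference implementation: `PlanSlot`, `LatentPlan`, `slow_planner`, `fast_controller`, and `run_demo` are hypothetical names, the latent is a plain tuple of floats, the planner's inference is simulated with a timed wait, and both loops run as threads in one process rather than across a network link.

```python
import threading
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class LatentPlan:
    vector: Tuple[float, ...]  # compact latent goal; the shape is an assumption
    posted_at: float           # time.monotonic() when the planner posted it


class PlanSlot:
    """Single-slot mailbox: the planner overwrites it, the controller only reads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._plan: Optional[LatentPlan] = None

    def post(self, vector: Tuple[float, ...]) -> None:
        plan = LatentPlan(tuple(vector), time.monotonic())
        with self._lock:  # held only for a reference swap
            self._plan = plan

    def latest(self) -> Optional[LatentPlan]:
        with self._lock:
            return self._plan


def slow_planner(slot: PlanSlot, stop: threading.Event,
                 counts: Dict[str, int], inference_s: float = 0.25) -> None:
    # stop.wait() stands in for the large model's inference time.
    while not stop.wait(inference_s):
        counts["plans"] += 1
        slot.post((float(counts["plans"]), 0.0))


def fast_controller(slot: PlanSlot, stop: threading.Event,
                    counts: Dict[str, int], control_hz: float = 100.0) -> None:
    period = 1.0 / control_hz
    deadline = time.monotonic()
    while not stop.is_set():
        plan = slot.latest()  # returns immediately, fresh plan or not
        goal = plan.vector if plan is not None else (0.0, 0.0)
        _command = tuple(0.1 * g for g in goal)  # placeholder for the small policy
        counts["ticks"] += 1
        deadline += period  # fixed-rate schedule, independent of the planner
        time.sleep(max(0.0, deadline - time.monotonic()))


def run_demo(duration_s: float = 1.0) -> Tuple[int, int]:
    slot, stop = PlanSlot(), threading.Event()
    counts: Dict[str, int] = {"ticks": 0, "plans": 0}
    threads = [
        threading.Thread(target=slow_planner, args=(slot, stop, counts)),
        threading.Thread(target=fast_controller, args=(slot, stop, counts)),
    ]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    return counts["ticks"], counts["plans"]


if __name__ == "__main__":
    ticks, plans = run_demo(1.0)
    print(f"controller ticks: {ticks}, plans posted: {plans}")
```

The controller's schedule depends only on its own deadline, so a late or missing plan changes what it tracks but not how often it acts. In a real deployment the mailbox would sit behind whatever transport carries the latent across the link, which this sketch does not model.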

What this pattern forbids. The fast controller must close its loop at the native control rate and may not block waiting on the slow planner; it must act on the last posted latent plan, and when no fresh plan has arrived it must keep tracking the previous one or fall back to a safe hold rather than stall.
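The rule above leaves open when to stop tracking the previous plan and switch to a safe hold. A minimal sketch of one possible policy follows, assuming a staleness budget: `select_goal`, `max_age_s`, and `hold_goal` are hypothetical names introduced for illustration, and the threshold value would have to come from the robot's own safety analysis.

```python
from typing import Optional, Tuple

Goal = Tuple[float, ...]


def select_goal(last_plan: Optional[Goal], plan_age_s: float,
                max_age_s: float, hold_goal: Goal) -> Tuple[Goal, str]:
    """Pick the goal for this control tick without ever waiting.

    Keeps tracking the last posted plan while it is within the staleness
    budget; otherwise falls back to a safe-hold goal. Both branches return
    immediately, so the control loop never stalls.
    """
    if last_plan is None or plan_age_s > max_age_s:
        return hold_goal, "safe_hold"
    return last_plan, "track"


if __name__ == "__main__":
    hold = (0.0, 0.0)
    print(select_goal((1.0, 0.5), plan_age_s=0.3, max_age_s=1.0, hold_goal=hold))
    print(select_goal((1.0, 0.5), plan_age_s=2.0, max_age_s=1.0, hold_goal=hold))
    print(select_goal(None, plan_age_s=0.0, max_age_s=1.0, hold_goal=hold))
```

The function is called once per control tick with the age of the most recent plan, so a delayed planner degrades the behaviour gradually instead of halting the loop.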

And the patterns that stand alongside it, or against it —

alternative-to · Talker-Reasoner ★: Split an interactive agent into a fast Talker for conversational responses and a slow Reasoner for deliberative planning and tool use, so the conversational loop never blocks on reasoning.
alternative-to · Dual-System GUI Agent ★: Split a GUI agent into a decision model that plans and recovers from errors and a grounding model that observes pixels and emits the precise action; route each subproblem to the better-suited model.
complements · Hierarchical Agents ★★: Organise agents in a tree where higher-level agents decompose tasks for lower-level agents, recursively.
complements · Local-to-Cloud Handoff ★: Promote an interactive local agent session mid-task to a detached cloud agent that keeps running after the developer disconnects and reports back asynchronously.

Provenance

Source: patterns/two-rate-brain-controller-split.md on GitHub · commit ad426c4
Added to catalog: 2026-06-14
Last updated: 2026-06-14
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.