Language Agent Tree Search
also known as LATS, MCTS for Agents, Tree-Search Agent, Backtracking Agent
Lift the agent loop into a search tree with a value function and backtracking.
This pattern helps complete certain larger patterns —
- specialises Tree of Thoughts ★: Search over a tree of partial reasoning states with explicit lookahead, evaluation, and backtracking.
- specialises Test-Time Compute Scaling ★★: Allocate more inference-time compute (samples, search, deeper thinking) instead of scaling parameters to improve quality.
Context
A team gives an agent a problem where several reasoning paths are plausible at the start — a coding bug with multiple possible root causes, a puzzle with several candidate frames, an investigation that could go in three directions. The first plausible path is often not the best one, and committing to it produces confidently wrong answers when it dead-ends. The team has at least some signal (test suite, verifier, heuristic scorer) that can rate a partial trajectory.
Problem
Single-chain agent loops like ReAct (the reason-act-observe loop) and Plan-and-Execute commit to one chain of thought from the first step. When that chain enters a wrong frame they cannot backtrack cheaply; they either thrash inside the wrong frame or restart from scratch. Self-consistency (sample many answers and vote) helps for one-shot tasks but does not help an agent that needs to interleave tool calls with reasoning. The team needs a way to explore alternative trajectories while still spending most of the compute on the branches that are paying off.
Forces
- Search is expensive; the value function must be cheap.
- Branch ranking determines whether search beats a greedy single chain.
- Memory of failed branches must not leak into successful ones.
Example
A coding agent given an ambiguous bug report tries the first plausible fix, finds that it fails the test suite, then thrashes because its single chain of thought has already committed to that frame. The team rebuilds the loop as LATS: each partial trajectory is a node, expansion samples alternative next actions, the test suite acts as the value signal, and UCT (Upper Confidence bounds applied to Trees) selects the next node to explore. When a branch fails its tests the agent backtracks instead of digging in. Some hard bugs that previously needed a human can now be resolved autonomously.
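One way a test suite could serve as the value signal in an example like this is to score each candidate fix by the fraction of tests it passes. The sketch below is illustrative only: `pass_rate`, `Candidate`, `TestCase`, and the toy absolute-value suite are hypothetical names invented here, not part of any LATS implementation.

```python
from typing import Callable, Sequence

Candidate = Callable[[int], int]        # hypothetical: the function a candidate fix produces
TestCase = Callable[[Candidate], bool]  # hypothetical: one test, True when it passes


def pass_rate(candidate: Candidate, tests: Sequence[TestCase]) -> float:
    """Value signal in [0, 1]: the fraction of tests the candidate passes."""
    if not tests:
        return 0.0
    passed = 0
    for test in tests:
        try:
            if test(candidate):
                passed += 1
        except Exception:
            # A crashing candidate counts as a failed test, not a crashed search.
            pass
    return passed / len(tests)


# Toy suite for an absolute-value function (assumed for illustration).
suite = [
    lambda f: f(3) == 3,
    lambda f: f(-3) == 3,
    lambda f: f(0) == 0,
]

first_fix = lambda x: x                    # plausible, but wrong for negative inputs
second_fix = lambda x: -x if x < 0 else x  # the alternative branch

print(pass_rate(first_fix, suite), pass_rate(second_fix, suite))
```

A graded score like this gives the search something to rank partial fixes by, whereas a bare pass/fail would rate every imperfect branch the same.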
Solution
Therefore:
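Apply Monte Carlo Tree Search (MCTS) to the agent loop. Each node is a partial trajectory. Selection chooses the next node to explore by UCT. Expansion samples candidate next thoughts or actions from that node. The value signal scores the resulting trajectory, and backpropagation folds that score into the value estimates of the node and its ancestors. The agent can backtrack from a failing branch instead of committing to it.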
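The loop described above can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not a reference implementation: `propose` stands in for an LLM sampling next actions, `evaluate` stands in for whatever value signal the team has, and both names (along with `Node`, `search`, and the arithmetic toy) are hypothetical. The sketch adds one child per proposed action and scores nodes directly instead of running rollouts.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Trajectory = Tuple[str, ...]  # a partial trajectory: an immutable sequence of actions


@dataclass(eq=False)
class Node:
    trajectory: Trajectory
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total_value: float = 0.0


def uct(child: Node, c: float) -> float:
    """Mean value plus an exploration bonus; unvisited children are tried first."""
    if child.visits == 0:
        return math.inf
    mean = child.total_value / child.visits
    return mean + c * math.sqrt(math.log(child.parent.visits) / child.visits)


def search(
    propose: Callable[[Trajectory], List[str]],  # hypothetical: samples next actions
    evaluate: Callable[[Trajectory], float],     # hypothetical: value signal in [0, 1]
    iterations: int = 50,
    c: float = 1.4,
) -> Trajectory:
    root = Node(trajectory=())
    best, best_value = root.trajectory, evaluate(root.trajectory)
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a node without children.
        node = root
        while node.children:
            node = max(node.children, key=lambda ch: uct(ch, c))
        # Expansion: only the root or an already-scored leaf grows children.
        if node.visits > 0 or node.parent is None:
            for action in propose(node.trajectory):
                node.children.append(Node(node.trajectory + (action,), parent=node))
            if node.children:
                node = node.children[0]
        # Evaluation: score the partial trajectory with the value signal.
        value = evaluate(node.trajectory)
        if value > best_value:
            best, best_value = node.trajectory, value
        # Backpropagation: update the node and every ancestor.
        while node is not None:
            node.visits += 1
            node.total_value += value
            node = node.parent
    return best


# Toy problem (assumed for illustration): start at 1 and reach 10 with "+1" and "*2".
def run_actions(trajectory: Trajectory) -> int:
    x = 1
    for action in trajectory:
        x = x + 1 if action == "+1" else x * 2
    return x


def toy_propose(trajectory: Trajectory) -> List[str]:
    done = len(trajectory) >= 5 or run_actions(trajectory) == 10
    return [] if done else ["+1", "*2"]


def toy_evaluate(trajectory: Trajectory) -> float:
    return 1.0 / (1.0 + abs(10 - run_actions(trajectory)))


print(search(toy_propose, toy_evaluate, iterations=200))
```

Backtracking needs no special code here: when a branch scores poorly, its mean value drops, and the next selection pass simply descends into a sibling.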
What this pattern forbids. Each node may be expanded only by sampling actions consistent with the parent state.
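A minimal sketch of that constraint, assuming each node's state is an immutable record of (action, observation) pairs: every child is built by extending the parent's record, so siblings cannot see each other's steps and a failed branch cannot leak into a successful one. `State`, `expand`, `sample_actions`, and `execute` are hypothetical names. The sketch also assumes the environment can be reset or replayed to the parent state before each action runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (action, observation)


@dataclass(frozen=True)
class State:
    steps: Tuple[Step, ...] = ()

    def extend(self, action: str, observation: str) -> "State":
        # Build a new state; the parent's record is never mutated.
        return State(self.steps + ((action, observation),))


def expand(
    parent: State,
    sample_actions: Callable[[State], List[str]],  # hypothetical: LLM proposals given the parent
    execute: Callable[[State, str], str],          # hypothetical: runs the action from the parent state
) -> List[State]:
    """Create one child per sampled action, each conditioned only on the parent."""
    children = []
    for action in sample_actions(parent):
        observation = execute(parent, action)
        children.append(parent.extend(action, observation))
    return children


parent = State().extend("read_bug_report", "report loaded")
children = expand(
    parent,
    sample_actions=lambda state: ["try_fix_a", "try_fix_b"],
    execute=lambda state, action: f"ran {action}",
)
for child in children:
    print(child.steps)
```

Because every child shares the parent's steps as a prefix and differs only in its final step, the path back to the root is always a faithful record of what that branch actually saw.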
The smaller patterns that complete this one —
- uses ReAct ★★: Interleave a single thought, a single tool call, and a single observation per step so the agent reasons over fresh evidence.
- generalises Adaptive Branching Tree Search ·: At each node of an inference-time search tree, use Thompson sampling to decide whether to deepen an existing answer or branch a fresh attempt, optionally choosing per-node which underlying LLM to invoke.
And the patterns that stand alongside it, or against it —
- complements Self-Consistency ★★: Sample the same question multiple times at non-zero temperature and aggregate by majority or judge to mitigate hallucination.
- complements Exploration vs Exploitation ★: Balance taking the best-known action (exploit) with trying alternatives that might be better (explore).
- complements Graph of Thoughts ·: Model reasoning as an arbitrary DAG so thoughts can be merged, refined, and aggregated across branches.
- complements Process Reward Model ★: Train a verifier that scores each reasoning step rather than only the final answer.
- complements Automatic Workflow Search ·: Treat the agent's workflow (a graph of LLM-invoking nodes) as an artefact to search; use Monte Carlo Tree Search guided by an eval benchmark to discover the best workflow, then deploy it.
- complements World Model as Tool ·: Let a planning agent invoke a generative world model as a tool to roll out hypothetical futures before committing to an action, treating the world model as a callable simulator rather than a training target.
- complements Multi-Path Plan Generator ★★: Generate multiple candidate next steps at each plan node so that one can be selected later; this is the planning generator pattern paired with tree-of-thoughts / LATS-style search.