Latent-Space Reasoning

Let the model reason in continuous hidden-state space instead of decoding each step to text: the last hidden state is fed back as the next input embedding, so a single latent step can represent several candidate continuations at once.

Problem

Forcing every reasoning step through natural-language tokens spends most of the compute on producing coherent words rather than on the few decisions that matter. It also makes the model commit to one continuation at each step: once a token is emitted, the path is chosen. Tasks that need to keep several options open and backtrack are penalised, because token-by-token decoding cannot represent 'either of these next steps' in a single state. The language channel, which is shaped for human readers rather than for search, becomes a bottleneck on reasoning.
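
The commitment problem can be made concrete with a small sketch. The candidate steps and scores below are made-up numbers, not taken from any real model; the point is only that picking one token throws away the rest of the distribution over next steps.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(p):
    """Shannon entropy in bits: how much uncertainty the state still carries."""
    return sum(-x * math.log2(x) for x in p if x > 0)

# Hypothetical next-step candidates and scores (illustrative values only).
candidates = ["try A", "try B", "try C"]
logits = [2.0, 1.9, -1.0]

# Before decoding, the state keeps "A or B" nearly equally open.
probs = softmax(logits)

# Token decoding commits to the single best candidate; the state that is
# carried forward is a one-hot choice with no trace of the alternative.
chosen = max(range(len(probs)), key=probs.__getitem__)
committed = [1.0 if i == chosen else 0.0 for i in range(len(probs))]

print("committed to:", candidates[chosen])
print("entropy before commit:", entropy_bits(probs))
print("entropy after commit:", entropy_bits(committed))
```

The near-tie between the first two candidates is visible before the commit and gone after it, which is the information a later backtracking step would have needed.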

Solution

Instead of decoding each reasoning step into a word token and re-encoding it, take the model's last hidden state as the reasoning state (a 'continuous thought') and feed it directly back as the next input embedding. The model reasons through a sequence of these latent states and decodes to text only when it produces the final answer. Because a continuous state is not collapsed onto one token, it can encode several alternative next steps at once, letting the model explore breadth-first and defer commitment, which helps on tasks that require backtracking. The model has to be trained for this: training mixes latent steps into the reasoning trace so that it learns to use them.
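
The difference between the two loops can be sketched in a few lines. This is a toy under stated assumptions, not an implementation of any particular model: forward is a stand-in for a full transformer pass, EMBED and W are hypothetical hand-picked weights, and the toy ignores the earlier sequence positions that a real model would still attend to.

```python
import math

# Hypothetical toy weights; a real model would be a full transformer.
EMBED = {                      # token -> input embedding
    "A": [1.0, 0.0, 0.0],
    "B": [0.0, 1.0, 0.0],
    "C": [0.0, 0.0, 1.0],
}
W = [                          # stand-in for the model body
    [0.2, 0.9, 0.1],
    [0.8, 0.1, 0.3],
    [0.1, 0.2, 0.7],
]

def forward(x):
    """Map an input embedding to a 'last hidden state' (toy: tanh(W x))."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def decode(h):
    """Pick the token whose embedding has the largest dot product with h."""
    return max(EMBED, key=lambda t: sum(e * hi for e, hi in zip(EMBED[t], h)))

def token_rollout(x, n_steps):
    """Text-style reasoning: decode every step, then re-embed the token."""
    assert n_steps >= 1
    inputs = []
    for _ in range(n_steps):
        inputs.append(x)
        h = forward(x)
        x = EMBED[decode(h)]   # state collapses onto one token's embedding
    return inputs, decode(h)

def latent_rollout(x, n_steps):
    """Latent reasoning: feed the hidden state straight back as the input."""
    assert n_steps >= 1
    inputs = []
    for _ in range(n_steps):
        inputs.append(x)
        h = forward(x)
        x = h                  # continuous thought, never decoded mid-way
    return inputs, decode(h)   # text is produced only for the final answer

tok_inputs, tok_answer = token_rollout(EMBED["A"], 3)
lat_inputs, lat_answer = latent_rollout(EMBED["A"], 3)
print("token-loop inputs: ", tok_inputs)
print("latent-loop inputs:", lat_inputs)
```

In the token loop every input is one of the three embedding rows, whereas in the latent loop each input after the first is the previous hidden state itself, so whatever mixture of options that state holds is carried into the next step.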

When to use

  • The task needs multi-step reasoning with backtracking or search.
  • Token-by-token commitment is hurting reasoning quality.
  • Training or fine-tuning the model to use latent steps is feasible.
  • A human-readable reasoning trace is not a hard requirement.
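
To judge whether training is feasible, it can help to see what 'mixing latent steps into the reasoning trace' might look like as data preparation. The sketch below is one possible scheme and an assumption, not something the text above specifies: a staged curriculum in which progressively more written steps are swapped for latent placeholders. The "<latent>" marker and the function name mix_latent_steps are hypothetical.

```python
LATENT = "<latent>"  # hypothetical placeholder marking one latent step

def mix_latent_steps(question, steps, answer, n_latent):
    """Replace the first n_latent written steps with latent placeholders.

    The result is a flat training sequence. At each placeholder the model
    would be fed its own previous hidden state instead of a token embedding
    (assumption), while the remaining steps and the answer stay as text.
    """
    n_latent = max(0, min(n_latent, len(steps)))
    return [question] + [LATENT] * n_latent + steps[n_latent:] + [answer]

# Illustrative trace with three written reasoning steps.
steps = ["step 1", "step 2", "step 3"]

# Stage k replaces the first k written steps; stage 0 is plain text reasoning.
curriculum = [
    mix_latent_steps("question", steps, "answer", k)
    for k in range(len(steps) + 1)
]

for stage, example in enumerate(curriculum):
    print(stage, example)
```

Starting from fully written traces and moving toward fully latent ones is a design choice in this sketch; the last stage also shows why the fourth condition matters, since no readable reasoning remains in it.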

