Self-Ask

also known as Decompose-Ask, Sub-Question Prompting

Have the model emit explicit follow-up sub-questions, answer them (optionally via search), then compose the final answer.

Context

A team is using a model on questions whose answer requires chaining several known facts together. For example, 'which of the founder's PhD advisors won a Turing Award?' depends on first knowing who founded the organisation, then who that person's PhD advisors were, then which awards each of those advisors won. The model can answer each individual hop correctly when asked in isolation, but when the question is posed as a single sentence it tends to return the wrong endpoint.

Problem

Knowing each fact and being able to chain those facts together inside a single inference are different skills; this gap between them is the so-called compositionality gap. Without scaffolding, the model collapses the chain into a single step and either invents an answer or returns the wrong endpoint. Plain chain-of-thought helps a little, but the reasoning steps are not framed as questions, so the model cannot offload any of them to a search tool, and a human reader cannot easily inspect where in the chain the model went wrong.

Forces

Sub-question quality bounds the answer quality.
Sub-question slots invite tool integration but add latency.
Excessive decomposition wastes calls.

Example

A QA agent fails on multi-hop questions like 'which of the founder's PhD advisors won a Turing Award?' even though it knows each fact. The team prompts it to emit explicit follow-up sub-questions ('who was the founder's PhD advisor?', 'did that person win a Turing Award?'), answer each via search, then compose. Multi-hop accuracy jumps because the compositionality gap is closed by externalising the steps the model otherwise short-circuits.

Diagram

flowchart TD Q[Top question] --> M[Model] M --> SQ1[Sub-question 1] SQ1 -->|answer directly<br/>or via search| A1[Answer 1] A1 --> SQ2[Sub-question 2] SQ2 -->|answer| A2[Answer 2] A2 --> Comp[Compose final answer]

Solution

Therefore:

Prompt the model to interleave sub-questions and their answers. Each sub-question is either answered by the model directly or by a search tool. The final answer is composed once all sub-questions are answered.

What this pattern forbids. Sub-question slots are the only insertion point for retrieval or tool calls; the agent cannot retrieve except through a sub-question.

The smaller patterns that complete this one —

generalisesReAct★★— Interleave a single thought, a single tool call, and a single observation per step so the agent reasons over fresh evidence.

And the patterns that stand alongside it, or against it —

complementsLeast-to-Most Prompting★— Decompose a hard problem into an ordered list of easier subproblems, then solve them sequentially with each answer feeding the next.
complementsSocratic Questioning Agent★— Drive the agent toward its goal by asking the user a sequence of strategic, open-ended questions that surface the user's own latent knowledge, goal, or context — rather than producing an answer directly.
complementsQuery-Decomposition Agent★★— An agent whose explicit job is to split an incoming user query into smaller independent sub-queries that can be answered sequentially or in parallel, then merge results.

Neighbourhood

Click any neighbour to follow the language. Scroll to zoom, drag to pan.

Used in frameworks

LangChain
first-class26 patternsOrchestration Frameworks★★ mature
create_self_ask_with_search_agent builds an agent that breaks a question into explicit follow-up sub-questions, answers each (via an Intermediate Answer search tool), then compose…

References

Measuring and Narrowing the Compositionality Gap in Language Models
paper

Provenance

Source: patterns/self-ask.md on GitHub · commit 4fa1213 · view history
Added to catalog: 2026-04-30
Last updated: 2026-05-22
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.