Self-Ask
also known as Decompose-Ask, Sub-Question Prompting
Have the model emit explicit follow-up sub-questions, answer them (optionally via search), then compose the final answer.
Context
A team is using a model on questions whose answer requires chaining several known facts together. For example, 'which of the founder's PhD advisors won a Turing Award?' depends on first knowing who founded the organisation, then who that person's PhD advisors were, then which awards each of those advisors won. The model can answer each individual hop correctly when asked in isolation, but when the question is posed as a single sentence it tends to return the wrong endpoint.
Problem
Knowing each fact and being able to chain those facts together inside a single inference are different skills; this gap between them is the so-called compositionality gap. Without scaffolding, the model collapses the chain into a single step and either invents an answer or returns the wrong endpoint. Plain chain-of-thought helps a little, but the reasoning steps are not framed as questions, so the model cannot offload any of them to a search tool, and a human reader cannot easily inspect where in the chain the model went wrong.
Forces
- Sub-question quality bounds the answer quality.
- Sub-question slots invite tool integration but add latency.
- Excessive decomposition wastes calls.
Example
A QA agent fails on multi-hop questions like 'which of the founder's PhD advisors won a Turing Award?' even though it knows each fact. The team prompts it to emit explicit follow-up sub-questions ('who was the founder's PhD advisor?', 'did that person win a Turing Award?'), answer each via search, then compose. Multi-hop accuracy jumps because the compositionality gap is closed by externalising the steps the model otherwise short-circuits.
Diagram
Solution
Therefore:
Prompt the model to interleave sub-questions and their answers. Each sub-question is either answered by the model directly or by a search tool. The final answer is composed once all sub-questions are answered.
What this pattern forbids. Sub-question slots are the only insertion point for retrieval or tool calls; the agent cannot retrieve except through a sub-question.
The smaller patterns that complete this one —
- generalisesReAct★★— Interleave a single thought, a single tool call, and a single observation per step so the agent reasons over fresh evidence.
And the patterns that stand alongside it, or against it —
- complementsLeast-to-Most Prompting★— Decompose a hard problem into an ordered list of easier subproblems, then solve them sequentially with each answer feeding the next.
- complementsSocratic Questioning Agent★— Drive the agent toward its goal by asking the user a sequence of strategic, open-ended questions that surface the user's own latent knowledge, goal, or context — rather than producing an answer directly.
- complementsQuery-Decomposition Agent★★— An agent whose explicit job is to split an incoming user query into smaller independent sub-queries that can be answered sequentially or in parallel, then merge results.
Neighbourhood
Click any neighbour to follow the language. Scroll to zoom, drag to pan.