Full-Code · Orchestration Frameworks · active

DSPy

Type: full-code · Vendor: Stanford NLP · Language: Python · License: MIT · Status: active · Status in practice: mature · First released: 2023-01-09


Replace hand-tuned prompts with a declarative Python programming model in which you specify input/output behaviour as Signatures, compose Modules (Predict, ChainOfThought, ReAct, ProgramOfThought), and let Optimizers (BootstrapFewShot, MIPROv2, BootstrapFinetune) tune prompts and weights against a metric.

Description. DSPy is the MIT-licensed framework for 'programming—rather than prompting—language models'. DSPy stands for Declarative Self-improving Python. It exposes three layers: Signatures declare what an LM should do (e.g. 'question -> answer'); Modules implement reasoning shapes around those signatures (dspy.Predict, dspy.ChainOfThought, dspy.ReAct, dspy.ProgramOfThought); Optimizers consume a metric and training data to 'tune the prompts and weights of your AI modules' — synthesising few-shot demos, proposing better natural-language instructions, or fine-tuning small LMs. Most often used for classifiers, RAG pipelines, and agent loops where empirically tuned prompts beat hand-crafted strings.
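
The relationship between a signature and a module can be illustrated without any LM at all. The following is a minimal pure-Python sketch, not DSPy's implementation: `ToySignature`, `parse_signature`, `with_reasoning`, and `render_prompt` are hypothetical names invented for this illustration. It only shows the idea that a string such as 'question -> answer' names input and output fields, and that a chain-of-thought shape asks for a reasoning field before the declared outputs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToySignature:
    """Hypothetical stand-in for a signature: named inputs and outputs."""
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def parse_signature(spec: str) -> ToySignature:
    """Parse a string such as 'question -> answer' into field names."""
    left, sep, right = spec.partition("->")
    if not sep:
        raise ValueError(f"expected 'inputs -> outputs', got {spec!r}")
    inputs = tuple(name.strip() for name in left.split(",") if name.strip())
    outputs = tuple(name.strip() for name in right.split(",") if name.strip())
    if not inputs or not outputs:
        raise ValueError("a signature needs at least one input and one output")
    return ToySignature(inputs, outputs)


def with_reasoning(sig: ToySignature) -> ToySignature:
    """Chain-of-thought shape: request a reasoning field before the outputs."""
    return ToySignature(sig.inputs, ("reasoning",) + sig.outputs)


def render_prompt(sig: ToySignature, **values: str) -> str:
    """Render input values followed by empty output slots for the LM to fill."""
    missing = [name for name in sig.inputs if name not in values]
    if missing:
        raise KeyError(f"missing inputs: {missing}")
    lines = [f"{name}: {values[name]}" for name in sig.inputs]
    lines += [f"{name}:" for name in sig.outputs]
    return "\n".join(lines)


qa = parse_signature("question -> answer")
cot = with_reasoning(qa)
prompt = render_prompt(cot, question="What is 2 + 2?")
print(prompt)
```

In DSPy itself the author never writes the prompt template; the chosen module and, later, the optimizer decide how the fields are presented to the LM.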

Agent loop shape. DSPy does not fix a single loop; a program runs whatever loop its module declares. A dspy.ReAct agent runs the classic thought→action→observation cycle against a signature plus a tools list; dspy.ProgramOfThought emits and executes code; dspy.ChainOfThought adds a reasoning step before the signature output. At authoring time you write only the signature and pick the module. At compile time, an Optimizer (e.g. MIPROv2) iterates with a teacher LM, generates instructions and few-shot examples per step, and uses Bayesian optimization to search the space of instructions and demonstrations.
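
The thought→action→observation cycle can be sketched as a plain loop. This is a toy model under stated assumptions, not the code of dspy.ReAct: the LM is replaced by a scripted function (`scripted_lm`), the step format `(thought, action, argument)` and the `finish` action name are conventions invented for this sketch, and `add` is a hypothetical example tool.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical step format: (thought, action_name, action_argument).
Step = Tuple[str, str, str]
History = List[Tuple[Step, str]]  # each entry pairs a step with its observation


def scripted_lm(question: str, history: History) -> Step:
    """Stand-in for an LM call: picks the next step from the history so far."""
    if not history:
        return ("I need to compute the sum.", "add", "2,3")
    last_observation = history[-1][1]
    return ("I have the result.", "finish", last_observation)


def add(argument: str) -> str:
    """Example tool: add comma-separated integers and return the sum as text."""
    return str(sum(int(part) for part in argument.split(",")))


def react(
    question: str,
    lm: Callable[[str, History], Step],
    tools: Dict[str, Callable[[str], str]],
    max_iters: int = 5,
) -> Tuple[str, History]:
    """Run thought -> action -> observation until the LM chooses 'finish'."""
    history: History = []
    for _ in range(max_iters):
        thought, action, argument = lm(question, history)
        if action == "finish":
            return argument, history
        if action not in tools:
            observation = f"unknown tool: {action}"
        else:
            observation = tools[action](argument)
        # The observation is fed back to the LM on the next iteration.
        history.append(((thought, action, argument), observation))
    raise RuntimeError("no answer within max_iters")


answer, trace = react("What is 2 + 3?", scripted_lm, {"add": add})
print(answer, trace)
```

The iteration cap matters in practice: without it, an agent that never chooses to finish would loop indefinitely.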

Primary use cases

declarative programming of LM behaviour via Signatures
compiler-driven prompt + weight optimisation against a metric
ReAct-style agents whose prompts are co-tuned with the loop
RAG pipelines whose intermediate prompts get optimised
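
Optimising a prompt against a metric can be pictured as scoring candidate instructions on a small labelled set and keeping the best one. The sketch below is deliberately simplified: it uses an exhaustive search over two hand-written instructions rather than the Bayesian optimization attributed to MIPROv2 above, and `fake_classifier` is a hypothetical rule-based stand-in for an LM call. All names and data are invented for illustration.

```python
from typing import Callable, List, Tuple

Devset = List[Tuple[str, str]]  # (text, expected label)
Program = Callable[[str, str], str]  # (instruction, text) -> predicted label


def fake_classifier(instruction: str, text: str) -> str:
    """Stand-in for an LM call whose output depends on the instruction."""
    positive = "good" in text
    # Toy behaviour: negation is only handled when the instruction asks for it.
    if "negation" in instruction and text.startswith("not "):
        positive = not positive
    return "positive" if positive else "negative"


def accuracy(program: Program, instruction: str, devset: Devset) -> float:
    """Metric: fraction of examples whose prediction equals the label."""
    hits = sum(program(instruction, text) == label for text, label in devset)
    return hits / len(devset)


def pick_instruction(
    candidates: List[str], program: Program, devset: Devset
) -> Tuple[str, float]:
    """Score every candidate instruction and return the best with its score."""
    scored = [(accuracy(program, c, devset), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


devset = [
    ("good film", "positive"),
    ("not good", "negative"),
    ("bad plot", "negative"),
    ("not bad", "positive"),
]
candidates = [
    "Classify the sentiment.",
    "Classify the sentiment, paying attention to negation.",
]
best, best_score = pick_instruction(candidates, fake_classifier, devset)
print(best, best_score)
```

The point of the sketch is the division of labour: the author supplies data and a metric, and the search procedure, however sophisticated, chooses the wording.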

flowchart TD
    USER[Task definition] --> SIG[Signature<br/>question -> answer]
    SIG --> MOD{Module choice}
    MOD -->|reasoning| COT[dspy.ChainOfThought]
    MOD -->|tools| REACT[dspy.ReAct + tools]
    MOD -->|code| POT[dspy.ProgramOfThought]
    MOD -->|basic| PRED[dspy.Predict]
    COT --> LM[(LM call)]
    REACT -->|action| TOOLS[Tool functions]
    TOOLS -->|observation| REACT
    REACT --> LM
    POT --> EXEC[Code executor]
    EXEC --> LM
    PRED --> LM
    LM --> OUT[Typed output]
    TRAIN[(Trainset + metric)] --> OPT[Optimizer<br/>BootstrapFewShot / MIPROv2 / BootstrapFinetune]
    OPT -.compiles.-> COT
    OPT -.compiles.-> REACT
    OPT -.compiles.-> PRED
    OPT -->|teacher LM| TEACHER[Teacher module<br/>generates demos]

Key concepts

Signature (docs) — Declarative I/O spec ('question -> answer').
Module → react (docs) — Predict / ChainOfThought / ReAct / ProgramOfThought wrappers around a signature.
Optimizer → evaluator-optimizer (docs) — BootstrapFewShot / MIPROv2 / BootstrapFinetune tune prompts and weights against a metric.
Compiler / teacher LM — Optimizer uses a teacher LM (often the program itself) to bootstrap demos.
dspy.ReAct (docs) — Tool-using agent module over a signature and a tools list.
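
The idea of bootstrapping demos with a teacher can be sketched in a few lines. This is a toy illustration, not the implementation of BootstrapFewShot: `teacher` is a hypothetical canned lookup standing in for an LM-backed program, and `exact_match`, `bootstrap_demos`, and the training data are invented for the example. The teacher is run over the training set, and only the examples whose output passes the metric are kept as few-shot demonstrations.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def exact_match(example: Example, prediction: str) -> bool:
    """Metric: case-insensitive equality with the labelled answer."""
    return prediction.strip().lower() == example["answer"].strip().lower()


def teacher(question: str) -> str:
    """Stand-in for a teacher program; a real one would call an LM."""
    canned = {
        "capital of France?": "Paris",
        "2 + 2?": "5",  # deliberately wrong, so the metric rejects it
        "colour of the sky?": "blue",
    }
    return canned.get(question, "")


def bootstrap_demos(
    trainset: List[Example],
    teacher: Callable[[str], str],
    metric: Callable[[Example, str], bool],
    max_demos: int = 2,
) -> List[Example]:
    """Keep teacher outputs that pass the metric, up to max_demos."""
    demos: List[Example] = []
    for example in trainset:
        if len(demos) >= max_demos:
            break
        prediction = teacher(example["question"])
        if metric(example, prediction):
            demos.append({"question": example["question"], "answer": prediction})
    return demos


trainset = [
    {"question": "capital of France?", "answer": "Paris"},
    {"question": "2 + 2?", "answer": "4"},
    {"question": "colour of the sky?", "answer": "Blue"},
]
demos = bootstrap_demos(trainset, teacher, exact_match)
print(demos)
```

Because the metric acts as a filter, the teacher's mistakes never reach the student's prompt, which is why the teacher can be the unoptimised program itself.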
