Instructor

Type: full-code · Vendor: Jason Liu / community · Language: Python, TypeScript, Go, Ruby, Elixir, Rust · License: MIT · Status: active · Status in practice: mature · First released: 2023-07-28

Links: homepage docs repo

Get reliable, type-safe structured data from any LLM by patching the provider client to accept a Pydantic response_model, validate the response, and retry with validation feedback when the model violates the schema.

Description. Instructor is the MIT-licensed library that turns Pydantic models into the contract between application code and an LLM. It patches each supported provider's chat-completion client to add response_model, max_retries, and context parameters, then wraps the call: when validation fails, the library reasks the model with the validation error attached so the next attempt converges on a valid object. Modes (TOOLS, JSON, MD_JSON, FUNCTIONS) select the underlying provider mechanism; from_provider exposes a unified entry point across 15+ providers (OpenAI, Anthropic, Google, Mistral, Cohere, Ollama, DeepSeek, Groq, and more). Instructor is positioned as an extraction library, not an agent framework: 'Instructor for extraction, PydanticAI for agents.'

Agent loop shape. Single-call extraction with internal retry, not an agent loop. The patched create() runs the request, parses the response into the response_model, validates it, and on ValidationError reasks the model up to max_retries times. There is no multi-step tool loop, no handoff, no session memory; agentic behaviour requires composing Instructor inside a higher-level framework.

Primary use cases

extracting typed objects from LLM responses
schema-validated tool-call arguments before execution
multi-provider structured-output code that does not change when swapping models
self-correcting JSON output via validation-feedback retries

flowchart TD APP[Application] --> CLIENT[Patched provider client<br/>via patch() or from_provider()] CLIENT --> CREATE[create(response_model=Model, max_retries=N)] CREATE -->|TOOLS / JSON / MD_JSON| LLM[Provider LLM] LLM --> RESP[Raw response] RESP --> VAL{Pydantic validate} VAL -- ok --> OUT[Typed Pydantic instance] VAL -- ValidationError --> REASK[Reask with error feedback] REASK --> CREATE

Key concepts

Patching (docs) — Adds response_model, max_retries, and context to the provider's create() method without changing original client code.
response_model → structured-output (docs) — Pydantic BaseModel that defines the expected output shape; docstrings and field annotations feed the prompt.
Modes (TOOLS / JSON / MD_JSON / FUNCTIONS) (docs) — Selects which provider-side mechanism Instructor uses to coerce the model into the schema. OpenAI / Anthropic / Google default to TOOLS.
max_retries with validation feedback → self-refine (docs) — On Pydantic ValidationError Instructor retries with the validation error fed back to the model.
from_provider (docs) — Unified entry point across all supported providers; recommended over manual patching.

Instructor

Neighbourhood

Alternatives & relatives

Listed as alternative by (4)

References

Provenance