Guardrails AI
Type: full-code · Vendor: Guardrails AI · Language: Python · License: Apache-2.0 · Status: active · Status in practice: mature · First released: 2023
Wrap LLM calls with composable input and output guards built from validators that detect, quantify, and mitigate specific risks.
Description. Guardrails AI is a Python framework that runs validators around LLM calls. Validators compose into Input and Output Guards that intercept the inputs and outputs of LLMs. The guards detect, quantify, and mitigate risks such as policy violations, hallucinations, and data leakage before output reaches users. Validators are distributed through the Guardrails Hub, and the framework is released under the Apache 2.0 license.
Agent loop shape. Inputs pass through an Input Guard composed of validators before reaching the model, and the model's output passes through an Output Guard before reaching the user; each guard detects, quantifies, and mitigates the configured risks, blocking or correcting violations.
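The loop described above can be sketched in plain Python. This is a toy model and not the Guardrails API: `run_guarded`, `run_checks`, `fake_llm`, `GuardError`, and both check functions are hypothetical names chosen for illustration, and it only models blocking, not correcting.

```python
from typing import Callable, List, Optional

# A check returns an error message, or None when the text passes.
Check = Callable[[str], Optional[str]]


class GuardError(Exception):
    """Raised when a guard blocks a prompt or a response."""


def run_checks(text: str, checks: List[Check], stage: str) -> None:
    # Run every check in order and block on the first failure.
    for check in checks:
        error = check(text)
        if error is not None:
            raise GuardError(f"{stage} guard blocked: {error}")


def max_length(limit: int) -> Check:
    def check(text: str) -> Optional[str]:
        return None if len(text) <= limit else f"longer than {limit} characters"
    return check


def no_secret(text: str) -> Optional[str]:
    return "contains the word 'secret'" if "secret" in text.lower() else None


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return f"Echo: {prompt}"


def run_guarded(prompt: str, llm: Callable[[str], str],
                input_checks: List[Check], output_checks: List[Check]) -> str:
    run_checks(prompt, input_checks, "input")      # Input Guard
    response = llm(prompt)
    run_checks(response, output_checks, "output")  # Output Guard
    return response
```

Calling `run_guarded("hello", fake_llm, [max_length(20)], [no_secret])` returns the model's reply, while an over-long prompt or a reply containing the flagged word raises `GuardError` at the corresponding stage.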
Primary use cases
- validating LLM inputs and outputs against risk checks
- composing validators into reusable guards
- generating structured data from LLMs
Key concepts
- Validator → input-output-guardrails (docs) — A unit that checks one specific property of an input or output (a choice, a length, the absence of PII) and returns a pass or a FailResult that triggers a configured on-fail action.
- Guard → input-output-guardrails (docs) — A composition of validators that wraps an LLM call as an Input Guard (on the prompt) or an Output Guard (on the response), running its validators in sequence.
- On-fail action → evaluator-optimizer (docs) — The per-validator policy chosen when validation fails: reask (re-prompt the model), fix (substitute a corrected value), filter, refrain, exception, or noop.
- Guardrails Hub (docs) — A registry from which pre-built validators are installed and composed into guards, so risk checks are shared rather than re-implemented per project.
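How a validator result and its on-fail action interact can be illustrated with a small, self-contained sketch. The classes below (`Outcome`, `ToyValidator`) and the helper `run_guard` are hypothetical stand-ins, not the library's own types, and only three of the actions listed above (fix, exception, noop) are modelled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Outcome:
    """Toy stand-in for a validator's pass or fail result."""
    passed: bool
    error: str = ""
    fix_value: Optional[str] = None  # suggested correction, if the validator has one


@dataclass
class ToyValidator:
    check: Callable[[str], Outcome]
    on_fail: str = "exception"  # one of: "fix", "exception", "noop"


def valid_length(max_len: int) -> Callable[[str], Outcome]:
    def check(value: str) -> Outcome:
        if len(value) <= max_len:
            return Outcome(passed=True)
        # Offer a truncated value as the programmatic fix.
        return Outcome(False, f"length {len(value)} exceeds {max_len}", value[:max_len])
    return check


def valid_choices(choices: List[str]) -> Callable[[str], Outcome]:
    def check(value: str) -> Outcome:
        if value in choices:
            return Outcome(passed=True)
        return Outcome(False, f"{value!r} is not one of {choices}")
    return check


def run_guard(value: str, validators: List[ToyValidator]) -> str:
    """Run validators in sequence, applying each one's on-fail policy."""
    for validator in validators:
        outcome = validator.check(value)
        if outcome.passed:
            continue
        if validator.on_fail == "fix" and outcome.fix_value is not None:
            value = outcome.fix_value  # substitute the corrected value
        elif validator.on_fail == "noop":
            pass                       # keep the value unchanged
        else:
            # "exception", or "fix" with no correction available
            raise ValueError(outcome.error)
    return value
```

In this sketch the policy is attached to each validator rather than to the guard as a whole, which matches the per-validator on-fail choice described above.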
Patterns this full-code framework implements
- ★★Input/Output Guardrails
Guardrails runs Input and Output Guards that detect, quantify, and mitigate risks such as hallucinations and data leakage, blocking or correcting outputs that fail validation before they reach users.
- ★Context Minimization
Validators such as ValidChoices (allow-listed enums) and ValidLength (max lengths) compose into Input Guards that intercept and constrain untrusted LLM inputs to a strictly formatted, typed interface.
- ★★Structured Output
A Guard can be created from a Pydantic model so that the LLM is prompted to produce output matching that class, yielding validated structured data rather than free-form text.
- ★★Evaluator-Optimizer
With the reask on-fail action, a failed validation re-prompts the LLM with the validation error message, looping generation against the validator's correctness criteria until the output passes or the reask limit is reached.
- ★★PII Redaction
The DetectPII validator uses Microsoft Presidio to detect personally identifiable information and, on failure, applies a programmatic fix that anonymizes the offending text before it is returned.
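The reask loop behind the Evaluator-Optimizer pattern, combined with a structured-output check, can be sketched without the library. Everything here is an assumption made for illustration: `validate_ticket`, `generate_with_reask`, `make_scripted_llm`, the ticket schema, and the wording of the reask prompt are hypothetical, and the scripted model simply replays canned replies.

```python
import json
from typing import Callable, Dict, List, Optional, Tuple


def validate_ticket(raw: str) -> Tuple[Optional[Dict], str]:
    """Return (parsed, "") on success or (None, error message) on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"output is not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "output must be a JSON object"
    if data.get("priority") not in ("low", "high"):
        return None, "field 'priority' must be 'low' or 'high'"
    if not isinstance(data.get("summary"), str):
        return None, "field 'summary' must be a string"
    return data, ""


def generate_with_reask(prompt: str, llm: Callable[[str], str],
                        max_reasks: int = 2) -> Dict:
    current_prompt = prompt
    error = ""
    for _ in range(max_reasks + 1):  # first attempt plus up to max_reasks retries
        raw = llm(current_prompt)
        data, error = validate_ticket(raw)
        if data is not None:
            return data
        # Reask: feed the validation error back to the model.
        current_prompt = (
            f"{prompt}\n\nYour previous answer was rejected: {error}. Try again."
        )
    raise ValueError(f"no valid output after {max_reasks} reasks: {error}")


def make_scripted_llm(replies: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model: returns canned replies in order."""
    remaining = iter(replies)
    seen_prompts: List[str] = []

    def llm(prompt: str) -> str:
        seen_prompts.append(prompt)
        return next(remaining)

    llm.seen_prompts = seen_prompts  # exposed so callers can inspect the reask prompts
    return llm
```

Bounding the loop with `max_reasks` is a design choice in this sketch: without a cap, a model that never satisfies the validator would be re-prompted indefinitely.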