PII Redaction
also known as Data Loss Prevention, Sensitive Data Filtering
Detect and remove personally identifiable information from inputs to and outputs from the model.
This pattern helps complete certain larger patterns —
- specialises Input/Output Guardrails ★★: Validate inputs before they reach the model and outputs before they reach the user.
- used-by Agent Middleware Chain ★: Wrap every model call, tool call, and memory access in a composable pre/execute/post interceptor pipeline so cross-cutting concerns attach without touching agent or orchestrator code.
Context
A team runs an agent in a regulated environment — healthcare, finance, public sector — where legal frameworks (the EU General Data Protection Regulation, the US Health Insurance Portability and Accountability Act, sectoral data-protection rules) restrict what personally identifying information the system is allowed to see, store, log, or pass on to a third party. The agent's inputs and outputs flow through prompt logs, trace stores, evaluation harnesses, and, for hosted models, the provider's infrastructure.
Problem
Large language models echo what they see in context: any personally identifying information that enters the prompt can end up in the model's response, in the application's trace log, in the eval harness export, and in the third-party provider's request records. Once a customer's name, date of birth, or social-security number has crossed those boundaries, containment is essentially impossible after the fact. Without detection and redaction at the boundary where data enters the model, the operator cannot honestly claim that personal data is protected.
Forces
- Detection precision vs recall.
- Reversible vs irreversible redaction.
- Token-level vs entity-level redaction.
Example
A security auditor reviewing a health-tech company's support-agent logs finds patient names and dates of birth in plaintext across hundreds of transcripts. Worse, the model has occasionally echoed a social-security number back into a response. The team installs pii-redaction: an input pipeline detects PII via regex plus NER and substitutes placeholders before anything reaches the model; an output pipeline restores the original values only when explicitly required and refuses any response that contains unrequested PII. Every redaction is logged for audit. The next audit finds zero plaintext PII.
Solution
Therefore:
Pre-process inputs: detect PII (regex + NER + classifier) and replace each match with a placeholder. Post-process outputs: substitute the original values back for placeholders where that is authorised, or refuse if the output contains unrequested PII. Record every redaction in an audit log.
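The input and output steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production detector: the `Redactor` class, the `DETECTORS` table, and the `<KIND_n>` placeholder format are all hypothetical names, and the sketch covers only the regex stage (the NER and classifier stages are omitted). The two regexes match only the common shape of an email address and a US social-security number.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical detector table: illustrative regexes only. A real deployment
# would add NER and a classifier, which this sketch leaves out.
DETECTORS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class Redactor:
    # placeholder -> original value; must stay inside the trusted boundary
    vault: Dict[str, str] = field(default_factory=dict)
    # one record per new redaction; deliberately holds no raw PII
    audit: List[dict] = field(default_factory=list)

    def redact(self, text: str) -> str:
        """Replace every detected value with a placeholder before the model sees it."""
        for kind, pattern in DETECTORS.items():
            text = pattern.sub(
                lambda m, k=kind: self._placeholder(k, m.group(0)), text
            )
        return text

    def _placeholder(self, kind: str, value: str) -> str:
        # Reuse the placeholder if this exact value was already redacted.
        for placeholder, original in self.vault.items():
            if original == value:
                return placeholder
        placeholder = f"<{kind}_{len(self.vault) + 1}>"
        self.vault[placeholder] = value
        self.audit.append({"kind": kind, "placeholder": placeholder})
        return placeholder

    def restore(self, text: str) -> str:
        """Substitute original values back into an output, when authorised."""
        for placeholder, original in self.vault.items():
            text = text.replace(placeholder, original)
        return text
```

Calling `redact` on a prompt yields text that is safe to send and log, while `restore` is applied only on the path back to an authorised user. Keeping raw values out of the audit records is a design choice in this sketch, so that the audit log does not itself become a store of PII.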
What this pattern forbids. PII categories listed in the policy must not appear in model inputs or outputs without explicit authorisation.
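One way to enforce this rule on the output side is a guard that scans each response for policy-listed categories and refuses unless the category was explicitly authorised for that request. The sketch below is an assumption-laden illustration: `POLICY_PATTERNS`, `check_output`, and `UnrequestedPIIError` are hypothetical names, and the regexes stand in for whatever detectors the real policy would use.

```python
import re
from typing import FrozenSet

# Hypothetical policy: category name -> detector. Illustrative regexes only.
POLICY_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


class UnrequestedPIIError(Exception):
    """Raised when an output contains a PII category that was not authorised."""


def check_output(text: str, authorised: FrozenSet[str] = frozenset()) -> str:
    """Return text unchanged, or raise if it contains unauthorised PII."""
    for category, pattern in POLICY_PATTERNS.items():
        if category not in authorised and pattern.search(text):
            # Report only the category, never the matched value.
            raise UnrequestedPIIError(category)
    return text
```

A caller would wrap the model response in `check_output(response, authorised=frozenset({"EMAIL"}))` when, for example, the user has asked for their own email address, and treat `UnrequestedPIIError` as a refusal.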
And the patterns that stand alongside it, or against it —
- complements Session Isolation ★★: Keep one user's session state and memory unreachable from another user's agent.
- complements Secrets Handling ★: Ensure the model never receives secrets in plaintext; tools resolve credentials from references at runtime.
- complements Open-Weight Cascade ★: Build a multi-model cascade where lower tiers are open-weight, self-hostable models that run inside the operator's boundary, and only escalations cross to a hosted frontier model, giving cost arbitrage *and* sovereignty.