TypeScript AI Agent Output Validation: 6 Patterns with Code Templates
Every AI agent output is untrusted input to the rest of your system. Treat it the same way you treat user-supplied HTTP request bodies: parse, validate, sanitize, then use.
Part of the TypeScript AI Agent Security series. This is the output validation deep-dive. Start with the complete TypeScript AI agent security playbook, or see the companion guides on authorization patterns and logging and audit trails.
These six patterns are standalone: each is a function or class you drop in and wire to your existing agent loop.
| Pattern | Problem it solves | When to use | Overhead |
|---|---|---|---|
| 1. Zod schema enforcement with retry | LLM returns malformed JSON or wrong field types, causing downstream crashes | Any agent where the output feeds into typed application logic | Low, single parse + optional 1 retry |
| 2. PII redaction pipeline | LLM echoes personal data from context into outputs that get stored or sent externally | Agents processing customer data, support workflows, healthcare | Medium, regex + NER scan per output |
| 3. Content policy filter | LLM produces harmful, off-topic, or policy-violating content in user-facing contexts | Customer-facing agents; children's products; regulated content | Low, classifier call per output |
| 4. JSON repair and fallback | LLM produces nearly valid JSON that strict parsers reject (markdown fences, trailing commas, appended commentary) | Any agent that cannot always retry cleanly; cost-sensitive pipelines | Low, repair before parse |
| 5. Confidence threshold guardrail | LLM output is uncertain or hedged; downstream system needs high-confidence inputs | Medical, financial, or legal decision support; automated actions | Medium, requires confidence scoring |
| 6. Token cost circuit breaker | Agent pipelines loop or spike token usage unexpectedly; invoice surprises | Any agentic loop; pipelines with retry logic | Low, counter per session |
TL;DR: Six copy-paste TypeScript patterns for AI agent output validation: (1) Zod schema enforcement on structured outputs with automatic retry, (2) PII detection and redaction before outputs reach users or logs, (3) content policy filter for harmful or off-topic outputs, (4) JSON repair and fallback for malformed structured outputs, (5) confidence threshold guardrail that escalates low-certainty outputs to human review, (6) token cost circuit breaker that stops runaway generation. Each pattern is standalone: drop it in and wire it to your agent framework.
Pattern 1: Zod Schema Enforcement with Retry
Parse every structured output against its expected schema. On failure, send a correction prompt and retry once before falling back.
import { z } from "zod";
import OpenAI from "openai";
const client = new OpenAI();
async function callWithSchema<T>(
messages: OpenAI.Chat.ChatCompletionMessageParam[],
schema: z.ZodType<T>,
maxRetries = 1
): Promise<T> {
for (let attempt = 0; attempt <= maxRetries; attempt++) {
const response = await client.chat.completions.create({
model: "gpt-4o",
messages,
response_format: { type: "json_object" },
});
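const raw = response.choices[0].message.content ?? "";
// JSON.parse throws on malformed text; catch it so a syntax error also triggers the retry
let errorMessage = "unknown validation error";
try {
const parsed = schema.safeParse(JSON.parse(raw));
if (parsed.success) return parsed.data;
errorMessage = parsed.error.message;
} catch (err) {
errorMessage = `invalid JSON (${err instanceof Error ? err.message : String(err)})`;
}
if (attempt < maxRetries) {
// Feed the error back as a correction prompt
messages = [
...messages,
{ role: "assistant", content: raw },
{
role: "user",
content: `Your output failed validation: ${errorMessage}. Return a valid JSON object matching the schema.`,
},
];
}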
}
throw new Error("Output validation failed after retries");
}
// Usage
const AnalysisResult = z.object({
risk_level: z.enum(["low", "medium", "high"]),
summary: z.string().max(500),
action_required: z.boolean(),
});
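const result = await callWithSchema(
[
// JSON mode expects the word "JSON" in the messages, and the model needs the field names to match the schema
{
role: "system",
content:
'Respond with a JSON object with keys "risk_level" ("low", "medium", or "high"), "summary" (string, at most 500 characters), and "action_required" (boolean).',
},
{ role: "user", content: "Analyze this contract for risk..." },
],
AnalysisResult
);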
// result is typed as { risk_level: "low"|"medium"|"high", summary: string, action_required: boolean }
Why one retry: a second retry rarely helps. If the model has failed twice on the same schema, the problem is usually the schema description in the prompt, not the model.
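The retry logic can be exercised without any API call by injecting the model as a function. The following is a minimal sketch under that assumption: `validateWithRetry`, `Generate`, `validateRisk`, and `stubModel` are hypothetical names introduced here, and the hand-written validator stands in for `schema.safeParse` so the block has no dependencies.

```typescript
// Result of validating one raw model output.
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical stand-in for the model call: receives the correction hints
// collected so far and returns the raw text of the next attempt.
type Generate = (corrections: string[]) => Promise<string>;

async function validateWithRetry<T>(
  generate: Generate,
  validate: (raw: string) => Validation<T>,
  maxRetries = 1
): Promise<{ value: T; attempts: number }> {
  const corrections: string[] = [];
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = await generate(corrections);
    const result = validate(raw);
    if (result.ok) return { value: result.value, attempts: attempt + 1 };
    corrections.push(result.error); // fed back on the next attempt
  }
  throw new Error(`Output validation failed after ${maxRetries + 1} attempts`);
}

// Hand-written validator standing in for schema.safeParse. A JSON syntax
// error is reported as a validation failure instead of being thrown.
function validateRisk(raw: string): Validation<{ risk_level: string }> {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  const level = (data as { risk_level?: unknown } | null)?.risk_level;
  if (level === "low" || level === "medium" || level === "high") {
    return { ok: true, value: { risk_level: level } };
  }
  return { ok: false, error: "risk_level must be low, medium, or high" };
}

// Stub model: malformed on the first call, valid once it has seen a correction.
const stubModel: Generate = async (corrections) =>
  corrections.length === 0 ? "{risk_level: high" : '{"risk_level":"high"}';
```

With the stub above, one retry is enough to recover, while `maxRetries = 0` rejects. Keeping the model call behind a function type is a design choice that lets the same loop run in unit tests and in production.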
Pattern 2: PII Redaction Pipeline
Strip sensitive data from outputs before they reach users, logs, or downstream systems.
type RedactionRule = { pattern: RegExp; label: string };
const DEFAULT_RULES: RedactionRule[] = [
{ pattern: /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/gi, label: "EMAIL" },
{ pattern: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g, label: "PHONE" },
{ pattern: /\b\d{3}-\d{2}-\d{4}\b/g, label: "SSN" },
{
pattern: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g,
label: "CARD",
},
{
pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
label: "IP",
},
];
function redactPII(
text: string,
rules: RedactionRule[] = DEFAULT_RULES
): { redacted: string; hits: string[] } {
const hits: string[] = [];
let redacted = text;
for (const rule of rules) {
redacted = redacted.replace(rule.pattern, (match) => {
// Record only the label and length; echoing part of the match would put PII back into the logs
hits.push(`${rule.label} (${match.length} chars)`);
return `[REDACTED-${rule.label}]`;
});
}
return { redacted, hits };
}
// Usage: always redact before logging or returning to the client
const raw = await getAgentOutput();
const { redacted, hits } = redactPII(raw);
if (hits.length > 0) {
console.warn("PII redacted from agent output:", hits);
}
return redacted; // safe to surface
Note: Regex catches structured PII (emails, SSNs, card numbers). Named entities (person names, addresses) require a NER model; regex alone is not sufficient for GDPR Article 17 compliance.
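The 16-digit card rule above matches any run of sixteen digits, including order numbers and tracking IDs. One common way to cut those false positives is to redact only matches that also pass the Luhn checksum. The sketch below assumes that trade-off is acceptable for your data; `luhnValid` and `redactCards` are hypothetical helpers, not part of the pipeline above.

```typescript
// Luhn checksum: true for digit strings that pass the card-number check.
function luhnValid(candidate: string): boolean {
  const digits = candidate.replace(/\D/g, "");
  if (digits.length < 13 || digits.length > 19) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    // Walk right to left and double every second digit.
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

// Same shape as the CARD rule in DEFAULT_RULES.
const CARD_PATTERN = /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g;

// Redact only the 16-digit runs that also pass the Luhn check.
function redactCards(text: string): string {
  return text.replace(CARD_PATTERN, (match) =>
    luhnValid(match) ? "[REDACTED-CARD]" : match
  );
}
```

The cost of this filter is that a mistyped card number fails the checksum and is left in place, so stricter deployments may prefer to keep redacting every match.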
Pattern 3: Content Policy Filter
Block harmful, off-topic, or policy-violating outputs before they reach users. Layered: fast keyword check first, moderation API second.
const BLOCKLIST = [
/\b(how to (make|build|create) (bomb|weapon|malware))\b/i,
/\b(suicide|self.harm) (method|instruction|guide)\b/i,
];
type PolicyVerdict = "pass" | "block" | "review";
async function checkContentPolicy(text: string): Promise<{
verdict: PolicyVerdict;
reason?: string;
}> {
// Layer 1: fast blocklist (< 1ms)
for (const pattern of BLOCKLIST) {
if (pattern.test(text)) {
return { verdict: "block", reason: "blocklist_match" };
}
}
// Layer 2: moderation API (only if layer 1 passes)
// Reuses the OpenAI client instance created in Pattern 1
const moderation = await client.moderations.create({ input: text });
const result = moderation.results[0];
if (result.flagged) {
const categories = Object.entries(result.categories)
.filter(([, flagged]) => flagged)
.map(([cat]) => cat);
return { verdict: "block", reason: categories.join(",") };
}
// Layer 3: not flagged, but a category score is elevated → human review queue
const maxScore = Math.max(...Object.values(result.category_scores));
if (maxScore > 0.5) {
return { verdict: "review", reason: `high_score:${maxScore.toFixed(2)}` };
}
return { verdict: "pass" };
}
// Usage
const output = await agent.run(userMessage);
const policy = await checkContentPolicy(output);
if (policy.verdict === "block") {
auditLog.write({ event: "output_blocked", reason: policy.reason });
return SAFE_FALLBACK_MESSAGE;
}
if (policy.verdict === "review") {
await humanReviewQueue.push({ output, reason: policy.reason, userId });
return "Your request is being reviewed. We'll follow up shortly.";
}
return output;
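The mapping from a moderation result to a verdict is pure logic, so it can be pulled out and tested without a network call. This is a minimal sketch: `ModerationLike` is a local type assumed for illustration (it mirrors only the two fields used above), and `verdictFromModeration` is a hypothetical helper.

```typescript
type PolicyVerdict = "pass" | "block" | "review";

// Minimal shape of one moderation result. The field names mirror the ones
// used above, but this local type is an assumption made for the sketch.
interface ModerationLike {
  flagged: boolean;
  category_scores: Record<string, number>;
}

function verdictFromModeration(
  result: ModerationLike,
  reviewThreshold = 0.5
): { verdict: PolicyVerdict; reason?: string } {
  if (result.flagged) return { verdict: "block", reason: "flagged" };
  const scores = Object.values(result.category_scores);
  // Math.max() with no arguments returns -Infinity, so guard the empty case.
  const maxScore = scores.length > 0 ? Math.max(...scores) : 0;
  if (maxScore > reviewThreshold) {
    return { verdict: "review", reason: `high_score:${maxScore.toFixed(2)}` };
  }
  return { verdict: "pass" };
}
```

Exposing `reviewThreshold` as a parameter makes it possible to tune how much traffic lands in the review queue without touching the blocklist or the API call.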
Pattern 4: JSON Repair and Fallback
LLMs frequently return near-valid JSON wrapped in markdown fences, with trailing commas, or with commentary appended. Repair before throwing.
function extractAndRepairJSON(raw: string): unknown {
// Strip markdown code fences
let text = raw.replace(/^```(?:json)?\n?/m, "").replace(/\n?```$/m, "").trim();
// Try direct parse first
try {
return JSON.parse(text);
} catch {
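// Strip single-line comments first, so a trailing comma followed by a comment is still caught.
// Only strip "//" at line start or after whitespace, which leaves URLs such as "https://..." intact.
// This is a heuristic: a string value containing " // " would still be truncated.
text = text.replace(/(^|\s)\/\/[^\n]*/gm, "$1");
// Remove trailing commas before } or ]
text = text.replace(/,(\s*[}\]])/g, "$1");
try {
return JSON.parse(text);
} catch {
// Find the outermost JSON object or array, trying whichever starts first
const objMatch = text.match(/\{[\s\S]*\}/);
const arrMatch = text.match(/\[[\s\S]*\]/);
const candidates = [objMatch, arrMatch]
.filter((m): m is RegExpMatchArray => m !== null)
.sort((a, b) => (a.index ?? 0) - (b.index ?? 0));
for (const candidate of candidates) {
try {
return JSON.parse(candidate[0]);
} catch {
// fall through to the next candidate
}
}
throw new Error(`Cannot repair JSON: ${raw.slice(0, 100)}`);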
}
}
}
// Usage: wrap any structured output call
const raw = response.choices[0].message.content ?? "";
const parsed = extractAndRepairJSON(raw);
const validated = MySchema.parse(parsed); // then validate with Zod
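A greedy regex such as `/\{[\s\S]*\}/` spans from the first opening brace to the last closing brace, so commentary that itself contains braces after the JSON defeats it. A possible alternative is a small scanner that tracks bracket depth and skips string contents. The sketch below is one way to do that; `firstBalancedJSON` is a hypothetical helper and is not part of `extractAndRepairJSON` above.

```typescript
// Return the first balanced {...} or [...] span in `text`, or null.
// Brackets inside double-quoted strings are ignored. The scanner does not
// check that bracket kinds match; JSON.parse remains the final arbiter.
function firstBalancedJSON(text: string): string | null {
  const start = text.search(/[{\[]/);
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{" || ch === "[") depth++;
    else if (ch === "}" || ch === "]") {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // never closed, for example a truncated response
}
```

A known limitation: if the surrounding prose contains a bracket before the JSON starts, the scanner returns that span instead, and the subsequent `JSON.parse` will fail.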
Pattern 5: Confidence Threshold Guardrail
Route low-confidence outputs to human review instead of surfacing them directly.
const ConfidentOutput = z.object({
answer: z.string(),
confidence: z.number().min(0).max(1),
sources: z.array(z.string()).optional(),
});
type ReviewableOutput =
| { type: "direct"; answer: string }
| { type: "pending_review"; reviewId: string };
async function callWithConfidenceGate(
query: string,
threshold = 0.75
): Promise<ReviewableOutput> {
const result = await callWithSchema(
[
{
role: "system",
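content:
// Name the JSON keys so the output matches ConfidentOutput, and keep the prompt in sync with the threshold argument
`Answer the query. Respond with a JSON object with keys "answer" (string), "confidence" (number from 0 to 1), and optionally "sources" (array of strings). Be conservative: if unsure, score below ${threshold}.`,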
},
{ role: "user", content: query },
],
ConfidentOutput
);
if (result.confidence >= threshold) {
return { type: "direct", answer: result.answer };
}
// Low confidence, queue for human review
const reviewId = await humanReviewQueue.push({
query,
draft: result.answer,
confidence: result.confidence,
});
return { type: "pending_review", reviewId };
}
// Usage
const output = await callWithConfidenceGate(userQuery, 0.8);
if (output.type === "direct") {
return output.answer;
} else {
return `We're verifying this answer. Check back using ID: ${output.reviewId}`;
}
When to use: high-stakes domains such as healthcare information, legal interpretation, or financial advice, and any output that drives an irreversible decision.
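A self-reported confidence score may not track real accuracy, so the threshold is worth checking against answers that humans have already reviewed. The sketch below assumes you keep such a sample; `ReviewedSample`, `accuracyAtOrAbove`, and `pickThreshold` are hypothetical names, and the approach is one simple option rather than a full calibration method.

```typescript
// One reviewed sample: the model's self-reported confidence and whether a
// human reviewer judged the answer correct. Both the type and the data
// source are assumptions for this sketch.
interface ReviewedSample {
  confidence: number;
  correct: boolean;
}

// Fraction of answers at or above `threshold` that reviewers marked correct.
// Returns null when no sample clears the threshold.
function accuracyAtOrAbove(
  samples: ReviewedSample[],
  threshold: number
): number | null {
  const passed = samples.filter((s) => s.confidence >= threshold);
  if (passed.length === 0) return null;
  return passed.filter((s) => s.correct).length / passed.length;
}

// Lowest candidate threshold whose observed accuracy meets the target.
function pickThreshold(
  samples: ReviewedSample[],
  candidates: number[],
  targetAccuracy: number
): number | null {
  for (const t of [...candidates].sort((a, b) => a - b)) {
    const acc = accuracyAtOrAbove(samples, t);
    if (acc !== null && acc >= targetAccuracy) return t;
  }
  return null;
}
```

Picking the lowest threshold that meets the target keeps the review queue as small as the accuracy goal allows. With few samples the estimate is noisy, so treat the result as a starting point.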
Pattern 6: Token Cost Circuit Breaker
Stop runaway agent loops before they generate unexpected API charges.
class CostCircuitBreaker {
private totalTokens = 0;
private readonly limit: number;
private readonly onBudgetExceeded: (tokens: number) => void;
constructor(
tokenLimit: number,
onBudgetExceeded: (tokens: number) => void
) {
this.limit = tokenLimit;
this.onBudgetExceeded = onBudgetExceeded;
}
record(usage: { prompt_tokens: number; completion_tokens: number }): void {
this.totalTokens += usage.prompt_tokens + usage.completion_tokens;
if (this.totalTokens > this.limit) {
this.onBudgetExceeded(this.totalTokens);
throw new Error(
`Token budget exceeded: ${this.totalTokens} > ${this.limit}`
);
}
}
get used(): number {
return this.totalTokens;
}
}
// Usage: wrap your agent loop
const breaker = new CostCircuitBreaker(
50_000, // per-session token budget; derive the dollar cap from your model's current pricing
(tokens) => {
alertOps(`Agent token budget exceeded: ${tokens} tokens used`);
auditLog.write({ event: "budget_exceeded", tokens, sessionId });
}
);
while (agent.hasMoreSteps()) {
const response = await client.chat.completions.create({ /* ... */ });
breaker.record(response.usage!); // throws if over budget
await agent.processResponse(response);
}
Set the limit per session, not per request. A single tool call may be cheap, but a stuck loop calling tools 200 times is not.
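To pick a token limit from a dollar budget, the conversion can be kept explicit and driven by prices you supply. The following is a sketch under that assumption: `Pricing`, `estimateCostUSD`, and `tokenLimitForBudget` are hypothetical names, and no real price is encoded here.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Prices in USD per one million tokens. These are inputs you supply from
// your provider's current price list; nothing here is a real price.
interface Pricing {
  promptPerMillion: number;
  completionPerMillion: number;
}

function estimateCostUSD(usage: Usage, pricing: Pricing): number {
  return (
    (usage.prompt_tokens / 1_000_000) * pricing.promptPerMillion +
    (usage.completion_tokens / 1_000_000) * pricing.completionPerMillion
  );
}

// Worst-case token budget for a dollar cap: assume every token is billed at
// the more expensive of the two rates.
function tokenLimitForBudget(budgetUSD: number, pricing: Pricing): number {
  const worstRate = Math.max(
    pricing.promptPerMillion,
    pricing.completionPerMillion
  );
  return Math.floor((budgetUSD / worstRate) * 1_000_000);
}
```

Using the worse of the two rates makes the limit conservative: the session stops at or before the dollar cap regardless of the prompt and completion mix.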
Combining Patterns
Wire them in order: track cost, check content policy, redact PII, then surface:
async function safeAgentOutput(
messages: OpenAI.Chat.ChatCompletionMessageParam[]
): Promise<string> {
// 1. Get raw output (with cost tracking)
const response = await client.chat.completions.create({ model: "gpt-4o", messages });
breaker.record(response.usage!);
const raw = response.choices[0].message.content ?? "";
// 2. Content policy
const policy = await checkContentPolicy(raw);
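if (policy.verdict === "block") return SAFE_FALLBACK_MESSAGE;
// Do not let "review" verdicts fall through to the user
if (policy.verdict === "review") {
await humanReviewQueue.push({ output: raw, reason: policy.reason });
return "Your request is being reviewed. We'll follow up shortly.";
}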
// 3. PII redaction
const { redacted } = redactPII(raw);
// 4. Return clean output
return redacted;
}
For structured outputs, replace step 4 with Zod schema validation (Pattern 1) after redaction.
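Running regex redaction over raw JSON text has one hazard: a bare number such as `"id": 5551234567` matches the phone rule and becomes `[REDACTED-PHONE]` without quotes, which no longer parses. One possible alternative is to parse first and redact each string value in place. The sketch below assumes that ordering; `redactDeep` and `redactEmails` are hypothetical helpers, with `redactEmails` standing in for `redactPII(...).redacted`.

```typescript
// Apply a string redactor to every string value in a parsed JSON structure.
function redactDeep(value: unknown, redact: (s: string) => string): unknown {
  if (typeof value === "string") return redact(value);
  if (Array.isArray(value)) return value.map((v) => redactDeep(v, redact));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, v] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redactDeep(v, redact);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}

// Stand-in for redactPII(...).redacted, reduced to the email rule.
const redactEmails = (s: string): string =>
  s.replace(/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/gi, "[REDACTED-EMAIL]");
```

Because only string values are rewritten, the structure stays valid. Numeric fields are left untouched, so identifiers stored as numbers need their own handling if they count as personal data.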
Governance Notes: What These Patterns Satisfy
Each pattern above maps to a governance or compliance obligation that increasingly appears in AI system reviews:
| Pattern | Governance function | Relevant standard |
|---|---|---|
| Zod schema enforcement | Output integrity; prevents malformed data entering downstream systems | NIST AI RMF, Measure 2.5 |
| PII redaction | Data minimization; prevents personal data retention in logs | GDPR Article 5(1)(c); CCPA |
| Content policy filter | Harm prevention; content moderation obligations | EU AI Act Article 5 (prohibited practices); FTC Section 5 |
| JSON repair and fallback | Reliability; prevents silent failures from reaching users | NIST AI RMF, Manage 2.2 |
| Confidence threshold guardrail | Human oversight; escalates uncertain outputs | EU AI Act Article 14 (human oversight) |
| Token cost circuit breaker | Operational risk; prevents runaway AI spend | FinOps; internal budget controls |
For teams building TypeScript AI agents in regulated industries such as healthcare, financial services, and legal, the confidence threshold guardrail and PII redaction patterns are the highest-priority items to implement before deploying to production. The schema enforcement pattern is effectively zero-overhead once in place and should be standard practice for any agent that returns structured data.
When used together, these patterns provide a layered output validation pipeline that is easier to defend in an AI governance audit: the code shows that outputs were validated for structure, sanitized for sensitive data, filtered for policy compliance, and subject to cost controls before reaching end users.
Related reading
- TypeScript AI agent authorization patterns 2026: control which tools the agent can call
- TypeScript AI agent security audit checklist 2026: audit trail and logging patterns
- TypeScript AI agent security incident response playbook: what to do when something goes wrong
References
- OpenAI, OpenAI Agents SDK TypeScript
- Zod, Zod documentation
- OpenAI, Moderation API
