How do you validate structured JSON output from an LLM in TypeScript?

Define the expected shape as a Zod schema, then call schema.safeParse(output) on every response. If parsing fails, either retry with a correction prompt ('Your previous output was invalid JSON. Return only a JSON object matching this schema: ...') or fall back to a default value.

How do you redact PII from AI agent outputs in TypeScript?

Run a regex pipeline over the output string before it reaches the user or any log. Patterns to match: email addresses, phone numbers, SSNs (\d{3}-\d{2}-\d{4}), credit card numbers (\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}), and IP addresses. Replace matches with [REDACTED-TYPE]. For named entities (person names, addresses), use a dedicated NER library; regex alone misses most named PII.

How do you implement a content policy filter for AI outputs in TypeScript?

Build a layered filter: first, a fast keyword/regex blocklist for obvious violations (O(n) over the output string); then, for borderline content, a secondary call to a moderation API (OpenAI Moderation, Azure Content Safety, or similar). Gate the output: if the filter fails, return a safe fallback string and log the violation. Do not surface the raw output to the user.

What is a cost circuit breaker for AI agents?

A circuit breaker tracks cumulative token usage per session or per request and throws when a configurable threshold is exceeded. Implement it as a wrapper around your LLM client that increments a counter on each response and calls an onBudgetExceeded callback when the limit is hit.

How do you handle hallucinations in TypeScript AI agent outputs?

Hallucination prevention is primarily a retrieval and prompting problem, not a post-processing one. For structured fact outputs, implement a confidence threshold pattern: require the model to output a confidence score alongside the answer, and route low-confidence outputs to human review rather than surfacing them directly.

TypeScript AI Agent Output Validation: 6 Patterns with Code Templates

Every AI agent output is untrusted input to the rest of your system. Treat it the same way you treat user-supplied HTTP request bodies: parse, validate, sanitize, then use.

Part of the TypeScript AI Agent Security series. This is the output validation deep-dive. Start with the complete TypeScript AI agent security playbook, or see the companion guides on authorization patterns and logging and audit trails.

These six patterns are standalone: each is a function or class you drop in and wire to your existing agent loop.

Pattern	Problem it solves	When to use	Overhead
1. Zod schema enforcement with retry	LLM returns malformed JSON or wrong field types, causing downstream crashes	Any agent where the output feeds into typed application logic	Low, single parse + optional 1 retry
2. PII redaction pipeline	LLM echoes personal data from context into outputs that get stored or sent externally	Agents processing customer data, support workflows, healthcare	Medium, regex + NER scan per output
3. Content policy filter	LLM produces harmful, off-topic, or policy-violating content in user-facing contexts	Customer-facing agents; children's products; regulated content	Low, classifier call per output
4. JSON repair and fallback	LLM produces nearly-valid JSON that strict parsers reject (trailing commas, single quotes)	Any agent that cannot always retry cleanly; cost-sensitive pipelines	Low, repair before parse
5. Confidence threshold guardrail	LLM output is uncertain or hedged; downstream system needs high-confidence inputs	Medical, financial, or legal decision support; automated actions	Medium, requires confidence scoring
6. Token cost circuit breaker	Agent pipelines loop or spike token usage unexpectedly; invoice surprises	Any agentic loop; pipelines with retry logic	Low, counter per session

TL;DR: Six copy-paste TypeScript patterns for AI agent output validation: (1) Zod schema enforcement on structured outputs with automatic retry, (2) PII detection and redaction before outputs reach users or logs, (3) Content policy filter for harmful or off-topic outputs, (4) JSON repair and fallback for malformed structured outputs, (5) Confidence threshold guardrail that escalates low-certainty outputs to human review, (6) Token cost circuit breaker that stops runaway generation. Each pattern is standalone: drop it in and wire it to your agent framework.

Pattern 1: Zod Schema Enforcement with Retry

Parse every structured output against its expected schema. On failure, send a correction prompt and retry once before falling back.

import { z } from "zod";
import OpenAI from "openai";

const client = new OpenAI();

async function callWithSchema<T>(
  messages: OpenAI.Chat.ChatCompletionMessageParam[],
  schema: z.ZodType<T>,
  maxRetries = 1
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages,
      response_format: { type: "json_object" },
    });

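    const raw = response.choices[0].message.content ?? "";

    // JSON.parse throws on malformed input; catch it so the retry path still runs
    let errorMessage: string;
    try {
      const parsed = schema.safeParse(JSON.parse(raw));
      if (parsed.success) return parsed.data;
      errorMessage = parsed.error.message;
    } catch (err) {
      errorMessage = `Invalid JSON: ${(err as Error).message}`;
    }

    if (attempt < maxRetries) {
      // Feed the error back as a correction prompt
      messages = [
        ...messages,
        { role: "assistant", content: raw },
        {
          role: "user",
          content: `Your output failed validation: ${errorMessage}. Return a valid JSON object matching the schema.`,
        },
      ];
    }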
  }
  throw new Error("Output validation failed after retries");
}

// Usage
const AnalysisResult = z.object({
  risk_level: z.enum(["low", "medium", "high"]),
  summary: z.string().max(500),
  action_required: z.boolean(),
});

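const result = await callWithSchema(
  [
    // JSON mode requires the prompt itself to mention JSON; describe the expected fields here too
    {
      role: "system",
      content:
        'Respond with a JSON object with keys "risk_level" ("low" | "medium" | "high"), "summary" (string, max 500 chars), and "action_required" (boolean).',
    },
    { role: "user", content: "Analyze this contract for risk..." },
  ],
  AnalysisResult
);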
// result is typed as { risk_level: "low"|"medium"|"high", summary: string, action_required: boolean }

Why one retry: A second retry rarely helps. If the model fails twice on the same schema, the schema description in the prompt is the problem, not the model.

Pattern 2: PII Redaction Pipeline

Strip sensitive data from outputs before they reach users, logs, or downstream systems.

type RedactionRule = { pattern: RegExp; label: string };

const DEFAULT_RULES: RedactionRule[] = [
  { pattern: /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/gi, label: "EMAIL" },
  { pattern: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g, label: "PHONE" },
  { pattern: /\b\d{3}-\d{2}-\d{4}\b/g, label: "SSN" },
  {
    pattern: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g,
    label: "CARD",
  },
  {
    pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
    label: "IP",
  },
];

function redactPII(
  text: string,
  rules: RedactionRule[] = DEFAULT_RULES
): { redacted: string; hits: string[] } {
  const hits: string[] = [];
  let redacted = text;

  for (const rule of rules) {
    redacted = redacted.replace(rule.pattern, () => {
      // Record only the rule label so the hit log never contains part of the matched value
      hits.push(rule.label);
      return `[REDACTED-${rule.label}]`;
    });
  }

  return { redacted, hits };
}

// Usage: always redact before logging or returning to the client
const raw = await getAgentOutput();
const { redacted, hits } = redactPII(raw);

if (hits.length > 0) {
  console.warn("PII redacted from agent output:", hits);
}

return redacted; // safe to surface

Note: Regex catches structured PII (emails, SSNs, card numbers). Named entities (person names, addresses) require an NER model; regex alone is not sufficient for GDPR Article 17 compliance.
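
The card-number regex above matches any 16-digit run, so order IDs or tracking numbers can end up redacted as cards. One way to cut those false positives is to apply a Luhn checksum to each candidate before redacting. A minimal sketch, assuming you want stricter card matching; `passesLuhn` and `redactCards` are hypothetical helpers, not part of the pipeline above:

```typescript
// Hypothetical helper: returns true when the digits satisfy the Luhn checksum
// used by payment card numbers. Non-digit characters (spaces, dashes) are ignored.
function passesLuhn(candidate: string): boolean {
  const digits = candidate.replace(/\D/g, "");
  if (digits.length === 0) return false;

  let sum = 0;
  let shouldDouble = false; // the rightmost digit is never doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (shouldDouble) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    shouldDouble = !shouldDouble;
  }
  return sum % 10 === 0;
}

// Hypothetical wrapper: redact only 16-digit runs that also pass the checksum.
function redactCards(text: string): string {
  const cardPattern = /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g;
  return text.replace(cardPattern, (match) =>
    passesLuhn(match) ? "[REDACTED-CARD]" : match
  );
}
```

The trade-off: a mistyped card number fails the checksum and passes through unredacted. If over-redaction is cheaper for you than a miss, keep the plain regex rule.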

Pattern 3: Content Policy Filter

Block harmful, off-topic, or policy-violating outputs before they reach users. Layered: fast keyword check first, moderation API second.

const BLOCKLIST = [
  /\b(how to (make|build|create) (bomb|weapon|malware))\b/i,
  /\b(suicide|self.harm) (method|instruction|guide)\b/i,
];

type PolicyVerdict = "pass" | "block" | "review";

async function checkContentPolicy(text: string): Promise<{
  verdict: PolicyVerdict;
  reason?: string;
}> {
  // Layer 1: fast blocklist (< 1ms)
  for (const pattern of BLOCKLIST) {
    if (pattern.test(text)) {
      return { verdict: "block", reason: "blocklist_match" };
    }
  }

  // Layer 2: moderation API (only if layer 1 passes); reuses the `client` instance from Pattern 1
  const moderation = await client.moderations.create({ input: text });
  const result = moderation.results[0];

  if (result.flagged) {
    const categories = Object.entries(result.categories)
      .filter(([, flagged]) => flagged)
      .map(([cat]) => cat);
    return { verdict: "block", reason: categories.join(",") };
  }

  // Layer 3: low-confidence pass → human review queue
  const maxScore = Math.max(...Object.values(result.category_scores));
  if (maxScore > 0.5) {
    return { verdict: "review", reason: `high_score:${maxScore.toFixed(2)}` };
  }

  return { verdict: "pass" };
}

// Usage
const output = await agent.run(userMessage);
const policy = await checkContentPolicy(output);

if (policy.verdict === "block") {
  auditLog.write({ event: "output_blocked", reason: policy.reason });
  return SAFE_FALLBACK_MESSAGE;
}
if (policy.verdict === "review") {
  humanReviewQueue.push({ output, reason: policy.reason, userId });
  return "Your request is being reviewed. We'll follow up shortly.";
}

return output;
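
Layer 3 above applies a single cutoff (0.5) to every category. If some categories warrant a stricter cutoff than others, per-category thresholds are one option. A sketch under assumptions: the threshold values are illustrative only, and `needsReview` is a hypothetical helper that takes the same category-to-score map the moderation result exposes:

```typescript
type CategoryScores = Record<string, number>;

// Illustrative values only; tune them against your own review data.
const REVIEW_THRESHOLDS: CategoryScores = {
  "self-harm": 0.2,
  violence: 0.4,
};
const DEFAULT_REVIEW_THRESHOLD = 0.5;

// Hypothetical helper: returns the categories whose score meets or exceeds
// their threshold, falling back to the default for unlisted categories.
function needsReview(
  scores: CategoryScores,
  thresholds: CategoryScores = REVIEW_THRESHOLDS,
  fallback: number = DEFAULT_REVIEW_THRESHOLD
): string[] {
  return Object.entries(scores)
    .filter(([category, score]) => score >= (thresholds[category] ?? fallback))
    .map(([category]) => category);
}
```

Returning the category names rather than a boolean keeps the reason field in the review queue specific, which helps reviewers triage.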

Pattern 4: JSON Repair and Fallback

LLMs frequently return near-valid JSON wrapped in markdown fences, with trailing commas, or with commentary appended. Repair before throwing.

function extractAndRepairJSON(raw: string): unknown {
  // Strip markdown code fences
  let text = raw.replace(/^```(?:json)?\n?/m, "").replace(/\n?```$/m, "").trim();

  // Try direct parse first
  try {
    return JSON.parse(text);
  } catch {
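    // Strip single-line comments first so a comma followed by a comment is still caught.
    // Caveat: this also strips "//" inside string values such as URLs.
    text = text.replace(/\/\/[^\n]*/g, "");

    // Remove trailing commas before } or ]
    text = text.replace(/,(\s*[}\]])/g, "$1");

    try {
      return JSON.parse(text);
    } catch {
      // Find the outermost JSON object or array, whichever starts first
      const objMatch = text.match(/\{[\s\S]*\}/);
      const arrMatch = text.match(/\[[\s\S]*\]/);
      const match =
        objMatch && arrMatch
          ? (objMatch.index! <= arrMatch.index! ? objMatch : arrMatch)
          : objMatch ?? arrMatch;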

      if (match) {
        return JSON.parse(match[0]);
      }

      throw new Error(`Cannot repair JSON: ${raw.slice(0, 100)}`);
    }
  }
}

// Usage: wrap any structured output call
const raw = response.choices[0].message.content ?? "";
const parsed = extractAndRepairJSON(raw);
const validated = MySchema.parse(parsed); // then validate with Zod
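
The regex fallback in the repair function is greedy: it spans from the first opening brace to the last closing one, so appended commentary that itself contains braces can be swept into the match. A string-aware scanner is one alternative. This is a sketch, not a full JSON tokenizer; `extractFirstJsonValue` is a hypothetical helper that returns the first balanced object or array and leaves validation to `JSON.parse`:

```typescript
// Hypothetical helper: scans for the first balanced {...} or [...] span,
// ignoring braces and brackets that appear inside double-quoted strings.
function extractFirstJsonValue(text: string): string | null {
  const start = text.search(/[{\[]/);
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;

  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false; // character after a backslash is literal
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{" || ch === "[") depth++;
    else if (ch === "}" || ch === "]") {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // unbalanced: no complete value found
}
```

The scanner does not check that each closer matches its opener; a mismatched span is still rejected when the result is passed to `JSON.parse`.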

Pattern 5: Confidence Threshold Guardrail

Route low-confidence outputs to human review instead of surfacing them directly.

const ConfidentOutput = z.object({
  answer: z.string(),
  confidence: z.number().min(0).max(1),
  sources: z.array(z.string()).optional(),
});

type ReviewableOutput =
  | { type: "direct"; answer: string }
  | { type: "pending_review"; reviewId: string };

async function callWithConfidenceGate(
  query: string,
  threshold = 0.75
): Promise<ReviewableOutput> {
  const result = await callWithSchema(
    [
      {
        role: "system",
        content:
          `Answer the query as a JSON object with keys "answer" (string) and "confidence" (number from 0 to 1). Be conservative: if unsure, score below ${threshold}.`,
      },
      { role: "user", content: query },
    ],
    ConfidentOutput
  );

  if (result.confidence >= threshold) {
    return { type: "direct", answer: result.answer };
  }

  // Low confidence: queue for human review
  const reviewId = await humanReviewQueue.push({
    query,
    draft: result.answer,
    confidence: result.confidence,
  });

  return { type: "pending_review", reviewId };
}

// Usage
const output = await callWithConfidenceGate(userQuery, 0.8);

if (output.type === "direct") {
  return output.answer;
} else {
  return `We're verifying this answer. Check back using ID: ${output.reviewId}`;
}

When to use: High-stakes domains such as healthcare information, legal interpretation, and financial advice, or any output that drives an irreversible decision.

Pattern 6: Token Cost Circuit Breaker

Stop runaway agent loops before they generate unexpected API charges.

class CostCircuitBreaker {
  private totalTokens = 0;
  private readonly limit: number;
  private readonly onBudgetExceeded: (tokens: number) => void;

  constructor(
    tokenLimit: number,
    onBudgetExceeded: (tokens: number) => void
  ) {
    this.limit = tokenLimit;
    this.onBudgetExceeded = onBudgetExceeded;
  }

  record(usage: { prompt_tokens: number; completion_tokens: number }): void {
    this.totalTokens += usage.prompt_tokens + usage.completion_tokens;
    if (this.totalTokens > this.limit) {
      this.onBudgetExceeded(this.totalTokens);
      throw new Error(
        `Token budget exceeded: ${this.totalTokens} > ${this.limit}`
      );
    }
  }

  get used(): number {
    return this.totalTokens;
  }
}

// Usage: wrap your agent loop
const breaker = new CostCircuitBreaker(
  50_000, // token budget per session; derive the dollar figure from your model's current pricing
  (tokens) => {
    alertOps(`Agent token budget exceeded: ${tokens} tokens used`);
    auditLog.write({ event: "budget_exceeded", tokens, sessionId });
  }
);

while (agent.hasMoreSteps()) {
  const response = await client.chat.completions.create({ /* ... */ });
  breaker.record(response.usage!); // throws if over budget
  await agent.processResponse(response);
}

Set the limit per session, not per request. A single tool call may be cheap, but a stuck loop calling tools 200 times is not.
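
A token limit is easier to reason about when it is derived from a dollar budget. The sketch below shows the arithmetic; the `Pricing` type, both function names, and any rates you pass in are assumptions for illustration, since per-token prices differ by model and change over time:

```typescript
type Usage = { prompt_tokens: number; completion_tokens: number };

// Prices are in USD per one million tokens. The caller supplies them;
// nothing here reflects any provider's actual rates.
type Pricing = { inputPerMillion: number; outputPerMillion: number };

// Estimated spend for one response, pricing input and output tokens separately.
function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  return (
    (usage.prompt_tokens / 1_000_000) * pricing.inputPerMillion +
    (usage.completion_tokens / 1_000_000) * pricing.outputPerMillion
  );
}

// Hypothetical helper: largest total-token budget that stays within maxUsd,
// pricing every token at the more expensive of the two rates (conservative).
function tokenLimitForBudget(maxUsd: number, pricing: Pricing): number {
  const worstRate = Math.max(pricing.inputPerMillion, pricing.outputPerMillion);
  return Math.floor((maxUsd / worstRate) * 1_000_000);
}
```

Pricing every token at the higher rate understates the real budget, which is the safer direction for a breaker: it trips early rather than late.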

Combining Patterns

Wire them in order: track cost, check content policy, redact PII, then surface the result:

async function safeAgentOutput(
  messages: OpenAI.Chat.ChatCompletionMessageParam[]
): Promise<string> {
  // 1. Get raw output (with cost tracking)
  const response = await client.chat.completions.create({ model: "gpt-4o", messages });
  breaker.record(response.usage!);
  const raw = response.choices[0].message.content ?? "";

  // 2. Content policy
  const policy = await checkContentPolicy(raw);
  // Anything other than "pass" (including "review") is withheld from the user
  if (policy.verdict !== "pass") return SAFE_FALLBACK_MESSAGE;

  // 3. PII redaction
  const { redacted } = redactPII(raw);

  // 4. Return clean output
  return redacted;
}

For structured outputs, replace step 4 with Zod schema validation (Pattern 1) after redaction.
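
Redacting the raw JSON string before parsing works for simple payloads, but a pattern that matches inside a numeric field or an enum value can leave the object failing validation. An alternative ordering, sketched here as an assumption rather than as this article's method, is to parse and validate first and then redact only the string values. `redactStrings` is a hypothetical helper; `redact` stands in for a function such as `(s) => redactPII(s).redacted`:

```typescript
// Hypothetical helper: walks a parsed JSON value and applies `redact` to every
// string leaf, leaving keys, numbers, booleans, and null untouched.
function redactStrings(value: unknown, redact: (s: string) => string): unknown {
  if (typeof value === "string") return redact(value);
  if (Array.isArray(value)) {
    return value.map((item) => redactStrings(item, redact));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = redactStrings(inner, redact);
    }
    return out;
  }
  return value; // number, boolean, null, undefined
}
```

Because the walk returns `unknown`, re-validate or cast the result if downstream code relies on the schema's inferred type.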

Governance Notes: What These Patterns Satisfy

Each pattern above maps to a governance or compliance obligation that increasingly appears in AI system reviews:

Pattern	Governance function	Relevant standard
Zod schema enforcement	Output integrity; prevents malformed data entering downstream systems	NIST AI RMF, Measure 2.5
PII redaction	Data minimization; prevents personal data retention in logs	GDPR Article 5(1)(c); CCPA
Content policy filter	Harm prevention; content moderation obligations	EU AI Act (prohibited outputs); FTC Section 5
JSON repair and fallback	Reliability; prevents silent failures from reaching users	NIST AI RMF, Manage 2.2
Confidence threshold guardrail	Human oversight; escalates uncertain outputs	EU AI Act Article 14 (human oversight)
Token cost circuit breaker	Operational risk; prevents runaway AI spend	FinOps; internal budget controls

For teams building TypeScript AI agents in regulated industries (healthcare, financial services, legal), the confidence threshold guardrail and PII redaction patterns are the highest-priority items to implement before deploying to production. The schema enforcement pattern is effectively zero-overhead once in place and should be standard practice for any agent that returns structured data.

When used together, these patterns provide a layered output validation pipeline that is defensible in an AI governance audit: the code evidence shows that outputs were validated for structure, sanitized for sensitive data, filtered for policy compliance, and subject to cost controls before reaching end users.

TypeScript AI agent authorization patterns 2026, control which tools the agent can call
TypeScript AI agent security audit checklist 2026, audit trail and logging patterns
TypeScript AI agent security incident response playbook, what to do when something goes wrong
TypeScript AI agent observability and tracing patterns, OpenTelemetry spans and structured tracing for distributed agent calls
