Stable v1.0 · Last Updated: 1/3/2026

The Architecture of a Semantic Firewall

How to design a low-latency, fail-closed security layer for autonomous agents. Lessons from building Guardian.

Summary

  • What it is: A deterministic, low-latency firewall that validates agent tool calls before execution.
  • Why now: Agents are gaining real access to production systems, making action safety a first-class requirement.
  • Who it’s for: Teams shipping autonomous agents that call APIs, execute SQL, or trigger high-risk operations.

Goals

  • Block unsafe tool calls before execution (fail-closed).
  • Keep latency low enough for interactive experiences.
  • Provide auditability and clear policy reasons for blocks.

Non-Goals

  • Replacing backend authorization (this complements it).
  • Detecting every possible business logic bug.
  • Acting as the only safety layer in a full platform.

Requirements

Functional

  • Intercept all agent tool calls.
  • Enforce policy and role constraints.
  • Detect unsafe code patterns (SQL/Python/CLI).
  • Escalate to content inspection when needed.
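The requirements above can be sketched as a thin interception wrapper that sits between the agent and execution. This is a minimal sketch: the names `ToolCall`, `guardian_check`, and `intercept`, and the sample policy rule, are illustrative, not Guardian's actual API.

```python
# Hypothetical sketch of the interception requirement; names are illustrative.
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    tool: str                       # e.g. "sql.execute"
    args: dict = field(default_factory=dict)
    role: str = "agent"             # caller role, used by policy checks

def guardian_check(call: ToolCall) -> tuple[bool, str]:
    """Return (allowed, reason_code). Stand-in for the tiered checks."""
    if call.tool.startswith("admin.") and call.role != "admin":
        return False, "POLICY_ROLE_DENIED"
    return True, "OK"

def intercept(call: ToolCall, execute):
    """Every tool call passes through the firewall before execution."""
    allowed, reason = guardian_check(call)
    if not allowed:
        return {"status": "BLOCK", "reason": reason}
    return {"status": "ALLOW", "result": execute(call)}
```

Note that the wrapper returns a reason code on every block, per the explainability requirement below.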

Non-Functional

  • Latency: p95 < 50ms.
  • Reliability: fail-closed on errors.
  • Explainability: return a reason code for every block.

Architecture Overview

```mermaid
graph LR
    Agent -->|Tool Call| Guardian

    subgraph Guardian [The Semantic Firewall]
        Tier1[Tier 1: Policy Engine] -->|Pass| Tier2
        Tier2[Tier 2: Static Analysis] -->|Pass| Tier3
        Tier3[Tier 3: Content Inspection]
    end

    Tier1 -->|BLOCK| Agent
    Tier2 -->|BLOCK| Agent
    Tier3 -->|ALLOW / BLOCK| Agent
```

Key Design Decisions

1) Fail-Closed vs. Fail-Open

  • Options: Allow on error vs. block on error.
  • Choice: Fail-closed.
  • Rationale: Unsafe actions are higher risk than temporary unavailability.
  • Tradeoffs: Availability degradation during safety outages.
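The fail-closed choice reduces to a simple invariant: any internal error in the firewall blocks the action rather than letting it through. A minimal sketch, with illustrative names:

```python
# Fail-closed sketch: an exception inside the safety layer must not
# become an open gate. Check functions return (ok, reason_code).
def evaluate(call, checks):
    try:
        for check in checks:
            ok, reason = check(call)
            if not ok:
                return ("BLOCK", reason)
        return ("ALLOW", "OK")
    except Exception:
        # Fail-closed: a broken check blocks the call.
        return ("BLOCK", "FIREWALL_ERROR")
```

This is the source of the availability tradeoff noted above: an outage in any check degrades into blocked actions, not unsafe ones.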

2) AST vs. LLM for Code Analysis

  • Options: Deterministic AST vs. probabilistic LLM checks.
  • Choice: AST.
  • Rationale: Fast and consistent; blocks known-bad patterns reliably.
  • Tradeoffs: Limited semantic reasoning for complex logic bugs.
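For Python payloads, the deterministic approach maps directly onto the standard `ast` module: parse once, walk the tree, flag known-bad call patterns. The pattern list below is illustrative, not Guardian's actual ruleset.

```python
# Sketch of a deterministic AST scan for dangerous Python patterns.
import ast

DANGEROUS_CALLS = {"eval", "exec", "compile", "__import__"}  # illustrative

def scan_python(source: str) -> list[str]:
    """Return reason codes for dangerous call patterns found in `source`."""
    flags = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in DANGEROUS_CALLS:
                flags.append(f"AST_DANGEROUS_CALL:{func.id}")
            # attribute-style shell escapes, e.g. os.system(...)
            elif isinstance(func, ast.Attribute) and func.attr == "system":
                flags.append("AST_SHELL_EXEC")
    return flags
```

Because this only parses, never executes, it is fast and consistent; the tradeoff is exactly as stated: it cannot reason about semantics beyond the patterns it knows.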

3) Chain of Responsibility

  • Options: Run all checks vs. short-circuit.
  • Choice: Short-circuit in a chain.
  • Rationale: Saves compute; cheap checks first, expensive last.
  • Tradeoffs: Requires good ordering and thresholds.
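The chain can be sketched as an ordered list of tier functions that short-circuits on the first block. Tier bodies and tool names here are stand-ins; each returns a `("BLOCK", reason)` verdict or `None` to pass.

```python
# Short-circuit chain sketch: cheap checks first, expensive last.
def policy_tier(call):        # cheapest tier runs first
    if call.get("tool") == "db.drop_table":
        return ("BLOCK", "POLICY_FORBIDDEN_TOOL")

def static_tier(call):        # mid-cost static check
    if "DROP TABLE" in call.get("code", "").upper():
        return ("BLOCK", "AST_DROP_TABLE")

def content_tier(call):       # most expensive; only reached if earlier tiers pass
    return ("ALLOW", "OK")

def run_chain(call, tiers=(policy_tier, static_tier, content_tier)):
    for tier in tiers:
        verdict = tier(call)
        if verdict is not None and verdict[0] == "BLOCK":
            return verdict    # short-circuit: skip the remaining tiers
    return ("ALLOW", "OK")
```

The ordering tradeoff is visible in the tuple default: reorder the tiers and a cheap call may pay for an expensive check it never needed.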

Data Flows

Happy Path

  1. Agent emits tool call.
  2. Policy engine approves.
  3. AST analysis passes.
  4. Content inspection passes.
  5. Action allowed.

Failure Path

  1. Agent emits tool call.
  2. Policy engine blocks or AST flags a dangerous pattern.
  3. Guardian returns a block with reason code.
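A block response needs a stable, machine-readable reason code plus enough context to audit. One possible shape, with illustrative field names:

```python
# Sketch of a block response carrying a reason code and audit context.
import datetime
import uuid

def block_response(tool: str, reason_code: str, tier: str) -> dict:
    return {
        "decision": "BLOCK",
        "reason_code": reason_code,   # stable, machine-readable
        "tier": tier,                 # which tier fired
        "tool": tool,
        "request_id": str(uuid.uuid4()),
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
```

Keeping the reason code separate from any human-readable message lets dashboards aggregate blocks by cause without string parsing.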

Observability

  • Metrics: block rate, tier latency, reason distribution.
  • Logs: request_id, tool name, policy decision, AST flags.
  • Tracing: span per tier for audit trails.
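The metrics and logs above can share one structured record per decision, with per-tier latencies captured by a timing wrapper. A sketch under assumed field names:

```python
# Sketch: time each tier and emit one structured log line per decision.
import json
import time

def timed(tier_fn, call, timings: dict, name: str):
    """Run a tier and record its latency in milliseconds."""
    start = time.perf_counter()
    result = tier_fn(call)
    timings[name] = round((time.perf_counter() - start) * 1000, 3)
    return result

def log_decision(request_id, tool, decision, reason, timings):
    record = {
        "request_id": request_id,
        "tool": tool,
        "decision": decision,
        "reason": reason,
        "tier_latency_ms": timings,   # feeds the tier-latency metric
    }
    print(json.dumps(record))         # stand-in for a real log sink
    return record
```

Aggregating `reason` over these records yields the reason distribution metric directly.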

Risks & Mitigations

  • Policy drift: version policies and audit changes.
  • False positives: use tiered checks + allowlist overrides.
  • Latency spikes: cap deep checks and short-circuit aggressively.

Rollout Plan

  • Start in shadow mode (log only).
  • Enable blocking for high-risk actions first.
  • Expand to broader tool set with monitored thresholds.
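The first two rollout steps combine into a single gate: evaluate every call, but enforce blocks only for the high-risk tool set while everything else stays log-only. A sketch with an illustrative tool list:

```python
# Shadow-mode sketch: enforce blocks only for high-risk tools at first.
HIGH_RISK_TOOLS = {"sql.execute", "shell.run"}   # illustrative enforcement list

def decide(tool: str, verdict: str, shadow: bool = True) -> str:
    """In shadow mode, would-be blocks are logged but allowed,
    except for tools already promoted to enforcement."""
    if verdict == "BLOCK":
        if not shadow or tool in HIGH_RISK_TOOLS:
            return "BLOCK"
        print(f"shadow: would block {tool}")     # log-only observation
        return "ALLOW"
    return "ALLOW"
```

Comparing shadow-mode "would block" logs against real traffic is what calibrates the monitored thresholds before expanding enforcement.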

Open Questions

  • Where should manual approval workflows live?
  • How should policy rules be shared across tenants?

Notes

Guardian processes an average request in ~18ms (Tier 1 policy: 0.5ms, Tier 2 AST: 3ms, Tier 3 Sentinel content inspection: 15ms). This doc is part of the Aether platform.

Live demo: /guardian