Permission System
System Hardening | Intent Must Pass a Safety Gate | 458 LOC | 4 tools
Safety is a pipeline, not a boolean: deny, check mode, allow, then ask.
What You'll Learn
- A four-stage permission pipeline that every tool call must pass through before execution
- Three permission modes that control how aggressively the agent auto-approves actions
- How deny and allow rules use pattern matching to create a first-match-wins policy
- Interactive approval with an "always" option that writes permanent allow rules at runtime
Your agent from s06 is capable and long-lived. It reads files, writes code, runs shell commands, delegates subtasks, and compresses its own context to keep going. But there is no safety catch. Every tool call the model proposes goes straight to execution. Ask it to delete a directory and it will -- no questions asked. Before you give this agent access to anything that matters, you need a gate between "the model wants to do X" and "the system actually does X."
The Problem
Imagine your agent is helping refactor a codebase. It reads a few files, proposes some edits, and then decides to run rm -rf /tmp/old_build to clean up. Except the model hallucinated the path, and the directory it actually targets is your home folder. Or it decides to run something under sudo because it has seen that pattern in training data. Without a permission layer, intent becomes execution instantly. There is no moment where the system can say "wait, that looks dangerous" or where you can say "no, do not do that." The agent needs a checkpoint: a pipeline (a sequence of stages that every request passes through) between what the model asks for and what actually happens.
The Solution
Every tool call now passes through a four-stage permission pipeline before execution. The stages run in order, and the first one that produces a definitive answer wins.
tool_call from LLM
|
v
[1. Deny rules] -- blocklist: always block these
|
v
[2. Mode check] -- plan mode? auto mode? default?
|
v
[3. Allow rules] -- allowlist: always allow these
|
v
[4. Ask user] -- interactive y/n/always prompt
|
v
execute (or reject)
Read Together
- If you start blurring "the model proposed an action" with "the system actually executed an action," you might find it helpful to revisit s00a-query-control-plane.md.
- If you are not yet clear on why tool requests should not drop straight into handlers, keeping s02a-tool-control-plane.md open beside this chapter may help.
- If PermissionRule, PermissionDecision, and tool_result start to collapse into one vague idea, data-structures.md can reset them.
How It Works
Step 1. Define three permission modes. Each mode changes how the pipeline treats tool calls that do not match any explicit rule. "Default" mode is the safest -- it asks you about everything. "Plan" mode blocks all writes outright, useful when you want the agent to explore without touching anything. "Auto" mode lets reads through silently and only asks about writes, good for fast exploration.
| Mode | Behavior | Use Case |
|---|---|---|
| default | Ask user for every unmatched tool call | Normal interactive use |
| plan | Block all writes, allow reads | Planning/review mode |
| auto | Auto-allow reads, ask for writes | Fast exploration mode |
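The table above can be read as a small lookup function. The following is a minimal sketch, not the chapter's actual code: the names PermissionMode, ModeOutcome, and modeFallback are hypothetical, and the isWrite flag assumes each tool can be classified as a read or a write.

```typescript
// Hypothetical names: PermissionMode, ModeOutcome, and modeFallback are
// illustrative and not taken from the chapter's source code.
type PermissionMode = "default" | "plan" | "auto";
type ModeOutcome = "allow" | "deny" | "ask";

// Decide what happens to a tool call that matched no explicit rule.
// `isWrite` is an assumed flag saying whether the tool mutates state.
function modeFallback(mode: PermissionMode, isWrite: boolean): ModeOutcome {
  switch (mode) {
    case "plan":
      return isWrite ? "deny" : "allow"; // explore without touching anything
    case "auto":
      return isWrite ? "ask" : "allow"; // reads pass silently, writes prompt
    case "default":
      return "ask"; // safest: prompt for everything unmatched
  }
}

for (const mode of ["default", "plan", "auto"] as const) {
  console.log(mode, "read:", modeFallback(mode, false), "write:", modeFallback(mode, true));
}
```

Keeping the mode logic in one pure function makes each row of the table easy to check in isolation.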
Step 2. Set up deny and allow rules with pattern matching. Rules are checked in order -- first match wins. Deny rules catch dangerous patterns that should never execute, regardless of mode. Allow rules let known-safe operations pass without asking.
// Tree-shaken bundle: chapter wiring + only used runtime code.
// agents_self_contained/_runtime.ts
import Anthropic from "@anthropic-ai/sdk";
import dotenv from "dotenv";
import { execSync, spawn, spawnSync } from "node:child_process";
import fs from "node:fs";
import path from "node:path";
import process2 from "node:process";
import readline from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
dotenv.config({ override: true });
var WORKDIR = process2.cwd();
var DEFAULT_MODEL = "claude-3-5-sonnet-latest";
var anthropicClient = null;
function getModelId() {
  return process2.env.MODEL_ID || DEFAULT_MODEL;
}
function getAnthropicClient() {
  if (anthropicClient) {
    return anthropicClient;
  }
  anthropicClient = new Anthropic({
    apiKey: process2.env.ANTHROPIC_API_KEY || process2.env.ANTHROPIC_AUTH_TOKEN || "missing-api-key",
    baseURL: process2.env.ANTHROPIC_BASE_URL || void 0
  });
  return anthropicClient;
}
function createLoopContext() {
  return { workdir: WORKDIR, messages: [], meta: {} };
}
function safePath(relativePath) {
  const resolved = path.resolve(WORKDIR, relativePath);
When the user answers "always" at the interactive prompt, a permanent allow rule is added at runtime.
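To make first-match-wins and the "always" answer concrete, here is a minimal sketch. It assumes rules carry a tool name and a glob-like pattern in which "*" matches any run of characters. The rule shape and the helpers globToRegExp, firstMatch, and rememberAlways are hypothetical, and the chapter's real PermissionRule may look different.

```typescript
// All names below (globToRegExp, firstMatch, rememberAlways) and the rule
// shape are hypothetical; the chapter's real PermissionRule may differ.
interface PermissionRule {
  tool: string; // tool name, or "*" for any tool
  pattern: string; // glob-like: "*" matches any run of characters
  behavior: "allow" | "deny";
}

// Escape regex metacharacters, then turn each "*" into ".*".
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Rules are scanned in order; the first rule that matches decides.
function firstMatch(rules: PermissionRule[], tool: string, arg: string): PermissionRule | undefined {
  return rules.find(
    (rule) => (rule.tool === "*" || rule.tool === tool) && globToRegExp(rule.pattern).test(arg),
  );
}

// Called when the user answers "always": append an allow rule for this call.
// Appending keeps earlier deny rules ahead of it, so they still win.
function rememberAlways(rules: PermissionRule[], tool: string, arg: string): void {
  rules.push({ tool, pattern: arg, behavior: "allow" });
}

const rules: PermissionRule[] = [
  { tool: "bash", pattern: "rm -rf *", behavior: "deny" },
  { tool: "bash", pattern: "sudo *", behavior: "deny" },
  { tool: "read_file", pattern: "*", behavior: "allow" },
];

console.log(firstMatch(rules, "bash", "sudo reboot")?.behavior); // deny
console.log(firstMatch(rules, "bash", "ls -la")?.behavior); // undefined (no rule yet)
rememberAlways(rules, "bash", "ls -la");
console.log(firstMatch(rules, "bash", "ls -la")?.behavior); // allow
```

In this sketch the new rule lives only in memory for the lifetime of the process. Persisting it across runs would be a separate design decision.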
Step 3. Implement the four-stage check. This is the core of the permission system. Notice that deny rules run first and cannot be bypassed -- this is intentional. No matter what mode you are in or what allow rules exist, a deny rule always wins.
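The four stages described above can be sketched as a single function. This is an illustration under stated assumptions rather than the chapter's implementation: checkPermission, PrefixRule, and the field names are hypothetical, rules match by simple prefix to keep the example short, and each call carries an assumed isWrite flag.

```typescript
// All names here are hypothetical stand-ins for the chapter's real types.
type PermissionMode = "default" | "plan" | "auto";
interface ToolCall { tool: string; arg: string; isWrite: boolean }
interface PrefixRule { tool: string; prefix: string } // matches when arg starts with prefix
interface PermissionDecision { behavior: "allow" | "deny" | "ask"; reason: string }

function matches(rule: PrefixRule, call: ToolCall): boolean {
  return rule.tool === call.tool && call.arg.startsWith(rule.prefix);
}

function checkPermission(
  call: ToolCall,
  mode: PermissionMode,
  denyRules: PrefixRule[],
  allowRules: PrefixRule[],
): PermissionDecision {
  // Stage 1: deny rules always win, whatever the mode or allow rules say.
  const denied = denyRules.find((rule) => matches(rule, call));
  if (denied) return { behavior: "deny", reason: `deny rule "${denied.prefix}"` };

  // Stage 2: the mode may settle the question on its own.
  if (mode === "plan") {
    return call.isWrite
      ? { behavior: "deny", reason: "plan mode blocks writes" }
      : { behavior: "allow", reason: "plan mode allows reads" };
  }
  if (mode === "auto" && !call.isWrite) {
    return { behavior: "allow", reason: "auto mode allows reads" };
  }

  // Stage 3: allow rules let known-safe calls through without a prompt.
  const allowed = allowRules.find((rule) => matches(rule, call));
  if (allowed) return { behavior: "allow", reason: `allow rule "${allowed.prefix}"` };

  // Stage 4: nothing was definitive, so hand the decision to the user.
  return { behavior: "ask", reason: "no rule matched" };
}

const denyRules: PrefixRule[] = [{ tool: "bash", prefix: "sudo " }];
const allowRules: PrefixRule[] = [{ tool: "bash", prefix: "ls" }];

const sudoCall: ToolCall = { tool: "bash", arg: "sudo ls", isWrite: true };
console.log(checkPermission(sudoCall, "auto", denyRules, allowRules).behavior); // deny
const lsCall: ToolCall = { tool: "bash", arg: "ls -la", isWrite: false };
console.log(checkPermission(lsCall, "default", denyRules, allowRules).behavior); // allow
```

Because each stage returns as soon as it has an answer, the order of the if-blocks is the policy: moving the allow-rule lookup above the deny-rule lookup would let an allow rule override a deny rule.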
Step 4. Integrate the permission check into the agent loop. Every tool call now goes through the pipeline before execution. The result is one of three outcomes: denied (with a reason), allowed (silently), or asked (interactively).
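One way to picture the integration is a wrapper that sits between the model's tool call and the tool handler. The sketch below is an assumption-laden illustration: gatedExecute and the Check, Ask, and Execute callback types are hypothetical, and the permission check, user prompt, and tool handler are injected as functions so the example stays self-contained. The returned object follows the tool_result block shape (type, tool_use_id, content) used by the Anthropic Messages API.

```typescript
// Hypothetical names throughout; only the tool_result field names mirror
// the Anthropic Messages API.
interface PermissionDecision { behavior: "allow" | "deny" | "ask"; reason: string }
interface ToolCall { id: string; tool: string; arg: string }
interface ToolResult { type: "tool_result"; tool_use_id: string; content: string }

type Check = (call: ToolCall) => PermissionDecision;
type Ask = (call: ToolCall) => Promise<"y" | "n" | "always">;
type Execute = (call: ToolCall) => Promise<string>;

function toolResult(call: ToolCall, content: string): ToolResult {
  return { type: "tool_result", tool_use_id: call.id, content };
}

async function gatedExecute(
  call: ToolCall,
  check: Check,
  ask: Ask,
  execute: Execute,
  onAlways: (call: ToolCall) => void,
): Promise<ToolResult> {
  const decision = check(call);

  // Outcome 1: denied. Tell the model why instead of running the tool.
  if (decision.behavior === "deny") {
    return toolResult(call, `Permission denied: ${decision.reason}`);
  }

  // Outcome 2: asked. The user decides, and "always" records a new allow rule.
  if (decision.behavior === "ask") {
    const answer = await ask(call);
    if (answer === "n") return toolResult(call, "Permission denied by user");
    if (answer === "always") onAlways(call);
  }

  // Outcome 3: allowed (by rule, by mode, or by the user just now).
  return toolResult(call, await execute(call));
}

async function demo(): Promise<void> {
  const check: Check = (call) =>
    call.arg.startsWith("sudo ")
      ? { behavior: "deny", reason: "sudo is blocked" }
      : { behavior: "ask", reason: "no rule matched" };
  const result = await gatedExecute(
    { id: "toolu_demo", tool: "bash", arg: "sudo reboot" },
    check,
    async () => "y", // stand-in for the interactive prompt
    async (call) => `ran ${call.arg}`, // stand-in for the real tool handler
    () => {},
  );
  console.log(result.content); // Permission denied: sudo is blocked
}
demo();
```

Returning the denial as a tool_result, rather than throwing, keeps the conversation going: the model sees the reason and can propose a different action.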
Step 5. Add denial tracking as a simple circuit breaker. The PermissionManager tracks consecutive denials. After 3 in a row, it suggests switching to plan mode -- this prevents the agent from repeatedly hitting the same wall and wasting turns.
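The counter behind this circuit breaker can be sketched in a few lines. The chapter places it inside PermissionManager. The standalone DenialTracker class below, along with its method name and message text, is a hypothetical simplification for illustration.

```typescript
// Hypothetical standalone version of the denial counter; in the chapter this
// state lives inside PermissionManager.
class DenialTracker {
  private consecutive = 0;
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // Record one permission outcome. Returns a suggestion once the number of
  // consecutive denials reaches the threshold, otherwise null.
  record(outcome: "allow" | "deny"): string | null {
    if (outcome === "allow") {
      this.consecutive = 0; // any success resets the streak
      return null;
    }
    this.consecutive += 1;
    return this.consecutive >= this.threshold
      ? `${this.consecutive} consecutive denials: consider switching to plan mode`
      : null;
  }
}

const tracker = new DenialTracker();
console.log(tracker.record("deny")); // null
console.log(tracker.record("deny")); // null
console.log(tracker.record("deny")); // 3 consecutive denials: consider switching to plan mode
console.log(tracker.record("allow")); // null (streak reset)
```

Counting only consecutive denials means an agent that is occasionally refused but mostly productive never trips the breaker.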
What Changed From s06
| Component | Before (s06) | After (s07) |
|---|---|---|
| Safety | None | 4-stage permission pipeline |
| Modes | None | 3 modes: default, plan, auto |
| Rules | None | Deny/allow rules with pattern matching |
| User control | None | Interactive approval with "always" option |
| Denial tracking | None | Circuit breaker after 3 consecutive denials |
Try It
npm run s07
- Ask the agent to run pwd
- Ask it to run ls -la
- Ask it to summarize the current workspace in one sentence
- Ask it to create notes/hello.ts and print the file content
What You've Mastered
At this point, you can:
- Explain why model intent must pass through a decision pipeline before it becomes execution
- Build a four-stage permission check: deny, mode, allow, ask
- Configure three permission modes that give you different safety/speed tradeoffs
- Add rules dynamically at runtime when a user answers "always"
- Implement a simple circuit breaker that catches repeated denial loops
What's Next
Your permission system controls what the agent is allowed to do, but it lives entirely inside the agent's own code. What if you want to extend behavior -- add logging, auditing, or custom validation -- without modifying the agent loop at all? That is what s08 introduces: a hook system that lets external shell scripts observe and influence every tool call.
Key Takeaway
Safety is a pipeline, not a boolean -- deny first, then consider mode, then check allow rules, then ask the user.