M26: Hooks, Sessions & the Agent SDK | Building AI Agents with Claude

Learning Objectives

Build a complete tool-using agent with query() + ClaudeAgentOptions + create_sdk_mcp_server — no hand-rolled message loop.
Add lifecycle interceptors with HookMatcher for PreToolUse (logging, validation) and PostToolUse (PII redaction, audit).
Make per-call authorization decisions with the can_use_tool permission callback.
Manage multi-turn flows by composing the message stream returned by query() — including a fork-friendly SessionManager.
Declare specialist subagents as .claude/agents/<name>.md files with isolated context windows.
Decide when to leave the SDK for a raw loop (rare, but exam-tested).

1. Why the SDK Exists — The Loop You No Longer Write

From Hand-Forged to Power Tools

Imagine you're a carpenter. The first house you build, you measure every cut with a tape measure, mark it with a pencil, then cut with a hand saw. Every joint takes ten minutes. By house number ten, your hands know every cut, but your back hurts and the build is still slow.

Then you switch to a track saw with a digital fence. The saw measures, marks, and cuts in one pass. The wood comes out the same as before — same dimensions, same joints — but you finish in a tenth of the time. You're still a carpenter; the saw didn't replace your judgment about where joints go. It replaced the manual measuring step that you stopped learning anything from after house three.

The Claude Agent SDK is that track saw. The hand-rolled while True: client.messages.create(...) loop you wrote in M12 and M15B taught you what an agent is. After two or three hand-builds, the loop stops teaching you anything new; it's just typing. The SDK is the power tool that lets you stop typing the loop and start designing the things only you can design: the tools, the prompts, the guardrails.

What the SDK Actually Provides

The claude-agent-sdk Python package (and @anthropic-ai/claude-agent-sdk for Node) gives you four things the raw anthropic SDK does not:

An async query() generator that runs the tool-use loop for you. You write async for msg in query(...) and the SDK handles stop_reason checking, tool dispatch, message-list bookkeeping, retries, and streaming.
The @tool decorator + create_sdk_mcp_server: tools are now async functions registered through an in-process MCP server. The schema comes from the name-to-type mapping you pass to the decorator (for example {"debtor_name": str}); you no longer hand-write JSON Schema dicts.
HookMatcher — a lifecycle hook system that fires PreToolUse and PostToolUse events, plus a can_use_tool permission callback. Guardrails stop being if-statements scattered through your loop and become declared, swappable functions.
.claude/agents/<name>.md — a file-based way to declare subagents (specialists with isolated context windows) that the coordinator can invoke by name. M14's hand-built coordinator pattern collapses to a directory of markdown files.

⚠️ Common Misconceptions

"Can't I just wrap client.messages.create() in a function called query() and call it the SDK?" — No. That's the trap the original M26 lab fell into. The real claude-agent-sdk manages the message stream, dispatches tool calls in MCP format, fires hooks at lifecycle points, and supports forking — none of which a wrapper around messages.create does. If you see code that imports from anthropic import Agent or uses @agent.tool / @agent.hook decorators, that is fictional API; the real package exposes query, tool, HookMatcher, and ClaudeAgentOptions imported from claude_agent_sdk.

"If I'm using the SDK, do I lose control?" — You lose loop boilerplate, not control. You still write the system prompt, the tools, the hooks, and the permission logic. The SDK runs the dispatch you used to type out by hand.

"Hooks and middleware are the same thing, right?" — Sort of. Hooks are middleware specifically for the agent's tool-use lifecycle: they fire before and after each tool call, with structured input. Web middleware is for HTTP requests; hooks are for tool calls. The pattern is the same, the events are different.

What Each Approach Handles For You

Before the animation, here's the concrete responsibility split. Anything in the "you write" column on the left is code you no longer maintain on the right.

| Concern | Raw `messages.create()` loop | `claude-agent-sdk` |
|---|---|---|
| Tool-use loop (read `stop_reason`, dispatch, append result, recurse) | you write (~60 lines) | built-in via `query()` |
| Tool schemas (JSON Schema dicts) | you write by hand | generated by `@tool` from a name-to-type mapping |
| Tool registry & routing | you write a dispatcher dict | `create_sdk_mcp_server` |
| Lifecycle interception (log, validate, redact) | scattered `if`-statements in the loop | `HookMatcher` Pre/PostToolUse |
| Per-call permission decisions | more `if`-statements before dispatch | `can_use_tool` callback |
| Multi-turn transcript bookkeeping | you maintain a messages list | `resume` token or compose yourself |
| Subagent declaration | you write a coordinator class | `.claude/agents/<name>.md` markdown |
| Custom mid-loop streaming, racing tools, vendor APM wrappers around every API call | yes, you have full control | SDK serializes & abstracts |

Use the raw loop when the bottom row applies — you need behavior the SDK abstracts away. Use the SDK for everything else, which is >90% of agent code in production.
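To make the "tool schemas" row concrete, here is a rough sketch of the translation involved: turning a name-to-type mapping, like the one passed to @tool, into the JSON Schema dict you would otherwise write by hand. The helper name to_json_schema, the type table, and the choice to mark every field as required are illustrative assumptions, not the SDK's internals.

```python
# Illustrative only: a hypothetical helper, NOT the SDK's implementation.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def to_json_schema(params: dict) -> dict:
    """Build a JSON Schema object from a {name: python_type} mapping."""
    return {
        "type": "object",
        "properties": {name: {"type": PY_TO_JSON[tp]} for name, tp in params.items()},
        # Assumption for this sketch: every listed parameter is required.
        "required": list(params),
    }

schema = to_json_schema({"debtor_name": str, "state": str})
print(schema)
```

In a raw loop, a dict of this shape is what you would maintain by hand for every tool.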

Manual vs SDK — Side-by-Side

Watch the same agent step through ten lines of work. The left pane is the M15B-style hand-rolled loop. The right pane is the SDK. Both produce the same answer; the right pane stays a third of the size while the left pane grows.

[Interactive animation: "Same agent, two stacks". Left pane: Manual loop (anthropic SDK). Right pane: claude-agent-sdk. Each pane shows a running "lines you wrote" counter.]

The right pane is shorter not because the SDK does less — it does more (hooks, sessions, subagents). It's shorter because you are no longer typing the loop. Now let's build a complete agent that way.

2. Build the UCC Agent with `claude-agent-sdk`

You're going to rebuild the same UCC filing research agent from M15B, but with the SDK as the entry point. The business behavior is identical — search filings by debtor name, fetch details, compute a risk score, return a narrative answer. What changes is everything around the business behavior.

What You'll Build (Lab Header)

What: A complete UCC research agent that answers "What is the lien exposure for <debtor>?" using three tools.
Time: 30–45 minutes for steps 1–3, plus 30–45 minutes for hooks/sessions/subagents in later sections.
Prerequisites: Python 3.10+, an ANTHROPIC_API_KEY in your environment, and Node 18+ if you want to run the TypeScript variant.
Files you'll create: mock_data.py, tools.py, run_agent.py in this section; session_manager.py + .claude/agents/risk-analyst.md in later sections.

Step 1: Install the SDK and create the project

5 min | setup | mock_data.py

What & Why: Install the real claude-agent-sdk package (not anthropic — that's the lower-level Messages API) and create a tiny mock data module so the tools have something to return. We use mock data instead of a live BigQuery connection so you can run the lab on a plane.

Run the shell commands first, then save the Python panel below as mock_data.py in the project folder you just created (m26-ucc-agent/mock_data.py).

mkdir m26-ucc-agent && cd m26-ucc-agent
python -m venv venv && source venv/bin/activate    # Windows: venv\Scripts\activate
pip install "claude-agent-sdk>=0.2"

# Optional: TypeScript variant
npm init -y && npm i @anthropic-ai/claude-agent-sdk zod tsx typescript

# Confirm the env var is set
echo $ANTHROPIC_API_KEY | head -c 12   # macOS/Linux  — should print your key prefix
echo %ANTHROPIC_API_KEY:~0,12%        # Windows cmd  — same idea
$env:ANTHROPIC_API_KEY.Substring(0,12) # Windows PowerShell

"""mock_data.py — 9 UCC filings for Acme + 2 noise records (Pinnacle + Sunrise)."""
FILINGS_DB = [
    {"filing_number": "NY-2024-0847", "debtor_name": "ACME CORPORATION",
     "state": "NY", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "FIRST NATIONAL BANK",
     "collateral": "All inventory and accounts receivable"},
    {"filing_number": "NY-2024-0848", "debtor_name": "ACME CORP",
     "state": "NY", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "JPMORGAN CHASE", "collateral": "Equipment"},
    {"filing_number": "NY-2024-0849", "debtor_name": "ACME CORPORATION INC",
     "state": "NY", "filing_type": "UCC3_AMENDMENT", "status": "ACTIVE",
     "secured_party": "JPMORGAN CHASE", "collateral": "Equipment + vehicles"},
    {"filing_number": "CA-2024-1201", "debtor_name": "ACME CORP",
     "state": "CA", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "BANK OF AMERICA", "collateral": "All assets"},
    {"filing_number": "CA-2024-1202", "debtor_name": "ACME CORPORATION",
     "state": "CA", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "CITIBANK NA", "collateral": "Inventory"},
    {"filing_number": "TX-2024-0903", "debtor_name": "ACME CORP",
     "state": "TX", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "PNC FINANCIAL", "collateral": "Equipment"},
    {"filing_number": "TX-2024-0904", "debtor_name": "ACME CORPORATION",
     "state": "TX", "filing_type": "UCC3_CONTINUATION", "status": "ACTIVE",
     "secured_party": "PNC FINANCIAL", "collateral": "Equipment"},
    {"filing_number": "FL-2024-0455",
     "debtor_name": "ACME CORP DBA ROADRUNNER SUPPLIES",
     "state": "FL", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "TRUIST FINANCIAL", "collateral": "Motor vehicles"},
    {"filing_number": "OH-2024-0301", "debtor_name": "ACME CORPORATION",
     "state": "OH", "filing_type": "UCC1", "status": "TERMINATED",
     "secured_party": "US BANCORP", "collateral": "Accounts receivable"},
    {"filing_number": "NY-2024-0501", "debtor_name": "PINNACLE INDUSTRIES",
     "state": "NY", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "TD BANK", "collateral": "Equipment"},
    {"filing_number": "CA-2024-1500", "debtor_name": "SUNRISE HOLDINGS LLC",
     "state": "CA", "filing_type": "UCC1", "status": "ACTIVE",
     "secured_party": "FIRST NATIONAL BANK", "collateral": "All assets"},
]

// mock_data.ts — same shape as the Python module.
export interface Filing {
  filing_number: string;
  debtor_name: string;
  state: string;
  filing_type: string;
  status: string;
  secured_party: string;
  collateral: string;
}

export const FILINGS_DB: Filing[] = [
  { filing_number: "NY-2024-0847", debtor_name: "ACME CORPORATION",
    state: "NY", filing_type: "UCC1", status: "ACTIVE",
    secured_party: "FIRST NATIONAL BANK",
    collateral: "All inventory and accounts receivable" },
  // ... (the same 11 records as the Python file)
];

Run: python -c "from mock_data import FILINGS_DB; print(len(FILINGS_DB))"

Expected output

11

Checkpoint — Step 1

You should see 11 records loaded. If you see ImportError, you're running from the wrong directory; cd into m26-ucc-agent.

Troubleshooting

ModuleNotFoundError: No module named 'claude_agent_sdk' → pip install "claude-agent-sdk>=0.2" in your activated venv.
ImportError: cannot import name 'Agent' from 'anthropic' → that import does not exist. The real SDK is claude_agent_sdk. If you see this in someone else's code, it's the fictional API.
ANTHROPIC_API_KEY not set → export ANTHROPIC_API_KEY=sk-ant-... (or the Windows equivalent).
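If you prefer one check that behaves the same on every platform, the environment variable can also be inspected from Python. This is a minimal sketch; the function name key_status is hypothetical, and it prints only a short prefix so the full key never lands in your terminal history.

```python
import os

def key_status(env=os.environ) -> str:
    """Describe whether the API key is set, showing at most a 12-character prefix."""
    key = env.get("ANTHROPIC_API_KEY", "")
    if not key:
        return "ANTHROPIC_API_KEY is not set"
    return f"ANTHROPIC_API_KEY is set (starts with {key[:12]}...)"

print(key_status())
```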

Step 2: Define the three tools

10 min | tools.py | 3 @tool functions + create_sdk_mcp_server

What & Why: Tools in the SDK are async Python functions wrapped with the @tool decorator. The decorator takes the tool name, a description Claude reads to decide when to call it, and a parameter schema. The function returns an MCP-shaped response: {"content": [{"type": "text", "text": ...}]}. Wrap them all in a create_sdk_mcp_server(...) call — that's the in-process MCP server the SDK will route tool calls through.

The three tools we're defining mirror what M15B built by hand: search_filings (partial debtor-name search), get_filing_details (lookup by filing number), and calculate_risk_score (mock score for a given entity).

"""tools.py — 3 SDK tools registered through an in-process MCP server."""
import json
from claude_agent_sdk import tool, create_sdk_mcp_server
from mock_data import FILINGS_DB

@tool(
    "search_filings",
    "Search UCC filings by debtor name. Supports partial matching across states. "
    "Use this FIRST to find candidate filings before fetching details.",
    {"debtor_name": str, "state": str},
)
async def search_filings(args):
    name = args["debtor_name"].upper()
    state = args.get("state")
    hits = [
        f for f in FILINGS_DB
        if name in f["debtor_name"].upper()
           and (not state or f["state"] == state)
    ]
    return {"content": [{"type": "text", "text": json.dumps(hits)}]}

@tool(
    "get_filing_details",
    "Get full details for a specific UCC filing by filing number.",
    {"filing_number": str},
)
async def get_filing_details(args):
    for f in FILINGS_DB:
        if f["filing_number"] == args["filing_number"]:
            return {"content": [{"type": "text", "text": json.dumps(f)}]}
    return {"content": [{"type": "text",
                          "text": json.dumps({"error": "not found"})}]}

@tool(
    "calculate_risk_score",
    "Calculate a delinquency risk score for an entity by aggregating its filings.",
    {"entity_name": str},
)
async def calculate_risk_score(args):
    name = args["entity_name"].upper()
    matches = [f for f in FILINGS_DB if name in f["debtor_name"].upper()]
    active = [f for f in matches if f["status"] == "ACTIVE"]
    states = {f["state"] for f in active}
    score = min(0.95, 0.15 * len(active) + 0.05 * len(states))
    out = {"entity_name": args["entity_name"],
           "active_filings": len(active),
           "states": sorted(states),
           "risk_score": round(score, 2),
           "risk_level": "HIGH" if score > 0.66 else "MEDIUM" if score > 0.33 else "LOW"}
    return {"content": [{"type": "text", "text": json.dumps(out)}]}

ucc_server = create_sdk_mcp_server(
    name="ucc_tools",
    version="1.0.0",
    tools=[search_filings, get_filing_details, calculate_risk_score],
)

// tools.ts
import { tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";
import { FILINGS_DB } from "./mock_data.js";

export const searchFilings = tool(
  "search_filings",
  "Search UCC filings by debtor name. Supports partial matching across states. " +
  "Use this FIRST to find candidate filings before fetching details.",
  { debtor_name: z.string(), state: z.string().optional() },
  async (args) => {
    const name = args.debtor_name.toUpperCase();
    const state = args.state;
    const hits = FILINGS_DB.filter(f =>
      f.debtor_name.toUpperCase().includes(name) &&
      (!state || f.state === state));
    return { content: [{ type: "text", text: JSON.stringify(hits) }] };
  }
);

export const getFilingDetails = tool(
  "get_filing_details",
  "Get full details for a specific UCC filing by filing number.",
  { filing_number: z.string() },
  async (args) => {
    const rec = FILINGS_DB.find(f => f.filing_number === args.filing_number)
              ?? { error: "not found" };
    return { content: [{ type: "text", text: JSON.stringify(rec) }] };
  }
);

export const calculateRiskScore = tool(
  "calculate_risk_score",
  "Calculate a delinquency risk score for an entity by aggregating its filings.",
  { entity_name: z.string() },
  async (args) => {
    const name = args.entity_name.toUpperCase();
    const matches = FILINGS_DB.filter(f => f.debtor_name.toUpperCase().includes(name));
    const active = matches.filter(f => f.status === "ACTIVE");
    const states = [...new Set(active.map(f => f.state))];
    const score = Math.min(0.95, 0.15 * active.length + 0.05 * states.length);
    const level = score > 0.66 ? "HIGH" : score > 0.33 ? "MEDIUM" : "LOW";
    return { content: [{ type: "text", text: JSON.stringify({
      entity_name: args.entity_name, active_filings: active.length,
      states, risk_score: Math.round(score * 100) / 100, risk_level: level,
    }) }] };
  }
);

export const uccServer = createSdkMcpServer({
  name: "ucc_tools",
  tools: [searchFilings, getFilingDetails, calculateRiskScore],
});

Run: python -c "from tools import ucc_server; print('OK,', len(ucc_server.tools), 'tools registered')"

Expected output

OK, 3 tools registered

Why we don't call the tool directly here: the @tool decorator wraps your function as an SdkMcpTool object so the SDK can route MCP calls to it — the wrapped object is no longer a plain async function. End-to-end behavior is verified in Step 3 by running the full agent.

Checkpoint — Step 2

You should see OK, 3 tools registered. The agent in Step 3 will then exercise all three: a partial search for "acme" should pull 9 ACME variants (including the DBA), get_filing_details should look up by filing number, and calculate_risk_score should aggregate them into a HIGH/MEDIUM/LOW score.
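The partial-match behavior can be checked without the SDK. The sketch below copies only the debtor names from mock_data.py and applies the same case-insensitive substring test that search_filings uses; count_matches is a hypothetical helper for this illustration.

```python
# Debtor names copied from mock_data.py, in file order.
DEBTOR_NAMES = [
    "ACME CORPORATION", "ACME CORP", "ACME CORPORATION INC", "ACME CORP",
    "ACME CORPORATION", "ACME CORP", "ACME CORPORATION",
    "ACME CORP DBA ROADRUNNER SUPPLIES", "ACME CORPORATION",
    "PINNACLE INDUSTRIES", "SUNRISE HOLDINGS LLC",
]

def count_matches(query: str) -> int:
    """Count names containing the query, ignoring case (same rule as search_filings)."""
    q = query.upper()
    return sum(q in name for name in DEBTOR_NAMES)

for q in ("acme", "Acme Corp", "Acme Corporation"):
    print(q, "->", count_matches(q))
```

Note that "Acme Corporation" finds fewer records than "acme" or "Acme Corp", because "ACME CORPORATION" is not a substring of "ACME CORP". This is why the agent needs to try shorter name variants.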

Troubleshooting

TypeError: tool() takes 3 positional arguments but 4 were given → the schema dict is the third and last argument. In Python the function is not passed to tool() at all; @tool(...) decorates it. In TS the handler function is the fourth argument of tool().
tool returned non-MCP content → you returned a Python dict directly. Wrap it: {"content": [{"type": "text", "text": json.dumps(your_dict)}]}.
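Since all three tools repeat the same wrapping, one option is a small helper. This is a sketch only; as_mcp_text is a hypothetical name, not part of the SDK.

```python
import json

def as_mcp_text(payload) -> dict:
    """Wrap any JSON-serializable value in the MCP text-content shape."""
    return {"content": [{"type": "text", "text": json.dumps(payload)}]}

result = as_mcp_text({"error": "not found"})
print(result)
```

Each tool body could then end with return as_mcp_text(hits) instead of spelling out the nested dict.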

What Just Happened?

You just registered three tools as an in-process MCP server. The SDK will start that server when query() runs and route any tool call whose name starts with mcp__ucc__ to your async functions. You did not write a JSON Schema, a tool dispatcher, or a tool_result formatter: the schema comes from the name-to-type mapping you passed to @tool, and the SDK handles routing and result formatting.

Step 3: Run the agent via query()

10 min | run_agent.py | The agent loop you no longer write

What & Why: query() is the SDK's entry point. You give it a prompt string (or a structured message list) and a ClaudeAgentOptions object describing how the agent should run. query() returns an async generator of messages — assistant text blocks, tool-use events, tool results, and a final result message with token counts. You iterate over them and pull out what you need.

Three options matter most: system_prompt (the instructions that shape behavior), mcp_servers (a dict mapping server names to MCP server objects — so the SDK knows where to dispatch tool calls), and allowed_tools (an explicit allowlist of tool names in mcp__<server>__<tool> format). Without allowed_tools, the agent can't call your tools at all.

"""run_agent.py — the SDK entry point. ~25 lines, no manual loop."""
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage
from tools import ucc_server

OPTIONS = ClaudeAgentOptions(
    model="claude-sonnet-4-6",
    system_prompt=(
        "You are a UCC filing research agent. When given a business name, "
        "search thoroughly using name variations including abbreviations and "
        "DBAs. Then call calculate_risk_score and write a narrative report "
        "citing specific filings."
    ),
    mcp_servers={"ucc": ucc_server},
    allowed_tools=[
        "mcp__ucc__search_filings",
        "mcp__ucc__get_filing_details",
        "mcp__ucc__calculate_risk_score",
    ],
    max_turns=8,
)

async def main():
    final_text = ""
    async for msg in query(
        prompt="What is the lien exposure for Acme Corporation?",
        options=OPTIONS,
    ):
        if isinstance(msg, AssistantMessage):
            for block in msg.content:
                if hasattr(block, "text") and block.text:
                    final_text = block.text
    print(final_text)

if __name__ == "__main__":
    asyncio.run(main())

// run_agent.ts
import { query } from "@anthropic-ai/claude-agent-sdk";
import { uccServer } from "./tools.js";

const OPTIONS = {
  model: "claude-sonnet-4-6",
  systemPrompt: "You are a UCC filing research agent. When given a business name, " +
                "search thoroughly using name variations including abbreviations and DBAs. " +
                "Then call calculate_risk_score and write a narrative report citing specific filings.",
  mcpServers: { ucc: uccServer },
  allowedTools: [
    "mcp__ucc__search_filings",
    "mcp__ucc__get_filing_details",
    "mcp__ucc__calculate_risk_score",
  ],
  maxTurns: 8,
};

async function main() {
  let finalText = "";
  for await (const msg of query({
    prompt: "What is the lien exposure for Acme Corporation?",
    options: OPTIONS,
  })) {
    if (msg.type === "assistant") {
      for (const block of msg.message.content) {   // content blocks live on the wrapped API message
        if ("text" in block && block.text) finalText = block.text;
      }
    }
  }
  console.log(finalText);
}

main();

Run: python run_agent.py (or npx tsx run_agent.ts)

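Expected output (paraphrased; Claude generates fresh text each run. The filing counts come straight from the mock data. The 0.95 score assumes the agent passes a broad name such as "Acme" to calculate_risk_score; "Acme Corporation" alone matches only 4 active filings in 3 states and scores 0.75, still HIGH)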

Acme Corporation has 9 UCC filings on file — 8 ACTIVE across 4 states (NY:3, CA:2, TX:2, FL:1) and 1 TERMINATED in OH. Active secured-party banks include JPMorgan Chase (2 filings), PNC Financial (2 filings), Bank of America, Citibank NA, First National Bank, and Truist Financial. The Florida filing is registered under the DBA "ACME CORP DBA ROADRUNNER SUPPLIES" — easy to miss in a naive name search. Risk score: 0.95 (HIGH, capped). Driven by 8 active filings across 4 states. Recommend deeper review of CA-2024-1201 (all assets) and TX-2024-0904 (recent UCC-3 continuation, which extends the original lien for another 5 years).

Checkpoint — Step 3

You should see a narrative report mentioning multiple Acme filings across states, the DBA variant, and a HIGH/MEDIUM/LOW risk verdict. If you see 0 filings found, the agent isn't trying name variations — tighten the system prompt to explicitly say "search using ACME, ACME CORP, ACME CORPORATION, and DBA forms."
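The risk verdict depends on which name the agent passes to calculate_risk_score, so it helps to work the arithmetic by hand. The sketch below repeats the scoring formula from tools.py over the nine Acme records, reduced to name, state, and status; the function name risk is just for this illustration.

```python
# (debtor_name, state, status) for the 9 Acme records in mock_data.py
ACME = [
    ("ACME CORPORATION", "NY", "ACTIVE"),
    ("ACME CORP", "NY", "ACTIVE"),
    ("ACME CORPORATION INC", "NY", "ACTIVE"),
    ("ACME CORP", "CA", "ACTIVE"),
    ("ACME CORPORATION", "CA", "ACTIVE"),
    ("ACME CORP", "TX", "ACTIVE"),
    ("ACME CORPORATION", "TX", "ACTIVE"),
    ("ACME CORP DBA ROADRUNNER SUPPLIES", "FL", "ACTIVE"),
    ("ACME CORPORATION", "OH", "TERMINATED"),
]

def risk(entity_name: str):
    """Return (active filings, distinct states, score) using the tools.py formula."""
    name = entity_name.upper()
    active = [(n, s) for n, s, status in ACME if name in n and status == "ACTIVE"]
    states = {s for _, s in active}
    score = min(0.95, 0.15 * len(active) + 0.05 * len(states))
    return len(active), len(states), round(score, 2)

print(risk("Acme"))              # 8 active, 4 states: 1.20 + 0.20 = 1.40, capped at 0.95
print(risk("Acme Corporation"))  # 4 active, 3 states: 0.60 + 0.15 = 0.75
```

Both results are above the 0.66 threshold, so the verdict is HIGH either way, but the exact score in the report tells you which name variant the agent used.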

Troubleshooting

Agent calls no tools at all → check that allowed_tools uses the exact mcp__<server>__<tool> format. The server name is what you passed to mcp_servers={"ucc": ...}.
Agent runs forever / hits max_turns → the system prompt is too vague. Tell it explicitly when to stop ("once you have searched and scored, write the report and stop").
RuntimeError: asyncio.run() cannot be called from a running event loop → you're inside a Jupyter notebook, which already runs an event loop. Use await main() instead of asyncio.run(main()).
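A common source of the "calls no tools" symptom is a typo in the prefixed names. One way to avoid it is to build the allowlist from the server key and the bare tool names, following the mcp__<server>__<tool> format described above. The helper qualified_tool_name is hypothetical, not an SDK function.

```python
def qualified_tool_name(server_key: str, tool_name: str) -> str:
    """Build the mcp__<server>__<tool> name; server_key is the key used in mcp_servers."""
    return f"mcp__{server_key}__{tool_name}"

TOOLS = ["search_filings", "get_filing_details", "calculate_risk_score"]

# "ucc" is the dict key from mcp_servers={"ucc": ucc_server},
# not the name="ucc_tools" passed to create_sdk_mcp_server.
allowed_tools = [qualified_tool_name("ucc", t) for t in TOOLS]
print(allowed_tools)
```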

What Just Happened?

You just ran a tool-using agent in about 25 lines. The SDK's query() handled: turning your prompt into a Messages API call, parsing the tool_use blocks Claude returned, dispatching them to your async tool functions, formatting the tool_results back into Messages format, looping until stop_reason hit end_turn, and yielding messages to your async for as they happened. The M15B equivalent was ~60 lines of loop code.
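The consuming side of run_agent.py keeps only the last non-empty text block it sees. That pattern can be tried offline with stand-ins. Everything prefixed Fake below, and fake_query itself, are hypothetical substitutes for the SDK's message types and for query(); only the loop body mirrors the lab code.

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class FakeTextBlock:            # stand-in for an SDK text block
    text: str

@dataclass
class FakeToolUseBlock:         # stand-in for a tool-use block (has no .text)
    name: str

@dataclass
class FakeAssistantMessage:     # stand-in for AssistantMessage
    content: list = field(default_factory=list)

async def fake_query():
    """Yield messages in the order an agent run might: interim text, tool use, final text."""
    yield FakeAssistantMessage([FakeTextBlock("Searching for Acme filings...")])
    yield FakeAssistantMessage([FakeToolUseBlock("mcp__ucc__search_filings")])
    yield FakeAssistantMessage([FakeTextBlock("Final report: 8 active filings.")])

async def collect_final_text() -> str:
    final_text = ""
    async for msg in fake_query():
        if isinstance(msg, FakeAssistantMessage):
            for block in msg.content:
                if hasattr(block, "text") and block.text:
                    final_text = block.text   # later text overwrites earlier text
    return final_text

if __name__ == "__main__":
    print(asyncio.run(collect_final_text()))
```

Because each text block overwrites the previous one, interim "thinking aloud" text is discarded and only the closing report is printed. If you want the whole narrative, append to a list instead.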

Step 4: Side-by-side — the M15B raw loop vs the SDK driver

5 min | diff exercise | no new files

What & Why: This step is a pencil-down comparison. Open M15B's agent_loop.py next to your new run_agent.py and count the lines that vanished. The point isn't to write more code — it's to recognize what the SDK is replacing so you can defend the choice on the cert exam and in code review.

# M15B agent_loop.py — manual tool-use loop (excerpt)
import json                                       # needed for json.dumps below
from anthropic import Anthropic
from tools import TOOL_SCHEMAS, dispatch_tool   # you wrote both

client = Anthropic()
messages = [{"role": "user", "content": "What is the lien exposure for Acme?"}]

for turn in range(8):                                    # max_turns
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        system="You are a UCC research agent...",
        tools=TOOL_SCHEMAS,                              # hand-written JSON Schema
        messages=messages,
        max_tokens=2048,
    )
    messages.append({"role": "assistant", "content": resp.content})

    if resp.stop_reason == "end_turn":                   # you check
        break
    if resp.stop_reason != "tool_use":
        raise RuntimeError(f"unexpected stop: {resp.stop_reason}")

    tool_results = []
    for block in resp.content:                           # you parse
        if block.type != "tool_use":
            continue
        try:
            result = dispatch_tool(block.name, block.input)   # you dispatch
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result),
            })
        except Exception as e:
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"error: {e}",
                "is_error": True,
            })
    messages.append({"role": "user", "content": tool_results})  # you format

# Print the last assistant text block
final = next((b.text for b in resp.content if b.type == "text"), "")
print(final)

# M26 run_agent.py — SDK driver (the loop is gone)
import asyncio
from claude_agent_sdk import query, AssistantMessage
# OPTIONS is the ClaudeAgentOptions object defined earlier in this file (Step 3)

async def main():
    final = ""
    async for msg in query(
        prompt="What is the lien exposure for Acme Corporation?",
        options=OPTIONS,
    ):
        if isinstance(msg, AssistantMessage):
            for block in msg.content:
                if hasattr(block, "text") and block.text:
                    final = block.text
    print(final)

asyncio.run(main())

Run: wc -l agent_loop.py run_agent.py (Windows: (Get-Content run_agent.py).Count)

Expected output

62 agent_loop.py
18 run_agent.py
80 total

Checkpoint — Step 4

You can name three things the SDK driver no longer does that the raw loop did: (1) stop_reason branching, (2) tool-result formatting into {"type": "tool_result", "tool_use_id": ...} dicts, (3) the messages.append bookkeeping. If you can't, re-read the diff — this is a Domain 1 cert question.
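The raw loop imports a dispatch_tool function that "you wrote". M15B's version is not reproduced in this module, so the following is only a guess at its general shape: a registry dict mapping tool names to plain functions. The function bodies are placeholders.

```python
def search_filings(debtor_name: str, state: str = "") -> list:
    return [f"stub result for {debtor_name}"]      # placeholder body

def get_filing_details(filing_number: str) -> dict:
    return {"filing_number": filing_number}        # placeholder body

# The "dispatcher dict": tool name -> Python callable.
TOOL_REGISTRY = {
    "search_filings": search_filings,
    "get_filing_details": get_filing_details,
}

def dispatch_tool(name: str, tool_input: dict):
    """Route a tool_use block (name + input dict) to the matching function."""
    try:
        fn = TOOL_REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown tool: {name}") from None
    return fn(**tool_input)

print(dispatch_tool("get_filing_details", {"filing_number": "NY-2024-0847"}))
```

In the SDK version this registry is the tools=[...] list passed to create_sdk_mcp_server, and the routing function is no longer your code.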

Troubleshooting

Don't have M15B's agent_loop.py handy? → the snippet above is faithful to it. Compare against your memory or against the M15B lab's solution/ folder.
Tempted to mix the two? → don't. Either commit to the SDK or keep the raw loop. Mixing means you maintain both code paths and lose the SDK's hooks/sessions/subagents.

3. Hooks — `HookMatcher` for Pre/Post Tool Use

Hooks Are the Building's Sprinkler System

When you wire a building, the electrician adds outlets where appliances will plug in. Later, the fire-protection company adds sprinklers and smoke detectors — in the same ceilings, but on a separate system, with their own logic. The sprinklers don't ask the appliances for permission to fire; they listen to a temperature sensor and act.

That's what hooks are. The SDK's tool dispatch is the appliance circuit. Hooks are the sprinkler system layered on top: they fire before a tool runs (PreToolUse) or after it returns (PostToolUse), based on a name pattern, and they decide whether to log it, modify it, or block it.

Concretely: a PreToolUse hook with matcher="*" is a smoke detector wired to every room. A PreToolUse hook with matcher="mcp__ucc__search_filings" is one wired only to the kitchen. A can_use_tool callback is the sprinkler valve — it can hard-deny the tool call before it ever reaches the function.
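As a mental model for the two matchers in the analogy, the sketch below assumes the simplest possible semantics: "*" matches every tool and any other pattern must equal the tool name exactly. The SDK's real matching rules may be richer, so treat matcher_applies as a hypothetical illustration only.

```python
def matcher_applies(matcher: str, tool_name: str) -> bool:
    """Assumed semantics: "*" matches everything; otherwise require an exact name match."""
    return matcher == "*" or matcher == tool_name

calls = ["mcp__ucc__search_filings", "mcp__ucc__calculate_risk_score"]
for name in calls:
    print(name,
          "| every-room detector:", matcher_applies("*", name),
          "| kitchen-only detector:", matcher_applies("mcp__ucc__search_filings", name))
```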

Hook Lifecycle in Action

Watch a single tool call propagate through the lifecycle. The agent decides to call search_filings; the SDK matches both hooks; PreToolUse fires (logs the call); the tool runs; PostToolUse fires (redacts PII); the result is appended to the message stream.

[Diagram: PreToolUse → tool → PostToolUse. Stages: claude (tool_use) → HookMatcher (match name) → PreToolUse (log_call) → @tool (search_filings) → PostToolUse (redact_pii) → stream (tool_result)]
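The ordering in the lifecycle can be reproduced with a toy dispatcher. This is a simulation under stated assumptions, not the SDK's dispatch code: run_tool_with_hooks is a hypothetical function, and the hooks simply record when they fire.

```python
import asyncio

events = []   # records the order in which each stage fires

async def log_call(input_data, tool_use_id, context):
    events.append(f"PRE {input_data['tool_name']}")
    return {}

async def redact_pii(input_data, tool_use_id, context):
    events.append(f"POST {input_data['tool_name']}")
    return {}

async def search_filings(args):
    events.append("TOOL search_filings")
    return {"content": [{"type": "text", "text": "[]"}]}

async def run_tool_with_hooks(tool_name, tool_fn, tool_input, pre_hooks, post_hooks):
    """Toy dispatcher: run pre hooks, then the tool, then post hooks."""
    call = {"tool_name": tool_name, "tool_input": tool_input}
    for hook in pre_hooks:
        await hook(call, "toolu_demo", None)
    response = await tool_fn(tool_input)
    for hook in post_hooks:
        await hook({**call, "tool_response": response}, "toolu_demo", None)
    return response

async def demo():
    events.clear()
    await run_tool_with_hooks("search_filings", search_filings,
                              {"debtor_name": "ACME"}, [log_call], [redact_pii])
    return list(events)

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the sketch is the separation: the tool function never references the hooks, and the hooks never call the tool.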

Step 5: Add a PreToolUse logging hook

10 min | extends run_agent.py

What & Why: A PreToolUse hook fires before each tool dispatch. Its three parameters are input_data (a dict containing tool_name and tool_input), tool_use_id (the unique id of this call — matches the tool_use_id in the eventual tool_result), and context (general SDK context). Returning an empty dict {} means "continue without modification." Returning a payload with hookSpecificOutput can deny the call — we'll use that in Step 7.

You register hooks via HookMatcher: an object that pairs a matcher pattern (a tool name or "*" for all) with a list of hook functions. Multiple matchers can stack — they fire in registration order.

What to change in run_agent.py: add the new imports + log_tool_call function near the top, then replace your existing OPTIONS = ClaudeAgentOptions(...) block with the version below (same fields as Step 3 plus a new hooks={} kwarg).

# Add these two imports to the top of run_agent.py:
from datetime import datetime, timezone
from claude_agent_sdk import HookMatcher

# Add this hook function above the OPTIONS definition:
async def log_tool_call(input_data, tool_use_id, context):
    # Timezone-aware UTC timestamp at second precision, e.g. 2026-05-09T08:55:14Z
    # (datetime.utcnow() is deprecated as of Python 3.12)
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds").replace("+00:00", "Z")
    name = input_data.get("tool_name")
    params = input_data.get("tool_input")
    print(f"[{ts}] PRE  {name}({params})")
    return {}   # empty dict = continue without modification

# Replace your existing OPTIONS = ClaudeAgentOptions(...) block with this version.
# Only the new `hooks=` kwarg is different from Step 3.
OPTIONS = ClaudeAgentOptions(
    model="claude-sonnet-4-6",
    system_prompt=(
        "You are a UCC filing research agent. When given a business name, "
        "search thoroughly using name variations including abbreviations and "
        "DBAs. Then call calculate_risk_score and write a narrative report "
        "citing specific filings."
    ),
    mcp_servers={"ucc": ucc_server},
    allowed_tools=[
        "mcp__ucc__search_filings",
        "mcp__ucc__get_filing_details",
        "mcp__ucc__calculate_risk_score",
    ],
    hooks={
        "PreToolUse": [HookMatcher(matcher="*", hooks=[log_tool_call])],
    },
    max_turns=8,
)

const logToolCall = async (inputData: any, toolUseId: string, ctx: any) => {
  const ts = new Date().toISOString();
  console.log(`[${ts}] PRE  ${inputData.tool_name}(${JSON.stringify(inputData.tool_input)})`);
  return {};
};

const OPTIONS = {
  model: "claude-sonnet-4-6",
  systemPrompt: "...",                  // same as Step 3
  mcpServers: { ucc: uccServer },
  allowedTools: [/* same as Step 3 */],
  hooks: {
    PreToolUse: [{ matcher: "*", hooks: [logToolCall] }],
  },
  maxTurns: 8,
};

Run: python run_agent.py — you should now see a [timestamp] PRE line for every tool call.

Expected output (interleaved with the agent's narrative; exact tool args vary as Claude reasons)

[2026-05-09T08:55:14Z] PRE  mcp__ucc__search_filings({'debtor_name': 'Acme Corporation'})
[2026-05-09T08:55:16Z] PRE  mcp__ucc__search_filings({'debtor_name': 'ACME CORP'})
[2026-05-09T08:55:18Z] PRE  mcp__ucc__calculate_risk_score({'entity_name': 'Acme Corporation'})
<narrative report follows>

Checkpoint — Step 5

You see one PRE log line per tool call. If you see no log lines but the tools still run, your hook isn't registered — double-check that hooks={"PreToolUse": [HookMatcher(...)]} is inside ClaudeAgentOptions, not outside.
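Because a hook is an ordinary async function, you can exercise it without running the agent or spending tokens. The sketch below calls a copy of the logging hook (written here with a timezone-aware timestamp) using a hand-built payload; the payload keys follow the description above, and the tool_use_id value is made up.

```python
import asyncio
import contextlib
import io
from datetime import datetime, timezone

async def log_tool_call(input_data, tool_use_id, context):
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds").replace("+00:00", "Z")
    print(f"[{ts}] PRE  {input_data.get('tool_name')}({input_data.get('tool_input')})")
    return {}

def call_hook_once():
    """Invoke the hook with a hand-built payload and capture what it prints."""
    fake_input = {"tool_name": "mcp__ucc__search_filings",
                  "tool_input": {"debtor_name": "ACME CORP"}}
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        result = asyncio.run(log_tool_call(fake_input, "toolu_test", None))
    return result, buf.getvalue()

if __name__ == "__main__":
    result, printed = call_hook_once()
    print("returned:", result)
    print("printed: ", printed, end="")
```

If this prints a PRE line but the full agent run does not, the function is fine and the registration in ClaudeAgentOptions is the place to look.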

Step 6: Add a PostToolUse PII redaction hook (so PII never reaches Claude)

10 min | extends run_agent.py

What & Why: A PostToolUse hook fires after a tool returns but before the result is appended to the message stream the agent reads. The hook receives the tool's response in input_data["tool_response"]; whatever you return as tool_response in your output dict is what the agent actually sees. So this hook scrubs PII from what flows back to the agent — which means PII never enters the message history, the trace, or any downstream consumer. (To also redact the stdout log line from Step 5, add the same regex sub inside log_tool_call before the print.)

Realistic UCC filings don't contain SSNs or phone numbers, but the redaction pattern is the same one you'd use in healthcare or finance contexts — so it's worth practicing here.

What to change in run_agent.py: add the imports + redact_pii function near the top (above OPTIONS), then update the hooks={} dict in your existing OPTIONS block to add the new PostToolUse entry.

# Add to the top of run_agent.py:
import re

SSN_RE   = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")

# Add this hook function above OPTIONS (next to log_tool_call):
async def redact_pii(input_data, tool_use_id, context):
    out = input_data.get("tool_response", "")
    if isinstance(out, str):
        out = SSN_RE.sub("[SSN REDACTED]", out)
        out = PHONE_RE.sub("[PHONE REDACTED]", out)
    return {"tool_response": out}

# Update the hooks={} dict inside your existing OPTIONS to add PostToolUse:
hooks={
    "PreToolUse":  [HookMatcher(matcher="*", hooks=[log_tool_call])],
    "PostToolUse": [HookMatcher(matcher="*", hooks=[redact_pii])],
},
# (leave the rest of OPTIONS — model, system_prompt, mcp_servers,
#  allowed_tools, max_turns — unchanged from Step 5)

const SSN_RE   = /\b\d{3}-\d{2}-\d{4}\b/g;
const PHONE_RE = /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g;

const redactPii = async (inputData: any, toolUseId: string, ctx: any) => {
  let out = inputData.tool_response;
  if (typeof out === "string") {
    out = out.replace(SSN_RE, "[SSN REDACTED]")
             .replace(PHONE_RE, "[PHONE REDACTED]");
  }
  return { tool_response: out };
};

const OPTIONS = {
  // ... existing fields ...
  hooks: {
    PreToolUse:  [{ matcher: "*", hooks: [logToolCall] }],
    PostToolUse: [{ matcher: "*", hooks: [redactPii] }],
  },
};

Checkpoint — Step 6

Your run still produces the same narrative; the redactor is a no-op on UCC data because there are no SSNs to find. To prove the hook is wired, temporarily add "phone: 555-123-4567" to one filing's collateral field and re-run — you should see [PHONE REDACTED] in the agent's output.

Troubleshooting

Redaction doesn't show up → the hook only edits string responses. If a tool returns a dict, JSON-stringify it first or extend the hook to walk the dict.
Logger still prints SSNs → PostToolUse runs after the tool, but the PreToolUse logger from Step 5 prints the input — if PII could appear in tool inputs (e.g., user-supplied SSN), redact inside log_tool_call too.
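
The "walk the dict" option from the first troubleshooting item could look like the sketch below. `redact_value` is a hypothetical helper, not part of the SDK; it assumes tool responses are built from plain dicts, lists, and strings, and it reuses the two regexes from this step.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")


def redact_value(value):
    """Return a copy of value with SSN/phone patterns masked in every string."""
    if isinstance(value, str):
        value = SSN_RE.sub("[SSN REDACTED]", value)
        return PHONE_RE.sub("[PHONE REDACTED]", value)
    if isinstance(value, dict):
        return {key: redact_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [redact_value(item) for item in value]
    return value  # numbers, booleans, None pass through unchanged


# Hypothetical nested response, shaped like the tool results in this module.
sample = {"content": [{"type": "text",
                       "text": "call 555-123-4567, SSN 123-45-6789"}]}
print(redact_value(sample))
```

Inside redact_pii, the string-only branch could then be swapped for a single call to `redact_value(input_data.get("tool_response"))`, keeping the same return shape the hook already uses.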

Step 7: Block broad/dangerous queries with can_use_tool

10 min · extends run_agent.py

What & Why: Hooks observe and modify. The can_use_tool callback decides. Although it is declared async, the SDK awaits it before each tool dispatch (after the matching PreToolUse hook fires), so the tool cannot run until the callback returns a permission verdict: PermissionResultAllow() to let the call through, or PermissionResultDeny(message="...") to refuse it. The denial message is fed back to Claude so it can adapt; for example, it might pick a different tool or reformulate its query.

This is the cert-tested permission primitive. The exam scenarios always involve a "user-controlled input feeds into a tool" situation where you need to gate based on the runtime value, not the tool name alone.

What to change in run_agent.py: add the imports + gate function near the top, then add a new can_use_tool=gate kwarg to your existing OPTIONS (next to hooks=).

# Add to the imports at the top of run_agent.py:
from claude_agent_sdk import PermissionResultAllow, PermissionResultDeny

# Add this gate function above OPTIONS:
async def gate(tool_name, tool_input, context):
    # Block search_filings calls that would scan everything.
    if tool_name == "mcp__ucc__search_filings":
        q = (tool_input.get("debtor_name") or "").strip()
        if len(q) < 3:
            return PermissionResultDeny(
                message=f"Query too broad: '{q}' is < 3 chars. "
                        "Provide a longer name fragment."
            )
    return PermissionResultAllow()

# Add can_use_tool=gate to your existing OPTIONS block:
OPTIONS = ClaudeAgentOptions(
    model="claude-sonnet-4-6",
    system_prompt="...",                  # unchanged from Step 6
    mcp_servers={"ucc": ucc_server},
    allowed_tools=[                       # unchanged from Step 6
        "mcp__ucc__search_filings",
        "mcp__ucc__get_filing_details",
        "mcp__ucc__calculate_risk_score",
    ],
    hooks={                               # unchanged from Step 6
        "PreToolUse":  [HookMatcher(matcher="*", hooks=[log_tool_call])],
        "PostToolUse": [HookMatcher(matcher="*", hooks=[redact_pii])],
    },
    can_use_tool=gate,                    # <-- new in Step 7
    max_turns=8,
)

import { PermissionResultAllow, PermissionResultDeny }
       from "@anthropic-ai/claude-agent-sdk";

const gate = async (toolName: string, toolInput: any, ctx: any) => {
  if (toolName === "mcp__ucc__search_filings") {
    const q = (toolInput.debtor_name ?? "").trim();
    if (q.length < 3) {
      return new PermissionResultDeny({
        message: `Query too broad: '${q}' is < 3 chars. ` +
                 "Provide a longer name fragment.",
      });
    }
  }
  return new PermissionResultAllow();
};

const OPTIONS = { /* ... */ canUseTool: gate };

Run: In run_agent.py, change the prompt= argument inside the async for msg in query(...) call from "What is the lien exposure for Acme Corporation?" to "What about company A?", then rerun python run_agent.py. You should see the agent attempt search_filings({"debtor_name": "A"}), get denied by the gate (the SDK feeds your denial message back as a tool error), then either reformulate the search with a longer fragment or apologize that it can't search on a single character.

Checkpoint — Step 7

A 1-character query is denied with your custom message. The agent reads the denial and adapts. If the gate doesn't fire, check that can_use_tool=gate is a top-level kwarg on ClaudeAgentOptions (not nested inside hooks).
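
Because the gate's rule depends only on the tool name and its input, it can be checked without calling the API. The sketch below pulls the rule into a plain function. `denial_reason` is a hypothetical helper name, and the SDK's permission result classes are deliberately left out so the snippet runs on its own.

```python
from typing import Optional


def denial_reason(tool_name: str, tool_input: dict) -> Optional[str]:
    """Return a denial message for overly broad searches, else None."""
    if tool_name == "mcp__ucc__search_filings":
        q = (tool_input.get("debtor_name") or "").strip()
        if len(q) < 3:
            return (f"Query too broad: '{q}' is < 3 chars. "
                    "Provide a longer name fragment.")
    return None


# The 1-character query from this step is refused; a longer name passes,
# and tools other than search_filings are never blocked by this rule.
print(denial_reason("mcp__ucc__search_filings", {"debtor_name": "A"}))
print(denial_reason("mcp__ucc__search_filings", {"debtor_name": "Acme"}))
print(denial_reason("mcp__ucc__get_filing_details", {"filing_number": "X"}))
```

In gate, the result would map onto the two verdicts: a non-None reason becomes PermissionResultDeny(message=reason), anything else becomes PermissionResultAllow().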

⚠️ can_use_tool vs PreToolUse hooks

"Aren't they the same thing?" — Close but not identical. A PreToolUse hook runs and can side-effect (log, modify input), but its return value is mostly informational. can_use_tool is a binary gate: Allow or Deny, with the denial reason fed back to Claude for adaptation. Use hooks for instrumentation; use can_use_tool for authorization decisions.

4. Sessions — Multi-Turn Flows You Control

The SDK does not ship a Session class. The "session" is the message stream returned by query(). Multi-turn flows are something you compose yourself by managing the prompt — either by maintaining a transcript and passing it to the next query() call, or by using the SDK's resume token to continue a prior session id.

Two Ways to Do Multi-Turn

Pattern A — transcript composition: keep the running USER/ASSISTANT pairs in a list. On each new turn, build a single prompt string from the transcript plus the new user input, and pass that to query(). The agent doesn't know about prior turns; it sees one big prompt.

Pattern B — resume tokens: the SDK emits a session_id on the final result message of each query. Pass that id back as ClaudeAgentOptions(resume=session_id) on the next call — the SDK reconstructs the message context server-side and you only send the new user turn.

Pattern A is simpler to teach and easier to fork (deep-copy the transcript list and you have a what-if branch). Pattern B is more efficient for long sessions because it doesn't re-send the full history. The cert exam tests both.
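
To make the efficiency difference concrete, the sketch below counts how many prompt characters your code sends over a session when the whole transcript is re-sent each turn (Pattern A), compared with sending only each new user turn. The turn sizes are made-up illustration values, and character counts are only a rough proxy for tokens.

```python
def chars_sent_pattern_a(turns: list[tuple[str, str]]) -> int:
    """Total prompt characters when every call re-sends the whole transcript."""
    transcript: list[str] = []
    total = 0
    for user_input, reply in turns:
        prompt = "\n\n".join(transcript + [f"USER: {user_input}"])
        total += len(prompt)
        transcript += [f"USER: {user_input}", f"ASSISTANT: {reply}"]
    return total


def chars_sent_new_turn_only(turns: list[tuple[str, str]]) -> int:
    """Total prompt characters when each call sends just the new user turn."""
    return sum(len(user_input) for user_input, _ in turns)


# Hypothetical session: 10 turns, 50-char questions, 800-char answers.
turns = [("q" * 50, "a" * 800)] * 10
print(chars_sent_pattern_a(turns), chars_sent_new_turn_only(turns))
```

The first total grows roughly with the square of the number of turns, because turn N carries all N-1 earlier exchanges; the second grows linearly.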

Step 8: session.send() for multi-turn + session.fork() for what-if branches

15 min · session_manager.py

What & Why: We're building Pattern A because forking is the more interesting cert-tested concept, and forking with transcripts is a one-line list(self.transcript) copy. (That copy is shallow, which is all a list of immutable strings needs.) Pattern B forking would require coordinating session ids on the SDK side. Pattern A is enough to demonstrate the concept the exam asks about.

The class wraps three things: a list of USER/ASSISTANT lines (the transcript), a send method that appends a turn and runs query(), and a fork method that returns a new SessionManager with the same options and its own copy of the transcript, so changes in the fork don't contaminate the parent.

"""session_manager.py — thin multi-turn wrapper over query()."""
from claude_agent_sdk import query, AssistantMessage

class SessionManager:
    def __init__(self, options):
        self.options = options
        self.transcript: list[str] = []

    async def send(self, user_input: str) -> str:
        # Build a single prompt from prior turns plus the new input.
        prompt = "\n\n".join(self.transcript + [f"USER: {user_input}"])
        final = ""
        async for msg in query(prompt=prompt, options=self.options):
            if isinstance(msg, AssistantMessage):
                for block in msg.content:
                    if hasattr(block, "text") and block.text:
                        final = block.text
        # Persist the turn AFTER the agent has answered.
        self.transcript.append(f"USER: {user_input}")
        self.transcript.append(f"ASSISTANT: {final}")
        return final

    def fork(self) -> "SessionManager":
        clone = SessionManager(self.options)
        clone.transcript = list(self.transcript)   # shallow copy; enough for immutable strings
        return clone

// session_manager.ts
import { query } from "@anthropic-ai/claude-agent-sdk";

export class SessionManager {
  options: any;
  transcript: string[] = [];

  constructor(options: any) { this.options = options; }

  async send(userInput: string): Promise<string> {
    const prompt = [...this.transcript, `USER: ${userInput}`].join("\n\n");
    let final = "";
    for await (const msg of query({ prompt, options: this.options })) {
      if (msg.type === "assistant") {
        for (const block of msg.content) {
          if ("text" in block && block.text) final = block.text;
        }
      }
    }
    this.transcript.push(`USER: ${userInput}`);
    this.transcript.push(`ASSISTANT: ${final}`);
    return final;
  }

  fork(): SessionManager {
    const clone = new SessionManager(this.options);
    clone.transcript = [...this.transcript];
    return clone;
  }
}

Try it:

import asyncio
from session_manager import SessionManager
from run_agent import OPTIONS

async def main():
    s = SessionManager(OPTIONS)
    print(await s.send("What is the lien exposure for Acme Corporation?"))
    print(await s.send("Now compare to Pinnacle Industries."))

    branch = s.fork()
    # Hypothetical branch — does NOT contaminate `s`.
    print(await branch.send("What if Acme files a UCC-3 termination on NY-2024-0848?"))

    # Original session continues from turn 2.
    print(await s.send("Summarize the relative risk of Acme vs Pinnacle in one sentence."))

asyncio.run(main())

Checkpoint — Step 8

Every call after the first builds on the prior turns ("compare to Pinnacle" only makes sense if Acme is in the transcript). The fork's hypothetical does not appear in the parent's final summary. If both threads see each other's turns, your fork() is sharing the list reference instead of copying it: check for list(self.transcript), not self.transcript.

Troubleshooting

NameError: name 'SessionManager' is not defined → the driver is missing from session_manager import SessionManager.
Cost balloons after 6+ turns → Pattern A re-sends the full transcript every turn. Switch to Pattern B (resume=session_id) for long-lived sessions.
Fork still shares state → if your transcript items become mutable (dicts, not strings), upgrade list(...) to copy.deepcopy(...).
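
The copy distinction in the last troubleshooting item can be seen with plain Python lists. The sketch uses dict-shaped transcript items purely as an illustration; the SessionManager in this step stores strings, for which list(...) is sufficient.

```python
import copy

parent = [{"role": "user", "text": "What is the lien exposure for Acme?"}]

shallow = list(parent)          # new list, but the dicts inside are shared
deep = copy.deepcopy(parent)    # new list AND new dicts

shallow[0]["text"] = "edited in shallow fork"
print(parent[0]["text"])        # parent changed: the dict object is shared

deep[0]["text"] = "edited in deep fork"
print(parent[0]["text"])        # still the shallow edit; deep fork is isolated

shallow.append({"role": "user", "text": "new turn"})
print(len(parent))              # appending to a fork never grows the parent
```

Appending new turns is safe with either copy; only in-place edits to existing items leak through a shallow copy.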

🎓 Cert Tip — Domain 1.7

Know three session moves: --resume <session-name> for named continuation (Claude Code CLI), fork_session / your own fork() for branched exploration from a shared baseline, and the resume-vs-fresh decision — resume when prior context is mostly valid; start fresh with an injected summary when prior tool results are stale (file changed, ticket closed, db query went out of date). The exam will give you a scenario and ask which to pick.

5. Declarative Subagents — `.claude/agents/<name>.md`

Subagents Are the Specialists in Your Building

When you build a large house, you don't hire one carpenter who also does electrical, plumbing, and roofing. You hire specialists. The general contractor (the coordinator) knows what work needs doing and which specialist to call — and importantly, the specialists don't need to know everything about the house. The plumber doesn't need the architectural plans; they need the bathroom diagram and a list of fixtures.

Subagents work the same way. The coordinator is your main agent — it has the conversation history, the user's broader goal, all the messy context. When it needs a focused piece of work done (calculate risk, validate a payment, compose a notification), it delegates to a subagent. The subagent gets a fresh, isolated context window with only the task brief — not the whole conversation.

Concretely in the SDK: a subagent is a markdown file in .claude/agents/<name>.md with frontmatter declaring its name, description, allowed tools, and (optionally) model. The body of the file is the system prompt. The coordinator invokes it by name; the SDK starts a fresh conversation with the subagent's prompt and only the input the coordinator passes in.

Subagent Invocation Flow

The coordinator has a long conversation history (gray blocks). It delegates the risk calculation to the risk-analyst subagent, which spins up with a fresh context window (green) containing only the task brief. When done, it returns a structured result; the coordinator continues with that result added to its conversation.

Coordinator ↔ subagent context isolation

[Diagram: the Coordinator's context window holds 12 turns of conversation (user goals, prior tool results, RAG hits). It passes a task brief down to the risk-analyst subagent, whose context window starts empty, and receives a result back.]

Bonus Step: Declare a risk-analyst subagent

10 min · .claude/agents/risk-analyst.md

What & Why: Move the "calculate score and write a one-paragraph risk profile" responsibility out of the coordinator's prompt and into a focused subagent. The subagent has access only to calculate_risk_score — not search_filings or get_filing_details. The coordinator searches and gathers; the subagent scores and reports. This is the cert-favored hub-and-spoke architecture from M14.

Create the directory and file:

mkdir -p .claude/agents
# create .claude/agents/risk-analyst.md with your editor of choice, contents shown below

---
name: risk-analyst
description: Calculates lien exposure and risk scores given an entity name and a list of filings. Returns a structured risk profile.
tools:
  - calculate_risk_score
model: claude-haiku-4-5-20251001
---

You are the Risk Analyst — a focused subagent invoked by the UCC research
coordinator. Your job is narrow:

1. The coordinator hands you an entity name and (optionally) a list of relevant
   filing summaries.
2. Call calculate_risk_score with the entity name.
3. Write a one-paragraph risk profile that includes:
   - The numeric risk_score and the HIGH/MEDIUM/LOW level
   - The two strongest contributing factors (e.g., "active in 4 states", "recent
     UCC-3 continuation just filed")
   - A one-sentence recommendation ("flag for manual review", "monitor monthly",
     "no action needed")

Do NOT search for new filings or fetch additional details — the coordinator
already gathered the data. Stay in your lane.

Then update the coordinator's system prompt to mention the subagent. The SDK looks for .claude/agents/*.md in the current working directory when query() runs and exposes each declared subagent to the coordinator. Make sure you run python run_agent.py from the project root (the same directory that contains the .claude/ folder you just created) — otherwise the subagent won't be discovered.
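
Since discovery depends on the working directory, a small preflight check can catch a wrong launch directory before the agent runs. This is a sketch: `find_subagents` is a hypothetical helper, and it reads only the `name:` line of the frontmatter with simple string handling rather than a full YAML parser.

```python
from pathlib import Path


def find_subagents(project_root: str = ".") -> dict[str, Path]:
    """Map each declared subagent name to its file under .claude/agents/."""
    found: dict[str, Path] = {}
    agents_dir = Path(project_root) / ".claude" / "agents"
    if not agents_dir.is_dir():
        return found
    for path in sorted(agents_dir.glob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        if not lines or lines[0].strip() != "---":
            continue  # no frontmatter block at the top of the file
        for line in lines[1:]:
            if line.strip() == "---":
                break  # end of frontmatter reached without a name
            if line.startswith("name:"):
                found[line.split(":", 1)[1].strip()] = path
                break
    return found


if __name__ == "__main__":
    agents = find_subagents()
    if "risk-analyst" not in agents:
        print("risk-analyst not found; run from the directory containing .claude/")
    else:
        print(f"risk-analyst declared in {agents['risk-analyst']}")
```

Running this from the same directory as python run_agent.py shows whether the file you created is where the lookup described above expects it.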

What to change in run_agent.py: replace the existing system_prompt=(...) argument inside your OPTIONS = ClaudeAgentOptions(...) block with the version below. (Leave every other kwarg from Step 7 unchanged.)

# Replace the system_prompt= argument inside OPTIONS with this version:
system_prompt=(
    "You are the UCC research coordinator. For each user request:\n"
    " 1. Use search_filings (with name variations) and get_filing_details to "
    "gather all relevant filings.\n"
    " 2. Once gathered, delegate to the risk-analyst subagent for scoring and "
    "the one-paragraph profile.\n"
    " 3. Stitch the subagent's profile into your final answer along with "
    "specific filing citations.\n"
    "Do not call calculate_risk_score yourself — that's the subagent's job."
),

Checkpoint — Bonus (Subagent)

Re-run python run_agent.py. The trace now shows the coordinator calling search_filings + get_filing_details, then invoking the risk-analyst subagent (which itself calls calculate_risk_score), then assembling the final answer. The risk-analyst's reasoning is in its context window — the coordinator only sees the structured profile that came back.

⚠️ Subagent Context is Isolated

The cert exam loves this question: "Does a subagent see the coordinator's prior conversation?" No. The subagent starts with a fresh context window containing only its system prompt (the markdown body) and the input the coordinator passes in. If the subagent needs context (e.g., the user's broader goal, prior tool results), the coordinator must include it explicitly in the task brief. Anti-pattern: assuming "well, the subagent is in the same process, so it has the same memory" — it doesn't.
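
One way to make that explicit hand-off concrete is to assemble the task brief as a single string before delegating. The sketch below is illustrative only: `build_task_brief` and its fields are hypothetical, and how the brief actually reaches the subagent is left to the coordinator.

```python
import json


def build_task_brief(entity_name: str, user_goal: str, filings: list[dict]) -> str:
    """Pack everything the subagent needs into one self-contained brief."""
    return "\n".join([
        f"Entity: {entity_name}",
        f"User goal: {user_goal}",
        f"Filings gathered by the coordinator ({len(filings)}):",
        json.dumps(filings, indent=2),
    ])


# Sample values for illustration; the filing number is the one used in Step 8.
brief = build_task_brief(
    "Acme Corporation",
    "Assess lien exposure before extending credit",
    [{"filing_number": "NY-2024-0848", "state": "NY", "status": "ACTIVE"}],
)
print(brief)
```

Anything missing from this string is, from the subagent's point of view, something that never happened in the conversation.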

6. Putting It Together — The Production Agent Stack

You have all the pieces. A production-ready UCC agent — the kind that ships behind a FastAPI endpoint — is just these pieces stacked. Each layer adds one capability without disturbing the layers below.

The Stack, Built Up Layer by Layer

Watch the layers stack from the bottom (raw tools) up to the wire (HTTP endpoint). Each layer is one thing the SDK lets you add without rewriting the layers below.

Bottom-up: tools → MCP → options → hooks → sessions → API

Layer 1: @tool functions (pure async business logic)

Layer 2: create_sdk_mcp_server (in-process tool registry)

Layer 3: ClaudeAgentOptions (prompt + model + allowed_tools + max_turns)

Layer 4: HookMatcher + can_use_tool (lifecycle hooks + permission gate)

Layer 5: SessionManager + .claude/agents/ (multi-turn + delegated specialists)

Layer 6: FastAPI wrapper (async POST /query and POST /chat endpoints)

Step 9: The complete production agent — SDK + hooks + session + the prelude's ML model as a tool

20 min · production_agent.py · combines every layer

What & Why: The prelude ("From ML Model to Agent") ended with a pickled RandomForest at models/risk_clf.pkl, a real classifier trained on UCC features. In M15B the agent reimplemented risk scoring with hand-written rules. Now we wire the actual ML model in as a new SDK tool, ml_risk_predict, which takes the place of the heuristic calculate_risk_score, behind the same hooks and inside the same SessionManager from Step 8. This is the agent you'd ship: deterministic ML where deterministic ML belongs (the score), Claude where Claude belongs (the reasoning, the narrative, the multi-turn UX), and one production stack holding it together.

If you didn't keep the prelude's pickle, run mkdir models and then train a toy stand-in (about 30 seconds) with python -c "import pickle, numpy as np; from sklearn.ensemble import RandomForestClassifier; X=np.array([[8,4,1,1],[1,1,0,0],[3,2,0,1]]); y=[1,0,1]; pickle.dump(RandomForestClassifier(n_estimators=20, random_state=0).fit(X,y), open('models/risk_clf.pkl','wb'))".

"""production_agent.py — the full M26 stack in one file.
Wires: @tool functions (incl. the prelude's pickled ML model) +
       create_sdk_mcp_server + ClaudeAgentOptions +
       PreToolUse/PostToolUse hooks + can_use_tool gate +
       SessionManager from Step 8."""
import json, pickle, re
from datetime import datetime
from claude_agent_sdk import (
    tool, create_sdk_mcp_server, query, ClaudeAgentOptions,
    AssistantMessage, HookMatcher,
    PermissionResultAllow, PermissionResultDeny,
)
from mock_data import FILINGS_DB
from session_manager import SessionManager     # from Step 8

# ---- Layer 1: tools (2 from earlier + 1 NEW ML tool from the prelude) -----
with open("models/risk_clf.pkl", "rb") as fh:   # close the file once loaded
    RISK_MODEL = pickle.load(fh)

@tool("search_filings",
      "Search UCC filings by debtor name. Supports partial matching.",
      {"debtor_name": str, "state": str})
async def search_filings(args):
    name, state = args["debtor_name"].upper(), args.get("state")
    hits = [f for f in FILINGS_DB
            if name in f["debtor_name"].upper()
               and (not state or f["state"] == state)]
    return {"content": [{"type": "text", "text": json.dumps(hits)}]}

@tool("get_filing_details",
      "Get full details for a specific UCC filing by filing number.",
      {"filing_number": str})
async def get_filing_details(args):
    rec = next((f for f in FILINGS_DB
                if f["filing_number"] == args["filing_number"]),
               {"error": "not found"})
    return {"content": [{"type": "text", "text": json.dumps(rec)}]}

@tool("ml_risk_predict",
      "Run the production RandomForest delinquency model on an entity. "
      "Returns probability + label. Use this INSTEAD of heuristic scoring.",
      {"entity_name": str})
async def ml_risk_predict(args):
    name = args["entity_name"].upper()
    matches = [f for f in FILINGS_DB if name in f["debtor_name"].upper()]
    active  = [f for f in matches if f["status"] == "ACTIVE"]
    states  = {f["state"] for f in active}
    secured = {f["secured_party"] for f in active}
    features = [[len(active), len(states), len(secured),
                 sum(1 for f in active if "UCC3" in f["filing_type"])]]
    proba = float(RISK_MODEL.predict_proba(features)[0][1])
    return {"content": [{"type": "text", "text": json.dumps({
        "entity_name": args["entity_name"],
        "probability_high_risk": round(proba, 3),
        "label": "HIGH" if proba > 0.66 else "MEDIUM" if proba > 0.33 else "LOW",
        "model": "RandomForest v1 (prelude)",
        "features_used": ["active_filings", "n_states",
                          "n_secured_parties", "n_continuations"],
    })}]}

ucc_server = create_sdk_mcp_server(
    name="ucc_tools", version="2.0.0",
    tools=[search_filings, get_filing_details, ml_risk_predict])

# ---- Layer 4: hooks + permission gate (Steps 5, 6, 7) ---------------------
SSN_RE   = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")

async def log_tool_call(input_data, tool_use_id, ctx):
    print(f"[{datetime.utcnow().isoformat()}Z] PRE  "
          f"{input_data.get('tool_name')}({input_data.get('tool_input')})")
    return {}

async def redact_pii(input_data, tool_use_id, ctx):
    out = input_data.get("tool_response", "")
    if isinstance(out, str):
        out = SSN_RE.sub("[SSN REDACTED]", out)
        out = PHONE_RE.sub("[PHONE REDACTED]", out)
    return {"tool_response": out}

async def gate(tool_name, tool_input, ctx):
    if tool_name == "mcp__ucc__search_filings":
        q = (tool_input.get("debtor_name") or "").strip()
        if len(q) < 3:
            return PermissionResultDeny(message=f"Query '{q}' too broad.")
    return PermissionResultAllow()

# ---- Layer 3: options ----------------------------------------------------
OPTIONS = ClaudeAgentOptions(
    model="claude-sonnet-4-6",
    system_prompt=(
        "You are the production UCC research coordinator. For each question:\n"
        " 1. Search filings using name variations (incl. abbreviations + DBAs).\n"
        " 2. Call ml_risk_predict for the canonical entity name — this is "
        "the production model; do not invent a heuristic.\n"
        " 3. Write a narrative report citing specific filings and the model's "
        "probability + label."
    ),
    mcp_servers={"ucc": ucc_server},
    allowed_tools=["mcp__ucc__search_filings",
                   "mcp__ucc__get_filing_details",
                   "mcp__ucc__ml_risk_predict"],
    hooks={"PreToolUse":  [HookMatcher(matcher="*", hooks=[log_tool_call])],
           "PostToolUse": [HookMatcher(matcher="*", hooks=[redact_pii])]},
    can_use_tool=gate,
    max_turns=10,
)

# ---- Layer 5: a session you can fork -------------------------------------
def new_session() -> SessionManager:
    return SessionManager(OPTIONS)

"""driver.py — exercises the full stack: multi-turn + fork + ML tool."""
import asyncio
from production_agent import new_session

async def main():
    s = new_session()
    print("--- Turn 1 ---")
    print(await s.send("What is the lien exposure for Acme Corporation?"))
    print("--- Turn 2 (follow-up) ---")
    print(await s.send("Compare to Pinnacle Industries."))

    print("--- Fork: what-if Acme files a UCC-3 termination on NY-2024-0848? ---")
    branch = s.fork()
    print(await branch.send("What if Acme files a UCC-3 termination on NY-2024-0848?"))

    print("--- Original session continues ---")
    print(await s.send("One-sentence summary of relative risk."))

asyncio.run(main())

Run: python driver.py

Expected output (abbreviated; exact narrative varies, model probabilities are deterministic)

--- Turn 1 ---
[2026-05-09T09:12:01Z] PRE mcp__ucc__search_filings({'debtor_name': 'Acme Corporation'})
[2026-05-09T09:12:03Z] PRE mcp__ucc__search_filings({'debtor_name': 'ACME CORP'})
[2026-05-09T09:12:05Z] PRE mcp__ucc__ml_risk_predict({'entity_name': 'Acme Corporation'})
Acme Corporation has 9 UCC filings; 8 ACTIVE across NY, CA, TX, FL. The production RandomForest model returns probability_high_risk=0.91 (HIGH) ...
--- Turn 2 (follow-up) ---
[...] mcp__ucc__ml_risk_predict({'entity_name': 'Pinnacle Industries'}) -> 0.18 (LOW) ...
--- Fork: what-if ... ---
[...] (branch sees the prior 2 turns + the hypothetical; parent session unchanged)
--- Original session continues ---
Acme is materially riskier than Pinnacle (0.91 vs 0.18) per the production model.

Checkpoint — Step 9

You see (a) every tool call logged by the PreToolUse hook, (b) ml_risk_predict firing with a probability between 0 and 1, (c) the fork's hypothetical absent from the original session's final summary, and (d) Claude using the model's number in its narrative rather than inventing one. If Claude calls calculate_risk_score instead, you forgot to remove it from allowed_tools — the heuristic and the ML model shouldn't both be available, or the agent will pick the easier one.

Troubleshooting

FileNotFoundError: models/risk_clf.pkl → train the toy model with the one-liner above, or copy your prelude pickle into models/.
ValueError: X has N features, but RandomForestClassifier is expecting 4 features as input → the feature vector in ml_risk_predict must match the training shape exactly. The toy model expects 4 features in order: [active_filings, n_states, n_secured_parties, n_continuations].
The fork contaminates the parent → you're sharing the transcript list reference. Confirm fork() uses list(self.transcript) (Step 8).
Hooks never fire on ml_risk_predict → matcher must be "*" or include the full MCP-prefixed name mcp__ucc__ml_risk_predict.
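
To guard against the feature-shape error above, the feature extraction can live in its own function that is checked before anything reaches the model. This sketch mirrors the logic inside ml_risk_predict; `build_features`, `label_for`, and the two sample filings are hypothetical, and no model is loaded.

```python
FEATURE_NAMES = ["active_filings", "n_states",
                 "n_secured_parties", "n_continuations"]


def build_features(entity_name: str, filings: list[dict]) -> list[list[int]]:
    """Build the single-row feature matrix in the order the toy model expects."""
    name = entity_name.upper()
    active = [f for f in filings
              if name in f["debtor_name"].upper() and f["status"] == "ACTIVE"]
    row = [
        len(active),
        len({f["state"] for f in active}),
        len({f["secured_party"] for f in active}),
        sum(1 for f in active if "UCC3" in f["filing_type"]),
    ]
    assert len(row) == len(FEATURE_NAMES), "feature count must match training"
    return [row]


def label_for(proba: float) -> str:
    """Same thresholds as ml_risk_predict."""
    return "HIGH" if proba > 0.66 else "MEDIUM" if proba > 0.33 else "LOW"


# Made-up filings: one ACTIVE (counted) and one LAPSED (ignored).
sample = [
    {"debtor_name": "Acme Corporation", "status": "ACTIVE", "state": "NY",
     "secured_party": "First Bank", "filing_type": "UCC1"},
    {"debtor_name": "ACME CORP", "status": "LAPSED", "state": "CA",
     "secured_party": "Second Bank", "filing_type": "UCC3"},
]
print(build_features("Acme", sample))  # → [[1, 1, 1, 0]]
```

Keeping the feature order in one named list means the training script and the tool can both be checked against the same source.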

What Just Happened?

You wired five layers into one agent: a pickled scikit-learn model exposed as a tool, the SDK's query() running the loop, two hooks observing every call, a permission gate refusing broad queries, and a session that supports forked what-if exploration. The ML model from the prelude didn't get replaced — it became one tool among several that Claude calls when it needs a deterministic number. That's the production pattern: classical ML stays where it's strongest (calibrated probabilities); Claude orchestrates, narrates, and handles the long tail.

Why It Matters — Concrete Numbers

The M15B reference agent is ~200 lines of Python you wrote by hand: 80 for the loop, 40 for tool dispatch, 30 for hook plumbing (when added), 20 for session management, 30 for tool definitions. The same agent in M26 is ~80 lines: 50 for the three @tool functions, 15 for ClaudeAgentOptions, 5 for query() driver, 10 for hooks. That's 60% less code for the same behavior, plus everything you removed (the loop, the dispatch, the bookkeeping) is now battle-tested SDK code instead of code you have to debug at 2 AM.

Adding a fourth tool? One @tool function, one entry in allowed_tools, done. In M15B that was a tools.json edit, a dispatcher case, and a re-run. The SDK isn't magic — it's just less typing for the parts that don't reward thought.

7. When to Leave the SDK

The SDK is the default. Most agent code — including everything in this course's Tier 3 modules and capstones 1–5 / 7 — should reach for claude-agent-sdk first. But there are real cases where you want the raw loop. Use this table to decide:

Scenario	Raw `messages.create()` loop	`claude-agent-sdk`
Standard tool-using agent (1–10 tools)	overkill	default
Hooks for guardrails / observability	manual sprinkles in the loop	`HookMatcher`
Per-call permission decisions	manual if-statements	`can_use_tool`
Subagents	manual coordinator orchestration	`.claude/agents/`
Custom mid-loop streaming (e.g., emit tokens to an SSE stream while tools run in parallel)	yes: you need raw control over the message generator	can't express
Non-standard parallel-tool aggregation (e.g., race two tools and take whichever returns first)	yes: loop logic is custom	SDK serializes
Instrumentation that doesn't fit the hook model (e.g., wrap every API call with a vendor APM library)	yes: you control every call site	partial: hooks see tool calls, not API calls
Cert exam coverage	must understand	must master

Treat the SDK as the default. The raw loop is for when you've identified a specific need the SDK can't satisfy — it's not the starting point. M27 covers the cert-tested decision rubric in more depth.

Module Summary — What You Built

Cheat Sheet — The SDK Surface

from claude_agent_sdk import query, tool, create_sdk_mcp_server, ClaudeAgentOptions, AssistantMessage, HookMatcher, PermissionResultAllow, PermissionResultDeny
Tools: @tool("name", "desc", {schema}) on async functions returning {"content": [{"type": "text", "text": ...}]}
MCP server: create_sdk_mcp_server(name, version, tools=[...])
Options: ClaudeAgentOptions(system_prompt, mcp_servers, allowed_tools, hooks, can_use_tool, max_turns, model, resume?)
Driver: async for msg in query(prompt=..., options=OPTIONS): if isinstance(msg, AssistantMessage): ...
Hooks: HookMatcher(matcher="tool_name_or_*", hooks=[async_fn]) for PreToolUse / PostToolUse
Permission: can_use_tool=async_callback returning Allow/Deny
Subagents: .claude/agents/<name>.md with frontmatter name/description/tools/model

Next up: M27 (Cert Exam Prep) uses the agents you built in M25 and M26 as worked examples for the Claude Certified Architect — Foundations exam. Every Domain 1 (orchestration), Domain 2 (tools/MCP), and Domain 3 (Claude Code) question maps to something you've already built.

Knowledge Check

Q1: Which package provides `query()`, `tool`, and `ClaudeAgentOptions`?

anthropic (the lower-level Messages API SDK)
claude-agent-sdk (Python) / @anthropic-ai/claude-agent-sdk (Node)
anthropic.Agent (built into the main SDK)
@modelcontextprotocol/sdk

Q2: A `PreToolUse` hook receives `input_data`. What two keys can you reliably read from it?

system_prompt and messages
tool_name and tool_input
tool_response and tool_use_id
session_id and turn_count

Q3: You want to block any `search_filings` call where `debtor_name` is shorter than 3 characters. Which mechanism is the right fit?

Add an if check inside the @tool function and return an error string
Register a can_use_tool callback that returns PermissionResultDeny(message=...) for that case — the denial reason flows back to Claude so it can adapt
Override max_turns to 1
Remove search_filings from allowed_tools entirely

Q4: Where are subagents declared in a `claude-agent-sdk` project?

Python files with @subagent decorators
A list passed to ClaudeAgentOptions(subagents=[...])
Markdown files in .claude/agents/<name>.md with frontmatter declaring name, description, tools, and (optional) model
JSON entries in .mcp.json

Q5: A coordinator delegates a task to the `risk-analyst` subagent. Does the subagent see the coordinator's prior conversation history? (Connects to M14's hub-and-spoke pattern.)

Yes — subagents inherit the coordinator's full context window
No — subagents start with a fresh, isolated context window containing only their system prompt and the input the coordinator passed in
Only if the coordinator explicitly calls share_context()
Only the last user message is shared

Q6: What is the SDK's "session"?

A dedicated Session class with .send() and .fork() methods built into the SDK
The async stream of messages returned by query() — multi-turn is implemented either by composing the prompt yourself (transcript pattern) or by passing an SDK resume token
A row in a database that the SDK manages for you
A JWT issued by the Anthropic API on first call

Q7: Which scenario is the strongest case for leaving the SDK and writing a raw `messages.create()` loop?

The agent has only one tool
You want to add a logging hook
You need custom mid-loop streaming or non-standard parallel-tool aggregation that the hook model can't express
You're using TypeScript instead of Python
