M26 — Hooks, Sessions & the Agent SDK
In M15B you built a UCC agent from scratch. Coordinator plus subagents plus tools plus guardrails. About 250 lines. This module rebuilds that same agent using the Agent SDK. Your 250 lines shrink to about 40. The output is identical. Keep M15B open in another tab as you work through this module. At each step, compare what the SDK does versus what you coded manually. That comparison is the lesson. After M26 there is one more level in M25 — spec-driven development. CAPSTONE-7 ties all three together.
M15B got you using claude-agent-sdk for the first time. M26 goes deep on the three things the SDK does that the raw API does not: hooks (lifecycle interceptors), sessions (multi-turn flows you control), and declarative subagents (named specialists in .claude/agents/). All three are heavily tested on the Claude Certified Architect exam.
Learning Objectives
- Build a complete tool-using agent with query() + ClaudeAgentOptions + create_sdk_mcp_server, with no hand-rolled message loop.
- Add lifecycle interceptors with HookMatcher for PreToolUse (logging, validation) and PostToolUse (PII redaction, audit).
- Make per-call authorization decisions with the can_use_tool permission callback.
- Manage multi-turn flows by composing the message stream returned by query(), including a fork-friendly SessionManager.
- Declare specialist subagents as .claude/agents/<name>.md files with isolated context windows.
- Decide when to leave the SDK for a raw loop (rare, but exam-tested).
1. Why the SDK Exists — The Loop You No Longer Write
Imagine you're a carpenter. The first house you build, you measure every cut with a tape measure, mark it with a pencil, then cut with a hand saw. Every joint takes ten minutes. By house number ten, your hands know every cut, but your back hurts and the build is still slow.
Then you switch to a track saw with a digital fence. The saw measures, marks, and cuts in one pass. The wood comes out the same as before — same dimensions, same joints — but you finish in a tenth of the time. You're still a carpenter; the saw didn't replace your judgment about where joints go. It replaced the manual measuring step that you stopped learning anything from after house three.
The Anthropic Agent SDK is that track saw. The hand-rolled while True: client.messages.create(...) loop you wrote in M12 and M15B taught you what an agent is. After two or three hand-builds, the loop stops teaching you anything new — it's just typing. The SDK is the power tool that lets you stop typing the loop and start designing the things only you can design: the tools, the prompts, the guardrails.
The claude-agent-sdk Python package (and @anthropic-ai/claude-agent-sdk for Node) gives you four things the raw anthropic SDK does not:
- An async query() generator that runs the tool-use loop for you. You write async for msg in query(...) and the SDK handles stop_reason checking, tool dispatch, message-list bookkeeping, retries, and streaming.
- The @tool decorator + create_sdk_mcp_server: tools are now async functions registered through an in-process MCP server. Schemas come from your type hints; you no longer hand-write JSON Schema dicts.
- HookMatcher: a lifecycle hook system that fires PreToolUse and PostToolUse events, plus a can_use_tool permission callback. Guardrails stop being if-statements scattered through your loop and become declared, swappable functions.
- .claude/agents/<name>.md: a file-based way to declare subagents (specialists with isolated context windows) that the coordinator can invoke by name. M14's hand-built coordinator pattern collapses to a directory of markdown files.
"Can't I just wrap client.messages.create() in a function called query() and call it the SDK?" — No. That's the trap the original M26 lab fell into. The real claude-agent-sdk manages the message stream, dispatches tool calls in MCP format, fires hooks at lifecycle points, and supports forking — none of which a wrapper around messages.create does. If you see code that imports from anthropic import Agent or uses @agent.tool / @agent.hook decorators, that is fictional API; the real package exposes query, tool, HookMatcher, and ClaudeAgentOptions imported from claude_agent_sdk.
"If I'm using the SDK, do I lose control?" — You lose loop boilerplate, not control. You still write the system prompt, the tools, the hooks, and the permission logic. The SDK runs the dispatch you used to type out by hand.
"Hooks and middleware are the same thing, right?" — Sort of. Hooks are middleware specifically for the agent's tool-use lifecycle: they fire before and after each tool call, with structured input. Web middleware is for HTTP requests; hooks are for tool calls. The pattern is the same, the events are different.
What Each Approach Handles For You
Before the animation, here's the concrete responsibility split. Anything in the "you write" column on the left is code you no longer maintain on the right.
| Concern | Raw messages.create() loop | claude-agent-sdk |
|---|---|---|
| Tool-use loop (read stop_reason, dispatch, append result, recurse) | you write (~60 lines) | built-in via query() |
| Tool schemas (JSON Schema dicts) | you write by hand | generated from type hints by @tool |
| Tool registry & routing | you write a dispatcher dict | create_sdk_mcp_server |
| Lifecycle interception (log, validate, redact) | scattered if-statements in the loop | HookMatcher Pre/PostToolUse |
| Per-call permission decisions | more if-statements before dispatch | can_use_tool callback |
| Multi-turn transcript bookkeeping | you maintain a messages list | resume token or compose yourself |
| Subagent declaration | you write a coordinator class | .claude/agents/<name>.md markdown |
| Custom mid-loop streaming, racing tools, vendor APM wrappers around every API call | yes — you have full control | SDK serializes & abstracts |
Use the raw loop when the bottom row applies — you need behavior the SDK abstracts away. Use the SDK for everything else, which is >90% of agent code in production.
Manual vs SDK — Side-by-Side
Watch the same agent step through ten lines of work. The left pane is the M15B-style hand-rolled loop. The right pane is the SDK. Both produce the same answer; the right pane stays a third of the size while the left pane grows.
2. Build the UCC Agent with claude-agent-sdk
You're going to rebuild the same UCC filing research agent from M15B, but with the SDK as the entry point. The business behavior is identical — search filings by debtor name, fetch details, compute a risk score, return a narrative answer. What changes is everything around the business behavior.
- What: A complete UCC research agent that answers "What is the lien exposure for <debtor>?" using three tools.
- Time: 30–45 minutes for steps 1–3, plus 30–45 minutes for hooks/sessions/subagents in later sections.
- Prerequisites: Python 3.10+, an ANTHROPIC_API_KEY in your environment, and Node 18+ if you want to run the TypeScript variant.
- Files you'll create: mock_data.py, tools.py, run_agent.py in this section; session_manager.py + .claude/agents/risk-analyst.md in later sections.
What & Why: Install the real claude-agent-sdk package (not anthropic — that's the lower-level Messages API) and create a tiny mock data module so the tools have something to return. We use mock data instead of a live BigQuery connection so you can run the lab on a plane.
Run the shell commands first, then save the Python panel below as mock_data.py in the project folder you just created (m26-ucc-agent/mock_data.py).
mkdir m26-ucc-agent && cd m26-ucc-agent
python -m venv venv && source venv/bin/activate # Windows: venv\Scripts\activate
pip install "claude-agent-sdk>=0.2"
# Optional: TypeScript variant
npm init -y && npm i @anthropic-ai/claude-agent-sdk zod tsx typescript
# Confirm the env var is set
echo $ANTHROPIC_API_KEY | head -c 12 # macOS/Linux — should print your key prefix
echo %ANTHROPIC_API_KEY:~0,12% # Windows cmd — same idea
$env:ANTHROPIC_API_KEY.Substring(0,12) # Windows PowerShell
"""mock_data.py — 9 UCC filings for Acme + 2 noise records (Pinnacle + Sunrise)."""
FILINGS_DB = [
{"filing_number": "NY-2024-0847", "debtor_name": "ACME CORPORATION",
"state": "NY", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "FIRST NATIONAL BANK",
"collateral": "All inventory and accounts receivable"},
{"filing_number": "NY-2024-0848", "debtor_name": "ACME CORP",
"state": "NY", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "JPMORGAN CHASE", "collateral": "Equipment"},
{"filing_number": "NY-2024-0849", "debtor_name": "ACME CORPORATION INC",
"state": "NY", "filing_type": "UCC3_AMENDMENT", "status": "ACTIVE",
"secured_party": "JPMORGAN CHASE", "collateral": "Equipment + vehicles"},
{"filing_number": "CA-2024-1201", "debtor_name": "ACME CORP",
"state": "CA", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "BANK OF AMERICA", "collateral": "All assets"},
{"filing_number": "CA-2024-1202", "debtor_name": "ACME CORPORATION",
"state": "CA", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "CITIBANK NA", "collateral": "Inventory"},
{"filing_number": "TX-2024-0903", "debtor_name": "ACME CORP",
"state": "TX", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "PNC FINANCIAL", "collateral": "Equipment"},
{"filing_number": "TX-2024-0904", "debtor_name": "ACME CORPORATION",
"state": "TX", "filing_type": "UCC3_CONTINUATION", "status": "ACTIVE",
"secured_party": "PNC FINANCIAL", "collateral": "Equipment"},
{"filing_number": "FL-2024-0455",
"debtor_name": "ACME CORP DBA ROADRUNNER SUPPLIES",
"state": "FL", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "TRUIST FINANCIAL", "collateral": "Motor vehicles"},
{"filing_number": "OH-2024-0301", "debtor_name": "ACME CORPORATION",
"state": "OH", "filing_type": "UCC1", "status": "TERMINATED",
"secured_party": "US BANCORP", "collateral": "Accounts receivable"},
{"filing_number": "NY-2024-0501", "debtor_name": "PINNACLE INDUSTRIES",
"state": "NY", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "TD BANK", "collateral": "Equipment"},
{"filing_number": "CA-2024-1500", "debtor_name": "SUNRISE HOLDINGS LLC",
"state": "CA", "filing_type": "UCC1", "status": "ACTIVE",
"secured_party": "FIRST NATIONAL BANK", "collateral": "All assets"},
]
// mock_data.ts — same shape as the Python module.
export interface Filing {
filing_number: string;
debtor_name: string;
state: string;
filing_type: string;
status: string;
secured_party: string;
collateral: string;
}
export const FILINGS_DB: Filing[] = [
{ filing_number: "NY-2024-0847", debtor_name: "ACME CORPORATION",
state: "NY", filing_type: "UCC1", status: "ACTIVE",
secured_party: "FIRST NATIONAL BANK",
collateral: "All inventory and accounts receivable" },
// ... (the same 11 records as the Python file)
];
Run: python -c "from mock_data import FILINGS_DB; print(len(FILINGS_DB))"
Expected output: 11 (all records loaded). If you see ImportError, you're running from the wrong directory; cd into m26-ucc-agent.
- ModuleNotFoundError: No module named 'claude_agent_sdk' → pip install "claude-agent-sdk>=0.2" in your activated venv.
- ImportError: cannot import name 'Agent' from 'anthropic' → that import does not exist. The real SDK is claude_agent_sdk. If you see this in someone else's code, it's the fictional API.
- ANTHROPIC_API_KEY not set → export ANTHROPIC_API_KEY=sk-ant-... (or the Windows equivalent).
What & Why: Tools in the SDK are async Python functions wrapped with the @tool decorator. The decorator takes the tool name, a description Claude reads to decide when to call it, and a parameter schema. The function returns an MCP-shaped response: {"content": [{"type": "text", "text": ...}]}. Wrap them all in a create_sdk_mcp_server(...) call — that's the in-process MCP server the SDK will route tool calls through.
The three tools we're defining mirror what M15B built by hand: search_filings (partial debtor-name search), get_filing_details (lookup by filing number), and calculate_risk_score (mock score for a given entity).
"""tools.py — 3 SDK tools registered through an in-process MCP server."""
import json
from claude_agent_sdk import tool, create_sdk_mcp_server
from mock_data import FILINGS_DB
@tool(
"search_filings",
"Search UCC filings by debtor name. Supports partial matching across states. "
"Use this FIRST to find candidate filings before fetching details.",
{"debtor_name": str, "state": str},
)
async def search_filings(args):
name = args["debtor_name"].upper()
state = args.get("state")
hits = [
f for f in FILINGS_DB
if name in f["debtor_name"].upper()
and (not state or f["state"] == state)
]
return {"content": [{"type": "text", "text": json.dumps(hits)}]}
@tool(
"get_filing_details",
"Get full details for a specific UCC filing by filing number.",
{"filing_number": str},
)
async def get_filing_details(args):
for f in FILINGS_DB:
if f["filing_number"] == args["filing_number"]:
return {"content": [{"type": "text", "text": json.dumps(f)}]}
return {"content": [{"type": "text",
"text": json.dumps({"error": "not found"})}]}
@tool(
"calculate_risk_score",
"Calculate a delinquency risk score for an entity by aggregating its filings.",
{"entity_name": str},
)
async def calculate_risk_score(args):
name = args["entity_name"].upper()
matches = [f for f in FILINGS_DB if name in f["debtor_name"].upper()]
active = [f for f in matches if f["status"] == "ACTIVE"]
states = {f["state"] for f in active}
score = min(0.95, 0.15 * len(active) + 0.05 * len(states))
out = {"entity_name": args["entity_name"],
"active_filings": len(active),
"states": sorted(states),
"risk_score": round(score, 2),
"risk_level": "HIGH" if score > 0.66 else "MEDIUM" if score > 0.33 else "LOW"}
return {"content": [{"type": "text", "text": json.dumps(out)}]}
ucc_server = create_sdk_mcp_server(
name="ucc_tools",
version="1.0.0",
tools=[search_filings, get_filing_details, calculate_risk_score],
)
// tools.ts
import { tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";
import { FILINGS_DB } from "./mock_data.js";
export const searchFilings = tool(
"search_filings",
"Search UCC filings by debtor name. Supports partial matching across states. " +
"Use this FIRST to find candidate filings before fetching details.",
{ debtor_name: z.string(), state: z.string().optional() },
async (args) => {
const name = args.debtor_name.toUpperCase();
const state = args.state;
const hits = FILINGS_DB.filter(f =>
f.debtor_name.toUpperCase().includes(name) &&
(!state || f.state === state));
return { content: [{ type: "text", text: JSON.stringify(hits) }] };
}
);
export const getFilingDetails = tool(
"get_filing_details",
"Get full details for a specific UCC filing by filing number.",
{ filing_number: z.string() },
async (args) => {
const rec = FILINGS_DB.find(f => f.filing_number === args.filing_number)
?? { error: "not found" };
return { content: [{ type: "text", text: JSON.stringify(rec) }] };
}
);
export const calculateRiskScore = tool(
"calculate_risk_score",
"Calculate a delinquency risk score for an entity by aggregating its filings.",
{ entity_name: z.string() },
async (args) => {
const name = args.entity_name.toUpperCase();
const matches = FILINGS_DB.filter(f => f.debtor_name.toUpperCase().includes(name));
const active = matches.filter(f => f.status === "ACTIVE");
const states = [...new Set(active.map(f => f.state))];
const score = Math.min(0.95, 0.15 * active.length + 0.05 * states.length);
const level = score > 0.66 ? "HIGH" : score > 0.33 ? "MEDIUM" : "LOW";
return { content: [{ type: "text", text: JSON.stringify({
entity_name: args.entity_name, active_filings: active.length,
states, risk_score: Math.round(score * 100) / 100, risk_level: level,
}) }] };
}
);
export const uccServer = createSdkMcpServer({
name: "ucc_tools",
tools: [searchFilings, getFilingDetails, calculateRiskScore],
});
Run: python -c "from tools import ucc_server; print('OK,', len(ucc_server.tools), 'tools registered')"
Why we don't call the tool directly here: the @tool decorator wraps your function as an SdkMcpTool object so the SDK can route MCP calls to it — the wrapped object is no longer a plain async function. End-to-end behavior is verified in Step 3 by running the full agent.
Expected output: OK, 3 tools registered. The agent in Step 3 will then exercise all three: a partial search for "acme" should pull 9 ACME variants (including the DBA), get_filing_details should look up by filing number, and calculate_risk_score should aggregate them into a HIGH/MEDIUM/LOW score.
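To sanity-check the score the agent should report, the formula inside calculate_risk_score can be worked by hand. The sketch below repeats only the arithmetic from tools.py; the helper name score_from_counts is made up for illustration and is not part of the lab files. The counts come from mock_data.py: a search on "ACME" matches 9 filings, 8 of them ACTIVE (the Ohio filing is TERMINATED), spread over NY, CA, TX, and FL.

```python
def score_from_counts(active_filings: int, distinct_states: int) -> tuple[float, str]:
    """Same arithmetic as calculate_risk_score in tools.py, minus the lookup."""
    raw = 0.15 * active_filings + 0.05 * distinct_states
    score = min(0.95, raw)  # the cap keeps the score below 1.0
    level = "HIGH" if score > 0.66 else "MEDIUM" if score > 0.33 else "LOW"
    return round(score, 2), level


# ACME: 8 active filings in 4 states -> raw = 1.20 + 0.20 = 1.40, capped at 0.95
print(score_from_counts(8, 4))  # (0.95, 'HIGH')

# PINNACLE INDUSTRIES: 1 active filing in 1 state -> 0.15 + 0.05 = 0.20
print(score_from_counts(1, 1))  # (0.2, 'LOW')
```

Because of the 0.95 cap, any entity with seven or more active filings lands at the ceiling, so the Acme report should always come back HIGH with this mock data.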
- TypeError: tool() takes 3 positional arguments but 4 were given → the schema dict is the third arg; in Python the function is decorated afterward rather than passed in. In TS the function is the fourth arg of tool().
- tool returned non-MCP content → you returned a Python dict directly. Wrap it: {"content": [{"type": "text", "text": json.dumps(your_dict)}]}.
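The wrapping step from the last troubleshooting item repeats in every tool, so one option is to factor it into a small helper. This is a minimal sketch; as_mcp_text is a hypothetical name, not something the SDK provides, and it assumes the payload is JSON-serializable.

```python
import json
from typing import Any


def as_mcp_text(payload: Any) -> dict:
    """Wrap a JSON-serializable value in the MCP text-content shape."""
    return {"content": [{"type": "text", "text": json.dumps(payload)}]}


# The "not found" branch of get_filing_details, expressed with the helper.
result = as_mcp_text({"error": "not found"})
print(result)
```

Each tool body could then end with return as_mcp_text(hits) instead of spelling out the nested dict, which makes the "returned a plain dict" mistake harder to repeat.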
You just registered three tools as an in-process MCP server. The SDK will start that server when query() runs and route any tool call whose name starts with mcp__ucc__ to your async functions. You did not write a JSON Schema, a tool dispatcher, or a tool-result formatter — the decorator generated all three from your type hints.
Step 3: query()
What & Why: query() is the SDK's entry point. You give it a prompt string (or a structured message list) and a ClaudeAgentOptions object describing how the agent should run. query() returns an async generator of messages: assistant text blocks, tool-use events, tool results, and a final result message with token counts. You iterate over them and pull out what you need.
Three options matter most: system_prompt (the instructions that shape behavior), mcp_servers (a dict mapping server names to MCP server objects — so the SDK knows where to dispatch tool calls), and allowed_tools (an explicit allowlist of tool names in mcp__<server>__<tool> format). Without allowed_tools, the agent can't call your tools at all.
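Since a typo in the mcp__<server>__<tool> names silently leaves the agent without tools, it can help to build the allowlist rather than type it. The sketch below is one way to do that; mcp_tool_names is a hypothetical helper, not an SDK function. It assumes, as the troubleshooting notes in this section state, that the prefix uses the key you pass in mcp_servers ("ucc" here), not the name="ucc_tools" given to create_sdk_mcp_server.

```python
def mcp_tool_names(server_key: str, tool_names: list[str]) -> list[str]:
    """Build allowlist entries in the mcp__<server>__<tool> format."""
    return [f"mcp__{server_key}__{name}" for name in tool_names]


ALLOWED = mcp_tool_names(
    "ucc",  # must match the key used in mcp_servers={"ucc": ...}
    ["search_filings", "get_filing_details", "calculate_risk_score"],
)
print(ALLOWED)
```

The resulting list can be passed as allowed_tools=ALLOWED, which keeps the server key in a single place if you later rename it.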
"""run_agent.py — the SDK entry point. ~25 lines, no manual loop."""
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage
from tools import ucc_server
OPTIONS = ClaudeAgentOptions(
model="claude-sonnet-4-6",
system_prompt=(
"You are a UCC filing research agent. When given a business name, "
"search thoroughly using name variations including abbreviations and "
"DBAs. Then call calculate_risk_score and write a narrative report "
"citing specific filings."
),
mcp_servers={"ucc": ucc_server},
allowed_tools=[
"mcp__ucc__search_filings",
"mcp__ucc__get_filing_details",
"mcp__ucc__calculate_risk_score",
],
max_turns=8,
)
async def main():
final_text = ""
async for msg in query(
prompt="What is the lien exposure for Acme Corporation?",
options=OPTIONS,
):
if isinstance(msg, AssistantMessage):
for block in msg.content:
if hasattr(block, "text") and block.text:
final_text = block.text
print(final_text)
if __name__ == "__main__":
asyncio.run(main())
// run_agent.ts
import { query } from "@anthropic-ai/claude-agent-sdk";
import { uccServer } from "./tools.js";
const OPTIONS = {
model: "claude-sonnet-4-6",
systemPrompt: "You are a UCC filing research agent. When given a business name, " +
"search thoroughly using name variations including abbreviations and DBAs. " +
"Then call calculate_risk_score and write a narrative report citing specific filings.",
mcpServers: { ucc: uccServer },
allowedTools: [
"mcp__ucc__search_filings",
"mcp__ucc__get_filing_details",
"mcp__ucc__calculate_risk_score",
],
maxTurns: 8,
};
async function main() {
let finalText = "";
for await (const msg of query({
prompt: "What is the lien exposure for Acme Corporation?",
options: OPTIONS,
})) {
if (msg.type === "assistant") {
      // In the TypeScript SDK the content blocks live on msg.message, not on msg itself.
      for (const block of msg.message.content) {
if ("text" in block && block.text) finalText = block.text;
}
}
}
console.log(finalText);
}
main();
Run: python run_agent.py (or npx tsx run_agent.ts)
If the agent reports 0 filings found, it isn't trying name variations. Tighten the system prompt to explicitly say "search using ACME, ACME CORP, ACME CORPORATION, and DBA forms."
- Agent calls no tools at all → check that allowed_tools uses the exact mcp__<server>__<tool> format. The server name is what you passed to mcp_servers={"ucc": ...}.
- Agent runs forever / hits max_turns → the system prompt is too vague. Tell it explicitly when to stop ("once you have searched and scored, write the report and stop").
- RuntimeError: asyncio.run() cannot be called from a running event loop → you're inside a Jupyter notebook. Use await main() instead of asyncio.run(main()).
You just ran a tool-using agent in 25 lines. The SDK's query() handled: turning your prompt into a Messages API call, parsing the tool_use blocks Claude returned, dispatching them to your async tool functions, formatting the tool_results back into Messages format, looping until stop_reason hit end_turn, and yielding messages to your async for as they happened. The M15B equivalent was ~80 lines of loop code.
What & Why: This step is a pencil-down comparison. Open M15B's agent_loop.py next to your new run_agent.py and count the lines that vanished. The point isn't to write more code — it's to recognize what the SDK is replacing so you can defend the choice on the cert exam and in code review.
# M15B agent_loop.py — manual tool-use loop (excerpt)
import json  # needed for json.dumps(result) below
from anthropic import Anthropic
from tools import TOOL_SCHEMAS, dispatch_tool # you wrote both
client = Anthropic()
messages = [{"role": "user", "content": "What is the lien exposure for Acme?"}]
for turn in range(8): # max_turns
resp = client.messages.create(
model="claude-sonnet-4-6",
system="You are a UCC research agent...",
tools=TOOL_SCHEMAS, # hand-written JSON Schema
messages=messages,
max_tokens=2048,
)
messages.append({"role": "assistant", "content": resp.content})
if resp.stop_reason == "end_turn": # you check
break
if resp.stop_reason != "tool_use":
raise RuntimeError(f"unexpected stop: {resp.stop_reason}")
tool_results = []
for block in resp.content: # you parse
if block.type != "tool_use":
continue
try:
result = dispatch_tool(block.name, block.input) # you dispatch
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": json.dumps(result),
})
except Exception as e:
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": f"error: {e}",
"is_error": True,
})
messages.append({"role": "user", "content": tool_results}) # you format
# Print the first text block of the final assistant response
final = next((b.text for b in resp.content if b.type == "text"), "")
print(final)
# M26 run_agent.py — SDK driver (the loop is gone)
import asyncio
from claude_agent_sdk import query, AssistantMessage
from run_agent import OPTIONS # the OPTIONS you built in Step 3
async def main():
final = ""
async for msg in query(
prompt="What is the lien exposure for Acme Corporation?",
options=OPTIONS,
):
if isinstance(msg, AssistantMessage):
for block in msg.content:
if hasattr(block, "text") and block.text:
final = block.text
print(final)
asyncio.run(main())
Run: wc -l agent_loop.py run_agent.py (Windows: (Get-Content run_agent.py).Count)
You should be able to point to three things the SDK took over: (1) stop_reason branching, (2) tool-result formatting into {"type": "tool_result", "tool_use_id": ...} dicts, (3) the messages.append bookkeeping. If you can't, re-read the diff; this is a Domain 1 cert question.
- Don't have M15B's agent_loop.py handy? → the snippet above is faithful to it. Compare against your memory or against the M15B lab's solution/ folder.
- Tempted to mix the two? → don't. Either commit to the SDK or keep the raw loop. Mixing means you maintain both code paths and lose the SDK's hooks/sessions/subagents.
3. Hooks — HookMatcher for Pre/Post Tool Use
When you wire a building, the electrician adds outlets where appliances will plug in. Later, the fire-protection company adds sprinklers and smoke detectors — in the same ceilings, but on a separate system, with their own logic. The sprinklers don't ask the appliances for permission to fire; they listen to a temperature sensor and act.
That's what hooks are. The SDK's tool dispatch is the appliance circuit. Hooks are the sprinkler system layered on top: they fire before a tool runs (PreToolUse) or after it returns (PostToolUse), based on a name pattern, and they decide whether to log it, modify it, or block it.
Concretely: a PreToolUse hook with matcher="*" is a smoke detector wired to every room. A PreToolUse hook with matcher="mcp__ucc__search_filings" is one wired only to the kitchen. A can_use_tool callback is the sprinkler valve — it can hard-deny the tool call before it ever reaches the function.
Hook Lifecycle in Action
Watch a single tool call propagate through the lifecycle. The agent decides to call search_filings; the SDK matches both hooks; PreToolUse fires (logs the call); the tool runs; PostToolUse fires (redacts PII); the result is appended to the message stream.
Step 5: PreToolUse logging hook
What & Why: A PreToolUse hook fires before each tool dispatch. Its three parameters are input_data (a dict containing tool_name and tool_input), tool_use_id (the unique id of this call, matching the tool_use_id in the eventual tool_result), and context (general SDK context). Returning an empty dict {} means "continue without modification." Returning a payload with hookSpecificOutput can deny the call; this module handles denials with the can_use_tool callback in Step 7 instead.
You register hooks via HookMatcher: an object that pairs a matcher pattern (a tool name or "*" for all) with a list of hook functions. Multiple matchers can stack — they fire in registration order.
What to change in run_agent.py: add the new imports + log_tool_call function near the top, then replace your existing OPTIONS = ClaudeAgentOptions(...) block with the version below (same fields as Step 3 plus a new hooks={} kwarg).
# Add these two imports to the top of run_agent.py:
from datetime import datetime, timezone
from claude_agent_sdk import HookMatcher
# Add this hook function above the OPTIONS definition:
async def log_tool_call(input_data, tool_use_id, context):
    # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
    ts = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
name = input_data.get("tool_name")
params = input_data.get("tool_input")
print(f"[{ts}] PRE {name}({params})")
return {} # empty dict = continue without modification
# Replace your existing OPTIONS = ClaudeAgentOptions(...) block with this version.
# Only the new `hooks=` kwarg is different from Step 3.
OPTIONS = ClaudeAgentOptions(
model="claude-sonnet-4-6",
system_prompt=(
"You are a UCC filing research agent. When given a business name, "
"search thoroughly using name variations including abbreviations and "
"DBAs. Then call calculate_risk_score and write a narrative report "
"citing specific filings."
),
mcp_servers={"ucc": ucc_server},
allowed_tools=[
"mcp__ucc__search_filings",
"mcp__ucc__get_filing_details",
"mcp__ucc__calculate_risk_score",
],
hooks={
"PreToolUse": [HookMatcher(matcher="*", hooks=[log_tool_call])],
},
max_turns=8,
)
const logToolCall = async (inputData: any, toolUseId: string, ctx: any) => {
const ts = new Date().toISOString();
console.log(`[${ts}] PRE ${inputData.tool_name}(${JSON.stringify(inputData.tool_input)})`);
return {};
};
const OPTIONS = {
model: "claude-sonnet-4-6",
systemPrompt: "...", // same as Step 3
mcpServers: { ucc: uccServer },
allowedTools: [/* same as Step 3 */],
hooks: {
PreToolUse: [{ matcher: "*", hooks: [logToolCall] }],
},
maxTurns: 8,
};
Run: python run_agent.py — you should now see a [timestamp] PRE line for every tool call.
Expected: one PRE log line per tool call. If you see no log lines but the tools still run, your hook isn't registered. Double-check that hooks={"PreToolUse": [HookMatcher(...)]} is inside ClaudeAgentOptions, not outside.
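If the PRE lines are going to a log aggregator rather than a terminal, a structured variant of the same hook may be easier to work with. The sketch below keeps the three-parameter hook signature described in this step and emits one JSON object per call. The names format_pre_log and log_tool_call_json, and the record's field names, are illustrative choices, not SDK conventions. The last lines exercise the hook with a hand-built payload, so no SDK or API call is needed to try it.

```python
import asyncio
import json
from datetime import datetime, timezone
from typing import Optional


def format_pre_log(input_data: dict, tool_use_id: Optional[str]) -> str:
    """Build one JSON log line describing a pending tool call."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
        "event": "PreToolUse",
        "tool_use_id": tool_use_id,
        "tool_name": input_data.get("tool_name"),
        "tool_input": input_data.get("tool_input"),
    }
    # default=str keeps the logger from crashing on non-JSON values.
    return json.dumps(record, default=str)


async def log_tool_call_json(input_data, tool_use_id, context):
    print(format_pre_log(input_data, tool_use_id))
    return {}  # empty dict = continue without modification


# Local smoke test with a fake payload shaped like the one described above.
fake_input = {
    "tool_name": "mcp__ucc__search_filings",
    "tool_input": {"debtor_name": "ACME"},
}
hook_result = asyncio.run(log_tool_call_json(fake_input, "toolu_demo", None))
```

Keeping the formatting in a plain function separate from the async hook means the log shape can be unit-tested without running an agent.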
Step 6: PostToolUse PII redaction hook (so PII never reaches Claude)
What & Why: A PostToolUse hook fires after a tool returns but before the result is appended to the message stream the agent reads. The hook receives the tool's response in input_data["tool_response"]; whatever you return as tool_response in your output dict is what the agent actually sees. So this hook scrubs PII from what flows back to the agent, which means PII never enters the message history, the trace, or any downstream consumer. (To also redact the stdout log line from Step 5, add the same regex sub inside log_tool_call before the print.)
Realistic UCC filings don't contain SSNs or phone numbers, but the redaction pattern is the same one you'd use in healthcare or finance contexts — so it's worth practicing here.
What to change in run_agent.py: add the imports + redact_pii function near the top (above OPTIONS), then update the hooks={} dict in your existing OPTIONS block to add the new PostToolUse entry.
# Add to the top of run_agent.py:
import re
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")
# Add this hook function above OPTIONS (next to log_tool_call):
async def redact_pii(input_data, tool_use_id, context):
out = input_data.get("tool_response", "")
if isinstance(out, str):
out = SSN_RE.sub("[SSN REDACTED]", out)
out = PHONE_RE.sub("[PHONE REDACTED]", out)
return {"tool_response": out}
# Update the hooks={} dict inside your existing OPTIONS to add PostToolUse:
hooks={
"PreToolUse": [HookMatcher(matcher="*", hooks=[log_tool_call])],
"PostToolUse": [HookMatcher(matcher="*", hooks=[redact_pii])],
},
# (leave the rest of OPTIONS — model, system_prompt, mcp_servers,
# allowed_tools, max_turns — unchanged from Step 5)
const SSN_RE = /\b\d{3}-\d{2}-\d{4}\b/g;
const PHONE_RE = /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g;
const redactPii = async (inputData: any, toolUseId: string, ctx: any) => {
let out = inputData.tool_response;
if (typeof out === "string") {
out = out.replace(SSN_RE, "[SSN REDACTED]")
.replace(PHONE_RE, "[PHONE REDACTED]");
}
return { tool_response: out };
};
const OPTIONS = {
// ... existing fields ...
hooks: {
PreToolUse: [{ matcher: "*", hooks: [logToolCall] }],
PostToolUse: [{ matcher: "*", hooks: [redactPii] }],
},
};
Add "phone: 555-123-4567" to one filing's collateral field and re-run; you should see [PHONE REDACTED] in the agent's output.
- Redaction doesn't show up → the hook only edits string responses. If a tool returns a dict, JSON-stringify it first or extend the hook to walk the dict.
- Logger still prints SSNs → PostToolUse runs after the tool, but the PreToolUse logger from Step 5 prints the input. If PII could appear in tool inputs (e.g., user-supplied SSN), redact inside log_tool_call too.
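The "walk the dict" option from the first item can be sketched as a small recursive function. It reuses the two regexes from this step; redact_value is a hypothetical helper name, and the sample payload is hand-built to resemble the MCP-shaped tool output used in this lab rather than captured from a real run.

```python
import re
from typing import Any

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")


def redact_value(value: Any) -> Any:
    """Return a copy of value with SSNs and phone numbers masked in every string."""
    if isinstance(value, str):
        value = SSN_RE.sub("[SSN REDACTED]", value)
        return PHONE_RE.sub("[PHONE REDACTED]", value)
    if isinstance(value, dict):
        return {key: redact_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [redact_value(item) for item in value]
    return value  # numbers, booleans, None pass through unchanged


sample = {
    "content": [
        {"type": "text", "text": "phone: 555-123-4567, ssn 123-45-6789"},
    ]
}
print(redact_value(sample))
# {'content': [{'type': 'text', 'text': 'phone: [PHONE REDACTED], ssn [SSN REDACTED]'}]}
```

Inside redact_pii, the isinstance(out, str) branch could then be replaced by out = redact_value(out), so string and nested responses take the same path. The function builds new containers instead of mutating its input, which keeps the original tool output available for debugging.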
Step 7: can_use_tool
What & Why: Hooks observe and modify. The can_use_tool callback decides. It is awaited before each tool dispatch (after the matching PreToolUse hook fires) and returns a permission verdict: PermissionResultAllow() to let the call through, PermissionResultDeny(message="...") to refuse it. The denial message is fed back to Claude so it can adapt; for example, it might pick a different tool or reformulate its query.
This is the cert-tested permission primitive. The exam scenarios always involve a "user-controlled input feeds into a tool" situation where you need to gate based on the runtime value, not the tool name alone.
What to change in run_agent.py: add the imports + gate function near the top, then add a new can_use_tool=gate kwarg to your existing OPTIONS (next to hooks=).
# Add to the imports at the top of run_agent.py:
from claude_agent_sdk import PermissionResultAllow, PermissionResultDeny

# Add this gate function above OPTIONS:
async def gate(tool_name, tool_input, context):
    # Block search_filings calls that would scan everything.
    if tool_name == "mcp__ucc__search_filings":
        q = (tool_input.get("debtor_name") or "").strip()
        if len(q) < 3:
            return PermissionResultDeny(
                message=f"Query too broad: '{q}' is < 3 chars. "
                        "Provide a longer name fragment."
            )
    return PermissionResultAllow()

# Add can_use_tool=gate to your existing OPTIONS block:
OPTIONS = ClaudeAgentOptions(
    model="claude-sonnet-4-6",
    system_prompt="...",                  # unchanged from Step 6
    mcp_servers={"ucc": ucc_server},
    allowed_tools=[                       # unchanged from Step 6
        "mcp__ucc__search_filings",
        "mcp__ucc__get_filing_details",
        "mcp__ucc__calculate_risk_score",
    ],
    hooks={                               # unchanged from Step 6
        "PreToolUse": [HookMatcher(matcher="*", hooks=[log_tool_call])],
        "PostToolUse": [HookMatcher(matcher="*", hooks=[redact_pii])],
    },
    can_use_tool=gate,                    # <-- new in Step 7
    max_turns=8,
)
// TypeScript: canUseTool returns a plain verdict object with a `behavior`
// field ("allow" or "deny") rather than instances of result classes.
const gate = async (toolName: string, toolInput: Record<string, unknown>) => {
  if (toolName === "mcp__ucc__search_filings") {
    const q = String(toolInput.debtor_name ?? "").trim();
    if (q.length < 3) {
      return {
        behavior: "deny" as const,
        message: `Query too broad: '${q}' is < 3 chars. ` +
                 "Provide a longer name fragment.",
      };
    }
  }
  // Allow the call and pass the input through unchanged.
  return { behavior: "allow" as const, updatedInput: toolInput };
};
const OPTIONS = { /* ... */ canUseTool: gate };
Run: In run_agent.py, change the prompt= argument inside the async for msg in query(...) call from "What is the lien exposure for Acme Corporation?" to "What about company A?", then rerun python run_agent.py. You should see the agent attempt search_filings({"debtor_name": "A"}), get denied by the gate (the SDK feeds your denial message back as a tool error), then either reformulate the search with a longer fragment or apologize that it can't search on a single character.
can_use_tool=gate is a top-level kwarg on ClaudeAgentOptions (not nested inside hooks).
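The ordering described in this step (the PreToolUse hook fires, then the gate decides, then the tool runs or the denial goes back as a tool error) can be sketched with a toy dispatcher. This is not the SDK's internal code: Allow, Deny, and dispatch are hypothetical stand-ins, written only so the decision path can be exercised without an API key.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Allow:
    """Stand-in for PermissionResultAllow (hypothetical, for illustration)."""


@dataclass
class Deny:
    """Stand-in for PermissionResultDeny (hypothetical, for illustration)."""
    message: str


async def pre_hook(tool_name, tool_input, trace):
    # Instrumentation: records the call, never decides anything.
    trace.append(f"PRE {tool_name}")


async def gate(tool_name, tool_input):
    # Authorization: decides on the runtime value, not just the tool name.
    q = (tool_input.get("debtor_name") or "").strip()
    if tool_name == "search_filings" and len(q) < 3:
        return Deny(f"Query too broad: '{q}' is < 3 chars.")
    return Allow()


async def search_filings(tool_input):
    # Toy tool body; a real tool would query the filings database.
    return f"results for {tool_input['debtor_name']}"


async def dispatch(tool_name, tool_input, trace):
    """Toy dispatcher: hook first, then gate, then the tool itself."""
    await pre_hook(tool_name, tool_input, trace)
    verdict = await gate(tool_name, tool_input)
    if isinstance(verdict, Deny):
        trace.append("DENIED")
        # The denial reason is what the model would see as a tool error.
        return {"is_error": True, "text": verdict.message}
    trace.append("RAN")
    return {"is_error": False, "text": await search_filings(tool_input)}
```

Keeping the gate a small function of (tool_name, tool_input) also makes it easy to unit-test separately from the agent run.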
can_use_tool vs PreToolUse hooks
"Aren't they the same thing?" — Close but not identical. A PreToolUse hook runs and can side-effect (log, modify input), but its return value is mostly informational. can_use_tool is a binary gate: Allow or Deny, with the denial reason fed back to Claude for adaptation. Use hooks for instrumentation; use can_use_tool for authorization decisions.
4. Sessions — Multi-Turn Flows You Control
The SDK does not ship a Session class. The "session" is the message stream returned by query(). Multi-turn flows are something you compose yourself by managing the prompt — either by maintaining a transcript and passing it to the next query() call, or by using the SDK's resume token to continue a prior session id.
Pattern A — transcript composition: keep the running USER/ASSISTANT pairs in a list. On each new turn, build a single prompt string from the transcript plus the new user input, and pass that to query(). The agent doesn't know about prior turns; it sees one big prompt.
Pattern B — resume tokens: the SDK emits a session_id on the final result message of each query. Pass that id back as ClaudeAgentOptions(resume=session_id) on the next call — the SDK reconstructs the message context server-side and you only send the new user turn.
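A rough sketch of Pattern B's bookkeeping, independent of the SDK: remember the session id from the last result message and send only the new turn. ResumeSession, FakeResult, and fake_query are hypothetical names for this example, and the backend is injected so the logic runs offline. With the real SDK, the backend would presumably wrap query() and build ClaudeAgentOptions(resume=...) per call; that wiring is an assumption and is not shown.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Stand-in for the final result message of a query (hypothetical)."""
    session_id: str
    result: str


class ResumeSession:
    """Pattern B: keep the last session id, send only the new user turn."""

    def __init__(self, run_query):
        self.run_query = run_query  # async generator: (prompt, resume) -> messages
        self.session_id = None      # nothing to resume before the first turn

    async def send(self, user_input):
        final = ""
        async for msg in self.run_query(prompt=user_input, resume=self.session_id):
            sid = getattr(msg, "session_id", None)
            if sid:
                # Remember the id so the next call can continue this session.
                self.session_id = sid
                final = getattr(msg, "result", "")
        return final


calls = []  # records what the fake backend received, for inspection


async def fake_query(prompt, resume):
    """Fake backend: echoes the prompt and hands back a stable session id."""
    calls.append((prompt, resume))
    yield FakeResult(session_id=resume or "sess-1", result=f"echo: {prompt}")
```

Note what is absent compared with Pattern A: no transcript list and no prompt stitching. The trade-off is that a fork now needs a second session id rather than a copied list.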
Pattern A is simpler to teach and easier to fork (copy the transcript list and you have a what-if branch). Pattern B is more efficient for long sessions because it doesn't re-send the full history. The cert exam tests both.
session.send() for multi-turn + session.fork() for what-if branches
What & Why: We're building Pattern A because forking is the more interesting cert-tested concept, and forking a transcript is a one-line list(self.transcript) copy. Pattern B forking would require coordinating session ids on the SDK side. Pattern A is enough to demonstrate the concept the exam asks about.
The class wraps three things: a list of USER/ASSISTANT lines (the transcript), a send method that appends a turn and runs query(), and a fork method that returns a new SessionManager with the same options and its own copy of the transcript, so changes in the fork don't contaminate the parent. A shallow copy is enough here because the transcript items are immutable strings.
"""session_manager.py — thin multi-turn wrapper over query()."""
from claude_agent_sdk import query, AssistantMessage


class SessionManager:
    def __init__(self, options):
        self.options = options
        self.transcript: list[str] = []

    async def send(self, user_input: str) -> str:
        # Build a single prompt from prior turns plus the new input.
        prompt = "\n\n".join(self.transcript + [f"USER: {user_input}"])
        final = ""
        async for msg in query(prompt=prompt, options=self.options):
            if isinstance(msg, AssistantMessage):
                for block in msg.content:
                    # Keep the last non-empty text block as the answer.
                    if hasattr(block, "text") and block.text:
                        final = block.text
        # Persist the turn AFTER the agent has answered.
        self.transcript.append(f"USER: {user_input}")
        self.transcript.append(f"ASSISTANT: {final}")
        return final

    def fork(self) -> "SessionManager":
        clone = SessionManager(self.options)
        clone.transcript = list(self.transcript)  # shallow copy; enough for immutable strings
        return clone
// session_manager.ts
import { query } from "@anthropic-ai/claude-agent-sdk";

export class SessionManager {
  options: any;
  transcript: string[] = [];

  constructor(options: any) { this.options = options; }

  async send(userInput: string): Promise<string> {
    const prompt = [...this.transcript, `USER: ${userInput}`].join("\n\n");
    let final = "";
    for await (const msg of query({ prompt, options: this.options })) {
      if (msg.type === "assistant") {
        // In the TypeScript SDK the content blocks live on msg.message.
        for (const block of msg.message.content) {
          if ("text" in block && block.text) final = block.text;
        }
      }
    }
    this.transcript.push(`USER: ${userInput}`);
    this.transcript.push(`ASSISTANT: ${final}`);
    return final;
  }

  fork(): SessionManager {
    const clone = new SessionManager(this.options);
    clone.transcript = [...this.transcript];
    return clone;
  }
}
Try it:
import asyncio
from session_manager import SessionManager
from run_agent import OPTIONS


async def main():
    s = SessionManager(OPTIONS)
    print(await s.send("What is the lien exposure for Acme Corporation?"))
    print(await s.send("Now compare to Pinnacle Industries."))
    branch = s.fork()
    # Hypothetical branch: does NOT contaminate `s`.
    print(await branch.send("What if Acme files a UCC-3 termination on NY-2024-0848?"))
    # Original session continues from turn 2.
    print(await s.send("Summarize the relative risk of Acme vs Pinnacle in one sentence."))


asyncio.run(main())
If the branch's hypothetical shows up in the original session, fork() is sharing the list reference instead of copying. Check for list(self.transcript), not self.transcript.
- NameError: name 'SessionManager' is not defined → you forgot to from session_manager import SessionManager in the driver.
- Cost balloons after 6+ turns → Pattern A re-sends the full transcript every turn. Switch to Pattern B (resume=session_id) for long-lived sessions.
- Fork still shares state → if your transcript items become mutable (dicts, not strings), upgrade list(...) to copy.deepcopy(...).
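The last bullet is easy to see in isolation. The sketch below assumes a transcript whose items are dicts (a hypothetical shape, not the string transcript used by SessionManager) and shows where list(...) stops protecting the parent.

```python
import copy

# Hypothetical transcript whose items are mutable dicts instead of strings.
parent = [{"role": "user", "text": "What is Acme's exposure?"}]

shallow = list(parent)        # new list, but the SAME dict objects inside
deep = copy.deepcopy(parent)  # new list AND new dict objects

# Appending to a fork never touches the parent, shallow or deep.
shallow.append({"role": "user", "text": "branch-only turn"})

# Mutating an existing item is where the two copies differ.
shallow[0]["text"] = "edited in shallow fork"  # leaks into parent
deep[0]["text"] = "edited in deep fork"        # stays in the fork
```

With string items the leak cannot happen, because a string cannot be edited in place. That is why the shallow copy in fork() is sufficient as written.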
Know three session moves: --resume <session-name> for named continuation (Claude Code CLI), fork_session / your own fork() for branched exploration from a shared baseline, and the resume-vs-fresh decision — resume when prior context is mostly valid; start fresh with an injected summary when prior tool results are stale (file changed, ticket closed, db query went out of date). The exam will give you a scenario and ask which to pick.
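The resume-vs-fresh decision can be written down as a small rule. The sketch below is one possible encoding, not an SDK feature: CachedToolResult, choose_session_strategy, and the 50% staleness threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedToolResult:
    tool: str
    fetched_at: float         # when the result entered the conversation
    source_changed_at: float  # last known change to the underlying file/ticket/row


def choose_session_strategy(results, stale_threshold=0.5):
    """Return "resume" or "fresh_with_summary".

    A result counts as stale when its source changed after it was fetched.
    If the stale share reaches the threshold, prior context is no longer
    "mostly valid", so start fresh and inject a summary instead.
    """
    if not results:
        return "resume"
    stale = sum(1 for r in results if r.source_changed_at > r.fetched_at)
    if stale / len(results) >= stale_threshold:
        return "fresh_with_summary"
    return "resume"
```

In an exam scenario the same reasoning applies without code: count how much of what the session "knows" has changed underneath it.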
5. Declarative Subagents — .claude/agents/<name>.md
When you build a large house, you don't hire one carpenter who also does electrical, plumbing, and roofing. You hire specialists. The general contractor (the coordinator) knows what work needs doing and which specialist to call — and importantly, the specialists don't need to know everything about the house. The plumber doesn't need the architectural plans; they need the bathroom diagram and a list of fixtures.
Subagents work the same way. The coordinator is your main agent — it has the conversation history, the user's broader goal, all the messy context. When it needs a focused piece of work done (calculate risk, validate a payment, compose a notification), it delegates to a subagent. The subagent gets a fresh, isolated context window with only the task brief — not the whole conversation.
Concretely in the SDK: a subagent is a markdown file in .claude/agents/<name>.md with frontmatter declaring its name, description, allowed tools, and (optionally) model. The body of the file is the system prompt. The coordinator invokes it by name; the SDK starts a fresh conversation with the subagent's prompt and only the input the coordinator passes in.
Figure: Subagent Invocation Flow
The coordinator has a long conversation history (gray blocks: user goals, prior tool results, RAG hits). It delegates the risk calculation to the risk-analyst subagent, which spins up with a fresh context window (green) containing only the task brief. When done, it returns a structured result; the coordinator continues with that result added to its conversation.
risk-analyst subagent
What & Why: Move the "calculate score and write a one-paragraph risk profile" responsibility out of the coordinator's prompt and into a focused subagent. The subagent has access only to calculate_risk_score, not search_filings or get_filing_details. The coordinator searches and gathers; the subagent scores and reports. This is the cert-favored hub-and-spoke architecture from M14.
Create the directory and file:
mkdir -p .claude/agents
# create .claude/agents/risk-analyst.md with your editor of choice; contents shown below
---
name: risk-analyst
description: Calculates lien exposure and risk scores given an entity name and a list of filings. Returns a structured risk profile.
tools:
  - calculate_risk_score
model: claude-haiku-4-5-20251001
---
You are the Risk Analyst, a focused subagent invoked by the UCC research
coordinator. Your job is narrow:
1. The coordinator hands you an entity name and (optionally) a list of relevant
   filing summaries.
2. Call calculate_risk_score with the entity name.
3. Write a one-paragraph risk profile that includes:
   - The numeric risk_score and the HIGH/MEDIUM/LOW level
   - The two strongest contributing factors (e.g., "active in 4 states", "recent
     UCC-3 continuation just filed")
   - A one-sentence recommendation ("flag for manual review", "monitor monthly",
     "no action needed")
Do NOT search for new filings or fetch additional details. The coordinator
already gathered the data. Stay in your lane.
Then update the coordinator's system prompt to mention the subagent. The SDK looks for .claude/agents/*.md in the current working directory when query() runs and exposes each declared subagent to the coordinator. Make sure you run python run_agent.py from the project root (the same directory that contains the .claude/ folder you just created) — otherwise the subagent won't be discovered.
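Because discovery depends on the working directory, a quick pre-flight check can save a confusing run. The sketch below lists the agent files visible from a given root and reads their frontmatter. It is a debugging aid under stated assumptions: list_agents and parse_frontmatter are hypothetical helpers, the parser handles only `key: value` lines and `- item` lists (not full YAML), and it does not claim to mirror how the SDK itself loads these files.

```python
from pathlib import Path


def parse_frontmatter(text):
    """Parse a small subset of YAML frontmatter: `key: value` and `- item` lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta, key = {}, None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the frontmatter
            break
        if stripped.startswith("- ") and key is not None:
            # List item belonging to the most recent key (e.g. tools:).
            if not isinstance(meta[key], list):
                meta[key] = []
            meta[key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key = key.strip()
            meta[key] = value.strip()
    return meta


def list_agents(root="."):
    """Map agent name -> frontmatter for every .claude/agents/*.md under root."""
    agents = {}
    for path in sorted(Path(root, ".claude", "agents").glob("*.md")):
        meta = parse_frontmatter(path.read_text(encoding="utf-8"))
        agents[meta.get("name", path.stem)] = meta
    return agents
```

Calling list_agents() from the directory where you launch python run_agent.py should show risk-analyst; an empty dict suggests you are in the wrong directory.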
What to change in run_agent.py: replace the existing system_prompt=(...) argument inside your OPTIONS = ClaudeAgentOptions(...) block with the version below. (Leave every other kwarg from Step 7 unchanged.)
# Replace the system_prompt= argument inside OPTIONS with this version:
    system_prompt=(
        "You are the UCC research coordinator. For each user request:\n"
        "  1. Use search_filings (with name variations) and get_filing_details to "
        "gather all relevant filings.\n"
        "  2. Once gathered, delegate to the risk-analyst subagent for scoring and "
        "the one-paragraph profile.\n"
        "  3. Stitch the subagent's profile into your final answer along with "
        "specific filing citations.\n"
        "Do not call calculate_risk_score yourself; that's the subagent's job."
    ),
Run: python run_agent.py. The trace now shows the coordinator calling search_filings + get_filing_details, then invoking the risk-analyst subagent (which itself calls calculate_risk_score), then assembling the final answer. The risk-analyst's reasoning stays in its own context window; the coordinator only sees the structured profile that came back.
The cert exam loves this question: "Does a subagent see the coordinator's prior conversation?" No. The subagent starts with a fresh context window containing only its system prompt (the markdown body) and the input the coordinator passes in. If the subagent needs context (e.g., the user's broader goal, prior tool results), the coordinator must include it explicitly in the task brief. Anti-pattern: assuming "well, the subagent is in the same process, so it has the same memory" — it doesn't.
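The isolation rule can be made concrete with a few lines of plain Python. This models the idea only: build_subagent_context and make_brief are hypothetical helpers, and the message shapes are illustrative rather than the SDK's wire format.

```python
def build_subagent_context(agent_prompt, task_brief):
    """A subagent starts from its own system prompt plus the brief, nothing else."""
    return {
        "system": agent_prompt,
        "messages": [{"role": "user", "content": task_brief}],
    }


def make_brief(entity, filing_summaries, user_goal):
    """The coordinator must copy in whatever the subagent needs to know."""
    lines = [f"Entity: {entity}", f"User goal: {user_goal}", "Filings:"]
    lines += [f"- {summary}" for summary in filing_summaries]
    return "\n".join(lines)


# The coordinator's own history; none of it is passed along automatically.
coordinator_history = [
    {"role": "user", "content": "We are vetting Acme before a loan renewal."},
    {"role": "assistant", "content": "Searching filings for Acme..."},
]

brief = make_brief(
    entity="Acme Corporation",
    filing_summaries=["NY-2024-0848: ACTIVE"],  # illustrative summary text
    user_goal="loan renewal vetting",
)
ctx = build_subagent_context("You are the Risk Analyst.", brief)
```

The subagent's context holds exactly one message. The user's goal reaches it only because make_brief wrote it into the brief.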
6. Putting It Together — The Production Agent Stack
You have all the pieces. A production-ready UCC agent — the kind that ships behind a FastAPI endpoint — is just these pieces stacked. Each layer adds one capability without disturbing the layers below.
The Stack, Built Up Layer by Layer
Watch the layers stack from the bottom (raw tools) up to the wire (HTTP endpoint). Each layer is one thing the SDK lets you add without rewriting the layers below.
What & Why: The prelude ("From ML Model to Agent") ended with a pickled RandomForest at models/risk_clf.pkl, a real classifier trained on UCC features. In M15B the agent reimplemented risk scoring with hand-written rules. Now we wire the actual ML model in as an SDK tool that replaces the heuristic calculate_risk_score, behind the same hooks and inside the same SessionManager from Section 4. This is the agent you'd ship: deterministic ML where deterministic ML belongs (the score), Claude where Claude belongs (the reasoning, the narrative, the multi-turn UX), and one production stack holding it together.
If you didn't keep the prelude's pickle, train it in 30 seconds with python -c "import pickle, numpy as np; from sklearn.ensemble import RandomForestClassifier; X=np.array([[8,4,1,1],[1,1,0,0],[3,2,0,1]]); y=[1,0,1]; pickle.dump(RandomForestClassifier(n_estimators=20, random_state=0).fit(X,y), open('models/risk_clf.pkl','wb'))" after mkdir models.
"""production_agent.py — the full M26 stack in one file.
Wires: @tool functions (incl. the prelude's pickled ML model) +
       create_sdk_mcp_server + ClaudeAgentOptions +
       PreToolUse/PostToolUse hooks + can_use_tool gate +
       SessionManager from Section 4."""
import json, pickle, re
from datetime import datetime

from claude_agent_sdk import (
    tool, create_sdk_mcp_server, query, ClaudeAgentOptions,
    AssistantMessage, HookMatcher,
    PermissionResultAllow, PermissionResultDeny,
)
from mock_data import FILINGS_DB
from session_manager import SessionManager  # from Section 4
# ---- Layer 1: tools (2 from earlier + 1 NEW ML tool from the prelude) -----
# The heuristic calculate_risk_score is intentionally left out.
with open("models/risk_clf.pkl", "rb") as fh:
    RISK_MODEL = pickle.load(fh)


@tool("search_filings",
      "Search UCC filings by debtor name. Supports partial matching.",
      {"debtor_name": str, "state": str})
async def search_filings(args):
    name, state = args["debtor_name"].upper(), args.get("state")
    hits = [f for f in FILINGS_DB
            if name in f["debtor_name"].upper()
            and (not state or f["state"] == state)]
    return {"content": [{"type": "text", "text": json.dumps(hits)}]}


@tool("get_filing_details",
      "Get full details for a specific UCC filing by filing number.",
      {"filing_number": str})
async def get_filing_details(args):
    rec = next((f for f in FILINGS_DB
                if f["filing_number"] == args["filing_number"]),
               {"error": "not found"})
    return {"content": [{"type": "text", "text": json.dumps(rec)}]}


@tool("ml_risk_predict",
      "Run the production RandomForest delinquency model on an entity. "
      "Returns probability + label. Use this INSTEAD of heuristic scoring.",
      {"entity_name": str})
async def ml_risk_predict(args):
    name = args["entity_name"].upper()
    matches = [f for f in FILINGS_DB if name in f["debtor_name"].upper()]
    active = [f for f in matches if f["status"] == "ACTIVE"]
    states = {f["state"] for f in active}
    secured = {f["secured_party"] for f in active}
    # Feature order must match training:
    # [active_filings, n_states, n_secured_parties, n_continuations]
    features = [[len(active), len(states), len(secured),
                 sum(1 for f in active if "UCC3" in f["filing_type"])]]
    proba = float(RISK_MODEL.predict_proba(features)[0][1])
    return {"content": [{"type": "text", "text": json.dumps({
        "entity_name": args["entity_name"],
        "probability_high_risk": round(proba, 3),
        "label": "HIGH" if proba > 0.66 else "MEDIUM" if proba > 0.33 else "LOW",
        "model": "RandomForest v1 (prelude)",
        "features_used": ["active_filings", "n_states",
                          "n_secured_parties", "n_continuations"],
    })}]}


# ---- Layer 2: in-process MCP server that exposes the tools ----------------
ucc_server = create_sdk_mcp_server(
    name="ucc_tools", version="2.0.0",
    tools=[search_filings, get_filing_details, ml_risk_predict])
# ---- Layer 4: hooks + permission gate (Steps 5, 6, 7) ---------------------
# Defined before Layer 3 because OPTIONS references these functions.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")


async def log_tool_call(input_data, tool_use_id, ctx):
    print(f"[{datetime.utcnow().isoformat()}Z] PRE "
          f"{input_data.get('tool_name')}({input_data.get('tool_input')})")
    return {}


async def redact_pii(input_data, tool_use_id, ctx):
    out = input_data.get("tool_response", "")
    if isinstance(out, str):
        out = SSN_RE.sub("[SSN REDACTED]", out)
        out = PHONE_RE.sub("[PHONE REDACTED]", out)
    return {"tool_response": out}


async def gate(tool_name, tool_input, ctx):
    if tool_name == "mcp__ucc__search_filings":
        q = (tool_input.get("debtor_name") or "").strip()
        if len(q) < 3:
            return PermissionResultDeny(message=f"Query '{q}' too broad.")
    return PermissionResultAllow()
# ---- Layer 3: options ----------------------------------------------------
OPTIONS = ClaudeAgentOptions(
    model="claude-sonnet-4-6",
    system_prompt=(
        "You are the production UCC research coordinator. For each question:\n"
        "  1. Search filings using name variations (incl. abbreviations + DBAs).\n"
        "  2. Call ml_risk_predict for the canonical entity name. This is "
        "the production model; do not invent a heuristic.\n"
        "  3. Write a narrative report citing specific filings and the model's "
        "probability + label."
    ),
    mcp_servers={"ucc": ucc_server},
    allowed_tools=["mcp__ucc__search_filings",
                   "mcp__ucc__get_filing_details",
                   "mcp__ucc__ml_risk_predict"],
    hooks={"PreToolUse": [HookMatcher(matcher="*", hooks=[log_tool_call])],
           "PostToolUse": [HookMatcher(matcher="*", hooks=[redact_pii])]},
    can_use_tool=gate,
    max_turns=10,
)


# ---- Layer 5: a session you can fork -------------------------------------
def new_session() -> SessionManager:
    return SessionManager(OPTIONS)
"""driver.py — exercises the full stack: multi-turn + fork + ML tool."""
import asyncio
from production_agent import new_session


async def main():
    s = new_session()
    print("--- Turn 1 ---")
    print(await s.send("What is the lien exposure for Acme Corporation?"))
    print("--- Turn 2 (follow-up) ---")
    print(await s.send("Compare to Pinnacle Industries."))
    print("--- Fork: what-if Acme files a UCC-3 termination on NY-2024-0848? ---")
    branch = s.fork()
    print(await branch.send("What if Acme files a UCC-3 termination on NY-2024-0848?"))
    print("--- Original session continues ---")
    print(await s.send("One-sentence summary of relative risk."))


asyncio.run(main())
Run: python driver.py
Verify in the output: ml_risk_predict firing with a probability between 0 and 1; the fork's hypothetical absent from the original session's final summary; and Claude using the model's number in its narrative rather than inventing one. If Claude calls calculate_risk_score instead, you forgot to remove it from allowed_tools. The heuristic and the ML model shouldn't both be available, or the agent will pick the easier one.
- FileNotFoundError: models/risk_clf.pkl → train the toy model with the one-liner above, or copy your prelude pickle into models/.
- ValueError: X has N features, but RandomForestClassifier is expecting 4 features as input → the feature vector in ml_risk_predict must match the training shape exactly. The toy model expects 4 features in order: [active_filings, n_states, n_secured_parties, n_continuations].
- The fork contaminates the parent → you're sharing the transcript list reference. Confirm fork() uses list(self.transcript) (see the SessionManager in Section 4).
- Hooks never fire on ml_risk_predict → matcher must be "*" or include the full MCP-prefixed name mcp__ucc__ml_risk_predict.
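The feature-shape error in the list above is cheap to catch before the model is called. One way to do it, sketched here with hypothetical names (FEATURE_ORDER, build_feature_row), is to build the row from named values and compare its length with the n_features_in_ attribute that fitted scikit-learn estimators expose. StubModel stands in for the pickled classifier so the example runs without scikit-learn.

```python
# Order the toy model was trained with, per this module's troubleshooting notes.
FEATURE_ORDER = ["active_filings", "n_states", "n_secured_parties", "n_continuations"]


def build_feature_row(features, model):
    """Order named features for the model and fail early on a shape mismatch."""
    missing = [name for name in FEATURE_ORDER if name not in features]
    if missing:
        raise KeyError(f"missing features: {missing}")
    row = [features[name] for name in FEATURE_ORDER]
    # Fitted sklearn estimators record the training width in n_features_in_.
    expected = getattr(model, "n_features_in_", len(row))
    if expected != len(row):
        raise ValueError(f"model expects {expected} features, got {len(row)}")
    return [row]  # 2-D shape: one sample, len(row) features


class StubModel:
    """Stand-in for the pickled classifier (hypothetical, for illustration)."""
    n_features_in_ = 4
```

Inside ml_risk_predict, the hand-built features list could be replaced by build_feature_row({...}, RISK_MODEL), which turns a silent ordering mistake into a named KeyError.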
You wired five layers into one agent: a pickled scikit-learn model exposed as a tool, the SDK's query() running the loop, two hooks observing every call, a permission gate refusing broad queries, and a session that supports forked what-if exploration. The ML model from the prelude didn't get replaced — it became one tool among several that Claude calls when it needs a deterministic number. That's the production pattern: classical ML stays where it's strongest (calibrated probabilities); Claude orchestrates, narrates, and handles the long tail.
The M15B reference agent is ~200 lines of Python you wrote by hand: 80 for the loop, 40 for tool dispatch, 30 for hook plumbing (when added), 20 for session management, 30 for tool definitions. The same agent in M26 is ~80 lines: 50 for the three @tool functions, 15 for ClaudeAgentOptions, 5 for query() driver, 10 for hooks. That's 60% less code for the same behavior, plus everything you removed (the loop, the dispatch, the bookkeeping) is now battle-tested SDK code instead of code you have to debug at 2 AM.
Adding a fourth tool? One @tool function, one entry in allowed_tools, done. In M15B that was a tools.json edit, a dispatcher case, and a re-run. The SDK isn't magic — it's just less typing for the parts that don't reward thought.
7. When to Leave the SDK
The SDK is the default. Most agent code — including everything in this course's Tier 3 modules and capstones 1–5 / 7 — should reach for claude-agent-sdk first. But there are real cases where you want the raw loop. Use this table to decide:
| Scenario | Raw messages.create() loop | claude-agent-sdk |
|---|---|---|
| Standard tool-using agent (1–10 tools) | overkill | default |
| Hooks for guardrails / observability | manual sprinkles in the loop | HookMatcher |
| Per-call permission decisions | manual if-statements | can_use_tool |
| Subagents | manual coordinator orchestration | .claude/agents/ |
| Custom mid-loop streaming (e.g., emit tokens to a SSE stream while tools run in parallel) | yes — you need raw control over the message generator | can't express |
| Non-standard parallel-tool aggregation (e.g., race two tools and take whichever returns first) | yes — loop logic is custom | SDK serializes |
| Instrumentation that doesn't fit the hook model (e.g., wrap every API call with a vendor APM library) | yes — you control every call site | partial — hooks see tool calls, not API calls |
| Cert exam coverage | must understand | must master |
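To make the "race two tools" row concrete, here is a minimal asyncio sketch of the aggregation you would write in a raw loop. The two lookup coroutines are hypothetical stand-ins for tool calls, and this shows only the racing logic, not a full message loop.

```python
import asyncio


async def race(*coros):
    """Run coroutines concurrently, return the first result, cancel the rest."""
    tasks = [asyncio.create_task(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Let the cancelled tasks unwind so no warnings are left behind.
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()


async def slow_lookup():
    # Stand-in for a slow tool, e.g. a remote registry search.
    await asyncio.sleep(0.2)
    return "slow source"


async def fast_lookup():
    # Stand-in for a fast tool, e.g. a local cache hit.
    await asyncio.sleep(0.01)
    return "fast source"
```

asyncio.run(race(slow_lookup(), fast_lookup())) returns the fast result and cancels the slow call. In a raw loop you would then send that single result back to the model as the tool result.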
Module Summary — What You Built
- Imports: from claude_agent_sdk import query, tool, create_sdk_mcp_server, ClaudeAgentOptions, AssistantMessage, HookMatcher, PermissionResultAllow, PermissionResultDeny
- Tools: @tool("name", "desc", {schema}) on async functions returning {"content": [{"type": "text", "text": ...}]}
- MCP server: create_sdk_mcp_server(name, version, tools=[...])
- Options: ClaudeAgentOptions(system_prompt, mcp_servers, allowed_tools, hooks, can_use_tool, max_turns, model, resume?)
- Driver: async for msg in query(prompt=..., options=OPTIONS): if isinstance(msg, AssistantMessage): ...
- Hooks: HookMatcher(matcher="tool_name_or_*", hooks=[async_fn]) for PreToolUse / PostToolUse
- Permission: can_use_tool=async_callback returning Allow/Deny
- Subagents: .claude/agents/<name>.md with frontmatter name / description / tools / model