Capstone 2 — Domain C: UCC Regulatory Reference Agent
Build a RAG-powered agent that answers UCC Article 9 questions with cited references from legal guides, state filing handbooks, and collateral classification documents.
Project Brief
Junior analysts and paralegals at commercial lending firms spend hours searching through UCC Article 9 reference materials, state filing handbooks, and collateral classification guides to answer routine questions. "How do I perfect a security interest in inventory in Texas?" requires cross-referencing three different documents: the UCC Article 9 text (for the general rule), the Texas filing handbook (for state-specific procedures), and the collateral classification guide (for what counts as "inventory").
The pain is not just time — it is accuracy. Legal reference materials use specialized terminology, cross-reference numbered sections, and have jurisdiction-specific exceptions. A junior analyst might find the right general rule but miss the Texas-specific filing fee or confuse "inventory" with "equipment" (a critical legal distinction that affects lien priority).
Your agent solves this by ingesting a corpus of UCC reference documents, building a RAG pipeline, and answering questions with accurate, cited references. (RAG, or Retrieval-Augmented Generation, is a pattern where the agent searches a knowledge base for relevant information, then uses Claude to synthesize an answer from the retrieved content. The agent answers from your documents, not its training data, and can cite specific sources.) The analyst asks a natural language question, the agent retrieves relevant document sections, and Claude synthesizes a clear answer with specific section citations.
A RAG agent with a search_ucc_knowledge_base tool that:
- Ingests 3 document types: UCC Article 9 guide, state filing handbooks, and a collateral classification guide
- Chunks documents with section-aware boundaries (preserving legal section IDs)
- Generates embeddings (dense vector representations of text that capture semantic meaning, so a chunk can match a question even without shared keywords) and stores them in a vector database (a database that finds the stored vectors most similar to a query vector instead of running exact-match queries like SQL; ChromaDB, Pinecone, and pgvector are common options)
- Retrieves relevant chunks at query time with optional filters (doc type, state, section ID)
- Synthesizes answers with section-level citations: references that trace each factual claim in the agent's answer back to a specific document section, for example "Filing is required (UCC 9-310)." Grounding every claim in retrieved source material guards against hallucination.
- Handles out-of-scope queries gracefully (no hallucination)
Skills practiced: RAG pipeline (M09-M10), embeddings, chunking strategies, vector search, citation generation, and out-of-scope detection.
Stretch goal: Add HyDE search (Hypothetical Document Embeddings) for improved retrieval on legal terminology queries. HyDE is a query transformation technique where Claude generates a hypothetical answer to the question, then embeds that answer (instead of the question) for similarity search. It often retrieves more relevant chunks because the hypothetical answer uses the same terminology as the source documents.
Complete M05 (Function Calling), M08 (Conversation Management), M09 (RAG), and M10 (Advanced RAG) before starting this capstone. You should be comfortable defining tools, managing multi-turn conversations, and understanding the retrieve-then-generate pattern.
Environment Setup
Requirements: Python 3.10+ or Node.js 18+. You will also need an Anthropic API key.
# Run each line on its own — works in bash, zsh, cmd.exe, and PowerShell 5.1+
mkdir capstone-2-ucc-rag
cd capstone-2-ucc-rag
python -m venv venv
# macOS/Linux: source venv/bin/activate
# Windows PowerShell: venv\Scripts\Activate.ps1
# Windows cmd.exe: venv\Scripts\activate.bat
# Pin dependencies for reproducibility
echo "anthropic>=0.40.0" > requirements.txt
pip install -r requirements.txt
# API key (use the form for your shell)
# bash/zsh: export ANTHROPIC_API_KEY=your-key-here
# Windows PowerShell: $env:ANTHROPIC_API_KEY = "your-key-here"
# Windows cmd.exe: set ANTHROPIC_API_KEY=your-key-here
# Run each line on its own — works in bash, zsh, cmd.exe, and PowerShell 5.1+
mkdir capstone-2-ucc-rag
cd capstone-2-ucc-rag
npm init -y
npm install @anthropic-ai/sdk@^0.40.0
npm install -D typescript@^5.4 tsx@^4.7 @types/node@^20
# API key (use the form for your shell)
# bash/zsh: export ANTHROPIC_API_KEY=your-key-here
# Windows PowerShell: $env:ANTHROPIC_API_KEY = "your-key-here"
# Windows cmd.exe: set ANTHROPIC_API_KEY=your-key-here
File Structure
capstone-2-ucc-rag/
├── mock_kb.py # Mock knowledge base with UCC chunks
├── ucc_rag_agent.py # RAG agent with tool-use loop
├── mock_kb.ts # TypeScript mock knowledge base
├── ucc_rag_agent.ts # TypeScript RAG agent
└── requirements.txt # Dependencies
Domain Glossary
RAG Pipeline Architecture
Chunking Strategy: Section-Aware Boundaries
Legal documents have a natural structure: numbered sections, subsections, and cross-references. Naive chunking (splitting every 500 characters) destroys this structure. Section-aware chunking preserves legal section boundaries and metadata.
When a user asks "Where do I file in Texas?", the section-aware approach retrieves chunk [TX-FILING-01] with its full context intact. The naive approach might retrieve a fragment that starts mid-sentence and lacks any section reference. Citations become impossible without section metadata — and in legal contexts, citations are not optional. A claim without a citation is worthless to a paralegal.
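The contrast can be sketched in a few lines of Python. This is a minimal illustration, not the capstone's ingestion code: the two-section sample text (including the made-up TX-FILING-02 section), the "[SECTION-ID] Title" heading convention, and the helper names naive_chunks and section_chunks are all assumptions made for the example.

```python
import re

# Hypothetical sample: two handbook sections, each introduced by a
# "[SECTION-ID] Title" heading line (a convention assumed for this sketch).
SAMPLE = (
    "[TX-FILING-01] Where to File in Texas\n"
    "In Texas, UCC financing statements are filed with the Secretary of State.\n"
    "[TX-FILING-02] Filing Fees\n"
    "Fees are charged per page and differ for online and mail filings.\n"
)


def naive_chunks(text: str, size: int = 60) -> list[str]:
    """Fixed-size split: ignores section boundaries entirely."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def section_chunks(text: str) -> list[dict]:
    """One chunk per section, keeping the section ID as metadata."""
    pattern = re.compile(r"^\[(?P<id>[A-Z0-9-]+)\] (?P<title>.+)$", re.MULTILINE)
    matches = list(pattern.finditer(text))
    chunks = []
    for i, m in enumerate(matches):
        # A section runs from the end of its heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chunks.append({
            "section_id": m.group("id"),
            "title": m.group("title"),
            "content": text[m.end():end].strip(),
        })
    return chunks


naive = naive_chunks(SAMPLE)
sections = section_chunks(SAMPLE)
print(len(naive), "naive fragments;", len(sections), "section chunks")
print([c["section_id"] for c in sections])
```

Every section chunk carries a section_id that can be cited, while most of the fixed-size fragments begin mid-sentence with no identifier attached.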
Mock Document Corpus
{
"documents": [
{
"doc_id": "string — unique document identifier",
"title": "string — document title",
"doc_type": "article_9_guide | state_handbook | collateral_guide",
"sections": [
{
"section_id": "string — legal section reference (e.g., 9-301)",
"title": "string — section heading",
"content": "string — full section text"
}
]
}
]
}
// 3 document types form the knowledge base:
// 1. UCC Article 9 Guide (4 sections in this capstone)
// Covers: 9-301 (debtor location/governing law),
// 9-310 (filing required), 9-502 (financing statement
// contents), 9-515 (duration and lapse)
// 2. State Filing Handbooks (2 states: TX, DE)
// Each covers: where to file, fees, online systems,
// processing times
// 3. Collateral Classification Guide (2 categories)
// Covers: inventory, equipment
// (extend to accounts, instruments, chattel paper,
// deposit accounts, investment property, intangibles
// as a stretch goal)
// Total: 8 retrieval-ready chunks across 4 documents
// Each chunk preserves section_id, doc_id, and doc_type
// metadata for accurate citations
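A minimal sketch of how a corpus in this shape might be flattened into retrieval-ready chunks, one per section. The helper name flatten_corpus, the chunk ID format, the document titles, and the abbreviated section text are assumptions for illustration; the capstone's own mock_kb.py defines its chunks by hand instead.

```python
# Hypothetical two-document corpus following the schema above.
# Section text is abbreviated placeholder content, not the full corpus.
CORPUS = {
    "documents": [
        {
            "doc_id": "UCC-ART9-GUIDE",
            "title": "UCC Article 9 Guide",
            "doc_type": "article_9_guide",
            "sections": [
                {"section_id": "9-310", "title": "Filing Required to Perfect",
                 "content": "A financing statement must be filed to perfect..."},
                {"section_id": "9-515", "title": "Duration and Lapse",
                 "content": "A filed financing statement is effective for 5 years..."},
            ],
        },
        {
            "doc_id": "STATE-GUIDE-TX",
            "title": "Texas Filing Handbook",
            "doc_type": "state_handbook",
            "sections": [
                {"section_id": "TX-FILING-01", "title": "Where to File in Texas",
                 "content": "In Texas, UCC financing statements are filed with..."},
            ],
        },
    ]
}


def flatten_corpus(corpus: dict) -> list[dict]:
    """Emit one chunk per section, copying document-level metadata down."""
    chunks = []
    for doc in corpus["documents"]:
        for section in doc["sections"]:
            chunks.append({
                "chunk_id": f"c{len(chunks) + 1:03d}",
                "doc_id": doc["doc_id"],
                "doc_type": doc["doc_type"],
                "section_id": section["section_id"],
                "title": section["title"],
                # Prepend doc and section titles so the chunk carries its context.
                "text": f"{doc['title']} > {section['title']}\n{section['content']}",
            })
    return chunks


for chunk in flatten_corpus(CORPUS):
    print(chunk["chunk_id"], chunk["doc_type"], chunk["section_id"])
```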
Implementation Phases
- Phase 1: Document Ingestion (45 min). Define the mock corpus as Python dataclass / TypeScript const literals. Parse each document into sections, preserving section_id, doc_id, and doc_type metadata. (Production: load from JSON or a CMS.)
- Phase 2: Section-Aware Chunking (30 min). Chunk by section boundary (each section = one chunk). Add overlap by prepending the section title and parent doc title. Preserve metadata per chunk.
- Phase 3: Build a Keyword-Overlap Index (30 min). Score each chunk by the fraction of query terms it contains. This is a degenerate BM25 (no TF-IDF weighting) that keeps the capstone API-free. Production swap-in: generate Voyage AI / OpenAI embeddings and store them in ChromaDB, keeping the same retrieval interface.
- Phase 4: Retrieval Tool (30 min). Build search_ucc_knowledge_base, which scores the in-memory chunks with optional filters (doc_type, state, section_id). Return top-K results with similarity scores.
- Phase 5: RAG Agent (45 min). Wire the retrieval tool to Claude with a system prompt that instructs citation of section_id for every claim. Handle out-of-scope queries.
- Phase 6: Testing (30 min). Run 10 test cases. Verify citation accuracy: every factual claim must trace to a retrieved chunk.
- Phase 7: [OPTIONAL] Stretch. Add HyDE search: generate a hypothetical answer, embed it, and use that embedding for retrieval instead of the raw question.
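The optional HyDE phase can be sketched without committing to a particular embedding provider. In this minimal sketch both the generator and the embedder are injected callables, and the stand-ins used here (fake_generate and a bag-of-words embed) are purely illustrative assumptions: a real build would call Claude for the hypothetical answer and an embedding model for the vectors.

```python
import math
from collections import Counter
from typing import Callable

Vector = Counter  # toy sparse vector: token -> count


def embed(text: str) -> Vector:
    """Toy bag-of-words embedding (assumption: stands in for a real model)."""
    return Counter(text.lower().replace(".", " ").replace(",", " ").split())


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hyde_search(question: str, docs: dict[str, str],
                generate: Callable[[str], str],
                embed_fn: Callable[[str], Vector] = embed) -> list[tuple[str, float]]:
    """Embed a hypothetical answer instead of the raw question, then rank docs."""
    hypothetical = generate(question)
    query_vec = embed_fn(hypothetical)
    ranked = [(doc_id, cosine(query_vec, embed_fn(text))) for doc_id, text in docs.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def fake_generate(question: str) -> str:
    """Stand-in for an LLM call; returns a canned hypothetical answer."""
    return "A continuation statement must be filed before the financing statement lapses."


DOCS = {
    "9-515": "A filed financing statement is effective for 5 years. A continuation "
             "statement must be filed within 6 months before lapse.",
    "CC-EQUIPMENT": "Equipment means goods used or bought for use primarily in a business.",
}

results = hyde_search("How do I keep my lien alive?", DOCS, fake_generate)
print(results[0][0])
```

The question itself shares almost no vocabulary with the 9-515 text; the hypothetical answer does, which is the effect HyDE relies on.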
Step 1: Create the Mock Knowledge Base
What & Why
Before building the RAG agent, you need a searchable knowledge base. This file defines 8 UCC document chunks (from 3 document types: Article 9 guide, state handbooks, and a collateral classification guide) and a keyword-search function that mimics vector similarity search. In production you would replace this with ChromaDB or Pinecone, but the interface stays the same.
Create a new file called mock_kb.py (or mock_kb.ts for Node.js) and paste the following code:
# mock_kb.py — Mock UCC Knowledge Base with simple similarity search
# In production, replace with ChromaDB / Pinecone / pgvector.
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    doc_id: str
    section_id: str
    doc_type: str  # article_9_guide | state_handbook | collateral_guide
    title: str
    content: str
    state: str | None = None  # for state handbooks

# --- Mock corpus chunked by section ---
CHUNKS = [
Chunk("c001", "UCC-ART9-GUIDE", "9-301", "article_9_guide",
"Law Governing Perfection and Priority",
"The law of the jurisdiction where the debtor is located governs "
"perfection of a security interest. For registered organizations "
"(corporations, LLCs), the debtor is located in the state of "
"organization. For individuals, the debtor is located at their "
"principal residence. This means a Delaware LLC's filings are "
"governed by Delaware law, regardless of where the collateral "
"is physically located."),
Chunk("c002", "UCC-ART9-GUIDE", "9-310", "article_9_guide",
"Filing Required to Perfect",
"A financing statement must be filed to perfect a security "
"interest in most types of collateral. Exceptions include: "
"possessory security interests (9-313) where the secured party "
"takes physical possession, control-based perfection for deposit "
"accounts (9-314), and automatic perfection for purchase money "
"security interests in consumer goods (9-309)."),
Chunk("c003", "UCC-ART9-GUIDE", "9-502", "article_9_guide",
"Contents of Financing Statement",
"A financing statement is sufficient if it provides: (1) the "
"name of the debtor, (2) the name of the secured party or "
"representative, and (3) an indication of the collateral. "
"The debtor name must EXACTLY match the name on the state's "
"public organic record (certificate of formation, articles of "
"incorporation). A misspelled debtor name can render the filing "
"seriously misleading and therefore ineffective."),
Chunk("c004", "STATE-GUIDE-TX", "TX-FILING-01", "state_handbook",
"Where to File in Texas",
"In Texas, UCC financing statements are filed with the "
"Secretary of State. Filing can be done online at SOSDirect "
"(direct.sos.state.tx.us), by mail to PO Box 13193 Austin TX 78711, "
"or in person. Filing fee: $15 per page (standard), $5 per page "
"(online via SOSDirect). Processing time: online filings are "
"typically processed same-day; mail filings take 5-7 business days.",
state="TX"),
Chunk("c005", "STATE-GUIDE-DE", "DE-FILING-01", "state_handbook",
"Where to File in Delaware",
"In Delaware, UCC financing statements are filed with the "
"Division of Corporations under the Secretary of State. Online "
"filing available at corp.delaware.gov. Filing fee: $50 for a "
"standard financing statement. Delaware is the most common "
"filing jurisdiction for registered entities because many "
"corporations and LLCs are organized in Delaware.",
state="DE"),
Chunk("c006", "COLLATERAL-GUIDE", "CC-INVENTORY", "collateral_guide",
"Inventory Collateral",
"Inventory includes goods held for sale or lease, raw materials, "
"work in process, and materials used or consumed in a business. "
"Key distinction from Equipment: if the debtor holds the goods "
"for sale to customers, they are inventory; if the debtor uses "
"the goods in its own operations, they are equipment. This "
"classification matters because purchase money security interests "
"in inventory have different priority rules than PMSI in equipment."),
Chunk("c007", "COLLATERAL-GUIDE", "CC-EQUIPMENT", "collateral_guide",
"Equipment Collateral",
"Equipment means goods used or bought for use primarily in a "
"business. Examples: manufacturing machinery, office computers, "
"delivery vehicles. If the debtor holds goods for sale, they are "
"inventory, not equipment. Equipment includes fixtures (goods "
"that become part of real property) but fixture filings have "
"special rules under 9-334."),
Chunk("c008", "UCC-ART9-GUIDE", "9-515", "article_9_guide",
"Duration and Lapse of Financing Statement",
"A filed financing statement is effective for 5 years after the "
"date of filing. To continue effectiveness, a continuation "
"statement (UCC-3) must be filed within 6 months before lapse. "
"If the continuation is not filed, the financing statement "
"lapses and the security interest becomes unperfected. An "
"unperfected security interest loses priority to subsequent "
"perfected interests and to a trustee in bankruptcy."),
]

def search_ucc_knowledge_base(
    query: str,
    top_k: int = 5,
    filters: dict | None = None,
) -> dict:
    """
    Search the mock UCC knowledge base using keyword matching.
    In production, replace with vector similarity search.
    """
    try:
        if not query or not query.strip():
            return {"is_error": True, "error_category": "EMPTY_QUERY",
                    "is_retryable": False, "context": "Query cannot be empty."}

        query_lower = query.lower()
        query_terms = query_lower.split()

        # Filter chunks
        candidates = CHUNKS
        if filters:
            if filters.get("doc_type"):
                candidates = [c for c in candidates if c.doc_type == filters["doc_type"]]
            if filters.get("state"):
                candidates = [c for c in candidates if c.state == filters["state"].upper()]
            if filters.get("section_id"):
                candidates = [c for c in candidates if c.section_id == filters["section_id"]]

        # Score by keyword overlap (mock similarity)
        scored = []
        for chunk in candidates:
            text = (chunk.title + " " + chunk.content).lower()
            score = sum(1 for term in query_terms if term in text) / max(len(query_terms), 1)
            if score > 0:
                scored.append((chunk, round(score, 3)))

        scored.sort(key=lambda x: x[1], reverse=True)
        top = scored[:top_k]

        results = []
        for chunk, score in top:
            results.append({
                "chunk_id": chunk.chunk_id,
                "doc_id": chunk.doc_id,
                "section_id": chunk.section_id,
                "title": chunk.title,
                "content": chunk.content,
                "similarity_score": score,
                "metadata": {
                    "doc_type": chunk.doc_type,
                    "state": chunk.state,
                },
            })
        return {"is_error": False, "results": results, "total": len(results)}
    except Exception as e:
        return {"is_error": True, "error_category": "INTERNAL_ERROR",
                "is_retryable": True, "context": str(e)}
// mock_kb.ts — Mock UCC Knowledge Base with keyword search
interface Chunk {
chunkId: string; docId: string; sectionId: string;
docType: string; title: string; content: string; state?: string;
}
const CHUNKS: Chunk[] = [
{ chunkId: "c001", docId: "UCC-ART9-GUIDE", sectionId: "9-301",
docType: "article_9_guide", title: "Law Governing Perfection and Priority",
content: "The law of the jurisdiction where the debtor is located governs perfection of a security interest. For registered organizations (corporations, LLCs), the debtor is located in the state of organization. For individuals, the debtor is located at their principal residence. This means a Delaware LLC's filings are governed by Delaware law, regardless of where the collateral is physically located." },
{ chunkId: "c002", docId: "UCC-ART9-GUIDE", sectionId: "9-310",
docType: "article_9_guide", title: "Filing Required to Perfect",
content: "A financing statement must be filed to perfect a security interest in most types of collateral. Exceptions include: possessory security interests (9-313), control-based perfection for deposit accounts (9-314), and automatic perfection for purchase money security interests in consumer goods (9-309)." },
{ chunkId: "c003", docId: "UCC-ART9-GUIDE", sectionId: "9-502",
docType: "article_9_guide", title: "Contents of Financing Statement",
content: "A financing statement is sufficient if it provides: (1) the name of the debtor, (2) the name of the secured party or representative, and (3) an indication of the collateral. The debtor name must EXACTLY match the name on the state's public organic record. A misspelled debtor name can render the filing seriously misleading and therefore ineffective." },
{ chunkId: "c004", docId: "STATE-GUIDE-TX", sectionId: "TX-FILING-01",
docType: "state_handbook", title: "Where to File in Texas",
content: "In Texas, UCC financing statements are filed with the Secretary of State. Filing can be done online at SOSDirect, by mail, or in person. Filing fee: $15 per page (standard), $5 per page (online). Processing: online same-day; mail 5-7 business days.",
state: "TX" },
{ chunkId: "c005", docId: "STATE-GUIDE-DE", sectionId: "DE-FILING-01",
docType: "state_handbook", title: "Where to File in Delaware",
content: "In Delaware, UCC financing statements are filed with the Division of Corporations. Online filing at corp.delaware.gov. Fee: $50 standard. Delaware is the most common jurisdiction because many entities are organized there.",
state: "DE" },
{ chunkId: "c006", docId: "COLLATERAL-GUIDE", sectionId: "CC-INVENTORY",
docType: "collateral_guide", title: "Inventory Collateral",
content: "Inventory includes goods held for sale or lease, raw materials, work in process, and materials used or consumed in a business. Key distinction from Equipment: if held for sale, it's inventory; if used in operations, it's equipment. PMSI in inventory has different priority rules than PMSI in equipment." },
{ chunkId: "c007", docId: "COLLATERAL-GUIDE", sectionId: "CC-EQUIPMENT",
docType: "collateral_guide", title: "Equipment Collateral",
content: "Equipment means goods used or bought for use primarily in a business. Examples: machinery, computers, vehicles. If held for sale = inventory, not equipment. Includes fixtures with special rules under 9-334." },
{ chunkId: "c008", docId: "UCC-ART9-GUIDE", sectionId: "9-515",
docType: "article_9_guide", title: "Duration and Lapse",
content: "A financing statement is effective for 5 years. Continuation (UCC-3) must be filed within 6 months before lapse. If not continued, the filing lapses and the security interest becomes unperfected, losing priority." },
];
interface SearchResult { is_error: boolean; [key: string]: unknown; }
export function searchUccKnowledgeBase(
query: string, topK = 5, filters?: { doc_type?: string; state?: string; section_id?: string },
): SearchResult {
try {
if (!query?.trim()) return { is_error: true, error_category: "EMPTY_QUERY", is_retryable: false, context: "Query cannot be empty." };
const terms = query.toLowerCase().split(/\s+/);
let candidates = [...CHUNKS];
if (filters?.doc_type) candidates = candidates.filter(c => c.docType === filters.doc_type);
if (filters?.state) candidates = candidates.filter(c => c.state === filters.state.toUpperCase());
if (filters?.section_id) candidates = candidates.filter(c => c.sectionId === filters.section_id);
const scored = candidates.map(c => {
const text = (c.title + " " + c.content).toLowerCase();
const score = terms.filter(t => text.includes(t)).length / Math.max(terms.length, 1);
return { chunk: c, score: Math.round(score * 1000) / 1000 };
}).filter(s => s.score > 0).sort((a, b) => b.score - a.score).slice(0, topK);
return { is_error: false, total: scored.length, results: scored.map(s => ({
chunk_id: s.chunk.chunkId, doc_id: s.chunk.docId, section_id: s.chunk.sectionId,
title: s.chunk.title, content: s.chunk.content, similarity_score: s.score,
metadata: { doc_type: s.chunk.docType, state: s.chunk.state ?? null },
})) };
} catch (error) {
return { is_error: true, error_category: "INTERNAL_ERROR", is_retryable: true, context: String(error) };
}
}
Run Command
# Quick sanity test — run from the project directory:
python -c "from mock_kb import search_ucc_knowledge_base; print(search_ucc_knowledge_base('filing Texas'))"
# Quick sanity test:
npx tsx -e "import {searchUccKnowledgeBase} from './mock_kb.ts'; console.log(searchUccKnowledgeBase('filing Texas'))"
You built a mock knowledge base with 8 section-aware chunks from 3 document types. Each chunk preserves its section_id, doc_id, and doc_type as metadata. The search function uses keyword overlap as a mock for vector similarity (in production, you would use ChromaDB with real embeddings). Filters let the agent narrow results by document type or state. The key design decision: one section = one chunk, with metadata that enables accurate citations.
- ModuleNotFoundError: No module named 'mock_kb'. Make sure you are running the command from the same directory that contains mock_kb.py.
- TypeError on str | None (unsupported operand type(s) for |). You need Python 3.10+. Check with python --version. If you are on 3.9, change str | None to Optional[str] and dict | None to Optional[dict], and add from typing import Optional.
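The keyword-overlap score from Step 1 is easy to check by hand. The sketch below re-implements just the overlap fraction on a shortened, paraphrased copy of the Texas chunk; overlap_score and normalize are hypothetical helper names, and the punctuation stripping is a suggested refinement rather than part of the capstone code.

```python
import re

# Shortened, lowercased paraphrase of the Texas chunk (title + content).
TEXT = (
    "where to file in texas. in texas, ucc financing statements are filed "
    "with the secretary of state. filing can be done online."
)


def overlap_score(query: str, text: str) -> float:
    """Fraction of whitespace-separated query terms found as substrings of text."""
    terms = query.lower().split()
    return round(sum(1 for t in terms if t in text) / max(len(terms), 1), 3)


def normalize(query: str) -> str:
    """Assumed refinement: drop punctuation so 'Texas?' can match 'texas'."""
    return " ".join(re.findall(r"[a-z0-9-]+", query.lower()))


raw = overlap_score("filing Texas?", TEXT)
clean = overlap_score(normalize("filing Texas?"), TEXT)
print(raw, clean)
```

With a plain whitespace split, the term "texas?" keeps its question mark and fails to match, so only one of the two terms counts. Because matching is by substring, short terms such as "in" also match inside longer words like "financing", which is one reason this scorer is only a stand-in for real similarity search.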
Step 2: Create the RAG Agent
What & Why
This file wires Claude to your knowledge base using the tool-use pattern from M05. The agent receives a user question, calls search_ucc_knowledge_base to retrieve relevant chunks, then synthesizes an answer with inline citations. The system prompt enforces strict citation rules: every factual claim must reference a specific section ID.
Create a new file called ucc_rag_agent.py (or ucc_rag_agent.ts for Node.js) and paste the following code:
# ucc_rag_agent.py — RAG-Powered UCC Reference Agent
import anthropic
import json
from mock_kb import search_ucc_knowledge_base
SYSTEM_PROMPT = """You are a UCC Article 9 Reference Agent for commercial
lending professionals. You answer questions about UCC filing requirements,
collateral classifications, perfection rules, and state-specific procedures.
You have access to a knowledge base of UCC reference materials via the
search_ucc_knowledge_base tool. ALWAYS search before answering.
CITATION RULES (critical):
- Every factual claim MUST cite a specific section: (Section 9-310)
- If a claim comes from a state handbook, cite it: (TX-FILING-01)
- If you cannot find relevant information, say so explicitly.
Do NOT guess or make up legal information.
- You provide REFERENCE INFORMATION only. You do NOT provide legal advice.
- Always recommend consulting counsel for specific legal questions.
RESPONSE FORMAT:
- Use clear headings and bullet points for multi-part answers
- Cite sources inline: "Filing is required to perfect (Section 9-310)"
- End with a "Sources" list showing all cited sections
"""
TOOLS = [
{
"name": "search_ucc_knowledge_base",
"description": (
"Search the UCC reference knowledge base for information about "
"Article 9 filing requirements, collateral classifications, "
"perfection rules, and state-specific procedures. Returns "
"relevant document sections with citation references."
),
"input_schema": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Natural language search query",
},
"top_k": {
"type": "integer",
"description": "Number of results to return (default 5)",
},
"filters": {
"type": "object",
"description": "Optional filters",
"properties": {
"doc_type": {
"type": "string",
"enum": ["article_9_guide", "state_handbook", "collateral_guide"],
},
"state": {"type": "string"},
"section_id": {"type": "string"},
},
},
},
"required": ["query"],
},
},
]
TOOL_HANDLERS = {
"search_ucc_knowledge_base": lambda args: search_ucc_knowledge_base(
query=args["query"],
top_k=args.get("top_k", 5),
filters=args.get("filters"),
),
}

def run_rag_agent(user_message: str, history: list | None = None) -> tuple[str, list]:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY env var
    messages = history or []
    messages.append({"role": "user", "content": user_message})

    # Allow up to 5 model turns (tool calls plus the final answer)
    for _ in range(5):
        try:
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=2048,
                system=SYSTEM_PROMPT,
                tools=TOOLS,
                messages=messages,
            )
        except anthropic.APIError as e:
            return f"Technical issue: {e}", messages

        if response.stop_reason == "tool_use":
            # Run every requested tool and send the results back to Claude
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    handler = TOOL_HANDLERS.get(block.name)
                    result = handler(block.input) if handler else {
                        "is_error": True, "error_category": "UNKNOWN_TOOL",
                    }
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result),
                    })
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        elif response.stop_reason == "end_turn":
            text = " ".join(b.text for b in response.content if hasattr(b, "text"))
            messages.append({"role": "assistant", "content": response.content})
            return text, messages

    return "Unable to complete your request.", messages


if __name__ == "__main__":
    print("UCC Reference Agent - Type 'quit' to exit.\n")
    history = []
    while True:
        q = input("You: ").strip()
        if q.lower() in ("quit", "exit", "q"):
            break
        response, history = run_rag_agent(q, history)
        print(f"\nAgent: {response}\n")
// ucc_rag_agent.ts — RAG-Powered UCC Reference Agent
import Anthropic from "@anthropic-ai/sdk";
// `tsx` resolves the `.js` ext to `mock_kb.ts` at runtime — keep .js here for ESM compatibility
import { searchUccKnowledgeBase } from "./mock_kb.js";
const SYSTEM_PROMPT = `You are a UCC Article 9 Reference Agent for commercial
lending professionals. You answer questions about UCC filing requirements,
collateral classifications, perfection rules, and state-specific procedures.
CITATION RULES (critical):
- Every factual claim MUST cite a specific section: (Section 9-310)
- If from a state handbook, cite: (TX-FILING-01)
- If you cannot find info, say so. Do NOT guess legal information.
- You provide REFERENCE INFORMATION only. Not legal advice.
- Recommend consulting counsel for specific legal questions.`;
const TOOLS: Anthropic.Tool[] = [{
name: "search_ucc_knowledge_base",
description: "Search UCC reference knowledge base for Article 9 filing, collateral, perfection, and state procedures.",
input_schema: {
type: "object" as const,
properties: {
query: { type: "string", description: "Search query" },
top_k: { type: "integer", description: "Results count (default 5)" },
filters: { type: "object", properties: {
doc_type: { type: "string", enum: ["article_9_guide","state_handbook","collateral_guide"] },
state: { type: "string" }, section_id: { type: "string" },
}},
},
required: ["query"],
},
}];
type ToolArgs = { query: string; top_k?: number; filters?: { doc_type?: string; state?: string; section_id?: string } };
const HANDLERS: Record<string, (a: ToolArgs) => unknown> = {
search_ucc_knowledge_base: (a) => searchUccKnowledgeBase(a.query, a.top_k ?? 5, a.filters),
};
export async function runRagAgent(
msg: string, history: Anthropic.MessageParam[] = [],
): Promise<[string, Anthropic.MessageParam[]]> {
const client = new Anthropic();
const messages = [...history, { role: "user" as const, content: msg }];
for (let i = 0; i < 5; i++) {
let response: Anthropic.Message;
try {
response = await client.messages.create({
model: "claude-sonnet-4-6", max_tokens: 2048,
system: SYSTEM_PROMPT, tools: TOOLS, messages,
});
} catch (err) { return [`Technical issue: ${err}`, messages]; }
if (response.stop_reason === "tool_use") {
const results: Anthropic.ToolResultBlockParam[] = [];
for (const b of response.content) {
if (b.type === "tool_use") {
const h = HANDLERS[b.name];
const r = h ? h(b.input as ToolArgs) : { is_error: true, error_category: "UNKNOWN" };
results.push({ type: "tool_result", tool_use_id: b.id, content: JSON.stringify(r) });
}
}
messages.push({ role: "assistant", content: response.content });
messages.push({ role: "user", content: results });
} else if (response.stop_reason === "end_turn") {
const text = response.content
.filter((b): b is Anthropic.TextBlock => b.type === "text")
.map(b => b.text).join(" ");
messages.push({ role: "assistant", content: response.content });
return [text, messages];
}
}
return ["Unable to complete your request.", messages];
}
// --- CLI entry point (matches Python's interactive loop) ---
async function main() {
const readline = await import("readline");
const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
const ask = (q: string): Promise<string> => new Promise(res => rl.question(q, res));
console.log("UCC Reference Agent — Type 'quit' to exit.\n");
let history: Anthropic.MessageParam[] = [];
while (true) {
const q = (await ask("You: ")).trim();
if (["quit", "exit", "q"].includes(q.toLowerCase())) break;
const [response, newHistory] = await runRagAgent(q, history);
history = newHistory;
console.log(`\nAgent: ${response}\n`);
}
rl.close();
}
main().catch(console.error);
Run Command
python ucc_rag_agent.py
npx tsx ucc_rag_agent.ts
You built a RAG agent that: (1) receives a legal question, (2) searches the knowledge base via the tool, (3) receives section-level chunks with metadata, and (4) synthesizes an answer with inline citations. The system prompt enforces citation rules — every claim must reference a section ID. The agent also explicitly states when information is not found, preventing hallucination on legal topics where accuracy is critical.
- ModuleNotFoundError: No module named 'anthropic'. Run pip install anthropic (make sure your virtual environment is activated).
- AuthenticationError. Your ANTHROPIC_API_KEY environment variable is missing or invalid. Run echo $ANTHROPIC_API_KEY to check (Windows cmd.exe: echo %ANTHROPIC_API_KEY%; Windows PowerShell: echo $env:ANTHROPIC_API_KEY).
- ImportError: cannot import name 'search_ucc_knowledge_base' from 'mock_kb'. Ensure both mock_kb.py and ucc_rag_agent.py are in the same directory, and that you are running from that directory.
Step 3: Test the RAG Agent
What & Why
Running the agent interactively lets you verify the full RAG loop: question in, tool call out, chunks retrieved, cited answer generated. Try the sample queries below to confirm the agent cites specific section IDs and handles out-of-scope questions gracefully.
Start the agent (if not already running):
python ucc_rag_agent.py
npx tsx ucc_rag_agent.ts
Then try these sample queries:
- How do I perfect a security interest in inventory in Texas?
- What's the difference between inventory and equipment?
- Where do I file a UCC-1 in Delaware?
- How do I perfect a security interest in cryptocurrency? (out-of-scope test)
If the agent returned an answer with inline citations like (Section 9-310) and (TX-FILING-01), your RAG pipeline is working end-to-end. The agent searched the knowledge base, retrieved relevant chunks, and synthesized a cited answer. Try the out-of-scope query (cryptocurrency) to verify the agent says "I don't have information about that" instead of hallucinating.
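One way to make the out-of-scope path more dependable is to flag weak retrieval before the model sees it. This is a sketch of an optional guard, not something the capstone code does: annotate_low_confidence, the min_score threshold of 0.5, and the low_confidence and note fields are all assumptions, and the threshold would need tuning against your own test queries.

```python
def annotate_low_confidence(search_result: dict, min_score: float = 0.5) -> dict:
    """Attach a note when retrieval looks too weak to support an answer.

    Expects the dict shape returned by search_ucc_knowledge_base:
    {"is_error": bool, "results": [{"similarity_score": float, ...}], "total": int}
    """
    if search_result.get("is_error"):
        return search_result  # pass errors through untouched

    results = search_result.get("results", [])
    best = max((r["similarity_score"] for r in results), default=0.0)
    annotated = dict(search_result)  # shallow copy; the input is not modified
    annotated["low_confidence"] = best < min_score
    if annotated["low_confidence"]:
        annotated["note"] = (
            "No strongly matching sections were found. Tell the user the "
            "knowledge base does not cover this topic instead of guessing."
        )
    return annotated


# Illustrative inputs (scores are made up for the example).
weak = {"is_error": False, "total": 1,
        "results": [{"section_id": "9-310", "similarity_score": 0.25}]}
strong = {"is_error": False, "total": 1,
          "results": [{"section_id": "TX-FILING-01", "similarity_score": 0.8}]}

print(annotate_low_confidence(weak)["low_confidence"])
print(annotate_low_confidence(strong)["low_confidence"])
```

A guard like this could wrap the tool handler so that the note travels back to Claude inside the tool result, alongside whatever chunks were found.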
Testing Guide
| Type | Input | Expected Behavior |
|---|---|---|
| Happy | "How do I perfect a security interest in inventory in Texas?" | Cites 9-310, 9-301, TX-FILING-01, CC-INVENTORY |
| Happy | "What's the difference between inventory and equipment?" | Cites CC-INVENTORY and CC-EQUIPMENT with clear distinction |
| Happy | "Where do I file a UCC-1 in Delaware?" | Cites DE-FILING-01 with fees and website |
| Happy | "Which state's law governs perfection for a Delaware LLC?" | Cites 9-301 and explains debtor location rule |
| Happy | "What are the exceptions to the filing requirement?" | Cites 9-310, lists possessory (9-313), control (9-314), auto-perfection (9-309) |
| Edge | "How do I perfect a security interest in cryptocurrency?" | Acknowledges gap in corpus, suggests consulting counsel |
| Edge | "What about UCC Article 2?" | Notes knowledge base covers Article 9 only |
| Edge | Highly technical legal jargon query | Retrieves relevant chunks, translates to plain English |
| Adversarial | "Draft a UCC-1 financing statement for me" | Explains it provides reference info, cannot draft legal documents |
| Adversarial | "Is this filing valid?" (no context) | Asks for specific filing details before analysis |
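Checking that every citation traces back to a retrieved chunk can be partly automated. The sketch below is one possible approach under stated assumptions: citations appear in parentheses as in the system prompt's examples, and extract_citations and unsupported_citations are hypothetical helper names. The sample answer and retrieved set are made up for the example.

```python
import re

# Matches "(Section 9-310)", "(UCC 9-310)", "(9-310)" and "(TX-FILING-01)".
CITATION_RE = re.compile(r"\((?:Section |UCC )?([A-Z0-9]+(?:-[A-Z0-9]+)+)\)")


def extract_citations(answer: str) -> set[str]:
    """Collect every parenthesized section ID found in the answer text."""
    return set(CITATION_RE.findall(answer))


def unsupported_citations(answer: str, retrieved_section_ids: set[str]) -> set[str]:
    """Return cited IDs that do not correspond to any retrieved chunk."""
    return extract_citations(answer) - retrieved_section_ids


answer = (
    "Filing is required to perfect (Section 9-310). In Texas, file with the "
    "Secretary of State (TX-FILING-01). Fixture filings differ (Section 9-334)."
)
retrieved = {"9-310", "TX-FILING-01", "CC-INVENTORY"}
print(sorted(unsupported_citations(answer, retrieved)))
```

A non-empty result means the answer cites a section that was never retrieved, which is worth a manual look. The pattern is deliberately loose, so a form name written in parentheses, such as (UCC-3), would also be picked up and may need filtering.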
Retrieval-Then-Generation Flow
This is what happens each time a user asks a question. The agent does not answer from memory — it searches first, retrieves evidence, then generates a cited answer grounded in retrieved chunks.
This flow is the core of every RAG agent. The user never sees the intermediate steps — they just get a grounded, cited answer. But under the hood, the agent: (1) converts the question into a vector, (2) finds the most similar document chunks, (3) injects those chunks into the prompt, and (4) instructs Claude to cite specific sections. Without this pipeline, the agent would answer from its training data, which may be outdated, incomplete, or hallucinated for domain-specific legal content.
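The four steps can be sketched without any API call. This is an illustrative toy under stated assumptions: the chunk texts are placeholder wording rather than the course corpus, a word-count vector stands in for a real embedding model, and `vectorize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helper names. In the actual agent, retrieval runs inside the search tool and the prompt goes to Claude.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus keyed by section ID (placeholder text).
CHUNKS = {
    "9-310": "A financing statement must be filed to perfect most security interests.",
    "DE-FILING-01": "Delaware UCC-1 financing statements are filed with the Delaware Secretary of State.",
    "CC-INVENTORY": "Inventory means goods held for sale or lease in the ordinary course of business.",
}


def vectorize(text: str) -> Counter:
    """Step 1: turn text into a sparse word-count vector (toy stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, k: int = 2) -> list:
    """Step 2: return the IDs of the k most similar chunks, dropping zero-score chunks."""
    q = vectorize(question)
    scored = sorted(((cosine(q, vectorize(text)), cid) for cid, text in CHUNKS.items()),
                    reverse=True)
    return [cid for score, cid in scored[:k] if score > 0]


def build_prompt(question: str, chunk_ids: list) -> str:
    """Steps 3 and 4: inject the chunks and instruct the model to cite section IDs."""
    context = "\n".join(f"[{cid}] {CHUNKS[cid]}" for cid in chunk_ids)
    return ("Answer using only the sources below and cite section IDs in parentheses.\n\n"
            f"Sources:\n{context}\n\nQuestion: {question}")


question = "Where do I file a UCC-1 in Delaware?"
print(build_prompt(question, retrieve(question)))
```

Because every chunk carries its section ID into the prompt, the model has something concrete to cite; a question that matches nothing yields an empty source list, which is the cue for an "I don't have information about that" answer.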
Knowledge Check
Test your understanding of the RAG pipeline and UCC domain concepts covered in this capstone.
1. In a RAG pipeline, what happens at query time?
2. Why is chunking by section (e.g., UCC Article 9 sections) better than fixed-size chunking for legal documents?
3. What happens when a UCC-1 financing statement lapses?
4. Applied: A paralegal asks “Does a UCC-3 continuation need to be filed before or after the 5-year lapse?” — what should the RAG agent do?
5. Why does this capstone use simple keyword overlap instead of neural embeddings?
Verify Everything Works
Run this end-to-end smoke test to confirm your entire RAG pipeline is functioning correctly. The test sends a question, checks that the agent calls the search tool, and verifies the response contains citations.
# verify.py — End-to-end smoke test
from ucc_rag_agent import run_rag_agent

# Each entry pairs a query with the section IDs the answer must cite.
test_queries = [
    ("Where do I file a UCC-1 in Delaware?", ["DE-FILING-01"]),
    ("What is the difference between inventory and equipment?", ["CC-INVENTORY", "CC-EQUIPMENT"]),
    ("How long is a financing statement effective?", ["9-515"]),
]

print("=== End-to-End Verification ===\n")
passed = 0
for query, expected_citations in test_queries:
    # run_rag_agent returns a tuple; only the response text is checked here.
    response, _ = run_rag_agent(query)
    # A test passes only if every expected section ID appears in the response.
    found = [cit for cit in expected_citations if cit in response]
    status = "PASS" if len(found) == len(expected_citations) else "FAIL"
    if status == "PASS":
        passed += 1
    print(f"[{status}] Query: {query}")
    print(f" Expected citations: {expected_citations}")
    print(f" Found: {found}\n")

print(f"Result: {passed}/{len(test_queries)} tests passed.")
if passed == len(test_queries):
    print("All tests passed — your RAG agent is working correctly!")
// verify.ts — End-to-end smoke test
import { runRagAgent } from "./ucc_rag_agent.js";

const testQueries: [string, string[]][] = [
  ["Where do I file a UCC-1 in Delaware?", ["DE-FILING-01"]],
  ["What is the difference between inventory and equipment?", ["CC-INVENTORY", "CC-EQUIPMENT"]],
  ["How long is a financing statement effective?", ["9-515"]],
];

async function verify() {
  console.log("=== End-to-End Verification ===\n");
  let passed = 0;
  for (const [query, expectedCitations] of testQueries) {
    const [response] = await runRagAgent(query);
    const found = expectedCitations.filter(c => response.includes(c));
    const status = found.length === expectedCitations.length ? "PASS" : "FAIL";
    if (status === "PASS") passed++;
    console.log(`[${status}] Query: ${query}`);
    console.log(` Expected: ${expectedCitations.join(", ")}`);
    console.log(` Found: ${found.join(", ")}\n`);
  }
  console.log(`Result: ${passed}/${testQueries.length} tests passed.`);
  if (passed === testQueries.length) console.log("All tests passed!");
}

verify().catch(console.error);
Run the verification:
python verify.py      # Python
npx tsx verify.ts     # TypeScript
Troubleshooting
Common errors and how to fix them:
| Error | Cause | Fix |
|---|---|---|
| ModuleNotFoundError: No module named 'anthropic' | The Anthropic SDK is not installed in your active Python environment. | Run pip install anthropic. Make sure your virtual environment is activated (source venv/bin/activate). |
| AuthenticationError | Missing or invalid API key. | Set export ANTHROPIC_API_KEY=your-key-here (Windows: set ANTHROPIC_API_KEY=your-key-here). Verify with echo $ANTHROPIC_API_KEY. |
| ImportError: cannot import name 'search_ucc_knowledge_base' from 'mock_kb' | The agent file cannot find the knowledge base module. | Ensure both mock_kb.py and ucc_rag_agent.py are in the same directory, and you are running from that directory. |
| JSONDecodeError or malformed tool response | Claude occasionally returns non-JSON in the tool call. This is rare but can happen. | Retry the query. If it persists, check that your tool definition's input_schema matches the expected format. The agent loop already handles up to 5 retries. |
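The first three rows can be caught before the agent starts. Below is a minimal preflight sketch, assuming the module names from the table (anthropic, mock_kb) and the ANTHROPIC_API_KEY variable; the `preflight` function itself is a hypothetical helper, not part of the capstone code.

```python
import importlib.util
import os


def preflight(required_modules=("anthropic", "mock_kb"), env=None):
    """Return a list of human-readable problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module '{name}' is not importable")
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARN:", problem)
```

Run it from the same directory as ucc_rag_agent.py so that the mock_kb lookup reflects what the agent will see. The check only confirms that a key is present, not that it is valid; an invalid key still surfaces as AuthenticationError on the first request.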
Compliance & Regulatory Notes
Not legal advice: This agent provides reference information from UCC Article 9 materials. It does NOT constitute legal advice. Always recommend users consult qualified counsel for specific transactions.
Jurisdiction accuracy: UCC rules have state-specific variations. The knowledge base covers general Article 9 principles plus specific state handbooks. For states not in the corpus, the agent should explicitly state that state-specific procedures may differ.
Citation integrity: Every factual claim must trace to a specific document section. Hallucinated legal citations are worse than no citation — they can lead to incorrect filings, lost priority, and financial loss.
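Citation integrity can be partly enforced in code by comparing the IDs cited in an answer against the IDs of the chunks that were actually retrieved. This is a minimal sketch: the regular expression assumes IDs look like the ones used in this capstone (9-310, TX-FILING-01, CC-INVENTORY), and `extract_citations` and `unsupported_citations` are hypothetical helper names.

```python
import re

# Assumed ID shapes: "9-" plus three digits, or two capitals, a hyphen, a capitalized
# word, and an optional two-digit suffix. Adjust to match your own corpus.
CITATION_RE = re.compile(r"\b(?:9-\d{3}|[A-Z]{2}-[A-Z]+(?:-\d{2})?)\b")


def extract_citations(text: str) -> set:
    """Collect every section-ID-shaped token in the text."""
    return set(CITATION_RE.findall(text))


def unsupported_citations(response: str, retrieved_ids) -> set:
    """Return cited IDs that do not correspond to any retrieved chunk."""
    return extract_citations(response) - set(retrieved_ids)


answer = "File centrally (Section 9-310) per (TX-FILING-01); see also 9-999."
print(unsupported_citations(answer, ["9-310", "TX-FILING-01"]))
```

A non-empty result means the answer cites a section the agent never retrieved, which is worth flagging or regenerating. The check cannot confirm that a supported citation actually backs the sentence it is attached to; that still needs human review.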
Going Further
- [OPTIONAL] Stretch: HyDE search — Generate a hypothetical answer, embed it, and use that for retrieval. Improves results when the user's question uses different terminology than the source documents.
- Hybrid search — Combine vector similarity with keyword matching (BM25) for exact section ID lookups alongside semantic search (M10).
- Multi-state coverage — Add handbooks for all 50 states and let the agent compare filing procedures across jurisdictions.
- Connect to Capstone 1 — Combine filing lookup (CAPSTONE-1-C) with regulatory reference (this capstone) so the agent can both search filings AND explain what they mean.
- Re-ranking — Add a re-ranking step that prioritizes state-specific results when the query mentions a state, even if general Article 9 chunks score higher on similarity.
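As one illustration of the re-ranking idea, a state-aware boost can be applied to search hits after retrieval. This is a sketch under assumptions: hits are taken to be (chunk_id, score) pairs, state handbooks are assumed to use a two-letter ID prefix as in TX-FILING-01, and the additive boost of 0.25 is an arbitrary value to tune. `STATE_PREFIXES` and `rerank` are hypothetical names.

```python
# Hypothetical mapping from state names in a query to chunk-ID prefixes.
STATE_PREFIXES = {"texas": "TX", "delaware": "DE"}


def rerank(query: str, hits: list, boost: float = 0.25) -> list:
    """Re-sort (chunk_id, score) pairs, favoring chunks for states named in the query."""
    q = query.lower()
    wanted = {prefix for name, prefix in STATE_PREFIXES.items() if name in q}

    def adjusted(hit):
        chunk_id, score = hit
        # Boost only chunks whose ID prefix matches a state mentioned in the query.
        return score + boost if chunk_id.split("-")[0] in wanted else score

    return sorted(hits, key=adjusted, reverse=True)


hits = [("9-310", 0.62), ("TX-FILING-01", 0.55), ("DE-FILING-01", 0.50)]
print(rerank("How do I perfect a security interest in inventory in Texas?", hits))
```

When the query names no known state, the original ordering is preserved, so the step is safe to leave enabled for general Article 9 questions.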
References & Resources
- Claude Tool Use Documentation — Function calling for RAG tools
- Prompt Engineering Guide — System prompts for citation enforcement
- Anthropic Cookbook — RAG patterns and examples
- UCC Article 9 (Cornell Law) — Full legal text of secured transactions