Capstone 2 — Domain C: UCC Regulatory Reference Agent

Project Brief

Business Context

Junior analysts and paralegals at commercial lending firms spend hours searching through UCC Article 9 reference materials, state filing handbooks, and collateral classification guides to answer routine questions. "How do I perfect a security interest in inventory in Texas?" requires cross-referencing three different documents: the UCC Article 9 text (for the general rule), the Texas filing handbook (for state-specific procedures), and the collateral classification guide (for what counts as "inventory").

The pain is not just time — it is accuracy. Legal reference materials use specialized terminology, cross-reference numbered sections, and have jurisdiction-specific exceptions. A junior analyst might find the right general rule but miss the Texas-specific filing fee or confuse "inventory" with "equipment" (a critical legal distinction that affects lien priority).

Your agent solves this by ingesting a corpus of UCC reference documents, building a RAG pipeline, and answering questions with accurate, cited references. The analyst asks a natural language question, the agent retrieves relevant document sections, and Claude synthesizes a clear answer with specific section citations.

What You Will Build

A RAG agent with a search_ucc_knowledge_base tool that:

Ingests 3 document types: UCC Article 9 guide, state filing handbooks, and a collateral classification guide
Chunks documents with section-aware boundaries (preserving legal section IDs)
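Generates embeddings and stores them in a vector database (this capstone substitutes an in-memory keyword-overlap index so it runs without an embeddings API)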
Retrieves relevant chunks at query time with optional filters (doc type, state, section ID)
Synthesizes answers with section-level citations
Handles out-of-scope queries gracefully (no hallucination)

Skills practiced: RAG pipeline (M09-M10), embeddings, chunking strategies, vector search, citation generation, and out-of-scope detection.

Stretch goal: Add HyDE search for improved retrieval on legal terminology queries.

Prerequisites

Complete M05 (Function Calling), M08 (Conversation Management), M09 (RAG), and M10 (Advanced RAG) before starting this capstone. You should be comfortable defining tools, managing multi-turn conversations, and understanding the retrieve-then-generate pattern.

Environment Setup

Requirements: Python 3.10+ or Node.js 18+. You will also need an Anthropic API key.

# Python setup. Run each line on its own; works in bash, zsh, cmd.exe, and PowerShell 5.1+
mkdir capstone-2-ucc-rag
cd capstone-2-ucc-rag
python -m venv venv
# macOS/Linux:        source venv/bin/activate
# Windows PowerShell: venv\Scripts\Activate.ps1
# Windows cmd.exe:    venv\Scripts\activate.bat

# Record a minimum SDK version (change >= to == to pin an exact version)
echo "anthropic>=0.40.0" > requirements.txt
pip install -r requirements.txt

# API key (use the form for your shell)
# bash/zsh:           export ANTHROPIC_API_KEY=your-key-here
# Windows PowerShell: $env:ANTHROPIC_API_KEY = "your-key-here"
# Windows cmd.exe:    set ANTHROPIC_API_KEY=your-key-here

# Node.js / TypeScript setup (alternative to the Python setup). Run each line on its own; works in bash, zsh, cmd.exe, and PowerShell 5.1+
mkdir capstone-2-ucc-rag
cd capstone-2-ucc-rag
npm init -y
npm install @anthropic-ai/sdk@^0.40.0
npm install -D typescript@^5.4 tsx@^4.7 @types/node@^20

# API key (use the form for your shell)
# bash/zsh:           export ANTHROPIC_API_KEY=your-key-here
# Windows PowerShell: $env:ANTHROPIC_API_KEY = "your-key-here"
# Windows cmd.exe:    set ANTHROPIC_API_KEY=your-key-here

File Structure

capstone-2-ucc-rag/
├── mock_kb.py           # Mock knowledge base with UCC chunks
├── ucc_rag_agent.py     # RAG agent with tool-use loop
├── mock_kb.ts           # TypeScript mock knowledge base
├── ucc_rag_agent.ts     # TypeScript RAG agent
└── requirements.txt     # Dependencies


Domain Glossary

UCC Article 9

The article of the Uniform Commercial Code governing secured transactions: how lenders create, perfect, and enforce security interests in personal property (not real estate).

Perfection

The legal process of making a security interest enforceable against third parties. Usually requires filing a UCC-1 financing statement with the correct state office.

Security Interest

A lender's legal right to seize and sell a debtor's assets (collateral) if the debt is not repaid. Created by a security agreement between debtor and lender.

Financing Statement

The public document (UCC-1) filed with the Secretary of State (SOS) to put third parties on notice that a security interest exists. Contains debtor name, secured party name, and collateral description.

Collateral Classification

UCC Article 9 classifies collateral into types: inventory, equipment, accounts, instruments, chattel paper, etc. Classification determines filing location and priority rules.

Priority

The order in which competing security interests are paid from the same collateral. Generally, first to file or perfect has priority (with exceptions for purchase money security interests).

Debtor Location

Under UCC 9-301, perfection is governed by the law of the jurisdiction where the debtor is "located." For registered entities (LLCs, corps), location = state of organization.

Continuation Statement

A UCC-3 filed during the 6 months before the original UCC-1's 5-year lapse date. It extends the filing's effectiveness for another 5 years.
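The continuation window is simple date arithmetic, and a small sketch can make it concrete. The helper names below (add_months, continuation_window) are hypothetical, and the month-based calculation is a simplification of the rule as stated in this glossary; it is not a substitute for the statute's own day-counting rules.

```python
from datetime import date
import calendar


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def continuation_window(filing_date: date) -> tuple[date, date]:
    """Return (window_opens, lapse_date) for a UCC-1 filed on filing_date.

    Assumption: lapse is 5 years (60 months) after filing, and the
    continuation window opens 6 months before lapse.
    """
    lapse = add_months(filing_date, 60)
    opens = add_months(lapse, -6)
    return opens, lapse


opens, lapse = continuation_window(date(2021, 3, 15))
print(opens, lapse)  # 2025-09-15 2026-03-15
```

A UCC-1 filed on 2021-03-15 would lapse on 2026-03-15 under this simplification, so the continuation would need to be filed on or after 2025-09-15.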

RAG Pipeline Architecture

RAG Pipeline: Ingest → Chunk → Embed → Store → Retrieve → Generate

Load Docs → Chunk by Section → Embed Chunks → Vector DB → Retrieve Top-K → Claude + Citations

Chunking Strategy: Section-Aware Boundaries

Legal documents have a natural structure: numbered sections, subsections, and cross-references. Naive chunking (splitting every 500 characters) destroys this structure. Section-aware chunking preserves legal section boundaries and metadata.

Naive Chunking vs Section-Aware Chunking

Naive (500-char splits)

...security interest. For registered organizations

(corporations, LLCs), the debtor is loc

ated in the state of organization. A financing sta

tement must be filed to perfect a securi

ty interest in most types of collateral.

NO section IDs. NO metadata. Sentences cut mid-word.

Section-Aware

[9-301] Law Governing Perfection: The law of the jurisdiction where the debtor is located governs perfection...

meta: {doc: "UCC-ART9", section: "9-301", type: "article_9_guide"}

[9-310] Filing Required: A financing statement must be filed to perfect a security interest...

meta: {doc: "UCC-ART9", section: "9-310", type: "article_9_guide"}

[TX-FILING-01] Where to File: In Texas, financing statements are filed with the SOS...

meta: {doc: "STATE-GUIDE-TX", section: "TX-FILING-01", type: "state_handbook"}

Why It Matters

When a user asks "Where do I file in Texas?", the section-aware approach retrieves chunk [TX-FILING-01] with its full context intact. The naive approach might retrieve a fragment that starts mid-sentence and lacks any section reference. Citations become impossible without section metadata — and in legal contexts, citations are not optional. A claim without a citation is worthless to a paralegal.
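A minimal sketch of section-aware chunking follows, assuming the raw handbook text marks each section with a bracketed ID and a title followed by a colon, as in the examples above. The regular expression and the chunk_by_section name are illustrative assumptions; real source documents will likely need a different header pattern.

```python
import re

# Assumed header shape: "[9-301] Law Governing Perfection: body text..."
SECTION_HEADER = re.compile(
    r"^\[(?P<section_id>[A-Z0-9-]+)\]\s*(?P<title>[^:\n]+):\s*",
    re.MULTILINE,
)


def chunk_by_section(raw_text: str, doc_id: str, doc_type: str) -> list[dict]:
    """Split raw_text at section headers; one section becomes one chunk."""
    matches = list(SECTION_HEADER.finditer(raw_text))
    chunks = []
    for i, match in enumerate(matches):
        # A section's body runs until the next header (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw_text)
        body = " ".join(raw_text[match.end():end].split())  # collapse line breaks
        chunks.append({
            "section_id": match["section_id"],
            "title": match["title"].strip(),
            "content": body,
            "meta": {"doc": doc_id, "section": match["section_id"], "type": doc_type},
        })
    return chunks


raw = """[9-301] Law Governing Perfection: The law of the jurisdiction where the debtor
is located governs perfection.
[9-310] Filing Required: A financing statement must be filed to perfect a
security interest in most types of collateral."""

for chunk in chunk_by_section(raw, "UCC-ART9", "article_9_guide"):
    print(chunk["section_id"], "|", chunk["content"])
```

Text that appears before the first header is dropped in this sketch, and a body line that itself begins with a bracketed ID would be mistaken for a new section.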

Mock Document Corpus

{
  "documents": [
    {
      "doc_id": "string — unique document identifier",
      "title": "string — document title",
      "doc_type": "article_9_guide | state_handbook | collateral_guide",
      "sections": [
        {
          "section_id": "string — legal section reference (e.g., 9-301)",
          "title": "string — section heading",
          "content": "string — full section text"
        }
      ]
    }
  ]
}

// 3 document types form the knowledge base:

// 1. UCC Article 9 Guide (4 sections in this capstone)
// Covers: 9-301 (debtor location/governing law),
// 9-310 (filing required), 9-502 (financing statement
// contents), 9-515 (duration and lapse)

// 2. State Filing Handbooks (2 states: TX, DE)
// Each covers: where to file, fees, online systems,
// processing times

// 3. Collateral Classification Guide (2 categories)
// Covers: inventory, equipment
// (extend to accounts, instruments, chattel paper,
//  deposit accounts, investment property, intangibles
//  as a stretch goal)

// Total: 8 retrieval-ready chunks across 4 documents
// Each chunk preserves section_id, doc_id, and doc_type
// metadata for accurate citations
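One way to turn a corpus in the schema above into retrieval-ready chunks is sketched here. The flatten_corpus name, the generated chunk_id format, and the index_text field (section text prefixed with the document and section titles) are assumptions for illustration, and the two-document sample corpus is abbreviated.

```python
def flatten_corpus(corpus: dict) -> list[dict]:
    """Emit one chunk per section, copying document metadata onto each chunk."""
    chunks = []
    for doc in corpus["documents"]:
        for section in doc["sections"]:
            chunks.append({
                "chunk_id": f"c{len(chunks) + 1:03d}",
                "doc_id": doc["doc_id"],
                "doc_type": doc["doc_type"],
                "section_id": section["section_id"],
                "title": section["title"],
                "content": section["content"],
                # Titles are prepended so the indexed text carries its own context.
                "index_text": f'{doc["title"]} | {section["title"]}\n{section["content"]}',
            })
    return chunks


sample_corpus = {
    "documents": [
        {
            "doc_id": "UCC-ART9-GUIDE",
            "title": "UCC Article 9 Guide",
            "doc_type": "article_9_guide",
            "sections": [
                {"section_id": "9-301", "title": "Law Governing Perfection",
                 "content": "The law of the jurisdiction where the debtor is located governs perfection."},
                {"section_id": "9-310", "title": "Filing Required to Perfect",
                 "content": "A financing statement must be filed to perfect a security interest."},
            ],
        },
        {
            "doc_id": "STATE-GUIDE-TX",
            "title": "Texas Filing Handbook",
            "doc_type": "state_handbook",
            "sections": [
                {"section_id": "TX-FILING-01", "title": "Where to File in Texas",
                 "content": "In Texas, UCC financing statements are filed with the Secretary of State."},
            ],
        },
    ]
}

for chunk in flatten_corpus(sample_corpus):
    print(chunk["chunk_id"], chunk["doc_id"], chunk["section_id"])
```

Because every chunk carries doc_id, doc_type, and section_id, a citation can be produced from a retrieved chunk without going back to the source document.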

Implementation Phases

Phase 1 — Document Ingestion (45 min): Define the mock corpus as Python dataclass / TypeScript const literals. Parse each document into sections preserving section_id, doc_id, and doc_type metadata. (Production: load from JSON or a CMS.)
Phase 2 — Section-Aware Chunking (30 min): Chunk by section boundary (each section = one chunk). Add overlap by prepending the section title and parent doc title. Preserve metadata per chunk.
Phase 3 — Build a Keyword-Overlap Index (30 min): Score each chunk by the fraction of query terms it contains. This is a degenerate BM25 (no TF-IDF weighting) that keeps the capstone API-free. Production swap-in: generate Voyage AI / OpenAI embeddings and store in ChromaDB — same retrieval interface.
Phase 4 — Retrieval Tool (30 min): Build search_ucc_knowledge_base that scores the in-memory chunks with optional filters (doc_type, state, section_id). Return top-K results with similarity scores.
Phase 5 — RAG Agent (45 min): Wire the retrieval tool to Claude with a system prompt that instructs citation of section_id for every claim. Handle out-of-scope queries.
Phase 6 — Testing (30 min): Run 10 test cases. Verify citation accuracy — every factual claim must trace to a retrieved chunk.
Phase 7 — [OPTIONAL] Stretch: Add HyDE search — generate a hypothetical answer, embed it, and use that embedding for retrieval instead of the raw question.
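The HyDE stretch goal in Phase 7 can be sketched without committing to a particular model or vector store. In the sketch below, hyde_search takes two callables: one that drafts a hypothetical answer and one that searches. Both stand-ins (canned_hypothetical, toy_search) and the two-entry DOCS dictionary are hypothetical; in a real build the first would call Claude and the second would embed the hypothetical passage and query the vector database.

```python
from typing import Callable

# Toy corpus keyed by section ID (abbreviated, for illustration only).
DOCS = {
    "9-515": "a filed financing statement is effective for 5 years "
             "a continuation statement must be filed before lapse",
    "CC-INVENTORY": "inventory includes goods held for sale or lease",
}


def toy_search(query: str, top_k: int) -> list[str]:
    """Rank section IDs by the number of whole words shared with the query."""
    terms = set(query.lower().split())
    scored = [(sid, len(terms & set(text.split()))) for sid, text in DOCS.items()]
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [sid for sid, score in ranked if score > 0][:top_k]


def hyde_search(
    question: str,
    generate_hypothetical: Callable[[str], str],
    search: Callable[[str, int], list[str]],
    top_k: int = 5,
) -> tuple[str, list[str]]:
    """Search with a drafted hypothetical answer instead of the raw question."""
    hypothetical = generate_hypothetical(question)
    return hypothetical, search(hypothetical, top_k)


def canned_hypothetical(question: str) -> str:
    # Stand-in for a model call; a real version would prompt Claude for a draft.
    return ("a financing statement is effective for a fixed term and lapses "
            "unless a continuation statement is filed")


question = "how long does my lien notice last"
print(toy_search(question, 5))  # [] because the question shares no terms with the corpus
hypothetical, hits = hyde_search(question, canned_hypothetical, toy_search)
print(hits[0])  # 9-515
```

The plain-language question uses none of the corpus vocabulary, while the drafted answer does, which is the gap HyDE is meant to close for legal terminology queries. The draft is used only for retrieval; the final answer should still be grounded in the retrieved chunks.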

Step 1: Create the Mock Knowledge Base

What & Why

Before building the RAG agent, you need a searchable knowledge base. This file defines 8 UCC document chunks (from 3 document types: Article 9 guide, state handbooks, and a collateral classification guide) and a keyword-search function that mimics vector similarity search. In production you would replace this with ChromaDB or Pinecone, but the interface stays the same.

Create a new file called mock_kb.py (or mock_kb.ts for Node.js) and paste the following code:

# mock_kb.py — Mock UCC Knowledge Base with simple similarity search
# In production, replace with ChromaDB / Pinecone / pgvector.

from dataclasses import dataclass

@dataclass
class Chunk:
    chunk_id: str
    doc_id: str
    section_id: str
    doc_type: str  # article_9_guide | state_handbook | collateral_guide
    title: str
    content: str
    state: str | None = None  # for state handbooks

# --- Mock corpus chunked by section ---
CHUNKS = [
    Chunk("c001", "UCC-ART9-GUIDE", "9-301", "article_9_guide",
          "Law Governing Perfection and Priority",
          "The law of the jurisdiction where the debtor is located governs "
          "perfection of a security interest. For registered organizations "
          "(corporations, LLCs), the debtor is located in the state of "
          "organization. For individuals, the debtor is located at their "
          "principal residence. This means a Delaware LLC's filings are "
          "governed by Delaware law, regardless of where the collateral "
          "is physically located."),
    Chunk("c002", "UCC-ART9-GUIDE", "9-310", "article_9_guide",
          "Filing Required to Perfect",
          "A financing statement must be filed to perfect a security "
          "interest in most types of collateral. Exceptions include: "
          "possessory security interests (9-313) where the secured party "
          "takes physical possession, control-based perfection for deposit "
          "accounts (9-314), and automatic perfection for purchase money "
          "security interests in consumer goods (9-309)."),
    Chunk("c003", "UCC-ART9-GUIDE", "9-502", "article_9_guide",
          "Contents of Financing Statement",
          "A financing statement is sufficient if it provides: (1) the "
          "name of the debtor, (2) the name of the secured party or "
          "representative, and (3) an indication of the collateral. "
          "The debtor name must EXACTLY match the name on the state's "
          "public organic record (certificate of formation, articles of "
          "incorporation). A misspelled debtor name can render the filing "
          "seriously misleading and therefore ineffective."),
    Chunk("c004", "STATE-GUIDE-TX", "TX-FILING-01", "state_handbook",
          "Where to File in Texas",
          "In Texas, UCC financing statements are filed with the "
          "Secretary of State. Filing can be done online at SOSDirect "
          "(direct.sos.state.tx.us), by mail to PO Box 13193 Austin TX 78711, "
          "or in person. Filing fee: $15 per page (standard), $5 per page "
          "(online via SOSDirect). Processing time: online filings are "
          "typically processed same-day; mail filings take 5-7 business days.",
          state="TX"),
    Chunk("c005", "STATE-GUIDE-DE", "DE-FILING-01", "state_handbook",
          "Where to File in Delaware",
          "In Delaware, UCC financing statements are filed with the "
          "Division of Corporations under the Secretary of State. Online "
          "filing available at corp.delaware.gov. Filing fee: $50 for a "
          "standard financing statement. Delaware is the most common "
          "filing jurisdiction for registered entities because many "
          "corporations and LLCs are organized in Delaware.",
          state="DE"),
    Chunk("c006", "COLLATERAL-GUIDE", "CC-INVENTORY", "collateral_guide",
          "Inventory Collateral",
          "Inventory includes goods held for sale or lease, raw materials, "
          "work in process, and materials used or consumed in a business. "
          "Key distinction from Equipment: if the debtor holds the goods "
          "for sale to customers, they are inventory; if the debtor uses "
          "the goods in its own operations, they are equipment. This "
          "classification matters because purchase money security interests "
          "in inventory have different priority rules than PMSI in equipment."),
    Chunk("c007", "COLLATERAL-GUIDE", "CC-EQUIPMENT", "collateral_guide",
          "Equipment Collateral",
          "Equipment means goods used or bought for use primarily in a "
          "business. Examples: manufacturing machinery, office computers, "
          "delivery vehicles. If the debtor holds goods for sale, they are "
          "inventory, not equipment. Equipment includes fixtures (goods "
          "that become part of real property) but fixture filings have "
          "special rules under 9-334."),
    Chunk("c008", "UCC-ART9-GUIDE", "9-515", "article_9_guide",
          "Duration and Lapse of Financing Statement",
          "A filed financing statement is effective for 5 years after the "
          "date of filing. To continue effectiveness, a continuation "
          "statement (UCC-3) must be filed within 6 months before lapse. "
          "If the continuation is not filed, the financing statement "
          "lapses and the security interest becomes unperfected. An "
          "unperfected security interest loses priority to subsequent "
          "perfected interests and to a trustee in bankruptcy."),
]


def search_ucc_knowledge_base(
    query: str,
    top_k: int = 5,
    filters: dict | None = None,
) -> dict:
    """
    Search the mock UCC knowledge base using keyword matching.
    In production, replace with vector similarity search.
    """
    try:
        if not query or not query.strip():
            return {"is_error": True, "error_category": "EMPTY_QUERY",
                    "is_retryable": False, "context": "Query cannot be empty."}

        query_lower = query.lower()
        query_terms = query_lower.split()

        # Filter chunks
        candidates = CHUNKS
        if filters:
            if filters.get("doc_type"):
                candidates = [c for c in candidates if c.doc_type == filters["doc_type"]]
            if filters.get("state"):
                candidates = [c for c in candidates if c.state == filters["state"].upper()]
            if filters.get("section_id"):
                candidates = [c for c in candidates if c.section_id == filters["section_id"]]

        # Score by keyword overlap (mock similarity)
        scored = []
        for chunk in candidates:
            text = (chunk.title + " " + chunk.content).lower()
            score = sum(1 for term in query_terms if term in text) / max(len(query_terms), 1)
            if score > 0:
                scored.append((chunk, round(score, 3)))

        scored.sort(key=lambda x: x[1], reverse=True)
        top = scored[:top_k]

        results = []
        for chunk, score in top:
            results.append({
                "chunk_id": chunk.chunk_id,
                "doc_id": chunk.doc_id,
                "section_id": chunk.section_id,
                "title": chunk.title,
                "content": chunk.content,
                "similarity_score": score,
                "metadata": {
                    "doc_type": chunk.doc_type,
                    "state": chunk.state,
                },
            })

        return {"is_error": False, "results": results, "total": len(results)}

    except Exception as e:
        return {"is_error": True, "error_category": "INTERNAL_ERROR",
                "is_retryable": True, "context": str(e)}

// mock_kb.ts — Mock UCC Knowledge Base with keyword search

interface Chunk {
  chunkId: string; docId: string; sectionId: string;
  docType: string; title: string; content: string; state?: string;
}

const CHUNKS: Chunk[] = [
  { chunkId: "c001", docId: "UCC-ART9-GUIDE", sectionId: "9-301",
    docType: "article_9_guide", title: "Law Governing Perfection and Priority",
    content: "The law of the jurisdiction where the debtor is located governs perfection of a security interest. For registered organizations (corporations, LLCs), the debtor is located in the state of organization. For individuals, the debtor is located at their principal residence. This means a Delaware LLC's filings are governed by Delaware law, regardless of where the collateral is physically located." },
  { chunkId: "c002", docId: "UCC-ART9-GUIDE", sectionId: "9-310",
    docType: "article_9_guide", title: "Filing Required to Perfect",
    content: "A financing statement must be filed to perfect a security interest in most types of collateral. Exceptions include: possessory security interests (9-313), control-based perfection for deposit accounts (9-314), and automatic perfection for purchase money security interests in consumer goods (9-309)." },
  { chunkId: "c003", docId: "UCC-ART9-GUIDE", sectionId: "9-502",
    docType: "article_9_guide", title: "Contents of Financing Statement",
    content: "A financing statement is sufficient if it provides: (1) the name of the debtor, (2) the name of the secured party or representative, and (3) an indication of the collateral. The debtor name must EXACTLY match the name on the state's public organic record. A misspelled debtor name can render the filing seriously misleading and therefore ineffective." },
  { chunkId: "c004", docId: "STATE-GUIDE-TX", sectionId: "TX-FILING-01",
    docType: "state_handbook", title: "Where to File in Texas",
    content: "In Texas, UCC financing statements are filed with the Secretary of State. Filing can be done online at SOSDirect, by mail, or in person. Filing fee: $15 per page (standard), $5 per page (online). Processing: online same-day; mail 5-7 business days.",
    state: "TX" },
  { chunkId: "c005", docId: "STATE-GUIDE-DE", sectionId: "DE-FILING-01",
    docType: "state_handbook", title: "Where to File in Delaware",
    content: "In Delaware, UCC financing statements are filed with the Division of Corporations. Online filing at corp.delaware.gov. Fee: $50 standard. Delaware is the most common jurisdiction because many entities are organized there.",
    state: "DE" },
  { chunkId: "c006", docId: "COLLATERAL-GUIDE", sectionId: "CC-INVENTORY",
    docType: "collateral_guide", title: "Inventory Collateral",
    content: "Inventory includes goods held for sale or lease, raw materials, work in process, and materials used or consumed in a business. Key distinction from Equipment: if held for sale, it's inventory; if used in operations, it's equipment. PMSI in inventory has different priority rules than PMSI in equipment." },
  { chunkId: "c007", docId: "COLLATERAL-GUIDE", sectionId: "CC-EQUIPMENT",
    docType: "collateral_guide", title: "Equipment Collateral",
    content: "Equipment means goods used or bought for use primarily in a business. Examples: machinery, computers, vehicles. If held for sale = inventory, not equipment. Includes fixtures with special rules under 9-334." },
  { chunkId: "c008", docId: "UCC-ART9-GUIDE", sectionId: "9-515",
    docType: "article_9_guide", title: "Duration and Lapse",
    content: "A financing statement is effective for 5 years. Continuation (UCC-3) must be filed within 6 months before lapse. If not continued, the filing lapses and the security interest becomes unperfected, losing priority." },
];

interface SearchResult { is_error: boolean; [key: string]: unknown; }

export function searchUccKnowledgeBase(
  query: string, topK = 5, filters?: { doc_type?: string; state?: string; section_id?: string },
): SearchResult {
  try {
    if (!query?.trim()) return { is_error: true, error_category: "EMPTY_QUERY", is_retryable: false, context: "Query cannot be empty." };
    const terms = query.toLowerCase().split(/\s+/);
    let candidates = [...CHUNKS];
    if (filters?.doc_type) candidates = candidates.filter(c => c.docType === filters.doc_type);
    if (filters?.state) candidates = candidates.filter(c => c.state === filters.state.toUpperCase());
    if (filters?.section_id) candidates = candidates.filter(c => c.sectionId === filters.section_id);
    const scored = candidates.map(c => {
      const text = (c.title + " " + c.content).toLowerCase();
      const score = terms.filter(t => text.includes(t)).length / Math.max(terms.length, 1);
      return { chunk: c, score: Math.round(score * 1000) / 1000 };
    }).filter(s => s.score > 0).sort((a, b) => b.score - a.score).slice(0, topK);
    return { is_error: false, total: scored.length, results: scored.map(s => ({
      chunk_id: s.chunk.chunkId, doc_id: s.chunk.docId, section_id: s.chunk.sectionId,
      title: s.chunk.title, content: s.chunk.content, similarity_score: s.score,
      metadata: { doc_type: s.chunk.docType, state: s.chunk.state ?? null },
    })) };
  } catch (error) {
    return { is_error: true, error_category: "INTERNAL_ERROR", is_retryable: true, context: String(error) };
  }
}

Run Command

# Quick sanity test — run from the project directory:
python -c "from mock_kb import search_ucc_knowledge_base; print(search_ucc_knowledge_base('filing Texas'))"

# Quick sanity test:
npx tsx -e "import {searchUccKnowledgeBase} from './mock_kb.ts'; console.log(searchUccKnowledgeBase('filing Texas'))"

Expected Output

{'is_error': False, 'results': [{'chunk_id': 'c004', 'doc_id': 'STATE-GUIDE-TX', 'section_id': 'TX-FILING-01', 'title': 'Where to File in Texas', ...}], 'total': ...}

Checkpoint: Step 1 Complete

You built a mock knowledge base with 8 section-aware chunks from 3 document types. Each chunk preserves its section_id, doc_id, and doc_type as metadata. The search function uses keyword overlap as a mock for vector similarity (in production, you would use ChromaDB with real embeddings). Filters let the agent narrow results by document type or state. The key design decision: one section = one chunk, with metadata that enables accurate citations.
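One limitation of splitting the query on whitespace is that punctuation stays attached to terms, so "Texas?" does not match "Texas". The sketch below shows one possible normalization step; the tokenize and overlap_score names and the small stopword list are assumptions, not part of the capstone code.

```python
import re

# A deliberately small, hypothetical stopword list.
STOPWORDS = frozenset({"a", "an", "the", "do", "does", "i", "in", "to",
                       "of", "how", "is", "what", "for", "my"})


def tokenize(query: str) -> list[str]:
    """Lowercase, strip punctuation, keep hyphenated IDs such as 9-515 intact."""
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", query.lower())
    return [t for t in tokens if t not in STOPWORDS]


def overlap_score(query: str, text: str) -> float:
    """Fraction of normalized query terms that appear in text."""
    terms = tokenize(query)
    if not terms:
        return 0.0
    text_lower = text.lower()
    hits = sum(1 for term in terms if term in text_lower)
    return round(hits / len(terms), 3)


text = ("Where to File in Texas In Texas, UCC financing statements "
        "are filed with the Secretary of State.")

print("texas?" in text.lower())                          # False
print(tokenize("Where do I file in Texas?"))             # ['where', 'file', 'texas']
print(overlap_score("Where do I file in Texas?", text))  # 1.0
```

Dropping stopwords also keeps filler words from diluting the score, since the denominator counts only the terms that carry meaning.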

Troubleshooting Step 1

ModuleNotFoundError: No module named 'mock_kb' — Make sure you are running the command from the same directory that contains mock_kb.py.
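TypeError: unsupported operand type(s) for | (raised by str | None or dict | None): The X | None annotation syntax requires Python 3.10+. Check with python --version. If you are on 3.9, either add from __future__ import annotations as the first line of the file, or change str | None to Optional[str] and dict | None to Optional[dict] and add from typing import Optional.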

Now that you have a working knowledge base, you need an agent that can call it as a tool and synthesize cited answers from the results.

Step 2: Create the RAG Agent

What & Why

This file wires Claude to your knowledge base using the tool-use pattern from M05. The agent receives a user question, calls search_ucc_knowledge_base to retrieve relevant chunks, then synthesizes an answer with inline citations. The system prompt enforces strict citation rules: every factual claim must reference a specific section ID.

Create a new file called ucc_rag_agent.py (or ucc_rag_agent.ts for Node.js) and paste the following code:

# ucc_rag_agent.py — RAG-Powered UCC Reference Agent
import anthropic
import json
from mock_kb import search_ucc_knowledge_base

SYSTEM_PROMPT = """You are a UCC Article 9 Reference Agent for commercial
lending professionals. You answer questions about UCC filing requirements,
collateral classifications, perfection rules, and state-specific procedures.

You have access to a knowledge base of UCC reference materials via the
search_ucc_knowledge_base tool. ALWAYS search before answering.

CITATION RULES (critical):
- Every factual claim MUST cite a specific section: (Section 9-310)
- If a claim comes from a state handbook, cite it: (TX-FILING-01)
- If you cannot find relevant information, say so explicitly.
  Do NOT guess or make up legal information.
- You provide REFERENCE INFORMATION only. You do NOT provide legal advice.
- Always recommend consulting counsel for specific legal questions.

RESPONSE FORMAT:
- Use clear headings and bullet points for multi-part answers
- Cite sources inline: "Filing is required to perfect (Section 9-310)"
- End with a "Sources" list showing all cited sections
"""

TOOLS = [
    {
        "name": "search_ucc_knowledge_base",
        "description": (
            "Search the UCC reference knowledge base for information about "
            "Article 9 filing requirements, collateral classifications, "
            "perfection rules, and state-specific procedures. Returns "
            "relevant document sections with citation references."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Natural language search query",
                },
                "top_k": {
                    "type": "integer",
                    "description": "Number of results to return (default 5)",
                },
                "filters": {
                    "type": "object",
                    "description": "Optional filters",
                    "properties": {
                        "doc_type": {
                            "type": "string",
                            "enum": ["article_9_guide", "state_handbook", "collateral_guide"],
                        },
                        "state": {"type": "string"},
                        "section_id": {"type": "string"},
                    },
                },
            },
            "required": ["query"],
        },
    },
]

TOOL_HANDLERS = {
    "search_ucc_knowledge_base": lambda args: search_ucc_knowledge_base(
        query=args["query"],
        top_k=args.get("top_k", 5),
        filters=args.get("filters"),
    ),
}


def run_rag_agent(user_message: str, history: list | None = None) -> tuple[str, list]:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY env var
    messages = history or []
    messages.append({"role": "user", "content": user_message})

    for _ in range(5):
        try:
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=2048,
                system=SYSTEM_PROMPT,
                tools=TOOLS,
                messages=messages,
            )
        except anthropic.APIError as e:
            return f"Technical issue: {e}", messages

        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    handler = TOOL_HANDLERS.get(block.name)
                    result = handler(block.input) if handler else {
                        "is_error": True, "error_category": "UNKNOWN_TOOL",
                    }
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result),
                    })
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})

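        elif response.stop_reason == "end_turn":
            text = " ".join(b.text for b in response.content if hasattr(b, "text"))
            messages.append({"role": "assistant", "content": response.content})
            return text, messages

        else:
            # Any other stop reason (e.g. "max_tokens") would repeat on an
            # identical retry, so return what we have instead of looping.
            text = " ".join(b.text for b in response.content if hasattr(b, "text"))
            return text or f"Stopped early: {response.stop_reason}", messages

    # Reached only if the model kept requesting tools for all 5 iterations.
    return "Unable to complete your request.", messages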


if __name__ == "__main__":
    print("UCC Reference Agent — Type 'quit' to exit.\n")
    history = []
    while True:
        q = input("You: ").strip()
        if q.lower() in ("quit", "exit", "q"):
            break
        response, history = run_rag_agent(q, history)
        print(f"\nAgent: {response}\n")

// ucc_rag_agent.ts — RAG-Powered UCC Reference Agent
import Anthropic from "@anthropic-ai/sdk";
// `tsx` resolves the `.js` ext to `mock_kb.ts` at runtime — keep .js here for ESM compatibility
import { searchUccKnowledgeBase } from "./mock_kb.js";

const SYSTEM_PROMPT = `You are a UCC Article 9 Reference Agent for commercial
lending professionals. You answer questions about UCC filing requirements,
collateral classifications, perfection rules, and state-specific procedures.

You have access to a knowledge base of UCC reference materials via the
search_ucc_knowledge_base tool. ALWAYS search before answering.

CITATION RULES (critical):
- Every factual claim MUST cite a specific section: (Section 9-310)
- If from a state handbook, cite: (TX-FILING-01)
- If you cannot find info, say so. Do NOT guess legal information.
- You provide REFERENCE INFORMATION only. Not legal advice.
- Recommend consulting counsel for specific legal questions.`;

const TOOLS: Anthropic.Tool[] = [{
  name: "search_ucc_knowledge_base",
  description: "Search UCC reference knowledge base for Article 9 filing, collateral, perfection, and state procedures.",
  input_schema: {
    type: "object" as const,
    properties: {
      query: { type: "string", description: "Search query" },
      top_k: { type: "integer", description: "Results count (default 5)" },
      filters: { type: "object", properties: {
        doc_type: { type: "string", enum: ["article_9_guide","state_handbook","collateral_guide"] },
        state: { type: "string" }, section_id: { type: "string" },
      }},
    },
    required: ["query"],
  },
}];

type ToolArgs = { query: string; top_k?: number; filters?: { doc_type?: string; state?: string; section_id?: string } };
const HANDLERS: Record<string, (a: ToolArgs) => unknown> = {
  search_ucc_knowledge_base: (a) => searchUccKnowledgeBase(a.query, a.top_k ?? 5, a.filters),
};

export async function runRagAgent(
  msg: string, history: Anthropic.MessageParam[] = [],
): Promise<[string, Anthropic.MessageParam[]]> {
  const client = new Anthropic();
  const messages = [...history, { role: "user" as const, content: msg }];

  for (let i = 0; i < 5; i++) {
    let response: Anthropic.Message;
    try {
      response = await client.messages.create({
        model: "claude-sonnet-4-6", max_tokens: 2048,
        system: SYSTEM_PROMPT, tools: TOOLS, messages,
      });
    } catch (err) { return [`Technical issue: ${err}`, messages]; }

    if (response.stop_reason === "tool_use") {
      const results: Anthropic.ToolResultBlockParam[] = [];
      for (const b of response.content) {
        if (b.type === "tool_use") {
          const h = HANDLERS[b.name];
          const r = h ? h(b.input as ToolArgs) : { is_error: true, error_category: "UNKNOWN" };
          results.push({ type: "tool_result", tool_use_id: b.id, content: JSON.stringify(r) });
        }
      }
      messages.push({ role: "assistant", content: response.content });
      messages.push({ role: "user", content: results });
    } else {
      // Any stop reason other than tool_use (end_turn, max_tokens, ...) ends the turn;
      // looping again would only resend the same messages.
      const text = response.content
        .filter((b): b is Anthropic.TextBlock => b.type === "text")
        .map(b => b.text).join(" ");
      messages.push({ role: "assistant", content: response.content });
      return [text, messages];
    }
  }
  return ["Unable to complete your request.", messages];
}

// --- CLI entry point (matches Python's interactive loop) ---
async function main() {
  const readline = await import("readline");
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  const ask = (q: string): Promise<string> => new Promise(res => rl.question(q, res));

  console.log("UCC Reference Agent — Type 'quit' to exit.\n");
  let history: Anthropic.MessageParam[] = [];
  while (true) {
    const q = (await ask("You: ")).trim();
    if (["quit", "exit", "q"].includes(q.toLowerCase())) break;
    const [response, newHistory] = await runRagAgent(q, history);
    history = newHistory;
    console.log(`\nAgent: ${response}\n`);
  }
  rl.close();
}

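// Start the CLI only when this file is run directly, so that importing
// runRagAgent from another module (such as verify.ts) does not launch the prompt.
// Assumes the file keeps the name ucc_rag_agent.ts used in the run command.
if (/ucc_rag_agent\.(ts|js)$/.test(process.argv[1] ?? "")) {
  main().catch(console.error);
}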

Run Command

python ucc_rag_agent.py

npx tsx ucc_rag_agent.ts

Checkpoint: Step 2 Complete

You built a RAG agent that: (1) receives a legal question, (2) searches the knowledge base via the tool, (3) receives section-level chunks with metadata, and (4) synthesizes an answer with inline citations. The system prompt enforces citation rules — every claim must reference a section ID. The agent also explicitly states when information is not found, preventing hallucination on legal topics where accuracy is critical.

Troubleshooting Step 2

ModuleNotFoundError: No module named 'anthropic' — Run pip install anthropic (make sure your virtual environment is activated).
AuthenticationError — Your ANTHROPIC_API_KEY environment variable is missing or invalid. Run echo $ANTHROPIC_API_KEY to check (Windows: echo %ANTHROPIC_API_KEY%).
ImportError: cannot import name 'search_ucc_knowledge_base' from 'mock_kb' — Ensure both mock_kb.py and ucc_rag_agent.py are in the same directory, and you are running from that directory.

The agent is built. Now let's run it interactively and verify that it retrieves the right chunks and cites them correctly.

Step 3: Test the RAG Agent

What & Why

Running the agent interactively lets you verify the full RAG loop: question in, tool call out, chunks retrieved, cited answer generated. Try the sample queries below to confirm the agent cites specific section IDs and handles out-of-scope questions gracefully.

Start the agent (if not already running):

python ucc_rag_agent.py

npx tsx ucc_rag_agent.ts

Then try these sample queries:

How do I perfect a security interest in inventory in Texas?
What's the difference between inventory and equipment?
Where do I file a UCC-1 in Delaware?
How do I perfect a security interest in cryptocurrency? (out-of-scope test)

Example Interaction

You: How do I perfect a security interest in inventory in Texas?

Agent: To perfect a security interest in inventory in Texas, you need to follow two steps:

**1. Determine the Filing Jurisdiction**
Under UCC Article 9, perfection is governed by the law of the jurisdiction where the debtor is located (Section 9-301). For registered entities (LLCs, corporations), this is the state of organization, not necessarily Texas. If your debtor is a Texas LLC, you file in Texas. If they are a Delaware LLC operating in Texas, you file in Delaware.

**2. File a Financing Statement**
A financing statement (UCC-1) must be filed to perfect the security interest (Section 9-310). The statement must include the debtor name (exactly matching the public record), secured party name, and collateral description (Section 9-502).

**Texas-Specific Filing:**
In Texas, file with the Secretary of State via SOSDirect (online), mail, or in person. Fees: $5/page online, $15/page by mail (TX-FILING-01).

**Collateral Note:**
Ensure the collateral is correctly classified as "inventory": goods held for sale, raw materials, or work in process (CC-INVENTORY). If the debtor uses the goods in operations rather than selling them, they are "equipment" with different priority rules (CC-EQUIPMENT).

**Sources:**
- Section 9-301: Law Governing Perfection
- Section 9-310: Filing Required
- Section 9-502: Financing Statement Contents
- TX-FILING-01: Texas Filing Procedures
- CC-INVENTORY: Inventory Classification

*Note: This is reference information, not legal advice. Consult counsel for your specific transaction.*

Checkpoint: Step 3 Complete

If the agent returned an answer with inline citations like (Section 9-310) and (TX-FILING-01), your RAG pipeline is working end-to-end. The agent searched the knowledge base, retrieved relevant chunks, and synthesized a cited answer. Try the out-of-scope query (cryptocurrency) to verify the agent says "I don't have information about that" instead of hallucinating.
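The out-of-scope check described above can be automated rather than judged by eye. The following is a minimal sketch, not part of the capstone code: the function name `acknowledges_gap` and the phrase list `GAP_PHRASES` are hypothetical, and the phrases should be adjusted to however your agent actually words a gap.

```python
# Hypothetical phrase list (an assumption, not taken from the capstone files).
GAP_PHRASES = (
    "don't have information",
    "do not have information",
    "could not find",
    "couldn't find",
    "not covered",
    "outside the scope",
)


def acknowledges_gap(response: str) -> bool:
    """Return True if the response admits the knowledge base lacks an answer."""
    # Lowercase and normalize curly apostrophes so "don’t" matches "don't".
    normalized = response.lower().replace("\u2019", "'")
    return any(phrase in normalized for phrase in GAP_PHRASES)


if __name__ == "__main__":
    sample = "I don't have information about that in the knowledge base."
    print(acknowledges_gap(sample))
```

A phrase match is a coarse signal. It confirms the agent admitted a gap, but it does not prove that the rest of the answer is free of invented detail, so it is best paired with a manual read of the cryptocurrency response.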

Testing Guide

Type	Input	Expected Behavior
Happy	"How do I perfect a security interest in inventory in Texas?"	Cites 9-310, 9-301, TX-FILING-01, CC-INVENTORY
Happy	"What's the difference between inventory and equipment?"	Cites CC-INVENTORY and CC-EQUIPMENT with clear distinction
Happy	"Where do I file a UCC-1 in Delaware?"	Cites DE-FILING-01 with fees and website
Happy	"Which state's law governs perfection for a Delaware LLC?"	Cites 9-301 and explains debtor location rule
Happy	"What are the exceptions to the filing requirement?"	Cites 9-310, lists possessory (9-313), control (9-314), auto-perfection (9-309)
Edge	"How do I perfect a security interest in cryptocurrency?"	Acknowledges gap in corpus, suggests consulting counsel
Edge	"What about UCC Article 2?"	Notes knowledge base covers Article 9 only
Edge	Highly technical legal jargon query	Retrieves relevant chunks, translates to plain English
Adversarial	"Draft a UCC-1 financing statement for me"	Explains it provides reference info, cannot draft legal documents
Adversarial	"Is this filing valid?" (no context)	Asks for specific filing details before analysis

Retrieval-Then-Generation Flow

This is what happens each time a user asks a question. The agent does not answer from memory — it searches first, retrieves evidence, then generates a cited answer grounded in retrieved chunks.

Flow: User Question → Embed Query → Search Vectors → Top-K Chunks → Claude + Prompt → Cited Answer

Why It Matters

This flow is the core of every RAG agent. The user never sees the intermediate steps; they just get a grounded, cited answer. Under the hood, the agent: (1) converts the question into a vector, (2) finds the most similar document chunks, (3) injects those chunks into the prompt, and (4) instructs Claude to cite specific sections. In this capstone, the mock knowledge base approximates steps 1 and 2 with simple keyword overlap rather than neural embeddings; the rest of the flow is unchanged. Without this pipeline, the agent would answer from its training data, which may be outdated or incomplete, and it may hallucinate domain-specific legal content.
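The four steps above can be sketched end to end in a few lines. This is an illustrative toy, not the capstone's `mock_kb` implementation: it scores chunks by keyword overlap instead of vector similarity, and the names `CHUNKS`, `STOPWORDS`, `retrieve`, and `build_prompt` are assumptions introduced here. The chunk texts paraphrase the example interaction earlier in this step.

```python
import re

# Toy corpus. The IDs follow the capstone's citation style; the texts are
# short paraphrases used only for illustration.
CHUNKS = [
    {"section_id": "9-310",
     "text": "A financing statement must be filed to perfect the security interest."},
    {"section_id": "TX-FILING-01",
     "text": "In Texas, file a financing statement with the Secretary of State."},
    {"section_id": "CC-INVENTORY",
     "text": "Inventory means goods held for sale, raw materials, or work in process."},
]

# Small hypothetical stop-word list so common words do not dominate the score.
STOPWORDS = {"a", "the", "in", "do", "i", "of", "to", "with", "be", "or",
             "for", "where", "how", "what", "is"}


def tokenize(text: str) -> set[str]:
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(query: str, chunks: list[dict], top_k: int = 2) -> list[dict]:
    """Rank chunks by the number of tokens shared with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(c["text"])), c) for c in chunks]
    scored = [pair for pair in scored if pair[0] > 0]  # drop non-matches
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_prompt(query: str, retrieved: list[dict]) -> str:
    """Inject the retrieved chunks, tagged with section IDs, into the prompt."""
    context = "\n".join(f"[{c['section_id']}] {c['text']}" for c in retrieved)
    return ("Answer using only the sources below and cite their section IDs.\n\n"
            f"{context}\n\nQuestion: {query}")


if __name__ == "__main__":
    question = "Where do I file a financing statement in Texas?"
    print(build_prompt(question, retrieve(question, CHUNKS)))
```

Swapping `retrieve` for an embedding-based search would leave `build_prompt` untouched, which is why the retrieval method can be upgraded later without changing the generation step.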

Knowledge Check

Test your understanding of the RAG pipeline and UCC domain concepts covered in this capstone.

1. In a RAG pipeline, what happens at query time?

Claude answers directly from its training data without searching any documents

The query is embedded, similar chunks are retrieved from the vector store, and they are added to the prompt for Claude to generate an answer

All documents are sent to Claude in the prompt every time, and Claude selects the relevant parts

The user manually selects which documents to include before asking the question

2. Why is chunking by section (e.g., UCC Article 9 sections) better than fixed-size chunking for legal documents?

Fixed-size chunks are larger and use more tokens

Section-based chunking produces fewer total chunks, which is always better

Legal sections are self-contained units of meaning — splitting mid-section would break context and make citations unreliable

Fixed-size chunking is not supported by vector databases

3. What happens when a UCC-1 financing statement lapses?

The debtor must refile within 30 days or face penalties

The secured party's security interest becomes unperfected, meaning its claim on the debtor's collateral can lose priority to other creditors

The filing is automatically renewed for another 5 years

The collateral is transferred to the state

4. Applied: A paralegal asks “Does a UCC-3 continuation need to be filed before or after the 5-year lapse?” — what should the RAG agent do?

Answer from general knowledge that continuations are filed after lapse

Refuse to answer because this is legal advice

Return all chunks in the knowledge base for the paralegal to read

Search for relevant chunks about continuation filing timing, find the rule about the 6-month window before lapse, and cite the specific section

5. Why does this capstone use simple keyword overlap instead of neural embeddings?

Keyword overlap is free, requires no external API, and demonstrates core retrieval concepts without adding cost or complexity — neural embeddings can be swapped in later

Keyword overlap always produces better results than neural embeddings for legal text

Neural embeddings are not compatible with Claude

Vector databases cannot store neural embeddings

Verify Everything Works

Run this end-to-end smoke test to confirm your entire RAG pipeline is functioning correctly. The test sends each question through the agent and verifies that the response contains the expected citations.

# verify.py — End-to-end smoke test
from ucc_rag_agent import run_rag_agent

test_queries = [
    ("Where do I file a UCC-1 in Delaware?", ["DE-FILING-01"]),
    ("What is the difference between inventory and equipment?", ["CC-INVENTORY", "CC-EQUIPMENT"]),
    ("How long is a financing statement effective?", ["9-515"]),
]

print("=== End-to-End Verification ===\n")
passed = 0
for query, expected_citations in test_queries:
    response, _ = run_rag_agent(query)
    found = [cit for cit in expected_citations if cit in response]
    status = "PASS" if len(found) == len(expected_citations) else "FAIL"
    if status == "PASS":
        passed += 1
    print(f"[{status}] Query: {query}")
    print(f"       Expected citations: {expected_citations}")
    print(f"       Found: {found}\n")

print(f"Result: {passed}/{len(test_queries)} tests passed.")
if passed == len(test_queries):
    print("All tests passed — your RAG agent is working correctly!")

// verify.ts — End-to-end smoke test
import { runRagAgent } from "./ucc_rag_agent.js";

const testQueries: [string, string[]][] = [
  ["Where do I file a UCC-1 in Delaware?", ["DE-FILING-01"]],
  ["What is the difference between inventory and equipment?", ["CC-INVENTORY", "CC-EQUIPMENT"]],
  ["How long is a financing statement effective?", ["9-515"]],
];

async function verify() {
  console.log("=== End-to-End Verification ===\n");
  let passed = 0;
  for (const [query, expectedCitations] of testQueries) {
    const [response] = await runRagAgent(query);
    const found = expectedCitations.filter(c => response.includes(c));
    const status = found.length === expectedCitations.length ? "PASS" : "FAIL";
    if (status === "PASS") passed++;
    console.log(`[${status}] Query: ${query}`);
    console.log(`       Expected: ${expectedCitations.join(", ")}`);
    console.log(`       Found: ${found.join(", ")}\n`);
  }
  console.log(`Result: ${passed}/${testQueries.length} tests passed.`);
  if (passed === testQueries.length) console.log("All tests passed!");
}

verify().catch(console.error);

Run the verification:

python verify.py

npx tsx verify.ts

Expected Output

=== End-to-End Verification ===

[PASS] Query: Where do I file a UCC-1 in Delaware?
       Expected citations: ['DE-FILING-01']
       Found: ['DE-FILING-01']

[PASS] Query: What is the difference between inventory and equipment?
       Expected citations: ['CC-INVENTORY', 'CC-EQUIPMENT']
       Found: ['CC-INVENTORY', 'CC-EQUIPMENT']

[PASS] Query: How long is a financing statement effective?
       Expected citations: ['9-515']
       Found: ['9-515']

Result: 3/3 tests passed.
All tests passed — your RAG agent is working correctly!
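The smoke test compares citations only. To also confirm that the agent actually called the search tool, you could inspect the message history returned as the second element of `run_rag_agent`. The sketch below assumes that history is a list of dicts with `role` and `content` keys, and that assistant content blocks may be either plain dicts or SDK objects with `type` and `name` attributes. The helper names `block_attr` and `used_tool` are hypothetical.

```python
def block_attr(block, name):
    """Read a field from a content block that may be a dict or an object."""
    if isinstance(block, dict):
        return block.get(name)
    return getattr(block, name, None)


def used_tool(messages, tool_name="search_ucc_knowledge_base"):
    """Return True if any assistant turn contains a tool_use block for tool_name."""
    for message in messages:
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain-text turn, no content blocks to inspect
        for block in content or []:
            if (block_attr(block, "type") == "tool_use"
                    and block_attr(block, "name") == tool_name):
                return True
    return False


if __name__ == "__main__":
    history = [
        {"role": "user", "content": "Where do I file a UCC-1 in Delaware?"},
        {"role": "assistant",
         "content": [{"type": "tool_use", "name": "search_ucc_knowledge_base"}]},
    ]
    print(used_tool(history))
```

In `verify.py`, this would mean keeping the second return value (for example `response, history = run_rag_agent(query)`) and treating a test as passed only when both the citations and `used_tool(history)` check out.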

Troubleshooting

Common errors and how to fix them:

Error	Cause	Fix
`ModuleNotFoundError: No module named 'anthropic'`	The Anthropic SDK is not installed in your active Python environment.	Run `pip install anthropic`. Make sure your virtual environment is activated (`source venv/bin/activate`).
`AuthenticationError`	Missing or invalid API key.	Set `export ANTHROPIC_API_KEY=your-key-here` (Windows: `set ANTHROPIC_API_KEY=your-key-here`). Verify with `echo $ANTHROPIC_API_KEY`.
`ImportError: cannot import name 'search_ucc_knowledge_base' from 'mock_kb'`	The agent file cannot find the knowledge base module.	Ensure both `mock_kb.py` and `ucc_rag_agent.py` are in the same directory, and you are running from that directory.
`JSONDecodeError` or malformed tool response	Claude occasionally returns non-JSON in the tool call. This is rare but can happen.	Retry the query. If it persists, check that your tool definition's `input_schema` matches the expected format. Note that the agent loop's limit of 5 iterations caps the number of model turns per question; it does not retry failed calls.

Compliance & Regulatory Notes

Legal Information Disclaimer

Not legal advice: This agent provides reference information from UCC Article 9 materials. It does NOT constitute legal advice. Always recommend users consult qualified counsel for specific transactions.

Jurisdiction accuracy: UCC rules have state-specific variations. The knowledge base covers general Article 9 principles plus specific state handbooks. For states not in the corpus, the agent should explicitly state that state-specific procedures may differ.

Citation integrity: Every factual claim must trace to a specific document section. Hallucinated legal citations are worse than no citation — they can lead to incorrect filings, lost priority, and financial loss.
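One way to enforce citation integrity in code is to compare the section IDs cited in an answer against the IDs of the chunks that were actually retrieved. The sketch below is an assumption-laden illustration: the regex covers only the three citation shapes shown in this capstone (`9-310`, `TX-FILING-01`, `CC-INVENTORY`), it assumes chunk metadata stores IDs in exactly that form, and the names `CITATION_RE`, `extract_citations`, and `unsupported_citations` are hypothetical.

```python
import re

# Matches the citation shapes used in this capstone's examples:
# Article 9 sections (9-310), state handbooks (TX-FILING-01),
# and collateral guide entries (CC-INVENTORY).
CITATION_RE = re.compile(r"\b9-\d{3}\b|\b[A-Z]{2}-FILING-\d{2}\b|\bCC-[A-Z]+\b")


def extract_citations(answer: str) -> set[str]:
    """Collect every section ID cited in the answer text."""
    return set(CITATION_RE.findall(answer))


def unsupported_citations(answer: str, retrieved_ids: list[str]) -> set[str]:
    """Return cited IDs that do not appear among the retrieved chunks."""
    return extract_citations(answer) - set(retrieved_ids)


if __name__ == "__main__":
    answer = ("File a UCC-1 (Section 9-310). In Texas, use SOSDirect "
              "(TX-FILING-01). See also Section 9-999.")
    print(unsupported_citations(answer, ["9-310", "TX-FILING-01"]))
```

A non-empty result means the answer cites a section the retrieval step never returned, which is a strong hint that the citation was invented and the answer should be rejected or regenerated.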

Going Further

[OPTIONAL] Stretch: HyDE search — Generate a hypothetical answer, embed it, and use that for retrieval. Improves results when the user's question uses different terminology than the source documents.
Hybrid search — Combine vector similarity with keyword matching (BM25) for exact section ID lookups alongside semantic search (M10).
Multi-state coverage — Add handbooks for all 50 states and let the agent compare filing procedures across jurisdictions.
Connect to Capstone 1 — Combine filing lookup (CAPSTONE-1-C) with regulatory reference (this capstone) so the agent can both search filings AND explain what they mean.
Re-ranking — Add a re-ranking step that prioritizes state-specific results when the query mentions a state, even if general Article 9 chunks score higher on similarity.
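The re-ranking idea in the last item can be prototyped without touching the retriever. The following is a rough sketch under stated assumptions: each result is a dict with a numeric `score` and an optional `state` code, the state lookup table `STATE_NAMES` covers only the two states that appear in this capstone's examples, and the `boost` value is arbitrary. None of these names come from the capstone code.

```python
# Hypothetical lookup from state names to the codes used in chunk metadata.
STATE_NAMES = {"texas": "TX", "delaware": "DE"}


def rerank_for_state(results, query, boost=0.25):
    """Move chunks for a state named in the query ahead of general chunks.

    results: list of dicts with a numeric 'score' and an optional 'state'.
    boost:   amount added to the score of matching chunks (arbitrary here).
    """
    lowered = query.lower()
    mentioned = {code for name, code in STATE_NAMES.items() if name in lowered}

    def adjusted_score(result):
        bonus = boost if result.get("state") in mentioned else 0.0
        return result["score"] + bonus

    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(results, key=adjusted_score, reverse=True)


if __name__ == "__main__":
    results = [
        {"section_id": "9-310", "score": 0.80, "state": None},
        {"section_id": "TX-FILING-01", "score": 0.70, "state": "TX"},
        {"section_id": "DE-FILING-01", "score": 0.65, "state": "DE"},
    ]
    ranked = rerank_for_state(results, "How do I file a UCC-1 in Texas?")
    print([r["section_id"] for r in ranked])
```

An additive boost is the simplest option. A production system might instead use a cross-encoder re-ranker or apply the state as a hard metadata filter, depending on how often general Article 9 rules need to appear alongside the state handbook.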

References & Resources

Claude Tool Use Documentation — Function calling for RAG tools
Prompt Engineering Guide — System prompts for citation enforcement
Anthropic Cookbook — RAG patterns and examples
UCC Article 9 (Cornell Law) — Full legal text of secured transactions