M04: Structured Output & Parsing
Agents don't just generate text — they produce data that your code must parse, validate, and act on. This module teaches you how to get reliable, structured output from Claude and what to do when it isn't perfect.
Learning Objectives
- Explain why agents require structured output instead of free-form text
- Use Claude's tool use feature to guarantee structured JSON responses
- Validate API responses with Pydantic (Python) and Zod (TypeScript) schemas
- Implement retry logic with error-aware re-prompting for parse failures
- Build a complete data extraction pipeline with schema validation and error recovery
Why Agents Need Structured Responses
BEFORE: Imagine asking a coworker for customer data and they reply with a rambling paragraph: "Oh yeah, John works over at Acme, I think his email is something like john@acme.com, and he might be a PM?" You'd have to read it, guess which parts are data, and manually copy them out every single time.
PAIN: Now imagine your code has to do this hundreds of times per minute. Regex breaks on edge cases, sentence structures vary wildly, and one missed field can crash your entire pipeline at 2 AM.
MAPPING: Structured data (data organized in a predictable format with labeled fields and defined types, such as JSON objects, XML documents, or database rows, which code can parse reliably) is like getting a filled-in spreadsheet instead of a paragraph: every field has a label, every value has a type, and your code can grab exactly what it needs with zero guesswork.
What this actually looks like: Here's the difference between unstructured and structured output from the same prompt. The first block is what you'd get from a plain text response. The second is what tool use gives you:
# Unstructured (free-form text — hard to parse reliably):
"Jane Smith is the VP of Engineering at TechCorp. Her email is
jane.smith@techcorp.io and you can reach her at (555) 123-4567."
# Structured (tool_use response — code-ready):
{
"name": "Jane Smith",
"email": "jane.smith@techcorp.io",
"phone": "(555) 123-4567",
"company": "TechCorp",
"role": "VP of Engineering"
}
JSON (JavaScript Object Notation, a lightweight, human-readable data format built from key-value pairs and arrays) is the standard format for this: it is the usual format for API communication and the most common structured output format for LLM agents. It bridges the gap between natural language (what Claude generates) and programmatic consumption (what your code needs). Without structured output, agents can't reliably extract fields from responses, route decisions based on Claude's answer, or feed results into APIs or databases. Every step becomes fragile string parsing that breaks on edge cases.
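To make the contrast concrete, here is a minimal sketch of what the consuming code looks like in each case. The regex and the `route_contact` helper are illustrative assumptions, not part of any library; the sample strings are the two outputs shown above.

```python
import json
import re

unstructured = (
    "Jane Smith is the VP of Engineering at TechCorp. Her email is "
    "jane.smith@techcorp.io and you can reach her at (555) 123-4567."
)
structured = """{
  "name": "Jane Smith",
  "email": "jane.smith@techcorp.io",
  "phone": "(555) 123-4567",
  "company": "TechCorp",
  "role": "VP of Engineering"
}"""

# Free-form text: a regex can find the email, but name, company, and role
# depend on sentence structure that changes from response to response.
email_match = re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", unstructured)
email_from_text = email_match.group(0) if email_match else None

# Structured output: one parse call, then plain dictionary lookups.
contact = json.loads(structured)


def route_contact(record: dict) -> str:
    """Hypothetical routing rule: branch on a labeled field."""
    return "exec-team" if record["role"].startswith("VP") else "general"


print(email_from_text)
print(contact["company"], route_contact(contact))
```

The regex recovers one field; every other field would need its own pattern, while the structured version exposes all five through the same lookup.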
Unstructured → Structured → Application
Tool Use as Structured Output
BEFORE: Without tool use (a Claude API feature in which you define functions, called tools, with JSON Schema parameters, and Claude returns structured tool_use content blocks naming the tool to call and its arguments), getting structured output from Claude was like shouting your order across a noisy room: you'd write a prompt saying "please return JSON with these fields" and hope the model understood. Tool use is Claude's most reliable structured output mechanism.
PAIN: The model might add markdown formatting around the JSON, forget required fields, use the wrong types, or slip in a conversational sentence before the data. You'd need fragile regex and try-catch blocks just to extract the output.
MAPPING: Tool use is like handing Claude a restaurant order form with pre-printed fields: name of dish, quantity, special instructions. Claude can only fill in the blanks on the form — it physically cannot return free-form text when forced to use a tool. The form (JSON Schema) guarantees the structure.
What this actually looks like in the API response: When Claude uses a tool, you don't get a plain text string. You get a structured content block with type: "tool_use":
# Claude's actual response when forced to use a tool:
{
"role": "assistant",
"content": [
{
"type": "tool_use", # ← not "text"!
"id": "toolu_01A2B3C4D5",
"name": "extract_contact", # ← which tool it "called"
"input": { # ← structured data, guaranteed valid JSON
"name": "Jane Smith",
"email": "jane.smith@techcorp.io",
"phone": "(555) 123-4567",
"company": "TechCorp",
"role": "VP of Engineering"
}
}
],
"stop_reason": "tool_use" # ← stopped because it wants to call a tool
}
First, you define a tool in your API request's tools array. Each tool has a name, a description, and a JSON Schema that describes its expected parameters. JSON Schema is a standard for describing the structure of JSON data: field names, types, required fields, enums, nested objects, and validation rules.
Second, Claude reads the tool definition and returns a tool_use content block: a structured object with type "tool_use", the tool name, and an input object whose arguments match the schema.
Third, you can force Claude to use a specific tool with the tool_choice parameter: set it to {"type": "tool", "name": "..."} to force one tool, or {"type": "any"} to require some tool. This guarantees you'll get structured output every time. The key insight: you don't have to actually execute the tool. You can define a "tool" purely as a structured output mechanism; Claude fills in the form, and you just read the values. This works because Claude is specifically trained to produce valid tool_use blocks.
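The three steps can be sketched without any network call. The example below is a sketch under stated assumptions: it uses plain dictionaries shaped like the response shown earlier (the SDK exposes the same fields as attributes), and `read_tool_input` is a hypothetical helper name.

```python
import json

# Step 1: tool definition with a name, description, and JSON Schema.
extract_contact_tool = {
    "name": "extract_contact",
    "description": "Extract structured contact information from text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "email": {"type": "string"},
        },
        "required": ["name", "email"],
    },
}

# Step 3: request parameters; tool_choice forces this specific tool.
request_params = {
    "tools": [extract_contact_tool],
    "tool_choice": {"type": "tool", "name": "extract_contact"},
}

# Step 2: a response shaped like the example above, written as a plain dict.
sample_response = {
    "role": "assistant",
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_01A2B3C4D5",
            "name": "extract_contact",
            "input": {"name": "Jane Smith", "email": "jane.smith@techcorp.io"},
        }
    ],
    "stop_reason": "tool_use",
}


def read_tool_input(response: dict, tool_name: str) -> dict:
    """Return the input of the first matching tool_use block.

    Nothing is executed: the "tool" only exists to shape the output.
    """
    for block in response["content"]:
        if block["type"] == "tool_use" and block["name"] == tool_name:
            return block["input"]
    raise ValueError(f"No tool_use block for {tool_name!r} in response")


fields = read_tool_input(sample_response, "extract_contact")
print(json.dumps(fields, indent=2))
```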
Validation, Stop Sequences & Schema Checking
BEFORE: Without validation layers, you'd send Claude a prompt, get back some JSON-ish text, cross your fingers, and call JSON.parse(). Sometimes it worked; sometimes Claude added a friendly "Here's the data:" prefix or a trailing explanation that broke parsing entirely.
PAIN: In production, this meant ~5-15% of responses would fail to parse, triggering silent data loss or crashes that only surfaced hours later in downstream systems. Debugging was a nightmare because failures were intermittent and format-dependent.
MAPPING: Think of stop sequences (strings you specify in the API request that make Claude stop generating as soon as it would emit one of them) as a film director yelling "Cut!" at exactly the right moment: they halt generation at the closing brace. Schema validation is the quality inspector who checks every frame after the cut. Together, they form a multi-layer safety net.
What this looks like in practice: Without a stop sequence, Claude might return {"name": "Jane"} Hope that helps!. With "stop_sequences": ["}"] in your API request, generation halts at the closing brace, so the trailing text never appears. The matched stop sequence itself is not included in the returned text, so you receive {"name": "Jane" and append the } yourself before parsing.
Layer 1, format constraints: This is where you use tool use or careful prompt engineering to make Claude emit valid JSON in the first place. Think of it as building the output in the right shape from the start.
Layer 2, stop sequences: These are strings (like } or ]) that you tell the API to watch for. When Claude is about to generate one of these strings, it immediately stops producing more text. This prevents Claude from appending conversational text after your JSON object.
You set them in the API request as "stop_sequences": ["}"]. Claude generates tokens until it reaches that closing brace, then halts. The matched sequence is not included in the returned text; the response reports it in its stop_sequence field, and stop_reason will be "stop_sequence" instead of "end_turn", so your code can tell exactly why generation stopped and restore the brace before parsing. Because generation halts at the first match, a } stop sequence only suits flat objects: a nested object would be cut off at its inner closing brace. Note: stop sequences are most useful for prompt-based JSON extraction. When you use tool use (Layer 1), you don't need them because tool_use blocks are already bounded.
Layer 3, schema validation: Schema validation is the process of checking that a data structure matches an expected format, implemented with Pydantic (Python) or Zod (TypeScript). Even if the JSON is syntactically valid, it might have the wrong fields or types. Schema validation checks three things: Does every required field exist? Does every value have the correct type? Did any unexpected data sneak in? (By default both libraries tolerate unknown fields; the third check needs an opt-in such as extra="forbid" in Pydantic or .strict() in Zod.) When all three layers are active, the share of valid structured output approaches 100%.
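Returning to Layer 2 for a moment: since the API leaves the matched stop sequence out of the returned text and reports it separately, prompt-based extraction has to put it back before parsing. The sketch below assumes you pass in the response's text, stop_reason, and stop_sequence values yourself; `parse_stopped_json` is a hypothetical helper name.

```python
import json
from typing import Optional


def parse_stopped_json(text: str, stop_reason: str,
                       stop_sequence: Optional[str]) -> dict:
    """Parse JSON text from a reply that may have halted on a stop sequence."""
    if stop_reason == "stop_sequence" and stop_sequence is not None:
        # The matched sequence is not part of the text, so put it back.
        text = text + stop_sequence
    return json.loads(text)


# Flat object: restoring the closing brace yields valid JSON.
flat = parse_stopped_json('{"name": "Jane"', "stop_sequence", "}")

# Nested object: generation halts at the FIRST "}", so the outer object is
# never finished and parsing fails even after the brace is restored.
try:
    parse_stopped_json('{"contact": {"name": "Jane"', "stop_sequence", "}")
    nested_ok = True
except json.JSONDecodeError:
    nested_ok = False

print(flat, nested_ok)
```

The nested case is the reason a closing-brace stop sequence is best kept for flat schemas; for anything nested, tool use is the safer layer.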
Schema Validation with Pydantic & Zod
Pydantic (Python) and Zod (TypeScript) are schema validation libraries. Pydantic validates data using type annotations: you define a class with typed fields, and it validates, parses, and transforms incoming data automatically. Zod is its TypeScript-first counterpart: you define schemas with z.object(), and Zod validates data at runtime with a detailed error message for each invalid field. In plain English: you define the exact shape your data must have (which fields, what types, which are required), and the library automatically checks every response against that shape. If something doesn't match, it tells you exactly what's wrong.
Under the hood, these libraries work by defining a model class (Pydantic) or schema object (Zod) with typed fields. When you pass Claude's output through the model, the library checks each field: Is name a string? Is price a number? Is email present (since it's required)? If any check fails, Pydantic raises a ValidationError (Zod throws a ZodError), an exception that lists each failed field, the type that was expected, and what was received: not just "invalid data" but "field 'email' expected str, got None." These specific error messages are exactly what you'll feed back into retry prompts.
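A short sketch of those field-level details, assuming Pydantic v2 is installed. `describe_errors` is a hypothetical helper; the exact wording of each message comes from Pydantic and may differ between versions, so the example only relies on the field path.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class ContactInfo(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None


def describe_errors(data: dict) -> List[str]:
    """Return one readable line per failed field, or [] if data is valid."""
    try:
        ContactInfo(**data)
        return []
    except ValidationError as exc:
        # exc.errors() yields dicts with "loc" (field path), "msg", and "type".
        return [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]


ok = describe_errors({"name": "Jane Smith", "email": "jane@acme.com"})
bad = describe_errors({"name": 123, "email": None})

print(ok)
for line in bad:
    print(line)
```

Each returned line names the field that failed, which is the level of detail a retry prompt needs.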
How does this differ from just calling JSON.parse() or json.loads()? Those only check that the JSON is syntactically valid: matching braces, correct commas. Schema validation goes further and verifies that the data has the shape your application expects. Valid JSON like {"name": 123} would pass JSON.parse() but fail Pydantic because name should be a string, not a number.
The tool's parameters are specified in the input_schema field of the tool definition, a JSON Schema object describing the expected parameters: their names, types, descriptions, and which are required. Claude uses this schema to generate valid arguments. Here's a time-saving trick: you can auto-generate this schema from your Pydantic model using ContactInfo.model_json_schema(). That way, you define the shape once in Pydantic, and the tool definition stays in sync automatically.
tool_use guarantees STRUCTURE (valid JSON matching schema) but NOT semantic correctness. Values inside the JSON may still be wrong. Always add business rule validation after tool_use extraction.
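What a business rule layer might look like, as a minimal sketch: the two rules below (a plausibility pattern for the email and a check that extracted values literally appear in the source text) are illustrative assumptions, and `check_business_rules` is a hypothetical helper, not a library function.

```python
import re
from typing import List


def check_business_rules(contact: dict, source_text: str) -> List[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    problems = []

    # Rule 1: the email must look like an address, not just be a string.
    email = contact.get("email") or ""
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        problems.append(f"email {email!r} is not a plausible address")

    # Rule 2 (grounding): extracted values should appear in the source text,
    # which catches values the model invented.
    for field in ("email", "phone"):
        value = contact.get(field)
        if value and value not in source_text:
            problems.append(f"{field} {value!r} does not appear in the source text")

    return problems


source = "Jane Smith, VP of Engineering | jane.smith@techcorp.io | (555) 123-4567"
good = {"name": "Jane Smith", "email": "jane.smith@techcorp.io",
        "phone": "(555) 123-4567"}
hallucinated = {"name": "Jane Smith", "email": "jane.smith@techcorp.io",
                "phone": "(555) 000-0000"}

print(check_business_rules(good, source))
print(check_business_rules(hallucinated, source))
```

Both records would pass schema validation, since every field is a string; only the second rule notices that the phone number was never in the input.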
"Tool use guarantees the data is correct." — No. Tool use guarantees structure (valid JSON matching the schema), not semantic correctness. If you ask Claude to extract a phone number and it hallucinates "(555) 000-0000," the JSON will be perfectly valid but the data is wrong. Always add business rule validation on top of schema validation.
"I can just ask Claude to 'respond in JSON' instead of using tool use." — You can, but it's significantly less reliable. Prompt-only JSON extraction fails 5–15% of the time (markdown wrappers, trailing text, missing commas). Tool use fails under 0.5% because Claude is specifically trained for the tool_use format. For anything beyond a quick prototype, use tool use.
"Schema validation catches all bad output." — Schema validation catches type errors (string where number expected) and missing fields, but not logical errors. A schema can confirm age is an integer, but not that 350 is an unreasonable value for a human's age. You need separate business logic validation for semantic checks.
"If parsing fails, just retry the same prompt." — Blind retries have a low success rate because Claude will likely make the same mistake again. The effective approach is to include the specific error message in the retry prompt so Claude can see what went wrong and self-correct.
"Structured output is only for data extraction." — It's also essential for decision routing (Claude returns {"action": "escalate", "reason": "..."}), tool selection (returning which tool to call with what parameters), and any scenario where downstream code needs to branch on Claude's response. If your code does an if on Claude's output, you need structured output.
Error Recovery: When Parsing Fails
BEFORE: Early LLM integrations treated parsing failures as fatal errors — if the JSON was malformed, the request simply failed and the user saw a generic "Something went wrong" message.
PAIN: This meant a single missing comma in Claude's output could waste an entire API call ($0.01-0.05 in tokens), leave a customer-facing request unanswered, and require manual intervention to unblock the pipeline.
MAPPING: Error recovery is like a GPS recalculating your route — when you miss a turn (a parse failure), the system doesn't pull over and shut off the engine. It immediately recomputes, telling you exactly which turn you missed and offering a corrected path to the same destination.
What the retry prompt actually looks like: "Extract contact info from: ...\n\nPrevious attempt failed with: ValidationError — field 'email' expected str, got None\nPlease fix the output to match the required schema exactly." — Claude reads the error, sees it missed the email, and self-corrects on the next attempt.
1. Retry with error feedback: Send the same request again, but append the specific error message (e.g., "field 'email' was null, expected string") to the prompt. Claude reads the error and self-corrects — this fixes ~90% of failures on the first retry.
2. Fallback to simpler format: If the complex schema keeps failing, ask for a simpler one (fewer fields, no nested objects). A partial answer is better than no answer.
3. Partial parsing: Extract whichever fields did validate and flag the rest as incomplete. Useful when some data is better than none.
4. Cascading validators: Try multiple schemas in order — strict first, then progressively looser. This handles cases where Claude returns valid data in a slightly different shape.
5. Human-in-the-loop escalation: After all automated strategies fail, route to a human reviewer. This is your safety net for truly ambiguous inputs.
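Strategy 3 (partial parsing) is the least obvious of the five, so here is a small sketch of it. The per-field checks in `FIELD_CHECKS` are hypothetical stand-ins for whatever your schema requires; a real pipeline would more likely derive them from the field paths in a Pydantic or Zod error.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical per-field checks; each returns True when the value is usable.
FIELD_CHECKS: Dict[str, Callable[[Any], bool]] = {
    "name": lambda v: isinstance(v, str) and v.strip() != "",
    "email": lambda v: isinstance(v, str) and "@" in v,
    "phone": lambda v: isinstance(v, str) and any(ch.isdigit() for ch in v),
}


def partial_parse(raw: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Split a record into fields that validated and fields to flag."""
    valid: Dict[str, Any] = {}
    incomplete: List[str] = []
    for field, check in FIELD_CHECKS.items():
        value = raw.get(field)
        if check(value):
            valid[field] = value
        else:
            incomplete.append(field)
    return valid, incomplete


valid, incomplete = partial_parse(
    {"name": "J.", "email": "j (at) co (dot) com", "phone": "TBD"}
)
print(valid)
print(incomplete)
```

The usable name is kept, while the obfuscated email and the placeholder phone are flagged for a retry or for human review instead of sinking the whole record.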
Across all strategies, always use exponential backoff, a retry strategy where the wait time doubles after each failure (for example 1s, 2s, 4s), and a max retry count (typically 3). Backoff avoids overwhelming the API during outages and gives transient errors time to resolve; the retry cap prevents infinite loops and runaway API costs.
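The doubling schedule can be computed up front, which makes it easy to inspect before wiring it into a retry loop. In this sketch the `cap` and `jitter` parameters are additions of mine (common in practice, but not part of the text above), and `backoff_delays` is a hypothetical helper name.

```python
import random
from typing import List


def backoff_delays(max_retries: int = 3, base: float = 1.0,
                   cap: float = 30.0, jitter: float = 0.0) -> List[float]:
    """Return the wait in seconds before each retry: base, 2*base, 4*base, ..."""
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        # Optional jitter spreads out clients that fail at the same moment.
        delay += random.uniform(0, jitter)
        delays.append(delay)
    return delays


print(backoff_delays())  # [1.0, 2.0, 4.0]
```

A retry loop would sleep for `delays[i]` after the i-th failure and give up once the list is exhausted, which enforces the retry cap automatically.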
When a validation-retry fails, append SPECIFIC error details to the prompt: which field, what was wrong, expected vs actual. Anti-pattern: generic "there were errors, please try again."
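One way to assemble that kind of feedback, as a sketch: the error dictionaries with `field`, `expected`, and `actual` keys are an assumed intermediate format, and `build_retry_prompt` is a hypothetical helper.

```python
from typing import Dict, List


def build_retry_prompt(base_prompt: str, errors: List[Dict[str, str]]) -> str:
    """Append one line per failed field: which field, expected vs. actual."""
    if not errors:
        return base_prompt
    lines = [
        f"- field '{e['field']}': expected {e['expected']}, got {e['actual']}"
        for e in errors
    ]
    return (
        f"{base_prompt}\n\n"
        "Previous attempt failed validation:\n"
        + "\n".join(lines)
        + "\nPlease fix only these fields and match the required schema exactly."
    )


prompt = build_retry_prompt(
    "Extract contact info from: ...",
    [{"field": "email", "expected": "str", "actual": "None"}],
)
print(prompt)
```

Compared with "there were errors, please try again", the model now sees the field name and the type mismatch, which is the information it needs to correct the specific problem.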
Code Walkthrough: Data Extraction Pipeline
Approach 1: Tool Use for Structured Output
Let's start with the data model. The code below defines a ContactInfo schema in two places: a Pydantic model (your validation blueprint) and a matching tool definition (what Claude sees). Having both gives you a two-layer guarantee — Claude's tool_use ensures the right structure, and Pydantic ensures the right types and values. One important gotcha: these two definitions must stay in sync. If you add a field to one, add it to the other. In production, you'd use ContactInfo.model_json_schema() to auto-generate the tool schema from Pydantic, eliminating the sync problem entirely.
The interesting part is the extract_contact() function. It sends text to Claude with tool_choice={"type": "tool", "name": "extract_contact"} — this is the critical line that forces Claude to return a structured tool_use block instead of free-form text. The function then loops through the response content blocks to find the one with type == "tool_use" and validates its input field through Pydantic. Here's the important nuance: even with forced tool use, the values inside the JSON might be wrong (Claude could hallucinate an email, for example). Tool use guarantees the structure is valid, not that the content is correct.
Finally, notice how the error handling separates ValidationError from APIError. These are fundamentally different problems: a validation error means Claude returned the wrong data shape (retry with a better prompt), while an API error means the network or rate limit failed (retry with backoff). Catching them separately lets you respond appropriately to each. Never catch a bare Exception — you'll mask bugs in your own code.
# pip install "anthropic>=0.40.0" "pydantic>=2.0"
import anthropic
from pydantic import BaseModel, ValidationError
from typing import Optional

client = anthropic.Anthropic()

# Define the schema as both Pydantic model and tool definition
class ContactInfo(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None
    company: Optional[str] = None
    role: Optional[str] = None

# Tool definition matches the Pydantic schema
extract_contact_tool = {
    "name": "extract_contact",
    "description": "Extract structured contact information from text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string", "description": "Full name of the person"},
            "email": {"type": "string", "description": "Email address"},
            "phone": {"type": "string", "description": "Phone number, if mentioned"},
            "company": {"type": "string", "description": "Company name, if mentioned"},
            "role": {"type": "string", "description": "Job title or role, if mentioned"},
        },
        "required": ["name", "email"],
    },
}

def extract_contact(text: str) -> ContactInfo:
    """Extract contact info using tool use + Pydantic validation."""
    try:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=[extract_contact_tool],
            tool_choice={"type": "tool", "name": "extract_contact"},
            messages=[{
                "role": "user",
                "content": f"Extract the contact information from this text:\n\n{text}"
            }]
        )
        # Claude returns a tool_use content block
        for block in response.content:
            if block.type == "tool_use":
                # Validate with Pydantic
                contact = ContactInfo(**block.input)
                return contact
        raise ValueError("No tool_use block in response")
    except ValidationError as e:
        print(f"Validation failed: {e}")
        raise
    except anthropic.APIError as e:
        # status_code exists on HTTP status errors but not on connection errors
        status = getattr(e, "status_code", "n/a")
        print(f"API error: {status} - {e.message}")
        raise

# Usage
text = """
Best regards,
Jane Smith, VP of Engineering
TechCorp Inc. | jane.smith@techcorp.io | (555) 123-4567
"""
contact = extract_contact(text)
print(f"Name: {contact.name}")
print(f"Email: {contact.email}")
print(f"Phone: {contact.phone}")
print(f"Company: {contact.company}")
print(f"Role: {contact.role}")
// npm install "@anthropic-ai/sdk@^0.40.0" zod
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

// Define the schema with Zod
const ContactInfo = z.object({
  name: z.string(),
  email: z.string().email(),
  phone: z.string().optional(),
  company: z.string().optional(),
  role: z.string().optional(),
});

const extractContactTool = {
  name: 'extract_contact',
  description: 'Extract structured contact information from text.',
  input_schema: {
    type: 'object',
    properties: {
      name: { type: 'string', description: 'Full name of the person' },
      email: { type: 'string', description: 'Email address' },
      phone: { type: 'string', description: 'Phone number, if mentioned' },
      company: { type: 'string', description: 'Company name, if mentioned' },
      role: { type: 'string', description: 'Job title or role, if mentioned' },
    },
    required: ['name', 'email'],
  },
};

async function extractContact(text) {
  try {
    const response = await client.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      tools: [extractContactTool],
      tool_choice: { type: 'tool', name: 'extract_contact' },
      messages: [{
        role: 'user',
        content: `Extract the contact information from this text:\n\n${text}`
      }]
    });
    for (const block of response.content) {
      if (block.type === 'tool_use') {
        // Validate with Zod
        const contact = ContactInfo.parse(block.input);
        return contact;
      }
    }
    throw new Error('No tool_use block in response');
  } catch (error) {
    if (error instanceof z.ZodError) {
      console.error('Validation failed:', error.issues);
    } else if (error instanceof Anthropic.APIError) {
      console.error(`API error: ${error.status} - ${error.message}`);
    }
    throw error;
  }
}

const text = `Best regards,
Jane Smith, VP of Engineering
TechCorp Inc. | jane.smith@techcorp.io | (555) 123-4567`;

const contact = await extractContact(text);
console.log(`Name: ${contact.name}`);
console.log(`Email: ${contact.email}`);
console.log(`Phone: ${contact.phone}`);
console.log(`Company: ${contact.company}`);
console.log(`Role: ${contact.role}`);
Claude returned a tool_use block with typed fields. Pydantic then validated those fields against your schema. The result: five clean, typed fields extracted from a messy signature, with no regex, no string splitting, and no guessing. If any field had the wrong type or was missing, Pydantic would have raised a ValidationError with the exact field name and expected type.
Adding Retry with Error Feedback
Now for the part that makes this production-ready: automatic error recovery. The code below wraps the extraction in a retry loop that runs up to max_retries times. Here's the clever part: on each retry, it appends the specific validation error to the prompt. So instead of blindly asking Claude to try again, you're saying "the email field was null, but it's required — please fix that." Claude reads the error and self-corrects. This approach resolves ~90% of failures on the first retry.
The other key detail is the time.sleep(2 ** attempt) after each failure. This is exponential backoff: 2 seconds, then 4, then 8. Without it, rapid retries during an API outage just make the problem worse — you'd trigger rate limiting on top of the original issue. And always set a max retry count. Without one, a consistently failing input (imagine someone passes in gibberish text) will loop forever, burning tokens and money with no hope of success.
import anthropic
import time
from pydantic import BaseModel, ValidationError
from typing import Optional

client = anthropic.Anthropic()

class ContactInfo(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None
    company: Optional[str] = None
    role: Optional[str] = None

def extract_with_retry(text: str, max_retries: int = 3) -> ContactInfo:
    """Extract contact info with retry on validation failure."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        prompt = f"Extract contact information from this text:\n\n{text}"
        # On retry, include the previous error
        if last_error:
            prompt += f"\n\nPrevious attempt failed with: {last_error}"
            prompt += "\nPlease fix the output to match the required schema exactly."
        try:
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                tools=[{
                    "name": "extract_contact",
                    "description": "Extract contact info. ALL fields must be valid.",
                    "input_schema": ContactInfo.model_json_schema(),
                }],
                tool_choice={"type": "tool", "name": "extract_contact"},
                messages=[{"role": "user", "content": prompt}],
            )
            for block in response.content:
                if block.type == "tool_use":
                    contact = ContactInfo(**block.input)
                    print(f"Attempt {attempt}: Success!")
                    return contact
            # No tool_use block came back: record it so the final error is informative
            last_error = "No tool_use block in response"
        except ValidationError as e:
            last_error = str(e)
            print(f"Attempt {attempt}: Validation error - {last_error}")
            time.sleep(2 ** attempt)  # Exponential backoff
        except anthropic.APIError as e:
            print(f"Attempt {attempt}: API error - {e.message}")
            time.sleep(2 ** attempt)
    raise RuntimeError(f"Failed after {max_retries} attempts. Last error: {last_error}")
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

const ContactInfo = z.object({
  name: z.string(),
  email: z.string().email(),
  phone: z.string().optional(),
  company: z.string().optional(),
  role: z.string().optional(),
});

async function extractWithRetry(text, maxRetries = 3) {
  let lastError = null;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    let prompt = `Extract contact information from this text:\n\n${text}`;
    // On retry, include the previous error
    if (lastError) {
      prompt += `\n\nPrevious attempt failed with: ${lastError}`;
      prompt += '\nPlease fix the output to match the required schema exactly.';
    }
    try {
      const response = await client.messages.create({
        model: 'claude-sonnet-4-6',
        max_tokens: 1024,
        tools: [{
          name: 'extract_contact',
          description: 'Extract contact info. ALL fields must be valid.',
          input_schema: {
            type: 'object',
            properties: {
              name: { type: 'string' },
              email: { type: 'string' },
              phone: { type: 'string' },
              company: { type: 'string' },
              role: { type: 'string' },
            },
            required: ['name', 'email'],
          },
        }],
        tool_choice: { type: 'tool', name: 'extract_contact' },
        messages: [{ role: 'user', content: prompt }],
      });
      for (const block of response.content) {
        if (block.type === 'tool_use') {
          const contact = ContactInfo.parse(block.input);
          console.log(`Attempt ${attempt}: Success!`);
          return contact;
        }
      }
      // No tool_use block came back: record it so the final error is informative
      lastError = 'No tool_use block in response';
    } catch (error) {
      if (error instanceof z.ZodError) {
        lastError = error.issues.map(i => `${i.path}: ${i.message}`).join(', ');
        console.log(`Attempt ${attempt}: Validation error - ${lastError}`);
      } else if (error instanceof Anthropic.APIError) {
        console.log(`Attempt ${attempt}: API error - ${error.message}`);
      } else {
        throw error;
      }
      // Exponential backoff: 2s, 4s, 8s
      await new Promise(r => setTimeout(r, 2 ** attempt * 1000));
    }
  }
  throw new Error(`Failed after ${maxRetries} attempts. Last error: ${lastError}`);
}
Hands-On Exercise
What You'll Build
A complete contact extraction pipeline that compares prompt-only vs tool-use approaches, validates output with Pydantic/Zod, and recovers from failures automatically. You'll test against 5 real email signatures.
Time estimate: 25–35 minutes • Prerequisites: M01-M03 labs complete (API key set, SDK installed) • Files you'll create: extractor.py (or extractor.mjs)
Environment Setup
# Python
pip install "anthropic>=0.40.0" "pydantic>=2.0"
export ANTHROPIC_API_KEY="your-key-here"
# Node.js
npm install "@anthropic-ai/sdk@^0.40.0" zod
export ANTHROPIC_API_KEY="your-key-here"
Step 1: Define the Schema and Test Data
Before extracting anything, you need a data model and test cases. This step defines the ContactInfo schema and 5 real-world email signatures that range from easy to tricky. Having a fixed test set lets you objectively compare prompt-only vs tool-use approaches in Step 2.
Create a new file called extractor.py (or extractor.mjs):
import anthropic
import json
from pydantic import BaseModel, ValidationError
from typing import Optional

client = anthropic.Anthropic()

class ContactInfo(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None
    company: Optional[str] = None
    role: Optional[str] = None

# 5 test email signatures, ordered easy to hard
TEST_SIGNATURES = [
    "Best, Jane Smith | jane@acme.com | Acme Corp",
    "John Doe, Senior Engineer at MegaTech\njohn.doe@megatech.io | (555) 234-5678",
    "Cheers,\nDr. Maria García-López, Head of Research\nBioGen International\nmgarcia@biogen.int",
    "— Alex K. | Product @ StartupXYZ | alex@startupxyz.co | they/them",
    "Thanks!\nRobert \"Bob\" Williams III\nChief Financial Officer\nGlobal Finance Partners LLC\nrwilliams@gfp.com\n+1 (212) 555-0199",
]

print(f"Schema: {json.dumps(ContactInfo.model_json_schema(), indent=2)}")
print(f"\nTest signatures: {len(TEST_SIGNATURES)}")
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

const ContactInfo = z.object({
  name: z.string(),
  email: z.string().email(),
  phone: z.string().optional(),
  company: z.string().optional(),
  role: z.string().optional(),
});

const TEST_SIGNATURES = [
  "Best, Jane Smith | jane@acme.com | Acme Corp",
  "John Doe, Senior Engineer at MegaTech\njohn.doe@megatech.io | (555) 234-5678",
  "Cheers,\nDr. Maria García-López, Head of Research\nBioGen International\nmgarcia@biogen.int",
  "— Alex K. | Product @ StartupXYZ | alex@startupxyz.co | they/them",
  "Thanks!\nRobert \"Bob\" Williams III\nChief Financial Officer\nGlobal Finance Partners LLC\nrwilliams@gfp.com\n+1 (212) 555-0199",
];

console.log(`Test signatures: ${TEST_SIGNATURES.length}`);
Run it: python extractor.py (or node extractor.mjs)
You should see the JSON schema printed, with name and email listed as required. If you don't see the schema, make sure you're using Pydantic v2 (not v1).
Step 2: Extract with Tool Use + Validation
Now let's build the extraction function and see how it handles the full range of signatures. The interesting question isn't whether it works on the clean ones (Sig 1 is trivial); it's whether it can handle Dr. Maria García-López's hyphenated name and Robert "Bob" Williams III's nickname in quotes. These are the cases where regex-based extraction falls apart, and tool use shines.
This function combines the tool definition, forced tool_choice, and Pydantic validation from the code walkthrough into a single reusable function. It uses the ContactInfo model and TEST_SIGNATURES from Step 1.
Add the following to extractor.py:
extract_tool = {
    "name": "extract_contact",
    "description": "Extract structured contact information from an email signature.",
    "input_schema": ContactInfo.model_json_schema(),
}

def extract_contact(text: str) -> ContactInfo:
    """Extract contact info using forced tool use + Pydantic validation."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=[extract_tool],
        tool_choice={"type": "tool", "name": "extract_contact"},
        messages=[{"role": "user", "content": f"Extract contact info:\n\n{text}"}],
    )
    for block in response.content:
        if block.type == "tool_use":
            return ContactInfo(**block.input)
    raise ValueError("No tool_use block in response")

# Run against all 5 test signatures
successes = 0
for i, sig in enumerate(TEST_SIGNATURES, 1):
    try:
        contact = extract_contact(sig)
        print(f"✓ Sig {i}: {contact.name} <{contact.email}> @ {contact.company or 'N/A'}")
        successes += 1
    except (ValidationError, ValueError) as e:
        print(f"✗ Sig {i}: FAILED — {e}")
    except anthropic.APIError as e:
        print(f"✗ Sig {i}: API error — {e.message}")

print(f"\nResults: {successes}/{len(TEST_SIGNATURES)} extracted successfully")
const extractTool = {
  name: 'extract_contact',
  description: 'Extract structured contact information from an email signature.',
  input_schema: {
    type: 'object',
    properties: {
      name: { type: 'string', description: 'Full name' },
      email: { type: 'string', description: 'Email address' },
      phone: { type: 'string', description: 'Phone number if present' },
      company: { type: 'string', description: 'Company name if present' },
      role: { type: 'string', description: 'Job title if present' },
    },
    required: ['name', 'email'],
  },
};

async function extractContact(text) {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    tools: [extractTool],
    tool_choice: { type: 'tool', name: 'extract_contact' },
    messages: [{ role: 'user', content: `Extract contact info:\n\n${text}` }],
  });
  for (const block of response.content) {
    if (block.type === 'tool_use') return ContactInfo.parse(block.input);
  }
  throw new Error('No tool_use block');
}

// Run against all 5 test signatures
let successes = 0;
for (let i = 0; i < TEST_SIGNATURES.length; i++) {
  try {
    const contact = await extractContact(TEST_SIGNATURES[i]);
    console.log(`✓ Sig ${i+1}: ${contact.name} <${contact.email}> @ ${contact.company || 'N/A'}`);
    successes++;
  } catch (error) {
    console.log(`✗ Sig ${i+1}: FAILED — ${error.message?.slice(0, 80)}`);
  }
}

console.log(`\nResults: ${successes}/${TEST_SIGNATURES.length} extracted successfully`);
Run it: python extractor.py (or node extractor.mjs)
Step 3: Add Retry with Error Feedback
Even with tool use, validation can occasionally fail. Claude might return an empty string for a required field, or copy an obfuscated address such as "j (at) co (dot) com" verbatim, which can then fail validation. This step wraps the extraction in a retry loop that feeds the specific error message back to Claude: instead of blindly retrying, we tell Claude exactly what went wrong so it can fix that problem. This step builds on extract_contact() from Step 2.
Add the following to extractor.py:
import time
def extract_with_retry(text: str, max_retries: int = 3) -> ContactInfo:
    """Extract with automatic retry on validation failure."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        prompt = f"Extract contact info:\n\n{text}"
        if last_error:
            # Feed the previous error back so Claude can correct that specific problem
            prompt += f"\n\nPrevious attempt failed: {last_error}\nFix the output."
        try:
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                tools=[extract_tool],
                tool_choice={"type": "tool", "name": "extract_contact"},
                messages=[{"role": "user", "content": prompt}],
            )
            for block in response.content:
                if block.type == "tool_use":
                    return ContactInfo(**block.input)
            # No tool_use block came back: record why so the next prompt can say so
            last_error = "Response contained no tool_use block"
        except ValidationError as e:
            last_error = str(e)
            print(f" Attempt {attempt}: Validation error, retrying...")
            time.sleep(2 ** attempt)  # exponential backoff: 2s, 4s, 8s
        except anthropic.APIError as e:
            print(f" Attempt {attempt}: API error — {e.message}")
            time.sleep(2 ** attempt)
    raise RuntimeError(f"Failed after {max_retries} attempts: {last_error}")
# Test retry with a deliberately tricky signature
tricky = "Contact: J. at some-company, email is j (at) co (dot) com, phone TBD"
try:
    result = extract_with_retry(tricky)
    print(f"Extracted: {result.name} <{result.email}>")
except RuntimeError as e:
    print(f"Gave up: {e}")
async function extractWithRetry(text, maxRetries = 3) {
  let lastError = null;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    let prompt = `Extract contact info:\n\n${text}`;
    if (lastError) prompt += `\n\nPrevious attempt failed: ${lastError}\nFix the output.`;
    try {
      const response = await client.messages.create({
        model: 'claude-sonnet-4-6',
        max_tokens: 1024,
        tools: [extractTool],
        tool_choice: { type: 'tool', name: 'extract_contact' },
        messages: [{ role: 'user', content: prompt }],
      });
      for (const block of response.content) {
        if (block.type === 'tool_use') return ContactInfo.parse(block.input);
      }
      // No tool_use block came back: record why so the next prompt can say so
      lastError = 'Response contained no tool_use block';
    } catch (error) {
      if (error instanceof z.ZodError) {
        // Summarize each failing field as "path: message" for the retry prompt
        lastError = error.issues.map(i => `${i.path}: ${i.message}`).join(', ');
        console.log(` Attempt ${attempt}: Validation error, retrying...`);
      } else { throw error; }
      await new Promise(r => setTimeout(r, 2 ** attempt * 1000));
    }
  }
  throw new Error(`Failed after ${maxRetries} attempts: ${lastError}`);
}
const tricky = "Contact: J. at some-company, email is j (at) co (dot) com, phone TBD";
try {
  const result = await extractWithRetry(tricky);
  console.log(`Extracted: ${result.name} <${result.email}>`);
} catch (error) {
  console.log(`Gave up: ${error.message}`);
}
Run it: python extractor.py (or node extractor.mjs)
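To see the error-feedback loop in isolation, here is a minimal offline sketch of the same pattern. It makes no API call: validate_contact is a hypothetical stand-in for the ContactInfo model, and fake_model is a hypothetical stub that only returns a clean email once the prompt contains the previous error. Neither name is part of extractor.py.

```python
from typing import Callable


def validate_contact(data: dict) -> dict:
    """Hypothetical stand-in for ContactInfo: name and email are required."""
    problems = []
    if not str(data.get("name", "")).strip():
        problems.append("name: must be a non-empty string")
    email = str(data.get("email", ""))
    if "@" not in email or " " in email:
        problems.append(f"email: {email!r} is not a valid address")
    if problems:
        raise ValueError("; ".join(problems))
    return data


def extract_with_feedback(
    text: str,
    call_model: Callable[[str], dict],
    max_retries: int = 3,
    sleep: Callable[[float], None] = lambda seconds: None,
) -> dict:
    """Retry loop that appends the last validation error to the next prompt."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        prompt = f"Extract contact info:\n\n{text}"
        if last_error:
            prompt += f"\n\nPrevious attempt failed: {last_error}\nFix the output."
        try:
            return validate_contact(call_model(prompt))
        except ValueError as exc:
            last_error = str(exc)
            sleep(2 ** attempt)  # same backoff schedule as the step above
    raise RuntimeError(f"Failed after {max_retries} attempts: {last_error}")


prompts_seen = []
waits = []


def fake_model(prompt: str) -> dict:
    """Hypothetical stub: fixes the email only after it sees error feedback."""
    prompts_seen.append(prompt)
    if "Previous attempt failed" in prompt:
        return {"name": "J.", "email": "j@co.com"}
    return {"name": "J.", "email": "j (at) co (dot) com"}


result = extract_with_feedback(
    "email is j (at) co (dot) com", fake_model, sleep=waits.append
)
print(result["email"])    # j@co.com
print(len(prompts_seen))  # 2
```

Injecting the sleep function keeps the sketch fast to run; passing time.sleep instead would restore the real waits.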
Troubleshooting
- ModuleNotFoundError: No module named 'pydantic' — Run pip install "pydantic>=2.0" (the quotes stop the shell from treating > as a redirect). Pydantic v2 is required for model_json_schema().
- Cannot find module 'zod' — Run npm install zod.
- All 5 extractions fail — Check that your API key is set. Run echo $ANTHROPIC_API_KEY to verify (Linux/Mac) or echo %ANTHROPIC_API_KEY% (Windows).
- Retry loop takes too long — Exponential backoff means waits of 2s, 4s, 8s. If testing, reduce max_retries to 2 or remove the time.sleep() temporarily.
- RuntimeError: Failed after 3 attempts — This is expected for very ambiguous input. The circuit breaker worked correctly. Try a cleaner test signature to confirm the pipeline works.
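The 2s, 4s, 8s waits come from 2 ** attempt for attempts 1 to 3. The sketch below computes that schedule and shows one way to shorten it while testing. The backoff_delays helper and its cap parameter are illustrative assumptions, not part of extractor.py.

```python
from typing import List, Optional


def backoff_delays(
    max_retries: int = 3, base: float = 2.0, cap: Optional[float] = None
) -> List[float]:
    """Delay in seconds after each failed attempt: base ** attempt, optionally capped."""
    delays = [base ** attempt for attempt in range(1, max_retries + 1)]
    if cap is not None:
        delays = [min(delay, cap) for delay in delays]
    return delays


print(backoff_delays())               # [2.0, 4.0, 8.0]
print(sum(backoff_delays()))          # 14.0 seconds in the worst case
print(backoff_delays(max_retries=2))  # [2.0, 4.0]
print(backoff_delays(cap=0.1))        # [0.1, 0.1, 0.1]
```

With the defaults, a run that fails every attempt spends 14 seconds sleeping, which is why lowering max_retries or capping the delay helps during testing.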
Verify Everything Works
Run the complete file to execute all steps in sequence:
python extractor.py # or: node extractor.mjs
Stretch Goals (Optional)
- Build a generic extract(text, schema) function that accepts any Pydantic/Zod schema
- Add a confidence score to each extracted field based on the source text
Multi-Modal Input: Vision and PDF
Agents don't only consume text. Claude can process images and PDFs directly, which unlocks powerful agent use cases like reading scanned documents, extracting data from photographs, and analyzing charts or diagrams. Images are binary visual data (JPEG, PNG, GIF, WebP) sent to Claude as base64-encoded strings or URLs, which lets the model "see" and reason about visual content. PDFs are Portable Document Format files that Claude can read natively, extracting text, tables, and layout information without external OCR.
Sending Images via Base64
To send an image to Claude, you encode it as a base64 string and include it in the message's content array as a block with type: "image". The block's source object carries type: "base64", the appropriate media_type (e.g., image/jpeg, image/png), and the encoded data. This lets agents process photos of receipts, screenshots of dashboards, scanned filings, or any visual input without needing a separate OCR pipeline.
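As a rough sketch of that encoding step, the helper below builds an image content block and places it in a user message next to a text instruction. The image_block name, the placeholder bytes, and the receipt prompt are illustrative assumptions. Nothing is sent to the API here.

```python
import base64

SUPPORTED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block(image_bytes: bytes, media_type: str = "image/png") -> dict:
    """Build a base64 image content block for a Messages API request."""
    if media_type not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(f"Unsupported media_type: {media_type}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.standard_b64encode(image_bytes).decode("ascii"),
        },
    }


# Placeholder bytes; in practice, read a real file with open(path, "rb").read()
fake_png = b"\x89PNG\r\n\x1a\n"
message = {
    "role": "user",
    "content": [
        image_block(fake_png, "image/png"),
        {"type": "text", "text": "Extract the vendor name and total from this receipt."},
    ],
}
print(message["content"][0]["source"]["media_type"])  # image/png
```

The resulting message dict can go straight into the messages list of a client.messages.create call, alongside the tools and tool_choice arguments used earlier in this module.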
Vision Use Cases for Agents
- Scanned document extraction — read a scanned UCC filing or invoice and extract structured fields like filing numbers, debtor names, and dates
- Photo-based data capture — extract product details from a photo of a shipping label or whiteboard notes from a meeting
- Chart and diagram analysis — interpret a bar chart image and return the underlying data as JSON
PDF Processing
Claude supports native document understanding for PDFs, so you can send multi-page documents directly without external OCR tools. This is especially valuable for agents that need to process contracts, medical records, or government filings. Combine PDF input with tool use to extract structured data automatically.
# Pseudocode: an analyze_document tool for a multi-modal agent
TOOL analyze_document(document_base64, media_type):
    """Send a scanned document to Claude with vision and
    return structured data extracted from it."""
    SEND message to Claude:
        content = [
            { type: "image", source: { type: "base64", media_type: media_type, data: document_base64 } },
            { type: "text", text: "Extract: filing number, debtor name,
                secured party, filing date. Return as JSON." }
        ]
        tools = [ structured_filing_schema ]
    PARSE tool_use response
    VALIDATE against schema
    RETURN structured filing record
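One way the SEND step might look in Python is sketched below. It only builds the request arguments, so it runs without an API key. The FILING_TOOL schema, its field names, and the build_filing_request helper are hypothetical. The sketch assumes PDFs travel in a "document" content block while scanned images use an "image" block.

```python
import base64

# Hypothetical tool schema covering the four fields named in the pseudocode.
FILING_TOOL = {
    "name": "record_filing",
    "description": "Record structured fields extracted from a filing.",
    "input_schema": {
        "type": "object",
        "properties": {
            "filing_number": {"type": "string"},
            "debtor_name": {"type": "string"},
            "secured_party": {"type": "string"},
            "filing_date": {"type": "string", "description": "Date as written"},
        },
        "required": ["filing_number", "debtor_name"],
    },
}


def build_filing_request(document_bytes: bytes, media_type: str) -> dict:
    """Return keyword arguments for client.messages.create (no network call)."""
    # PDFs go in a "document" block; raster images go in an "image" block.
    block_type = "document" if media_type == "application/pdf" else "image"
    data = base64.standard_b64encode(document_bytes).decode("ascii")
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "tools": [FILING_TOOL],
        "tool_choice": {"type": "tool", "name": FILING_TOOL["name"]},
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": block_type,
                    "source": {"type": "base64", "media_type": media_type, "data": data},
                },
                {
                    "type": "text",
                    "text": "Extract the filing number, debtor name, "
                            "secured party, and filing date.",
                },
            ],
        }],
    }


request = build_filing_request(b"%PDF-1.4 placeholder", "application/pdf")
print(request["messages"][0]["content"][0]["type"])  # document
```

Calling client.messages.create(**request) would send it. The PARSE and VALIDATE steps then follow the same tool_use loop and schema check as the contact extractor.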
Knowledge Check
Test your understanding of structured output, validation, and error recovery.
Q1: What is the most reliable way to get structured JSON output from Claude?
Q2: Given this Pydantic model, which JSON response will FAIL validation?
class Product(BaseModel):
    name: str
    price: float
    in_stock: bool
- {"name": "Widget", "price": 9.99, "in_stock": true}
- {"name": "Widget", "price": "nine dollars", "in_stock": true}
- {"name": "Widget", "price": 9, "in_stock": false}
- {"name": "Widget", "price": 0.0, "in_stock": true}
Q3: What should you include in a retry prompt after a validation failure?
Q4: What does tool_choice: {"type": "tool", "name": "extract_contact"} do?
Q5: Why should retry logic use exponential backoff?
Q6: Fill in the blank to force Claude to use a specific tool:
response = client.messages.create(
    model="claude-sonnet-4-6",
    tools=[my_tool],
    ______={"type": "tool", "name": "my_tool"},
    messages=[...]
)
- tool_select
- force_tool
- tool_choice
- required_tool
Answer: tool_choice. Setting it to {"type": "tool", "name": "my_tool"} forces Claude to use that specific tool, guaranteeing structured output.
Module Summary
Key Takeaways
- Structured output is the contract between the AI and your system — without it, downstream code breaks unpredictably.
- Tool use is the most reliable method — Claude is specifically trained to produce valid tool_use content blocks with typed parameters.
- Defense in depth — combine format constraints, stop sequences, and schema validation for near-100% reliability.
- Pydantic & Zod turn "it usually works" into "it always works or fails explicitly" with typed, validated objects.
- Retry with error feedback — include the validation error in the retry prompt so Claude can self-correct. Always use exponential backoff.
Next Module Preview: M05 — Function Calling
Now that you can get reliable structured output, you're ready for the pivotal moment in the course: giving Claude the ability to do things, not just generate text. In Module 5, you'll build your first tool-using agent — the moment Claude goes from chatbot to agent.