M06: Multi-Tool Orchestration

In M05 you gave Claude one tool at a time. Now you'll orchestrate multiple tools together — running them in parallel, chaining their outputs, and handling failures gracefully. This is where agents become truly powerful.

Learning Objectives

  • Explain the difference between parallel and sequential tool calls, and when to use each
  • Implement the agentic loop that processes multiple tool_use blocks and chains results
  • Engineer effective tool descriptions that help Claude select the right tool
  • Build a ToolRegistry that dynamically adds and removes tools based on context
  • Handle partial failures, retries, and circuit breakers in multi-tool workflows

Parallel Tool Calls — When and Why

Everyday Analogy

BEFORE: Imagine a kitchen where only one sous chef handles all prep work — chopping onions, then dicing tomatoes, then mincing garlic, one task after another while the head chef waits.

PAIN: Dinner service grinds to a halt because three 10-minute tasks take 30 minutes total, and every dish that depends on those ingredients sits idle the entire time.

MAPPING: Parallel tool calls solve this exactly the way a smart head chef would: assign each task to a different sous chef so all three run simultaneously. (A parallel tool call is when Claude requests multiple tool calls in a single response because the calls are independent. The client executes all tools concurrently and returns all results in one message, reducing round trips and wall-clock time.) The total time drops to the slowest single task (10 minutes), not the sum of all three (30 minutes). In agent terms, Claude emits multiple tool_use blocks in one response, your code runs them concurrently, and the wall-clock time equals the slowest tool, not the total.

Technical Definition Claude can return multiple tool_use content blocks in a single assistant message when the calls are independent. A tool_use block is a content block in Claude's response with type "tool_use"; it contains the tool's name, a unique ID, and an input object with arguments. Your code reads each block, executes the function, and returns the result. When several tool_use blocks appear in one response, your client code can execute all of those tools concurrently. Remember from M05: your code runs the tools, not Claude. Once all tools finish, you bundle all the tool_result blocks into a single user message and send them back to Claude in the next turn. Each tool_result carries the output of one tool execution and must reference the tool_use_id from Claude's request. The payoff? One API round trip handles N tools, instead of N separate round trips.

So what does parallel execution actually look like in an API response? When Claude decides it needs multiple independent tools, it returns multiple tool_use blocks in a single response. Here's an example:

"content": [
  {
    "type": "tool_use",
    "id": "toolu_01A...",
    "name": "web_search",
    "input": {"query": "AI agents 2025"}
  },
  {
    "type": "tool_use",
    "id": "toolu_01B...",
    "name": "wiki_search",
    "input": {"query": "AI agents"}
  },
  {
    "type": "tool_use",
    "id": "toolu_01C...",
    "name": "paper_search",
    "input": {"query": "autonomous agents"}
  }
]

Three tool_use blocks, each with its own id. Your code sees all three, fires them off concurrently (using ThreadPoolExecutor in Python or Promise.all in Node.js), and returns all three tool_result blocks in a single user message. One round trip, three tools executed.
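As a rough illustration of that fan-out step, here is a minimal sketch using ThreadPoolExecutor. The three tool functions are hypothetical mocks with artificial sleeps, the blocks are plain dicts shaped like the JSON above, and run_parallel is an illustrative helper name, not part of any SDK.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mock tools; the sleeps stand in for network latency.
def web_search(query):
    time.sleep(0.02)
    return {"source": "web", "query": query}

def wiki_search(query):
    time.sleep(0.035)
    return {"source": "wiki", "query": query}

def paper_search(query):
    time.sleep(0.05)
    return {"source": "papers", "query": query}

TOOL_FUNCTIONS = {
    "web_search": web_search,
    "wiki_search": wiki_search,
    "paper_search": paper_search,
}

def run_tool_block(block):
    """Execute one tool_use block and wrap its output as a tool_result block."""
    func = TOOL_FUNCTIONS[block["name"]]
    output = func(**block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],  # must echo the id Claude sent
        "content": json.dumps(output),
    }

def run_parallel(tool_use_blocks):
    """Run all blocks concurrently and bundle the results into ONE user message."""
    workers = max(1, len(tool_use_blocks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, whichever tool finishes first.
        results = list(pool.map(run_tool_block, tool_use_blocks))
    return {"role": "user", "content": results}
```

Because pool.map preserves input order, the tool_result blocks line up with the tool_use blocks without extra bookkeeping. This sketch has no error handling; a tool that raises would propagate out of run_parallel.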

Why It Matters Parallel tool calls aren't just a performance optimization — they fundamentally change agent design by enabling concurrent information gathering. Consider a real scenario: a research agent needs to search Google (200ms), query a wiki API (350ms), and hit a paper database (500ms). Sequentially, that's 1,050ms of wall-clock time. In parallel, it's just 500ms — a 2.1x speedup. Scale that to a production agent handling 10,000 requests/day, and you save over 1.5 hours of cumulative user wait time daily. Each parallel batch also uses just one API round trip instead of three, cutting your Anthropic API call volume (and associated overhead) by 67%.
Animation: Parallel vs Sequential Execution
🤖 Claude decides: 3 independent calls
↓ ↓ ↓
🔍 web_search 200ms
📚 wiki_search 350ms
📄 paper_search 500ms
✅ All results returned in one message
Parallel: 500ms (max)  |  Sequential: 1050ms (sum)  |  2.1× faster
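The numbers in this comparison can be checked with a few lines of arithmetic. The sketch below uses only the latencies and request volume quoted in the text; the variable names are illustrative.

```python
# Latencies quoted in the example (milliseconds).
latencies_ms = {"web_search": 200, "wiki_search": 350, "paper_search": 500}

sequential_ms = sum(latencies_ms.values())  # tools run one after another
parallel_ms = max(latencies_ms.values())    # tools run at once; slowest one wins
speedup = sequential_ms / parallel_ms

# Cumulative wait time saved per day at 10,000 requests/day.
requests_per_day = 10_000
saved_hours_per_day = (sequential_ms - parallel_ms) * requests_per_day / 1000 / 3600

print(sequential_ms, parallel_ms, round(speedup, 1), round(saved_hours_per_day, 2))
# prints: 1050 500 2.1 1.53
```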
When NOT to Use Parallel Calls Parallel calls only work when tools have no data dependencies. If Tool B needs the result of Tool A, they must run sequentially. Forcing parallelism on dependent tools produces incorrect results.

Sequential Tool Chains — Output Feeds Input

Everyday Analogy

BEFORE: Imagine trying to build a car by dumping all the raw materials — steel, rubber, glass — into a room and hoping a finished vehicle appears. Without a defined sequence, nothing fits together.

PAIN: You can't install a windshield before the frame is welded, and you can't paint the body before it's assembled. Doing steps out of order wastes materials and produces a broken result.

MAPPING: A sequential tool chain works like an assembly line — Station A (search) produces URLs, Station B (fetch_page) takes those URLs and produces page text, Station C (summarize) takes that text and produces bullet points. Each stage transforms the data and passes it forward. Skipping a station or running them out of order means the next stage gets the wrong input and the whole pipeline breaks.

Technical Definition In a sequential chain, Claude calls Tool A and receives the result. Then it calls Tool B, using data from Tool A's output as input, and so on. Each of these steps is a separate API round trip: your code sends a tool_result, and Claude processes it and responds with another tool_use. This back-and-forth is managed by the agentic loop, the while loop in your code that repeatedly sends tool results back to Claude. It is the same pattern you built in M05, now running for multiple iterations. The loop keeps running until stop_reason, the field in Claude's API response indicating why generation stopped, changes from "tool_use" (Claude wants to call a tool and is waiting for your code to execute it and return the result) to "end_turn" (Claude has finished its work and is ready to present a final answer).

Here's what a sequential chain looks like in practice. Notice how each step's output becomes the next step's input — data transforms at each stage:

// Round trip 1: Claude requests search
tool_use: search({query: "Claude AI"})
tool_result: {urls: ["docs.anthropic.com/...", "blog.anthropic.com/..."]}

// Round trip 2: Claude uses URL from search result
tool_use: fetch_page({url: "docs.anthropic.com/..."})
tool_result: {content: "Claude is an AI assistant built by..."}  // 5000 chars

// Round trip 3: Claude uses page text for summary
tool_use: summarize({text: "Claude is an AI assistant..."})
tool_result: {summary: ["Key point 1...", "Key point 2..."]}

// Round trip 4: Claude composes final answer (stop_reason: "end_turn")

Each round trip is a full API call. Claude sees the previous result, decides what to do next, and issues another tool_use. This continues until the chain is complete and Claude returns its final text response.
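To make the loop concrete without needing an API key, the sketch below replaces the real API call with a hypothetical scripted_model function that mimics Claude's decisions for this one chain. The tool functions, their outputs, and the max_turns guard are all assumptions for illustration; only the message shapes and the stop_reason check follow the pattern described above.

```python
import json

# Hypothetical mock tools for the search -> fetch_page -> summarize chain.
def search(query):
    return {"urls": ["https://docs.example.com/claude"]}

def fetch_page(url):
    return {"content": f"Page text from {url}"}

def summarize(text):
    return {"summary": [text[:20]]}

TOOLS = {"search": search, "fetch_page": fetch_page, "summarize": summarize}

def scripted_model(messages):
    """Stand-in for the API call: picks the next step from the results seen so far."""
    results = [b for m in messages
               if m["role"] == "user" and isinstance(m["content"], list)
               for b in m["content"] if b["type"] == "tool_result"]
    step = len(results)
    last = json.loads(results[-1]["content"]) if results else {}
    if step == 0:
        name, args = "search", {"query": "Claude AI"}
    elif step == 1:
        name, args = "fetch_page", {"url": last["urls"][0]}  # uses search output
    elif step == 2:
        name, args = "summarize", {"text": last["content"]}  # uses page text
    else:
        return {"stop_reason": "end_turn",
                "content": [{"type": "text", "text": "; ".join(last["summary"])}]}
    return {"stop_reason": "tool_use",
            "content": [{"type": "tool_use", "id": f"toolu_{step}",
                         "name": name, "input": args}]}

def run_chain(user_prompt, max_turns=10):
    """Agentic loop: keep returning tool results until stop_reason is not tool_use."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        response = scripted_model(messages)
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] != "tool_use":
            return response["content"][0]["text"], messages
        tool_results = []
        for block in response["content"]:
            if block["type"] == "tool_use":
                output = TOOLS[block["name"]](**block["input"])
                tool_results.append({"type": "tool_result",
                                     "tool_use_id": block["id"],
                                     "content": json.dumps(output)})
        messages.append({"role": "user", "content": tool_results})
    raise RuntimeError("chain did not finish within max_turns")
```

Calling run_chain("Summarize Claude AI") goes through four model calls: three that request a tool and one that returns the final text. The max_turns cap is a defensive choice so a misbehaving chain cannot loop forever.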

Animation: Sequential Chain — Data Transforms at Each Stage
search()query: "Claude AI"
fetch_page()url: docs.anthropic.com
summarize()text: 5000 chars
Round trip 1
Round trip 2
Round trip 3
Cost Awareness Each round trip in a sequential chain sends the entire conversation history. A 5-step chain means 5 API calls with growing message arrays. Monitor token usage — long chains get expensive. (You'll learn context management techniques in M08.)
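To see why long chains get expensive, consider a back-of-the-envelope model in which every call resends everything so far. The token counts below (1,000 tokens for the initial prompt and tool definitions, 1,500 tokens added per step) are made-up illustrative values, not measurements.

```python
def cumulative_input_tokens(base_tokens, tokens_added_per_step, steps):
    """Input tokens per call, and in total, when each call resends the full history."""
    per_call = [base_tokens + i * tokens_added_per_step for i in range(steps)]
    return per_call, sum(per_call)

# Hypothetical sizes: 1,000-token starting context, 1,500 tokens added per step.
per_call, total = cumulative_input_tokens(
    base_tokens=1_000, tokens_added_per_step=1_500, steps=5
)
print(per_call)  # [1000, 2500, 4000, 5500, 7000]
print(total)     # 20000
```

Under these assumptions the fifth call alone costs seven times the first, and total input tokens grow roughly with the square of the chain length.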

Tool Selection — How Claude Picks the Right Tool

Everyday Analogy

BEFORE: Imagine walking into a hardware store where every tool is in an unmarked cardboard box — no labels, no descriptions, just numbered bins. You need to drive a small screw into a circuit board, but you have no idea which bin has the right screwdriver.

PAIN: You end up grabbing random boxes, trying tools that don't fit, stripping the screw, and wasting an hour on a two-minute job. Worse, you might use a power drill and destroy the delicate board entirely.

MAPPING: This is exactly what happens when Claude gets tool definitions with vague descriptions like "searches stuff". Claude reads the name and description of each tool to decide which one fits the user's request. A label like "Phillips-head screwdriver, size #2, for small electronics" maps directly to a good tool description: "Search the web using a query string. Returns top 5 results. Use for current events or factual questions." The clearer the label, the more accurate Claude's selection.

Technical Definition Claude selects tools by weighing four factors. First, and most important, the tool name and description: this is the primary signal Claude uses to decide which tool fits. Second, the parameter schemas (the input_schema JSON Schema you learned in M05): well-defined schemas with property descriptions, types, and required fields help Claude understand what inputs are needed and generate correct arguments, while vague schemas lead to wrong inputs. Third, the user request itself: Claude maps the user's intent ("What's the weather?") to the tool that best matches that capability. Fourth, the conversation context: if Claude just got search results back, it's more likely to pick fetch_page next than send_email.
Animation: Claude's Tool Selection Process
User: "What's the weather in Tokyo and convert 72°F to Celsius?"
get_weather
Get current weather for a city
calculate
Evaluate math expressions
search_db
Query a database
send_email
Send an email message
get_time
Get time in a timezone
translate
Translate between languages

Tool Description Engineering

Bad Description "name": "search", "description": "searches stuff" — too vague. Claude doesn't know what it searches, when to use it, or what it returns.
Good Description "name": "web_search", "description": "Search the web using a query string. Returns top 5 results with title, URL, and snippet. Use for current events, factual questions, or when the user asks to look something up online."
Key Insight Tool descriptions are the new prompts. Investing in clear, specific tool descriptions is as important as writing good system prompts — they directly determine whether Claude picks the right tool.
Chart: Tool Selection Accuracy vs. Number of Tools per Agent. Data points: 3 tools, 97%; 5 tools, 95%; 8 tools, 88%; 12 tools, 76%; 18 tools, 61%. The chart marks a sweet spot at low tool counts and a degradation zone at high tool counts.
🎓 Cert Tip — Domain 2.3

Keep 4–5 tools per agent maximum. Tool selection accuracy degrades rapidly above 5. Anti-pattern: one agent with 18+ tools. Instead, distribute tools across specialized subagents.

⚠️ Common Misconceptions

"More tools = more capable agent" — This is the most counterintuitive misconception in multi-tool orchestration. In practice, tool selection accuracy degrades noticeably once you pass 5–6 tools. Each additional tool adds more descriptions for Claude to evaluate, more chances for ambiguity between similar tools, and more input tokens per request. An agent with 4 focused tools will outperform an agent with 18 scattered ones almost every time.

"Claude automatically parallelizes independent tools" — Claude CAN return multiple tool_use blocks in one response, and it often does when the calls are clearly independent. But you still need to write the parallel execution logic in YOUR code (ThreadPoolExecutor, Promise.all). If your code processes tool_use blocks sequentially even when Claude sends multiple, you lose the speedup. Parallelism requires effort on both sides.

"Sequential is always worse than parallel" — Not at all. When Tool B needs the output of Tool A (e.g., fetch a page from a URL that search returned), they MUST run sequentially. Forcing parallel execution on dependent tools produces incorrect results — Tool B would run with no input. The right approach is to parallelize independent tools and chain dependent ones.

"Dynamic registration is premature optimization" — For a 3-tool agent, yes. For a production agent with 15–20 tools, it's essential. Each tool definition consumes 200–500 input tokens. Sending 20 tools with every request means 4,000–10,000 extra tokens per call — that's real cost at scale. And the accuracy benefit of fewer tools is arguably more important than the token savings.

"Claude always picks the right tool" — Claude is remarkably good at tool selection, but it's not infallible. Ambiguous descriptions, overlapping tool capabilities, and misleading parameter names all cause misselection. This is why tool description engineering matters — it's the single highest-leverage thing you can do to improve agent reliability.

Dynamic Tool Registration

Everyday Analogy

BEFORE: Imagine a surgeon walking into the operating room and finding every instrument the hospital owns laid out on the tray — orthopedic saws, dental drills, eye surgery lasers, and the cardiac tools they actually need. Hundreds of instruments, all within reach.

PAIN: The surgeon wastes time scanning past irrelevant tools, risks grabbing the wrong instrument under pressure, and the tray is so cluttered that the correct scalpel is buried under equipment meant for a completely different specialty.

MAPPING: Dynamic tool registration is like a surgical nurse who curates the tray — only cardiac instruments are laid out for a heart surgery. In agent terms, instead of sending all 20 tools with every API call, you filter the tools array based on the current task context. Fewer tools means Claude scans less, picks more accurately, and you burn fewer input tokens on irrelevant definitions.

Technical Definition Every time you call the Claude API, the tools array you pass gets serialized into input tokens: Claude reads every tool's name, description, and parameter schema before deciding which to use. This is called token overhead, and you pay it on every API call; sending 20 tools when only 3 are relevant wastes tokens on every request. Dynamic tool registration means you build a different tools array for each request based on what the user actually needs. Why bother? Three reasons: (1) Cost: fewer tools means fewer input tokens burned on every call. (2) Accuracy: Claude picks better when it has 4 focused options instead of 20 scattered ones. (3) Security via least privilege: give each request access only to the tools it actually needs. A regular user's research query should never see an admin-only tool like delete_user, even if it exists in your system.
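One possible shape for such a registry is sketched below. The category labels, the admin_only flag, and the tools_for method are illustrative assumptions chosen to show context filtering and least privilege; they are not a prescribed design.

```python
from dataclasses import dataclass

@dataclass
class RegisteredTool:
    schema: dict           # the tool definition sent to the API
    categories: set        # task contexts in which the tool is relevant
    admin_only: bool = False

class ToolRegistry:
    """Holds every tool, but hands out only the subset that fits the request."""

    def __init__(self):
        self._tools = {}

    def register(self, schema, categories, admin_only=False):
        self._tools[schema["name"]] = RegisteredTool(schema, set(categories), admin_only)

    def unregister(self, name):
        self._tools.pop(name, None)

    def tools_for(self, category, is_admin=False):
        """Build the tools array for one request: match context, enforce least privilege."""
        return [
            tool.schema
            for tool in self._tools.values()
            if category in tool.categories and (is_admin or not tool.admin_only)
        ]

# Usage with abbreviated, hypothetical schemas.
registry = ToolRegistry()
registry.register({"name": "web_search"}, ["research"])
registry.register({"name": "fetch_page"}, ["research"])
registry.register({"name": "summarize"}, ["research"])
registry.register({"name": "send_email"}, ["comms"])
registry.register({"name": "delete_user"}, ["admin"], admin_only=True)

research_tools = registry.tools_for("research")
print([t["name"] for t in research_tools])
# prints: ['web_search', 'fetch_page', 'summarize']
```

The list returned by tools_for is what you would pass as the tools argument for that request. A non-admin caller asking for the "admin" category gets an empty list, so delete_user never reaches Claude.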
Animation: Dynamic Tool Registry — Filter by Context
Full Registry (8 tools)
web_search
fetch_page
query_db
send_email
send_slack
delete_user
modify_perms
summarize
Sent to API
Token Savings A typical tool definition uses 200–500 tokens. Sending 20 tools = 4,000–10,000 extra input tokens per request. Filtering to 5 relevant tools cuts that to 1,000–2,500 tokens, saving roughly 3,000–7,500 input tokens per call.

Handling Errors in Multi-Tool Workflows

Everyday Analogy

BEFORE: Imagine a relay race where the team has no backup plan — four runners, one baton, and if anyone trips, the entire team is disqualified. No substitutes, no recovery protocol.

PAIN: In the real race, the second runner twists an ankle at the handoff. Without a plan, the baton hits the ground, the team freezes, and they forfeit a race they were winning. One failure cascades into total failure.

MAPPING: Multi-tool workflows face the same risk — if fetch_page returns a 404, does your entire agent crash? Error handling gives you the backup plan: return is_error: true so Claude can reason about alternatives (like switching to web_search), implement retries with exponential backoff for transient failures, and add circuit breakers that disable a tool after repeated failures so the agent doesn't waste time on a broken endpoint.

Technical Definition When you're running multiple tools, any one of them can fail: a URL returns 404, an API times out, a database query hits a permission error. These are compound failure modes, meaning the more tools you chain together, the higher the probability that at least one breaks. The key rule: when a tool fails, don't crash your agent loop. Instead, return the error as a tool_result with is_error: true and a descriptive message. is_error is a boolean flag in the tool_result block that tells Claude the tool execution failed, so Claude does not treat the error text as a successful result. It says "this tool failed, here's why," and Claude can then decide to try an alternative tool, retry, ask the user for help, or work with partial results. For tools that fail repeatedly (e.g., an API endpoint that's down), use a circuit breaker: a counter that tracks consecutive failures and temporarily disables the tool after N of them, so the agent stops wasting tokens and time retrying a broken endpoint.

That relay race analogy maps directly to real API messages. When a tool fails, you don't just drop the baton — you hand Claude a structured report explaining what went wrong, so it can pick a new runner. Here's exactly what that looks like in practice — the JSON you'd send back to Claude when a tool fails versus when it succeeds with no results:

// Tool FAILED: tells Claude "I couldn't even try"
{
  "type": "tool_result",
  "tool_use_id": "toolu_01A...",
  "is_error": true,
  "content": "{\"error\": \"404 Not Found: https://example.com/api\"}"
}

// Tool SUCCEEDED but found nothing: tells Claude "I checked, nothing there"
{
  "type": "tool_result",
  "tool_use_id": "toolu_01B...",
  "content": "{\"results\": []}"
}

Claude makes very different decisions based on which one you return. The first says "the endpoint is broken, try something else." The second says "nothing matches your query." Confusing the two — returning an empty result when the tool actually crashed — leads Claude to conclude there's genuinely no data, which is a silent, hard-to-debug failure.
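A small helper can keep the two cases from ever being confused. The sketch below is one way to do it; to_tool_result and the two mock tools are hypothetical names for illustration.

```python
import json

def to_tool_result(tool_use_id, func, **kwargs):
    """Run a tool and build the matching tool_result block.

    An exception becomes is_error: true. An empty but successful result is
    passed through unchanged, so "found nothing" never looks like "failed".
    """
    try:
        output = func(**kwargs)
    except Exception as exc:
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "is_error": True,
            "content": json.dumps({"error": f"{type(exc).__name__}: {exc}"}),
        }
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": json.dumps(output),
    }

# Hypothetical mock tools: one that fails, one that succeeds with no matches.
def broken_fetch(url):
    raise ConnectionError(f"404 Not Found: {url}")

def empty_search(query):
    return {"results": []}

failed = to_tool_result("toolu_01A", broken_fetch, url="https://example.com/api")
empty = to_tool_result("toolu_01B", empty_search, query="nonexistent topic")
```

Here failed carries is_error: true with the exception message, while empty has no is_error key at all and its content is the JSON for an empty results list.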

Animation: Error Recovery & Circuit Breaker
🤖Claude calls fetch_page("example.com/api")
Tool returns: is_error: true, "404 Not Found"
💡Claude reasons: "Page unavailable, I'll try web_search instead"
🔍Claude calls web_search("example.com API docs")
web_search succeeds — workflow continues on alternate path
After 3 failures: circuit breaker disables fetch_page, notifies user

Error Handling Strategies

  • Per-tool try/catch: Wrap each tool in error handling; return descriptive messages as tool_result
  • Let Claude adapt: Return is_error: true — Claude can often find alternatives on its own
  • Tool-level retries: Exponential backoff for transient failures (timeouts, rate limits)
  • Circuit breakers: After N consecutive failures, disable the tool and notify the user
  • Graceful degradation: Return partial results — some data is better than no data
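The tool-level retry strategy from the list above might look like the following sketch. The defaults (3 retries, 1-second base delay, retrying only on TimeoutError) and the injectable sleep parameter are assumptions chosen for illustration; which exceptions count as transient depends on your tools.

```python
import time

def call_with_backoff(func, *args, max_retries=3, base_delay=1.0,
                      retry_on=(TimeoutError,), sleep=time.sleep, **kwargs):
    """Call func, retrying transient failures with exponentially growing waits.

    Waits base_delay, then 2x, then 4x between attempts. After max_retries
    retries the last exception is re-raised so the caller can report it.
    Exceptions not listed in retry_on propagate immediately.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)

# Usage: a hypothetical tool that times out twice, then succeeds.
calls = {"count": 0}

def flaky_tool():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("upstream timed out")
    return {"status": "ok"}

waits = []
result = call_with_backoff(flaky_tool, sleep=waits.append)  # record waits instead of sleeping
print(result, waits)
# prints: {'status': 'ok'} [1.0, 2.0]
```

Passing sleep as a parameter keeps the helper testable without real delays. Once the retries are exhausted, the re-raised exception is what you would turn into a tool_result with is_error: true.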
⚠️ Common Misconceptions — Error Handling

"If a tool fails, just return an empty result" — This is one of the most dangerous mistakes in agent development. An empty result ({"results": []}) tells Claude "I checked and found nothing." An error (is_error: true) tells Claude "I couldn't even check." Claude makes completely different decisions based on which one you return. The first leads to "there's no data on this topic." The second leads to "let me try a different approach." Confusing them creates silent, hard-to-debug failures.

"Retrying a failed tool 100 times will eventually work" — Brute-force retries burn tokens and time. If an endpoint is down, it's down. Use exponential backoff (wait 1s, then 2s, then 4s) with a maximum of 3 retries for transient failures like timeouts. For persistent failures, use a circuit breaker instead.

"Claude can recover from any error automatically" — Claude is good at adapting when you give it structured error information. But it can only work with the tools you've provided. If the only way to get data is through a broken tool and no alternative exists, Claude will inform the user rather than magically producing correct information. Always design your tool set with fallback options where possible.

Circuit Breakers in Depth

A circuit breaker is borrowed from electrical engineering — when too much current flows through a wire, the breaker trips and cuts the circuit to prevent a fire. In agent code, it works the same way: you track consecutive failures for each tool, and after a threshold (typically 3–5 failures in a row), you temporarily disable that tool. This prevents the agent from wasting tokens and time retrying a broken endpoint over and over.

Here's how it works internally. You maintain a counter per tool (e.g., {"fetch_page": 0, "web_search": 0}). Every time a tool succeeds, its counter resets to zero. Every time it fails, the counter increments. When the counter hits your threshold, the circuit "opens" — meaning that tool is removed from the tools array sent to Claude on the next API call. Claude never even sees it as an option, so it naturally picks alternatives. After a cooldown period (say 60 seconds), you can "half-open" the circuit by adding the tool back for one test call to see if the endpoint has recovered.

How does this differ from simple retries? Retries happen within a single tool call — you might try the same HTTP request 3 times with exponential backoff before giving up. Circuit breakers operate across tool calls — they track a pattern of repeated failures over time and make a system-level decision to stop using that tool entirely. You'd typically use both together: retry transient failures within a call, and circuit-break persistent failures across calls.
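The counter, threshold, cooldown, and half-open behavior described above can be sketched in a small class. The class name, method names, and the injectable clock are illustrative assumptions; the threshold of 3 and cooldown of 60 seconds come from the examples in the text.

```python
import time

class CircuitBreaker:
    """Tracks consecutive failures per tool and hides tools whose circuit is open."""

    def __init__(self, threshold=3, cooldown=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock      # injectable so tests can control time
        self.failures = {}      # tool name -> consecutive failure count
        self.opened_at = {}     # tool name -> time the circuit last opened

    def record(self, name, ok):
        """Call after every tool execution with its outcome."""
        if ok:
            self.failures[name] = 0         # any success resets the counter
            self.opened_at.pop(name, None)  # and closes the circuit
            return
        self.failures[name] = self.failures.get(name, 0) + 1
        if self.failures[name] >= self.threshold:
            self.opened_at[name] = self.clock()  # open, or re-open after a failed test call

    def is_available(self, name):
        opened = self.opened_at.get(name)
        if opened is None:
            return True
        # Half-open: once the cooldown has passed, allow a test call.
        return self.clock() - opened >= self.cooldown

    def filter_tools(self, tools):
        """Return only the tool schemas whose circuit is closed or half-open."""
        return [t for t in tools if self.is_available(t["name"])]
```

Before each API call you would pass your tools array through filter_tools, and after each tool execution call record with the outcome. If the half-open test call fails, the counter is still at or above the threshold, so the circuit re-opens and the cooldown starts again.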

Code Walkthrough: Research Assistant Agent

This agent demonstrates all orchestration patterns: parallel search, sequential fetch-and-summarize, dynamic tool registration, and error recovery.

Conceptual Bridge: You've now learned four orchestration concepts separately — parallel calls, sequential chains, tool selection, and dynamic registration. The code below stitches them all into one working agent. We'll build it in four steps: first define the tool schemas Claude will see, then implement the actual functions those schemas map to, then write the agentic loop that handles both parallel and sequential execution, and finally wrap the tools in a registry for dynamic filtering. Each step builds on the previous one.

Step 1: Define the Tools

Let's start by defining 5 tool schemas that tell Claude what's available. Each schema has a name, a description, and an input_schema specifying arguments. Remember from the Tool Selection section: Claude reads these schemas to decide which tools to call and what arguments to pass. It never executes them directly. The interesting part is the description field — notice how each one explains not just what the tool does, but when to use it (e.g., "Use after web_search to get full content from a result URL"). Vague descriptions like "does stuff" leave Claude guessing, and a guessing agent is an unreliable agent.
# pip install "anthropic>=0.30.0"
import anthropic
import json
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY env var

tools = [
    {
        "name": "web_search",
        "description": (
            "Search the web for current information. Returns top 3 "
            "results with title, URL, and snippet. Use for recent "
            "events, factual questions, or general research."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "fetch_page",
        "description": (
            "Fetch the full text content of a web page by URL. "
            "Returns page text (max 5000 chars). Use after "
            "web_search to get full content from a result URL."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "Full URL to fetch"}
            },
            "required": ["url"]
        }
    },
    {
        "name": "summarize_text",
        "description": (
            "Summarize long text into key points (3-5 bullets). "
            "Use after fetch_page to condense page content."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Text to summarize"},
                "max_points": {
                    "type": "integer",
                    "description": "Max bullet points (default 5)"
                }
            },
            "required": ["text"]
        }
    },
    {
        "name": "format_citation",
        "description": (
            "Format a source as an academic citation. Use after "
            "summaries are ready to create proper references."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Article title"},
                "url": {"type": "string", "description": "Source URL"},
                "accessed_date": {"type": "string", "description": "e.g. '2025-01-15'"}
            },
            "required": ["title", "url"]
        }
    },
    {
        "name": "save_to_file",
        "description": "Save content to a local file. Returns file path.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string", "description": "Output filename"},
                "content": {"type": "string", "description": "Content to save"}
            },
            "required": ["filename", "content"]
        }
    }
]
// npm install @anthropic-ai/sdk@^0.30.0
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY env var

const tools = [
  {
    name: "web_search",
    description:
      "Search the web for current information. Returns top 3 " +
      "results with title, URL, and snippet. Use for recent " +
      "events, factual questions, or general research.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "Search query" }
      },
      required: ["query"]
    }
  },
  {
    name: "fetch_page",
    description:
      "Fetch the full text content of a web page by URL. " +
      "Returns page text (max 5000 chars). Use after " +
      "web_search to get full content from a result URL.",
    input_schema: {
      type: "object",
      properties: {
        url: { type: "string", description: "Full URL to fetch" }
      },
      required: ["url"]
    }
  },
  {
    name: "summarize_text",
    description:
      "Summarize long text into key points (3-5 bullets). " +
      "Use after fetch_page to condense page content.",
    input_schema: {
      type: "object",
      properties: {
        text: { type: "string", description: "Text to summarize" },
        max_points: { type: "integer", description: "Max bullet points (default 5)" }
      },
      required: ["text"]
    }
  },
  {
    name: "format_citation",
    description:
      "Format a source as an academic citation. Use after " +
      "summaries are ready to create proper references.",
    input_schema: {
      type: "object",
      properties: {
        title: { type: "string", description: "Article title" },
        url: { type: "string", description: "Source URL" },
        accessed_date: { type: "string", description: "e.g. '2025-01-15'" }
      },
      required: ["title", "url"]
    }
  },
  {
    name: "save_to_file",
    description: "Save content to a local file. Returns file path.",
    input_schema: {
      type: "object",
      properties: {
        filename: { type: "string", description: "Output filename" },
        content: { type: "string", description: "Content to save" }
      },
      required: ["filename", "content"]
    }
  }
];

Step 2: Implement Tools with Error Handling

Now we write the actual functions behind each schema. These are mocks that return fake data, but the structure is identical to production code. When you're ready for the real thing, just swap in actual API calls and everything else stays the same.

The interesting design choice here is the execute_tool dispatcher function. Instead of having your agentic loop know the internals of every tool, it just calls execute_tool(name, inputs) and gets back a result. This keeps the loop clean: it doesn't care whether it's calling a web scraper or a database, it just passes a name and inputs and gets JSON back. Adding a new tool later means writing one function and adding one entry to the dictionary. No changes to the loop.

Here's the dilemma with error handling: if a tool throws an exception, should you crash the whole agent? Absolutely not. The dispatcher wraps every call in try/except (try/catch in JavaScript) and returns errors as structured JSON together with an error flag that you pass to Claude as is_error: true. Claude reads that structured error and can reason about alternatives: "OK, the page fetch failed, let me try a web search instead." But this only works if you give Claude parseable information. A raw Python stack trace doesn't help it reason. A clean {"error": "404 Not Found", "tool": "fetch_page"} does.
# Mock implementations (replace with real APIs in production)
def web_search(query: str) -> dict:
    time.sleep(0.2)  # Simulate latency
    return {"results": [
        {"title": f"Result 1: {query}", "url": "https://example.com/1",
         "snippet": f"Overview of {query}..."},
        {"title": f"Result 2: {query}", "url": "https://example.com/2",
         "snippet": f"Developments in {query}..."},
        {"title": f"Result 3: {query}", "url": "https://broken.example.com/404",
         "snippet": f"Deep dive into {query}..."},
    ]}

def fetch_page(url: str) -> dict:
    time.sleep(0.3)
    if "broken" in url or "404" in url:
        raise ConnectionError(f"404 Not Found: {url}")
    return {"content": f"Full page content from {url}. " * 20}

def summarize_text(text: str, max_points: int = 5) -> dict:
    return {"summary": [f"Key point {i+1}" for i in range(min(max_points, 5))]}

def format_citation(title: str, url: str, accessed_date: str = None) -> dict:
    date = accessed_date or "2025-01-15"
    return {"citation": f'"{title}." Available at: {url}. Accessed: {date}.'}

def save_to_file(filename: str, content: str) -> dict:
    return {"status": "saved", "path": f"/output/{filename}", "bytes": len(content)}

# Dispatcher with per-tool error handling
tool_functions = {
    "web_search": web_search, "fetch_page": fetch_page,
    "summarize_text": summarize_text, "format_citation": format_citation,
    "save_to_file": save_to_file,
}

def execute_tool(name: str, inputs: dict) -> tuple[str, bool]:
    """Execute a tool, returning (result_json, is_error)."""
    func = tool_functions.get(name)
    if not func:
        return json.dumps({"error": f"Unknown tool: {name}"}), True
    try:
        result = func(**inputs)
        return json.dumps(result), False
    except Exception as e:
        return json.dumps({"error": str(e), "tool": name}), True
// Mock implementations
async function webSearch(query) {
  await new Promise(r => setTimeout(r, 200));
  return { results: [
    { title: `Result 1: ${query}`, url: "https://example.com/1",
      snippet: `Overview of ${query}...` },
    { title: `Result 2: ${query}`, url: "https://example.com/2",
      snippet: `Developments in ${query}...` },
    { title: `Result 3: ${query}`, url: "https://broken.example.com/404",
      snippet: `Deep dive into ${query}...` },
  ]};
}

async function fetchPage(url) {
  await new Promise(r => setTimeout(r, 300));
  if (url.includes("broken") || url.includes("404"))
    throw new Error(`404 Not Found: ${url}`);
  return { content: `Full page content from ${url}. `.repeat(20) };
}

function summarizeText(text, maxPoints = 5) {
  return { summary: Array.from({ length: Math.min(maxPoints, 5) },
    (_, i) => `Key point ${i + 1}`) };
}

function formatCitation(title, url, accessedDate) {
  const date = accessedDate || "2025-01-15";
  return { citation: `"${title}." Available at: ${url}. Accessed: ${date}.` };
}

function saveToFile(filename, content) {
  return { status: "saved", path: `/output/${filename}`, bytes: content.length };
}

const toolFunctions = {
  web_search: (i) => webSearch(i.query),
  fetch_page: (i) => fetchPage(i.url),
  summarize_text: (i) => summarizeText(i.text, i.max_points),
  format_citation: (i) => formatCitation(i.title, i.url, i.accessed_date),
  save_to_file: (i) => saveToFile(i.filename, i.content),
};

async function executeTool(name, inputs) {
  const func = toolFunctions[name];
  if (!func)
    return { result: JSON.stringify({ error: `Unknown tool: ${name}` }), isError: true };
  try {
    const result = await func(inputs);
    return { result: JSON.stringify(result), isError: false };
  } catch (e) {
    return { result: JSON.stringify({ error: e.message, tool: name }), isError: true };
  }
}
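
To see what "parseable information" looks like in practice, here is a minimal, self-contained sketch of the same dispatcher idea. The names safe_call and flaky_fetch are hypothetical and exist only for this illustration; the separate TypeError branch is an assumption about how you might report bad arguments, not part of the lesson's dispatcher.

```python
import json

def flaky_fetch(url: str) -> dict:
    """Hypothetical tool that always fails, standing in for a real fetch."""
    raise ConnectionError(f"404 Not Found: {url}")

def safe_call(name: str, func, inputs: dict) -> tuple[str, bool]:
    """Run one tool and return (result_json, is_error) without raising."""
    try:
        return json.dumps(func(**inputs)), False
    except TypeError as e:
        # Usually means the model sent missing or unexpected arguments.
        return json.dumps({"error": f"Bad arguments: {e}", "tool": name}), True
    except Exception as e:
        # Any other failure becomes a short, structured error message.
        return json.dumps({"error": str(e), "tool": name}), True

# A failing tool call yields structured JSON plus an error flag, not a crash.
result_json, is_error = safe_call(
    "fetch_page", flaky_fetch, {"url": "https://broken.example.com/404"}
)
print(result_json, is_error)

# A call with the wrong argument name is reported the same way.
bad_json, bad_flag = safe_call("fetch_page", flaky_fetch, {"link": "x"})
print(bad_json, bad_flag)
```

Both calls return normally, so the agentic loop can forward the JSON to Claude and keep going.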

Step 3: The Agentic Loop with Parallel Execution

This is the orchestration engine, the code that ties everything together. The loop sends messages to Claude, checks the response for tool_use blocks, executes them, and feeds results back. It keeps going until Claude says it's done (stop_reason: "end_turn"). The clever part is the branching logic: when Claude returns multiple tool_use blocks, the code runs them in parallel using ThreadPoolExecutor (Python) or Promise.all (Node.js). When there's only one, it runs that tool directly. Either way, results go back in one message. In the Python version, as_completed yields results in completion order rather than request order; that is fine because each tool_result is matched to its request by tool_use_id.

Two details matter for safety and correctness. First, the max_iterations limit: without it, a confused model could loop indefinitely, burning tokens and money. Second, a subtle gotcha: you must append response.content (the full content array, including tool_use blocks) to messages, not just the text. Claude needs to see its own tool requests in the conversation history to make sense of the results you're sending back.
def run_research_agent(question: str, available_tools=None) -> str:
    """Run the agentic loop with parallel tool execution."""
    active_tools = available_tools or tools
    messages = [{"role": "user", "content": question}]
    max_iterations = 10  # Safety limit

    for iteration in range(max_iterations):
        try:
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=4096,
                system="You are a research assistant. Search multiple "
                       "sources in parallel when possible.",
                tools=active_tools,
                messages=messages,
            )
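        except anthropic.APIError as e:
            # Only HTTP status errors carry status_code; connection errors do not.
            status = getattr(e, "status_code", "n/a")
            return f"API error: {status} - {e.message}"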

        # Collect tool_use blocks
        tool_uses = [b for b in response.content if b.type == "tool_use"]

        if response.stop_reason == "end_turn" or not tool_uses:
            # Claude is done — extract final text
            return "\n".join(
                b.text for b in response.content if b.type == "text"
            )

        # Execute tools — parallel when multiple requested
        if len(tool_uses) > 1:
            # PARALLEL: use ThreadPoolExecutor
            tool_results = []
            with ThreadPoolExecutor(max_workers=len(tool_uses)) as pool:
                futures = {
                    pool.submit(execute_tool, tu.name, tu.input): tu.id
                    for tu in tool_uses
                }
                for future in as_completed(futures):
                    tid = futures[future]
                    result_json, is_err = future.result()
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tid,
                        "content": result_json,
                        **({"is_error": True} if is_err else {}),
                    })
        else:
            # SEQUENTIAL: single tool
            tu = tool_uses[0]
            result_json, is_err = execute_tool(tu.name, tu.input)
            tool_results = [{
                "type": "tool_result",
                "tool_use_id": tu.id,
                "content": result_json,
                **({"is_error": True} if is_err else {}),
            }]

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached."

# Run it
answer = run_research_agent(
    "Research the latest developments in AI agents. "
    "Search multiple sources and summarize findings."
)
print(answer)
async function runResearchAgent(question, availableTools) {
  const activeTools = availableTools || tools;
  const messages = [{ role: "user", content: question }];
  const maxIterations = 10;

  for (let i = 0; i < maxIterations; i++) {
    let response;
    try {
      response = await client.messages.create({
        model: "claude-sonnet-4-6",
        max_tokens: 4096,
        system: "You are a research assistant. Search multiple " +
                "sources in parallel when possible.",
        tools: activeTools,
        messages,
      });
    } catch (e) {
      return `API error: ${e.status} - ${e.message}`;
    }

    const toolUses = response.content.filter(b => b.type === "tool_use");

    if (response.stop_reason === "end_turn" || toolUses.length === 0) {
      return response.content
        .filter(b => b.type === "text")
        .map(b => b.text)
        .join("\n");
    }

    let toolResults;
    if (toolUses.length > 1) {
      // PARALLEL: Promise.all
      toolResults = await Promise.all(
        toolUses.map(async (tu) => {
          const { result, isError } = await executeTool(tu.name, tu.input);
          return {
            type: "tool_result",
            tool_use_id: tu.id,
            content: result,
            ...(isError ? { is_error: true } : {}),
          };
        })
      );
    } else {
      // SEQUENTIAL: single tool
      const tu = toolUses[0];
      const { result, isError } = await executeTool(tu.name, tu.input);
      toolResults = [{
        type: "tool_result",
        tool_use_id: tu.id,
        content: result,
        ...(isError ? { is_error: true } : {}),
      }];
    }

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
  return "Max iterations reached.";
}

const answer = await runResearchAgent(
  "Research the latest developments in AI agents. " +
  "Search multiple sources and summarize findings."
);
console.log(answer);
What Just Happened? You just built a complete agentic loop that handles both parallel and sequential tool execution. When Claude returns multiple tool_use blocks, your code runs them concurrently using ThreadPoolExecutor (Python) or Promise.all (Node.js). When Claude returns a single tool call, it runs sequentially. Either way, results go back to Claude in one message, and the loop repeats until Claude says it's done (stop_reason: "end_turn"). This is the same pattern powering production agents — the only difference in real systems is that the tool implementations call actual APIs instead of mocks.
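
To make the wall-clock benefit concrete, the following sketch times three simulated tool calls run one after another and then concurrently. The slow_tool function and its 0.2-second delay are assumptions chosen for illustration; real tools will have different latencies, and the exact timings depend on your machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_tool(label: str, delay: float = 0.2) -> str:
    """Hypothetical I/O-bound tool: sleeps to simulate network latency."""
    time.sleep(delay)
    return f"{label} done"

labels = ["search-1", "search-2", "search-3"]

# Sequential: total time is roughly the sum of the delays.
start = time.perf_counter()
sequential = [slow_tool(label) for label in labels]
sequential_secs = time.perf_counter() - start

# Parallel: total time is roughly the slowest single call.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(labels)) as pool:
    parallel = list(pool.map(slow_tool, labels))  # map preserves input order
parallel_secs = time.perf_counter() - start

print(f"sequential: {sequential_secs:.2f}s, parallel: {parallel_secs:.2f}s")
```

Threads help here because the simulated work is waiting, not computing; CPU-heavy tools would need a different approach.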

Step 4: Dynamic Tool Registry

Let's put the dynamic registration concept from earlier into code. The ToolRegistry class stores tools with category tags and filters them on demand. The payoff is simple: instead of sending all 20 tools with every API call, you call get_tools_for_context(tags=["research"]) and get back only the 3 tools relevant to the current phase. This is especially powerful in multi-phase workflows — you'd use research tools during the search phase, then swap to citation tools once summaries are ready. One small trap to watch for: tag names are case-sensitive. If you tag one tool "research" and another "Research", the filter won't match both. Pick a convention (lowercase recommended) and stick with it.
class ToolRegistry:
    """Manages tools and filters them by context."""

    def __init__(self):
        self._tools: dict[str, dict] = {}
        self._tags: dict[str, set[str]] = {}

    def register(self, tool: dict, tags: list[str] = None):
        name = tool["name"]
        self._tools[name] = tool
        self._tags[name] = set(tags or [])

    def unregister(self, name: str):
        self._tools.pop(name, None)
        self._tags.pop(name, None)

    def get_tools_for_context(
        self, tags: list[str] = None, names: list[str] = None
    ) -> list[dict]:
        if names:
            return [self._tools[n] for n in names if n in self._tools]
        if tags:
            tag_set = set(tags)
            return [
                self._tools[n] for n, t in self._tags.items()
                if t & tag_set
            ]
        return list(self._tools.values())

# Usage
registry = ToolRegistry()
registry.register(tools[0], tags=["research", "search"])
registry.register(tools[1], tags=["research", "fetch"])
registry.register(tools[2], tags=["research", "analysis"])
registry.register(tools[3], tags=["citation"])
registry.register(tools[4], tags=["output"])

# Phase 1: only research tools
research_tools = registry.get_tools_for_context(tags=["research"])
# => [web_search, fetch_page, summarize_text]

# Phase 2: switch to citation and output tools once summaries are ready
cite_tools = registry.get_tools_for_context(
    names=["format_citation", "save_to_file"]
)
class ToolRegistry {
  constructor() {
    this._tools = new Map();
    this._tags = new Map();
  }

  register(tool, tags = []) {
    this._tools.set(tool.name, tool);
    this._tags.set(tool.name, new Set(tags));
  }

  unregister(name) {
    this._tools.delete(name);
    this._tags.delete(name);
  }

  getToolsForContext({ tags, names } = {}) {
    if (names)
      return names.filter(n => this._tools.has(n)).map(n => this._tools.get(n));
    if (tags) {
      const tagSet = new Set(tags);
      const result = [];
      for (const [name, toolTags] of this._tags) {
        for (const t of tagSet) {
          if (toolTags.has(t)) { result.push(this._tools.get(name)); break; }
        }
      }
      return result;
    }
    return [...this._tools.values()];
  }
}

// Usage
const registry = new ToolRegistry();
registry.register(tools[0], ["research", "search"]);
registry.register(tools[1], ["research", "fetch"]);
registry.register(tools[2], ["research", "analysis"]);
registry.register(tools[3], ["citation"]);
registry.register(tools[4], ["output"]);

const researchTools = registry.getToolsForContext({ tags: ["research"] });
const citeTools = registry.getToolsForContext({
  names: ["format_citation", "save_to_file"]
});
What Just Happened? You built a ToolRegistry that tags tools by category ("research", "citation", "output") and filters them on demand. In a real workflow, you'd call get_tools_for_context(tags=["research"]) during the search phase to give Claude only 3 tools, then switch to names=["format_citation", "save_to_file"] when summaries are ready. This is how production agents keep tool sets lean — each phase of a multi-step workflow sees only the tools it needs, improving both accuracy and token efficiency.
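
One way to sidestep the case-sensitivity trap is to normalize tags when they go in and when they are queried. The class below is a hypothetical variant written for this illustration (NormalizedToolRegistry and _norm are not part of the lesson's code), and it supports only tag filtering to keep the sketch short.

```python
class NormalizedToolRegistry:
    """Hypothetical ToolRegistry variant that lowercases and trims tags."""

    def __init__(self):
        self._tools: dict[str, dict] = {}
        self._tags: dict[str, set[str]] = {}

    @staticmethod
    def _norm(tags) -> set[str]:
        # "Research", "research " and "RESEARCH" all become "research".
        return {t.strip().lower() for t in (tags or [])}

    def register(self, tool: dict, tags=None) -> None:
        self._tools[tool["name"]] = tool
        self._tags[tool["name"]] = self._norm(tags)

    def get_tools_for_context(self, tags=None) -> list[dict]:
        wanted = self._norm(tags)
        if not wanted:
            return list(self._tools.values())
        # Keep a tool if it shares at least one normalized tag with the query.
        return [self._tools[n] for n, t in self._tags.items() if t & wanted]

demo = NormalizedToolRegistry()
demo.register({"name": "web_search"}, tags=["Research"])
demo.register({"name": "fetch_page"}, tags=["research "])
demo.register({"name": "save_to_file"}, tags=["output"])

names = [t["name"] for t in demo.get_tools_for_context(tags=["RESEARCH"])]
print(names)  # ['web_search', 'fetch_page']
```

Normalizing at the boundary means callers no longer need to agree on a casing convention, at the cost of treating "Research" and "research" as the same tag.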

Hands-On Exercise

What You'll Build

A multi-tool research agent that searches multiple sources in parallel, fetches pages sequentially, handles errors gracefully with is_error: true, and uses a ToolRegistry to filter tools by context.

Time Estimate: 30–45 minutes

Prerequisites: Python 3.10+ (or Node.js 18+), an Anthropic API key, and completion of M05 (Function Calling Fundamentals).

Files You'll Create: multi_tool_agent.py (or multi_tool_agent.mjs for Node.js) — a single file containing tool schemas, mock implementations, a tool dispatcher, a ToolRegistry, and the agentic loop with parallel execution.

Environment Setup

mkdir multi-tool-lab && cd multi-tool-lab
python -m venv venv && source venv/bin/activate   # Windows: venv\Scripts\activate
pip install "anthropic>=0.30.0"
export ANTHROPIC_API_KEY=your-key-here             # Windows: set ANTHROPIC_API_KEY=your-key-here
mkdir multi-tool-lab && cd multi-tool-lab
npm init -y && npm install @anthropic-ai/sdk
export ANTHROPIC_API_KEY=your-key-here             # Windows: set ANTHROPIC_API_KEY=your-key-here

Step 1: Define Tools, Mock Implementations & ToolRegistry

What: This step sets up everything the agent needs: 5 tool schemas that Claude will see, mock functions behind each schema, an execute_tool dispatcher with error handling, and a ToolRegistry for filtering tools by context.

Why: Separating tool definitions (what Claude sees) from tool implementations (what your code runs) is a fundamental pattern. The ToolRegistry adds the ability to dynamically filter which tools are sent to Claude based on context. We're putting it all in one file for simplicity.

Create a new file called multi_tool_agent.py and add the following:

import anthropic
import json
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

# ── Tool Schemas (what Claude sees) ──────────────────────────
tools = [
    {
        "name": "web_search",
        "description": (
            "Search the web for current information. Returns top 3 "
            "results with title, URL, and snippet. Use for recent "
            "events, factual questions, or general research."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "fetch_page",
        "description": (
            "Fetch the full text content of a web page by URL. "
            "Returns page text (max 5000 chars). Use after "
            "web_search to get full content from a result URL."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "Full URL to fetch"}
            },
            "required": ["url"]
        }
    },
    {
        "name": "summarize_text",
        "description": (
            "Summarize long text into key points (3-5 bullets). "
            "Use after fetch_page to condense page content."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Text to summarize"},
                "max_points": {
                    "type": "integer",
                    "description": "Max bullet points (default 5)"
                }
            },
            "required": ["text"]
        }
    },
    {
        "name": "format_citation",
        "description": (
            "Format a source as an academic citation. Use after "
            "summaries are ready to create proper references."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Article title"},
                "url": {"type": "string", "description": "Source URL"},
                "accessed_date": {"type": "string", "description": "e.g. '2025-01-15'"}
            },
            "required": ["title", "url"]
        }
    },
    {
        "name": "save_to_file",
        "description": "Save content to a local file. Returns file path.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string", "description": "Output filename"},
                "content": {"type": "string", "description": "Content to save"}
            },
            "required": ["filename", "content"]
        }
    }
]

# ── Mock Implementations ─────────────────────────────────────
def web_search(query: str) -> dict:
    time.sleep(0.2)
    return {"results": [
        {"title": f"Result 1: {query}", "url": "https://example.com/1",
         "snippet": f"Overview of {query}..."},
        {"title": f"Result 2: {query}", "url": "https://example.com/2",
         "snippet": f"Developments in {query}..."},
        {"title": f"Result 3: {query}", "url": "https://broken.example.com/404",
         "snippet": f"Deep dive into {query}..."},
    ]}

def fetch_page(url: str) -> dict:
    time.sleep(0.3)
    if "broken" in url or "404" in url:
        raise ConnectionError(f"404 Not Found: {url}")
    return {"content": f"Full page content from {url}. " * 20}

def summarize_text(text: str, max_points: int = 5) -> dict:
    return {"summary": [f"Key point {i+1}" for i in range(min(max_points, 5))]}

def format_citation(title: str, url: str, accessed_date: str = None) -> dict:
    date = accessed_date or "2025-01-15"
    return {"citation": f'"{title}." Available at: {url}. Accessed: {date}.'}

def save_to_file(filename: str, content: str) -> dict:
    return {"status": "saved", "path": f"/output/{filename}", "bytes": len(content)}

# ── Dispatcher with Error Handling ───────────────────────────
tool_functions = {
    "web_search": web_search, "fetch_page": fetch_page,
    "summarize_text": summarize_text, "format_citation": format_citation,
    "save_to_file": save_to_file,
}

def execute_tool(name: str, inputs: dict) -> tuple[str, bool]:
    """Execute a tool, returning (result_json, is_error)."""
    func = tool_functions.get(name)
    if not func:
        return json.dumps({"error": f"Unknown tool: {name}"}), True
    try:
        result = func(**inputs)
        return json.dumps(result), False
    except Exception as e:
        return json.dumps({"error": str(e), "tool": name}), True

# ── ToolRegistry ─────────────────────────────────────────────
class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, dict] = {}
        self._tags: dict[str, set[str]] = {}

    def register(self, tool: dict, tags: list[str] = None):
        self._tools[tool["name"]] = tool
        self._tags[tool["name"]] = set(tags or [])

    def unregister(self, name: str):
        self._tools.pop(name, None)
        self._tags.pop(name, None)

    def get_tools_for_context(self, tags: list[str] = None, names: list[str] = None) -> list[dict]:
        if names:
            return [self._tools[n] for n in names if n in self._tools]
        if tags:
            tag_set = set(tags)
            return [self._tools[n] for n, t in self._tags.items() if t & tag_set]
        return list(self._tools.values())

# Register tools with tags
registry = ToolRegistry()
registry.register(tools[0], tags=["research", "search"])
registry.register(tools[1], tags=["research", "fetch"])
registry.register(tools[2], tags=["research", "analysis"])
registry.register(tools[3], tags=["citation"])
registry.register(tools[4], tags=["output"])

print("✓ Tools, dispatcher, and registry ready.")
print(f"  All tools: {[t['name'] for t in registry.get_tools_for_context()]}")
print(f"  Research only: {[t['name'] for t in registry.get_tools_for_context(tags=['research'])]}")

Run it:

Command
python multi_tool_agent.py
Expected Output
✓ Tools, dispatcher, and registry ready.
  All tools: ['web_search', 'fetch_page', 'summarize_text', 'format_citation', 'save_to_file']
  Research only: ['web_search', 'fetch_page', 'summarize_text']
✅ Checkpoint If you see both lists printed — 5 tools total and 3 research-only tools — Step 1 is working. The registry correctly filters by tag.
Troubleshooting
  • ModuleNotFoundError: No module named 'anthropic' → Run pip install anthropic
  • TypeError: 'type' object is not subscriptable → Built-in generic annotations such as dict[str, dict], list[dict], set[str], and tuple[str, bool] require Python 3.9+. On older versions, add from __future__ import annotations at the top of the file, or use typing.Dict, typing.List, typing.Set, typing.Tuple.
  • Tag filter returns nothing unexpectedly → Tag names are case-sensitive. "Research" and "research" are different. The registry uses set intersection (t & tag_set) — if your tag list and registered tags don't match exactly, you'll get an empty result.

Step 2: Add the Agentic Loop with Parallel Execution

What: This step adds the orchestration engine — the loop that sends messages to Claude, executes tool calls (in parallel when multiple are returned), feeds results back, and repeats until Claude is done.

Why: Without this loop, you can only make one-shot API calls. The agentic loop is what turns a chatbot into an agent — it lets Claude chain together multiple tools across multiple iterations to complete complex tasks. This step uses the tools and dispatcher from Step 1.

Add the following to the bottom of multi_tool_agent.py (after the registry setup):

# ── Agentic Loop with Parallel Execution ─────────────────────
def run_agent(question: str, tool_tags: list[str] = None, verbose: bool = True) -> str:
    """Run the multi-tool agent. Optionally filter tools by tag."""
    if tool_tags:
        active_tools = registry.get_tools_for_context(tags=tool_tags)
    else:
        active_tools = registry.get_tools_for_context()

    if verbose:
        print(f"\n{'='*60}")
        print(f"Question: {question}")
        print(f"Active tools: {[t['name'] for t in active_tools]}")
        print(f"{'='*60}")

    messages = [{"role": "user", "content": question}]
    max_iterations = 10

    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            system=(
                "You are a research assistant. When asked to compare or "
                "research multiple topics, search for each one in parallel. "
                "When asked to fetch and summarize a page, do it sequentially."
            ),
            tools=active_tools,
            messages=messages,
        )

        tool_uses = [b for b in response.content if b.type == "tool_use"]

        if response.stop_reason == "end_turn" or not tool_uses:
            final_text = "\n".join(
                b.text for b in response.content if b.type == "text"
            )
            if verbose:
                print(f"\n✓ Agent finished in {iteration + 1} iteration(s)")
            return final_text

        # Show what Claude requested
        if verbose:
            mode = "PARALLEL" if len(tool_uses) > 1 else "SEQUENTIAL"
            print(f"\n  Iteration {iteration + 1} [{mode}]:")
            for tu in tool_uses:
                print(f"    → {tu.name}({json.dumps(tu.input)[:80]}...)")

        # Execute tools — parallel when multiple
        if len(tool_uses) > 1:
            tool_results = []
            with ThreadPoolExecutor(max_workers=len(tool_uses)) as pool:
                futures = {
                    pool.submit(execute_tool, tu.name, tu.input): tu.id
                    for tu in tool_uses
                }
                for future in as_completed(futures):
                    tid = futures[future]
                    result_json, is_err = future.result()
                    if verbose and is_err:
                        print(f"    ✗ Error for {tid}: {result_json[:60]}")
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tid,
                        "content": result_json,
                        **({"is_error": True} if is_err else {}),
                    })
        else:
            tu = tool_uses[0]
            result_json, is_err = execute_tool(tu.name, tu.input)
            if verbose and is_err:
                print(f"    ✗ Error: {result_json[:60]}")
            tool_results = [{
                "type": "tool_result",
                "tool_use_id": tu.id,
                "content": result_json,
                **({"is_error": True} if is_err else {}),
            }]

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached."

# ── Test Scenarios ───────────────────────────────────────────
if __name__ == "__main__":
    # Test 1: Parallel search (Claude should call web_search multiple times)
    print("\n" + "▶ TEST 1: PARALLEL SEARCH ".ljust(60, "─"))
    result1 = run_agent(
        "Search for information about these 3 topics: AI agents, "
        "prompt engineering, and tool use patterns.",
        tool_tags=["research"]
    )
    print(f"\nResult preview: {result1[:200]}...")

    # Test 2: Sequential chain (search → fetch → summarize)
    print("\n" + "▶ TEST 2: SEQUENTIAL CHAIN ".ljust(60, "─"))
    result2 = run_agent(
        "Search for 'Claude AI tool use', then fetch the first "
        "result page and summarize its content.",
        tool_tags=["research"]
    )
    print(f"\nResult preview: {result2[:200]}...")

    # Test 3: Error recovery (fetch_page will 404 on broken URL)
    print("\n" + "▶ TEST 3: ERROR RECOVERY ".ljust(60, "─"))
    result3 = run_agent(
        "Fetch and summarize this page: https://broken.example.com/404",
        tool_tags=["research"]
    )
    print(f"\nResult preview: {result3[:200]}...")

    # Test 4: Dynamic tool filtering (citation tools only)
    print("\n" + "▶ TEST 4: DYNAMIC TOOL FILTERING ".ljust(60, "─"))
    result4 = run_agent(
        "Format a citation for an article titled 'Multi-Tool AI Agents' "
        "from https://example.com/agents, accessed today.",
        tool_tags=["citation"]
    )
    print(f"\nResult preview: {result4[:200]}...")

Run the full agent:

Command
python multi_tool_agent.py
Expected Output (abbreviated)
✓ Tools, dispatcher, and registry ready.
  All tools: ['web_search', 'fetch_page', 'summarize_text', 'format_citation', 'save_to_file']
  Research only: ['web_search', 'fetch_page', 'summarize_text']

▶ TEST 1: PARALLEL SEARCH ──────────────────────────────────

============================================================
Question: Search for information about these 3 topics: AI agents, ...
Active tools: ['web_search', 'fetch_page', 'summarize_text']
============================================================

  Iteration 1 [PARALLEL]:
    → web_search({"query": "AI agents"}...)
    → web_search({"query": "prompt engineering"}...)
    → web_search({"query": "tool use patterns"}...)

✓ Agent finished in 2 iteration(s)

Result preview: Here's what I found about each topic...

▶ TEST 2: SEQUENTIAL CHAIN ─────────────────────────────────
...
  Iteration 1 [SEQUENTIAL]:
    → web_search({"query": "Claude AI tool use"}...)
  Iteration 2 [SEQUENTIAL]:
    → fetch_page({"url": "https://example.com/1"}...)
  Iteration 3 [SEQUENTIAL]:
    → summarize_text({"text": "Full page content from..."}...)

✓ Agent finished in 4 iteration(s)

▶ TEST 3: ERROR RECOVERY ───────────────────────────────────
...
  Iteration 1 [SEQUENTIAL]:
    → fetch_page({"url": "https://broken.example.com/404"}...)
    ✗ Error: {"error": "404 Not Found: https://broken.example.com/404"...

✓ Agent finished in 2 iteration(s)

Result preview: I wasn't able to fetch that page, it returned a 404 error...

▶ TEST 4: DYNAMIC TOOL FILTERING ───────────────────────────
...
Active tools: ['format_citation']
...
✅ Checkpoint

Look for these key behaviors in your output:

  • Test 1: Should show [PARALLEL] with 3 web_search calls in one iteration
  • Test 2: Should show [SEQUENTIAL] across 3–4 iterations, each building on the previous result
  • Test 3: Should show ✗ Error followed by Claude adapting (trying a different approach or informing the user)
  • Test 4: Should show Active tools: ['format_citation'] — only 1 tool instead of 5
Troubleshooting
  • Agent runs forever / hits max iterations → The max_iterations = 10 safety limit will stop it. If Claude keeps calling tools without converging, make the prompt more specific.
  • AuthenticationError → Check your ANTHROPIC_API_KEY is set correctly. Run echo $ANTHROPIC_API_KEY to verify.
  • Test 1 shows SEQUENTIAL instead of PARALLEL → Claude doesn't always parallelize. Try rephrasing: "Search for these 3 topics simultaneously: ..." The system prompt also encourages parallel behavior.
  • APIError: 529 (overloaded) → Wait 30 seconds and try again. Consider running tests one at a time by commenting out the others.
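
Instead of waiting and rerunning by hand, transient failures such as an overloaded response can be retried with exponential backoff. The helper below is a sketch under stated assumptions: call_with_backoff, its parameters, and the flaky function are hypothetical names introduced here, and the retryable exception types are placeholders you would replace with whichever errors you consider transient.

```python
import random
import time

def call_with_backoff(fn, max_attempts: int = 4, base_delay: float = 0.5,
                      retryable=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt (plus jitter), then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)

# Usage with a hypothetical flaky function that succeeds on its third call.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporarily overloaded")
    return "ok"

waits = []
# Record the delays instead of actually sleeping, so the demo runs instantly.
result = call_with_backoff(flaky, sleep=waits.append)
print(result, len(waits))  # ok 2
```

The small random jitter keeps many clients from retrying at the same instant; the injected sleep parameter is there only to make the helper easy to test.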

Verify Everything Works

Run the complete file end-to-end. All 4 tests should complete without crashing, demonstrating parallel execution, sequential chaining, error recovery, and dynamic tool filtering:

Command
python multi_tool_agent.py

If all 4 tests complete and you see the ✓ Agent finished message for each, you've successfully built a multi-tool orchestration agent with parallel execution, error handling, and dynamic tool registration.

🎉 Congratulations

You've built a production-pattern multi-tool agent! You can extend this by swapping mock implementations for real APIs (e.g., use the requests library in fetch_page), adding a circuit breaker counter that disables tools after 3 consecutive failures, or implementing execution timing to compare parallel vs sequential wall-clock times.

Stretch Goals (Optional)
  • Add execution timing to each tool call and print a trace waterfall showing parallel vs sequential sections
  • Implement a circuit breaker class that disables a tool after 3 consecutive failures
  • Add a cost tracker that estimates token usage per iteration based on message length
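
As a starting point for the circuit breaker stretch goal, here is one possible shape, offered as a sketch and not as a reference solution. The CircuitBreaker class and its method names are hypothetical; the threshold of 3 consecutive failures comes from the stretch goal itself.

```python
class CircuitBreaker:
    """Hypothetical breaker: opens a tool's circuit after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self._failures: dict[str, int] = {}

    def record(self, tool_name: str, is_error: bool) -> None:
        if is_error:
            self._failures[tool_name] = self._failures.get(tool_name, 0) + 1
        else:
            self._failures[tool_name] = 0  # any success resets the streak

    def is_open(self, tool_name: str) -> bool:
        return self._failures.get(tool_name, 0) >= self.threshold

    def filter_tools(self, tools: list[dict]) -> list[dict]:
        """Drop tools whose circuit is open before the next API call."""
        return [t for t in tools if not self.is_open(t["name"])]

breaker = CircuitBreaker(threshold=3)

# fetch_page: two failures, one success (reset), then three failures in a row.
for outcome in (True, True, False, True, True, True):
    breaker.record("fetch_page", is_error=outcome)
breaker.record("web_search", is_error=False)

remaining = breaker.filter_tools([{"name": "web_search"}, {"name": "fetch_page"}])
print([t["name"] for t in remaining])  # ['web_search']
```

In the agent loop you would call record after each execute_tool result and pass filter_tools(active_tools) to the next request; this sketch never closes the circuit again, which a fuller version would handle with a cooldown.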

Knowledge Check

Q1: Given three tool calls where B needs A's result, but C is independent of both, what's the optimal execution strategy?

A Run all three sequentially: A → B → C
B Run A and C in parallel, then B after A completes
C Run all three in parallel
D Run C first, then A, then B
Answer: B. A and C are independent so they run in parallel. B depends on A's result, so it must wait. This gives maximum parallelism while respecting data dependencies.
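
The schedule from Q1 can be sketched with a thread pool. The three functions below are hypothetical stand-ins that each sleep for about 0.2 seconds. In a real agent, Claude would typically request A and C in one response and B in a later one, so this shows only the scheduling idea, not the API exchange.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def tool_a() -> str:
    time.sleep(0.2)
    return "A-result"

def tool_b(a_result: str) -> str:
    time.sleep(0.2)
    return f"B({a_result})"

def tool_c() -> str:
    time.sleep(0.2)
    return "C-result"

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:
    # A and C have no dependencies, so both start immediately.
    future_a = pool.submit(tool_a)
    future_c = pool.submit(tool_c)
    # B waits only for A; C keeps running on the other worker meanwhile.
    future_b = pool.submit(tool_b, future_a.result())
    results = {
        "A": future_a.result(),
        "B": future_b.result(),
        "C": future_c.result(),
    }
elapsed = time.perf_counter() - start

print(results, f"{elapsed:.2f}s")
```

The run takes roughly two tool durations (A, then B) instead of three, because C overlaps with A.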

Q2: Rank these tool descriptions from LEAST to MOST effective: (1) "queries data"  |  (2) "Run a SQL query against the users DB. Returns matching rows. Use when asking about user accounts."  |  (3) "database tool"

A 3, 2, 1
B 1, 3, 2
C 3, 1, 2 (worst to best)
D 2, 1, 3
Answer: C. "database tool" is worst (no verb, no context). "queries data" is slightly better (has a verb). The detailed description is best: it specifies what, how, returns, and when.

Q3: A tool fails with a network timeout. What's the BEST way to report this to Claude? (Recall from M05: Claude doesn't execute tools — you do.)

A Throw an exception and crash the agent loop
B Return a tool_result with is_error: true and a descriptive message
C Return an empty tool_result
D Silently retry 100 times
Answer: B. Returning is_error: true lets Claude reason about alternatives. This extends the M05 pattern of returning errors as tool_result, now with the explicit is_error flag.

Q4: You have 20 tools whose definitions average 400 input tokens each. Filtering to 5 tools per request saves approximately how many input tokens per request?

A 400 tokens
B 2,000 tokens
C 4,000 tokens
D 6,000 tokens
Answer: D. All 20 tools cost 20 × 400 = 8,000 tokens. Five tools cost 5 × 400 = 2,000 tokens. Savings: 8,000 − 2,000 = 6,000 tokens per request.

Q5: In a 3-tool sequential chain (search → fetch → summarize), how many API round trips does the agent make in total, counting the one that returns Claude's final text response?

A 1 round trip
B 3 round trips
C 4 round trips (3 tools + final response)
D 6 round trips
Answer: C. Round trip 1: Claude requests search. Round trip 2: Claude requests fetch. Round trip 3: Claude requests summarize. Round trip 4: Claude produces the final text (stop_reason: "end_turn"). Total: 4.
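
The count in Q5 can be checked with a small simulation. The scripted responses below are a hypothetical stand-in for the API, so no network call is made; only the stop_reason values mirror what the agentic loop inspects.

```python
# Each entry is what one API call would return in this scripted scenario.
SCRIPTED_RESPONSES = [
    {"stop_reason": "tool_use", "tool": "search"},
    {"stop_reason": "tool_use", "tool": "fetch"},
    {"stop_reason": "tool_use", "tool": "summarize"},
    {"stop_reason": "end_turn", "text": "Here is the summary."},
]


def fake_api_call(turn: int) -> dict:
    """Stand-in for one request/response cycle with the model."""
    return SCRIPTED_RESPONSES[turn]


def count_round_trips() -> int:
    round_trips = 0
    while True:
        response = fake_api_call(round_trips)
        round_trips += 1
        if response["stop_reason"] == "end_turn":
            return round_trips
        # A real loop would execute response["tool"] here and append
        # its tool_result to the message history before the next call.


total = count_round_trips()
```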

Q6: Claude returns 3 tool_use blocks in one response. How should you return the results? (Recall from M05: the tool_use_id links each result to its request.)

A All 3 tool_result blocks in a single user message
B Each in a separate user message (3 messages)
C Concatenate all results into one tool_result
D Return only the first, discard the others
Answer: A. All tool_result blocks go in a single user message, each referencing its corresponding tool_use_id. This is the API protocol for parallel tool results.
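
A sketch of that bundling step is shown below. Plain dicts stand in for the tool_use content blocks, and the helper name, handler functions, and IDs are assumptions made for illustration.

```python
from typing import Callable


def build_tool_results_message(tool_use_blocks: list[dict],
                               handlers: dict[str, Callable[..., object]]) -> dict:
    """Run every requested tool and bundle the results into one user message."""
    results = []
    for block in tool_use_blocks:
        output = handlers[block["name"]](**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # links this result to its request
            "content": str(output),
        })
    return {"role": "user", "content": results}


# Hypothetical tool_use blocks, as if taken from one assistant response.
tool_use_blocks = [
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather", "input": {"city": "Oslo"}},
    {"type": "tool_use", "id": "toolu_2", "name": "get_time", "input": {"city": "Oslo"}},
    {"type": "tool_use", "id": "toolu_3", "name": "add", "input": {"a": 2, "b": 3}},
]
handlers = {
    "get_weather": lambda city: f"Sunny in {city}",
    "get_time": lambda city: f"12:00 in {city}",
    "add": lambda a, b: a + b,
}

message = build_tool_results_message(tool_use_blocks, handlers)
```

This version runs the tools one after another to keep the message shape in focus; the same function could execute them concurrently, as long as all three results still end up in the single user message.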

Module Summary

Key Concepts Recap

  • Parallel tool calls: Multiple independent tools in one response. Execute concurrently, return all results in one message.
  • Sequential chains: Output of one feeds the next. Each step is a full API round trip via the agentic loop.
  • Tool selection: Description quality strongly influences which tool Claude picks. Include what the tool does, when to use it, and what it returns.
  • Dynamic registration: Filter tools by context to save tokens, improve accuracy, and enforce least privilege.
  • Error handling: Return is_error: true. Let Claude adapt. Use circuit breakers for persistent failures.

Next: M07 — Model Context Protocol (MCP)

You've been defining tools manually in code. MCP standardizes how tools are discovered, described, and invoked across any client and server. You'll connect to external MCP servers and expose your own tools as MCP services.

References & Resources