M12: The ReAct Pattern — Reason, Act, Observe | Building AI Agents with Claude

Learning Objectives

By the end of this module you will be able to:

Explain the difference between a chatbot and an agent, and when each is appropriate
Describe the four phases of the ReAct loop (Reason → Act → Observe → Repeat) and what happens in each
Implement a working ReAct agent using Claude's Messages API and tool-use loop
Use thought traces to improve reasoning quality and debuggability
Define correct stop conditions — without relying on arbitrary iteration caps
Compare the manual implementation to the Agent SDK approach side-by-side

Chatbot vs. Agent — The Fundamental Difference

💡 Everyday Analogy

Before agents: calling a help desk where the operator can only answer questions from a script. You ask "is my order shipped?" and they look up one field and read it back. One question in, one answer out, every time.

The pain: If your order is late and needs rerouting through a different carrier and the new tracking number needs emailing to you — the operator says "I can't help with that, please call three separate departments." Each department has their own single-answer script.

The mapping: A chatbot is that operator — one message in, one response out, no memory between calls, no ability to take actions. An agent is a project manager who gets your request, decides what steps to take, calls the right departments in sequence, waits for results, and comes back with a complete solution. That autonomy to loop through multiple steps is exactly what ReAct gives Claude.

INTERACTION PATTERN: CHATBOT vs. AGENT

CHATBOT

👤 "Summarize this doc"

🤖 "Here's a summary: ..."

1 step. Done.

AGENT (ReAct)

👤 "Research & compare LLMs"

💭 Thought: I need to search for current LLM benchmarks

🔧 Act: web_search("LLM leaderboard 2025")

👁 Observe: [search results returned]

💭 Thought: I need pricing data too

🔧 Act: web_search("LLM pricing 2025")

🤖 "Here's a complete comparison with pricing..."

🤖 Chatbot

One message → one response
No tools, no actions
Can only use training knowledge
Done in a single API call
Great for: Q&A, summaries, generation

⚙️ Agent (ReAct)

Goal → multiple reasoning/action cycles
Uses real tools (search, code, APIs)
Observes results and adapts its plan
Runs for as many turns as needed
Great for: research, automation, multi-step workflows

        

What Is ReAct?

📐 Technical Definition

ReAct (Reasoning + Acting) is a prompting and execution pattern where the model alternates between producing explicit reasoning steps ("Thought: ...") and taking tool actions ("Act: ..."), observing the results of those actions, and repeating until the task is complete.

The name comes from the 2022 academic paper "ReAct: Synergizing Reasoning and Acting in Language Models" by Yao et al. The key insight: models that interleave explicit reasoning with actions tend to make fewer errors than models that only reason or only call tools silently. The thought step is not decoration; it is the mechanism that improves quality.

Before ReAct, there were two extreme approaches to using language models: pure chain-of-thought (all reasoning, no external tools) or pure tool use (call tools without reasoning about which ones or why). Both had obvious failure modes. Chain-of-thought hallucinated facts because it had no way to verify. Silent tool use called the wrong tool because it had no reasoning step to narrow down the right action.

ReAct fuses both. The "Thought:" step is where Claude plans, notices gaps, and decides what action would fill them. The "Act:" step is the actual tool call. The "Observe:" step is reading the tool result. Then the loop continues with another thought — informed by what was just learned — until Claude has enough information to give a final answer.

✅ Why It Matters

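The original ReAct paper reported that on two interactive decision-making benchmarks, ALFWorld and WebShop, ReAct beat imitation and reinforcement learning baselines by absolute success-rate margins of 34% and 10% respectively. On knowledge-intensive tasks such as HotpotQA and FEVER, grounding the reasoning in retrieved facts reduced the hallucination seen with chain-of-thought alone. The cost: a few hundred tokens per turn for the thought traces. The benefit: fewer dead-end tool calls, fewer hallucinated facts, and traces you can actually debug. For production agents handling real tasks, that reliability delta is the difference between something you can ship and something you can only demo.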

The ReAct Loop: Reason → Act → Observe → Repeat

The loop has four phases. Let's walk through each one carefully, because understanding what happens at each step is what lets you debug, optimize, and extend ReAct agents.

THE REACT LOOP

💭 Reason

Plan what to do next

🔧 Act

Call a tool

👁 Observe

Read tool result

🔁 Repeat

Or finish


Phase 1 — Reason

This is where Claude thinks out loud. Before calling any tool, it produces a "Thought:" statement describing what it knows, what's missing, and what action would be most useful next. Think of it as Claude's internal monologue — except you make it external and visible in the conversation history.

Why force the reasoning to be explicit? Because when the thought is written out, it becomes part of the conversation context. That means subsequent reasoning steps can reference earlier thoughts. The model is essentially leaving notes for itself. Without those notes, the history holds only tool calls and raw results, with no record of why each call was made, so on complex multi-step tasks the model is more likely to drift away from what it was trying to accomplish.

Phase 2 — Act

The Act phase is a Claude tool call — one of the tool_use blocks you met in M05. Claude picks a tool from the available set (web search, code execution, database query, whatever you've given it) and calls it with specific parameters. This is the moment Claude interacts with the real world: it reads a file, fetches a URL, runs code, queries an API. Anything can go wrong here (network errors, empty results, access denied), so your agent infrastructure must handle errors gracefully and return them to Claude as structured tool results.
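One way to keep tool failures inside the loop is to catch every exception in the dispatcher and report it through the `is_error` field that the Messages API accepts on `tool_result` blocks. This is a minimal sketch: `safe_tool_result`, `flaky_search`, and the registry dictionary are hypothetical names used for illustration, not part of any SDK.

```python
from typing import Any, Callable


def safe_tool_result(tool_use_id: str, name: str, tool_input: dict[str, Any],
                     registry: dict[str, Callable[..., str]]) -> dict[str, Any]:
    """Run one requested tool and always return a tool_result block; never raise."""
    try:
        if name not in registry:
            raise KeyError(f"Unknown tool: {name}")
        content = registry[name](**tool_input)
        return {"type": "tool_result", "tool_use_id": tool_use_id, "content": content}
    except Exception as exc:
        # is_error tells Claude the call failed so it can retry or change approach
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"{type(exc).__name__}: {exc}", "is_error": True}


def flaky_search(query: str) -> str:
    """Hypothetical tool that always fails, standing in for a network error."""
    raise TimeoutError("search backend did not respond")


ok = safe_tool_result("toolu_1", "echo", {"text": "hi"}, {"echo": lambda text: text})
bad = safe_tool_result("toolu_2", "web_search", {"query": "x"}, {"web_search": flaky_search})
missing = safe_tool_result("toolu_3", "nope", {}, {})
```

Because every path returns a block with the matching `tool_use_id`, the agent loop can append the result without special-casing failures.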

Phase 3 — Observe

After the tool executes, your code (not Claude — Claude doesn't run the tools, it just requests them) appends the tool result to the conversation history as a tool_result content block. Claude reads this result in the next turn — that reading is the "Observe" phase. This is where Claude updates its understanding: "The search returned three papers, but none of them have the pricing data I need." That updated understanding feeds directly into the next Reason phase.
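The shape of the history after one Act/Observe round can be sketched with plain dictionaries. The ids and strings below are made-up placeholders, and `observe` is a hypothetical helper; the point is that each `tool_result` echoes the `id` of the `tool_use` block it answers and travels back in a `user` message.

```python
def observe(messages: list[dict], assistant_content: list[dict],
            results: dict[str, str]) -> list[dict]:
    """Append the assistant turn, then one user turn answering every tool_use block."""
    messages.append({"role": "assistant", "content": assistant_content})
    tool_results = [
        {"type": "tool_result", "tool_use_id": block["id"], "content": results[block["id"]]}
        for block in assistant_content
        if block["type"] == "tool_use"
    ]
    if tool_results:
        messages.append({"role": "user", "content": tool_results})
    return messages


history = [{"role": "user", "content": "Compare LLM pricing"}]
assistant_turn = [
    {"type": "text", "text": "Thought: I need current pricing data."},
    {"type": "tool_use", "id": "toolu_01", "name": "web_search",
     "input": {"query": "LLM pricing"}},
]
observe(history, assistant_turn, {"toolu_01": "three pricing pages found"})
roles = [m["role"] for m in history]  # ['user', 'assistant', 'user']
```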

Phase 4 — Repeat or Stop

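After observing, Claude has a choice: call another tool (continue the loop) or produce a final answer (stop). The correct way to detect which it chose is by checking stop_reason in the API response. If stop_reason == "tool_use", Claude requested more tools, so keep going. If stop_reason == "end_turn", Claude is done and produced a final text response. Other values can occur as well (for example "max_tokens" when the response is cut off by the token limit); treat those as an explicit early exit rather than as either of the two cases above.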

🎓 Cert Tip — Domain 1.1

The correct way to determine loop termination is checking stop_reason: 'tool_use' means continue, 'end_turn' means done. CRITICAL anti-pattern: parsing Claude's natural language response to see if it says "I'm done" or "task complete." Text parsing is fragile — Claude may phrase completion differently every time, and any mismatch causes the loop to hang or terminate early.
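The check can be isolated in a small function so the loop never inspects text. This sketch also routes any value other than the two named above (the Messages API can return `max_tokens`, for instance) to an explicit third outcome. `LoopAction` and `decide` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class LoopAction(Enum):
    CONTINUE = "continue"  # run the requested tools and call the model again
    FINISH = "finish"      # return the final answer
    ABORT = "abort"        # truncated or unexpected stop: surface it to the caller


def decide(stop_reason: Optional[str]) -> LoopAction:
    """Map the API's stop_reason to what the loop should do next."""
    if stop_reason == "tool_use":
        return LoopAction.CONTINUE
    if stop_reason == "end_turn":
        return LoopAction.FINISH
    return LoopAction.ABORT
```

Keeping the mapping in one place means a new stop reason shows up as an `ABORT` you can log, instead of silently looping.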

Implementing ReAct with Claude's Tool Use

ReAct is not a library or framework — it's a pattern you implement directly using the Messages API you already know from M05 and M06. The key is the agentic loop: a while loop that keeps calling client.messages.create() until stop_reason == "end_turn", with each iteration appending both Claude's tool requests and your tool results to the running conversation history.

📐 The Agentic Loop Pattern

The loop works like this: send the user's request to Claude → check if Claude wants to use tools → if yes, execute those tools and append results → loop back to Claude with the updated history → if stop_reason == "end_turn", extract and return the final text response.

The conversation history array is the agent's memory — it grows with every turn. Claude's "reasoning" happens because each new call includes all previous thoughts, actions, and observations. The stateless API call becomes stateful because your code maintains the history list.

Chunk 1: Tool Definitions

Before the agent loop, define what tools are available. Each tool needs a name, a clear description (Claude uses this to decide when to call it — bad descriptions mean wrong tool selections), and a JSON schema for inputs.

# Tool definitions — what the agent can do
# The 'description' field is critical: Claude reads it to decide WHEN to call this tool
TOOLS = [
    {
        "name": "web_search",
        "description": "Search the web for current information. Use for: facts that may have changed since training, recent events, prices, statistics. Do NOT use for: well-known stable facts (capitals, basic science), things you already know.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query. Be specific — 'Claude 3.5 Sonnet context window 2025' is better than 'Claude context window'."
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "read_url",
        "description": "Fetch and return the text content of a web page. Use after web_search when you need the full article, not just the snippet.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The full URL to fetch (must start with https://)"}
            },
            "required": ["url"]
        }
    }
]

import Anthropic from '@anthropic-ai/sdk';

// Tool definitions — what the agent can do
const TOOLS: Anthropic.Tool[] = [
  {
    name: "web_search",
    description: "Search the web for current information. Use for: facts that may have changed since training, recent events, prices, statistics. Do NOT use for: well-known stable facts, things you already know.",
    input_schema: {
      type: "object",
      properties: {
        query: {
          type: "string",
          description: "The search query. Be specific."
        }
      },
      required: ["query"]
    }
  },
  {
    name: "read_url",
    description: "Fetch and return the text content of a web page. Use after web_search when you need the full article.",
    input_schema: {
      type: "object",
      properties: {
        url: { type: "string", description: "The full URL to fetch (must start with https://)" }
      },
      required: ["url"]
    }
  }
];
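Because tool selection depends on these definitions, a quick structural check before the first API call can catch mistakes early. The sketch below is a hypothetical helper, not an SDK feature. The 20-character description threshold is an arbitrary assumption, and the name pattern is assumed from the API's documented constraint on tool names.

```python
import re

# Assumed constraint on tool names: 1-64 letters, digits, underscores, or hyphens
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")


def lint_tool(tool: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the tool looks sane."""
    problems = []
    if not NAME_PATTERN.match(tool.get("name", "")):
        problems.append("name must be 1-64 chars of letters, digits, '_' or '-'")
    if len(tool.get("description", "")) < 20:
        problems.append("description is too short to guide tool selection")
    schema = tool.get("input_schema", {})
    if schema.get("type") != "object":
        problems.append("input_schema.type must be 'object'")
    missing = set(schema.get("required", [])) - set(schema.get("properties", {}))
    if missing:
        problems.append(f"required fields not in properties: {sorted(missing)}")
    return problems


good = {
    "name": "web_search",
    "description": "Search the web for current information.",
    "input_schema": {"type": "object",
                     "properties": {"query": {"type": "string"}},
                     "required": ["query"]},
}
bad = {
    "name": "web search",  # space is not allowed
    "description": "search",
    "input_schema": {"type": "object", "properties": {}, "required": ["query"]},
}
good_problems = lint_tool(good)
bad_problems = lint_tool(bad)
```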

Chunk 2: The Agent Loop

This is the heart of ReAct. Notice three things: (1) the loop exits when stop_reason is "end_turn"; the turn counter in the while condition is only a safety cap; (2) every tool call and its result is appended to messages before looping; (3) the system prompt explicitly instructs Claude to reason before acting.

import anthropic
import json

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from environment

SYSTEM_PROMPT = """You are a research assistant with access to web search tools.

Before calling ANY tool, write a Thought: explaining:
1. What you currently know about the question
2. What specific information you still need
3. Why this particular tool call will fill that gap

After receiving a tool result, write another Thought: summarizing what you learned
and whether you need more information or can now answer the question.

This reasoning process makes your answers more accurate and your actions more focused."""


def run_react_agent(user_question: str, max_turns: int = 20) -> str:
    """
    Run the ReAct loop until Claude finishes (stop_reason == 'end_turn').
    max_turns is a SAFETY NET — the agent terminates via stop_reason, not via this cap.
    """
    messages = [{"role": "user", "content": user_question}]
    turn_count = 0

    while turn_count < max_turns:
        turn_count += 1

        # --- REASON + ACT phase ---
        # Claude produces a thought, then optionally requests a tool call
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            system=SYSTEM_PROMPT,
            tools=TOOLS,
            messages=messages
        )

        # Append Claude's full response to history (preserves thought traces + tool calls)
        messages.append({"role": "assistant", "content": response.content})

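        # --- STOP CHECK ---
        # 'end_turn' = Claude is done and produced a final answer
        # 'tool_use' = Claude wants to call one or more tools
        if response.stop_reason == "end_turn":
            # Join every text block so a multi-block final answer is not truncated
            texts = [block.text for block in response.content if block.type == "text"]
            return "\n".join(texts) or "Agent finished without a text response."
        if response.stop_reason != "tool_use":
            # e.g. 'max_tokens': the reply was cut off, so there is nothing to execute
            return f"Agent stopped early (stop_reason: {response.stop_reason})."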

        # --- OBSERVE phase ---
        # Find all tool_use blocks Claude requested and execute them
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })

        # Append all tool results so Claude can observe them in the next turn
        if tool_results:
            messages.append({"role": "user", "content": tool_results})

    return f"Safety limit reached after {max_turns} turns."


def execute_tool(name: str, inputs: dict) -> str:
    """Dispatch tool calls to their implementations."""
    try:
        if name == "web_search":
            return web_search(inputs["query"])
        elif name == "read_url":
            return read_url(inputs["url"])
        else:
            return json.dumps({"error": f"Unknown tool: {name}", "isError": True})
    except Exception as e:
        # Always return structured errors — never raise. The agent decides what to do next.
        return json.dumps({"error": str(e), "tool": name, "isError": True})

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from environment

const SYSTEM_PROMPT = `You are a research assistant with access to web search tools.

Before calling ANY tool, write a Thought: explaining:
1. What you currently know about the question
2. What specific information you still need
3. Why this particular tool call will fill that gap

After receiving a tool result, write another Thought: summarizing what you learned.`;

async function runReactAgent(userQuestion: string, maxTurns = 20): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userQuestion }
  ];
  let turnCount = 0;

  while (turnCount < maxTurns) {
    turnCount++;

    // REASON + ACT: Claude produces thought + optional tool call
    const response = await client.messages.create({
      model: "claude-opus-4-7",
      max_tokens: 4096,
      system: SYSTEM_PROMPT,
      tools: TOOLS,
      messages,
    });

    // Append Claude's response to history (preserves thoughts and tool calls)
    messages.push({ role: "assistant", content: response.content });

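    // STOP CHECK: 'end_turn' means done; 'tool_use' means continue
    if (response.stop_reason === "end_turn") {
      const textBlock = response.content.find((b): b is Anthropic.TextBlock => b.type === "text");
      return textBlock?.text ?? "Agent finished without a text response.";
    }
    if (response.stop_reason !== "tool_use") {
      // e.g. 'max_tokens': the reply was cut off, so there is nothing to execute
      return `Agent stopped early (stop_reason: ${response.stop_reason}).`;
    }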

    // OBSERVE: execute tool calls, collect results
    const toolResults: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        const result = await executeTool(block.name, block.input as Record<string, unknown>);
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: result,
        });
      }
    }

    if (toolResults.length > 0) {
      messages.push({ role: "user", content: toolResults });
    }
  }
  return `Safety limit reached after ${maxTurns} turns.`;
}

✅ What Just Happened? You just implemented the full ReAct loop. The while loop runs until Claude says stop_reason == "end_turn". Each iteration: Claude reasons + requests tools → you execute them → you append results → repeat. The messages list is the shared memory that makes all the "reasoning" work — every thought, action, and observation accumulates there.
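The loop can be exercised without network access by handing it a stand-in for the client. The sketch below re-implements the loop in compact form against a scripted fake. `FakeMessages`, `Block`, `Response`, and `run_loop` are hypothetical test doubles that only mimic the attributes the loop reads; they are not SDK classes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Block:
    type: str
    text: str = ""
    id: str = ""
    name: str = ""
    input: dict[str, Any] = field(default_factory=dict)


@dataclass
class Response:
    content: list[Block]
    stop_reason: str


class FakeMessages:
    """Returns pre-scripted responses in order and records the history at each call."""

    def __init__(self, script: list[Response]):
        self.script = list(script)
        self.calls: list[list[dict]] = []

    def create(self, *, messages: list[dict], **_: Any) -> Response:
        self.calls.append(list(messages))  # snapshot, since the caller mutates the list
        return self.script.pop(0)


def run_loop(api: FakeMessages, question: str, max_turns: int = 5) -> str:
    messages: list[dict] = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        response = api.create(messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            return "".join(b.text for b in response.content if b.type == "text")
        results = [{"type": "tool_result", "tool_use_id": b.id,
                    "content": f"stub result for {b.input}"}
                   for b in response.content if b.type == "tool_use"]
        messages.append({"role": "user", "content": results})
    return "cap reached"


script = [
    Response([Block("text", text="Thought: search first."),
              Block("tool_use", id="toolu_1", name="web_search", input={"query": "q"})],
             "tool_use"),
    Response([Block("text", text="Final answer.")], "end_turn"),
]
api = FakeMessages(script)
answer = run_loop(api, "question")
```

The recorded snapshots make the growth of the history visible: the first call sees one message, the second sees three (question, assistant turn, tool results).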

Thought Traces: The Agent's Working Memory

Thought traces are the "Thought: ..." paragraphs Claude writes before each tool call. They look optional but they aren't — they're the mechanism that makes multi-step reasoning coherent. Here's why they matter concretely: imagine Claude is researching a question that requires three tool calls. Without thought traces, each call happens in a kind of cognitive vacuum. With thought traces, the second call starts with Claude reading its own first thought, the tool result, and its second thought — all of which anchor it to the original intent.

💡 Everyday Analogy

Before thought traces: imagine a researcher who opens a paper, reads a paragraph, closes the paper, opens a second paper, reads a paragraph, closes it — with no notes, no annotations, no connection between what they've read. By the time they finish the third paper, they've forgotten what the first one said.

The pain: They can't synthesize across sources because nothing they read got connected to anything else. The final report is either a copy of the last thing they read or a collection of disconnected fragments.

The mapping: Thought traces are the researcher's notebook. "After reading paper 1, my current understanding is X and I still need to find Y." That note travels into every subsequent step. The agent's final answer can synthesize because the notes are always in context.

THOUGHT TRACES — REASONING ACROSS TURNS

TURN 1

💭 Thought: The user wants current LLM benchmark data. I know general rankings but pricing may have changed. Let me search for the latest leaderboard first.

🔧 Act: web_search("LLM leaderboard benchmark scores 2025")

TURN 2 — AFTER OBSERVING SEARCH RESULTS

👁 Observe: Results show GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro scores. Missing: pricing, latency data.

💭 Thought: I have performance scores now. My original goal was a comparison, so I also need pricing. Let me get that next.

🔧 Act: web_search("Claude GPT-4o Gemini API pricing per million tokens 2025")

TURN 3 — FINAL ANSWER

👁 Observe: Pricing data retrieved for all three models.

🤖 end_turn: "Here's a complete comparison of performance and pricing: Claude 3.5 Sonnet leads on coding tasks at $3/$15 per 1M tokens, GPT-4o leads on multilingual at $5/$15, Gemini 1.5 Pro offers largest context at $3.50/$10.50..."

✅ Practical Impact of Thought Traces

Debugging: When an agent produces a wrong answer, the thought traces tell you exactly where it went wrong — was it the wrong tool choice? Wrong search query? Misinterpreted result? Without traces, you have inputs and outputs with a black box between them.

Logging: Thought traces are your audit trail. Every tool call has a "why" recorded immediately before it. This matters in production when someone asks "why did the agent delete that file?" — you can show them the thought that preceded the action.
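An audit trail of this kind can be derived from the message history after the run. This is a minimal sketch over dictionary-shaped content blocks. `audit_trail` is a hypothetical helper, and it assumes each thought is the text block most recently seen before a tool call in the same assistant turn.

```python
def audit_trail(messages: list[dict]) -> list[dict]:
    """Pair each tool call with the most recent thought in the same assistant turn."""
    trail = []
    for message in messages:
        if message["role"] != "assistant" or isinstance(message["content"], str):
            continue
        last_thought = None
        for block in message["content"]:
            if block["type"] == "text":
                last_thought = block["text"]
            elif block["type"] == "tool_use":
                trail.append({"why": last_thought,
                              "tool": block["name"],
                              "input": block["input"]})
    return trail


history = [
    {"role": "user", "content": "Compare LLM pricing"},
    {"role": "assistant", "content": [
        {"type": "text", "text": "Thought: I need pricing data."},
        {"type": "tool_use", "id": "toolu_1", "name": "web_search",
         "input": {"query": "LLM pricing"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_1", "content": "pricing table"},
    ]},
]
trail = audit_trail(history)
```

A tool call with `why` equal to `None` is itself a useful signal: the model acted without writing a thought first.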

Stop Conditions: When Should the Agent Stop?

This is where many ReAct implementations go wrong. There are two common anti-patterns and one correct pattern.

⚠️ Anti-pattern #1: Parsing the Text Response

Do NOT check if Claude's text contains phrases like "I'm done," "task complete," or "I have everything I need." This fails because: (1) Claude may phrase completion differently on every run; (2) those phrases may appear inside a thought trace before Claude actually finishes; (3) any phrasing mismatch causes the loop to run forever or stop too early.
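Both failure modes are easy to reproduce. In this sketch a naive phrase check (a deliberately bad, hypothetical `looks_done`) fires on a thought trace that precedes a tool call, and misses a final answer that is phrased differently. The two sample turns are invented for illustration.

```python
DONE_PHRASES = ("i'm done", "task complete", "i have everything i need")


def looks_done(text: str) -> bool:
    """Anti-pattern: guess completion from wording."""
    return any(phrase in text.lower() for phrase in DONE_PHRASES)


# Mid-task turn: the thought mentions completion, but a tool call follows
mid_task = {
    "stop_reason": "tool_use",
    "text": "Thought: Once pricing is confirmed I have everything I need. Let me verify it.",
}
# Final turn: Claude is finished but uses none of the expected phrases
final = {"stop_reason": "end_turn", "text": "All set, here is the report."}

false_positive = looks_done(mid_task["text"])   # True: loop would stop too early
false_negative = looks_done(final["text"])      # False: loop would keep running
correct_mid = mid_task["stop_reason"] == "end_turn"   # False: keep going
correct_final = final["stop_reason"] == "end_turn"    # True: stop
```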

⚠️ Anti-pattern #2: Counting Iterations as the Primary Stop Logic

Do NOT use if turn_count >= 10: break as your main stopping mechanism. A cap of 10 might be too few for a complex research task (agent stops before finishing) and too many for a simple lookup (wastes 9 turns). More critically: if the agent hits the cap on a real task, you get a half-finished result with no indication it's incomplete.

✅ Correct Pattern: stop_reason + Safety Cap

The correct primary stop condition is stop_reason == "end_turn" — Claude decides it has enough information and stops calling tools. The iteration cap (max_turns) is a safety net, not a control mechanism. Set it high enough that legitimate tasks finish (20-50 depending on your use case) and handle the cap-exceeded case explicitly (log it, notify the user, do not silently return partial results).
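One way to make the cap-exceeded case impossible to miss is to return a structured outcome rather than a bare string. This is a sketch under assumptions: `AgentResult` and `run_with_cap` are hypothetical names, and `step` stands in for one model call plus tool execution, returning the final answer or `None` if the model asked for more tools.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger("react_agent")


@dataclass
class AgentResult:
    completed: bool        # True only when the model ended the turn itself
    text: Optional[str]
    turns_used: int


def run_with_cap(step: Callable[[int], Optional[str]], max_turns: int = 20) -> AgentResult:
    """Call step(turn) until it returns a final answer or the safety cap trips."""
    for turn in range(1, max_turns + 1):
        answer = step(turn)
        if answer is not None:
            return AgentResult(completed=True, text=answer, turns_used=turn)
    # The cap is the exceptional path: log it and flag the result as incomplete
    logger.warning("Safety cap of %d turns reached without end_turn", max_turns)
    return AgentResult(completed=False, text=None, turns_used=max_turns)


finished = run_with_cap(lambda turn: "report" if turn == 3 else None)
capped = run_with_cap(lambda turn: None, max_turns=4)
```

Callers then branch on `completed` instead of inspecting the text, which mirrors the `stop_reason` discipline inside the loop.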

🎓 Cert Tip — Domain 1.1

maxTurns / iteration caps are a SAFETY NET, not a control mechanism. Anti-pattern: using an arbitrary cap (e.g., 10 iterations) as the primary stopping logic. Let the agent terminate naturally via stop_reason. If you're hitting the cap regularly, your agent design has a problem (missing "enough information" signal, tools that never return complete data, or goals that are too open-ended).

Manual Loop vs. Agent SDK: Side by Side (Tier 2)

M12 is your first Tier 2 module — you've just implemented the ReAct loop by hand, and now you'll see the same logic expressed using the claude-agent-sdk. The goal isn't to replace one with the other — it's to understand what the SDK abstracts away so you can debug it when it breaks.

# Manual ReAct: you own every piece
# Assumes SYSTEM_PROMPT, TOOLS, and execute_tool from the chunks above
import anthropic

client = anthropic.Anthropic()


def manual_react(question: str) -> str:
    messages = [{"role": "user", "content": question}]

    while True:
        resp = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            system=SYSTEM_PROMPT,
            tools=TOOLS,
            messages=messages
        )
        messages.append({"role": "assistant", "content": resp.content})

        if resp.stop_reason == "end_turn":
            # The default avoids StopIteration when the final turn has no text block
            return next((b.text for b in resp.content if b.type == "text"), "")

        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })
        messages.append({"role": "user", "content": tool_results})

# YOU control: history management, tool dispatch,
# error handling, stop logic, logging

// Manual ReAct: you own every piece
// Assumes SYSTEM_PROMPT, TOOLS, and executeTool from the chunks above
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();

async function manualReact(question: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: question }
  ];

  while (true) {
    const resp = await client.messages.create({
      model: "claude-opus-4-7",
      max_tokens: 4096,
      system: SYSTEM_PROMPT,
      tools: TOOLS,
      messages
    });
    messages.push({ role: "assistant", content: resp.content });

    if (resp.stop_reason === "end_turn") {
      const tb = resp.content.find((b): b is Anthropic.TextBlock => b.type === "text");
      return tb?.text ?? "";
    }

    const results: Anthropic.ToolResultBlockParam[] = [];
    for (const b of resp.content) {
      if (b.type === "tool_use") {
        const res = await executeTool(b.name, b.input as Record<string, unknown>);
        results.push({ type: "tool_result", tool_use_id: b.id, content: res });
      }
    }
    messages.push({ role: "user", content: results });
  }
}
// YOU control: everything

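# Agent SDK: the loop is handled for you
# Sketch against the published claude-agent-sdk package; check names against your installed version.
# call_search_api and fetch_url are placeholders for your own implementations.
from claude_agent_sdk import (
    ClaudeAgentOptions,
    ClaudeSDKClient,
    ResultMessage,
    create_sdk_mcp_server,
    tool,
)

# @tool takes the tool name, its description, and an input schema; the handler receives a dict
@tool("web_search", "Search the web for current information.", {"search_query": str})
async def web_search(args: dict) -> dict:
    # Your implementation here (call real search API)
    results = await call_search_api(args["search_query"])
    return {"content": [{"type": "text", "text": results}]}

@tool("read_url", "Fetch and return text content of a web page.", {"url": str})
async def read_url(args: dict) -> dict:
    content = await fetch_url(args["url"])
    return {"content": [{"type": "text", "text": content}]}

# Custom tools reach the agent through an in-process MCP server
server = create_sdk_mcp_server(name="research", version="1.0.0", tools=[web_search, read_url])

options = ClaudeAgentOptions(
    model="claude-opus-4-7",
    system_prompt=SYSTEM_PROMPT,
    mcp_servers={"research": server},
    allowed_tools=["mcp__research__web_search", "mcp__research__read_url"],
    max_turns=20,
)

# The SDK handles: loop, history, tool dispatch, stop logic
async def run(question: str) -> None:
    async with ClaudeSDKClient(options=options) as client:
        await client.query(question)
        async for message in client.receive_response():
            if isinstance(message, ResultMessage):
                print(message.result)  # final text of the run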

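// Agent SDK: the loop is handled for you
// Sketch against the published @anthropic-ai/claude-agent-sdk package; check names against your installed version.
// callSearchApi and fetchUrl are placeholders for your own implementations.
import { query, tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
import { z } from 'zod';

const webSearch = tool(
  "web_search",
  "Search the web for current information.",
  { search_query: z.string().describe("The search query") },
  async ({ search_query }) => {
    const results = await callSearchApi(search_query);
    return { content: [{ type: "text" as const, text: results }] };
  }
);

const readUrl = tool(
  "read_url",
  "Fetch and return text content of a web page.",
  { url: z.string().describe("Full URL to fetch") },
  async ({ url }) => {
    const content = await fetchUrl(url);
    return { content: [{ type: "text" as const, text: content }] };
  }
);

// Custom tools reach the agent through an in-process MCP server
const server = createSdkMcpServer({ name: "research", version: "1.0.0", tools: [webSearch, readUrl] });

// Custom MCP tools expect streaming input, so the prompt is an async generator
async function* prompt() {
  yield { type: "user" as const, message: { role: "user" as const, content: question } };
}

// SDK handles: loop, history, tool dispatch, stop logic
for await (const message of query({
  prompt: prompt(),
  options: {
    model: "claude-opus-4-7",
    systemPrompt: SYSTEM_PROMPT,
    mcpServers: { research: server },
    allowedTools: ["mcp__research__web_search", "mcp__research__read_url"],
    maxTurns: 20,
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result); // final text of the run
  }
}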

✅ What the SDK Does (and Doesn't) Hide

SDK handles: the while loop, appending messages to history, routing tool_use blocks to your decorated functions, collecting tool results, checking stop_reason, surfacing the final text.

You still control: tool implementations (the actual API calls happen in your @tool functions), the system prompt, model selection, max_turns safety cap, and error handling within each tool.

Why learn both: When something goes wrong in the SDK (wrong tool called, loop terminates early, unexpected output) you need to understand the manual loop to know where to look. The SDK is a productivity layer, not a black box you trust blindly.

Hands-On Lab: Build a ReAct Research Agent

What You'll Build

A ReAct research agent that answers multi-step questions by searching the web, reading pages, and synthesizing a structured report

Time · Prerequisites · Files

⏱ 30-45 min · Python 3.10+, ANTHROPIC_API_KEY set
📄 agent.py · tools.py · mock_search.py

Environment Setup

Shell — run once

mkdir react-agent && cd react-agent
python -m venv venv
# Windows:  venv\Scripts\activate
# Mac/Linux: source venv/bin/activate
pip install anthropic httpx

# Set your API key
# Windows:  set ANTHROPIC_API_KEY=your-key-here
# Mac/Linux: export ANTHROPIC_API_KEY=your-key-here

Step 1: Create the Mock Search Tool

What & Why: We'll use a mock search implementation so the lab works without a real search API key. The mock returns realistic-looking results for a fixed set of queries. When you deploy to production, you swap this for a real search API (SerpAPI, Brave Search, Tavily) — the agent loop code doesn't change at all.

Create mock_search.py:

mock_search.py

"""
Mock search tool — simulates web_search results for development.
Swap for httpx + real search API in production.
"""

MOCK_RESULTS = {
    "python ai frameworks 2025": """
Top Python AI/Agent Frameworks 2025:
1. LangChain — most ecosystem integrations, complex but powerful
2. LlamaIndex — best for RAG and document processing
3. CrewAI — multi-agent orchestration, growing fast
4. claude-agent-sdk — Anthropic's first-party SDK for Claude agents
5. AutoGen (Microsoft) — enterprise-focused, code execution focus

Trend: First-party SDKs (claude-agent-sdk, OpenAI Agents SDK) are gaining vs wrapper frameworks.
""",
    "claude agent sdk features": """
claude-agent-sdk v1.2 Features (May 2025):
- @tool decorator for typed async tool functions
- query() for single-agent invocation
- ClaudeAgentOptions: model, system_prompt, tools, max_turns, hooks
- Built-in: can_use_tool hook for fine-grained control
- create_sdk_mcp_server() for MCP server creation
- Native subagent support via .claude/agents/ directory
- Streaming support with async generators
""",
    "react pattern llm agents": """
ReAct (Reasoning + Acting) — Yao et al. 2022:
- Interleaves reasoning traces with tool calls
- Shows 10-40% improvement on knowledge-intensive benchmarks vs silent tool use
- Key principle: thought before action improves tool selection accuracy
- Adopted by: LangChain, LlamaIndex, AutoGen as default agent pattern
- Works with any LLM supporting tool use (Claude, GPT-4, Gemini)
"""
}

def mock_search(query: str) -> str:
    """Return the mock result whose key shares the most whole words with the query."""
    query_words = set(query.lower().split())
    best_key, best_score = None, 0
    # Score by whole-word overlap so a shared token like "2025" cannot
    # make the first key win over a better match further down
    for key in MOCK_RESULTS:
        score = len(query_words & set(key.split()))
        if score > best_score:
            best_key, best_score = key, score
    if best_key is not None:
        return f"Search results for '{query}':\n{MOCK_RESULTS[best_key]}"
    return f"Search results for '{query}':\nNo specific results found. General information: {query} is an active research area in AI. Recent developments include improved tooling and better benchmark performance."

Test it:

python -c "from mock_search import mock_search; print(mock_search('python ai frameworks 2025'))"

Expected output:

Search results for 'python ai frameworks 2025':

Top Python AI/Agent Frameworks 2025:
1. LangChain — most ecosystem integrations, complex but powerful
...

✅ If you see a structured search result, the mock is working. If you see an ImportError, make sure you're in the react-agent/ directory.
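The claim from Step 1 that the agent loop does not change when the search backend is swapped follows from keeping every backend behind one signature. Here is a sketch of that seam, with a hypothetical `make_dispatcher` factory and a stand-in `canned_search` backend in place of `mock_search` or a real API client.

```python
import json
from typing import Callable

# Any search backend is just "query string in, result text out"
SearchFn = Callable[[str], str]


def make_dispatcher(search: SearchFn) -> Callable[[str, dict], str]:
    """Build an execute_tool function bound to whichever search backend is passed in."""
    def execute_tool(name: str, inputs: dict) -> str:
        if name == "web_search":
            return search(inputs.get("query", ""))
        return json.dumps({"error": f"Unknown tool: {name}", "isError": True})
    return execute_tool


def canned_search(query: str) -> str:
    """Stand-in backend used only for this illustration."""
    return f"canned results for {query}"


execute_tool = make_dispatcher(canned_search)
search_output = execute_tool("web_search", {"query": "react agents"})
unknown_output = json.loads(execute_tool("delete_everything", {}))
```

Moving to production then means passing a different function to `make_dispatcher`; the loop that calls `execute_tool` stays untouched.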

Step 2: Build the Complete ReAct Agent

What & Why: This is the full agent combining the system prompt, tool definitions, tool dispatcher, and the ReAct loop. Critically, the agent also extracts and prints thought traces so you can see the reasoning in real time.

Create agent.py:

agent.py

"""
ReAct Research Agent — M12 hands-on lab
Demonstrates the Reason → Act → Observe loop using Claude's tool use.
"""
import anthropic
import json
import os
from mock_search import mock_search

client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a research assistant that compiles accurate, well-sourced reports.

IMPORTANT: Before EVERY tool call, write:
  Thought: [your reasoning — what you know, what's missing, why THIS tool call]

After EVERY tool result, write:
  Thought: [what you learned and whether you need more information]

When you have enough information, produce a structured report with:
- Summary (2-3 sentences)
- Key Findings (bullet points)
- Sources Used (list the searches you ran)"""

TOOLS = [
    {
        "name": "web_search",
        "description": "Search the web for current information about AI, technology, or research topics. Use specific queries for better results.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query — be specific (e.g., 'claude agent sdk features 2025' not 'claude')"
                }
            },
            "required": ["query"]
        }
    }
]


def execute_tool(name: str, inputs: dict) -> str:
    """Dispatch a tool call to its implementation."""
    if name == "web_search":
        return mock_search(inputs.get("query", ""))
    return json.dumps({"error": f"Unknown tool: {name}", "isError": True})


def run_agent(question: str, max_turns: int = 20, verbose: bool = True) -> str:
    """
    Run the ReAct loop until stop_reason == 'end_turn'.

    Args:
        question: The user's research question
        max_turns: Safety cap (loop terminates via stop_reason, not this cap)
        verbose: Print thought traces and tool calls to stdout
    """
    messages = [{"role": "user", "content": question}]
    turn = 0

    if verbose:
        print(f"\n{'='*60}")
        print(f"QUESTION: {question}")
        print('='*60)

    while turn < max_turns:
        turn += 1

        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            system=SYSTEM_PROMPT,
            tools=TOOLS,
            messages=messages
        )

        # Append Claude's full response (includes both text thoughts and tool_use blocks)
        messages.append({"role": "assistant", "content": response.content})

        # Print verbose trace
        if verbose:
            print(f"\n--- Turn {turn} (stop_reason: {response.stop_reason}) ---")
            for block in response.content:
                if hasattr(block, 'text') and block.text:
                    print(f"💭 {block.text[:300]}{'...' if len(block.text) > 300 else ''}")
                elif block.type == "tool_use":
                    print(f"🔧 {block.name}({json.dumps(block.input, indent=None)[:100]})")

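        # STOP: Claude has produced its final answer
        if response.stop_reason == "end_turn":
            # Join every text block so a multi-block final answer is not truncated
            final_text = "\n".join(
                block.text for block in response.content
                if block.type == "text" and block.text
            )
            return final_text or "Agent completed without final text."

        # Any other non-tool stop (e.g. "max_tokens") leaves no tool calls to run,
        # so return instead of looping with nothing new to observe
        if response.stop_reason != "tool_use":
            return f"[Stopped early: stop_reason={response.stop_reason}]"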

        # OBSERVE: Execute tool calls, collect results
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                if verbose:
                    print(f"👁  Result preview: {result[:150]}...")
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })

        if tool_results:
            messages.append({"role": "user", "content": tool_results})

    return f"[Safety cap reached after {max_turns} turns — partial result]"


if __name__ == "__main__":
    question = "What are the main Python frameworks for building AI agents in 2025, and how does the claude-agent-sdk compare to them?"
    answer = run_agent(question, verbose=True)
    print("\n" + "="*60)
    print("FINAL REPORT:")
    print("="*60)
    print(answer)
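
The execute_tool dispatcher above reports unknown tools, but an exception raised inside a tool would still crash the loop. The following is a minimal sketch of one way to tighten that, not part of the lab solution: safe_tool_result and registry are hypothetical names introduced for illustration. The is_error flag is the optional field the Messages API accepts on tool_result blocks to tell Claude the call failed.

```python
import json


def safe_tool_result(tool_use_id, name, inputs, registry):
    """Build a tool_result block, flagging failures with is_error.

    registry is a hypothetical dict mapping tool names to Python callables.
    """
    try:
        handler = registry.get(name)
        if handler is None:
            raise ValueError(f"Unknown tool: {name}")
        content = str(handler(**inputs))
        failed = False
    except Exception as exc:
        # Return the failure to Claude as data instead of crashing the loop
        content = json.dumps({"error": f"{type(exc).__name__}: {exc}"})
        failed = True

    block = {"type": "tool_result", "tool_use_id": tool_use_id, "content": content}
    if failed:
        block["is_error"] = True
    return block


# Stand-in tool for illustration only (the lab uses mock_search)
registry = {"web_search": lambda query: f"results for {query}"}

ok = safe_tool_result("toolu_1", "web_search", {"query": "react"}, registry)
bad_args = safe_tool_result("toolu_2", "web_search", {}, registry)
unknown = safe_tool_result("toolu_3", "calculator", {}, registry)
```

With this shape, a failed call still produces a tool_result for its tool_use_id, so Claude can read the error in its next Thought and choose a different query or tool.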

Step 3: Run the Agent

python agent.py

Expected output (abbreviated):

============================================================
QUESTION: What are the main Python frameworks for building AI agents in 2025...
============================================================

--- Turn 1 (stop_reason: tool_use) ---
💭 Thought: I need to find current information about Python AI frameworks...
🔧 web_search({"query": "python ai frameworks 2025"})
👁  Result preview: Search results for 'python ai frameworks 2025': ...

--- Turn 2 (stop_reason: tool_use) ---
💭 Thought: I have a list of frameworks. Now let me search specifically for claude-agent-sdk...
🔧 web_search({"query": "claude agent sdk features"})
👁  Result preview: Search results for 'claude agent sdk features': ...

--- Turn 3 (stop_reason: end_turn) ---
💭 [Final report text]

============================================================
FINAL REPORT:
============================================================
## Summary
Python's agent framework ecosystem has matured significantly in 2025...

## Key Findings
- Five major frameworks dominate: LangChain, LlamaIndex, CrewAI, claude-agent-sdk, AutoGen
- claude-agent-sdk differentiates through first-party Claude integration...

## Sources Used
- web_search("python ai frameworks 2025")
- web_search("claude agent sdk features")

✅ If you see multiple "Turn" headers with Thought/Tool/Observe traces, the ReAct loop is working correctly. The agent should complete in 2–4 turns for this question. If you get an AuthenticationError, check that ANTHROPIC_API_KEY is set.
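
To check the loop mechanics without spending API calls, the same Reason → Act → Observe flow can be replayed against canned responses. This is a sketch under stated assumptions: ScriptedClient, react_loop, and the block helpers are hypothetical stand-ins that only mimic the attributes agent.py reads (stop_reason, content, type, text, id, name, input). They are not part of the anthropic SDK.

```python
from types import SimpleNamespace


def text_block(text):
    return SimpleNamespace(type="text", text=text)


def tool_use_block(block_id, name, tool_input):
    return SimpleNamespace(type="tool_use", id=block_id, name=name, input=tool_input)


class ScriptedClient:
    """Hypothetical stand-in that replays canned responses in order."""

    def __init__(self, responses):
        self._responses = iter(responses)
        self.history_sizes = []  # len(messages) seen on each call

    def create(self, **kwargs):
        self.history_sizes.append(len(kwargs["messages"]))
        return next(self._responses)


def react_loop(create, question, execute_tool, max_turns=5):
    """Same control flow as run_agent, with the API call injected."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        response = create(messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            answer = "\n".join(b.text for b in response.content if b.type == "text")
            return answer, messages
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": execute_tool(b.name, b.input)}
            for b in response.content if b.type == "tool_use"
        ]
        if results:
            messages.append({"role": "user", "content": results})
    return "[safety cap reached]", messages


scripted = ScriptedClient([
    SimpleNamespace(stop_reason="tool_use", content=[
        text_block("Thought: I need framework data."),
        tool_use_block("toolu_1", "web_search", {"query": "python ai frameworks"}),
    ]),
    SimpleNamespace(stop_reason="end_turn", content=[text_block("Final report.")]),
])

answer, history = react_loop(
    scripted.create,
    "Compare agent frameworks",
    execute_tool=lambda name, inputs: f"results for {inputs['query']}",
)
print(answer)                          # Final report.
print([m["role"] for m in history])    # ['user', 'assistant', 'user', 'assistant']
```

The role sequence is the thing to look for: every assistant turn that contains a tool_use block is followed by a user turn carrying the matching tool_result, and the loop exits on end_turn rather than on the cap.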

Troubleshooting

AuthenticationError: Run echo $ANTHROPIC_API_KEY (Mac/Linux) or echo %ANTHROPIC_API_KEY% (Windows) — must return your key, not blank.
ModuleNotFoundError: anthropic: Run pip install anthropic — make sure your virtual environment is activated.
Agent stops after 1 turn with no tool calls: Check that SYSTEM_PROMPT is being passed as system= and that the TOOLS list is non-empty and passed as tools=.
Loop never reaches end_turn (runs until the safety cap): Usually means tool results are not being appended correctly. Add a print(messages[-1]) after the tool_results append and verify that each tool_result carries the tool_use_id of the call it answers.
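
For the tool-result check in the last item, a small helper can scan the message history instead of eyeballing printed output. This is a sketch, assuming the history shape used in agent.py (assistant content as SDK block objects or plain dicts, tool results in the next user message). unmatched_tool_calls is a hypothetical helper name, not an SDK function.

```python
from types import SimpleNamespace


def _field(block, name):
    """Read a field from either a dict block or an SDK-style object block."""
    if isinstance(block, dict):
        return block.get(name)
    return getattr(block, name, None)


def unmatched_tool_calls(messages):
    """Return tool_use ids that have no tool_result in the next user message."""
    missing = []
    for i, msg in enumerate(messages):
        content = msg["content"]
        if msg["role"] != "assistant" or isinstance(content, str):
            continue
        call_ids = [_field(b, "id") for b in content if _field(b, "type") == "tool_use"]
        if not call_ids:
            continue
        answered = set()
        if i + 1 < len(messages):
            nxt = messages[i + 1]
            if nxt["role"] == "user" and not isinstance(nxt["content"], str):
                answered = {
                    _field(b, "tool_use_id")
                    for b in nxt["content"]
                    if _field(b, "type") == "tool_result"
                }
        missing.extend(cid for cid in call_ids if cid not in answered)
    return missing


# Illustrative history: two tool calls, only one result appended
history = [
    {"role": "user", "content": "Compare agent frameworks"},
    {"role": "assistant", "content": [
        SimpleNamespace(type="tool_use", id="toolu_1", name="web_search", input={"query": "a"}),
        SimpleNamespace(type="tool_use", id="toolu_2", name="web_search", input={"query": "b"}),
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_1", "content": "results for a"},
    ]},
]

print(unmatched_tool_calls(history))  # ['toolu_2']
```

An empty list means every tool call was answered. Any id it returns points at the turn where the append went wrong.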

Knowledge Check

1. What is the correct way to determine when a ReAct agent should stop looping?

A

Check if Claude's response text contains "I'm done" or "task complete"

B

Count iterations and stop after a fixed number (e.g., 10 turns)

C

Check stop_reason == "end_turn" in the API response

D

Stop when no tool_use blocks appear in the response content

2. What does "ReAct" stand for, and why does the pattern improve accuracy?

A

Reactive + Active — the model reacts to inputs and actively calls tools without intermediate steps

B

Reasoning + Acting — interleaving explicit reasoning traces with tool calls improves tool selection and reduces hallucination

C

Retrieve + Act — the model retrieves from a vector database before taking actions

D

Reason + ACT (API Control Token) — a special token that triggers the loop

3. In the ReAct loop implementation, where should tool results be appended in the messages array?

A

As a new "system" role message, before the assistant's response

B

As an "assistant" role message, replacing the tool_use block

C

As a "user" role message with type: "tool_result" content blocks, after the assistant's message

D

Tool results don't need to be appended — Claude receives them automatically from the API

4. Why are thought traces valuable in a production ReAct agent? (Select the best single answer)

A

They reduce the total number of API calls by letting Claude plan in advance

B

They create an audit trail showing WHY each tool was called, and accumulate as working memory across turns so Claude stays anchored to the original intent

C

They allow Claude to skip tools it has already called in the same session

D

They are required by the Messages API — the API rejects requests without thought traces

5. You're using the Agent SDK's `query()` function and the agent terminates after 1 turn without calling any tools. What should you check first?

A

The model version — newer models don't support tool use

B

The max_turns setting — it might be set to 1

C

The tool descriptions and system prompt — Claude may have enough information to answer directly or the tool descriptions don't signal relevance for this question

D

The @tool decorator syntax — SDK tools must have specific return types to be called

6. This module (M12) is a Tier 2 module. What does that mean for the lab?

A

The lab uses only the Agent SDK — the raw API is not shown

B

The lab ships both a manual (raw API) solution and an SDK solution side-by-side, so students see what the SDK abstracts

C

The lab requires two separate API keys — one for the raw API and one for the SDK

D

The lab is optional — Tier 2 modules are advanced topics only experienced developers should attempt

Module Summary

Key Concepts

ReAct = Reasoning + Acting interleaved
Loop phases: Reason → Act → Observe → Repeat
stop_reason == "end_turn" = correct stop signal
Thought traces = working memory + audit trail
Iteration cap = safety net, not control mechanism
Tool descriptions = Claude's selection signal

What We Built

✅ ReAct research agent with verbose trace output
✅ Tool definitions with quality descriptions
✅ Correct stop-reason loop termination
✅ Structured error handling in tool dispatch
✅ Manual + SDK side-by-side comparison

→ Next: M13 — Planning & Task Decomposition

ReAct handles sequential reasoning well, but what about tasks so complex that they need a plan before any action is taken? M13 builds on today's loop by adding an explicit planning phase: Claude decomposes a goal into a DAG of sub-tasks, then executes each node — potentially in parallel — using the ReAct pattern you just learned.
