CC8: Tool Use Deep Dive
How Claude calls tools at the API level — the request/response loop, tool schemas, the four message-block types, multi-tool orchestration, fine-grained streaming, the built-in text-edit and web-search tools — and how all of this maps to the Claude Code tools you've been using since CC0.
Learning Objectives
- Trace the four-step tool-use loop: request → tool_use → tool_result → final answer.
- Write a JSON Schema for a tool that Claude can call reliably.
- Identify the four content-block types (text, tool_use, tool_result, thinking).
- Orchestrate multiple tools in one conversation, handling parallel tool calls.
- Use tool_choice and disable_parallel_tool_use to control Claude's tool-selection behavior.
- Use the built-in text-edit and web-search tools (web search runs server-side; text edits are applied by your code).
- Explain how Claude Code's Bash, Edit, Read tools are tool-use under the hood.
Why Tool Use
Imagine a doctor diagnosing without ever ordering a test. They guess based on symptoms alone. Sometimes right, often wrong, no way to verify. Now imagine the same doctor with a lab next door: order a CBC, get results back, decide. Slower per question, dramatically more accurate.
Tool use is the lab. Claude alone is the doctor without one — it can only emit text from training. With tools, Claude can look things up, run code, edit files, query databases. Each tool call is an order to the lab; the result comes back, Claude integrates it, and answers.
Tool use is the API mechanism by which Claude can request that the application run a function and return the result. The application defines tools (name, description, input schema), Claude decides when to call them, the application executes them and returns results, Claude integrates the results into its final answer. The loop can repeat for multi-step tasks.
Every Claude Code action you've seen — reading a file, editing it, running Bash, searching with Grep — is a tool call. The CLI defines those tools, Claude decides which to call, the CLI runs them, and the loop continues until Claude says "done." Understanding tool use is understanding how Claude Code actually works.
The Tool-Use Loop
One end-to-end tool call is a four-step round trip:
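1. Request: you send the user message along with your tool definitions.
2. Claude responds with a tool_use block: name + input. stop_reason: "tool_use".
3. You run the tool and build a tool_result block with the output (or error).
4. You send a follow-up request (including the tool_result). Claude integrates and returns text.
Steps 1 & 2 — defining the tool, getting a call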
tools = [{
"name": "get_weather",
"description": "Get current temperature for a city.",
"input_schema": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"},
"units": {"type": "string", "enum": ["c", "f"], "default": "c"},
},
"required": ["city"],
},
}]
resp = client.messages.create(
model="claude-sonnet-4-6", max_tokens=1024,
tools=tools,
messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
)
print(resp.stop_reason) # "tool_use"
print(resp.content)
# [TextBlock(type="text", text="I'll check the weather."),
# ToolUseBlock(type="tool_use", id="toolu_01ABC", name="get_weather",
# input={"city": "Tokyo", "units": "c"})]
// TypeScript equivalent of the Python request above
const tools: Anthropic.Tool[] = [{
name: "get_weather",
description: "Get current temperature for a city.",
input_schema: {
type: "object",
properties: {
city: { type: "string", description: "City name" },
units: { type: "string", enum: ["c", "f"], default: "c" },
},
required: ["city"],
},
}];
const resp = await client.messages.create({
model: "claude-sonnet-4-6", max_tokens: 1024,
tools,
messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
});
console.log(resp.stop_reason); // "tool_use"
Steps 3 & 4 — running the tool, sending the result back
import json

def get_weather(city: str, units: str = "c") -> dict:
    # Real impl would call a weather API; we fake it here.
    return {"city": city, "temp": 14, "units": units, "condition": "rain"}

# Find the tool_use block
tool_use = next(b for b in resp.content if b.type == "tool_use")
result = get_weather(**tool_use.input)

# Build the next request: append the assistant's prior turn AND a user turn
# whose content is a tool_result block referring to the tool_use_id.
follow_up = client.messages.create(
    model="claude-sonnet-4-6", max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo?"},
        {"role": "assistant", "content": resp.content},  # full prior content
        {"role": "user", "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                # Serialize as JSON (not str(), which yields a Python repr)
                "content": json.dumps(result),
            }
        ]},
    ],
)
print(follow_up.content[0].text)
# It's currently 14°C and raining in Tokyo.
- You must echo back the assistant's full prior content: the tool_use block has to be in history for Claude to match the tool_use_id.
- The tool_result goes inside a user-role message, not its own role. Wrap it in {"role": "user", "content": [{"type": "tool_result", ...}]}.
- Tool results don't have to be JSON. A string is fine; Claude reads it. For structured returns, serialize them as JSON (json.dumps in Python, JSON.stringify in TypeScript) so Claude can reason about fields.
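The three rules above can be folded into one small helper. The following is a minimal sketch, assuming a hypothetical function named `build_tool_result_turn` (it is not part of the SDK): it wraps a tool output in a user-role message and serializes anything that is not already a string as JSON.

```python
import json
from typing import Any


def build_tool_result_turn(tool_use_id: str, output: Any) -> dict:
    """Wrap one tool output in a user-role message (hypothetical helper)."""
    # Strings pass through untouched; anything else is serialized as JSON
    # so Claude can reason about individual fields.
    content = output if isinstance(output, str) else json.dumps(output)
    return {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": tool_use_id, "content": content}
        ],
    }


turn = build_tool_result_turn("toolu_01ABC", {"city": "Tokyo", "temp": 14})
print(turn["content"][0]["content"])  # {"city": "Tokyo", "temp": 14}
```

The returned dict can be appended to the message history right after the assistant turn that contained the matching tool_use block.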
Tool Schemas — What Makes a Good One
The tool definition Claude sees has three parts: name, description, input_schema. All three matter.
Name — verb_noun, snake_case
Claude picks tools largely on name. get_weather is better than weather — verbs make the action explicit. Snake case is convention; mixed case works but stay consistent.
Description — the most important field
This is what convinces Claude to call the tool instead of answering from memory. Every description should cover three things:
- What it does in one sentence.
- When to use it — scenarios where this tool beats Claude's own knowledge.
- What it returns — shape and meaning of the output.
"description": (
"Look up an order by its purchase order (PO) number. "
"Use this whenever a user mentions a PO# or asks about order status. "
"Returns: {po_number, status, eta, line_items}. "
"Status is one of: open, shipped, delivered, cancelled."
)
Input schema — JSON Schema, with descriptions per field
Per-field description is what teaches Claude how to fill the field correctly. enum closes off invalid values. required marks must-haves.
{
"type": "object",
"properties": {
"po_number": {
"type": "string",
"pattern": "^PO-[0-9]{6}$",
"description": "Purchase order number, format: PO-123456"
},
"include_line_items": {
"type": "boolean",
"default": false,
"description": "Whether to include the array of items in the order"
}
},
"required": ["po_number"]
}
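Tool inputs are model output, so it can be worth checking them against the schema before executing anything. The sketch below is a hand-rolled illustration, assuming a hypothetical `validate_input` helper that covers only `required`, string `pattern`, and boolean checks; a real project would more likely use a full JSON Schema validator library.

```python
import re

PO_SCHEMA = {
    "type": "object",
    "properties": {
        "po_number": {"type": "string", "pattern": "^PO-[0-9]{6}$"},
        "include_line_items": {"type": "boolean", "default": False},
    },
    "required": ["po_number"],
}


def validate_input(schema: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid.

    Illustration only: handles `required`, string `pattern`, and boolean
    type checks, not the full JSON Schema specification.
    """
    problems = []
    for field in schema.get("required", []):
        if field not in args:
            problems.append(f"missing required field: {field}")
    for field, value in args.items():
        spec = schema["properties"].get(field)
        if spec is None:
            # Design choice for this sketch: reject fields the schema does not list.
            problems.append(f"unexpected field: {field}")
        elif spec["type"] == "string":
            if not isinstance(value, str):
                problems.append(f"{field} must be a string")
            elif "pattern" in spec and not re.search(spec["pattern"], value):
                problems.append(f"{field} does not match {spec['pattern']}")
        elif spec["type"] == "boolean" and not isinstance(value, bool):
            problems.append(f"{field} must be a boolean")
    return problems


print(validate_input(PO_SCHEMA, {"po_number": "PO-123456"}))  # []
print(validate_input(PO_SCHEMA, {"po_number": "123456"}))
```

When the list is non-empty, the problems can be joined into a string and sent back to Claude as the tool result so it can correct the call.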
If Claude calls the wrong tool, calls the right tool with wrong inputs, or doesn't call any tool when it should — nine times out of ten the fix is in the description, not the schema. Treat each description as a 3-sentence prompt-engineering exercise.
The Four Content-Block Types
Claude's content array can contain four kinds of blocks. You'll see all four once you start using thinking + tools together.
| Block type | Where | Carries |
|---|---|---|
| text | Assistant or user | Plain text the model wrote (or the user typed). |
| tool_use | Assistant only | id, tool name, input (matching schema). |
| tool_result | User only (echoing back) | tool_use_id (matches the id), content (string or array). |
| thinking | Assistant, when extended thinking enabled | Hidden reasoning trace; you echo it back unmodified. |
Handling mixed blocks
Don't assume content[0] is what you want. Iterate:
for block in resp.content:
    if block.type == "text":
        print("TEXT:", block.text)
    elif block.type == "tool_use":
        print(f"CALLING {block.name}({block.input})")
        run_tool(block)
    elif block.type == "thinking":
        # Echo back unchanged; you can log it for debugging
        pass
If your tool fails, return a tool_result with is_error: true and a string explaining what broke. Claude will adapt — e.g. retry with a different argument, or apologize to the user. Don't throw an exception client-side; that aborts the loop.
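One way to keep failures inside the loop is to wrap every tool execution. The sketch below assumes a hypothetical wrapper named `run_tool_safely` and a toy `divide` tool; it catches any exception and turns it into a tool_result block with is_error set, rather than letting it propagate.

```python
import json


def run_tool_safely(tool_use_id: str, fn, args: dict) -> dict:
    """Run `fn(**args)` and always return a tool_result block.

    Exceptions are caught and reported to Claude via `is_error` instead of
    propagating and aborting the agent loop.
    """
    try:
        output = fn(**args)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": json.dumps(output),
        }
    except Exception as exc:  # broad on purpose: any failure goes back to Claude
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }


def divide(a: float, b: float) -> float:  # toy tool for the demo
    return a / b


print(run_tool_safely("toolu_1", divide, {"a": 1, "b": 0}))
```

The error string is deliberately short and specific, since it is the only information Claude has when deciding whether to retry with different arguments.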
Multiple Tools & Multi-Turn
You can pass an array of tools and Claude picks per turn. The model can also issue multiple tool calls in parallel — one assistant message containing several tool_use blocks.
tools = [
{"name": "get_weather", "description": "...", "input_schema": {...}},
{"name": "get_time", "description": "Current time in a city.", "input_schema": {...}},
]
resp = client.messages.create(
model="claude-sonnet-4-6", max_tokens=1024,
tools=tools,
messages=[{"role": "user", "content":
"I'm flying from NYC to Tokyo. What's the weather and local time at both ends?"}],
)
# resp.content may contain 4 tool_use blocks: weather(NYC), weather(Tokyo),
# time(NYC), time(Tokyo). Run them all (ideally in parallel), then send all 4
# tool_result blocks in one user-role follow-up.
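Running the calls in parallel can be sketched with a thread pool. This example is an assumption-laden illustration: `ToolUse` is a stand-in dataclass that mirrors the attribute names of the SDK's tool_use block, and `HANDLERS`, `get_weather`, and `get_time` are fake local functions rather than real services.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class ToolUse:
    """Stand-in for the SDK's tool_use block (same attribute names)."""
    id: str
    name: str
    input: dict
    type: str = "tool_use"


def get_weather(city: str) -> str:
    return f"weather for {city}"  # fake data


def get_time(city: str) -> str:
    return f"time in {city}"  # fake data


HANDLERS = {"get_weather": get_weather, "get_time": get_time}


def run_parallel(blocks: list) -> list[dict]:
    """Execute every tool_use block concurrently; results keep block order."""
    calls = [b for b in blocks if b.type == "tool_use"]
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        # pool.map preserves input order, so outputs line up with `calls`.
        outputs = list(pool.map(lambda b: HANDLERS[b.name](**b.input), calls))
    return [
        {"type": "tool_result", "tool_use_id": b.id, "content": out}
        for b, out in zip(calls, outputs)
    ]


blocks = [
    ToolUse("toolu_1", "get_weather", {"city": "NYC"}),
    ToolUse("toolu_2", "get_time", {"city": "Tokyo"}),
]
results = run_parallel(blocks)
print([r["tool_use_id"] for r in results])  # ['toolu_1', 'toolu_2']
```

Threads suit I/O-bound tools such as HTTP lookups; the whole `results` list then goes into a single user-role message.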
The full loop until end_turn
def agent_loop(user_msg: str, tools: list, max_iters: int = 10):
    history = [{"role": "user", "content": user_msg}]
    for _ in range(max_iters):
        resp = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=2048,
            tools=tools, messages=history,
        )
        history.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason != "tool_use":
            return resp  # final answer; bail
        # Run every tool_use block, collect tool_result blocks
        results = []
        for block in resp.content:
            if block.type == "tool_use":
                output = dispatch(block.name, block.input)  # your dispatcher
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(output),
                })
        history.append({"role": "user", "content": results})
    raise RuntimeError("Agent loop exceeded max iterations")
You wrote a tiny agent. The pattern: loop while stop_reason == "tool_use", run every tool_use block, send all tool_results back together. Always cap iterations — a buggy tool can put you in an infinite loop. CC13 covers this loop pattern in depth.
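The iteration cap is easy to exercise without spending API calls. The sketch below restates the loop with the API call injected as a parameter, then drives it with canned responses. The names `run_loop` and `scripted` are hypothetical, and `SimpleNamespace` objects stand in for SDK response and block types.

```python
from types import SimpleNamespace


def run_loop(create, dispatch, user_msg: str, max_iters: int = 10):
    """Same shape as agent_loop, with the API call injected as `create`."""
    history = [{"role": "user", "content": user_msg}]
    for _ in range(max_iters):
        resp = create(messages=history)
        history.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason != "tool_use":
            return resp
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": str(dispatch(b.name, b.input))}
            for b in resp.content if b.type == "tool_use"
        ]
        history.append({"role": "user", "content": results})
    raise RuntimeError("Agent loop exceeded max iterations")


def scripted(responses):
    """Fake `create` that replays canned responses in order."""
    it = iter(responses)
    return lambda **kwargs: next(it)


call = SimpleNamespace(type="tool_use", id="toolu_1", name="echo", input={"x": 1})
final = SimpleNamespace(type="text", text="done")
create = scripted([
    SimpleNamespace(stop_reason="tool_use", content=[call]),
    SimpleNamespace(stop_reason="end_turn", content=[final]),
])
resp = run_loop(create, lambda name, args: args, "hi")
print(resp.content[0].text)  # done
```

Passing a fake `create` that always returns stop_reason "tool_use" makes the loop hit the cap and raise, which is the behavior you want from a runaway tool.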
Tool Choice & Fine-Grained Control
By default, Claude decides whether and which tool to call. The tool_choice parameter lets you override:
| tool_choice | Behavior | Use when |
|---|---|---|
| {"type": "auto"} (default) | Claude picks: zero, one, or many tools. | Normal agent operation. |
| {"type": "any"} | Claude must call some tool, but picks which. | You know a tool is needed; routing tasks. |
| {"type": "tool", "name": "X"} | Claude must call exactly tool X. | Forced structured output (CC1). |
| {"type": "none"} | No tools allowed; text only. | Final summarization step. |
Disabling parallel tool calls
By default, Claude can issue several tool_use blocks in one assistant message. If your tools have side effects that must run serially (e.g., a state-mutating API), pass:
resp = client.messages.create(
model="claude-sonnet-4-6", max_tokens=1024,
tools=tools,
tool_choice={"type": "auto", "disable_parallel_tool_use": True},
messages=[...]
)
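The four modes and the parallel flag described in this section can be assembled by a small builder, which keeps malformed combinations out of requests. This is a sketch under assumptions: `make_tool_choice` is a hypothetical helper, and leaving the flag off for "none" is a choice made here because no tools run in that mode.

```python
from typing import Optional


def make_tool_choice(mode: str = "auto", name: Optional[str] = None,
                     allow_parallel: bool = True) -> dict:
    """Build a tool_choice dict for one of the modes: auto, any, tool, none."""
    if mode not in {"auto", "any", "tool", "none"}:
        raise ValueError(f"unknown tool_choice mode: {mode}")
    if (mode == "tool") != (name is not None):
        raise ValueError("`name` is required for mode 'tool' and only for it")
    choice = {"type": mode}
    if name is not None:
        choice["name"] = name
    # Parallel calls are irrelevant when tools are off, so skip the flag there.
    if not allow_parallel and mode != "none":
        choice["disable_parallel_tool_use"] = True
    return choice


print(make_tool_choice("tool", name="record_signoff"))
print(make_tool_choice("auto", allow_parallel=False))
```

The result is passed as the `tool_choice` argument of the request.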
Fine-grained tool streaming
By default, when streaming with tool use, the API buffers and validates each tool's JSON input before passing it on. Fine-grained streaming emits the input as it is generated, which is useful for very large inputs (file edits, long writes) where you want to start downstream work before the input is done. The trade-off is that you may receive partial or invalid JSON, so parse defensively.
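partial_json = ""  # accumulates the tool input as it streams in
with client.messages.stream(
    model="claude-sonnet-4-6", max_tokens=4096,
    tools=tools,
    extra_headers={"anthropic-beta": "fine-grained-tool-streaming-2025-05-14"},
    messages=[...],
) as stream:
    for event in stream:
        # Tool input arrives in content_block_delta events whose delta
        # has type "input_json_delta".
        if event.type == "content_block_delta" and event.delta.type == "input_json_delta":
            partial_json += event.delta.partial_json
            # You can begin parsing/echoing partial JSON here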
Mostly for large file edits. The default streaming is fine for 95% of cases. Don't enable fine-grained unless you're seeing user-visible latency from buffering.
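Because fragments arrive before the JSON is complete, downstream code needs a buffer that tolerates unparseable input. The sketch below assumes a hypothetical `ToolInputBuffer` class and feeds it hand-written fragments in place of real stream events.

```python
import json
from typing import Optional


class ToolInputBuffer:
    """Accumulates partial JSON fragments for one tool_use block."""

    def __init__(self) -> None:
        self.raw = ""

    def feed(self, fragment: str) -> None:
        self.raw += fragment

    def try_parse(self) -> Optional[dict]:
        """Return the parsed input, or None while the JSON is incomplete."""
        try:
            parsed = json.loads(self.raw)
        except json.JSONDecodeError:
            return None
        # Tool inputs are JSON objects; ignore anything else that parses.
        return parsed if isinstance(parsed, dict) else None


buf = ToolInputBuffer()
for fragment in ['{"path": "./au', 'th.py", "old_str": "def login', '():"}']:
    buf.feed(fragment)
    print(len(buf.raw), buf.try_parse() is not None)
```

Only the last fragment makes the buffer parseable; until then `try_parse` returns None, and the raw text can still be echoed to a progress display.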
Built-In Tools — Text Edit & Web Search
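Anthropic ships built-in tools whose schemas are predefined: you enable them by referencing them in tools rather than writing an input_schema yourself. Server tools, such as web search, also execute inside Anthropic's infrastructure, so there is nothing for you to implement. Other built-in tools, such as the text editor, still execute in your code.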
The text-edit tool
The text-edit tool gives Claude four operations on a text file: view, create, str_replace, insert. The model emits a tool_use call with the operation; your code applies it to a real file and returns the result. (Server tool naming is misleading here — the tool's schema is built-in, but execution is still on your side.)
tools = [{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}]
resp = client.messages.create(
model="claude-sonnet-4-6", max_tokens=2048,
tools=tools,
messages=[{"role": "user", "content":
"Open ./auth.py, find the function login(), and rename it to authenticate()."
}],
)
# resp.content has a tool_use with operations like:
# {"command": "view", "path": "./auth.py"}
# {"command": "str_replace", "path": "./auth.py",
# "old_str": "def login():", "new_str": "def authenticate():"}
# Note: tool type/name versioning is model-family specific.
# Sonnet/Opus 4.x: text_editor_20250728 + str_replace_based_edit_tool.
# Older Sonnet 3.5: text_editor_20241022 + str_replace_editor.
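Since execution stays on your side, you need an executor for the commands Claude emits. The following is a minimal sketch, assuming a hypothetical `apply_text_edit` function that handles only view, create, and str_replace. The `file_text` field name for create is an assumption to check against the tool documentation, and the exactly-one-match rule is a design choice of this sketch.

```python
import tempfile
from pathlib import Path


def apply_text_edit(cmd: dict) -> tuple[str, bool]:
    """Apply one text-editor command; returns (message, is_error).

    Covers `view`, `create`, and `str_replace` only. The `file_text` field
    name for `create` is an assumption to verify against the tool docs.
    """
    path = Path(cmd["path"])
    command = cmd["command"]
    if command == "create":
        path.write_text(cmd["file_text"])
        return f"Created {path}", False
    if not path.is_file():
        return f"File not found: {path}", True
    text = path.read_text()
    if command == "view":
        return text, False
    if command == "str_replace":
        count = text.count(cmd["old_str"])
        if count != 1:
            # Refuse missing or ambiguous matches so Claude can retry.
            return f"old_str matched {count} times; expected exactly 1", True
        path.write_text(text.replace(cmd["old_str"], cmd["new_str"], 1))
        return f"Edited {path}", False
    return f"Unsupported command: {command}", True


with tempfile.TemporaryDirectory() as tmp:
    target = str(Path(tmp) / "auth.py")
    apply_text_edit({"command": "create", "path": target,
                     "file_text": "def login():\n    pass\n"})
    print(apply_text_edit({"command": "str_replace", "path": target,
                           "old_str": "def login():",
                           "new_str": "def authenticate():"}))
    print(apply_text_edit({"command": "view", "path": target})[0])
```

The message and error flag map directly onto a tool_result block's content and is_error fields. A production executor should also restrict `path` to an allowed directory before touching the filesystem.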
The web-search tool
Web search does run server-side — you don't get to see or implement the search; the result comes back as part of Claude's response, with citations.
tools = [{"type": "web_search_20250305", "name": "web_search", "max_uses": 3}]
resp = client.messages.create(
model="claude-sonnet-4-6", max_tokens=2048,
tools=tools,
messages=[{"role": "user", "content":
"What did Anthropic ship at their last devday?"}],
)
# resp.content includes a server_tool_use block, then a web_search_tool_result
# block with the results, then text with citations. The search itself runs
# server-side, all in one round trip.
Server-tool invocations are billed (web search has a per-use fee). Cap with max_uses to avoid surprise costs from over-eager Claude. Always set max_uses for web_search in production code.
How Claude Code Uses Tools
Now the payoff: every Claude Code action you've performed is exactly the loop above. The CLI defines a fixed set of tools and runs the loop on your behalf.
| CLI tool | Schema | Executor |
|---|---|---|
| Bash | {command, description, timeout} | Spawns shell, returns stdout/stderr/exit. |
| Read | {file_path, offset?, limit?} | Reads file with optional pagination. |
| Edit | {file_path, old_string, new_string, replace_all?} | Performs string replace; errors if not unique. |
| Grep | {pattern, path?, type?, ...} | Wraps ripgrep, returns matches. |
| Glob | {pattern, path?} | Filename matching. |
| WebFetch | {url, prompt} | Fetches + lets Claude extract. |
When you use --debug or watch transcripts, you'll see the exact tool_use / tool_result blocks. Skills, subagents, and MCP servers all add tools to the same loop — nothing magical, just more entries in the tools array Claude sees.
An MCP server registers its tools with Claude Code at startup. Once registered, those tools are indistinguishable to Claude from built-in ones — same schema format, same loop. CC9 covers writing your own MCP server with this in mind.
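To make the "executor" column concrete, here is a rough sketch of what an executor for a Bash-style tool might look like. It is not Claude Code's actual implementation: `run_bash` is a hypothetical function, and treating the timeout as seconds is an assumption of this example.

```python
import subprocess


def run_bash(command: str, timeout: float = 30.0) -> dict:
    """Run a shell command and report stdout, stderr, and exit code."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # Report the timeout as data so the agent loop can continue.
        return {"stdout": "", "stderr": f"timed out after {timeout}s",
                "exit_code": None}
    return {"stdout": proc.stdout, "stderr": proc.stderr,
            "exit_code": proc.returncode}


print(run_bash("echo hello")["stdout"].strip())  # hello
```

The returned dict would be serialized into a tool_result block, with a non-zero exit code as a natural trigger for is_error. Running model-chosen commands through a shell is risky, so a real executor needs sandboxing or an approval step.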
Hands-On Lab — Build a 2-Tool Agent
You'll build a 60-line Python agent that has two tools (get_lien_count and get_filing_dates) and answers questions about UCC filings. This is the same loop Claude Code runs — written by you, in 60 lines.
Step 1 — Mock data
FILINGS = {
"PO-100001": {"debtor": "Acme LLC", "filed": "2024-03-12", "lien_count": 2},
"PO-100002": {"debtor": "Beta Inc", "filed": "2024-09-01", "lien_count": 0},
"PO-100003": {"debtor": "Acme LLC", "filed": "2025-01-04", "lien_count": 5},
}
def get_lien_count(debtor: str) -> dict:
    total = sum(f["lien_count"] for f in FILINGS.values() if f["debtor"] == debtor)
    return {"debtor": debtor, "total_liens": total}

def get_filing_dates(debtor: str) -> dict:
    dates = [f["filed"] for f in FILINGS.values() if f["debtor"] == debtor]
    return {"debtor": debtor, "filing_dates": sorted(dates)}
Step 2 — Tool definitions + dispatcher
TOOLS = [
{
"name": "get_lien_count",
"description": "Total active liens for a debtor across all UCC filings. "
"Use when a user asks 'how many liens does X have?'. "
"Returns {debtor, total_liens}.",
"input_schema": {
"type": "object",
"properties": {"debtor": {"type": "string"}},
"required": ["debtor"],
},
},
{
"name": "get_filing_dates",
"description": "All filing dates for a debtor, ascending. "
"Returns {debtor, filing_dates: ISO date strings}.",
"input_schema": {
"type": "object",
"properties": {"debtor": {"type": "string"}},
"required": ["debtor"],
},
},
]
def dispatch(name: str, args: dict) -> dict:
    if name == "get_lien_count":
        return get_lien_count(**args)
    if name == "get_filing_dates":
        return get_filing_dates(**args)
    return {"error": f"unknown tool {name}"}
Step 3 — The loop
import json, sys
from anthropic import Anthropic

def agent(user_q: str, max_iters: int = 6) -> str:
    client = Anthropic()
    history = [{"role": "user", "content": user_q}]
    for _ in range(max_iters):
        r = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=1024,
            tools=TOOLS, messages=history,
        )
        history.append({"role": "assistant", "content": r.content})
        if r.stop_reason != "tool_use":
            # Return the first text block (empty string if there is none)
            return next((b.text for b in r.content if b.type == "text"), "")
        results = []
        for b in r.content:
            if b.type == "tool_use":
                out = dispatch(b.name, b.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": b.id,
                    "content": json.dumps(out),
                })
        history.append({"role": "user", "content": results})
    return "[max iterations reached]"

if __name__ == "__main__":
    print(agent(" ".join(sys.argv[1:]) or "How many liens does Acme LLC have, and when did they file?"))
Step 4 — Run it
$ python agent.py "How many liens does Acme LLC have, and when did they file?"
Acme LLC has 7 total active liens, filed on 2024-03-12 and 2025-01-04.
Watch what happens: Claude calls both tools in one assistant turn (parallel tool use), your dispatcher runs both, sends two tool_results back, Claude composes the answer.
Step 5 — Inspect the loop
Add a print statement after each r.stop_reason check. When Claude issues both calls in parallel, you'll see two iterations: one with stop_reason: "tool_use", one with stop_reason: "end_turn". Two API calls, four tool blocks exchanged (two tool_use, two tool_result), one final answer.
A 60-line Python agent that runs the same loop Claude Code runs internally. You've now seen tool use end-to-end. Skills, subagents, and MCP servers are different ways of contributing entries to this loop's tools array. When CC9 (MCP server-building) lands, it'll feel like adding to a loop you already understand.
Knowledge Check
1. Claude responded with stop_reason: "tool_use". What's the next step?
Answer: Run the requested tool, append the assistant turn plus a user turn containing the tool_result block, and re-send.
2. You're getting wrong tool selections. Most likely cause?
Answer: A weak tool description. Most selection problems are fixed in the description, not the schema.
3. Claude responded with TWO tool_use blocks in one assistant message. How do you reply?
Answer: With one user message that has two tool_result blocks in its content array.
4. You want Claude to always call your record_signoff tool. Best param?
- tool_choice: {"type": "auto"}
- tool_choice: {"type": "any"}
- tool_choice: {"type": "tool", "name": "record_signoff"}
- tool_choice: {"type": "none"}
Answer: {"type": "tool", "name": "record_signoff"}. {"type": "tool", "name": ...} forces a specific tool. "any" means "some tool"; "none" means no tools.
5. Your tool occasionally fails. What goes back to Claude?
Answer: A tool_result block with is_error: true and a string explaining the failure. Claude can adapt (retry, reword, give up gracefully). Throwing breaks the loop.
Module Summary
- Tool-use loop: request → stop_reason: "tool_use" → run tool → send back tool_result in user role → final answer. Loop until end_turn.
- Tool schema: name (verb_noun), description (what + when + returns), input_schema (JSON Schema with per-field descriptions).
- Four block types: text, tool_use, tool_result, thinking. Iterate; don't assume content[0].
- Multiple tools: Claude can call several in parallel within one assistant message. Reply with all results in one user message.
- tool_choice: auto (default), any (must call something), {"type": "tool", "name": ...} (force one), none (text only).
- Built-in tools: text editor (str_replace_based_edit_tool on current models, executed by your code) and web search (executed server-side). Cap web_search with max_uses.
- Errors stay inside the loop: tool_result + is_error: true + message. Don't throw.
- Claude Code's Bash/Read/Edit/Grep tools are exactly this protocol; CC9's MCP servers add to the same array.