Claude Code Mastery — Direct Track
CC13 — Patterns Module 14 of 16
50 minAdvanced
← CC12: RAG 🏠 Home CC14: Power User & CI/CD →

CC13: Workflows, Agents & Computer Use

The four canonical patterns for orchestrating multi-step Claude work — chaining, parallelization, routing, the agent loop — plus environment inspection and computer use, the GUI-driving capability that lets Claude operate desktop apps directly.

Learning Objectives

  • Distinguish workflows (deterministic, predictable) from agents (adaptive, looping).
  • Implement the three core workflow patterns: chaining, parallelization, routing.
  • Build an agent loop that picks tools, observes outcomes, and stops correctly.
  • Use environment inspection so Claude probes its world before deciding.
  • Recognize when computer use (screen + mouse + keyboard) is appropriate — and when it isn't.
  • Identify which Claude Code features map to which pattern.
CC8 was one tool call. CC13 is what happens when the task is bigger than one call — and when to use which pattern instead of throwing everything at "an agent."

Workflows vs Agents — The Most Useful Distinction

Everyday Analogy

A workflow is a recipe: step 1, step 2, step 3, done. Anyone can follow it. The recipe doesn't change based on whether the milk smells funny — you just follow the steps.

An agent is a chef. Given a goal ("make dinner") and a kitchen, they decide what to cook, taste as they go, change course if the cream curdles, and stop when the meal is ready. Adaptive. Looping. Sometimes goes wrong in surprising ways.

Both are useful. Recipes are predictable, cheap, debuggable. Chefs handle novel situations and partial information. Most teams reach for "agent" when "workflow" would be cheaper and more reliable.

Technical Definition

A workflow is a deterministic orchestration of one or more LLM calls in a fixed sequence (or DAG). The control flow is in your code. An agent is an LLM running a loop where it picks tools, observes results, and decides what to do next. The control flow is in the model. Workflows are predictable; agents are adaptive.

PropertyWorkflowAgent
Control flowYour codeThe LLM
StepsKnown in advanceDecided at runtime
Cost predictabilityHighLow (loops can run away)
DebuggingEasy — standard tracingHard — reasoning is opaque
Best forDefined process, repeatableOpen-ended, partial information
Default to workflow

If you can structure the work as a fixed pipeline, do. You'll have less drama, lower bills, and easier debugging. Reach for an agent only when the steps genuinely can't be enumerated in advance — e.g. open-ended research, multi-file refactors with unknown extent.

Chaining Workflow — Sequential LLM Calls

The simplest workflow: call A's output is call B's input. Use when you can decompose the task into stages, each smaller and more focused than the whole.

Chaining example: PR triage
StepLLM callInputOutput
1SummarizeThe diff3-sentence summary
2ClassifySummary{kind: feat|fix|chore, risk: low|med|high}
3RouteClassificationReviewer assignment
def triage(diff: str) -> dict:
    summary = client.messages.create(
        model="claude-haiku-4-5-20251001", max_tokens=200, temperature=0,
        system="Summarize a diff in 3 sentences.",
        messages=[{"role": "user", "content": diff}],
    ).content[0].text

    classify = client.messages.create(
        model="claude-haiku-4-5-20251001", max_tokens=100, temperature=0,
        tools=CLASSIFY_TOOL,
        tool_choice={"type": "tool", "name": "classify"},
        messages=[{"role": "user", "content": summary}],
    )
    cls = next(b.input for b in classify.content if b.type == "tool_use")

    return {"summary": summary, **cls}
Why chain instead of one big prompt

Each step is smaller, narrower, and easier to evaluate (CC11). Smaller prompts allow cheaper models — the summary can be Haiku even if the final routing needs Sonnet. Chains are easier to debug because every intermediate output is visible.

Parallelization Workflow — Many Calls at Once

Two flavors:

1. Sectioning — one task per piece

Split the work, run each piece in its own call, merge the results.

import asyncio
from anthropic import AsyncAnthropic

aclient = AsyncAnthropic()

async def review_file(path: str, src: str) -> dict:
    r = await aclient.messages.create(
        model="claude-sonnet-4-6", max_tokens=512, temperature=0,
        system="Review this file. Return JSON: {issues: [{line, severity, msg}]}",
        messages=[{"role": "user", "content": f"# {path}\n\n{src}"}],
    )
    return {"file": path, "review": r.content[0].text}

async def review_pr(files: dict[str, str]) -> list[dict]:
    return await asyncio.gather(*[review_file(p, s) for p, s in files.items()])

# Sequential: 10 files * 4s = 40s. Parallel: ~5s.

2. Voting — same task, multiple calls, majority wins

Run the same prompt N times (different temperatures or seeds), aggregate. Useful for high-stakes classifications where occasional flips matter.

async def classify_with_vote(text: str, n: int = 5) -> str:
    async def one():
        r = await aclient.messages.create(
            model="claude-haiku-4-5-20251001", max_tokens=8, temperature=0.3,
            system="Classify into: spam | ham | unsure", messages=[{"role":"user","content":text}],
        )
        return r.content[0].text.strip().lower()
    votes = await asyncio.gather(*[one() for _ in range(n)])
    return max(set(votes), key=votes.count)   # majority
Parallel limits

Anthropic enforces per-account rate limits. If you fan out 100 parallel calls, you'll get 429s. Use a semaphore (e.g. asyncio.Semaphore(20)) to cap concurrency. For very high volumes, use the Batch API — 50% off, 24h SLA, perfect for offline parallel work.

Routing Workflow — First Call Picks the Path

Use one cheap LLM call to classify the request, then dispatch to a specialized handler. Avoids putting every system prompt in front of every request.

Routing for a code-help bot
ClassHandler
"explain code"Sonnet, system: explainer prompt
"debug error"Sonnet, with tools: read file, run tests
"refactor"Opus + extended thinking; multi-file plan
"trivia / capital city"Haiku, no tools
ROUTE_TOOL = [{
    "name": "route",
    "description": "Pick the handler for the user's question.",
    "input_schema": {
        "type": "object",
        "properties": {"handler": {"type": "string",
            "enum": ["explain", "debug", "refactor", "trivia"]}},
        "required": ["handler"],
    },
}]

def handle(user_q: str) -> str:
    r = client.messages.create(
        model="claude-haiku-4-5-20251001", max_tokens=64, temperature=0,
        tools=ROUTE_TOOL, tool_choice={"type":"tool","name":"route"},
        messages=[{"role":"user","content":user_q}],
    )
    handler = next(b.input["handler"] for b in r.content if b.type=="tool_use")
    return HANDLERS[handler](user_q)   # dispatch to the right specialist
Why routing

Putting all five system prompts in one giant prompt for every request bloats tokens and confuses Claude. A 50-token routing call followed by a focused 4K-token handler call is cheaper, faster, and more accurate. This is exactly how Claude Code's subagent dispatch works under the hood — CC6.

The Agent Loop — When Steps Aren't Predictable

Same loop you saw in CC8, formalized:

def agent(goal: str, tools: list, max_iters: int = 30, max_cost: float = 1.0) -> str:
    history = [{"role": "user", "content": goal}]
    spent = 0.0
    for step in range(max_iters):
        r = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=4096,
            tools=tools, messages=history,
        )
        history.append({"role": "assistant", "content": r.content})
        spent += estimate_cost(r.usage)
        if spent > max_cost: return "[BUDGET EXCEEDED]"
        if r.stop_reason != "tool_use":
            return next(b.text for b in r.content if b.type == "text")
        results = [{"type":"tool_result", "tool_use_id":b.id,
                    "content": str(dispatch(b.name, b.input))}
                   for b in r.content if b.type == "tool_use"]
        history.append({"role": "user", "content": results})
    return "[MAX ITERS]"

The four termination conditions

  1. Natural: stop_reason == "end_turn" — Claude says "done."
  2. Iteration cap: too many loops — bug in tools or prompt.
  3. Cost cap: dollar budget exceeded — abort before runaway spend.
  4. Time cap: wallclock budget exceeded — UX requirement.
Agents need three caps. Always.

An unbounded agent loop is a footgun: a buggy tool returning {"error": "try again"} can spin forever, billing the whole time. Iteration + cost + time caps make every agent loop bounded by something. Pick all three for production.

Environment Inspection

Agents perform better when they look around before acting. Give them tools to inspect the environment — list files, check git status, fetch a URL's status code — and prompt them to use these tools first.

SYSTEM = """You are a debugging agent. Before fixing anything:
1. Use list_files to see what's in the project.
2. Use read_file to inspect any file you'll change.
3. Use git_status to confirm clean working tree.
ONLY THEN propose changes."""

# Tools: list_files, read_file, git_status, write_file, run_tests
Why it works

An agent without inspection tools either hallucinates the environment or refuses to act. With inspection tools, it grounds itself in reality first. This is exactly the pattern Claude Code itself uses internally — the CLI starts every session with implicit environment awareness (cwd, git status, recent files).

Computer Use — Driving the GUI

Technical Definition

Computer use is a Claude capability where the model takes a screenshot of a desktop or browser, decides where to click and what to type, and emits actions like {"action": "left_click", "coordinate": [x, y]} or {"action": "type", "text": "..."}. Your client takes the screenshot, executes the action, takes another screenshot, and feeds it back. Same agent-loop pattern as tool use — tools happen to be GUI primitives.

The three computer-use tools

ToolPurpose
computer_20250124Screenshot, click, type, scroll, key combos.
text_editor_20250124Open, view, edit text files (same as in CC8).
bash_20250124Run shell commands.

Minimal request

resp = client.messages.create(
    model="claude-sonnet-4-6", max_tokens=2048,
    tools=[
        {"type": "computer_20250124", "name": "computer",
         "display_width_px": 1920, "display_height_px": 1080, "display_number": 1},
        {"type": "text_editor_20250124", "name": "str_replace_editor"},
        {"type": "bash_20250124", "name": "bash"},
    ],
    extra_headers={"anthropic-beta": "computer-use-2025-01-24"},
    messages=[{"role": "user", "content":
        "Open Chrome, search for 'anthropic computer use docs', "
        "open the first result, and screenshot the title."}],
)
# Claude emits a sequence of tool_use blocks: take_screenshot, left_click, type, ...
# Your client runs each on the actual desktop and feeds back the next screenshot.

When to reach for computer use

  • Yes: automating apps that have no good API (legacy desktop tools, internal tools without integration), QA testing GUIs, browser flows where Selenium is too brittle.
  • No: any task with a real API. Computer use is dramatically slower and more error-prone than direct API calls. If the website has an API, use the API.
Security: run it in a VM

Claude with computer use can click anywhere, type anywhere, run shell commands. A prompt-injection attack in a webpage can convince it to do unintended things. Always run computer use inside a sandboxed VM or container with no access to credentials, secrets, or the internet beyond what the task requires. Anthropic ships a reference Docker image; use it.

Latency & cost reality

A multi-screen task can take 30+ tool round trips, each with a screenshot (large image input). Costs add up fast. For browser automation specifically, prefer Playwright + a small Claude classification call when possible — cheaper, faster, more reliable.

Mapping to Claude Code

The patterns aren't abstract — you've used them all in Claude Code:

PatternWhere in Claude Code
Agent loopThe CLI itself. Every session is one agent loop until exit.
RoutingSubagent dispatch (CC6) — description matching picks the specialist.
ChainingSlash commands (CC5) — multi-step prompts that run sequentially.
ParallelizationBackground subagents and worktrees (CC14) — multiple Claude sessions simultaneously.
Environment inspectionBuilt-in: cwd, git status, recent files — CLI surfaces these by default.
Computer useNot a CLI feature; you can wrap it in an MCP server (CC9) if you need GUI work.
The unifying insight

Every "Claude Code feature" you've learned is one of these patterns specialized for development workflows. CLAUDE.md is environment inspection (Claude inspects rules at session start). Subagents are routing + isolation. Hooks are a workflow gate around the agent loop. Once you see the patterns, the CC features compose — and you can build new ones.

Hands-On Lab — Replace an Agent with a Workflow

You'll start with an agent that "reviews a PR" by looping with tools, measure its cost and accuracy, then refactor to a 3-step chain that does the same job. Often the chain wins on both axes — this lab proves it.

Step 1 — The agent baseline

# review_agent.py
TOOLS = [
    {"name": "list_files", "description": "List changed files in the PR.",
     "input_schema": {"type":"object","properties":{},"required":[]}},
    {"name": "read_file", "description": "Read a file.",
     "input_schema": {"type":"object","properties":{"path":{"type":"string"}},"required":["path"]}},
    {"name": "submit_review", "description": "Submit final review.",
     "input_schema": {"type":"object","properties":{
         "verdict":{"type":"string","enum":["approve","request_changes"]},
         "summary":{"type":"string"}}, "required":["verdict","summary"]}},
]
# Run the loop from CC13's agent() function. Track total tokens + tool calls.

Step 2 — The chain alternative

# review_chain.py
def review_chain(diff: str, files: dict[str,str]) -> dict:
    # Step A: Summarize the diff (Haiku)
    summary = client.messages.create(
        model="claude-haiku-4-5-20251001", max_tokens=300, temperature=0,
        system="Summarize a PR diff in 5 bullets max.",
        messages=[{"role":"user","content":diff}],
    ).content[0].text

    # Step B: Per-file review in parallel (Sonnet)
    async def review_one(path, src):
        r = await aclient.messages.create(
            model="claude-sonnet-4-6", max_tokens=512, temperature=0,
            system="Review one file. Output: severity bullets only.",
            messages=[{"role":"user","content":f"# {path}\n\n{src}"}],
        )
        return r.content[0].text
    file_reviews = asyncio.run(asyncio.gather(*[review_one(p,s) for p,s in files.items()]))

    # Step C: Verdict (Sonnet, structured tool output)
    verdict = client.messages.create(
        model="claude-sonnet-4-6", max_tokens=200, temperature=0,
        tools=VERDICT_TOOL, tool_choice={"type":"tool","name":"submit_review"},
        messages=[{"role":"user","content":
            f"<summary>{summary}</summary>\n<reviews>{file_reviews}</reviews>"}],
    )
    return next(b.input for b in verdict.content if b.type=="tool_use")

Step 3 — Compare on 10 PRs

Run both on a 10-PR test set (CC11 lab). Track:

  • Total tokens (input + output) per PR
  • Wallclock time per PR
  • Verdict accuracy (vs. your hand labels)
  metric        agent     chain
  ----------------------------------
  avg tokens    18,400    9,800
  avg time      24s       6s   (parallel files)
  accuracy      82%       86%

Numbers are illustrative but the shape is real: the chain typically halves cost, beats latency badly, and hits or beats accuracy because each smaller call is more focused.

Step 4 — When NOT to switch

If your task involves "follow the trail until you find the bug" — where step 4's existence depends on step 3's output — you can't replace the agent with a chain. Keep the loop, add caps.

Lab complete — what you should have

Two implementations of the same task and a comparison sheet. The exercise that pays off most: every time you reach for "agent," ask if you can structure it as a chain first. If yes, you'll likely save half the cost and most of the latency.

Knowledge Check

1. You can decompose your task into 3 fixed steps. Should you build an agent or a workflow?

A
Agent — agents are more capable.
B
Workflow — deterministic structure means cheaper, faster, easier to debug.
C
Doesn't matter.
D
Both, in parallel.
Correct. If you know the steps, encode them in code, not in the model. You'll get predictable cost, easier debugging, and better evals (each stage tested independently).
Look again. Workflow is the right default whenever the steps are predictable. Save agents for genuinely open-ended tasks.

2. You're fanning out 50 parallel calls and getting 429 errors. Best fix?

A
Increase max_tokens.
B
Use a semaphore to cap concurrency, or switch to the Batch API for offline work.
C
Switch to Opus.
D
Disable streaming.
Correct. 429s mean you're hitting per-account rate limits. Cap parallelism with a semaphore for online work; for large offline batches, the Batch API is cheaper (50% off) and queues for you.
Look again. Rate limits need explicit concurrency control (semaphore) or use of the Batch API.

3. Your agent loop hangs occasionally. What did you forget?

A
To set max_tokens.
B
Iteration / cost / time caps. A buggy tool can spin forever.
C
To enable streaming.
D
To use Opus.
Correct. Production agents need bounded loops: max iterations, max cost, max wallclock. Without all three, a tool returning "try again" or a misbehaving model can rack up unbounded charges.
Look again. Production agents must have iteration / cost / time caps. Without them, runaway loops are a foregone conclusion.

4. A user asks "what's the weather in Tokyo?" through your routing-based bot. The router should send it to:

A
The big-Opus refactor handler.
B
The debugging handler with full tools.
C
A specialized weather handler — or the trivia handler if no weather one exists. Definitely not Opus + tools.
D
The agent loop.
Correct. Routing is about right-sizing the handler. Cheap question, cheap handler. Sending weather queries to an Opus + multi-tool agent is wasted compute.
Look again. The point of routing is to match the task to the cheapest sufficient handler. Trivia goes to the cheap path.

5. You want Claude to fill in a form on a legacy desktop app with no API. What's appropriate?

A
Build a chain workflow.
B
Computer use, in a sandboxed VM, with iteration + cost caps.
C
RAG.
D
Computer use without a sandbox — just trust Claude.
Correct. Legacy GUI with no API is the textbook computer-use case. Always sandbox, always cap loops — computer use is the easiest way to spend a lot of tokens fast if the model gets confused.
Look again. No API + GUI = computer use. But you must sandbox it (security) and cap iterations (cost).

Module Summary

  • Workflow vs agent: workflow when steps are knowable; agent when not. Default to workflow.
  • Chaining: sequential calls, smaller models per step where possible. Output of A is input of B.
  • Parallelization: sectioning (split work) + voting (same work, multiple times, aggregate). Cap concurrency; use Batch API for big offline.
  • Routing: cheap classifier picks the specialist. Avoid one-prompt-fits-all bloat.
  • Agent loop: tool_use loop with three caps — iteration, cost, time. Always.
  • Environment inspection: give the agent tools to look around; prompt it to do so first.
  • Computer use: screenshot + click + type. For legacy GUIs and APIs that don't exist. Sandbox in a VM. If the API exists, prefer the API.
  • Claude Code maps these patterns: subagents = routing, slash commands = chains, worktrees + background = parallelization, the CLI itself = agent loop.