Building AI Agents with Claude
Track 9 — Certification Prep Module 28 of 30
75–90 min Intermediate → Advanced

M25: Claude Code Mastery

Master Claude Code configuration, workflows, and CI/CD integration — covering Domain 3 of the Claude Certified Architect exam (~20% of the exam).

Learning Objectives

  • Configure CLAUDE.md files at user, project, and directory levels with correct cascade behavior
  • Build custom slash commands and skills with context isolation and tool restrictions
  • Choose between plan mode and direct execution based on task complexity and risk
  • Use Claude Code's built-in tools (Read, Write, Edit, Bash, Grep, Glob) effectively
  • Integrate Claude Code into CI/CD pipelines with session isolation for unbiased review
  • Use the Message Batches API for high-volume, cost-efficient processing
  • Choose the right permission mode (default, acceptEdits, plan, bypassPermissions) for the risk profile of each session
  • Compose long-running and parallel work using /loop, /schedule, worktrees, and the two-Claude review pattern

Claude Code Is an Orchestration Framework, Not a Coding Assistant

Most developers treat Claude Code like a faster Stack Overflow — type a question, get code, copy-paste. That's the wrong mental model. Claude Code is an agent orchestration framework that happens to be excellent at coding. Once you see the layers, you stop fighting the tool and start composing it.

The Six Layers of Claude Code
You (the architect: prompts, decisions, review) direct Claude Code, the main agent that reads, writes, executes, and loops. The main agent composes six layers:

  • CLAUDE.md: persistent memory + rules
  • Skills: on-demand knowledge
  • Commands: reusable workflows
  • Hooks: deterministic automation
  • Subagents: parallel specialists
  • MCP Servers: external tool connections (GitHub · Postgres · Slack · Jira)

Subagents reach external tools through MCP. Most developers only use Layer 0 (CLAUDE.md). Experts compose all six.

Why It Matters

The compounding effect is what separates a "fancy autocomplete" from a programmable engineering team. CLAUDE.md alone makes Claude smarter about your codebase. Hooks alone make every change auto-linted. Subagents alone give you parallel specialists. Combined, they let you describe a feature once and have Claude implement, review, security-audit, and PR it without you copy-pasting between sessions. This module teaches you each layer; the rest is composition.

Claude Code vs The Rest of the AI Coding Landscape

The AI coding landscape splits into three tiers based on autonomy. Knowing where Claude Code sits relative to Copilot, Cursor, and SWE-agents like Devin is how you decide which tool fits which task — and why composing Claude Code's six layers gives you something the others structurally can't.

The AI Coding Spectrum — Where Each Tool Sits
  • Completion tools (stateless, per-suggestion; "you are the glue"): GitHub Copilot (line/function completion, IDE-native, stateless) and Cursor (file-level edits, limited agentic loops).
  • Agentic tools (you are here): Claude Code (project-level autonomy, multi-file edits, full CLI + IDE + web). Subagents + Hooks + MCP add parallel specialists, deterministic enforcement, and live external tools. Human-steered, autonomous loops: "hand off complete tasks, steer outcomes."
  • Full autonomy: Devin and SWE-agents (full autonomy, expensive, limited HITL). "Submit, walk away": harder to steer, opaque and costly, still evolving.

Copilot & Cursor are tools you use. Claude Code is a system you configure and orchestrate.

The Philosophical Difference

Copilot and Cursor are tools you use — they cap out at making you faster at your current workflow. Claude Code is a system you configure and orchestrate — it can transform what your workflow is. Devin and SWE-agents trade steerability for autonomy — great for narrow benchmarks, harder to course-correct mid-task. The Claude Code sweet spot: autonomous enough to hand off complete features, steerable enough that you stay in the architectural loop.

Every Claude Code session starts from zero memory of your project. CLAUDE.md is the only file loaded automatically every time — it is how you give Claude permanent memory. Get this right and the other five layers build on a solid foundation.

CLAUDE.md Configuration Hierarchy

Everyday Analogy

CLAUDE.md files work like CSS cascading. Think of three layers: a website's global stylesheet (body tag defaults), a page-level class (overrides for specific sections), and an inline style (overrides for a single element). The most specific rule always wins.

Without cascading, every developer on a team would need to copy-paste the same rules into every project and every directory. Changes would mean updating dozens of files. Worse, personal preferences (like tab width or test framework) would conflict with team standards.

CLAUDE.md cascading solves this: user-level is your personal global stylesheet (applied everywhere you use Claude Code), project-level is the team's class rules (committed to git, shared by everyone), and directory-level is the inline override (specific rules for specific code areas). More specific always wins, and all levels combine together.

Here is what the cascade looks like in practice. Suppose your user-level file says indent: 4 spaces, but the project says indent: 2 spaces, and the API directory says indent: tabs. When Claude Code edits a file in src/api/, the effective rule is indent: tabs — the directory-level wins. When it edits a file in src/models/ (no directory-level override), the project rule wins: indent: 2 spaces.
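The cascade above can be pictured as a general-to-specific dictionary merge. The sketch below is only a mental model: Claude Code loads these files as context rather than resolving key/value pairs, and the `RULES` table and `effective_rules` helper are hypothetical names introduced for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical rule sets, keyed by the directory each CLAUDE.md governs.
# "~" stands for the user-level file and "." for the project root.
RULES = {
    "~": {"indent": "4 spaces", "line_length": "100"},
    ".": {"indent": "2 spaces"},
    "src/api": {"indent": "tabs"},
}

def effective_rules(file_path: str) -> dict:
    """Merge rules from general to specific; more specific values win."""
    merged = dict(RULES["~"])  # user level is the base layer
    # parents runs deepest-first and ends with ".", so reverse it
    # to apply the project root first and the closest directory last.
    for parent in reversed(list(PurePosixPath(file_path).parents)):
        merged.update(RULES.get(str(parent), {}))
    return merged

print(effective_rules("src/api/handlers.py")["indent"])   # tabs
print(effective_rules("src/models/filing.py")["indent"])  # 2 spaces
```

Note that `line_length` survives from the user level in both cases: levels combine, and only conflicting keys are overridden.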

Technical Definition

CLAUDE.md is a markdown configuration file that tells Claude Code how to behave in a specific context: coding conventions, project context, and tool restrictions. Named after the AI model, it is placed at strategic locations in your file system to control behavior at different scopes. There are three levels, and they merge together with more-specific levels overriding more-general ones:

User-level (~/.claude/CLAUDE.md): Your personal preferences that apply to every project. Things like: "I prefer 2-space indentation," "always use TypeScript strict mode," "run tests after every code change." These are YOUR rules, not your team's.

Project-level (./CLAUDE.md or .claude/CLAUDE.md): Team standards committed to git. Things like: "This project uses PostgreSQL 15," "API routes follow the pattern /api/v1/{resource}," "use the company's internal auth library for all endpoints." Everyone on the team gets these rules automatically.

Directory-level (src/api/CLAUDE.md): Path-specific overrides for specialized code areas. For example, the API directory might say "all handlers must validate input with Zod schemas" while the database directory says "all queries must use parameterized statements."

You can also use @path imports (for example, @docs/architecture.md) to pull in shared rules from other files, and create topic-specific rule files in .claude/rules/ for better organization.

CLAUDE.md Cascade — More Specific Wins
  • ~/.claude/CLAUDE.md (user-level, personal): indent: 4 spaces
  • .claude/CLAUDE.md (project-level, team): indent: 2 spaces
  • src/api/CLAUDE.md (directory-level, path-specific): indent: tabs

Each level merges with the one above it. Effective config in src/api/: indent: tabs. The directory rule wins because it is the most specific.

File Location Hierarchy — Where to Put What
  • ~/.claude/CLAUDE.md (global, ALL sessions): your personal preferences
  • ./CLAUDE.md (project root, commit to git): team standards, shared rules
  • ./CLAUDE.local.md (personal, add to .gitignore): private overrides for this project
  • ./src/api/CLAUDE.md (on-demand, scoped): loads only when working in this dir

Precedence runs from general to specific: more-specific files override more-general ones, and rules merge top-to-bottom.

⚠️ The Instruction Budget Rule

CLAUDE.md has a practical limit of roughly 150–200 instructions. The Claude Code system prompt already consumes ~50. Every line you add that Claude doesn't actually need dilutes the lines that matter, and files over ~200 lines cause instruction dropout — Claude decides parts "may not be relevant" and silently ignores them.

The CLAUDE.md test: before committing any line, ask: "Would Claude make a mistake on my codebase without this?" If the answer is no, delete the line. Document only what Claude gets wrong, not what it already does correctly.
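The size side of the budget rule is easy to check mechanically. Here is a minimal sketch, assuming the ~200-line threshold quoted above and treating each markdown bullet as one instruction (a rough proxy, not how Claude counts). `audit_claude_md` is a hypothetical helper, not part of Claude Code.

```python
def audit_claude_md(text: str, max_lines: int = 200) -> dict:
    """Count non-blank lines and bullet rules, and flag files past the line limit."""
    lines = [line for line in text.splitlines() if line.strip()]
    bullets = [line for line in lines if line.lstrip().startswith(("- ", "* "))]
    return {
        "non_blank_lines": len(lines),
        "bullet_rules": len(bullets),
        "over_limit": len(lines) > max_lines,
    }

sample = """# Project: Demo

## Things Claude tends to get wrong here
- Always use ESM imports (not CommonJS require)
- All DB queries go through the service layer, never directly in routes
"""

report = audit_claude_md(sample)
print(report)
```

A check like this only catches length. Whether a given line passes the CLAUDE.md test still needs a human judgment.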

The Four Anti-Patterns That Kill Compliance

Every team that struggles with "Claude isn't following our CLAUDE.md" is hitting one of these four failure modes. Each one is a budget waste that pushes the file past the dropout threshold or leaves Claude with nowhere to go.

❌ Anti-Pattern 1 — Stuffing everything in one file

Files over ~200 lines trigger instruction dropout: Claude silently decides chunks "may not be relevant" and ignores them. Fix: push module-specific rules into directory-level CLAUDE.md (the file-location hierarchy above). Root file stays under 150 lines.

❌ Anti-Pattern 2 — Documenting what Claude already does right

"Use TypeScript" in a TypeScript project is wasted budget — Claude already does that. Every line you spend confirming default behavior is a line you didn't spend on a correction. Fix: spend budget on corrections, never confirmations. Run the CLAUDE.md test on every line.

❌ Anti-Pattern 3 — Vague prohibitions

"Never use the --foo-bar flag" leaves Claude stuck — it knows what not to do but has nowhere to go. Fix: always pair a prohibition with an alternative. "Never use --foo-bar; prefer --baz instead because it handles unicode correctly." Now Claude has somewhere to land.

❌ Anti-Pattern 4 — Using CLAUDE.md for enforced behaviors

CLAUDE.md is advisory — the model can deprioritize lines under context pressure. If something must always happen (attribution, permission scopes, model selection, hooks), put it in settings.json. Rule: CLAUDE.md is for guidance. settings.json is for guarantees. The hook system you'll see in M26 is how you turn a "rule" into deterministic enforcement.
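Of the four anti-patterns above, the vague prohibition (Anti-Pattern 3) lends itself to a quick heuristic lint. The sketch below flags lines that prohibit something without offering an alternative. The keyword lists are assumptions chosen for illustration and will miss or over-flag some phrasing, so treat the output as a review prompt rather than a verdict.

```python
import re

# Assumed keyword lists: tune these to your team's phrasing.
PROHIBITION = re.compile(r"\b(never|do not|don't|avoid)\b", re.IGNORECASE)
ALTERNATIVE = re.compile(r"\b(prefer|instead|rather)\b", re.IGNORECASE)

def vague_prohibitions(text: str) -> list:
    """Return lines that say what not to do but give Claude nowhere to go."""
    flagged = []
    for line in text.splitlines():
        if PROHIBITION.search(line) and not ALTERNATIVE.search(line):
            flagged.append(line.strip())
    return flagged

sample = """- Never use the --foo-bar flag
- Never use --foo-bar; prefer --baz instead because it handles unicode correctly
- Run tests: npm test
"""

for line in vague_prohibitions(sample):
    print("needs an alternative:", line)
```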

Minimum Viable CLAUDE.md

Before adding rules, start here. This is the under-30-line template that captures everything Claude actually needs and nothing it doesn't. Resist the urge to expand it until you've shipped a feature and noticed Claude getting something wrong.

# Project: [Name]

## Stack
- Node.js 22, TypeScript 5.4, Fastify 4
- PostgreSQL 16 + Drizzle ORM
- Redis 7 for caching
- Jest for testing

See @package.json for all dependencies.
See @docs/architecture.md for system design.

## How to work on this project
- Run tests: `npm test`
- Run single test: `npm test -- --testPathPattern=auth`
- Typecheck: `npm run typecheck`
- Lint: `npm run lint`

## Things Claude tends to get wrong here
- Always use ESM imports (not CommonJS require)
- Redis keys must include version prefix: `v2:user:{id}:...`
- Auth middleware must run BEFORE rate limiting in route registration
- All DB queries go through the service layer, never directly in routes

## Git workflow
- Never commit to main directly
- Branch naming: `feat/`, `fix/`, `chore/`
- Commit messages: conventional commits format

Why This Template Works

Notice three discipline patterns. (1) The Stack section references package.json via @import instead of duplicating dependency lists. (2) "How to work on this project" gives Claude the exact commands — not "run tests somehow," but npm test -- --testPathPattern=auth. (3) "Things Claude tends to get wrong" is named that way on purpose — it forces every line to pass the CLAUDE.md test before being added. Under 30 lines, all signal, zero dropout risk.

Let's look at what a real-world project-level CLAUDE.md contains. Notice how it is not just formatting preferences — it includes the tech stack, database rules, API patterns, and @import directives that pull in separate rule files for security and testing. This is the kind of context that saves Claude from making wrong assumptions about your project.

# UCC Pipeline Project — Claude Code Configuration

## Project Context
This is a UCC (Uniform Commercial Code) filing data pipeline.
Stack: Python 3.12, FastAPI, PostgreSQL 15, Redis, Docker.
Architecture: Medallion (Bronze → Silver → Gold layers).

## Coding Conventions
- Use type hints on ALL function signatures
- Docstrings: Google style (Args/Returns/Raises)
- Tests: pytest with fixtures, minimum 80% coverage
- Error handling: never catch bare `except:`, always specific exceptions
- Logging: structured JSON via structlog

## Database Rules
- ALL queries must use parameterized statements (no f-strings in SQL)
- Migrations via Alembic — never modify schema directly
- Table names: snake_case, singular (e.g., `ucc_filing`, not `ucc_filings`)

## API Patterns
- Routes: /api/v1/{resource}
- Auth: Bearer token via X-API-Key header
- Responses: always wrap in {data, error, metadata}
- Validation: Pydantic models for all request/response bodies

@.claude/rules/security.md
@.claude/rules/testing.md

User-level file (~/.claude/CLAUDE.md):

# Personal Claude Code Preferences

## Editor Preferences
- Indentation: 2 spaces (overridden by project if different)
- Line length: 100 characters max
- Trailing commas: always in multi-line structures

## Workflow Preferences
- Run tests after every code change
- Show git diff before committing
- Prefer small, focused commits over large batches
- Always explain WHY a change was made, not just WHAT

## Communication Style
- Be concise — skip preamble
- When showing code, highlight the changed lines
- If unsure, ask before making assumptions

What Just Happened?

You saw how CLAUDE.md at two levels serves different purposes. The project-level file contains team-wide rules that everyone shares (database patterns, API conventions, tech stack). The user-level file contains personal preferences (indentation, commit style). When Claude Code loads, it merges all applicable files, with more-specific rules winning conflicts.

🎓 Cert Tip — Domain 3.1

Anti-pattern: putting personal editor preferences (indentation, line length) in the project-level .claude/CLAUDE.md. These belong in user-level ~/.claude/CLAUDE.md. The exam tests whether you can distinguish personal preferences from team standards and place each at the correct level.

You now know how to configure Claude Code's behavior at multiple levels. But configuration is passive — it sets rules. Commands and skills are active — they define reusable workflows you can trigger. Let's look at how they differ.

Custom Slash Commands vs Skills

Everyday Analogy

Commands are like speed-dial buttons on a phone. You press the button, it dials a specific number. You have to press it manually every time. It is fast and predictable, but it only does exactly what you programmed.

The pain with only having commands is that complex exploration tasks dump noise into your main conversation. If you run a command that reads 50 files looking for a pattern, all of those file contents are now in your context window — cluttering up the conversation you were having.

Skills solve this. A skill with context: fork is like delegating a task to an assistant in another room. They do their research, come back with a summary, and your workspace stays clean. The exploration noise stays in the forked context; only the result comes back to you.

Here is what that looks like in your terminal. Without a fork, you would see: [Read file_1.py] ... 200 lines ... [Read file_2.py] ... 350 lines ... [Read file_3.py] ... 180 lines ... Result: 3 issues found — all 730 lines dumped into your session. With context: fork, you see: [Skill: entity-resolution running...] ... Result: 3 issues found. Exact matches: 2 (filing NY-2024-001234, NY-2024-005678). Likely match: 1 ("ACME CORP" → 87% similarity). Just the summary. Clean.

Technical Definition

Slash commands: Markdown files placed in .claude/commands/ that create custom / commands in Claude Code. When you create .claude/commands/deploy.md, it becomes the /deploy command, which you invoke explicitly by typing /deploy. The file contains the prompt that Claude will execute. You can use $ARGUMENTS placeholders for user input and YAML frontmatter for configuration.

Each command file supports YAML frontmatter at the top with four optional fields: allowed-tools (list the tools the command is pre-approved to use without a per-use prompt), argument-hint (show users what input to provide), model (override the default model), and description (explain what the command does in the / menu).
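To make the file layout concrete, here is a minimal sketch that splits a command file into its frontmatter fields and prompt body. It is a deliberately small stdlib-only parser that handles just `key: value` pairs and `- item` lists, not full YAML, and it is not how Claude Code itself parses these files. `split_frontmatter` and `SAMPLE` are illustrative names.

```python
def split_frontmatter(text: str):
    """Return (fields, body) for a markdown file with optional --- frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing --- delimiter after the opening one.
    end = next((i for i, ln in enumerate(lines[1:], start=1) if ln.strip() == "---"), None)
    if end is None:
        return {}, text
    fields, key = {}, None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(fields.get(key), list):
            fields[key].append(stripped[2:].strip())   # list item under the last key
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            fields[key] = value if value else []       # empty value starts a list
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return fields, body

SAMPLE = """---
description: Review a UCC filing parser for correctness and edge cases
allowed-tools:
  - Read
  - Grep
  - Glob
argument-hint: filing_type (e.g., UCC-1, UCC-3)
---

# Review Filing Parser

Review the parser implementation for `$ARGUMENTS` filings.
"""

fields, body = split_frontmatter(SAMPLE)
print(fields["allowed-tools"])
print(body.splitlines()[0])
```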

Skills: A directory in .claude/skills/ containing a SKILL.md file. Skills are enhanced slash commands and the modern replacement for commands when you need isolation or automatic triggering. They have all the capabilities of commands plus two critical extras.

First, context: fork — the skill runs in an isolated context window, so exploration noise does not pollute your main session. Second, automatic invocation — Claude can invoke the skill automatically when the user's request matches the skill's description. You do not need to type a slash command; Claude recognizes the intent and triggers the skill on its own.

The key distinction is context isolation. When a skill runs with context: fork, it gets a separate context window. All the files it reads, the searches it performs, and its intermediate reasoning stay in the forked context; only the final result is returned to the main session. Without isolation, a command that reads 50 files dumps all that content into your main conversation. With context: fork, the skill reads those files in its own context, analyzes them, and returns only the summary. Your main session stays focused.
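The difference can be shown with a toy model in which a context is just a list of messages. This is a conceptual sketch only; the function names are hypothetical and nothing here reflects Claude Code internals.

```python
def run_in_main(context: list, files: dict) -> str:
    """Command-style run: every file read lands in the caller's context."""
    for name, content in files.items():
        context.append(f"[Read {name}] {content}")
    summary = f"Result: scanned {len(files)} files"
    context.append(summary)
    return summary

def run_forked(main_context: list, files: dict) -> str:
    """Skill-style run: work happens in a throwaway context."""
    forked_context = []                      # separate context window
    summary = run_in_main(forked_context, files)
    main_context.append(summary)             # only the summary crosses back
    return summary

files = {
    "file_1.py": "...200 lines...",
    "file_2.py": "...350 lines...",
    "file_3.py": "...180 lines...",
}
shared, isolated = [], []
run_in_main(shared, files)
run_forked(isolated, files)
print(len(shared), len(isolated))  # 4 1
```

The shared context ends up holding three file reads plus the result, while the isolated one holds only the result.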

Command (shared context) vs Skill (forked context)
Command (no fork)
  • Main context: user question, then /review-filing PO-8847
  • Read file_1.py (200 lines), Read file_2.py (350 lines), Read file_3.py (180 lines)
  • Result: 3 issues found
  • Context: 730+ lines of noise

Skill (context: fork)
  • Main context: user question, then /review-filing PO-8847
  • [forked] Reading files... [forked] Analyzing...
  • Result: 3 issues found
  • Context: clean & focused

Here is a skill definition (.claude/skills/entity-resolution/SKILL.md) that uses context isolation and a read-only allowed-tools list, followed by a simpler command file.

---
name: entity-resolution
description: Analyze entity names across UCC filings to find matches and variations. Use when asked to resolve, match, or deduplicate entity names.
context: fork
allowed-tools:
  - Read
  - Grep
  - Glob
---

# Entity Resolution Analysis

You are analyzing UCC filing entity names to find matches.

## Task
Given entity name `$ARGUMENTS`, search the codebase for:
1. Exact matches in filing records
2. Variations (abbreviations, misspellings, legal suffixes)
3. Related entities (parent/subsidiary relationships)

## Process
1. Use Glob to find all filing data files
2. Use Grep to search for the entity name and common variations
3. Read relevant files to extract full entity records
4. Return a structured summary:
   - Exact matches (count, filing numbers)
   - Likely matches (similarity score, reasoning)
   - Recommended canonical name

## Constraints
- Do NOT modify any files
- Do NOT run any shell commands
- Return only the analysis summary to the main session

A simpler command file in .claude/commands/, with no context fork:

---
description: Review a UCC filing parser for correctness and edge cases
allowed-tools:
  - Read
  - Grep
  - Glob
argument-hint: filing_type (e.g., UCC-1, UCC-3)
---

# Review Filing Parser

Review the parser implementation for `$ARGUMENTS` filings.

## Checklist
1. Read the parser source file for this filing type
2. Check: Does it handle all required fields?
3. Check: Are edge cases covered? (missing fields, malformed dates, encoding issues)
4. Check: Does it match the expected output schema?
5. Check: Are there tests? Do they cover edge cases?

## Report Format
- **Status**: PASS / FAIL / NEEDS REVIEW
- **Issues found**: list with severity (critical/warning/info)
- **Missing test coverage**: list of untested scenarios
- **Suggested fixes**: code snippets for critical issues

What Just Happened?

You created two reusable Claude Code workflows. The skill uses context: fork to run entity resolution in isolation, so it can read and search dozens of files without polluting your main conversation. The command is simpler and suits a focused review task where context pollution is manageable. Both list only read-only tools in allowed-tools. Keep in mind that this field pre-approves the listed tools rather than blocking the others, so pair it with deny rules in permissions when shell access must actually be blocked.

🎓 Cert Tip — Domain 3.2

Anti-pattern: using commands for complex exploration tasks that read many files. Use skills with context: fork and a narrow allowed-tools list instead. The exam tests whether you can identify when context isolation is needed and choose the right mechanism.

SKILL.md Frontmatter — The Full Spec

The skill and command above used five frontmatter fields between them (name, description, context, allowed-tools, argument-hint). The full SKILL.md frontmatter has 15 fields, all optional; only description is recommended. Skills also unify what used to be two separate concepts: custom commands have been merged into skills. A file at .claude/commands/deploy.md and a skill at .claude/skills/deploy/SKILL.md both create /deploy and use the same frontmatter; the skill format adds optional supporting files, automatic invocation, and subagent execution.

SKILL.md Frontmatter — All 15 Fields
Field: what it does

  • name: Display name (defaults to dir name); lowercase + hyphens, max 64 chars
  • description: When to use it; this drives auto-invocation. Front-load the use case; description + when_to_use is capped at 1,536 chars in the skill listing
  • when_to_use: Trigger phrases / example requests, appended to description
  • argument-hint: Autocomplete hint, e.g. [issue-number] or [filename] [format]
  • arguments: Named positional args for $name substitution; arguments: [issue, branch] → $issue, $branch
  • disable-model-invocation: true → only YOU can invoke (no auto-trigger); use for /deploy, /commit, or anything with side effects
  • user-invocable: false → hides from / menu; only Claude can invoke (background knowledge)
  • allowed-tools: Pre-approves tools while the skill is active. It doesn't restrict, it just skips the per-use prompt. Add deny rules in permissions to actually block.
  • model: Model override (sonnet/opus/haiku/full ID/inherit); resets after the turn
  • effort: Thinking budget for the turn: low / medium / high / xhigh / max
  • context: fork → runs in a forked subagent context with no main-conversation visibility
  • agent: Which subagent type executes the fork: Explore, Plan, general-purpose, or any from .claude/agents/
  • hooks: Hooks scoped to this skill's lifecycle; supports once: true for one-shot
  • paths: Glob patterns; auto-load only when working with matching files
  • shell: bash (default) or powershell (Windows; needs CLAUDE_CODE_USE_POWERSHELL_TOOL=1)

String Substitutions — Inject Args, Session ID, Effort, Skill Path

Skills have six substitution variables that get replaced before the body is sent to Claude. Argument positionals support shell-style quoting ("hello world" = one arg).

Variable: resolves to

  • $ARGUMENTS: Full argument string as typed; if absent from body, appended as ARGUMENTS: <value>
  • $ARGUMENTS[N] / $N: Positional arg by 0-based index ($0, $1, …)
  • $name: Named arg from the arguments frontmatter list, mapped by position
  • ${CLAUDE_SESSION_ID}: Current session ID; use for log file paths, correlation IDs
  • ${CLAUDE_EFFORT}: Active effort level; adapt instructions to match (e.g. skip exhaustive checks at low)
  • ${CLAUDE_SKILL_DIR}: Directory containing this skill's SKILL.md; use to invoke bundled scripts regardless of cwd
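The substitution rules listed above can be approximated in a few lines. This is a sketch of the described behavior, not Claude Code's implementation: `render` is a hypothetical function, the substitution order is an assumption, and edge cases (such as an argument that itself contains `$1`) are not handled.

```python
import re
import shlex

def render(body, raw_args, names=(), env=None):
    """Substitute argument and session placeholders in a skill body."""
    args = shlex.split(raw_args)             # shell-style quoting: "hello world" is one arg
    env = env or {}
    append_args = "$ARGUMENTS" not in body   # decided on the unrendered body

    def positional(match):
        index = int(match.group(1))
        return args[index] if index < len(args) else ""

    # ${CLAUDE_*} variables come from the supplied env; unknown ones are left as-is.
    out = re.sub(r"\$\{(CLAUDE_[A-Z_]+)\}",
                 lambda m: env.get(m.group(1), m.group(0)), body)
    out = re.sub(r"\$ARGUMENTS\[(\d+)\]", positional, out)
    out = out.replace("$ARGUMENTS", raw_args)
    out = re.sub(r"\$(\d+)", positional, out)
    # Named args map to positions in the order given by the frontmatter list.
    for i, name in enumerate(names):
        value = args[i] if i < len(args) else ""
        out = re.sub(r"\$" + re.escape(name) + r"\b", lambda m, v=value: v, out)
    if append_args and raw_args:
        out += f"\n\nARGUMENTS: {raw_args}"
    return out

print(render("Review the parser for $ARGUMENTS filings.", "UCC-3"))
print(render("Fix $issue on $branch", "42 dev", names=["issue", "branch"]))
```

The second call has no `$ARGUMENTS` in its body, so the full argument string is appended at the end, matching the first row of the table.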

Inline Shell Injection — !`cmd` Runs Before Claude Sees Anything

Skills (and slash commands) can run shell commands at render time using backtick-quoted !`command` syntax or fenced ```! blocks. The output replaces the placeholder and the rendered prompt is what reaches Claude — this is preprocessing, not Claude executing the command. Perfect for injecting live data (PR diffs, git status, env versions) without round-tripping through tool calls.

---
name: pr-summary
description: Summarize the current pull request using live GitHub data
context: fork
agent: Explore
allowed-tools: Bash(gh *)
---

## PR diff
!`gh pr diff`

## Reviewer comments
!`gh pr view --comments`

## Multi-line block (use ```! ... ```)
```!
node --version
npm --version
git status --short
```

Summarize the changes and flag anything that needs reviewer attention.

When the skill runs, the three commands execute first; their stdout is interpolated into the prompt; Claude receives a fully-rendered text body with the actual PR data already inlined. To disable shell injection for non-bundled skills (managed-policy enforcement), set "disableSkillShellExecution": true in settings — each !`cmd` is replaced with [shell command execution disabled by policy].
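The preprocessing step described above can be sketched as a regex pass over the skill body. This illustration handles only the inline form (not fenced blocks), and `render_skill`, `run_shell`, and the injectable `run` parameter are hypothetical names. The runner is injectable so the example can be exercised without touching a real shell.

```python
import re
import subprocess

DISABLED = "[shell command execution disabled by policy]"
INLINE = re.compile(r"!`([^`]+)`")   # matches !`command`

def run_shell(command: str) -> str:
    """Default runner: execute the command and return its stdout."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout.strip()

def render_skill(body: str, run=run_shell, disabled: bool = False) -> str:
    """Replace each !`cmd` with its output before the prompt reaches the model."""
    def replace(match):
        return DISABLED if disabled else run(match.group(1))
    return INLINE.sub(replace, body)

body = "## PR diff\n!`gh pr diff`\n\nSummarize the changes."
fake_runner = lambda cmd: f"<output of {cmd}>"   # stand-in for a real shell

print(render_skill(body, run=fake_runner))
print(render_skill(body, disabled=True))
```

Because the commands run before the model sees anything, their output deserves the same scrutiny as any other text you paste into a prompt.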

Skill Lifecycle — What Persists, What Doesn't

When invoked, a skill's rendered SKILL.md body enters the conversation as one message and stays there for the rest of the session. Claude does not re-read the skill file on later turns, so write standing instructions, not one-shots. After auto-compaction, the most recent invocation of each skill is re-attached, keeping the first 5,000 tokens of each up to a combined 25,000-token budget; older skill invocations get dropped first.

Description budget: all skill names always load, but descriptions are truncated to fit a budget of 1% of the context window (fallback 8,000 chars; raise via SLASH_COMMAND_TOOL_CHAR_BUDGET).

Live change detection: adds, edits, and removes inside ~/.claude/skills/, project .claude/skills/, or --add-dir targets take effect mid-session. New top-level directories require a restart.
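The re-attachment arithmetic after auto-compaction (5,000 tokens per skill, 25,000 combined, newest kept first) can be worked through with a small sketch. The `reattach` function is hypothetical, and it assumes a skill that no longer fits the remaining budget is dropped whole, which the text does not specify.

```python
def reattach(invocations, per_skill=5000, total=25000):
    """invocations: (skill_name, token_count) pairs ordered oldest to newest."""
    # Keep only the most recent invocation of each skill, tracking recency.
    latest = {}
    for name, tokens in invocations:
        latest.pop(name, None)    # re-insert so dict order follows recency
        latest[name] = tokens
    kept, remaining = [], total
    for name in reversed(list(latest)):       # newest first
        size = min(latest[name], per_skill)   # truncate to the per-skill cap
        if size > remaining:
            break                             # assumption: older skills are dropped whole
        kept.append((name, size))
        remaining -= size
    return kept

# Six skills of 8,000 tokens each: every one is cut to 5,000,
# five fit in the 25,000 budget, and the oldest is dropped.
history = [(f"skill-{i}", 8000) for i in range(6)]
print(reattach(history))
```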

Bundled Skills — What Ships in Every Session

Claude Code includes prompt-based bundled skills that you invoke like any other slash command. Unlike built-in commands (which execute fixed code), bundled skills hand Claude a detailed playbook and let it orchestrate using its tools. They appear in / autocomplete tagged Skill:

Bundled skill: what it does

  • /simplify: Refactors recently-changed code for clarity / consistency / maintainability
  • /batch: Runs the same task across many files / branches / PRs in one shot
  • /debug: Structured debugging playbook (reproduce, isolate, fix, verify)
  • /loop: Background monitoring; reruns a check on a cadence and reports drift
  • /claude-api: Helps build agents on top of the Claude Messages API

A few built-in commands are also reachable through the Skill tool: /init, /review, /security-review. Others like /compact are not. Block all skills with a Skill deny rule, restrict to specific skills with Skill(name) or Skill(name *) in permissions.allow. To make a skill's body invoke extended thinking, include the literal word "ultrathink" anywhere in the skill content.

🎓 Cert Tip — Domain 3

The exam tests three skill-specific discriminations: (1) allowed-tools grants pre-approval, not restriction — to actually block, add to permissions.deny; (2) shell injection runs before Claude sees the prompt — treat !`cmd` output as untrusted input you control, not as Claude's tool output; (3) plugin skills don't conflict with personal/project skills because they use plugin-name:skill-name namespacing, but a skill at the same scope as a custom command of the same name takes precedence over the command.

Commands and skills define WHAT Claude does. But there is a higher-level decision: should Claude plan first, or act immediately? Plan mode is that decision point.

Plan Mode vs Direct Execution

Everyday Analogy

Plan mode is the architect drawing blueprints before construction begins. You see the design, review it, suggest changes, and only then does building start. Direct execution is the builder who picks up a hammer and starts working immediately — fast, efficient, and exactly right when the task is clear.

The pain of always planning is wasted time on trivial tasks: you do not need blueprints to hang a picture frame. The pain of never planning is disaster on complex tasks: building a house without blueprints means load-bearing walls in wrong places and plumbing that does not connect.

The real skill is knowing WHEN each mode is appropriate. A simple variable rename? Direct execution. A complex database schema refactor across 30 files? Plan mode. The decision criteria are: complexity, familiarity with the codebase, risk level, and reversibility of the change.

Under the hood, plan mode changes how Claude Code processes your request. Normally, Claude reads your prompt and immediately starts calling tools — reading files, making edits, running tests. In plan mode, Claude still reads files and analyzes the codebase, but it stops before making any changes. Instead, it outputs a structured plan: which files it will modify, what changes it will make to each one, and what risks exist. You review the plan, suggest changes, and only when you say "proceed" does Claude execute.

This is more than just "thinking before acting." Plan mode forces Claude to commit to a strategy before it starts editing. Without it, Claude might start refactoring file A, realize halfway through that file B needs to change first, undo its work on A, edit B, then come back to A. Plan mode catches these dependency chains upfront, saving tool calls and avoiding half-finished states.

Here is what plan mode output actually looks like. When you type /plan or ask Claude to "plan first," you get a structured proposal like this before any code is touched:

Plan: Refactor UCC filing parser to support UCC-3 amendments

Step 1: Add amendment_type field to FilingRecord model (src/models/filing.py)
  - New enum: AmendmentType(ASSIGNMENT, CONTINUATION, TERMINATION)
  - Add optional amendment_type and original_filing_number fields
Step 2: Update parser to detect UCC-3 headers (src/parsers/ucc_parser.py)
  - New function: detect_amendment_type(), which pattern-matches on the filing header
  - Modify parse_filing() to call detect_amendment_type() for UCC-3 forms
Step 3: Add migration for new columns (migrations/versions/003_add_amendment.py)
  - Alembic migration: ALTER TABLE ucc_filing ADD COLUMN amendment_type, original_filing_number
Step 4: Update tests (tests/test_ucc_parser.py)
  - 3 new test cases: assignment, continuation, termination
  - 1 edge case: UCC-3 with missing original filing reference

Files modified: 4 | Risk: MEDIUM (schema change) | Reversible: Yes (migration has downgrade)

Shall I proceed?

Technical Definition

Plan mode: A Claude Code workflow mode in which Claude generates a step-by-step plan, shows it to you, and waits for your approval before executing any changes. Activate it with /plan or by asking Claude to "plan first." Use plan mode when: (1) you are working in an unfamiliar codebase, (2) the change is complex (touching 5+ files), (3) the change is risky or hard to reverse, or (4) you want to review the approach before code is modified.

Direct execution: Claude acts immediately — reads files, makes edits, runs commands. Use when: (1) the task is well-understood ("rename this variable"), (2) the change is small and reversible, (3) you have clear context about what needs to happen.
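The decision criteria above reduce to a short checklist. The sketch below encodes them literally; the `Task` fields and the 5-file threshold come from the criteria in the text, while the `choose_mode` name and the decision to treat any single trigger as sufficient are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    files_touched: int
    familiar_codebase: bool
    risky: bool
    reversible: bool

def choose_mode(task: Task) -> str:
    """Return "plan" if any plan-mode trigger applies, else "direct"."""
    needs_plan = (
        not task.familiar_codebase     # unfamiliar codebase
        or task.files_touched >= 5     # complex change
        or task.risky                  # risky change
        or not task.reversible         # hard to reverse
    )
    return "plan" if needs_plan else "direct"

rename = Task(files_touched=1, familiar_codebase=True, risky=False, reversible=True)
schema_refactor = Task(files_touched=30, familiar_codebase=True, risky=True, reversible=False)

print(choose_mode(rename))           # direct
print(choose_mode(schema_refactor))  # plan
```

In practice the fourth criterion (wanting to review the approach first) is a human preference, so a checklist like this is a starting point rather than a rule.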

Three powerful iteration patterns work well with plan mode:

  • TDD iteration: Red (write a failing test) → Green (ask Claude to make it pass) → Refactor. Each step is small and verifiable.
  • Interview pattern: Ask Claude to ask YOU questions before starting. This surfaces assumptions and requirements you might have missed.
  • Concrete examples: Provide 2–4 examples of desired output, then ask Claude to generalize. This anchors the implementation to real data.
⚠️ Common Misconceptions

"Plan mode is always the safe choice." — Plan mode adds overhead: Claude reads the codebase, generates a plan, waits for approval, then executes. For a simple variable rename, that is 3 extra steps that add nothing. The safe choice is the RIGHT choice for the task — plan mode for complex or risky changes, direct execution for simple well-understood ones.

"If I use plan mode, Claude won't make mistakes." — Plan mode reduces the chance of a wrong approach, but the plan itself can still be flawed. You might approve a plan that misses a dependency or underestimates the blast radius of a change. Always review the plan critically, especially the "Files modified" and "Risk" lines.

"Direct execution means Claude doesn't think." — Claude still reasons about your request in direct execution mode. It reads files, considers alternatives, and picks an approach. The difference is that it does not pause to show you the plan first. It is still "thinking" — it just acts on its conclusions immediately.

🎓 Cert Tip — Domain 3.3

The exam tests decision criteria for when to use plan mode. It is NOT "always use plan mode." The correct answer depends on: task complexity, codebase familiarity, change risk, and reversibility. Know the tradeoffs — plan mode has overhead that is wasteful for simple tasks.

Plan Mode vs Direct Execution — Flow Comparison
[Figure: two flows. Direct Execution (small, well-understood tasks): Request → Edit / Run → Changes Done. Plan Mode (complex, risky, unfamiliar): Request → Read code → Plan output → Human review → Execute, with a loop back to revise the plan with feedback.] Plan mode adds two stages: an explicit plan output and a human review gate before any file is touched.
Plan mode decides the workflow strategy. But Claude Code also lets you control which actions require approval at all — that's the permission system. Once you understand permission modes, you can choose the right safety/speed tradeoff for each session.

The Four Permission Modes

Everyday Analogy

Permission modes are like the security level on your work badge. Visitor pass: you can look around but every door asks for an escort. Employee badge: most doors open, sensitive areas still need approval. Master key: every door opens automatically. The right level depends on whether you're a new contractor (visitor) or the building manager (master) — and on which floor you're working on.

Claude Code's permission system works the same way. plan mode is the visitor pass — Claude can read but not write. bypassPermissions is the master key — every tool runs without asking. Most production work happens in default or acceptEdits, but knowing all four is the difference between a paranoid workflow that interrupts you constantly and a reckless workflow that breaks production.

The skill is matching the mode to the task: prototyping a brand-new file? acceptEdits. Touching production database migrations? default with explicit approval. Running an unattended overnight job in CI? bypassPermissions in a sandboxed container with a strict allowedTools list.

Technical Definition

Permission modes control which tool calls require user approval before execution. Set globally in ~/.claude/settings.json, per-project in .claude/settings.json, or per-session via --permission-mode. Combined with allowedTools and disallowedTools for fine-grained control.

Permission Modes — Pick the Right One
Mode | Behavior | Use when
plan | Read-only. Claude analyzes and proposes a plan, but cannot edit, write, or run commands. | Architectural review, exploring an unfamiliar codebase, risky refactors
default | The starting mode. Tool calls prompt for approval the first time and can be remembered for the session via permissions.allow. | Most interactive work; the safest default
acceptEdits | File edits auto-approved. Bash and other tools still prompt. | Greenfield work, prototyping, scaffolding new files
bypassPermissions | Skips all permission checks. Every tool runs without prompting. | Sandboxed environments only. Never on a workstation with prod credentials. Pair with disableBypassPermissions: true in managed/user settings to lock it out elsewhere.
Security Rule of Thumb

The strongest signal of a careless Claude Code setup is bypassPermissions in a developer's user-level settings. That single flag means every session, in every project, runs every tool without asking — including rm -rf, git push --force, and any MCP server tool with a connection string. Use bypassPermissions only inside ephemeral containers (Docker, CI runners, sandboxed worktrees), never on your laptop. For day-to-day work, default mode + a project-level permissions.allow list is the sweet spot.
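
As a sketch of the "default mode + a project-level permissions.allow list" setup, the snippet below builds a .claude/settings.json payload in Python. The keys permissions.defaultMode, permissions.allow, and permissions.deny are the ones named in this module; the helper name build_project_settings and the specific rule strings are illustrative assumptions, so check the rule syntax against the upstream reference before committing them.

```python
import json


def build_project_settings(allow: list[str], deny: list[str]) -> dict:
    """Return a settings dict that stays in default mode and
    pre-approves only the listed tool rules."""
    return {
        "permissions": {
            "defaultMode": "default",      # not bypassPermissions
            "allow": sorted(set(allow)),   # de-duplicated, stable order
            "deny": sorted(set(deny)),
        }
    }


settings = build_project_settings(
    allow=["Bash(npm run test:*)", "Bash(git diff:*)"],  # assumed rule syntax
    deny=["Read(./.env)"],                                # assumed rule syntax
)

# Write this text to .claude/settings.json in the project root.
print(json.dumps(settings, indent=2))
```

Keeping the list in the project file means the whole team shares the same pre-approved commands, while anything outside the list still prompts.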

🎓 Cert Tip — Domain 3.4 (Permissions)

The exam tests recognition of mode-to-scenario fit. Common stems: "unattended CI run inside a sandboxed container" → bypassPermissions + a strict permissions.allow list. "Reviewing a risky migration" → plan. "Prototyping a brand-new feature" → acceptEdits. "Production-touching work on a developer's laptop" → default (never bypass). Anti-pattern: using bypassPermissions on a workstation to avoid permission prompts, which defeats the entire safety model. Lock it out at the user/managed level with disableBypassPermissions: true.

Permission modes are part of the broader settings system. Where do those settings live, who can override whom, and what other knobs exist? That's the next layer.

settings.json — The Full Configuration Surface

Everyday Analogy

Think of Claude Code's settings as the layered controls on a corporate laptop. The IT admin (managed settings) sets policies you can't override — "you cannot install software, you cannot disable the VPN." You (user settings) set personal defaults — "I prefer dark mode, my keyboard shortcuts." Each project folder (project settings) adds project-specific tweaks — "for this client, use their VPN, this branding." Local overrides (gitignored) are sticky-notes only you see — "skip the auth check on my dev machine."

The pain without this layering: every team member has to repeat the same setup, security policies live in untrusted places, and personal tweaks leak into shared config. The cascade fixes that — admins lock down what matters, projects share what's useful, and personal preferences stay personal.

Same idea here. Five precedence levels mean a security team can mandate sandbox rules organization-wide, a project team can pre-approve their CI tools, and an individual can opt into vim mode, all without conflicts, because when two levels set the same key the higher level wins.

Five-Level Precedence — Who Beats Whom

Claude Code merges settings from up to five sources for every session. Higher-priority levels override lower ones for the same key. Managed settings can also set a few "managed-only" keys (like allowedMcpServers) that lower levels cannot define at all.

Settings Precedence — Top Always Wins
1. Managed: macOS plist / Windows GPO / managed-settings.d/ (admin-locked)
2. CLI flags: --model, --permission-mode, --add-dir, --worktree, etc.
3. Local: .claude/settings.local.json (gitignored, your machine only)
4. Project: .claude/settings.json (checked into git, team-shared)
5. User: ~/.claude/settings.json (personal defaults)
Priority decreases from 1 (highest) to 5 (lowest).
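
For a single scalar key, the ordering above behaves like a first-match lookup. The sketch below models that with Python's collections.ChainMap. The layer contents are made-up example values (the "example-..." model names are placeholders), and real Claude Code merging is richer than this, so treat it as a mental model only.

```python
from collections import ChainMap

# Example values only; each dict stands in for one settings source.
managed = {"model": "claude-sonnet-4-6"}
cli_flags = {}                      # no flags passed in this session
local = {"editorMode": "vim"}
project = {"model": "example-project-model"}
user = {"model": "example-user-model", "editorMode": "normal"}

# ChainMap returns the value from the FIRST mapping that defines the key,
# so the sources are listed from highest to lowest priority.
effective = ChainMap(managed, cli_flags, local, project, user)

print(effective["model"])       # the managed value beats project and user
print(effective["editorMode"])  # the local value beats user
```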

The Settings You'll Actually Touch

Settings group into eight functional areas. Here are the high-leverage knobs for each — the rest of the surface is in the upstream reference.

Area | Key fields
Permissions | permissions.allow / ask / deny / defaultMode / additionalDirectories / disableBypassPermissions
Model & effort | model, availableModels, effortLevel, alwaysThinkingEnabled, fastModePerSessionOptIn, modelOverrides
Hooks | hooks, disableAllHooks, allowManagedHooksOnly, allowedHttpHookUrls, httpHookAllowedEnvVars
MCP | enableAllProjectMcpServers, enabledMcpjsonServers, disabledMcpjsonServers, allowedMcpServers*, deniedMcpServers* (* = managed-only)
Plugins & skills | enabledPlugins, extraKnownMarketplaces, strictKnownMarketplaces*, blockedMarketplaces*, disableSkillShellExecution*, allowedChannelPlugins*
Sandbox | sandbox.enabled, sandbox.filesystem.{allow,deny}{Read,Write}, sandbox.network.{allowed,denied}Domains, sandbox.excludedCommands
Git & attribution | attribution.commit, attribution.pr, includeGitInstructions, prUrlTemplate
UX & a11y | editorMode (vim/normal), tui, viewMode, prefersReducedMotion, voice.{enabled,mode,autoSubmit}, spinnerVerbs, statusLine, fileSuggestion, autoMemoryEnabled, autoMemoryDirectory

Sandbox Settings — Bash With Filesystem + Network Walls

Set sandbox.enabled: true and Bash runs inside an OS-level jail (Seatbelt on macOS, namespaces on Linux/WSL2). Sandboxed commands skip the permission prompt by default (autoAllowBashIfSandboxed: true) because the OS is enforcing the boundary — making it the safest way to run bypassPermissions-style work without yolo'ing your filesystem.

{
  "sandbox": {
    "enabled": true,
    "failIfUnavailable": true,
    "filesystem": {
      "allowWrite": ["./", "~/.cache/claude"],
      "denyWrite": ["./.git", "./.env*"],
      "denyRead":  ["./.env", "~/.ssh", "~/.aws"]
    },
    "network": {
      "allowedDomains": ["api.github.com", "registry.npmjs.org", "*.anthropic.com"],
      "deniedDomains":  ["*"]
    },
    "excludedCommands": ["docker", "kubectl"]
  }
}

Path prefixes: / = absolute, ~/ = home, ./ or no prefix = project-relative. Network domains support wildcards. excludedCommands lets specific binaries bypass the sandbox when they need raw network/filesystem (e.g. docker). Managed admins can also set sandbox.network.allowManagedDomainsOnly to ignore project/user domain lists entirely.
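
The prefix rules in the previous paragraph can be sketched as a small resolver. This only mirrors the rules as stated here ("/" absolute, "~/" home, "./" or no prefix project-relative); the function name resolve_sandbox_path and the example directories are hypothetical, and the real sandbox may handle more cases.

```python
from pathlib import PurePosixPath


def resolve_sandbox_path(entry: str, project_root: str, home: str) -> str:
    """Expand one sandbox path entry to an absolute POSIX path."""
    if entry.startswith("~/"):
        return str(PurePosixPath(home) / entry[2:])    # home-relative
    if entry.startswith("/"):
        return str(PurePosixPath(entry))               # already absolute
    if entry.startswith("./"):
        entry = entry[2:]                              # drop the "./" marker
    return str(PurePosixPath(project_root) / entry)    # project-relative


# Hypothetical locations used only for the demonstration.
ROOT, HOME = "/work/repo", "/home/dev"

for entry in ["./", "~/.cache/claude", "./.git", "~/.ssh", ".env", "/etc/hosts"]:
    print(f"{entry!r:20} -> {resolve_sandbox_path(entry, ROOT, HOME)}")
```

Wildcards such as ./.env* pass through unchanged; the sketch only decides which base directory an entry is anchored to.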

Managed Settings — Org-Wide Policy

For enterprise rollouts, settings can be deployed via OS policy systems — out of reach of users. Three delivery mechanisms:

  • macOS: com.anthropic.claudecode plist via Jamf, Kandji, etc.
  • Windows admin: HKLM\SOFTWARE\Policies\ClaudeCode via Group Policy or Intune
  • File-based: managed-settings.json base file plus managed-settings.d/*.json drop-in fragments (sorted alphabetically — later files override earlier)
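
The file-based mechanism ("base file, then drop-in fragments sorted alphabetically, later files override earlier") can be sketched as follows. This is a simplified model: load_managed_settings is a hypothetical helper, the fragment names and the "example-..." values are invented, and it merges only top-level keys, whereas the real loader may merge nested objects differently.

```python
import json
import tempfile
from pathlib import Path


def load_managed_settings(base_file: Path, dropin_dir: Path) -> dict:
    """Merge the base file, then each *.json fragment in alphabetical
    order, so later fragments override earlier ones for the same key."""
    merged: dict = {}
    if base_file.exists():
        merged.update(json.loads(base_file.read_text()))
    if dropin_dir.is_dir():
        for fragment in sorted(dropin_dir.glob("*.json")):
            merged.update(json.loads(fragment.read_text()))
    return merged


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    base = root / "managed-settings.json"
    dropins = root / "managed-settings.d"
    dropins.mkdir()
    base.write_text(json.dumps({"model": "example-base", "disableAllHooks": False}))
    (dropins / "10-security.json").write_text(json.dumps({"model": "example-10"}))
    (dropins / "20-team.json").write_text(json.dumps({"model": "example-20"}))
    demo = load_managed_settings(base, dropins)

print(demo)
```

Numeric prefixes such as 10- and 20- are a common way to make the alphabetical order, and therefore the override order, explicit.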

Managed-only keys (cannot be set at any other level) include allowedMcpServers, deniedMcpServers, allowManagedMcpServersOnly, allowManagedPermissionRulesOnly, allowManagedHooksOnly, strictKnownMarketplaces, blockedMarketplaces, forceRemoteSettingsRefresh, disableSkillShellExecution, disableBypassPermissions. On Windows, wslInheritsWindowsSettings tells WSL to read the host's policies.

Attribution — Stop the AI From Spamming Co-Author Lines

By default, Claude Code adds a Co-Authored-By: Claude trailer to commits and an attribution line to PR descriptions. The attribution object lets you customize or disable both independently; it replaces the deprecated includeCoAuthoredBy boolean:

{
  "attribution": {
    "commit": false,
    "pr":     "Generated with Claude Code (internal)"
  },
  "prUrlTemplate": "https://github.acme.internal/{org}/{repo}/pull/{number}"
}

Voice, Status Line, Spinner — The UX Layer

Three settings that meaningfully change the moment-to-moment feel of working with Claude Code:

  • voice.enabled: true + voice.mode: "hold" (push-to-talk) or "tap" (toggle), with voice.autoSubmit: true to send when you stop talking. Configure interactively with /voice.
  • statusLine.type: "command" + statusLine.command: "./bin/status.sh" — your script's stdout is shown at the bottom of every prompt (great for showing branch / cost / current task).
  • spinnerVerbs.mode: "replace" + spinnerVerbs.verbs: ["compiling","analyzing","brewing"] — replaces the rotating "thinking" verbs. Use "append" mode to add custom ones to the defaults.
Files Other Than settings.json

Three sibling files matter: ~/.claude.json (global config: OAuth, MCP servers, IDE caches), .mcp.json (project MCP servers, separate from settings.json), and .worktreeinclude (gitignored files to copy into worktrees so dev tooling keeps working). Setting the $schema field in settings.json to https://json.schemastore.org/claude-code-settings.json gives you autocomplete and validation in any editor with JSON Schema support.

🎓 Cert Tip — Domain 4 (Governance)

The exam tests three governance discriminations: (1) Sandbox vs permission modes — permissions are advisory (Claude could ignore a hook); sandbox is OS-enforced. Enterprises layer both. (2) Managed-only fields — the certification names allowedMcpServers, allowManagedHooksOnly, disableSkillShellExecution as fields that cannot be overridden at user/project level. (3) Precedence vs union — for arrays like permissions.allow, levels are unioned; for scalar fields like model, higher level wins outright. Memorize which keys behave which way.
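
The "precedence vs union" distinction in point (3) can be sketched with a small merge function. It assumes flat, dotted keys for readability; merge_settings, the rule strings, and the "example-user-model" value are placeholders, and this is a model of the stated behavior rather than Claude Code's actual merge code.

```python
def merge_settings(layers: list[dict]) -> dict:
    """Merge settings dicts ordered from LOWEST to HIGHEST priority.
    List values are unioned (order-preserving); scalars are overridden."""
    merged: dict = {}
    for layer in layers:
        for key, value in layer.items():
            existing = merged.get(key)
            if isinstance(value, list) and isinstance(existing, list):
                # Array key: keep earlier entries and append new ones.
                merged[key] = existing + [v for v in value if v not in existing]
            elif isinstance(value, list):
                merged[key] = list(value)
            else:
                # Scalar key: the higher-priority layer wins outright.
                merged[key] = value
    return merged


user = {"model": "example-user-model", "permissions.allow": ["Read"]}
project = {"model": "claude-sonnet-4-6",
           "permissions.allow": ["Read", "Bash(npm test)"]}

result = merge_settings([user, project])  # project outranks user
print(result["model"])
print(result["permissions.allow"])
```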

Permission modes control what Claude can do. Built-in tools are how it actually does it — the six primitive operations Claude reaches for to read, modify, and execute on your codebase.

Built-in Tools: Read, Write, Edit, Bash, Grep, Glob

Technical Definition

Claude Code ships with six built-in tools. They provide file system access, code editing, shell execution, and search capabilities. Unlike MCP tools, they are always available: no MCP server setup, no configuration, no installation. They cover the four things you do most often when working with code: finding files, reading them, changing them, and running commands.

Glob — find files by name pattern. Use when you need to discover file structure or locate files: "find all *.test.ts files in src/".

Grep — search file contents with regex. Use when you need to find where something is used: "find all functions that call validateInput".

Read — read file contents. Use to examine and understand code before making changes.

Edit — modify specific parts of existing files (string replacement). Use for targeted changes: fix a bug, rename a variable, update a value.

Write — create new files or completely overwrite existing ones. Use for generating new code, config files, or test files.

Bash — execute shell commands. Use for running tests, installing dependencies, git operations, and anything that requires a shell. This is the most powerful tool and also the most dangerous — it can do anything your terminal can do, including deleting files or running destructive commands. That is why Claude Code prompts you before running Bash commands by default.

The typical workflow follows a predictable pattern: Glob (find the file) → Read (understand it) → Edit (change it) → Bash (test it). Most coding tasks follow this exact sequence. If you find yourself asking Claude to use Bash for searching file contents, that is a sign you should be using Grep instead — dedicated tools are faster, safer, and produce cleaner output than shell equivalents.
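
To make the Glob/Grep distinction concrete, here is a rough Python analogy: one helper matches on file names, the other on file contents. The helpers glob_files and grep_files are hypothetical stand-ins written for this illustration; they are not how Claude Code implements its tools.

```python
import re
import tempfile
from pathlib import Path


def glob_files(root: Path, pattern: str) -> list[str]:
    """Glob-style: match on file NAMES, never on contents."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))


def grep_files(root: Path, regex: str) -> list[tuple[str, int]]:
    """Grep-style: match on file CONTENTS; returns (file, line number)."""
    compiled = re.compile(regex)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if compiled.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits


# Build a tiny throwaway project to search.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "auth.ts").write_text(
        "export function login() {\n  validateInput();\n}\n")
    (root / "src" / "auth.test.ts").write_text("test('login', () => {});\n")
    name_matches = glob_files(root, "*.test.ts")
    content_matches = grep_files(root, r"validateInput\(")

print(name_matches)     # files whose NAME matches the pattern
print(content_matches)  # files whose CONTENT matches the regex
```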

An important nuance: Edit vs Write. Edit does a targeted string replacement — it finds an exact match in the file and replaces it. Write overwrites the entire file. For a one-line bug fix, Edit is better because it shows exactly what changed. For generating a new 200-line file from scratch, Write is the right choice. If you confuse them, you risk either losing existing code (Write when you meant Edit) or failing to create a file (Edit on a file that does not exist yet).
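
The Edit/Write difference can also be modeled in a few lines. This is a sketch of the semantics described above (exact-match replacement versus whole-file overwrite), using hypothetical helpers edit_file and write_file; the uniqueness check reflects the behavior this module describes, not a verified implementation detail.

```python
import tempfile
from pathlib import Path


def write_file(path: Path, content: str) -> None:
    """Write-style: create the file or replace its ENTIRE contents."""
    path.write_text(content)


def edit_file(path: Path, old: str, new: str) -> None:
    """Edit-style: replace one exact, unique occurrence of `old`."""
    text = path.read_text()  # raises FileNotFoundError if the file is missing
    count = text.count(old)
    if count == 0:
        raise ValueError("old text not found")
    if count > 1:
        raise ValueError(f"old text is not unique ({count} matches)")
    path.write_text(text.replace(old, new, 1))


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "config.ini"
    write_file(cfg, "timeout = 30\nretries = 3\n")   # new file: Write
    edit_file(cfg, "timeout = 30", "timeout = 60")   # one-line fix: Edit
    after_edit = cfg.read_text()
    try:
        edit_file(cfg, " = ", ": ")                  # appears on both lines
        ambiguous_error = ""
    except ValueError as err:
        ambiguous_error = str(err)

print(after_edit)
print(ambiguous_error)
```

The ambiguous case is where the Read + Write fallback applies: read the full file, change it in memory, and overwrite it.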

Typical Tool Workflow: Find → Read → Edit → Test
Glob (find auth module) → Read (read login function) → Edit (fix the bug) → Bash (run tests)
🎓 Cert Tip — Domain 2.5 (Built-in Tools)

The exam tests tool selection: given a task description, which built-in tool is correct? Key distinctions: Glob finds files by name pattern (not content). Grep searches file content (not file names). Edit modifies existing files using unique text matching. Write creates new files or overwrites entirely. Confusing Glob/Grep or Edit/Write is a common exam mistake. Bonus: when Edit fails because the target text isn't unique, the cert-correct fallback is Read + Write (read full content, then overwrite). Build codebase understanding incrementally — start with Grep for entry points, then Read to follow imports, rather than reading every file upfront.

Common Misconceptions

"Claude Code is just the API with a CLI wrapper, right?" — No. Claude Code is a full agent that manages its own tool loop. When you give it a task, it decides which tools to call, reads the results, reasons about next steps, and iterates until the task is done. The API gives you a single message-in, message-out exchange. Claude Code gives you an autonomous agent that can read your codebase, make edits, run tests, and fix failures — all in one prompt.

"CLAUDE.md is like .env — it stores secrets and API keys." — Absolutely not. CLAUDE.md stores behavioral instructions: coding conventions, project context, and tool restrictions. Never put secrets in CLAUDE.md — it gets committed to git. Use environment variables for API keys, just like you would with any other tool.

"Skills replace commands entirely." — Not quite. Skills add capabilities that commands lack (context isolation, automatic invocation), but commands are simpler and perfectly fine for focused tasks that do not pollute context. Use a command when the task is small and scoped. Use a skill when you need isolation or auto-triggering.

"The config hierarchy is just about indentation preferences." — The cascade handles much more than formatting. Project-level CLAUDE.md typically includes architecture decisions, security policies, API patterns, database conventions, and deployment rules. Directory-level files can restrict tool access for sensitive areas (e.g., the payments directory might prohibit Bash to prevent accidental charges).

"Batch API is always better because it's cheaper." — Only when latency does not matter. Batch requests can take up to 24 hours. If a user is waiting for a response in a chat interface, they will not wait 24 hours for a 50% discount. Use batch for background processing (nightly reports, bulk extraction). Use synchronous for anything interactive.

You now know how to use Claude Code's tools interactively. But what about automated workflows? CI/CD integration lets Claude Code review code, enforce standards, and catch bugs automatically on every pull request.

CI/CD Integration

Technical Definition

Claude Code can run in non-interactive mode for automated pipelines. In this mode, activated with the -p flag (claude -p "your prompt here"), Claude Code runs without waiting for user input, outputs the result, and exits, which is essential for CI/CD pipelines where there is no human to interact with. Three flags make this possible:

-p "prompt" — This is the non-interactive flag. Claude runs the prompt, prints the result, and exits. No human needed. This is what makes Claude Code usable in CI/CD at all — without it, the process would hang waiting for user input.

--output-format json — Returns structured JSON instead of plain text. Your pipeline scripts can parse the output programmatically with jq or JSON.parse() instead of trying to regex through natural language.

--json-schema — Goes one step further: it enforces a specific output structure. You define the exact fields your pipeline expects, and Claude is constrained to return only those fields. No surprises, no extra commentary — just the data your script needs.

The most critical design pattern for CI/CD is session isolation. The idea is simple: use SEPARATE Claude Code sessions for code generation and code review.

Here is why this matters. If the same session generates code and then reviews it, the reviewer still has the generator's reasoning in its context window. It remembers WHY each decision was made. That creates confirmation bias — the reviewer is more likely to agree with the code, even if the code has bugs, because it already understands the intent behind each line.

Separate sessions fix this. Session B (the reviewer) sees only the raw code diff. It evaluates the code on its own merits, without being influenced by Session A's thought process. This mirrors what good engineering teams do: the person who writes the code is never the only person who reviews it.
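
A minimal Python sketch of the pattern: each session is its own claude -p process, and the only thing the two share is the diff. The -p and --output-format json flags are the ones described above; passing the diff on standard input, the prompt wording, and the helper names are assumptions made for this sketch.

```python
import subprocess
from typing import Callable

Runner = Callable[[list[str], str], str]


def claude_runner(cmd: list[str], stdin_text: str) -> str:
    """Run one fresh process and return its stdout."""
    done = subprocess.run(cmd, input=stdin_text, capture_output=True,
                          text=True, check=True)
    return done.stdout


def run_isolated_sessions(diff: str, runner: Runner = claude_runner) -> dict:
    """Run summary and review as two independent invocations.
    Each one receives only the diff, so the reviewer never sees the
    other session's output or reasoning."""
    base = ["claude", "-p"]
    tail = ["--output-format", "json"]
    summary = runner(base + ["Summarize these code changes."] + tail, diff)
    review = runner(
        base + ["Review this diff for bugs, security issues, and missing tests."]
        + tail,
        diff,
    )
    return {"summary": summary, "review": review}


# Dry run with a stand-in runner, so the wiring can be checked
# without calling the CLI or spending tokens.
calls: list[tuple[list[str], str]] = []


def fake_runner(cmd: list[str], stdin_text: str) -> str:
    calls.append((cmd, stdin_text))
    return '{"result": "stub"}'


outputs = run_isolated_sessions("diff --git a/app.py b/app.py", runner=fake_runner)
print(len(calls), "separate invocations")
```

Note what is absent: the summary text is never passed into the review call. That omission is the isolation.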

CI/CD Pipeline with Session Isolation
Commit → Build → Claude: Generate (Session A) → Claude: Review (Session B) → Deploy

Let's build a GitHub Actions workflow that puts session isolation into practice. The workflow triggers on every pull request, so every code change gets an independent AI review automatically.

The first half of the workflow — the "Summarize PR Changes" step — is Session A. It reads the git diff and produces a structured summary of what changed. Think of this as the "what happened" step. The second half — "Review PR" — is Session B. It reads the same diff but with a completely different prompt focused on bugs, security issues, and test gaps. Because these are separate claude -p invocations, Session B has zero knowledge of Session A's reasoning. That is the whole point.

One gotcha to watch for: the fetch-depth: 0 on checkout. Without it, GitHub Actions does a shallow clone (only the latest commit), and git diff origin/main...HEAD fails because there is no history to diff against. This is the most common cause of "the review step produced no output" in real CI setups.

# .github/workflows/claude-review.yml
# Automated PR review using Claude Code with session isolation.
# Session A generates a summary; Session B reviews independently.

name: Claude Code PR Review
on:
  pull_request:
    types: [opened, synchronize]

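jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      contents: read         # needed by actions/checkout
      pull-requests: write   # lets the final step post the review comment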
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history for diff context

      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code

      # --- Session A: Summarize changes ---
      # This session reads the diff and produces a structured summary.
      - name: Summarize PR Changes
        id: summary
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          DIFF=$(git diff origin/main...HEAD)
          claude -p "Summarize these code changes. Focus on: what changed, why it might have changed, and any patterns you notice. Diff: $DIFF" \
            --output-format json > summary.json

      # --- Session B: Independent review ---
      # SEPARATE session — no access to Session A's reasoning.
      - name: Review PR (Independent Session)
        id: review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          DIFF=$(git diff origin/main...HEAD)
          claude -p "Review this PR diff for: security vulnerabilities, performance issues, missing error handling, and test coverage gaps. Be specific — cite line numbers. Diff: $DIFF" \
            --output-format json > review.json

      - name: Post Review Comment
        uses: actions/github-script@v7
        with:
          script: |
            const review = require('./review.json');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `## Claude Code Review\n\n${review.result}`
            });
#!/usr/bin/env bash
# scripts/review-pr.sh
# Standalone PR review script with session isolation.
# Can be used outside GitHub Actions (e.g., GitLab CI, local review).

set -euo pipefail

# Ensure ANTHROPIC_API_KEY is set
if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
  echo "Error: ANTHROPIC_API_KEY not set" >&2
  exit 1
fi

DIFF=$(git diff origin/main...HEAD)

if [ -z "$DIFF" ]; then
  echo "No changes to review."
  exit 0
fi

echo "=== Session A: Summarizing changes ==="
SUMMARY=$(claude -p "Summarize these code changes concisely: $DIFF" \
  --output-format json 2>/dev/null)

echo "=== Session B: Independent review ==="
REVIEW=$(claude -p "Review this diff for security issues, bugs, and missing tests. Cite specific line numbers: $DIFF" \
  --output-format json 2>/dev/null)

echo ""
echo "=== Summary ==="
echo "$SUMMARY" | jq -r '.result // .content // .'
echo ""
echo "=== Review ==="
echo "$REVIEW" | jq -r '.result // .content // .'
What Just Happened?

You built a CI/CD pipeline that uses Claude Code in two separate sessions. Session A summarizes what changed. Session B independently reviews the same diff for bugs and security issues. Because they are separate sessions, Session B cannot be influenced by Session A's reasoning — eliminating confirmation bias. This is the same principle as having different people write and review code.

🎓 Cert Tip — Domain 3.5

Critical anti-pattern: same-session self-review in CI/CD. The exam tests this directly. When Claude generates code in Session A and reviews it in the same session, the reviewer retains the generator's reasoning context, creating confirmation bias. Always use separate sessions for generation vs review.

🎓 Cert Tip — Domain 3.6 (Schema-Validated CI Output)

For CI/CD invocation, the exam-recommended pattern is claude -p "<prompt>" --output-format json --json-schema schema.json. The schema contract eliminates parser fragility: your CI script consumes structured fields like {verdict, blocking_issues[], confidence} instead of regex-ing freeform prose. Pair with separate generator/reviewer sessions (Domain 3.5) for confirmation-bias-free gating. Anti-pattern: parsing natural-language Claude output with regex, which breaks every time the model rephrases.
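
On the consuming side, a CI gate over those structured fields might look like the sketch below. The field names verdict, blocking_issues, and confidence come from the tip above; the "approve" verdict value, the 0.7 confidence threshold, and the gate function itself are assumptions for illustration, and the input is taken to be the schema-shaped object after you have extracted it from the CLI output.

```python
import json

# Expected type for each field in the schema-shaped review object.
REQUIRED_FIELDS = {
    "verdict": str,
    "blocking_issues": list,
    "confidence": (int, float),
}


def gate(review_json: str, min_confidence: float = 0.7) -> int:
    """Return a process exit code: 0 lets the pipeline continue, 1 blocks it."""
    review = json.loads(review_json)
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(review.get(field), expected):
            raise ValueError(f"missing or mistyped field: {field}")
    if review["verdict"] != "approve":        # assumed verdict vocabulary
        return 1
    if review["blocking_issues"]:             # any blocking issue fails the gate
        return 1
    if review["confidence"] < min_confidence:
        return 1
    return 0


clean = '{"verdict": "approve", "blocking_issues": [], "confidence": 0.92}'
blocked = ('{"verdict": "approve", '
           '"blocking_issues": ["SQL built by string concat"], '
           '"confidence": 0.95}')

print(gate(clean))
print(gate(blocked))
```

In a pipeline you would pass the return value to sys.exit so that a nonzero code fails the job.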

CI/CD runs Claude Code on each PR — one request at a time. But what about bulk tasks: analyzing 10,000 documents, reviewing an entire codebase, or extracting data from a massive dataset? That is where the Batch API comes in.

Batch Processing (Message Batches API)

Technical Definition

The Message Batches API is Anthropic's solution for high-volume, latency-tolerant workloads. Instead of sending requests one at a time and waiting for each response, you package up to 10,000 requests into a single batch. Anthropic processes them asynchronously in the background, and you collect the results when they are ready.

50% cost reduction: This is the headline benefit. Every request in a batch costs half of what the same request would cost synchronously. How much that saves depends on token volume: for a team processing 10,000 UCC filings at $0.003 per 1K input tokens, and assuming roughly 20K input tokens per filing, the input cost falls from about $600 to about $300 per run.

24-hour processing window: The tradeoff is latency. Batch requests are not instant — Anthropic processes them within a 24-hour window. You submit them, go home, and check the results the next morning. This is fundamentally different from a synchronous call that returns in seconds.

When to use batch vs synchronous: Ask one question: is someone waiting for this result right now? If yes (chatbot, interactive tool, real-time dashboard), use synchronous. If no (nightly analysis, bulk data extraction, dataset labeling, codebase-wide review), use batch and save 50%.
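
The arithmetic behind that decision is simple enough to sketch. The $0.003 per 1K input tokens and the 50% discount are the figures used in this section; the 20,000 tokens per filing is an assumed document size, output tokens are ignored for brevity, and the helper names are hypothetical.

```python
def input_cost_usd(n_docs: int, tokens_per_doc: int, usd_per_1k: float,
                   batch: bool = False) -> float:
    """Input-token cost for one run; batch pricing is half of synchronous."""
    cost = n_docs * tokens_per_doc / 1000 * usd_per_1k
    return cost * 0.5 if batch else cost


def choose_api(someone_is_waiting: bool) -> str:
    """The single question from the text: is a person waiting right now?"""
    return "synchronous" if someone_is_waiting else "batch"


sync_cost = input_cost_usd(10_000, 20_000, 0.003)
batch_cost = input_cost_usd(10_000, 20_000, 0.003, batch=True)

print(f"synchronous: ${sync_cost:,.2f}")
print(f"batch:       ${batch_cost:,.2f}")
print(choose_api(someone_is_waiting=False))
```

Savings scale linearly with token volume, so the discount matters most for long documents or very large runs.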

Under the hood, the Batch API works differently from the standard Messages API. When you call client.messages.create(), Anthropic allocates GPU capacity immediately, processes your request, and streams back the response. Your code blocks until it is done. With batch, you submit a manifest of requests — essentially a list of "here are 10,000 prompts I need answered" — and Anthropic schedules them to run whenever capacity is available. That is why it is cheaper: Anthropic can fill idle GPU slots instead of guaranteeing instant capacity.

If you have used background jobs in web development (Sidekiq, Celery, AWS Lambda queues), the mental model is the same. You submit work, get a job ID, and poll until the job is complete. The difference is the scale: a single batch can hold 10,000 requests, and the polling interval is minutes, not milliseconds.

One important distinction from what you already know: unlike the synchronous API where you handle one response at a time, batch results come back as a collection. Some requests in the batch may succeed while others fail. Your code must handle partial success — iterating through results and checking each one's status individually.
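
Handling partial success usually comes down to splitting the collected results and deciding what to retry. The sketch below assumes each collected result is a dict with an "id" and either a "data" or an "error" key, and that IDs follow a "filing-0001" pattern; both the shape and the sample values are assumptions for illustration.

```python
def split_results(results: list[dict]) -> tuple[list[dict], list[str]]:
    """Separate successful results from the IDs of requests that failed."""
    succeeded = [r for r in results if "error" not in r]
    failed_ids = [r["id"] for r in results if "error" in r]
    return succeeded, failed_ids


def retry_indices(failed_ids: list[str]) -> list[int]:
    """Map IDs like "filing-0001" back to positions in the input list."""
    return [int(custom_id.split("-")[1]) for custom_id in failed_ids]


# Made-up sample: two successes and one failure.
sample = [
    {"id": "filing-0000", "data": {"debtor_name": "Acme Corp"}},
    {"id": "filing-0001", "error": "invalid JSON in model output"},
    {"id": "filing-0002", "data": {"debtor_name": "Metro Holdings Inc"}},
]

ok, failed = split_results(sample)
print(f"{len(ok)} succeeded, {len(failed)} failed: {failed}")
print("resubmit input positions:", retry_indices(failed))
```

The failed positions can then be resubmitted as a smaller follow-up batch instead of rerunning everything.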

Let's walk through a complete batch processing pipeline step by step. The first function, create_batch_requests, is straightforward — it takes a list of filing text strings and wraps each one into the format the Batch API expects. Each request gets a unique custom_id so you can match results back to the original documents later. Think of it as putting a label on each envelope before dropping them in the mailbox.

The interesting part is run_batch. It submits the batch, then enters a polling loop — checking every 60 seconds whether the batch has finished processing. When the batch status changes to "ended", it iterates through the results. Here is the critical detail: not every request in a batch is guaranteed to succeed. A filing might be too long, or the model might fail to extract valid JSON from a poorly formatted document. So the code checks each result individually and separates successes from errors. Never assume 100% success in batch processing.

# batch_extract.py
# Extract structured data from UCC filing documents using
# the Message Batches API — 50% cheaper than synchronous calls.

import anthropic
import json
import time


def create_batch_requests(filing_texts: list[str]) -> list[dict]:
    """
    Build batch request objects from filing document texts.
    Each request asks Claude to extract structured data.
    """
    requests = []
    for i, text in enumerate(filing_texts):
        requests.append({
            "custom_id": f"filing-{i:04d}",
            "params": {
                "model": "claude-sonnet-4-6",
                "max_tokens": 1024,
                "messages": [
                    {
                        "role": "user",
                        "content": (
                            "Extract structured data from this UCC filing. "
                            "Return JSON with fields: debtor_name, "
                            "secured_party, collateral_description, "
                            "filing_date, lapse_date, filing_number.\n\n"
                            f"Filing text:\n{text}"
                        ),
                    }
                ],
            },
        })
    return requests


def run_batch(filing_texts: list[str]) -> list[dict]:
    """
    Submit a batch of filing extraction requests and wait
    for results. Polls every 60 seconds.
    """
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

    requests = create_batch_requests(filing_texts)
    print(f"Submitting batch of {len(requests)} requests...")

    try:
        batch = client.messages.batches.create(requests=requests)
        print(f"Batch created: {batch.id}")

        # Poll for completion
        while True:
            status = client.messages.batches.retrieve(batch.id)
            print(f"Status: {status.processing_status} "
                  f"({status.request_counts.succeeded} done)")

            if status.processing_status == "ended":
                break
            time.sleep(60)  # Check every minute

        # Collect results
        results = []
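        for result in client.messages.batches.results(batch.id):
            if result.result.type == "succeeded":
                text = result.result.message.content[0].text
                try:
                    # The model may return malformed JSON; record that as
                    # a per-request error instead of crashing the run.
                    data = json.loads(text)
                except json.JSONDecodeError as e:
                    results.append({
                        "id": result.custom_id,
                        "error": f"invalid JSON: {e}",
                    })
                    continue
                results.append({
                    "id": result.custom_id,
                    "data": data,
                })
            else:
                results.append({
                    "id": result.custom_id,
                    "error": str(result.result),
                })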
        return results

    except anthropic.APIError as e:
        print(f"Batch API error: {e}")
        return []


if __name__ == "__main__":
    # Example: process 5 sample filings
    sample_filings = [
        "UCC-1 Filing: Debtor: Acme Corp, 123 Main St...",
        "UCC-1 Filing: Debtor: BuildRight LLC, 456 Oak Ave...",
        "UCC-3 Amendment: Original filing #2024-001234...",
        "UCC-1 Filing: Debtor: Metro Holdings Inc...",
        "UCC-3 Continuation: Original filing #2023-009876...",
    ]
    results = run_batch(sample_filings)
    for r in results:
        print(json.dumps(r, indent=2))
// batch_extract.ts
// Extract structured data from UCC filing documents using
// the Message Batches API — 50% cheaper than synchronous calls.

import Anthropic from "@anthropic-ai/sdk";

interface BatchRequest {
  custom_id: string;
  params: {
    model: string;
    max_tokens: number;
    messages: Anthropic.MessageParam[];
  };
}

function createBatchRequests(filingTexts: string[]): BatchRequest[] {
  return filingTexts.map((text, i) => ({
    custom_id: `filing-${String(i).padStart(4, "0")}`,
    params: {
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      messages: [
        {
          role: "user" as const,
          content:
            "Extract structured data from this UCC filing. " +
            "Return JSON with fields: debtor_name, " +
            "secured_party, collateral_description, " +
            "filing_date, lapse_date, filing_number.\n\n" +
            `Filing text:\n${text}`,
        },
      ],
    },
  }));
}

async function runBatch(filingTexts: string[]) {
  const client = new Anthropic(); // reads ANTHROPIC_API_KEY

  const requests = createBatchRequests(filingTexts);
  console.log(`Submitting batch of ${requests.length} requests...`);

  try {
    const batch = await client.messages.batches.create({ requests });
    console.log(`Batch created: ${batch.id}`);

    // Poll for completion
    while (true) {
      const status = await client.messages.batches.retrieve(batch.id);
      console.log(
        `Status: ${status.processing_status} ` +
        `(${status.request_counts.succeeded} done)`
      );

      if (status.processing_status === "ended") break;
      await new Promise((r) => setTimeout(r, 60_000)); // 1 min
    }

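    // Collect results; each request succeeds or fails on its own
    const results: Array<{ id: string; data?: unknown; error?: string }> = [];
    // results() resolves to an async iterable, so await it before iterating
    for await (const result of await client.messages.batches.results(batch.id)) {
      if (result.result.type === "succeeded") {
        const text = (result.result.message.content[0] as Anthropic.TextBlock).text;
        try {
          results.push({ id: result.custom_id, data: JSON.parse(text) });
        } catch (parseError) {
          // The model can return prose around the JSON; keep the run alive
          results.push({ id: result.custom_id, error: `invalid JSON: ${parseError}` });
        }
      } else {
        // String() on an object prints "[object Object]", so serialize it
        results.push({ id: result.custom_id, error: JSON.stringify(result.result) });
      }
    }
    return results;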

  } catch (error) {
    console.error(`Batch API error: ${error}`);
    return [];
  }
}

// Example usage
const sampleFilings = [
  "UCC-1 Filing: Debtor: Acme Corp, 123 Main St...",
  "UCC-1 Filing: Debtor: BuildRight LLC, 456 Oak Ave...",
  "UCC-3 Amendment: Original filing #2024-001234...",
];
runBatch(sampleFilings).then((results) =>
  results.forEach((r) => console.log(JSON.stringify(r, null, 2)))
);
What Just Happened?

You built a batch processing pipeline that extracts structured data from UCC filings at 50% of the normal cost. A single batch can hold up to 100,000 requests (or 256 MB, whichever limit is reached first) and is processed asynchronously within 24 hours. The code polls for completion, then collects results, checking each request individually because some can fail while others succeed. This pattern works for any high-volume extraction: document processing, dataset labeling, bulk code review.
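Because individual requests can fail inside an otherwise successful batch, it helps to separate successes from failures before any downstream step. The following is a minimal sketch, assuming result dictionaries shaped like the ones run_batch returns (an "id" plus either "data" or "error"). The helper name partition_results and the sample records are hypothetical.

```python
def partition_results(results: list[dict]) -> tuple[list[dict], list[str]]:
    """Split batch results into successful records and IDs that need a retry."""
    succeeded = [r for r in results if "error" not in r]
    failed_ids = [r["id"] for r in results if "error" in r]
    return succeeded, failed_ids


# Illustrative data only; real results come from run_batch().
sample = [
    {"id": "filing-0000", "data": {"debtor_name": "Acme Corp"}},
    {"id": "filing-0001", "error": "errored"},
    {"id": "filing-0002", "data": {"debtor_name": "Metro Holdings Inc"}},
]
ok, retry_ids = partition_results(sample)
print(f"{len(ok)} succeeded, {len(retry_ids)} to retry: {retry_ids}")
```

The retry list can be fed into a second, smaller batch instead of resubmitting everything.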

⚠️ Common Misconceptions

"I should batch everything to save money." — Only if latency does not matter. Batch requests can take up to 24 hours. A chatbot user will not wait 24 hours for a 50% discount. Use batch for background processing (nightly reports, bulk extraction, dataset labeling). Use synchronous for anything where a human or system is waiting for the result.

"If one request in a batch fails, the whole batch fails." — No. Batches support partial success. Some requests may succeed while others fail (e.g., one filing is too long, another has malformed text). Your code must iterate through results and check each one individually. Never assume 100% success.

"Batch API is a different model or lower quality." — It is the exact same model, same quality, same capabilities. The only difference is scheduling: synchronous gets immediate GPU allocation, while batch requests are queued and processed when capacity is available. That flexible scheduling is why Anthropic can offer the 50% discount.

🎓 Cert Tip — Domain 3.6

The exam tests when to use batch vs synchronous. Batch: non-urgent, high-volume, cost-sensitive (nightly analysis, bulk extraction). Synchronous: user-facing, real-time, interactive. A user waiting for a chatbot response should NEVER hit the batch API — they would wait up to 24 hours.
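The cost side of that trade-off is simple arithmetic. The sketch below is illustrative only: the per-token prices and token counts are hypothetical placeholders, not published rates. The only number taken from this module is the 50% batch discount.

```python
# Hypothetical prices in USD per million tokens; substitute your model's real rates.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00
BATCH_DISCOUNT = 0.50  # from the text: batch is 50% cheaper


def estimate_cost(n_requests: int, in_tokens: int, out_tokens: int, batch: bool) -> float:
    """Estimate total cost for n_requests with the given per-request token counts."""
    per_request = (
        in_tokens * INPUT_PRICE_PER_MTOK + out_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000
    total = n_requests * per_request
    return total * (1 - BATCH_DISCOUNT) if batch else total


sync_cost = estimate_cost(10_000, in_tokens=2_000, out_tokens=500, batch=False)
batch_cost = estimate_cost(10_000, in_tokens=2_000, out_tokens=500, batch=True)
print(f"synchronous: ${sync_cost:.2f}  batch: ${batch_cost:.2f}")
```

The saving scales linearly with volume, which is why the discount matters for nightly bulk jobs and is irrelevant for a single interactive request.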

Background Work & Composition Patterns

Single-session Claude Code is already a force multiplier. But the top-1% workflow is composition: multiple Claude sessions running in parallel, scheduled background agents, and independent reviews of your own work. These aren't experimental features; they ship in Claude Code today and unlock workflows that a single interactive session cannot support.

Worktrees — parallel sessions on the same repo

Git worktrees let you check out multiple branches into separate directories. Claude Code uses this to run truly parallel sessions without stepping on each other's changes. Want one Claude implementing the feature while another reviews the previous PR? Spin up two worktrees, two sessions, two terminals.

# Create a worktree on a new branch, isolated from your main checkout
git worktree add ../proj-feature-auth -b feat/auth

# Open Claude Code there
cd ../proj-feature-auth && claude

# Meanwhile, in your original checkout, run a SECOND Claude session
# reviewing the last commit (the "two-Claude review" pattern)
cd ../proj && claude "review the last commit on main as a staff engineer. Be harsh."

# When done, clean up
git worktree remove ../proj-feature-auth

The Two-Claude Review Pattern

This is the single highest-leverage technique to add to your workflow today. Session A implements a feature with full context of the tradeoffs. Session B starts cold and reviews the diff with no context. The cold reviewer is dramatically harder to fool because it has no rationalization for shortcuts that "made sense at the time."

Two-Claude Review — Implementer + Cold Reviewer
Diagram: Session A (Implementer: full context, full conversation; writes feature, runs tests, commits) → git diff of the last commit → Session B (Reviewer: empty context, no rationalization; cold read) → Structured Review (MUST FIX · SHOULD FIX · CONSIDER) → Session A applies fixes, re-runs tests, commits again.
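The reviewer half of this pattern can also be scripted. The sketch below is one possible approach, not a prescribed workflow: it assumes git and the claude CLI are on PATH, and it relies on the headless `claude -p` mode with piped stdin that this module shows later. The function names and the prompt wording are hypothetical.

```python
import subprocess


def last_commit_diff(repo_dir: str = ".") -> str:
    """Return the diff introduced by the most recent commit."""
    return subprocess.run(
        ["git", "show", "--format=", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout


def build_review_prompt(diff: str) -> str:
    """Build a cold-review prompt that carries only the diff, no implementer context."""
    return (
        "Review this diff as a staff engineer who has never seen the codebase "
        "discussion. Group findings under MUST FIX, SHOULD FIX, and CONSIDER.\n\n"
        f"{diff}"
    )


def run_cold_review(repo_dir: str = ".") -> str:
    """Pipe the prompt into a fresh headless session (assumes `claude` is on PATH)."""
    prompt = build_review_prompt(last_commit_diff(repo_dir))
    done = subprocess.run(
        ["claude", "-p", "review the diff provided on stdin"],
        input=prompt, cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return done.stdout
```

Nothing runs at import time; call run_cold_review() from the repository you want reviewed. Passing only the diff is the point: the reviewer session never sees the implementer's conversation.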

Background Monitoring with /loop

Some checks shouldn't block your main session — they should run in the background on a timer. Use /loop <interval> <prompt>. The session keeps running, you keep working, and Claude reports back when something changes.

# Watch the CI pipeline on a feature branch
/loop 5m check if the CI pipeline on branch feat/recommendations passed; report back

# Periodically scan for new failing tests on main
/loop 30m check for any new failing tests on main; only ping me on changes

# Self-paced (Claude decides the cadence based on what it's watching)
/loop check the long-running build at task ID build-7842 and tell me when it finishes
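The "only ping me on changes" behavior is a plain polling pattern: check on an interval, remember the last value, and report only when it differs. The sketch below illustrates that pattern in ordinary Python. It is not how /loop is implemented, and check_status and the other names are hypothetical.

```python
import time
from typing import Callable, Iterator


def poll_for_changes(
    check_status: Callable[[], str],
    interval_seconds: float,
    max_checks: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Call check_status repeatedly and yield a value only when it differs from the last one."""
    last = None
    for i in range(max_checks):
        current = check_status()
        if current != last:
            yield current
            last = current
        if i < max_checks - 1:
            sleep(interval_seconds)


# Demo with canned statuses and a no-op sleep so it finishes instantly.
statuses = iter(["pending", "pending", "running", "running", "passed"])
changes = list(poll_for_changes(lambda: next(statuses), 300, 5, sleep=lambda s: None))
print(changes)
```

Five checks produce three reports, one per distinct status, which is the difference between a useful monitor and a noisy one.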

Scheduled Routines with /schedule

/schedule creates a cron-style remote agent that runs even when your laptop is closed. Common patterns: a Monday-morning triage agent that processes the weekend's PR queue, or a one-time agent in two weeks that opens a cleanup PR for a feature flag you just shipped.

# Recurring routine: weekly PR triage every Monday at 9am
/schedule "0 9 * * 1" "review all open PRs older than 5 days; comment on staleness"

# One-time routine: clean up a feature flag in 2 weeks
/schedule "in 14 days" "open a PR removing the recommendations_v2 feature flag if it's still rolled out 100%"
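Read as a standard five-field cron expression, "0 9 * * 1" means minute 0, hour 9, any day of month, any month, weekday 1 (Monday). The sketch below computes the next matching time for that restricted weekly form only. It is a teaching aid under that assumption, not a full cron parser, and next_weekly_run is a hypothetical helper.

```python
from datetime import datetime, timedelta


def next_weekly_run(cron: str, now: datetime) -> datetime:
    """Next run time for a cron string of the restricted form 'M H * * D'."""
    minute, hour, dom, month, dow = cron.split()
    if dom != "*" or month != "*":
        raise ValueError("this sketch only handles weekly schedules")
    target_dow = int(dow) % 7  # cron: 0 or 7 = Sunday, 1 = Monday
    candidate = now.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    for _ in range(8):
        cron_dow = (candidate.weekday() + 1) % 7  # Python: Monday = 0
        if cron_dow == target_dow and candidate > now:
            return candidate
        candidate += timedelta(days=1)
    raise RuntimeError("no matching day found")


# 2025-01-01 is a Wednesday, so the next Monday 09:00 is 2025-01-06.
print(next_weekly_run("0 9 * * 1", datetime(2025, 1, 1, 12, 0)))
```

Checking an expression this way before scheduling avoids the common off-by-one between cron weekdays (Sunday is 0) and Python weekdays (Monday is 0).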

Remote Control — Async Workflows from Anywhere

Long-running Claude tasks shouldn't tie you to your laptop. Run claude remote-control on your machine and you can connect to that running session from claude.ai or the iOS app. Kick off a 20-minute refactor, close your laptop, check progress from your phone on the way to a meeting. The session runs on your machine; the browser/app is just a window into it.

Voice — /voice + Push-to-Talk

Some workflows are dramatically faster spoken than typed — especially exploratory thinking-out-loud or describing an architecture. Run /voice to enable push-to-talk. Hold space, describe what you want, release. Claude transcribes and responds. The natural use case: long-form context dumps and brainstorming where typing is the bottleneck.

Plugins — sharing your setup

Once you've curated a great set of skills, slash commands, and hooks for your project, you can package them as a plugin. Plugins are installable via the Claude Code marketplace (or your team's private registry) and are how engineering teams standardize agent infrastructure across projects. The same plugin that gives the SRE team a /incident command and a Slack notification hook can be installed in every repo with one command.

Why It Matters

These four primitives — worktrees, two-Claude review, /loop, /schedule — turn Claude Code from a fast assistant into a programmable engineering team. The pattern is always the same: stop doing the work in real-time on your main session. Spawn it, schedule it, parallelize it, or hand it to a fresh reviewer. The compounding effect is what the Anthropic engineering guidance calls "agents over workflows" — once the architecture supports autonomous, parallel work, you stop being the bottleneck.

You now have the full Claude Code surface area: configuration, composition primitives, execution control, and async orchestration. Before we wire them into one end-to-end workflow, here's the cheat sheet of everyday power-user features that turn a working setup into a fast one — keyboard shortcuts, session resume, extended thinking, scheduling, and the workflows you'll use every day.

Power User Features & Keyboard Shortcuts

The Anthropic common workflows doc lists the daily features that separate "I use Claude Code" from "I use Claude Code well." This section is the consolidated cheat sheet — every UX detail, keyboard shortcut, and CLI flag the docs recommend.

Plan Mode — The Full UX

You learned plan mode earlier. Here's how the docs say to actually drive it:

Plan Mode — Cheat Sheet
Action | How
Switch into Plan Mode mid-session | Shift+Tab cycles: Normal → Auto-Accept → Plan Mode
Start a session in Plan Mode | claude --permission-mode plan
Run headless plan | claude --permission-mode plan -p "<query>"
Edit the plan in your editor before accepting | Ctrl+G opens plan in $EDITOR
Make Plan Mode the default | {"permissions": {"defaultMode": "plan"}} in .claude/settings.json
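Setting the default by hand risks overwriting hooks or other keys already in the settings file. A minimal sketch of a safe merge follows, using the key path from the cheat sheet. The helper name is hypothetical, and the demo writes to a throwaway directory rather than a real project.

```python
import json
import tempfile
from pathlib import Path


def set_default_permission_mode(project_dir: str, mode: str = "plan") -> dict:
    """Set permissions.defaultMode in .claude/settings.json, keeping other keys intact."""
    path = Path(project_dir) / ".claude" / "settings.json"
    settings = json.loads(path.read_text()) if path.exists() else {}
    settings.setdefault("permissions", {})["defaultMode"] = mode
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


# Demo against a temporary directory so no real project is touched.
with tempfile.TemporaryDirectory() as demo_dir:
    result = set_default_permission_mode(demo_dir)
    print(result)
```

Reading the existing file first and using setdefault means any hooks or MCP entries already configured survive the change.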

@-File & Resource References

The @ prefix has two distinct uses depending on where you type it. Don't confuse them:

  • In prompts: @src/utils/auth.js includes the file's full content. @src/components shows a directory listing. @github:repos/owner/repo/issues fetches data from a connected MCP server. Adding @file also pulls the CLAUDE.md from that file's directory and parents into context. Multiple references work in one message: "explain the bug in @app.ts that the tests in @app.test.ts are catching."
  • In CLAUDE.md: @README.md imports another file's content into CLAUDE.md at session start. Imports can nest recursively, up to 5 hops deep. Relative paths resolve relative to the file that contains the import. Use @~/.claude/my-prefs.md to share personal instructions across worktrees.
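The two import rules (paths resolve relative to the importing file, and nesting stops after 5 hops) are easier to see in code. The sketch below is a simplified model, not Claude Code's loader: it assumes an import is a whole line of the form @path, and the function name is hypothetical.

```python
import tempfile
from pathlib import Path

MAX_IMPORT_DEPTH = 5  # from the text: recursive imports up to 5 hops


def expand_imports(path: Path, depth: int = 0) -> str:
    """Inline lines of the form '@path', resolving each relative to the importing file."""
    out = []
    for line in path.read_text().splitlines():
        target = line.strip()
        if target.startswith("@") and depth < MAX_IMPORT_DEPTH:
            raw = Path(target[1:]).expanduser()  # handles @~/... references
            imported = raw if raw.is_absolute() else path.parent / raw
            if imported.is_file():
                out.append(expand_imports(imported, depth + 1))
                continue
        out.append(line)
    return "\n".join(out)


# Demo: CLAUDE.md imports docs/style.md, which imports extra.md next to itself.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "CLAUDE.md").write_text("# Project\n@docs/style.md\n")
    (root / "docs" / "style.md").write_text("Use 2-space indentation\n@extra.md\n")
    (root / "docs" / "extra.md").write_text("Prefer small commits\n")
    expanded = expand_imports(root / "CLAUDE.md")
    print(expanded)
```

Note that @extra.md is found in docs/, not at the project root, because the path is resolved from the file that contains the import.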

Extended Thinking — The UX

You learned the concept of extended thinking in M22. Here's the daily UX:

Extended Thinking — Cheat Sheet
Action | How
See Claude's thinking inline | Ctrl+O toggles verbose mode
Toggle thinking for the session | Option+T (mac) / Alt+T (Win/Linux)
Ask for more thinking on one prompt | Include the keyword ultrathink anywhere in your prompt
Adjust effort level | /effort, or CLAUDE_CODE_EFFORT_LEVEL env var
Cap thinking budget | MAX_THINKING_TOKENS=10000 (Opus 4.7 ignores; uses adaptive)
Disable thinking globally | /config, sets alwaysThinkingEnabled: false

Phrases like "think hard" or "think more" are read as regular prompt text, not thinking-budget allocations. Only ultrathink is interpreted as a thinking instruction.

Session Management — Resume, Name, Pick

Sessions persist automatically per project directory. The full vocabulary:

Session Commands & Picker Shortcuts
Goal | CLI / Slash command
Resume the most recent session in this dir | claude --continue
Open the picker | claude --resume · from inside a session: /resume
Resume a session linked to a PR | claude --from-pr 123 (or paste PR URL into picker search)
Name a session at startup | claude -n auth-refactor
Rename mid-session | /rename auth-refactor
Resume by name (across worktrees) | claude --resume auth-refactor
Branch the current session | /branch · /rewind · --fork-session
In picker: search / preview / rename / widen | / · Space · Ctrl+R · Ctrl+A all projects · Ctrl+W worktrees · Ctrl+B branch filter

Best practice: name early. A session renamed with /rename auth-refactor at the start of a distinct task is much easier to find later than one still labeled "explain this function." When the session is too old to fully reload, the picker offers a resume from summary path.

Worktree Extras — Built-In Flag, Includes, Subagent Worktrees

You learned worktrees as a parallel-session primitive. Three docs-recommended details:

  • Built-in flag: claude --worktree feature-auth creates <repo>/.claude/worktrees/feature-auth/ on a new branch worktree-feature-auth. Omit the name for an auto-generated one. Worktrees branch from origin/HEAD — if your main branch changed, run git remote set-head origin -a to re-sync. Add .claude/worktrees/ to your .gitignore.
  • .worktreeinclude: a gitignore-style file at the repo root listing files that should be copied into new worktrees (typically .env.local, config/secrets.json). Only files that match and are gitignored get copied — tracked files are never duplicated.
  • Subagent worktrees: add isolation: worktree to a custom subagent's frontmatter, or ask Claude to "use worktrees for your agents." Each subagent gets its own worktree, automatically cleaned up if no changes were made. Orphaned worktrees from interrupted runs sweep at startup based on cleanupPeriodDays.
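The .worktreeinclude rule has two conditions: a file must match a listed pattern and must be gitignored. The sketch below models that selection logic. It uses fnmatch as a rough stand-in for gitignore-style matching, which is an approximation, and the function name and file sets are hypothetical.

```python
from fnmatch import fnmatch


def files_to_copy(patterns: list[str], gitignored: set[str], tracked: set[str]) -> list[str]:
    """Return paths that match an include pattern AND are gitignored; tracked files are skipped."""
    return sorted(
        path
        for path in gitignored - tracked
        if any(fnmatch(path, pattern) for pattern in patterns)
    )


# README.md is listed but tracked, so it is not copied.
patterns = [".env.local", "config/secrets.json", "README.md"]
gitignored = {".env.local", "config/secrets.json", "node_modules/x.js"}
tracked = {"README.md", "src/app.py"}
print(files_to_copy(patterns, gitignored, tracked))
```

Tracked files never need copying because the new worktree already checks them out from git.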

Scheduling — Pick the Right Tool

The docs list four ways to run Claude on a schedule. Choose by where the task should run:

Scheduling Options — Decision Matrix
Option | Where it runs | Best for
Routines | Anthropic-managed infra | Tasks that must run when your computer is off; API + GitHub triggers
Desktop scheduled tasks | Your machine, via desktop app | Tasks needing local files / uncommitted changes
GitHub Actions | Your CI pipeline | Tied to repo events (PR opens, cron in CI)
/loop | Current CLI session | Quick polling while a session is open

Scheduled tasks run autonomously — the prompt can't ask clarifying questions. Be explicit about success criteria and what to do with results: "Review open PRs labeled needs-review, comment on issues, post a summary in #eng-reviews Slack."

Unix-Style Utility — Headless & Pipes

Claude Code has a first-class non-interactive mode for shell pipelines and CI:

# Pipe data through Claude
cat build-error.txt | claude -p 'concisely explain the root cause' > output.txt

# Use as a linter in package.json
"lint:claude": "claude -p 'check the diff vs main for typos. report filename and line.'"

# Output formats
claude -p '...' --output-format text        # default plain text
claude -p '...' --output-format json        # full conversation log + cost/duration
claude -p '...' --output-format stream-json # real-time streaming JSON objects
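A script consuming the stream-json format needs to parse objects as they arrive rather than waiting for one final document. The sketch below assumes the stream is newline-delimited JSON, one object per line. The event fields in the sample are hypothetical and stand in for whatever the CLI actually emits.

```python
import json
from typing import Iterable, Iterator


def iter_json_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample; the real field names may differ.
sample_stream = [
    '{"type": "example_event", "text": "working"}',
    "",
    '{"type": "example_event", "text": "done"}',
]
events = list(iter_json_events(sample_stream))
print(len(events), events[-1]["text"])
```

Because the function accepts any iterable of lines, the same code works on a subprocess's stdout pipe or on a saved log file.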

Image Input

Three ways to give Claude an image: (1) drag and drop into the Claude Code window. (2) copy and paste with Ctrl+V (do not use Cmd+V). (3) reference a path: "Analyze this: /path/to/image.png." Useful for debugging error screenshots, generating CSS from design mockups, and reading database-schema diagrams. When Claude references an image like [Image #1], Cmd+Click / Ctrl+Click opens it.

Desktop Notifications via the Notification Hook

Long-running tasks shouldn't keep you tab-switching. The Notification hook fires when Claude needs permission, finishes work, or completes auth. Configure once in ~/.claude/settings.json:

{
  "hooks": {
    "Notification": [{
      "matcher": "",
      "hooks": [{
        "type": "command",
        "command": "osascript -e 'display notification \"Claude needs your attention\" with title \"Claude Code\"'"
      }]
    }]
  }
}

Replace the command with notify-send on Linux or a PowerShell MessageBox on Windows. Narrow the matcher to permission_prompt / idle_prompt / auth_success if you only want specific events.
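If the same settings file is shared across macOS and Linux machines, the hook command can point at a small script that picks the right notifier. The sketch below builds the command for each platform. It is an illustration with a hypothetical function name, it leaves Windows out, and it does not escape quotes inside the message.

```python
def build_notify_command(system: str, title: str, message: str) -> list[str]:
    """Return an argv list that shows a desktop notification on macOS or Linux."""
    if system == "Darwin":
        script = f'display notification "{message}" with title "{title}"'
        return ["osascript", "-e", script]
    if system == "Linux":
        return ["notify-send", title, message]
    raise ValueError(f"no notifier configured for {system}")


print(build_notify_command("Linux", "Claude Code", "Claude needs your attention"))
```

In a real hook script, pass platform.system() as the first argument and hand the result to subprocess.run.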

CLAUDE.md Extras the Earlier Section Skipped

Five smaller CLAUDE.md features that didn't fit the main hierarchy diagram but matter at scale:

  • /init — generates a starting CLAUDE.md by analyzing your codebase. If a CLAUDE.md exists, it suggests improvements rather than overwriting. Set CLAUDE_CODE_NEW_INIT=1 for an interactive multi-phase flow that also offers to set up skills and hooks.
  • AGENTS.md compatibility — Claude Code reads CLAUDE.md, not AGENTS.md. If your repo already has an AGENTS.md for other tools, create a CLAUDE.md that just imports it: @AGENTS.md on line 1, then add Claude-specific instructions below.
  • Managed policy CLAUDE.md — org-wide instructions deployed via MDM/Group Policy/Ansible to /Library/Application Support/ClaudeCode/CLAUDE.md (mac), /etc/claude-code/CLAUDE.md (Linux), or C:\Program Files\ClaudeCode\CLAUDE.md (Windows). Cannot be excluded by individual settings.
  • claudeMdExcludes — in monorepos, ancestor CLAUDE.md files from other teams may pollute context. Skip them with glob patterns in .claude/settings.local.json: "claudeMdExcludes": ["**/monorepo/CLAUDE.md", "/home/user/monorepo/other-team/.claude/rules/**"].
  • --add-dir — gives Claude access to directories outside your working dir. By default their CLAUDE.md isn't loaded; set CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD=1 to also load their memory files.
Debugging "Claude Isn't Following My CLAUDE.md"

Three checks, in order: (1) Run /memory — if your CLAUDE.md isn't listed, Claude isn't seeing it (wrong path or excluded). (2) Make instructions specific. "Use 2-space indentation" beats "format code nicely." (3) Look for conflicting instructions across nested CLAUDE.md files — if two say different things for the same behavior, Claude picks one arbitrarily. For deep debugging, configure the InstructionsLoaded hook to log exactly which instruction files load, when, and why.

🎓 Cert Tip — Domain 3

The exam tests your familiarity with the daily workflow primitives, not just the deep architecture. Common stems: "how do you switch into Plan Mode mid-session" → Shift+Tab. "how do you make Claude think more on one prompt" → ultrathink keyword. "how do you resume a session by PR" → --from-pr. Memorize the cheat sheets above; they're the operational vocabulary the cert assumes you know.

End-to-End: Building a Feature with the Full Stack

You've learned every primitive individually. The payoff is composition. Here's a real workflow for shipping a new API endpoint — from idea to merged PR — using all six layers together. The same task that takes 2–3 hours of manual work compresses to about 25 minutes when the system does the orchestration.

The scenario: add a /api/v2/recommendations endpoint to a Node.js API. Personalized content based on user history. Redis caching, auth middleware, tests.

Full Flow — You, Claude Code, SPEC, Subagents, MCP
Sequence (participants: You, Claude Code, SPEC.md, Reviewer, Security, GitHub MCP): "interview me about /recommendations" → AskUserQuestion × 6 → answers (auth, cache, edge cases) → writes SPEC.md → (new session) "implement from SPEC.md" → TDD loop (write tests → implement → fix), with PostToolUse hooks auto-linting each write → Task: spawn code-reviewer subagent → MUST FIX: Redis leak, auth order → applies fixes, re-runs tests (green) → Task: spawn security-auditor subagent → ✅ clean (no injection vectors, no auth gaps) → create PR (full description, links Jira) → PR URL, ready for human review.
Total: ~25 minutes vs 2–3 hours manually. Quality gates ran automatically. PR description wrote itself.

The Six Steps in Detail

  • Step 0: CLAUDE.md is already loaded. Because you set this up once, Claude already knows your stack, test framework, git workflow, and the things it gets wrong. Zero setup per session.
  • Step 1: Interview pattern with AskUserQuestion. Instead of guessing requirements, ask Claude to interview you. "Interview me using AskUserQuestion. Ask about auth, caching strategy, response shape, edge cases, performance constraints. Don't assume anything. When done, write SPEC.md."
  • Step 2: Implementation in a fresh session. New terminal, new session: "implement the spec in SPEC.md, TDD style". Your PostToolUse hook lints every file as Claude writes it. Your PreToolUse hook blocks dangerous commands. You're not babysitting any of this.
  • Step 3: Parallel review subagent. While the main session is still running (or right after it commits): "Use the code-reviewer subagent on the last commit." The subagent reads the diff cold and returns a structured MUST FIX / SHOULD FIX / CONSIDER report.
  • Step 4: Fix and validate. Hand the report back to the main session: "Reviewer found a Redis connection leak and auth middleware in wrong order. Fix both, re-run tests." Tests pass, hooks auto-lint.
  • Step 5: Security audit subagent. "Use the security-auditor subagent on this feature." Scans for injection vectors, exposed secrets, auth gaps, rate-limit gaps. Returns clean (or more fixes).
  • Step 6: Automated PR via GitHub MCP. "Create a PR. Include the spec, what changed and why, test coverage summary, known limitations." Claude uses GitHub MCP to create the PR, links Jira via Jira MCP, requests reviewers.
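The hand-off between the review step and the fix step works best when the report is machine-readable. The sketch below shows one way to bucket a reviewer report and gate the PR on it. The report layout (a severity heading followed by "- item" lines) is an assumption made for illustration, and both function names are hypothetical.

```python
SEVERITIES = ("MUST FIX", "SHOULD FIX", "CONSIDER")


def parse_review(report: str) -> dict[str, list[str]]:
    """Group '- item' lines under the most recent severity heading."""
    buckets: dict[str, list[str]] = {s: [] for s in SEVERITIES}
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        heading = line.rstrip(":").upper()
        if heading in buckets:
            current = heading
        elif line.startswith("- ") and current:
            buckets[current].append(line[2:])
    return buckets


def ready_for_pr(buckets: dict[str, list[str]]) -> bool:
    """The gate passes only when no MUST FIX items remain."""
    return not buckets["MUST FIX"]


report = """MUST FIX:
- Redis connection leak
- Auth middleware in wrong order
SHOULD FIX:
- Add cache TTL test
"""
parsed = parse_review(report)
print(parsed["MUST FIX"], ready_for_pr(parsed))
```

With the two MUST FIX items from the walkthrough present, the gate stays closed until the main session fixes them and the reviewer runs again.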

The Production-Grade File Tree

Every primitive in this module corresponds to a file location. This is the consolidated layout for a well-configured Claude Code project — what your .claude/ directory looks like once all six layers are wired up.

your-project/
├── CLAUDE.md                ← project memory (commit this, <150 lines)
├── CLAUDE.local.md          ← personal overrides (gitignore)
├── .claude/
│   ├── settings.json        ← hooks, models, permissions, MCP
│   ├── agents/
│   │   ├── code-reviewer.md
│   │   ├── test-writer.md
│   │   ├── security-auditor.md
│   │   └── pm-spec.md
│   ├── skills/
│   │   ├── deploy/SKILL.md             ← how we deploy to staging/prod
│   │   ├── database-patterns/SKILL.md  ← our DB conventions
│   │   └── api-design/SKILL.md         ← our API design rules
│   ├── commands/
│   │   ├── review-pr.md          ← /review-pr $ARGUMENTS
│   │   ├── ship.md               ← /ship — full pipeline
│   │   └── diagnose.md           ← /diagnose — debugging workflow
│   └── hooks/
│       ├── block_dangerous.py
│       ├── auto_format.sh
│       └── session_summary.py
├── src/
│   ├── api/CLAUDE.md        ← directory-scoped rules (loaded on demand)
│   └── db/CLAUDE.md         ← DB-specific conventions
The Mindset Shift

Everyone else: "I'll give Claude a task and see how it does."
Top 1%: "I'll design a system where Claude operates effectively with minimum supervision."

That's an infrastructure mindset applied to AI tooling. You invest time upfront writing a tight CLAUDE.md, setting up hooks, and defining subagents, and that investment compounds on every session. The developers shipping the most aren't the best prompters; they're the best system designers. They think about where context degrades and preempt it, which quality gates should be automatic vs. human-reviewed, and which parts of a task can run in parallel vs. serially. It's not about being a better driver; it's about building a better road.

5-Day Rollout Plan — Where to Start Tomorrow

You don't need to do all of this at once. Start with the smallest change that compounds. Here's the prioritized rollout that the top-1% playbook recommends, in order of return-on-effort.

From Day 1 to Compounding — The Rollout Order
  • Day 1, CLAUDE.md: run /init, delete 70%, keep it under 50 lines
  • Day 2, First Hook: PostToolUse on Write runs lint
  • Day 3, 2-Claude Review: a 2nd session reviews the last commit cold
  • Day 4, First Subagent: code-reviewer in .claude/agents/, use on a PR
  • Day 5, First MCP: GitHub MCP, fetch issue context
  • Week 2+, Iterate & Compound: refine CLAUDE.md based on what Claude keeps getting wrong; add domain-specific skills; build the full pipeline. Each iteration compounds, so the system gets smarter every week.

The 1% isn't a fixed group. It's the people who treat this like engineering: systematic, iterative, always improving. The bottleneck isn't the tool. It's whether you'll do the work upfront.
🎓 Cert Tip — Domain 3 (Composition)

The exam tests your ability to recognize which primitive solves a given pain point. Stale context across sessions → CLAUDE.md. Ad-hoc lint enforcement → PostToolUse hook. Repetitive multi-step workflows → slash command. Independent specialist work such as review or a security audit → subagent. Live external data → MCP. Memorize the mapping; it shows up as scenario questions where 3 of 4 answers are technically possible but only 1 matches the right primitive for that pain.

You now have the full Claude Code surface area — configuration, composition, execution control, async orchestration, end-to-end workflow, and a rollout plan. Time to wire it all together in the hands-on exercise.

Hands-On Exercise

What You'll Build

A complete Claude Code project configuration with three-level CLAUDE.md hierarchy, a custom slash command, a skill with context isolation, and a CI/CD review workflow.

Time estimate: 30–45 minutes

Prerequisites: Claude Code installed (npm install -g @anthropic-ai/claude-code), a code project directory to configure, and your ANTHROPIC_API_KEY set.

Files you'll create:

  • ~/.claude/CLAUDE.md — Personal preferences (user-level)
  • .claude/CLAUDE.md — Team standards (project-level)
  • src/api/CLAUDE.md — API-specific rules (directory-level)
  • .claude/commands/check-filing.md — Custom slash command
  • .claude/skills/entity-resolution/SKILL.md — Skill with context fork
  • .github/workflows/claude-review.yml — CI/CD workflow

Environment Setup

# Create project structure
mkdir -p my-ucc-project/src/api
mkdir -p my-ucc-project/.claude/commands
mkdir -p my-ucc-project/.claude/skills/entity-resolution
mkdir -p my-ucc-project/.github/workflows
cd my-ucc-project

# Verify Claude Code is installed
claude --version

# Verify API key is set
echo $ANTHROPIC_API_KEY | head -c 10  # Should print "sk-ant-..." 

Step 1: Create User-Level CLAUDE.md

What & Why: The user-level file holds your personal coding preferences. It applies to every project you open with Claude Code but never gets committed to git, so it does not impose your style on teammates.

Create the file ~/.claude/CLAUDE.md with the following content:

# Personal Claude Code Preferences

## Editor Preferences
- Indentation: 2 spaces
- Line length: 100 characters max
- Trailing commas: always in multi-line structures

## Workflow Preferences
- Run tests after every code change
- Show git diff before committing
- Prefer small, focused commits over large batches

## Communication Style
- Be concise — skip preamble
- If unsure, ask before making assumptions

Run:

cat ~/.claude/CLAUDE.md
Expected output:
# Personal Claude Code Preferences

## Editor Preferences
- Indentation: 2 spaces
- Line length: 100 characters max
...
✅ Checkpoint

If you see your personal preferences printed, Step 1 is working. If not, check the troubleshooting below.

Troubleshooting:

  • If you see No such file or directory → create the directory first: mkdir -p ~/.claude
  • On Windows, the path is %USERPROFILE%\.claude\CLAUDE.md — use type %USERPROFILE%\.claude\CLAUDE.md instead of cat
  • If the file exists but is empty → check you saved the content (some editors don't auto-save new files)

Step 2: Create Project-Level CLAUDE.md

What & Why: The project-level file defines team standards. It gets committed to git so every developer shares the same rules. This is where tech stack, database conventions, and API patterns go.

Create the file .claude/CLAUDE.md in your project root:

# UCC Pipeline Project — Claude Code Configuration

## Project Context
This is a UCC (Uniform Commercial Code) filing data pipeline.
Stack: Python 3.12, FastAPI, PostgreSQL 15, Redis, Docker.

## Coding Conventions
- Use type hints on ALL function signatures
- Docstrings: Google style (Args/Returns/Raises)
- Tests: pytest with fixtures, minimum 80% coverage
- Error handling: never catch bare `except:`, always specific exceptions

## Database Rules
- ALL queries must use parameterized statements (no f-strings in SQL)
- Migrations via Alembic — never modify schema directly

## API Patterns
- Routes: /api/v1/{resource}
- Validation: Pydantic models for all request/response bodies

Run:

cat .claude/CLAUDE.md
Expected output:
# UCC Pipeline Project — Claude Code Configuration

## Project Context
This is a UCC (Uniform Commercial Code) filing data pipeline.
Stack: Python 3.12, FastAPI, PostgreSQL 15, Redis, Docker.
...
✅ Checkpoint

If you see the team-level configuration with tech stack and database rules, Step 2 is working.

Troubleshooting:

  • If you see No such file or directory → make sure you created the file inside .claude/, not at the project root
  • If you accidentally created it at the project root → move it: mv CLAUDE.md .claude/CLAUDE.md
  • If the .claude/ directory doesn't exist → run mkdir -p .claude first

Step 3: Create Directory-Level CLAUDE.md

What & Why: Directory-level files override project-level rules for specific code areas. The API directory might require Pydantic validation on every endpoint, while the database layer requires parameterized queries.

Create src/api/CLAUDE.md:

# API Layer Rules (overrides project-level where applicable)

## Input Validation
- ALL endpoints must validate input with Pydantic models
- Return 422 with field-level error details on validation failure

## Response Format
- Always wrap responses in: { "data": ..., "error": null, "metadata": {} }
- Include request_id in metadata for tracing

## Security
- API key auth via X-API-Key header on all endpoints
- Rate limit: 100 req/min per API key

Run:

cat src/api/CLAUDE.md
Expected output:
# API Layer Rules (overrides project-level where applicable)

## Input Validation
- ALL endpoints must validate input with Pydantic models
...
✅ Checkpoint

If you see the API-specific rules, Step 3 is working. You now have three CLAUDE.md files at three levels. When Claude Code operates inside src/api/, it merges all three, with directory-level rules winning any conflicts.

Troubleshooting:

  • If you see No such file or directory → make sure src/api/ exists: mkdir -p src/api
  • If the file shows the project-level content instead → you may have saved to the wrong path. Check with ls -la src/api/
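To see which of the three files should apply to a given working directory, it can help to list them from broadest to most specific. The sketch below approximates the lookup described in this exercise. It is not Claude Code's actual loader (the /memory command is the authoritative check), and the function name is hypothetical.

```python
import tempfile
from pathlib import Path


def applicable_claude_md(project_root: Path, working_dir: Path, home: Path) -> list[Path]:
    """List existing CLAUDE.md files from broadest to most specific for a working directory."""
    candidates = [home / ".claude" / "CLAUDE.md"]
    candidates += [project_root / ".claude" / "CLAUDE.md", project_root / "CLAUDE.md"]
    current = project_root
    for part in working_dir.relative_to(project_root).parts:
        current = current / part
        candidates.append(current / "CLAUDE.md")
    return [p for p in candidates if p.is_file()]


# Demo: recreate this exercise's layout in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    home, proj = Path(tmp) / "home", Path(tmp) / "my-ucc-project"
    for f in (home / ".claude" / "CLAUDE.md",
              proj / ".claude" / "CLAUDE.md",
              proj / "src" / "api" / "CLAUDE.md"):
        f.parent.mkdir(parents=True, exist_ok=True)
        f.write_text("# rules\n")
    found = [p.relative_to(tmp) for p in applicable_claude_md(proj, proj / "src" / "api", home)]
    print(found)
```

If a file you expect is missing from the list, the path is wrong, which is the most common cause of the "wrong path" failures in the troubleshooting notes above.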

Step 4: Build a Slash Command

What & Why: A slash command creates a reusable workflow you can trigger by typing /check-filing. This one validates a UCC filing parser against known test data.

Create .claude/commands/check-filing.md:

---
description: Validate a UCC filing parser against test data
allowed-tools:
  - Read
  - Grep
  - Glob
argument-hint: filing_type (e.g., UCC-1, UCC-3)
---

# Check Filing Parser

Review the parser for `$ARGUMENTS` filings.

## Checklist
1. Read the parser source for this filing type
2. Check: Does it handle all required fields? (debtor, secured party, collateral, dates)
3. Check: Are edge cases covered? (missing fields, malformed dates)
4. Check: Are there tests with adequate coverage?

## Report Format
- **Status**: PASS / FAIL / NEEDS REVIEW
- **Issues found**: list with severity
- **Missing coverage**: untested scenarios

Run:

cat .claude/commands/check-filing.md | head -3
Expected output:
---
description: Validate a UCC filing parser against test data
allowed-tools:

Then open Claude Code in your project directory and type / — you should see check-filing in the list of available commands.

✅ Checkpoint

If the command appears in the / menu and the YAML frontmatter is correct, Step 4 is working.

Troubleshooting:

  • If the command does not appear in the / menu → check that the file is in .claude/commands/ (not .claude/skills/) and ends with .md
  • If you see YAML parse error → check that the frontmatter uses --- delimiters (three dashes) and proper indentation
  • If the command appears but fails → verify the allowed-tools list uses correct tool names: Read, Grep, Glob (case-sensitive)
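
The frontmatter checks above can also be automated. The sketch below is a deliberately small, standard-library-only parser that handles just the two shapes used in this lab ("key: value" pairs and "- item" lists); it is not a full YAML parser. KNOWN_TOOLS is an assumption limited to the six tools named in this module, and the function names are hypothetical.

```python
# Tool names taken from this module's list of built-in tools (assumption:
# other tools exist and would need to be added here).
KNOWN_TOOLS = {"Read", "Write", "Edit", "Bash", "Grep", "Glob"}


def parse_frontmatter(text):
    """Parse a simple frontmatter block delimited by --- lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file must start with a --- delimiter")
    data = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return data                      # closing delimiter reached
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list item belongs to the most recent key with an empty value.
            if current_key is None or not isinstance(data[current_key], list):
                raise ValueError(f"list item without a list key: {line!r}")
            data[current_key].append(stripped[2:].strip())
        else:
            key, sep, value = stripped.partition(":")
            if not sep:
                raise ValueError(f"expected 'key: value', got {line!r}")
            current_key = key.strip()
            value = value.strip()
            data[current_key] = value if value else []
    raise ValueError("missing closing --- delimiter")


def unknown_tools(frontmatter):
    """Return allowed-tools entries that are not exact, case-sensitive matches."""
    tools = frontmatter.get("allowed-tools", [])
    if isinstance(tools, str):
        tools = [t.strip() for t in tools.split(",")]
    return [t for t in tools if t not in KNOWN_TOOLS]


sample = """---
description: Validate a UCC filing parser against test data
allowed-tools:
  - Read
  - grep
  - Glob
---
# Check Filing Parser
"""
print(unknown_tools(parse_frontmatter(sample)))  # ['grep']
```

The lowercase grep is flagged because tool names are case-sensitive.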

Step 5: Build a Skill with Context Isolation

What & Why: Skills with context: fork run in an isolated context window, so exploring dozens of files does not clutter your main session. This skill does entity resolution — searching for name matches across UCC filings.

Create .claude/skills/entity-resolution/SKILL.md:

---
name: entity-resolution
description: Analyze entity names across UCC filings to find matches and variations. Use when asked to resolve, match, or deduplicate entity names.
context: fork
allowed-tools:
  - Read
  - Grep
  - Glob
---

# Entity Resolution Analysis

Given entity name `$ARGUMENTS`, search for:
1. Exact matches in filing records
2. Variations (abbreviations, misspellings, legal suffixes)
3. Related entities (parent/subsidiary relationships)

Return a structured summary with match counts and a recommended canonical name.
Do NOT modify any files. Return only the analysis summary.

Run:

cat .claude/skills/entity-resolution/SKILL.md | head -5
Expected output:
---
name: entity-resolution
description: Analyze entity names across UCC filings to find matches and variations. Use when asked to resolve, match, or deduplicate entity names.
context: fork
allowed-tools:

Then in Claude Code, type "resolve entity Acme Corp" — the skill should trigger automatically based on its description matching your request. You can also invoke it explicitly with /entity-resolution Acme Corp.

✅ Checkpoint

If the skill is available as /entity-resolution and auto-triggers when you mention entity matching, Step 5 is working.

Troubleshooting:

  • If the skill does not appear → check that the file is named exactly SKILL.md (uppercase SKILL, lowercase .md extension) inside .claude/skills/entity-resolution/
  • If the skill appears but does not auto-trigger → make sure the description field in the frontmatter contains keywords that match your request (e.g., "resolve", "match", "entity")
  • If the skill runs but pollutes your main context → verify context: fork is set in the YAML frontmatter
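
The effect of context: fork can be pictured with a toy model in which a context window is just a list of entries. This is a conceptual sketch, not how Claude Code is implemented. It assumes the fork starts with only the task and that nothing but the final summary flows back. The function names, file paths, and summary text are hypothetical.

```python
def run_forked(task, explore):
    """Run `explore` in a separate context and return only its summary."""
    forked_context = [f"task: {task}"]   # assumption: the fork starts with just the task
    summary = explore(forked_context)    # exploration appends freely to the fork
    return summary                       # the fork is discarded; only the summary survives


def explore_filings(context):
    # Hypothetical exploration: pretend to read 80 files, logging each one.
    for i in range(80):
        context.append(f"read filings/file_{i}.json")
    return "summary: match counts and a recommended canonical name"


main_context = ["user: resolve entity Acme Corp"]
summary = run_forked("entity-resolution Acme Corp", explore_filings)
main_context.append(f"skill result: {summary}")

print(len(main_context))  # 2
```

The main session grows by one entry instead of eighty, which is the point of isolating exploratory work.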

Step 6: Configure CI/CD Review Workflow

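What & Why: This GitHub Actions workflow runs Claude Code on every PR with session isolation: one session summarizes the changes and a separate session reviews them. Because the reviewing session never sees the summarizing session's reasoning, it has to judge the diff on its own, which guards against confirmation bias.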

Create .github/workflows/claude-review.yml using the GitHub Actions code from the CI/CD section above. Ensure your repository has ANTHROPIC_API_KEY set in GitHub Secrets.

Run:

cat .github/workflows/claude-review.yml | head -5
Expected output:
# .github/workflows/claude-review.yml
# Automated PR review using Claude Code with session isolation.
# Session A generates a summary; Session B reviews independently.
name: Claude Code PR Review

Validate the YAML syntax:

python -c "import yaml; yaml.safe_load(open('.github/workflows/claude-review.yml')); print('YAML is valid')"
Expected output:
YAML is valid
✅ Checkpoint

If you see "YAML is valid" and the file header matches, Step 6 is working.

Troubleshooting:

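  • If the YAML check fails with ModuleNotFoundError: No module named 'yaml': install PyYAML first with pip install pyyaml.
  • If the workflow does not trigger: ensure on: pull_request is at the top level, not nested under jobs.
  • If Claude Code fails in CI: check that ANTHROPIC_API_KEY is set in repository Settings → Secrets → Actions.
  • If npm install -g fails in CI: the ubuntu-latest runner ships with Node.js, but if the preinstalled version is too old for the package, add an actions/setup-node@v4 step before the install.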
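
Session isolation comes down to making two independent non-interactive invocations and never feeding one session's output into the other. The sketch below shows that shape in Python rather than workflow YAML. It uses only the -p and --output-format json flags covered in this module and assumes each claude -p call starts a fresh session. The function names and prompt wording are hypothetical, and the injectable run parameter exists so the logic can be exercised without the CLI installed.

```python
import subprocess


def build_claude_command(prompt):
    """Build a non-interactive Claude Code invocation with JSON output."""
    return ["claude", "-p", prompt, "--output-format", "json"]


def review_pull_request(diff, run=subprocess.run):
    """Summarize and review the same diff in two independent invocations."""
    summary_cmd = build_claude_command(f"Write a summary of this diff:\n{diff}")
    # The review prompt is built from the diff only. The summary session's
    # output is deliberately never passed in.
    review_cmd = build_claude_command(f"Review this diff for bugs and risks:\n{diff}")
    summary = run(summary_cmd, capture_output=True, text=True, check=True).stdout
    review = run(review_cmd, capture_output=True, text=True, check=True).stdout
    return summary, review


# Show the command shape without actually calling the CLI.
print(build_claude_command("Write a summary of this diff: <diff text>"))
```

Keeping the two prompts independent is the design choice that matters here; how the results are posted back to the PR is left to the workflow.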

Verify Everything Works

Run these checks to confirm your configuration is complete:

# Verify all files exist
echo "=== Checking CLAUDE.md hierarchy ==="
test -f ~/.claude/CLAUDE.md && echo "✓ User-level" || echo "✗ Missing ~/.claude/CLAUDE.md"
test -f .claude/CLAUDE.md && echo "✓ Project-level" || echo "✗ Missing .claude/CLAUDE.md"
test -f src/api/CLAUDE.md && echo "✓ Directory-level" || echo "✗ Missing src/api/CLAUDE.md"

echo "=== Checking commands and skills ==="
test -f .claude/commands/check-filing.md && echo "✓ Command" || echo "✗ Missing command"
test -f .claude/skills/entity-resolution/SKILL.md && echo "✓ Skill" || echo "✗ Missing skill"

echo "=== Checking CI/CD ==="
test -f .github/workflows/claude-review.yml && echo "✓ Workflow" || echo "✗ Missing workflow"
Expected output:
=== Checking CLAUDE.md hierarchy ===
✓ User-level
✓ Project-level
✓ Directory-level
=== Checking commands and skills ===
✓ Command
✓ Skill
=== Checking CI/CD ===
✓ Workflow
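
If your shell does not support test -f (for example, on Windows without a POSIX shell), the same checks can be sketched in Python. The file paths mirror the shell script above; the function name check_setup is hypothetical.

```python
from pathlib import Path


def check_setup(project_root, home):
    """Return {label: True/False} for each file this lab should have created."""
    project_root, home = Path(project_root), Path(home)
    expected = {
        "User-level": home / ".claude" / "CLAUDE.md",
        "Project-level": project_root / ".claude" / "CLAUDE.md",
        "Directory-level": project_root / "src" / "api" / "CLAUDE.md",
        "Command": project_root / ".claude" / "commands" / "check-filing.md",
        "Skill": project_root / ".claude" / "skills" / "entity-resolution" / "SKILL.md",
        "Workflow": project_root / ".github" / "workflows" / "claude-review.yml",
    }
    return {label: path.is_file() for label, path in expected.items()}


if __name__ == "__main__":
    # Run from the project root, like the shell version.
    for label, ok in check_setup(Path.cwd(), Path.home()).items():
        print(("✓ " if ok else "✗ Missing ") + label)
```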
Congratulations!

You have a fully configured Claude Code project with a three-level CLAUDE.md hierarchy, a custom command, a context-isolated skill, and a CI/CD review pipeline. This covers the core of Domain 3 on the certification exam.

Stretch goal (optional): Add a batch processing script that extracts structured data from 100 UCC filing documents using the Message Batches API from the Batch Processing section above.
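
As a starting point for the stretch goal, the sketch below builds the request list for 100 documents without calling the API. Each entry pairs a custom_id with the params of an ordinary Messages request, which is the shape the Message Batches API expects. The helper name, the prompt wording, the placeholder document text, and the YOUR_MODEL_ID string are all assumptions; substitute the model and prompt from the Batch Processing section.

```python
def build_batch_requests(documents, model, max_tokens=1024):
    """Turn {doc_id: text} into a list of Message Batches request entries."""
    requests = []
    for doc_id, text in documents.items():
        requests.append({
            "custom_id": doc_id,  # used later to match each result to its document
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{
                    "role": "user",
                    "content": (
                        "Extract debtor, secured party, collateral, and dates "
                        "as JSON from this UCC filing:\n\n" + text
                    ),
                }],
            },
        })
    return requests


# Placeholder documents; a real script would load the filing text from disk.
documents = {f"filing-{i:03d}": f"(text of filing {i})" for i in range(100)}
requests = build_batch_requests(documents, model="YOUR_MODEL_ID")
print(len(requests), requests[0]["custom_id"])  # 100 filing-000
```

With the official Python SDK, a list like this would be passed to client.messages.batches.create(requests=requests). Results are then retrieved later and matched back to documents by custom_id.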

The Three Approaches — Your Complete Toolkit

You now have three ways to build agents.

Approach 1 — Raw API Loop (from M15B): 250 lines and full control. You drive every iteration of the loop, inspect stop_reason, and execute tools yourself.

Approach 2 — Agent SDK (from M26): 40 lines with hooks and sessions. The SDK runs the loop while you focus on tool definitions, hooks for guardrails, and sessions for state.

Approach 3 — Spec-Driven (from this module): 100 lines of spec and Claude Code generates everything. You describe what the agent should do; Claude builds the implementation, and you review and iterate on the spec.

Each builds on the one before. You cannot debug Approach 3 without understanding Approach 1.
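
Because Approach 1 underlies the other two, it helps to keep its shape in mind. The sketch below is a compressed version of the raw loop: call the model, check stop_reason, run any requested tools, and send the results back. The message and tool_use/tool_result block shapes follow the Messages API. The scripted fake_model stands in for a real API call so the sketch runs without a key, and the function names and the add tool are hypothetical.

```python
def run_agent_loop(create_message, tools, user_prompt, max_turns=10):
    """Drive the loop by hand: call the model, run requested tools, feed results back."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        response = create_message(messages)
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] != "tool_use":
            return response, messages        # model finished (e.g. "end_turn")
        results = []
        for block in response["content"]:
            if block["type"] == "tool_use":
                # Execute the requested tool ourselves and record its output.
                output = tools[block["name"]](**block["input"])
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block["id"],
                    "content": str(output),
                })
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")


def fake_model(messages):
    """Scripted stand-in for the model: one tool call, then a final answer."""
    if len(messages) == 1:
        return {"stop_reason": "tool_use", "content": [
            {"type": "tool_use", "id": "toolu_1", "name": "add",
             "input": {"a": 2, "b": 3}},
        ]}
    return {"stop_reason": "end_turn",
            "content": [{"type": "text", "text": "The sum is 5."}]}


final, history = run_agent_loop(fake_model, {"add": lambda a, b: a + b}, "What is 2 + 3?")
print(final["content"][0]["text"])  # The sum is 5.
print(len(history))                 # 4
```

The four history entries are the user prompt, the assistant's tool request, the tool result, and the final answer. The Agent SDK and spec-driven approaches run this same cycle for you.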

CAPSTONE-7 is where you prove this by building the SAME agent all three ways and comparing code size, development time, and flexibility. That comparison is the graduation exercise of this course.

Knowledge Check

Q1: Where should personal code style preferences (indentation, line length) go?

A. User-level: ~/.claude/CLAUDE.md
B. Project-level: .claude/CLAUDE.md
C. Directory-level: src/CLAUDE.md
D. In the system prompt of every API call
Correct! Personal preferences belong at user-level (~/.claude/CLAUDE.md) so they apply to all projects but don't impose your style on teammates. Project-level is for team standards.
Not quite. Personal preferences go in user-level (~/.claude/CLAUDE.md). Putting them in project-level forces your style on the whole team. Directory-level is for path-specific rules.

Q2: What does context: fork do in a skill definition?

A. Creates a git branch for the skill's code changes
B. Runs the skill in a Docker container for security
C. Runs the skill in an isolated context window so exploration noise does not pollute the main session
D. Splits the request between two Claude models for consensus
Correct! context: fork gives the skill its own context window. Files read, searches performed, and intermediate reasoning stay in the forked context. Only the final result returns to the main session.
Not quite. context: fork creates an isolated context window for the skill. It is not about git branches, containers, or model splitting — it is about keeping the main conversation clean while the skill does exploratory work.

Q3: When should you use plan mode instead of direct execution?

A. Always: plan mode is always safer
B. For complex, risky, or unfamiliar tasks, not for simple well-understood changes
C. Only when working with databases
D. Never: direct execution is always faster and better
Correct! Plan mode adds overhead. Use it when the complexity, unfamiliarity, or risk justifies the cost. Simple renames or well-understood bug fixes do not need planning.
Not quite. Plan mode is not "always" or "never" — it depends on complexity, familiarity, risk, and reversibility. A simple rename? Direct. A complex multi-file refactor in an unfamiliar codebase? Plan mode.

Q4: You need to find which files in your project contain the function validateInput. Which tool should you use?

A. Glob: it searches file patterns
B. Read: read every file and search manually
C. Bash: run grep -r validateInput .
D. Grep: it searches file contents by pattern
Correct! Grep searches file contents. Glob finds files by name pattern. You need to search INSIDE files for the function name, so Grep is the right tool. Always prefer the dedicated tool over Bash for search tasks.
Not quite. Glob searches by file name (e.g., "*.ts"), not content. Read requires you to know which file to look in. Bash works but the dedicated Grep tool is preferred. Grep searches file contents by pattern — exactly what you need.

Q5: What is the critical anti-pattern with CI/CD code review?

A. Using Claude Code for review at all: it's not reliable enough
B. Running reviews too frequently: once per day is enough
C. Same-session self-review: the reviewer retains the generator's reasoning, creating confirmation bias
D. Using JSON output format: it loses nuance compared to plain text
Correct! Same-session self-review creates confirmation bias because the reviewer retains the generator's reasoning context. Always use SEPARATE sessions for generation and review in CI/CD pipelines.
Not quite. The critical anti-pattern is same-session self-review. When the same session generates code and then reviews it, the reviewer is biased by the generator's reasoning context. Always use separate sessions.

Q6: When should you use the Message Batches API instead of synchronous calls?

A. Latency-tolerant, high-volume, cost-sensitive tasks (nightly analysis, bulk extraction)
B. Real-time chatbot responses where speed matters
C. Any request over 1000 tokens to save money
D. Only for image processing tasks
Correct! Batch is for non-urgent, high-volume, cost-sensitive work. The 50% cost reduction is significant at scale, but results can take up to 24 hours to arrive, which makes it unsuitable for anything user-facing or time-sensitive.
Not quite. Batch API is for latency-tolerant, high-volume tasks — not real-time responses. A user waiting for a chatbot reply should never hit the batch API. Use it for nightly analysis, bulk extraction, or dataset labeling.

Q7: A developer creates a command to explore a large codebase, reading 80+ files. What should they use instead?

A. A longer command with more instructions
B. A skill with context: fork so exploration noise stays isolated
C. Multiple smaller commands chained together
D. Direct execution without any command
Correct! A skill with context: fork runs in an isolated context window. The 80+ files are read in the forked context and only the analysis summary returns to the main session, keeping it clean.
Not quite. A command that reads 80+ files dumps all that content into your main context window. A skill with context: fork keeps the exploration isolated and returns only the summary.

Q8: In the CLAUDE.md hierarchy, what happens when user-level says "indent: 4 spaces" and project-level says "indent: 2 spaces"?

A. Error: conflicting rules are not allowed
B. User-level wins because it was set first
C. Both apply simultaneously: Claude uses 3 spaces as a compromise
D. Project-level wins: more specific overrides more general
Correct! Like CSS cascading, more specific wins. Project-level is more specific than user-level. If a directory-level CLAUDE.md also existed, it would override both. All levels merge, but conflicts are resolved by specificity.
Not quite. CLAUDE.md uses cascading: all levels merge, and more specific overrides more general. Project-level (more specific) beats user-level (more general). No errors, no compromises — the most specific rule wins.

Module Summary

  • CLAUDE.md Hierarchy: User-level (personal) → project-level (team) → directory-level (path-specific). More specific wins. Use @path/to/file imports for shared rules.
  • Commands vs Skills: Commands are triggered manually with /name. Skills can also be invoked automatically from their description, and context: fork runs them in an isolated context window. Use skills for complex exploration tasks.
  • Plan Mode: Use for complex, risky, or unfamiliar tasks. Skip for simple, well-understood changes. Decision depends on complexity, familiarity, risk, and reversibility.
  • Built-in Tools: Glob (find files by name) → Grep (search file contents) → Read (understand) → Edit (change) → Bash (test). Know which tool fits which task.
  • CI/CD: Use -p for non-interactive mode, --output-format json for structured output. Always use separate sessions for generation and review.
  • Batch API: 50% cost reduction for high-volume, latency-tolerant tasks. Results can take up to 24 hours. Not for real-time user-facing responses.

Next up: M26: Hooks, Sessions & Agent SDK covers event-driven automation, session management, and building custom agents with the Claude Agent SDK.

References & Resources