M25: Claude Code Mastery
Master Claude Code configuration, workflows, and CI/CD integration — covering Domain 3 of the Claude Certified Architect exam (~20% of the exam).
Learning Objectives
- Configure CLAUDE.md files at user, project, and directory levels with correct cascade behavior
- Build custom slash commands and skills with context isolation and tool restrictions
- Choose between plan mode and direct execution based on task complexity and risk
- Use Claude Code's built-in tools (Read, Write, Edit, Bash, Grep, Glob) effectively
- Integrate Claude Code into CI/CD pipelines with session isolation for unbiased review
- Use the Message Batches API for high-volume, cost-efficient processing
- Choose the right permission mode (default, acceptEdits, plan, bypassPermissions) for the risk profile of each session
- Compose long-running and parallel work using /loop, /schedule, worktrees, and the two-Claude review pattern
Claude Code Is an Orchestration Framework, Not a Coding Assistant
Most developers treat Claude Code like a faster Stack Overflow — type a question, get code, copy-paste. That's the wrong mental model. Claude Code is an agent orchestration framework that happens to be excellent at coding. Once you see the layers, you stop fighting the tool and start composing it.
The compounding effect is what separates a "fancy autocomplete" from a programmable engineering team. CLAUDE.md alone makes Claude smarter about your codebase. Hooks alone make every change auto-linted. Subagents alone give you parallel specialists. Combined, they let you describe a feature once and have Claude implement, review, security-audit, and PR it without you copy-pasting between sessions. This module teaches you each layer; the rest is composition.
Claude Code vs The Rest of the AI Coding Landscape
The AI coding landscape splits into three tiers based on autonomy. Knowing where Claude Code sits relative to Copilot, Cursor, and SWE-agents like Devin is how you decide which tool fits which task — and why composing Claude Code's six layers gives you something the others structurally can't.
Copilot and Cursor are tools you use — they cap out at making you faster at your current workflow. Claude Code is a system you configure and orchestrate — it can transform what your workflow is. Devin and SWE-agents trade steerability for autonomy — great for narrow benchmarks, harder to course-correct mid-task. The Claude Code sweet spot: autonomous enough to hand off complete features, steerable enough that you stay in the architectural loop.
CLAUDE.md is the only file loaded automatically every time; it is how you give Claude permanent memory. Get this right and the other five layers build on a solid foundation.
CLAUDE.md Configuration Hierarchy
CLAUDE.md files work like CSS cascading. Think of three layers: a website's global stylesheet (body tag defaults), a page-level class (overrides for specific sections), and an inline style (overrides for a single element). The most specific rule always wins.
Without cascading, every developer on a team would need to copy-paste the same rules into every project and every directory. Changes would mean updating dozens of files. Worse, personal preferences (like tab width or test framework) would conflict with team standards.
CLAUDE.md cascading solves this: user-level is your personal global stylesheet (applied everywhere you use Claude Code), project-level is the team's class rules (committed to git, shared by everyone), and directory-level is the inline override (specific rules for specific code areas). More specific always wins, and all levels combine together.
Here is what the cascade looks like in practice. Suppose your user-level file says indent: 4 spaces, but the project says indent: 2 spaces, and the API directory says indent: tabs. When Claude Code edits a file in src/api/, the effective rule is indent: tabs — the directory-level wins. When it edits a file in src/models/ (no directory-level override), the project rule wins: indent: 2 spaces.
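The resolution in this example can be modeled in a few lines. This is a toy sketch of the article's scenario, not how Claude Code is implemented: the `LEVELS` data and the `applies` and `effective_rule` helpers are hypothetical, and real CLAUDE.md files contain free-form prose rather than key/value pairs.

```python
from pathlib import PurePosixPath

# Hypothetical rule sets, ordered from least to most specific.
# "scope" is the directory a level covers; None means "everywhere".
LEVELS = [
    {"name": "user", "scope": None, "rules": {"indent": "4 spaces"}},
    {"name": "project", "scope": ".", "rules": {"indent": "2 spaces"}},
    {"name": "directory", "scope": "src/api", "rules": {"indent": "tabs"}},
]


def applies(scope, file_path):
    """Return True if a level with this scope covers file_path."""
    if scope is None or scope == ".":
        return True
    return PurePosixPath(file_path).is_relative_to(scope)


def effective_rule(key, file_path):
    """Walk levels from general to specific; the last match wins."""
    value = None
    for level in LEVELS:
        if applies(level["scope"], file_path) and key in level["rules"]:
            value = level["rules"][key]
    return value


print(effective_rule("indent", "src/api/handlers.py"))   # tabs
print(effective_rule("indent", "src/models/filing.py"))  # 2 spaces
```

Because every level is consulted and only conflicting keys are overridden, a rule that exists at just one level still applies, which matches the "all levels combine" behavior described above.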
CLAUDE.md is a markdown configuration file that tells Claude Code how to behave in a specific context: coding conventions, project context, and tool restrictions. Named after the AI model, it is placed at strategic locations in your file system to control behavior at different scopes. There are three levels, and they merge together with more-specific levels overriding more-general ones:
User-level (~/.claude/CLAUDE.md): Your personal preferences that apply to every project. Things like: "I prefer 2-space indentation," "always use TypeScript strict mode," "run tests after every code change." These are YOUR rules, not your team's.
Project-level (.claude/CLAUDE.md): Team standards committed to git. Things like: "This project uses PostgreSQL 15," "API routes follow the pattern /api/v1/{resource}," "use the company's internal auth library for all endpoints." Everyone on the team gets these rules automatically.
Directory-level (src/api/CLAUDE.md): Path-specific overrides for specialized code areas. For example, the API directory might say "all handlers must validate input with Zod schemas" while the database directory says "all queries must use parameterized statements."
You can also use @import syntax to pull in shared rules from other files, and create topic-specific rule files in .claude/rules/ for better organization.
CLAUDE.md has a practical limit of roughly 150–200 instructions. The Claude Code system prompt already consumes ~50 of them, which leaves about 100–150 for your own rules. Every line you add that Claude doesn't actually need dilutes the lines that matter, and files over ~200 lines cause instruction dropout: Claude decides parts "may not be relevant" and silently ignores them.
The CLAUDE.md test: before committing any line, ask: "Would Claude make a mistake on my codebase without this?" If the answer is no, delete the line. Document only what Claude gets wrong, not what it already does correctly.
The Four Anti-Patterns That Kill Compliance
Every team that struggles with "Claude isn't following our CLAUDE.md" is hitting one of these four failure modes. Each one is a budget waste that pushes the file past the dropout threshold or leaves Claude with nowhere to go.
Files over ~200 lines trigger instruction dropout: Claude silently decides chunks "may not be relevant" and ignores them. Fix: push module-specific rules into directory-level CLAUDE.md (the file-location hierarchy above). Root file stays under 150 lines.
"Use TypeScript" in a TypeScript project is wasted budget — Claude already does that. Every line you spend confirming default behavior is a line you didn't spend on a correction. Fix: spend budget on corrections, never confirmations. Run the CLAUDE.md test on every line.
"Never use the --foo-bar flag" leaves Claude stuck — it knows what not to do but has nowhere to go. Fix: always pair a prohibition with an alternative. "Never use --foo-bar; prefer --baz instead because it handles unicode correctly." Now Claude has somewhere to land.
CLAUDE.md is advisory — the model can deprioritize lines under context pressure. If something must always happen (attribution, permission scopes, model selection, hooks), put it in settings.json. Rule: CLAUDE.md is for guidance. settings.json is for guarantees. The hook system you'll see in M26 is how you turn a "rule" into deterministic enforcement.
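The file-length anti-pattern above lends itself to an automated check. The following is a minimal sketch, assuming you want to flag any CLAUDE.md whose non-blank line count exceeds the 150-line target; the `oversized_claude_files` helper is hypothetical, and applying one limit to every level is a simplification of the article's "root file" guidance.

```python
import tempfile
from pathlib import Path

ROOT_LIMIT = 150  # the article's target for the root file


def oversized_claude_files(repo_root, limit=ROOT_LIMIT):
    """Return (relative_path, line_count) for each CLAUDE.md over the limit."""
    offenders = []
    for path in sorted(Path(repo_root).rglob("CLAUDE.md")):
        text = path.read_text(encoding="utf-8")
        # Count only non-blank lines; blank lines carry no instructions.
        count = sum(1 for line in text.splitlines() if line.strip())
        if count > limit:
            offenders.append((path.relative_to(repo_root).as_posix(), count))
    return offenders


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CLAUDE.md").write_text("- rule\n" * 180, encoding="utf-8")
    (root / "src" / "api").mkdir(parents=True)
    (root / "src" / "api" / "CLAUDE.md").write_text("- rule\n" * 20, encoding="utf-8")
    report = oversized_claude_files(root)

print(report)  # [('CLAUDE.md', 180)]
```

A check like this could run in CI or a pre-commit step, so that growth past the threshold is noticed before compliance degrades.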
Minimum Viable CLAUDE.md
Before adding rules, start here. This is the under-30-line template that captures everything Claude actually needs and nothing it doesn't. Resist the urge to expand it until you've shipped a feature and noticed Claude getting something wrong.
# Project: [Name]
## Stack
- Node.js 22, TypeScript 5.4, Fastify 4
- PostgreSQL 16 + Drizzle ORM
- Redis 7 for caching
- Jest for testing
See @package.json for all dependencies.
See @docs/architecture.md for system design.
## How to work on this project
- Run tests: `npm test`
- Run single test: `npm test -- --testPathPattern=auth`
- Typecheck: `npm run typecheck`
- Lint: `npm run lint`
## Things Claude tends to get wrong here
- Always use ESM imports (not CommonJS require)
- Redis keys must include version prefix: `v2:user:{id}:...`
- Auth middleware must run BEFORE rate limiting in route registration
- All DB queries go through the service layer, never directly in routes
## Git workflow
- Never commit to main directly
- Branch naming: `feat/`, `fix/`, `chore/`
- Commit messages: conventional commits format
Notice three discipline patterns. (1) The Stack section references package.json via @import instead of duplicating dependency lists. (2) "How to work on this project" gives Claude the exact commands — not "run tests somehow," but npm test -- --testPathPattern=auth. (3) "Things Claude tends to get wrong" is named that way on purpose — it forces every line to pass the CLAUDE.md test before being added. Under 30 lines, all signal, zero dropout risk.
Let's look at what a real-world project-level CLAUDE.md contains. Notice how it is not just formatting preferences — it includes the tech stack, database rules, API patterns, and @import directives that pull in separate rule files for security and testing. This is the kind of context that saves Claude from making wrong assumptions about your project.
Project-level file (.claude/CLAUDE.md):
# UCC Pipeline Project — Claude Code Configuration
## Project Context
This is a UCC (Uniform Commercial Code) filing data pipeline.
Stack: Python 3.12, FastAPI, PostgreSQL 15, Redis, Docker.
Architecture: Medallion (Bronze → Silver → Gold layers).
## Coding Conventions
- Use type hints on ALL function signatures
- Docstrings: Google style (Args/Returns/Raises)
- Tests: pytest with fixtures, minimum 80% coverage
- Error handling: never catch bare `except:`, always specific exceptions
- Logging: structured JSON via structlog
## Database Rules
- ALL queries must use parameterized statements (no f-strings in SQL)
- Migrations via Alembic — never modify schema directly
- Table names: snake_case, singular (e.g., `ucc_filing`, not `ucc_filings`)
## API Patterns
- Routes: /api/v1/{resource}
- Auth: Bearer token via X-API-Key header
- Responses: always wrap in {data, error, metadata}
- Validation: Pydantic models for all request/response bodies
@.claude/rules/security.md
@.claude/rules/testing.md
User-level file (~/.claude/CLAUDE.md):
# Personal Claude Code Preferences
## Editor Preferences
- Indentation: 2 spaces (overridden by project if different)
- Line length: 100 characters max
- Trailing commas: always in multi-line structures
## Workflow Preferences
- Run tests after every code change
- Show git diff before committing
- Prefer small, focused commits over large batches
- Always explain WHY a change was made, not just WHAT
## Communication Style
- Be concise — skip preamble
- When showing code, highlight the changed lines
- If unsure, ask before making assumptions
You saw how CLAUDE.md at two levels serves different purposes. The project-level file contains team-wide rules that everyone shares (database patterns, API conventions, tech stack). The user-level file contains personal preferences (indentation, commit style). When Claude Code loads, it merges all applicable files, with more-specific rules winning conflicts.
Anti-pattern: putting personal editor preferences (indentation, line length) in the project-level .claude/CLAUDE.md. These belong in user-level ~/.claude/CLAUDE.md. The exam tests whether you can distinguish personal preferences from team standards and place each at the correct level.
Custom Slash Commands vs Skills
Commands are like speed-dial buttons on a phone. You press the button, it dials a specific number. You have to press it manually every time. It is fast and predictable, but it only does exactly what you programmed.
The pain with only having commands is that complex exploration tasks dump noise into your main conversation. If you run a command that reads 50 files looking for a pattern, all of those file contents are now in your context window — cluttering up the conversation you were having.
Skills solve this. A skill with context: fork is like delegating a task to an assistant in another room. They do their research, come back with a summary, and your workspace stays clean. The exploration noise stays in the forked context; only the result comes back to you.
Here is what that looks like in your terminal.
Without a fork, you would see: [Read file_1.py] ... 200 lines ... [Read file_2.py] ... 350 lines ... [Read file_3.py] ... 180 lines ... Result: 3 issues found. All 730 lines are dumped into your session.
With context: fork, you see: [Skill: entity-resolution running...] ... Result: 3 issues found. Exact matches: 2 (filing NY-2024-001234, NY-2024-005678). Likely match: 1 ("ACME CORP" → 87% similarity). Just the summary. Clean.
Slash commands: Markdown files placed in .claude/commands/ that create custom / commands in Claude Code. When you create .claude/commands/deploy.md, it becomes the /deploy command, which you invoke explicitly by typing /deploy. The file contains the prompt that Claude will execute. Commands support $ARGUMENTS placeholders for user input and YAML frontmatter for configuration.
Each command file supports YAML frontmatter at the top with four optional fields: allowed-tools (restrict which tools the command can use), argument-hint (show users what input to provide), model (override the default model), and description (explain what the command does in the / menu).
Skills: A directory in .claude/skills/ containing a SKILL.md file. Skills are enhanced slash commands and the modern replacement for commands when you need isolation or automatic triggering. They have all the capabilities of commands plus two critical extras.
First, context: fork — the skill runs in an isolated context window, so exploration noise does not pollute your main session. Second, automatic invocation — Claude can invoke the skill automatically when the user's request matches the skill's description. You do not need to type a slash command; Claude recognizes the intent and triggers the skill on its own.
The key distinction is context isolation. When a skill runs with context: fork, it gets a separate context window: all the files it reads, the searches it performs, and its intermediate reasoning stay in the forked context, and only the final result is returned to the main session. Without it, a command that reads 50 files dumps all that content into your main conversation. With context: fork, the skill reads those files in its own context, analyzes them, and returns only the summary. Your main session stays focused.
Here is a skill definition with context isolation and tool restrictions.
---
name: entity-resolution
description: Analyze entity names across UCC filings to find matches and variations. Use when asked to resolve, match, or deduplicate entity names.
context: fork
allowed-tools:
- Read
- Grep
- Glob
---
# Entity Resolution Analysis
You are analyzing UCC filing entity names to find matches.
## Task
Given entity name `$ARGUMENTS`, search the codebase for:
1. Exact matches in filing records
2. Variations (abbreviations, misspellings, legal suffixes)
3. Related entities (parent/subsidiary relationships)
## Process
1. Use Glob to find all filing data files
2. Use Grep to search for the entity name and common variations
3. Read relevant files to extract full entity records
4. Return a structured summary:
   - Exact matches (count, filing numbers)
   - Likely matches (similarity score, reasoning)
   - Recommended canonical name
## Constraints
- Do NOT modify any files
- Do NOT run any shell commands
- Return only the analysis summary to the main session
Command file (a Markdown file in .claude/commands/):
---
description: Review a UCC filing parser for correctness and edge cases
allowed-tools:
- Read
- Grep
- Glob
argument-hint: filing_type (e.g., UCC-1, UCC-3)
---
# Review Filing Parser
Review the parser implementation for `$ARGUMENTS` filings.
## Checklist
1. Read the parser source file for this filing type
2. Check: Does it handle all required fields?
3. Check: Are edge cases covered? (missing fields, malformed dates, encoding issues)
4. Check: Does it match the expected output schema?
5. Check: Are there tests? Do they cover edge cases?
## Report Format
- **Status**: PASS / FAIL / NEEDS REVIEW
- **Issues found**: list with severity (critical/warning/info)
- **Missing test coverage**: list of untested scenarios
- **Suggested fixes**: code snippets for critical issues
You created two reusable Claude Code workflows. The skill uses context: fork to run entity resolution in isolation, so it can read and search dozens of files without polluting your main conversation. The command is simpler and suits a focused review task where context pollution is manageable. Both list only read-only tools (Read, Grep, Glob) in allowed-tools and no shell access; note that allowed-tools pre-approves the listed tools rather than blocking the rest, so add deny rules in permissions when a tool must be blocked outright.
Anti-pattern: using commands for complex exploration tasks that read many files. Use skills with context: fork and allowed-tools restrictions instead. The exam tests whether you can identify when context isolation is needed and choose the right mechanism.
SKILL.md Frontmatter — The Full Spec
The skill and the command above used five frontmatter fields between them (name, description, context, allowed-tools, and argument-hint). The full SKILL.md frontmatter has 15 fields, all optional; only description is recommended. Skills also unify what used to be two separate concepts: custom commands have been merged into skills. A file at .claude/commands/deploy.md and a skill at .claude/skills/deploy/SKILL.md both create /deploy and use the same frontmatter; the skill format adds optional supporting files, automatic invocation, and subagent execution.
| Field | What it does |
|---|---|
| name | Display name (defaults to dir name); lowercase + hyphens, max 64 chars |
| description | When to use it; this drives auto-invocation. Front-load the use case; description + when_to_use is capped at 1,536 chars in the skill listing |
| when_to_use | Trigger phrases / example requests, appended to description |
| argument-hint | Autocomplete hint, e.g. [issue-number] or [filename] [format] |
| arguments | Named positional args for $name substitution: arguments: [issue, branch] → $issue, $branch |
| disable-model-invocation | true → only YOU can invoke (no auto-trigger); use for /deploy, /commit, or anything with side effects |
| user-invocable | false → hides from / menu; only Claude can invoke (background knowledge) |
| allowed-tools | Pre-approves tools while the skill is active; it doesn't restrict, just skips the per-use prompt. Add deny rules in permissions to actually block. |
| model | Model override (sonnet/opus/haiku/full ID/inherit); resets after the turn |
| effort | Thinking budget for the turn: low / medium / high / xhigh / max |
| context | fork → runs in a forked subagent context with no main-conversation visibility |
| agent | Which subagent type executes the fork: Explore, Plan, general-purpose, or any from .claude/agents/ |
| hooks | Hooks scoped to this skill's lifecycle; supports once: true for one-shot |
| paths | Glob patterns; auto-load only when working with matching files |
| shell | bash (default) or powershell (Windows; needs CLAUDE_CODE_USE_POWERSHELL_TOOL=1) |
String Substitutions — Inject Args, Session ID, Effort, Skill Path
Skills have six substitution variables that get replaced before the body is sent to Claude. Argument positionals support shell-style quoting ("hello world" = one arg).
| Variable | Resolves to |
|---|---|
| $ARGUMENTS | Full argument string as typed; if absent from body, appended as ARGUMENTS: <value> |
| $ARGUMENTS[N] / $N | Positional arg by 0-based index ($0, $1, …) |
| $name | Named arg from arguments frontmatter list, mapped by position |
| ${CLAUDE_SESSION_ID} | Current session ID; use for log file paths, correlation IDs |
| ${CLAUDE_EFFORT} | Active effort level; adapt instructions to match (e.g. skip exhaustive checks at low) |
| ${CLAUDE_SKILL_DIR} | Directory containing this skill's SKILL.md; use to invoke bundled scripts regardless of cwd |
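To make the argument rules in the table concrete, here is a toy model of the substitution pass. It is a sketch under stated assumptions, not Claude Code's actual renderer: the `render` function and `TOKEN` pattern are hypothetical, out-of-range positions are assumed to render as empty strings, and the `${CLAUDE_...}` variables are left out.

```python
import re
import shlex

# Order matters: indexed forms are tried before plain $ARGUMENTS and names.
TOKEN = re.compile(r"\$ARGUMENTS\[(\d+)\]|\$ARGUMENTS\b|\$(\d+)|\$([A-Za-z_]\w*)")


def render(body, raw_args, names=()):
    """Substitute argument placeholders in a skill body (illustrative only)."""
    args = shlex.split(raw_args)    # shell-style quoting: "a b" is one arg
    named = dict(zip(names, args))  # named args map to positions in order

    def substitute(match):
        index_a, index_b, name = match.groups()
        if index_a is not None or index_b is not None:
            index = int(index_a if index_a is not None else index_b)
            # Assumption: a missing positional renders as an empty string.
            return args[index] if index < len(args) else ""
        if name is not None:
            return named.get(name, match.group(0))  # unknown names stay as-is
        return raw_args  # plain $ARGUMENTS

    rendered = TOKEN.sub(substitute, body)
    if "$ARGUMENTS" not in body:
        # No placeholder in the body: append the raw arguments instead.
        rendered += f"\nARGUMENTS: {raw_args}"
    return rendered


body = "Fix issue $issue on branch $branch. First arg: $0. Raw: $ARGUMENTS"
print(render(body, '42 "feat/login page"', names=["issue", "branch"]))
print(render("Summarize the input.", "hello"))
```

The quoted argument stays together as a single value for `$branch`, while `$ARGUMENTS` reproduces the string exactly as typed, quotes included.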
Inline Shell Injection — !`cmd` Runs Before Claude Sees Anything
Skills (and slash commands) can run shell commands at render time using backtick-quoted !`command` syntax or fenced ```! blocks. The output replaces the placeholder and the rendered prompt is what reaches Claude — this is preprocessing, not Claude executing the command. Perfect for injecting live data (PR diffs, git status, env versions) without round-tripping through tool calls.
---
name: pr-summary
description: Summarize the current pull request using live GitHub data
context: fork
agent: Explore
allowed-tools: Bash(gh *)
---
## PR diff
!`gh pr diff`
## Reviewer comments
!`gh pr view --comments`
## Multi-line block (use ```! ... ```)
```!
node --version
npm --version
git status --short
```
Summarize the changes and flag anything that needs reviewer attention.
When the skill runs, the three shell placeholders (two inline commands and one fenced block) execute first; their stdout is interpolated into the prompt; Claude receives a fully-rendered text body with the actual PR data already inlined. To disable shell injection for non-bundled skills (managed-policy enforcement), set "disableSkillShellExecution": true in settings; each !`cmd` is then replaced with [shell command execution disabled by policy].
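The "preprocessing, not tool use" distinction can be illustrated with a small stand-in renderer. This is a hedged sketch rather than Claude Code's implementation: `render_skill_body` and its `shell_enabled` flag are hypothetical names, only the inline form is handled (not fenced blocks), and the demo uses `echo` so it runs without GitHub access.

```python
import re
import subprocess

INLINE = re.compile(r"!`([^`]+)`")
DISABLED = "[shell command execution disabled by policy]"


def render_skill_body(body, shell_enabled=True):
    """Replace each inline shell placeholder with its stdout.

    The model only ever receives the returned string; it never sees or
    runs the commands themselves.
    """

    def run(match):
        if not shell_enabled:
            return DISABLED
        result = subprocess.run(
            match.group(1), shell=True, capture_output=True, text=True, check=False
        )
        return result.stdout.strip()

    return INLINE.sub(run, body)


body = "## Greeting\n!`echo hello`\nSummarize the greeting."
print(render_skill_body(body))
print(render_skill_body(body, shell_enabled=False))
```

In the first call the placeholder line becomes `hello`; in the second it becomes the policy message, mirroring what the article describes for `disableSkillShellExecution`.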
When invoked, a skill's rendered SKILL.md body enters the conversation as one message and stays there for the rest of the session. Claude does not re-read the skill file on later turns, so write standing instructions, not one-shots.
After auto-compaction, the most recent invocation of each skill is re-attached, keeping the first 5,000 tokens of each up to a combined 25,000-token budget; older skill invocations get dropped first.
Description budget: all skill names always load, but descriptions are truncated to fit a budget of 1% of the context window (fallback 8,000 chars; raise via SLASH_COMMAND_TOOL_CHAR_BUDGET).
Live change detection: adds, edits, and removes inside ~/.claude/skills/, project .claude/skills/, or --add-dir targets take effect mid-session, but new top-level directories require a restart.
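The compaction numbers above imply that at most five full-size skill bodies survive (25,000 / 5,000). A minimal sketch of that arithmetic follows; the `reattached` helper and the skill names are hypothetical, and dropping an invocation entirely when it no longer fits (rather than truncating it further) is an assumption the article does not confirm.

```python
PER_SKILL_CAP = 5_000   # tokens kept per skill after compaction
TOTAL_BUDGET = 25_000   # combined budget across all re-attached skills


def reattached(invocations):
    """invocations: (skill_name, token_count) pairs, oldest first.

    Returns {skill_name: tokens_kept} for skills that survive compaction.
    """
    latest = {}
    for name, tokens in invocations:
        latest.pop(name, None)  # re-insert so dict order tracks recency
        latest[name] = tokens   # only the most recent invocation counts

    kept, remaining = {}, TOTAL_BUDGET
    for name, tokens in reversed(list(latest.items())):  # newest first
        size = min(tokens, PER_SKILL_CAP)
        if size > remaining:
            break  # assumption: this and all older invocations are dropped
        kept[name] = size
        remaining -= size
    return kept


history = [
    ("lint", 9_000), ("deploy", 3_000), ("review", 7_000), ("docs", 6_000),
    ("triage", 5_000), ("audit", 8_000), ("lint", 9_000),
]
kept = reattached(history)
print(kept)  # "deploy" is the oldest distinct invocation and is dropped
```

Each surviving skill keeps 5,000 tokens here, which is why front-loading the important instructions in a SKILL.md body matters.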
Bundled Skills — What Ships in Every Session
Claude Code includes prompt-based bundled skills that you invoke like any other slash command. Unlike built-in commands (which execute fixed code), bundled skills hand Claude a detailed playbook and let it orchestrate using its tools. They appear in / autocomplete tagged Skill:
| Bundled skill | What it does |
|---|---|
| /simplify | Refactors recently-changed code for clarity / consistency / maintainability |
| /batch | Runs the same task across many files / branches / PRs in one shot |
| /debug | Structured debugging playbook (reproduce, isolate, fix, verify) |
| /loop | Background monitoring: reruns a check on a cadence and reports drift |
| /claude-api | Helps build agents on top of the Claude Messages API |
A few built-in commands are also reachable through the Skill tool: /init, /review, /security-review. Others like /compact are not. To block all skills, add a Skill deny rule; to restrict Claude to specific skills, list Skill(name) or Skill(name *) in permissions.allow. To make a skill's body invoke extended thinking, include the literal word "ultrathink" anywhere in the skill content.
The exam tests three skill-specific discriminations: (1) allowed-tools grants pre-approval, not restriction — to actually block, add to permissions.deny; (2) shell injection runs before Claude sees the prompt — treat !`cmd` output as untrusted input you control, not as Claude's tool output; (3) plugin skills don't conflict with personal/project skills because they use plugin-name:skill-name namespacing, but a skill at the same scope as a custom command of the same name takes precedence over the command.
Plan Mode vs Direct Execution
Plan mode is the architect drawing blueprints before construction begins. You see the design, review it, suggest changes, and only then does building start. Direct execution is the builder who picks up a hammer and starts working immediately — fast, efficient, and exactly right when the task is clear.
The pain of always planning is wasted time on trivial tasks: you do not need blueprints to hang a picture frame. The pain of never planning is disaster on complex tasks: building a house without blueprints means load-bearing walls in wrong places and plumbing that does not connect.
The real skill is knowing WHEN each mode is appropriate. A simple variable rename? Direct execution. A complex database schema refactor across 30 files? Plan mode. The decision criteria are: complexity, familiarity with the codebase, risk level, and reversibility of the change.
Under the hood, plan mode changes how Claude Code processes your request. Normally, Claude reads your prompt and immediately starts calling tools — reading files, making edits, running tests. In plan mode, Claude still reads files and analyzes the codebase, but it stops before making any changes. Instead, it outputs a structured plan: which files it will modify, what changes it will make to each one, and what risks exist. You review the plan, suggest changes, and only when you say "proceed" does Claude execute.
This is more than just "thinking before acting." Plan mode forces Claude to commit to a strategy before it starts editing. Without it, Claude might start refactoring file A, realize halfway through that file B needs to change first, undo its work on A, edit B, then come back to A. Plan mode catches these dependency chains upfront, saving tool calls and avoiding half-finished states.
Here is what plan mode output actually looks like. When you type /plan or ask Claude to "plan first," you get a structured proposal before any code is touched: the files it will modify, the changes planned for each one, and the risks involved.
Plan mode: a Claude Code workflow mode, activated with /plan or by asking Claude to "plan first," in which Claude generates a step-by-step plan, shows it to you, and waits for your approval before executing any changes. Use plan mode when: (1) you are working in an unfamiliar codebase, (2) the change is complex (touching 5+ files), (3) the change is risky or hard to reverse, or (4) you want to review the approach before code is modified.
Direct execution: Claude acts immediately — reads files, makes edits, runs commands. Use when: (1) the task is well-understood ("rename this variable"), (2) the change is small and reversible, (3) you have clear context about what needs to happen.
Three powerful iteration patterns work well with plan mode:
- TDD iteration: Red (write a failing test) → Green (ask Claude to make it pass) → Refactor. Each step is small and verifiable.
- Interview pattern: Ask Claude to ask YOU questions before starting. This surfaces assumptions and requirements you might have missed.
- Concrete examples: Provide 2–4 examples of desired output, then ask Claude to generalize. This anchors the implementation to real data.
"Plan mode is always the safe choice." — Plan mode adds overhead: Claude reads the codebase, generates a plan, waits for approval, then executes. For a simple variable rename, that is 3 extra steps that add nothing. The safe choice is the RIGHT choice for the task — plan mode for complex or risky changes, direct execution for simple well-understood ones.
"If I use plan mode, Claude won't make mistakes." — Plan mode reduces the chance of a wrong approach, but the plan itself can still be flawed. You might approve a plan that misses a dependency or underestimates the blast radius of a change. Always review the plan critically, especially the "Files modified" and "Risk" lines.
"Direct execution means Claude doesn't think." — Claude still reasons about your request in direct execution mode. It reads files, considers alternatives, and picks an approach. The difference is that it does not pause to show you the plan first. It is still "thinking" — it just acts on its conclusions immediately.
The exam tests decision criteria for when to use plan mode. It is NOT "always use plan mode." The correct answer depends on: task complexity, codebase familiarity, change risk, and reversibility. Know the tradeoffs — plan mode has overhead that is wasteful for simple tasks.
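The four decision criteria can be written down as a small checklist function. This is only one way to encode them, offered as a sketch: `choose_mode` and its parameters are hypothetical, and the 5-file threshold is the article's rule of thumb rather than anything Claude Code enforces.

```python
def choose_mode(files_touched, familiar_codebase, risky, reversible):
    """Return "plan" or "direct" using the article's four criteria."""
    if not familiar_codebase:
        return "plan"    # unfamiliar code: review the approach first
    if files_touched >= 5:
        return "plan"    # complex change (5+ files)
    if risky or not reversible:
        return "plan"    # risky or hard to undo
    return "direct"      # small, well-understood, reversible


# A variable rename in code you know well.
print(choose_mode(files_touched=1, familiar_codebase=True, risky=False, reversible=True))
# A schema refactor across 30 files.
print(choose_mode(files_touched=30, familiar_codebase=True, risky=True, reversible=False))
```

The rename resolves to direct execution and the schema refactor to plan mode, matching the two examples given earlier in this section.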
The Four Permission Modes
Permission modes are like the security level on your work badge. Visitor pass: you can look around but every door asks for an escort. Employee badge: most doors open, sensitive areas still need approval. Master key: every door opens automatically. The right level depends on whether you're a new contractor (visitor) or the building manager (master) — and on which floor you're working on.
Claude Code's permission system works the same way. plan mode is the visitor pass — Claude can read but not write. bypassPermissions is the master key — every tool runs without asking. Most production work happens in default or acceptEdits, but knowing all four is the difference between a paranoid workflow that interrupts you constantly and a reckless workflow that breaks production.
The skill is matching the mode to the task: prototyping a brand-new file? acceptEdits. Touching production database migrations? default with explicit approval. Running an unattended overnight job in CI? bypassPermissions in a sandboxed container with a strict allowedTools list.
Permission modes control which tool calls require user approval before execution. Set the mode globally in ~/.claude/settings.json, per project in .claude/settings.json, or per session via --permission-mode, and combine it with allowedTools and disallowedTools for fine-grained control.
| Mode | Behavior | Use when |
|---|---|---|
| plan | Read-only. Claude analyzes and proposes a plan, but cannot edit, write, or run commands. | Architectural review, exploring an unfamiliar codebase, risky refactors |
| default | The starting mode. Tool calls prompt for approval the first time and can be remembered for the session via permissions.allow. | Most interactive work; the safest default |
| acceptEdits | File edits auto-approved. Bash and other tools still prompt. | Greenfield work, prototyping, scaffolding new files |
| bypassPermissions | Skips all permission checks. Every tool runs without prompting. | Sandboxed environments only. Never on a workstation with prod credentials. Pair with disableBypassPermissions: true in managed/user settings to lock it out elsewhere. |
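The table can be read as a decision function from (mode, tool) to an outcome. The sketch below is a simplified model of the rows above, not Claude Code's permission engine: `decide` is a hypothetical helper, allow/deny rules from settings are ignored, and treating read-only tools as never prompting is an assumption.

```python
READ_TOOLS = {"Read", "Grep", "Glob"}
EDIT_TOOLS = {"Edit", "Write"}
MODES = {"plan", "default", "acceptEdits", "bypassPermissions"}


def decide(mode, tool):
    """Return "allow", "ask", or "deny" for a tool call under a mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "bypassPermissions":
        return "allow"   # nothing prompts
    if tool in READ_TOOLS:
        return "allow"   # assumption: read-only tools never prompt
    if mode == "plan":
        return "deny"    # read-only: no edits, no commands
    if mode == "acceptEdits" and tool in EDIT_TOOLS:
        return "allow"   # file edits are auto-approved
    return "ask"         # default mode, or non-edit tools in acceptEdits


for mode in ("plan", "default", "acceptEdits", "bypassPermissions"):
    print(mode, {tool: decide(mode, tool) for tool in ("Read", "Edit", "Bash")})
```

Reading across the printed rows shows the gradient the section describes: each mode removes one more category of prompt, until bypassPermissions removes them all.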
The strongest signal of a careless Claude Code setup is bypassPermissions in a developer's user-level settings. That single flag means every session, in every project, runs every tool without asking — including rm -rf, git push --force, and any MCP server tool with a connection string. Use bypassPermissions only inside ephemeral containers (Docker, CI runners, sandboxed worktrees), never on your laptop. For day-to-day work, default mode + a project-level permissions.allow list is the sweet spot.
The exam tests recognition of mode-to-scenario fit. Common stems: "unattended CI run inside a sandboxed container" → bypassPermissions + a strict permissions.allow list. "Reviewing a risky migration" → plan. "Prototyping a brand-new feature" → acceptEdits. "Production-touching work on a developer's laptop" → default (never bypass). Anti-pattern: using bypassPermissions on a workstation to avoid permission prompts — that defeats the entire safety model. Lock it out at the user/managed level with disableBypassPermissions: true.
settings.json — The Full Configuration Surface
Think of Claude Code's settings as the layered controls on a corporate laptop. The IT admin (managed settings) sets policies you can't override — "you cannot install software, you cannot disable the VPN." You (user settings) set personal defaults — "I prefer dark mode, my keyboard shortcuts." Each project folder (project settings) adds project-specific tweaks — "for this client, use their VPN, this branding." Local overrides (gitignored) are sticky-notes only you see — "skip the auth check on my dev machine."
The pain without this layering: every team member has to repeat the same setup, security policies live in untrusted places, and personal tweaks leak into shared config. The cascade fixes that — admins lock down what matters, projects share what's useful, and personal preferences stay personal.
Claude Code applies the same idea. Five precedence levels mean a security team can mandate sandbox rules organization-wide, a project team can pre-approve its CI tools, and an individual can opt into vim mode, all without conflicts, because when two levels set the same key the higher-precedence level wins.
Five-Level Precedence — Who Beats Whom
Claude Code merges settings from up to five sources for every session. Higher-priority levels override lower ones for the same key. Managed settings can also set a few "managed-only" keys (like allowedMcpServers) that lower levels cannot define at all.
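The merge can be sketched in a few lines. This is a minimal illustration, not Claude Code's implementation: the level names and their order in `LEVELS` are assumptions, and the rule that arrays are unioned while scalars are overridden is taken from the exam note later in this section.

```python
import copy
from typing import Any

# Assumed order, lowest to highest precedence. The names are illustrative.
LEVELS = ["user", "project", "local", "cli", "managed"]


def _merge_into(base: dict[str, Any], override: dict[str, Any]) -> None:
    """Merge one layer into the accumulated settings, in place."""
    for key, value in override.items():
        current = base.get(key)
        if isinstance(value, dict) and isinstance(current, dict):
            _merge_into(current, value)  # recurse into nested objects
        elif isinstance(value, list) and isinstance(current, list):
            # Arrays are unioned: keep existing entries, append new ones.
            base[key] = current + [v for v in value if v not in current]
        else:
            # Scalars: the higher-precedence layer wins outright.
            base[key] = copy.deepcopy(value)


def merge_settings(layers: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Apply layers from lowest to highest precedence."""
    merged: dict[str, Any] = {}
    for level in LEVELS:
        _merge_into(merged, layers.get(level, {}))
    return merged


layers = {
    "user": {"model": "sonnet", "permissions": {"allow": ["Read"]}},
    "project": {"permissions": {"allow": ["Bash(npm test)"]}},
    "managed": {"model": "opus"},
}
merged = merge_settings(layers)
```

In this example the managed `model` replaces the user's choice, while the two `permissions.allow` lists are combined rather than replaced.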
The Settings You'll Actually Touch
Settings group into eight functional areas. Here are the high-leverage knobs for each — the rest of the surface is in the upstream reference.
| Area | Key fields |
|---|---|
| Permissions | permissions.allow / ask / deny / defaultMode / additionalDirectories / disableBypassPermissions |
| Model & effort | model, availableModels, effortLevel, alwaysThinkingEnabled, fastModePerSessionOptIn, modelOverrides |
| Hooks | hooks, disableAllHooks, allowManagedHooksOnly, allowedHttpHookUrls, httpHookAllowedEnvVars |
| MCP | enableAllProjectMcpServers, enabledMcpjsonServers, disabledMcpjsonServers, allowedMcpServers*, deniedMcpServers* (* = managed-only) |
| Plugins & skills | enabledPlugins, extraKnownMarketplaces, strictKnownMarketplaces*, blockedMarketplaces*, disableSkillShellExecution*, allowedChannelPlugins* |
| Sandbox | sandbox.enabled, sandbox.filesystem.{allow,deny}{Read,Write}, sandbox.network.{allowed,denied}Domains, sandbox.excludedCommands |
| Git & attribution | attribution.commit, attribution.pr, includeGitInstructions, prUrlTemplate |
| UX & a11y | editorMode (vim/normal), tui, viewMode, prefersReducedMotion, voice.{enabled,mode,autoSubmit}, spinnerVerbs, statusLine, fileSuggestion, autoMemoryEnabled, autoMemoryDirectory |
Sandbox Settings — Bash With Filesystem + Network Walls
Set sandbox.enabled: true and Bash runs inside an OS-level jail (Seatbelt on macOS, namespaces on Linux/WSL2). Sandboxed commands skip the permission prompt by default (autoAllowBashIfSandboxed: true) because the OS is enforcing the boundary — making it the safest way to run bypassPermissions-style work without yolo'ing your filesystem.
{
  "sandbox": {
    "enabled": true,
    "failIfUnavailable": true,
    "filesystem": {
      "allowWrite": ["./", "~/.cache/claude"],
      "denyWrite": ["./.git", "./.env*"],
      "denyRead": ["./.env", "~/.ssh", "~/.aws"]
    },
    "network": {
      "allowedDomains": ["api.github.com", "registry.npmjs.org", "*.anthropic.com"],
      "deniedDomains": ["*"]
    },
    "excludedCommands": ["docker", "kubectl"]
  }
}
Path prefixes: / = absolute, ~/ = home, ./ or no prefix = project-relative. Network domains support wildcards. excludedCommands lets specific binaries bypass the sandbox when they need raw network/filesystem (e.g. docker). Managed admins can also set sandbox.network.allowManagedDomainsOnly to ignore project/user domain lists entirely.
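The three path prefixes can be illustrated with a small resolver. This is a sketch of the rule as stated above, not the sandbox's actual code; `resolve_sandbox_path` is a hypothetical helper, and glob entries such as `./.env*` are left unexpanded here.

```python
from pathlib import Path


def resolve_sandbox_path(entry: str, project_root: Path, home: Path) -> Path:
    """Map a settings path entry to a concrete filesystem path."""
    if entry.startswith("~/"):
        return home / entry[2:]  # home-relative
    if entry.startswith("/"):
        return Path(entry)  # absolute
    # "./" prefix or no prefix: relative to the project root
    return project_root / entry.removeprefix("./")


project = Path("/work/acme")
home = Path("/home/dev")
resolved = [
    resolve_sandbox_path(e, project, home)
    for e in ["./.git", "~/.ssh", "/etc/hosts", "src"]
]
```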
Managed Settings — Org-Wide Policy
For enterprise rollouts, settings can be deployed via OS policy systems — out of reach of users. Three delivery mechanisms:
- macOS: com.anthropic.claudecode plist via Jamf, Kandji, etc.
- Windows admin: HKLM\SOFTWARE\Policies\ClaudeCode via Group Policy or Intune
- File-based: managed-settings.json base file plus managed-settings.d/*.json drop-in fragments (sorted alphabetically; later files override earlier)
Managed-only keys (cannot be set at any other level) include allowedMcpServers, deniedMcpServers, allowManagedMcpServersOnly, allowManagedPermissionRulesOnly, allowManagedHooksOnly, strictKnownMarketplaces, blockedMarketplaces, forceRemoteSettingsRefresh, disableSkillShellExecution, disableBypassPermissions. On Windows, wslInheritsWindowsSettings tells WSL to read the host's policies.
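The file-based mechanism's "base file plus sorted drop-ins" rule can be sketched as follows. This is a hedged illustration: the shallow, key-by-key override is an assumption (the text only says later files override earlier ones), and `load_managed_settings` is a hypothetical helper, not part of Claude Code.

```python
import json
from pathlib import Path
from typing import Any


def load_managed_settings(config_dir: Path) -> dict[str, Any]:
    """Load managed-settings.json, then apply drop-in fragments in name order."""
    settings: dict[str, Any] = {}
    base = config_dir / "managed-settings.json"
    if base.is_file():
        settings.update(json.loads(base.read_text(encoding="utf-8")))
    fragments_dir = config_dir / "managed-settings.d"
    if fragments_dir.is_dir():
        # Alphabetical order: a later fragment overrides an earlier one.
        for fragment in sorted(fragments_dir.glob("*.json")):
            settings.update(json.loads(fragment.read_text(encoding="utf-8")))
    return settings
```

Numeric prefixes such as `10-security.json` and `20-team.json` (hypothetical file names) make the override order explicit.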
Attribution — Stop the AI From Spamming Co-Author Lines
By default, Claude Code adds a Co-Authored-By: Claude trailer to commits and an attribution line to PR descriptions. The attribution object lets you customize or disable each one independently; it replaces the deprecated includeCoAuthoredBy boolean:
{
  "attribution": {
    "commit": false,
    "pr": "Generated with Claude Code (internal)"
  },
  "prUrlTemplate": "https://github.acme.internal/{org}/{repo}/pull/{number}"
}
Voice, Status Line, Spinner — The UX Layer
Three settings that meaningfully change the moment-to-moment feel of working with Claude Code:
- voice.enabled: true + voice.mode: "hold" (push-to-talk) or "tap" (toggle), with voice.autoSubmit: true to send when you stop talking. Configure interactively with /voice.
- statusLine.type: "command" + statusLine.command: "./bin/status.sh": your script's stdout is shown at the bottom of every prompt (great for showing branch / cost / current task).
- spinnerVerbs.mode: "replace" + spinnerVerbs.verbs: ["compiling","analyzing","brewing"]: replaces the rotating "thinking" verbs. Use "append" mode to add custom ones to the defaults.
Three sibling files matter: ~/.claude.json (global config: OAuth, MCP servers, IDE caches), .mcp.json (project MCP servers, separate from settings.json), and .worktreeinclude (gitignored files to copy into worktrees so dev tooling keeps working). The $schema field on settings.json — https://json.schemastore.org/claude-code-settings.json — gives you autocomplete + validation in any editor with JSON Schema support.
The exam tests three governance discriminations: (1) Sandbox vs permission modes — permissions are advisory (Claude could ignore a hook); sandbox is OS-enforced. Enterprises layer both. (2) Managed-only fields — the certification names allowedMcpServers, allowManagedHooksOnly, disableSkillShellExecution as fields that cannot be overridden at user/project level. (3) Precedence vs union — for arrays like permissions.allow, levels are unioned; for scalar fields like model, higher level wins outright. Memorize which keys behave which way.
Built-in Tools: Read, Write, Edit, Bash, Grep, Glob
Claude Code ships with six built-in tools. These are always available: no MCP server setup, no configuration, no installation. They cover the four things you do most often when working with code: finding files, reading them, changing them, and running commands.
Glob — find files by name pattern. Use when you need to discover file structure or locate files: "find all *.test.ts files in src/".
Grep — search file contents with regex. Use when you need to find where something is used: "find all functions that call validateInput".
Read — read file contents. Use to examine and understand code before making changes.
Edit — modify specific parts of existing files (string replacement). Use for targeted changes: fix a bug, rename a variable, update a value.
Write — create new files or completely overwrite existing ones. Use for generating new code, config files, or test files.
Bash — execute shell commands. Use for running tests, installing dependencies, git operations, and anything that requires a shell. This is the most powerful tool and also the most dangerous — it can do anything your terminal can do, including deleting files or running destructive commands. That is why Claude Code prompts you before running Bash commands by default.
The typical workflow follows a predictable pattern: Glob (find the file) → Read (understand it) → Edit (change it) → Bash (test it). Most coding tasks follow this exact sequence. If you find yourself asking Claude to use Bash for searching file contents, that is a sign you should be using Grep instead — dedicated tools are faster, safer, and produce cleaner output than shell equivalents.
An important nuance: Edit vs Write. Edit does a targeted string replacement — it finds an exact match in the file and replaces it. Write overwrites the entire file. For a one-line bug fix, Edit is better because it shows exactly what changed. For generating a new 200-line file from scratch, Write is the right choice. If you confuse them, you risk either losing existing code (Write when you meant Edit) or failing to create a file (Edit on a file that does not exist yet).
The exam tests tool selection: given a task description, which built-in tool is correct? Key distinctions: Glob finds files by name pattern (not content). Grep searches file content (not file names). Edit modifies existing files using unique text matching. Write creates new files or overwrites entirely. Confusing Glob/Grep or Edit/Write is a common exam mistake. Bonus: when Edit fails because the target text isn't unique, the cert-correct fallback is Read + Write (read full content, then overwrite). Build codebase understanding incrementally — start with Grep for entry points, then Read to follow imports, rather than reading every file upfront.
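The "unique text matching" behaviour of Edit, and why a non-unique target forces the Read + Write fallback, can be illustrated with a short sketch. This models the rule described above and is not Claude Code's implementation; `edit_unique` and `EditError` are hypothetical names.

```python
class EditError(ValueError):
    """Raised when a targeted replacement cannot be applied safely."""


def edit_unique(content: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if `old` occurs exactly once."""
    count = content.count(old)
    if count == 0:
        raise EditError("target text not found")
    if count > 1:
        # Ambiguous target: the caller should add surrounding context
        # or fall back to rewriting the whole file.
        raise EditError(f"target text matches {count} times")
    return content.replace(old, new, 1)


source = "timeout = 30\nretries = 3\n"
patched = edit_unique(source, "retries = 3", "retries = 5")
```

Calling `edit_unique("x = 1\nx = 1\n", "x = 1", "x = 2")` raises `EditError`, which is the situation where overwriting the full file is the safer path.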
"Claude Code is just the API with a CLI wrapper, right?" — No. Claude Code is a full agent that manages its own tool loop. When you give it a task, it decides which tools to call, reads the results, reasons about next steps, and iterates until the task is done. The API gives you a single message-in, message-out exchange. Claude Code gives you an autonomous agent that can read your codebase, make edits, run tests, and fix failures — all in one prompt.
"CLAUDE.md is like .env — it stores secrets and API keys." — Absolutely not. CLAUDE.md stores behavioral instructions: coding conventions, project context, and tool restrictions. Never put secrets in CLAUDE.md — it gets committed to git. Use environment variables for API keys, just like you would with any other tool.
"Skills replace commands entirely." — Not quite. Skills add capabilities that commands lack (context isolation, automatic invocation), but commands are simpler and perfectly fine for focused tasks that do not pollute context. Use a command when the task is small and scoped. Use a skill when you need isolation or auto-triggering.
"The config hierarchy is just about indentation preferences." — The cascade handles much more than formatting. Project-level CLAUDE.md typically includes architecture decisions, security policies, API patterns, database conventions, and deployment rules. Directory-level files can restrict tool access for sensitive areas (e.g., the payments directory might prohibit Bash to prevent accidental charges).
"Batch API is always better because it's cheaper." — Only when latency does not matter. Batch requests can take up to 24 hours. If a user is waiting for a response in a chat interface, they will not wait 24 hours for a 50% discount. Use batch for background processing (nightly reports, bulk extraction). Use synchronous for anything interactive.
CI/CD Integration
Claude Code can run in non-interactive mode for automated pipelines. Three flags make this possible:
-p "prompt" — This is the non-interactive flag. Claude runs the prompt, prints the result, and exits. No human needed. This is what makes Claude Code usable in CI/CD at all — without it, the process would hang waiting for user input.
--output-format json — Returns structured JSON instead of plain text. Your pipeline scripts can parse the output programmatically with jq or JSON.parse() instead of trying to regex through natural language.
--json-schema — Goes one step further: it enforces a specific output structure. You define the exact fields your pipeline expects, and Claude is constrained to return only those fields. No surprises, no extra commentary — just the data your script needs.
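A pipeline script typically wraps the call and pulls out the answer. The sketch below assumes the JSON payload carries the model's text in a `result` field (the workflow later in this section reads `review.result`); `run_headless` and `parse_result` are hypothetical helper names, and the timeout value is arbitrary.

```python
import json
import subprocess


def parse_result(stdout: str) -> str:
    """Extract the text answer from `--output-format json` output."""
    payload = json.loads(stdout)
    if not isinstance(payload, dict) or "result" not in payload:
        raise ValueError("unexpected output shape: no 'result' field")
    return payload["result"]


def run_headless(prompt: str, timeout_s: int = 600) -> str:
    """Run one non-interactive Claude Code session and return its answer."""
    completed = subprocess.run(
        ["claude", "-p", prompt, "--output-format", "json"],
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit code raises CalledProcessError
        timeout=timeout_s,
    )
    return parse_result(completed.stdout)
```

Each call to `run_headless` starts a fresh process, so two calls never share conversation context.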
The most critical design pattern for CI/CD is session isolation. The idea is simple: use SEPARATE sessions for code generation and code review.
Here is why this matters. If the same session generates code and then reviews it, the reviewer still has the generator's reasoning in its context window. It remembers WHY each decision was made. That creates confirmation bias — the reviewer is more likely to agree with the code, even if the code has bugs, because it already understands the intent behind each line.
Separate sessions fix this. Session B (the reviewer) sees only the raw code diff. It evaluates the code on its own merits, without being influenced by Session A's thought process. This mirrors what good engineering teams do: the person who writes the code is never the only person who reviews it.
Let's build a GitHub Actions workflow that puts session isolation into practice. The workflow triggers on every pull request, so every code change gets an independent AI review automatically.
The first half of the workflow — the "Summarize PR Changes" step — is Session A. It reads the git diff and produces a structured summary of what changed. Think of this as the "what happened" step. The second half — "Review PR" — is Session B. It reads the same diff but with a completely different prompt focused on bugs, security issues, and test gaps. Because these are separate claude -p invocations, Session B has zero knowledge of Session A's reasoning. That is the whole point.
One gotcha to watch for: the fetch-depth: 0 on checkout. Without it, GitHub Actions does a shallow clone (only the latest commit), and git diff origin/main...HEAD fails because there is no history to diff against. This is the most common cause of "the review step produced no output" in real CI setups.
# .github/workflows/claude-review.yml
# Automated PR review using Claude Code with session isolation.
# Session A generates a summary; Session B reviews independently.
name: Claude Code PR Review
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history for diff context
      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code
      # --- Session A: Summarize changes ---
      # This session reads the diff and produces a structured summary.
      - name: Summarize PR Changes
        id: summary
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          DIFF=$(git diff origin/main...HEAD)
          claude -p "Summarize these code changes. Focus on: what changed, why it might have changed, and any patterns you notice. Diff: $DIFF" \
            --output-format json > summary.json
      # --- Session B: Independent review ---
      # SEPARATE session: no access to Session A's reasoning.
      - name: Review PR (Independent Session)
        id: review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          DIFF=$(git diff origin/main...HEAD)
          claude -p "Review this PR diff for: security vulnerabilities, performance issues, missing error handling, and test coverage gaps. Be specific and cite line numbers. Diff: $DIFF" \
            --output-format json > review.json
      - name: Post Review Comment
        uses: actions/github-script@v7
        with:
          script: |
            const review = require('./review.json');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `## Claude Code Review\n\n${review.result}`
            });
#!/usr/bin/env bash
# scripts/review-pr.sh
# Standalone PR review script with session isolation.
# Can be used outside GitHub Actions (e.g., GitLab CI, local review).
# Note: the shebang must be the first line of the file to take effect.
set -euo pipefail
# Ensure ANTHROPIC_API_KEY is set
if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
  echo "Error: ANTHROPIC_API_KEY not set" >&2
  exit 1
fi
DIFF=$(git diff origin/main...HEAD)
if [ -z "$DIFF" ]; then
  echo "No changes to review."
  exit 0
fi
echo "=== Session A: Summarizing changes ==="
SUMMARY=$(claude -p "Summarize these code changes concisely: $DIFF" \
--output-format json 2>/dev/null)
echo "=== Session B: Independent review ==="
REVIEW=$(claude -p "Review this diff for security issues, bugs, and missing tests. Cite specific line numbers: $DIFF" \
--output-format json 2>/dev/null)
echo ""
echo "=== Summary ==="
echo "$SUMMARY" | jq -r '.result // .content // .'
echo ""
echo "=== Review ==="
echo "$REVIEW" | jq -r '.result // .content // .'
You built a CI/CD pipeline that uses Claude Code in two separate sessions. Session A summarizes what changed. Session B independently reviews the same diff for bugs and security issues. Because they are separate sessions, Session B cannot be influenced by Session A's reasoning — eliminating confirmation bias. This is the same principle as having different people write and review code.
Critical anti-pattern: same-session self-review in CI/CD. The exam tests this directly. When Claude generates code in Session A and reviews it in the same session, the reviewer retains the generator's reasoning context, creating confirmation bias. Always use separate sessions for generation vs review.
For CI/CD invocation, the exam-recommended pattern is claude -p "<prompt>" --output-format json --json-schema schema.json. The schema contract eliminates parser fragility — your CI script consumes structured fields like {verdict, blocking_issues[], confidence} instead of regex-ing freeform prose. Pair with separate generator/reviewer sessions (Domain 4.5) for confirmation-bias-free gating. Anti-pattern: parsing natural-language Claude output with regex, which breaks every time the model rephrases.
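Consuming that contract in a gate script might look like the sketch below. The field names `verdict`, `blocking_issues`, and `confidence` come from the example above; the `"approve"` verdict value, the 0.7 confidence threshold, and the `gate` function itself are assumptions made for illustration.

```python
import json

# Expected type of each field in the review object.
REQUIRED_FIELDS = {
    "verdict": str,
    "blocking_issues": list,
    "confidence": (int, float),
}


def gate(raw_review: str, min_confidence: float = 0.7) -> int:
    """Return a process exit code: 0 lets the pipeline pass, 1 blocks it."""
    review = json.loads(raw_review)
    if not isinstance(review, dict):
        raise ValueError("review must be a JSON object")
    for field, kind in REQUIRED_FIELDS.items():
        if not isinstance(review.get(field), kind):
            raise ValueError(f"missing or mistyped field: {field}")
    if review["verdict"] != "approve" or review["blocking_issues"]:
        return 1
    if review["confidence"] < min_confidence:
        return 1  # treat a low-confidence approval as a block
    return 0


passing = gate('{"verdict": "approve", "blocking_issues": [], "confidence": 0.9}')
blocked = gate('{"verdict": "approve", "blocking_issues": ["SQL injection"], "confidence": 0.9}')
```

A CI step would pass the return value to `sys.exit`, so the job fails whenever the reviewer session reports a blocking issue.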
Batch Processing (Message Batches API)
The Message Batches API is Anthropic's solution for high-volume workloads. Instead of sending requests one at a time and waiting for each response, you package up to 10,000 requests into a single batch. Anthropic processes them in the background, and you collect the results when they are ready.
50% cost reduction: This is the headline benefit. Every request in a batch costs half of what the same request would cost synchronously. For a team processing 10,000 UCC filings at $0.003 per 1K input tokens, the savings scale with document length: at roughly 20K input tokens per filing, the input bill drops from about $600 to about $300 per run.
24-hour processing window: The tradeoff is latency. Batch requests are not instant — Anthropic processes them within a 24-hour window. You submit them, go home, and check the results the next morning. This is fundamentally different from a synchronous call that returns in seconds.
When to use batch vs synchronous: Ask one question: is someone waiting for this result right now? If yes (chatbot, interactive tool, real-time dashboard), use synchronous. If no (nightly analysis, bulk data extraction, dataset labeling, codebase-wide review), use batch and save 50%.
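The cost side of that decision is simple arithmetic. The sketch below uses the $0.003 per 1K input tokens figure from the text; the per-filing token count is an assumed example value, output tokens are ignored for simplicity, and `batch_savings` is a hypothetical helper.

```python
def batch_savings(
    num_requests: int,
    input_tokens_per_request: int,
    price_per_1k_input: float = 0.003,
    discount: float = 0.5,
) -> tuple[float, float, float]:
    """Return (synchronous cost, batch cost, savings) in dollars, input tokens only."""
    sync_cost = num_requests * input_tokens_per_request / 1000 * price_per_1k_input
    batch_cost = sync_cost * (1 - discount)
    return sync_cost, batch_cost, sync_cost - batch_cost


# Assumed workload: 10,000 filings at about 20,000 input tokens each.
sync_cost, batch_cost, saved = batch_savings(10_000, 20_000)
```

Under these assumptions the run costs roughly $600 synchronously and roughly $300 as a batch. Shorter documents shrink both numbers proportionally.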
Under the hood, the Batch API works differently from the standard Messages API. When you call client.messages.create(), Anthropic allocates GPU capacity immediately, processes your request, and streams back the response. Your code blocks until it is done. With batch, you submit a manifest of requests — essentially a list of "here are 10,000 prompts I need answered" — and Anthropic schedules them to run whenever capacity is available. That is why it is cheaper: Anthropic can fill idle GPU slots instead of guaranteeing instant capacity.
If you have used background jobs in web development (Sidekiq, Celery, AWS Lambda queues), the mental model is the same. You submit work, get a job ID, and poll until the job is complete. The difference is the scale: a single batch can hold 10,000 requests, and the polling interval is minutes, not milliseconds.
One important distinction from what you already know: unlike the synchronous API where you handle one response at a time, batch results come back as a collection. Some requests in the batch may succeed while others fail. Your code must handle partial success — iterating through results and checking each one's status individually.
Let's walk through a complete batch processing pipeline step by step. The first function, create_batch_requests, is straightforward — it takes a list of filing text strings and wraps each one into the format the Batch API expects. Each request gets a unique custom_id so you can match results back to the original documents later. Think of it as putting a label on each envelope before dropping them in the mailbox.
The interesting part is run_batch. It submits the batch, then enters a polling loop — checking every 60 seconds whether the batch has finished processing. When the batch status changes to "ended", it iterates through the results. Here is the critical detail: not every request in a batch is guaranteed to succeed. A filing might be too long, or the model might fail to extract valid JSON from a poorly formatted document. So the code checks each result individually and separates successes from errors. Never assume 100% success in batch processing.
# batch_extract.py
# Extract structured data from UCC filing documents using
# the Message Batches API (50% cheaper than synchronous calls).
import json
import time

import anthropic


def create_batch_requests(filing_texts: list[str]) -> list[dict]:
    """
    Build batch request objects from filing document texts.
    Each request asks Claude to extract structured data.
    """
    requests = []
    for i, text in enumerate(filing_texts):
        requests.append({
            # custom_id lets us match each result back to its source filing
            "custom_id": f"filing-{i:04d}",
            "params": {
                "model": "claude-sonnet-4-6",
                "max_tokens": 1024,
                "messages": [
                    {
                        "role": "user",
                        "content": (
                            "Extract structured data from this UCC filing. "
                            "Return JSON with fields: debtor_name, "
                            "secured_party, collateral_description, "
                            "filing_date, lapse_date, filing_number.\n\n"
                            f"Filing text:\n{text}"
                        ),
                    }
                ],
            },
        })
    return requests


def run_batch(filing_texts: list[str]) -> list[dict]:
    """
    Submit a batch of filing extraction requests and wait
    for results. Polls every 60 seconds.
    """
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    requests = create_batch_requests(filing_texts)
    print(f"Submitting batch of {len(requests)} requests...")
    try:
        batch = client.messages.batches.create(requests=requests)
        print(f"Batch created: {batch.id}")
        # Poll for completion
        while True:
            status = client.messages.batches.retrieve(batch.id)
            print(f"Status: {status.processing_status} "
                  f"({status.request_counts.succeeded} done)")
            if status.processing_status == "ended":
                break
            time.sleep(60)  # Check every minute
        # Collect results, keeping successes and failures separate per request
        results = []
        for result in client.messages.batches.results(batch.id):
            if result.result.type == "succeeded":
                text = result.result.message.content[0].text
                try:
                    data = json.loads(text)
                except json.JSONDecodeError as e:
                    # The request succeeded, but the model's reply was not valid JSON
                    results.append({
                        "id": result.custom_id,
                        "error": f"invalid JSON in response: {e}",
                    })
                    continue
                results.append({
                    "id": result.custom_id,
                    "data": data,
                })
            else:
                results.append({
                    "id": result.custom_id,
                    "error": str(result.result),
                })
        return results
    except anthropic.APIError as e:
        print(f"Batch API error: {e}")
        return []


if __name__ == "__main__":
    # Example: process 5 sample filings
    sample_filings = [
        "UCC-1 Filing: Debtor: Acme Corp, 123 Main St...",
        "UCC-1 Filing: Debtor: BuildRight LLC, 456 Oak Ave...",
        "UCC-3 Amendment: Original filing #2024-001234...",
        "UCC-1 Filing: Debtor: Metro Holdings Inc...",
        "UCC-3 Continuation: Original filing #2023-009876...",
    ]
    results = run_batch(sample_filings)
    for r in results:
        print(json.dumps(r, indent=2))
// batch_extract.ts
// Extract structured data from UCC filing documents using
// the Message Batches API (50% cheaper than synchronous calls).
import Anthropic from "@anthropic-ai/sdk";

interface BatchRequest {
  custom_id: string;
  params: {
    model: string;
    max_tokens: number;
    messages: Anthropic.MessageParam[];
  };
}

function createBatchRequests(filingTexts: string[]): BatchRequest[] {
  return filingTexts.map((text, i) => ({
    // custom_id lets us match each result back to its source filing
    custom_id: `filing-${String(i).padStart(4, "0")}`,
    params: {
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      messages: [
        {
          role: "user" as const,
          content:
            "Extract structured data from this UCC filing. " +
            "Return JSON with fields: debtor_name, " +
            "secured_party, collateral_description, " +
            "filing_date, lapse_date, filing_number.\n\n" +
            `Filing text:\n${text}`,
        },
      ],
    },
  }));
}

async function runBatch(filingTexts: string[]) {
  const client = new Anthropic(); // reads ANTHROPIC_API_KEY
  const requests = createBatchRequests(filingTexts);
  console.log(`Submitting batch of ${requests.length} requests...`);
  try {
    const batch = await client.messages.batches.create({ requests });
    console.log(`Batch created: ${batch.id}`);
    // Poll for completion
    while (true) {
      const status = await client.messages.batches.retrieve(batch.id);
      console.log(
        `Status: ${status.processing_status} ` +
          `(${status.request_counts.succeeded} done)`
      );
      if (status.processing_status === "ended") break;
      await new Promise((r) => setTimeout(r, 60_000)); // 1 min
    }
    // Collect results, keeping successes and failures separate per request
    const results: Array<{ id: string; data?: unknown; error?: string }> = [];
    // results() resolves to an async-iterable decoder, so await it before iterating
    for await (const result of await client.messages.batches.results(batch.id)) {
      if (result.result.type === "succeeded") {
        const text = (result.result.message.content[0] as Anthropic.TextBlock).text;
        try {
          results.push({ id: result.custom_id, data: JSON.parse(text) });
        } catch {
          // The request succeeded, but the model's reply was not valid JSON
          results.push({ id: result.custom_id, error: "invalid JSON in response" });
        }
      } else {
        results.push({ id: result.custom_id, error: JSON.stringify(result.result) });
      }
    }
    return results;
  } catch (error) {
    console.error(`Batch API error: ${error}`);
    return [];
  }
}

// Example usage
const sampleFilings = [
  "UCC-1 Filing: Debtor: Acme Corp, 123 Main St...",
  "UCC-1 Filing: Debtor: BuildRight LLC, 456 Oak Ave...",
  "UCC-3 Amendment: Original filing #2024-001234...",
];
runBatch(sampleFilings).then((results) =>
  results.forEach((r) => console.log(JSON.stringify(r, null, 2)))
);
You built a batch processing pipeline that extracts structured data from UCC filings at 50% of the normal cost. The batch API handles up to 10,000 requests asynchronously within 24 hours. The code polls for completion, then collects results — including error handling for individual failed requests. This pattern works for any high-volume extraction: document processing, dataset labeling, bulk code review.
"I should batch everything to save money." — Only if latency does not matter. Batch requests can take up to 24 hours. A chatbot user will not wait 24 hours for a 50% discount. Use batch for background processing (nightly reports, bulk extraction, dataset labeling). Use synchronous for anything where a human or system is waiting for the result.
"If one request in a batch fails, the whole batch fails." — No. Batches support partial success. Some requests may succeed while others fail (e.g., one filing is too long, another has malformed text). Your code must iterate through results and check each one individually. Never assume 100% success.
"Batch API is a different model or lower quality." — It is the exact same model, same quality, same capabilities. The only difference is scheduling: synchronous gets immediate GPU allocation, while batch requests are queued and processed when capacity is available. That flexible scheduling is why Anthropic can offer the 50% discount.
The exam tests when to use batch vs synchronous. Batch: non-urgent, high-volume, cost-sensitive (nightly analysis, bulk extraction). Synchronous: user-facing, real-time, interactive. A user waiting for a chatbot response should NEVER hit the batch API — they would wait up to 24 hours.
Background Work & Composition Patterns
Single-session Claude Code is already a force multiplier. But the top-1% workflow is composition: multiple Claude sessions running in parallel, scheduled background agents, and independent reviews of your own work. These aren't experimental features — they ship in Claude Code today and unlock workflows you literally cannot do with any other AI tool.
Worktrees — parallel sessions on the same repo
Git worktrees let you check out multiple branches into separate directories. Claude Code uses this to run truly parallel sessions without stepping on each other's changes. Want one Claude implementing the feature while another reviews the previous PR? Spin up two worktrees, two sessions, two terminals.
# Create a worktree on a new branch, isolated from your main checkout
git worktree add ../proj-feature-auth -b feat/auth
# Open Claude Code there
cd ../proj-feature-auth && claude
# Meanwhile, in your original checkout, run a SECOND Claude session
# reviewing the last commit (the "two-Claude review" pattern)
cd ../proj && claude "review the last commit on main as a staff engineer. Be harsh."
# When done, clean up
git worktree remove ../proj-feature-auth
The Two-Claude Review Pattern
This is the single highest-leverage technique to add to your workflow today. Session A implements a feature with full context of the tradeoffs. Session B starts cold and reviews the diff with no context. The cold reviewer is dramatically harder to fool because it has no rationalization for shortcuts that "made sense at the time."
Background Monitoring with /loop
Some checks shouldn't block your main session — they should run in the background on a timer. Use /loop <interval> <prompt>. The session keeps running, you keep working, and Claude reports back when something changes.
# Watch the CI pipeline on a feature branch
/loop 5m check if the CI pipeline on branch feat/recommendations passed; report back
# Periodically scan for new failing tests on main
/loop 30m check for any new failing tests on main; only ping me on changes
# Self-paced (Claude decides the cadence based on what it's watching)
/loop check the long-running build at task ID build-7842 and tell me when it finishes
Scheduled Routines with /schedule
/schedule creates a cron-style remote agent that runs even when your laptop is closed. Common patterns: a Monday-morning triage agent that processes the weekend's PR queue, or a one-time agent in two weeks that opens a cleanup PR for a feature flag you just shipped.
# Recurring routine: weekly PR triage every Monday at 9am
/schedule "0 9 * * 1" "review all open PRs older than 5 days; comment on staleness"
# One-time routine: clean up a feature flag in 2 weeks
/schedule "in 14 days" "open a PR removing the recommendations_v2 feature flag if it's still rolled out 100%"
Remote Control — Async Workflows from Anywhere
Long-running Claude tasks shouldn't tie you to your laptop. Run claude remote-control on your machine and you can connect to that running session from claude.ai or the iOS app. Kick off a 20-minute refactor, close your laptop, check progress from your phone on the way to a meeting. The session runs on your machine; the browser/app is just a window into it.
Voice — /voice + Push-to-Talk
Some workflows are dramatically faster spoken than typed — especially exploratory thinking-out-loud or describing an architecture. Run /voice to enable push-to-talk. Hold space, describe what you want, release. Claude transcribes and responds. The natural use case: long-form context dumps and brainstorming where typing is the bottleneck.
Plugins — sharing your setup
Once you've curated a great set of skills, slash commands, and hooks for your project, you can package them as a plugin. Plugins are installable via the Claude Code marketplace (or your team's private registry) and are how engineering teams standardize agent infrastructure across projects. The same plugin that gives the SRE team a /incident command and a Slack notification hook can be installed in every repo with one command.
These four primitives — worktrees, two-Claude review, /loop, /schedule — turn Claude Code from a fast assistant into a programmable engineering team. The pattern is always the same: stop doing the work in real-time on your main session. Spawn it, schedule it, parallelize it, or hand it to a fresh reviewer. The compounding effect is what the Anthropic engineering guidance calls "agents over workflows" — once the architecture supports autonomous, parallel work, you stop being the bottleneck.
Power User Features & Keyboard Shortcuts
The Anthropic common workflows doc lists the daily features that separate "I use Claude Code" from "I use Claude Code well." This section is the consolidated cheat sheet — every UX detail, keyboard shortcut, and CLI flag the docs recommend.
Plan Mode — The Full UX
You learned plan mode earlier. Here's how the docs say to actually drive it:
| Action | How |
|---|---|
| Switch into Plan Mode mid-session | Shift+Tab cycles: Normal → Auto-Accept → Plan Mode |
| Start a session in Plan Mode | claude --permission-mode plan |
| Run headless plan | claude --permission-mode plan -p "<query>" |
| Edit the plan in your editor before accepting | Ctrl+G opens plan in $EDITOR |
| Make Plan Mode the default | {"permissions": {"defaultMode": "plan"}} in .claude/settings.json |
@-File & Resource References
The @ prefix has two distinct uses depending on where you type it. Don't confuse them:
- In prompts: @src/utils/auth.js includes the file's full content. @src/components shows a directory listing. @github:repos/owner/repo/issues fetches data from a connected MCP server. Adding @file also pulls the CLAUDE.md from that file's directory and its parents into context. Multiple references work in one message: "explain the bug in @app.ts that the tests in @app.test.ts are catching."
- In CLAUDE.md: @README.md imports another file's content into CLAUDE.md at session start. Imports can nest recursively up to 5 hops. Relative paths resolve relative to the file that contains the import. Use @~/.claude/my-prefs.md to share personal instructions across worktrees.
Extended Thinking — The UX
You learned the concept of extended thinking in M22. Here's the daily UX:
| Action | How |
|---|---|
| See Claude's thinking inline | Ctrl+O toggles verbose mode |
| Toggle thinking for the session | Option+T (mac) / Alt+T (Win/Linux) |
| Ask for more thinking on one prompt | Include the keyword ultrathink anywhere in your prompt |
| Adjust effort level | /effort, or CLAUDE_CODE_EFFORT_LEVEL env var |
| Cap thinking budget | MAX_THINKING_TOKENS=10000 (Opus 4.7 ignores; uses adaptive) |
| Disable thinking globally | /config, sets alwaysThinkingEnabled: false |
Phrases like "think hard" or "think more" are read as regular prompt text, not thinking-budget allocations. Only ultrathink is interpreted as a thinking instruction.
Session Management — Resume, Name, Pick
Sessions persist automatically per project directory. The full vocabulary:
| Goal | CLI / Slash command |
|---|---|
| Resume the most recent session in this dir | claude --continue |
| Open the picker | claude --resume · from inside a session: /resume |
| Resume a session linked to a PR | claude --from-pr 123 (or paste PR URL into picker search) |
| Name a session at startup | claude -n auth-refactor |
| Rename mid-session | /rename auth-refactor |
| Resume by name (across worktrees) | claude --resume auth-refactor |
| Branch the current session | /branch · /rewind · --fork-session |
| In picker: search / preview / rename / widen | / · Space · Ctrl+R · Ctrl+A all projects · Ctrl+W worktrees · Ctrl+B branch filter |
Best practice: name early. /rename auth-refactor when starting a distinct task is much easier to find later than "explain this function." When the session is too old to fully reload, the picker offers a resume from summary path.
Worktree Extras — Built-In Flag, Includes, Subagent Worktrees
You learned worktrees as a parallel-session primitive. Three docs-recommended details:
- Built-in flag: claude --worktree feature-auth creates <repo>/.claude/worktrees/feature-auth/ on a new branch worktree-feature-auth. Omit the name for an auto-generated one. Worktrees branch from origin/HEAD; if your main branch changed, run git remote set-head origin -a to re-sync. Add .claude/worktrees/ to your .gitignore.
- .worktreeinclude: a gitignore-style file at the repo root listing files that should be copied into new worktrees (typically .env.local, config/secrets.json). Only files that match and are gitignored get copied; tracked files are never duplicated.
- Subagent worktrees: add isolation: worktree to a custom subagent's frontmatter, or ask Claude to "use worktrees for your agents." Each subagent gets its own worktree, automatically cleaned up if no changes were made. Orphaned worktrees from interrupted runs are swept at startup based on cleanupPeriodDays.
Scheduling — Pick the Right Tool
The docs list four ways to run Claude on a schedule. Choose by where the task should run:
| Option | Where it runs | Best for |
|---|---|---|
| Routines | Anthropic-managed infra | Tasks that must run when your computer is off; API + GitHub triggers |
| Desktop scheduled tasks | Your machine, via desktop app | Tasks needing local files / uncommitted changes |
| GitHub Actions | Your CI pipeline | Tied to repo events (PR opens, cron in CI) |
| /loop | Current CLI session | Quick polling while a session is open |
Scheduled tasks run autonomously — the prompt can't ask clarifying questions. Be explicit about success criteria and what to do with results: "Review open PRs labeled needs-review, comment on issues, post a summary in #eng-reviews Slack."
Unix-Style Utility — Headless & Pipes
Claude Code has a first-class non-interactive mode for shell pipelines and CI:
# Pipe data through Claude
cat build-error.txt | claude -p 'concisely explain the root cause' > output.txt
# Use as a linter in package.json
"lint:claude": "claude -p 'check the diff vs main for typos. report filename and line.'"
# Output formats
claude -p '...' --output-format text # default plain text
claude -p '...' --output-format json # full conversation log + cost/duration
claude -p '...' --output-format stream-json # real-time streaming JSON objects
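When a script needs to consume the result rather than just redirect it, the JSON format is the easiest to work with. The following is a minimal Python sketch, assuming claude is on your PATH and that the JSON payload exposes result, total_cost_usd, and duration_ms fields. Those field names are assumptions about the payload shape, so check them against the output of your installed version before relying on them.

```python
import json
import subprocess


def parse_claude_json(raw: str) -> dict:
    """Pick out the fields a CI script usually needs from --output-format json.

    The key names are assumptions about the payload; a missing key is
    returned as None instead of raising.
    """
    payload = json.loads(raw)
    return {
        "text": payload.get("result"),
        "cost_usd": payload.get("total_cost_usd"),
        "duration_ms": payload.get("duration_ms"),
    }


def run_headless(prompt: str, stdin_text: str = "") -> dict:
    """Run `claude -p` non-interactively, piping stdin_text in, and parse the JSON."""
    completed = subprocess.run(
        ["claude", "-p", prompt, "--output-format", "json"],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return parse_claude_json(completed.stdout)
```

Calling run_headless("concisely explain the root cause", error_log_text) mirrors the cat build-error.txt pipeline above, with the difference that the caller gets a dictionary back and can fail the build on a parse or exit-code error.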
Image Input
Three ways to give Claude an image: (1) drag and drop into the Claude Code window. (2) copy and paste with Ctrl+V (do not use Cmd+V). (3) reference a path: "Analyze this: /path/to/image.png." Useful for debugging error screenshots, generating CSS from design mockups, and reading database-schema diagrams. When Claude references an image like [Image #1], Cmd+Click / Ctrl+Click opens it.
Desktop Notifications via the Notification Hook
Long-running tasks shouldn't keep you tab-switching. The Notification hook fires when Claude needs permission, finishes work, or completes auth. Configure once in ~/.claude/settings.json:
{
"hooks": {
"Notification": [{
"matcher": "",
"hooks": [{
"type": "command",
"command": "osascript -e 'display notification \"Claude needs your attention\" with title \"Claude Code\"'"
}]
}]
}
}
Replace the command with notify-send on Linux or a PowerShell MessageBox on Windows. Narrow the matcher to permission_prompt / idle_prompt / auth_success if you only want specific events.
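If you work across macOS and Linux, one option is to point the hook at a small script instead of hard-coding osascript. The sketch below is illustrative only: it assumes the hook passes a JSON object on stdin with a message field (an assumption to verify against the hooks reference), and the function names are hypothetical. Windows is left out deliberately.

```python
import json
import subprocess
import sys


def escape_applescript(text: str) -> str:
    """Escape backslashes and double quotes for an AppleScript string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_notify_command(platform: str, message: str, title: str = "Claude Code") -> list:
    """Return the argv for a desktop notification on the given platform."""
    if platform == "darwin":
        script = (
            f'display notification "{escape_applescript(message)}" '
            f'with title "{escape_applescript(title)}"'
        )
        return ["osascript", "-e", script]
    if platform.startswith("linux"):
        return ["notify-send", title, message]
    raise ValueError(f"no notifier configured for platform {platform!r}")


def message_from_hook_input(raw: str, default: str = "Claude needs your attention") -> str:
    """Read the notification text from the hook's stdin JSON.

    Assumption: the payload is a JSON object with a "message" field.
    Anything else falls back to the default text.
    """
    try:
        payload = json.loads(raw) if raw.strip() else {}
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return payload.get("message") or default


def main() -> None:
    message = message_from_hook_input(sys.stdin.read())
    # check=False: a failed notification should never break the session.
    subprocess.run(build_notify_command(sys.platform, message), check=False)
```

To use it as a hook, call main() under the usual if __name__ == "__main__": guard and set the hook's command to invoke the script, for example python3 followed by wherever you saved it.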
CLAUDE.md Extras the Earlier Section Skipped
Five smaller CLAUDE.md features that didn't fit the main hierarchy diagram but matter at scale:
- /init: generates a starting CLAUDE.md by analyzing your codebase. If a CLAUDE.md exists, it suggests improvements rather than overwriting. Set CLAUDE_CODE_NEW_INIT=1 for an interactive multi-phase flow that also offers to set up skills and hooks.
- AGENTS.md compatibility: Claude Code reads CLAUDE.md, not AGENTS.md. If your repo already has an AGENTS.md for other tools, create a CLAUDE.md that just imports it: @AGENTS.md on line 1, then add Claude-specific instructions below.
- Managed policy CLAUDE.md: org-wide instructions deployed via MDM/Group Policy/Ansible to /Library/Application Support/ClaudeCode/CLAUDE.md (mac), /etc/claude-code/CLAUDE.md (Linux), or C:\Program Files\ClaudeCode\CLAUDE.md (Windows). Cannot be excluded by individual settings.
- claudeMdExcludes: in monorepos, ancestor CLAUDE.md files from other teams may pollute context. Skip them with glob patterns in .claude/settings.local.json: "claudeMdExcludes": ["**/monorepo/CLAUDE.md", "/home/user/monorepo/other-team/.claude/rules/**"].
- --add-dir: gives Claude access to directories outside your working dir. By default their CLAUDE.md isn't loaded; set CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD=1 to also load their memory files.
If Claude isn't following your CLAUDE.md, run three checks, in order: (1) Run /memory; if your CLAUDE.md isn't listed, Claude isn't seeing it (wrong path or excluded). (2) Make instructions specific. "Use 2-space indentation" beats "format code nicely." (3) Look for conflicting instructions across nested CLAUDE.md files; if two say different things for the same behavior, Claude picks one arbitrarily. For deep debugging, configure the InstructionsLoaded hook to log exactly which instruction files load, when, and why.
The exam tests your familiarity with the daily workflow primitives, not just the deep architecture. Common stems: "how do you switch into Plan Mode mid-session" → Shift+Tab. "how do you make Claude think more on one prompt" → ultrathink keyword. "how do you resume a session by PR" → --from-pr. Memorize the cheat sheets above — they're the operational vocabulary the cert assumes you know.
End-to-End: Building a Feature with the Full Stack
You've learned every primitive individually. The payoff is composition. Here's a real workflow for shipping a new API endpoint — from idea to merged PR — using all six layers together. The same task that takes 2–3 hours of manual work compresses to about 25 minutes when the system does the orchestration.
The scenario: add a /api/v2/recommendations endpoint to a Node.js API. Personalized content based on user history. Redis caching, auth middleware, tests.
The Six Steps in Detail
- Step 0: CLAUDE.md is already loaded. Because you set this up once, Claude already knows your stack, test framework, git workflow, and the things it gets wrong. Zero setup per session.
- Step 1: Interview pattern with AskUserQuestion. Instead of guessing requirements, ask Claude to interview you. "Interview me using AskUserQuestion. Ask about auth, caching strategy, response shape, edge cases, performance constraints. Don't assume anything. When done, write SPEC.md."
- Step 2: Implementation in a fresh session. New terminal, new session: "implement the spec in SPEC.md, TDD style". Your PostToolUse hook lints every file as Claude writes it. Your PreToolUse hook blocks dangerous commands. You're not babysitting any of this.
- Step 3: Parallel review subagent. While the main session is still running (or right after it commits): "Use the code-reviewer subagent on the last commit." The subagent reads the diff cold and returns a structured MUST FIX / SHOULD FIX / CONSIDER report.
- Step 4: Fix and validate. Hand the report back to the main session: "Reviewer found a Redis connection leak and auth middleware in wrong order. Fix both, re-run tests." Tests pass, hooks auto-lint.
- Step 5: Security audit subagent. "Use the security-auditor subagent on this feature." Scans for injection vectors, exposed secrets, auth gaps, rate-limit gaps. Returns clean (or more fixes).
- Step 6: Automated PR via GitHub MCP. "Create a PR. Include the spec, what changed and why, test coverage summary, known limitations." Claude uses GitHub MCP to create the PR, links Jira via Jira MCP, requests reviewers.
The Production-Grade File Tree
Every primitive in this module corresponds to a file location. This is the consolidated layout for a well-configured Claude Code project — what your .claude/ directory looks like once all six layers are wired up.
your-project/
├── CLAUDE.md ← project memory (commit this, <150 lines)
├── CLAUDE.local.md ← personal overrides (gitignore)
├── .claude/
│ ├── settings.json ← hooks, models, permissions, MCP
│ ├── agents/
│ │ ├── code-reviewer.md
│ │ ├── test-writer.md
│ │ ├── security-auditor.md
│ │ └── pm-spec.md
│ ├── skills/
│ │ ├── deploy/SKILL.md ← how we deploy to staging/prod
│ │ ├── database-patterns/SKILL.md ← our DB conventions
│ │ └── api-design/SKILL.md ← our API design rules
│ ├── commands/
│ │ ├── review-pr.md ← /review-pr $ARGUMENTS
│ │ ├── ship.md ← /ship — full pipeline
│ │ └── diagnose.md ← /diagnose — debugging workflow
│ └── hooks/
│ ├── block_dangerous.py
│ ├── auto_format.sh
│ └── session_summary.py
├── src/
│ ├── api/CLAUDE.md ← directory-scoped rules (loaded on demand)
│ └── db/CLAUDE.md ← DB-specific conventions
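The hooks/ directory in the tree lists block_dangerous.py without showing its contents. As a rough illustration of what such a PreToolUse hook could look like, here is a minimal sketch. It assumes the hook receives a JSON object on stdin with tool_name and tool_input.command, and that exiting with code 2 blocks the call and returns stderr to Claude. The deny-list patterns are hypothetical and intentionally small; tune them for your own environment.

```python
import json
import re
import sys
from typing import Optional, Tuple

# Hypothetical deny-list: a starting point, not a complete safety policy.
DANGEROUS_PATTERNS = [
    r"\brm\s+-\w*r\w*f",          # rm -rf and variants such as -Rf
    r"\brm\s+-\w*f\w*r",          # rm -fr
    r"\bgit\s+push\b.*--force",   # force pushes
    r"\bdrop\s+(table|database)\b",
]


def find_violation(command: str) -> Optional[str]:
    """Return the first deny-list pattern the command matches, if any."""
    for pattern in DANGEROUS_PATTERNS:
        if re.search(pattern, command, flags=re.IGNORECASE):
            return pattern
    return None


def decide(hook_input: dict) -> Tuple[int, str]:
    """Map a PreToolUse payload to (exit_code, stderr_message)."""
    if hook_input.get("tool_name") != "Bash":
        return 0, ""  # only shell commands are inspected
    command = hook_input.get("tool_input", {}).get("command", "")
    pattern = find_violation(command)
    if pattern is None:
        return 0, ""
    return 2, f"Blocked by block_dangerous.py: command matches {pattern!r}"


def main() -> None:
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)
```

Keeping the decision in a pure decide() function means the policy can be unit-tested without launching Claude Code; the script itself only needs to call main() under the usual if __name__ == "__main__": guard.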
Everyone else: "I'll give Claude a task and see how it does."
Top 1%: "I'll design a system where Claude operates effectively with minimum supervision."
That's an infrastructure mindset applied to AI tooling. You invest time upfront writing a tight CLAUDE.md, setting up hooks, and defining subagents, and that investment compounds on every session. The developers shipping the most aren't the best prompters; they're the best system designers. They think about where context degrades and preempt it, which quality gates should be automatic versus human-reviewed, and which parts of a task can run in parallel versus serially. It's not about being a better driver; it's about building a better road.
5-Day Rollout Plan — Where to Start Tomorrow
You don't need to do all of this at once. Start with the smallest change that compounds. Here's the prioritized rollout that the top-1% playbook recommends, in order of return-on-effort.
The exam tests your ability to recognize which primitive solves a given pain point. Stale context across sessions → CLAUDE.md. Ad-hoc lint enforcement → PostToolUse hook. Repetitive multi-step workflows → subagent. Live external data → MCP. Memorize the mapping; it shows up as scenario questions where 3 of 4 answers are technically possible but only 1 matches the right primitive for that pain.
Hands-On Exercise
What You'll Build
A complete Claude Code project configuration with three-level CLAUDE.md hierarchy, a custom slash command, a skill with context isolation, and a CI/CD review workflow.
Time estimate: 30–45 minutes
Prerequisites: Claude Code installed (npm install -g @anthropic-ai/claude-code), a code project directory to configure, and your ANTHROPIC_API_KEY set.
Files you'll create:
- ~/.claude/CLAUDE.md: Personal preferences (user-level)
- .claude/CLAUDE.md: Team standards (project-level)
- src/api/CLAUDE.md: API-specific rules (directory-level)
- .claude/commands/check-filing.md: Custom slash command
- .claude/skills/entity-resolution/SKILL.md: Skill with context fork
- .github/workflows/claude-review.yml: CI/CD workflow
Environment Setup
# Create project structure
mkdir -p my-ucc-project/src/api
mkdir -p my-ucc-project/.claude/commands
mkdir -p my-ucc-project/.claude/skills/entity-resolution
mkdir -p my-ucc-project/.github/workflows
cd my-ucc-project
# Verify Claude Code is installed
claude --version
# Verify API key is set
echo $ANTHROPIC_API_KEY | head -c 10 # Should print "sk-ant-..."
Step 1: Create User-Level CLAUDE.md
What & Why: The user-level file holds your personal coding preferences. It applies to every project you open with Claude Code but never gets committed to git, so it does not impose your style on teammates.
Create the file ~/.claude/CLAUDE.md with the following content:
# Personal Claude Code Preferences
## Editor Preferences
- Indentation: 2 spaces
- Line length: 100 characters max
- Trailing commas: always in multi-line structures
## Workflow Preferences
- Run tests after every code change
- Show git diff before committing
- Prefer small, focused commits over large batches
## Communication Style
- Be concise — skip preamble
- If unsure, ask before making assumptions
Run:
cat ~/.claude/CLAUDE.md
If you see your personal preferences printed, Step 1 is working. If not, check the troubleshooting below.
Troubleshooting:
- If you see No such file or directory → create the directory first: mkdir -p ~/.claude
- On Windows, the path is %USERPROFILE%\.claude\CLAUDE.md; use type %USERPROFILE%\.claude\CLAUDE.md instead of cat
- If the file exists but is empty → check you saved the content (some editors don't auto-save new files)
Step 2: Create Project-Level CLAUDE.md
What & Why: The project-level file defines team standards. It gets committed to git so every developer shares the same rules. This is where tech stack, database conventions, and API patterns go.
Create the file .claude/CLAUDE.md in your project root:
# UCC Pipeline Project — Claude Code Configuration
## Project Context
This is a UCC (Uniform Commercial Code) filing data pipeline.
Stack: Python 3.12, FastAPI, PostgreSQL 15, Redis, Docker.
## Coding Conventions
- Use type hints on ALL function signatures
- Docstrings: Google style (Args/Returns/Raises)
- Tests: pytest with fixtures, minimum 80% coverage
- Error handling: never catch bare `except:`, always specific exceptions
## Database Rules
- ALL queries must use parameterized statements (no f-strings in SQL)
- Migrations via Alembic — never modify schema directly
## API Patterns
- Routes: /api/v1/{resource}
- Validation: Pydantic models for all request/response bodies
Run:
cat .claude/CLAUDE.md
If you see the team-level configuration with tech stack and database rules, Step 2 is working.
Troubleshooting:
- If you see No such file or directory → make sure you created the file inside .claude/, not at the project root
- If you accidentally created it at the project root → move it: mv CLAUDE.md .claude/CLAUDE.md
- If the .claude/ directory doesn't exist → run mkdir -p .claude first
Step 3: Create Directory-Level CLAUDE.md
What & Why: Directory-level files override project-level rules for specific code areas. The API directory might require Pydantic validation on every endpoint, while the database layer requires parameterized queries.
Create src/api/CLAUDE.md:
# API Layer Rules (overrides project-level where applicable)
## Input Validation
- ALL endpoints must validate input with Pydantic models
- Return 422 with field-level error details on validation failure
## Response Format
- Always wrap responses in: { "data": ..., "error": null, "metadata": {} }
- Include request_id in metadata for tracing
## Security
- API key auth via X-API-Key header on all endpoints
- Rate limit: 100 req/min per API key
Run:
cat src/api/CLAUDE.md
If you see the API-specific rules, Step 3 is working. You now have three CLAUDE.md files at three levels. When Claude Code operates inside src/api/, it merges all three, with directory-level rules winning any conflicts.
Troubleshooting:
- If you see No such file or directory → make sure src/api/ exists: mkdir -p src/api
- If the file shows the project-level content instead → you may have saved to the wrong path. Check with ls -la src/api/
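To make the three-level hierarchy from Steps 1 to 3 concrete, the sketch below lists which CLAUDE.md files would apply to a given working directory, ordered from least to most specific. It is an illustration of the lookup order described in this exercise, not Claude Code's actual loader; the function name and the injectable exists check are hypothetical.

```python
from pathlib import Path
from typing import Callable, List


def applicable_claude_md(
    project_root: Path,
    working_dir: Path,
    user_home: Path,
    exists: Callable[[Path], bool] = Path.is_file,
) -> List[Path]:
    """Return the CLAUDE.md files that apply to working_dir, least specific first.

    Raises ValueError if working_dir is not inside project_root.
    """
    candidates = [
        user_home / ".claude" / "CLAUDE.md",     # user-level (Step 1)
        project_root / ".claude" / "CLAUDE.md",  # project-level (Step 2)
    ]
    # Directory-level: one candidate per directory between the root and working_dir.
    current = project_root
    for part in working_dir.relative_to(project_root).parts:
        current = current / part
        candidates.append(current / "CLAUDE.md")
    return [path for path in candidates if exists(path)]
```

Run against the exercise layout with working_dir set to src/api, the result lists the user-level file, then the project-level file, then src/api/CLAUDE.md. Inside Claude Code itself, /memory remains the authoritative way to see what was actually loaded.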
Step 4: Build a Slash Command
What & Why: A slash command creates a reusable workflow you can trigger by typing /check-filing. This one validates a UCC filing parser against known test data.
Create .claude/commands/check-filing.md:
---
description: Validate a UCC filing parser against test data
allowed-tools:
- Read
- Grep
- Glob
argument-hint: filing_type (e.g., UCC-1, UCC-3)
---
# Check Filing Parser
Review the parser for `$ARGUMENTS` filings.
## Checklist
1. Read the parser source for this filing type
2. Check: Does it handle all required fields? (debtor, secured party, collateral, dates)
3. Check: Are edge cases covered? (missing fields, malformed dates)
4. Check: Are there tests with adequate coverage?
## Report Format
- **Status**: PASS / FAIL / NEEDS REVIEW
- **Issues found**: list with severity
- **Missing coverage**: untested scenarios
Run:
Then open Claude Code in your project directory and type / — you should see check-filing in the list of available commands.
If the command appears in the / menu and the YAML frontmatter is correct, Step 4 is working.
Troubleshooting:
- If the command does not appear in the / menu → check the file is in .claude/commands/ (not .claude/skills/) and ends with .md
- If you see YAML parse error → check that the frontmatter uses --- delimiters (three dashes) and proper indentation
- If the command appears but fails → verify the allowed-tools list uses correct tool names: Read, Grep, Glob (case-sensitive)
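The last two troubleshooting items can be checked mechanically before you open Claude Code. The sketch below is a deliberately simple pre-flight check, not a YAML parser: it only understands the block-list form of allowed-tools used in this exercise, and KNOWN_TOOLS contains just the built-in tools named in this module, so a tool outside that set is flagged for review rather than proven invalid.

```python
from typing import List

# Built-in tool names mentioned in this module; not an exhaustive list.
KNOWN_TOOLS = {"Read", "Write", "Edit", "Bash", "Grep", "Glob"}


def check_command_file(text: str) -> List[str]:
    """Return a list of problems found in a command file's frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["frontmatter must start with a '---' line"]
    # Find the closing delimiter after the opening one.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return ["frontmatter is missing its closing '---' line"]

    problems = []
    in_tools = False
    for line in lines[1:end]:
        stripped = line.strip()
        if stripped.startswith("allowed-tools:"):
            in_tools = True
            continue
        if in_tools and stripped.startswith("- "):
            tool = stripped[2:].strip()
            if tool not in KNOWN_TOOLS:  # comparison is case-sensitive
                problems.append(f"unknown tool name: {tool!r}")
        elif stripped:
            in_tools = False  # any other key ends the allowed-tools list
    return problems
```

An empty list means the delimiters and tool names look right; a lowercase entry such as read is reported because the comparison is case-sensitive.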
Step 5: Build a Skill with Context Isolation
What & Why: Skills with context: fork run in an isolated context window, so exploring dozens of files does not clutter your main session. This skill does entity resolution — searching for name matches across UCC filings.
Create .claude/skills/entity-resolution/SKILL.md:
---
name: entity-resolution
description: Analyze entity names across UCC filings to find matches and variations. Use when asked to resolve, match, or deduplicate entity names.
context: fork
allowed-tools:
- Read
- Grep
- Glob
---
# Entity Resolution Analysis
Given entity name `$ARGUMENTS`, search for:
1. Exact matches in filing records
2. Variations (abbreviations, misspellings, legal suffixes)
3. Related entities (parent/subsidiary relationships)
Return a structured summary with match counts and a recommended canonical name.
Do NOT modify any files. Return only the analysis summary.
Run:
Then in Claude Code, type "resolve entity Acme Corp" — the skill should trigger automatically based on its description matching your request. You can also invoke it explicitly with /entity-resolution Acme Corp.
If the skill is available as /entity-resolution and auto-triggers when you mention entity matching, Step 5 is working.
Troubleshooting:
- If the skill does not appear → check the file is named exactly SKILL.md (all caps, exact casing) inside .claude/skills/entity-resolution/
- If the skill appears but does not auto-trigger → make sure the description field in the frontmatter contains keywords that match your request (e.g., "resolve", "match", "entity")
- If the skill runs but pollutes your main context → verify context: fork is set in the YAML frontmatter
Step 6: Configure CI/CD Review Workflow
What & Why: This GitHub Actions workflow runs Claude Code on every PR with session isolation — separate sessions for summarizing and reviewing changes. This prevents confirmation bias.
Create .github/workflows/claude-review.yml using the GitHub Actions code from the CI/CD section above. Ensure your repository has ANTHROPIC_API_KEY set in GitHub Secrets.
Run:
Validate the YAML syntax:
If you see "YAML is valid" and the file header matches, Step 6 is working.
Troubleshooting:
- If the workflow does not trigger: ensure on: pull_request is at the top level, not nested under jobs.
- If Claude Code fails in CI: check that ANTHROPIC_API_KEY is set in repository Settings → Secrets → Actions.
- If npm install -g fails in CI: the ubuntu-latest runner includes Node.js by default, but you may need actions/setup-node@v4 first.
Verify Everything Works
Run these checks to confirm your configuration is complete:
# Verify all files exist
echo "=== Checking CLAUDE.md hierarchy ==="
test -f ~/.claude/CLAUDE.md && echo "✓ User-level" || echo "✗ Missing ~/.claude/CLAUDE.md"
test -f .claude/CLAUDE.md && echo "✓ Project-level" || echo "✗ Missing .claude/CLAUDE.md"
test -f src/api/CLAUDE.md && echo "✓ Directory-level" || echo "✗ Missing src/api/CLAUDE.md"
echo "=== Checking commands and skills ==="
test -f .claude/commands/check-filing.md && echo "✓ Command" || echo "✗ Missing command"
test -f .claude/skills/entity-resolution/SKILL.md && echo "✓ Skill" || echo "✗ Missing skill"
echo "=== Checking CI/CD ==="
test -f .github/workflows/claude-review.yml && echo "✓ Workflow" || echo "✗ Missing workflow"
You have a fully configured Claude Code project with a three-level CLAUDE.md hierarchy, a custom command, a context-isolated skill, and a CI/CD review pipeline. This covers the core of Domain 3 on the certification exam.
Stretch goal (optional): Add a batch processing script that extracts structured data from 100 UCC filing documents using the Message Batches API from the Batch Processing section above.
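As a starting point for the stretch goal, here is a minimal sketch of how the batch could be assembled and submitted with the Anthropic Python SDK. The extraction prompt, the filing IDs, and the max_tokens value are all hypothetical, and the model is passed in as a parameter because the right model ID depends on your account. The SDK import lives inside submit_batch so the request-building logic can be tested without the package installed.

```python
from typing import Dict, List

# Hypothetical prompt for this exercise; adjust the fields to your schema.
EXTRACTION_PROMPT = (
    "Extract debtor, secured party, collateral, and filing date from this "
    "UCC filing. Respond with a single JSON object."
)


def build_batch_requests(filings: Dict[str, str], model: str) -> List[dict]:
    """Build one batch request per filing, keyed by a custom_id for matching results."""
    requests = []
    for filing_id, text in filings.items():
        requests.append({
            "custom_id": filing_id,
            "params": {
                "model": model,
                "max_tokens": 1024,
                "messages": [
                    {"role": "user", "content": f"{EXTRACTION_PROMPT}\n\n{text}"}
                ],
            },
        })
    return requests


def submit_batch(requests: List[dict]):
    """Submit the batch; requires `pip install anthropic` and ANTHROPIC_API_KEY."""
    import anthropic  # third-party SDK, imported lazily

    client = anthropic.Anthropic()
    return client.messages.batches.create(requests=requests)
```

Results arrive asynchronously, so a complete script would also poll the batch status and then match each result back to its filing through custom_id.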
You now have three ways to build agents.
Approach 1 — Raw API Loop (from M15B): 250 lines and full control. You drive every iteration of the loop, inspect stop_reason, and execute tools yourself.
Approach 2 — Agent SDK (from M26): 40 lines with hooks and sessions. The SDK runs the loop while you focus on tool definitions, hooks for guardrails, and sessions for state.
Approach 3 — Spec-Driven (from this module): 100 lines of spec and Claude Code generates everything. You describe what the agent should do; Claude builds the implementation, and you review and iterate on the spec.
Each builds on the one before. You cannot debug Approach 3 without understanding Approach 1.
CAPSTONE-7 is where you prove this by building the SAME agent all three ways and comparing code size, development time, and flexibility. That comparison is the graduation exercise of this course.
Knowledge Check
Q1: Where should personal code style preferences (indentation, line length) go?
- ~/.claude/CLAUDE.md
- .claude/CLAUDE.md
- src/CLAUDE.md
Answer: user-level (~/.claude/CLAUDE.md), so they apply to all projects but don't impose your style on teammates. Putting them in project-level forces your style on the whole team. Project-level is for team standards; directory-level is for path-specific rules.
Q2: What does context: fork do in a skill definition?
Answer: context: fork gives the skill its own isolated context window. Files read, searches performed, and intermediate reasoning stay in the forked context. Only the final result returns to the main session. It is not about git branches, containers, or model splitting; it is about keeping the main conversation clean while the skill does exploratory work.
Q3: When should you use plan mode instead of direct execution?
Q4: You need to find which files in your project contain the function validateInput. Which tool should you use?
- grep -r validateInput .
Q5: What is the critical anti-pattern with CI/CD code review?
Q6: When should you use the Message Batches API instead of synchronous calls?
Q7: A developer creates a command to explore a large codebase, reading 80+ files. What should they use instead?
Answer: a skill with context: fork, so exploration noise stays isolated. The 80+ files are read in the forked context and only the analysis summary returns to the main session, keeping it clean.
Q8: In the CLAUDE.md hierarchy, what happens when user-level says "indent: 4 spaces" and project-level says "indent: 2 spaces"?
Module Summary
- CLAUDE.md Hierarchy: User-level (personal) → project-level (team) → directory-level (path-specific). More specific wins. Use @import for shared rules.
- Commands vs Skills: Commands are manual triggers. Skills add context: fork for isolation and automatic invocation. Use skills for complex exploration tasks.
- Plan Mode: Use for complex, risky, or unfamiliar tasks. Skip for simple, well-understood changes. Decision depends on complexity, familiarity, risk, and reversibility.
- Built-in Tools: Glob (find files) → Read (understand) → Edit (change) → Bash (test). Know which tool fits which task.
- CI/CD: Use -p for non-interactive mode, --output-format json for structured output. Always use separate sessions for generation and review.
- Batch API: 50% cost reduction for high-volume, latency-tolerant tasks. 24-hour processing window. Not for real-time user-facing responses.
Next up: M26: Hooks, Sessions & Agent SDK covers event-driven automation, session management, and building custom agents with the Claude Agent SDK.
References & Resources
- Claude Code Documentation — Official guide to configuration and usage
- Message Batches API — Batch processing documentation
- Tool Use Documentation — Built-in and custom tool reference
- Anthropic Cookbook — Production-ready code examples