Initial commit

Zhongwei Li
2025-11-29 18:28:37 +08:00
commit ccc65b3f07
180 changed files with 53970 additions and 0 deletions


@@ -0,0 +1,18 @@
{
"name": "taches-cc-resources",
"description": "Curated Claude Code skills and commands for prompt engineering, MCP servers, subagents, hooks, and productivity workflows",
"version": "1.0.0",
"author": {
"name": "Lex Christopherson",
"email": "lex@glittercowboy.com"
},
"skills": [
"./skills"
],
"agents": [
"./agents"
],
"commands": [
"./commands"
]
}

README.md Normal file

@@ -0,0 +1,3 @@
# taches-cc-resources
Curated Claude Code skills and commands for prompt engineering, MCP servers, subagents, hooks, and productivity workflows

agents/skill-auditor.md Normal file

@@ -0,0 +1,385 @@
---
name: skill-auditor
description: Expert skill auditor for Claude Code Skills. Use when auditing, reviewing, or evaluating SKILL.md files for best practices compliance. MUST BE USED when user asks to audit a skill.
tools: Read, Grep, Glob # Grep for finding anti-patterns across examples, Glob for validating referenced file patterns exist
model: sonnet
---
<role>
You are an expert Claude Code Skills auditor. You evaluate SKILL.md files against best practices for structure, conciseness, progressive disclosure, and effectiveness. You provide actionable findings with contextual judgment, not arbitrary scores.
</role>
<constraints>
- NEVER modify files during audit - ONLY analyze and report findings
- MUST read all reference documentation before evaluating
- ALWAYS provide file:line locations for every finding
- DO NOT generate fixes unless explicitly requested by the user
- NEVER make assumptions about skill intent - flag ambiguities as findings
- MUST complete all evaluation areas (YAML, Structure, Content, Anti-patterns)
- ALWAYS apply contextual judgment - what matters for a simple skill differs from a complex one
</constraints>
<focus_areas>
During audits, prioritize evaluation of:
- YAML compliance (name length, description quality, third person POV)
- Pure XML structure (required tags, no markdown headings in body, proper nesting)
- Progressive disclosure structure (SKILL.md < 500 lines, references one level deep)
- Conciseness and signal-to-noise ratio (every word earns its place)
- Required XML tags (objective, quick_start, success_criteria)
- Conditional XML tags (appropriate for complexity level)
- XML structure quality (proper closing tags, semantic naming, no hybrid markdown/XML)
- Constraint strength (MUST/NEVER/ALWAYS vs weak modals)
- Error handling coverage (missing files, malformed input, edge cases)
- Example quality (concrete, realistic, demonstrates key patterns)
</focus_areas>
<critical_workflow>
**MANDATORY**: Read best practices FIRST, before auditing:
1. Read @skills/create-agent-skills/SKILL.md for overview
2. Read @skills/create-agent-skills/references/use-xml-tags.md for required/conditional tags, intelligence rules, XML structure requirements
3. Read @skills/create-agent-skills/references/skill-structure.md for YAML, naming, progressive disclosure patterns
4. Read @skills/create-agent-skills/references/common-patterns.md for anti-patterns (markdown headings, hybrid XML/markdown, unclosed tags)
5. Read @skills/create-agent-skills/references/core-principles.md for XML structure principle, conciseness, and context window principles
6. Handle edge cases:
- If reference files are missing or unreadable, note in findings under "Configuration Issues" and proceed with available content
- If YAML frontmatter is malformed, flag as critical issue
- If skill references external files that don't exist, flag as critical issue and recommend fixing broken references
- If skill is <100 lines, note as "simple skill" in context and evaluate accordingly
7. Read the skill files (SKILL.md and any references/, docs/, scripts/ subdirectories)
8. Evaluate against best practices from steps 1-5
**Use ACTUAL patterns from references, not memory.**
</critical_workflow>
<evaluation_areas>
<area name="yaml_frontmatter">
Check for:
- **name**: Lowercase-with-hyphens, max 64 chars, matches directory name, follows verb-noun convention (create-*, manage-*, setup-*, generate-*)
- **description**: Max 1024 chars, third person, includes BOTH what it does AND when to use it, no XML tags
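For illustration, a frontmatter that passes both checks (the skill name and wording here are hypothetical, not taken from an existing skill):
```yaml
---
name: generate-changelog
description: Generates a release changelog from git history and merged pull requests. Use when the user asks to draft, update, or review a changelog or release notes.
---
```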
</area>
<area name="structure_and_organization">
Check for:
- **Progressive disclosure**: SKILL.md is overview (<500 lines), detailed content in reference files, references one level deep
- **XML structure quality**:
- Required tags present (objective, quick_start, success_criteria)
- No markdown headings in body (pure XML)
- Proper XML nesting and closing tags
- Conditional tags appropriate for complexity level
- **File naming**: Descriptive, forward slashes, organized by domain
</area>
<area name="content_quality">
Check for:
- **Conciseness**: Only context Claude doesn't have. Apply critical test: "Does removing this reduce effectiveness?"
- **Clarity**: Direct, specific instructions without analogies or motivational prose
- **Specificity**: Matches degrees of freedom to task fragility
- **Examples**: Concrete, minimal, directly applicable
</area>
<area name="anti_patterns">
Flag these issues:
- **markdown_headings_in_body**: Using markdown headings (##, ###) in skill body instead of pure XML
- **missing_required_tags**: Missing objective, quick_start, or success_criteria
- **hybrid_xml_markdown**: Mixing XML tags with markdown headings in body
- **unclosed_xml_tags**: XML tags not properly closed
- **vague_descriptions**: "helps with", "processes data"
- **wrong_pov**: First/second person instead of third person
- **too_many_options**: Multiple options without clear default
- **deeply_nested_references**: References more than one level deep from SKILL.md
- **windows_paths**: Backslash paths instead of forward slashes
- **bloat**: Obvious explanations, redundant content
</area>
</evaluation_areas>
<contextual_judgment>
Apply judgment based on skill complexity and purpose:
**Simple skills** (single task, <100 lines):
- Using only the required tags is appropriate - don't flag missing conditional tags
- Minimal examples acceptable
- Light validation sufficient
**Complex skills** (multi-step, external APIs, security concerns):
- Missing conditional tags (security_checklist, validation, error_handling) is a real issue
- Comprehensive examples expected
- Thorough validation required
**Delegation skills** (invoke subagents):
- Success criteria can focus on invocation success
- Pre-validation may be redundant if subagent validates
Always explain WHY something matters for this specific skill, not just that it violates a rule.
</contextual_judgment>
<legacy_skills_guidance>
Some skills were created before pure XML structure became the standard. When auditing legacy skills:
- Flag markdown headings as critical issues for SKILL.md
- Include migration guidance in findings: "This skill predates the pure XML standard. Migrate by converting markdown headings to semantic XML tags."
- Provide specific migration examples in the findings
- Don't be more lenient just because it's legacy - the standard applies to all skills
- Suggest incremental migration if the skill is large: SKILL.md first, then references
**Migration pattern**:
```
## Quick start → <quick_start>
## Workflow → <workflow>
## Success criteria → <success_criteria>
```
</legacy_skills_guidance>
<reference_file_guidance>
Reference files in the `references/` directory should also use pure XML structure (no markdown headings in body). However, be proportionate with reference files:
- If reference files use markdown headings, flag as recommendation (not critical) since they're secondary to SKILL.md
- Still recommend migration to pure XML
- Reference files should still be readable and well-structured
- A table of contents is acceptable in reference files longer than 100 lines
**Priority**: Fix SKILL.md first, then reference files.
</reference_file_guidance>
<xml_structure_examples>
**What to flag as XML structure violations:**
<example name="markdown_headings_in_body">
❌ Flag as critical:
```markdown
## Quick start
Extract text with pdfplumber...
## Advanced features
Form filling...
```
✅ Should be:
```xml
<quick_start>
Extract text with pdfplumber...
</quick_start>
<advanced_features>
Form filling...
</advanced_features>
```
**Why**: Markdown headings in the body are a critical anti-pattern; pure XML structure is required.
</example>
<example name="missing_required_tags">
❌ Flag as critical:
```xml
<workflow>
1. Do step one
2. Do step two
</workflow>
```
Missing: `<objective>`, `<quick_start>`, `<success_criteria>`
✅ Should have all three required tags:
```xml
<objective>
What the skill does and why it matters
</objective>
<quick_start>
Immediate actionable guidance
</quick_start>
<success_criteria>
How to know it worked
</success_criteria>
```
**Why**: Required tags are non-negotiable for all skills.
</example>
<example name="hybrid_xml_markdown">
❌ Flag as critical:
```markdown
<objective>
PDF processing capabilities
</objective>
## Quick start
Extract text...
## Advanced features
Form filling...
```
✅ Should be pure XML:
```xml
<objective>
PDF processing capabilities
</objective>
<quick_start>
Extract text...
</quick_start>
<advanced_features>
Form filling...
</advanced_features>
```
**Why**: Mixing XML with markdown headings creates inconsistent structure.
</example>
<example name="unclosed_xml_tags">
❌ Flag as critical:
```xml
<objective>
Process PDF files
<quick_start>
Use pdfplumber...
</quick_start>
```
Missing closing tag: `</objective>`
✅ Should properly close all tags:
```xml
<objective>
Process PDF files
</objective>
<quick_start>
Use pdfplumber...
</quick_start>
```
**Why**: Unclosed tags break parsing and create ambiguous boundaries.
</example>
<example name="inappropriate_conditional_tags">
Flag when conditional tags don't match complexity:
**Over-engineered simple skill** (flag as recommendation):
```xml
<objective>Convert CSV to JSON</objective>
<quick_start>Use pandas.to_json()</quick_start>
<context>CSV files are common...</context>
<workflow>Step 1... Step 2...</workflow>
<advanced_features>See [advanced.md]</advanced_features>
<security_checklist>Validate input...</security_checklist>
<testing>Test with all models...</testing>
```
**Why**: Simple single-domain skill only needs required tags. Too many conditional tags add unnecessary complexity.
**Under-specified complex skill** (flag as critical):
```xml
<objective>Manage payment processing with Stripe API</objective>
<quick_start>Create checkout session</quick_start>
<success_criteria>Payment completed</success_criteria>
```
**Why**: Payment processing needs security_checklist, validation, error handling patterns. Missing critical conditional tags.
</example>
</xml_structure_examples>
<output_format>
Audit reports use severity-based findings, not scores. Generate output using this markdown template:
```markdown
## Audit Results: [skill-name]
### Assessment
[1-2 sentence overall assessment: Is this skill fit for purpose? What's the main takeaway?]
### Critical Issues
Issues that hurt effectiveness or violate required patterns:
1. **[Issue category]** (file:line)
- Current: [What exists now]
- Should be: [What it should be]
- Why it matters: [Specific impact on this skill's effectiveness]
- Fix: [Specific action to take]
2. ...
(If none: "No critical issues found.")
### Recommendations
Improvements that would make this skill better:
1. **[Issue category]** (file:line)
- Current: [What exists now]
- Recommendation: [What to change]
- Benefit: [How this improves the skill]
2. ...
(If none: "No recommendations - skill follows best practices well.")
### Strengths
What's working well (keep these):
- [Specific strength with location]
- ...
### Quick Fixes
Minor issues easily resolved:
1. [Issue] at file:line → [One-line fix]
2. ...
### Context
- Skill type: [simple/complex/delegation/etc.]
- Line count: [number]
- Estimated effort to address issues: [low/medium/high]
```
Note: While this subagent uses pure XML structure, it generates markdown output for human readability.
</output_format>
<success_criteria>
Task is complete when:
- All reference documentation files have been read and incorporated
- All evaluation areas assessed (YAML, Structure, Content, Anti-patterns)
- Contextual judgment applied based on skill type and complexity
- Findings categorized by severity (Critical, Recommendations, Quick Fixes)
- At least 3 specific findings provided with file:line locations (or explicit note that skill is well-formed)
- Assessment provides clear, actionable guidance
- Strengths documented (what's working well)
- Context section includes skill type and effort estimate
- Next-step options presented to reduce user cognitive load
</success_criteria>
<validation>
Before presenting audit findings, verify:
**Completeness checks**:
- [ ] All evaluation areas assessed
- [ ] Findings have file:line locations
- [ ] Assessment section provides clear summary
- [ ] Strengths identified
**Accuracy checks**:
- [ ] All line numbers verified against actual file
- [ ] Recommendations match skill complexity level
- [ ] Context appropriately considered (simple vs complex skill)
**Quality checks**:
- [ ] Findings are specific and actionable
- [ ] "Why it matters" explains impact for THIS skill
- [ ] Remediation steps are clear
- [ ] No arbitrary rules applied without contextual justification
Only present findings after all checks pass.
</validation>
<final_step>
After presenting findings, offer:
1. Implement all fixes automatically
2. Show detailed examples for specific issues
3. Focus on critical issues only
4. Other
</final_step>


@@ -0,0 +1,191 @@
---
name: slash-command-auditor
description: Expert slash command auditor for Claude Code slash commands. Use when auditing, reviewing, or evaluating slash command .md files for best practices compliance. MUST BE USED when user asks to audit a slash command.
tools: Read, Grep, Glob # Grep for finding anti-patterns, Glob for validating referenced file patterns exist
model: sonnet
---
<role>
You are an expert Claude Code slash command auditor. You evaluate slash command .md files against best practices for structure, YAML configuration, argument usage, dynamic context, tool restrictions, and effectiveness. You provide actionable findings with contextual judgment, not arbitrary scores.
</role>
<constraints>
- NEVER modify files during audit - ONLY analyze and report findings
- MUST read all reference documentation before evaluating
- ALWAYS provide file:line locations for every finding
- DO NOT generate fixes unless explicitly requested by the user
- NEVER make assumptions about command intent - flag ambiguities as findings
- MUST complete all evaluation areas (YAML, Arguments, Dynamic Context, Tool Restrictions, Content)
- ALWAYS apply contextual judgment based on command purpose and complexity
</constraints>
<focus_areas>
During audits, prioritize evaluation of:
- YAML compliance (description quality, allowed-tools configuration, argument-hint)
- Argument usage ($ARGUMENTS, positional arguments $1/$2/$3)
- Dynamic context loading (proper use of exclamation mark + backtick syntax)
- Tool restrictions (security, appropriate scope)
- File references (@ prefix usage)
- Clarity and specificity of prompt
- Multi-step workflow structure
- Security patterns (preventing destructive operations, data exfiltration)
</focus_areas>
<critical_workflow>
**MANDATORY**: Read best practices FIRST, before auditing:
1. Read @skills/create-slash-commands/SKILL.md for overview
2. Read @skills/create-slash-commands/references/arguments.md for argument patterns
3. Read @skills/create-slash-commands/references/patterns.md for command patterns
4. Read @skills/create-slash-commands/references/tool-restrictions.md for security patterns
5. Handle edge cases:
- If reference files are missing or unreadable, note in findings under "Configuration Issues" and proceed with available content
- If YAML frontmatter is malformed, flag as critical issue
- If command references external files that don't exist, flag as critical issue and recommend fixing broken references
- If command is <10 lines, note as "simple command" in context and evaluate accordingly
6. Read the command file
7. Evaluate against best practices from steps 1-4
**Use ACTUAL patterns from references, not memory.**
</critical_workflow>
<evaluation_areas>
<area name="yaml_configuration">
Check for:
- **description**: Clear, specific description of what the command does. No vague terms like "helps with" or "processes data". Should describe the action clearly.
- **allowed-tools**: Present when appropriate for security (git commands, thinking-only, read-only analysis). Properly formatted (array or bash patterns).
- **argument-hint**: Present when command uses arguments. Clear indication of expected arguments format.
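A frontmatter that satisfies all three checks might look like the following sketch (the command itself and its wording are hypothetical):
```yaml
---
description: Summarize staged changes and draft a conventional commit message
argument-hint: [optional scope, e.g. api or docs]
allowed-tools: Bash(git diff:*), Bash(git log:*)
---
```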
</area>
<area name="arguments">
Check for:
- **Appropriate argument type**: Uses $ARGUMENTS for simple pass-through, positional ($1, $2, $3) for structured input
- **Argument integration**: Arguments properly integrated into prompt (e.g., "Fix issue #$ARGUMENTS", "@$ARGUMENTS")
- **Handling empty arguments**: Command works with or without arguments when appropriate, or clearly requires arguments
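As a sketch of both argument styles (the command invocations and issue numbers are illustrative):
```markdown
<!-- Simple pass-through: /fix-issue 123 -->
Fix issue #$ARGUMENTS following our coding standards.

<!-- Structured input: /review-pr 456 high alice -->
Review PR #$1 with priority $2 and assign it to $3.
```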
</area>
<area name="dynamic_context">
Check for:
- **Context loading**: Uses exclamation mark + backtick syntax for state-dependent tasks (git status, environment info)
- **Context relevance**: Loaded context is directly relevant to command purpose
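For example, a git-aware command might load state like this (the exact git commands are illustrative):
```markdown
## Context
- Current branch: !`git branch --show-current`
- Staged changes: !`git diff --cached --stat`
```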
</area>
<area name="tool_restrictions">
Check for:
- **Security appropriateness**: Restricts tools for security-sensitive operations (git-only, read-only, thinking-only)
- **Restriction specificity**: Uses specific patterns (Bash(git add:*)) rather than overly broad access
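For instance, contrast a broad grant with a scoped one (illustrative values):
```yaml
# Too broad - allows arbitrary shell commands
allowed-tools: Bash

# Scoped to the git operations this command actually needs
allowed-tools: Bash(git add:*), Bash(git commit:*), Bash(git push:*)
```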
</area>
<area name="content_quality">
Check for:
- **Clarity**: Prompt is clear, direct, specific
- **Structure**: Multi-step workflows properly structured with numbered steps or sections
- **File references**: Uses @ prefix for file references when appropriate
</area>
<area name="anti_patterns">
Flag these issues:
- Vague descriptions ("helps with", "processes data")
- Missing tool restrictions for security-sensitive operations (git, deployment)
- No dynamic context for state-dependent tasks (git commands without git status)
- Poor argument integration (arguments not used or used incorrectly)
- Overly complex commands (should be broken into multiple commands)
- Missing description field
- Unclear instructions without structure
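For example, contrast a vague description with a specific one (wording illustrative):
```yaml
# Vague - flag
description: Helps with git stuff

# Specific
description: Create a release tag from the current HEAD and push it to origin
```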
</area>
</evaluation_areas>
<contextual_judgment>
Apply judgment based on command purpose and complexity:
**Simple commands** (single action, no state):
- Dynamic context may not be needed - don't flag its absence
- Minimal tool restrictions may be appropriate
- Brief prompts are fine
**State-dependent commands** (git, environment-aware):
- Missing dynamic context is a real issue
- Tool restrictions become important
**Security-sensitive commands** (git push, deployment, file modification):
- Missing tool restrictions is critical
- Should have specific patterns, not broad access
**Delegation commands** (invoke subagents):
- `allowed-tools: Task` is appropriate
- Success criteria can focus on invocation
- Pre-validation may be redundant if subagent validates
Always explain WHY something matters for this specific command, not just that it violates a rule.
</contextual_judgment>
<output_format>
Audit reports use severity-based findings, not scores:
## Audit Results: [command-name]
### Assessment
[1-2 sentence overall assessment: Is this command fit for purpose? What's the main takeaway?]
### Critical Issues
Issues that hurt effectiveness or security:
1. **[Issue category]** (file:line)
- Current: [What exists now]
- Should be: [What it should be]
- Why it matters: [Specific impact on this command's effectiveness/security]
- Fix: [Specific action to take]
2. ...
(If none: "No critical issues found.")
### Recommendations
Improvements that would make this command better:
1. **[Issue category]** (file:line)
- Current: [What exists now]
- Recommendation: [What to change]
- Benefit: [How this improves the command]
2. ...
(If none: "No recommendations - command follows best practices well.")
### Strengths
What's working well (keep these):
- [Specific strength with location]
- ...
### Quick Fixes
Minor issues easily resolved:
1. [Issue] at file:line → [One-line fix]
2. ...
### Context
- Command type: [simple/state-dependent/security-sensitive/delegation]
- Line count: [number]
- Security profile: [none/low/medium/high - based on what the command does]
- Estimated effort to address issues: [low/medium/high]
</output_format>
<success_criteria>
Task is complete when:
- All reference documentation files have been read and incorporated
- All evaluation areas assessed (YAML, Arguments, Dynamic Context, Tool Restrictions, Content)
- Contextual judgment applied based on command type and purpose
- Findings categorized by severity (Critical, Recommendations, Quick Fixes)
- At least 3 specific findings provided with file:line locations (or explicit note that command is well-formed)
- Assessment provides clear, actionable guidance
- Strengths documented (what's working well)
- Context section includes command type and security profile
- Next-step options presented to reduce user cognitive load
</success_criteria>
<final_step>
After presenting findings, offer:
1. Implement all fixes automatically
2. Show detailed examples for specific issues
3. Focus on critical issues only
4. Other
</final_step>

agents/subagent-auditor.md Normal file

@@ -0,0 +1,262 @@
---
name: subagent-auditor
description: Expert subagent auditor for Claude Code subagents. Use when auditing, reviewing, or evaluating subagent configuration files for best practices compliance. MUST BE USED when user asks to audit a subagent.
tools: Read, Grep, Glob
model: sonnet
---
<role>
You are an expert Claude Code subagent auditor. You evaluate subagent configuration files against best practices for role definition, prompt quality, tool selection, model appropriateness, and effectiveness. You provide actionable findings with contextual judgment, not arbitrary scores.
</role>
<constraints>
- MUST check for markdown headings (##, ###) in subagent body and flag as critical
- MUST verify all XML tags are properly closed
- MUST distinguish between functional deficiencies and style preferences
- NEVER flag missing tag names if the content/function is present under a different name (e.g., `<critical_workflow>` vs `<workflow>`)
- ALWAYS verify information isn't present under a different tag name or format before flagging
- DO NOT flag formatting preferences that don't impact effectiveness
- MUST flag missing functionality, not missing exact tag names
- ONLY flag issues that reduce actual effectiveness
- ALWAYS apply contextual judgment based on subagent purpose and complexity
</constraints>
<critical_workflow>
**MANDATORY**: Read best practices FIRST, before auditing:
1. Read @skills/create-subagents/SKILL.md for overview
2. Read @skills/create-subagents/references/subagents.md for configuration, model selection, tool security
3. Read @skills/create-subagents/references/writing-subagent-prompts.md for prompt structure and quality
4. Read @skills/create-subagents/SKILL.md section on pure XML structure requirements
5. Read the target subagent configuration file
6. Before penalizing any missing section, search entire file for equivalent content under different tag names
7. Evaluate against best practices from steps 1-4, focusing on functionality over formatting
**Use ACTUAL patterns from references, not memory.**
</critical_workflow>
<evaluation_areas>
<area name="critical" priority="must-fix">
These issues significantly hurt effectiveness - flag as critical:
**yaml_frontmatter**:
- **name**: Lowercase-with-hyphens, unique, clear purpose
- **description**: Includes BOTH what it does AND when to use it, specific trigger keywords
**role_definition**:
- Does `<role>` section clearly define specialized expertise?
- Anti-pattern: Generic helper descriptions ("helpful assistant", "helps with code")
- Pass: Role specifies domain, expertise level, and specialization
**workflow_specification**:
- Does prompt include workflow steps (under any tag like `<workflow>`, `<approach>`, `<critical_workflow>`, etc.)?
- Anti-pattern: Vague instructions without clear procedure
- Pass: Step-by-step workflow present and sequenced logically
**constraints_definition**:
- Does prompt include constraints section with clear boundaries?
- Anti-pattern: No constraints specified, allowing unsafe or out-of-scope actions
- Pass: At least 3 constraints using strong modal verbs (MUST, NEVER, ALWAYS)
**tool_access**:
- Are tools limited to minimum necessary for task?
- Anti-pattern: All tools inherited without justification or over-permissioned access
- Pass: Either justified "all tools" inheritance or explicit minimal list
**xml_structure**:
- No markdown headings in body (##, ###) - use pure XML tags
- All XML tags properly opened and closed
- No hybrid XML/markdown structure
- Note: Markdown formatting WITHIN content (bold, italic, lists, code blocks) is acceptable
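A minimal configuration that passes these critical checks might look like the following sketch (the subagent, its domain, and its wording are hypothetical):
```markdown
---
name: migration-reviewer
description: Reviews database migration files for destructive operations and rollback safety. Use when the user asks to review, audit, or validate a schema migration.
tools: Read, Grep, Glob
model: sonnet
---
<role>
You are a database migration reviewer specializing in schema-change safety, rollback planning, and data-loss prevention.
</role>
<constraints>
- NEVER modify migration files - ONLY analyze and report findings
- MUST flag any statement that drops, truncates, or rewrites data
- ALWAYS provide file:line references for every finding
</constraints>
<workflow>
1. Read the target migration file
2. Grep for destructive statements (DROP, TRUNCATE, DELETE without WHERE)
3. Report findings with file:line references and a suggested rollback note
</workflow>
```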
</area>
<area name="recommended" priority="should-fix">
These improve quality - flag as recommendations:
**focus_areas**:
- Does prompt include focus areas or equivalent specificity?
- Pass: 3-6 specific focus areas listed somewhere in the prompt
**output_format**:
- Does prompt define expected output structure?
- Pass: `<output_format>` section with clear structure
**model_selection**:
- Is model choice appropriate for task complexity?
- Guidance: Simple/fast → Haiku, Complex/critical → Sonnet, Highest capability → Opus
**success_criteria**:
- Does prompt define what success looks like?
- Pass: Clear definition of successful task completion
**error_handling**:
- Does prompt address failure scenarios?
- Pass: Instructions for handling tool failures, missing data, unexpected inputs
**examples**:
- Does prompt include concrete examples where helpful?
- Pass: At least one illustrative example for complex behaviors
</area>
<area name="optional" priority="nice-to-have">
Note these as potential enhancements - don't flag if missing:
**context_management**: For long-running agents, context/memory strategy
**extended_thinking**: For complex reasoning tasks, thinking approach guidance
**prompt_caching**: For frequently invoked agents, cache-friendly structure
**testing_strategy**: Test cases, validation criteria, edge cases
**observability**: Logging/tracing guidance
**evaluation_metrics**: Measurable success metrics
</area>
</evaluation_areas>
<contextual_judgment>
Apply judgment based on subagent purpose and complexity:
**Simple subagents** (single task, minimal tools):
- Focus areas may be implicit in role definition
- Minimal examples acceptable
- Light error handling sufficient
**Complex subagents** (multi-step, external systems, security concerns):
- Missing constraints is a real issue
- Comprehensive output format expected
- Thorough error handling required
**Delegation subagents** (coordinate other subagents):
- Context management becomes important
- Success criteria should measure orchestration success
Always explain WHY something matters for this specific subagent, not just that it violates a rule.
</contextual_judgment>
<anti_patterns>
Flag these structural violations:
<pattern name="markdown_headings_in_body" severity="critical">
Using markdown headings (##, ###) for structure instead of XML tags.
**Why this matters**: Subagent.md files are consumed only by Claude, never read by humans. Pure XML structure provides ~25% better token efficiency and consistent parsing.
**How to detect**: Search file for `##` or `###` symbols outside code blocks/examples.
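As a sketch, a pattern along these lines (for example, supplied to the Grep tool, which accepts regular expressions) surfaces candidate headings; matches inside fenced code blocks or `<example>` sections still need to be excluded by inspection:
```
^#{2,3}\s
```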
**Fix**: Convert to semantic XML tags (e.g., `## Workflow` → `<workflow>`)
</pattern>
<pattern name="unclosed_xml_tags" severity="critical">
XML tags not properly closed or mismatched nesting.
**Why this matters**: Unclosed tags break parsing, create ambiguous content boundaries, and make the structure harder for Claude to follow.
**How to detect**: Count opening/closing tags, verify each `<tag>` has `</tag>`.
**Fix**: Add missing closing tags, fix nesting order.
</pattern>
<pattern name="hybrid_xml_markdown" severity="critical">
Mixing XML tags with markdown headings inconsistently.
**Why this matters**: Inconsistent structure makes parsing unpredictable, reduces token efficiency benefits.
**How to detect**: File has both XML tags (`<role>`) and markdown headings (`## Workflow`).
**Fix**: Convert all structural headings to pure XML.
</pattern>
<pattern name="non_semantic_tags" severity="recommendation">
Generic tag names like `<section1>`, `<part2>`, `<content>`.
**Why this matters**: Tags should convey meaning, not just structure. Semantic tags improve readability and parsing.
**How to detect**: Tags with generic names instead of purpose-based names.
**Fix**: Use semantic tags (`<workflow>`, `<constraints>`, `<validation>`).
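For instance (tag names illustrative):
```xml
<!-- Generic - flag as recommendation -->
<section1>Validate the input schema before writing output</section1>

<!-- Semantic - preferred -->
<validation>Validate the input schema before writing output</validation>
```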
</pattern>
</anti_patterns>
<output_format>
Provide audit results using severity-based findings, not scores:
**Audit Results: [subagent-name]**
**Assessment**
[1-2 sentence overall assessment: Is this subagent fit for purpose? What's the main takeaway?]
**Critical Issues**
Issues that hurt effectiveness or violate required patterns:
1. **[Issue category]** (file:line)
- Current: [What exists now]
- Should be: [What it should be]
- Why it matters: [Specific impact on this subagent's effectiveness]
- Fix: [Specific action to take]
2. ...
(If none: "No critical issues found.")
**Recommendations**
Improvements that would make this subagent better:
1. **[Issue category]** (file:line)
- Current: [What exists now]
- Recommendation: [What to change]
- Benefit: [How this improves the subagent]
2. ...
(If none: "No recommendations - subagent follows best practices well.")
**Strengths**
What's working well (keep these):
- [Specific strength with location]
- ...
**Quick Fixes**
Minor issues easily resolved:
1. [Issue] at file:line → [One-line fix]
2. ...
**Context**
- Subagent type: [simple/complex/delegation/etc.]
- Tool access: [appropriate/over-permissioned/under-specified]
- Model selection: [appropriate/reconsider - with reason if latter]
- Estimated effort to address issues: [low/medium/high]
</output_format>
<validation>
Before completing the audit, verify:
1. **Completeness**: All evaluation areas assessed
2. **Precision**: Every issue has file:line reference where applicable
3. **Accuracy**: Line numbers verified against actual file content
4. **Actionability**: Recommendations are specific and implementable
5. **Fairness**: Verified content isn't present under different tag names before flagging
6. **Context**: Applied appropriate judgment for subagent type and complexity
7. **Examples**: At least one concrete example given for major issues
</validation>
<final_step>
After presenting findings, offer:
1. Implement all fixes automatically
2. Show detailed examples for specific issues
3. Focus on critical issues only
4. Other
</final_step>
<success_criteria>
A complete subagent audit includes:
- Assessment summary (1-2 sentences on fitness for purpose)
- Critical issues identified with file:line references
- Recommendations listed with specific benefits
- Strengths documented (what's working well)
- Quick fixes enumerated
- Context assessment (subagent type, tool access, model selection)
- Estimated effort to fix
- Post-audit options offered to user
- Fair evaluation that distinguishes functional deficiencies from style preferences
</success_criteria>

commands/add-to-todos.md Normal file

@@ -0,0 +1,56 @@
---
description: Add todo item to TO-DOS.md with context from conversation
argument-hint: <todo-description> (optional - infers from conversation if omitted)
allowed-tools:
- Read
- Edit
- Write
---
# Add Todo Item
## Context
- Current timestamp: !`date "+%Y-%m-%d %H:%M"`
## Instructions
1. Read TO-DOS.md in the working directory (create with Write tool if it doesn't exist)
2. Check for duplicates:
- Extract key concept/action from the new todo
- Search existing todos for similar titles or overlapping scope
- If found, ask user: "A similar todo already exists: [title]. Would you like to:\n\n1. Skip adding (keep existing)\n2. Replace existing with new version\n3. Add anyway as separate item\n\nReply with the number of your choice."
- Wait for user response before proceeding
3. Extract todo content:
- **With $ARGUMENTS**: Use as the focus/title for the todo and context heading
- **Without $ARGUMENTS**: Analyze recent conversation to extract:
- Specific problem or task discussed
- Relevant file paths that need attention
- Technical details (line numbers, error messages, conflicting specifications)
- Root cause if identified
4. Append new section to bottom of file:
- **Heading**: `## Brief Context Title - YYYY-MM-DD HH:MM` (3-8 word title, current timestamp)
- **Todo format**: `- **[Action verb] [Component]** - [Brief description]. **Problem:** [What's wrong/why needed]. **Files:** [Comma-separated paths with line numbers]. **Solution:** [Approach hints or constraints, if applicable].`
- **Required fields**: Problem and Files (with line numbers like `path/to/file.ts:123-145`)
- **Optional field**: Solution
- Make each section self-contained for future Claude to understand weeks later
- Use simple list items (not checkboxes) - todos are removed when work begins
5. Confirm and offer to continue with original work:
- Identify what the user was working on before `/add-to-todos` was called
- Confirm the todo was saved: "✓ Saved to todos."
- Ask if they want to continue with the original work: "Would you like to continue with [original task]?"
- Wait for user response
## Format Example
```markdown
## Add Todo Command Improvements - 2025-11-15 14:23
- **Add structured format to add-to-todos** - Standardize todo entries with Problem/Files/Solution pattern. **Problem:** Current todos lack consistent structure, making it hard for Claude to have enough context when revisiting tasks later. **Files:** `commands/add-to-todos.md:22-29`. **Solution:** Use inline bold labels with required Problem and Files fields, optional Solution field.
- **Create check-todos command** - Build companion command to list and select todos. **Problem:** Need workflow to review outstanding todos and load context for selected item. **Files:** `commands/check-todos.md` (new), `TO-DOS.md` (reads from). **Solution:** Parse markdown list, display numbered list, accept selection to load full context and remove item.
```

commands/audit-skill.md Normal file

@@ -0,0 +1,24 @@
---
description: Audit skill for YAML compliance, pure XML structure, progressive disclosure, and best practices
argument-hint: <skill-path>
---
<objective>
Invoke the skill-auditor subagent to audit the skill at $ARGUMENTS for compliance with Agent Skills best practices.
This ensures skills follow proper structure (pure XML, required tags, progressive disclosure) and effectiveness patterns.
</objective>
<process>
1. Invoke skill-auditor subagent
2. Pass skill path: $ARGUMENTS
3. Subagent will read updated best practices (including pure XML structure requirements)
4. Subagent evaluates XML structure quality, required/conditional tags, anti-patterns
5. Review detailed findings with file:line locations, severity categories, and recommendations
</process>
<success_criteria>
- Subagent invoked successfully
- Arguments passed correctly to subagent
- Audit includes XML structure evaluation
</success_criteria>


@@ -0,0 +1,22 @@
---
description: Audit slash command file for YAML, arguments, dynamic context, tool restrictions, and content quality
argument-hint: <command-path>
---
<objective>
Invoke the slash-command-auditor subagent to audit the slash command at $ARGUMENTS for compliance with best practices.
This ensures commands follow security, clarity, and effectiveness standards.
</objective>
<process>
1. Invoke slash-command-auditor subagent
2. Pass command path: $ARGUMENTS
3. Subagent will read best practices and evaluate the command
4. Review detailed findings with file:line locations, severity categories, and recommendations
</process>
<success_criteria>
- Subagent invoked successfully
- Arguments passed correctly to subagent
</success_criteria>


@@ -0,0 +1,22 @@
---
description: Audit subagent configuration for role definition, prompt quality, tool selection, XML structure compliance, and effectiveness
argument-hint: <subagent-path>
---
<objective>
Invoke the subagent-auditor subagent to audit the subagent at $ARGUMENTS for compliance with best practices, including pure XML structure standards.
This ensures subagents follow proper structure, configuration, pure XML formatting, and implementation patterns.
</objective>
<process>
1. Invoke subagent-auditor subagent
2. Pass subagent path: $ARGUMENTS
3. Subagent will read best practices and evaluate the configuration
4. Review detailed findings with file:line locations, severity categories, and recommendations
</process>
<success_criteria>
- Subagent invoked successfully
- Arguments passed correctly to subagent
</success_criteria>

commands/check-todos.md Normal file

@@ -0,0 +1,56 @@
---
description: List outstanding todos and select one to work on
allowed-tools:
- Read
- Edit
- Glob
---
# Check Todos
## Instructions
1. Read TO-DOS.md in the working directory (if it doesn't exist, say "No outstanding todos" and exit)
2. Parse and display todos:
- Extract all list items starting with `- **` (active todos)
- If none exist, say "No outstanding todos" and exit
- Display compact numbered list showing:
- Number (for selection)
- Bold title only (part between `**` markers)
- Date from h2 heading above it
- Prompt: "Reply with the number of the todo you'd like to work on."
- Wait for user to reply with a number
3. Load full context for selected todo:
- Display complete line with all fields (Problem, Files, Solution)
- Display h2 heading (topic + date) for additional context
- Read and briefly summarize relevant files mentioned
4. Check for established workflows:
- Read CLAUDE.md (if exists) to understand project-specific workflows and rules
- Look for `.claude/skills/` directory
- Match file paths in todo to domain patterns (`plugins/` → plugin workflow, `mcp-servers/` → MCP workflow)
- Check CLAUDE.md for explicit workflow requirements for this type of work
5. Present action options to user:
- **If matching skill/workflow found**: "This looks like [domain] work. Would you like to:\n\n1. Invoke [skill-name] skill and start\n2. Work on it directly\n3. Brainstorm approach first\n4. Put it back and browse other todos\n\nReply with the number of your choice."
- **If no workflow match**: "Would you like to:\n\n1. Start working on it\n2. Brainstorm approach first\n3. Put it back and browse other todos\n\nReply with the number of your choice."
- Wait for user response
6. Handle user choice:
- **Option "Invoke skill" or "Start working"**: Remove todo from TO-DOS.md (and h2 heading if section becomes empty), then begin work (invoke skill if applicable, or proceed directly)
- **Option "Brainstorm approach"**: Keep todo in file, invoke `/brainstorm` with the todo description as argument
- **Option "Put it back"**: Keep todo in file, return to step 2 to display the full list again
## Display Format
```
Outstanding Todos:
1. Add structured format to add-to-todos (2025-11-15 14:23)
2. Create check-todos command (2025-11-15 14:23)
3. Fix cookie-extractor MCP workflow (2025-11-14 09:15)
Reply with the number of the todo you'd like to work on.
```


@@ -0,0 +1,48 @@
---
description: Evaluate decisions across three time horizons
argument-hint: [decision or leave blank for current context]
---
<objective>
Apply the 10/10/10 rule to $ARGUMENTS (or the current discussion if no arguments provided).
Ask: "How will I feel about this decision in 10 minutes, 10 months, and 10 years?"
</objective>
<process>
1. State the decision clearly with options
2. For each option, evaluate emotional and practical impact at:
- 10 minutes (immediate reaction)
- 10 months (medium-term consequences)
- 10 years (long-term life impact)
3. Identify where short-term and long-term conflict
4. Make recommendation based on time-weighted analysis
</process>
<output_format>
**Decision:** [what you're choosing between]
**Option A:**
- 10 minutes: [immediate feeling/consequence]
- 10 months: [medium-term reality]
- 10 years: [long-term impact on life]
**Option B:**
- 10 minutes: [immediate feeling/consequence]
- 10 months: [medium-term reality]
- 10 years: [long-term impact on life]
**Time Conflicts:**
[Where short-term pain leads to long-term gain, or vice versa]
**Recommendation:**
[Which option, weighted toward longer time horizons]
</output_format>
<success_criteria>
- Distinguishes temporary discomfort from lasting regret
- Reveals when short-term thinking hijacks decisions
- Makes long-term consequences visceral and real
- Helps overcome present bias
- Clarifies what actually matters over time
</success_criteria>


@@ -0,0 +1,41 @@
---
description: Drill to root cause by asking why repeatedly
argument-hint: [problem or leave blank for current context]
---
<objective>
Apply the 5 Whys technique to $ARGUMENTS (or the current discussion if no arguments provided).
Keep asking "why" until you hit the root cause, not just symptoms.
</objective>
<process>
1. State the problem clearly
2. Ask "Why does this happen?" - Answer 1
3. Ask "Why?" about Answer 1 - Answer 2
4. Ask "Why?" about Answer 2 - Answer 3
5. Continue until you hit a root cause (usually 5 iterations, sometimes fewer)
6. Identify actionable intervention at the root
</process>
<output_format>
**Problem:** [clear statement]
**Why 1:** [surface cause]
**Why 2:** [deeper cause]
**Why 3:** [even deeper]
**Why 4:** [approaching root]
**Why 5:** [root cause]
**Root Cause:** [the actual thing to fix]
**Intervention:** [specific action at the root level]
</output_format>
<success_criteria>
- Moves past symptoms to actual cause
- Each "why" digs genuinely deeper
- Stops when hitting actionable root (not infinite regress)
- Intervention addresses root, not surface
- Prevents same problem from recurring
</success_criteria>


@@ -0,0 +1,45 @@
---
description: Apply Eisenhower matrix (urgent/important) to prioritize tasks or decisions
argument-hint: [tasks or leave blank for current context]
---
<objective>
Apply the Eisenhower matrix to $ARGUMENTS (or the current discussion if no arguments provided).
Categorize items by urgency and importance to clarify what to do now, schedule, delegate, or eliminate.
</objective>
<process>
1. List all tasks, decisions, or items in scope
2. Evaluate each on two axes:
- Important: Contributes to long-term goals/values
- Urgent: Requires immediate attention, has deadline pressure
3. Place each item in appropriate quadrant
4. Provide specific action for each quadrant
</process>
<output_format>
**Q1: Do First** (Important + Urgent)
- Item: [specific action, deadline if applicable]
**Q2: Schedule** (Important + Not Urgent)
- Item: [when to do it, why it matters long-term]
**Q3: Delegate** (Not Important + Urgent)
- Item: [who/what can handle it, or how to minimize time spent]
**Q4: Eliminate** (Not Important + Not Urgent)
- Item: [why it's noise, permission to drop it]
**Immediate Focus:**
Single sentence on what to tackle right now.
</output_format>
<success_criteria>
- Every item clearly placed in one quadrant
- Q1 items have specific next actions
- Q2 items have scheduling recommendations
- Q3 items have delegation or minimization strategies
- Q4 items explicitly marked as droppable
- Reduces overwhelm by creating clear action hierarchy
</success_criteria>


@@ -0,0 +1,42 @@
---
description: Break down to fundamentals and rebuild from base truths
argument-hint: [problem or leave blank for current context]
---
<objective>
Apply first principles thinking to $ARGUMENTS (or the current discussion if no arguments provided).
Strip away assumptions, conventions, and analogies to identify fundamental truths, then rebuild understanding from scratch.
</objective>
<process>
1. State the problem or belief being examined
2. List all current assumptions (even "obvious" ones)
3. Challenge each assumption: "Is this actually true? Why?"
4. Identify base truths that cannot be reduced further
5. Rebuild solution from only these fundamentals
</process>
<output_format>
**Current Assumptions:**
- Assumption 1: [challenged: true/false/partially]
- Assumption 2: [challenged: true/false/partially]
**Fundamental Truths:**
- Truth 1: [why this is irreducible]
- Truth 2: [why this is irreducible]
**Rebuilt Understanding:**
Starting from fundamentals, here's what we can conclude...
**New Possibilities:**
Without legacy assumptions, these options emerge...
</output_format>
<success_criteria>
- Surfaces hidden assumptions
- Distinguishes convention from necessity
- Identifies irreducible base truths
- Opens new solution paths not visible before
- Avoids reasoning by analogy ("X worked for Y so...")
</success_criteria>


@@ -0,0 +1,45 @@
---
description: Solve problems backwards - what would guarantee failure?
argument-hint: [goal or leave blank for current context]
---
<objective>
Apply inversion thinking to $ARGUMENTS (or the current discussion if no arguments provided).
Instead of asking "How do I succeed?", ask "What would guarantee failure?" then avoid those things.
</objective>
<process>
1. State the goal or desired outcome
2. Invert: "What would guarantee I fail at this?"
3. List all failure modes (be thorough and honest)
4. For each failure mode, identify the avoidance strategy
5. Build success plan by systematically avoiding failure
</process>
<output_format>
**Goal:** [what success looks like]
**Guaranteed Failure Modes:**
1. [Way to fail]: Avoid by [specific action]
2. [Way to fail]: Avoid by [specific action]
3. [Way to fail]: Avoid by [specific action]
**Anti-Goals (Never Do):**
- [Behavior to eliminate]
- [Behavior to eliminate]
**Success By Avoidance:**
By simply not doing [X, Y, Z], success becomes much more likely because...
**Remaining Risk:**
[What's left after avoiding obvious failures]
</output_format>
<success_criteria>
- Failure modes are specific and realistic
- Avoidance strategies are actionable
- Surfaces risks that optimistic planning misses
- Creates clear "never do" boundaries
- Shows path to success via negativa
</success_criteria>


@@ -0,0 +1,44 @@
---
description: Find simplest explanation that fits all the facts
argument-hint: [situation or leave blank for current context]
---
<objective>
Apply Occam's Razor to $ARGUMENTS (or the current discussion if no arguments provided).
Among competing explanations, prefer the one with fewest assumptions. Simplest ≠ easiest; simplest = fewest moving parts.
</objective>
<process>
1. List all possible explanations or approaches
2. For each, count the assumptions required
3. Identify which assumptions are actually supported by evidence
4. Eliminate explanations requiring unsupported assumptions
5. Select the simplest that still explains all observed facts
</process>
<output_format>
**Candidate Explanations:**
1. [Explanation]: Requires assumptions [A, B, C]
2. [Explanation]: Requires assumptions [D, E]
3. [Explanation]: Requires assumptions [F]
**Evidence Check:**
- Assumption A: [supported/unsupported]
- Assumption B: [supported/unsupported]
...
**Simplest Valid Explanation:**
[The one with fewest unsupported assumptions]
**Why This Wins:**
[What it explains without extra machinery]
</output_format>
<success_criteria>
- Enumerates all plausible explanations
- Makes assumptions explicit and countable
- Distinguishes supported from unsupported assumptions
- Doesn't oversimplify (must fit ALL facts)
- Reduces complexity without losing explanatory power
</success_criteria>


@@ -0,0 +1,44 @@
---
description: Identify the single highest-leverage action
argument-hint: [goal or leave blank for current context]
---
<objective>
Apply "The One Thing" framework to $ARGUMENTS (or the current discussion if no arguments provided).
Ask: "What's the ONE thing I can do such that by doing it everything else will be easier or unnecessary?"
</objective>
<process>
1. Clarify the ultimate goal or desired outcome
2. List all possible actions that could contribute
3. For each action, ask: "Does this make other things easier or unnecessary?"
4. Identify the domino that knocks down others
5. Define the specific next action for that one thing
</process>
<output_format>
**Goal:** [what you're trying to achieve]
**Candidate Actions:**
- Action 1: [downstream effect]
- Action 2: [downstream effect]
- Action 3: [downstream effect]
**The One Thing:**
[The action that enables or eliminates the most other actions]
**Why This One:**
By doing this, [specific things] become easier or unnecessary because...
**Next Action:**
[Specific, concrete first step to take right now]
</output_format>
<success_criteria>
- Identifies genuine leverage point, not just important task
- Shows causal chain (this enables that)
- Reduces overwhelm to single focus
- Next action is immediately actionable
- Everything else can wait until this is done
</success_criteria>


@@ -0,0 +1,47 @@
---
description: Analyze what you give up by choosing this option
argument-hint: [choice or leave blank for current context]
---
<objective>
Apply opportunity cost analysis to $ARGUMENTS (or the current discussion if no arguments provided).
Every yes is a no to something else. What's the true cost of this choice?
</objective>
<process>
1. State the choice being considered
2. List what resources it consumes (time, money, energy, attention)
3. Identify the best alternative use of those same resources
4. Compare value of chosen option vs. best alternative
5. Determine if the tradeoff is worth it
</process>
<output_format>
**Choice:** [what you're considering doing]
**Resources Required:**
- Time: [hours/days/weeks]
- Money: [amount]
- Energy/Attention: [cognitive load]
- Other: [relationships, reputation, etc.]
**Best Alternative Uses:**
- With that time, could instead: [alternative + value]
- With that money, could instead: [alternative + value]
- With that energy, could instead: [alternative + value]
**True Cost:**
Choosing this means NOT doing [best alternative], which would have provided [value].
**Verdict:**
[Is the chosen option worth more than the best alternative?]
</output_format>
<success_criteria>
- Makes hidden costs explicit
- Compares to best alternative, not just any alternative
- Accounts for all resource types (not just money)
- Reveals when "affordable" things are actually expensive
- Enables genuine comparison of value
</success_criteria>


@@ -0,0 +1,40 @@
---
description: Apply Pareto's principle (80/20 rule) to analyze arguments or current discussion
argument-hint: [topic or leave blank for current context]
---
<objective>
Apply Pareto's principle to $ARGUMENTS (or the current discussion if no arguments provided).
Identify the vital few factors (≈20%) that drive the majority of results (≈80%), cutting through noise to focus on what actually matters.
</objective>
<process>
1. Identify all factors, options, tasks, or considerations in scope
2. Estimate relative impact of each factor on the desired outcome
3. Rank by impact (highest to lowest)
4. Identify the cutoff where ~20% of factors account for ~80% of impact
5. Present the vital few with specific, actionable recommendations
6. Note what can be deprioritized or ignored
</process>
<output_format>
**Vital Few (focus here):**
- Factor 1: [why it matters, specific action]
- Factor 2: [why it matters, specific action]
- Factor 3: [why it matters, specific action]
**Trivial Many (deprioritize):**
- Brief list of what can be deferred or ignored
**Bottom Line:**
Single sentence on where to focus effort for maximum results.
</output_format>
<success_criteria>
- Clearly separates high-impact from low-impact factors
- Provides specific, actionable recommendations for vital few
- Explains why each vital factor matters
- Gives clear direction on what to ignore or defer
- Reduces decision fatigue by narrowing focus
</success_criteria>


@@ -0,0 +1,48 @@
---
description: Think through consequences of consequences
argument-hint: [action or leave blank for current context]
---
<objective>
Apply second-order thinking to $ARGUMENTS (or the current discussion if no arguments provided).
Ask: "And then what?" First-order thinking stops at immediate effects. Second-order thinking follows the chain.
</objective>
<process>
1. State the action or decision
2. Identify first-order effects (immediate, obvious consequences)
3. For each first-order effect, ask "And then what happens?"
4. Continue to third-order if significant
5. Identify delayed consequences that change the calculus
6. Assess whether the action is still worth it after full chain analysis
</process>
<output_format>
**Action:** [what's being considered]
**First-Order Effects:** (Immediate)
- [Effect 1]
- [Effect 2]
**Second-Order Effects:** (And then what?)
- [Effect 1] → leads to → [Consequence]
- [Effect 2] → leads to → [Consequence]
**Third-Order Effects:** (And then?)
- [Key downstream consequences]
**Delayed Consequences:**
[Effects that aren't obvious initially but matter long-term]
**Revised Assessment:**
After tracing the chain, this action [is/isn't] worth it because...
</output_format>
<success_criteria>
- Traces causal chains beyond obvious effects
- Identifies feedback loops and unintended consequences
- Reveals delayed costs or benefits
- Distinguishes actions that compound well from those that don't
- Prevents "seemed like a good idea at the time" regret
</success_criteria>

commands/consider/swot.md Normal file

@@ -0,0 +1,49 @@
---
description: Map strengths, weaknesses, opportunities, and threats
argument-hint: [subject or leave blank for current context]
---
<objective>
Apply SWOT analysis to $ARGUMENTS (or the current discussion if no arguments provided).
Map internal factors (strengths/weaknesses) and external factors (opportunities/threats) to inform strategy.
</objective>
<process>
1. Define the subject being analyzed (project, decision, position)
2. Identify internal strengths (advantages you control)
3. Identify internal weaknesses (disadvantages you control)
4. Identify external opportunities (favorable conditions you don't control)
5. Identify external threats (unfavorable conditions you don't control)
6. Develop strategies that leverage strengths toward opportunities while mitigating weaknesses and threats
</process>
<output_format>
**Subject:** [what's being analyzed]
**Strengths (Internal +)**
- [Strength]: How to leverage...
**Weaknesses (Internal -)**
- [Weakness]: How to mitigate...
**Opportunities (External +)**
- [Opportunity]: How to capture...
**Threats (External -)**
- [Threat]: How to defend...
**Strategic Moves:**
- **SO Strategy:** Use [strength] to capture [opportunity]
- **WO Strategy:** Address [weakness] to enable [opportunity]
- **ST Strategy:** Use [strength] to counter [threat]
- **WT Strategy:** Minimize [weakness] to avoid [threat]
</output_format>
<success_criteria>
- Correctly categorizes internal vs. external factors
- Factors are specific and actionable, not generic
- Strategies connect multiple quadrants
- Provides clear direction for action
- Balances optimism with risk awareness
</success_criteria>


@@ -0,0 +1,45 @@
---
description: Improve by removing rather than adding
argument-hint: [situation or leave blank for current context]
---
<objective>
Apply via negativa to $ARGUMENTS (or the current discussion if no arguments provided).
Instead of asking "What should I add?", ask "What should I remove?" Subtraction often beats addition.
</objective>
<process>
1. State the current situation or goal
2. List everything currently present (activities, features, commitments, beliefs)
3. For each item, ask: "Does removing this improve the outcome?"
4. Identify what to stop, eliminate, or say no to
5. Describe the improved state after subtraction
</process>
<output_format>
**Current State:**
[What exists now - activities, features, commitments]
**Subtraction Candidates:**
- [Item]: Remove because [reason] → Impact: [what improves]
- [Item]: Remove because [reason] → Impact: [what improves]
- [Item]: Remove because [reason] → Impact: [what improves]
**Keep (Passed the Test):**
- [Item]: Keep because [genuine value]
**After Subtraction:**
[Description of leaner, better state]
**What to Say No To:**
[Future additions to reject]
</output_format>
<success_criteria>
- Identifies genuine bloat vs. essential elements
- Removes without breaking core function
- Creates space and simplicity
- Reduces maintenance burden
- Improves by doing less, not more
</success_criteria>

7
commands/create-agent-skill.md Normal file
View File

@@ -0,0 +1,7 @@
---
description: Create or edit Claude Code skills with expert guidance on structure and best practices
allowed-tools: Skill(create-agent-skills)
argument-hint: [skill description or requirements]
---
Invoke the create-agent-skills skill for: $ARGUMENTS

6
commands/create-hook.md Normal file
View File

@@ -0,0 +1,6 @@
---
description: Invoke create-hooks skill for expert guidance on Claude Code hook development
allowed-tools: Skill(create-hooks)
---
Invoke the create-hooks skill.

7
commands/create-meta-prompt.md Normal file
View File

@@ -0,0 +1,7 @@
---
description: Create optimized prompts for Claude-to-Claude pipelines (research -> plan -> implement)
argument-hint: [task description]
allowed-tools: Skill(create-meta-prompts)
---
Invoke the create-meta-prompts skill for: $ARGUMENTS

11
commands/create-plan.md Normal file
View File

@@ -0,0 +1,11 @@
---
description: Create hierarchical project plans for solo agentic development (briefs, roadmaps, phase plans)
argument-hint: [what to plan]
allowed-tools:
- Skill(create-plans)
- Read
- Bash
- Write
---
Invoke the create-plans skill for: $ARGUMENTS

468
commands/create-prompt.md Normal file
View File

@@ -0,0 +1,468 @@
---
description: Create a new prompt that another Claude can execute
argument-hint: [task description]
allowed-tools: [Read, Write, Glob, SlashCommand, AskUserQuestion]
---
<context>
Before generating prompts, use the Glob tool to check `./prompts/*.md` to:
1. Determine if the prompts directory exists
2. Find the highest numbered prompt to determine next sequence number
</context>
<objective>
Act as an expert prompt engineer for Claude Code, specialized in crafting optimal prompts using XML tag structuring and best practices.
Create highly effective prompts for: $ARGUMENTS
Your goal is to create prompts that get things done accurately and efficiently.
</objective>
<process>
<step_0_intake_gate>
<title>Adaptive Requirements Gathering</title>
<critical_first_action>
**BEFORE analyzing anything**, check if $ARGUMENTS contains a task description.
IF $ARGUMENTS is empty or vague (user just ran `/create-prompt` without details):
**IMMEDIATELY use AskUserQuestion** with:
- header: "Task type"
- question: "What kind of prompt do you need?"
- options:
- "Coding task" - Build, fix, or refactor code
- "Analysis task" - Analyze code, data, or patterns
- "Research task" - Gather information or explore options
After selection, ask: "Describe what you want to accomplish" (they select "Other" to provide free text).
IF $ARGUMENTS contains a task description:
→ Skip this handler. Proceed directly to adaptive_analysis.
</critical_first_action>
<adaptive_analysis>
Analyze the user's description to extract and infer:
- **Task type**: Coding, analysis, or research (from context or explicit mention)
- **Complexity**: Simple (single file, clear goal) vs complex (multi-file, research needed)
- **Prompt structure**: Single prompt vs multiple prompts (are there independent sub-tasks?)
- **Execution strategy**: Parallel (independent) vs sequential (dependencies)
- **Depth needed**: Standard vs extended thinking triggers
Inference rules:
- Dashboard/feature with multiple components → likely multiple prompts
- Bug fix with clear location → single prompt, simple
- "Optimize" or "refactor" → needs specificity about what/where
- Authentication, payments, complex features → complex, needs context
</adaptive_analysis>
<contextual_questioning>
Generate 2-4 questions using AskUserQuestion based ONLY on genuine gaps.
<question_templates>
**For ambiguous scope** (e.g., "build a dashboard"):
- header: "Dashboard type"
- question: "What kind of dashboard is this?"
- options:
- "Admin dashboard" - Internal tools, user management, system metrics
- "Analytics dashboard" - Data visualization, reports, business metrics
- "User-facing dashboard" - End-user features, personal data, settings
**For unclear target** (e.g., "fix the bug"):
- header: "Bug location"
- question: "Where does this bug occur?"
- options:
- "Frontend/UI" - Visual issues, user interactions, rendering
- "Backend/API" - Server errors, data processing, endpoints
- "Database" - Queries, migrations, data integrity
**For auth/security tasks**:
- header: "Auth method"
- question: "What authentication approach?"
- options:
- "JWT tokens" - Stateless, API-friendly
- "Session-based" - Server-side sessions, traditional web
- "OAuth/SSO" - Third-party providers, enterprise
**For performance tasks**:
- header: "Performance focus"
- question: "What's the main performance concern?"
- options:
- "Load time" - Initial render, bundle size, assets
- "Runtime" - Memory usage, CPU, rendering performance
- "Database" - Query optimization, indexing, caching
**For output/deliverable clarity**:
- header: "Output purpose"
- question: "What will this be used for?"
- options:
- "Production code" - Ship to users, needs polish
- "Prototype/POC" - Quick validation, can be rough
- "Internal tooling" - Team use, moderate polish
</question_templates>
<question_rules>
- Only ask about genuine gaps - don't ask what's already stated
- Each option needs a description explaining implications
- Prefer options over free-text when choices are knowable
- User can always select "Other" for custom input
- 2-4 questions max per round
</question_rules>
</contextual_questioning>
<decision_gate>
After receiving answers, present decision gate using AskUserQuestion:
- header: "Ready"
- question: "I have enough context to create your prompt. Ready to proceed?"
- options:
- "Proceed" - Create the prompt with current context
- "Ask more questions" - I have more details to clarify
- "Let me add context" - I want to provide additional information
If "Ask more questions" → generate 2-4 NEW questions based on remaining gaps, then present gate again
If "Let me add context" → receive additional context via "Other" option, then re-evaluate
If "Proceed" → continue to generation step
</decision_gate>
<finalization>
After "Proceed" selected, state confirmation:
"Creating a [simple/moderate/complex] [single/parallel/sequential] prompt for: [brief summary]"
Then proceed to generation.
</finalization>
</step_0_intake_gate>
<step_1_generate_and_save>
<title>Generate and Save Prompts</title>
<pre_generation_analysis>
Before generating, determine:
1. **Single vs Multiple Prompts**:
- Single: Clear dependencies, single cohesive goal, sequential steps
- Multiple: Independent sub-tasks that could be parallelized or done separately
2. **Execution Strategy** (if multiple):
- Parallel: Independent, no shared file modifications
- Sequential: Dependencies, one must finish before next starts
3. **Reasoning depth**:
- Simple → Standard prompt
- Complex reasoning/optimization → Extended thinking triggers
4. **Required tools**: File references, bash commands, MCP servers
5. **Prompt quality needs**:
- "Go beyond basics" for ambitious work?
- WHY explanations for constraints?
- Examples for ambiguous requirements?
</pre_generation_analysis>
Create the prompt(s) and save to the prompts folder.
**For single prompts:**
- Generate one prompt file following the patterns below
- Save as `./prompts/[number]-[name].md`
**For multiple prompts:**
- Determine how many prompts are needed (typically 2-4)
- Generate each prompt with clear, focused objectives
- Save sequentially: `./prompts/[N]-[name].md`, `./prompts/[N+1]-[name].md`, etc.
- Each prompt should be self-contained and executable independently
**Prompt Construction Rules**
Always Include:
- XML tag structure with clear, semantic tags like `<objective>`, `<context>`, `<requirements>`, `<constraints>`, `<output>`
- **Contextual information**: Why this task matters, what it's for, who will use it, end goal
- **Explicit, specific instructions**: Tell Claude exactly what to do with clear, unambiguous language
- **Sequential steps**: Use numbered lists for clarity
- File output instructions using relative paths: `./filename` or `./subfolder/filename`
- Reference to reading the CLAUDE.md for project conventions
- Explicit success criteria within `<success_criteria>` or `<verification>` tags
Conditionally Include (based on analysis):
- **Extended thinking triggers** for complex reasoning:
- Phrases like: "thoroughly analyze", "consider multiple approaches", "deeply consider", "explore multiple solutions"
- Don't use for simple, straightforward tasks
- **"Go beyond basics" language** for creative/ambitious tasks:
- Example: "Include as many relevant features as possible. Go beyond the basics to create a fully-featured implementation."
- **WHY explanations** for constraints and requirements:
- In generated prompts, explain WHY constraints matter, not just what they are
- Example: Instead of "Never use ellipses", write "Your response will be read aloud, so never use ellipses since text-to-speech can't pronounce them"
- **Parallel tool calling** for agentic/multi-step workflows:
- "For maximum efficiency, whenever you need to perform multiple independent operations, invoke all relevant tools simultaneously rather than sequentially."
- **Reflection after tool use** for complex agentic tasks:
- "After receiving tool results, carefully reflect on their quality and determine optimal next steps before proceeding."
- `<research>` tags when codebase exploration is needed
- `<validation>` tags for tasks requiring verification
- `<examples>` tags for complex or ambiguous requirements - ensure examples demonstrate desired behavior and avoid undesired patterns
- Bash command execution with "!" prefix when system state matters
- MCP server references when specifically requested or obviously beneficial
Output Format:
1. Generate prompt content with XML structure
2. Save to: `./prompts/[number]-[descriptive-name].md`
- Number format: 001, 002, 003, etc. (check existing files in ./prompts/ to determine next number)
- Name format: lowercase, hyphen-separated, max 5 words describing the task
- Example: `./prompts/001-implement-user-authentication.md`
3. File should contain ONLY the prompt, no explanations or metadata
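As an illustration of the numbering rule above, a minimal shell sketch (the command itself uses the Glob tool; the three-digit padding and `ls` behavior here are assumptions):
```bash
# Find the highest existing prompt number and print the next zero-padded sequence number.
last=$(ls ./prompts/[0-9][0-9][0-9]-*.md 2>/dev/null | sort | tail -1)  # e.g. ./prompts/004-add-auth.md
num=$(basename "${last:-000}" | cut -d- -f1)                            # e.g. 004 ("000" if none exist)
printf '%03d\n' $(( 10#$num + 1 ))                                      # e.g. 005
```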
<prompt_patterns>
For Coding Tasks:
```xml
<objective>
[Clear statement of what needs to be built/fixed/refactored]
Explain the end goal and why this matters.
</objective>
<context>
[Project type, tech stack, relevant constraints]
[Who will use this, what it's for]
@[relevant files to examine]
</context>
<requirements>
[Specific functional requirements]
[Performance or quality requirements]
Be explicit about what Claude should do.
</requirements>
<implementation>
[Any specific approaches or patterns to follow]
[What to avoid and WHY - explain the reasoning behind constraints]
</implementation>
<output>
Create/modify files with relative paths:
- `./path/to/file.ext` - [what this file should contain]
</output>
<verification>
Before declaring complete, verify your work:
- [Specific test or check to perform]
- [How to confirm the solution works]
</verification>
<success_criteria>
[Clear, measurable criteria for success]
</success_criteria>
```
For Analysis Tasks:
```xml
<objective>
[What needs to be analyzed and why]
[What the analysis will be used for]
</objective>
<data_sources>
@[files or data to analyze]
![relevant commands to gather data]
</data_sources>
<analysis_requirements>
[Specific metrics or patterns to identify]
[Depth of analysis needed - use "thoroughly analyze" for complex tasks]
[Any comparisons or benchmarks]
</analysis_requirements>
<output_format>
[How results should be structured]
Save analysis to: `./analyses/[descriptive-name].md`
</output_format>
<verification>
[How to validate the analysis is complete and accurate]
</verification>
```
For Research Tasks:
```xml
<research_objective>
[What information needs to be gathered]
[Intended use of the research]
For complex research, include: "Thoroughly explore multiple sources and consider various perspectives"
</research_objective>
<scope>
[Boundaries of the research]
[Sources to prioritize or avoid]
[Time period or version constraints]
</scope>
<deliverables>
[Format of research output]
[Level of detail needed]
Save findings to: `./research/[topic].md`
</deliverables>
<evaluation_criteria>
[How to assess quality/relevance of sources]
[Key questions that must be answered]
</evaluation_criteria>
<verification>
Before completing, verify:
- [All key questions are answered]
- [Sources are credible and relevant]
</verification>
```
</prompt_patterns>
</step_1_generate_and_save>
<intelligence_rules>
1. **Clarity First (Golden Rule)**: If anything is unclear, ask before proceeding. A few clarifying questions save time. Test: Would a colleague with minimal context understand this prompt?
2. **Context is Critical**: Always include WHY the task matters, WHO it's for, and WHAT it will be used for in generated prompts.
3. **Be Explicit**: Generate prompts with explicit, specific instructions. For ambitious results, include "go beyond the basics." For specific formats, state exactly what format is needed.
4. **Scope Assessment**: Simple tasks get concise prompts. Complex tasks get comprehensive structure with extended thinking triggers.
5. **Context Loading**: Only request file reading when the task explicitly requires understanding existing code. Use patterns like:
- "Examine @package.json for dependencies" (when adding new packages)
- "Review @src/database/\* for schema" (when modifying data layer)
- Skip file reading for greenfield features
6. **Precision vs Brevity**: Default to precision. A longer, clear prompt beats a short, ambiguous one.
7. **Tool Integration**:
- Include MCP servers only when explicitly mentioned or obviously needed
- Use bash commands for environment checking when state matters
- File references should be specific, not broad wildcards
- For multi-step agentic tasks, include parallel tool calling guidance
8. **Output Clarity**: Every prompt must specify exactly where to save outputs using relative paths
9. **Verification Always**: Every prompt should include clear success criteria and verification steps
</intelligence_rules>
<decision_tree>
After saving the prompt(s), present this decision tree to the user:
---
**Prompt(s) created successfully!**
<single_prompt_scenario>
If you created ONE prompt (e.g., `./prompts/005-implement-feature.md`):
<presentation>
✓ Saved prompt to ./prompts/005-implement-feature.md
What's next?
1. Run prompt now
2. Review/edit prompt first
3. Save for later
4. Other
Choose (1-4): \_
</presentation>
<action>
If user chooses #1, invoke via SlashCommand tool: `/run-prompt 005`
</action>
</single_prompt_scenario>
<parallel_scenario>
If you created MULTIPLE prompts that CAN run in parallel (e.g., independent modules, no shared files):
<presentation>
✓ Saved prompts:
- ./prompts/005-implement-auth.md
- ./prompts/006-implement-api.md
- ./prompts/007-implement-ui.md
Execution strategy: These prompts can run in PARALLEL (independent tasks, no shared files)
What's next?
1. Run all prompts in parallel now (launches 3 sub-agents simultaneously)
2. Run prompts sequentially instead
3. Review/edit prompts first
4. Other
Choose (1-4): \_
</presentation>
<actions>
If user chooses #1, invoke via SlashCommand tool: `/run-prompt 005 006 007 --parallel`
If user chooses #2, invoke via SlashCommand tool: `/run-prompt 005 006 007 --sequential`
</actions>
</parallel_scenario>
<sequential_scenario>
If you created MULTIPLE prompts that MUST run sequentially (e.g., dependencies, shared files):
<presentation>
✓ Saved prompts:
- ./prompts/005-setup-database.md
- ./prompts/006-create-migrations.md
- ./prompts/007-seed-data.md
Execution strategy: These prompts must run SEQUENTIALLY (dependencies: 005 → 006 → 007)
What's next?
1. Run prompts sequentially now (one completes before next starts)
2. Run first prompt only (005-setup-database.md)
3. Review/edit prompts first
4. Other
Choose (1-4): \_
</presentation>
<actions>
If user chooses #1, invoke via SlashCommand tool: `/run-prompt 005 006 007 --sequential`
If user chooses #2, invoke via SlashCommand tool: `/run-prompt 005`
</actions>
</sequential_scenario>
---
</decision_tree>
</process>
<success_criteria>
- Intake gate completed (AskUserQuestion used for clarification if needed)
- User selected "Proceed" from decision gate
- Appropriate depth, structure, and execution strategy determined
- Prompt(s) generated with proper XML structure following patterns
- Files saved to ./prompts/[number]-[name].md with correct sequential numbering
- Decision tree presented to user based on single/parallel/sequential scenario
- User choice executed (SlashCommand invoked if user selects run option)
</success_criteria>
<meta_instructions>
- **Intake first**: Complete step_0_intake_gate before generating. Use AskUserQuestion for structured clarification.
- **Decision gate loop**: Keep asking questions until user selects "Proceed"
- Use Glob tool with `./prompts/*.md` to find existing prompts and determine next number in sequence
- If ./prompts/ doesn't exist, use Write tool to create the first prompt (Write will create parent directories)
- Keep prompt filenames descriptive but concise
- Adapt the XML structure to fit the task - not every tag is needed every time
- Consider the user's working directory as the root for all relative paths
- Each prompt file should contain ONLY the prompt content, no preamble or explanation
- After saving, present the decision tree as inline text (not AskUserQuestion)
- Use the SlashCommand tool to invoke /run-prompt when user makes their choice
</meta_instructions>

7
commands/create-slash-command.md Normal file
View File

@@ -0,0 +1,7 @@
---
description: Create a new slash command following best practices and patterns
argument-hint: [command description or requirements]
allowed-tools: Skill(create-slash-commands)
---
Invoke the create-slash-commands skill for: $ARGUMENTS

7
commands/create-subagent.md Normal file
View File

@@ -0,0 +1,7 @@
---
description: Create specialized Claude Code subagents with expert guidance
argument-hint: [agent idea or description]
allowed-tools: Skill(create-subagents)
---
Invoke the create-subagents skill for: $ARGUMENTS

23
commands/debug.md Normal file
View File

@@ -0,0 +1,23 @@
---
description: Apply expert debugging methodology to investigate a specific issue
argument-hint: [issue description]
allowed-tools: Skill(debug-like-expert)
---
<objective>
Load the debug-like-expert skill to investigate: $ARGUMENTS
This applies systematic debugging methodology with evidence gathering, hypothesis testing, and rigorous verification.
</objective>
<process>
1. Invoke the Skill tool with debug-like-expert
2. Pass the issue description: $ARGUMENTS
3. Follow the skill's debugging methodology
4. Apply rigorous investigation and verification
</process>
<success_criteria>
- Skill successfully invoked
- Arguments passed correctly to skill
</success_criteria>

141
commands/heal-skill.md Normal file
View File

@@ -0,0 +1,141 @@
---
description: Heal skill documentation by applying corrections discovered during execution with approval workflow
argument-hint: [optional: specific issue to fix]
allowed-tools: [Read, Edit, Bash(ls:*), Bash(git:*)]
---
<objective>
Update a skill's SKILL.md and related files based on corrections discovered during execution.
Analyze the conversation to detect which skill is running, reflect on what went wrong, propose specific fixes, get user approval, then apply changes with optional commit.
</objective>
<context>
Skill detection: !`ls -1 ./skills/*/SKILL.md | head -5`
</context>
<quick_start>
<workflow>
1. **Detect skill** from conversation context (invocation messages, recent SKILL.md references)
2. **Reflect** on what went wrong and how you discovered the fix
3. **Present** proposed changes with before/after diffs
4. **Get approval** before making any edits
5. **Apply** changes and optionally commit
</workflow>
</quick_start>
<process>
<step_1 name="detect_skill">
Identify the skill from conversation context:
- Look for skill invocation messages
- Check which SKILL.md was recently referenced
- Examine current task context
Set: `SKILL_NAME=[skill-name]` and `SKILL_DIR=./skills/$SKILL_NAME`
If unclear, ask the user.
</step_1>
<step_2 name="reflection_and_analysis">
Focus on $ARGUMENTS if provided, otherwise analyze broader context.
Determine:
- **What was wrong**: Quote specific sections from SKILL.md that are incorrect
- **Discovery method**: Context7, error messages, trial and error, documentation lookup
- **Root cause**: Outdated API, incorrect parameters, wrong endpoint, missing context
- **Scope of impact**: Single section or multiple? Related files affected?
- **Proposed fix**: Which files, which sections, before/after for each
</step_2>
<step_3 name="scan_affected_files">
```bash
# List the skill's files so every affected document is in scope (references/ and scripts/ may not exist)
ls -la "$SKILL_DIR/"
ls -la "$SKILL_DIR/references/" 2>/dev/null
ls -la "$SKILL_DIR/scripts/" 2>/dev/null
```
</step_3>
<step_4 name="present_proposed_changes">
Present changes in this format:
```
**Skill being healed:** [skill-name]
**Issue discovered:** [1-2 sentence summary]
**Root cause:** [brief explanation]
**Files to be modified:**
- [ ] SKILL.md
- [ ] references/[file].md
- [ ] scripts/[file].py
**Proposed changes:**
### Change 1: SKILL.md - [Section name]
**Location:** Line [X] in SKILL.md
**Current (incorrect):**
```
[exact text from current file]
```
**Corrected:**
```
[new text]
```
**Reason:** [why this fixes the issue]
[repeat for each change across all files]
**Impact assessment:**
- Affects: [authentication/API endpoints/parameters/examples/etc.]
**Verification:**
These changes will prevent: [specific error that prompted this]
```
</step_4>
<step_5 name="request_approval">
```
Should I apply these changes?
1. Yes, apply and commit all changes
2. Apply but don't commit (let me review first)
3. Revise the changes (I'll provide feedback)
4. Cancel (don't make changes)
Choose (1-4):
```
**Wait for user response. Do not proceed without approval.**
</step_5>
<step_6 name="apply_changes">
Only after approval (option 1 or 2):
1. Use Edit tool for each correction across all files
2. Read back modified sections to verify
3. If option 1, commit with structured message showing what was healed
4. Confirm completion with file list
</step_6>
</process>
<success_criteria>
- Skill correctly detected from conversation context
- All incorrect sections identified with before/after
- User approved changes before application
- All edits applied across SKILL.md and related files
- Changes verified by reading back
- Commit created if user chose option 1
- Completion confirmed with file list
</success_criteria>
<verification>
Before completing:
- Read back each modified section to confirm changes applied
- Ensure cross-file consistency (SKILL.md examples match references/)
- Verify git commit created if option 1 was selected
- Check no unintended files were modified
</verification>

129
commands/run-plan.md Normal file
View File

@@ -0,0 +1,129 @@
---
type: prompt
description: Execute a PLAN.md file directly without loading planning skill context
arguments:
- name: plan_path
description: Path to PLAN.md file (e.g., .planning/phases/07-sidebar-reorganization/07-01-PLAN.md)
required: true
---
Execute the plan at {{plan_path}} using **intelligent segmentation** for optimal quality.
**Process:**
1. **Verify plan exists and is unexecuted:**
- Read {{plan_path}}
- Check if corresponding SUMMARY.md exists in same directory
- If SUMMARY exists: inform user plan already executed, ask if they want to re-run
- If plan doesn't exist: error and exit
2. **Parse plan and determine execution strategy:**
- Extract `<objective>`, `<execution_context>`, `<context>`, `<tasks>`, `<verification>`, `<success_criteria>` sections
- Analyze checkpoint structure: `grep "type=\"checkpoint" {{plan_path}}`
- Determine routing strategy (see the routing sketch after this numbered list):
**Strategy A: Fully Autonomous (no checkpoints)**
- Spawn single subagent to execute entire plan
- Subagent reads plan, executes all tasks, creates SUMMARY, commits
- Main context: Orchestration only (~5% usage)
- Go to step 3A
**Strategy B: Segmented Execution (has verify-only checkpoints)**
- Parse into segments separated by checkpoints
- Check if checkpoints are verify-only (checkpoint:human-verify)
- If all checkpoints are verify-only: segment execution enabled
- Go to step 3B
**Strategy C: Decision-Dependent (has decision/action checkpoints)**
- Has checkpoint:decision or checkpoint:human-action checkpoints
- Following tasks depend on checkpoint outcomes
- Must execute sequentially in main context
- Go to step 3C
3. **Execute based on strategy:**
**3A: Fully Autonomous Execution**
```
Spawn Task tool (subagent_type="general-purpose"):
Prompt: "Execute plan at {{plan_path}}
This is a fully autonomous plan (no checkpoints).
- Read the plan for full objective, context, and tasks
- Execute ALL tasks sequentially
- Follow all deviation rules and authentication gate protocols
- Create SUMMARY.md in same directory as PLAN.md
- Update ROADMAP.md plan count
- Commit with format: feat({phase}-{plan}): [summary]
- Report: tasks completed, files modified, commit hash"
Wait for completion → Done
```
**3B: Segmented Execution (verify-only checkpoints)**
```
For each segment (autonomous block between checkpoints):
IF segment is autonomous:
Spawn subagent:
"Execute tasks [X-Y] from {{plan_path}}
Read plan for context and deviation rules.
DO NOT create SUMMARY or commit.
Report: tasks done, files modified, deviations"
Wait for subagent completion
Capture results
ELSE IF task is checkpoint:
Execute in main context:
- Load checkpoint task details
- Present checkpoint to user (action/verify/decision)
- Wait for user response
- Continue to next segment
After all segments complete:
- Aggregate results from all segments
- Create SUMMARY.md with aggregated data
- Update ROADMAP.md
- Commit all changes
- Done
```
**3C: Decision-Dependent Execution**
```
Execute in main context:
Read execution context from plan <execution_context> section
Read domain context from plan <context> section
For each task in <tasks>:
IF type="auto": execute in main, track deviations
IF type="checkpoint:*": execute in main, wait for user
After all tasks:
- Create SUMMARY.md
- Update ROADMAP.md
- Commit
- Done
```
4. **Summary and completion:**
- Verify SUMMARY.md created
- Verify commit successful
- Present completion message with next steps
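As a rough illustration of the routing decision in step 2, a shell sketch (the grep patterns are assumptions about how checkpoint tasks are tagged in the plan file):
```bash
# Classify checkpoints to choose an execution strategy (illustrative only).
plan="$1"
total=$(grep -c 'type="checkpoint' "$plan")
blocking=$(grep -cE 'type="checkpoint:(decision|human-action)' "$plan")
if [ "$total" -eq 0 ]; then
  echo "Strategy A: fully autonomous"
elif [ "$blocking" -eq 0 ]; then
  echo "Strategy B: segmented (verify-only checkpoints)"
else
  echo "Strategy C: decision-dependent"
fi
```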
**Critical Rules:**
- **Read execution_context first:** Always load files from `<execution_context>` section before executing
- **Minimal context loading:** Only read files explicitly mentioned in `<execution_context>` and `<context>` sections
- **No skill invocation:** Execute directly using native tools - don't invoke create-plans skill
- **All deviations tracked:** Apply deviation rules from execute-phase.md, document everything in Summary
- **Checkpoints are blocking:** Never skip user interaction for checkpoint tasks
- **Verification is mandatory:** Don't mark complete without running verification checks
- **Follow execute-phase.md protocol:** Loaded context contains all execution instructions
**Context Efficiency Target:**
- Execution context: ~5-7k tokens (execute-phase.md, summary.md, checkpoints.md if needed)
- Domain context: ~10-15k tokens (BRIEF, ROADMAP, codebase files)
- Total overhead: <30% context, reserving 70%+ for workspace and implementation

166
commands/run-prompt.md Normal file
View File

@@ -0,0 +1,166 @@
---
name: run-prompt
description: Delegate one or more prompts to fresh sub-task contexts with parallel or sequential execution
argument-hint: <prompt-number(s)-or-name> [--parallel|--sequential]
allowed-tools: [Read, Task, Bash(ls:*), Bash(mv:*), Bash(git:*)]
---
<context>
Git status: !`git status --short`
Recent prompts: !`ls -t ./prompts/*.md | head -5`
</context>
<objective>
Execute one or more prompts from `./prompts/` as delegated sub-tasks with fresh context. Supports single prompt execution, parallel execution of multiple independent prompts, and sequential execution of dependent prompts.
</objective>
<input>
The user will specify which prompt(s) to run via $ARGUMENTS, which can be:
**Single prompt:**
- Empty (no arguments): Run the most recently created prompt (default behavior)
- A prompt number (e.g., "001", "5", "42")
- A partial filename (e.g., "user-auth", "dashboard")
**Multiple prompts:**
- Multiple numbers (e.g., "005 006 007")
- With execution flag: "005 006 007 --parallel" or "005 006 007 --sequential"
- If no flag specified with multiple prompts, default to --sequential for safety
</input>
<process>
<step1_parse_arguments>
Parse $ARGUMENTS to extract:
- Prompt numbers/names (all arguments that are not flags)
- Execution strategy flag (--parallel or --sequential)
<examples>
- "005" → Single prompt: 005
- "005 006 007" → Multiple prompts: [005, 006, 007], strategy: sequential (default)
- "005 006 007 --parallel" → Multiple prompts: [005, 006, 007], strategy: parallel
- "005 006 007 --sequential" → Multiple prompts: [005, 006, 007], strategy: sequential
</examples>
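A minimal parsing sketch in shell, for illustration only (the command works on $ARGUMENTS directly; "$@" stands in for it here):
```bash
# Separate prompt identifiers from execution-strategy flags; sequential is the default.
strategy="sequential"
prompts=()
for arg in "$@"; do          # "$@" stands in for $ARGUMENTS
  case "$arg" in
    --parallel)   strategy="parallel" ;;
    --sequential) strategy="sequential" ;;
    *)            prompts+=("$arg") ;;
  esac
done
echo "prompts: ${prompts[*]:-<most recent>}  strategy: $strategy"
```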
</step1_parse_arguments>
<step2_resolve_files>
For each prompt number/name:
- If empty or "last": Find with `!ls -t ./prompts/*.md | head -1`
- If a number: Find file matching that zero-padded number (e.g., "5" matches "005-*.md", "42" matches "042-*.md")
- If text: Find files containing that string in the filename
<matching_rules>
- If exactly one match found: Use that file
- If multiple matches found: List them and ask user to choose
- If no matches found: Report error and list available prompts
</matching_rules>
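For example, number-based resolution could be sketched like this (illustrative; the zero-padding convention and glob behavior are assumptions):
```bash
# Zero-pad the requested number, then glob for the matching prompt file.
padded=$(printf '%03d' "$1")              # "5" -> "005", "42" -> "042"
matches=( ./prompts/"$padded"-*.md )
if [ ! -e "${matches[0]}" ]; then
  echo "No prompt matches $padded; available prompts:" && ls ./prompts/*.md
elif [ "${#matches[@]}" -gt 1 ]; then
  printf 'Multiple matches, please choose:\n%s\n' "${matches[@]}"
else
  echo "Resolved to ${matches[0]}"
fi
```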
</step2_resolve_files>
<step3_execute>
<single_prompt>
1. Read the complete contents of the prompt file
2. Delegate as sub-task using Task tool with subagent_type="general-purpose"
3. Wait for completion
4. Archive prompt to `./prompts/completed/` with metadata
5. Commit all work:
- Stage files YOU modified with `git add [file]` (never `git add .`)
- Determine appropriate commit type based on changes (fix|feat|refactor|style|docs|test|chore)
- Commit with format: `[type]: [description]` (lowercase, specific, concise)
6. Return results
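The staging-and-commit step might look like this in practice (paths and message are hypothetical examples of the `[type]: [description]` format):
```bash
# Stage only the files this sub-task modified, then commit with a specific lowercase message.
git add src/auth/login.ts src/auth/session.ts
git commit -m "feat: add session-based login flow"
```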
</single_prompt>
<parallel_execution>
1. Read all prompt files
2. **Spawn all Task tools in a SINGLE MESSAGE** (this is critical for parallel execution):
<example>
Use Task tool for prompt 005
Use Task tool for prompt 006
Use Task tool for prompt 007
(All in one message with multiple tool calls)
</example>
3. Wait for ALL to complete
4. Archive all prompts with metadata
5. Commit all work:
- Stage files YOU modified with `git add [file]` (never `git add .`)
- Determine appropriate commit type based on changes (fix|feat|refactor|style|docs|test|chore)
- Commit with format: `[type]: [description]` (lowercase, specific, concise)
6. Return consolidated results
</parallel_execution>
<sequential_execution>
1. Read first prompt file
2. Spawn Task tool for first prompt
3. Wait for completion
4. Archive first prompt
5. Read second prompt file
6. Spawn Task tool for second prompt
7. Wait for completion
8. Archive second prompt
9. Repeat for remaining prompts
10. Archive all prompts with metadata
11. Commit all work:
- Stage files YOU modified with `git add [file]` (never `git add .`)
- Determine appropriate commit type based on changes (fix|feat|refactor|style|docs|test|chore)
- Commit with format: `[type]: [description]` (lowercase, specific, concise)
12. Return consolidated results
</sequential_execution>
</step3_execute>
</process>
<context_strategy>
By delegating to a sub-task, the actual implementation work happens in fresh context while the main conversation stays lean for orchestration and iteration.
</context_strategy>
<output>
<single_prompt_output>
✓ Executed: ./prompts/005-implement-feature.md
✓ Archived to: ./prompts/completed/005-implement-feature.md
<results>
[Summary of what the sub-task accomplished]
</results>
</single_prompt_output>
<parallel_output>
✓ Executed in PARALLEL:
- ./prompts/005-implement-auth.md
- ./prompts/006-implement-api.md
- ./prompts/007-implement-ui.md
✓ All archived to ./prompts/completed/
<results>
[Consolidated summary of all sub-task results]
</results>
</parallel_output>
<sequential_output>
✓ Executed SEQUENTIALLY:
1. ./prompts/005-setup-database.md → Success
2. ./prompts/006-create-migrations.md → Success
3. ./prompts/007-seed-data.md → Success
✓ All archived to ./prompts/completed/
<results>
[Consolidated summary showing progression through each step]
</results>
</sequential_output>
</output>
<critical_notes>
- For parallel execution: ALL Task tool calls MUST be in a single message
- For sequential execution: Wait for each Task to complete before starting next
- Archive prompts only after successful completion
- If any prompt fails, stop sequential execution and report error
- Provide clear, consolidated results for multiple prompt execution
</critical_notes>

108
commands/whats-next.md Normal file
View File

@@ -0,0 +1,108 @@
---
name: whats-next
description: Analyze the current conversation and create a handoff document for continuing this work in a fresh context
allowed-tools:
- Read
- Write
- Bash
- WebSearch
- WebFetch
---
Create a comprehensive, detailed handoff document that captures all context from the current conversation. This allows continuing the work in a fresh context with complete precision.
## Instructions
**PRIORITY: Comprehensive detail and precision over brevity.** The goal is to enable someone (or a fresh Claude instance) to pick up exactly where you left off with zero information loss.
Adapt the level of detail to the task type (coding, research, analysis, writing, configuration, etc.) but maintain comprehensive coverage:
1. **Original Task**: Identify what was initially requested (not new scope or side tasks)
2. **Work Completed**: Document everything accomplished in detail
- All artifacts created, modified, or analyzed (files, documents, research findings, etc.)
- Specific changes made (code with line numbers, content written, data analyzed, etc.)
- Actions taken (commands run, APIs called, searches performed, tools used, etc.)
- Findings discovered (insights, patterns, answers, data points, etc.)
- Decisions made and the reasoning behind them
3. **Work Remaining**: Specify exactly what still needs to be done
- Break down remaining work into specific, actionable steps
- Include precise locations, references, or targets (file paths, URLs, data sources, etc.)
- Note dependencies, prerequisites, or ordering requirements
- Specify validation or verification steps needed
4. **Attempted Approaches**: Capture everything tried, including failures
- Approaches that didn't work and why they failed
- Errors encountered, blockers hit, or limitations discovered
- Dead ends to avoid repeating
- Alternative approaches considered but not pursued
5. **Critical Context**: Preserve all essential knowledge
- Key decisions and trade-offs considered
- Constraints, requirements, or boundaries
- Important discoveries, gotchas, edge cases, or non-obvious behaviors
- Relevant environment, configuration, or setup details
- Assumptions made that need validation
- References to documentation, sources, or resources consulted
6. **Current State**: Document the exact current state
- Status of deliverables (complete, in-progress, not started)
- What's committed, saved, or finalized vs. what's temporary or draft
- Any temporary changes, workarounds, or open questions
- Current position in the workflow or process
Write to `whats-next.md` in the current working directory using the format below.
## Output Format
```xml
<original_task>
[The specific task that was initially requested - be precise about scope]
</original_task>
<work_completed>
[Comprehensive detail of everything accomplished:
- Artifacts created/modified/analyzed (with specific references)
- Specific changes, additions, or findings (with details and locations)
- Actions taken (commands, searches, API calls, tool usage, etc.)
- Key discoveries or insights
- Decisions made and reasoning
- Side tasks completed]
</work_completed>
<work_remaining>
[Detailed breakdown of what needs to be done:
- Specific tasks with precise locations or references
- Exact targets to create, modify, or analyze
- Dependencies and ordering
- Validation or verification steps needed]
</work_remaining>
<attempted_approaches>
[Everything tried, including failures:
- Approaches that didn't work and why
- Errors, blockers, or limitations encountered
- Dead ends to avoid
- Alternative approaches considered but not pursued]
</attempted_approaches>
<critical_context>
[All essential knowledge for continuing:
- Key decisions and trade-offs
- Constraints, requirements, or boundaries
- Important discoveries, gotchas, or edge cases
- Environment, configuration, or setup details
- Assumptions requiring validation
- References to documentation, sources, or resources]
</critical_context>
<current_state>
[Exact state of the work:
- Status of deliverables (complete/in-progress/not started)
- What's finalized vs. what's temporary or draft
- Temporary changes or workarounds in place
- Current position in workflow or process
- Any open questions or pending decisions]
</current_state>
```

749
plugin.lock.json Normal file
View File

@@ -0,0 +1,749 @@
{
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:glittercowboy/taches-cc-resources:",
"normalized": {
"repo": null,
"ref": "refs/tags/v20251128.0",
"commit": "a9f593ad4378ac03ebeaaf3539a50b7e6b155e84",
"treeHash": "afd0d1b05f4cd7091d0295389253496d3de0dd6c700c050391cb36e73c65513f",
"generatedAt": "2025-11-28T10:17:01.889255Z",
"toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
"remote": "git@github.com:zhongweili/42plugin-data.git",
"branch": "master",
"commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
"repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
"name": "taches-cc-resources",
"description": "Curated Claude Code skills and commands for prompt engineering, MCP servers, subagents, hooks, and productivity workflows",
"version": "1.0.0"
},
"content": {
"files": [
{
"path": "README.md",
"sha256": "8b67a8c9e46ecdf34b2c109462196fe10972bb093ad1f9f1dfae7ac70d96d8cb"
},
{
"path": "agents/slash-command-auditor.md",
"sha256": "828618f6bc332a2194c81d07691fa6778e9ea2cf6218094403506a438d732fce"
},
{
"path": "agents/subagent-auditor.md",
"sha256": "c0f637c15b848ecebd64b27b41178659babce1cbbc92889e53c51afdfa4b06b5"
},
{
"path": "agents/skill-auditor.md",
"sha256": "b6302621d15f2ff2f6ae0f091b9b1cc47c61c3d9a4a1ac40f5e0335797bfebc5"
},
{
"path": ".claude-plugin/plugin.json",
"sha256": "76415b7c177be7a81a3dd48fd51d848c6dd44a3285b53e8ccbde245ed9b4a7f7"
},
{
"path": "commands/audit-subagent.md",
"sha256": "79f895a9d53b215a551c0ca9f84698bef4a737b64766be58bc6fb8155abe8993"
},
{
"path": "commands/debug.md",
"sha256": "0c055236c658ba6efd9c63ec2b08517931d4d94ad5de81fa530fbd75b034e5bd"
},
{
"path": "commands/add-to-todos.md",
"sha256": "de3b579683ddedbb918cb8c472c59455b9f73731dff339f94b524fef9128286b"
},
{
"path": "commands/run-plan.md",
"sha256": "805dfecf3c77f375b101c39b4a5cc72790cf5fbd9035bab84db5daf2cf065e8c"
},
{
"path": "commands/check-todos.md",
"sha256": "5df8e801675151d0d595d0264741c1d264ad3531291c461a45a4cc09044bb2ad"
},
{
"path": "commands/create-subagent.md",
"sha256": "02f4c5bacf704a919644c7dada9d4ef77ada566cfca5f8a15aca30a91f94931a"
},
{
"path": "commands/whats-next.md",
"sha256": "2721dcb752fb14627648b79a05257348b7bc8283b58c06ea0a7cf9ed9e8130a8"
},
{
"path": "commands/audit-slash-command.md",
"sha256": "e8d0cd301060a8e9ec366e921914f64e586b79c9dd247cb0fc806de1c83640cc"
},
{
"path": "commands/create-slash-command.md",
"sha256": "d9a1b1c64282fb60356ecfef601cbf7e076339d5557c34bfc6de5e4c532fdb3f"
},
{
"path": "commands/create-meta-prompt.md",
"sha256": "030dd7453afa3ac79433530f0ac3896dedb066bc4e40fb865d9461bc2cf1d767"
},
{
"path": "commands/create-agent-skill.md",
"sha256": "c2b5783680c2c6e1938fdeac9c9b4925348134f8d0486f90d186f52942d93252"
},
{
"path": "commands/create-prompt.md",
"sha256": "2110e278322e9e8b5932d2b3ecd757ea3346f9159b842a9a20eea251f0ca7b97"
},
{
"path": "commands/create-hook.md",
"sha256": "5c63f7b239472f57f1a0f7abf2afef18d8412466addfeddbce28af1dcc9cb5e2"
},
{
"path": "commands/run-prompt.md",
"sha256": "9b126f2c94f699588f229462b914b1de18a3dd5b2122a0afe1b8eba362e55613"
},
{
"path": "commands/create-plan.md",
"sha256": "8745b75ea9e84c14525d7355e66b9172fd767c22bf9aa6ba98e6b9e8da18b473"
},
{
"path": "commands/audit-skill.md",
"sha256": "fe7a473abce7e4eb6406dc7c05f9cec46f944dc1d3dfc54540d8316d56d732f0"
},
{
"path": "commands/heal-skill.md",
"sha256": "3e4b9649a291623237c948cb6a37c9446ed1f0d661d349c0a55e18eeda78db50"
},
{
"path": "commands/consider/first-principles.md",
"sha256": "46ea73e212e4f584841a676edc15f4e34c875c4e2660bddcab37522ea3234009"
},
{
"path": "commands/consider/10-10-10.md",
"sha256": "edcff2716ab45034c4a46c03436b12a722a832104f1582968ccb12589993f4c9"
},
{
"path": "commands/consider/swot.md",
"sha256": "5ec58475f389a2ca8327009e32c8785d5fbd4fa2968a3161dbfef3044f0c1121"
},
{
"path": "commands/consider/one-thing.md",
"sha256": "492c2b0af85fc5801c409d8508a87213742b1f8ba7bbc010faaa15d3598f703d"
},
{
"path": "commands/consider/occams-razor.md",
"sha256": "44dda8ba3be7bd670f558aa961bb012c859781a3d888c270058aad763007609b"
},
{
"path": "commands/consider/5-whys.md",
"sha256": "993630eac6ec4407554db879a292cf496a108c21baa0b9e773b843c1b7c23bed"
},
{
"path": "commands/consider/second-order.md",
"sha256": "2a89d9e4ab5c237a1ee8b627990d7c83d369dba824ecac854454a3304284d81a"
},
{
"path": "commands/consider/eisenhower-matrix.md",
"sha256": "671a6fb89d724cb7aa6da6cc30b7c66627f9df34bb94b3b7c27694e393c2d27d"
},
{
"path": "commands/consider/opportunity-cost.md",
"sha256": "e49d67071f2702711c11daa0a45874bf0cb9ded0871900c87e72e68f7b2fe27c"
},
{
"path": "commands/consider/via-negativa.md",
"sha256": "15dac1b4f8148063bcb4136dc1fc560039ad5e90ef8165ad85763320428cc304"
},
{
"path": "commands/consider/inversion.md",
"sha256": "ed68628fa10fce7c5fdb3c08db16498a34606b6cfe20db691aebcbfd0748c5d6"
},
{
"path": "commands/consider/pareto.md",
"sha256": "37238734c4f90c7d6017f85d5589e14ae57cbc310466d46f7f877bbf77328242"
},
{
"path": "skills/create-hooks/SKILL.md",
"sha256": "7376c91ef6080ae426a938a6221c8376c5a64508d6b34a352b816fcb579ef9c6"
},
{
"path": "skills/create-hooks/references/examples.md",
"sha256": "2acec72530d4ccec20fcd5bd30de4d71742103981dae3b4c13806169222ea50b"
},
{
"path": "skills/create-hooks/references/input-output-schemas.md",
"sha256": "40e28097acbd73b55781399a4f6b32544132805f234ad3c151b3cb7f77a95cd0"
},
{
"path": "skills/create-hooks/references/troubleshooting.md",
"sha256": "379848b1fc4649a109ac6fc1174d656f2825bcd7a698dd831bfeda6ea22610fa"
},
{
"path": "skills/create-hooks/references/hook-types.md",
"sha256": "d4c5074def84bacee3415c94d659bdb2189fc3d4a9de60561ea184f2e9e578a7"
},
{
"path": "skills/create-hooks/references/matchers.md",
"sha256": "a82aef6dc0fef01939ca8cddd40413358fff8278b3922afedeb74310545a7ad0"
},
{
"path": "skills/create-hooks/references/command-vs-prompt.md",
"sha256": "41f7dbd754a431fab8951f14e542f1aa68c529534b4db2ebc69359c25d59b244"
},
{
"path": "skills/create-meta-prompts/README.md",
"sha256": "3967690c78feb1d6b339e788224b7d6679df88a94f125837a3af95c2568909b4"
},
{
"path": "skills/create-meta-prompts/SKILL.md",
"sha256": "2cca9875537cda6e8392598183dff34b9506e848c43c464dd628d40021b071c7"
},
{
"path": "skills/create-meta-prompts/references/question-bank.md",
"sha256": "9457394d3f60ec72e00e842c758e0911443e2d901f6019d1026d76f3116ff99a"
},
{
"path": "skills/create-meta-prompts/references/metadata-guidelines.md",
"sha256": "02abf3e0ecf39c8eae288afe2c5cf9c495981f028828dbadd281211d78371520"
},
{
"path": "skills/create-meta-prompts/references/research-patterns.md",
"sha256": "dea8b6e587fbfa2b3876cf7149f4adc58938fdf74c5935ae5141bdb2fb06c635"
},
{
"path": "skills/create-meta-prompts/references/plan-patterns.md",
"sha256": "7140b589e39262714d24ea9adaedcad1271cd7c49c4eba8522ea26d655c1d838"
},
{
"path": "skills/create-meta-prompts/references/summary-template.md",
"sha256": "2033edfd2d71e54f9fc71b964316bfa601c2e6ec9ae120206847b289a3f1b76e"
},
{
"path": "skills/create-meta-prompts/references/refine-patterns.md",
"sha256": "d51bb0d6600062153eed1561ef5105561041a4f00839493b6a4850eca7aec8ed"
},
{
"path": "skills/create-meta-prompts/references/do-patterns.md",
"sha256": "963484c0d8485a625a380477e92aa06c4035dca4a71ed78a12e05cd980ed274b"
},
{
"path": "skills/create-meta-prompts/references/research-pitfalls.md",
"sha256": "520f940f9c22472e7d00a9f7aed2879ab0e4136ba7ff50cdd7bd91510cabe36e"
},
{
"path": "skills/create-meta-prompts/references/intelligence-rules.md",
"sha256": "b390cf7e1cf8d0c5b221b4bcbaecb877caa06b238b3fadb30b01825ad28d8dbd"
},
{
"path": "skills/create-slash-commands/SKILL.md",
"sha256": "fcdd5660ce079c080606a1bc4ab4ad0ef72ff811ef6041b7da3cdef062828212"
},
{
"path": "skills/create-slash-commands/references/patterns.md",
"sha256": "3c97c5fed147afd958dbbf608fc88ff603fd05d0ccbac9bdd43039c0233d4166"
},
{
"path": "skills/create-slash-commands/references/tool-restrictions.md",
"sha256": "44ca647025e2fbe28f28dc8d388972ca434d3cbb067adfe6cfd521e5c5fa4842"
},
{
"path": "skills/create-slash-commands/references/arguments.md",
"sha256": "53b74f4db04eb8538021deece8171b6654f841706e585f1956eed8afd2de9b47"
},
{
"path": "skills/debug-like-expert/SKILL.md",
"sha256": "89e3feb89745a9804e9f2c33e6f9cb66154d2238da473cf92032492b9443723e"
},
{
"path": "skills/debug-like-expert/references/when-to-research.md",
"sha256": "7913ab73f81392778ee68ee61c0c88dd30c93a379b136c695d3d42973b33e0a0"
},
{
"path": "skills/debug-like-expert/references/debugging-mindset.md",
"sha256": "7194c1c9dbe35c7b3bc57c855eb0ac257e2cb6cc92932a06951025b6bd0b88e3"
},
{
"path": "skills/debug-like-expert/references/hypothesis-testing.md",
"sha256": "1c4b8b7a56a0c2f1467edd26815e924e8a1c370ddcd0701d4d91e2c3cf5b14c5"
},
{
"path": "skills/debug-like-expert/references/verification-patterns.md",
"sha256": "4ab955fa204178e68e05c05d31a4b88a282d7a7200314a6a84af543b4fd2168f"
},
{
"path": "skills/debug-like-expert/references/investigation-techniques.md",
"sha256": "4283ec67d6ccd01befb77e74d6331a1e9ce446e0d2944f5499bb6d90ab0df337"
},
{
"path": "skills/create-agent-skills/SKILL.md",
"sha256": "caa936995732079feb14cff10c6e1e76e750847e2816770a9da2632e3796f981"
},
{
"path": "skills/create-agent-skills/references/recommended-structure.md",
"sha256": "169d9fd09f70a5da8ddd015a4eddd90c39901693d5ea5fa2be83474c167b1196"
},
{
"path": "skills/create-agent-skills/references/use-xml-tags.md",
"sha256": "26e4aeec4de61195f0aa6788a616620a3e56efb04f005375344780b7694cf799"
},
{
"path": "skills/create-agent-skills/references/using-templates.md",
"sha256": "ca824a0fe50fe63b20c5485542be2b10e6b3a2353111def0f6a5c9004b7f56a9"
},
{
"path": "skills/create-agent-skills/references/using-scripts.md",
"sha256": "0f1c2513eae0d47a7a981af2a441da8c1fefc7ab0a95f5b3f1a15e89677228d5"
},
{
"path": "skills/create-agent-skills/references/skill-structure.md",
"sha256": "e92362494b739c884b4b0064ac8be0a251d0e2e0ed42ebdd4bddf9e17934aab4"
},
{
"path": "skills/create-agent-skills/references/iteration-and-testing.md",
"sha256": "cfd01dc28c5e5a257a75dfccaf212c994299b4927155da5ff8096893e38d2438"
},
{
"path": "skills/create-agent-skills/references/common-patterns.md",
"sha256": "4423da8cb5cd4861784877899c008bd4d763c4de01299fe4abcbd1cb46705ac2"
},
{
"path": "skills/create-agent-skills/references/workflows-and-validation.md",
"sha256": "e1fbddcd429636653bd0bb7bccf9be346339a25530a4b39a22c395f8b8903ebb"
},
{
"path": "skills/create-agent-skills/references/core-principles.md",
"sha256": "9d61fa8b910cc9ff2fb51be8ef9deba699fca3f2b7049f9062bf4968d56299f3"
},
{
"path": "skills/create-agent-skills/references/api-security.md",
"sha256": "ea3b1ca2b5f41e1003b93bcac018c6fb5aa5aec9fbda62e0e67657d42205cd91"
},
{
"path": "skills/create-agent-skills/references/executable-code.md",
"sha256": "0897061f35e452d2b616f9d6b69c5fac86fa739d048a64ecd99638da036f9979"
},
{
"path": "skills/create-agent-skills/references/be-clear-and-direct.md",
"sha256": "002be371c349ceda14256a7a853b347b78149b51aace8976536d041185735131"
},
{
"path": "skills/create-agent-skills/workflows/upgrade-to-router.md",
"sha256": "054525496fa95c6712362c44048d9ef7d7c43bcdc4bd3c69965ed8071267e474"
},
{
"path": "skills/create-agent-skills/workflows/get-guidance.md",
"sha256": "be64cd2d89e697d8b7e77f8330da1109b5cbd7417896feee7b2d621fa7639043"
},
{
"path": "skills/create-agent-skills/workflows/add-reference.md",
"sha256": "08104aaa73732926d5a5f74c6e4a07fe6b36a6397b764d4b9d8df4371c916fa5"
},
{
"path": "skills/create-agent-skills/workflows/create-domain-expertise-skill.md",
"sha256": "84da321365642c788d30b0c916a8b0cc062c210892b214ff59d8c409b3bf6fb2"
},
{
"path": "skills/create-agent-skills/workflows/verify-skill.md",
"sha256": "9d34c4e3f1584d7fe68825abb3449aafb6b9e7882a9f7310a771228b46438a9a"
},
{
"path": "skills/create-agent-skills/workflows/add-script.md",
"sha256": "b3aa2239afdc752c9402bb1f812040c5473465a7e7119cf65e15ce0e65faa25e"
},
{
"path": "skills/create-agent-skills/workflows/add-workflow.md",
"sha256": "a50440bd4ec186749bfbee02289fb848d71dadfcdc2a242b5c7b5f1c73d463d5"
},
{
"path": "skills/create-agent-skills/workflows/audit-skill.md",
"sha256": "01f6448a9f3a716ddfb0170a8d628462739a6080e2a863e2e07662db5d03ee91"
},
{
"path": "skills/create-agent-skills/workflows/add-template.md",
"sha256": "d24cd3d8af277c390730ca0795b56680c6b386cfeb97bbaa3cd58baca6227ae6"
},
{
"path": "skills/create-agent-skills/workflows/create-new-skill.md",
"sha256": "3c26d71a19cdab159e246a2478324d61d02c5af317f4245a8092b314f9752ce7"
},
{
"path": "skills/create-agent-skills/templates/router-skill.md",
"sha256": "04d49ac1ca25d9191dfd3d7a3b3b5e70879a119983ec15616bac4df1b8cd0951"
},
{
"path": "skills/create-agent-skills/templates/simple-skill.md",
"sha256": "5fe4f40cf0d827033c89948e3f89377eb5ec2593eef315bd634a0ed66eebf915"
},
{
"path": "skills/create-plans/README.md",
"sha256": "b2509f7fe878f60a5db8865f577f912f417f4c1da832faf2bf9bc0f98b8e088a"
},
{
"path": "skills/create-plans/SKILL.md",
"sha256": "c0adb5d8a79109a21350b51abf6284a12b2be301e4097093104074c5e02c33cb"
},
{
"path": "skills/create-plans/references/git-integration.md",
"sha256": "df219879430972f57ff4cd2352c5398768bb70992d7e5c8bdfe49de763b85217"
},
{
"path": "skills/create-plans/references/scope-estimation.md",
"sha256": "e40ba480a44954ceafd73f2f2a72b2806910aaaffdaa1a4d63191e598771483d"
},
{
"path": "skills/create-plans/references/context-management.md",
"sha256": "a41674f30eed0bf58a449c438cda81f75544538c08f9f455a86e99a7a5ed3264"
},
{
"path": "skills/create-plans/references/milestone-management.md",
"sha256": "8a9905e112f2c3bd180639562ba20d72a7ff67e0ceb92dc2bd574032afc9b48e"
},
{
"path": "skills/create-plans/references/hierarchy-rules.md",
"sha256": "0b6ddaa82cdefebdf198eb00b08a41ad41ea7b0b2fb8c60af2d8f412f36d270e"
},
{
"path": "skills/create-plans/references/plan-format.md",
"sha256": "e300018781292d607119e2d8e7a0c2b8524d93db563806df61ab15ccb35ad1e5"
},
{
"path": "skills/create-plans/references/cli-automation.md",
"sha256": "469f4b56d46f830c0840ed75d92f483383c0bd62663cea38dba230b595305d36"
},
{
"path": "skills/create-plans/references/domain-expertise.md",
"sha256": "a10845a7d194a3a298244e7b842723c23d008fab4fb42d1c315b2caae21e445c"
},
{
"path": "skills/create-plans/references/checkpoints.md",
"sha256": "ebd1b5d1640ac01aaff3938d17f8d3cb9a73a69381b78510bbe9b3d770124457"
},
{
"path": "skills/create-plans/references/user-gates.md",
"sha256": "846693c817118f1fe8a5aef42c178e6f5f2b8700c73f258701628cd16020d53b"
},
{
"path": "skills/create-plans/references/research-pitfalls.md",
"sha256": "520f940f9c22472e7d00a9f7aed2879ab0e4136ba7ff50cdd7bd91510cabe36e"
},
{
"path": "skills/create-plans/workflows/complete-milestone.md",
"sha256": "2cbcf31ef141e8239093b51fd2e6c8d6bbec561b8f67f3dc6251a041bb48a65d"
},
{
"path": "skills/create-plans/workflows/research-phase.md",
"sha256": "4c5279c5b995f8923844cfd3fee86e386413a00f305ad3efe6d6601d69cf61f6"
},
{
"path": "skills/create-plans/workflows/plan-phase.md",
"sha256": "437cae541aa6f49338afec1429307cf15ad9b7b9ab2c0ea370ecce5374fe3129"
},
{
"path": "skills/create-plans/workflows/create-brief.md",
"sha256": "4f8a2643dfbb4bc778e91c9c687748ab84ffc3b8fa21f5f6d17235fe4d9cc1e5"
},
{
"path": "skills/create-plans/workflows/get-guidance.md",
"sha256": "866ad99653855eee1749fd67714cad879583f94c31eb09802f5e99e70e948e10"
},
{
"path": "skills/create-plans/workflows/plan-chunk.md",
"sha256": "6d9e0bb4c95cdfa2b7bb6a881dbb514cce09e097007687b4bc852fa288019426"
},
{
"path": "skills/create-plans/workflows/resume.md",
"sha256": "7361c9524a6c0826340627eae21275228cfcb6d4bd1f7585c22f0e75a6fcdf6b"
},
{
"path": "skills/create-plans/workflows/create-roadmap.md",
"sha256": "2fd8d0158c7f10e3a93dc13431b8b94520e34004b3b1356ca03806152b5bcad9"
},
{
"path": "skills/create-plans/workflows/execute-phase.md",
"sha256": "17ab45e4e3ea4bc93ad301ce2d0d0e43e5931278bb2733085c091213f45833f9"
},
{
"path": "skills/create-plans/workflows/handoff.md",
"sha256": "81ca4b1ecb9a8e56ef985e7981fcdabf63d714c4bd395f264b46540bea6eeb3d"
},
{
"path": "skills/create-plans/workflows/transition.md",
"sha256": "8d82e802d0ef4909f57540b1cc0f7127b9f24346221951c9f18f85fd15431440"
},
{
"path": "skills/create-plans/templates/summary.md",
"sha256": "15821e46867fcbcc4e4324a3bd4ce5649530330469ffb898f695b63dccc75b21"
},
{
"path": "skills/create-plans/templates/research-prompt.md",
"sha256": "7b38ffdca1fe39f2bc76b325ce8d0cc390af4da512ed73dcbae7818264f6c26a"
},
{
"path": "skills/create-plans/templates/phase-prompt.md",
"sha256": "a1f9a638d5a18f1802848a5a89655d08ce3d18af74655b1f2d75ccc72e337579"
},
{
"path": "skills/create-plans/templates/continue-here.md",
"sha256": "f522a51b6895fba838c7a9c60408c5a09472466bdf2837f8974330937e682932"
},
{
"path": "skills/create-plans/templates/milestone.md",
"sha256": "74d2f750ae9f4a9c18feec3708d8f414c5b15148b22eb7da554dc2da87587711"
},
{
"path": "skills/create-plans/templates/roadmap.md",
"sha256": "f1e4dd5cadda2344501c7da0c3d3570bcfc86a51b6989648e52a6ebd12b53011"
},
{
"path": "skills/create-plans/templates/issues.md",
"sha256": "137ca16db4d34190b4b6eac892f6bcbc2777853709ca3c08202a261b94489628"
},
{
"path": "skills/create-plans/templates/brief.md",
"sha256": "7566d6188ed90c29da1a1636e5c8bf88737a6701c2bd76ccd8d8996d7e979cb7"
},
{
"path": "skills/create-subagents/SKILL.md",
"sha256": "2f11e2dc66e7a55cc4bc8f87428476664bbf6112acf68d66172e8b012616c00c"
},
{
"path": "skills/create-subagents/references/orchestration-patterns.md",
"sha256": "374d61967a818919f96537a5f5c3dc3f87cedcfd38f8f3b5377f7298ee149442"
},
{
"path": "skills/create-subagents/references/debugging-agents.md",
"sha256": "04388fc71f26c496a8947d72a4fcddf2e811c139da6fb9e890dee0bcb42188e0"
},
{
"path": "skills/create-subagents/references/context-management.md",
"sha256": "1e4e714c406f1fabf7f203d3dcda88fd5ca6d5f06ea1f18d45c4cdefff0a7621"
},
{
"path": "skills/create-subagents/references/error-handling-and-recovery.md",
"sha256": "63abd9506606213563bc43a9a39cd9caf4669d8fb88ec9426f89ada32ec0497b"
},
{
"path": "skills/create-subagents/references/writing-subagent-prompts.md",
"sha256": "27d09f24aa87c94433883cfb202cef0f4c8e029bf64e1f81648b5d7f4da8afeb"
},
{
"path": "skills/create-subagents/references/subagents.md",
"sha256": "a9531056abe5c8e37e6e69756fdc6e64d08554b561ee8218e297548bf9205bff"
},
{
"path": "skills/create-subagents/references/evaluation-and-testing.md",
"sha256": "634e0b35b738f94fe668759511d22bc780b17b15ca8330324bafe73d6a588ab4"
},
{
"path": "skills/expertise/iphone-apps/SKILL.md",
"sha256": "4d8b89d7f4430129008879473210e5fc7a4327c2e788438317a93f5ce52a337c"
},
{
"path": "skills/expertise/iphone-apps/references/app-store.md",
"sha256": "b8bfd657034b8a97c4cc5c211ca30a6a8f91f155086f4375bc4d67d04eefe84d"
},
{
"path": "skills/expertise/iphone-apps/references/swiftui-patterns.md",
"sha256": "58687c1c21ebbc33dc473b759d6be622394699f79905a24bb628ba2457654850"
},
{
"path": "skills/expertise/iphone-apps/references/push-notifications.md",
"sha256": "051dcf18a4c703b60e998d04f5b1dca99c26f089df283a98a4c4c6664a806146"
},
{
"path": "skills/expertise/iphone-apps/references/background-tasks.md",
"sha256": "497981bd8a1bbddaf16a1befaab96981f2796a5a908fc687212f3df217a3c258"
},
{
"path": "skills/expertise/iphone-apps/references/networking.md",
"sha256": "739b27c8969ff6cd9ac3261a936d0298ca7b038d65c069c22de7b13901acf8a0"
},
{
"path": "skills/expertise/iphone-apps/references/data-persistence.md",
"sha256": "41f2a9b0a49ebe0e0bf5fde422091e27665fda4692ac888209cdf53519a479d5"
},
{
"path": "skills/expertise/iphone-apps/references/app-icons.md",
"sha256": "2eee394b9bc70465d72d9461c1baa1369aed7159efd510b04d09f5e0420674af"
},
{
"path": "skills/expertise/iphone-apps/references/project-scaffolding.md",
"sha256": "f7838caaf3fb3b581d3402cbd70b594648c214e5f813fe2f56c996020032e7d5"
},
{
"path": "skills/expertise/iphone-apps/references/accessibility.md",
"sha256": "af1d8f1f99c3c977bda2c86381ff45fefcba7618898fa855ec24788aa937128d"
},
{
"path": "skills/expertise/iphone-apps/references/cli-workflow.md",
"sha256": "6c0cde9877be7fbbf423df20c9d5362b0dfbe3241fb845b0f1f9ab01e946fa7c"
},
{
"path": "skills/expertise/iphone-apps/references/storekit.md",
"sha256": "f76c7a8416a2fa677d76f62991ee849399e3c13bada7033147350b3878ba44ee"
},
{
"path": "skills/expertise/iphone-apps/references/testing.md",
"sha256": "1757a0a50b8291dc4556392701f623b23ec3856d4d83583bf795f7b42ac7dfd4"
},
{
"path": "skills/expertise/iphone-apps/references/performance.md",
"sha256": "87578e8ca777155526133bef061fbf97f8d49f9104ff0914f7a7592d9165753e"
},
{
"path": "skills/expertise/iphone-apps/references/app-architecture.md",
"sha256": "f34560b61eaf7e22dfaa4e48720cf6e36ed59f323cb0e1e44adbb22a37444005"
},
{
"path": "skills/expertise/iphone-apps/references/cli-observability.md",
"sha256": "a9f7bc7b1dd58ecb83037fa9e906dbbf21a90385b4eb7741c83bbde45c1b46e1"
},
{
"path": "skills/expertise/iphone-apps/references/polish-and-ux.md",
"sha256": "8c2849884bd26f06ab13c0c5a95c832ee169504cb368c60b5e94ff13df1c8af2"
},
{
"path": "skills/expertise/iphone-apps/references/navigation-patterns.md",
"sha256": "81db693671e2b4f8639cc11579e23b3c8bc56560346e4a9162f6b6c957c0a12e"
},
{
"path": "skills/expertise/iphone-apps/references/ci-cd.md",
"sha256": "950781221d88941877c4159e022887d262faac610f89b934eb020c90b94348f9"
},
{
"path": "skills/expertise/iphone-apps/references/security.md",
"sha256": "dc3a8eb0b58608908ca6a157adb6fe763468228d9573cf25ad5ddd5b1dfc6b09"
},
{
"path": "skills/expertise/iphone-apps/workflows/add-feature.md",
"sha256": "50ed6cdbe0f21ace293236c12fe2c2abde4a738a00c4afb26e45ee12d84a31fd"
},
{
"path": "skills/expertise/iphone-apps/workflows/debug-app.md",
"sha256": "868160973a0bbb571faf240da37381a872814eefb1e66c11be2b3ca07134f0f8"
},
{
"path": "skills/expertise/iphone-apps/workflows/ship-app.md",
"sha256": "c5e2632735a9e681811afa8c543d1f959fc8784d7bbb01126debb4b26cc9d9f7"
},
{
"path": "skills/expertise/iphone-apps/workflows/write-tests.md",
"sha256": "339e6c1a1172f9440639ef37e5cd6d2a4b538d189e0bfe5c8151e8f8a9c12a0d"
},
{
"path": "skills/expertise/iphone-apps/workflows/optimize-performance.md",
"sha256": "7f76b2849d884eaeb8bf45671b820e3141ee6b0c219a844c39a1688decd2755c"
},
{
"path": "skills/expertise/iphone-apps/workflows/build-new-app.md",
"sha256": "699d7288ff2f5dd3f03d015ea6e9c9f023d5fdd4650fe91e9eb56f4db689b7db"
},
{
"path": "skills/expertise/macos-apps/SKILL.md",
"sha256": "2ebabe78f46270117080bb1e83ee24cf12e4887b4f8c06ecab9dd7adc4bd7259"
},
{
"path": "skills/expertise/macos-apps/references/swiftui-patterns.md",
"sha256": "6850547ada32cdd691a4d9cc62292558bcc79a348e5d7081365651a9059cb539"
},
{
"path": "skills/expertise/macos-apps/references/macos-polish.md",
"sha256": "1c22f175e16e3e4ddfbad1f7b942380058d01f930288237e75405e62a941e445"
},
{
"path": "skills/expertise/macos-apps/references/networking.md",
"sha256": "1541b25aa6d440c85b7b0ee171bf4b3bc9a9e2d232f86865aa4a015eb2bbac41"
},
{
"path": "skills/expertise/macos-apps/references/data-persistence.md",
"sha256": "e3d27b89c3ad6b88e27d5afdfbd17f9011e695f2c58f36d42eebfe7442e9f79c"
},
{
"path": "skills/expertise/macos-apps/references/project-scaffolding.md",
"sha256": "e43bb2b387da4f9c9f0ae5b316a2729e2ef57b58d27992d63563a871bc577634"
},
{
"path": "skills/expertise/macos-apps/references/appkit-integration.md",
"sha256": "cbd2cf372e6c6ad8a1870b09609be9f8e6e0a88c829cce4bfb7dbeb378c1d973"
},
{
"path": "skills/expertise/macos-apps/references/testing-debugging.md",
"sha256": "9d75f82799056218c1ffc23a3c5c7956dfaf44e39ffe15adeb03cf185cf254b0"
},
{
"path": "skills/expertise/macos-apps/references/testing-tdd.md",
"sha256": "ee19d5ee5d42698c5a0bafa726334722d44f6df5469dbbbc56bb958656a6e8bf"
},
{
"path": "skills/expertise/macos-apps/references/cli-workflow.md",
"sha256": "0dcb17af6a911098b9eb19f2d6b497e7c478b387ffa1272977990da535ced77d"
},
{
"path": "skills/expertise/macos-apps/references/shoebox-apps.md",
"sha256": "c5139045519862f3116376209e147ee8e92678cba9b1b3bbc1a95116fbdb8146"
},
{
"path": "skills/expertise/macos-apps/references/app-architecture.md",
"sha256": "0334b50bbd7783d0c9fbe88eceec990fd7bdcaf7d69989dc149c69497d323a8c"
},
{
"path": "skills/expertise/macos-apps/references/design-system.md",
"sha256": "31f52021f743214f3ba360208de245acc30a396e6f56945ea4f2633fecb4353b"
},
{
"path": "skills/expertise/macos-apps/references/cli-observability.md",
"sha256": "b2d0280a04adafb0166fad6240e66fe3e686e9fc1257752fbdf1ea57c93efa96"
},
{
"path": "skills/expertise/macos-apps/references/menu-bar-apps.md",
"sha256": "c32a8acbb24f6af36c9edf5334bdefbb4a50c3963cab8c43f5d7cbc07f49bbce"
},
{
"path": "skills/expertise/macos-apps/references/security-code-signing.md",
"sha256": "266447da62f8fd82bf9e02fd11d15c4008babada1d603c843b674832040f87b7"
},
{
"path": "skills/expertise/macos-apps/references/system-apis.md",
"sha256": "8cfa309d688366f5cbf0dad9d016865859342e5e9eabc3f3dbe8a0fd91974d4a"
},
{
"path": "skills/expertise/macos-apps/references/concurrency-patterns.md",
"sha256": "5cc950b15be7fa2d654908daf2a69bffe5937f19ced934a5fe2bad401d48b443"
},
{
"path": "skills/expertise/macos-apps/references/document-apps.md",
"sha256": "38ecadeb45224ac14d6d755565e694e25f30c1177f018008896ee234e105d5fd"
},
{
"path": "skills/expertise/macos-apps/references/app-extensions.md",
"sha256": "adeafb23ec54af11fae4dc9dde666ce790fdb9839397f75254564eeb45f3f8e6"
},
{
"path": "skills/expertise/macos-apps/workflows/add-feature.md",
"sha256": "33eb36598c995ee91d7684195499a9cf6991b19f51d4108046abfb977419058e"
},
{
"path": "skills/expertise/macos-apps/workflows/debug-app.md",
"sha256": "f48a9d2377e0807233daddf26553d09dc7f956b3ab0e1e0600e2fbdfd1de4a85"
},
{
"path": "skills/expertise/macos-apps/workflows/ship-app.md",
"sha256": "6a50e1907e2e0719a7a50a5eeb7ebf866bdff3f68a05c0289d1829680133732c"
},
{
"path": "skills/expertise/macos-apps/workflows/write-tests.md",
"sha256": "7db4454101be37174454e2853388e742caff7a26e7ea47e98cd64532f9bfe0d3"
},
{
"path": "skills/expertise/macos-apps/workflows/optimize-performance.md",
"sha256": "050df7c3eac2ba214ff52098f3e4df3754d023e467ecf7f083f834b8fafb394f"
},
{
"path": "skills/expertise/macos-apps/workflows/build-new-app.md",
"sha256": "73566c8e4d8f23868873bf77ee03b3e7b7d371638b8340722a3c5940f6e766b3"
}
],
"dirSha256": "afd0d1b05f4cd7091d0295389253496d3de0dd6c700c050391cb36e73c65513f"
},
"security": {
"scannedAt": null,
"scannerVersion": null,
"flags": []
}
}

View File

@@ -0,0 +1,192 @@
---
name: create-agent-skills
description: Expert guidance for creating, writing, building, and refining Claude Code Skills. Use when working with SKILL.md files, authoring new skills, improving existing skills, or understanding skill structure and best practices.
---
<essential_principles>
## How Skills Work
Skills are modular, filesystem-based capabilities that provide domain expertise on demand. This skill teaches how to create effective skills.
### 1. Skills Are Prompts
All prompting best practices apply. Be clear, be direct, use XML structure. Assume Claude is smart - only add context Claude doesn't have.
### 2. SKILL.md Is Always Loaded
When a skill is invoked, Claude reads SKILL.md. Use this guarantee:
- Essential principles go in SKILL.md (can't be skipped)
- Workflow-specific content goes in workflows/
- Reusable knowledge goes in references/
### 3. Router Pattern for Complex Skills
```
skill-name/
├── SKILL.md # Router + principles
├── workflows/ # Step-by-step procedures (FOLLOW)
├── references/ # Domain knowledge (READ)
├── templates/ # Output structures (COPY + FILL)
└── scripts/ # Reusable code (EXECUTE)
```
SKILL.md asks "what do you want to do?" → routes to workflow → workflow specifies which references to read.
**When to use each folder:**
- **workflows/** - Multi-step procedures Claude follows
- **references/** - Domain knowledge Claude reads for context
- **templates/** - Consistent output structures Claude copies and fills (plans, specs, configs)
- **scripts/** - Executable code Claude runs as-is (deploy, setup, API calls)
### 4. Pure XML Structure
No markdown headings (#, ##, ###) in skill body. Use semantic XML tags:
```xml
<objective>...</objective>
<process>...</process>
<success_criteria>...</success_criteria>
```
Keep markdown formatting within content (bold, lists, code blocks).
### 5. Progressive Disclosure
SKILL.md under 500 lines. Split detailed content into reference files. Load only what's needed for the current workflow.
</essential_principles>
<intake>
What would you like to do?
1. Create new skill
2. Audit/modify existing skill
3. Add component (workflow/reference/template/script)
4. Get guidance
**Wait for response before proceeding.**
</intake>
<routing>
| Response | Next Action | Workflow |
|----------|-------------|----------|
| 1, "create", "new", "build" | Ask: "Task-execution skill or domain expertise skill?" | Route to appropriate create workflow |
| 2, "audit", "modify", "existing" | Ask: "Path to skill?" | Route to appropriate workflow |
| 3, "add", "component" | Ask: "Add what? (workflow/reference/template/script)" | workflows/add-{type}.md |
| 4, "guidance", "help" | General guidance | workflows/get-guidance.md |
**Progressive disclosure for option 1 (create):**
- If user selects "Task-execution skill" → workflows/create-new-skill.md
- If user selects "Domain expertise skill" → workflows/create-domain-expertise-skill.md
**Progressive disclosure for option 3 (add component):**
- If user specifies workflow → workflows/add-workflow.md
- If user specifies reference → workflows/add-reference.md
- If user specifies template → workflows/add-template.md
- If user specifies script → workflows/add-script.md
**Intent-based routing (if user provides clear intent without selecting menu):**
- "audit this skill", "check skill", "review" → workflows/audit-skill.md
- "verify content", "check if current" → workflows/verify-skill.md
- "create domain expertise", "exhaustive knowledge base" → workflows/create-domain-expertise-skill.md
- "create skill for X", "build new skill" → workflows/create-new-skill.md
- "add workflow", "add reference", etc. → workflows/add-{type}.md
- "upgrade to router" → workflows/upgrade-to-router.md
**After reading the workflow, follow it exactly.**
</routing>
<quick_reference>
## Skill Structure Quick Reference
**Simple skill (single file):**
```yaml
---
name: skill-name
description: What it does and when to use it.
---
<objective>What this skill does</objective>
<quick_start>Immediate actionable guidance</quick_start>
<process>Step-by-step procedure</process>
<success_criteria>How to know it worked</success_criteria>
```
**Complex skill (router pattern):**
```
SKILL.md:
<essential_principles> - Always applies
<intake> - Question to ask
<routing> - Maps answers to workflows
workflows/:
<required_reading> - Which refs to load
<process> - Steps
<success_criteria> - Done when...
references/:
Domain knowledge, patterns, examples
templates/:
Output structures Claude copies and fills
(plans, specs, configs, documents)
scripts/:
Executable code Claude runs as-is
(deploy, setup, API calls, data processing)
```
</quick_reference>
<reference_index>
## Domain Knowledge
All in `references/`:
**Structure:** recommended-structure.md, skill-structure.md
**Principles:** core-principles.md, be-clear-and-direct.md, use-xml-tags.md
**Patterns:** common-patterns.md, workflows-and-validation.md
**Assets:** using-templates.md, using-scripts.md
**Advanced:** executable-code.md, api-security.md, iteration-and-testing.md
</reference_index>
<workflows_index>
## Workflows
All in `workflows/`:
| Workflow | Purpose |
|----------|---------|
| create-new-skill.md | Build a skill from scratch |
| create-domain-expertise-skill.md | Build an exhaustive domain knowledge base for build/ |
| audit-skill.md | Analyze skill against best practices |
| verify-skill.md | Check if content is still accurate |
| add-workflow.md | Add a workflow to existing skill |
| add-reference.md | Add a reference to existing skill |
| add-template.md | Add a template to existing skill |
| add-script.md | Add a script to existing skill |
| upgrade-to-router.md | Convert simple skill to router pattern |
| get-guidance.md | Help decide what kind of skill to build |
</workflows_index>
<yaml_requirements>
## YAML Frontmatter
Required fields:
```yaml
---
name: skill-name # lowercase-with-hyphens, matches directory
description: ... # What it does AND when to use it (third person)
---
```
Name conventions: `create-*`, `manage-*`, `setup-*`, `generate-*`, `build-*`
</yaml_requirements>
<success_criteria>
A well-structured skill:
- Has valid YAML frontmatter
- Uses pure XML structure (no markdown headings in body)
- Has essential principles inline in SKILL.md
- Routes directly to appropriate workflows based on user intent
- Keeps SKILL.md under 500 lines
- Asks minimal clarifying questions only when truly needed
- Has been tested with real usage
</success_criteria>

View File

@@ -0,0 +1,226 @@
<overview>
When building skills that make API calls requiring credentials (API keys, tokens, secrets), follow this protocol to prevent credentials from appearing in chat.
</overview>
<the_problem>
Raw curl commands with environment variables expose credentials:
```bash
# ❌ BAD - API key visible in chat
curl -H "Authorization: Bearer $API_KEY" https://api.example.com/data
```
When Claude executes this, the full command with expanded `$API_KEY` appears in the conversation.
</the_problem>
<the_solution>
Use `~/.claude/scripts/secure-api.sh` - a wrapper that loads credentials internally.
<for_supported_services>
```bash
# ✅ GOOD - No credentials visible
~/.claude/scripts/secure-api.sh <service> <operation> [args]
# Examples:
~/.claude/scripts/secure-api.sh facebook list-campaigns
~/.claude/scripts/secure-api.sh ghl search-contact "email@example.com"
```
</for_supported_services>
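For orientation, this is a minimal sketch of the shape such a wrapper can take. It is illustrative only: the service name, operation, endpoint, and environment variable are placeholders, and the real `secure-api.sh` may parse arguments and handle profiles differently. The key property is that the credential is read and expanded inside the script, never in a command that appears in chat.
```bash
#!/usr/bin/env bash
# Illustrative wrapper skeleton: credentials load inside the script,
# so only the command you run and its output ever reach the conversation.
set -euo pipefail

set -a
source ~/.claude/.env 2>/dev/null || { echo "Error: ~/.claude/.env not found" >&2; exit 1; }
set +a

SERVICE=${1:?Usage: secure-api.sh <service> <operation> [args]}
OPERATION=${2:?Missing operation}
shift 2

case "$SERVICE" in
  exampleservice)
    case "$OPERATION" in
      list-items)
        # EXAMPLESERVICE_API_KEY is a placeholder set in ~/.claude/.env
        curl -s -G \
          -H "Authorization: Bearer $EXAMPLESERVICE_API_KEY" \
          "https://api.example.com/items"
        ;;
      *)
        echo "Unknown operation: $OPERATION" >&2
        exit 1
        ;;
    esac
    ;;
  *)
    echo "Unknown service: $SERVICE" >&2
    exit 1
    ;;
esac
```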
<adding_new_services>
When building a new skill that requires API calls:
1. **Add operations to the wrapper** (`~/.claude/scripts/secure-api.sh`):
```bash
case "$SERVICE" in
yourservice)
case "$OPERATION" in
list-items)
curl -s -G \
-H "Authorization: Bearer $YOUR_API_KEY" \
"https://api.yourservice.com/items"
;;
get-item)
ITEM_ID=$1
curl -s -G \
-H "Authorization: Bearer $YOUR_API_KEY" \
"https://api.yourservice.com/items/$ITEM_ID"
;;
*)
echo "Unknown operation: $OPERATION" >&2
exit 1
;;
esac
;;
esac
```
2. **Add profile support to the wrapper** (if service needs multiple accounts):
```bash
# In secure-api.sh, add to profile remapping section:
yourservice)
  SERVICE_UPPER="YOURSERVICE"
  YOURSERVICE_API_KEY=$(eval echo \$${SERVICE_UPPER}_${PROFILE_UPPER}_API_KEY)
  YOURSERVICE_ACCOUNT_ID=$(eval echo \$${SERVICE_UPPER}_${PROFILE_UPPER}_ACCOUNT_ID)
  ;;
```
3. **Add credential placeholders to `~/.claude/.env`** using profile naming:
```bash
# Check if entries already exist
grep -q "YOURSERVICE_MAIN_API_KEY=" ~/.claude/.env 2>/dev/null || \
echo -e "\n# Your Service - Main profile\nYOURSERVICE_MAIN_API_KEY=\nYOURSERVICE_MAIN_ACCOUNT_ID=" >> ~/.claude/.env
echo "Added credential placeholders to ~/.claude/.env - user needs to fill them in"
```
4. **Document profile workflow in your SKILL.md**:
```markdown
## Profile Selection Workflow
**CRITICAL:** Always use profile selection to prevent using wrong account credentials.
### When user requests YourService operation:
1. **Check for saved profile:**
```bash
~/.claude/scripts/profile-state get yourservice
```
2. **If no profile saved, discover available profiles:**
```bash
~/.claude/scripts/list-profiles yourservice
```
3. **If only ONE profile:** Use it automatically and announce:
```
"Using YourService profile 'main' to list items..."
```
4. **If MULTIPLE profiles:** Ask user which one:
```
"Which YourService profile: main, clienta, or clientb?"
```
5. **Save user's selection:**
```bash
~/.claude/scripts/profile-state set yourservice <selected_profile>
```
6. **Always announce which profile before calling API:**
```
"Using YourService profile 'main' to list items..."
```
7. **Make API call with profile:**
```bash
~/.claude/scripts/secure-api.sh yourservice:<profile> list-items
```
## Secure API Calls
All API calls use profile syntax:
```bash
~/.claude/scripts/secure-api.sh yourservice:<profile> <operation> [args]
# Examples:
~/.claude/scripts/secure-api.sh yourservice:main list-items
~/.claude/scripts/secure-api.sh yourservice:main get-item <ITEM_ID>
```
**Profile persists for session:** Once selected, use same profile for subsequent operations unless user explicitly changes it.
```
</adding_new_services>
</the_solution>
<pattern_guidelines>
<simple_get_requests>
```bash
curl -s -G \
-H "Authorization: Bearer $API_KEY" \
"https://api.example.com/endpoint"
```
</simple_get_requests>
<post_with_json_body>
```bash
ITEM_ID=$1
curl -s -X POST \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d @- \
"https://api.example.com/items/$ITEM_ID"
```
Usage:
```bash
echo '{"name":"value"}' | ~/.claude/scripts/secure-api.sh service create-item
```
</post_with_json_body>
<post_with_form_data>
```bash
curl -s -X POST \
-F "field1=value1" \
-F "field2=value2" \
-F "access_token=$API_TOKEN" \
"https://api.example.com/endpoint"
```
</post_with_form_data>
</pattern_guidelines>
<credential_storage>
**Location:** `~/.claude/.env` (global for all skills, accessible from any directory)
**Format:**
```bash
# Service credentials
SERVICE_API_KEY=your-key-here
SERVICE_ACCOUNT_ID=account-id-here
# Another service
OTHER_API_TOKEN=token-here
OTHER_BASE_URL=https://api.other.com
```
**Loading in script:**
```bash
set -a
source ~/.claude/.env 2>/dev/null || { echo "Error: ~/.claude/.env not found" >&2; exit 1; }
set +a
```
</credential_storage>
<best_practices>
1. **Never use raw curl with `$VARIABLE` in skill examples** - always use the wrapper
2. **Add all operations to the wrapper** - don't make users figure out curl syntax
3. **Auto-create credential placeholders** - add empty fields to `~/.claude/.env` immediately when creating the skill
4. **Keep credentials in `~/.claude/.env`** - one central location, works everywhere
5. **Document each operation** - show examples in SKILL.md
6. **Handle errors gracefully** - check for missing env vars and show helpful error messages (see the sketch below)
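A minimal sketch of such a check inside the wrapper (the variable name and hint are placeholders):
```bash
# Fail fast with a helpful message instead of sending an empty Authorization header
if [ -z "${EXAMPLESERVICE_API_KEY:-}" ]; then
  echo "Error: EXAMPLESERVICE_API_KEY is not set in ~/.claude/.env" >&2
  echo "Add it with: echo 'EXAMPLESERVICE_API_KEY=your-key' >> ~/.claude/.env" >&2
  exit 1
fi
```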
</best_practices>
<testing>
Test the wrapper without exposing credentials:
```bash
# This command appears in chat
~/.claude/scripts/secure-api.sh facebook list-campaigns
# But API keys never appear - they're loaded inside the script
```
Verify credentials are loaded:
```bash
# Check .env exists
ls -la ~/.claude/.env
# Check specific variables (without showing values)
grep -q "YOUR_API_KEY=" ~/.claude/.env && echo "API key configured" || echo "API key missing"
```
</testing>

View File

@@ -0,0 +1,531 @@
<golden_rule>
Show your skill to someone with minimal context and ask them to follow the instructions. If they're confused, Claude will likely be too.
</golden_rule>
<overview>
Clarity and directness are fundamental to effective skill authoring. Clear instructions reduce errors, improve execution quality, and minimize token waste.
</overview>
<guidelines>
<contextual_information>
Give Claude contextual information that frames the task:
- What the task results will be used for
- What audience the output is meant for
- What workflow the task is part of
- The end goal or what successful completion looks like
Context helps Claude make better decisions and produce more appropriate outputs.
<example>
```xml
<context>
This analysis will be presented to investors who value transparency and actionable insights. Focus on financial metrics and clear recommendations.
</context>
```
</example>
</contextual_information>
<specificity>
Be specific about what you want Claude to do. If you want code only and nothing else, say so.
**Vague**: "Help with the report"
**Specific**: "Generate a markdown report with three sections: Executive Summary, Key Findings, Recommendations"
**Vague**: "Process the data"
**Specific**: "Extract customer names and email addresses from the CSV file, removing duplicates, and save to JSON format"
Specificity eliminates ambiguity and reduces iteration cycles.
</specificity>
<sequential_steps>
Provide instructions as sequential steps. Use numbered lists or bullet points.
```xml
<workflow>
1. Extract data from source file
2. Transform to target format
3. Validate transformation
4. Save to output file
5. Verify output correctness
</workflow>
```
Sequential steps create clear expectations and reduce the chance Claude skips important operations.
</sequential_steps>
</guidelines>
<example_comparison>
<unclear_example>
```xml
<quick_start>
Please remove all personally identifiable information from these customer feedback messages: {{FEEDBACK_DATA}}
</quick_start>
```
**Problems**:
- What counts as PII?
- What should replace PII?
- What format should the output be?
- What if no PII is found?
- Should product names be redacted?
</unclear_example>
<clear_example>
```xml
<objective>
Anonymize customer feedback for quarterly review presentation.
</objective>
<quick_start>
<instructions>
1. Replace all customer names with "CUSTOMER_[ID]" (e.g., "Jane Doe" → "CUSTOMER_001")
2. Replace email addresses with "EMAIL_[ID]@example.com"
3. Redact phone numbers as "PHONE_[ID]"
4. If a message mentions a specific product (e.g., "AcmeCloud"), leave it intact
5. If no PII is found, copy the message verbatim
6. Output only the processed messages, separated by "---"
</instructions>
Data to process: {{FEEDBACK_DATA}}
</quick_start>
<success_criteria>
- All customer names replaced with IDs
- All emails and phones redacted
- Product names preserved
- Output format matches specification
</success_criteria>
```
**Why this is better**:
- States the purpose (quarterly review)
- Provides explicit step-by-step rules
- Defines output format clearly
- Specifies edge cases (product names, no PII found)
- Defines success criteria
</clear_example>
</example_comparison>
<key_differences>
The clear version:
- States the purpose (quarterly review)
- Provides explicit step-by-step rules
- Defines output format
- Specifies edge cases (product names, no PII found)
- Includes success criteria
The unclear version leaves all these decisions to Claude, increasing the chance of misalignment with expectations.
</key_differences>
<show_dont_just_tell>
<principle>
When format matters, show an example rather than just describing it.
</principle>
<telling_example>
```xml
<commit_messages>
Generate commit messages in conventional format with type, scope, and description.
</commit_messages>
```
</telling_example>
<showing_example>
```xml
<commit_message_format>
Generate commit messages following these examples:
<example number="1">
<input>Added user authentication with JWT tokens</input>
<output>
```
feat(auth): implement JWT-based authentication
Add login endpoint and token validation middleware
```
</output>
</example>
<example number="2">
<input>Fixed bug where dates displayed incorrectly in reports</input>
<output>
```
fix(reports): correct date formatting in timezone conversion
Use UTC timestamps consistently across report generation
```
</output>
</example>
Follow this style: type(scope): brief description, then detailed explanation.
</commit_message_format>
```
</showing_example>
<why_showing_works>
Examples communicate nuances that text descriptions can't:
- Exact formatting (spacing, capitalization, punctuation)
- Tone and style
- Level of detail
- Pattern across multiple cases
Claude learns patterns from examples more reliably than from descriptions.
</why_showing_works>
</show_dont_just_tell>
<avoid_ambiguity>
<principle>
Eliminate words and phrases that create ambiguity or leave decisions open.
</principle>
<ambiguous_phrases>
**"Try to..."** - Implies optional
**"Always..."** or **"Never..."** - Clear requirement
**"Should probably..."** - Unclear obligation
**"Must..."** or **"May optionally..."** - Clear obligation level
**"Generally..."** - When are exceptions allowed?
**"Always... except when..."** - Clear rule with explicit exceptions
**"Consider..."** - Should Claude always do this or only sometimes?
**"If X, then Y"** or **"Always..."** - Clear conditions
</ambiguous_phrases>
<example>
**Ambiguous**:
```xml
<validation>
You should probably validate the output and try to fix any errors.
</validation>
```
**Clear**:
```xml
<validation>
Always validate output before proceeding:
```bash
python scripts/validate.py output_dir/
```
If validation fails, fix errors and re-validate. Only proceed when validation passes with zero errors.
</validation>
```
</example>
</avoid_ambiguity>
<define_edge_cases>
<principle>
Anticipate edge cases and define how to handle them. Don't leave Claude guessing.
</principle>
<without_edge_cases>
```xml
<quick_start>
Extract email addresses from the text file and save to a JSON array.
</quick_start>
```
**Questions left unanswered**:
- What if no emails are found?
- What if the same email appears multiple times?
- What if emails are malformed?
- What JSON format exactly?
</without_edge_cases>
<with_edge_cases>
```xml
<quick_start>
Extract email addresses from the text file and save to a JSON array.
<edge_cases>
- **No emails found**: Save empty array `[]`
- **Duplicate emails**: Keep only unique emails
- **Malformed emails**: Skip invalid formats, log to stderr
- **Output format**: Array of strings, one email per element
</edge_cases>
<example_output>
```json
[
"user1@example.com",
"user2@example.com"
]
```
</example_output>
</quick_start>
```
</with_edge_cases>
</define_edge_cases>
<output_format_specification>
<principle>
When output format matters, specify it precisely. Show examples.
</principle>
<vague_format>
```xml
<output>
Generate a report with the analysis results.
</output>
```
</vague_format>
<specific_format>
```xml
<output_format>
Generate a markdown report with this exact structure:
```markdown
# Analysis Report: [Title]
## Executive Summary
[1-2 paragraphs summarizing key findings]
## Key Findings
- Finding 1 with supporting data
- Finding 2 with supporting data
- Finding 3 with supporting data
## Recommendations
1. Specific actionable recommendation
2. Specific actionable recommendation
## Appendix
[Raw data and detailed calculations]
```
**Requirements**:
- Use exactly these section headings
- Executive summary must be 1-2 paragraphs
- List 3-5 key findings
- Provide 2-4 recommendations
- Include appendix with source data
</output_format>
```
</specific_format>
</output_format_specification>
<decision_criteria>
<principle>
When Claude must make decisions, provide clear criteria.
</principle>
<no_criteria>
```xml
<workflow>
Analyze the data and decide which visualization to use.
</workflow>
```
**Problem**: What factors should guide this decision?
</no_criteria>
<with_criteria>
```xml
<workflow>
Analyze the data and select appropriate visualization:
<decision_criteria>
**Use bar chart when**:
- Comparing quantities across categories
- Fewer than 10 categories
- Exact values matter
**Use line chart when**:
- Showing trends over time
- Continuous data
- Pattern recognition matters more than exact values
**Use scatter plot when**:
- Showing relationship between two variables
- Looking for correlations
- Individual data points matter
</decision_criteria>
</workflow>
```
**Benefits**: Claude has objective criteria for making the decision rather than guessing.
</with_criteria>
</decision_criteria>
<constraints_and_requirements>
<principle>
Clearly separate "must do" from "nice to have" from "must not do".
</principle>
<unclear_requirements>
```xml
<requirements>
The report should include financial data, customer metrics, and market analysis. It would be good to have visualizations. Don't make it too long.
</requirements>
```
**Problems**:
- Are all three content types required?
- Are visualizations optional or required?
- How long is "too long"?
</unclear_requirements>
<clear_requirements>
```xml
<requirements>
<must_have>
- Financial data (revenue, costs, profit margins)
- Customer metrics (acquisition, retention, lifetime value)
- Market analysis (competition, trends, opportunities)
- Maximum 5 pages
</must_have>
<nice_to_have>
- Charts and visualizations
- Industry benchmarks
- Future projections
</nice_to_have>
<must_not>
- Include confidential customer names
- Exceed 5 pages
- Use technical jargon without definitions
</must_not>
</requirements>
```
**Benefits**: Clear priorities and constraints prevent misalignment.
</clear_requirements>
</constraints_and_requirements>
<success_criteria>
<principle>
Define what success looks like. How will Claude know it succeeded?
</principle>
<without_success_criteria>
```xml
<objective>
Process the CSV file and generate a report.
</objective>
```
**Problem**: When is this task complete? What defines success?
</without_success_criteria>
<with_success_criteria>
```xml
<objective>
Process the CSV file and generate a summary report.
</objective>
<success_criteria>
- All rows in CSV successfully parsed
- No data validation errors
- Report generated with all required sections
- Report saved to output/report.md
- Output file is valid markdown
- Process completes without errors
</success_criteria>
```
**Benefits**: Clear completion criteria eliminate ambiguity about when the task is done.
</with_success_criteria>
</success_criteria>
<testing_clarity>
<principle>
Test your instructions by asking: "Could I hand these instructions to a junior developer and expect correct results?"
</principle>
<testing_process>
1. Read your skill instructions
2. Remove context only you have (project knowledge, unstated assumptions)
3. Identify ambiguous terms or vague requirements
4. Add specificity where needed
5. Test with someone who doesn't have your context
6. Iterate based on their questions and confusion
If a human with minimal context struggles, Claude will too.
</testing_process>
</testing_clarity>
<practical_examples>
<example domain="data_processing">
**Unclear**:
```xml
<quick_start>
Clean the data and remove bad entries.
</quick_start>
```
**Clear**:
```xml
<quick_start>
<data_cleaning>
1. Remove rows where required fields (name, email, date) are empty
2. Standardize date format to YYYY-MM-DD
3. Remove duplicate entries based on email address
4. Validate email format (must contain @ and domain)
5. Save cleaned data to output/cleaned_data.csv
</data_cleaning>
<success_criteria>
- No empty required fields
- All dates in YYYY-MM-DD format
- No duplicate emails
- All emails valid format
- Output file created successfully
</success_criteria>
</quick_start>
```
</example>
<example domain="code_generation">
**Unclear**:
```xml
<quick_start>
Write a function to process user input.
</quick_start>
```
**Clear**:
```xml
<quick_start>
<function_specification>
Write a Python function with this signature:
```python
def process_user_input(raw_input: str) -> dict:
"""
Validate and parse user input.
Args:
raw_input: Raw string from user (format: "name:email:age")
Returns:
dict with keys: name (str), email (str), age (int)
Raises:
ValueError: If input format is invalid
"""
```
**Requirements**:
- Split input on colon delimiter
- Validate email contains @ and domain
- Convert age to integer, raise ValueError if not numeric
- Return dictionary with specified keys
- Include docstring and type hints
</function_specification>
<success_criteria>
- Function signature matches specification
- All validation checks implemented
- Proper error handling for invalid input
- Type hints included
- Docstring included
</success_criteria>
</quick_start>
```
</example>
</practical_examples>

View File

@@ -0,0 +1,595 @@
<overview>
This reference documents common patterns for skill authoring, including templates, examples, terminology consistency, and anti-patterns. All patterns use pure XML structure.
</overview>
<template_pattern>
<description>
Provide templates for output format. Match the level of strictness to your needs.
</description>
<strict_requirements>
Use when output format must be exact and consistent:
```xml
<report_structure>
ALWAYS use this exact template structure:
```markdown
# [Analysis Title]
## Executive summary
[One-paragraph overview of key findings]
## Key findings
- Finding 1 with supporting data
- Finding 2 with supporting data
- Finding 3 with supporting data
## Recommendations
1. Specific actionable recommendation
2. Specific actionable recommendation
```
</report_structure>
```
**When to use**: Compliance reports, standardized formats, automated processing
</strict_requirements>
<flexible_guidance>
Use when Claude should adapt the format based on context:
```xml
<report_structure>
Here is a sensible default format, but use your best judgment:
```markdown
# [Analysis Title]
## Executive summary
[Overview]
## Key findings
[Adapt sections based on what you discover]
## Recommendations
[Tailor to the specific context]
```
Adjust sections as needed for the specific analysis type.
</report_structure>
```
**When to use**: Exploratory analysis, context-dependent formatting, creative tasks
</flexible_guidance>
</template_pattern>
<examples_pattern>
<description>
For skills where output quality depends on seeing examples, provide input/output pairs.
</description>
<commit_messages_example>
```xml
<objective>
Generate commit messages following conventional commit format.
</objective>
<commit_message_format>
Generate commit messages following these examples:
<example number="1">
<input>Added user authentication with JWT tokens</input>
<output>
```
feat(auth): implement JWT-based authentication
Add login endpoint and token validation middleware
```
</output>
</example>
<example number="2">
<input>Fixed bug where dates displayed incorrectly in reports</input>
<output>
```
fix(reports): correct date formatting in timezone conversion
Use UTC timestamps consistently across report generation
```
</output>
</example>
Follow this style: type(scope): brief description, then detailed explanation.
</commit_message_format>
```
</commit_messages_example>
<when_to_use>
- Output format has nuances that text explanations can't capture
- Pattern recognition is easier than rule following
- Examples demonstrate edge cases
- Multi-shot learning improves quality
</when_to_use>
</examples_pattern>
<consistent_terminology>
<principle>
Choose one term and use it throughout the skill. Inconsistent terminology confuses Claude and reduces execution quality.
</principle>
<good_example>
Consistent usage:
- Always "API endpoint" (not mixing with "URL", "API route", "path")
- Always "field" (not mixing with "box", "element", "control")
- Always "extract" (not mixing with "pull", "get", "retrieve")
```xml
<objective>
Extract data from API endpoints using field mappings.
</objective>
<quick_start>
1. Identify the API endpoint
2. Map response fields to your schema
3. Extract field values
</quick_start>
```
</good_example>
<bad_example>
Inconsistent usage creates confusion:
```xml
<objective>
Pull data from API routes using element mappings.
</objective>
<quick_start>
1. Identify the URL
2. Map response boxes to your schema
3. Retrieve control values
</quick_start>
```
Claude must now interpret: Are "API routes" and "URLs" the same? Are "fields", "boxes", "elements", and "controls" the same?
</bad_example>
<implementation>
1. Choose terminology early in skill development
2. Document key terms in `<objective>` or `<context>`
3. Use find/replace to enforce consistency (a quick check is sketched below)
4. Review reference files for consistent usage
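A quick way to catch stragglers before release is to search for the synonyms you decided against (the terms and path below are examples):
```bash
# List lines that still use rejected synonyms for "API endpoint" and "field"
grep -rnE "API route|URL endpoint|response box|control value" skill-name/ --include="*.md"
```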
</implementation>
</consistent_terminology>
<provide_default_with_escape_hatch>
<principle>
Provide a default approach with an escape hatch for special cases, not a list of alternatives. Too many options paralyze decision-making.
</principle>
<good_example>
Clear default with escape hatch:
```xml
<quick_start>
Use pdfplumber for text extraction:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```
For scanned PDFs requiring OCR, use pdf2image with pytesseract instead.
</quick_start>
```
</good_example>
<bad_example>
Too many options creates decision paralysis:
```xml
<quick_start>
You can use any of these libraries:
- **pypdf**: Good for basic extraction
- **pdfplumber**: Better for tables
- **PyMuPDF**: Faster but more complex
- **pdf2image**: For scanned documents
- **pdfminer**: Low-level control
- **tabula-py**: Table-focused
Choose based on your needs.
</quick_start>
```
Claude must now research and compare all options before starting. This wastes tokens and time.
</bad_example>
<implementation>
1. Recommend ONE default approach
2. Explain when to use the default (implied: most of the time)
3. Add ONE escape hatch for edge cases
4. Link to advanced reference if multiple alternatives truly needed
</implementation>
</provide_default_with_escape_hatch>
<anti_patterns>
<description>
Common mistakes to avoid when authoring skills.
</description>
<pitfall name="markdown_headings_in_body">
**BAD**: Using markdown headings in skill body:
```markdown
# PDF Processing
## Quick start
Extract text with pdfplumber...
## Advanced features
Form filling requires additional setup...
```
**GOOD**: Using pure XML structure:
```xml
<objective>
PDF processing with text extraction, form filling, and merging capabilities.
</objective>
<quick_start>
Extract text with pdfplumber...
</quick_start>
<advanced_features>
Form filling requires additional setup...
</advanced_features>
```
**Why it matters**: XML provides semantic meaning, reliable parsing, and token efficiency.
</pitfall>
<pitfall name="vague_descriptions">
**BAD**:
```yaml
description: Helps with documents
```
**GOOD**:
```yaml
description: Extract text and tables from PDF files, fill forms, merge documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.
```
**Why it matters**: Vague descriptions prevent Claude from discovering and using the skill appropriately.
</pitfall>
<pitfall name="inconsistent_pov">
**BAD**:
```yaml
description: I can help you process Excel files and generate reports
```
**GOOD**:
```yaml
description: Processes Excel files and generates reports. Use when analyzing spreadsheets or .xlsx files.
```
**Why it matters**: Skills must use third person. First/second person breaks the skill metadata pattern.
</pitfall>
<pitfall name="wrong_naming_convention">
**BAD**: Directory name doesn't match skill name or verb-noun convention:
- Directory: `facebook-ads`, Name: `facebook-ads-manager`
- Directory: `stripe-integration`, Name: `stripe`
- Directory: `helper-scripts`, Name: `helper`
**GOOD**: Consistent verb-noun convention:
- Directory: `manage-facebook-ads`, Name: `manage-facebook-ads`
- Directory: `setup-stripe-payments`, Name: `setup-stripe-payments`
- Directory: `process-pdfs`, Name: `process-pdfs`
**Why it matters**: Consistency in naming makes skills discoverable and predictable.
</pitfall>
<pitfall name="too_many_options">
**BAD**:
```xml
<quick_start>
You can use pypdf, or pdfplumber, or PyMuPDF, or pdf2image, or pdfminer, or tabula-py...
</quick_start>
```
**GOOD**:
```xml
<quick_start>
Use pdfplumber for text extraction:
```python
import pdfplumber
```
For scanned PDFs requiring OCR, use pdf2image with pytesseract instead.
</quick_start>
```
**Why it matters**: Decision paralysis. Provide one default approach with escape hatch for special cases.
</pitfall>
<pitfall name="deeply_nested_references">
**BAD**: References nested multiple levels:
```
SKILL.md → advanced.md → details.md → examples.md
```
**GOOD**: References one level deep from SKILL.md:
```
SKILL.md → advanced.md
SKILL.md → details.md
SKILL.md → examples.md
```
**Why it matters**: Claude may only partially read deeply nested files. Keep references one level deep from SKILL.md.
</pitfall>
<pitfall name="windows_paths">
**BAD**:
```xml
<reference_guides>
See scripts\validate.py for validation
</reference_guides>
```
**GOOD**:
```xml
<reference_guides>
See scripts/validate.py for validation
</reference_guides>
```
**Why it matters**: Always use forward slashes for cross-platform compatibility.
</pitfall>
<pitfall name="dynamic_context_and_file_reference_execution">
**Problem**: When showing examples of dynamic context syntax (exclamation mark + backticks) or file references (@ prefix), the skill loader executes these during skill loading.
**BAD** - These execute during skill load:
```xml
<examples>
Load current status with: !`git status`
Review dependencies in: @package.json
</examples>
```
**GOOD** - Add space to prevent execution:
```xml
<examples>
Load current status with: ! `git status` (remove space before backtick in actual usage)
Review dependencies in: @ package.json (remove space after @ in actual usage)
</examples>
```
**When this applies**:
- Skills that teach users about dynamic context (slash commands, prompts)
- Any documentation showing the exclamation mark prefix syntax or @ file references
- Skills with example commands or file paths that shouldn't execute during loading
**Why it matters**: Without the space, these execute during skill load, causing errors or unwanted file reads.
</pitfall>
<pitfall name="missing_required_tags">
**BAD**: Missing required tags:
```xml
<quick_start>
Use this tool for processing...
</quick_start>
```
**GOOD**: All required tags present:
```xml
<objective>
Process data files with validation and transformation.
</objective>
<quick_start>
Use this tool for processing...
</quick_start>
<success_criteria>
- Input file successfully processed
- Output file validates without errors
- Transformation applied correctly
</success_criteria>
```
**Why it matters**: Every skill must have `<objective>`, `<quick_start>`, and `<success_criteria>` (or `<when_successful>`).
</pitfall>
<pitfall name="hybrid_xml_markdown">
**BAD**: Mixing XML tags with markdown headings:
```markdown
<objective>
PDF processing capabilities
</objective>
## Quick start
Extract text with pdfplumber...
## Advanced features
Form filling...
```
**GOOD**: Pure XML throughout:
```xml
<objective>
PDF processing capabilities
</objective>
<quick_start>
Extract text with pdfplumber...
</quick_start>
<advanced_features>
Form filling...
</advanced_features>
```
**Why it matters**: Consistency in structure. Either use pure XML or pure markdown (prefer XML).
</pitfall>
<pitfall name="unclosed_xml_tags">
**BAD**: Forgetting to close XML tags:
```xml
<objective>
Process PDF files
<quick_start>
Use pdfplumber...
</quick_start>
```
**GOOD**: Properly closed tags:
```xml
<objective>
Process PDF files
</objective>
<quick_start>
Use pdfplumber...
</quick_start>
```
**Why it matters**: Unclosed tags break XML parsing and create ambiguous boundaries.
</pitfall>
</anti_patterns>
<progressive_disclosure_pattern>
<description>
Keep SKILL.md concise by linking to detailed reference files. Claude loads reference files only when needed.
</description>
<implementation>
```xml
<objective>
Manage Facebook Ads campaigns, ad sets, and ads via the Marketing API.
</objective>
<quick_start>
<basic_operations>
See [basic-operations.md](basic-operations.md) for campaign creation and management.
</basic_operations>
</quick_start>
<advanced_features>
**Custom audiences**: See [audiences.md](audiences.md)
**Conversion tracking**: See [conversions.md](conversions.md)
**Budget optimization**: See [budgets.md](budgets.md)
**API reference**: See [api-reference.md](api-reference.md)
</advanced_features>
```
**Benefits**:
- SKILL.md stays under 500 lines
- Claude only reads relevant reference files
- Token usage scales with task complexity
- Easier to maintain and update
</implementation>
</progressive_disclosure_pattern>
<validation_pattern>
<description>
For skills with validation steps, make validation scripts verbose and specific.
</description>
<implementation>
```xml
<validation>
After making changes, validate immediately:
```bash
python scripts/validate.py output_dir/
```
If validation fails, fix errors before continuing. Validation errors include:
- **Field not found**: "Field 'signature_date' not found. Available fields: customer_name, order_total, signature_date_signed"
- **Type mismatch**: "Field 'order_total' expects number, got string"
- **Missing required field**: "Required field 'customer_name' is missing"
Only proceed when validation passes with zero errors.
</validation>
```
**Why verbose errors help**:
- Claude can fix issues without guessing
- Specific error messages reduce iteration cycles
- Available options shown in error messages
</implementation>
</validation_pattern>
<checklist_pattern>
<description>
For complex multi-step workflows, provide a checklist Claude can copy and track progress.
</description>
<implementation>
```xml
<workflow>
Copy this checklist and check off items as you complete them:
```
Task Progress:
- [ ] Step 1: Analyze the form (run analyze_form.py)
- [ ] Step 2: Create field mapping (edit fields.json)
- [ ] Step 3: Validate mapping (run validate_fields.py)
- [ ] Step 4: Fill the form (run fill_form.py)
- [ ] Step 5: Verify output (run verify_output.py)
```
<step_1>
**Analyze the form**
Run: `python scripts/analyze_form.py input.pdf`
This extracts form fields and their locations, saving to `fields.json`.
</step_1>
<step_2>
**Create field mapping**
Edit `fields.json` to add values for each field.
</step_2>
<step_3>
**Validate mapping**
Run: `python scripts/validate_fields.py fields.json`
Fix any validation errors before continuing.
</step_3>
<step_4>
**Fill the form**
Run: `python scripts/fill_form.py input.pdf fields.json output.pdf`
</step_4>
<step_5>
**Verify output**
Run: `python scripts/verify_output.py output.pdf`
If verification fails, return to Step 2.
</step_5>
</workflow>
```
**Benefits**:
- Clear progress tracking
- Prevents skipping steps
- Easy to resume after interruption
</implementation>
</checklist_pattern>

View File

@@ -0,0 +1,437 @@
<overview>
Core principles guide skill authoring decisions. These principles ensure skills are efficient, effective, and maintainable across different models and use cases.
</overview>
<xml_structure_principle>
<description>
Skills use pure XML structure for consistent parsing, efficient token usage, and improved Claude performance.
</description>
<why_xml>
<consistency>
XML enforces consistent structure across all skills. All skills use the same tag names for the same purposes:
- `<objective>` always defines what the skill does
- `<quick_start>` always provides immediate guidance
- `<success_criteria>` always defines completion
This consistency makes skills predictable and easier to maintain.
</consistency>
<parseability>
XML provides unambiguous boundaries and semantic meaning. Claude can reliably:
- Identify section boundaries (where content starts and ends)
- Understand content purpose (what role each section plays)
- Skip irrelevant sections (progressive disclosure)
- Parse programmatically (validation tools can check structure)
Markdown headings are just visual formatting. Claude must infer meaning from heading text, which is less reliable.
</parseability>
<token_efficiency>
XML tags are more efficient than markdown headings:
**Markdown headings**:
```markdown
## Quick start
## Workflow
## Advanced features
## Success criteria
```
Total: ~20 tokens, no semantic meaning to Claude
**XML tags**:
```xml
<quick_start>
<workflow>
<advanced_features>
<success_criteria>
```
Total: ~15 tokens, semantic meaning built-in
Savings compound across all skills in the ecosystem.
</token_efficiency>
<claude_performance>
Claude performs better with pure XML because:
- Unambiguous section boundaries reduce parsing errors
- Semantic tags convey intent directly (no inference needed)
- Nested tags create clear hierarchies
- Consistent structure across skills reduces cognitive load
- Progressive disclosure works more reliably
Pure XML structure is not just a style preference—it's a performance optimization.
</claude_performance>
</why_xml>
<critical_rule>
**Remove ALL markdown headings (#, ##, ###) from skill body content.** Replace with semantic XML tags. Keep markdown formatting WITHIN content (bold, italic, lists, code blocks, links).
</critical_rule>
<required_tags>
Every skill MUST have:
- `<objective>` - What the skill does and why it matters
- `<quick_start>` - Immediate, actionable guidance
- `<success_criteria>` or `<when_successful>` - How to know it worked
See [use-xml-tags.md](use-xml-tags.md) for conditional tags and intelligence rules.
</required_tags>
</xml_structure_principle>
<conciseness_principle>
<description>
The context window is shared. Your skill shares it with the system prompt, conversation history, other skills' metadata, and the actual request.
</description>
<guidance>
Only add context Claude doesn't already have. Challenge each piece of information:
- "Does Claude really need this explanation?"
- "Can I assume Claude knows this?"
- "Does this paragraph justify its token cost?"
Assume Claude is smart. Don't explain obvious concepts.
</guidance>
<concise_example>
**Concise** (~50 tokens):
```xml
<quick_start>
Extract PDF text with pdfplumber:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
text = pdf.pages[0].extract_text()
```
</quick_start>
```
**Verbose** (~150 tokens):
```xml
<quick_start>
PDF files are a common file format used for documents. To extract text from them, we'll use a Python library called pdfplumber. First, you'll need to import the library, then open the PDF file using the open method, and finally extract the text from each page. Here's how to do it:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```
This code opens the PDF and extracts text from the first page.
</quick_start>
```
The concise version assumes Claude knows what PDFs are, understands Python imports, and can read code. All those assumptions are correct.
</concise_example>
<when_to_elaborate>
Add explanation when:
- Concept is domain-specific (not general programming knowledge)
- Pattern is non-obvious or counterintuitive
- Context affects behavior in subtle ways
- Trade-offs require judgment
Don't add explanation for:
- Common programming concepts (loops, functions, imports)
- Standard library usage (reading files, making HTTP requests)
- Well-known tools (git, npm, pip)
- Obvious next steps
</when_to_elaborate>
</conciseness_principle>
<degrees_of_freedom_principle>
<description>
Match the level of specificity to the task's fragility and variability. Give Claude more freedom for creative tasks, less freedom for fragile operations.
</description>
<high_freedom>
<when>
- Multiple approaches are valid
- Decisions depend on context
- Heuristics guide the approach
- Creative solutions welcome
</when>
<example>
```xml
<objective>
Review code for quality, bugs, and maintainability.
</objective>
<workflow>
1. Analyze the code structure and organization
2. Check for potential bugs or edge cases
3. Suggest improvements for readability and maintainability
4. Verify adherence to project conventions
</workflow>
<success_criteria>
- All major issues identified
- Suggestions are actionable and specific
- Review balances praise and criticism
</success_criteria>
```
Claude has freedom to adapt the review based on what the code needs.
</example>
</high_freedom>
<medium_freedom>
<when>
- A preferred pattern exists
- Some variation is acceptable
- Configuration affects behavior
- Template can be adapted
</when>
<example>
```xml
<objective>
Generate reports with customizable format and sections.
</objective>
<report_template>
Use this template and customize as needed:
```python
def generate_report(data, format="markdown", include_charts=True):
    # Process data
    # Generate output in specified format
    # Optionally include visualizations
    ...
```
</report_template>
<success_criteria>
- Report includes all required sections
- Format matches user preference
- Data accurately represented
</success_criteria>
```
Claude can customize the template based on requirements.
</example>
</medium_freedom>
<low_freedom>
<when>
- Operations are fragile and error-prone
- Consistency is critical
- A specific sequence must be followed
- Deviation causes failures
</when>
<example>
```xml
<objective>
Run database migration with exact sequence to prevent data loss.
</objective>
<workflow>
Run exactly this script:
```bash
python scripts/migrate.py --verify --backup
```
**Do not modify the command or add additional flags.**
</workflow>
<success_criteria>
- Migration completes without errors
- Backup created before migration
- Verification confirms data integrity
</success_criteria>
```
Claude must follow the exact command with no variation.
</example>
</low_freedom>
<matching_specificity>
The key is matching specificity to fragility:
- **Fragile operations** (database migrations, payment processing, security): Low freedom, exact instructions
- **Standard operations** (API calls, file processing, data transformation): Medium freedom, preferred pattern with flexibility
- **Creative operations** (code review, content generation, analysis): High freedom, heuristics and principles
Mismatched specificity causes problems:
- Too much freedom on fragile tasks → errors and failures
- Too little freedom on creative tasks → rigid, suboptimal outputs
</matching_specificity>
</degrees_of_freedom_principle>
<model_testing_principle>
<description>
Skills act as additions to models, so effectiveness depends on the underlying model. What works for Opus might need more detail for Haiku.
</description>
<testing_across_models>
Test your skill with all models you plan to use:
<haiku_testing>
**Claude Haiku** (fast, economical)
Questions to ask:
- Does the skill provide enough guidance?
- Are examples clear and complete?
- Do implicit assumptions become explicit?
- Does Haiku need more structure?
Haiku benefits from:
- More explicit instructions
- Complete examples (no partial code)
- Clear success criteria
- Step-by-step workflows
</haiku_testing>
<sonnet_testing>
**Claude Sonnet** (balanced)
Questions to ask:
- Is the skill clear and efficient?
- Does it avoid over-explanation?
- Are workflows well-structured?
- Does progressive disclosure work?
Sonnet benefits from:
- Balanced detail level
- XML structure for clarity
- Progressive disclosure
- Concise but complete guidance
</sonnet_testing>
<opus_testing>
**Claude Opus** (powerful reasoning)
Questions to ask:
- Does the skill avoid over-explaining?
- Can Opus infer obvious steps?
- Are constraints clear?
- Is context minimal but sufficient?
Opus benefits from:
- Concise instructions
- Principles over procedures
- High degrees of freedom
- Trust in reasoning capabilities
</opus_testing>
</testing_across_models>
<balancing_across_models>
Aim for instructions that work well across all target models:
**Good balance**:
```xml
<quick_start>
Use pdfplumber for text extraction:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```
For scanned PDFs requiring OCR, use pdf2image with pytesseract instead.
</quick_start>
```
This works for all models:
- Haiku gets complete working example
- Sonnet gets clear default with escape hatch
- Opus gets enough context without over-explanation
**Too minimal for Haiku**:
```xml
<quick_start>
Use pdfplumber for text extraction.
</quick_start>
```
**Too verbose for Opus**:
```xml
<quick_start>
PDF files are documents that contain text. To extract that text, we use a library called pdfplumber. First, import the library at the top of your Python file. Then, open the PDF file using the pdfplumber.open() method. This returns a PDF object. Access the pages attribute to get a list of pages. Each page has an extract_text() method that returns the text content...
</quick_start>
```
</balancing_across_models>
<iterative_improvement>
1. Start with medium detail level
2. Test with target models
3. Observe where models struggle or succeed
4. Adjust based on actual performance
5. Re-test and iterate
Don't optimize for one model. Find the balance that works across your target models.
</iterative_improvement>
</model_testing_principle>
<progressive_disclosure_principle>
<description>
SKILL.md serves as an overview. Reference files contain details. Claude loads reference files only when needed.
</description>
<token_efficiency>
Progressive disclosure keeps token usage proportional to task complexity:
- Simple task: Load SKILL.md only (~500 tokens)
- Medium task: Load SKILL.md + one reference (~1000 tokens)
- Complex task: Load SKILL.md + multiple references (~2000 tokens)
Without progressive disclosure, every task loads all content regardless of need.
</token_efficiency>
<implementation>
- Keep SKILL.md under 500 lines
- Split detailed content into reference files
- Keep references one level deep from SKILL.md
- Link to references from relevant sections
- Use descriptive reference file names
See [skill-structure.md](skill-structure.md) for progressive disclosure patterns.
</implementation>
</progressive_disclosure_principle>
<validation_principle>
<description>
Validation scripts are force multipliers. They catch errors that Claude might miss and provide actionable feedback.
</description>
<characteristics>
Good validation scripts:
- Provide verbose, specific error messages
- Show available valid options when something is invalid
- Pinpoint exact location of problems
- Suggest actionable fixes
- Are deterministic and reliable
See [workflows-and-validation.md](workflows-and-validation.md) for validation patterns.
</characteristics>
</validation_principle>
<principle_summary>
<xml_structure>
Use pure XML structure for consistency, parseability, and Claude performance. Required tags: objective, quick_start, success_criteria.
</xml_structure>
<conciseness>
Only add context Claude doesn't have. Assume Claude is smart. Challenge every piece of content.
</conciseness>
<degrees_of_freedom>
Match specificity to fragility. High freedom for creative tasks, low freedom for fragile operations, medium for standard work.
</degrees_of_freedom>
<model_testing>
Test with all target models. Balance detail level to work across Haiku, Sonnet, and Opus.
</model_testing>
<progressive_disclosure>
Keep SKILL.md concise. Split details into reference files. Load reference files only when needed.
</progressive_disclosure>
<validation>
Make validation scripts verbose and specific. Catch errors early with actionable feedback.
</validation>
</principle_summary>

View File

@@ -0,0 +1,175 @@
<when_to_use_scripts>
Even if Claude could write a script, pre-made scripts offer advantages:
- More reliable than generated code
- Save tokens (no need to include code in context)
- Save time (no code generation required)
- Ensure consistency across uses
<execution_vs_reference>
Make clear whether Claude should:
- **Execute the script** (most common): "Run `analyze_form.py` to extract fields"
- **Read it as reference** (for complex logic): "See `analyze_form.py` for the extraction algorithm"
For most utility scripts, execution is preferred.
</execution_vs_reference>
<how_scripts_work>
When Claude executes a script via bash:
1. Script code never enters context window
2. Only script output consumes tokens
3. Far more efficient than having Claude generate equivalent code
</how_scripts_work>
</when_to_use_scripts>
<file_organization>
<scripts_directory>
**Best practice**: Place all executable scripts in a `scripts/` subdirectory within the skill folder.
```
skill-name/
├── SKILL.md
├── scripts/
│ ├── main_utility.py
│ ├── helper_script.py
│ └── validator.py
└── references/
└── api-docs.md
```
**Benefits**:
- Keeps skill root clean and organized
- Clear separation between documentation and executable code
- Consistent pattern across all skills
- Easy to reference: `python scripts/script_name.py`
**Reference pattern**: In SKILL.md, reference scripts using the `scripts/` path:
```bash
python ~/.claude/skills/skill-name/scripts/analyze.py input.har
```
</scripts_directory>
</file_organization>
<utility_scripts_pattern>
<example>
## Utility scripts
**analyze_form.py**: Extract all form fields from PDF
```bash
python scripts/analyze_form.py input.pdf > fields.json
```
Output format:
```json
{
"field_name": { "type": "text", "x": 100, "y": 200 },
"signature": { "type": "sig", "x": 150, "y": 500 }
}
```
**validate_boxes.py**: Check for overlapping bounding boxes
```bash
python scripts/validate_boxes.py fields.json
# Returns: "OK" or lists conflicts
```
**fill_form.py**: Apply field values to PDF
```bash
python scripts/fill_form.py input.pdf fields.json output.pdf
```
</example>
</utility_scripts_pattern>
<solve_dont_punt>
Handle error conditions rather than punting to Claude.
<example type="good">
```python
def process_file(path):
"""Process a file, creating it if it doesn't exist."""
try:
with open(path) as f:
return f.read()
except FileNotFoundError:
print(f"File {path} not found, creating default")
with open(path, 'w') as f:
f.write('')
return ''
except PermissionError:
print(f"Cannot access {path}, using default")
return ''
```
</example>
<example type="bad">
```python
def process_file(path):
# Just fail and let Claude figure it out
return open(path).read()
```
</example>
<configuration_values>
Document configuration parameters to avoid "voodoo constants":
<example type="good">
```python
# HTTP requests typically complete within 30 seconds
REQUEST_TIMEOUT = 30
# Three retries balances reliability vs speed
MAX_RETRIES = 3
```
</example>
<example type="bad">
```python
TIMEOUT = 47 # Why 47?
RETRIES = 5 # Why 5?
```
</example>
</configuration_values>
</solve_dont_punt>
<package_dependencies>
<runtime_constraints>
Skills run in a code execution environment with platform-specific limitations:
- **claude.ai**: Can install packages from npm and PyPI
- **Anthropic API**: No network access and no runtime package installation
</runtime_constraints>
<guidance>
List required packages in your SKILL.md and verify they're available.
<example type="good">
Install required package: `pip install pypdf`
Then use it:
```python
from pypdf import PdfReader
reader = PdfReader("file.pdf")
```
</example>
<example type="bad">
"Use the pdf library to process the file."
</example>
</guidance>
</package_dependencies>
<mcp_tool_references>
If your Skill uses MCP (Model Context Protocol) tools, always use fully qualified tool names.
<format>ServerName:tool_name</format>
<examples>
- Use the BigQuery:bigquery_schema tool to retrieve table schemas.
- Use the GitHub:create_issue tool to create issues.
</examples>
Without the server prefix, Claude may fail to locate the tool, especially when multiple MCP servers are available.
</mcp_tool_references>

View File

@@ -0,0 +1,474 @@
<overview>
Skills improve through iteration and testing. This reference covers evaluation-driven development, Claude A/B testing patterns, and XML structure validation during testing.
</overview>
<evaluation_driven_development>
<principle>
Create evaluations BEFORE writing extensive documentation. This ensures your skill solves real problems rather than documenting imagined ones.
</principle>
<workflow>
<step_1>
**Identify gaps**: Run Claude on representative tasks without a skill. Document specific failures or missing context.
</step_1>
<step_2>
**Create evaluations**: Build three scenarios that test these gaps.
</step_2>
<step_3>
**Establish baseline**: Measure Claude's performance without the skill.
</step_3>
<step_4>
**Write minimal instructions**: Create just enough content to address the gaps and pass evaluations.
</step_4>
<step_5>
**Iterate**: Execute evaluations, compare against baseline, and refine.
</step_5>
</workflow>
<evaluation_structure>
```json
{
"skills": ["pdf-processing"],
"query": "Extract all text from this PDF file and save it to output.txt",
"files": ["test-files/document.pdf"],
"expected_behavior": [
"Successfully reads the PDF file using appropriate library",
"Extracts text content from all pages without missing any",
"Saves extracted text to output.txt in clear, readable format"
]
}
```
</evaluation_structure>
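A minimal runner sketch for working with evaluations like the one above. It assumes they live as JSON files in an `evals/` directory and that you record observed behaviors yourself (or via your own harness); it is illustrative, not part of any official tooling:
```python
import json
from pathlib import Path

def load_evaluations(eval_dir: str) -> list[dict]:
    """Load every evaluation JSON file from a directory."""
    return [json.loads(p.read_text()) for p in sorted(Path(eval_dir).glob("*.json"))]

def score_run(evaluation: dict, observed_behaviors: list[str]) -> dict:
    """Compare observed behaviors against the evaluation's expected_behavior list."""
    expected = evaluation["expected_behavior"]
    met = [b for b in expected if b in observed_behaviors]
    return {
        "query": evaluation["query"],
        "met": len(met),
        "expected": len(expected),
        "passed": len(met) == len(expected),
    }

if __name__ == "__main__":
    results = []
    for evaluation in load_evaluations("evals"):
        # Run the task (manually or via your own harness), then record which
        # expected behaviors you observed before scoring the run.
        observed: list[str] = []  # fill in after reviewing the transcript
        results.append(score_run(evaluation, observed))
    passed = sum(r["passed"] for r in results)
    print(f"{passed}/{len(results)} evaluations passed")
```
Run it once without the skill to establish the baseline, then after each iteration to track improvement.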
<why_evaluations_first>
- Prevents documenting imagined problems
- Forces clarity about what success looks like
- Provides objective measurement of skill effectiveness
- Keeps skill focused on actual needs
- Enables quantitative improvement tracking
</why_evaluations_first>
</evaluation_driven_development>
<iterative_development_with_claude>
<principle>
The most effective skill development uses Claude itself. Work with "Claude A" (expert who helps refine) to create skills used by "Claude B" (agent executing tasks).
</principle>
<creating_skills>
<workflow>
<step_1>
**Complete task without skill**: Work through problem with Claude A, noting what context you repeatedly provide.
</step_1>
<step_2>
**Ask Claude A to create skill**: "Create a skill that captures this pattern we just used"
</step_2>
<step_3>
**Review for conciseness**: Remove unnecessary explanations.
</step_3>
<step_4>
**Improve architecture**: Organize content with progressive disclosure.
</step_4>
<step_5>
**Test with Claude B**: Use fresh instance to test on real tasks.
</step_5>
<step_6>
**Iterate based on observation**: Return to Claude A with specific issues observed.
</step_6>
</workflow>
<insight>
Claude models understand skill format natively. Simply ask Claude to create a skill and it will generate properly structured SKILL.md content.
</insight>
</creating_skills>
<improving_skills>
<workflow>
<step_1>
**Use skill in real workflows**: Give Claude B actual tasks.
</step_1>
<step_2>
**Observe behavior**: Where does it struggle, succeed, or make unexpected choices?
</step_2>
<step_3>
**Return to Claude A**: Share observations and current SKILL.md.
</step_3>
<step_4>
**Review suggestions**: Claude A might suggest reorganization, stronger language, or workflow restructuring.
</step_4>
<step_5>
**Apply and test**: Update skill and test again.
</step_5>
<step_6>
**Repeat**: Continue based on real usage, not assumptions.
</step_6>
</workflow>
<what_to_watch_for>
- **Unexpected exploration paths**: Structure might not be intuitive
- **Missed connections**: Links might need to be more explicit
- **Overreliance on sections**: Consider moving frequently-read content to main SKILL.md
- **Ignored content**: Poorly signaled or unnecessary files
- **Critical metadata**: The name and description in your skill's metadata drive discovery; revisit them if the skill isn't loaded when expected
</what_to_watch_for>
</improving_skills>
</iterative_development_with_claude>
<model_testing>
<principle>
Test with all models you plan to use. Different models have different strengths and need different levels of detail.
</principle>
<haiku_testing>
**Claude Haiku** (fast, economical)
Questions to ask:
- Does the skill provide enough guidance?
- Are examples clear and complete?
- Do implicit assumptions become explicit?
- Does Haiku need more structure?
Haiku benefits from:
- More explicit instructions
- Complete examples (no partial code)
- Clear success criteria
- Step-by-step workflows
</haiku_testing>
<sonnet_testing>
**Claude Sonnet** (balanced)
Questions to ask:
- Is the skill clear and efficient?
- Does it avoid over-explanation?
- Are workflows well-structured?
- Does progressive disclosure work?
Sonnet benefits from:
- Balanced detail level
- XML structure for clarity
- Progressive disclosure
- Concise but complete guidance
</sonnet_testing>
<opus_testing>
**Claude Opus** (powerful reasoning)
Questions to ask:
- Does the skill avoid over-explaining?
- Can Opus infer obvious steps?
- Are constraints clear?
- Is context minimal but sufficient?
Opus benefits from:
- Concise instructions
- Principles over procedures
- High degrees of freedom
- Trust in reasoning capabilities
</opus_testing>
<balancing_across_models>
What works for Opus might need more detail for Haiku. Aim for instructions that work well across every model you plan to support rather than optimizing for just one.
See [core-principles.md](core-principles.md) for model testing examples.
</balancing_across_models>
</model_testing>
<xml_structure_validation>
<principle>
During testing, validate that your skill's XML structure is correct and complete.
</principle>
<validation_checklist>
After updating a skill, verify:
<required_tags_present>
- ✅ `<objective>` tag exists and defines what the skill does
- ✅ `<quick_start>` tag exists with immediate guidance
- ✅ `<success_criteria>` or `<when_successful>` tag exists
</required_tags_present>
<no_markdown_headings>
- ✅ No `#`, `##`, or `###` headings in skill body
- ✅ All sections use XML tags instead
- ✅ Markdown formatting within tags is preserved (bold, italic, lists, code blocks)
</no_markdown_headings>
<proper_xml_nesting>
- ✅ All XML tags properly closed
- ✅ Nested tags have correct hierarchy
- ✅ No unclosed tags
</proper_xml_nesting>
<conditional_tags_appropriate>
- ✅ Conditional tags match skill complexity
- ✅ Simple skills use required tags only
- ✅ Complex skills add appropriate conditional tags
- ✅ No over-engineering or under-specifying
</conditional_tags_appropriate>
<reference_files_check>
- ✅ Reference files also use pure XML structure
- ✅ Links to reference files are correct
- ✅ References are one level deep from SKILL.md
</reference_files_check>
</validation_checklist>
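A rough automated pass over this checklist is sketched below. It covers only the mechanical items (required tags, stray headings, tag balance), assumes a flat SKILL.md, and supplements rather than replaces reading the file:
```python
import re
import sys
from pathlib import Path

REQUIRED = ("objective", "quick_start")
SUCCESS = ("success_criteria", "when_successful")

def check_skill(path: str) -> list[str]:
    """Rough structural check of a SKILL.md body."""
    raw = Path(path).read_text()
    # Drop fenced code blocks so XML or headings inside examples are not counted
    body = re.sub(r"```.*?```", "", raw, flags=re.DOTALL)
    problems = []
    for tag in REQUIRED:
        if f"<{tag}>" not in body:
            problems.append(f"Missing required tag: <{tag}>")
    if not any(f"<{tag}>" in body for tag in SUCCESS):
        problems.append("Missing <success_criteria> or <when_successful>")
    for line in body.splitlines():
        if re.match(r"^#{1,6} ", line):
            problems.append(f"Markdown heading in body: {line.strip()}")
    opened = re.findall(r"<([a-z][a-z0-9_]*)(?:\s[^>]*)?>", body)
    closed = re.findall(r"</([a-z][a-z0-9_]*)>", body)
    for tag in sorted(set(opened)):
        if opened.count(tag) != closed.count(tag):
            problems.append(f"Tag <{tag}> looks unbalanced (missing closing tag?)")
    return problems

if __name__ == "__main__":
    issues = check_skill(sys.argv[1])
    print("\n".join(issues) if issues else "OK")
```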
<testing_xml_during_iteration>
When iterating on a skill:
1. Make changes to XML structure
2. **Validate XML structure** (check tags, nesting, completeness)
3. Test with Claude on representative tasks
4. Observe if XML structure aids or hinders Claude's understanding
5. Iterate structure based on actual performance
</testing_xml_during_iteration>
</xml_structure_validation>
<observation_based_iteration>
<principle>
Iterate based on what you observe, not what you assume. Real usage reveals issues assumptions miss.
</principle>
<observation_categories>
<what_claude_reads>
Which sections does Claude actually read? Which are ignored? This reveals:
- Relevance of content
- Effectiveness of progressive disclosure
- Whether section names are clear
</what_claude_reads>
<where_claude_struggles>
Which tasks cause confusion or errors? This reveals:
- Missing context
- Unclear instructions
- Insufficient examples
- Ambiguous requirements
</where_claude_struggles>
<where_claude_succeeds>
Which tasks go smoothly? This reveals:
- Effective patterns
- Good examples
- Clear instructions
- Appropriate detail level
</where_claude_succeeds>
<unexpected_behaviors>
What does Claude do that surprises you? This reveals:
- Unstated assumptions
- Ambiguous phrasing
- Missing constraints
- Alternative interpretations
</unexpected_behaviors>
</observation_categories>
<iteration_pattern>
1. **Observe**: Run Claude on real tasks with current skill
2. **Document**: Note specific issues, not general feelings
3. **Hypothesize**: Why did this issue occur?
4. **Fix**: Make targeted changes to address specific issues
5. **Test**: Verify fix works on same scenario
6. **Validate**: Ensure fix doesn't break other scenarios
7. **Repeat**: Continue with next observed issue
</iteration_pattern>
</observation_based_iteration>
<progressive_refinement>
<principle>
Skills don't need to be perfect initially. Start minimal, observe usage, add what's missing.
</principle>
<initial_version>
Start with:
- Valid YAML frontmatter
- Required XML tags: objective, quick_start, success_criteria
- Minimal working example
- Basic success criteria
Skip initially:
- Extensive examples
- Edge case documentation
- Advanced features
- Detailed reference files
</initial_version>
<iteration_additions>
Add through iteration:
- Examples when patterns aren't clear from description
- Edge cases when observed in real usage
- Advanced features when users need them
- Reference files when SKILL.md approaches 500 lines
- Validation scripts when errors are common
</iteration_additions>
<benefits>
- Faster to initial working version
- Additions solve real needs, not imagined ones
- Keeps skills focused and concise
- Progressive disclosure emerges naturally
- Documentation stays aligned with actual usage
</benefits>
</progressive_refinement>
<testing_discovery>
<principle>
Test that Claude can discover and use your skill when appropriate.
</principle>
<discovery_testing>
<test_description>
Test if Claude loads your skill when it should:
1. Start fresh conversation (Claude B)
2. Ask question that should trigger skill
3. Check if skill was loaded
4. Verify skill was used appropriately
</test_description>
<description_quality>
If skill isn't discovered:
- Check description includes trigger keywords
- Verify description is specific, not vague
- Ensure description explains when to use skill
- Test with different phrasings of the same request
The description is Claude's primary discovery mechanism.
</description_quality>
</discovery_testing>
</testing_discovery>
<common_iteration_patterns>
<pattern name="too_verbose">
**Observation**: Skill works but uses lots of tokens
**Fix**:
- Remove obvious explanations
- Assume Claude knows common concepts
- Use examples instead of lengthy descriptions
- Move advanced content to reference files
</pattern>
<pattern name="too_minimal">
**Observation**: Claude makes incorrect assumptions or misses steps
**Fix**:
- Add explicit instructions where assumptions fail
- Provide complete working examples
- Define edge cases
- Add validation steps
</pattern>
<pattern name="poor_discovery">
**Observation**: Skill exists but Claude doesn't load it when needed
**Fix**:
- Improve description with specific triggers
- Add relevant keywords
- Test description against actual user queries
- Make description more specific about use cases
</pattern>
<pattern name="unclear_structure">
**Observation**: Claude reads wrong sections or misses relevant content
**Fix**:
- Use clearer XML tag names
- Reorganize content hierarchy
- Move frequently-needed content earlier
- Add explicit links to relevant sections
</pattern>
<pattern name="incomplete_examples">
**Observation**: Claude produces outputs that don't match expected pattern
**Fix**:
- Add more examples showing pattern
- Make examples more complete
- Show edge cases in examples
- Add anti-pattern examples (what not to do)
</pattern>
</common_iteration_patterns>
<iteration_velocity>
<principle>
Small, frequent iterations beat large, infrequent rewrites.
</principle>
<fast_iteration>
**Good approach**:
1. Make one targeted change
2. Test on specific scenario
3. Verify improvement
4. Commit change
5. Move to next issue
Total time: Minutes per iteration
Iterations per day: 10-20
Learning rate: High
</fast_iteration>
<slow_iteration>
**Problematic approach**:
1. Accumulate many issues
2. Make large refactor
3. Test everything at once
4. Debug multiple issues simultaneously
5. Hard to know what fixed what
Total time: Hours per iteration
Iterations per day: 1-2
Learning rate: Low
</slow_iteration>
<benefits_of_fast_iteration>
- Isolate cause and effect
- Build pattern recognition faster
- Less wasted work from wrong directions
- Easier to revert if needed
- Maintains momentum
</benefits_of_fast_iteration>
</iteration_velocity>
<success_metrics>
<principle>
Define how you'll measure if the skill is working. Quantify success.
</principle>
<objective_metrics>
- **Success rate**: Percentage of tasks completed correctly
- **Token usage**: Average tokens consumed per task
- **Iteration count**: How many tries to get correct output
- **Error rate**: Percentage of tasks with errors
- **Discovery rate**: How often skill loads when it should
</objective_metrics>
<subjective_metrics>
- **Output quality**: Does output meet requirements?
- **Appropriate detail**: Too verbose or too minimal?
- **Claude confidence**: Does Claude seem uncertain?
- **User satisfaction**: Does skill solve the actual problem?
</subjective_metrics>
<tracking_improvement>
Compare metrics before and after changes:
- Baseline: Measure without skill
- Initial: Measure with first version
- Iteration N: Measure after each change
Track which changes improve which metrics. Double down on effective patterns.
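For example, a tiny sketch comparing metrics across runs (the numbers are illustrative):
```python
baseline = {"success_rate": 0.40, "avg_tokens": 2100, "error_rate": 0.30}
iteration_3 = {"success_rate": 0.80, "avg_tokens": 1400, "error_rate": 0.10}

# Print the before/after value and the delta for each tracked metric
for metric in baseline:
    before, after = baseline[metric], iteration_3[metric]
    print(f"{metric}: {before} -> {after} ({after - before:+.2f})")
```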
</tracking_improvement>
</success_metrics>

View File

@@ -0,0 +1,168 @@
# Recommended Skill Structure
The optimal structure for complex skills separates routing, workflows, and knowledge.
<structure>
```
skill-name/
├── SKILL.md # Router + essential principles (unavoidable)
├── workflows/ # Step-by-step procedures (how)
│ ├── workflow-a.md
│ ├── workflow-b.md
│ └── ...
└── references/ # Domain knowledge (what)
├── reference-a.md
├── reference-b.md
└── ...
```
</structure>
<why_this_works>
## Problems This Solves
**Problem 1: Context gets skipped**
When important principles are in a separate file, Claude may not read them.
**Solution:** Put essential principles directly in SKILL.md. They load automatically.
**Problem 2: Wrong context loaded**
A "build" task loads debugging references. A "debug" task loads build references.
**Solution:** Intake question determines intent → routes to specific workflow → workflow specifies which references to read.
**Problem 3: Monolithic skills are overwhelming**
500+ lines of mixed content makes it hard to find relevant parts.
**Solution:** Small router (SKILL.md) + focused workflows + reference library.
**Problem 4: Procedures mixed with knowledge**
"How to do X" mixed with "What X means" creates confusion.
**Solution:** Workflows are procedures (steps). References are knowledge (patterns, examples).
</why_this_works>
<skill_md_template>
## SKILL.md Template
```markdown
---
name: skill-name
description: What it does and when to use it.
---
<essential_principles>
## How This Skill Works
[Inline principles that apply to ALL workflows. Cannot be skipped.]
### Principle 1: [Name]
[Brief explanation]
### Principle 2: [Name]
[Brief explanation]
</essential_principles>
<intake>
**Ask the user:**
What would you like to do?
1. [Option A]
2. [Option B]
3. [Option C]
4. Something else
**Wait for response before proceeding.**
</intake>
<routing>
| Response | Workflow |
|----------|----------|
| 1, "keyword", "keyword" | `workflows/option-a.md` |
| 2, "keyword", "keyword" | `workflows/option-b.md` |
| 3, "keyword", "keyword" | `workflows/option-c.md` |
| 4, other | Clarify, then select |
**After reading the workflow, follow it exactly.**
</routing>
<reference_index>
All domain knowledge in `references/`:
**Category A:** file-a.md, file-b.md
**Category B:** file-c.md, file-d.md
</reference_index>
<workflows_index>
| Workflow | Purpose |
|----------|---------|
| option-a.md | [What it does] |
| option-b.md | [What it does] |
| option-c.md | [What it does] |
</workflows_index>
```
</skill_md_template>
<workflow_template>
## Workflow Template
```markdown
# Workflow: [Name]
<required_reading>
**Read these reference files NOW:**
1. references/relevant-file.md
2. references/another-file.md
</required_reading>
<process>
## Step 1: [Name]
[What to do]
## Step 2: [Name]
[What to do]
## Step 3: [Name]
[What to do]
</process>
<success_criteria>
This workflow is complete when:
- [ ] Criterion 1
- [ ] Criterion 2
- [ ] Criterion 3
</success_criteria>
```
</workflow_template>
<when_to_use_this_pattern>
## When to Use This Pattern
**Use router + workflows + references when:**
- Multiple distinct workflows (build vs debug vs ship)
- Different workflows need different references
- Essential principles must not be skipped
- Skill has grown beyond 200 lines
**Use simple single-file skill when:**
- One workflow
- Small reference set
- Under 200 lines total
- No essential principles to enforce
</when_to_use_this_pattern>
<key_insight>
## The Key Insight
**SKILL.md is always loaded. Use this guarantee.**
Put unavoidable content in SKILL.md:
- Essential principles
- Intake question
- Routing logic
Put workflow-specific content in workflows/:
- Step-by-step procedures
- Required references for that workflow
- Success criteria for that workflow
Put reusable knowledge in references/:
- Patterns and examples
- Technical details
- Domain expertise
</key_insight>

View File

@@ -0,0 +1,372 @@
<overview>
Skills have three structural components: YAML frontmatter (metadata), pure XML body structure (content organization), and progressive disclosure (file organization). This reference defines requirements and best practices for each component.
</overview>
<xml_structure_requirements>
<critical_rule>
**Remove ALL markdown headings (#, ##, ###) from skill body content.** Replace with semantic XML tags. Keep markdown formatting WITHIN content (bold, italic, lists, code blocks, links).
</critical_rule>
<required_tags>
Every skill MUST have these three tags:
- **`<objective>`** - What the skill does and why it matters (1-3 paragraphs)
- **`<quick_start>`** - Immediate, actionable guidance (minimal working example)
- **`<success_criteria>`** or **`<when_successful>`** - How to know it worked
</required_tags>
<conditional_tags>
Add based on skill complexity and domain requirements:
- **`<context>`** - Background/situational information
- **`<workflow>` or `<process>`** - Step-by-step procedures
- **`<advanced_features>`** - Deep-dive topics (progressive disclosure)
- **`<validation>`** - How to verify outputs
- **`<examples>`** - Multi-shot learning
- **`<anti_patterns>`** - Common mistakes to avoid
- **`<security_checklist>`** - Non-negotiable security patterns
- **`<testing>`** - Testing workflows
- **`<common_patterns>`** - Code examples and recipes
- **`<reference_guides>` or `<detailed_references>`** - Links to reference files
See [use-xml-tags.md](use-xml-tags.md) for detailed guidance on each tag.
</conditional_tags>
<tag_selection_intelligence>
**Simple skills** (single domain, straightforward):
- Required tags only
- Example: Text extraction, file format conversion
**Medium skills** (multiple patterns, some complexity):
- Required tags + workflow/examples as needed
- Example: Document processing with steps, API integration
**Complex skills** (multiple domains, security, APIs):
- Required tags + conditional tags as appropriate
- Example: Payment processing, authentication systems, multi-step workflows
</tag_selection_intelligence>
<xml_nesting>
Properly nest XML tags for hierarchical content:
```xml
<examples>
<example number="1">
<input>User input</input>
<output>Expected output</output>
</example>
</examples>
```
Always close tags:
```xml
<objective>
Content here
</objective>
```
</xml_nesting>
<tag_naming_conventions>
Use descriptive, semantic names:
- `<workflow>` not `<steps>`
- `<success_criteria>` not `<done>`
- `<anti_patterns>` not `<dont_do>`
Be consistent within your skill. If you use `<workflow>`, don't also use `<process>` for the same purpose (unless they serve different roles).
</tag_naming_conventions>
</xml_structure_requirements>
<yaml_requirements>
<required_fields>
```yaml
---
name: skill-name-here
description: What it does and when to use it (third person, specific triggers)
---
```
</required_fields>
<name_field>
**Validation rules**:
- Maximum 64 characters
- Lowercase letters, numbers, hyphens only
- No XML tags
- No reserved words: "anthropic", "claude"
- Must match directory name exactly
**Examples**:
- ✅ `process-pdfs`
- ✅ `manage-facebook-ads`
- ✅ `setup-stripe-payments`
- ❌ `PDF_Processor` (uppercase)
- ❌ `helper` (vague)
- ❌ `claude-helper` (reserved word)
</name_field>
<description_field>
**Validation rules**:
- Non-empty, maximum 1024 characters
- No XML tags
- Third person (never first or second person)
- Include what it does AND when to use it
**Critical rule**: Always write in third person.
- ✅ "Processes Excel files and generates reports"
- ❌ "I can help you process Excel files"
- ❌ "You can use this to process Excel files"
**Structure**: Include both capabilities and triggers.
**Effective examples**:
```yaml
description: Extract text and tables from PDF files, fill forms, merge documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.
```
```yaml
description: Analyze Excel spreadsheets, create pivot tables, generate charts. Use when analyzing Excel files, spreadsheets, tabular data, or .xlsx files.
```
```yaml
description: Generate descriptive commit messages by analyzing git diffs. Use when the user asks for help writing commit messages or reviewing staged changes.
```
**Avoid**:
```yaml
description: Helps with documents
```
```yaml
description: Processes data
```
</description_field>
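A sketch of how these frontmatter rules could be checked automatically; the function name and example call are illustrative, and third-person phrasing still needs a human read:
```python
import re
from pathlib import Path

RESERVED = ("anthropic", "claude")

def check_frontmatter(skill_dir: str, name: str, description: str) -> list[str]:
    """Check the name and description fields against the rules above."""
    problems = []
    if len(name) > 64:
        problems.append("name exceeds 64 characters")
    if not re.fullmatch(r"[a-z0-9-]+", name):
        problems.append("name must use lowercase letters, numbers, and hyphens only")
    if any(word in name for word in RESERVED):
        problems.append("name must not contain reserved words: anthropic, claude")
    if name != Path(skill_dir).name:
        problems.append(f"name '{name}' does not match directory '{Path(skill_dir).name}'")
    if not description or len(description) > 1024:
        problems.append("description must be non-empty and at most 1024 characters")
    if re.search(r"</?[a-z_]+>", description):
        problems.append("description must not contain XML tags")
    # Third-person phrasing and trigger keywords still need a human review.
    return problems

print(check_frontmatter(
    "skills/process-pdfs",
    "process-pdfs",
    "Extract text and tables from PDF files. Use when working with PDFs or forms.",
))
```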
</yaml_requirements>
<naming_conventions>
Use **verb-noun convention** for skill names:
<pattern name="create">
Building/authoring tools
Examples: `create-agent-skills`, `create-hooks`, `create-landing-pages`
</pattern>
<pattern name="manage">
Managing external services or resources
Examples: `manage-facebook-ads`, `manage-zoom`, `manage-stripe`, `manage-supabase`
</pattern>
<pattern name="setup">
Configuration/integration tasks
Examples: `setup-stripe-payments`, `setup-meta-tracking`
</pattern>
<pattern name="generate">
Generation tasks
Examples: `generate-ai-images`
</pattern>
<avoid_patterns>
- Vague: `helper`, `utils`, `tools`
- Generic: `documents`, `data`, `files`
- Reserved words: `anthropic-helper`, `claude-tools`
- Inconsistent: Directory `facebook-ads` but name `facebook-ads-manager`
</avoid_patterns>
</naming_conventions>
<progressive_disclosure>
<principle>
SKILL.md serves as an overview that points to detailed materials as needed. This keeps context window usage efficient.
</principle>
<practical_guidance>
- Keep SKILL.md body under 500 lines
- Split content into separate files when approaching this limit
- Keep references one level deep from SKILL.md
- Add table of contents to reference files over 100 lines
</practical_guidance>
<pattern name="high_level_guide">
Quick start in SKILL.md, details in reference files:
```markdown
---
name: pdf-processing
description: Extracts text and tables from PDF files, fills forms, and merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.
---
<objective>
Extract text and tables from PDF files, fill forms, and merge documents using Python libraries.
</objective>
<quick_start>
Extract text with pdfplumber:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
text = pdf.pages[0].extract_text()
```
</quick_start>
<advanced_features>
**Form filling**: See [forms.md](forms.md)
**API reference**: See [reference.md](reference.md)
</advanced_features>
```
Claude loads forms.md or reference.md only when needed.
</pattern>
<pattern name="domain_organization">
For skills with multiple domains, organize by domain to avoid loading irrelevant context:
```
bigquery-skill/
├── SKILL.md (overview and navigation)
└── reference/
├── finance.md (revenue, billing metrics)
├── sales.md (opportunities, pipeline)
├── product.md (API usage, features)
└── marketing.md (campaigns, attribution)
```
When the user asks about revenue, Claude reads only finance.md. The other files stay on the filesystem, consuming zero tokens.
</pattern>
<pattern name="conditional_details">
Show basic content in SKILL.md, link to advanced in reference files:
```xml
<objective>
Process DOCX files with creation and editing capabilities.
</objective>
<quick_start>
<creating_documents>
Use docx-js for new documents. See [docx-js.md](docx-js.md).
</creating_documents>
<editing_documents>
For simple edits, modify XML directly.
**For tracked changes**: See [redlining.md](redlining.md)
**For OOXML details**: See [ooxml.md](ooxml.md)
</editing_documents>
</quick_start>
```
Claude reads redlining.md or ooxml.md only when the user needs those features.
</pattern>
<critical_rules>
**Keep references one level deep**: All reference files should link directly from SKILL.md. Avoid nested references (SKILL.md → advanced.md → details.md) as Claude may only partially read deeply nested files.
**Add table of contents to long files**: For reference files over 100 lines, include a table of contents at the top.
**Use pure XML in reference files**: Reference files should also use pure XML structure (no markdown headings in body).
</critical_rules>
</progressive_disclosure>
<file_organization>
<filesystem_navigation>
Claude navigates your skill directory using bash commands:
- Use forward slashes: `reference/guide.md` (not `reference\guide.md`)
- Name files descriptively: `form_validation_rules.md` (not `doc2.md`)
- Organize by domain: `reference/finance.md`, `reference/sales.md`
</filesystem_navigation>
<directory_structure>
Typical skill structure:
```
skill-name/
├── SKILL.md (main entry point, pure XML structure)
├── references/ (optional, for progressive disclosure)
│ ├── guide-1.md (pure XML structure)
│ ├── guide-2.md (pure XML structure)
│ └── examples.md (pure XML structure)
└── scripts/ (optional, for utility scripts)
├── validate.py
└── process.py
```
</directory_structure>
</file_organization>
<anti_patterns>
<pitfall name="markdown_headings_in_body">
❌ Do NOT use markdown headings in skill body:
```markdown
# PDF Processing
## Quick start
Extract text...
## Advanced features
Form filling...
```
✅ Use pure XML structure:
```xml
<objective>
PDF processing with text extraction, form filling, and merging.
</objective>
<quick_start>
Extract text...
</quick_start>
<advanced_features>
Form filling...
</advanced_features>
```
</pitfall>
<pitfall name="vague_descriptions">
- ❌ "Helps with documents"
- ✅ "Extract text and tables from PDF files, fill forms, merge documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction."
</pitfall>
<pitfall name="inconsistent_pov">
- ❌ "I can help you process Excel files"
- ✅ "Processes Excel files and generates reports"
</pitfall>
<pitfall name="wrong_naming_convention">
- ❌ Directory: `facebook-ads`, Name: `facebook-ads-manager`
- ✅ Directory: `manage-facebook-ads`, Name: `manage-facebook-ads`
- ❌ Directory: `stripe-integration`, Name: `stripe`
- ✅ Directory: `setup-stripe-payments`, Name: `setup-stripe-payments`
</pitfall>
<pitfall name="deeply_nested_references">
Keep references one level deep from SKILL.md. Claude may only partially read nested files (SKILL.md → advanced.md → details.md).
</pitfall>
<pitfall name="windows_paths">
Always use forward slashes: `scripts/helper.py` (not `scripts\helper.py`)
</pitfall>
<pitfall name="missing_required_tags">
Every skill must have: `<objective>`, `<quick_start>`, and `<success_criteria>` (or `<when_successful>`).
</pitfall>
</anti_patterns>
<validation_checklist>
Before finalizing a skill, verify:
- ✅ YAML frontmatter valid (name matches directory, description in third person)
- ✅ No markdown headings in body (pure XML structure)
- ✅ Required tags present: objective, quick_start, success_criteria
- ✅ Conditional tags appropriate for complexity level
- ✅ All XML tags properly closed
- ✅ Progressive disclosure applied (SKILL.md < 500 lines)
- ✅ Reference files use pure XML structure
- ✅ File paths use forward slashes
- ✅ Descriptive file names
</validation_checklist>

View File

@@ -0,0 +1,466 @@
<overview>
Skills use pure XML structure for consistent parsing, efficient token usage, and improved Claude performance. This reference defines the required and conditional XML tags for skill authoring, along with intelligence rules for tag selection.
</overview>
<critical_rule>
**Remove ALL markdown headings (#, ##, ###) from skill body content.** Replace with semantic XML tags. Keep markdown formatting WITHIN content (bold, italic, lists, code blocks, links).
</critical_rule>
<required_tags>
Every skill MUST have these three tags:
<tag name="objective">
**Purpose**: What the skill does and why it matters. Sets context and scope.
**Content**: 1-3 paragraphs explaining the skill's purpose, domain, and value proposition.
**Example**:
```xml
<objective>
Extract text and tables from PDF files, fill forms, and merge documents using Python libraries. This skill provides patterns for common PDF operations without requiring external services or APIs.
</objective>
```
</tag>
<tag name="quick_start">
**Purpose**: Immediate, actionable guidance. Gets Claude started quickly without reading advanced sections.
**Content**: Minimal working example, essential commands, or basic usage pattern.
**Example**:
```xml
<quick_start>
Extract text with pdfplumber:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
text = pdf.pages[0].extract_text()
```
</quick_start>
```
</tag>
<tag name="success_criteria">
**Purpose**: How to know the task worked. Defines completion criteria.
**Alternative name**: `<when_successful>` (use whichever fits better)
**Content**: Clear criteria for successful execution, validation steps, or expected outputs.
**Example**:
```xml
<success_criteria>
A well-structured skill has:
- Valid YAML frontmatter with descriptive name and description
- Pure XML structure with no markdown headings in body
- Required tags: objective, quick_start, success_criteria
- Progressive disclosure (SKILL.md < 500 lines, details in reference files)
- Real-world testing and iteration based on observed behavior
</success_criteria>
```
</tag>
</required_tags>
<conditional_tags>
Add these tags based on skill complexity and domain requirements:
<tag name="context">
**When to use**: Background or situational information that Claude needs before starting.
**Example**:
```xml
<context>
The Facebook Marketing API uses a hierarchy: Account → Campaign → Ad Set → Ad. Each level has different configuration options and requires specific permissions. Always verify API access before making changes.
</context>
```
</tag>
<tag name="workflow">
**When to use**: Step-by-step procedures, sequential operations, multi-step processes.
**Alternative name**: `<process>`
**Example**:
```xml
<workflow>
1. **Analyze the form**: Run analyze_form.py to extract field definitions
2. **Create field mapping**: Edit fields.json with values
3. **Validate mapping**: Run validate_fields.py
4. **Fill the form**: Run fill_form.py
5. **Verify output**: Check generated PDF
</workflow>
```
</tag>
<tag name="advanced_features">
**When to use**: Deep-dive topics that most users won't need (progressive disclosure).
**Example**:
```xml
<advanced_features>
**Custom styling**: See [styling.md](styling.md)
**Template inheritance**: See [templates.md](templates.md)
**API reference**: See [reference.md](reference.md)
</advanced_features>
```
</tag>
<tag name="validation">
**When to use**: Skills with verification steps, quality checks, or validation scripts.
**Example**:
```xml
<validation>
After making changes, validate immediately:
```bash
python scripts/validate.py output_dir/
```
Only proceed when validation passes. If errors occur, review and fix before continuing.
</validation>
```
</tag>
<tag name="examples">
**When to use**: Multi-shot learning, input/output pairs, demonstrating patterns.
**Example**:
```xml
<examples>
<example number="1">
<input>User clicked signup button</input>
<output>track('signup_initiated', { source: 'homepage' })</output>
</example>
<example number="2">
<input>Purchase completed</input>
<output>track('purchase', { value: 49.99, currency: 'USD' })</output>
</example>
</examples>
```
</tag>
<tag name="anti_patterns">
**When to use**: Common mistakes that Claude should avoid.
**Example**:
```xml
<anti_patterns>
<pitfall name="vague_descriptions">
- ❌ "Helps with documents"
- ✅ "Extract text and tables from PDF files"
</pitfall>
<pitfall name="too_many_options">
- ❌ "You can use pypdf, or pdfplumber, or PyMuPDF..."
- ✅ "Use pdfplumber for text extraction. For OCR, use pytesseract instead."
</pitfall>
</anti_patterns>
```
</tag>
<tag name="security_checklist">
**When to use**: Skills with security implications (API keys, payments, authentication).
**Example**:
```xml
<security_checklist>
- Never log API keys or tokens
- Always use environment variables for credentials
- Validate all user input before API calls
- Use HTTPS for all external requests
- Check API response status before proceeding
</security_checklist>
```
</tag>
<tag name="testing">
**When to use**: Testing workflows, test patterns, or validation steps.
**Example**:
```xml
<testing>
Test with all target models (Haiku, Sonnet, Opus):
1. Run skill on representative tasks
2. Observe where Claude struggles or succeeds
3. Iterate based on actual behavior
4. Validate XML structure after changes
</testing>
```
</tag>
<tag name="common_patterns">
**When to use**: Code examples, recipes, or reusable patterns.
**Example**:
```xml
<common_patterns>
<pattern name="error_handling">
```python
try:
result = process_file(path)
except FileNotFoundError:
print(f"File not found: {path}")
except Exception as e:
print(f"Error: {e}")
```
</pattern>
</common_patterns>
```
</tag>
<tag name="reference_guides">
**When to use**: Links to detailed reference files (progressive disclosure).
**Alternative name**: `<detailed_references>`
**Example**:
```xml
<reference_guides>
For deeper topics, see reference files:
**API operations**: [references/api-operations.md](references/api-operations.md)
**Security patterns**: [references/security.md](references/security.md)
**Troubleshooting**: [references/troubleshooting.md](references/troubleshooting.md)
</reference_guides>
```
</tag>
</conditional_tags>
<intelligence_rules>
<decision_tree>
**Simple skills** (single domain, straightforward):
- Required tags only: objective, quick_start, success_criteria
- Example: Text extraction, file format conversion, simple calculations
**Medium skills** (multiple patterns, some complexity):
- Required tags + workflow/examples as needed
- Example: Document processing with steps, API integration with configuration
**Complex skills** (multiple domains, security, APIs):
- Required tags + conditional tags as appropriate
- Example: Payment processing, authentication systems, multi-step workflows with validation
</decision_tree>
<principle>
Don't over-engineer simple skills. Don't under-specify complex skills. Match tag selection to actual complexity and user needs.
</principle>
<when_to_add_conditional>
Ask these questions:
- **Context needed?** → Add `<context>`
- **Multi-step process?** → Add `<workflow>` or `<process>`
- **Advanced topics to hide?** → Add `<advanced_features>` + reference files
- **Validation required?** → Add `<validation>`
- **Pattern demonstration?** → Add `<examples>`
- **Common mistakes?** → Add `<anti_patterns>`
- **Security concerns?** → Add `<security_checklist>`
- **Testing guidance?** → Add `<testing>`
- **Code recipes?** → Add `<common_patterns>`
- **Deep references?** → Add `<reference_guides>`
</when_to_add_conditional>
</intelligence_rules>
<xml_vs_markdown_headings>
<token_efficiency>
XML tags are more efficient than markdown headings:
**Markdown headings**:
```markdown
## Quick start
## Workflow
## Advanced features
## Success criteria
```
Total: ~20 tokens, no semantic meaning to Claude
**XML tags**:
```xml
<quick_start>
<workflow>
<advanced_features>
<success_criteria>
```
Total: ~15 tokens, semantic meaning built-in
</token_efficiency>
<parsing_accuracy>
XML provides unambiguous boundaries and semantic meaning. Claude can reliably:
- Identify section boundaries
- Understand content purpose
- Skip irrelevant sections
- Parse programmatically
Markdown headings are just visual formatting. Claude must infer meaning from heading text.
</parsing_accuracy>
<consistency>
XML enforces consistent structure: every skill uses the same tag names for the same purposes. This makes it easier to:
- Validate skill structure programmatically
- Learn patterns across skills
- Maintain consistent quality
</consistency>
</xml_vs_markdown_headings>
<nesting_guidelines>
<proper_nesting>
XML tags can nest for hierarchical content:
```xml
<examples>
<example number="1">
<input>User input here</input>
<output>Expected output here</output>
</example>
<example number="2">
<input>Another input</input>
<output>Another output</output>
</example>
</examples>
```
</proper_nesting>
<closing_tags>
Always close tags properly:
✅ Good:
```xml
<objective>
Content here
</objective>
```
❌ Bad:
```xml
<objective>
Content here
```
</closing_tags>
<tag_naming>
Use descriptive, semantic names:
- `<workflow>` not `<steps>`
- `<success_criteria>` not `<done>`
- `<anti_patterns>` not `<dont_do>`
Be consistent within your skill. If you use `<workflow>`, don't also use `<process>` for the same purpose.
</tag_naming>
</nesting_guidelines>
<anti_pattern>
**DO NOT use markdown headings in skill body content.**
❌ Bad (hybrid approach):
```markdown
# PDF Processing
## Quick start
Extract text with pdfplumber...
## Advanced features
Form filling...
```
✅ Good (pure XML):
```markdown
<objective>
PDF processing with text extraction, form filling, and merging.
</objective>
<quick_start>
Extract text with pdfplumber...
</quick_start>
<advanced_features>
Form filling...
</advanced_features>
```
</anti_pattern>
<benefits>
<benefit type="clarity">
Clearly separate different sections with unambiguous boundaries
</benefit>
<benefit type="accuracy">
Reduce parsing errors. Claude knows exactly where sections begin and end.
</benefit>
<benefit type="flexibility">
Easily find, add, remove, or modify sections without rewriting
</benefit>
<benefit type="parseability">
Programmatically extract specific sections for validation or analysis
</benefit>
<benefit type="efficiency">
Lower token usage compared to markdown headings
</benefit>
<benefit type="consistency">
Standardized structure across all skills in the ecosystem
</benefit>
</benefits>
<combining_with_other_techniques>
XML tags work well with other prompting techniques:
**Multi-shot learning**:
```xml
<examples>
<example number="1">...</example>
<example number="2">...</example>
</examples>
```
**Chain of thought**:
```xml
<thinking>
Analyze the problem...
</thinking>
<answer>
Based on the analysis...
</answer>
```
**Template provision**:
```xml
<template>
```markdown
# Report Title
## Summary
...
```
</template>
```
**Reference material**:
```xml
<schema>
{
"field": "type"
}
</schema>
```
</combining_with_other_techniques>
<tag_reference_pattern>
When referencing content in tags, use the tag name:
"Using the schema in `<schema>` tags..."
"Follow the workflow in `<workflow>`..."
"See examples in `<examples>`..."
This makes the structure self-documenting.
</tag_reference_pattern>

View File

@@ -0,0 +1,113 @@
# Using Scripts in Skills
<purpose>
Scripts are executable code that Claude runs as-is rather than regenerating each time. They ensure reliable, error-free execution of repeated operations.
</purpose>
<when_to_use>
Use scripts when:
- The same code runs across multiple skill invocations
- Operations are error-prone when rewritten from scratch
- Complex shell commands or API interactions are involved
- Consistency matters more than flexibility
Common script types:
- **Deployment** - Deploy to Vercel, publish packages, push releases
- **Setup** - Initialize projects, install dependencies, configure environments
- **API calls** - Authenticated requests, webhook handlers, data fetches
- **Data processing** - Transform files, batch operations, migrations
- **Build processes** - Compile, bundle, test runners
</when_to_use>
<script_structure>
Scripts live in `scripts/` within the skill directory:
```
skill-name/
├── SKILL.md
├── workflows/
├── references/
├── templates/
└── scripts/
├── deploy.sh
├── setup.py
└── fetch-data.ts
```
A well-structured script includes:
1. Clear purpose comment at top
2. Input validation
3. Error handling
4. Idempotent operations where possible
5. Clear output/feedback
</script_structure>
<script_example>
```bash
#!/bin/bash
# deploy.sh - Deploy project to Vercel
# Usage: ./deploy.sh [environment]
# Environments: preview (default), production
set -euo pipefail
ENVIRONMENT="${1:-preview}"
# Validate environment
if [[ "$ENVIRONMENT" != "preview" && "$ENVIRONMENT" != "production" ]]; then
echo "Error: Environment must be 'preview' or 'production'"
exit 1
fi
echo "Deploying to $ENVIRONMENT..."
if [[ "$ENVIRONMENT" == "production" ]]; then
vercel --prod
else
vercel
fi
echo "Deployment complete."
```
</script_example>
<workflow_integration>
Workflows reference scripts like this:
```xml
<process>
## Step 5: Deploy
1. Ensure all tests pass
2. Run `scripts/deploy.sh production`
3. Verify deployment succeeded
4. Update user with deployment URL
</process>
```
The workflow tells Claude WHEN to run the script. The script handles HOW the operation executes.
</workflow_integration>
<best_practices>
**Do:**
- Make scripts idempotent (safe to run multiple times)
- Include clear usage comments
- Validate inputs before executing
- Provide meaningful error messages
- Use `set -euo pipefail` in bash scripts
**Don't:**
- Hardcode secrets or credentials (use environment variables)
- Create scripts for one-off operations
- Skip error handling
- Make scripts do too many unrelated things
- Forget to make scripts executable (`chmod +x`)
</best_practices>
<security_considerations>
- Never embed API keys, tokens, or secrets in scripts
- Use environment variables for sensitive configuration
- Validate and sanitize any user-provided inputs
- Be cautious with scripts that delete or modify data
- Consider adding `--dry-run` options for destructive operations
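As one way to apply the last point, a sketch of a destructive script guarded by a `--dry-run` flag (the script name and target directories are illustrative):
```python
#!/usr/bin/env python3
"""cleanup.py - Remove generated build artifacts.

Usage: python cleanup.py [--dry-run]
"""
import argparse
import shutil
from pathlib import Path

TARGETS = ["dist", "build", ".cache"]  # illustrative targets

def main() -> None:
    parser = argparse.ArgumentParser(description="Remove generated build artifacts")
    parser.add_argument("--dry-run", action="store_true",
                        help="Show what would be removed without deleting anything")
    args = parser.parse_args()

    for name in TARGETS:
        path = Path(name)
        if not path.is_dir():
            continue
        if args.dry_run:
            print(f"[dry-run] would remove {path}/")
        else:
            print(f"Removing {path}/")
            shutil.rmtree(path)

if __name__ == "__main__":
    main()
```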
</security_considerations>

View File

@@ -0,0 +1,112 @@
# Using Templates in Skills
<purpose>
Templates are reusable output structures that Claude copies and fills in. They ensure consistent, high-quality outputs without regenerating structure each time.
</purpose>
<when_to_use>
Use templates when:
- Output should have consistent structure across invocations
- The structure matters more than creative generation
- Filling placeholders is more reliable than blank-page generation
- Users expect predictable, professional-looking outputs
Common template types:
- **Plans** - Project plans, implementation plans, migration plans
- **Specifications** - Technical specs, feature specs, API specs
- **Documents** - Reports, proposals, summaries
- **Configurations** - Config files, settings, environment setups
- **Scaffolds** - File structures, boilerplate code
</when_to_use>
<template_structure>
Templates live in `templates/` within the skill directory:
```
skill-name/
├── SKILL.md
├── workflows/
├── references/
└── templates/
├── plan-template.md
├── spec-template.md
└── report-template.md
```
A template file contains:
1. Clear section markers
2. Placeholder indicators (use `{{placeholder}}` or `[PLACEHOLDER]`)
3. Inline guidance for what goes where
4. Example content where helpful
</template_structure>
<template_example>
```markdown
# {{PROJECT_NAME}} Implementation Plan
## Overview
{{1-2 sentence summary of what this plan covers}}
## Goals
- {{Primary goal}}
- {{Secondary goals...}}
## Scope
**In scope:**
- {{What's included}}
**Out of scope:**
- {{What's explicitly excluded}}
## Phases
### Phase 1: {{Phase name}}
**Duration:** {{Estimated duration}}
**Deliverables:**
- {{Deliverable 1}}
- {{Deliverable 2}}
### Phase 2: {{Phase name}}
...
## Success Criteria
- [ ] {{Measurable criterion 1}}
- [ ] {{Measurable criterion 2}}
## Risks
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| {{Risk}} | {{H/M/L}} | {{H/M/L}} | {{Strategy}} |
```
</template_example>
<workflow_integration>
Workflows reference templates like this:
```xml
<process>
## Step 3: Generate Plan
1. Read `templates/plan-template.md`
2. Copy the template structure
3. Fill each placeholder based on gathered requirements
4. Review for completeness
</process>
```
The workflow tells Claude WHEN to use the template. The template provides WHAT structure to produce.
</workflow_integration>
<best_practices>
**Do:**
- Keep templates focused on structure, not content
- Use clear placeholder syntax consistently
- Include brief inline guidance where sections might be ambiguous
- Make templates complete but minimal
**Don't:**
- Put excessive example content that might be copied verbatim
- Create templates for outputs that genuinely need creative generation
- Over-constrain with too many required sections
- Forget to update templates when requirements change
</best_practices>

View File

@@ -0,0 +1,510 @@
<overview>
This reference covers patterns for complex workflows, validation loops, and feedback cycles in skill authoring. All patterns use pure XML structure.
</overview>
<complex_workflows>
<principle>
Break complex operations into clear, sequential steps. For particularly complex workflows, provide a checklist.
</principle>
<pdf_forms_example>
```xml
<objective>
Fill PDF forms with validated data from JSON field mappings.
</objective>
<workflow>
Copy this checklist and check off items as you complete them:
```
Task Progress:
- [ ] Step 1: Analyze the form (run analyze_form.py)
- [ ] Step 2: Create field mapping (edit fields.json)
- [ ] Step 3: Validate mapping (run validate_fields.py)
- [ ] Step 4: Fill the form (run fill_form.py)
- [ ] Step 5: Verify output (run verify_output.py)
```
<step_1>
**Analyze the form**
Run: `python scripts/analyze_form.py input.pdf`
This extracts form fields and their locations, saving to `fields.json`.
</step_1>
<step_2>
**Create field mapping**
Edit `fields.json` to add values for each field.
</step_2>
<step_3>
**Validate mapping**
Run: `python scripts/validate_fields.py fields.json`
Fix any validation errors before continuing.
</step_3>
<step_4>
**Fill the form**
Run: `python scripts/fill_form.py input.pdf fields.json output.pdf`
</step_4>
<step_5>
**Verify output**
Run: `python scripts/verify_output.py output.pdf`
If verification fails, return to Step 2.
</step_5>
</workflow>
```
</pdf_forms_example>
<when_to_use>
Use checklist pattern when:
- Workflow has 5+ sequential steps
- Steps must be completed in order
- Progress tracking helps prevent errors
- Easy resumption after interruption is valuable
</when_to_use>
</complex_workflows>
<feedback_loops>
<validate_fix_repeat_pattern>
<principle>
Run validator → fix errors → repeat. This pattern greatly improves output quality.
</principle>
<document_editing_example>
```xml
<objective>
Edit OOXML documents with XML validation at each step.
</objective>
<editing_process>
<step_1>
Make your edits to `word/document.xml`
</step_1>
<step_2>
**Validate immediately**: `python ooxml/scripts/validate.py unpacked_dir/`
</step_2>
<step_3>
If validation fails:
- Review the error message carefully
- Fix the issues in the XML
- Run validation again
</step_3>
<step_4>
**Only proceed when validation passes**
</step_4>
<step_5>
Rebuild: `python ooxml/scripts/pack.py unpacked_dir/ output.docx`
</step_5>
<step_6>
Test the output document
</step_6>
</editing_process>
<validation>
Never skip validation. Catching errors early prevents corrupted output files.
</validation>
```
</document_editing_example>
<why_it_works>
- Catches errors early before changes are applied
- Machine-verifiable with objective verification
- Plan can be iterated without touching originals
- Reduces total iteration cycles
</why_it_works>
</validate_fix_repeat_pattern>
<plan_validate_execute_pattern>
<principle>
When Claude performs complex, open-ended tasks, create a plan in a structured format, validate it, then execute.
Workflow: analyze → **create plan file** → **validate plan** → execute → verify
</principle>
<batch_update_example>
```xml
<objective>
Apply batch updates to spreadsheet with plan validation.
</objective>
<workflow>
<plan_phase>
<step_1>
Analyze the spreadsheet and requirements
</step_1>
<step_2>
Create `changes.json` with all planned updates
</step_2>
</plan_phase>
<validation_phase>
<step_3>
Validate the plan: `python scripts/validate_changes.py changes.json`
</step_3>
<step_4>
If validation fails:
- Review error messages
- Fix issues in changes.json
- Validate again
</step_4>
<step_5>
Only proceed when validation passes
</step_5>
</validation_phase>
<execution_phase>
<step_6>
Apply changes: `python scripts/apply_changes.py changes.json`
</step_6>
<step_7>
Verify output
</step_7>
</execution_phase>
</workflow>
<success_criteria>
- Plan validation passes with zero errors
- All changes applied successfully
- Output verification confirms expected results
</success_criteria>
```
</batch_update_example>
<implementation_tip>
Make validation scripts verbose with specific error messages:
**Good error message**:
"Field 'signature_date' not found. Available fields: customer_name, order_total, signature_date_signed"
**Bad error message**:
"Invalid field"
Specific errors help Claude fix issues without guessing.
</implementation_tip>
<when_to_use>
Use plan-validate-execute when:
- Operations are complex and error-prone
- Changes are irreversible or difficult to undo
- Planning can be validated independently
- Catching errors early saves significant time
</when_to_use>
</plan_validate_execute_pattern>
</feedback_loops>
<conditional_workflows>
<principle>
Guide Claude through decision points with clear branching logic.
</principle>
<document_modification_example>
```xml
<objective>
Modify DOCX files using appropriate method based on task type.
</objective>
<workflow>
<decision_point_1>
Determine the modification type:
**Creating new content?** → Follow "Creation workflow"
**Editing existing content?** → Follow "Editing workflow"
</decision_point_1>
<creation_workflow>
<objective>Build documents from scratch</objective>
<steps>
1. Use docx-js library
2. Build document from scratch
3. Export to .docx format
</steps>
</creation_workflow>
<editing_workflow>
<objective>Modify existing documents</objective>
<steps>
1. Unpack existing document
2. Modify XML directly
3. Validate after each change
4. Repack when complete
</steps>
</editing_workflow>
</workflow>
<success_criteria>
- Correct workflow chosen based on task type
- All steps in chosen workflow completed
- Output file validated and verified
</success_criteria>
```
</document_modification_example>
<when_to_use>
Use conditional workflows when:
- Different task types require different approaches
- Decision points are clear and well-defined
- Workflows are mutually exclusive
- Guiding Claude to correct path improves outcomes
</when_to_use>
</conditional_workflows>
<validation_scripts>
<principles>
Validation scripts are force multipliers. They catch errors that Claude might miss and provide actionable feedback for fixing issues.
</principles>
<characteristics_of_good_validation>
<verbose_errors>
**Good**: "Field 'signature_date' not found. Available fields: customer_name, order_total, signature_date_signed"
**Bad**: "Invalid field"
Verbose errors help Claude fix issues in one iteration instead of multiple rounds of guessing.
</verbose_errors>
<specific_feedback>
**Good**: "Line 47: Expected closing tag `</paragraph>` but found `</section>`"
**Bad**: "XML syntax error"
Specific feedback pinpoints exact location and nature of the problem.
</specific_feedback>
<actionable_suggestions>
**Good**: "Required field 'customer_name' is missing. Add: {\"customer_name\": \"value\"}"
**Bad**: "Missing required field"
Actionable suggestions show Claude exactly what to fix.
</actionable_suggestions>
<available_options>
When validation fails, show available valid options:
**Good**: "Invalid status 'pending_review'. Valid statuses: active, paused, archived"
**Bad**: "Invalid status"
Showing valid options eliminates guesswork.
</available_options>
</characteristics_of_good_validation>
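<example_validation_script>
A minimal sketch of a validator that emits errors with these four characteristics. The field names, statuses, and JSON input format are illustrative assumptions, not taken from any particular skill:
```python
#!/usr/bin/env python3
"""Sketch of a validator whose errors follow the four characteristics above.
Field names, statuses, and the JSON input format are illustrative assumptions."""
import json
import sys

REQUIRED_FIELDS = ["customer_name", "order_total"]
KNOWN_FIELDS = set(REQUIRED_FIELDS) | {"status", "signature_date_signed"}
VALID_STATUSES = ["active", "archived", "paused"]


def validate(path):
    errors = []
    with open(path) as f:
        data = json.load(f)

    # Missing required field: name it and show exactly what to add
    for field in REQUIRED_FIELDS:
        if field not in data:
            errors.append(
                f'Required field \'{field}\' is missing. Add: {{"{field}": "value"}}'
            )

    # Unknown field: list the available fields instead of just "invalid field"
    for field in data:
        if field not in KNOWN_FIELDS:
            errors.append(
                f"Field '{field}' not found. "
                f"Available fields: {', '.join(sorted(KNOWN_FIELDS))}"
            )

    # Type mismatch: name the field, the expected type, and what was found
    total = data.get("order_total")
    if total is not None and not isinstance(total, (int, float)):
        errors.append(f"Field 'order_total' expects number, got {type(total).__name__}")

    # Invalid value: show the valid options so there is no guesswork
    status = data.get("status")
    if status is not None and status not in VALID_STATUSES:
        errors.append(
            f"Invalid status '{status}'. Valid statuses: {', '.join(VALID_STATUSES)}"
        )
    return errors


if __name__ == "__main__":
    problems = validate(sys.argv[1])
    for problem in problems:
        print(f"ERROR: {problem}")
    sys.exit(1 if problems else 0)
```
Run it against a candidate file and treat a non-zero exit code as "fix before continuing".
</example_validation_script>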
<implementation_pattern>
```xml
<validation>
After making changes, validate immediately:
```bash
python scripts/validate.py output_dir/
```
If validation fails, fix errors before continuing. Validation errors include:
- **Field not found**: "Field 'signature_date' not found. Available fields: customer_name, order_total, signature_date_signed"
- **Type mismatch**: "Field 'order_total' expects number, got string"
- **Missing required field**: "Required field 'customer_name' is missing"
- **Invalid value**: "Invalid status 'pending_review'. Valid statuses: active, paused, archived"
Only proceed when validation passes with zero errors.
</validation>
```
</implementation_pattern>
<benefits>
- Catches errors before they propagate
- Reduces iteration cycles
- Provides learning feedback
- Makes debugging deterministic
- Enables confident execution
</benefits>
</validation_scripts>
<iterative_refinement>
<principle>
Many workflows benefit from iteration: generate → validate → refine → validate → finalize.
</principle>
<implementation_example>
```xml
<objective>
Generate reports with iterative quality improvement.
</objective>
<workflow>
<iteration_1>
**Generate initial draft**
Create report based on data and requirements.
</iteration_1>
<iteration_2>
**Validate draft**
Run: `python scripts/validate_report.py draft.md`
Fix any structural issues, missing sections, or data errors.
</iteration_2>
<iteration_3>
**Refine content**
Improve clarity, add supporting data, enhance visualizations.
</iteration_3>
<iteration_4>
**Final validation**
Run: `python scripts/validate_report.py final.md`
Ensure all quality criteria met.
</iteration_4>
<iteration_5>
**Finalize**
Export to final format and deliver.
</iteration_5>
</workflow>
<success_criteria>
- Final validation passes with zero errors
- All quality criteria met
- Report ready for delivery
</success_criteria>
```
</implementation_example>
<when_to_use>
Use iterative refinement when:
- Quality improves with multiple passes
- Validation provides actionable feedback
- Time permits iteration
- Perfect output matters more than speed
</when_to_use>
</iterative_refinement>
<checkpoint_pattern>
<principle>
For long workflows, add checkpoints where Claude can pause and verify progress before continuing.
</principle>
<implementation_example>
```xml
<workflow>
<phase_1>
**Data collection** (Steps 1-3)
1. Extract data from source
2. Transform to target format
3. **CHECKPOINT**: Verify data completeness
Only continue if checkpoint passes.
</phase_1>
<phase_2>
**Data processing** (Steps 4-6)
4. Apply business rules
5. Validate transformations
6. **CHECKPOINT**: Verify processing accuracy
Only continue if checkpoint passes.
</phase_2>
<phase_3>
**Output generation** (Steps 7-9)
7. Generate output files
8. Validate output format
9. **CHECKPOINT**: Verify final output
Proceed to delivery only if checkpoint passes.
</phase_3>
</workflow>
<checkpoint_validation>
At each checkpoint:
1. Run validation script
2. Review output for correctness
3. Verify no errors or warnings
4. Only proceed when validation passes
</checkpoint_validation>
```
</implementation_example>
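<example_script>
A minimal sketch of how the checkpoints above could be driven from a script. The phase names and the `scripts/validate.py` command are assumptions standing in for whatever the workflow actually runs:
```python
#!/usr/bin/env python3
"""Sketch of a checkpoint runner: run a validation command after each phase
and refuse to continue until it passes."""
import subprocess
import sys

PHASES = [
    ("data collection", ["python", "scripts/validate.py", "collected/"]),
    ("data processing", ["python", "scripts/validate.py", "processed/"]),
    ("output generation", ["python", "scripts/validate.py", "output/"]),
]

for name, check_cmd in PHASES:
    # ... the actual work for this phase happens before its checkpoint ...
    result = subprocess.run(check_cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Stop at the first failing checkpoint so errors cannot cascade
        print(f"CHECKPOINT FAILED after {name}:\n{result.stdout}{result.stderr}")
        sys.exit(1)
    print(f"CHECKPOINT PASSED: {name}")

print("All checkpoints passed; ready for delivery.")
```
</example_script>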
<benefits>
- Prevents cascading errors
- Easier to diagnose issues
- Clear progress indicators
- Natural pause points for review
- Reduces wasted work from early errors
</benefits>
</checkpoint_pattern>
<error_recovery>
<principle>
Design workflows with clear error recovery paths. Claude should know what to do when things go wrong.
</principle>
<implementation_example>
```xml
<workflow>
<normal_path>
1. Process input file
2. Validate output
3. Save results
</normal_path>
<error_recovery>
**If validation fails in step 2:**
- Review validation errors
- Check if input file is corrupted → Return to step 1 with different input
- Check if processing logic failed → Fix logic, return to step 1
- Check if output format wrong → Fix format, return to step 2
**If save fails in step 3:**
- Check disk space
- Check file permissions
- Check file path validity
- Retry save with corrected conditions
</error_recovery>
<escalation>
**If error persists after 3 attempts:**
- Document the error with full context
- Save partial results if available
- Report issue to user with diagnostic information
</escalation>
</workflow>
```
</implementation_example>
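<example_script>
One way to express the retry-then-escalate logic in code. This is a sketch; `process_input` and `save_results` are hypothetical stand-ins for the workflow's real steps:
```python
#!/usr/bin/env python3
"""Sketch of a retry-with-escalation wrapper for workflow steps."""


def run_with_recovery(step, *args, attempts=3):
    """Run one workflow step, retrying recoverable failures before escalating."""
    errors = []
    for attempt in range(1, attempts + 1):
        try:
            return step(*args)
        except Exception as exc:  # a sketch; real code would catch narrower errors
            errors.append(f"attempt {attempt}: {exc}")
    # Escalation: document the error with full context and report it to the user
    raise RuntimeError(
        f"Step '{getattr(step, '__name__', step)}' failed after {attempts} attempts:\n"
        + "\n".join(errors)
    )


# Hypothetical usage mirroring the workflow above:
# results = run_with_recovery(process_input, "input.docx")
# run_with_recovery(save_results, results)
```
</example_script>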
<when_to_use>
Include error recovery when:
- Workflows interact with external systems
- File operations could fail
- Network calls could timeout
- User input could be invalid
- Errors are recoverable
</when_to_use>
</error_recovery>

View File

@@ -0,0 +1,73 @@
---
name: {{SKILL_NAME}}
description: {{What it does}} Use when {{trigger conditions}}.
---
<essential_principles>
## {{Core Concept}}
{{Principles that ALWAYS apply, regardless of which workflow runs}}
### 1. {{First principle}}
{{Explanation}}
### 2. {{Second principle}}
{{Explanation}}
### 3. {{Third principle}}
{{Explanation}}
</essential_principles>
<intake>
**Ask the user:**
What would you like to do?
1. {{First option}}
2. {{Second option}}
3. {{Third option}}
**Wait for response before proceeding.**
</intake>
<routing>
| Response | Workflow |
|----------|----------|
| 1, "{{keywords}}" | `workflows/{{first-workflow}}.md` |
| 2, "{{keywords}}" | `workflows/{{second-workflow}}.md` |
| 3, "{{keywords}}" | `workflows/{{third-workflow}}.md` |
**After reading the workflow, follow it exactly.**
</routing>
<quick_reference>
## {{Skill Name}} Quick Reference
{{Brief reference information always useful to have visible}}
</quick_reference>
<reference_index>
## Domain Knowledge
All in `references/`:
- {{reference-1.md}} - {{purpose}}
- {{reference-2.md}} - {{purpose}}
</reference_index>
<workflows_index>
## Workflows
All in `workflows/`:
| Workflow | Purpose |
|----------|---------|
| {{first-workflow}}.md | {{purpose}} |
| {{second-workflow}}.md | {{purpose}} |
| {{third-workflow}}.md | {{purpose}} |
</workflows_index>
<success_criteria>
A well-executed {{skill name}}:
- {{First criterion}}
- {{Second criterion}}
- {{Third criterion}}
</success_criteria>

View File

@@ -0,0 +1,33 @@
---
name: {{SKILL_NAME}}
description: {{What it does}} Use when {{trigger conditions}}.
---
<objective>
{{Clear statement of what this skill accomplishes}}
</objective>
<quick_start>
{{Immediate actionable guidance - what Claude should do first}}
</quick_start>
<process>
## Step 1: {{First action}}
{{Instructions for step 1}}
## Step 2: {{Second action}}
{{Instructions for step 2}}
## Step 3: {{Third action}}
{{Instructions for step 3}}
</process>
<success_criteria>
{{Skill name}} is complete when:
- [ ] {{First success criterion}}
- [ ] {{Second success criterion}}
- [ ] {{Third success criterion}}
</success_criteria>

View File

@@ -0,0 +1,96 @@
# Workflow: Add a Reference to Existing Skill
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md
2. references/skill-structure.md
</required_reading>
<process>
## Step 1: Select the Skill
```bash
ls ~/.claude/skills/
```
Present numbered list, ask: "Which skill needs a new reference?"
## Step 2: Analyze Current Structure
```bash
cat ~/.claude/skills/{skill-name}/SKILL.md
ls ~/.claude/skills/{skill-name}/references/ 2>/dev/null
```
Determine:
- **Has references/ folder?** → Good, can add directly
- **Simple skill?** → May need to create references/ first
- **What references exist?** → Understand the knowledge landscape
Report current references to user.
## Step 3: Gather Reference Requirements
Ask:
- What knowledge should this reference contain?
- Which workflows will use it?
- Is this reusable across workflows or specific to one?
**If specific to one workflow** → Consider putting it inline in that workflow instead.
## Step 4: Create the Reference File
Create `references/{reference-name}.md`:
Use semantic XML tags to structure the content:
```xml
<overview>
Brief description of what this reference covers
</overview>
<patterns>
## Common Patterns
[Reusable patterns, examples, code snippets]
</patterns>
<guidelines>
## Guidelines
[Best practices, rules, constraints]
</guidelines>
<examples>
## Examples
[Concrete examples with explanation]
</examples>
```
## Step 5: Update SKILL.md
Add the new reference to `<reference_index>`:
```markdown
**Category:** existing.md, new-reference.md
```
## Step 6: Update Workflows That Need It
For each workflow that should use this reference:
1. Read the workflow file
2. Add to its `<required_reading>` section
3. Verify the workflow still makes sense with this addition
## Step 7: Verify
- [ ] Reference file exists and is well-structured
- [ ] Reference is in SKILL.md reference_index
- [ ] Relevant workflows have it in required_reading
- [ ] No broken references
</process>
<success_criteria>
Reference addition is complete when:
- [ ] Reference file created with useful content
- [ ] Added to reference_index in SKILL.md
- [ ] Relevant workflows updated to read it
- [ ] Content is reusable (not workflow-specific)
</success_criteria>

View File

@@ -0,0 +1,93 @@
# Workflow: Add a Script to a Skill
<required_reading>
**Read these reference files NOW:**
1. references/using-scripts.md
</required_reading>
<process>
## Step 1: Identify the Skill
Ask (if not already provided):
- Which skill needs a script?
- What operation should the script perform?
## Step 2: Analyze Script Need
Confirm this is a good script candidate:
- [ ] Same code runs across multiple invocations
- [ ] Operation is error-prone when rewritten
- [ ] Consistency matters more than flexibility
If not a good fit, suggest alternatives (inline code in workflow, reference examples).
## Step 3: Create Scripts Directory
```bash
mkdir -p ~/.claude/skills/{skill-name}/scripts
```
## Step 4: Design Script
Gather requirements:
- What inputs does the script need?
- What should it output or accomplish?
- What errors might occur?
- Should it be idempotent?
Choose language:
- **bash** - Shell operations, file manipulation, CLI tools
- **python** - Data processing, API calls, complex logic
- **node/ts** - JavaScript ecosystem, async operations
## Step 5: Write Script File
Create `scripts/{script-name}.{ext}` with:
- Purpose comment at top
- Usage instructions
- Input validation
- Error handling
- Clear output/feedback
For bash scripts:
```bash
#!/bin/bash
set -euo pipefail
```
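For python scripts, a minimal skeleton covering the same points (purpose comment, usage, input validation, error handling, clear feedback) might look like this; the script name and argument are placeholders:
```python
#!/usr/bin/env python3
"""Purpose: one-line description of what this script does for the skill.

Usage: python scripts/example-script.py <input-path>
"""
import sys
from pathlib import Path


def main() -> int:
    # Input validation with a clear usage message
    if len(sys.argv) != 2:
        print(__doc__, file=sys.stderr)
        return 2
    input_path = Path(sys.argv[1])
    if not input_path.exists():
        print(f"ERROR: input file not found: {input_path}", file=sys.stderr)
        return 1

    # ... do the actual work here ...

    # Clear output/feedback so Claude can verify the step succeeded
    print(f"OK: processed {input_path}")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```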
## Step 6: Make Executable (if bash)
```bash
chmod +x ~/.claude/skills/{skill-name}/scripts/{script-name}.sh
```
## Step 7: Update Workflow to Use Script
Find the workflow that needs this operation. Add:
```xml
<process>
...
N. Run `scripts/{script-name}.sh [arguments]`
N+1. Verify operation succeeded
...
</process>
```
## Step 8: Test
Invoke the skill workflow and verify:
- Script runs at the right step
- Inputs are passed correctly
- Errors are handled gracefully
- Output matches expectations
</process>
<success_criteria>
Script is complete when:
- [ ] scripts/ directory exists
- [ ] Script file has proper structure (comments, validation, error handling)
- [ ] Script is executable (if bash)
- [ ] At least one workflow references the script
- [ ] No hardcoded secrets or credentials
- [ ] Tested with real invocation
</success_criteria>

View File

@@ -0,0 +1,74 @@
# Workflow: Add a Template to a Skill
<required_reading>
**Read these reference files NOW:**
1. references/using-templates.md
</required_reading>
<process>
## Step 1: Identify the Skill
Ask (if not already provided):
- Which skill needs a template?
- What output does this template structure?
## Step 2: Analyze Template Need
Confirm this is a good template candidate:
- [ ] Output has consistent structure across uses
- [ ] Structure matters more than creative generation
- [ ] Filling placeholders is more reliable than blank-page generation
If not a good fit, suggest alternatives (workflow guidance, reference examples).
## Step 3: Create Templates Directory
```bash
mkdir -p ~/.claude/skills/{skill-name}/templates
```
## Step 4: Design Template Structure
Gather requirements:
- What sections does the output need?
- What information varies between uses? (→ placeholders)
- What stays constant? (→ static structure)
## Step 5: Write Template File
Create `templates/{template-name}.md` with:
- Clear section markers
- `{{PLACEHOLDER}}` syntax for variable content
- Brief inline guidance where helpful
- Minimal example content
## Step 6: Update Workflow to Use Template
Find the workflow that produces this output. Add:
```xml
<process>
...
N. Read `templates/{template-name}.md`
N+1. Copy template structure
N+2. Fill each placeholder based on gathered context
...
</process>
```
## Step 7: Test
Invoke the skill workflow and verify:
- Template is read at the right step
- All placeholders get filled appropriately
- Output structure matches template
- No placeholders left unfilled
</process>
<success_criteria>
Template is complete when:
- [ ] templates/ directory exists
- [ ] Template file has clear structure with placeholders
- [ ] At least one workflow references the template
- [ ] Workflow instructions explain when/how to use template
- [ ] Tested with real invocation
</success_criteria>

View File

@@ -0,0 +1,120 @@
# Workflow: Add a Workflow to Existing Skill
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md
2. references/workflows-and-validation.md
</required_reading>
<process>
## Step 1: Select the Skill
**DO NOT use AskUserQuestion** - there may be many skills.
```bash
ls ~/.claude/skills/
```
Present numbered list, ask: "Which skill needs a new workflow?"
## Step 2: Analyze Current Structure
Read the skill:
```bash
cat ~/.claude/skills/{skill-name}/SKILL.md
ls ~/.claude/skills/{skill-name}/workflows/ 2>/dev/null
```
Determine:
- **Simple skill?** → May need to upgrade to router pattern first
- **Already has workflows/?** → Good, can add directly
- **What workflows exist?** → Avoid duplication
Report current structure to user.
## Step 3: Gather Workflow Requirements
Ask using AskUserQuestion or direct question:
- What should this workflow do?
- When would someone use it vs existing workflows?
- What references would it need?
## Step 4: Upgrade to Router Pattern (if needed)
**If skill is currently simple (no workflows/):**
Ask: "This skill needs to be upgraded to the router pattern first. Should I restructure it?"
If yes:
1. Create workflows/ directory
2. Move existing process content to workflows/main.md
3. Rewrite SKILL.md as router with intake + routing
4. Verify structure works before proceeding
## Step 5: Create the Workflow File
Create `workflows/{workflow-name}.md`:
```markdown
# Workflow: {Workflow Name}
<required_reading>
**Read these reference files NOW:**
1. references/{relevant-file}.md
</required_reading>
<process>
## Step 1: {First Step}
[What to do]
## Step 2: {Second Step}
[What to do]
## Step 3: {Third Step}
[What to do]
</process>
<success_criteria>
This workflow is complete when:
- [ ] Criterion 1
- [ ] Criterion 2
- [ ] Criterion 3
</success_criteria>
```
## Step 6: Update SKILL.md
Add the new workflow to:
1. **Intake question** - Add new option
2. **Routing table** - Map option to workflow file
3. **Workflows index** - Add to the list
## Step 7: Create References (if needed)
If the workflow needs domain knowledge that doesn't exist:
1. Create `references/{reference-name}.md`
2. Add to reference_index in SKILL.md
3. Reference it in the workflow's required_reading
## Step 8: Test
Invoke the skill:
- Does the new option appear in intake?
- Does selecting it route to the correct workflow?
- Does the workflow load the right references?
- Does the workflow execute correctly?
Report results to user.
</process>
<success_criteria>
Workflow addition is complete when:
- [ ] Skill upgraded to router pattern (if needed)
- [ ] Workflow file created with required_reading, process, success_criteria
- [ ] SKILL.md intake updated with new option
- [ ] SKILL.md routing updated
- [ ] SKILL.md workflows_index updated
- [ ] Any needed references created
- [ ] Tested and working
</success_criteria>

View File

@@ -0,0 +1,138 @@
# Workflow: Audit a Skill
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md
2. references/skill-structure.md
3. references/use-xml-tags.md
</required_reading>
<process>
## Step 1: List Available Skills
**DO NOT use AskUserQuestion** - there may be many skills.
Enumerate skills in chat as numbered list:
```bash
ls ~/.claude/skills/
```
Present as:
```
Available skills:
1. create-agent-skills
2. build-macos-apps
3. manage-stripe
...
```
Ask: "Which skill would you like to audit? (enter number or name)"
## Step 2: Read the Skill
After user selects, read the full skill structure:
```bash
# Read main file
cat ~/.claude/skills/{skill-name}/SKILL.md
# Check for workflows and references
ls ~/.claude/skills/{skill-name}/
ls ~/.claude/skills/{skill-name}/workflows/ 2>/dev/null
ls ~/.claude/skills/{skill-name}/references/ 2>/dev/null
```
## Step 3: Run Audit Checklist
Evaluate against each criterion:
### YAML Frontmatter
- [ ] Has `name:` field (lowercase-with-hyphens)
- [ ] Name matches directory name
- [ ] Has `description:` field
- [ ] Description says what it does AND when to use it
- [ ] Description is third person ("Use when...")
### Structure
- [ ] SKILL.md under 500 lines
- [ ] Pure XML structure (no markdown headings # in body)
- [ ] All XML tags properly closed
- [ ] Has required tags: objective OR essential_principles
- [ ] Has success_criteria
### Router Pattern (if complex skill)
- [ ] Essential principles inline in SKILL.md (not in separate file)
- [ ] Has intake question
- [ ] Has routing table
- [ ] All referenced workflow files exist
- [ ] All referenced reference files exist
### Workflows (if present)
- [ ] Each has required_reading section
- [ ] Each has process section
- [ ] Each has success_criteria section
- [ ] Required reading references exist
### Content Quality
- [ ] Principles are actionable (not vague platitudes)
- [ ] Steps are specific (not "do the thing")
- [ ] Success criteria are verifiable
- [ ] No redundant content across files
## Step 4: Generate Report
Present findings as:
```
## Audit Report: {skill-name}
### ✅ Passing
- [list passing items]
### ⚠️ Issues Found
1. **[Issue name]**: [Description]
→ Fix: [Specific action]
2. **[Issue name]**: [Description]
→ Fix: [Specific action]
### 📊 Score: X/Y criteria passing
```
## Step 5: Offer Fixes
If issues found, ask:
"Would you like me to fix these issues?"
Options:
1. **Fix all** - Apply all recommended fixes
2. **Fix one by one** - Review each fix before applying
3. **Just the report** - No changes needed
If fixing:
- Make each change
- Verify file validity after each change
- Report what was fixed
</process>
<audit_anti_patterns>
## Common Anti-Patterns to Flag
**Skippable principles**: Essential principles in separate file instead of inline
**Monolithic skill**: Single file over 500 lines
**Mixed concerns**: Procedures and knowledge in same file
**Vague steps**: "Handle the error appropriately"
**Untestable criteria**: "User is satisfied"
**Markdown headings in body**: Using # instead of XML tags
**Missing routing**: Complex skill without intake/routing
**Broken references**: Files mentioned but don't exist
**Redundant content**: Same information in multiple places
</audit_anti_patterns>
<success_criteria>
Audit is complete when:
- [ ] Skill fully read and analyzed
- [ ] All checklist items evaluated
- [ ] Report presented to user
- [ ] Fixes applied (if requested)
- [ ] User has clear picture of skill health
</success_criteria>

View File

@@ -0,0 +1,605 @@
# Workflow: Create Exhaustive Domain Expertise Skill
<objective>
Build a comprehensive execution skill that does real work in a specific domain. Domain expertise skills are full-featured build skills with exhaustive domain knowledge in references and complete workflows for the full lifecycle (build → debug → optimize → ship); they can be both invoked directly by users AND loaded by other skills (like create-plans) for domain knowledge.
</objective>
<critical_distinction>
**Regular skill:** "Do one specific task"
**Domain expertise skill:** "Do EVERYTHING in this domain, with complete practitioner knowledge"
Examples:
- `expertise/macos-apps` - Build macOS apps from scratch through shipping
- `expertise/python-games` - Build complete Python games with full game dev lifecycle
- `expertise/rust-systems` - Build Rust systems programs with exhaustive systems knowledge
- `expertise/web-scraping` - Build scrapers, handle all edge cases, deploy at scale
Domain expertise skills:
- ✅ Execute tasks (build, debug, optimize, ship)
- ✅ Have comprehensive domain knowledge in references
- ✅ Are invoked directly by users ("build a macOS app")
- ✅ Can be loaded by other skills (create-plans reads references for planning)
- ✅ Cover the FULL lifecycle, not just getting started
</critical_distinction>
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md
2. references/core-principles.md
3. references/use-xml-tags.md
</required_reading>
<process>
## Step 1: Identify Domain
Ask user what domain expertise to build:
**Example domains:**
- macOS/iOS app development
- Python game development
- Rust systems programming
- Machine learning / AI
- Web scraping and automation
- Data engineering pipelines
- Audio processing / DSP
- 3D graphics / shaders
- Unity/Unreal game development
- Embedded systems
Get specific: "Python games" or "Python games with Pygame specifically"?
## Step 2: Confirm Target Location
Explain:
```
Domain expertise skills go in: ~/.claude/skills/expertise/{domain-name}/
These are comprehensive BUILD skills that:
- Execute tasks (build, debug, optimize, ship)
- Contain exhaustive domain knowledge
- Can be invoked directly by users
- Can be loaded by other skills for domain knowledge
Name suggestion: {suggested-name}
Location: ~/.claude/skills/expertise/{suggested-name}/
```
Confirm or adjust name.
## Step 3: Identify Workflows
Domain expertise skills cover the FULL lifecycle. Identify what workflows are needed.
**Common workflows for most domains:**
1. **build-new-{thing}.md** - Create from scratch
2. **add-feature.md** - Extend existing {thing}
3. **debug-{thing}.md** - Find and fix bugs
4. **write-tests.md** - Test for correctness
5. **optimize-performance.md** - Profile and speed up
6. **ship-{thing}.md** - Deploy/distribute
**Domain-specific workflows:**
- Games: `implement-game-mechanic.md`, `add-audio.md`, `polish-ui.md`
- Web apps: `setup-auth.md`, `add-api-endpoint.md`, `setup-database.md`
- Systems: `optimize-memory.md`, `profile-cpu.md`, `cross-compile.md`
Each workflow = one complete task type that users actually do.
## Step 4: Exhaustive Research Phase
**CRITICAL:** This research must be comprehensive, not superficial.
### Research Strategy
Run multiple web searches to ensure coverage:
**Search 1: Current ecosystem**
- "best {domain} libraries 2024 2025"
- "popular {domain} frameworks comparison"
- "{domain} tech stack recommendations"
**Search 2: Architecture patterns**
- "{domain} architecture patterns"
- "{domain} best practices design patterns"
- "how to structure {domain} projects"
**Search 3: Lifecycle and tooling**
- "{domain} development workflow"
- "{domain} testing debugging best practices"
- "{domain} deployment distribution"
**Search 4: Common pitfalls**
- "{domain} common mistakes avoid"
- "{domain} anti-patterns"
- "what not to do {domain}"
**Search 5: Real-world usage**
- "{domain} production examples GitHub"
- "{domain} case studies"
- "successful {domain} projects"
### Verification Requirements
For EACH major library/tool/pattern found:
- **Check recency:** When was it last updated?
- **Check adoption:** Is it actively maintained? Community size?
- **Check alternatives:** What else exists? When to use each?
- **Check deprecation:** Is anything being replaced?
**Red flags for outdated content:**
- Articles from before 2023 (unless fundamental concepts)
- Abandoned libraries (no commits in 12+ months)
- Deprecated APIs or patterns
- "This used to be popular but..."
### Documentation Sources
Use Context7 MCP when available:
```
mcp__context7__resolve-library-id: {library-name}
mcp__context7__get-library-docs: {library-id}
```
Focus on official docs, not tutorials.
## Step 5: Organize Knowledge Into Domain Areas
Structure references by domain concerns, NOT by arbitrary categories.
**For game development example:**
```
references/
├── architecture.md # ECS, component-based, state machines
├── libraries.md # Pygame, Arcade, Panda3D (when to use each)
├── graphics-rendering.md # 2D/3D rendering, sprites, shaders
├── physics.md # Collision, physics engines
├── audio.md # Sound effects, music, spatial audio
├── input.md # Keyboard, mouse, gamepad, touch
├── ui-menus.md # HUD, menus, dialogs
├── game-loop.md # Update/render loop, fixed timestep
├── state-management.md # Game states, scene management
├── networking.md # Multiplayer, client-server, P2P
├── asset-pipeline.md # Loading, caching, optimization
├── testing-debugging.md # Unit tests, profiling, debugging tools
├── performance.md # Optimization, profiling, benchmarking
├── packaging.md # Building executables, installers
├── distribution.md # Steam, itch.io, app stores
└── anti-patterns.md # Common mistakes, what NOT to do
```
**For macOS app development example:**
```
references/
├── app-architecture.md # State management, dependency injection
├── swiftui-patterns.md # Declarative UI patterns
├── appkit-integration.md # Using AppKit with SwiftUI
├── concurrency-patterns.md # Async/await, actors, structured concurrency
├── data-persistence.md # Storage strategies
├── networking.md # URLSession, async networking
├── system-apis.md # macOS-specific frameworks
├── testing-tdd.md # Testing patterns
├── testing-debugging.md # Debugging tools and techniques
├── performance.md # Profiling, optimization
├── design-system.md # Platform conventions
├── macos-polish.md # Native feel, accessibility
├── security-code-signing.md # Signing, notarization
└── project-scaffolding.md # CLI-based setup
```
**For each reference file:**
- Pure XML structure
- Decision trees: "If X, use Y. If Z, use A instead."
- Comparison tables: Library vs Library (speed, features, learning curve)
- Code examples showing patterns
- "When to use" guidance
- Platform-specific considerations
- Current versions and compatibility
## Step 6: Create SKILL.md
Domain expertise skills use router pattern with essential principles:
```yaml
---
name: build-{domain-name}
description: Build {domain things} from scratch through shipping. Full lifecycle - build, debug, test, optimize, ship. {Any specific constraints like "CLI-only, no IDE"}.
---
<essential_principles>
## How {This Domain} Works
{Domain-specific principles that ALWAYS apply}
### 1. {First Principle}
{Critical practice that can't be skipped}
### 2. {Second Principle}
{Another fundamental practice}
### 3. {Third Principle}
{Core workflow pattern}
</essential_principles>
<intake>
**Ask the user:**
What would you like to do?
1. Build a new {thing}
2. Debug an existing {thing}
3. Add a feature
4. Write/run tests
5. Optimize performance
6. Ship/release
7. Something else
**Then read the matching workflow from `workflows/` and follow it.**
</intake>
<routing>
| Response | Workflow |
|----------|----------|
| 1, "new", "create", "build", "start" | `workflows/build-new-{thing}.md` |
| 2, "broken", "fix", "debug", "crash", "bug" | `workflows/debug-{thing}.md` |
| 3, "add", "feature", "implement", "change" | `workflows/add-feature.md` |
| 4, "test", "tests", "TDD", "coverage" | `workflows/write-tests.md` |
| 5, "slow", "optimize", "performance", "fast" | `workflows/optimize-performance.md` |
| 6, "ship", "release", "deploy", "publish" | `workflows/ship-{thing}.md` |
| 7, other | Clarify, then select workflow or references |
</routing>
<verification_loop>
## After Every Change
{Domain-specific verification steps}
Example for compiled languages:
```bash
# 1. Does it build?
{build command}
# 2. Do tests pass?
{test command}
# 3. Does it run?
{run command}
```
Report to the user:
- "Build: ✓"
- "Tests: X pass, Y fail"
- "Ready for you to check [specific thing]"
</verification_loop>
<reference_index>
## Domain Knowledge
All in `references/`:
**Architecture:** {list files}
**{Domain Area}:** {list files}
**{Domain Area}:** {list files}
**Development:** {list files}
**Shipping:** {list files}
</reference_index>
<workflows_index>
## Workflows
All in `workflows/`:
| File | Purpose |
|------|---------|
| build-new-{thing}.md | Create new {thing} from scratch |
| debug-{thing}.md | Find and fix bugs |
| add-feature.md | Add to existing {thing} |
| write-tests.md | Write and run tests |
| optimize-performance.md | Profile and speed up |
| ship-{thing}.md | Deploy/distribute |
</workflows_index>
```
## Step 7: Write Workflows
For EACH workflow identified in Step 3:
### Workflow Template
```markdown
# Workflow: {Workflow Name}
<required_reading>
**Read these reference files NOW before {doing the task}:**
1. references/{relevant-file}.md
2. references/{another-relevant-file}.md
3. references/{third-relevant-file}.md
</required_reading>
<process>
## Step 1: {First Action}
{What to do}
## Step 2: {Second Action}
{What to do - actual implementation steps}
## Step 3: {Third Action}
{What to do}
## Step 4: Verify
{How to prove it works}
```bash
{verification commands}
```
</process>
<anti_patterns>
Avoid:
- {Common mistake 1}
- {Common mistake 2}
- {Common mistake 3}
</anti_patterns>
<success_criteria>
A well-{completed task}:
- {Criterion 1}
- {Criterion 2}
- {Criterion 3}
- Builds/runs without errors
- Tests pass
- Feels {native/professional/correct}
</success_criteria>
```
**Key workflow characteristics:**
- Starts with required_reading (which references to load)
- Contains actual implementation steps (not just "read references")
- Includes verification steps
- Has success criteria
- Documents anti-patterns
## Step 8: Write Comprehensive References
For EACH reference file identified in Step 5:
### Structure Template
```xml
<overview>
Brief introduction to this domain area
</overview>
<options>
## Available Approaches/Libraries
<option name="Library A">
**When to use:** [specific scenarios]
**Strengths:** [what it's best at]
**Weaknesses:** [what it's not good for]
**Current status:** v{version}, actively maintained
**Learning curve:** [easy/medium/hard]
```code
# Example usage
```
</option>
<option name="Library B">
[Same structure]
</option>
</options>
<decision_tree>
## Choosing the Right Approach
**If you need [X]:** Use [Library A]
**If you need [Y]:** Use [Library B]
**If you have [constraint Z]:** Use [Library C]
**Avoid [Library D] if:** [specific scenarios]
</decision_tree>
<patterns>
## Common Patterns
<pattern name="Pattern Name">
**Use when:** [scenario]
**Implementation:** [code example]
**Considerations:** [trade-offs]
</pattern>
</patterns>
<anti_patterns>
## What NOT to Do
<anti_pattern name="Common Mistake">
**Problem:** [what people do wrong]
**Why it's bad:** [consequences]
**Instead:** [correct approach]
</anti_pattern>
</anti_patterns>
<platform_considerations>
## Platform-Specific Notes
**Windows:** [considerations]
**macOS:** [considerations]
**Linux:** [considerations]
**Mobile:** [if applicable]
</platform_considerations>
```
### Quality Standards
Each reference must include:
- **Current information** (verify dates)
- **Multiple options** (not just one library)
- **Decision guidance** (when to use each)
- **Real examples** (working code, not pseudocode)
- **Trade-offs** (no silver bullets)
- **Anti-patterns** (what NOT to do)
### Common Reference Files
Most domains need:
- **architecture.md** - How to structure projects
- **libraries.md** - Ecosystem overview with comparisons
- **patterns.md** - Design patterns specific to domain
- **testing-debugging.md** - How to verify correctness
- **performance.md** - Optimization strategies
- **deployment.md** - How to ship/distribute
- **anti-patterns.md** - Common mistakes consolidated
## Step 9: Validate Completeness
### Completeness Checklist
Ask: "Could a user build a professional {domain thing} from scratch through shipping using just this skill?"
**Must answer YES to:**
- [ ] All major libraries/frameworks covered?
- [ ] All architectural approaches documented?
- [ ] Complete lifecycle addressed (build → debug → test → optimize → ship)?
- [ ] Platform-specific considerations included?
- [ ] "When to use X vs Y" guidance provided?
- [ ] Common pitfalls documented?
- [ ] Current as of 2024-2025?
- [ ] Workflows actually execute tasks (not just reference knowledge)?
- [ ] Each workflow specifies which references to read?
**Specific gaps to check:**
- [ ] Testing strategy covered?
- [ ] Debugging/profiling tools listed?
- [ ] Deployment/distribution methods documented?
- [ ] Performance optimization addressed?
- [ ] Security considerations (if applicable)?
- [ ] Asset/resource management (if applicable)?
- [ ] Networking (if applicable)?
### Dual-Purpose Test
Test both use cases:
**Direct invocation:** "Can a user invoke this skill and build something?"
- Intake routes to appropriate workflow
- Workflow loads relevant references
- Workflow provides implementation steps
- Success criteria are clear
**Knowledge reference:** "Can create-plans load references to plan a project?"
- References contain decision guidance
- All options compared
- Complete lifecycle covered
- Architecture patterns documented
## Step 10: Create Directory and Files
```bash
# Create structure
mkdir -p ~/.claude/skills/expertise/{domain-name}
mkdir -p ~/.claude/skills/expertise/{domain-name}/workflows
mkdir -p ~/.claude/skills/expertise/{domain-name}/references
# Write SKILL.md
# Write all workflow files
# Write all reference files
# Verify structure
ls -R ~/.claude/skills/expertise/{domain-name}
```
## Step 11: Document in create-plans
Update `~/.claude/skills/create-plans/SKILL.md` to reference this new domain:
Add to the domain inference table:
```markdown
| "{keyword}", "{domain term}" | expertise/{domain-name} |
```
So create-plans can auto-detect and offer to load it.
## Step 12: Final Quality Check
Review entire skill:
**SKILL.md:**
- [ ] Name matches directory (build-{domain-name})
- [ ] Description explains it builds things from scratch through shipping
- [ ] Essential principles inline (always loaded)
- [ ] Intake asks what user wants to do
- [ ] Routing maps to workflows
- [ ] Reference index complete and organized
- [ ] Workflows index complete
**Workflows:**
- [ ] Each workflow starts with required_reading
- [ ] Each workflow has actual implementation steps
- [ ] Each workflow has verification steps
- [ ] Each workflow has success criteria
- [ ] Workflows cover full lifecycle (build, debug, test, optimize, ship)
**References:**
- [ ] Pure XML structure (no markdown headings)
- [ ] Decision guidance in every file
- [ ] Current versions verified
- [ ] Code examples work
- [ ] Anti-patterns documented
- [ ] Platform considerations included
**Completeness:**
- [ ] A professional practitioner would find this comprehensive
- [ ] No major libraries/patterns missing
- [ ] Full lifecycle covered
- [ ] Passes the "build from scratch through shipping" test
- [ ] Can be invoked directly by users
- [ ] Can be loaded by create-plans for knowledge
</process>
<success_criteria>
Domain expertise skill is complete when:
- [ ] Comprehensive research completed (5+ web searches)
- [ ] All sources verified for currency (2024-2025)
- [ ] Knowledge organized by domain areas (not arbitrary)
- [ ] Essential principles in SKILL.md (always loaded)
- [ ] Intake routes to appropriate workflows
- [ ] Each workflow has required_reading + implementation steps + verification
- [ ] Each reference has decision trees and comparisons
- [ ] Anti-patterns documented throughout
- [ ] Full lifecycle covered (build → debug → test → optimize → ship)
- [ ] Platform-specific considerations included
- [ ] Located in ~/.claude/skills/expertise/{domain-name}/
- [ ] Referenced in create-plans domain inference table
- [ ] Passes dual-purpose test: Can be invoked directly AND loaded for knowledge
- [ ] User can build something professional from scratch through shipping
</success_criteria>
<anti_patterns>
**DON'T:**
- Copy tutorial content without verification
- Include only "getting started" material
- Skip the "when NOT to use" guidance
- Forget to check if libraries are still maintained
- Organize by document type instead of domain concerns
- Make it knowledge-only with no execution workflows
- Skip verification steps in workflows
- Include outdated content from old blog posts
- Skip decision trees and comparisons
- Create workflows that just say "read the references"
**DO:**
- Verify everything is current
- Include complete lifecycle (build → ship)
- Provide decision guidance
- Document anti-patterns
- Make workflows execute real tasks
- Start workflows with required_reading
- Include verification in every workflow
- Make it exhaustive, not minimal
- Test both direct invocation and knowledge reference use cases
</anti_patterns>

View File

@@ -0,0 +1,191 @@
# Workflow: Create a New Skill
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md
2. references/skill-structure.md
3. references/core-principles.md
4. references/use-xml-tags.md
</required_reading>
<process>
## Step 1: Adaptive Requirements Gathering
**If user provided context** (e.g., "build a skill for X"):
→ Analyze what's stated, what can be inferred, what's unclear
→ Skip to asking about genuine gaps only
**If user just invoked skill without context:**
→ Ask what they want to build
### Using AskUserQuestion
Ask 2-4 domain-specific questions based on actual gaps. Each question should:
- Have specific options with descriptions
- Focus on scope, complexity, outputs, boundaries
- NOT ask things obvious from context
Example questions:
- "What specific operations should this skill handle?" (with options based on domain)
- "Should this also handle [related thing] or stay focused on [core thing]?"
- "What should the user see when successful?"
### Decision Gate
After initial questions, ask:
"Ready to proceed with building, or would you like me to ask more questions?"
Options:
1. **Proceed to building** - I have enough context
2. **Ask more questions** - There are more details to clarify
3. **Let me add details** - I want to provide additional context
## Step 2: Research Trigger (If External API)
**When external service detected**, ask using AskUserQuestion:
"This involves [service name] API. Would you like me to research current endpoints and patterns before building?"
Options:
1. **Yes, research first** - Fetch current documentation for accurate implementation
2. **No, proceed with general patterns** - Use common patterns without specific API research
If research requested:
- Use Context7 MCP to fetch current library documentation
- Or use WebSearch for recent API documentation
- Focus on 2024-2025 sources
- Store findings for use in content generation
## Step 3: Decide Structure
**Simple skill (single workflow, <200 lines):**
→ Single SKILL.md file with all content
**Complex skill (multiple workflows OR domain knowledge):**
→ Router pattern:
```
skill-name/
├── SKILL.md (router + principles)
├── workflows/ (procedures - FOLLOW)
├── references/ (knowledge - READ)
├── templates/ (output structures - COPY + FILL)
└── scripts/ (reusable code - EXECUTE)
```
Factors favoring router pattern:
- Multiple distinct user intents (create vs debug vs ship)
- Shared domain knowledge across workflows
- Essential principles that must not be skipped
- Skill likely to grow over time
**Consider templates/ when:**
- Skill produces consistent output structures (plans, specs, reports)
- Structure matters more than creative generation
**Consider scripts/ when:**
- Same code runs across invocations (deploy, setup, API calls)
- Operations are error-prone when rewritten each time
See references/recommended-structure.md for templates.
## Step 4: Create Directory
```bash
mkdir -p ~/.claude/skills/{skill-name}
# If complex:
mkdir -p ~/.claude/skills/{skill-name}/workflows
mkdir -p ~/.claude/skills/{skill-name}/references
# If needed:
mkdir -p ~/.claude/skills/{skill-name}/templates # for output structures
mkdir -p ~/.claude/skills/{skill-name}/scripts # for reusable code
```
## Step 5: Write SKILL.md
**Simple skill:** Write complete skill file with:
- YAML frontmatter (name, description)
- `<objective>`
- `<quick_start>`
- Content sections with pure XML
- `<success_criteria>`
**Complex skill:** Write router with:
- YAML frontmatter
- `<essential_principles>` (inline, unavoidable)
- `<intake>` (question to ask user)
- `<routing>` (maps answers to workflows)
- `<reference_index>` and `<workflows_index>`
## Step 6: Write Workflows (if complex)
For each workflow:
```xml
<required_reading>
Which references to load for this workflow
</required_reading>
<process>
Step-by-step procedure
</process>
<success_criteria>
How to know this workflow is done
</success_criteria>
```
## Step 7: Write References (if needed)
Domain knowledge that:
- Multiple workflows might need
- Doesn't change based on workflow
- Contains patterns, examples, technical details
## Step 8: Validate Structure
Check:
- [ ] YAML frontmatter valid
- [ ] Name matches directory (lowercase-with-hyphens)
- [ ] Description says what it does AND when to use it (third person)
- [ ] No markdown headings (#) in body - use XML tags
- [ ] Required tags present: objective, quick_start, success_criteria
- [ ] All referenced files exist
- [ ] SKILL.md under 500 lines
- [ ] XML tags properly closed
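Several of these checks can be automated. A minimal sketch of a structure checker that covers only the mechanically verifiable items above (everything else still needs a human read):
```python
#!/usr/bin/env python3
"""Sketch of an automated structure check for a SKILL.md file.
Covers only the mechanically checkable items; the rest need a human read."""
import re
import sys
from pathlib import Path

skill_dir = Path(sys.argv[1]).expanduser()
skill_md = skill_dir / "SKILL.md"
if not skill_md.exists():
    print(f"ERROR: {skill_md} not found", file=sys.stderr)
    sys.exit(1)

text = skill_md.read_text()
lines = text.splitlines()
problems = []

# YAML frontmatter with name (matching the directory) and description
if not text.startswith("---"):
    problems.append("Missing YAML frontmatter")
name_match = re.search(r"^name:\s*(\S+)", text, re.MULTILINE)
if not name_match:
    problems.append("Missing 'name:' in frontmatter")
elif name_match.group(1) != skill_dir.name:
    problems.append(
        f"name '{name_match.group(1)}' does not match directory '{skill_dir.name}'"
    )
if not re.search(r"^description:", text, re.MULTILINE):
    problems.append("Missing 'description:' in frontmatter")

# SKILL.md under 500 lines
if len(lines) > 500:
    problems.append(f"SKILL.md is {len(lines)} lines (limit 500)")

# No markdown headings in the body (ignoring fenced code blocks)
in_fence = False
for i, line in enumerate(lines, 1):
    if line.strip().startswith("```"):
        in_fence = not in_fence
    elif not in_fence and re.match(r"^#{1,6} ", line):
        problems.append(f"Line {i}: markdown heading in body - use XML tags")

for problem in problems:
    print(f"ERROR: {problem}")
sys.exit(1 if problems else 0)
```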
## Step 9: Create Slash Command
```bash
cat > ~/.claude/commands/{skill-name}.md << 'EOF'
---
description: {Brief description}
argument-hint: [{argument hint}]
allowed-tools: Skill({skill-name})
---
Invoke the {skill-name} skill for: $ARGUMENTS
EOF
```
## Step 10: Test
Invoke the skill and observe:
- Does it ask the right intake question?
- Does it load the right workflow?
- Does the workflow load the right references?
- Does output match expectations?
Iterate based on real usage, not assumptions.
</process>
<success_criteria>
Skill is complete when:
- [ ] Requirements gathered with appropriate questions
- [ ] API research done if external service involved
- [ ] Directory structure correct
- [ ] SKILL.md has valid frontmatter
- [ ] Essential principles inline (if complex skill)
- [ ] Intake question routes to correct workflow
- [ ] All workflows have required_reading + process + success_criteria
- [ ] References contain reusable domain knowledge
- [ ] Slash command exists and works
- [ ] Tested with real invocation
</success_criteria>

View File

@@ -0,0 +1,121 @@
# Workflow: Get Guidance on Skill Design
<required_reading>
**Read these reference files NOW:**
1. references/core-principles.md
2. references/recommended-structure.md
</required_reading>
<process>
## Step 1: Understand the Problem Space
Ask the user:
- What task or domain are you trying to support?
- Is this something you do repeatedly?
- What makes it complex enough to need a skill?
## Step 2: Determine If a Skill Is Right
**Create a skill when:**
- Task is repeated across multiple sessions
- Domain knowledge doesn't change frequently
- Complex enough to benefit from structure
- Would save significant time if automated
**Don't create a skill when:**
- One-off task (just do it directly)
- Changes constantly (will be outdated quickly)
- Too simple (overhead isn't worth it)
- Better as a slash command (user-triggered, no context needed)
Share this assessment with user.
## Step 3: Map the Workflows
Ask: "What are the different things someone might want to do with this skill?"
Common patterns:
- Create / Read / Update / Delete
- Build / Debug / Ship
- Setup / Use / Troubleshoot
- Import / Process / Export
Each distinct workflow = potential workflow file.
## Step 4: Identify Domain Knowledge
Ask: "What knowledge is needed regardless of which workflow?"
This becomes references:
- API patterns
- Best practices
- Common examples
- Configuration details
## Step 5: Draft the Structure
Based on answers, recommend structure:
**If 1 workflow, simple knowledge:**
```
skill-name/
└── SKILL.md (everything in one file)
```
**If 2+ workflows, shared knowledge:**
```
skill-name/
├── SKILL.md (router)
├── workflows/
│ ├── workflow-a.md
│ └── workflow-b.md
└── references/
└── shared-knowledge.md
```
## Step 6: Identify Essential Principles
Ask: "What rules should ALWAYS apply, no matter which workflow?"
These become `<essential_principles>` in SKILL.md.
Examples:
- "Always verify before reporting success"
- "Never store credentials in code"
- "Ask before making destructive changes"
## Step 7: Present Recommendation
Summarize:
- Recommended structure (simple vs router pattern)
- List of workflows
- List of references
- Essential principles
Ask: "Does this structure make sense? Ready to build it?"
If yes → offer to switch to "Create a new skill" workflow
If no → clarify and iterate
</process>
<decision_framework>
## Quick Decision Framework
| Situation | Recommendation |
|-----------|----------------|
| Single task, repeat often | Simple skill |
| Multiple related tasks | Router + workflows |
| Complex domain, many patterns | Router + workflows + references |
| User-triggered, fresh context | Slash command, not skill |
| One-off task | No skill needed |
</decision_framework>
<success_criteria>
Guidance is complete when:
- [ ] User understands if they need a skill
- [ ] Structure is recommended and explained
- [ ] Workflows are identified
- [ ] References are identified
- [ ] Essential principles are identified
- [ ] User is ready to build (or decided not to)
</success_criteria>

View File

@@ -0,0 +1,161 @@
# Workflow: Upgrade Skill to Router Pattern
<required_reading>
**Read these reference files NOW:**
1. references/recommended-structure.md
2. references/skill-structure.md
</required_reading>
<process>
## Step 1: Select the Skill
```bash
ls ~/.claude/skills/
```
Present numbered list, ask: "Which skill should be upgraded to the router pattern?"
## Step 2: Verify It Needs Upgrading
Read the skill:
```bash
cat ~/.claude/skills/{skill-name}/SKILL.md
ls ~/.claude/skills/{skill-name}/
```
**Already a router?** (has workflows/ and intake question)
→ Tell user it's already using router pattern, offer to add workflows instead
**Simple skill that should stay simple?** (under 200 lines, single workflow)
→ Explain that router pattern may be overkill, ask if they want to proceed anyway
**Good candidate for upgrade:**
- Over 200 lines
- Multiple distinct use cases
- Essential principles that shouldn't be skipped
- Growing complexity
## Step 3: Identify Components
Analyze the current skill and identify:
1. **Essential principles** - Rules that apply to ALL use cases
2. **Distinct workflows** - Different things a user might want to do
3. **Reusable knowledge** - Patterns, examples, technical details
Present findings:
```
## Analysis
**Essential principles I found:**
- [Principle 1]
- [Principle 2]
**Distinct workflows I identified:**
- [Workflow A]: [description]
- [Workflow B]: [description]
**Knowledge that could be references:**
- [Reference topic 1]
- [Reference topic 2]
```
Ask: "Does this breakdown look right? Any adjustments?"
## Step 4: Create Directory Structure
```bash
mkdir -p ~/.claude/skills/{skill-name}/workflows
mkdir -p ~/.claude/skills/{skill-name}/references
```
## Step 5: Extract Workflows
For each identified workflow:
1. Create `workflows/{workflow-name}.md`
2. Add required_reading section (references it needs)
3. Add process section (steps from original skill)
4. Add success_criteria section
## Step 6: Extract References
For each identified reference topic:
1. Create `references/{reference-name}.md`
2. Move relevant content from original skill
3. Structure with semantic XML tags
## Step 7: Rewrite SKILL.md as Router
Replace SKILL.md with router structure:
```markdown
---
name: {skill-name}
description: {existing description}
---
<essential_principles>
[Extracted principles - inline, cannot be skipped]
</essential_principles>
<intake>
**Ask the user:**
What would you like to do?
1. [Workflow A option]
2. [Workflow B option]
...
**Wait for response before proceeding.**
</intake>
<routing>
| Response | Workflow |
|----------|----------|
| 1, "keywords" | `workflows/workflow-a.md` |
| 2, "keywords" | `workflows/workflow-b.md` |
</routing>
<reference_index>
[List all references by category]
</reference_index>
<workflows_index>
| Workflow | Purpose |
|----------|---------|
| workflow-a.md | [What it does] |
| workflow-b.md | [What it does] |
</workflows_index>
```
## Step 8: Verify Nothing Was Lost
Compare original skill content against new structure:
- [ ] All principles preserved (now inline)
- [ ] All procedures preserved (now in workflows)
- [ ] All knowledge preserved (now in references)
- [ ] No orphaned content
## Step 9: Test
Invoke the upgraded skill:
- Does intake question appear?
- Does each routing option work?
- Do workflows load correct references?
- Does behavior match original skill?
Report any issues.
</process>
<success_criteria>
Upgrade is complete when:
- [ ] workflows/ directory created with workflow files
- [ ] references/ directory created (if needed)
- [ ] SKILL.md rewritten as router
- [ ] Essential principles inline in SKILL.md
- [ ] All original content preserved
- [ ] Intake question routes correctly
- [ ] Tested and working
</success_criteria>

View File

@@ -0,0 +1,204 @@
# Workflow: Verify Skill Content Accuracy
<required_reading>
**Read these reference files NOW:**
1. references/skill-structure.md
</required_reading>
<purpose>
Audit checks structure. **Verify checks truth.**
Skills contain claims about external things: APIs, CLI tools, frameworks, services. These change over time. This workflow checks if a skill's content is still accurate.
</purpose>
<process>
## Step 1: Select the Skill
```bash
ls ~/.claude/skills/
```
Present numbered list, ask: "Which skill should I verify for accuracy?"
## Step 2: Read and Categorize
Read the entire skill (SKILL.md + workflows/ + references/):
```bash
cat ~/.claude/skills/{skill-name}/SKILL.md
cat ~/.claude/skills/{skill-name}/workflows/*.md 2>/dev/null
cat ~/.claude/skills/{skill-name}/references/*.md 2>/dev/null
```
Categorize by primary dependency type:
| Type | Examples | Verification Method |
|------|----------|---------------------|
| **API/Service** | manage-stripe, manage-gohighlevel | Context7 + WebSearch |
| **CLI Tools** | build-macos-apps (xcodebuild, swift) | Run commands |
| **Framework** | build-iphone-apps (SwiftUI, UIKit) | Context7 for docs |
| **Integration** | setup-stripe-payments | WebFetch + Context7 |
| **Pure Process** | create-agent-skills | No external deps |
Report: "This skill is primarily [type]-based. I'll verify using [method]."
## Step 3: Extract Verifiable Claims
Scan skill content and extract:
**CLI Tools mentioned:**
- Tool names (xcodebuild, swift, npm, etc.)
- Specific flags/options documented
- Expected output patterns
**API Endpoints:**
- Service names (Stripe, Meta, etc.)
- Specific endpoints documented
- Authentication methods
- SDK versions
**Framework Patterns:**
- Framework names (SwiftUI, React, etc.)
- Specific APIs/patterns documented
- Version-specific features
**File Paths/Structures:**
- Expected project structures
- Config file locations
Present: "Found X verifiable claims to check."
## Step 4: Verify by Type
### For CLI Tools
```bash
# Check tool exists
which {tool-name}
# Check version
{tool-name} --version
# Verify documented flags work
{tool-name} --help | grep "{documented-flag}"
```
### For API/Service Skills
Use Context7 to fetch current documentation:
```
mcp__context7__resolve-library-id: {service-name}
mcp__context7__get-library-docs: {library-id}, topic: {relevant-topic}
```
Compare skill's documented patterns against current docs:
- Are endpoints still valid?
- Has authentication changed?
- Are there deprecated methods being used?
### For Framework Skills
Use Context7:
```
mcp__context7__resolve-library-id: {framework-name}
mcp__context7__get-library-docs: {library-id}, topic: {specific-api}
```
Check:
- Are documented APIs still current?
- Have patterns changed?
- Are there newer recommended approaches?
### For Integration Skills
WebSearch for recent changes:
```
"[service name] API changes 2025"
"[service name] breaking changes"
"[service name] deprecated endpoints"
```
Then Context7 for current SDK patterns.
### For Services with Status Pages
WebFetch official docs/changelog if available.
## Step 5: Generate Freshness Report
Present findings:
```
## Verification Report: {skill-name}
### ✅ Verified Current
- [Claim]: [Evidence it's still accurate]
### ⚠️ May Be Outdated
- [Claim]: [What changed / newer info found]
→ Current: [what docs now say]
### ❌ Broken / Invalid
- [Claim]: [Why it's wrong]
→ Fix: [What it should be]
### Could Not Verify
- [Claim]: [Why verification wasn't possible]
---
**Overall Status:** [Fresh / Needs Updates / Significantly Stale]
**Last Verified:** [Today's date]
```
## Step 6: Offer Updates
If issues found:
"Found [N] items that need updating. Would you like me to:"
1. **Update all** - Apply all corrections
2. **Review each** - Show each change before applying
3. **Just the report** - No changes
If updating:
- Make changes based on verified current information
- Add verification date comment if appropriate
- Report what was updated
## Step 7: Suggest Verification Schedule
Based on skill type, recommend:
| Skill Type | Recommended Frequency |
|------------|----------------------|
| API/Service | Every 1-2 months |
| Framework | Every 3-6 months |
| CLI Tools | Every 6 months |
| Pure Process | Annually |
"This skill should be re-verified in approximately [timeframe]."
</process>
<verification_shortcuts>
## Quick Verification Commands
**Check if CLI tool exists and get version:**
```bash
which {tool} && {tool} --version
```
**Context7 pattern for any library:**
```
1. resolve-library-id: "{library-name}"
2. get-library-docs: "{id}", topic: "{specific-feature}"
```
**WebSearch patterns:**
- Breaking changes: "{service} breaking changes 2025"
- Deprecations: "{service} deprecated API"
- Current best practices: "{framework} best practices 2025"
</verification_shortcuts>
<success_criteria>
Verification is complete when:
- [ ] Skill categorized by dependency type
- [ ] Verifiable claims extracted
- [ ] Each claim checked with appropriate method
- [ ] Freshness report generated
- [ ] Updates applied (if requested)
- [ ] User knows when to re-verify
</success_criteria>

View File

@@ -0,0 +1,332 @@
---
name: create-hooks
description: Expert guidance for creating, configuring, and using Claude Code hooks. Use when working with hooks, setting up event listeners, validating commands, automating workflows, adding notifications, or understanding hook types (PreToolUse, PostToolUse, Stop, SessionStart, UserPromptSubmit, etc).
---
<objective>
Hooks are event-driven automation for Claude Code that execute shell commands or LLM prompts in response to tool usage, session events, and user interactions. This skill teaches you how to create, configure, and debug hooks for validating commands, automating workflows, injecting context, and implementing custom completion criteria.
Hooks provide programmatic control over Claude's behavior without modifying core code, enabling project-specific automation, safety checks, and workflow customization.
</objective>
<context>
Hooks are shell commands or LLM-evaluated prompts that execute in response to Claude Code events. They operate within an event hierarchy: events (PreToolUse, PostToolUse, Stop, etc.) trigger matchers (tool patterns) which fire hooks (commands or prompts). Hooks can block actions, modify tool inputs, inject context, or simply observe and log Claude's operations.
</context>
<quick_start>
<workflow>
1. Create hooks config file:
- Project: `.claude/hooks.json`
- User: `~/.claude/hooks.json`
2. Choose hook event (when it fires)
3. Choose hook type (command or prompt)
4. Configure matcher (which tools trigger it)
5. Test with `claude --debug`
</workflow>
<example>
**Log all bash commands**:
`.claude/hooks.json`:
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "jq -r '\"\\(.tool_input.command) - \\(.tool_input.description // \\\"No description\\\")\"' >> ~/.claude/bash-log.txt"
}
]
}
]
}
}
```
This hook:
- Fires before (`PreToolUse`) every `Bash` tool use
- Executes a `command` (not an LLM prompt)
- Logs command + description to a file
</example>
</quick_start>
<hook_types>
| Event | When it fires | Can block? |
|-------|---------------|------------|
| **PreToolUse** | Before tool execution | Yes |
| **PostToolUse** | After tool execution | No |
| **UserPromptSubmit** | User submits a prompt | Yes |
| **Stop** | Claude attempts to stop | Yes |
| **SubagentStop** | Subagent attempts to stop | Yes |
| **SessionStart** | Session begins | No |
| **SessionEnd** | Session ends | No |
| **PreCompact** | Before context compaction | Yes |
| **Notification** | Claude needs input | No |
Blocking hooks can return `"decision": "block"` to prevent the action. See [references/hook-types.md](references/hook-types.md) for detailed use cases.
</hook_types>
<hook_anatomy>
<hook_type name="command">
**Type**: Executes a shell command
**Use when**:
- Simple validation (check file exists)
- Logging (append to file)
- External tools (formatters, linters)
- Desktop notifications
**Input**: JSON via stdin
**Output**: JSON via stdout (optional)
```json
{
"type": "command",
"command": "/path/to/script.sh",
"timeout": 30000
}
```
</hook_type>
<hook_type name="prompt">
**Type**: LLM evaluates a prompt
**Use when**:
- Complex decision logic
- Natural language validation
- Context-aware checks
- Reasoning required
**Input**: Prompt with `$ARGUMENTS` placeholder
**Output**: JSON with `decision` and `reason`
```json
{
"type": "prompt",
"prompt": "Evaluate if this command is safe: $ARGUMENTS\n\nReturn JSON: {\"decision\": \"approve\" or \"block\", \"reason\": \"explanation\"}"
}
```
</hook_type>
</hook_anatomy>
<matchers>
Matchers filter which tools trigger the hook:
```json
{
"matcher": "Bash", // Exact match
"matcher": "Write|Edit", // Multiple tools (regex OR)
"matcher": "mcp__.*", // All MCP tools
"matcher": "mcp__memory__.*" // Specific MCP server
}
```
**No matcher**: Hook fires for all tools
```json
{
"hooks": {
"UserPromptSubmit": [
{
"hooks": [...] // No matcher - fires on every user prompt
}
]
}
}
```
</matchers>
<input_output>
Hooks receive JSON via stdin with session info, current directory, and event-specific data. Blocking hooks can return JSON to approve/block actions or modify inputs.
**Example output** (blocking hooks):
```json
{
"decision": "approve" | "block",
"reason": "Why this decision was made"
}
```
See [references/input-output-schemas.md](references/input-output-schemas.md) for complete schemas for each hook type.
</input_output>
<environment_variables>
Available in hook commands:
| Variable | Value |
|----------|-------|
| `$CLAUDE_PROJECT_DIR` | Project root directory |
| `${CLAUDE_PLUGIN_ROOT}` | Plugin directory (plugin hooks only) |
| `$ARGUMENTS` | Hook input JSON (prompt hooks only) |
**Example**:
```json
{
"command": "$CLAUDE_PROJECT_DIR/.claude/hooks/validate.sh"
}
```
</environment_variables>
<common_patterns>
**Desktop notification when input needed**:
```json
{
"hooks": {
"Notification": [
{
"hooks": [
{
"type": "command",
"command": "osascript -e 'display notification \"Claude needs input\" with title \"Claude Code\"'"
}
]
}
]
}
}
```
**Block destructive git commands**:
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "prompt",
"prompt": "Check if this command is destructive: $ARGUMENTS\n\nBlock if it contains: 'git push --force', 'rm -rf', 'git reset --hard'\n\nReturn: {\"decision\": \"approve\" or \"block\", \"reason\": \"explanation\"}"
}
]
}
]
}
}
```
**Auto-format code after edits**:
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "prettier --write $CLAUDE_PROJECT_DIR",
"timeout": 10000
}
]
}
]
}
}
```
**Add context at session start**:
```json
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "echo '{\"hookSpecificOutput\": {\"hookEventName\": \"SessionStart\", \"additionalContext\": \"Current sprint: Sprint 23. Focus: User authentication\"}}'"
}
]
}
]
}
}
```
</common_patterns>
<debugging>
Always test hooks with the debug flag:
```bash
claude --debug
```
This shows which hooks matched, command execution, and output. See [references/troubleshooting.md](references/troubleshooting.md) for common issues and solutions.
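Typical debug output looks like this (illustrative):
```
[DEBUG] Getting matching hook commands for PreToolUse with query: Bash
[DEBUG] Found 1 hook matchers in settings
[DEBUG] Executing hook command: /path/to/script.sh
[DEBUG] Hook command completed with status 0
```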
</debugging>
<reference_guides>
**Hook types and events**: [references/hook-types.md](references/hook-types.md)
- Complete list of hook events
- When each event fires
- Input/output schemas for each
- Blocking vs non-blocking hooks
**Command vs Prompt hooks**: [references/command-vs-prompt.md](references/command-vs-prompt.md)
- Decision tree: which type to use
- Command hook patterns and examples
- Prompt hook patterns and examples
- Performance considerations
**Matchers and patterns**: [references/matchers.md](references/matchers.md)
- Regex patterns for tool matching
- MCP tool matching patterns
- Multiple tool matching
- Debugging matcher issues
**Input/Output schemas**: [references/input-output-schemas.md](references/input-output-schemas.md)
- Complete schema for each hook type
- Field descriptions and types
- Hook-specific output fields
- Example JSON for each event
**Working examples**: [references/examples.md](references/examples.md)
- Desktop notifications
- Command validation
- Auto-formatting workflows
- Logging and audit trails
- Stop logic patterns
- Session context injection
**Troubleshooting**: [references/troubleshooting.md](references/troubleshooting.md)
- Hooks not triggering
- Command execution failures
- Prompt hook issues
- Permission problems
- Timeout handling
- Debug workflow
</reference_guides>
<security_checklist>
**Critical safety requirements**:
- **Infinite loop prevention**: Check `stop_hook_active` flag in Stop hooks to prevent recursive triggering
- **Timeout configuration**: Set reasonable timeouts (default: 60s) to prevent hanging
- **Permission validation**: Ensure hook scripts have executable permissions (`chmod +x`)
- **Path safety**: Use absolute paths with `$CLAUDE_PROJECT_DIR` to avoid path injection
- **JSON validation**: Validate hook config with `jq` before use to catch syntax errors
- **Selective blocking**: Be conservative with blocking hooks to avoid workflow disruption
**Testing protocol**:
```bash
# Always test with debug flag first
claude --debug
# Validate JSON config
jq . .claude/hooks.json
```
</security_checklist>
<success_criteria>
A working hook configuration has:
- Valid JSON in `.claude/hooks.json` (validated with `jq`)
- Appropriate hook event selected for the use case
- Correct matcher pattern that matches target tools
- Command or prompt that executes without errors
- Proper output schema (decision/reason for blocking hooks)
- Tested with `--debug` flag showing expected behavior
- No infinite loops in Stop hooks (checks `stop_hook_active` flag)
- Reasonable timeout set (especially for external commands)
- Executable permissions on script files if using file paths
</success_criteria>

View File

@@ -0,0 +1,269 @@
# Command vs Prompt Hooks
Decision guide for choosing between command-based and prompt-based hooks.
## Decision Tree
```
Need to execute a hook?
├─ Simple yes/no validation?
│ └─ Use COMMAND (faster, cheaper)
├─ Need natural language understanding?
│ └─ Use PROMPT (LLM evaluation)
├─ External tool interaction?
│ └─ Use COMMAND (formatters, linters, git)
├─ Complex decision logic?
│ └─ Use PROMPT (reasoning required)
└─ Logging/notification only?
└─ Use COMMAND (no decision needed)
```
---
## Command Hooks
### Characteristics
- **Execution**: Shell command
- **Input**: JSON via stdin
- **Output**: JSON via stdout (optional)
- **Speed**: Fast (no LLM call)
- **Cost**: Free (no API usage)
- **Complexity**: Limited to shell scripting logic
### When to use
**Use command hooks for**:
- File operations (read, write, check existence)
- Running tools (prettier, eslint, git)
- Simple pattern matching (grep, regex)
- Logging to files
- Desktop notifications
- Fast validation (file size, permissions)
**Don't use command hooks for**:
- Natural language analysis
- Complex decision logic
- Context-aware validation
- Semantic understanding
### Examples
**1. Log bash commands**
```json
{
"type": "command",
"command": "jq -r '\"\\(.tool_input.command) - \\(.tool_input.description // \\\"No description\\\")\"' >> ~/.claude/bash-log.txt"
}
```
**2. Block if file doesn't exist**
```bash
#!/bin/bash
# check-file-exists.sh
input=$(cat)
file=$(echo "$input" | jq -r '.tool_input.file_path')
if [ ! -f "$file" ]; then
echo '{"decision": "block", "reason": "File does not exist"}'
exit 0
fi
echo '{"decision": "approve", "reason": "File exists"}'
```
**3. Run prettier after edits**
```json
{
"type": "command",
"command": "prettier --write \"$(echo {} | jq -r '.tool_input.file_path')\"",
"timeout": 10000
}
```
**4. Desktop notification**
```json
{
"type": "command",
"command": "osascript -e 'display notification \"Claude needs input\" with title \"Claude Code\"'"
}
```
### Parsing input in commands
Command hooks receive JSON via stdin. Use `jq` to parse:
```bash
#!/bin/bash
input=$(cat) # Read stdin
# Extract fields
tool_name=$(echo "$input" | jq -r '.tool_name')
command=$(echo "$input" | jq -r '.tool_input.command')
session_id=$(echo "$input" | jq -r '.session_id')
# Your logic here
if [[ "$command" == *"rm -rf"* ]]; then
echo '{"decision": "block", "reason": "Dangerous command"}'
else
echo '{"decision": "approve", "reason": "Safe"}'
fi
```
---
## Prompt Hooks
### Characteristics
- **Execution**: LLM evaluates prompt
- **Input**: Prompt string with `$ARGUMENTS` placeholder
- **Output**: LLM generates JSON response
- **Speed**: Slower (~1-3s per evaluation)
- **Cost**: Uses API credits
- **Complexity**: Can reason, understand context, analyze semantics
### When to use
**Use prompt hooks for**:
- Natural language validation
- Semantic analysis (intent, safety, appropriateness)
- Complex decision trees
- Context-aware checks
- Reasoning about code quality
- Understanding user intent
**Don't use prompt hooks for**:
- Simple pattern matching (use regex/grep)
- File operations (use command hooks)
- High-frequency events (too slow/expensive)
- Non-decision tasks (logging, notifications)
### Examples
**1. Validate commit messages**
```json
{
"type": "prompt",
"prompt": "Evaluate this git commit message: $ARGUMENTS\n\nCheck if it:\n1. Starts with conventional commit type (feat|fix|docs|refactor|test|chore)\n2. Is descriptive and clear\n3. Under 72 characters\n\nReturn: {\"decision\": \"approve\" or \"block\", \"reason\": \"specific feedback\"}"
}
```
**2. Check if Stop is appropriate**
```json
{
"type": "prompt",
"prompt": "Review the conversation transcript: $ARGUMENTS\n\nDetermine if Claude should stop:\n1. All user tasks completed?\n2. Any errors that need fixing?\n3. Tests passing?\n4. Documentation updated?\n\nIf incomplete: {\"decision\": \"block\", \"reason\": \"what's missing\"}\nIf complete: {\"decision\": \"approve\", \"reason\": \"all done\"}"
}
```
**3. Validate code changes for security**
```json
{
"type": "prompt",
"prompt": "Analyze this code change for security issues: $ARGUMENTS\n\nCheck for:\n- SQL injection vulnerabilities\n- XSS attack vectors\n- Authentication bypasses\n- Sensitive data exposure\n\nIf issues found: {\"decision\": \"block\", \"reason\": \"specific vulnerabilities\"}\nIf safe: {\"decision\": \"approve\", \"reason\": \"no issues found\"}"
}
```
**4. Semantic prompt validation**
```json
{
"type": "prompt",
"prompt": "Evaluate user prompt: $ARGUMENTS\n\nIs this:\n1. Related to coding/development?\n2. Appropriate and professional?\n3. Clear and actionable?\n\nIf inappropriate: {\"decision\": \"block\", \"reason\": \"why\"}\nIf good: {\"decision\": \"approve\", \"reason\": \"ok\"}"
}
```
### Writing effective prompts
**Be specific about output format**:
```
Return JSON: {"decision": "approve" or "block", "reason": "explanation"}
```
**Provide clear criteria**:
```
Block if:
1. Command contains 'rm -rf /'
2. Force push to main branch
3. Credentials in plain text
Otherwise approve.
```
**Use $ARGUMENTS placeholder**:
```
Analyze this input: $ARGUMENTS
Check for...
```
The `$ARGUMENTS` placeholder is replaced with the actual hook input JSON.
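For example, a PreToolUse hook on `Bash` would see `$ARGUMENTS` expand to something like this (abridged — the full input also carries `session_id`, `cwd`, and the other common fields):
```json
{
  "hook_event_name": "PreToolUse",
  "tool_name": "Bash",
  "tool_input": {
    "command": "npm install",
    "description": "Install dependencies"
  }
}
```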
---
## Performance Comparison
| Aspect | Command Hook | Prompt Hook |
|--------|--------------|-------------|
| **Speed** | <100ms | 1-3s |
| **Cost** | Free | ~$0.001-0.01 per call |
| **Complexity** | Shell scripting | Natural language |
| **Context awareness** | Limited | High |
| **Reasoning** | No | Yes |
| **Best for** | Operations, logging | Validation, analysis |
---
## Combining Both
You can use multiple hooks for the same event:
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "echo \"$input\" >> ~/bash-log.txt",
"comment": "Log every command (fast)"
},
{
"type": "prompt",
"prompt": "Analyze this bash command for safety: $ARGUMENTS",
"comment": "Validate with LLM (slower, smarter)"
}
]
}
]
}
}
```
Hooks execute in order. If any hook blocks, execution stops.
---
## Recommendations
**High-frequency events** (PreToolUse, PostToolUse):
- Prefer command hooks
- Use prompt hooks sparingly
- Cache LLM decisions when possible
**Low-frequency events** (Stop, UserPromptSubmit):
- Prompt hooks are fine
- Cost/latency less critical
**Balance**:
- Command hooks for simple checks
- Prompt hooks for complex validation
- Combine when appropriate

View File

@@ -0,0 +1,658 @@
# Working Examples
Real-world hook configurations ready to use.
## Desktop Notifications
### macOS notification when input needed
```json
{
"hooks": {
"Notification": [
{
"hooks": [
{
"type": "command",
"command": "osascript -e 'display notification \"Claude needs your input\" with title \"Claude Code\" sound name \"Glass\"'"
}
]
}
]
}
}
```
### Linux notification (notify-send)
```json
{
"hooks": {
"Notification": [
{
"hooks": [
{
"type": "command",
"command": "notify-send 'Claude Code' 'Awaiting your input' --urgency=normal"
}
]
}
]
}
}
```
### Play sound on notification
```json
{
"hooks": {
"Notification": [
{
"hooks": [
{
"type": "command",
"command": "afplay /System/Library/Sounds/Glass.aiff"
}
]
}
]
}
}
```
---
## Logging
### Log all bash commands
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "jq -r '\"[\" + (.timestamp // now | todate) + \"] \" + .tool_input.command + \" - \" + (.tool_input.description // \"No description\")' >> ~/.claude/bash-log.txt"
}
]
}
]
}
}
```
### Log file operations
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "jq -r '\"[\" + (now | todate) + \"] \" + .tool_name + \": \" + .tool_input.file_path' >> ~/.claude/file-operations.log"
}
]
}
]
}
}
```
### Audit trail for MCP operations
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "mcp__.*",
"hooks": [
{
"type": "command",
"command": "jq '. + {timestamp: now}' >> ~/.claude/mcp-audit.jsonl"
}
]
}
]
}
}
```
---
## Code Quality
### Auto-format after edits
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "prettier --write \"$(echo {} | jq -r '.tool_input.file_path')\" 2>/dev/null || true",
"timeout": 10000
}
]
}
]
}
}
```
### Run linter after code changes
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "eslint \"$(echo {} | jq -r '.tool_input.file_path')\" --fix 2>/dev/null || true"
}
]
}
]
}
}
```
### Run tests before stopping
```json
{
"hooks": {
"Stop": [
{
"hooks": [
{
"type": "command",
"command": "/path/to/check-tests.sh"
}
]
}
]
}
}
```
`check-tests.sh`:
```bash
#!/bin/bash
input=$(cat)
cd "$(echo "$input" | jq -r '.cwd')" || exit 1
# Run tests
npm test > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo '{"decision": "approve", "reason": "All tests passing"}'
else
echo '{"decision": "block", "reason": "Tests are failing. Please fix before stopping.", "systemMessage": "Run npm test to see failures"}'
fi
```
---
## Safety and Validation
### Block destructive commands
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "/path/to/check-command-safety.sh"
}
]
}
]
}
}
```
`check-command-safety.sh`:
```bash
#!/bin/bash
input=$(cat)
command=$(echo "$input" | jq -r '.tool_input.command')
# Check for dangerous patterns
if [[ "$command" == *"rm -rf /"* ]] || \
[[ "$command" == *"mkfs"* ]] || \
[[ "$command" == *"> /dev/sda"* ]]; then
echo '{"decision": "block", "reason": "Destructive command detected", "systemMessage": "This command could cause data loss"}'
exit 0
fi
# Check for force push to main
if [[ "$command" == *"git push"*"--force"* ]] && \
[[ "$command" == *"main"* || "$command" == *"master"* ]]; then
echo '{"decision": "block", "reason": "Force push to main branch blocked", "systemMessage": "Use a feature branch instead"}'
exit 0
fi
echo '{"decision": "approve", "reason": "Command is safe"}'
```
### Validate commit messages
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "prompt",
"prompt": "Check if this is a git commit command: $ARGUMENTS\n\nIf it's a git commit, validate the message follows conventional commits format (feat|fix|docs|refactor|test|chore): description\n\nIf invalid format: {\"decision\": \"block\", \"reason\": \"Commit message must follow conventional commits\"}\nIf valid or not a commit: {\"decision\": \"approve\", \"reason\": \"ok\"}"
}
]
}
]
}
}
```
### Block writes to critical files
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "/path/to/check-protected-files.sh"
}
]
}
]
}
}
```
`check-protected-files.sh`:
```bash
#!/bin/bash
input=$(cat)
file_path=$(echo "$input" | jq -r '.tool_input.file_path')
# Protected files
protected_files=(
"package-lock.json"
".env.production"
"credentials.json"
)
for protected in "${protected_files[@]}"; do
if [[ "$file_path" == *"$protected"* ]]; then
echo "{\"decision\": \"block\", \"reason\": \"Cannot modify $protected\", \"systemMessage\": \"This file is protected from automated changes\"}"
exit 0
fi
done
echo '{"decision": "approve", "reason": "File is not protected"}'
```
---
## Context Injection
### Load sprint context at session start
```json
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "/path/to/load-sprint-context.sh"
}
]
}
]
}
}
```
`load-sprint-context.sh`:
```bash
#!/bin/bash
# Read sprint info from file
sprint_info=$(cat "$CLAUDE_PROJECT_DIR/.sprint-context.txt" 2>/dev/null || echo "No sprint context available")
# Return as SessionStart context
jq -n \
--arg context "$sprint_info" \
'{
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": $context
}
}'
```
### Load git branch context
```json
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "cd \"$cwd\" && git branch --show-current | jq -Rs '{\"hookSpecificOutput\": {\"hookEventName\": \"SessionStart\", \"additionalContext\": (\"Current branch: \" + .)}}'"
}
]
}
]
}
}
```
### Load environment info
```json
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "echo '{\"hookSpecificOutput\": {\"hookEventName\": \"SessionStart\", \"additionalContext\": \"Environment: '$(hostname)'\\nNode version: '$(node --version 2>/dev/null || echo 'not installed')'\\nPython version: '$(python3 --version 2>/dev/null || echo 'not installed)'\"}}'"
}
]
}
]
}
}
```
---
## Workflow Automation
### Auto-commit after major changes
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "/path/to/auto-commit.sh"
}
]
}
]
}
}
```
`auto-commit.sh`:
```bash
#!/bin/bash
input=$(cat)
cd "$(echo "$input" | jq -r '.cwd')" || exit 1
# Check if there are changes
if ! git diff --quiet; then
git add -A
git commit -m "chore: auto-commit from claude session" --no-verify
echo '{"systemMessage": "Changes auto-committed"}'
fi
```
### Update documentation after code changes
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "/path/to/update-docs.sh",
"timeout": 30000
}
]
}
]
}
}
```
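`update-docs.sh` is project-specific. A minimal sketch — the path filter and doc-generation command are placeholders for whatever your project uses:
```bash
#!/bin/bash
input=$(cat)
file_path=$(echo "$input" | jq -r '.tool_input.file_path')
# Regenerate docs only when a source file changed
# (the */src/* filter and docs:generate script are placeholders)
case "$file_path" in
  */src/*)
    npm run docs:generate > /dev/null 2>&1 || true
    echo '{"systemMessage": "Documentation regenerated"}'
    ;;
esac
```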
### Run pre-commit hooks
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "/path/to/check-pre-commit.sh"
}
]
}
]
}
}
```
`check-pre-commit.sh`:
```bash
#!/bin/bash
input=$(cat)
command=$(echo "$input" | jq -r '.tool_input.command')
# If git commit, run pre-commit hooks first
if [[ "$command" == *"git commit"* ]]; then
pre-commit run --all-files > /dev/null 2>&1
if [ $? -ne 0 ]; then
echo '{"decision": "block", "reason": "Pre-commit hooks failed", "systemMessage": "Fix formatting/linting issues first"}'
exit 0
fi
fi
echo '{"decision": "approve", "reason": "ok"}'
```
---
## Session Management
### Archive transcript on session end
```json
{
"hooks": {
"SessionEnd": [
{
"hooks": [
{
"type": "command",
"command": "/path/to/archive-session.sh"
}
]
}
]
}
}
```
`archive-session.sh`:
```bash
#!/bin/bash
input=$(cat)
transcript_path=$(echo "$input" | jq -r '.transcript_path')
session_id=$(echo "$input" | jq -r '.session_id')
# Create archive directory
archive_dir="$HOME/.claude/archives"
mkdir -p "$archive_dir"
# Copy transcript with timestamp
timestamp=$(date +%Y%m%d-%H%M%S)
cp "$transcript_path" "$archive_dir/${timestamp}-${session_id}.jsonl"
echo "Session archived to $archive_dir"
```
### Save session stats
```json
{
"hooks": {
"SessionEnd": [
{
"hooks": [
{
"type": "command",
"command": "jq '. + {ended_at: now}' >> ~/.claude/session-stats.jsonl"
}
]
}
]
}
}
```
---
## Advanced Patterns
### Intelligent stop logic
```json
{
"hooks": {
"Stop": [
{
"hooks": [
{
"type": "prompt",
"prompt": "Review the conversation: $ARGUMENTS\n\nCheck if:\n1. All user-requested tasks are complete\n2. Tests are passing (if code changes made)\n3. No errors that need fixing\n4. Documentation updated (if applicable)\n\nIf incomplete: {\"decision\": \"block\", \"reason\": \"specific issue\", \"systemMessage\": \"what needs to be done\"}\n\nIf complete: {\"decision\": \"approve\", \"reason\": \"all tasks done\"}\n\nIMPORTANT: If stop_hook_active is true, return {\"decision\": undefined} to avoid infinite loop",
"timeout": 30000
}
]
}
]
}
}
```
### Chain multiple hooks
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "echo 'First hook' >> /tmp/hook-chain.log"
},
{
"type": "command",
"command": "echo 'Second hook' >> /tmp/hook-chain.log"
},
{
"type": "prompt",
"prompt": "Final validation: $ARGUMENTS"
}
]
}
]
}
}
```
Hooks execute in order. First block stops the chain.
### Conditional execution based on file type
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "/path/to/format-by-type.sh"
}
]
}
]
}
}
```
`format-by-type.sh`:
```bash
#!/bin/bash
input=$(cat)
file_path=$(echo "$input" | jq -r '.tool_input.file_path')
case "$file_path" in
*.js|*.jsx|*.ts|*.tsx)
prettier --write "$file_path"
;;
*.py)
black "$file_path"
;;
*.go)
gofmt -w "$file_path"
;;
esac
```
---
## Project-Specific Hooks
Use `$CLAUDE_PROJECT_DIR` for project-specific hooks:
```json
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "$CLAUDE_PROJECT_DIR/.claude/hooks/init-session.sh"
}
]
}
],
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "$CLAUDE_PROJECT_DIR/.claude/hooks/validate-changes.sh"
}
]
}
]
}
}
```
This keeps hook scripts versioned with the project.
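A sketch of what `init-session.sh` might contain (the context file name is an assumption — point it at whatever your project keeps):
```bash
#!/bin/bash
# Emit additional context for SessionStart (see input-output-schemas.md)
# .claude/project-context.md is a placeholder file name
context=$(cat "$CLAUDE_PROJECT_DIR/.claude/project-context.md" 2>/dev/null || echo "No project context file")
jq -n --arg context "$context" '{
  "hookSpecificOutput": {
    "hookEventName": "SessionStart",
    "additionalContext": $context
  }
}'
```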

View File

@@ -0,0 +1,463 @@
# Hook Types and Events
Complete reference for all Claude Code hook events.
## PreToolUse
**When it fires**: Before any tool is executed
**Can block**: Yes
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "PreToolUse",
"tool_name": "Bash",
"tool_input": {
"command": "npm install",
"description": "Install dependencies"
}
}
```
**Output schema** (to control execution):
```json
{
"decision": "approve" | "block",
"reason": "Explanation",
"permissionDecision": "allow" | "deny" | "ask",
"permissionDecisionReason": "Why",
"updatedInput": {
"command": "npm install --save-exact"
}
}
```
**Use cases**:
- Validate commands before execution
- Block dangerous operations
- Modify tool inputs
- Log command attempts
- Ask user for confirmation
**Example**: Block force pushes to main
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "prompt",
"prompt": "Check if this git command is safe: $ARGUMENTS\n\nBlock if: force push to main/master\n\nReturn: {\"decision\": \"approve\" or \"block\", \"reason\": \"explanation\"}"
}
]
}
]
}
}
```
---
## PostToolUse
**When it fires**: After a tool completes execution
**Can block**: No (tool already executed)
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "PostToolUse",
"tool_name": "Write",
"tool_input": {
"file_path": "/path/to/file.js",
"content": "..."
},
"tool_output": "File created successfully"
}
```
**Output schema**:
```json
{
"systemMessage": "Optional message to display",
"suppressOutput": false
}
```
**Use cases**:
- Auto-format code after Write/Edit
- Run tests after code changes
- Update documentation
- Trigger CI builds
- Send notifications
**Example**: Auto-format after edits
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "prettier --write $CLAUDE_PROJECT_DIR",
"timeout": 10000
}
]
}
]
}
}
```
---
## UserPromptSubmit
**When it fires**: User submits a prompt to Claude
**Can block**: Yes
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "UserPromptSubmit",
"prompt": "Write a function to calculate factorial"
}
```
**Output schema**:
```json
{
"decision": "approve" | "block",
"reason": "Explanation",
"systemMessage": "Message to user"
}
```
**Use cases**:
- Validate prompt format
- Block inappropriate requests
- Preprocess user input
- Add context to prompts
- Enforce prompt templates
**Example**: Require issue numbers in prompts
```json
{
"hooks": {
"UserPromptSubmit": [
{
"hooks": [
{
"type": "prompt",
"prompt": "Check if prompt mentions an issue number (e.g., #123 or PROJ-456): $ARGUMENTS\n\nIf no issue number: {\"decision\": \"block\", \"reason\": \"Please include issue number\"}\nOtherwise: {\"decision\": \"approve\", \"reason\": \"ok\"}"
}
]
}
]
}
}
```
---
## Stop
**When it fires**: Claude attempts to stop working
**Can block**: Yes
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "Stop",
"stop_hook_active": false
}
```
**Output schema**:
```json
{
"decision": "block" | undefined,
"reason": "Why Claude should continue",
"continue": true,
"systemMessage": "Additional instructions"
}
```
**Use cases**:
- Verify all tasks completed
- Check for errors that need fixing
- Ensure tests pass before stopping
- Validate deliverables
- Custom completion criteria
**Example**: Verify tests pass before stopping
```json
{
"hooks": {
"Stop": [
{
"hooks": [
{
"type": "command",
"command": "npm test && echo '{\"decision\": \"approve\"}' || echo '{\"decision\": \"block\", \"reason\": \"Tests failing\"}'"
}
]
}
]
}
}
```
**Important**: Check `stop_hook_active` to avoid infinite loops. If true, don't block again.
---
## SubagentStop
**When it fires**: A subagent attempts to stop
**Can block**: Yes
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "SubagentStop",
"stop_hook_active": false
}
```
**Output schema**: Same as Stop
**Use cases**:
- Verify subagent completed its task
- Check for errors in subagent output
- Validate subagent deliverables
- Ensure quality before accepting results
**Example**: Check if code-reviewer provided feedback
```json
{
"hooks": {
"SubagentStop": [
{
"hooks": [
{
"type": "prompt",
"prompt": "Review the subagent transcript: $ARGUMENTS\n\nDid the code-reviewer provide:\n1. Specific issues found\n2. Severity ratings\n3. Remediation steps\n\nIf missing: {\"decision\": \"block\", \"reason\": \"Incomplete review\"}\nOtherwise: {\"decision\": \"approve\", \"reason\": \"Complete\"}"
}
]
}
]
}
}
```
---
## SessionStart
**When it fires**: At the beginning of a Claude session
**Can block**: No
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "SessionStart",
"source": "startup"
}
```
**Output schema**:
```json
{
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": "Context to inject into session"
}
}
```
**Use cases**:
- Load project context
- Inject sprint information
- Set environment variables
- Initialize state
- Display welcome messages
**Example**: Load current sprint context
```json
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "cat $CLAUDE_PROJECT_DIR/.sprint-context.txt | jq -Rs '{\"hookSpecificOutput\": {\"hookEventName\": \"SessionStart\", \"additionalContext\": .}}'"
}
]
}
]
}
}
```
---
## SessionEnd
**When it fires**: When a Claude session ends
**Can block**: No (cannot prevent session end)
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "SessionEnd",
"reason": "exit" | "error" | "timeout"
}
```
**Output schema**: None (hook output ignored)
**Use cases**:
- Save session state
- Cleanup temporary files
- Update logs
- Send analytics
- Archive transcripts
**Example**: Archive session transcript
```json
{
"hooks": {
"SessionEnd": [
{
"hooks": [
{
"type": "command",
"command": "cp $transcript_path $CLAUDE_PROJECT_DIR/.claude/archives/$(date +%Y%m%d-%H%M%S).jsonl"
}
]
}
]
}
}
```
---
## PreCompact
**When it fires**: Before context window compaction
**Can block**: Yes
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "PreCompact",
"trigger": "manual" | "auto",
"custom_instructions": "User's compaction instructions"
}
```
**Output schema**:
```json
{
"decision": "approve" | "block",
"reason": "Explanation"
}
```
**Use cases**:
- Validate state before compaction
- Save important context
- Custom compaction logic
- Prevent compaction at critical moments
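**Example** (illustrative — the lock-file convention is an assumption): block auto-compaction while a critical task is running
```json
{
  "hooks": {
    "PreCompact": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "test -f /tmp/claude-task.lock && echo '{\"decision\": \"block\", \"reason\": \"Critical task in progress\"}' || echo '{\"decision\": \"approve\", \"reason\": \"Safe to compact\"}'"
          }
        ]
      }
    ]
  }
}
```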
---
## Notification
**When it fires**: Claude needs user input (awaiting response)
**Can block**: No
**Input schema**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/current/working/directory",
"permission_mode": "default",
"hook_event_name": "Notification"
}
```
**Output schema**: None
**Use cases**:
- Desktop notifications
- Sound alerts
- Status bar updates
- External notifications (Slack, etc.)
**Example**: macOS notification
```json
{
"hooks": {
"Notification": [
{
"hooks": [
{
"type": "command",
"command": "osascript -e 'display notification \"Claude needs input\" with title \"Claude Code\"'"
}
]
}
]
}
}
```

View File

@@ -0,0 +1,469 @@
# Input/Output Schemas
Complete JSON schemas for all hook types.
## Common Input Fields
All hooks receive these fields:
```typescript
{
session_id: string // Unique session identifier
transcript_path: string // Path to session transcript (.jsonl file)
cwd: string // Current working directory
permission_mode: string // "default" | "plan" | "acceptEdits" | "bypassPermissions"
hook_event_name: string // Name of the hook event
}
```
---
## PreToolUse
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "PreToolUse",
"tool_name": "Bash",
"tool_input": {
"command": "npm install",
"description": "Install dependencies"
}
}
```
**Output** (optional, for control):
```json
{
"decision": "approve" | "block",
"reason": "Explanation for the decision",
"permissionDecision": "allow" | "deny" | "ask",
"permissionDecisionReason": "Why this permission decision",
"updatedInput": {
"command": "npm install --save-exact"
},
"systemMessage": "Message displayed to user",
"suppressOutput": false,
"continue": true
}
```
**Fields**:
- `decision`: Whether to allow the tool call
- `reason`: Explanation (required if blocking)
- `permissionDecision`: Override permission system
- `updatedInput`: Modified tool input (partial update)
- `systemMessage`: Message shown to user
- `suppressOutput`: Hide hook output from user
- `continue`: If false, stop execution
---
## PostToolUse
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "PostToolUse",
"tool_name": "Write",
"tool_input": {
"file_path": "/path/to/file.js",
"content": "const x = 1;"
},
"tool_output": "File created successfully at: /path/to/file.js"
}
```
**Output** (optional):
```json
{
"systemMessage": "Code formatted successfully",
"suppressOutput": false
}
```
**Fields**:
- `systemMessage`: Additional message to display
- `suppressOutput`: Hide tool output from user
---
## UserPromptSubmit
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "UserPromptSubmit",
"prompt": "Write a function to calculate factorial"
}
```
**Output**:
```json
{
"decision": "approve" | "block",
"reason": "Prompt is clear and actionable",
"systemMessage": "Optional message to user"
}
```
**Fields**:
- `decision`: Whether to allow the prompt
- `reason`: Explanation (required if blocking)
- `systemMessage`: Message shown to user
---
## Stop
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "Stop",
"stop_hook_active": false
}
```
**Output**:
```json
{
"decision": "block" | undefined,
"reason": "Tests are still failing - please fix before stopping",
"continue": true,
"stopReason": "Cannot stop yet",
"systemMessage": "Additional context"
}
```
**Fields**:
- `decision`: `"block"` to prevent stopping, `undefined` to allow
- `reason`: Why Claude should continue (required if blocking)
- `continue`: If true and blocking, Claude continues working
- `stopReason`: Message shown when stopping is blocked
- `systemMessage`: Additional context for Claude
- `stop_hook_active` (input field, not output): if true, don't block again (prevents infinite loops)
**Important**: Always check `stop_hook_active` to avoid infinite loops:
```javascript
if (input.stop_hook_active) {
return { decision: undefined }; // Don't block again
}
```
---
## SubagentStop
**Input**: Same as Stop
**Output**: Same as Stop
**Usage**: Same as Stop, but for subagent completion
---
## SessionStart
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "SessionStart",
"source": "startup" | "continue" | "checkpoint"
}
```
**Output**:
```json
{
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": "Current sprint: Sprint 23\nFocus: User authentication\nDeadline: Friday"
}
}
```
**Fields**:
- `additionalContext`: Text injected into session context
- Multiple SessionStart hooks' contexts are concatenated
---
## SessionEnd
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "SessionEnd",
"reason": "exit" | "error" | "timeout" | "compact"
}
```
**Output**: None (ignored)
**Usage**: Cleanup tasks only. Cannot prevent session end.
---
## PreCompact
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "PreCompact",
"trigger": "manual" | "auto",
"custom_instructions": "Preserve all git commit messages"
}
```
**Output**:
```json
{
"decision": "approve" | "block",
"reason": "Safe to compact" | "Wait until task completes"
}
```
**Fields**:
- `trigger`: How compaction was initiated
- `custom_instructions`: User's compaction preferences (if manual)
- `decision`: Whether to proceed with compaction
- `reason`: Explanation
---
## Notification
**Input**:
```json
{
"session_id": "abc123",
"transcript_path": "~/.claude/projects/.../session.jsonl",
"cwd": "/Users/username/project",
"permission_mode": "default",
"hook_event_name": "Notification"
}
```
**Output**: None (hook just performs notification action)
**Usage**: Trigger external notifications (desktop, sound, status bar)
---
## Common Output Fields
These fields can be returned by any hook:
```json
{
"continue": true | false,
"stopReason": "Reason shown when stopping",
"suppressOutput": true | false,
"systemMessage": "Additional context or message"
}
```
**Fields**:
- `continue`: If false, stop Claude's execution immediately
- `stopReason`: Message displayed when execution stops
- `suppressOutput`: If true, hide hook's stdout/stderr from user
- `systemMessage`: Context added to Claude's next message
---
## LLM Prompt Hook Response
When using `type: "prompt"`, the LLM must return JSON:
```json
{
"decision": "approve" | "block",
"reason": "Detailed explanation",
"systemMessage": "Optional message",
"continue": true | false,
"stopReason": "Optional stop message"
}
```
**Example prompt**:
```
Evaluate this command: $ARGUMENTS
Check if it's safe to execute.
Return JSON:
{
"decision": "approve" or "block",
"reason": "your explanation"
}
```
The `$ARGUMENTS` placeholder is replaced with the hook's input JSON.
---
## Tool-Specific Input Fields
Different tools provide different `tool_input` fields:
### Bash
```json
{
"tool_input": {
"command": "npm install",
"description": "Install dependencies",
"timeout": 120000,
"run_in_background": false
}
}
```
### Write
```json
{
"tool_input": {
"file_path": "/path/to/file.js",
"content": "const x = 1;"
}
}
```
### Edit
```json
{
"tool_input": {
"file_path": "/path/to/file.js",
"old_string": "const x = 1;",
"new_string": "const x = 2;",
"replace_all": false
}
}
```
### Read
```json
{
"tool_input": {
"file_path": "/path/to/file.js",
"offset": 0,
"limit": 100
}
}
```
### Grep
```json
{
"tool_input": {
"pattern": "function.*",
"path": "/path/to/search",
"output_mode": "content"
}
}
```
### MCP tools
```json
{
"tool_input": {
// MCP tool-specific parameters
}
}
```
Access these in hooks:
```bash
command=$(echo "$input" | jq -r '.tool_input.command')
file_path=$(echo "$input" | jq -r '.tool_input.file_path')
```
---
## Modifying Tool Input
PreToolUse hooks can modify `tool_input` before execution:
**Original input**:
```json
{
"tool_input": {
"command": "npm install lodash"
}
}
```
**Hook output**:
```json
{
"decision": "approve",
"reason": "Adding --save-exact flag",
"updatedInput": {
"command": "npm install --save-exact lodash"
}
}
```
**Result**: Tool executes with modified input.
**Partial updates**: Only specify fields you want to change:
```json
{
"updatedInput": {
"timeout": 300000 // Only update timeout, keep other fields
}
}
```
---
## Error Handling
**Command hooks**: Return non-zero exit code to indicate error
```bash
if ! some_check; then  # placeholder: replace with your own error condition
echo '{"decision": "block", "reason": "Error occurred"}' >&2
exit 1
fi
```
**Prompt hooks**: LLM should return valid JSON. If malformed, hook fails gracefully.
**Timeout**: Set `timeout` (ms) to prevent hanging:
```json
{
"type": "command",
"command": "/path/to/slow-script.sh",
"timeout": 30000
}
```
Default: 60000ms (60s)

View File

@@ -0,0 +1,470 @@
# Matchers and Pattern Matching
Complete guide to matching tools with hook matchers.
## What are matchers?
Matchers are regex patterns that filter which tools trigger a hook. They allow you to:
- Target specific tools (e.g., only `Bash`)
- Match multiple tools (e.g., `Write|Edit`)
- Match tool categories (e.g., all MCP tools)
- Match everything (omit matcher)
---
## Syntax
Matchers use JavaScript regex syntax:
```json
{
"matcher": "pattern"
}
```
The pattern is tested against the tool name using `new RegExp(pattern).test(toolName)`.
---
## Common Patterns
### Single tool
```json
{
"matcher": "Bash"
}
```
Matches: `Bash` (and also `BashOutput` — matching is a partial regex test)
Doesn't match: `bash` (case-sensitive). Use `^Bash$` to match only `Bash`.
### Multiple tools (OR)
```json
{
"matcher": "Write|Edit"
}
```
Matches: `Write`, `Edit`
Doesn't match: `Read`, `Bash`
### Starts with
```json
{
"matcher": "^Bash"
}
```
Matches: `Bash`, `BashOutput`
Doesn't match: `Read`
### Ends with
```json
{
"matcher": "Output$"
}
```
Matches: `BashOutput`
Doesn't match: `Bash`, `Read`
### Contains
```json
{
"matcher": ".*write.*"
}
```
Matches: `Write`, `NotebookWrite`, `TodoWrite`
Doesn't match: `Read`, `Edit`
Case-sensitive! `write` won't match `Write`.
### Any tool (no matcher)
```json
{
"hooks": {
"PreToolUse": [
{
"hooks": [...] // No matcher = matches all tools
}
]
}
}
```
---
## Tool Categories
### All file operations
```json
{
"matcher": "Read|Write|Edit|Glob|Grep"
}
```
### All bash tools
```json
{
"matcher": "Bash.*"
}
```
Matches: `Bash`, `BashOutput`, `BashKill`
### All MCP tools
```json
{
"matcher": "mcp__.*"
}
```
Matches: `mcp__memory__store`, `mcp__filesystem__read`, etc.
### Specific MCP server
```json
{
"matcher": "mcp__memory__.*"
}
```
Matches: `mcp__memory__store`, `mcp__memory__retrieve`
Doesn't match: `mcp__filesystem__read`
### Specific MCP tool
```json
{
"matcher": "mcp__.*__write.*"
}
```
Matches: `mcp__filesystem__write`, `mcp__memory__write`
Doesn't match: `mcp__filesystem__read`
---
## MCP Tool Naming
MCP tools follow the pattern: `mcp__{server}__{tool}`
Examples:
- `mcp__memory__store`
- `mcp__filesystem__read`
- `mcp__github__create_issue`
**Match all tools from a server**:
```json
{
"matcher": "mcp__github__.*"
}
```
**Match specific tool across all servers**:
```json
{
"matcher": "mcp__.*__read.*"
}
```
---
## Real-World Examples
### Log all bash commands
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "jq -r '.tool_input.command' >> ~/bash-log.txt"
}
]
}
]
}
}
```
### Format code after any file write
```json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit|NotebookEdit",
"hooks": [
{
"type": "command",
"command": "prettier --write $CLAUDE_PROJECT_DIR"
}
]
}
]
}
}
```
### Validate all MCP memory writes
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "mcp__memory__.*",
"hooks": [
{
"type": "prompt",
"prompt": "Validate this memory operation: $ARGUMENTS\n\nCheck if data is appropriate to store.\n\nReturn: {\"decision\": \"approve\" or \"block\", \"reason\": \"why\"}"
}
]
}
]
}
}
```
### Block destructive git commands
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "/path/to/check-git-safety.sh"
}
]
}
]
}
}
```
`check-git-safety.sh`:
```bash
#!/bin/bash
input=$(cat)
command=$(echo "$input" | jq -r '.tool_input.command')
if [[ "$command" == *"git push --force"* ]] || \
[[ "$command" == *"rm -rf /"* ]] || \
[[ "$command" == *"git reset --hard"* ]]; then
echo '{"decision": "block", "reason": "Destructive command detected"}'
else
echo '{"decision": "approve", "reason": "Safe"}'
fi
```
---
## Multiple Matchers
You can have multiple matcher blocks for the same event:
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "/path/to/bash-validator.sh"
}
]
},
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "/path/to/file-validator.sh"
}
]
},
{
"matcher": "mcp__.*",
"hooks": [
{
"type": "command",
"command": "/path/to/mcp-logger.sh"
}
]
}
]
}
}
```
Each matcher is evaluated independently. A tool can match multiple matchers.
---
## Debugging Matchers
### Enable debug mode
```bash
claude --debug
```
Debug output shows:
```
[DEBUG] Getting matching hook commands for PreToolUse with query: Bash
[DEBUG] Found 3 hook matchers in settings
[DEBUG] Matched 1 hooks for query "Bash"
```
### Test your matcher
Use JavaScript regex to test patterns:
```javascript
const toolName = "mcp__memory__store";
const pattern = "mcp__memory__.*";
const regex = new RegExp(pattern);
console.log(regex.test(toolName)); // true
```
Or in Node.js:
```bash
node -e "console.log(/mcp__memory__.*/.test('mcp__memory__store'))"
```
### Common mistakes
**Case sensitivity**
```json
{
"matcher": "bash" // Won't match "Bash"
}
```
**Correct**
```json
{
"matcher": "Bash"
}
```
---
**Missing escape**
```json
{
"matcher": "mcp__memory__*" // * is literal, not wildcard
}
```
**Correct**
```json
{
"matcher": "mcp__memory__.*" // .* is regex for "any characters"
}
```
---
**Unintended partial match**
```json
{
"matcher": "Write" // Matches "Write", "TodoWrite", "NotebookWrite"
}
```
**Exact match only**
```json
{
"matcher": "^Write$"
}
```
---
## Advanced Patterns
### Negative lookahead (exclude tools)
```json
{
"matcher": "^(?!Read).*"
}
```
Matches: Everything except `Read`
### Match any file operation except Grep
```json
{
"matcher": "^(Read|Write|Edit|Glob)$"
}
```
### Case-insensitive match
```json
{
"matcher": "(?i)bash"
}
```
Matches: `Bash`, `bash`
(Note: Claude Code tools are PascalCase by convention, so this is rarely needed. Inline flags like `(?i)` are not valid JavaScript regex syntax.)
---
## Performance Considerations
**Broad matchers** (e.g., `.*`) run on every tool use:
- Simple command hooks: negligible impact
- Prompt hooks: can slow down significantly
**Recommendation**: Be as specific as possible with matchers to minimize unnecessary hook executions.
**Example**: Instead of matching all tools and checking inside the hook:
```json
{
"matcher": ".*", // Runs on EVERY tool
"hooks": [
{
"type": "command",
"command": "if [[ $(jq -r '.tool_name') == 'Bash' ]]; then ...; fi"
}
]
}
```
Do this:
```json
{
"matcher": "Bash", // Only runs on Bash
"hooks": [
{
"type": "command",
"command": "..."
}
]
}
```
---
## Tool Name Reference
Common Claude Code tool names:
- `Bash`
- `BashOutput`
- `KillShell`
- `Read`
- `Write`
- `Edit`
- `Glob`
- `Grep`
- `TodoWrite`
- `NotebookEdit`
- `WebFetch`
- `WebSearch`
- `Task`
- `Skill`
- `SlashCommand`
- `AskUserQuestion`
- `ExitPlanMode`
MCP tools: `mcp__{server}__{tool}` (varies by installed servers)
Run `claude --debug` and watch tool calls to discover available tool names.

View File

@@ -0,0 +1,587 @@
# Troubleshooting
Common issues and solutions when working with hooks.
## Hook Not Triggering
### Symptom
Hook never executes, even when expected event occurs.
### Diagnostic steps
**1. Enable debug mode**
```bash
claude --debug
```
Look for:
```
[DEBUG] Getting matching hook commands for PreToolUse with query: Bash
[DEBUG] Found 0 hooks
```
**2. Check hook file location**
Hooks must be in:
- Project: `.claude/hooks.json`
- User: `~/.claude/hooks.json`
- Plugin: `{plugin}/hooks.json`
Verify:
```bash
cat .claude/hooks.json
# or
cat ~/.claude/hooks.json
```
**3. Validate JSON syntax**
Invalid JSON is silently ignored:
```bash
jq . .claude/hooks.json
```
If error: fix JSON syntax.
**4. Check matcher pattern**
Common mistakes:
❌ Case sensitivity
```json
{
"matcher": "bash" // Won't match "Bash"
}
```
✅ Fix
```json
{
"matcher": "Bash"
}
```
---
❌ Missing escape for regex
```json
{
"matcher": "mcp__memory__*" // Literal *, not wildcard
}
```
✅ Fix
```json
{
"matcher": "mcp__memory__.*" // Regex wildcard
}
```
**5. Test matcher in isolation**
```bash
node -e "console.log(/Bash/.test('Bash'))" # true
node -e "console.log(/bash/.test('Bash'))" # false
```
### Solutions
**Missing hook file**: Create `.claude/hooks.json` or `~/.claude/hooks.json`
**Invalid JSON**: Use `jq` to validate and format:
```bash
jq . .claude/hooks.json > temp.json && mv temp.json .claude/hooks.json
```
**Wrong matcher**: Check tool names with `--debug` and update matcher
**No matcher specified**: If you want to match all tools, omit the matcher field entirely:
```json
{
"hooks": {
"PreToolUse": [
{
"hooks": [...] // No matcher = all tools
}
]
}
}
```
---
## Command Hook Failing
### Symptom
Hook executes but fails with error.
### Diagnostic steps
**1. Check debug output**
```
[DEBUG] Hook command completed with status 1: <error message>
```
Status 1 = command failed.
**2. Test command directly**
Copy the command and run in terminal:
```bash
echo '{"session_id":"test","tool_name":"Bash"}' | /path/to/your/hook.sh
```
**3. Check permissions**
```bash
ls -l /path/to/hook.sh
chmod +x /path/to/hook.sh # If not executable
```
**4. Verify dependencies**
Does the command require tools?
```bash
which jq # Check if jq is installed
which osascript # macOS only
```
### Common issues
**Missing executable permission**
```bash
chmod +x /path/to/hook.sh
```
**Missing dependencies**
Install required tools:
```bash
# macOS
brew install jq
# Linux
apt-get install jq
```
**Bad path**
Use absolute paths:
```json
{
"command": "/Users/username/.claude/hooks/script.sh"
}
```
Or use environment variables:
```json
{
"command": "$CLAUDE_PROJECT_DIR/.claude/hooks/script.sh"
}
```
**Timeout**
If command takes too long:
```json
{
"command": "/path/to/slow-script.sh",
"timeout": 120000 // 2 minutes
}
```
---
## Prompt Hook Not Working
### Symptom
Prompt hook blocks everything or doesn't block when expected.
### Diagnostic steps
**1. Check LLM response format**
Debug output shows:
```
[DEBUG] Hook command completed with status 0: {"decision": "approve", "reason": "ok"}
```
Verify JSON is valid.
**2. Check prompt structure**
Ensure prompt is clear:
```json
{
"prompt": "Evaluate: $ARGUMENTS\n\nReturn JSON: {\"decision\": \"approve\" or \"block\", \"reason\": \"why\"}"
}
```
**3. Test prompt manually**
Submit similar prompt to Claude directly to see response format.
### Common issues
**Ambiguous instructions**
❌ Vague
```json
{
"prompt": "Is this ok? $ARGUMENTS"
}
```
✅ Clear
```json
{
"prompt": "Check if this command is safe: $ARGUMENTS\n\nBlock if: contains 'rm -rf', 'mkfs', or force push to main\n\nReturn: {\"decision\": \"approve\" or \"block\", \"reason\": \"explanation\"}"
}
```
**Missing $ARGUMENTS**
❌ No placeholder
```json
{
"prompt": "Validate this command"
}
```
✅ With placeholder
```json
{
"prompt": "Validate this command: $ARGUMENTS"
}
```
**Invalid JSON response**
The LLM must return valid JSON. If it returns plain text, the hook fails.
Add explicit formatting instructions:
```
IMPORTANT: Return ONLY valid JSON, no other text:
{
"decision": "approve" or "block",
"reason": "your explanation"
}
```
---
## Hook Blocks Everything
### Symptom
Hook blocks all operations, even safe ones.
### Diagnostic steps
**1. Check hook logic**
Review the script/prompt logic. Is the condition too broad?
**2. Test with known-safe input**
```bash
echo '{"tool_name":"Read","tool_input":{"file_path":"test.txt"}}' | /path/to/hook.sh
```
Expected: `{"decision": "approve"}`
**3. Check for errors in script**
Add error output:
```bash
#!/bin/bash
set -e # Exit on error
input=$(cat)
# ... rest of script
```
### Solutions
**Logic error**
Review conditions:
```bash
# Before (blocks everything)
if [[ "$command" != "safe_command" ]]; then
block
fi
# After (blocks dangerous commands)
if [[ "$command" == *"dangerous"* ]]; then
block
fi
```
**Default to approve**
If logic is complex, default to approve on unclear cases:
```bash
# Default
decision="approve"
reason="ok"
# Only change if dangerous
if [[ "$command" == *"rm -rf"* ]]; then
decision="block"
reason="Dangerous command"
fi
echo "{\"decision\": \"$decision\", \"reason\": \"$reason\"}"
```
---
## Infinite Loop in Stop Hook
### Symptom
Stop hook runs repeatedly, Claude never stops.
### Cause
Hook blocks stop without checking `stop_hook_active` flag.
### Solution
**Always check the flag**:
```bash
#!/bin/bash
input=$(cat)
stop_hook_active=$(echo "$input" | jq -r '.stop_hook_active')
# If hook already active, don't block again
if [[ "$stop_hook_active" == "true" ]]; then
echo '{}'  # omit "decision" so stopping is allowed
exit 0
fi
# Your logic here
if npm test > /dev/null 2>&1; then
echo '{"decision": "approve", "reason": "Tests pass"}'
else
echo '{"decision": "block", "reason": "Tests failing"}'
fi
```
Or in prompt hooks:
```json
{
"prompt": "Evaluate stopping: $ARGUMENTS\n\nIMPORTANT: If stop_hook_active is true, return {\"decision\": undefined}\n\nOtherwise check if tasks complete..."
}
```
---
## Hook Output Not Visible
### Symptom
Hook runs but output not shown to user.
### Cause
`suppressOutput: true` or output goes to stderr.
### Solutions
**Don't suppress output**:
```json
{
"decision": "approve",
"reason": "ok",
"suppressOutput": false
}
```
**Use systemMessage**:
```json
{
"decision": "approve",
"reason": "ok",
"systemMessage": "This message will be shown to user"
}
```
**Write to stdout, not stderr**:
```bash
echo "This is shown" >&1
echo "This is hidden" >&2
```
---
## Permission Errors
### Symptom
Hook script can't read files or execute commands.
### Solutions
**Make script executable**:
```bash
chmod +x /path/to/hook.sh
```
**Check file ownership**:
```bash
ls -l /path/to/hook.sh
chown $USER /path/to/hook.sh
```
**Use absolute paths**:
```bash
# Instead of
command="./script.sh"
# Use
command="$CLAUDE_PROJECT_DIR/.claude/hooks/script.sh"
```
---
## Hook Timeouts
### Symptom
```
[DEBUG] Hook command timed out after 60000ms
```
### Solutions
**Increase timeout**:
```json
{
"type": "command",
"command": "/path/to/slow-script.sh",
"timeout": 300000 // 5 minutes
}
```
**Optimize script**:
- Reduce unnecessary operations
- Cache results when possible
- Run expensive operations in background
**Run in background**:
```bash
#!/bin/bash
# Start long operation in background
/path/to/long-operation.sh &
# Return immediately
echo '{"decision": "approve", "reason": "ok"}'
```
---
## Matcher Conflicts
### Symptom
Multiple hooks triggering when only one expected.
### Cause
Tool name matches multiple matchers.
### Diagnostic
```
[DEBUG] Matched 3 hooks for query "Bash"
```
### Solutions
**Be more specific**:
```json
// Instead of
{"matcher": ".*"} // Matches everything
// Use
{"matcher": "Bash"} // Exact match
```
**Check overlapping patterns**:
```json
{
"hooks": {
"PreToolUse": [
{"matcher": "Bash", ...}, // Matches Bash
{"matcher": "Bash.*", ...}, // Also matches Bash!
{"matcher": ".*", ...} // Also matches everything!
]
}
}
```
Remove overlaps or make them mutually exclusive.
---
## Environment Variables Not Working
### Symptom
`$CLAUDE_PROJECT_DIR` or other variables are empty.
### Solutions
**Check variable spelling**:
- `$CLAUDE_PROJECT_DIR` (correct)
- `$CLAUDE_PROJECT_ROOT` (wrong)
**Use double quotes**:
```json
{
"command": "$CLAUDE_PROJECT_DIR/hooks/script.sh"
}
```
**In shell scripts, use from input**:
```bash
#!/bin/bash
input=$(cat)
cwd=$(echo "$input" | jq -r '.cwd')
cd "$cwd" || exit 1
```
---
## Debugging Workflow
**Step 1**: Enable debug mode
```bash
claude --debug
```
**Step 2**: Look for hook execution logs
```
[DEBUG] Executing hooks for PreToolUse:Bash
[DEBUG] Found 1 hook matchers
[DEBUG] Executing hook command: /path/to/script.sh
[DEBUG] Hook command completed with status 0
```
**Step 3**: Test hook in isolation
```bash
echo '{"test":"data"}' | /path/to/hook.sh
```
**Step 4**: Check script with `set -x`
```bash
#!/bin/bash
set -x # Print each command before executing
# ... your script
```
**Step 5**: Add logging
```bash
#!/bin/bash
echo "Hook started" >> /tmp/hook-debug.log
input=$(cat)
echo "Input: $input" >> /tmp/hook-debug.log
# ... your logic
echo "Decision: $decision" >> /tmp/hook-debug.log
```
**Step 6**: Verify JSON output
```bash
echo '{"decision":"approve","reason":"test"}' | jq .
```
If `jq` fails, JSON is invalid.
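The isolation and JSON checks (steps 3 and 6) can be combined into a tiny harness that runs a hook against a sample payload and fails loudly if the output is not valid JSON. File names are placeholders:
```bash
#!/bin/bash
# Usage: test-hook.sh <hook-script> <sample-input.json>
hook="$1"
sample="$2"

output=$("$hook" < "$sample")
status=$?

echo "Exit status: $status"
echo "Raw output:  $output"

# jq exits non-zero on invalid JSON
if echo "$output" | jq . > /dev/null 2>&1; then
  echo "Output is valid JSON"
else
  echo "Output is NOT valid JSON" >&2
  exit 1
fi
```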

View File

@@ -0,0 +1,160 @@
# Create Meta-Prompts
The skill-based evolution of the [meta-prompting](../../prompts/meta-prompting/) system. Creates prompts optimized for Claude-to-Claude pipelines with improved dependency detection and structured outputs.
## The Problem
Complex tasks benefit from staged workflows: research first, then plan, then implement. But manually crafting prompts that produce structured outputs for subsequent prompts is tedious. Each stage needs metadata (confidence, dependencies, open questions) that the next stage can parse.
## The Solution
`/create-meta-prompt` creates prompts designed for multi-stage workflows. Outputs (research.md, plan.md) are structured with XML metadata for efficient parsing by subsequent prompts. Each prompt gets its own folder with clear provenance and automatic dependency detection.
## Commands
### `/create-meta-prompt [description]`
Describe your task. Claude creates a prompt optimized for its purpose.
**What it does:**
1. Determines purpose: Do (execute), Plan (strategize), Research (gather info), or Refine (improve an existing output)
2. Detects existing research/plan files to chain from
3. Creates prompt with purpose-specific structure
4. Saves to `.prompts/{number}-{topic}-{purpose}/`
5. Runs with dependency-aware execution
**Usage:**
```bash
# Research task
/create-meta-prompt research authentication options for the app
# Planning task
/create-meta-prompt plan the auth implementation approach
# Implementation task
/create-meta-prompt implement JWT authentication
```
## Installation
**Install command** (global):
```bash
cp commands/*.md ~/.claude/commands/
```
**Install skill**:
```bash
cp -r skills/* ~/.claude/skills/
```
## Example Workflow
**Full research → plan → implement chain:**
```
You: /create-meta-prompt research authentication libraries for Node.js
Claude: [Asks about depth, sources, output format]
You: [Answer questions]
Claude: [Creates research prompt]
✓ Created: .prompts/001-auth-research/001-auth-research.md
What's next?
1. Run prompt now
2. Review/edit prompt first
You: 1
Claude: [Executes research]
✓ Output: .prompts/001-auth-research/auth-research.md
```
```
You: /create-meta-prompt plan the auth implementation
Claude: Found existing files: auth-research.md
Should this prompt reference any existing research?
You: [Select auth-research.md]
Claude: [Creates plan prompt referencing the research]
✓ Created: .prompts/002-auth-plan/002-auth-plan.md
You: 1
Claude: [Executes plan, reads research output]
✓ Output: .prompts/002-auth-plan/auth-plan.md
```
```
You: /create-meta-prompt implement the auth system
Claude: Found existing files: auth-research.md, auth-plan.md
[Detects it should reference the plan]
Claude: [Creates implementation prompt]
✓ Created: .prompts/003-auth-implement/003-auth-implement.md
You: 1
Claude: [Executes implementation following the plan]
✓ Implementation complete
```
## File Structure
```
create-meta-prompts/
├── README.md
├── commands/
│ └── create-meta-prompt.md
└── skills/
└── create-meta-prompts/
├── SKILL.md
└── references/
├── do-patterns.md
├── plan-patterns.md
├── research-patterns.md
├── question-bank.md
└── intelligence-rules.md
```
**Generated prompts structure:**
```
.prompts/
├── 001-auth-research/
│ ├── completed/
│ │ └── 001-auth-research.md # Prompt (archived after run)
│ └── auth-research.md # Output
├── 002-auth-plan/
│ ├── completed/
│ │ └── 002-auth-plan.md
│ └── auth-plan.md
└── 003-auth-implement/
└── 003-auth-implement.md # Prompt
```
## Why This Works
**Structured outputs for chaining:**
- Research and plan outputs include XML metadata
- `<confidence>`, `<dependencies>`, `<open_questions>`, `<assumptions>`
- Subsequent prompts can parse and act on this structure
**Automatic dependency detection:**
- Scans for existing research/plan files
- Suggests relevant files to chain from
- Executes in correct order (sequential/parallel/mixed)
**Clear provenance:**
- Each prompt gets its own folder
- Outputs stay with their prompts
- Completed prompts archived separately
---
**Questions or improvements?** Open an issue or submit a PR.
—TÂCHES

View File

@@ -0,0 +1,603 @@
---
name: create-meta-prompts
description: Create optimized prompts for Claude-to-Claude pipelines with research, planning, and execution stages. Use when building prompts that produce outputs for other prompts to consume, or when running multi-stage workflows (research -> plan -> implement).
---
<objective>
Create prompts optimized for Claude-to-Claude communication in multi-stage workflows. Outputs are structured with XML and metadata for efficient parsing by subsequent prompts.
Every execution produces a `SUMMARY.md` for quick human scanning without reading full outputs.
Each prompt gets its own folder in `.prompts/` with its output artifacts, enabling clear provenance and chain detection.
</objective>
<quick_start>
<workflow>
1. **Intake**: Determine purpose (Do/Plan/Research/Refine), gather requirements
2. **Chain detection**: Check for existing research/plan files to reference
3. **Generate**: Create prompt using purpose-specific patterns
4. **Save**: Create folder in `.prompts/{number}-{topic}-{purpose}/`
5. **Present**: Show decision tree for running
6. **Execute**: Run prompt(s) with dependency-aware execution engine
7. **Summarize**: Create SUMMARY.md for human scanning
</workflow>
<folder_structure>
```
.prompts/
├── 001-auth-research/
│ ├── completed/
│ │ └── 001-auth-research.md # Prompt (archived after run)
│ ├── auth-research.md # Full output (XML for Claude)
│ └── SUMMARY.md # Executive summary (markdown for human)
├── 002-auth-plan/
│ ├── completed/
│ │ └── 002-auth-plan.md
│ ├── auth-plan.md
│ └── SUMMARY.md
├── 003-auth-implement/
│ ├── completed/
│ │ └── 003-auth-implement.md
│ └── SUMMARY.md # Do prompts create code elsewhere
├── 004-auth-research-refine/
│ ├── completed/
│ │ └── 004-auth-research-refine.md
│ ├── archive/
│ │ └── auth-research-v1.md # Previous version
│ └── SUMMARY.md
```
</folder_structure>
</quick_start>
<context>
Prompts directory: !`[ -d ./.prompts ] && echo "exists" || echo "missing"`
Existing research/plans: !`find ./.prompts -name "*-research.md" -o -name "*-plan.md" 2>/dev/null | head -10`
Next prompt number: !`ls -d ./.prompts/*/ 2>/dev/null | wc -l | xargs -I {} expr {} + 1`
</context>
<automated_workflow>
<step_0_intake_gate>
<title>Adaptive Requirements Gathering</title>
<critical_first_action>
**BEFORE analyzing anything**, check if context was provided.
IF no context provided (skill invoked without description):
**IMMEDIATELY use AskUserQuestion** with:
- header: "Purpose"
- question: "What is the purpose of this prompt?"
- options:
- "Do" - Execute a task, produce an artifact
- "Plan" - Create an approach, roadmap, or strategy
- "Research" - Gather information or understand something
- "Refine" - Improve an existing research or plan output
After selection, ask: "Describe what you want to accomplish" (the user selects "Other" to provide free text).
IF context was provided:
→ Check if purpose is inferable from keywords:
- `implement`, `build`, `create`, `fix`, `add`, `refactor` → Do
- `plan`, `roadmap`, `approach`, `strategy`, `decide`, `phases` → Plan
- `research`, `understand`, `learn`, `gather`, `analyze`, `explore` → Research
- `refine`, `improve`, `deepen`, `expand`, `iterate`, `update` → Refine
→ If unclear, ask the Purpose question above as first contextual question
→ If clear, proceed to adaptive_analysis with inferred purpose
</critical_first_action>
<adaptive_analysis>
Extract and infer:
- **Purpose**: Do, Plan, Research, or Refine
- **Topic identifier**: Kebab-case identifier for file naming (e.g., `auth`, `stripe-payments`)
- **Complexity**: Simple vs complex (affects prompt depth)
- **Prompt structure**: Single vs multiple prompts
- **Target** (Refine only): Which existing output to improve
If topic identifier not obvious, ask:
- header: "Topic"
- question: "What topic/feature is this for? (used for file naming)"
- Let user provide via "Other" option
- Enforce kebab-case (convert spaces/underscores to hyphens)
For Refine purpose, also identify target output from `.prompts/*/` to improve.
</adaptive_analysis>
<chain_detection>
Scan `.prompts/*/` for existing `*-research.md` and `*-plan.md` files.
If found:
1. List them: "Found existing files: auth-research.md (in 001-auth-research/), stripe-plan.md (in 005-stripe-plan/)"
2. Use AskUserQuestion:
- header: "Reference"
- question: "Should this prompt reference any existing research or plans?"
- options: List found files + "None"
- multiSelect: true
Match by topic keyword when possible (e.g., "auth plan" → suggest auth-research.md).
</chain_detection>
<contextual_questioning>
Generate 2-4 questions using AskUserQuestion based on purpose and gaps.
Load questions from: [references/question-bank.md](references/question-bank.md)
Route by purpose:
- Do → artifact type, scope, approach
- Plan → plan purpose, format, constraints
- Research → depth, sources, output format
- Refine → target selection, feedback, preservation
</contextual_questioning>
<decision_gate>
After receiving answers, present decision gate using AskUserQuestion:
- header: "Ready"
- question: "Ready to create the prompt?"
- options:
- "Proceed" - Create the prompt with current context
- "Ask more questions" - I have more details to clarify
- "Let me add context" - I want to provide additional information
Loop until "Proceed" selected.
</decision_gate>
<finalization>
After "Proceed" selected, state confirmation:
"Creating a {purpose} prompt for: {topic}
Folder: .prompts/{number}-{topic}-{purpose}/
References: {list any chained files}"
Then proceed to generation.
</finalization>
</step_0_intake_gate>
<step_1_generate>
<title>Generate Prompt</title>
Load purpose-specific patterns:
- Do: [references/do-patterns.md](references/do-patterns.md)
- Plan: [references/plan-patterns.md](references/plan-patterns.md)
- Research: [references/research-patterns.md](references/research-patterns.md)
- Refine: [references/refine-patterns.md](references/refine-patterns.md)
Load intelligence rules: [references/intelligence-rules.md](references/intelligence-rules.md)
<prompt_structure>
All generated prompts include:
1. **Objective**: What to accomplish, why it matters
2. **Context**: Referenced files (@), dynamic context (!)
3. **Requirements**: Specific instructions for the task
4. **Output specification**: Where to save, what structure
5. **Metadata requirements**: For research/plan outputs, specify XML metadata structure
6. **SUMMARY.md requirement**: All prompts must create a SUMMARY.md file
7. **Success criteria**: How to know it worked
For Research and Plan prompts, output must include:
- `<confidence>` - How confident in findings
- `<dependencies>` - What's needed to proceed
- `<open_questions>` - What remains uncertain
- `<assumptions>` - What was assumed
All prompts must create `SUMMARY.md` with:
- **One-liner** - Substantive description of outcome
- **Version** - v1 or iteration info
- **Key Findings** - Actionable takeaways
- **Files Created** - (Do prompts only)
- **Decisions Needed** - What requires user input
- **Blockers** - External impediments
- **Next Step** - Concrete forward action
</prompt_structure>
<file_creation>
1. Create folder: `.prompts/{number}-{topic}-{purpose}/`
2. Create `completed/` subfolder
3. Write prompt to: `.prompts/{number}-{topic}-{purpose}/{number}-{topic}-{purpose}.md`
4. Prompt instructs output to: `.prompts/{number}-{topic}-{purpose}/{topic}-{purpose}.md`
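A minimal shell sketch of steps 1-4, using the auth example from this skill (variable values are illustrative):
```bash
num="001"; topic="auth"; purpose="research"
dir=".prompts/${num}-${topic}-${purpose}"

# Steps 1-2: folder plus completed/ subfolder
mkdir -p "$dir/completed"

# Step 3: where the generated prompt is written
echo "Prompt file:  $dir/${num}-${topic}-${purpose}.md"

# Step 4: where the prompt is told to write its output
echo "Output file:  $dir/${topic}-${purpose}.md"
```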
</file_creation>
</step_1_generate>
<step_2_present>
<title>Present Decision Tree</title>
After saving prompt(s), present inline (not AskUserQuestion):
<single_prompt_presentation>
```
Prompt created: .prompts/{number}-{topic}-{purpose}/{number}-{topic}-{purpose}.md
What's next?
1. Run prompt now
2. Review/edit prompt first
3. Save for later
4. Other
Choose (1-4): _
```
</single_prompt_presentation>
<multi_prompt_presentation>
```
Prompts created:
- .prompts/001-auth-research/001-auth-research.md
- .prompts/002-auth-plan/002-auth-plan.md
- .prompts/003-auth-implement/003-auth-implement.md
Detected execution order: Sequential (002 references 001 output, 003 references 002 output)
What's next?
1. Run all prompts (sequential)
2. Review/edit prompts first
3. Save for later
4. Other
Choose (1-4): _
```
</multi_prompt_presentation>
</step_2_present>
<step_3_execute>
<title>Execution Engine</title>
<execution_modes>
<single_prompt>
Straightforward execution of one prompt.
1. Read prompt file contents
2. Spawn Task agent with subagent_type="general-purpose"
3. Include in task prompt:
- The complete prompt contents
- Output location: `.prompts/{number}-{topic}-{purpose}/{topic}-{purpose}.md`
4. Wait for completion
5. Validate output (see validation section)
6. Archive prompt to `completed/` subfolder
7. Report results with next-step options
</single_prompt>
<sequential_execution>
For chained prompts where each depends on previous output.
1. Build execution queue from dependency order
2. For each prompt in queue:
a. Read prompt file
b. Spawn Task agent
c. Wait for completion
d. Validate output
e. If validation fails → stop, report failure, offer recovery options
f. If success → archive prompt, continue to next
3. Report consolidated results
<progress_reporting>
Show progress during execution:
```
Executing 1/3: 001-auth-research... ✓
Executing 2/3: 002-auth-plan... ✓
Executing 3/3: 003-auth-implement... (running)
```
</progress_reporting>
</sequential_execution>
<parallel_execution>
For independent prompts with no dependencies.
1. Read all prompt files
2. **CRITICAL**: Spawn ALL Task agents in a SINGLE message
- This is required for true parallel execution
- Each task includes its output location
3. Wait for all to complete
4. Validate all outputs
5. Archive all prompts
6. Report consolidated results (successes and failures)
<failure_handling>
Unlike sequential, parallel continues even if some fail:
- Collect all results
- Archive successful prompts
- Report failures with details
- Offer to retry failed prompts
</failure_handling>
</parallel_execution>
<mixed_dependencies>
For complex DAGs (e.g., two parallel research → one plan).
1. Analyze dependency graph from @ references
2. Group into execution layers:
- Layer 1: No dependencies (run parallel)
- Layer 2: Depends only on layer 1 (run after layer 1 completes)
- Layer 3: Depends on layer 2, etc.
3. Execute each layer:
- Parallel within layer
- Sequential between layers
4. Stop if any dependency fails (downstream prompts can't run)
<example>
```
Layer 1 (parallel): 001-api-research, 002-db-research
Layer 2 (after layer 1): 003-architecture-plan
Layer 3 (after layer 2): 004-implement
```
</example>
</mixed_dependencies>
</execution_modes>
<dependency_detection>
<automatic_detection>
Scan prompt contents for @ references to determine dependencies:
1. Parse each prompt for `@.prompts/{number}-{topic}/` patterns
2. Build dependency graph
3. Detect cycles (error if found)
4. Determine execution order
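One way to extract those references is a simple scan of each prompt file for `@.prompts/` paths; the sketch below is illustrative, not part of the skill itself:
```bash
# List the folders each prompt depends on, based on @.prompts/... references
for prompt in .prompts/*/[0-9]*.md; do
  [ -f "$prompt" ] || continue
  deps=$(grep -o '@\.prompts/[0-9]*-[a-z-]*' "$prompt" | sort -u)
  echo "$prompt depends on:"
  echo "${deps:-  (none)}"
done
```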
<inference_rules>
If no explicit @ references found, infer from purpose:
- Research prompts: No dependencies (can parallel)
- Plan prompts: Depend on same-topic research
- Do prompts: Depend on same-topic plan
Override with explicit references when present.
</inference_rules>
</automatic_detection>
<missing_dependencies>
If a prompt references output that doesn't exist:
1. Check if it's another prompt in this session (will be created)
2. Check if it exists in `.prompts/*/` (already completed)
3. If truly missing:
- Warn user: "002-auth-plan references auth-research.md which doesn't exist"
- Offer: Create the missing research prompt first? / Continue anyway? / Cancel?
</missing_dependencies>
</dependency_detection>
<validation>
<output_validation>
After each prompt completes, verify success:
1. **File exists**: Check output file was created
2. **Not empty**: File has content (> 100 chars)
3. **Metadata present** (for research/plan): Check for required XML tags
- `<confidence>`
- `<dependencies>`
- `<open_questions>`
- `<assumptions>`
4. **SUMMARY.md exists**: Check SUMMARY.md was created
5. **SUMMARY.md complete**: Has required sections (Key Findings, Decisions Needed, Blockers, Next Step)
6. **One-liner is substantive**: Not generic like "Research completed"
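For the mechanical checks (file exists, length, metadata tags, SUMMARY.md present), a shell sketch like this could be used; checks 5 and 6 still require reading the content (paths are illustrative):
```bash
out=".prompts/001-auth-research/auth-research.md"
summary=".prompts/001-auth-research/SUMMARY.md"

[ -f "$out" ] || { echo "Output file missing: $out"; exit 1; }
[ "$(wc -c < "$out")" -gt 100 ] || { echo "Output file too short: $out"; exit 1; }
[ -f "$summary" ] || { echo "SUMMARY.md missing"; exit 1; }

# Research/plan outputs: required metadata tags
for tag in confidence dependencies open_questions assumptions; do
  grep -q "<$tag" "$out" || echo "Missing <$tag> tag in $out"
done
```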
<validation_failure>
If validation fails:
- Report what's missing
- Offer options:
- Retry the prompt
- Continue anyway (for non-critical issues)
- Stop and investigate
</validation_failure>
</output_validation>
</validation>
<failure_handling>
<sequential_failure>
Stop the chain immediately:
```
✗ Failed at 2/3: 002-auth-plan
Completed:
- 001-auth-research ✓ (archived)
Failed:
- 002-auth-plan: Output file not created
Not started:
- 003-auth-implement
What's next?
1. Retry 002-auth-plan
2. View error details
3. Stop here (keep completed work)
4. Other
```
</sequential_failure>
<parallel_failure>
Continue others, report all results:
```
Parallel execution completed with errors:
✓ 001-api-research (archived)
✗ 002-db-research: Validation failed - missing <confidence> tag
✓ 003-ui-research (archived)
What's next?
1. Retry failed prompt (002)
2. View error details
3. Continue without 002
4. Other
```
</parallel_failure>
</failure_handling>
<archiving>
<archive_timing>
- **Sequential**: Archive each prompt immediately after successful completion
- Provides clear state if execution stops mid-chain
- **Parallel**: Archive all at end after collecting results
- Keeps prompts available for potential retry
</archive_timing>
<archive_operation>
Move prompt file to completed subfolder:
```bash
mv .prompts/{number}-{topic}-{purpose}/{number}-{topic}-{purpose}.md \
.prompts/{number}-{topic}-{purpose}/completed/
```
Output file stays in place (not moved).
</archive_operation>
</archiving>
<result_presentation>
<single_result>
```
✓ Executed: 001-auth-research
✓ Created: .prompts/001-auth-research/SUMMARY.md
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Auth Research Summary
**JWT with jose library and httpOnly cookies recommended**
## Key Findings
• jose outperforms jsonwebtoken with better TypeScript support
• httpOnly cookies required (localStorage is XSS vulnerable)
• Refresh rotation is OWASP standard
## Decisions Needed
None - ready for planning
## Blockers
None
## Next Step
Create auth-plan.md
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
What's next?
1. Create planning prompt (auth-plan)
2. View full research output
3. Done
4. Other
```
Display the actual SUMMARY.md content inline so the user sees findings without opening files.
</single_result>
<chain_result>
```
✓ Chain completed: auth workflow
Results:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
001-auth-research
**JWT with jose library and httpOnly cookies recommended**
Decisions: None • Blockers: None
002-auth-plan
**4-phase implementation: types → JWT core → refresh → tests**
Decisions: Approve 15-min token expiry • Blockers: None
003-auth-implement
**JWT middleware complete with 6 files created**
Decisions: Review before Phase 2 • Blockers: None
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All prompts archived. Full summaries in .prompts/*/SUMMARY.md
What's next?
1. Review implementation
2. Run tests
3. Create new prompt chain
4. Other
```
For chains, show condensed one-liner from each SUMMARY.md with decisions/blockers flagged.
</chain_result>
</result_presentation>
<special_cases>
<re_running_completed>
If user wants to re-run an already-completed prompt:
1. Check if prompt is in `completed/` subfolder
2. Move it back to parent folder
3. Optionally backup existing output: `{output}.bak`
4. Execute normally
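As shell operations, those steps might look like this (the 001-auth-research names are illustrative):
```bash
dir=".prompts/001-auth-research"

# Step 2: move the prompt back out of completed/
mv "$dir/completed/001-auth-research.md" "$dir/"

# Step 3 (optional): back up the existing output before re-running
[ -f "$dir/auth-research.md" ] && cp "$dir/auth-research.md" "$dir/auth-research.md.bak"
```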
</re_running_completed>
<output_conflicts>
If output file already exists:
1. For re-runs: Backup existing → `{filename}.bak`
2. For new runs: Should not happen (unique numbering)
3. If conflict detected: Ask user - Overwrite? / Rename? / Cancel?
</output_conflicts>
<commit_handling>
After successful execution:
1. Do NOT auto-commit (user controls git workflow)
2. Mention what files were created/modified
3. User can commit when ready
Exception: If user explicitly requests commit, stage and commit:
- Output files created
- Prompts archived
- Any implementation changes (for Do prompts)
</commit_handling>
<recursive_prompts>
If a prompt's output includes instructions to create more prompts:
1. This is advanced usage - don't auto-detect
2. Present the output to user
3. User can invoke skill again to create follow-up prompts
4. Maintains user control over prompt creation
</recursive_prompts>
</special_cases>
</step_3_execute>
</automated_workflow>
<reference_guides>
**Prompt patterns by purpose:**
- [references/do-patterns.md](references/do-patterns.md) - Execution prompts + output structure
- [references/plan-patterns.md](references/plan-patterns.md) - Planning prompts + plan.md structure
- [references/research-patterns.md](references/research-patterns.md) - Research prompts + research.md structure
- [references/refine-patterns.md](references/refine-patterns.md) - Iteration prompts + versioning
**Shared templates:**
- [references/summary-template.md](references/summary-template.md) - SUMMARY.md structure and field requirements
- [references/metadata-guidelines.md](references/metadata-guidelines.md) - Confidence, dependencies, open questions, assumptions
**Supporting references:**
- [references/question-bank.md](references/question-bank.md) - Intake questions by purpose
- [references/intelligence-rules.md](references/intelligence-rules.md) - Extended thinking, parallel tools, depth decisions
</reference_guides>
<success_criteria>
**Prompt Creation:**
- Intake gate completed with purpose and topic identified
- Chain detection performed, relevant files referenced
- Prompt generated with correct structure for purpose
- Folder created in `.prompts/` with correct naming
- Output file location specified in prompt
- SUMMARY.md requirement included in prompt
- Metadata requirements included for Research/Plan outputs
- Quality controls included for Research outputs (verification checklist, QA, pre-submission)
- Streaming write instructions included for Research outputs
- Decision tree presented
**Execution (if user chooses to run):**
- Dependencies correctly detected and ordered
- Prompts executed in correct order (sequential/parallel/mixed)
- Output validated after each completion
- SUMMARY.md created with all required sections
- One-liner is substantive (not generic)
- Failed prompts handled gracefully with recovery options
- Successful prompts archived to `completed/` subfolder
- SUMMARY.md displayed inline in results
- Results presented with decisions/blockers flagged
**Research Quality (for Research prompts):**
- Verification checklist completed
- Quality report distinguishes verified from assumed claims
- Sources consulted listed with URLs
- Confidence levels assigned to findings
- Critical claims verified with official documentation
</success_criteria>

View File

@@ -0,0 +1,258 @@
<overview>
Prompt patterns for execution tasks that produce artifacts (code, documents, designs, etc.).
</overview>
<prompt_template>
```xml
<objective>
{Clear statement of what to build/create/fix}
Purpose: {Why this matters, what it enables}
Output: {What artifact(s) will be produced}
</objective>
<context>
{Referenced research/plan files if chained}
@{topic}-research.md
@{topic}-plan.md
{Project context}
@relevant-files
</context>
<requirements>
{Specific functional requirements}
{Quality requirements}
{Constraints and boundaries}
</requirements>
<implementation>
{Specific approaches or patterns to follow}
{What to avoid and WHY}
{Integration points}
</implementation>
<output>
Create/modify files:
- `./path/to/file.ext` - {description}
{For complex outputs, specify structure}
</output>
<verification>
Before declaring complete:
- {Specific test or check}
- {How to confirm it works}
- {Edge cases to verify}
</verification>
<summary_requirements>
Create `.prompts/{num}-{topic}-{purpose}/SUMMARY.md`
Load template: [summary-template.md](summary-template.md)
For Do prompts, include Files Created section with paths and descriptions. Emphasize what was implemented and test status. Next step typically: Run tests or execute next phase.
</summary_requirements>
<success_criteria>
{Clear, measurable criteria}
- {Criterion 1}
- {Criterion 2}
- SUMMARY.md created with files list and next step
</success_criteria>
```
</prompt_template>
<key_principles>
<reference_chain_artifacts>
If research or plan exists, always reference them:
```xml
<context>
Research findings: @.prompts/001-auth-research/auth-research.md
Implementation plan: @.prompts/002-auth-plan/auth-plan.md
</context>
```
</reference_chain_artifacts>
<explicit_output_location>
Every artifact needs a clear path:
```xml
<output>
Create files in ./src/auth/:
- `./src/auth/middleware.ts` - JWT validation middleware
- `./src/auth/types.ts` - Auth type definitions
- `./src/auth/utils.ts` - Helper functions
</output>
```
</explicit_output_location>
<verification_matching>
Include verification that matches the task:
- Code: run tests, type check, lint
- Documents: check structure, validate links
- Designs: review against requirements
</verification_matching>
</key_principles>
<complexity_variations>
<simple_do>
Single artifact example:
```xml
<objective>
Create a utility function that validates email addresses.
</objective>
<requirements>
- Support standard email format
- Return boolean
- Handle edge cases (empty, null)
</requirements>
<output>
Create: `./src/utils/validate-email.ts`
</output>
<verification>
Test with: valid emails, invalid formats, edge cases
</verification>
```
</simple_do>
<complex_do>
Multiple artifacts with dependencies:
```xml
<objective>
Implement user authentication system with JWT tokens.
Purpose: Enable secure user sessions for the application
Output: Auth middleware, routes, types, and tests
</objective>
<context>
Research: @.prompts/001-auth-research/auth-research.md
Plan: @.prompts/002-auth-plan/auth-plan.md
Existing user model: @src/models/user.ts
</context>
<requirements>
- JWT access tokens (15min expiry)
- Refresh token rotation
- Secure httpOnly cookies
- Rate limiting on auth endpoints
</requirements>
<implementation>
Follow patterns from auth-research.md:
- Use jose library for JWT (not jsonwebtoken - see research)
- Implement refresh rotation per OWASP guidelines
- Store refresh tokens hashed in database
Avoid:
- Storing tokens in localStorage (XSS vulnerable)
- Long-lived access tokens (security risk)
</implementation>
<output>
Create in ./src/auth/:
- `middleware.ts` - JWT validation, refresh logic
- `routes.ts` - Login, logout, refresh endpoints
- `types.ts` - Token payloads, auth types
- `utils.ts` - Token generation, hashing
Create in ./src/auth/__tests__/:
- `auth.test.ts` - Unit tests for all auth functions
</output>
<verification>
1. Run test suite: `npm test src/auth`
2. Type check: `npx tsc --noEmit`
3. Manual test: login flow, token refresh, logout
4. Security check: verify httpOnly cookies, token expiry
</verification>
<success_criteria>
- All tests passing
- No type errors
- Login/logout/refresh flow works
- Tokens properly secured
- Follows patterns from research
</success_criteria>
```
</complex_do>
</complexity_variations>
<non_code_examples>
<document_creation>
```xml
<objective>
Create API documentation for the authentication endpoints.
Purpose: Enable frontend team to integrate auth
Output: OpenAPI spec + markdown guide
</objective>
<context>
Implementation: @src/auth/routes.ts
Types: @src/auth/types.ts
</context>
<requirements>
- OpenAPI 3.0 spec
- Request/response examples
- Error codes and handling
- Authentication flow diagram
</requirements>
<output>
- `./docs/api/auth.yaml` - OpenAPI spec
- `./docs/guides/authentication.md` - Integration guide
</output>
<verification>
- Validate OpenAPI spec: `npx @redocly/cli lint docs/api/auth.yaml`
- Check all endpoints documented
- Verify examples match actual implementation
</verification>
```
</document_creation>
<design_architecture>
```xml
<objective>
Design database schema for multi-tenant SaaS application.
Purpose: Support customer isolation and scaling
Output: Schema diagram + migration files
</objective>
<context>
Research: @.prompts/001-multitenancy-research/multitenancy-research.md
Current schema: @prisma/schema.prisma
</context>
<requirements>
- Row-level security per tenant
- Shared infrastructure model
- Support for tenant-specific customization
- Audit logging
</requirements>
<output>
- `./docs/architecture/tenant-schema.md` - Schema design doc
- `./prisma/migrations/add-tenancy/` - Migration files
</output>
<verification>
- Migration runs without errors
- RLS policies correctly isolate data
- Performance acceptable with 1000 tenants
</verification>
```
</design_architecture>
</non_code_examples>

View File

@@ -0,0 +1,342 @@
<overview>
Guidelines for determining prompt complexity, tool usage, and optimization patterns.
</overview>
<complexity_assessment>
<simple_prompts>
Single focused task, clear outcome:
**Indicators:**
- Single artifact output
- No dependencies on other files
- Straightforward requirements
- No decision-making needed
**Prompt characteristics:**
- Concise objective
- Minimal context
- Direct requirements
- Simple verification
</simple_prompts>
<complex_prompts>
Multi-step tasks, multiple considerations:
**Indicators:**
- Multiple artifacts or phases
- Dependencies on research/plan files
- Trade-offs to consider
- Integration with existing code
**Prompt characteristics:**
- Detailed objective with context
- Referenced files
- Explicit implementation guidance
- Comprehensive verification
- Extended thinking triggers
</complex_prompts>
</complexity_assessment>
<extended_thinking_triggers>
<when_to_include>
Use these phrases to activate deeper reasoning in complex prompts:
- Complex architectural decisions
- Multiple valid approaches to evaluate
- Security-sensitive implementations
- Performance optimization tasks
- Trade-off analysis
</when_to_include>
<trigger_phrases>
```
"Thoroughly analyze..."
"Consider multiple approaches..."
"Deeply consider the implications..."
"Explore various solutions before..."
"Carefully evaluate trade-offs..."
```
</trigger_phrases>
<example_usage>
```xml
<requirements>
Thoroughly analyze the authentication options and consider multiple
approaches before selecting an implementation. Deeply consider the
security implications of each choice.
</requirements>
```
</example_usage>
<when_not_to_use>
- Simple, straightforward tasks
- Tasks with clear single approach
- Following established patterns
- Basic CRUD operations
</when_not_to_use>
</extended_thinking_triggers>
<parallel_tool_calling>
<when_to_include>
```xml
<efficiency>
For maximum efficiency, invoke all independent tool operations
simultaneously rather than sequentially. Multiple file reads,
searches, and API calls that don't depend on each other should
run in parallel.
</efficiency>
```
</when_to_include>
<applicable_scenarios>
- Reading multiple files for context
- Running multiple searches
- Fetching from multiple sources
- Creating multiple independent files
</applicable_scenarios>
</parallel_tool_calling>
<context_loading>
<when_to_load>
- Modifying existing code
- Following established patterns
- Integrating with current systems
- Building on research/plan outputs
</when_to_load>
<when_not_to_load>
- Greenfield features
- Standalone utilities
- Pure research tasks
- Standard patterns without customization
</when_not_to_load>
<loading_patterns>
```xml
<context>
<!-- Chained artifacts -->
Research: @.prompts/001-auth-research/auth-research.md
Plan: @.prompts/002-auth-plan/auth-plan.md
<!-- Existing code to modify -->
Current implementation: @src/auth/middleware.ts
Types to extend: @src/types/auth.ts
<!-- Patterns to follow -->
Similar feature: @src/features/payments/
</context>
```
</loading_patterns>
</context_loading>
<output_optimization>
<streaming_writes>
For research and plan outputs that may be large:
**Instruct incremental writing:**
```xml
<process>
1. Create output file with XML skeleton
2. Write each section as completed:
- Finding 1 discovered → Append immediately
- Finding 2 discovered → Append immediately
- Code example found → Append immediately
3. Finalize summary and metadata after all sections complete
</process>
```
**Why this matters:**
- Prevents lost work from token limit failures
- No need to estimate output size
- Agent creates natural checkpoints
- Works for any task complexity
**When to use:**
- Research prompts (findings accumulate)
- Plan prompts (phases accumulate)
- Any prompt that might produce >15k tokens
**When NOT to use:**
- Do prompts (code generation is different workflow)
- Simple tasks with known small outputs
</streaming_writes>
<claude_to_claude>
For Claude-to-Claude consumption:
**Use heavy XML structure:**
```xml
<findings>
<finding category="security">
<title>Token Storage</title>
<recommendation>httpOnly cookies</recommendation>
<rationale>Prevents XSS access</rationale>
</finding>
</findings>
```
**Include metadata:**
```xml
<metadata>
<confidence level="high">Verified in official docs</confidence>
<dependencies>Cookie parser middleware</dependencies>
<open_questions>SameSite policy for subdomains</open_questions>
</metadata>
```
**Be explicit about next steps:**
```xml
<next_actions>
<action priority="high">Create planning prompt using these findings</action>
<action priority="medium">Validate rate limits in sandbox</action>
</next_actions>
```
</claude_to_claude>
<human_consumption>
For human consumption:
- Clear headings
- Bullet points for scanning
- Code examples with comments
- Summary at top
</human_consumption>
</output_optimization>
<prompt_depth_guidelines>
<minimal>
Simple Do prompts:
- 20-40 lines
- Basic objective, requirements, output, verification
- No extended thinking
- No parallel tool hints
</minimal>
<standard>
Typical task prompts:
- 40-80 lines
- Full objective with context
- Clear requirements and implementation notes
- Standard verification
</standard>
<comprehensive>
Complex task prompts:
- 80-150 lines
- Extended thinking triggers
- Parallel tool calling hints
- Multiple verification steps
- Detailed success criteria
</comprehensive>
</prompt_depth_guidelines>
<why_explanations>
Always explain why constraints matter:
<bad_example>
```xml
<requirements>
Never store tokens in localStorage.
</requirements>
```
</bad_example>
<good_example>
```xml
<requirements>
Never store tokens in localStorage - it's accessible to any
JavaScript on the page, making it vulnerable to XSS attacks.
Use httpOnly cookies instead.
</requirements>
```
</good_example>
This helps the executing Claude make good decisions when facing edge cases.
</why_explanations>
<verification_patterns>
<for_code>
```xml
<verification>
1. Run test suite: `npm test`
2. Type check: `npx tsc --noEmit`
3. Lint: `npm run lint`
4. Manual test: [specific flow to test]
</verification>
```
</for_code>
<for_documents>
```xml
<verification>
1. Validate structure: [check required sections]
2. Verify links: [check internal references]
3. Review completeness: [check against requirements]
</verification>
```
</for_documents>
<for_research>
```xml
<verification>
1. Sources are current (2024-2025)
2. All scope questions answered
3. Metadata captures uncertainties
4. Actionable recommendations included
</verification>
```
</for_research>
<for_plans>
```xml
<verification>
1. Phases are sequential and logical
2. Tasks are specific and actionable
3. Dependencies are clear
4. Metadata captures assumptions
</verification>
```
</for_plans>
</verification_patterns>
<chain_optimization>
<research_prompts>
Research prompts should:
- Structure findings for easy extraction
- Include code examples for implementation
- Clearly mark confidence levels
- List explicit next actions
</research_prompts>
<plan_prompts>
Plan prompts should:
- Reference research explicitly
- Break phases into prompt-sized chunks
- Include execution hints per phase
- Capture dependencies between phases
</plan_prompts>
<do_prompts>
Do prompts should:
- Reference both research and plan
- Follow plan phases explicitly
- Verify against research recommendations
- Update plan status when done
</do_prompts>
</chain_optimization>

View File

@@ -0,0 +1,61 @@
<overview>
Standard metadata structure for research and plan outputs. Include in all research, plan, and refine prompts.
</overview>
<metadata_structure>
```xml
<metadata>
<confidence level="{high|medium|low}">
{Why this confidence level}
</confidence>
<dependencies>
{What's needed to proceed}
</dependencies>
<open_questions>
{What remains uncertain}
</open_questions>
<assumptions>
{What was assumed}
</assumptions>
</metadata>
```
</metadata_structure>
<confidence_levels>
- **high**: Official docs, verified patterns, clear consensus, few unknowns
- **medium**: Mixed sources, some outdated info, minor gaps, reasonable approach
- **low**: Sparse documentation, conflicting info, significant unknowns, best guess
</confidence_levels>
<dependencies_format>
External requirements that must be met:
```xml
<dependencies>
- API keys for third-party service
- Database migration completed
- Team trained on new patterns
</dependencies>
```
</dependencies_format>
<open_questions_format>
What couldn't be determined or needs validation:
```xml
<open_questions>
- Actual rate limits under production load
- Performance with >100k records
- Specific error codes for edge cases
</open_questions>
```
</open_questions_format>
<assumptions_format>
Context assumed that might need validation:
```xml
<assumptions>
- Using REST API (not GraphQL)
- Single region deployment
- Node.js/TypeScript stack
</assumptions>
```
</assumptions_format>

View File

@@ -0,0 +1,267 @@
<overview>
Prompt patterns for creating approaches, roadmaps, and strategies that will be consumed by subsequent prompts.
</overview>
<prompt_template>
````xml
<objective>
Create a {plan type} for {topic}.
Purpose: {What decision/implementation this enables}
Input: {Research or context being used}
Output: {topic}-plan.md with actionable phases/steps
</objective>
<context>
Research findings: @.prompts/{num}-{topic}-research/{topic}-research.md
{Additional context files}
</context>
<planning_requirements>
{What the plan needs to address}
{Constraints to work within}
{Success criteria for the planned outcome}
</planning_requirements>
<output_structure>
Save to: `.prompts/{num}-{topic}-plan/{topic}-plan.md`
Structure the plan using this XML format:
```xml
<plan>
<summary>
{One paragraph overview of the approach}
</summary>
<phases>
<phase number="1" name="{phase-name}">
<objective>{What this phase accomplishes}</objective>
<tasks>
<task priority="high">{Specific actionable task}</task>
<task priority="medium">{Another task}</task>
</tasks>
<deliverables>
<deliverable>{What's produced}</deliverable>
</deliverables>
<dependencies>{What must exist before this phase}</dependencies>
</phase>
<!-- Additional phases -->
</phases>
<metadata>
<confidence level="{high|medium|low}">
{Why this confidence level}
</confidence>
<dependencies>
{External dependencies needed}
</dependencies>
<open_questions>
{Uncertainties that may affect execution}
</open_questions>
<assumptions>
{What was assumed in creating this plan}
</assumptions>
</metadata>
</plan>
```
</output_structure>
<summary_requirements>
Create `.prompts/{num}-{topic}-plan/SUMMARY.md`
Load template: [summary-template.md](summary-template.md)
For plans, emphasize phase breakdown with objectives and assumptions needing validation. Next step typically: Execute first phase.
</summary_requirements>
<success_criteria>
- Plan addresses all requirements
- Phases are sequential and logical
- Tasks are specific and actionable
- Metadata captures uncertainties
- SUMMARY.md created with phase overview
- Ready for implementation prompts to consume
</success_criteria>
````
</prompt_template>
<key_principles>
<reference_research>
Plans should build on research findings:
```xml
<context>
Research findings: @.prompts/001-auth-research/auth-research.md
Key findings to incorporate:
- Recommended approach from research
- Constraints identified
- Best practices to follow
</context>
```
</reference_research>
<prompt_sized_phases>
Each phase should be executable by a single prompt:
```xml
<phase number="1" name="setup-infrastructure">
<objective>Create base auth structure and types</objective>
<tasks>
<task>Create auth module directory</task>
<task>Define TypeScript types for tokens</task>
<task>Set up test infrastructure</task>
</tasks>
</phase>
```
</prompt_sized_phases>
<execution_hints>
Help the next Claude understand how to proceed:
```xml
<phase number="2" name="implement-jwt">
<execution_notes>
This phase modifies files from phase 1.
Reference the types created in phase 1.
Run tests after each major change.
</execution_notes>
</phase>
```
</execution_hints>
</key_principles>
<plan_types>
<implementation_roadmap>
For breaking down how to build something:
```xml
<objective>
Create implementation roadmap for user authentication system.
Purpose: Guide phased implementation with clear milestones
Input: Authentication research findings
Output: auth-plan.md with 4-5 implementation phases
</objective>
<context>
Research: @.prompts/001-auth-research/auth-research.md
</context>
<planning_requirements>
- Break into independently testable phases
- Each phase builds on previous
- Include testing at each phase
- Consider rollback points
</planning_requirements>
```
</implementation_roadmap>
<decision_framework>
For choosing between options:
````xml
<objective>
Create decision framework for selecting database technology.
Purpose: Make informed choice between PostgreSQL, MongoDB, and DynamoDB
Input: Database research findings
Output: database-plan.md with criteria, analysis, recommendation
</objective>
<output_structure>
Structure as decision framework:
```xml
<decision_framework>
<options>
<option name="PostgreSQL">
<pros>{List}</pros>
<cons>{List}</cons>
<fit_score criteria="scalability">8/10</fit_score>
<fit_score criteria="flexibility">6/10</fit_score>
</option>
<!-- Other options -->
</options>
<recommendation>
<choice>{Selected option}</choice>
<rationale>{Why this choice}</rationale>
<risks>{What could go wrong}</risks>
<mitigations>{How to address risks}</mitigations>
</recommendation>
<metadata>
<confidence level="high">
Clear winner based on requirements
</confidence>
<assumptions>
- Expected data volume: 10M records
- Team has SQL experience
</assumptions>
</metadata>
</decision_framework>
```
</output_structure>
````
</decision_framework>
<process_definition>
For defining workflows or methodologies:
````xml
<objective>
Create deployment process for production releases.
Purpose: Standardize safe, repeatable deployments
Input: Current infrastructure research
Output: deployment-plan.md with step-by-step process
</objective>
<output_structure>
Structure as process:
```xml
<process>
<overview>{High-level flow}</overview>
<steps>
<step number="1" name="pre-deployment">
<actions>
<action>Run full test suite</action>
<action>Create database backup</action>
<action>Notify team in #deployments</action>
</actions>
<checklist>
<item>Tests passing</item>
<item>Backup verified</item>
<item>Team notified</item>
</checklist>
<rollback>N/A - no changes yet</rollback>
</step>
<!-- Additional steps -->
</steps>
<metadata>
<dependencies>
- CI/CD pipeline configured
- Database backup system
- Slack webhook for notifications
</dependencies>
<open_questions>
- Blue-green vs rolling deployment?
- Automated rollback triggers?
</open_questions>
</metadata>
</process>
```
</output_structure>
````
</process_definition>
</plan_types>
<metadata_guidelines>
Load: [metadata-guidelines.md](metadata-guidelines.md)
</metadata_guidelines>

View File

@@ -0,0 +1,288 @@
<overview>
Contextual questions for intake, organized by purpose. Use AskUserQuestion tool with these templates.
</overview>
<universal_questions>
<topic_identifier>
When topic not obvious from description:
```yaml
header: "Topic"
question: "What topic/feature is this for? (used for file naming)"
# Let user provide via "Other" option
# Enforce kebab-case (convert spaces to hyphens)
```
</topic_identifier>
<chain_reference>
When existing research/plan files found:
```yaml
header: "Reference"
question: "Should this prompt reference any existing research or plans?"
options:
- "{file1}" - Found in .prompts/{folder1}/
- "{file2}" - Found in .prompts/{folder2}/
- "None" - Start fresh without referencing existing files
multiSelect: true
```
</chain_reference>
</universal_questions>
<do_questions>
<artifact_type>
When unclear what's being created:
```yaml
header: "Output type"
question: "What are you creating?"
options:
- "Code/feature" - Software implementation
- "Document/content" - Written material, documentation
- "Design/spec" - Architecture, wireframes, specifications
- "Configuration" - Config files, infrastructure setup
```
</artifact_type>
<scope_completeness>
When level of polish unclear:
```yaml
header: "Scope"
question: "What level of completeness?"
options:
- "Production-ready" - Ship to users, needs polish and tests
- "Working prototype" - Functional but rough edges acceptable
- "Proof of concept" - Minimal viable demonstration
```
</scope_completeness>
<approach_patterns>
When implementation approach unclear:
```yaml
header: "Approach"
question: "Any specific patterns or constraints?"
options:
- "Follow existing patterns" - Match current codebase style
- "Best practices" - Modern, recommended approaches
- "Specific requirement" - I have a constraint to specify
```
</approach_patterns>
<testing_requirements>
When verification needs unclear:
```yaml
header: "Testing"
question: "What testing is needed?"
options:
- "Full test coverage" - Unit, integration, e2e tests
- "Core functionality" - Key paths tested
- "Manual verification" - No automated tests required
```
</testing_requirements>
<integration_points>
For features that connect to existing code:
```yaml
header: "Integration"
question: "How does this integrate with existing code?"
options:
- "New module" - Standalone, minimal integration
- "Extends existing" - Adds to current implementation
- "Replaces existing" - Replaces current implementation
```
</integration_points>
</do_questions>
<plan_questions>
<plan_purpose>
What the plan leads to:
```yaml
header: "Plan for"
question: "What is this plan leading to?"
options:
- "Implementation" - Break down how to build something
- "Decision" - Weigh options, choose an approach
- "Process" - Define workflow or methodology
```
</plan_purpose>
<plan_format>
How to structure the output:
```yaml
header: "Format"
question: "What format works best?"
options:
- "Phased roadmap" - Sequential stages with milestones
- "Checklist/tasks" - Actionable items to complete
- "Decision framework" - Criteria, trade-offs, recommendation
```
</plan_format>
<constraints>
What limits the plan:
```yaml
header: "Constraints"
question: "What constraints should the plan consider?"
options:
- "Technical" - Stack limitations, dependencies, compatibility
- "Resources" - Team capacity, expertise available
- "Requirements" - Must-haves, compliance, standards
multiSelect: true
```
</constraints>
<granularity>
Level of detail needed:
```yaml
header: "Granularity"
question: "How detailed should the plan be?"
options:
- "High-level phases" - Major milestones, flexible execution
- "Detailed tasks" - Specific actionable items
- "Prompt-ready" - Each phase is one prompt to execute
```
</granularity>
<dependencies>
What exists vs what needs creation:
```yaml
header: "Dependencies"
question: "What already exists?"
options:
- "Greenfield" - Starting from scratch
- "Existing codebase" - Building on current code
- "Research complete" - Findings ready to plan from
```
</dependencies>
</plan_questions>
<research_questions>
<research_depth>
How comprehensive:
```yaml
header: "Depth"
question: "How deep should the research go?"
options:
- "Overview" - High-level understanding, key concepts
- "Comprehensive" - Detailed exploration, multiple perspectives
- "Exhaustive" - Everything available, edge cases included
```
</research_depth>
<source_priorities>
Where to look:
```yaml
header: "Sources"
question: "What sources should be prioritized?"
options:
- "Official docs" - Primary sources, authoritative references
- "Community" - Blog posts, tutorials, real-world examples
- "Current/latest" - 2024-2025 sources, cutting edge
multiSelect: true
```
</source_priorities>
<output_format>
How to present findings:
```yaml
header: "Output"
question: "How should findings be structured?"
options:
- "Summary with key points" - Concise, actionable takeaways
- "Detailed analysis" - In-depth with examples and comparisons
- "Reference document" - Organized for future lookup
```
</output_format>
<research_focus>
When topic is broad:
```yaml
header: "Focus"
question: "What aspect is most important?"
options:
- "How it works" - Concepts, architecture, internals
- "How to use it" - Patterns, examples, best practices
- "Trade-offs" - Pros/cons, alternatives, comparisons
```
</research_focus>
<evaluation_criteria>
For comparison research:
```yaml
header: "Criteria"
question: "What criteria matter most for evaluation?"
options:
- "Performance" - Speed, scalability, efficiency
- "Developer experience" - Ease of use, documentation, community
- "Security" - Vulnerabilities, compliance, best practices
- "Cost" - Pricing, resource usage, maintenance
multiSelect: true
```
</evaluation_criteria>
</research_questions>
<refine_questions>
<target_selection>
When multiple outputs exist:
```yaml
header: "Target"
question: "Which output should be refined?"
options:
- "{file1}" - In .prompts/{folder1}/
- "{file2}" - In .prompts/{folder2}/
# List existing research/plan outputs
```
</target_selection>
<feedback_type>
What kind of improvement:
```yaml
header: "Improvement"
question: "What needs improvement?"
options:
- "Deepen analysis" - Add more detail, examples, or rigor
- "Expand scope" - Cover additional areas or topics
- "Correct errors" - Fix factual mistakes or outdated info
- "Restructure" - Reorganize for clarity or usability
```
</feedback_type>
<specific_feedback>
After type selected, gather details:
```yaml
header: "Details"
question: "What specifically should be improved?"
# Let user provide via "Other" option
# This is the core feedback that drives the refine prompt
```
</specific_feedback>
<preservation>
What to keep:
```yaml
header: "Preserve"
question: "What's working well that should be kept?"
options:
- "Structure" - Keep the overall organization
- "Recommendations" - Keep the conclusions
- "Code examples" - Keep the implementation patterns
- "Everything except feedback areas" - Only change what's specified
```
</preservation>
</refine_questions>
<question_rules>
- Only ask about genuine gaps - don't ask what's already stated
- 2-4 questions max per round - avoid overwhelming
- Each option needs description - explain implications
- Prefer options over free-text - when choices are knowable
- User can always select "Other" - for custom input
- Route by purpose - use purpose-specific questions after primary gate
</question_rules>

View File

@@ -0,0 +1,296 @@
<overview>
Prompt patterns for improving existing research or plan outputs based on feedback.
</overview>
<prompt_template>
```xml
<objective>
Refine {topic}-{original_purpose} based on feedback.
Target: @.prompts/{num}-{topic}-{original_purpose}/{topic}-{original_purpose}.md
Current summary: @.prompts/{num}-{topic}-{original_purpose}/SUMMARY.md
Purpose: {What improvement is needed}
Output: Updated {topic}-{original_purpose}.md with improvements
</objective>
<context>
Original output: @.prompts/{num}-{topic}-{original_purpose}/{topic}-{original_purpose}.md
</context>
<feedback>
{Specific issues to address}
{What was missing or insufficient}
{Areas needing more depth}
</feedback>
<preserve>
{What worked well and should be kept}
{Structure or findings to maintain}
</preserve>
<requirements>
- Address all feedback points
- Maintain original structure and metadata format
- Keep what worked from previous version
- Update confidence based on improvements
- Clearly improve on identified weaknesses
</requirements>
<output>
1. Archive current output to: `.prompts/{num}-{topic}-{original_purpose}/archive/{topic}-{original_purpose}-v{n}.md`
2. Write improved version to: `.prompts/{num}-{topic}-{original_purpose}/{topic}-{original_purpose}.md`
3. Create SUMMARY.md with version info and changes from previous
</output>
<summary_requirements>
Create `.prompts/{num}-{topic}-{original_purpose}/SUMMARY.md`
Load template: [summary-template.md](summary-template.md)
For Refine, always include:
- Version with iteration info (e.g., "v2 (refined from v1)")
- Changes from Previous section listing what improved
- Updated confidence if gaps were filled
</summary_requirements>
<success_criteria>
- All feedback points addressed
- Original structure maintained
- Previous version archived
- SUMMARY.md reflects version and changes
- Quality demonstrably improved
</success_criteria>
```
</prompt_template>
<key_principles>
<preserve_context>
Refine builds on existing work, not replaces it:
```xml
<context>
Original output: @.prompts/001-auth-research/auth-research.md
Key strengths to preserve:
- Library comparison structure
- Security recommendations
- Code examples format
</context>
```
</preserve_context>
<specific_feedback>
Feedback must be actionable:
```xml
<feedback>
Issues to address:
- Security analysis was surface-level - need CVE references and vulnerability patterns
- Performance benchmarks missing - add actual timing data
- Rate limiting patterns not covered
Do NOT change:
- Library comparison structure
- Recommendation format
</feedback>
```
</specific_feedback>
<version_tracking>
Archive before overwriting:
```xml
<output>
1. Archive: `.prompts/001-auth-research/archive/auth-research-v1.md`
2. Write improved: `.prompts/001-auth-research/auth-research.md`
3. Update SUMMARY.md with version info
</output>
```
</version_tracking>
</key_principles>
<refine_types>
<deepen_research>
When research was too surface-level:
```xml
<objective>
Refine auth-research based on feedback.
Target: @.prompts/001-auth-research/auth-research.md
</objective>
<feedback>
- Security analysis too shallow - need specific vulnerability patterns
- Missing performance benchmarks
- Rate limiting not covered
</feedback>
<preserve>
- Library comparison structure
- Code example format
- Recommendation priorities
</preserve>
<requirements>
- Add CVE references for common vulnerabilities
- Include actual benchmark data from library docs
- Add rate limiting patterns section
- Increase confidence if gaps are filled
</requirements>
```
</deepen_research>
<expand_scope>
When research missed important areas:
```xml
<objective>
Refine stripe-research to include webhooks.
Target: @.prompts/005-stripe-research/stripe-research.md
</objective>
<feedback>
- Webhooks section completely missing
- Need signature verification patterns
- Retry handling not covered
</feedback>
<preserve>
- API authentication section
- Checkout flow documentation
- Error handling patterns
</preserve>
<requirements>
- Add comprehensive webhooks section
- Include signature verification code examples
- Cover retry and idempotency patterns
- Update summary to reflect expanded scope
</requirements>
```
</expand_scope>
<update_plan>
When plan needs adjustment:
```xml
<objective>
Refine auth-plan to add rate limiting phase.
Target: @.prompts/002-auth-plan/auth-plan.md
</objective>
<feedback>
- Rate limiting was deferred but is critical for production
- Should be its own phase, not bundled with tests
</feedback>
<preserve>
- Phase 1-3 structure
- Dependency chain
- Task granularity
</preserve>
<requirements>
- Insert Phase 4: Rate limiting
- Adjust Phase 5 (tests) to depend on rate limiting
- Update phase count in summary
- Ensure new phase is prompt-sized
</requirements>
```
</update_plan>
<correct_errors>
When output has factual errors:
```xml
<objective>
Refine jwt-research to correct library recommendation.
Target: @.prompts/003-jwt-research/jwt-research.md
</objective>
<feedback>
- jsonwebtoken recommendation is outdated
- jose is now preferred for security and performance
- Bundle size comparison was incorrect
</feedback>
<preserve>
- Research structure
- Security best practices section
- Token storage recommendations
</preserve>
<requirements>
- Update library recommendation to jose
- Correct bundle size data
- Add note about jsonwebtoken deprecation concerns
- Lower confidence if other findings may need verification
</requirements>
```
</correct_errors>
</refine_types>
<folder_structure>
Refine prompts get their own folder (new number), but output goes to the original folder:
```
.prompts/
├── 001-auth-research/
│ ├── completed/
│ │ └── 001-auth-research.md # Original prompt
│ ├── archive/
│ │ └── auth-research-v1.md # Archived v1
│ ├── auth-research.md # Current (v2)
│ └── SUMMARY.md # Reflects v2
├── 004-auth-research-refine/
│ ├── completed/
│ │ └── 004-auth-research-refine.md # Refine prompt
│ └── (no output here - goes to 001)
```
This maintains:
- Clear prompt history (each prompt is numbered)
- Single source of truth for each output
- Visible iteration count in SUMMARY.md
</folder_structure>
<execution_notes>
<dependency_handling>
Refine prompts depend on the target output existing:
- Check target file exists before execution
- If target folder missing, offer to create the original prompt first
```xml
<dependency_check>
If `.prompts/{num}-{topic}-{original_purpose}/{topic}-{original_purpose}.md` not found:
- Error: "Cannot refine - target output doesn't exist"
- Offer: "Create the original {purpose} prompt first?"
</dependency_check>
```
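The same pre-flight check, expressed as a shell sketch (the path is illustrative):
```bash
# Illustrative pre-flight check before executing a refine prompt
target=".prompts/001-auth-research/auth-research.md"
if [ ! -f "$target" ]; then
  echo "Cannot refine - target output doesn't exist: $target"
  echo "Create the original research prompt first."
  exit 1
fi
```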
</dependency_handling>
<archive_creation>
Before overwriting, ensure archive exists:
```bash
mkdir -p .prompts/{num}-{topic}-{original_purpose}/archive/
mv .prompts/{num}-{topic}-{original_purpose}/{topic}-{original_purpose}.md \
.prompts/{num}-{topic}-{original_purpose}/archive/{topic}-{original_purpose}-v{n}.md
```
</archive_creation>
<summary_update>
SUMMARY.md must reflect the refinement:
- Update version number
- Add "Changes from Previous" section
- Update one-liner if findings changed
- Update confidence if improved
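For example, the refreshed header after a v2 refinement might start like this (written as a shell heredoc; topic and findings are placeholders):
```bash
# Illustrative start of the refreshed SUMMARY.md for a v2 refinement (values are placeholders)
cat > .prompts/001-auth-research/SUMMARY.md <<'EOF'
# Auth Research Summary
**JWT with jose library and httpOnly cookies recommended**

## Version
v2 (refined from v1)

## Changes from Previous
- Added CVE references and rate limiting patterns
- Corrected bundle size comparison
EOF
```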
</summary_update>
</execution_notes>

View File

@@ -0,0 +1,626 @@
<overview>
Prompt patterns for gathering information that will be consumed by planning or implementation prompts.
Includes quality controls, verification mechanisms, and streaming writes to prevent research gaps and token limit failures.
</overview>
<prompt_template>
```xml
<session_initialization>
Before beginning research, verify today's date:
!`date +%Y-%m-%d`
Use this date when searching for "current" or "latest" information.
Example: If today is 2025-11-22, search for "2025" not "2024".
</session_initialization>
<research_objective>
Research {topic} to inform {subsequent use}.
Purpose: {What decision/implementation this enables}
Scope: {Boundaries of the research}
Output: {topic}-research.md with structured findings
</research_objective>
<research_scope>
<include>
{What to investigate}
{Specific questions to answer}
</include>
<exclude>
{What's out of scope}
{What to defer to later research}
</exclude>
<sources>
{Priority sources with exact URLs for WebFetch}
Official documentation:
- https://example.com/official-docs
- https://example.com/api-reference
Search queries for WebSearch:
- "{topic} best practices {current_year}"
- "{topic} latest version"
{Time constraints: prefer current sources - check today's date first}
</sources>
</research_scope>
<verification_checklist>
{If researching configuration/architecture with known components:}
□ Verify ALL known configuration/implementation options (enumerate below):
□ Option/Scope 1: {description}
□ Option/Scope 2: {description}
□ Option/Scope 3: {description}
□ Document exact file locations/URLs for each option
□ Verify precedence/hierarchy rules if applicable
□ Confirm syntax and examples from official sources
□ Check for recent updates or changes to documentation
{For all research:}
□ Verify negative claims ("X is not possible") with official docs
□ Confirm all primary claims have authoritative sources
□ Check both current docs AND recent updates/changelogs
□ Test multiple search queries to avoid missing information
□ Check for environment/tool-specific variations
</verification_checklist>
<research_quality_assurance>
Before completing research, perform these checks:
<completeness_check>
- [ ] All enumerated options/components documented with evidence
- [ ] Each access method/approach evaluated against ALL requirements
- [ ] Official documentation cited for critical claims
- [ ] Contradictory information resolved or flagged
</completeness_check>
<source_verification>
- [ ] Primary claims backed by official/authoritative sources
- [ ] Version numbers and dates included where relevant
- [ ] Actual URLs provided (not just "search for X")
- [ ] Distinguish verified facts from assumptions
</source_verification>
<blind_spots_review>
Ask yourself: "What might I have missed?"
- [ ] Are there configuration/implementation options I didn't investigate?
- [ ] Did I check for multiple environments/contexts (e.g., Claude Desktop vs Claude Code)?
- [ ] Did I verify claims that seem definitive ("cannot", "only", "must")?
- [ ] Did I look for recent changes or updates to documentation?
</blind_spots_review>
<critical_claims_audit>
For any statement like "X is not possible" or "Y is the only way":
- [ ] Is this verified by official documentation?
- [ ] Have I checked for recent updates that might change this?
- [ ] Are there alternative approaches I haven't considered?
</critical_claims_audit>
</research_quality_assurance>
<output_structure>
Save to: `.prompts/{num}-{topic}-research/{topic}-research.md`
Structure findings using this XML format:
```xml
<research>
<summary>
{2-3 paragraph executive summary of key findings}
</summary>
<findings>
<finding category="{category}">
<title>{Finding title}</title>
<detail>{Detailed explanation}</detail>
<source>{Where this came from}</source>
<relevance>{Why this matters for the goal}</relevance>
</finding>
<!-- Additional findings -->
</findings>
<recommendations>
<recommendation priority="high">
<action>{What to do}</action>
<rationale>{Why}</rationale>
</recommendation>
<!-- Additional recommendations -->
</recommendations>
<code_examples>
{Relevant code patterns, snippets, configurations}
</code_examples>
<metadata>
<confidence level="{high|medium|low}">
{Why this confidence level}
</confidence>
<dependencies>
{What's needed to act on this research}
</dependencies>
<open_questions>
{What couldn't be determined}
</open_questions>
<assumptions>
{What was assumed}
</assumptions>
<!-- ENHANCED: Research Quality Report -->
<quality_report>
<sources_consulted>
{List URLs of official documentation and primary sources}
</sources_consulted>
<claims_verified>
{Key findings verified with official sources}
</claims_verified>
<claims_assumed>
{Findings based on inference or incomplete information}
</claims_assumed>
<contradictions_encountered>
{Any conflicting information found and how resolved}
</contradictions_encountered>
<confidence_by_finding>
{For critical findings, individual confidence levels}
- Finding 1: High (official docs + multiple sources)
- Finding 2: Medium (single source, unclear if current)
- Finding 3: Low (inferred, requires hands-on verification)
</confidence_by_finding>
</quality_report>
</metadata>
</research>
```
</output_structure>
<pre_submission_checklist>
Before submitting your research report, confirm:
**Scope Coverage**
- [ ] All enumerated options/approaches investigated
- [ ] Each component from verification checklist documented or marked "not found"
- [ ] Official documentation cited for all critical claims
**Claim Verification**
- [ ] Each "not possible" or "only way" claim verified with official docs
- [ ] URLs to official documentation included for key findings
- [ ] Version numbers and dates specified where relevant
**Quality Controls**
- [ ] Blind spots review completed ("What did I miss?")
- [ ] Quality report section filled out honestly
- [ ] Confidence levels assigned with justification
- [ ] Assumptions clearly distinguished from verified facts
**Output Completeness**
- [ ] All required XML sections present
- [ ] SUMMARY.md created with substantive one-liner
- [ ] Sources consulted listed with URLs
- [ ] Next steps clearly identified
</pre_submission_checklist>
<incremental_output>
**CRITICAL: Write findings incrementally to prevent token limit failures**
Instead of generating the full research in memory and writing at the end:
1. Create the output file with initial structure
2. Write each finding as you discover it
3. Append code examples as you find them
4. Update metadata at the end
This ensures:
- Zero lost work if token limit is hit
- File contains all findings up to that point
- No estimation heuristics needed
- Works for any research size
<workflow>
Step 1 - Initialize structure:
```bash
# Create file with skeleton
Write: .prompts/{num}-{topic}-research/{topic}-research.md
Content: Basic XML structure with empty sections
```
Step 2 - Append findings incrementally:
```bash
# After researching authentication libraries
Edit: Append <finding> to <findings> section
# After discovering rate limits
Edit: Append another <finding> to <findings> section
```
Step 3 - Add code examples as discovered:
```bash
# Found jose example
Edit: Append to <code_examples> section
```
Step 4 - Finalize metadata:
```bash
# After completing research
Edit: Update <metadata> section with confidence, dependencies, etc.
```
</workflow>
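Outside Claude's Write/Edit tools, the same pattern can be sketched in plain shell - write the skeleton once, then splice each finding in before the closing tag (GNU sed syntax; paths and content are placeholders):
```bash
# Illustrative shell analogue of the incremental-write pattern (GNU sed)
out=".prompts/003-jwt-research/jwt-research.md"
mkdir -p "$(dirname "$out")"

# Skeleton written up front - partial progress is never lost
printf '<research>\n<summary></summary>\n<findings>\n</findings>\n</research>\n' > "$out"

# After each discovery, insert a finding before the closing tag
sed -i 's|</findings>|<finding category="example">placeholder finding</finding>\n</findings>|' "$out"
```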
<example_prompt_instruction>
```xml
<output_requirements>
Write findings incrementally to {topic}-research.md as you discover them:
1. Create the file with this initial structure:
```xml
<research>
<summary>[Will complete at end]</summary>
<findings></findings>
<recommendations></recommendations>
<code_examples></code_examples>
<metadata></metadata>
</research>
```
2. As you research each aspect, immediately append findings:
- Research JWT libraries → Write finding
- Discover security pattern → Write finding
- Find code example → Append to code_examples
3. After all research complete:
- Write summary (synthesize all findings)
- Write recommendations (based on findings)
- Write metadata (confidence, dependencies, etc.)
This incremental approach ensures all work is saved even if execution
hits token limits. Never generate the full output in memory first.
</output_requirements>
```
</example_prompt_instruction>
<benefits>
**vs. Pre-execution estimation:**
- No estimation errors (you don't predict, you just write)
- No artificial modularization (agent decides natural breakpoints)
- No lost work (everything written is saved)
**vs. Single end-of-execution write:**
- Survives token limit failures (partial progress saved)
- Lower memory usage (write as you go)
- Natural checkpoint recovery (can continue from last finding)
</benefits>
</incremental_output>
<summary_requirements>
Create `.prompts/{num}-{topic}-research/SUMMARY.md`
Load template: [summary-template.md](summary-template.md)
For research, emphasize key recommendation and decision readiness. Next step typically: Create plan.
</summary_requirements>
<success_criteria>
- All scope questions answered
- All verification checklist items completed
- Sources are current and authoritative
- Findings are actionable
- Metadata captures gaps honestly
- Quality report distinguishes verified from assumed
- SUMMARY.md created with substantive one-liner
- Ready for planning/implementation to consume
</success_criteria>
```
</prompt_template>
<key_principles>
<structure_for_consumption>
The next Claude needs to quickly extract relevant information:
```xml
<finding category="authentication">
<title>JWT vs Session Tokens</title>
<detail>
JWTs are preferred for stateless APIs. Sessions better for
traditional web apps with server-side rendering.
</detail>
<source>OWASP Authentication Cheatsheet 2024</source>
<relevance>
Our API-first architecture points to JWT approach.
</relevance>
</finding>
```
</structure_for_consumption>
<include_code_examples>
The implementation prompt needs patterns to follow:
```xml
<code_examples>
<example name="jwt-verification">
```typescript
import { jwtVerify } from 'jose';
const { payload } = await jwtVerify(
token,
new TextEncoder().encode(secret),
{ algorithms: ['HS256'] }
);
```
Source: jose library documentation
</example>
</code_examples>
```
</include_code_examples>
<explicit_confidence>
Help the next Claude know what to trust:
```xml
<metadata>
<confidence level="medium">
API documentation is comprehensive but lacks real-world
performance benchmarks. Rate limits are documented but
actual behavior may differ under load.
</confidence>
<quality_report>
<confidence_by_finding>
- JWT library comparison: High (npm stats + security audits + active maintenance verified)
- Performance benchmarks: Low (no official data, community reports vary)
- Rate limits: Medium (documented but not tested)
</confidence_by_finding>
</quality_report>
</metadata>
```
</explicit_confidence>
<enumerate_known_possibilities>
When researching systems with known components, enumerate them explicitly:
```xml
<verification_checklist>
**CRITICAL**: Verify ALL configuration scopes:
□ User scope - Global configuration
□ Project scope - Project-level configuration files
□ Local scope - Project-specific user overrides
□ Environment scope - Environment variable based
</verification_checklist>
```
This forces systematic coverage and prevents omissions.
</enumerate_known_possibilities>
</key_principles>
<research_types>
<technology_research>
For understanding tools, libraries, APIs:
```xml
<research_objective>
Research JWT authentication libraries for Node.js.
Purpose: Select library for auth implementation
Scope: Security, performance, maintenance status
Output: jwt-research.md
</research_objective>
<research_scope>
<include>
- Available libraries (jose, jsonwebtoken, etc.)
- Security track record
- Bundle size and performance
- TypeScript support
- Active maintenance
- Community adoption
</include>
<exclude>
- Implementation details (for planning phase)
- Specific code architecture (for implementation)
</exclude>
<sources>
Official documentation (use WebFetch):
- https://github.com/panva/jose
- https://github.com/auth0/node-jsonwebtoken
Additional sources (use WebSearch):
- "JWT library comparison {current_year}"
- "jose vs jsonwebtoken security {current_year}"
- npm download stats
- GitHub issues/security advisories
</sources>
</research_scope>
<verification_checklist>
□ Verify all major JWT libraries (jose, jsonwebtoken, passport-jwt)
□ Check npm download trends for adoption metrics
□ Review GitHub security advisories for each library
□ Confirm TypeScript support with examples
□ Document bundle sizes from bundlephobia or similar
</verification_checklist>
```
</technology_research>
<best_practices_research>
For understanding patterns and standards:
```xml
<research_objective>
Research authentication security best practices.
Purpose: Inform secure auth implementation
Scope: Current standards, common vulnerabilities, mitigations
Output: auth-security-research.md
</research_objective>
<research_scope>
<include>
- OWASP authentication guidelines
- Token storage best practices
- Common vulnerabilities (XSS, CSRF)
- Secure cookie configuration
- Password hashing standards
</include>
<sources>
Official sources (use WebFetch):
- https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html
- https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html
Search sources (use WebSearch):
- "OWASP authentication {current_year}"
- "secure token storage best practices {current_year}"
</sources>
</research_scope>
<verification_checklist>
□ Verify OWASP top 10 authentication vulnerabilities
□ Check latest OWASP cheatsheet publication date
□ Confirm recommended hash algorithms (bcrypt, scrypt, Argon2)
□ Document secure cookie flags (httpOnly, secure, sameSite)
</verification_checklist>
```
</best_practices_research>
<api_service_research>
For understanding external services:
```xml
<research_objective>
Research Stripe API for payment integration.
Purpose: Plan payment implementation
Scope: Endpoints, authentication, webhooks, testing
Output: stripe-research.md
</research_objective>
<research_scope>
<include>
- API structure and versioning
- Authentication methods
- Key endpoints for our use case
- Webhook events and handling
- Testing and sandbox environment
- Error handling patterns
- SDK availability
</include>
<exclude>
- Pricing details
- Account setup process
</exclude>
<sources>
Official sources (use WebFetch):
- https://stripe.com/docs/api
- https://stripe.com/docs/webhooks
- https://stripe.com/docs/testing
Context7 MCP:
- Use mcp__context7__resolve-library-id for Stripe
- Use mcp__context7__get-library-docs for current patterns
</sources>
</research_scope>
<verification_checklist>
□ Verify current API version and deprecation timeline
□ Check webhook event types for our use case
□ Confirm sandbox environment capabilities
□ Document rate limits from official docs
□ Verify SDK availability for our stack
</verification_checklist>
```
</api_service_research>
<comparison_research>
For evaluating options:
```xml
<research_objective>
Research database options for multi-tenant SaaS.
Purpose: Inform database selection decision
Scope: PostgreSQL, MongoDB, DynamoDB for our use case
Output: database-research.md
</research_objective>
<research_scope>
<include>
For each option:
- Multi-tenancy support patterns
- Scaling characteristics
- Cost model
- Operational complexity
- Team expertise requirements
</include>
<evaluation_criteria>
- Data isolation requirements
- Expected query patterns
- Scale projections
- Team familiarity
</evaluation_criteria>
</research_scope>
<verification_checklist>
□ Verify all candidate databases (PostgreSQL, MongoDB, DynamoDB)
□ Document multi-tenancy patterns for each with official sources
□ Compare scaling characteristics with authoritative benchmarks
□ Check pricing calculators for cost model verification
□ Assess team expertise honestly (survey if needed)
</verification_checklist>
```
</comparison_research>
</research_types>
<metadata_guidelines>
Load: [metadata-guidelines.md](metadata-guidelines.md)
**Enhanced guidance**:
- Use <quality_report> to distinguish verified facts from assumptions
- Assign confidence levels to individual findings when they vary
- List all sources consulted with URLs for verification
- Document contradictions encountered and how resolved
- Be honest about limitations and gaps in research
</metadata_guidelines>
<tool_usage>
<context7_mcp>
For library documentation:
```
Use mcp__context7__resolve-library-id to find library
Then mcp__context7__get-library-docs for current patterns
```
</context7_mcp>
<web_search>
For recent articles and updates:
```
Search: "{topic} best practices {current_year}"
Search: "{library} security vulnerabilities {current_year}"
Search: "{topic} vs {alternative} comparison {current_year}"
```
</web_search>
<web_fetch>
For specific documentation pages:
```
Fetch official docs, API references, changelogs with exact URLs
Prefer WebFetch over WebSearch for authoritative sources
```
</web_fetch>
Include tool usage hints in research prompts when specific sources are needed.
</tool_usage>
<pitfalls_reference>
Before completing research, review common pitfalls:
Load: [research-pitfalls.md](research-pitfalls.md)
Key patterns to avoid:
- Configuration scope assumptions - enumerate all scopes
- "Search for X" vagueness - provide exact URLs
- Deprecated vs current confusion - check changelogs
- Tool-specific variations - check each environment
</pitfalls_reference>

View File

@@ -0,0 +1,198 @@
# Research Pitfalls - Known Patterns to Avoid
## Purpose
This document catalogs research mistakes discovered in production use, providing specific patterns to avoid and verification strategies to prevent recurrence.
## Known Pitfalls
### Pitfall 1: Configuration Scope Assumptions
**What**: Assuming global configuration means no project-scoping exists
**Example**: Concluding "MCP servers are configured GLOBALLY only" while missing project-scoped `.mcp.json`
**Why it happens**: Not explicitly checking all known configuration patterns
**Prevention**:
```xml
<verification_checklist>
**CRITICAL**: Verify ALL configuration scopes:
□ User/global scope - System-wide configuration
□ Project scope - Project-level configuration files
□ Local scope - Project-specific user overrides
□ Workspace scope - IDE/tool workspace settings
□ Environment scope - Environment variables
</verification_checklist>
```
### Pitfall 2: "Search for X" Vagueness
**What**: Asking researchers to "search for documentation" without specifying where
**Example**: "Research MCP documentation" → finds outdated community blog instead of official docs
**Why it happens**: Vague research instructions don't specify exact sources
**Prevention**:
```xml
<sources>
Official sources (use WebFetch):
- https://exact-url-to-official-docs
- https://exact-url-to-api-reference
Search queries (use WebSearch):
- "specific search query {current_year}"
- "another specific query {current_year}"
</sources>
```
### Pitfall 3: Deprecated vs Current Features
**What**: Finding archived/old documentation and concluding feature doesn't exist
**Example**: Finding 2022 docs saying "feature not supported" when current version added it
**Why it happens**: Not checking multiple sources or recent updates
**Prevention**:
```xml
<verification_checklist>
□ Check current official documentation
□ Review changelog/release notes for recent updates
□ Verify version numbers and publication dates
□ Cross-reference multiple authoritative sources
</verification_checklist>
```
### Pitfall 4: Tool-Specific Variations
**What**: Conflating capabilities across different tools/environments
**Example**: "Claude Desktop supports X" ≠ "Claude Code supports X"
**Why it happens**: Not explicitly checking each environment separately
**Prevention**:
```xml
<verification_checklist>
□ Claude Desktop capabilities
□ Claude Code capabilities
□ VS Code extension capabilities
□ API/SDK capabilities
Document which environment supports which features
</verification_checklist>
```
### Pitfall 5: Confident Negative Claims Without Citations
**What**: Making definitive "X is not possible" statements without official source verification
**Example**: "Folder-scoped MCP configuration is not supported" (missing `.mcp.json`)
**Why it happens**: Drawing conclusions from absence of evidence rather than evidence of absence
**Prevention**:
```xml
<critical_claims_audit>
For any "X is not possible" or "Y is the only way" statement:
- [ ] Is this verified by official documentation stating it explicitly?
- [ ] Have I checked for recent updates that might change this?
- [ ] Have I verified all possible approaches/mechanisms?
- [ ] Am I confusing "I didn't find it" with "it doesn't exist"?
</critical_claims_audit>
```
### Pitfall 6: Missing Enumeration
**What**: Investigating open-ended scope without enumerating known possibilities first
**Example**: "Research configuration options" instead of listing specific options to verify
**Why it happens**: Not creating explicit checklist of items to investigate
**Prevention**:
```xml
<verification_checklist>
Enumerate ALL known options FIRST:
□ Option 1: [specific item]
□ Option 2: [specific item]
□ Option 3: [specific item]
□ Check for additional unlisted options
For each option above, document:
- Existence (confirmed/not found/unclear)
- Official source URL
- Current status (active/deprecated/beta)
</verification_checklist>
```
### Pitfall 7: Single-Source Verification
**What**: Relying on a single source for critical claims
**Example**: Using only Stack Overflow answer from 2021 for current best practices
**Why it happens**: Not cross-referencing multiple authoritative sources
**Prevention**:
```xml
<source_verification>
For critical claims, require multiple sources:
- [ ] Official documentation (primary)
- [ ] Release notes/changelog (for currency)
- [ ] Additional authoritative source (for verification)
- [ ] Contradiction check (ensure sources agree)
</source_verification>
```
### Pitfall 8: Assumed Completeness
**What**: Assuming search results are complete and authoritative
**Example**: First Google result is outdated but assumed current
**Why it happens**: Not verifying publication dates and source authority
**Prevention**:
```xml
<source_verification>
For each source consulted:
- [ ] Publication/update date verified (prefer recent/current)
- [ ] Source authority confirmed (official docs, not blogs)
- [ ] Version relevance checked (matches current version)
- [ ] Multiple search queries tried (not just one)
</source_verification>
```
## Red Flags in Research Outputs
### 🚩 Red Flag 1: Zero "Not Found" Results
**Warning**: Every investigation succeeds perfectly
**Problem**: Real research encounters dead ends, ambiguity, and unknowns
**Action**: Expect honest reporting of limitations, contradictions, and gaps
### 🚩 Red Flag 2: No Confidence Indicators
**Warning**: All findings presented as equally certain
**Problem**: Can't distinguish verified facts from educated guesses
**Action**: Require confidence levels (High/Medium/Low) for key findings
### 🚩 Red Flag 3: Missing URLs
**Warning**: "According to documentation..." without specific URL
**Problem**: Can't verify claims or check for updates
**Action**: Require actual URLs for all official documentation claims
### 🚩 Red Flag 4: Definitive Statements Without Evidence
**Warning**: "X cannot do Y" or "Z is the only way" without citation
**Problem**: Strong claims require strong evidence
**Action**: Flag for verification against official sources
### 🚩 Red Flag 5: Incomplete Enumeration
**Warning**: Verification checklist lists 4 items, output covers 2
**Problem**: Systematic gaps in coverage
**Action**: Ensure all enumerated items addressed or marked "not found"
## Continuous Improvement
When research gaps occur:
1. **Document the gap**
- What was missed or incorrect?
- What was the actual correct information?
- What was the impact?
2. **Root cause analysis**
- Why wasn't it caught?
- Which verification step would have prevented it?
- What pattern does this reveal?
3. **Update this document**
- Add new pitfall entry
- Update relevant checklists
- Share lesson learned
## Quick Reference Checklist
Before submitting research, verify:
- [ ] All enumerated items investigated (not just some)
- [ ] Negative claims verified with official docs
- [ ] Multiple sources cross-referenced for critical claims
- [ ] URLs provided for all official documentation
- [ ] Publication dates checked (prefer recent/current)
- [ ] Tool/environment-specific variations documented
- [ ] Confidence levels assigned honestly
- [ ] Assumptions distinguished from verified facts
- [ ] "What might I have missed?" review completed
---
**Living Document**: Update after each significant research gap
**Lessons From**: MCP configuration research gap (missed `.mcp.json`)

View File

@@ -0,0 +1,117 @@
<overview>
Standard SUMMARY.md structure for all prompt outputs. Every executed prompt creates this file for human scanning.
</overview>
<template>
```markdown
# {Topic} {Purpose} Summary
**{Substantive one-liner describing outcome}**
## Version
{v1 or "v2 (refined from v1)"}
## Changes from Previous
{Only include if v2+, otherwise omit this section}
## Key Findings
- {Most important finding or action}
- {Second key item}
- {Third key item}
## Files Created
{Only include for Do prompts}
- `path/to/file.ts` - Description
## Decisions Needed
{Specific actionable decisions requiring user input, or "None"}
## Blockers
{External impediments preventing progress, or "None"}
## Next Step
{Concrete forward action}
---
*Confidence: {High|Medium|Low}*
*Iterations: {n}*
*Full output: {filename.md}* (omit for Do prompts)
```
</template>
<field_requirements>
<one_liner>
Must be substantive - describes actual outcome, not status.
**Good**: "JWT with jose library and httpOnly cookies recommended"
**Bad**: "Research completed"
**Good**: "4-phase implementation: types → JWT core → refresh → tests"
**Bad**: "Plan created"
**Good**: "JWT middleware complete with 6 files in src/auth/"
**Bad**: "Implementation finished"
</one_liner>
<key_findings>
Purpose-specific content:
- **Research**: Key recommendations and discoveries
- **Plan**: Phase overview with objectives
- **Do**: What was implemented, patterns used
- **Refine**: What improved from previous version
</key_findings>
<decisions_needed>
Actionable items requiring user judgment:
- Architectural choices
- Tradeoff confirmations
- Assumption validation
- Risk acceptance
Must be specific: "Approve 15-minute token expiry" not "review recommended"
</decisions_needed>
<blockers>
External impediments (rare):
- Access issues
- Missing dependencies
- Environment problems
Most prompts have "None" - only flag genuine problems.
</blockers>
<next_step>
Concrete action:
- "Create auth-plan.md"
- "Execute Phase 1 prompt"
- "Run tests"
Not vague: "proceed to next phase"
</next_step>
</field_requirements>
<purpose_variations>
<research_summary>
Emphasize: Key recommendation, decision readiness
Next step typically: Create plan
</research_summary>
<plan_summary>
Emphasize: Phase breakdown, assumptions needing validation
Next step typically: Execute first phase
</plan_summary>
<do_summary>
Emphasize: Files created, test status
Next step typically: Run tests or execute next phase
</do_summary>
<refine_summary>
Emphasize: What improved, version number
Include: Changes from Previous section
</refine_summary>
</purpose_variations>

View File

@@ -0,0 +1,291 @@
# create-plans
**Hierarchical project planning optimized for solo developer + Claude**
Create executable plans that Claude can run, not enterprise documentation that sits unused.
## Philosophy
**You are the visionary. Claude is the builder.**
No teams. No stakeholders. No ceremonies. No coordination overhead.
Plans are written AS prompts (PLAN.md IS the execution prompt), not documentation that gets transformed into prompts later.
## Quick Start
```
Skill("create-plans")
```
The skill will:
1. Scan for existing planning structure
2. Check for git repo (offers to initialize)
3. Present context-aware options
4. Guide you through the appropriate workflow
## Planning Hierarchy
```
BRIEF.md → Human vision (what and why)
ROADMAP.md → Phase structure (high-level plan)
RESEARCH.md → Research prompt (for unknowns - optional)
FINDINGS.md → Research output (if research done)
PLAN.md → THE PROMPT (Claude executes this)
SUMMARY.md → Outcome (existence = phase complete)
```
## Directory Structure
All planning artifacts go in `.planning/`:
```
.planning/
├── BRIEF.md # Project vision
├── ROADMAP.md # Phase structure + tracking
└── phases/
├── 01-foundation/
│ ├── PLAN.md # THE PROMPT (execute this)
│ ├── SUMMARY.md # Outcome (exists = done)
│ └── .continue-here.md # Handoff (temporary)
└── 02-auth/
├── RESEARCH.md # Research prompt (if needed)
├── FINDINGS.md # Research output
├── PLAN.md # Execute prompt
└── SUMMARY.md
```
## Workflows
### Starting a New Project
1. Invoke skill
2. Choose "Start new project"
3. Answer questions about vision/goals
4. Skill creates BRIEF.md
5. Optionally create ROADMAP.md with phases
6. Plan first phase
### Planning a Phase
1. Skill reads BRIEF + ROADMAP
2. Loads domain expertise if applicable (see Domain Skills below)
3. If phase has unknowns → create RESEARCH.md first
4. Creates PLAN.md (the executable prompt)
5. You review or execute
### Executing a Phase
1. Skill reads PLAN.md
2. Executes each task with verification
3. Creates SUMMARY.md when complete
4. Git commits phase completion
5. Offers to plan next phase
### Pausing Work (Handoff)
1. Choose "Create handoff"
2. Skill creates `.continue-here.md` with full context
3. When resuming, skill loads handoff and continues
## Domain Skills (Optional)
**What are domain skills?**
Full-fledged agent skills that exhaustively document how to build in a specific framework/platform. They make your plans concrete instead of generic.
**Without domain skill:**
```
Task: Create authentication system
Action: Implement user login
```
Generic. Not helpful.
**With domain skill (macOS apps):**
```
Task: Create login window
Files: Sources/Views/LoginView.swift
Action: SwiftUI view with @Bindable for User model. TextField for username/password.
SecureField for password (uses system keychain). Submit button triggers validation
logic. Use @FocusState for tab order. Add Command-L keyboard shortcut.
Verify: xcodebuild test && open App.app (check tab order, keychain storage)
```
Specific. Executable. Framework-appropriate.
**Structure of domain skills:**
```
~/.claude/skills/expertise/[domain]/
├── SKILL.md # Router + essential principles
├── workflows/ # build-new-app, add-feature, debug-app, etc.
└── references/ # Exhaustive domain knowledge (often 10k+ lines)
```
**Domain skills are dual-purpose:**
1. **Standalone skills** - Invoke with `Skill("build-macos-apps")` for guided development
2. **Context for create-plans** - Loaded automatically when planning that domain
**Example domains:**
- `macos-apps` - Swift/SwiftUI macOS (19 references, 10k+ lines)
- `iphone-apps` - Swift/SwiftUI iOS
- `unity-games` - Unity game development
- `swift-midi-apps` - MIDI/audio apps
- `with-agent-sdk` - Claude Agent SDK apps
- `nextjs-ecommerce` - Next.js e-commerce
**How it works:**
1. Skill infers domain from your request ("build a macOS app" → build-macos-apps)
2. Before creating PLAN.md, reads all `~/.claude/skills/expertise/macos-apps/references/*.md`
3. Uses that exhaustive knowledge to write framework-specific tasks
4. Result: Plans that match your actual tech stack with all the details
**What if you don't have domain skills?**
Skill works fine without them - proceeds with general planning. But tasks will be more generic and require more clarification during execution.
### Creating a Domain Skill
Domain skills are created with [create-agent-skills](../create-agent-skills/) skill.
**Process:**
1. `Skill("create-agent-skills")` → choose "Build a new skill"
2. Name: `build-[your-domain]`
3. Description: "Build [framework/platform] apps. Full lifecycle - build, debug, test, optimize, ship."
4. Ask it to create exhaustive references covering:
- Architecture patterns
- Project scaffolding
- Common features (data, networking, UI)
- Testing and debugging
- Platform-specific conventions
- CLI workflow (how to build/run without IDE)
- Deployment/shipping
**The skill should be comprehensive** - 5k-10k+ lines documenting everything about building in that domain. When create-plans loads it, the resulting PLAN.md tasks will be detailed and executable.
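If you want to sketch the folder layout by hand before filling it in, the scaffold is just a few directories (the domain name is a placeholder):
```bash
# Hypothetical scaffold for a new domain skill (name is a placeholder)
domain=nextjs-ecommerce
mkdir -p ~/.claude/skills/expertise/$domain/{references,workflows}
touch ~/.claude/skills/expertise/$domain/SKILL.md
```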
## Quality Controls
Research prompts include systematic verification to prevent gaps:
- **Verification checklists** - Enumerate ALL options before researching
- **Blind spots review** - "What might I have missed?"
- **Critical claims audit** - Verify "X is not possible" with sources
- **Quality reports** - Distinguish verified facts from assumptions
- **Streaming writes** - Write incrementally to prevent token limit failures
See `references/research-pitfalls.md` for known mistakes and prevention.
## Key Principles
### Solo Developer + Claude
Planning for ONE person (you) and ONE implementer (Claude). No team coordination, stakeholder management, or enterprise processes.
### Plans Are Prompts
PLAN.md IS the execution prompt. It contains objective, context (@file references), tasks (Files/Action/Verify/Done), and verification steps.
### Ship Fast, Iterate Fast
Plan → Execute → Ship → Learn → Repeat. No multi-week timelines, approval gates, or sprint ceremonies.
### Context Awareness
Monitors token usage:
- **25% remaining**: Mentions context getting full
- **15% remaining**: Pauses, offers handoff
- **10% remaining**: Auto-creates handoff, stops
Never starts large operations below 15% without confirmation.
### User Gates
Pauses at critical decision points:
- Before writing PLAN.md (confirm breakdown)
- After low-confidence research
- On verification failures
- When previous phase had issues
See `references/user-gates.md` for full gate patterns.
### Git Versioning
All planning artifacts are version controlled. Commits outcomes, not process:
- Initialization commit (BRIEF + ROADMAP)
- Phase completion commits (PLAN + SUMMARY + code)
- Handoff commits (when pausing work)
Git log becomes project history.
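A phase-completion commit might look like this (paths and message are illustrative):
```bash
# Illustrative phase-completion commit - planning artifacts plus the code they produced
git add .planning/phases/01-foundation/ src/
git commit -m "Complete phase 01-foundation: schema, client setup, API routes"
```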
## Anti-Patterns
This skill NEVER includes:
- Team structures, roles, RACI matrices
- Stakeholder management, alignment meetings
- Sprint ceremonies, standups, retros
- Multi-week estimates, resource allocation
- Change management, governance processes
- Documentation for documentation's sake
If it sounds like corporate PM theater, it doesn't belong.
## Files Reference
### Structure
- `references/directory-structure.md` - Planning directory layout
- `references/hierarchy-rules.md` - How levels build on each other
### Formats
- `references/plan-format.md` - PLAN.md structure
- `references/handoff-format.md` - Context handoff structure
### Patterns
- `references/context-scanning.md` - How skill understands current state
- `references/context-management.md` - Token usage monitoring
- `references/user-gates.md` - When to pause and ask
- `references/git-integration.md` - Version control patterns
- `references/research-pitfalls.md` - Known research mistakes
### Templates
- `templates/brief.md` - Project vision document
- `templates/roadmap.md` - Phase structure
- `templates/phase-prompt.md` - Executable phase prompt (PLAN.md)
- `templates/research-prompt.md` - Research prompt (RESEARCH.md)
- `templates/summary.md` - Phase outcome (SUMMARY.md)
- `templates/continue-here.md` - Context handoff
### Workflows
- `workflows/create-brief.md` - Create project vision
- `workflows/create-roadmap.md` - Define phases from brief
- `workflows/plan-phase.md` - Create executable phase prompt
- `workflows/execute-phase.md` - Run phase, create summary
- `workflows/research-phase.md` - Create and run research
- `workflows/plan-chunk.md` - Plan immediate next tasks
- `workflows/transition.md` - Mark phase complete, advance
- `workflows/handoff.md` - Create context handoff for pausing
- `workflows/resume.md` - Load handoff, restore context
- `workflows/get-guidance.md` - Help decide planning approach
## Example Domain Skill
See `build/example-nextjs/` for a minimal domain skill showing:
- Framework-specific patterns
- Project structure conventions
- Common commands
- Phase breakdown strategies
- Task specificity guidelines
Use this as a template for creating your own domain skills.
## Success Criteria
Planning skill succeeds when:
- Context scan runs before intake
- Appropriate workflow selected based on state
- PLAN.md IS the executable prompt (not separate doc)
- Hierarchy is maintained (brief → roadmap → phase)
- Handoffs preserve full context for resumption
- Context limits respected (auto-handoff at 10%)
- Quality controls prevent research gaps
- Streaming writes prevent token limit failures

View File

@@ -0,0 +1,488 @@
---
name: create-plans
description: Create hierarchical project plans optimized for solo agentic development. Use when planning projects, phases, or tasks that Claude will execute. Produces Claude-executable plans with verification criteria, not enterprise documentation. Handles briefs, roadmaps, phase plans, and context handoffs.
---
<essential_principles>
<principle name="solo_developer_plus_claude">
You are planning for ONE person (the user) and ONE implementer (Claude).
No teams. No stakeholders. No ceremonies. No coordination overhead.
The user is the visionary/product owner. Claude is the builder.
</principle>
<principle name="plans_are_prompts">
PLAN.md is not a document that gets transformed into a prompt.
PLAN.md IS the prompt. It contains:
- Objective (what and why)
- Context (@file references)
- Tasks (type, files, action, verify, done, checkpoints)
- Verification (overall checks)
- Success criteria (measurable)
- Output (SUMMARY.md specification)
When planning a phase, you are writing the prompt that will execute it.
</principle>
<principle name="scope_control">
Plans must complete within ~50% of context usage to maintain consistent quality.
**The quality degradation curve:**
- 0-30% context: Peak quality (comprehensive, thorough, no anxiety)
- 30-50% context: Good quality (engaged, manageable pressure)
- 50-70% context: Degrading quality (efficiency mode, compression)
- 70%+ context: Poor quality (self-lobotomization, rushed work)
**Critical insight:** Claude doesn't degrade at 80% - it degrades at ~40-50% when it sees context mounting and enters "completion mode." By 80%, quality has already crashed.
**Solution:** Aggressive atomicity - split phases into many small, focused plans.
Examples:
- `01-01-PLAN.md` - Phase 1, Plan 1 (2-3 tasks: database schema only)
- `01-02-PLAN.md` - Phase 1, Plan 2 (2-3 tasks: database client setup)
- `01-03-PLAN.md` - Phase 1, Plan 3 (2-3 tasks: API routes)
- `01-04-PLAN.md` - Phase 1, Plan 4 (2-3 tasks: UI components)
Each plan is independently executable, verifiable, and scoped to **2-3 tasks maximum**.
**Atomic task principle:** Better to have 10 small, high-quality plans than 3 large, degraded plans. Each commit should be surgical, focused, and maintainable.
**Autonomous execution:** Plans without checkpoints execute via subagent with fresh context - impossible to degrade.
See: references/scope-estimation.md
</principle>
<principle name="human_checkpoints">
**Claude automates everything that has a CLI or API.** Checkpoints are for verification and decisions, not manual work.
**Checkpoint types:**
- `checkpoint:human-verify` - Human confirms Claude's automated work (visual checks, UI verification)
- `checkpoint:decision` - Human makes implementation choice (auth provider, architecture)
**Rarely needed:** `checkpoint:human-action` - Only for actions with no CLI/API (email verification links, account approvals requiring web login with 2FA)
**Critical rule:** If Claude CAN do it via CLI/API/tool, Claude MUST do it. Never ask human to:
- Deploy to Vercel/Railway/Fly (use CLI)
- Create Stripe webhooks (use CLI/API)
- Run builds/tests (use Bash)
- Write .env files (use Write tool)
- Create database resources (use provider CLI)
**Protocol:** Claude automates work → reaches checkpoint:human-verify → presents what was done → waits for confirmation → resumes
See: references/checkpoints.md, references/cli-automation.md
</principle>
<principle name="deviation_rules">
Plans are guides, not straitjackets. Real development always involves discoveries.
**During execution, deviations are handled automatically via 5 embedded rules:**
1. **Auto-fix bugs** - Broken behavior → fix immediately, document in Summary
2. **Auto-add missing critical** - Security/correctness gaps → add immediately, document
3. **Auto-fix blockers** - Can't proceed → fix immediately, document
4. **Ask about architectural** - Major structural changes → stop and ask user
5. **Log enhancements** - Nice-to-haves → auto-log to ISSUES.md, continue
**No user intervention needed for Rules 1-3, 5.** Only Rule 4 (architectural) requires user decision.
**All deviations documented in Summary** with: what was found, what rule applied, what was done, commit hash.
**Result:** Flow never breaks. Bugs get fixed. Scope stays controlled. Complete transparency.
See: workflows/execute-phase.md (deviation_rules section)
</principle>
<principle name="ship_fast_iterate_fast">
No enterprise process. No approval gates. No multi-week timelines.
Plan → Execute → Ship → Learn → Repeat.
**Milestone-driven:** Ship v1.0 → mark milestone → plan v1.1 → ship → repeat.
Milestones mark shipped versions and enable continuous iteration.
</principle>
<principle name="milestone_boundaries">
Milestones mark shipped versions (v1.0, v1.1, v2.0).
**Purpose:**
- Historical record in MILESTONES.md (what shipped when)
- Greenfield → Brownfield transition marker
- Git tags for releases
- Clear completion rituals
**Default approach:** Extend existing roadmap with new phases.
- v1.0 ships (phases 1-4) → add phases 5-6 for v1.1
- Continuous phase numbering (01-99)
- Milestone groupings keep roadmap organized
**Archive ONLY for:** Separate codebases or complete rewrites (rare).
See: references/milestone-management.md
</principle>
<principle name="anti_enterprise_patterns">
NEVER include in plans:
- Team structures, roles, RACI matrices
- Stakeholder management, alignment meetings
- Sprint ceremonies, standups, retros
- Multi-week estimates, resource allocation
- Change management, governance processes
- Documentation for documentation's sake
If it sounds like corporate PM theater, delete it.
</principle>
<principle name="context_awareness">
Monitor token usage via system warnings.
**At 25% remaining**: Mention context getting full
**At 15% remaining**: Pause, offer handoff
**At 10% remaining**: Auto-create handoff, stop
Never start large operations below 15% without user confirmation.
</principle>
<principle name="user_gates">
Never charge ahead at critical decision points. Use gates:
- **AskUserQuestion**: Structured choices (2-4 options)
- **Inline questions**: Simple confirmations
- **Decision gate loop**: "Ready, or ask more questions?"
Mandatory gates:
- Before writing PLAN.md (confirm breakdown)
- After low-confidence research
- On verification failures
- After phase completion with issues
- Before starting next phase with previous issues
See: references/user-gates.md
</principle>
<principle name="git_versioning">
All planning artifacts are version controlled. Commit outcomes, not process.
- Check for repo on invocation, offer to initialize
- Commit only at: initialization, phase completion, handoff
- Intermediate artifacts (PLAN.md, RESEARCH.md, FINDINGS.md) NOT committed separately
- Git log becomes project history
See: references/git-integration.md
</principle>
</essential_principles>
<context_scan>
**Run on every invocation** to understand current state:
```bash
# Check git status
git rev-parse --git-dir 2>/dev/null || echo "NO_GIT_REPO"
# Check for planning structure
ls -la .planning/ 2>/dev/null
ls -la .planning/phases/ 2>/dev/null
# Find any continue-here files
find . -name ".continue-here.md" -type f 2>/dev/null
# Check for existing artifacts
[ -f .planning/BRIEF.md ] && echo "BRIEF: exists"
[ -f .planning/ROADMAP.md ] && echo "ROADMAP: exists"
```
**If NO_GIT_REPO detected:**
Inline question: "No git repo found. Initialize one? (Recommended for version control)"
If yes: `git init`
**Present findings before intake question.**
</context_scan>
<domain_expertise>
**Domain expertise lives in `~/.claude/skills/expertise/`**
Before creating roadmap or phase plans, determine if domain expertise should be loaded.
<scan_domains>
```bash
ls ~/.claude/skills/expertise/ 2>/dev/null
```
This reveals available domain expertise (e.g., macos-apps, iphone-apps, unity-games, nextjs-ecommerce).
**If no domain skills found:** Proceed without domain expertise (graceful degradation). The skill works fine without domain-specific context.
</scan_domains>
<inference_rules>
If user's request contains domain keywords, INFER the domain:
| Keywords | Domain Skill |
|----------|--------------|
| "macOS", "Mac app", "menu bar", "AppKit", "SwiftUI desktop" | expertise/macos-apps |
| "iPhone", "iOS", "iPad", "mobile app", "SwiftUI mobile" | expertise/iphone-apps |
| "Unity", "game", "C#", "3D game", "2D game" | expertise/unity-games |
| "MIDI", "MIDI tool", "sequencer", "MIDI controller", "music app", "MIDI 2.0", "MPE", "SysEx" | expertise/midi |
| "Agent SDK", "Claude SDK", "agentic app" | expertise/with-agent-sdk |
| "Python automation", "workflow", "API integration", "webhooks", "Celery", "Airflow", "Prefect" | expertise/python-workflow-automation |
| "UI", "design", "frontend", "interface", "responsive", "visual design", "landing page", "website design", "Tailwind", "CSS", "web design" | expertise/ui-design |
If domain inferred, confirm:
```
Detected: [domain] project → expertise/[skill-name]
Load this expertise for planning? (Y / see other options / none)
```
</inference_rules>
<no_inference>
If no domain obvious from request, present options:
```
What type of project is this?
Available domain expertise:
1. macos-apps - Native macOS with Swift/SwiftUI
2. iphone-apps - Native iOS with Swift/SwiftUI
3. unity-games - Unity game development
4. swift-midi-apps - MIDI/audio apps
5. with-agent-sdk - Claude Agent SDK apps
6. ui-design - Stunning UI/UX design & frontend development
[... any others found in expertise/]
N. None - proceed without domain expertise
C. Create domain skill first
Select:
```
</no_inference>
<load_domain>
When domain selected, use intelligent loading:
**Step 1: Read domain SKILL.md**
```bash
cat ~/.claude/skills/expertise/[domain]/SKILL.md 2>/dev/null
```
This loads core principles and routing guidance (~5k tokens).
**Step 2: Determine what references are needed**
Domain SKILL.md should contain a `<references_index>` section that maps planning contexts to specific references.
Example:
```markdown
<references_index>
**For database/persistence phases:** references/core-data.md, references/swift-concurrency.md
**For UI/layout phases:** references/swiftui-layout.md, references/appleHIG.md
**For system integration:** references/appkit-integration.md
**Always useful:** references/swift-conventions.md
</references_index>
```
**Step 3: Load only relevant references**
Based on the phase being planned (from ROADMAP), load ONLY the references mentioned for that type of work.
```bash
# Example: Planning a database phase
cat ~/.claude/skills/expertise/macos-apps/references/core-data.md
cat ~/.claude/skills/expertise/macos-apps/references/swift-conventions.md
```
**Context efficiency:**
- SKILL.md only: ~5k tokens
- SKILL.md + selective references: ~8-12k tokens
- All references (old approach): ~20-27k tokens
Announce: "Loaded [domain] expertise ([X] references for [phase-type])."
**If domain skill not found:** Inform user and offer to proceed without domain expertise.
**If SKILL.md doesn't have references_index:** Fall back to loading all references with warning about context usage.
</load_domain>
<when_to_load>
Domain expertise should be loaded BEFORE:
- Creating roadmap (phases should be domain-appropriate)
- Planning phases (tasks must be domain-specific)
Domain expertise is NOT needed for:
- Creating brief (vision is domain-agnostic)
- Resuming from handoff (context already established)
- Transition between phases (just updating status)
</when_to_load>
</domain_expertise>
<intake>
Based on scan results, present context-aware options:
**If handoff found:**
```
Found handoff: .planning/phases/XX/.continue-here.md
[Summary of state from handoff]
1. Resume from handoff
2. Discard handoff, start fresh
3. Different action
```
**If planning structure exists:**
```
Project: [from BRIEF or directory]
Brief: [exists/missing]
Roadmap: [X phases defined]
Current: [phase status]
What would you like to do?
1. Plan next phase
2. Execute current phase
3. Create handoff (stopping for now)
4. View/update roadmap
5. Something else
```
**If no planning structure:**
```
No planning structure found.
What would you like to do?
1. Start new project (create brief)
2. Create roadmap from existing brief
3. Jump straight to phase planning
4. Get guidance on approach
```
**Wait for response before proceeding.**
</intake>
<routing>
| Response | Workflow |
|----------|----------|
| "brief", "new project", "start", 1 (no structure) | `workflows/create-brief.md` |
| "roadmap", "phases", 2 (no structure) | `workflows/create-roadmap.md` |
| "phase", "plan phase", "next phase", 1 (has structure) | `workflows/plan-phase.md` |
| "chunk", "next tasks", "what's next" | `workflows/plan-chunk.md` |
| "execute", "run", "do it", "build it", 2 (has structure) | **EXIT SKILL** → Use `/run-plan <path>` slash command |
| "research", "investigate", "unknowns" | `workflows/research-phase.md` |
| "handoff", "pack up", "stopping", 3 (has structure) | `workflows/handoff.md` |
| "resume", "continue", 1 (has handoff) | `workflows/resume.md` |
| "transition", "complete", "done", "next" | `workflows/transition.md` |
| "milestone", "ship", "v1.0", "release" | `workflows/complete-milestone.md` |
| "guidance", "help", 4 | `workflows/get-guidance.md` |
**Critical:** Plan execution should NOT invoke this skill. Use `/run-plan` for context efficiency (skill loads ~20k tokens, /run-plan loads ~5-7k).
**After reading the workflow, follow it exactly.**
</routing>
<hierarchy>
The planning hierarchy (each level builds on previous):
```
BRIEF.md → Human vision (you read this)
ROADMAP.md → Phase structure (overview)
RESEARCH.md → Research prompt (optional, for unknowns)
FINDINGS.md → Research output (if research done)
PLAN.md → THE PROMPT (Claude executes this)
SUMMARY.md → Outcome (existence = phase complete)
```
**Rules:**
- Roadmap requires Brief (or prompts to create one)
- Phase plan requires Roadmap (knows phase scope)
- PLAN.md IS the execution prompt
- SUMMARY.md existence marks phase complete
- Each level can look UP for context
</hierarchy>
<output_structure>
All planning artifacts go in `.planning/`:
```
.planning/
├── BRIEF.md # Human vision
├── ROADMAP.md # Phase structure + tracking
└── phases/
├── 01-foundation/
│ ├── 01-01-PLAN.md # Plan 1: Database setup
│ ├── 01-01-SUMMARY.md # Outcome (exists = done)
│ ├── 01-02-PLAN.md # Plan 2: API routes
│ ├── 01-02-SUMMARY.md
│ ├── 01-03-PLAN.md # Plan 3: UI components
│ └── .continue-here-01-03.md # Handoff (temporary, if needed)
└── 02-auth/
├── 02-01-RESEARCH.md # Research prompt (if needed)
├── 02-01-FINDINGS.md # Research output
├── 02-02-PLAN.md # Implementation prompt
└── 02-02-SUMMARY.md
```
**Naming convention:**
- Plans: `{phase}-{plan}-PLAN.md` (e.g., 01-03-PLAN.md)
- Summaries: `{phase}-{plan}-SUMMARY.md` (e.g., 01-03-SUMMARY.md)
- Phase folders: `{phase}-{name}/` (e.g., 01-foundation/)
Files sort chronologically. Related artifacts (plan + summary) are adjacent.
</output_structure>
<reference_index>
All in `references/`:
**Structure:** directory-structure.md, hierarchy-rules.md
**Formats:** handoff-format.md, plan-format.md
**Patterns:** context-scanning.md, context-management.md
**Planning:** scope-estimation.md, checkpoints.md, milestone-management.md
**Process:** user-gates.md, git-integration.md, research-pitfalls.md
**Domain:** domain-expertise.md (guide for creating context-efficient domain skills)
</reference_index>
<templates_index>
All in `templates/`:
| Template | Purpose |
|----------|---------|
| brief.md | Project vision document with current state |
| roadmap.md | Phase structure with milestone groupings |
| phase-prompt.md | Executable phase prompt (PLAN.md) |
| research-prompt.md | Research prompt (RESEARCH.md) |
| summary.md | Phase outcome (SUMMARY.md) with deviations |
| milestone.md | Milestone entry for MILESTONES.md |
| issues.md | Deferred enhancements log (ISSUES.md) |
| continue-here.md | Context handoff format |
</templates_index>
<workflows_index>
All in `workflows/`:
| Workflow | Purpose |
|----------|---------|
| create-brief.md | Create project vision document |
| create-roadmap.md | Define phases from brief |
| plan-phase.md | Create executable phase prompt |
| execute-phase.md | Run phase prompt, create summary |
| research-phase.md | Create and run research prompt |
| plan-chunk.md | Plan immediate next tasks |
| transition.md | Mark phase complete, advance |
| complete-milestone.md | Mark shipped version, create milestone entry |
| handoff.md | Create context handoff for pausing |
| resume.md | Load handoff, restore context |
| get-guidance.md | Help decide planning approach |
</workflows_index>
<success_criteria>
Planning skill succeeds when:
- Context scan runs before intake
- Appropriate workflow selected based on state
- PLAN.md IS the executable prompt (not separate)
- Hierarchy is maintained (brief → roadmap → phase)
- Handoffs preserve full context for resumption
- Context limits are respected (auto-handoff at 10%)
- Deviations handled automatically per embedded rules
- All work (planned and discovered) fully documented
- Domain expertise loaded intelligently (SKILL.md + selective references, not all files)
- Plan execution uses /run-plan command (not skill invocation)
</success_criteria>

View File

@@ -0,0 +1,584 @@
# Human Checkpoints in Plans
Plans execute autonomously. Checkpoints formalize the interaction points where human verification or decisions are needed.
**Core principle:** Claude automates everything with CLI/API. Checkpoints are for verification and decisions, not manual work.
## Checkpoint Types
### 1. `checkpoint:human-verify` (Most Common)
**When:** Claude completed automated work, human confirms it works correctly.
**Use for:**
- Visual UI checks (layout, styling, responsiveness)
- Interactive flows (click through wizard, test user flows)
- Functional verification (feature works as expected)
- Audio/video playback quality
- Animation smoothness
- Accessibility testing
**Structure:**
```xml
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[What Claude automated and deployed/built]</what-built>
<how-to-verify>
[Exact steps to test - URLs, commands, expected behavior]
</how-to-verify>
<resume-signal>[How to continue - "approved", "yes", or describe issues]</resume-signal>
</task>
```
**Key elements:**
- `<what-built>`: What Claude automated (deployed, built, configured)
- `<how-to-verify>`: Exact steps to confirm it works (numbered, specific)
- `<resume-signal>`: Clear indication of how to continue
**Example: Vercel Deployment**
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to create project and deploy. Capture deployment URL from output.</action>
<verify>vercel ls shows deployment, curl {url} returns 200</verify>
<done>App deployed, URL captured</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Deployed to Vercel at https://myapp-abc123.vercel.app</what-built>
<how-to-verify>
Visit https://myapp-abc123.vercel.app and confirm:
- Homepage loads without errors
- Login form is visible
- No console errors in browser DevTools
</how-to-verify>
<resume-signal>Type "approved" to continue, or describe issues to fix</resume-signal>
</task>
```
**Example: UI Component**
```xml
<task type="auto">
<name>Build responsive dashboard layout</name>
<files>src/components/Dashboard.tsx, src/app/dashboard/page.tsx</files>
<action>Create dashboard with sidebar, header, and content area. Use Tailwind responsive classes for mobile.</action>
<verify>npm run build succeeds, no TypeScript errors</verify>
<done>Dashboard component builds without errors</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Responsive dashboard layout at /dashboard</what-built>
<how-to-verify>
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Desktop (>1024px): Verify sidebar left, content right, header top
4. Tablet (768px): Verify sidebar collapses to hamburger
5. Mobile (375px): Verify single column, bottom nav
6. Check: No layout shift, no horizontal scroll
</how-to-verify>
<resume-signal>Type "approved" or describe layout issues</resume-signal>
</task>
```
**Example: Xcode Build**
```xml
<task type="auto">
<name>Build macOS app with Xcode</name>
<files>App.xcodeproj, Sources/</files>
<action>Run `xcodebuild -project App.xcodeproj -scheme App build`. Check for compilation errors in output.</action>
<verify>Build output contains "BUILD SUCCEEDED", no errors</verify>
<done>App builds successfully</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Built macOS app at DerivedData/Build/Products/Debug/App.app</what-built>
<how-to-verify>
Open App.app and test:
- App launches without crashes
- Menu bar icon appears
- Preferences window opens correctly
- No visual glitches or layout issues
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
### 2. `checkpoint:decision`
**When:** Human must make choice that affects implementation direction.
**Use for:**
- Technology selection (which auth provider, which database)
- Architecture decisions (monorepo vs separate repos)
- Design choices (color scheme, layout approach)
- Feature prioritization (which variant to build)
- Data model decisions (schema structure)
**Structure:**
```xml
<task type="checkpoint:decision" gate="blocking">
<decision>[What's being decided]</decision>
<context>[Why this decision matters]</context>
<options>
<option id="option-a">
<name>[Option name]</name>
<pros>[Benefits]</pros>
<cons>[Tradeoffs]</cons>
</option>
<option id="option-b">
<name>[Option name]</name>
<pros>[Benefits]</pros>
<cons>[Tradeoffs]</cons>
</option>
</options>
<resume-signal>[How to indicate choice]</resume-signal>
</task>
```
**Key elements:**
- `<decision>`: What's being decided
- `<context>`: Why this matters
- `<options>`: Each option with balanced pros/cons (not prescriptive)
- `<resume-signal>`: How to indicate choice
**Example: Auth Provider Selection**
```xml
<task type="checkpoint:decision" gate="blocking">
<decision>Select authentication provider</decision>
<context>
Need user authentication for the app. Three solid options with different tradeoffs.
</context>
<options>
<option id="supabase">
<name>Supabase Auth</name>
<pros>Built-in with Supabase DB we're using, generous free tier, row-level security integration</pros>
<cons>Less customizable UI, tied to Supabase ecosystem</cons>
</option>
<option id="clerk">
<name>Clerk</name>
<pros>Beautiful pre-built UI, best developer experience, excellent docs</pros>
<cons>Paid after 10k MAU, vendor lock-in</cons>
</option>
<option id="nextauth">
<name>NextAuth.js</name>
<pros>Free, self-hosted, maximum control, widely adopted</pros>
<cons>More setup work, you manage security updates, UI is DIY</cons>
</option>
</options>
<resume-signal>Select: supabase, clerk, or nextauth</resume-signal>
</task>
```
### 3. `checkpoint:human-action` (Rare)
**When:** Action has NO CLI/API and requires human-only interaction, OR Claude hit an authentication gate during automation.
**Use ONLY for:**
- **Authentication gates** - Claude tried to use CLI/API but needs credentials to continue (this is NOT a failure)
- Email verification links (account creation requires clicking email)
- SMS 2FA codes (phone verification)
- Manual account approvals (platform requires human review before API access)
- Credit card 3D Secure flows (web-based payment authorization)
- OAuth app approvals (some platforms require web-based approval)
**Do NOT use for pre-planned manual work:**
- Manually deploying to Vercel (use `vercel` CLI - auth gate if needed)
- Manually creating Stripe webhooks (use Stripe API - auth gate if needed)
- Manually creating databases (use provider CLI - auth gate if needed)
- Running builds/tests manually (use Bash tool)
- Creating files manually (use Write tool)
**Structure:**
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>[What human must do - Claude already did everything automatable]</action>
<instructions>
[What Claude already automated]
[The ONE thing requiring human action]
</instructions>
<verification>[What Claude can check afterward]</verification>
<resume-signal>[How to continue]</resume-signal>
</task>
```
**Key principle:** Claude automates EVERYTHING possible first, only asks human for the truly unavoidable manual step.
**Example: Email Verification**
```xml
<task type="auto">
<name>Create SendGrid account via API</name>
<action>Use SendGrid API to create subuser account with provided email. Request verification email.</action>
<verify>API returns 201, account created</verify>
<done>Account created, verification email sent</done>
</task>
<task type="checkpoint:human-action" gate="blocking">
<action>Complete email verification for SendGrid account</action>
<instructions>
I created the account and requested verification email.
Check your inbox for SendGrid verification link and click it.
</instructions>
<verification>SendGrid API key works: curl test succeeds</verification>
<resume-signal>Type "done" when email verified</resume-signal>
</task>
```
**Example: Credit Card 3D Secure**
```xml
<task type="auto">
<name>Create Stripe payment intent</name>
<action>Use Stripe API to create payment intent for $99. Generate checkout URL.</action>
<verify>Stripe API returns payment intent ID and URL</verify>
<done>Payment intent created</done>
</task>
<task type="checkpoint:human-action" gate="blocking">
<action>Complete 3D Secure authentication</action>
<instructions>
I created the payment intent: https://checkout.stripe.com/pay/cs_test_abc123
Visit that URL and complete the 3D Secure verification flow with your test card.
</instructions>
<verification>Stripe webhook receives payment_intent.succeeded event</verification>
<resume-signal>Type "done" when payment completes</resume-signal>
</task>
```
**Example: Authentication Gate (Dynamic Checkpoint)**
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to deploy</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
<!-- If vercel returns "Error: Not authenticated", Claude creates checkpoint on the fly -->
<task type="checkpoint:human-action" gate="blocking">
<action>Authenticate Vercel CLI so I can continue deployment</action>
<instructions>
I tried to deploy but got authentication error.
Run: vercel login
This will open your browser - complete the authentication flow.
</instructions>
<verification>vercel whoami returns your account email</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<!-- After authentication, Claude retries the deployment -->
<task type="auto">
<name>Retry Vercel deployment</name>
<action>Run `vercel --yes` (now authenticated)</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
```
**Key distinction:** Authentication gates are created dynamically when Claude encounters auth errors during automation. They're NOT pre-planned - Claude tries to automate first, only asks for credentials when blocked.
See references/cli-automation.md "Authentication Gates" section for more examples and full protocol.
## Execution Protocol
When Claude encounters `type="checkpoint:*"`:
1. **Stop immediately** - do not proceed to next task
2. **Display checkpoint clearly:**
```
════════════════════════════════════════
CHECKPOINT: [Type]
════════════════════════════════════════
Task [X] of [Y]: [Name]
[Display checkpoint-specific content]
[Resume signal instruction]
════════════════════════════════════════
```
3. **Wait for user response** - do not hallucinate completion
4. **Verify if possible** - check files, run tests, whatever is specified
5. **Resume execution** - continue to next task only after confirmation
**For checkpoint:human-verify:**
```
════════════════════════════════════════
CHECKPOINT: Verification Required
════════════════════════════════════════
Task 5 of 8: Responsive dashboard layout
I built: Responsive dashboard at /dashboard
How to verify:
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Test: Resize browser window to mobile/tablet/desktop
4. Confirm: No layout shift, proper responsive behavior
Type "approved" to continue, or describe issues.
════════════════════════════════════════
```
**For checkpoint:decision:**
```
════════════════════════════════════════
CHECKPOINT: Decision Required
════════════════════════════════════════
Task 2 of 6: Select authentication provider
Decision: Which auth provider should we use?
Context: Need user authentication. Three options with different tradeoffs.
Options:
1. supabase - Built-in with our DB, free tier
2. clerk - Best DX, paid after 10k users
3. nextauth - Self-hosted, maximum control
Select: supabase, clerk, or nextauth
════════════════════════════════════════
```
## Writing Good Checkpoints
**DO:**
- Automate everything with CLI/API before checkpoint
- Be specific: "Visit https://myapp.vercel.app" not "check deployment"
- Number verification steps: easier to follow
- State expected outcomes: "You should see X"
- Provide context: why this checkpoint exists
- Make verification executable: clear, testable steps
**DON'T:**
- Ask human to do work Claude can automate (deploy, create resources, run builds)
- Assume knowledge: "Configure the usual settings" ❌
- Skip steps: "Set up database" ❌ (too vague)
- Mix multiple verifications in one checkpoint (split them)
- Make verification impossible (Claude can't check visual appearance without user confirmation)
## When to Use Checkpoints
**Use checkpoint:human-verify for:**
- Visual verification (UI, layouts, animations)
- Interactive testing (click flows, user journeys)
- Quality checks (audio/video playback, animation smoothness)
- Confirming deployed apps are accessible
**Use checkpoint:decision for:**
- Technology selection (auth providers, databases, frameworks)
- Architecture choices (monorepo, deployment strategy)
- Design decisions (color schemes, layout approaches)
- Feature prioritization
**Use checkpoint:human-action for:**
- Email verification links (no API)
- SMS 2FA codes (no API)
- Manual approvals with no automation
- 3D Secure payment flows
**Don't use checkpoints for:**
- Things Claude can verify programmatically (tests pass, build succeeds)
- File operations (Claude can read files to verify)
- Code correctness (use tests and static analysis)
- Anything automatable via CLI/API
## Checkpoint Placement
Place checkpoints:
- **After automation completes** - not before Claude does the work
- **After UI buildout** - before declaring phase complete
- **Before dependent work** - decisions before implementation
- **At integration points** - after configuring external services
Bad placement:
- Before Claude automates (asking human to do automatable work) ❌
- Too frequent (every other task is a checkpoint) ❌
- Too late (checkpoint is last task, but earlier tasks needed its result) ❌
## Complete Examples
### Example 1: Deployment Flow (Correct)
```xml
<!-- Claude automates everything -->
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json, package.json</files>
<action>
1. Run `vercel --yes` to create project and deploy
2. Capture deployment URL from output
3. Set environment variables with `vercel env add`
4. Trigger production deployment with `vercel --prod`
</action>
<verify>
- vercel ls shows deployment
- curl {url} returns 200
- Environment variables set correctly
</verify>
<done>App deployed to production, URL captured</done>
</task>
<!-- Human verifies visual/functional correctness -->
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Deployed to https://myapp.vercel.app</what-built>
<how-to-verify>
Visit https://myapp.vercel.app and confirm:
- Homepage loads correctly
- All images/assets load
- Navigation works
- No console errors
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
### Example 2: Database Setup (Correct)
```xml
<!-- Claude automates everything -->
<task type="auto">
<name>Create Upstash Redis database</name>
<files>.env</files>
<action>
1. Run `upstash redis create myapp-cache --region us-east-1`
2. Capture connection URL from output
3. Write to .env: UPSTASH_REDIS_URL={url}
4. Verify connection with test command
</action>
<verify>
- upstash redis list shows database
- .env contains UPSTASH_REDIS_URL
- Test connection succeeds
</verify>
<done>Redis database created and configured</done>
</task>
<!-- NO CHECKPOINT NEEDED - Claude automated everything and verified programmatically -->
```
### Example 3: Stripe Webhooks (Correct)
```xml
<!-- Claude automates everything -->
<task type="auto">
<name>Configure Stripe webhooks</name>
<files>.env, src/app/api/webhooks/route.ts</files>
<action>
1. Use Stripe API to create webhook endpoint pointing to /api/webhooks
2. Subscribe to events: payment_intent.succeeded, customer.subscription.updated
3. Save webhook signing secret to .env
4. Implement webhook handler in route.ts
</action>
<verify>
- Stripe API returns webhook endpoint ID
- .env contains STRIPE_WEBHOOK_SECRET
- curl webhook endpoint returns 200
</verify>
<done>Stripe webhooks configured and handler implemented</done>
</task>
<!-- Human verifies in Stripe dashboard -->
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Stripe webhook configured via API</what-built>
<how-to-verify>
Visit Stripe Dashboard > Developers > Webhooks
Confirm: Endpoint shows https://myapp.com/api/webhooks with correct events
</how-to-verify>
<resume-signal>Type "yes" if correct</resume-signal>
</task>
```
## Anti-Patterns
### ❌ BAD: Asking human to automate
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>Deploy to Vercel</action>
<instructions>
1. Visit vercel.com/new
2. Import Git repository
3. Click Deploy
4. Copy deployment URL
</instructions>
<verification>Deployment exists</verification>
<resume-signal>Paste URL</resume-signal>
</task>
```
**Why bad:** Vercel has a CLI. Claude should run `vercel --yes`.
### ✅ GOOD: Claude automates, human verifies
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<action>Run `vercel --yes`. Capture URL.</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Deployed to {url}</what-built>
<how-to-verify>Visit {url}, check homepage loads</how-to-verify>
<resume-signal>Type "approved"</resume-signal>
</task>
```
### ❌ BAD: Too many checkpoints
```xml
<task type="auto">Create schema</task>
<task type="checkpoint:human-verify">Check schema</task>
<task type="auto">Create API route</task>
<task type="checkpoint:human-verify">Check API</task>
<task type="auto">Create UI form</task>
<task type="checkpoint:human-verify">Check form</task>
```
**Why bad:** Verification fatigue. Combine into one checkpoint at end.
### ✅ GOOD: Single verification checkpoint
```xml
<task type="auto">Create schema</task>
<task type="auto">Create API route</task>
<task type="auto">Create UI form</task>
<task type="checkpoint:human-verify">
<what-built>Complete auth flow (schema + API + UI)</what-built>
<how-to-verify>Test full flow: register, login, access protected page</how-to-verify>
<resume-signal>Type "approved"</resume-signal>
</task>
```
### ❌ BAD: Asking for automatable file operations
```xml
<task type="checkpoint:human-action">
<action>Create .env file</action>
<instructions>
1. Create .env in project root
2. Add: DATABASE_URL=...
3. Add: STRIPE_KEY=...
</instructions>
</task>
```
**Why bad:** Claude has Write tool. This should be `type="auto"`.
## Summary
Checkpoints formalize human-in-the-loop points. Use them when Claude cannot complete a task autonomously OR when human verification is required for correctness.
**The golden rule:** If Claude CAN automate it, Claude MUST automate it.
**Checkpoint priority:**
1. **checkpoint:human-verify** (90% of checkpoints) - Claude automated everything, human confirms visual/functional correctness
2. **checkpoint:decision** (9% of checkpoints) - Human makes architectural/technology choices
3. **checkpoint:human-action** (1% of checkpoints) - Truly unavoidable manual steps with no API/CLI
**See also:** references/cli-automation.md for exhaustive list of what Claude can automate.

View File

@@ -0,0 +1,497 @@
# CLI and API Automation Reference
**Core principle:** If it has a CLI or API, Claude does it. Never ask the human to perform manual steps that Claude can automate.
This reference documents what Claude CAN and SHOULD automate during plan execution.
## Deployment Platforms
### Vercel
**CLI:** `vercel`
**What Claude automates:**
- Create and deploy projects: `vercel --yes`
- Set environment variables: `vercel env add KEY production`
- Link to git repo: `vercel link`
- Trigger deployments: `vercel --prod`
- Get deployment URLs: `vercel ls`
- Manage domains: `vercel domains add example.com`
**Never ask human to:**
- Visit vercel.com/new to create project
- Click through dashboard to add env vars
- Manually link repository
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<action>Run `vercel --yes` to deploy. Capture deployment URL.</action>
<verify>vercel ls shows deployment, curl {url} returns 200</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Deployed to {url}</what-built>
<how-to-verify>Visit {url} - check homepage loads</how-to-verify>
<resume-signal>Type "yes" if correct</resume-signal>
</task>
```
### Railway
**CLI:** `railway`
**What Claude automates:**
- Initialize project: `railway init`
- Link to repo: `railway link`
- Deploy: `railway up`
- Set variables: `railway variables set KEY=value`
- Get deployment URL: `railway domain`
### Fly.io
**CLI:** `fly`
**What Claude automates:**
- Launch app: `fly launch --no-deploy`
- Deploy: `fly deploy`
- Set secrets: `fly secrets set KEY=value`
- Scale: `fly scale count 2`
## Payment & Billing
### Stripe
**CLI:** `stripe`
**What Claude automates:**
- Create webhook endpoints: Stripe API (`POST /v1/webhook_endpoints`)
- Forward events to a local dev server: `stripe listen --forward-to localhost:3000/api/webhooks`
- Trigger test events: `stripe trigger payment_intent.succeeded`
- Create products/prices: Stripe API via curl/fetch
- Manage customers: Stripe API via curl/fetch
- Check webhook logs: `stripe webhooks list`
**Never ask human to:**
- Visit dashboard.stripe.com to create webhook
- Click through UI to create products
- Manually copy webhook signing secret
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Configure Stripe webhooks</name>
<action>Use Stripe API to create webhook endpoint at /api/webhooks. Save signing secret to .env.</action>
<verify>stripe webhooks list shows endpoint, .env contains STRIPE_WEBHOOK_SECRET</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Stripe webhook configured</what-built>
<how-to-verify>Check Stripe dashboard > Developers > Webhooks shows endpoint with correct URL</how-to-verify>
<resume-signal>Type "yes" if correct</resume-signal>
</task>
```
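For reference, the "Stripe API" call used in the pattern above might look like this curl sketch (endpoint URL and event list are illustrative - confirm against current Stripe API docs):
```bash
# Create a webhook endpoint via the Stripe API (values are illustrative)
curl https://api.stripe.com/v1/webhook_endpoints \
  -u "$STRIPE_SECRET_KEY:" \
  -d url="https://myapp.com/api/webhooks" \
  -d "enabled_events[]=payment_intent.succeeded" \
  -d "enabled_events[]=customer.subscription.updated"
```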
## Databases & Backend
### Supabase
**CLI:** `supabase`
**What Claude automates:**
- Initialize project: `supabase init`
- Link to remote: `supabase link --project-ref {ref}`
- Create migrations: `supabase migration new {name}`
- Push migrations: `supabase db push`
- Generate types: `supabase gen types typescript`
- Deploy functions: `supabase functions deploy {name}`
**Never ask human to:**
- Visit supabase.com to create project manually
- Click through dashboard to run migrations
- Copy/paste connection strings
**Note:** Project creation may require web dashboard initially (no CLI for initial project creation), but all subsequent work (migrations, functions, etc.) is CLI-automated.
### Upstash (Redis/Kafka)
**CLI:** `upstash`
**What Claude automates:**
- Create Redis database: `upstash redis create {name} --region {region}`
- Get connection details: `upstash redis get {id}`
- Create Kafka cluster: `upstash kafka create {name} --region {region}`
**Never ask human to:**
- Visit console.upstash.com
- Click through UI to create database
- Copy/paste connection URLs manually
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Create Upstash Redis database</name>
<action>Run `upstash redis create myapp-cache --region us-east-1`. Save URL to .env.</action>
<verify>.env contains UPSTASH_REDIS_URL, upstash redis list shows database</verify>
</task>
```
### PlanetScale
**CLI:** `pscale`
**What Claude automates:**
- Create database: `pscale database create {name} --region {region}`
- Create branch: `pscale branch create {db} {branch}`
- Deploy request: `pscale deploy-request create {db} {branch}`
- Connection string: `pscale connect {db} {branch}`
## Version Control & CI/CD
### GitHub
**CLI:** `gh`
**What Claude automates:**
- Create repo: `gh repo create {name} --public/--private`
- Create issues: `gh issue create --title "{title}" --body "{body}"`
- Create PR: `gh pr create --title "{title}" --body "{body}"`
- Manage secrets: `gh secret set {KEY}`
- Trigger workflows: `gh workflow run {name}`
- Check status: `gh run list`
**Never ask human to:**
- Visit github.com to create repo
- Click through UI to add secrets
- Manually create issues/PRs
## Build Tools & Testing
### Node/npm/pnpm/bun
**What Claude automates:**
- Install dependencies: `npm install`, `pnpm install`, `bun install`
- Run builds: `npm run build`
- Run tests: `npm test`, `npm run test:e2e`
- Type checking: `tsc --noEmit`
**Never ask human to:** Run these commands manually
### Xcode (macOS/iOS)
**CLI:** `xcodebuild`
**What Claude automates:**
- Build project: `xcodebuild -project App.xcodeproj -scheme App build`
- Run tests: `xcodebuild test -project App.xcodeproj -scheme App`
- Archive: `xcodebuild archive -project App.xcodeproj -scheme App`
- Check compilation: Parse xcodebuild output for errors
**Never ask human to:**
- Open Xcode and click Product > Build
- Click Product > Test manually
- Check for errors by looking at Xcode UI
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Build macOS app</name>
<action>Run `xcodebuild -project App.xcodeproj -scheme App build`. Check output for errors.</action>
<verify>Build succeeds with "BUILD SUCCEEDED" in output</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Built macOS app at DerivedData/Build/Products/Debug/App.app</what-built>
<how-to-verify>Open App.app and check: login flow works, no visual glitches</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
## Environment Configuration
### .env Files
**Tool:** Write tool
**What Claude automates:**
- Create .env files: Use Write tool
- Append variables: Use Edit tool
- Read current values: Use Read tool
**Never ask human to:**
- Manually create .env file
- Copy/paste values into .env
- Edit .env in text editor
**Pattern:**
```xml
<task type="auto">
<name>Configure environment variables</name>
<action>Write .env file with: DATABASE_URL, STRIPE_KEY, JWT_SECRET (generated).</action>
<verify>Read .env confirms all variables present</verify>
</task>
```
## Email & Communication
### Resend
**API:** Resend API via HTTP
**What Claude automates:**
- Create API keys via dashboard API (if available), or provide one-time setup instructions
- Send emails: Resend API
- Configure domains: Resend API
### SendGrid
**API:** SendGrid API via HTTP
**What Claude automates:**
- Create API keys via API
- Send emails: SendGrid API
- Configure webhooks: SendGrid API
**Note:** Initial account setup may require email verification (checkpoint:human-action), but all subsequent work is API-automated.
## Authentication Gates
**Critical distinction:** When Claude tries to use a CLI/API and gets an authentication error, this is NOT a failure - it's a gate that requires human input to unblock automation.
**Pattern: Claude encounters auth error → creates checkpoint → you authenticate → Claude continues**
### Example: Vercel CLI Not Authenticated
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to deploy</action>
<verify>vercel ls shows deployment</verify>
</task>
<!-- If vercel returns "Error: Not authenticated" -->
<task type="checkpoint:human-action" gate="blocking">
<action>Authenticate Vercel CLI so I can continue deployment</action>
<instructions>
I tried to deploy but got authentication error.
Run: vercel login
This will open your browser - complete the authentication flow.
</instructions>
<verification>vercel whoami returns your account email</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<!-- After authentication, Claude retries automatically -->
<task type="auto">
<name>Retry Vercel deployment</name>
<action>Run `vercel --yes` (now authenticated)</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
```
### Example: Stripe CLI Needs API Key
```xml
<task type="auto">
<name>Create Stripe webhook endpoint</name>
<action>Use Stripe API to create webhook at /api/webhooks</action>
</task>
<!-- If API returns 401 Unauthorized -->
<task type="checkpoint:human-action" gate="blocking">
<action>Provide Stripe API key so I can continue webhook configuration</action>
<instructions>
I need your Stripe API key to create webhooks.
1. Visit dashboard.stripe.com/apikeys
2. Copy your "Secret key" (starts with sk_test_ or sk_live_)
3. Paste it here or run: export STRIPE_SECRET_KEY=sk_...
</instructions>
<verification>Stripe API key works: curl test succeeds</verification>
<resume-signal>Type "done" or paste the key</resume-signal>
</task>
<!-- After key provided, Claude writes to .env and continues -->
<task type="auto">
<name>Save Stripe key and create webhook</name>
<action>
1. Write STRIPE_SECRET_KEY to .env
2. Create webhook endpoint via Stripe API
3. Save webhook secret to .env
</action>
<verify>.env contains both keys, webhook endpoint exists</verify>
</task>
```
### Example: GitHub CLI Not Logged In
```xml
<task type="auto">
<name>Create GitHub repository</name>
<action>Run `gh repo create myapp --public`</action>
</task>
<!-- If gh returns "Not logged in" -->
<task type="checkpoint:human-action" gate="blocking">
<action>Authenticate GitHub CLI so I can create repository</action>
<instructions>
I need GitHub authentication to create the repo.
Run: gh auth login
Follow the prompts to authenticate (browser or token).
</instructions>
<verification>gh auth status shows "Logged in"</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<task type="auto">
<name>Create repository (authenticated)</name>
<action>Run `gh repo create myapp --public`</action>
<verify>gh repo view shows repository exists</verify>
</task>
```
### Example: Upstash CLI Needs API Key
```xml
<task type="auto">
<name>Create Upstash Redis database</name>
<action>Run `upstash redis create myapp-cache --region us-east-1`</action>
</task>
<!-- If upstash returns auth error -->
<task type="checkpoint:human-action" gate="blocking">
<action>Configure Upstash CLI credentials so I can create database</action>
<instructions>
I need Upstash authentication to create Redis database.
1. Visit console.upstash.com/account/api
2. Copy your API key
3. Run: upstash auth login
4. Paste your API key when prompted
</instructions>
<verification>upstash auth status shows authenticated</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<task type="auto">
<name>Create Redis database (authenticated)</name>
<action>
1. Run `upstash redis create myapp-cache --region us-east-1`
2. Capture connection URL
3. Write to .env: UPSTASH_REDIS_URL={url}
</action>
<verify>upstash redis list shows database, .env contains URL</verify>
</task>
```
### Authentication Gate Protocol
**When Claude encounters authentication error during execution:**
1. **Recognize it's not a failure** - Missing auth is expected, not a bug
2. **Stop current task** - Don't retry repeatedly
3. **Create checkpoint:human-action on the fly** - Dynamic checkpoint, not pre-planned
4. **Provide exact authentication steps** - CLI commands, where to get keys
5. **Verify authentication** - Test that auth works before continuing
6. **Retry the original task** - Resume automation where it left off
7. **Continue normally** - One auth gate doesn't break the flow
**Key difference from pre-planned checkpoints:**
- Pre-planned: "I need you to do X" (wrong - Claude should automate)
- Auth gate: "I tried to automate X but need credentials to continue" (correct - unblocks automation)
**This preserves agentic flow:**
- Claude tries automation first
- Only asks for help when blocked by credentials
- Continues automating after unblocked
- You never manually deploy/create resources - just provide keys
## When checkpoint:human-action is REQUIRED
**Truly rare cases where no CLI/API exists:**
1. **Email verification links** - Account signup requires clicking verification email
2. **SMS verification codes** - 2FA requiring phone
3. **Manual account approvals** - Platform requires human review before API access
4. **Domain DNS records at registrar** - Some registrars have no API
5. **Credit card input** - Payment methods requiring 3D Secure web flow
6. **OAuth app approval** - Some platforms require web-based app approval flow
**For these rare cases:**
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>Complete email verification for SendGrid account</action>
<instructions>
I created the account and requested verification email.
Check your inbox for verification link and click it.
</instructions>
<verification>SendGrid API key works: curl test succeeds</verification>
<resume-signal>Type "done" when verified</resume-signal>
</task>
```
**Key difference:** Claude does EVERYTHING possible first (account creation, API requests), only asks human for the one thing with no automation path.
## Quick Reference: "Can Claude automate this?"
| Action | CLI/API? | Claude does it? |
|--------|----------|-----------------|
| Deploy to Vercel | ✅ `vercel` | YES |
| Create Stripe webhook | ✅ Stripe API | YES |
| Run xcodebuild | ✅ `xcodebuild` | YES |
| Write .env file | ✅ Write tool | YES |
| Create Upstash DB | ✅ `upstash` CLI | YES |
| Install npm packages | ✅ `npm` | YES |
| Create GitHub repo | ✅ `gh` | YES |
| Run tests | ✅ `npm test` | YES |
| Create Supabase project | ⚠️ Web dashboard | NO (then CLI for everything else) |
| Click email verification link | ❌ No API | NO |
| Enter credit card with 3DS | ❌ No API | NO |
**Default answer: YES.** Unless explicitly in the "NO" category, Claude automates it.
## Decision Tree
```
┌─────────────────────────────────────┐
│ Task requires external resource?    │
└──────────────┬──────────────────────┘
┌─────────────────────────────────────┐
│ Does it have CLI/API/tool access?   │
└──────────────┬──────────────────────┘
         ┌─────┴─────┐
         │           │
         ▼           ▼
        YES          NO
         │           │
         │           ▼
         │   ┌──────────────────────────────┐
         │   │ checkpoint:human-action      │
         │   │ (email links, 2FA, etc.)     │
         │   └──────────────────────────────┘
┌────────────────────────────────────────┐
│ task type="auto"                       │
│ Claude automates via CLI/API           │
└────────────┬───────────────────────────┘
┌────────────────────────────────────────┐
│ checkpoint:human-verify                │
│ Human confirms visual/functional       │
└────────────────────────────────────────┘
```
## Summary
**The rule:** If Claude CAN do it, Claude MUST do it.
Checkpoints are for:
- **Verification** - Confirming Claude's automated work looks/behaves correctly
- **Decisions** - Choosing between valid approaches
- **True blockers** - Rare actions with literally no API/CLI (email links, 2FA)
Checkpoints are NOT for:
- Deploying (use CLI)
- Creating resources (use CLI/API)
- Running builds (use Bash)
- Writing files (use Write tool)
- Anything with automation available
**This keeps the agentic coding workflow intact - Claude does the work, you verify results.**

View File

@@ -0,0 +1,138 @@
<overview>
Claude has a finite context window. This reference defines how to monitor usage and handle approaching limits gracefully.
</overview>
<context_awareness>
Claude receives system warnings showing token usage:
```
Token usage: 150000/200000; 50000 remaining
```
This information appears in `<system_warning>` tags during the conversation.
</context_awareness>
<thresholds>
<threshold level="comfortable" remaining="50%+">
**Status**: Plenty of room
**Action**: Work normally
</threshold>
<threshold level="getting_full" remaining="25%">
**Status**: Context accumulating
**Action**: Mention to user: "Context getting full. Consider wrapping up or creating handoff soon."
**No immediate action required.**
</threshold>
<threshold level="low" remaining="15%">
**Status**: Running low
**Action**:
1. Pause at next safe point (complete current atomic operation)
2. Ask user: "Running low on context (~30k tokens remaining). Options:
- Create handoff now and resume in fresh session
- Push through (risky if complex work remains)"
3. Await user decision
**Do not start new large operations.**
</threshold>
<threshold level="critical" remaining="10%">
**Status**: Must stop
**Action**:
1. Complete current atomic task (don't leave broken state)
2. **Automatically create handoff** without asking
3. Tell user: "Context limit reached. Created handoff at [location]. Start fresh session to continue."
4. **Stop working** - do not start any new tasks
This is non-negotiable. Running out of context mid-task is worse than stopping early.
</threshold>
</thresholds>
<what_counts_as_atomic>
An atomic operation is one that shouldn't be interrupted:
**Atomic (finish before stopping)**:
- Writing a single file
- Running a validation command
- Completing a single task from the plan
**Not atomic (can pause between)**:
- Multiple tasks in sequence
- Multi-file changes (can pause between files)
- Research + implementation (can pause between)
When hitting 10% threshold, finish current atomic operation, then stop.
</what_counts_as_atomic>
<handoff_content_at_limit>
When auto-creating handoff at 10%, include:
```yaml
---
phase: [current phase]
task: [current task number]
total_tasks: [total]
status: context_limit_reached
last_updated: [timestamp]
---
```
Body must capture:
1. What was just completed
2. What task was in progress (and how far)
3. What remains
4. Any decisions/context from this session
Be thorough - the next session starts fresh.
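A filled-in sketch (project, task names, and dates are hypothetical):
```markdown
---
phase: 02-auth
task: 3
total_tasks: 5
status: context_limit_reached
last_updated: 2025-12-01T16:40:00Z
---

## Just Completed
- Tasks 1-2: User model and session schema created, migrations applied

## In Progress
- Task 3: Login API route - handler written, tests not yet run

## Remaining
- Task 4: Refresh token rotation
- Task 5: Protected route middleware

## Session Context
- Chose jose over jsonwebtoken for edge runtime compatibility
```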
</handoff_content_at_limit>
<preventing_context_bloat>
Strategies to extend context life:
**Don't re-read files unnecessarily**
- Read once, remember content
- Don't cat the same file multiple times
**Summarize rather than quote**
- "The schema has 5 models including User and Session"
- Not: [paste entire schema]
**Use targeted reads**
- Read specific functions, not entire files
- Use grep to find relevant sections (see the sketch below)
**Clear completed work from "memory"**
- Once a task is done, don't keep referencing it
- Move forward, don't re-explain
**Avoid verbose output**
- Concise responses
- Don't repeat user's question back
- Don't over-explain obvious things
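For the targeted-reads strategy above, a minimal shell sketch (file path and symbol are hypothetical):
```bash
# Locate the relevant section instead of reading the whole file
grep -n "handleLogin" src/auth/AuthService.ts
# Then read only the surrounding lines rather than the entire file
sed -n '120,160p' src/auth/AuthService.ts
```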
</preventing_context_bloat>
<user_signals>
Watch for user signals that suggest context concern:
- "Let's wrap up"
- "Save my place"
- "I need to step away"
- "Pack it up"
- "Create a handoff"
- "Running low on context?"
Any of these → trigger handoff workflow immediately.
</user_signals>
<fresh_session_guidance>
When user returns in fresh session:
1. They invoke skill
2. Context scan finds handoff
3. Resume workflow activates
4. Load handoff, present summary
5. Delete handoff after confirmation
6. Continue from saved state
The fresh session has full context available again.
</fresh_session_guidance>

View File

@@ -0,0 +1,170 @@
# Domain Expertise Structure
Guide for creating domain expertise skills that work efficiently with create-plans.
## Purpose
Domain expertise provides context-specific knowledge (Swift/macOS patterns, Next.js conventions, Unity workflows) that makes plans more accurate and actionable.
**Critical:** Domain skills must be context-efficient. Loading 20k+ tokens of references defeats the purpose.
## File Structure
```
~/.claude/skills/expertise/[domain-name]/
├── SKILL.md              # Core principles + references_index (5-7k tokens)
├── references/           # Selective loading based on phase type
│   ├── always-useful.md  # Conventions, patterns used in all phases
│   ├── database.md       # Database-specific guidance
│   ├── ui-layout.md      # UI-specific guidance
│   ├── api-routes.md     # API-specific guidance
│   └── ...
└── workflows/            # Optional: domain-specific workflows
    └── ...
```
## SKILL.md Template
```markdown
---
name: [domain-name]
description: [What this expertise covers]
---
<principles>
## Core Principles
[Fundamental patterns that apply to ALL work in this domain]
[Should be complete enough to plan without loading references]
Examples:
- File organization patterns
- Naming conventions
- Architecture patterns
- Common gotchas to avoid
- Framework-specific requirements
**Keep this section comprehensive but concise (~3-5k tokens).**
</principles>
<references_index>
## Reference Loading Guide
When planning phases, load references based on phase type:
**For [phase-type-1] phases:**
- references/[file1].md - [What it contains]
- references/[file2].md - [What it contains]
**For [phase-type-2] phases:**
- references/[file3].md - [What it contains]
- references/[file4].md - [What it contains]
**Always useful (load for any phase):**
- references/conventions.md - [What it contains]
- references/common-patterns.md - [What it contains]
**Examples of phase type mapping:**
- Database/persistence phases → database.md, migrations.md
- UI/layout phases → ui-patterns.md, design-system.md
- API/backend phases → api-routes.md, auth.md
- Integration phases → system-apis.md, third-party.md
</references_index>
<workflows>
## Optional Workflows
[If domain has specific workflows, list them here]
[These are NOT auto-loaded - only used when specifically invoked]
</workflows>
```
## Reference File Guidelines
Each reference file should be:
**1. Focused** - Single concern (database patterns, UI layout, API design)
**2. Actionable** - Contains patterns Claude can directly apply
```markdown
# Database Patterns
## Table Naming
- Singular nouns (User, not Users)
- snake_case for SQL, PascalCase for models
## Common Patterns
- Soft deletes: deleted_at timestamp
- Audit columns: created_at, updated_at
- Foreign keys: [table]_id format
```
**3. Sized appropriately** - 500-2000 lines (~1-5k tokens)
- Too small: Not worth separate file
- Too large: Split into more focused files
**4. Self-contained** - Can be understood without reading other references
## Context Efficiency Examples
**Bad (old approach):**
```
Load all references: 10,728 lines = ~27k tokens
Result: 50% context before planning starts
```
**Good (new approach):**
```
Load SKILL.md: ~5k tokens
Planning UI phase → load ui-layout.md + conventions.md: ~7k tokens
Total: ~12k tokens (saves 15k for workspace)
```
## Phase Type Classification
Help create-plans determine which references to load:
**Common phase types:**
- **Foundation/Setup** - Project structure, dependencies, configuration
- **Database/Data** - Schema, models, migrations, queries
- **API/Backend** - Routes, controllers, business logic, auth
- **UI/Frontend** - Components, layouts, styling, interactions
- **Integration** - External APIs, system services, third-party SDKs
- **Features** - Domain-specific functionality
- **Polish** - Performance, accessibility, error handling
**References should map to these types** so create-plans can load the right context.
## Migration Guide
If you have an existing domain skill with many references:
1. **Audit references** - What's actually useful vs. reference dumps?
2. **Consolidate principles** - Move core patterns into SKILL.md principles section
3. **Create references_index** - Map phase types to relevant references
4. **Test loading** - Verify you can plan a phase with <15k token overhead
5. **Iterate** - Adjust groupings based on actual planning needs
## Example: macos-apps
**Before (inefficient):**
- 20 reference files
- Load all: 10,728 lines (~27k tokens)
**After (efficient):**
SKILL.md contains:
- Swift/SwiftUI core principles
- macOS app architecture patterns
- Common patterns (MVVM, data flow)
- references_index mapping:
- UI phases → swiftui-layout.md, appleHIG.md (~4k)
- Data phases → core-data.md, swift-concurrency.md (~5k)
- System phases → appkit-integration.md, menu-bar.md (~3k)
- Always → swift-conventions.md (~2k)
**Result:** 5-12k tokens instead of 27k (saves 15-22k for planning)

View File

@@ -0,0 +1,106 @@
# Git Integration Reference
## Core Principle
**Commit outcomes, not process.**
The git log should read like a changelog of what shipped, not a diary of planning activity.
## Commit Points (Only 3)
| Event | Commit? | Why |
|-------|---------|-----|
| BRIEF + ROADMAP created | YES | Project initialization |
| PLAN.md created | NO | Intermediate - commit with completion |
| RESEARCH.md created | NO | Intermediate |
| FINDINGS.md created | NO | Intermediate |
| **Phase completed** | YES | Actual code shipped |
| Handoff created | YES | WIP state preserved |
## Git Check on Invocation
```bash
git rev-parse --git-dir 2>/dev/null || echo "NO_GIT_REPO"
```
If NO_GIT_REPO:
- Inline: "No git repo found. Initialize one? (Recommended for version control)"
- If yes: `git init`
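A minimal sketch of how the check and optional init chain together:
```bash
# Detect a missing repo, then initialize only after the user agrees
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  # Ask inline: "No git repo found. Initialize one?"
  git init
fi
```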
## Commit Message Formats
### 1. Project Initialization (brief + roadmap together)
```
docs: initialize [project-name] ([N] phases)
[One-liner from BRIEF.md]
Phases:
1. [phase-name]: [goal]
2. [phase-name]: [goal]
3. [phase-name]: [goal]
```
What to commit:
```bash
git add .planning/
git commit
```
### 2. Phase Completion
```
feat([domain]): [one-liner from SUMMARY.md]
- [Key accomplishment 1]
- [Key accomplishment 2]
- [Key accomplishment 3]
[If issues encountered:]
Note: [issue and resolution]
```
Use `fix([domain])` for bug fix phases.
What to commit:
```bash
git add .planning/phases/XX-name/ # PLAN.md + SUMMARY.md
git add src/ # Actual code created
git commit
```
### 3. Handoff (WIP)
```
wip: [phase-name] paused at task [X]/[Y]
Current: [task name]
[If blocked:] Blocked: [reason]
```
What to commit:
```bash
git add .planning/
git commit
```
## Example Clean Git Log
```
a7f2d1 feat(checkout): Stripe payments with webhook verification
b3e9c4 feat(products): catalog with search, filters, and pagination
c8a1b2 feat(auth): JWT with refresh rotation using jose
d5c3d7 feat(foundation): Next.js 15 + Prisma + Tailwind scaffold
e2f4a8 docs: initialize ecommerce-app (5 phases)
```
## What NOT To Commit Separately
- PLAN.md creation (wait for phase completion)
- RESEARCH.md (intermediate)
- FINDINGS.md (intermediate)
- Minor planning tweaks
- "Fixed typo in roadmap"
These create noise. Commit outcomes, not process.

View File

@@ -0,0 +1,142 @@
<overview>
The planning hierarchy ensures context flows down and progress flows up.
Each level builds on the previous and enables the next.
</overview>
<hierarchy>
```
BRIEF.md                      ← Vision (human-focused)
    ROADMAP.md                ← Structure (phases)
        phases/XX/PLAN.md     ← Implementation (Claude-executable)
            prompts/          ← Execution (via create-meta-prompts)
```
</hierarchy>
<level name="brief">
**Purpose**: Capture vision, goals, constraints
**Audience**: Human (the user)
**Contains**: What we're building, why, success criteria, out of scope
**Creates**: `.planning/BRIEF.md`
**Requires**: Nothing (can start here)
**Enables**: Roadmap creation
This is the ONLY document optimized for human reading.
</level>
<level name="roadmap">
**Purpose**: Define phases and sequence
**Audience**: Both human and Claude
**Contains**: Phase names, goals, dependencies, progress tracking
**Creates**: `.planning/ROADMAP.md`, `.planning/phases/` directories
**Requires**: Brief (or quick context if skipping)
**Enables**: Phase planning
Roadmap looks UP to Brief for scope, looks DOWN to track phase completion.
</level>
<level name="phase_plan">
**Purpose**: Define Claude-executable tasks
**Audience**: Claude (the implementer)
**Contains**: Tasks with Files/Action/Verification/Done-when
**Creates**: `.planning/phases/XX-name/PLAN.md`
**Requires**: Roadmap (to know phase scope)
**Enables**: Prompt generation, direct execution
Phase plan looks UP to Roadmap for scope, produces implementation details.
</level>
<level name="prompts">
**Purpose**: Optimized execution instructions
**Audience**: Claude (via create-meta-prompts)
**Contains**: Research/Plan/Do prompts with metadata
**Creates**: `.planning/phases/XX-name/prompts/`
**Requires**: Phase plan (tasks to execute)
**Enables**: Autonomous execution
Prompts are generated from phase plan via create-meta-prompts skill.
</level>
<navigation_rules>
<looking_up>
When creating a lower-level artifact, ALWAYS read higher levels for context:
- Creating Roadmap → Read Brief
- Planning Phase → Read Roadmap AND Brief
- Generating Prompts → Read Phase Plan AND Roadmap
This ensures alignment with overall vision.
</looking_up>
<looking_down>
When updating a higher-level artifact, check lower levels for status:
- Updating Roadmap progress → Check which phase PLANs exist, completion state
- Reviewing Brief → See how far we've come via Roadmap
This enables progress tracking.
</looking_down>
<missing_prerequisites>
If a prerequisite doesn't exist:
```
Creating phase plan but no roadmap exists.
Options:
1. Create roadmap first (recommended)
2. Create quick roadmap placeholder
3. Proceed anyway (not recommended - loses hierarchy benefits)
```
Always offer to create missing pieces rather than skipping.
</missing_prerequisites>
</navigation_rules>
<file_locations>
All planning artifacts in `.planning/`:
```
.planning/
├── BRIEF.md                    # One per project
├── ROADMAP.md                  # One per project
└── phases/
    ├── 01-phase-name/
    │   ├── PLAN.md             # One per phase
    │   ├── .continue-here.md   # Temporary (when paused)
    │   └── prompts/            # Generated execution prompts
    ├── 02-phase-name/
    │   ├── PLAN.md
    │   └── prompts/
    └── ...
```
Phase directories use `XX-kebab-case` for consistent ordering.
</file_locations>
<scope_inheritance>
Each level inherits and narrows scope:
**Brief**: "Build a task management app"
**Roadmap**: "Phase 1: Core task CRUD, Phase 2: Projects, Phase 3: Collaboration"
**Phase 1 Plan**: "Task 1: Database schema, Task 2: API endpoints, Task 3: UI"
Scope flows DOWN and gets more specific.
Progress flows UP and gets aggregated.
</scope_inheritance>
<cross_phase_context>
When planning Phase N, Claude should understand:
- What Phase N-1 delivered (completed work)
- What Phase N should build on (foundations)
- What Phase N+1 will need (don't paint into corner)
Read previous phase's PLAN.md to understand current state.
</cross_phase_context>

View File

@@ -0,0 +1,495 @@
# Milestone Management & Greenfield/Brownfield Planning
Milestones mark shipped versions. They solve the "what happens after v1.0?" problem.
## The Core Problem
**After shipping v1.0:**
- Planning artifacts optimized for greenfield (starting from scratch)
- But now you have: existing code, users, constraints, shipped features
- Need brownfield awareness without losing planning structure
**Solution:** Milestone-bounded extensions with updated BRIEF.
## Three Planning Modes
### 1. Greenfield (v1.0 Initial Development)
**Characteristics:**
- No existing code
- No users
- No constraints from shipped versions
- Pure "build from scratch" mode
**Planning structure:**
```
.planning/
├── BRIEF.md      # Original vision
├── ROADMAP.md    # Phases 1-4
└── phases/
    ├── 01-foundation/
    ├── 02-features/
    ├── 03-polish/
    └── 04-launch/
```
**BRIEF.md looks like:**
```markdown
# Project Brief: AppName
**Vision:** Build a thing that does X
**Purpose:** Solve problem Y
**Scope:**
- Feature A
- Feature B
- Feature C
**Success:** Ships and works
```
**Workflow:** Normal planning → execution → transition flow
---
### 2. Brownfield Extensions (v1.1, v1.2 - Same Codebase)
**Characteristics:**
- v1.0 shipped and in use
- Adding features / fixing issues
- Same codebase, continuous evolution
- Existing code referenced in new plans
**Planning structure:**
```
.planning/
├── BRIEF.md            # Updated with "Current State"
├── ROADMAP.md          # Phases 1-6 (grouped by milestone)
├── MILESTONES.md       # v1.0 entry
└── phases/
    ├── 01-foundation/      # ✓ v1.0
    ├── 02-features/        # ✓ v1.0
    ├── 03-polish/          # ✓ v1.0
    ├── 04-launch/          # ✓ v1.0
    ├── 05-security/        # 🚧 v1.1 (in progress)
    └── 06-performance/     # 📋 v1.1 (planned)
```
**BRIEF.md updated:**
```markdown
# Project Brief: AppName
## Current State (Updated: 2025-12-01)
**Shipped:** v1.0 MVP (2025-11-25)
**Users:** 500 downloads, 50 daily actives
**Feedback:** Requesting dark mode, occasional crashes on network errors
**Codebase:** 2,450 lines Swift, macOS 13.0+, AppKit
## v1.1 Goals
**Vision:** Harden reliability and add dark mode based on user feedback
**Motivation:**
- 5 crash reports related to network errors
- 15 users requested dark mode
- Want to improve before marketing push
**Scope (v1.1):**
- Comprehensive error handling
- Dark mode support
- Crash reporting integration
---
<details>
<summary>Original Vision (v1.0 - Archived)</summary>
[Original brief content]
</details>
```
**ROADMAP.md updated:**
```markdown
# Roadmap: AppName
## Milestones
- ✅ **v1.0 MVP** - Phases 1-4 (shipped 2025-11-25)
- 🚧 **v1.1 Hardening** - Phases 5-6 (in progress)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED 2025-11-25</summary>
- [x] Phase 1: Foundation
- [x] Phase 2: Core Features
- [x] Phase 3: Polish
- [x] Phase 4: Launch
</details>
### 🚧 v1.1 Hardening (In Progress)
- [ ] Phase 5: Error Handling & Stability
- [ ] Phase 6: Dark Mode UI
```
**How plans become brownfield-aware:**
When planning Phase 5, the PLAN.md automatically gets context:
```markdown
<context>
@.planning/BRIEF.md # Knows: v1.0 shipped, codebase exists
@.planning/MILESTONES.md # Knows: what v1.0 delivered
@AppName/NetworkManager.swift # Existing code to improve
@AppName/APIClient.swift # Existing code to fix
</context>
<tasks>
<task type="auto">
<name>Add comprehensive error handling to NetworkManager</name>
<files>AppName/NetworkManager.swift</files>
<action>Existing NetworkManager has basic try/catch. Add: retry logic (3 attempts with exponential backoff), specific error types (NetworkError enum), user-friendly error messages. Maintain existing public API - internal improvements only.</action>
<verify>Build succeeds, existing tests pass, new error tests pass</verify>
<done>All network calls have retry logic, error messages are user-friendly</done>
</task>
```
**Key difference from greenfield:**
- PLAN references existing files in `<context>`
- Tasks say "update existing X" not "create X"
- Verify includes "existing tests pass" (regression check)
- Checkpoints may verify existing behavior still works
---
### 3. Major Iterations (v2.0+ - Still Same Codebase)
**Characteristics:**
- Large rewrites within same codebase
- 8-15+ phases planned
- Breaking changes, new architecture
- Still continuous from v1.x
**Planning structure:**
```
.planning/
├── BRIEF.md            # Updated for v2.0 vision
├── ROADMAP.md          # Phases 1-14 (grouped)
├── MILESTONES.md       # v1.0, v1.1 entries
└── phases/
    ├── 01-foundation/      # ✓ v1.0
    ├── 02-features/        # ✓ v1.0
    ├── 03-polish/          # ✓ v1.0
    ├── 04-launch/          # ✓ v1.0
    ├── 05-security/        # ✓ v1.1
    ├── 06-performance/     # ✓ v1.1
    ├── 07-swiftui-core/    # 🚧 v2.0 (in progress)
    ├── 08-swiftui-views/   # 📋 v2.0 (planned)
    ├── 09-new-arch/        # 📋 v2.0
    └── ...                 # Up to 14
```
**ROADMAP.md:**
```markdown
## Milestones
- ✅ **v1.0 MVP** - Phases 1-4 (shipped 2025-11-25)
- ✅ **v1.1 Hardening** - Phases 5-6 (shipped 2025-12-10)
- 🚧 **v2.0 SwiftUI Redesign** - Phases 7-14 (in progress)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4)</summary>
[Collapsed]
</details>
<details>
<summary>✅ v1.1 Hardening (Phases 5-6)</summary>
[Collapsed]
</details>
### 🚧 v2.0 SwiftUI Redesign (In Progress)
- [ ] Phase 7: SwiftUI Core Migration
- [ ] Phase 8: SwiftUI Views
- [ ] Phase 9: New Architecture
- [ ] Phase 10: Widget Support
- [ ] Phase 11: iOS Companion
- [ ] Phase 12: Performance
- [ ] Phase 13: Testing
- [ ] Phase 14: Launch
```
**Same rules apply:** Continuous phase numbering, milestone groupings, brownfield-aware plans.
---
## When to Archive and Start Fresh
**Archive ONLY for these scenarios:**
### Scenario 1: Separate Codebase
**Example:**
- Built: WeatherBar (macOS app) ✓ shipped
- Now building: WeatherBar-iOS (separate Xcode project, different repo or workspace)
**Action:**
```
.planning/
├── archive/
│   └── v1-macos/
│       ├── BRIEF.md
│       ├── ROADMAP.md
│       ├── MILESTONES.md
│       └── phases/
├── BRIEF.md        # Fresh: iOS app
├── ROADMAP.md      # Fresh: starts at phase 01
└── phases/
    └── 01-ios-foundation/
```
**Why:** Different codebase = different planning context. Old planning doesn't help with iOS-specific decisions.
### Scenario 2: Complete Rewrite (Different Repo)
**Example:**
- Built: AppName v1 (AppKit, shipped) ✓
- Now building: AppName v2 (complete SwiftUI rewrite, new git repo)
**Action:** Same as Scenario 1 - archive v1, fresh planning for v2
**Why:** New repo, starting from scratch, v1 planning doesn't transfer.
### Scenario 3: Different Product
**Example:**
- Built: WeatherBar (weather app) ✓
- Now building: TaskBar (task management app)
**Action:** New project entirely, new `.planning/` directory
**Why:** Completely different product, no relationship.
---
## Decision Tree
```
Starting new work?
├─ Same codebase/repo?
│  │
│  ├─ YES → Extend existing roadmap
│  │        ├─ Add phases 5-6+ to ROADMAP
│  │        ├─ Update BRIEF "Current State"
│  │        ├─ Plans reference existing code in @context
│  │        └─ Continue normal workflow
│  │
│  └─ NO → Is it a separate platform/codebase for same product?
│          │
│          ├─ YES (e.g., iOS version of Mac app)
│          │      └─ Archive existing planning
│          │         └─ Start fresh with new BRIEF/ROADMAP
│          │            └─ Reference original in "Context" section
│          │
│          └─ NO (completely different product)
│                 └─ New project, new planning directory
└─ Is this v1.0 initial delivery?
   └─ YES → Greenfield mode
      └─ Just follow normal workflow
```
---
## Milestone Workflow Triggers
### When completing v1.0 (first ship):
**User:** "I'm ready to ship v1.0"
**Action:**
1. Verify phases 1-4 complete (all summaries exist)
2. `/milestone:complete "v1.0 MVP"`
3. Creates MILESTONES.md entry
4. Updates BRIEF with "Current State"
5. Reorganizes ROADMAP with milestone grouping
6. Git tag v1.0
7. Commit milestone changes
**Result:** Historical record created, ready for v1.1 work
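A sketch of the git side of steps 6-7 (tag name and commit message are illustrative):
```bash
git add .planning/
git commit -m "docs: complete v1.0 MVP milestone"
git tag v1.0
```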
### When adding v1.1 work:
**User:** "Add dark mode and notifications"
**Action:**
1. Check BRIEF "Current State" - sees v1.0 shipped
2. Ask: "Add phases 5-6 to existing roadmap? (yes / archive and start fresh)"
3. User: "yes"
4. Update BRIEF with v1.1 goals
5. Add Phase 5-6 to ROADMAP under "v1.1" milestone heading
6. Continue normal planning workflow
**Result:** Phases 5-6 added, brownfield-aware through updated BRIEF
### When completing v1.1:
**User:** "Ship v1.1"
**Action:**
1. Verify phases 5-6 complete
2. `/milestone:complete "v1.1"`
3. Add v1.1 entry to MILESTONES.md (prepended, newest first)
4. Update BRIEF current state to v1.1
5. Collapse phases 5-6 in ROADMAP
6. Git tag v1.1
**Result:** v1.0 and v1.1 both in MILESTONES.md, ROADMAP shows history
---
## Brownfield Plan Patterns
**How a brownfield plan differs from greenfield:**
### Greenfield Plan (v1.0):
```markdown
<objective>
Create authentication system from scratch.
</objective>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
</context>
<tasks>
<task type="auto">
<name>Create User model</name>
<files>src/models/User.ts</files>
<action>Create User interface with id, email, passwordHash, createdAt fields. Export from models/index.</action>
<verify>TypeScript compiles, User type exported</verify>
<done>User model exists and is importable</done>
</task>
```
### Brownfield Plan (v1.1):
```markdown
<objective>
Add MFA to existing authentication system.
</objective>
<context>
@.planning/BRIEF.md # Shows v1.0 shipped, auth exists
@.planning/MILESTONES.md # Shows what v1.0 delivered
@src/models/User.ts # Existing User model
@src/auth/AuthService.ts # Existing auth logic
</context>
<tasks>
<task type="auto">
<name>Add MFA fields to User model</name>
<files>src/models/User.ts</files>
<action>Add to existing User interface: mfaEnabled (boolean), mfaSecret (string | null), mfaBackupCodes (string[]). Maintain backward compatibility - all new fields optional or have defaults.</action>
<verify>TypeScript compiles, existing User usages still work</verify>
<done>User model has MFA fields, no breaking changes</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>MFA enrollment flow</what-built>
<how-to-verify>
1. Run: npm run dev
2. Login as existing user (test@example.com)
3. Navigate to Settings → Security
4. Click "Enable MFA" - should show QR code
5. Scan with authenticator app (Google Authenticator)
6. Enter code - should enable successfully
7. Logout, login again - should prompt for MFA code
8. Verify: existing users without MFA can still login (backward compat)
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
**Key differences:**
1. **@context** includes existing code files
2. **Actions** say "add to existing" / "update existing" / "maintain backward compat"
3. **Verification** includes regression checks ("existing X still works")
4. **Checkpoints** may verify existing user flows still work
---
## BRIEF Current State Section
The "Current State" section in BRIEF.md is what makes plans brownfield-aware.
**After v1.0 ships:**
```markdown
## Current State (Updated: 2025-11-25)
**Shipped:** v1.0 MVP (2025-11-25)
**Status:** Production
**Users:** 500 downloads, 50 daily actives, growing 10% weekly
**Feedback:**
- "Love the simplicity" (common theme)
- 15 requests for dark mode
- 5 crash reports on network errors
- 3 requests for multiple accounts
**Codebase:**
- 2,450 lines of Swift
- macOS 13.0+ (AppKit)
- OpenWeather API integration
- Auto-refresh every 30 min
- Signed and notarized
**Known Issues:**
- Network errors crash app (no retry logic)
- Memory leak in auto-refresh timer
- No dark mode support
```
When planning Phase 5 (v1.1), Claude reads this and knows:
- Code exists (2,450 lines Swift)
- Users exist (500 downloads)
- Feedback exists (15 want dark mode)
- Issues exist (network crashes, memory leak)
Plans automatically become brownfield-aware because BRIEF says "this is what we have."
---
## Summary
**Greenfield (v1.0):**
- Fresh BRIEF with vision
- Phases 1-4 (or however many)
- Plans create from scratch
- Ship → complete milestone
**Brownfield (v1.1+):**
- Update BRIEF "Current State"
- Add phases 5-6+ to ROADMAP
- Plans reference existing code
- Plans include regression checks
- Ship → complete milestone
**Archive (rare):**
- Only for separate codebases or different products
- Move `.planning/` to `.planning/archive/v1-name/`
- Start fresh with new BRIEF/ROADMAP
- New planning references old in context
**Key insight:** Same roadmap, continuous phase numbering (01-99), milestone groupings keep it organized. BRIEF "Current State" makes everything brownfield-aware automatically.
This scales from "hello world" to 100 shipped versions.

View File

@@ -0,0 +1,377 @@
<overview>
Claude-executable plans have a specific format that enables Claude to implement without interpretation. This reference defines what makes a plan executable vs. vague.
**Key insight:** PLAN.md IS the executable prompt. It contains everything Claude needs to execute the phase, including objective, context references, tasks, verification, success criteria, and output specification.
</overview>
<core_principle>
A plan is Claude-executable when Claude can read the PLAN.md and immediately start implementing without asking clarifying questions.
If Claude has to guess, interpret, or make assumptions - the task is too vague.
</core_principle>
<prompt_structure>
Every PLAN.md follows this XML structure:
```markdown
---
phase: XX-name
type: execute
domain: [optional]
---
<objective>
[What and why]
Purpose: [...]
Output: [...]
</objective>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
@relevant/source/files.ts
</context>
<tasks>
<task type="auto">
<name>Task N: [Name]</name>
<files>[paths]</files>
<action>[what to do, what to avoid and WHY]</action>
<verify>[command/check]</verify>
<done>[criteria]</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[what Claude automated]</what-built>
<how-to-verify>[numbered verification steps]</how-to-verify>
<resume-signal>[how to continue - "approved" or describe issues]</resume-signal>
</task>
<task type="checkpoint:decision" gate="blocking">
<decision>[what needs deciding]</decision>
<context>[why this matters]</context>
<options>
<option id="option-a"><name>[Name]</name><pros>[pros]</pros><cons>[cons]</cons></option>
<option id="option-b"><name>[Name]</name><pros>[pros]</pros><cons>[cons]</cons></option>
</options>
<resume-signal>[how to indicate choice]</resume-signal>
</task>
</tasks>
<verification>
[Overall phase checks]
</verification>
<success_criteria>
[Measurable completion]
</success_criteria>
<output>
[SUMMARY.md specification]
</output>
```
</prompt_structure>
<task_anatomy>
Every task has four required fields:
<field name="files">
**What it is**: Exact file paths that will be created or modified.
**Good**: `src/app/api/auth/login/route.ts`, `prisma/schema.prisma`
**Bad**: "the auth files", "relevant components"
Be specific. If you don't know the file path, figure it out first.
</field>
<field name="action">
**What it is**: Specific implementation instructions, including what to avoid and WHY.
**Good**: "Create POST endpoint that accepts {email, password}, validates using bcrypt against User table, returns JWT in httpOnly cookie with 15-min expiry. Use jose library (not jsonwebtoken - CommonJS issues with Next.js Edge runtime)."
**Bad**: "Add authentication", "Make login work"
Include: technology choices, data structures, behavior details, pitfalls to avoid.
</field>
<field name="verify">
**What it is**: How to prove the task is complete.
**Good**:
- `npm test` passes
- `curl -X POST /api/auth/login` returns 200 with Set-Cookie header
- Build completes without errors
**Bad**: "It works", "Looks good", "User can log in"
Must be executable - a command, a test, an observable behavior.
</field>
<field name="done">
**What it is**: Acceptance criteria - the measurable state of completion.
**Good**: "Valid credentials return 200 + JWT cookie, invalid credentials return 401"
**Bad**: "Authentication is complete"
Should be testable without subjective judgment.
</field>
</task_anatomy>
<task_types>
Tasks have a `type` attribute that determines how they execute:
<type name="auto">
**Default task type** - Claude executes autonomously.
**Structure:**
```xml
<task type="auto">
<name>Task 3: Create login endpoint with JWT</name>
<files>src/app/api/auth/login/route.ts</files>
<action>POST endpoint accepting {email, password}. Query User by email, compare password with bcrypt. On match, create JWT with jose library, set as httpOnly cookie (15-min expiry). Return 200. On mismatch, return 401.</action>
<verify>curl -X POST localhost:3000/api/auth/login returns 200 with Set-Cookie header</verify>
<done>Valid credentials → 200 + cookie. Invalid → 401.</done>
</task>
```
Use for: Everything Claude can do independently (code, tests, builds, file operations).
</type>
<type name="checkpoint:human-action">
**RARELY USED** - Only for actions with NO CLI/API. Claude automates everything possible first.
**Structure:**
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>[Unavoidable manual step - email link, 2FA code]</action>
<instructions>
[What Claude already automated]
[The ONE thing requiring human action]
</instructions>
<verification>[What Claude can check afterward]</verification>
<resume-signal>[How to continue]</resume-signal>
</task>
```
Use ONLY for: Email verification links, SMS 2FA codes, manual approvals with no API, 3D Secure payment flows.
Do NOT use for: Anything with a CLI (Vercel, Stripe, Upstash, Railway, GitHub), builds, tests, file creation, deployments.
See: references/cli-automation.md for what Claude can automate.
**Execution:** Claude automates everything with CLI/API, stops only for truly unavoidable manual steps.
</type>
<type name="checkpoint:human-verify">
**Human must verify Claude's work** - Visual checks, UX testing.
**Structure:**
```xml
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Responsive dashboard layout</what-built>
<how-to-verify>
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Desktop (>1024px): Verify sidebar left, content right
4. Tablet (768px): Verify sidebar collapses to hamburger
5. Mobile (375px): Verify single column, bottom nav
6. Check: No layout shift, no horizontal scroll
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
Use for: UI/UX verification, visual design checks, animation smoothness, accessibility testing.
**Execution:** Claude builds the feature, stops, provides testing instructions, waits for approval/feedback.
</type>
<type name="checkpoint:decision">
**Human must make implementation choice** - Direction-setting decisions.
**Structure:**
```xml
<task type="checkpoint:decision" gate="blocking">
<decision>Select authentication provider</decision>
<context>We need user authentication. Three approaches with different tradeoffs:</context>
<options>
<option id="supabase">
<name>Supabase Auth</name>
<pros>Built-in with Supabase, generous free tier</pros>
<cons>Less customizable UI, tied to ecosystem</cons>
</option>
<option id="clerk">
<name>Clerk</name>
<pros>Beautiful pre-built UI, best DX</pros>
<cons>Paid after 10k MAU</cons>
</option>
<option id="nextauth">
<name>NextAuth.js</name>
<pros>Free, self-hosted, maximum control</pros>
<cons>More setup, you manage security</cons>
</option>
</options>
<resume-signal>Select: supabase, clerk, or nextauth</resume-signal>
</task>
```
Use for: Technology selection, architecture decisions, design choices, feature prioritization.
**Execution:** Claude presents options with balanced pros/cons, waits for decision, proceeds with chosen direction.
</type>
**When to use checkpoints:**
- Visual/UX verification (after Claude builds) → `checkpoint:human-verify`
- Implementation direction choice → `checkpoint:decision`
- Truly unavoidable manual actions (email links, 2FA) → `checkpoint:human-action` (rare)
**When NOT to use checkpoints:**
- Anything with CLI/API (Claude automates it) → `type="auto"`
- Deployments (Vercel, Railway, Fly) → `type="auto"` with CLI
- Creating resources (Upstash, Stripe, GitHub) → `type="auto"` with CLI/API
- File operations, tests, builds → `type="auto"`
**Golden rule:** If Claude CAN automate it, Claude MUST automate it. See: references/cli-automation.md
See `references/checkpoints.md` for comprehensive checkpoint guidance.
</task_types>
<context_references>
Use @file references to load context for the prompt:
```markdown
<context>
@.planning/BRIEF.md # Project vision
@.planning/ROADMAP.md # Phase structure
@.planning/phases/02-auth/FINDINGS.md # Research results
@src/lib/db.ts # Existing database setup
@src/types/user.ts # Existing type definitions
</context>
```
Reference files that Claude needs to understand before implementing.
</context_references>
<verification_section>
Overall phase verification (beyond individual task verification):
```markdown
<verification>
Before declaring phase complete:
- [ ] `npm run build` succeeds without errors
- [ ] `npm test` passes all tests
- [ ] No TypeScript errors
- [ ] Feature works end-to-end manually
</verification>
```
</verification_section>
<success_criteria_section>
Measurable criteria for phase completion:
```markdown
<success_criteria>
- All tasks completed
- All verification checks pass
- No errors or warnings introduced
- JWT auth flow works end-to-end
- Protected routes redirect unauthenticated users
</success_criteria>
```
</success_criteria_section>
<output_section>
Specify the SUMMARY.md structure:
```markdown
<output>
After completion, create `.planning/phases/XX-name/SUMMARY.md`:
# Phase X: Name Summary
**[Substantive one-liner]**
## Accomplishments
## Files Created/Modified
## Decisions Made
## Issues Encountered
## Next Phase Readiness
</output>
```
</output_section>
<specificity_levels>
<too_vague>
```xml
<task type="auto">
<name>Task 1: Add authentication</name>
<files>???</files>
<action>Implement auth</action>
<verify>???</verify>
<done>Users can authenticate</done>
</task>
```
Claude: "How? What type? What library? Where?"
</too_vague>
<just_right>
```xml
<task type="auto">
<name>Task 1: Create login endpoint with JWT</name>
<files>src/app/api/auth/login/route.ts</files>
<action>POST endpoint accepting {email, password}. Query User by email, compare password with bcrypt. On match, create JWT with jose library, set as httpOnly cookie (15-min expiry). Return 200. On mismatch, return 401. Use jose instead of jsonwebtoken (CommonJS issues with Edge).</action>
<verify>curl -X POST localhost:3000/api/auth/login -H "Content-Type: application/json" -d '{"email":"test@test.com","password":"test123"}' returns 200 with Set-Cookie header containing JWT</verify>
<done>Valid credentials → 200 + cookie. Invalid → 401. Missing fields → 400.</done>
</task>
```
Claude can implement this immediately.
</just_right>
<too_detailed>
Writing the actual code in the plan. Trust Claude to implement from clear instructions.
</too_detailed>
</specificity_levels>
<anti_patterns>
<vague_actions>
- "Set up the infrastructure"
- "Handle edge cases"
- "Make it production-ready"
- "Add proper error handling"
These require Claude to decide WHAT to do. Specify it.
</vague_actions>
<unverifiable_completion>
- "It works correctly"
- "User experience is good"
- "Code is clean"
- "Tests pass" (which tests? do they exist?)
These require subjective judgment. Make it objective.
</unverifiable_completion>
<missing_context>
- "Use the standard approach"
- "Follow best practices"
- "Like the other endpoints"
Claude doesn't know your standards. Be explicit.
</missing_context>
</anti_patterns>
<sizing_tasks>
Good task size: 15-60 minutes of Claude work.
**Too small**: "Add import statement for bcrypt" (combine with related task)
**Just right**: "Create login endpoint with JWT validation" (focused, specific)
**Too big**: "Implement full authentication system" (split into multiple plans)
If a task takes multiple sessions, break it down.
If a task is trivial, combine with related tasks.
**Note on scope:** If a phase has more than 3 tasks or spans multiple subsystems, split into multiple plans using the naming convention `{phase}-{plan}-PLAN.md`. See `references/scope-estimation.md` for guidance.
</sizing_tasks>

View File

@@ -0,0 +1,198 @@
# Research Pitfalls - Known Patterns to Avoid
## Purpose
This document catalogs research mistakes discovered in production use, providing specific patterns to avoid and verification strategies to prevent recurrence.
## Known Pitfalls
### Pitfall 1: Configuration Scope Assumptions
**What**: Assuming global configuration means no project-scoping exists
**Example**: Concluding "MCP servers are configured GLOBALLY only" while missing project-scoped `.mcp.json`
**Why it happens**: Not explicitly checking all known configuration patterns
**Prevention**:
```xml
<verification_checklist>
**CRITICAL**: Verify ALL configuration scopes:
□ User/global scope - System-wide configuration
□ Project scope - Project-level configuration files
□ Local scope - Project-specific user overrides
□ Workspace scope - IDE/tool workspace settings
□ Environment scope - Environment variables
</verification_checklist>
```
### Pitfall 2: "Search for X" Vagueness
**What**: Asking researchers to "search for documentation" without specifying where
**Example**: "Research MCP documentation" → finds outdated community blog instead of official docs
**Why it happens**: Vague research instructions don't specify exact sources
**Prevention**:
```xml
<sources>
Official sources (use WebFetch):
- https://exact-url-to-official-docs
- https://exact-url-to-api-reference
Search queries (use WebSearch):
- "specific search query {current_year}"
- "another specific query {current_year}"
</sources>
```
### Pitfall 3: Deprecated vs Current Features
**What**: Finding archived/old documentation and concluding feature doesn't exist
**Example**: Finding 2022 docs saying "feature not supported" when current version added it
**Why it happens**: Not checking multiple sources or recent updates
**Prevention**:
```xml
<verification_checklist>
□ Check current official documentation
□ Review changelog/release notes for recent updates
□ Verify version numbers and publication dates
□ Cross-reference multiple authoritative sources
</verification_checklist>
```
### Pitfall 4: Tool-Specific Variations
**What**: Conflating capabilities across different tools/environments
**Example**: "Claude Desktop supports X" ≠ "Claude Code supports X"
**Why it happens**: Not explicitly checking each environment separately
**Prevention**:
```xml
<verification_checklist>
□ Claude Desktop capabilities
□ Claude Code capabilities
□ VS Code extension capabilities
□ API/SDK capabilities
Document which environment supports which features
</verification_checklist>
```
### Pitfall 5: Confident Negative Claims Without Citations
**What**: Making definitive "X is not possible" statements without official source verification
**Example**: "Folder-scoped MCP configuration is not supported" (missing `.mcp.json`)
**Why it happens**: Drawing conclusions from absence of evidence rather than evidence of absence
**Prevention**:
```xml
<critical_claims_audit>
For any "X is not possible" or "Y is the only way" statement:
- [ ] Is this verified by official documentation stating it explicitly?
- [ ] Have I checked for recent updates that might change this?
- [ ] Have I verified all possible approaches/mechanisms?
- [ ] Am I confusing "I didn't find it" with "it doesn't exist"?
</critical_claims_audit>
```
### Pitfall 6: Missing Enumeration
**What**: Investigating open-ended scope without enumerating known possibilities first
**Example**: "Research configuration options" instead of listing specific options to verify
**Why it happens**: Not creating explicit checklist of items to investigate
**Prevention**:
```xml
<verification_checklist>
Enumerate ALL known options FIRST:
□ Option 1: [specific item]
□ Option 2: [specific item]
□ Option 3: [specific item]
□ Check for additional unlisted options
For each option above, document:
- Existence (confirmed/not found/unclear)
- Official source URL
- Current status (active/deprecated/beta)
</verification_checklist>
```
### Pitfall 7: Single-Source Verification
**What**: Relying on a single source for critical claims
**Example**: Using only Stack Overflow answer from 2021 for current best practices
**Why it happens**: Not cross-referencing multiple authoritative sources
**Prevention**:
```xml
<source_verification>
For critical claims, require multiple sources:
- [ ] Official documentation (primary)
- [ ] Release notes/changelog (for currency)
- [ ] Additional authoritative source (for verification)
- [ ] Contradiction check (ensure sources agree)
</source_verification>
```
### Pitfall 8: Assumed Completeness
**What**: Assuming search results are complete and authoritative
**Example**: First Google result is outdated but assumed current
**Why it happens**: Not verifying publication dates and source authority
**Prevention**:
```xml
<source_verification>
For each source consulted:
- [ ] Publication/update date verified (prefer recent/current)
- [ ] Source authority confirmed (official docs, not blogs)
- [ ] Version relevance checked (matches current version)
- [ ] Multiple search queries tried (not just one)
</source_verification>
```
## Red Flags in Research Outputs
### 🚩 Red Flag 1: Zero "Not Found" Results
**Warning**: Every investigation succeeds perfectly
**Problem**: Real research encounters dead ends, ambiguity, and unknowns
**Action**: Expect honest reporting of limitations, contradictions, and gaps
### 🚩 Red Flag 2: No Confidence Indicators
**Warning**: All findings presented as equally certain
**Problem**: Can't distinguish verified facts from educated guesses
**Action**: Require confidence levels (High/Medium/Low) for key findings
### 🚩 Red Flag 3: Missing URLs
**Warning**: "According to documentation..." without specific URL
**Problem**: Can't verify claims or check for updates
**Action**: Require actual URLs for all official documentation claims
### 🚩 Red Flag 4: Definitive Statements Without Evidence
**Warning**: "X cannot do Y" or "Z is the only way" without citation
**Problem**: Strong claims require strong evidence
**Action**: Flag for verification against official sources
### 🚩 Red Flag 5: Incomplete Enumeration
**Warning**: Verification checklist lists 4 items, output covers 2
**Problem**: Systematic gaps in coverage
**Action**: Ensure all enumerated items addressed or marked "not found"
## Continuous Improvement
When research gaps occur:
1. **Document the gap**
- What was missed or incorrect?
- What was the actual correct information?
- What was the impact?
2. **Root cause analysis**
- Why wasn't it caught?
- Which verification step would have prevented it?
- What pattern does this reveal?
3. **Update this document**
- Add new pitfall entry
- Update relevant checklists
- Share lesson learned
## Quick Reference Checklist
Before submitting research, verify:
- [ ] All enumerated items investigated (not just some)
- [ ] Negative claims verified with official docs
- [ ] Multiple sources cross-referenced for critical claims
- [ ] URLs provided for all official documentation
- [ ] Publication dates checked (prefer recent/current)
- [ ] Tool/environment-specific variations documented
- [ ] Confidence levels assigned honestly
- [ ] Assumptions distinguished from verified facts
- [ ] "What might I have missed?" review completed
---
**Living Document**: Update after each significant research gap
**Lessons From**: MCP configuration research gap (missed `.mcp.json`)

View File

@@ -0,0 +1,415 @@
# Scope Estimation & Quality-Driven Plan Splitting
Plans must maintain consistent quality from first task to last. This requires understanding the **quality degradation curve** and splitting aggressively to stay in the peak quality zone.
## The Quality Degradation Curve
**Critical insight:** Claude doesn't degrade at arbitrary percentages - it degrades when it *perceives* context pressure and enters "completion mode."
```
Context Usage │ Quality Level │ Claude's Mental State
─────────────────────────────────────────────────────────
0-30% │ ████████ PEAK │ "I can be thorough and comprehensive"
│ │ No anxiety, full detail, best work
30-50% │ ██████ GOOD │ "Still have room, maintaining quality"
│ │ Engaged, confident, solid work
50-70% │ ███ DEGRADING │ "Getting tight, need to be efficient"
│ │ Efficiency mode, compression begins
70%+ │ █ POOR │ "Running out, must finish quickly"
│ │ Self-lobotomization, rushed, minimal
```
**The 40-50% inflection point:**
This is where quality breaks. Claude sees context mounting and thinks "I'd better conserve now or I won't finish." Result: The classic mid-execution statement "I'll complete the remaining tasks more concisely" = quality crash.
**The fundamental rule:** Stop BEFORE quality degrades, not at context limit.
## Target: 50% Context Maximum
**Plans should complete within ~50% of context usage.**
Why 50% not 80%?
- Huge safety buffer
- No context anxiety possible
- Quality maintained from start to finish
- Room for unexpected complexity
- Space for iteration and fixes
**If you target 80%, you're planning for failure.** By the time you hit 80%, you've already spent 40% in degradation mode.
## The 2-3 Task Rule
**Each plan should contain 2-3 tasks maximum.**
Why this number?
**Task 1 (0-15% context):**
- Fresh context
- Peak quality
- Comprehensive implementation
- Full testing
- Complete documentation
**Task 2 (15-35% context):**
- Still in peak zone
- Quality maintained
- Buffer feels safe
- No anxiety
**Task 3 (35-50% context):**
- Beginning to feel pressure
- Quality still good but managing it
- Natural stopping point
- Better to commit here
**Task 4+ (50%+ context):**
- DEGRADATION ZONE
- "I'll do this concisely" appears
- Quality crashes
- Should have split before this
**The principle:** Each task is independently committable. 2-3 focused changes per commit create beautiful, surgical git history.
## Signals to Split Into Multiple Plans
### Always Split If:
**1. More than 3 tasks**
- Even if tasks seem small
- Each additional task increases degradation risk
- Split into logical groups of 2-3
**2. Multiple subsystems**
```
❌ Bad (1 plan):
- Database schema (3 files)
- API routes (5 files)
- UI components (8 files)
Total: 16 files, 1 plan → guaranteed degradation
✅ Good (3 plans):
- 01-01-PLAN.md: Database schema (3 files, 2 tasks)
- 01-02-PLAN.md: API routes (5 files, 3 tasks)
- 01-03-PLAN.md: UI components (8 files, 3 tasks)
Total: 16 files, 3 plans → consistent quality
```
**3. Any task with >5 file modifications**
- Large tasks burn context fast
- Split by file groups or logical units
- Better: 3 plans of 2 files each vs 1 plan of 6 files
**4. Checkpoint + implementation work**
- Checkpoints require user interaction (context preserved)
- Implementation after checkpoint should be separate plan
```
✅ Good split:
- 02-01-PLAN.md: Setup (checkpoint: decision on auth provider)
- 02-02-PLAN.md: Implement chosen auth solution
```
**5. Research + implementation**
- Research produces FINDINGS.md (separate plan)
- Implementation consumes FINDINGS.md (separate plan)
- Clear boundary, clean handoff
### Consider Splitting If:
**1. Estimated >5 files modified total**
- Context from reading existing code
- Context from diffs
- Context from responses
- Adds up faster than expected
**2. Complex domains (auth, payments, data modeling)**
- These require careful thinking
- Burns more context per task than simple CRUD
- Split more aggressively
**3. Any uncertainty about approach**
- "Figure out X" phase separate from "implement X" phase
- Don't mix exploration and implementation
**4. Natural semantic boundaries**
- Setup → Core → Features
- Backend → Frontend
- Configuration → Implementation → Testing
## Splitting Strategies
### By Subsystem
**Phase:** "Authentication System"
**Split:**
```
- 03-01-PLAN.md: Database models (User, Session tables + relations)
- 03-02-PLAN.md: Auth API (register, login, logout endpoints)
- 03-03-PLAN.md: Protected routes (middleware, JWT validation)
- 03-04-PLAN.md: UI components (login form, registration form)
```
Each plan: 2-3 tasks, single subsystem, clean commits.
### By Dependency
**Phase:** "Payment Integration"
**Split:**
```
- 04-01-PLAN.md: Stripe setup (webhook endpoints via API, env vars, test mode)
- 04-02-PLAN.md: Subscription logic (plans, checkout, customer portal)
- 04-03-PLAN.md: Frontend integration (pricing page, payment flow)
```
Later plans depend on earlier completion. Sequential execution, fresh context each time.
### By Complexity
**Phase:** "Dashboard Buildout"
**Split:**
```
- 05-01-PLAN.md: Layout shell (simple: sidebar, header, routing)
- 05-02-PLAN.md: Data fetching (moderate: TanStack Query setup, API integration)
- 05-03-PLAN.md: Data visualization (complex: charts, tables, real-time updates)
```
Complex work gets its own plan with full context budget.
### By Verification Points
**Phase:** "Deployment Pipeline"
**Split:**
```
- 06-01-PLAN.md: Vercel setup (deploy via CLI, configure domains)
→ Ends with checkpoint:human-verify "check xyz.vercel.app loads"
- 06-02-PLAN.md: Environment config (secrets via CLI, env vars)
→ Autonomous (no checkpoints) → subagent execution
- 06-03-PLAN.md: CI/CD (GitHub Actions, preview deploys)
→ Ends with checkpoint:human-verify "check PR preview works"
```
Verification checkpoints create natural boundaries. Autonomous plans between checkpoints execute via subagent with fresh context.
## Autonomous vs Interactive Plans
**Critical optimization:** Plans without checkpoints don't need main context.
### Autonomous Plans (No Checkpoints)
- Contains only `type="auto"` tasks
- No user interaction needed
- **Execute via subagent with fresh 200k context**
- Impossible to degrade (always starts at 0%)
- Creates SUMMARY, commits, reports back
- Can run in parallel (multiple subagents)
### Interactive Plans (Has Checkpoints)
- Contains `checkpoint:human-verify` or `checkpoint:decision` tasks
- Requires user interaction
- Must execute in main context
- Still target 50% context (2-3 tasks)
**Planning guidance:** If splitting a phase, try to:
- Group autonomous work together (→ subagent)
- Separate interactive work (→ main context)
- Maximize autonomous plans (more fresh contexts)
Example:
```
Phase: Feature X
- 07-01-PLAN.md: Backend (autonomous) → subagent
- 07-02-PLAN.md: Frontend (autonomous) → subagent
- 07-03-PLAN.md: Integration test (has checkpoint:human-verify) → main context
```
Two fresh contexts, one interactive verification. Perfect.
## Anti-Patterns
### ❌ The "Comprehensive Plan" Anti-Pattern
```
Plan: "Complete Authentication System"
Tasks:
1. Database models
2. Migration files
3. Auth API endpoints
4. JWT utilities
5. Protected route middleware
6. Password hashing
7. Login form component
8. Registration form component
Result: 8 tasks, 80%+ context, degradation at task 4-5
```
**Why this fails:**
- Task 1-3: Good quality
- Task 4-5: "I'll do these concisely" = degradation begins
- Task 6-8: Rushed, minimal, poor quality
### ✅ The "Atomic Plan" Pattern
```
Split into 4 plans:
Plan 1: "Auth Database Models" (2 tasks)
- Database schema (User, Session)
- Migration files
Plan 2: "Auth API Core" (3 tasks)
- Register endpoint
- Login endpoint
- JWT utilities
Plan 3: "Auth API Protection" (2 tasks)
- Protected route middleware
- Logout endpoint
Plan 4: "Auth UI Components" (2 tasks)
- Login form
- Registration form
```
**Why this succeeds:**
- Each plan: 2-3 tasks, 30-40% context
- All tasks: Peak quality throughout
- Git history: 4 focused commits
- Easy to verify each piece
- Rollback is surgical
### ❌ The "Efficiency Trap" Anti-Pattern
```
Thinking: "These tasks are small, let's do 6 to be efficient"
Result: Task 1-2 are good, task 3-4 begin degrading, task 5-6 are rushed
```
**Why this fails:** You're optimizing for fewer plans, not quality. The "efficiency" is false - poor quality requires more rework.
### ✅ The "Quality First" Pattern
```
Thinking: "These tasks are small, but let's do 2-3 to guarantee quality"
Result: All tasks peak quality, clean commits, no rework needed
```
**Why this succeeds:** You optimize for quality, which is true efficiency. No rework = faster overall.
## Estimating Context Usage
**Rough heuristics for plan size:**
### File Counts
- 0-3 files modified: Small task (~10-15% context)
- 4-6 files modified: Medium task (~20-30% context)
- 7+ files modified: Large task (~40%+ context) - split this
### Complexity
- Simple CRUD: ~15% per task
- Business logic: ~25% per task
- Complex algorithms: ~40% per task
- Domain modeling: ~35% per task
### 2-Task Plan (Safe)
- 2 simple tasks: ~30% total ✅ Plenty of room
- 2 medium tasks: ~50% total ✅ At target
- 2 complex tasks: ~80% total ❌ Too tight, split
### 3-Task Plan (Risky)
- 3 simple tasks: ~45% total ✅ Good
- 3 medium tasks: ~75% total ⚠️ Pushing it
- 3 complex tasks: ~120% total ❌ Impossible, split
**Conservative principle:** When in doubt, split. Better to have an extra plan than degraded quality.
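To make the arithmetic concrete, here is a throwaway sketch using the per-task heuristics above; the task-type labels are illustrative only and not part of any plan format:
```bash
# Rough plan-size check: sum heuristic context percentages and flag plans over the 50% target
estimate_plan() {
  local total=0
  for task in "$@"; do
    case "$task" in
      simple)  total=$((total + 15)) ;;  # simple CRUD
      logic)   total=$((total + 25)) ;;  # business logic
      domain)  total=$((total + 35)) ;;  # domain modeling
      complex) total=$((total + 40)) ;;  # complex algorithms
      *) echo "unknown task type: $task" >&2; return 1 ;;
    esac
  done
  echo "Estimated context: ~${total}%"
  if [ "$total" -gt 50 ]; then
    echo "Over the 50% target - split this plan."
  fi
}

estimate_plan logic logic complex   # prints ~90% and the split warning
```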
## The Atomic Commit Philosophy
**What we're optimizing for:** Beautiful git history where each commit is:
- Focused (2-3 related changes)
- Complete (fully implemented, tested)
- Documented (clear commit message)
- Reviewable (small enough to understand)
- Revertable (surgical rollback possible)
**Bad git history (large plans):**
```
feat(auth): Complete authentication system
- Added 16 files
- Modified 8 files
- 1200 lines changed
- Contains: models, API, UI, middleware, utilities
```
Impossible to review, hard to understand, can't revert without losing everything.
**Good git history (atomic plans):**
```
feat(auth-01): Add User and Session database models
- Added schema files
- Added migration
- 45 lines changed
feat(auth-02): Implement register and login API endpoints
- Added /api/auth/register
- Added /api/auth/login
- Added JWT utilities
- 120 lines changed
feat(auth-03): Add protected route middleware
- Added middleware/auth.ts
- Added tests
- 60 lines changed
feat(auth-04): Build login and registration forms
- Added LoginForm component
- Added RegisterForm component
- 90 lines changed
```
Each commit tells a story. Each is reviewable. Each is revertable. This is craftsmanship.
## Quality Assurance Through Scope Control
**The guarantee:** When you follow the 2-3 task rule with 50% context target:
1. **Consistency:** First task has same quality as last task
2. **Thoroughness:** No "I'll complete X concisely" degradation
3. **Documentation:** Full context budget for comments/tests
4. **Error handling:** Space for proper validation and edge cases
5. **Testing:** Room for comprehensive test coverage
**The cost:** More plans to manage.
**The benefit:** Consistent excellence. No rework. Clean history. Maintainable code.
**The trade-off is worth it.**
## Summary
**Old way (3-6 tasks, 80% target):**
- Tasks 1-2: Good
- Tasks 3-4: Degrading
- Tasks 5-6: Poor
- Git: Large, unreviewable commits
- Quality: Inconsistent
**New way (2-3 tasks, 50% target):**
- All tasks: Peak quality
- Git: Atomic, surgical commits
- Quality: Consistent excellence
- Autonomous plans: Subagent execution (fresh context)
**The principle:** Aggressive atomicity. More plans, smaller scope, consistent quality.
**The rule:** If in doubt, split. Quality over consolidation. Always.

View File

@@ -0,0 +1,72 @@
# User Gates Reference
User gates prevent Claude from charging ahead at critical decision points.
## Question Types
### AskUserQuestion Tool
Use for **structured choices** (2-4 options):
- Selecting from distinct approaches
- Domain/type selection
- When user needs to see options to decide
Examples:
- "What type of project?" (macos-app / iphone-app / web-app / other)
- "Research confidence is low. How to proceed?" (dig deeper / proceed anyway / pause)
- "Multiple valid approaches exist:" (Option A / Option B / Option C)
### Inline Questions
Use for **simple confirmations**:
- Yes/no decisions
- "Does this look right?"
- "Ready to proceed?"
Examples:
- "Here's the task breakdown: [list]. Does this look right?"
- "Proceed with this approach?"
- "I'll initialize a git repo. OK?"
## Decision Gate Loop
After gathering context, ALWAYS offer:
```
Ready to [action], or would you like me to ask more questions?
1. Proceed - I have enough context
2. Ask more questions - There are details to clarify
3. Let me add context - I want to provide additional information
```
Loop continues until user selects "Proceed".
## Mandatory Gate Points
| Location | Gate Type | Trigger |
|----------|-----------|---------|
| plan-phase | Inline | Confirm task breakdown |
| plan-phase | AskUserQuestion | Multiple valid approaches |
| plan-phase | AskUserQuestion | Decision gate before writing |
| research-phase | AskUserQuestion | Low confidence findings |
| research-phase | Inline | Open questions acknowledgment |
| execute-phase | Inline | Verification failure |
| execute-phase | Inline | Issues review before proceeding |
| execute-phase | AskUserQuestion | Previous phase had issues |
| create-brief | AskUserQuestion | Decision gate before writing |
| create-roadmap | Inline | Confirm phase breakdown |
| create-roadmap | AskUserQuestion | Decision gate before writing |
| handoff | Inline | Handoff acknowledgment |
## Good vs Bad Gating
### Good
- Gate before writing artifacts (not after)
- Gate when genuinely ambiguous
- Gate when issues affect next steps
- Quick inline for simple confirmations
### Bad
- Asking obvious choices ("Should I save the file?")
- Multiple gates for same decision
- AskUserQuestion for yes/no
- Gates after the fact

View File

@@ -0,0 +1,157 @@
# Brief Template
## Greenfield Brief (v1.0)
Copy and fill this structure for `.planning/BRIEF.md` when starting a new project:
```markdown
# [Project Name]
**One-liner**: [What this is in one sentence]
## Problem
[What problem does this solve? Why does it need to exist?
2-3 sentences max.]
## Success Criteria
How we know it worked:
- [ ] [Measurable outcome 1]
- [ ] [Measurable outcome 2]
- [ ] [Measurable outcome 3]
## Constraints
[Any hard constraints: tech stack, timeline, budget, dependencies]
- [Constraint 1]
- [Constraint 2]
## Out of Scope
What we're NOT building (prevents scope creep):
- [Not doing X]
- [Not doing Y]
```
<guidelines>
- Keep under 50 lines
- Success criteria must be measurable/verifiable
- Out of scope prevents "while we're at it" creep
- This is the ONLY human-focused document
</guidelines>
## Brownfield Brief (v1.1+)
After shipping v1.0, update BRIEF.md to include current state:
```markdown
# [Project Name]
## Current State (Updated: YYYY-MM-DD)
**Shipped:** v[X.Y] [Name] (YYYY-MM-DD)
**Status:** [Production / Beta / Internal / Live with users]
**Users:** [If known: "~500 downloads, 50 DAU" or "Internal use only" or "N/A"]
**Feedback:** [Key themes from user feedback, or "Initial release, gathering feedback"]
**Codebase:**
- [X,XXX] lines of [primary language]
- [Key tech stack: framework, platform, deployment target]
- [Notable dependencies or architecture]
**Known Issues:**
- [Issue 1 from v1.x that needs addressing]
- [Issue 2]
- [Or "None" if clean slate]
## v[Next] Goals
**Vision:** [What's the goal for this next iteration?]
**Motivation:**
- [Why this work matters now]
- [User feedback driving it]
- [Technical debt or improvements needed]
**Scope (v[X.Y]):**
- [Feature/improvement 1]
- [Feature/improvement 2]
- [Feature/improvement 3]
**Success Criteria:**
- [ ] [Measurable outcome 1]
- [ ] [Measurable outcome 2]
- [ ] [Measurable outcome 3]
**Out of Scope:**
- [Not doing X in this version]
- [Not doing Y in this version]
---
<details>
<summary>Original Vision (v1.0 - Archived for reference)</summary>
**One-liner**: [What this is in one sentence]
## Problem
[What problem does this solve? Why does it need to exist?]
## Success Criteria
How we know it worked:
- [x] [Outcome 1] - Achieved
- [x] [Outcome 2] - Achieved
- [x] [Outcome 3] - Achieved
## Constraints
- [Constraint 1]
- [Constraint 2]
## Out of Scope
- [Not doing X]
- [Not doing Y]
</details>
```
<brownfield_guidelines>
**When to update BRIEF:**
- After completing each milestone (v1.0 → v1.1 → v2.0)
- When starting new phases after a shipped version
- Use `complete-milestone.md` workflow to update systematically
**Current State captures:**
- What shipped (version, date)
- Real-world status (production, beta, etc.)
- User metrics (if applicable)
- User feedback themes
- Codebase stats (LOC, tech stack)
- Known issues needing attention
**Next Goals captures:**
- Vision for next version
- Why now (motivation)
- What's in scope
- What's measurable
- What's explicitly out
**Original Vision:**
- Collapsed in `<details>` tag
- Reference for "where we came from"
- Shows evolution of product thinking
- Checkboxes marked [x] for achieved goals
This structure makes all new plans brownfield-aware automatically because they read BRIEF and see:
- "v1.0 shipped"
- "2,450 lines of existing Swift code"
- "Users reporting X, requesting Y"
- Plans naturally reference existing files in @context
</brownfield_guidelines>

View File

@@ -0,0 +1,78 @@
# Continue-Here Template
Copy and fill this structure for `.planning/phases/XX-name/.continue-here.md`:
```yaml
---
phase: XX-name
task: 3
total_tasks: 7
status: in_progress
last_updated: 2025-01-15T14:30:00Z
---
```
```markdown
<current_state>
[Where exactly are we? What's the immediate context?]
</current_state>
<completed_work>
[What got done this session - be specific]
- Task 1: [name] - Done
- Task 2: [name] - Done
- Task 3: [name] - In progress, [what's done on it]
</completed_work>
<remaining_work>
[What's left in this phase]
- Task 3: [name] - [what's left to do]
- Task 4: [name] - Not started
- Task 5: [name] - Not started
</remaining_work>
<decisions_made>
[Key decisions and why - so next session doesn't re-debate]
- Decided to use [X] because [reason]
- Chose [approach] over [alternative] because [reason]
</decisions_made>
<blockers>
[Anything stuck or waiting on external factors]
- [Blocker 1]: [status/workaround]
</blockers>
<context>
[Mental state, "vibe", anything that helps resume smoothly]
[What were you thinking about? What was the plan?
This is the "pick up exactly where you left off" context.]
</context>
<next_action>
[The very first thing to do when resuming]
Start with: [specific action]
</next_action>
```
<yaml_fields>
Required YAML frontmatter:
- `phase`: Directory name (e.g., `02-authentication`)
- `task`: Current task number
- `total_tasks`: How many tasks in phase
- `status`: `in_progress`, `blocked`, `almost_done`
- `last_updated`: ISO timestamp
</yaml_fields>
<guidelines>
- Be specific enough that a fresh Claude instance understands immediately
- Include WHY decisions were made, not just what
- The `<next_action>` should be actionable without reading anything else
- This file gets DELETED after resume - it's not permanent storage
</guidelines>

View File

@@ -0,0 +1,91 @@
# ISSUES.md Template
This file is auto-created when Rule 5 (Log non-critical enhancements) is first triggered during execution.
Location: `.planning/ISSUES.md`
```markdown
# Project Issues Log
Non-critical enhancements discovered during execution. Address in future phases when appropriate.
## Open Enhancements
### ISS-001: [Brief description]
- **Discovered:** Phase [X] Plan [Y] Task [Z] (YYYY-MM-DD)
- **Type:** [Performance / Refactoring / UX / Testing / Documentation / Accessibility]
- **Description:** [What could be improved and why it would help]
- **Impact:** Low (works correctly, this would enhance)
- **Effort:** [Quick (<1hr) / Medium (1-4hr) / Substantial (>4hr)]
- **Suggested phase:** [Phase number where this makes sense, or "Future"]
### ISS-002: Add connection pooling for Redis
- **Discovered:** Phase 2 Plan 3 Task 6 (2025-11-23)
- **Type:** Performance
- **Description:** Redis client creates new connection per request. Connection pooling would reduce latency and handle connection failures better. Currently works but suboptimal under load.
- **Impact:** Low (works correctly, ~20ms overhead per request)
- **Effort:** Medium (2-3 hours - need to configure ioredis pool, test connection reuse)
- **Suggested phase:** Phase 5 (Performance optimization)
### ISS-003: Refactor UserService into smaller modules
- **Discovered:** Phase 1 Plan 2 Task 3 (2025-11-22)
- **Type:** Refactoring
- **Description:** UserService has grown to 400 lines with mixed concerns (auth, profile, settings). Would be cleaner as separate services (AuthService, ProfileService, SettingsService). Currently works but harder to test and reason about.
- **Impact:** Low (works correctly, just organizational)
- **Effort:** Substantial (4-6 hours - need to split, update imports, ensure no breakage)
- **Suggested phase:** Phase 7 (Code health milestone)
## Closed Enhancements
### ISS-XXX: [Brief description]
- **Status:** Resolved in Phase [X] Plan [Y] (YYYY-MM-DD)
- **Resolution:** [What was done]
- **Benefit:** [How it improved the codebase]
---
**Summary:** [X] open, [Y] closed
**Priority queue:** [List ISS numbers in priority order, or "Address as time permits"]
```
## Usage Guidelines
**When issues are added:**
- Auto-increment ISS numbers (ISS-001, ISS-002, etc.)
- Always include discovery context (Phase/Plan/Task and date)
- Be specific about impact and effort
- Suggested phase helps with roadmap planning
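One way to keep the numbering consistent is a quick scan before adding an entry - a small sketch, assuming the log already exists at `.planning/ISSUES.md`:
```bash
# Find the highest existing ISS number and print the next one
last=$(grep -oE 'ISS-[0-9]+' .planning/ISSUES.md | sed 's/ISS-//' | sort -n | tail -1)
printf 'Next issue: ISS-%03d\n' $(( 10#${last:-0} + 1 ))
```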
**When issues are resolved:**
- Move to "Closed Enhancements" section
- Document resolution and benefit
- Keeps history for reference
**Prioritization:**
- Quick wins (Quick effort, visible benefit) → Earlier phases
- Substantial refactors (Substantial effort, organizational benefit) → Dedicated "code health" phases
- Nice-to-haves (Low impact, high effort) → "Future" or never
**Integration with roadmap:**
- When planning new phases, scan ISSUES.md for relevant items
- Can create phases specifically for addressing accumulated issues
- Example: "Phase 8: Code Health - Address ISS-003, ISS-007, ISS-012"
## Example: Issues Driving Phase Planning
```markdown
# Roadmap excerpt
### Phase 6: Performance Optimization (Planned)
**Milestone Goal:** Address performance issues discovered during v1.0 usage
**Includes:**
- ISS-002: Redis connection pooling (Medium effort)
- ISS-015: Database query optimization (Quick)
- ISS-021: Image lazy loading (Medium)
**Excludes ISS-003 (refactoring):** Saving for dedicated code health phase
```
This creates traceability: enhancement discovered → logged → planned → addressed → documented.

View File

@@ -0,0 +1,115 @@
# Milestone Entry Template
Add this entry to `.planning/MILESTONES.md` when completing a milestone:
```markdown
## v[X.Y] [Name] (Shipped: YYYY-MM-DD)
**Delivered:** [One sentence describing what shipped]
**Phases completed:** [X-Y] ([Z] plans total)
**Key accomplishments:**
- [Major achievement 1]
- [Major achievement 2]
- [Major achievement 3]
- [Major achievement 4]
**Stats:**
- [X] files created/modified
- [Y] lines of code (primary language)
- [Z] phases, [N] plans, [M] tasks
- [D] days from start to ship (or milestone to milestone)
**Git range:** `feat(XX-XX)``feat(YY-YY)`
**What's next:** [Brief description of next milestone goals, or "Project complete"]
---
```
<structure>
If MILESTONES.md doesn't exist, create it with header:
```markdown
# Project Milestones: [Project Name]
[Entries in reverse chronological order - newest first]
```
</structure>
<guidelines>
**When to create milestones:**
- Initial v1.0 MVP shipped
- Major version releases (v2.0, v3.0)
- Significant feature milestones (v1.1, v1.2)
- Before archiving planning (capture what was shipped)
**Don't create milestones for:**
- Individual phase completions (normal workflow)
- Work in progress (wait until shipped)
- Minor bug fixes that don't constitute a release
**Stats to include:**
- Count modified files: `git diff --stat feat(XX-XX)..feat(YY-YY) | tail -1`
- Count LOC: `find . -name "*.swift" -o -name "*.ts" | xargs wc -l` (or relevant extension)
- Phase/plan/task counts from ROADMAP
- Timeline from first phase commit to last phase commit
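For the timeline stat, one option is to locate the boundary commits by their message prefixes - a sketch using the example range below; substitute the real prefixes or SHAs:
```bash
# Days between the first and last milestone commits, matched by commit message prefix
first=$(git log --reverse --format=%ct --grep='feat(01-01)' | head -1)
last=$(git log --format=%ct --grep='feat(04-01)' | head -1)
echo "$(( (last - first) / 86400 )) days from start to ship"
```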
**Git range format:**
- First commit of milestone → last commit of milestone
- Example: `feat(01-01)` → `feat(04-01)` for phases 1-4
</guidelines>
<example>
```markdown
# Project Milestones: WeatherBar
## v1.1 Security & Polish (Shipped: 2025-12-10)
**Delivered:** Security hardening with Keychain integration and comprehensive error handling
**Phases completed:** 5-6 (3 plans total)
**Key accomplishments:**
- Migrated API key storage from plaintext to macOS Keychain
- Implemented comprehensive error handling for network failures
- Added Sentry crash reporting integration
- Fixed memory leak in auto-refresh timer
**Stats:**
- 23 files modified
- 650 lines of Swift added
- 2 phases, 3 plans, 12 tasks
- 8 days from v1.0 to v1.1
**Git range:** `feat(05-01)` → `feat(06-02)`
**What's next:** v2.0 SwiftUI redesign with widget support
---
## v1.0 MVP (Shipped: 2025-11-25)
**Delivered:** Menu bar weather app with current conditions and 3-day forecast
**Phases completed:** 1-4 (7 plans total)
**Key accomplishments:**
- Menu bar app with popover UI (AppKit)
- OpenWeather API integration with auto-refresh
- Current weather display with conditions icon
- 3-day forecast list with high/low temperatures
- Code signed and notarized for distribution
**Stats:**
- 47 files created
- 2,450 lines of Swift
- 4 phases, 7 plans, 28 tasks
- 12 days from start to ship
**Git range:** `feat(01-01)` → `feat(04-01)`
**What's next:** Security audit and hardening for v1.1
```
</example>

View File

@@ -0,0 +1,233 @@
# Phase Prompt Template
Copy and fill this structure for `.planning/phases/XX-name/{phase}-{plan}-PLAN.md`:
**Naming:** Use `{phase}-{plan}-PLAN.md` format (e.g., `01-02-PLAN.md` for Phase 1, Plan 2)
```markdown
---
phase: XX-name
type: execute
domain: [optional - if domain skill loaded]
---
<objective>
[What this phase accomplishes - from roadmap phase goal]
Purpose: [Why this matters for the project]
Output: [What artifacts will be created]
</objective>
<execution_context>
@~/.claude/skills/create-plans/workflows/execute-phase.md
@~/.claude/skills/create-plans/templates/summary.md
[If plan contains checkpoint tasks (type="checkpoint:*"), add:]
@~/.claude/skills/create-plans/references/checkpoints.md
</execution_context>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
[If research exists:]
@.planning/phases/XX-name/FINDINGS.md
[Relevant source files:]
@src/path/to/relevant.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: [Action-oriented name]</name>
<files>path/to/file.ext, another/file.ext</files>
<action>[Specific implementation - what to do, how to do it, what to avoid and WHY]</action>
<verify>[Command or check to prove it worked]</verify>
<done>[Measurable acceptance criteria]</done>
</task>
<task type="auto">
<name>Task 2: [Action-oriented name]</name>
<files>path/to/file.ext</files>
<action>[Specific implementation]</action>
<verify>[Command or check]</verify>
<done>[Acceptance criteria]</done>
</task>
<task type="checkpoint:decision" gate="blocking">
<decision>[What needs deciding]</decision>
<context>[Why this decision matters]</context>
<options>
<option id="option-a">
<name>[Option name]</name>
<pros>[Benefits and advantages]</pros>
<cons>[Tradeoffs and limitations]</cons>
</option>
<option id="option-b">
<name>[Option name]</name>
<pros>[Benefits and advantages]</pros>
<cons>[Tradeoffs and limitations]</cons>
</option>
</options>
<resume-signal>[How to indicate choice - "Select: option-a or option-b"]</resume-signal>
</task>
<task type="auto">
<name>Task 3: [Action-oriented name]</name>
<files>path/to/file.ext</files>
<action>[Specific implementation]</action>
<verify>[Command or check]</verify>
<done>[Acceptance criteria]</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[What Claude just built that needs verification]</what-built>
<how-to-verify>
1. Run: [command to start dev server/app]
2. Visit: [URL to check]
3. Test: [Specific interactions]
4. Confirm: [Expected behaviors]
</how-to-verify>
<resume-signal>Type "approved" to continue, or describe issues to fix</resume-signal>
</task>
[Continue for all tasks - mix of auto and checkpoints as needed...]
</tasks>
<verification>
Before declaring phase complete:
- [ ] [Specific test command]
- [ ] [Build/type check passes]
- [ ] [Behavior verification]
</verification>
<success_criteria>
- All tasks completed
- All verification checks pass
- No errors or warnings introduced
- [Phase-specific criteria]
</success_criteria>
<output>
After completion, create `.planning/phases/XX-name/{phase}-{plan}-SUMMARY.md`:
# Phase [X] Plan [Y]: [Name] Summary
**[Substantive one-liner - what shipped, not "phase complete"]**
## Accomplishments
- [Key outcome 1]
- [Key outcome 2]
## Files Created/Modified
- `path/to/file.ts` - Description
- `path/to/another.ts` - Description
## Decisions Made
[Key decisions and rationale, or "None"]
## Issues Encountered
[Problems and resolutions, or "None"]
## Next Step
[If more plans in this phase: "Ready for {phase}-{next-plan}-PLAN.md"]
[If phase complete: "Phase complete, ready for next phase"]
</output>
```
<key_elements>
From create-meta-prompts patterns:
- XML structure for Claude parsing
- @context references for file loading
- Task types: auto, checkpoint:human-action, checkpoint:human-verify, checkpoint:decision
- Action includes "what to avoid and WHY" (from intelligence-rules)
- Verification is specific and executable
- Success criteria is measurable
- Output specification includes SUMMARY.md structure
**Scope guidance:**
- Aim for 2-3 tasks per plan
- If planning more than 3 tasks, split into multiple plans (01-01, 01-02, etc.)
- Target ~50% context usage maximum
- See references/scope-estimation.md for splitting guidance
</key_elements>
<good_examples>
```markdown
---
phase: 01-foundation
type: execute
domain: next-js
---
<objective>
Set up Next.js project with authentication foundation.
Purpose: Establish the core structure and auth patterns all features depend on.
Output: Working Next.js app with JWT auth, protected routes, and user model.
</objective>
<execution_context>
@~/.claude/skills/create-plans/workflows/execute-phase.md
@~/.claude/skills/create-plans/templates/summary.md
</execution_context>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
@src/lib/db.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Add User model to database schema</name>
<files>prisma/schema.prisma</files>
<action>Add User model with fields: id (cuid), email (unique), passwordHash, createdAt, updatedAt. Add Session relation. Use @db.VarChar(255) for email to prevent index issues.</action>
<verify>npx prisma validate passes, npx prisma generate succeeds</verify>
<done>Schema valid, types generated, no errors</done>
</task>
<task type="auto">
<name>Task 2: Create login API endpoint</name>
<files>src/app/api/auth/login/route.ts</files>
<action>POST endpoint that accepts {email, password}, validates against User table using bcrypt, returns JWT in httpOnly cookie with 15-min expiry. Use jose library for JWT (not jsonwebtoken - it has CommonJS issues with Next.js).</action>
<verify>curl -X POST /api/auth/login -d '{"email":"test@test.com","password":"test"}' -H "Content-Type: application/json" returns 200 with Set-Cookie header</verify>
<done>Valid credentials return 200 + cookie, invalid return 401, missing fields return 400</done>
</task>
</tasks>
<verification>
Before declaring phase complete:
- [ ] `npm run build` succeeds without errors
- [ ] `npx prisma validate` passes
- [ ] Login endpoint responds correctly to valid/invalid credentials
- [ ] Protected route redirects unauthenticated users
</verification>
<success_criteria>
- All tasks completed
- All verification checks pass
- No TypeScript errors
- JWT auth flow works end-to-end
</success_criteria>
<output>
After completion, create `.planning/phases/01-foundation/01-01-SUMMARY.md`
</output>
```
</good_examples>
<bad_examples>
```markdown
# Phase 1: Foundation
## Tasks
### Task 1: Set up authentication
**Action**: Add auth to the app
**Done when**: Users can log in
```
This is useless. No XML structure, no @context, no verification, no specificity.
</bad_examples>

View File

@@ -0,0 +1,274 @@
# Research Prompt Template
For phases requiring research before planning:
```markdown
---
phase: XX-name
type: research
topic: [research-topic]
---
<session_initialization>
Before beginning research, verify today's date:
!`date +%Y-%m-%d`
Use this date when searching for "current" or "latest" information.
Example: If today is 2025-11-22, search for "2025" not "2024".
</session_initialization>
<research_objective>
Research [topic] to inform [phase name] implementation.
Purpose: [What decision/implementation this enables]
Scope: [Boundaries]
Output: FINDINGS.md with structured recommendations
</research_objective>
<research_scope>
<include>
- [Question to answer]
- [Area to investigate]
- [Specific comparison if needed]
</include>
<exclude>
- [Out of scope for this research]
- [Defer to implementation phase]
</exclude>
<sources>
Official documentation (with exact URLs when known):
- https://example.com/official-docs
- https://example.com/api-reference
Search queries for WebSearch:
- "[topic] best practices {current_year}"
- "[topic] latest version"
Context7 MCP for library docs
Prefer current/recent sources (check date above)
</sources>
</research_scope>
<verification_checklist>
{If researching configuration/architecture with known components:}
□ Enumerate ALL known options/scopes (list them explicitly):
□ Option/Scope 1: [description]
□ Option/Scope 2: [description]
□ Option/Scope 3: [description]
□ Document exact file locations/URLs for each option
□ Verify precedence/hierarchy rules if applicable
□ Check for recent updates or changes to documentation
{For all research:}
□ Verify negative claims ("X is not possible") with official docs
□ Confirm all primary claims have authoritative sources
□ Check both current docs AND recent updates/changelogs
□ Test multiple search queries to avoid missing information
□ Check for environment/tool-specific variations
</verification_checklist>
<research_quality_assurance>
Before completing research, perform these checks:
<completeness_check>
- [ ] All enumerated options/components documented with evidence
- [ ] Official documentation cited for critical claims
- [ ] Contradictory information resolved or flagged
</completeness_check>
<blind_spots_review>
Ask yourself: "What might I have missed?"
- [ ] Are there configuration/implementation options I didn't investigate?
- [ ] Did I check for multiple environments/contexts?
- [ ] Did I verify claims that seem definitive ("cannot", "only", "must")?
- [ ] Did I look for recent changes or updates to documentation?
</blind_spots_review>
<critical_claims_audit>
For any statement like "X is not possible" or "Y is the only way":
- [ ] Is this verified by official documentation?
- [ ] Have I checked for recent updates that might change this?
- [ ] Are there alternative approaches I haven't considered?
</critical_claims_audit>
</research_quality_assurance>
<incremental_output>
**CRITICAL: Write findings incrementally to prevent token limit failures**
Instead of generating full FINDINGS.md at the end:
1. Create FINDINGS.md with structure skeleton
2. Write each finding as you discover it (append immediately)
3. Add code examples as found (append immediately)
4. Finalize summary and metadata at end
This ensures zero lost work if token limits are hit.
<workflow>
Step 1 - Initialize:
```bash
# Create skeleton file
cat > .planning/phases/XX-name/FINDINGS.md <<'EOF'
# [Topic] Research Findings
## Summary
[Will complete at end]
## Recommendations
[Will complete at end]
## Key Findings
[Append findings here as discovered]
## Code Examples
[Append examples here as found]
## Metadata
[Will complete at end]
EOF
```
Step 2 - Append findings as discovered:
After researching each aspect, immediately append to Key Findings section
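For example, one portable way to slot a new finding into the end of Key Findings (it inserts just above the `## Code Examples` heading; the finding text here is purely illustrative):
```bash
awk '/^## Code Examples/{print "### Edge runtime support\n- jose runs on the Edge runtime (https://github.com/panva/jose)\n- Relevance: our middleware executes on Edge\n"}1' \
  .planning/phases/XX-name/FINDINGS.md > tmp && mv tmp .planning/phases/XX-name/FINDINGS.md
```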
Step 3 - Finalize at end:
Complete Summary, Recommendations, and Metadata sections
</workflow>
</incremental_output>
<output_structure>
Create `.planning/phases/XX-name/FINDINGS.md`:
# [Topic] Research Findings
## Summary
[2-3 paragraph executive summary]
## Recommendations
### Primary Recommendation
[What to do and why]
### Alternatives Considered
[What else was evaluated]
## Key Findings
### [Category 1]
- Finding with source URL
- Relevance to our case
### [Category 2]
- Finding with source URL
- Relevance
## Code Examples
[Relevant patterns, if applicable]
## Metadata
<metadata>
<confidence level="high|medium|low">
[Why this confidence level]
</confidence>
<dependencies>
[What's needed to proceed]
</dependencies>
<open_questions>
[What couldn't be determined]
</open_questions>
<assumptions>
[What was assumed]
</assumptions>
<quality_report>
<sources_consulted>
[List URLs of official documentation and primary sources]
</sources_consulted>
<claims_verified>
[Key findings verified with official sources]
</claims_verified>
<claims_assumed>
[Findings based on inference or incomplete information]
</claims_assumed>
<confidence_by_finding>
- Finding 1: High (official docs + multiple sources)
- Finding 2: Medium (single source)
- Finding 3: Low (inferred, requires verification)
</confidence_by_finding>
</quality_report>
</metadata>
</output_structure>
<success_criteria>
- All scope questions answered
- All verification checklist items completed
- Sources are current and authoritative
- Clear primary recommendation
- Metadata captures uncertainties
- Quality report distinguishes verified from assumed
- Ready to inform PLAN.md creation
</success_criteria>
```
<when_to_use>
Create RESEARCH.md before PLAN.md when:
- Technology choice unclear
- Best practices needed for unfamiliar domain
- API/library investigation required
- Architecture decision pending
- Multiple valid approaches exist
</when_to_use>
<example>
```markdown
---
phase: 02-auth
type: research
topic: JWT library selection for Next.js App Router
---
<research_objective>
Research JWT libraries to determine best option for Next.js 14 App Router authentication.
Purpose: Select JWT library before implementing auth endpoints
Scope: Compare jose, jsonwebtoken, and @auth/core for our use case
Output: FINDINGS.md with library recommendation
</research_objective>
<research_scope>
<include>
- ESM/CommonJS compatibility with Next.js 14
- Edge runtime support
- Token creation and validation patterns
- Community adoption and maintenance
</include>
<exclude>
- Full auth framework comparison (NextAuth vs custom)
- OAuth provider configuration
- Session storage strategies
</exclude>
<sources>
Official documentation (prioritize):
- https://github.com/panva/jose
- https://github.com/auth0/node-jsonwebtoken
Context7 MCP for library docs
Prefer current/recent sources
</sources>
</research_scope>
<success_criteria>
- Clear recommendation with rationale
- Code examples for selected library
- Known limitations documented
- Verification checklist completed
</success_criteria>
```
</example>

View File

@@ -0,0 +1,200 @@
# Roadmap Template
Copy and fill this structure for `.planning/ROADMAP.md`:
## Initial Roadmap (v1.0 Greenfield)
```markdown
# Roadmap: [Project Name]
## Overview
[One paragraph describing the journey from start to finish]
## Phases
- [ ] **Phase 1: [Name]** - [One-line description]
- [ ] **Phase 2: [Name]** - [One-line description]
- [ ] **Phase 3: [Name]** - [One-line description]
- [ ] **Phase 4: [Name]** - [One-line description]
## Phase Details
### Phase 1: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Nothing (first phase)
**Plans**: [Number of plans, e.g., "3 plans" or "TBD after research"]
Plans:
- [ ] 01-01: [Brief description of first plan]
- [ ] 01-02: [Brief description of second plan]
- [ ] 01-03: [Brief description of third plan]
### Phase 2: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 1
**Plans**: [Number of plans]
Plans:
- [ ] 02-01: [Brief description]
### Phase 3: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 2
**Plans**: [Number of plans]
Plans:
- [ ] 03-01: [Brief description]
- [ ] 03-02: [Brief description]
### Phase 4: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 3
**Plans**: [Number of plans]
Plans:
- [ ] 04-01: [Brief description]
## Progress
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. [Name] | 0/3 | Not started | - |
| 2. [Name] | 0/1 | Not started | - |
| 3. [Name] | 0/2 | Not started | - |
| 4. [Name] | 0/1 | Not started | - |
```
<guidelines>
**Initial planning (v1.0):**
- 3-6 phases total (more = scope creep)
- Each phase delivers something coherent
- Phases can have 1+ plans (split if >7 tasks or multiple subsystems)
- Plans use naming: {phase}-{plan}-PLAN.md (e.g., 01-02-PLAN.md)
- No time estimates (this isn't enterprise PM)
- Progress table updated by transition workflow
- Plan count can be "TBD" initially, refined during planning
**After milestones ship:**
- Reorganize with milestone groupings (see below)
- Collapse completed milestones in `<details>` tags
- Add new milestone sections for upcoming work
- Keep continuous phase numbering (never restart at 01)
</guidelines>
<status_values>
- `Not started` - Haven't begun
- `In progress` - Currently working
- `Complete` - Done (add completion date)
- `Deferred` - Pushed to later (with reason)
</status_values>
## Milestone-Grouped Roadmap (After v1.0 Ships)
After completing first milestone, reorganize roadmap with milestone groupings:
```markdown
# Roadmap: [Project Name]
## Milestones
- ✅ **v1.0 MVP** - Phases 1-4 (shipped YYYY-MM-DD)
- 🚧 **v1.1 [Name]** - Phases 5-6 (in progress)
- 📋 **v2.0 [Name]** - Phases 7-10 (planned)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED YYYY-MM-DD</summary>
### Phase 1: [Name]
**Goal**: [What this phase delivers]
**Plans**: 3 plans
Plans:
- [x] 01-01: [Brief description]
- [x] 01-02: [Brief description]
- [x] 01-03: [Brief description]
### Phase 2: [Name]
**Goal**: [What this phase delivers]
**Plans**: 2 plans
Plans:
- [x] 02-01: [Brief description]
- [x] 02-02: [Brief description]
### Phase 3: [Name]
**Goal**: [What this phase delivers]
**Plans**: 2 plans
Plans:
- [x] 03-01: [Brief description]
- [x] 03-02: [Brief description]
### Phase 4: [Name]
**Goal**: [What this phase delivers]
**Plans**: 1 plan
Plans:
- [x] 04-01: [Brief description]
</details>
### 🚧 v1.1 [Name] (In Progress)
**Milestone Goal:** [What v1.1 delivers]
#### Phase 5: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 4
**Plans**: 1 plan
Plans:
- [ ] 05-01: [Brief description]
#### Phase 6: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 5
**Plans**: 2 plans
Plans:
- [ ] 06-01: [Brief description]
- [ ] 06-02: [Brief description]
### 📋 v2.0 [Name] (Planned)
**Milestone Goal:** [What v2.0 delivers]
#### Phase 7: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 6
**Plans**: 3 plans
Plans:
- [ ] 07-01: [Brief description]
- [ ] 07-02: [Brief description]
- [ ] 07-03: [Brief description]
[... additional phases for v2.0 ...]
## Progress
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 1. Foundation | v1.0 | 3/3 | Complete | YYYY-MM-DD |
| 2. Features | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 3. Polish | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 4. Launch | v1.0 | 1/1 | Complete | YYYY-MM-DD |
| 5. Security | v1.1 | 0/1 | Not started | - |
| 6. Hardening | v1.1 | 0/2 | Not started | - |
| 7. Redesign Core | v2.0 | 0/3 | Not started | - |
```
**Notes:**
- Milestone emoji: ✅ shipped, 🚧 in progress, 📋 planned
- Completed milestones collapsed in `<details>` for readability
- Current/future milestones expanded
- Continuous phase numbering (01-99)
- Progress table includes milestone column

View File

@@ -0,0 +1,148 @@
# Summary Template
Use this structure for phase-completion SUMMARY.md files:
```markdown
# Phase [X]: [Name] Summary
**[Substantive one-liner describing outcome - NOT "phase complete" or "implementation finished"]**
## Accomplishments
- [Most important outcome]
- [Second key accomplishment]
- [Third if applicable]
## Files Created/Modified
- `path/to/file.ts` - What it does
- `path/to/another.ts` - What it does
## Decisions Made
[Key decisions with brief rationale, or "None - followed plan as specified"]
## Deviations from Plan
[If no deviations: "None - plan executed exactly as written"]
[If deviations occurred:]
### Auto-fixed Issues
**1. [Rule X - Category] Brief description**
- **Found during:** Task [N] ([task name])
- **Issue:** [What was wrong]
- **Fix:** [What was done]
- **Files modified:** [file paths]
- **Verification:** [How it was verified]
- **Commit:** [hash]
[... repeat for each auto-fix ...]
### Deferred Enhancements
Logged to .planning/ISSUES.md for future consideration:
- ISS-XXX: [Brief description] (discovered in Task [N])
- ISS-XXX: [Brief description] (discovered in Task [N])
---
**Total deviations:** [N] auto-fixed ([breakdown by rule]), [N] deferred
**Impact on plan:** [Brief assessment - e.g., "All auto-fixes necessary for correctness/security. No scope creep."]
## Issues Encountered
[Problems and how they were resolved, or "None"]
[Note: "Deviations from Plan" documents unplanned work that was handled automatically via deviation rules. "Issues Encountered" documents problems during planned work that required problem-solving.]
## Next Phase Readiness
[What's ready for next phase]
[Any blockers or concerns]
---
*Phase: XX-name*
*Completed: [date]*
```
<one_liner_rules>
The one-liner MUST be substantive:
**Good:**
- "JWT auth with refresh rotation using jose library"
- "Prisma schema with User, Session, and Product models"
- "Dashboard with real-time metrics via Server-Sent Events"
**Bad:**
- "Phase complete"
- "Authentication implemented"
- "Foundation finished"
- "All tasks done"
The one-liner should tell someone what actually shipped.
</one_liner_rules>
<example>
```markdown
# Phase 1: Foundation Summary
**JWT auth with refresh rotation using jose library, Prisma User model, and protected API middleware**
## Accomplishments
- User model with email/password auth
- Login/logout endpoints with httpOnly JWT cookies
- Protected route middleware checking token validity
- Refresh token rotation on each request
## Files Created/Modified
- `prisma/schema.prisma` - User and Session models
- `src/app/api/auth/login/route.ts` - Login endpoint
- `src/app/api/auth/logout/route.ts` - Logout endpoint
- `src/middleware.ts` - Protected route checks
- `src/lib/auth.ts` - JWT helpers using jose
## Decisions Made
- Used jose instead of jsonwebtoken (ESM-native, Edge-compatible)
- 15-min access tokens with 7-day refresh tokens
- Storing refresh tokens in database for revocation capability
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 2 - Missing Critical] Added password hashing with bcrypt**
- **Found during:** Task 2 (Login endpoint implementation)
- **Issue:** Plan didn't specify password hashing - storing plaintext would be critical security flaw
- **Fix:** Added bcrypt hashing on registration, comparison on login with salt rounds 10
- **Files modified:** src/app/api/auth/login/route.ts, src/lib/auth.ts
- **Verification:** Password hash test passes, plaintext never stored
- **Commit:** abc123f
**2. [Rule 3 - Blocking] Installed missing jose dependency**
- **Found during:** Task 4 (JWT token generation)
- **Issue:** jose package not in package.json, import failing
- **Fix:** Ran `npm install jose`
- **Files modified:** package.json, package-lock.json
- **Verification:** Import succeeds, build passes
- **Commit:** def456g
### Deferred Enhancements
Logged to .planning/ISSUES.md for future consideration:
- ISS-001: Add rate limiting to login endpoint (discovered in Task 2)
- ISS-002: Improve token refresh UX with auto-retry on 401 (discovered in Task 5)
---
**Total deviations:** 2 auto-fixed (1 missing critical, 1 blocking), 2 deferred
**Impact on plan:** Both auto-fixes essential for security and functionality. No scope creep.
## Issues Encountered
- jsonwebtoken CommonJS import failed in Edge runtime - switched to jose (planned library change, worked as expected)
## Next Phase Readiness
- Auth foundation complete, ready for feature development
- User registration endpoint needed before public launch
---
*Phase: 01-foundation*
*Completed: 2025-01-15*
```
</example>

View File

@@ -0,0 +1,366 @@
# Workflow: Complete Milestone
<required_reading>
**Read these files NOW:**
1. templates/milestone.md
2. `.planning/ROADMAP.md`
3. `.planning/BRIEF.md`
</required_reading>
<purpose>
Mark a shipped version (v1.0, v1.1, v2.0) as complete. This creates a historical record in MILESTONES.md, updates BRIEF.md with current state, reorganizes ROADMAP.md with milestone groupings, and tags the release in git.
This is the ritual that separates "development" from "shipped."
</purpose>
<process>
<step name="verify_readiness">
Check if milestone is truly complete:
```bash
cat .planning/ROADMAP.md
ls .planning/phases/*/*-SUMMARY.md 2>/dev/null | wc -l
```
**Questions to ask:**
- Which phases belong to this milestone?
- Are all those phases complete (all plans have summaries)?
- Has the work been tested/validated?
- Is this ready to ship/tag?
Present:
```
Milestone: [Name from user, e.g., "v1.0 MVP"]
Appears to include:
- Phase 1: Foundation (2/2 plans complete)
- Phase 2: Authentication (2/2 plans complete)
- Phase 3: Core Features (3/3 plans complete)
- Phase 4: Polish (1/1 plan complete)
Total: 4 phases, 8 plans, all complete
Ready to mark this milestone as shipped?
(yes / wait / adjust scope)
```
Wait for confirmation.
If "adjust scope": Ask which phases should be included.
If "wait": Stop, user will return when ready.
</step>
<step name="gather_stats">
Calculate milestone statistics:
```bash
# Count phases and plans in milestone
# (user specified or detected from roadmap)
# Find git range
git log --oneline --grep="feat(" | head -20
# Count files modified in range
git diff --stat FIRST_COMMIT..LAST_COMMIT | tail -1
# Count LOC (adapt to language)
find . -name "*.swift" -o -name "*.ts" -o -name "*.py" | xargs wc -l 2>/dev/null
# Calculate timeline
git log -1 --format="%ai" FIRST_COMMIT # Start date (date of first milestone commit)
git log -1 --format="%ai" LAST_COMMIT # End date (date of last milestone commit)
```
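If commits follow the `feat(XX-XX): ...` prefix convention shown above, one way to resolve FIRST_COMMIT and LAST_COMMIT (a sketch; phases 01 and 04 are illustrative milestone boundaries):
```bash
FIRST_COMMIT=$(git log --reverse --oneline --grep="feat(01-" | head -1 | cut -d' ' -f1)
LAST_COMMIT=$(git log --oneline --grep="feat(04-" | head -1 | cut -d' ' -f1)
```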
Present summary:
```
Milestone Stats:
- Phases: [X-Y]
- Plans: [Z] total
- Tasks: [N] total (estimated from phase summaries)
- Files modified: [M]
- Lines of code: [LOC] [language]
- Timeline: [Days] days ([Start] → [End])
- Git range: feat(XX-XX) → feat(YY-YY)
```
Confirm before proceeding.
</step>
<step name="extract_accomplishments">
Read all phase SUMMARY.md files in milestone range:
```bash
cat .planning/phases/01-*/01-*-SUMMARY.md
cat .planning/phases/02-*/02-*-SUMMARY.md
# ... for each phase in milestone
```
From summaries, extract 4-6 key accomplishments.
Present:
```
Key accomplishments for this milestone:
1. [Achievement from phase 1]
2. [Achievement from phase 2]
3. [Achievement from phase 3]
4. [Achievement from phase 4]
5. [Achievement from phase 5]
Does this capture the milestone? (yes / adjust)
```
If "adjust": User can add/remove/edit accomplishments.
</step>
<step name="create_milestone_entry">
Create or update `.planning/MILESTONES.md`.
If file doesn't exist:
```markdown
# Project Milestones: [Project Name from BRIEF]
[New entry]
```
If exists, prepend new entry (reverse chronological order).
Use template from `templates/milestone.md`:
```markdown
## v[Version] [Name] (Shipped: YYYY-MM-DD)
**Delivered:** [One sentence from user]
**Phases completed:** [X-Y] ([Z] plans total)
**Key accomplishments:**
- [List from previous step]
**Stats:**
- [Files] files created/modified
- [LOC] lines of [language]
- [Phases] phases, [Plans] plans, [Tasks] tasks
- [Days] days from [start milestone or start project] to ship
**Git range:** `feat(XX-XX)` → `feat(YY-YY)`
**What's next:** [Ask user: what's the next goal?]
---
```
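A minimal shell sketch of the prepend, assuming MILESTONES.md begins with its `# Project Milestones:` title line and the new entry has been drafted in a temp file (in practice you may simply edit the file directly):
```bash
head -1 .planning/MILESTONES.md > /tmp/MILESTONES.md      # keep the title line
cat /tmp/new-entry.md >> /tmp/MILESTONES.md               # newest entry goes right below it
tail -n +2 .planning/MILESTONES.md >> /tmp/MILESTONES.md  # then the older entries
mv /tmp/MILESTONES.md .planning/MILESTONES.md
```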
Confirm entry looks correct.
</step>
<step name="update_brief">
Update `.planning/BRIEF.md` to reflect current state.
Add/update "Current State" section at top (after YAML if present):
```markdown
# Project Brief: [Name]
## Current State (Updated: YYYY-MM-DD)
**Shipped:** v[X.Y] [Name] (YYYY-MM-DD)
**Status:** [Production / Beta / Internal]
**Users:** [If known, e.g., "~500 downloads, 50 DAU" or "Internal use only"]
**Feedback:** [Key themes from users, or "Initial release, gathering feedback"]
**Codebase:** [LOC] [language], [key tech stack], [platform/deployment target]
## [Next Milestone] Goals
**Vision:** [What's the goal for next version?]
**Motivation:**
- [Why this next work matters]
- [User feedback driving it]
- [Technical debt or improvements needed]
**Scope (v[X.Y]):**
- [Feature/improvement 1]
- [Feature/improvement 2]
- [Feature/improvement 3]
---
<details>
<summary>Original Vision (v1.0 - Archived for reference)</summary>
[Move original brief content here]
</details>
```
**If this is v1.0 (first milestone):**
Just add "Current State" section, no need to archive original vision yet.
**If this is v1.1+:**
Collapse previous version's content into `<details>` section.
Show diff, confirm changes.
</step>
<step name="reorganize_roadmap">
Update `.planning/ROADMAP.md` to group completed milestone phases.
Add milestone headers and collapse completed work:
```markdown
# Roadmap: [Project Name]
## Milestones
- ✅ **v1.0 MVP** - Phases 1-4 (shipped YYYY-MM-DD)
- 🚧 **v1.1 Security** - Phases 5-6 (in progress)
- 📋 **v2.0 Redesign** - Phases 7-10 (planned)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED YYYY-MM-DD</summary>
- [x] Phase 1: Foundation (2/2 plans) - completed YYYY-MM-DD
- [x] Phase 2: Authentication (2/2 plans) - completed YYYY-MM-DD
- [x] Phase 3: Core Features (3/3 plans) - completed YYYY-MM-DD
- [x] Phase 4: Polish (1/1 plan) - completed YYYY-MM-DD
</details>
### 🚧 v[Next] [Name] (In Progress / Planned)
- [ ] Phase 5: [Name] ([N] plans)
- [ ] Phase 6: [Name] ([N] plans)
## Progress
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 1. Foundation | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 2. Authentication | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 3. Core Features | v1.0 | 3/3 | Complete | YYYY-MM-DD |
| 4. Polish | v1.0 | 1/1 | Complete | YYYY-MM-DD |
| 5. Security Audit | v1.1 | 0/1 | Not started | - |
| 6. Hardening | v1.1 | 0/2 | Not started | - |
```
Show diff, confirm changes.
</step>
<step name="git_tag">
Create git tag for milestone:
```bash
git tag -a v[X.Y] -m "$(cat <<'EOF'
v[X.Y] [Name]
Delivered: [One sentence]
Key accomplishments:
- [Item 1]
- [Item 2]
- [Item 3]
See .planning/MILESTONES.md for full details.
EOF
)"
```
Confirm: "Tagged: v[X.Y]"
Ask: "Push tag to remote? (y/n)"
If yes:
```bash
git push origin v[X.Y]
```
</step>
<step name="git_commit_milestone">
Commit milestone completion (MILESTONES.md + BRIEF.md + ROADMAP.md updates):
```bash
git add .planning/MILESTONES.md
git add .planning/BRIEF.md
git add .planning/ROADMAP.md
git commit -m "$(cat <<'EOF'
chore: milestone v[X.Y] [Name] shipped
- Added MILESTONES.md entry
- Updated BRIEF.md current state
- Reorganized ROADMAP.md with milestone grouping
- Tagged v[X.Y]
EOF
)"
```
Confirm: "Committed: chore: milestone v[X.Y] shipped"
</step>
<step name="offer_next">
```
✅ Milestone v[X.Y] [Name] complete
Shipped:
- [N] phases ([M] plans, [P] tasks)
- [One sentence of what shipped]
Summary: .planning/MILESTONES.md
Tag: v[X.Y]
Next steps:
1. Plan next milestone work (add phases to roadmap)
2. Archive and start fresh (for major rewrite/new codebase)
3. Take a break (done for now)
```
Wait for user decision.
If "1": Route to workflows/plan-phase.md (but ask about milestone scope first)
If "2": Route to workflows/archive-planning.md (to be created)
</step>
</process>
<milestone_naming>
**Version conventions:**
- **v1.0** - Initial MVP
- **v1.1, v1.2, v1.3** - Minor updates, new features, fixes
- **v2.0, v3.0** - Major rewrites, breaking changes, significant new direction
**Name conventions:**
- v1.0 MVP
- v1.1 Security
- v1.2 Performance
- v2.0 Redesign
- v2.0 iOS Launch
Keep names short (1-2 words describing the focus).
</milestone_naming>
<what_qualifies>
**Create milestones for:**
- Initial release (v1.0)
- Public releases
- Major feature sets shipped
- Before archiving planning
**Don't create milestones for:**
- Every phase completion (too granular)
- Work in progress (wait until shipped)
- Internal dev iterations (unless truly shipped internally)
If uncertain, ask: "Is this deployed/usable/shipped in some form?"
If yes → milestone. If no → keep working.
</what_qualifies>
<success_criteria>
Milestone completion is successful when:
- [ ] MILESTONES.md entry created with stats and accomplishments
- [ ] BRIEF.md updated with current state
- [ ] ROADMAP.md reorganized with milestone grouping
- [ ] Git tag created (v[X.Y])
- [ ] Milestone commit made
- [ ] User knows next steps
</success_criteria>

Some files were not shown because too many files have changed in this diff