Initial commit
This commit is contained in:
529
skills/brainstorming/SKILL.md
Normal file
529
skills/brainstorming/SKILL.md
Normal file
@@ -0,0 +1,529 @@
|
||||
---
|
||||
name: brainstorming
|
||||
description: Use when creating or developing anything, before writing code - refines rough ideas into bd epics with immutable requirements
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Turn rough ideas into validated designs stored as bd epics with immutable requirements; tasks created iteratively as you learn, not upfront.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
HIGH FREEDOM - Adapt Socratic questioning to context, but always create immutable epic before code and only create first task (not full tree).
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Step | Action | Deliverable |
|
||||
|------|--------|-------------|
|
||||
| 1 | Ask questions (one at a time) | Understanding of requirements |
|
||||
| 2 | Research (agents for codebase/internet) | Existing patterns and approaches |
|
||||
| 3 | Propose 2-3 approaches with trade-offs | Recommended option |
|
||||
| 4 | Present design in sections (200-300 words) | Validated architecture |
|
||||
| 5 | Create bd epic with IMMUTABLE requirements | Epic with anti-patterns |
|
||||
| 6 | Create ONLY first task | Ready for executing-plans |
|
||||
| 7 | Hand off to executing-plans | Iterative implementation begins |
|
||||
|
||||
**Key:** Epic = contract (immutable), Tasks = adaptive (created as you learn)
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
- User describes new feature to implement
|
||||
- User has rough idea that needs refinement
|
||||
- About to write code without clear requirements
|
||||
- Need to explore approaches before committing
|
||||
- Requirements exist but architecture unclear
|
||||
|
||||
**Don't use for:**
|
||||
- Executing existing plans (use hyperpowers:executing-plans)
|
||||
- Fixing bugs (use hyperpowers:fixing-bugs)
|
||||
- Refactoring (use hyperpowers:refactoring-safely)
|
||||
- Requirements already crystal clear and epic exists
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## 1. Understanding the Idea
|
||||
|
||||
**Announce:** "I'm using the brainstorming skill to refine your idea into a design."
|
||||
|
||||
**Check current state:**
|
||||
- Recent commits, existing docs, codebase structure
|
||||
- Dispatch `hyperpowers:codebase-investigator` for existing patterns
|
||||
- Dispatch `hyperpowers:internet-researcher` for external APIs/libraries
|
||||
|
||||
**REQUIRED: Use AskUserQuestion tool for all questions**
|
||||
- One question at a time (don't batch multiple questions)
|
||||
- Prefer multiple choice options (easier to answer)
|
||||
- Wait for response before asking next question
|
||||
- Focus on: purpose, constraints, success criteria
|
||||
- Gather enough context to propose approaches
|
||||
|
||||
**Do NOT just print questions and wait for "yes"** - use the AskUserQuestion tool.
|
||||
|
||||
**Example questions:**
|
||||
- "What problem does this solve for users?"
|
||||
- "Are there existing implementations we should follow?"
|
||||
- "What's the most important success criterion?"
|
||||
- "Token storage: cookies, localStorage, or sessionStorage?"
|
||||
|
||||
---
|
||||
|
||||
## 2. Exploring Approaches
|
||||
|
||||
**Research first:**
|
||||
- Similar feature exists → dispatch codebase-investigator
|
||||
- New integration → dispatch internet-researcher
|
||||
- Review findings before proposing
|
||||
|
||||
**Propose 2-3 approaches with trade-offs:**
|
||||
|
||||
```
|
||||
Based on [research findings], I recommend:
|
||||
|
||||
1. **[Approach A]** (recommended)
|
||||
- Pros: [benefits, especially "matches existing pattern"]
|
||||
- Cons: [drawbacks]
|
||||
|
||||
2. **[Approach B]**
|
||||
- Pros: [benefits]
|
||||
- Cons: [drawbacks]
|
||||
|
||||
3. **[Approach C]**
|
||||
- Pros: [benefits]
|
||||
- Cons: [drawbacks]
|
||||
|
||||
I recommend option 1 because [specific reason, especially codebase consistency].
|
||||
```
|
||||
|
||||
**Lead with recommended option and explain why.**
|
||||
|
||||
---
|
||||
|
||||
## 3. Presenting the Design
|
||||
|
||||
**Once approach is chosen, present design in sections:**
|
||||
- Break into 200-300 word chunks
|
||||
- Ask after each: "Does this look right so far?"
|
||||
- Cover: architecture, components, data flow, error handling, testing
|
||||
- Be ready to go back and clarify
|
||||
|
||||
**Show research findings:**
|
||||
- "Based on codebase investigation: auth/ uses passport.js..."
|
||||
- "API docs show OAuth flow requires..."
|
||||
- Demonstrate how design builds on existing code
|
||||
|
||||
---
|
||||
|
||||
## 4. Creating the bd Epic
|
||||
|
||||
**After design validated, create epic as immutable contract:**
|
||||
|
||||
```bash
|
||||
bd create "Feature: [Feature Name]" \
|
||||
--type epic \
|
||||
--priority [0-4] \
|
||||
--design "## Requirements (IMMUTABLE)
|
||||
[What MUST be true when complete - specific, testable]
|
||||
- Requirement 1: [concrete requirement]
|
||||
- Requirement 2: [concrete requirement]
|
||||
- Requirement 3: [concrete requirement]
|
||||
|
||||
## Success Criteria (MUST ALL BE TRUE)
|
||||
- [ ] Criterion 1 (objective, testable - e.g., 'Integration tests pass')
|
||||
- [ ] Criterion 2 (objective, testable - e.g., 'Works with existing User model')
|
||||
- [ ] All tests passing
|
||||
- [ ] Pre-commit hooks passing
|
||||
|
||||
## Anti-Patterns (FORBIDDEN)
|
||||
- ❌ [Specific shortcut that violates requirements]
|
||||
- ❌ [Rationalization to prevent - e.g., 'NO mocking core behavior']
|
||||
- ❌ [Pattern to avoid - e.g., 'NO localStorage for tokens']
|
||||
|
||||
## Approach
|
||||
[2-3 paragraph summary of chosen approach]
|
||||
|
||||
## Architecture
|
||||
[Key components, data flow, integration points]
|
||||
|
||||
## Context
|
||||
[Links to similar implementations: file.ts:123]
|
||||
[External docs consulted]
|
||||
[Agent research findings]"
|
||||
```
|
||||
|
||||
**Critical:** Anti-patterns section prevents watering down requirements when blockers occur.
|
||||
|
||||
**Example anti-patterns:**
|
||||
- ❌ NO localStorage tokens (violates httpOnly security requirement)
|
||||
- ❌ NO new user model (must integrate with existing db/models/user.ts)
|
||||
- ❌ NO mocking OAuth in integration tests (defeats validation)
|
||||
- ❌ NO TODO stubs for core authentication flow
|
||||
|
||||
---
|
||||
|
||||
## 5. Creating ONLY First Task
|
||||
|
||||
**Create one task, not full tree:**
|
||||
|
||||
```bash
|
||||
bd create "Task 1: [Specific Deliverable]" \
|
||||
--type feature \
|
||||
--priority [match-epic] \
|
||||
--design "## Goal
|
||||
[What this task delivers - one clear outcome]
|
||||
|
||||
## Implementation
|
||||
[Detailed step-by-step for this task]
|
||||
|
||||
1. Study existing code
|
||||
[Point to 2-3 similar implementations: file.ts:line]
|
||||
|
||||
2. Write tests first (TDD)
|
||||
[Specific test cases for this task]
|
||||
|
||||
3. Implementation checklist
|
||||
- [ ] file.ts:line - function_name() - [exactly what it does]
|
||||
- [ ] test.ts:line - test_name() - [what scenario it tests]
|
||||
|
||||
## Success Criteria
|
||||
- [ ] [Specific, measurable outcome]
|
||||
- [ ] Tests passing
|
||||
- [ ] Pre-commit hooks passing"
|
||||
|
||||
bd dep add bd-2 bd-1 --type parent-child # Link to epic
|
||||
```
|
||||
|
||||
**Why only one task?**
|
||||
- Subsequent tasks created iteratively by executing-plans
|
||||
- Each task reflects learnings from previous
|
||||
- Avoids brittle task trees that break when assumptions change
|
||||
|
||||
---
|
||||
|
||||
## 6. SRE Refinement and Handoff
|
||||
|
||||
After epic and first task created:
|
||||
|
||||
**REQUIRED: Run SRE refinement before handoff**
|
||||
|
||||
```
|
||||
Use Skill tool: hyperpowers:sre-task-refinement
|
||||
```
|
||||
|
||||
SRE refinement will:
|
||||
- Apply 7-category corner-case analysis (Opus 4.1)
|
||||
- Strengthen success criteria
|
||||
- Identify edge cases and failure modes
|
||||
- Ensure task is ready for implementation
|
||||
|
||||
**Do NOT skip SRE refinement.** The first task sets the pattern for the entire epic.
|
||||
|
||||
**After refinement approved, present handoff:**
|
||||
|
||||
```
|
||||
"Epic bd-1 is ready with immutable requirements and success criteria.
|
||||
First task bd-2 has been refined and is ready to execute.
|
||||
|
||||
Ready to start implementation? I'll use executing-plans to work through this iteratively.
|
||||
|
||||
The executing-plans skill will:
|
||||
1. Execute the current task
|
||||
2. Review what was learned against epic requirements
|
||||
3. Create next task based on current reality
|
||||
4. Run SRE refinement on new tasks
|
||||
5. Repeat until all epic success criteria met
|
||||
|
||||
This approach avoids brittle upfront planning - each task adapts to what we learn."
|
||||
```
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer skips research, proposes approach without checking codebase</scenario>
|
||||
|
||||
<code>
|
||||
User: "Add OAuth authentication"
|
||||
|
||||
Claude (without brainstorming):
|
||||
"I'll implement OAuth with Auth0..."
|
||||
[Proposes approach without checking if auth exists]
|
||||
[Doesn't research existing patterns]
|
||||
[Misses that passport.js already set up]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Proposes Auth0 when passport.js already exists in codebase
|
||||
- Creates inconsistent architecture (two auth systems)
|
||||
- Wastes time implementing when partial solution exists
|
||||
- Doesn't leverage existing code
|
||||
- User has to redirect to existing pattern
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
1. **Research first:**
|
||||
- Dispatch codebase-investigator: "Find existing auth implementation"
|
||||
- Findings: passport.js at auth/passport-config.ts
|
||||
- Dispatch internet-researcher: "Passport OAuth2 strategies"
|
||||
|
||||
2. **Propose approaches building on findings:**
|
||||
```
|
||||
Based on codebase showing passport.js at auth/passport-config.ts:
|
||||
|
||||
1. Extend existing passport setup (recommended)
|
||||
- Add google-oauth20 strategy
|
||||
- Matches codebase pattern
|
||||
- Pros: Consistent, tested library
|
||||
- Cons: Requires OAuth provider setup
|
||||
|
||||
2. Custom JWT implementation
|
||||
- Pros: Full control
|
||||
- Cons: Security complexity, breaks pattern
|
||||
|
||||
I recommend option 1 because it builds on existing auth/ setup.
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Leverages existing code (faster)
|
||||
- Consistent architecture (maintainable)
|
||||
- Research informs design (correct)
|
||||
- User sees you understand codebase (trust)
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer creates full task tree upfront</scenario>
|
||||
|
||||
<code>
|
||||
bd create "Epic: Add OAuth"
|
||||
bd create "Task 1: Configure OAuth provider"
|
||||
bd create "Task 2: Implement token exchange"
|
||||
bd create "Task 3: Add refresh token logic"
|
||||
bd create "Task 4: Create middleware"
|
||||
bd create "Task 5: Add UI components"
|
||||
bd create "Task 6: Write integration tests"
|
||||
|
||||
# Starts implementing Task 1
|
||||
# Discovers OAuth library handles refresh automatically
|
||||
# Now Task 3 is wrong, needs deletion
|
||||
# Discovers middleware already exists
|
||||
# Now Task 4 is wrong
|
||||
# Task tree brittle to reality
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Assumptions about implementation prove wrong
|
||||
- Task tree becomes incorrect as you learn
|
||||
- Wastes time updating/deleting wrong tasks
|
||||
- Rigid plan fights with reality
|
||||
- Context switching between fixing plan and implementing
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach (iterative):**
|
||||
|
||||
```bash
|
||||
bd create "Epic: Add OAuth" [with immutable requirements]
|
||||
bd create "Task 1: Configure OAuth provider"
|
||||
|
||||
# Execute Task 1
|
||||
# Learn: OAuth library handles refresh, middleware exists
|
||||
|
||||
bd create "Task 2: Integrate with existing middleware"
|
||||
# [Created AFTER learning from Task 1]
|
||||
|
||||
# Execute Task 2
|
||||
# Learn: UI needs OAuth button component
|
||||
|
||||
bd create "Task 3: Add OAuth button to login UI"
|
||||
# [Created AFTER learning from Task 2]
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Tasks reflect current reality (accurate)
|
||||
- No wasted time fixing wrong plans (efficient)
|
||||
- Each task informed by previous learnings (adaptive)
|
||||
- Plan evolves with understanding (flexible)
|
||||
- Epic requirements stay immutable (contract preserved)
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Epic created without anti-patterns section</scenario>
|
||||
|
||||
<code>
|
||||
bd create "Epic: OAuth Authentication" --design "
|
||||
## Requirements
|
||||
- Users authenticate via Google OAuth2
|
||||
- Tokens stored securely
|
||||
- Session management
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Login flow works
|
||||
- [ ] Tokens secured
|
||||
- [ ] All tests pass
|
||||
"
|
||||
|
||||
# During implementation, hits blocker:
|
||||
# "Integration tests for OAuth are complex, I'll mock it..."
|
||||
# [No anti-pattern preventing this]
|
||||
# Ships with mocked OAuth (defeats validation)
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- No explicit forbidden patterns
|
||||
- Agent rationalizes shortcuts when blocked
|
||||
- "Tokens stored securely" too vague (localStorage? cookies?)
|
||||
- Requirements can be "met" without meeting intent
|
||||
- Mocking defeats the purpose of integration tests
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach with anti-patterns:**
|
||||
|
||||
```bash
|
||||
bd create "Epic: OAuth Authentication" --design "
|
||||
## Requirements (IMMUTABLE)
|
||||
- Users authenticate via Google OAuth2
|
||||
- Tokens stored in httpOnly cookies (NOT localStorage)
|
||||
- Session expires after 24h inactivity
|
||||
- Integrates with existing User model at db/models/user.ts
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Login redirects to Google and back
|
||||
- [ ] Tokens in httpOnly cookies
|
||||
- [ ] Token refresh works automatically
|
||||
- [ ] Integration tests pass WITHOUT mocking OAuth
|
||||
- [ ] All tests passing
|
||||
|
||||
## Anti-Patterns (FORBIDDEN)
|
||||
- ❌ NO localStorage tokens (violates httpOnly requirement)
|
||||
- ❌ NO new user model (must use existing)
|
||||
- ❌ NO mocking OAuth in integration tests (defeats validation)
|
||||
- ❌ NO skipping token refresh (explicit requirement)
|
||||
"
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Requirements concrete and specific (testable)
|
||||
- Forbidden patterns explicit (prevents shortcuts)
|
||||
- Agent can't rationalize away requirements (contract enforced)
|
||||
- Success criteria unambiguous (clear done state)
|
||||
- Anti-patterns prevent "letter not spirit" compliance
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<key_principles>
|
||||
- **One question at a time** - Don't overwhelm
|
||||
- **Multiple choice preferred** - Easier to answer when possible
|
||||
- **Delegate research** - Use codebase-investigator and internet-researcher agents
|
||||
- **YAGNI ruthlessly** - Remove unnecessary features from all designs
|
||||
- **Explore alternatives** - Propose 2-3 approaches before settling
|
||||
- **Incremental validation** - Present design in sections, validate each
|
||||
- **Epic is contract** - Requirements immutable, tasks adapt
|
||||
- **Anti-patterns prevent shortcuts** - Explicit forbidden patterns stop rationalization
|
||||
- **One task only** - Subsequent tasks created iteratively (not upfront)
|
||||
</key_principles>
|
||||
|
||||
<research_agents>
|
||||
## Use codebase-investigator when:
|
||||
- Understanding how existing features work
|
||||
- Finding where specific functionality lives
|
||||
- Identifying patterns to follow
|
||||
- Verifying assumptions about structure
|
||||
- Checking if feature already exists
|
||||
|
||||
## Use internet-researcher when:
|
||||
- Finding current API documentation
|
||||
- Researching library capabilities
|
||||
- Comparing technology options
|
||||
- Understanding community recommendations
|
||||
- Finding official code examples
|
||||
|
||||
## Research protocol:
|
||||
1. Codebase pattern exists → Use it (unless clearly unwise)
|
||||
2. No codebase pattern → Research external patterns
|
||||
3. Research yields nothing → Ask user for direction
|
||||
</research_agents>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Use AskUserQuestion tool** → Don't just print questions and wait
|
||||
2. **Research BEFORE proposing** → Use agents to understand context
|
||||
3. **Propose 2-3 approaches** → Don't jump to single solution
|
||||
4. **Epic requirements IMMUTABLE** → Tasks adapt, requirements don't
|
||||
5. **Include anti-patterns section** → Prevents watering down requirements
|
||||
6. **Create ONLY first task** → Subsequent tasks created iteratively
|
||||
7. **Run SRE refinement** → Before handoff to executing-plans
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Follow the process.**
|
||||
|
||||
- "Requirements obvious, don't need questions" (Questions reveal hidden complexity)
|
||||
- "I know this pattern, don't need research" (Research might show better way)
|
||||
- "Can plan all tasks upfront" (Plans become brittle, tasks adapt as you learn)
|
||||
- "Anti-patterns section overkill" (Prevents rationalization under pressure)
|
||||
- "Epic can evolve" (Requirements contract, tasks evolve)
|
||||
- "Can just print questions" (Use AskUserQuestion tool - it's more interactive)
|
||||
- "SRE refinement overkill for first task" (First task sets pattern for entire epic)
|
||||
- "User said yes, design is done" (Still need SRE refinement before execution)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before handing off to executing-plans:
|
||||
|
||||
- [ ] Used AskUserQuestion tool for clarifying questions (one at a time)
|
||||
- [ ] Researched codebase patterns (if applicable)
|
||||
- [ ] Researched external docs/libraries (if applicable)
|
||||
- [ ] Proposed 2-3 approaches with trade-offs
|
||||
- [ ] Presented design in sections, validated each
|
||||
- [ ] Created bd epic with all sections (requirements, success criteria, anti-patterns, approach, architecture)
|
||||
- [ ] Requirements are IMMUTABLE and specific
|
||||
- [ ] Anti-patterns section prevents shortcuts
|
||||
- [ ] Created ONLY first task (not full tree)
|
||||
- [ ] First task has detailed implementation checklist
|
||||
- [ ] Ran SRE refinement on first task (hyperpowers:sre-task-refinement)
|
||||
- [ ] Announced handoff to executing-plans after refinement approved
|
||||
|
||||
**Can't check all boxes?** Return to process and complete missing steps.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill calls:**
|
||||
- hyperpowers:codebase-investigator (for finding existing patterns)
|
||||
- hyperpowers:internet-researcher (for external documentation)
|
||||
- hyperpowers:sre-task-refinement (REQUIRED before handoff to executing-plans)
|
||||
- hyperpowers:executing-plans (handoff after refinement approved)
|
||||
|
||||
**Call chain:**
|
||||
```
|
||||
brainstorming → sre-task-refinement → executing-plans
|
||||
```
|
||||
|
||||
**This skill is called by:**
|
||||
- hyperpowers:using-hyper (mandatory before writing code)
|
||||
- User requests for new features
|
||||
- Beginning of greenfield development
|
||||
|
||||
**Agents used:**
|
||||
- codebase-investigator (understand existing code)
|
||||
- internet-researcher (find external documentation)
|
||||
|
||||
**Tools required:**
|
||||
- AskUserQuestion (for all clarifying questions)
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [bd epic template examples](resources/epic-templates.md)
|
||||
- [Socratic questioning patterns](resources/questioning-patterns.md)
|
||||
- [Anti-pattern examples by domain](resources/anti-patterns.md)
|
||||
|
||||
**When stuck:**
|
||||
- User gives vague answer → Ask follow-up multiple choice question
|
||||
- Research yields nothing → Ask user for direction explicitly
|
||||
- Too many approaches → Narrow to top 2-3, explain why others eliminated
|
||||
- User changes requirements mid-design → Acknowledge, return to understanding phase
|
||||
</resources>
|
||||
609
skills/building-hooks/SKILL.md
Normal file
609
skills/building-hooks/SKILL.md
Normal file
@@ -0,0 +1,609 @@
|
||||
---
|
||||
name: building-hooks
|
||||
description: Use when creating Claude Code hooks - covers hook patterns, composition, testing, progressive enhancement from simple to advanced
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Hooks encode business rules at application level; start with observation, add automation, enforce only when patterns clear.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
MEDIUM FREEDOM - Follow progressive enhancement (observe → automate → enforce) strictly. Hook patterns are adaptable, but always start non-blocking and test thoroughly.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Phase | Approach | Example |
|
||||
|-------|----------|---------|
|
||||
| 1. Observe | Non-blocking, report only | Log edits, display reminders |
|
||||
| 2. Automate | Background tasks, non-blocking | Auto-format, run builds |
|
||||
| 3. Enforce | Blocking only when necessary | Block dangerous ops, require fixes |
|
||||
|
||||
**Most used events:** UserPromptSubmit (before processing), Stop (after completion)
|
||||
|
||||
**Critical:** Start Phase 1, observe for a week, then Phase 2. Only add Phase 3 if absolutely necessary.
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
Use hooks for:
|
||||
- Automatic quality checks (build, lint, format)
|
||||
- Workflow automation (skill activation, context injection)
|
||||
- Error prevention (catching issues early)
|
||||
- Consistent behavior (formatting, conventions)
|
||||
|
||||
**Never use hooks for:**
|
||||
- Complex business logic (use tools/scripts)
|
||||
- Slow operations that block workflow (use background jobs)
|
||||
- Anything requiring LLM reasoning (hooks are deterministic)
|
||||
</when_to_use>
|
||||
|
||||
<hook_lifecycle_events>
|
||||
| Event | When Fires | Use Cases |
|
||||
|-------|------------|-----------|
|
||||
| UserPromptSubmit | Before Claude processes prompt | Validation, context injection, skill activation |
|
||||
| Stop | After Claude finishes | Build checks, formatting, quality reminders |
|
||||
| PostToolUse | After each tool execution | Logging, tracking, validation |
|
||||
| PreToolUse | Before tool execution | Permission checks, validation |
|
||||
| ToolError | When tool fails | Error handling, fallbacks |
|
||||
| SessionStart | New session begins | Environment setup, context loading |
|
||||
| SessionEnd | Session closes | Cleanup, logging |
|
||||
| Error | Unhandled error | Error recovery, notifications |
|
||||
</hook_lifecycle_events>
|
||||
|
||||
<progressive_enhancement>
|
||||
## Phase 1: Observation (Non-Blocking)
|
||||
|
||||
**Goal:** Understand patterns before acting
|
||||
|
||||
**Examples:**
|
||||
- Log file edits (PostToolUse)
|
||||
- Display reminders (Stop, non-blocking)
|
||||
- Track metrics
|
||||
|
||||
**Duration:** Observe for 1 week minimum
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Automation (Background)
|
||||
|
||||
**Goal:** Automate tedious tasks
|
||||
|
||||
**Examples:**
|
||||
- Auto-format edited files (Stop)
|
||||
- Run builds after changes (Stop)
|
||||
- Inject helpful context (UserPromptSubmit)
|
||||
|
||||
**Requirement:** Fast (<2 seconds), non-blocking
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Enforcement (Blocking)
|
||||
|
||||
**Goal:** Prevent errors, enforce standards
|
||||
|
||||
**Examples:**
|
||||
- Block dangerous operations (PreToolUse)
|
||||
- Require fixes before continuing (Stop, blocking)
|
||||
- Validate inputs (UserPromptSubmit, blocking)
|
||||
|
||||
**Requirement:** Only add when patterns clear from Phase 1-2
|
||||
</progressive_enhancement>
|
||||
|
||||
<common_hook_patterns>
|
||||
## Pattern 1: Build Checker (Stop Hook)
|
||||
|
||||
**Problem:** TypeScript errors left behind
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Stop hook - runs after Claude finishes
|
||||
|
||||
# Check modified repos
|
||||
modified_repos=$(grep -h "edited" ~/.claude/edit-log.txt | cut -d: -f1 | sort -u)
|
||||
|
||||
for repo in $modified_repos; do
|
||||
echo "Building $repo..."
|
||||
cd "$repo" && npm run build 2>&1 | tee /tmp/build-output.txt
|
||||
|
||||
error_count=$(grep -c "error TS" /tmp/build-output.txt || echo "0")
|
||||
|
||||
if [ "$error_count" -gt 0 ]; then
|
||||
if [ "$error_count" -ge 5 ]; then
|
||||
echo "⚠️ Found $error_count errors - consider error-resolver agent"
|
||||
else
|
||||
echo "🔴 Found $error_count TypeScript errors:"
|
||||
grep "error TS" /tmp/build-output.txt
|
||||
fi
|
||||
else
|
||||
echo "✅ Build passed"
|
||||
fi
|
||||
done
|
||||
```
|
||||
|
||||
**Configuration:**
|
||||
```json
|
||||
{
|
||||
"event": "Stop",
|
||||
"command": "~/.claude/hooks/build-checker.sh",
|
||||
"description": "Run builds on modified repos",
|
||||
"blocking": false
|
||||
}
|
||||
```
|
||||
|
||||
**Result:** Zero errors left behind
|
||||
|
||||
---
|
||||
|
||||
## Pattern 2: Auto-Formatter (Stop Hook)
|
||||
|
||||
**Problem:** Inconsistent formatting
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Stop hook - format all edited files
|
||||
|
||||
edited_files=$(tail -20 ~/.claude/edit-log.txt | grep "^/" | sort -u)
|
||||
|
||||
for file in $edited_files; do
|
||||
repo_dir=$(dirname "$file")
|
||||
while [ "$repo_dir" != "/" ]; do
|
||||
if [ -f "$repo_dir/.prettierrc" ]; then
|
||||
echo "Formatting $file..."
|
||||
cd "$repo_dir" && npx prettier --write "$file"
|
||||
break
|
||||
fi
|
||||
repo_dir=$(dirname "$repo_dir")
|
||||
done
|
||||
done
|
||||
|
||||
echo "✅ Formatting complete"
|
||||
```
|
||||
|
||||
**Result:** All code consistently formatted
|
||||
|
||||
---
|
||||
|
||||
## Pattern 3: Error Handling Reminder (Stop Hook)
|
||||
|
||||
**Problem:** Claude forgets error handling
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Stop hook - gentle reminder
|
||||
|
||||
edited_files=$(tail -20 ~/.claude/edit-log.txt | grep "^/")
|
||||
|
||||
risky_patterns=0
|
||||
for file in $edited_files; do
|
||||
if grep -q "try\|catch\|async\|await\|prisma\|router\." "$file"; then
|
||||
((risky_patterns++))
|
||||
fi
|
||||
done
|
||||
|
||||
if [ "$risky_patterns" -gt 0 ]; then
|
||||
cat <<EOF
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
📋 ERROR HANDLING SELF-CHECK
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
|
||||
⚠️ Risky Patterns Detected
|
||||
$risky_patterns file(s) with async/try-catch/database operations
|
||||
|
||||
❓ Did you add proper error handling?
|
||||
❓ Are errors logged appropriately?
|
||||
|
||||
💡 Consider: Sentry.captureException(), proper logging
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
EOF
|
||||
fi
|
||||
```
|
||||
|
||||
**Result:** Claude self-checks without blocking
|
||||
|
||||
---
|
||||
|
||||
## Pattern 4: Skills Auto-Activation
|
||||
|
||||
**See:** hyperpowers:skills-auto-activation for complete implementation
|
||||
|
||||
**Summary:** Analyzes prompt keywords, injects skill activation reminder before Claude processes.
|
||||
</common_hook_patterns>
|
||||
|
||||
<hook_composition>
|
||||
## Naming for Order Control
|
||||
|
||||
Multiple hooks for same event run in **alphabetical order** by filename.
|
||||
|
||||
**Use numeric prefixes:**
|
||||
|
||||
```
|
||||
hooks/
|
||||
├── 00-log-prompt.sh # First (logging)
|
||||
├── 10-inject-context.sh # Second (context)
|
||||
├── 20-activate-skills.sh # Third (skills)
|
||||
└── 99-notify.sh # Last (notifications)
|
||||
```
|
||||
|
||||
## Hook Dependencies
|
||||
|
||||
If Hook B depends on Hook A's output:
|
||||
|
||||
1. **Option 1:** Numeric prefixes (A before B)
|
||||
2. **Option 2:** Combine into single hook
|
||||
3. **Option 3:** File-based communication
|
||||
|
||||
**Example:**
|
||||
```bash
|
||||
# 10-track-edits.sh writes to edit-log.txt
|
||||
# 20-check-builds.sh reads from edit-log.txt
|
||||
```
|
||||
</hook_composition>
|
||||
|
||||
<testing_hooks>
|
||||
## Test in Isolation
|
||||
|
||||
```bash
|
||||
# Manually trigger
|
||||
bash ~/.claude/hooks/build-checker.sh
|
||||
|
||||
# Check exit code
|
||||
echo $? # 0 = success
|
||||
```
|
||||
|
||||
## Test with Mock Data
|
||||
|
||||
```bash
|
||||
# Create mock log
|
||||
echo "/path/to/test/file.ts" > /tmp/test-edit-log.txt
|
||||
|
||||
# Run with test data
|
||||
EDIT_LOG=/tmp/test-edit-log.txt bash ~/.claude/hooks/build-checker.sh
|
||||
```
|
||||
|
||||
## Test Non-Blocking Behavior
|
||||
|
||||
- Hook exits quickly (<2 seconds)
|
||||
- Doesn't block Claude
|
||||
- Provides clear output
|
||||
|
||||
## Test Blocking Behavior
|
||||
|
||||
- Blocking decision correct
|
||||
- Reason message helpful
|
||||
- Escape hatch exists
|
||||
|
||||
## Debugging
|
||||
|
||||
**Enable logging:**
|
||||
```bash
|
||||
set -x # Debug output
|
||||
exec 2>~/.claude/hooks/debug.log
|
||||
```
|
||||
|
||||
**Check execution:**
|
||||
```bash
|
||||
tail -f ~/.claude/logs/hooks.log
|
||||
```
|
||||
|
||||
**Common issues:**
|
||||
- Timeout (>10 second default)
|
||||
- Wrong working directory
|
||||
- Missing environment variables
|
||||
- File permissions
|
||||
</testing_hooks>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer adds blocking hook immediately without observation</scenario>
|
||||
|
||||
<code>
|
||||
# Developer frustrated by TypeScript errors
|
||||
# Creates blocking Stop hook immediately:
|
||||
|
||||
#!/bin/bash
|
||||
npm run build
|
||||
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "BUILD FAILED - BLOCKING"
|
||||
exit 1 # Blocks Claude
|
||||
fi
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- No observation period to understand patterns
|
||||
- Blocks even for minor errors
|
||||
- No escape hatch if hook misbehaves
|
||||
- Might block during experimentation
|
||||
- Frustrates workflow when building is slow
|
||||
- Haven't identified when blocking is actually needed
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Phase 1: Observe (1 week)**
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Non-blocking observation
|
||||
npm run build 2>&1 | tee /tmp/build.log
|
||||
|
||||
if grep -q "error TS" /tmp/build.log; then
|
||||
echo "🔴 Build errors found (not blocking)"
|
||||
fi
|
||||
```
|
||||
|
||||
**After 1 week, review:**
|
||||
- How often do errors appear?
|
||||
- Are they usually fixed quickly?
|
||||
- Do they cause real problems or just noise?
|
||||
|
||||
**Phase 2: If errors are frequent, automate**
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Still non-blocking, but more helpful
|
||||
npm run build 2>&1 | tee /tmp/build.log
|
||||
|
||||
error_count=$(grep -c "error TS" /tmp/build.log || echo "0")
|
||||
|
||||
if [ "$error_count" -ge 5 ]; then
|
||||
echo "⚠️ $error_count errors - consider using error-resolver agent"
|
||||
elif [ "$error_count" -gt 0 ]; then
|
||||
echo "🔴 $error_count errors (not blocking):"
|
||||
grep "error TS" /tmp/build.log | head -5
|
||||
fi
|
||||
```
|
||||
|
||||
**Phase 3: Only if observation shows blocking is necessary**
|
||||
|
||||
Never reached - non-blocking works fine!
|
||||
|
||||
**What you gain:**
|
||||
- Understood patterns before acting
|
||||
- Non-blocking keeps workflow smooth
|
||||
- Helpful messages without friction
|
||||
- Can experiment without frustration
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Hook is slow, blocks workflow</scenario>
|
||||
|
||||
<code>
|
||||
#!/bin/bash
|
||||
# Stop hook that's too slow
|
||||
|
||||
# Run full test suite (takes 45 seconds!)
|
||||
npm test
|
||||
|
||||
# Run linter (takes 10 seconds)
|
||||
npm run lint
|
||||
|
||||
# Run build (takes 30 seconds)
|
||||
npm run build
|
||||
|
||||
# Total: 85 seconds of blocking!
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Hook takes 85 seconds to complete
|
||||
- Blocks Claude for entire duration
|
||||
- User can't continue working
|
||||
- Frustrating, likely to be disabled
|
||||
- Defeats purpose of automation
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Make hook fast (<2 seconds):**
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Stop hook - fast checks only
|
||||
|
||||
# Quick syntax check (< 1 second)
|
||||
npm run check-syntax
|
||||
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "🔴 Syntax errors found"
|
||||
echo "💡 Run 'npm test' manually for full test suite"
|
||||
fi
|
||||
|
||||
echo "✅ Quick checks passed (run 'npm test' for full suite)"
|
||||
```
|
||||
|
||||
**Or run slow checks in background:**
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Stop hook - trigger background job
|
||||
|
||||
# Start tests in background
|
||||
(
|
||||
npm test > /tmp/test-results.txt 2>&1
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "🔴 Tests failed (see /tmp/test-results.txt)"
|
||||
fi
|
||||
) &
|
||||
|
||||
echo "⏳ Tests running in background (check /tmp/test-results.txt)"
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Hook completes instantly
|
||||
- Workflow not blocked
|
||||
- Still get quality checks
|
||||
- User can continue working
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Hook has no error handling, fails silently</scenario>
|
||||
|
||||
<code>
|
||||
#!/bin/bash
|
||||
# Hook with no error handling
|
||||
|
||||
file=$(tail -1 ~/.claude/edit-log.txt)
|
||||
prettier --write "$file"
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- If edit-log.txt missing → hook fails silently
|
||||
- If file path invalid → prettier errors not caught
|
||||
- If prettier not installed → silent failure
|
||||
- No logging, can't debug
|
||||
- User has no idea hook ran or failed
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Add error handling:**
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
set -euo pipefail # Exit on error, undefined vars
|
||||
|
||||
# Log execution
|
||||
echo "[$(date)] Hook started" >> ~/.claude/hooks/formatter.log
|
||||
|
||||
# Validate input
|
||||
if [ ! -f ~/.claude/edit-log.txt ]; then
|
||||
echo "[$(date)] ERROR: edit-log.txt not found" >> ~/.claude/hooks/formatter.log
|
||||
exit 1
|
||||
fi
|
||||
|
||||
file=$(tail -1 ~/.claude/edit-log.txt | grep "^/.*\.ts$")
|
||||
|
||||
if [ -z "$file" ]; then
|
||||
echo "[$(date)] No TypeScript file to format" >> ~/.claude/hooks/formatter.log
|
||||
exit 0
|
||||
fi
|
||||
|
||||
if [ ! -f "$file" ]; then
|
||||
echo "[$(date)] ERROR: File not found: $file" >> ~/.claude/hooks/formatter.log
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Check prettier exists
|
||||
if ! command -v prettier &> /dev/null; then
|
||||
echo "[$(date)] ERROR: prettier not installed" >> ~/.claude/hooks/formatter.log
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Format
|
||||
echo "[$(date)] Formatting: $file" >> ~/.claude/hooks/formatter.log
|
||||
if prettier --write "$file" 2>&1 | tee -a ~/.claude/hooks/formatter.log; then
|
||||
echo "✅ Formatted $file"
|
||||
else
|
||||
echo "🔴 Formatting failed (see ~/.claude/hooks/formatter.log)"
|
||||
fi
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Errors logged and visible
|
||||
- Graceful handling of missing files
|
||||
- Can debug when issues occur
|
||||
- Clear feedback to user
|
||||
- Hook doesn't fail silently
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<security>
|
||||
**Hooks run with your credentials and have full system access.**
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Review code carefully** - Hooks execute any command
|
||||
2. **Use absolute paths** - Don't rely on PATH
|
||||
3. **Validate inputs** - Don't trust file paths blindly
|
||||
4. **Limit scope** - Only access what's needed
|
||||
5. **Log actions** - Track what hooks do
|
||||
6. **Test thoroughly** - Especially blocking hooks
|
||||
|
||||
## Dangerous Patterns
|
||||
|
||||
❌ **Don't:**
|
||||
```bash
|
||||
# DANGEROUS - executes arbitrary code
|
||||
cmd=$(tail -1 ~/.claude/edit-log.txt)
|
||||
eval "$cmd"
|
||||
```
|
||||
|
||||
✅ **Do:**
|
||||
```bash
|
||||
# SAFE - validates and sanitizes
|
||||
file=$(tail -1 ~/.claude/edit-log.txt | grep "^/.*\.ts$")
|
||||
if [ -f "$file" ]; then
|
||||
prettier --write "$file"
|
||||
fi
|
||||
```
|
||||
</security>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Start with Phase 1 (observe)** → Understand patterns before acting
|
||||
2. **Keep hooks fast (<2 seconds)** → Don't block workflow
|
||||
3. **Test thoroughly** → Hooks have full system access
|
||||
4. **Add error handling and logging** → Silent failures are debugging nightmares
|
||||
5. **Use progressive enhancement** → Observe → Automate → Enforce (only if needed)
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Follow progressive enhancement.**
|
||||
|
||||
- "Hook is simple, don't need testing" (Untested hooks fail in production)
|
||||
- "Blocking is fine, need to enforce" (Start non-blocking, observe first)
|
||||
- "I'll add error handling later" (Hook errors silent, add now)
|
||||
- "Hook is slow but thorough" (Slow hooks block workflow, optimize)
|
||||
- "Need access to everything" (Minimal permissions only)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before deploying hook:
|
||||
|
||||
- [ ] Tested in isolation (manual execution)
|
||||
- [ ] Tested with mock data
|
||||
- [ ] Completes quickly (<2 seconds for non-blocking)
|
||||
- [ ] Has error handling (set -euo pipefail)
|
||||
- [ ] Has logging (can debug failures)
|
||||
- [ ] Validates inputs (doesn't trust blindly)
|
||||
- [ ] Uses absolute paths
|
||||
- [ ] Started with Phase 1 (observation)
|
||||
- [ ] If blocking: has escape hatch
|
||||
|
||||
**Can't check all boxes?** Return to development and fix.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill covers:** Hook creation and patterns
|
||||
|
||||
**Related skills:**
|
||||
- hyperpowers:skills-auto-activation (complete skill activation hook)
|
||||
- hyperpowers:verification-before-completion (quality hooks automate this)
|
||||
- hyperpowers:testing-anti-patterns (avoid in hooks)
|
||||
|
||||
**Hook patterns support:**
|
||||
- Automatic skill activation
|
||||
- Build verification
|
||||
- Code formatting
|
||||
- Error prevention
|
||||
- Workflow automation
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Complete hook examples](resources/hook-examples.md)
|
||||
- [Hook pattern library](resources/hook-patterns.md)
|
||||
- [Testing strategies](resources/testing-hooks.md)
|
||||
|
||||
**Official documentation:**
|
||||
- [Anthropic Hooks Guide](https://docs.claude.com/en/docs/claude-code/hooks-guide)
|
||||
|
||||
**When stuck:**
|
||||
- Hook failing silently → Add logging, check ~/.claude/hooks/debug.log
|
||||
- Hook too slow → Profile execution, move slow parts to background
|
||||
- Hook blocking incorrectly → Return to Phase 1, observe patterns
|
||||
- Testing unclear → Start with manual execution, then mock data
|
||||
</resources>
|
||||
577
skills/building-hooks/resources/hook-examples.md
Normal file
577
skills/building-hooks/resources/hook-examples.md
Normal file
@@ -0,0 +1,577 @@
|
||||
# Complete Hook Examples
|
||||
|
||||
This guide provides complete, production-ready hook implementations you can use and adapt.
|
||||
|
||||
## Example 1: File Edit Tracker (PostToolUse)
|
||||
|
||||
**Purpose:** Track which files were edited and in which repos for later analysis.
|
||||
|
||||
**File:** `~/.claude/hooks/post-tool-use/01-track-edits.sh`
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Configuration
|
||||
LOG_FILE="$HOME/.claude/edit-log.txt"
|
||||
MAX_LOG_LINES=1000
|
||||
|
||||
# Create log if doesn't exist
|
||||
touch "$LOG_FILE"
|
||||
|
||||
# Function to log edit
|
||||
log_edit() {
|
||||
local file_path="$1"
|
||||
local timestamp=$(date +"%Y-%m-%d %H:%M:%S")
|
||||
local repo=$(find_repo "$file_path")
|
||||
|
||||
echo "$timestamp | $repo | $file_path" >> "$LOG_FILE"
|
||||
}
|
||||
|
||||
# Function to find repo root
|
||||
find_repo() {
|
||||
local dir=$(dirname "$1")
|
||||
while [ "$dir" != "/" ]; do
|
||||
if [ -d "$dir/.git" ]; then
|
||||
basename "$dir"
|
||||
return
|
||||
fi
|
||||
dir=$(dirname "$dir")
|
||||
done
|
||||
echo "unknown"
|
||||
}
|
||||
|
||||
# Read tool use event from stdin
|
||||
read -r tool_use_json
|
||||
|
||||
# Extract file path from tool use
|
||||
tool_name=$(echo "$tool_use_json" | jq -r '.tool.name')
|
||||
file_path=""
|
||||
|
||||
case "$tool_name" in
|
||||
"Edit"|"Write")
|
||||
file_path=$(echo "$tool_use_json" | jq -r '.tool.input.file_path')
|
||||
;;
|
||||
"MultiEdit")
|
||||
# MultiEdit has multiple files - log each
|
||||
echo "$tool_use_json" | jq -r '.tool.input.edits[].file_path' | while read -r path; do
|
||||
log_edit "$path"
|
||||
done
|
||||
exit 0
|
||||
;;
|
||||
esac
|
||||
|
||||
# Log single edit
|
||||
if [ -n "$file_path" ] && [ "$file_path" != "null" ]; then
|
||||
log_edit "$file_path"
|
||||
fi
|
||||
|
||||
# Rotate log if too large
|
||||
line_count=$(wc -l < "$LOG_FILE")
|
||||
if [ "$line_count" -gt "$MAX_LOG_LINES" ]; then
|
||||
tail -n "$MAX_LOG_LINES" "$LOG_FILE" > "$LOG_FILE.tmp"
|
||||
mv "$LOG_FILE.tmp" "$LOG_FILE"
|
||||
fi
|
||||
|
||||
# Return success (non-blocking)
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
**Configuration (`hooks.json`):**
|
||||
```json
|
||||
{
|
||||
"hooks": [
|
||||
{
|
||||
"event": "PostToolUse",
|
||||
"command": "~/.claude/hooks/post-tool-use/01-track-edits.sh",
|
||||
"description": "Track file edits for build checking",
|
||||
"blocking": false,
|
||||
"timeout": 1000
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Example 2: Multi-Repo Build Checker (Stop)
|
||||
|
||||
**Purpose:** Run builds on all repos that were modified, report errors.
|
||||
|
||||
**File:** `~/.claude/hooks/stop/20-build-checker.sh`
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Configuration
|
||||
LOG_FILE="$HOME/.claude/edit-log.txt"
|
||||
PROJECT_ROOT="$HOME/git/myproject"
|
||||
ERROR_THRESHOLD=5
|
||||
|
||||
# Colors for output
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
NC='\033[0m' # No Color
|
||||
|
||||
# Get repos modified since last check
|
||||
get_modified_repos() {
|
||||
# Get unique repos from recent edits
|
||||
tail -50 "$LOG_FILE" 2>/dev/null | \
|
||||
cut -d'|' -f2 | \
|
||||
tr -d ' ' | \
|
||||
sort -u | \
|
||||
grep -v "unknown"
|
||||
}
|
||||
|
||||
# Run build in repo
|
||||
build_repo() {
|
||||
local repo_name="$1"
|
||||
local repo_path="$PROJECT_ROOT/$repo_name"
|
||||
|
||||
if [ ! -d "$repo_path" ]; then
|
||||
return 0
|
||||
fi
|
||||
|
||||
# Determine build command
|
||||
local build_cmd=""
|
||||
if [ -f "$repo_path/package.json" ]; then
|
||||
build_cmd="npm run build"
|
||||
elif [ -f "$repo_path/Cargo.toml" ]; then
|
||||
build_cmd="cargo build"
|
||||
elif [ -f "$repo_path/go.mod" ]; then
|
||||
build_cmd="go build ./..."
|
||||
else
|
||||
return 0 # No build system found
|
||||
fi
|
||||
|
||||
echo "Building $repo_name..."
|
||||
|
||||
# Run build and capture output
|
||||
cd "$repo_path"
|
||||
local output=$(eval "$build_cmd" 2>&1)
|
||||
local exit_code=$?
|
||||
|
||||
if [ $exit_code -ne 0 ]; then
|
||||
# Count errors
|
||||
local error_count=$(echo "$output" | grep -c "error" || echo "0")
|
||||
|
||||
if [ "$error_count" -ge "$ERROR_THRESHOLD" ]; then
|
||||
echo -e "${YELLOW}⚠️ $repo_name: $error_count errors found${NC}"
|
||||
echo " Consider launching auto-error-resolver agent"
|
||||
else
|
||||
echo -e "${RED}🔴 $repo_name: $error_count errors${NC}"
|
||||
echo "$output" | grep "error" | head -10
|
||||
fi
|
||||
|
||||
return 1
|
||||
else
|
||||
echo -e "${GREEN}✅ $repo_name: Build passed${NC}"
|
||||
return 0
|
||||
fi
|
||||
}
|
||||
|
||||
# Main execution
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "🔨 BUILD VERIFICATION"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo ""
|
||||
|
||||
modified_repos=$(get_modified_repos)
|
||||
|
||||
if [ -z "$modified_repos" ]; then
|
||||
echo "No repos modified since last check"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
build_failures=0
|
||||
|
||||
for repo in $modified_repos; do
|
||||
if ! build_repo "$repo"; then
|
||||
((build_failures++))
|
||||
fi
|
||||
echo ""
|
||||
done
|
||||
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
|
||||
if [ "$build_failures" -gt 0 ]; then
|
||||
echo -e "${RED}$build_failures repo(s) failed to build${NC}"
|
||||
else
|
||||
echo -e "${GREEN}All builds passed${NC}"
|
||||
fi
|
||||
|
||||
# Non-blocking - always return success
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Example 3: TypeScript Prettier Formatter (Stop)
|
||||
|
||||
**Purpose:** Auto-format all edited TypeScript/JavaScript files.
|
||||
|
||||
**File:** `~/.claude/hooks/stop/30-format-code.sh`
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Configuration
|
||||
LOG_FILE="$HOME/.claude/edit-log.txt"
|
||||
PROJECT_ROOT="$HOME/git/myproject"
|
||||
|
||||
# Get recently edited files
|
||||
get_edited_files() {
|
||||
tail -50 "$LOG_FILE" 2>/dev/null | \
|
||||
cut -d'|' -f3 | \
|
||||
tr -d ' ' | \
|
||||
grep -E '\.(ts|tsx|js|jsx)$' | \
|
||||
sort -u
|
||||
}
|
||||
|
||||
# Format file with prettier
|
||||
format_file() {
|
||||
local file="$1"
|
||||
|
||||
if [ ! -f "$file" ]; then
|
||||
return 0
|
||||
fi
|
||||
|
||||
# Find prettier config
|
||||
local dir=$(dirname "$file")
|
||||
local prettier_config=""
|
||||
|
||||
while [ "$dir" != "/" ]; do
|
||||
if [ -f "$dir/.prettierrc" ] || [ -f "$dir/.prettierrc.json" ]; then
|
||||
prettier_config="$dir"
|
||||
break
|
||||
fi
|
||||
dir=$(dirname "$dir")
|
||||
done
|
||||
|
||||
if [ -z "$prettier_config" ]; then
|
||||
return 0
|
||||
fi
|
||||
|
||||
# Format the file
|
||||
cd "$prettier_config"
|
||||
npx prettier --write "$file" 2>/dev/null
|
||||
|
||||
if [ $? -eq 0 ]; then
|
||||
echo "✓ Formatted: $(basename $file)"
|
||||
fi
|
||||
}
|
||||
|
||||
# Main execution
|
||||
echo "🎨 Formatting edited files..."
|
||||
|
||||
edited_files=$(get_edited_files)
|
||||
|
||||
if [ -z "$edited_files" ]; then
|
||||
echo "No files to format"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
formatted_count=0
|
||||
|
||||
for file in $edited_files; do
|
||||
if format_file "$file"; then
|
||||
((formatted_count++))
|
||||
fi
|
||||
done
|
||||
|
||||
echo "✅ Formatted $formatted_count file(s)"
|
||||
|
||||
# Non-blocking
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Example 4: Skill Activation Injector (UserPromptSubmit)
|
||||
|
||||
**Purpose:** Analyze user prompt and inject skill activation reminders.
|
||||
|
||||
**File:** `~/.claude/hooks/user-prompt-submit/skill-activator.js`
|
||||
|
||||
```javascript
|
||||
#!/usr/bin/env node
|
||||
|
||||
const fs = require('fs');
|
||||
const path = require('path');
|
||||
|
||||
// Load skill rules
|
||||
const rulesPath = process.env.SKILL_RULES || path.join(process.env.HOME, '.claude/skill-rules.json');
|
||||
const rules = JSON.parse(fs.readFileSync(rulesPath, 'utf8'));
|
||||
|
||||
// Read prompt from stdin
|
||||
let promptData = '';
|
||||
process.stdin.on('data', chunk => {
|
||||
promptData += chunk;
|
||||
});
|
||||
|
||||
process.stdin.on('end', () => {
|
||||
const prompt = JSON.parse(promptData);
|
||||
const activatedSkills = analyzePrompt(prompt.text);
|
||||
|
||||
if (activatedSkills.length > 0) {
|
||||
const context = generateContext(activatedSkills);
|
||||
console.log(JSON.stringify({
|
||||
decision: 'approve',
|
||||
additionalContext: context
|
||||
}));
|
||||
} else {
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
}
|
||||
});
|
||||
|
||||
function analyzePrompt(text) {
|
||||
const lowerText = text.toLowerCase();
|
||||
const activated = [];
|
||||
|
||||
for (const [skillName, config] of Object.entries(rules)) {
|
||||
// Check keywords
|
||||
if (config.promptTriggers?.keywords) {
|
||||
for (const keyword of config.promptTriggers.keywords) {
|
||||
if (lowerText.includes(keyword.toLowerCase())) {
|
||||
activated.push({ skill: skillName, priority: config.priority || 'medium' });
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Check intent patterns
|
||||
if (config.promptTriggers?.intentPatterns) {
|
||||
for (const pattern of config.promptTriggers.intentPatterns) {
|
||||
if (new RegExp(pattern, 'i').test(text)) {
|
||||
activated.push({ skill: skillName, priority: config.priority || 'medium' });
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Sort by priority
|
||||
return activated.sort((a, b) => {
|
||||
const priorityOrder = { high: 0, medium: 1, low: 2 };
|
||||
return priorityOrder[a.priority] - priorityOrder[b.priority];
|
||||
});
|
||||
}
|
||||
|
||||
function generateContext(skills) {
|
||||
const skillList = skills.map(s => s.skill).join(', ');
|
||||
|
||||
return `
|
||||
🎯 SKILL ACTIVATION CHECK
|
||||
|
||||
The following skills may be relevant to this prompt:
|
||||
${skills.map(s => `- **${s.skill}** (${s.priority} priority)`).join('\n')}
|
||||
|
||||
Before responding, check if any of these skills should be used.
|
||||
`;
|
||||
}
|
||||
```
|
||||
|
||||
**Configuration (`skill-rules.json`):**
|
||||
```json
|
||||
{
|
||||
"backend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["backend", "controller", "service", "API", "endpoint"],
|
||||
"intentPatterns": [
|
||||
"(create|add).*?(route|endpoint|controller)",
|
||||
"(how to|best practice).*?(backend|API)"
|
||||
]
|
||||
}
|
||||
},
|
||||
"frontend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["frontend", "component", "react", "UI", "layout"],
|
||||
"intentPatterns": [
|
||||
"(create|build).*?(component|page|view)",
|
||||
"(how to|pattern).*?(react|frontend)"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Example 5: Error Handling Reminder (Stop)
|
||||
|
||||
**Purpose:** Gentle reminder to check error handling in risky code.
|
||||
|
||||
**File:** `~/.claude/hooks/stop/40-error-reminder.sh`
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
LOG_FILE="$HOME/.claude/edit-log.txt"
|
||||
|
||||
# Get recently edited files
|
||||
get_edited_files() {
|
||||
tail -20 "$LOG_FILE" 2>/dev/null | \
|
||||
cut -d'|' -f3 | \
|
||||
tr -d ' ' | \
|
||||
sort -u
|
||||
}
|
||||
|
||||
# Check for risky patterns
|
||||
check_file_risk() {
|
||||
local file="$1"
|
||||
|
||||
if [ ! -f "$file" ]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Look for risky patterns
|
||||
if grep -q -E "try|catch|async|await|prisma|\.execute\(|fetch\(|axios\." "$file"; then
|
||||
return 0
|
||||
fi
|
||||
|
||||
return 1
|
||||
}
|
||||
|
||||
# Main execution
|
||||
risky_count=0
|
||||
backend_files=0
|
||||
|
||||
for file in $(get_edited_files); do
|
||||
if check_file_risk "$file"; then
|
||||
((risky_count++))
|
||||
|
||||
if echo "$file" | grep -q "backend\|server\|api"; then
|
||||
((backend_files++))
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
if [ "$risky_count" -gt 0 ]; then
|
||||
cat <<EOF
|
||||
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
📋 ERROR HANDLING SELF-CHECK
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
|
||||
⚠️ Risky Patterns Detected
|
||||
$risky_count file(s) with async/try-catch/database operations
|
||||
|
||||
❓ Did you add proper error handling?
|
||||
❓ Are errors logged/captured appropriately?
|
||||
❓ Are promises handled correctly?
|
||||
|
||||
EOF
|
||||
|
||||
if [ "$backend_files" -gt 0 ]; then
|
||||
cat <<EOF
|
||||
💡 Backend Best Practice:
|
||||
- All errors should be captured (Sentry, logging)
|
||||
- Database operations need try-catch
|
||||
- API routes should use error middleware
|
||||
|
||||
EOF
|
||||
fi
|
||||
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
fi
|
||||
|
||||
# Non-blocking
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Example 6: Dangerous Operation Blocker (PreToolUse)
|
||||
|
||||
**Purpose:** Block dangerous file operations (deletion, overwrite) in production paths.
|
||||
|
||||
**File:** `~/.claude/hooks/pre-tool-use/dangerous-ops.sh`
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Read tool use event
|
||||
read -r tool_use_json
|
||||
|
||||
tool_name=$(echo "$tool_use_json" | jq -r '.tool.name')
|
||||
file_path=$(echo "$tool_use_json" | jq -r '.tool.input.file_path // empty')
|
||||
|
||||
# Dangerous paths (customize for your project)
|
||||
PROTECTED_PATHS=(
|
||||
"/production/"
|
||||
"/prod/"
|
||||
"/.env.production"
|
||||
"/config/production"
|
||||
)
|
||||
|
||||
# Check if operation is dangerous
|
||||
is_dangerous() {
|
||||
local path="$1"
|
||||
|
||||
for protected in "${PROTECTED_PATHS[@]}"; do
|
||||
if [[ "$path" == *"$protected"* ]]; then
|
||||
return 0
|
||||
fi
|
||||
done
|
||||
|
||||
return 1
|
||||
}
|
||||
|
||||
# Check dangerous operations
|
||||
if [ "$tool_name" == "Write" ] || [ "$tool_name" == "Edit" ]; then
|
||||
if is_dangerous "$file_path"; then
|
||||
cat <<EOF | jq -c '.'
|
||||
{
|
||||
"decision": "block",
|
||||
"reason": "⛔ BLOCKED: Attempting to modify protected path\\n\\nFile: $file_path\\n\\nThis path is protected from automatic modification.\\nIf you need to make changes:\\n1. Review changes carefully\\n2. Use manual file editing\\n3. Confirm with teammate\\n\\nTo override, edit ~/.claude/hooks/pre-tool-use/dangerous-ops.sh"
|
||||
}
|
||||
EOF
|
||||
exit 0
|
||||
fi
|
||||
fi
|
||||
|
||||
# Allow operation (NOTE: PreToolUse hooks should use hookSpecificOutput format with permissionDecision)
|
||||
echo '{"decision": "allow"}'
|
||||
```
|
||||
|
||||
## Testing These Examples
|
||||
|
||||
### Test Edit Tracker
|
||||
```bash
|
||||
# Create test log entry
|
||||
echo "2025-01-15 10:30:00 | frontend | /path/to/file.ts" > ~/.claude/edit-log.txt
|
||||
|
||||
# Test formatting script
|
||||
bash ~/.claude/hooks/stop/30-format-code.sh
|
||||
```
|
||||
|
||||
### Test Build Checker
|
||||
```bash
|
||||
# Add some edits to log
|
||||
echo "2025-01-15 10:30:00 | backend | /path/to/backend/file.ts" >> ~/.claude/edit-log.txt
|
||||
|
||||
# Run build checker
|
||||
bash ~/.claude/hooks/stop/20-build-checker.sh
|
||||
```
|
||||
|
||||
### Test Skill Activator
|
||||
```bash
|
||||
# Test with mock prompt
|
||||
echo '{"text": "How do I create a new API endpoint?"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
## Debugging Tips
|
||||
|
||||
**Enable debug mode:**
|
||||
```bash
|
||||
# Add to top of any bash script
|
||||
set -x
|
||||
exec 2>>~/.claude/hooks/debug.log
|
||||
```
|
||||
|
||||
**Check hook execution:**
|
||||
```bash
|
||||
# Watch hooks run in real-time
|
||||
tail -f ~/.claude/logs/hooks.log
|
||||
```
|
||||
|
||||
**Test hook output:**
|
||||
```bash
|
||||
# Capture output
|
||||
bash ~/.claude/hooks/stop/20-build-checker.sh > /tmp/hook-test.log 2>&1
|
||||
cat /tmp/hook-test.log
|
||||
```
|
||||
610
skills/building-hooks/resources/hook-patterns.md
Normal file
610
skills/building-hooks/resources/hook-patterns.md
Normal file
@@ -0,0 +1,610 @@
|
||||
# Hook Patterns Library
|
||||
|
||||
Reusable patterns for common hook use cases.
|
||||
|
||||
## Pattern: File Path Validation
|
||||
|
||||
Safely validate and sanitize file paths in hooks.
|
||||
|
||||
```bash
|
||||
validate_file_path() {
|
||||
local path="$1"
|
||||
|
||||
# Remove null/empty
|
||||
if [ -z "$path" ] || [ "$path" == "null" ]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Must be absolute path
|
||||
if [[ ! "$path" =~ ^/ ]]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Must exist
|
||||
if [ ! -f "$path" ]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Check file extension whitelist
|
||||
if [[ ! "$path" =~ \.(ts|tsx|js|jsx|py|rs|go|java)$ ]]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
return 0
|
||||
}
|
||||
|
||||
# Usage
|
||||
if validate_file_path "$file_path"; then
|
||||
# Safe to operate on file
|
||||
process_file "$file_path"
|
||||
fi
|
||||
```
|
||||
|
||||
## Pattern: Finding Project Root
|
||||
|
||||
Locate the project root directory from any file path.
|
||||
|
||||
```bash
|
||||
find_project_root() {
|
||||
local dir="$1"
|
||||
|
||||
# Start from file's directory
|
||||
if [ -f "$dir" ]; then
|
||||
dir=$(dirname "$dir")
|
||||
fi
|
||||
|
||||
# Walk up until finding markers
|
||||
while [ "$dir" != "/" ]; do
|
||||
# Check for project markers
|
||||
if [ -f "$dir/package.json" ] || \
|
||||
[ -f "$dir/Cargo.toml" ] || \
|
||||
[ -f "$dir/go.mod" ] || \
|
||||
[ -d "$dir/.git" ]; then
|
||||
echo "$dir"
|
||||
return 0
|
||||
fi
|
||||
dir=$(dirname "$dir")
|
||||
done
|
||||
|
||||
return 1
|
||||
}
|
||||
|
||||
# Usage
|
||||
project_root=$(find_project_root "$file_path")
|
||||
if [ -n "$project_root" ]; then
|
||||
cd "$project_root"
|
||||
npm run build
|
||||
fi
|
||||
```
|
||||
|
||||
## Pattern: Conditional Hook Execution
|
||||
|
||||
Run hook only when certain conditions are met.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Configuration
|
||||
MIN_CHANGES=3
|
||||
TARGET_REPO="backend"
|
||||
|
||||
# Check if should run
|
||||
should_run() {
|
||||
# Count recent edits
|
||||
local edit_count=$(tail -20 ~/.claude/edit-log.txt | wc -l)
|
||||
|
||||
if [ "$edit_count" -lt "$MIN_CHANGES" ]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Check if target repo was modified
|
||||
if ! tail -20 ~/.claude/edit-log.txt | grep -q "$TARGET_REPO"; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
return 0
|
||||
}
|
||||
|
||||
# Main execution
|
||||
if ! should_run; then
|
||||
echo '{}'
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Run actual hook logic
|
||||
perform_build_check
|
||||
```
|
||||
|
||||
## Pattern: Rate Limiting
|
||||
|
||||
Prevent hooks from running too frequently.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
RATE_LIMIT_FILE="/tmp/hook-last-run"
|
||||
MIN_INTERVAL=30 # seconds
|
||||
|
||||
# Check if enough time has passed
|
||||
should_run() {
|
||||
if [ ! -f "$RATE_LIMIT_FILE" ]; then
|
||||
return 0
|
||||
fi
|
||||
|
||||
local last_run=$(cat "$RATE_LIMIT_FILE")
|
||||
local now=$(date +%s)
|
||||
local elapsed=$((now - last_run))
|
||||
|
||||
if [ "$elapsed" -lt "$MIN_INTERVAL" ]; then
|
||||
echo "Skipping (ran ${elapsed}s ago, min interval ${MIN_INTERVAL}s)"
|
||||
return 1
|
||||
fi
|
||||
|
||||
return 0
|
||||
}
|
||||
|
||||
# Update last run time
|
||||
mark_run() {
|
||||
date +%s > "$RATE_LIMIT_FILE"
|
||||
}
|
||||
|
||||
# Usage
|
||||
if should_run; then
|
||||
perform_expensive_operation
|
||||
mark_run
|
||||
fi
|
||||
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Pattern: Multi-Project Detection
|
||||
|
||||
Detect which project/repo a file belongs to.
|
||||
|
||||
```bash
|
||||
detect_project() {
|
||||
local file="$1"
|
||||
local project_root="/Users/myuser/projects"
|
||||
|
||||
# Extract project name from path
|
||||
if [[ "$file" =~ $project_root/([^/]+) ]]; then
|
||||
echo "${BASH_REMATCH[1]}"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo "unknown"
|
||||
return 1
|
||||
}
|
||||
|
||||
# Usage
|
||||
project=$(detect_project "$file_path")
|
||||
|
||||
case "$project" in
|
||||
"frontend")
|
||||
npm --prefix ~/projects/frontend run build
|
||||
;;
|
||||
"backend")
|
||||
cargo build --manifest-path ~/projects/backend/Cargo.toml
|
||||
;;
|
||||
*)
|
||||
echo "Unknown project: $project"
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
## Pattern: Graceful Degradation
|
||||
|
||||
Handle failures gracefully without blocking workflow.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Try operation with fallback
|
||||
try_with_fallback() {
|
||||
local primary_cmd="$1"
|
||||
local fallback_cmd="$2"
|
||||
local description="$3"
|
||||
|
||||
echo "Attempting: $description"
|
||||
|
||||
# Try primary command
|
||||
if eval "$primary_cmd" 2>/dev/null; then
|
||||
echo "✅ Success"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo "⚠️ Primary failed, trying fallback..."
|
||||
|
||||
# Try fallback
|
||||
if eval "$fallback_cmd" 2>/dev/null; then
|
||||
echo "✅ Fallback succeeded"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo "❌ Both failed, continuing anyway"
|
||||
return 1
|
||||
}
|
||||
|
||||
# Usage
|
||||
try_with_fallback \
|
||||
"npm run build" \
|
||||
"npm run build:dev" \
|
||||
"Building project"
|
||||
|
||||
# Always return empty response (non-blocking)
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Pattern: Parallel Execution
|
||||
|
||||
Run multiple checks in parallel for speed.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Run checks in parallel
|
||||
run_parallel_checks() {
|
||||
local pids=()
|
||||
|
||||
# Start each check in background
|
||||
check_typescript &
|
||||
pids+=($!)
|
||||
|
||||
check_eslint &
|
||||
pids+=($!)
|
||||
|
||||
check_tests &
|
||||
pids+=($!)
|
||||
|
||||
# Wait for all to complete
|
||||
local exit_code=0
|
||||
for pid in "${pids[@]}"; do
|
||||
wait "$pid" || exit_code=1
|
||||
done
|
||||
|
||||
return $exit_code
|
||||
}
|
||||
|
||||
check_typescript() {
|
||||
npx tsc --noEmit > /tmp/tsc-output.txt 2>&1
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "TypeScript errors found"
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
check_eslint() {
|
||||
npx eslint . > /tmp/eslint-output.txt 2>&1
|
||||
}
|
||||
|
||||
check_tests() {
|
||||
npm test > /tmp/test-output.txt 2>&1
|
||||
}
|
||||
|
||||
# Usage
|
||||
if run_parallel_checks; then
|
||||
echo "✅ All checks passed"
|
||||
else
|
||||
echo "⚠️ Some checks failed"
|
||||
cat /tmp/tsc-output.txt
|
||||
cat /tmp/eslint-output.txt
|
||||
fi
|
||||
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Pattern: Smart Caching
|
||||
|
||||
Cache results to avoid redundant work.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
CACHE_DIR="$HOME/.claude/hook-cache"
|
||||
mkdir -p "$CACHE_DIR"
|
||||
|
||||
# Generate cache key
|
||||
cache_key() {
|
||||
local file="$1"
|
||||
echo -n "$file:$(stat -f %m "$file" 2>/dev/null || stat -c %Y "$file")" | md5sum | cut -d' ' -f1
|
||||
}
|
||||
|
||||
# Check cache
|
||||
check_cache() {
|
||||
local file="$1"
|
||||
local key=$(cache_key "$file")
|
||||
local cache_file="$CACHE_DIR/$key"
|
||||
|
||||
if [ -f "$cache_file" ]; then
|
||||
# Cache hit
|
||||
cat "$cache_file"
|
||||
return 0
|
||||
fi
|
||||
|
||||
return 1
|
||||
}
|
||||
|
||||
# Update cache
|
||||
update_cache() {
|
||||
local file="$1"
|
||||
local result="$2"
|
||||
local key=$(cache_key "$file")
|
||||
local cache_file="$CACHE_DIR/$key"
|
||||
|
||||
echo "$result" > "$cache_file"
|
||||
|
||||
# Clean old cache entries (older than 1 day)
|
||||
find "$CACHE_DIR" -type f -mtime +1 -delete 2>/dev/null
|
||||
}
|
||||
|
||||
# Usage
|
||||
if cached=$(check_cache "$file_path"); then
|
||||
echo "Cache hit: $cached"
|
||||
else
|
||||
result=$(expensive_operation "$file_path")
|
||||
update_cache "$file_path" "$result"
|
||||
echo "Computed: $result"
|
||||
fi
|
||||
```
|
||||
|
||||
## Pattern: Progressive Output
|
||||
|
||||
Show progress for long-running hooks.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Progress indicator
|
||||
show_progress() {
|
||||
local message="$1"
|
||||
echo -n "$message..."
|
||||
}
|
||||
|
||||
complete_progress() {
|
||||
local status="$1"
|
||||
if [ "$status" == "success" ]; then
|
||||
echo " ✅"
|
||||
else
|
||||
echo " ❌"
|
||||
fi
|
||||
}
|
||||
|
||||
# Usage
|
||||
show_progress "Running TypeScript compiler"
|
||||
if npx tsc --noEmit 2>/dev/null; then
|
||||
complete_progress "success"
|
||||
else
|
||||
complete_progress "failure"
|
||||
fi
|
||||
|
||||
show_progress "Running linter"
|
||||
if npx eslint . 2>/dev/null; then
|
||||
complete_progress "success"
|
||||
else
|
||||
complete_progress "failure"
|
||||
fi
|
||||
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Pattern: Context Injection
|
||||
|
||||
Inject helpful context into Claude's prompt.
|
||||
|
||||
```javascript
|
||||
// UserPromptSubmit hook
|
||||
function injectContext(prompt) {
|
||||
const context = [];
|
||||
|
||||
// Add relevant documentation
|
||||
if (prompt.includes('API')) {
|
||||
context.push('📖 API Documentation: https://docs.example.com/api');
|
||||
}
|
||||
|
||||
// Add recent changes
|
||||
const recentFiles = getRecentlyEditedFiles();
|
||||
if (recentFiles.length > 0) {
|
||||
context.push(`📝 Recently edited: ${recentFiles.join(', ')}`);
|
||||
}
|
||||
|
||||
// Add project status
|
||||
const buildStatus = getLastBuildStatus();
|
||||
if (!buildStatus.passed) {
|
||||
context.push(`⚠️ Current build has ${buildStatus.errorCount} errors`);
|
||||
}
|
||||
|
||||
if (context.length === 0) {
|
||||
return { decision: 'approve' };
|
||||
}
|
||||
|
||||
return {
|
||||
decision: 'approve',
|
||||
additionalContext: `\n\n---\n${context.join('\n')}\n---\n`
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
## Pattern: Error Accumulation
|
||||
|
||||
Collect multiple errors before reporting.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
ERRORS=()
|
||||
|
||||
# Add error to collection
|
||||
add_error() {
|
||||
ERRORS+=("$1")
|
||||
}
|
||||
|
||||
# Report all errors
|
||||
report_errors() {
|
||||
if [ ${#ERRORS[@]} -eq 0 ]; then
|
||||
echo "✅ No errors found"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "⚠️ Found ${#ERRORS[@]} issue(s):"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
|
||||
local i=1
|
||||
for error in "${ERRORS[@]}"; do
|
||||
echo "$i. $error"
|
||||
((i++))
|
||||
done
|
||||
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
return 1
|
||||
}
|
||||
|
||||
# Usage
|
||||
if ! run_typescript_check; then
|
||||
add_error "TypeScript compilation failed"
|
||||
fi
|
||||
|
||||
if ! run_lint_check; then
|
||||
add_error "Linting issues found"
|
||||
fi
|
||||
|
||||
if ! run_test_check; then
|
||||
add_error "Tests failing"
|
||||
fi
|
||||
|
||||
report_errors
|
||||
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Pattern: Conditional Blocking
|
||||
|
||||
Block only on critical errors, warn on others.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
ERROR_LEVEL="none" # none, warning, critical
|
||||
|
||||
# Check for issues
|
||||
check_critical_issues() {
|
||||
if grep -q "FIXME\|XXX\|TODO: CRITICAL" "$file_path"; then
|
||||
ERROR_LEVEL="critical"
|
||||
return 1
|
||||
fi
|
||||
return 0
|
||||
}
|
||||
|
||||
check_warnings() {
|
||||
if grep -q "console.log\|debugger" "$file_path"; then
|
||||
ERROR_LEVEL="warning"
|
||||
return 1
|
||||
fi
|
||||
return 0
|
||||
}
|
||||
|
||||
# Run checks
|
||||
check_critical_issues
|
||||
check_warnings
|
||||
|
||||
# Return appropriate decision
|
||||
case "$ERROR_LEVEL" in
|
||||
"critical")
|
||||
echo '{
|
||||
"decision": "block",
|
||||
"reason": "🚫 CRITICAL: Found critical TODOs or FIXMEs that must be addressed"
|
||||
}' | jq -c '.'
|
||||
;;
|
||||
"warning")
|
||||
echo "⚠️ Warning: Found debug statements (console.log, debugger)"
|
||||
echo '{}'
|
||||
;;
|
||||
*)
|
||||
echo '{}'
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
## Pattern: Hook Coordination
|
||||
|
||||
Coordinate between multiple hooks using shared state.
|
||||
|
||||
```bash
|
||||
# Hook 1: Track state
|
||||
#!/bin/bash
|
||||
STATE_FILE="/tmp/hook-state.json"
|
||||
|
||||
# Update state
|
||||
jq -n \
|
||||
--arg timestamp "$(date +%s)" \
|
||||
--arg files "$files_edited" \
|
||||
'{lastRun: $timestamp, filesEdited: ($files | split(","))}' \
|
||||
> "$STATE_FILE"
|
||||
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
```bash
|
||||
# Hook 2: Read state
|
||||
#!/bin/bash
|
||||
STATE_FILE="/tmp/hook-state.json"
|
||||
|
||||
if [ -f "$STATE_FILE" ]; then
|
||||
last_run=$(jq -r '.lastRun' "$STATE_FILE")
|
||||
files=$(jq -r '.filesEdited[]' "$STATE_FILE")
|
||||
|
||||
# Use state from previous hook
|
||||
for file in $files; do
|
||||
process_file "$file"
|
||||
done
|
||||
fi
|
||||
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Pattern: User Notification
|
||||
|
||||
Notify user of important events without blocking.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Send desktop notification (macOS)
|
||||
notify_macos() {
|
||||
osascript -e "display notification \"$1\" with title \"Claude Code Hook\""
|
||||
}
|
||||
|
||||
# Send desktop notification (Linux)
|
||||
notify_linux() {
|
||||
notify-send "Claude Code Hook" "$1"
|
||||
}
|
||||
|
||||
# Notify based on OS
|
||||
notify() {
|
||||
local message="$1"
|
||||
|
||||
case "$OSTYPE" in
|
||||
darwin*)
|
||||
notify_macos "$message"
|
||||
;;
|
||||
linux*)
|
||||
notify_linux "$message"
|
||||
;;
|
||||
esac
|
||||
}
|
||||
|
||||
# Usage
|
||||
if [ "$error_count" -gt 10 ]; then
|
||||
notify "⚠️ Build has $error_count errors"
|
||||
fi
|
||||
|
||||
echo '{}'
|
||||
```
|
||||
|
||||
## Remember
|
||||
|
||||
- **Keep it simple** - Start with basic patterns, add complexity only when needed
|
||||
- **Test thoroughly** - Test each pattern in isolation before combining
|
||||
- **Fail gracefully** - Non-blocking hooks should never crash workflow
|
||||
- **Log everything** - You'll need it for debugging
|
||||
- **Document patterns** - Future you will thank present you
|
||||
657
skills/building-hooks/resources/testing-hooks.md
Normal file
657
skills/building-hooks/resources/testing-hooks.md
Normal file
@@ -0,0 +1,657 @@
|
||||
# Testing Hooks
|
||||
|
||||
Comprehensive testing strategies for Claude Code hooks.
|
||||
|
||||
## Testing Philosophy
|
||||
|
||||
**Hooks run with full system access. Test them thoroughly before deploying.**
|
||||
|
||||
### Testing Levels
|
||||
|
||||
1. **Unit testing** - Test functions in isolation
|
||||
2. **Integration testing** - Test with mock Claude Code events
|
||||
3. **Manual testing** - Test in real Claude Code sessions
|
||||
4. **Regression testing** - Verify hooks don't break existing workflows
|
||||
|
||||
## Unit Testing Hook Functions
|
||||
|
||||
### Bash Functions
|
||||
|
||||
**Example: Testing file validation**
|
||||
|
||||
```bash
|
||||
# hook-functions.sh - extractable functions
|
||||
validate_file_path() {
|
||||
local path="$1"
|
||||
|
||||
if [ -z "$path" ] || [ "$path" == "null" ]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
if [[ ! "$path" =~ ^/ ]]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
if [ ! -f "$path" ]; then
|
||||
return 1
|
||||
fi
|
||||
|
||||
return 0
|
||||
}
|
||||
|
||||
# Test script
|
||||
#!/bin/bash
|
||||
source ./hook-functions.sh
|
||||
|
||||
test_validate_file_path() {
|
||||
# Test valid path
|
||||
touch /tmp/test-file.txt
|
||||
if validate_file_path "/tmp/test-file.txt"; then
|
||||
echo "✅ Valid path test passed"
|
||||
else
|
||||
echo "❌ Valid path test failed"
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Test invalid path
|
||||
if ! validate_file_path ""; then
|
||||
echo "✅ Empty path test passed"
|
||||
else
|
||||
echo "❌ Empty path test failed"
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Test null path
|
||||
if ! validate_file_path "null"; then
|
||||
echo "✅ Null path test passed"
|
||||
else
|
||||
echo "❌ Null path test failed"
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Test relative path
|
||||
if ! validate_file_path "relative/path.txt"; then
|
||||
echo "✅ Relative path test passed"
|
||||
else
|
||||
echo "❌ Relative path test failed"
|
||||
return 1
|
||||
fi
|
||||
|
||||
rm /tmp/test-file.txt
|
||||
return 0
|
||||
}
|
||||
|
||||
# Run test
|
||||
test_validate_file_path
|
||||
```
|
||||
|
||||
### JavaScript Functions
|
||||
|
||||
**Example: Testing prompt analysis**
|
||||
|
||||
```javascript
|
||||
// skill-activator.js
|
||||
function analyzePrompt(text, rules) {
|
||||
const lowerText = text.toLowerCase();
|
||||
const activated = [];
|
||||
|
||||
for (const [skillName, config] of Object.entries(rules)) {
|
||||
if (config.promptTriggers?.keywords) {
|
||||
for (const keyword of config.promptTriggers.keywords) {
|
||||
if (lowerText.includes(keyword.toLowerCase())) {
|
||||
activated.push({ skill: skillName, priority: config.priority || 'medium' });
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return activated;
|
||||
}
|
||||
|
||||
// test.js
|
||||
const assert = require('assert');
|
||||
|
||||
const testRules = {
|
||||
'backend-dev': {
|
||||
priority: 'high',
|
||||
promptTriggers: {
|
||||
keywords: ['backend', 'API', 'endpoint']
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
// Test keyword matching
|
||||
function testKeywordMatching() {
|
||||
const result = analyzePrompt('How do I create a backend endpoint?', testRules);
|
||||
assert.equal(result.length, 1, 'Should find one skill');
|
||||
assert.equal(result[0].skill, 'backend-dev', 'Should match backend-dev');
|
||||
assert.equal(result[0].priority, 'high', 'Should have high priority');
|
||||
console.log('✅ Keyword matching test passed');
|
||||
}
|
||||
|
||||
// Test no match
|
||||
function testNoMatch() {
|
||||
const result = analyzePrompt('How do I write Python?', testRules);
|
||||
assert.equal(result.length, 0, 'Should find no skills');
|
||||
console.log('✅ No match test passed');
|
||||
}
|
||||
|
||||
// Test case insensitivity
|
||||
function testCaseInsensitive() {
|
||||
const result = analyzePrompt('BACKEND endpoint', testRules);
|
||||
assert.equal(result.length, 1, 'Should match regardless of case');
|
||||
console.log('✅ Case insensitive test passed');
|
||||
}
|
||||
|
||||
// Run tests
|
||||
testKeywordMatching();
|
||||
testNoMatch();
|
||||
testCaseInsensitive();
|
||||
```
|
||||
|
||||
## Integration Testing with Mock Events
|
||||
|
||||
### Creating Mock Events
|
||||
|
||||
**PostToolUse event:**
|
||||
```json
|
||||
{
|
||||
"event": "PostToolUse",
|
||||
"tool": {
|
||||
"name": "Edit",
|
||||
"input": {
|
||||
"file_path": "/Users/test/project/src/file.ts",
|
||||
"old_string": "const x = 1;",
|
||||
"new_string": "const x = 2;"
|
||||
}
|
||||
},
|
||||
"result": {
|
||||
"success": true
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**UserPromptSubmit event:**
|
||||
```json
|
||||
{
|
||||
"event": "UserPromptSubmit",
|
||||
"text": "How do I create a new API endpoint?",
|
||||
"timestamp": "2025-01-15T10:30:00Z"
|
||||
}
|
||||
```
|
||||
|
||||
**Stop event:**
|
||||
```json
|
||||
{
|
||||
"event": "Stop",
|
||||
"sessionId": "abc123",
|
||||
"messageCount": 10
|
||||
}
|
||||
```
|
||||
|
||||
### Testing Hook with Mock Events
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# test-hook.sh
|
||||
|
||||
# Create mock event
|
||||
create_mock_edit_event() {
|
||||
cat <<EOF
|
||||
{
|
||||
"event": "PostToolUse",
|
||||
"tool": {
|
||||
"name": "Edit",
|
||||
"input": {
|
||||
"file_path": "/tmp/test-file.ts"
|
||||
}
|
||||
}
|
||||
}
|
||||
EOF
|
||||
}
|
||||
|
||||
# Test hook
|
||||
test_edit_tracker() {
|
||||
# Setup
|
||||
export LOG_FILE="/tmp/test-edit-log.txt"
|
||||
rm -f "$LOG_FILE"
|
||||
|
||||
# Run hook with mock event
|
||||
create_mock_edit_event | bash hooks/post-tool-use/01-track-edits.sh
|
||||
|
||||
# Verify
|
||||
if [ -f "$LOG_FILE" ]; then
|
||||
if grep -q "test-file.ts" "$LOG_FILE"; then
|
||||
echo "✅ Edit tracker test passed"
|
||||
return 0
|
||||
fi
|
||||
fi
|
||||
|
||||
echo "❌ Edit tracker test failed"
|
||||
return 1
|
||||
}
|
||||
|
||||
test_edit_tracker
|
||||
```
|
||||
|
||||
### Testing JavaScript Hooks
|
||||
|
||||
```javascript
|
||||
// test-skill-activator.js
|
||||
const { execSync } = require('child_process');
|
||||
|
||||
function testSkillActivator(prompt) {
|
||||
const mockEvent = JSON.stringify({
|
||||
text: prompt
|
||||
});
|
||||
|
||||
const result = execSync(
|
||||
'node hooks/user-prompt-submit/skill-activator.js',
|
||||
{
|
||||
input: mockEvent,
|
||||
encoding: 'utf8',
|
||||
env: {
|
||||
...process.env,
|
||||
SKILL_RULES: './test-skill-rules.json'
|
||||
}
|
||||
}
|
||||
);
|
||||
|
||||
return JSON.parse(result);
|
||||
}
|
||||
|
||||
// Test activation
|
||||
function testBackendActivation() {
|
||||
const result = testSkillActivator('How do I create a backend endpoint?');
|
||||
|
||||
if (result.additionalContext && result.additionalContext.includes('backend')) {
|
||||
console.log('✅ Backend activation test passed');
|
||||
} else {
|
||||
console.log('❌ Backend activation test failed');
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
testBackendActivation();
|
||||
```
|
||||
|
||||
## Manual Testing in Claude Code
|
||||
|
||||
### Testing Checklist
|
||||
|
||||
**Before deployment:**
|
||||
- [ ] Hook executes without errors
|
||||
- [ ] Hook completes within timeout (default 10s)
|
||||
- [ ] Output is helpful and not overwhelming
|
||||
- [ ] Non-blocking hooks don't prevent work
|
||||
- [ ] Blocking hooks have clear error messages
|
||||
- [ ] Hook handles missing files gracefully
|
||||
- [ ] Hook handles malformed input gracefully
|
||||
|
||||
### Manual Test Procedure
|
||||
|
||||
**1. Enable debug mode:**
|
||||
```bash
|
||||
# Add to top of hook
|
||||
set -x
|
||||
exec 2>>~/.claude/hooks/debug-$(date +%Y%m%d).log
|
||||
```
|
||||
|
||||
**2. Test with minimal prompt:**
|
||||
```
|
||||
Create a simple test file
|
||||
```
|
||||
|
||||
**3. Observe hook execution:**
|
||||
```bash
|
||||
# Watch debug log
|
||||
tail -f ~/.claude/hooks/debug-*.log
|
||||
```
|
||||
|
||||
**4. Verify output:**
|
||||
- Check that hook completes
|
||||
- Verify no errors in debug log
|
||||
- Confirm expected behavior
|
||||
|
||||
**5. Test edge cases:**
|
||||
- Empty file paths
|
||||
- Non-existent files
|
||||
- Files outside project
|
||||
- Malformed input
|
||||
- Missing dependencies
|
||||
|
||||
**6. Test performance:**
|
||||
```bash
|
||||
# Time hook execution
|
||||
time bash hooks/stop/build-checker.sh
|
||||
```
|
||||
|
||||
## Regression Testing
|
||||
|
||||
### Creating Test Suite
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# regression-test.sh
|
||||
|
||||
TEST_DIR="/tmp/hook-tests"
|
||||
mkdir -p "$TEST_DIR"
|
||||
|
||||
# Setup test environment
|
||||
setup() {
|
||||
export LOG_FILE="$TEST_DIR/edit-log.txt"
|
||||
export PROJECT_ROOT="$TEST_DIR/projects"
|
||||
mkdir -p "$PROJECT_ROOT"
|
||||
}
|
||||
|
||||
# Cleanup after tests
|
||||
teardown() {
|
||||
rm -rf "$TEST_DIR"
|
||||
}
|
||||
|
||||
# Test 1: Edit tracker logs edits
|
||||
test_edit_tracker_logs() {
|
||||
echo '{"tool": {"name": "Edit", "input": {"file_path": "/test/file.ts"}}}' | \
|
||||
bash hooks/post-tool-use/01-track-edits.sh
|
||||
|
||||
if grep -q "file.ts" "$LOG_FILE"; then
|
||||
echo "✅ Test 1 passed"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo "❌ Test 1 failed"
|
||||
return 1
|
||||
}
|
||||
|
||||
# Test 2: Build checker finds errors
|
||||
test_build_checker_finds_errors() {
|
||||
# Create mock project with errors
|
||||
mkdir -p "$PROJECT_ROOT/test-project"
|
||||
echo 'const x: string = 123;' > "$PROJECT_ROOT/test-project/error.ts"
|
||||
|
||||
# Add to log
|
||||
echo "2025-01-15 10:00:00 | test-project | error.ts" > "$LOG_FILE"
|
||||
|
||||
# Run build checker (should find errors)
|
||||
output=$(bash hooks/stop/20-build-checker.sh)
|
||||
|
||||
if echo "$output" | grep -q "error"; then
|
||||
echo "✅ Test 2 passed"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo "❌ Test 2 failed"
|
||||
return 1
|
||||
}
|
||||
|
||||
# Test 3: Formatter handles missing prettier
|
||||
test_formatter_missing_prettier() {
|
||||
# Create file without prettier config
|
||||
mkdir -p "$PROJECT_ROOT/no-prettier"
|
||||
echo 'const x=1' > "$PROJECT_ROOT/no-prettier/file.js"
|
||||
echo "2025-01-15 10:00:00 | no-prettier | file.js" > "$LOG_FILE"
|
||||
|
||||
# Should complete without error
|
||||
if bash hooks/stop/30-format-code.sh 2>&1; then
|
||||
echo "✅ Test 3 passed"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo "❌ Test 3 failed"
|
||||
return 1
|
||||
}
|
||||
|
||||
# Run all tests
|
||||
run_all_tests() {
|
||||
setup
|
||||
|
||||
local failed=0
|
||||
|
||||
test_edit_tracker_logs || ((failed++))
|
||||
test_build_checker_finds_errors || ((failed++))
|
||||
test_formatter_missing_prettier || ((failed++))
|
||||
|
||||
teardown
|
||||
|
||||
if [ $failed -eq 0 ]; then
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "✅ All tests passed!"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
return 0
|
||||
else
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "❌ $failed test(s) failed"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
run_all_tests
|
||||
```
|
||||
|
||||
### Running Regression Suite
|
||||
|
||||
```bash
|
||||
# Run before deploying changes
|
||||
bash test/regression-test.sh
|
||||
|
||||
# Run on schedule (cron)
|
||||
0 0 * * * cd ~/hooks && bash test/regression-test.sh
|
||||
```
|
||||
|
||||
## Performance Testing
|
||||
|
||||
### Measuring Hook Performance
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# benchmark-hook.sh
|
||||
|
||||
ITERATIONS=10
|
||||
HOOK_PATH="hooks/stop/build-checker.sh"
|
||||
|
||||
total_time=0
|
||||
|
||||
for i in $(seq 1 $ITERATIONS); do
|
||||
start=$(date +%s%N)
|
||||
bash "$HOOK_PATH" > /dev/null 2>&1
|
||||
end=$(date +%s%N)
|
||||
|
||||
elapsed=$(( (end - start) / 1000000 )) # Convert to ms
|
||||
total_time=$(( total_time + elapsed ))
|
||||
|
||||
echo "Iteration $i: ${elapsed}ms"
|
||||
done
|
||||
|
||||
average=$(( total_time / ITERATIONS ))
|
||||
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "Average: ${average}ms"
|
||||
echo "Total: ${total_time}ms"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
|
||||
if [ $average -gt 2000 ]; then
|
||||
echo "⚠️ Hook is slow (>2s)"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "✅ Performance acceptable"
|
||||
```
|
||||
|
||||
### Performance Targets
|
||||
|
||||
- **Non-blocking hooks:** <2 seconds
|
||||
- **Blocking hooks:** <5 seconds
|
||||
- **UserPromptSubmit:** <1 second (critical path)
|
||||
- **PostToolUse:** <500ms (runs frequently)
|
||||
|
||||
## Continuous Testing
|
||||
|
||||
### Pre-commit Hook for Hook Testing
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# .git/hooks/pre-commit
|
||||
|
||||
echo "Testing Claude Code hooks..."
|
||||
|
||||
# Run test suite
|
||||
if bash test/regression-test.sh; then
|
||||
echo "✅ Hook tests passed"
|
||||
exit 0
|
||||
else
|
||||
echo "❌ Hook tests failed"
|
||||
echo "Fix tests before committing"
|
||||
exit 1
|
||||
fi
|
||||
```
|
||||
|
||||
### CI/CD Integration
|
||||
|
||||
```yaml
|
||||
# .github/workflows/test-hooks.yml
|
||||
name: Test Claude Code Hooks
|
||||
|
||||
on: [push, pull_request]
|
||||
|
||||
jobs:
|
||||
test:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v2
|
||||
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v2
|
||||
with:
|
||||
node-version: '18'
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm install
|
||||
|
||||
- name: Run hook tests
|
||||
run: bash test/regression-test.sh
|
||||
|
||||
- name: Run performance tests
|
||||
run: bash test/benchmark-hook.sh
|
||||
```
|
||||
|
||||
## Common Testing Mistakes
|
||||
|
||||
### Mistake 1: Not Testing Error Paths
|
||||
|
||||
❌ **Wrong:**
|
||||
```bash
|
||||
# Only test success path
|
||||
npx tsc --noEmit
|
||||
echo "✅ Build passed"
|
||||
```
|
||||
|
||||
✅ **Right:**
|
||||
```bash
|
||||
# Test both success and failure
|
||||
if npx tsc --noEmit 2>&1; then
|
||||
echo "✅ Build passed"
|
||||
else
|
||||
echo "❌ Build failed"
|
||||
# Test that error handling works
|
||||
fi
|
||||
```
|
||||
|
||||
### Mistake 2: Hardcoding Paths
|
||||
|
||||
❌ **Wrong:**
|
||||
```bash
|
||||
# Hardcoded path
|
||||
cd /Users/myname/projects/myproject
|
||||
npm run build
|
||||
```
|
||||
|
||||
✅ **Right:**
|
||||
```bash
|
||||
# Dynamic path
|
||||
project_root=$(find_project_root "$file_path")
|
||||
if [ -n "$project_root" ]; then
|
||||
cd "$project_root"
|
||||
npm run build
|
||||
fi
|
||||
```
|
||||
|
||||
### Mistake 3: Not Cleaning Up
|
||||
|
||||
❌ **Wrong:**
|
||||
```bash
|
||||
# Leaves test files behind
|
||||
echo "test" > /tmp/test-file.txt
|
||||
run_test
|
||||
# Never cleans up
|
||||
```
|
||||
|
||||
✅ **Right:**
|
||||
```bash
|
||||
# Always cleanup
|
||||
trap 'rm -f /tmp/test-file.txt' EXIT
|
||||
echo "test" > /tmp/test-file.txt
|
||||
run_test
|
||||
```
|
||||
|
||||
### Mistake 4: Silent Failures
|
||||
|
||||
❌ **Wrong:**
|
||||
```bash
|
||||
# Errors disappear
|
||||
npx tsc --noEmit 2>/dev/null
|
||||
```
|
||||
|
||||
✅ **Right:**
|
||||
```bash
|
||||
# Capture errors
|
||||
output=$(npx tsc --noEmit 2>&1)
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "❌ TypeScript errors:"
|
||||
echo "$output"
|
||||
fi
|
||||
```
|
||||
|
||||
## Debugging Failed Tests
|
||||
|
||||
### Enable Verbose Output
|
||||
|
||||
```bash
|
||||
# Add debug flags
|
||||
set -x # Print commands
|
||||
set -e # Exit on error
|
||||
set -u # Error on undefined variables
|
||||
set -o pipefail # Catch pipe failures
|
||||
```
|
||||
|
||||
### Capture Test Output
|
||||
|
||||
```bash
|
||||
# Run test with full output
|
||||
bash -x test/regression-test.sh 2>&1 | tee test-output.log
|
||||
|
||||
# Review output
|
||||
less test-output.log
|
||||
```
|
||||
|
||||
### Isolate Failing Test
|
||||
|
||||
```bash
|
||||
# Run single test
|
||||
source test/regression-test.sh
|
||||
setup
|
||||
test_build_checker_finds_errors
|
||||
teardown
|
||||
```
|
||||
|
||||
## Remember
|
||||
|
||||
- **Test before deploying** - Hooks have full system access
|
||||
- **Test all paths** - Success, failure, edge cases
|
||||
- **Test performance** - Hooks shouldn't slow workflow
|
||||
- **Automate testing** - Run tests on every change
|
||||
- **Clean up** - Don't leave test artifacts
|
||||
- **Document tests** - Future you will thank present you
|
||||
|
||||
**Golden rule:** If you wouldn't run it on production, don't deploy it as a hook.
|
||||
1
skills/commands/brainstorm.md
Normal file
1
skills/commands/brainstorm.md
Normal file
@@ -0,0 +1 @@
|
||||
Use your hyperpowers:brainstorming skill.
|
||||
1
skills/commands/execute-plan.md
Normal file
1
skills/commands/execute-plan.md
Normal file
@@ -0,0 +1 @@
|
||||
Use your Executing-Plans skill.
|
||||
1
skills/commands/write-plan.md
Normal file
1
skills/commands/write-plan.md
Normal file
@@ -0,0 +1 @@
|
||||
Use your hyperpowers:writing-plans skill.
|
||||
141
skills/common-patterns/bd-commands.md
Normal file
141
skills/common-patterns/bd-commands.md
Normal file
@@ -0,0 +1,141 @@
|
||||
# bd Command Reference
|
||||
|
||||
Common bd commands used across multiple skills. Reference this instead of duplicating.
|
||||
|
||||
## Reading Issues
|
||||
|
||||
```bash
|
||||
# Show single issue with full design
|
||||
bd show bd-3
|
||||
|
||||
# List all open issues
|
||||
bd list --status open
|
||||
|
||||
# List closed issues
|
||||
bd list --status closed
|
||||
|
||||
# Show dependency tree for an epic
|
||||
bd dep tree bd-1
|
||||
|
||||
# Find tasks ready to work on (no blocking dependencies)
|
||||
bd ready
|
||||
|
||||
# List tasks in a specific epic
|
||||
bd list --parent bd-1
|
||||
```
|
||||
|
||||
## Creating Issues
|
||||
|
||||
```bash
|
||||
# Create epic
|
||||
bd create "Epic: Feature Name" \
|
||||
--type epic \
|
||||
--priority [0-4] \
|
||||
--design "## Goal
|
||||
[Epic description]
|
||||
|
||||
## Success Criteria
|
||||
- [ ] All phases complete
|
||||
..."
|
||||
|
||||
# Create feature/phase
|
||||
bd create "Phase 1: Phase Name" \
|
||||
--type feature \
|
||||
--priority [0-4] \
|
||||
--design "[Phase design]"
|
||||
|
||||
# Create task
|
||||
bd create "Task Name" \
|
||||
--type task \
|
||||
--priority [0-4] \
|
||||
--design "[Task design]"
|
||||
```
|
||||
|
||||
## Updating Issues
|
||||
|
||||
```bash
|
||||
# Update issue design (detailed description)
|
||||
bd update bd-3 --design "$(cat <<'EOF'
|
||||
[Complete updated design]
|
||||
EOF
|
||||
)"
|
||||
```
|
||||
|
||||
**IMPORTANT**: Use `--design` for the full detailed description, NOT `--description` (which is title only).
|
||||
|
||||
## Managing Status
|
||||
|
||||
```bash
|
||||
# Start working on task
|
||||
bd update bd-3 --status in_progress
|
||||
|
||||
# Complete task
|
||||
bd close bd-3
|
||||
|
||||
# Reopen task
|
||||
bd update bd-3 --status open
|
||||
```
|
||||
|
||||
**Common Mistakes:**
|
||||
```bash
|
||||
# ❌ WRONG - bd status shows database overview, doesn't change status
|
||||
bd status bd-3 --status in_progress
|
||||
|
||||
# ✅ CORRECT - use bd update to change status
|
||||
bd update bd-3 --status in_progress
|
||||
|
||||
# ❌ WRONG - using hyphens in status values
|
||||
bd update bd-3 --status in-progress
|
||||
|
||||
# ✅ CORRECT - use underscores in status values
|
||||
bd update bd-3 --status in_progress
|
||||
|
||||
# ❌ WRONG - 'done' is not a valid status
|
||||
bd update bd-3 --status done
|
||||
|
||||
# ✅ CORRECT - use bd close to complete
|
||||
bd close bd-3
|
||||
```
|
||||
|
||||
**Valid status values:** `open`, `in_progress`, `blocked`, `closed`
|
||||
|
||||
## Managing Dependencies
|
||||
|
||||
```bash
|
||||
# Add blocking dependency (LATER depends on EARLIER)
|
||||
# Syntax: bd dep add <dependent> <dependency>
|
||||
bd dep add bd-3 bd-2 # bd-3 depends on bd-2 (do bd-2 first)
|
||||
|
||||
# Add parent-child relationship
|
||||
# Syntax: bd dep add <child> <parent> --type parent-child
|
||||
bd dep add bd-3 bd-1 --type parent-child # bd-3 is child of bd-1
|
||||
|
||||
# View dependency tree
|
||||
bd dep tree bd-1
|
||||
```
|
||||
|
||||
## Commit Message Format
|
||||
|
||||
Reference bd task IDs in commits (use hyperpowers:test-runner agent):
|
||||
|
||||
```bash
|
||||
# Use test-runner agent to avoid pre-commit hook pollution
|
||||
Dispatch hyperpowers:test-runner agent: "Run: git add <files> && git commit -m 'feat(bd-3): implement feature
|
||||
|
||||
Implements step 1 of bd-3: Task Name
|
||||
'"
|
||||
```
|
||||
|
||||
## Common Queries
|
||||
|
||||
```bash
|
||||
# Check if all tasks in epic are closed
|
||||
bd list --status open --parent bd-1
|
||||
# Output: [empty] = all closed
|
||||
|
||||
# See what's blocking current work
|
||||
bd ready # Shows only unblocked tasks
|
||||
|
||||
# Find all in-progress work
|
||||
bd list --status in_progress
|
||||
```
|
||||
123
skills/common-patterns/common-anti-patterns.md
Normal file
123
skills/common-patterns/common-anti-patterns.md
Normal file
@@ -0,0 +1,123 @@
|
||||
# Common Anti-Patterns
|
||||
|
||||
Anti-patterns that apply across multiple skills. Reference this to avoid duplication.
|
||||
|
||||
## Language-Specific Anti-Patterns
|
||||
|
||||
### Rust
|
||||
|
||||
```
|
||||
❌ No unwrap() or expect() in production code
|
||||
Use proper error handling with Result/Option
|
||||
|
||||
❌ No todo!(), unimplemented!(), or panic!() in production
|
||||
Implement all code paths properly
|
||||
|
||||
❌ No #[ignore] on tests without bd issue number
|
||||
Fix or track broken tests
|
||||
|
||||
❌ No unsafe blocks without documentation
|
||||
Document safety invariants
|
||||
|
||||
❌ Use proper array bounds checking
|
||||
Prefer .get() over direct indexing in production
|
||||
```
|
||||
|
||||
### Swift
|
||||
|
||||
```
|
||||
❌ No force unwrap (!) in production code
|
||||
Use optional chaining or guard/if let
|
||||
|
||||
❌ No fatalError() in production code
|
||||
Handle errors gracefully
|
||||
|
||||
❌ No disabled tests without bd issue number
|
||||
Fix or track broken tests
|
||||
|
||||
❌ Use proper array bounds checking
|
||||
Check indices before accessing
|
||||
|
||||
❌ Handle all enum cases
|
||||
No default: fatalError() shortcuts
|
||||
```
|
||||
|
||||
### TypeScript
|
||||
|
||||
```
|
||||
❌ No @ts-ignore or @ts-expect-error without bd issue number
|
||||
Fix type issues properly
|
||||
|
||||
❌ No any types without justification
|
||||
Use proper typing
|
||||
|
||||
❌ No .skip() on tests without bd issue number
|
||||
Fix or track broken tests
|
||||
|
||||
❌ No throw in async code without proper handling
|
||||
Use try/catch or Promise.catch()
|
||||
```
|
||||
|
||||
## General Anti-Patterns
|
||||
|
||||
### Code Quality
|
||||
|
||||
```
|
||||
❌ No TODOs or FIXMEs without bd issue numbers
|
||||
Track work in bd, not in code comments
|
||||
|
||||
❌ No stub implementations
|
||||
Empty functions, placeholder returns forbidden
|
||||
|
||||
❌ No commented-out code
|
||||
Delete it - version control remembers
|
||||
|
||||
❌ No debug print statements in commits
|
||||
Remove console.log, println!, print() before committing
|
||||
|
||||
❌ No "we'll do this later"
|
||||
Either do it now or create bd issue and reference it
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
```
|
||||
❌ Don't test mock behavior
|
||||
Test real behavior or unmock it
|
||||
|
||||
❌ Don't add test-only methods to production code
|
||||
Put in test utilities instead
|
||||
|
||||
❌ Don't mock without understanding dependencies
|
||||
Understand what you're testing first
|
||||
|
||||
❌ Don't skip verifications
|
||||
Run the test, see the output, then claim it passes
|
||||
```
|
||||
|
||||
### Process
|
||||
|
||||
```
|
||||
❌ Don't commit without running tests
|
||||
Verify tests pass before committing
|
||||
|
||||
❌ Don't create PR without running full test suite
|
||||
All tests must pass before PR creation
|
||||
|
||||
❌ Don't skip pre-commit hooks
|
||||
Never use --no-verify
|
||||
|
||||
❌ Don't force push without explicit request
|
||||
Respect shared branch history
|
||||
|
||||
❌ Don't assume backwards compatibility is desired
|
||||
Ask if breaking changes are acceptable
|
||||
```
|
||||
|
||||
## Project-Specific Additions
|
||||
|
||||
Each project may have additional anti-patterns. Check CLAUDE.md for:
|
||||
- Project-specific code patterns to avoid
|
||||
- Custom linting rules
|
||||
- Framework-specific anti-patterns
|
||||
- Team conventions
|
||||
99
skills/common-patterns/common-rationalizations.md
Normal file
99
skills/common-patterns/common-rationalizations.md
Normal file
@@ -0,0 +1,99 @@
|
||||
# Common Rationalizations - STOP
|
||||
|
||||
These rationalizations appear across multiple contexts. When you catch yourself thinking any of these, STOP - you're about to violate a skill.
|
||||
|
||||
## Process Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "This is simple, can skip the process" | Simple tasks done wrong become complex problems. |
|
||||
| "Just this once" | No exceptions. Process exists because exceptions fail. |
|
||||
| "I'm confident this will work" | Confidence ≠ evidence. Run the verification. |
|
||||
| "I'm tired" | Exhaustion ≠ excuse for shortcuts. |
|
||||
| "No time for proper approach" | Shortcuts cost more time in rework. |
|
||||
| "Partner won't notice" | They will. Trust is earned through consistency. |
|
||||
| "Different words so rule doesn't apply" | Spirit over letter. Intent matters. |
|
||||
|
||||
## Verification Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Should work now" | RUN the verification command. |
|
||||
| "Looks correct" | Run it and see the output. |
|
||||
| "Tests probably pass" | Probably ≠ verified. Run them. |
|
||||
| "Linter passed, must be fine" | Linter ≠ compiler ≠ tests. Run everything. |
|
||||
| "Partial check is enough" | Partial proves nothing about the whole. |
|
||||
| "Agent said success" | Agents lie/hallucinate. Verify independently. |
|
||||
|
||||
## Documentation Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "File probably exists" | Use tools to verify. Don't assume. |
|
||||
| "Design mentioned it, must be there" | Codebase changes. Verify current state. |
|
||||
| "I can verify quickly myself" | Use investigator agents. Prevents hallucination. |
|
||||
| "User can figure it out during execution" | Your job is exact instructions. No ambiguity. |
|
||||
|
||||
## Planning Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Can skip exploring alternatives" | Comparison reveals issues. Always propose 2-3. |
|
||||
| "Partner knows what they want" | Questions reveal hidden constraints. Always ask. |
|
||||
| "Whole design at once for efficiency" | Incremental validation catches problems early. |
|
||||
| "Checklist is just suggestion" | Create TodoWrite todos. Track properly. |
|
||||
| "Subtask can reference parent for details" | NO. Subtasks must be complete. NO placeholders, NO "see parent". |
|
||||
| "I'll use placeholder and fill in later" | NO. Write actual content NOW. No meta-references like "[detailed above]". |
|
||||
| "Design field is too long, use placeholder" | Length doesn't matter. Write full content. Placeholder defeats the purpose. |
|
||||
| "Should I continue to the next task?" | YES. You have a TodoWrite plan. Execute it. Don't interrupt your own workflow. |
|
||||
| "Let me ask user's preference for remaining tasks" | NO. The user gave you the work. Do it. Only ask at natural completion points. |
|
||||
| "Should I stop here or keep going?" | Your TodoWrite list tells you. If tasks remain, continue. |
|
||||
|
||||
## Execution Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Task is tracked in TodoWrite, don't need step tracking" | Tasks have 4-8 implementation steps. Without substep tracking, steps 4-8 get skipped. |
|
||||
| "Made progress on the task, can move on" | Progress ≠ complete. All substeps must finish. 2/6 steps = 33%, not done. |
|
||||
| "Other tasks are waiting, should continue" | Current task incomplete = blocked. Finish all substeps first. |
|
||||
| "Can finish remaining steps later" | Later never comes. Complete all substeps now before closing task. |
|
||||
|
||||
## Quality Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Small gaps don't matter" | Spec is contract. All criteria must be met. |
|
||||
| "Will fix in next PR" | This PR should complete this work. Fix now. |
|
||||
| "Partner will review anyway" | You review first. Don't delegate your quality check. |
|
||||
| "Good enough for now" | "Now" becomes "forever". Do it right. |
|
||||
|
||||
## TDD Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Test is obvious, can skip RED phase" | If you don't watch it fail, you don't know it works. |
|
||||
| "Will adapt this code while writing test" | Delete it. Start fresh from the test. |
|
||||
| "Can keep it as reference" | No. Delete means delete. |
|
||||
| "Test is simple, don't need to run it" | Simple tests fail for subtle reasons. Run it. |
|
||||
|
||||
## Research Shortcuts
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "I can research quickly myself" | Use agents. You'll hallucinate or waste context. |
|
||||
| "Agent didn't find it first try, must not exist" | Be persistent. Refine query and try again. |
|
||||
| "I know this codebase" | You don't know current state. Always verify. |
|
||||
| "Obvious solution, skip research" | Codebase may have established pattern. Check first. |
|
||||
|
||||
**All of these mean: STOP. Follow the requirements exactly.**
|
||||
|
||||
## Why This Matters
|
||||
|
||||
Rationalizations are how good processes fail:
|
||||
1. Developer thinks "just this once"
|
||||
2. Shortcut causes subtle bug
|
||||
3. Bug found in production/PR
|
||||
4. More time spent fixing than process would have cost
|
||||
5. Trust damaged
|
||||
|
||||
**No shortcuts. Follow the process. Every time.**
|
||||
493
skills/debugging-with-tools/SKILL.md
Normal file
493
skills/debugging-with-tools/SKILL.md
Normal file
@@ -0,0 +1,493 @@
|
||||
---
|
||||
name: debugging-with-tools
|
||||
description: Use when encountering bugs or test failures - systematic debugging using debuggers, internet research, and agents to find root cause before fixing
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Random fixes waste time and create new bugs. Always use tools to understand root cause BEFORE attempting fixes. Symptom fixes are failure.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
MEDIUM FREEDOM - Must complete investigation phases (tools → hypothesis → test) before fixing.
|
||||
|
||||
Can adapt tool choice to language/context. Never skip investigation or guess at fixes.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
|
||||
| Phase | Tools to Use | Output |
|
||||
|-------|--------------|--------|
|
||||
| **1. Investigate** | Error messages, internet-researcher agent, debugger, codebase-investigator | Root cause understanding |
|
||||
| **2. Hypothesize** | Form theory based on evidence (not guesses) | Testable hypothesis |
|
||||
| **3. Test** | Validate hypothesis with minimal change | Confirms or rejects theory |
|
||||
| **4. Fix** | Implement proper fix for root cause | Problem solved permanently |
|
||||
|
||||
**FORBIDDEN:** Skip investigation → guess at fix → hope it works
|
||||
**REQUIRED:** Tools → evidence → hypothesis → test → fix
|
||||
|
||||
**Key agents:**
|
||||
- `internet-researcher` - Search error messages, known bugs, solutions
|
||||
- `codebase-investigator` - Understand code structure, find related code
|
||||
- `test-runner` - Run tests without output pollution
|
||||
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
**Use for ANY technical issue:**
|
||||
- Test failures
|
||||
- Bugs in production or development
|
||||
- Unexpected behavior
|
||||
- Build failures
|
||||
- Integration issues
|
||||
- Performance problems
|
||||
|
||||
**ESPECIALLY when:**
|
||||
- "Just one quick fix" seems obvious
|
||||
- Under time pressure (emergencies make guessing tempting)
|
||||
- Error message is unclear
|
||||
- Previous fix didn't work
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
|
||||
## Phase 1: Tool-Assisted Investigation
|
||||
|
||||
**BEFORE attempting ANY fix, gather evidence with tools:**
|
||||
|
||||
### 1. Read Complete Error Messages
|
||||
|
||||
- Entire error message (not just first line)
|
||||
- Complete stack trace (all frames)
|
||||
- Line numbers, file paths, error codes
|
||||
- Stack traces show exact execution path
|
||||
|
||||
### 2. Search Internet FIRST (Use internet-researcher Agent)
|
||||
|
||||
**Dispatch internet-researcher with:**
|
||||
```
|
||||
"Search for error: [exact error message]
|
||||
- Check Stack Overflow solutions
|
||||
- Look for GitHub issues in [library] version [X]
|
||||
- Find official documentation explaining this error
|
||||
- Check if this is a known bug"
|
||||
```
|
||||
|
||||
**What agent should find:**
|
||||
- Exact matches to your error
|
||||
- Similar symptoms and solutions
|
||||
- Known bugs in your dependency versions
|
||||
- Workarounds that worked for others
|
||||
|
||||
### 3. Use Debugger to Inspect State
|
||||
|
||||
**Claude cannot run debuggers directly. Instead:**
|
||||
|
||||
**Option A - Recommend debugger to user:**
|
||||
```
|
||||
"Let's use lldb/gdb/DevTools to inspect state at error location.
|
||||
Please run: [specific commands]
|
||||
When breakpoint hits: [what to inspect]
|
||||
Share output with me."
|
||||
```
|
||||
|
||||
**Option B - Add instrumentation Claude can add:**
|
||||
```rust
|
||||
// Add logging
|
||||
println!("DEBUG: var = {:?}, state = {:?}", var, state);
|
||||
|
||||
// Add assertions
|
||||
assert!(condition, "Expected X but got {:?}", actual);
|
||||
```
|
||||
|
||||
### 4. Investigate Codebase (Use codebase-investigator Agent)
|
||||
|
||||
**Dispatch codebase-investigator with:**
|
||||
```
|
||||
"Error occurs in function X at line Y.
|
||||
Find:
|
||||
- How is X called? What are the callers?
|
||||
- What does variable Z contain at this point?
|
||||
- Are there similar functions that work correctly?
|
||||
- What changed recently in this area?"
|
||||
```
|
||||
|
||||
## Phase 2: Form Hypothesis
|
||||
|
||||
**Based on evidence (not guesses):**
|
||||
|
||||
1. **State what you know** (from investigation)
|
||||
2. **Propose theory** explaining the evidence
|
||||
3. **Make prediction** that tests the theory
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Known: Error "null pointer" at auth.rs:45 when email is empty
|
||||
Theory: Empty email bypasses validation, passes null to login()
|
||||
Prediction: Adding validation before login() will prevent error
|
||||
Test: Add validation, verify error doesn't occur with empty email
|
||||
```
|
||||
|
||||
**NEVER:**
|
||||
- Guess without evidence
|
||||
- Propose fix without hypothesis
|
||||
- Skip to "try this and see"
|
||||
|
||||
## Phase 3: Test Hypothesis
|
||||
|
||||
**Minimal change to validate theory:**
|
||||
|
||||
1. Make smallest change that tests hypothesis
|
||||
2. Run test/reproduction case
|
||||
3. Observe result
|
||||
|
||||
**If confirmed:** Proceed to Phase 4
|
||||
**If rejected:** Return to Phase 1 with new information
|
||||
|
||||
## Phase 4: Implement Fix
|
||||
|
||||
**After understanding root cause:**
|
||||
|
||||
1. Write test reproducing bug (RED phase - use test-driven-development skill)
|
||||
2. Implement proper fix addressing root cause
|
||||
3. Verify test passes (GREEN phase)
|
||||
4. Run full test suite (regression check)
|
||||
5. Commit fix
|
||||
|
||||
**The fix should:**
|
||||
- Address root cause (not symptom)
|
||||
- Be minimal and focused
|
||||
- Include test preventing regression
|
||||
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
|
||||
<example>
|
||||
<scenario>Developer encounters test failure, immediately tries "obvious" fix without investigation</scenario>
|
||||
|
||||
<code>
|
||||
Test error:
|
||||
```
|
||||
FAIL: test_login_expired_token
|
||||
AssertionError: Expected Err(TokenExpired), got Ok(User)
|
||||
```
|
||||
|
||||
Developer thinks: "Obviously the token expiration check is wrong"
|
||||
|
||||
Makes change without investigation:
|
||||
```rust
|
||||
// "Fix" - just check if token is expired
|
||||
if token.expires_at < now() {
|
||||
return Err(AuthError::TokenExpired);
|
||||
}
|
||||
```
|
||||
|
||||
Commits without testing other cases.
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**No investigation:**
|
||||
- Didn't read error completely
|
||||
- Didn't check what `expires_at` contains
|
||||
- Didn't debug to see token state
|
||||
- Didn't search for similar issues
|
||||
|
||||
**What actually happened:** Token `expires_at` was being parsed incorrectly, always showing future date. The "fix" adds dead code that never runs.
|
||||
|
||||
**Result:** Bug not fixed, new dead code added, time wasted.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Phase 1 - Investigate with tools:**
|
||||
|
||||
```bash
|
||||
# 1. Read complete error
|
||||
FAIL: test_login_expired_token at line 45
|
||||
Expected: Err(TokenExpired)
|
||||
Got: Ok(User { id: 123 })
|
||||
Token: { expires_at: "2099-01-01", ... }
|
||||
```
|
||||
|
||||
**Dispatch internet-researcher:**
|
||||
```
|
||||
"Search for: token expiration always showing future date
|
||||
- Check date parsing bugs
|
||||
- Look for timezone issues
|
||||
- Find JWT expiration handling"
|
||||
```
|
||||
|
||||
**Add instrumentation:**
|
||||
```rust
|
||||
println!("DEBUG: expires_at = {:?}, now = {:?}, expired = {:?}",
|
||||
token.expires_at, now(), token.expires_at < now());
|
||||
```
|
||||
|
||||
**Run test again:**
|
||||
```
|
||||
DEBUG: expires_at = 2099-01-01T00:00:00Z, now = 2024-01-15T10:30:00Z, expired = false
|
||||
```
|
||||
|
||||
**Phase 2 - Hypothesis:**
|
||||
"Token `expires_at` is being set to 2099, not actual expiration. Problem is in token creation, not validation."
|
||||
|
||||
**Phase 3 - Test:**
|
||||
Check token creation code:
|
||||
```rust
|
||||
// Found the bug!
|
||||
fn create_token() -> Token {
|
||||
Token {
|
||||
expires_at: "2099-01-01".parse()?, // HARDCODED!
|
||||
...
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Phase 4 - Fix root cause:**
|
||||
```rust
|
||||
fn create_token(duration: Duration) -> Token {
|
||||
Token {
|
||||
expires_at: now() + duration, // Correct
|
||||
...
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Result:** Root cause fixed, test passes, no dead code.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer skips internet search, reinvents solution to known problem</scenario>
|
||||
|
||||
<code>
|
||||
Error:
|
||||
```
|
||||
error: linking with `cc` failed: exit status: 1
|
||||
ld: symbol(s) not found for architecture arm64
|
||||
```
|
||||
|
||||
Developer thinks: "Must be a linking issue, I'll add flags"
|
||||
|
||||
Spends 2 hours trying different linker flags:
|
||||
```toml
|
||||
[target.aarch64-apple-darwin]
|
||||
rustflags = ["-C", "link-arg=-undefined dynamic_lookup"]
|
||||
# Doesn't work, tries more flags...
|
||||
```
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Skipped internet search:**
|
||||
- This is a common error with known solutions
|
||||
- Stack Overflow has exact fix
|
||||
- Official docs explain the issue
|
||||
- Wasted 2 hours reinventing solution
|
||||
|
||||
**Why it happens:** Impatience, thinking "I can figure this out faster"
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Dispatch internet-researcher FIRST:**
|
||||
|
||||
```
|
||||
"Search for: 'symbol not found for architecture arm64' Rust linking
|
||||
- Check Stack Overflow solutions
|
||||
- Look for Xcode/macOS specific fixes
|
||||
- Find Cargo configuration for Apple Silicon"
|
||||
```
|
||||
|
||||
**Agent returns (30 seconds):**
|
||||
```
|
||||
Found on Stack Overflow (2.4k upvotes):
|
||||
This occurs when Xcode Command Line Tools aren't installed or outdated.
|
||||
|
||||
Solution:
|
||||
xcode-select --install
|
||||
|
||||
Or update existing:
|
||||
softwareupdate --all --install --force
|
||||
```
|
||||
|
||||
**Apply solution:**
|
||||
```bash
|
||||
xcode-select --install
|
||||
# Wait for installation
|
||||
cargo build
|
||||
# Success!
|
||||
```
|
||||
|
||||
**Result:** Fixed in 5 minutes, not 2 hours.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer fixes symptom without understanding root cause, bug returns different way</scenario>
|
||||
|
||||
<code>
|
||||
Bug: Users can delete other users' posts
|
||||
|
||||
Developer notices: DELETE /posts/:id doesn't check ownership
|
||||
|
||||
"Fix":
|
||||
```python
|
||||
@app.delete("/posts/{post_id}")
|
||||
def delete_post(post_id: int, user: User):
|
||||
post = get_post(post_id)
|
||||
if post.author_id != user.id:
|
||||
raise Forbidden()
|
||||
delete(post)
|
||||
```
|
||||
|
||||
Bug "fixed"... but bug report comes back: "Users can edit other users' posts"
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Symptom fix:**
|
||||
- Fixed DELETE endpoint
|
||||
- Didn't investigate root cause
|
||||
- Didn't check other endpoints
|
||||
|
||||
**Root cause:** No authorization middleware checking ownership across ALL endpoints.
|
||||
|
||||
**Result:** Whack-a-mole fixing symptoms, not the underlying issue.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Phase 1 - Investigate with codebase-investigator:**
|
||||
|
||||
```
|
||||
"User can delete other users' posts. Investigate:
|
||||
- What authorization exists across all post endpoints?
|
||||
- Is there a pattern for ownership checks?
|
||||
- Which other endpoints access posts?
|
||||
- Is there middleware handling this?"
|
||||
```
|
||||
|
||||
**Agent reports:**
|
||||
```
|
||||
Found:
|
||||
- DELETE /posts/:id - NO ownership check
|
||||
- PUT /posts/:id - NO ownership check
|
||||
- GET /posts/:id - NO ownership check (allows viewing private posts!)
|
||||
- No authorization middleware found
|
||||
- Similar pattern in comments, likes endpoints
|
||||
```
|
||||
|
||||
**Phase 2 - Hypothesis:**
|
||||
"Missing authorization layer. Need middleware checking resource ownership across ALL endpoints."
|
||||
|
||||
**Phase 4 - Fix root cause:**
|
||||
```python
|
||||
# Add authorization middleware
|
||||
class OwnershipMiddleware:
|
||||
def check_ownership(self, resource, user):
|
||||
if resource.author_id != user.id:
|
||||
raise Forbidden()
|
||||
|
||||
# Apply to all endpoints
|
||||
@app.delete("/posts/{post_id}")
|
||||
@require_ownership(Post)
|
||||
def delete_post(...):
|
||||
...
|
||||
|
||||
@app.put("/posts/{post_id}")
|
||||
@require_ownership(Post)
|
||||
def update_post(...):
|
||||
...
|
||||
```
|
||||
|
||||
**Result:** Root cause fixed, ALL endpoints secured, not just one symptom.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Tools before fixes** → Never guess without investigation
|
||||
- Use internet-researcher for errors
|
||||
- Use debugger or instrumentation for state
|
||||
- Use codebase-investigator for context
|
||||
|
||||
2. **Evidence-based hypotheses** → Not guesses or hunches
|
||||
- State what tools revealed
|
||||
- Propose theory explaining evidence
|
||||
- Make testable prediction
|
||||
|
||||
3. **Test hypothesis before fixing** → Minimal change to validate
|
||||
- Smallest change that tests theory
|
||||
- Observe result
|
||||
- If wrong, return to investigation
|
||||
|
||||
4. **Fix root cause, not symptom** → One fix, many symptoms prevented
|
||||
- Understand why problem occurred
|
||||
- Fix the underlying issue
|
||||
- Don't play whack-a-mole
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: Stop, use tools to investigate:
|
||||
- "The fix is obvious"
|
||||
- "I know what this is"
|
||||
- "Just a quick try"
|
||||
- "No time for debugging"
|
||||
- "Error message is clear enough"
|
||||
- "Internet search will take too long"
|
||||
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
|
||||
Before proposing any fix:
|
||||
- [ ] Read complete error message (not just first line)
|
||||
- [ ] Dispatched internet-researcher for unclear errors
|
||||
- [ ] Used debugger or added instrumentation to inspect state
|
||||
- [ ] Dispatched codebase-investigator to understand context
|
||||
- [ ] Formed hypothesis based on evidence (not guesses)
|
||||
- [ ] Tested hypothesis with minimal change
|
||||
- [ ] Verified hypothesis confirmed before fixing
|
||||
|
||||
Before committing fix:
|
||||
- [ ] Written test reproducing bug (RED phase)
|
||||
- [ ] Verified test fails before fix
|
||||
- [ ] Implemented fix addressing root cause
|
||||
- [ ] Verified test passes after fix (GREEN phase)
|
||||
- [ ] Ran full test suite (regression check)
|
||||
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
|
||||
**This skill calls:**
|
||||
- internet-researcher (search errors, known bugs, solutions)
|
||||
- codebase-investigator (understand code structure, find related code)
|
||||
- test-driven-development (write test for bug, implement fix)
|
||||
- test-runner (run tests without output pollution)
|
||||
|
||||
**This skill is called by:**
|
||||
- fixing-bugs (complete bug fix workflow)
|
||||
- root-cause-tracing (deep debugging for complex issues)
|
||||
- Any skill when encountering unexpected behavior
|
||||
|
||||
**Agents used:**
|
||||
- hyperpowers:internet-researcher (search for error solutions)
|
||||
- hyperpowers:codebase-investigator (understand codebase context)
|
||||
- hyperpowers:test-runner (run tests, return summary only)
|
||||
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
|
||||
**Detailed guides:**
|
||||
- [Debugger reference](resources/debugger-reference.md) - LLDB, GDB, DevTools commands
|
||||
- [Debugging session example](resources/debugging-session-example.md) - Complete walkthrough
|
||||
|
||||
**When stuck:**
|
||||
- Error unclear → Dispatch internet-researcher with exact error text
|
||||
- Don't understand code flow → Dispatch codebase-investigator
|
||||
- Need to inspect runtime state → Recommend debugger to user or add instrumentation
|
||||
- Tempted to guess → Stop, use tools to gather evidence first
|
||||
|
||||
</resources>
|
||||
113
skills/debugging-with-tools/resources/debugger-reference.md
Normal file
113
skills/debugging-with-tools/resources/debugger-reference.md
Normal file
@@ -0,0 +1,113 @@
|
||||
## Debugger Quick Reference
|
||||
|
||||
### Automated Debugging (Claude CAN run these)
|
||||
|
||||
#### lldb Batch Mode
|
||||
|
||||
```bash
|
||||
# One-shot command to inspect variable at breakpoint
|
||||
lldb -o "breakpoint set --file main.rs --line 42" \
|
||||
-o "run" \
|
||||
-o "frame variable my_var" \
|
||||
-o "quit" \
|
||||
-- target/debug/myapp 2>&1
|
||||
|
||||
# With script file for complex debugging
|
||||
cat > debug.lldb <<'EOF'
|
||||
breakpoint set --file main.rs --line 42
|
||||
run
|
||||
frame variable
|
||||
bt
|
||||
up
|
||||
frame variable
|
||||
quit
|
||||
EOF
|
||||
|
||||
lldb -s debug.lldb target/debug/myapp 2>&1
|
||||
```
|
||||
|
||||
#### strace (Linux - system call tracing)
|
||||
|
||||
```bash
|
||||
# See which files program opens
|
||||
strace -e trace=open,openat cargo run 2>&1 | grep -v "ENOENT"
|
||||
|
||||
# Find network activity
|
||||
strace -e trace=network cargo run 2>&1
|
||||
|
||||
# All syscalls with time
|
||||
strace -tt cargo test some_test 2>&1
|
||||
```
|
||||
|
||||
#### dtrace (macOS - dynamic tracing)
|
||||
|
||||
```bash
|
||||
# Trace function calls
|
||||
sudo dtrace -n 'pid$target:myapp::entry { printf("%s", probefunc); }' -p <PID>
|
||||
```
|
||||
|
||||
### Interactive Debugging (USER runs these, Claude guides)
|
||||
|
||||
**These require interactive terminal - Claude provides commands, user runs them**
|
||||
|
||||
### lldb (Rust, Swift, C++)
|
||||
|
||||
```bash
|
||||
# Start debugging
|
||||
lldb target/debug/myapp
|
||||
|
||||
# Set breakpoints
|
||||
(lldb) breakpoint set --file main.rs --line 42
|
||||
(lldb) breakpoint set --name my_function
|
||||
|
||||
# Run
|
||||
(lldb) run
|
||||
(lldb) run arg1 arg2
|
||||
|
||||
# When paused:
|
||||
(lldb) frame variable # Show all locals
|
||||
(lldb) print my_var # Print specific variable
|
||||
(lldb) bt # Backtrace (stack)
|
||||
(lldb) up / down # Navigate stack
|
||||
(lldb) continue # Resume
|
||||
(lldb) step / next # Step into / over
|
||||
(lldb) finish # Run until return
|
||||
```
|
||||
|
||||
### Browser DevTools (JavaScript)
|
||||
|
||||
```javascript
|
||||
// In code:
|
||||
debugger; // Execution pauses here
|
||||
|
||||
// In DevTools:
|
||||
// - Sources tab → Add breakpoint by clicking line number
|
||||
// - When paused:
|
||||
// - Scope panel: See all variables
|
||||
// - Watch: Add expressions to watch
|
||||
// - Call stack: Navigate callers
|
||||
// - Step over (F10), Step into (F11)
|
||||
```
|
||||
|
||||
### gdb (C, C++, Go)
|
||||
|
||||
```bash
|
||||
# Start debugging
|
||||
gdb ./myapp
|
||||
|
||||
# Set breakpoints
|
||||
(gdb) break main.c:42
|
||||
(gdb) break myfunction
|
||||
|
||||
# Run
|
||||
(gdb) run
|
||||
|
||||
# When paused:
|
||||
(gdb) print myvar
|
||||
(gdb) info locals
|
||||
(gdb) backtrace
|
||||
(gdb) up / down
|
||||
(gdb) continue
|
||||
(gdb) step / next
|
||||
```
|
||||
|
||||
@@ -0,0 +1,73 @@
|
||||
## Example: Complete Debugging Session
|
||||
|
||||
**Problem:** Test fails with "Symbol not found: _OBJC_CLASS_$_WKWebView"
|
||||
|
||||
**Phase 1: Investigation**
|
||||
|
||||
1. **Read error**: Symbol not found, linking issue
|
||||
2. **Internet research**:
|
||||
```
|
||||
Dispatch hyperpowers:internet-researcher:
|
||||
"Search for 'dyld Symbol not found _OBJC_CLASS_$_WKWebView'
|
||||
Focus on: Xcode linking, framework configuration, iOS deployment"
|
||||
|
||||
Results: Need to link WebKit framework in Xcode project
|
||||
```
|
||||
|
||||
3. **Debugger**: Not needed, linking happens before runtime
|
||||
|
||||
4. **Codebase investigation**:
|
||||
```
|
||||
Dispatch hyperpowers:codebase-investigator:
|
||||
"Find other code using WKWebView - how is WebKit linked?"
|
||||
|
||||
Results: Main app target has WebKit in frameworks, test target doesn't
|
||||
```
|
||||
|
||||
**Phase 2: Analysis**
|
||||
|
||||
Root cause: Test target doesn't link WebKit framework
|
||||
Evidence: Main target works, test target fails, Stack Overflow confirms
|
||||
|
||||
**Phase 3: Testing**
|
||||
|
||||
Hypothesis: Adding WebKit to test target will fix it
|
||||
|
||||
Minimal test:
|
||||
1. Add WebKit.framework to test target
|
||||
2. Clean build
|
||||
3. Run tests
|
||||
|
||||
```
|
||||
Dispatch hyperpowers:test-runner: "Run: swift test"
|
||||
Result: ✓ All tests pass
|
||||
```
|
||||
|
||||
**Phase 4: Implementation**
|
||||
|
||||
1. Test already exists (the failing test)
|
||||
2. Fix: Framework linked
|
||||
3. Verification: Tests pass
|
||||
4. Update bd:
|
||||
```bash
|
||||
bd close bd-123
|
||||
```
|
||||
|
||||
**Time:** 15 minutes systematic vs. 2+ hours guessing
|
||||
|
||||
## Remember
|
||||
|
||||
- **Tools make debugging faster**, not slower
|
||||
- **hyperpowers:internet-researcher** can find solutions in seconds
|
||||
- **Automated debugging works** - lldb batch mode, strace, instrumentation
|
||||
- **hyperpowers:codebase-investigator** finds patterns you'd miss
|
||||
- **hyperpowers:test-runner agent** keeps context clean
|
||||
- **Evidence before fixes**, always
|
||||
|
||||
**Prefer automated tools:**
|
||||
1. lldb batch mode - non-interactive variable inspection
|
||||
2. strace/dtrace - system call tracing
|
||||
3. Instrumentation - logging Claude can add
|
||||
4. Interactive debugger - only when automated tools insufficient
|
||||
|
||||
95% faster to investigate systematically than to guess-and-check.
|
||||
662
skills/dispatching-parallel-agents/SKILL.md
Normal file
662
skills/dispatching-parallel-agents/SKILL.md
Normal file
@@ -0,0 +1,662 @@
|
||||
---
|
||||
name: dispatching-parallel-agents
|
||||
description: Use when facing 3+ independent failures that can be investigated without shared state or dependencies - dispatches multiple Claude agents to investigate and fix independent problems concurrently
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
When facing 3+ independent failures, dispatch one agent per problem domain to investigate concurrently; verify independence first, dispatch all in single message, wait for all agents, check conflicts, verify integration.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
MEDIUM FREEDOM - Follow the 6-step process (identify, create tasks, dispatch, monitor, review, verify) strictly. Independence verification mandatory. Parallel dispatch in single message required. Adapt agent prompt content to problem domain.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Step | Action | Critical Rule |
|
||||
|------|--------|---------------|
|
||||
| 1. Identify Domains | Test independence (fix A doesn't affect B) | 3+ independent domains required |
|
||||
| 2. Create Agent Tasks | Write focused prompts (scope, goal, constraints, output) | One prompt per domain |
|
||||
| 3. Dispatch Agents | Launch all agents in SINGLE message | Multiple Task() calls in parallel |
|
||||
| 4. Monitor Progress | Track completions, don't integrate until ALL done | Wait for all agents |
|
||||
| 5. Review Results | Read summaries, check conflicts | Manual conflict resolution |
|
||||
| 6. Verify Integration | Run full test suite | Use verification-before-completion |
|
||||
|
||||
**Why 3+?** With only 2 failures, coordination overhead often exceeds sequential time.
|
||||
|
||||
**Critical:** Dispatch all agents in single message with multiple Task() calls, or they run sequentially.
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
Use when:
|
||||
- 3+ test files failing with different root causes
|
||||
- Multiple subsystems broken independently
|
||||
- Each problem can be understood without context from others
|
||||
- No shared state between investigations
|
||||
- You've verified failures are truly independent
|
||||
- Each domain has clear boundaries (different files, modules, features)
|
||||
|
||||
Don't use when:
|
||||
- Failures are related (fix one might fix others)
|
||||
- Need to understand full system state first
|
||||
- Agents would interfere (editing same files)
|
||||
- Haven't verified independence yet (exploratory phase)
|
||||
- Failures share root cause (one bug, multiple symptoms)
|
||||
- Need to preserve investigation order (cascading failures)
|
||||
- Only 2 failures (overhead exceeds benefit)
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## Step 1: Identify Independent Domains
|
||||
|
||||
**Announce:** "I'm using hyperpowers:dispatching-parallel-agents to investigate these independent failures concurrently."
|
||||
|
||||
**Create TodoWrite tracker:**
|
||||
```
|
||||
- Identify independent domains (3+ domains identified)
|
||||
- Create agent tasks (one prompt per domain drafted)
|
||||
- Dispatch agents in parallel (all agents launched in single message)
|
||||
- Monitor agent progress (track completions)
|
||||
- Review results (summaries read, conflicts checked)
|
||||
- Verify integration (full test suite green)
|
||||
```
|
||||
|
||||
**Test for independence:**
|
||||
|
||||
1. **Ask:** "If I fix failure A, does it affect failure B?"
|
||||
- If NO → Independent
|
||||
- If YES → Related, investigate together
|
||||
|
||||
2. **Check:** "Do failures touch same code/files?"
|
||||
- If NO → Likely independent
|
||||
- If YES → Check if different functions/areas
|
||||
|
||||
3. **Verify:** "Do failures share error patterns?"
|
||||
- If NO → Independent
|
||||
- If YES → Might be same root cause
|
||||
|
||||
**Example independence check:**
|
||||
```
|
||||
Failure 1: Authentication tests failing (auth.test.ts)
|
||||
Failure 2: Database query tests failing (db.test.ts)
|
||||
Failure 3: API endpoint tests failing (api.test.ts)
|
||||
|
||||
Check: Does fixing auth affect db queries? NO
|
||||
Check: Does fixing db affect API? YES - API uses db
|
||||
|
||||
Result: 2 independent domains:
|
||||
Domain 1: Authentication (auth.test.ts)
|
||||
Domain 2: Database + API (db.test.ts + api.test.ts together)
|
||||
```
|
||||
|
||||
**Group failures by what's broken:**
|
||||
- File A tests: Tool approval flow
|
||||
- File B tests: Batch completion behavior
|
||||
- File C tests: Abort functionality
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Create Focused Agent Tasks
|
||||
|
||||
Each agent prompt must have:
|
||||
|
||||
1. **Specific scope:** One test file or subsystem
|
||||
2. **Clear goal:** Make these tests pass
|
||||
3. **Constraints:** Don't change other code
|
||||
4. **Expected output:** Summary of what you found and fixed
|
||||
|
||||
**Good agent prompt example:**
|
||||
```markdown
|
||||
Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
|
||||
|
||||
1. "should abort tool with partial output capture" - expects 'interrupted at' in message
|
||||
2. "should handle mixed completed and aborted tools" - fast tool aborted instead of completed
|
||||
3. "should properly track pendingToolCount" - expects 3 results but gets 0
|
||||
|
||||
These are timing/race condition issues. Your task:
|
||||
|
||||
1. Read the test file and understand what each test verifies
|
||||
2. Identify root cause - timing issues or actual bugs?
|
||||
3. Fix by:
|
||||
- Replacing arbitrary timeouts with event-based waiting
|
||||
- Fixing bugs in abort implementation if found
|
||||
- Adjusting test expectations if testing changed behavior
|
||||
|
||||
Never just increase timeouts - find the real issue.
|
||||
|
||||
Return: Summary of what you found and what you fixed.
|
||||
```
|
||||
|
||||
**What makes this good:**
|
||||
- Specific test failures listed
|
||||
- Context provided (timing/race conditions)
|
||||
- Clear methodology (read, identify, fix)
|
||||
- Constraints (don't just increase timeouts)
|
||||
- Output format (summary)
|
||||
|
||||
**Common mistakes:**
|
||||
|
||||
❌ **Too broad:** "Fix all the tests" - agent gets lost
|
||||
✅ **Specific:** "Fix agent-tool-abort.test.ts" - focused scope
|
||||
|
||||
❌ **No context:** "Fix the race condition" - agent doesn't know where
|
||||
✅ **Context:** Paste the error messages and test names
|
||||
|
||||
❌ **No constraints:** Agent might refactor everything
|
||||
✅ **Constraints:** "Do NOT change production code" or "Fix tests only"
|
||||
|
||||
❌ **Vague output:** "Fix it" - you don't know what changed
|
||||
✅ **Specific:** "Return summary of root cause and changes"
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Dispatch All Agents in Parallel
|
||||
|
||||
**CRITICAL:** You must dispatch all agents in a SINGLE message with multiple Task() calls.
|
||||
|
||||
```typescript
|
||||
// ✅ CORRECT - Single message with multiple parallel tasks
|
||||
Task("Fix agent-tool-abort.test.ts failures", prompt1)
|
||||
Task("Fix batch-completion-behavior.test.ts failures", prompt2)
|
||||
Task("Fix tool-approval-race-conditions.test.ts failures", prompt3)
|
||||
// All three run concurrently
|
||||
|
||||
// ❌ WRONG - Sequential messages
|
||||
Task("Fix agent-tool-abort.test.ts failures", prompt1)
|
||||
// Wait for response
|
||||
Task("Fix batch-completion-behavior.test.ts failures", prompt2)
|
||||
// This is sequential, not parallel!
|
||||
```
|
||||
|
||||
**After dispatch:**
|
||||
- Mark "Dispatch agents in parallel" as completed in TodoWrite
|
||||
- Mark "Monitor agent progress" as in_progress
|
||||
- Wait for all agents to complete before integration
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Monitor Progress
|
||||
|
||||
As agents work:
|
||||
- Note which agents have completed
|
||||
- Note which are still running
|
||||
- Don't start integration until ALL agents done
|
||||
|
||||
**If an agent gets stuck (>5 minutes):**
|
||||
|
||||
1. Check AgentOutput to see what it's doing
|
||||
2. If stuck on wrong path: Cancel and retry with clearer prompt
|
||||
3. If needs context from other domain: Wait for other agent, then restart with context
|
||||
4. If hit real blocker: Investigate blocker yourself, then retry
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Review Results and Check Conflicts
|
||||
|
||||
**When all agents return:**
|
||||
|
||||
1. **Read each summary carefully**
|
||||
- What was the root cause?
|
||||
- What did the agent change?
|
||||
- Were there any uncertainties?
|
||||
|
||||
2. **Check for conflicts**
|
||||
- Did multiple agents edit same files?
|
||||
- Did agents make contradictory assumptions?
|
||||
- Are there integration points between domains?
|
||||
|
||||
3. **Integration strategy:**
|
||||
- If no conflicts: Apply all changes
|
||||
- If conflicts: Resolve manually before applying
|
||||
- If assumptions conflict: Verify with user
|
||||
|
||||
4. **Document what happened**
|
||||
- Which agents fixed what
|
||||
- Any conflicts found
|
||||
- Integration decisions made
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Verify Integration
|
||||
|
||||
**Run full test suite:**
|
||||
- Not just the fixed tests
|
||||
- Verify no regressions in other areas
|
||||
- Use hyperpowers:verification-before-completion skill
|
||||
|
||||
**Before completing:**
|
||||
```bash
|
||||
# Run all tests
|
||||
npm test # or cargo test, pytest, etc.
|
||||
|
||||
# Verify output
|
||||
# If all pass → Mark "Verify integration" complete
|
||||
# If failures → Identify which agent's change caused regression
|
||||
```
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer dispatches agents sequentially instead of in parallel</scenario>
|
||||
|
||||
<code>
|
||||
# Developer sees 3 independent failures
|
||||
# Creates 3 agent prompts
|
||||
|
||||
# Dispatches first agent
|
||||
Task("Fix agent-tool-abort.test.ts failures", prompt1)
|
||||
# Waits for response from agent 1
|
||||
|
||||
# Then dispatches second agent
|
||||
Task("Fix batch-completion-behavior.test.ts failures", prompt2)
|
||||
# Waits for response from agent 2
|
||||
|
||||
# Then dispatches third agent
|
||||
Task("Fix tool-approval-race-conditions.test.ts failures", prompt3)
|
||||
|
||||
# Total time: Sum of all three agents (sequential)
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Agents run sequentially, not in parallel
|
||||
- No time savings from parallelization
|
||||
- Each agent waits for previous to complete
|
||||
- Defeats entire purpose of parallel dispatch
|
||||
- Same result as sequential investigation
|
||||
- Wasted overhead of creating separate agents
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Dispatch all agents in SINGLE message:**
|
||||
|
||||
```typescript
|
||||
// Single message with multiple Task() calls
|
||||
Task("Fix agent-tool-abort.test.ts failures", `
|
||||
Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
|
||||
[prompt 1 content]
|
||||
`)
|
||||
|
||||
Task("Fix batch-completion-behavior.test.ts failures", `
|
||||
Fix the 2 failing tests in src/agents/batch-completion-behavior.test.ts:
|
||||
[prompt 2 content]
|
||||
`)
|
||||
|
||||
Task("Fix tool-approval-race-conditions.test.ts failures", `
|
||||
Fix the 1 failing test in src/agents/tool-approval-race-conditions.test.ts:
|
||||
[prompt 3 content]
|
||||
`)
|
||||
|
||||
// All three run concurrently - THIS IS THE KEY
|
||||
```
|
||||
|
||||
**What happens:**
|
||||
- All three agents start simultaneously
|
||||
- Each investigates independently
|
||||
- All complete in parallel
|
||||
- Total time: Max(agent1, agent2, agent3) instead of Sum
|
||||
|
||||
**What you gain:**
|
||||
- True parallelization - 3 problems solved concurrently
|
||||
- Time saved: 3 investigations in time of 1
|
||||
- Each agent focused on narrow scope
|
||||
- No waiting for sequential completion
|
||||
- Proper use of parallel dispatch pattern
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer assumes failures are independent without verification</scenario>
|
||||
|
||||
<code>
|
||||
# Developer sees 3 test failures:
|
||||
# - API endpoint tests failing
|
||||
# - Database query tests failing
|
||||
# - Cache invalidation tests failing
|
||||
|
||||
# Thinks: "Different subsystems, must be independent"
|
||||
|
||||
# Dispatches 3 agents immediately without checking independence
|
||||
|
||||
# Agent 1 finds: API failing because database schema changed
|
||||
# Agent 2 finds: Database queries need migration
|
||||
# Agent 3 finds: Cache keys based on old schema
|
||||
|
||||
# All three failures caused by same root cause: schema change
|
||||
# Agents make conflicting fixes based on different assumptions
|
||||
# Integration fails because fixes contradict each other
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Skipped independence verification (Step 1)
|
||||
- Assumed independence based on surface appearance
|
||||
- All failures actually shared root cause (schema change)
|
||||
- Agents worked in isolation without seeing connection
|
||||
- Each agent made different assumptions about correct schema
|
||||
- Conflicting fixes can't be integrated
|
||||
- Wasted time on parallel work that should have been unified
|
||||
- Have to throw away agent work and start over
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Run independence check FIRST:**
|
||||
|
||||
```
|
||||
Check: Does fixing API affect database queries?
|
||||
- API uses database
|
||||
- If database schema changes, API breaks
|
||||
- YES - these are related
|
||||
|
||||
Check: Does fixing database affect cache?
|
||||
- Cache stores database results
|
||||
- If database schema changes, cache keys break
|
||||
- YES - these are related
|
||||
|
||||
Check: Do failures share error patterns?
|
||||
- All mention "column not found: user_email"
|
||||
- All started after schema migration
|
||||
- YES - shared root cause
|
||||
|
||||
Result: NOT INDEPENDENT
|
||||
These are one problem (schema change) manifesting in 3 places
|
||||
```
|
||||
|
||||
**Correct approach:**
|
||||
|
||||
```
|
||||
Single agent investigates: "Schema migration broke 3 subsystems"
|
||||
|
||||
Agent prompt:
|
||||
"We have 3 test failures all related to schema change:
|
||||
1. API endpoints: column not found
|
||||
2. Database queries: column not found
|
||||
3. Cache invalidation: old keys
|
||||
|
||||
Investigate the schema migration that caused this.
|
||||
Fix by updating all 3 subsystems consistently.
|
||||
Return: What changed in schema, how you fixed each subsystem."
|
||||
|
||||
# One agent sees full picture
|
||||
# Makes consistent fix across all 3 areas
|
||||
# No conflicts, proper integration
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Caught shared root cause before wasting time
|
||||
- One agent sees full context
|
||||
- Consistent fix across all affected areas
|
||||
- No conflicting assumptions
|
||||
- No integration conflicts
|
||||
- Faster than 3 agents working at cross-purposes
|
||||
- Proper problem diagnosis before parallel dispatch
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer integrates agent results without checking conflicts</scenario>
|
||||
|
||||
<code>
|
||||
# 3 agents complete successfully
|
||||
# Developer quickly reads summaries:
|
||||
|
||||
Agent 1: "Fixed timeout issue by increasing wait time to 5000ms"
|
||||
Agent 2: "Fixed race condition by adding mutex lock"
|
||||
Agent 3: "Fixed timing issue by reducing wait time to 1000ms"
|
||||
|
||||
# Developer thinks: "All agents succeeded, ship it"
|
||||
|
||||
# Applies all changes without checking conflicts
|
||||
|
||||
# Result:
|
||||
# - Agent 1 and Agent 3 edited same file
|
||||
# - Agent 1 increased timeout, Agent 3 decreased it
|
||||
# - Final code has inconsistent timeouts
|
||||
# - Agent 2's mutex interacts badly with Agent 3's reduced timeout
|
||||
# - Tests still fail after integration
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Skipped conflict checking (Step 5)
|
||||
- Didn't carefully read what each agent changed
|
||||
- Agents made contradictory decisions
|
||||
- Agent 1 and Agent 3 had different assumptions about timing
|
||||
- Agent 2's locking interacts with timing changes
|
||||
- Blindly applying all fixes creates inconsistent state
|
||||
- Tests fail after "successful" integration
|
||||
- Have to manually untangle conflicting changes
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Review results carefully before integration:**
|
||||
|
||||
```markdown
|
||||
## Agent Summaries Review
|
||||
|
||||
Agent 1: Fixed timeout issue by increasing wait time to 5000ms
|
||||
- File: src/agents/tool-executor.ts
|
||||
- Change: DEFAULT_TIMEOUT = 5000
|
||||
|
||||
Agent 2: Fixed race condition by adding mutex lock
|
||||
- File: src/agents/tool-executor.ts
|
||||
- Change: Added mutex around tool execution
|
||||
|
||||
Agent 3: Fixed timing issue by reducing wait time to 1000ms
|
||||
- File: src/agents/tool-executor.ts
|
||||
- Change: DEFAULT_TIMEOUT = 1000
|
||||
|
||||
## Conflict Analysis
|
||||
|
||||
**CONFLICT DETECTED:**
|
||||
- Agents 1 and 3 edited same file (tool-executor.ts)
|
||||
- Agents 1 and 3 changed same constant (DEFAULT_TIMEOUT)
|
||||
- Agent 1: increase to 5000ms
|
||||
- Agent 3: decrease to 1000ms
|
||||
- Contradictory assumptions about correct timing
|
||||
|
||||
**Why conflict occurred:**
|
||||
- Domains weren't actually independent (same timeout constant)
|
||||
- Both agents tested locally, didn't see interaction
|
||||
- Different problem spaces led to different timing needs
|
||||
|
||||
## Resolution
|
||||
|
||||
**Option 1:** Different timeouts for different operations
|
||||
```typescript
|
||||
const TOOL_EXECUTION_TIMEOUT = 5000 // Agent 1's need
|
||||
const TOOL_APPROVAL_TIMEOUT = 1000 // Agent 3's need
|
||||
```
|
||||
|
||||
**Option 2:** Investigate why timing varies
|
||||
- Maybe Agent 1's tests are actually slow (fix slowness)
|
||||
- Maybe Agent 3's tests are correct (use 1000ms everywhere)
|
||||
|
||||
**Choose Option 2 after investigation:**
|
||||
- Agent 1's tests were slow due to unrelated issue
|
||||
- Fix the slowness, use 1000ms timeout everywhere
|
||||
- Agent 2's mutex is compatible with 1000ms
|
||||
|
||||
**Integration steps:**
|
||||
1. Apply Agent 2's mutex (no conflict)
|
||||
2. Apply Agent 3's 1000ms timeout
|
||||
3. Fix Agent 1's slow tests (root cause)
|
||||
4. Don't apply Agent 1's timeout increase (symptom fix)
|
||||
```
|
||||
|
||||
**Run full test suite:**
|
||||
```bash
|
||||
npm test
|
||||
# All tests pass ✅
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Caught contradiction before breaking integration
|
||||
- Understood why agents made different decisions
|
||||
- Resolved conflict thoughtfully, not arbitrarily
|
||||
- Fixed root cause (slow tests) not symptom (long timeout)
|
||||
- Verified integration works correctly
|
||||
- Avoided shipping inconsistent code
|
||||
- Professional conflict resolution process
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<failure_modes>
|
||||
## Agent Gets Stuck
|
||||
|
||||
**Symptoms:** No progress after 5+ minutes
|
||||
|
||||
**Causes:**
|
||||
- Prompt too vague, agent exploring aimlessly
|
||||
- Domain not actually independent, needs context from other agents
|
||||
- Agent hit a blocker (missing file, unclear error)
|
||||
|
||||
**Recovery:**
|
||||
1. Use AgentOutput tool to check what it's doing
|
||||
2. If stuck on wrong path: Cancel and retry with clearer prompt
|
||||
3. If needs context from other domain: Wait for other agent, then restart with context
|
||||
4. If hit real blocker: Investigate blocker yourself, then retry
|
||||
|
||||
---
|
||||
|
||||
## Agents Return Conflicting Fixes
|
||||
|
||||
**Symptoms:** Agents edited same code differently, or made contradictory assumptions
|
||||
|
||||
**Causes:**
|
||||
- Domains weren't actually independent
|
||||
- Shared code between domains
|
||||
- Agents made different assumptions about correct behavior
|
||||
|
||||
**Recovery:**
|
||||
1. Don't apply either fix automatically
|
||||
2. Read both fixes carefully
|
||||
3. Identify the conflict point
|
||||
4. Resolve manually based on which assumption is correct
|
||||
5. Consider if domains should be merged
|
||||
|
||||
---
|
||||
|
||||
## Integration Breaks Other Tests
|
||||
|
||||
**Symptoms:** Fixed tests pass, but other tests now fail
|
||||
|
||||
**Causes:**
|
||||
- Agent changed shared code
|
||||
- Agent's fix was too broad
|
||||
- Agent misunderstood requirements
|
||||
|
||||
**Recovery:**
|
||||
1. Identify which agent's change caused the regression
|
||||
2. Read the agent's summary - did they mention this change?
|
||||
3. Evaluate if change is correct but tests need updating
|
||||
4. Or if change broke something, need to refine the fix
|
||||
5. Use hyperpowers:verification-before-completion skill for final check
|
||||
|
||||
---
|
||||
|
||||
## False Independence
|
||||
|
||||
**Symptoms:** Fixing one domain revealed it affected another
|
||||
|
||||
**Recovery:**
|
||||
1. Merge the domains
|
||||
2. Have one agent investigate both together
|
||||
3. Learn: Better independence test needed upfront
|
||||
</failure_modes>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Verify independence first** → Test with questions before dispatching
|
||||
2. **3+ domains required** → 2 failures: overhead exceeds benefit, do sequentially
|
||||
3. **Single message dispatch** → All agents in one message with multiple Task() calls
|
||||
4. **Wait for ALL agents** → Don't integrate until all complete
|
||||
5. **Check conflicts manually** → Read summaries, verify no contradictions
|
||||
6. **Verify integration** → Run full suite yourself, don't trust agents
|
||||
7. **TodoWrite tracking** → Track agent progress explicitly
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Follow the process.**
|
||||
|
||||
- "Just 2 failures, can still parallelize" (Overhead exceeds benefit, do sequentially)
|
||||
- "Probably independent, will dispatch and see" (Verify independence FIRST)
|
||||
- "Can dispatch sequentially to save syntax" (WRONG - must dispatch in single message)
|
||||
- "Agent failed, but others succeeded - ship it" (All agents must succeed or re-investigate)
|
||||
- "Conflicts are minor, can ignore" (Resolve all conflicts explicitly)
|
||||
- "Don't need TodoWrite for just tracking agents" (Use TodoWrite, track properly)
|
||||
- "Can skip verification, agents ran tests" (Agents can make mistakes, YOU verify)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before completing parallel agent work:
|
||||
|
||||
- [ ] Verified independence with 3 questions (fix A affects B? same code? same error pattern?)
|
||||
- [ ] 3+ independent domains identified (not 2 or fewer)
|
||||
- [ ] Created focused agent prompts (scope, goal, constraints, output)
|
||||
- [ ] Dispatched all agents in single message (multiple Task() calls)
|
||||
- [ ] Waited for ALL agents to complete (didn't integrate early)
|
||||
- [ ] Read all agent summaries carefully
|
||||
- [ ] Checked for conflicts (same files, contradictory assumptions)
|
||||
- [ ] Resolved any conflicts manually before integration
|
||||
- [ ] Ran full test suite (not just fixed tests)
|
||||
- [ ] Used verification-before-completion skill
|
||||
- [ ] Documented which agents fixed what
|
||||
|
||||
**Can't check all boxes?** Return to the process and complete missing steps.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill covers:** Parallel investigation of independent failures
|
||||
|
||||
**Related skills:**
|
||||
- hyperpowers:debugging-with-tools (how to investigate individual failures)
|
||||
- hyperpowers:fixing-bugs (complete bug workflow)
|
||||
- hyperpowers:verification-before-completion (verify integration)
|
||||
- hyperpowers:test-runner (run tests without context pollution)
|
||||
|
||||
**This skill uses:**
|
||||
- Task tool (dispatch parallel agents)
|
||||
- AgentOutput tool (monitor stuck agents)
|
||||
- TodoWrite (track agent progress)
|
||||
|
||||
**Workflow integration:**
|
||||
```
|
||||
Multiple independent failures
|
||||
↓
|
||||
Verify independence (Step 1)
|
||||
↓
|
||||
Create agent tasks (Step 2)
|
||||
↓
|
||||
Dispatch in parallel (Step 3)
|
||||
↓
|
||||
Monitor progress (Step 4)
|
||||
↓
|
||||
Review + check conflicts (Step 5)
|
||||
↓
|
||||
Verify integration (Step 6)
|
||||
↓
|
||||
hyperpowers:verification-before-completion
|
||||
```
|
||||
|
||||
**Real example from session (2025-10-03):**
|
||||
- 6 failures across 3 files
|
||||
- 3 agents dispatched in parallel
|
||||
- All investigations completed concurrently
|
||||
- All fixes integrated successfully
|
||||
- Zero conflicts between agent changes
|
||||
- Time saved: 3 problems solved in parallel vs sequentially
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Key principles:**
|
||||
- Parallelization only wins with 3+ independent problems
|
||||
- Independence verification prevents wasted parallel work
|
||||
- Single message dispatch is critical for true parallelism
|
||||
- Conflict checking prevents integration disasters
|
||||
- Full verification catches agent mistakes
|
||||
|
||||
**When stuck:**
|
||||
- Agent not making progress → Check AgentOutput, retry with clearer prompt
|
||||
- Conflicts after dispatch → Domains weren't independent, merge and retry
|
||||
- Integration fails tests → Identify which agent caused regression
|
||||
- Unclear if independent → Test with 3 questions (affects? same code? same error?)
|
||||
</resources>
|
||||
571
skills/executing-plans/SKILL.md
Normal file
571
skills/executing-plans/SKILL.md
Normal file
@@ -0,0 +1,571 @@
|
||||
---
|
||||
name: executing-plans
|
||||
description: Use to execute bd tasks iteratively - executes one task, reviews learnings, creates/refines next task, then STOPS for user review before continuing
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Execute bd tasks one at a time with mandatory checkpoints: Load epic → Execute task → Review learnings → Create next task → Run SRE refinement → STOP. User clears context, reviews implementation, then runs command again to continue. Epic requirements are immutable, tasks adapt to reality.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - Follow exact process: load epic, execute ONE task, review, create next task with SRE refinement, STOP.
|
||||
|
||||
Epic requirements are immutable. Tasks adapt to discoveries. Do not skip checkpoints, SRE refinement, or verification. STOP after each task for user review.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
|
||||
| Step | Command | Purpose |
|
||||
|------|---------|---------|
|
||||
| **Load Epic** | `bd show bd-1` | Read immutable requirements once at start |
|
||||
| **Find Task** | `bd ready` | Get next ready task to execute |
|
||||
| **Start Task** | `bd update bd-2 --status in_progress` | Mark task active |
|
||||
| **Track Substeps** | TodoWrite for each implementation step | Prevent incomplete execution |
|
||||
| **Close Task** | `bd close bd-2` | Mark task complete after verification |
|
||||
| **Review** | Re-read epic, check learnings | Adapt next task to reality |
|
||||
| **Create Next** | `bd create "Task N"` | Based on learnings, not assumptions |
|
||||
| **Refine** | Use `sre-task-refinement` skill | Corner-case analysis with Opus 4.1 |
|
||||
| **STOP** | Present summary to user | User reviews, clears context, runs command again |
|
||||
| **Final Check** | Use `review-implementation` skill | Verify all success criteria before closing epic |
|
||||
|
||||
**Critical:** Epic = contract (immutable). Tasks = discovery (adapt to reality). STOP after each task for user review.
|
||||
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
**Use after hyperpowers:writing-plans creates epic and first task.**
|
||||
|
||||
Symptoms you need this:
|
||||
- bd epic exists with tasks ready to execute
|
||||
- Need to implement features iteratively
|
||||
- Requirements clear, but implementation path will adapt
|
||||
- Want continuous learning between tasks
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
|
||||
## 0. Resumption Check (Every Invocation)
|
||||
|
||||
This skill supports explicit resumption. When invoked:
|
||||
|
||||
```bash
|
||||
bd list --type epic --status open # Find active epic
|
||||
bd ready # Check for ready tasks
|
||||
bd list --status in_progress # Check for in-progress tasks
|
||||
```
|
||||
|
||||
**Fresh start:** No in-progress tasks, proceed to Step 1.
|
||||
|
||||
**Resuming:** Found ready or in-progress tasks:
|
||||
- In-progress task exists → Resume at Step 2 (continue executing)
|
||||
- Ready task exists → Resume at Step 2 (start executing)
|
||||
- All tasks closed but epic open → Resume at Step 4 (check criteria)
|
||||
|
||||
**Why resumption matters:**
|
||||
- User cleared context between tasks (intended workflow)
|
||||
- Context limit reached mid-task
|
||||
- Previous session ended unexpectedly
|
||||
|
||||
**Do not ask "where did we leave off?"** - bd state tells you exactly where to resume.
|
||||
|
||||
## 1. Load Epic Context (Once at Start)
|
||||
|
||||
Before executing ANY task, load the epic into context:
|
||||
|
||||
```bash
|
||||
bd list --type epic --status open # Find epic
|
||||
bd show bd-1 # Load epic details
|
||||
```
|
||||
|
||||
**Extract and keep in mind:**
|
||||
- Requirements (IMMUTABLE)
|
||||
- Success criteria (validation checklist)
|
||||
- Anti-patterns (FORBIDDEN shortcuts)
|
||||
- Approach (high-level strategy)
|
||||
|
||||
**Why:** Requirements prevent watering down when blocked.
|
||||
|
||||
## 2. Execute Current Ready Task
|
||||
|
||||
```bash
|
||||
bd ready # Find next task
|
||||
bd update bd-2 --status in_progress # Start it
|
||||
bd show bd-2 # Read details
|
||||
```
|
||||
|
||||
**CRITICAL - Create TodoWrite for ALL substeps:**
|
||||
|
||||
Tasks contain 4-8 implementation steps. Create TodoWrite todos for each to prevent incomplete execution:
|
||||
|
||||
```
|
||||
- bd-2 Step 1: Write test (pending)
|
||||
- bd-2 Step 2: Run test RED (pending)
|
||||
- bd-2 Step 3: Implement function (pending)
|
||||
- bd-2 Step 4: Run test GREEN (pending)
|
||||
- bd-2 Step 5: Refactor (pending)
|
||||
- bd-2 Step 6: Commit (pending)
|
||||
```
|
||||
|
||||
**Execute steps:**
|
||||
- Use `test-driven-development` when implementing features
|
||||
- Mark each substep completed immediately after finishing
|
||||
- Use `test-runner` agent for verifications
|
||||
|
||||
**Pre-close verification:**
|
||||
- Check TodoWrite: All substeps completed?
|
||||
- If incomplete: Continue with remaining substeps
|
||||
- If complete: Close task and commit
|
||||
|
||||
```bash
|
||||
bd close bd-2 # After ALL substeps done
|
||||
```
|
||||
|
||||
## 3. Review Against Epic and Create Next Task
|
||||
|
||||
**CRITICAL:** After each task, adapt plan based on reality.
|
||||
|
||||
**Review questions:**
|
||||
1. What did we learn?
|
||||
2. Discovered any blockers, existing functionality, limitations?
|
||||
3. Does this move us toward epic success criteria?
|
||||
4. What's next logical step?
|
||||
5. Any epic anti-patterns to avoid?
|
||||
|
||||
**Re-read epic:**
|
||||
```bash
|
||||
bd show bd-1 # Keep requirements fresh
|
||||
```
|
||||
|
||||
**Three cases:**
|
||||
|
||||
**A) Next task still valid** → Proceed to Step 2
|
||||
|
||||
**B) Next task now redundant** (plan invalidation allowed):
|
||||
```bash
|
||||
bd delete bd-4 # Remove wasteful task
|
||||
# Or update: bd update bd-4 --title "New work" --design "..."
|
||||
```
|
||||
|
||||
**C) Need new task** based on learnings:
|
||||
```bash
|
||||
bd create "Task N: [Next Step Based on Reality]" \
|
||||
--type feature \
|
||||
--design "## Goal
|
||||
[Deliverable based on what we learned]
|
||||
|
||||
## Context
|
||||
Completed bd-2: [discoveries]
|
||||
|
||||
## Implementation
|
||||
[Steps reflecting current state, not assumptions]
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Specific outcomes
|
||||
- [ ] Tests passing"
|
||||
|
||||
bd dep add bd-N bd-1 --type parent-child
|
||||
bd dep add bd-N bd-2 --type blocks
|
||||
```
|
||||
|
||||
**REQUIRED - Run SRE refinement on new task:**
|
||||
```
|
||||
Use Skill tool: hyperpowers:sre-task-refinement
|
||||
```
|
||||
|
||||
SRE refinement will:
|
||||
- Apply 7-category corner-case analysis (Opus 4.1)
|
||||
- Identify edge cases and failure modes
|
||||
- Strengthen success criteria
|
||||
- Ensure task is ready for implementation
|
||||
|
||||
**Do NOT skip SRE refinement.** New tasks need the same rigor as initial planning.
|
||||
|
||||
## 4. Check Epic Success Criteria and STOP
|
||||
|
||||
```bash
|
||||
bd show bd-1 # Check success criteria
|
||||
```
|
||||
|
||||
- ALL criteria met? → Step 5 (final validation)
|
||||
- Some missing? → **STOP for user review**
|
||||
|
||||
## 4a. STOP Checkpoint (Mandatory)
|
||||
|
||||
**Present summary to user:**
|
||||
|
||||
```markdown
|
||||
## Task bd-N Complete - Checkpoint
|
||||
|
||||
### What Was Done
|
||||
- [Summary of implementation]
|
||||
- [Key learnings/discoveries]
|
||||
|
||||
### Next Task Ready
|
||||
- bd-M: [Title]
|
||||
- [Brief description of what's next]
|
||||
|
||||
### Epic Progress
|
||||
- [X/Y success criteria met]
|
||||
- [Remaining criteria]
|
||||
|
||||
### To Continue
|
||||
Run `/hyperpowers:execute-plan` to execute the next task.
|
||||
```
|
||||
|
||||
**Why STOP is mandatory:**
|
||||
- User can clear context (prevents context exhaustion)
|
||||
- User can review implementation before next task
|
||||
- User can adjust next task if needed
|
||||
- Prevents runaway execution without oversight
|
||||
|
||||
**Do NOT rationalize skipping the stop:**
|
||||
- "Good context loaded" → Context reloads are cheap, wrong decisions aren't
|
||||
- "Momentum" → Checkpoints ensure quality over speed
|
||||
- "User didn't ask to stop" → Stopping is the default, continuing requires explicit command
|
||||
|
||||
## 5. Final Validation and Closure
|
||||
|
||||
When all success criteria appear met:
|
||||
|
||||
1. **Run full verification** (tests, hooks, manual checks)
|
||||
|
||||
2. **REQUIRED - Use review-implementation skill:**
|
||||
```
|
||||
Use Skill tool: hyperpowers:review-implementation
|
||||
```
|
||||
|
||||
Review-implementation will:
|
||||
- Check each requirement met
|
||||
- Verify each success criterion satisfied
|
||||
- Confirm no anti-patterns used
|
||||
- If approved: Calls `finishing-a-development-branch`
|
||||
- If gaps: Create tasks, return to Step 2
|
||||
|
||||
3. **Only close epic after review approves**
|
||||
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
|
||||
<example>
|
||||
<scenario>Developer closes task without completing all substeps, claims "mostly done"</scenario>
|
||||
|
||||
<code>
|
||||
bd-2 has 6 implementation steps.
|
||||
|
||||
TodoWrite shows:
|
||||
- ✅ bd-2 Step 1: Write test
|
||||
- ✅ bd-2 Step 2: Run test RED
|
||||
- ✅ bd-2 Step 3: Implement function
|
||||
- ⏸️ bd-2 Step 4: Run test GREEN (pending)
|
||||
- ⏸️ bd-2 Step 5: Refactor (pending)
|
||||
- ⏸️ bd-2 Step 6: Commit (pending)
|
||||
|
||||
Developer thinks: "Function works, I'll close bd-2 and move on"
|
||||
Runs: bd close bd-2
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
Steps 4-6 skipped:
|
||||
- Tests not verified GREEN (might have broken other tests)
|
||||
- Code not refactored (leaves technical debt)
|
||||
- Changes not committed (work could be lost)
|
||||
|
||||
"Mostly done" = incomplete task = will cause issues later.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Pre-close verification checkpoint:**
|
||||
|
||||
Before closing ANY task:
|
||||
1. Check TodoWrite: All substeps completed?
|
||||
2. If incomplete: Continue with remaining substeps
|
||||
3. Only when ALL ✅: bd close bd-2
|
||||
|
||||
**Result:** Task actually complete, tests passing, code committed.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer discovers planned task is redundant, executes it anyway "because it's in the plan"</scenario>
|
||||
|
||||
<code>
|
||||
bd-4 says: "Implement token refresh middleware"
|
||||
|
||||
While executing bd-2, developer discovers:
|
||||
- Token refresh middleware already exists in auth/middleware/refresh.ts
|
||||
- Works correctly, has tests
|
||||
- bd-4 would duplicate existing code
|
||||
|
||||
Developer thinks: "bd-4 is in the plan, I should do it anyway"
|
||||
Proceeds to implement duplicate middleware
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Wasteful execution:**
|
||||
- Duplicates existing functionality
|
||||
- Creates maintenance burden (two implementations to keep in sync)
|
||||
- Violates DRY principle
|
||||
- Wastes time on redundant work
|
||||
|
||||
**Why it happens:** Treating tasks as immutable instead of epic.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Plan invalidation is allowed:**
|
||||
|
||||
1. Verify the discovery:
|
||||
```bash
|
||||
# Check existing code
|
||||
cat auth/middleware/refresh.ts
|
||||
# Confirm it works
|
||||
npm test -- refresh.spec.ts
|
||||
```
|
||||
|
||||
2. Delete redundant task:
|
||||
```bash
|
||||
bd delete bd-4
|
||||
```
|
||||
|
||||
3. Document why:
|
||||
```
|
||||
bd update bd-2 --design "...
|
||||
|
||||
Discovery: Token refresh middleware already exists (auth/middleware/refresh.ts).
|
||||
Verified working with tests. bd-4 deleted as redundant."
|
||||
```
|
||||
|
||||
4. Create new task if needed (maybe "Integrate existing refresh middleware" instead)
|
||||
|
||||
**Result:** Plan adapts to reality. No wasted work.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer hits blocker, waters down epic requirement to "make it easier"</scenario>
|
||||
|
||||
<code>
|
||||
Epic bd-1 anti-patterns say:
|
||||
"FORBIDDEN: Using mocks for database integration tests. Must use real test database."
|
||||
|
||||
Developer encounters:
|
||||
- Real database setup is complex
|
||||
- Mocking would make tests pass quickly
|
||||
|
||||
Developer thinks: "This is too hard, I'll use mocks just for now and refactor later"
|
||||
|
||||
Adds TODO: // TODO: Replace mocks with real DB later
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Violates epic anti-pattern:**
|
||||
- Epic explicitly forbids mocks for integration tests
|
||||
- "Later" never happens (TODO remains forever)
|
||||
- Tests don't verify actual integration
|
||||
- Defeats purpose of integration testing
|
||||
|
||||
**Why it happens:** Rationalizing around blockers instead of solving them.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**When blocked, re-read epic:**
|
||||
|
||||
1. Re-read epic requirements and anti-patterns:
|
||||
```bash
|
||||
bd show bd-1
|
||||
```
|
||||
|
||||
2. Check if solution violates anti-pattern:
|
||||
- Using mocks? YES, explicitly forbidden
|
||||
|
||||
3. Don't rationalize. Instead:
|
||||
|
||||
**Option A - Research:**
|
||||
```bash
|
||||
bd create "Research: Real DB test setup for [project]" \
|
||||
--design "Find how this project sets up test databases.
|
||||
Check existing test files for patterns.
|
||||
Document setup process that meets anti-pattern requirements."
|
||||
```
|
||||
|
||||
**Option B - Ask user:**
|
||||
"Blocker: Test DB setup complex. Epic forbids mocks for integration.
|
||||
Is there existing test DB infrastructure I should use?"
|
||||
|
||||
**Result:** Epic requirements maintained. Blocker solved properly.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer skips STOP checkpoint to "maintain momentum"</scenario>
|
||||
|
||||
<code>
|
||||
Just completed bd-2 (authentication middleware).
|
||||
Created bd-3 (rate limiting endpoint).
|
||||
Ran SRE refinement on bd-3.
|
||||
|
||||
Developer thinks: "Good context loaded, I'll just do bd-3 quickly then stop.
|
||||
User approved the epic, they trust me to execute it.
|
||||
Stopping now is inefficient."
|
||||
|
||||
Continues directly to execute bd-3 without STOP checkpoint.
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Multiple failures:**
|
||||
- User can't review bd-2 implementation before bd-3 starts
|
||||
- User can't clear context (may hit context limit mid-task)
|
||||
- User can't adjust bd-3 based on bd-2 learnings
|
||||
- No checkpoint = no oversight
|
||||
|
||||
**The rationalization trap:**
|
||||
- "Good context" sounds efficient but prevents review
|
||||
- "User trust" misinterprets approval (one command ≠ blanket permission)
|
||||
- "Quick task" becomes long task when issues arise
|
||||
|
||||
**What actually happens:**
|
||||
- bd-3 hits unexpected issue
|
||||
- Context exhausted trying to debug
|
||||
- User returns to find 2 half-finished tasks instead of 1 complete task
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Follow the STOP checkpoint:**
|
||||
|
||||
1. After completing bd-2 and refining bd-3:
|
||||
```markdown
|
||||
## Task bd-2 Complete - Checkpoint
|
||||
|
||||
### What Was Done
|
||||
- Implemented JWT middleware with validation
|
||||
- Added token refresh handling
|
||||
|
||||
### Next Task Ready
|
||||
- bd-3: Implement rate limiting
|
||||
- Adds rate limiting to auth endpoints
|
||||
|
||||
### Epic Progress
|
||||
- 2/4 success criteria met
|
||||
- Remaining: password reset, rate limiting
|
||||
|
||||
### To Continue
|
||||
Run `/hyperpowers:execute-plan` to execute the next task.
|
||||
```
|
||||
|
||||
2. **STOP and wait for user**
|
||||
|
||||
**Result:** User can review, clear context, adjust next task. Each task completes with full oversight.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **STOP after each task** → Present summary, wait for user to run command again
|
||||
- User needs checkpoint to review implementation
|
||||
- User may need to clear context
|
||||
- Continuous execution = no oversight
|
||||
|
||||
2. **SRE refinement for new tasks** → Never skip corner-case analysis
|
||||
- New tasks created during execution need same rigor as initial planning
|
||||
- Use Opus 4.1 for thorough analysis
|
||||
- Tasks without refinement will miss edge cases
|
||||
|
||||
3. **Epic requirements are immutable** → Never water down when blocked
|
||||
- If blocked: Research solution or ask user
|
||||
- Never violate anti-patterns to "make it easier"
|
||||
|
||||
4. **All substeps must be completed** → Never close task with pending substeps
|
||||
- Check TodoWrite before closing
|
||||
- "Mostly done" = incomplete = will cause issues
|
||||
|
||||
5. **Plan invalidation is allowed** → Delete redundant tasks
|
||||
- If discovered existing functionality: Delete duplicate task
|
||||
- If discovered blocker: Update or delete invalid task
|
||||
- Document what you found and why
|
||||
|
||||
6. **Review before closing epic** → Use review-implementation skill
|
||||
- Tasks done ≠ success criteria met
|
||||
- All criteria must be verified before closing
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: Re-read epic, STOP as required, ask for help:
|
||||
- "Good context loaded, don't want to lose it" → STOP anyway, context reloads
|
||||
- "Just one more quick task" → STOP anyway, user needs checkpoint
|
||||
- "User didn't ask me to stop" → Stopping is default, continuing requires explicit command
|
||||
- "SRE refinement is overkill for this task" → Every task needs refinement, no exceptions
|
||||
- "This requirement is too hard" → Research or ask, don't water down
|
||||
- "I'll come back to this later" → Complete now or document why blocked
|
||||
- "Let me fake this to make tests pass" → Never, defeats purpose
|
||||
- "Existing task is wasteful, but it's planned" → Delete it, plan adapts to reality
|
||||
- "All tasks done, epic must be complete" → Verify with review-implementation
|
||||
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
|
||||
Before closing each task:
|
||||
- [ ] ALL TodoWrite substeps completed (no pending)
|
||||
- [ ] Tests passing (use test-runner agent)
|
||||
- [ ] Changes committed
|
||||
- [ ] Task actually done (not "mostly")
|
||||
|
||||
After closing each task:
|
||||
- [ ] Reviewed learnings against epic
|
||||
- [ ] Created/updated next task based on reality
|
||||
- [ ] Ran SRE refinement on any new tasks
|
||||
- [ ] Presented STOP checkpoint summary to user
|
||||
- [ ] STOPPED execution (do not continue to next task)
|
||||
|
||||
Before closing epic:
|
||||
- [ ] ALL success criteria met (check epic)
|
||||
- [ ] review-implementation skill used and approved
|
||||
- [ ] No anti-patterns violated
|
||||
- [ ] All tasks closed
|
||||
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
|
||||
**This skill calls:**
|
||||
- writing-plans (creates epic and first task before this runs)
|
||||
- sre-task-refinement (REQUIRED for new tasks created during execution)
|
||||
- test-driven-development (when implementing features)
|
||||
- test-runner (for running tests without output pollution)
|
||||
- review-implementation (final validation before closing epic)
|
||||
- finishing-a-development-branch (after review approves)
|
||||
|
||||
**This skill is called by:**
|
||||
- User (via /hyperpowers:execute-plan command)
|
||||
- After writing-plans creates epic
|
||||
- Explicitly to resume after checkpoint (user runs command again)
|
||||
|
||||
**Agents used:**
|
||||
- hyperpowers:test-runner (run tests, return summary only)
|
||||
|
||||
**Workflow pattern:**
|
||||
```
|
||||
/hyperpowers:execute-plan → Execute task → STOP
|
||||
[User clears context, reviews]
|
||||
/hyperpowers:execute-plan → Execute next task → STOP
|
||||
[Repeat until epic complete]
|
||||
```
|
||||
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
|
||||
**bd command reference:**
|
||||
- See [bd commands](../common-patterns/bd-commands.md) for complete command list
|
||||
|
||||
**When stuck:**
|
||||
- Hit blocker → Re-read epic, check anti-patterns, research or ask
|
||||
- Don't understand instruction → Stop and ask (never guess)
|
||||
- Verification fails repeatedly → Check epic anti-patterns, ask for help
|
||||
- Tempted to skip steps → Check TodoWrite, complete all substeps
|
||||
|
||||
</resources>
|
||||
482
skills/finishing-a-development-branch/SKILL.md
Normal file
482
skills/finishing-a-development-branch/SKILL.md
Normal file
@@ -0,0 +1,482 @@
|
||||
---
|
||||
name: finishing-a-development-branch
|
||||
description: Use when implementation complete and tests pass - closes bd epic, presents integration options (merge/PR/keep/discard), executes choice
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Close bd epic, verify tests pass, present 4 integration options, execute choice, cleanup worktree appropriately.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - Follow the 6-step process exactly. Present exactly 4 options. Never skip test verification. Must confirm before discarding.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Step | Action | If Blocked |
|
||||
|------|--------|------------|
|
||||
| 1 | Close bd epic | Tasks still open → STOP |
|
||||
| 2 | Verify tests pass (test-runner agent) | Tests fail → STOP |
|
||||
| 3 | Determine base branch | Ask if needed |
|
||||
| 4 | Present exactly 4 options | Wait for choice |
|
||||
| 5 | Execute choice | Follow option workflow |
|
||||
| 6 | Cleanup worktree (options 1,2,4 only) | Option 3 keeps worktree |
|
||||
|
||||
**Options:** 1=Merge locally, 2=PR, 3=Keep as-is, 4=Discard (confirm)
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
- Implementation complete and reviewed
|
||||
- All bd tasks for epic are done
|
||||
- Ready to integrate work back to main branch
|
||||
- Called by hyperpowers:review-implementation (final step)
|
||||
|
||||
**Don't use for:**
|
||||
- Work still in progress
|
||||
- Tests failing
|
||||
- Epic has open tasks
|
||||
- Mid-implementation (use hyperpowers:executing-plans)
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## Step 1: Close bd Epic
|
||||
|
||||
**Announce:** "I'm using hyperpowers:finishing-a-development-branch to complete this work."
|
||||
|
||||
**Verify all tasks closed:**
|
||||
|
||||
```bash
|
||||
bd dep tree bd-1 # Show task tree
|
||||
bd list --status open --parent bd-1 # Check for open tasks
|
||||
```
|
||||
|
||||
**If any tasks still open:**
|
||||
```
|
||||
Cannot close epic bd-1: N tasks still open:
|
||||
- bd-3: Task Name (status: in_progress)
|
||||
- bd-5: Task Name (status: open)
|
||||
|
||||
Complete all tasks before finishing.
|
||||
```
|
||||
|
||||
**STOP. Do not proceed.**
|
||||
|
||||
**If all tasks closed:**
|
||||
|
||||
```bash
|
||||
bd close bd-1
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Verify Tests
|
||||
|
||||
**IMPORTANT:** Use hyperpowers:test-runner agent to avoid context pollution.
|
||||
|
||||
Dispatch hyperpowers:test-runner agent:
|
||||
```
|
||||
Run: cargo test
|
||||
(or: npm test / pytest / go test ./...)
|
||||
```
|
||||
|
||||
Agent returns summary + failures only.
|
||||
|
||||
**If tests fail:**
|
||||
```
|
||||
Tests failing (N failures). Must fix before completing:
|
||||
|
||||
[Show failures]
|
||||
|
||||
Cannot proceed until tests pass.
|
||||
```
|
||||
|
||||
**STOP. Do not proceed.**
|
||||
|
||||
**If tests pass:** Continue to Step 3.
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Determine Base Branch
|
||||
|
||||
```bash
|
||||
git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null
|
||||
```
|
||||
|
||||
Or ask: "This branch split from main - is that correct?"
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Present Options
|
||||
|
||||
Present exactly these 4 options:
|
||||
|
||||
```
|
||||
Implementation complete. What would you like to do?
|
||||
|
||||
1. Merge back to <base-branch> locally
|
||||
2. Push and create a Pull Request
|
||||
3. Keep the branch as-is (I'll handle it later)
|
||||
4. Discard this work
|
||||
|
||||
Which option?
|
||||
```
|
||||
|
||||
**Don't add explanation.** Keep concise.
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Execute Choice
|
||||
|
||||
### Option 1: Merge Locally
|
||||
|
||||
```bash
|
||||
git checkout <base-branch>
|
||||
git pull
|
||||
git merge <feature-branch>
|
||||
|
||||
# Verify tests on merged result
|
||||
Dispatch hyperpowers:test-runner: "Run: <test command>"
|
||||
|
||||
# If tests pass
|
||||
git branch -d <feature-branch>
|
||||
```
|
||||
|
||||
Then: Step 6 (cleanup worktree)
|
||||
|
||||
---
|
||||
|
||||
### Option 2: Push and Create PR
|
||||
|
||||
**Get epic info:**
|
||||
|
||||
```bash
|
||||
bd show bd-1
|
||||
bd dep tree bd-1
|
||||
```
|
||||
|
||||
**Create PR:**
|
||||
|
||||
```bash
|
||||
git push -u origin <feature-branch>
|
||||
|
||||
gh pr create --title "feat: <epic-name>" --body "$(cat <<'EOF'
|
||||
## Epic
|
||||
|
||||
Closes bd-<N>: <Epic Title>
|
||||
|
||||
## Summary
|
||||
<2-3 bullets from epic implementation>
|
||||
|
||||
## Tasks Completed
|
||||
- bd-2: <Task Name>
|
||||
- bd-3: <Task Name>
|
||||
|
||||
## Test Plan
|
||||
- [ ] All tests passing
|
||||
- [ ] <verification steps from epic>
|
||||
EOF
|
||||
)"
|
||||
```
|
||||
|
||||
Then: Step 6 (cleanup worktree)
|
||||
|
||||
---
|
||||
|
||||
### Option 3: Keep As-Is
|
||||
|
||||
Report: "Keeping branch <name>. Worktree preserved at <path>."
|
||||
|
||||
**Don't cleanup worktree.**
|
||||
|
||||
---
|
||||
|
||||
### Option 4: Discard
|
||||
|
||||
**Confirm first:**
|
||||
|
||||
```
|
||||
This will permanently delete:
|
||||
- Branch <name>
|
||||
- All commits: <commit-list>
|
||||
- Worktree at <path>
|
||||
|
||||
Type 'discard' to confirm.
|
||||
```
|
||||
|
||||
Wait for exact "discard" confirmation.
|
||||
|
||||
**If confirmed:**
|
||||
|
||||
```bash
|
||||
git checkout <base-branch>
|
||||
git branch -D <feature-branch>
|
||||
```
|
||||
|
||||
Then: Step 6 (cleanup worktree)
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Cleanup Worktree
|
||||
|
||||
**For Options 1, 2, 4 only:**
|
||||
|
||||
```bash
|
||||
# Check if in worktree
|
||||
git worktree list | grep $(git branch --show-current)
|
||||
|
||||
# If yes
|
||||
git worktree remove <worktree-path>
|
||||
```
|
||||
|
||||
**For Option 3:** Keep worktree (don't cleanup).
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer skips test verification before presenting options</scenario>
|
||||
|
||||
<code>
|
||||
# Step 1: Epic closed ✓
|
||||
bd close bd-1
|
||||
|
||||
# Step 2: SKIPPED test verification
|
||||
# Jump directly to presenting options
|
||||
|
||||
"Implementation complete. What would you like to do?
|
||||
1. Merge back to main locally
|
||||
2. Push and create PR
|
||||
..."
|
||||
|
||||
User selects Option 1
|
||||
|
||||
git checkout main
|
||||
git merge feature-branch
|
||||
# Tests fail! Broken code now on main
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Skipped mandatory test verification
|
||||
- Merged broken code to main branch
|
||||
- Other developers pull broken main
|
||||
- CI/CD fails, blocks deployment
|
||||
- Must revert, fix, merge again (wasted time)
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Follow Step 2 strictly:**
|
||||
|
||||
```bash
|
||||
# After closing epic
|
||||
bd close bd-1 ✓
|
||||
|
||||
# MANDATORY: Verify tests BEFORE presenting options
|
||||
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
|
||||
|
||||
# Agent reports
|
||||
"Test suite passed (127 tests, 0 failures, 2.3s)"
|
||||
|
||||
# NOW present options
|
||||
"Implementation complete. What would you like to do?
|
||||
1. Merge back to main locally
|
||||
..."
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Confidence tests pass before integration
|
||||
- No broken code merged to main
|
||||
- CI/CD stays green
|
||||
- Other developers unblocked
|
||||
- Professional workflow
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer auto-cleans worktree for PR option</scenario>
|
||||
|
||||
<code>
|
||||
# User selects Option 2: Create PR
|
||||
git push -u origin feature-auth
|
||||
gh pr create --title "feat: Add OAuth" --body "..."
|
||||
|
||||
# Developer immediately cleans up worktree
|
||||
git worktree remove ../feature-auth-worktree
|
||||
|
||||
# PR gets feedback: "Please add rate limiting"
|
||||
# User: "Can you address the PR feedback?"
|
||||
# Worktree is gone! Have to recreate it
|
||||
git worktree add ../feature-auth-worktree feature-auth
|
||||
# Lost local state, uncommitted experiments, etc.
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Cleaned worktree when PR still active
|
||||
- User likely needs worktree for PR feedback
|
||||
- Have to recreate worktree for changes
|
||||
- Lost any local uncommitted work
|
||||
- Inefficient workflow
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Option 2 workflow (correct):**
|
||||
|
||||
```bash
|
||||
git push -u origin feature-auth
|
||||
gh pr create --title "feat: Add OAuth" --body "..."
|
||||
|
||||
# Report PR created
|
||||
"Pull request created: https://github.com/user/repo/pull/42
|
||||
|
||||
Keeping worktree at ../feature-auth-worktree for PR updates."
|
||||
|
||||
# NO worktree cleanup
|
||||
# User can address PR feedback in same worktree
|
||||
```
|
||||
|
||||
**Cleanup happens later when:**
|
||||
- PR is merged
|
||||
- User explicitly requests cleanup
|
||||
- User uses finishing-a-development-branch again after PR merges
|
||||
|
||||
**What you gain:**
|
||||
- Worktree available for PR feedback
|
||||
- No need to recreate worktree
|
||||
- Preserve local state and experiments
|
||||
- Efficient PR iteration workflow
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer discards work without confirmation</scenario>
|
||||
|
||||
<code>
|
||||
# User says: "Actually, discard this work"
|
||||
|
||||
# Developer immediately executes
|
||||
git checkout main
|
||||
git branch -D feature-experimental
|
||||
git worktree remove ../feature-experimental-worktree
|
||||
|
||||
# 50 commits deleted
|
||||
# User: "Wait, I meant discard the LAST commit, not the whole branch!"
|
||||
# Too late - work is gone
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- No confirmation before destructive action
|
||||
- Misunderstood user intent
|
||||
- Permanent data loss
|
||||
- No way to recover work
|
||||
- Catastrophic for user
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Option 4 workflow (correct):**
|
||||
|
||||
```
|
||||
User: "Discard this work"
|
||||
|
||||
"This will permanently delete:
|
||||
- Branch feature-experimental
|
||||
- All commits:
|
||||
* a1b2c3d Add OAuth integration
|
||||
* d4e5f6g Add rate limiting
|
||||
* g7h8i9j Update tests
|
||||
... (47 more commits)
|
||||
- Worktree at ../feature-experimental-worktree
|
||||
|
||||
Type 'discard' to confirm."
|
||||
|
||||
# WAIT for exact confirmation
|
||||
User types: "discard"
|
||||
|
||||
# NOW execute
|
||||
git checkout main
|
||||
git branch -D feature-experimental
|
||||
git worktree remove ../feature-experimental-worktree
|
||||
|
||||
"Branch feature-experimental deleted."
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- User sees exactly what will be deleted
|
||||
- Explicit confirmation required
|
||||
- Prevents accidental data loss
|
||||
- Time to reconsider or clarify
|
||||
- Safe destructive operations
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<option_matrix>
|
||||
| Option | Merge | Push | Keep Worktree | Cleanup Branch | Cleanup Worktree |
|
||||
|--------|-------|------|---------------|----------------|------------------|
|
||||
| 1. Merge locally | ✓ | - | - | ✓ | ✓ |
|
||||
| 2. Create PR | - | ✓ | ✓ | - | - |
|
||||
| 3. Keep as-is | - | - | ✓ | - | - |
|
||||
| 4. Discard | - | - | - | ✓ (force) | ✓ |
|
||||
</option_matrix>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Never skip test verification** → Tests must pass before presenting options
|
||||
2. **Present exactly 4 options** → No open-ended questions
|
||||
3. **Require confirmation for Option 4** → Type "discard" exactly
|
||||
4. **Keep worktree for Options 2 & 3** → PR and keep-as-is need worktree
|
||||
5. **Verify tests after merge (Option 1)** → Merged result might break
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Follow the process.**
|
||||
|
||||
- "Tests passed earlier, don't need to verify" (Might have changed, verify now)
|
||||
- "User knows what they want" (Present options, let them choose)
|
||||
- "Obvious they want to discard" (Require explicit confirmation)
|
||||
- "PR done, cleanup worktree" (PR likely needs updates, keep worktree)
|
||||
- "Too many options" (Exactly 4, no more, no less)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before completing:
|
||||
|
||||
- [ ] bd epic closed (all child tasks closed)
|
||||
- [ ] Tests verified passing (via test-runner agent)
|
||||
- [ ] Presented exactly 4 options (no open-ended questions)
|
||||
- [ ] Waited for user choice (didn't assume)
|
||||
- [ ] If Option 4: Got typed "discard" confirmation
|
||||
- [ ] Worktree cleaned for Options 1, 4 only (not 2, 3)
|
||||
- [ ] If Option 1: Verified tests on merged result
|
||||
|
||||
**Can't check all boxes?** Return to process and complete missing steps.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill is called by:**
|
||||
- hyperpowers:review-implementation (final step after approval)
|
||||
|
||||
**Call chain:**
|
||||
```
|
||||
hyperpowers:executing-plans → hyperpowers:review-implementation → hyperpowers:finishing-a-development-branch
|
||||
↓
|
||||
(if gaps found: STOP)
|
||||
```
|
||||
|
||||
**This skill calls:**
|
||||
- hyperpowers:test-runner agent (for test verification)
|
||||
- bd commands (epic management)
|
||||
- gh commands (PR creation)
|
||||
|
||||
**CRITICAL:** Never read `.beads/issues.jsonl` directly. Always use bd CLI commands.
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Git worktree management](resources/worktree-guide.md)
|
||||
- [PR description templates](resources/pr-templates.md)
|
||||
- [bd epic reference in PRs](resources/bd-pr-integration.md)
|
||||
|
||||
**When stuck:**
|
||||
- Tasks won't close → Check bd status, verify all child tasks done
|
||||
- Tests fail → Fix before presenting options (can't proceed)
|
||||
- User unsure → Explain options, but don't make choice for them
|
||||
- Worktree won't remove → Might have uncommitted changes, ask user
|
||||
</resources>
|
||||
528
skills/fixing-bugs/SKILL.md
Normal file
528
skills/fixing-bugs/SKILL.md
Normal file
@@ -0,0 +1,528 @@
|
||||
---
|
||||
name: fixing-bugs
|
||||
description: Use when encountering a bug - complete workflow from discovery through debugging, bd issue, test-driven fix, verification, and closure
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Bug fixing is a complete workflow: reproduce, track in bd, debug systematically, write test, fix, verify, close. Every bug gets a bd issue and regression test.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - Follow exact workflow: create bd issue → debug with tools → write failing test → fix → verify → close.
|
||||
|
||||
Never skip tracking or regression test. Use debugging-with-tools for investigation, test-driven-development for fix.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
|
||||
| Step | Action | Command/Skill |
|
||||
|------|--------|---------------|
|
||||
| **1. Track** | Create bd bug issue | `bd create "Bug: [description]" --type bug` |
|
||||
| **2. Debug** | Systematic investigation | Use `debugging-with-tools` skill |
|
||||
| **3. Test (RED)** | Write failing test reproducing bug | Use `test-driven-development` skill |
|
||||
| **4. Fix (GREEN)** | Implement fix | Minimal code to pass test |
|
||||
| **5. Verify** | Run full test suite | Use `verification-before-completion` skill |
|
||||
| **6. Close** | Update and close bd issue | `bd close bd-123` |
|
||||
|
||||
**FORBIDDEN:** Fix without bd issue, fix without regression test
|
||||
**REQUIRED:** Every bug gets tracked, tested, verified before closing
|
||||
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
**Use when you discover a bug:**
|
||||
- Test failure you need to fix
|
||||
- Bug reported by user
|
||||
- Unexpected behavior in development
|
||||
- Regression from recent change
|
||||
- Production issue (non-emergency)
|
||||
|
||||
**Production emergencies:** Abbreviated workflow OK (hotfix first), but still create bd issue and add regression tests afterward.
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
|
||||
## 1. Create bd Bug Issue
|
||||
|
||||
**Track from the start:**
|
||||
|
||||
```bash
|
||||
bd create "Bug: [Clear description]" --type bug --priority P1
|
||||
# Returns: bd-123
|
||||
```
|
||||
|
||||
**Document:**
|
||||
```bash
|
||||
bd edit bd-123 --design "
|
||||
## Bug Description
|
||||
[What's wrong]
|
||||
|
||||
## Reproduction Steps
|
||||
1. Step one
|
||||
2. Step two
|
||||
|
||||
## Expected Behavior
|
||||
[What should happen]
|
||||
|
||||
## Actual Behavior
|
||||
[What actually happens]
|
||||
|
||||
## Environment
|
||||
[Version, OS, etc.]"
|
||||
```
|
||||
|
||||
## 2. Debug Systematically
|
||||
|
||||
**REQUIRED: Use debugging-with-tools skill**
|
||||
|
||||
```
|
||||
Use Skill tool: hyperpowers:debugging-with-tools
|
||||
```
|
||||
|
||||
**debugging-with-tools will:**
|
||||
- Use internet-researcher to search for error
|
||||
- Recommend debugger or instrumentation
|
||||
- Use codebase-investigator to understand context
|
||||
- Guide to root cause (not symptom)
|
||||
|
||||
**Update bd issue with findings:**
|
||||
```bash
|
||||
bd edit bd-123 --design "[previous content]
|
||||
|
||||
## Investigation
|
||||
[Root cause found via debugging]
|
||||
[Tools used: debugger, internet search, etc.]"
|
||||
```
|
||||
|
||||
## 3. Write Failing Test (RED Phase)
|
||||
|
||||
**REQUIRED: Use test-driven-development skill**
|
||||
|
||||
Write test that reproduces the bug:
|
||||
|
||||
```python
|
||||
def test_rejects_empty_email():
|
||||
"""Regression test for bd-123: Empty email accepted"""
|
||||
with pytest.raises(ValidationError):
|
||||
create_user(email="") # Should fail, currently passes
|
||||
```
|
||||
|
||||
**Run test, verify it FAILS:**
|
||||
```bash
|
||||
pytest tests/test_user.py::test_rejects_empty_email
|
||||
# Expected: PASS (bug exists)
|
||||
# Should fail AFTER fix
|
||||
```
|
||||
|
||||
**Why critical:** If test passes before fix, it doesn't test the bug.
|
||||
|
||||
## 4. Implement Fix (GREEN Phase)
|
||||
|
||||
**Fix the root cause (not symptom):**
|
||||
|
||||
```python
|
||||
def create_user(email: str):
|
||||
if not email or not email.strip(): # Fix
|
||||
raise ValidationError("Email required")
|
||||
# ... rest
|
||||
```
|
||||
|
||||
**Run test, verify it now FAILS (test was written backwards by mistake earlier - fix this):**
|
||||
|
||||
Actually write the test to FAIL first:
|
||||
```python
|
||||
def test_rejects_empty_email():
|
||||
with pytest.raises(ValidationError):
|
||||
create_user(email="")
|
||||
```
|
||||
|
||||
Run:
|
||||
```bash
|
||||
pytest tests/test_user.py::test_rejects_empty_email
|
||||
# Should FAIL before fix (no validation)
|
||||
# Should PASS after fix (validation added)
|
||||
```
|
||||
|
||||
## 5. Verify Complete Fix
|
||||
|
||||
**REQUIRED: Use verification-before-completion skill**
|
||||
|
||||
```bash
|
||||
# Run full test suite (via test-runner agent)
|
||||
"Run: pytest"
|
||||
|
||||
# Agent returns: All tests pass (including regression test)
|
||||
```
|
||||
|
||||
**Verify:**
|
||||
- Regression test passes
|
||||
- All other tests still pass
|
||||
- No new warnings or errors
|
||||
- Pre-commit hooks pass
|
||||
|
||||
## 6. Close bd Issue
|
||||
|
||||
**Update with fix details:**
|
||||
```bash
|
||||
bd edit bd-123 --design "[previous content]
|
||||
|
||||
## Fix Implemented
|
||||
[Description of fix]
|
||||
[File changed: src/auth/user.py:23]
|
||||
|
||||
## Regression Test
|
||||
[Test added: tests/test_user.py::test_rejects_empty_email]
|
||||
|
||||
## Verification
|
||||
[All tests pass: 145/145]"
|
||||
|
||||
bd close bd-123
|
||||
```
|
||||
|
||||
**Commit with bd reference:**
|
||||
```bash
|
||||
git commit -m "fix(bd-123): Reject empty email in user creation
|
||||
|
||||
Adds validation to prevent empty strings.
|
||||
Regression test: test_rejects_empty_email
|
||||
|
||||
Closes bd-123"
|
||||
```
|
||||
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
|
||||
<example>
|
||||
<scenario>Developer fixes bug without creating bd issue or regression test</scenario>
|
||||
|
||||
<code>
|
||||
Developer notices: Empty email accepted in user creation
|
||||
|
||||
"Fixes" immediately:
|
||||
```python
|
||||
def create_user(email: str):
|
||||
if not email: # Quick fix
|
||||
raise ValidationError("Email required")
|
||||
```
|
||||
|
||||
Commits: "fix: validate email"
|
||||
|
||||
[No bd issue, no regression test]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**No tracking:**
|
||||
- Work not tracked in bd (can't see what was fixed)
|
||||
- No link between commit and bug
|
||||
- Can't verify fix meets requirements
|
||||
|
||||
**No regression test:**
|
||||
- Bug could come back in future
|
||||
- Can't prove fix works
|
||||
- No protection against breaking this again
|
||||
|
||||
**Incomplete fix:**
|
||||
- Doesn't handle `email=" "` (whitespace)
|
||||
- Didn't debug to understand full issue
|
||||
|
||||
**Result:** Bug returns when someone changes validation logic.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Complete workflow:**
|
||||
|
||||
```bash
|
||||
# 1. Track
|
||||
bd create "Bug: Empty email accepted" --type bug
|
||||
# Returns: bd-123
|
||||
|
||||
# 2. Debug (use debugging-with-tools)
|
||||
# Investigation reveals: Email validation missing entirely
|
||||
# Also: Whitespace emails like " " also accepted
|
||||
|
||||
# 3. Write failing test (RED)
|
||||
def test_rejects_empty_email():
|
||||
with pytest.raises(ValidationError):
|
||||
create_user(email="")
|
||||
|
||||
def test_rejects_whitespace_email():
|
||||
with pytest.raises(ValidationError):
|
||||
create_user(email=" ")
|
||||
|
||||
# Run: Both PASS (bug exists) - WAIT, test should FAIL before fix!
|
||||
```
|
||||
|
||||
Actually:
|
||||
```python
|
||||
# Test currently PASSES (bug exists - no validation)
|
||||
# We expect test to FAIL after we add validation
|
||||
|
||||
# 4. Fix
|
||||
def create_user(email: str):
|
||||
if not email or not email.strip():
|
||||
raise ValidationError("Email required")
|
||||
|
||||
# 5. Verify
|
||||
pytest # All tests pass now, including regression tests
|
||||
|
||||
# 6. Close
|
||||
bd close bd-123
|
||||
git commit -m "fix(bd-123): Reject empty/whitespace email"
|
||||
```
|
||||
|
||||
**Result:** Bug fixed, tracked, tested, won't regress.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer writes test after fix, test passes immediately, doesn't catch regression</scenario>
|
||||
|
||||
<code>
|
||||
Developer fixes validation bug, then writes test:
|
||||
|
||||
```python
|
||||
# Fix first
|
||||
def validate_email(email):
|
||||
return "@" in email and len(email) > 0
|
||||
|
||||
# Then test
|
||||
def test_validate_email():
|
||||
assert validate_email("user@example.com") == True
|
||||
```
|
||||
|
||||
Test runs: PASS
|
||||
|
||||
Commits both together.
|
||||
|
||||
Later, someone changes validation:
|
||||
```python
|
||||
def validate_email(email):
|
||||
return True # Breaks validation!
|
||||
```
|
||||
|
||||
Test still PASSES (only checks happy path).
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Test written after fix:**
|
||||
- Never saw test fail
|
||||
- Only tests happy path remembered
|
||||
- Doesn't test the bug that was fixed
|
||||
- Missed edge case: `validate_email("@@")` returns True (bug!)
|
||||
|
||||
**Why it happens:** Skipping TDD RED phase.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**TDD approach (RED-GREEN):**
|
||||
|
||||
```python
|
||||
# 1. Write test FIRST that reproduces the bug
|
||||
def test_validate_email():
|
||||
# Happy path
|
||||
assert validate_email("user@example.com") == True
|
||||
# Bug case (empty email was accepted)
|
||||
assert validate_email("") == False
|
||||
# Edge case discovered during debugging
|
||||
assert validate_email("@@") == False
|
||||
|
||||
# 2. Run test - should FAIL (bug exists)
|
||||
pytest test_validate_email
|
||||
# FAIL: validate_email("") returned True, expected False
|
||||
|
||||
# 3. Implement fix
|
||||
def validate_email(email):
|
||||
if not email or len(email) == 0:
|
||||
return False
|
||||
return "@" in email and email.count("@") == 1
|
||||
|
||||
# 4. Run test - should PASS
|
||||
pytest test_validate_email
|
||||
# PASS: All assertions pass
|
||||
```
|
||||
|
||||
**Later regression:**
|
||||
```python
|
||||
def validate_email(email):
|
||||
return True # Someone breaks it
|
||||
```
|
||||
|
||||
**Test catches it:**
|
||||
```
|
||||
FAIL: assert validate_email("") == False
|
||||
Expected False, got True
|
||||
```
|
||||
|
||||
**Result:** Regression test actually prevents bug from returning.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer fixes symptom without using debugging-with-tools to find root cause</scenario>
|
||||
|
||||
<code>
|
||||
Bug report: "Application crashes when processing user data"
|
||||
|
||||
Error:
|
||||
```
|
||||
NullPointerException at UserService.java:45
|
||||
```
|
||||
|
||||
Developer sees line 45:
|
||||
```java
|
||||
String email = user.getEmail().toLowerCase(); // Line 45
|
||||
```
|
||||
|
||||
"Obvious fix":
|
||||
```java
|
||||
String email = user.getEmail() != null ? user.getEmail().toLowerCase() : "";
|
||||
```
|
||||
|
||||
Bug "fixed"... but crashes continue with different data.
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Symptom fix:**
|
||||
- Fixed null check at crash point
|
||||
- Didn't investigate WHY email is null
|
||||
- Didn't use debugging-with-tools to find root cause
|
||||
|
||||
**Actual root cause:** User object created without email in registration flow. Email is null for all users created via broken endpoint.
|
||||
|
||||
**Result:** Null-check applied everywhere, root cause (broken registration) unfixed.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Use debugging-with-tools skill:**
|
||||
|
||||
```
|
||||
# Dispatch internet-researcher
|
||||
"Search for: NullPointerException UserService getEmail
|
||||
- Common causes of null email in user objects
|
||||
- User registration validation patterns"
|
||||
|
||||
# Dispatch codebase-investigator
|
||||
"Investigate:
|
||||
- How is User object created?
|
||||
- Where is email set?
|
||||
- Are there paths where email can be null?
|
||||
- Which endpoints create users?"
|
||||
|
||||
# Agent reports:
|
||||
"Found: POST /register endpoint creates User without validating email field.
|
||||
Email is optional in UserDTO but required in User domain object."
|
||||
```
|
||||
|
||||
**Root cause found:** Registration doesn't validate email.
|
||||
|
||||
**Proper fix:**
|
||||
```java
|
||||
// In registration endpoint
|
||||
@PostMapping("/register")
|
||||
public User register(@RequestBody UserDTO dto) {
|
||||
if (dto.getEmail() == null || dto.getEmail().isEmpty()) {
|
||||
throw new ValidationException("Email required");
|
||||
}
|
||||
return userService.create(dto);
|
||||
}
|
||||
```
|
||||
|
||||
**Regression test:**
|
||||
```java
|
||||
@Test
|
||||
void registrationRequiresEmail() {
|
||||
assertThrows(ValidationException.class, () ->
|
||||
register(new UserDTO(null, "password")));
|
||||
}
|
||||
```
|
||||
|
||||
**Result:** Root cause fixed, no more null emails created.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Every bug gets a bd issue** → Track from discovery to closure
|
||||
- Create bd issue before fixing
|
||||
- Document reproduction steps
|
||||
- Update with investigation findings
|
||||
- Close only after verified
|
||||
|
||||
2. **Use debugging-with-tools skill** → Systematic investigation required
|
||||
- Never guess at fixes
|
||||
- Use internet-researcher for errors
|
||||
- Use debugger/instrumentation for state
|
||||
- Find root cause, not symptom
|
||||
|
||||
3. **Write failing test first (RED)** → Regression prevention
|
||||
- Test must fail before fix
|
||||
- Test must reproduce the bug
|
||||
- Test must pass after fix
|
||||
- If test passes immediately, it doesn't test the bug
|
||||
|
||||
4. **Verify complete fix** → Use verification-before-completion
|
||||
- Regression test passes
|
||||
- Full test suite passes
|
||||
- No new warnings
|
||||
- Pre-commit hooks pass
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: Stop, follow complete workflow:
|
||||
- "Quick fix, no need for bd issue"
|
||||
- "Obvious bug, no need to debug"
|
||||
- "I'll add test later"
|
||||
- "Test passes, must be fixed"
|
||||
- "Just one line change"
|
||||
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
|
||||
Before claiming bug fixed:
|
||||
- [ ] bd issue created with reproduction steps
|
||||
- [ ] Used debugging-with-tools to find root cause
|
||||
- [ ] Wrote test that reproduces bug (RED phase)
|
||||
- [ ] Verified test FAILS before fix
|
||||
- [ ] Implemented fix addressing root cause
|
||||
- [ ] Verified test PASSES after fix
|
||||
- [ ] Ran full test suite (all pass)
|
||||
- [ ] Updated bd issue with fix details
|
||||
- [ ] Closed bd issue
|
||||
- [ ] Committed with bd reference
|
||||
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
|
||||
**This skill calls:**
|
||||
- debugging-with-tools (systematic investigation)
|
||||
- test-driven-development (RED-GREEN-REFACTOR cycle)
|
||||
- verification-before-completion (verify complete fix)
|
||||
|
||||
**This skill is called by:**
|
||||
- When bugs discovered during development
|
||||
- When test failures need fixing
|
||||
- When user reports bugs
|
||||
|
||||
**Agents used:**
|
||||
- hyperpowers:internet-researcher (via debugging-with-tools)
|
||||
- hyperpowers:codebase-investigator (via debugging-with-tools)
|
||||
- hyperpowers:test-runner (run tests without output pollution)
|
||||
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
|
||||
**When stuck:**
|
||||
- Don't understand bug → Use debugging-with-tools skill
|
||||
- Tempted to skip tracking → Create bd issue first, always
|
||||
- Test passes immediately → Not testing the bug, rewrite test
|
||||
- Fix doesn't work → Return to debugging-with-tools, find actual root cause
|
||||
|
||||
</resources>
|
||||
707
skills/managing-bd-tasks/SKILL.md
Normal file
707
skills/managing-bd-tasks/SKILL.md
Normal file
@@ -0,0 +1,707 @@
|
||||
---
|
||||
name: managing-bd-tasks
|
||||
description: Use for advanced bd operations - splitting tasks mid-flight, merging duplicates, changing dependencies, archiving epics, querying metrics, cross-epic dependencies
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Advanced bd operations for managing complex task structures; bd is single source of truth, keep it accurate.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
HIGH FREEDOM - These are operational patterns, not rigid workflows. Adapt operations to your specific situation while following the core principles (keep bd accurate, merge don't delete, document changes).
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Operation | When | Key Command |
|
||||
|-----------|------|-------------|
|
||||
| Split task | Task too large mid-flight | Create subtasks, add deps, close parent |
|
||||
| Merge duplicates | Found duplicate tasks | Combine designs, move deps, close with reference |
|
||||
| Change dependencies | Dependencies wrong/changed | `bd dep remove` then `bd dep add` |
|
||||
| Archive epic | Epic complete, hide from views | `bd close bd-X --reason "Archived"` |
|
||||
| Query metrics | Need status/velocity data | `bd list` + filters + `wc -l` |
|
||||
| Cross-epic deps | Task depends on other epic | `bd dep add` works across epics |
|
||||
| Bulk updates | Multiple tasks need same change | Loop with careful review first |
|
||||
| Recover mistakes | Accidentally closed/wrong dep | `bd update --status` or `bd dep remove` |
|
||||
|
||||
**Core principle:** Track all work in bd, update as you go, never batch updates.
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
Use this skill for **advanced** bd operations:
|
||||
- Split task that's too large (discovered mid-implementation)
|
||||
- Merge duplicate tasks
|
||||
- Reorganize dependencies after work started
|
||||
- Archive completed epics (hide from views, keep history)
|
||||
- Query bd for metrics (velocity, progress, bottlenecks)
|
||||
- Manage cross-epic dependencies
|
||||
- Bulk status updates
|
||||
- Recover from bd mistakes
|
||||
|
||||
**For basic operations:** See skills/common-patterns/bd-commands.md (create, show, close, update)
|
||||
</when_to_use>
|
||||
|
||||
<operations>
|
||||
## Operation 1: Splitting Tasks Mid-Flight
|
||||
|
||||
**When:** Task in-progress but turns out too large.
|
||||
|
||||
**Example:** Started "Implement authentication" - realize it's 8+ hours of work across multiple areas.
|
||||
|
||||
**Process:**
|
||||
|
||||
### Step 1: Create subtasks for remaining work
|
||||
|
||||
```bash
|
||||
# Original task bd-5 is in-progress
|
||||
# Already completed: Login form
|
||||
# Remaining work gets split:
|
||||
|
||||
bd create "Auth API endpoints" --type task --priority P1 --design "
|
||||
POST /api/login and POST /api/logout endpoints.
|
||||
## Success Criteria
|
||||
- [ ] POST /api/login validates credentials, returns JWT
|
||||
- [ ] POST /api/logout invalidates token
|
||||
- [ ] Tests pass
|
||||
"
|
||||
# Returns bd-12
|
||||
|
||||
bd create "Session management" --type task --priority P1 --design "
|
||||
JWT token tracking and validation.
|
||||
## Success Criteria
|
||||
- [ ] JWT generated on login
|
||||
- [ ] Tokens validated on protected routes
|
||||
- [ ] Token expiration handled
|
||||
- [ ] Tests pass
|
||||
"
|
||||
# Returns bd-13
|
||||
|
||||
bd create "Password hashing" --type task --priority P1 --design "
|
||||
Secure password hashing with bcrypt.
|
||||
## Success Criteria
|
||||
- [ ] Passwords hashed before storage
|
||||
- [ ] Hash verification on login
|
||||
- [ ] Tests pass
|
||||
"
|
||||
# Returns bd-14
|
||||
```
|
||||
|
||||
### Step 2: Set up dependencies
|
||||
|
||||
```bash
|
||||
# Password hashing must be done first
|
||||
# API endpoints depend on password hashing
|
||||
bd dep add bd-12 bd-14 # bd-12 depends on bd-14
|
||||
|
||||
# Session management depends on API endpoints
|
||||
bd dep add bd-13 bd-12 # bd-13 depends on bd-12
|
||||
|
||||
# View tree
|
||||
bd dep tree bd-5
|
||||
```
|
||||
|
||||
### Step 3: Update original task and close
|
||||
|
||||
```bash
|
||||
bd edit bd-5 --design "
|
||||
Implement user authentication.
|
||||
|
||||
## Status
|
||||
✓ Login form completed (frontend)
|
||||
✗ Remaining work split into subtasks:
|
||||
- bd-14: Password hashing (do first)
|
||||
- bd-12: Auth API endpoints (depends on bd-14)
|
||||
- bd-13: Session management (depends on bd-12)
|
||||
|
||||
## Success Criteria
|
||||
- [x] Login form renders
|
||||
- [ ] See subtasks for remaining criteria
|
||||
"
|
||||
|
||||
bd close bd-5 --reason "Split into bd-12, bd-13, bd-14"
|
||||
```
|
||||
|
||||
### Step 4: Work on subtasks in order
|
||||
|
||||
```bash
|
||||
bd ready # Shows bd-14 (no dependencies)
|
||||
bd update bd-14 --status in_progress
|
||||
# Complete bd-14...
|
||||
bd close bd-14
|
||||
|
||||
# Now bd-12 is unblocked
|
||||
bd ready # Shows bd-12
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Operation 2: Merging Duplicate Tasks
|
||||
|
||||
**When:** Discovered two tasks are same thing.
|
||||
|
||||
**Example:**
|
||||
```
|
||||
bd-7: "Add email validation"
|
||||
bd-9: "Validate user email addresses"
|
||||
^ Duplicates
|
||||
```
|
||||
|
||||
### Step 1: Choose which to keep
|
||||
|
||||
Based on:
|
||||
- Which has more complete design?
|
||||
- Which has more work done?
|
||||
- Which has more dependencies?
|
||||
|
||||
**Example:** Keep bd-7 (more complete)
|
||||
|
||||
### Step 2: Merge designs
|
||||
|
||||
```bash
|
||||
bd show bd-7
|
||||
bd show bd-9
|
||||
|
||||
# Combine into bd-7
|
||||
bd edit bd-7 --design "
|
||||
Add email validation to user creation and update.
|
||||
|
||||
## Background
|
||||
Originally tracked as bd-7 and bd-9 (now merged).
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Email validated on creation
|
||||
- [ ] Email validated on update
|
||||
- [ ] Rejects invalid formats
|
||||
- [ ] Rejects empty strings
|
||||
- [ ] Tests cover all cases
|
||||
|
||||
## Notes from bd-9
|
||||
Need validation on update, not just creation.
|
||||
"
|
||||
```
|
||||
|
||||
### Step 3: Move dependencies
|
||||
|
||||
```bash
|
||||
# Check bd-9 dependencies
|
||||
bd show bd-9
|
||||
|
||||
# If bd-10 depended on bd-9, update to bd-7
|
||||
bd dep remove bd-10 bd-9
|
||||
bd dep add bd-10 bd-7
|
||||
```
|
||||
|
||||
### Step 4: Close duplicate with reference
|
||||
|
||||
```bash
|
||||
bd edit bd-9 --design "DUPLICATE: Merged into bd-7
|
||||
|
||||
This task was duplicate of bd-7. All work tracked there."
|
||||
|
||||
bd close bd-9
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Operation 3: Changing Dependencies
|
||||
|
||||
**When:** Dependencies were wrong or requirements changed.
|
||||
|
||||
**Example:** bd-10 depends on bd-8 and bd-9, but bd-9 got merged and bd-10 now also needs bd-11.
|
||||
|
||||
```bash
|
||||
# Remove obsolete dependency
|
||||
bd dep remove bd-10 bd-9
|
||||
|
||||
# Add new dependency
|
||||
bd dep add bd-10 bd-11
|
||||
|
||||
# Verify
|
||||
bd dep tree bd-1 # If bd-10 in epic bd-1
|
||||
bd show bd-10 | grep "Blocking"
|
||||
```
|
||||
|
||||
**Common scenarios:**
|
||||
- Discovered hidden dependency during implementation
|
||||
- Requirements changed mid-flight
|
||||
- Tasks reordered for better flow
|
||||
|
||||
---
|
||||
|
||||
## Operation 4: Archiving Completed Epics
|
||||
|
||||
**When:** Epic complete, want to hide from default views but keep history.
|
||||
|
||||
```bash
|
||||
# Verify all tasks closed
|
||||
bd list --parent bd-1 --status open
|
||||
# Output: [empty] = all closed
|
||||
|
||||
# Archive epic
|
||||
bd close bd-1 --reason "Archived - completed Oct 2025"
|
||||
|
||||
# Won't show in open listings
|
||||
bd list --status open # bd-1 won't appear
|
||||
|
||||
# Still accessible
|
||||
bd show bd-1 # Still shows full epic
|
||||
```
|
||||
|
||||
**Use archived for:** Completed epics, shipped features, historical reference
|
||||
**Use open/in-progress for:** Active work
|
||||
**Use closed with note for:** Cancelled work (explain why)
|
||||
|
||||
---
|
||||
|
||||
## Operation 5: Querying for Metrics
|
||||
|
||||
### Velocity
|
||||
|
||||
```bash
|
||||
# Tasks closed this week
|
||||
bd list --status closed | grep "closed_at" | grep "2025-10-" | wc -l
|
||||
|
||||
# Tasks closed by epic
|
||||
bd list --parent bd-1 --status closed | wc -l
|
||||
```
|
||||
|
||||
### Blocked vs Ready
|
||||
|
||||
```bash
|
||||
# Ready to work on
|
||||
bd ready
|
||||
bd ready | grep "^bd-" | wc -l
|
||||
|
||||
# All open tasks
|
||||
bd list --status open | wc -l
|
||||
|
||||
# Blocked = open - ready
|
||||
```
|
||||
|
||||
### Epic Progress
|
||||
|
||||
```bash
|
||||
# Show tree
|
||||
bd dep tree bd-1
|
||||
|
||||
# Total tasks in epic
|
||||
bd list --parent bd-1 | grep "^bd-" | wc -l
|
||||
|
||||
# Completed tasks
|
||||
bd list --parent bd-1 --status closed | grep "^bd-" | wc -l
|
||||
|
||||
# Percentage = (completed / total) * 100
|
||||
```
|
||||
|
||||
**For detailed metrics guidance:** See [resources/metrics-guide.md](resources/metrics-guide.md)
|
||||
|
||||
---
|
||||
|
||||
## Operation 6: Cross-Epic Dependencies
|
||||
|
||||
**When:** Task in one epic depends on task in different epic.
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Epic bd-1: User Management
|
||||
- bd-10: User CRUD API
|
||||
|
||||
Epic bd-2: Order Management
|
||||
- bd-20: Order creation (needs user API)
|
||||
```
|
||||
|
||||
```bash
|
||||
# Add cross-epic dependency
|
||||
bd dep add bd-20 bd-10
|
||||
# bd-20 (in bd-2) depends on bd-10 (in bd-1)
|
||||
|
||||
# Check dependencies
|
||||
bd show bd-20 | grep "Blocking"
|
||||
|
||||
# Check ready tasks
|
||||
bd ready
|
||||
# Won't show bd-20 until bd-10 closed
|
||||
```
|
||||
|
||||
**Best practices:**
|
||||
- Document cross-epic dependencies clearly
|
||||
- Consider if epics should be merged
|
||||
- Coordinate if different people own epics
|
||||
|
||||
---
|
||||
|
||||
## Operation 7: Bulk Status Updates
|
||||
|
||||
**When:** Need to update multiple tasks.
|
||||
|
||||
**Example:** Mark all test tasks closed after suite complete.
|
||||
|
||||
```bash
|
||||
# Get tasks
|
||||
bd list --parent bd-1 --status open | grep "test:" > test-tasks.txt
|
||||
|
||||
# Review list
|
||||
cat test-tasks.txt
|
||||
|
||||
# Update each
|
||||
while read task_id; do
|
||||
bd close "$task_id"
|
||||
done < test-tasks.txt
|
||||
|
||||
# Verify
|
||||
bd list --parent bd-1 --status open | grep "test:"
|
||||
```
|
||||
|
||||
**Use bulk for:**
|
||||
- Marking completed work closed
|
||||
- Reopening related tasks
|
||||
- Updating priorities
|
||||
|
||||
**Never bulk:**
|
||||
- Thoughtless changes
|
||||
- Hiding problems (closing unfinished tasks)
|
||||
|
||||
---
|
||||
|
||||
## Operation 8: Recovering from Mistakes
|
||||
|
||||
### Accidentally closed task
|
||||
|
||||
```bash
|
||||
bd update bd-15 --status open
|
||||
# Or if was in progress
|
||||
bd update bd-15 --status in_progress
|
||||
```
|
||||
|
||||
### Wrong dependency
|
||||
|
||||
```bash
|
||||
bd dep remove bd-10 bd-8 # Remove wrong
|
||||
bd dep add bd-10 bd-9 # Add correct
|
||||
```
|
||||
|
||||
### Undo design changes
|
||||
|
||||
```bash
|
||||
# bd has no undo, restore from git
|
||||
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-10"
|
||||
# Find previous version, copy
|
||||
|
||||
bd edit bd-10 --design "[paste previous]"
|
||||
```
|
||||
|
||||
### Epic structure wrong
|
||||
|
||||
1. Create new tasks with correct structure
|
||||
2. Move work to new tasks
|
||||
3. Close old tasks with reference
|
||||
4. Don't delete (keep audit trail)
|
||||
</operations>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer closes duplicate without merging information</scenario>
|
||||
|
||||
<code>
|
||||
# Found duplicates
|
||||
bd-7: "Add email validation"
|
||||
bd-9: "Validate user email addresses"
|
||||
|
||||
# Developer just closes bd-9
|
||||
bd close bd-9
|
||||
|
||||
# Loses information from bd-9's design
|
||||
# bd-9 mentioned validation on update (bd-7 didn't)
|
||||
# Now that requirement is lost
|
||||
# Work on bd-7 completes, but misses update validation
|
||||
# Bug ships to production
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Closed duplicate without reading its design
|
||||
- Lost requirement mentioned only in duplicate
|
||||
- Information not preserved
|
||||
- Incomplete implementation ships
|
||||
- bd not accurate source of truth
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct process:**
|
||||
|
||||
```bash
|
||||
# Read BOTH tasks
|
||||
bd show bd-7 # Only mentions validation on creation
|
||||
bd show bd-9 # Mentions validation on update too
|
||||
|
||||
# Merge information
|
||||
bd edit bd-7 --design "
|
||||
Email validation for user creation and update.
|
||||
|
||||
## Background
|
||||
Merged from bd-9.
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Validate on creation (from bd-7)
|
||||
- [ ] Validate on update (from bd-9) ← Preserved!
|
||||
- [ ] Tests for both cases
|
||||
"
|
||||
|
||||
# Then close duplicate with reference
|
||||
bd edit bd-9 --design "DUPLICATE: Merged into bd-7"
|
||||
bd close bd-9
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- All requirements preserved
|
||||
- bd remains accurate
|
||||
- No information lost
|
||||
- Complete implementation
|
||||
- Audit trail clear
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer doesn't split large task, struggles through</scenario>
|
||||
|
||||
<code>
|
||||
bd-15: "Implement payment processing" (started)
|
||||
|
||||
# 3 hours in, developer realizes:
|
||||
# - Need Stripe API integration (4 hours)
|
||||
# - Need payment validation (2 hours)
|
||||
# - Need retry logic (3 hours)
|
||||
# - Need receipt generation (2 hours)
|
||||
# Total: 11 more hours!
|
||||
|
||||
# Developer thinks: "Too late to split, I'll power through"
|
||||
# Works 14 hours straight
|
||||
# Gets exhausted, makes mistakes
|
||||
# Ships buggy code
|
||||
# Has to fix in production
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Didn't split when discovered size
|
||||
- "Sunk cost" rationalization (already started)
|
||||
- No clear stopping points
|
||||
- Exhaustion leads to bugs
|
||||
- Can't track progress granularly
|
||||
- If interrupted, hard to resume
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach (split mid-flight):**
|
||||
|
||||
```bash
|
||||
# 3 hours in, stop and split
|
||||
|
||||
bd edit bd-15 --design "
|
||||
Implement payment processing.
|
||||
|
||||
## Status
|
||||
✓ Completed: Payment form UI (3 hours)
|
||||
✗ Split remaining work into subtasks:
|
||||
- bd-20: Stripe API integration
|
||||
- bd-21: Payment validation
|
||||
- bd-22: Retry logic
|
||||
- bd-23: Receipt generation
|
||||
"
|
||||
|
||||
bd close bd-15 --reason "Split into bd-20, bd-21, bd-22, bd-23"
|
||||
|
||||
# Create subtasks with dependencies
|
||||
bd create "Stripe API integration" ... # bd-20
|
||||
bd create "Payment validation" ... # bd-21
|
||||
bd create "Retry logic" ... # bd-22
|
||||
bd create "Receipt generation" ... # bd-23
|
||||
|
||||
bd dep add bd-21 bd-20 # Validation needs API
|
||||
bd dep add bd-22 bd-20 # Retry needs API
|
||||
bd dep add bd-23 bd-22 # Receipts after retry works
|
||||
|
||||
# Work on one at a time
|
||||
bd update bd-20 --status in_progress
|
||||
# Complete bd-20 (4 hours)
|
||||
bd close bd-20
|
||||
|
||||
# Take break
|
||||
# Next day: bd-21
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Clear stopping points (can pause between tasks)
|
||||
- Track progress granularly
|
||||
- No exhaustion (spread over days)
|
||||
- Better quality (not rushed)
|
||||
- If interrupted, easy to resume
|
||||
- Each subtask gets proper focus
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer adds dependency but doesn't update dependent task</scenario>
|
||||
|
||||
<code>
|
||||
# Initial state
|
||||
bd-10: "Add user dashboard" (in progress)
|
||||
bd-15: "Add analytics to dashboard" (blocked on bd-10)
|
||||
|
||||
# During bd-10 implementation, discover need for new API
|
||||
bd create "Analytics API endpoints" ... # Creates bd-20
|
||||
|
||||
# Add dependency
|
||||
bd dep add bd-15 bd-20 # bd-15 now depends on bd-20 too
|
||||
|
||||
# But bd-10 completes, closes
|
||||
bd close bd-10
|
||||
|
||||
# bd-15 shows as ready (bd-10 closed)
|
||||
bd ready # Shows bd-15
|
||||
|
||||
# Developer starts bd-15
|
||||
bd update bd-15 --status in_progress
|
||||
|
||||
# Immediately blocked - needs bd-20!
|
||||
# bd-20 not done yet
|
||||
# Have to stop work on bd-15
|
||||
# Time wasted
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Added dependency but didn't document in bd-15
|
||||
- bd-15's design doesn't mention bd-20 requirement
|
||||
- Appears ready when not actually ready
|
||||
- Wastes time starting work that's blocked
|
||||
- Dependencies not obvious from task design
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
```bash
|
||||
# Create new API task
|
||||
bd create "Analytics API endpoints" ... # bd-20
|
||||
|
||||
# Add dependency
|
||||
bd dep add bd-15 bd-20
|
||||
|
||||
# UPDATE bd-15 to document new requirement
|
||||
bd edit bd-15 --design "
|
||||
Add analytics to dashboard.
|
||||
|
||||
## Dependencies
|
||||
- bd-10: User dashboard (completed)
|
||||
- bd-20: Analytics API endpoints (NEW - discovered during bd-10)
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Integrate with analytics API (bd-20)
|
||||
- [ ] Display charts on dashboard
|
||||
- [ ] Tests pass
|
||||
"
|
||||
|
||||
# Close bd-10
|
||||
bd close bd-10
|
||||
|
||||
# Check ready
|
||||
bd ready # Does NOT show bd-15 (blocked on bd-20)
|
||||
|
||||
# Work on bd-20 first
|
||||
bd update bd-20 --status in_progress
|
||||
# Complete bd-20
|
||||
bd close bd-20
|
||||
|
||||
# NOW bd-15 is truly ready
|
||||
bd ready # Shows bd-15
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Dependencies documented in task design
|
||||
- Clear why task is blocked
|
||||
- No false "ready" signals
|
||||
- Work proceeds in correct order
|
||||
- No wasted time starting blocked work
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Keep bd accurate** → Single source of truth for all work
|
||||
2. **Merge duplicates, don't just close** → Preserve information from both
|
||||
3. **Split large tasks when discovered** → Not after struggling through
|
||||
4. **Document dependency changes** → Update task designs when deps change
|
||||
5. **Update as you go** → Never batch updates "for later"
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Follow the operation properly.**
|
||||
|
||||
- "Task too complex to split" (Every task can be broken down)
|
||||
- "Just close duplicate" (Merge first, preserve information)
|
||||
- "Won't track this in bd" (All work tracked, no exceptions)
|
||||
- "bd is out of date, update later" (Later never comes, update now)
|
||||
- "This dependency doesn't matter" (Dependencies prevent blocking, they matter)
|
||||
- "Too much overhead to split" (More overhead to fail huge task)
|
||||
</critical_rules>
|
||||
|
||||
<bd_best_practices>
|
||||
**For detailed guidance on:**
|
||||
- Task naming conventions
|
||||
- Priority guidelines (P0-P4)
|
||||
- Task granularity
|
||||
- Success criteria
|
||||
- Dependency management
|
||||
|
||||
**See:** [resources/task-naming-guide.md](resources/task-naming-guide.md)
|
||||
</bd_best_practices>
|
||||
|
||||
<red_flags>
|
||||
Watch for these patterns:
|
||||
|
||||
- **Multiple in-progress tasks** → Focus on one
|
||||
- **Tasks stuck in-progress for days** → Blocked? Split it?
|
||||
- **Many open tasks, no dependencies** → Prioritize!
|
||||
- **Epics with 20+ tasks** → Too large, split epic
|
||||
- **Closed tasks, incomplete criteria** → Not done, reopen
|
||||
</red_flags>
|
||||
|
||||
<verification_checklist>
|
||||
After advanced bd operations:
|
||||
|
||||
- [ ] bd still accurate (reflects reality)
|
||||
- [ ] Dependencies correct (nothing blocked incorrectly)
|
||||
- [ ] Duplicate information merged (not lost)
|
||||
- [ ] Changes documented in task designs
|
||||
- [ ] Ready tasks are actually unblocked
|
||||
- [ ] Metrics queries return sensible numbers
|
||||
- [ ] No orphaned tasks (all part of epics)
|
||||
|
||||
**Can't check all boxes?** Review operation and fix issues.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill covers:** Advanced bd operations
|
||||
|
||||
**For basic operations:**
|
||||
- skills/common-patterns/bd-commands.md
|
||||
|
||||
**Related skills:**
|
||||
- hyperpowers:writing-plans (creating epics and tasks)
|
||||
- hyperpowers:executing-plans (working through tasks)
|
||||
- hyperpowers:verification-before-completion (closing tasks properly)
|
||||
|
||||
**CRITICAL:** Use bd CLI commands, never read `.beads/issues.jsonl` directly.
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Metrics guide (cycle time, WIP limits)](resources/metrics-guide.md)
|
||||
- [Task naming conventions](resources/task-naming-guide.md)
|
||||
- [Dependency patterns](resources/dependency-patterns.md)
|
||||
|
||||
**When stuck:**
|
||||
- Task seems unsplittable → Ask user how to break it down
|
||||
- Duplicates complex → Merge designs carefully, don't rush
|
||||
- Dependencies tangled → Draw diagram, untangle systematically
|
||||
- bd out of sync → Stop everything, update bd first
|
||||
</resources>
|
||||
166
skills/managing-bd-tasks/resources/metrics-guide.md
Normal file
166
skills/managing-bd-tasks/resources/metrics-guide.md
Normal file
@@ -0,0 +1,166 @@
|
||||
# bd Metrics Guide
|
||||
|
||||
This guide covers the key metrics for tracking work in bd.
|
||||
|
||||
## Cycle Time vs. Lead Time
|
||||
|
||||
**Two distinct time measurements:**
|
||||
|
||||
### Cycle Time
|
||||
|
||||
- **Definition**: Time from "work started" to "work completed"
|
||||
- **Start**: When task moves to "in-progress" status
|
||||
- **End**: When task moves to "closed" status
|
||||
- **Measures**: How efficiently work flows through active development
|
||||
- **Use**: Identify process inefficiencies, improve development speed
|
||||
|
||||
```bash
|
||||
# Calculate cycle time for completed task
|
||||
bd show bd-5 | grep "status.*in-progress" # Get start time
|
||||
bd show bd-5 | grep "status.*closed" # Get end time
|
||||
# Difference = cycle time
|
||||
```
|
||||
|
||||
### Lead Time
|
||||
|
||||
- **Definition**: Time from "request created" to "delivered to customer"
|
||||
- **Start**: When task is created (enters backlog)
|
||||
- **End**: When task is deployed/delivered
|
||||
- **Measures**: Overall responsiveness to requests
|
||||
- **Use**: Set realistic expectations, measure total process duration
|
||||
|
||||
```bash
|
||||
# Calculate lead time for completed task
|
||||
bd show bd-5 | grep "created_at" # Get creation time
|
||||
bd show bd-5 | grep "deployed_at" # Get deployment time (if tracked)
|
||||
# Difference = lead time
|
||||
```
|
||||
|
||||
### Key Differences
|
||||
|
||||
| Metric | Starts | Ends | Includes Waiting? | Measures |
|
||||
|--------|--------|------|-------------------|----------|
|
||||
| **Cycle Time** | In-progress | Closed | No | Development efficiency |
|
||||
| **Lead Time** | Created | Deployed | Yes | Total responsiveness |
|
||||
|
||||
### Example
|
||||
|
||||
```
|
||||
Task created: Monday 9am (enters backlog)
|
||||
↓ [waits 2 days]
|
||||
Task started: Wednesday 9am (moved to in-progress)
|
||||
↓ [active work]
|
||||
Task completed: Wednesday 5pm (moved to closed)
|
||||
↓ [waits for deployment]
|
||||
Task deployed: Thursday 2pm (delivered)
|
||||
|
||||
Cycle Time: 8 hours (Wednesday 9am → 5pm)
|
||||
Lead Time: 3 days, 5 hours (Monday 9am → Thursday 2pm)
|
||||
```
|
||||
|
||||
### Why Both Matter
|
||||
|
||||
- **Short cycle time, long lead time**: Work is efficient once started, but tasks wait too long in backlog
|
||||
- Fix: Reduce WIP, start fewer tasks, finish faster
|
||||
|
||||
- **Long cycle time, short lead time**: Work starts immediately but takes forever to complete
|
||||
- Fix: Split tasks smaller, remove blockers, improve focus
|
||||
|
||||
- **Both long**: Overall process is slow
|
||||
- Fix: Address both backlog management AND development efficiency
|
||||
|
||||
### Tracking Over Time
|
||||
|
||||
```bash
|
||||
# Average cycle time (manual calculation)
|
||||
# For each closed task: (closed_at - started_at)
|
||||
# Sum and divide by task count
|
||||
|
||||
# Trend analysis
|
||||
# Week 1: Avg cycle time = 3 days
|
||||
# Week 2: Avg cycle time = 2 days ✅ Improving
|
||||
# Week 3: Avg cycle time = 4 days ❌ Getting worse
|
||||
```
|
||||
|
||||
### Improvement Targets
|
||||
|
||||
- **Cycle time**: Reduce by splitting tasks, removing blockers, improving focus
|
||||
- **Lead time**: Reduce by prioritizing backlog, reducing WIP, faster deployment
|
||||
|
||||
## Work in Progress (WIP)
|
||||
|
||||
```bash
|
||||
# All in-progress tasks
|
||||
bd list --status in-progress
|
||||
|
||||
# Count
|
||||
bd list --status in-progress | grep "^bd-" | wc -l
|
||||
```
|
||||
|
||||
### WIP Limits
|
||||
|
||||
Work in Progress limits prevent overcommitment and identify bottlenecks.
|
||||
|
||||
**Setting WIP limits:**
|
||||
- **Personal WIP limit**: 1-2 tasks in-progress at a time
|
||||
- **Team WIP limit**: Depends on team size and workflow stages
|
||||
- **Rule of thumb**: WIP limit = (Team size ÷ 2) + 1
|
||||
|
||||
**Example for individual developer:**
|
||||
```
|
||||
✅ Good: 1 task in-progress, 0-1 in code review
|
||||
❌ Bad: 5 tasks in-progress simultaneously
|
||||
```
|
||||
|
||||
**Example for team of 6:**
|
||||
```
|
||||
Workflow stages and limits:
|
||||
- Backlog: Unlimited
|
||||
- Ready: 8 items max
|
||||
- In Progress: 4 items max (team size ÷ 2 + 1)
|
||||
- Code Review: 3 items max
|
||||
- Testing: 2 items max
|
||||
- Done: Unlimited
|
||||
```
|
||||
|
||||
### Why WIP Limits Matter
|
||||
|
||||
1. **Focus:** Fewer tasks means deeper focus, faster completion
|
||||
2. **Flow:** Prevents bottlenecks from accumulating
|
||||
3. **Quality:** Less context switching, fewer mistakes
|
||||
4. **Visibility:** High WIP indicates blocked work or overcommitment
|
||||
|
||||
### Monitoring WIP
|
||||
|
||||
```bash
|
||||
# Check personal WIP
|
||||
bd list --status in-progress | grep "assignee:me" | wc -l
|
||||
|
||||
# If > 2: Focus on finishing before starting new work
|
||||
```
|
||||
|
||||
### Red Flags
|
||||
|
||||
- WIP consistently at or above limit (need more capacity or smaller tasks)
|
||||
- WIP growing week-over-week (work piling up, not finishing)
|
||||
- WIP high but velocity low (tasks blocked or too large)
|
||||
|
||||
### Response to High WIP
|
||||
|
||||
1. Finish existing tasks before starting new ones
|
||||
2. Identify and remove blockers
|
||||
3. Split large tasks
|
||||
4. Add capacity (if chronically high)
|
||||
|
||||
## Bottleneck Identification
|
||||
|
||||
```bash
|
||||
# Find tasks that are blocking others
|
||||
# (Tasks that many other tasks depend on)
|
||||
for task in $(bd list --status open | grep "^bd-" | cut -d: -f1); do
|
||||
echo -n "$task: "
|
||||
bd list --status open | xargs -I {} sh -c "bd show {} | grep -q \"depends on $task\" && echo {}" | wc -l
|
||||
done | sort -t: -k2 -n -r
|
||||
|
||||
# Shows tasks with most dependencies (top bottlenecks)
|
||||
```
|
||||
276
skills/managing-bd-tasks/resources/task-naming-guide.md
Normal file
276
skills/managing-bd-tasks/resources/task-naming-guide.md
Normal file
@@ -0,0 +1,276 @@
|
||||
# bd Task Naming and Quality Guidelines
|
||||
|
||||
This guide covers best practices for naming tasks, setting priorities, sizing work, and defining success criteria.
|
||||
|
||||
## Task Naming Conventions
|
||||
|
||||
### Principles
|
||||
|
||||
- **Actionable**: Start with action verbs (add, fix, update, remove, refactor, implement)
|
||||
- **Specific**: Include enough context to understand without opening
|
||||
- **Consistent**: Follow project-wide templates
|
||||
|
||||
### Templates by Task Type
|
||||
|
||||
#### User Stories
|
||||
|
||||
**Template:**
|
||||
```
|
||||
As a [persona], I want [something] so that [reason]
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
```
|
||||
As a customer, I want one-click checkout so that I can purchase quickly
|
||||
As an admin, I want bulk user import so that I can onboard teams efficiently
|
||||
As a developer, I want API rate limiting so that I can prevent abuse
|
||||
```
|
||||
|
||||
**When to use:** Features from user perspective
|
||||
|
||||
#### Bug Reports
|
||||
|
||||
**Template 1 (Capability-focused):**
|
||||
```
|
||||
[User type] can't [action they should be able to do]
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
```
|
||||
New users can't view home screen after signup
|
||||
Admin users can't export user data to CSV
|
||||
Guest users can't add items to cart
|
||||
```
|
||||
|
||||
**Template 2 (Event-focused):**
|
||||
```
|
||||
When [action/event], [system feature] doesn't work
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
```
|
||||
When clicking Submit, payment form doesn't validate
|
||||
When uploading large files, progress bar freezes
|
||||
When session expires, user isn't redirected to login
|
||||
```
|
||||
|
||||
**When to use:** Describing broken functionality
|
||||
|
||||
#### Tasks (Implementation Work)
|
||||
|
||||
**Template:**
|
||||
```
|
||||
[Verb] [object] [context]
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
```
|
||||
feat(auth): Implement JWT token generation
|
||||
fix(api): Handle empty email validation in user endpoint
|
||||
test: Add integration tests for payment flow
|
||||
refactor: Extract validation logic from UserService
|
||||
docs: Update API documentation for v2 endpoints
|
||||
```
|
||||
|
||||
**When to use:** Technical implementation tasks
|
||||
|
||||
#### Features (High-Level Capabilities)
|
||||
|
||||
**Template:**
|
||||
```
|
||||
[Verb] [capability] for [user/system]
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
```
|
||||
Add dark mode toggle for Settings page
|
||||
Implement rate limiting for API endpoints
|
||||
Enable two-factor authentication for admin users
|
||||
Build export functionality for report data
|
||||
```
|
||||
|
||||
**When to use:** Feature-level work (may become epic with multiple tasks)
|
||||
|
||||
### Context Guidelines
|
||||
|
||||
- **Which component**: "in login flow", "for user API", "in Settings page"
|
||||
- **Which user type**: "for admins", "for guests", "for authenticated users"
|
||||
- **Avoid jargon** in user stories (user perspective, not technical)
|
||||
- **Be specific** in technical tasks (exact API, file, function)
|
||||
|
||||
### Good vs Bad Names
|
||||
|
||||
**Good names:**
|
||||
- `feat(auth): Implement JWT token generation`
|
||||
- `fix(api): Handle empty email validation in user endpoint`
|
||||
- `As a customer, I want CSV export so that I can analyze my data`
|
||||
- `test: Add integration tests for payment flow`
|
||||
- `refactor: Extract validation logic from UserService`
|
||||
|
||||
**Bad names:**
|
||||
- `fix stuff` (vague - what stuff?)
|
||||
- `implement feature` (vague - which feature?)
|
||||
- `work on backend` (vague - what work?)
|
||||
- `Report` (noun, not action - should be "Generate Q4 Sales Report")
|
||||
- `API endpoint` (incomplete - "Add GET /users endpoint" better)
|
||||
|
||||
## Priority Guidelines
|
||||
|
||||
Use bd's priority system consistently:
|
||||
|
||||
- **P0:** Critical production bug (drop everything)
|
||||
- **P1:** Blocking other work (do next)
|
||||
- **P2:** Important feature work (normal priority)
|
||||
- **P3:** Nice to have (do when time permits)
|
||||
- **P4:** Someday/maybe (backlog)
|
||||
|
||||
## Granularity Guidelines
|
||||
|
||||
**Good task size:**
|
||||
- 2-4 hours of focused work
|
||||
- Can complete in one sitting
|
||||
- Clear deliverable
|
||||
|
||||
**Too large:**
|
||||
- Takes multiple days
|
||||
- Multiple independent pieces
|
||||
- Should be split
|
||||
|
||||
**Too small:**
|
||||
- Takes 15 minutes
|
||||
- Too granular to track
|
||||
- Combine with related tasks
|
||||
|
||||
## Success Criteria: Acceptance Criteria vs. Definition of Done
|
||||
|
||||
**Two distinct types of completion criteria:**
|
||||
|
||||
### Acceptance Criteria (Per-Task, Functional)
|
||||
|
||||
**Definition:** Specific, measurable requirements unique to each task that define functional completeness from user/business perspective.
|
||||
|
||||
**Scope:** Unique to each backlog item (bug, task, story)
|
||||
|
||||
**Purpose:** "Does this feature work correctly?"
|
||||
|
||||
**Owner:** Product owner/stakeholder defines, team validates
|
||||
|
||||
**Format:** Checklist or scenarios
|
||||
|
||||
```markdown
|
||||
## Acceptance Criteria
|
||||
- [ ] User can upload CSV files up to 10MB
|
||||
- [ ] System validates CSV format before processing
|
||||
- [ ] User sees progress bar during upload
|
||||
- [ ] User receives success message with row count
|
||||
- [ ] Invalid files show specific error messages
|
||||
```
|
||||
|
||||
**Scenario format (Given/When/Then):**
|
||||
```markdown
|
||||
## Acceptance Criteria
|
||||
|
||||
Scenario 1: Valid file upload
|
||||
Given a user is on the upload page
|
||||
When they select a valid CSV file
|
||||
Then the file uploads successfully
|
||||
And they see confirmation with row count
|
||||
|
||||
Scenario 2: Invalid file format
|
||||
Given a user selects a non-CSV file
|
||||
When they try to upload
|
||||
Then they see error: "Only CSV files supported"
|
||||
```
|
||||
|
||||
### Definition of Done (Universal, Quality)
|
||||
|
||||
**Definition:** Universal checklist that applies to ALL work items to ensure consistent quality and release-readiness.
|
||||
|
||||
**Scope:** Applies to every single task (bugs, features, stories)
|
||||
|
||||
**Purpose:** "Is this work complete to our quality standards?"
|
||||
|
||||
**Owner:** Team defines and maintains (reviewed in retrospectives)
|
||||
|
||||
**Example DoD:**
|
||||
```markdown
|
||||
## Definition of Done (applies to all tasks)
|
||||
- [ ] Code written and peer-reviewed
|
||||
- [ ] Unit tests written and passing (>80% coverage)
|
||||
- [ ] Integration tests passing
|
||||
- [ ] No linter warnings
|
||||
- [ ] Documentation updated (if public API)
|
||||
- [ ] Manual testing completed (if UI)
|
||||
- [ ] Deployed to staging environment
|
||||
- [ ] Product owner accepted
|
||||
- [ ] Commit references bd task ID
|
||||
```
|
||||
|
||||
### Key Differences
|
||||
|
||||
| Aspect | Acceptance Criteria | Definition of Done |
|
||||
|--------|--------------------|--------------------|
|
||||
| **Scope** | Per-task (unique) | All tasks (universal) |
|
||||
| **Focus** | Functional requirements | Quality standards |
|
||||
| **Question** | "Does it work?" | "Is it done?" |
|
||||
| **Owner** | Product owner | Team |
|
||||
| **Changes** | Per task | Rarely (retrospectives) |
|
||||
| **Examples** | "User can export data" | "Tests pass, code reviewed" |
|
||||
|
||||
### How to Use Both
|
||||
|
||||
**When creating a task:**
|
||||
|
||||
1. **Define Acceptance Criteria** (task-specific functional requirements)
|
||||
2. **Reference Definition of Done** (don't duplicate it in task)
|
||||
|
||||
```markdown
|
||||
bd create "Implement CSV file upload" --design "
|
||||
## Acceptance Criteria
|
||||
- [ ] User can upload CSV files up to 10MB
|
||||
- [ ] System validates CSV format
|
||||
- [ ] Progress bar shows during upload
|
||||
- [ ] Success message displays row count
|
||||
|
||||
## Notes
|
||||
Must also meet team's Definition of Done (see project wiki)
|
||||
"
|
||||
```
|
||||
|
||||
**Before closing a task:**
|
||||
|
||||
1. ✅ Verify all Acceptance Criteria met (functional)
|
||||
2. ✅ Verify Definition of Done met (quality)
|
||||
3. Only then close task
|
||||
|
||||
**Bad practice:**
|
||||
```markdown
|
||||
## Success Criteria
|
||||
- [ ] CSV upload works
|
||||
- [ ] Tests pass ← This is DoD, not acceptance criteria
|
||||
- [ ] Code reviewed ← This is DoD, not acceptance criteria
|
||||
- [ ] No linter warnings ← This is DoD, not acceptance criteria
|
||||
```
|
||||
|
||||
**Good practice:**
|
||||
```markdown
|
||||
## Acceptance Criteria (functional, task-specific)
|
||||
- [ ] CSV upload handles files up to 10MB
|
||||
- [ ] Validation rejects non-CSV formats
|
||||
- [ ] Progress bar updates during upload
|
||||
|
||||
## Definition of Done (quality, universal - referenced, not duplicated)
|
||||
See team DoD checklist (applies to all tasks)
|
||||
```
|
||||
|
||||
## Dependency Management
|
||||
|
||||
**Good dependency usage:**
|
||||
- Technical dependency (feature B needs feature A's code)
|
||||
- Clear ordering (must do A before B)
|
||||
- Unblocks work (completing A unblocks B)
|
||||
|
||||
**Bad dependency usage:**
|
||||
- "Feels like should be done first" (vague)
|
||||
- No technical relationship (just preference)
|
||||
- Circular dependencies (A depends on B depends on A)
|
||||
541
skills/refactoring-safely/SKILL.md
Normal file
541
skills/refactoring-safely/SKILL.md
Normal file
@@ -0,0 +1,541 @@
|
||||
---
|
||||
name: refactoring-safely
|
||||
description: Use when refactoring code - test-preserving transformations in small steps, running tests between each change
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Refactoring changes code structure without changing behavior; tests must stay green throughout or you're rewriting, not refactoring.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
MEDIUM FREEDOM - Follow the change→test→commit cycle strictly, but adapt the specific refactoring patterns to your language and codebase.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Step | Action | Verify |
|
||||
|------|--------|--------|
|
||||
| 1 | Run full test suite | ALL pass |
|
||||
| 2 | Create bd refactoring task | Track work |
|
||||
| 3 | Make ONE small change | Compiles |
|
||||
| 4 | Run tests immediately | ALL still pass |
|
||||
| 5 | Commit with descriptive message | History clear |
|
||||
| 6 | Repeat 3-5 until complete | Each step safe |
|
||||
| 7 | Final verification & close bd | Done |
|
||||
|
||||
**Core cycle:** Change → Test → Commit (repeat until complete)
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
- Improving code structure without changing functionality
|
||||
- Extracting duplicated code into shared utilities
|
||||
- Renaming for clarity
|
||||
- Reorganizing file/module structure
|
||||
- Simplifying complex code while preserving behavior
|
||||
|
||||
**Don't use for:**
|
||||
- Changing functionality (use feature development)
|
||||
- Fixing bugs (use hyperpowers:fixing-bugs)
|
||||
- Adding features while restructuring (do separately)
|
||||
- Code without tests (write tests first using hyperpowers:test-driven-development)
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## 1. Verify Tests Pass
|
||||
|
||||
**BEFORE any refactoring:**
|
||||
|
||||
```bash
|
||||
# Use test-runner agent to keep context clean
|
||||
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
|
||||
```
|
||||
|
||||
**Verify:** ALL tests pass. If any fail, fix them FIRST, then refactor.
|
||||
|
||||
**Why:** Failing tests mean you can't detect if refactoring breaks things.
|
||||
|
||||
---
|
||||
|
||||
## 2. Create bd Task for Refactoring
|
||||
|
||||
Track the refactoring work:
|
||||
|
||||
```bash
|
||||
bd create "Refactor: Extract user validation logic" \
|
||||
--type task \
|
||||
--priority P2
|
||||
|
||||
bd edit bd-456 --design "
|
||||
## Goal
|
||||
Extract user validation logic from UserService into separate Validator class.
|
||||
|
||||
## Why
|
||||
- Validation duplicated across 3 services
|
||||
- Makes testing individual validations difficult
|
||||
- Violates single responsibility
|
||||
|
||||
## Approach
|
||||
1. Create UserValidator class
|
||||
2. Extract email validation
|
||||
3. Extract name validation
|
||||
4. Extract age validation
|
||||
5. Update UserService to use validator
|
||||
6. Remove duplication from other services
|
||||
|
||||
## Success Criteria
|
||||
- All existing tests still pass
|
||||
- No behavior changes
|
||||
- Validator has 100% test coverage
|
||||
"
|
||||
|
||||
bd update bd-456 --status in_progress
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Make ONE Small Change
|
||||
|
||||
The smallest transformation that compiles.
|
||||
|
||||
**Examples of "small":**
|
||||
- Extract one method
|
||||
- Rename one variable
|
||||
- Move one function to different file
|
||||
- Inline one constant
|
||||
- Extract one interface
|
||||
|
||||
**NOT small:**
|
||||
- Extracting multiple methods at once
|
||||
- Renaming + moving + restructuring
|
||||
- "While I'm here" improvements
|
||||
|
||||
**Example:**
|
||||
|
||||
```rust
|
||||
// Before
|
||||
fn create_user(name: &str, email: &str) -> Result<User> {
|
||||
if email.is_empty() {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
if !email.contains('@') {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
|
||||
let user = User { name, email };
|
||||
Ok(user)
|
||||
}
|
||||
|
||||
// After - ONE small change (extract email validation)
|
||||
fn create_user(name: &str, email: &str) -> Result<User> {
|
||||
validate_email(email)?;
|
||||
|
||||
let user = User { name, email };
|
||||
Ok(user)
|
||||
}
|
||||
|
||||
fn validate_email(email: &str) -> Result<()> {
|
||||
if email.is_empty() {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
if !email.contains('@') {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Run Tests Immediately
|
||||
|
||||
After EVERY small change:
|
||||
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
|
||||
```
|
||||
|
||||
**Verify:** ALL tests still pass.
|
||||
|
||||
**If tests fail:**
|
||||
1. STOP
|
||||
2. Undo the change: `git restore src/file.rs`
|
||||
3. Understand why it broke
|
||||
4. Make smaller change
|
||||
5. Try again
|
||||
|
||||
**Never proceed with failing tests.**
|
||||
|
||||
---
|
||||
|
||||
## 5. Commit the Small Change
|
||||
|
||||
Commit each safe transformation:
|
||||
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner agent: "Run: git add src/user_service.rs && git commit -m 'refactor(bd-456): extract email validation to function
|
||||
|
||||
No behavior change. All tests pass.
|
||||
|
||||
Part of bd-456'"
|
||||
```
|
||||
|
||||
**Why commit so often:**
|
||||
- Easy to undo if next step breaks
|
||||
- Clear history of transformations
|
||||
- Can review each step independently
|
||||
- Proves tests passed at each point
|
||||
|
||||
---
|
||||
|
||||
## 6. Repeat Until Complete
|
||||
|
||||
Repeat steps 3-5 for each small transformation:
|
||||
|
||||
```
|
||||
1. Extract validate_email() ✓ (committed)
|
||||
2. Extract validate_name() ✓ (committed)
|
||||
3. Extract validate_age() ✓ (committed)
|
||||
4. Create UserValidator struct ✓ (committed)
|
||||
5. Move validations into UserValidator ✓ (committed)
|
||||
6. Update UserService to use validator ✓ (committed)
|
||||
7. Remove validation from OrderService ✓ (committed)
|
||||
8. Remove validation from AccountService ✓ (committed)
|
||||
```
|
||||
|
||||
**Pattern:** change → test → commit (repeat)
|
||||
|
||||
---
|
||||
|
||||
## 7. Final Verification
|
||||
|
||||
After all transformations complete:
|
||||
|
||||
```bash
|
||||
# Full test suite
|
||||
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
|
||||
|
||||
# Linter
|
||||
Dispatch hyperpowers:test-runner agent: "Run: cargo clippy"
|
||||
```
|
||||
|
||||
**Review the changes:**
|
||||
|
||||
```bash
|
||||
# See all refactoring commits
|
||||
git log --oneline | grep "bd-456"
|
||||
|
||||
# Review full diff
|
||||
git diff main...HEAD
|
||||
```
|
||||
|
||||
**Checklist:**
|
||||
- [ ] All tests pass
|
||||
- [ ] No new warnings
|
||||
- [ ] No behavior changes
|
||||
- [ ] Code is cleaner/simpler
|
||||
- [ ] Each commit is small and safe
|
||||
|
||||
**Close bd task:**
|
||||
|
||||
```bash
|
||||
bd edit bd-456 --design "
|
||||
... (append to existing design)
|
||||
|
||||
## Completed
|
||||
- Created UserValidator class with email, name, age validation
|
||||
- Removed duplicated validation from 3 services
|
||||
- All tests pass (verified)
|
||||
- No behavior changes
|
||||
- 8 small transformations, each tested
|
||||
"
|
||||
|
||||
bd close bd-456
|
||||
```
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer changes behavior while "refactoring"</scenario>
|
||||
|
||||
<code>
|
||||
// Original code
|
||||
fn validate_email(email: &str) -> Result<()> {
|
||||
if email.is_empty() {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
if !email.contains('@') {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
|
||||
// "Refactored" version
|
||||
fn validate_email(email: &str) -> Result<()> {
|
||||
if email.is_empty() {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
if !email.contains('@') {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
// NEW: Added extra validation
|
||||
if !email.contains('.') { // BEHAVIOR CHANGE
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- This changes behavior (now rejects emails like "user@localhost")
|
||||
- Tests might fail, or worse, pass and ship breaking change
|
||||
- Not refactoring - this is modifying functionality
|
||||
- Users who relied on old behavior experience regression
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
1. Extract validation (pure refactoring, no behavior change)
|
||||
2. Commit with tests passing
|
||||
3. THEN add new validation as separate feature with new tests
|
||||
4. Two clear commits: refactoring vs. feature addition
|
||||
|
||||
**What you gain:**
|
||||
- Clear history of what changed when
|
||||
- Easy to revert feature without losing refactoring
|
||||
- Tests document exact behavior changes
|
||||
- No surprises in production
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer does big-bang refactoring</scenario>
|
||||
|
||||
<code>
|
||||
# Changes made all at once:
|
||||
- Renamed 15 functions across 5 files
|
||||
- Extracted 3 new classes
|
||||
- Moved code between 10 files
|
||||
- Reorganized module structure
|
||||
- Updated all import statements
|
||||
|
||||
# Then runs tests
|
||||
$ cargo test
|
||||
... 23 test failures ...
|
||||
|
||||
# Now what? Which change broke what?
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Can't identify which specific change broke tests
|
||||
- Reverting means losing ALL work
|
||||
- Fixing requires re-debugging entire refactoring
|
||||
- Wastes hours trying to untangle failures
|
||||
- Might give up and revert everything
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
1. Rename ONE function → test → commit
|
||||
2. Extract ONE class → test → commit
|
||||
3. Move ONE file → test → commit
|
||||
4. Continue one change at a time
|
||||
|
||||
**If test fails:**
|
||||
- Know exactly which change broke it
|
||||
- Revert ONE commit, not all work
|
||||
- Fix or make smaller change
|
||||
- Continue from known-good state
|
||||
|
||||
**What you gain:**
|
||||
- Tests break → immediately know why
|
||||
- Each commit is reviewable independently
|
||||
- Can stop halfway with useful progress
|
||||
- Confidence from continuous green tests
|
||||
- Clear history for future developers
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer refactors code without tests</scenario>
|
||||
|
||||
<code>
|
||||
// Legacy code with no tests
|
||||
fn process_payment(amount: f64, user_id: i64) -> Result<PaymentId> {
|
||||
// 200 lines of complex payment logic
|
||||
// Multiple edge cases
|
||||
// No tests exist
|
||||
}
|
||||
|
||||
// Developer refactors without tests:
|
||||
// - Extracts 5 methods
|
||||
// - Renames variables
|
||||
// - Simplifies conditionals
|
||||
// - "Looks good to me!"
|
||||
|
||||
// Deploys to production
|
||||
// 💥 Payments fail for amounts over $1000
|
||||
// Edge case handling was accidentally changed
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- No tests to verify behavior preserved
|
||||
- Complex logic has hidden edge cases
|
||||
- Subtle behavior changes go unnoticed
|
||||
- Breaks in production, not development
|
||||
- Costs customer trust and emergency debugging
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
1. **Write tests FIRST** (using hyperpowers:test-driven-development)
|
||||
- Test happy path
|
||||
- Test all edge cases (amounts over $1000, etc.)
|
||||
- Test error conditions
|
||||
- Run tests → all pass (documenting current behavior)
|
||||
|
||||
2. **Then refactor with tests as safety net**
|
||||
- Extract method → run tests → commit
|
||||
- Rename → run tests → commit
|
||||
- Simplify → run tests → commit
|
||||
|
||||
3. **Tests catch any behavior changes immediately**
|
||||
|
||||
**What you gain:**
|
||||
- Confidence behavior is preserved
|
||||
- Edge cases documented in tests
|
||||
- Catches subtle changes before production
|
||||
- Future refactoring is also safe
|
||||
- Tests serve as documentation
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<refactor_vs_rewrite>
|
||||
## When to Refactor
|
||||
|
||||
- Tests exist and pass
|
||||
- Changes are incremental
|
||||
- Business logic stays same
|
||||
- Can transform in small, safe steps
|
||||
- Each step independently valuable
|
||||
|
||||
## When to Rewrite
|
||||
|
||||
- No tests exist (write tests first, then refactor)
|
||||
- Fundamental architecture change needed
|
||||
- Easier to rebuild than modify
|
||||
- Requirements changed significantly
|
||||
- After 3+ failed refactoring attempts
|
||||
|
||||
**Rule:** If you need to change test assertions (not just add tests), you're rewriting, not refactoring.
|
||||
|
||||
## Strangler Fig Pattern (Hybrid)
|
||||
|
||||
**When to use:**
|
||||
- Need to replace legacy system but can't tolerate downtime
|
||||
- Want incremental migration with continuous monitoring
|
||||
- System too large to refactor in one go
|
||||
|
||||
**How it works:**
|
||||
|
||||
1. **Transform:** Create modernized components alongside legacy
|
||||
2. **Coexist:** Both systems run in parallel (façade routes requests)
|
||||
3. **Eliminate:** Retire old functionality piece by piece
|
||||
|
||||
**Example:**
|
||||
|
||||
```
|
||||
Legacy: Monolithic user service (50K LOC)
|
||||
Goal: Microservices architecture
|
||||
|
||||
Step 1 (Transform):
|
||||
- Create new UserService microservice
|
||||
- Implement user creation endpoint
|
||||
- Tests pass in isolation
|
||||
|
||||
Step 2 (Coexist):
|
||||
- Add routing layer (façade)
|
||||
- Route POST /users to new service
|
||||
- Route GET /users to legacy service (for now)
|
||||
- Monitor both, compare results
|
||||
|
||||
Step 3 (Eliminate):
|
||||
- Once confident, migrate GET /users to new service
|
||||
- Remove user creation from legacy
|
||||
- Repeat for remaining endpoints
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- Incremental replacement reduces risk
|
||||
- Legacy continues operating during transition
|
||||
- Can pause/rollback at any point
|
||||
- Each migration step is independently valuable
|
||||
|
||||
**Use refactoring within components, Strangler Fig for replacing systems.**
|
||||
</refactor_vs_rewrite>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Tests must stay green** throughout refactoring → If they fail, you changed behavior (stop and undo)
|
||||
2. **Commit after each small change** → Large commits hide which change broke what
|
||||
3. **One transformation at a time** → Multiple changes = impossible to debug failures
|
||||
4. **Run tests after EVERY change** → Delayed testing doesn't tell you which change broke it
|
||||
5. **If tests fail 3+ times, question approach** → Might need to rewrite instead, or add tests first
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **Stop and return to the change→test→commit cycle**
|
||||
|
||||
- "Small refactoring, don't need tests between steps"
|
||||
- "I'll test at the end"
|
||||
- "Tests are slow, I'll run once at the end"
|
||||
- "Just fixing bugs while refactoring" (bug fixes = behavior changes = not refactoring)
|
||||
- "Easier to do all at once"
|
||||
- "I know it works without tests"
|
||||
- "While I'm here, I'll also..." (scope creep during refactoring)
|
||||
- "Tests will fail temporarily but I'll fix them" (tests must stay green)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before marking refactoring complete:
|
||||
|
||||
- [ ] All tests pass (verified with hyperpowers:test-runner agent)
|
||||
- [ ] No new linter warnings
|
||||
- [ ] No behavior changes introduced
|
||||
- [ ] Code is cleaner/simpler than before
|
||||
- [ ] Each commit in history is small and safe
|
||||
- [ ] bd task documents what was done and why
|
||||
- [ ] Can explain what each transformation did
|
||||
|
||||
**Can't check all boxes?** Return to process and fix before closing bd task.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill requires:**
|
||||
- hyperpowers:test-driven-development (for writing tests before refactoring if none exist)
|
||||
- hyperpowers:verification-before-completion (for final verification)
|
||||
- hyperpowers:test-runner agent (for running tests without context pollution)
|
||||
|
||||
**This skill is called by:**
|
||||
- General development workflows when improving code structure
|
||||
- After features are complete and working
|
||||
- When preparing code for new features
|
||||
|
||||
**Agents used:**
|
||||
- test-runner (runs tests/commits without polluting main context)
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Common refactoring patterns](resources/refactoring-patterns.md) - Extract Method, Extract Class, Inline, etc.
|
||||
- [Complete refactoring session example](resources/example-session.md) - Minute-by-minute walkthrough
|
||||
|
||||
**When stuck:**
|
||||
- Tests fail after change → Undo (git restore), make smaller change
|
||||
- 3+ failures → Question if refactoring is right approach, consider rewrite
|
||||
- No tests exist → Use hyperpowers:test-driven-development to write tests first
|
||||
- Unsure how small → If it touches more than one function/file, it's too big
|
||||
</resources>
|
||||
78
skills/refactoring-safely/resources/example-session.md
Normal file
78
skills/refactoring-safely/resources/example-session.md
Normal file
@@ -0,0 +1,78 @@
|
||||
## Example: Complete Refactoring Session
|
||||
|
||||
**Goal:** Extract validation logic from UserService
|
||||
|
||||
**Time: 60 minutes**
|
||||
|
||||
### Minutes 0-5: Verify Tests Pass
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo test"
|
||||
Result: ✓ 234 tests pass
|
||||
```
|
||||
|
||||
### Minutes 5-10: Create bd Task
|
||||
```bash
|
||||
bd create "Refactor: Extract user validation" --type task
|
||||
bd edit bd-456 --design "Extract validation to UserValidator class..."
|
||||
bd update bd-456 --status in_progress
|
||||
```
|
||||
|
||||
### Minutes 10-15: Step 1 - Extract email validation function
|
||||
```rust
|
||||
// Extract validate_email()
|
||||
```
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo test"
|
||||
Result: ✓ 234 tests pass
|
||||
git commit -m "refactor(bd-456): extract email validation"
|
||||
```
|
||||
|
||||
### Minutes 15-20: Step 2 - Extract name validation function
|
||||
```rust
|
||||
// Extract validate_name()
|
||||
```
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo test"
|
||||
Result: ✓ 234 tests pass
|
||||
git commit -m "refactor(bd-456): extract name validation"
|
||||
```
|
||||
|
||||
### Minutes 20-25: Step 3 - Create UserValidator struct
|
||||
```rust
|
||||
struct UserValidator { /* empty */ }
|
||||
impl UserValidator { /* empty */ }
|
||||
```
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo test"
|
||||
Result: ✓ 234 tests pass
|
||||
git commit -m "refactor(bd-456): create UserValidator struct"
|
||||
```
|
||||
|
||||
### Minutes 25-35: Steps 4-6 - Move validations to UserValidator
|
||||
Each step: move one method, test, commit
|
||||
|
||||
### Minutes 35-45: Step 7 - Update UserService to use validator
|
||||
```rust
|
||||
// Use UserValidator instead of inline validation
|
||||
```
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo test"
|
||||
Result: ✓ 234 tests pass
|
||||
git commit -m "refactor(bd-456): use UserValidator in UserService"
|
||||
```
|
||||
|
||||
### Minutes 45-55: Step 8 - Remove duplication from other services
|
||||
Each service: one change, test, commit
|
||||
|
||||
### Minutes 55-60: Final verification and close
|
||||
```bash
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo test"
|
||||
Result: ✓ 234 tests pass
|
||||
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo clippy"
|
||||
Result: ✓ No warnings
|
||||
|
||||
bd close bd-456
|
||||
```
|
||||
|
||||
**Result:** Refactoring complete, 8 safe commits, all tests green throughout.
|
||||
121
skills/refactoring-safely/resources/refactoring-patterns.md
Normal file
121
skills/refactoring-safely/resources/refactoring-patterns.md
Normal file
@@ -0,0 +1,121 @@
|
||||
## Common Refactoring Patterns
|
||||
|
||||
### Extract Method
|
||||
|
||||
**When:** Duplicated code or long function
|
||||
|
||||
```rust
|
||||
// Before: Long function
|
||||
fn process(data: Vec<i32>) -> i32 {
|
||||
let mut sum = 0;
|
||||
for x in data {
|
||||
sum += x * x;
|
||||
}
|
||||
sum
|
||||
}
|
||||
|
||||
// After: Extracted method
|
||||
fn process(data: Vec<i32>) -> i32 {
|
||||
data.iter().map(|x| square(x)).sum()
|
||||
}
|
||||
|
||||
fn square(x: &i32) -> i32 {
|
||||
x * x
|
||||
}
|
||||
```
|
||||
|
||||
**Steps:**
|
||||
1. Extract square() function
|
||||
2. Run tests
|
||||
3. Commit
|
||||
4. Replace loop with iterator
|
||||
5. Run tests
|
||||
6. Commit
|
||||
|
||||
### Rename Variable/Function
|
||||
|
||||
**When:** Name is unclear or misleading
|
||||
|
||||
```rust
|
||||
// Before
|
||||
fn calc(d: Vec<i32>) -> f64 {
|
||||
let s: i32 = d.iter().sum();
|
||||
s as f64 / d.len() as f64
|
||||
}
|
||||
|
||||
// After - Step by step
|
||||
// Step 1: Rename function
|
||||
fn calculate_average(d: Vec<i32>) -> f64 { ... } // Test, commit
|
||||
|
||||
// Step 2: Rename parameter
|
||||
fn calculate_average(data: Vec<i32>) -> f64 { ... } // Test, commit
|
||||
|
||||
// Step 3: Rename variable
|
||||
fn calculate_average(data: Vec<i32>) -> f64 {
|
||||
let sum: i32 = data.iter().sum(); // Test, commit
|
||||
sum as f64 / data.len() as f64
|
||||
}
|
||||
```
|
||||
|
||||
### Extract Class/Struct
|
||||
|
||||
**When:** Class has multiple responsibilities
|
||||
|
||||
```rust
|
||||
// Before: God object
|
||||
struct UserService {
|
||||
db: Database,
|
||||
email_validator: Regex,
|
||||
name_validator: Regex,
|
||||
}
|
||||
|
||||
// After: Single responsibility
|
||||
struct UserService {
|
||||
db: Database,
|
||||
validator: UserValidator, // EXTRACTED
|
||||
}
|
||||
|
||||
struct UserValidator {
|
||||
email_pattern: Regex,
|
||||
name_pattern: Regex,
|
||||
}
|
||||
```
|
||||
|
||||
**Steps:**
|
||||
1. Create empty UserValidator struct
|
||||
2. Test, commit
|
||||
3. Move email_validator field
|
||||
4. Test, commit
|
||||
5. Move name_validator field
|
||||
6. Test, commit
|
||||
7. Update UserService to use UserValidator
|
||||
8. Test, commit
|
||||
|
||||
### Inline Unnecessary Abstraction
|
||||
|
||||
**When:** Abstraction adds no value
|
||||
|
||||
```rust
|
||||
// Before: Pointless wrapper
|
||||
fn get_user_email(user: &User) -> &str {
|
||||
&user.email
|
||||
}
|
||||
|
||||
fn process() {
|
||||
let email = get_user_email(&user); // Just use user.email!
|
||||
}
|
||||
|
||||
// After: Inline
|
||||
fn process() {
|
||||
let email = &user.email;
|
||||
}
|
||||
```
|
||||
|
||||
**Steps:**
|
||||
1. Replace one call site with direct access
|
||||
2. Test, commit
|
||||
3. Replace next call site
|
||||
4. Test, commit
|
||||
5. Remove wrapper function
|
||||
6. Test, commit
|
||||
|
||||
646
skills/review-implementation/SKILL.md
Normal file
646
skills/review-implementation/SKILL.md
Normal file
@@ -0,0 +1,646 @@
|
||||
---
|
||||
name: review-implementation
|
||||
description: Use after hyperpowers:executing-plans completes all tasks - verifies implementation against bd spec, all success criteria met, anti-patterns avoided
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Review completed implementation against bd epic to catch gaps before claiming completion; spec is contract, implementation must fulfill contract completely.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - Follow the 4-step review process exactly. Review with Google Fellow-level scrutiny. Never skip automated checks, quality gates, or code reading. No approval without evidence for every criterion.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Step | Action | Deliverable |
|
||||
|------|--------|-------------|
|
||||
| 1 | Load bd epic + all tasks | TodoWrite with tasks to review |
|
||||
| 2 | Review each task (automated checks, quality gates, read code, verify criteria) | Findings per task |
|
||||
| 3 | Report findings (approved / gaps found) | Review decision |
|
||||
| 4 | Gate: If approved → finishing-a-development-branch, If gaps → STOP | Next action |
|
||||
|
||||
**Review Perspective:** Google Fellow-level SRE with 20+ years experience reviewing junior engineer code.
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
- hyperpowers:executing-plans completed all tasks
|
||||
- Before claiming work is complete
|
||||
- Before hyperpowers:finishing-a-development-branch
|
||||
- Want to verify implementation matches spec
|
||||
|
||||
**Don't use for:**
|
||||
- Mid-implementation (use hyperpowers:executing-plans)
|
||||
- Before all tasks done
|
||||
- Code reviews of external PRs (this is self-review)
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## Step 1: Load Epic Specification
|
||||
|
||||
**Announce:** "I'm using hyperpowers:review-implementation to verify implementation matches spec. Reviewing with Google Fellow-level scrutiny."
|
||||
|
||||
**Get epic and tasks:**
|
||||
|
||||
```bash
|
||||
bd show bd-1 # Epic specification
|
||||
bd dep tree bd-1 # Task tree
|
||||
bd list --parent bd-1 # All tasks
|
||||
```
|
||||
|
||||
**Create TodoWrite tracker:**
|
||||
|
||||
```
|
||||
TodoWrite todos:
|
||||
- Review bd-2: Task Name
|
||||
- Review bd-3: Task Name
|
||||
- Review bd-4: Task Name
|
||||
- Compile findings and make decision
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Review Each Task
|
||||
|
||||
For each task:
|
||||
|
||||
### A. Read Task Specification
|
||||
|
||||
```bash
|
||||
bd show bd-3
|
||||
```
|
||||
|
||||
Extract:
|
||||
- Goal (what problem solved?)
|
||||
- Success criteria (how verify done?)
|
||||
- Implementation checklist (files/functions/tests)
|
||||
- Key considerations (edge cases)
|
||||
- Anti-patterns (prohibited patterns)
|
||||
|
||||
---
|
||||
|
||||
### B. Run Automated Code Completeness Checks
|
||||
|
||||
```bash
|
||||
# TODOs/FIXMEs without issue numbers
|
||||
rg -i "todo|fixme" src/ tests/ || echo "✅ None"
|
||||
|
||||
# Stub implementations
|
||||
rg "unimplemented!|todo!|unreachable!|panic!\(\"not implemented" src/ || echo "✅ None"
|
||||
|
||||
# Unsafe patterns in production
|
||||
rg "\.unwrap\(\)|\.expect\(" src/ | grep -v "/tests/" || echo "✅ None"
|
||||
|
||||
# Ignored/skipped tests
|
||||
rg "#\[ignore\]|#\[skip\]|\.skip\(\)" tests/ src/ || echo "✅ None"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### C. Run Quality Gates (via test-runner agent)
|
||||
|
||||
**IMPORTANT:** Use hyperpowers:test-runner agent to avoid context pollution.
|
||||
|
||||
```
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo test"
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo fmt --check"
|
||||
Dispatch hyperpowers:test-runner: "Run: cargo clippy -- -D warnings"
|
||||
Dispatch hyperpowers:test-runner: "Run: .git/hooks/pre-commit"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### D. Read Implementation Files
|
||||
|
||||
**CRITICAL:** READ actual files, not just git diff.
|
||||
|
||||
```bash
|
||||
# See changes
|
||||
git diff main...HEAD -- src/auth/jwt.ts
|
||||
|
||||
# THEN READ FULL FILE
|
||||
Read tool: src/auth/jwt.ts
|
||||
```
|
||||
|
||||
**While reading, check:**
|
||||
- ✅ Code implements checklist items (not stubs)
|
||||
- ✅ Error handling uses proper patterns (Result, try/catch)
|
||||
- ✅ Edge cases from "Key Considerations" handled
|
||||
- ✅ Code is clear and maintainable
|
||||
- ✅ No anti-patterns present
|
||||
|
||||
---
|
||||
|
||||
### E. Code Quality Review (Google Fellow Perspective)
|
||||
|
||||
**Assume code written by junior engineer. Apply production-grade scrutiny.**
|
||||
|
||||
**Error Handling:**
|
||||
- Proper use of Result/Option or try/catch?
|
||||
- Error messages helpful for production debugging?
|
||||
- No unwrap/expect in production?
|
||||
- Errors propagate with context?
|
||||
- Failure modes graceful?
|
||||
|
||||
**Safety:**
|
||||
- No unsafe blocks without justification?
|
||||
- Proper bounds checking?
|
||||
- No potential panics?
|
||||
- No data races?
|
||||
- No SQL injection, XSS vulnerabilities?
|
||||
|
||||
**Clarity:**
|
||||
- Would junior understand in 6 months?
|
||||
- Single responsibility per function?
|
||||
- Descriptive variable names?
|
||||
- Complex logic explained?
|
||||
- No clever tricks - obvious and boring?
|
||||
|
||||
**Testing:**
|
||||
- Edge cases covered (empty, max, Unicode)?
|
||||
- Tests meaningful, not just coverage?
|
||||
- Test names describe what verified?
|
||||
- Tests test behavior, not implementation?
|
||||
- Failure scenarios tested?
|
||||
|
||||
**Production Readiness:**
|
||||
- Comfortable deploying to production?
|
||||
- Could cause outage or data loss?
|
||||
- Performance acceptable under load?
|
||||
- Logging sufficient for debugging?
|
||||
|
||||
---
|
||||
|
||||
### F. Verify Success Criteria with Evidence
|
||||
|
||||
For EACH criterion in bd task:
|
||||
- Run verification command
|
||||
- Check actual output
|
||||
- Don't assume - verify with evidence
|
||||
- Use hyperpowers:test-runner for tests/lints
|
||||
|
||||
**Example:**
|
||||
|
||||
```
|
||||
Criterion: "All tests passing"
|
||||
Command: cargo test
|
||||
Evidence: "127 tests passed, 0 failures"
|
||||
Result: ✅ Met
|
||||
|
||||
Criterion: "No unwrap in production"
|
||||
Command: rg "\.unwrap\(\)" src/
|
||||
Evidence: "No matches"
|
||||
Result: ✅ Met
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### G. Check Anti-Patterns
|
||||
|
||||
Search for each prohibited pattern from bd task:
|
||||
|
||||
```bash
|
||||
# Example anti-patterns from task
|
||||
rg "\.unwrap\(\)" src/ # If task prohibits unwrap
|
||||
rg "TODO" src/ # If task prohibits untracked TODOs
|
||||
rg "\.skip\(\)" tests/ # If task prohibits skipped tests
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### H. Verify Key Considerations
|
||||
|
||||
Read code to confirm edge cases handled:
|
||||
- Empty input validation
|
||||
- Unicode handling
|
||||
- Concurrent access
|
||||
- Failure modes
|
||||
- Performance concerns
|
||||
|
||||
**Example:** Task says "Must handle empty payload" → Find validation code for empty payload.
|
||||
|
||||
---
|
||||
|
||||
### I. Record Findings
|
||||
|
||||
```markdown
|
||||
### Task: bd-3 - Implement JWT authentication
|
||||
|
||||
#### Automated Checks
|
||||
- TODOs: ✅ None
|
||||
- Stubs: ✅ None
|
||||
- Unsafe patterns: ❌ Found `.unwrap()` at src/auth/jwt.ts:45
|
||||
- Ignored tests: ✅ None
|
||||
|
||||
#### Quality Gates
|
||||
- Tests: ✅ Pass (127 tests)
|
||||
- Formatting: ✅ Pass
|
||||
- Linting: ❌ 3 warnings
|
||||
- Pre-commit: ❌ Fails due to linting
|
||||
|
||||
#### Files Reviewed
|
||||
- src/auth/jwt.ts: ⚠️ Contains `.unwrap()` at line 45
|
||||
- tests/auth/jwt_test.rs: ✅ Complete
|
||||
|
||||
#### Code Quality
|
||||
- Error Handling: ⚠️ Uses unwrap instead of proper error propagation
|
||||
- Safety: ✅ Good
|
||||
- Clarity: ✅ Good
|
||||
- Testing: ✅ Good
|
||||
|
||||
#### Success Criteria
|
||||
1. "All tests pass": ✅ Met - Evidence: 127 tests passed
|
||||
2. "Pre-commit passes": ❌ Not met - Evidence: clippy warnings
|
||||
3. "No unwrap in production": ❌ Not met - Evidence: Found at jwt.ts:45
|
||||
|
||||
#### Anti-Patterns
|
||||
- "NO unwrap in production": ❌ Violated at src/auth/jwt.ts:45
|
||||
|
||||
#### Issues
|
||||
**Critical:**
|
||||
1. unwrap() at jwt.ts:45 - violates anti-pattern, must use proper error handling
|
||||
|
||||
**Important:**
|
||||
2. 3 clippy warnings block pre-commit hook
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### J. Mark Task Reviewed (TodoWrite)
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Report Findings
|
||||
|
||||
After reviewing ALL tasks:
|
||||
|
||||
**If NO gaps:**
|
||||
|
||||
```markdown
|
||||
## Implementation Review: APPROVED ✅
|
||||
|
||||
Reviewed bd-1 (OAuth Authentication) against implementation.
|
||||
|
||||
### Tasks Reviewed
|
||||
- bd-2: Configure OAuth provider ✅
|
||||
- bd-3: Implement token exchange ✅
|
||||
- bd-4: Add refresh logic ✅
|
||||
|
||||
### Verification Summary
|
||||
- All success criteria verified
|
||||
- No anti-patterns detected
|
||||
- All key considerations addressed
|
||||
- All files implemented per spec
|
||||
|
||||
### Evidence
|
||||
- Tests: 127 passed, 0 failures (2.3s)
|
||||
- Linting: No warnings
|
||||
- Pre-commit: Pass
|
||||
- Code review: Production-ready
|
||||
|
||||
Ready to proceed to hyperpowers:finishing-a-development-branch.
|
||||
```
|
||||
|
||||
**If gaps found:**
|
||||
|
||||
```markdown
|
||||
## Implementation Review: GAPS FOUND ❌
|
||||
|
||||
Reviewed bd-1 (OAuth Authentication) against implementation.
|
||||
|
||||
### Tasks with Gaps
|
||||
|
||||
#### bd-3: Implement token exchange
|
||||
**Gaps:**
|
||||
- ❌ Success criterion not met: "Pre-commit hooks pass"
|
||||
- Evidence: cargo clippy shows 3 warnings
|
||||
- ❌ Anti-pattern violation: Found `.unwrap()` at src/auth/jwt.ts:45
|
||||
- ⚠️ Key consideration not addressed: "Empty payload validation"
|
||||
- No check for empty payload in generateToken()
|
||||
|
||||
#### bd-4: Add refresh logic
|
||||
**Gaps:**
|
||||
- ❌ Success criterion not met: "All tests passing"
|
||||
- Evidence: test_verify_expired_token failing
|
||||
|
||||
### Cannot Proceed
|
||||
Implementation does not match spec. Fix gaps before completing.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Gate Decision
|
||||
|
||||
**If APPROVED:**
|
||||
```
|
||||
Announce: "I'm using hyperpowers:finishing-a-development-branch to complete this work."
|
||||
|
||||
Use Skill tool: hyperpowers:finishing-a-development-branch
|
||||
```
|
||||
|
||||
**If GAPS FOUND:**
|
||||
```
|
||||
STOP. Do not proceed to finishing-a-development-branch.
|
||||
Fix gaps or discuss with partner.
|
||||
Re-run review after fixes.
|
||||
```
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer only checks git diff, doesn't read actual files</scenario>
|
||||
|
||||
<code>
|
||||
# Review process
|
||||
git diff main...HEAD # Shows changes
|
||||
|
||||
# Developer sees:
|
||||
+ function generateToken(payload) {
|
||||
+ return jwt.sign(payload, secret);
|
||||
+ }
|
||||
|
||||
# Approves based on diff
|
||||
"Looks good, token generation implemented ✅"
|
||||
|
||||
# Misses: Full context shows no validation
|
||||
function generateToken(payload) {
|
||||
// No validation of payload!
|
||||
// No check for empty payload (key consideration)
|
||||
// No error handling if jwt.sign fails
|
||||
return jwt.sign(payload, secret);
|
||||
}
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Git diff shows additions, not full context
|
||||
- Missed that empty payload not validated (key consideration)
|
||||
- Missed that error handling missing (quality issue)
|
||||
- False approval - gaps exist but not caught
|
||||
- Will fail in production when empty payload passed
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct review process:**
|
||||
|
||||
```bash
|
||||
# See changes
|
||||
git diff main...HEAD -- src/auth/jwt.ts
|
||||
|
||||
# THEN READ FULL FILE
|
||||
Read tool: src/auth/jwt.ts
|
||||
```
|
||||
|
||||
**Reading full file reveals:**
|
||||
```javascript
|
||||
function generateToken(payload) {
|
||||
// Missing: empty payload check (key consideration from bd task)
|
||||
// Missing: error handling for jwt.sign failure
|
||||
return jwt.sign(payload, secret);
|
||||
}
|
||||
```
|
||||
|
||||
**Record in findings:**
|
||||
```
|
||||
⚠️ Key consideration not addressed: "Empty payload validation"
|
||||
- No check for empty payload in generateToken()
|
||||
- Code at src/auth/jwt.ts:15-17
|
||||
|
||||
⚠️ Error handling: jwt.sign can throw, not handled
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Caught gaps that git diff missed
|
||||
- Full context reveals missing validation
|
||||
- Quality issues identified before production
|
||||
- Spec compliance verified, not assumed
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer assumes tests passing means done</scenario>
|
||||
|
||||
<code>
|
||||
# Run tests
|
||||
cargo test
|
||||
# Output: 127 tests passed
|
||||
|
||||
# Developer concludes
|
||||
"Tests pass, implementation complete ✅"
|
||||
|
||||
# Proceeds to finishing-a-development-branch
|
||||
|
||||
# Misses:
|
||||
- bd task has 5 success criteria
|
||||
- Only checked 1 (tests pass)
|
||||
- Anti-pattern: unwrap() present (prohibited)
|
||||
- Key consideration: Unicode handling not tested
|
||||
- Linter has warnings (blocks pre-commit)
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Tests passing ≠ spec compliance
|
||||
- Didn't verify all success criteria
|
||||
- Didn't check anti-patterns
|
||||
- Didn't verify key considerations
|
||||
- Pre-commit will fail (blocks merge)
|
||||
- Ships code violating anti-patterns
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct review checks ALL criteria:**
|
||||
|
||||
```markdown
|
||||
bd task has 5 success criteria:
|
||||
1. "All tests pass" ✅ - Evidence: 127 passed
|
||||
2. "Pre-commit passes" ❌ - Evidence: clippy warns (3 warnings)
|
||||
3. "No unwrap in production" ❌ - Evidence: Found at jwt.ts:45
|
||||
4. "Unicode handling tested" ⚠️ - Need to verify test exists
|
||||
5. "Rate limiting implemented" ⚠️ - Need to check code
|
||||
|
||||
Result: 1/5 criteria verified met. GAPS EXIST.
|
||||
```
|
||||
|
||||
**Run additional checks:**
|
||||
```bash
|
||||
# Check criterion 2
|
||||
cargo clippy
|
||||
# 3 warnings found ❌
|
||||
|
||||
# Check criterion 3
|
||||
rg "\.unwrap\(\)" src/
|
||||
# src/auth/jwt.ts:45 ❌
|
||||
|
||||
# Check criterion 4
|
||||
rg "unicode" tests/
|
||||
# No matches ⚠️ Need to verify
|
||||
```
|
||||
|
||||
**Decision: GAPS FOUND, cannot proceed**
|
||||
|
||||
**What you gain:**
|
||||
- Verified ALL criteria, not just tests
|
||||
- Caught anti-pattern violations
|
||||
- Caught pre-commit blockers
|
||||
- Prevented shipping non-compliant code
|
||||
- Spec contract honored completely
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer rationalizes skipping rigor for "simple" task</scenario>
|
||||
|
||||
<code>
|
||||
bd task: "Add logging to error paths"
|
||||
|
||||
# Developer thinks: "Simple task, just added console.log"
|
||||
# Skips:
|
||||
- Automated checks (assumes no issues)
|
||||
- Code quality review (seems obvious)
|
||||
- Full success criteria verification
|
||||
|
||||
# Approves quickly:
|
||||
"Logging added ✅"
|
||||
|
||||
# Misses:
|
||||
- console.log used instead of proper logger (anti-pattern)
|
||||
- Only added to 2 of 5 error paths (incomplete)
|
||||
- No test verifying logs actually output (criterion)
|
||||
- Logs contain sensitive data (security issue)
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- "Simple" tasks have hidden complexity
|
||||
- Skipped rigor catches exactly these issues
|
||||
- Incomplete implementation (2/5 paths)
|
||||
- Security vulnerability shipped
|
||||
- Anti-pattern not caught
|
||||
- Failed success criterion (test logs)
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Follow full review process:**
|
||||
|
||||
```bash
|
||||
# Automated checks
|
||||
rg "console\.log" src/
|
||||
# Found at error-handler.ts:12, 15 ⚠️
|
||||
|
||||
# Read bd task
|
||||
bd show bd-5
|
||||
|
||||
# Success criteria:
|
||||
# 1. "All error paths logged"
|
||||
# 2. "No sensitive data in logs"
|
||||
# 3. "Test verifies log output"
|
||||
|
||||
# Check criterion 1
|
||||
grep -n "throw new Error" src/
|
||||
# 5 locations found
|
||||
# Only 2 have logging ❌ Incomplete
|
||||
|
||||
# Check criterion 2
|
||||
Read tool: src/error-handler.ts
|
||||
# Logs contain password field ❌ Security issue
|
||||
|
||||
# Check criterion 3
|
||||
rg "test.*log" tests/
|
||||
# No matches ❌ Test missing
|
||||
```
|
||||
|
||||
**Decision: GAPS FOUND**
|
||||
- Incomplete (3/5 error paths missing logs)
|
||||
- Security issue (logs password)
|
||||
- Anti-pattern (console.log instead of logger)
|
||||
- Missing test
|
||||
|
||||
**What you gain:**
|
||||
- "Simple" task revealed multiple gaps
|
||||
- Security vulnerability caught pre-production
|
||||
- Rigor prevents incomplete work shipping
|
||||
- All criteria must be met, no exceptions
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Review every task** → No skipping "simple" tasks
|
||||
2. **Run all automated checks** → TODOs, stubs, unwrap, ignored tests
|
||||
3. **Read actual files with Read tool** → Not just git diff
|
||||
4. **Verify every success criterion** → With evidence, not assumptions
|
||||
5. **Check all anti-patterns** → Search for prohibited patterns
|
||||
6. **Apply Google Fellow scrutiny** → Production-grade code review
|
||||
7. **If gaps found → STOP** → Don't proceed to finishing-a-development-branch
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Follow full review process.**
|
||||
|
||||
- "Tests pass, must be complete" (Tests ≠ spec, check all criteria)
|
||||
- "I implemented it, it's done" (Implementation ≠ compliance, verify)
|
||||
- "No time for thorough review" (Gaps later cost more than review now)
|
||||
- "Looks good to me" (Opinion ≠ evidence, run verifications)
|
||||
- "Small gaps don't matter" (Spec is contract, all criteria matter)
|
||||
- "Will fix in next PR" (This PR completes this epic, fix now)
|
||||
- "Can check diff instead of files" (Diff shows changes, not context)
|
||||
- "Automated checks cover it" (Checks + code review both required)
|
||||
- "Success criteria passing means done" (Also check anti-patterns, quality, edge cases)
|
||||
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before approving implementation:
|
||||
|
||||
**Per task:**
|
||||
- [ ] Read bd task specification completely
|
||||
- [ ] Ran all automated checks (TODOs, stubs, unwrap, ignored tests)
|
||||
- [ ] Ran all quality gates via test-runner agent (tests, format, lint, pre-commit)
|
||||
- [ ] Read actual implementation files with Read tool (not just diff)
|
||||
- [ ] Reviewed code quality with Google Fellow perspective
|
||||
- [ ] Verified every success criterion with evidence
|
||||
- [ ] Checked every anti-pattern (searched for prohibited patterns)
|
||||
- [ ] Verified every key consideration addressed in code
|
||||
|
||||
**Overall:**
|
||||
- [ ] Reviewed ALL tasks (no exceptions)
|
||||
- [ ] TodoWrite tracker shows all tasks reviewed
|
||||
- [ ] Compiled findings (approved or gaps)
|
||||
- [ ] If approved: all criteria met for all tasks
|
||||
- [ ] If gaps: documented exactly what missing
|
||||
|
||||
**Can't check all boxes?** Return to Step 2 and complete review.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill is called by:**
|
||||
- hyperpowers:executing-plans (Step 5, after all tasks executed)
|
||||
|
||||
**This skill calls:**
|
||||
- hyperpowers:finishing-a-development-branch (if approved)
|
||||
- hyperpowers:test-runner agent (for quality gates)
|
||||
|
||||
**This skill uses:**
|
||||
- hyperpowers:verification-before-completion principles (evidence before claims)
|
||||
|
||||
**Call chain:**
|
||||
```
|
||||
hyperpowers:executing-plans → hyperpowers:review-implementation → hyperpowers:finishing-a-development-branch
|
||||
↓
|
||||
(if gaps: STOP)
|
||||
```
|
||||
|
||||
**CRITICAL:** Use bd commands (bd show, bd list, bd dep tree), never read `.beads/issues.jsonl` directly.
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Code quality standards by language](resources/quality-standards.md)
|
||||
- [Common anti-patterns to check](resources/anti-patterns-reference.md)
|
||||
- [Production readiness checklist](resources/production-checklist.md)
|
||||
|
||||
**When stuck:**
|
||||
- Unsure if gap critical → If violates criterion, it's a gap
|
||||
- Criteria ambiguous → Ask user for clarification before approving
|
||||
- Anti-pattern unclear → Search for it, document if found
|
||||
- Quality concern → Document as gap, don't rationalize away
|
||||
</resources>
|
||||
566
skills/root-cause-tracing/SKILL.md
Normal file
566
skills/root-cause-tracing/SKILL.md
Normal file
@@ -0,0 +1,566 @@
|
||||
---
|
||||
name: root-cause-tracing
|
||||
description: Use when errors occur deep in execution - traces bugs backward through call stack to find original trigger, not just symptom
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Bugs manifest deep in the call stack; trace backward until you find the original trigger, then fix at source, not where error appears.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
MEDIUM FREEDOM - Follow the backward tracing process strictly, but adapt instrumentation and debugging techniques to your language and tools.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Step | Action | Question |
|
||||
|------|--------|----------|
|
||||
| 1 | Read error completely | What failed and where? |
|
||||
| 2 | Find immediate cause | What code directly threw this? |
|
||||
| 3 | Trace backward one level | What called this code? |
|
||||
| 4 | Keep tracing up stack | What called that? |
|
||||
| 5 | Find where bad data originated | Where was invalid value created? |
|
||||
| 6 | Fix at source | Address root cause |
|
||||
| 7 | Add defense at each layer | Validate assumptions as backup |
|
||||
|
||||
**Core rule:** Never fix just where error appears. Fix where problem originates.
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
- Error happens deep in execution (not at entry point)
|
||||
- Stack trace shows long call chain
|
||||
- Unclear where invalid data originated
|
||||
- Need to find which test/code triggers problem
|
||||
- Error message points to utility/library code
|
||||
|
||||
**Example symptoms:**
|
||||
- "Database rejects empty string" ← Where did empty string come from?
|
||||
- "File not found: ''" ← Why is path empty?
|
||||
- "Invalid argument to function" ← Who passed invalid argument?
|
||||
- "Null pointer dereference" ← What should have been initialized?
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## 1. Observe the Symptom
|
||||
|
||||
Read the complete error:
|
||||
|
||||
```
|
||||
Error: Invalid email format: ""
|
||||
at validateEmail (validator.ts:42)
|
||||
at UserService.create (user-service.ts:18)
|
||||
at ApiHandler.createUser (api-handler.ts:67)
|
||||
at HttpServer.handleRequest (server.ts:123)
|
||||
at TestCase.test_create_user (user.test.ts:10)
|
||||
```
|
||||
|
||||
**Symptom:** Email validation fails on empty string
|
||||
**Location:** Deep in validator utility
|
||||
|
||||
**DON'T fix here yet.** This might be symptom, not source.
|
||||
|
||||
---
|
||||
|
||||
## 2. Find Immediate Cause
|
||||
|
||||
What code directly causes this?
|
||||
|
||||
```typescript
|
||||
// validator.ts:42
|
||||
function validateEmail(email: string): boolean {
|
||||
if (!email) throw new Error(`Invalid email format: "${email}"`);
|
||||
return EMAIL_REGEX.test(email);
|
||||
}
|
||||
```
|
||||
|
||||
**Question:** Why is email empty? Keep tracing.
|
||||
|
||||
---
|
||||
|
||||
## 3. Trace Backward: What Called This?
|
||||
|
||||
Use stack trace:
|
||||
|
||||
```typescript
|
||||
// user-service.ts:18
|
||||
create(request: UserRequest): User {
|
||||
validateEmail(request.email); // Called with request.email = ""
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
**Question:** Why is `request.email` empty? Keep tracing.
|
||||
|
||||
---
|
||||
|
||||
## 4. Keep Tracing Up the Stack
|
||||
|
||||
```typescript
|
||||
// api-handler.ts:67
|
||||
async createUser(req: Request): Promise<Response> {
|
||||
const userRequest = {
|
||||
name: req.body.name,
|
||||
email: req.body.email || "", // ← FOUND IT!
|
||||
};
|
||||
return this.userService.create(userRequest);
|
||||
}
|
||||
```
|
||||
|
||||
**Root cause found:** API handler provides default empty string when email missing.
|
||||
|
||||
---
|
||||
|
||||
## 5. Identify the Pattern
|
||||
|
||||
**Why empty string as default?**
|
||||
- Misguided "safety": Thought empty string better than undefined
|
||||
- Should reject invalid request at API boundary
|
||||
- Downstream code assumes data already validated
|
||||
|
||||
---
|
||||
|
||||
## 6. Fix at Source
|
||||
|
||||
```typescript
|
||||
// api-handler.ts (SOURCE FIX)
|
||||
async createUser(req: Request): Promise<Response> {
|
||||
if (!req.body.email) {
|
||||
return Response.badRequest("Email is required");
|
||||
}
|
||||
const userRequest = {
|
||||
name: req.body.name,
|
||||
email: req.body.email, // No default, already validated
|
||||
};
|
||||
return this.userService.create(userRequest);
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Add Defense in Depth
|
||||
|
||||
After fixing source, add validation at each layer as backup:
|
||||
|
||||
```typescript
|
||||
// Layer 1: API - Reject invalid input (PRIMARY FIX)
|
||||
if (!req.body.email) return Response.badRequest("Email required");
|
||||
|
||||
// Layer 2: Service - Validate assumptions
|
||||
assert(request.email, "email must be present");
|
||||
|
||||
// Layer 3: Utility - Defensive check
|
||||
if (!email) throw new Error("invariant violated: email empty");
|
||||
```
|
||||
|
||||
**Primary fix at source. Defense is backup, not replacement.**
|
||||
</the_process>
|
||||
|
||||
<debugging_approaches>
|
||||
## Option 1: Guide User Through Debugger
|
||||
|
||||
**IMPORTANT:** Claude cannot run interactive debuggers. Guide user through debugger commands.
|
||||
|
||||
```
|
||||
"Let's use lldb to trace backward through the call stack.
|
||||
|
||||
Please run these commands:
|
||||
lldb target/debug/myapp
|
||||
(lldb) breakpoint set --file validator.rs --line 42
|
||||
(lldb) run
|
||||
|
||||
When breakpoint hits:
|
||||
(lldb) frame variable email # Check value here
|
||||
(lldb) bt # See full call stack
|
||||
(lldb) up # Move to caller
|
||||
(lldb) frame variable request # Check values in caller
|
||||
(lldb) up # Move up again
|
||||
(lldb) frame variable # Where empty string created?
|
||||
|
||||
Please share:
|
||||
1. Value of 'email' at validator.rs:42
|
||||
2. Value of 'request.email' in user_service.rs
|
||||
3. Value of 'req.body.email' in api_handler.rs
|
||||
4. Where does empty string first appear?"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Option 2: Add Instrumentation (Claude CAN Do This)
|
||||
|
||||
When debugger not available or issue intermittent:
|
||||
|
||||
```rust
|
||||
// Add at error location
|
||||
fn validate_email(email: &str) -> Result<()> {
|
||||
eprintln!("DEBUG validate_email called:");
|
||||
eprintln!(" email: {:?}", email);
|
||||
eprintln!(" backtrace: {}", std::backtrace::Backtrace::capture());
|
||||
|
||||
if email.is_empty() {
|
||||
return Err(Error::InvalidEmail);
|
||||
}
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
**Critical:** Use `eprintln!()` or `console.error()` in tests (not logger - may be suppressed).
|
||||
|
||||
**Run and analyze:**
|
||||
|
||||
```bash
|
||||
cargo test 2>&1 | grep "DEBUG validate_email" -A 10
|
||||
```
|
||||
|
||||
Look for:
|
||||
- Test file names in backtraces
|
||||
- Line numbers triggering the call
|
||||
- Patterns (same test? same parameter?)
|
||||
</debugging_approaches>
|
||||
|
||||
<finding_polluting_tests>
|
||||
## Finding Which Test Pollutes
|
||||
|
||||
When something appears during tests but you don't know which:
|
||||
|
||||
**Binary search approach:**
|
||||
|
||||
```bash
|
||||
# Run half the tests
|
||||
npm test tests/first-half/*.test.ts
|
||||
# Pollution appears? Yes → in first half, No → second half
|
||||
|
||||
# Subdivide
|
||||
npm test tests/first-quarter/*.test.ts
|
||||
|
||||
# Continue until specific file
|
||||
npm test tests/auth/login.test.ts ← Found it!
|
||||
```
|
||||
|
||||
**Or test isolation:**
|
||||
|
||||
```bash
|
||||
# Run tests one at a time
|
||||
for test in tests/**/*.test.ts; do
|
||||
echo "Testing: $test"
|
||||
npm test "$test"
|
||||
if [ -d .git ]; then
|
||||
echo "FOUND POLLUTER: $test"
|
||||
break
|
||||
fi
|
||||
done
|
||||
```
|
||||
</finding_polluting_tests>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer fixes symptom, not source</scenario>
|
||||
|
||||
<code>
|
||||
# Error appears in git utility:
|
||||
fn git_init(directory: &str) {
|
||||
Command::new("git")
|
||||
.arg("init")
|
||||
.current_dir(directory)
|
||||
.run()
|
||||
}
|
||||
|
||||
# Error: "Invalid argument: empty directory"
|
||||
|
||||
# Developer adds validation at symptom:
|
||||
fn git_init(directory: &str) {
|
||||
if directory.is_empty() {
|
||||
panic!("Directory cannot be empty"); // Band-aid
|
||||
}
|
||||
Command::new("git").arg("init").current_dir(directory).run()
|
||||
}
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Fixes symptom, not source (where empty string created)
|
||||
- Same bug will appear elsewhere directory is used
|
||||
- Doesn't explain WHY directory was empty
|
||||
- Future code might make same mistake
|
||||
- Band-aid hides the real problem
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Trace backward:**
|
||||
|
||||
1. git_init called with directory=""
|
||||
2. WorkspaceManager.init(projectDir="")
|
||||
3. Session.create(projectDir="")
|
||||
4. Test: Project.create(context.tempDir)
|
||||
5. **SOURCE:** context.tempDir="" (accessed before beforeEach!)
|
||||
|
||||
**Fix at source:**
|
||||
|
||||
```typescript
|
||||
function setupTest() {
|
||||
let _tempDir: string | undefined;
|
||||
|
||||
return {
|
||||
beforeEach() {
|
||||
_tempDir = makeTempDir();
|
||||
},
|
||||
get tempDir(): string {
|
||||
if (!_tempDir) {
|
||||
throw new Error("tempDir accessed before beforeEach!");
|
||||
}
|
||||
return _tempDir;
|
||||
}
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Fixes actual bug (test timing issue)
|
||||
- Prevents same mistake elsewhere
|
||||
- Clear error at source, not deep in stack
|
||||
- No empty strings propagating through system
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer stops tracing too early</scenario>
|
||||
|
||||
<code>
|
||||
# Error in API handler
|
||||
async createUser(req: Request): Promise<Response> {
|
||||
const userRequest = {
|
||||
name: req.body.name,
|
||||
email: req.body.email || "", // Suspicious!
|
||||
};
|
||||
return this.userService.create(userRequest);
|
||||
}
|
||||
|
||||
# Developer sees empty string default and "fixes" it:
|
||||
email: req.body.email || "noreply@example.com"
|
||||
|
||||
# Ships to production
|
||||
# Bug: Users created without email input get noreply@example.com
|
||||
# Database has fake emails, can't distinguish missing from real
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Stopped at first suspicious code
|
||||
- Didn't question WHY empty string was default
|
||||
- "Fixed" by replacing with different wrong default
|
||||
- Root cause: shouldn't accept missing email at all
|
||||
- Validation should happen at API boundary
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Keep tracing to understand intent:**
|
||||
|
||||
1. Why was empty string default?
|
||||
2. Should email be optional or required?
|
||||
3. What does API spec say?
|
||||
4. What does database schema say?
|
||||
|
||||
**Findings:**
|
||||
- Email column is NOT NULL in database
|
||||
- API docs say email is required
|
||||
- Empty string was workaround, not design
|
||||
|
||||
**Fix at source (validate at boundary):**
|
||||
|
||||
```typescript
|
||||
async createUser(req: Request): Promise<Response> {
|
||||
// Validate at API boundary
|
||||
if (!req.body.email) {
|
||||
return Response.badRequest("Email is required");
|
||||
}
|
||||
|
||||
const userRequest = {
|
||||
name: req.body.name,
|
||||
email: req.body.email, // No default needed
|
||||
};
|
||||
return this.userService.create(userRequest);
|
||||
}
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Validates at correct layer (API boundary)
|
||||
- Clear error message to client
|
||||
- No invalid data propagates downstream
|
||||
- Database constraints enforced
|
||||
- Matches API specification
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Complex multi-layer trace to find original trigger</scenario>
|
||||
|
||||
<code>
|
||||
# Problem: .git directory appearing in source code directory during tests
|
||||
|
||||
# Symptom location:
|
||||
Error: Cannot initialize git repo (repo already exists)
|
||||
Location: src/workspace/git.rs:45
|
||||
|
||||
# Developer adds check:
|
||||
if Path::new(".git").exists() {
|
||||
return Err("Git already initialized");
|
||||
}
|
||||
|
||||
# Doesn't help - still appears in wrong place!
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Detects symptom, doesn't prevent it
|
||||
- .git still created in wrong directory
|
||||
- Doesn't explain HOW it gets there
|
||||
- Pollution still happens, just detected
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Trace through multiple layers:**
|
||||
|
||||
```
|
||||
1. git init runs with cwd=""
|
||||
↓ Why is cwd empty?
|
||||
|
||||
2. WorkspaceManager.init(projectDir="")
|
||||
↓ Why is projectDir empty?
|
||||
|
||||
3. Session.create(projectDir="")
|
||||
↓ Why was empty string passed?
|
||||
|
||||
4. Test: Project.create(context.tempDir)
|
||||
↓ Why is context.tempDir empty?
|
||||
|
||||
5. ROOT CAUSE:
|
||||
const context = setupTest(); // tempDir="" initially
|
||||
Project.create(context.tempDir); // Accessed at top level!
|
||||
|
||||
beforeEach(() => {
|
||||
context.tempDir = makeTempDir(); // Assigned here
|
||||
});
|
||||
|
||||
TEST ACCESSED TEMPDIR BEFORE BEFOREEACH RAN!
|
||||
```
|
||||
|
||||
**Fix at source (make early access impossible):**
|
||||
|
||||
```typescript
|
||||
function setupTest() {
|
||||
let _tempDir: string | undefined;
|
||||
|
||||
return {
|
||||
beforeEach() {
|
||||
_tempDir = makeTempDir();
|
||||
},
|
||||
get tempDir(): string {
|
||||
if (!_tempDir) {
|
||||
throw new Error("tempDir accessed before beforeEach!");
|
||||
}
|
||||
return _tempDir;
|
||||
}
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
**Then add defense at each layer:**
|
||||
|
||||
```rust
|
||||
// Layer 1: Test framework (PRIMARY FIX)
|
||||
// Getter throws if accessed early
|
||||
|
||||
// Layer 2: Project validation
|
||||
fn create(directory: &str) -> Result<Self> {
|
||||
if directory.is_empty() {
|
||||
return Err("Directory cannot be empty");
|
||||
}
|
||||
// ...
|
||||
}
|
||||
|
||||
// Layer 3: Workspace validation
|
||||
fn init(path: &Path) -> Result<()> {
|
||||
if !path.exists() {
|
||||
return Err("Path must exist");
|
||||
}
|
||||
// ...
|
||||
}
|
||||
|
||||
// Layer 4: Environment guard
|
||||
fn git_init(dir: &Path) -> Result<()> {
|
||||
if env::var("NODE_ENV") != Ok("test".to_string()) {
|
||||
if !dir.starts_with("/tmp") {
|
||||
panic!("Refusing to git init outside test dir");
|
||||
}
|
||||
}
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Primary fix prevents early access (source)
|
||||
- Each layer validates assumptions (defense)
|
||||
- Clear error at source, not deep in stack
|
||||
- Environment guard prevents production pollution
|
||||
- Multi-layer defense catches future mistakes
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Never fix just where error appears** → Trace backward to find source
|
||||
2. **Don't stop at first suspicious code** → Keep tracing to original trigger
|
||||
3. **Fix at source first** → Defense is backup, not primary fix
|
||||
4. **Use debugger OR instrumentation** → Don't guess at call chain
|
||||
5. **Add defense at each layer** → After fixing source, validate assumptions throughout
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Trace backward to find source.**
|
||||
|
||||
- "Error is obvious here, I'll add validation" (That's a symptom fix)
|
||||
- "Stack trace shows the problem" (Shows symptom location, not source)
|
||||
- "This code should handle empty values" (Why is value empty? Find source.)
|
||||
- "Too deep to trace, I'll add defensive check" (Defense without source fix = band-aid)
|
||||
- "Multiple places could cause this" (Trace to find which one actually does)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before claiming root cause fixed:
|
||||
|
||||
- [ ] Traced backward through entire call chain
|
||||
- [ ] Found where invalid data was created (not just passed)
|
||||
- [ ] Identified WHY invalid data was created (pattern/assumption)
|
||||
- [ ] Fixed at source (where bad data originates)
|
||||
- [ ] Added defense at each layer (validate assumptions)
|
||||
- [ ] Verified fix with test (reproduces original bug, passes with fix)
|
||||
- [ ] Confirmed no other code paths have same pattern
|
||||
|
||||
**Can't check all boxes?** Keep tracing backward.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill is called by:**
|
||||
- hyperpowers:debugging-with-tools (Phase 2: Trace Backward Through Call Stack)
|
||||
- When errors occur deep in execution
|
||||
- When unclear where invalid data originated
|
||||
|
||||
**This skill requires:**
|
||||
- Stack traces or debugger access
|
||||
- Ability to add instrumentation (logging)
|
||||
- Understanding of call chain
|
||||
|
||||
**This skill calls:**
|
||||
- hyperpowers:test-driven-development (write regression test after finding source)
|
||||
- hyperpowers:verification-before-completion (verify fix works)
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Debugger commands by language](resources/debugger-reference.md)
|
||||
- [Instrumentation patterns](resources/instrumentation-patterns.md)
|
||||
- [Defense-in-depth examples](resources/defense-patterns.md)
|
||||
|
||||
**When stuck:**
|
||||
- Can't find source → Add instrumentation at each layer, run test
|
||||
- Stack trace unclear → Use debugger to inspect variables at each frame
|
||||
- Multiple suspects → Add instrumentation to all, find which actually executes
|
||||
- Intermittent issue → Add instrumentation and wait for reproduction
|
||||
</resources>
|
||||
399
skills/skills-auto-activation/SKILL.md
Normal file
399
skills/skills-auto-activation/SKILL.md
Normal file
@@ -0,0 +1,399 @@
|
||||
---
|
||||
name: skills-auto-activation
|
||||
description: Use when skills aren't activating reliably - covers official solutions (better descriptions) and custom hook system for deterministic skill activation
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Skills often don't activate despite keywords; make activation reliable through better descriptions, explicit triggers, or custom hooks.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
HIGH FREEDOM - Choose solution level based on project needs (Level 1 for simple, Level 3 for complex). Hook implementation is flexible pattern, not rigid process.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Level | Solution | Effort | Reliability | When to Use |
|
||||
|-------|----------|--------|-------------|-------------|
|
||||
| 1 | Better descriptions + explicit requests | Low | Moderate | Small projects, starting out |
|
||||
| 2 | CLAUDE.md references | Low | Moderate | Document patterns |
|
||||
| 3 | Custom hook system | High | Very High | Large projects, established patterns |
|
||||
|
||||
**Hyperpowers includes:** Auto-activation hook at `hooks/user-prompt-submit/10-skill-activator.js`
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
Use this skill when:
|
||||
- Skills you created aren't being used automatically
|
||||
- Need consistent skill activation across sessions
|
||||
- Large codebases with established patterns
|
||||
- Manual "/use skill-name" gets tedious
|
||||
|
||||
**Prerequisites:**
|
||||
- Skills properly configured (name, description, SKILL.md)
|
||||
- Code execution enabled (Settings > Capabilities)
|
||||
- Skills toggled on (Settings > Capabilities)
|
||||
</when_to_use>
|
||||
|
||||
<the_problem>
|
||||
## What Users Experience
|
||||
|
||||
**Symptoms:**
|
||||
- Keywords from skill descriptions present → skill not used
|
||||
- Working on files that should trigger skills → nothing
|
||||
- Skills exist but sit unused
|
||||
|
||||
**Community reports:**
|
||||
- GitHub Issue #9954: "Skills not available even if explicitly enabled"
|
||||
- "Claude knows it should use skills, but it's not reliable"
|
||||
- Skills activation is "not reliable yet"
|
||||
|
||||
**Root cause:** Skills rely on Claude recognizing relevance (not deterministic)
|
||||
</the_problem>
|
||||
|
||||
<solution_levels>
|
||||
## Level 1: Official Solutions (Start Here)
|
||||
|
||||
### 1. Improve Skill Descriptions
|
||||
|
||||
❌ **Bad:**
|
||||
```yaml
|
||||
name: backend-dev
|
||||
description: Helps with backend development
|
||||
```
|
||||
|
||||
✅ **Good:**
|
||||
```yaml
|
||||
name: backend-dev-guidelines
|
||||
description: Use when creating API routes, controllers, services, or repositories in backend - enforces TypeScript patterns, error handling with Sentry, and Prisma repository pattern
|
||||
```
|
||||
|
||||
**Key elements:**
|
||||
- Specific keywords: "API routes", "controllers", "services"
|
||||
- When to use: "Use when creating..."
|
||||
- What it enforces: Patterns, error handling
|
||||
|
||||
### 2. Be Explicit in Requests
|
||||
|
||||
Instead of: "How do I create an endpoint?"
|
||||
|
||||
Try: "Use my backend-dev-guidelines skill to create an endpoint"
|
||||
|
||||
**Result:** Works, but tedious
|
||||
|
||||
### 3. Check Settings
|
||||
|
||||
- Settings > Capabilities > Enable code execution
|
||||
- Settings > Capabilities > Toggle Skills on
|
||||
- Team/Enterprise: Check org-level settings
|
||||
|
||||
---
|
||||
|
||||
## Level 2: Skill References (Moderate)
|
||||
|
||||
Reference skills in CLAUDE.md:
|
||||
|
||||
```markdown
|
||||
## When Working on Backend
|
||||
|
||||
Before making changes:
|
||||
1. Check `/skills/backend-dev-guidelines` for patterns
|
||||
2. Follow repository pattern for database access
|
||||
|
||||
The backend-dev-guidelines skill contains complete examples.
|
||||
```
|
||||
|
||||
**Pros:** No custom code
|
||||
**Cons:** Claude still might not check
|
||||
|
||||
---
|
||||
|
||||
## Level 3: Custom Hook System (Advanced)
|
||||
|
||||
**How it works:**
|
||||
1. UserPromptSubmit hook analyzes prompt before Claude sees it
|
||||
2. Matches keywords, intent patterns, file paths
|
||||
3. Injects skill activation reminder into context
|
||||
4. Claude sees "🎯 USE these skills" before processing
|
||||
|
||||
**Result:** "Night and day difference" - skills consistently used
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
User submits prompt
|
||||
↓
|
||||
UserPromptSubmit hook intercepts
|
||||
↓
|
||||
Analyze prompt (keywords, intent, files)
|
||||
↓
|
||||
Check skill-rules.json for matches
|
||||
↓
|
||||
Inject activation reminder
|
||||
↓
|
||||
Claude sees: "🎯 USE these skills: ..."
|
||||
↓
|
||||
Claude loads and uses relevant skills
|
||||
```
|
||||
|
||||
### Configuration: skill-rules.json
|
||||
|
||||
```json
|
||||
{
|
||||
"backend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"enforcement": "suggest",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["backend", "controller", "service", "API", "endpoint"],
|
||||
"intentPatterns": [
|
||||
"(create|add|build).*?(route|endpoint|controller|service)",
|
||||
"(how to|pattern).*?(backend|API)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["backend/src/**/*.ts", "server/**/*.ts"],
|
||||
"contentPatterns": ["express\\.Router", "export.*Controller"]
|
||||
}
|
||||
},
|
||||
"test-driven-development": {
|
||||
"type": "process",
|
||||
"enforcement": "suggest",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["test", "TDD", "testing"],
|
||||
"intentPatterns": [
|
||||
"(write|add|create).*?(test|spec)",
|
||||
"test.*(first|before|TDD)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["**/*.test.ts", "**/*.spec.ts"],
|
||||
"contentPatterns": ["describe\\(", "it\\(", "test\\("]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Trigger Types
|
||||
|
||||
1. **Keyword Triggers** - Simple string matching (case insensitive)
|
||||
2. **Intent Pattern Triggers** - Regex for actions + objects
|
||||
3. **File Path Triggers** - Glob patterns for file paths
|
||||
4. **Content Pattern Triggers** - Regex in file content
|
||||
|
||||
### Hook Implementation (High-Level)
|
||||
|
||||
```javascript
|
||||
#!/usr/bin/env node
|
||||
// ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
|
||||
const fs = require('fs');
|
||||
const path = require('path');
|
||||
|
||||
// Load skill rules
|
||||
const rules = JSON.parse(fs.readFileSync(
|
||||
path.join(process.env.HOME, '.claude/skill-rules.json'), 'utf8'
|
||||
));
|
||||
|
||||
// Read prompt from stdin
|
||||
let promptData = '';
|
||||
process.stdin.on('data', chunk => promptData += chunk);
|
||||
|
||||
process.stdin.on('end', () => {
|
||||
const prompt = JSON.parse(promptData);
|
||||
|
||||
// Analyze prompt for skill matches
|
||||
const activatedSkills = analyzePrompt(prompt.text);
|
||||
|
||||
if (activatedSkills.length > 0) {
|
||||
// Inject skill activation reminder
|
||||
const context = `
|
||||
🎯 SKILL ACTIVATION CHECK
|
||||
|
||||
Relevant skills for this prompt:
|
||||
${activatedSkills.map(s => `- **${s.skill}** (${s.priority} priority)`).join('\n')}
|
||||
|
||||
Check if these skills should be used before responding.
|
||||
`;
|
||||
|
||||
console.log(JSON.stringify({
|
||||
decision: 'approve',
|
||||
additionalContext: context
|
||||
}));
|
||||
} else {
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
}
|
||||
});
|
||||
|
||||
function analyzePrompt(text) {
|
||||
// Match against all skill rules
|
||||
// Return list of activated skills with priorities
|
||||
}
|
||||
```
|
||||
|
||||
**For complete working implementation:** See [resources/hook-implementation.md](resources/hook-implementation.md)
|
||||
|
||||
### Progressive Enhancement
|
||||
|
||||
**Phase 1 (Week 1):** Basic keyword matching
|
||||
```json
|
||||
{"keywords": ["backend", "API", "controller"]}
|
||||
```
|
||||
|
||||
**Phase 2 (Week 2):** Add intent patterns
|
||||
```json
|
||||
{"intentPatterns": ["(create|add).*?(route|endpoint)"]}
|
||||
```
|
||||
|
||||
**Phase 3 (Week 3):** Add file triggers
|
||||
```json
|
||||
{"fileTriggers": {"pathPatterns": ["backend/**/*.ts"]}}
|
||||
```
|
||||
|
||||
**Phase 4 (Ongoing):** Refine based on observation
|
||||
</solution_levels>
|
||||
|
||||
<results>
|
||||
### Before Hook System
|
||||
|
||||
- Skills sit unused despite perfect keywords
|
||||
- Manual "/use skill-name" every time
|
||||
- Inconsistent patterns across codebase
|
||||
- Time spent fixing "creative interpretations"
|
||||
|
||||
### After Hook System
|
||||
|
||||
- Skills activate automatically and reliably
|
||||
- Consistent patterns enforced
|
||||
- Claude self-checks before showing code
|
||||
- "Night and day difference"
|
||||
|
||||
**Real user:** "Skills went from 'expensive decorations' to actually useful"
|
||||
</results>
|
||||
|
||||
<limitations>
|
||||
## Hook System Limitations
|
||||
|
||||
1. **Requires hook system** - Not built into Claude Code
|
||||
2. **Maintenance overhead** - skill-rules.json needs updates
|
||||
3. **May over-activate** - Too many skills overwhelm context
|
||||
4. **Not perfect** - Still relies on Claude using activated skills
|
||||
|
||||
## Considerations
|
||||
|
||||
**Token usage:**
|
||||
- Activation reminder adds ~50-100 tokens per prompt
|
||||
- Multiple skills add more tokens
|
||||
- Use priorities to limit activation
|
||||
|
||||
**Performance:**
|
||||
- Hook adds ~100-300ms to prompt processing
|
||||
- Acceptable for quality improvement
|
||||
- Optimize regex patterns if slow
|
||||
|
||||
**Maintenance:**
|
||||
- Update rules when adding new skills
|
||||
- Review activation logs monthly
|
||||
- Refine patterns based on misses
|
||||
</limitations>
|
||||
|
||||
<alternatives>
|
||||
## Approach 1: MCP Integration
|
||||
|
||||
Use Model Context Protocol to provide skills as context.
|
||||
|
||||
**Pros:** Built into Claude system
|
||||
**Cons:** Still not deterministic, same activation issues
|
||||
|
||||
## Approach 2: Custom System Prompt
|
||||
|
||||
Modify Claude's system prompt to always check certain skills.
|
||||
|
||||
**Pros:** Works without hooks
|
||||
**Cons:** Limited to Pro plan, can't customize per-project
|
||||
|
||||
## Approach 3: Manual Discipline
|
||||
|
||||
Always explicitly request skill usage.
|
||||
|
||||
**Pros:** No setup required
|
||||
**Cons:** Tedious, easy to forget, doesn't scale
|
||||
|
||||
## Approach 4: Skill Consolidation
|
||||
|
||||
Combine all guidelines into CLAUDE.md.
|
||||
|
||||
**Pros:** Always loaded
|
||||
**Cons:** Violates progressive disclosure, wastes tokens
|
||||
|
||||
**Recommendation:** Level 3 (hooks) for large projects, Level 1 for smaller projects
|
||||
</alternatives>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Try Level 1 first** → Better descriptions and explicit requests before building hooks
|
||||
2. **Observe before building** → Watch which prompts should activate skills
|
||||
3. **Start with keywords** → Add complexity incrementally (keywords → intent → files)
|
||||
4. **Keep hook fast (<1 second)** → Don't block prompt processing
|
||||
5. **Maintain skill-rules.json** → Update when skills change
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **Try Level 1 first, then decide.**
|
||||
|
||||
- "Skills should just work automatically" (They should, but don't reliably - workaround needed)
|
||||
- "Hook system too complex" (Setup takes 2 hours, saves hundreds of hours)
|
||||
- "I'll manually specify skills" (You'll forget, it gets tedious)
|
||||
- "Improving descriptions will fix it" (Helps, but not deterministic)
|
||||
- "This is overkill" (Maybe - start Level 1, upgrade if needed)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before building hook system:
|
||||
|
||||
- [ ] Tried improving skill descriptions (Level 1)
|
||||
- [ ] Tried explicit skill requests (Level 1)
|
||||
- [ ] Checked all settings are enabled
|
||||
- [ ] Observed which prompts should activate skills
|
||||
- [ ] Identified patterns in failures
|
||||
- [ ] Project large enough to justify hook overhead
|
||||
- [ ] Have time for 2-hour setup + ongoing maintenance
|
||||
|
||||
**If Level 1 works:** Don't build hook system
|
||||
|
||||
**If Level 1 insufficient:** Build hook system (Level 3)
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill covers:** Skill activation strategies
|
||||
|
||||
**Related skills:**
|
||||
- hyperpowers:building-hooks (how to build hook system)
|
||||
- hyperpowers:using-hyper (when to use skills generally)
|
||||
- hyperpowers:writing-skills (creating skills that activate well)
|
||||
|
||||
**This skill enables:**
|
||||
- Consistent enforcement of patterns
|
||||
- Automatic guideline checking
|
||||
- Reliable skill usage across sessions
|
||||
|
||||
**Hyperpowers includes:** Auto-activation hook at `hooks/user-prompt-submit/10-skill-activator.js`
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed implementation:**
|
||||
- [Complete working hook code](resources/hook-implementation.md)
|
||||
- [skill-rules.json examples](resources/skill-rules-examples.md)
|
||||
- [Troubleshooting guide](resources/troubleshooting.md)
|
||||
|
||||
**Official documentation:**
|
||||
- [Anthropic Skills Best Practices](https://docs.claude.com/en/docs/agents-and-tools/agent-skills/best-practices)
|
||||
- [Claude Code Hooks Guide](https://docs.claude.com/en/docs/claude-code/hooks-guide)
|
||||
|
||||
**When stuck:**
|
||||
- Skills still not activating → Check Settings > Capabilities
|
||||
- Hook not working → Check ~/.claude/logs/hooks.log
|
||||
- Over-activation → Reduce keywords, increase priority thresholds
|
||||
- Under-activation → Add more keywords, broaden intent patterns
|
||||
</resources>
|
||||
655
skills/skills-auto-activation/resources/hook-implementation.md
Normal file
655
skills/skills-auto-activation/resources/hook-implementation.md
Normal file
@@ -0,0 +1,655 @@
|
||||
# Complete Hook Implementation for Skills Auto-Activation
|
||||
|
||||
This guide provides complete, production-ready code for implementing skills auto-activation using Claude Code hooks.
|
||||
|
||||
## Complete File Structure
|
||||
|
||||
```
|
||||
~/.claude/
|
||||
├── hooks/
|
||||
│ └── user-prompt-submit/
|
||||
│ └── skill-activator.js # Main hook script
|
||||
├── skill-rules.json # Skill activation rules
|
||||
└── hooks.json # Hook configuration
|
||||
```
|
||||
|
||||
## Step 1: Create skill-rules.json
|
||||
|
||||
**Location:** `~/.claude/skill-rules.json`
|
||||
|
||||
```json
|
||||
{
|
||||
"backend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"enforcement": "suggest",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"backend",
|
||||
"controller",
|
||||
"service",
|
||||
"repository",
|
||||
"API",
|
||||
"endpoint",
|
||||
"route",
|
||||
"middleware",
|
||||
"database",
|
||||
"prisma",
|
||||
"sequelize"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(create|add|build|implement).*?(route|endpoint|controller|service|repository)",
|
||||
"(how to|best practice|pattern|guide).*?(backend|API|database|server)",
|
||||
"(setup|configure|initialize).*?(database|ORM|API)",
|
||||
"implement.*(authentication|authorization|auth|security)",
|
||||
"(error|exception).*(handling|catching|logging)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"backend/**/*.ts",
|
||||
"backend/**/*.js",
|
||||
"server/**/*.ts",
|
||||
"api/**/*.ts",
|
||||
"src/controllers/**",
|
||||
"src/services/**",
|
||||
"src/repositories/**"
|
||||
],
|
||||
"contentPatterns": [
|
||||
"express\\.Router",
|
||||
"export.*Controller",
|
||||
"export.*Service",
|
||||
"export.*Repository",
|
||||
"prisma\\.",
|
||||
"@Controller",
|
||||
"@Injectable"
|
||||
]
|
||||
}
|
||||
},
|
||||
"frontend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"enforcement": "suggest",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"frontend",
|
||||
"component",
|
||||
"react",
|
||||
"UI",
|
||||
"layout",
|
||||
"page",
|
||||
"view",
|
||||
"hooks",
|
||||
"state",
|
||||
"props",
|
||||
"routing",
|
||||
"navigation"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(create|build|add|implement).*?(component|page|layout|view|screen)",
|
||||
"(how to|pattern|best practice).*?(react|hooks|state|context|props)",
|
||||
"(style|CSS|design).*?(component|layout|UI)",
|
||||
"implement.*?(routing|navigation|route)",
|
||||
"(state|data).*(management|flow|handling)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"src/components/**/*.tsx",
|
||||
"src/components/**/*.jsx",
|
||||
"src/pages/**/*.tsx",
|
||||
"src/views/**/*.tsx",
|
||||
"frontend/**/*.tsx"
|
||||
],
|
||||
"contentPatterns": [
|
||||
"import.*from ['\"]react",
|
||||
"export.*function.*Component",
|
||||
"export.*default.*function",
|
||||
"useState",
|
||||
"useEffect",
|
||||
"React\\.FC"
|
||||
]
|
||||
}
|
||||
},
|
||||
"test-driven-development": {
|
||||
"type": "process",
|
||||
"enforcement": "suggest",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"test",
|
||||
"testing",
|
||||
"TDD",
|
||||
"spec",
|
||||
"unit test",
|
||||
"integration test",
|
||||
"e2e",
|
||||
"jest",
|
||||
"vitest",
|
||||
"mocha"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(write|add|create|implement).*?(test|spec|unit test)",
|
||||
"test.*(first|before|TDD|driven)",
|
||||
"(bug|fix|issue).*?(reproduce|test)",
|
||||
"(coverage|untested).*?(code|function)",
|
||||
"(mock|stub|spy).*?(function|API|service)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"**/*.test.ts",
|
||||
"**/*.test.js",
|
||||
"**/*.spec.ts",
|
||||
"**/*.spec.js",
|
||||
"**/__tests__/**",
|
||||
"**/test/**"
|
||||
],
|
||||
"contentPatterns": [
|
||||
"describe\\(",
|
||||
"it\\(",
|
||||
"test\\(",
|
||||
"expect\\(",
|
||||
"jest\\.fn",
|
||||
"beforeEach\\(",
|
||||
"afterEach\\("
|
||||
]
|
||||
}
|
||||
},
|
||||
"debugging-with-tools": {
|
||||
"type": "process",
|
||||
"enforcement": "suggest",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"debug",
|
||||
"debugging",
|
||||
"error",
|
||||
"bug",
|
||||
"crash",
|
||||
"fails",
|
||||
"broken",
|
||||
"not working",
|
||||
"issue",
|
||||
"problem"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(debug|fix|solve|investigate|troubleshoot).*?(error|bug|issue|problem)",
|
||||
"(why|what).*?(failing|broken|not working|crashing)",
|
||||
"(find|locate|identify).*?(bug|issue|problem|root cause)",
|
||||
"reproduce.*(bug|issue|error)"
|
||||
]
|
||||
}
|
||||
},
|
||||
"refactoring-safely": {
|
||||
"type": "process",
|
||||
"enforcement": "suggest",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"refactor",
|
||||
"refactoring",
|
||||
"cleanup",
|
||||
"improve",
|
||||
"restructure",
|
||||
"reorganize",
|
||||
"simplify"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(refactor|clean up|improve|restructure).*?(code|function|class|component)",
|
||||
"(extract|split|separate).*?(function|method|component|logic)",
|
||||
"(rename|move|relocate).*?(file|function|class)",
|
||||
"remove.*(duplication|duplicate|repeated code)"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Step 2: Create Hook Script
|
||||
|
||||
**Location:** `~/.claude/hooks/user-prompt-submit/skill-activator.js`
|
||||
|
||||
```javascript
|
||||
#!/usr/bin/env node
|
||||
|
||||
const fs = require('fs');
|
||||
const path = require('path');
|
||||
|
||||
// Configuration
|
||||
const CONFIG = {
|
||||
rulesPath: process.env.SKILL_RULES || path.join(process.env.HOME, '.claude/skill-rules.json'),
|
||||
maxSkills: 3, // Limit to avoid context overload
|
||||
debugMode: process.env.DEBUG === 'true'
|
||||
};
|
||||
|
||||
// Load skill rules
|
||||
function loadRules() {
|
||||
try {
|
||||
const content = fs.readFileSync(CONFIG.rulesPath, 'utf8');
|
||||
return JSON.parse(content);
|
||||
} catch (error) {
|
||||
if (CONFIG.debugMode) {
|
||||
console.error('Failed to load skill rules:', error.message);
|
||||
}
|
||||
return {};
|
||||
}
|
||||
}
|
||||
|
||||
// Read prompt from stdin
|
||||
function readPrompt() {
|
||||
return new Promise((resolve) => {
|
||||
let data = '';
|
||||
process.stdin.on('data', chunk => data += chunk);
|
||||
process.stdin.on('end', () => {
|
||||
try {
|
||||
resolve(JSON.parse(data));
|
||||
} catch (error) {
|
||||
if (CONFIG.debugMode) {
|
||||
console.error('Failed to parse prompt:', error.message);
|
||||
}
|
||||
resolve({ text: '' });
|
||||
}
|
||||
});
|
||||
});
|
||||
}
|
||||
|
||||
// Analyze prompt for skill matches
|
||||
function analyzePrompt(promptText, rules) {
|
||||
const lowerText = promptText.toLowerCase();
|
||||
const activated = [];
|
||||
|
||||
for (const [skillName, config] of Object.entries(rules)) {
|
||||
let matched = false;
|
||||
let matchReason = '';
|
||||
|
||||
// Check keyword triggers
|
||||
if (config.promptTriggers?.keywords) {
|
||||
for (const keyword of config.promptTriggers.keywords) {
|
||||
if (lowerText.includes(keyword.toLowerCase())) {
|
||||
matched = true;
|
||||
matchReason = `keyword: "${keyword}"`;
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Check intent pattern triggers
|
||||
if (!matched && config.promptTriggers?.intentPatterns) {
|
||||
for (const pattern of config.promptTriggers.intentPatterns) {
|
||||
try {
|
||||
if (new RegExp(pattern, 'i').test(promptText)) {
|
||||
matched = true;
|
||||
matchReason = `intent pattern: "${pattern}"`;
|
||||
break;
|
||||
}
|
||||
} catch (error) {
|
||||
if (CONFIG.debugMode) {
|
||||
console.error(`Invalid pattern "${pattern}":`, error.message);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (matched) {
|
||||
activated.push({
|
||||
skill: skillName,
|
||||
priority: config.priority || 'medium',
|
||||
reason: matchReason,
|
||||
type: config.type || 'general'
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// Sort by priority (high > medium > low)
|
||||
const priorityOrder = { high: 0, medium: 1, low: 2 };
|
||||
activated.sort((a, b) => {
|
||||
const priorityDiff = priorityOrder[a.priority] - priorityOrder[b.priority];
|
||||
if (priorityDiff !== 0) return priorityDiff;
|
||||
// Secondary sort: process types before domain types
|
||||
const typeOrder = { process: 0, domain: 1, general: 2 };
|
||||
return (typeOrder[a.type] || 2) - (typeOrder[b.type] || 2);
|
||||
});
|
||||
|
||||
// Limit to max skills
|
||||
return activated.slice(0, CONFIG.maxSkills);
|
||||
}
|
||||
|
||||
// Generate activation context
|
||||
function generateContext(skills) {
|
||||
if (skills.length === 0) {
|
||||
return null;
|
||||
}
|
||||
|
||||
const lines = [
|
||||
'',
|
||||
'━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━',
|
||||
'🎯 SKILL ACTIVATION CHECK',
|
||||
'━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━',
|
||||
'',
|
||||
'Relevant skills for this prompt:',
|
||||
''
|
||||
];
|
||||
|
||||
for (const skill of skills) {
|
||||
const emoji = skill.priority === 'high' ? '⭐' : skill.priority === 'medium' ? '📌' : '💡';
|
||||
lines.push(`${emoji} **${skill.skill}** (${skill.priority} priority)`);
|
||||
|
||||
if (CONFIG.debugMode) {
|
||||
lines.push(` Matched: ${skill.reason}`);
|
||||
}
|
||||
}
|
||||
|
||||
lines.push('');
|
||||
lines.push('Before responding, check if any of these skills should be used.');
|
||||
lines.push('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━');
|
||||
lines.push('');
|
||||
|
||||
return lines.join('\n');
|
||||
}
|
||||
|
||||
// Main execution
|
||||
async function main() {
|
||||
try {
|
||||
// Load rules
|
||||
const rules = loadRules();
|
||||
|
||||
if (Object.keys(rules).length === 0) {
|
||||
if (CONFIG.debugMode) {
|
||||
console.error('No rules loaded');
|
||||
}
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
return;
|
||||
}
|
||||
|
||||
// Read prompt
|
||||
const prompt = await readPrompt();
|
||||
|
||||
if (!prompt.text || prompt.text.trim() === '') {
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
return;
|
||||
}
|
||||
|
||||
// Analyze prompt
|
||||
const activatedSkills = analyzePrompt(prompt.text, rules);
|
||||
|
||||
// Generate response
|
||||
if (activatedSkills.length > 0) {
|
||||
const context = generateContext(activatedSkills);
|
||||
|
||||
if (CONFIG.debugMode) {
|
||||
console.error('Activated skills:', activatedSkills.map(s => s.skill).join(', '));
|
||||
}
|
||||
|
||||
console.log(JSON.stringify({
|
||||
decision: 'approve',
|
||||
additionalContext: context
|
||||
}));
|
||||
} else {
|
||||
if (CONFIG.debugMode) {
|
||||
console.error('No skills activated');
|
||||
}
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
}
|
||||
} catch (error) {
|
||||
if (CONFIG.debugMode) {
|
||||
console.error('Hook error:', error.message, error.stack);
|
||||
}
|
||||
// Always approve on error
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
}
|
||||
}
|
||||
|
||||
main();
|
||||
```
|
||||
|
||||
## Step 3: Make Hook Executable
|
||||
|
||||
```bash
|
||||
chmod +x ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
## Step 4: Configure Hook
|
||||
|
||||
**Location:** `~/.claude/hooks.json`
|
||||
|
||||
```json
|
||||
{
|
||||
"hooks": [
|
||||
{
|
||||
"event": "UserPromptSubmit",
|
||||
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js",
|
||||
"description": "Analyze prompt and inject skill activation reminders",
|
||||
"blocking": false,
|
||||
"timeout": 1000
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Step 5: Test the Hook
|
||||
|
||||
### Test 1: Keyword Matching
|
||||
|
||||
```bash
|
||||
# Create test prompt
|
||||
echo '{"text": "How do I create a new API endpoint?"}' | \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
**Expected output:**
|
||||
```json
|
||||
{
|
||||
"additionalContext": "\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n🎯 SKILL ACTIVATION CHECK\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n\nRelevant skills for this prompt:\n\n⭐ **backend-dev-guidelines** (high priority)\n\nBefore responding, check if any of these skills should be used.\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
|
||||
}
|
||||
```
|
||||
|
||||
### Test 2: Intent Pattern Matching
|
||||
|
||||
```bash
|
||||
echo '{"text": "I want to build a new React component"}' | \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
**Expected:** Should activate frontend-dev-guidelines
|
||||
|
||||
### Test 3: Multiple Skills
|
||||
|
||||
```bash
|
||||
echo '{"text": "Write a test for the API endpoint"}' | \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
**Expected:** Should activate hyperpowers:test-driven-development and backend-dev-guidelines
|
||||
|
||||
### Test 4: Debug Mode
|
||||
|
||||
```bash
|
||||
DEBUG=true echo '{"text": "How do I create a component?"}' | \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
|
||||
```
|
||||
|
||||
**Expected:** Debug output showing which skills matched and why
|
||||
|
||||
## Advanced: File-Based Triggers
|
||||
|
||||
To add file-based triggers, extend the hook to check which files are being edited:
|
||||
|
||||
```javascript
|
||||
// Add to skill-activator.js
|
||||
|
||||
// Get recently edited files from Claude Code context
|
||||
function getRecentFiles(prompt) {
|
||||
// Claude Code provides context about files being edited
|
||||
// This would come from the prompt context or a separate tracking mechanism
|
||||
return prompt.files || [];
|
||||
}
|
||||
|
||||
// Check file triggers
|
||||
function checkFileTriggers(files, config) {
|
||||
if (!files || files.length === 0) return false;
|
||||
if (!config.fileTriggers) return false;
|
||||
|
||||
// Check path patterns
|
||||
if (config.fileTriggers.pathPatterns) {
|
||||
for (const file of files) {
|
||||
for (const pattern of config.fileTriggers.pathPatterns) {
|
||||
// Convert glob pattern to regex
|
||||
const regex = globToRegex(pattern);
|
||||
if (regex.test(file)) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Check content patterns (would require reading files)
|
||||
// Omitted for performance - better to check in PostToolUse hook
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
// Convert glob pattern to regex
|
||||
function globToRegex(glob) {
|
||||
const regex = glob
|
||||
.replace(/\*\*/g, '___DOUBLE_STAR___')
|
||||
.replace(/\*/g, '[^/]*')
|
||||
.replace(/___DOUBLE_STAR___/g, '.*')
|
||||
.replace(/\?/g, '.');
|
||||
return new RegExp(`^${regex}$`);
|
||||
}
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Hook Not Running
|
||||
|
||||
**Check:**
|
||||
```bash
|
||||
# Verify hook is configured
|
||||
cat ~/.claude/hooks.json
|
||||
|
||||
# Test hook manually
|
||||
echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
|
||||
# Check Claude Code logs
|
||||
tail -f ~/.claude/logs/hooks.log
|
||||
```
|
||||
|
||||
### No Skills Activating
|
||||
|
||||
**Enable debug mode:**
|
||||
```bash
|
||||
DEBUG=true node ~/.claude/hooks/user-prompt-submit/skill-activator.js < test-prompt.json
|
||||
```
|
||||
|
||||
**Common causes:**
|
||||
- skill-rules.json not found or invalid
|
||||
- Keywords don't match (check casing, spelling)
|
||||
- Patterns have regex errors
|
||||
- Hook timing out (increase timeout)
|
||||
|
||||
### Too Many Skills Activating
|
||||
|
||||
**Adjust maxSkills:**
|
||||
```javascript
|
||||
const CONFIG = {
|
||||
maxSkills: 2, // Reduce from 3
|
||||
// ...
|
||||
};
|
||||
```
|
||||
|
||||
**Or tighten triggers:**
|
||||
```json
|
||||
{
|
||||
"backend-dev-guidelines": {
|
||||
"priority": "high", // Only high priority skills
|
||||
"promptTriggers": {
|
||||
"keywords": ["controller", "service"], // More specific keywords
|
||||
// ...
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Performance Issues
|
||||
|
||||
**If hook is slow (>500ms):**
|
||||
|
||||
1. Reduce regex complexity
|
||||
2. Limit number of patterns
|
||||
3. Cache compiled regex patterns
|
||||
4. Profile with:
|
||||
|
||||
```bash
|
||||
time echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Monthly Review
|
||||
|
||||
```bash
|
||||
# Check activation frequency
|
||||
grep "Activated skills" ~/.claude/hooks/debug.log | sort | uniq -c
|
||||
|
||||
# Find prompts that didn't activate any skills
|
||||
grep "No skills activated" ~/.claude/hooks/debug.log
|
||||
```
|
||||
|
||||
### Updating Rules
|
||||
|
||||
When adding new skills:
|
||||
|
||||
1. Add to skill-rules.json
|
||||
2. Test activation with sample prompts
|
||||
3. Observe for false positives/negatives
|
||||
4. Refine patterns based on usage
|
||||
|
||||
### Version Control
|
||||
|
||||
```bash
|
||||
# Track rules in git
|
||||
cd ~/.claude
|
||||
git init
|
||||
git add skill-rules.json hooks/
|
||||
git commit -m "Initial skill activation rules"
|
||||
```
|
||||
|
||||
## Integration with Other Hooks
|
||||
|
||||
The skill activator can work alongside other hooks:
|
||||
|
||||
```json
|
||||
{
|
||||
"hooks": [
|
||||
{
|
||||
"event": "UserPromptSubmit",
|
||||
"command": "~/.claude/hooks/user-prompt-submit/00-log-prompt.sh",
|
||||
"description": "Log prompts for analysis",
|
||||
"blocking": false
|
||||
},
|
||||
{
|
||||
"event": "UserPromptSubmit",
|
||||
"command": "~/.claude/hooks/user-prompt-submit/10-skill-activator.js",
|
||||
"description": "Activate relevant skills",
|
||||
"blocking": false
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Naming convention:** Use numeric prefixes (00-, 10-, 20-) to control execution order.
|
||||
|
||||
## Performance Benchmarks
|
||||
|
||||
**Target performance:**
|
||||
- Keyword matching: <50ms
|
||||
- Intent pattern matching: <200ms
|
||||
- Total hook execution: <500ms
|
||||
|
||||
**Actual performance (typical):**
|
||||
- 2-3 skills: ~100-300ms
|
||||
- 5+ skills: ~300-500ms
|
||||
|
||||
If performance degrades, profile and optimize patterns.
|
||||
443
skills/skills-auto-activation/resources/skill-rules-examples.md
Normal file
443
skills/skills-auto-activation/resources/skill-rules-examples.md
Normal file
@@ -0,0 +1,443 @@
|
||||
# Skill Rules Examples
|
||||
|
||||
Example configurations for common skill types and scenarios.
|
||||
|
||||
## Domain-Specific Skills
|
||||
|
||||
### Backend Development
|
||||
|
||||
```json
|
||||
{
|
||||
"backend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"backend", "server", "API", "endpoint", "route",
|
||||
"controller", "service", "repository",
|
||||
"middleware", "authentication", "authorization"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(create|build|implement|add).*?(API|endpoint|route|controller)",
|
||||
"how.*(backend|server|API)",
|
||||
"(setup|configure).*(server|backend|API)",
|
||||
"implement.*(auth|security)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"backend/**/*.ts",
|
||||
"server/**/*.ts",
|
||||
"src/api/**"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Frontend Development
|
||||
|
||||
```json
|
||||
{
|
||||
"frontend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"frontend", "UI", "component", "react", "vue", "angular",
|
||||
"page", "layout", "view", "hooks", "state"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(create|build).*?(component|page|layout)",
|
||||
"how.*(react|hooks|state)",
|
||||
"(style|design).*?(component|UI)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"src/components/**/*.tsx",
|
||||
"src/pages/**/*.tsx"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Process Skills
|
||||
|
||||
### Test-Driven Development
|
||||
|
||||
```json
|
||||
{
|
||||
"test-driven-development": {
|
||||
"type": "process",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["test", "TDD", "testing", "spec", "jest", "vitest"],
|
||||
"intentPatterns": [
|
||||
"(write|create|add).*?test",
|
||||
"test.*first",
|
||||
"reproduce.*(bug|error)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"**/*.test.ts",
|
||||
"**/*.spec.ts",
|
||||
"**/__tests__/**"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Code Review
|
||||
|
||||
```json
|
||||
{
|
||||
"code-review": {
|
||||
"type": "process",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": ["review", "check", "verify", "audit", "quality"],
|
||||
"intentPatterns": [
|
||||
"review.*(code|changes|implementation)",
|
||||
"(check|verify).*(quality|standards|best practices)"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Technology-Specific Skills
|
||||
|
||||
### Database/Prisma
|
||||
|
||||
```json
|
||||
{
|
||||
"database-prisma": {
|
||||
"type": "technology",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"database", "prisma", "schema", "migration",
|
||||
"query", "orm", "model"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(create|update|modify).*?(schema|model|migration)",
|
||||
"(query|fetch|get).*?database"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"**/prisma/**",
|
||||
"**/*.prisma"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Docker/DevOps
|
||||
|
||||
```json
|
||||
{
|
||||
"devops-docker": {
|
||||
"type": "technology",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"docker", "dockerfile", "container",
|
||||
"deployment", "CI/CD", "kubernetes"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(create|build|configure).*?(docker|container)",
|
||||
"(deploy|release|publish)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"**/Dockerfile",
|
||||
"**/.github/workflows/**",
|
||||
"**/docker-compose.yml"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Project-Specific Skills
|
||||
|
||||
### Feature-Specific (E-commerce Cart)
|
||||
|
||||
```json
|
||||
{
|
||||
"cart-feature": {
|
||||
"type": "feature",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"cart", "shopping cart", "basket",
|
||||
"add to cart", "checkout"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(implement|create|modify).*?cart",
|
||||
"cart.*(functionality|feature|logic)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": [
|
||||
"src/features/cart/**",
|
||||
"backend/cart-service/**"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Priority-Based Configuration
|
||||
|
||||
### High Priority (Always Check)
|
||||
|
||||
```json
|
||||
{
|
||||
"critical-security": {
|
||||
"type": "security",
|
||||
"priority": "high",
|
||||
"enforcement": "suggest",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"security", "vulnerability", "authentication", "authorization",
|
||||
"SQL injection", "XSS", "CSRF", "password", "token"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"secur(e|ity)",
|
||||
"vulnerab(le|ility)",
|
||||
"(auth|password|token).*(implement|handle|store)"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Medium Priority (Contextual)
|
||||
|
||||
```json
|
||||
{
|
||||
"performance-optimization": {
|
||||
"type": "optimization",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"performance", "optimize", "slow", "cache",
|
||||
"memory", "speed", "latency"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(improve|optimize).*(performance|speed)",
|
||||
"(reduce|minimize).*(latency|memory|time)"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Low Priority (Optional)
|
||||
|
||||
```json
|
||||
{
|
||||
"documentation-guide": {
|
||||
"type": "documentation",
|
||||
"priority": "low",
|
||||
"promptTriggers": {
|
||||
"keywords": [
|
||||
"documentation", "docs", "comments", "readme",
|
||||
"docstring", "jsdoc"
|
||||
],
|
||||
"intentPatterns": [
|
||||
"(write|update|create).*?(documentation|docs)",
|
||||
"document.*(API|function|component)"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Multi-Repo Configuration
|
||||
|
||||
For projects with multiple repositories:
|
||||
|
||||
```json
|
||||
{
|
||||
"frontend-mobile": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["mobile", "ios", "android", "react native"],
|
||||
"intentPatterns": ["(create|build).*?(screen|component)"]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["/mobile/**", "/apps/mobile/**"]
|
||||
}
|
||||
},
|
||||
"frontend-web": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["web", "website", "react", "nextjs"],
|
||||
"intentPatterns": ["(create|build).*?(page|component)"]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["/web/**", "/apps/web/**"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Advanced Pattern Matching
|
||||
|
||||
### Negative Patterns (Exclude)
|
||||
|
||||
```json
|
||||
{
|
||||
"backend-dev-guidelines": {
|
||||
"promptTriggers": {
|
||||
"keywords": ["backend"],
|
||||
"intentPatterns": [
|
||||
"backend",
|
||||
"(?!.*test).*backend" // Match "backend" but not if "test" appears
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Compound Patterns
|
||||
|
||||
```json
|
||||
{
|
||||
"database-migration": {
|
||||
"promptTriggers": {
|
||||
"intentPatterns": [
|
||||
"(create|generate|run).*(migration|schema change)",
|
||||
"(add|remove|modify).*(column|table|index)"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Context-Aware Patterns
|
||||
|
||||
```json
|
||||
{
|
||||
"error-handling": {
|
||||
"promptTriggers": {
|
||||
"keywords": ["error", "exception", "try", "catch"],
|
||||
"intentPatterns": [
|
||||
"(handle|catch|throw).*(error|exception)",
|
||||
"error.*handling"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Enforcement Levels
|
||||
|
||||
```json
|
||||
{
|
||||
"critical-skill": {
|
||||
"enforcement": "block", // Block if not used (future feature)
|
||||
"priority": "high"
|
||||
},
|
||||
"recommended-skill": {
|
||||
"enforcement": "suggest", // Suggest usage
|
||||
"priority": "medium"
|
||||
},
|
||||
"optional-skill": {
|
||||
"enforcement": "optional", // Mention availability
|
||||
"priority": "low"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Full Example Configuration
|
||||
|
||||
Complete configuration for a full-stack TypeScript project:
|
||||
|
||||
```json
|
||||
{
|
||||
"backend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["backend", "API", "endpoint", "controller", "service"],
|
||||
"intentPatterns": [
|
||||
"(create|add|implement).*?(API|endpoint|route|controller|service)",
|
||||
"how.*(backend|server|API)"
|
||||
]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["backend/**/*.ts", "server/**/*.ts"]
|
||||
}
|
||||
},
|
||||
"frontend-dev-guidelines": {
|
||||
"type": "domain",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["frontend", "component", "react", "UI"],
|
||||
"intentPatterns": ["(create|build).*?(component|page)"]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["src/components/**/*.tsx", "src/pages/**/*.tsx"]
|
||||
}
|
||||
},
|
||||
"test-driven-development": {
|
||||
"type": "process",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["test", "TDD", "testing"],
|
||||
"intentPatterns": ["(write|create).*?test", "test.*first"]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["**/*.test.ts", "**/*.spec.ts"]
|
||||
}
|
||||
},
|
||||
"database-prisma": {
|
||||
"type": "technology",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["database", "prisma", "schema", "migration"],
|
||||
"intentPatterns": ["(create|modify).*?(schema|migration)"]
|
||||
},
|
||||
"fileTriggers": {
|
||||
"pathPatterns": ["**/prisma/**"]
|
||||
}
|
||||
},
|
||||
"debugging-with-tools": {
|
||||
"type": "process",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": ["debug", "bug", "error", "broken", "not working"],
|
||||
"intentPatterns": ["(debug|fix|solve).*?(error|bug|issue)"]
|
||||
}
|
||||
},
|
||||
"refactoring-safely": {
|
||||
"type": "process",
|
||||
"priority": "medium",
|
||||
"promptTriggers": {
|
||||
"keywords": ["refactor", "cleanup", "improve", "restructure"],
|
||||
"intentPatterns": ["(refactor|clean up|improve).*?code"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Tips for Creating Rules
|
||||
|
||||
1. **Start broad, refine narrow** - Begin with general keywords, narrow based on false positives
|
||||
2. **Use priority wisely** - High priority for critical skills only
|
||||
3. **Test patterns** - Validate regex patterns before deploying
|
||||
4. **Monitor activation** - Track which skills activate and adjust
|
||||
5. **Keep it maintainable** - Comment complex patterns
|
||||
6. **Version control** - Track changes to rules over time
|
||||
557
skills/skills-auto-activation/resources/troubleshooting.md
Normal file
557
skills/skills-auto-activation/resources/troubleshooting.md
Normal file
@@ -0,0 +1,557 @@
|
||||
# Troubleshooting Skills Auto-Activation
|
||||
|
||||
Common issues and solutions for skills auto-activation system.
|
||||
|
||||
## Problem: Hook Not Running At All
|
||||
|
||||
### Symptoms
|
||||
- No skill activation messages appear
|
||||
- Prompts process normally without injected context
|
||||
|
||||
### Diagnosis
|
||||
|
||||
**Step 1: Check hook configuration**
|
||||
```bash
|
||||
cat ~/.claude/hooks.json
|
||||
```
|
||||
|
||||
Should contain:
|
||||
```json
|
||||
{
|
||||
"hooks": [
|
||||
{
|
||||
"event": "UserPromptSubmit",
|
||||
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Test hook manually**
|
||||
```bash
|
||||
echo '{"text": "test backend endpoint"}' | \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
Should output JSON with `decision` and possibly `additionalContext`.
|
||||
|
||||
**Step 3: Check file permissions**
|
||||
```bash
|
||||
ls -l ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
Should be executable (`-rwxr-xr-x`). If not:
|
||||
```bash
|
||||
chmod +x ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
**Step 4: Check Claude Code logs**
|
||||
```bash
|
||||
tail -f ~/.claude/logs/hooks.log
|
||||
```
|
||||
|
||||
Look for errors related to skill-activator.
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Reinstall hook**
|
||||
```bash
|
||||
mkdir -p ~/.claude/hooks/user-prompt-submit
|
||||
cp skill-activator.js ~/.claude/hooks/user-prompt-submit/
|
||||
chmod +x ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
**Solution 2: Verify Node.js**
|
||||
```bash
|
||||
which node
|
||||
node --version
|
||||
```
|
||||
|
||||
Ensure Node.js is installed and in PATH.
|
||||
|
||||
**Solution 3: Check hook timeout**
|
||||
```json
|
||||
{
|
||||
"hooks": [
|
||||
{
|
||||
"event": "UserPromptSubmit",
|
||||
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js",
|
||||
"timeout": 2000 // Increase if needed
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Problem: No Skills Activating
|
||||
|
||||
### Symptoms
|
||||
- Hook runs successfully
|
||||
- No skills appear in activation messages
|
||||
- Debug shows "No skills activated"
|
||||
|
||||
### Diagnosis
|
||||
|
||||
**Enable debug mode:**
|
||||
```bash
|
||||
DEBUG=true echo '{"text": "your test prompt"}' | \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
|
||||
```
|
||||
|
||||
**Check for:**
|
||||
- "No rules loaded" → skill-rules.json not found
|
||||
- "No skills activated" → Keywords/patterns don't match
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Verify skill-rules.json location**
|
||||
```bash
|
||||
cat ~/.claude/skill-rules.json
|
||||
```
|
||||
|
||||
If not found:
|
||||
```bash
|
||||
cp skill-rules.json ~/.claude/skill-rules.json
|
||||
```
|
||||
|
||||
**Solution 2: Test with known keyword**
|
||||
```bash
|
||||
echo '{"text": "create backend controller"}' | \
|
||||
SKILL_RULES=~/.claude/skill-rules.json \
|
||||
DEBUG=true \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
|
||||
```
|
||||
|
||||
Should match "backend-dev-guidelines" if configured.
|
||||
|
||||
**Solution 3: Check JSON syntax**
|
||||
```bash
|
||||
cat ~/.claude/skill-rules.json | jq '.'
|
||||
```
|
||||
|
||||
If errors, fix JSON syntax.
|
||||
|
||||
**Solution 4: Simplify rules for testing**
|
||||
```json
|
||||
{
|
||||
"test-skill": {
|
||||
"type": "test",
|
||||
"priority": "high",
|
||||
"promptTriggers": {
|
||||
"keywords": ["test"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Test with:
|
||||
```bash
|
||||
echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
## Problem: Wrong Skills Activating
|
||||
|
||||
### Symptoms
|
||||
- Skills activate on irrelevant prompts
|
||||
- Too many false positives
|
||||
|
||||
### Diagnosis
|
||||
|
||||
**Enable debug to see why skills matched:**
|
||||
```bash
|
||||
DEBUG=true echo '{"text": "your prompt"}' | \
|
||||
node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
|
||||
```
|
||||
|
||||
Look for "Matched: keyword" or "Matched: intent pattern" to see why.
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Tighten keywords**
|
||||
|
||||
Before:
|
||||
```json
|
||||
{
|
||||
"keywords": ["api", "test", "code"]
|
||||
}
|
||||
```
|
||||
|
||||
After (more specific):
|
||||
```json
|
||||
{
|
||||
"keywords": ["API endpoint", "integration test", "refactor code"]
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 2: Use negative patterns**
|
||||
```json
|
||||
{
|
||||
"intentPatterns": [
|
||||
"(?!.*test).*backend" // Match "backend" but not if "test" in prompt
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 3: Increase priority thresholds**
|
||||
```json
|
||||
{
|
||||
"test-skill": {
|
||||
"priority": "low" // Will be deprioritized if others match
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 4: Reduce maxSkills**
|
||||
|
||||
In skill-activator.js:
|
||||
```javascript
|
||||
const CONFIG = {
|
||||
maxSkills: 2, // Reduce from 3
|
||||
};
|
||||
```
|
||||
|
||||
## Problem: Hook Is Slow
|
||||
|
||||
### Symptoms
|
||||
- Noticeable delay before Claude responds
|
||||
- Hook takes >1 second
|
||||
|
||||
### Diagnosis
|
||||
|
||||
**Measure hook performance:**
|
||||
```bash
|
||||
time echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
```
|
||||
|
||||
Should be <500ms. If slower, diagnose:
|
||||
|
||||
**Check number of rules:**
|
||||
```bash
|
||||
cat ~/.claude/skill-rules.json | jq 'keys | length'
|
||||
```
|
||||
|
||||
More than 10 rules may slow down.
|
||||
|
||||
**Check pattern complexity:**
|
||||
```bash
|
||||
cat ~/.claude/skill-rules.json | jq '.[].promptTriggers.intentPatterns'
|
||||
```
|
||||
|
||||
Complex regex patterns slow matching.
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Optimize regex patterns**
|
||||
|
||||
Before (slow):
|
||||
```json
|
||||
{
|
||||
"intentPatterns": [
|
||||
".*create.*backend.*endpoint.*"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
After (faster):
|
||||
```json
|
||||
{
|
||||
"intentPatterns": [
|
||||
"(create|build).*(backend|API).*(endpoint|route)"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 2: Cache compiled patterns**
|
||||
|
||||
Modify hook to compile patterns once:
|
||||
```javascript
|
||||
const compiledPatterns = new Map();
|
||||
|
||||
function getCompiledPattern(pattern) {
|
||||
if (!compiledPatterns.has(pattern)) {
|
||||
compiledPatterns.set(pattern, new RegExp(pattern, 'i'));
|
||||
}
|
||||
return compiledPatterns.get(pattern);
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 3: Reduce number of rules**
|
||||
|
||||
Remove low-priority or rarely-used skills.
|
||||
|
||||
**Solution 4: Parallelize pattern matching**
|
||||
|
||||
For advanced users, use worker threads to match patterns in parallel.
|
||||
|
||||
## Problem: Skills Still Don't Activate in Claude
|
||||
|
||||
### Symptoms
|
||||
- Hook injects activation message
|
||||
- Claude still doesn't use the skills
|
||||
|
||||
### Diagnosis
|
||||
|
||||
This means the hook is working, but Claude is ignoring the suggestion.
|
||||
|
||||
**Check:**
|
||||
1. Are skills actually installed?
|
||||
2. Does Claude have access to read skills?
|
||||
3. Are skill descriptions clear?
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Make activation message stronger**
|
||||
|
||||
In skill-activator.js, change:
|
||||
```javascript
|
||||
'Before responding, check if any of these skills should be used.'
|
||||
```
|
||||
|
||||
To:
|
||||
```javascript
|
||||
'⚠️ IMPORTANT: You MUST check these skills before responding. Use the Skill tool to load them.'
|
||||
```
|
||||
|
||||
**Solution 2: Block until skills loaded**
|
||||
|
||||
Change hook to blocking mode (use cautiously):
|
||||
```json
|
||||
{
|
||||
"hooks": [
|
||||
{
|
||||
"event": "UserPromptSubmit",
|
||||
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js",
|
||||
"blocking": true // ⚠️ Experimental
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 3: Improve skill descriptions**
|
||||
|
||||
Ensure skill descriptions are specific:
|
||||
```yaml
|
||||
name: backend-dev-guidelines
|
||||
description: Use when creating API routes, controllers, services, or repositories - enforces TypeScript patterns, Prisma repository pattern, and Sentry error handling
|
||||
```
|
||||
|
||||
**Solution 4: Reference skills in CLAUDE.md**
|
||||
|
||||
Add to project's CLAUDE.md:
|
||||
```markdown
|
||||
## Available Skills
|
||||
|
||||
- backend-dev-guidelines: Use for all backend code
|
||||
- frontend-dev-guidelines: Use for all frontend code
|
||||
- hyperpowers:test-driven-development: Use when writing tests
|
||||
```
|
||||
|
||||
## Problem: Hook Crashes Claude Code
|
||||
|
||||
### Symptoms
|
||||
- Claude Code freezes or crashes after hook execution
|
||||
- Error in hooks.log
|
||||
|
||||
### Diagnosis
|
||||
|
||||
**Check error logs:**
|
||||
```bash
|
||||
tail -50 ~/.claude/logs/hooks.log
|
||||
```
|
||||
|
||||
Look for errors related to skill-activator.
|
||||
|
||||
**Common causes:**
|
||||
- Infinite loop in hook
|
||||
- Memory leak
|
||||
- Unhandled promise rejection
|
||||
- Blocking operation
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Add error handling**
|
||||
```javascript
|
||||
async function main() {
|
||||
try {
|
||||
// ... hook logic
|
||||
} catch (error) {
|
||||
console.error('Hook error:', error.message);
|
||||
// Always return approve on error
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 2: Add timeout protection**
|
||||
```javascript
|
||||
const timeout = setTimeout(() => {
|
||||
console.log(JSON.stringify({ decision: 'approve' }));
|
||||
process.exit(0);
|
||||
}, 900); // Exit before hook timeout
|
||||
|
||||
// Clear timeout if completed normally
|
||||
clearTimeout(timeout);
|
||||
```
|
||||
|
||||
**Solution 3: Test hook in isolation**
|
||||
```bash
|
||||
# Run hook with various inputs
|
||||
for prompt in "test" "backend" "frontend" "debug"; do
|
||||
echo "Testing: $prompt"
|
||||
echo "{\"text\": \"$prompt\"}" | \
|
||||
timeout 2s node ~/.claude/hooks/user-prompt-submit/skill-activator.js
|
||||
done
|
||||
```
|
||||
|
||||
**Solution 4: Simplify hook**
|
||||
|
||||
Remove complex logic and test minimal version:
|
||||
```javascript
|
||||
// Minimal hook for testing
|
||||
console.log(JSON.stringify({
|
||||
decision: 'approve',
|
||||
additionalContext: '🎯 Test message'
|
||||
}));
|
||||
```
|
||||
|
||||
## Problem: Context Overload
|
||||
|
||||
### Symptoms
|
||||
- Too many skill activation messages
|
||||
- Context window fills quickly
|
||||
- Claude seems overwhelmed
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Limit activated skills**
|
||||
```javascript
|
||||
const CONFIG = {
|
||||
maxSkills: 1, // Only top match
|
||||
};
|
||||
```
|
||||
|
||||
**Solution 2: Use priorities strictly**
|
||||
```json
|
||||
{
|
||||
"critical-skill": {
|
||||
"priority": "high" // Only high priority
|
||||
},
|
||||
"optional-skill": {
|
||||
"priority": "low" // Remove low priority
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 3: Shorten activation message**
|
||||
```javascript
|
||||
function generateContext(skills) {
|
||||
return `🎯 Use: ${skills.map(s => s.skill).join(', ')}`;
|
||||
}
|
||||
```
|
||||
|
||||
## Problem: Inconsistent Activation
|
||||
|
||||
### Symptoms
|
||||
- Sometimes activates, sometimes doesn't
|
||||
- Same prompt gives different results
|
||||
|
||||
### Diagnosis
|
||||
|
||||
**This is expected due to:**
|
||||
- Prompt variations (punctuation, wording)
|
||||
- Context differences (files being edited)
|
||||
- Keyword order
|
||||
|
||||
### Solutions
|
||||
|
||||
**Solution 1: Add keyword variations**
|
||||
```json
|
||||
{
|
||||
"keywords": [
|
||||
"backend",
|
||||
"back end",
|
||||
"back-end",
|
||||
"server side",
|
||||
"server-side"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 2: Use more patterns**
|
||||
```json
|
||||
{
|
||||
"intentPatterns": [
|
||||
"create.*backend",
|
||||
"backend.*create",
|
||||
"build.*API",
|
||||
"API.*build"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Solution 3: Log all prompts for analysis**
|
||||
```javascript
|
||||
// Add to hook
|
||||
fs.appendFileSync(
|
||||
path.join(process.env.HOME, '.claude/prompt-log.txt'),
|
||||
`${new Date().toISOString()} | ${prompt.text}\n`
|
||||
);
|
||||
```
|
||||
|
||||
Analyze monthly:
|
||||
```bash
|
||||
grep "backend" ~/.claude/prompt-log.txt | wc -l
|
||||
```
|
||||
|
||||
## General Debugging Tips
|
||||
|
||||
**Enable full debug logging:**
|
||||
```bash
|
||||
# Add to hook
|
||||
const logFile = path.join(process.env.HOME, '.claude/hook-debug.log');
|
||||
function debug(msg) {
|
||||
fs.appendFileSync(logFile, `${new Date().toISOString()} ${msg}\n`);
|
||||
}
|
||||
|
||||
debug(`Analyzing prompt: ${prompt.text}`);
|
||||
debug(`Activated skills: ${activatedSkills.map(s => s.skill).join(', ')}`);
|
||||
```
|
||||
|
||||
**Test with controlled inputs:**
|
||||
```bash
|
||||
# Create test suite
|
||||
cat > test-prompts.json <<EOF
|
||||
[
|
||||
{"text": "create backend endpoint", "expected": ["backend-dev-guidelines"]},
|
||||
{"text": "build react component", "expected": ["frontend-dev-guidelines"]},
|
||||
{"text": "write test for API", "expected": ["test-driven-development"]}
|
||||
]
|
||||
EOF
|
||||
|
||||
# Run tests
|
||||
node test-hook.js
|
||||
```
|
||||
|
||||
**Monitor in production:**
|
||||
```bash
|
||||
# Daily summary
|
||||
grep "Activated skills" ~/.claude/hook-debug.log | \
|
||||
grep "$(date +%Y-%m-%d)" | \
|
||||
sort | uniq -c
|
||||
```
|
||||
|
||||
## Getting Help
|
||||
|
||||
If problems persist:
|
||||
|
||||
1. Check GitHub issues for similar problems
|
||||
2. Share debug output (sanitize sensitive info)
|
||||
3. Test with minimal configuration
|
||||
4. Verify with official examples
|
||||
|
||||
**Checklist before asking for help:**
|
||||
- [ ] Hook runs manually without errors
|
||||
- [ ] skill-rules.json is valid JSON
|
||||
- [ ] Node.js version is current (v18+)
|
||||
- [ ] Debug mode shows expected behavior
|
||||
- [ ] Tested with simplified configuration
|
||||
- [ ] Checked Claude Code logs
|
||||
826
skills/sre-task-refinement/SKILL.md
Normal file
826
skills/sre-task-refinement/SKILL.md
Normal file
@@ -0,0 +1,826 @@
|
||||
---
|
||||
name: sre-task-refinement
|
||||
description: Use when you have to refine subtasks into actionable plans ensuring that all corner cases are handled and we understand all the requirements.
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Review bd task plans with Google Fellow SRE perspective to ensure junior engineer can execute without questions; catch edge cases, verify granularity, strengthen criteria, prevent production issues before implementation.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - Follow the 7-category checklist exactly. Apply all categories to every task. No skipping red flag checks. Always verify no placeholder text after updates. Reject plans with critical gaps.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Category | Key Questions | Auto-Reject If |
|
||||
|----------|---------------|----------------|
|
||||
| 1. Granularity | Tasks 4-8 hours? Phases <16 hours? | Any task >16h without breakdown |
|
||||
| 2. Implementability | Junior can execute without questions? | Vague language, missing details |
|
||||
| 3. Success Criteria | 3+ measurable criteria per task? | Can't verify ("works well") |
|
||||
| 4. Dependencies | Correct parent-child, blocking relationships? | Circular dependencies |
|
||||
| 5. Safety Standards | Anti-patterns specified? Error handling? | No anti-patterns section |
|
||||
| 6. Edge Cases | Empty input? Unicode? Concurrency? Failures? | No edge case consideration |
|
||||
| 7. Red Flags | Placeholder text? Vague instructions? | "[detailed above]", "TODO" |
|
||||
|
||||
**Perspective**: Google Fellow SRE with 20+ years experience reviewing junior engineer designs.
|
||||
|
||||
**Time**: Don't rush - catching one gap pre-implementation saves hours of rework.
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
Use when:
|
||||
- Reviewing bd epic/feature plans before implementation
|
||||
- Need to ensure junior engineer can execute without questions
|
||||
- Want to catch edge cases and failure modes upfront
|
||||
- Need to verify task granularity (4-8 hour subtasks)
|
||||
- After hyperpowers:writing-plans creates initial plan
|
||||
- Before hyperpowers:executing-plans starts implementation
|
||||
|
||||
Don't use when:
|
||||
- Task already being implemented (too late)
|
||||
- Just need to understand existing code (use codebase-investigator)
|
||||
- Debugging issues (use debugging-with-tools)
|
||||
- Want to create plan from scratch (use brainstorming → writing-plans)
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## Announcement
|
||||
|
||||
**Announce:** "I'm using hyperpowers:sre-task-refinement to review this plan with Google Fellow-level scrutiny."
|
||||
|
||||
---
|
||||
|
||||
## Review Checklist (Apply to Every Task)
|
||||
|
||||
### 1. Task Granularity
|
||||
|
||||
**Check:**
|
||||
- [ ] No task >8 hours (subtasks) or >16 hours (phases)?
|
||||
- [ ] Large phases broken into 4-8 hour subtasks?
|
||||
- [ ] Each subtask independently completable?
|
||||
- [ ] Each subtask has clear deliverable?
|
||||
|
||||
**If task >16 hours:**
|
||||
- Create subtasks with `bd create`
|
||||
- Link with `bd dep add child parent --type parent-child`
|
||||
- Update parent to coordinator role
|
||||
|
||||
---
|
||||
|
||||
### 2. Implementability (Junior Engineer Test)
|
||||
|
||||
**Check:**
|
||||
- [ ] Can junior engineer implement without asking questions?
|
||||
- [ ] Function signatures/behaviors described, not just "implement X"?
|
||||
- [ ] Test scenarios described (what they verify, not just names)?
|
||||
- [ ] "Done" clearly defined with verifiable criteria?
|
||||
- [ ] All file paths specified or marked "TBD: new file"?
|
||||
|
||||
**Red flags:**
|
||||
- "Implement properly" (how?)
|
||||
- "Add support" (for what exactly?)
|
||||
- "Make it work" (what does working mean?)
|
||||
- File paths missing or ambiguous
|
||||
|
||||
---
|
||||
|
||||
### 3. Success Criteria Quality
|
||||
|
||||
**Check:**
|
||||
- [ ] Each task has 3+ specific, measurable success criteria?
|
||||
- [ ] All criteria testable/verifiable (not subjective)?
|
||||
- [ ] Includes automated verification (tests pass, clippy clean)?
|
||||
- [ ] No vague criteria like "works well" or "is implemented"?
|
||||
|
||||
**Good criteria examples:**
|
||||
- ✅ "5+ unit tests pass (valid VIN, invalid checksum, various formats)"
|
||||
- ✅ "Clippy clean with no warnings"
|
||||
- ✅ "Performance: <100ms for 1000 records"
|
||||
|
||||
**Bad criteria examples:**
|
||||
- ❌ "Code is good quality"
|
||||
- ❌ "Works correctly"
|
||||
- ❌ "Is implemented"
|
||||
|
||||
---
|
||||
|
||||
### 4. Dependency Structure
|
||||
|
||||
**Check:**
|
||||
- [ ] Parent-child relationships correct (epic → phases → subtasks)?
|
||||
- [ ] Blocking dependencies correct (earlier work blocks later)?
|
||||
- [ ] No circular dependencies?
|
||||
- [ ] Dependency graph makes logical sense?
|
||||
|
||||
**Verify with:**
|
||||
```bash
|
||||
bd dep tree bd-1 # Show full dependency tree
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 5. Safety & Quality Standards
|
||||
|
||||
**Check:**
|
||||
- [ ] Anti-patterns include unwrap/expect prohibition?
|
||||
- [ ] Anti-patterns include TODO prohibition (or must have issue #)?
|
||||
- [ ] Anti-patterns include stub implementation prohibition?
|
||||
- [ ] Error handling requirements specified (use Result, avoid panic)?
|
||||
- [ ] Test requirements specific (test names, scenarios listed)?
|
||||
|
||||
**Minimum anti-patterns:**
|
||||
- ❌ No unwrap/expect in production code
|
||||
- ❌ No TODOs without issue numbers
|
||||
- ❌ No stub implementations (unimplemented!, todo!)
|
||||
- ❌ No regex without catastrophic backtracking check
|
||||
|
||||
---
|
||||
|
||||
### 6. Edge Cases & Failure Modes (Fellow SRE Perspective)
|
||||
|
||||
**Ask for each task:**
|
||||
- [ ] What happens with malformed input?
|
||||
- [ ] What happens with empty/nil/zero values?
|
||||
- [ ] What happens under high load/concurrency?
|
||||
- [ ] What happens when dependencies fail?
|
||||
- [ ] What happens with Unicode, special characters, large inputs?
|
||||
- [ ] Are these edge cases addressed in the plan?
|
||||
|
||||
**Add to Key Considerations section:**
|
||||
- Edge case descriptions
|
||||
- Mitigation strategies
|
||||
- References to similar code handling these cases
|
||||
|
||||
---
|
||||
|
||||
### 7. Red Flags (AUTO-REJECT)
|
||||
|
||||
**Check for these - if found, REJECT plan:**
|
||||
- ❌ Any task >16 hours without subtask breakdown
|
||||
- ❌ Vague language: "implement properly", "add support", "make it work"
|
||||
- ❌ Success criteria that can't be verified: "code is good", "works well"
|
||||
- ❌ Missing test specifications
|
||||
- ❌ "We'll handle this later" or "TODO" in the plan itself
|
||||
- ❌ No anti-patterns section
|
||||
- ❌ Implementation checklist with fewer than 3 items per task
|
||||
- ❌ No effort estimates
|
||||
- ❌ Missing error handling considerations
|
||||
- ❌ **CRITICAL: Placeholder text in design field** - "[detailed above]", "[as specified]", "[complete steps here]"
|
||||
|
||||
---
|
||||
|
||||
## Review Process
|
||||
|
||||
For each task in the plan:
|
||||
|
||||
**Step 1: Read the task**
|
||||
```bash
|
||||
bd show bd-3
|
||||
```
|
||||
|
||||
**Step 2: Apply all 7 checklist categories**
|
||||
- Task Granularity
|
||||
- Implementability
|
||||
- Success Criteria Quality
|
||||
- Dependency Structure
|
||||
- Safety & Quality Standards
|
||||
- Edge Cases & Failure Modes
|
||||
- Red Flags
|
||||
|
||||
**Step 3: Document findings**
|
||||
Take notes:
|
||||
- What's done well
|
||||
- What's missing
|
||||
- What's vague or ambiguous
|
||||
- Hidden failure modes not addressed
|
||||
- Better approaches or simplifications
|
||||
|
||||
**Step 4: Update the task**
|
||||
|
||||
Use `bd update` to add missing information:
|
||||
|
||||
```bash
|
||||
bd update bd-3 --design "$(cat <<'EOF'
|
||||
## Goal
|
||||
[Original goal, preserved]
|
||||
|
||||
## Effort Estimate
|
||||
[Updated estimate if needed]
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Existing criteria
|
||||
- [ ] NEW: Added missing measurable criteria
|
||||
|
||||
## Implementation Checklist
|
||||
[Complete checklist with file paths]
|
||||
|
||||
## Key Considerations (ADDED BY SRE REVIEW)
|
||||
|
||||
**Edge Case: Empty Input**
|
||||
- What happens when input is empty string?
|
||||
- MUST validate input length before processing
|
||||
|
||||
**Edge Case: Unicode Handling**
|
||||
- What if string contains RTL or surrogate pairs?
|
||||
- Use proper Unicode-aware string methods
|
||||
|
||||
**Performance Concern: Regex Backtracking**
|
||||
- Pattern `.*[a-z]+.*` has catastrophic backtracking risk
|
||||
- MUST test with pathological inputs (e.g., 10000 'a's)
|
||||
- Use possessive quantifiers or bounded repetition
|
||||
|
||||
**Reference Implementation**
|
||||
- Study src/similar/module.rs for pattern to follow
|
||||
|
||||
## Anti-patterns
|
||||
[Original anti-patterns]
|
||||
- ❌ NEW: Specific anti-pattern for this task's risks
|
||||
EOF
|
||||
)"
|
||||
```
|
||||
|
||||
**IMPORTANT:** Use `--design` for full detailed description, NOT `--description` (title only).
|
||||
|
||||
**Step 5: Verify no placeholder text (MANDATORY)**
|
||||
|
||||
After updating, read back with `bd show bd-N` and verify:
|
||||
- ✅ All sections contain actual content, not meta-references
|
||||
- ✅ No placeholder text like "[detailed above]", "[as specified]", "[will be added]"
|
||||
- ✅ Implementation steps fully written with actual code examples
|
||||
- ✅ Success criteria explicit, not referencing "criteria above"
|
||||
- ❌ If ANY placeholder text found: REJECT and rewrite with actual content
|
||||
|
||||
---
|
||||
|
||||
## Breaking Down Large Tasks
|
||||
|
||||
If task >16 hours, create subtasks:
|
||||
|
||||
```bash
|
||||
# Create first subtask
|
||||
bd create "Subtask 1: [Specific Component]" \
|
||||
--type task \
|
||||
--priority 1 \
|
||||
--design "[Complete subtask design with all 7 categories addressed]"
|
||||
# Returns bd-10
|
||||
|
||||
# Create second subtask
|
||||
bd create "Subtask 2: [Another Component]" \
|
||||
--type task \
|
||||
--priority 1 \
|
||||
--design "[Complete subtask design]"
|
||||
# Returns bd-11
|
||||
|
||||
# Link subtasks to parent with parent-child relationship
|
||||
bd dep add bd-10 bd-3 --type parent-child # bd-10 is child of bd-3
|
||||
bd dep add bd-11 bd-3 --type parent-child # bd-11 is child of bd-3
|
||||
|
||||
# Add sequential dependencies if needed (LATER depends on EARLIER)
|
||||
bd dep add bd-11 bd-10 # bd-11 depends on bd-10 (do bd-10 first)
|
||||
|
||||
# Update parent to coordinator
|
||||
bd update bd-3 --design "$(cat <<'EOF'
|
||||
## Goal
|
||||
Coordinate implementation of [feature]. Broken into N subtasks.
|
||||
|
||||
## Success Criteria
|
||||
- [ ] All N child subtasks closed
|
||||
- [ ] Integration tests pass
|
||||
- [ ] [High-level verification criteria]
|
||||
EOF
|
||||
)"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Output Format
|
||||
|
||||
After reviewing all tasks:
|
||||
|
||||
```markdown
|
||||
## Plan Review Results
|
||||
|
||||
### Epic: [Name] ([epic-id])
|
||||
|
||||
### Overall Assessment
|
||||
[APPROVE ✅ / NEEDS REVISION ⚠️ / REJECT ❌]
|
||||
|
||||
### Dependency Structure Review
|
||||
[Output of `bd dep tree [epic-id]`]
|
||||
|
||||
**Structure Quality**: [✅ Correct / ❌ Issues found]
|
||||
- [Comments on parent-child relationships]
|
||||
- [Comments on blocking dependencies]
|
||||
- [Comments on granularity]
|
||||
|
||||
### Task-by-Task Review
|
||||
|
||||
#### [Task Name] (bd-N)
|
||||
**Type**: [epic/feature/task]
|
||||
**Status**: [✅ Ready / ⚠️ Needs Minor Improvements / ❌ Needs Major Revision]
|
||||
**Estimated Effort**: [X hours] ([✅ Good / ❌ Too large - needs breakdown])
|
||||
|
||||
**Strengths**:
|
||||
- [What's done well]
|
||||
|
||||
**Critical Issues** (must fix):
|
||||
- [Blocking problems]
|
||||
|
||||
**Improvements Needed**:
|
||||
- [What to add/clarify]
|
||||
|
||||
**Edge Cases Missing**:
|
||||
- [Failure modes not addressed]
|
||||
|
||||
**Changes Made**:
|
||||
- [Specific improvements added via `bd update`]
|
||||
|
||||
---
|
||||
|
||||
[Repeat for each task/phase/subtask]
|
||||
|
||||
### Summary of Changes
|
||||
|
||||
**Issues Updated**:
|
||||
- bd-3 - Added edge case handling for Unicode, regex backtracking risks
|
||||
- bd-5 - Broke into 3 subtasks (was 40 hours, now 3x8 hours)
|
||||
- bd-7 - Strengthened success criteria (added test names, verification commands)
|
||||
|
||||
### Critical Gaps Across Plan
|
||||
1. [Pattern of missing items across multiple tasks]
|
||||
2. [Systemic issues in the plan]
|
||||
|
||||
### Recommendations
|
||||
|
||||
[If APPROVE]:
|
||||
✅ Plan is solid and ready for implementation.
|
||||
- All tasks are junior-engineer implementable
|
||||
- Dependency structure is correct
|
||||
- Edge cases and failure modes addressed
|
||||
|
||||
[If NEEDS REVISION]:
|
||||
⚠️ Plan needs improvements before implementation:
|
||||
- [List major items that need addressing]
|
||||
- After changes, re-run hyperpowers:sre-task-refinement
|
||||
|
||||
[If REJECT]:
|
||||
❌ Plan has fundamental issues and needs redesign:
|
||||
- [Critical problems]
|
||||
```
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer reviews task but skips edge case analysis (Category 6)</scenario>
|
||||
|
||||
<code>
|
||||
# Review of bd-3: Implement VIN scanner
|
||||
|
||||
## Checklist review:
|
||||
1. Granularity: ✅ 6-8 hours
|
||||
2. Implementability: ✅ Junior can implement
|
||||
3. Success Criteria: ✅ Has 5 test scenarios
|
||||
4. Dependencies: ✅ Correct
|
||||
5. Safety Standards: ✅ Anti-patterns present
|
||||
6. Edge Cases: [SKIPPED - "looks straightforward"]
|
||||
7. Red Flags: ✅ None found
|
||||
|
||||
Conclusion: "Task looks good, approve ✅"
|
||||
|
||||
# Task ships without edge case review
|
||||
# Production issues occur:
|
||||
- VIN scanner matches random 17-char strings (no checksum validation)
|
||||
- Lowercase VINs not handled (should normalize)
|
||||
- Catastrophic regex backtracking on long inputs (DoS vulnerability)
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Skipped Category 6 (Edge Cases) assuming task was "straightforward"
|
||||
- Didn't ask: What happens with invalid checksums? Lowercase? Long inputs?
|
||||
- Missed critical production issues:
|
||||
- False positives (no checksum validation)
|
||||
- Data handling bugs (case sensitivity)
|
||||
- Security vulnerability (regex DoS)
|
||||
- Junior engineer didn't know to handle these (not in task)
|
||||
- Production incidents occur after deployment
|
||||
- Hours of emergency fixes, customer impact
|
||||
- SRE review failed to prevent known failure modes
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Apply Category 6 rigorously:**
|
||||
|
||||
```markdown
|
||||
## Edge Case Analysis for bd-3: VIN Scanner
|
||||
|
||||
Ask for EVERY task:
|
||||
- Malformed input? VIN has checksum - must validate, not just pattern match
|
||||
- Empty/nil? What if empty string passed?
|
||||
- Concurrency? Read-only scanner, no concurrency issues
|
||||
- Dependency failures? No external dependencies
|
||||
- Unicode/special chars? VIN is alphanumeric only, but what about lowercase?
|
||||
- Large inputs? Regex `.*` patterns can cause catastrophic backtracking
|
||||
|
||||
Findings:
|
||||
❌ VIN checksum validation not mentioned (will match random strings)
|
||||
❌ Case normalization not mentioned (lowercase VINs exist)
|
||||
❌ Regex backtracking risk not mentioned (DoS vulnerability)
|
||||
```
|
||||
|
||||
**Update task:**
|
||||
```bash
|
||||
bd update bd-3 --design "$(cat <<'EOF'
|
||||
[... original content ...]
|
||||
|
||||
## Key Considerations (ADDED BY SRE REVIEW)
|
||||
|
||||
**VIN Checksum Complexity**:
|
||||
- ISO 3779 requires transliteration table (letters → numbers)
|
||||
- Weighted sum algorithm with modulo 11
|
||||
- Reference: https://en.wikipedia.org/wiki/Vehicle_identification_number#Check_digit
|
||||
- MUST validate checksum, not just pattern - prevents false positives
|
||||
|
||||
**Case Normalization**:
|
||||
- VINs can appear in lowercase
|
||||
- MUST normalize to uppercase before validation
|
||||
- Test with mixed case: "1hgbh41jxmn109186"
|
||||
|
||||
**Regex Backtracking Risk**:
|
||||
- CRITICAL: Pattern `.*[A-HJ-NPR-Z0-9]{17}.*` has backtracking risk
|
||||
- Test with pathological input: 10000 'X's followed by 16-char string
|
||||
- Use possessive quantifiers or bounded repetition
|
||||
- Reference: https://www.regular-expressions.info/catastrophic.html
|
||||
|
||||
**Edge Cases to Test**:
|
||||
- Valid VIN with valid checksum (should match)
|
||||
- Valid pattern but invalid checksum (should NOT match)
|
||||
- Lowercase VIN (should normalize and validate)
|
||||
- Ambiguous chars I/O not valid in VIN (should reject)
|
||||
- Very long input (should not DoS)
|
||||
EOF
|
||||
)"
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Prevented false positives (checksum validation)
|
||||
- Prevented data handling bugs (case normalization)
|
||||
- Prevented security vulnerability (regex DoS)
|
||||
- Junior engineer has complete requirements
|
||||
- Production issues caught pre-implementation
|
||||
- Proper SRE review preventing known failure modes
|
||||
- Customer trust maintained
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer approves task with placeholder text (Red Flag #10)</scenario>
|
||||
|
||||
<code>
|
||||
# Review of bd-5: Implement License Plate Scanner
|
||||
|
||||
bd show bd-5:
|
||||
|
||||
## Implementation Checklist
|
||||
- [ ] Create scanner module
|
||||
- [ ] [Complete implementation steps detailed above]
|
||||
- [ ] Add tests
|
||||
|
||||
## Success Criteria
|
||||
- [ ] [As specified in the implementation checklist]
|
||||
- [ ] Tests pass
|
||||
|
||||
## Key Considerations
|
||||
- [Will be added during implementation]
|
||||
|
||||
# Developer's review:
|
||||
"Looks comprehensive, has implementation checklist and success criteria ✅"
|
||||
|
||||
# During implementation:
|
||||
Junior engineer: "What are the 'implementation steps detailed above'?"
|
||||
Junior engineer: "What specific success criteria should I verify?"
|
||||
Junior engineer: "What key considerations exist?"
|
||||
|
||||
# No answers in the task - junior engineer blocked
|
||||
# Have to research and add missing information
|
||||
# Implementation delayed by 2 days
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Missed Red Flag #10: Placeholder text present
|
||||
- "[Complete implementation steps detailed above]" is meta-reference, not content
|
||||
- "[As specified in the implementation checklist]" is circular reference
|
||||
- "[Will be added during implementation]" is deferral, not specification
|
||||
- Junior engineer can't execute - missing critical information
|
||||
- Task looks complete but actually incomplete
|
||||
- Implementation blocked until details added
|
||||
- SRE review failed to catch placeholder text
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Check for placeholder text after reading:**
|
||||
|
||||
```markdown
|
||||
## Red Flag Check (Category 7)
|
||||
|
||||
Read through bd-5 line by line:
|
||||
|
||||
Line 15: "[Complete implementation steps detailed above]"
|
||||
❌ PLACEHOLDER - "detailed above" is meta-reference, not actual content
|
||||
|
||||
Line 22: "[As specified in the implementation checklist]"
|
||||
❌ PLACEHOLDER - Circular reference to another section, not explicit criteria
|
||||
|
||||
Line 30: "[Will be added during implementation]"
|
||||
❌ PLACEHOLDER - Deferral to future, not actual considerations
|
||||
|
||||
DECISION: REJECT ❌
|
||||
Reason: Contains placeholder text - task not ready for implementation
|
||||
```
|
||||
|
||||
**Update task with actual content:**
|
||||
```bash
|
||||
bd update bd-5 --design "$(cat <<'EOF'
|
||||
## Implementation Checklist
|
||||
- [ ] Create src/scan/plugins/scanners/license_plate.rs
|
||||
- [ ] Implement LicensePlateScanner struct with ScanPlugin trait
|
||||
- [ ] Add regex patterns for US states:
|
||||
- CA: `[0-9][A-Z]{3}[0-9]{3}` (e.g., 1ABC123)
|
||||
- NY: `[A-Z]{3}[0-9]{4}` (e.g., ABC1234)
|
||||
- TX: `[A-Z]{3}[0-9]{4}|[0-9]{3}[A-Z]{3}` (e.g., ABC1234 or 123ABC)
|
||||
- Generic: `[A-Z0-9]{5,8}` (fallback)
|
||||
- [ ] Implement has_healthcare_context() check
|
||||
- [ ] Create test module with 8+ test cases
|
||||
- [ ] Register in src/scan/plugins/scanners/mod.rs
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Valid CA plate "1ABC123" detected in healthcare context
|
||||
- [ ] Valid NY plate "ABC1234" detected in healthcare context
|
||||
- [ ] Invalid plate "123" NOT detected (too short)
|
||||
- [ ] Valid plate NOT detected outside healthcare context
|
||||
- [ ] 8+ unit tests pass covering all patterns and edge cases
|
||||
- [ ] Clippy clean, no warnings
|
||||
- [ ] cargo test passes
|
||||
|
||||
## Key Considerations
|
||||
|
||||
**False Positive Risk**:
|
||||
- License plates are short and generic (5-8 chars)
|
||||
- MUST require healthcare context via has_healthcare_context()
|
||||
- Without context, will match random alphanumeric sequences
|
||||
- Test: Random string "ABC1234" should NOT match outside healthcare context
|
||||
|
||||
**State Format Variations**:
|
||||
- 50 US states have different formats
|
||||
- Implement common formats (CA, NY, TX) + generic fallback
|
||||
- Document which formats supported in module docstring
|
||||
- Consider international plates in future iteration
|
||||
|
||||
**Performance**:
|
||||
- Regex patterns are simple, no backtracking risk
|
||||
- Should process <1ms per chunk
|
||||
|
||||
**Reference Implementation**:
|
||||
- Study src/scan/plugins/scanners/vehicle_identifier.rs
|
||||
- Follow same pattern: regex + context check + tests
|
||||
EOF
|
||||
)"
|
||||
```
|
||||
|
||||
**Verify no placeholder text:**
|
||||
```bash
|
||||
bd show bd-5
|
||||
# Read entire output
|
||||
# Confirm: All sections have actual content
|
||||
# Confirm: No "[detailed above]", "[as specified]", "[will be added]"
|
||||
# ✅ Task ready for implementation
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Junior engineer has complete specification
|
||||
- No blocked implementation waiting for details
|
||||
- All edge cases documented upfront
|
||||
- Success criteria explicit and verifiable
|
||||
- Key considerations prevent common mistakes
|
||||
- No placeholder text - task truly ready
|
||||
- Professional SRE review standard maintained
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer accepts vague success criteria (Category 3)</scenario>
|
||||
|
||||
<code>
|
||||
# Review of bd-7: Implement Data Encryption
|
||||
|
||||
bd show bd-7:
|
||||
|
||||
## Success Criteria
|
||||
- [ ] Encryption is implemented correctly
|
||||
- [ ] Code is good quality
|
||||
- [ ] Tests work properly
|
||||
|
||||
# Developer's review:
|
||||
"Has 3 success criteria ✅ Meets minimum requirement"
|
||||
|
||||
# During implementation:
|
||||
Junior engineer: "How do I know if encryption is 'correct'?"
|
||||
Junior engineer: "What makes code 'good quality'?"
|
||||
Junior engineer: "What does 'tests work properly' mean?"
|
||||
|
||||
# Junior engineer makes best guesses:
|
||||
- Uses ECB mode (insecure, should use GCM)
|
||||
- No key rotation (bad practice)
|
||||
- Tests only happy path (misses edge cases)
|
||||
|
||||
# Code review finds critical security issues
|
||||
# Complete rewrite required - 3 days wasted
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Category 3 check: "All criteria testable/verifiable?" ❌ FAILED
|
||||
- "Implemented correctly" - not measurable (correct by what standard?)
|
||||
- "Good quality" - subjective, not verifiable
|
||||
- "Work properly" - vague, what is proper?
|
||||
- Junior engineer can't verify criteria objectively
|
||||
- Makes incorrect assumptions filling gaps
|
||||
- Security vulnerabilities introduced
|
||||
- Wastes time on implementation that fails review
|
||||
- SRE review failed to strengthen criteria
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Apply Category 3 rigorously:**
|
||||
|
||||
```markdown
|
||||
## Success Criteria Analysis for bd-7
|
||||
|
||||
Current criteria:
|
||||
- [ ] Encryption is implemented correctly
|
||||
❌ NOT TESTABLE - "correctly" is subjective, no standard specified
|
||||
|
||||
- [ ] Code is good quality
|
||||
❌ NOT TESTABLE - "good quality" is opinion, not measurable
|
||||
|
||||
- [ ] Tests work properly
|
||||
❌ NOT TESTABLE - "properly" is vague, no definition
|
||||
|
||||
Minimum requirement: 3+ specific, measurable, testable criteria
|
||||
Current: 0 testable criteria
|
||||
DECISION: REJECT ❌
|
||||
```
|
||||
|
||||
**Update with measurable criteria:**
|
||||
```bash
|
||||
bd update bd-7 --design "$(cat <<'EOF'
|
||||
[... original content ...]
|
||||
|
||||
## Success Criteria
|
||||
|
||||
**Encryption Implementation**:
|
||||
- [ ] Uses AES-256-GCM mode (verified in code review)
|
||||
- [ ] Key derivation via PBKDF2 with 100,000 iterations (NIST recommendation)
|
||||
- [ ] Unique IV generated per encryption (crypto_random)
|
||||
- [ ] Authentication tag verified on decryption
|
||||
|
||||
**Code Quality** (automated checks):
|
||||
- [ ] Clippy clean with no warnings: `cargo clippy -- -D warnings`
|
||||
- [ ] Rustfmt compliant: `cargo fmt --check`
|
||||
- [ ] No unwrap/expect in production: `rg "\.unwrap\(\)|\.expect\(" src/` returns 0
|
||||
- [ ] No TODOs without issue numbers: `rg "TODO" src/` returns 0
|
||||
|
||||
**Test Coverage**:
|
||||
- [ ] 12+ unit tests pass covering:
|
||||
- test_encrypt_decrypt_roundtrip (happy path)
|
||||
- test_wrong_key_fails_auth (security)
|
||||
- test_modified_ciphertext_fails_auth (security)
|
||||
- test_empty_plaintext (edge case)
|
||||
- test_large_plaintext_10mb (performance)
|
||||
- test_unicode_plaintext (data handling)
|
||||
- test_concurrent_encryption (thread safety)
|
||||
- test_iv_uniqueness (security)
|
||||
- [4 more specific scenarios]
|
||||
- [ ] All tests pass: `cargo test encryption`
|
||||
- [ ] Test coverage >90%: `cargo tarpaulin --packages encryption`
|
||||
|
||||
**Documentation**:
|
||||
- [ ] Module docstring explains encryption scheme (AES-256-GCM)
|
||||
- [ ] Function docstrings include examples
|
||||
- [ ] Security considerations documented (key management, IV handling)
|
||||
|
||||
**Security Review**:
|
||||
- [ ] No hardcoded keys or IVs (verified via grep)
|
||||
- [ ] Key zeroized after use (verified in code)
|
||||
- [ ] Constant-time comparison for auth tag (timing attack prevention)
|
||||
EOF
|
||||
)"
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Every criterion objectively verifiable
|
||||
- Junior engineer knows exactly what "done" means
|
||||
- Automated checks (clippy, fmt, grep) provide instant feedback
|
||||
- Specific test scenarios prevent missed edge cases
|
||||
- Security requirements explicit (GCM, PBKDF2, unique IV)
|
||||
- No ambiguity - can verify each criterion with command or code review
|
||||
- Professional SRE review standard: measurable, testable, specific
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Apply all 7 categories to every task** → No skipping any category for any task
|
||||
2. **Reject plans with placeholder text** → "[detailed above]", "[as specified]" = instant reject
|
||||
3. **Verify no placeholder after updates** → Read back with `bd show` and confirm actual content
|
||||
4. **Break tasks >16 hours** → Create subtasks, don't accept large tasks
|
||||
5. **Strengthen vague criteria** → "Works correctly" → measurable verification commands
|
||||
6. **Add edge cases to every task** → Empty? Unicode? Concurrency? Failures?
|
||||
7. **Never skip Category 6** → Edge case analysis prevents production issues
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Apply the full process.**
|
||||
|
||||
- "Task looks straightforward" (Edge cases hide in "straightforward" tasks)
|
||||
- "Has 3 criteria, meets minimum" (Criteria must be measurable, not just 3+ items)
|
||||
- "Placeholder text is just formatting" (Placeholders mean incomplete specification)
|
||||
- "Can handle edge cases during implementation" (Must specify upfront, not defer)
|
||||
- "Junior will figure it out" (Junior should NOT need to figure out - we specify)
|
||||
- "Too detailed, feels like micromanaging" (Detail prevents questions and rework)
|
||||
- "Taking too long to review" (One gap caught saves hours of rework)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before completing SRE review:
|
||||
|
||||
**Per task reviewed:**
|
||||
- [ ] Applied all 7 categories (Granularity, Implementability, Criteria, Dependencies, Safety, Edge Cases, Red Flags)
|
||||
- [ ] Checked for placeholder text in design field
|
||||
- [ ] Updated task with missing information via `bd update --design`
|
||||
- [ ] Verified updated task with `bd show` (no placeholders remain)
|
||||
- [ ] Broke down any task >16 hours into subtasks
|
||||
- [ ] Strengthened vague success criteria to measurable
|
||||
- [ ] Added edge case analysis to Key Considerations
|
||||
- [ ] Strengthened anti-patterns based on failure modes
|
||||
|
||||
**Overall plan:**
|
||||
- [ ] Reviewed ALL tasks/phases/subtasks (no exceptions)
|
||||
- [ ] Verified dependency structure with `bd dep tree`
|
||||
- [ ] Documented findings for each task
|
||||
- [ ] Created summary of changes made
|
||||
- [ ] Provided clear recommendation (APPROVE/NEEDS REVISION/REJECT)
|
||||
|
||||
**Can't check all boxes?** Return to review process and complete missing steps.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill is used after:**
|
||||
- hyperpowers:writing-plans (creates initial plan)
|
||||
- hyperpowers:brainstorming (establishes requirements)
|
||||
|
||||
**This skill is used before:**
|
||||
- hyperpowers:executing-plans (implements tasks)
|
||||
|
||||
**This skill is also called by:**
|
||||
- hyperpowers:executing-plans (REQUIRED for new tasks created during execution)
|
||||
|
||||
**Call chains:**
|
||||
```
|
||||
Initial planning:
|
||||
hyperpowers:brainstorming → hyperpowers:writing-plans → hyperpowers:sre-task-refinement → hyperpowers:executing-plans
|
||||
↓
|
||||
(if gaps: revise and re-review)
|
||||
|
||||
During execution (for new tasks):
|
||||
hyperpowers:executing-plans → creates new task → hyperpowers:sre-task-refinement → STOP checkpoint
|
||||
```
|
||||
|
||||
**This skill uses:**
|
||||
- bd commands (show, update, create, dep add, dep tree)
|
||||
- Google Fellow SRE perspective (20+ years distributed systems)
|
||||
- 7-category checklist (mandatory for every task)
|
||||
|
||||
**Time expectations:**
|
||||
- Small epic (3-5 tasks): 15-20 minutes
|
||||
- Medium epic (6-10 tasks): 25-40 minutes
|
||||
- Large epic (10+ tasks): 45-60 minutes
|
||||
|
||||
**Don't rush:** Catching one critical gap pre-implementation saves hours of rework.
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Review patterns:**
|
||||
- Task too large (>16h) → Break into 4-8h subtasks
|
||||
- Vague criteria ("works correctly") → Measurable commands/checks
|
||||
- Missing edge cases → Add to Key Considerations with mitigations
|
||||
- Placeholder text → Rewrite with actual content
|
||||
|
||||
**When stuck:**
|
||||
- Unsure if task too large → Ask: Can junior complete in one day?
|
||||
- Unsure if criteria measurable → Ask: Can I verify with command/code review?
|
||||
- Unsure if edge case matters → Ask: Could this fail in production?
|
||||
- Unsure if placeholder → Ask: Does this reference other content instead of providing content?
|
||||
|
||||
**Key principle:** Junior engineer should be able to execute task without asking questions. If they would need to ask, specification is incomplete.
|
||||
</resources>
|
||||
336
skills/test-driven-development/SKILL.md
Normal file
336
skills/test-driven-development/SKILL.md
Normal file
@@ -0,0 +1,336 @@
|
||||
---
|
||||
name: test-driven-development
|
||||
description: Use when implementing features or fixing bugs - enforces RED-GREEN-REFACTOR cycle requiring tests to fail before writing code
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Write the test first, watch it fail, write minimal code to pass. If you didn't watch the test fail, you don't know if it tests the right thing.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - Follow these exact steps in order. Do not adapt.
|
||||
|
||||
Violating the letter of the rules is violating the spirit of the rules.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
|
||||
| Phase | Action | Command Example | Expected Result |
|
||||
|-------|--------|-----------------|-----------------|
|
||||
| **RED** | Write failing test | `cargo test test_name` | FAIL (feature missing) |
|
||||
| **Verify RED** | Confirm correct failure | Check error message | "function not found" or assertion fails |
|
||||
| **GREEN** | Write minimal code | Implement feature | Test passes |
|
||||
| **Verify GREEN** | All tests pass | `cargo test` | All green, no warnings |
|
||||
| **REFACTOR** | Clean up code | Improve while green | Tests still pass |
|
||||
|
||||
**Iron Law:** NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
|
||||
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
**Always use for:**
|
||||
- New features
|
||||
- Bug fixes
|
||||
- Refactoring with behavior changes
|
||||
- Any production code
|
||||
|
||||
**Ask your human partner for exceptions:**
|
||||
- Throwaway prototypes (will be deleted)
|
||||
- Generated code
|
||||
- Configuration files
|
||||
|
||||
Thinking "skip TDD just this once"? Stop. That's rationalization.
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
|
||||
## 1. RED - Write Failing Test
|
||||
|
||||
Write one minimal test showing what should happen.
|
||||
|
||||
**Requirements:**
|
||||
- Test one behavior only ("and" in name? Split it)
|
||||
- Clear name describing behavior
|
||||
- Use real code (no mocks unless unavoidable)
|
||||
|
||||
See [resources/language-examples.md](resources/language-examples.md) for Rust, Swift, TypeScript examples.
|
||||
|
||||
## 2. Verify RED - Watch It Fail
|
||||
|
||||
**MANDATORY. Never skip.**
|
||||
|
||||
Run the test and confirm:
|
||||
- ✓ Test **fails** (not errors with syntax issues)
|
||||
- ✓ Failure message is expected ("function not found" or assertion fails)
|
||||
- ✓ Fails because feature missing (not typos)
|
||||
|
||||
**If test passes:** You're testing existing behavior. Fix the test.
|
||||
**If test errors:** Fix syntax error, re-run until it fails correctly.
|
||||
|
||||
## 3. GREEN - Write Minimal Code
|
||||
|
||||
Write simplest code to pass the test. Nothing more.
|
||||
|
||||
**Key principle:** Don't add features the test doesn't require. Don't refactor other code. Don't "improve" beyond the test.
|
||||
|
||||
## 4. Verify GREEN - Watch It Pass
|
||||
|
||||
**MANDATORY.**
|
||||
|
||||
Run tests and confirm:
|
||||
- ✓ New test passes
|
||||
- ✓ All other tests still pass
|
||||
- ✓ No errors or warnings
|
||||
|
||||
**If test fails:** Fix code, not test.
|
||||
**If other tests fail:** Fix now before proceeding.
|
||||
|
||||
## 5. REFACTOR - Clean Up
|
||||
|
||||
**Only after green:**
|
||||
- Remove duplication
|
||||
- Improve names
|
||||
- Extract helpers
|
||||
|
||||
Keep tests green. Don't add behavior.
|
||||
|
||||
## 6. Repeat
|
||||
|
||||
Next failing test for next feature.
|
||||
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
|
||||
<example>
|
||||
<scenario>Developer writes implementation first, then adds test that passes immediately</scenario>
|
||||
|
||||
<code>
|
||||
// Code written FIRST
|
||||
def validate_email(email):
|
||||
return "@" in email # Bug: accepts "@@"
|
||||
|
||||
// Test written AFTER
|
||||
def test_validate_email():
|
||||
assert validate_email("user@example.com") # Passes immediately!
|
||||
// Missing edge case: assert not validate_email("@@")
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
When test passes immediately:
|
||||
- Never proved the test catches bugs
|
||||
- Only tested happy path you remembered
|
||||
- Forgot edge cases (like "@@")
|
||||
- Bug ships to production
|
||||
|
||||
Tests written after verify remembered cases, not required behavior.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**TDD approach:**
|
||||
|
||||
1. **RED** - Write test first (including edge case):
|
||||
```python
|
||||
def test_validate_email():
|
||||
assert validate_email("user@example.com") # Will fail - function doesn't exist
|
||||
assert not validate_email("@@") # Edge case up front
|
||||
```
|
||||
|
||||
2. **Verify RED** - Run test, watch it fail:
|
||||
```bash
|
||||
NameError: function 'validate_email' is not defined
|
||||
```
|
||||
|
||||
3. **GREEN** - Implement to pass both cases:
|
||||
```python
|
||||
def validate_email(email):
|
||||
return "@" in email and email.count("@") == 1
|
||||
```
|
||||
|
||||
4. **Verify GREEN** - Both assertions pass, bug prevented.
|
||||
|
||||
**Result:** Test failed first, proving it works. Edge case discovered during test writing, not in production.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer has already written 3 hours of code without tests. Wants to keep it as "reference" while writing tests.</scenario>
|
||||
|
||||
<code>
|
||||
// 200 lines of untested code exists
|
||||
// Developer thinks: "I'll keep this and write tests that match it"
|
||||
// Or: "I'll use it as reference to speed up TDD"
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Keeping code as "reference":**
|
||||
- You'll copy it (that's testing after, with extra steps)
|
||||
- You'll adapt it (biased by implementation)
|
||||
- Tests will match code, not requirements
|
||||
- You'll justify shortcuts: "I already know this works"
|
||||
|
||||
**Result:** All the problems of test-after, none of the benefits of TDD.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Delete it. Completely.**
|
||||
|
||||
```bash
|
||||
git stash # Or delete the file
|
||||
```
|
||||
|
||||
**Then start TDD:**
|
||||
1. Write first failing test from requirements (not from code)
|
||||
2. Watch it fail
|
||||
3. Implement fresh (might be different from original, that's OK)
|
||||
4. Watch it pass
|
||||
|
||||
**Why delete:**
|
||||
- Sunk cost is already gone
|
||||
- 3 hours implementing ≠ 3 hours with TDD (TDD might be 2 hours total)
|
||||
- Code without tests is technical debt
|
||||
- Fresh implementation from tests is usually better
|
||||
|
||||
**What you gain:**
|
||||
- Tests that actually verify behavior
|
||||
- Confidence code works
|
||||
- Ability to refactor safely
|
||||
- No bugs from untested edge cases
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Test is hard to write. Developer thinks "design must be unclear, but I'll implement first to explore."</scenario>
|
||||
|
||||
<code>
|
||||
// Test attempt:
|
||||
func testUserServiceCreatesAccount() {
|
||||
// Need to mock database, email service, payment gateway, logger...
|
||||
// This is getting complicated, maybe I should just implement first
|
||||
}
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**"Test is hard" is valuable signal:**
|
||||
- Hard to test = hard to use
|
||||
- Too many dependencies = coupling too tight
|
||||
- Complex setup = design needs simplification
|
||||
|
||||
**Implementing first ignores this signal:**
|
||||
- Build the complex design
|
||||
- Lock in the coupling
|
||||
- Now forced to write complex tests (or skip them)
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Listen to the test.**
|
||||
|
||||
Hard to test? Simplify the interface:
|
||||
|
||||
```swift
|
||||
// Instead of:
|
||||
class UserService {
|
||||
init(db: Database, email: EmailService, payments: PaymentGateway, logger: Logger) { }
|
||||
func createAccount(email: String, password: String, paymentToken: String) throws { }
|
||||
}
|
||||
|
||||
// Make testable:
|
||||
class UserService {
|
||||
func createAccount(request: CreateAccountRequest) -> Result<Account, Error> {
|
||||
// Dependencies injected through request or passed separately
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Test becomes simple:**
|
||||
```swift
|
||||
func testCreatesAccountFromRequest() {
|
||||
let service = UserService()
|
||||
let request = CreateAccountRequest(email: "user@example.com")
|
||||
let result = service.createAccount(request: request)
|
||||
XCTAssertEqual(result.email, "user@example.com")
|
||||
}
|
||||
```
|
||||
|
||||
**TDD forces good design.** If test is hard, fix design before implementing.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Write code before test?** → Delete it. Start over.
|
||||
- Never keep as "reference"
|
||||
- Never "adapt" while writing tests
|
||||
- Delete means delete
|
||||
|
||||
2. **Test passes immediately?** → Not TDD. Fix the test or delete the code.
|
||||
- Passing immediately proves nothing
|
||||
- You're testing existing behavior, not required behavior
|
||||
|
||||
3. **Can't explain why test failed?** → Fix until failure makes sense.
|
||||
- "function not found" = good (feature doesn't exist)
|
||||
- Weird error = bad (fix test, re-run)
|
||||
|
||||
4. **Want to skip "just this once"?** → That's rationalization. Stop.
|
||||
- TDD is faster than debugging in production
|
||||
- "Too simple to test" = test takes 30 seconds
|
||||
- "Already manually tested" = not systematic, not repeatable
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: Stop, follow TDD:
|
||||
- "This is different because..."
|
||||
- "I'm being pragmatic, not dogmatic"
|
||||
- "It's about spirit not ritual"
|
||||
- "Tests after achieve the same goals"
|
||||
- "Deleting X hours of work is wasteful"
|
||||
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
|
||||
Before marking work complete:
|
||||
|
||||
- [ ] Every new function/method has a test
|
||||
- [ ] Watched each test **fail** before implementing
|
||||
- [ ] Each test failed for expected reason (feature missing, not typo)
|
||||
- [ ] Wrote minimal code to pass each test
|
||||
- [ ] All tests pass with no warnings
|
||||
- [ ] Tests use real code (mocks only if unavoidable)
|
||||
- [ ] Edge cases and errors covered
|
||||
|
||||
**Can't check all boxes?** You skipped TDD. Start over.
|
||||
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
|
||||
**This skill calls:**
|
||||
- verification-before-completion (running tests to verify)
|
||||
|
||||
**This skill is called by:**
|
||||
- fixing-bugs (write failing test reproducing bug)
|
||||
- executing-plans (when implementing bd tasks)
|
||||
- refactoring-safely (keep tests green while refactoring)
|
||||
|
||||
**Agents used:**
|
||||
- hyperpowers:test-runner (run tests, return summary only)
|
||||
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
|
||||
**Detailed language-specific examples:**
|
||||
- [Rust, Swift, TypeScript examples](resources/language-examples.md) - Complete RED-GREEN-REFACTOR cycles
|
||||
- [Language-specific test commands](resources/language-examples.md#verification-commands-by-language)
|
||||
|
||||
**When stuck:**
|
||||
- Test too complicated? → Design too complicated, simplify interface
|
||||
- Must mock everything? → Code too coupled, use dependency injection
|
||||
- Test setup huge? → Extract helpers, or simplify design
|
||||
|
||||
</resources>
|
||||
325
skills/test-driven-development/resources/example-workflows.md
Normal file
325
skills/test-driven-development/resources/example-workflows.md
Normal file
@@ -0,0 +1,325 @@
|
||||
# TDD Workflow Examples
|
||||
|
||||
This guide shows complete TDD workflows for common scenarios: bug fixes and feature additions.
|
||||
|
||||
## Example: Bug Fix
|
||||
|
||||
### Bug
|
||||
|
||||
Empty email is accepted when it should be rejected.
|
||||
|
||||
### RED Phase: Write Failing Test
|
||||
|
||||
**Swift:**
|
||||
```swift
|
||||
func testRejectsEmptyEmail() async throws {
|
||||
let result = try await submitForm(FormData(email: ""))
|
||||
XCTAssertEqual(result.error, "Email required")
|
||||
}
|
||||
```
|
||||
|
||||
### Verify RED: Watch It Fail
|
||||
|
||||
```bash
|
||||
$ swift test --filter FormTests.testRejectsEmptyEmail
|
||||
FAIL: XCTAssertEqual failed: ("nil") is not equal to ("Optional("Email required")")
|
||||
```
|
||||
|
||||
**Confirms:**
|
||||
- Test fails (not errors)
|
||||
- Failure message shows email not being validated
|
||||
- Fails because feature missing (not typos)
|
||||
|
||||
### GREEN Phase: Minimal Code
|
||||
|
||||
```swift
|
||||
struct FormResult {
|
||||
var error: String?
|
||||
}
|
||||
|
||||
func submitForm(_ data: FormData) async throws -> FormResult {
|
||||
if data.email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
|
||||
return FormResult(error: "Email required")
|
||||
}
|
||||
// ... rest of form processing
|
||||
return FormResult()
|
||||
}
|
||||
```
|
||||
|
||||
### Verify GREEN: Watch It Pass
|
||||
|
||||
```bash
|
||||
$ swift test --filter FormTests.testRejectsEmptyEmail
|
||||
Test Case '-[FormTests testRejectsEmptyEmail]' passed
|
||||
```
|
||||
|
||||
**Confirms:**
|
||||
- Test passes
|
||||
- Other tests still pass
|
||||
- No errors or warnings
|
||||
|
||||
### REFACTOR: Clean Up
|
||||
|
||||
If multiple fields need validation:
|
||||
|
||||
```swift
|
||||
extension FormData {
|
||||
func validate() -> String? {
|
||||
if email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
|
||||
return "Email required"
|
||||
}
|
||||
// Add other validations...
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
func submitForm(_ data: FormData) async throws -> FormResult {
|
||||
if let error = data.validate() {
|
||||
return FormResult(error: error)
|
||||
}
|
||||
// ... rest of form processing
|
||||
return FormResult()
|
||||
}
|
||||
```
|
||||
|
||||
Run tests again to confirm still green.
|
||||
|
||||
---
|
||||
|
||||
## Example: Feature Addition
|
||||
|
||||
### Feature
|
||||
|
||||
Calculate average of non-empty list.
|
||||
|
||||
### RED Phase: Write Failing Test
|
||||
|
||||
**TypeScript:**
|
||||
```typescript
|
||||
describe('average', () => {
|
||||
it('calculates average of non-empty list', () => {
|
||||
expect(average([1, 2, 3])).toBe(2);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
### Verify RED: Watch It Fail
|
||||
|
||||
```bash
|
||||
$ npm test -- --testNamePattern="average"
|
||||
FAIL: ReferenceError: average is not defined
|
||||
```
|
||||
|
||||
**Confirms:**
|
||||
- Function doesn't exist yet
|
||||
- Test would verify the behavior when function exists
|
||||
|
||||
### GREEN Phase: Minimal Code
|
||||
|
||||
```typescript
|
||||
function average(numbers: number[]): number {
|
||||
const sum = numbers.reduce((acc, n) => acc + n, 0);
|
||||
return sum / numbers.length;
|
||||
}
|
||||
```
|
||||
|
||||
### Verify GREEN: Watch It Pass
|
||||
|
||||
```bash
|
||||
$ npm test -- --testNamePattern="average"
|
||||
PASS: calculates average of non-empty list
|
||||
```
|
||||
|
||||
### Add Edge Case: Empty List
|
||||
|
||||
**RED:**
|
||||
```typescript
|
||||
it('returns 0 for empty list', () => {
|
||||
expect(average([])).toBe(0);
|
||||
});
|
||||
```
|
||||
|
||||
**Verify RED:**
|
||||
```bash
|
||||
$ npm test -- --testNamePattern="average.*empty"
|
||||
FAIL: Expected: 0, Received: NaN
|
||||
```
|
||||
|
||||
**GREEN:**
|
||||
```typescript
|
||||
function average(numbers: number[]): number {
|
||||
if (numbers.length === 0) return 0;
|
||||
const sum = numbers.reduce((acc, n) => acc + n, 0);
|
||||
return sum / numbers.length;
|
||||
}
|
||||
```
|
||||
|
||||
**Verify GREEN:**
|
||||
```bash
|
||||
$ npm test -- --testNamePattern="average"
|
||||
PASS: 2 tests passed
|
||||
```
|
||||
|
||||
### REFACTOR: Clean Up
|
||||
|
||||
No duplication or unclear naming, so no refactoring needed. Move to next feature.
|
||||
|
||||
---
|
||||
|
||||
## Example: Refactoring with Tests
|
||||
|
||||
### Scenario
|
||||
|
||||
Existing function works but is hard to read. Tests exist and pass.
|
||||
|
||||
### Current Code
|
||||
|
||||
```rust
|
||||
fn process(data: Vec<i32>) -> i32 {
|
||||
let mut result = 0;
|
||||
for item in data {
|
||||
if item > 0 {
|
||||
result += item * 2;
|
||||
}
|
||||
}
|
||||
result
|
||||
}
|
||||
```
|
||||
|
||||
### Existing Tests (Already Green)
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn processes_positive_numbers() {
|
||||
assert_eq!(process(vec![1, 2, 3]), 12); // (1*2) + (2*2) + (3*2) = 12
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn ignores_negative_numbers() {
|
||||
assert_eq!(process(vec![1, -2, 3]), 8); // (1*2) + (3*2) = 8
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn handles_empty_list() {
|
||||
assert_eq!(process(vec![]), 0);
|
||||
}
|
||||
```
|
||||
|
||||
### REFACTOR: Improve Clarity
|
||||
|
||||
```rust
|
||||
fn process(data: Vec<i32>) -> i32 {
|
||||
data.iter()
|
||||
.filter(|&&n| n > 0)
|
||||
.map(|&n| n * 2)
|
||||
.sum()
|
||||
}
|
||||
```
|
||||
|
||||
### Verify Still Green
|
||||
|
||||
```bash
|
||||
$ cargo test
|
||||
running 3 tests
|
||||
test processes_positive_numbers ... ok
|
||||
test ignores_negative_numbers ... ok
|
||||
test handles_empty_list ... ok
|
||||
|
||||
test result: ok. 3 passed; 0 failed
|
||||
```
|
||||
|
||||
**Key:** Tests prove refactoring didn't break behavior.
|
||||
|
||||
---
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### Pattern: Adding Validation
|
||||
|
||||
1. **RED:** Test that invalid input is rejected
|
||||
2. **Verify RED:** Confirm invalid input currently accepted
|
||||
3. **GREEN:** Add validation check
|
||||
4. **Verify GREEN:** Confirm validation works
|
||||
5. **REFACTOR:** Extract validation if reusable
|
||||
|
||||
### Pattern: Adding Error Handling
|
||||
|
||||
1. **RED:** Test that error condition is caught
|
||||
2. **Verify RED:** Confirm error currently unhandled
|
||||
3. **GREEN:** Add error handling
|
||||
4. **Verify GREEN:** Confirm error handled correctly
|
||||
5. **REFACTOR:** Consolidate error handling if duplicated
|
||||
|
||||
### Pattern: Optimizing Performance
|
||||
|
||||
1. **Ensure tests exist and pass** (if not, add tests first)
|
||||
2. **REFACTOR:** Optimize implementation
|
||||
3. **Verify GREEN:** Confirm tests still pass
|
||||
4. **Measure:** Confirm performance improved
|
||||
|
||||
**Note:** Never optimize without tests. You can't prove optimization didn't break behavior.
|
||||
|
||||
---
|
||||
|
||||
## Workflow Checklist
|
||||
|
||||
### For Each New Feature
|
||||
|
||||
- [ ] Write one failing test
|
||||
- [ ] Run test, confirm it fails correctly
|
||||
- [ ] Write minimal code to pass
|
||||
- [ ] Run test, confirm it passes
|
||||
- [ ] Run all tests, confirm no regressions
|
||||
- [ ] Refactor if needed (staying green)
|
||||
- [ ] Commit
|
||||
|
||||
### For Each Bug Fix
|
||||
|
||||
- [ ] Write test reproducing the bug
|
||||
- [ ] Run test, confirm it fails (reproduces bug)
|
||||
- [ ] Fix the bug (minimal change)
|
||||
- [ ] Run test, confirm it passes (bug fixed)
|
||||
- [ ] Run all tests, confirm no regressions
|
||||
- [ ] Commit
|
||||
|
||||
### For Each Refactoring
|
||||
|
||||
- [ ] Confirm tests exist and pass
|
||||
- [ ] Make one small refactoring change
|
||||
- [ ] Run tests, confirm still green
|
||||
- [ ] Repeat until refactoring complete
|
||||
- [ ] Commit
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns to Avoid
|
||||
|
||||
### ❌ Writing Multiple Tests Before Implementing
|
||||
|
||||
**Why bad:** You can't tell which test makes implementation fail. Write one, implement, repeat.
|
||||
|
||||
### ❌ Changing Test to Make It Pass
|
||||
|
||||
**Why bad:** Test should define correct behavior. If test is wrong, fix test first, then re-run RED phase.
|
||||
|
||||
### ❌ Adding Features Not Covered by Tests
|
||||
|
||||
**Why bad:** Untested code. If you need a feature, write test first.
|
||||
|
||||
### ❌ Skipping RED Verification
|
||||
|
||||
**Why bad:** Test might pass immediately, meaning it doesn't test anything new.
|
||||
|
||||
### ❌ Skipping GREEN Verification
|
||||
|
||||
**Why bad:** Test might fail for unexpected reason. Always verify expected pass.
|
||||
|
||||
---
|
||||
|
||||
## Remember
|
||||
|
||||
- **One test at a time:** Write test, implement, repeat
|
||||
- **Watch it fail:** Proves test actually tests something
|
||||
- **Watch it pass:** Proves implementation works
|
||||
- **Stay green:** All tests pass before moving on
|
||||
- **Refactor freely:** Tests catch breaks
|
||||
267
skills/test-driven-development/resources/language-examples.md
Normal file
267
skills/test-driven-development/resources/language-examples.md
Normal file
@@ -0,0 +1,267 @@
|
||||
# TDD Language-Specific Examples
|
||||
|
||||
This guide provides concrete TDD examples in multiple programming languages, showing the RED-GREEN-REFACTOR cycle.
|
||||
|
||||
## RED Phase Examples
|
||||
|
||||
Write one minimal test showing what should happen.
|
||||
|
||||
### Rust
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn retries_failed_operations_3_times() {
|
||||
let mut attempts = 0;
|
||||
let operation = || -> Result<&str, &str> {
|
||||
attempts += 1;
|
||||
if attempts < 3 {
|
||||
Err("fail")
|
||||
} else {
|
||||
Ok("success")
|
||||
}
|
||||
};
|
||||
|
||||
let result = retry_operation(operation);
|
||||
|
||||
assert_eq!(result, Ok("success"));
|
||||
assert_eq!(attempts, 3);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Running the test:**
|
||||
```bash
|
||||
cargo test tests::retries_failed_operations_3_times
|
||||
```
|
||||
|
||||
### Swift
|
||||
|
||||
```swift
|
||||
func testRetriesFailedOperations3Times() async throws {
|
||||
var attempts = 0
|
||||
let operation = { () -> Result<String, Error> in
|
||||
attempts += 1
|
||||
if attempts < 3 {
|
||||
return .failure(RetryError.failed)
|
||||
}
|
||||
return .success("success")
|
||||
}
|
||||
|
||||
let result = try await retryOperation(operation)
|
||||
|
||||
XCTAssertEqual(result, "success")
|
||||
XCTAssertEqual(attempts, 3)
|
||||
}
|
||||
```
|
||||
|
||||
**Running the test:**
|
||||
```bash
|
||||
swift test --filter RetryTests.testRetriesFailedOperations3Times
|
||||
```
|
||||
|
||||
### TypeScript
|
||||
|
||||
```typescript
|
||||
describe('retryOperation', () => {
|
||||
it('retries failed operations 3 times', async () => {
|
||||
let attempts = 0;
|
||||
const operation = () => {
|
||||
attempts++;
|
||||
if (attempts < 3) {
|
||||
throw new Error('fail');
|
||||
}
|
||||
return 'success';
|
||||
};
|
||||
|
||||
const result = await retryOperation(operation);
|
||||
|
||||
expect(result).toBe('success');
|
||||
expect(attempts).toBe(3);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Running the test (Jest):**
|
||||
```bash
|
||||
npm test -- --testNamePattern="retries failed operations"
|
||||
```
|
||||
|
||||
**Running the test (Vitest):**
|
||||
```bash
|
||||
npm test -- -t "retries failed operations"
|
||||
```
|
||||
|
||||
### Why These Are Good
|
||||
|
||||
- Clear names describing the behavior
|
||||
- Test real behavior, not mocks
|
||||
- One thing per test
|
||||
- Shows desired API
|
||||
|
||||
### Bad Example
|
||||
|
||||
```typescript
|
||||
test('retry', () => {
|
||||
let mockCalls = 0;
|
||||
const mock = () => {
|
||||
mockCalls++;
|
||||
return 'success';
|
||||
};
|
||||
retryOperation(mock);
|
||||
expect(mockCalls).toBe(1); // Tests mock, not behavior
|
||||
});
|
||||
```
|
||||
|
||||
**Why this is bad:**
|
||||
- Vague name
|
||||
- Tests mock behavior, not real retry logic
|
||||
|
||||
## GREEN Phase Examples
|
||||
|
||||
Write simplest code to pass the test.
|
||||
|
||||
### Rust
|
||||
|
||||
```rust
|
||||
fn retry_operation<F, T, E>(mut operation: F) -> Result<T, E>
|
||||
where
|
||||
F: FnMut() -> Result<T, E>,
|
||||
{
|
||||
for i in 0..3 {
|
||||
match operation() {
|
||||
Ok(result) => return Ok(result),
|
||||
Err(e) => {
|
||||
if i == 2 {
|
||||
return Err(e);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
unreachable!()
|
||||
}
|
||||
```
|
||||
|
||||
### Swift
|
||||
|
||||
```swift
|
||||
func retryOperation<T>(_ operation: () async throws -> T) async throws -> T {
|
||||
var lastError: Error?
|
||||
for attempt in 0..<3 {
|
||||
do {
|
||||
return try await operation()
|
||||
} catch {
|
||||
lastError = error
|
||||
if attempt == 2 {
|
||||
throw error
|
||||
}
|
||||
}
|
||||
}
|
||||
throw lastError!
|
||||
}
|
||||
```
|
||||
|
||||
### TypeScript
|
||||
|
||||
```typescript
|
||||
async function retryOperation<T>(
|
||||
operation: () => Promise<T>
|
||||
): Promise<T> {
|
||||
let lastError: Error | undefined;
|
||||
for (let i = 0; i < 3; i++) {
|
||||
try {
|
||||
return await operation();
|
||||
} catch (error) {
|
||||
lastError = error as Error;
|
||||
if (i === 2) {
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
}
|
||||
throw lastError;
|
||||
}
|
||||
```
|
||||
|
||||
### Bad Example - Over-engineered (YAGNI)
|
||||
|
||||
```typescript
|
||||
async function retryOperation<T>(
|
||||
operation: () => Promise<T>,
|
||||
options: {
|
||||
maxRetries?: number;
|
||||
backoff?: 'linear' | 'exponential';
|
||||
onRetry?: (attempt: number) => void;
|
||||
shouldRetry?: (error: Error) => boolean;
|
||||
} = {}
|
||||
): Promise<T> {
|
||||
// Don't add features the test doesn't require!
|
||||
}
|
||||
```
|
||||
|
||||
**Why this is bad:** Test only requires 3 retries. Don't add:
|
||||
- Configurable retries
|
||||
- Backoff strategies
|
||||
- Callbacks
|
||||
- Error filtering
|
||||
|
||||
...until a test requires them.
|
||||
|
||||
## Test Requirements
|
||||
|
||||
**Every test should:**
|
||||
- Test one behavior
|
||||
- Have a clear name
|
||||
- Use real code (no mocks unless unavoidable)
|
||||
|
||||
## Verification Commands by Language
|
||||
|
||||
### Rust
|
||||
```bash
|
||||
# Single test
|
||||
cargo test tests::test_name
|
||||
|
||||
# All tests
|
||||
cargo test
|
||||
|
||||
# With output
|
||||
cargo test -- --nocapture
|
||||
```
|
||||
|
||||
### Swift
|
||||
```bash
|
||||
# Single test
|
||||
swift test --filter TestClass.testName
|
||||
|
||||
# All tests
|
||||
swift test
|
||||
|
||||
# With output
|
||||
swift test --verbose
|
||||
```
|
||||
|
||||
### TypeScript (Jest)
|
||||
```bash
|
||||
# Single test
|
||||
npm test -- --testNamePattern="test name"
|
||||
|
||||
# All tests
|
||||
npm test
|
||||
|
||||
# With coverage
|
||||
npm test -- --coverage
|
||||
```
|
||||
|
||||
### TypeScript (Vitest)
|
||||
```bash
|
||||
# Single test
|
||||
npm test -- -t "test name"
|
||||
|
||||
# All tests
|
||||
npm test
|
||||
|
||||
# With coverage
|
||||
npm test -- --coverage
|
||||
```
|
||||
581
skills/testing-anti-patterns/SKILL.md
Normal file
581
skills/testing-anti-patterns/SKILL.md
Normal file
@@ -0,0 +1,581 @@
|
||||
---
|
||||
name: testing-anti-patterns
|
||||
description: Use when writing or changing tests, adding mocks - prevents testing mock behavior, production pollution with test-only methods, and mocking without understanding dependencies
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Tests must verify real behavior, not mock behavior; mocks are tools to isolate, not things to test.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - The 3 Iron Laws are absolute (never test mocks, never add test-only methods, never mock without understanding). Apply gate functions strictly.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
## The 3 Iron Laws
|
||||
|
||||
1. **NEVER test mock behavior** → Test real component behavior
|
||||
2. **NEVER add test-only methods to production** → Use test utilities instead
|
||||
3. **NEVER mock without understanding** → Know dependencies before mocking
|
||||
|
||||
## Gate Functions (Use Before Action)
|
||||
|
||||
**Before asserting on any mock:**
|
||||
- Ask: "Am I testing real behavior or mock existence?"
|
||||
- If mock existence → STOP, delete assertion
|
||||
|
||||
**Before adding method to production:**
|
||||
- Ask: "Is this only used by tests?"
|
||||
- If yes → STOP, put in test utilities
|
||||
|
||||
**Before mocking:**
|
||||
- Ask: "What side effects does real method have?"
|
||||
- Ask: "Does test depend on those side effects?"
|
||||
- If depends → Mock lower level, not this method
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
- Writing new tests
|
||||
- Adding mocks to tests
|
||||
- Tempted to add method only tests will use
|
||||
- Test failing and considering mocking something
|
||||
- Unsure whether to mock a dependency
|
||||
- Test setup becoming complex with mocks
|
||||
|
||||
**Critical moment:** Before you add a mock or test-only method, use this skill's gate functions.
|
||||
</when_to_use>
|
||||
|
||||
<the_iron_laws>
|
||||
## Law 1: Never Test Mock Behavior
|
||||
|
||||
**Anti-pattern:**
|
||||
```rust
|
||||
// ❌ BAD: Testing that mock exists
|
||||
#[test]
|
||||
fn test_processes_request() {
|
||||
let mock_service = MockApiService::new();
|
||||
let handler = RequestHandler::new(Box::new(mock_service));
|
||||
|
||||
// Testing mock existence, not behavior
|
||||
assert!(handler.service().is_mock());
|
||||
}
|
||||
```
|
||||
|
||||
**Why wrong:** Verifies mock works, not that code works.
|
||||
|
||||
**Fix:**
|
||||
```rust
|
||||
// ✅ GOOD: Test real behavior
|
||||
#[test]
|
||||
fn test_processes_request() {
|
||||
let service = TestApiService::new(); // Real implementation or full fake
|
||||
let handler = RequestHandler::new(Box::new(service));
|
||||
|
||||
let result = handler.process_request("data");
|
||||
assert_eq!(result.status, StatusCode::OK);
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Law 2: Never Add Test-Only Methods to Production
|
||||
|
||||
**Anti-pattern:**
|
||||
```rust
|
||||
// ❌ BAD: reset() only used in tests
|
||||
pub struct Connection {
|
||||
pool: Arc<ConnectionPool>,
|
||||
}
|
||||
|
||||
impl Connection {
|
||||
pub fn reset(&mut self) { // Looks like production API!
|
||||
self.pool.clear_all();
|
||||
}
|
||||
}
|
||||
|
||||
// In tests
|
||||
#[test]
|
||||
fn test_something() {
|
||||
let mut conn = Connection::new();
|
||||
conn.reset(); // Test-only method
|
||||
}
|
||||
```
|
||||
|
||||
**Why wrong:**
|
||||
- Production code polluted with test-only methods
|
||||
- Dangerous if accidentally called in production
|
||||
- Confuses object lifecycle with entity lifecycle
|
||||
|
||||
**Fix:**
|
||||
```rust
|
||||
// ✅ GOOD: Test utilities handle cleanup
|
||||
// Connection has no reset()
|
||||
|
||||
// In tests/test_utils.rs
|
||||
pub fn cleanup_connection(conn: &Connection) {
|
||||
if let Some(pool) = conn.get_pool() {
|
||||
pool.clear_test_data();
|
||||
}
|
||||
}
|
||||
|
||||
// In tests
|
||||
#[test]
|
||||
fn test_something() {
|
||||
let conn = Connection::new();
|
||||
cleanup_connection(&conn);
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Law 3: Never Mock Without Understanding
|
||||
|
||||
**Anti-pattern:**
|
||||
```rust
|
||||
// ❌ BAD: Mock breaks test logic
|
||||
#[test]
|
||||
fn test_detects_duplicate_server() {
|
||||
// Mock prevents config write that test depends on!
|
||||
let mut config_manager = MockConfigManager::new();
|
||||
config_manager.expect_add_server()
|
||||
.returning(|_| Ok(())); // No actual config write!
|
||||
|
||||
config_manager.add_server(&config).unwrap();
|
||||
config_manager.add_server(&config).unwrap(); // Should fail - but won't!
|
||||
}
|
||||
```
|
||||
|
||||
**Why wrong:** Mocked method had side effect test depended on (writing config).
|
||||
|
||||
**Fix:**
|
||||
```rust
|
||||
// ✅ GOOD: Mock at correct level
|
||||
#[test]
|
||||
fn test_detects_duplicate_server() {
|
||||
// Mock the slow part, preserve behavior test needs
|
||||
let server_manager = MockServerManager::new(); // Just mock slow server startup
|
||||
let config_manager = ConfigManager::new_with_manager(server_manager);
|
||||
|
||||
config_manager.add_server(&config).unwrap(); // Config written
|
||||
let result = config_manager.add_server(&config); // Duplicate detected ✓
|
||||
assert!(result.is_err());
|
||||
}
|
||||
```
|
||||
</the_iron_laws>
|
||||
|
||||
<gate_functions>
|
||||
## Gate Function 1: Before Asserting on Mock
|
||||
|
||||
```
|
||||
BEFORE any assertion that checks mock elements:
|
||||
|
||||
1. Ask: "Am I testing real component behavior or just mock existence?"
|
||||
|
||||
2. If testing mock existence:
|
||||
STOP - Delete the assertion or unmock the component
|
||||
|
||||
3. Test real behavior instead
|
||||
```
|
||||
|
||||
**Examples of mock existence testing (all wrong):**
|
||||
- `assert!(handler.service().is_mock())`
|
||||
- `XCTAssertTrue(manager.delegate is MockDelegate)`
|
||||
- `expect(component.database).toBe(mockDb)`
|
||||
|
||||
---
|
||||
|
||||
## Gate Function 2: Before Adding Method to Production
|
||||
|
||||
```
|
||||
BEFORE adding any method to production class:
|
||||
|
||||
1. Ask: "Is this only used by tests?"
|
||||
|
||||
2. If yes:
|
||||
STOP - Don't add it
|
||||
Put it in test utilities instead
|
||||
|
||||
3. Ask: "Does this class own this resource's lifecycle?"
|
||||
|
||||
4. If no:
|
||||
STOP - Wrong class for this method
|
||||
```
|
||||
|
||||
**Red flags:**
|
||||
- Method named `reset()`, `clear()`, `cleanup()` in production class
|
||||
- Method only has `#[cfg(test)]` callers
|
||||
- Method added "for testing purposes"
|
||||
|
||||
---
|
||||
|
||||
## Gate Function 3: Before Mocking
|
||||
|
||||
```
|
||||
BEFORE mocking any method:
|
||||
|
||||
STOP - Don't mock yet
|
||||
|
||||
1. Ask: "What side effects does the real method have?"
|
||||
2. Ask: "Does this test depend on any of those side effects?"
|
||||
3. Ask: "Do I fully understand what this test needs?"
|
||||
|
||||
If depends on side effects:
|
||||
→ Mock at lower level (the actual slow/external operation)
|
||||
→ OR use test doubles that preserve necessary behavior
|
||||
→ NOT the high-level method the test depends on
|
||||
|
||||
If unsure what test depends on:
|
||||
→ Run test with real implementation FIRST
|
||||
→ Observe what actually needs to happen
|
||||
→ THEN add minimal mocking at the right level
|
||||
```
|
||||
|
||||
**Red flags:**
|
||||
- "I'll mock this to be safe"
|
||||
- "This might be slow, better mock it"
|
||||
- Mocking without understanding dependency chain
|
||||
</gate_functions>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer tests mock behavior instead of real behavior</scenario>
|
||||
|
||||
<code>
|
||||
#[test]
|
||||
fn test_user_service_initialized() {
|
||||
let mock_db = MockDatabase::new();
|
||||
let service = UserService::new(mock_db);
|
||||
|
||||
// Testing that mock exists
|
||||
assert_eq!(service.database().connection_string(), "mock://test");
|
||||
assert!(service.database().is_test_mode());
|
||||
}
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Assertions check mock properties, not service behavior
|
||||
- Test passes when mock is correct, fails when mock is wrong
|
||||
- Tells you nothing about whether UserService works
|
||||
- Would pass even if UserService.new() does nothing
|
||||
- False confidence - mock works, but does service work?
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Apply Gate Function 1:**
|
||||
|
||||
"Am I testing real behavior or mock existence?"
|
||||
→ Testing mock existence (connection_string(), is_test_mode() are mock properties)
|
||||
|
||||
**Fix:**
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn test_user_service_creates_user() {
|
||||
let db = TestDatabase::new(); // Real test implementation
|
||||
let service = UserService::new(db);
|
||||
|
||||
// Test real behavior
|
||||
let user = service.create_user("alice", "alice@example.com").unwrap();
|
||||
assert_eq!(user.name, "alice");
|
||||
assert_eq!(user.email, "alice@example.com");
|
||||
|
||||
// Verify user was saved
|
||||
let retrieved = service.get_user(user.id).unwrap();
|
||||
assert_eq!(retrieved.name, "alice");
|
||||
}
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Tests actual UserService behavior
|
||||
- Validates create and retrieve work
|
||||
- Would fail if service broken (even with working mock)
|
||||
- Confidence service actually works
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer adds test-only method to production class</scenario>
|
||||
|
||||
<code>
|
||||
// Production code
|
||||
pub struct Database {
|
||||
pool: ConnectionPool,
|
||||
}
|
||||
|
||||
impl Database {
|
||||
pub fn new() -> Self { /* ... */ }
|
||||
|
||||
// Added "for testing"
|
||||
pub fn reset(&mut self) {
|
||||
self.pool.clear();
|
||||
self.pool.reinitialize();
|
||||
}
|
||||
}
|
||||
|
||||
// Tests
|
||||
#[test]
|
||||
fn test_user_creation() {
|
||||
let mut db = Database::new();
|
||||
// ... test logic ...
|
||||
db.reset(); // Clean up
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_user_deletion() {
|
||||
let mut db = Database::new();
|
||||
// ... test logic ...
|
||||
db.reset(); // Clean up
|
||||
}
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Production Database polluted with test-only reset()
|
||||
- reset() looks like legitimate API to other developers
|
||||
- Dangerous if accidentally called in production (clears all data!)
|
||||
- Violates single responsibility (Database manages connections, not test lifecycle)
|
||||
- Every test class now needs reset() added
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Apply Gate Function 2:**
|
||||
|
||||
"Is this only used by tests?" → YES
|
||||
"Does Database class own test lifecycle?" → NO
|
||||
|
||||
**Fix:**
|
||||
|
||||
```rust
|
||||
// Production code (NO reset method)
|
||||
pub struct Database {
|
||||
pool: ConnectionPool,
|
||||
}
|
||||
|
||||
impl Database {
|
||||
pub fn new() -> Self { /* ... */ }
|
||||
// No reset() - production code clean
|
||||
}
|
||||
|
||||
// Test utilities (tests/test_utils.rs)
|
||||
pub fn create_test_database() -> Database {
|
||||
Database::new()
|
||||
}
|
||||
|
||||
pub fn cleanup_database(db: &mut Database) {
|
||||
// Access internals properly for cleanup
|
||||
if let Some(pool) = db.get_pool_mut() {
|
||||
pool.clear_test_data();
|
||||
}
|
||||
}
|
||||
|
||||
// Tests
|
||||
#[test]
|
||||
fn test_user_creation() {
|
||||
let mut db = create_test_database();
|
||||
// ... test logic ...
|
||||
cleanup_database(&mut db);
|
||||
}
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Production code has no test pollution
|
||||
- No risk of accidental production calls
|
||||
- Clear separation: Database manages connections, test utils manage test lifecycle
|
||||
- Test utilities can evolve without changing production code
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer mocks without understanding dependencies</scenario>
|
||||
|
||||
<code>
|
||||
#[test]
|
||||
fn test_detects_duplicate_server() {
|
||||
// "I'll mock ConfigManager to speed up the test"
|
||||
let mut mock_config = MockConfigManager::new();
|
||||
mock_config.expect_add_server()
|
||||
.times(2)
|
||||
.returning(|_| Ok(())); // Always returns Ok!
|
||||
|
||||
// Test expects duplicate detection
|
||||
mock_config.add_server(&server_config).unwrap();
|
||||
let result = mock_config.add_server(&server_config);
|
||||
|
||||
// Assertion fails! Mock always returns Ok, no duplicate detection
|
||||
assert!(result.is_err()); // FAILS
|
||||
}
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Mocked add_server() without understanding it writes config
|
||||
- Mock returns Ok() both times (no duplicate detection)
|
||||
- Test depends on ConfigManager's internal state tracking
|
||||
- Mock eliminates the behavior test needs to verify
|
||||
- "Speeding up" by mocking broke the test
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Apply Gate Function 3:**
|
||||
|
||||
"What side effects does add_server() have?" → Writes to config file, tracks added servers
|
||||
"Does test depend on those?" → YES! Test needs duplicate detection
|
||||
"Do I understand what test needs?" → Now yes
|
||||
|
||||
**Fix:**
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn test_detects_duplicate_server() {
|
||||
// Mock at the RIGHT level - just the slow I/O
|
||||
let mock_file_system = MockFileSystem::new(); // Mock slow file writes
|
||||
let config_manager = ConfigManager::new_with_fs(mock_file_system);
|
||||
|
||||
// ConfigManager's duplicate detection still works
|
||||
config_manager.add_server(&server_config).unwrap();
|
||||
let result = config_manager.add_server(&server_config);
|
||||
|
||||
// Passes! ConfigManager tracks duplicates, only file I/O is mocked
|
||||
assert!(result.is_err());
|
||||
}
|
||||
```
|
||||
|
||||
**What you gain:**
|
||||
- Test verifies real duplicate detection logic
|
||||
- Only mocked the actual slow part (file I/O)
|
||||
- ConfigManager's internal tracking works normally
|
||||
- Test actually validates the feature
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<additional_anti_patterns>
|
||||
## Anti-Pattern 4: Incomplete Mocks
|
||||
|
||||
**Problem:** Mock only fields you think you need, omit others.
|
||||
|
||||
```rust
|
||||
// ❌ BAD: Partial mock
|
||||
struct MockResponse {
|
||||
status: String,
|
||||
data: UserData,
|
||||
// Missing: metadata that downstream code uses
|
||||
}
|
||||
|
||||
impl ApiResponse for MockResponse {
|
||||
fn metadata(&self) -> &Metadata {
|
||||
panic!("metadata not implemented!") // Breaks at runtime!
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Fix:** Mirror real API completely.
|
||||
|
||||
```rust
|
||||
// ✅ GOOD: Complete mock
|
||||
struct MockResponse {
|
||||
status: String,
|
||||
data: UserData,
|
||||
metadata: Metadata, // All fields real API returns
|
||||
}
|
||||
```
|
||||
|
||||
**Gate function:**
|
||||
```
|
||||
BEFORE creating mock responses:
|
||||
1. Examine actual API response structure
|
||||
2. Include ALL fields system might consume
|
||||
3. Verify mock matches real schema completely
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Pattern 5: Over-Complex Mocks
|
||||
|
||||
**Warning signs:**
|
||||
- Mock setup longer than test logic
|
||||
- Mocking everything to make test pass
|
||||
- Test breaks when mock changes
|
||||
|
||||
**Consider:** Integration tests with real components often simpler than complex mocks.
|
||||
</additional_anti_patterns>
|
||||
|
||||
<tdd_prevention>
|
||||
## TDD Prevents These Anti-Patterns
|
||||
|
||||
**Why TDD helps:**
|
||||
|
||||
1. **Write test first** → Forces thinking about what you're actually testing
|
||||
2. **Watch it fail** → Confirms test tests real behavior, not mocks
|
||||
3. **Minimal implementation** → No test-only methods creep in
|
||||
4. **Real dependencies** → See what test needs before mocking
|
||||
|
||||
**If you're testing mock behavior, you violated TDD** - you added mocks without watching test fail against real code first.
|
||||
|
||||
**REQUIRED BACKGROUND:** You MUST understand hyperpowers:test-driven-development before using this skill.
|
||||
</tdd_prevention>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Never test mock behavior** → Test real component behavior always
|
||||
2. **Never add test-only methods to production** → Pollutes production code
|
||||
3. **Never mock without understanding** → Must know dependencies and side effects
|
||||
4. **Use gate functions before action** → Before asserting, adding methods, or mocking
|
||||
5. **Follow TDD** → Write test first, watch fail, prevents testing mocks
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Apply the gate function.**
|
||||
|
||||
- "Just checking the mock is wired up" (Testing mock, not behavior)
|
||||
- "Need reset() for test cleanup" (Test-only method, use test utilities)
|
||||
- "I'll mock this to be safe" (Don't understand dependencies)
|
||||
- "Mock setup is complex but necessary" (Probably over-mocking)
|
||||
- "This will speed up tests" (Might break test logic)
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before claiming tests are correct:
|
||||
|
||||
- [ ] No assertions on mock elements (no `is_mock()`, `is MockType`, etc.)
|
||||
- [ ] No test-only methods in production classes
|
||||
- [ ] All mocks preserve side effects test depends on
|
||||
- [ ] Mock at lowest level needed (mock slow I/O, not business logic)
|
||||
- [ ] Understand why each mock is necessary
|
||||
- [ ] Mock structure matches real API completely
|
||||
- [ ] Test logic shorter/equal to mock setup (not longer)
|
||||
- [ ] Followed TDD (test failed with real code before mocking)
|
||||
|
||||
**Can't check all boxes?** Apply gate functions and refactor.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill requires:**
|
||||
- hyperpowers:test-driven-development (prevents these anti-patterns)
|
||||
- Understanding of mocking vs. faking vs. stubbing
|
||||
|
||||
**This skill is called by:**
|
||||
- When writing tests
|
||||
- When adding mocks
|
||||
- When test setup becoming complex
|
||||
- hyperpowers:test-driven-development (use gate functions during RED phase)
|
||||
|
||||
**Red flags triggering this skill:**
|
||||
- Assertion checks for `*-mock` test IDs
|
||||
- Methods only called in test files
|
||||
- Mock setup >50% of test
|
||||
- Test fails when you remove mock
|
||||
- Can't explain why mock needed
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Mocking vs Faking vs Stubbing](resources/test-doubles.md)
|
||||
- [Test utilities patterns](resources/test-utilities.md)
|
||||
- [When to use integration tests](resources/integration-vs-unit.md)
|
||||
|
||||
**When stuck:**
|
||||
- Mock too complex → Consider integration test with real components
|
||||
- Unsure what to mock → Run with real implementation first, observe
|
||||
- Test failing mysteriously → Check if mock breaks test logic (use Gate Function 3)
|
||||
- Production polluted → Move all test helpers to test_utils
|
||||
</resources>
|
||||
386
skills/using-hyper/SKILL.md
Normal file
386
skills/using-hyper/SKILL.md
Normal file
@@ -0,0 +1,386 @@
|
||||
---
|
||||
name: using-hyper
|
||||
description: Use when starting any conversation - establishes mandatory workflows for finding and using skills
|
||||
---
|
||||
|
||||
<EXTREMELY_IMPORTANT>
|
||||
If you think there is even a 1% chance a skill might apply to what you are doing, you ABSOLUTELY MUST read the skill.
|
||||
|
||||
**IF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT.**
|
||||
|
||||
This is not negotiable. This is not optional. You cannot rationalize your way out of this.
|
||||
</EXTREMELY_IMPORTANT>
|
||||
|
||||
<skill_overview>
|
||||
Skills are proven workflows; if one exists for your task, using it is mandatory, not optional.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
HIGH FREEDOM - The meta-process (check for skills, use Skill tool, announce usage) is rigid, but each individual skill defines its own rigidity level.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
**Before responding to ANY user message:**
|
||||
|
||||
1. List available skills mentally
|
||||
2. Ask: "Does ANY skill match this request?"
|
||||
3. If yes → Use Skill tool to load the skill file
|
||||
4. Announce which skill you're using
|
||||
5. Follow the skill exactly as written
|
||||
|
||||
**Skill has checklist?** Create TodoWrite for every item.
|
||||
|
||||
**Finding a relevant skill = mandatory to use it.**
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
This skill applies at the start of EVERY conversation and BEFORE every task:
|
||||
|
||||
- User asks you to implement a feature
|
||||
- User asks you to fix a bug
|
||||
- User asks you to refactor code
|
||||
- User asks you to debug an issue
|
||||
- User asks you to write tests
|
||||
- User asks you to review code
|
||||
- User describes a problem to solve
|
||||
- User provides requirements to implement
|
||||
|
||||
**Applies to:** Literally any task that might have a corresponding skill.
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
## 1. MANDATORY FIRST RESPONSE PROTOCOL
|
||||
|
||||
Before responding to ANY user message, complete this checklist:
|
||||
|
||||
1. ☐ List available skills in your mind
|
||||
2. ☐ Ask yourself: "Does ANY skill match this request?"
|
||||
3. ☐ If yes → Use the Skill tool to read and run the skill file
|
||||
4. ☐ Announce which skill you're using
|
||||
5. ☐ Follow the skill exactly
|
||||
|
||||
**Responding WITHOUT completing this checklist = automatic failure.**
|
||||
|
||||
---
|
||||
|
||||
## 2. Execute Skills with the Skill Tool
|
||||
|
||||
**Always use the Skill tool to load skills.** Never rely on memory.
|
||||
|
||||
```
|
||||
Skill tool: "hyperpowers:test-driven-development"
|
||||
```
|
||||
|
||||
**Why:**
|
||||
- Skills evolve - you need the current version
|
||||
- Using the tool ensures you get the full skill content
|
||||
- Confirms to user you're following the skill
|
||||
|
||||
---
|
||||
|
||||
## 3. Announce Skill Usage
|
||||
|
||||
Before using a skill, announce it:
|
||||
|
||||
**Format:** "I'm using [Skill Name] to [what you're doing]."
|
||||
|
||||
**Examples:**
|
||||
- "I'm using hyperpowers:brainstorming to refine your idea into a design."
|
||||
- "I'm using hyperpowers:test-driven-development to implement this feature."
|
||||
- "I'm using hyperpowers:debugging-with-tools to investigate this error."
|
||||
|
||||
**Why:** Transparency helps user understand your process and catch errors early. Confirms you actually read the skill.
|
||||
|
||||
---
|
||||
|
||||
## 4. Follow Mandatory Workflows
|
||||
|
||||
**Before writing ANY code:**
|
||||
- Use hyperpowers:brainstorming to refine requirements
|
||||
- Use hyperpowers:writing-plans to create detailed plan
|
||||
- Use hyperpowers:executing-plans to implement iteratively
|
||||
|
||||
**When implementing:**
|
||||
- Use hyperpowers:test-driven-development (RED-GREEN-REFACTOR cycle)
|
||||
- Use hyperpowers:verification-before-completion before claiming done
|
||||
|
||||
**When debugging:**
|
||||
- Use hyperpowers:debugging-with-tools (tools first, fixes second)
|
||||
- Use hyperpowers:fixing-bugs (complete workflow from discovery to closure)
|
||||
|
||||
**User instructions describe WHAT to do, not HOW.** "Add X" means use brainstorming, TDD, verification. Not permission to skip workflows.
|
||||
|
||||
---
|
||||
|
||||
## 5. Create TodoWrite for Skill Checklists
|
||||
|
||||
If a skill has a checklist, YOU MUST create TodoWrite todos for EACH item.
|
||||
|
||||
**Don't:**
|
||||
- Work through checklist mentally
|
||||
- Skip creating todos "to save time"
|
||||
- Batch multiple items into one todo
|
||||
- Mark complete without doing them
|
||||
|
||||
**Why:** Checklists without TodoWrite tracking = steps get skipped. Every time. The overhead is tiny compared to missing steps.
|
||||
|
||||
**Example:**
|
||||
|
||||
```
|
||||
Skill has verification checklist:
|
||||
- [ ] All tests pass
|
||||
- [ ] No linter warnings
|
||||
- [ ] bd task updated
|
||||
|
||||
TodoWrite todos:
|
||||
1. Run all tests and verify they pass
|
||||
2. Run linter and verify no warnings
|
||||
3. Update bd task with completion status
|
||||
```
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>User asks to implement a new feature</scenario>
|
||||
|
||||
<code>
|
||||
User: "Add a user profile page with avatar upload"
|
||||
|
||||
Claude (without using-hyper):
|
||||
"Sure! Let me start implementing the user profile page..."
|
||||
[Starts writing code immediately without brainstorming or planning]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Skipped mandatory brainstorming (requirements unclear)
|
||||
- Skipped writing-plans (no detailed implementation plan)
|
||||
- Skipped test-driven-development (no tests first)
|
||||
- Will likely miss edge cases, make wrong assumptions
|
||||
- User has no visibility into approach before code is written
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
Claude: "I'm using hyperpowers:brainstorming to refine your requirements into a complete design."
|
||||
|
||||
[Skill tool loads brainstorming]
|
||||
[Asks Socratic questions about avatar size limits, formats, storage, etc.]
|
||||
[Creates refined requirements]
|
||||
|
||||
Claude: "Now I'm using hyperpowers:writing-plans to create a detailed implementation plan."
|
||||
|
||||
[Skill tool loads writing-plans]
|
||||
[Creates bd epic with tasks]
|
||||
|
||||
Claude: "Now I'm using hyperpowers:executing-plans to implement the tasks iteratively."
|
||||
|
||||
[Skill tool loads executing-plans]
|
||||
[Implements with TDD, commits incrementally]
|
||||
|
||||
**What you gain:**
|
||||
- Clear requirements before coding
|
||||
- Detailed plan user can review
|
||||
- Implementation follows proven workflow
|
||||
- Tests written first (TDD)
|
||||
- Progress tracked in bd
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>User reports a bug</scenario>
|
||||
|
||||
<code>
|
||||
User: "Users can't log in after password reset"
|
||||
|
||||
Claude (without using-hyper):
|
||||
"Let me check the password reset code... I found the issue, it's not updating the password hash. Let me fix that..."
|
||||
[Makes immediate fix without investigation]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Skipped debugging-with-tools (no systematic investigation)
|
||||
- Skipped fixing-bugs workflow (no bd issue, no regression test)
|
||||
- "Obvious" fix might be wrong or incomplete
|
||||
- No verification that fix actually works
|
||||
- Will likely miss root cause and fix symptom
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
Claude: "I'm using hyperpowers:debugging-with-tools to investigate this systematically."
|
||||
|
||||
[Skill tool loads debugging-with-tools]
|
||||
[Uses debugger, checks logs, searches internet for known issues]
|
||||
[Identifies root cause: token invalidation timing issue]
|
||||
|
||||
Claude: "Now I'm using hyperpowers:fixing-bugs to implement a complete fix."
|
||||
|
||||
[Skill tool loads fixing-bugs]
|
||||
[Creates bd bug issue]
|
||||
[Writes failing test reproducing bug (RED)]
|
||||
[Implements minimal fix (GREEN)]
|
||||
[Runs full test suite via test-runner agent]
|
||||
[Verifies fix and closes bd issue]
|
||||
|
||||
**What you gain:**
|
||||
- Root cause identified (not symptom)
|
||||
- Regression test prevents recurrence
|
||||
- Fix verified with evidence
|
||||
- Bug tracked in bd
|
||||
- Complete workflow documented
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>User asks to refactor code</scenario>
|
||||
|
||||
<code>
|
||||
User: "This validation logic is duplicated, can you extract it?"
|
||||
|
||||
Claude (without using-hyper):
|
||||
"Sure, let me extract that into a shared function..."
|
||||
[Changes 5 files at once, then runs tests]
|
||||
[Tests fail]
|
||||
"Hmm, let me fix these test failures..."
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Skipped refactoring-safely workflow (change→test→commit cycle)
|
||||
- Big-bang refactoring (changed multiple files before testing)
|
||||
- Tests failing means behavior changed (not safe refactoring)
|
||||
- Can't identify which change broke what
|
||||
- Wastes time debugging entire refactoring
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
Claude: "I'm using hyperpowers:refactoring-safely to extract this validation logic incrementally."
|
||||
|
||||
[Skill tool loads refactoring-safely]
|
||||
|
||||
Step 1: Verify tests pass BEFORE refactoring
|
||||
Step 2: Create bd refactoring task
|
||||
Step 3: Extract validation from first file → test → commit
|
||||
Step 4: Extract validation from second file → test → commit
|
||||
Step 5: Create shared validator → test → commit
|
||||
Step 6: Final verification → close bd task
|
||||
|
||||
**What you gain:**
|
||||
- Tests stay green throughout (safe refactoring)
|
||||
- Each commit is reviewable independently
|
||||
- Know exactly which change broke if test fails
|
||||
- Can stop halfway with useful progress
|
||||
- Clear history of transformations
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **Check for relevant skills BEFORE any task** → If skill exists, use it (not optional)
|
||||
2. **Use Skill tool to load skills** → Never rely on memory (skills evolve)
|
||||
3. **Announce skill usage** → Transparency helps catch errors early
|
||||
4. **Follow mandatory workflows** → brainstorming before coding, TDD for implementation, verification before claiming done
|
||||
5. **Create TodoWrite for checklists** → Mental tracking = skipped steps
|
||||
|
||||
## Common Rationalizations
|
||||
|
||||
All of these mean: **STOP. Check for and use the relevant skill.**
|
||||
|
||||
- "This is just a simple question" (Questions are tasks. Check for skills.)
|
||||
- "I can check git/files quickly" (Files lack context. Check for skills.)
|
||||
- "Let me gather information first" (Skills tell you HOW to gather. Check for skills.)
|
||||
- "This doesn't need a formal skill" (If skill exists, use it. Not optional.)
|
||||
- "I remember this skill" (Skills evolve. Use Skill tool to load current version.)
|
||||
- "This doesn't count as a task" (Taking action = task. Check for skills.)
|
||||
- "The skill is overkill for this" (Skills exist because "simple" becomes complex.)
|
||||
- "I'll just do this one thing first" (Check for skills BEFORE doing anything.)
|
||||
- "Instruction was specific so I can skip brainstorming" (Specific instructions = WHAT, not HOW. Use workflows.)
|
||||
</critical_rules>
|
||||
|
||||
<understanding_rigidity>
|
||||
## Rigid Skills (Follow Exactly)
|
||||
|
||||
These have LOW FREEDOM - follow the exact process:
|
||||
|
||||
- hyperpowers:test-driven-development (RED-GREEN-REFACTOR cycle)
|
||||
- hyperpowers:verification-before-completion (evidence before claims)
|
||||
- hyperpowers:executing-plans (continuous execution, substep tracking)
|
||||
|
||||
## Flexible Skills (Adapt Principles)
|
||||
|
||||
These have HIGH FREEDOM - adapt core principles to context:
|
||||
|
||||
- hyperpowers:brainstorming (Socratic method, but questions vary)
|
||||
- hyperpowers:managing-bd-tasks (operations adapt to project)
|
||||
- hyperpowers:sre-task-refinement (corner case analysis, but depth varies)
|
||||
|
||||
**The skill itself tells you its rigidity level.** Check `<rigidity_level>` section.
|
||||
</understanding_rigidity>
|
||||
|
||||
<instructions_vs_workflows>
|
||||
## User Instructions Describe WHAT, Not HOW
|
||||
|
||||
**User says:** "Add user authentication"
|
||||
**This means:** Use brainstorming → writing-plans → executing-plans → TDD → verification
|
||||
|
||||
**User says:** "Fix this bug"
|
||||
**This means:** Use debugging-with-tools → fixing-bugs → TDD → verification
|
||||
|
||||
**User says:** "Refactor this code"
|
||||
**This means:** Use refactoring-safely (change→test→commit cycle)
|
||||
|
||||
**User instructions are the GOAL, not permission to skip workflows.**
|
||||
|
||||
**Red flags that you're rationalizing:**
|
||||
- "Instruction was specific, don't need brainstorming"
|
||||
- "Seems simple, don't need TDD"
|
||||
- "Workflow is overkill for this"
|
||||
|
||||
**Why workflows matter MORE when instructions are specific:**
|
||||
- Clear requirements = perfect time for structured implementation
|
||||
- "Simple" tasks often have hidden complexity
|
||||
- Skipping process on "easy" tasks is how they become hard problems
|
||||
</instructions_vs_workflows>
|
||||
|
||||
<verification_checklist>
|
||||
Before completing ANY task:
|
||||
|
||||
- [ ] Did I check for relevant skills before starting?
|
||||
- [ ] Did I use Skill tool to load skills (not rely on memory)?
|
||||
- [ ] Did I announce which skill I'm using?
|
||||
- [ ] Did I follow the skill's process exactly?
|
||||
- [ ] Did I create TodoWrite for any skill checklists?
|
||||
- [ ] Did I follow mandatory workflows (brainstorming, TDD, verification)?
|
||||
|
||||
**Can't check all boxes?** You skipped critical steps. Review and fix.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill calls:**
|
||||
- ALL other skills (meta-skill that triggers appropriate skill usage)
|
||||
|
||||
**This skill is called by:**
|
||||
- Session start (always loaded)
|
||||
- User requests (check before every task)
|
||||
|
||||
**Critical workflows this establishes:**
|
||||
- hyperpowers:brainstorming (before writing code)
|
||||
- hyperpowers:test-driven-development (during implementation)
|
||||
- hyperpowers:verification-before-completion (before claiming done)
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Available skills:**
|
||||
- See skill descriptions in Skill tool's "Available Commands" section
|
||||
- Each skill's description shows when to use it
|
||||
|
||||
**When unsure if skill applies:**
|
||||
- If there's even 1% chance it applies → use it
|
||||
- Better to load and decide "not needed" than to skip and fail
|
||||
- Skills are optimized, loading them is cheap
|
||||
</resources>
|
||||
363
skills/verification-before-completion/SKILL.md
Normal file
363
skills/verification-before-completion/SKILL.md
Normal file
@@ -0,0 +1,363 @@
|
||||
---
|
||||
name: verification-before-completion
|
||||
description: Use before claiming work complete, fixed, or passing - requires running verification commands and confirming output; evidence before assertions always
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Claiming work is complete without verification is dishonesty, not efficiency. Evidence before claims, always.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - NO exceptions. Run verification command, read output, THEN make claim.
|
||||
|
||||
No shortcuts. No "should work". No partial verification. Run it, prove it.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
|
||||
| Claim | Verification Required | Not Sufficient |
|
||||
|-------|----------------------|----------------|
|
||||
| **Tests pass** | Run full test command, see 0 failures | Previous run, "should pass" |
|
||||
| **Build succeeds** | Run build, see exit 0 | Linter passing |
|
||||
| **Bug fixed** | Test original symptom, passes | Code changed |
|
||||
| **Task complete** | Check all success criteria, run verifications | "Implemented bd-3" |
|
||||
| **Epic complete** | `bd list --status open --parent bd-1` shows 0 | "All tasks done" |
|
||||
|
||||
**Iron Law:** NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
|
||||
|
||||
**Use test-runner agent for:** Tests, pre-commit hooks, commits (keeps verbose output out of context)
|
||||
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
**ALWAYS before:**
|
||||
- Any success/completion claim
|
||||
- Any expression of satisfaction
|
||||
- Committing, PR creation, task completion
|
||||
- Moving to next task
|
||||
- ANY communication suggesting completion/correctness
|
||||
|
||||
**Red flags you need this:**
|
||||
- Using "should", "probably", "seems to"
|
||||
- Expressing satisfaction before verification ("Great!", "Perfect!")
|
||||
- About to commit/push without verification
|
||||
- Trusting agent success reports
|
||||
- Relying on partial verification
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
|
||||
## The Gate Function
|
||||
|
||||
Before claiming ANY status:
|
||||
|
||||
### 1. Identify
|
||||
What command proves this claim?
|
||||
|
||||
### 2. Run
|
||||
Execute the full command (fresh, complete).
|
||||
|
||||
**For tests/hooks/commits:** Use `hyperpowers:test-runner` agent
|
||||
- Agent captures verbose output in its context
|
||||
- Returns only summary + failures
|
||||
- Prevents context pollution
|
||||
|
||||
**For other commands:** Run directly and capture output
|
||||
|
||||
### 3. Read
|
||||
Full output, check exit code, count failures.
|
||||
|
||||
### 4. Verify
|
||||
Does output confirm the claim?
|
||||
- If NO: State actual status with evidence
|
||||
- If YES: State claim with evidence
|
||||
|
||||
### 5. Only Then
|
||||
Make the claim.
|
||||
|
||||
**Skip any step = lying, not verifying**
|
||||
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
|
||||
<example>
|
||||
<scenario>Developer claims tests pass without running them</scenario>
|
||||
|
||||
<code>
|
||||
Developer modifies authentication logic.
|
||||
|
||||
Developer thinks: "This fix is straightforward, tests should pass now"
|
||||
|
||||
Developer writes: "Great! All tests passing. The bug is fixed."
|
||||
|
||||
[No test command run, no output shown]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**No evidence:**
|
||||
- "Should pass" ≠ evidence
|
||||
- Confidence ≠ verification
|
||||
- Might have broken other tests
|
||||
- Might not have fixed the bug
|
||||
|
||||
**Why dangerous:**
|
||||
- Broken code ships
|
||||
- Trust broken with partner
|
||||
- Wastes time on false completion
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Run verification:**
|
||||
|
||||
```bash
|
||||
# Dispatch test-runner agent
|
||||
"Run: cargo test"
|
||||
```
|
||||
|
||||
**Agent returns:**
|
||||
```
|
||||
Summary: 33 passed, 1 failed
|
||||
Failures:
|
||||
- test_login_with_expired_token: assertion failed
|
||||
Expected: Err(AuthError::TokenExpired)
|
||||
Got: Ok(User { id: 123 })
|
||||
```
|
||||
|
||||
**State actual status:**
|
||||
```
|
||||
Tests: 33 passed, 1 failed.
|
||||
Failure: test_login_with_expired_token still fails.
|
||||
The fix didn't handle expired tokens.
|
||||
Investigating...
|
||||
```
|
||||
|
||||
**Result:** Accurate status, bug caught before committing.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer closes bd task without verifying success criteria</scenario>
|
||||
|
||||
<code>
|
||||
bd-3 success criteria:
|
||||
- [ ] All functions fully implemented (no stubs, no TODOs)
|
||||
- [ ] Tests written and passing: test_generate_token, test_verify_token
|
||||
- [ ] Pre-commit hooks pass
|
||||
|
||||
Developer implements token functions.
|
||||
|
||||
Developer thinks: "I implemented everything, task complete"
|
||||
|
||||
Runs: bd close bd-3
|
||||
|
||||
[No verification commands run]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Success criteria not verified:**
|
||||
- Might have TODO comments left
|
||||
- Specific tests not run
|
||||
- Pre-commit hooks not checked
|
||||
|
||||
**Result:** Incomplete task closed, issues discovered later.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Verify each criterion:**
|
||||
|
||||
```bash
|
||||
# 1. Check for TODOs
|
||||
rg "TODO|FIXME|todo!|unimplemented!" src/
|
||||
# Output: [nothing found] ✅
|
||||
|
||||
# 2. Run specific tests
|
||||
cargo test test_generate_token test_verify_token
|
||||
# Output: test result: ok. 2 passed ✅
|
||||
|
||||
# 3. Run pre-commit hooks (via test-runner agent)
|
||||
"Run: cargo fmt --check && cargo clippy && cargo test"
|
||||
```
|
||||
|
||||
**Agent returns:**
|
||||
```
|
||||
All checks passed:
|
||||
- cargo fmt: no formatting issues
|
||||
- cargo clippy: no warnings
|
||||
- cargo test: 34/34 passed
|
||||
```
|
||||
|
||||
**THEN close task:**
|
||||
```bash
|
||||
bd close bd-3
|
||||
```
|
||||
|
||||
**Result:** Task actually complete, all criteria met with evidence.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer claims build works based on linter passing</scenario>
|
||||
|
||||
<code>
|
||||
Developer fixes TypeScript errors flagged by linter.
|
||||
|
||||
Runs: npm run lint
|
||||
Output: ✅ No issues found
|
||||
|
||||
Developer writes: "Build passes! Ready to commit."
|
||||
|
||||
[No build command run]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Linter ≠ compiler:**
|
||||
- Linter checks style and simple errors
|
||||
- Doesn't verify imports, types, compilation
|
||||
- Build might still fail
|
||||
|
||||
**Why it happens:** Partial verification, extrapolation.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Run actual build:**
|
||||
|
||||
```bash
|
||||
npm run build
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```
|
||||
ERROR in src/auth/service.ts:23:15
|
||||
TS2304: Cannot find name 'validateToken'
|
||||
Did you mean 'verifyToken'?
|
||||
```
|
||||
|
||||
**State actual status:**
|
||||
```
|
||||
Linter passes, but build fails.
|
||||
Error: validateToken not defined (should be verifyToken).
|
||||
Fixing...
|
||||
```
|
||||
|
||||
**After fix, run build again:**
|
||||
```bash
|
||||
npm run build
|
||||
# Output: ✓ built in 1.2s
|
||||
```
|
||||
|
||||
**Now can claim:**
|
||||
```
|
||||
Build passes. [Evidence: npm run build exit 0]
|
||||
Ready to commit.
|
||||
```
|
||||
|
||||
**Result:** Actual build status verified, error caught.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **No claims without fresh verification** → Run command, see output, THEN claim
|
||||
- "Should work" = forbidden
|
||||
- "Looks correct" = forbidden
|
||||
- Previous run ≠ fresh verification
|
||||
|
||||
2. **Use test-runner agent for verbose commands** → Tests, hooks, commits
|
||||
- Prevents context pollution
|
||||
- Returns summary + failures only
|
||||
- Never run `git commit` or `cargo test` directly if output is verbose
|
||||
|
||||
3. **Verify ALL success criteria** → Not just "tests pass"
|
||||
- Read each criterion from bd task
|
||||
- Run verification for each
|
||||
- Check all pass before closing
|
||||
|
||||
4. **Evidence in every claim** → Show the output
|
||||
- Not: "Tests pass"
|
||||
- Yes: "Tests pass [Ran: cargo test, Output: 34/34 passed]"
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: Stop, run verification:
|
||||
- "Should work now"
|
||||
- "I'm confident this fixes it"
|
||||
- "Just this once"
|
||||
- "Linter passed" (when claiming build works)
|
||||
- "Agent said success" (without independent verification)
|
||||
- "I'm tired" (exhaustion ≠ excuse)
|
||||
- "Partial check is enough"
|
||||
|
||||
## Pre-Commit Hook Assumption
|
||||
|
||||
**If your project uses pre-commit hooks enforcing tests:**
|
||||
- All test failures are from your current changes
|
||||
- Never check if errors were "pre-existing"
|
||||
- Don't run `git checkout <sha> && pytest` to verify
|
||||
- Pre-commit hooks guarantee previous commit passed
|
||||
- Just fix the error directly
|
||||
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
|
||||
Before claiming tests pass:
|
||||
- [ ] Ran full test command (not partial)
|
||||
- [ ] Saw output showing 0 failures
|
||||
- [ ] Used test-runner agent if output verbose
|
||||
|
||||
Before claiming build succeeds:
|
||||
- [ ] Ran build command (not just linter)
|
||||
- [ ] Saw exit code 0
|
||||
- [ ] Checked for compilation errors
|
||||
|
||||
Before closing bd task:
|
||||
- [ ] Re-read success criteria from bd task
|
||||
- [ ] Ran verification for each criterion
|
||||
- [ ] Saw evidence all pass
|
||||
- [ ] THEN closed task
|
||||
|
||||
Before closing bd epic:
|
||||
- [ ] Ran `bd list --status open --parent bd-1`
|
||||
- [ ] Saw 0 open tasks
|
||||
- [ ] Ran `bd dep tree bd-1`
|
||||
- [ ] Confirmed all tasks closed
|
||||
- [ ] THEN closed epic
|
||||
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
|
||||
**This skill calls:**
|
||||
- test-runner (for verbose verification commands)
|
||||
|
||||
**This skill is called by:**
|
||||
- test-driven-development (verify tests pass/fail)
|
||||
- executing-plans (verify task success criteria)
|
||||
- refactoring-safely (verify tests still pass)
|
||||
- ALL skills before completion claims
|
||||
|
||||
**Agents used:**
|
||||
- hyperpowers:test-runner (run tests, hooks, commits without output pollution)
|
||||
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
|
||||
**When stuck:**
|
||||
- Tempted to say "should work" → Run the verification
|
||||
- Agent reports success → Check VCS diff, verify independently
|
||||
- Partial verification → Run complete command
|
||||
- Tired and want to finish → Run verification anyway, no exceptions
|
||||
|
||||
**Verification patterns:**
|
||||
- Tests: Use test-runner agent, check 0 failures
|
||||
- Build: Run build command, check exit 0
|
||||
- bd task: Verify each success criterion
|
||||
- bd epic: Check all tasks closed with bd list/dep tree
|
||||
|
||||
</resources>
|
||||
478
skills/writing-plans/SKILL.md
Normal file
478
skills/writing-plans/SKILL.md
Normal file
@@ -0,0 +1,478 @@
|
||||
---
|
||||
name: writing-plans
|
||||
description: Use to expand bd tasks with detailed implementation steps - adds exact file paths, complete code, verification commands assuming zero context
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Enhance bd tasks with comprehensive implementation details for engineers with zero codebase context. Expand checklists into explicit steps: which files, complete code examples, exact commands, verification steps.
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
MEDIUM FREEDOM - Follow task-by-task validation pattern, use codebase-investigator for verification.
|
||||
|
||||
Adapt implementation details to actual codebase state. Never use placeholders or meta-references.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
|
||||
| Step | Action | Critical Rule |
|
||||
|------|--------|---------------|
|
||||
| **Identify Scope** | Single task, range, or full epic | No artificial limits |
|
||||
| **Verify Codebase** | Use `codebase-investigator` agent | NEVER verify yourself, report discrepancies |
|
||||
| **Draft Steps** | Write bite-sized (2-5 min) actions | Follow TDD cycle for new features |
|
||||
| **Present to User** | Show COMPLETE expansion FIRST | Then ask for approval |
|
||||
| **Update bd** | `bd update bd-N --design "..."` | Only after user approves |
|
||||
| **Continue** | Move to next task automatically | NO asking permission between tasks |
|
||||
|
||||
**FORBIDDEN:** Placeholders like `[Full implementation steps as detailed above]`
|
||||
**REQUIRED:** Actual content - complete code, exact paths, real commands
|
||||
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
**Use after hyperpowers:sre-task-refinement or anytime tasks need more detail.**
|
||||
|
||||
Symptoms:
|
||||
- bd tasks have implementation checklists but need expansion
|
||||
- Engineer needs step-by-step guide with zero context
|
||||
- Want explicit file paths, complete code examples
|
||||
- Need exact verification commands
|
||||
|
||||
</when_to_use>
|
||||
|
||||
<the_process>
|
||||
|
||||
## 1. Identify Tasks to Expand
|
||||
|
||||
**User specifies scope:**
|
||||
- Single: "Expand bd-2"
|
||||
- Range: "Expand bd-2 through bd-5"
|
||||
- Epic: "Expand all tasks in bd-1"
|
||||
|
||||
**If epic:**
|
||||
```bash
|
||||
bd dep tree bd-1 # View complete dependency tree
|
||||
# Note all child task IDs
|
||||
```
|
||||
|
||||
**Create TodoWrite tracker:**
|
||||
```
|
||||
- [ ] bd-2: [Task Title]
|
||||
- [ ] bd-3: [Task Title]
|
||||
...
|
||||
```
|
||||
|
||||
## 2. For EACH Task (Loop Until All Done)
|
||||
|
||||
### 2a. Mark In Progress and Read Current State
|
||||
|
||||
```bash
|
||||
# Mark in TodoWrite: in_progress
|
||||
bd show bd-3 # Read current task design
|
||||
```
|
||||
|
||||
### 2b. Verify Codebase State
|
||||
|
||||
**CRITICAL: Use codebase-investigator agent, NEVER verify yourself.**
|
||||
|
||||
**Provide agent with bd assumptions:**
|
||||
```
|
||||
Assumptions from bd-3:
|
||||
- Auth service should be in src/services/auth.ts with login() and logout()
|
||||
- User model in src/models/user.ts with email and password fields
|
||||
- Test file at tests/services/auth.test.ts
|
||||
- Uses bcrypt dependency for password hashing
|
||||
|
||||
Verify these assumptions and report:
|
||||
1. What exists vs what bd-3 expects
|
||||
2. Structural differences (different paths, functions, exports)
|
||||
3. Missing or additional components
|
||||
4. Current dependency versions
|
||||
```
|
||||
|
||||
**Based on investigator report:**
|
||||
- ✓ Confirmed assumptions → Use in implementation
|
||||
- ✗ Incorrect assumptions → Adjust plan to match reality
|
||||
- + Found additional → Document and incorporate
|
||||
|
||||
**NEVER write conditional steps:**
|
||||
❌ "Update `index.js` if exists"
|
||||
❌ "Modify `config.py` (if present)"
|
||||
|
||||
**ALWAYS write definitive steps:**
|
||||
✅ "Create `src/auth.ts`" (investigator confirmed doesn't exist)
|
||||
✅ "Modify `src/index.ts:45-67`" (investigator confirmed exists)
|
||||
|
||||
### 2c. Draft Expanded Implementation Steps
|
||||
|
||||
**Bite-sized granularity (2-5 minutes per step):**
|
||||
|
||||
For new features (follow test-driven-development):
|
||||
1. Write the failing test (one step)
|
||||
2. Run it to verify it fails (one step)
|
||||
3. Implement minimal code to pass (one step)
|
||||
4. Run tests to verify they pass (one step)
|
||||
5. Commit (one step)
|
||||
|
||||
**Include in each step:**
|
||||
- Exact file path
|
||||
- Complete code example (not pseudo-code)
|
||||
- Exact command to run
|
||||
- Expected output
|
||||
|
||||
### 2d. Present COMPLETE Expansion to User
|
||||
|
||||
**CRITICAL: Show the full expansion BEFORE asking for approval.**
|
||||
|
||||
**Format:**
|
||||
```markdown
|
||||
**bd-[N]: [Task Title]**
|
||||
|
||||
**From bd issue:**
|
||||
- Goal: [From bd show]
|
||||
- Effort estimate: [From bd issue]
|
||||
- Success criteria: [From bd issue]
|
||||
|
||||
**Codebase verification findings:**
|
||||
- ✓ Confirmed: [what matched]
|
||||
- ✗ Incorrect: [what issue said] - ACTUALLY: [reality]
|
||||
- + Found: [unexpected discoveries]
|
||||
|
||||
**Implementation steps based on actual codebase state:**
|
||||
|
||||
### Step Group 1: [Component Name]
|
||||
|
||||
**Files:**
|
||||
- Create: `exact/path/to/file.py`
|
||||
- Modify: `exact/path/to/existing.py:123-145`
|
||||
- Test: `tests/exact/path/to/test.py`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
```python
|
||||
# tests/auth/test_login.py
|
||||
def test_login_with_valid_credentials():
|
||||
user = create_test_user(email="test@example.com", password="secure123")
|
||||
result = login(email="test@example.com", password="secure123")
|
||||
assert result.success is True
|
||||
assert result.user_id == user.id
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
```bash
|
||||
pytest tests/auth/test_login.py::test_login_with_valid_credentials
|
||||
# Expected: ModuleNotFoundError: No module named 'auth.login'
|
||||
```
|
||||
|
||||
[... continue for all steps ...]
|
||||
```
|
||||
|
||||
**THEN ask for approval using AskUserQuestion:**
|
||||
- Question: "Is this expansion approved for bd-[N]?"
|
||||
- Options:
|
||||
- "Approved - continue to next task"
|
||||
- "Needs revision"
|
||||
- "Other"
|
||||
|
||||
### 2e. If Approved: Update bd and Continue
|
||||
|
||||
```bash
|
||||
bd update bd-3 --design "[paste complete expansion]"
|
||||
# Mark completed in TodoWrite
|
||||
# IMMEDIATELY continue to next task (NO asking permission)
|
||||
```
|
||||
|
||||
### 2f. If Needs Revision: Iterate
|
||||
|
||||
- Keep as in_progress in TodoWrite
|
||||
- Revise based on feedback
|
||||
- Present again (step 2d)
|
||||
|
||||
## 3. After ALL Tasks Done
|
||||
|
||||
```
|
||||
All bd issues now contain detailed implementation steps.
|
||||
Epic ready for execution.
|
||||
```
|
||||
|
||||
**Offer execution choice:**
|
||||
"Ready to execute? I can use hyperpowers:executing-plans to implement iteratively."
|
||||
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
|
||||
<example>
|
||||
<scenario>Developer writes placeholder text instead of actual implementation steps</scenario>
|
||||
|
||||
<code>
|
||||
bd update bd-3 --design "## Goal
|
||||
Implement user authentication
|
||||
|
||||
## Implementation
|
||||
[Full implementation steps as detailed above - includes all 6 step groups with complete code examples]
|
||||
|
||||
## Tests
|
||||
[Complete code examples will be added here]"
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Placeholders defeat the purpose:**
|
||||
- Engineer executing bd-3 has zero context
|
||||
- There is no "above" to reference (bd issue is the source of truth)
|
||||
- Violates "exact instructions for zero-context engineer" principle
|
||||
- Makes task impossible to execute
|
||||
|
||||
**Common placeholder patterns (ALL FORBIDDEN):**
|
||||
- `[Full implementation steps as detailed above]`
|
||||
- `[See above for detailed steps]`
|
||||
- `[As specified in success criteria]`
|
||||
- `[Complete code examples will be added here]`
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Write actual content:**
|
||||
|
||||
```bash
|
||||
bd update bd-3 --design "## Goal
|
||||
Implement user authentication
|
||||
|
||||
## Implementation
|
||||
|
||||
### Step 1: Write failing login test
|
||||
```python
|
||||
# tests/auth/test_login.py
|
||||
import pytest
|
||||
from auth.service import login
|
||||
|
||||
def test_login_with_valid_credentials():
|
||||
result = login(email='test@example.com', password='pass123')
|
||||
assert result.success is True
|
||||
```
|
||||
|
||||
### Step 2: Run test (should fail)
|
||||
```bash
|
||||
pytest tests/auth/test_login.py::test_login_with_valid_credentials
|
||||
# Expected: ModuleNotFoundError: No module named 'auth.service'
|
||||
```
|
||||
|
||||
### Step 3: Create login function
|
||||
```python
|
||||
# src/auth/service.py
|
||||
from dataclasses import dataclass
|
||||
|
||||
@dataclass
|
||||
class LoginResult:
|
||||
success: bool
|
||||
user_id: int | None = None
|
||||
|
||||
def login(email: str, password: str) -> LoginResult:
|
||||
# Minimal implementation
|
||||
return LoginResult(success=True, user_id=1)
|
||||
```
|
||||
|
||||
[... continue for all steps with complete code ...]
|
||||
|
||||
## Tests
|
||||
All test code included in implementation steps above following TDD cycle."
|
||||
```
|
||||
|
||||
**Result:** Engineer can execute without any context.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer verifies codebase state themselves instead of using codebase-investigator agent</scenario>
|
||||
|
||||
<code>
|
||||
Developer reads files manually:
|
||||
- Reads src/services/auth.ts directly
|
||||
- Checks package.json manually
|
||||
- Assumes file structure based on quick look
|
||||
|
||||
Writes expansion based on quick check:
|
||||
"Modify src/services/auth.ts (if exists)"
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Manual verification problems:**
|
||||
- Misses nuances (existing functions, imports, structure)
|
||||
- Creates conditional steps ("if exists")
|
||||
- Doesn't catch version mismatches
|
||||
- Doesn't report discrepancies from bd assumptions
|
||||
|
||||
**Result:** Implementation plan may not match actual codebase state.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Use codebase-investigator agent:**
|
||||
|
||||
```
|
||||
Dispatch agent with bd-3 assumptions:
|
||||
"bd-3 expects auth service in src/services/auth.ts with login() and logout() functions.
|
||||
Verify:
|
||||
1. Does src/services/auth.ts exist?
|
||||
2. What functions does it export?
|
||||
3. How do login() and logout() work currently?
|
||||
4. Any other relevant auth code?
|
||||
5. What's the bcrypt version?"
|
||||
```
|
||||
|
||||
**Agent reports:**
|
||||
```
|
||||
✓ src/services/auth.ts exists
|
||||
✗ ONLY has login() function - NO logout() yet
|
||||
+ Found: login() uses argon2 NOT bcrypt
|
||||
+ Found: Session management in src/services/session.ts
|
||||
✓ argon2 version: 0.31.2
|
||||
```
|
||||
|
||||
**Write definitive steps based on findings:**
|
||||
```
|
||||
Step 1: Add logout() function to EXISTING src/services/auth.ts:45-67
|
||||
(no "if exists" - investigator confirmed location)
|
||||
|
||||
Step 2: Use argon2 (already installed 0.31.2) not bcrypt
|
||||
(no assumption - investigator confirmed actual dependency)
|
||||
```
|
||||
|
||||
**Result:** Plan matches actual codebase state.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer asks permission between each task validation instead of continuing automatically</scenario>
|
||||
|
||||
<code>
|
||||
After user approves bd-3 expansion:
|
||||
|
||||
Developer: "bd-3 expansion approved and updated in bd.
|
||||
Should I continue to bd-4 now? What's your preference?"
|
||||
|
||||
[Waits for user response]
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
**Breaks workflow momentum:**
|
||||
- Unnecessary interruption
|
||||
- User has to respond multiple times
|
||||
- Slows down batch processing
|
||||
- TodoWrite list IS the plan
|
||||
|
||||
**Why it happens:** Over-asking for permission instead of executing the plan.
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**After user approves bd-3:**
|
||||
|
||||
```bash
|
||||
bd update bd-3 --design "[expansion]" # Update bd
|
||||
# Mark completed in TodoWrite
|
||||
```
|
||||
|
||||
**IMMEDIATELY continue to bd-4:**
|
||||
```bash
|
||||
bd show bd-4 # Read next task
|
||||
# Dispatch codebase-investigator with bd-4 assumptions
|
||||
# Draft expansion
|
||||
# Present bd-4 expansion to user
|
||||
```
|
||||
|
||||
**NO asking:** "Should I continue?" or "What's your preference?"
|
||||
|
||||
**ONLY ask user:**
|
||||
1. When presenting each task expansion for validation
|
||||
2. At the VERY END after ALL tasks done to offer execution choice
|
||||
|
||||
**Between validations: JUST CONTINUE.**
|
||||
|
||||
**Result:** Efficient batch processing of all tasks.
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
</examples>
|
||||
|
||||
<critical_rules>
|
||||
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **No placeholders or meta-references** → Write actual content
|
||||
- ❌ FORBIDDEN: `[Full implementation steps as detailed above]`
|
||||
- ✅ REQUIRED: Complete code, exact paths, real commands
|
||||
|
||||
2. **Use codebase-investigator agent** → Never verify yourself
|
||||
- Agent gets bd assumptions
|
||||
- Agent reports discrepancies
|
||||
- You adjust plan to match reality
|
||||
|
||||
3. **Present COMPLETE expansion before asking** → User must SEE before approving
|
||||
- Show full expansion in message text
|
||||
- Then use AskUserQuestion for approval
|
||||
- Never ask without showing first
|
||||
|
||||
4. **Continue automatically between validations** → Don't ask permission
|
||||
- TodoWrite list IS your plan
|
||||
- Execute it completely
|
||||
- Only ask: (a) task validation, (b) final execution choice
|
||||
|
||||
5. **Write definitive steps** → Never conditional
|
||||
- ❌ "Update `index.js` if exists"
|
||||
- ✅ "Create `src/auth.ts`" (investigator confirmed)
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: Stop, write actual content:
|
||||
- "I'll add the details later"
|
||||
- "The implementation is obvious from the goal"
|
||||
- "See above for the steps"
|
||||
- "User can figure out the code"
|
||||
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
|
||||
Before marking each task complete in TodoWrite:
|
||||
- [ ] Used codebase-investigator agent (not manual verification)
|
||||
- [ ] Presented COMPLETE expansion to user (showed full text)
|
||||
- [ ] User approved expansion (via AskUserQuestion)
|
||||
- [ ] Updated bd with actual content (no placeholders)
|
||||
- [ ] No meta-references in design field
|
||||
|
||||
Before finishing all tasks:
|
||||
- [ ] All tasks in TodoWrite marked completed
|
||||
- [ ] All bd issues updated with expansions
|
||||
- [ ] No conditional steps ("if exists")
|
||||
- [ ] Complete code examples in all steps
|
||||
- [ ] Exact file paths and commands throughout
|
||||
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
|
||||
**This skill calls:**
|
||||
- sre-task-refinement (optional, can run before this)
|
||||
- codebase-investigator (REQUIRED for each task verification)
|
||||
- executing-plans (offered after all tasks expanded)
|
||||
|
||||
**This skill is called by:**
|
||||
- User (via /hyperpowers:write-plan command)
|
||||
- After brainstorming creates epic
|
||||
|
||||
**Agents used:**
|
||||
- hyperpowers:codebase-investigator (verify assumptions, report discrepancies)
|
||||
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
|
||||
**Detailed guidance:**
|
||||
- [bd command reference](../common-patterns/bd-commands.md)
|
||||
- [Task structure examples](resources/task-examples.md) (if exists)
|
||||
|
||||
**When stuck:**
|
||||
- Unsure about file structure → Use codebase-investigator
|
||||
- Don't know version → Use codebase-investigator
|
||||
- Tempted to write "if exists" → Use codebase-investigator first
|
||||
- About to write placeholder → Stop, write actual content
|
||||
- Want to ask permission → Check: Is this task validation or final choice? If neither, don't ask
|
||||
|
||||
</resources>
|
||||
599
skills/writing-skills/SKILL.md
Normal file
599
skills/writing-skills/SKILL.md
Normal file
@@ -0,0 +1,599 @@
|
||||
---
|
||||
name: writing-skills
|
||||
description: Use when creating new skills, editing existing skills, or verifying skills work - applies TDD to documentation by testing with subagents before writing
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
Writing skills IS test-driven development applied to process documentation; write test (pressure scenario), watch fail (baseline), write skill, watch pass, refactor (close loopholes).
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW FREEDOM - Follow the RED-GREEN-REFACTOR cycle exactly when creating skills. No skill without failing test first. Same Iron Law as TDD.
|
||||
</rigidity_level>
|
||||
|
||||
<quick_reference>
|
||||
| Phase | Action | Verify |
|
||||
|-------|--------|--------|
|
||||
| **RED** | Create pressure scenarios | Document baseline failures |
|
||||
| **RED** | Run WITHOUT skill | Agent violates rule |
|
||||
| **GREEN** | Write minimal skill | Addresses baseline failures |
|
||||
| **GREEN** | Run WITH skill | Agent now complies |
|
||||
| **REFACTOR** | Find new rationalizations | Agent still complies |
|
||||
| **REFACTOR** | Add explicit counters | Bulletproof against excuses |
|
||||
| **DEPLOY** | Commit and optionally PR | Skill ready for use |
|
||||
|
||||
**Iron Law:** NO SKILL WITHOUT FAILING TEST FIRST (applies to new skills AND edits)
|
||||
</quick_reference>
|
||||
|
||||
<when_to_use>
|
||||
**Create skill when:**
|
||||
- Technique wasn't intuitively obvious to you
|
||||
- You'd reference this again across projects
|
||||
- Pattern applies broadly (not project-specific)
|
||||
- Others would benefit from this knowledge
|
||||
|
||||
**Never create for:**
|
||||
- One-off solutions
|
||||
- Standard practices well-documented elsewhere
|
||||
- Project-specific conventions (put in CLAUDE.md instead)
|
||||
|
||||
**Edit existing skill when:**
|
||||
- Found new rationalization agents use
|
||||
- Discovered loophole in current guidance
|
||||
- Need to add clarifying examples
|
||||
|
||||
**ALWAYS test before writing or editing. No exceptions.**
|
||||
</when_to_use>
|
||||
|
||||
<tdd_mapping>
|
||||
Skills use the exact same TDD cycle as code:
|
||||
|
||||
| TDD Concept | Skill Creation |
|
||||
|-------------|----------------|
|
||||
| **Test case** | Pressure scenario with subagent |
|
||||
| **Production code** | Skill document (SKILL.md) |
|
||||
| **Test fails (RED)** | Agent violates rule without skill |
|
||||
| **Test passes (GREEN)** | Agent complies with skill present |
|
||||
| **Refactor** | Close loopholes while maintaining compliance |
|
||||
| **Write test first** | Run baseline scenario BEFORE writing skill |
|
||||
| **Watch it fail** | Document exact rationalizations agent uses |
|
||||
| **Minimal code** | Write skill addressing those specific violations |
|
||||
| **Watch it pass** | Verify agent now complies |
|
||||
| **Refactor cycle** | Find new rationalizations → plug → re-verify |
|
||||
|
||||
**REQUIRED BACKGROUND:** You MUST understand hyperpowers:test-driven-development before using this skill.
|
||||
</tdd_mapping>
|
||||
|
||||
<the_process>
|
||||
## 1. RED Phase - Create Failing Test
|
||||
|
||||
**Create pressure scenarios for subagent:**
|
||||
|
||||
```
|
||||
Task tool with general-purpose agent:
|
||||
|
||||
"You are implementing a payment processing feature. User requirements:
|
||||
- Process credit card payments
|
||||
- Handle retries on failure
|
||||
- Log all transactions
|
||||
|
||||
[PRESSURE 1: Time] You have 10 minutes before deployment.
|
||||
[PRESSURE 2: Sunk Cost] You've already written 200 lines of code.
|
||||
[PRESSURE 3: Authority] Senior engineer said 'just make it work, tests can wait.'
|
||||
|
||||
Implement this feature."
|
||||
```
|
||||
|
||||
**Run WITHOUT skill present.**
|
||||
|
||||
**Document baseline behavior:**
|
||||
- Exact rationalizations agent uses ("tests can wait," "simple feature," etc.)
|
||||
- What agent skips (tests, verification, bd task, etc.)
|
||||
- Patterns in failure modes
|
||||
|
||||
**Example baseline result:**
|
||||
```
|
||||
Agent response:
|
||||
"I'll implement the payment processing quickly since time is tight..."
|
||||
[Skips TDD]
|
||||
[Skips verification-before-completion]
|
||||
[Claims done without evidence]
|
||||
```
|
||||
|
||||
**This is your failing test.** Agent doesn't follow the workflow without guidance.
|
||||
|
||||
---
|
||||
|
||||
## 2. GREEN Phase - Write Minimal Skill
|
||||
|
||||
Write skill that addresses the SPECIFIC failures from baseline:
|
||||
|
||||
**Structure:**
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: skill-name-with-hyphens
|
||||
description: Use when [specific triggers] - [what skill does]
|
||||
---
|
||||
|
||||
<skill_overview>
|
||||
One sentence core principle
|
||||
</skill_overview>
|
||||
|
||||
<rigidity_level>
|
||||
LOW | MEDIUM | HIGH FREEDOM - [What this means]
|
||||
</rigidity_level>
|
||||
|
||||
[Rest of standard XML structure]
|
||||
```
|
||||
|
||||
**Frontmatter rules:**
|
||||
- Only `name` and `description` fields (max 1024 chars total)
|
||||
- Name: letters, numbers, hyphens only (no parentheses/special chars)
|
||||
- Description: Start with "Use when...", third person, includes triggers
|
||||
|
||||
**Description format:**
|
||||
```yaml
|
||||
# ❌ BAD: Too abstract, first person
|
||||
description: I can help with async tests when they're flaky
|
||||
|
||||
# ✅ GOOD: Starts with "Use when", describes problem
|
||||
description: Use when tests have race conditions or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
|
||||
```
|
||||
|
||||
**Write skill addressing baseline failures:**
|
||||
- Add explicit counters for rationalizations ("tests can wait" → "NO EXCEPTIONS: tests first")
|
||||
- Create quick reference table for scanning
|
||||
- Add concrete examples showing failure modes
|
||||
- Use XML structure for all sections
|
||||
|
||||
**Run WITH skill present.**
|
||||
|
||||
**Verify agent now complies:**
|
||||
- Same pressure scenario
|
||||
- Agent now follows workflow
|
||||
- No rationalizations from baseline appear
|
||||
|
||||
**This is your passing test.**
|
||||
|
||||
---
|
||||
|
||||
## 3. REFACTOR Phase - Close Loopholes
|
||||
|
||||
**Find NEW rationalizations:**
|
||||
|
||||
Run skill with DIFFERENT pressures:
|
||||
- Combine 3+ pressures (time + sunk cost + exhaustion)
|
||||
- Try meta-rationalizations ("this skill doesn't apply because...")
|
||||
- Test with edge cases
|
||||
|
||||
**Document new failures:**
|
||||
- What rationalizations appear NOW?
|
||||
- What loopholes did agent find?
|
||||
- What explicit counters are needed?
|
||||
|
||||
**Add counters to skill:**
|
||||
|
||||
```markdown
|
||||
<critical_rules>
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: [Action to take]
|
||||
- "Test can wait" (NO, test first always)
|
||||
- "Simple feature" (Simple breaks too, test first)
|
||||
- "Time pressure" (Broken code wastes more time)
|
||||
[Add ALL rationalizations found during testing]
|
||||
</critical_rules>
|
||||
```
|
||||
|
||||
**Re-test until bulletproof:**
|
||||
- Run scenarios again
|
||||
- Verify new counters work
|
||||
- Agent complies even under combined pressures
|
||||
|
||||
---
|
||||
|
||||
## 4. Quality Checks
|
||||
|
||||
Before deployment, verify:
|
||||
|
||||
- [ ] Has `<quick_reference>` section (scannable table)
|
||||
- [ ] Has `<rigidity_level>` explicit
|
||||
- [ ] Has 2-3 `<example>` tags showing failure modes
|
||||
- [ ] Description <500 chars, starts with "Use when..."
|
||||
- [ ] Keywords throughout for search (error messages, symptoms, tools)
|
||||
- [ ] One excellent code example (not multi-language)
|
||||
- [ ] Supporting files only for tools or heavy reference (>100 lines)
|
||||
|
||||
**Token efficiency:**
|
||||
- Frequently-loaded skills: <200 words ideally
|
||||
- Other skills: <500 words
|
||||
- Move heavy content to resources/ files
|
||||
|
||||
---
|
||||
|
||||
## 5. Deploy
|
||||
|
||||
**Commit to git:**
|
||||
|
||||
```bash
|
||||
git add skills/skill-name/
|
||||
git commit -m "feat: add [skill-name] skill
|
||||
|
||||
Tested with subagents under [pressures used].
|
||||
Addresses [baseline failures found].
|
||||
|
||||
Closes rationalizations:
|
||||
- [Rationalization 1]
|
||||
- [Rationalization 2]"
|
||||
```
|
||||
|
||||
**Personal skills:** Write to `~/.claude/skills/` for cross-project use
|
||||
|
||||
**Plugin skills:** PR to plugin repository if broadly useful
|
||||
|
||||
**STOP:** Before moving to next skill, complete this entire process. No batching untested skills.
|
||||
</the_process>
|
||||
|
||||
<examples>
|
||||
<example>
|
||||
<scenario>Developer writes skill without testing first</scenario>
|
||||
|
||||
<code>
|
||||
# Developer writes skill:
|
||||
"---
|
||||
name: always-use-tdd
|
||||
description: Always write tests first
|
||||
---
|
||||
|
||||
Write tests first. No exceptions."
|
||||
|
||||
# Then tries to deploy it
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- No baseline behavior documented (don't know what agent does WITHOUT skill)
|
||||
- No verification skill actually works (might not address real rationalizations)
|
||||
- Generic guidance ("no exceptions") without specific counters
|
||||
- Will likely miss common excuses agents use
|
||||
- Violates Iron Law: no skill without failing test first
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach (RED-GREEN-REFACTOR):**
|
||||
|
||||
**RED Phase:**
|
||||
1. Create pressure scenario (time + sunk cost)
|
||||
2. Run WITHOUT skill
|
||||
3. Document baseline: Agent says "I'll test after since time is tight"
|
||||
|
||||
**GREEN Phase:**
|
||||
1. Write skill with explicit counter to that rationalization
|
||||
2. Add: "Common excuses: 'Time is tight' → Wrong. Broken code wastes more time. Write test first."
|
||||
3. Run WITH skill → agent now writes test first
|
||||
|
||||
**REFACTOR Phase:**
|
||||
1. Try new pressure (exhaustion: "this is the 5th feature today")
|
||||
2. Agent finds loophole: "these are all similar, I can skip tests"
|
||||
3. Add counter: "Similar ≠ identical. Write test for each."
|
||||
4. Re-test → bulletproof
|
||||
|
||||
**What you gain:**
|
||||
- Know skill addresses real failures (saw baseline)
|
||||
- Confident skill works (saw it fix behavior)
|
||||
- Closed all loopholes (tested multiple pressures)
|
||||
- Ready for production use
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Developer edits skill without testing changes</scenario>
|
||||
|
||||
<code>
|
||||
# Existing skill works well
|
||||
# Developer thinks: "I'll just add this section about edge cases"
|
||||
|
||||
[Adds 50 lines to skill]
|
||||
|
||||
# Commits without testing
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Don't know if new section actually helps (no baseline)
|
||||
- Might introduce contradictions with existing guidance
|
||||
- Could make skill less effective (more verbose, less clear)
|
||||
- Violates Iron Law: applies to edits too
|
||||
- Changes might not address actual rationalization patterns
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Correct approach:**
|
||||
|
||||
**RED Phase (for edit):**
|
||||
1. Identify specific failure mode you want to address
|
||||
2. Create pressure scenario that triggers it
|
||||
3. Run WITH current skill → document how agent fails
|
||||
|
||||
**GREEN Phase (edit):**
|
||||
1. Add ONLY content addressing that failure
|
||||
2. Keep changes minimal
|
||||
3. Run WITH edited skill → verify agent now complies
|
||||
|
||||
**REFACTOR Phase:**
|
||||
1. Check edit didn't break existing scenarios
|
||||
2. Run previous test cases
|
||||
3. Verify all still pass
|
||||
|
||||
**What you gain:**
|
||||
- Changes address real problems (saw failure)
|
||||
- Know edit helps (saw improvement)
|
||||
- Didn't break existing guidance (regression tested)
|
||||
- Skill stays bulletproof
|
||||
</correction>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<scenario>Skill description too vague for search</scenario>
|
||||
|
||||
<code>
|
||||
---
|
||||
name: async-testing
|
||||
description: For testing async code
|
||||
---
|
||||
|
||||
# Skill content...
|
||||
</code>
|
||||
|
||||
<why_it_fails>
|
||||
- Future Claude won't find this when needed
|
||||
- "For testing async code" too abstract (when would Claude search this?)
|
||||
- Doesn't describe symptoms or triggers
|
||||
- Missing keywords like "flaky," "race condition," "timeout"
|
||||
- Won't show up when agent has the actual problem
|
||||
</why_it_fails>
|
||||
|
||||
<correction>
|
||||
**Better description:**
|
||||
|
||||
```yaml
|
||||
---
|
||||
name: condition-based-waiting
|
||||
description: Use when tests have race conditions, timing dependencies, or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
|
||||
---
|
||||
```
|
||||
|
||||
**Why this works:**
|
||||
- Starts with "Use when" (triggers)
|
||||
- Lists symptoms: "race conditions," "pass/fail inconsistently"
|
||||
- Describes problem AND solution
|
||||
- Keywords: "race conditions," "timing," "inconsistent," "timeouts"
|
||||
- Future Claude searching "why are my tests flaky" will find this
|
||||
|
||||
**What you gain:**
|
||||
- Skill actually gets found when needed
|
||||
- Claude knows when to use it (clear triggers)
|
||||
- Search terms match real developer language
|
||||
- Description doubles as activation criteria
|
||||
</correction>
|
||||
</example>
|
||||
</examples>
|
||||
|
||||
<skill_types>
|
||||
## Technique
|
||||
Concrete method with steps to follow.
|
||||
|
||||
**Examples:** condition-based-waiting, hyperpowers:root-cause-tracing
|
||||
|
||||
**Test approach:** Pressure scenarios with combined pressures
|
||||
|
||||
## Pattern
|
||||
Way of thinking about problems.
|
||||
|
||||
**Examples:** flatten-with-flags, test-invariants
|
||||
|
||||
**Test approach:** Present problems the pattern solves, verify agent applies pattern
|
||||
|
||||
## Reference
|
||||
API docs, syntax guides, tool documentation.
|
||||
|
||||
**Examples:** Office document manipulation, API reference guides
|
||||
|
||||
**Test approach:** Give task requiring reference, verify agent uses it correctly
|
||||
|
||||
**For detailed testing methodology by skill type:** See [resources/testing-methodology.md](resources/testing-methodology.md)
|
||||
</skill_types>
|
||||
|
||||
<file_organization>
|
||||
## Self-Contained Skill
|
||||
```
|
||||
defense-in-depth/
|
||||
SKILL.md # Everything inline
|
||||
```
|
||||
**When:** All content fits, no heavy reference needed
|
||||
|
||||
## Skill with Reusable Tool
|
||||
```
|
||||
condition-based-waiting/
|
||||
SKILL.md # Overview + patterns
|
||||
example.ts # Working helpers to adapt
|
||||
```
|
||||
**When:** Tool is reusable code, not just narrative
|
||||
|
||||
## Skill with Heavy Reference
|
||||
```
|
||||
pptx/
|
||||
SKILL.md # Overview + workflows
|
||||
pptxgenjs.md # 600 lines API reference
|
||||
ooxml.md # 500 lines XML structure
|
||||
scripts/ # Executable tools
|
||||
```
|
||||
**When:** Reference material too large for inline (>100 lines)
|
||||
|
||||
**Keep inline:**
|
||||
- Principles and concepts
|
||||
- Code patterns (<50 lines)
|
||||
- Everything that fits
|
||||
</file_organization>
|
||||
|
||||
<search_optimization>
|
||||
## Claude Search Optimization (CSO)
|
||||
|
||||
Future Claude needs to FIND your skill. Optimize for search.
|
||||
|
||||
### 1. Rich Description Field
|
||||
|
||||
**Format:** Start with "Use when..." + triggers + what it does
|
||||
|
||||
```yaml
|
||||
# ❌ BAD: Too abstract
|
||||
description: For async testing
|
||||
|
||||
# ❌ BAD: First person
|
||||
description: I can help you with async tests
|
||||
|
||||
# ✅ GOOD: Triggers + problem + solution
|
||||
description: Use when tests have race conditions or pass/fail inconsistently - replaces arbitrary timeouts with condition polling
|
||||
```
|
||||
|
||||
### 2. Keyword Coverage
|
||||
|
||||
Use words Claude would search for:
|
||||
- **Error messages:** "Hook timed out", "ENOTEMPTY", "race condition"
|
||||
- **Symptoms:** "flaky", "hanging", "zombie", "pollution"
|
||||
- **Synonyms:** "timeout/hang/freeze", "cleanup/teardown/afterEach"
|
||||
- **Tools:** Actual commands, library names, file types
|
||||
|
||||
### 3. Token Efficiency
|
||||
|
||||
**Problem:** Frequently-referenced skills load into EVERY conversation.
|
||||
|
||||
**Target word counts:**
|
||||
- Frequently-loaded: <200 words
|
||||
- Other skills: <500 words
|
||||
|
||||
**Techniques:**
|
||||
- Move details to tool --help
|
||||
- Use cross-references to other skills
|
||||
- Compress examples
|
||||
- Eliminate redundancy
|
||||
|
||||
**Verification:**
|
||||
```bash
|
||||
wc -w skills/skill-name/SKILL.md
|
||||
```
|
||||
|
||||
### 4. Cross-Referencing
|
||||
|
||||
**Use skill name only, with explicit markers:**
|
||||
```markdown
|
||||
**REQUIRED BACKGROUND:** You MUST understand hyperpowers:test-driven-development
|
||||
**REQUIRED SUB-SKILL:** Use hyperpowers:debugging-with-tools first
|
||||
```
|
||||
|
||||
**Don't use @ links:** Force-loads files immediately, burns context unnecessarily.
|
||||
</search_optimization>
|
||||
|
||||
<critical_rules>
|
||||
## Rules That Have No Exceptions
|
||||
|
||||
1. **NO SKILL WITHOUT FAILING TEST FIRST** → Applies to new skills AND edits
|
||||
2. **Test with subagents under pressure** → Combined pressures (time + sunk cost + authority)
|
||||
3. **Document baseline behavior** → Exact rationalizations, not paraphrases
|
||||
4. **Write minimal skill addressing baseline** → Don't add content not validated by testing
|
||||
5. **STOP before next skill** → Complete RED-GREEN-REFACTOR-DEPLOY for each skill
|
||||
|
||||
## Common Excuses
|
||||
|
||||
All of these mean: **STOP. Run baseline test first.**
|
||||
|
||||
- "Simple skill, don't need testing" (If simple, testing is fast. Do it.)
|
||||
- "Just adding documentation" (Documentation can be wrong. Test it.)
|
||||
- "I'll test after I write a few" (Batching untested = deploying untested code)
|
||||
- "This is obvious, everyone knows it" (Then baseline will show agent already complies)
|
||||
- "Testing is overkill for skills" (TDD applies to documentation too)
|
||||
- "I'll adapt while testing" (Violates RED phase. Start over.)
|
||||
- "I'll keep untested as reference" (Delete means delete. No exceptions.)
|
||||
|
||||
## The Iron Law
|
||||
|
||||
Same as TDD:
|
||||
|
||||
```
|
||||
NO SKILL WITHOUT FAILING TEST FIRST
|
||||
```
|
||||
|
||||
**No exceptions for:**
|
||||
- "Simple additions"
|
||||
- "Just adding a section"
|
||||
- "Documentation updates"
|
||||
- Edits to existing skills
|
||||
|
||||
**Write skill before testing?** Delete it. Start over.
|
||||
</critical_rules>
|
||||
|
||||
<verification_checklist>
|
||||
Before deploying ANY skill:
|
||||
|
||||
**RED Phase:**
|
||||
- [ ] Created pressure scenarios (3+ combined pressures for discipline skills)
|
||||
- [ ] Ran WITHOUT skill present
|
||||
- [ ] Documented baseline behavior verbatim (exact rationalizations)
|
||||
- [ ] Identified patterns in failures
|
||||
|
||||
**GREEN Phase:**
|
||||
- [ ] Name uses only letters, numbers, hyphens
|
||||
- [ ] YAML frontmatter: name + description only (max 1024 chars)
|
||||
- [ ] Description starts with "Use when..." and includes triggers
|
||||
- [ ] Description in third person
|
||||
- [ ] Has `<quick_reference>` section
|
||||
- [ ] Has `<rigidity_level>` explicit
|
||||
- [ ] Has 2-3 `<example>` tags
|
||||
- [ ] Addresses specific baseline failures
|
||||
- [ ] Ran WITH skill present
|
||||
- [ ] Verified agent now complies
|
||||
|
||||
**REFACTOR Phase:**
|
||||
- [ ] Tested with different pressures
|
||||
- [ ] Found NEW rationalizations
|
||||
- [ ] Added explicit counters
|
||||
- [ ] Re-tested until bulletproof
|
||||
|
||||
**Quality:**
|
||||
- [ ] Keywords throughout for search
|
||||
- [ ] One excellent code example (not multi-language)
|
||||
- [ ] Token-efficient (check word count)
|
||||
- [ ] Supporting files only if needed
|
||||
|
||||
**Deploy:**
|
||||
- [ ] Committed to git with descriptive message
|
||||
- [ ] Pushed to plugin repository (if applicable)
|
||||
|
||||
**Can't check all boxes?** Return to process and fix.
|
||||
</verification_checklist>
|
||||
|
||||
<integration>
|
||||
**This skill requires:**
|
||||
- hyperpowers:test-driven-development (understand TDD before applying to docs)
|
||||
- Task tool (for running subagent tests)
|
||||
|
||||
**This skill is called by:**
|
||||
- Anyone creating or editing skills
|
||||
- Plugin maintainers
|
||||
- Users with personal skill repositories
|
||||
|
||||
**Agents used:**
|
||||
- general-purpose (for testing skills under pressure)
|
||||
</integration>
|
||||
|
||||
<resources>
|
||||
**Detailed guides:**
|
||||
- [Testing methodology by skill type](resources/testing-methodology.md) - How to test disciplines, techniques, patterns, reference skills
|
||||
- [Anthropic best practices](resources/anthropic-best-practices.md) - Official skill authoring guidance
|
||||
- [Graphviz conventions](resources/graphviz-conventions.dot) - Flowchart style rules
|
||||
|
||||
**When stuck:**
|
||||
- Skill seems too simple to test → If simple, testing is fast. Do it anyway.
|
||||
- Don't know what pressures to use → Time + sunk cost + authority always work
|
||||
- Agent still rationalizes → Add explicit counter for that exact excuse
|
||||
- Testing feels like overhead → Same as TDD: testing prevents bigger problems
|
||||
</resources>
|
||||
1150
skills/writing-skills/anthropic-best-practices.md
Normal file
1150
skills/writing-skills/anthropic-best-practices.md
Normal file
File diff suppressed because it is too large
Load Diff
172
skills/writing-skills/graphviz-conventions.dot
Normal file
172
skills/writing-skills/graphviz-conventions.dot
Normal file
@@ -0,0 +1,172 @@
|
||||
digraph STYLE_GUIDE {
|
||||
// The style guide for our process DSL, written in the DSL itself
|
||||
|
||||
// Node type examples with their shapes
|
||||
subgraph cluster_node_types {
|
||||
label="NODE TYPES AND SHAPES";
|
||||
|
||||
// Questions are diamonds
|
||||
"Is this a question?" [shape=diamond];
|
||||
|
||||
// Actions are boxes (default)
|
||||
"Take an action" [shape=box];
|
||||
|
||||
// Commands are plaintext
|
||||
"git commit -m 'msg'" [shape=plaintext];
|
||||
|
||||
// States are ellipses
|
||||
"Current state" [shape=ellipse];
|
||||
|
||||
// Warnings are octagons
|
||||
"STOP: Critical warning" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
|
||||
// Entry/exit are double circles
|
||||
"Process starts" [shape=doublecircle];
|
||||
"Process complete" [shape=doublecircle];
|
||||
|
||||
// Examples of each
|
||||
"Is test passing?" [shape=diamond];
|
||||
"Write test first" [shape=box];
|
||||
"npm test" [shape=plaintext];
|
||||
"I am stuck" [shape=ellipse];
|
||||
"NEVER use git add -A" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
}
|
||||
|
||||
// Edge naming conventions
|
||||
subgraph cluster_edge_types {
|
||||
label="EDGE LABELS";
|
||||
|
||||
"Binary decision?" [shape=diamond];
|
||||
"Yes path" [shape=box];
|
||||
"No path" [shape=box];
|
||||
|
||||
"Binary decision?" -> "Yes path" [label="yes"];
|
||||
"Binary decision?" -> "No path" [label="no"];
|
||||
|
||||
"Multiple choice?" [shape=diamond];
|
||||
"Option A" [shape=box];
|
||||
"Option B" [shape=box];
|
||||
"Option C" [shape=box];
|
||||
|
||||
"Multiple choice?" -> "Option A" [label="condition A"];
|
||||
"Multiple choice?" -> "Option B" [label="condition B"];
|
||||
"Multiple choice?" -> "Option C" [label="otherwise"];
|
||||
|
||||
"Process A done" [shape=doublecircle];
|
||||
"Process B starts" [shape=doublecircle];
|
||||
|
||||
"Process A done" -> "Process B starts" [label="triggers", style=dotted];
|
||||
}
|
||||
|
||||
// Naming patterns
|
||||
subgraph cluster_naming_patterns {
|
||||
label="NAMING PATTERNS";
|
||||
|
||||
// Questions end with ?
|
||||
"Should I do X?";
|
||||
"Can this be Y?";
|
||||
"Is Z true?";
|
||||
"Have I done W?";
|
||||
|
||||
// Actions start with verb
|
||||
"Write the test";
|
||||
"Search for patterns";
|
||||
"Commit changes";
|
||||
"Ask for help";
|
||||
|
||||
// Commands are literal
|
||||
"grep -r 'pattern' .";
|
||||
"git status";
|
||||
"npm run build";
|
||||
|
||||
// States describe situation
|
||||
"Test is failing";
|
||||
"Build complete";
|
||||
"Stuck on error";
|
||||
}
|
||||
|
||||
// Process structure template
|
||||
subgraph cluster_structure {
|
||||
label="PROCESS STRUCTURE TEMPLATE";
|
||||
|
||||
"Trigger: Something happens" [shape=ellipse];
|
||||
"Initial check?" [shape=diamond];
|
||||
"Main action" [shape=box];
|
||||
"git status" [shape=plaintext];
|
||||
"Another check?" [shape=diamond];
|
||||
"Alternative action" [shape=box];
|
||||
"STOP: Don't do this" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
"Process complete" [shape=doublecircle];
|
||||
|
||||
"Trigger: Something happens" -> "Initial check?";
|
||||
"Initial check?" -> "Main action" [label="yes"];
|
||||
"Initial check?" -> "Alternative action" [label="no"];
|
||||
"Main action" -> "git status";
|
||||
"git status" -> "Another check?";
|
||||
"Another check?" -> "Process complete" [label="ok"];
|
||||
"Another check?" -> "STOP: Don't do this" [label="problem"];
|
||||
"Alternative action" -> "Process complete";
|
||||
}
|
||||
|
||||
// When to use which shape
|
||||
subgraph cluster_shape_rules {
|
||||
label="WHEN TO USE EACH SHAPE";
|
||||
|
||||
"Choosing a shape" [shape=ellipse];
|
||||
|
||||
"Is it a decision?" [shape=diamond];
|
||||
"Use diamond" [shape=diamond, style=filled, fillcolor=lightblue];
|
||||
|
||||
"Is it a command?" [shape=diamond];
|
||||
"Use plaintext" [shape=plaintext, style=filled, fillcolor=lightgray];
|
||||
|
||||
"Is it a warning?" [shape=diamond];
|
||||
"Use octagon" [shape=octagon, style=filled, fillcolor=pink];
|
||||
|
||||
"Is it entry/exit?" [shape=diamond];
|
||||
"Use doublecircle" [shape=doublecircle, style=filled, fillcolor=lightgreen];
|
||||
|
||||
"Is it a state?" [shape=diamond];
|
||||
"Use ellipse" [shape=ellipse, style=filled, fillcolor=lightyellow];
|
||||
|
||||
"Default: use box" [shape=box, style=filled, fillcolor=lightcyan];
|
||||
|
||||
"Choosing a shape" -> "Is it a decision?";
|
||||
"Is it a decision?" -> "Use diamond" [label="yes"];
|
||||
"Is it a decision?" -> "Is it a command?" [label="no"];
|
||||
"Is it a command?" -> "Use plaintext" [label="yes"];
|
||||
"Is it a command?" -> "Is it a warning?" [label="no"];
|
||||
"Is it a warning?" -> "Use octagon" [label="yes"];
|
||||
"Is it a warning?" -> "Is it entry/exit?" [label="no"];
|
||||
"Is it entry/exit?" -> "Use doublecircle" [label="yes"];
|
||||
"Is it entry/exit?" -> "Is it a state?" [label="no"];
|
||||
"Is it a state?" -> "Use ellipse" [label="yes"];
|
||||
"Is it a state?" -> "Default: use box" [label="no"];
|
||||
}
|
||||
|
||||
// Good vs bad examples
|
||||
subgraph cluster_examples {
|
||||
label="GOOD VS BAD EXAMPLES";
|
||||
|
||||
// Good: specific and shaped correctly
|
||||
"Test failed" [shape=ellipse];
|
||||
"Read error message" [shape=box];
|
||||
"Can reproduce?" [shape=diamond];
|
||||
"git diff HEAD~1" [shape=plaintext];
|
||||
"NEVER ignore errors" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
|
||||
"Test failed" -> "Read error message";
|
||||
"Read error message" -> "Can reproduce?";
|
||||
"Can reproduce?" -> "git diff HEAD~1" [label="yes"];
|
||||
|
||||
// Bad: vague and wrong shapes
|
||||
bad_1 [label="Something wrong", shape=box]; // Should be ellipse (state)
|
||||
bad_2 [label="Fix it", shape=box]; // Too vague
|
||||
bad_3 [label="Check", shape=box]; // Should be diamond
|
||||
bad_4 [label="Run command", shape=box]; // Should be plaintext with actual command
|
||||
|
||||
bad_1 -> bad_2;
|
||||
bad_2 -> bad_3;
|
||||
bad_3 -> bad_4;
|
||||
}
|
||||
}
|
||||
187
skills/writing-skills/persuasion-principles.md
Normal file
187
skills/writing-skills/persuasion-principles.md
Normal file
@@ -0,0 +1,187 @@
|
||||
# Persuasion Principles for Skill Design
|
||||
|
||||
## Overview
|
||||
|
||||
LLMs respond to the same persuasion principles as humans. Understanding this psychology helps you design more effective skills - not to manipulate, but to ensure critical practices are followed even under pressure.
|
||||
|
||||
**Research foundation:** Meincke et al. (2025) tested 7 persuasion principles with N=28,000 AI conversations. Persuasion techniques more than doubled compliance rates (33% → 72%, p < .001).
|
||||
|
||||
## The Seven Principles
|
||||
|
||||
### 1. Authority
|
||||
**What it is:** Deference to expertise, credentials, or official sources.
|
||||
|
||||
**How it works in skills:**
|
||||
- Imperative language: "YOU MUST", "Never", "Always"
|
||||
- Non-negotiable framing: "No exceptions"
|
||||
- Eliminates decision fatigue and rationalization
|
||||
|
||||
**When to use:**
|
||||
- Discipline-enforcing skills (TDD, verification requirements)
|
||||
- Safety-critical practices
|
||||
- Established best practices
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ Write code before test? Delete it. Start over. No exceptions.
|
||||
❌ Consider writing tests first when feasible.
|
||||
```
|
||||
|
||||
### 2. Commitment
|
||||
**What it is:** Consistency with prior actions, statements, or public declarations.
|
||||
|
||||
**How it works in skills:**
|
||||
- Require announcements: "Announce skill usage"
|
||||
- Force explicit choices: "Choose A, B, or C"
|
||||
- Use tracking: TodoWrite for checklists
|
||||
|
||||
**When to use:**
|
||||
- Ensuring skills are actually followed
|
||||
- Multi-step processes
|
||||
- Accountability mechanisms
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ When you find a skill, you MUST announce: "I'm using [Skill Name]"
|
||||
❌ Consider letting your partner know which skill you're using.
|
||||
```
|
||||
|
||||
### 3. Scarcity
|
||||
**What it is:** Urgency from time limits or limited availability.
|
||||
|
||||
**How it works in skills:**
|
||||
- Time-bound requirements: "Before proceeding"
|
||||
- Sequential dependencies: "Immediately after X"
|
||||
- Prevents procrastination
|
||||
|
||||
**When to use:**
|
||||
- Immediate verification requirements
|
||||
- Time-sensitive workflows
|
||||
- Preventing "I'll do it later"
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ After completing a task, IMMEDIATELY request code review before proceeding.
|
||||
❌ You can review code when convenient.
|
||||
```
|
||||
|
||||
### 4. Social Proof
|
||||
**What it is:** Conformity to what others do or what's considered normal.
|
||||
|
||||
**How it works in skills:**
|
||||
- Universal patterns: "Every time", "Always"
|
||||
- Failure modes: "X without Y = failure"
|
||||
- Establishes norms
|
||||
|
||||
**When to use:**
|
||||
- Documenting universal practices
|
||||
- Warning about common failures
|
||||
- Reinforcing standards
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ Checklists without TodoWrite tracking = steps get skipped. Every time.
|
||||
❌ Some people find TodoWrite helpful for checklists.
|
||||
```
|
||||
|
||||
### 5. Unity
|
||||
**What it is:** Shared identity, "we-ness", in-group belonging.
|
||||
|
||||
**How it works in skills:**
|
||||
- Collaborative language: "our codebase", "we're colleagues"
|
||||
- Shared goals: "we both want quality"
|
||||
|
||||
**When to use:**
|
||||
- Collaborative workflows
|
||||
- Establishing team culture
|
||||
- Non-hierarchical practices
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ We're colleagues working together. I need your honest technical judgment.
|
||||
❌ You should probably tell me if I'm wrong.
|
||||
```
|
||||
|
||||
### 6. Reciprocity
|
||||
**What it is:** Obligation to return benefits received.
|
||||
|
||||
**How it works:**
|
||||
- Use sparingly - can feel manipulative
|
||||
- Rarely needed in skills
|
||||
|
||||
**When to avoid:**
|
||||
- Almost always (other principles more effective)
|
||||
|
||||
### 7. Liking
|
||||
**What it is:** Preference for cooperating with those we like.
|
||||
|
||||
**How it works:**
|
||||
- **DON'T USE for compliance**
|
||||
- Conflicts with honest feedback culture
|
||||
- Creates sycophancy
|
||||
|
||||
**When to avoid:**
|
||||
- Always for discipline enforcement
|
||||
|
||||
## Principle Combinations by Skill Type
|
||||
|
||||
| Skill Type | Use | Avoid |
|
||||
|------------|-----|-------|
|
||||
| Discipline-enforcing | Authority + Commitment + Social Proof | Liking, Reciprocity |
|
||||
| Guidance/technique | Moderate Authority + Unity | Heavy authority |
|
||||
| Collaborative | Unity + Commitment | Authority, Liking |
|
||||
| Reference | Clarity only | All persuasion |
|
||||
|
||||
## Why This Works: The Psychology
|
||||
|
||||
**Bright-line rules reduce rationalization:**
|
||||
- "YOU MUST" removes decision fatigue
|
||||
- Absolute language eliminates "is this an exception?" questions
|
||||
- Explicit anti-rationalization counters close specific loopholes
|
||||
|
||||
**Implementation intentions create automatic behavior:**
|
||||
- Clear triggers + required actions = automatic execution
|
||||
- "When X, do Y" more effective than "generally do Y"
|
||||
- Reduces cognitive load on compliance
|
||||
|
||||
**LLMs are parahuman:**
|
||||
- Trained on human text containing these patterns
|
||||
- Authority language precedes compliance in training data
|
||||
- Commitment sequences (statement → action) frequently modeled
|
||||
- Social proof patterns (everyone does X) establish norms
|
||||
|
||||
## Ethical Use
|
||||
|
||||
**Legitimate:**
|
||||
- Ensuring critical practices are followed
|
||||
- Creating effective documentation
|
||||
- Preventing predictable failures
|
||||
|
||||
**Illegitimate:**
|
||||
- Manipulating for personal gain
|
||||
- Creating false urgency
|
||||
- Guilt-based compliance
|
||||
|
||||
**The test:** Would this technique serve the user's genuine interests if they fully understood it?
|
||||
|
||||
## Research Citations
|
||||
|
||||
**Cialdini, R. B. (2021).** *Influence: The Psychology of Persuasion (New and Expanded).* Harper Business.
|
||||
- Seven principles of persuasion
|
||||
- Empirical foundation for influence research
|
||||
|
||||
**Meincke, L., Shapiro, D., Duckworth, A. L., Mollick, E., Mollick, L., & Cialdini, R. (2025).** Call Me A Jerk: Persuading AI to Comply with Objectionable Requests. University of Pennsylvania.
|
||||
- Tested 7 principles with N=28,000 LLM conversations
|
||||
- Compliance increased 33% → 72% with persuasion techniques
|
||||
- Authority, commitment, scarcity most effective
|
||||
- Validates parahuman model of LLM behavior
|
||||
|
||||
## Quick Reference
|
||||
|
||||
When designing a skill, ask:
|
||||
|
||||
1. **What type is it?** (Discipline vs. guidance vs. reference)
|
||||
2. **What behavior am I trying to change?**
|
||||
3. **Which principle(s) apply?** (Usually authority + commitment for discipline)
|
||||
4. **Am I combining too many?** (Don't use all seven)
|
||||
5. **Is this ethical?** (Serves user's genuine interests?)
|
||||
167
skills/writing-skills/resources/testing-methodology.md
Normal file
167
skills/writing-skills/resources/testing-methodology.md
Normal file
@@ -0,0 +1,167 @@
|
||||
## Testing All Skill Types
|
||||
|
||||
Different skill types need different test approaches:
|
||||
|
||||
### Discipline-Enforcing Skills (rules/requirements)
|
||||
|
||||
**Examples:** TDD, hyperpowers:verification-before-completion, hyperpowers:designing-before-coding
|
||||
|
||||
**Test with:**
|
||||
- Academic questions: Do they understand the rules?
|
||||
- Pressure scenarios: Do they comply under stress?
|
||||
- Multiple pressures combined: time + sunk cost + exhaustion
|
||||
- Identify rationalizations and add explicit counters
|
||||
|
||||
**Success criteria:** Agent follows rule under maximum pressure
|
||||
|
||||
### Technique Skills (how-to guides)
|
||||
|
||||
**Examples:** condition-based-waiting, hyperpowers:root-cause-tracing, defensive-programming
|
||||
|
||||
**Test with:**
|
||||
- Application scenarios: Can they apply the technique correctly?
|
||||
- Variation scenarios: Do they handle edge cases?
|
||||
- Missing information tests: Do instructions have gaps?
|
||||
|
||||
**Success criteria:** Agent successfully applies technique to new scenario
|
||||
|
||||
### Pattern Skills (mental models)
|
||||
|
||||
**Examples:** reducing-complexity, information-hiding concepts
|
||||
|
||||
**Test with:**
|
||||
- Recognition scenarios: Do they recognize when pattern applies?
|
||||
- Application scenarios: Can they use the mental model?
|
||||
- Counter-examples: Do they know when NOT to apply?
|
||||
|
||||
**Success criteria:** Agent correctly identifies when/how to apply pattern
|
||||
|
||||
### Reference Skills (documentation/APIs)
|
||||
|
||||
**Examples:** API documentation, command references, library guides
|
||||
|
||||
**Test with:**
|
||||
- Retrieval scenarios: Can they find the right information?
|
||||
- Application scenarios: Can they use what they found correctly?
|
||||
- Gap testing: Are common use cases covered?
|
||||
|
||||
**Success criteria:** Agent finds and correctly applies reference information
|
||||
|
||||
## Common Rationalizations for Skipping Testing
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Skill is obviously clear" | Clear to you ≠ clear to other agents. Test it. |
|
||||
| "It's just a reference" | References can have gaps, unclear sections. Test retrieval. |
|
||||
| "Testing is overkill" | Untested skills have issues. Always. 15 min testing saves hours. |
|
||||
| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
|
||||
| "Too tedious to test" | Testing is less tedious than debugging bad skill in production. |
|
||||
| "I'm confident it's good" | Overconfidence guarantees issues. Test anyway. |
|
||||
| "Academic review is enough" | Reading ≠ using. Test application scenarios. |
|
||||
| "No time to test" | Deploying untested skill wastes more time fixing it later. |
|
||||
|
||||
**All of these mean: Test before deploying. No exceptions.**
|
||||
|
||||
## Bulletproofing Skills Against Rationalization
|
||||
|
||||
Skills that enforce discipline (like TDD) need to resist rationalization. Agents are smart and will find loopholes when under pressure.
|
||||
|
||||
**Psychology note:** Understanding WHY persuasion techniques work helps you apply them systematically. See persuasion-principles.md for research foundation (Cialdini, 2021; Meincke et al., 2025) on authority, commitment, scarcity, social proof, and unity principles.
|
||||
|
||||
### Close Every Loophole Explicitly
|
||||
|
||||
Don't just state the rule - forbid specific workarounds:
|
||||
|
||||
<Bad>
|
||||
```markdown
|
||||
Write code before test? Delete it.
|
||||
```
|
||||
</Bad>
|
||||
|
||||
<Good>
|
||||
```markdown
|
||||
Write code before test? Delete it. Start over.
|
||||
|
||||
**No exceptions:**
|
||||
- Don't keep it as "reference"
|
||||
- Don't "adapt" it while writing tests
|
||||
- Don't look at it
|
||||
- Delete means delete
|
||||
```
|
||||
</Good>
|
||||
|
||||
### Address "Spirit vs Letter" Arguments
|
||||
|
||||
Add foundational principle early:
|
||||
|
||||
```markdown
|
||||
**Violating the letter of the rules is violating the spirit of the rules.**
|
||||
```
|
||||
|
||||
This cuts off entire class of "I'm following the spirit" rationalizations.
|
||||
|
||||
### Build Rationalization Table
|
||||
|
||||
Capture rationalizations from baseline testing (see Testing section below). Every excuse agents make goes in the table:
|
||||
|
||||
```markdown
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
|
||||
| "I'll test after" | Tests passing immediately prove nothing. |
|
||||
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
|
||||
```
|
||||
|
||||
### Create Red Flags List
|
||||
|
||||
Make it easy for agents to self-check when rationalizing:
|
||||
|
||||
```markdown
|
||||
## Red Flags - STOP and Start Over
|
||||
|
||||
- Code before test
|
||||
- "I already manually tested it"
|
||||
- "Tests after achieve the same purpose"
|
||||
- "It's about spirit not ritual"
|
||||
- "This is different because..."
|
||||
|
||||
**All of these mean: Delete code. Start over with TDD.**
|
||||
```
|
||||
|
||||
### Update CSO for Violation Symptoms
|
||||
|
||||
Add to description: symptoms of when you're ABOUT to violate the rule:
|
||||
|
||||
```yaml
|
||||
description: use when implementing any feature or bugfix, before writing implementation code
|
||||
```
|
||||
|
||||
## RED-GREEN-REFACTOR for Skills
|
||||
|
||||
Follow the TDD cycle:
|
||||
|
||||
### RED: Write Failing Test (Baseline)
|
||||
|
||||
Run pressure scenario with subagent WITHOUT the skill. Document exact behavior:
|
||||
- What choices did they make?
|
||||
- What rationalizations did they use (verbatim)?
|
||||
- Which pressures triggered violations?
|
||||
|
||||
This is "watch the test fail" - you must see what agents naturally do before writing the skill.
|
||||
|
||||
### GREEN: Write Minimal Skill
|
||||
|
||||
Write skill that addresses those specific rationalizations. Don't add extra content for hypothetical cases.
|
||||
|
||||
Run same scenarios WITH skill. Agent should now comply.
|
||||
|
||||
### REFACTOR: Close Loopholes
|
||||
|
||||
Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
|
||||
|
||||
**REQUIRED SUB-SKILL:** Use superpowers:testing-skills-with-subagents for the complete testing methodology:
|
||||
- How to write pressure scenarios
|
||||
- Pressure types (time, sunk cost, authority, exhaustion)
|
||||
- Plugging holes systematically
|
||||
- Meta-testing techniques
|
||||
|
||||
Reference in New Issue
Block a user