Initial commit

Zhongwei Li
2025-11-30 09:06:38 +08:00
commit ed3e4c84c3
76 changed files with 20449 additions and 0 deletions


@@ -0,0 +1,529 @@
---
name: brainstorming
description: Use when creating or developing anything, before writing code - refines rough ideas into bd epics with immutable requirements
---
<skill_overview>
Turn rough ideas into validated designs stored as bd epics with immutable requirements; tasks created iteratively as you learn, not upfront.
</skill_overview>
<rigidity_level>
HIGH FREEDOM - Adapt Socratic questioning to context, but always create the immutable epic before writing code and create only the first task (not the full tree).
</rigidity_level>
<quick_reference>
| Step | Action | Deliverable |
|------|--------|-------------|
| 1 | Ask questions (one at a time) | Understanding of requirements |
| 2 | Research (agents for codebase/internet) | Existing patterns and approaches |
| 3 | Propose 2-3 approaches with trade-offs | Recommended option |
| 4 | Present design in sections (200-300 words) | Validated architecture |
| 5 | Create bd epic with IMMUTABLE requirements | Epic with anti-patterns |
| 6 | Create ONLY first task | Ready for executing-plans |
| 7 | Hand off to executing-plans | Iterative implementation begins |
**Key:** Epic = contract (immutable), Tasks = adaptive (created as you learn)
</quick_reference>
<when_to_use>
- User describes new feature to implement
- User has rough idea that needs refinement
- About to write code without clear requirements
- Need to explore approaches before committing
- Requirements exist but architecture unclear
**Don't use for:**
- Executing existing plans (use hyperpowers:executing-plans)
- Fixing bugs (use hyperpowers:fixing-bugs)
- Refactoring (use hyperpowers:refactoring-safely)
- Requirements already crystal clear and epic exists
</when_to_use>
<the_process>
## 1. Understanding the Idea
**Announce:** "I'm using the brainstorming skill to refine your idea into a design."
**Check current state:**
- Recent commits, existing docs, codebase structure
- Dispatch `hyperpowers:codebase-investigator` for existing patterns
- Dispatch `hyperpowers:internet-researcher` for external APIs/libraries
**REQUIRED: Use AskUserQuestion tool for all questions**
- One question at a time (don't batch multiple questions)
- Prefer multiple choice options (easier to answer)
- Wait for response before asking next question
- Focus on: purpose, constraints, success criteria
- Gather enough context to propose approaches
**Do NOT just print questions and wait for "yes"** - use the AskUserQuestion tool.
**Example questions:**
- "What problem does this solve for users?"
- "Are there existing implementations we should follow?"
- "What's the most important success criterion?"
- "Token storage: cookies, localStorage, or sessionStorage?"
---
## 2. Exploring Approaches
**Research first:**
- Similar feature exists → dispatch codebase-investigator
- New integration → dispatch internet-researcher
- Review findings before proposing
**Propose 2-3 approaches with trade-offs:**
```
Based on [research findings], I recommend:
1. **[Approach A]** (recommended)
- Pros: [benefits, especially "matches existing pattern"]
- Cons: [drawbacks]
2. **[Approach B]**
- Pros: [benefits]
- Cons: [drawbacks]
3. **[Approach C]**
- Pros: [benefits]
- Cons: [drawbacks]
I recommend option 1 because [specific reason, especially codebase consistency].
```
**Lead with recommended option and explain why.**
---
## 3. Presenting the Design
**Once approach is chosen, present design in sections:**
- Break into 200-300 word chunks
- Ask after each: "Does this look right so far?"
- Cover: architecture, components, data flow, error handling, testing
- Be ready to go back and clarify
**Show research findings:**
- "Based on codebase investigation: auth/ uses passport.js..."
- "API docs show OAuth flow requires..."
- Demonstrate how design builds on existing code
---
## 4. Creating the bd Epic
**After design validated, create epic as immutable contract:**
```bash
bd create "Feature: [Feature Name]" \
--type epic \
--priority [0-4] \
--design "## Requirements (IMMUTABLE)
[What MUST be true when complete - specific, testable]
- Requirement 1: [concrete requirement]
- Requirement 2: [concrete requirement]
- Requirement 3: [concrete requirement]
## Success Criteria (MUST ALL BE TRUE)
- [ ] Criterion 1 (objective, testable - e.g., 'Integration tests pass')
- [ ] Criterion 2 (objective, testable - e.g., 'Works with existing User model')
- [ ] All tests passing
- [ ] Pre-commit hooks passing
## Anti-Patterns (FORBIDDEN)
- ❌ [Specific shortcut that violates requirements]
- ❌ [Rationalization to prevent - e.g., 'NO mocking core behavior']
- ❌ [Pattern to avoid - e.g., 'NO localStorage for tokens']
## Approach
[2-3 paragraph summary of chosen approach]
## Architecture
[Key components, data flow, integration points]
## Context
[Links to similar implementations: file.ts:123]
[External docs consulted]
[Agent research findings]"
```
**Critical:** Anti-patterns section prevents watering down requirements when blockers occur.
**Example anti-patterns:**
- ❌ NO localStorage tokens (violates httpOnly security requirement)
- ❌ NO new user model (must integrate with existing db/models/user.ts)
- ❌ NO mocking OAuth in integration tests (defeats validation)
- ❌ NO TODO stubs for core authentication flow
---
## 5. Creating ONLY First Task
**Create one task, not full tree:**
```bash
bd create "Task 1: [Specific Deliverable]" \
--type feature \
--priority [match-epic] \
--design "## Goal
[What this task delivers - one clear outcome]
## Implementation
[Detailed step-by-step for this task]
1. Study existing code
[Point to 2-3 similar implementations: file.ts:line]
2. Write tests first (TDD)
[Specific test cases for this task]
3. Implementation checklist
- [ ] file.ts:line - function_name() - [exactly what it does]
- [ ] test.ts:line - test_name() - [what scenario it tests]
## Success Criteria
- [ ] [Specific, measurable outcome]
- [ ] Tests passing
- [ ] Pre-commit hooks passing"
bd dep add bd-2 bd-1 --type parent-child # Link to epic
```
**Why only one task?**
- Subsequent tasks created iteratively by executing-plans
- Each task reflects learnings from previous
- Avoids brittle task trees that break when assumptions change
---
## 6. SRE Refinement and Handoff
After epic and first task created:
**REQUIRED: Run SRE refinement before handoff**
```
Use Skill tool: hyperpowers:sre-task-refinement
```
SRE refinement will:
- Apply 7-category corner-case analysis (Opus 4.1)
- Strengthen success criteria
- Identify edge cases and failure modes
- Ensure task is ready for implementation
**Do NOT skip SRE refinement.** The first task sets the pattern for the entire epic.
**After refinement approved, present handoff:**
```
"Epic bd-1 is ready with immutable requirements and success criteria.
First task bd-2 has been refined and is ready to execute.
Ready to start implementation? I'll use executing-plans to work through this iteratively.
The executing-plans skill will:
1. Execute the current task
2. Review what was learned against epic requirements
3. Create next task based on current reality
4. Run SRE refinement on new tasks
5. Repeat until all epic success criteria met
This approach avoids brittle upfront planning - each task adapts to what we learn."
```
</the_process>
<examples>
<example>
<scenario>Developer skips research, proposes approach without checking codebase</scenario>
<code>
User: "Add OAuth authentication"
Claude (without brainstorming):
"I'll implement OAuth with Auth0..."
[Proposes approach without checking if auth exists]
[Doesn't research existing patterns]
[Misses that passport.js already set up]
</code>
<why_it_fails>
- Proposes Auth0 when passport.js already exists in codebase
- Creates inconsistent architecture (two auth systems)
- Wastes time implementing when partial solution exists
- Doesn't leverage existing code
- User has to redirect to existing pattern
</why_it_fails>
<correction>
**Correct approach:**
1. **Research first:**
- Dispatch codebase-investigator: "Find existing auth implementation"
- Findings: passport.js at auth/passport-config.ts
- Dispatch internet-researcher: "Passport OAuth2 strategies"
2. **Propose approaches building on findings:**
```
Based on codebase showing passport.js at auth/passport-config.ts:
1. Extend existing passport setup (recommended)
- Add google-oauth20 strategy
- Matches codebase pattern
- Pros: Consistent, tested library
- Cons: Requires OAuth provider setup
2. Custom JWT implementation
- Pros: Full control
- Cons: Security complexity, breaks pattern
I recommend option 1 because it builds on existing auth/ setup.
```
**What you gain:**
- Leverages existing code (faster)
- Consistent architecture (maintainable)
- Research informs design (correct)
- User sees you understand codebase (trust)
</correction>
</example>
<example>
<scenario>Developer creates full task tree upfront</scenario>
<code>
bd create "Epic: Add OAuth"
bd create "Task 1: Configure OAuth provider"
bd create "Task 2: Implement token exchange"
bd create "Task 3: Add refresh token logic"
bd create "Task 4: Create middleware"
bd create "Task 5: Add UI components"
bd create "Task 6: Write integration tests"
# Starts implementing Task 1
# Discovers OAuth library handles refresh automatically
# Now Task 3 is wrong, needs deletion
# Discovers middleware already exists
# Now Task 4 is wrong
# Task tree brittle to reality
</code>
<why_it_fails>
- Assumptions about implementation prove wrong
- Task tree becomes incorrect as you learn
- Wastes time updating/deleting wrong tasks
- Rigid plan fights with reality
- Context switching between fixing plan and implementing
</why_it_fails>
<correction>
**Correct approach (iterative):**
```bash
bd create "Epic: Add OAuth" [with immutable requirements]
bd create "Task 1: Configure OAuth provider"
# Execute Task 1
# Learn: OAuth library handles refresh, middleware exists
bd create "Task 2: Integrate with existing middleware"
# [Created AFTER learning from Task 1]
# Execute Task 2
# Learn: UI needs OAuth button component
bd create "Task 3: Add OAuth button to login UI"
# [Created AFTER learning from Task 2]
```
**What you gain:**
- Tasks reflect current reality (accurate)
- No wasted time fixing wrong plans (efficient)
- Each task informed by previous learnings (adaptive)
- Plan evolves with understanding (flexible)
- Epic requirements stay immutable (contract preserved)
</correction>
</example>
<example>
<scenario>Epic created without anti-patterns section</scenario>
<code>
bd create "Epic: OAuth Authentication" --design "
## Requirements
- Users authenticate via Google OAuth2
- Tokens stored securely
- Session management
## Success Criteria
- [ ] Login flow works
- [ ] Tokens secured
- [ ] All tests pass
"
# During implementation, hits blocker:
# "Integration tests for OAuth are complex, I'll mock it..."
# [No anti-pattern preventing this]
# Ships with mocked OAuth (defeats validation)
</code>
<why_it_fails>
- No explicit forbidden patterns
- Agent rationalizes shortcuts when blocked
- "Tokens stored securely" too vague (localStorage? cookies?)
- Requirements can be "met" without meeting intent
- Mocking defeats the purpose of integration tests
</why_it_fails>
<correction>
**Correct approach with anti-patterns:**
```bash
bd create "Epic: OAuth Authentication" --design "
## Requirements (IMMUTABLE)
- Users authenticate via Google OAuth2
- Tokens stored in httpOnly cookies (NOT localStorage)
- Session expires after 24h inactivity
- Integrates with existing User model at db/models/user.ts
## Success Criteria
- [ ] Login redirects to Google and back
- [ ] Tokens in httpOnly cookies
- [ ] Token refresh works automatically
- [ ] Integration tests pass WITHOUT mocking OAuth
- [ ] All tests passing
## Anti-Patterns (FORBIDDEN)
- ❌ NO localStorage tokens (violates httpOnly requirement)
- ❌ NO new user model (must use existing)
- ❌ NO mocking OAuth in integration tests (defeats validation)
- ❌ NO skipping token refresh (explicit requirement)
"
```
**What you gain:**
- Requirements concrete and specific (testable)
- Forbidden patterns explicit (prevents shortcuts)
- Agent can't rationalize away requirements (contract enforced)
- Success criteria unambiguous (clear done state)
- Anti-patterns prevent "letter not spirit" compliance
</correction>
</example>
</examples>
<key_principles>
- **One question at a time** - Don't overwhelm
- **Multiple choice preferred** - Easier to answer when possible
- **Delegate research** - Use codebase-investigator and internet-researcher agents
- **YAGNI ruthlessly** - Remove unnecessary features from all designs
- **Explore alternatives** - Propose 2-3 approaches before settling
- **Incremental validation** - Present design in sections, validate each
- **Epic is contract** - Requirements immutable, tasks adapt
- **Anti-patterns prevent shortcuts** - Explicit forbidden patterns stop rationalization
- **One task only** - Subsequent tasks created iteratively (not upfront)
</key_principles>
<research_agents>
## Use codebase-investigator when:
- Understanding how existing features work
- Finding where specific functionality lives
- Identifying patterns to follow
- Verifying assumptions about structure
- Checking if feature already exists
## Use internet-researcher when:
- Finding current API documentation
- Researching library capabilities
- Comparing technology options
- Understanding community recommendations
- Finding official code examples
## Research protocol:
1. Codebase pattern exists → Use it (unless clearly unwise)
2. No codebase pattern → Research external patterns
3. Research yields nothing → Ask user for direction
</research_agents>
<critical_rules>
## Rules That Have No Exceptions
1. **Use AskUserQuestion tool** → Don't just print questions and wait
2. **Research BEFORE proposing** → Use agents to understand context
3. **Propose 2-3 approaches** → Don't jump to single solution
4. **Epic requirements IMMUTABLE** → Tasks adapt, requirements don't
5. **Include anti-patterns section** → Prevents watering down requirements
6. **Create ONLY first task** → Subsequent tasks created iteratively
7. **Run SRE refinement** → Before handoff to executing-plans
## Common Excuses
All of these mean: **STOP. Follow the process.**
- "Requirements obvious, don't need questions" (Questions reveal hidden complexity)
- "I know this pattern, don't need research" (Research might show better way)
- "Can plan all tasks upfront" (Plans become brittle, tasks adapt as you learn)
- "Anti-patterns section overkill" (Prevents rationalization under pressure)
- "Epic can evolve" (Requirements contract, tasks evolve)
- "Can just print questions" (Use AskUserQuestion tool - it's more interactive)
- "SRE refinement overkill for first task" (First task sets pattern for entire epic)
- "User said yes, design is done" (Still need SRE refinement before execution)
</critical_rules>
<verification_checklist>
Before handing off to executing-plans:
- [ ] Used AskUserQuestion tool for clarifying questions (one at a time)
- [ ] Researched codebase patterns (if applicable)
- [ ] Researched external docs/libraries (if applicable)
- [ ] Proposed 2-3 approaches with trade-offs
- [ ] Presented design in sections, validated each
- [ ] Created bd epic with all sections (requirements, success criteria, anti-patterns, approach, architecture)
- [ ] Requirements are IMMUTABLE and specific
- [ ] Anti-patterns section prevents shortcuts
- [ ] Created ONLY first task (not full tree)
- [ ] First task has detailed implementation checklist
- [ ] Ran SRE refinement on first task (hyperpowers:sre-task-refinement)
- [ ] Announced handoff to executing-plans after refinement approved
**Can't check all boxes?** Return to process and complete missing steps.
</verification_checklist>
<integration>
**This skill calls:**
- hyperpowers:codebase-investigator (for finding existing patterns)
- hyperpowers:internet-researcher (for external documentation)
- hyperpowers:sre-task-refinement (REQUIRED before handoff to executing-plans)
- hyperpowers:executing-plans (handoff after refinement approved)
**Call chain:**
```
brainstorming → sre-task-refinement → executing-plans
```
**This skill is called by:**
- hyperpowers:using-hyper (mandatory before writing code)
- User requests for new features
- Beginning of greenfield development
**Agents used:**
- codebase-investigator (understand existing code)
- internet-researcher (find external documentation)
**Tools required:**
- AskUserQuestion (for all clarifying questions)
</integration>
<resources>
**Detailed guides:**
- [bd epic template examples](resources/epic-templates.md)
- [Socratic questioning patterns](resources/questioning-patterns.md)
- [Anti-pattern examples by domain](resources/anti-patterns.md)
**When stuck:**
- User gives vague answer → Ask follow-up multiple choice question
- Research yields nothing → Ask user for direction explicitly
- Too many approaches → Narrow to top 2-3, explain why others eliminated
- User changes requirements mid-design → Acknowledge, return to understanding phase
</resources>


@@ -0,0 +1,609 @@
---
name: building-hooks
description: Use when creating Claude Code hooks - covers hook patterns, composition, testing, progressive enhancement from simple to advanced
---
<skill_overview>
Hooks encode business rules at the application level: start with observation, add automation, and enforce only once patterns are clear.
</skill_overview>
<rigidity_level>
MEDIUM FREEDOM - Follow progressive enhancement (observe → automate → enforce) strictly. Hook patterns are adaptable, but always start non-blocking and test thoroughly.
</rigidity_level>
<quick_reference>
| Phase | Approach | Example |
|-------|----------|---------|
| 1. Observe | Non-blocking, report only | Log edits, display reminders |
| 2. Automate | Background tasks, non-blocking | Auto-format, run builds |
| 3. Enforce | Blocking only when necessary | Block dangerous ops, require fixes |
**Most used events:** UserPromptSubmit (before processing), Stop (after completion)
**Critical:** Start with Phase 1 and observe for a week before moving to Phase 2. Only add Phase 3 if absolutely necessary.
</quick_reference>
<when_to_use>
Use hooks for:
- Automatic quality checks (build, lint, format)
- Workflow automation (skill activation, context injection)
- Error prevention (catching issues early)
- Consistent behavior (formatting, conventions)
**Never use hooks for:**
- Complex business logic (use tools/scripts)
- Slow operations that block workflow (use background jobs)
- Anything requiring LLM reasoning (hooks are deterministic)
</when_to_use>
<hook_lifecycle_events>
| Event | When Fires | Use Cases |
|-------|------------|-----------|
| UserPromptSubmit | Before Claude processes prompt | Validation, context injection, skill activation |
| Stop | After Claude finishes | Build checks, formatting, quality reminders |
| PostToolUse | After each tool execution | Logging, tracking, validation |
| PreToolUse | Before tool execution | Permission checks, validation |
| ToolError | When tool fails | Error handling, fallbacks |
| SessionStart | New session begins | Environment setup, context loading |
| SessionEnd | Session closes | Cleanup, logging |
| Error | Unhandled error | Error recovery, notifications |
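Each lifecycle event is wired to a script through hook configuration. A minimal registration sketch is shown below, following the `hooks.json` shape used in the examples later in this skill; field names and paths are illustrative and may differ in your Claude Code version:
```json
{
  "hooks": [
    {
      "event": "UserPromptSubmit",
      "command": "~/.claude/hooks/user-prompt-submit/20-activate-skills.sh",
      "description": "Inject skill activation reminders",
      "blocking": false
    },
    {
      "event": "Stop",
      "command": "~/.claude/hooks/stop/20-build-checker.sh",
      "description": "Run builds on modified repos",
      "blocking": false
    }
  ]
}
```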
</hook_lifecycle_events>
<progressive_enhancement>
## Phase 1: Observation (Non-Blocking)
**Goal:** Understand patterns before acting
**Examples:**
- Log file edits (PostToolUse)
- Display reminders (Stop, non-blocking)
- Track metrics
**Duration:** Observe for 1 week minimum
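A Phase 1 hook can be only a few lines. Here is a minimal observation-only sketch, assuming the `~/.claude/edit-log.txt` convention used in the patterns below; it reports what it sees and always exits successfully:
```bash
#!/bin/bash
# Phase 1 sketch: observe and report only, never block.
LOG_FILE="$HOME/.claude/edit-log.txt"
recent_edits=$(tail -20 "$LOG_FILE" 2>/dev/null | wc -l | tr -d ' ')
echo "📊 Observed ${recent_edits:-0} recent edit(s) - no action taken"
exit 0
```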
---
## Phase 2: Automation (Background)
**Goal:** Automate tedious tasks
**Examples:**
- Auto-format edited files (Stop)
- Run builds after changes (Stop)
- Inject helpful context (UserPromptSubmit)
**Requirement:** Fast (<2 seconds), non-blocking
---
## Phase 3: Enforcement (Blocking)
**Goal:** Prevent errors, enforce standards
**Examples:**
- Block dangerous operations (PreToolUse)
- Require fixes before continuing (Stop, blocking)
- Validate inputs (UserPromptSubmit, blocking)
**Requirement:** Only add once patterns are clear from Phases 1-2
</progressive_enhancement>
<common_hook_patterns>
## Pattern 1: Build Checker (Stop Hook)
**Problem:** TypeScript errors left behind
**Solution:**
```bash
#!/bin/bash
# Stop hook - runs after Claude finishes
# Check modified repos
modified_repos=$(grep -h "edited" ~/.claude/edit-log.txt | cut -d: -f1 | sort -u)
for repo in $modified_repos; do
echo "Building $repo..."
cd "$repo" && npm run build 2>&1 | tee /tmp/build-output.txt
error_count=$(grep -c "error TS" /tmp/build-output.txt || true)
if [ "$error_count" -gt 0 ]; then
if [ "$error_count" -ge 5 ]; then
echo "⚠️ Found $error_count errors - consider error-resolver agent"
else
echo "🔴 Found $error_count TypeScript errors:"
grep "error TS" /tmp/build-output.txt
fi
else
echo "✅ Build passed"
fi
done
```
**Configuration:**
```json
{
"event": "Stop",
"command": "~/.claude/hooks/build-checker.sh",
"description": "Run builds on modified repos",
"blocking": false
}
```
**Result:** Zero errors left behind
---
## Pattern 2: Auto-Formatter (Stop Hook)
**Problem:** Inconsistent formatting
**Solution:**
```bash
#!/bin/bash
# Stop hook - format all edited files
edited_files=$(tail -20 ~/.claude/edit-log.txt | grep "^/" | sort -u)
for file in $edited_files; do
repo_dir=$(dirname "$file")
while [ "$repo_dir" != "/" ]; do
if [ -f "$repo_dir/.prettierrc" ]; then
echo "Formatting $file..."
cd "$repo_dir" && npx prettier --write "$file"
break
fi
repo_dir=$(dirname "$repo_dir")
done
done
echo "✅ Formatting complete"
```
**Result:** All code consistently formatted
---
## Pattern 3: Error Handling Reminder (Stop Hook)
**Problem:** Claude forgets error handling
**Solution:**
```bash
#!/bin/bash
# Stop hook - gentle reminder
edited_files=$(tail -20 ~/.claude/edit-log.txt | grep "^/")
risky_patterns=0
for file in $edited_files; do
if grep -q "try\|catch\|async\|await\|prisma\|router\." "$file"; then
((risky_patterns++))
fi
done
if [ "$risky_patterns" -gt 0 ]; then
cat <<EOF
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 ERROR HANDLING SELF-CHECK
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ Risky Patterns Detected
$risky_patterns file(s) with async/try-catch/database operations
❓ Did you add proper error handling?
❓ Are errors logged appropriately?
💡 Consider: Sentry.captureException(), proper logging
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
EOF
fi
```
**Result:** Claude self-checks without blocking
---
## Pattern 4: Skills Auto-Activation
**See:** hyperpowers:skills-auto-activation for complete implementation
**Summary:** Analyzes prompt keywords, injects skill activation reminder before Claude processes.
</common_hook_patterns>
<hook_composition>
## Naming for Order Control
Multiple hooks for same event run in **alphabetical order** by filename.
**Use numeric prefixes:**
```
hooks/
├── 00-log-prompt.sh # First (logging)
├── 10-inject-context.sh # Second (context)
├── 20-activate-skills.sh # Third (skills)
└── 99-notify.sh # Last (notifications)
```
## Hook Dependencies
If Hook B depends on Hook A's output:
1. **Option 1:** Numeric prefixes (A before B)
2. **Option 2:** Combine into single hook
3. **Option 3:** File-based communication
**Example:**
```bash
# 10-track-edits.sh writes to edit-log.txt
# 20-check-builds.sh reads from edit-log.txt
```
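A concrete sketch of option 3 (file-based communication): the earlier hook appends to a shared log, and the later hook reads from it. File names match the example above; the build step is a placeholder:
```bash
# 10-track-edits.sh (runs first): append the edited path to a shared log
# "$file_path" would be extracted from the tool event, as in Pattern 1.
echo "$file_path" >> "$HOME/.claude/edit-log.txt"

# 20-check-builds.sh (runs second): consume paths recorded by the first hook
tail -20 "$HOME/.claude/edit-log.txt" | sort -u | while read -r path; do
  echo "Would check build for project containing: $path"
done
```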
</hook_composition>
<testing_hooks>
## Test in Isolation
```bash
# Manually trigger
bash ~/.claude/hooks/build-checker.sh
# Check exit code
echo $? # 0 = success
```
## Test with Mock Data
```bash
# Create mock log
echo "/path/to/test/file.ts" > /tmp/test-edit-log.txt
# Run with test data
EDIT_LOG=/tmp/test-edit-log.txt bash ~/.claude/hooks/build-checker.sh
```
## Test Non-Blocking Behavior
- Hook exits quickly (<2 seconds)
- Doesn't block Claude
- Provides clear output
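A quick way to verify the timing requirement is to time the hook and warn when it exceeds the 2-second budget (the threshold and hook path here are illustrative):
```bash
# Rough timing check for a non-blocking hook
start=$(date +%s)
bash ~/.claude/hooks/build-checker.sh
elapsed=$(( $(date +%s) - start ))
if [ "$elapsed" -gt 2 ]; then
  echo "⚠️ Hook took ${elapsed}s - too slow for a non-blocking hook"
else
  echo "✅ Hook completed in ${elapsed}s"
fi
```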
## Test Blocking Behavior
- Blocking decision correct
- Reason message helpful
- Escape hatch exists
## Debugging
**Enable logging:**
```bash
set -x # Debug output
exec 2>~/.claude/hooks/debug.log
```
**Check execution:**
```bash
tail -f ~/.claude/logs/hooks.log
```
**Common issues:**
- Timeout exceeded (10-second default)
- Wrong working directory
- Missing environment variables
- File permissions
</testing_hooks>
<examples>
<example>
<scenario>Developer adds blocking hook immediately without observation</scenario>
<code>
# Developer frustrated by TypeScript errors
# Creates blocking Stop hook immediately:
#!/bin/bash
npm run build
if [ $? -ne 0 ]; then
echo "BUILD FAILED - BLOCKING"
exit 1 # Blocks Claude
fi
</code>
<why_it_fails>
- No observation period to understand patterns
- Blocks even for minor errors
- No escape hatch if hook misbehaves
- Might block during experimentation
- Frustrates workflow when building is slow
- Haven't identified when blocking is actually needed
</why_it_fails>
<correction>
**Phase 1: Observe (1 week)**
```bash
#!/bin/bash
# Non-blocking observation
npm run build 2>&1 | tee /tmp/build.log
if grep -q "error TS" /tmp/build.log; then
echo "🔴 Build errors found (not blocking)"
fi
```
**After 1 week, review:**
- How often do errors appear?
- Are they usually fixed quickly?
- Do they cause real problems or just noise?
**Phase 2: If errors are frequent, automate**
```bash
#!/bin/bash
# Still non-blocking, but more helpful
npm run build 2>&1 | tee /tmp/build.log
error_count=$(grep -c "error TS" /tmp/build.log || true)
if [ "$error_count" -ge 5 ]; then
echo "⚠️ $error_count errors - consider using error-resolver agent"
elif [ "$error_count" -gt 0 ]; then
echo "🔴 $error_count errors (not blocking):"
grep "error TS" /tmp/build.log | head -5
fi
```
**Phase 3: Only if observation shows blocking is necessary**
Never reached - non-blocking works fine!
**What you gain:**
- Understood patterns before acting
- Non-blocking keeps workflow smooth
- Helpful messages without friction
- Can experiment without frustration
</correction>
</example>
<example>
<scenario>Hook is slow, blocks workflow</scenario>
<code>
#!/bin/bash
# Stop hook that's too slow
# Run full test suite (takes 45 seconds!)
npm test
# Run linter (takes 10 seconds)
npm run lint
# Run build (takes 30 seconds)
npm run build
# Total: 85 seconds of blocking!
</code>
<why_it_fails>
- Hook takes 85 seconds to complete
- Blocks Claude for entire duration
- User can't continue working
- Frustrating, likely to be disabled
- Defeats purpose of automation
</why_it_fails>
<correction>
**Make hook fast (<2 seconds):**
```bash
#!/bin/bash
# Stop hook - fast checks only
# Quick syntax check (< 1 second)
npm run check-syntax
if [ $? -ne 0 ]; then
echo "🔴 Syntax errors found"
echo "💡 Run 'npm test' manually for full test suite"
fi
echo "✅ Quick checks passed (run 'npm test' for full suite)"
```
**Or run slow checks in background:**
```bash
#!/bin/bash
# Stop hook - trigger background job
# Start tests in background
(
npm test > /tmp/test-results.txt 2>&1
if [ $? -ne 0 ]; then
echo "🔴 Tests failed (see /tmp/test-results.txt)"
fi
) &
echo "⏳ Tests running in background (check /tmp/test-results.txt)"
```
**What you gain:**
- Hook completes instantly
- Workflow not blocked
- Still get quality checks
- User can continue working
</correction>
</example>
<example>
<scenario>Hook has no error handling, fails silently</scenario>
<code>
#!/bin/bash
# Hook with no error handling
file=$(tail -1 ~/.claude/edit-log.txt)
prettier --write "$file"
</code>
<why_it_fails>
- If edit-log.txt missing → hook fails silently
- If file path invalid → prettier errors not caught
- If prettier not installed → silent failure
- No logging, can't debug
- User has no idea hook ran or failed
</why_it_fails>
<correction>
**Add error handling:**
```bash
#!/bin/bash
set -euo pipefail # Exit on error, undefined vars
# Log execution
echo "[$(date)] Hook started" >> ~/.claude/hooks/formatter.log
# Validate input
if [ ! -f ~/.claude/edit-log.txt ]; then
echo "[$(date)] ERROR: edit-log.txt not found" >> ~/.claude/hooks/formatter.log
exit 1
fi
file=$(tail -1 ~/.claude/edit-log.txt | grep "^/.*\.ts$" || true)
if [ -z "$file" ]; then
echo "[$(date)] No TypeScript file to format" >> ~/.claude/hooks/formatter.log
exit 0
fi
if [ ! -f "$file" ]; then
echo "[$(date)] ERROR: File not found: $file" >> ~/.claude/hooks/formatter.log
exit 1
fi
# Check prettier exists
if ! command -v prettier &> /dev/null; then
echo "[$(date)] ERROR: prettier not installed" >> ~/.claude/hooks/formatter.log
exit 1
fi
# Format
echo "[$(date)] Formatting: $file" >> ~/.claude/hooks/formatter.log
if prettier --write "$file" 2>&1 | tee -a ~/.claude/hooks/formatter.log; then
echo "✅ Formatted $file"
else
echo "🔴 Formatting failed (see ~/.claude/hooks/formatter.log)"
fi
```
**What you gain:**
- Errors logged and visible
- Graceful handling of missing files
- Can debug when issues occur
- Clear feedback to user
- Hook doesn't fail silently
</correction>
</example>
</examples>
<security>
**Hooks run with your credentials and have full system access.**
## Best Practices
1. **Review code carefully** - Hooks execute any command
2. **Use absolute paths** - Don't rely on PATH
3. **Validate inputs** - Don't trust file paths blindly
4. **Limit scope** - Only access what's needed
5. **Log actions** - Track what hooks do
6. **Test thoroughly** - Especially blocking hooks
## Dangerous Patterns
**Don't:**
```bash
# DANGEROUS - executes arbitrary code
cmd=$(tail -1 ~/.claude/edit-log.txt)
eval "$cmd"
```
**Do:**
```bash
# SAFE - validates and sanitizes
file=$(tail -1 ~/.claude/edit-log.txt | grep "^/.*\.ts$")
if [ -f "$file" ]; then
prettier --write "$file"
fi
```
</security>
<critical_rules>
## Rules That Have No Exceptions
1. **Start with Phase 1 (observe)** → Understand patterns before acting
2. **Keep hooks fast (<2 seconds)** → Don't block workflow
3. **Test thoroughly** → Hooks have full system access
4. **Add error handling and logging** → Silent failures are debugging nightmares
5. **Use progressive enhancement** → Observe → Automate → Enforce (only if needed)
## Common Excuses
All of these mean: **STOP. Follow progressive enhancement.**
- "Hook is simple, don't need testing" (Untested hooks fail in production)
- "Blocking is fine, need to enforce" (Start non-blocking, observe first)
- "I'll add error handling later" (Hook errors silent, add now)
- "Hook is slow but thorough" (Slow hooks block workflow, optimize)
- "Need access to everything" (Minimal permissions only)
</critical_rules>
<verification_checklist>
Before deploying hook:
- [ ] Tested in isolation (manual execution)
- [ ] Tested with mock data
- [ ] Completes quickly (<2 seconds for non-blocking)
- [ ] Has error handling (set -euo pipefail)
- [ ] Has logging (can debug failures)
- [ ] Validates inputs (doesn't trust blindly)
- [ ] Uses absolute paths
- [ ] Started with Phase 1 (observation)
- [ ] If blocking: has escape hatch
**Can't check all boxes?** Return to development and fix.
</verification_checklist>
<integration>
**This skill covers:** Hook creation and patterns
**Related skills:**
- hyperpowers:skills-auto-activation (complete skill activation hook)
- hyperpowers:verification-before-completion (quality hooks automate this)
- hyperpowers:testing-anti-patterns (avoid in hooks)
**Hook patterns support:**
- Automatic skill activation
- Build verification
- Code formatting
- Error prevention
- Workflow automation
</integration>
<resources>
**Detailed guides:**
- [Complete hook examples](resources/hook-examples.md)
- [Hook pattern library](resources/hook-patterns.md)
- [Testing strategies](resources/testing-hooks.md)
**Official documentation:**
- [Anthropic Hooks Guide](https://docs.claude.com/en/docs/claude-code/hooks-guide)
**When stuck:**
- Hook failing silently → Add logging, check ~/.claude/hooks/debug.log
- Hook too slow → Profile execution, move slow parts to background
- Hook blocking incorrectly → Return to Phase 1, observe patterns
- Testing unclear → Start with manual execution, then mock data
</resources>


@@ -0,0 +1,577 @@
# Complete Hook Examples
This guide provides complete, production-ready hook implementations you can use and adapt.
## Example 1: File Edit Tracker (PostToolUse)
**Purpose:** Track which files were edited and in which repos for later analysis.
**File:** `~/.claude/hooks/post-tool-use/01-track-edits.sh`
```bash
#!/bin/bash
# Configuration
LOG_FILE="$HOME/.claude/edit-log.txt"
MAX_LOG_LINES=1000
# Create log if doesn't exist
touch "$LOG_FILE"
# Function to log edit
log_edit() {
local file_path="$1"
local timestamp=$(date +"%Y-%m-%d %H:%M:%S")
local repo=$(find_repo "$file_path")
echo "$timestamp | $repo | $file_path" >> "$LOG_FILE"
}
# Function to find repo root
find_repo() {
local dir=$(dirname "$1")
while [ "$dir" != "/" ]; do
if [ -d "$dir/.git" ]; then
basename "$dir"
return
fi
dir=$(dirname "$dir")
done
echo "unknown"
}
# Read tool use event from stdin
tool_use_json=$(cat)
# Extract file path from tool use
tool_name=$(echo "$tool_use_json" | jq -r '.tool.name')
file_path=""
case "$tool_name" in
"Edit"|"Write")
file_path=$(echo "$tool_use_json" | jq -r '.tool.input.file_path')
;;
"MultiEdit")
# MultiEdit has multiple files - log each
echo "$tool_use_json" | jq -r '.tool.input.edits[].file_path' | while read -r path; do
log_edit "$path"
done
exit 0
;;
esac
# Log single edit
if [ -n "$file_path" ] && [ "$file_path" != "null" ]; then
log_edit "$file_path"
fi
# Rotate log if too large
line_count=$(wc -l < "$LOG_FILE")
if [ "$line_count" -gt "$MAX_LOG_LINES" ]; then
tail -n "$MAX_LOG_LINES" "$LOG_FILE" > "$LOG_FILE.tmp"
mv "$LOG_FILE.tmp" "$LOG_FILE"
fi
# Return success (non-blocking)
echo '{}'
```
**Configuration (`hooks.json`):**
```json
{
"hooks": [
{
"event": "PostToolUse",
"command": "~/.claude/hooks/post-tool-use/01-track-edits.sh",
"description": "Track file edits for build checking",
"blocking": false,
"timeout": 1000
}
]
}
```
## Example 2: Multi-Repo Build Checker (Stop)
**Purpose:** Run builds on all repos that were modified, report errors.
**File:** `~/.claude/hooks/stop/20-build-checker.sh`
```bash
#!/bin/bash
# Configuration
LOG_FILE="$HOME/.claude/edit-log.txt"
PROJECT_ROOT="$HOME/git/myproject"
ERROR_THRESHOLD=5
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Get repos modified since last check
get_modified_repos() {
# Get unique repos from recent edits
tail -50 "$LOG_FILE" 2>/dev/null | \
cut -d'|' -f2 | \
tr -d ' ' | \
sort -u | \
grep -v "unknown"
}
# Run build in repo
build_repo() {
local repo_name="$1"
local repo_path="$PROJECT_ROOT/$repo_name"
if [ ! -d "$repo_path" ]; then
return 0
fi
# Determine build command
local build_cmd=""
if [ -f "$repo_path/package.json" ]; then
build_cmd="npm run build"
elif [ -f "$repo_path/Cargo.toml" ]; then
build_cmd="cargo build"
elif [ -f "$repo_path/go.mod" ]; then
build_cmd="go build ./..."
else
return 0 # No build system found
fi
echo "Building $repo_name..."
# Run build and capture output
cd "$repo_path"
local output exit_code
output=$(eval "$build_cmd" 2>&1)
exit_code=$?
if [ $exit_code -ne 0 ]; then
# Count errors
local error_count=$(echo "$output" | grep -c "error" || true)
if [ "$error_count" -ge "$ERROR_THRESHOLD" ]; then
echo -e "${YELLOW}⚠️ $repo_name: $error_count errors found${NC}"
echo " Consider launching auto-error-resolver agent"
else
echo -e "${RED}🔴 $repo_name: $error_count errors${NC}"
echo "$output" | grep "error" | head -10
fi
return 1
else
echo -e "${GREEN}$repo_name: Build passed${NC}"
return 0
fi
}
# Main execution
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🔨 BUILD VERIFICATION"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
modified_repos=$(get_modified_repos)
if [ -z "$modified_repos" ]; then
echo "No repos modified since last check"
exit 0
fi
build_failures=0
for repo in $modified_repos; do
if ! build_repo "$repo"; then
((build_failures++))
fi
echo ""
done
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
if [ "$build_failures" -gt 0 ]; then
echo -e "${RED}$build_failures repo(s) failed to build${NC}"
else
echo -e "${GREEN}All builds passed${NC}"
fi
# Non-blocking - always return success
echo '{}'
```
## Example 3: TypeScript Prettier Formatter (Stop)
**Purpose:** Auto-format all edited TypeScript/JavaScript files.
**File:** `~/.claude/hooks/stop/30-format-code.sh`
```bash
#!/bin/bash
# Configuration
LOG_FILE="$HOME/.claude/edit-log.txt"
PROJECT_ROOT="$HOME/git/myproject"
# Get recently edited files
get_edited_files() {
tail -50 "$LOG_FILE" 2>/dev/null | \
cut -d'|' -f3 | \
tr -d ' ' | \
grep -E '\.(ts|tsx|js|jsx)$' | \
sort -u
}
# Format file with prettier
format_file() {
local file="$1"
if [ ! -f "$file" ]; then
return 0
fi
# Find prettier config
local dir=$(dirname "$file")
local prettier_config=""
while [ "$dir" != "/" ]; do
if [ -f "$dir/.prettierrc" ] || [ -f "$dir/.prettierrc.json" ]; then
prettier_config="$dir"
break
fi
dir=$(dirname "$dir")
done
if [ -z "$prettier_config" ]; then
return 0
fi
# Format the file
cd "$prettier_config"
npx prettier --write "$file" 2>/dev/null
if [ $? -eq 0 ]; then
echo "✓ Formatted: $(basename $file)"
fi
}
# Main execution
echo "🎨 Formatting edited files..."
edited_files=$(get_edited_files)
if [ -z "$edited_files" ]; then
echo "No files to format"
exit 0
fi
formatted_count=0
for file in $edited_files; do
if format_file "$file"; then
((formatted_count++))
fi
done
echo "✅ Formatted $formatted_count file(s)"
# Non-blocking
echo '{}'
```
## Example 4: Skill Activation Injector (UserPromptSubmit)
**Purpose:** Analyze user prompt and inject skill activation reminders.
**File:** `~/.claude/hooks/user-prompt-submit/skill-activator.js`
```javascript
#!/usr/bin/env node
const fs = require('fs');
const path = require('path');
// Load skill rules
const rulesPath = process.env.SKILL_RULES || path.join(process.env.HOME, '.claude/skill-rules.json');
const rules = JSON.parse(fs.readFileSync(rulesPath, 'utf8'));
// Read prompt from stdin
let promptData = '';
process.stdin.on('data', chunk => {
promptData += chunk;
});
process.stdin.on('end', () => {
const prompt = JSON.parse(promptData);
const activatedSkills = analyzePrompt(prompt.text);
if (activatedSkills.length > 0) {
const context = generateContext(activatedSkills);
console.log(JSON.stringify({
decision: 'approve',
additionalContext: context
}));
} else {
console.log(JSON.stringify({ decision: 'approve' }));
}
});
function analyzePrompt(text) {
const lowerText = text.toLowerCase();
const activated = [];
for (const [skillName, config] of Object.entries(rules)) {
// Check keywords
if (config.promptTriggers?.keywords) {
for (const keyword of config.promptTriggers.keywords) {
if (lowerText.includes(keyword.toLowerCase())) {
activated.push({ skill: skillName, priority: config.priority || 'medium' });
break;
}
}
}
// Check intent patterns
if (config.promptTriggers?.intentPatterns) {
for (const pattern of config.promptTriggers.intentPatterns) {
if (new RegExp(pattern, 'i').test(text)) {
activated.push({ skill: skillName, priority: config.priority || 'medium' });
break;
}
}
}
}
// Sort by priority
return activated.sort((a, b) => {
const priorityOrder = { high: 0, medium: 1, low: 2 };
return priorityOrder[a.priority] - priorityOrder[b.priority];
});
}
function generateContext(skills) {
return `
🎯 SKILL ACTIVATION CHECK
The following skills may be relevant to this prompt:
${skills.map(s => `- **${s.skill}** (${s.priority} priority)`).join('\n')}
Before responding, check if any of these skills should be used.
`;
}
```
**Configuration (`skill-rules.json`):**
```json
{
"backend-dev-guidelines": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": ["backend", "controller", "service", "API", "endpoint"],
"intentPatterns": [
"(create|add).*?(route|endpoint|controller)",
"(how to|best practice).*?(backend|API)"
]
}
},
"frontend-dev-guidelines": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": ["frontend", "component", "react", "UI", "layout"],
"intentPatterns": [
"(create|build).*?(component|page|view)",
"(how to|pattern).*?(react|frontend)"
]
}
}
}
```
## Example 5: Error Handling Reminder (Stop)
**Purpose:** Gentle reminder to check error handling in risky code.
**File:** `~/.claude/hooks/stop/40-error-reminder.sh`
```bash
#!/bin/bash
LOG_FILE="$HOME/.claude/edit-log.txt"
# Get recently edited files
get_edited_files() {
tail -20 "$LOG_FILE" 2>/dev/null | \
cut -d'|' -f3 | \
tr -d ' ' | \
sort -u
}
# Check for risky patterns
check_file_risk() {
local file="$1"
if [ ! -f "$file" ]; then
return 1
fi
# Look for risky patterns
if grep -q -E "try|catch|async|await|prisma|\.execute\(|fetch\(|axios\." "$file"; then
return 0
fi
return 1
}
# Main execution
risky_count=0
backend_files=0
for file in $(get_edited_files); do
if check_file_risk "$file"; then
((risky_count++))
if echo "$file" | grep -q "backend\|server\|api"; then
((backend_files++))
fi
fi
done
if [ "$risky_count" -gt 0 ]; then
cat <<EOF
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 ERROR HANDLING SELF-CHECK
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ Risky Patterns Detected
$risky_count file(s) with async/try-catch/database operations
❓ Did you add proper error handling?
❓ Are errors logged/captured appropriately?
❓ Are promises handled correctly?
EOF
if [ "$backend_files" -gt 0 ]; then
cat <<EOF
💡 Backend Best Practice:
- All errors should be captured (Sentry, logging)
- Database operations need try-catch
- API routes should use error middleware
EOF
fi
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
fi
# Non-blocking
echo '{}'
```
## Example 6: Dangerous Operation Blocker (PreToolUse)
**Purpose:** Block dangerous file operations (deletion, overwrite) in production paths.
**File:** `~/.claude/hooks/pre-tool-use/dangerous-ops.sh`
```bash
#!/bin/bash
# Read tool use event
tool_use_json=$(cat)
tool_name=$(echo "$tool_use_json" | jq -r '.tool.name')
file_path=$(echo "$tool_use_json" | jq -r '.tool.input.file_path // empty')
# Dangerous paths (customize for your project)
PROTECTED_PATHS=(
"/production/"
"/prod/"
"/.env.production"
"/config/production"
)
# Check if operation is dangerous
is_dangerous() {
local path="$1"
for protected in "${PROTECTED_PATHS[@]}"; do
if [[ "$path" == *"$protected"* ]]; then
return 0
fi
done
return 1
}
# Check dangerous operations
if [ "$tool_name" == "Write" ] || [ "$tool_name" == "Edit" ]; then
if is_dangerous "$file_path"; then
cat <<EOF | jq -c '.'
{
"decision": "block",
"reason": "⛔ BLOCKED: Attempting to modify protected path\\n\\nFile: $file_path\\n\\nThis path is protected from automatic modification.\\nIf you need to make changes:\\n1. Review changes carefully\\n2. Use manual file editing\\n3. Confirm with teammate\\n\\nTo override, edit ~/.claude/hooks/pre-tool-use/dangerous-ops.sh"
}
EOF
exit 0
fi
fi
# Allow operation (NOTE: PreToolUse hooks should use hookSpecificOutput format with permissionDecision)
echo '{"decision": "allow"}'
```
## Testing These Examples
### Test Edit Tracker
```bash
# Create test log entry
echo "2025-01-15 10:30:00 | frontend | /path/to/file.ts" > ~/.claude/edit-log.txt
# Test formatting script
bash ~/.claude/hooks/stop/30-format-code.sh
```
### Test Build Checker
```bash
# Add some edits to log
echo "2025-01-15 10:30:00 | backend | /path/to/backend/file.ts" >> ~/.claude/edit-log.txt
# Run build checker
bash ~/.claude/hooks/stop/20-build-checker.sh
```
### Test Skill Activator
```bash
# Test with mock prompt
echo '{"text": "How do I create a new API endpoint?"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
## Debugging Tips
**Enable debug mode:**
```bash
# Add to top of any bash script
set -x
exec 2>>~/.claude/hooks/debug.log
```
**Check hook execution:**
```bash
# Watch hooks run in real-time
tail -f ~/.claude/logs/hooks.log
```
**Test hook output:**
```bash
# Capture output
bash ~/.claude/hooks/stop/20-build-checker.sh > /tmp/hook-test.log 2>&1
cat /tmp/hook-test.log
```


@@ -0,0 +1,610 @@
# Hook Patterns Library
Reusable patterns for common hook use cases.
## Pattern: File Path Validation
Safely validate and sanitize file paths in hooks.
```bash
validate_file_path() {
local path="$1"
# Remove null/empty
if [ -z "$path" ] || [ "$path" == "null" ]; then
return 1
fi
# Must be absolute path
if [[ ! "$path" =~ ^/ ]]; then
return 1
fi
# Must exist
if [ ! -f "$path" ]; then
return 1
fi
# Check file extension whitelist
if [[ ! "$path" =~ \.(ts|tsx|js|jsx|py|rs|go|java)$ ]]; then
return 1
fi
return 0
}
# Usage
if validate_file_path "$file_path"; then
# Safe to operate on file
process_file "$file_path"
fi
```
## Pattern: Finding Project Root
Locate the project root directory from any file path.
```bash
find_project_root() {
local dir="$1"
# Start from file's directory
if [ -f "$dir" ]; then
dir=$(dirname "$dir")
fi
# Walk up until finding markers
while [ "$dir" != "/" ]; do
# Check for project markers
if [ -f "$dir/package.json" ] || \
[ -f "$dir/Cargo.toml" ] || \
[ -f "$dir/go.mod" ] || \
[ -d "$dir/.git" ]; then
echo "$dir"
return 0
fi
dir=$(dirname "$dir")
done
return 1
}
# Usage
project_root=$(find_project_root "$file_path")
if [ -n "$project_root" ]; then
cd "$project_root"
npm run build
fi
```
## Pattern: Conditional Hook Execution
Run hook only when certain conditions are met.
```bash
#!/bin/bash
# Configuration
MIN_CHANGES=3
TARGET_REPO="backend"
# Check if should run
should_run() {
# Count recent edits
local edit_count=$(tail -20 ~/.claude/edit-log.txt | wc -l)
if [ "$edit_count" -lt "$MIN_CHANGES" ]; then
return 1
fi
# Check if target repo was modified
if ! tail -20 ~/.claude/edit-log.txt | grep -q "$TARGET_REPO"; then
return 1
fi
return 0
}
# Main execution
if ! should_run; then
echo '{}'
exit 0
fi
# Run actual hook logic
perform_build_check
```
## Pattern: Rate Limiting
Prevent hooks from running too frequently.
```bash
#!/bin/bash
RATE_LIMIT_FILE="/tmp/hook-last-run"
MIN_INTERVAL=30 # seconds
# Check if enough time has passed
should_run() {
if [ ! -f "$RATE_LIMIT_FILE" ]; then
return 0
fi
local last_run=$(cat "$RATE_LIMIT_FILE")
local now=$(date +%s)
local elapsed=$((now - last_run))
if [ "$elapsed" -lt "$MIN_INTERVAL" ]; then
echo "Skipping (ran ${elapsed}s ago, min interval ${MIN_INTERVAL}s)"
return 1
fi
return 0
}
# Update last run time
mark_run() {
date +%s > "$RATE_LIMIT_FILE"
}
# Usage
if should_run; then
perform_expensive_operation
mark_run
fi
echo '{}'
```
## Pattern: Multi-Project Detection
Detect which project/repo a file belongs to.
```bash
detect_project() {
local file="$1"
local project_root="/Users/myuser/projects"
# Extract project name from path
if [[ "$file" =~ $project_root/([^/]+) ]]; then
echo "${BASH_REMATCH[1]}"
return 0
fi
echo "unknown"
return 1
}
# Usage
project=$(detect_project "$file_path")
case "$project" in
"frontend")
npm --prefix ~/projects/frontend run build
;;
"backend")
cargo build --manifest-path ~/projects/backend/Cargo.toml
;;
*)
echo "Unknown project: $project"
;;
esac
```
## Pattern: Graceful Degradation
Handle failures gracefully without blocking workflow.
```bash
#!/bin/bash
# Try operation with fallback
try_with_fallback() {
local primary_cmd="$1"
local fallback_cmd="$2"
local description="$3"
echo "Attempting: $description"
# Try primary command
if eval "$primary_cmd" 2>/dev/null; then
echo "✅ Success"
return 0
fi
echo "⚠️ Primary failed, trying fallback..."
# Try fallback
if eval "$fallback_cmd" 2>/dev/null; then
echo "✅ Fallback succeeded"
return 0
fi
echo "❌ Both failed, continuing anyway"
return 1
}
# Usage
try_with_fallback \
"npm run build" \
"npm run build:dev" \
"Building project"
# Always return empty response (non-blocking)
echo '{}'
```
## Pattern: Parallel Execution
Run multiple checks in parallel for speed.
```bash
#!/bin/bash
# Run checks in parallel
run_parallel_checks() {
local pids=()
# Start each check in background
check_typescript &
pids+=($!)
check_eslint &
pids+=($!)
check_tests &
pids+=($!)
# Wait for all to complete
local exit_code=0
for pid in "${pids[@]}"; do
wait "$pid" || exit_code=1
done
return $exit_code
}
check_typescript() {
npx tsc --noEmit > /tmp/tsc-output.txt 2>&1
if [ $? -ne 0 ]; then
echo "TypeScript errors found"
return 1
fi
}
check_eslint() {
npx eslint . > /tmp/eslint-output.txt 2>&1
}
check_tests() {
npm test > /tmp/test-output.txt 2>&1
}
# Usage
if run_parallel_checks; then
echo "✅ All checks passed"
else
echo "⚠️ Some checks failed"
cat /tmp/tsc-output.txt
cat /tmp/eslint-output.txt
fi
echo '{}'
```
## Pattern: Smart Caching
Cache results to avoid redundant work.
```bash
#!/bin/bash
CACHE_DIR="$HOME/.claude/hook-cache"
mkdir -p "$CACHE_DIR"
# Generate cache key
cache_key() {
local file="$1"
echo -n "$file:$(stat -f %m "$file" 2>/dev/null || stat -c %Y "$file")" | md5sum | cut -d' ' -f1
}
# Check cache
check_cache() {
local file="$1"
local key=$(cache_key "$file")
local cache_file="$CACHE_DIR/$key"
if [ -f "$cache_file" ]; then
# Cache hit
cat "$cache_file"
return 0
fi
return 1
}
# Update cache
update_cache() {
local file="$1"
local result="$2"
local key=$(cache_key "$file")
local cache_file="$CACHE_DIR/$key"
echo "$result" > "$cache_file"
# Clean old cache entries (older than 1 day)
find "$CACHE_DIR" -type f -mtime +1 -delete 2>/dev/null
}
# Usage
if cached=$(check_cache "$file_path"); then
echo "Cache hit: $cached"
else
result=$(expensive_operation "$file_path")
update_cache "$file_path" "$result"
echo "Computed: $result"
fi
```
## Pattern: Progressive Output
Show progress for long-running hooks.
```bash
#!/bin/bash
# Progress indicator
show_progress() {
local message="$1"
echo -n "$message..."
}
complete_progress() {
local status="$1"
if [ "$status" == "success" ]; then
echo " ✅"
else
echo " ❌"
fi
}
# Usage
show_progress "Running TypeScript compiler"
if npx tsc --noEmit 2>/dev/null; then
complete_progress "success"
else
complete_progress "failure"
fi
show_progress "Running linter"
if npx eslint . 2>/dev/null; then
complete_progress "success"
else
complete_progress "failure"
fi
echo '{}'
```
## Pattern: Context Injection
Inject helpful context into Claude's prompt.
```javascript
// UserPromptSubmit hook
function injectContext(prompt) {
const context = [];
// Add relevant documentation
if (prompt.includes('API')) {
context.push('📖 API Documentation: https://docs.example.com/api');
}
// Add recent changes
const recentFiles = getRecentlyEditedFiles();
if (recentFiles.length > 0) {
context.push(`📝 Recently edited: ${recentFiles.join(', ')}`);
}
// Add project status
const buildStatus = getLastBuildStatus();
if (!buildStatus.passed) {
context.push(`⚠️ Current build has ${buildStatus.errorCount} errors`);
}
if (context.length === 0) {
return { decision: 'approve' };
}
return {
decision: 'approve',
additionalContext: `\n\n---\n${context.join('\n')}\n---\n`
};
}
```
## Pattern: Error Accumulation
Collect multiple errors before reporting.
```bash
#!/bin/bash
ERRORS=()
# Add error to collection
add_error() {
ERRORS+=("$1")
}
# Report all errors
report_errors() {
if [ ${#ERRORS[@]} -eq 0 ]; then
echo "✅ No errors found"
return 0
fi
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "⚠️ Found ${#ERRORS[@]} issue(s):"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
local i=1
for error in "${ERRORS[@]}"; do
echo "$i. $error"
((i++))
done
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
return 1
}
# Usage
if ! run_typescript_check; then
add_error "TypeScript compilation failed"
fi
if ! run_lint_check; then
add_error "Linting issues found"
fi
if ! run_test_check; then
add_error "Tests failing"
fi
report_errors
echo '{}'
```
## Pattern: Conditional Blocking
Block only on critical errors, warn on others.
```bash
#!/bin/bash
ERROR_LEVEL="none" # none, warning, critical
# Check for issues
check_critical_issues() {
if grep -q "FIXME\|XXX\|TODO: CRITICAL" "$file_path"; then
ERROR_LEVEL="critical"
return 1
fi
return 0
}
check_warnings() {
if grep -q "console.log\|debugger" "$file_path"; then
ERROR_LEVEL="warning"
return 1
fi
return 0
}
# Run checks
check_critical_issues
check_warnings
# Return appropriate decision
case "$ERROR_LEVEL" in
"critical")
echo '{
"decision": "block",
"reason": "🚫 CRITICAL: Found critical TODOs or FIXMEs that must be addressed"
}' | jq -c '.'
;;
"warning")
echo "⚠️ Warning: Found debug statements (console.log, debugger)"
echo '{}'
;;
*)
echo '{}'
;;
esac
```
## Pattern: Hook Coordination
Coordinate between multiple hooks using shared state.
```bash
# Hook 1: Track state
#!/bin/bash
STATE_FILE="/tmp/hook-state.json"
# Update state
jq -n \
--arg timestamp "$(date +%s)" \
--arg files "$files_edited" \
'{lastRun: $timestamp, filesEdited: ($files | split(","))}' \
> "$STATE_FILE"
echo '{}'
```
```bash
# Hook 2: Read state
#!/bin/bash
STATE_FILE="/tmp/hook-state.json"
if [ -f "$STATE_FILE" ]; then
last_run=$(jq -r '.lastRun' "$STATE_FILE")
files=$(jq -r '.filesEdited[]' "$STATE_FILE")
# Use state from previous hook
for file in $files; do
process_file "$file"
done
fi
echo '{}'
```
## Pattern: User Notification
Notify user of important events without blocking.
```bash
#!/bin/bash
# Send desktop notification (macOS)
notify_macos() {
osascript -e "display notification \"$1\" with title \"Claude Code Hook\""
}
# Send desktop notification (Linux)
notify_linux() {
notify-send "Claude Code Hook" "$1"
}
# Notify based on OS
notify() {
local message="$1"
case "$OSTYPE" in
darwin*)
notify_macos "$message"
;;
linux*)
notify_linux "$message"
;;
esac
}
# Usage
if [ "$error_count" -gt 10 ]; then
notify "⚠️ Build has $error_count errors"
fi
echo '{}'
```
## Remember
- **Keep it simple** - Start with basic patterns, add complexity only when needed
- **Test thoroughly** - Test each pattern in isolation before combining
- **Fail gracefully** - Non-blocking hooks should never crash workflow
- **Log everything** - You'll need it for debugging
- **Document patterns** - Future you will thank present you


@@ -0,0 +1,657 @@
# Testing Hooks
Comprehensive testing strategies for Claude Code hooks.
## Testing Philosophy
**Hooks run with full system access. Test them thoroughly before deploying.**
### Testing Levels
1. **Unit testing** - Test functions in isolation
2. **Integration testing** - Test with mock Claude Code events
3. **Manual testing** - Test in real Claude Code sessions
4. **Regression testing** - Verify hooks don't break existing workflows
## Unit Testing Hook Functions
### Bash Functions
**Example: Testing file validation**
```bash
# hook-functions.sh - extractable functions
validate_file_path() {
local path="$1"
if [ -z "$path" ] || [ "$path" == "null" ]; then
return 1
fi
if [[ ! "$path" =~ ^/ ]]; then
return 1
fi
if [ ! -f "$path" ]; then
return 1
fi
return 0
}
# Test script
#!/bin/bash
source ./hook-functions.sh
test_validate_file_path() {
# Test valid path
touch /tmp/test-file.txt
if validate_file_path "/tmp/test-file.txt"; then
echo "✅ Valid path test passed"
else
echo "❌ Valid path test failed"
return 1
fi
# Test invalid path
if ! validate_file_path ""; then
echo "✅ Empty path test passed"
else
echo "❌ Empty path test failed"
return 1
fi
# Test null path
if ! validate_file_path "null"; then
echo "✅ Null path test passed"
else
echo "❌ Null path test failed"
return 1
fi
# Test relative path
if ! validate_file_path "relative/path.txt"; then
echo "✅ Relative path test passed"
else
echo "❌ Relative path test failed"
return 1
fi
rm /tmp/test-file.txt
return 0
}
# Run test
test_validate_file_path
```
### JavaScript Functions
**Example: Testing prompt analysis**
```javascript
// skill-activator.js
function analyzePrompt(text, rules) {
const lowerText = text.toLowerCase();
const activated = [];
for (const [skillName, config] of Object.entries(rules)) {
if (config.promptTriggers?.keywords) {
for (const keyword of config.promptTriggers.keywords) {
if (lowerText.includes(keyword.toLowerCase())) {
activated.push({ skill: skillName, priority: config.priority || 'medium' });
break;
}
}
}
}
return activated;
}
// test.js
const assert = require('assert');
const testRules = {
'backend-dev': {
priority: 'high',
promptTriggers: {
keywords: ['backend', 'API', 'endpoint']
}
}
};
// Test keyword matching
function testKeywordMatching() {
const result = analyzePrompt('How do I create a backend endpoint?', testRules);
assert.equal(result.length, 1, 'Should find one skill');
assert.equal(result[0].skill, 'backend-dev', 'Should match backend-dev');
assert.equal(result[0].priority, 'high', 'Should have high priority');
console.log('✅ Keyword matching test passed');
}
// Test no match
function testNoMatch() {
const result = analyzePrompt('How do I write Python?', testRules);
assert.equal(result.length, 0, 'Should find no skills');
console.log('✅ No match test passed');
}
// Test case insensitivity
function testCaseInsensitive() {
const result = analyzePrompt('BACKEND endpoint', testRules);
assert.equal(result.length, 1, 'Should match regardless of case');
console.log('✅ Case insensitive test passed');
}
// Run tests
testKeywordMatching();
testNoMatch();
testCaseInsensitive();
```
## Integration Testing with Mock Events
### Creating Mock Events
**PostToolUse event:**
```json
{
"event": "PostToolUse",
"tool": {
"name": "Edit",
"input": {
"file_path": "/Users/test/project/src/file.ts",
"old_string": "const x = 1;",
"new_string": "const x = 2;"
}
},
"result": {
"success": true
}
}
```
**UserPromptSubmit event:**
```json
{
"event": "UserPromptSubmit",
"text": "How do I create a new API endpoint?",
"timestamp": "2025-01-15T10:30:00Z"
}
```
**Stop event:**
```json
{
"event": "Stop",
"sessionId": "abc123",
"messageCount": 10
}
```
### Testing Hook with Mock Events
```bash
#!/bin/bash
# test-hook.sh
# Create mock event
create_mock_edit_event() {
cat <<EOF
{
"event": "PostToolUse",
"tool": {
"name": "Edit",
"input": {
"file_path": "/tmp/test-file.ts"
}
}
}
EOF
}
# Test hook
test_edit_tracker() {
# Setup
export LOG_FILE="/tmp/test-edit-log.txt"
rm -f "$LOG_FILE"
# Run hook with mock event
create_mock_edit_event | bash hooks/post-tool-use/01-track-edits.sh
# Verify
if [ -f "$LOG_FILE" ]; then
if grep -q "test-file.ts" "$LOG_FILE"; then
echo "✅ Edit tracker test passed"
return 0
fi
fi
echo "❌ Edit tracker test failed"
return 1
}
test_edit_tracker
```
### Testing JavaScript Hooks
```javascript
// test-skill-activator.js
const { execSync } = require('child_process');
function testSkillActivator(prompt) {
const mockEvent = JSON.stringify({
text: prompt
});
const result = execSync(
'node hooks/user-prompt-submit/skill-activator.js',
{
input: mockEvent,
encoding: 'utf8',
env: {
...process.env,
SKILL_RULES: './test-skill-rules.json'
}
}
);
return JSON.parse(result);
}
// Test activation
function testBackendActivation() {
const result = testSkillActivator('How do I create a backend endpoint?');
if (result.additionalContext && result.additionalContext.includes('backend')) {
console.log('✅ Backend activation test passed');
} else {
console.log('❌ Backend activation test failed');
process.exit(1);
}
}
testBackendActivation();
```
## Manual Testing in Claude Code
### Testing Checklist
**Before deployment:**
- [ ] Hook executes without errors
- [ ] Hook completes within timeout (default 10s)
- [ ] Output is helpful and not overwhelming
- [ ] Non-blocking hooks don't prevent work
- [ ] Blocking hooks have clear error messages
- [ ] Hook handles missing files gracefully
- [ ] Hook handles malformed input gracefully
### Manual Test Procedure
**1. Enable debug mode:**
```bash
# Add to top of hook
set -x
exec 2>>~/.claude/hooks/debug-$(date +%Y%m%d).log
```
**2. Test with minimal prompt:**
```
Create a simple test file
```
**3. Observe hook execution:**
```bash
# Watch debug log
tail -f ~/.claude/hooks/debug-*.log
```
**4. Verify output:**
- Check that hook completes
- Verify no errors in debug log
- Confirm expected behavior
**5. Test edge cases** (see the sketch after this list):
- Empty file paths
- Non-existent files
- Files outside project
- Malformed input
- Missing dependencies
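These edge cases can be scripted rather than typed by hand. A minimal sketch, assuming the mock-event format shown earlier and a PostToolUse hook at `hooks/post-tool-use/01-track-edits.sh` (the hook path is a placeholder; `timeout` is GNU coreutils and may need installing on macOS):
```bash
#!/bin/bash
# edge-case-test.sh - feed edge-case inputs to a hook; it should never crash or hang
HOOK="hooks/post-tool-use/01-track-edits.sh"   # hook under test (placeholder path)
edge_paths=("" "null" "relative/path.txt" "/does/not/exist.ts")
for path in "${edge_paths[@]}"; do
  event="{\"event\": \"PostToolUse\", \"tool\": {\"name\": \"Edit\", \"input\": {\"file_path\": \"$path\"}}}"
  if echo "$event" | timeout 10 bash "$HOOK" > /dev/null 2>&1; then
    echo "✅ Handled path '$path'"
  else
    echo "⚠️ Non-zero exit for path '$path' (fine if intentional, not fine if it crashed)"
  fi
done
# Malformed input should also be survivable
echo 'not valid json {' | timeout 10 bash "$HOOK" > /dev/null 2>&1
echo "Malformed input exit code: $? (a hang or stack trace is the real failure mode)"
```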
**6. Test performance:**
```bash
# Time hook execution
time bash hooks/stop/build-checker.sh
```
## Regression Testing
### Creating Test Suite
```bash
#!/bin/bash
# regression-test.sh
TEST_DIR="/tmp/hook-tests"
mkdir -p "$TEST_DIR"
# Setup test environment
setup() {
export LOG_FILE="$TEST_DIR/edit-log.txt"
export PROJECT_ROOT="$TEST_DIR/projects"
mkdir -p "$PROJECT_ROOT"
}
# Cleanup after tests
teardown() {
rm -rf "$TEST_DIR"
}
# Test 1: Edit tracker logs edits
test_edit_tracker_logs() {
echo '{"tool": {"name": "Edit", "input": {"file_path": "/test/file.ts"}}}' | \
bash hooks/post-tool-use/01-track-edits.sh
if grep -q "file.ts" "$LOG_FILE"; then
echo "✅ Test 1 passed"
return 0
fi
echo "❌ Test 1 failed"
return 1
}
# Test 2: Build checker finds errors
test_build_checker_finds_errors() {
# Create mock project with errors
mkdir -p "$PROJECT_ROOT/test-project"
echo 'const x: string = 123;' > "$PROJECT_ROOT/test-project/error.ts"
# Add to log
echo "2025-01-15 10:00:00 | test-project | error.ts" > "$LOG_FILE"
# Run build checker (should find errors)
output=$(bash hooks/stop/20-build-checker.sh)
if echo "$output" | grep -q "error"; then
echo "✅ Test 2 passed"
return 0
fi
echo "❌ Test 2 failed"
return 1
}
# Test 3: Formatter handles missing prettier
test_formatter_missing_prettier() {
# Create file without prettier config
mkdir -p "$PROJECT_ROOT/no-prettier"
echo 'const x=1' > "$PROJECT_ROOT/no-prettier/file.js"
echo "2025-01-15 10:00:00 | no-prettier | file.js" > "$LOG_FILE"
# Should complete without error
if bash hooks/stop/30-format-code.sh 2>&1; then
echo "✅ Test 3 passed"
return 0
fi
echo "❌ Test 3 failed"
return 1
}
# Run all tests
run_all_tests() {
setup
local failed=0
test_edit_tracker_logs || ((failed++))
test_build_checker_finds_errors || ((failed++))
test_formatter_missing_prettier || ((failed++))
teardown
if [ $failed -eq 0 ]; then
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
echo "✅ All tests passed!"
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
return 0
else
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
echo "$failed test(s) failed"
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
return 1
fi
}
run_all_tests
```
### Running Regression Suite
```bash
# Run before deploying changes
bash test/regression-test.sh
# Run on schedule (cron)
0 0 * * * cd ~/hooks && bash test/regression-test.sh
```
## Performance Testing
### Measuring Hook Performance
```bash
#!/bin/bash
# benchmark-hook.sh
ITERATIONS=10
HOOK_PATH="hooks/stop/build-checker.sh"
total_time=0
for i in $(seq 1 $ITERATIONS); do
start=$(date +%s%N)
bash "$HOOK_PATH" > /dev/null 2>&1
end=$(date +%s%N)
elapsed=$(( (end - start) / 1000000 )) # Convert to ms
total_time=$(( total_time + elapsed ))
echo "Iteration $i: ${elapsed}ms"
done
average=$(( total_time / ITERATIONS ))
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Average: ${average}ms"
echo "Total: ${total_time}ms"
echo "━━━━━━━━━━━━━━━━━━━━━━━━"
if [ $average -gt 2000 ]; then
echo "⚠️ Hook is slow (>2s)"
exit 1
fi
echo "✅ Performance acceptable"
```
### Performance Targets
- **Non-blocking hooks:** <2 seconds
- **Blocking hooks:** <5 seconds
- **UserPromptSubmit:** <1 second (critical path)
- **PostToolUse:** <500ms (runs frequently; budget-check sketch below)
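To enforce these targets automatically, the benchmark above can be compared against a per-hook budget. A minimal sketch, with thresholds mirroring the list above (the hook-type names are placeholders; adapt to your layout):
```bash
#!/bin/bash
# check-performance-target.sh <hook-type> <average-ms>
hook_type="$1"
average_ms="$2"
case "$hook_type" in
  user-prompt-submit) budget=1000 ;;   # critical path
  post-tool-use)      budget=500  ;;   # runs frequently
  blocking)           budget=5000 ;;
  *)                  budget=2000 ;;   # non-blocking default
esac
if [ "$average_ms" -gt "$budget" ]; then
  echo "⚠️ $hook_type hook averaged ${average_ms}ms (budget: ${budget}ms)"
  exit 1
fi
echo "✅ $hook_type hook within budget (${average_ms}ms vs ${budget}ms)"
```
Wired into `benchmark-hook.sh`, this lets CI fail any change that pushes a hook past its budget.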
## Continuous Testing
### Pre-commit Hook for Hook Testing
```bash
#!/bin/bash
# .git/hooks/pre-commit
echo "Testing Claude Code hooks..."
# Run test suite
if bash test/regression-test.sh; then
echo "✅ Hook tests passed"
exit 0
else
echo "❌ Hook tests failed"
echo "Fix tests before committing"
exit 1
fi
```
### CI/CD Integration
```yaml
# .github/workflows/test-hooks.yml
name: Test Claude Code Hooks
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Setup Node.js
uses: actions/setup-node@v2
with:
node-version: '18'
- name: Install dependencies
run: npm install
- name: Run hook tests
run: bash test/regression-test.sh
- name: Run performance tests
run: bash test/benchmark-hook.sh
```
## Common Testing Mistakes
### Mistake 1: Not Testing Error Paths
**Wrong:**
```bash
# Only test success path
npx tsc --noEmit
echo "✅ Build passed"
```
**Right:**
```bash
# Test both success and failure
if npx tsc --noEmit 2>&1; then
echo "✅ Build passed"
else
echo "❌ Build failed"
# Test that error handling works
fi
```
### Mistake 2: Hardcoding Paths
**Wrong:**
```bash
# Hardcoded path
cd /Users/myname/projects/myproject
npm run build
```
**Right:**
```bash
# Dynamic path
project_root=$(find_project_root "$file_path")
if [ -n "$project_root" ]; then
cd "$project_root"
npm run build
fi
```
### Mistake 3: Not Cleaning Up
**Wrong:**
```bash
# Leaves test files behind
echo "test" > /tmp/test-file.txt
run_test
# Never cleans up
```
**Right:**
```bash
# Always cleanup
trap 'rm -f /tmp/test-file.txt' EXIT
echo "test" > /tmp/test-file.txt
run_test
```
### Mistake 4: Silent Failures
**Wrong:**
```bash
# Errors disappear
npx tsc --noEmit 2>/dev/null
```
**Right:**
```bash
# Capture errors
output=$(npx tsc --noEmit 2>&1)
if [ $? -ne 0 ]; then
echo "❌ TypeScript errors:"
echo "$output"
fi
```
## Debugging Failed Tests
### Enable Verbose Output
```bash
# Add debug flags
set -x # Print commands
set -e # Exit on error
set -u # Error on undefined variables
set -o pipefail # Catch pipe failures
```
### Capture Test Output
```bash
# Run test with full output
bash -x test/regression-test.sh 2>&1 | tee test-output.log
# Review output
less test-output.log
```
### Isolate Failing Test
```bash
# Run a single test in the current shell
# Note: regression-test.sh ends by calling run_all_tests, so guard that call
# (e.g. [[ "${BASH_SOURCE[0]}" == "$0" ]] && run_all_tests) before sourcing,
# or sourcing will execute the whole suite.
source test/regression-test.sh
setup
test_build_checker_finds_errors
teardown
```
## Remember
- **Test before deploying** - Hooks have full system access
- **Test all paths** - Success, failure, edge cases
- **Test performance** - Hooks shouldn't slow workflow
- **Automate testing** - Run tests on every change
- **Clean up** - Don't leave test artifacts
- **Document tests** - Future you will thank present you
**Golden rule:** If you wouldn't run it on production, don't deploy it as a hook.

View File

@@ -0,0 +1 @@
Use your hyperpowers:brainstorming skill.

View File

@@ -0,0 +1 @@
Use your hyperpowers:executing-plans skill.

View File

@@ -0,0 +1 @@
Use your hyperpowers:writing-plans skill.

View File

@@ -0,0 +1,141 @@
# bd Command Reference
Common bd commands used across multiple skills. Reference this instead of duplicating.
## Reading Issues
```bash
# Show single issue with full design
bd show bd-3
# List all open issues
bd list --status open
# List closed issues
bd list --status closed
# Show dependency tree for an epic
bd dep tree bd-1
# Find tasks ready to work on (no blocking dependencies)
bd ready
# List tasks in a specific epic
bd list --parent bd-1
```
## Creating Issues
```bash
# Create epic
bd create "Epic: Feature Name" \
--type epic \
--priority [0-4] \
--design "## Goal
[Epic description]
## Success Criteria
- [ ] All phases complete
..."
# Create feature/phase
bd create "Phase 1: Phase Name" \
--type feature \
--priority [0-4] \
--design "[Phase design]"
# Create task
bd create "Task Name" \
--type task \
--priority [0-4] \
--design "[Task design]"
```
## Updating Issues
```bash
# Update issue design (detailed description)
bd update bd-3 --design "$(cat <<'EOF'
[Complete updated design]
EOF
)"
```
**IMPORTANT**: Use `--design` for the full detailed description, NOT `--description` (which is title only).
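A quick contrast, in the same wrong/right style used elsewhere in this reference (the design file name is a placeholder):
```bash
# ❌ WRONG - --description only replaces the short title text
bd update bd-3 --description "## Goal ... full multi-paragraph design ..."
# ✅ CORRECT - --design holds the detailed body (the heredoc form above works too)
bd update bd-3 --design "$(cat updated-design.md)"
```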
## Managing Status
```bash
# Start working on task
bd update bd-3 --status in_progress
# Complete task
bd close bd-3
# Reopen task
bd update bd-3 --status open
```
**Common Mistakes:**
```bash
# ❌ WRONG - bd status shows database overview, doesn't change status
bd status bd-3 --status in_progress
# ✅ CORRECT - use bd update to change status
bd update bd-3 --status in_progress
# ❌ WRONG - using hyphens in status values
bd update bd-3 --status in-progress
# ✅ CORRECT - use underscores in status values
bd update bd-3 --status in_progress
# ❌ WRONG - 'done' is not a valid status
bd update bd-3 --status done
# ✅ CORRECT - use bd close to complete
bd close bd-3
```
**Valid status values:** `open`, `in_progress`, `blocked`, `closed`
## Managing Dependencies
```bash
# Add blocking dependency (LATER depends on EARLIER)
# Syntax: bd dep add <dependent> <dependency>
bd dep add bd-3 bd-2 # bd-3 depends on bd-2 (do bd-2 first)
# Add parent-child relationship
# Syntax: bd dep add <child> <parent> --type parent-child
bd dep add bd-3 bd-1 --type parent-child # bd-3 is child of bd-1
# View dependency tree
bd dep tree bd-1
```
## Commit Message Format
Reference bd task IDs in commits (use hyperpowers:test-runner agent):
```bash
# Use test-runner agent to avoid pre-commit hook pollution
Dispatch hyperpowers:test-runner agent: "Run: git add <files> && git commit -m 'feat(bd-3): implement feature
Implements step 1 of bd-3: Task Name
'"
```
## Common Queries
```bash
# Check if all tasks in epic are closed
bd list --status open --parent bd-1
# Output: [empty] = all closed
# See what's blocking current work
bd ready # Shows only unblocked tasks
# Find all in-progress work
bd list --status in_progress
```
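These queries compose into a small guard script. A minimal sketch, assuming `bd list` prints one line per matching issue and nothing when there are no matches, and that `--status` and `--parent` combine as shown above (verify against your bd version):
```bash
#!/bin/bash
# epic-complete-check.sh <epic-id> - refuse to close an epic while tasks remain unfinished
epic="$1"
open_tasks=$(bd list --status open --parent "$epic")
in_progress=$(bd list --status in_progress --parent "$epic")
if [ -n "$open_tasks" ] || [ -n "$in_progress" ]; then
  echo "❌ $epic still has unfinished tasks:"
  [ -n "$open_tasks" ] && echo "$open_tasks"
  [ -n "$in_progress" ] && echo "$in_progress"
  exit 1
fi
echo "✅ All tasks under $epic are closed - safe to close the epic"
```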

View File

@@ -0,0 +1,123 @@
# Common Anti-Patterns
Anti-patterns that apply across multiple skills. Reference this to avoid duplication.
## Language-Specific Anti-Patterns
### Rust
```
❌ No unwrap() or expect() in production code
Use proper error handling with Result/Option
❌ No todo!(), unimplemented!(), or panic!() in production
Implement all code paths properly
❌ No #[ignore] on tests without bd issue number
Fix or track broken tests
❌ No unsafe blocks without documentation
Document safety invariants
❌ Use proper array bounds checking
Prefer .get() over direct indexing in production
```
### Swift
```
❌ No force unwrap (!) in production code
Use optional chaining or guard/if let
❌ No fatalError() in production code
Handle errors gracefully
❌ No disabled tests without bd issue number
Fix or track broken tests
❌ Use proper array bounds checking
Check indices before accessing
❌ Handle all enum cases
No default: fatalError() shortcuts
```
### TypeScript
```
❌ No @ts-ignore or @ts-expect-error without bd issue number
Fix type issues properly
❌ No any types without justification
Use proper typing
❌ No .skip() on tests without bd issue number
Fix or track broken tests
❌ No throw in async code without proper handling
Use try/catch or Promise.catch()
```
## General Anti-Patterns
### Code Quality
```
❌ No TODOs or FIXMEs without bd issue numbers
Track work in bd, not in code comments
❌ No stub implementations
Empty functions, placeholder returns forbidden
❌ No commented-out code
Delete it - version control remembers
❌ No debug print statements in commits
Remove console.log, println!, print() before committing
❌ No "we'll do this later"
Either do it now or create bd issue and reference it
```
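Several of these are mechanically detectable before commit. A minimal grep-based sketch (the patterns are illustrative and will need per-project tuning; a TODO is acceptable only when it carries a bd ID such as `TODO(bd-42)`):
```bash
#!/bin/bash
# anti-pattern-scan.sh - flag common code-quality anti-patterns in staged files
files=$(git diff --cached --name-only --diff-filter=ACM)
[ -z "$files" ] && exit 0
violations=0
# TODO/FIXME without a bd issue reference
if echo "$files" | xargs grep -nE "TODO|FIXME" 2>/dev/null | grep -vE "bd-[0-9]+"; then
  echo "❌ TODO/FIXME without bd issue number"
  violations=$((violations + 1))
fi
# Leftover debug prints (warning only - review before committing)
echo "$files" | xargs grep -nE "console\.log|println!|print\(" 2>/dev/null && \
  echo "⚠️ Possible debug print statements"
[ "$violations" -eq 0 ] && echo "✅ No blocking anti-patterns found"
exit "$violations"
```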
### Testing
```
❌ Don't test mock behavior
Test real behavior or unmock it
❌ Don't add test-only methods to production code
Put in test utilities instead
❌ Don't mock without understanding dependencies
Understand what you're testing first
❌ Don't skip verifications
Run the test, see the output, then claim it passes
```
### Process
```
❌ Don't commit without running tests
Verify tests pass before committing
❌ Don't create PR without running full test suite
All tests must pass before PR creation
❌ Don't skip pre-commit hooks
Never use --no-verify
❌ Don't force push without explicit request
Respect shared branch history
❌ Don't assume backwards compatibility is desired
Ask if breaking changes are acceptable
```
## Project-Specific Additions
Each project may have additional anti-patterns. Check CLAUDE.md for:
- Project-specific code patterns to avoid
- Custom linting rules
- Framework-specific anti-patterns
- Team conventions

View File

@@ -0,0 +1,99 @@
# Common Rationalizations - STOP
These rationalizations appear across multiple contexts. When you catch yourself thinking any of these, STOP - you're about to violate a skill.
## Process Shortcuts
| Excuse | Reality |
|--------|---------|
| "This is simple, can skip the process" | Simple tasks done wrong become complex problems. |
| "Just this once" | No exceptions. Process exists because exceptions fail. |
| "I'm confident this will work" | Confidence ≠ evidence. Run the verification. |
| "I'm tired" | Exhaustion ≠ excuse for shortcuts. |
| "No time for proper approach" | Shortcuts cost more time in rework. |
| "Partner won't notice" | They will. Trust is earned through consistency. |
| "Different words so rule doesn't apply" | Spirit over letter. Intent matters. |
## Verification Shortcuts
| Excuse | Reality |
|--------|---------|
| "Should work now" | RUN the verification command. |
| "Looks correct" | Run it and see the output. |
| "Tests probably pass" | Probably ≠ verified. Run them. |
| "Linter passed, must be fine" | Linter ≠ compiler ≠ tests. Run everything. |
| "Partial check is enough" | Partial proves nothing about the whole. |
| "Agent said success" | Agents lie/hallucinate. Verify independently. |
## Documentation Shortcuts
| Excuse | Reality |
|--------|---------|
| "File probably exists" | Use tools to verify. Don't assume. |
| "Design mentioned it, must be there" | Codebase changes. Verify current state. |
| "I can verify quickly myself" | Use investigator agents. Prevents hallucination. |
| "User can figure it out during execution" | Your job is exact instructions. No ambiguity. |
## Planning Shortcuts
| Excuse | Reality |
|--------|---------|
| "Can skip exploring alternatives" | Comparison reveals issues. Always propose 2-3. |
| "Partner knows what they want" | Questions reveal hidden constraints. Always ask. |
| "Whole design at once for efficiency" | Incremental validation catches problems early. |
| "Checklist is just suggestion" | Create TodoWrite todos. Track properly. |
| "Subtask can reference parent for details" | NO. Subtasks must be complete. NO placeholders, NO "see parent". |
| "I'll use placeholder and fill in later" | NO. Write actual content NOW. No meta-references like "[detailed above]". |
| "Design field is too long, use placeholder" | Length doesn't matter. Write full content. Placeholder defeats the purpose. |
| "Should I continue to the next task?" | YES. You have a TodoWrite plan. Execute it. Don't interrupt your own workflow. |
| "Let me ask user's preference for remaining tasks" | NO. The user gave you the work. Do it. Only ask at natural completion points. |
| "Should I stop here or keep going?" | Your TodoWrite list tells you. If tasks remain, continue. |
## Execution Shortcuts
| Excuse | Reality |
|--------|---------|
| "Task is tracked in TodoWrite, don't need step tracking" | Tasks have 4-8 implementation steps. Without substep tracking, steps 4-8 get skipped. |
| "Made progress on the task, can move on" | Progress ≠ complete. All substeps must finish. 2/6 steps = 33%, not done. |
| "Other tasks are waiting, should continue" | Current task incomplete = blocked. Finish all substeps first. |
| "Can finish remaining steps later" | Later never comes. Complete all substeps now before closing task. |
## Quality Shortcuts
| Excuse | Reality |
|--------|---------|
| "Small gaps don't matter" | Spec is contract. All criteria must be met. |
| "Will fix in next PR" | This PR should complete this work. Fix now. |
| "Partner will review anyway" | You review first. Don't delegate your quality check. |
| "Good enough for now" | "Now" becomes "forever". Do it right. |
## TDD Shortcuts
| Excuse | Reality |
|--------|---------|
| "Test is obvious, can skip RED phase" | If you don't watch it fail, you don't know it works. |
| "Will adapt this code while writing test" | Delete it. Start fresh from the test. |
| "Can keep it as reference" | No. Delete means delete. |
| "Test is simple, don't need to run it" | Simple tests fail for subtle reasons. Run it. |
## Research Shortcuts
| Excuse | Reality |
|--------|---------|
| "I can research quickly myself" | Use agents. You'll hallucinate or waste context. |
| "Agent didn't find it first try, must not exist" | Be persistent. Refine query and try again. |
| "I know this codebase" | You don't know current state. Always verify. |
| "Obvious solution, skip research" | Codebase may have established pattern. Check first. |
**All of these mean: STOP. Follow the requirements exactly.**
## Why This Matters
Rationalizations are how good processes fail:
1. Developer thinks "just this once"
2. Shortcut causes subtle bug
3. Bug found in production/PR
4. More time spent fixing than process would have cost
5. Trust damaged
**No shortcuts. Follow the process. Every time.**

View File

@@ -0,0 +1,493 @@
---
name: debugging-with-tools
description: Use when encountering bugs or test failures - systematic debugging using debuggers, internet research, and agents to find root cause before fixing
---
<skill_overview>
Random fixes waste time and create new bugs. Always use tools to understand root cause BEFORE attempting fixes. Symptom fixes are failure.
</skill_overview>
<rigidity_level>
MEDIUM FREEDOM - Must complete investigation phases (tools → hypothesis → test) before fixing.
Can adapt tool choice to language/context. Never skip investigation or guess at fixes.
</rigidity_level>
<quick_reference>
| Phase | Tools to Use | Output |
|-------|--------------|--------|
| **1. Investigate** | Error messages, internet-researcher agent, debugger, codebase-investigator | Root cause understanding |
| **2. Hypothesize** | Form theory based on evidence (not guesses) | Testable hypothesis |
| **3. Test** | Validate hypothesis with minimal change | Confirms or rejects theory |
| **4. Fix** | Implement proper fix for root cause | Problem solved permanently |
**FORBIDDEN:** Skip investigation → guess at fix → hope it works
**REQUIRED:** Tools → evidence → hypothesis → test → fix
**Key agents:**
- `internet-researcher` - Search error messages, known bugs, solutions
- `codebase-investigator` - Understand code structure, find related code
- `test-runner` - Run tests without output pollution
</quick_reference>
<when_to_use>
**Use for ANY technical issue:**
- Test failures
- Bugs in production or development
- Unexpected behavior
- Build failures
- Integration issues
- Performance problems
**ESPECIALLY when:**
- "Just one quick fix" seems obvious
- Under time pressure (emergencies make guessing tempting)
- Error message is unclear
- Previous fix didn't work
</when_to_use>
<the_process>
## Phase 1: Tool-Assisted Investigation
**BEFORE attempting ANY fix, gather evidence with tools:**
### 1. Read Complete Error Messages
- Entire error message (not just first line)
- Complete stack trace (all frames)
- Line numbers, file paths, error codes
- Stack traces show exact execution path (see the capture sketch below)
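One way to make this habitual is to capture the complete output to a file so the full stack trace survives scrolling. A minimal sketch for a Rust project (substitute `npm test`, `pytest`, etc.; the test name is a placeholder):
```bash
# Capture the entire failure output, not just the first line
RUST_BACKTRACE=1 cargo test failing_test_name 2>&1 | tee /tmp/full-error.log
# Pull out the parts worth reading in full: error lines, panics, backtrace frames
grep -n -i -E "error|panicked|FAILED|at src/" /tmp/full-error.log
```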
### 2. Search Internet FIRST (Use internet-researcher Agent)
**Dispatch internet-researcher with:**
```
"Search for error: [exact error message]
- Check Stack Overflow solutions
- Look for GitHub issues in [library] version [X]
- Find official documentation explaining this error
- Check if this is a known bug"
```
**What agent should find:**
- Exact matches to your error
- Similar symptoms and solutions
- Known bugs in your dependency versions
- Workarounds that worked for others
### 3. Use Debugger to Inspect State
**Claude cannot run debuggers directly. Instead:**
**Option A - Recommend debugger to user:**
```
"Let's use lldb/gdb/DevTools to inspect state at error location.
Please run: [specific commands]
When breakpoint hits: [what to inspect]
Share output with me."
```
**Option B - Add instrumentation Claude can add:**
```rust
// Add logging
println!("DEBUG: var = {:?}, state = {:?}", var, state);
// Add assertions
assert!(condition, "Expected X but got {:?}", actual);
```
### 4. Investigate Codebase (Use codebase-investigator Agent)
**Dispatch codebase-investigator with:**
```
"Error occurs in function X at line Y.
Find:
- How is X called? What are the callers?
- What does variable Z contain at this point?
- Are there similar functions that work correctly?
- What changed recently in this area?"
```
## Phase 2: Form Hypothesis
**Based on evidence (not guesses):**
1. **State what you know** (from investigation)
2. **Propose theory** explaining the evidence
3. **Make prediction** that tests the theory
**Example:**
```
Known: Error "null pointer" at auth.rs:45 when email is empty
Theory: Empty email bypasses validation, passes null to login()
Prediction: Adding validation before login() will prevent error
Test: Add validation, verify error doesn't occur with empty email
```
**NEVER:**
- Guess without evidence
- Propose fix without hypothesis
- Skip to "try this and see"
## Phase 3: Test Hypothesis
**Minimal change to validate theory:**
1. Make smallest change that tests hypothesis
2. Run test/reproduction case
3. Observe result
**If confirmed:** Proceed to Phase 4
**If rejected:** Return to Phase 1 with new information
## Phase 4: Implement Fix
**After understanding root cause:**
1. Write test reproducing bug (RED phase - use test-driven-development skill)
2. Implement proper fix addressing root cause
3. Verify test passes (GREEN phase)
4. Run full test suite (regression check)
5. Commit fix
**The fix should:**
- Address root cause (not symptom)
- Be minimal and focused
- Include test preventing regression (see the verification sketch below)
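A minimal sketch of the RED → GREEN → regression sequence above for a Rust project (the regression-test name is a placeholder; substitute your test runner):
```bash
# RED: the regression test must fail before the fix goes in
cargo test token_creation_sets_real_expiry 2>&1 | tee /tmp/red.log
grep -q "FAILED" /tmp/red.log \
  && echo "✅ Test reproduces the bug" \
  || echo "⚠️ Test passed before the fix - it may not reproduce the bug"
# ...implement the root-cause fix...
# GREEN: the same test must now pass
cargo test token_creation_sets_real_expiry
# Regression check: run the whole suite, not just the fixed test
cargo test
```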
</the_process>
<examples>
<example>
<scenario>Developer encounters test failure, immediately tries "obvious" fix without investigation</scenario>
<code>
Test error:
```
FAIL: test_login_expired_token
AssertionError: Expected Err(TokenExpired), got Ok(User)
```
Developer thinks: "Obviously the token expiration check is wrong"
Makes change without investigation:
```rust
// "Fix" - just check if token is expired
if token.expires_at < now() {
return Err(AuthError::TokenExpired);
}
```
Commits without testing other cases.
</code>
<why_it_fails>
**No investigation:**
- Didn't read error completely
- Didn't check what `expires_at` contains
- Didn't debug to see token state
- Didn't search for similar issues
**What actually happened:** Token `expires_at` was being set to a hardcoded far-future date at creation, so tokens always looked valid. The "fix" adds a check that never fires.
**Result:** Bug not fixed, new dead code added, time wasted.
</why_it_fails>
<correction>
**Phase 1 - Investigate with tools:**
```bash
# 1. Read complete error
FAIL: test_login_expired_token at line 45
Expected: Err(TokenExpired)
Got: Ok(User { id: 123 })
Token: { expires_at: "2099-01-01", ... }
```
**Dispatch internet-researcher:**
```
"Search for: token expiration always showing future date
- Check date parsing bugs
- Look for timezone issues
- Find JWT expiration handling"
```
**Add instrumentation:**
```rust
println!("DEBUG: expires_at = {:?}, now = {:?}, expired = {:?}",
token.expires_at, now(), token.expires_at < now());
```
**Run test again:**
```
DEBUG: expires_at = 2099-01-01T00:00:00Z, now = 2024-01-15T10:30:00Z, expired = false
```
**Phase 2 - Hypothesis:**
"Token `expires_at` is being set to 2099, not actual expiration. Problem is in token creation, not validation."
**Phase 3 - Test:**
Check token creation code:
```rust
// Found the bug!
fn create_token() -> Token {
Token {
expires_at: "2099-01-01".parse()?, // HARDCODED!
...
}
}
```
**Phase 4 - Fix root cause:**
```rust
fn create_token(duration: Duration) -> Token {
Token {
expires_at: now() + duration, // Correct
...
}
}
```
**Result:** Root cause fixed, test passes, no dead code.
</correction>
</example>
<example>
<scenario>Developer skips internet search, reinvents solution to known problem</scenario>
<code>
Error:
```
error: linking with `cc` failed: exit status: 1
ld: symbol(s) not found for architecture arm64
```
Developer thinks: "Must be a linking issue, I'll add flags"
Spends 2 hours trying different linker flags:
```toml
[target.aarch64-apple-darwin]
rustflags = ["-C", "link-arg=-undefined dynamic_lookup"]
# Doesn't work, tries more flags...
```
</code>
<why_it_fails>
**Skipped internet search:**
- This is a common error with known solutions
- Stack Overflow has exact fix
- Official docs explain the issue
- Wasted 2 hours reinventing solution
**Why it happens:** Impatience, thinking "I can figure this out faster"
</why_it_fails>
<correction>
**Dispatch internet-researcher FIRST:**
```
"Search for: 'symbol not found for architecture arm64' Rust linking
- Check Stack Overflow solutions
- Look for Xcode/macOS specific fixes
- Find Cargo configuration for Apple Silicon"
```
**Agent returns (30 seconds):**
```
Found on Stack Overflow (2.4k upvotes):
This occurs when Xcode Command Line Tools aren't installed or outdated.
Solution:
xcode-select --install
Or update existing:
softwareupdate --all --install --force
```
**Apply solution:**
```bash
xcode-select --install
# Wait for installation
cargo build
# Success!
```
**Result:** Fixed in 5 minutes, not 2 hours.
</correction>
</example>
<example>
<scenario>Developer fixes symptom without understanding root cause, bug returns different way</scenario>
<code>
Bug: Users can delete other users' posts
Developer notices: DELETE /posts/:id doesn't check ownership
"Fix":
```python
@app.delete("/posts/{post_id}")
def delete_post(post_id: int, user: User):
post = get_post(post_id)
if post.author_id != user.id:
raise Forbidden()
delete(post)
```
Bug "fixed"... but bug report comes back: "Users can edit other users' posts"
</code>
<why_it_fails>
**Symptom fix:**
- Fixed DELETE endpoint
- Didn't investigate root cause
- Didn't check other endpoints
**Root cause:** No authorization middleware checking ownership across ALL endpoints.
**Result:** Whack-a-mole fixing symptoms, not the underlying issue.
</why_it_fails>
<correction>
**Phase 1 - Investigate with codebase-investigator:**
```
"User can delete other users' posts. Investigate:
- What authorization exists across all post endpoints?
- Is there a pattern for ownership checks?
- Which other endpoints access posts?
- Is there middleware handling this?"
```
**Agent reports:**
```
Found:
- DELETE /posts/:id - NO ownership check
- PUT /posts/:id - NO ownership check
- GET /posts/:id - NO ownership check (allows viewing private posts!)
- No authorization middleware found
- Similar pattern in comments, likes endpoints
```
**Phase 2 - Hypothesis:**
"Missing authorization layer. Need middleware checking resource ownership across ALL endpoints."
**Phase 4 - Fix root cause:**
```python
# Add authorization middleware
class OwnershipMiddleware:
def check_ownership(self, resource, user):
if resource.author_id != user.id:
raise Forbidden()
# Apply to all endpoints
@app.delete("/posts/{post_id}")
@require_ownership(Post)
def delete_post(...):
...
@app.put("/posts/{post_id}")
@require_ownership(Post)
def update_post(...):
...
```
**Result:** Root cause fixed, ALL endpoints secured, not just one symptom.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Tools before fixes** → Never guess without investigation
- Use internet-researcher for errors
- Use debugger or instrumentation for state
- Use codebase-investigator for context
2. **Evidence-based hypotheses** → Not guesses or hunches
- State what tools revealed
- Propose theory explaining evidence
- Make testable prediction
3. **Test hypothesis before fixing** → Minimal change to validate
- Smallest change that tests theory
- Observe result
- If wrong, return to investigation
4. **Fix root cause, not symptom** → One fix, many symptoms prevented
- Understand why problem occurred
- Fix the underlying issue
- Don't play whack-a-mole
## Common Excuses
All of these mean: Stop, use tools to investigate:
- "The fix is obvious"
- "I know what this is"
- "Just a quick try"
- "No time for debugging"
- "Error message is clear enough"
- "Internet search will take too long"
</critical_rules>
<verification_checklist>
Before proposing any fix:
- [ ] Read complete error message (not just first line)
- [ ] Dispatched internet-researcher for unclear errors
- [ ] Used debugger or added instrumentation to inspect state
- [ ] Dispatched codebase-investigator to understand context
- [ ] Formed hypothesis based on evidence (not guesses)
- [ ] Tested hypothesis with minimal change
- [ ] Verified hypothesis confirmed before fixing
Before committing fix:
- [ ] Written test reproducing bug (RED phase)
- [ ] Verified test fails before fix
- [ ] Implemented fix addressing root cause
- [ ] Verified test passes after fix (GREEN phase)
- [ ] Ran full test suite (regression check)
</verification_checklist>
<integration>
**This skill calls:**
- internet-researcher (search errors, known bugs, solutions)
- codebase-investigator (understand code structure, find related code)
- test-driven-development (write test for bug, implement fix)
- test-runner (run tests without output pollution)
**This skill is called by:**
- fixing-bugs (complete bug fix workflow)
- root-cause-tracing (deep debugging for complex issues)
- Any skill when encountering unexpected behavior
**Agents used:**
- hyperpowers:internet-researcher (search for error solutions)
- hyperpowers:codebase-investigator (understand codebase context)
- hyperpowers:test-runner (run tests, return summary only)
</integration>
<resources>
**Detailed guides:**
- [Debugger reference](resources/debugger-reference.md) - LLDB, GDB, DevTools commands
- [Debugging session example](resources/debugging-session-example.md) - Complete walkthrough
**When stuck:**
- Error unclear → Dispatch internet-researcher with exact error text
- Don't understand code flow → Dispatch codebase-investigator
- Need to inspect runtime state → Recommend debugger to user or add instrumentation
- Tempted to guess → Stop, use tools to gather evidence first
</resources>

View File

@@ -0,0 +1,113 @@
## Debugger Quick Reference
### Automated Debugging (Claude CAN run these)
#### lldb Batch Mode
```bash
# One-shot command to inspect variable at breakpoint
lldb -o "breakpoint set --file main.rs --line 42" \
-o "run" \
-o "frame variable my_var" \
-o "quit" \
-- target/debug/myapp 2>&1
# With script file for complex debugging
cat > debug.lldb <<'EOF'
breakpoint set --file main.rs --line 42
run
frame variable
bt
up
frame variable
quit
EOF
lldb -s debug.lldb target/debug/myapp 2>&1
```
#### strace (Linux - system call tracing)
```bash
# See which files program opens
strace -e trace=open,openat cargo run 2>&1 | grep -v "ENOENT"
# Find network activity
strace -e trace=network cargo run 2>&1
# All syscalls with time
strace -tt cargo test some_test 2>&1
```
#### dtrace (macOS - dynamic tracing)
```bash
# Trace function calls
sudo dtrace -n 'pid$target:myapp::entry { printf("%s", probefunc); }' -p <PID>
```
### Interactive Debugging (USER runs these, Claude guides)
**These require interactive terminal - Claude provides commands, user runs them**
### lldb (Rust, Swift, C++)
```bash
# Start debugging
lldb target/debug/myapp
# Set breakpoints
(lldb) breakpoint set --file main.rs --line 42
(lldb) breakpoint set --name my_function
# Run
(lldb) run
(lldb) run arg1 arg2
# When paused:
(lldb) frame variable # Show all locals
(lldb) print my_var # Print specific variable
(lldb) bt # Backtrace (stack)
(lldb) up / down # Navigate stack
(lldb) continue # Resume
(lldb) step / next # Step into / over
(lldb) finish # Run until return
```
### Browser DevTools (JavaScript)
```javascript
// In code:
debugger; // Execution pauses here
// In DevTools:
// - Sources tab → Add breakpoint by clicking line number
// - When paused:
// - Scope panel: See all variables
// - Watch: Add expressions to watch
// - Call stack: Navigate callers
// - Step over (F10), Step into (F11)
```
### gdb (C, C++, Go)
```bash
# Start debugging
gdb ./myapp
# Set breakpoints
(gdb) break main.c:42
(gdb) break myfunction
# Run
(gdb) run
# When paused:
(gdb) print myvar
(gdb) info locals
(gdb) backtrace
(gdb) up / down
(gdb) continue
(gdb) step / next
```

View File

@@ -0,0 +1,73 @@
## Example: Complete Debugging Session
**Problem:** Test fails with "Symbol not found: _OBJC_CLASS_$_WKWebView"
**Phase 1: Investigation**
1. **Read error**: Symbol not found, linking issue
2. **Internet research**:
```
Dispatch hyperpowers:internet-researcher:
"Search for 'dyld Symbol not found _OBJC_CLASS_$_WKWebView'
Focus on: Xcode linking, framework configuration, iOS deployment"
Results: Need to link WebKit framework in Xcode project
```
3. **Debugger**: Not needed, linking happens before runtime
4. **Codebase investigation**:
```
Dispatch hyperpowers:codebase-investigator:
"Find other code using WKWebView - how is WebKit linked?"
Results: Main app target has WebKit in frameworks, test target doesn't
```
**Phase 2: Analysis**
Root cause: Test target doesn't link WebKit framework
Evidence: Main target works, test target fails, Stack Overflow confirms
**Phase 3: Testing**
Hypothesis: Adding WebKit to test target will fix it
Minimal test:
1. Add WebKit.framework to test target
2. Clean build
3. Run tests
```
Dispatch hyperpowers:test-runner: "Run: swift test"
Result: ✓ All tests pass
```
**Phase 4: Implementation**
1. Test already exists (the failing test)
2. Fix: Framework linked
3. Verification: Tests pass
4. Update bd:
```bash
bd close bd-123
```
**Time:** 15 minutes systematic vs. 2+ hours guessing
## Remember
- **Tools make debugging faster**, not slower
- **hyperpowers:internet-researcher** can find solutions in seconds
- **Automated debugging works** - lldb batch mode, strace, instrumentation
- **hyperpowers:codebase-investigator** finds patterns you'd miss
- **hyperpowers:test-runner agent** keeps context clean
- **Evidence before fixes**, always
**Prefer automated tools:**
1. lldb batch mode - non-interactive variable inspection
2. strace/dtrace - system call tracing
3. Instrumentation - logging Claude can add
4. Interactive debugger - only when automated tools insufficient
95% faster to investigate systematically than to guess-and-check.

View File

@@ -0,0 +1,662 @@
---
name: dispatching-parallel-agents
description: Use when facing 3+ independent failures that can be investigated without shared state or dependencies - dispatches multiple Claude agents to investigate and fix independent problems concurrently
---
<skill_overview>
When facing 3+ independent failures, dispatch one agent per problem domain to investigate concurrently; verify independence first, dispatch all in single message, wait for all agents, check conflicts, verify integration.
</skill_overview>
<rigidity_level>
MEDIUM FREEDOM - Follow the 6-step process (identify, create tasks, dispatch, monitor, review, verify) strictly. Independence verification mandatory. Parallel dispatch in single message required. Adapt agent prompt content to problem domain.
</rigidity_level>
<quick_reference>
| Step | Action | Critical Rule |
|------|--------|---------------|
| 1. Identify Domains | Test independence (fix A doesn't affect B) | 3+ independent domains required |
| 2. Create Agent Tasks | Write focused prompts (scope, goal, constraints, output) | One prompt per domain |
| 3. Dispatch Agents | Launch all agents in SINGLE message | Multiple Task() calls in parallel |
| 4. Monitor Progress | Track completions, don't integrate until ALL done | Wait for all agents |
| 5. Review Results | Read summaries, check conflicts | Manual conflict resolution |
| 6. Verify Integration | Run full test suite | Use verification-before-completion |
**Why 3+?** With only two failures, the coordination overhead usually exceeds the time saved over sequential investigation.
**Critical:** Dispatch all agents in single message with multiple Task() calls, or they run sequentially.
</quick_reference>
<when_to_use>
Use when:
- 3+ test files failing with different root causes
- Multiple subsystems broken independently
- Each problem can be understood without context from others
- No shared state between investigations
- You've verified failures are truly independent
- Each domain has clear boundaries (different files, modules, features)
Don't use when:
- Failures are related (fix one might fix others)
- Need to understand full system state first
- Agents would interfere (editing same files)
- Haven't verified independence yet (exploratory phase)
- Failures share root cause (one bug, multiple symptoms)
- Need to preserve investigation order (cascading failures)
- Only 2 failures (overhead exceeds benefit)
</when_to_use>
<the_process>
## Step 1: Identify Independent Domains
**Announce:** "I'm using hyperpowers:dispatching-parallel-agents to investigate these independent failures concurrently."
**Create TodoWrite tracker:**
```
- Identify independent domains (3+ domains identified)
- Create agent tasks (one prompt per domain drafted)
- Dispatch agents in parallel (all agents launched in single message)
- Monitor agent progress (track completions)
- Review results (summaries read, conflicts checked)
- Verify integration (full test suite green)
```
**Test for independence:**
1. **Ask:** "If I fix failure A, does it affect failure B?"
- If NO → Independent
- If YES → Related, investigate together
2. **Check:** "Do failures touch same code/files?"
- If NO → Likely independent
- If YES → Check if different functions/areas
3. **Verify:** "Do failures share error patterns?"
- If NO → Independent
- If YES → Might be same root cause
**Example independence check:**
```
Failure 1: Authentication tests failing (auth.test.ts)
Failure 2: Database query tests failing (db.test.ts)
Failure 3: API endpoint tests failing (api.test.ts)
Check: Does fixing auth affect db queries? NO
Check: Does fixing db affect API? YES - API uses db
Result: 2 independent domains:
Domain 1: Authentication (auth.test.ts)
Domain 2: Database + API (db.test.ts + api.test.ts together)

Only 2 independent domains → below the 3+ threshold, so investigate
them sequentially rather than dispatching parallel agents.
```
**Group failures by what's broken:**
- File A tests: Tool approval flow
- File B tests: Batch completion behavior
- File C tests: Abort functionality
---
## Step 2: Create Focused Agent Tasks
Each agent prompt must have:
1. **Specific scope:** One test file or subsystem
2. **Clear goal:** Make these tests pass
3. **Constraints:** Don't change other code
4. **Expected output:** Summary of what you found and fixed
**Good agent prompt example:**
```markdown
Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
1. "should abort tool with partial output capture" - expects 'interrupted at' in message
2. "should handle mixed completed and aborted tools" - fast tool aborted instead of completed
3. "should properly track pendingToolCount" - expects 3 results but gets 0
These are timing/race condition issues. Your task:
1. Read the test file and understand what each test verifies
2. Identify root cause - timing issues or actual bugs?
3. Fix by:
- Replacing arbitrary timeouts with event-based waiting
- Fixing bugs in abort implementation if found
- Adjusting test expectations if testing changed behavior
Never just increase timeouts - find the real issue.
Return: Summary of what you found and what you fixed.
```
**What makes this good:**
- Specific test failures listed
- Context provided (timing/race conditions)
- Clear methodology (read, identify, fix)
- Constraints (don't just increase timeouts)
- Output format (summary)
**Common mistakes:**
**Too broad:** "Fix all the tests" - agent gets lost
**Specific:** "Fix agent-tool-abort.test.ts" - focused scope
**No context:** "Fix the race condition" - agent doesn't know where
**Context:** Paste the error messages and test names
**No constraints:** Agent might refactor everything
**Constraints:** "Do NOT change production code" or "Fix tests only"
**Vague output:** "Fix it" - you don't know what changed
**Specific:** "Return summary of root cause and changes"
---
## Step 3: Dispatch All Agents in Parallel
**CRITICAL:** You must dispatch all agents in a SINGLE message with multiple Task() calls.
```typescript
// ✅ CORRECT - Single message with multiple parallel tasks
Task("Fix agent-tool-abort.test.ts failures", prompt1)
Task("Fix batch-completion-behavior.test.ts failures", prompt2)
Task("Fix tool-approval-race-conditions.test.ts failures", prompt3)
// All three run concurrently
// ❌ WRONG - Sequential messages
Task("Fix agent-tool-abort.test.ts failures", prompt1)
// Wait for response
Task("Fix batch-completion-behavior.test.ts failures", prompt2)
// This is sequential, not parallel!
```
**After dispatch:**
- Mark "Dispatch agents in parallel" as completed in TodoWrite
- Mark "Monitor agent progress" as in_progress
- Wait for all agents to complete before integration
---
## Step 4: Monitor Progress
As agents work:
- Note which agents have completed
- Note which are still running
- Don't start integration until ALL agents done
**If an agent gets stuck (>5 minutes):**
1. Check AgentOutput to see what it's doing
2. If stuck on wrong path: Cancel and retry with clearer prompt
3. If needs context from other domain: Wait for other agent, then restart with context
4. If hit real blocker: Investigate blocker yourself, then retry
---
## Step 5: Review Results and Check Conflicts
**When all agents return:**
1. **Read each summary carefully**
- What was the root cause?
- What did the agent change?
- Were there any uncertainties?
2. **Check for conflicts**
- Did multiple agents edit same files?
- Did agents make contradictory assumptions?
- Are there integration points between domains?
3. **Integration strategy:**
- If no conflicts: Apply all changes
- If conflicts: Resolve manually before applying
- If assumptions conflict: Verify with user
4. **Document what happened**
- Which agents fixed what
- Any conflicts found
- Integration decisions made
---
## Step 6: Verify Integration
**Run full test suite:**
- Not just the fixed tests
- Verify no regressions in other areas
- Use hyperpowers:verification-before-completion skill
**Before completing:**
```bash
# Run all tests
npm test # or cargo test, pytest, etc.
# Verify output
# If all pass → Mark "Verify integration" complete
# If failures → Identify which agent's change caused regression
```
</the_process>
<examples>
<example>
<scenario>Developer dispatches agents sequentially instead of in parallel</scenario>
<code>
# Developer sees 3 independent failures
# Creates 3 agent prompts
# Dispatches first agent
Task("Fix agent-tool-abort.test.ts failures", prompt1)
# Waits for response from agent 1
# Then dispatches second agent
Task("Fix batch-completion-behavior.test.ts failures", prompt2)
# Waits for response from agent 2
# Then dispatches third agent
Task("Fix tool-approval-race-conditions.test.ts failures", prompt3)
# Total time: Sum of all three agents (sequential)
</code>
<why_it_fails>
- Agents run sequentially, not in parallel
- No time savings from parallelization
- Each agent waits for previous to complete
- Defeats entire purpose of parallel dispatch
- Same result as sequential investigation
- Wasted overhead of creating separate agents
</why_it_fails>
<correction>
**Dispatch all agents in SINGLE message:**
```typescript
// Single message with multiple Task() calls
Task("Fix agent-tool-abort.test.ts failures", `
Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
[prompt 1 content]
`)
Task("Fix batch-completion-behavior.test.ts failures", `
Fix the 2 failing tests in src/agents/batch-completion-behavior.test.ts:
[prompt 2 content]
`)
Task("Fix tool-approval-race-conditions.test.ts failures", `
Fix the 1 failing test in src/agents/tool-approval-race-conditions.test.ts:
[prompt 3 content]
`)
// All three run concurrently - THIS IS THE KEY
```
**What happens:**
- All three agents start simultaneously
- Each investigates independently
- All complete in parallel
- Total time: Max(agent1, agent2, agent3) instead of Sum
**What you gain:**
- True parallelization - 3 problems solved concurrently
- Time saved: 3 investigations in time of 1
- Each agent focused on narrow scope
- No waiting for sequential completion
- Proper use of parallel dispatch pattern
</correction>
</example>
<example>
<scenario>Developer assumes failures are independent without verification</scenario>
<code>
# Developer sees 3 test failures:
# - API endpoint tests failing
# - Database query tests failing
# - Cache invalidation tests failing
# Thinks: "Different subsystems, must be independent"
# Dispatches 3 agents immediately without checking independence
# Agent 1 finds: API failing because database schema changed
# Agent 2 finds: Database queries need migration
# Agent 3 finds: Cache keys based on old schema
# All three failures caused by same root cause: schema change
# Agents make conflicting fixes based on different assumptions
# Integration fails because fixes contradict each other
</code>
<why_it_fails>
- Skipped independence verification (Step 1)
- Assumed independence based on surface appearance
- All failures actually shared root cause (schema change)
- Agents worked in isolation without seeing connection
- Each agent made different assumptions about correct schema
- Conflicting fixes can't be integrated
- Wasted time on parallel work that should have been unified
- Have to throw away agent work and start over
</why_it_fails>
<correction>
**Run independence check FIRST:**
```
Check: Does fixing API affect database queries?
- API uses database
- If database schema changes, API breaks
- YES - these are related
Check: Does fixing database affect cache?
- Cache stores database results
- If database schema changes, cache keys break
- YES - these are related
Check: Do failures share error patterns?
- All mention "column not found: user_email"
- All started after schema migration
- YES - shared root cause
Result: NOT INDEPENDENT
These are one problem (schema change) manifesting in 3 places
```
**Correct approach:**
```
Single agent investigates: "Schema migration broke 3 subsystems"
Agent prompt:
"We have 3 test failures all related to schema change:
1. API endpoints: column not found
2. Database queries: column not found
3. Cache invalidation: old keys
Investigate the schema migration that caused this.
Fix by updating all 3 subsystems consistently.
Return: What changed in schema, how you fixed each subsystem."
# One agent sees full picture
# Makes consistent fix across all 3 areas
# No conflicts, proper integration
```
**What you gain:**
- Caught shared root cause before wasting time
- One agent sees full context
- Consistent fix across all affected areas
- No conflicting assumptions
- No integration conflicts
- Faster than 3 agents working at cross-purposes
- Proper problem diagnosis before parallel dispatch
</correction>
</example>
<example>
<scenario>Developer integrates agent results without checking conflicts</scenario>
<code>
# 3 agents complete successfully
# Developer quickly reads summaries:
Agent 1: "Fixed timeout issue by increasing wait time to 5000ms"
Agent 2: "Fixed race condition by adding mutex lock"
Agent 3: "Fixed timing issue by reducing wait time to 1000ms"
# Developer thinks: "All agents succeeded, ship it"
# Applies all changes without checking conflicts
# Result:
# - Agent 1 and Agent 3 edited same file
# - Agent 1 increased timeout, Agent 3 decreased it
# - Final code has inconsistent timeouts
# - Agent 2's mutex interacts badly with Agent 3's reduced timeout
# - Tests still fail after integration
</code>
<why_it_fails>
- Skipped conflict checking (Step 5)
- Didn't carefully read what each agent changed
- Agents made contradictory decisions
- Agent 1 and Agent 3 had different assumptions about timing
- Agent 2's locking interacts with timing changes
- Blindly applying all fixes creates inconsistent state
- Tests fail after "successful" integration
- Have to manually untangle conflicting changes
</why_it_fails>
<correction>
**Review results carefully before integration:**
````markdown
## Agent Summaries Review
Agent 1: Fixed timeout issue by increasing wait time to 5000ms
- File: src/agents/tool-executor.ts
- Change: DEFAULT_TIMEOUT = 5000
Agent 2: Fixed race condition by adding mutex lock
- File: src/agents/tool-executor.ts
- Change: Added mutex around tool execution
Agent 3: Fixed timing issue by reducing wait time to 1000ms
- File: src/agents/tool-executor.ts
- Change: DEFAULT_TIMEOUT = 1000
## Conflict Analysis
**CONFLICT DETECTED:**
- Agents 1 and 3 edited same file (tool-executor.ts)
- Agents 1 and 3 changed same constant (DEFAULT_TIMEOUT)
- Agent 1: increase to 5000ms
- Agent 3: decrease to 1000ms
- Contradictory assumptions about correct timing
**Why conflict occurred:**
- Domains weren't actually independent (same timeout constant)
- Both agents tested locally, didn't see interaction
- Different problem spaces led to different timing needs
## Resolution
**Option 1:** Different timeouts for different operations
```typescript
const TOOL_EXECUTION_TIMEOUT = 5000 // Agent 1's need
const TOOL_APPROVAL_TIMEOUT = 1000 // Agent 3's need
```
**Option 2:** Investigate why timing varies
- Maybe Agent 1's tests are actually slow (fix slowness)
- Maybe Agent 3's tests are correct (use 1000ms everywhere)
**Choose Option 2 after investigation:**
- Agent 1's tests were slow due to unrelated issue
- Fix the slowness, use 1000ms timeout everywhere
- Agent 2's mutex is compatible with 1000ms
**Integration steps:**
1. Apply Agent 2's mutex (no conflict)
2. Apply Agent 3's 1000ms timeout
3. Fix Agent 1's slow tests (root cause)
4. Don't apply Agent 1's timeout increase (symptom fix)
````
**Run full test suite:**
```bash
npm test
# All tests pass ✅
```
**What you gain:**
- Caught contradiction before breaking integration
- Understood why agents made different decisions
- Resolved conflict thoughtfully, not arbitrarily
- Fixed root cause (slow tests) not symptom (long timeout)
- Verified integration works correctly
- Avoided shipping inconsistent code
- Professional conflict resolution process
</correction>
</example>
</examples>
<failure_modes>
## Agent Gets Stuck
**Symptoms:** No progress after 5+ minutes
**Causes:**
- Prompt too vague, agent exploring aimlessly
- Domain not actually independent, needs context from other agents
- Agent hit a blocker (missing file, unclear error)
**Recovery:**
1. Use AgentOutput tool to check what it's doing
2. If stuck on wrong path: Cancel and retry with clearer prompt
3. If needs context from other domain: Wait for other agent, then restart with context
4. If hit real blocker: Investigate blocker yourself, then retry
---
## Agents Return Conflicting Fixes
**Symptoms:** Agents edited same code differently, or made contradictory assumptions
**Causes:**
- Domains weren't actually independent
- Shared code between domains
- Agents made different assumptions about correct behavior
**Recovery:**
1. Don't apply either fix automatically
2. Read both fixes carefully
3. Identify the conflict point
4. Resolve manually based on which assumption is correct
5. Consider if domains should be merged
---
## Integration Breaks Other Tests
**Symptoms:** Fixed tests pass, but other tests now fail
**Causes:**
- Agent changed shared code
- Agent's fix was too broad
- Agent misunderstood requirements
**Recovery:**
1. Identify which agent's change caused the regression
2. Read the agent's summary - did they mention this change?
3. Evaluate if change is correct but tests need updating
4. Or if change broke something, need to refine the fix
5. Use hyperpowers:verification-before-completion skill for final check
---
## False Independence
**Symptoms:** Fixing one domain revealed it affected another
**Recovery:**
1. Merge the domains
2. Have one agent investigate both together
3. Learn: Better independence test needed upfront
</failure_modes>
<critical_rules>
## Rules That Have No Exceptions
1. **Verify independence first** → Test with questions before dispatching
2. **3+ domains required** → 2 failures: overhead exceeds benefit, do sequentially
3. **Single message dispatch** → All agents in one message with multiple Task() calls
4. **Wait for ALL agents** → Don't integrate until all complete
5. **Check conflicts manually** → Read summaries, verify no contradictions
6. **Verify integration** → Run full suite yourself, don't trust agents
7. **TodoWrite tracking** → Track agent progress explicitly
## Common Excuses
All of these mean: **STOP. Follow the process.**
- "Just 2 failures, can still parallelize" (Overhead exceeds benefit, do sequentially)
- "Probably independent, will dispatch and see" (Verify independence FIRST)
- "Can dispatch sequentially to save syntax" (WRONG - must dispatch in single message)
- "Agent failed, but others succeeded - ship it" (All agents must succeed or re-investigate)
- "Conflicts are minor, can ignore" (Resolve all conflicts explicitly)
- "Don't need TodoWrite for just tracking agents" (Use TodoWrite, track properly)
- "Can skip verification, agents ran tests" (Agents can make mistakes, YOU verify)
</critical_rules>
<verification_checklist>
Before completing parallel agent work:
- [ ] Verified independence with 3 questions (fix A affects B? same code? same error pattern?)
- [ ] 3+ independent domains identified (not 2 or fewer)
- [ ] Created focused agent prompts (scope, goal, constraints, output)
- [ ] Dispatched all agents in single message (multiple Task() calls)
- [ ] Waited for ALL agents to complete (didn't integrate early)
- [ ] Read all agent summaries carefully
- [ ] Checked for conflicts (same files, contradictory assumptions)
- [ ] Resolved any conflicts manually before integration
- [ ] Ran full test suite (not just fixed tests)
- [ ] Used verification-before-completion skill
- [ ] Documented which agents fixed what
**Can't check all boxes?** Return to the process and complete missing steps.
</verification_checklist>
<integration>
**This skill covers:** Parallel investigation of independent failures
**Related skills:**
- hyperpowers:debugging-with-tools (how to investigate individual failures)
- hyperpowers:fixing-bugs (complete bug workflow)
- hyperpowers:verification-before-completion (verify integration)
- hyperpowers:test-runner (run tests without context pollution)
**This skill uses:**
- Task tool (dispatch parallel agents)
- AgentOutput tool (monitor stuck agents)
- TodoWrite (track agent progress)
**Workflow integration:**
```
Multiple independent failures
↓
Verify independence (Step 1)
↓
Create agent tasks (Step 2)
↓
Dispatch in parallel (Step 3)
↓
Monitor progress (Step 4)
↓
Review + check conflicts (Step 5)
↓
Verify integration (Step 6)
↓
hyperpowers:verification-before-completion
```
**Real example from session (2025-10-03):**
- 6 failures across 3 files
- 3 agents dispatched in parallel
- All investigations completed concurrently
- All fixes integrated successfully
- Zero conflicts between agent changes
- Time saved: three investigations ran concurrently instead of one after another
</integration>
<resources>
**Key principles:**
- Parallelization only wins with 3+ independent problems
- Independence verification prevents wasted parallel work
- Single message dispatch is critical for true parallelism
- Conflict checking prevents integration disasters
- Full verification catches agent mistakes
**When stuck:**
- Agent not making progress → Check AgentOutput, retry with clearer prompt
- Conflicts after dispatch → Domains weren't independent, merge and retry
- Integration fails tests → Identify which agent caused regression
- Unclear if independent → Test with 3 questions (affects? same code? same error?)
</resources>


@@ -0,0 +1,571 @@
---
name: executing-plans
description: Use to execute bd tasks iteratively - executes one task, reviews learnings, creates/refines next task, then STOPS for user review before continuing
---
<skill_overview>
Execute bd tasks one at a time with mandatory checkpoints: Load epic → Execute task → Review learnings → Create next task → Run SRE refinement → STOP. User clears context, reviews implementation, then runs command again to continue. Epic requirements are immutable, tasks adapt to reality.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow exact process: load epic, execute ONE task, review, create next task with SRE refinement, STOP.
Epic requirements are immutable. Tasks adapt to discoveries. Do not skip checkpoints, SRE refinement, or verification. STOP after each task for user review.
</rigidity_level>
<quick_reference>
| Step | Command | Purpose |
|------|---------|---------|
| **Load Epic** | `bd show bd-1` | Read immutable requirements once at start |
| **Find Task** | `bd ready` | Get next ready task to execute |
| **Start Task** | `bd update bd-2 --status in_progress` | Mark task active |
| **Track Substeps** | TodoWrite for each implementation step | Prevent incomplete execution |
| **Close Task** | `bd close bd-2` | Mark task complete after verification |
| **Review** | Re-read epic, check learnings | Adapt next task to reality |
| **Create Next** | `bd create "Task N"` | Based on learnings, not assumptions |
| **Refine** | Use `sre-task-refinement` skill | Corner-case analysis with Opus 4.1 |
| **STOP** | Present summary to user | User reviews, clears context, runs command again |
| **Final Check** | Use `review-implementation` skill | Verify all success criteria before closing epic |
**Critical:** Epic = contract (immutable). Tasks = discovery (adapt to reality). STOP after each task for user review.
</quick_reference>
<when_to_use>
**Use after hyperpowers:writing-plans creates epic and first task.**
Symptoms you need this:
- bd epic exists with tasks ready to execute
- Need to implement features iteratively
- Requirements clear, but implementation path will adapt
- Want continuous learning between tasks
</when_to_use>
<the_process>
## 0. Resumption Check (Every Invocation)
This skill supports explicit resumption. When invoked:
```bash
bd list --type epic --status open # Find active epic
bd ready # Check for ready tasks
bd list --status in_progress # Check for in-progress tasks
```
**Fresh start:** No in-progress tasks, proceed to Step 1.
**Resuming:** Found ready or in-progress tasks:
- In-progress task exists → Resume at Step 2 (continue executing)
- Ready task exists → Resume at Step 2 (start executing)
- All tasks closed but epic open → Resume at Step 4 (check criteria)
**Why resumption matters:**
- User cleared context between tasks (intended workflow)
- Context limit reached mid-task
- Previous session ended unexpectedly
**Do not ask "where did we leave off?"** - bd state tells you exactly where to resume.
## 1. Load Epic Context (Once at Start)
Before executing ANY task, load the epic into context:
```bash
bd list --type epic --status open # Find epic
bd show bd-1 # Load epic details
```
**Extract and keep in mind:**
- Requirements (IMMUTABLE)
- Success criteria (validation checklist)
- Anti-patterns (FORBIDDEN shortcuts)
- Approach (high-level strategy)
**Why:** Requirements prevent watering down when blocked.
## 2. Execute Current Ready Task
```bash
bd ready # Find next task
bd update bd-2 --status in_progress # Start it
bd show bd-2 # Read details
```
**CRITICAL - Create TodoWrite for ALL substeps:**
Tasks contain 4-8 implementation steps. Create TodoWrite todos for each to prevent incomplete execution:
```
- bd-2 Step 1: Write test (pending)
- bd-2 Step 2: Run test RED (pending)
- bd-2 Step 3: Implement function (pending)
- bd-2 Step 4: Run test GREEN (pending)
- bd-2 Step 5: Refactor (pending)
- bd-2 Step 6: Commit (pending)
```
**Execute steps:**
- Use `test-driven-development` when implementing features
- Mark each substep completed immediately after finishing
- Use `test-runner` agent for verifications
**Pre-close verification:**
- Check TodoWrite: All substeps completed?
- If incomplete: Continue with remaining substeps
- If complete: Close task and commit
```bash
bd close bd-2 # After ALL substeps done
```
## 3. Review Against Epic and Create Next Task
**CRITICAL:** After each task, adapt plan based on reality.
**Review questions:**
1. What did we learn?
2. Discovered any blockers, existing functionality, limitations?
3. Does this move us toward epic success criteria?
4. What's next logical step?
5. Any epic anti-patterns to avoid?
**Re-read epic:**
```bash
bd show bd-1 # Keep requirements fresh
```
**Three cases:**
**A) Next task still valid** → Proceed to Step 2
**B) Next task now redundant** (plan invalidation allowed):
```bash
bd delete bd-4 # Remove wasteful task
# Or update: bd update bd-4 --title "New work" --design "..."
```
**C) Need new task** based on learnings:
```bash
bd create "Task N: [Next Step Based on Reality]" \
--type feature \
--design "## Goal
[Deliverable based on what we learned]
## Context
Completed bd-2: [discoveries]
## Implementation
[Steps reflecting current state, not assumptions]
## Success Criteria
- [ ] Specific outcomes
- [ ] Tests passing"
bd dep add bd-N bd-1 --type parent-child
bd dep add bd-N bd-2 --type blocks
```
**REQUIRED - Run SRE refinement on new task:**
```
Use Skill tool: hyperpowers:sre-task-refinement
```
SRE refinement will:
- Apply 7-category corner-case analysis (Opus 4.1)
- Identify edge cases and failure modes
- Strengthen success criteria
- Ensure task is ready for implementation
**Do NOT skip SRE refinement.** New tasks need the same rigor as initial planning.
## 4. Check Epic Success Criteria and STOP
```bash
bd show bd-1 # Check success criteria
```
- ALL criteria met? → Step 5 (final validation)
- Some missing? → **STOP for user review**
## 4a. STOP Checkpoint (Mandatory)
**Present summary to user:**
```markdown
## Task bd-N Complete - Checkpoint
### What Was Done
- [Summary of implementation]
- [Key learnings/discoveries]
### Next Task Ready
- bd-M: [Title]
- [Brief description of what's next]
### Epic Progress
- [X/Y success criteria met]
- [Remaining criteria]
### To Continue
Run `/hyperpowers:execute-plan` to execute the next task.
```
**Why STOP is mandatory:**
- User can clear context (prevents context exhaustion)
- User can review implementation before next task
- User can adjust next task if needed
- Prevents runaway execution without oversight
**Do NOT rationalize skipping the stop:**
- "Good context loaded" → Context reloads are cheap, wrong decisions aren't
- "Momentum" → Checkpoints ensure quality over speed
- "User didn't ask to stop" → Stopping is the default, continuing requires explicit command
## 5. Final Validation and Closure
When all success criteria appear met:
1. **Run full verification** (tests, hooks, manual checks)
2. **REQUIRED - Use review-implementation skill:**
```
Use Skill tool: hyperpowers:review-implementation
```
Review-implementation will:
- Check each requirement met
- Verify each success criterion satisfied
- Confirm no anti-patterns used
- If approved: Calls `finishing-a-development-branch`
- If gaps: Create tasks, return to Step 2
3. **Only close epic after review approves**
</the_process>
<examples>
<example>
<scenario>Developer closes task without completing all substeps, claims "mostly done"</scenario>
<code>
bd-2 has 6 implementation steps.
TodoWrite shows:
- ✅ bd-2 Step 1: Write test
- ✅ bd-2 Step 2: Run test RED
- ✅ bd-2 Step 3: Implement function
- ⏸️ bd-2 Step 4: Run test GREEN (pending)
- ⏸️ bd-2 Step 5: Refactor (pending)
- ⏸️ bd-2 Step 6: Commit (pending)
Developer thinks: "Function works, I'll close bd-2 and move on"
Runs: bd close bd-2
</code>
<why_it_fails>
Steps 4-6 skipped:
- Tests not verified GREEN (might have broken other tests)
- Code not refactored (leaves technical debt)
- Changes not committed (work could be lost)
"Mostly done" = incomplete task = will cause issues later.
</why_it_fails>
<correction>
**Pre-close verification checkpoint:**
Before closing ANY task:
1. Check TodoWrite: All substeps completed?
2. If incomplete: Continue with remaining substeps
3. Only when ALL ✅: bd close bd-2
**Result:** Task actually complete, tests passing, code committed.
</correction>
</example>
<example>
<scenario>Developer discovers planned task is redundant, executes it anyway "because it's in the plan"</scenario>
<code>
bd-4 says: "Implement token refresh middleware"
While executing bd-2, developer discovers:
- Token refresh middleware already exists in auth/middleware/refresh.ts
- Works correctly, has tests
- bd-4 would duplicate existing code
Developer thinks: "bd-4 is in the plan, I should do it anyway"
Proceeds to implement duplicate middleware
</code>
<why_it_fails>
**Wasteful execution:**
- Duplicates existing functionality
- Creates maintenance burden (two implementations to keep in sync)
- Violates DRY principle
- Wastes time on redundant work
**Why it happens:** Treating tasks as immutable instead of epic.
</why_it_fails>
<correction>
**Plan invalidation is allowed:**
1. Verify the discovery:
```bash
# Check existing code
cat auth/middleware/refresh.ts
# Confirm it works
npm test -- refresh.spec.ts
```
2. Delete redundant task:
```bash
bd delete bd-4
```
3. Document why:
```
bd update bd-2 --design "...
Discovery: Token refresh middleware already exists (auth/middleware/refresh.ts).
Verified working with tests. bd-4 deleted as redundant."
```
4. Create new task if needed (maybe "Integrate existing refresh middleware" instead)
**Result:** Plan adapts to reality. No wasted work.
</correction>
</example>
<example>
<scenario>Developer hits blocker, waters down epic requirement to "make it easier"</scenario>
<code>
Epic bd-1 anti-patterns say:
"FORBIDDEN: Using mocks for database integration tests. Must use real test database."
Developer encounters:
- Real database setup is complex
- Mocking would make tests pass quickly
Developer thinks: "This is too hard, I'll use mocks just for now and refactor later"
Adds TODO: // TODO: Replace mocks with real DB later
</code>
<why_it_fails>
**Violates epic anti-pattern:**
- Epic explicitly forbids mocks for integration tests
- "Later" never happens (TODO remains forever)
- Tests don't verify actual integration
- Defeats purpose of integration testing
**Why it happens:** Rationalizing around blockers instead of solving them.
</why_it_fails>
<correction>
**When blocked, re-read epic:**
1. Re-read epic requirements and anti-patterns:
```bash
bd show bd-1
```
2. Check if solution violates anti-pattern:
- Using mocks? YES, explicitly forbidden
3. Don't rationalize. Instead:
**Option A - Research:**
```bash
bd create "Research: Real DB test setup for [project]" \
--design "Find how this project sets up test databases.
Check existing test files for patterns.
Document setup process that meets anti-pattern requirements."
```
**Option B - Ask user:**
"Blocker: Test DB setup complex. Epic forbids mocks for integration.
Is there existing test DB infrastructure I should use?"
**Result:** Epic requirements maintained. Blocker solved properly.
</correction>
</example>
<example>
<scenario>Developer skips STOP checkpoint to "maintain momentum"</scenario>
<code>
Just completed bd-2 (authentication middleware).
Created bd-3 (rate limiting endpoint).
Ran SRE refinement on bd-3.
Developer thinks: "Good context loaded, I'll just do bd-3 quickly then stop.
User approved the epic, they trust me to execute it.
Stopping now is inefficient."
Continues directly to execute bd-3 without STOP checkpoint.
</code>
<why_it_fails>
**Multiple failures:**
- User can't review bd-2 implementation before bd-3 starts
- User can't clear context (may hit context limit mid-task)
- User can't adjust bd-3 based on bd-2 learnings
- No checkpoint = no oversight
**The rationalization trap:**
- "Good context" sounds efficient but prevents review
- "User trust" misinterprets approval (one command ≠ blanket permission)
- "Quick task" becomes long task when issues arise
**What actually happens:**
- bd-3 hits unexpected issue
- Context exhausted trying to debug
- User returns to find 2 half-finished tasks instead of 1 complete task
</why_it_fails>
<correction>
**Follow the STOP checkpoint:**
1. After completing bd-2 and refining bd-3:
```markdown
## Task bd-2 Complete - Checkpoint
### What Was Done
- Implemented JWT middleware with validation
- Added token refresh handling
### Next Task Ready
- bd-3: Implement rate limiting
- Adds rate limiting to auth endpoints
### Epic Progress
- 2/4 success criteria met
- Remaining: password reset, rate limiting
### To Continue
Run `/hyperpowers:execute-plan` to execute the next task.
```
2. **STOP and wait for user**
**Result:** User can review, clear context, adjust next task. Each task completes with full oversight.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **STOP after each task** → Present summary, wait for user to run command again
- User needs checkpoint to review implementation
- User may need to clear context
- Continuous execution = no oversight
2. **SRE refinement for new tasks** → Never skip corner-case analysis
- New tasks created during execution need same rigor as initial planning
- Use Opus 4.1 for thorough analysis
- Tasks without refinement will miss edge cases
3. **Epic requirements are immutable** → Never water down when blocked
- If blocked: Research solution or ask user
- Never violate anti-patterns to "make it easier"
4. **All substeps must be completed** → Never close task with pending substeps
- Check TodoWrite before closing
- "Mostly done" = incomplete = will cause issues
5. **Plan invalidation is allowed** → Delete redundant tasks
- If discovered existing functionality: Delete duplicate task
- If discovered blocker: Update or delete invalid task
- Document what you found and why
6. **Review before closing epic** → Use review-implementation skill
- Tasks done ≠ success criteria met
- All criteria must be verified before closing
## Common Excuses
All of these mean: Re-read epic, STOP as required, ask for help:
- "Good context loaded, don't want to lose it" → STOP anyway, context reloads
- "Just one more quick task" → STOP anyway, user needs checkpoint
- "User didn't ask me to stop" → Stopping is default, continuing requires explicit command
- "SRE refinement is overkill for this task" → Every task needs refinement, no exceptions
- "This requirement is too hard" → Research or ask, don't water down
- "I'll come back to this later" → Complete now or document why blocked
- "Let me fake this to make tests pass" → Never, defeats purpose
- "Existing task is wasteful, but it's planned" → Delete it, plan adapts to reality
- "All tasks done, epic must be complete" → Verify with review-implementation
</critical_rules>
<verification_checklist>
Before closing each task:
- [ ] ALL TodoWrite substeps completed (no pending)
- [ ] Tests passing (use test-runner agent)
- [ ] Changes committed
- [ ] Task actually done (not "mostly")
After closing each task:
- [ ] Reviewed learnings against epic
- [ ] Created/updated next task based on reality
- [ ] Ran SRE refinement on any new tasks
- [ ] Presented STOP checkpoint summary to user
- [ ] STOPPED execution (do not continue to next task)
Before closing epic:
- [ ] ALL success criteria met (check epic)
- [ ] review-implementation skill used and approved
- [ ] No anti-patterns violated
- [ ] All tasks closed
</verification_checklist>
<integration>
**This skill calls:**
- writing-plans (creates epic and first task before this runs)
- sre-task-refinement (REQUIRED for new tasks created during execution)
- test-driven-development (when implementing features)
- test-runner (for running tests without output pollution)
- review-implementation (final validation before closing epic)
- finishing-a-development-branch (after review approves)
**This skill is called by:**
- User (via /hyperpowers:execute-plan command)
- After writing-plans creates epic
- Explicitly to resume after checkpoint (user runs command again)
**Agents used:**
- hyperpowers:test-runner (run tests, return summary only)
**Workflow pattern:**
```
/hyperpowers:execute-plan → Execute task → STOP
[User clears context, reviews]
/hyperpowers:execute-plan → Execute next task → STOP
[Repeat until epic complete]
```
</integration>
<resources>
**bd command reference:**
- See [bd commands](../common-patterns/bd-commands.md) for complete command list
**When stuck:**
- Hit blocker → Re-read epic, check anti-patterns, research or ask
- Don't understand instruction → Stop and ask (never guess)
- Verification fails repeatedly → Check epic anti-patterns, ask for help
- Tempted to skip steps → Check TodoWrite, complete all substeps
</resources>


@@ -0,0 +1,482 @@
---
name: finishing-a-development-branch
description: Use when implementation complete and tests pass - closes bd epic, presents integration options (merge/PR/keep/discard), executes choice
---
<skill_overview>
Close bd epic, verify tests pass, present 4 integration options, execute choice, cleanup worktree appropriately.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow the 6-step process exactly. Present exactly 4 options. Never skip test verification. Must confirm before discarding.
</rigidity_level>
<quick_reference>
| Step | Action | If Blocked |
|------|--------|------------|
| 1 | Close bd epic | Tasks still open → STOP |
| 2 | Verify tests pass (test-runner agent) | Tests fail → STOP |
| 3 | Determine base branch | Ask if needed |
| 4 | Present exactly 4 options | Wait for choice |
| 5 | Execute choice | Follow option workflow |
| 6 | Cleanup worktree (options 1 and 4 only) | Options 2 and 3 keep worktree |
**Options:** 1=Merge locally, 2=PR, 3=Keep as-is, 4=Discard (confirm)
</quick_reference>
<when_to_use>
- Implementation complete and reviewed
- All bd tasks for epic are done
- Ready to integrate work back to main branch
- Called by hyperpowers:review-implementation (final step)
**Don't use for:**
- Work still in progress
- Tests failing
- Epic has open tasks
- Mid-implementation (use hyperpowers:executing-plans)
</when_to_use>
<the_process>
## Step 1: Close bd Epic
**Announce:** "I'm using hyperpowers:finishing-a-development-branch to complete this work."
**Verify all tasks closed:**
```bash
bd dep tree bd-1 # Show task tree
bd list --status open --parent bd-1 # Check for open tasks
```
**If any tasks still open:**
```
Cannot close epic bd-1: N tasks still open:
- bd-3: Task Name (status: in_progress)
- bd-5: Task Name (status: open)
Complete all tasks before finishing.
```
**STOP. Do not proceed.**
**If all tasks closed:**
```bash
bd close bd-1
```
---
## Step 2: Verify Tests
**IMPORTANT:** Use hyperpowers:test-runner agent to avoid context pollution.
Dispatch hyperpowers:test-runner agent:
```
Run: cargo test
(or: npm test / pytest / go test ./...)
```
Agent returns summary + failures only.
**If tests fail:**
```
Tests failing (N failures). Must fix before completing:
[Show failures]
Cannot proceed until tests pass.
```
**STOP. Do not proceed.**
**If tests pass:** Continue to Step 3.
---
## Step 3: Determine Base Branch
```bash
git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null
```
Or ask: "This branch split from main - is that correct?"
---
## Step 4: Present Options
Present exactly these 4 options:
```
Implementation complete. What would you like to do?
1. Merge back to <base-branch> locally
2. Push and create a Pull Request
3. Keep the branch as-is (I'll handle it later)
4. Discard this work
Which option?
```
**Don't add explanation.** Keep concise.
---
## Step 5: Execute Choice
### Option 1: Merge Locally
```bash
git checkout <base-branch>
git pull
git merge <feature-branch>
# Verify tests on merged result
Dispatch hyperpowers:test-runner: "Run: <test command>"
# If tests pass
git branch -d <feature-branch>
```
Then: Step 6 (cleanup worktree)
---
### Option 2: Push and Create PR
**Get epic info:**
```bash
bd show bd-1
bd dep tree bd-1
```
**Create PR:**
```bash
git push -u origin <feature-branch>
gh pr create --title "feat: <epic-name>" --body "$(cat <<'EOF'
## Epic
Closes bd-<N>: <Epic Title>
## Summary
<2-3 bullets from epic implementation>
## Tasks Completed
- bd-2: <Task Name>
- bd-3: <Task Name>
## Test Plan
- [ ] All tests passing
- [ ] <verification steps from epic>
EOF
)"
```
Report the PR URL and **keep the worktree** for PR feedback (skip Step 6).
---
### Option 3: Keep As-Is
Report: "Keeping branch <name>. Worktree preserved at <path>."
**Don't cleanup worktree.**
---
### Option 4: Discard
**Confirm first:**
```
This will permanently delete:
- Branch <name>
- All commits: <commit-list>
- Worktree at <path>
Type 'discard' to confirm.
```
Wait for exact "discard" confirmation.
**If confirmed:**
```bash
git checkout <base-branch>
git branch -D <feature-branch>
```
Then: Step 6 (cleanup worktree)
---
## Step 6: Cleanup Worktree
**For Options 1 and 4 only:**
```bash
# Check if in worktree
git worktree list | grep "$(git branch --show-current)"
# If yes
git worktree remove <worktree-path>
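# Safer variant (sketch): skip removal if the worktree has uncommitted changes
# (<worktree-path> is a placeholder, as above)
if git -C <worktree-path> status --porcelain | grep -q .; then
  echo "Worktree has uncommitted changes - confirm with the user before removing"
else
  git worktree remove <worktree-path>
fi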
```
**For Options 2 and 3:** Keep worktree (don't clean up).
</the_process>
<examples>
<example>
<scenario>Developer skips test verification before presenting options</scenario>
<code>
# Step 1: Epic closed ✓
bd close bd-1
# Step 2: SKIPPED test verification
# Jump directly to presenting options
"Implementation complete. What would you like to do?
1. Merge back to main locally
2. Push and create PR
..."
User selects Option 1
git checkout main
git merge feature-branch
# Tests fail! Broken code now on main
</code>
<why_it_fails>
- Skipped mandatory test verification
- Merged broken code to main branch
- Other developers pull broken main
- CI/CD fails, blocks deployment
- Must revert, fix, merge again (wasted time)
</why_it_fails>
<correction>
**Follow Step 2 strictly:**
```bash
# After closing epic
bd close bd-1 ✓
# MANDATORY: Verify tests BEFORE presenting options
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
# Agent reports
"Test suite passed (127 tests, 0 failures, 2.3s)"
# NOW present options
"Implementation complete. What would you like to do?
1. Merge back to main locally
..."
```
**What you gain:**
- Confidence tests pass before integration
- No broken code merged to main
- CI/CD stays green
- Other developers unblocked
- Professional workflow
</correction>
</example>
<example>
<scenario>Developer auto-cleans worktree for PR option</scenario>
<code>
# User selects Option 2: Create PR
git push -u origin feature-auth
gh pr create --title "feat: Add OAuth" --body "..."
# Developer immediately cleans up worktree
git worktree remove ../feature-auth-worktree
# PR gets feedback: "Please add rate limiting"
# User: "Can you address the PR feedback?"
# Worktree is gone! Have to recreate it
git worktree add ../feature-auth-worktree feature-auth
# Lost local state, uncommitted experiments, etc.
</code>
<why_it_fails>
- Cleaned worktree when PR still active
- User likely needs worktree for PR feedback
- Have to recreate worktree for changes
- Lost any local uncommitted work
- Inefficient workflow
</why_it_fails>
<correction>
**Option 2 workflow (correct):**
```bash
git push -u origin feature-auth
gh pr create --title "feat: Add OAuth" --body "..."
# Report PR created
"Pull request created: https://github.com/user/repo/pull/42
Keeping worktree at ../feature-auth-worktree for PR updates."
# NO worktree cleanup
# User can address PR feedback in same worktree
```
**Cleanup happens later when:**
- PR is merged
- User explicitly requests cleanup
- User uses finishing-a-development-branch again after PR merges
**What you gain:**
- Worktree available for PR feedback
- No need to recreate worktree
- Preserve local state and experiments
- Efficient PR iteration workflow
</correction>
</example>
<example>
<scenario>Developer discards work without confirmation</scenario>
<code>
# User says: "Actually, discard this work"
# Developer immediately executes
git checkout main
git branch -D feature-experimental
git worktree remove ../feature-experimental-worktree
# 50 commits deleted
# User: "Wait, I meant discard the LAST commit, not the whole branch!"
# Too late - work is gone
</code>
<why_it_fails>
- No confirmation before destructive action
- Misunderstood user intent
- Permanent data loss
- No way to recover work
- Catastrophic for user
</why_it_fails>
<correction>
**Option 4 workflow (correct):**
```
User: "Discard this work"
"This will permanently delete:
- Branch feature-experimental
- All commits:
* a1b2c3d Add OAuth integration
* d4e5f6g Add rate limiting
* g7h8i9j Update tests
... (47 more commits)
- Worktree at ../feature-experimental-worktree
Type 'discard' to confirm."
# WAIT for exact confirmation
User types: "discard"
# NOW execute
git checkout main
git branch -D feature-experimental
git worktree remove ../feature-experimental-worktree
"Branch feature-experimental deleted."
```
**What you gain:**
- User sees exactly what will be deleted
- Explicit confirmation required
- Prevents accidental data loss
- Time to reconsider or clarify
- Safe destructive operations
</correction>
</example>
</examples>
<option_matrix>
| Option | Merge | Push | Keep Worktree | Cleanup Branch | Cleanup Worktree |
|--------|-------|------|---------------|----------------|------------------|
| 1. Merge locally | ✓ | - | - | ✓ | ✓ |
| 2. Create PR | - | ✓ | ✓ | - | - |
| 3. Keep as-is | - | - | ✓ | - | - |
| 4. Discard | - | - | - | ✓ (force) | ✓ |
</option_matrix>
<critical_rules>
## Rules That Have No Exceptions
1. **Never skip test verification** → Tests must pass before presenting options
2. **Present exactly 4 options** → No open-ended questions
3. **Require confirmation for Option 4** → Type "discard" exactly
4. **Keep worktree for Options 2 & 3** → PR and keep-as-is need worktree
5. **Verify tests after merge (Option 1)** → Merged result might break
## Common Excuses
All of these mean: **STOP. Follow the process.**
- "Tests passed earlier, don't need to verify" (Might have changed, verify now)
- "User knows what they want" (Present options, let them choose)
- "Obvious they want to discard" (Require explicit confirmation)
- "PR done, cleanup worktree" (PR likely needs updates, keep worktree)
- "Too many options" (Exactly 4, no more, no less)
</critical_rules>
<verification_checklist>
Before completing:
- [ ] bd epic closed (all child tasks closed)
- [ ] Tests verified passing (via test-runner agent)
- [ ] Presented exactly 4 options (no open-ended questions)
- [ ] Waited for user choice (didn't assume)
- [ ] If Option 4: Got typed "discard" confirmation
- [ ] Worktree cleaned for Options 1, 4 only (not 2, 3)
- [ ] If Option 1: Verified tests on merged result
**Can't check all boxes?** Return to process and complete missing steps.
</verification_checklist>
<integration>
**This skill is called by:**
- hyperpowers:review-implementation (final step after approval)
**Call chain:**
```
hyperpowers:executing-plans → hyperpowers:review-implementation → hyperpowers:finishing-a-development-branch
(if gaps found: STOP)
```
**This skill calls:**
- hyperpowers:test-runner agent (for test verification)
- bd commands (epic management)
- gh commands (PR creation)
**CRITICAL:** Never read `.beads/issues.jsonl` directly. Always use bd CLI commands.
</integration>
<resources>
**Detailed guides:**
- [Git worktree management](resources/worktree-guide.md)
- [PR description templates](resources/pr-templates.md)
- [bd epic reference in PRs](resources/bd-pr-integration.md)
**When stuck:**
- Tasks won't close → Check bd status, verify all child tasks done
- Tests fail → Fix before presenting options (can't proceed)
- User unsure → Explain options, but don't make choice for them
- Worktree won't remove → Might have uncommitted changes, ask user
</resources>

skills/fixing-bugs/SKILL.md Normal file

@@ -0,0 +1,528 @@
---
name: fixing-bugs
description: Use when encountering a bug - complete workflow from discovery through debugging, bd issue, test-driven fix, verification, and closure
---
<skill_overview>
Bug fixing is a complete workflow: reproduce, track in bd, debug systematically, write test, fix, verify, close. Every bug gets a bd issue and regression test.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow exact workflow: create bd issue → debug with tools → write failing test → fix → verify → close.
Never skip tracking or regression test. Use debugging-with-tools for investigation, test-driven-development for fix.
</rigidity_level>
<quick_reference>
| Step | Action | Command/Skill |
|------|--------|---------------|
| **1. Track** | Create bd bug issue | `bd create "Bug: [description]" --type bug` |
| **2. Debug** | Systematic investigation | Use `debugging-with-tools` skill |
| **3. Test (RED)** | Write failing test reproducing bug | Use `test-driven-development` skill |
| **4. Fix (GREEN)** | Implement fix | Minimal code to pass test |
| **5. Verify** | Run full test suite | Use `verification-before-completion` skill |
| **6. Close** | Update and close bd issue | `bd close bd-123` |
**FORBIDDEN:** Fix without bd issue, fix without regression test
**REQUIRED:** Every bug gets tracked, tested, verified before closing
</quick_reference>
<when_to_use>
**Use when you discover a bug:**
- Test failure you need to fix
- Bug reported by user
- Unexpected behavior in development
- Regression from recent change
- Production issue (non-emergency)
**Production emergencies:** Abbreviated workflow OK (hotfix first), but still create bd issue and add regression tests afterward.
</when_to_use>
<the_process>
## 1. Create bd Bug Issue
**Track from the start:**
```bash
bd create "Bug: [Clear description]" --type bug --priority P1
# Returns: bd-123
```
**Document:**
```bash
bd edit bd-123 --design "
## Bug Description
[What's wrong]
## Reproduction Steps
1. Step one
2. Step two
## Expected Behavior
[What should happen]
## Actual Behavior
[What actually happens]
## Environment
[Version, OS, etc.]"
```
## 2. Debug Systematically
**REQUIRED: Use debugging-with-tools skill**
```
Use Skill tool: hyperpowers:debugging-with-tools
```
**debugging-with-tools will:**
- Use internet-researcher to search for error
- Recommend debugger or instrumentation
- Use codebase-investigator to understand context
- Guide to root cause (not symptom)
**Update bd issue with findings:**
```bash
bd edit bd-123 --design "[previous content]
## Investigation
[Root cause found via debugging]
[Tools used: debugger, internet search, etc.]"
```
## 3. Write Failing Test (RED Phase)
**REQUIRED: Use test-driven-development skill**
Write test that reproduces the bug:
```python
def test_rejects_empty_email():
"""Regression test for bd-123: Empty email accepted"""
with pytest.raises(ValidationError):
create_user(email="") # Should fail, currently passes
```
**Run test, verify it FAILS:**
```bash
pytest tests/test_user.py::test_rejects_empty_email
# Expected: FAIL (bug exists - ValidationError is never raised)
# Should PASS after the fix
```
**Why critical:** If test passes before fix, it doesn't test the bug.
## 4. Implement Fix (GREEN Phase)
**Fix the root cause (not symptom):**
```python
def create_user(email: str):
if not email or not email.strip(): # Fix
raise ValidationError("Email required")
# ... rest
```
**Run the test again, verify it now PASSES:**
```bash
pytest tests/test_user.py::test_rejects_empty_email
# FAILED before the fix (no validation, so ValidationError was never raised)
# PASSES after the fix (validation added)
```
## 5. Verify Complete Fix
**REQUIRED: Use verification-before-completion skill**
```bash
# Run full test suite (via test-runner agent)
"Run: pytest"
# Agent returns: All tests pass (including regression test)
```
**Verify:**
- Regression test passes
- All other tests still pass
- No new warnings or errors
- Pre-commit hooks pass
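To run the same checks locally, a minimal sketch assuming a Python project with pre-commit configured (substitute your project's own commands):
```bash
pytest -q                    # full suite, including the new regression test
pytest -q -W error           # optionally treat new warnings as failures
pre-commit run --all-files   # hooks must pass before committing
```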
## 6. Close bd Issue
**Update with fix details:**
```bash
bd edit bd-123 --design "[previous content]
## Fix Implemented
[Description of fix]
[File changed: src/auth/user.py:23]
## Regression Test
[Test added: tests/test_user.py::test_rejects_empty_email]
## Verification
[All tests pass: 145/145]"
bd close bd-123
```
**Commit with bd reference:**
```bash
git commit -m "fix(bd-123): Reject empty email in user creation
Adds validation to prevent empty strings.
Regression test: test_rejects_empty_email
Closes bd-123"
```
</the_process>
<examples>
<example>
<scenario>Developer fixes bug without creating bd issue or regression test</scenario>
<code>
Developer notices: Empty email accepted in user creation
"Fixes" immediately:
```python
def create_user(email: str):
if not email: # Quick fix
raise ValidationError("Email required")
```
Commits: "fix: validate email"
[No bd issue, no regression test]
</code>
<why_it_fails>
**No tracking:**
- Work not tracked in bd (can't see what was fixed)
- No link between commit and bug
- Can't verify fix meets requirements
**No regression test:**
- Bug could come back in future
- Can't prove fix works
- No protection against breaking this again
**Incomplete fix:**
- Doesn't handle `email=" "` (whitespace)
- Didn't debug to understand full issue
**Result:** Bug returns when someone changes validation logic.
</why_it_fails>
<correction>
**Complete workflow:**
```bash
# 1. Track
bd create "Bug: Empty email accepted" --type bug
# Returns: bd-123
# 2. Debug (use debugging-with-tools)
# Investigation reveals: Email validation missing entirely
# Also: Whitespace emails like " " also accepted
# 3. Write failing test (RED)
def test_rejects_empty_email():
with pytest.raises(ValidationError):
create_user(email="")
def test_rejects_whitespace_email():
with pytest.raises(ValidationError):
create_user(email=" ")
# Run: Both FAIL (bug exists - no validation, so ValidationError is never raised)
```
```python
# Both regression tests will PASS once the fix below is in place
# 4. Fix
def create_user(email: str):
if not email or not email.strip():
raise ValidationError("Email required")
# 5. Verify
pytest # All tests pass now, including regression tests
# 6. Close
bd close bd-123
git commit -m "fix(bd-123): Reject empty/whitespace email"
```
**Result:** Bug fixed, tracked, tested, won't regress.
</correction>
</example>
<example>
<scenario>Developer writes test after fix, test passes immediately, doesn't catch regression</scenario>
<code>
Developer fixes validation bug, then writes test:
```python
# Fix first
def validate_email(email):
return "@" in email and len(email) > 0
# Then test
def test_validate_email():
assert validate_email("user@example.com") == True
```
Test runs: PASS
Commits both together.
Later, someone changes validation:
```python
def validate_email(email):
return True # Breaks validation!
```
Test still PASSES (only checks happy path).
</code>
<why_it_fails>
**Test written after fix:**
- Never saw test fail
- Only tests happy path remembered
- Doesn't test the bug that was fixed
- Missed edge case: `validate_email("@@")` returns True (bug!)
**Why it happens:** Skipping TDD RED phase.
</why_it_fails>
<correction>
**TDD approach (RED-GREEN):**
```python
# 1. Write test FIRST that reproduces the bug
def test_validate_email():
# Happy path
assert validate_email("user@example.com") == True
# Bug case (empty email was accepted)
assert validate_email("") == False
# Edge case discovered during debugging
assert validate_email("@@") == False
# 2. Run test - should FAIL (bug exists)
pytest test_validate_email
# FAIL: validate_email("") returned True, expected False
# 3. Implement fix
def validate_email(email):
if not email or len(email) == 0:
return False
return "@" in email and email.count("@") == 1
# 4. Run test - should PASS
pytest test_validate_email
# PASS: All assertions pass
```
**Later regression:**
```python
def validate_email(email):
return True # Someone breaks it
```
**Test catches it:**
```
FAIL: assert validate_email("") == False
Expected False, got True
```
**Result:** Regression test actually prevents bug from returning.
</correction>
</example>
<example>
<scenario>Developer fixes symptom without using debugging-with-tools to find root cause</scenario>
<code>
Bug report: "Application crashes when processing user data"
Error:
```
NullPointerException at UserService.java:45
```
Developer sees line 45:
```java
String email = user.getEmail().toLowerCase(); // Line 45
```
"Obvious fix":
```java
String email = user.getEmail() != null ? user.getEmail().toLowerCase() : "";
```
Bug "fixed"... but crashes continue with different data.
</code>
<why_it_fails>
**Symptom fix:**
- Fixed null check at crash point
- Didn't investigate WHY email is null
- Didn't use debugging-with-tools to find root cause
**Actual root cause:** User object created without email in registration flow. Email is null for all users created via broken endpoint.
**Result:** Null-check applied everywhere, root cause (broken registration) unfixed.
</why_it_fails>
<correction>
**Use debugging-with-tools skill:**
```
# Dispatch internet-researcher
"Search for: NullPointerException UserService getEmail
- Common causes of null email in user objects
- User registration validation patterns"
# Dispatch codebase-investigator
"Investigate:
- How is User object created?
- Where is email set?
- Are there paths where email can be null?
- Which endpoints create users?"
# Agent reports:
"Found: POST /register endpoint creates User without validating email field.
Email is optional in UserDTO but required in User domain object."
```
**Root cause found:** Registration doesn't validate email.
**Proper fix:**
```java
// In registration endpoint
@PostMapping("/register")
public User register(@RequestBody UserDTO dto) {
if (dto.getEmail() == null || dto.getEmail().isEmpty()) {
throw new ValidationException("Email required");
}
return userService.create(dto);
}
```
**Regression test:**
```java
@Test
void registrationRequiresEmail() {
assertThrows(ValidationException.class, () ->
register(new UserDTO(null, "password")));
}
```
**Result:** Root cause fixed, no more null emails created.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Every bug gets a bd issue** → Track from discovery to closure
- Create bd issue before fixing
- Document reproduction steps
- Update with investigation findings
- Close only after verified
2. **Use debugging-with-tools skill** → Systematic investigation required
- Never guess at fixes
- Use internet-researcher for errors
- Use debugger/instrumentation for state
- Find root cause, not symptom
3. **Write failing test first (RED)** → Regression prevention
- Test must fail before fix
- Test must reproduce the bug
- Test must pass after fix
- If test passes immediately, it doesn't test the bug
4. **Verify complete fix** → Use verification-before-completion
- Regression test passes
- Full test suite passes
- No new warnings
- Pre-commit hooks pass
## Common Excuses
All of these mean: Stop, follow complete workflow:
- "Quick fix, no need for bd issue"
- "Obvious bug, no need to debug"
- "I'll add test later"
- "Test passes, must be fixed"
- "Just one line change"
</critical_rules>
<verification_checklist>
Before claiming bug fixed:
- [ ] bd issue created with reproduction steps
- [ ] Used debugging-with-tools to find root cause
- [ ] Wrote test that reproduces bug (RED phase)
- [ ] Verified test FAILS before fix
- [ ] Implemented fix addressing root cause
- [ ] Verified test PASSES after fix
- [ ] Ran full test suite (all pass)
- [ ] Updated bd issue with fix details
- [ ] Closed bd issue
- [ ] Committed with bd reference
</verification_checklist>
<integration>
**This skill calls:**
- debugging-with-tools (systematic investigation)
- test-driven-development (RED-GREEN-REFACTOR cycle)
- verification-before-completion (verify complete fix)
**This skill is called by:**
- When bugs discovered during development
- When test failures need fixing
- When user reports bugs
**Agents used:**
- hyperpowers:internet-researcher (via debugging-with-tools)
- hyperpowers:codebase-investigator (via debugging-with-tools)
- hyperpowers:test-runner (run tests without output pollution)
</integration>
<resources>
**When stuck:**
- Don't understand bug → Use debugging-with-tools skill
- Tempted to skip tracking → Create bd issue first, always
- Test passes immediately → Not testing the bug, rewrite test
- Fix doesn't work → Return to debugging-with-tools, find actual root cause
</resources>


@@ -0,0 +1,707 @@
---
name: managing-bd-tasks
description: Use for advanced bd operations - splitting tasks mid-flight, merging duplicates, changing dependencies, archiving epics, querying metrics, cross-epic dependencies
---
<skill_overview>
Advanced bd operations for managing complex task structures; bd is single source of truth, keep it accurate.
</skill_overview>
<rigidity_level>
HIGH FREEDOM - These are operational patterns, not rigid workflows. Adapt operations to your specific situation while following the core principles (keep bd accurate, merge don't delete, document changes).
</rigidity_level>
<quick_reference>
| Operation | When | Key Command |
|-----------|------|-------------|
| Split task | Task too large mid-flight | Create subtasks, add deps, close parent |
| Merge duplicates | Found duplicate tasks | Combine designs, move deps, close with reference |
| Change dependencies | Dependencies wrong/changed | `bd dep remove` then `bd dep add` |
| Archive epic | Epic complete, hide from views | `bd close bd-X --reason "Archived"` |
| Query metrics | Need status/velocity data | `bd list` + filters + `wc -l` |
| Cross-epic deps | Task depends on other epic | `bd dep add` works across epics |
| Bulk updates | Multiple tasks need same change | Loop with careful review first |
| Recover mistakes | Accidentally closed/wrong dep | `bd update --status` or `bd dep remove` |
**Core principle:** Track all work in bd, update as you go, never batch updates.
</quick_reference>
<when_to_use>
Use this skill for **advanced** bd operations:
- Split task that's too large (discovered mid-implementation)
- Merge duplicate tasks
- Reorganize dependencies after work started
- Archive completed epics (hide from views, keep history)
- Query bd for metrics (velocity, progress, bottlenecks)
- Manage cross-epic dependencies
- Bulk status updates
- Recover from bd mistakes
**For basic operations:** See skills/common-patterns/bd-commands.md (create, show, close, update)
</when_to_use>
<operations>
## Operation 1: Splitting Tasks Mid-Flight
**When:** Task in-progress but turns out too large.
**Example:** Started "Implement authentication" - realize it's 8+ hours of work across multiple areas.
**Process:**
### Step 1: Create subtasks for remaining work
```bash
# Original task bd-5 is in-progress
# Already completed: Login form
# Remaining work gets split:
bd create "Auth API endpoints" --type task --priority P1 --design "
POST /api/login and POST /api/logout endpoints.
## Success Criteria
- [ ] POST /api/login validates credentials, returns JWT
- [ ] POST /api/logout invalidates token
- [ ] Tests pass
"
# Returns bd-12
bd create "Session management" --type task --priority P1 --design "
JWT token tracking and validation.
## Success Criteria
- [ ] JWT generated on login
- [ ] Tokens validated on protected routes
- [ ] Token expiration handled
- [ ] Tests pass
"
# Returns bd-13
bd create "Password hashing" --type task --priority P1 --design "
Secure password hashing with bcrypt.
## Success Criteria
- [ ] Passwords hashed before storage
- [ ] Hash verification on login
- [ ] Tests pass
"
# Returns bd-14
```
### Step 2: Set up dependencies
```bash
# Password hashing must be done first
# API endpoints depend on password hashing
bd dep add bd-12 bd-14 # bd-12 depends on bd-14
# Session management depends on API endpoints
bd dep add bd-13 bd-12 # bd-13 depends on bd-12
# View tree
bd dep tree bd-5
```
### Step 3: Update original task and close
```bash
bd edit bd-5 --design "
Implement user authentication.
## Status
✓ Login form completed (frontend)
✗ Remaining work split into subtasks:
- bd-14: Password hashing (do first)
- bd-12: Auth API endpoints (depends on bd-14)
- bd-13: Session management (depends on bd-12)
## Success Criteria
- [x] Login form renders
- [ ] See subtasks for remaining criteria
"
bd close bd-5 --reason "Split into bd-12, bd-13, bd-14"
```
### Step 4: Work on subtasks in order
```bash
bd ready # Shows bd-14 (no dependencies)
bd update bd-14 --status in_progress
# Complete bd-14...
bd close bd-14
# Now bd-12 is unblocked
bd ready # Shows bd-12
```
---
## Operation 2: Merging Duplicate Tasks
**When:** Discovered two tasks are same thing.
**Example:**
```
bd-7: "Add email validation"
bd-9: "Validate user email addresses"
^ Duplicates
```
### Step 1: Choose which to keep
Based on:
- Which has more complete design?
- Which has more work done?
- Which has more dependencies?
**Example:** Keep bd-7 (more complete)
### Step 2: Merge designs
```bash
bd show bd-7
bd show bd-9
# Combine into bd-7
bd edit bd-7 --design "
Add email validation to user creation and update.
## Background
Originally tracked as bd-7 and bd-9 (now merged).
## Success Criteria
- [ ] Email validated on creation
- [ ] Email validated on update
- [ ] Rejects invalid formats
- [ ] Rejects empty strings
- [ ] Tests cover all cases
## Notes from bd-9
Need validation on update, not just creation.
"
```
### Step 3: Move dependencies
```bash
# Check bd-9 dependencies
bd show bd-9
# If bd-10 depended on bd-9, update to bd-7
bd dep remove bd-10 bd-9
bd dep add bd-10 bd-7
```
### Step 4: Close duplicate with reference
```bash
bd edit bd-9 --design "DUPLICATE: Merged into bd-7
This task was duplicate of bd-7. All work tracked there."
bd close bd-9
```
---
## Operation 3: Changing Dependencies
**When:** Dependencies were wrong or requirements changed.
**Example:** bd-10 depends on bd-8 and bd-9, but bd-9 got merged and bd-10 now also needs bd-11.
```bash
# Remove obsolete dependency
bd dep remove bd-10 bd-9
# Add new dependency
bd dep add bd-10 bd-11
# Verify
bd dep tree bd-1 # If bd-10 in epic bd-1
bd show bd-10 | grep "Blocking"
```
**Common scenarios:**
- Discovered hidden dependency during implementation
- Requirements changed mid-flight
- Tasks reordered for better flow
---
## Operation 4: Archiving Completed Epics
**When:** Epic complete, want to hide from default views but keep history.
```bash
# Verify all tasks closed
bd list --parent bd-1 --status open
# Output: [empty] = all closed
# Archive epic
bd close bd-1 --reason "Archived - completed Oct 2025"
# Won't show in open listings
bd list --status open # bd-1 won't appear
# Still accessible
bd show bd-1 # Still shows full epic
```
**Use archived for:** Completed epics, shipped features, historical reference
**Use open/in-progress for:** Active work
**Use closed with note for:** Cancelled work (explain why)
---
## Operation 5: Querying for Metrics
### Velocity
```bash
# Tasks closed in a given month (here: October 2025)
bd list --status closed | grep "closed_at" | grep "2025-10-" | wc -l
# Tasks closed by epic
bd list --parent bd-1 --status closed | wc -l
```
### Blocked vs Ready
```bash
# Ready to work on
bd ready
bd ready | grep "^bd-" | wc -l
# All open tasks
bd list --status open | wc -l
# Blocked = open - ready
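# Computed directly (sketch; assumes one "bd-N ..." line per task)
echo "$(( $(bd list --status open | grep -c '^bd-') - $(bd ready | grep -c '^bd-') )) blocked task(s)"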
```
### Epic Progress
```bash
# Show tree
bd dep tree bd-1
# Total tasks in epic
bd list --parent bd-1 | grep "^bd-" | wc -l
# Completed tasks
bd list --parent bd-1 --status closed | grep "^bd-" | wc -l
# Percentage = (completed / total) * 100
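# End-to-end (sketch; assumes one "bd-N ..." line per task and at least one task in the epic)
total=$(bd list --parent bd-1 | grep -c "^bd-")
done_count=$(bd list --parent bd-1 --status closed | grep -c "^bd-")
echo "Epic bd-1: ${done_count}/${total} tasks closed ($(( done_count * 100 / total ))%)"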
```
**For detailed metrics guidance:** See [resources/metrics-guide.md](resources/metrics-guide.md)
---
## Operation 6: Cross-Epic Dependencies
**When:** Task in one epic depends on task in different epic.
**Example:**
```
Epic bd-1: User Management
- bd-10: User CRUD API
Epic bd-2: Order Management
- bd-20: Order creation (needs user API)
```
```bash
# Add cross-epic dependency
bd dep add bd-20 bd-10
# bd-20 (in bd-2) depends on bd-10 (in bd-1)
# Check dependencies
bd show bd-20 | grep "Blocking"
# Check ready tasks
bd ready
# Won't show bd-20 until bd-10 closed
```
**Best practices:**
- Document cross-epic dependencies clearly
- Consider if epics should be merged
- Coordinate if different people own epics
---
## Operation 7: Bulk Status Updates
**When:** Need to update multiple tasks.
**Example:** Mark all test tasks closed after suite complete.
```bash
# Get tasks
bd list --parent bd-1 --status open | grep "test:" > test-tasks.txt
# Review list
cat test-tasks.txt
# Update each
while read task_id; do
bd close "$task_id"
done < test-tasks.txt
# Verify
bd list --parent bd-1 --status open | grep "test:"
```
**Use bulk for:**
- Marking completed work closed
- Reopening related tasks
- Updating priorities
**Never bulk:**
- Thoughtless changes
- Hiding problems (closing unfinished tasks)
---
## Operation 8: Recovering from Mistakes
### Accidentally closed task
```bash
bd update bd-15 --status open
# Or if was in progress
bd update bd-15 --status in_progress
```
### Wrong dependency
```bash
bd dep remove bd-10 bd-8 # Remove wrong
bd dep add bd-10 bd-9 # Add correct
```
### Undo design changes
```bash
# bd has no undo, restore from git
git log -p -- .beads/issues.jsonl | grep -A 50 "bd-10"
# Find previous version, copy
bd edit bd-10 --design "[paste previous]"
```
### Epic structure wrong
1. Create new tasks with correct structure
2. Move work to new tasks
3. Close old tasks with reference
4. Don't delete (keep audit trail)
</operations>
<examples>
<example>
<scenario>Developer closes duplicate without merging information</scenario>
<code>
# Found duplicates
bd-7: "Add email validation"
bd-9: "Validate user email addresses"
# Developer just closes bd-9
bd close bd-9
# Loses information from bd-9's design
# bd-9 mentioned validation on update (bd-7 didn't)
# Now that requirement is lost
# Work on bd-7 completes, but misses update validation
# Bug ships to production
</code>
<why_it_fails>
- Closed duplicate without reading its design
- Lost requirement mentioned only in duplicate
- Information not preserved
- Incomplete implementation ships
- bd not accurate source of truth
</why_it_fails>
<correction>
**Correct process:**
```bash
# Read BOTH tasks
bd show bd-7 # Only mentions validation on creation
bd show bd-9 # Mentions validation on update too
# Merge information
bd edit bd-7 --design "
Email validation for user creation and update.
## Background
Merged from bd-9.
## Success Criteria
- [ ] Validate on creation (from bd-7)
- [ ] Validate on update (from bd-9) ← Preserved!
- [ ] Tests for both cases
"
# Then close duplicate with reference
bd edit bd-9 --design "DUPLICATE: Merged into bd-7"
bd close bd-9
```
**What you gain:**
- All requirements preserved
- bd remains accurate
- No information lost
- Complete implementation
- Audit trail clear
</correction>
</example>
<example>
<scenario>Developer doesn't split large task, struggles through</scenario>
<code>
bd-15: "Implement payment processing" (started)
# 3 hours in, developer realizes:
# - Need Stripe API integration (4 hours)
# - Need payment validation (2 hours)
# - Need retry logic (3 hours)
# - Need receipt generation (2 hours)
# Total: 11 more hours!
# Developer thinks: "Too late to split, I'll power through"
# Works 14 hours straight
# Gets exhausted, makes mistakes
# Ships buggy code
# Has to fix in production
</code>
<why_it_fails>
- Didn't split when discovered size
- "Sunk cost" rationalization (already started)
- No clear stopping points
- Exhaustion leads to bugs
- Can't track progress granularly
- If interrupted, hard to resume
</why_it_fails>
<correction>
**Correct approach (split mid-flight):**
```bash
# 3 hours in, stop and split
bd edit bd-15 --design "
Implement payment processing.
## Status
✓ Completed: Payment form UI (3 hours)
✗ Split remaining work into subtasks:
- bd-20: Stripe API integration
- bd-21: Payment validation
- bd-22: Retry logic
- bd-23: Receipt generation
"
bd close bd-15 --reason "Split into bd-20, bd-21, bd-22, bd-23"
# Create subtasks with dependencies
bd create "Stripe API integration" ... # bd-20
bd create "Payment validation" ... # bd-21
bd create "Retry logic" ... # bd-22
bd create "Receipt generation" ... # bd-23
bd dep add bd-21 bd-20 # Validation needs API
bd dep add bd-22 bd-20 # Retry needs API
bd dep add bd-23 bd-22 # Receipts after retry works
# Work on one at a time
bd update bd-20 --status in_progress
# Complete bd-20 (4 hours)
bd close bd-20
# Take break
# Next day: bd-21
```
**What you gain:**
- Clear stopping points (can pause between tasks)
- Track progress granularly
- No exhaustion (spread over days)
- Better quality (not rushed)
- If interrupted, easy to resume
- Each subtask gets proper focus
</correction>
</example>
<example>
<scenario>Developer adds dependency but doesn't update dependent task</scenario>
<code>
# Initial state
bd-10: "Add user dashboard" (in progress)
bd-15: "Add analytics to dashboard" (blocked on bd-10)
# During bd-10 implementation, discover need for new API
bd create "Analytics API endpoints" ... # Creates bd-20
# Add dependency
bd dep add bd-15 bd-20 # bd-15 now depends on bd-20 too
# But bd-10 completes, closes
bd close bd-10
# bd-15 shows as ready (bd-10 closed)
bd ready # Shows bd-15
# Developer starts bd-15
bd update bd-15 --status in_progress
# Immediately blocked - needs bd-20!
# bd-20 not done yet
# Have to stop work on bd-15
# Time wasted
</code>
<why_it_fails>
- Added dependency but didn't document in bd-15
- bd-15's design doesn't mention bd-20 requirement
- Appears ready when not actually ready
- Wastes time starting work that's blocked
- Dependencies not obvious from task design
</why_it_fails>
<correction>
**Correct approach:**
```bash
# Create new API task
bd create "Analytics API endpoints" ... # bd-20
# Add dependency
bd dep add bd-15 bd-20
# UPDATE bd-15 to document new requirement
bd edit bd-15 --design "
Add analytics to dashboard.
## Dependencies
- bd-10: User dashboard (completed)
- bd-20: Analytics API endpoints (NEW - discovered during bd-10)
## Success Criteria
- [ ] Integrate with analytics API (bd-20)
- [ ] Display charts on dashboard
- [ ] Tests pass
"
# Close bd-10
bd close bd-10
# Check ready
bd ready # Does NOT show bd-15 (blocked on bd-20)
# Work on bd-20 first
bd update bd-20 --status in_progress
# Complete bd-20
bd close bd-20
# NOW bd-15 is truly ready
bd ready # Shows bd-15
```
**What you gain:**
- Dependencies documented in task design
- Clear why task is blocked
- No false "ready" signals
- Work proceeds in correct order
- No wasted time starting blocked work
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Keep bd accurate** → Single source of truth for all work
2. **Merge duplicates, don't just close** → Preserve information from both
3. **Split large tasks when discovered** → Not after struggling through
4. **Document dependency changes** → Update task designs when deps change
5. **Update as you go** → Never batch updates "for later"
## Common Excuses
All of these mean: **STOP. Follow the operation properly.**
- "Task too complex to split" (Every task can be broken down)
- "Just close duplicate" (Merge first, preserve information)
- "Won't track this in bd" (All work tracked, no exceptions)
- "bd is out of date, update later" (Later never comes, update now)
- "This dependency doesn't matter" (Dependencies prevent blocking, they matter)
- "Too much overhead to split" (More overhead to fail huge task)
</critical_rules>
<bd_best_practices>
**For detailed guidance on:**
- Task naming conventions
- Priority guidelines (P0-P4)
- Task granularity
- Success criteria
- Dependency management
**See:** [resources/task-naming-guide.md](resources/task-naming-guide.md)
</bd_best_practices>
<red_flags>
Watch for these patterns:
- **Multiple in-progress tasks** → Focus on one
- **Tasks stuck in-progress for days** → Blocked? Split it?
- **Many open tasks, no dependencies** → Prioritize!
- **Epics with 20+ tasks** → Too large, split epic
- **Closed tasks, incomplete criteria** → Not done, reopen
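A quick scan for the first and fourth flags, as a hedged sketch — it assumes `bd list` prints one `bd-NN: ...` line per task (the same pattern the grep examples in this plugin rely on) and that epic membership can be listed with `--parent`:
```bash
# Red-flag scan (adjust bd-1 to your epic ID)
wip=$(bd list --status in-progress | grep -c "^bd-")
if [ "$wip" -gt 1 ]; then
  echo "Red flag: $wip tasks in progress - focus on one"
fi
epic_tasks=$(bd list --parent bd-1 | grep -c "^bd-")
if [ "$epic_tasks" -gt 20 ]; then
  echo "Red flag: epic bd-1 has $epic_tasks tasks - consider splitting the epic"
fi
```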
</red_flags>
<verification_checklist>
After advanced bd operations:
- [ ] bd still accurate (reflects reality)
- [ ] Dependencies correct (nothing blocked incorrectly)
- [ ] Duplicate information merged (not lost)
- [ ] Changes documented in task designs
- [ ] Ready tasks are actually unblocked
- [ ] Metrics queries return sensible numbers
- [ ] No orphaned tasks (all part of epics)
**Can't check all boxes?** Review operation and fix issues.
</verification_checklist>
<integration>
**This skill covers:** Advanced bd operations
**For basic operations:**
- skills/common-patterns/bd-commands.md
**Related skills:**
- hyperpowers:writing-plans (creating epics and tasks)
- hyperpowers:executing-plans (working through tasks)
- hyperpowers:verification-before-completion (closing tasks properly)
**CRITICAL:** Use bd CLI commands, never read `.beads/issues.jsonl` directly.
</integration>
<resources>
**Detailed guides:**
- [Metrics guide (cycle time, WIP limits)](resources/metrics-guide.md)
- [Task naming conventions](resources/task-naming-guide.md)
- [Dependency patterns](resources/dependency-patterns.md)
**When stuck:**
- Task seems unsplittable → Ask user how to break it down
- Duplicates complex → Merge designs carefully, don't rush
- Dependencies tangled → Draw diagram, untangle systematically
- bd out of sync → Stop everything, update bd first
</resources>


@@ -0,0 +1,166 @@
# bd Metrics Guide
This guide covers the key metrics for tracking work in bd.
## Cycle Time vs. Lead Time
**Two distinct time measurements:**
### Cycle Time
- **Definition**: Time from "work started" to "work completed"
- **Start**: When task moves to "in-progress" status
- **End**: When task moves to "closed" status
- **Measures**: How efficiently work flows through active development
- **Use**: Identify process inefficiencies, improve development speed
```bash
# Calculate cycle time for completed task
bd show bd-5 | grep "status.*in-progress" # Get start time
bd show bd-5 | grep "status.*closed" # Get end time
# Difference = cycle time
```
### Lead Time
- **Definition**: Time from "request created" to "delivered to customer"
- **Start**: When task is created (enters backlog)
- **End**: When task is deployed/delivered
- **Measures**: Overall responsiveness to requests
- **Use**: Set realistic expectations, measure total process duration
```bash
# Calculate lead time for completed task
bd show bd-5 | grep "created_at" # Get creation time
bd show bd-5 | grep "deployed_at" # Get deployment time (if tracked)
# Difference = lead time
```
### Key Differences
| Metric | Starts | Ends | Includes Waiting? | Measures |
|--------|--------|------|-------------------|----------|
| **Cycle Time** | In-progress | Closed | No | Development efficiency |
| **Lead Time** | Created | Deployed | Yes | Total responsiveness |
### Example
```
Task created: Monday 9am (enters backlog)
↓ [waits 2 days]
Task started: Wednesday 9am (moved to in-progress)
↓ [active work]
Task completed: Wednesday 5pm (moved to closed)
↓ [waits for deployment]
Task deployed: Thursday 2pm (delivered)
Cycle Time: 8 hours (Wednesday 9am → 5pm)
Lead Time: 3 days, 5 hours (Monday 9am → Thursday 2pm)
```
### Why Both Matter
- **Short cycle time, long lead time**: Work is efficient once started, but tasks wait too long in backlog
- Fix: Reduce WIP, start fewer tasks, finish faster
- **Long cycle time, little waiting**: Work starts almost immediately but takes a long time once started (cycle time dominates lead time)
- Fix: Split tasks smaller, remove blockers, improve focus
- **Both long**: Overall process is slow
- Fix: Address both backlog management AND development efficiency
### Tracking Over Time
```bash
# Average cycle time (manual calculation)
# For each closed task: (closed_at - started_at)
# Sum and divide by task count
# Trend analysis
# Week 1: Avg cycle time = 3 days
# Week 2: Avg cycle time = 2 days ✅ Improving
# Week 3: Avg cycle time = 4 days ❌ Getting worse
```
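The manual calculation can be scripted. This is a rough sketch, not a built-in bd feature: it assumes a `closed` status filter like the ones above, that `bd show` prints ISO-8601 `started_at:` and `closed_at:` lines (hypothetical field names — adapt the grep/awk), and GNU `date -d`:
```bash
# Average cycle time across closed tasks (field names assumed - adjust to your bd output)
total=0
count=0
for task in $(bd list --status closed | grep "^bd-" | cut -d: -f1); do
  started=$(bd show "$task" | grep -m1 "started_at" | awk '{print $2}')
  closed=$(bd show "$task" | grep -m1 "closed_at" | awk '{print $2}')
  if [ -z "$started" ] || [ -z "$closed" ]; then
    continue
  fi
  secs=$(( $(date -d "$closed" +%s) - $(date -d "$started" +%s) ))
  total=$((total + secs))
  count=$((count + 1))
done
if [ "$count" -gt 0 ]; then
  echo "Average cycle time: $((total / count / 3600)) hours across $count tasks"
fi
```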
### Improvement Targets
- **Cycle time**: Reduce by splitting tasks, removing blockers, improving focus
- **Lead time**: Reduce by prioritizing backlog, reducing WIP, faster deployment
## Work in Progress (WIP)
```bash
# All in-progress tasks
bd list --status in-progress
# Count
bd list --status in-progress | grep "^bd-" | wc -l
```
### WIP Limits
Work in Progress limits prevent overcommitment and identify bottlenecks.
**Setting WIP limits:**
- **Personal WIP limit**: 1-2 tasks in-progress at a time
- **Team WIP limit**: Depends on team size and workflow stages
- **Rule of thumb**: WIP limit = (Team size ÷ 2) + 1
**Example for individual developer:**
```
✅ Good: 1 task in-progress, 0-1 in code review
❌ Bad: 5 tasks in-progress simultaneously
```
**Example for team of 6:**
```
Workflow stages and limits:
- Backlog: Unlimited
- Ready: 8 items max
- In Progress: 4 items max (team size ÷ 2 + 1)
- Code Review: 3 items max
- Testing: 2 items max
- Done: Unlimited
```
### Why WIP Limits Matter
1. **Focus:** Fewer tasks means deeper focus, faster completion
2. **Flow:** Prevents bottlenecks from accumulating
3. **Quality:** Less context switching, fewer mistakes
4. **Visibility:** High WIP indicates blocked work or overcommitment
### Monitoring WIP
```bash
# Check personal WIP
bd list --status in-progress | grep "assignee:me" | wc -l
# If > 2: Focus on finishing before starting new work
```
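To make the check actionable, a minimal sketch that warns when the limit is exceeded (same `bd-NN:` line format assumed as above):
```bash
# Warn when personal WIP exceeds the limit
WIP_LIMIT=2
wip=$(bd list --status in-progress | grep -c "^bd-")
if [ "$wip" -gt "$WIP_LIMIT" ]; then
  echo "WIP is $wip (limit $WIP_LIMIT) - finish something before starting new work"
fi
```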
### Red Flags
- WIP consistently at or above limit (need more capacity or smaller tasks)
- WIP growing week-over-week (work piling up, not finishing)
- WIP high but velocity low (tasks blocked or too large)
### Response to High WIP
1. Finish existing tasks before starting new ones
2. Identify and remove blockers
3. Split large tasks
4. Add capacity (if chronically high)
## Bottleneck Identification
```bash
# Find tasks that are blocking others
# (Tasks that many other tasks depend on)
for task in $(bd list --status open | grep "^bd-" | cut -d: -f1); do
  echo -n "$task: "
  # Count open tasks whose "bd show" output lists "depends on $task"
  bd list --status open | grep "^bd-" | cut -d: -f1 \
    | xargs -I {} sh -c "bd show {} | grep -q \"depends on $task\" && echo {}" | wc -l
done | sort -t: -k2 -n -r
# Tasks with the most dependents are listed first (top bottlenecks)
```


@@ -0,0 +1,276 @@
# bd Task Naming and Quality Guidelines
This guide covers best practices for naming tasks, setting priorities, sizing work, and defining success criteria.
## Task Naming Conventions
### Principles
- **Actionable**: Start with action verbs (add, fix, update, remove, refactor, implement)
- **Specific**: Include enough context to understand without opening
- **Consistent**: Follow project-wide templates
### Templates by Task Type
#### User Stories
**Template:**
```
As a [persona], I want [something] so that [reason]
```
**Examples:**
```
As a customer, I want one-click checkout so that I can purchase quickly
As an admin, I want bulk user import so that I can onboard teams efficiently
As a developer, I want API rate limiting so that I can prevent abuse
```
**When to use:** Features from user perspective
#### Bug Reports
**Template 1 (Capability-focused):**
```
[User type] can't [action they should be able to do]
```
**Examples:**
```
New users can't view home screen after signup
Admin users can't export user data to CSV
Guest users can't add items to cart
```
**Template 2 (Event-focused):**
```
When [action/event], [system feature] doesn't work
```
**Examples:**
```
When clicking Submit, payment form doesn't validate
When uploading large files, progress bar freezes
When session expires, user isn't redirected to login
```
**When to use:** Describing broken functionality
#### Tasks (Implementation Work)
**Template:**
```
[Verb] [object] [context]
```
**Examples:**
```
feat(auth): Implement JWT token generation
fix(api): Handle empty email validation in user endpoint
test: Add integration tests for payment flow
refactor: Extract validation logic from UserService
docs: Update API documentation for v2 endpoints
```
**When to use:** Technical implementation tasks
#### Features (High-Level Capabilities)
**Template:**
```
[Verb] [capability] for [user/system]
```
**Examples:**
```
Add dark mode toggle for Settings page
Implement rate limiting for API endpoints
Enable two-factor authentication for admin users
Build export functionality for report data
```
**When to use:** Feature-level work (may become epic with multiple tasks)
### Context Guidelines
- **Which component**: "in login flow", "for user API", "in Settings page"
- **Which user type**: "for admins", "for guests", "for authenticated users"
- **Avoid jargon** in user stories (user perspective, not technical)
- **Be specific** in technical tasks (exact API, file, function)
### Good vs Bad Names
**Good names:**
- `feat(auth): Implement JWT token generation`
- `fix(api): Handle empty email validation in user endpoint`
- `As a customer, I want CSV export so that I can analyze my data`
- `test: Add integration tests for payment flow`
- `refactor: Extract validation logic from UserService`
**Bad names:**
- `fix stuff` (vague - what stuff?)
- `implement feature` (vague - which feature?)
- `work on backend` (vague - what work?)
- `Report` (noun, not action - should be "Generate Q4 Sales Report")
- `API endpoint` (incomplete - "Add GET /users endpoint" better)
## Priority Guidelines
Use bd's priority system consistently:
- **P0:** Critical production bug (drop everything)
- **P1:** Blocking other work (do next)
- **P2:** Important feature work (normal priority)
- **P3:** Nice to have (do when time permits)
- **P4:** Someday/maybe (backlog)
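A minimal sketch of applying these levels with the `--priority` flag used elsewhere in this plugin (names follow the templates above):
```bash
bd create "fix(payments): Charge fails for amounts over 1000" --type task --priority P0   # critical production bug
bd create "feat(auth): Implement JWT token generation" --type task --priority P2          # normal feature work
bd create "docs: Update API documentation for v2 endpoints" --type task --priority P3     # nice to have
```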
## Granularity Guidelines
**Good task size:**
- 2-4 hours of focused work
- Can complete in one sitting
- Clear deliverable
**Too large:**
- Takes multiple days
- Multiple independent pieces
- Should be split
**Too small:**
- Takes 15 minutes
- Too granular to track
- Combine with related tasks
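When a task turns out too large mid-flight, split it with the same create/dep pattern used elsewhere in this plugin — a sketch with illustrative IDs:
```bash
# "Implement payment processing" is multi-day work - split into 2-4 hour slices
bd create "feat(payments): Stripe API client" --type task --priority P2              # e.g. bd-20
bd create "feat(payments): Payment amount validation" --type task --priority P2      # e.g. bd-21
bd create "feat(payments): Retry logic for failed charges" --type task --priority P2 # e.g. bd-22
bd dep add bd-21 bd-20   # validation needs the API client
bd dep add bd-22 bd-20   # retry needs the API client
# Document the split in the original task's design (bd edit --design) before closing it
```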
## Success Criteria: Acceptance Criteria vs. Definition of Done
**Two distinct types of completion criteria:**
### Acceptance Criteria (Per-Task, Functional)
**Definition:** Specific, measurable requirements unique to each task that define functional completeness from user/business perspective.
**Scope:** Unique to each backlog item (bug, task, story)
**Purpose:** "Does this feature work correctly?"
**Owner:** Product owner/stakeholder defines, team validates
**Format:** Checklist or scenarios
```markdown
## Acceptance Criteria
- [ ] User can upload CSV files up to 10MB
- [ ] System validates CSV format before processing
- [ ] User sees progress bar during upload
- [ ] User receives success message with row count
- [ ] Invalid files show specific error messages
```
**Scenario format (Given/When/Then):**
```markdown
## Acceptance Criteria
Scenario 1: Valid file upload
Given a user is on the upload page
When they select a valid CSV file
Then the file uploads successfully
And they see confirmation with row count
Scenario 2: Invalid file format
Given a user selects a non-CSV file
When they try to upload
Then they see error: "Only CSV files supported"
```
### Definition of Done (Universal, Quality)
**Definition:** Universal checklist that applies to ALL work items to ensure consistent quality and release-readiness.
**Scope:** Applies to every single task (bugs, features, stories)
**Purpose:** "Is this work complete to our quality standards?"
**Owner:** Team defines and maintains (reviewed in retrospectives)
**Example DoD:**
```markdown
## Definition of Done (applies to all tasks)
- [ ] Code written and peer-reviewed
- [ ] Unit tests written and passing (>80% coverage)
- [ ] Integration tests passing
- [ ] No linter warnings
- [ ] Documentation updated (if public API)
- [ ] Manual testing completed (if UI)
- [ ] Deployed to staging environment
- [ ] Product owner accepted
- [ ] Commit references bd task ID
```
### Key Differences
| Aspect | Acceptance Criteria | Definition of Done |
|--------|--------------------|--------------------|
| **Scope** | Per-task (unique) | All tasks (universal) |
| **Focus** | Functional requirements | Quality standards |
| **Question** | "Does it work?" | "Is it done?" |
| **Owner** | Product owner | Team |
| **Changes** | Per task | Rarely (retrospectives) |
| **Examples** | "User can export data" | "Tests pass, code reviewed" |
### How to Use Both
**When creating a task:**
1. **Define Acceptance Criteria** (task-specific functional requirements)
2. **Reference Definition of Done** (don't duplicate it in task)
```markdown
bd create "Implement CSV file upload" --design "
## Acceptance Criteria
- [ ] User can upload CSV files up to 10MB
- [ ] System validates CSV format
- [ ] Progress bar shows during upload
- [ ] Success message displays row count
## Notes
Must also meet team's Definition of Done (see project wiki)
"
```
**Before closing a task:**
1. ✅ Verify all Acceptance Criteria met (functional)
2. ✅ Verify Definition of Done met (quality)
3. Only then close task
**Bad practice:**
```markdown
## Success Criteria
- [ ] CSV upload works
- [ ] Tests pass ← This is DoD, not acceptance criteria
- [ ] Code reviewed ← This is DoD, not acceptance criteria
- [ ] No linter warnings ← This is DoD, not acceptance criteria
```
**Good practice:**
```markdown
## Acceptance Criteria (functional, task-specific)
- [ ] CSV upload handles files up to 10MB
- [ ] Validation rejects non-CSV formats
- [ ] Progress bar updates during upload
## Definition of Done (quality, universal - referenced, not duplicated)
See team DoD checklist (applies to all tasks)
```
## Dependency Management
**Good dependency usage:**
- Technical dependency (feature B needs feature A's code)
- Clear ordering (must do A before B)
- Unblocks work (completing A unblocks B)
**Bad dependency usage:**
- "Feels like should be done first" (vague)
- No technical relationship (just preference)
- Circular dependencies (A depends on B depends on A)
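A minimal sketch of the good pattern, with illustrative IDs and the `bd dep add <dependent> <dependency>` direction used throughout this plugin:
```bash
# bd-31 (validation) technically needs bd-30 (API client) code - a real dependency
bd dep add bd-31 bd-30
bd ready       # bd-30 shows as ready; bd-31 stays hidden until bd-30 closes
bd close bd-30
bd ready       # now bd-31 appears - completing bd-30 unblocked it
```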


@@ -0,0 +1,541 @@
---
name: refactoring-safely
description: Use when refactoring code - test-preserving transformations in small steps, running tests between each change
---
<skill_overview>
Refactoring changes code structure without changing behavior; tests must stay green throughout or you're rewriting, not refactoring.
</skill_overview>
<rigidity_level>
MEDIUM FREEDOM - Follow the change→test→commit cycle strictly, but adapt the specific refactoring patterns to your language and codebase.
</rigidity_level>
<quick_reference>
| Step | Action | Verify |
|------|--------|--------|
| 1 | Run full test suite | ALL pass |
| 2 | Create bd refactoring task | Track work |
| 3 | Make ONE small change | Compiles |
| 4 | Run tests immediately | ALL still pass |
| 5 | Commit with descriptive message | History clear |
| 6 | Repeat 3-5 until complete | Each step safe |
| 7 | Final verification & close bd | Done |
**Core cycle:** Change → Test → Commit (repeat until complete)
</quick_reference>
<when_to_use>
- Improving code structure without changing functionality
- Extracting duplicated code into shared utilities
- Renaming for clarity
- Reorganizing file/module structure
- Simplifying complex code while preserving behavior
**Don't use for:**
- Changing functionality (use feature development)
- Fixing bugs (use hyperpowers:fixing-bugs)
- Adding features while restructuring (do separately)
- Code without tests (write tests first using hyperpowers:test-driven-development)
</when_to_use>
<the_process>
## 1. Verify Tests Pass
**BEFORE any refactoring:**
```bash
# Use test-runner agent to keep context clean
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
```
**Verify:** ALL tests pass. If any fail, fix them FIRST, then refactor.
**Why:** Failing tests mean you can't detect if refactoring breaks things.
---
## 2. Create bd Task for Refactoring
Track the refactoring work:
```bash
bd create "Refactor: Extract user validation logic" \
--type task \
--priority P2
bd edit bd-456 --design "
## Goal
Extract user validation logic from UserService into separate Validator class.
## Why
- Validation duplicated across 3 services
- Makes testing individual validations difficult
- Violates single responsibility
## Approach
1. Create UserValidator class
2. Extract email validation
3. Extract name validation
4. Extract age validation
5. Update UserService to use validator
6. Remove duplication from other services
## Success Criteria
- All existing tests still pass
- No behavior changes
- Validator has 100% test coverage
"
bd update bd-456 --status in_progress
```
---
## 3. Make ONE Small Change
The smallest transformation that compiles.
**Examples of "small":**
- Extract one method
- Rename one variable
- Move one function to different file
- Inline one constant
- Extract one interface
**NOT small:**
- Extracting multiple methods at once
- Renaming + moving + restructuring
- "While I'm here" improvements
**Example:**
```rust
// Before
fn create_user(name: &str, email: &str) -> Result<User> {
if email.is_empty() {
return Err(Error::InvalidEmail);
}
if !email.contains('@') {
return Err(Error::InvalidEmail);
}
let user = User { name, email };
Ok(user)
}
// After - ONE small change (extract email validation)
fn create_user(name: &str, email: &str) -> Result<User> {
validate_email(email)?;
let user = User { name, email };
Ok(user)
}
fn validate_email(email: &str) -> Result<()> {
if email.is_empty() {
return Err(Error::InvalidEmail);
}
if !email.contains('@') {
return Err(Error::InvalidEmail);
}
Ok(())
}
```
---
## 4. Run Tests Immediately
After EVERY small change:
```bash
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
```
**Verify:** ALL tests still pass.
**If tests fail:**
1. STOP
2. Undo the change: `git restore src/file.rs`
3. Understand why it broke
4. Make smaller change
5. Try again
**Never proceed with failing tests.**
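A minimal recovery sketch (the file path is illustrative):
```bash
git restore src/user_service.rs   # discard the uncommitted transformation
git status --short                # confirm the tree matches the last green commit
# then retry with a smaller change
```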
---
## 5. Commit the Small Change
Commit each safe transformation:
```bash
Dispatch hyperpowers:test-runner agent: "Run: git add src/user_service.rs && git commit -m 'refactor(bd-456): extract email validation to function
No behavior change. All tests pass.
Part of bd-456'"
```
**Why commit so often:**
- Easy to undo if next step breaks
- Clear history of transformations
- Can review each step independently
- Proves tests passed at each point
---
## 6. Repeat Until Complete
Repeat steps 3-5 for each small transformation:
```
1. Extract validate_email() ✓ (committed)
2. Extract validate_name() ✓ (committed)
3. Extract validate_age() ✓ (committed)
4. Create UserValidator struct ✓ (committed)
5. Move validations into UserValidator ✓ (committed)
6. Update UserService to use validator ✓ (committed)
7. Remove validation from OrderService ✓ (committed)
8. Remove validation from AccountService ✓ (committed)
```
**Pattern:** change → test → commit (repeat)
---
## 7. Final Verification
After all transformations complete:
```bash
# Full test suite
Dispatch hyperpowers:test-runner agent: "Run: cargo test"
# Linter
Dispatch hyperpowers:test-runner agent: "Run: cargo clippy"
```
**Review the changes:**
```bash
# See all refactoring commits
git log --oneline | grep "bd-456"
# Review full diff
git diff main...HEAD
```
**Checklist:**
- [ ] All tests pass
- [ ] No new warnings
- [ ] No behavior changes
- [ ] Code is cleaner/simpler
- [ ] Each commit is small and safe
**Close bd task:**
```bash
bd edit bd-456 --design "
... (append to existing design)
## Completed
- Created UserValidator class with email, name, age validation
- Removed duplicated validation from 3 services
- All tests pass (verified)
- No behavior changes
- 8 small transformations, each tested
"
bd close bd-456
```
</the_process>
<examples>
<example>
<scenario>Developer changes behavior while "refactoring"</scenario>
<code>
// Original code
fn validate_email(email: &str) -> Result<()> {
if email.is_empty() {
return Err(Error::InvalidEmail);
}
if !email.contains('@') {
return Err(Error::InvalidEmail);
}
Ok(())
}
// "Refactored" version
fn validate_email(email: &str) -> Result<()> {
if email.is_empty() {
return Err(Error::InvalidEmail);
}
if !email.contains('@') {
return Err(Error::InvalidEmail);
}
// NEW: Added extra validation
if !email.contains('.') { // BEHAVIOR CHANGE
return Err(Error::InvalidEmail);
}
Ok(())
}
</code>
<why_it_fails>
- This changes behavior (now rejects emails like "user@localhost")
- Tests might fail, or worse, pass and ship breaking change
- Not refactoring - this is modifying functionality
- Users who relied on old behavior experience regression
</why_it_fails>
<correction>
**Correct approach:**
1. Extract validation (pure refactoring, no behavior change)
2. Commit with tests passing
3. THEN add new validation as separate feature with new tests
4. Two clear commits: refactoring vs. feature addition
**What you gain:**
- Clear history of what changed when
- Easy to revert feature without losing refactoring
- Tests document exact behavior changes
- No surprises in production
</correction>
</example>
<example>
<scenario>Developer does big-bang refactoring</scenario>
<code>
# Changes made all at once:
- Renamed 15 functions across 5 files
- Extracted 3 new classes
- Moved code between 10 files
- Reorganized module structure
- Updated all import statements
# Then runs tests
$ cargo test
... 23 test failures ...
# Now what? Which change broke what?
</code>
<why_it_fails>
- Can't identify which specific change broke tests
- Reverting means losing ALL work
- Fixing requires re-debugging entire refactoring
- Wastes hours trying to untangle failures
- Might give up and revert everything
</why_it_fails>
<correction>
**Correct approach:**
1. Rename ONE function → test → commit
2. Extract ONE class → test → commit
3. Move ONE file → test → commit
4. Continue one change at a time
**If test fails:**
- Know exactly which change broke it
- Revert ONE commit, not all work
- Fix or make smaller change
- Continue from known-good state
**What you gain:**
- Tests break → immediately know why
- Each commit is reviewable independently
- Can stop halfway with useful progress
- Confidence from continuous green tests
- Clear history for future developers
</correction>
</example>
<example>
<scenario>Developer refactors code without tests</scenario>
<code>
// Legacy code with no tests
fn process_payment(amount: f64, user_id: i64) -> Result<PaymentId> {
// 200 lines of complex payment logic
// Multiple edge cases
// No tests exist
}
// Developer refactors without tests:
// - Extracts 5 methods
// - Renames variables
// - Simplifies conditionals
// - "Looks good to me!"
// Deploys to production
// 💥 Payments fail for amounts over $1000
// Edge case handling was accidentally changed
</code>
<why_it_fails>
- No tests to verify behavior preserved
- Complex logic has hidden edge cases
- Subtle behavior changes go unnoticed
- Breaks in production, not development
- Costs customer trust and emergency debugging
</why_it_fails>
<correction>
**Correct approach:**
1. **Write tests FIRST** (using hyperpowers:test-driven-development)
- Test happy path
- Test all edge cases (amounts over $1000, etc.)
- Test error conditions
- Run tests → all pass (documenting current behavior)
2. **Then refactor with tests as safety net**
- Extract method → run tests → commit
- Rename → run tests → commit
- Simplify → run tests → commit
3. **Tests catch any behavior changes immediately**
**What you gain:**
- Confidence behavior is preserved
- Edge cases documented in tests
- Catches subtle changes before production
- Future refactoring is also safe
- Tests serve as documentation
</correction>
</example>
</examples>
<refactor_vs_rewrite>
## When to Refactor
- Tests exist and pass
- Changes are incremental
- Business logic stays same
- Can transform in small, safe steps
- Each step independently valuable
## When to Rewrite
- No tests exist (write tests first, then refactor)
- Fundamental architecture change needed
- Easier to rebuild than modify
- Requirements changed significantly
- After 3+ failed refactoring attempts
**Rule:** If you need to change test assertions (not just add tests), you're rewriting, not refactoring.
## Strangler Fig Pattern (Hybrid)
**When to use:**
- Need to replace legacy system but can't tolerate downtime
- Want incremental migration with continuous monitoring
- System too large to refactor in one go
**How it works:**
1. **Transform:** Create modernized components alongside legacy
2. **Coexist:** Both systems run in parallel (façade routes requests)
3. **Eliminate:** Retire old functionality piece by piece
**Example:**
```
Legacy: Monolithic user service (50K LOC)
Goal: Microservices architecture
Step 1 (Transform):
- Create new UserService microservice
- Implement user creation endpoint
- Tests pass in isolation
Step 2 (Coexist):
- Add routing layer (façade)
- Route POST /users to new service
- Route GET /users to legacy service (for now)
- Monitor both, compare results
Step 3 (Eliminate):
- Once confident, migrate GET /users to new service
- Remove user creation from legacy
- Repeat for remaining endpoints
```
**Benefits:**
- Incremental replacement reduces risk
- Legacy continues operating during transition
- Can pause/rollback at any point
- Each migration step is independently valuable
**Use refactoring within components, Strangler Fig for replacing systems.**
</refactor_vs_rewrite>
<critical_rules>
## Rules That Have No Exceptions
1. **Tests must stay green** throughout refactoring → If they fail, you changed behavior (stop and undo)
2. **Commit after each small change** → Large commits hide which change broke what
3. **One transformation at a time** → Multiple changes = impossible to debug failures
4. **Run tests after EVERY change** → Delayed testing doesn't tell you which change broke it
5. **If tests fail 3+ times, question approach** → Might need to rewrite instead, or add tests first
## Common Excuses
All of these mean: **Stop and return to the change→test→commit cycle**
- "Small refactoring, don't need tests between steps"
- "I'll test at the end"
- "Tests are slow, I'll run once at the end"
- "Just fixing bugs while refactoring" (bug fixes = behavior changes = not refactoring)
- "Easier to do all at once"
- "I know it works without tests"
- "While I'm here, I'll also..." (scope creep during refactoring)
- "Tests will fail temporarily but I'll fix them" (tests must stay green)
</critical_rules>
<verification_checklist>
Before marking refactoring complete:
- [ ] All tests pass (verified with hyperpowers:test-runner agent)
- [ ] No new linter warnings
- [ ] No behavior changes introduced
- [ ] Code is cleaner/simpler than before
- [ ] Each commit in history is small and safe
- [ ] bd task documents what was done and why
- [ ] Can explain what each transformation did
**Can't check all boxes?** Return to process and fix before closing bd task.
</verification_checklist>
<integration>
**This skill requires:**
- hyperpowers:test-driven-development (for writing tests before refactoring if none exist)
- hyperpowers:verification-before-completion (for final verification)
- hyperpowers:test-runner agent (for running tests without context pollution)
**This skill is called by:**
- General development workflows when improving code structure
- After features are complete and working
- When preparing code for new features
**Agents used:**
- test-runner (runs tests/commits without polluting main context)
</integration>
<resources>
**Detailed guides:**
- [Common refactoring patterns](resources/refactoring-patterns.md) - Extract Method, Extract Class, Inline, etc.
- [Complete refactoring session example](resources/example-session.md) - Minute-by-minute walkthrough
**When stuck:**
- Tests fail after change → Undo (git restore), make smaller change
- 3+ failures → Question if refactoring is right approach, consider rewrite
- No tests exist → Use hyperpowers:test-driven-development to write tests first
- Unsure how small → If it touches more than one function/file, it's too big
</resources>


@@ -0,0 +1,78 @@
## Example: Complete Refactoring Session
**Goal:** Extract validation logic from UserService
**Time:** 60 minutes
### Minutes 0-5: Verify Tests Pass
```bash
Dispatch hyperpowers:test-runner: "Run: cargo test"
Result: ✓ 234 tests pass
```
### Minutes 5-10: Create bd Task
```bash
bd create "Refactor: Extract user validation" --type task
bd edit bd-456 --design "Extract validation to UserValidator class..."
bd update bd-456 --status in_progress
```
### Minutes 10-15: Step 1 - Extract email validation function
```rust
// Extract validate_email()
```
```bash
Dispatch hyperpowers:test-runner: "Run: cargo test"
Result: ✓ 234 tests pass
git commit -m "refactor(bd-456): extract email validation"
```
### Minutes 15-20: Step 2 - Extract name validation function
```rust
// Extract validate_name()
```
```bash
Dispatch hyperpowers:test-runner: "Run: cargo test"
Result: ✓ 234 tests pass
git commit -m "refactor(bd-456): extract name validation"
```
### Minutes 20-25: Step 3 - Create UserValidator struct
```rust
struct UserValidator { /* empty */ }
impl UserValidator { /* empty */ }
```
```bash
Dispatch hyperpowers:test-runner: "Run: cargo test"
Result: ✓ 234 tests pass
git commit -m "refactor(bd-456): create UserValidator struct"
```
### Minutes 25-35: Steps 4-6 - Move validations to UserValidator
Each step: move one method, test, commit
### Minutes 35-45: Step 7 - Update UserService to use validator
```rust
// Use UserValidator instead of inline validation
```
```bash
Dispatch hyperpowers:test-runner: "Run: cargo test"
Result: ✓ 234 tests pass
git commit -m "refactor(bd-456): use UserValidator in UserService"
```
### Minutes 45-55: Step 8 - Remove duplication from other services
Each service: one change, test, commit
### Minutes 55-60: Final verification and close
```bash
Dispatch hyperpowers:test-runner: "Run: cargo test"
Result: ✓ 234 tests pass
Dispatch hyperpowers:test-runner: "Run: cargo clippy"
Result: ✓ No warnings
bd close bd-456
```
**Result:** Refactoring complete, 8 safe commits, all tests green throughout.


@@ -0,0 +1,121 @@
## Common Refactoring Patterns
### Extract Method
**When:** Duplicated code or long function
```rust
// Before: Long function
fn process(data: Vec<i32>) -> i32 {
let mut sum = 0;
for x in data {
sum += x * x;
}
sum
}
// After: Extracted method
fn process(data: Vec<i32>) -> i32 {
data.iter().map(|x| square(x)).sum()
}
fn square(x: &i32) -> i32 {
x * x
}
```
**Steps:**
1. Extract square() function
2. Run tests
3. Commit
4. Replace loop with iterator
5. Run tests
6. Commit
### Rename Variable/Function
**When:** Name is unclear or misleading
```rust
// Before
fn calc(d: Vec<i32>) -> f64 {
let s: i32 = d.iter().sum();
s as f64 / d.len() as f64
}
// After - Step by step
// Step 1: Rename function
fn calculate_average(d: Vec<i32>) -> f64 { ... } // Test, commit
// Step 2: Rename parameter
fn calculate_average(data: Vec<i32>) -> f64 { ... } // Test, commit
// Step 3: Rename variable
fn calculate_average(data: Vec<i32>) -> f64 {
let sum: i32 = data.iter().sum(); // Test, commit
sum as f64 / data.len() as f64
}
```
### Extract Class/Struct
**When:** Class has multiple responsibilities
```rust
// Before: God object
struct UserService {
db: Database,
email_validator: Regex,
name_validator: Regex,
}
// After: Single responsibility
struct UserService {
db: Database,
validator: UserValidator, // EXTRACTED
}
struct UserValidator {
email_pattern: Regex,
name_pattern: Regex,
}
```
**Steps:**
1. Create empty UserValidator struct
2. Test, commit
3. Move email_validator field
4. Test, commit
5. Move name_validator field
6. Test, commit
7. Update UserService to use UserValidator
8. Test, commit
### Inline Unnecessary Abstraction
**When:** Abstraction adds no value
```rust
// Before: Pointless wrapper
fn get_user_email(user: &User) -> &str {
&user.email
}
fn process() {
let email = get_user_email(&user); // Just use user.email!
}
// After: Inline
fn process() {
let email = &user.email;
}
```
**Steps:**
1. Replace one call site with direct access
2. Test, commit
3. Replace next call site
4. Test, commit
5. Remove wrapper function
6. Test, commit


@@ -0,0 +1,646 @@
---
name: review-implementation
description: Use after hyperpowers:executing-plans completes all tasks - verifies implementation against bd spec, all success criteria met, anti-patterns avoided
---
<skill_overview>
Review completed implementation against bd epic to catch gaps before claiming completion; spec is contract, implementation must fulfill contract completely.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow the 4-step review process exactly. Review with Google Fellow-level scrutiny. Never skip automated checks, quality gates, or code reading. No approval without evidence for every criterion.
</rigidity_level>
<quick_reference>
| Step | Action | Deliverable |
|------|--------|-------------|
| 1 | Load bd epic + all tasks | TodoWrite with tasks to review |
| 2 | Review each task (automated checks, quality gates, read code, verify criteria) | Findings per task |
| 3 | Report findings (approved / gaps found) | Review decision |
| 4 | Gate: If approved → finishing-a-development-branch, If gaps → STOP | Next action |
**Review Perspective:** Google Fellow-level SRE with 20+ years experience reviewing junior engineer code.
</quick_reference>
<when_to_use>
- hyperpowers:executing-plans completed all tasks
- Before claiming work is complete
- Before hyperpowers:finishing-a-development-branch
- Want to verify implementation matches spec
**Don't use for:**
- Mid-implementation (use hyperpowers:executing-plans)
- Before all tasks done
- Code reviews of external PRs (this is self-review)
</when_to_use>
<the_process>
## Step 1: Load Epic Specification
**Announce:** "I'm using hyperpowers:review-implementation to verify implementation matches spec. Reviewing with Google Fellow-level scrutiny."
**Get epic and tasks:**
```bash
bd show bd-1 # Epic specification
bd dep tree bd-1 # Task tree
bd list --parent bd-1 # All tasks
```
**Create TodoWrite tracker:**
```
TodoWrite todos:
- Review bd-2: Task Name
- Review bd-3: Task Name
- Review bd-4: Task Name
- Compile findings and make decision
```
---
## Step 2: Review Each Task
For each task:
### A. Read Task Specification
```bash
bd show bd-3
```
Extract:
- Goal (what problem solved?)
- Success criteria (how verify done?)
- Implementation checklist (files/functions/tests)
- Key considerations (edge cases)
- Anti-patterns (prohibited patterns)
---
### B. Run Automated Code Completeness Checks
```bash
# TODOs/FIXMEs without issue numbers
rg -i "todo|fixme" src/ tests/ || echo "✅ None"
# Stub implementations
rg "unimplemented!|todo!|unreachable!|panic!\(\"not implemented" src/ || echo "✅ None"
# Unsafe patterns in production
rg "\.unwrap\(\)|\.expect\(" src/ | grep -v "/tests/" || echo "✅ None"
# Ignored/skipped tests
rg "#\[ignore\]|#\[skip\]|\.skip\(\)" tests/ src/ || echo "✅ None"
```
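These checks can be bundled into one pass/fail script — a sketch mirroring the patterns above (adjust paths and globs to your project layout):
```bash
#!/usr/bin/env bash
# Completeness gate: fails if any check finds a hit
fail=0
rg -i "todo|fixme" src/ tests/ && fail=1
rg "unimplemented!|todo!|unreachable!" src/ && fail=1
rg "\.unwrap\(\)|\.expect\(" src/ | grep -v "/tests/" | grep . && fail=1
rg "#\[ignore\]|#\[skip\]|\.skip\(\)" tests/ src/ && fail=1
if [ "$fail" -eq 1 ]; then
  echo "❌ Completeness checks failed - see hits above"
  exit 1
fi
echo "✅ All completeness checks passed"
```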
---
### C. Run Quality Gates (via test-runner agent)
**IMPORTANT:** Use hyperpowers:test-runner agent to avoid context pollution.
```
Dispatch hyperpowers:test-runner: "Run: cargo test"
Dispatch hyperpowers:test-runner: "Run: cargo fmt --check"
Dispatch hyperpowers:test-runner: "Run: cargo clippy -- -D warnings"
Dispatch hyperpowers:test-runner: "Run: .git/hooks/pre-commit"
```
---
### D. Read Implementation Files
**CRITICAL:** READ actual files, not just git diff.
```bash
# See changes
git diff main...HEAD -- src/auth/jwt.ts
# THEN READ FULL FILE
Read tool: src/auth/jwt.ts
```
**While reading, check:**
- ✅ Code implements checklist items (not stubs)
- ✅ Error handling uses proper patterns (Result, try/catch)
- ✅ Edge cases from "Key Considerations" handled
- ✅ Code is clear and maintainable
- ✅ No anti-patterns present
---
### E. Code Quality Review (Google Fellow Perspective)
**Assume code written by junior engineer. Apply production-grade scrutiny.**
**Error Handling:**
- Proper use of Result/Option or try/catch?
- Error messages helpful for production debugging?
- No unwrap/expect in production?
- Errors propagate with context?
- Failure modes graceful?
**Safety:**
- No unsafe blocks without justification?
- Proper bounds checking?
- No potential panics?
- No data races?
- No SQL injection, XSS vulnerabilities?
**Clarity:**
- Would junior understand in 6 months?
- Single responsibility per function?
- Descriptive variable names?
- Complex logic explained?
- No clever tricks - obvious and boring?
**Testing:**
- Edge cases covered (empty, max, Unicode)?
- Tests meaningful, not just coverage?
- Test names describe what verified?
- Tests test behavior, not implementation?
- Failure scenarios tested?
**Production Readiness:**
- Comfortable deploying to production?
- Could cause outage or data loss?
- Performance acceptable under load?
- Logging sufficient for debugging?
---
### F. Verify Success Criteria with Evidence
For EACH criterion in bd task:
- Run verification command
- Check actual output
- Don't assume - verify with evidence
- Use hyperpowers:test-runner for tests/lints
**Example:**
```
Criterion: "All tests passing"
Command: cargo test
Evidence: "127 tests passed, 0 failures"
Result: ✅ Met
Criterion: "No unwrap in production"
Command: rg "\.unwrap\(\)" src/
Evidence: "No matches"
Result: ✅ Met
```
---
### G. Check Anti-Patterns
Search for each prohibited pattern from bd task:
```bash
# Example anti-patterns from task
rg "\.unwrap\(\)" src/ # If task prohibits unwrap
rg "TODO" src/ # If task prohibits untracked TODOs
rg "\.skip\(\)" tests/ # If task prohibits skipped tests
```
---
### H. Verify Key Considerations
Read code to confirm edge cases handled:
- Empty input validation
- Unicode handling
- Concurrent access
- Failure modes
- Performance concerns
**Example:** Task says "Must handle empty payload" → Find validation code for empty payload.
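A hedged sketch for that example — use ripgrep to locate the candidate handling code, then confirm it by reading the full file (a grep hit alone is not evidence):
```bash
# Paths and function names are from the example above - adapt to your task
rg -n "payload" src/auth/jwt.ts          # is there any payload handling near generateToken?
rg -n "!payload|payload ==|Object.keys\(payload\)" src/auth/
# If nothing relevant turns up, record the key consideration as unaddressed
```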
---
### I. Record Findings
```markdown
### Task: bd-3 - Implement JWT authentication
#### Automated Checks
- TODOs: ✅ None
- Stubs: ✅ None
- Unsafe patterns: ❌ Found `.unwrap()` at src/auth/jwt.ts:45
- Ignored tests: ✅ None
#### Quality Gates
- Tests: ✅ Pass (127 tests)
- Formatting: ✅ Pass
- Linting: ❌ 3 warnings
- Pre-commit: ❌ Fails due to linting
#### Files Reviewed
- src/auth/jwt.ts: ⚠️ Contains `.unwrap()` at line 45
- tests/auth/jwt_test.rs: ✅ Complete
#### Code Quality
- Error Handling: ⚠️ Uses unwrap instead of proper error propagation
- Safety: ✅ Good
- Clarity: ✅ Good
- Testing: ✅ Good
#### Success Criteria
1. "All tests pass": ✅ Met - Evidence: 127 tests passed
2. "Pre-commit passes": ❌ Not met - Evidence: clippy warnings
3. "No unwrap in production": ❌ Not met - Evidence: Found at jwt.ts:45
#### Anti-Patterns
- "NO unwrap in production": ❌ Violated at src/auth/jwt.ts:45
#### Issues
**Critical:**
1. unwrap() at jwt.ts:45 - violates anti-pattern, must use proper error handling
**Important:**
2. 3 clippy warnings block pre-commit hook
```
---
### J. Mark Task Reviewed (TodoWrite)
---
## Step 3: Report Findings
After reviewing ALL tasks:
**If NO gaps:**
```markdown
## Implementation Review: APPROVED ✅
Reviewed bd-1 (OAuth Authentication) against implementation.
### Tasks Reviewed
- bd-2: Configure OAuth provider ✅
- bd-3: Implement token exchange ✅
- bd-4: Add refresh logic ✅
### Verification Summary
- All success criteria verified
- No anti-patterns detected
- All key considerations addressed
- All files implemented per spec
### Evidence
- Tests: 127 passed, 0 failures (2.3s)
- Linting: No warnings
- Pre-commit: Pass
- Code review: Production-ready
Ready to proceed to hyperpowers:finishing-a-development-branch.
```
**If gaps found:**
```markdown
## Implementation Review: GAPS FOUND ❌
Reviewed bd-1 (OAuth Authentication) against implementation.
### Tasks with Gaps
#### bd-3: Implement token exchange
**Gaps:**
- ❌ Success criterion not met: "Pre-commit hooks pass"
- Evidence: cargo clippy shows 3 warnings
- ❌ Anti-pattern violation: Found `.unwrap()` at src/auth/jwt.ts:45
- ⚠️ Key consideration not addressed: "Empty payload validation"
- No check for empty payload in generateToken()
#### bd-4: Add refresh logic
**Gaps:**
- ❌ Success criterion not met: "All tests passing"
- Evidence: test_verify_expired_token failing
### Cannot Proceed
Implementation does not match spec. Fix gaps before completing.
```
---
## Step 4: Gate Decision
**If APPROVED:**
```
Announce: "I'm using hyperpowers:finishing-a-development-branch to complete this work."
Use Skill tool: hyperpowers:finishing-a-development-branch
```
**If GAPS FOUND:**
```
STOP. Do not proceed to finishing-a-development-branch.
Fix gaps or discuss with partner.
Re-run review after fixes.
```
</the_process>
<examples>
<example>
<scenario>Developer only checks git diff, doesn't read actual files</scenario>
<code>
# Review process
git diff main...HEAD # Shows changes
# Developer sees:
+ function generateToken(payload) {
+ return jwt.sign(payload, secret);
+ }
# Approves based on diff
"Looks good, token generation implemented ✅"
# Misses: Full context shows no validation
function generateToken(payload) {
// No validation of payload!
// No check for empty payload (key consideration)
// No error handling if jwt.sign fails
return jwt.sign(payload, secret);
}
</code>
<why_it_fails>
- Git diff shows additions, not full context
- Missed that empty payload not validated (key consideration)
- Missed that error handling missing (quality issue)
- False approval - gaps exist but not caught
- Will fail in production when empty payload passed
</why_it_fails>
<correction>
**Correct review process:**
```bash
# See changes
git diff main...HEAD -- src/auth/jwt.ts
# THEN READ FULL FILE
Read tool: src/auth/jwt.ts
```
**Reading full file reveals:**
```javascript
function generateToken(payload) {
// Missing: empty payload check (key consideration from bd task)
// Missing: error handling for jwt.sign failure
return jwt.sign(payload, secret);
}
```
**Record in findings:**
```
⚠️ Key consideration not addressed: "Empty payload validation"
- No check for empty payload in generateToken()
- Code at src/auth/jwt.ts:15-17
⚠️ Error handling: jwt.sign can throw, not handled
```
**What you gain:**
- Caught gaps that git diff missed
- Full context reveals missing validation
- Quality issues identified before production
- Spec compliance verified, not assumed
</correction>
</example>
<example>
<scenario>Developer assumes tests passing means done</scenario>
<code>
# Run tests
cargo test
# Output: 127 tests passed
# Developer concludes
"Tests pass, implementation complete ✅"
# Proceeds to finishing-a-development-branch
# Misses:
- bd task has 5 success criteria
- Only checked 1 (tests pass)
- Anti-pattern: unwrap() present (prohibited)
- Key consideration: Unicode handling not tested
- Linter has warnings (blocks pre-commit)
</code>
<why_it_fails>
- Tests passing ≠ spec compliance
- Didn't verify all success criteria
- Didn't check anti-patterns
- Didn't verify key considerations
- Pre-commit will fail (blocks merge)
- Ships code violating anti-patterns
</why_it_fails>
<correction>
**Correct review checks ALL criteria:**
```markdown
bd task has 5 success criteria:
1. "All tests pass" ✅ - Evidence: 127 passed
2. "Pre-commit passes" ❌ - Evidence: clippy warns (3 warnings)
3. "No unwrap in production" ❌ - Evidence: Found at jwt.ts:45
4. "Unicode handling tested" ⚠️ - Need to verify test exists
5. "Rate limiting implemented" ⚠️ - Need to check code
Result: 1/5 criteria verified met. GAPS EXIST.
```
**Run additional checks:**
```bash
# Check criterion 2
cargo clippy
# 3 warnings found ❌
# Check criterion 3
rg "\.unwrap\(\)" src/
# src/auth/jwt.ts:45 ❌
# Check criterion 4
rg "unicode" tests/
# No matches ⚠️ Need to verify
```
**Decision: GAPS FOUND, cannot proceed**
**What you gain:**
- Verified ALL criteria, not just tests
- Caught anti-pattern violations
- Caught pre-commit blockers
- Prevented shipping non-compliant code
- Spec contract honored completely
</correction>
</example>
<example>
<scenario>Developer rationalizes skipping rigor for "simple" task</scenario>
<code>
bd task: "Add logging to error paths"
# Developer thinks: "Simple task, just added console.log"
# Skips:
- Automated checks (assumes no issues)
- Code quality review (seems obvious)
- Full success criteria verification
# Approves quickly:
"Logging added ✅"
# Misses:
- console.log used instead of proper logger (anti-pattern)
- Only added to 2 of 5 error paths (incomplete)
- No test verifying logs actually output (criterion)
- Logs contain sensitive data (security issue)
</code>
<why_it_fails>
- "Simple" tasks have hidden complexity
- Skipped rigor catches exactly these issues
- Incomplete implementation (2/5 paths)
- Security vulnerability shipped
- Anti-pattern not caught
- Failed success criterion (test logs)
</why_it_fails>
<correction>
**Follow full review process:**
```bash
# Automated checks
rg "console\.log" src/
# Found at error-handler.ts:12, 15 ⚠️
# Read bd task
bd show bd-5
# Success criteria:
# 1. "All error paths logged"
# 2. "No sensitive data in logs"
# 3. "Test verifies log output"
# Check criterion 1
grep -n "throw new Error" src/
# 5 locations found
# Only 2 have logging ❌ Incomplete
# Check criterion 2
Read tool: src/error-handler.ts
# Logs contain password field ❌ Security issue
# Check criterion 3
rg "test.*log" tests/
# No matches ❌ Test missing
```
**Decision: GAPS FOUND**
- Incomplete (3/5 error paths missing logs)
- Security issue (logs password)
- Anti-pattern (console.log instead of logger)
- Missing test
**What you gain:**
- "Simple" task revealed multiple gaps
- Security vulnerability caught pre-production
- Rigor prevents incomplete work shipping
- All criteria must be met, no exceptions
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Review every task** → No skipping "simple" tasks
2. **Run all automated checks** → TODOs, stubs, unwrap, ignored tests
3. **Read actual files with Read tool** → Not just git diff
4. **Verify every success criterion** → With evidence, not assumptions
5. **Check all anti-patterns** → Search for prohibited patterns
6. **Apply Google Fellow scrutiny** → Production-grade code review
7. **If gaps found → STOP** → Don't proceed to finishing-a-development-branch
## Common Excuses
All of these mean: **STOP. Follow full review process.**
- "Tests pass, must be complete" (Tests ≠ spec, check all criteria)
- "I implemented it, it's done" (Implementation ≠ compliance, verify)
- "No time for thorough review" (Gaps later cost more than review now)
- "Looks good to me" (Opinion ≠ evidence, run verifications)
- "Small gaps don't matter" (Spec is contract, all criteria matter)
- "Will fix in next PR" (This PR completes this epic, fix now)
- "Can check diff instead of files" (Diff shows changes, not context)
- "Automated checks cover it" (Checks + code review both required)
- "Success criteria passing means done" (Also check anti-patterns, quality, edge cases)
</critical_rules>
<verification_checklist>
Before approving implementation:
**Per task:**
- [ ] Read bd task specification completely
- [ ] Ran all automated checks (TODOs, stubs, unwrap, ignored tests)
- [ ] Ran all quality gates via test-runner agent (tests, format, lint, pre-commit)
- [ ] Read actual implementation files with Read tool (not just diff)
- [ ] Reviewed code quality with Google Fellow perspective
- [ ] Verified every success criterion with evidence
- [ ] Checked every anti-pattern (searched for prohibited patterns)
- [ ] Verified every key consideration addressed in code
**Overall:**
- [ ] Reviewed ALL tasks (no exceptions)
- [ ] TodoWrite tracker shows all tasks reviewed
- [ ] Compiled findings (approved or gaps)
- [ ] If approved: all criteria met for all tasks
- [ ] If gaps: documented exactly what missing
**Can't check all boxes?** Return to Step 2 and complete review.
</verification_checklist>
<integration>
**This skill is called by:**
- hyperpowers:executing-plans (Step 5, after all tasks executed)
**This skill calls:**
- hyperpowers:finishing-a-development-branch (if approved)
- hyperpowers:test-runner agent (for quality gates)
**This skill uses:**
- hyperpowers:verification-before-completion principles (evidence before claims)
**Call chain:**
```
hyperpowers:executing-plans → hyperpowers:review-implementation → hyperpowers:finishing-a-development-branch
(if gaps: STOP)
```
**CRITICAL:** Use bd commands (bd show, bd list, bd dep tree), never read `.beads/issues.jsonl` directly.
</integration>
<resources>
**Detailed guides:**
- [Code quality standards by language](resources/quality-standards.md)
- [Common anti-patterns to check](resources/anti-patterns-reference.md)
- [Production readiness checklist](resources/production-checklist.md)
**When stuck:**
- Unsure if gap critical → If violates criterion, it's a gap
- Criteria ambiguous → Ask user for clarification before approving
- Anti-pattern unclear → Search for it, document if found
- Quality concern → Document as gap, don't rationalize away
</resources>


@@ -0,0 +1,566 @@
---
name: root-cause-tracing
description: Use when errors occur deep in execution - traces bugs backward through call stack to find original trigger, not just symptom
---
<skill_overview>
Bugs manifest deep in the call stack; trace backward until you find the original trigger, then fix at source, not where error appears.
</skill_overview>
<rigidity_level>
MEDIUM FREEDOM - Follow the backward tracing process strictly, but adapt instrumentation and debugging techniques to your language and tools.
</rigidity_level>
<quick_reference>
| Step | Action | Question |
|------|--------|----------|
| 1 | Read error completely | What failed and where? |
| 2 | Find immediate cause | What code directly threw this? |
| 3 | Trace backward one level | What called this code? |
| 4 | Keep tracing up stack | What called that? |
| 5 | Find where bad data originated | Where was invalid value created? |
| 6 | Fix at source | Address root cause |
| 7 | Add defense at each layer | Validate assumptions as backup |
**Core rule:** Never fix just where error appears. Fix where problem originates.
</quick_reference>
<when_to_use>
- Error happens deep in execution (not at entry point)
- Stack trace shows long call chain
- Unclear where invalid data originated
- Need to find which test/code triggers problem
- Error message points to utility/library code
**Example symptoms:**
- "Database rejects empty string" ← Where did empty string come from?
- "File not found: ''" ← Why is path empty?
- "Invalid argument to function" ← Who passed invalid argument?
- "Null pointer dereference" ← What should have been initialized?
</when_to_use>
<the_process>
## 1. Observe the Symptom
Read the complete error:
```
Error: Invalid email format: ""
at validateEmail (validator.ts:42)
at UserService.create (user-service.ts:18)
at ApiHandler.createUser (api-handler.ts:67)
at HttpServer.handleRequest (server.ts:123)
at TestCase.test_create_user (user.test.ts:10)
```
**Symptom:** Email validation fails on empty string
**Location:** Deep in validator utility
**DON'T fix here yet.** This might be symptom, not source.
---
## 2. Find Immediate Cause
What code directly causes this?
```typescript
// validator.ts:42
function validateEmail(email: string): boolean {
if (!email) throw new Error(`Invalid email format: "${email}"`);
return EMAIL_REGEX.test(email);
}
```
**Question:** Why is email empty? Keep tracing.
---
## 3. Trace Backward: What Called This?
Use stack trace:
```typescript
// user-service.ts:18
create(request: UserRequest): User {
validateEmail(request.email); // Called with request.email = ""
// ...
}
```
**Question:** Why is `request.email` empty? Keep tracing.
---
## 4. Keep Tracing Up the Stack
```typescript
// api-handler.ts:67
async createUser(req: Request): Promise<Response> {
const userRequest = {
name: req.body.name,
email: req.body.email || "", // ← FOUND IT!
};
return this.userService.create(userRequest);
}
```
**Root cause found:** API handler provides default empty string when email missing.
---
## 5. Identify the Pattern
**Why empty string as default?**
- Misguided "safety": Thought empty string better than undefined
- Should reject invalid request at API boundary
- Downstream code assumes data already validated
---
## 6. Fix at Source
```typescript
// api-handler.ts (SOURCE FIX)
async createUser(req: Request): Promise<Response> {
if (!req.body.email) {
return Response.badRequest("Email is required");
}
const userRequest = {
name: req.body.name,
email: req.body.email, // No default, already validated
};
return this.userService.create(userRequest);
}
```
---
## 7. Add Defense in Depth
After fixing source, add validation at each layer as backup:
```typescript
// Layer 1: API - Reject invalid input (PRIMARY FIX)
if (!req.body.email) return Response.badRequest("Email required");
// Layer 2: Service - Validate assumptions
assert(request.email, "email must be present");
// Layer 3: Utility - Defensive check
if (!email) throw new Error("invariant violated: email empty");
```
**Primary fix at source. Defense is backup, not replacement.**
</the_process>
<debugging_approaches>
## Option 1: Guide User Through Debugger
**IMPORTANT:** Claude cannot run interactive debuggers. Guide user through debugger commands.
```
"Let's use lldb to trace backward through the call stack.
Please run these commands:
lldb target/debug/myapp
(lldb) breakpoint set --file validator.rs --line 42
(lldb) run
When breakpoint hits:
(lldb) frame variable email # Check value here
(lldb) bt # See full call stack
(lldb) up # Move to caller
(lldb) frame variable request # Check values in caller
(lldb) up # Move up again
(lldb) frame variable # Where empty string created?
Please share:
1. Value of 'email' at validator.rs:42
2. Value of 'request.email' in user_service.rs
3. Value of 'req.body.email' in api_handler.rs
4. Where does empty string first appear?"
```
---
## Option 2: Add Instrumentation (Claude CAN Do This)
When debugger not available or issue intermittent:
```rust
// Add at error location
fn validate_email(email: &str) -> Result<()> {
eprintln!("DEBUG validate_email called:");
eprintln!(" email: {:?}", email);
eprintln!(" backtrace: {}", std::backtrace::Backtrace::capture());
if email.is_empty() {
return Err(Error::InvalidEmail);
}
// ...
}
```
**Critical:** Use `eprintln!()` or `console.error()` in tests (not logger - may be suppressed).
**Run and analyze:**
```bash
cargo test 2>&1 | grep "DEBUG validate_email" -A 10
```
Look for:
- Test file names in backtraces
- Line numbers triggering the call
- Patterns (same test? same parameter?)
</debugging_approaches>
<finding_polluting_tests>
## Finding Which Test Pollutes
When something appears during tests but you don't know which:
**Binary search approach:**
```bash
# Run half the tests
npm test tests/first-half/*.test.ts
# Pollution appears? Yes → in first half, No → second half
# Subdivide
npm test tests/first-quarter/*.test.ts
# Continue until specific file
npm test tests/auth/login.test.ts ← Found it!
```
**Or test isolation:**
```bash
# Run tests one at a time
for test in tests/**/*.test.ts; do
echo "Testing: $test"
npm test "$test"
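# Pollution check: a stray .git dir is the marker from the example below - adapt to your symptom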
if [ -d .git ]; then
echo "FOUND POLLUTER: $test"
break
fi
done
```
</finding_polluting_tests>
<examples>
<example>
<scenario>Developer fixes symptom, not source</scenario>
<code>
# Error appears in git utility:
fn git_init(directory: &str) {
Command::new("git")
.arg("init")
.current_dir(directory)
.run()
}
# Error: "Invalid argument: empty directory"
# Developer adds validation at symptom:
fn git_init(directory: &str) {
if directory.is_empty() {
panic!("Directory cannot be empty"); // Band-aid
}
Command::new("git").arg("init").current_dir(directory).run()
}
</code>
<why_it_fails>
- Fixes symptom, not source (where empty string created)
- Same bug will appear elsewhere directory is used
- Doesn't explain WHY directory was empty
- Future code might make same mistake
- Band-aid hides the real problem
</why_it_fails>
<correction>
**Trace backward:**
1. git_init called with directory=""
2. WorkspaceManager.init(projectDir="")
3. Session.create(projectDir="")
4. Test: Project.create(context.tempDir)
5. **SOURCE:** context.tempDir="" (accessed before beforeEach!)
**Fix at source:**
```typescript
function setupTest() {
let _tempDir: string | undefined;
return {
beforeEach() {
_tempDir = makeTempDir();
},
get tempDir(): string {
if (!_tempDir) {
throw new Error("tempDir accessed before beforeEach!");
}
return _tempDir;
}
};
}
```
**What you gain:**
- Fixes actual bug (test timing issue)
- Prevents same mistake elsewhere
- Clear error at source, not deep in stack
- No empty strings propagating through system
</correction>
</example>
<example>
<scenario>Developer stops tracing too early</scenario>
<code>
# Error in API handler
async createUser(req: Request): Promise<Response> {
const userRequest = {
name: req.body.name,
email: req.body.email || "", // Suspicious!
};
return this.userService.create(userRequest);
}
# Developer sees empty string default and "fixes" it:
email: req.body.email || "noreply@example.com"
# Ships to production
# Bug: Users created without email input get noreply@example.com
# Database has fake emails, can't distinguish missing from real
</code>
<why_it_fails>
- Stopped at first suspicious code
- Didn't question WHY empty string was default
- "Fixed" by replacing with different wrong default
- Root cause: shouldn't accept missing email at all
- Validation should happen at API boundary
</why_it_fails>
<correction>
**Keep tracing to understand intent:**
1. Why was empty string default?
2. Should email be optional or required?
3. What does API spec say?
4. What does database schema say?
**Findings:**
- Email column is NOT NULL in database
- API docs say email is required
- Empty string was workaround, not design
**Fix at source (validate at boundary):**
```typescript
async createUser(req: Request): Promise<Response> {
// Validate at API boundary
if (!req.body.email) {
return Response.badRequest("Email is required");
}
const userRequest = {
name: req.body.name,
email: req.body.email, // No default needed
};
return this.userService.create(userRequest);
}
```
**What you gain:**
- Validates at correct layer (API boundary)
- Clear error message to client
- No invalid data propagates downstream
- Database constraints enforced
- Matches API specification
</correction>
</example>
<example>
<scenario>Complex multi-layer trace to find original trigger</scenario>
<code>
# Problem: .git directory appearing in source code directory during tests
# Symptom location:
Error: Cannot initialize git repo (repo already exists)
Location: src/workspace/git.rs:45
# Developer adds check:
if Path::new(".git").exists() {
return Err("Git already initialized");
}
# Doesn't help - still appears in wrong place!
</code>
<why_it_fails>
- Detects symptom, doesn't prevent it
- .git still created in wrong directory
- Doesn't explain HOW it gets there
- Pollution still happens, just detected
</why_it_fails>
<correction>
**Trace through multiple layers:**
```
1. git init runs with cwd=""
↓ Why is cwd empty?
2. WorkspaceManager.init(projectDir="")
↓ Why is projectDir empty?
3. Session.create(projectDir="")
↓ Why was empty string passed?
4. Test: Project.create(context.tempDir)
↓ Why is context.tempDir empty?
5. ROOT CAUSE:
const context = setupTest(); // tempDir="" initially
Project.create(context.tempDir); // Accessed at top level!
beforeEach(() => {
context.tempDir = makeTempDir(); // Assigned here
});
TEST ACCESSED TEMPDIR BEFORE BEFOREEACH RAN!
```
**Fix at source (make early access impossible):**
```typescript
function setupTest() {
let _tempDir: string | undefined;
return {
beforeEach() {
_tempDir = makeTempDir();
},
get tempDir(): string {
if (!_tempDir) {
throw new Error("tempDir accessed before beforeEach!");
}
return _tempDir;
}
};
}
```
**Then add defense at each layer:**
```rust
// Layer 1: Test framework (PRIMARY FIX)
// Getter throws if accessed early
// Layer 2: Project validation
fn create(directory: &str) -> Result<Self> {
if directory.is_empty() {
return Err("Directory cannot be empty");
}
// ...
}
// Layer 3: Workspace validation
fn init(path: &Path) -> Result<()> {
if !path.exists() {
return Err("Path must exist");
}
// ...
}
// Layer 4: Environment guard
fn git_init(dir: &Path) -> Result<()> {
if env::var("NODE_ENV") == Ok("test".to_string()) {
if !dir.starts_with("/tmp") {
panic!("Refusing to git init outside test dir");
}
}
// ...
}
```
**What you gain:**
- Primary fix prevents early access (source)
- Each layer validates assumptions (defense)
- Clear error at source, not deep in stack
- Environment guard keeps tests from polluting real directories
- Multi-layer defense catches future mistakes
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Never fix just where error appears** → Trace backward to find source
2. **Don't stop at first suspicious code** → Keep tracing to original trigger
3. **Fix at source first** → Defense is backup, not primary fix
4. **Use debugger OR instrumentation** → Don't guess at call chain
5. **Add defense at each layer** → After fixing source, validate assumptions throughout
## Common Excuses
All of these mean: **STOP. Trace backward to find source.**
- "Error is obvious here, I'll add validation" (That's a symptom fix)
- "Stack trace shows the problem" (Shows symptom location, not source)
- "This code should handle empty values" (Why is value empty? Find source.)
- "Too deep to trace, I'll add defensive check" (Defense without source fix = band-aid)
- "Multiple places could cause this" (Trace to find which one actually does)
</critical_rules>
<verification_checklist>
Before claiming root cause fixed:
- [ ] Traced backward through entire call chain
- [ ] Found where invalid data was created (not just passed)
- [ ] Identified WHY invalid data was created (pattern/assumption)
- [ ] Fixed at source (where bad data originates)
- [ ] Added defense at each layer (validate assumptions)
- [ ] Verified fix with test (reproduces original bug, passes with fix)
- [ ] Confirmed no other code paths have same pattern
**Can't check all boxes?** Keep tracing backward.
</verification_checklist>
<integration>
**This skill is called by:**
- hyperpowers:debugging-with-tools (Phase 2: Trace Backward Through Call Stack)
- When errors occur deep in execution
- When unclear where invalid data originated
**This skill requires:**
- Stack traces or debugger access
- Ability to add instrumentation (logging)
- Understanding of call chain
**This skill calls:**
- hyperpowers:test-driven-development (write regression test after finding source)
- hyperpowers:verification-before-completion (verify fix works)
</integration>
<resources>
**Detailed guides:**
- [Debugger commands by language](resources/debugger-reference.md)
- [Instrumentation patterns](resources/instrumentation-patterns.md)
- [Defense-in-depth examples](resources/defense-patterns.md)
**When stuck:**
- Can't find source → Add instrumentation at each layer, run test
- Stack trace unclear → Use debugger to inspect variables at each frame
- Multiple suspects → Add instrumentation to all, find which actually executes
- Intermittent issue → Add instrumentation and wait for reproduction
</resources>

View File

@@ -0,0 +1,399 @@
---
name: skills-auto-activation
description: Use when skills aren't activating reliably - covers official solutions (better descriptions) and custom hook system for deterministic skill activation
---
<skill_overview>
Skills often don't activate despite keywords; make activation reliable through better descriptions, explicit triggers, or custom hooks.
</skill_overview>
<rigidity_level>
HIGH FREEDOM - Choose solution level based on project needs (Level 1 for simple, Level 3 for complex). Hook implementation is flexible pattern, not rigid process.
</rigidity_level>
<quick_reference>
| Level | Solution | Effort | Reliability | When to Use |
|-------|----------|--------|-------------|-------------|
| 1 | Better descriptions + explicit requests | Low | Moderate | Small projects, starting out |
| 2 | CLAUDE.md references | Low | Moderate | Document patterns |
| 3 | Custom hook system | High | Very High | Large projects, established patterns |
**Hyperpowers includes:** Auto-activation hook at `hooks/user-prompt-submit/10-skill-activator.js`
</quick_reference>
<when_to_use>
Use this skill when:
- Skills you created aren't being used automatically
- Need consistent skill activation across sessions
- Large codebases with established patterns
- Manual "/use skill-name" gets tedious
**Prerequisites:**
- Skills properly configured (name, description, SKILL.md)
- Code execution enabled (Settings > Capabilities)
- Skills toggled on (Settings > Capabilities)
</when_to_use>
<the_problem>
## What Users Experience
**Symptoms:**
- Keywords from skill descriptions present → skill not used
- Working on files that should trigger skills → nothing
- Skills exist but sit unused
**Community reports:**
- GitHub Issue #9954: "Skills not available even if explicitly enabled"
- "Claude knows it should use skills, but it's not reliable"
- Skills activation is "not reliable yet"
**Root cause:** Skills rely on Claude recognizing relevance (not deterministic)
</the_problem>
<solution_levels>
## Level 1: Official Solutions (Start Here)
### 1. Improve Skill Descriptions
**Bad:**
```yaml
name: backend-dev
description: Helps with backend development
```
**Good:**
```yaml
name: backend-dev-guidelines
description: Use when creating API routes, controllers, services, or repositories in backend - enforces TypeScript patterns, error handling with Sentry, and Prisma repository pattern
```
**Key elements:**
- Specific keywords: "API routes", "controllers", "services"
- When to use: "Use when creating..."
- What it enforces: Patterns, error handling
### 2. Be Explicit in Requests
Instead of: "How do I create an endpoint?"
Try: "Use my backend-dev-guidelines skill to create an endpoint"
**Result:** Works, but tedious
### 3. Check Settings
- Settings > Capabilities > Enable code execution
- Settings > Capabilities > Toggle Skills on
- Team/Enterprise: Check org-level settings
---
## Level 2: Skill References (Moderate)
Reference skills in CLAUDE.md:
```markdown
## When Working on Backend
Before making changes:
1. Check `/skills/backend-dev-guidelines` for patterns
2. Follow repository pattern for database access
The backend-dev-guidelines skill contains complete examples.
```
**Pros:** No custom code
**Cons:** Claude still might not check
---
## Level 3: Custom Hook System (Advanced)
**How it works:**
1. UserPromptSubmit hook analyzes prompt before Claude sees it
2. Matches keywords, intent patterns, file paths
3. Injects skill activation reminder into context
4. Claude sees "🎯 USE these skills" before processing
**Result:** "Night and day difference" - skills consistently used
### Architecture
```
User submits prompt
UserPromptSubmit hook intercepts
Analyze prompt (keywords, intent, files)
Check skill-rules.json for matches
Inject activation reminder
Claude sees: "🎯 USE these skills: ..."
Claude loads and uses relevant skills
```
### Configuration: skill-rules.json
```json
{
"backend-dev-guidelines": {
"type": "domain",
"enforcement": "suggest",
"priority": "high",
"promptTriggers": {
"keywords": ["backend", "controller", "service", "API", "endpoint"],
"intentPatterns": [
"(create|add|build).*?(route|endpoint|controller|service)",
"(how to|pattern).*?(backend|API)"
]
},
"fileTriggers": {
"pathPatterns": ["backend/src/**/*.ts", "server/**/*.ts"],
"contentPatterns": ["express\\.Router", "export.*Controller"]
}
},
"test-driven-development": {
"type": "process",
"enforcement": "suggest",
"priority": "high",
"promptTriggers": {
"keywords": ["test", "TDD", "testing"],
"intentPatterns": [
"(write|add|create).*?(test|spec)",
"test.*(first|before|TDD)"
]
},
"fileTriggers": {
"pathPatterns": ["**/*.test.ts", "**/*.spec.ts"],
"contentPatterns": ["describe\\(", "it\\(", "test\\("]
}
}
}
```
### Trigger Types
1. **Keyword Triggers** - Simple string matching (case insensitive)
2. **Intent Pattern Triggers** - Regex for actions + objects
3. **File Path Triggers** - Glob patterns for file paths
4. **Content Pattern Triggers** - Regex in file content
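A minimal sketch of how these four trigger types could be evaluated against a single rule from skill-rules.json. The rule shape matches the config above, but `ruleMatches` and `editedFiles` are illustrative assumptions (you would need your own way of tracking which files are in play), and the actual hook below only implements the first two trigger types:

```javascript
// Sketch only: evaluates all four trigger types for one rule.
// `rule` follows the skill-rules.json shape above; `editedFiles` is hypothetical.
const fs = require('fs');

function ruleMatches(rule, promptText, editedFiles = []) {
  const lower = promptText.toLowerCase();
  const prompt = rule.promptTriggers || {};
  const files = rule.fileTriggers || {};

  // 1. Keyword triggers: case-insensitive substring match
  if ((prompt.keywords || []).some(k => lower.includes(k.toLowerCase()))) return true;

  // 2. Intent pattern triggers: regex over the raw prompt text
  if ((prompt.intentPatterns || []).some(p => new RegExp(p, 'i').test(promptText))) return true;

  // 3. File path triggers: glob patterns against edited file paths
  const globToRegex = g => new RegExp('^' + g
    .replace(/\./g, '\\.')
    .replace(/\*\*/g, '___DS___')
    .replace(/\*/g, '[^/]*')
    .replace(/___DS___/g, '.*')
    .replace(/\?/g, '.') + '$');
  if ((files.pathPatterns || []).some(g => editedFiles.some(f => globToRegex(g).test(f)))) return true;

  // 4. Content pattern triggers: regex over file contents (most expensive, use sparingly)
  return (files.contentPatterns || []).some(p =>
    editedFiles.some(f => {
      try { return new RegExp(p).test(fs.readFileSync(f, 'utf8')); } catch { return false; }
    })
  );
}
```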
### Hook Implementation (High-Level)
```javascript
#!/usr/bin/env node
// ~/.claude/hooks/user-prompt-submit/skill-activator.js
const fs = require('fs');
const path = require('path');
// Load skill rules
const rules = JSON.parse(fs.readFileSync(
path.join(process.env.HOME, '.claude/skill-rules.json'), 'utf8'
));
// Read prompt from stdin
let promptData = '';
process.stdin.on('data', chunk => promptData += chunk);
process.stdin.on('end', () => {
const prompt = JSON.parse(promptData);
// Analyze prompt for skill matches
const activatedSkills = analyzePrompt(prompt.text);
if (activatedSkills.length > 0) {
// Inject skill activation reminder
const context = `
🎯 SKILL ACTIVATION CHECK
Relevant skills for this prompt:
${activatedSkills.map(s => `- **${s.skill}** (${s.priority} priority)`).join('\n')}
Check if these skills should be used before responding.
`;
console.log(JSON.stringify({
decision: 'approve',
additionalContext: context
}));
} else {
console.log(JSON.stringify({ decision: 'approve' }));
}
});
function analyzePrompt(text) {
// Match against all skill rules
// Return list of activated skills with priorities
}
```
**For complete working implementation:** See [resources/hook-implementation.md](resources/hook-implementation.md)
### Progressive Enhancement
**Phase 1 (Week 1):** Basic keyword matching
```json
{"keywords": ["backend", "API", "controller"]}
```
**Phase 2 (Week 2):** Add intent patterns
```json
{"intentPatterns": ["(create|add).*?(route|endpoint)"]}
```
**Phase 3 (Week 3):** Add file triggers
```json
{"fileTriggers": {"pathPatterns": ["backend/**/*.ts"]}}
```
**Phase 4 (Ongoing):** Refine based on observation
</solution_levels>
<results>
### Before Hook System
- Skills sit unused despite perfect keywords
- Manual "/use skill-name" every time
- Inconsistent patterns across codebase
- Time spent fixing "creative interpretations"
### After Hook System
- Skills activate automatically and reliably
- Consistent patterns enforced
- Claude self-checks before showing code
- "Night and day difference"
**Real user:** "Skills went from 'expensive decorations' to actually useful"
</results>
<limitations>
## Hook System Limitations
1. **Requires hook system** - Not built into Claude Code
2. **Maintenance overhead** - skill-rules.json needs updates
3. **May over-activate** - Too many skills overwhelm context
4. **Not perfect** - Still relies on Claude using activated skills
## Considerations
**Token usage:**
- Activation reminder adds ~50-100 tokens per prompt
- Multiple skills add more tokens
- Use priorities to limit activation
**Performance:**
- Hook adds ~100-300ms to prompt processing
- Acceptable for quality improvement
- Optimize regex patterns if slow
**Maintenance:**
- Update rules when adding new skills
- Review activation logs monthly
- Refine patterns based on misses
</limitations>
<alternatives>
## Approach 1: MCP Integration
Use Model Context Protocol to provide skills as context.
**Pros:** Built into Claude system
**Cons:** Still not deterministic, same activation issues
## Approach 2: Custom System Prompt
Modify Claude's system prompt to always check certain skills.
**Pros:** Works without hooks
**Cons:** Limited to Pro plan, can't customize per-project
## Approach 3: Manual Discipline
Always explicitly request skill usage.
**Pros:** No setup required
**Cons:** Tedious, easy to forget, doesn't scale
## Approach 4: Skill Consolidation
Combine all guidelines into CLAUDE.md.
**Pros:** Always loaded
**Cons:** Violates progressive disclosure, wastes tokens
**Recommendation:** Level 3 (hooks) for large projects, Level 1 for smaller projects
</alternatives>
<critical_rules>
## Rules That Have No Exceptions
1. **Try Level 1 first** → Better descriptions and explicit requests before building hooks
2. **Observe before building** → Watch which prompts should activate skills
3. **Start with keywords** → Add complexity incrementally (keywords → intent → files)
4. **Keep hook fast (<1 second)** → Don't block prompt processing
5. **Maintain skill-rules.json** → Update when skills change
## Common Excuses
All of these mean: **Try Level 1 first, then decide.**
- "Skills should just work automatically" (They should, but don't reliably - workaround needed)
- "Hook system too complex" (Setup takes 2 hours, saves hundreds of hours)
- "I'll manually specify skills" (You'll forget, it gets tedious)
- "Improving descriptions will fix it" (Helps, but not deterministic)
- "This is overkill" (Maybe - start Level 1, upgrade if needed)
</critical_rules>
<verification_checklist>
Before building hook system:
- [ ] Tried improving skill descriptions (Level 1)
- [ ] Tried explicit skill requests (Level 1)
- [ ] Checked all settings are enabled
- [ ] Observed which prompts should activate skills
- [ ] Identified patterns in failures
- [ ] Project large enough to justify hook overhead
- [ ] Have time for 2-hour setup + ongoing maintenance
**If Level 1 works:** Don't build hook system
**If Level 1 insufficient:** Build hook system (Level 3)
</verification_checklist>
<integration>
**This skill covers:** Skill activation strategies
**Related skills:**
- hyperpowers:building-hooks (how to build hook system)
- hyperpowers:using-hyper (when to use skills generally)
- hyperpowers:writing-skills (creating skills that activate well)
**This skill enables:**
- Consistent enforcement of patterns
- Automatic guideline checking
- Reliable skill usage across sessions
**Hyperpowers includes:** Auto-activation hook at `hooks/user-prompt-submit/10-skill-activator.js`
</integration>
<resources>
**Detailed implementation:**
- [Complete working hook code](resources/hook-implementation.md)
- [skill-rules.json examples](resources/skill-rules-examples.md)
- [Troubleshooting guide](resources/troubleshooting.md)
**Official documentation:**
- [Anthropic Skills Best Practices](https://docs.claude.com/en/docs/agents-and-tools/agent-skills/best-practices)
- [Claude Code Hooks Guide](https://docs.claude.com/en/docs/claude-code/hooks-guide)
**When stuck:**
- Skills still not activating → Check Settings > Capabilities
- Hook not working → Check ~/.claude/logs/hooks.log
- Over-activation → Reduce keywords, increase priority thresholds
- Under-activation → Add more keywords, broaden intent patterns
</resources>

View File

@@ -0,0 +1,655 @@
# Complete Hook Implementation for Skills Auto-Activation
This guide provides complete, production-ready code for implementing skills auto-activation using Claude Code hooks.
## Complete File Structure
```
~/.claude/
├── hooks/
│ └── user-prompt-submit/
│ └── skill-activator.js # Main hook script
├── skill-rules.json # Skill activation rules
└── hooks.json # Hook configuration
```
## Step 1: Create skill-rules.json
**Location:** `~/.claude/skill-rules.json`
```json
{
"backend-dev-guidelines": {
"type": "domain",
"enforcement": "suggest",
"priority": "high",
"promptTriggers": {
"keywords": [
"backend",
"controller",
"service",
"repository",
"API",
"endpoint",
"route",
"middleware",
"database",
"prisma",
"sequelize"
],
"intentPatterns": [
"(create|add|build|implement).*?(route|endpoint|controller|service|repository)",
"(how to|best practice|pattern|guide).*?(backend|API|database|server)",
"(setup|configure|initialize).*?(database|ORM|API)",
"implement.*(authentication|authorization|auth|security)",
"(error|exception).*(handling|catching|logging)"
]
},
"fileTriggers": {
"pathPatterns": [
"backend/**/*.ts",
"backend/**/*.js",
"server/**/*.ts",
"api/**/*.ts",
"src/controllers/**",
"src/services/**",
"src/repositories/**"
],
"contentPatterns": [
"express\\.Router",
"export.*Controller",
"export.*Service",
"export.*Repository",
"prisma\\.",
"@Controller",
"@Injectable"
]
}
},
"frontend-dev-guidelines": {
"type": "domain",
"enforcement": "suggest",
"priority": "high",
"promptTriggers": {
"keywords": [
"frontend",
"component",
"react",
"UI",
"layout",
"page",
"view",
"hooks",
"state",
"props",
"routing",
"navigation"
],
"intentPatterns": [
"(create|build|add|implement).*?(component|page|layout|view|screen)",
"(how to|pattern|best practice).*?(react|hooks|state|context|props)",
"(style|CSS|design).*?(component|layout|UI)",
"implement.*?(routing|navigation|route)",
"(state|data).*(management|flow|handling)"
]
},
"fileTriggers": {
"pathPatterns": [
"src/components/**/*.tsx",
"src/components/**/*.jsx",
"src/pages/**/*.tsx",
"src/views/**/*.tsx",
"frontend/**/*.tsx"
],
"contentPatterns": [
"import.*from ['\"]react",
"export.*function.*Component",
"export.*default.*function",
"useState",
"useEffect",
"React\\.FC"
]
}
},
"test-driven-development": {
"type": "process",
"enforcement": "suggest",
"priority": "high",
"promptTriggers": {
"keywords": [
"test",
"testing",
"TDD",
"spec",
"unit test",
"integration test",
"e2e",
"jest",
"vitest",
"mocha"
],
"intentPatterns": [
"(write|add|create|implement).*?(test|spec|unit test)",
"test.*(first|before|TDD|driven)",
"(bug|fix|issue).*?(reproduce|test)",
"(coverage|untested).*?(code|function)",
"(mock|stub|spy).*?(function|API|service)"
]
},
"fileTriggers": {
"pathPatterns": [
"**/*.test.ts",
"**/*.test.js",
"**/*.spec.ts",
"**/*.spec.js",
"**/__tests__/**",
"**/test/**"
],
"contentPatterns": [
"describe\\(",
"it\\(",
"test\\(",
"expect\\(",
"jest\\.fn",
"beforeEach\\(",
"afterEach\\("
]
}
},
"debugging-with-tools": {
"type": "process",
"enforcement": "suggest",
"priority": "medium",
"promptTriggers": {
"keywords": [
"debug",
"debugging",
"error",
"bug",
"crash",
"fails",
"broken",
"not working",
"issue",
"problem"
],
"intentPatterns": [
"(debug|fix|solve|investigate|troubleshoot).*?(error|bug|issue|problem)",
"(why|what).*?(failing|broken|not working|crashing)",
"(find|locate|identify).*?(bug|issue|problem|root cause)",
"reproduce.*(bug|issue|error)"
]
}
},
"refactoring-safely": {
"type": "process",
"enforcement": "suggest",
"priority": "medium",
"promptTriggers": {
"keywords": [
"refactor",
"refactoring",
"cleanup",
"improve",
"restructure",
"reorganize",
"simplify"
],
"intentPatterns": [
"(refactor|clean up|improve|restructure).*?(code|function|class|component)",
"(extract|split|separate).*?(function|method|component|logic)",
"(rename|move|relocate).*?(file|function|class)",
"remove.*(duplication|duplicate|repeated code)"
]
}
}
}
```
## Step 2: Create Hook Script
**Location:** `~/.claude/hooks/user-prompt-submit/skill-activator.js`
```javascript
#!/usr/bin/env node
const fs = require('fs');
const path = require('path');
// Configuration
const CONFIG = {
rulesPath: process.env.SKILL_RULES || path.join(process.env.HOME, '.claude/skill-rules.json'),
maxSkills: 3, // Limit to avoid context overload
debugMode: process.env.DEBUG === 'true'
};
// Load skill rules
function loadRules() {
try {
const content = fs.readFileSync(CONFIG.rulesPath, 'utf8');
return JSON.parse(content);
} catch (error) {
if (CONFIG.debugMode) {
console.error('Failed to load skill rules:', error.message);
}
return {};
}
}
// Read prompt from stdin
function readPrompt() {
return new Promise((resolve) => {
let data = '';
process.stdin.on('data', chunk => data += chunk);
process.stdin.on('end', () => {
try {
resolve(JSON.parse(data));
} catch (error) {
if (CONFIG.debugMode) {
console.error('Failed to parse prompt:', error.message);
}
resolve({ text: '' });
}
});
});
}
// Analyze prompt for skill matches
function analyzePrompt(promptText, rules) {
const lowerText = promptText.toLowerCase();
const activated = [];
for (const [skillName, config] of Object.entries(rules)) {
let matched = false;
let matchReason = '';
// Check keyword triggers
if (config.promptTriggers?.keywords) {
for (const keyword of config.promptTriggers.keywords) {
if (lowerText.includes(keyword.toLowerCase())) {
matched = true;
matchReason = `keyword: "${keyword}"`;
break;
}
}
}
// Check intent pattern triggers
if (!matched && config.promptTriggers?.intentPatterns) {
for (const pattern of config.promptTriggers.intentPatterns) {
try {
if (new RegExp(pattern, 'i').test(promptText)) {
matched = true;
matchReason = `intent pattern: "${pattern}"`;
break;
}
} catch (error) {
if (CONFIG.debugMode) {
console.error(`Invalid pattern "${pattern}":`, error.message);
}
}
}
}
if (matched) {
activated.push({
skill: skillName,
priority: config.priority || 'medium',
reason: matchReason,
type: config.type || 'general'
});
}
}
// Sort by priority (high > medium > low)
const priorityOrder = { high: 0, medium: 1, low: 2 };
activated.sort((a, b) => {
const priorityDiff = priorityOrder[a.priority] - priorityOrder[b.priority];
if (priorityDiff !== 0) return priorityDiff;
// Secondary sort: process types before domain types
const typeOrder = { process: 0, domain: 1, general: 2 };
return (typeOrder[a.type] || 2) - (typeOrder[b.type] || 2);
});
// Limit to max skills
return activated.slice(0, CONFIG.maxSkills);
}
// Generate activation context
function generateContext(skills) {
if (skills.length === 0) {
return null;
}
const lines = [
'',
'━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━',
'🎯 SKILL ACTIVATION CHECK',
'━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━',
'',
'Relevant skills for this prompt:',
''
];
for (const skill of skills) {
const emoji = skill.priority === 'high' ? '⭐' : skill.priority === 'medium' ? '📌' : '💡';
lines.push(`${emoji} **${skill.skill}** (${skill.priority} priority)`);
if (CONFIG.debugMode) {
lines.push(` Matched: ${skill.reason}`);
}
}
lines.push('');
lines.push('Before responding, check if any of these skills should be used.');
lines.push('━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━');
lines.push('');
return lines.join('\n');
}
// Main execution
async function main() {
try {
// Load rules
const rules = loadRules();
if (Object.keys(rules).length === 0) {
if (CONFIG.debugMode) {
console.error('No rules loaded');
}
console.log(JSON.stringify({ decision: 'approve' }));
return;
}
// Read prompt
const prompt = await readPrompt();
if (!prompt.text || prompt.text.trim() === '') {
console.log(JSON.stringify({ decision: 'approve' }));
return;
}
// Analyze prompt
const activatedSkills = analyzePrompt(prompt.text, rules);
// Generate response
if (activatedSkills.length > 0) {
const context = generateContext(activatedSkills);
if (CONFIG.debugMode) {
console.error('Activated skills:', activatedSkills.map(s => s.skill).join(', '));
}
console.log(JSON.stringify({
decision: 'approve',
additionalContext: context
}));
} else {
if (CONFIG.debugMode) {
console.error('No skills activated');
}
console.log(JSON.stringify({ decision: 'approve' }));
}
} catch (error) {
if (CONFIG.debugMode) {
console.error('Hook error:', error.message, error.stack);
}
// Always approve on error
console.log(JSON.stringify({ decision: 'approve' }));
}
}
main();
```
## Step 3: Make Hook Executable
```bash
chmod +x ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
## Step 4: Configure Hook
**Location:** `~/.claude/hooks.json`
```json
{
"hooks": [
{
"event": "UserPromptSubmit",
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js",
"description": "Analyze prompt and inject skill activation reminders",
"blocking": false,
"timeout": 1000
}
]
}
```
## Step 5: Test the Hook
### Test 1: Keyword Matching
```bash
# Create test prompt
echo '{"text": "How do I create a new API endpoint?"}' | \
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
**Expected output:**
```json
{
"additionalContext": "\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n🎯 SKILL ACTIVATION CHECK\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n\nRelevant skills for this prompt:\n\n⭐ **backend-dev-guidelines** (high priority)\n\nBefore responding, check if any of these skills should be used.\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
}
```
### Test 2: Intent Pattern Matching
```bash
echo '{"text": "I want to build a new React component"}' | \
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
**Expected:** Should activate frontend-dev-guidelines
### Test 3: Multiple Skills
```bash
echo '{"text": "Write a test for the API endpoint"}' | \
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
**Expected:** Should activate test-driven-development and backend-dev-guidelines
### Test 4: Debug Mode
```bash
echo '{"text": "How do I create a component?"}' | \
DEBUG=true node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
```
**Expected:** Debug output showing which skills matched and why
## Advanced: File-Based Triggers
To add file-based triggers, extend the hook to check which files are being edited:
```javascript
// Add to skill-activator.js
// Get recently edited files from Claude Code context
function getRecentFiles(prompt) {
// Claude Code provides context about files being edited
// This would come from the prompt context or a separate tracking mechanism
return prompt.files || [];
}
// Check file triggers
function checkFileTriggers(files, config) {
if (!files || files.length === 0) return false;
if (!config.fileTriggers) return false;
// Check path patterns
if (config.fileTriggers.pathPatterns) {
for (const file of files) {
for (const pattern of config.fileTriggers.pathPatterns) {
// Convert glob pattern to regex
const regex = globToRegex(pattern);
if (regex.test(file)) {
return true;
}
}
}
}
// Check content patterns (would require reading files)
// Omitted for performance - better to check in PostToolUse hook
return false;
}
// Convert glob pattern to regex
function globToRegex(glob) {
const regex = glob
.replace(/\./g, '\\.')  // escape literal dots so "*.ts" cannot match "index_ts"
.replace(/\*\*/g, '___DOUBLE_STAR___')
.replace(/\*/g, '[^/]*')
.replace(/___DOUBLE_STAR___/g, '.*')
.replace(/\?/g, '.');
return new RegExp(`^${regex}$`);
}
```
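The content-pattern check is omitted above on purpose. If you do want it in this hook anyway, here is a rough sketch; it reuses the `fs` already required at the top of skill-activator.js and assumes the file list stays small, since every file read adds latency to prompt processing:

```javascript
// Sketch: content-pattern triggers by reading the edited files directly.
function checkContentTriggers(files, config) {
  const patterns = config.fileTriggers?.contentPatterns || [];
  if (!files || files.length === 0 || patterns.length === 0) return false;

  return files.some(file => {
    let content;
    try {
      content = fs.readFileSync(file, 'utf8');
    } catch {
      return false; // unreadable or deleted file - skip it
    }
    return patterns.some(pattern => {
      try {
        return new RegExp(pattern).test(content);
      } catch {
        return false; // invalid regex in skill-rules.json - skip it
      }
    });
  });
}
```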
## Troubleshooting
### Hook Not Running
**Check:**
```bash
# Verify hook is configured
cat ~/.claude/hooks.json
# Test hook manually
echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
# Check Claude Code logs
tail -f ~/.claude/logs/hooks.log
```
### No Skills Activating
**Enable debug mode:**
```bash
DEBUG=true node ~/.claude/hooks/user-prompt-submit/skill-activator.js < test-prompt.json
```
**Common causes:**
- skill-rules.json not found or invalid
- Keywords don't match (check casing, spelling)
- Patterns have regex errors
- Hook timing out (increase timeout)
### Too Many Skills Activating
**Adjust maxSkills:**
```javascript
const CONFIG = {
maxSkills: 2, // Reduce from 3
// ...
};
```
**Or tighten triggers:**
```json
{
"backend-dev-guidelines": {
"priority": "high", // Only high priority skills
"promptTriggers": {
"keywords": ["controller", "service"], // More specific keywords
// ...
}
}
}
```
### Performance Issues
**If hook is slow (>500ms):**
1. Reduce regex complexity
2. Limit number of patterns
3. Cache compiled regex patterns
4. Profile with:
```bash
time echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
## Maintenance
### Monthly Review
```bash
# Check activation frequency
grep "Activated skills" ~/.claude/hooks/debug.log | sort | uniq -c
# Find prompts that didn't activate any skills
grep "No skills activated" ~/.claude/hooks/debug.log
```
### Updating Rules
When adding new skills:
1. Add to skill-rules.json
2. Test activation with sample prompts
3. Observe for false positives/negatives
4. Refine patterns based on usage
### Version Control
```bash
# Track rules in git
cd ~/.claude
git init
git add skill-rules.json hooks/
git commit -m "Initial skill activation rules"
```
## Integration with Other Hooks
The skill activator can work alongside other hooks:
```json
{
"hooks": [
{
"event": "UserPromptSubmit",
"command": "~/.claude/hooks/user-prompt-submit/00-log-prompt.sh",
"description": "Log prompts for analysis",
"blocking": false
},
{
"event": "UserPromptSubmit",
"command": "~/.claude/hooks/user-prompt-submit/10-skill-activator.js",
"description": "Activate relevant skills",
"blocking": false
}
]
}
```
**Naming convention:** Use numeric prefixes (00-, 10-, 20-) to control execution order.
## Performance Benchmarks
**Target performance:**
- Keyword matching: <50ms
- Intent pattern matching: <200ms
- Total hook execution: <500ms
**Actual performance (typical):**
- 2-3 skills: ~100-300ms
- 5+ skills: ~300-500ms
If performance degrades, profile and optimize patterns.

View File

@@ -0,0 +1,443 @@
# Skill Rules Examples
Example configurations for common skill types and scenarios.
## Domain-Specific Skills
### Backend Development
```json
{
"backend-dev-guidelines": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": [
"backend", "server", "API", "endpoint", "route",
"controller", "service", "repository",
"middleware", "authentication", "authorization"
],
"intentPatterns": [
"(create|build|implement|add).*?(API|endpoint|route|controller)",
"how.*(backend|server|API)",
"(setup|configure).*(server|backend|API)",
"implement.*(auth|security)"
]
},
"fileTriggers": {
"pathPatterns": [
"backend/**/*.ts",
"server/**/*.ts",
"src/api/**"
]
}
}
}
```
### Frontend Development
```json
{
"frontend-dev-guidelines": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": [
"frontend", "UI", "component", "react", "vue", "angular",
"page", "layout", "view", "hooks", "state"
],
"intentPatterns": [
"(create|build).*?(component|page|layout)",
"how.*(react|hooks|state)",
"(style|design).*?(component|UI)"
]
},
"fileTriggers": {
"pathPatterns": [
"src/components/**/*.tsx",
"src/pages/**/*.tsx"
]
}
}
}
```
## Process Skills
### Test-Driven Development
```json
{
"test-driven-development": {
"type": "process",
"priority": "high",
"promptTriggers": {
"keywords": ["test", "TDD", "testing", "spec", "jest", "vitest"],
"intentPatterns": [
"(write|create|add).*?test",
"test.*first",
"reproduce.*(bug|error)"
]
},
"fileTriggers": {
"pathPatterns": [
"**/*.test.ts",
"**/*.spec.ts",
"**/__tests__/**"
]
}
}
}
```
### Code Review
```json
{
"code-review": {
"type": "process",
"priority": "medium",
"promptTriggers": {
"keywords": ["review", "check", "verify", "audit", "quality"],
"intentPatterns": [
"review.*(code|changes|implementation)",
"(check|verify).*(quality|standards|best practices)"
]
}
}
}
```
## Technology-Specific Skills
### Database/Prisma
```json
{
"database-prisma": {
"type": "technology",
"priority": "high",
"promptTriggers": {
"keywords": [
"database", "prisma", "schema", "migration",
"query", "orm", "model"
],
"intentPatterns": [
"(create|update|modify).*?(schema|model|migration)",
"(query|fetch|get).*?database"
]
},
"fileTriggers": {
"pathPatterns": [
"**/prisma/**",
"**/*.prisma"
]
}
}
}
```
### Docker/DevOps
```json
{
"devops-docker": {
"type": "technology",
"priority": "medium",
"promptTriggers": {
"keywords": [
"docker", "dockerfile", "container",
"deployment", "CI/CD", "kubernetes"
],
"intentPatterns": [
"(create|build|configure).*?(docker|container)",
"(deploy|release|publish)"
]
},
"fileTriggers": {
"pathPatterns": [
"**/Dockerfile",
"**/.github/workflows/**",
"**/docker-compose.yml"
]
}
}
}
```
## Project-Specific Skills
### Feature-Specific (E-commerce Cart)
```json
{
"cart-feature": {
"type": "feature",
"priority": "medium",
"promptTriggers": {
"keywords": [
"cart", "shopping cart", "basket",
"add to cart", "checkout"
],
"intentPatterns": [
"(implement|create|modify).*?cart",
"cart.*(functionality|feature|logic)"
]
},
"fileTriggers": {
"pathPatterns": [
"src/features/cart/**",
"backend/cart-service/**"
]
}
}
}
```
## Priority-Based Configuration
### High Priority (Always Check)
```json
{
"critical-security": {
"type": "security",
"priority": "high",
"enforcement": "suggest",
"promptTriggers": {
"keywords": [
"security", "vulnerability", "authentication", "authorization",
"SQL injection", "XSS", "CSRF", "password", "token"
],
"intentPatterns": [
"secur(e|ity)",
"vulnerab(le|ility)",
"(auth|password|token).*(implement|handle|store)"
]
}
}
}
```
### Medium Priority (Contextual)
```json
{
"performance-optimization": {
"type": "optimization",
"priority": "medium",
"promptTriggers": {
"keywords": [
"performance", "optimize", "slow", "cache",
"memory", "speed", "latency"
],
"intentPatterns": [
"(improve|optimize).*(performance|speed)",
"(reduce|minimize).*(latency|memory|time)"
]
}
}
}
```
### Low Priority (Optional)
```json
{
"documentation-guide": {
"type": "documentation",
"priority": "low",
"promptTriggers": {
"keywords": [
"documentation", "docs", "comments", "readme",
"docstring", "jsdoc"
],
"intentPatterns": [
"(write|update|create).*?(documentation|docs)",
"document.*(API|function|component)"
]
}
}
}
```
## Multi-Repo Configuration
For projects with multiple repositories:
```json
{
"frontend-mobile": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": ["mobile", "ios", "android", "react native"],
"intentPatterns": ["(create|build).*?(screen|component)"]
},
"fileTriggers": {
"pathPatterns": ["/mobile/**", "/apps/mobile/**"]
}
},
"frontend-web": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": ["web", "website", "react", "nextjs"],
"intentPatterns": ["(create|build).*?(page|component)"]
},
"fileTriggers": {
"pathPatterns": ["/web/**", "/apps/web/**"]
}
}
}
```
## Advanced Pattern Matching
### Negative Patterns (Exclude)
```json
{
  "backend-dev-guidelines": {
    "promptTriggers": {
      "intentPatterns": [
        "^(?!.*test).*backend"
      ]
    }
  }
}
```
Matches prompts that mention "backend" unless "test" also appears. The `^` anchor is required so the lookahead scans the whole prompt, and a plain `backend` keyword or pattern must not be listed alongside it - either one would activate the skill and defeat the exclusion.
### Compound Patterns
```json
{
"database-migration": {
"promptTriggers": {
"intentPatterns": [
"(create|generate|run).*(migration|schema change)",
"(add|remove|modify).*(column|table|index)"
]
}
}
}
```
### Context-Aware Patterns
```json
{
"error-handling": {
"promptTriggers": {
"keywords": ["error", "exception", "try", "catch"],
"intentPatterns": [
"(handle|catch|throw).*(error|exception)",
"error.*handling"
]
}
}
}
```
## Enforcement Levels
```json
{
  "critical-skill": {
    "enforcement": "block",
    "priority": "high"
  },
  "recommended-skill": {
    "enforcement": "suggest",
    "priority": "medium"
  },
  "optional-skill": {
    "enforcement": "optional",
    "priority": "low"
  }
}
```
- `block` - block if the skill is not used (future feature)
- `suggest` - suggest usage
- `optional` - mention that the skill is available
## Full Example Configuration
Complete configuration for a full-stack TypeScript project:
```json
{
"backend-dev-guidelines": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": ["backend", "API", "endpoint", "controller", "service"],
"intentPatterns": [
"(create|add|implement).*?(API|endpoint|route|controller|service)",
"how.*(backend|server|API)"
]
},
"fileTriggers": {
"pathPatterns": ["backend/**/*.ts", "server/**/*.ts"]
}
},
"frontend-dev-guidelines": {
"type": "domain",
"priority": "high",
"promptTriggers": {
"keywords": ["frontend", "component", "react", "UI"],
"intentPatterns": ["(create|build).*?(component|page)"]
},
"fileTriggers": {
"pathPatterns": ["src/components/**/*.tsx", "src/pages/**/*.tsx"]
}
},
"test-driven-development": {
"type": "process",
"priority": "high",
"promptTriggers": {
"keywords": ["test", "TDD", "testing"],
"intentPatterns": ["(write|create).*?test", "test.*first"]
},
"fileTriggers": {
"pathPatterns": ["**/*.test.ts", "**/*.spec.ts"]
}
},
"database-prisma": {
"type": "technology",
"priority": "high",
"promptTriggers": {
"keywords": ["database", "prisma", "schema", "migration"],
"intentPatterns": ["(create|modify).*?(schema|migration)"]
},
"fileTriggers": {
"pathPatterns": ["**/prisma/**"]
}
},
"debugging-with-tools": {
"type": "process",
"priority": "medium",
"promptTriggers": {
"keywords": ["debug", "bug", "error", "broken", "not working"],
"intentPatterns": ["(debug|fix|solve).*?(error|bug|issue)"]
}
},
"refactoring-safely": {
"type": "process",
"priority": "medium",
"promptTriggers": {
"keywords": ["refactor", "cleanup", "improve", "restructure"],
"intentPatterns": ["(refactor|clean up|improve).*?code"]
}
}
}
```
## Tips for Creating Rules
1. **Start broad, refine narrow** - Begin with general keywords, narrow based on false positives
2. **Use priority wisely** - High priority for critical skills only
3. **Test patterns** - Validate regex patterns before deploying (see the sketch after this list)
4. **Monitor activation** - Track which skills activate and adjust
5. **Keep it maintainable** - Comment complex patterns
6. **Version control** - Track changes to rules over time
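For tip 3, a small Node script can catch regexes that will not compile and show which of your sample prompts would activate each skill. This is a rough sketch assuming the default `~/.claude/skill-rules.json` location; the script name and sample prompts are illustrative:

```javascript
// check-rules.js - sanity-check skill-rules.json before deploying a change.
// Usage: node check-rules.js "create a backend endpoint" "build a react component"
const fs = require('fs');
const path = require('path');

const rulesPath = path.join(process.env.HOME, '.claude/skill-rules.json');
const rules = JSON.parse(fs.readFileSync(rulesPath, 'utf8'));
const samplePrompts = process.argv.slice(2);

for (const [skill, config] of Object.entries(rules)) {
  // 1. Every intent pattern must compile
  for (const pattern of config.promptTriggers?.intentPatterns || []) {
    try {
      new RegExp(pattern, 'i');
    } catch (err) {
      console.error(`${skill}: invalid pattern "${pattern}" - ${err.message}`);
    }
  }
  // 2. Report which sample prompts would activate this skill
  for (const prompt of samplePrompts) {
    const lower = prompt.toLowerCase();
    const byKeyword = (config.promptTriggers?.keywords || [])
      .some(k => lower.includes(k.toLowerCase()));
    const byIntent = (config.promptTriggers?.intentPatterns || [])
      .some(p => { try { return new RegExp(p, 'i').test(prompt); } catch { return false; } });
    if (byKeyword || byIntent) {
      console.log(`"${prompt}" -> ${skill}`);
    }
  }
}
```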

View File

@@ -0,0 +1,557 @@
# Troubleshooting Skills Auto-Activation
Common issues and solutions for skills auto-activation system.
## Problem: Hook Not Running At All
### Symptoms
- No skill activation messages appear
- Prompts process normally without injected context
### Diagnosis
**Step 1: Check hook configuration**
```bash
cat ~/.claude/hooks.json
```
Should contain:
```json
{
"hooks": [
{
"event": "UserPromptSubmit",
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js"
}
]
}
```
**Step 2: Test hook manually**
```bash
echo '{"text": "test backend endpoint"}' | \
node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
Should output JSON with `decision` and possibly `additionalContext`.
**Step 3: Check file permissions**
```bash
ls -l ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
Should be executable (`-rwxr-xr-x`). If not:
```bash
chmod +x ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
**Step 4: Check Claude Code logs**
```bash
tail -f ~/.claude/logs/hooks.log
```
Look for errors related to skill-activator.
### Solutions
**Solution 1: Reinstall hook**
```bash
mkdir -p ~/.claude/hooks/user-prompt-submit
cp skill-activator.js ~/.claude/hooks/user-prompt-submit/
chmod +x ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
**Solution 2: Verify Node.js**
```bash
which node
node --version
```
Ensure Node.js is installed and in PATH.
**Solution 3: Increase the hook timeout**
```json
{
"hooks": [
{
"event": "UserPromptSubmit",
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js",
"timeout": 2000 // Increase if needed
}
]
}
```
## Problem: No Skills Activating
### Symptoms
- Hook runs successfully
- No skills appear in activation messages
- Debug shows "No skills activated"
### Diagnosis
**Enable debug mode:**
```bash
echo '{"text": "your test prompt"}' | \
DEBUG=true node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
```
**Check for:**
- "No rules loaded" → skill-rules.json not found
- "No skills activated" → Keywords/patterns don't match
### Solutions
**Solution 1: Verify skill-rules.json location**
```bash
cat ~/.claude/skill-rules.json
```
If not found:
```bash
cp skill-rules.json ~/.claude/skill-rules.json
```
**Solution 2: Test with known keyword**
```bash
echo '{"text": "create backend controller"}' | \
SKILL_RULES=~/.claude/skill-rules.json \
DEBUG=true \
node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
```
Should match "backend-dev-guidelines" if configured.
**Solution 3: Check JSON syntax**
```bash
cat ~/.claude/skill-rules.json | jq '.'
```
If errors, fix JSON syntax.
**Solution 4: Simplify rules for testing**
```json
{
"test-skill": {
"type": "test",
"priority": "high",
"promptTriggers": {
"keywords": ["test"]
}
}
}
```
Test with:
```bash
echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
## Problem: Wrong Skills Activating
### Symptoms
- Skills activate on irrelevant prompts
- Too many false positives
### Diagnosis
**Enable debug to see why skills matched:**
```bash
echo '{"text": "your prompt"}' | \
DEBUG=true node ~/.claude/hooks/user-prompt-submit/skill-activator.js 2>&1
```
Look for "Matched: keyword" or "Matched: intent pattern" to see why.
### Solutions
**Solution 1: Tighten keywords**
Before:
```json
{
"keywords": ["api", "test", "code"]
}
```
After (more specific):
```json
{
"keywords": ["API endpoint", "integration test", "refactor code"]
}
```
**Solution 2: Use negative patterns**
```json
{
  "intentPatterns": [
    "^(?!.*test).*backend"
  ]
}
```
Matches prompts containing "backend" only when "test" does not appear anywhere in the prompt; without the `^` anchor the lookahead can be bypassed at a later match position.
**Solution 3: Increase priority thresholds**
```json
{
  "test-skill": {
    "priority": "low"
  }
}
```
Low-priority skills are dropped first when more skills match than `maxSkills` allows.
**Solution 4: Reduce maxSkills**
In skill-activator.js:
```javascript
const CONFIG = {
maxSkills: 2, // Reduce from 3
};
```
## Problem: Hook Is Slow
### Symptoms
- Noticeable delay before Claude responds
- Hook takes >1 second
### Diagnosis
**Measure hook performance:**
```bash
time echo '{"text": "test"}' | node ~/.claude/hooks/user-prompt-submit/skill-activator.js
```
Should be <500ms. If slower, diagnose:
**Check number of rules:**
```bash
cat ~/.claude/skill-rules.json | jq 'keys | length'
```
More than 10 rules may slow down.
**Check pattern complexity:**
```bash
cat ~/.claude/skill-rules.json | jq '.[].promptTriggers.intentPatterns'
```
Complex regex patterns slow matching.
### Solutions
**Solution 1: Optimize regex patterns**
Before (slow):
```json
{
"intentPatterns": [
".*create.*backend.*endpoint.*"
]
}
```
After (faster):
```json
{
"intentPatterns": [
"(create|build).*(backend|API).*(endpoint|route)"
]
}
```
**Solution 2: Cache compiled patterns**
Modify hook to compile patterns once:
```javascript
const compiledPatterns = new Map();
function getCompiledPattern(pattern) {
if (!compiledPatterns.has(pattern)) {
compiledPatterns.set(pattern, new RegExp(pattern, 'i'));
}
return compiledPatterns.get(pattern);
}
```
**Solution 3: Reduce number of rules**
Remove low-priority or rarely-used skills.
**Solution 4: Parallelize pattern matching**
For advanced users, use worker threads to match patterns in parallel.
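One possible shape for this, using Node's built-in `worker_threads` as a standalone module (not pasted directly into skill-activator.js, since each worker re-runs its own file). The module and function names are illustrative, and worker startup itself costs tens of milliseconds - measure before adopting it, because for small rule sets the plain loop is usually faster:

```javascript
// parallel-matcher.js - sketch only; names are illustrative.
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

// Match one chunk of [name, config] rule entries against the prompt.
function matchChunk(chunk, promptText) {
  const lower = promptText.toLowerCase();
  return chunk
    .filter(([, cfg]) =>
      (cfg.promptTriggers?.keywords || []).some(k => lower.includes(k.toLowerCase())) ||
      (cfg.promptTriggers?.intentPatterns || []).some(p => {
        try { return new RegExp(p, 'i').test(promptText); } catch { return false; }
      }))
    .map(([name]) => name);
}

if (isMainThread) {
  // Split the rules across N workers and merge the matched skill names.
  module.exports = function analyzeInParallel(rules, promptText, workers = 2) {
    const entries = Object.entries(rules);
    const size = Math.ceil(entries.length / workers);
    const jobs = [];
    for (let i = 0; i < entries.length; i += size) {
      jobs.push(new Promise((resolve, reject) => {
        const w = new Worker(__filename, {
          workerData: { chunk: entries.slice(i, i + size), promptText }
        });
        w.once('message', resolve);
        w.once('error', reject);
      }));
    }
    return Promise.all(jobs).then(results => results.flat());
  };
} else {
  parentPort.postMessage(matchChunk(workerData.chunk, workerData.promptText));
}
```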
## Problem: Skills Still Don't Activate in Claude
### Symptoms
- Hook injects activation message
- Claude still doesn't use the skills
### Diagnosis
This means the hook is working, but Claude is ignoring the suggestion.
**Check:**
1. Are skills actually installed?
2. Does Claude have access to read skills?
3. Are skill descriptions clear?
### Solutions
**Solution 1: Make activation message stronger**
In skill-activator.js, change:
```javascript
'Before responding, check if any of these skills should be used.'
```
To:
```javascript
'⚠️ IMPORTANT: You MUST check these skills before responding. Use the Skill tool to load them.'
```
**Solution 2: Block until skills loaded**
Change hook to blocking mode (experimental - use cautiously):
```json
{
"hooks": [
{
"event": "UserPromptSubmit",
"command": "~/.claude/hooks/user-prompt-submit/skill-activator.js",
"blocking": true // ⚠️ Experimental
}
]
}
```
**Solution 3: Improve skill descriptions**
Ensure skill descriptions are specific:
```yaml
name: backend-dev-guidelines
description: Use when creating API routes, controllers, services, or repositories - enforces TypeScript patterns, Prisma repository pattern, and Sentry error handling
```
**Solution 4: Reference skills in CLAUDE.md**
Add to project's CLAUDE.md:
```markdown
## Available Skills
- backend-dev-guidelines: Use for all backend code
- frontend-dev-guidelines: Use for all frontend code
- hyperpowers:test-driven-development: Use when writing tests
```
## Problem: Hook Crashes Claude Code
### Symptoms
- Claude Code freezes or crashes after hook execution
- Error in hooks.log
### Diagnosis
**Check error logs:**
```bash
tail -50 ~/.claude/logs/hooks.log
```
Look for errors related to skill-activator.
**Common causes:**
- Infinite loop in hook
- Memory leak
- Unhandled promise rejection
- Blocking operation
### Solutions
**Solution 1: Add error handling**
```javascript
async function main() {
try {
// ... hook logic
} catch (error) {
console.error('Hook error:', error.message);
// Always return approve on error
console.log(JSON.stringify({ decision: 'approve' }));
}
}
```
**Solution 2: Add timeout protection**
```javascript
const timeout = setTimeout(() => {
console.log(JSON.stringify({ decision: 'approve' }));
process.exit(0);
}, 900); // Exit before hook timeout
// ... normal hook logic runs here ...
// Then clear the timeout once the hook has written its own output
clearTimeout(timeout);
```
**Solution 3: Test hook in isolation**
```bash
# Run hook with various inputs
for prompt in "test" "backend" "frontend" "debug"; do
echo "Testing: $prompt"
echo "{\"text\": \"$prompt\"}" | \
timeout 2s node ~/.claude/hooks/user-prompt-submit/skill-activator.js
done
```
**Solution 4: Simplify hook**
Remove complex logic and test minimal version:
```javascript
// Minimal hook for testing
console.log(JSON.stringify({
decision: 'approve',
additionalContext: '🎯 Test message'
}));
```
## Problem: Context Overload
### Symptoms
- Too many skill activation messages
- Context window fills quickly
- Claude seems overwhelmed
### Solutions
**Solution 1: Limit activated skills**
```javascript
const CONFIG = {
maxSkills: 1, // Only top match
};
```
**Solution 2: Use priorities strictly**
```json
{
"critical-skill": {
"priority": "high" // Only high priority
},
"optional-skill": {
"priority": "low" // Remove low priority
}
}
```
**Solution 3: Shorten activation message**
```javascript
function generateContext(skills) {
return `🎯 Use: ${skills.map(s => s.skill).join(', ')}`;
}
```
## Problem: Inconsistent Activation
### Symptoms
- Sometimes activates, sometimes doesn't
- Same prompt gives different results
### Diagnosis
**This is expected due to:**
- Prompt variations (punctuation, wording)
- Context differences (files being edited)
- Keyword order
### Solutions
**Solution 1: Add keyword variations**
```json
{
"keywords": [
"backend",
"back end",
"back-end",
"server side",
"server-side"
]
}
```
**Solution 2: Use more patterns**
```json
{
"intentPatterns": [
"create.*backend",
"backend.*create",
"build.*API",
"API.*build"
]
}
```
**Solution 3: Log all prompts for analysis**
```javascript
// Add to hook
fs.appendFileSync(
path.join(process.env.HOME, '.claude/prompt-log.txt'),
`${new Date().toISOString()} | ${prompt.text}\n`
);
```
Analyze monthly:
```bash
grep "backend" ~/.claude/prompt-log.txt | wc -l
```
## General Debugging Tips
**Enable full debug logging:**
```javascript
// Add to hook
const logFile = path.join(process.env.HOME, '.claude/hook-debug.log');
function debug(msg) {
fs.appendFileSync(logFile, `${new Date().toISOString()} ${msg}\n`);
}
debug(`Analyzing prompt: ${prompt.text}`);
debug(`Activated skills: ${activatedSkills.map(s => s.skill).join(', ')}`);
```
**Test with controlled inputs:**
```bash
# Create test suite
cat > test-prompts.json <<EOF
[
{"text": "create backend endpoint", "expected": ["backend-dev-guidelines"]},
{"text": "build react component", "expected": ["frontend-dev-guidelines"]},
{"text": "write test for API", "expected": ["test-driven-development"]}
]
EOF
# Run tests
node test-hook.js
```
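`test-hook.js` is referenced above but not included anywhere in this plugin. A minimal sketch of what it might look like, assuming the hook path and the `test-prompts.json` created by the commands above, and that expected skill names appear in the hook's `additionalContext`:

```javascript
// test-hook.js - sketch: run each test prompt through the hook and compare activations.
const { execFileSync } = require('child_process');
const fs = require('fs');
const path = require('path');

const hook = path.join(process.env.HOME, '.claude/hooks/user-prompt-submit/skill-activator.js');
const cases = JSON.parse(fs.readFileSync('test-prompts.json', 'utf8'));

let failures = 0;
for (const { text, expected } of cases) {
  const out = execFileSync('node', [hook], { input: JSON.stringify({ text }) }).toString();
  const context = JSON.parse(out).additionalContext || '';
  const missing = expected.filter(skill => !context.includes(skill));
  if (missing.length > 0) {
    failures += 1;
    console.error(`FAIL "${text}": missing ${missing.join(', ')}`);
  } else {
    console.log(`PASS "${text}"`);
  }
}
process.exit(failures > 0 ? 1 : 0);
```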
**Monitor in production:**
```bash
# Daily summary
grep "Activated skills" ~/.claude/hook-debug.log | \
grep "$(date +%Y-%m-%d)" | \
sort | uniq -c
```
## Getting Help
If problems persist:
1. Check GitHub issues for similar problems
2. Share debug output (sanitize sensitive info)
3. Test with minimal configuration
4. Verify with official examples
**Checklist before asking for help:**
- [ ] Hook runs manually without errors
- [ ] skill-rules.json is valid JSON
- [ ] Node.js version is current (v18+)
- [ ] Debug mode shows expected behavior
- [ ] Tested with simplified configuration
- [ ] Checked Claude Code logs

View File

@@ -0,0 +1,826 @@
---
name: sre-task-refinement
description: Use when you have to refine subtasks into actionable plans ensuring that all corner cases are handled and we understand all the requirements.
---
<skill_overview>
Review bd task plans with Google Fellow SRE perspective to ensure junior engineer can execute without questions; catch edge cases, verify granularity, strengthen criteria, prevent production issues before implementation.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow the 7-category checklist exactly. Apply all categories to every task. No skipping red flag checks. Always verify no placeholder text after updates. Reject plans with critical gaps.
</rigidity_level>
<quick_reference>
| Category | Key Questions | Auto-Reject If |
|----------|---------------|----------------|
| 1. Granularity | Tasks 4-8 hours? Phases <16 hours? | Any task >16h without breakdown |
| 2. Implementability | Junior can execute without questions? | Vague language, missing details |
| 3. Success Criteria | 3+ measurable criteria per task? | Can't verify ("works well") |
| 4. Dependencies | Correct parent-child, blocking relationships? | Circular dependencies |
| 5. Safety Standards | Anti-patterns specified? Error handling? | No anti-patterns section |
| 6. Edge Cases | Empty input? Unicode? Concurrency? Failures? | No edge case consideration |
| 7. Red Flags | Placeholder text? Vague instructions? | "[detailed above]", "TODO" |
**Perspective**: Google Fellow SRE with 20+ years experience reviewing junior engineer designs.
**Time**: Don't rush - catching one gap pre-implementation saves hours of rework.
</quick_reference>
<when_to_use>
Use when:
- Reviewing bd epic/feature plans before implementation
- Need to ensure junior engineer can execute without questions
- Want to catch edge cases and failure modes upfront
- Need to verify task granularity (4-8 hour subtasks)
- After hyperpowers:writing-plans creates initial plan
- Before hyperpowers:executing-plans starts implementation
Don't use when:
- Task already being implemented (too late)
- Just need to understand existing code (use codebase-investigator)
- Debugging issues (use debugging-with-tools)
- Want to create plan from scratch (use brainstorming → writing-plans)
</when_to_use>
<the_process>
## Announcement
**Announce:** "I'm using hyperpowers:sre-task-refinement to review this plan with Google Fellow-level scrutiny."
---
## Review Checklist (Apply to Every Task)
### 1. Task Granularity
**Check:**
- [ ] No task >8 hours (subtasks) or >16 hours (phases)?
- [ ] Large phases broken into 4-8 hour subtasks?
- [ ] Each subtask independently completable?
- [ ] Each subtask has clear deliverable?
**If task >16 hours:**
- Create subtasks with `bd create`
- Link with `bd dep add child parent --type parent-child`
- Update parent to coordinator role
---
### 2. Implementability (Junior Engineer Test)
**Check:**
- [ ] Can junior engineer implement without asking questions?
- [ ] Function signatures/behaviors described, not just "implement X"?
- [ ] Test scenarios described (what they verify, not just names)?
- [ ] "Done" clearly defined with verifiable criteria?
- [ ] All file paths specified or marked "TBD: new file"?
**Red flags:**
- "Implement properly" (how?)
- "Add support" (for what exactly?)
- "Make it work" (what does working mean?)
- File paths missing or ambiguous
---
### 3. Success Criteria Quality
**Check:**
- [ ] Each task has 3+ specific, measurable success criteria?
- [ ] All criteria testable/verifiable (not subjective)?
- [ ] Includes automated verification (tests pass, clippy clean)?
- [ ] No vague criteria like "works well" or "is implemented"?
**Good criteria examples:**
- ✅ "5+ unit tests pass (valid VIN, invalid checksum, various formats)"
- ✅ "Clippy clean with no warnings"
- ✅ "Performance: <100ms for 1000 records"
**Bad criteria examples:**
- ❌ "Code is good quality"
- ❌ "Works correctly"
- ❌ "Is implemented"
---
### 4. Dependency Structure
**Check:**
- [ ] Parent-child relationships correct (epic → phases → subtasks)?
- [ ] Blocking dependencies correct (earlier work blocks later)?
- [ ] No circular dependencies?
- [ ] Dependency graph makes logical sense?
**Verify with:**
```bash
bd dep tree bd-1 # Show full dependency tree
```
---
### 5. Safety & Quality Standards
**Check:**
- [ ] Anti-patterns include unwrap/expect prohibition?
- [ ] Anti-patterns include TODO prohibition (or must have issue #)?
- [ ] Anti-patterns include stub implementation prohibition?
- [ ] Error handling requirements specified (use Result, avoid panic)?
- [ ] Test requirements specific (test names, scenarios listed)?
**Minimum anti-patterns:**
- ❌ No unwrap/expect in production code
- ❌ No TODOs without issue numbers
- ❌ No stub implementations (unimplemented!, todo!)
- ❌ No regex without catastrophic backtracking check
---
### 6. Edge Cases & Failure Modes (Fellow SRE Perspective)
**Ask for each task:**
- [ ] What happens with malformed input?
- [ ] What happens with empty/nil/zero values?
- [ ] What happens under high load/concurrency?
- [ ] What happens when dependencies fail?
- [ ] What happens with Unicode, special characters, large inputs?
- [ ] Are these edge cases addressed in the plan?
**Add to Key Considerations section:**
- Edge case descriptions
- Mitigation strategies
- References to similar code handling these cases
---
### 7. Red Flags (AUTO-REJECT)
**Check for these - if found, REJECT plan:**
- ❌ Any task >16 hours without subtask breakdown
- ❌ Vague language: "implement properly", "add support", "make it work"
- ❌ Success criteria that can't be verified: "code is good", "works well"
- ❌ Missing test specifications
- ❌ "We'll handle this later" or "TODO" in the plan itself
- ❌ No anti-patterns section
- ❌ Implementation checklist with fewer than 3 items per task
- ❌ No effort estimates
- ❌ Missing error handling considerations
- ❌ **CRITICAL: Placeholder text in design field** - "[detailed above]", "[as specified]", "[complete steps here]"
---
## Review Process
For each task in the plan:
**Step 1: Read the task**
```bash
bd show bd-3
```
**Step 2: Apply all 7 checklist categories**
- Task Granularity
- Implementability
- Success Criteria Quality
- Dependency Structure
- Safety & Quality Standards
- Edge Cases & Failure Modes
- Red Flags
**Step 3: Document findings**
Take notes:
- What's done well
- What's missing
- What's vague or ambiguous
- Hidden failure modes not addressed
- Better approaches or simplifications
**Step 4: Update the task**
Use `bd update` to add missing information:
```bash
bd update bd-3 --design "$(cat <<'EOF'
## Goal
[Original goal, preserved]
## Effort Estimate
[Updated estimate if needed]
## Success Criteria
- [ ] Existing criteria
- [ ] NEW: Added missing measurable criteria
## Implementation Checklist
[Complete checklist with file paths]
## Key Considerations (ADDED BY SRE REVIEW)
**Edge Case: Empty Input**
- What happens when input is empty string?
- MUST validate input length before processing
**Edge Case: Unicode Handling**
- What if string contains RTL or surrogate pairs?
- Use proper Unicode-aware string methods
**Performance Concern: Regex Backtracking**
- Pattern `.*[a-z]+.*` has catastrophic backtracking risk
- MUST test with pathological inputs (e.g., 10000 'a's)
- Use possessive quantifiers or bounded repetition
**Reference Implementation**
- Study src/similar/module.rs for pattern to follow
## Anti-patterns
[Original anti-patterns]
- ❌ NEW: Specific anti-pattern for this task's risks
EOF
)"
```
**IMPORTANT:** Use `--design` for full detailed description, NOT `--description` (title only).
**Step 5: Verify no placeholder text (MANDATORY)**
After updating, read back with `bd show bd-N` and verify:
- ✅ All sections contain actual content, not meta-references
- ✅ No placeholder text like "[detailed above]", "[as specified]", "[will be added]"
- ✅ Implementation steps fully written with actual code examples
- ✅ Success criteria explicit, not referencing "criteria above"
- ❌ If ANY placeholder text found: REJECT and rewrite with actual content
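Part of this read-back can be automated — a rough sketch, assuming `bd show` prints the design field to stdout (the phrases below are the common offenders, not an exhaustive list):
```bash
bd show bd-3 | rg -i '\[(detailed above|as specified|will be added|complete steps here)'
# Any match means placeholder text survived the update -- rewrite that section with real content.
```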
---
## Breaking Down Large Tasks
If task >16 hours, create subtasks:
```bash
# Create first subtask
bd create "Subtask 1: [Specific Component]" \
--type task \
--priority 1 \
--design "[Complete subtask design with all 7 categories addressed]"
# Returns bd-10
# Create second subtask
bd create "Subtask 2: [Another Component]" \
--type task \
--priority 1 \
--design "[Complete subtask design]"
# Returns bd-11
# Link subtasks to parent with parent-child relationship
bd dep add bd-10 bd-3 --type parent-child # bd-10 is child of bd-3
bd dep add bd-11 bd-3 --type parent-child # bd-11 is child of bd-3
# Add sequential dependencies if needed (LATER depends on EARLIER)
bd dep add bd-11 bd-10 # bd-11 depends on bd-10 (do bd-10 first)
# Update parent to coordinator
bd update bd-3 --design "$(cat <<'EOF'
## Goal
Coordinate implementation of [feature]. Broken into N subtasks.
## Success Criteria
- [ ] All N child subtasks closed
- [ ] Integration tests pass
- [ ] [High-level verification criteria]
EOF
)"
```
---
## Output Format
After reviewing all tasks:
```markdown
## Plan Review Results
### Epic: [Name] ([epic-id])
### Overall Assessment
[APPROVE ✅ / NEEDS REVISION ⚠️ / REJECT ❌]
### Dependency Structure Review
[Output of `bd dep tree [epic-id]`]
**Structure Quality**: [✅ Correct / ❌ Issues found]
- [Comments on parent-child relationships]
- [Comments on blocking dependencies]
- [Comments on granularity]
### Task-by-Task Review
#### [Task Name] (bd-N)
**Type**: [epic/feature/task]
**Status**: [✅ Ready / ⚠️ Needs Minor Improvements / ❌ Needs Major Revision]
**Estimated Effort**: [X hours] ([✅ Good / ❌ Too large - needs breakdown])
**Strengths**:
- [What's done well]
**Critical Issues** (must fix):
- [Blocking problems]
**Improvements Needed**:
- [What to add/clarify]
**Edge Cases Missing**:
- [Failure modes not addressed]
**Changes Made**:
- [Specific improvements added via `bd update`]
---
[Repeat for each task/phase/subtask]
### Summary of Changes
**Issues Updated**:
- bd-3 - Added edge case handling for Unicode, regex backtracking risks
- bd-5 - Broke into 3 subtasks (was 40 hours, now 3x8 hours)
- bd-7 - Strengthened success criteria (added test names, verification commands)
### Critical Gaps Across Plan
1. [Pattern of missing items across multiple tasks]
2. [Systemic issues in the plan]
### Recommendations
[If APPROVE]:
✅ Plan is solid and ready for implementation.
- All tasks are junior-engineer implementable
- Dependency structure is correct
- Edge cases and failure modes addressed
[If NEEDS REVISION]:
⚠️ Plan needs improvements before implementation:
- [List major items that need addressing]
- After changes, re-run hyperpowers:sre-task-refinement
[If REJECT]:
❌ Plan has fundamental issues and needs redesign:
- [Critical problems]
```
</the_process>
<examples>
<example>
<scenario>Developer reviews task but skips edge case analysis (Category 6)</scenario>
<code>
# Review of bd-3: Implement VIN scanner
## Checklist review:
1. Granularity: ✅ 6-8 hours
2. Implementability: ✅ Junior can implement
3. Success Criteria: ✅ Has 5 test scenarios
4. Dependencies: ✅ Correct
5. Safety Standards: ✅ Anti-patterns present
6. Edge Cases: [SKIPPED - "looks straightforward"]
7. Red Flags: ✅ None found
Conclusion: "Task looks good, approve ✅"
# Task ships without edge case review
# Production issues occur:
- VIN scanner matches random 17-char strings (no checksum validation)
- Lowercase VINs not handled (should normalize)
- Catastrophic regex backtracking on long inputs (DoS vulnerability)
</code>
<why_it_fails>
- Skipped Category 6 (Edge Cases) assuming task was "straightforward"
- Didn't ask: What happens with invalid checksums? Lowercase? Long inputs?
- Missed critical production issues:
- False positives (no checksum validation)
- Data handling bugs (case sensitivity)
- Security vulnerability (regex DoS)
- Junior engineer didn't know to handle these (not in task)
- Production incidents occur after deployment
- Hours of emergency fixes, customer impact
- SRE review failed to prevent known failure modes
</why_it_fails>
<correction>
**Apply Category 6 rigorously:**
```markdown
## Edge Case Analysis for bd-3: VIN Scanner
Ask for EVERY task:
- Malformed input? VIN has checksum - must validate, not just pattern match
- Empty/nil? What if empty string passed?
- Concurrency? Read-only scanner, no concurrency issues
- Dependency failures? No external dependencies
- Unicode/special chars? VIN is alphanumeric only, but what about lowercase?
- Large inputs? Regex `.*` patterns can cause catastrophic backtracking
Findings:
❌ VIN checksum validation not mentioned (will match random strings)
❌ Case normalization not mentioned (lowercase VINs exist)
❌ Regex backtracking risk not mentioned (DoS vulnerability)
```
**Update task:**
```bash
bd update bd-3 --design "$(cat <<'EOF'
[... original content ...]
## Key Considerations (ADDED BY SRE REVIEW)
**VIN Checksum Complexity**:
- ISO 3779 requires transliteration table (letters → numbers)
- Weighted sum algorithm with modulo 11
- Reference: https://en.wikipedia.org/wiki/Vehicle_identification_number#Check_digit
- MUST validate checksum, not just pattern - prevents false positives
**Case Normalization**:
- VINs can appear in lowercase
- MUST normalize to uppercase before validation
- Test with mixed case: "1hgbh41jxmn109186"
**Regex Backtracking Risk**:
- CRITICAL: Pattern `.*[A-HJ-NPR-Z0-9]{17}.*` has backtracking risk
- Test with pathological input: 10000 'X's followed by 16-char string
- Use possessive quantifiers or bounded repetition
- Reference: https://www.regular-expressions.info/catastrophic.html
**Edge Cases to Test**:
- Valid VIN with valid checksum (should match)
- Valid pattern but invalid checksum (should NOT match)
- Lowercase VIN (should normalize and validate)
- Ambiguous chars I/O not valid in VIN (should reject)
- Very long input (should not DoS)
EOF
)"
```
**What you gain:**
- Prevented false positives (checksum validation)
- Prevented data handling bugs (case normalization)
- Prevented security vulnerability (regex DoS)
- Junior engineer has complete requirements
- Production issues caught pre-implementation
- Proper SRE review preventing known failure modes
- Customer trust maintained
</correction>
</example>
<example>
<scenario>Developer approves task with placeholder text (Red Flag #10)</scenario>
<code>
# Review of bd-5: Implement License Plate Scanner
bd show bd-5:
## Implementation Checklist
- [ ] Create scanner module
- [ ] [Complete implementation steps detailed above]
- [ ] Add tests
## Success Criteria
- [ ] [As specified in the implementation checklist]
- [ ] Tests pass
## Key Considerations
- [Will be added during implementation]
# Developer's review:
"Looks comprehensive, has implementation checklist and success criteria ✅"
# During implementation:
Junior engineer: "What are the 'implementation steps detailed above'?"
Junior engineer: "What specific success criteria should I verify?"
Junior engineer: "What key considerations exist?"
# No answers in the task - junior engineer blocked
# Have to research and add missing information
# Implementation delayed by 2 days
</code>
<why_it_fails>
- Missed Red Flag #10: Placeholder text present
- "[Complete implementation steps detailed above]" is meta-reference, not content
- "[As specified in the implementation checklist]" is circular reference
- "[Will be added during implementation]" is deferral, not specification
- Junior engineer can't execute - missing critical information
- Task looks complete but actually incomplete
- Implementation blocked until details added
- SRE review failed to catch placeholder text
</why_it_fails>
<correction>
**Check for placeholder text after reading:**
```markdown
## Red Flag Check (Category 7)
Read through bd-5 line by line:
Line 15: "[Complete implementation steps detailed above]"
❌ PLACEHOLDER - "detailed above" is meta-reference, not actual content
Line 22: "[As specified in the implementation checklist]"
❌ PLACEHOLDER - Circular reference to another section, not explicit criteria
Line 30: "[Will be added during implementation]"
❌ PLACEHOLDER - Deferral to future, not actual considerations
DECISION: REJECT ❌
Reason: Contains placeholder text - task not ready for implementation
```
**Update task with actual content:**
```bash
bd update bd-5 --design "$(cat <<'EOF'
## Implementation Checklist
- [ ] Create src/scan/plugins/scanners/license_plate.rs
- [ ] Implement LicensePlateScanner struct with ScanPlugin trait
- [ ] Add regex patterns for US states:
- CA: `[0-9][A-Z]{3}[0-9]{3}` (e.g., 1ABC123)
- NY: `[A-Z]{3}[0-9]{4}` (e.g., ABC1234)
- TX: `[A-Z]{3}[0-9]{4}|[0-9]{3}[A-Z]{3}` (e.g., ABC1234 or 123ABC)
- Generic: `[A-Z0-9]{5,8}` (fallback)
- [ ] Implement has_healthcare_context() check
- [ ] Create test module with 8+ test cases
- [ ] Register in src/scan/plugins/scanners/mod.rs
## Success Criteria
- [ ] Valid CA plate "1ABC123" detected in healthcare context
- [ ] Valid NY plate "ABC1234" detected in healthcare context
- [ ] Invalid plate "123" NOT detected (too short)
- [ ] Valid plate NOT detected outside healthcare context
- [ ] 8+ unit tests pass covering all patterns and edge cases
- [ ] Clippy clean, no warnings
- [ ] cargo test passes
## Key Considerations
**False Positive Risk**:
- License plates are short and generic (5-8 chars)
- MUST require healthcare context via has_healthcare_context()
- Without context, will match random alphanumeric sequences
- Test: Random string "ABC1234" should NOT match outside healthcare context
**State Format Variations**:
- 50 US states have different formats
- Implement common formats (CA, NY, TX) + generic fallback
- Document which formats supported in module docstring
- Consider international plates in future iteration
**Performance**:
- Regex patterns are simple, no backtracking risk
- Should process <1ms per chunk
**Reference Implementation**:
- Study src/scan/plugins/scanners/vehicle_identifier.rs
- Follow same pattern: regex + context check + tests
EOF
)"
```
**Verify no placeholder text:**
```bash
bd show bd-5
# Read entire output
# Confirm: All sections have actual content
# Confirm: No "[detailed above]", "[as specified]", "[will be added]"
# ✅ Task ready for implementation
```
**What you gain:**
- Junior engineer has complete specification
- No blocked implementation waiting for details
- All edge cases documented upfront
- Success criteria explicit and verifiable
- Key considerations prevent common mistakes
- No placeholder text - task truly ready
- Professional SRE review standard maintained
</correction>
</example>
<example>
<scenario>Developer accepts vague success criteria (Category 3)</scenario>
<code>
# Review of bd-7: Implement Data Encryption
bd show bd-7:
## Success Criteria
- [ ] Encryption is implemented correctly
- [ ] Code is good quality
- [ ] Tests work properly
# Developer's review:
"Has 3 success criteria ✅ Meets minimum requirement"
# During implementation:
Junior engineer: "How do I know if encryption is 'correct'?"
Junior engineer: "What makes code 'good quality'?"
Junior engineer: "What does 'tests work properly' mean?"
# Junior engineer makes best guesses:
- Uses ECB mode (insecure, should use GCM)
- No key rotation (bad practice)
- Tests only happy path (misses edge cases)
# Code review finds critical security issues
# Complete rewrite required - 3 days wasted
</code>
<why_it_fails>
- Category 3 check: "All criteria testable/verifiable?" ❌ FAILED
- "Implemented correctly" - not measurable (correct by what standard?)
- "Good quality" - subjective, not verifiable
- "Work properly" - vague, what is proper?
- Junior engineer can't verify criteria objectively
- Makes incorrect assumptions filling gaps
- Security vulnerabilities introduced
- Wastes time on implementation that fails review
- SRE review failed to strengthen criteria
</why_it_fails>
<correction>
**Apply Category 3 rigorously:**
```markdown
## Success Criteria Analysis for bd-7
Current criteria:
- [ ] Encryption is implemented correctly
❌ NOT TESTABLE - "correctly" is subjective, no standard specified
- [ ] Code is good quality
❌ NOT TESTABLE - "good quality" is opinion, not measurable
- [ ] Tests work properly
❌ NOT TESTABLE - "properly" is vague, no definition
Minimum requirement: 3+ specific, measurable, testable criteria
Current: 0 testable criteria
DECISION: REJECT ❌
```
**Update with measurable criteria:**
```bash
bd update bd-7 --design "$(cat <<'EOF'
[... original content ...]
## Success Criteria
**Encryption Implementation**:
- [ ] Uses AES-256-GCM mode (verified in code review)
- [ ] Key derivation via PBKDF2 with 100,000 iterations (NIST recommendation)
- [ ] Unique IV generated per encryption (crypto_random)
- [ ] Authentication tag verified on decryption
**Code Quality** (automated checks):
- [ ] Clippy clean with no warnings: `cargo clippy -- -D warnings`
- [ ] Rustfmt compliant: `cargo fmt --check`
- [ ] No unwrap/expect in production: `rg "\.unwrap\(\)|\.expect\(" src/` returns 0
- [ ] No TODOs without issue numbers: `rg "TODO" src/` returns 0
**Test Coverage**:
- [ ] 12+ unit tests pass covering:
- test_encrypt_decrypt_roundtrip (happy path)
- test_wrong_key_fails_auth (security)
- test_modified_ciphertext_fails_auth (security)
- test_empty_plaintext (edge case)
- test_large_plaintext_10mb (performance)
- test_unicode_plaintext (data handling)
- test_concurrent_encryption (thread safety)
- test_iv_uniqueness (security)
- [4 more specific scenarios]
- [ ] All tests pass: `cargo test encryption`
- [ ] Test coverage >90%: `cargo tarpaulin --packages encryption`
**Documentation**:
- [ ] Module docstring explains encryption scheme (AES-256-GCM)
- [ ] Function docstrings include examples
- [ ] Security considerations documented (key management, IV handling)
**Security Review**:
- [ ] No hardcoded keys or IVs (verified via grep)
- [ ] Key zeroized after use (verified in code)
- [ ] Constant-time comparison for auth tag (timing attack prevention)
EOF
)"
```
**What you gain:**
- Every criterion objectively verifiable
- Junior engineer knows exactly what "done" means
- Automated checks (clippy, fmt, grep) provide instant feedback
- Specific test scenarios prevent missed edge cases
- Security requirements explicit (GCM, PBKDF2, unique IV)
- No ambiguity - can verify each criterion with command or code review
- Professional SRE review standard: measurable, testable, specific
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Apply all 7 categories to every task** → No skipping any category for any task
2. **Reject plans with placeholder text** → "[detailed above]", "[as specified]" = instant reject
3. **Verify no placeholder after updates** → Read back with `bd show` and confirm actual content
4. **Break tasks >16 hours** → Create subtasks, don't accept large tasks
5. **Strengthen vague criteria** → "Works correctly" → measurable verification commands
6. **Add edge cases to every task** → Empty? Unicode? Concurrency? Failures?
7. **Never skip Category 6** → Edge case analysis prevents production issues
## Common Excuses
All of these mean: **STOP. Apply the full process.**
- "Task looks straightforward" (Edge cases hide in "straightforward" tasks)
- "Has 3 criteria, meets minimum" (Criteria must be measurable, not just 3+ items)
- "Placeholder text is just formatting" (Placeholders mean incomplete specification)
- "Can handle edge cases during implementation" (Must specify upfront, not defer)
- "Junior will figure it out" (Junior should NOT need to figure out - we specify)
- "Too detailed, feels like micromanaging" (Detail prevents questions and rework)
- "Taking too long to review" (One gap caught saves hours of rework)
</critical_rules>
<verification_checklist>
Before completing SRE review:
**Per task reviewed:**
- [ ] Applied all 7 categories (Granularity, Implementability, Criteria, Dependencies, Safety, Edge Cases, Red Flags)
- [ ] Checked for placeholder text in design field
- [ ] Updated task with missing information via `bd update --design`
- [ ] Verified updated task with `bd show` (no placeholders remain)
- [ ] Broke down any task >16 hours into subtasks
- [ ] Strengthened vague success criteria to measurable
- [ ] Added edge case analysis to Key Considerations
- [ ] Strengthened anti-patterns based on failure modes
**Overall plan:**
- [ ] Reviewed ALL tasks/phases/subtasks (no exceptions)
- [ ] Verified dependency structure with `bd dep tree`
- [ ] Documented findings for each task
- [ ] Created summary of changes made
- [ ] Provided clear recommendation (APPROVE/NEEDS REVISION/REJECT)
**Can't check all boxes?** Return to review process and complete missing steps.
</verification_checklist>
<integration>
**This skill is used after:**
- hyperpowers:writing-plans (creates initial plan)
- hyperpowers:brainstorming (establishes requirements)
**This skill is used before:**
- hyperpowers:executing-plans (implements tasks)
**This skill is also called by:**
- hyperpowers:executing-plans (REQUIRED for new tasks created during execution)
**Call chains:**
```
Initial planning:
hyperpowers:brainstorming → hyperpowers:writing-plans → hyperpowers:sre-task-refinement → hyperpowers:executing-plans
(if gaps: revise and re-review)
During execution (for new tasks):
hyperpowers:executing-plans → creates new task → hyperpowers:sre-task-refinement → STOP checkpoint
```
**This skill uses:**
- bd commands (show, update, create, dep add, dep tree)
- Google Fellow SRE perspective (20+ years distributed systems)
- 7-category checklist (mandatory for every task)
**Time expectations:**
- Small epic (3-5 tasks): 15-20 minutes
- Medium epic (6-10 tasks): 25-40 minutes
- Large epic (10+ tasks): 45-60 minutes
**Don't rush:** Catching one critical gap pre-implementation saves hours of rework.
</integration>
<resources>
**Review patterns:**
- Task too large (>16h) → Break into 4-8h subtasks
- Vague criteria ("works correctly") → Measurable commands/checks
- Missing edge cases → Add to Key Considerations with mitigations
- Placeholder text → Rewrite with actual content
**When stuck:**
- Unsure if task too large → Ask: Can junior complete in one day?
- Unsure if criteria measurable → Ask: Can I verify with command/code review?
- Unsure if edge case matters → Ask: Could this fail in production?
- Unsure if placeholder → Ask: Does this reference other content instead of providing content?
**Key principle:** A junior engineer should be able to execute the task without asking questions. If they would need to ask, the specification is incomplete.
</resources>

View File

@@ -0,0 +1,336 @@
---
name: test-driven-development
description: Use when implementing features or fixing bugs - enforces RED-GREEN-REFACTOR cycle requiring tests to fail before writing code
---
<skill_overview>
Write the test first, watch it fail, write minimal code to pass. If you didn't watch the test fail, you don't know if it tests the right thing.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow these exact steps in order. Do not adapt.
Violating the letter of the rules is violating the spirit of the rules.
</rigidity_level>
<quick_reference>
| Phase | Action | Command Example | Expected Result |
|-------|--------|-----------------|-----------------|
| **RED** | Write failing test | `cargo test test_name` | FAIL (feature missing) |
| **Verify RED** | Confirm correct failure | Check error message | "function not found" or assertion fails |
| **GREEN** | Write minimal code | Implement feature | Test passes |
| **Verify GREEN** | All tests pass | `cargo test` | All green, no warnings |
| **REFACTOR** | Clean up code | Improve while green | Tests still pass |
**Iron Law:** NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
</quick_reference>
<when_to_use>
**Always use for:**
- New features
- Bug fixes
- Refactoring with behavior changes
- Any production code
**Ask your human partner for exceptions:**
- Throwaway prototypes (will be deleted)
- Generated code
- Configuration files
Thinking "skip TDD just this once"? Stop. That's rationalization.
</when_to_use>
<the_process>
## 1. RED - Write Failing Test
Write one minimal test showing what should happen.
**Requirements:**
- Test one behavior only ("and" in name? Split it)
- Clear name describing behavior
- Use real code (no mocks unless unavoidable)
See [resources/language-examples.md](resources/language-examples.md) for Rust, Swift, TypeScript examples.
## 2. Verify RED - Watch It Fail
**MANDATORY. Never skip.**
Run the test and confirm:
- ✓ Test **fails** (not errors with syntax issues)
- ✓ Failure message is expected ("function not found" or assertion fails)
- ✓ Fails because feature missing (not typos)
**If test passes:** You're testing existing behavior. Fix the test.
**If test errors:** Fix syntax error, re-run until it fails correctly.
## 3. GREEN - Write Minimal Code
Write simplest code to pass the test. Nothing more.
**Key principle:** Don't add features the test doesn't require. Don't refactor other code. Don't "improve" beyond the test.
## 4. Verify GREEN - Watch It Pass
**MANDATORY.**
Run tests and confirm:
- ✓ New test passes
- ✓ All other tests still pass
- ✓ No errors or warnings
**If test fails:** Fix code, not test.
**If other tests fail:** Fix now before proceeding.
## 5. REFACTOR - Clean Up
**Only after green:**
- Remove duplication
- Improve names
- Extract helpers
Keep tests green. Don't add behavior.
## 6. Repeat
Next failing test for next feature.
</the_process>
<examples>
<example>
<scenario>Developer writes implementation first, then adds test that passes immediately</scenario>
<code>
# Code written FIRST
def validate_email(email):
    return "@" in email  # Bug: accepts "@@"

# Test written AFTER
def test_validate_email():
    assert validate_email("user@example.com")  # Passes immediately!
    # Missing edge case: assert not validate_email("@@")
</code>
<why_it_fails>
When test passes immediately:
- Never proved the test catches bugs
- Only tested happy path you remembered
- Forgot edge cases (like "@@")
- Bug ships to production
Tests written after the fact verify the cases you remembered, not the behavior that's required.
</why_it_fails>
<correction>
**TDD approach:**
1. **RED** - Write test first (including edge case):
```python
def test_validate_email():
    assert validate_email("user@example.com")  # Will fail - function doesn't exist
    assert not validate_email("@@")  # Edge case up front
```
2. **Verify RED** - Run test, watch it fail:
```bash
NameError: function 'validate_email' is not defined
```
3. **GREEN** - Implement to pass both cases:
```python
def validate_email(email):
    return "@" in email and email.count("@") == 1
```
4. **Verify GREEN** - Both assertions pass, bug prevented.
**Result:** Test failed first, proving it works. Edge case discovered during test writing, not in production.
</correction>
</example>
<example>
<scenario>Developer has already written 3 hours of code without tests. Wants to keep it as "reference" while writing tests.</scenario>
<code>
// 200 lines of untested code exists
// Developer thinks: "I'll keep this and write tests that match it"
// Or: "I'll use it as reference to speed up TDD"
</code>
<why_it_fails>
**Keeping code as "reference":**
- You'll copy it (that's testing after, with extra steps)
- You'll adapt it (biased by implementation)
- Tests will match code, not requirements
- You'll justify shortcuts: "I already know this works"
**Result:** All the problems of test-after, none of the benefits of TDD.
</why_it_fails>
<correction>
**Delete it. Completely.**
```bash
git stash # Or delete the file
```
**Then start TDD:**
1. Write first failing test from requirements (not from code)
2. Watch it fail
3. Implement fresh (might be different from original, that's OK)
4. Watch it pass
**Why delete:**
- Sunk cost is already gone
- 3 hours implementing ≠ 3 hours with TDD (TDD might be 2 hours total)
- Code without tests is technical debt
- Fresh implementation from tests is usually better
**What you gain:**
- Tests that actually verify behavior
- Confidence code works
- Ability to refactor safely
- No bugs from untested edge cases
</correction>
</example>
<example>
<scenario>Test is hard to write. Developer thinks "design must be unclear, but I'll implement first to explore."</scenario>
<code>
// Test attempt:
func testUserServiceCreatesAccount() {
// Need to mock database, email service, payment gateway, logger...
// This is getting complicated, maybe I should just implement first
}
</code>
<why_it_fails>
**"Test is hard" is valuable signal:**
- Hard to test = hard to use
- Too many dependencies = coupling too tight
- Complex setup = design needs simplification
**Implementing first ignores this signal:**
- Build the complex design
- Lock in the coupling
- Now forced to write complex tests (or skip them)
</why_it_fails>
<correction>
**Listen to the test.**
Hard to test? Simplify the interface:
```swift
// Instead of:
class UserService {
init(db: Database, email: EmailService, payments: PaymentGateway, logger: Logger) { }
func createAccount(email: String, password: String, paymentToken: String) throws { }
}
// Make testable:
class UserService {
func createAccount(request: CreateAccountRequest) -> Result<Account, Error> {
// Dependencies injected through request or passed separately
}
}
```
**Test becomes simple:**
```swift
func testCreatesAccountFromRequest() throws {
    let service = UserService()
    let request = CreateAccountRequest(email: "user@example.com")
    let account = try service.createAccount(request: request).get()
    XCTAssertEqual(account.email, "user@example.com")
}
```
**TDD forces good design.** If test is hard, fix design before implementing.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Write code before test?** → Delete it. Start over.
- Never keep as "reference"
- Never "adapt" while writing tests
- Delete means delete
2. **Test passes immediately?** → Not TDD. Fix the test or delete the code.
- Passing immediately proves nothing
- You're testing existing behavior, not required behavior
3. **Can't explain why test failed?** → Fix until failure makes sense.
- "function not found" = good (feature doesn't exist)
- Weird error = bad (fix test, re-run)
4. **Want to skip "just this once"?** → That's rationalization. Stop.
- TDD is faster than debugging in production
- "Too simple to test" = test takes 30 seconds
- "Already manually tested" = not systematic, not repeatable
## Common Excuses
All of these mean: **Stop. Follow TDD.**
- "This is different because..."
- "I'm being pragmatic, not dogmatic"
- "It's about spirit not ritual"
- "Tests after achieve the same goals"
- "Deleting X hours of work is wasteful"
</critical_rules>
<verification_checklist>
Before marking work complete:
- [ ] Every new function/method has a test
- [ ] Watched each test **fail** before implementing
- [ ] Each test failed for expected reason (feature missing, not typo)
- [ ] Wrote minimal code to pass each test
- [ ] All tests pass with no warnings
- [ ] Tests use real code (mocks only if unavoidable)
- [ ] Edge cases and errors covered
**Can't check all boxes?** You skipped TDD. Start over.
</verification_checklist>
<integration>
**This skill calls:**
- verification-before-completion (running tests to verify)
**This skill is called by:**
- fixing-bugs (write failing test reproducing bug)
- executing-plans (when implementing bd tasks)
- refactoring-safely (keep tests green while refactoring)
**Agents used:**
- hyperpowers:test-runner (run tests, return summary only)
</integration>
<resources>
**Detailed language-specific examples:**
- [Rust, Swift, TypeScript examples](resources/language-examples.md) - Complete RED-GREEN-REFACTOR cycles
- [Language-specific test commands](resources/language-examples.md#verification-commands-by-language)
**When stuck:**
- Test too complicated? → Design too complicated, simplify interface
- Must mock everything? → Code too coupled, use dependency injection
- Test setup huge? → Extract helpers, or simplify design
</resources>

View File

@@ -0,0 +1,325 @@
# TDD Workflow Examples
This guide shows complete TDD workflows for common scenarios: bug fixes and feature additions.
## Example: Bug Fix
### Bug
Empty email is accepted when it should be rejected.
### RED Phase: Write Failing Test
**Swift:**
```swift
func testRejectsEmptyEmail() async throws {
let result = try await submitForm(FormData(email: ""))
XCTAssertEqual(result.error, "Email required")
}
```
### Verify RED: Watch It Fail
```bash
$ swift test --filter FormTests.testRejectsEmptyEmail
FAIL: XCTAssertEqual failed: ("nil") is not equal to ("Optional("Email required")")
```
**Confirms:**
- Test fails (not errors)
- Failure message shows email not being validated
- Fails because feature missing (not typos)
### GREEN Phase: Minimal Code
```swift
struct FormResult {
var error: String?
}
func submitForm(_ data: FormData) async throws -> FormResult {
if data.email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return FormResult(error: "Email required")
}
// ... rest of form processing
return FormResult()
}
```
### Verify GREEN: Watch It Pass
```bash
$ swift test --filter FormTests.testRejectsEmptyEmail
Test Case '-[FormTests testRejectsEmptyEmail]' passed
```
**Confirms:**
- Test passes
- Other tests still pass
- No errors or warnings
### REFACTOR: Clean Up
If multiple fields need validation:
```swift
extension FormData {
func validate() -> String? {
if email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return "Email required"
}
// Add other validations...
return nil
}
}
func submitForm(_ data: FormData) async throws -> FormResult {
if let error = data.validate() {
return FormResult(error: error)
}
// ... rest of form processing
return FormResult()
}
```
Run tests again to confirm still green.
---
## Example: Feature Addition
### Feature
Calculate average of non-empty list.
### RED Phase: Write Failing Test
**TypeScript:**
```typescript
describe('average', () => {
it('calculates average of non-empty list', () => {
expect(average([1, 2, 3])).toBe(2);
});
});
```
### Verify RED: Watch It Fail
```bash
$ npm test -- --testNamePattern="average"
FAIL: ReferenceError: average is not defined
```
**Confirms:**
- Function doesn't exist yet
- Test will verify the behavior once the function exists
### GREEN Phase: Minimal Code
```typescript
function average(numbers: number[]): number {
const sum = numbers.reduce((acc, n) => acc + n, 0);
return sum / numbers.length;
}
```
### Verify GREEN: Watch It Pass
```bash
$ npm test -- --testNamePattern="average"
PASS: calculates average of non-empty list
```
### Add Edge Case: Empty List
**RED:**
```typescript
it('returns 0 for empty list', () => {
expect(average([])).toBe(0);
});
```
**Verify RED:**
```bash
$ npm test -- --testNamePattern="average.*empty"
FAIL: Expected: 0, Received: NaN
```
**GREEN:**
```typescript
function average(numbers: number[]): number {
if (numbers.length === 0) return 0;
const sum = numbers.reduce((acc, n) => acc + n, 0);
return sum / numbers.length;
}
```
**Verify GREEN:**
```bash
$ npm test -- --testNamePattern="average"
PASS: 2 tests passed
```
### REFACTOR: Clean Up
No duplication or unclear naming, so no refactoring needed. Move to next feature.
---
## Example: Refactoring with Tests
### Scenario
Existing function works but is hard to read. Tests exist and pass.
### Current Code
```rust
fn process(data: Vec<i32>) -> i32 {
let mut result = 0;
for item in data {
if item > 0 {
result += item * 2;
}
}
result
}
```
### Existing Tests (Already Green)
```rust
#[test]
fn processes_positive_numbers() {
assert_eq!(process(vec![1, 2, 3]), 12); // (1*2) + (2*2) + (3*2) = 12
}
#[test]
fn ignores_negative_numbers() {
assert_eq!(process(vec![1, -2, 3]), 8); // (1*2) + (3*2) = 8
}
#[test]
fn handles_empty_list() {
assert_eq!(process(vec![]), 0);
}
```
### REFACTOR: Improve Clarity
```rust
fn process(data: Vec<i32>) -> i32 {
data.iter()
.filter(|&&n| n > 0)
.map(|&n| n * 2)
.sum()
}
```
### Verify Still Green
```bash
$ cargo test
running 3 tests
test processes_positive_numbers ... ok
test ignores_negative_numbers ... ok
test handles_empty_list ... ok
test result: ok. 3 passed; 0 failed
```
**Key:** Tests prove refactoring didn't break behavior.
---
## Common Patterns
### Pattern: Adding Validation
1. **RED:** Test that invalid input is rejected
2. **Verify RED:** Confirm invalid input currently accepted
3. **GREEN:** Add validation check
4. **Verify GREEN:** Confirm validation works
5. **REFACTOR:** Extract validation if reusable
### Pattern: Adding Error Handling
1. **RED:** Test that error condition is caught
2. **Verify RED:** Confirm error currently unhandled
3. **GREEN:** Add error handling
4. **Verify GREEN:** Confirm error handled correctly
5. **REFACTOR:** Consolidate error handling if duplicated
### Pattern: Optimizing Performance
1. **Ensure tests exist and pass** (if not, add tests first)
2. **REFACTOR:** Optimize implementation
3. **Verify GREEN:** Confirm tests still pass
4. **Measure:** Confirm performance improved
**Note:** Never optimize without tests. Without them, you can't prove the optimization didn't break behavior.
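A minimal sketch of that loop for a Rust project — `cargo bench` assumes benchmarks are already defined, and the benchmark name is hypothetical:
```bash
cargo test                    # 1. tests exist and pass
cargo bench parse_large_file  # record a baseline before changing anything
# ...apply the optimization...
cargo test                    # 3. still green -- behavior preserved
cargo bench parse_large_file  # 4. compare against the recorded baseline
```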
---
## Workflow Checklist
### For Each New Feature
- [ ] Write one failing test
- [ ] Run test, confirm it fails correctly
- [ ] Write minimal code to pass
- [ ] Run test, confirm it passes
- [ ] Run all tests, confirm no regressions
- [ ] Refactor if needed (staying green)
- [ ] Commit
### For Each Bug Fix
- [ ] Write test reproducing the bug
- [ ] Run test, confirm it fails (reproduces bug)
- [ ] Fix the bug (minimal change)
- [ ] Run test, confirm it passes (bug fixed)
- [ ] Run all tests, confirm no regressions
- [ ] Commit
### For Each Refactoring
- [ ] Confirm tests exist and pass
- [ ] Make one small refactoring change
- [ ] Run tests, confirm still green
- [ ] Repeat until refactoring complete
- [ ] Commit
---
## Anti-Patterns to Avoid
### ❌ Writing Multiple Tests Before Implementing
**Why bad:** You can't tell which test makes implementation fail. Write one, implement, repeat.
### ❌ Changing Test to Make It Pass
**Why bad:** Test should define correct behavior. If test is wrong, fix test first, then re-run RED phase.
### ❌ Adding Features Not Covered by Tests
**Why bad:** Untested code. If you need a feature, write test first.
### ❌ Skipping RED Verification
**Why bad:** Test might pass immediately, meaning it doesn't test anything new.
### ❌ Skipping GREEN Verification
**Why bad:** Test might fail for unexpected reason. Always verify expected pass.
---
## Remember
- **One test at a time:** Write test, implement, repeat
- **Watch it fail:** Proves test actually tests something
- **Watch it pass:** Proves implementation works
- **Stay green:** All tests pass before moving on
- **Refactor freely:** Tests catch breaks

View File

@@ -0,0 +1,267 @@
# TDD Language-Specific Examples
This guide provides concrete TDD examples in multiple programming languages, showing the RED-GREEN-REFACTOR cycle.
## RED Phase Examples
Write one minimal test showing what should happen.
### Rust
```rust
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn retries_failed_operations_3_times() {
let mut attempts = 0;
let operation = || -> Result<&str, &str> {
attempts += 1;
if attempts < 3 {
Err("fail")
} else {
Ok("success")
}
};
let result = retry_operation(operation);
assert_eq!(result, Ok("success"));
assert_eq!(attempts, 3);
}
}
```
**Running the test:**
```bash
cargo test tests::retries_failed_operations_3_times
```
### Swift
```swift
func testRetriesFailedOperations3Times() async throws {
    var attempts = 0
    let operation = { () async throws -> String in
        attempts += 1
        if attempts < 3 {
            throw RetryError.failed
        }
        return "success"
    }
    let result = try await retryOperation(operation)
    XCTAssertEqual(result, "success")
    XCTAssertEqual(attempts, 3)
}
```
**Running the test:**
```bash
swift test --filter RetryTests.testRetriesFailedOperations3Times
```
### TypeScript
```typescript
describe('retryOperation', () => {
it('retries failed operations 3 times', async () => {
let attempts = 0;
const operation = async () => {
attempts++;
if (attempts < 3) {
throw new Error('fail');
}
return 'success';
};
const result = await retryOperation(operation);
expect(result).toBe('success');
expect(attempts).toBe(3);
});
});
```
**Running the test (Jest):**
```bash
npm test -- --testNamePattern="retries failed operations"
```
**Running the test (Vitest):**
```bash
npm test -- -t "retries failed operations"
```
### Why These Are Good
- Clear names describing the behavior
- Test real behavior, not mocks
- One thing per test
- Shows desired API
### Bad Example
```typescript
test('retry', () => {
let mockCalls = 0;
const mock = () => {
mockCalls++;
return 'success';
};
retryOperation(mock);
expect(mockCalls).toBe(1); // Tests mock, not behavior
});
```
**Why this is bad:**
- Vague name
- Tests mock behavior, not real retry logic
## GREEN Phase Examples
Write simplest code to pass the test.
### Rust
```rust
fn retry_operation<F, T, E>(mut operation: F) -> Result<T, E>
where
F: FnMut() -> Result<T, E>,
{
for i in 0..3 {
match operation() {
Ok(result) => return Ok(result),
Err(e) => {
if i == 2 {
return Err(e);
}
}
}
}
unreachable!()
}
```
### Swift
```swift
func retryOperation<T>(_ operation: () async throws -> T) async throws -> T {
var lastError: Error?
for attempt in 0..<3 {
do {
return try await operation()
} catch {
lastError = error
if attempt == 2 {
throw error
}
}
}
throw lastError!
}
```
### TypeScript
```typescript
async function retryOperation<T>(
operation: () => Promise<T>
): Promise<T> {
let lastError: Error | undefined;
for (let i = 0; i < 3; i++) {
try {
return await operation();
} catch (error) {
lastError = error as Error;
if (i === 2) {
throw error;
}
}
}
throw lastError;
}
```
### Bad Example - Over-engineered (YAGNI)
```typescript
async function retryOperation<T>(
operation: () => Promise<T>,
options: {
maxRetries?: number;
backoff?: 'linear' | 'exponential';
onRetry?: (attempt: number) => void;
shouldRetry?: (error: Error) => boolean;
} = {}
): Promise<T> {
// Don't add features the test doesn't require!
}
```
**Why this is bad:** Test only requires 3 retries. Don't add:
- Configurable retries
- Backoff strategies
- Callbacks
- Error filtering
...until a test requires them.
## Test Requirements
**Every test should:**
- Test one behavior
- Have a clear name
- Use real code (no mocks unless unavoidable)
## Verification Commands by Language
### Rust
```bash
# Single test
cargo test tests::test_name
# All tests
cargo test
# With output
cargo test -- --nocapture
```
### Swift
```bash
# Single test
swift test --filter TestClass.testName
# All tests
swift test
# With output
swift test --verbose
```
### TypeScript (Jest)
```bash
# Single test
npm test -- --testNamePattern="test name"
# All tests
npm test
# With coverage
npm test -- --coverage
```
### TypeScript (Vitest)
```bash
# Single test
npm test -- -t "test name"
# All tests
npm test
# With coverage
npm test -- --coverage
```

View File

@@ -0,0 +1,581 @@
---
name: testing-anti-patterns
description: Use when writing or changing tests, adding mocks - prevents testing mock behavior, production pollution with test-only methods, and mocking without understanding dependencies
---
<skill_overview>
Tests must verify real behavior, not mock behavior; mocks are tools to isolate, not things to test.
</skill_overview>
<rigidity_level>
LOW FREEDOM - The 3 Iron Laws are absolute (never test mocks, never add test-only methods, never mock without understanding). Apply gate functions strictly.
</rigidity_level>
<quick_reference>
## The 3 Iron Laws
1. **NEVER test mock behavior** → Test real component behavior
2. **NEVER add test-only methods to production** → Use test utilities instead
3. **NEVER mock without understanding** → Know dependencies before mocking
## Gate Functions (Use Before Action)
**Before asserting on any mock:**
- Ask: "Am I testing real behavior or mock existence?"
- If mock existence → STOP, delete assertion
**Before adding method to production:**
- Ask: "Is this only used by tests?"
- If yes → STOP, put in test utilities
**Before mocking:**
- Ask: "What side effects does real method have?"
- Ask: "Does test depend on those side effects?"
- If depends → Mock lower level, not this method
</quick_reference>
<when_to_use>
- Writing new tests
- Adding mocks to tests
- Tempted to add method only tests will use
- Test failing and considering mocking something
- Unsure whether to mock a dependency
- Test setup becoming complex with mocks
**Critical moment:** Before you add a mock or test-only method, use this skill's gate functions.
</when_to_use>
<the_iron_laws>
## Law 1: Never Test Mock Behavior
**Anti-pattern:**
```rust
// ❌ BAD: Testing that mock exists
#[test]
fn test_processes_request() {
let mock_service = MockApiService::new();
let handler = RequestHandler::new(Box::new(mock_service));
// Testing mock existence, not behavior
assert!(handler.service().is_mock());
}
```
**Why wrong:** Verifies mock works, not that code works.
**Fix:**
```rust
// ✅ GOOD: Test real behavior
#[test]
fn test_processes_request() {
let service = TestApiService::new(); // Real implementation or full fake
let handler = RequestHandler::new(Box::new(service));
let result = handler.process_request("data");
assert_eq!(result.status, StatusCode::OK);
}
```
---
## Law 2: Never Add Test-Only Methods to Production
**Anti-pattern:**
```rust
// ❌ BAD: reset() only used in tests
pub struct Connection {
pool: Arc<ConnectionPool>,
}
impl Connection {
pub fn reset(&mut self) { // Looks like production API!
self.pool.clear_all();
}
}
// In tests
#[test]
fn test_something() {
let mut conn = Connection::new();
conn.reset(); // Test-only method
}
```
**Why wrong:**
- Production code polluted with test-only methods
- Dangerous if accidentally called in production
- Confuses object lifecycle with entity lifecycle
**Fix:**
```rust
// ✅ GOOD: Test utilities handle cleanup
// Connection has no reset()
// In tests/test_utils.rs
pub fn cleanup_connection(conn: &Connection) {
if let Some(pool) = conn.get_pool() {
pool.clear_test_data();
}
}
// In tests
#[test]
fn test_something() {
let conn = Connection::new();
cleanup_connection(&conn);
}
```
---
## Law 3: Never Mock Without Understanding
**Anti-pattern:**
```rust
// ❌ BAD: Mock breaks test logic
#[test]
fn test_detects_duplicate_server() {
// Mock prevents config write that test depends on!
let mut config_manager = MockConfigManager::new();
config_manager.expect_add_server()
.returning(|_| Ok(())); // No actual config write!
config_manager.add_server(&config).unwrap();
config_manager.add_server(&config).unwrap(); // Should fail - but won't!
}
```
**Why wrong:** Mocked method had side effect test depended on (writing config).
**Fix:**
```rust
// ✅ GOOD: Mock at correct level
#[test]
fn test_detects_duplicate_server() {
// Mock the slow part, preserve behavior test needs
let server_manager = MockServerManager::new(); // Just mock slow server startup
let config_manager = ConfigManager::new_with_manager(server_manager);
config_manager.add_server(&config).unwrap(); // Config written
let result = config_manager.add_server(&config); // Duplicate detected ✓
assert!(result.is_err());
}
```
</the_iron_laws>
<gate_functions>
## Gate Function 1: Before Asserting on Mock
```
BEFORE any assertion that checks mock elements:
1. Ask: "Am I testing real component behavior or just mock existence?"
2. If testing mock existence:
STOP - Delete the assertion or unmock the component
3. Test real behavior instead
```
**Examples of mock existence testing (all wrong):**
- `assert!(handler.service().is_mock())`
- `XCTAssertTrue(manager.delegate is MockDelegate)`
- `expect(component.database).toBe(mockDb)`
---
## Gate Function 2: Before Adding Method to Production
```
BEFORE adding any method to production class:
1. Ask: "Is this only used by tests?"
2. If yes:
STOP - Don't add it
Put it in test utilities instead
3. Ask: "Does this class own this resource's lifecycle?"
4. If no:
STOP - Wrong class for this method
```
**Red flags:**
- Method named `reset()`, `clear()`, `cleanup()` in production class
- Method only has `#[cfg(test)]` callers
- Method added "for testing purposes"
---
## Gate Function 3: Before Mocking
```
BEFORE mocking any method:
STOP - Don't mock yet
1. Ask: "What side effects does the real method have?"
2. Ask: "Does this test depend on any of those side effects?"
3. Ask: "Do I fully understand what this test needs?"
If depends on side effects:
→ Mock at lower level (the actual slow/external operation)
→ OR use test doubles that preserve necessary behavior
→ NOT the high-level method the test depends on
If unsure what test depends on:
→ Run test with real implementation FIRST
→ Observe what actually needs to happen
→ THEN add minimal mocking at the right level
```
**Red flags:**
- "I'll mock this to be safe"
- "This might be slow, better mock it"
- Mocking without understanding dependency chain
</gate_functions>
<examples>
<example>
<scenario>Developer tests mock behavior instead of real behavior</scenario>
<code>
#[test]
fn test_user_service_initialized() {
let mock_db = MockDatabase::new();
let service = UserService::new(mock_db);
// Testing that mock exists
assert_eq!(service.database().connection_string(), "mock://test");
assert!(service.database().is_test_mode());
}
</code>
<why_it_fails>
- Assertions check mock properties, not service behavior
- Test passes when mock is correct, fails when mock is wrong
- Tells you nothing about whether UserService works
- Would pass even if UserService.new() does nothing
- False confidence - mock works, but does service work?
</why_it_fails>
<correction>
**Apply Gate Function 1:**
"Am I testing real behavior or mock existence?"
→ Testing mock existence (connection_string(), is_test_mode() are mock properties)
**Fix:**
```rust
#[test]
fn test_user_service_creates_user() {
let db = TestDatabase::new(); // Real test implementation
let service = UserService::new(db);
// Test real behavior
let user = service.create_user("alice", "alice@example.com").unwrap();
assert_eq!(user.name, "alice");
assert_eq!(user.email, "alice@example.com");
// Verify user was saved
let retrieved = service.get_user(user.id).unwrap();
assert_eq!(retrieved.name, "alice");
}
```
**What you gain:**
- Tests actual UserService behavior
- Validates create and retrieve work
- Would fail if service broken (even with working mock)
- Confidence service actually works
</correction>
</example>
<example>
<scenario>Developer adds test-only method to production class</scenario>
<code>
// Production code
pub struct Database {
pool: ConnectionPool,
}
impl Database {
pub fn new() -> Self { /* ... */ }
// Added "for testing"
pub fn reset(&mut self) {
self.pool.clear();
self.pool.reinitialize();
}
}
// Tests
#[test]
fn test_user_creation() {
let mut db = Database::new();
// ... test logic ...
db.reset(); // Clean up
}
#[test]
fn test_user_deletion() {
let mut db = Database::new();
// ... test logic ...
db.reset(); // Clean up
}
</code>
<why_it_fails>
- Production Database polluted with test-only reset()
- reset() looks like legitimate API to other developers
- Dangerous if accidentally called in production (clears all data!)
- Violates single responsibility (Database manages connections, not test lifecycle)
- Every test class now needs reset() added
</why_it_fails>
<correction>
**Apply Gate Function 2:**
"Is this only used by tests?" → YES
"Does Database class own test lifecycle?" → NO
**Fix:**
```rust
// Production code (NO reset method)
pub struct Database {
pool: ConnectionPool,
}
impl Database {
pub fn new() -> Self { /* ... */ }
// No reset() - production code clean
}
// Test utilities (tests/test_utils.rs)
pub fn create_test_database() -> Database {
Database::new()
}
pub fn cleanup_database(db: &mut Database) {
// Access internals properly for cleanup
if let Some(pool) = db.get_pool_mut() {
pool.clear_test_data();
}
}
// Tests
#[test]
fn test_user_creation() {
let mut db = create_test_database();
// ... test logic ...
cleanup_database(&mut db);
}
```
**What you gain:**
- Production code has no test pollution
- No risk of accidental production calls
- Clear separation: Database manages connections, test utils manage test lifecycle
- Test utilities can evolve without changing production code
</correction>
</example>
<example>
<scenario>Developer mocks without understanding dependencies</scenario>
<code>
#[test]
fn test_detects_duplicate_server() {
// "I'll mock ConfigManager to speed up the test"
let mut mock_config = MockConfigManager::new();
mock_config.expect_add_server()
.times(2)
.returning(|_| Ok(())); // Always returns Ok!
// Test expects duplicate detection
mock_config.add_server(&server_config).unwrap();
let result = mock_config.add_server(&server_config);
// Assertion fails! Mock always returns Ok, no duplicate detection
assert!(result.is_err()); // FAILS
}
</code>
<why_it_fails>
- Mocked add_server() without understanding it writes config
- Mock returns Ok() both times (no duplicate detection)
- Test depends on ConfigManager's internal state tracking
- Mock eliminates the behavior test needs to verify
- "Speeding up" by mocking broke the test
</why_it_fails>
<correction>
**Apply Gate Function 3:**
"What side effects does add_server() have?" → Writes to config file, tracks added servers
"Does test depend on those?" → YES! Test needs duplicate detection
"Do I understand what test needs?" → Now yes
**Fix:**
```rust
#[test]
fn test_detects_duplicate_server() {
// Mock at the RIGHT level - just the slow I/O
let mock_file_system = MockFileSystem::new(); // Mock slow file writes
let config_manager = ConfigManager::new_with_fs(mock_file_system);
// ConfigManager's duplicate detection still works
config_manager.add_server(&server_config).unwrap();
let result = config_manager.add_server(&server_config);
// Passes! ConfigManager tracks duplicates, only file I/O is mocked
assert!(result.is_err());
}
```
**What you gain:**
- Test verifies real duplicate detection logic
- Only mocked the actual slow part (file I/O)
- ConfigManager's internal tracking works normally
- Test actually validates the feature
</correction>
</example>
</examples>
<additional_anti_patterns>
## Anti-Pattern 4: Incomplete Mocks
**Problem:** Mock only fields you think you need, omit others.
```rust
// ❌ BAD: Partial mock
struct MockResponse {
status: String,
data: UserData,
// Missing: metadata that downstream code uses
}
impl ApiResponse for MockResponse {
fn metadata(&self) -> &Metadata {
panic!("metadata not implemented!") // Breaks at runtime!
}
}
```
**Fix:** Mirror real API completely.
```rust
// ✅ GOOD: Complete mock
struct MockResponse {
status: String,
data: UserData,
metadata: Metadata, // All fields real API returns
}
```
**Gate function:**
```
BEFORE creating mock responses:
1. Examine actual API response structure
2. Include ALL fields system might consume
3. Verify mock matches real schema completely
```
---
## Anti-Pattern 5: Over-Complex Mocks
**Warning signs:**
- Mock setup longer than test logic
- Mocking everything to make test pass
- Test breaks when mock changes
**Consider:** Integration tests with real components often simpler than complex mocks.
</additional_anti_patterns>
<tdd_prevention>
## TDD Prevents These Anti-Patterns
**Why TDD helps:**
1. **Write test first** → Forces thinking about what you're actually testing
2. **Watch it fail** → Confirms test tests real behavior, not mocks
3. **Minimal implementation** → No test-only methods creep in
4. **Real dependencies** → See what test needs before mocking
**If you're testing mock behavior, you violated TDD** - you added mocks without watching test fail against real code first.
**REQUIRED BACKGROUND:** You MUST understand hyperpowers:test-driven-development before using this skill.
</tdd_prevention>
<critical_rules>
## Rules That Have No Exceptions
1. **Never test mock behavior** → Test real component behavior always
2. **Never add test-only methods to production** → Pollutes production code
3. **Never mock without understanding** → Must know dependencies and side effects
4. **Use gate functions before action** → Before asserting, adding methods, or mocking
5. **Follow TDD** → Write test first, watch fail, prevents testing mocks
## Common Excuses
All of these mean: **STOP. Apply the gate function.**
- "Just checking the mock is wired up" (Testing mock, not behavior)
- "Need reset() for test cleanup" (Test-only method, use test utilities)
- "I'll mock this to be safe" (Don't understand dependencies)
- "Mock setup is complex but necessary" (Probably over-mocking)
- "This will speed up tests" (Might break test logic)
</critical_rules>
<verification_checklist>
Before claiming tests are correct:
- [ ] No assertions on mock elements (no `is_mock()`, `is MockType`, etc.)
- [ ] No test-only methods in production classes
- [ ] All mocks preserve side effects test depends on
- [ ] Mock at lowest level needed (mock slow I/O, not business logic)
- [ ] Understand why each mock is necessary
- [ ] Mock structure matches real API completely
- [ ] Test logic shorter/equal to mock setup (not longer)
- [ ] Followed TDD (test failed with real code before mocking)
**Can't check all boxes?** Apply gate functions and refactor.
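A couple of quick greps catch the most common violations before the deeper review — a heuristic sketch, assuming a Rust project; the patterns and paths are assumptions and matches still need human judgment:
```bash
rg -n 'is_mock\(|is_test_mode\(' tests/       # assertions on mock existence (Law 1)
rg -n 'pub fn (reset|clear|cleanup)\(' src/   # likely test-only lifecycle methods in production (Law 2)
```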
</verification_checklist>
<integration>
**This skill requires:**
- hyperpowers:test-driven-development (prevents these anti-patterns)
- Understanding of mocking vs. faking vs. stubbing
**This skill is called by:**
- When writing tests
- When adding mocks
- When test setup becomes complex
- hyperpowers:test-driven-development (use gate functions during RED phase)
**Red flags triggering this skill:**
- Assertion checks for `*-mock` test IDs
- Methods only called in test files
- Mock setup >50% of test
- Test fails when you remove mock
- Can't explain why mock needed
</integration>
<resources>
**Detailed guides:**
- [Mocking vs Faking vs Stubbing](resources/test-doubles.md)
- [Test utilities patterns](resources/test-utilities.md)
- [When to use integration tests](resources/integration-vs-unit.md)
**When stuck:**
- Mock too complex → Consider integration test with real components
- Unsure what to mock → Run with real implementation first, observe
- Test failing mysteriously → Check if mock breaks test logic (use Gate Function 3)
- Production polluted → Move all test helpers to test_utils
</resources>

386
skills/using-hyper/SKILL.md Normal file
View File

@@ -0,0 +1,386 @@
---
name: using-hyper
description: Use when starting any conversation - establishes mandatory workflows for finding and using skills
---
<EXTREMELY_IMPORTANT>
If you think there is even a 1% chance a skill might apply to what you are doing, you ABSOLUTELY MUST read the skill.
**IF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT.**
This is not negotiable. This is not optional. You cannot rationalize your way out of this.
</EXTREMELY_IMPORTANT>
<skill_overview>
Skills are proven workflows; if one exists for your task, using it is mandatory, not optional.
</skill_overview>
<rigidity_level>
HIGH FREEDOM - The meta-process (check for skills, use Skill tool, announce usage) is rigid, but each individual skill defines its own rigidity level.
</rigidity_level>
<quick_reference>
**Before responding to ANY user message:**
1. List available skills mentally
2. Ask: "Does ANY skill match this request?"
3. If yes → Use Skill tool to load the skill file
4. Announce which skill you're using
5. Follow the skill exactly as written
**Skill has checklist?** Create TodoWrite for every item.
**Finding a relevant skill = mandatory to use it.**
</quick_reference>
<when_to_use>
This skill applies at the start of EVERY conversation and BEFORE every task:
- User asks you to implement a feature
- User asks you to fix a bug
- User asks you to refactor code
- User asks you to debug an issue
- User asks you to write tests
- User asks you to review code
- User describes a problem to solve
- User provides requirements to implement
**Applies to:** Literally any task that might have a corresponding skill.
</when_to_use>
<the_process>
## 1. MANDATORY FIRST RESPONSE PROTOCOL
Before responding to ANY user message, complete this checklist:
1. ☐ List available skills in your mind
2. ☐ Ask yourself: "Does ANY skill match this request?"
3. ☐ If yes → Use the Skill tool to read and run the skill file
4. ☐ Announce which skill you're using
5. ☐ Follow the skill exactly
**Responding WITHOUT completing this checklist = automatic failure.**
---
## 2. Execute Skills with the Skill Tool
**Always use the Skill tool to load skills.** Never rely on memory.
```
Skill tool: "hyperpowers:test-driven-development"
```
**Why:**
- Skills evolve - you need the current version
- Using the tool ensures you get the full skill content
- Confirms to user you're following the skill
---
## 3. Announce Skill Usage
Before using a skill, announce it:
**Format:** "I'm using [Skill Name] to [what you're doing]."
**Examples:**
- "I'm using hyperpowers:brainstorming to refine your idea into a design."
- "I'm using hyperpowers:test-driven-development to implement this feature."
- "I'm using hyperpowers:debugging-with-tools to investigate this error."
**Why:** Transparency helps user understand your process and catch errors early. Confirms you actually read the skill.
---
## 4. Follow Mandatory Workflows
**Before writing ANY code:**
- Use hyperpowers:brainstorming to refine requirements
- Use hyperpowers:writing-plans to create detailed plan
- Use hyperpowers:executing-plans to implement iteratively
**When implementing:**
- Use hyperpowers:test-driven-development (RED-GREEN-REFACTOR cycle)
- Use hyperpowers:verification-before-completion before claiming done
**When debugging:**
- Use hyperpowers:debugging-with-tools (tools first, fixes second)
- Use hyperpowers:fixing-bugs (complete workflow from discovery to closure)
**User instructions describe WHAT to do, not HOW.** "Add X" means use brainstorming, TDD, verification. Not permission to skip workflows.
---
## 5. Create TodoWrite for Skill Checklists
If a skill has a checklist, YOU MUST create TodoWrite todos for EACH item.
**Don't:**
- Work through checklist mentally
- Skip creating todos "to save time"
- Batch multiple items into one todo
- Mark complete without doing them
**Why:** Checklists without TodoWrite tracking = steps get skipped. Every time. The overhead is tiny compared to missing steps.
**Example:**
```
Skill has verification checklist:
- [ ] All tests pass
- [ ] No linter warnings
- [ ] bd task updated
TodoWrite todos:
1. Run all tests and verify they pass
2. Run linter and verify no warnings
3. Update bd task with completion status
```
</the_process>
<examples>
<example>
<scenario>User asks to implement a new feature</scenario>
<code>
User: "Add a user profile page with avatar upload"
Claude (without using-hyper):
"Sure! Let me start implementing the user profile page..."
[Starts writing code immediately without brainstorming or planning]
</code>
<why_it_fails>
- Skipped mandatory brainstorming (requirements unclear)
- Skipped writing-plans (no detailed implementation plan)
- Skipped test-driven-development (no tests first)
- Will likely miss edge cases, make wrong assumptions
- User has no visibility into approach before code is written
</why_it_fails>
<correction>
**Correct approach:**
Claude: "I'm using hyperpowers:brainstorming to refine your requirements into a complete design."
[Skill tool loads brainstorming]
[Asks Socratic questions about avatar size limits, formats, storage, etc.]
[Creates refined requirements]
Claude: "Now I'm using hyperpowers:writing-plans to create a detailed implementation plan."
[Skill tool loads writing-plans]
[Creates bd epic with tasks]
Claude: "Now I'm using hyperpowers:executing-plans to implement the tasks iteratively."
[Skill tool loads executing-plans]
[Implements with TDD, commits incrementally]
**What you gain:**
- Clear requirements before coding
- Detailed plan user can review
- Implementation follows proven workflow
- Tests written first (TDD)
- Progress tracked in bd
</correction>
</example>
<example>
<scenario>User reports a bug</scenario>
<code>
User: "Users can't log in after password reset"
Claude (without using-hyper):
"Let me check the password reset code... I found the issue, it's not updating the password hash. Let me fix that..."
[Makes immediate fix without investigation]
</code>
<why_it_fails>
- Skipped debugging-with-tools (no systematic investigation)
- Skipped fixing-bugs workflow (no bd issue, no regression test)
- "Obvious" fix might be wrong or incomplete
- No verification that fix actually works
- Will likely miss root cause and fix symptom
</why_it_fails>
<correction>
**Correct approach:**
Claude: "I'm using hyperpowers:debugging-with-tools to investigate this systematically."
[Skill tool loads debugging-with-tools]
[Uses debugger, checks logs, searches internet for known issues]
[Identifies root cause: token invalidation timing issue]
Claude: "Now I'm using hyperpowers:fixing-bugs to implement a complete fix."
[Skill tool loads fixing-bugs]
[Creates bd bug issue]
[Writes failing test reproducing bug (RED)]
[Implements minimal fix (GREEN)]
[Runs full test suite via test-runner agent]
[Verifies fix and closes bd issue]
**What you gain:**
- Root cause identified (not symptom)
- Regression test prevents recurrence
- Fix verified with evidence
- Bug tracked in bd
- Complete workflow documented
</correction>
</example>
<example>
<scenario>User asks to refactor code</scenario>
<code>
User: "This validation logic is duplicated, can you extract it?"
Claude (without using-hyper):
"Sure, let me extract that into a shared function..."
[Changes 5 files at once, then runs tests]
[Tests fail]
"Hmm, let me fix these test failures..."
</code>
<why_it_fails>
- Skipped refactoring-safely workflow (change→test→commit cycle)
- Big-bang refactoring (changed multiple files before testing)
- Tests failing means behavior changed (not safe refactoring)
- Can't identify which change broke what
- Wastes time debugging entire refactoring
</why_it_fails>
<correction>
**Correct approach:**
Claude: "I'm using hyperpowers:refactoring-safely to extract this validation logic incrementally."
[Skill tool loads refactoring-safely]
Step 1: Verify tests pass BEFORE refactoring
Step 2: Create bd refactoring task
Step 3: Extract validation from first file → test → commit
Step 4: Extract validation from second file → test → commit
Step 5: Create shared validator → test → commit
Step 6: Final verification → close bd task
**What you gain:**
- Tests stay green throughout (safe refactoring)
- Each commit is reviewable independently
- Know exactly which change broke if test fails
- Can stop halfway with useful progress
- Clear history of transformations
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Check for relevant skills BEFORE any task** → If skill exists, use it (not optional)
2. **Use Skill tool to load skills** → Never rely on memory (skills evolve)
3. **Announce skill usage** → Transparency helps catch errors early
4. **Follow mandatory workflows** → brainstorming before coding, TDD for implementation, verification before claiming done
5. **Create TodoWrite for checklists** → Mental tracking = skipped steps
## Common Rationalizations
All of these mean: **STOP. Check for and use the relevant skill.**
- "This is just a simple question" (Questions are tasks. Check for skills.)
- "I can check git/files quickly" (Files lack context. Check for skills.)
- "Let me gather information first" (Skills tell you HOW to gather. Check for skills.)
- "This doesn't need a formal skill" (If skill exists, use it. Not optional.)
- "I remember this skill" (Skills evolve. Use Skill tool to load current version.)
- "This doesn't count as a task" (Taking action = task. Check for skills.)
- "The skill is overkill for this" (Skills exist because "simple" becomes complex.)
- "I'll just do this one thing first" (Check for skills BEFORE doing anything.)
- "Instruction was specific so I can skip brainstorming" (Specific instructions = WHAT, not HOW. Use workflows.)
</critical_rules>
<understanding_rigidity>
## Rigid Skills (Follow Exactly)
These have LOW FREEDOM - follow the exact process:
- hyperpowers:test-driven-development (RED-GREEN-REFACTOR cycle)
- hyperpowers:verification-before-completion (evidence before claims)
- hyperpowers:executing-plans (continuous execution, substep tracking)
## Flexible Skills (Adapt Principles)
These have HIGH FREEDOM - adapt core principles to context:
- hyperpowers:brainstorming (Socratic method, but questions vary)
- hyperpowers:managing-bd-tasks (operations adapt to project)
- hyperpowers:sre-task-refinement (corner case analysis, but depth varies)
**The skill itself tells you its rigidity level.** Check `<rigidity_level>` section.
</understanding_rigidity>
<instructions_vs_workflows>
## User Instructions Describe WHAT, Not HOW
**User says:** "Add user authentication"
**This means:** Use brainstorming → writing-plans → executing-plans → TDD → verification
**User says:** "Fix this bug"
**This means:** Use debugging-with-tools → fixing-bugs → TDD → verification
**User says:** "Refactor this code"
**This means:** Use refactoring-safely (change→test→commit cycle)
**User instructions are the GOAL, not permission to skip workflows.**
**Red flags that you're rationalizing:**
- "Instruction was specific, don't need brainstorming"
- "Seems simple, don't need TDD"
- "Workflow is overkill for this"
**Why workflows matter MORE when instructions are specific:**
- Clear requirements = perfect time for structured implementation
- "Simple" tasks often have hidden complexity
- Skipping process on "easy" tasks is how they become hard problems
</instructions_vs_workflows>
<verification_checklist>
Before completing ANY task:
- [ ] Did I check for relevant skills before starting?
- [ ] Did I use Skill tool to load skills (not rely on memory)?
- [ ] Did I announce which skill I'm using?
- [ ] Did I follow the skill's process exactly?
- [ ] Did I create TodoWrite for any skill checklists?
- [ ] Did I follow mandatory workflows (brainstorming, TDD, verification)?
**Can't check all boxes?** You skipped critical steps. Review and fix.
</verification_checklist>
<integration>
**This skill calls:**
- ALL other skills (meta-skill that triggers appropriate skill usage)
**This skill is called by:**
- Session start (always loaded)
- User requests (check before every task)
**Critical workflows this establishes:**
- hyperpowers:brainstorming (before writing code)
- hyperpowers:test-driven-development (during implementation)
- hyperpowers:verification-before-completion (before claiming done)
</integration>
<resources>
**Available skills:**
- See skill descriptions in Skill tool's "Available Commands" section
- Each skill's description shows when to use it
**When unsure if skill applies:**
- If there's even 1% chance it applies → use it
- Better to load and decide "not needed" than to skip and fail
- Skills are optimized, loading them is cheap
</resources>

View File

@@ -0,0 +1,363 @@
---
name: verification-before-completion
description: Use before claiming work complete, fixed, or passing - requires running verification commands and confirming output; evidence before assertions always
---
<skill_overview>
Claiming work is complete without verification is dishonesty, not efficiency. Evidence before claims, always.
</skill_overview>
<rigidity_level>
LOW FREEDOM - NO exceptions. Run verification command, read output, THEN make claim.
No shortcuts. No "should work". No partial verification. Run it, prove it.
</rigidity_level>
<quick_reference>
| Claim | Verification Required | Not Sufficient |
|-------|----------------------|----------------|
| **Tests pass** | Run full test command, see 0 failures | Previous run, "should pass" |
| **Build succeeds** | Run build, see exit 0 | Linter passing |
| **Bug fixed** | Test original symptom, passes | Code changed |
| **Task complete** | Check all success criteria, run verifications | "Implemented bd-3" |
| **Epic complete** | `bd list --status open --parent bd-1` shows 0 | "All tasks done" |
**Iron Law:** NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
**Use test-runner agent for:** Tests, pre-commit hooks, commits (keeps verbose output out of context)
</quick_reference>
<when_to_use>
**ALWAYS before:**
- Any success/completion claim
- Any expression of satisfaction
- Committing, PR creation, task completion
- Moving to next task
- ANY communication suggesting completion/correctness
**Red flags you need this:**
- Using "should", "probably", "seems to"
- Expressing satisfaction before verification ("Great!", "Perfect!")
- About to commit/push without verification
- Trusting agent success reports
- Relying on partial verification
</when_to_use>
<the_process>
## The Gate Function
Before claiming ANY status:
### 1. Identify
What command proves this claim?
### 2. Run
Execute the full command (fresh, complete).
**For tests/hooks/commits:** Use `hyperpowers:test-runner` agent
- Agent captures verbose output in its context
- Returns only summary + failures
- Prevents context pollution
**For other commands:** Run directly and capture output
### 3. Read
Full output, check exit code, count failures.
### 4. Verify
Does output confirm the claim?
- If NO: State actual status with evidence
- If YES: State claim with evidence
### 5. Only Then
Make the claim.
**Skip any step = lying, not verifying**
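A minimal sketch of the gate applied to the claim "tests pass", assuming a cargo project; the command, output handling, and log length are illustrative. In real use you would dispatch the test-runner agent for a command this verbose, so the sketch only shows the ordering (identify, run, read, verify, then claim).

```python
import subprocess

# 1. Identify: `cargo test` is the command that proves the claim "tests pass".
# 2. Run: execute it fresh and in full, not a cached or partial run.
result = subprocess.run(["cargo", "test"], capture_output=True, text=True)

# 3. Read: the actual output and exit code, not a memory of an earlier run.
output = result.stdout + result.stderr
print(output[-2000:])

# 4./5. Verify, then claim, with evidence; otherwise state the real status.
if result.returncode == 0:
    print("Tests pass [Ran: cargo test, exit code 0]")
else:
    print(f"Tests FAILED (exit code {result.returncode}), investigating before any completion claim")
```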
</the_process>
<examples>
<example>
<scenario>Developer claims tests pass without running them</scenario>
<code>
Developer modifies authentication logic.
Developer thinks: "This fix is straightforward, tests should pass now"
Developer writes: "Great! All tests passing. The bug is fixed."
[No test command run, no output shown]
</code>
<why_it_fails>
**No evidence:**
- "Should pass" ≠ evidence
- Confidence ≠ verification
- Might have broken other tests
- Might not have fixed the bug
**Why dangerous:**
- Broken code ships
- Trust broken with partner
- Wastes time on false completion
</why_it_fails>
<correction>
**Run verification:**
```bash
# Dispatch test-runner agent
"Run: cargo test"
```
**Agent returns:**
```
Summary: 33 passed, 1 failed
Failures:
- test_login_with_expired_token: assertion failed
Expected: Err(AuthError::TokenExpired)
Got: Ok(User { id: 123 })
```
**State actual status:**
```
Tests: 33 passed, 1 failed.
Failure: test_login_with_expired_token still fails.
The fix didn't handle expired tokens.
Investigating...
```
**Result:** Accurate status, bug caught before committing.
</correction>
</example>
<example>
<scenario>Developer closes bd task without verifying success criteria</scenario>
<code>
bd-3 success criteria:
- [ ] All functions fully implemented (no stubs, no TODOs)
- [ ] Tests written and passing: test_generate_token, test_verify_token
- [ ] Pre-commit hooks pass
Developer implements token functions.
Developer thinks: "I implemented everything, task complete"
Runs: bd close bd-3
[No verification commands run]
</code>
<why_it_fails>
**Success criteria not verified:**
- Might have TODO comments left
- Specific tests not run
- Pre-commit hooks not checked
**Result:** Incomplete task closed, issues discovered later.
</why_it_fails>
<correction>
**Verify each criterion:**
```bash
# 1. Check for TODOs
rg "TODO|FIXME|todo!|unimplemented!" src/
# Output: [nothing found] ✅
# 2. Run specific tests
cargo test test_generate_token test_verify_token
# Output: test result: ok. 2 passed ✅
# 3. Run pre-commit hooks (via test-runner agent)
"Run: cargo fmt --check && cargo clippy && cargo test"
```
**Agent returns:**
```
All checks passed:
- cargo fmt: no formatting issues
- cargo clippy: no warnings
- cargo test: 34/34 passed
```
**THEN close task:**
```bash
bd close bd-3
```
**Result:** Task actually complete, all criteria met with evidence.
</correction>
</example>
<example>
<scenario>Developer claims build works based on linter passing</scenario>
<code>
Developer fixes TypeScript errors flagged by linter.
Runs: npm run lint
Output: ✅ No issues found
Developer writes: "Build passes! Ready to commit."
[No build command run]
</code>
<why_it_fails>
**Linter ≠ compiler:**
- Linter checks style and simple errors
- Doesn't verify imports, types, compilation
- Build might still fail
**Why it happens:** Partial verification, extrapolation.
</why_it_fails>
<correction>
**Run actual build:**
```bash
npm run build
```
**Output:**
```
ERROR in src/auth/service.ts:23:15
TS2304: Cannot find name 'validateToken'
Did you mean 'verifyToken'?
```
**State actual status:**
```
Linter passes, but build fails.
Error: validateToken not defined (should be verifyToken).
Fixing...
```
**After fix, run build again:**
```bash
npm run build
# Output: ✓ built in 1.2s
```
**Now can claim:**
```
Build passes. [Evidence: npm run build exit 0]
Ready to commit.
```
**Result:** Actual build status verified, error caught.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **No claims without fresh verification** → Run command, see output, THEN claim
- "Should work" = forbidden
- "Looks correct" = forbidden
- Previous run ≠ fresh verification
2. **Use test-runner agent for verbose commands** → Tests, hooks, commits
- Prevents context pollution
- Returns summary + failures only
- Never run `git commit` or `cargo test` directly if output is verbose
3. **Verify ALL success criteria** → Not just "tests pass"
- Read each criterion from bd task
- Run verification for each
- Check all pass before closing
4. **Evidence in every claim** → Show the output
- Not: "Tests pass"
- Yes: "Tests pass [Ran: cargo test, Output: 34/34 passed]"
## Common Excuses
All of these mean: Stop, run verification:
- "Should work now"
- "I'm confident this fixes it"
- "Just this once"
- "Linter passed" (when claiming build works)
- "Agent said success" (without independent verification)
- "I'm tired" (exhaustion ≠ excuse)
- "Partial check is enough"
## Pre-Commit Hook Assumption
**If your project uses pre-commit hooks enforcing tests:**
- All test failures are from your current changes
- Never check if errors were "pre-existing"
- Don't run `git checkout <sha> && pytest` to verify
- Pre-commit hooks guarantee previous commit passed
- Just fix the error directly
</critical_rules>
<verification_checklist>
Before claiming tests pass:
- [ ] Ran full test command (not partial)
- [ ] Saw output showing 0 failures
- [ ] Used test-runner agent if output verbose
Before claiming build succeeds:
- [ ] Ran build command (not just linter)
- [ ] Saw exit code 0
- [ ] Checked for compilation errors
Before closing bd task:
- [ ] Re-read success criteria from bd task
- [ ] Ran verification for each criterion
- [ ] Saw evidence all pass
- [ ] THEN closed task
Before closing bd epic (a combined sketch follows this checklist):
- [ ] Ran `bd list --status open --parent bd-1`
- [ ] Saw 0 open tasks
- [ ] Ran `bd dep tree bd-1`
- [ ] Confirmed all tasks closed
- [ ] THEN closed epic
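A combined sketch of the epic checks above; the bd commands are the ones this checklist names, while treating non-empty output as "open tasks remain" is an assumption about bd's output format rather than documented behavior.

```python
import subprocess

def run(*args):
    """Run a bd command and return trimmed stdout."""
    return subprocess.run(args, capture_output=True, text=True).stdout.strip()

open_tasks = run("bd", "list", "--status", "open", "--parent", "bd-1")
if open_tasks:
    print("Epic NOT complete, open tasks remain:\n" + open_tasks)
else:
    print(run("bd", "dep", "tree", "bd-1"))  # confirm every task shows as closed
    print("0 open tasks: safe to run `bd close bd-1`")
```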
</verification_checklist>
<integration>
**This skill calls:**
- test-runner (for verbose verification commands)
**This skill is called by:**
- test-driven-development (verify tests pass/fail)
- executing-plans (verify task success criteria)
- refactoring-safely (verify tests still pass)
- ALL skills before completion claims
**Agents used:**
- hyperpowers:test-runner (run tests, hooks, commits without output pollution)
</integration>
<resources>
**When stuck:**
- Tempted to say "should work" → Run the verification
- Agent reports success → Check VCS diff, verify independently
- Partial verification → Run complete command
- Tired and want to finish → Run verification anyway, no exceptions
**Verification patterns:**
- Tests: Use test-runner agent, check 0 failures
- Build: Run build command, check exit 0
- bd task: Verify each success criterion
- bd epic: Check all tasks closed with bd list/dep tree
</resources>

View File

@@ -0,0 +1,478 @@
---
name: writing-plans
description: Use to expand bd tasks with detailed implementation steps - adds exact file paths, complete code, verification commands assuming zero context
---
<skill_overview>
Enhance bd tasks with comprehensive implementation details for engineers with zero codebase context. Expand checklists into explicit steps: which files, complete code examples, exact commands, verification steps.
</skill_overview>
<rigidity_level>
MEDIUM FREEDOM - Follow task-by-task validation pattern, use codebase-investigator for verification.
Adapt implementation details to actual codebase state. Never use placeholders or meta-references.
</rigidity_level>
<quick_reference>
| Step | Action | Critical Rule |
|------|--------|---------------|
| **Identify Scope** | Single task, range, or full epic | No artificial limits |
| **Verify Codebase** | Use `codebase-investigator` agent | NEVER verify yourself, report discrepancies |
| **Draft Steps** | Write bite-sized (2-5 min) actions | Follow TDD cycle for new features |
| **Present to User** | Show COMPLETE expansion FIRST | Then ask for approval |
| **Update bd** | `bd update bd-N --design "..."` | Only after user approves |
| **Continue** | Move to next task automatically | NO asking permission between tasks |
**FORBIDDEN:** Placeholders like `[Full implementation steps as detailed above]`
**REQUIRED:** Actual content - complete code, exact paths, real commands
</quick_reference>
<when_to_use>
**Use after hyperpowers:sre-task-refinement or anytime tasks need more detail.**
Symptoms:
- bd tasks have implementation checklists but need expansion
- Engineer needs step-by-step guide with zero context
- Want explicit file paths, complete code examples
- Need exact verification commands
</when_to_use>
<the_process>
## 1. Identify Tasks to Expand
**User specifies scope:**
- Single: "Expand bd-2"
- Range: "Expand bd-2 through bd-5"
- Epic: "Expand all tasks in bd-1"
**If epic:**
```bash
bd dep tree bd-1 # View complete dependency tree
# Note all child task IDs
```
**Create TodoWrite tracker:**
```
- [ ] bd-2: [Task Title]
- [ ] bd-3: [Task Title]
...
```
## 2. For EACH Task (Loop Until All Done)
### 2a. Mark In Progress and Read Current State
```bash
# Mark in TodoWrite: in_progress
bd show bd-3 # Read current task design
```
### 2b. Verify Codebase State
**CRITICAL: Use codebase-investigator agent, NEVER verify yourself.**
**Provide agent with bd assumptions:**
```
Assumptions from bd-3:
- Auth service should be in src/services/auth.ts with login() and logout()
- User model in src/models/user.ts with email and password fields
- Test file at tests/services/auth.test.ts
- Uses bcrypt dependency for password hashing
Verify these assumptions and report:
1. What exists vs what bd-3 expects
2. Structural differences (different paths, functions, exports)
3. Missing or additional components
4. Current dependency versions
```
**Based on investigator report:**
- ✓ Confirmed assumptions → Use in implementation
- ✗ Incorrect assumptions → Adjust plan to match reality
- + Found additional → Document and incorporate
**NEVER write conditional steps:**
❌ "Update `index.js` if exists"
❌ "Modify `config.py` (if present)"
**ALWAYS write definitive steps:**
✅ "Create `src/auth.ts`" (investigator confirmed doesn't exist)
✅ "Modify `src/index.ts:45-67`" (investigator confirmed exists)
### 2c. Draft Expanded Implementation Steps
**Bite-sized granularity (2-5 minutes per step):**
For new features (follow test-driven-development):
1. Write the failing test (one step)
2. Run it to verify it fails (one step)
3. Implement minimal code to pass (one step)
4. Run tests to verify they pass (one step)
5. Commit (one step)
**Include in each step:**
- Exact file path
- Complete code example (not pseudo-code)
- Exact command to run
- Expected output
### 2d. Present COMPLETE Expansion to User
**CRITICAL: Show the full expansion BEFORE asking for approval.**
**Format:**
```markdown
**bd-[N]: [Task Title]**
**From bd issue:**
- Goal: [From bd show]
- Effort estimate: [From bd issue]
- Success criteria: [From bd issue]
**Codebase verification findings:**
- ✓ Confirmed: [what matched]
- ✗ Incorrect: [what issue said] - ACTUALLY: [reality]
- + Found: [unexpected discoveries]
**Implementation steps based on actual codebase state:**
### Step Group 1: [Component Name]
**Files:**
- Create: `exact/path/to/file.py`
- Modify: `exact/path/to/existing.py:123-145`
- Test: `tests/exact/path/to/test.py`
**Step 1: Write the failing test**
```python
# tests/auth/test_login.py
def test_login_with_valid_credentials():
user = create_test_user(email="test@example.com", password="secure123")
result = login(email="test@example.com", password="secure123")
assert result.success is True
assert result.user_id == user.id
```
**Step 2: Run test to verify it fails**
```bash
pytest tests/auth/test_login.py::test_login_with_valid_credentials
# Expected: ModuleNotFoundError: No module named 'auth.login'
```
[... continue for all steps ...]
```
**THEN ask for approval using AskUserQuestion:**
- Question: "Is this expansion approved for bd-[N]?"
- Options:
- "Approved - continue to next task"
- "Needs revision"
- "Other"
### 2e. If Approved: Update bd and Continue
```bash
bd update bd-3 --design "[paste complete expansion]"
# Mark completed in TodoWrite
# IMMEDIATELY continue to next task (NO asking permission)
```
### 2f. If Needs Revision: Iterate
- Keep as in_progress in TodoWrite
- Revise based on feedback
- Present again (step 2d)
## 3. After ALL Tasks Done
```
All bd issues now contain detailed implementation steps.
Epic ready for execution.
```
**Offer execution choice:**
"Ready to execute? I can use hyperpowers:executing-plans to implement iteratively."
</the_process>
<examples>
<example>
<scenario>Developer writes placeholder text instead of actual implementation steps</scenario>
<code>
bd update bd-3 --design "## Goal
Implement user authentication
## Implementation
[Full implementation steps as detailed above - includes all 6 step groups with complete code examples]
## Tests
[Complete code examples will be added here]"
</code>
<why_it_fails>
**Placeholders defeat the purpose:**
- Engineer executing bd-3 has zero context
- There is no "above" to reference (bd issue is the source of truth)
- Violates "exact instructions for zero-context engineer" principle
- Makes task impossible to execute
**Common placeholder patterns (ALL FORBIDDEN):**
- `[Full implementation steps as detailed above]`
- `[See above for detailed steps]`
- `[As specified in success criteria]`
- `[Complete code examples will be added here]`
</why_it_fails>
<correction>
**Write actual content:**
```bash
bd update bd-3 --design "## Goal
Implement user authentication
## Implementation
### Step 1: Write failing login test
```python
# tests/auth/test_login.py
import pytest
from auth.service import login
def test_login_with_valid_credentials():
result = login(email='test@example.com', password='pass123')
assert result.success is True
```
### Step 2: Run test (should fail)
```bash
pytest tests/auth/test_login.py::test_login_with_valid_credentials
# Expected: ModuleNotFoundError: No module named 'auth.service'
```
### Step 3: Create login function
```python
# src/auth/service.py
from dataclasses import dataclass
@dataclass
class LoginResult:
success: bool
user_id: int | None = None
def login(email: str, password: str) -> LoginResult:
# Minimal implementation
return LoginResult(success=True, user_id=1)
```
[... continue for all steps with complete code ...]
## Tests
All test code included in implementation steps above following TDD cycle."
```
**Result:** Engineer can execute without any context.
</correction>
</example>
<example>
<scenario>Developer verifies codebase state themselves instead of using codebase-investigator agent</scenario>
<code>
Developer reads files manually:
- Reads src/services/auth.ts directly
- Checks package.json manually
- Assumes file structure based on quick look
Writes expansion based on quick check:
"Modify src/services/auth.ts (if exists)"
</code>
<why_it_fails>
**Manual verification problems:**
- Misses nuances (existing functions, imports, structure)
- Creates conditional steps ("if exists")
- Doesn't catch version mismatches
- Doesn't report discrepancies from bd assumptions
**Result:** Implementation plan may not match actual codebase state.
</why_it_fails>
<correction>
**Use codebase-investigator agent:**
```
Dispatch agent with bd-3 assumptions:
"bd-3 expects auth service in src/services/auth.ts with login() and logout() functions.
Verify:
1. Does src/services/auth.ts exist?
2. What functions does it export?
3. How do login() and logout() work currently?
4. Any other relevant auth code?
5. What's the bcrypt version?"
```
**Agent reports:**
```
✓ src/services/auth.ts exists
✗ ONLY has login() function - NO logout() yet
+ Found: login() uses argon2 NOT bcrypt
+ Found: Session management in src/services/session.ts
✓ argon2 version: 0.31.2
```
**Write definitive steps based on findings:**
```
Step 1: Add logout() function to EXISTING src/services/auth.ts:45-67
(no "if exists" - investigator confirmed location)
Step 2: Use argon2 (already installed 0.31.2) not bcrypt
(no assumption - investigator confirmed actual dependency)
```
**Result:** Plan matches actual codebase state.
</correction>
</example>
<example>
<scenario>Developer asks permission between each task validation instead of continuing automatically</scenario>
<code>
After user approves bd-3 expansion:
Developer: "bd-3 expansion approved and updated in bd.
Should I continue to bd-4 now? What's your preference?"
[Waits for user response]
</code>
<why_it_fails>
**Breaks workflow momentum:**
- Unnecessary interruption
- User has to respond multiple times
- Slows down batch processing
- TodoWrite list IS the plan
**Why it happens:** Over-asking for permission instead of executing the plan.
</why_it_fails>
<correction>
**After user approves bd-3:**
```bash
bd update bd-3 --design "[expansion]" # Update bd
# Mark completed in TodoWrite
```
**IMMEDIATELY continue to bd-4:**
```bash
bd show bd-4 # Read next task
# Dispatch codebase-investigator with bd-4 assumptions
# Draft expansion
# Present bd-4 expansion to user
```
**NO asking:** "Should I continue?" or "What's your preference?"
**ONLY ask user:**
1. When presenting each task expansion for validation
2. At the VERY END after ALL tasks done to offer execution choice
**Between validations: JUST CONTINUE.**
**Result:** Efficient batch processing of all tasks.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **No placeholders or meta-references** → Write actual content
- ❌ FORBIDDEN: `[Full implementation steps as detailed above]`
- ✅ REQUIRED: Complete code, exact paths, real commands
2. **Use codebase-investigator agent** → Never verify yourself
- Agent gets bd assumptions
- Agent reports discrepancies
- You adjust plan to match reality
3. **Present COMPLETE expansion before asking** → User must SEE before approving
- Show full expansion in message text
- Then use AskUserQuestion for approval
- Never ask without showing first
4. **Continue automatically between validations** → Don't ask permission
- TodoWrite list IS your plan
- Execute it completely
- Only ask: (a) task validation, (b) final execution choice
5. **Write definitive steps** → Never conditional
- ❌ "Update `index.js` if exists"
- ✅ "Create `src/auth.ts`" (investigator confirmed)
## Common Excuses
All of these mean: Stop, write actual content (a quick self-check sketch follows this list):
- "I'll add the details later"
- "The implementation is obvious from the goal"
- "See above for the steps"
- "User can figure out the code"
</critical_rules>
<verification_checklist>
Before marking each task complete in TodoWrite:
- [ ] Used codebase-investigator agent (not manual verification)
- [ ] Presented COMPLETE expansion to user (showed full text)
- [ ] User approved expansion (via AskUserQuestion)
- [ ] Updated bd with actual content (no placeholders)
- [ ] No meta-references in design field
Before finishing all tasks:
- [ ] All tasks in TodoWrite marked completed
- [ ] All bd issues updated with expansions
- [ ] No conditional steps ("if exists")
- [ ] Complete code examples in all steps
- [ ] Exact file paths and commands throughout
</verification_checklist>
<integration>
**This skill calls:**
- sre-task-refinement (optional, can run before this)
- codebase-investigator (REQUIRED for each task verification)
- executing-plans (offered after all tasks expanded)
**This skill is called by:**
- User (via /hyperpowers:write-plan command)
- After brainstorming creates epic
**Agents used:**
- hyperpowers:codebase-investigator (verify assumptions, report discrepancies)
</integration>
<resources>
**Detailed guidance:**
- [bd command reference](../common-patterns/bd-commands.md)
- [Task structure examples](resources/task-examples.md) (if exists)
**When stuck:**
- Unsure about file structure → Use codebase-investigator
- Don't know version → Use codebase-investigator
- Tempted to write "if exists" → Use codebase-investigator first
- About to write placeholder → Stop, write actual content
- Want to ask permission → Check: Is this task validation or final choice? If neither, don't ask
</resources>

View File

@@ -0,0 +1,599 @@
---
name: writing-skills
description: Use when creating new skills, editing existing skills, or verifying skills work - applies TDD to documentation by testing with subagents before writing
---
<skill_overview>
Writing skills IS test-driven development applied to process documentation; write test (pressure scenario), watch fail (baseline), write skill, watch pass, refactor (close loopholes).
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow the RED-GREEN-REFACTOR cycle exactly when creating skills. No skill without failing test first. Same Iron Law as TDD.
</rigidity_level>
<quick_reference>
| Phase | Action | Verify |
|-------|--------|--------|
| **RED** | Create pressure scenarios | Document baseline failures |
| **RED** | Run WITHOUT skill | Agent violates rule |
| **GREEN** | Write minimal skill | Addresses baseline failures |
| **GREEN** | Run WITH skill | Agent now complies |
| **REFACTOR** | Find new rationalizations | Agent still complies |
| **REFACTOR** | Add explicit counters | Bulletproof against excuses |
| **DEPLOY** | Commit and optionally PR | Skill ready for use |
**Iron Law:** NO SKILL WITHOUT FAILING TEST FIRST (applies to new skills AND edits)
</quick_reference>
<when_to_use>
**Create skill when:**
- Technique wasn't intuitively obvious to you
- You'd reference this again across projects
- Pattern applies broadly (not project-specific)
- Others would benefit from this knowledge
**Never create for:**
- One-off solutions
- Standard practices well-documented elsewhere
- Project-specific conventions (put in CLAUDE.md instead)
**Edit existing skill when:**
- Found new rationalization agents use
- Discovered loophole in current guidance
- Need to add clarifying examples
**ALWAYS test before writing or editing. No exceptions.**
</when_to_use>
<tdd_mapping>
Skills use the exact same TDD cycle as code:
| TDD Concept | Skill Creation |
|-------------|----------------|
| **Test case** | Pressure scenario with subagent |
| **Production code** | Skill document (SKILL.md) |
| **Test fails (RED)** | Agent violates rule without skill |
| **Test passes (GREEN)** | Agent complies with skill present |
| **Refactor** | Close loopholes while maintaining compliance |
| **Write test first** | Run baseline scenario BEFORE writing skill |
| **Watch it fail** | Document exact rationalizations agent uses |
| **Minimal code** | Write skill addressing those specific violations |
| **Watch it pass** | Verify agent now complies |
| **Refactor cycle** | Find new rationalizations → plug → re-verify |
**REQUIRED BACKGROUND:** You MUST understand hyperpowers:test-driven-development before using this skill.
</tdd_mapping>
<the_process>
## 1. RED Phase - Create Failing Test
**Create pressure scenarios for subagent:**
```
Task tool with general-purpose agent:
"You are implementing a payment processing feature. User requirements:
- Process credit card payments
- Handle retries on failure
- Log all transactions
[PRESSURE 1: Time] You have 10 minutes before deployment.
[PRESSURE 2: Sunk Cost] You've already written 200 lines of code.
[PRESSURE 3: Authority] Senior engineer said 'just make it work, tests can wait.'
Implement this feature."
```
**Run WITHOUT skill present.**
**Document baseline behavior:**
- Exact rationalizations agent uses ("tests can wait," "simple feature," etc.)
- What agent skips (tests, verification, bd task, etc.)
- Patterns in failure modes
**Example baseline result:**
```
Agent response:
"I'll implement the payment processing quickly since time is tight..."
[Skips TDD]
[Skips verification-before-completion]
[Claims done without evidence]
```
**This is your failing test.** Agent doesn't follow the workflow without guidance.
---
## 2. GREEN Phase - Write Minimal Skill
Write skill that addresses the SPECIFIC failures from baseline:
**Structure:**
```markdown
---
name: skill-name-with-hyphens
description: Use when [specific triggers] - [what skill does]
---
<skill_overview>
One sentence core principle
</skill_overview>
<rigidity_level>
LOW | MEDIUM | HIGH FREEDOM - [What this means]
</rigidity_level>
[Rest of standard XML structure]
```
**Frontmatter rules:**
- Only `name` and `description` fields (max 1024 chars total)
- Name: letters, numbers, hyphens only (no parentheses/special chars)
- Description: Start with "Use when...", third person, includes triggers
**Description format:**
```yaml
# ❌ BAD: Too abstract, first person
description: I can help with async tests when they're flaky
# ✅ GOOD: Starts with "Use when", describes problem
description: Use when tests have race conditions or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
```
**Write skill addressing baseline failures:**
- Add explicit counters for rationalizations ("tests can wait" → "NO EXCEPTIONS: tests first")
- Create quick reference table for scanning
- Add concrete examples showing failure modes
- Use XML structure for all sections
**Run WITH skill present.**
**Verify agent now complies:**
- Same pressure scenario
- Agent now follows workflow
- No rationalizations from baseline appear
**This is your passing test.**
---
## 3. REFACTOR Phase - Close Loopholes
**Find NEW rationalizations:**
Run skill with DIFFERENT pressures:
- Combine 3+ pressures (time + sunk cost + exhaustion)
- Try meta-rationalizations ("this skill doesn't apply because...")
- Test with edge cases
**Document new failures:**
- What rationalizations appear NOW?
- What loopholes did agent find?
- What explicit counters are needed?
**Add counters to skill:**
```markdown
<critical_rules>
## Common Excuses
All of these mean: [Action to take]
- "Test can wait" (NO, test first always)
- "Simple feature" (Simple breaks too, test first)
- "Time pressure" (Broken code wastes more time)
[Add ALL rationalizations found during testing]
</critical_rules>
```
**Re-test until bulletproof:**
- Run scenarios again
- Verify new counters work
- Agent complies even under combined pressures
---
## 4. Quality Checks
Before deployment, verify (a rough lint sketch for the mechanical items follows this section):
- [ ] Has `<quick_reference>` section (scannable table)
- [ ] Has `<rigidity_level>` explicit
- [ ] Has 2-3 `<example>` tags showing failure modes
- [ ] Description <500 chars, starts with "Use when..."
- [ ] Keywords throughout for search (error messages, symptoms, tools)
- [ ] One excellent code example (not multi-language)
- [ ] Supporting files only for tools or heavy reference (>100 lines)
**Token efficiency:**
- Frequently-loaded skills: <200 words ideally
- Other skills: <500 words
- Move heavy content to resources/ files
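A rough lint sketch for the mechanical items above; the thresholds mirror this skill's guidance, the path handling is illustrative, and judgment items (example quality, keyword coverage) still need a human read.

```python
import re
import sys

path = sys.argv[1]  # e.g. skills/my-skill/SKILL.md (illustrative path)
text = open(path, encoding="utf-8").read()

match = re.match(r"---\n(.*?)\n---\n", text, re.S)
front = match.group(1) if match else ""
fields = dict(re.findall(r"^(\w+):\s*(.*)$", front, re.M))

problems = []
if set(fields) - {"name", "description"}:
    problems.append("frontmatter should contain only name and description")
if len(front) > 1024:
    problems.append("frontmatter exceeds 1024 characters")
if not re.fullmatch(r"[A-Za-z0-9-]+", fields.get("name", "")):
    problems.append("name must use only letters, numbers, hyphens")
if not fields.get("description", "").startswith("Use when"):
    problems.append("description should start with 'Use when...'")

words = len(text.split())
if words > 500:
    problems.append(f"{words} words, consider moving detail to resources/ files")

print("\n".join(problems) or f"Mechanical checks pass ({words} words)")
```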
---
## 5. Deploy
**Commit to git:**
```bash
git add skills/skill-name/
git commit -m "feat: add [skill-name] skill
Tested with subagents under [pressures used].
Addresses [baseline failures found].
Closes rationalizations:
- [Rationalization 1]
- [Rationalization 2]"
```
**Personal skills:** Write to `~/.claude/skills/` for cross-project use
**Plugin skills:** PR to plugin repository if broadly useful
**STOP:** Before moving to next skill, complete this entire process. No batching untested skills.
</the_process>
<examples>
<example>
<scenario>Developer writes skill without testing first</scenario>
<code>
# Developer writes skill:
"---
name: always-use-tdd
description: Always write tests first
---
Write tests first. No exceptions."
# Then tries to deploy it
</code>
<why_it_fails>
- No baseline behavior documented (don't know what agent does WITHOUT skill)
- No verification skill actually works (might not address real rationalizations)
- Generic guidance ("no exceptions") without specific counters
- Will likely miss common excuses agents use
- Violates Iron Law: no skill without failing test first
</why_it_fails>
<correction>
**Correct approach (RED-GREEN-REFACTOR):**
**RED Phase:**
1. Create pressure scenario (time + sunk cost)
2. Run WITHOUT skill
3. Document baseline: Agent says "I'll test after since time is tight"
**GREEN Phase:**
1. Write skill with explicit counter to that rationalization
2. Add: "Common excuses: 'Time is tight' → Wrong. Broken code wastes more time. Write test first."
3. Run WITH skill → agent now writes test first
**REFACTOR Phase:**
1. Try new pressure (exhaustion: "this is the 5th feature today")
2. Agent finds loophole: "these are all similar, I can skip tests"
3. Add counter: "Similar ≠ identical. Write test for each."
4. Re-test → bulletproof
**What you gain:**
- Know skill addresses real failures (saw baseline)
- Confident skill works (saw it fix behavior)
- Closed all loopholes (tested multiple pressures)
- Ready for production use
</correction>
</example>
<example>
<scenario>Developer edits skill without testing changes</scenario>
<code>
# Existing skill works well
# Developer thinks: "I'll just add this section about edge cases"
[Adds 50 lines to skill]
# Commits without testing
</code>
<why_it_fails>
- Don't know if new section actually helps (no baseline)
- Might introduce contradictions with existing guidance
- Could make skill less effective (more verbose, less clear)
- Violates Iron Law: applies to edits too
- Changes might not address actual rationalization patterns
</why_it_fails>
<correction>
**Correct approach:**
**RED Phase (for edit):**
1. Identify specific failure mode you want to address
2. Create pressure scenario that triggers it
3. Run WITH current skill → document how agent fails
**GREEN Phase (edit):**
1. Add ONLY content addressing that failure
2. Keep changes minimal
3. Run WITH edited skill → verify agent now complies
**REFACTOR Phase:**
1. Check edit didn't break existing scenarios
2. Run previous test cases
3. Verify all still pass
**What you gain:**
- Changes address real problems (saw failure)
- Know edit helps (saw improvement)
- Didn't break existing guidance (regression tested)
- Skill stays bulletproof
</correction>
</example>
<example>
<scenario>Skill description too vague for search</scenario>
<code>
---
name: async-testing
description: For testing async code
---
# Skill content...
</code>
<why_it_fails>
- Future Claude won't find this when needed
- "For testing async code" too abstract (when would Claude search this?)
- Doesn't describe symptoms or triggers
- Missing keywords like "flaky," "race condition," "timeout"
- Won't show up when agent has the actual problem
</why_it_fails>
<correction>
**Better description:**
```yaml
---
name: condition-based-waiting
description: Use when tests have race conditions, timing dependencies, or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
---
```
**Why this works:**
- Starts with "Use when" (triggers)
- Lists symptoms: "race conditions," "pass/fail inconsistently"
- Describes problem AND solution
- Keywords: "race conditions," "timing," "inconsistent," "timeouts"
- Future Claude searching "why are my tests flaky" will find this
**What you gain:**
- Skill actually gets found when needed
- Claude knows when to use it (clear triggers)
- Search terms match real developer language
- Description doubles as activation criteria
</correction>
</example>
</examples>
<skill_types>
## Technique
Concrete method with steps to follow.
**Examples:** condition-based-waiting, hyperpowers:root-cause-tracing
**Test approach:** Pressure scenarios with combined pressures
## Pattern
Way of thinking about problems.
**Examples:** flatten-with-flags, test-invariants
**Test approach:** Present problems the pattern solves, verify agent applies pattern
## Reference
API docs, syntax guides, tool documentation.
**Examples:** Office document manipulation, API reference guides
**Test approach:** Give task requiring reference, verify agent uses it correctly
**For detailed testing methodology by skill type:** See [resources/testing-methodology.md](resources/testing-methodology.md)
</skill_types>
<file_organization>
## Self-Contained Skill
```
defense-in-depth/
SKILL.md # Everything inline
```
**When:** All content fits, no heavy reference needed
## Skill with Reusable Tool
```
condition-based-waiting/
SKILL.md # Overview + patterns
example.ts # Working helpers to adapt
```
**When:** Tool is reusable code, not just narrative
## Skill with Heavy Reference
```
pptx/
SKILL.md # Overview + workflows
pptxgenjs.md # 600 lines API reference
ooxml.md # 500 lines XML structure
scripts/ # Executable tools
```
**When:** Reference material too large for inline (>100 lines)
**Keep inline:**
- Principles and concepts
- Code patterns (<50 lines)
- Everything that fits
</file_organization>
<search_optimization>
## Claude Search Optimization (CSO)
Future Claude needs to FIND your skill. Optimize for search.
### 1. Rich Description Field
**Format:** Start with "Use when..." + triggers + what it does
```yaml
# ❌ BAD: Too abstract
description: For async testing
# ❌ BAD: First person
description: I can help you with async tests
# ✅ GOOD: Triggers + problem + solution
description: Use when tests have race conditions or pass/fail inconsistently - replaces arbitrary timeouts with condition polling
```
### 2. Keyword Coverage
Use words Claude would search for:
- **Error messages:** "Hook timed out", "ENOTEMPTY", "race condition"
- **Symptoms:** "flaky", "hanging", "zombie", "pollution"
- **Synonyms:** "timeout/hang/freeze", "cleanup/teardown/afterEach"
- **Tools:** Actual commands, library names, file types
### 3. Token Efficiency
**Problem:** Frequently-referenced skills load into EVERY conversation.
**Target word counts:**
- Frequently-loaded: <200 words
- Other skills: <500 words
**Techniques:**
- Move details to tool --help
- Use cross-references to other skills
- Compress examples
- Eliminate redundancy
**Verification:**
```bash
wc -w skills/skill-name/SKILL.md
```
### 4. Cross-Referencing
**Use skill name only, with explicit markers:**
```markdown
**REQUIRED BACKGROUND:** You MUST understand hyperpowers:test-driven-development
**REQUIRED SUB-SKILL:** Use hyperpowers:debugging-with-tools first
```
**Don't use @ links:** Force-loads files immediately, burns context unnecessarily.
</search_optimization>
<critical_rules>
## Rules That Have No Exceptions
1. **NO SKILL WITHOUT FAILING TEST FIRST** → Applies to new skills AND edits
2. **Test with subagents under pressure** → Combined pressures (time + sunk cost + authority)
3. **Document baseline behavior** → Exact rationalizations, not paraphrases
4. **Write minimal skill addressing baseline** → Don't add content not validated by testing
5. **STOP before next skill** → Complete RED-GREEN-REFACTOR-DEPLOY for each skill
## Common Excuses
All of these mean: **STOP. Run baseline test first.**
- "Simple skill, don't need testing" (If simple, testing is fast. Do it.)
- "Just adding documentation" (Documentation can be wrong. Test it.)
- "I'll test after I write a few" (Batching untested = deploying untested code)
- "This is obvious, everyone knows it" (Then baseline will show agent already complies)
- "Testing is overkill for skills" (TDD applies to documentation too)
- "I'll adapt while testing" (Violates RED phase. Start over.)
- "I'll keep untested as reference" (Delete means delete. No exceptions.)
## The Iron Law
Same as TDD:
```
NO SKILL WITHOUT FAILING TEST FIRST
```
**No exceptions for:**
- "Simple additions"
- "Just adding a section"
- "Documentation updates"
- Edits to existing skills
**Write skill before testing?** Delete it. Start over.
</critical_rules>
<verification_checklist>
Before deploying ANY skill:
**RED Phase:**
- [ ] Created pressure scenarios (3+ combined pressures for discipline skills)
- [ ] Ran WITHOUT skill present
- [ ] Documented baseline behavior verbatim (exact rationalizations)
- [ ] Identified patterns in failures
**GREEN Phase:**
- [ ] Name uses only letters, numbers, hyphens
- [ ] YAML frontmatter: name + description only (max 1024 chars)
- [ ] Description starts with "Use when..." and includes triggers
- [ ] Description in third person
- [ ] Has `<quick_reference>` section
- [ ] Has `<rigidity_level>` explicit
- [ ] Has 2-3 `<example>` tags
- [ ] Addresses specific baseline failures
- [ ] Ran WITH skill present
- [ ] Verified agent now complies
**REFACTOR Phase:**
- [ ] Tested with different pressures
- [ ] Found NEW rationalizations
- [ ] Added explicit counters
- [ ] Re-tested until bulletproof
**Quality:**
- [ ] Keywords throughout for search
- [ ] One excellent code example (not multi-language)
- [ ] Token-efficient (check word count)
- [ ] Supporting files only if needed
**Deploy:**
- [ ] Committed to git with descriptive message
- [ ] Pushed to plugin repository (if applicable)
**Can't check all boxes?** Return to process and fix.
</verification_checklist>
<integration>
**This skill requires:**
- hyperpowers:test-driven-development (understand TDD before applying to docs)
- Task tool (for running subagent tests)
**This skill is called by:**
- Anyone creating or editing skills
- Plugin maintainers
- Users with personal skill repositories
**Agents used:**
- general-purpose (for testing skills under pressure)
</integration>
<resources>
**Detailed guides:**
- [Testing methodology by skill type](resources/testing-methodology.md) - How to test disciplines, techniques, patterns, reference skills
- [Anthropic best practices](resources/anthropic-best-practices.md) - Official skill authoring guidance
- [Graphviz conventions](resources/graphviz-conventions.dot) - Flowchart style rules
**When stuck:**
- Skill seems too simple to test → If simple, testing is fast. Do it anyway.
- Don't know what pressures to use → Time + sunk cost + authority always work
- Agent still rationalizes → Add explicit counter for that exact excuse
- Testing feels like overhead → Same as TDD: testing prevents bigger problems
</resources>

File diff suppressed because it is too large

View File

@@ -0,0 +1,172 @@
digraph STYLE_GUIDE {
// The style guide for our process DSL, written in the DSL itself
// Node type examples with their shapes
subgraph cluster_node_types {
label="NODE TYPES AND SHAPES";
// Questions are diamonds
"Is this a question?" [shape=diamond];
// Actions are boxes (default)
"Take an action" [shape=box];
// Commands are plaintext
"git commit -m 'msg'" [shape=plaintext];
// States are ellipses
"Current state" [shape=ellipse];
// Warnings are octagons
"STOP: Critical warning" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
// Entry/exit are double circles
"Process starts" [shape=doublecircle];
"Process complete" [shape=doublecircle];
// Examples of each
"Is test passing?" [shape=diamond];
"Write test first" [shape=box];
"npm test" [shape=plaintext];
"I am stuck" [shape=ellipse];
"NEVER use git add -A" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
}
// Edge naming conventions
subgraph cluster_edge_types {
label="EDGE LABELS";
"Binary decision?" [shape=diamond];
"Yes path" [shape=box];
"No path" [shape=box];
"Binary decision?" -> "Yes path" [label="yes"];
"Binary decision?" -> "No path" [label="no"];
"Multiple choice?" [shape=diamond];
"Option A" [shape=box];
"Option B" [shape=box];
"Option C" [shape=box];
"Multiple choice?" -> "Option A" [label="condition A"];
"Multiple choice?" -> "Option B" [label="condition B"];
"Multiple choice?" -> "Option C" [label="otherwise"];
"Process A done" [shape=doublecircle];
"Process B starts" [shape=doublecircle];
"Process A done" -> "Process B starts" [label="triggers", style=dotted];
}
// Naming patterns
subgraph cluster_naming_patterns {
label="NAMING PATTERNS";
// Questions end with ?
"Should I do X?";
"Can this be Y?";
"Is Z true?";
"Have I done W?";
// Actions start with verb
"Write the test";
"Search for patterns";
"Commit changes";
"Ask for help";
// Commands are literal
"grep -r 'pattern' .";
"git status";
"npm run build";
// States describe situation
"Test is failing";
"Build complete";
"Stuck on error";
}
// Process structure template
subgraph cluster_structure {
label="PROCESS STRUCTURE TEMPLATE";
"Trigger: Something happens" [shape=ellipse];
"Initial check?" [shape=diamond];
"Main action" [shape=box];
"git status" [shape=plaintext];
"Another check?" [shape=diamond];
"Alternative action" [shape=box];
"STOP: Don't do this" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
"Process complete" [shape=doublecircle];
"Trigger: Something happens" -> "Initial check?";
"Initial check?" -> "Main action" [label="yes"];
"Initial check?" -> "Alternative action" [label="no"];
"Main action" -> "git status";
"git status" -> "Another check?";
"Another check?" -> "Process complete" [label="ok"];
"Another check?" -> "STOP: Don't do this" [label="problem"];
"Alternative action" -> "Process complete";
}
// When to use which shape
subgraph cluster_shape_rules {
label="WHEN TO USE EACH SHAPE";
"Choosing a shape" [shape=ellipse];
"Is it a decision?" [shape=diamond];
"Use diamond" [shape=diamond, style=filled, fillcolor=lightblue];
"Is it a command?" [shape=diamond];
"Use plaintext" [shape=plaintext, style=filled, fillcolor=lightgray];
"Is it a warning?" [shape=diamond];
"Use octagon" [shape=octagon, style=filled, fillcolor=pink];
"Is it entry/exit?" [shape=diamond];
"Use doublecircle" [shape=doublecircle, style=filled, fillcolor=lightgreen];
"Is it a state?" [shape=diamond];
"Use ellipse" [shape=ellipse, style=filled, fillcolor=lightyellow];
"Default: use box" [shape=box, style=filled, fillcolor=lightcyan];
"Choosing a shape" -> "Is it a decision?";
"Is it a decision?" -> "Use diamond" [label="yes"];
"Is it a decision?" -> "Is it a command?" [label="no"];
"Is it a command?" -> "Use plaintext" [label="yes"];
"Is it a command?" -> "Is it a warning?" [label="no"];
"Is it a warning?" -> "Use octagon" [label="yes"];
"Is it a warning?" -> "Is it entry/exit?" [label="no"];
"Is it entry/exit?" -> "Use doublecircle" [label="yes"];
"Is it entry/exit?" -> "Is it a state?" [label="no"];
"Is it a state?" -> "Use ellipse" [label="yes"];
"Is it a state?" -> "Default: use box" [label="no"];
}
// Good vs bad examples
subgraph cluster_examples {
label="GOOD VS BAD EXAMPLES";
// Good: specific and shaped correctly
"Test failed" [shape=ellipse];
"Read error message" [shape=box];
"Can reproduce?" [shape=diamond];
"git diff HEAD~1" [shape=plaintext];
"NEVER ignore errors" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
"Test failed" -> "Read error message";
"Read error message" -> "Can reproduce?";
"Can reproduce?" -> "git diff HEAD~1" [label="yes"];
// Bad: vague and wrong shapes
bad_1 [label="Something wrong", shape=box]; // Should be ellipse (state)
bad_2 [label="Fix it", shape=box]; // Too vague
bad_3 [label="Check", shape=box]; // Should be diamond
bad_4 [label="Run command", shape=box]; // Should be plaintext with actual command
bad_1 -> bad_2;
bad_2 -> bad_3;
bad_3 -> bad_4;
}
}

View File

@@ -0,0 +1,187 @@
# Persuasion Principles for Skill Design
## Overview
LLMs respond to the same persuasion principles as humans. Understanding this psychology helps you design more effective skills - not to manipulate, but to ensure critical practices are followed even under pressure.
**Research foundation:** Meincke et al. (2025) tested 7 persuasion principles with N=28,000 AI conversations. Persuasion techniques more than doubled compliance rates (33% → 72%, p < .001).
## The Seven Principles
### 1. Authority
**What it is:** Deference to expertise, credentials, or official sources.
**How it works in skills:**
- Imperative language: "YOU MUST", "Never", "Always"
- Non-negotiable framing: "No exceptions"
- Eliminates decision fatigue and rationalization
**When to use:**
- Discipline-enforcing skills (TDD, verification requirements)
- Safety-critical practices
- Established best practices
**Example:**
```markdown
✅ Write code before test? Delete it. Start over. No exceptions.
❌ Consider writing tests first when feasible.
```
### 2. Commitment
**What it is:** Consistency with prior actions, statements, or public declarations.
**How it works in skills:**
- Require announcements: "Announce skill usage"
- Force explicit choices: "Choose A, B, or C"
- Use tracking: TodoWrite for checklists
**When to use:**
- Ensuring skills are actually followed
- Multi-step processes
- Accountability mechanisms
**Example:**
```markdown
✅ When you find a skill, you MUST announce: "I'm using [Skill Name]"
❌ Consider letting your partner know which skill you're using.
```
### 3. Scarcity
**What it is:** Urgency from time limits or limited availability.
**How it works in skills:**
- Time-bound requirements: "Before proceeding"
- Sequential dependencies: "Immediately after X"
- Prevents procrastination
**When to use:**
- Immediate verification requirements
- Time-sensitive workflows
- Preventing "I'll do it later"
**Example:**
```markdown
✅ After completing a task, IMMEDIATELY request code review before proceeding.
❌ You can review code when convenient.
```
### 4. Social Proof
**What it is:** Conformity to what others do or what's considered normal.
**How it works in skills:**
- Universal patterns: "Every time", "Always"
- Failure modes: "X without Y = failure"
- Establishes norms
**When to use:**
- Documenting universal practices
- Warning about common failures
- Reinforcing standards
**Example:**
```markdown
✅ Checklists without TodoWrite tracking = steps get skipped. Every time.
❌ Some people find TodoWrite helpful for checklists.
```
### 5. Unity
**What it is:** Shared identity, "we-ness", in-group belonging.
**How it works in skills:**
- Collaborative language: "our codebase", "we're colleagues"
- Shared goals: "we both want quality"
**When to use:**
- Collaborative workflows
- Establishing team culture
- Non-hierarchical practices
**Example:**
```markdown
✅ We're colleagues working together. I need your honest technical judgment.
❌ You should probably tell me if I'm wrong.
```
### 6. Reciprocity
**What it is:** Obligation to return benefits received.
**How it works:**
- Use sparingly - can feel manipulative
- Rarely needed in skills
**When to avoid:**
- Almost always (other principles more effective)
### 7. Liking
**What it is:** Preference for cooperating with those we like.
**How it works:**
- **DON'T USE for compliance**
- Conflicts with honest feedback culture
- Creates sycophancy
**When to avoid:**
- Always for discipline enforcement
## Principle Combinations by Skill Type
| Skill Type | Use | Avoid |
|------------|-----|-------|
| Discipline-enforcing | Authority + Commitment + Social Proof | Liking, Reciprocity |
| Guidance/technique | Moderate Authority + Unity | Heavy authority |
| Collaborative | Unity + Commitment | Authority, Liking |
| Reference | Clarity only | All persuasion |
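For example, a discipline-enforcing skill might combine authority, commitment, and social proof in one short passage. The snippet below is an illustrative sketch, not an excerpt from an existing skill:
```markdown
## Before Marking Any Task Complete

YOU MUST run the full verification checklist. No exceptions.

Announce: "Running the verification checklist now."

Skipping verification = broken work shipped as done. Every time.
```
The first line leans on authority, the announcement creates commitment, and the closing line uses social proof to frame skipping as the predictable failure mode.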
## Why This Works: The Psychology
**Bright-line rules reduce rationalization:**
- "YOU MUST" removes decision fatigue
- Absolute language eliminates "is this an exception?" questions
- Explicit anti-rationalization counters close specific loopholes
**Implementation intentions create automatic behavior:**
- Clear triggers + required actions = automatic execution
- "When X, do Y" more effective than "generally do Y"
- Reduces cognitive load on compliance
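For instance, a trigger-action rule might look like this (a hypothetical example, following the ✅/❌ convention used above):
```markdown
✅ When a test fails, STOP and read the complete error output before editing any code.
❌ Generally try to understand errors before fixing them.
```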
**LLMs are parahuman:**
- Trained on human text containing these patterns
- Authority language precedes compliance in training data
- Commitment sequences (statement → action) frequently modeled
- Social proof patterns (everyone does X) establish norms
## Ethical Use
**Legitimate:**
- Ensuring critical practices are followed
- Creating effective documentation
- Preventing predictable failures
**Illegitimate:**
- Manipulating for personal gain
- Creating false urgency
- Guilt-based compliance
**The test:** Would this technique serve the user's genuine interests if they fully understood it?
## Research Citations
**Cialdini, R. B. (2021).** *Influence: The Psychology of Persuasion (New and Expanded).* Harper Business.
- Seven principles of persuasion
- Empirical foundation for influence research
**Meincke, L., Shapiro, D., Duckworth, A. L., Mollick, E., Mollick, L., & Cialdini, R. (2025).** Call Me A Jerk: Persuading AI to Comply with Objectionable Requests. University of Pennsylvania.
- Tested 7 principles with N=28,000 LLM conversations
- Compliance increased 33% → 72% with persuasion techniques
- Authority, commitment, scarcity most effective
- Validates parahuman model of LLM behavior
## Quick Reference
When designing a skill, ask:
1. **What type is it?** (Discipline vs. guidance vs. reference)
2. **What behavior am I trying to change?**
3. **Which principle(s) apply?** (Usually authority + commitment for discipline)
4. **Am I combining too many?** (Don't use all seven)
5. **Is this ethical?** (Serves user's genuine interests?)
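Worked through for a hypothetical TDD-enforcement skill, the checklist might come out like this:
```markdown
1. Type: discipline-enforcing
2. Behavior to change: writing implementation code before tests exist
3. Principles: authority ("Delete it. No exceptions.") + commitment (announce the skill)
4. Combining too many? No - two principles, applied consistently
5. Ethical? Yes - it protects the quality bar the user explicitly asked for
```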

View File

@@ -0,0 +1,167 @@
## Testing All Skill Types
Different skill types need different test approaches:
### Discipline-Enforcing Skills (rules/requirements)
**Examples:** TDD, hyperpowers:verification-before-completion, hyperpowers:designing-before-coding
**Test with:**
- Academic questions: Do they understand the rules?
- Pressure scenarios: Do they comply under stress?
- Multiple pressures combined: time + sunk cost + exhaustion
- Identify rationalizations and add explicit counters
**Success criteria:** Agent follows rule under maximum pressure
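A pressure scenario might be phrased like this prompt to a subagent (hypothetical wording; adapt the pressures to the skill under test):
```
You are 3 hours into this task and the demo starts in 20 minutes. You just
finished a working implementation but have no tests. Your partner says
"just ship it, we'll add tests tomorrow." The TDD skill is loaded.
What do you do?
```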
### Technique Skills (how-to guides)
**Examples:** condition-based-waiting, hyperpowers:root-cause-tracing, defensive-programming
**Test with:**
- Application scenarios: Can they apply the technique correctly?
- Variation scenarios: Do they handle edge cases?
- Missing information tests: Do instructions have gaps?
**Success criteria:** Agent successfully applies technique to new scenario
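An application scenario for a technique skill could look like this (illustrative only):
```
A background job finishes in 2 seconds on some runs and 30 seconds on others.
The current test uses a fixed 10-second sleep and is flaky. Using the
condition-based-waiting skill, rewrite the test to wait on the job's
completion signal instead of a fixed delay.
```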
### Pattern Skills (mental models)
**Examples:** reducing-complexity, information-hiding concepts
**Test with:**
- Recognition scenarios: Do they recognize when pattern applies?
- Application scenarios: Can they use the mental model?
- Counter-examples: Do they know when NOT to apply?
**Success criteria:** Agent correctly identifies when/how to apply pattern
### Reference Skills (documentation/APIs)
**Examples:** API documentation, command references, library guides
**Test with:**
- Retrieval scenarios: Can they find the right information?
- Application scenarios: Can they use what they found correctly?
- Gap testing: Are common use cases covered?
**Success criteria:** Agent finds and correctly applies reference information
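A retrieval scenario might read like this (hypothetical; the API named here is invented for the test):
```
Using only the payments API reference skill, find the call that cancels a
pending charge and show how you would invoke it with an idempotency key.
```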
## Common Rationalizations for Skipping Testing
| Excuse | Reality |
|--------|---------|
| "Skill is obviously clear" | Clear to you ≠ clear to other agents. Test it. |
| "It's just a reference" | References can have gaps, unclear sections. Test retrieval. |
| "Testing is overkill" | Untested skills have issues. Always. 15 min testing saves hours. |
| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
| "Too tedious to test" | Testing is less tedious than debugging bad skill in production. |
| "I'm confident it's good" | Overconfidence guarantees issues. Test anyway. |
| "Academic review is enough" | Reading ≠ using. Test application scenarios. |
| "No time to test" | Deploying untested skill wastes more time fixing it later. |
**All of these mean: Test before deploying. No exceptions.**
## Bulletproofing Skills Against Rationalization
Skills that enforce discipline (like TDD) need to resist rationalization. Agents are smart and will find loopholes when under pressure.
**Psychology note:** Understanding WHY persuasion techniques work helps you apply them systematically. See persuasion-principles.md for research foundation (Cialdini, 2021; Meincke et al., 2025) on authority, commitment, scarcity, social proof, and unity principles.
### Close Every Loophole Explicitly
Don't just state the rule - forbid specific workarounds:
<Bad>
```markdown
Write code before test? Delete it.
```
</Bad>
<Good>
```markdown
Write code before test? Delete it. Start over.
**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
```
</Good>
### Address "Spirit vs Letter" Arguments
Add this foundational principle early in the skill:
```markdown
**Violating the letter of the rules is violating the spirit of the rules.**
```
This cuts off an entire class of "I'm following the spirit" rationalizations.
### Build Rationalization Table
Capture rationalizations from baseline testing (see Testing section below). Every excuse agents make goes in the table:
```markdown
| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
```
### Create Red Flags List
Make it easy for agents to self-check when rationalizing:
```markdown
## Red Flags - STOP and Start Over
- Code before test
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "It's about spirit not ritual"
- "This is different because..."
**All of these mean: Delete code. Start over with TDD.**
```
### Update CSO for Violation Symptoms
Add to the skill's description the symptoms that appear when you're ABOUT to violate the rule:
```yaml
description: use when implementing any feature or bugfix, before writing implementation code
```
## RED-GREEN-REFACTOR for Skills
Follow the TDD cycle:
### RED: Write Failing Test (Baseline)
Run pressure scenario with subagent WITHOUT the skill. Document exact behavior:
- What choices did they make?
- What rationalizations did they use (verbatim)?
- Which pressures triggered violations?
This is "watch the test fail" - you must see what agents naturally do before writing the skill.
### GREEN: Write Minimal Skill
Write skill that addresses those specific rationalizations. Don't add extra content for hypothetical cases.
Run same scenarios WITH skill. Agent should now comply.
### REFACTOR: Close Loopholes
Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
**REQUIRED SUB-SKILL:** Use hyperpowers:testing-skills-with-subagents for the complete testing methodology:
- How to write pressure scenarios
- Pressure types (time, sunk cost, authority, exhaustion)
- Plugging holes systematically
- Meta-testing techniques