Initial commit

Zhongwei Li
2025-11-29 18:28:37 +08:00
commit ccc65b3f07
180 changed files with 53970 additions and 0 deletions

# create-plans
**Hierarchical project planning optimized for solo developer + Claude**
Create executable plans that Claude can run, not enterprise documentation that sits unused.
## Philosophy
**You are the visionary. Claude is the builder.**
No teams. No stakeholders. No ceremonies. No coordination overhead.
Plans are written AS prompts (PLAN.md IS the execution prompt), not documentation that gets transformed into prompts later.
## Quick Start
```
Skill("create-plans")
```
The skill will:
1. Scan for existing planning structure
2. Check for git repo (offers to initialize)
3. Present context-aware options
4. Guide you through the appropriate workflow
## Planning Hierarchy
```
BRIEF.md → Human vision (what and why)
ROADMAP.md → Phase structure (high-level plan)
RESEARCH.md → Research prompt (for unknowns - optional)
FINDINGS.md → Research output (if research done)
PLAN.md → THE PROMPT (Claude executes this)
SUMMARY.md → Outcome (existence = phase complete)
```
## Directory Structure
All planning artifacts go in `.planning/`:
```
.planning/
├── BRIEF.md # Project vision
├── ROADMAP.md # Phase structure + tracking
└── phases/
├── 01-foundation/
│ ├── PLAN.md # THE PROMPT (execute this)
│ ├── SUMMARY.md # Outcome (exists = done)
│ └── .continue-here.md # Handoff (temporary)
└── 02-auth/
├── RESEARCH.md # Research prompt (if needed)
├── FINDINGS.md # Research output
├── PLAN.md # Execute prompt
└── SUMMARY.md
```
## Workflows
### Starting a New Project
1. Invoke skill
2. Choose "Start new project"
3. Answer questions about vision/goals
4. Skill creates BRIEF.md
5. Optionally create ROADMAP.md with phases
6. Plan first phase
### Planning a Phase
1. Skill reads BRIEF + ROADMAP
2. Loads domain expertise if applicable (see Domain Skills below)
3. If phase has unknowns → create RESEARCH.md first
4. Creates PLAN.md (the executable prompt)
5. You review or execute
### Executing a Phase
1. Skill reads PLAN.md
2. Executes each task with verification
3. Creates SUMMARY.md when complete
4. Git commits phase completion
5. Offers to plan next phase
### Pausing Work (Handoff)
1. Choose "Create handoff"
2. Skill creates `.continue-here.md` with full context
3. When resuming, skill loads handoff and continues
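A hypothetical `.continue-here.md` (illustrative fields and values; see `templates/continue-here.md` for the actual format):
```
# Continue Here
Phase: 02-auth, plan 2, task 3 of 5
Done: Schema and API routes committed
Next: Wire login form to the auth API route
Watch out: Session cookie name must match middleware config
```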
## Domain Skills (Optional)
**What are domain skills?**
Full-fledged agent skills that exhaustively document how to build in a specific framework/platform. They make your plans concrete instead of generic.
**Without domain skill:**
```
Task: Create authentication system
Action: Implement user login
```
Generic. Not helpful.
**With domain skill (macOS apps):**
```
Task: Create login window
Files: Sources/Views/LoginView.swift
Action: SwiftUI view with @Bindable for User model. TextField for username/password.
SecureField for password (uses system keychain). Submit button triggers validation
logic. Use @FocusState for tab order. Add Command-L keyboard shortcut.
Verify: xcodebuild test && open App.app (check tab order, keychain storage)
```
Specific. Executable. Framework-appropriate.
**Structure of domain skills:**
```
~/.claude/skills/expertise/[domain]/
├── SKILL.md # Router + essential principles
├── workflows/ # build-new-app, add-feature, debug-app, etc.
└── references/ # Exhaustive domain knowledge (often 10k+ lines)
```
**Domain skills are dual-purpose:**
1. **Standalone skills** - Invoke with `Skill("build-macos-apps")` for guided development
2. **Context for create-plans** - Loaded automatically when planning that domain
**Example domains:**
- `macos-apps` - Swift/SwiftUI macOS (19 references, 10k+ lines)
- `iphone-apps` - Swift/SwiftUI iOS
- `unity-games` - Unity game development
- `swift-midi-apps` - MIDI/audio apps
- `with-agent-sdk` - Claude Agent SDK apps
- `nextjs-ecommerce` - Next.js e-commerce
**How it works:**
1. Skill infers domain from your request ("build a macOS app" → build-macos-apps)
2. Before creating PLAN.md, reads the relevant `~/.claude/skills/expertise/macos-apps/references/*.md`
3. Uses that exhaustive knowledge to write framework-specific tasks
4. Result: Plans that match your actual tech stack with all the details
**What if you don't have domain skills?**
The skill works fine without them and proceeds with general planning, but tasks will be more generic and require more clarification during execution.
### Creating a Domain Skill
Domain skills are created with the [create-agent-skills](../create-agent-skills/) skill.
**Process:**
1. `Skill("create-agent-skills")` → choose "Build a new skill"
2. Name: `build-[your-domain]`
3. Description: "Build [framework/platform] apps. Full lifecycle - build, debug, test, optimize, ship."
4. Ask it to create exhaustive references covering:
- Architecture patterns
- Project scaffolding
- Common features (data, networking, UI)
- Testing and debugging
- Platform-specific conventions
- CLI workflow (how to build/run without IDE)
- Deployment/shipping
**The skill should be comprehensive** - 5k-10k+ lines documenting everything about building in that domain. When create-plans loads it, the resulting PLAN.md tasks will be detailed and executable.
## Quality Controls
Research prompts include systematic verification to prevent gaps:
- **Verification checklists** - Enumerate ALL options before researching
- **Blind spots review** - "What might I have missed?"
- **Critical claims audit** - Verify "X is not possible" with sources
- **Quality reports** - Distinguish verified facts from assumptions
- **Streaming writes** - Write incrementally to prevent token limit failures
See `references/research-pitfalls.md` for known mistakes and prevention.
## Key Principles
### Solo Developer + Claude
Planning for ONE person (you) and ONE implementer (Claude). No team coordination, stakeholder management, or enterprise processes.
### Plans Are Prompts
PLAN.md IS the execution prompt. It contains objective, context (@file references), tasks (Files/Action/Verify/Done), and verification steps.
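A hypothetical task entry (file path and endpoint illustrative) showing the Files/Action/Verify/Done shape:
```
Task: Add health endpoint
Files: src/app/api/health/route.ts
Action: Create GET route returning { status: "ok" }
Verify: curl localhost:3000/api/health returns 200
Done: Endpoint responds with status ok
```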
### Ship Fast, Iterate Fast
Plan → Execute → Ship → Learn → Repeat. No multi-week timelines, approval gates, or sprint ceremonies.
### Context Awareness
Monitors token usage:
- **25% remaining**: Mentions context getting full
- **15% remaining**: Pauses, offers handoff
- **10% remaining**: Auto-creates handoff, stops
Never starts large operations below 15% without confirmation.
### User Gates
Pauses at critical decision points:
- Before writing PLAN.md (confirm breakdown)
- After low-confidence research
- On verification failures
- When previous phase had issues
See `references/user-gates.md` for full gate patterns.
### Git Versioning
All planning artifacts are version controlled. Commits outcomes, not process:
- Initialization commit (BRIEF + ROADMAP)
- Phase completion commits (PLAN + SUMMARY + code)
- Handoff commits (when pausing work)
Git log becomes project history.
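As a sketch, a phase-completion commit (paths illustrative):
```
git add .planning/phases/01-foundation/ src/
git commit -m "Complete phase 01: foundation"
```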
## Anti-Patterns
This skill NEVER includes:
- Team structures, roles, RACI matrices
- Stakeholder management, alignment meetings
- Sprint ceremonies, standups, retros
- Multi-week estimates, resource allocation
- Change management, governance processes
- Documentation for documentation's sake
If it sounds like corporate PM theater, it doesn't belong.
## Files Reference
### Structure
- `references/directory-structure.md` - Planning directory layout
- `references/hierarchy-rules.md` - How levels build on each other
### Formats
- `references/plan-format.md` - PLAN.md structure
- `references/handoff-format.md` - Context handoff structure
### Patterns
- `references/context-scanning.md` - How skill understands current state
- `references/context-management.md` - Token usage monitoring
- `references/user-gates.md` - When to pause and ask
- `references/git-integration.md` - Version control patterns
- `references/research-pitfalls.md` - Known research mistakes
### Templates
- `templates/brief.md` - Project vision document
- `templates/roadmap.md` - Phase structure
- `templates/phase-prompt.md` - Executable phase prompt (PLAN.md)
- `templates/research-prompt.md` - Research prompt (RESEARCH.md)
- `templates/summary.md` - Phase outcome (SUMMARY.md)
- `templates/continue-here.md` - Context handoff
### Workflows
- `workflows/create-brief.md` - Create project vision
- `workflows/create-roadmap.md` - Define phases from brief
- `workflows/plan-phase.md` - Create executable phase prompt
- `workflows/execute-phase.md` - Run phase, create summary
- `workflows/research-phase.md` - Create and run research
- `workflows/plan-chunk.md` - Plan immediate next tasks
- `workflows/transition.md` - Mark phase complete, advance
- `workflows/handoff.md` - Create context handoff for pausing
- `workflows/resume.md` - Load handoff, restore context
- `workflows/get-guidance.md` - Help decide planning approach
## Example Domain Skill
See `build/example-nextjs/` for a minimal domain skill showing:
- Framework-specific patterns
- Project structure conventions
- Common commands
- Phase breakdown strategies
- Task specificity guidelines
Use this as a template for creating your own domain skills.
## Success Criteria
Planning skill succeeds when:
- Context scan runs before intake
- Appropriate workflow selected based on state
- PLAN.md IS the executable prompt (not separate doc)
- Hierarchy is maintained (brief → roadmap → phase)
- Handoffs preserve full context for resumption
- Context limits respected (auto-handoff at 10%)
- Quality controls prevent research gaps
- Streaming writes prevent token limit failures

---
name: create-plans
description: Create hierarchical project plans optimized for solo agentic development. Use when planning projects, phases, or tasks that Claude will execute. Produces Claude-executable plans with verification criteria, not enterprise documentation. Handles briefs, roadmaps, phase plans, and context handoffs.
---
<essential_principles>
<principle name="solo_developer_plus_claude">
You are planning for ONE person (the user) and ONE implementer (Claude).
No teams. No stakeholders. No ceremonies. No coordination overhead.
The user is the visionary/product owner. Claude is the builder.
</principle>
<principle name="plans_are_prompts">
PLAN.md is not a document that gets transformed into a prompt.
PLAN.md IS the prompt. It contains:
- Objective (what and why)
- Context (@file references)
- Tasks (type, files, action, verify, done, checkpoints)
- Verification (overall checks)
- Success criteria (measurable)
- Output (SUMMARY.md specification)
When planning a phase, you are writing the prompt that will execute it.
</principle>
<principle name="scope_control">
Plans must complete within ~50% of context usage to maintain consistent quality.
**The quality degradation curve:**
- 0-30% context: Peak quality (comprehensive, thorough, no anxiety)
- 30-50% context: Good quality (engaged, manageable pressure)
- 50-70% context: Degrading quality (efficiency mode, compression)
- 70%+ context: Poor quality (self-lobotomization, rushed work)
**Critical insight:** Claude doesn't degrade at 80% - it degrades at ~40-50% when it sees context mounting and enters "completion mode." By 80%, quality has already crashed.
**Solution:** Aggressive atomicity - split phases into many small, focused plans.
Examples:
- `01-01-PLAN.md` - Phase 1, Plan 1 (2-3 tasks: database schema only)
- `01-02-PLAN.md` - Phase 1, Plan 2 (2-3 tasks: database client setup)
- `01-03-PLAN.md` - Phase 1, Plan 3 (2-3 tasks: API routes)
- `01-04-PLAN.md` - Phase 1, Plan 4 (2-3 tasks: UI components)
Each plan is independently executable, verifiable, and scoped to **2-3 tasks maximum**.
**Atomic task principle:** Better to have 10 small, high-quality plans than 3 large, degraded plans. Each commit should be surgical, focused, and maintainable.
**Autonomous execution:** Plans without checkpoints execute via a subagent with fresh context, so quality cannot degrade.
See: references/scope-estimation.md
</principle>
<principle name="human_checkpoints">
**Claude automates everything that has a CLI or API.** Checkpoints are for verification and decisions, not manual work.
**Checkpoint types:**
- `checkpoint:human-verify` - Human confirms Claude's automated work (visual checks, UI verification)
- `checkpoint:decision` - Human makes implementation choice (auth provider, architecture)
**Rarely needed:** `checkpoint:human-action` - Only for actions with no CLI/API (email verification links, account approvals requiring web login with 2FA)
**Critical rule:** If Claude CAN do it via CLI/API/tool, Claude MUST do it. Never ask human to:
- Deploy to Vercel/Railway/Fly (use CLI)
- Create Stripe webhooks (use CLI/API)
- Run builds/tests (use Bash)
- Write .env files (use Write tool)
- Create database resources (use provider CLI)
**Protocol:** Claude automates work → reaches checkpoint:human-verify → presents what was done → waits for confirmation → resumes
See: references/checkpoints.md, references/cli-automation.md
</principle>
<principle name="deviation_rules">
Plans are guides, not straitjackets. Real development always involves discoveries.
**During execution, deviations are handled automatically via 5 embedded rules:**
1. **Auto-fix bugs** - Broken behavior → fix immediately, document in Summary
2. **Auto-add missing critical** - Security/correctness gaps → add immediately, document
3. **Auto-fix blockers** - Can't proceed → fix immediately, document
4. **Ask about architectural** - Major structural changes → stop and ask user
5. **Log enhancements** - Nice-to-haves → auto-log to ISSUES.md, continue
**No user intervention needed for Rules 1-3 and 5.** Only Rule 4 (architectural) requires a user decision.
**All deviations documented in Summary** with: what was found, what rule applied, what was done, commit hash.
**Result:** Flow never breaks. Bugs get fixed. Scope stays controlled. Complete transparency.
See: workflows/execute-phase.md (deviation_rules section)
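A hypothetical Summary deviation entry (fields from the list above; values illustrative):
```markdown
## Deviations
- **Found:** Login endpoint accepted empty passwords
- **Rule applied:** 2 (auto-add missing critical)
- **Done:** Added server-side validation before hashing
- **Commit:** abc1234
```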
</principle>
<principle name="ship_fast_iterate_fast">
No enterprise process. No approval gates. No multi-week timelines.
Plan → Execute → Ship → Learn → Repeat.
**Milestone-driven:** Ship v1.0 → mark milestone → plan v1.1 → ship → repeat.
Milestones mark shipped versions and enable continuous iteration.
</principle>
<principle name="milestone_boundaries">
Milestones mark shipped versions (v1.0, v1.1, v2.0).
**Purpose:**
- Historical record in MILESTONES.md (what shipped when)
- Greenfield → Brownfield transition marker
- Git tags for releases
- Clear completion rituals
**Default approach:** Extend existing roadmap with new phases.
- v1.0 ships (phases 1-4) → add phases 5-6 for v1.1
- Continuous phase numbering (01-99)
- Milestone groupings keep roadmap organized
**Archive ONLY for:** Separate codebases or complete rewrites (rare).
See: references/milestone-management.md
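An illustrative sketch of the tagging step (tag name assumed):
```bash
git tag -a v1.0 -m "Milestone: v1.0 shipped"
git push origin v1.0   # record the release remotely
```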
</principle>
<principle name="anti_enterprise_patterns">
NEVER include in plans:
- Team structures, roles, RACI matrices
- Stakeholder management, alignment meetings
- Sprint ceremonies, standups, retros
- Multi-week estimates, resource allocation
- Change management, governance processes
- Documentation for documentation's sake
If it sounds like corporate PM theater, delete it.
</principle>
<principle name="context_awareness">
Monitor token usage via system warnings.
**At 25% remaining**: Mention context getting full
**At 15% remaining**: Pause, offer handoff
**At 10% remaining**: Auto-create handoff, stop
Never start large operations below 15% without user confirmation.
</principle>
<principle name="user_gates">
Never charge ahead at critical decision points. Use gates:
- **AskUserQuestion**: Structured choices (2-4 options)
- **Inline questions**: Simple confirmations
- **Decision gate loop**: "Ready, or ask more questions?"
Mandatory gates:
- Before writing PLAN.md (confirm breakdown)
- After low-confidence research
- On verification failures
- After phase completion with issues
- Before starting next phase with previous issues
See: references/user-gates.md
</principle>
<principle name="git_versioning">
All planning artifacts are version controlled. Commit outcomes, not process.
- Check for repo on invocation, offer to initialize
- Commit only at: initialization, phase completion, handoff
- Intermediate artifacts (PLAN.md, RESEARCH.md, FINDINGS.md) NOT committed separately
- Git log becomes project history
See: references/git-integration.md
</principle>
</essential_principles>
<context_scan>
**Run on every invocation** to understand current state:
```bash
# Check git status
git rev-parse --git-dir 2>/dev/null || echo "NO_GIT_REPO"
# Check for planning structure
ls -la .planning/ 2>/dev/null
ls -la .planning/phases/ 2>/dev/null
# Find any continue-here files
find . -name ".continue-here.md" -type f 2>/dev/null
# Check for existing artifacts
[ -f .planning/BRIEF.md ] && echo "BRIEF: exists"
[ -f .planning/ROADMAP.md ] && echo "ROADMAP: exists"
```
**If NO_GIT_REPO detected:**
Inline question: "No git repo found. Initialize one? (Recommended for version control)"
If yes: `git init`
**Present findings before intake question.**
</context_scan>
<domain_expertise>
**Domain expertise lives in `~/.claude/skills/expertise/`**
Before creating roadmap or phase plans, determine if domain expertise should be loaded.
<scan_domains>
```bash
ls ~/.claude/skills/expertise/ 2>/dev/null
```
This reveals available domain expertise (e.g., macos-apps, iphone-apps, unity-games, nextjs-ecommerce).
**If no domain skills found:** Proceed without domain expertise (graceful degradation). The skill works fine without domain-specific context.
</scan_domains>
<inference_rules>
If user's request contains domain keywords, INFER the domain:
| Keywords | Domain Skill |
|----------|--------------|
| "macOS", "Mac app", "menu bar", "AppKit", "SwiftUI desktop" | expertise/macos-apps |
| "iPhone", "iOS", "iPad", "mobile app", "SwiftUI mobile" | expertise/iphone-apps |
| "Unity", "game", "C#", "3D game", "2D game" | expertise/unity-games |
| "MIDI", "MIDI tool", "sequencer", "MIDI controller", "music app", "MIDI 2.0", "MPE", "SysEx" | expertise/midi |
| "Agent SDK", "Claude SDK", "agentic app" | expertise/with-agent-sdk |
| "Python automation", "workflow", "API integration", "webhooks", "Celery", "Airflow", "Prefect" | expertise/python-workflow-automation |
| "UI", "design", "frontend", "interface", "responsive", "visual design", "landing page", "website design", "Tailwind", "CSS", "web design" | expertise/ui-design |
If domain inferred, confirm:
```
Detected: [domain] project → expertise/[skill-name]
Load this expertise for planning? (Y / see other options / none)
```
</inference_rules>
<no_inference>
If no domain obvious from request, present options:
```
What type of project is this?
Available domain expertise:
1. macos-apps - Native macOS with Swift/SwiftUI
2. iphone-apps - Native iOS with Swift/SwiftUI
3. unity-games - Unity game development
4. swift-midi-apps - MIDI/audio apps
5. with-agent-sdk - Claude Agent SDK apps
6. ui-design - Stunning UI/UX design & frontend development
[... any others found in expertise/]
N. None - proceed without domain expertise
C. Create domain skill first
Select:
```
</no_inference>
<load_domain>
When domain selected, use intelligent loading:
**Step 1: Read domain SKILL.md**
```bash
cat ~/.claude/skills/expertise/[domain]/SKILL.md 2>/dev/null
```
This loads core principles and routing guidance (~5k tokens).
**Step 2: Determine what references are needed**
Domain SKILL.md should contain a `<references_index>` section that maps planning contexts to specific references.
Example:
```markdown
<references_index>
**For database/persistence phases:** references/core-data.md, references/swift-concurrency.md
**For UI/layout phases:** references/swiftui-layout.md, references/appleHIG.md
**For system integration:** references/appkit-integration.md
**Always useful:** references/swift-conventions.md
</references_index>
```
**Step 3: Load only relevant references**
Based on the phase being planned (from ROADMAP), load ONLY the references mentioned for that type of work.
```bash
# Example: Planning a database phase
cat ~/.claude/skills/expertise/macos-apps/references/core-data.md
cat ~/.claude/skills/expertise/macos-apps/references/swift-conventions.md
```
**Context efficiency:**
- SKILL.md only: ~5k tokens
- SKILL.md + selective references: ~8-12k tokens
- All references (old approach): ~20-27k tokens
Announce: "Loaded [domain] expertise ([X] references for [phase-type])."
**If domain skill not found:** Inform user and offer to proceed without domain expertise.
**If SKILL.md doesn't have a references_index:** Fall back to loading all references, with a warning about context usage.
</load_domain>
<when_to_load>
Domain expertise should be loaded BEFORE:
- Creating roadmap (phases should be domain-appropriate)
- Planning phases (tasks must be domain-specific)
Domain expertise is NOT needed for:
- Creating brief (vision is domain-agnostic)
- Resuming from handoff (context already established)
- Transition between phases (just updating status)
</when_to_load>
</domain_expertise>
<intake>
Based on scan results, present context-aware options:
**If handoff found:**
```
Found handoff: .planning/phases/XX/.continue-here.md
[Summary of state from handoff]
1. Resume from handoff
2. Discard handoff, start fresh
3. Different action
```
**If planning structure exists:**
```
Project: [from BRIEF or directory]
Brief: [exists/missing]
Roadmap: [X phases defined]
Current: [phase status]
What would you like to do?
1. Plan next phase
2. Execute current phase
3. Create handoff (stopping for now)
4. View/update roadmap
5. Something else
```
**If no planning structure:**
```
No planning structure found.
What would you like to do?
1. Start new project (create brief)
2. Create roadmap from existing brief
3. Jump straight to phase planning
4. Get guidance on approach
```
**Wait for response before proceeding.**
</intake>
<routing>
| Response | Workflow |
|----------|----------|
| "brief", "new project", "start", 1 (no structure) | `workflows/create-brief.md` |
| "roadmap", "phases", 2 (no structure) | `workflows/create-roadmap.md` |
| "phase", "plan phase", "next phase", 1 (has structure) | `workflows/plan-phase.md` |
| "chunk", "next tasks", "what's next" | `workflows/plan-chunk.md` |
| "execute", "run", "do it", "build it", 2 (has structure) | **EXIT SKILL** → Use `/run-plan <path>` slash command |
| "research", "investigate", "unknowns" | `workflows/research-phase.md` |
| "handoff", "pack up", "stopping", 3 (has structure) | `workflows/handoff.md` |
| "resume", "continue", 1 (has handoff) | `workflows/resume.md` |
| "transition", "complete", "done", "next" | `workflows/transition.md` |
| "milestone", "ship", "v1.0", "release" | `workflows/complete-milestone.md` |
| "guidance", "help", 4 | `workflows/get-guidance.md` |
**Critical:** Plan execution should NOT invoke this skill. Use `/run-plan` for context efficiency (skill loads ~20k tokens, /run-plan loads ~5-7k).
**After reading the workflow, follow it exactly.**
</routing>
<hierarchy>
The planning hierarchy (each level builds on previous):
```
BRIEF.md → Human vision (you read this)
ROADMAP.md → Phase structure (overview)
RESEARCH.md → Research prompt (optional, for unknowns)
FINDINGS.md → Research output (if research done)
PLAN.md → THE PROMPT (Claude executes this)
SUMMARY.md → Outcome (existence = phase complete)
```
**Rules:**
- Roadmap requires Brief (or prompts to create one)
- Phase plan requires Roadmap (knows phase scope)
- PLAN.md IS the execution prompt
- SUMMARY.md existence marks phase complete
- Each level can look UP for context
</hierarchy>
<output_structure>
All planning artifacts go in `.planning/`:
```
.planning/
├── BRIEF.md # Human vision
├── ROADMAP.md # Phase structure + tracking
└── phases/
├── 01-foundation/
│ ├── 01-01-PLAN.md # Plan 1: Database setup
│ ├── 01-01-SUMMARY.md # Outcome (exists = done)
│ ├── 01-02-PLAN.md # Plan 2: API routes
│ ├── 01-02-SUMMARY.md
│ ├── 01-03-PLAN.md # Plan 3: UI components
│ └── .continue-here-01-03.md # Handoff (temporary, if needed)
└── 02-auth/
├── 02-01-RESEARCH.md # Research prompt (if needed)
├── 02-01-FINDINGS.md # Research output
├── 02-02-PLAN.md # Implementation prompt
└── 02-02-SUMMARY.md
```
**Naming convention:**
- Plans: `{phase}-{plan}-PLAN.md` (e.g., 01-03-PLAN.md)
- Summaries: `{phase}-{plan}-SUMMARY.md` (e.g., 01-03-SUMMARY.md)
- Phase folders: `{phase}-{name}/` (e.g., 01-foundation/)
Files sort in execution order. Related artifacts (plan + summary) are adjacent.
</output_structure>
<reference_index>
All in `references/`:
**Structure:** directory-structure.md, hierarchy-rules.md
**Formats:** handoff-format.md, plan-format.md
**Patterns:** context-scanning.md, context-management.md
**Planning:** scope-estimation.md, checkpoints.md, milestone-management.md
**Process:** user-gates.md, git-integration.md, research-pitfalls.md
**Domain:** domain-expertise.md (guide for creating context-efficient domain skills)
</reference_index>
<templates_index>
All in `templates/`:
| Template | Purpose |
|----------|---------|
| brief.md | Project vision document with current state |
| roadmap.md | Phase structure with milestone groupings |
| phase-prompt.md | Executable phase prompt (PLAN.md) |
| research-prompt.md | Research prompt (RESEARCH.md) |
| summary.md | Phase outcome (SUMMARY.md) with deviations |
| milestone.md | Milestone entry for MILESTONES.md |
| issues.md | Deferred enhancements log (ISSUES.md) |
| continue-here.md | Context handoff format |
</templates_index>
<workflows_index>
All in `workflows/`:
| Workflow | Purpose |
|----------|---------|
| create-brief.md | Create project vision document |
| create-roadmap.md | Define phases from brief |
| plan-phase.md | Create executable phase prompt |
| execute-phase.md | Run phase prompt, create summary |
| research-phase.md | Create and run research prompt |
| plan-chunk.md | Plan immediate next tasks |
| transition.md | Mark phase complete, advance |
| complete-milestone.md | Mark shipped version, create milestone entry |
| handoff.md | Create context handoff for pausing |
| resume.md | Load handoff, restore context |
| get-guidance.md | Help decide planning approach |
</workflows_index>
<success_criteria>
Planning skill succeeds when:
- Context scan runs before intake
- Appropriate workflow selected based on state
- PLAN.md IS the executable prompt (not separate)
- Hierarchy is maintained (brief → roadmap → phase)
- Handoffs preserve full context for resumption
- Context limits are respected (auto-handoff at 10%)
- Deviations handled automatically per embedded rules
- All work (planned and discovered) fully documented
- Domain expertise loaded intelligently (SKILL.md + selective references, not all files)
- Plan execution uses /run-plan command (not skill invocation)
</success_criteria>

# Human Checkpoints in Plans
Plans execute autonomously. Checkpoints formalize the interaction points where human verification or decisions are needed.
**Core principle:** Claude automates everything with CLI/API. Checkpoints are for verification and decisions, not manual work.
## Checkpoint Types
### 1. `checkpoint:human-verify` (Most Common)
**When:** Claude completed automated work, human confirms it works correctly.
**Use for:**
- Visual UI checks (layout, styling, responsiveness)
- Interactive flows (click through wizard, test user flows)
- Functional verification (feature works as expected)
- Audio/video playback quality
- Animation smoothness
- Accessibility testing
**Structure:**
```xml
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[What Claude automated and deployed/built]</what-built>
<how-to-verify>
[Exact steps to test - URLs, commands, expected behavior]
</how-to-verify>
<resume-signal>[How to continue - "approved", "yes", or describe issues]</resume-signal>
</task>
```
**Key elements:**
- `<what-built>`: What Claude automated (deployed, built, configured)
- `<how-to-verify>`: Exact steps to confirm it works (numbered, specific)
- `<resume-signal>`: Clear indication of how to continue
**Example: Vercel Deployment**
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to create project and deploy. Capture deployment URL from output.</action>
<verify>vercel ls shows deployment, curl {url} returns 200</verify>
<done>App deployed, URL captured</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Deployed to Vercel at https://myapp-abc123.vercel.app</what-built>
<how-to-verify>
Visit https://myapp-abc123.vercel.app and confirm:
- Homepage loads without errors
- Login form is visible
- No console errors in browser DevTools
</how-to-verify>
<resume-signal>Type "approved" to continue, or describe issues to fix</resume-signal>
</task>
```
**Example: UI Component**
```xml
<task type="auto">
<name>Build responsive dashboard layout</name>
<files>src/components/Dashboard.tsx, src/app/dashboard/page.tsx</files>
<action>Create dashboard with sidebar, header, and content area. Use Tailwind responsive classes for mobile.</action>
<verify>npm run build succeeds, no TypeScript errors</verify>
<done>Dashboard component builds without errors</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Responsive dashboard layout at /dashboard</what-built>
<how-to-verify>
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Desktop (>1024px): Verify sidebar left, content right, header top
4. Tablet (768px): Verify sidebar collapses to hamburger
5. Mobile (375px): Verify single column, bottom nav
6. Check: No layout shift, no horizontal scroll
</how-to-verify>
<resume-signal>Type "approved" or describe layout issues</resume-signal>
</task>
```
**Example: Xcode Build**
```xml
<task type="auto">
<name>Build macOS app with Xcode</name>
<files>App.xcodeproj, Sources/</files>
<action>Run `xcodebuild -project App.xcodeproj -scheme App build`. Check for compilation errors in output.</action>
<verify>Build output contains "BUILD SUCCEEDED", no errors</verify>
<done>App builds successfully</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Built macOS app at DerivedData/Build/Products/Debug/App.app</what-built>
<how-to-verify>
Open App.app and test:
- App launches without crashes
- Menu bar icon appears
- Preferences window opens correctly
- No visual glitches or layout issues
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
### 2. `checkpoint:decision`
**When:** Human must make a choice that affects implementation direction.
**Use for:**
- Technology selection (which auth provider, which database)
- Architecture decisions (monorepo vs separate repos)
- Design choices (color scheme, layout approach)
- Feature prioritization (which variant to build)
- Data model decisions (schema structure)
**Structure:**
```xml
<task type="checkpoint:decision" gate="blocking">
<decision>[What's being decided]</decision>
<context>[Why this decision matters]</context>
<options>
<option id="option-a">
<name>[Option name]</name>
<pros>[Benefits]</pros>
<cons>[Tradeoffs]</cons>
</option>
<option id="option-b">
<name>[Option name]</name>
<pros>[Benefits]</pros>
<cons>[Tradeoffs]</cons>
</option>
</options>
<resume-signal>[How to indicate choice]</resume-signal>
</task>
```
**Key elements:**
- `<decision>`: What's being decided
- `<context>`: Why this matters
- `<options>`: Each option with balanced pros/cons (not prescriptive)
- `<resume-signal>`: How to indicate choice
**Example: Auth Provider Selection**
```xml
<task type="checkpoint:decision" gate="blocking">
<decision>Select authentication provider</decision>
<context>
Need user authentication for the app. Three solid options with different tradeoffs.
</context>
<options>
<option id="supabase">
<name>Supabase Auth</name>
<pros>Built-in with Supabase DB we're using, generous free tier, row-level security integration</pros>
<cons>Less customizable UI, tied to Supabase ecosystem</cons>
</option>
<option id="clerk">
<name>Clerk</name>
<pros>Beautiful pre-built UI, best developer experience, excellent docs</pros>
<cons>Paid after 10k MAU, vendor lock-in</cons>
</option>
<option id="nextauth">
<name>NextAuth.js</name>
<pros>Free, self-hosted, maximum control, widely adopted</pros>
<cons>More setup work, you manage security updates, UI is DIY</cons>
</option>
</options>
<resume-signal>Select: supabase, clerk, or nextauth</resume-signal>
</task>
```
### 3. `checkpoint:human-action` (Rare)
**When:** Action has NO CLI/API and requires human-only interaction, OR Claude hit an authentication gate during automation.
**Use ONLY for:**
- **Authentication gates** - Claude tried to use CLI/API but needs credentials to continue (this is NOT a failure)
- Email verification links (account creation requires clicking email)
- SMS 2FA codes (phone verification)
- Manual account approvals (platform requires human review before API access)
- Credit card 3D Secure flows (web-based payment authorization)
- OAuth app approvals (some platforms require web-based approval)
**Do NOT use for pre-planned manual work:**
- Manually deploying to Vercel (use `vercel` CLI - auth gate if needed)
- Manually creating Stripe webhooks (use Stripe API - auth gate if needed)
- Manually creating databases (use provider CLI - auth gate if needed)
- Running builds/tests manually (use Bash tool)
- Creating files manually (use Write tool)
**Structure:**
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>[What human must do - Claude already did everything automatable]</action>
<instructions>
[What Claude already automated]
[The ONE thing requiring human action]
</instructions>
<verification>[What Claude can check afterward]</verification>
<resume-signal>[How to continue]</resume-signal>
</task>
```
**Key principle:** Claude automates EVERYTHING possible first, only asks human for the truly unavoidable manual step.
**Example: Email Verification**
```xml
<task type="auto">
<name>Create SendGrid account via API</name>
<action>Use SendGrid API to create subuser account with provided email. Request verification email.</action>
<verify>API returns 201, account created</verify>
<done>Account created, verification email sent</done>
</task>
<task type="checkpoint:human-action" gate="blocking">
<action>Complete email verification for SendGrid account</action>
<instructions>
I created the account and requested verification email.
Check your inbox for SendGrid verification link and click it.
</instructions>
<verification>SendGrid API key works: curl test succeeds</verification>
<resume-signal>Type "done" when email verified</resume-signal>
</task>
```
**Example: Credit Card 3D Secure**
```xml
<task type="auto">
<name>Create Stripe payment intent</name>
<action>Use Stripe API to create payment intent for $99. Generate checkout URL.</action>
<verify>Stripe API returns payment intent ID and URL</verify>
<done>Payment intent created</done>
</task>
<task type="checkpoint:human-action" gate="blocking">
<action>Complete 3D Secure authentication</action>
<instructions>
I created the payment intent: https://checkout.stripe.com/pay/cs_test_abc123
Visit that URL and complete the 3D Secure verification flow with your test card.
</instructions>
<verification>Stripe webhook receives payment_intent.succeeded event</verification>
<resume-signal>Type "done" when payment completes</resume-signal>
</task>
```
**Example: Authentication Gate (Dynamic Checkpoint)**
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to deploy</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
<!-- If vercel returns "Error: Not authenticated", Claude creates checkpoint on the fly -->
<task type="checkpoint:human-action" gate="blocking">
<action>Authenticate Vercel CLI so I can continue deployment</action>
<instructions>
I tried to deploy but got authentication error.
Run: vercel login
This will open your browser - complete the authentication flow.
</instructions>
<verification>vercel whoami returns your account email</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<!-- After authentication, Claude retries the deployment -->
<task type="auto">
<name>Retry Vercel deployment</name>
<action>Run `vercel --yes` (now authenticated)</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
```
**Key distinction:** Authentication gates are created dynamically when Claude encounters auth errors during automation. They're NOT pre-planned - Claude tries to automate first, only asks for credentials when blocked.
See references/cli-automation.md "Authentication Gates" section for more examples and full protocol.
## Execution Protocol
When Claude encounters `type="checkpoint:*"`:
1. **Stop immediately** - do not proceed to next task
2. **Display checkpoint clearly:**
```
════════════════════════════════════════
CHECKPOINT: [Type]
════════════════════════════════════════
Task [X] of [Y]: [Name]
[Display checkpoint-specific content]
[Resume signal instruction]
════════════════════════════════════════
```
3. **Wait for user response** - do not hallucinate completion
4. **Verify if possible** - check files, run tests, whatever is specified
5. **Resume execution** - continue to next task only after confirmation
**For checkpoint:human-verify:**
```
════════════════════════════════════════
CHECKPOINT: Verification Required
════════════════════════════════════════
Task 5 of 8: Responsive dashboard layout
I built: Responsive dashboard at /dashboard
How to verify:
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Test: Resize browser window to mobile/tablet/desktop
4. Confirm: No layout shift, proper responsive behavior
Type "approved" to continue, or describe issues.
════════════════════════════════════════
```
**For checkpoint:decision:**
```
════════════════════════════════════════
CHECKPOINT: Decision Required
════════════════════════════════════════
Task 2 of 6: Select authentication provider
Decision: Which auth provider should we use?
Context: Need user authentication. Three options with different tradeoffs.
Options:
1. supabase - Built-in with our DB, free tier
2. clerk - Best DX, paid after 10k users
3. nextauth - Self-hosted, maximum control
Select: supabase, clerk, or nextauth
════════════════════════════════════════
```
## Writing Good Checkpoints
**DO:**
- Automate everything with CLI/API before checkpoint
- Be specific: "Visit https://myapp.vercel.app" not "check deployment"
- Number verification steps: easier to follow
- State expected outcomes: "You should see X"
- Provide context: why this checkpoint exists
- Make verification executable: clear, testable steps
**DON'T:**
- Ask human to do work Claude can automate (deploy, create resources, run builds)
- Assume knowledge: "Configure the usual settings" ❌
- Skip steps: "Set up database" ❌ (too vague)
- Mix multiple verifications in one checkpoint (split them)
- Make verification impossible (Claude can't check visual appearance without user confirmation)
## When to Use Checkpoints
**Use checkpoint:human-verify for:**
- Visual verification (UI, layouts, animations)
- Interactive testing (click flows, user journeys)
- Quality checks (audio/video playback, animation smoothness)
- Confirming deployed apps are accessible
**Use checkpoint:decision for:**
- Technology selection (auth providers, databases, frameworks)
- Architecture choices (monorepo, deployment strategy)
- Design decisions (color schemes, layout approaches)
- Feature prioritization
**Use checkpoint:human-action for:**
- Email verification links (no API)
- SMS 2FA codes (no API)
- Manual approvals with no automation
- 3D Secure payment flows
**Don't use checkpoints for:**
- Things Claude can verify programmatically (tests pass, build succeeds)
- File operations (Claude can read files to verify)
- Code correctness (use tests and static analysis)
- Anything automatable via CLI/API
## Checkpoint Placement
Place checkpoints:
- **After automation completes** - not before Claude does the work
- **After UI buildout** - before declaring phase complete
- **Before dependent work** - decisions before implementation
- **At integration points** - after configuring external services
Bad placement:
- Before Claude automates (asking human to do automatable work) ❌
- Too frequent (every other task is a checkpoint) ❌
- Too late (checkpoint is last task, but earlier tasks needed its result) ❌
## Complete Examples
### Example 1: Deployment Flow (Correct)
```xml
<!-- Claude automates everything -->
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json, package.json</files>
<action>
1. Run `vercel --yes` to create project and deploy
2. Capture deployment URL from output
3. Set environment variables with `vercel env add`
4. Trigger production deployment with `vercel --prod`
</action>
<verify>
- vercel ls shows deployment
- curl {url} returns 200
- Environment variables set correctly
</verify>
<done>App deployed to production, URL captured</done>
</task>
<!-- Human verifies visual/functional correctness -->
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Deployed to https://myapp.vercel.app</what-built>
<how-to-verify>
Visit https://myapp.vercel.app and confirm:
- Homepage loads correctly
- All images/assets load
- Navigation works
- No console errors
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
### Example 2: Database Setup (Correct)
```xml
<!-- Claude automates everything -->
<task type="auto">
<name>Create Upstash Redis database</name>
<files>.env</files>
<action>
1. Run `upstash redis create myapp-cache --region us-east-1`
2. Capture connection URL from output
3. Write to .env: UPSTASH_REDIS_URL={url}
4. Verify connection with test command
</action>
<verify>
- upstash redis list shows database
- .env contains UPSTASH_REDIS_URL
- Test connection succeeds
</verify>
<done>Redis database created and configured</done>
</task>
<!-- NO CHECKPOINT NEEDED - Claude automated everything and verified programmatically -->
```
### Example 3: Stripe Webhooks (Correct)
```xml
<!-- Claude automates everything -->
<task type="auto">
<name>Configure Stripe webhooks</name>
<files>.env, src/app/api/webhooks/route.ts</files>
<action>
1. Use Stripe API to create webhook endpoint pointing to /api/webhooks
2. Subscribe to events: payment_intent.succeeded, customer.subscription.updated
3. Save webhook signing secret to .env
4. Implement webhook handler in route.ts
</action>
<verify>
- Stripe API returns webhook endpoint ID
- .env contains STRIPE_WEBHOOK_SECRET
- curl webhook endpoint returns 200
</verify>
<done>Stripe webhooks configured and handler implemented</done>
</task>
<!-- Human verifies in Stripe dashboard -->
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Stripe webhook configured via API</what-built>
<how-to-verify>
Visit Stripe Dashboard > Developers > Webhooks
Confirm: Endpoint shows https://myapp.com/api/webhooks with correct events
</how-to-verify>
<resume-signal>Type "yes" if correct</resume-signal>
</task>
```
## Anti-Patterns
### ❌ BAD: Asking human to automate
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>Deploy to Vercel</action>
<instructions>
1. Visit vercel.com/new
2. Import Git repository
3. Click Deploy
4. Copy deployment URL
</instructions>
<verification>Deployment exists</verification>
<resume-signal>Paste URL</resume-signal>
</task>
```
**Why bad:** Vercel has a CLI. Claude should run `vercel --yes`.
### ✅ GOOD: Claude automates, human verifies
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<action>Run `vercel --yes`. Capture URL.</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Deployed to {url}</what-built>
<how-to-verify>Visit {url}, check homepage loads</how-to-verify>
<resume-signal>Type "approved"</resume-signal>
</task>
```
### ❌ BAD: Too many checkpoints
```xml
<task type="auto">Create schema</task>
<task type="checkpoint:human-verify">Check schema</task>
<task type="auto">Create API route</task>
<task type="checkpoint:human-verify">Check API</task>
<task type="auto">Create UI form</task>
<task type="checkpoint:human-verify">Check form</task>
```
**Why bad:** Verification fatigue. Combine into one checkpoint at end.
### ✅ GOOD: Single verification checkpoint
```xml
<task type="auto">Create schema</task>
<task type="auto">Create API route</task>
<task type="auto">Create UI form</task>
<task type="checkpoint:human-verify">
<what-built>Complete auth flow (schema + API + UI)</what-built>
<how-to-verify>Test full flow: register, login, access protected page</how-to-verify>
<resume-signal>Type "approved"</resume-signal>
</task>
```
### ❌ BAD: Asking for automatable file operations
```xml
<task type="checkpoint:human-action">
<action>Create .env file</action>
<instructions>
1. Create .env in project root
2. Add: DATABASE_URL=...
3. Add: STRIPE_KEY=...
</instructions>
</task>
```
**Why bad:** Claude has Write tool. This should be `type="auto"`.
## Summary
Checkpoints formalize human-in-the-loop points. Use them when Claude cannot complete a task autonomously OR when human verification is required for correctness.
**The golden rule:** If Claude CAN automate it, Claude MUST automate it.
**Checkpoint priority:**
1. **checkpoint:human-verify** (90% of checkpoints) - Claude automated everything, human confirms visual/functional correctness
2. **checkpoint:decision** (9% of checkpoints) - Human makes architectural/technology choices
3. **checkpoint:human-action** (1% of checkpoints) - Truly unavoidable manual steps with no API/CLI
**See also:** references/cli-automation.md for exhaustive list of what Claude can automate.

# CLI and API Automation Reference
**Core principle:** If it has a CLI or API, Claude does it. Never ask the human to perform manual steps that Claude can automate.
This reference documents what Claude CAN and SHOULD automate during plan execution.
## Deployment Platforms
### Vercel
**CLI:** `vercel`
**What Claude automates:**
- Create and deploy projects: `vercel --yes`
- Set environment variables: `vercel env add KEY production`
- Link to git repo: `vercel link`
- Trigger deployments: `vercel --prod`
- Get deployment URLs: `vercel ls`
- Manage domains: `vercel domains add example.com`
**Never ask human to:**
- Visit vercel.com/new to create project
- Click through dashboard to add env vars
- Manually link repository
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<action>Run `vercel --yes` to deploy. Capture deployment URL.</action>
<verify>vercel ls shows deployment, curl {url} returns 200</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Deployed to {url}</what-built>
<how-to-verify>Visit {url} - check homepage loads</how-to-verify>
<resume-signal>Type "yes" if correct</resume-signal>
</task>
```
### Railway
**CLI:** `railway`
**What Claude automates:**
- Initialize project: `railway init`
- Link to repo: `railway link`
- Deploy: `railway up`
- Set variables: `railway variables set KEY=value`
- Get deployment URL: `railway domain`
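**Checkpoint pattern** (illustrative, mirroring the Vercel example above):
```xml
<task type="auto">
  <name>Deploy to Railway</name>
  <action>Run `railway up` to deploy. Capture URL with `railway domain`.</action>
  <verify>railway status shows deployment, curl {url} returns 200</verify>
</task>
<task type="checkpoint:human-verify">
  <what-built>Deployed to {url}</what-built>
  <how-to-verify>Visit {url} - check homepage loads</how-to-verify>
  <resume-signal>Type "yes" if correct</resume-signal>
</task>
```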
### Fly.io
**CLI:** `fly`
**What Claude automates:**
- Launch app: `fly launch --no-deploy`
- Deploy: `fly deploy`
- Set secrets: `fly secrets set KEY=value`
- Scale: `fly scale count 2`
## Payment & Billing
### Stripe
**CLI:** `stripe`
**What Claude automates:**
- Create webhook endpoints: `stripe listen --forward-to localhost:3000/api/webhooks`
- Trigger test events: `stripe trigger payment_intent.succeeded`
- Create products/prices: Stripe API via curl/fetch
- Manage customers: Stripe API via curl/fetch
- Check webhook logs: `stripe webhooks list`
**Never ask human to:**
- Visit dashboard.stripe.com to create webhook
- Click through UI to create products
- Manually copy webhook signing secret
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Configure Stripe webhooks</name>
<action>Use Stripe API to create webhook endpoint at /api/webhooks. Save signing secret to .env.</action>
<verify>stripe webhooks list shows endpoint, .env contains STRIPE_WEBHOOK_SECRET</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Stripe webhook configured</what-built>
<how-to-verify>Check Stripe dashboard > Developers > Webhooks shows endpoint with correct URL</how-to-verify>
<resume-signal>Type "yes" if correct</resume-signal>
</task>
```
## Databases & Backend
### Supabase
**CLI:** `supabase`
**What Claude automates:**
- Initialize project: `supabase init`
- Link to remote: `supabase link --project-ref {ref}`
- Create migrations: `supabase migration new {name}`
- Push migrations: `supabase db push`
- Generate types: `supabase gen types typescript`
- Deploy functions: `supabase functions deploy {name}`
**Never ask human to:**
- Visit supabase.com to create project manually
- Click through dashboard to run migrations
- Copy/paste connection strings
**Note:** Project creation may require web dashboard initially (no CLI for initial project creation), but all subsequent work (migrations, functions, etc.) is CLI-automated.
### Upstash (Redis/Kafka)
**CLI:** `upstash`
**What Claude automates:**
- Create Redis database: `upstash redis create {name} --region {region}`
- Get connection details: `upstash redis get {id}`
- Create Kafka cluster: `upstash kafka create {name} --region {region}`
**Never ask human to:**
- Visit console.upstash.com
- Click through UI to create database
- Copy/paste connection URLs manually
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Create Upstash Redis database</name>
<action>Run `upstash redis create myapp-cache --region us-east-1`. Save URL to .env.</action>
<verify>.env contains UPSTASH_REDIS_URL, upstash redis list shows database</verify>
</task>
```
### PlanetScale
**CLI:** `pscale`
**What Claude automates:**
- Create database: `pscale database create {name} --region {region}`
- Create branch: `pscale branch create {db} {branch}`
- Deploy request: `pscale deploy-request create {db} {branch}`
- Connection string: `pscale connect {db} {branch}`
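**Task pattern** (a sketch; database and branch names illustrative):
```xml
<task type="auto">
  <name>Create PlanetScale database</name>
  <action>Run `pscale database create myapp --region us-east`, then `pscale branch create myapp dev`.</action>
  <verify>pscale database list shows myapp, pscale branch list myapp shows dev</verify>
</task>
```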
## Version Control & CI/CD
### GitHub
**CLI:** `gh`
**What Claude automates:**
- Create repo: `gh repo create {name} --public/--private`
- Create issues: `gh issue create --title "{title}" --body "{body}"`
- Create PR: `gh pr create --title "{title}" --body "{body}"`
- Manage secrets: `gh secret set {KEY}`
- Trigger workflows: `gh workflow run {name}`
- Check status: `gh run list`
**Never ask human to:**
- Visit github.com to create repo
- Click through UI to add secrets
- Manually create issues/PRs
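**Task pattern** (a sketch; repo name illustrative):
```xml
<task type="auto">
  <name>Create GitHub repository and push</name>
  <action>Run `gh repo create myapp --private --source . --push`</action>
  <verify>gh repo view myapp shows repository, git remote -v lists origin</verify>
</task>
```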
## Build Tools & Testing
### Node/npm/pnpm/bun
**What Claude automates:**
- Install dependencies: `npm install`, `pnpm install`, `bun install`
- Run builds: `npm run build`
- Run tests: `npm test`, `npm run test:e2e`
- Type checking: `tsc --noEmit`
**Never ask human to:** Run these commands manually
### Xcode (macOS/iOS)
**CLI:** `xcodebuild`
**What Claude automates:**
- Build project: `xcodebuild -project App.xcodeproj -scheme App build`
- Run tests: `xcodebuild test -project App.xcodeproj -scheme App`
- Archive: `xcodebuild archive -project App.xcodeproj -scheme App`
- Check compilation: Parse xcodebuild output for errors
**Never ask human to:**
- Open Xcode and click Product > Build
- Click Product > Test manually
- Check for errors by looking at Xcode UI
**Checkpoint pattern:**
```xml
<task type="auto">
<name>Build macOS app</name>
<action>Run `xcodebuild -project App.xcodeproj -scheme App build`. Check output for errors.</action>
<verify>Build succeeds with "BUILD SUCCEEDED" in output</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Built macOS app at DerivedData/Build/Products/Debug/App.app</what-built>
<how-to-verify>Open App.app and check: login flow works, no visual glitches</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
## Environment Configuration
### .env Files
**Tool:** Write tool
**What Claude automates:**
- Create .env files: Use Write tool
- Append variables: Use Edit tool
- Read current values: Use Read tool
**Never ask human to:**
- Manually create .env file
- Copy/paste values into .env
- Edit .env in text editor
**Pattern:**
```xml
<task type="auto">
<name>Configure environment variables</name>
<action>Write .env file with: DATABASE_URL, STRIPE_KEY, JWT_SECRET (generated).</action>
<verify>Read .env confirms all variables present</verify>
</task>
```
## Email & Communication
### Resend
**API:** Resend API via HTTP
**What Claude automates:**
- Create API keys via the dashboard API (if available), or provide one-time setup instructions
- Send emails: Resend API
- Configure domains: Resend API
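A sketch of a send call via Resend's HTTP API (addresses illustrative):
```bash
curl -X POST https://api.resend.com/emails \
  -H "Authorization: Bearer $RESEND_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"from":"noreply@example.com","to":["user@example.com"],"subject":"Welcome","html":"<p>Hello</p>"}'
```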
### SendGrid
**API:** SendGrid API via HTTP
**What Claude automates:**
- Create API keys via API
- Send emails: SendGrid API
- Configure webhooks: SendGrid API
**Note:** Initial account setup may require email verification (checkpoint:human-action), but all subsequent work is API-automated.
## Authentication Gates
**Critical distinction:** When Claude tries to use a CLI/API and gets an authentication error, this is NOT a failure - it's a gate that requires human input to unblock automation.
**Pattern: Claude encounters auth error → creates checkpoint → you authenticate → Claude continues**
### Example: Vercel CLI Not Authenticated
```xml
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to deploy</action>
<verify>vercel ls shows deployment</verify>
</task>
<!-- If vercel returns "Error: Not authenticated" -->
<task type="checkpoint:human-action" gate="blocking">
<action>Authenticate Vercel CLI so I can continue deployment</action>
<instructions>
I tried to deploy but got authentication error.
Run: vercel login
This will open your browser - complete the authentication flow.
</instructions>
<verification>vercel whoami returns your account email</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<!-- After authentication, Claude retries automatically -->
<task type="auto">
<name>Retry Vercel deployment</name>
<action>Run `vercel --yes` (now authenticated)</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
```
### Example: Stripe CLI Needs API Key
```xml
<task type="auto">
<name>Create Stripe webhook endpoint</name>
<action>Use Stripe API to create webhook at /api/webhooks</action>
</task>
<!-- If API returns 401 Unauthorized -->
<task type="checkpoint:human-action" gate="blocking">
<action>Provide Stripe API key so I can continue webhook configuration</action>
<instructions>
I need your Stripe API key to create webhooks.
1. Visit dashboard.stripe.com/apikeys
2. Copy your "Secret key" (starts with sk_test_ or sk_live_)
3. Paste it here or run: export STRIPE_SECRET_KEY=sk_...
</instructions>
<verification>Stripe API key works: curl test succeeds</verification>
<resume-signal>Type "done" or paste the key</resume-signal>
</task>
<!-- After key provided, Claude writes to .env and continues -->
<task type="auto">
<name>Save Stripe key and create webhook</name>
<action>
1. Write STRIPE_SECRET_KEY to .env
2. Create webhook endpoint via Stripe API
3. Save webhook secret to .env
</action>
<verify>.env contains both keys, webhook endpoint exists</verify>
</task>
```
### Example: GitHub CLI Not Logged In
```xml
<task type="auto">
<name>Create GitHub repository</name>
<action>Run `gh repo create myapp --public`</action>
</task>
<!-- If gh returns "Not logged in" -->
<task type="checkpoint:human-action" gate="blocking">
<action>Authenticate GitHub CLI so I can create repository</action>
<instructions>
I need GitHub authentication to create the repo.
Run: gh auth login
Follow the prompts to authenticate (browser or token).
</instructions>
<verification>gh auth status shows "Logged in"</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<task type="auto">
<name>Create repository (authenticated)</name>
<action>Run `gh repo create myapp --public`</action>
<verify>gh repo view shows repository exists</verify>
</task>
```
### Example: Upstash CLI Needs API Key
```xml
<task type="auto">
<name>Create Upstash Redis database</name>
<action>Run `upstash redis create myapp-cache --region us-east-1`</action>
</task>
<!-- If upstash returns auth error -->
<task type="checkpoint:human-action" gate="blocking">
<action>Configure Upstash CLI credentials so I can create database</action>
<instructions>
I need Upstash authentication to create Redis database.
1. Visit console.upstash.com/account/api
2. Copy your API key
3. Run: upstash auth login
4. Paste your API key when prompted
</instructions>
<verification>upstash auth status shows authenticated</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<task type="auto">
<name>Create Redis database (authenticated)</name>
<action>
1. Run `upstash redis create myapp-cache --region us-east-1`
2. Capture connection URL
3. Write to .env: UPSTASH_REDIS_URL={url}
</action>
<verify>upstash redis list shows database, .env contains URL</verify>
</task>
```
### Authentication Gate Protocol
**When Claude encounters authentication error during execution:**
1. **Recognize it's not a failure** - Missing auth is expected, not a bug
2. **Stop current task** - Don't retry repeatedly
3. **Create checkpoint:human-action on the fly** - Dynamic checkpoint, not pre-planned
4. **Provide exact authentication steps** - CLI commands, where to get keys
5. **Verify authentication** - Test that auth works before continuing
6. **Retry the original task** - Resume automation where it left off
7. **Continue normally** - One auth gate doesn't break the flow
**Key difference from pre-planned checkpoints:**
- Pre-planned: "I need you to do X" (wrong - Claude should automate)
- Auth gate: "I tried to automate X but need credentials to continue" (correct - unblocks automation)
**This preserves agentic flow:**
- Claude tries automation first
- Only asks for help when blocked by credentials
- Continues automating once unblocked
- You never manually deploy/create resources - just provide keys
## When checkpoint:human-action is REQUIRED
**Truly rare cases where no CLI/API exists:**
1. **Email verification links** - Account signup requires clicking verification email
2. **SMS verification codes** - 2FA requiring phone
3. **Manual account approvals** - Platform requires human review before API access
4. **Domain DNS records at registrar** - Some registrars have no API
5. **Credit card input** - Payment methods requiring 3D Secure web flow
6. **OAuth app approval** - Some platforms require web-based app approval flow
**For these rare cases:**
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>Complete email verification for SendGrid account</action>
<instructions>
I created the account and requested verification email.
Check your inbox for verification link and click it.
</instructions>
<verification>SendGrid API key works: curl test succeeds</verification>
<resume-signal>Type "done" when verified</resume-signal>
</task>
```
**Key difference:** Claude does EVERYTHING possible first (account creation, API requests), only asks human for the one thing with no automation path.
## Quick Reference: "Can Claude automate this?"
| Action | CLI/API? | Claude does it? |
|--------|----------|-----------------|
| Deploy to Vercel | ✅ `vercel` | YES |
| Create Stripe webhook | ✅ Stripe API | YES |
| Run xcodebuild | ✅ `xcodebuild` | YES |
| Write .env file | ✅ Write tool | YES |
| Create Upstash DB | ✅ `upstash` CLI | YES |
| Install npm packages | ✅ `npm` | YES |
| Create GitHub repo | ✅ `gh` | YES |
| Run tests | ✅ `npm test` | YES |
| Create Supabase project | ⚠️ Web dashboard | NO (then CLI for everything else) |
| Click email verification link | ❌ No API | NO |
| Enter credit card with 3DS | ❌ No API | NO |
**Default answer: YES.** Unless explicitly in the "NO" category, Claude automates it.
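A minimal pre-flight sketch of this rule, reusing the auth checks already shown above (treat the exact flags as assumptions for your CLI versions):

```bash
# Pre-flight auth checks before automated tasks (sketch - verify flags against your CLI versions)
vercel whoami >/dev/null 2>&1 || echo "AUTH GATE: run 'vercel login'"
gh auth status >/dev/null 2>&1 || echo "AUTH GATE: run 'gh auth login'"
# Anything that passes gets automated; anything that fails becomes a checkpoint:human-action
```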
## Decision Tree
```
┌─────────────────────────────────────┐
│ Task requires external resource?    │
└──────────────┬──────────────────────┘
               ▼
┌─────────────────────────────────────┐
│ Does it have CLI/API/tool access?   │
└──────────────┬──────────────────────┘
         ┌─────┴─────┐
         │           │
         ▼           ▼
        YES          NO
         │           │
         │           ▼
         │   ┌──────────────────────────────┐
         │   │ checkpoint:human-action      │
         │   │ (email links, 2FA, etc.)     │
         │   └──────────────────────────────┘
         ▼
┌────────────────────────────────────────┐
│ task type="auto"                       │
│ Claude automates via CLI/API           │
└────────────┬───────────────────────────┘
             ▼
┌────────────────────────────────────────┐
│ checkpoint:human-verify                │
│ Human confirms visual/functional       │
└────────────────────────────────────────┘
```
## Summary
**The rule:** If Claude CAN do it, Claude MUST do it.
Checkpoints are for:
- **Verification** - Confirming Claude's automated work looks/behaves correctly
- **Decisions** - Choosing between valid approaches
- **True blockers** - Rare actions with literally no API/CLI (email links, 2FA)
Checkpoints are NOT for:
- Deploying (use CLI)
- Creating resources (use CLI/API)
- Running builds (use Bash)
- Writing files (use Write tool)
- Anything with automation available
**This keeps the agentic coding workflow intact - Claude does the work, you verify results.**

View File

@@ -0,0 +1,138 @@
<overview>
Claude has a finite context window. This reference defines how to monitor usage and handle approaching limits gracefully.
</overview>
<context_awareness>
Claude receives system warnings showing token usage:
```
Token usage: 150000/200000; 50000 remaining
```
This information appears in `<system_warning>` tags during the conversation.
</context_awareness>
<thresholds>
<threshold level="comfortable" remaining="50%+">
**Status**: Plenty of room
**Action**: Work normally
</threshold>
<threshold level="getting_full" remaining="25%">
**Status**: Context accumulating
**Action**: Mention to user: "Context getting full. Consider wrapping up or creating handoff soon."
**No immediate action required.**
</threshold>
<threshold level="low" remaining="15%">
**Status**: Running low
**Action**:
1. Pause at next safe point (complete current atomic operation)
2. Ask user: "Running low on context (~30k tokens remaining). Options:
- Create handoff now and resume in fresh session
- Push through (risky if complex work remains)"
3. Await user decision
**Do not start new large operations.**
</threshold>
<threshold level="critical" remaining="10%">
**Status**: Must stop
**Action**:
1. Complete current atomic task (don't leave broken state)
2. **Automatically create handoff** without asking
3. Tell user: "Context limit reached. Created handoff at [location]. Start fresh session to continue."
4. **Stop working** - do not start any new tasks
This is non-negotiable. Running out of context mid-task is worse than stopping early.
</threshold>
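As a rough sketch, the threshold logic maps directly onto the warning format shown earlier (the 200000 budget comes from that example; actual budgets vary):

```bash
# Threshold check sketch - numbers taken from the example warning, not a fixed API
remaining=50000; total=200000
pct=$(( remaining * 100 / total ))
if   [ "$pct" -le 10 ]; then echo "critical: finish atomic task, auto-create handoff, stop"
elif [ "$pct" -le 15 ]; then echo "low: pause at next safe point, ask user"
elif [ "$pct" -le 25 ]; then echo "getting full: mention handoff option"
else echo "comfortable: work normally"
fi
```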
</thresholds>
<what_counts_as_atomic>
An atomic operation is one that shouldn't be interrupted:
**Atomic (finish before stopping)**:
- Writing a single file
- Running a validation command
- Completing a single task from the plan
**Not atomic (can pause between)**:
- Multiple tasks in sequence
- Multi-file changes (can pause between files)
- Research + implementation (can pause between)
When hitting 10% threshold, finish current atomic operation, then stop.
</what_counts_as_atomic>
<handoff_content_at_limit>
When auto-creating handoff at 10%, include:
```yaml
---
phase: [current phase]
task: [current task number]
total_tasks: [total]
status: context_limit_reached
last_updated: [timestamp]
---
```
Body must capture:
1. What was just completed
2. What task was in progress (and how far)
3. What remains
4. Any decisions/context from this session
Be thorough - the next session starts fresh.
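A filled-in sketch (phase, task names, and timestamp are illustrative):

```markdown
---
phase: 05-error-handling
task: 2
total_tasks: 3
status: context_limit_reached
last_updated: 2025-12-01T14:30:00Z
---
Completed: Task 1 - retry logic added to NetworkManager, tests pass.
In progress: Task 2 - NetworkError enum defined; user-facing messages not yet written.
Remaining: Task 3 - crash reporting integration.
Decisions: exponential backoff with 3 attempts, per plan; no deviations so far.
```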
</handoff_content_at_limit>
<preventing_context_bloat>
Strategies to extend context life:
**Don't re-read files unnecessarily**
- Read once, remember content
- Don't cat the same file multiple times
**Summarize rather than quote**
- "The schema has 5 models including User and Session"
- Not: [paste entire schema]
**Use targeted reads**
- Read specific functions, not entire files
- Use grep to find relevant sections (see the sketch below)
**Clear completed work from "memory"**
- Once a task is done, don't keep referencing it
- Move forward, don't re-explain
**Avoid verbose output**
- Concise responses
- Don't repeat user's question back
- Don't over-explain obvious things
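For targeted reads specifically, something like this sketch (path and symbol name are hypothetical):

```bash
# Find the relevant section first, then read only those lines
grep -n "createSession" src/auth/AuthService.ts
sed -n '40,80p' src/auth/AuthService.ts
```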
</preventing_context_bloat>
<user_signals>
Watch for user signals that suggest context concern:
- "Let's wrap up"
- "Save my place"
- "I need to step away"
- "Pack it up"
- "Create a handoff"
- "Running low on context?"
Any of these → trigger handoff workflow immediately.
</user_signals>
<fresh_session_guidance>
When user returns in fresh session:
1. They invoke skill
2. Context scan finds handoff
3. Resume workflow activates
4. Load handoff, present summary
5. Delete handoff after confirmation
6. Continue from saved state
The fresh session has full context available again.
</fresh_session_guidance>

View File

@@ -0,0 +1,170 @@
# Domain Expertise Structure
Guide for creating domain expertise skills that work efficiently with create-plans.
## Purpose
Domain expertise provides context-specific knowledge (Swift/macOS patterns, Next.js conventions, Unity workflows) that makes plans more accurate and actionable.
**Critical:** Domain skills must be context-efficient. Loading 20k+ tokens of references defeats the purpose.
## File Structure
```
~/.claude/skills/expertise/[domain-name]/
├── SKILL.md # Core principles + references_index (5-7k tokens)
├── references/ # Selective loading based on phase type
│ ├── always-useful.md # Conventions, patterns used in all phases
│ ├── database.md # Database-specific guidance
│ ├── ui-layout.md # UI-specific guidance
│ ├── api-routes.md # API-specific guidance
│ └── ...
└── workflows/ # Optional: domain-specific workflows
└── ...
```
## SKILL.md Template
```markdown
---
name: [domain-name]
description: [What this expertise covers]
---
<principles>
## Core Principles
[Fundamental patterns that apply to ALL work in this domain]
[Should be complete enough to plan without loading references]
Examples:
- File organization patterns
- Naming conventions
- Architecture patterns
- Common gotchas to avoid
- Framework-specific requirements
**Keep this section comprehensive but concise (~3-5k tokens).**
</principles>
<references_index>
## Reference Loading Guide
When planning phases, load references based on phase type:
**For [phase-type-1] phases:**
- references/[file1].md - [What it contains]
- references/[file2].md - [What it contains]
**For [phase-type-2] phases:**
- references/[file3].md - [What it contains]
- references/[file4].md - [What it contains]
**Always useful (load for any phase):**
- references/conventions.md - [What it contains]
- references/common-patterns.md - [What it contains]
**Examples of phase type mapping:**
- Database/persistence phases → database.md, migrations.md
- UI/layout phases → ui-patterns.md, design-system.md
- API/backend phases → api-routes.md, auth.md
- Integration phases → system-apis.md, third-party.md
</references_index>
<workflows>
## Optional Workflows
[If domain has specific workflows, list them here]
[These are NOT auto-loaded - only used when specifically invoked]
</workflows>
```
## Reference File Guidelines
Each reference file should be:
**1. Focused** - Single concern (database patterns, UI layout, API design)
**2. Actionable** - Contains patterns Claude can directly apply
```markdown
# Database Patterns
## Table Naming
- Singular nouns (User, not Users)
- snake_case for SQL, PascalCase for models
## Common Patterns
- Soft deletes: deleted_at timestamp
- Audit columns: created_at, updated_at
- Foreign keys: [table]_id format
```
**3. Sized appropriately** - 500-2000 lines (~1-5k tokens)
- Too small: Not worth separate file
- Too large: Split into more focused files
**4. Self-contained** - Can be understood without reading other references
## Context Efficiency Examples
**Bad (old approach):**
```
Load all references: 10,728 lines = ~27k tokens
Result: 50% context before planning starts
```
**Good (new approach):**
```
Load SKILL.md: ~5k tokens
Planning UI phase → load ui-layout.md + conventions.md: ~7k tokens
Total: ~12k tokens (saves 15k for workspace)
```
## Phase Type Classification
Help create-plans determine which references to load:
**Common phase types:**
- **Foundation/Setup** - Project structure, dependencies, configuration
- **Database/Data** - Schema, models, migrations, queries
- **API/Backend** - Routes, controllers, business logic, auth
- **UI/Frontend** - Components, layouts, styling, interactions
- **Integration** - External APIs, system services, third-party SDKs
- **Features** - Domain-specific functionality
- **Polish** - Performance, accessibility, error handling
**References should map to these types** so create-plans can load the right context.
## Migration Guide
If you have an existing domain skill with many references:
1. **Audit references** - What's actually useful vs. reference dumps?
2. **Consolidate principles** - Move core patterns into SKILL.md principles section
3. **Create references_index** - Map phase types to relevant references
4. **Test loading** - Verify you can plan a phase with <15k token overhead
5. **Iterate** - Adjust groupings based on actual planning needs
## Example: macos-apps
**Before (inefficient):**
- 20 reference files
- Load all: 10,728 lines (~27k tokens)
**After (efficient):**
SKILL.md contains:
- Swift/SwiftUI core principles
- macOS app architecture patterns
- Common patterns (MVVM, data flow)
- references_index mapping:
- UI phases → swiftui-layout.md, appleHIG.md (~4k)
- Data phases → core-data.md, swift-concurrency.md (~5k)
- System phases → appkit-integration.md, menu-bar.md (~3k)
- Always → swift-conventions.md (~2k)
**Result:** 5-12k tokens instead of 27k (saves 15-22k for planning)

View File

@@ -0,0 +1,106 @@
# Git Integration Reference
## Core Principle
**Commit outcomes, not process.**
The git log should read like a changelog of what shipped, not a diary of planning activity.
## Commit Points (Only 3)
| Event | Commit? | Why |
|-------|---------|-----|
| BRIEF + ROADMAP created | YES | Project initialization |
| PLAN.md created | NO | Intermediate - commit with completion |
| RESEARCH.md created | NO | Intermediate |
| FINDINGS.md created | NO | Intermediate |
| **Phase completed** | YES | Actual code shipped |
| Handoff created | YES | WIP state preserved |
## Git Check on Invocation
```bash
git rev-parse --git-dir 2>/dev/null || echo "NO_GIT_REPO"
```
If NO_GIT_REPO:
- Inline: "No git repo found. Initialize one? (Recommended for version control)"
- If yes: `git init`
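Put together, the invocation check might look like this sketch:

```bash
# Invocation-time git check (sketch of the flow above)
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "No git repo found. Initialize one? (Recommended for version control)"
  # on user confirmation:
  git init
fi
```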
## Commit Message Formats
### 1. Project Initialization (brief + roadmap together)
```
docs: initialize [project-name] ([N] phases)
[One-liner from BRIEF.md]
Phases:
1. [phase-name]: [goal]
2. [phase-name]: [goal]
3. [phase-name]: [goal]
```
What to commit:
```bash
git add .planning/
git commit
```
### 2. Phase Completion
```
feat([domain]): [one-liner from SUMMARY.md]
- [Key accomplishment 1]
- [Key accomplishment 2]
- [Key accomplishment 3]
[If issues encountered:]
Note: [issue and resolution]
```
Use `fix([domain])` for bug fix phases.
What to commit:
```bash
git add .planning/phases/XX-name/ # PLAN.md + SUMMARY.md
git add src/ # Actual code created
git commit
```
### 3. Handoff (WIP)
```
wip: [phase-name] paused at task [X]/[Y]
Current: [task name]
[If blocked:] Blocked: [reason]
```
What to commit:
```bash
git add .planning/
git commit
```
## Example Clean Git Log
```
a7f2d1 feat(checkout): Stripe payments with webhook verification
b3e9c4 feat(products): catalog with search, filters, and pagination
c8a1b2 feat(auth): JWT with refresh rotation using jose
d5c3d7 feat(foundation): Next.js 15 + Prisma + Tailwind scaffold
e2f4a8 docs: initialize ecommerce-app (5 phases)
```
## What NOT To Commit Separately
- PLAN.md creation (wait for phase completion)
- RESEARCH.md (intermediate)
- FINDINGS.md (intermediate)
- Minor planning tweaks
- "Fixed typo in roadmap"
These create noise. Commit outcomes, not process.

View File

@@ -0,0 +1,142 @@
<overview>
The planning hierarchy ensures context flows down and progress flows up.
Each level builds on the previous and enables the next.
</overview>
<hierarchy>
```
BRIEF.md ← Vision (human-focused)
ROADMAP.md ← Structure (phases)
phases/XX/PLAN.md ← Implementation (Claude-executable)
prompts/ ← Execution (via create-meta-prompts)
```
</hierarchy>
<level name="brief">
**Purpose**: Capture vision, goals, constraints
**Audience**: Human (the user)
**Contains**: What we're building, why, success criteria, out of scope
**Creates**: `.planning/BRIEF.md`
**Requires**: Nothing (can start here)
**Enables**: Roadmap creation
This is the ONLY document optimized for human reading.
</level>
<level name="roadmap">
**Purpose**: Define phases and sequence
**Audience**: Both human and Claude
**Contains**: Phase names, goals, dependencies, progress tracking
**Creates**: `.planning/ROADMAP.md`, `.planning/phases/` directories
**Requires**: Brief (or quick context if skipping)
**Enables**: Phase planning
Roadmap looks UP to Brief for scope, looks DOWN to track phase completion.
</level>
<level name="phase_plan">
**Purpose**: Define Claude-executable tasks
**Audience**: Claude (the implementer)
**Contains**: Tasks with Files/Action/Verification/Done-when
**Creates**: `.planning/phases/XX-name/PLAN.md`
**Requires**: Roadmap (to know phase scope)
**Enables**: Prompt generation, direct execution
Phase plan looks UP to Roadmap for scope, produces implementation details.
</level>
<level name="prompts">
**Purpose**: Optimized execution instructions
**Audience**: Claude (via create-meta-prompts)
**Contains**: Research/Plan/Do prompts with metadata
**Creates**: `.planning/phases/XX-name/prompts/`
**Requires**: Phase plan (tasks to execute)
**Enables**: Autonomous execution
Prompts are generated from phase plan via create-meta-prompts skill.
</level>
<navigation_rules>
<looking_up>
When creating a lower-level artifact, ALWAYS read higher levels for context:
- Creating Roadmap → Read Brief
- Planning Phase → Read Roadmap AND Brief
- Generating Prompts → Read Phase Plan AND Roadmap
This ensures alignment with overall vision.
</looking_up>
<looking_down>
When updating a higher-level artifact, check lower levels for status:
- Updating Roadmap progress → Check which phase PLANs exist, completion state
- Reviewing Brief → See how far we've come via Roadmap
This enables progress tracking.
</looking_down>
<missing_prerequisites>
If a prerequisite doesn't exist:
```
Creating phase plan but no roadmap exists.
Options:
1. Create roadmap first (recommended)
2. Create quick roadmap placeholder
3. Proceed anyway (not recommended - loses hierarchy benefits)
```
Always offer to create missing pieces rather than skipping.
</missing_prerequisites>
</navigation_rules>
<file_locations>
All planning artifacts in `.planning/`:
```
.planning/
├── BRIEF.md # One per project
├── ROADMAP.md # One per project
└── phases/
├── 01-phase-name/
│ ├── PLAN.md # One per phase
│ ├── .continue-here.md # Temporary (when paused)
│ └── prompts/ # Generated execution prompts
├── 02-phase-name/
│ ├── PLAN.md
│ └── prompts/
└── ...
```
Phase directories use `XX-kebab-case` for consistent ordering.
</file_locations>
<scope_inheritance>
Each level inherits and narrows scope:
**Brief**: "Build a task management app"
**Roadmap**: "Phase 1: Core task CRUD, Phase 2: Projects, Phase 3: Collaboration"
**Phase 1 Plan**: "Task 1: Database schema, Task 2: API endpoints, Task 3: UI"
Scope flows DOWN and gets more specific.
Progress flows UP and gets aggregated.
</scope_inheritance>
<cross_phase_context>
When planning Phase N, Claude should understand:
- What Phase N-1 delivered (completed work)
- What Phase N should build on (foundations)
- What Phase N+1 will need (don't paint into corner)
Read previous phase's PLAN.md to understand current state.
</cross_phase_context>

View File

@@ -0,0 +1,495 @@
# Milestone Management & Greenfield/Brownfield Planning
Milestones mark shipped versions. They solve the "what happens after v1.0?" problem.
## The Core Problem
**After shipping v1.0:**
- Planning artifacts optimized for greenfield (starting from scratch)
- But now you have: existing code, users, constraints, shipped features
- Need brownfield awareness without losing planning structure
**Solution:** Milestone-bounded extensions with updated BRIEF.
## Three Planning Modes
### 1. Greenfield (v1.0 Initial Development)
**Characteristics:**
- No existing code
- No users
- No constraints from shipped versions
- Pure "build from scratch" mode
**Planning structure:**
```
.planning/
├── BRIEF.md # Original vision
├── ROADMAP.md # Phases 1-4
└── phases/
├── 01-foundation/
├── 02-features/
├── 03-polish/
└── 04-launch/
```
**BRIEF.md looks like:**
```markdown
# Project Brief: AppName
**Vision:** Build a thing that does X
**Purpose:** Solve problem Y
**Scope:**
- Feature A
- Feature B
- Feature C
**Success:** Ships and works
```
**Workflow:** Normal planning → execution → transition flow
---
### 2. Brownfield Extensions (v1.1, v1.2 - Same Codebase)
**Characteristics:**
- v1.0 shipped and in use
- Adding features / fixing issues
- Same codebase, continuous evolution
- Existing code referenced in new plans
**Planning structure:**
```
.planning/
├── BRIEF.md # Updated with "Current State"
├── ROADMAP.md # Phases 1-6 (grouped by milestone)
├── MILESTONES.md # v1.0 entry
└── phases/
├── 01-foundation/ # ✓ v1.0
├── 02-features/ # ✓ v1.0
├── 03-polish/ # ✓ v1.0
├── 04-launch/ # ✓ v1.0
├── 05-security/ # 🚧 v1.1 (in progress)
└── 06-performance/ # 📋 v1.1 (planned)
```
**BRIEF.md updated:**
```markdown
# Project Brief: AppName
## Current State (Updated: 2025-12-01)
**Shipped:** v1.0 MVP (2025-11-25)
**Users:** 500 downloads, 50 daily actives
**Feedback:** Requesting dark mode, occasional crashes on network errors
**Codebase:** 2,450 lines Swift, macOS 13.0+, AppKit
## v1.1 Goals
**Vision:** Harden reliability and add dark mode based on user feedback
**Motivation:**
- 5 crash reports related to network errors
- 15 users requested dark mode
- Want to improve before marketing push
**Scope (v1.1):**
- Comprehensive error handling
- Dark mode support
- Crash reporting integration
---
<details>
<summary>Original Vision (v1.0 - Archived)</summary>
[Original brief content]
</details>
```
**ROADMAP.md updated:**
```markdown
# Roadmap: AppName
## Milestones
- ✅ **v1.0 MVP** - Phases 1-4 (shipped 2025-11-25)
- 🚧 **v1.1 Hardening** - Phases 5-6 (in progress)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED 2025-11-25</summary>
- [x] Phase 1: Foundation
- [x] Phase 2: Core Features
- [x] Phase 3: Polish
- [x] Phase 4: Launch
</details>
### 🚧 v1.1 Hardening (In Progress)
- [ ] Phase 5: Error Handling & Stability
- [ ] Phase 6: Dark Mode UI
```
**How plans become brownfield-aware:**
When planning Phase 5, the PLAN.md automatically gets context:
```markdown
<context>
@.planning/BRIEF.md # Knows: v1.0 shipped, codebase exists
@.planning/MILESTONES.md # Knows: what v1.0 delivered
@AppName/NetworkManager.swift # Existing code to improve
@AppName/APIClient.swift # Existing code to fix
</context>
<tasks>
<task type="auto">
<name>Add comprehensive error handling to NetworkManager</name>
<files>AppName/NetworkManager.swift</files>
<action>Existing NetworkManager has basic try/catch. Add: retry logic (3 attempts with exponential backoff), specific error types (NetworkError enum), user-friendly error messages. Maintain existing public API - internal improvements only.</action>
<verify>Build succeeds, existing tests pass, new error tests pass</verify>
<done>All network calls have retry logic, error messages are user-friendly</done>
</task>
```
**Key difference from greenfield:**
- PLAN references existing files in `<context>`
- Tasks say "update existing X" not "create X"
- Verify includes "existing tests pass" (regression check)
- Checkpoints may verify existing behavior still works
---
### 3. Major Iterations (v2.0+ - Still Same Codebase)
**Characteristics:**
- Large rewrites within same codebase
- 8-15+ phases planned
- Breaking changes, new architecture
- Still continuous from v1.x
**Planning structure:**
```
.planning/
├── BRIEF.md # Updated for v2.0 vision
├── ROADMAP.md # Phases 1-14 (grouped)
├── MILESTONES.md # v1.0, v1.1 entries
└── phases/
├── 01-foundation/ # ✓ v1.0
├── 02-features/ # ✓ v1.0
├── 03-polish/ # ✓ v1.0
├── 04-launch/ # ✓ v1.0
├── 05-security/ # ✓ v1.1
├── 06-performance/ # ✓ v1.1
├── 07-swiftui-core/ # 🚧 v2.0 (in progress)
├── 08-swiftui-views/ # 📋 v2.0 (planned)
├── 09-new-arch/ # 📋 v2.0
└── ... # Up to 14
```
**ROADMAP.md:**
```markdown
## Milestones
- ✅ **v1.0 MVP** - Phases 1-4 (shipped 2025-11-25)
- ✅ **v1.1 Hardening** - Phases 5-6 (shipped 2025-12-10)
- 🚧 **v2.0 SwiftUI Redesign** - Phases 7-14 (in progress)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4)</summary>
[Collapsed]
</details>
<details>
<summary>✅ v1.1 Hardening (Phases 5-6)</summary>
[Collapsed]
</details>
### 🚧 v2.0 SwiftUI Redesign (In Progress)
- [ ] Phase 7: SwiftUI Core Migration
- [ ] Phase 8: SwiftUI Views
- [ ] Phase 9: New Architecture
- [ ] Phase 10: Widget Support
- [ ] Phase 11: iOS Companion
- [ ] Phase 12: Performance
- [ ] Phase 13: Testing
- [ ] Phase 14: Launch
```
**Same rules apply:** Continuous phase numbering, milestone groupings, brownfield-aware plans.
---
## When to Archive and Start Fresh
**Archive ONLY for these scenarios:**
### Scenario 1: Separate Codebase
**Example:**
- Built: WeatherBar (macOS app) ✓ shipped
- Now building: WeatherBar-iOS (separate Xcode project, different repo or workspace)
**Action:**
```
.planning/
├── archive/
│ └── v1-macos/
│ ├── BRIEF.md
│ ├── ROADMAP.md
│ ├── MILESTONES.md
│ └── phases/
├── BRIEF.md # Fresh: iOS app
├── ROADMAP.md # Fresh: starts at phase 01
└── phases/
└── 01-ios-foundation/
```
**Why:** Different codebase = different planning context. Old planning doesn't help with iOS-specific decisions.
### Scenario 2: Complete Rewrite (Different Repo)
**Example:**
- Built: AppName v1 (AppKit, shipped) ✓
- Now building: AppName v2 (complete SwiftUI rewrite, new git repo)
**Action:** Same as Scenario 1 - archive v1, fresh planning for v2
**Why:** New repo, starting from scratch, v1 planning doesn't transfer.
### Scenario 3: Different Product
**Example:**
- Built: WeatherBar (weather app) ✓
- Now building: TaskBar (task management app)
**Action:** New project entirely, new `.planning/` directory
**Why:** Completely different product, no relationship.
---
## Decision Tree
```
Starting new work?
├─ Same codebase/repo?
│ │
│ ├─ YES → Extend existing roadmap
│ │ ├─ Add phases 5-6+ to ROADMAP
│ │ ├─ Update BRIEF "Current State"
│ │ ├─ Plans reference existing code in @context
│ │ └─ Continue normal workflow
│ │
│ └─ NO → Is it a separate platform/codebase for same product?
│ │
│ ├─ YES (e.g., iOS version of Mac app)
│ │ └─ Archive existing planning
│ │ └─ Start fresh with new BRIEF/ROADMAP
│ │ └─ Reference original in "Context" section
│ │
│ └─ NO (completely different product)
│ └─ New project, new planning directory
└─ Is this v1.0 initial delivery?
└─ YES → Greenfield mode
└─ Just follow normal workflow
```
---
## Milestone Workflow Triggers
### When completing v1.0 (first ship):
**User:** "I'm ready to ship v1.0"
**Action:**
1. Verify phases 1-4 complete (all summaries exist)
2. `/milestone:complete "v1.0 MVP"`
3. Creates MILESTONES.md entry
4. Updates BRIEF with "Current State"
5. Reorganizes ROADMAP with milestone grouping
6. Git tag v1.0
7. Commit milestone changes
**Result:** Historical record created, ready for v1.1 work
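The tag-and-commit in steps 6-7 is mechanical; a sketch (committing first so the tag lands on the milestone commit; message wording illustrative):

```bash
git add .planning/
git commit -m "docs: complete v1.0 MVP milestone"
git tag v1.0
```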
### When adding v1.1 work:
**User:** "Add dark mode and notifications"
**Action:**
1. Check BRIEF "Current State" - sees v1.0 shipped
2. Ask: "Add phases 5-6 to existing roadmap? (yes / archive and start fresh)"
3. User: "yes"
4. Update BRIEF with v1.1 goals
5. Add Phase 5-6 to ROADMAP under "v1.1" milestone heading
6. Continue normal planning workflow
**Result:** Phases 5-6 added, brownfield-aware through updated BRIEF
### When completing v1.1:
**User:** "Ship v1.1"
**Action:**
1. Verify phases 5-6 complete
2. `/milestone:complete "v1.1 Security"`
3. Add v1.1 entry to MILESTONES.md (prepended, newest first)
4. Update BRIEF current state to v1.1
5. Collapse phases 5-6 in ROADMAP
6. Git tag v1.1
**Result:** v1.0 and v1.1 both in MILESTONES.md, ROADMAP shows history
---
## Brownfield Plan Patterns
**How a brownfield plan differs from greenfield:**
### Greenfield Plan (v1.0):
```markdown
<objective>
Create authentication system from scratch.
</objective>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
</context>
<tasks>
<task type="auto">
<name>Create User model</name>
<files>src/models/User.ts</files>
<action>Create User interface with id, email, passwordHash, createdAt fields. Export from models/index.</action>
<verify>TypeScript compiles, User type exported</verify>
<done>User model exists and is importable</done>
</task>
```
### Brownfield Plan (v1.1):
```markdown
<objective>
Add MFA to existing authentication system.
</objective>
<context>
@.planning/BRIEF.md # Shows v1.0 shipped, auth exists
@.planning/MILESTONES.md # Shows what v1.0 delivered
@src/models/User.ts # Existing User model
@src/auth/AuthService.ts # Existing auth logic
</context>
<tasks>
<task type="auto">
<name>Add MFA fields to User model</name>
<files>src/models/User.ts</files>
<action>Add to existing User interface: mfaEnabled (boolean), mfaSecret (string | null), mfaBackupCodes (string[]). Maintain backward compatibility - all new fields optional or have defaults.</action>
<verify>TypeScript compiles, existing User usages still work</verify>
<done>User model has MFA fields, no breaking changes</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>MFA enrollment flow</what-built>
<how-to-verify>
1. Run: npm run dev
2. Login as existing user (test@example.com)
3. Navigate to Settings → Security
4. Click "Enable MFA" - should show QR code
5. Scan with authenticator app (Google Authenticator)
6. Enter code - should enable successfully
7. Logout, login again - should prompt for MFA code
8. Verify: existing users without MFA can still login (backward compat)
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
**Key differences:**
1. **@context** includes existing code files
2. **Actions** say "add to existing" / "update existing" / "maintain backward compat"
3. **Verification** includes regression checks ("existing X still works")
4. **Checkpoints** may verify existing user flows still work
---
## BRIEF Current State Section
The "Current State" section in BRIEF.md is what makes plans brownfield-aware.
**After v1.0 ships:**
```markdown
## Current State (Updated: 2025-11-25)
**Shipped:** v1.0 MVP (2025-11-25)
**Status:** Production
**Users:** 500 downloads, 50 daily actives, growing 10% weekly
**Feedback:**
- "Love the simplicity" (common theme)
- 15 requests for dark mode
- 5 crash reports on network errors
- 3 requests for multiple accounts
**Codebase:**
- 2,450 lines of Swift
- macOS 13.0+ (AppKit)
- OpenWeather API integration
- Auto-refresh every 30 min
- Signed and notarized
**Known Issues:**
- Network errors crash app (no retry logic)
- Memory leak in auto-refresh timer
- No dark mode support
```
When planning Phase 5 (v1.1), Claude reads this and knows:
- Code exists (2,450 lines Swift)
- Users exist (500 downloads)
- Feedback exists (15 want dark mode)
- Issues exist (network crashes, memory leak)
Plans automatically become brownfield-aware because BRIEF says "this is what we have."
---
## Summary
**Greenfield (v1.0):**
- Fresh BRIEF with vision
- Phases 1-4 (or however many)
- Plans create from scratch
- Ship → complete milestone
**Brownfield (v1.1+):**
- Update BRIEF "Current State"
- Add phases 5-6+ to ROADMAP
- Plans reference existing code
- Plans include regression checks
- Ship → complete milestone
**Archive (rare):**
- Only for separate codebases or different products
- Move `.planning/` to `.planning/archive/v1-name/`
- Start fresh with new BRIEF/ROADMAP
- New planning references old in context
**Key insight:** Same roadmap, continuous phase numbering (01-99), milestone groupings keep it organized. BRIEF "Current State" makes everything brownfield-aware automatically.
This scales from "hello world" to 100 shipped versions.

View File

@@ -0,0 +1,377 @@
<overview>
Claude-executable plans have a specific format that enables Claude to implement without interpretation. This reference defines what makes a plan executable vs. vague.
**Key insight:** PLAN.md IS the executable prompt. It contains everything Claude needs to execute the phase, including objective, context references, tasks, verification, success criteria, and output specification.
</overview>
<core_principle>
A plan is Claude-executable when Claude can read the PLAN.md and immediately start implementing without asking clarifying questions.
If Claude has to guess, interpret, or make assumptions - the task is too vague.
</core_principle>
<prompt_structure>
Every PLAN.md follows this XML structure:
```markdown
---
phase: XX-name
type: execute
domain: [optional]
---
<objective>
[What and why]
Purpose: [...]
Output: [...]
</objective>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
@relevant/source/files.ts
</context>
<tasks>
<task type="auto">
<name>Task N: [Name]</name>
<files>[paths]</files>
<action>[what to do, what to avoid and WHY]</action>
<verify>[command/check]</verify>
<done>[criteria]</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[what Claude automated]</what-built>
<how-to-verify>[numbered verification steps]</how-to-verify>
<resume-signal>[how to continue - "approved" or describe issues]</resume-signal>
</task>
<task type="checkpoint:decision" gate="blocking">
<decision>[what needs deciding]</decision>
<context>[why this matters]</context>
<options>
<option id="option-a"><name>[Name]</name><pros>[pros]</pros><cons>[cons]</cons></option>
<option id="option-b"><name>[Name]</name><pros>[pros]</pros><cons>[cons]</cons></option>
</options>
<resume-signal>[how to indicate choice]</resume-signal>
</task>
</tasks>
<verification>
[Overall phase checks]
</verification>
<success_criteria>
[Measurable completion]
</success_criteria>
<output>
[SUMMARY.md specification]
</output>
```
</prompt_structure>
<task_anatomy>
Every task has four required fields:
<field name="files">
**What it is**: Exact file paths that will be created or modified.
**Good**: `src/app/api/auth/login/route.ts`, `prisma/schema.prisma`
**Bad**: "the auth files", "relevant components"
Be specific. If you don't know the file path, figure it out first.
</field>
<field name="action">
**What it is**: Specific implementation instructions, including what to avoid and WHY.
**Good**: "Create POST endpoint that accepts {email, password}, validates using bcrypt against User table, returns JWT in httpOnly cookie with 15-min expiry. Use jose library (not jsonwebtoken - CommonJS issues with Next.js Edge runtime)."
**Bad**: "Add authentication", "Make login work"
Include: technology choices, data structures, behavior details, pitfalls to avoid.
</field>
<field name="verify">
**What it is**: How to prove the task is complete.
**Good**:
- `npm test` passes
- `curl -X POST /api/auth/login` returns 200 with Set-Cookie header
- Build completes without errors
**Bad**: "It works", "Looks good", "User can log in"
Must be executable - a command, a test, an observable behavior.
</field>
<field name="done">
**What it is**: Acceptance criteria - the measurable state of completion.
**Good**: "Valid credentials return 200 + JWT cookie, invalid credentials return 401"
**Bad**: "Authentication is complete"
Should be testable without subjective judgment.
</field>
</task_anatomy>
<task_types>
Tasks have a `type` attribute that determines how they execute:
<type name="auto">
**Default task type** - Claude executes autonomously.
**Structure:**
```xml
<task type="auto">
<name>Task 3: Create login endpoint with JWT</name>
<files>src/app/api/auth/login/route.ts</files>
<action>POST endpoint accepting {email, password}. Query User by email, compare password with bcrypt. On match, create JWT with jose library, set as httpOnly cookie (15-min expiry). Return 200. On mismatch, return 401.</action>
<verify>curl -X POST localhost:3000/api/auth/login returns 200 with Set-Cookie header</verify>
<done>Valid credentials → 200 + cookie. Invalid → 401.</done>
</task>
```
Use for: Everything Claude can do independently (code, tests, builds, file operations).
</type>
<type name="checkpoint:human-action">
**RARELY USED** - Only for actions with NO CLI/API. Claude automates everything possible first.
**Structure:**
```xml
<task type="checkpoint:human-action" gate="blocking">
<action>[Unavoidable manual step - email link, 2FA code]</action>
<instructions>
[What Claude already automated]
[The ONE thing requiring human action]
</instructions>
<verification>[What Claude can check afterward]</verification>
<resume-signal>[How to continue]</resume-signal>
</task>
```
Use ONLY for: Email verification links, SMS 2FA codes, manual approvals with no API, 3D Secure payment flows.
Do NOT use for: Anything with a CLI (Vercel, Stripe, Upstash, Railway, GitHub), builds, tests, file creation, deployments.
See: references/cli-automation.md for what Claude can automate.
**Execution:** Claude automates everything with CLI/API, stops only for truly unavoidable manual steps.
</type>
<type name="checkpoint:human-verify">
**Human must verify Claude's work** - Visual checks, UX testing.
**Structure:**
```xml
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Responsive dashboard layout</what-built>
<how-to-verify>
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Desktop (>1024px): Verify sidebar left, content right
4. Tablet (768px): Verify sidebar collapses to hamburger
5. Mobile (375px): Verify single column, bottom nav
6. Check: No layout shift, no horizontal scroll
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
```
Use for: UI/UX verification, visual design checks, animation smoothness, accessibility testing.
**Execution:** Claude builds the feature, stops, provides testing instructions, waits for approval/feedback.
</type>
<type name="checkpoint:decision">
**Human must make implementation choice** - Direction-setting decisions.
**Structure:**
```xml
<task type="checkpoint:decision" gate="blocking">
<decision>Select authentication provider</decision>
<context>We need user authentication. Three approaches with different tradeoffs:</context>
<options>
<option id="supabase">
<name>Supabase Auth</name>
<pros>Built-in with Supabase, generous free tier</pros>
<cons>Less customizable UI, tied to ecosystem</cons>
</option>
<option id="clerk">
<name>Clerk</name>
<pros>Beautiful pre-built UI, best DX</pros>
<cons>Paid after 10k MAU</cons>
</option>
<option id="nextauth">
<name>NextAuth.js</name>
<pros>Free, self-hosted, maximum control</pros>
<cons>More setup, you manage security</cons>
</option>
</options>
<resume-signal>Select: supabase, clerk, or nextauth</resume-signal>
</task>
```
Use for: Technology selection, architecture decisions, design choices, feature prioritization.
**Execution:** Claude presents options with balanced pros/cons, waits for decision, proceeds with chosen direction.
</type>
**When to use checkpoints:**
- Visual/UX verification (after Claude builds) → `checkpoint:human-verify`
- Implementation direction choice → `checkpoint:decision`
- Truly unavoidable manual actions (email links, 2FA) → `checkpoint:human-action` (rare)
**When NOT to use checkpoints:**
- Anything with CLI/API (Claude automates it) → `type="auto"`
- Deployments (Vercel, Railway, Fly) → `type="auto"` with CLI
- Creating resources (Upstash, Stripe, GitHub) → `type="auto"` with CLI/API
- File operations, tests, builds → `type="auto"`
**Golden rule:** If Claude CAN automate it, Claude MUST automate it. See: references/cli-automation.md
See `references/checkpoints.md` for comprehensive checkpoint guidance.
</task_types>
<context_references>
Use @file references to load context for the prompt:
```markdown
<context>
@.planning/BRIEF.md # Project vision
@.planning/ROADMAP.md # Phase structure
@.planning/phases/02-auth/FINDINGS.md # Research results
@src/lib/db.ts # Existing database setup
@src/types/user.ts # Existing type definitions
</context>
```
Reference files that Claude needs to understand before implementing.
</context_references>
<verification_section>
Overall phase verification (beyond individual task verification):
```markdown
<verification>
Before declaring phase complete:
- [ ] `npm run build` succeeds without errors
- [ ] `npm test` passes all tests
- [ ] No TypeScript errors
- [ ] Feature works end-to-end manually
</verification>
```
</verification_section>
<success_criteria_section>
Measurable criteria for phase completion:
```markdown
<success_criteria>
- All tasks completed
- All verification checks pass
- No errors or warnings introduced
- JWT auth flow works end-to-end
- Protected routes redirect unauthenticated users
</success_criteria>
```
</success_criteria_section>
<output_section>
Specify the SUMMARY.md structure:
```markdown
<output>
After completion, create `.planning/phases/XX-name/SUMMARY.md`:
# Phase X: Name Summary
**[Substantive one-liner]**
## Accomplishments
## Files Created/Modified
## Decisions Made
## Issues Encountered
## Next Phase Readiness
</output>
```
</output_section>
<specificity_levels>
<too_vague>
```xml
<task type="auto">
<name>Task 1: Add authentication</name>
<files>???</files>
<action>Implement auth</action>
<verify>???</verify>
<done>Users can authenticate</done>
</task>
```
Claude: "How? What type? What library? Where?"
</too_vague>
<just_right>
```xml
<task type="auto">
<name>Task 1: Create login endpoint with JWT</name>
<files>src/app/api/auth/login/route.ts</files>
<action>POST endpoint accepting {email, password}. Query User by email, compare password with bcrypt. On match, create JWT with jose library, set as httpOnly cookie (15-min expiry). Return 200. On mismatch, return 401. Use jose instead of jsonwebtoken (CommonJS issues with Edge).</action>
<verify>curl -X POST localhost:3000/api/auth/login -H "Content-Type: application/json" -d '{"email":"test@test.com","password":"test123"}' returns 200 with Set-Cookie header containing JWT</verify>
<done>Valid credentials → 200 + cookie. Invalid → 401. Missing fields → 400.</done>
</task>
```
Claude can implement this immediately.
</just_right>
<too_detailed>
Writing the actual code in the plan. Trust Claude to implement from clear instructions.
</too_detailed>
</specificity_levels>
<anti_patterns>
<vague_actions>
- "Set up the infrastructure"
- "Handle edge cases"
- "Make it production-ready"
- "Add proper error handling"
These require Claude to decide WHAT to do. Specify it.
</vague_actions>
<unverifiable_completion>
- "It works correctly"
- "User experience is good"
- "Code is clean"
- "Tests pass" (which tests? do they exist?)
These require subjective judgment. Make it objective.
</unverifiable_completion>
<missing_context>
- "Use the standard approach"
- "Follow best practices"
- "Like the other endpoints"
Claude doesn't know your standards. Be explicit.
</missing_context>
</anti_patterns>
<sizing_tasks>
Good task size: 15-60 minutes of Claude work.
**Too small**: "Add import statement for bcrypt" (combine with related task)
**Just right**: "Create login endpoint with JWT validation" (focused, specific)
**Too big**: "Implement full authentication system" (split into multiple plans)
If a task takes multiple sessions, break it down.
If a task is trivial, combine with related tasks.
**Note on scope:** If a phase has >7 tasks or spans multiple subsystems, split into multiple plans using the naming convention `{phase}-{plan}-PLAN.md`. See `references/scope-estimation.md` for guidance.
</sizing_tasks>

View File

@@ -0,0 +1,198 @@
# Research Pitfalls - Known Patterns to Avoid
## Purpose
This document catalogs research mistakes discovered in production use, providing specific patterns to avoid and verification strategies to prevent recurrence.
## Known Pitfalls
### Pitfall 1: Configuration Scope Assumptions
**What**: Assuming global configuration means no project-scoping exists
**Example**: Concluding "MCP servers are configured GLOBALLY only" while missing project-scoped `.mcp.json`
**Why it happens**: Not explicitly checking all known configuration patterns
**Prevention**:
```xml
<verification_checklist>
**CRITICAL**: Verify ALL configuration scopes:
□ User/global scope - System-wide configuration
□ Project scope - Project-level configuration files
□ Local scope - Project-specific user overrides
□ Workspace scope - IDE/tool workspace settings
□ Environment scope - Environment variables
</verification_checklist>
```
### Pitfall 2: "Search for X" Vagueness
**What**: Asking researchers to "search for documentation" without specifying where
**Example**: "Research MCP documentation" → finds outdated community blog instead of official docs
**Why it happens**: Vague research instructions don't specify exact sources
**Prevention**:
```xml
<sources>
Official sources (use WebFetch):
- https://exact-url-to-official-docs
- https://exact-url-to-api-reference
Search queries (use WebSearch):
- "specific search query {current_year}"
- "another specific query {current_year}"
</sources>
```
### Pitfall 3: Deprecated vs Current Features
**What**: Finding archived/old documentation and concluding feature doesn't exist
**Example**: Finding 2022 docs saying "feature not supported" when current version added it
**Why it happens**: Not checking multiple sources or recent updates
**Prevention**:
```xml
<verification_checklist>
□ Check current official documentation
□ Review changelog/release notes for recent updates
□ Verify version numbers and publication dates
□ Cross-reference multiple authoritative sources
</verification_checklist>
```
### Pitfall 4: Tool-Specific Variations
**What**: Conflating capabilities across different tools/environments
**Example**: "Claude Desktop supports X" ≠ "Claude Code supports X"
**Why it happens**: Not explicitly checking each environment separately
**Prevention**:
```xml
<verification_checklist>
□ Claude Desktop capabilities
□ Claude Code capabilities
□ VS Code extension capabilities
□ API/SDK capabilities
Document which environment supports which features
</verification_checklist>
```
### Pitfall 5: Confident Negative Claims Without Citations
**What**: Making definitive "X is not possible" statements without official source verification
**Example**: "Folder-scoped MCP configuration is not supported" (missing `.mcp.json`)
**Why it happens**: Drawing conclusions from absence of evidence rather than evidence of absence
**Prevention**:
```xml
<critical_claims_audit>
For any "X is not possible" or "Y is the only way" statement:
- [ ] Is this verified by official documentation stating it explicitly?
- [ ] Have I checked for recent updates that might change this?
- [ ] Have I verified all possible approaches/mechanisms?
- [ ] Am I confusing "I didn't find it" with "it doesn't exist"?
</critical_claims_audit>
```
### Pitfall 6: Missing Enumeration
**What**: Investigating open-ended scope without enumerating known possibilities first
**Example**: "Research configuration options" instead of listing specific options to verify
**Why it happens**: Not creating explicit checklist of items to investigate
**Prevention**:
```xml
<verification_checklist>
Enumerate ALL known options FIRST:
□ Option 1: [specific item]
□ Option 2: [specific item]
□ Option 3: [specific item]
□ Check for additional unlisted options
For each option above, document:
- Existence (confirmed/not found/unclear)
- Official source URL
- Current status (active/deprecated/beta)
</verification_checklist>
```
### Pitfall 7: Single-Source Verification
**What**: Relying on a single source for critical claims
**Example**: Using only a Stack Overflow answer from 2021 for current best practices
**Why it happens**: Not cross-referencing multiple authoritative sources
**Prevention**:
```xml
<source_verification>
For critical claims, require multiple sources:
- [ ] Official documentation (primary)
- [ ] Release notes/changelog (for currency)
- [ ] Additional authoritative source (for verification)
- [ ] Contradiction check (ensure sources agree)
</source_verification>
```
### Pitfall 8: Assumed Completeness
**What**: Assuming search results are complete and authoritative
**Example**: First Google result is outdated but assumed current
**Why it happens**: Not verifying publication dates and source authority
**Prevention**:
```xml
<source_verification>
For each source consulted:
- [ ] Publication/update date verified (prefer recent/current)
- [ ] Source authority confirmed (official docs, not blogs)
- [ ] Version relevance checked (matches current version)
- [ ] Multiple search queries tried (not just one)
</source_verification>
```
## Red Flags in Research Outputs
### 🚩 Red Flag 1: Zero "Not Found" Results
**Warning**: Every investigation succeeds perfectly
**Problem**: Real research encounters dead ends, ambiguity, and unknowns
**Action**: Expect honest reporting of limitations, contradictions, and gaps
### 🚩 Red Flag 2: No Confidence Indicators
**Warning**: All findings presented as equally certain
**Problem**: Can't distinguish verified facts from educated guesses
**Action**: Require confidence levels (High/Medium/Low) for key findings
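A findings excerpt with honest confidence markers might look like this (content illustrative):

```markdown
## Key Findings
- Project-scoped config exists via `.mcp.json` - **Confidence: High**
  (stated explicitly in official docs: [URL], updated [date])
- Workspace-level overrides supported - **Confidence: Low**
  (single community post, not verified against official docs)
```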
### 🚩 Red Flag 3: Missing URLs
**Warning**: "According to documentation..." without specific URL
**Problem**: Can't verify claims or check for updates
**Action**: Require actual URLs for all official documentation claims
### 🚩 Red Flag 4: Definitive Statements Without Evidence
**Warning**: "X cannot do Y" or "Z is the only way" without citation
**Problem**: Strong claims require strong evidence
**Action**: Flag for verification against official sources
### 🚩 Red Flag 5: Incomplete Enumeration
**Warning**: Verification checklist lists 4 items, output covers 2
**Problem**: Systematic gaps in coverage
**Action**: Ensure all enumerated items addressed or marked "not found"
## Continuous Improvement
When research gaps occur:
1. **Document the gap**
- What was missed or incorrect?
- What was the actual correct information?
- What was the impact?
2. **Root cause analysis**
- Why wasn't it caught?
- Which verification step would have prevented it?
- What pattern does this reveal?
3. **Update this document**
- Add new pitfall entry
- Update relevant checklists
- Share lesson learned
## Quick Reference Checklist
Before submitting research, verify:
- [ ] All enumerated items investigated (not just some)
- [ ] Negative claims verified with official docs
- [ ] Multiple sources cross-referenced for critical claims
- [ ] URLs provided for all official documentation
- [ ] Publication dates checked (prefer recent/current)
- [ ] Tool/environment-specific variations documented
- [ ] Confidence levels assigned honestly
- [ ] Assumptions distinguished from verified facts
- [ ] "What might I have missed?" review completed
---
**Living Document**: Update after each significant research gap
**Lessons From**: MCP configuration research gap (missed `.mcp.json`)

View File

@@ -0,0 +1,415 @@
# Scope Estimation & Quality-Driven Plan Splitting
Plans must maintain consistent quality from first task to last. This requires understanding the **quality degradation curve** and splitting aggressively to stay in the peak quality zone.
## The Quality Degradation Curve
**Critical insight:** Claude doesn't degrade at arbitrary percentages - it degrades when it *perceives* context pressure and enters "completion mode."
```
Context Usage │ Quality Level │ Claude's Mental State
─────────────────────────────────────────────────────────
0-30% │ ████████ PEAK │ "I can be thorough and comprehensive"
│ │ No anxiety, full detail, best work
30-50% │ ██████ GOOD │ "Still have room, maintaining quality"
│ │ Engaged, confident, solid work
50-70% │ ███ DEGRADING │ "Getting tight, need to be efficient"
│ │ Efficiency mode, compression begins
70%+ │ █ POOR │ "Running out, must finish quickly"
│ │ Self-lobotomization, rushed, minimal
```
**The 40-50% inflection point:**
This is where quality breaks. Claude sees context mounting and thinks "I'd better conserve now or I won't finish." Result: The classic mid-execution statement "I'll complete the remaining tasks more concisely" = quality crash.
**The fundamental rule:** Stop BEFORE quality degrades, not at context limit.
## Target: 50% Context Maximum
**Plans should complete within ~50% of context usage.**
Why 50% not 80%?
- Huge safety buffer
- No context anxiety possible
- Quality maintained from start to finish
- Room for unexpected complexity
- Space for iteration and fixes
**If you target 80%, you're planning for failure.** By the time you hit 80%, you've already spent 40% in degradation mode.
## The 2-3 Task Rule
**Each plan should contain 2-3 tasks maximum.**
Why this number?
**Task 1 (0-15% context):**
- Fresh context
- Peak quality
- Comprehensive implementation
- Full testing
- Complete documentation
**Task 2 (15-35% context):**
- Still in peak zone
- Quality maintained
- Buffer feels safe
- No anxiety
**Task 3 (35-50% context):**
- Beginning to feel pressure
- Quality still good, but actively managed
- Natural stopping point
- Better to commit here
**Task 4+ (50%+ context):**
- DEGRADATION ZONE
- "I'll do this concisely" appears
- Quality crashes
- Should have split before this
**The principle:** Each task is independently committable. 2-3 focused changes per commit creates beautiful, surgical git history.
## Signals to Split Into Multiple Plans
### Always Split If:
**1. More than 3 tasks**
- Even if tasks seem small
- Each additional task increases degradation risk
- Split into logical groups of 2-3
**2. Multiple subsystems**
```
❌ Bad (1 plan):
- Database schema (3 files)
- API routes (5 files)
- UI components (8 files)
Total: 16 files, 1 plan → guaranteed degradation
✅ Good (3 plans):
- 01-01-PLAN.md: Database schema (3 files, 2 tasks)
- 01-02-PLAN.md: API routes (5 files, 3 tasks)
- 01-03-PLAN.md: UI components (8 files, 3 tasks)
Total: 16 files, 3 plans → consistent quality
```
**3. Any task with >5 file modifications**
- Large tasks burn context fast
- Split by file groups or logical units
- Better: 3 plans of 2 files each vs 1 plan of 6 files
**4. Checkpoint + implementation work**
- Checkpoints require user interaction (context preserved)
- Implementation after checkpoint should be separate plan
```
✅ Good split:
- 02-01-PLAN.md: Setup (checkpoint: decision on auth provider)
- 02-02-PLAN.md: Implement chosen auth solution
```
**5. Research + implementation**
- Research produces FINDINGS.md (separate plan)
- Implementation consumes FINDINGS.md (separate plan)
- Clear boundary, clean handoff
### Consider Splitting If:
**1. Estimated >5 files modified total**
- Context from reading existing code
- Context from diffs
- Context from responses
- Adds up faster than expected
**2. Complex domains (auth, payments, data modeling)**
- These require careful thinking
- Burns more context per task than simple CRUD
- Split more aggressively
**3. Any uncertainty about approach**
- "Figure out X" phase separate from "implement X" phase
- Don't mix exploration and implementation
**4. Natural semantic boundaries**
- Setup → Core → Features
- Backend → Frontend
- Configuration → Implementation → Testing
## Splitting Strategies
### By Subsystem
**Phase:** "Authentication System"
**Split:**
```
- 03-01-PLAN.md: Database models (User, Session tables + relations)
- 03-02-PLAN.md: Auth API (register, login, logout endpoints)
- 03-03-PLAN.md: Protected routes (middleware, JWT validation)
- 03-04-PLAN.md: UI components (login form, registration form)
```
Each plan: 2-3 tasks, single subsystem, clean commits.
### By Dependency
**Phase:** "Payment Integration"
**Split:**
```
- 04-01-PLAN.md: Stripe setup (webhook endpoints via API, env vars, test mode)
- 04-02-PLAN.md: Subscription logic (plans, checkout, customer portal)
- 04-03-PLAN.md: Frontend integration (pricing page, payment flow)
```
Later plans depend on earlier completion. Sequential execution, fresh context each time.
### By Complexity
**Phase:** "Dashboard Buildout"
**Split:**
```
- 05-01-PLAN.md: Layout shell (simple: sidebar, header, routing)
- 05-02-PLAN.md: Data fetching (moderate: TanStack Query setup, API integration)
- 05-03-PLAN.md: Data visualization (complex: charts, tables, real-time updates)
```
Complex work gets its own plan with full context budget.
### By Verification Points
**Phase:** "Deployment Pipeline"
**Split:**
```
- 06-01-PLAN.md: Vercel setup (deploy via CLI, configure domains)
→ Ends with checkpoint:human-verify "check xyz.vercel.app loads"
- 06-02-PLAN.md: Environment config (secrets via CLI, env vars)
→ Autonomous (no checkpoints) → subagent execution
- 06-03-PLAN.md: CI/CD (GitHub Actions, preview deploys)
→ Ends with checkpoint:human-verify "check PR preview works"
```
Verification checkpoints create natural boundaries. Autonomous plans between checkpoints execute via subagent with fresh context.
## Autonomous vs Interactive Plans
**Critical optimization:** Plans without checkpoints don't need main context.
### Autonomous Plans (No Checkpoints)
- Contains only `type="auto"` tasks
- No user interaction needed
- **Execute via subagent with fresh 200k context**
- Impossible to degrade (always starts at 0%)
- Creates SUMMARY, commits, reports back
- Can run in parallel (multiple subagents)
### Interactive Plans (Has Checkpoints)
- Contains `checkpoint:human-verify` or `checkpoint:decision` tasks
- Requires user interaction
- Must execute in main context
- Still target 50% context (2-3 tasks)
**Planning guidance:** If splitting a phase, try to:
- Group autonomous work together (→ subagent)
- Separate interactive work (→ main context)
- Maximize autonomous plans (more fresh contexts)
Example:
```
Phase: Feature X
- 07-01-PLAN.md: Backend (autonomous) → subagent
- 07-02-PLAN.md: Frontend (autonomous) → subagent
- 07-03-PLAN.md: Integration test (has checkpoint:human-verify) → main context
```
Two fresh contexts, one interactive verification. Perfect.
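This classification is mechanical. A minimal sketch, assuming the `type="checkpoint..."` attribute convention from the plan template (the phase path is illustrative):

```bash
# Route each plan: checkpoints -> main context, none -> subagent
for plan in .planning/phases/XX-name/*-PLAN.md; do
  if grep -q 'type="checkpoint' "$plan"; then
    echo "$plan: interactive (has checkpoints) -> main context"
  else
    echo "$plan: autonomous (no checkpoints) -> subagent"
  fi
done
```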
## Anti-Patterns
### ❌ The "Comprehensive Plan" Anti-Pattern
```
Plan: "Complete Authentication System"
Tasks:
1. Database models
2. Migration files
3. Auth API endpoints
4. JWT utilities
5. Protected route middleware
6. Password hashing
7. Login form component
8. Registration form component
Result: 8 tasks, 80%+ context, degradation at task 4-5
```
**Why this fails:**
- Task 1-3: Good quality
- Task 4-5: "I'll do these concisely" = degradation begins
- Task 6-8: Rushed, minimal, poor quality
### ✅ The "Atomic Plan" Pattern
```
Split into 4 plans:
Plan 1: "Auth Database Models" (2 tasks)
- Database schema (User, Session)
- Migration files
Plan 2: "Auth API Core" (3 tasks)
- Register endpoint
- Login endpoint
- JWT utilities
Plan 3: "Auth API Protection" (2 tasks)
- Protected route middleware
- Logout endpoint
Plan 4: "Auth UI Components" (2 tasks)
- Login form
- Registration form
```
**Why this succeeds:**
- Each plan: 2-3 tasks, 30-40% context
- All tasks: Peak quality throughout
- Git history: 4 focused commits
- Easy to verify each piece
- Rollback is surgical
### ❌ The "Efficiency Trap" Anti-Pattern
```
Thinking: "These tasks are small, let's do 6 to be efficient"
Result: Task 1-2 are good, task 3-4 begin degrading, task 5-6 are rushed
```
**Why this fails:** You're optimizing for fewer plans, not quality. The "efficiency" is false - poor quality requires more rework.
### ✅ The "Quality First" Pattern
```
Thinking: "These tasks are small, but let's do 2-3 to guarantee quality"
Result: All tasks peak quality, clean commits, no rework needed
```
**Why this succeeds:** You optimize for quality, which is true efficiency. No rework = faster overall.
## Estimating Context Usage
**Rough heuristics for plan size:**
### File Counts
- 0-3 files modified: Small task (~10-15% context)
- 4-6 files modified: Medium task (~20-30% context)
- 7+ files modified: Large task (~40%+ context) - split this
### Complexity
- Simple CRUD: ~15% per task
- Business logic: ~25% per task
- Complex algorithms: ~40% per task
- Domain modeling: ~35% per task
### 2-Task Plan (Safe)
- 2 simple tasks: ~30% total ✅ Plenty of room
- 2 medium tasks: ~50% total ✅ At target
- 2 complex tasks: ~80% total ❌ Too tight, split
### 3-Task Plan (Risky)
- 3 simple tasks: ~45% total ✅ Good
- 3 medium tasks: ~75% total ⚠️ Pushing it
- 3 complex tasks: ~120% total ❌ Impossible, split
**Conservative principle:** When in doubt, split. Better to have an extra plan than degraded quality.
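These heuristics are easy to apply before committing to a plan. A minimal sketch using the per-task percentages above (the numbers are rough planning heuristics, not measurements):

```bash
# estimate_context SIMPLE MEDIUM COMPLEX - rough context % for a task mix
estimate_context() {
  local simple=${1:-0} medium=${2:-0} complex=${3:-0}
  local total=$(( simple * 15 + medium * 25 + complex * 40 ))
  echo "Estimated context: ~${total}%"
  if [ "$total" -gt 50 ]; then
    echo "Over the 50% target - split this plan"
  fi
}

estimate_context 0 2 1   # 2 medium + 1 complex -> ~90%, split
```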
## The Atomic Commit Philosophy
**What we're optimizing for:** Beautiful git history where each commit is:
- Focused (2-3 related changes)
- Complete (fully implemented, tested)
- Documented (clear commit message)
- Reviewable (small enough to understand)
- Revertable (surgical rollback possible)
**Bad git history (large plans):**
```
feat(auth): Complete authentication system
- Added 16 files
- Modified 8 files
- 1200 lines changed
- Contains: models, API, UI, middleware, utilities
```
Impossible to review, hard to understand, can't revert without losing everything.
**Good git history (atomic plans):**
```
feat(auth-01): Add User and Session database models
- Added schema files
- Added migration
- 45 lines changed
feat(auth-02): Implement register and login API endpoints
- Added /api/auth/register
- Added /api/auth/login
- Added JWT utilities
- 120 lines changed
feat(auth-03): Add protected route middleware
- Added middleware/auth.ts
- Added tests
- 60 lines changed
feat(auth-04): Build login and registration forms
- Added LoginForm component
- Added RegisterForm component
- 90 lines changed
```
Each commit tells a story. Each is reviewable. Each is revertable. This is craftsmanship.
## Quality Assurance Through Scope Control
**The guarantee:** When you follow the 2-3 task rule with 50% context target:
1. **Consistency:** First task has same quality as last task
2. **Thoroughness:** No "I'll complete X concisely" degradation
3. **Documentation:** Full context budget for comments/tests
4. **Error handling:** Space for proper validation and edge cases
5. **Testing:** Room for comprehensive test coverage
**The cost:** More plans to manage.
**The benefit:** Consistent excellence. No rework. Clean history. Maintainable code.
**The trade-off is worth it.**
## Summary
**Old way (3-6 tasks, 80% target):**
- Tasks 1-2: Good
- Tasks 3-4: Degrading
- Tasks 5-6: Poor
- Git: Large, unreviewable commits
- Quality: Inconsistent
**New way (2-3 tasks, 50% target):**
- All tasks: Peak quality
- Git: Atomic, surgical commits
- Quality: Consistent excellence
- Autonomous plans: Subagent execution (fresh context)
**The principle:** Aggressive atomicity. More plans, smaller scope, consistent quality.
**The rule:** If in doubt, split. Quality over consolidation. Always.

View File

@@ -0,0 +1,72 @@
# User Gates Reference
User gates prevent Claude from charging ahead at critical decision points.
## Question Types
### AskUserQuestion Tool
Use for **structured choices** (2-4 options):
- Selecting from distinct approaches
- Domain/type selection
- When user needs to see options to decide
Examples:
- "What type of project?" (macos-app / iphone-app / web-app / other)
- "Research confidence is low. How to proceed?" (dig deeper / proceed anyway / pause)
- "Multiple valid approaches exist:" (Option A / Option B / Option C)
### Inline Questions
Use for **simple confirmations**:
- Yes/no decisions
- "Does this look right?"
- "Ready to proceed?"
Examples:
- "Here's the task breakdown: [list]. Does this look right?"
- "Proceed with this approach?"
- "I'll initialize a git repo. OK?"
## Decision Gate Loop
After gathering context, ALWAYS offer:
```
Ready to [action], or would you like me to ask more questions?
1. Proceed - I have enough context
2. Ask more questions - There are details to clarify
3. Let me add context - I want to provide additional information
```
Loop continues until user selects "Proceed".
## Mandatory Gate Points
| Location | Gate Type | Trigger |
|----------|-----------|---------|
| plan-phase | Inline | Confirm task breakdown |
| plan-phase | AskUserQuestion | Multiple valid approaches |
| plan-phase | AskUserQuestion | Decision gate before writing |
| research-phase | AskUserQuestion | Low confidence findings |
| research-phase | Inline | Open questions acknowledgment |
| execute-phase | Inline | Verification failure |
| execute-phase | Inline | Issues review before proceeding |
| execute-phase | AskUserQuestion | Previous phase had issues |
| create-brief | AskUserQuestion | Decision gate before writing |
| create-roadmap | Inline | Confirm phase breakdown |
| create-roadmap | AskUserQuestion | Decision gate before writing |
| handoff | Inline | Handoff acknowledgment |
## Good vs Bad Gating
### Good
- Gate before writing artifacts (not after)
- Gate when genuinely ambiguous
- Gate when issues affect next steps
- Quick inline for simple confirmations
### Bad
- Asking obvious choices ("Should I save the file?")
- Multiple gates for same decision
- AskUserQuestion for yes/no
- Gates after the fact

View File

@@ -0,0 +1,157 @@
# Brief Template
## Greenfield Brief (v1.0)
Copy and fill this structure for `.planning/BRIEF.md` when starting a new project:
```markdown
# [Project Name]
**One-liner**: [What this is in one sentence]
## Problem
[What problem does this solve? Why does it need to exist?
2-3 sentences max.]
## Success Criteria
How we know it worked:
- [ ] [Measurable outcome 1]
- [ ] [Measurable outcome 2]
- [ ] [Measurable outcome 3]
## Constraints
[Any hard constraints: tech stack, timeline, budget, dependencies]
- [Constraint 1]
- [Constraint 2]
## Out of Scope
What we're NOT building (prevents scope creep):
- [Not doing X]
- [Not doing Y]
```
<guidelines>
- Keep under 50 lines
- Success criteria must be measurable/verifiable
- Out of scope prevents "while we're at it" creep
- This is the ONLY human-focused document
</guidelines>
## Brownfield Brief (v1.1+)
After shipping v1.0, update BRIEF.md to include current state:
```markdown
# [Project Name]
## Current State (Updated: YYYY-MM-DD)
**Shipped:** v[X.Y] [Name] (YYYY-MM-DD)
**Status:** [Production / Beta / Internal / Live with users]
**Users:** [If known: "~500 downloads, 50 DAU" or "Internal use only" or "N/A"]
**Feedback:** [Key themes from user feedback, or "Initial release, gathering feedback"]
**Codebase:**
- [X,XXX] lines of [primary language]
- [Key tech stack: framework, platform, deployment target]
- [Notable dependencies or architecture]
**Known Issues:**
- [Issue 1 from v1.x that needs addressing]
- [Issue 2]
- [Or "None" if clean slate]
## v[Next] Goals
**Vision:** [What's the goal for this next iteration?]
**Motivation:**
- [Why this work matters now]
- [User feedback driving it]
- [Technical debt or improvements needed]
**Scope (v[X.Y]):**
- [Feature/improvement 1]
- [Feature/improvement 2]
- [Feature/improvement 3]
**Success Criteria:**
- [ ] [Measurable outcome 1]
- [ ] [Measurable outcome 2]
- [ ] [Measurable outcome 3]
**Out of Scope:**
- [Not doing X in this version]
- [Not doing Y in this version]
---
<details>
<summary>Original Vision (v1.0 - Archived for reference)</summary>
**One-liner**: [What this is in one sentence]
## Problem
[What problem does this solve? Why does it need to exist?]
## Success Criteria
How we know it worked:
- [x] [Outcome 1] - Achieved
- [x] [Outcome 2] - Achieved
- [x] [Outcome 3] - Achieved
## Constraints
- [Constraint 1]
- [Constraint 2]
## Out of Scope
- [Not doing X]
- [Not doing Y]
</details>
```
<brownfield_guidelines>
**When to update BRIEF:**
- After completing each milestone (v1.0 → v1.1 → v2.0)
- When starting new phases after a shipped version
- Use `complete-milestone.md` workflow to update systematically
**Current State captures:**
- What shipped (version, date)
- Real-world status (production, beta, etc.)
- User metrics (if applicable)
- User feedback themes
- Codebase stats (LOC, tech stack)
- Known issues needing attention
**Next Goals captures:**
- Vision for next version
- Why now (motivation)
- What's in scope
- What's measurable
- What's explicitly out
**Original Vision:**
- Collapsed in `<details>` tag
- Reference for "where we came from"
- Shows evolution of product thinking
- Checkboxes marked [x] for achieved goals
This structure makes all new plans brownfield-aware automatically because they read BRIEF and see:
- "v1.0 shipped"
- "2,450 lines of existing Swift code"
- "Users reporting X, requesting Y"
- Plans naturally reference existing files in @context
</brownfield_guidelines>

View File

@@ -0,0 +1,78 @@
# Continue-Here Template
Copy and fill this structure for `.planning/phases/XX-name/.continue-here.md`:
```yaml
---
phase: XX-name
task: 3
total_tasks: 7
status: in_progress
last_updated: 2025-01-15T14:30:00Z
---
```
```markdown
<current_state>
[Where exactly are we? What's the immediate context?]
</current_state>
<completed_work>
[What got done this session - be specific]
- Task 1: [name] - Done
- Task 2: [name] - Done
- Task 3: [name] - In progress, [what's done on it]
</completed_work>
<remaining_work>
[What's left in this phase]
- Task 3: [name] - [what's left to do]
- Task 4: [name] - Not started
- Task 5: [name] - Not started
</remaining_work>
<decisions_made>
[Key decisions and why - so next session doesn't re-debate]
- Decided to use [X] because [reason]
- Chose [approach] over [alternative] because [reason]
</decisions_made>
<blockers>
[Anything stuck or waiting on external factors]
- [Blocker 1]: [status/workaround]
</blockers>
<context>
[Mental state, "vibe", anything that helps resume smoothly]
[What were you thinking about? What was the plan?
This is the "pick up exactly where you left off" context.]
</context>
<next_action>
[The very first thing to do when resuming]
Start with: [specific action]
</next_action>
```
<yaml_fields>
Required YAML frontmatter:
- `phase`: Directory name (e.g., `02-authentication`)
- `task`: Current task number
- `total_tasks`: How many tasks in phase
- `status`: `in_progress`, `blocked`, `almost_done`
- `last_updated`: ISO timestamp
</yaml_fields>
<guidelines>
- Be specific enough that a fresh Claude instance understands immediately
- Include WHY decisions were made, not just what
- The `<next_action>` should be actionable without reading anything else
- This file gets DELETED after resume - it's not permanent storage
</guidelines>
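On resume, a fresh session can detect a pending handoff before reading anything else. A minimal sketch, assuming the `.continue-here.md` convention above:

```bash
# Load the most recent handoff, if any; otherwise fall back to the roadmap
handoff=$(ls -t .planning/phases/*/.continue-here.md 2>/dev/null | head -1)
if [ -n "$handoff" ]; then
  echo "Resuming from handoff: $handoff"
  cat "$handoff"   # act on <next_action>, then delete this file
else
  echo "No handoff found - start from ROADMAP.md"
fi
```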

View File

@@ -0,0 +1,91 @@
# ISSUES.md Template
This file is auto-created when Rule 5 (Log non-critical enhancements) is first triggered during execution.
Location: `.planning/ISSUES.md`
```markdown
# Project Issues Log
Non-critical enhancements discovered during execution. Address in future phases when appropriate.
## Open Enhancements
### ISS-001: [Brief description]
- **Discovered:** Phase [X] Plan [Y] Task [Z] (YYYY-MM-DD)
- **Type:** [Performance / Refactoring / UX / Testing / Documentation / Accessibility]
- **Description:** [What could be improved and why it would help]
- **Impact:** Low (works correctly, this would enhance)
- **Effort:** [Quick (<1hr) / Medium (1-4hr) / Substantial (>4hr)]
- **Suggested phase:** [Phase number where this makes sense, or "Future"]
### ISS-002: Add connection pooling for Redis
- **Discovered:** Phase 2 Plan 3 Task 6 (2025-11-23)
- **Type:** Performance
- **Description:** Redis client creates new connection per request. Connection pooling would reduce latency and handle connection failures better. Currently works but suboptimal under load.
- **Impact:** Low (works correctly, ~20ms overhead per request)
- **Effort:** Medium (2-3 hours - need to configure ioredis pool, test connection reuse)
- **Suggested phase:** Phase 5 (Performance optimization)
### ISS-003: Refactor UserService into smaller modules
- **Discovered:** Phase 1 Plan 2 Task 3 (2025-11-22)
- **Type:** Refactoring
- **Description:** UserService has grown to 400 lines with mixed concerns (auth, profile, settings). Would be cleaner as separate services (AuthService, ProfileService, SettingsService). Currently works but harder to test and reason about.
- **Impact:** Low (works correctly, just organizational)
- **Effort:** Substantial (4-6 hours - need to split, update imports, ensure no breakage)
- **Suggested phase:** Phase 7 (Code health milestone)
## Closed Enhancements
### ISS-XXX: [Brief description]
- **Status:** Resolved in Phase [X] Plan [Y] (YYYY-MM-DD)
- **Resolution:** [What was done]
- **Benefit:** [How it improved the codebase]
---
**Summary:** [X] open, [Y] closed
**Priority queue:** [List ISS numbers in priority order, or "Address as time permits"]
```
## Usage Guidelines
**When issues are added:**
- Auto-increment ISS numbers (ISS-001, ISS-002, etc.)
- Always include discovery context (Phase/Plan/Task and date)
- Be specific about impact and effort
- Suggested phase helps with roadmap planning
**When issues are resolved:**
- Move to "Closed Enhancements" section
- Document resolution and benefit
- Keeps history for reference
**Prioritization:**
- Quick wins (Quick effort, visible benefit) → Earlier phases
- Substantial refactors (Substantial effort, organizational benefit) → Dedicated "code health" phases
- Nice-to-haves (Low impact, high effort) → "Future" or never
**Integration with roadmap:**
- When planning new phases, scan ISSUES.md for relevant items
- Can create phases specifically for addressing accumulated issues
- Example: "Phase 8: Code Health - Address ISS-003, ISS-007, ISS-012"
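Auto-incrementing can be derived from the log itself rather than tracked separately. A minimal sketch, assuming the `ISS-NNN` numbering shown above:

```bash
# Compute the next ISS number from .planning/ISSUES.md (ISS-001 if none yet)
last=$(grep -oE 'ISS-[0-9]+' .planning/ISSUES.md 2>/dev/null | sort -V | tail -1)
last=${last:-ISS-000}
printf 'Next: ISS-%03d\n' $(( 10#${last#ISS-} + 1 ))   # 10# avoids octal on 008/009
```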
## Example: Issues Driving Phase Planning
```markdown
# Roadmap excerpt
### Phase 6: Performance Optimization (Planned)
**Milestone Goal:** Address performance issues discovered during v1.0 usage
**Includes:**
- ISS-002: Redis connection pooling (Medium effort)
- ISS-015: Database query optimization (Quick)
- ISS-021: Image lazy loading (Medium)
**Excludes ISS-003 (refactoring):** Saving for dedicated code health phase
```
This creates traceability: enhancement discovered → logged → planned → addressed → documented.

View File

@@ -0,0 +1,115 @@
# Milestone Entry Template
Add this entry to `.planning/MILESTONES.md` when completing a milestone:
```markdown
## v[X.Y] [Name] (Shipped: YYYY-MM-DD)
**Delivered:** [One sentence describing what shipped]
**Phases completed:** [X-Y] ([Z] plans total)
**Key accomplishments:**
- [Major achievement 1]
- [Major achievement 2]
- [Major achievement 3]
- [Major achievement 4]
**Stats:**
- [X] files created/modified
- [Y] lines of code (primary language)
- [Z] phases, [N] plans, [M] tasks
- [D] days from start to ship (or milestone to milestone)
**Git range:** `feat(XX-XX)` → `feat(YY-YY)`
**What's next:** [Brief description of next milestone goals, or "Project complete"]
---
```
<structure>
If MILESTONES.md doesn't exist, create it with header:
```markdown
# Project Milestones: [Project Name]
[Entries in reverse chronological order - newest first]
```
</structure>
<guidelines>
**When to create milestones:**
- Initial v1.0 MVP shipped
- Major version releases (v2.0, v3.0)
- Significant feature milestones (v1.1, v1.2)
- Before archiving planning (capture what was shipped)
**Don't create milestones for:**
- Individual phase completions (normal workflow)
- Work in progress (wait until shipped)
- Minor bug fixes that don't constitute a release
**Stats to include:**
- Count modified files: `git diff --stat feat(XX-XX)..feat(YY-YY) | tail -1`
- Count LOC: `find . -name "*.swift" -o -name "*.ts" | xargs wc -l` (or relevant extension)
- Phase/plan/task counts from ROADMAP
- Timeline from first phase commit to last phase commit
**Git range format:**
- First commit of milestone → last commit of milestone
- Example: `feat(01-01)` → `feat(04-01)` for phases 1-4
</guidelines>
<example>
```markdown
# Project Milestones: WeatherBar
## v1.1 Security & Polish (Shipped: 2025-12-10)
**Delivered:** Security hardening with Keychain integration and comprehensive error handling
**Phases completed:** 5-6 (3 plans total)
**Key accomplishments:**
- Migrated API key storage from plaintext to macOS Keychain
- Implemented comprehensive error handling for network failures
- Added Sentry crash reporting integration
- Fixed memory leak in auto-refresh timer
**Stats:**
- 23 files modified
- 650 lines of Swift added
- 2 phases, 3 plans, 12 tasks
- 8 days from v1.0 to v1.1
**Git range:** `feat(05-01)` → `feat(06-02)`
**What's next:** v2.0 SwiftUI redesign with widget support
---
## v1.0 MVP (Shipped: 2025-11-25)
**Delivered:** Menu bar weather app with current conditions and 3-day forecast
**Phases completed:** 1-4 (7 plans total)
**Key accomplishments:**
- Menu bar app with popover UI (AppKit)
- OpenWeather API integration with auto-refresh
- Current weather display with conditions icon
- 3-day forecast list with high/low temperatures
- Code signed and notarized for distribution
**Stats:**
- 47 files created
- 2,450 lines of Swift
- 4 phases, 7 plans, 28 tasks
- 12 days from start to ship
**Git range:** `feat(01-01)` → `feat(04-01)`
**What's next:** Security audit and hardening for v1.1
```
</example>

View File

@@ -0,0 +1,233 @@
# Phase Prompt Template
Copy and fill this structure for `.planning/phases/XX-name/{phase}-{plan}-PLAN.md`:
**Naming:** Use `{phase}-{plan}-PLAN.md` format (e.g., `01-02-PLAN.md` for Phase 1, Plan 2)
```markdown
---
phase: XX-name
type: execute
domain: [optional - if domain skill loaded]
---
<objective>
[What this phase accomplishes - from roadmap phase goal]
Purpose: [Why this matters for the project]
Output: [What artifacts will be created]
</objective>
<execution_context>
@~/.claude/skills/create-plans/workflows/execute-phase.md
@~/.claude/skills/create-plans/templates/summary.md
[If plan contains checkpoint tasks (type="checkpoint:*"), add:]
@~/.claude/skills/create-plans/references/checkpoints.md
</execution_context>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
[If research exists:]
@.planning/phases/XX-name/FINDINGS.md
[Relevant source files:]
@src/path/to/relevant.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: [Action-oriented name]</name>
<files>path/to/file.ext, another/file.ext</files>
<action>[Specific implementation - what to do, how to do it, what to avoid and WHY]</action>
<verify>[Command or check to prove it worked]</verify>
<done>[Measurable acceptance criteria]</done>
</task>
<task type="auto">
<name>Task 2: [Action-oriented name]</name>
<files>path/to/file.ext</files>
<action>[Specific implementation]</action>
<verify>[Command or check]</verify>
<done>[Acceptance criteria]</done>
</task>
<task type="checkpoint:decision" gate="blocking">
<decision>[What needs deciding]</decision>
<context>[Why this decision matters]</context>
<options>
<option id="option-a">
<name>[Option name]</name>
<pros>[Benefits and advantages]</pros>
<cons>[Tradeoffs and limitations]</cons>
</option>
<option id="option-b">
<name>[Option name]</name>
<pros>[Benefits and advantages]</pros>
<cons>[Tradeoffs and limitations]</cons>
</option>
</options>
<resume-signal>[How to indicate choice - "Select: option-a or option-b"]</resume-signal>
</task>
<task type="auto">
<name>Task 3: [Action-oriented name]</name>
<files>path/to/file.ext</files>
<action>[Specific implementation]</action>
<verify>[Command or check]</verify>
<done>[Acceptance criteria]</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[What Claude just built that needs verification]</what-built>
<how-to-verify>
1. Run: [command to start dev server/app]
2. Visit: [URL to check]
3. Test: [Specific interactions]
4. Confirm: [Expected behaviors]
</how-to-verify>
<resume-signal>Type "approved" to continue, or describe issues to fix</resume-signal>
</task>
[Continue for all tasks - mix of auto and checkpoints as needed...]
</tasks>
<verification>
Before declaring phase complete:
- [ ] [Specific test command]
- [ ] [Build/type check passes]
- [ ] [Behavior verification]
</verification>
<success_criteria>
- All tasks completed
- All verification checks pass
- No errors or warnings introduced
- [Phase-specific criteria]
</success_criteria>
<output>
After completion, create `.planning/phases/XX-name/{phase}-{plan}-SUMMARY.md`:
# Phase [X] Plan [Y]: [Name] Summary
**[Substantive one-liner - what shipped, not "phase complete"]**
## Accomplishments
- [Key outcome 1]
- [Key outcome 2]
## Files Created/Modified
- `path/to/file.ts` - Description
- `path/to/another.ts` - Description
## Decisions Made
[Key decisions and rationale, or "None"]
## Issues Encountered
[Problems and resolutions, or "None"]
## Next Step
[If more plans in this phase: "Ready for {phase}-{next-plan}-PLAN.md"]
[If phase complete: "Phase complete, ready for next phase"]
</output>
```
<key_elements>
From create-meta-prompts patterns:
- XML structure for Claude parsing
- @context references for file loading
- Task types: auto, checkpoint:human-action, checkpoint:human-verify, checkpoint:decision
- Action includes "what to avoid and WHY" (from intelligence-rules)
- Verification is specific and executable
- Success criteria is measurable
- Output specification includes SUMMARY.md structure
**Scope guidance:**
- Aim for 2-3 tasks per plan
- If planning more than 3 tasks, split into multiple plans (01-01, 01-02, etc.)
- Target ~50% context usage maximum
- See references/scope-estimation.md for splitting guidance
</key_elements>
<good_examples>
```markdown
---
phase: 01-foundation
type: execute
domain: next-js
---
<objective>
Set up Next.js project with authentication foundation.
Purpose: Establish the core structure and auth patterns all features depend on.
Output: Working Next.js app with JWT auth, protected routes, and user model.
</objective>
<execution_context>
@~/.claude/skills/create-plans/workflows/execute-phase.md
@~/.claude/skills/create-plans/templates/summary.md
</execution_context>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
@src/lib/db.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Add User model to database schema</name>
<files>prisma/schema.prisma</files>
<action>Add User model with fields: id (cuid), email (unique), passwordHash, createdAt, updatedAt. Add Session relation. Use @db.VarChar(255) for email to prevent index issues.</action>
<verify>npx prisma validate passes, npx prisma generate succeeds</verify>
<done>Schema valid, types generated, no errors</done>
</task>
<task type="auto">
<name>Task 2: Create login API endpoint</name>
<files>src/app/api/auth/login/route.ts</files>
<action>POST endpoint that accepts {email, password}, validates against User table using bcrypt, returns JWT in httpOnly cookie with 15-min expiry. Use jose library for JWT (not jsonwebtoken - it has CommonJS issues with Next.js).</action>
<verify>curl -X POST /api/auth/login -d '{"email":"test@test.com","password":"test"}' -H "Content-Type: application/json" returns 200 with Set-Cookie header</verify>
<done>Valid credentials return 200 + cookie, invalid return 401, missing fields return 400</done>
</task>
</tasks>
<verification>
Before declaring phase complete:
- [ ] `npm run build` succeeds without errors
- [ ] `npx prisma validate` passes
- [ ] Login endpoint responds correctly to valid/invalid credentials
- [ ] Protected route redirects unauthenticated users
</verification>
<success_criteria>
- All tasks completed
- All verification checks pass
- No TypeScript errors
- JWT auth flow works end-to-end
</success_criteria>
<output>
After completion, create `.planning/phases/01-foundation/01-01-SUMMARY.md`
</output>
```
</good_examples>
<bad_examples>
```markdown
# Phase 1: Foundation
## Tasks
### Task 1: Set up authentication
**Action**: Add auth to the app
**Done when**: Users can log in
```
This is useless. No XML structure, no @context, no verification, no specificity.
</bad_examples>

View File

@@ -0,0 +1,274 @@
# Research Prompt Template
For phases requiring research before planning:
```markdown
---
phase: XX-name
type: research
topic: [research-topic]
---
<session_initialization>
Before beginning research, verify today's date:
!`date +%Y-%m-%d`
Use this date when searching for "current" or "latest" information.
Example: If today is 2025-11-22, search for "2025" not "2024".
</session_initialization>
<research_objective>
Research [topic] to inform [phase name] implementation.
Purpose: [What decision/implementation this enables]
Scope: [Boundaries]
Output: FINDINGS.md with structured recommendations
</research_objective>
<research_scope>
<include>
- [Question to answer]
- [Area to investigate]
- [Specific comparison if needed]
</include>
<exclude>
- [Out of scope for this research]
- [Defer to implementation phase]
</exclude>
<sources>
Official documentation (with exact URLs when known):
- https://example.com/official-docs
- https://example.com/api-reference
Search queries for WebSearch:
- "[topic] best practices {current_year}"
- "[topic] latest version"
Context7 MCP for library docs
Prefer current/recent sources (check date above)
</sources>
</research_scope>
<verification_checklist>
{If researching configuration/architecture with known components:}
□ Enumerate ALL known options/scopes (list them explicitly):
□ Option/Scope 1: [description]
□ Option/Scope 2: [description]
□ Option/Scope 3: [description]
□ Document exact file locations/URLs for each option
□ Verify precedence/hierarchy rules if applicable
□ Check for recent updates or changes to documentation
{For all research:}
□ Verify negative claims ("X is not possible") with official docs
□ Confirm all primary claims have authoritative sources
□ Check both current docs AND recent updates/changelogs
□ Test multiple search queries to avoid missing information
□ Check for environment/tool-specific variations
</verification_checklist>
<research_quality_assurance>
Before completing research, perform these checks:
<completeness_check>
- [ ] All enumerated options/components documented with evidence
- [ ] Official documentation cited for critical claims
- [ ] Contradictory information resolved or flagged
</completeness_check>
<blind_spots_review>
Ask yourself: "What might I have missed?"
- [ ] Are there configuration/implementation options I didn't investigate?
- [ ] Did I check for multiple environments/contexts?
- [ ] Did I verify claims that seem definitive ("cannot", "only", "must")?
- [ ] Did I look for recent changes or updates to documentation?
</blind_spots_review>
<critical_claims_audit>
For any statement like "X is not possible" or "Y is the only way":
- [ ] Is this verified by official documentation?
- [ ] Have I checked for recent updates that might change this?
- [ ] Are there alternative approaches I haven't considered?
</critical_claims_audit>
</research_quality_assurance>
<incremental_output>
**CRITICAL: Write findings incrementally to prevent token limit failures**
Instead of generating full FINDINGS.md at the end:
1. Create FINDINGS.md with structure skeleton
2. Write each finding as you discover it (append immediately)
3. Add code examples as found (append immediately)
4. Finalize summary and metadata at end
This ensures zero lost work if token limits are hit.
<workflow>
Step 1 - Initialize:
```bash
# Create skeleton file
cat > .planning/phases/XX-name/FINDINGS.md <<'EOF'
# [Topic] Research Findings
## Summary
[Will complete at end]
## Recommendations
[Will complete at end]
## Key Findings
[Append findings here as discovered]
## Code Examples
[Append examples here as found]
## Metadata
[Will complete at end]
EOF
```
Step 2 - Append findings as discovered:
After researching each aspect, immediately append to Key Findings section
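For example, a minimal append sketch (placeholders illustrative):
```bash
cat >> .planning/phases/XX-name/FINDINGS.md <<'EOF'
### [Category]
- [Finding] (source: [URL])
EOF
```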
Step 3 - Finalize at end:
Complete Summary, Recommendations, and Metadata sections
</workflow>
</incremental_output>
<output_structure>
Create `.planning/phases/XX-name/FINDINGS.md`:
# [Topic] Research Findings
## Summary
[2-3 paragraph executive summary]
## Recommendations
### Primary Recommendation
[What to do and why]
### Alternatives Considered
[What else was evaluated]
## Key Findings
### [Category 1]
- Finding with source URL
- Relevance to our case
### [Category 2]
- Finding with source URL
- Relevance
## Code Examples
[Relevant patterns, if applicable]
## Metadata
<metadata>
<confidence level="high|medium|low">
[Why this confidence level]
</confidence>
<dependencies>
[What's needed to proceed]
</dependencies>
<open_questions>
[What couldn't be determined]
</open_questions>
<assumptions>
[What was assumed]
</assumptions>
<quality_report>
<sources_consulted>
[List URLs of official documentation and primary sources]
</sources_consulted>
<claims_verified>
[Key findings verified with official sources]
</claims_verified>
<claims_assumed>
[Findings based on inference or incomplete information]
</claims_assumed>
<confidence_by_finding>
- Finding 1: High (official docs + multiple sources)
- Finding 2: Medium (single source)
- Finding 3: Low (inferred, requires verification)
</confidence_by_finding>
</quality_report>
</metadata>
</output_structure>
<success_criteria>
- All scope questions answered
- All verification checklist items completed
- Sources are current and authoritative
- Clear primary recommendation
- Metadata captures uncertainties
- Quality report distinguishes verified from assumed
- Ready to inform PLAN.md creation
</success_criteria>
```
<when_to_use>
Create RESEARCH.md before PLAN.md when:
- Technology choice unclear
- Best practices needed for unfamiliar domain
- API/library investigation required
- Architecture decision pending
- Multiple valid approaches exist
</when_to_use>
<example>
```markdown
---
phase: 02-auth
type: research
topic: JWT library selection for Next.js App Router
---
<research_objective>
Research JWT libraries to determine best option for Next.js 14 App Router authentication.
Purpose: Select JWT library before implementing auth endpoints
Scope: Compare jose, jsonwebtoken, and @auth/core for our use case
Output: FINDINGS.md with library recommendation
</research_objective>
<research_scope>
<include>
- ESM/CommonJS compatibility with Next.js 14
- Edge runtime support
- Token creation and validation patterns
- Community adoption and maintenance
</include>
<exclude>
- Full auth framework comparison (NextAuth vs custom)
- OAuth provider configuration
- Session storage strategies
</exclude>
<sources>
Official documentation (prioritize):
- https://github.com/panva/jose
- https://github.com/auth0/node-jsonwebtoken
Context7 MCP for library docs
Prefer current/recent sources
</sources>
</research_scope>
<success_criteria>
- Clear recommendation with rationale
- Code examples for selected library
- Known limitations documented
- Verification checklist completed
</success_criteria>
```
</example>

View File

@@ -0,0 +1,200 @@
# Roadmap Template
Copy and fill this structure for `.planning/ROADMAP.md`:
## Initial Roadmap (v1.0 Greenfield)
```markdown
# Roadmap: [Project Name]
## Overview
[One paragraph describing the journey from start to finish]
## Phases
- [ ] **Phase 1: [Name]** - [One-line description]
- [ ] **Phase 2: [Name]** - [One-line description]
- [ ] **Phase 3: [Name]** - [One-line description]
- [ ] **Phase 4: [Name]** - [One-line description]
## Phase Details
### Phase 1: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Nothing (first phase)
**Plans**: [Number of plans, e.g., "3 plans" or "TBD after research"]
Plans:
- [ ] 01-01: [Brief description of first plan]
- [ ] 01-02: [Brief description of second plan]
- [ ] 01-03: [Brief description of third plan]
### Phase 2: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 1
**Plans**: [Number of plans]
Plans:
- [ ] 02-01: [Brief description]
### Phase 3: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 2
**Plans**: [Number of plans]
Plans:
- [ ] 03-01: [Brief description]
- [ ] 03-02: [Brief description]
### Phase 4: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 3
**Plans**: [Number of plans]
Plans:
- [ ] 04-01: [Brief description]
## Progress
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. [Name] | 0/3 | Not started | - |
| 2. [Name] | 0/1 | Not started | - |
| 3. [Name] | 0/2 | Not started | - |
| 4. [Name] | 0/1 | Not started | - |
```
<guidelines>
**Initial planning (v1.0):**
- 3-6 phases total (more = scope creep)
- Each phase delivers something coherent
- Phases can have 1+ plans (split if >3 tasks or multiple subsystems)
- Plans use naming: {phase}-{plan}-PLAN.md (e.g., 01-02-PLAN.md)
- No time estimates (this isn't enterprise PM)
- Progress table updated by transition workflow
- Plan count can be "TBD" initially, refined during planning
**After milestones ship:**
- Reorganize with milestone groupings (see below)
- Collapse completed milestones in `<details>` tags
- Add new milestone sections for upcoming work
- Keep continuous phase numbering (never restart at 01)
</guidelines>
<status_values>
- `Not started` - Haven't begun
- `In progress` - Currently working
- `Complete` - Done (add completion date)
- `Deferred` - Pushed to later (with reason)
</status_values>
## Milestone-Grouped Roadmap (After v1.0 Ships)
After completing first milestone, reorganize roadmap with milestone groupings:
```markdown
# Roadmap: [Project Name]
## Milestones
-**v1.0 MVP** - Phases 1-4 (shipped YYYY-MM-DD)
- 🚧 **v1.1 [Name]** - Phases 5-6 (in progress)
- 📋 **v2.0 [Name]** - Phases 7-10 (planned)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED YYYY-MM-DD</summary>
### Phase 1: [Name]
**Goal**: [What this phase delivers]
**Plans**: 3 plans
Plans:
- [x] 01-01: [Brief description]
- [x] 01-02: [Brief description]
- [x] 01-03: [Brief description]
### Phase 2: [Name]
**Goal**: [What this phase delivers]
**Plans**: 2 plans
Plans:
- [x] 02-01: [Brief description]
- [x] 02-02: [Brief description]
### Phase 3: [Name]
**Goal**: [What this phase delivers]
**Plans**: 2 plans
Plans:
- [x] 03-01: [Brief description]
- [x] 03-02: [Brief description]
### Phase 4: [Name]
**Goal**: [What this phase delivers]
**Plans**: 1 plan
Plans:
- [x] 04-01: [Brief description]
</details>
### 🚧 v1.1 [Name] (In Progress)
**Milestone Goal:** [What v1.1 delivers]
#### Phase 5: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 4
**Plans**: 1 plan
Plans:
- [ ] 05-01: [Brief description]
#### Phase 6: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 5
**Plans**: 2 plans
Plans:
- [ ] 06-01: [Brief description]
- [ ] 06-02: [Brief description]
### 📋 v2.0 [Name] (Planned)
**Milestone Goal:** [What v2.0 delivers]
#### Phase 7: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 6
**Plans**: 3 plans
Plans:
- [ ] 07-01: [Brief description]
- [ ] 07-02: [Brief description]
- [ ] 07-03: [Brief description]
[... additional phases for v2.0 ...]
## Progress
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 1. Foundation | v1.0 | 3/3 | Complete | YYYY-MM-DD |
| 2. Features | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 3. Polish | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 4. Launch | v1.0 | 1/1 | Complete | YYYY-MM-DD |
| 5. Security | v1.1 | 0/1 | Not started | - |
| 6. Hardening | v1.1 | 0/2 | Not started | - |
| 7. Redesign Core | v2.0 | 0/3 | Not started | - |
```
**Notes:**
- Milestone emoji: ✅ shipped, 🚧 in progress, 📋 planned
- Completed milestones collapsed in `<details>` for readability
- Current/future milestones expanded
- Continuous phase numbering (01-99)
- Progress table includes milestone column

View File

@@ -0,0 +1,148 @@
# Summary Template
Standardize SUMMARY.md format for phase completion:
```markdown
# Phase [X]: [Name] Summary
**[Substantive one-liner describing outcome - NOT "phase complete" or "implementation finished"]**
## Accomplishments
- [Most important outcome]
- [Second key accomplishment]
- [Third if applicable]
## Files Created/Modified
- `path/to/file.ts` - What it does
- `path/to/another.ts` - What it does
## Decisions Made
[Key decisions with brief rationale, or "None - followed plan as specified"]
## Deviations from Plan
[If no deviations: "None - plan executed exactly as written"]
[If deviations occurred:]
### Auto-fixed Issues
**1. [Rule X - Category] Brief description**
- **Found during:** Task [N] ([task name])
- **Issue:** [What was wrong]
- **Fix:** [What was done]
- **Files modified:** [file paths]
- **Verification:** [How it was verified]
- **Commit:** [hash]
[... repeat for each auto-fix ...]
### Deferred Enhancements
Logged to .planning/ISSUES.md for future consideration:
- ISS-XXX: [Brief description] (discovered in Task [N])
- ISS-XXX: [Brief description] (discovered in Task [N])
---
**Total deviations:** [N] auto-fixed ([breakdown by rule]), [N] deferred
**Impact on plan:** [Brief assessment - e.g., "All auto-fixes necessary for correctness/security. No scope creep."]
## Issues Encountered
[Problems and how they were resolved, or "None"]
[Note: "Deviations from Plan" documents unplanned work that was handled automatically via deviation rules. "Issues Encountered" documents problems during planned work that required problem-solving.]
## Next Phase Readiness
[What's ready for next phase]
[Any blockers or concerns]
---
*Phase: XX-name*
*Completed: [date]*
```
<one_liner_rules>
The one-liner MUST be substantive:
**Good:**
- "JWT auth with refresh rotation using jose library"
- "Prisma schema with User, Session, and Product models"
- "Dashboard with real-time metrics via Server-Sent Events"
**Bad:**
- "Phase complete"
- "Authentication implemented"
- "Foundation finished"
- "All tasks done"
The one-liner should tell someone what actually shipped.
</one_liner_rules>
<example>
```markdown
# Phase 1: Foundation Summary
**JWT auth with refresh rotation using jose library, Prisma User model, and protected API middleware**
## Accomplishments
- User model with email/password auth
- Login/logout endpoints with httpOnly JWT cookies
- Protected route middleware checking token validity
- Refresh token rotation on each request
## Files Created/Modified
- `prisma/schema.prisma` - User and Session models
- `src/app/api/auth/login/route.ts` - Login endpoint
- `src/app/api/auth/logout/route.ts` - Logout endpoint
- `src/middleware.ts` - Protected route checks
- `src/lib/auth.ts` - JWT helpers using jose
## Decisions Made
- Used jose instead of jsonwebtoken (ESM-native, Edge-compatible)
- 15-min access tokens with 7-day refresh tokens
- Storing refresh tokens in database for revocation capability
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 2 - Missing Critical] Added password hashing with bcrypt**
- **Found during:** Task 2 (Login endpoint implementation)
- **Issue:** Plan didn't specify password hashing - storing plaintext would be critical security flaw
- **Fix:** Added bcrypt hashing on registration, comparison on login with salt rounds 10
- **Files modified:** src/app/api/auth/login/route.ts, src/lib/auth.ts
- **Verification:** Password hash test passes, plaintext never stored
- **Commit:** abc123f
**2. [Rule 3 - Blocking] Installed missing jose dependency**
- **Found during:** Task 4 (JWT token generation)
- **Issue:** jose package not in package.json, import failing
- **Fix:** Ran `npm install jose`
- **Files modified:** package.json, package-lock.json
- **Verification:** Import succeeds, build passes
- **Commit:** def456g
### Deferred Enhancements
Logged to .planning/ISSUES.md for future consideration:
- ISS-001: Add rate limiting to login endpoint (discovered in Task 2)
- ISS-002: Improve token refresh UX with auto-retry on 401 (discovered in Task 5)
---
**Total deviations:** 2 auto-fixed (1 missing critical, 1 blocking), 2 deferred
**Impact on plan:** Both auto-fixes essential for security and functionality. No scope creep.
## Issues Encountered
- jsonwebtoken CommonJS import failed in Edge runtime - switched to jose (planned library change, worked as expected)
## Next Phase Readiness
- Auth foundation complete, ready for feature development
- User registration endpoint needed before public launch
---
*Phase: 01-foundation*
*Completed: 2025-01-15*
```
</example>

View File

@@ -0,0 +1,366 @@
# Workflow: Complete Milestone
<required_reading>
**Read these files NOW:**
1. templates/milestone.md
2. `.planning/ROADMAP.md`
3. `.planning/BRIEF.md`
</required_reading>
<purpose>
Mark a shipped version (v1.0, v1.1, v2.0) as complete. This creates a historical record in MILESTONES.md, updates BRIEF.md with current state, reorganizes ROADMAP.md with milestone groupings, and tags the release in git.
This is the ritual that separates "development" from "shipped."
</purpose>
<process>
<step name="verify_readiness">
Check if milestone is truly complete:
```bash
cat .planning/ROADMAP.md
ls .planning/phases/*/*-SUMMARY.md 2>/dev/null | wc -l
```
**Questions to ask:**
- Which phases belong to this milestone?
- Are all those phases complete (all plans have summaries)?
- Has the work been tested/validated?
- Is this ready to ship/tag?
Present:
```
Milestone: [Name from user, e.g., "v1.0 MVP"]
Appears to include:
- Phase 1: Foundation (2/2 plans complete)
- Phase 2: Authentication (2/2 plans complete)
- Phase 3: Core Features (3/3 plans complete)
- Phase 4: Polish (1/1 plan complete)
Total: 4 phases, 8 plans, all complete
Ready to mark this milestone as shipped?
(yes / wait / adjust scope)
```
Wait for confirmation.
If "adjust scope": Ask which phases should be included.
If "wait": Stop, user will return when ready.
</step>
<step name="gather_stats">
Calculate milestone statistics:
```bash
# Count phases and plans in milestone
# (user specified or detected from roadmap)
# Find git range
git log --oneline --grep="feat(" | head -20
# Count files modified in range
git diff --stat FIRST_COMMIT..LAST_COMMIT | tail -1
# Count LOC (adapt to language)
find . -name "*.swift" -o -name "*.ts" -o -name "*.py" | xargs wc -l 2>/dev/null
# Calculate timeline
git log -1 --format="%ai" FIRST_COMMIT # Start date
git log -1 --format="%ai" LAST_COMMIT # End date
```
Present summary:
```
Milestone Stats:
- Phases: [X-Y]
- Plans: [Z] total
- Tasks: [N] total (estimated from phase summaries)
- Files modified: [M]
- Lines of code: [LOC] [language]
- Timeline: [Days] days ([Start] → [End])
- Git range: feat(XX-XX) → feat(YY-YY)
```
Confirm before proceeding.
</step>
<step name="extract_accomplishments">
Read all phase SUMMARY.md files in milestone range:
```bash
cat .planning/phases/01-*/01-*-SUMMARY.md
cat .planning/phases/02-*/02-*-SUMMARY.md
# ... for each phase in milestone
```
From summaries, extract 4-6 key accomplishments.
Present:
```
Key accomplishments for this milestone:
1. [Achievement from phase 1]
2. [Achievement from phase 2]
3. [Achievement from phase 3]
4. [Achievement from phase 4]
5. [Achievement from phase 5]
Does this capture the milestone? (yes / adjust)
```
If "adjust": User can add/remove/edit accomplishments.
</step>
<step name="create_milestone_entry">
Create or update `.planning/MILESTONES.md`.
If file doesn't exist:
```markdown
# Project Milestones: [Project Name from BRIEF]
[New entry]
```
If exists, prepend new entry (reverse chronological order).
Use template from `templates/milestone.md`:
```markdown
## v[Version] [Name] (Shipped: YYYY-MM-DD)
**Delivered:** [One sentence from user]
**Phases completed:** [X-Y] ([Z] plans total)
**Key accomplishments:**
- [List from previous step]
**Stats:**
- [Files] files created/modified
- [LOC] lines of [language]
- [Phases] phases, [Plans] plans, [Tasks] tasks
- [Days] days from [start milestone or start project] to ship
**Git range:** `feat(XX-XX)` → `feat(YY-YY)`
**What's next:** [Ask user: what's the next goal?]
---
```
Confirm entry looks correct.
</step>
<step name="update_brief">
Update `.planning/BRIEF.md` to reflect current state.
Add/update "Current State" section at top (after YAML if present):
```markdown
# Project Brief: [Name]
## Current State (Updated: YYYY-MM-DD)
**Shipped:** v[X.Y] [Name] (YYYY-MM-DD)
**Status:** [Production / Beta / Internal]
**Users:** [If known, e.g., "~500 downloads, 50 DAU" or "Internal use only"]
**Feedback:** [Key themes from users, or "Initial release, gathering feedback"]
**Codebase:** [LOC] [language], [key tech stack], [platform/deployment target]
## [Next Milestone] Goals
**Vision:** [What's the goal for next version?]
**Motivation:**
- [Why this next work matters]
- [User feedback driving it]
- [Technical debt or improvements needed]
**Scope (v[X.Y]):**
- [Feature/improvement 1]
- [Feature/improvement 2]
- [Feature/improvement 3]
---
<details>
<summary>Original Vision (v1.0 - Archived for reference)</summary>
[Move original brief content here]
</details>
```
**If this is v1.0 (first milestone):**
Just add "Current State" section, no need to archive original vision yet.
**If this is v1.1+:**
Collapse previous version's content into `<details>` section.
Show diff, confirm changes.
</step>
<step name="reorganize_roadmap">
Update `.planning/ROADMAP.md` to group completed milestone phases.
Add milestone headers and collapse completed work:
```markdown
# Roadmap: [Project Name]
## Milestones
-**v1.0 MVP** - Phases 1-4 (shipped YYYY-MM-DD)
- 🚧 **v1.1 Security** - Phases 5-6 (in progress)
- 📋 **v2.0 Redesign** - Phases 7-10 (planned)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED YYYY-MM-DD</summary>
- [x] Phase 1: Foundation (2/2 plans) - completed YYYY-MM-DD
- [x] Phase 2: Authentication (2/2 plans) - completed YYYY-MM-DD
- [x] Phase 3: Core Features (3/3 plans) - completed YYYY-MM-DD
- [x] Phase 4: Polish (1/1 plan) - completed YYYY-MM-DD
</details>
### 🚧 v[Next] [Name] (In Progress / Planned)
- [ ] Phase 5: [Name] ([N] plans)
- [ ] Phase 6: [Name] ([N] plans)
## Progress
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 1. Foundation | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 2. Authentication | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 3. Core Features | v1.0 | 3/3 | Complete | YYYY-MM-DD |
| 4. Polish | v1.0 | 1/1 | Complete | YYYY-MM-DD |
| 5. Security Audit | v1.1 | 0/1 | Not started | - |
| 6. Hardening | v1.1 | 0/2 | Not started | - |
```
Show diff, confirm changes.
</step>
<step name="git_tag">
Create git tag for milestone:
```bash
git tag -a v[X.Y] -m "$(cat <<'EOF'
v[X.Y] [Name]
Delivered: [One sentence]
Key accomplishments:
- [Item 1]
- [Item 2]
- [Item 3]
See .planning/MILESTONES.md for full details.
EOF
)"
```
Confirm: "Tagged: v[X.Y]"
Ask: "Push tag to remote? (y/n)"
If yes:
```bash
git push origin v[X.Y]
```
</step>
<step name="git_commit_milestone">
Commit milestone completion (MILESTONES.md + BRIEF.md + ROADMAP.md updates):
```bash
git add .planning/MILESTONES.md
git add .planning/BRIEF.md
git add .planning/ROADMAP.md
git commit -m "$(cat <<'EOF'
chore: milestone v[X.Y] [Name] shipped
- Added MILESTONES.md entry
- Updated BRIEF.md current state
- Reorganized ROADMAP.md with milestone grouping
- Tagged v[X.Y]
EOF
)"
```
Confirm: "Committed: chore: milestone v[X.Y] shipped"
</step>
<step name="offer_next">
```
✅ Milestone v[X.Y] [Name] complete
Shipped:
- [N] phases ([M] plans, [P] tasks)
- [One sentence of what shipped]
Summary: .planning/MILESTONES.md
Tag: v[X.Y]
Next steps:
1. Plan next milestone work (add phases to roadmap)
2. Archive and start fresh (for major rewrite/new codebase)
3. Take a break (done for now)
```
Wait for user decision.
If "1": Route to workflows/plan-phase.md (but ask about milestone scope first)
If "2": Route to workflows/archive-planning.md (to be created)
</step>
</process>
<milestone_naming>
**Version conventions:**
- **v1.0** - Initial MVP
- **v1.1, v1.2, v1.3** - Minor updates, new features, fixes
- **v2.0, v3.0** - Major rewrites, breaking changes, significant new direction
**Name conventions:**
- v1.0 MVP
- v1.1 Security
- v1.2 Performance
- v2.0 Redesign
- v2.0 iOS Launch
Keep names short (1-2 words describing the focus).
</milestone_naming>
<what_qualifies>
**Create milestones for:**
- Initial release (v1.0)
- Public releases
- Major feature sets shipped
- Before archiving planning
**Don't create milestones for:**
- Every phase completion (too granular)
- Work in progress (wait until shipped)
- Internal dev iterations (unless truly shipped internally)
If uncertain, ask: "Is this deployed/usable/shipped in some form?"
If yes → milestone. If no → keep working.
</what_qualifies>
<success_criteria>
Milestone completion is successful when:
- [ ] MILESTONES.md entry created with stats and accomplishments
- [ ] BRIEF.md updated with current state
- [ ] ROADMAP.md reorganized with milestone grouping
- [ ] Git tag created (v[X.Y])
- [ ] Milestone commit made
- [ ] User knows next steps
</success_criteria>

View File

@@ -0,0 +1,95 @@
# Workflow: Create Brief
<required_reading>
**Read these files NOW:**
1. templates/brief.md
</required_reading>
<purpose>
Create a project vision document that captures what we're building and why.
This is the ONLY human-focused document - everything else is for Claude.
</purpose>
<process>
<step name="gather_vision">
Ask the user (conversationally, not AskUserQuestion):
1. **What are we building?** (one sentence)
2. **Why does this need to exist?** (the problem it solves)
3. **What does success look like?** (how we know it worked)
4. **Any constraints?** (tech stack, timeline, budget, etc.)
Keep it conversational. Don't ask all at once - let it flow naturally.
</step>
<step name="decision_gate">
After gathering context:
Use AskUserQuestion:
- header: "Ready"
- question: "Ready to create the brief, or would you like me to ask more questions?"
- options:
- "Create brief" - I have enough context
- "Ask more questions" - There are details to clarify
- "Let me add context" - I want to provide more information
Loop until "Create brief" selected.
</step>
<step name="create_structure">
Create the planning directory:
```bash
mkdir -p .planning
```
</step>
<step name="write_brief">
Use the template from `templates/brief.md`.
Write to `.planning/BRIEF.md` with:
- Project name
- One-line description
- Problem statement (why this exists)
- Success criteria (measurable outcomes)
- Constraints (if any)
- Out of scope (what we're NOT building)
**Keep it SHORT.** Under 50 lines. This is a reference, not a novel.
</step>
<step name="offer_next">
After creating brief, present options:
```
Brief created: .planning/BRIEF.md
NOTE: Brief is NOT committed yet. It will be committed with the roadmap as project initialization.
What's next?
1. Create roadmap now (recommended - commits brief + roadmap together)
2. Review/edit brief
3. Done for now (brief will remain uncommitted)
```
</step>
</process>
<anti_patterns>
- Don't write a business plan
- Don't include market analysis
- Don't add stakeholder sections
- Don't create executive summaries
- Don't add timelines (that's roadmap's job)
Keep it focused: What, Why, Success, Constraints.
</anti_patterns>
<success_criteria>
Brief is complete when:
- [ ] `.planning/BRIEF.md` exists
- [ ] Contains: name, description, problem, success criteria
- [ ] Under 50 lines
- [ ] User knows what's next
</success_criteria>

View File

@@ -0,0 +1,158 @@
# Workflow: Create Roadmap
<required_reading>
**Read these files NOW:**
1. templates/roadmap.md
2. Read `.planning/BRIEF.md` if it exists
</required_reading>
<purpose>
Define the phases of implementation. Each phase is a coherent chunk of work
that delivers value. The roadmap provides structure, not detailed tasks.
</purpose>
<process>
<step name="check_brief">
```bash
cat .planning/BRIEF.md 2>/dev/null || echo "No brief found"
```
**If no brief exists:**
Ask: "No brief found. Want to create one first, or proceed with roadmap?"
If proceeding without brief, gather quick context:
- What are we building?
- What's the rough scope?
</step>
<step name="identify_phases">
Based on the brief/context, identify 3-6 phases.
Good phases are:
- **Coherent**: Each delivers something complete
- **Sequential**: Later phases build on earlier
- **Sized right**: 1-3 days of work each (for solo + Claude)
Common phase patterns:
- Foundation → Core Feature → Enhancement → Polish
- Setup → MVP → Iteration → Launch
- Infrastructure → Backend → Frontend → Integration
</step>
<step name="confirm_phases">
Present the phase breakdown inline:
"Here's how I'd break this down:
1. [Phase name] - [goal]
2. [Phase name] - [goal]
3. [Phase name] - [goal]
...
Does this feel right? (yes / adjust)"
If "adjust": Ask what to change, revise, present again.
</step>
<step name="decision_gate">
After phases confirmed:
Use AskUserQuestion:
- header: "Ready"
- question: "Ready to create the roadmap, or would you like me to ask more questions?"
- options:
- "Create roadmap" - I have enough context
- "Ask more questions" - There are details to clarify
- "Let me add context" - I want to provide more information
Loop until "Create roadmap" selected.
</step>
<step name="create_structure">
```bash
mkdir -p .planning/phases
```
</step>
<step name="write_roadmap">
Use template from `templates/roadmap.md`.
Write to `.planning/ROADMAP.md` with:
- Phase list with names and one-line descriptions
- Dependencies (what must complete before what)
- Status tracking (all start as "not started")
Create phase directories:
```bash
mkdir -p .planning/phases/01-{phase-name}
mkdir -p .planning/phases/02-{phase-name}
# etc.
```
</step>
<step name="git_commit_initialization">
Commit project initialization (brief + roadmap together):
```bash
git add .planning/
git commit -m "$(cat <<'EOF'
docs: initialize [project-name] ([N] phases)
[One-liner from BRIEF.md]
Phases:
1. [phase-name]: [goal]
2. [phase-name]: [goal]
3. [phase-name]: [goal]
EOF
)"
```
Confirm: "Committed: docs: initialize [project] ([N] phases)"
</step>
<step name="offer_next">
```
Project initialized:
- Brief: .planning/BRIEF.md
- Roadmap: .planning/ROADMAP.md
- Committed as: docs: initialize [project] ([N] phases)
What's next?
1. Plan Phase 1 in detail
2. Review/adjust phases
3. Done for now
```
</step>
</process>
<phase_naming>
Use `XX-kebab-case-name` format:
- `01-foundation`
- `02-authentication`
- `03-core-features`
- `04-polish`
Numbers ensure ordering. Names describe content.
</phase_naming>
<anti_patterns>
- Don't add time estimates
- Don't create Gantt charts
- Don't add resource allocation
- Don't include risk matrices
- Don't plan more than 6 phases (scope creep)
Phases are buckets of work, not project management artifacts.
</anti_patterns>
<success_criteria>
Roadmap is complete when:
- [ ] `.planning/ROADMAP.md` exists
- [ ] 3-6 phases defined with clear names
- [ ] Phase directories created
- [ ] Dependencies noted if any
- [ ] Status tracking in place
</success_criteria>

View File

@@ -0,0 +1,982 @@
# Workflow: Execute Phase
<purpose>
Execute a phase prompt (PLAN.md) and create the outcome summary (SUMMARY.md).
</purpose>
<process>
<step name="identify_plan">
Find the next plan to execute:
- Check ROADMAP.md for "In progress" phase
- Find plans in that phase directory
- Identify first plan without corresponding SUMMARY
```bash
cat .planning/ROADMAP.md
# Look for phase with "In progress" status
# Then find plans in that phase
ls .planning/phases/XX-name/*-PLAN.md 2>/dev/null | sort
ls .planning/phases/XX-name/*-SUMMARY.md 2>/dev/null | sort
```
**Logic:**
- If `01-01-PLAN.md` exists but `01-01-SUMMARY.md` doesn't → execute 01-01
- If `01-01-SUMMARY.md` exists but `01-02-SUMMARY.md` doesn't → execute 01-02
- Pattern: Find first PLAN file without matching SUMMARY file
Confirm with user if ambiguous.
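A minimal sketch of that selection logic (assumes the `{phase}-{plan}-PLAN.md` naming convention):
```bash
# For each plan in order, stop at the first one without a matching summary
for plan in .planning/phases/XX-name/*-PLAN.md; do
  [ -e "$plan" ] || break            # no plans in this phase
  summary="${plan%-PLAN.md}-SUMMARY.md"
  if [ ! -f "$summary" ]; then
    echo "Next to execute: $plan"
    break
  fi
done
```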
Present:
```
Found plan to execute: {phase}-{plan}-PLAN.md
[Plan X of Y for Phase Z]
Proceed with execution?
```
</step>
<step name="parse_segments">
**Intelligent segmentation: Parse plan into execution segments.**
Plans are divided into segments by checkpoints. Each segment is routed to optimal execution context (subagent or main).
**1. Check for checkpoints:**
```bash
# Find all checkpoints and their types
grep -n "type=\"checkpoint" .planning/phases/XX-name/{phase}-{plan}-PLAN.md
```
**2. Analyze execution strategy:**
**If NO checkpoints found:**
- **Fully autonomous plan** - spawn single subagent for entire plan
- Subagent gets fresh 200k context, executes all tasks, creates SUMMARY, commits
- Main context: Just orchestration (~5% usage)
**If checkpoints found, parse into segments:**
Segment = tasks between checkpoints (or start→first checkpoint, or last checkpoint→end)
**For each segment, determine routing:**
```
Segment routing rules:
IF segment has no prior checkpoint:
→ SUBAGENT (first segment, nothing to depend on)
IF segment follows checkpoint:human-verify:
→ SUBAGENT (verification is just confirmation, doesn't affect next work)
IF segment follows checkpoint:decision OR checkpoint:human-action:
→ MAIN CONTEXT (next tasks need the decision/result)
```
**3. Execution pattern:**
**Pattern A: Fully autonomous (no checkpoints)**
```
Spawn subagent → execute all tasks → SUMMARY → commit → report back
```
**Pattern B: Segmented with verify-only checkpoints**
```
Segment 1 (tasks 1-3): Spawn subagent → execute → report back
Checkpoint 4 (human-verify): Main context → you verify → continue
Segment 2 (tasks 5-6): Spawn NEW subagent → execute → report back
Checkpoint 7 (human-verify): Main context → you verify → continue
Aggregate results → SUMMARY → commit
```
**Pattern C: Decision-dependent (must stay in main)**
```
Checkpoint 1 (decision): Main context → you decide → continue in main
Tasks 2-5: Main context (need decision from checkpoint 1)
No segmentation benefit - execute entirely in main
```
**4. Why this works:**
**Segmentation benefits:**
- Fresh context for each autonomous segment (0% start every time)
- Main context only for checkpoints (~10-20% total)
- Can handle 10+ task plans if properly segmented
- Autonomous segments run at full quality (no accumulated context to degrade them)
**When segmentation provides no benefit:**
- Checkpoint is decision/human-action and following tasks depend on outcome
- Better to execute sequentially in main than break flow
**5. Implementation:**
**For fully autonomous plans:**
```
Use Task tool with subagent_type="general-purpose":
Prompt: "Execute plan at .planning/phases/XX-name/{phase}-{plan}-PLAN.md
This is an autonomous plan (no checkpoints). Execute all tasks, create SUMMARY.md in phase directory, commit with message following plan's commit guidance.
Follow all deviation rules and authentication gate protocols from the plan.
When complete, report: plan name, tasks completed, SUMMARY path, commit hash."
```
**For segmented plans (has verify-only checkpoints):**
```
Execute segment-by-segment:
For each autonomous segment:
Spawn subagent with prompt: "Execute tasks [X-Y] from plan at .planning/phases/XX-name/{phase}-{plan}-PLAN.md. Read the plan for full context and deviation rules. Do NOT create SUMMARY or commit - just execute these tasks and report results."
Wait for subagent completion
For each checkpoint:
Execute in main context
Wait for user interaction
Continue to next segment
After all segments complete:
Aggregate all results
Create SUMMARY.md
Commit with all changes
```
**For decision-dependent plans:**
```
Execute in main context (standard flow below)
No subagent routing
Quality maintained through small scope (2-3 tasks per plan)
```
See step name="segment_execution" for detailed segment execution loop.
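A quick detection sketch (the plan path here is hypothetical):
```bash
PLAN=.planning/phases/01-foundation/01-01-PLAN.md

if ! grep -q 'type="checkpoint' "$PLAN"; then
  echo "Pattern A: no checkpoints - spawn one subagent for the whole plan"
else
  # Checkpoint types drive routing: human-verify allows subagent segments,
  # decision/human-action forces main-context execution afterward
  grep -n 'type="checkpoint' "$PLAN" | grep -o 'checkpoint:[a-z-]*' | sort | uniq -c
fi
```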
</step>
<step name="segment_execution">
**Detailed segment execution loop for segmented plans.**
**This step applies ONLY to segmented plans (Pattern B: has checkpoints, but they're verify-only).**
For Pattern A (fully autonomous) and Pattern C (decision-dependent), skip this step.
**Execution flow:**
```
1. Parse plan to identify segments:
- Read plan file
- Find checkpoint locations: grep -n "type=\"checkpoint" PLAN.md
- Identify checkpoint types: grep "type=\"checkpoint" PLAN.md | grep -o 'checkpoint:[^"]*'
- Build segment map:
* Segment 1: Start → first checkpoint (tasks 1-X)
* Checkpoint 1: Type and location
* Segment 2: After checkpoint 1 → next checkpoint (tasks X+1 to Y)
* Checkpoint 2: Type and location
* ... continue for all segments
2. For each segment in order:
A. Determine routing (apply rules from parse_segments):
- No prior checkpoint? → Subagent
- Prior checkpoint was human-verify? → Subagent
- Prior checkpoint was decision/human-action? → Main context
B. If routing = Subagent:
```
Spawn Task tool with subagent_type="general-purpose":
Prompt: "Execute tasks [task numbers/names] from plan at [plan path].
**Context:**
- Read the full plan for objective, context files, and deviation rules
- You are executing a SEGMENT of this plan (not the full plan)
- Other segments will be executed separately
**Your responsibilities:**
- Execute only the tasks assigned to you
- Follow all deviation rules and authentication gate protocols
- Track deviations for later Summary
- DO NOT create SUMMARY.md (will be created after all segments complete)
- DO NOT commit (will be done after all segments complete)
**Report back:**
- Tasks completed
- Files created/modified
- Deviations encountered
- Any issues or blockers"
Wait for subagent to complete
Capture results (files changed, deviations, etc.)
```
C. If routing = Main context:
Execute tasks in main using standard execution flow (step name="execute")
Track results locally
D. After segment completes (whether subagent or main):
Continue to next checkpoint/segment
3. After ALL segments complete:
A. Aggregate results from all segments:
- Collect files created/modified from all segments
- Collect deviations from all segments
- Collect decisions from all checkpoints
- Merge into complete picture
B. Create SUMMARY.md:
- Use aggregated results
- Document all work from all segments
- Include deviations from all segments
- Note which segments were subagented
C. Commit:
- Stage all files from all segments
- Stage SUMMARY.md
- Commit with message following plan guidance
- Include note about segmented execution if relevant
D. Report completion
**Example execution trace:**
```
Plan: 01-02-PLAN.md (8 tasks, 2 verify checkpoints)
Parsing segments...
- Segment 1: Tasks 1-3 (autonomous)
- Checkpoint 4: human-verify
- Segment 2: Tasks 5-6 (autonomous)
- Checkpoint 7: human-verify
- Segment 3: Task 8 (autonomous)
Routing analysis:
- Segment 1: No prior checkpoint → SUBAGENT ✓
- Checkpoint 4: Verify only → MAIN (required)
- Segment 2: After verify → SUBAGENT ✓
- Checkpoint 7: Verify only → MAIN (required)
- Segment 3: After verify → SUBAGENT ✓
Execution:
[1] Spawning subagent for tasks 1-3...
→ Subagent completes: 3 files modified, 0 deviations
[2] Executing checkpoint 4 (human-verify)...
════════════════════════════════════════
CHECKPOINT: Verification Required
Task 4 of 8: Verify database schema
I built: User and Session tables with relations
How to verify: Check src/db/schema.ts for correct types
════════════════════════════════════════
User: "approved"
[3] Spawning subagent for tasks 5-6...
→ Subagent completes: 2 files modified, 1 deviation (added error handling)
[4] Executing checkpoint 7 (human-verify)...
User: "approved"
[5] Spawning subagent for task 8...
→ Subagent completes: 1 file modified, 0 deviations
Aggregating results...
- Total files: 6 modified
- Total deviations: 1
- Segmented execution: 3 subagents, 2 checkpoints
Creating SUMMARY.md...
Committing...
✓ Complete
```
**Benefits of this pattern:**
- Main context usage: ~20% (just orchestration + checkpoints)
- Subagent 1: Fresh 0-30% (tasks 1-3)
- Subagent 2: Fresh 0-30% (tasks 5-6)
- Subagent 3: Fresh 0-20% (task 8)
- All autonomous work: Peak quality
- Can handle large plans with many tasks if properly segmented
**When NOT to use segmentation:**
- Plan has decision/human-action checkpoints that affect following tasks
- Following tasks depend on checkpoint outcome
- Better to execute in main sequentially in those cases
</step>
<step name="load_prompt">
Read the plan prompt:
```bash
cat .planning/phases/XX-name/{phase}-{plan}-PLAN.md
```
This IS the execution prompt. Follow it exactly.
</step>
<step name="previous_phase_check">
Before executing, check if previous phase had issues:
```bash
# Find the most recent summary (the previous phase's last plan,
# since the current phase has none yet)
ls .planning/phases/*/*-SUMMARY.md 2>/dev/null | sort | tail -1
```
If previous phase SUMMARY.md has "Issues Encountered" != "None" or "Next Phase Readiness" mentions blockers:
Use AskUserQuestion:
- header: "Previous Issues"
- question: "Previous phase had unresolved items: [summary]. How to proceed?"
- options:
- "Proceed anyway" - Issues won't block this phase
- "Address first" - Let's resolve before continuing
- "Review previous" - Show me the full summary
</step>
<step name="execute">
Execute each task in the prompt. **Deviations are normal** - handle them automatically using embedded rules below.
1. Read the @context files listed in the prompt
2. For each task:
**If `type="auto"`:**
- Work toward task completion
- **If CLI/API returns authentication error:** Handle as authentication gate (see below)
- **When you discover additional work not in plan:** Apply deviation rules (see below) automatically
- Continue implementing, applying rules as needed
- Run the verification
- Confirm done criteria met
- Track any deviations for Summary documentation
- Continue to next task
**If `type="checkpoint:*"`:**
- STOP immediately (do not continue to next task)
- Execute checkpoint_protocol (see below)
- Wait for user response
- Verify if possible (check files, env vars, etc.)
- Only after user confirmation: continue to next task
3. Run overall verification checks from `<verification>` section
4. Confirm all success criteria from `<success_criteria>` section met
5. Document all deviations in Summary (automatic - see deviation_documentation below)
</step>
<authentication_gates>
## Handling Authentication Errors During Execution
**When you encounter authentication errors during `type="auto"` task execution:**
This is NOT a failure. Authentication gates are expected and normal. Handle them dynamically:
**Authentication error indicators:**
- CLI returns: "Error: Not authenticated", "Not logged in", "Unauthorized", "401", "403"
- API returns: "Authentication required", "Invalid API key", "Missing credentials"
- Command fails with: "Please run {tool} login" or "Set {ENV_VAR} environment variable"
**Authentication gate protocol:**
1. **Recognize it's an auth gate** - Not a bug, just needs credentials
2. **STOP current task execution** - Don't retry repeatedly
3. **Create dynamic checkpoint:human-action** - Present it to user immediately
4. **Provide exact authentication steps** - CLI commands, where to get keys
5. **Wait for user to authenticate** - Let them complete auth flow
6. **Verify authentication works** - Test that credentials are valid
7. **Retry the original task** - Resume automation where you left off
8. **Continue normally** - Don't treat this as an error in Summary
**Example: Vercel deployment hits auth error**
```
Task 3: Deploy to Vercel
Running: vercel --yes
Error: Not authenticated. Please run 'vercel login'
[Create checkpoint dynamically]
════════════════════════════════════════
CHECKPOINT: Authentication Required
════════════════════════════════════════
Task 3 of 8: Authenticate Vercel CLI
I tried to deploy but got authentication error.
What you need to do:
Run: vercel login
This will open your browser - complete the authentication flow.
I'll verify after: vercel whoami returns your account
Type "done" when authenticated
════════════════════════════════════════
[Wait for user response]
[User types "done"]
Verifying authentication...
Running: vercel whoami
✓ Authenticated as: user@example.com
Retrying deployment...
Running: vercel --yes
✓ Deployed to: https://myapp-abc123.vercel.app
Task 3 complete. Continuing to task 4...
```
**Example: Stripe API needs key**
```
Task 5: Create Stripe webhook endpoint
Using Stripe API...
Error: 401 Unauthorized - No API key provided
[Create checkpoint dynamically]
════════════════════════════════════════
CHECKPOINT: Credentials Required
════════════════════════════════════════
Task 5 of 8: Provide Stripe API key
I tried to create webhook but need your Stripe API key.
What you need to do:
1. Visit dashboard.stripe.com/apikeys
2. Copy your "Secret key" (starts with sk_test_ or sk_live_)
3. Paste it here, or run: export STRIPE_SECRET_KEY=sk_...
I'll verify after: Stripe API call succeeds
Type "done" when ready, or paste the key
════════════════════════════════════════
[Wait for user response]
[User pastes key or exports env var]
Saving key to .env...
Verifying Stripe API access...
✓ Stripe API authenticated
Retrying webhook creation...
✓ Webhook endpoint created: whsec_abc123
Task 5 complete. Continuing to task 6...
```
**In Summary documentation:**
Document authentication gates as normal flow, not deviations:
```markdown
## Authentication Gates
During execution, I encountered authentication requirements:
1. Task 3: Vercel CLI required authentication
- Paused for `vercel login`
- Resumed after authentication
- Deployed successfully
2. Task 5: Stripe API required API key
- Paused for API key input
- Saved to .env
- Resumed webhook creation
These are normal gates, not errors.
```
**Key principles:**
- Authentication gates are NOT failures or bugs
- They're expected interaction points during first-time setup
- Handle them gracefully and continue automation after unblocked
- Don't mark tasks as "failed" or "incomplete" due to auth gates
- Document them as normal flow, separate from deviations
See references/cli-automation.md "Authentication Gates" section for complete examples.
</authentication_gates>
<deviation_rules>
## Automatic Deviation Handling
**While executing tasks, you WILL discover work not in the plan.** This is normal.
Apply these rules automatically. Track all deviations for Summary documentation.
---
**RULE 1: Auto-fix bugs**
**Trigger:** Code doesn't work as intended (broken behavior, incorrect output, errors)
**Action:** Fix immediately, track for Summary
**Examples:**
- Wrong SQL query returning incorrect data
- Logic errors (inverted condition, off-by-one, infinite loop)
- Type errors, null pointer exceptions, undefined references
- Broken validation (accepts invalid input, rejects valid input)
- Security vulnerabilities (SQL injection, XSS, CSRF, insecure auth)
- Race conditions, deadlocks
- Memory leaks, resource leaks
**Process:**
1. Fix the bug inline
2. Add/update tests to prevent regression
3. Verify fix works
4. Continue task
5. Track in deviations list: `[Rule 1 - Bug] [description]`
**No user permission needed.** Bugs must be fixed for correct operation.
---
**RULE 2: Auto-add missing critical functionality**
**Trigger:** Code is missing essential features for correctness, security, or basic operation
**Action:** Add immediately, track for Summary
**Examples:**
- Missing error handling (no try/catch, unhandled promise rejections)
- No input validation (accepts malicious data, type coercion issues)
- Missing null/undefined checks (crashes on edge cases)
- No authentication on protected routes
- Missing authorization checks (users can access others' data)
- No CSRF protection, missing CORS configuration
- No rate limiting on public APIs
- Missing required database indexes (causes timeouts)
- No logging for errors (can't debug production)
**Process:**
1. Add the missing functionality inline
2. Add tests for the new functionality
3. Verify it works
4. Continue task
5. Track in deviations list: `[Rule 2 - Missing Critical] [description]`
**Critical = required for correct/secure/performant operation**
**No user permission needed.** These are not "features" - they're requirements for basic correctness.
---
**RULE 3: Auto-fix blocking issues**
**Trigger:** Something prevents you from completing current task
**Action:** Fix immediately to unblock, track for Summary
**Examples:**
- Missing dependency (package not installed, import fails)
- Wrong types blocking compilation
- Broken import paths (file moved, wrong relative path)
- Missing environment variable (app won't start)
- Database connection config error
- Build configuration error (webpack, tsconfig, etc.)
- Missing file referenced in code
- Circular dependency blocking module resolution
**Process:**
1. Fix the blocking issue
2. Verify task can now proceed
3. Continue task
4. Track in deviations list: `[Rule 3 - Blocking] [description]`
**No user permission needed.** Can't complete task without fixing blocker.
---
**RULE 4: Ask about architectural changes**
**Trigger:** Fix/addition requires significant structural modification
**Action:** STOP, present to user, wait for decision
**Examples:**
- Adding new database table (not just column)
- Major schema changes (changing primary key, splitting tables)
- Introducing new service layer or architectural pattern
- Switching libraries/frameworks (React → Vue, REST → GraphQL)
- Changing authentication approach (sessions → JWT)
- Adding new infrastructure (message queue, cache layer, CDN)
- Changing API contracts (breaking changes to endpoints)
- Adding new deployment environment
**Process:**
1. STOP current task
2. Present clearly:
```
⚠️ Architectural Decision Needed
Current task: [task name]
Discovery: [what you found that prompted this]
Proposed change: [architectural modification]
Why needed: [rationale]
Impact: [what this affects - APIs, deployment, dependencies, etc.]
Alternatives: [other approaches, or "none apparent"]
Proceed with proposed change? (yes / different approach / defer)
```
3. WAIT for user response
4. If approved: implement, track as `[Rule 4 - Architectural] [description]`
5. If different approach: discuss and implement
6. If deferred: log to ISSUES.md, continue without change
**User decision required.** These changes affect system design.
---
**RULE 5: Log non-critical enhancements**
**Trigger:** Improvement that would enhance code but isn't essential now
**Action:** Add to .planning/ISSUES.md automatically, continue task
**Examples:**
- Performance optimization (works correctly, just slower than ideal)
- Code refactoring (works, but could be cleaner/DRY-er)
- Better naming (works, but variables could be clearer)
- Organizational improvements (works, but file structure could be better)
- Nice-to-have UX improvements (works, but could be smoother)
- Additional test coverage beyond basics (basics exist, could be more thorough)
- Documentation improvements (code works, docs could be better)
- Accessibility enhancements beyond minimum
**Process:**
1. Create .planning/ISSUES.md if it doesn't exist (use template)
2. Add entry with ISS-XXX number (auto-increment)
3. Brief notification: `📋 Logged enhancement: [brief] (ISS-XXX)`
4. Continue task without implementing
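A sketch of the auto-increment (assumes the ISS-XXX format below; `10#` prevents leading zeros being read as octal):
```bash
# Highest existing issue number, or empty if the file doesn't exist yet
last=$(grep -o 'ISS-[0-9][0-9][0-9]' .planning/ISSUES.md 2>/dev/null | sort | tail -1)
num=${last#ISS-}
printf 'ISS-%03d\n' $(( 10#${num:-0} + 1 ))
```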
**Template for ISSUES.md:**
```markdown
# Project Issues Log
Enhancements discovered during execution. Not critical - address in future phases.
## Open Enhancements
### ISS-001: [Brief description]
- **Discovered:** Phase [X] Plan [Y] Task [Z] (YYYY-MM-DD)
- **Type:** [Performance / Refactoring / UX / Testing / Documentation / Accessibility]
- **Description:** [What could be improved and why it would help]
- **Impact:** Low (works correctly, this would enhance)
- **Effort:** [Quick / Medium / Substantial]
- **Suggested phase:** [Phase number or "Future"]
## Closed Enhancements
[Moved here when addressed]
```
**No user permission needed.** Logging for future consideration.
---
**RULE PRIORITY (when multiple could apply):**
1. **If Rule 4 applies** → STOP and ask (architectural decision)
2. **If Rules 1-3 apply** → Fix automatically, track for Summary
3. **If Rule 5 applies** → Log to ISSUES.md, continue
4. **If genuinely unsure which rule** → Apply Rule 4 (ask user)
**Edge case guidance:**
- "This validation is missing" → Rule 2 (critical for security)
- "This validation could be better" → Rule 5 (enhancement)
- "This crashes on null" → Rule 1 (bug)
- "This could be faster" → Rule 5 (enhancement) UNLESS actually timing out → Rule 2 (critical)
- "Need to add table" → Rule 4 (architectural)
- "Need to add column" → Rule 1 or 2 (depends: fixing bug or adding critical field)
**When in doubt:** Ask yourself "Does this affect correctness, security, or ability to complete task?"
- YES → Rules 1-3 (fix automatically)
- NO → Rule 5 (log it)
- MAYBE → Rule 4 (ask user)
</deviation_rules>
<deviation_documentation>
## Documenting Deviations in Summary
After all tasks complete, Summary MUST include deviations section.
**If no deviations:**
```markdown
## Deviations from Plan
None - plan executed exactly as written.
```
**If deviations occurred:**
```markdown
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 1 - Bug] Fixed case-sensitive email uniqueness constraint**
- **Found during:** Task 4 (Follow/unfollow API implementation)
- **Issue:** User.email unique constraint was case-sensitive - Test@example.com and test@example.com were both allowed, causing duplicate accounts
- **Fix:** Changed to `CREATE UNIQUE INDEX users_email_unique ON users (LOWER(email))`
- **Files modified:** src/models/User.ts, migrations/003_fix_email_unique.sql
- **Verification:** Unique constraint test passes - duplicate emails properly rejected
- **Commit:** abc123f
**2. [Rule 2 - Missing Critical] Added JWT expiry validation to auth middleware**
- **Found during:** Task 3 (Protected route implementation)
- **Issue:** Auth middleware wasn't checking token expiry - expired tokens were being accepted
- **Fix:** Added exp claim validation in middleware, reject with 401 if expired
- **Files modified:** src/middleware/auth.ts, src/middleware/auth.test.ts
- **Verification:** Expired token test passes - properly rejects with 401
- **Commit:** def456g
**3. [Rule 3 - Blocking] Fixed broken import path for UserService**
- **Found during:** Task 5 (Profile endpoint)
- **Issue:** Import path referenced old location (src/services/User.ts) but file was moved to src/services/users/UserService.ts in previous plan
- **Fix:** Updated import path
- **Files modified:** src/api/profile.ts
- **Verification:** Build succeeds, imports resolve
- **Commit:** ghi789h
**4. [Rule 4 - Architectural] Added Redis caching layer (APPROVED BY USER)**
- **Found during:** Task 6 (Feed endpoint)
- **Issue:** Feed queries hitting database on every request, causing 2-3 second response times under load
- **Proposed:** Add Redis cache with 5-minute TTL for feed data
- **User decision:** Approved
- **Fix:** Implemented Redis caching with ioredis client, cache invalidation on new posts
- **Files created:** src/cache/RedisCache.ts, src/cache/CacheKeys.ts, docker-compose.yml (added Redis)
- **Verification:** Feed response time reduced to <200ms, cache hit rate >80% in testing
- **Commit:** jkl012m
### Deferred Enhancements
Logged to .planning/ISSUES.md for future consideration:
- ISS-001: Refactor UserService into smaller modules (discovered in Task 3)
- ISS-002: Add connection pooling for Redis (discovered in Task 6)
- ISS-003: Improve error messages for validation failures (discovered in Task 2)
---
**Total deviations:** 4 auto-fixed (1 bug, 1 missing critical, 1 blocking, 1 architectural with approval), 3 deferred
**Impact on plan:** All auto-fixes necessary for correctness/security/performance. No scope creep.
```
**This provides complete transparency:**
- Every deviation documented
- Why it was needed
- What rule applied
- What was done
- User can see exactly what happened beyond the plan
</deviation_documentation>
<step name="checkpoint_protocol">
When encountering `type="checkpoint:*"`:
**Critical: Claude automates everything with CLI/API before checkpoints.** Checkpoints are for verification and decisions, not manual work.
**Display checkpoint clearly:**
```
════════════════════════════════════════
CHECKPOINT: [Type]
════════════════════════════════════════
Task [X] of [Y]: [Action/What-Built/Decision]
[Display task-specific content based on type]
[Resume signal instruction]
════════════════════════════════════════
```
**For checkpoint:human-verify (90% of checkpoints):**
```
I automated: [what was automated - deployed, built, configured]
How to verify:
1. [Step 1 - exact command/URL]
2. [Step 2 - what to check]
3. [Step 3 - expected behavior]
[Resume signal - e.g., "Type 'approved' or describe issues"]
```
**For checkpoint:decision (9% of checkpoints):**
```
Decision needed: [decision]
Context: [why this matters]
Options:
1. [option-id]: [name]
Pros: [pros]
Cons: [cons]
2. [option-id]: [name]
Pros: [pros]
Cons: [cons]
[Resume signal - e.g., "Select: option-id"]
```
**For checkpoint:human-action (1% - rare, only for truly unavoidable manual steps):**
```
I automated: [what Claude already did via CLI/API]
Need your help with: [the ONE thing with no CLI/API - email link, 2FA code]
Instructions:
[Single unavoidable step]
I'll verify after: [verification]
[Resume signal - e.g., "Type 'done' when complete"]
```
**After displaying:** WAIT for user response. Do NOT hallucinate completion. Do NOT continue to next task.
**After user responds:**
- Run verification if specified (file exists, env var set, tests pass, etc.)
- If verification passes or N/A: continue to next task
- If verification fails: inform user, wait for resolution
See references/checkpoints.md and references/cli-automation.md for complete checkpoint guidance.
</step>
<step name="verification_failure_gate">
If any task verification fails:
STOP. Do not continue to next task.
Present inline:
"Verification failed for Task [X]: [task name]
Expected: [verification criteria]
Actual: [what happened]
How to proceed?
1. Retry - Try the task again
2. Skip - Mark as incomplete, continue
3. Stop - Pause execution, investigate"
Wait for user decision.
If user chose "Skip", note it in SUMMARY.md under "Issues Encountered".
</step>
<step name="create_summary">
Create `{phase}-{plan}-SUMMARY.md` as specified in the prompt's `<output>` section.
Use templates/summary.md for structure.
**File location:** `.planning/phases/XX-name/{phase}-{plan}-SUMMARY.md`
**Title format:** `# Phase [X] Plan [Y]: [Name] Summary`
The one-liner must be SUBSTANTIVE:
- Good: "JWT auth with refresh rotation using jose library"
- Bad: "Authentication implemented"
**Next Step section:**
- If more plans exist in this phase: "Ready for {phase}-{next-plan}-PLAN.md"
- If this is the last plan: "Phase complete, ready for transition"
</step>
<step name="issues_review_gate">
Before proceeding, check SUMMARY.md content:
If "Issues Encountered" is NOT "None":
Present inline:
"Phase complete, but issues were encountered:
- [Issue 1]
- [Issue 2]
Please review before proceeding. Acknowledged?"
Wait for acknowledgment.
If "Next Phase Readiness" mentions blockers or concerns:
Present inline:
"Note for next phase:
[concerns from Next Phase Readiness]
Acknowledged?"
Wait for acknowledgment.
</step>
<step name="update_roadmap">
Update ROADMAP.md:
**If more plans remain in this phase:**
- Update plan count: "2/3 plans complete"
- Keep phase status as "In progress"
**If this was the last plan in the phase:**
- Mark phase complete: status → "Complete"
- Add completion date
- Update plan count: "3/3 plans complete"
</step>
<step name="git_commit_plan">
Commit plan completion (PLAN + SUMMARY + code):
```bash
git add .planning/phases/XX-name/{phase}-{plan}-PLAN.md
git add .planning/phases/XX-name/{phase}-{plan}-SUMMARY.md
git add .planning/ROADMAP.md
git add src/ # or relevant code directories
git commit -m "$(cat <<'EOF'
feat({phase}-{plan}): [one-liner from SUMMARY.md]
- [Key accomplishment 1]
- [Key accomplishment 2]
- [Key accomplishment 3]
EOF
)"
```
Confirm: "Committed: feat({phase}-{plan}): [what shipped]"
**Commit scope pattern:**
- `feat(01-01):` for phase 1 plan 1
- `feat(02-03):` for phase 2 plan 3
- Creates clear, chronological git history
</step>
<step name="offer_next">
**If more plans in this phase:**
```
Plan {phase}-{plan} complete.
Summary: .planning/phases/XX-name/{phase}-{plan}-SUMMARY.md
[X] of [Y] plans complete for Phase Z.
What's next?
1. Execute next plan ({phase}-{next-plan})
2. Review what was built
3. Done for now
```
**If phase complete (last plan done):**
```
Plan {phase}-{plan} complete.
Summary: .planning/phases/XX-name/{phase}-{plan}-SUMMARY.md
Phase [Z]: [Name] COMPLETE - all [Y] plans finished.
What's next?
1. Transition to next phase
2. Review phase accomplishments
3. Done for now
```
</step>
</process>
<success_criteria>
- All tasks from PLAN.md completed
- All verifications pass
- SUMMARY.md created with substantive content
- ROADMAP.md updated
</success_criteria>

View File

@@ -0,0 +1,84 @@
# Workflow: Get Planning Guidance
<purpose>
Help decide the right planning approach based on project state and goals.
</purpose>
<process>
<step name="understand_situation">
Ask conversationally:
- What's the project/idea?
- How far along are you? (idea, started, mid-project, almost done)
- What feels unclear?
</step>
<step name="recommend_approach">
Based on situation:
**Just an idea:**
→ Start with Brief. Capture vision before diving in.
**Know what to build, unclear how:**
→ Create Roadmap. Break into phases first.
**Have phases, need specifics:**
→ Plan Phase. Get Claude-executable tasks.
**Mid-project, lost track:**
→ Audit current state. What exists? What's left?
**Project feels stuck:**
→ Identify the blocker. Is it planning or execution?
</step>
<step name="offer_next_action">
```
Recommendation: [approach]
Because: [one sentence why]
Start now?
1. Yes, proceed with [recommended workflow]
2. Different approach
3. More questions first
```
</step>
</process>
<decision_tree>
```
Is there a brief?
├─ No → Create Brief
└─ Yes → Is there a roadmap?
├─ No → Create Roadmap
└─ Yes → Is current phase planned?
├─ No → Plan Phase
└─ Yes → Plan Chunk or Generate Prompts
```
</decision_tree>
<common_situations>
**"I have an idea but don't know where to start"**
→ Brief first. 5 minutes to capture vision.
**"I know what to build but it feels overwhelming"**
→ Roadmap. Break it into 3-6 phases.
**"I have a phase but tasks are vague"**
→ Plan Phase with Claude-executable specificity.
**"I have a plan but Claude keeps going off track"**
→ Tasks aren't specific enough. Add Files/Action/Verification.
**"Context keeps running out mid-task"**
→ Tasks are too big. Break into smaller chunks + use handoff.
</common_situations>
<success_criteria>
Guidance is complete when:
- [ ] User's situation understood
- [ ] Appropriate approach recommended
- [ ] User knows next step
</success_criteria>

View File

@@ -0,0 +1,134 @@
# Workflow: Create Handoff
<required_reading>
**Read these files NOW:**
1. templates/continue-here.md
</required_reading>
<purpose>
Create a context handoff file when pausing work. This preserves full context
so a fresh Claude session can pick up exactly where you left off.
**Handoff is a parking lot, not a journal.** Create when leaving, delete when returning.
</purpose>
<when_to_create>
- User says "pack it up", "stopping for now", "save my place"
- Context window at 15% or below (offer to create)
- Context window at 10% (auto-create)
- Switching to different project
</when_to_create>
<process>
<step name="identify_location">
Determine which phase we're in:
```bash
# Find current phase (most recently modified PLAN.md)
ls -lt .planning/phases/*/PLAN.md 2>/dev/null | head -1
```
Handoff goes in the current phase directory.
</step>
<step name="gather_context">
Collect everything needed for seamless resumption:
1. **Current position**: Which phase, which task
2. **Work completed**: What's done this session
3. **Work remaining**: What's left
4. **Decisions made**: Why things were done this way
5. **Blockers/issues**: Anything stuck
6. **Mental context**: The "vibe" - what you were thinking
</step>
<step name="write_handoff">
Use template from `templates/continue-here.md`.
Write to `.planning/phases/XX-name/.continue-here.md`:
```yaml
---
phase: XX-name
task: 3
total_tasks: 7
status: in_progress
last_updated: [ISO timestamp]
---
```
Then markdown body with full context.
</step>
<step name="git_commit_wip">
Commit handoff as WIP:
```bash
git add .planning/
git commit -m "$(cat <<'EOF'
wip: [phase-name] paused at task [X]/[Y]
Current: [task name]
[If blocked:] Blocked: [reason]
EOF
)"
```
Confirm: "Committed: wip: [phase] paused at task [X]/[Y]"
</step>
<step name="handoff_confirmation">
Require acknowledgment:
"Handoff created: .planning/phases/[XX]/.continue-here.md
Current state:
- Phase: [XX-name]
- Task: [X] of [Y]
- Status: [in_progress/blocked/etc]
- Committed as WIP
To resume: Invoke this skill in a new session.
Confirmed?"
Wait for acknowledgment before ending.
</step>
</process>
<context_trigger>
**Auto-handoff at 10% context:**
When system warning shows ~20k tokens remaining:
1. Complete current atomic operation (don't leave broken state)
2. Create handoff automatically
3. Tell user: "Context limit reached. Handoff created at [location]."
4. Stop working - don't start new tasks
**Warning at 15%:**
"Context getting low (~30k remaining). Create handoff now or push through?"
</context_trigger>
<handoff_lifecycle>
```
Working → No handoff exists
"Pack it up" → CREATE .continue-here.md
[Session ends]
[New session]
"Resume" → READ handoff, then DELETE it
Working → No handoff (context is fresh)
Phase complete → Ensure no stale handoff exists
```
Handoff is temporary. If it persists after resuming, it's stale.
</handoff_lifecycle>
<success_criteria>
Handoff is complete when:
- [ ] .continue-here.md exists in current phase
- [ ] YAML frontmatter has phase, task, status, timestamp
- [ ] Body has: completed work, remaining work, decisions, context
- [ ] User knows how to resume
</success_criteria>

View File

@@ -0,0 +1,70 @@
# Workflow: Plan Next Chunk
<required_reading>
**Read the current phase's PLAN.md**
</required_reading>
<purpose>
Identify the immediate next 1-3 tasks to work on. This is for when you want
to focus on "what's next" without replanning the whole phase.
</purpose>
<process>
<step name="find_current_position">
Read the phase plan:
```bash
cat .planning/phases/XX-current/PLAN.md
```
Identify:
- Which tasks are complete (marked or inferred)
- Which task is next
- Dependencies between tasks
</step>
<step name="identify_chunk">
Select 1-3 tasks that:
- Are next in sequence
- Have dependencies met
- Form a coherent chunk of work
Present:
```
Current phase: [Phase Name]
Progress: [X] of [Y] tasks complete
Next chunk:
1. Task [N]: [Name] - [Brief description]
2. Task [N+1]: [Name] - [Brief description]
Ready to work on these?
```
</step>
<step name="offer_execution">
Options:
1. **Start working** - Begin with Task N
2. **Generate prompt** - Create meta-prompt for this chunk
3. **See full plan** - Review all remaining tasks
4. **Different chunk** - Pick different tasks
</step>
</process>
<chunk_sizing>
Good chunks:
- 1-3 tasks
- Can complete in one session
- Deliver something testable
If user asks "what's next" - give them ONE task.
If user asks "plan my session" - give them 2-3 tasks.
</chunk_sizing>
<success_criteria>
Chunk planning is complete when:
- [ ] Current position identified
- [ ] Next 1-3 tasks selected
- [ ] User knows what to work on
</success_criteria>

View File

@@ -0,0 +1,334 @@
# Workflow: Plan Phase
<required_reading>
**Read these files NOW:**
1. templates/phase-prompt.md
2. references/plan-format.md
3. references/scope-estimation.md
4. references/checkpoints.md
5. Read `.planning/ROADMAP.md`
6. Read `.planning/BRIEF.md`
**If domain expertise should be loaded (determined by intake):**
7. Read domain SKILL.md: `~/.claude/skills/expertise/[domain]/SKILL.md`
8. Determine phase type from ROADMAP (UI, database, API, etc.)
9. Read ONLY relevant references from domain's `<references_index>` section
</required_reading>
<purpose>
Create an executable phase prompt (PLAN.md). This is where we get specific:
objective, context, tasks, verification, success criteria, and output specification.
**Key insight:** PLAN.md IS the prompt that Claude executes. Not a document that
gets transformed into a prompt.
</purpose>
<process>
<step name="identify_phase">
Check roadmap for phases:
```bash
cat .planning/ROADMAP.md
ls .planning/phases/
```
If multiple phases available, ask which one to plan.
If obvious (first incomplete phase), proceed.
Read any existing PLAN.md or FINDINGS.md in the phase directory.
</step>
<step name="check_research_needed">
For this phase, assess:
- Are there technology choices to make?
- Are there unknowns about the approach?
- Do we need to investigate APIs or libraries?
If yes: Route to workflows/research-phase.md first.
Research produces FINDINGS.md, then return here.
If no: Proceed with planning.
</step>
<step name="gather_phase_context">
For this specific phase, understand:
- What's the phase goal? (from roadmap)
- What exists already? (scan codebase if mid-project)
- What dependencies are met? (previous phases complete?)
- Any research findings? (FINDINGS.md)
```bash
# If mid-project, understand current state
ls -la src/ 2>/dev/null
cat package.json 2>/dev/null | head -20
```
</step>
<step name="break_into_tasks">
Decompose the phase into tasks.
Each task must have:
- **Type**: auto, checkpoint:human-verify, checkpoint:decision (human-action rarely needed)
- **Task name**: Clear, action-oriented
- **Files**: Which files created/modified (for auto tasks)
- **Action**: Specific implementation (including what to avoid and WHY)
- **Verify**: How to prove it worked
- **Done**: Acceptance criteria
**Identify checkpoints:**
- Claude automated work needing visual/functional verification? → checkpoint:human-verify
- Implementation choices to make? → checkpoint:decision
- Truly unavoidable manual action (email link, 2FA)? → checkpoint:human-action (rare)
**Critical:** If external resource has CLI/API (Vercel, Stripe, Upstash, GitHub, etc.), use type="auto" to automate it. Only checkpoint for verification AFTER automation.
See references/checkpoints.md and references/cli-automation.md for checkpoint structure and automation guidance.
</step>
<step name="estimate_scope">
After breaking into tasks, assess scope against the **quality degradation curve**.
**ALWAYS split if:**
- >3 tasks total
- Multiple subsystems (DB + API + UI = separate plans)
- >5 files modified in any single task
- Complex domains (auth, payments, data modeling)
**Aggressive atomicity principle:** Better to have 10 small, high-quality plans than 3 large, degraded plans.
**If scope is appropriate (2-3 tasks, single subsystem, <5 files per task):**
Proceed to confirm_breakdown for a single plan.
**If scope is large (>3 tasks):**
Split into multiple plans by:
- Subsystem (01-01: Database, 01-02: API, 01-03: UI, 01-04: Frontend)
- Dependency (01-01: Setup, 01-02: Core, 01-03: Features, 01-04: Testing)
- Complexity (01-01: Layout, 01-02: Data fetch, 01-03: Visualization)
- Autonomous vs Interactive (group auto tasks for subagent execution)
**Each plan must be:**
- 2-3 tasks maximum
- ~50% context target (not 80%)
- Independently committable
**Autonomous plan optimization:**
- Plans with NO checkpoints → will execute via subagent (fresh context)
- Plans with checkpoints → execute in main context (user interaction required)
- Try to group autonomous work together for maximum fresh contexts
See references/scope-estimation.md for complete splitting guidance and quality degradation analysis.
</step>
<step name="confirm_breakdown">
Present the breakdown inline:
**If single plan (2-3 tasks):**
```
Here's the proposed breakdown for Phase [X]:
### Tasks (single plan: {phase}-01-PLAN.md)
1. [Task name] - [brief description] [type: auto/checkpoint]
2. [Task name] - [brief description] [type: auto/checkpoint]
[3. [Task name] - [brief description] [type: auto/checkpoint]] (optional 3rd task if small)
Autonomous: [yes/no] (no checkpoints = subagent execution with fresh context)
Does this breakdown look right? (yes / adjust / start over)
```
**If multiple plans (>3 tasks or multiple subsystems):**
```
Here's the proposed breakdown for Phase [X]:
This phase requires 3 plans to maintain quality:
### Plan 1: {phase}-01-PLAN.md - [Subsystem/Component Name]
1. [Task name] - [brief description] [type]
2. [Task name] - [brief description] [type]
3. [Task name] - [brief description] [type]
### Plan 2: {phase}-02-PLAN.md - [Subsystem/Component Name]
1. [Task name] - [brief description] [type]
2. [Task name] - [brief description] [type]
### Plan 3: {phase}-03-PLAN.md - [Subsystem/Component Name]
1. [Task name] - [brief description] [type]
2. [Task name] - [brief description] [type]
Each plan is independently executable and scoped to the ~50% context target.
Does this breakdown look right? (yes / adjust / start over)
```
Wait for confirmation before proceeding.
If "adjust": Ask what to change, revise, present again.
If "start over": Return to gather_phase_context step.
</step>
<step name="approach_ambiguity">
If multiple valid approaches exist for any task:
Use AskUserQuestion:
- header: "Approach"
- question: "For [task], there are multiple valid approaches:"
- options:
- "[Approach A]" - [tradeoff description]
- "[Approach B]" - [tradeoff description]
- "Decide for me" - Use your best judgment
Only ask if genuinely ambiguous. Don't ask obvious choices.
</step>
<step name="decision_gate">
After breakdown confirmed:
Use AskUserQuestion:
- header: "Ready"
- question: "Ready to create the phase prompt, or would you like me to ask more questions?"
- options:
- "Create phase prompt" - I have enough context
- "Ask more questions" - There are details to clarify
- "Let me add context" - I want to provide more information
Loop until "Create phase prompt" selected.
</step>
<step name="write_phase_prompt">
Use template from `templates/phase-prompt.md`.
**If single plan:**
Write to `.planning/phases/XX-name/{phase}-01-PLAN.md`
**If multiple plans:**
Write multiple files:
- `.planning/phases/XX-name/{phase}-01-PLAN.md`
- `.planning/phases/XX-name/{phase}-02-PLAN.md`
- `.planning/phases/XX-name/{phase}-03-PLAN.md`
Each file follows the template structure:
```markdown
---
phase: XX-name
plan: {plan-number}
type: execute
domain: [if domain expertise loaded]
---
<objective>
[Plan-specific goal - what this plan accomplishes]
Purpose: [Why this plan matters for the phase]
Output: [What artifacts will be created by this plan]
</objective>
<execution_context>
@~/.claude/skills/create-plans/workflows/execute-phase.md
@~/.claude/skills/create-plans/templates/summary.md
[If plan has ANY checkpoint tasks (type="checkpoint:*"), add:]
@~/.claude/skills/create-plans/references/checkpoints.md
</execution_context>
<context>
@.planning/BRIEF.md
@.planning/ROADMAP.md
[If research done:]
@.planning/phases/XX-name/FINDINGS.md
[If continuing from previous plan:]
@.planning/phases/XX-name/{phase}-{prev}-SUMMARY.md
[Relevant source files:]
@src/path/to/relevant.ts
</context>
<tasks>
[Tasks in XML format with type attribute]
[Mix of type="auto" and type="checkpoint:*" as needed]
</tasks>
<verification>
[Overall plan verification checks]
</verification>
<success_criteria>
[Measurable completion criteria for this plan]
</success_criteria>
<output>
After completion, create `.planning/phases/XX-name/{phase}-{plan}-SUMMARY.md`
[Include summary structure from template]
</output>
```
**For multi-plan phases:**
- Each plan has focused scope (2-3 tasks)
- Plans reference previous plan summaries in context
- Last plan's success criteria includes "Phase X complete"
</step>
<step name="offer_next">
**If single plan:**
```
Phase plan created: .planning/phases/XX-name/{phase}-01-PLAN.md
[X] tasks defined.
What's next?
1. Execute plan
2. Review/adjust tasks
3. Done for now
```
**If multiple plans:**
```
Phase plans created:
- {phase}-01-PLAN.md ([X] tasks) - [Subsystem name]
- {phase}-02-PLAN.md ([X] tasks) - [Subsystem name]
- {phase}-03-PLAN.md ([X] tasks) - [Subsystem name]
Total: [X] tasks across [Y] focused plans.
What's next?
1. Execute first plan ({phase}-01)
2. Review/adjust tasks
3. Done for now
```
</step>
</process>
<task_quality>
Good tasks:
- "Add User model to Prisma schema with email, passwordHash, createdAt"
- "Create POST /api/auth/login endpoint with bcrypt validation"
- "Add protected route middleware checking JWT in cookies"
Bad tasks:
- "Set up authentication" (too vague)
- "Make it secure" (not actionable)
- "Handle edge cases" (which ones?)
If you can't specify Files + Action + Verify + Done, the task is too vague.
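For shape, a well-specified auto task might look like this (attribute names are illustrative - the authoritative format is in templates/phase-prompt.md and references/plan-format.md):
```markdown
<task type="auto" name="Add User model to Prisma schema">
  Files: prisma/schema.prisma
  Action: Add User model with email (unique, lowercased), passwordHash, createdAt.
    Avoid storing plaintext passwords - hashing happens in the service layer.
  Verify: npx prisma validate passes; migration applies cleanly
  Done: User table exists with all three fields and a unique email index
</task>
```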
</task_quality>
<anti_patterns>
- Don't add story points
- Don't estimate hours
- Don't assign to team members
- Don't add acceptance criteria committees
- Don't create sub-sub-sub tasks
Tasks are instructions for Claude, not Jira tickets.
</anti_patterns>
<success_criteria>
Phase planning is complete when:
- [ ] One or more PLAN files exist with XML structure ({phase}-{plan}-PLAN.md)
- [ ] Each plan has: Objective, context, tasks, verification, success criteria, output
- [ ] @context references included
- [ ] Each plan has 2-3 tasks (scoped to the ~50% context target)
- [ ] Each task has: Type, Files (if auto), Action, Verify, Done
- [ ] Checkpoints identified and properly structured
- [ ] Tasks are specific enough for Claude to execute
- [ ] If multiple plans: logical split by subsystem/dependency/complexity
- [ ] User knows next steps
</success_criteria>

View File

@@ -0,0 +1,106 @@
# Workflow: Research Phase
<purpose>
Create and execute a research prompt for phases with unknowns.
Produces FINDINGS.md that informs PLAN.md creation.
</purpose>
<when_to_use>
- Technology choice unclear
- Best practices needed
- API/library investigation required
- Architecture decision pending
</when_to_use>
<process>
<step name="identify_unknowns">
Ask: What do we need to learn before we can plan this phase?
- Technology choices?
- Best practices?
- API patterns?
- Architecture approach?
</step>
<step name="create_research_prompt">
Use templates/research-prompt.md.
Write to `.planning/phases/XX-name/RESEARCH.md`
Include:
- Clear research objective
- Scoped include/exclude lists
- Source preferences (official docs, Context7, 2024-2025)
- Output structure for FINDINGS.md
</step>
<step name="execute_research">
Run the research prompt:
- Use web search for current info
- Use Context7 MCP for library docs
- Prefer 2024-2025 sources
- Structure findings per template
</step>
<step name="create_findings">
Write `.planning/phases/XX-name/FINDINGS.md`:
- Summary with recommendation
- Key findings with sources
- Code examples if applicable
- Metadata (confidence, dependencies, open questions, assumptions)
</step>
<step name="confidence_gate">
After creating FINDINGS.md, check confidence level.
If confidence is LOW:
Use AskUserQuestion:
- header: "Low Confidence"
- question: "Research confidence is LOW: [reason]. How would you like to proceed?"
- options:
- "Dig deeper" - Do more research before planning
- "Proceed anyway" - Accept uncertainty, plan with caveats
- "Pause" - I need to think about this
If confidence is MEDIUM:
Inline: "Research complete (medium confidence). [brief reason]. Proceed to planning?"
If confidence is HIGH:
Proceed directly, just note: "Research complete (high confidence)."
</step>
<step name="open_questions_gate">
If FINDINGS.md has open_questions:
Present them inline:
"Open questions from research:
- [Question 1]
- [Question 2]
These may affect implementation. Acknowledge and proceed? (yes / address first)"
If "address first": Gather user input on questions, update findings.
</step>
<step name="offer_next">
```
Research complete: .planning/phases/XX-name/FINDINGS.md
Recommendation: [one-liner]
Confidence: [level]
What's next?
1. Create phase plan (PLAN.md) using findings
2. Refine research (dig deeper)
3. Review findings
```
NOTE: FINDINGS.md is NOT committed separately. It will be committed with phase completion.
</step>
</process>
<success_criteria>
- RESEARCH.md exists with clear scope
- FINDINGS.md created with structured recommendations
- Confidence level and metadata included
- Ready to inform PLAN.md creation
</success_criteria>

View File

@@ -0,0 +1,124 @@
# Workflow: Resume from Handoff
<required_reading>
**Read the handoff file found by context scan.**
</required_reading>
<purpose>
Load context from a handoff file and restore working state.
After loading, DELETE the handoff - it's a parking lot, not permanent storage.
</purpose>
<process>
<step name="locate_handoff">
Context scan already found handoff. Read it:
```bash
cat .planning/phases/*/.continue-here.md 2>/dev/null
```
Parse YAML frontmatter for: phase, task, status, last_updated
Parse markdown body for: context, completed work, remaining work
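A frontmatter-parsing sketch (assumes simple `key: value` lines between the `---` markers):
```bash
awk '/^---$/ { n++; next } n == 1 && /^(phase|task|status|last_updated):/' \
  .planning/phases/*/.continue-here.md
```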
</step>
<step name="calculate_time_ago">
Convert `last_updated` to human-readable:
- "3 hours ago"
- "Yesterday"
- "5 days ago"
If > 2 weeks, warn: "This handoff is [X] old. Code may have changed."
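One way to compute it (GNU `date` assumed; BSD/macOS needs `date -j -f` instead):
```bash
start=$(date -d "$last_updated" +%s)   # $last_updated from the frontmatter
hours=$(( ($(date +%s) - start) / 3600 ))
if   (( hours < 24 )); then echo "${hours} hours ago"
elif (( hours < 48 )); then echo "Yesterday"
else echo "$(( hours / 24 )) days ago"
fi
```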
</step>
<step name="present_summary">
Display to user:
```
Resuming: Phase [X] - [Name]
Last updated: [time ago]
Task [N] of [Total]: [Task name]
Status: [in_progress/blocked/etc]
Completed this phase:
- [task 1]
- [task 2]
Remaining:
- [task 3] ← You are here
- [task 4]
Context notes:
[Key decisions, blockers, mental state from handoff]
Ready to continue? (1) Yes (2) See full handoff (3) Different action
```
</step>
<step name="user_confirms">
**WAIT for user confirmation.** Do not auto-proceed.
On confirmation:
1. Load relevant files mentioned in handoff
2. Delete the handoff file
3. Continue from where we left off
</step>
<step name="delete_handoff">
After user confirms and context is loaded:
```bash
rm .planning/phases/XX-name/.continue-here.md
```
Tell user: "Handoff loaded and cleared. Let's continue."
</step>
<step name="continue_work">
Based on handoff state:
- If mid-task: Continue that task
- If between tasks: Start next task
- If blocked: Address blocker first
Offer: "Continue with [next action]?"
</step>
</process>
<stale_handoff>
If handoff is > 2 weeks old:
```
Warning: This handoff is [X days] old.
The codebase may have changed. Recommend:
1. Review what's changed (git log)
2. Discard handoff, reassess from PLAN.md
3. Continue anyway (risky)
```
</stale_handoff>
<multiple_handoffs>
If multiple `.continue-here.md` files found:
```
Found multiple handoffs:
1. phases/02-auth/.continue-here.md (3 hours ago)
2. phases/01-setup/.continue-here.md (2 days ago)
Which one? (likely want #1, the most recent)
```
Most recent is usually correct. Older ones may be stale/forgotten.
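Listing them newest-first makes the default choice obvious:
```bash
ls -t .planning/phases/*/.continue-here.md 2>/dev/null
```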
</multiple_handoffs>
<success_criteria>
Resume is complete when:
- [ ] Handoff located and parsed
- [ ] Time-ago displayed
- [ ] Summary presented to user
- [ ] User explicitly confirmed
- [ ] Handoff file deleted
- [ ] Context loaded, ready to continue
</success_criteria>

View File

@@ -0,0 +1,151 @@
# Workflow: Transition to Next Phase
<required_reading>
**Read these files NOW:**
1. `.planning/ROADMAP.md`
2. Current phase's plan files (`*-PLAN.md`)
3. Current phase's summary files (`*-SUMMARY.md`)
</required_reading>
<purpose>
Mark current phase complete and advance to next. This is the natural point
where progress tracking happens - implicit via forward motion.
"Planning next phase" = "current phase is done"
</purpose>
<process>
<step name="verify_completion">
Check current phase has all plan summaries:
```bash
ls .planning/phases/XX-current/*-PLAN.md 2>/dev/null | sort
ls .planning/phases/XX-current/*-SUMMARY.md 2>/dev/null | sort
```
**Verification logic:**
- Count PLAN files
- Count SUMMARY files
- If counts match: all plans complete
- If counts don't match: incomplete
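As a sketch:
```bash
plans=$(ls .planning/phases/XX-current/*-PLAN.md 2>/dev/null | wc -l)
summaries=$(ls .planning/phases/XX-current/*-SUMMARY.md 2>/dev/null | wc -l)
if [ "$plans" -eq "$summaries" ]; then
  echo "All $plans plans complete"
else
  echo "Incomplete: $summaries of $plans plans have summaries"
fi
```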
**If all plans complete:**
Ask: "Phase [X] complete - all [Y] plans finished. Ready to mark done and move to Phase [X+1]?"
**If plans incomplete:**
Present:
```
Phase [X] has incomplete plans:
- {phase}-01-SUMMARY.md ✓ Complete
- {phase}-02-SUMMARY.md ✗ Missing
- {phase}-03-SUMMARY.md ✗ Missing
Options:
1. Continue current phase (execute remaining plans)
2. Mark complete anyway (skip remaining plans)
3. Review what's left
```
Wait for user decision.
</step>
<step name="cleanup_handoff">
Check for lingering handoffs:
```bash
ls .planning/phases/XX-current/.continue-here*.md 2>/dev/null
```
If found, delete them - phase is complete, handoffs are stale.
Pattern matches:
- `.continue-here.md` (legacy)
- `.continue-here-01-02.md` (plan-specific)
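One deletion covers both patterns:
```bash
rm -f .planning/phases/XX-current/.continue-here*.md
```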
</step>
<step name="update_roadmap">
Update `.planning/ROADMAP.md`:
- Mark current phase: `[x] Complete`
- Add completion date
- Update plan count to final (e.g., "3/3 plans complete")
- Update Progress table
- Keep next phase as `[ ] Not started`
**Example:**
```markdown
## Phases
- [x] Phase 1: Foundation (completed 2025-01-15)
- [ ] Phase 2: Authentication ← Next
- [ ] Phase 3: Core Features
## Progress
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Foundation | 3/3 | Complete | 2025-01-15 |
| 2. Authentication | 0/2 | Not started | - |
| 3. Core Features | 0/1 | Not started | - |
```
</step>
<step name="archive_prompts">
If prompts were generated for the phase, they stay in place.
The `completed/` subfolder pattern from create-meta-prompts handles archival.
</step>
<step name="offer_next_phase">
```
Phase [X] marked complete.
Next: Phase [X+1] - [Name]
What would you like to do?
1. Plan Phase [X+1] in detail
2. Review roadmap
3. Take a break (done for now)
```
</step>
</process>
<implicit_tracking>
Progress tracking is IMPLICIT:
- "Plan phase 2" → Phase 1 must be done (or ask)
- "Plan phase 3" → Phases 1-2 must be done (or ask)
- Transition workflow makes it explicit in ROADMAP.md
No separate "update progress" step. Forward motion IS progress.
</implicit_tracking>
<partial_completion>
If user wants to move on but phase isn't fully complete:
```
Phase [X] has incomplete plans:
- {phase}-02-PLAN.md (not executed)
- {phase}-03-PLAN.md (not executed)
Options:
1. Mark complete anyway (plans weren't needed)
2. Defer work to later phase
3. Stay and finish current phase
```
Respect user judgment - they know if work matters.
**If marking complete with incomplete plans:**
- Update ROADMAP: "2/3 plans complete" (not "3/3")
- Note in transition message which plans were skipped
</partial_completion>
<success_criteria>
Transition is complete when:
- [ ] Current phase plan summaries verified (all exist or user chose to skip)
- [ ] Any stale handoffs deleted
- [ ] ROADMAP.md updated with completion status and plan count
- [ ] Progress table updated
- [ ] User knows next steps
</success_criteria>