Initial commit

2025-11-29 18:28:37 +08:00
commit ccc65b3f07
180 changed files with 53970 additions and 0 deletions
--- a/skills/create-plans/references/scope-estimation.md
+++ b/skills/create-plans/references/scope-estimation.md
@@ -0,0 +1,415 @@
+# Scope Estimation & Quality-Driven Plan Splitting
+
+Plans must maintain consistent quality from first task to last. This requires understanding the **quality degradation curve** and splitting aggressively to stay in the peak quality zone.
+
+## The Quality Degradation Curve
+
+**Critical insight:** Claude doesn't degrade at arbitrary percentages - it degrades when it *perceives* context pressure and enters "completion mode."
+
+```
+Context Usage  │  Quality Level   │  Claude's Mental State
+─────────────────────────────────────────────────────────
+0-30%          │  ████████ PEAK   │  "I can be thorough and comprehensive"
+               │                  │  No anxiety, full detail, best work
+
+30-50%         │  ██████ GOOD     │  "Still have room, maintaining quality"
+               │                  │  Engaged, confident, solid work
+
+50-70%         │  ███ DEGRADING   │  "Getting tight, need to be efficient"
+               │                  │  Efficiency mode, compression begins
+
+70%+           │  █ POOR          │  "Running out, must finish quickly"
+               │                  │  Self-lobotomization, rushed, minimal
+```
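
The table above can be read as a lookup from context usage to quality zone. A minimal sketch, assuming the band edges shown in the table and assigning each shared edge (30%, 50%, 70%) to the lower-quality band; the function name `quality_zone` is hypothetical and not part of any tool.

```python
def quality_zone(context_used: float) -> str:
    """Map a context-usage fraction (0.0 to 1.0) to the zone named in the table."""
    if not 0.0 <= context_used <= 1.0:
        raise ValueError("context_used must be between 0.0 and 1.0")
    if context_used < 0.30:
        return "PEAK"
    if context_used < 0.50:
        return "GOOD"
    if context_used < 0.70:
        return "DEGRADING"
    return "POOR"


print(quality_zone(0.45))  # GOOD
print(quality_zone(0.72))  # POOR
```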
+
+**The 40-50% inflection point:**
+
+This is where quality breaks. Claude sees context mounting and thinks "I'd better conserve now or I won't finish." The telltale sign is the classic mid-execution statement "I'll complete the remaining tasks more concisely", which marks the start of the quality crash.
+
+**The fundamental rule:** Stop BEFORE quality degrades, not at context limit.
+
+## Target: 50% Context Maximum
+
+**Plans should complete within ~50% of context usage.**
+
+Why 50% not 80%?
+- Huge safety buffer
+- No context anxiety possible
+- Quality maintained from start to finish
+- Room for unexpected complexity
+- Space for iteration and fixes
+
+**If you target 80%, you're planning for failure.** By the time you hit 80%, you've already spent 30-40 percentage points of context (everything past the 40-50% inflection point) in degradation mode.
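
The arithmetic behind this claim can be sketched as follows, assuming degradation begins somewhere in the 40-50% inflection range described earlier. The helper name `degraded_points` is hypothetical.

```python
def degraded_points(target_pct: int, inflection_pct: int) -> int:
    """Percentage points of context spent past the inflection point."""
    return max(0, target_pct - inflection_pct)


# An 80% target spends 30 to 40 points in degradation, depending on
# where in the 40-50% range the inflection actually falls.
print(degraded_points(80, 40))  # 40
print(degraded_points(80, 50))  # 30

# A 50% target stops at or before the upper edge of the inflection range.
print(degraded_points(50, 50))  # 0
```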
+
+## The 2-3 Task Rule
+
+**Each plan should contain 2-3 tasks maximum.**
+
+Why this number?
+
+**Task 1 (0-15% context):**
+- Fresh context
+- Peak quality
+- Comprehensive implementation
+- Full testing
+- Complete documentation
+
+**Task 2 (15-35% context):**
+- Still in peak zone
+- Quality maintained
+- Buffer feels safe
+- No anxiety
+
+**Task 3 (35-50% context):**
+- Beginning to feel pressure
+- Quality still good, but it takes active effort to maintain
+- Natural stopping point
+- Better to commit here
+
+**Task 4+ (50%+ context):**
+- DEGRADATION ZONE
+- "I'll do this concisely" appears
+- Quality crashes
+- Should have split before this
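
The task bands above amount to a running total checked against the 50% ceiling. A minimal sketch, assuming per-task costs equal to the band widths (15%, 20%, 15%) and an assumed 15% for a fourth task; `tasks_within_target` is a hypothetical helper.

```python
from itertools import accumulate

TARGET = 0.50  # the ~50% ceiling stated in this document


def tasks_within_target(task_costs: list[float], target: float = TARGET) -> int:
    """Count how many leading tasks fit before cumulative usage exceeds target."""
    count = 0
    for running_total in accumulate(task_costs):
        # Small tolerance so a total of exactly 50% is not rejected by float rounding.
        if running_total > target + 1e-9:
            break
        count += 1
    return count


# Band widths for tasks 1-3, plus an assumed cost for a fourth task.
costs = [0.15, 0.20, 0.15, 0.15]
print(tasks_within_target(costs))  # 3
```

The fourth task pushes the total to 65%, which is why the plan should end after task 3.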
+
+**The principle:** Each task is independently committable. 2-3 focused changes per commit create a clean, surgical git history.
+
+## Signals to Split Into Multiple Plans
+
+### Always Split If:
+
+**1. More than 3 tasks**
+- Even if tasks seem small
+- Each additional task increases degradation risk
+- Split into logical groups of 2-3
+
+**2. Multiple subsystems**
+```
+❌ Bad (1 plan):
+- Database schema (3 files)
+- API routes (5 files)
+- UI components (8 files)
+Total: 16 files, 1 plan → guaranteed degradation
+
+✅ Good (3 plans):
+- 01-01-PLAN.md: Database schema (3 files, 2 tasks)
+- 01-02-PLAN.md: API routes (5 files, 3 tasks)
+- 01-03-PLAN.md: UI components (8 files, 3 tasks)
+Total: 16 files, 3 plans → consistent quality
+```
+
+**3. Any task with >5 file modifications**
+- Large tasks burn context fast
+- Split by file groups or logical units
+- Better: 3 plans of 2 files each vs 1 plan of 6 files
+
+**4. Checkpoint + implementation work**
+- Checkpoints require user interaction (context preserved)
+- Implementation after a checkpoint should be a separate plan
+```
+✅ Good split:
+- 02-01-PLAN.md: Setup (checkpoint: decision on auth provider)
+- 02-02-PLAN.md: Implement chosen auth solution
+```
+
+**5. Research + implementation**
+- Research produces FINDINGS.md (separate plan)
+- Implementation consumes FINDINGS.md (separate plan)
+- Clear boundary, clean handoff
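
The five "always split" signals can be expressed as a simple checklist over a draft plan. This is a sketch under stated assumptions: the `Task` and `Plan` types, the `subsystem` field, and the `kind` labels ("implementation", "research", "checkpoint") are hypothetical and only illustrate the rules listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    files_modified: int
    subsystem: str
    kind: str = "implementation"  # hypothetical labels, see lead-in


@dataclass
class Plan:
    tasks: list[Task] = field(default_factory=list)


def always_split_reasons(plan: Plan) -> list[str]:
    """Return every 'always split' signal that the plan triggers."""
    reasons = []
    if len(plan.tasks) > 3:
        reasons.append("more than 3 tasks")
    if len({t.subsystem for t in plan.tasks}) > 1:
        reasons.append("multiple subsystems")
    if any(t.files_modified > 5 for t in plan.tasks):
        reasons.append("task with >5 file modifications")
    kinds = {t.kind for t in plan.tasks}
    if "checkpoint" in kinds and "implementation" in kinds:
        reasons.append("checkpoint + implementation work")
    if "research" in kinds and "implementation" in kinds:
        reasons.append("research + implementation")
    return reasons


# The "bad" single plan from signal 2: three subsystems, 16 files.
bad_plan = Plan(tasks=[
    Task("Database schema", 3, "database"),
    Task("API routes", 5, "api"),
    Task("UI components", 8, "ui"),
])
print(always_split_reasons(bad_plan))
# ['multiple subsystems', 'task with >5 file modifications']
```

An empty list means none of the hard signals fired; the softer "consider splitting" signals still call for judgment.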
+
+### Consider Splitting If:
+
+**1. Estimated >5 files modified total**
+- Context from reading existing code
+- Context from diffs
+- Context from responses
+- Adds up faster than expected
+
+**2. Complex domains (auth, payments, data modeling)**
+- These require careful thinking
+- They burn more context per task than simple CRUD
+- Split more aggressively
+
+**3. Any uncertainty about approach**
+- "Figure out X" phase separate from "implement X" phase
+- Don't mix exploration and implementation
+
+**4. Natural semantic boundaries**
+- Setup → Core → Features
+- Backend → Frontend
+- Configuration → Implementation → Testing
+
+## Splitting Strategies
+
+### By Subsystem
+
+**Phase:** "Authentication System"
+
+**Split:**
+```
+- 03-01-PLAN.md: Database models (User, Session tables + relations)
+- 03-02-PLAN.md: Auth API (register, login, logout endpoints)
+- 03-03-PLAN.md: Protected routes (middleware, JWT validation)
+- 03-04-PLAN.md: UI components (login form, registration form)
+```
+
+Each plan: 2-3 tasks, single subsystem, clean commits.
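
The file names in these examples appear to follow a `<phase>-<plan>-PLAN.md` pattern with two-digit numbers. That reading is inferred from the examples, not stated as a rule. A minimal sketch of a subsystem split under that assumption, with hypothetical helper names:

```python
def plan_filename(phase: int, plan: int) -> str:
    """Build a plan file name such as '03-01-PLAN.md' (pattern inferred from examples)."""
    return f"{phase:02d}-{plan:02d}-PLAN.md"


def split_by_subsystem(phase: int, subsystems: list[str]) -> dict[str, str]:
    """Assign one plan file per subsystem, numbered in the order given."""
    return {
        plan_filename(phase, number): name
        for number, name in enumerate(subsystems, start=1)
    }


plans = split_by_subsystem(
    3, ["Database models", "Auth API", "Protected routes", "UI components"]
)
for filename, subsystem in plans.items():
    print(f"{filename}: {subsystem}")
```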
+
+### By Dependency
+
+**Phase:** "Payment Integration"
+
+**Split:**
+```
+- 04-01-PLAN.md: Stripe setup (webhook endpoints via API, env vars, test mode)
+- 04-02-PLAN.md: Subscription logic (plans, checkout, customer portal)
+- 04-03-PLAN.md: Frontend integration (pricing page, payment flow)
+```
+
+Later plans depend on earlier completion. Sequential execution, fresh context each time.
+
+### By Complexity
+
+**Phase:** "Dashboard Buildout"
+
+**Split:**
+```
+- 05-01-PLAN.md: Layout shell (simple: sidebar, header, routing)
+- 05-02-PLAN.md: Data fetching (moderate: TanStack Query setup, API integration)
+- 05-03-PLAN.md: Data visualization (complex: charts, tables, real-time updates)
+```
+
+Complex work gets its own plan with full context budget.
+
+### By Verification Points
+
+**Phase:** "Deployment Pipeline"
+
+**Split:**
+```
+- 06-01-PLAN.md: Vercel setup (deploy via CLI, configure domains)
+  → Ends with checkpoint:human-verify "check xyz.vercel.app loads"
+
+- 06-02-PLAN.md: Environment config (secrets via CLI, env vars)
+  → Autonomous (no checkpoints) → subagent execution
+
+- 06-03-PLAN.md: CI/CD (GitHub Actions, preview deploys)
+  → Ends with checkpoint:human-verify "check PR preview works"
+```
+
+Verification checkpoints create natural boundaries. Autonomous plans between checkpoints execute via subagent with fresh context.
+
+## Autonomous vs Interactive Plans
+
+**Critical optimization:** Plans without checkpoints don't need main context.
+
+### Autonomous Plans (No Checkpoints)
+- Contains only `type="auto"` tasks
+- No user interaction needed
+- **Execute via subagent with fresh 200k context**
+- Always starts at 0% context, so context-driven degradation is avoided as long as the plan stays within budget
+- Creates SUMMARY, commits, reports back
+- Can run in parallel (multiple subagents)
+
+### Interactive Plans (Has Checkpoints)
+- Contains `checkpoint:human-verify` or `checkpoint:decision` tasks
+- Requires user interaction
+- Must execute in main context
+- Still target 50% context (2-3 tasks)
+
+**Planning guidance:** If splitting a phase, try to:
+- Group autonomous work together (→ subagent)
+- Separate interactive work (→ main context)
+- Maximize autonomous plans (more fresh contexts)
+
+Example:
+```
+Phase: Feature X
+- 07-01-PLAN.md: Backend (autonomous) → subagent
+- 07-02-PLAN.md: Frontend (autonomous) → subagent
+- 07-03-PLAN.md: Integration test (has checkpoint:human-verify) → main context
+```
+
+Two fresh contexts, one interactive verification. Perfect.
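
The routing rule reduces to one check: any checkpoint task forces the plan into the main context. A minimal sketch using the task type strings named above; the function name `execution_target` and the list-of-strings plan representation are assumptions for illustration.

```python
CHECKPOINT_TYPES = {"checkpoint:human-verify", "checkpoint:decision"}


def execution_target(task_types: list[str]) -> str:
    """Return where a plan should run, based on the task types it contains."""
    if any(task_type in CHECKPOINT_TYPES for task_type in task_types):
        return "main context"
    return "subagent"


feature_x = {
    "07-01-PLAN.md": ["auto", "auto"],
    "07-02-PLAN.md": ["auto", "auto", "auto"],
    "07-03-PLAN.md": ["auto", "checkpoint:human-verify"],
}
for filename, task_types in feature_x.items():
    print(f"{filename} -> {execution_target(task_types)}")
```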
+
+## Anti-Patterns
+
+### ❌ The "Comprehensive Plan" Anti-Pattern
+
+```
+Plan: "Complete Authentication System"
+Tasks:
+1. Database models
+2. Migration files
+3. Auth API endpoints
+4. JWT utilities
+5. Protected route middleware
+6. Password hashing
+7. Login form component
+8. Registration form component
+
+Result: 8 tasks, 80%+ context, degradation at task 4-5
+```
+
+**Why this fails:**
+- Tasks 1-3: Good quality
+- Tasks 4-5: "I'll do these concisely" = degradation begins
+- Tasks 6-8: Rushed, minimal, poor quality
+
+### ✅ The "Atomic Plan" Pattern
+
+```
+Split into 4 plans:
+
+Plan 1: "Auth Database Models" (2 tasks)
+- Database schema (User, Session)
+- Migration files
+
+Plan 2: "Auth API Core" (3 tasks)
+- Register endpoint
+- Login endpoint
+- JWT utilities
+
+Plan 3: "Auth API Protection" (2 tasks)
+- Protected route middleware
+- Logout endpoint
+
+Plan 4: "Auth UI Components" (2 tasks)
+- Login form
+- Registration form
+```
+
+**Why this succeeds:**
+- Each plan: 2-3 tasks, 30-40% context
+- All tasks: Peak quality throughout
+- Git history: 4 focused commits
+- Easy to verify each piece
+- Rollback is surgical
+
+### ❌ The "Efficiency Trap" Anti-Pattern
+
+```
+Thinking: "These tasks are small, let's do 6 to be efficient"
+
+Result: Tasks 1-2 are good, tasks 3-4 begin degrading, tasks 5-6 are rushed
+```
+
+**Why this fails:** You're optimizing for fewer plans, not quality. The "efficiency" is false - poor quality requires more rework.
+
+### ✅ The "Quality First" Pattern
+
+```
+Thinking: "These tasks are small, but let's do 2-3 to guarantee quality"
+
+Result: All tasks peak quality, clean commits, no rework needed
+```
+
+**Why this succeeds:** You optimize for quality, which is true efficiency. No rework = faster overall.
+
+## Estimating Context Usage
+
+**Rough heuristics for plan size:**
+
+### File Counts
+- 0-3 files modified: Small task (~10-15% context)
+- 4-6 files modified: Medium task (~20-30% context)
+- 7+ files modified: Large task (~40%+ context) - split this
+
+### Complexity
+- Simple CRUD: ~15% per task
+- Business logic: ~25% per task
+- Complex algorithms: ~40% per task
+- Domain modeling: ~35% per task
+
+### 2-Task Plan (Safe)
+- 2 simple tasks: ~30% total ✅ Plenty of room
+- 2 medium tasks: ~50% total ✅ At target
+- 2 complex tasks: ~80% total ❌ Too tight, split
+
+### 3-Task Plan (Risky)
+- 3 simple tasks: ~45% total ✅ Good
+- 3 medium tasks: ~75% total ⚠️ Pushing it
+- 3 complex tasks: ~120% total ❌ Impossible, split
+
+**Conservative principle:** When in doubt, split. Better to have an extra plan than degraded quality.
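
The 2-task and 3-task totals above are sums of the per-task complexity figures. A minimal sketch that reproduces them, assuming "medium" maps to the business-logic figure (~25%) and "complex" to the complex-algorithms figure (~40%), which is what the totals imply. The 75% cut-off between "pushing it" and "split" is also an assumption, and the names are hypothetical.

```python
# Per-task cost in percent of context (integers avoid float rounding).
COST_PER_TASK = {"simple": 15, "medium": 25, "complex": 40}
TARGET = 50  # the ~50% ceiling stated in this document


def estimate_plan(task_complexities: list[str]) -> int:
    """Sum the heuristic per-task costs for a plan."""
    return sum(COST_PER_TASK[level] for level in task_complexities)


def verdict(total_percent: int) -> str:
    """Classify an estimated total against the target."""
    if total_percent <= TARGET:
        return "ok"
    if total_percent <= 75:  # assumed cut-off, see lead-in
        return "pushing it"
    return "split"


for count in (2, 3):
    for level in COST_PER_TASK:
        total = estimate_plan([level] * count)
        print(f"{count} {level} tasks: ~{total}% -> {verdict(total)}")
```

Mixed plans work the same way: `estimate_plan(["simple", "complex"])` gives 55, which already exceeds the target.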
+
+## The Atomic Commit Philosophy
+
+**What we're optimizing for:** Beautiful git history where each commit is:
+- Focused (2-3 related changes)
+- Complete (fully implemented, tested)
+- Documented (clear commit message)
+- Reviewable (small enough to understand)
+- Revertable (surgical rollback possible)
+
+**Bad git history (large plans):**
+```
+feat(auth): Complete authentication system
+- Added 16 files
+- Modified 8 files
+- 1200 lines changed
+- Contains: models, API, UI, middleware, utilities
+```
+
+Impossible to review, hard to understand, can't revert without losing everything.
+
+**Good git history (atomic plans):**
+```
+feat(auth-01): Add User and Session database models
+- Added schema files
+- Added migration
+- 45 lines changed
+
+feat(auth-02): Implement register and login API endpoints
+- Added /api/auth/register
+- Added /api/auth/login
+- Added JWT utilities
+- 120 lines changed
+
+feat(auth-03): Add protected route middleware
+- Added middleware/auth.ts
+- Added tests
+- 60 lines changed
+
+feat(auth-04): Build login and registration forms
+- Added LoginForm component
+- Added RegisterForm component
+- 90 lines changed
+```
+
+Each commit tells a story. Each is reviewable. Each is revertable. This is craftsmanship.
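
The subject lines in the good example share a `feat(<scope>-<NN>): <summary>` shape. A minimal sketch of a formatter for that shape; the helper name and its parameters are hypothetical, and the two-digit numbering is inferred from the example.

```python
def commit_subject(scope: str, plan_number: int, summary: str, kind: str = "feat") -> str:
    """Format a commit subject like 'feat(auth-01): Add User and Session database models'."""
    return f"{kind}({scope}-{plan_number:02d}): {summary}"


print(commit_subject("auth", 1, "Add User and Session database models"))
print(commit_subject("auth", 3, "Add protected route middleware"))
```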
+
+## Quality Assurance Through Scope Control
+
+**The guarantee:** When you follow the 2-3 task rule with 50% context target:
+
+1. **Consistency:** The first task has the same quality as the last
+2. **Thoroughness:** No "I'll complete X concisely" degradation
+3. **Documentation:** Full context budget for comments/tests
+4. **Error handling:** Space for proper validation and edge cases
+5. **Testing:** Room for comprehensive test coverage
+
+**The cost:** More plans to manage.
+
+**The benefit:** Consistent excellence. No rework. Clean history. Maintainable code.
+
+**The trade-off is worth it.**
+
+## Summary
+
+**Old way (3-6 tasks, 80% target):**
+- Tasks 1-2: Good
+- Tasks 3-4: Degrading
+- Tasks 5-6: Poor
+- Git: Large, unreviewable commits
+- Quality: Inconsistent
+
+**New way (2-3 tasks, 50% target):**
+- All tasks: Peak quality
+- Git: Atomic, surgical commits
+- Quality: Consistent excellence
+- Autonomous plans: Subagent execution (fresh context)
+
+**The principle:** Aggressive atomicity. More plans, smaller scope, consistent quality.
+
+**The rule:** If in doubt, split. Quality over consolidation. Always.