Initial commit
skills/create-plans/references/scope-estimation.md
@@ -0,0 +1,415 @@
# Scope Estimation & Quality-Driven Plan Splitting

Plans must maintain consistent quality from first task to last. This requires understanding the **quality degradation curve** and splitting aggressively to stay in the peak quality zone.

## The Quality Degradation Curve

**Critical insight:** Claude doesn't degrade at arbitrary percentages - it degrades when it *perceives* context pressure and enters "completion mode."

```
Context Usage  │  Quality Level    │  Claude's Mental State
─────────────────────────────────────────────────────────────────────────
0-30%          │  ████████ PEAK    │  "I can be thorough and comprehensive"
               │                   │  No anxiety, full detail, best work

30-50%         │  ██████ GOOD      │  "Still have room, maintaining quality"
               │                   │  Engaged, confident, solid work

50-70%         │  ███ DEGRADING    │  "Getting tight, need to be efficient"
               │                   │  Efficiency mode, compression begins

70%+           │  █ POOR           │  "Running out, must finish quickly"
               │                   │  Self-lobotomization, rushed, minimal
```

**The 40-50% inflection point:**

This is where quality breaks. Claude sees context mounting and thinks "I'd better conserve now or I won't finish." The result is the classic mid-execution statement "I'll complete the remaining tasks more concisely", which marks the quality crash.

**The fundamental rule:** Stop BEFORE quality degrades, not at the context limit.
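
The curve above can be read as a simple lookup from context usage to quality zone. The following is a minimal sketch, not part of any existing tooling: the function names `quality_zone` and `should_stop` are hypothetical, and treating each boundary value (30, 50, 70) as belonging to the lower-quality zone is an assumption, since the chart's ranges overlap at their edges.

```python
from bisect import bisect_right

# Zone boundaries (percent of context used), taken from the chart above.
ZONE_BOUNDS = [30, 50, 70]
ZONE_NAMES = ["PEAK", "GOOD", "DEGRADING", "POOR"]


def quality_zone(context_pct: float) -> str:
    """Return the quality zone for a given context usage percentage.

    Assumption: a value exactly on a boundary counts as the next
    (lower-quality) zone, e.g. 50 is treated as DEGRADING.
    """
    if context_pct < 0:
        raise ValueError("context usage cannot be negative")
    return ZONE_NAMES[bisect_right(ZONE_BOUNDS, context_pct)]


def should_stop(context_pct: float, target_pct: float = 50.0) -> bool:
    """True once usage reaches the target, i.e. before degradation sets in."""
    return context_pct >= target_pct
```

Under these assumptions, `quality_zone(45)` falls in the GOOD zone while `should_stop(50)` already signals that the plan should end.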

## Target: 50% Context Maximum

**Plans should complete within ~50% of context usage.**

Why 50% and not 80%?
- Huge safety buffer
- No context anxiety possible
- Quality maintained from start to finish
- Room for unexpected complexity
- Space for iteration and fixes

**If you target 80%, you're planning for failure.** By the time you hit 80%, you've already spent up to 40% of the context (everything past the 40-50% inflection point) in degradation mode.

## The 2-3 Task Rule

**Each plan should contain 2-3 tasks maximum.**

Why this number?

**Task 1 (0-15% context):**
- Fresh context
- Peak quality
- Comprehensive implementation
- Full testing
- Complete documentation

**Task 2 (15-35% context):**
- Still in peak zone
- Quality maintained
- Buffer feels safe
- No anxiety

**Task 3 (35-50% context):**
- Beginning to feel pressure
- Quality still good but managing it
- Natural stopping point
- Better to commit here

**Task 4+ (50%+ context):**
- DEGRADATION ZONE
- "I'll do this concisely" appears
- Quality crashes
- Should have split before this

**The principle:** Each task is independently committable. 2-3 focused changes per commit create beautiful, surgical git history.
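
As an illustration of the rule, a task list can be cut mechanically into plans of at most three tasks. This is only a sketch: `split_into_plans` is a hypothetical helper, and balancing the plan sizes (so that four tasks become 2 + 2 rather than 3 + 1) is a design choice of this example, not something the rule requires. Real splits should still follow logical groupings.

```python
import math


def split_into_plans(tasks: list[str], max_tasks: int = 3) -> list[list[str]]:
    """Split an ordered task list into plans of at most `max_tasks` tasks.

    Uses the fewest plans that respect the limit, then spreads the tasks
    as evenly as possible across them while preserving order.
    """
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    if not tasks:
        return []
    n_plans = math.ceil(len(tasks) / max_tasks)
    base, extra = divmod(len(tasks), n_plans)
    plans, start = [], 0
    for i in range(n_plans):
        # The first `extra` plans take one additional task each.
        size = base + (1 if i < extra else 0)
        plans.append(tasks[start:start + size])
        start += size
    return plans
```

An eight-task list would come out as three plans of 3, 3, and 2 tasks, each of which stays inside the 2-3 task budget.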

## Signals to Split Into Multiple Plans

### Always Split If:

**1. More than 3 tasks**
- Even if tasks seem small
- Each additional task increases degradation risk
- Split into logical groups of 2-3

**2. Multiple subsystems**
```
❌ Bad (1 plan):
- Database schema (3 files)
- API routes (5 files)
- UI components (8 files)
Total: 16 files, 1 plan → guaranteed degradation

✅ Good (3 plans):
- 01-01-PLAN.md: Database schema (3 files, 2 tasks)
- 01-02-PLAN.md: API routes (5 files, 3 tasks)
- 01-03-PLAN.md: UI components (8 files, 3 tasks)
Total: 16 files, 3 plans → consistent quality
```

**3. Any task with >5 file modifications**
- Large tasks burn context fast
- Split by file groups or logical units
- Better: 3 plans of 2 files each vs 1 plan of 6 files

**4. Checkpoint + implementation work**
- Checkpoints require user interaction (context preserved)
- Implementation after a checkpoint should be a separate plan
```
✅ Good split:
- 02-01-PLAN.md: Setup (checkpoint: decision on auth provider)
- 02-02-PLAN.md: Implement chosen auth solution
```

**5. Research + implementation**
- Research produces FINDINGS.md (separate plan)
- Implementation consumes FINDINGS.md (separate plan)
- Clear boundary, clean handoff
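
The five "always split" signals are mechanical enough to check in code. The sketch below assumes a hypothetical `Task` record; the field names and the `kind` labels (`"auto"`, `"checkpoint"`, `"research"`) are illustrative and not part of any plan format defined here.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    subsystem: str
    files_modified: int
    kind: str = "auto"  # hypothetical labels: "auto", "checkpoint", "research"


def always_split_reasons(tasks: list[Task]) -> list[str]:
    """Return the 'always split' signals that a draft plan triggers."""
    reasons = []
    if len(tasks) > 3:
        reasons.append("more than 3 tasks")
    if len({t.subsystem for t in tasks}) > 1:
        reasons.append("multiple subsystems")
    if any(t.files_modified > 5 for t in tasks):
        reasons.append("task with >5 file modifications")
    # Implementation that follows a checkpoint belongs in its own plan.
    seen_checkpoint = False
    for t in tasks:
        if t.kind == "checkpoint":
            seen_checkpoint = True
        elif seen_checkpoint and t.kind == "auto":
            reasons.append("implementation after checkpoint")
            break
    kinds = {t.kind for t in tasks}
    if "research" in kinds and "auto" in kinds:
        reasons.append("research mixed with implementation")
    return reasons
```

An empty result means none of the hard signals fired; the softer "consider splitting" signals still need human judgment.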

### Consider Splitting If:

**1. Estimated >5 files modified total**
- Context from reading existing code
- Context from diffs
- Context from responses
- Adds up faster than expected

**2. Complex domains (auth, payments, data modeling)**
- These require careful thinking
- They burn more context per task than simple CRUD
- Split more aggressively

**3. Any uncertainty about approach**
- "Figure out X" phase separate from "implement X" phase
- Don't mix exploration and implementation

**4. Natural semantic boundaries**
- Setup → Core → Features
- Backend → Frontend
- Configuration → Implementation → Testing

## Splitting Strategies

### By Subsystem

**Phase:** "Authentication System"

**Split:**
```
- 03-01-PLAN.md: Database models (User, Session tables + relations)
- 03-02-PLAN.md: Auth API (register, login, logout endpoints)
- 03-03-PLAN.md: Protected routes (middleware, JWT validation)
- 03-04-PLAN.md: UI components (login form, registration form)
```

Each plan: 2-3 tasks, single subsystem, clean commits.
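
A subsystem split like the one above can be sketched as a grouping step followed by file naming. The `NN-MM-PLAN.md` pattern (phase number, then plan number) is inferred from the examples in this document, and `plans_by_subsystem` is a hypothetical helper.

```python
def plans_by_subsystem(phase: int, tasks: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (subsystem, task) pairs into one plan file per subsystem.

    Subsystems keep the order in which they first appear, so the plan
    numbering follows the order of the input list.
    """
    groups: dict[str, list[str]] = {}
    for subsystem, task in tasks:
        groups.setdefault(subsystem, []).append(task)
    return {
        f"{phase:02d}-{index:02d}-PLAN.md": names
        for index, names in enumerate(groups.values(), start=1)
    }
```

A group that ends up with more than three tasks would still need a further split under the 2-3 task rule.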

### By Dependency

**Phase:** "Payment Integration"

**Split:**
```
- 04-01-PLAN.md: Stripe setup (webhook endpoints via API, env vars, test mode)
- 04-02-PLAN.md: Subscription logic (plans, checkout, customer portal)
- 04-03-PLAN.md: Frontend integration (pricing page, payment flow)
```

Later plans depend on earlier completion. Sequential execution, fresh context each time.
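
When plans depend on one another, the execution order is a topological sort of the dependency graph. A minimal sketch using Python's standard `graphlib` module (Python 3.9+) follows; the dependency mapping is an assumed encoding of the Payment Integration example, and `execution_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Each plan maps to the set of plans that must complete before it starts.
PLAN_DEPENDENCIES = {
    "04-01-PLAN.md": set(),
    "04-02-PLAN.md": {"04-01-PLAN.md"},
    "04-03-PLAN.md": {"04-02-PLAN.md"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every plan runs after its prerequisites.

    TopologicalSorter raises graphlib.CycleError if plans depend on
    each other in a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

For a simple chain like this one the order is just the plan numbering, but the same approach also handles phases where several plans share a prerequisite.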

### By Complexity

**Phase:** "Dashboard Buildout"

**Split:**
```
- 05-01-PLAN.md: Layout shell (simple: sidebar, header, routing)
- 05-02-PLAN.md: Data fetching (moderate: TanStack Query setup, API integration)
- 05-03-PLAN.md: Data visualization (complex: charts, tables, real-time updates)
```

Complex work gets its own plan with full context budget.

### By Verification Points

**Phase:** "Deployment Pipeline"

**Split:**
```
- 06-01-PLAN.md: Vercel setup (deploy via CLI, configure domains)
  → Ends with checkpoint:human-verify "check xyz.vercel.app loads"

- 06-02-PLAN.md: Environment config (secrets via CLI, env vars)
  → Autonomous (no checkpoints) → subagent execution

- 06-03-PLAN.md: CI/CD (GitHub Actions, preview deploys)
  → Ends with checkpoint:human-verify "check PR preview works"
```

Verification checkpoints create natural boundaries. Autonomous plans between checkpoints execute via subagent with fresh context.

## Autonomous vs Interactive Plans

**Critical optimization:** Plans without checkpoints don't need main context.

### Autonomous Plans (No Checkpoints)
- Contain only `type="auto"` tasks
- No user interaction needed
- **Execute via subagent with fresh 200k context**
- Always start at 0% context, so a plan that respects the 2-3 task budget never reaches the degradation zone
- Create SUMMARY, commit, report back
- Can run in parallel (multiple subagents)

### Interactive Plans (Has Checkpoints)
- Contain `checkpoint:human-verify` or `checkpoint:decision` tasks
- Require user interaction
- Must execute in main context
- Still target 50% context (2-3 tasks)

**Planning guidance:** If splitting a phase, try to:
- Group autonomous work together (→ subagent)
- Separate interactive work (→ main context)
- Maximize autonomous plans (more fresh contexts)

Example:
```
Phase: Feature X
- 07-01-PLAN.md: Backend (autonomous) → subagent
- 07-02-PLAN.md: Frontend (autonomous) → subagent
- 07-03-PLAN.md: Integration test (has checkpoint:human-verify) → main context
```

Two fresh contexts, one interactive verification. Perfect.
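
The routing decision reduces to one check: does the plan contain any checkpoint task? The sketch below assumes each plan is represented as a list of task type strings (`"auto"`, `"checkpoint:human-verify"`, `"checkpoint:decision"`); that representation and the `execution_mode` helper are hypothetical.

```python
def execution_mode(task_types: list[str]) -> str:
    """Route a plan to a subagent or to the main context.

    Any checkpoint task needs user interaction, which forces the whole
    plan into the main context; otherwise a subagent can run it.
    """
    if any(t.startswith("checkpoint:") for t in task_types):
        return "main context"
    return "subagent"


# The Feature X phase from the example above, with assumed task types.
FEATURE_X = {
    "07-01-PLAN.md": ["auto", "auto"],
    "07-02-PLAN.md": ["auto", "auto", "auto"],
    "07-03-PLAN.md": ["auto", "checkpoint:human-verify"],
}
ROUTING = {plan: execution_mode(types) for plan, types in FEATURE_X.items()}
```

Here the first two plans route to subagents and the third stays in the main context, matching the example.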

## Anti-Patterns

### ❌ The "Comprehensive Plan" Anti-Pattern

```
Plan: "Complete Authentication System"
Tasks:
1. Database models
2. Migration files
3. Auth API endpoints
4. JWT utilities
5. Protected route middleware
6. Password hashing
7. Login form component
8. Registration form component

Result: 8 tasks, 80%+ context, degradation at task 4-5
```

**Why this fails:**
- Task 1-3: Good quality
- Task 4-5: "I'll do these concisely" = degradation begins
- Task 6-8: Rushed, minimal, poor quality

### ✅ The "Atomic Plan" Pattern

```
Split into 4 plans:

Plan 1: "Auth Database Models" (2 tasks)
- Database schema (User, Session)
- Migration files

Plan 2: "Auth API Core" (3 tasks)
- Register endpoint
- Login endpoint
- JWT utilities

Plan 3: "Auth API Protection" (2 tasks)
- Protected route middleware
- Logout endpoint

Plan 4: "Auth UI Components" (2 tasks)
- Login form
- Registration form
```

**Why this succeeds:**
- Each plan: 2-3 tasks, 30-40% context
- All tasks: Peak quality throughout
- Git history: 4 focused commits
- Easy to verify each piece
- Rollback is surgical

### ❌ The "Efficiency Trap" Anti-Pattern

```
Thinking: "These tasks are small, let's do 6 to be efficient"

Result: Task 1-2 are good, task 3-4 begin degrading, task 5-6 are rushed
```

**Why this fails:** You're optimizing for fewer plans, not quality. The "efficiency" is false - poor quality requires more rework.

### ✅ The "Quality First" Pattern

```
Thinking: "These tasks are small, but let's do 2-3 to guarantee quality"

Result: All tasks peak quality, clean commits, no rework needed
```

**Why this succeeds:** You optimize for quality, which is true efficiency. No rework = faster overall.

## Estimating Context Usage

**Rough heuristics for plan size:**

### File Counts
- 0-3 files modified: Small task (~10-15% context)
- 4-6 files modified: Medium task (~20-30% context)
- 7+ files modified: Large task (~40%+ context) - split this

### Complexity
- Simple CRUD: ~15% per task
- Business logic: ~25% per task
- Complex algorithms: ~40% per task
- Domain modeling: ~35% per task

### 2-Task Plan (Safe)
- 2 simple tasks: ~30% total ✅ Plenty of room
- 2 medium tasks: ~50% total ✅ At target
- 2 complex tasks: ~80% total ❌ Too tight, split

### 3-Task Plan (Risky)
- 3 simple tasks: ~45% total ✅ Good
- 3 medium tasks: ~75% total ⚠️ Pushing it (well over the 50% target)
- 3 complex tasks: ~120% total ❌ Impossible, split

**Conservative principle:** When in doubt, split. Better to have an extra plan than degraded quality.
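
The per-task heuristics above add up to a quick budget check. This sketch uses the complexity percentages from this section as given; the dictionary keys and the `estimate_plan` helper are hypothetical names, and the numbers remain rough estimates rather than measurements.

```python
# Approximate context cost per task, in percent (from the Complexity list).
COMPLEXITY_COST = {
    "simple_crud": 15,
    "business_logic": 25,
    "domain_modeling": 35,
    "complex_algorithm": 40,
}


def estimate_plan(task_kinds: list[str], target_pct: int = 50) -> tuple[int, bool]:
    """Return (estimated total context %, whether it fits the target)."""
    total = sum(COMPLEXITY_COST[kind] for kind in task_kinds)
    return total, total <= target_pct
```

Two business-logic tasks land exactly on the 50% target, while three of them come to 75% and fail the check, which matches the "when in doubt, split" principle.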

## The Atomic Commit Philosophy

**What we're optimizing for:** Beautiful git history where each commit is:
- Focused (2-3 related changes)
- Complete (fully implemented, tested)
- Documented (clear commit message)
- Reviewable (small enough to understand)
- Revertable (surgical rollback possible)

**Bad git history (large plans):**
```
feat(auth): Complete authentication system
- Added 16 files
- Modified 8 files
- 1200 lines changed
- Contains: models, API, UI, middleware, utilities
```

Impossible to review, hard to understand, can't revert without losing everything.

**Good git history (atomic plans):**
```
feat(auth-01): Add User and Session database models
- Added schema files
- Added migration
- 45 lines changed

feat(auth-02): Implement register and login API endpoints
- Added /api/auth/register
- Added /api/auth/login
- Added JWT utilities
- 120 lines changed

feat(auth-03): Add protected route middleware
- Added middleware/auth.ts
- Added tests
- 60 lines changed

feat(auth-04): Build login and registration forms
- Added LoginForm component
- Added RegisterForm component
- 90 lines changed
```

Each commit tells a story. Each is reviewable. Each is revertable. This is craftsmanship.

## Quality Assurance Through Scope Control

**The guarantee:** When you follow the 2-3 task rule with 50% context target:

1. **Consistency:** First task has same quality as last task
2. **Thoroughness:** No "I'll complete X concisely" degradation
3. **Documentation:** Full context budget for comments/tests
4. **Error handling:** Space for proper validation and edge cases
5. **Testing:** Room for comprehensive test coverage

**The cost:** More plans to manage.

**The benefit:** Consistent excellence. No rework. Clean history. Maintainable code.

**The trade-off is worth it.**

## Summary

**Old way (3-6 tasks, 80% target):**
- Tasks 1-2: Good
- Tasks 3-4: Degrading
- Tasks 5-6: Poor
- Git: Large, unreviewable commits
- Quality: Inconsistent

**New way (2-3 tasks, 50% target):**
- All tasks: Peak quality
- Git: Atomic, surgical commits
- Quality: Consistent excellence
- Autonomous plans: Subagent execution (fresh context)

**The principle:** Aggressive atomicity. More plans, smaller scope, consistent quality.

**The rule:** If in doubt, split. Quality over consolidation. Always.