zhongwei/gh-glittercowboy-taches-cc-resources

Zhongwei Li ccc65b3f07 Initial commit

2025-11-29 18:28:37 +08:00

Scope Estimation & Quality-Driven Plan Splitting

Plans must maintain consistent quality from first task to last. This requires understanding the quality degradation curve and splitting aggressively to stay in the peak quality zone.

The Quality Degradation Curve

Critical insight: Claude doesn't degrade at arbitrary percentages - it degrades when it perceives context pressure and enters "completion mode."

Context Usage  │  Quality Level   │  Claude's Mental State
─────────────────────────────────────────────────────────
0-30%          │  ████████ PEAK   │  "I can be thorough and comprehensive"
               │                  │  No anxiety, full detail, best work

30-50%         │  ██████ GOOD     │  "Still have room, maintaining quality"
               │                  │  Engaged, confident, solid work

50-70%         │  ███ DEGRADING   │  "Getting tight, need to be efficient"
               │                  │  Efficiency mode, compression begins

70%+           │  █ POOR          │  "Running out, must finish quickly"
               │                  │  Self-lobotomization, rushed, minimal

The 40-50% inflection point:

This is where quality breaks. Claude sees context mounting and thinks "I'd better conserve now or I won't finish." The result is the classic mid-execution statement "I'll complete the remaining tasks more concisely", which signals the quality crash.

The fundamental rule: Stop BEFORE quality degrades, not at context limit.
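The bands in the curve can be encoded as a small lookup. This is a minimal sketch: the function names are illustrative, and assigning each boundary value (30, 50, 70) to the lower-quality band is an assumption, since the table does not say which side a boundary belongs to.

```python
# Upper bounds (exclusive) for each band, copied from the curve above.
ZONES = [
    (30, "PEAK"),
    (50, "GOOD"),
    (70, "DEGRADING"),
]


def quality_zone(context_pct: float) -> str:
    """Return the quality band for a given context usage percentage."""
    if context_pct < 0:
        raise ValueError("context usage cannot be negative")
    for upper_bound, label in ZONES:
        if context_pct < upper_bound:
            return label
    return "POOR"


def should_stop(context_pct: float, target_pct: float = 50) -> bool:
    """Stop before quality degrades, not at the context limit."""
    return context_pct >= target_pct


if __name__ == "__main__":
    for pct in (10, 45, 60, 85):
        print(f"{pct:>3}% -> {quality_zone(pct)}")
```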

Target: 50% Context Maximum

Plans should complete within ~50% of context usage.

Why 50% not 80%?

Huge safety buffer
No context anxiety possible
Quality maintained from start to finish
Room for unexpected complexity
Space for iteration and fixes

If you target 80%, you're planning for failure. By the time you hit 80%, you've spent the stretch from roughly 40% onward, about 40 points of context, in degradation mode.
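The arithmetic behind the 80% comparison, as a small sketch. It assumes degradation starts at the lower end of the 40-50% inflection band (40%); the function names are illustrative.

```python
def degraded_span(target_pct: float, inflection_pct: float = 40) -> float:
    """Percentage points of context spent past the inflection point."""
    return float(max(0, target_pct - inflection_pct))


def degraded_share(target_pct: float, inflection_pct: float = 40) -> float:
    """Fraction of the plan's consumed context that falls past the inflection point."""
    if target_pct <= 0:
        raise ValueError("target must be positive")
    return degraded_span(target_pct, inflection_pct) / target_pct


if __name__ == "__main__":
    for target in (50, 80):
        span = degraded_span(target)
        share = degraded_share(target)
        print(f"target {target}%: {span:.0f} points past inflection ({share:.0%} of the plan)")
```

Under that assumption, an 80% target spends half of the plan's consumed context past the inflection point, while a 50% target spends a fifth.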

The 2-3 Task Rule

Each plan should contain 2-3 tasks maximum.

Why this number?

Task 1 (0-15% context):

Fresh context
Peak quality
Comprehensive implementation
Full testing
Complete documentation

Task 2 (15-35% context):

Still in peak zone
Quality maintained
Buffer feels safe
No anxiety

Task 3 (35-50% context):

Beginning to feel pressure
Quality still good but managing it
Natural stopping point
Better to commit here

Task 4+ (50%+ context):

DEGRADATION ZONE
"I'll do this concisely" appears
Quality crashes
Should have split before this

The principle: Each task is independently committable. 2-3 focused changes per commit create beautiful, surgical git history.
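The task-by-task ranges above are just a running total. A minimal sketch, using per-task costs of 15, 20, and 15 points taken from those ranges; the cost of a fourth task (15) is an assumption for illustration.

```python
from itertools import accumulate


def task_ranges(task_costs):
    """Return (task_number, start_pct, end_pct) for each task in order."""
    ends = list(accumulate(task_costs))
    starts = [0] + ends[:-1]
    return [(i + 1, s, e) for i, (s, e) in enumerate(zip(starts, ends))]


def first_task_over_target(task_costs, target_pct=50):
    """Return the 1-based number of the first task ending past the target, or None."""
    for number, _start, end in task_ranges(task_costs):
        if end > target_pct:
            return number
    return None


if __name__ == "__main__":
    costs = [15, 20, 15, 15]  # fourth cost is hypothetical
    for number, start, end in task_ranges(costs):
        print(f"Task {number}: {start}-{end}% context")
    print("First task past 50%:", first_task_over_target(costs))
```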

Signals to Split Into Multiple Plans

Always Split If:

1. More than 3 tasks

Even if tasks seem small
Each additional task increases degradation risk
Split into logical groups of 2-3

2. Multiple subsystems

❌ Bad (1 plan):
- Database schema (3 files)
- API routes (5 files)
- UI components (8 files)
Total: 16 files, 1 plan → guaranteed degradation

✅ Good (3 plans):
- 01-01-PLAN.md: Database schema (3 files, 2 tasks)
- 01-02-PLAN.md: API routes (5 files, 3 tasks)
- 01-03-PLAN.md: UI components (8 files, 3 tasks)
Total: 16 files, 3 plans → consistent quality

3. Any task with >5 file modifications

Large tasks burn context fast
Split by file groups or logical units
Better: 3 plans of 2 files each vs 1 plan of 6 files

4. Checkpoint + implementation work

Checkpoints require user interaction (context preserved)
Implementation after checkpoint should be separate plan

✅ Good split:
- 02-01-PLAN.md: Setup (checkpoint: decision on auth provider)
- 02-02-PLAN.md: Implement chosen auth solution

5. Research + implementation

Research produces FINDINGS.md (separate plan)
Implementation consumes FINDINGS.md (separate plan)
Clear boundary, clean handoff
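For the first signal (more than 3 tasks), splitting into groups of 2-3 can be sketched as even chunking. This assumes the tasks are already ordered so that neighbours belong together; the function names are illustrative, and the filename pattern follows the NN-NN-PLAN.md examples above.

```python
import math


def split_into_plans(tasks, max_per_plan=3):
    """Split an ordered task list into evenly sized groups of at most max_per_plan."""
    if max_per_plan < 1:
        raise ValueError("max_per_plan must be at least 1")
    if not tasks:
        return []
    plan_count = math.ceil(len(tasks) / max_per_plan)
    base, extra = divmod(len(tasks), plan_count)
    plans, start = [], 0
    for index in range(plan_count):
        # The first `extra` plans take one additional task each.
        size = base + (1 if index < extra else 0)
        plans.append(tasks[start:start + size])
        start += size
    return plans


def plan_filename(phase: int, index: int) -> str:
    """Build a plan filename such as 01-02-PLAN.md."""
    return f"{phase:02d}-{index:02d}-PLAN.md"


if __name__ == "__main__":
    tasks = [f"task {n}" for n in range(1, 8)]
    for index, group in enumerate(split_into_plans(tasks), start=1):
        print(plan_filename(1, index), group)
```

Even distribution avoids a trailing one-task plan: 4 tasks become 2 + 2 rather than 3 + 1.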

Consider Splitting If:

1. Estimated >5 files modified total

Context from reading existing code
Context from diffs
Context from responses
Adds up faster than expected

2. Complex domains (auth, payments, data modeling)

These require careful thinking
They burn more context per task than simple CRUD
Split more aggressively

3. Any uncertainty about approach

"Figure out X" phase separate from "implement X" phase
Don't mix exploration and implementation

4. Natural semantic boundaries

Setup → Core → Features
Backend → Frontend
Configuration → Implementation → Testing
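The two signal lists can be turned into a checklist. A minimal sketch: the PlanDraft fields are hypothetical names for the signals above, and the "natural semantic boundaries" signal is left out because it needs human judgment.

```python
from dataclasses import dataclass


@dataclass
class PlanDraft:
    task_count: int
    subsystem_count: int
    max_files_in_one_task: int
    total_files: int
    mixes_checkpoint_and_implementation: bool = False
    mixes_research_and_implementation: bool = False
    complex_domain: bool = False
    approach_uncertain: bool = False


def split_signals(draft: PlanDraft):
    """Return (always_split_reasons, consider_split_reasons)."""
    always, consider = [], []
    if draft.task_count > 3:
        always.append("more than 3 tasks")
    if draft.subsystem_count > 1:
        always.append("multiple subsystems")
    if draft.max_files_in_one_task > 5:
        always.append("a task modifies more than 5 files")
    if draft.mixes_checkpoint_and_implementation:
        always.append("checkpoint + implementation work")
    if draft.mixes_research_and_implementation:
        always.append("research + implementation")
    if draft.total_files > 5:
        consider.append("more than 5 files modified in total")
    if draft.complex_domain:
        consider.append("complex domain")
    if draft.approach_uncertain:
        consider.append("uncertainty about approach")
    return always, consider


def recommendation(draft: PlanDraft) -> str:
    always, consider = split_signals(draft)
    if always:
        return "split"
    if consider:
        return "consider splitting"
    return "keep as one plan"
```

For example, `recommendation(PlanDraft(task_count=8, subsystem_count=3, max_files_in_one_task=8, total_files=16))` returns "split".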

Splitting Strategies

By Subsystem

Phase: "Authentication System"

Split:

- 03-01-PLAN.md: Database models (User, Session tables + relations)
- 03-02-PLAN.md: Auth API (register, login, logout endpoints)
- 03-03-PLAN.md: Protected routes (middleware, JWT validation)
- 03-04-PLAN.md: UI components (login form, registration form)

Each plan: 2-3 tasks, single subsystem, clean commits.
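A subsystem split like the one above amounts to grouping tagged tasks. A minimal sketch, assuming each task is given as a (subsystem, task name) pair; the function name is illustrative, and it does not enforce the 2-3 task limit per plan.

```python
def plans_by_subsystem(phase, tasks):
    """Group (subsystem, task_name) pairs into one plan per subsystem.

    Returns a list of (filename, subsystem, task_names), in order of
    first appearance of each subsystem.
    """
    grouped = {}
    for subsystem, name in tasks:
        grouped.setdefault(subsystem, []).append(name)
    return [
        (f"{phase:02d}-{index:02d}-PLAN.md", subsystem, names)
        for index, (subsystem, names) in enumerate(grouped.items(), start=1)
    ]


if __name__ == "__main__":
    auth_tasks = [
        ("Database models", "User table"),
        ("Database models", "Session table"),
        ("Auth API", "register endpoint"),
        ("Auth API", "login endpoint"),
        ("Auth API", "logout endpoint"),
        ("Protected routes", "middleware"),
        ("Protected routes", "JWT validation"),
        ("UI components", "login form"),
        ("UI components", "registration form"),
    ]
    for filename, subsystem, names in plans_by_subsystem(3, auth_tasks):
        print(f"{filename}: {subsystem} ({len(names)} tasks)")
```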

By Dependency

Phase: "Payment Integration"

Split:

- 04-01-PLAN.md: Stripe setup (webhook endpoints via API, env vars, test mode)
- 04-02-PLAN.md: Subscription logic (plans, checkout, customer portal)
- 04-03-PLAN.md: Frontend integration (pricing page, payment flow)

Later plans depend on earlier completion. Sequential execution, fresh context each time.
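Sequential execution follows from the dependency graph. A minimal sketch using the standard library's graphlib (Python 3.9+); the dependency edges are an assumption read off the plan descriptions above, not something the plan files declare.

```python
from graphlib import TopologicalSorter

# Maps each plan to the set of plans it depends on (assumed edges).
DEPENDS_ON = {
    "04-01-PLAN.md": set(),
    "04-02-PLAN.md": {"04-01-PLAN.md"},
    "04-03-PLAN.md": {"04-02-PLAN.md"},
}


def execution_order(depends_on):
    """Return plans ordered so every plan runs after its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for plan in execution_order(DEPENDS_ON):
        print("run", plan, "in a fresh context")
```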

By Complexity

Phase: "Dashboard Buildout"

Split:

- 05-01-PLAN.md: Layout shell (simple: sidebar, header, routing)
- 05-02-PLAN.md: Data fetching (moderate: TanStack Query setup, API integration)
- 05-03-PLAN.md: Data visualization (complex: charts, tables, real-time updates)

Complex work gets its own plan with full context budget.

By Verification Points

Phase: "Deployment Pipeline"

Split:

- 06-01-PLAN.md: Vercel setup (deploy via CLI, configure domains)
  → Ends with checkpoint:human-verify "check xyz.vercel.app loads"

- 06-02-PLAN.md: Environment config (secrets via CLI, env vars)
  → Autonomous (no checkpoints) → subagent execution

- 06-03-PLAN.md: CI/CD (GitHub Actions, preview deploys)
  → Ends with checkpoint:human-verify "check PR preview works"

Verification checkpoints create natural boundaries. Autonomous plans between checkpoints execute via subagent with fresh context.

Autonomous vs Interactive Plans

Critical optimization: Plans without checkpoints don't need main context.

Autonomous Plans (No Checkpoints)

Contains only type="auto" tasks
No user interaction needed
Execute via subagent with fresh 200k context
Starts at 0% context usage, so a 2-3 task plan runs entirely in the peak zone
Creates SUMMARY, commits, reports back
Can run in parallel (multiple subagents)

Interactive Plans (Has Checkpoints)

Contains checkpoint:human-verify or checkpoint:decision tasks
Requires user interaction
Must execute in main context
Still target 50% context (2-3 tasks)

Planning guidance: If splitting a phase, try to:

Group autonomous work together (→ subagent)
Separate interactive work (→ main context)
Maximize autonomous plans (more fresh contexts)

Example:

Phase: Feature X
- 07-01-PLAN.md: Backend (autonomous) → subagent
- 07-02-PLAN.md: Frontend (autonomous) → subagent
- 07-03-PLAN.md: Integration test (has checkpoint:human-verify) → main context

Two fresh contexts, one interactive verification. Perfect.
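The routing rule can be sketched as a check on task types. The three type strings come from the text above; the function names and the "subagent" / "main context" return values are illustrative.

```python
AUTO = "auto"
CHECKPOINT_TYPES = {"checkpoint:human-verify", "checkpoint:decision"}


def is_autonomous(task_types):
    """A plan is autonomous when every task is type="auto"."""
    return all(task_type == AUTO for task_type in task_types)


def execution_target(task_types):
    """Return where a plan should run, based on its task types."""
    if not task_types:
        raise ValueError("a plan needs at least one task")
    unknown = set(task_types) - CHECKPOINT_TYPES - {AUTO}
    if unknown:
        raise ValueError(f"unknown task types: {sorted(unknown)}")
    return "subagent" if is_autonomous(task_types) else "main context"


if __name__ == "__main__":
    feature_x = {
        "07-01-PLAN.md": ["auto", "auto"],
        "07-02-PLAN.md": ["auto", "auto", "auto"],
        "07-03-PLAN.md": ["auto", "checkpoint:human-verify"],
    }
    for plan, task_types in feature_x.items():
        print(plan, "->", execution_target(task_types))
```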

Anti-Patterns

❌ The "Comprehensive Plan" Anti-Pattern

Plan: "Complete Authentication System"
Tasks:
1. Database models
2. Migration files
3. Auth API endpoints
4. JWT utilities
5. Protected route middleware
6. Password hashing
7. Login form component
8. Registration form component

Result: 8 tasks, 80%+ context, degradation at task 4-5

Why this fails:

Tasks 1-3: Good quality
Tasks 4-5: "I'll do these concisely" = degradation begins
Tasks 6-8: Rushed, minimal, poor quality

✅ The "Atomic Plan" Pattern

Split into 4 plans:

Plan 1: "Auth Database Models" (2 tasks)
- Database schema (User, Session)
- Migration files

Plan 2: "Auth API Core" (3 tasks)
- Register endpoint (with password hashing)
- Login endpoint
- JWT utilities

Plan 3: "Auth API Protection" (2 tasks)
- Protected route middleware
- Logout endpoint

Plan 4: "Auth UI Components" (2 tasks)
- Login form
- Registration form

Why this succeeds:

Each plan: 2-3 tasks, 30-40% context
All tasks: Peak quality throughout
Git history: 4 focused commits
Easy to verify each piece
Rollback is surgical

❌ The "Efficiency Trap" Anti-Pattern

Thinking: "These tasks are small, let's do 6 to be efficient"

Result: Tasks 1-2 are good, tasks 3-4 begin degrading, tasks 5-6 are rushed

Why this fails: You're optimizing for fewer plans, not quality. The "efficiency" is false - poor quality requires more rework.

✅ The "Quality First" Pattern

Thinking: "These tasks are small, but let's do 2-3 to guarantee quality"

Result: All tasks peak quality, clean commits, no rework needed

Why this succeeds: You optimize for quality, which is true efficiency. No rework = faster overall.

Estimating Context Usage

Rough heuristics for plan size:

File Counts

0-3 files modified: Small task (~10-15% context)
4-6 files modified: Medium task (~20-30% context)
7+ files modified: Large task (~40%+ context) - split this
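The file-count bands map directly to a lookup. A minimal sketch; the function name is illustrative, and the returned strings repeat the rough estimates above rather than measured values.

```python
def task_size(files_modified: int) -> tuple[str, str]:
    """Return (size label, rough context estimate) for a task's file count."""
    if files_modified < 0:
        raise ValueError("file count cannot be negative")
    if files_modified <= 3:
        return ("small", "~10-15%")
    if files_modified <= 6:
        return ("medium", "~20-30%")
    return ("large", "~40%+")  # split this


if __name__ == "__main__":
    for count in (2, 5, 8):
        label, estimate = task_size(count)
        print(f"{count} files: {label} task, {estimate} context")
```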

Complexity

Simple CRUD: ~15% per task
Business logic: ~25% per task
Complex algorithms: ~40% per task
Domain modeling: ~35% per task

2-Task Plan (Safe)

2 simple tasks: ~30% total ✅ Plenty of room
2 medium tasks: ~50% total ✅ At target
2 complex tasks: ~80% total ❌ Too tight, split

3-Task Plan (Risky)

3 simple tasks: ~45% total ✅ Good
3 medium tasks: ~75% total ⚠️ Well over the 50% target, split
3 complex tasks: ~120% total ❌ Impossible, split

Conservative principle: When in doubt, split. Better to have an extra plan than degraded quality.
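The plan totals above are sums of the per-task complexity estimates. A minimal sketch: the dictionary keys are hypothetical labels for the four complexity levels, and the verdict applies the 50% target strictly, so any total above it is reported as a split.

```python
# Rough per-task context cost in percentage points, from the Complexity list.
COST_PER_TASK = {
    "simple_crud": 15,
    "business_logic": 25,
    "domain_modeling": 35,
    "complex_algorithm": 40,
}


def estimate_plan(task_kinds, target_pct=50):
    """Return (estimated total context %, verdict) for a list of task kinds."""
    total = sum(COST_PER_TASK[kind] for kind in task_kinds)
    verdict = "within target" if total <= target_pct else "split"
    return total, verdict


if __name__ == "__main__":
    for kind in ("simple_crud", "business_logic", "complex_algorithm"):
        for count in (2, 3):
            total, verdict = estimate_plan([kind] * count)
            print(f"{count} x {kind}: ~{total}% -> {verdict}")
```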

The Atomic Commit Philosophy

What we're optimizing for: Beautiful git history where each commit is:

Focused (2-3 related changes)
Complete (fully implemented, tested)
Documented (clear commit message)
Reviewable (small enough to understand)
Revertable (surgical rollback possible)

Bad git history (large plans):

feat(auth): Complete authentication system
- Added 16 files
- Modified 8 files
- 1200 lines changed
- Contains: models, API, UI, middleware, utilities

Impossible to review, hard to understand, can't revert without losing everything.

Good git history (atomic plans):

feat(auth-01): Add User and Session database models
- Added schema files
- Added migration
- 45 lines changed

feat(auth-02): Implement register and login API endpoints
- Added /api/auth/register
- Added /api/auth/login
- Added JWT utilities
- 120 lines changed

feat(auth-03): Add protected route middleware
- Added middleware/auth.ts
- Added tests
- 60 lines changed

feat(auth-04): Build login and registration forms
- Added LoginForm component
- Added RegisterForm component
- 90 lines changed

Each commit tells a story. Each is reviewable. Each is revertable. This is craftsmanship.

Quality Assurance Through Scope Control

The guarantee: When you follow the 2-3 task rule with a 50% context target:

Consistency: First task has same quality as last task
Thoroughness: No "I'll complete X concisely" degradation
Documentation: Full context budget for comments/tests
Error handling: Space for proper validation and edge cases
Testing: Room for comprehensive test coverage

The cost: More plans to manage.

The benefit: Consistent excellence. No rework. Clean history. Maintainable code.

The trade-off is worth it.

Summary

Old way (3-6 tasks, 80% target):

Tasks 1-2: Good
Tasks 3-4: Degrading
Tasks 5-6: Poor
Git: Large, unreviewable commits
Quality: Inconsistent

New way (2-3 tasks, 50% target):

All tasks: Peak quality
Git: Atomic, surgical commits
Quality: Consistent excellence
Autonomous plans: Subagent execution (fresh context)

The principle: Aggressive atomicity. More plans, smaller scope, consistent quality.

The rule: If in doubt, split. Quality over consolidation. Always.
