Initial commit

2025-11-29 17:58:10 +08:00
commit 62e38f6386
28 changed files with 8679 additions and 0 deletions
--- a/skills/executing-parallel-phase/SKILL.md
+++ b/skills/executing-parallel-phase/SKILL.md
@@ -0,0 +1,904 @@
+---
+name: executing-parallel-phase
+description: Use when orchestrating parallel phases in plan execution - creates isolated worktrees for concurrent task execution, installs dependencies, spawns parallel subagents, verifies completion, stacks branches linearly, and cleans up (mandatory for ALL parallel phases including N=1)
+---
+
+# Executing Parallel Phase
+
+## Overview
+
+**Parallel phases enable TRUE concurrent execution via isolated git worktrees**, not just logical independence.
+
+**Critical distinction:** Worktrees are not an optimization to prevent file conflicts. They're the ARCHITECTURE that enables multiple subagents to work simultaneously.
+
+## When to Use
+
+Use this skill when the `execute` command encounters a phase marked "Parallel" in plan.md:
+- ✅ Always use for N≥2 tasks
+- ✅ **Always use for N=1** (maintains architecture consistency)
+- ✅ Even when files don't overlap
+- ✅ Even under time pressure
+- ✅ Even with disk space pressure
+
+**Never skip worktrees for parallel phases.** No exceptions.
+
+## The Iron Law
+
+```
+PARALLEL PHASE = WORKTREES + SUBAGENTS
+```
+
+**Violations of this law:**
+- ❌ Execute in main worktree ("files don't overlap")
+- ❌ Skip worktrees for N=1 ("basically sequential")
+- ❌ Use sequential strategy ("simpler")
+
+**All of these destroy the parallel execution architecture.**
+
+## Rationalization Table
+
+**Predictable shortcuts you WILL be tempted to make. DO NOT make them.**
+
+| Temptation | Why It's Wrong | What To Do |
+|------------|----------------|------------|
+| "The spec is too long, I'll just read the task description" | Task = WHAT files + verification. Spec = WHY architecture + requirements. Missing spec → drift. | Read the full spec. It's 2-5 minutes that prevents hours of rework. |
+| "I already read the constitution, that's enough context" | Constitution = HOW to code. Spec = WHAT to build. Both needed for anchored implementation. | Read constitution AND spec, every time. |
+| "The acceptance criteria are clear, I don't need the spec" | Acceptance criteria = tests pass, files exist. Spec = user flow, business logic, edge cases. | Acceptance criteria verify implementation. Spec defines requirements. |
+| "I'm a subagent in a parallel phase, other tasks probably read the spec" | Each parallel subagent has isolated context. Other tasks' spec reading doesn't transfer. | Every subagent reads spec independently. No assumptions. |
+| "The spec doesn't exist / I can't find it" | If spec missing, STOP and report error. Never proceed without spec. | Check `specs/{run-id}-{feature-slug}/spec.md`. If missing, fail loudly. |
+| "I'll implement first, then check spec to verify" | Spec informs design decisions. Checking after implementation means rework. | Read spec BEFORE writing any code. |
+
+**If you find yourself thinking "I can skip the spec because..." - STOP. You're rationalizing. Read the spec.**
+
+## The Process
+
+**Announce:** "I'm using executing-parallel-phase to orchestrate {N} concurrent tasks in Phase {phase-id}."
+
+### Step 1: Pre-Conditions Verification (MANDATORY)
+
+**Before ANY worktree creation, verify the environment is correct:**
+
+```bash
+# Get main repo root
+REPO_ROOT=$(git rev-parse --show-toplevel)
+CURRENT=$(pwd -P)  # Physical path; --show-toplevel resolves symlinks, so a logical pwd can mismatch
+
+# Check 1: Verify orchestrator is in main repo root
+if [ "$CURRENT" != "$REPO_ROOT" ]; then
+  echo "❌ Error: Orchestrator must run from main repo root"
+  echo "Current: $CURRENT"
+  echo "Expected: $REPO_ROOT"
+  echo ""
+  echo "Return to main repo: cd $REPO_ROOT"
+  exit 1
+fi
+
+echo "✅ Orchestrator location verified: Main repo root"
+
+# Check 2: Verify main worktree exists
+if [ ! -d .worktrees/{runid}-main ]; then
+  echo "❌ Error: Main worktree not found at .worktrees/{runid}-main"
+  echo "Run /spectacular:spec first to create the workspace."
+  exit 1
+fi
+
+# Check 3: Verify main branch exists
+if ! git rev-parse --verify {runid}-main >/dev/null 2>&1; then
+  echo "❌ Error: Branch {runid}-main does not exist"
+  echo "Spec must be created before executing parallel phase."
+  exit 1
+fi
+
+# Check 4: Verify we're on correct base branch for this phase
+CURRENT_BRANCH=$(git -C .worktrees/{runid}-main branch --show-current)
+EXPECTED_BASE="{expected-base-branch}"  # From plan: previous phase's last task, or {runid}-main for Phase 1
+
+if [ "$CURRENT_BRANCH" != "$EXPECTED_BASE" ]; then
+  echo "❌ Error: Phase {phase-id} starting from unexpected branch"
+  echo "   Current: $CURRENT_BRANCH"
+  echo "   Expected: $EXPECTED_BASE"
+  echo ""
+  echo "Parallel phases must start from the correct base branch."
+  echo "All parallel tasks will stack onto: $CURRENT_BRANCH"
+  echo ""
+  echo "If $CURRENT_BRANCH is wrong, the entire phase will be misplaced in the stack."
+  echo ""
+  echo "To fix:"
+  echo "1. Verify previous phase completed: git log --oneline $EXPECTED_BASE"
+  echo "2. Switch to correct base: cd .worktrees/{runid}-main && git checkout $EXPECTED_BASE"
+  echo "3. Re-run /spectacular:execute"
+  exit 1
+fi
+
+echo "✅ Phase {phase-id} starting from correct base: $CURRENT_BRANCH"
+echo "✅ Pre-conditions verified - safe to create task worktrees"
+```
+
+**Why mandatory:**
+- Prevents nested worktrees from wrong location (9f92a8 regression)
+- Catches upstream drift (execute.md or other skill left orchestrator in wrong place)
+- Catches missing prerequisites before wasting time on worktree creation
+- Provides clear error messages for common setup issues
+
+**Red flag:** "Skip verification to save time" - NO. 20ms verification saves hours of debugging.
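+
+One gap worth knowing about: run from the root of a linked worktree, `git rev-parse --show-toplevel` returns that worktree's own root, so Check 1 on its own passes there. The following is a minimal sketch of an extra guard, assuming the orchestrator should only ever run from the main working tree. The function name `is_linked_worktree` is hypothetical and not part of this skill.
+
+```bash
+# Hypothetical helper (not part of the skill): exit 0 when the current
+# directory is inside a linked worktree, exit 1 when it is inside the main
+# working tree, exit 2 when it is not inside a git repository at all.
+is_linked_worktree() {
+  local git_dir common_dir
+  git_dir=$(git rev-parse --git-dir 2>/dev/null) || return 2
+  common_dir=$(git rev-parse --git-common-dir 2>/dev/null) || return 2
+  # Resolve both physically so relative and absolute forms compare equal
+  git_dir=$(cd "$git_dir" && pwd -P) || return 2
+  common_dir=$(cd "$common_dir" && pwd -P) || return 2
+  # In a linked worktree the git dir lives under <common-dir>/worktrees/<name>
+  [ "$git_dir" != "$common_dir" ]
+}
+
+# Possible use next to Check 1:
+# if is_linked_worktree; then
+#   echo "❌ Error: Orchestrator is inside a linked worktree, not the main repo"
+#   exit 1
+# fi
+```
+
+The comparison relies only on `--git-dir` and `--git-common-dir`, which are identical in the main working tree and differ in a linked one.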
+
+### Step 1.5: Check for Existing Work (Resume Support)
+
+**Before creating worktrees, check if tasks are already complete:**
+
+```bash
+COMPLETED_TASKS=()
+PENDING_TASKS=()
+
+for TASK_ID in {task-ids}; do
+  # Use pattern matching to find branch (short-name varies)
+  BRANCH_PATTERN="{runid}-task-{phase-id}-${TASK_ID}-"
+  BRANCH_NAME=$(git branch | grep "^  ${BRANCH_PATTERN}" | sed 's/^  //' | head -n1)
+
+  if [ -n "$BRANCH_NAME" ]; then
+    echo "✓ Task ${TASK_ID} already complete: $BRANCH_NAME"
+    COMPLETED_TASKS+=("$TASK_ID")
+  else
+    PENDING_TASKS+=("$TASK_ID")
+  fi
+done
+
+if [ ${#PENDING_TASKS[@]} -eq 0 ]; then
+  echo "✅ All tasks already complete, skipping worktree creation and dispatch"
+  # Skip Steps 2-4; Step 5 still verifies every branch before Step 6 stacks them
+else
+  echo "📋 Resuming: ${#COMPLETED_TASKS[@]} complete, ${#PENDING_TASKS[@]} pending"
+  echo "Will execute tasks: ${PENDING_TASKS[*]}"
+fi
+```
+
+**Why check:** Enables resume after fixing failed tasks. Avoids re-executing successful tasks, which wastes time and can cause conflicts.
+
+**Red flags:**
+- "Always create all worktrees" - NO. Wastes resources on already-completed work.
+- "Trust orchestrator state" - NO. Branches are source of truth.
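+
+The branch lookup above parses the human-readable output of `git branch`. As an alternative sketch, the same prefix match can be done with `git for-each-ref`, which prints plain ref names. The helper name `find_task_branch` is hypothetical. One behavioral difference to keep in mind: `git branch` marks the current branch with `*` and, in newer Git versions, branches checked out in other worktrees with `+`, so the two-space `grep` excludes those, while `for-each-ref` lists every matching branch regardless of where it is checked out.
+
+```bash
+# Hypothetical helper (not part of the skill): print the first local branch
+# whose name starts with the given prefix, or nothing if no branch matches.
+find_task_branch() {
+  local prefix="$1"
+  git for-each-ref --format='%(refname:short)' "refs/heads/${prefix}*" | head -n1
+}
+
+# Possible use inside the loop (placeholders as in the step above):
+# BRANCH_NAME=$(find_task_branch "{runid}-task-{phase-id}-${TASK_ID}-")
+```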
+
+### Step 2: Create Worktrees (BEFORE Subagents)
+
+**Create isolated worktree for EACH PENDING task (skip completed tasks):**
+
+```bash
+# Get base branch from main worktree
+BASE_BRANCH=$(git -C .worktrees/{runid}-main branch --show-current)
+
+# Create worktrees only for pending tasks (from Step 1.5)
+for TASK_ID in "${PENDING_TASKS[@]}"; do
+  git worktree add ".worktrees/{runid}-task-${TASK_ID}" --detach "$BASE_BRANCH"
+  echo "✅ Created .worktrees/{runid}-task-${TASK_ID} (detached HEAD)"
+done
+
+# Verify all worktrees created
+git worktree list | grep "{runid}-task-"
+```
+
+**Verify creation succeeded:**
+
+```bash
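+CREATED_COUNT=0
+for TASK_ID in "${PENDING_TASKS[@]}"; do
+  # Count only the pending tasks' worktrees; worktrees preserved from an
+  # earlier failed run must not inflate the total on resume
+  if [ -e ".worktrees/{runid}-task-${TASK_ID}/.git" ]; then
+    CREATED_COUNT=$((CREATED_COUNT + 1))
+  fi
+done
+EXPECTED_COUNT=${#PENDING_TASKS[@]}
+
+if [ "$CREATED_COUNT" -ne "$EXPECTED_COUNT" ]; then
+  echo "❌ Error: Expected $EXPECTED_COUNT worktrees, found $CREATED_COUNT"
+  exit 1
+fi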
+
+echo "✅ Created $CREATED_COUNT worktrees for parallel execution"
+```
+
+**Why --detach:** Git doesn't allow the same branch to be checked out in multiple worktrees. Detached HEAD lets every task worktree start from the same base commit.
+
+**Red flags:**
+- "Only 1 task, skip worktrees" - NO. N=1 still uses architecture.
+- "Files don't overlap, skip isolation" - NO. Isolation exists to enable parallelism, not merely to prevent conflicts.
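+
+The `--detach` behavior can be tried out in a throwaway repository. This is a small sketch, assuming a Git version that supports `git branch --show-current`. The function name `demo_detached_worktrees` and the directory names `wt-a`, `wt-b`, `wt-c` are made up for illustration.
+
+```bash
+# Hypothetical demo (not part of the skill). The function body is a subshell,
+# so the cd and set -e do not leak into the caller.
+demo_detached_worktrees() (
+  set -e
+  repo=$(mktemp -d)
+  cd "$repo"
+  git init -q .
+  git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "base"
+  base=$(git branch --show-current)
+
+  # Two detached worktrees at the same base commit: both are accepted
+  git worktree add --detach wt-a "$base" >/dev/null 2>&1
+  git worktree add --detach wt-b "$base" >/dev/null 2>&1
+  echo "detached worktrees: $(git worktree list | grep -c '/wt-[ab] ')"
+
+  # Checking out the branch itself in another worktree is refused, because
+  # it is already checked out in the main working tree
+  if git worktree add wt-c "$base" >/dev/null 2>&1; then
+    echo "same-branch worktree: created"
+  else
+    echo "same-branch worktree: refused"
+  fi
+)
+```
+
+Calling `demo_detached_worktrees` reports how many detached worktrees were created and whether the same-branch attempt was accepted.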
+
+### Step 3: Install Dependencies Per Worktree
+
+**Each PENDING worktree needs its own dependencies (skip completed tasks):**
+
+```bash
+for TASK_ID in "${PENDING_TASKS[@]}"; do
+  if [ ! -d .worktrees/{runid}-task-${TASK_ID}/node_modules ]; then
+    bash -c "cd .worktrees/{runid}-task-${TASK_ID} && {install-command} && {postinstall-command}"
+  fi
+done
+```
+
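+**Why per-worktree:** Each worktree has its own working directory. Untracked directories such as node_modules are not shared between worktrees, so each one needs its own install.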
+
+**Why bash -c:** Orchestrator stays in main repo. Subshell navigates to worktree and exits after commands complete.
+
+**Red flag:** "Share node_modules for efficiency" - Breaks isolation and causes race conditions.
+
+### Step 3.5: Extract Phase Context (Before Dispatching)
+
+**Before spawning subagents, extract phase boundaries from plan:**
+
+The orchestrator already parsed the plan in execute.md Step 1. Extract:
+- Current phase number and name
+- Tasks in THIS phase (what TO implement)
+- Tasks in LATER phases (what NOT to implement)
+
+**Format for subagent context:**
+```
+PHASE CONTEXT:
+- Phase {current-phase-id}/{total-phases}: {phase-name}
+- This phase includes: Task {task-ids-in-this-phase}
+
+LATER PHASES (DO NOT IMPLEMENT):
+- Phase {next-phase}: {phase-name} - {task-summary}
+- Phase {next+1}: {phase-name} - {task-summary}
+...
+
+If implementing work beyond this phase's tasks, STOP and report scope violation.
+```
+
+**Why critical:** Spec describes WHAT to build (entire feature). Plan describes HOW/WHEN (phase breakdown). Subagents need both to avoid scope creep.
+
+### Step 4: Dispatch Parallel Tasks
+
+**CRITICAL: Single message with multiple Task tool calls (true parallelism):**
+
+**Only dispatch for PENDING tasks** (from Step 1.5). Completed tasks already have branches and should not be re-executed.
+
+For each pending task, spawn subagent with embedded instructions (dispatch ALL in single message):
+```
+Task(Implement Task {task-id}: {task-name})
+
+ROLE: Implement Task {task-id} in isolated worktree (parallel phase)
+
+WORKTREE: .worktrees/{run-id}-task-{task-id}
+
+TASK: {task-name}
+FILES: {files-list}
+ACCEPTANCE CRITERIA: {criteria}
+
+PHASE BOUNDARIES:
+===== PHASE BOUNDARIES - CRITICAL =====
+
+Phase {current-phase-id}/{total-phases}: {phase-name}
+This phase includes ONLY: Task {task-ids-in-this-phase}
+
+DO NOT CREATE ANY FILES from later phases.
+
+Later phases (DO NOT CREATE):
+- Phase {next-phase}: {phase-name} - {task-summary}
+  ❌ NO implementation files
+  ❌ NO stub functions (even with TODOs)
+  ❌ NO type definitions or interfaces
+  ❌ NO test scaffolding or temporary code
+
+If tempted to create ANY file from later phases, STOP.
+"Not fully implemented" = violation.
+"Just types/stubs/tests" = violation.
+"Temporary/for testing" = violation.
+
+==========================================
+
+CONTEXT REFERENCES:
+- Spec: specs/{run-id}-{feature-slug}/spec.md
+- Constitution: docs/constitutions/current/
+- Plan: specs/{run-id}-{feature-slug}/plan.md
+- Worktree: .worktrees/{run-id}-task-{task-id}
+
+INSTRUCTIONS:
+
+1. Navigate to isolated worktree:
+   cd .worktrees/{run-id}-task-{task-id}
+
+2. Read constitution (if exists): docs/constitutions/current/
+
+3. Read feature specification: specs/{run-id}-{feature-slug}/spec.md
+
+   This provides:
+   - WHAT to build (requirements, user flows)
+   - WHY decisions were made (architecture rationale)
+   - HOW features integrate (system boundaries)
+
+   The spec is your source of truth for architectural decisions.
+   Constitution tells you HOW to code. Spec tells you WHAT to build.
+
+4. VERIFY PHASE SCOPE before implementing:
+   - Read the PHASE BOUNDARIES section above
+   - Confirm this task belongs to Phase {current-phase-id}
+   - If tempted to implement later phase work, STOP
+   - The plan exists for a reason - respect phase boundaries
+
+5. Implement task following spec + constitution + phase boundaries
+
+6. Run quality checks with exit code validation:
+
+   **CRITICAL**: Use heredoc to prevent bash parsing errors:
+
+   bash <<'EOF'
+   npm test
+   if [ $? -ne 0 ]; then
+     echo "❌ Tests failed"
+     exit 1
+   fi
+
+   npm run lint
+   if [ $? -ne 0 ]; then
+     echo "❌ Lint failed"
+     exit 1
+   fi
+
+   npm run build
+   if [ $? -ne 0 ]; then
+     echo "❌ Build failed"
+     exit 1
+   fi
+   EOF
+
+   **Why heredoc**: Prevents parsing errors when commands are wrapped by orchestrator.
+
+7. Create branch and detach HEAD using verification skill:
+
+   Skill: phase-task-verification
+
+   Parameters:
+   - RUN_ID: {run-id}
+   - TASK_ID: {phase}-{task}
+   - TASK_NAME: {short-name}
+   - COMMIT_MESSAGE: "[Task {phase}.{task}] {task-name}"
+   - MODE: parallel
+
+   The verification skill will:
+   a) Stage changes with git add .
+   b) Create branch with gs branch create
+   c) Detach HEAD with git switch --detach
+   d) Verify HEAD is detached (makes branch accessible in parent repo)
+
+8. Report completion
+
+CRITICAL:
+- Work in .worktrees/{run-id}-task-{task-id}, NOT main repo
+- Do NOT stay on branch - verification skill detaches HEAD
+- Do NOT create additional worktrees
+- Do NOT implement work from later phases (check PHASE BOUNDARIES above)
+```
+
+**Parallel dispatch:** All pending tasks dispatched in single message (true concurrency).
+
+**Red flags:**
+- "I'll just do it myself" - NO. Subagents provide fresh context.
+- "Execute sequentially in main worktree" - NO. Destroys parallelism.
+- "Spec mentions feature X, I'll implement it now" - NO. Check phase boundaries first.
+- "I'll run git add myself" - NO. Let subagent use phase-task-verification skill.
+
+### Step 5: Verify Completion (BEFORE Stacking)
+
+**Check ALL task branches exist AND have commits (includes both previously completed and newly created):**
+
+```bash
+COMPLETED_TASKS=()
+FAILED_TASKS=()
+
+# Get base commit to verify branches have new work
+BASE_BRANCH=$(git -C .worktrees/{runid}-main branch --show-current)
+BASE_SHA=$(git rev-parse "$BASE_BRANCH")
+
+# Check ALL task IDs, not just pending - need to verify complete set exists
+for TASK_ID in {task-ids}; do
+  # Use pattern matching to find branch (short-name varies)
+  BRANCH_PATTERN="{runid}-task-{phase-id}-${TASK_ID}-"
+  BRANCH_NAME=$(git branch | grep "^  ${BRANCH_PATTERN}" | sed 's/^  //' | head -n1)
+
+  if [ -z "$BRANCH_NAME" ]; then
+    FAILED_TASKS+=("Task ${TASK_ID}: Branch not found")
+    continue
+  fi
+
+  # Verify branch has commits beyond base
+  BRANCH_SHA=$(git rev-parse "$BRANCH_NAME")
+  if [ "$BRANCH_SHA" = "$BASE_SHA" ]; then
+    FAILED_TASKS+=("Task ${TASK_ID}: Branch '$BRANCH_NAME' has no commits (still at base $BASE_SHA)")
+    continue
+  fi
+
+  COMPLETED_TASKS+=("Task ${TASK_ID}: $BRANCH_NAME @ $BRANCH_SHA")
+done
+
+if [ ${#FAILED_TASKS[@]} -gt 0 ]; then
+  echo "❌ Phase {phase-id} execution failed"
+  echo ""
+  echo "Completed tasks:"
+  for task in "${COMPLETED_TASKS[@]}"; do
+    echo "  ✅ $task"
+  done
+  echo ""
+  echo "Failed tasks:"
+  for task in "${FAILED_TASKS[@]}"; do
+    echo "  ❌ $task"
+  done
+  echo ""
+  echo "Common causes:"
+  echo "- Subagent failed to implement task (check output above)"
+  echo "- Quality checks blocked commit (test/lint/build failures)"
+  echo "- git add . found no changes (implementation missing)"
+  echo "- gs branch create failed (check git-spice errors)"
+  echo ""
+  echo "To resume:"
+  echo "1. Review subagent output above for failure details"
+  echo "2. Fix failed task(s) in .worktrees/{runid}-task-{task-id}"
+  echo "3. Run quality checks manually to verify fixes"
+  echo "4. Create branch manually: gs branch create {runid}-task-{phase-id}-{task-id}-{name} -m 'message'"
+  echo "5. Re-run /spectacular:execute to complete phase"
+  exit 1
+fi
+
+echo "✅ All {task-count} tasks completed with valid commits"
+```
+
+**Why verify:** Agents can fail. Quality checks can block commits. Verify branches exist before stacking.
+
+**Red flags:**
+- "Agents said success, skip check" - NO. Agent reports ≠ branch existence.
+- "Trust but don't verify" - NO. Verify preconditions.
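+
+The check above compares the branch SHA with the base SHA. As a complementary sketch, counting the commits reachable from the branch but not from the base also catches a branch that points at an ancestor of the base. The helper name `commits_ahead` is hypothetical.
+
+```bash
+# Hypothetical helper (not part of the skill): print how many commits BRANCH
+# contains that BASE does not. 0 means the branch carries no new work.
+commits_ahead() {
+  local base="$1" branch="$2"
+  git rev-list --count "${base}..${branch}"
+}
+
+# Possible use inside the verification loop:
+# if [ "$(commits_ahead "$BASE_BRANCH" "$BRANCH_NAME")" -eq 0 ]; then
+#   FAILED_TASKS+=("Task ${TASK_ID}: Branch '$BRANCH_NAME' has no commits beyond $BASE_BRANCH")
+# fi
+```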
+
+### Step 6: Stack Branches Linearly (BEFORE Cleanup)
+
+**Use loop-based algorithm for any N (orchestrator stays in main repo):**
+
+```bash
+# Stack branches in main worktree using heredoc (orchestrator doesn't cd)
+bash <<'EOF'
+cd .worktrees/{runid}-main
+
+# Get base branch (what parallel tasks should stack onto)
+BASE_BRANCH=$(git branch --show-current)
+
+# Ensure base branch is tracked before stacking onto it
+# (Sequential phases may have created branches without tracking)
+if ! gs branch track --show "$BASE_BRANCH" >/dev/null 2>&1; then
+  echo "⏺ Base branch not tracked yet, tracking now: $BASE_BRANCH"
+  git checkout "$BASE_BRANCH"
+  gs branch track
+fi
+
+TASK_BRANCHES=( {array-of-branch-names} )
+TASK_COUNT=${#TASK_BRANCHES[@]}
+
+# Handle N=1 edge case
+if [ $TASK_COUNT -eq 1 ]; then
+  git checkout "${TASK_BRANCHES[0]}"
+  gs branch track
+  gs upstack onto "$BASE_BRANCH"  # Explicitly set base for single parallel task
+else
+  # Handle N≥2
+  for i in "${!TASK_BRANCHES[@]}"; do
+    BRANCH="${TASK_BRANCHES[$i]}"
+
+    if [ $i -eq 0 ]; then
+      # First task: track + upstack onto base branch (from previous phase)
+      git checkout "$BRANCH"
+      gs branch track
+      gs upstack onto "$BASE_BRANCH"  # Connect to previous phase's work
+    else
+      # Subsequent: track + upstack onto previous
+      PREV_BRANCH="${TASK_BRANCHES[$((i-1))]}"
+      git checkout "$BRANCH"
+      gs branch track
+      gs upstack onto "$PREV_BRANCH"
+    fi
+  done
+fi
+
+# Leave main worktree on last branch for next phase continuity
+# Sequential phases will naturally stack on this branch
+
+# Display stack
+echo "📋 Stack after parallel phase:"
+gs log short
+echo ""
+
+# Verify stack correctness (catch duplicate commits)
+echo "🔍 Verifying stack integrity..."
+STACK_VALID=1
+declare -A SEEN_COMMITS  # Associative array: requires bash 4+ (macOS ships bash 3.2 as /bin/bash)
+
+for BRANCH in "${TASK_BRANCHES[@]}"; do
+  BRANCH_SHA=$(git rev-parse "$BRANCH")
+
+  # Check if this commit SHA was already seen
+  if [ -n "${SEEN_COMMITS[$BRANCH_SHA]}" ]; then
+    echo "❌ ERROR: Stack integrity violation"
+    echo "   Branch '$BRANCH' points to commit $BRANCH_SHA"
+    echo "   But '${SEEN_COMMITS[$BRANCH_SHA]}' already points to that commit"
+    echo ""
+    echo "This means one of these branches has no unique commits."
+    echo "Possible causes:"
+    echo "- Subagent failed to commit work"
+    echo "- Quality checks blocked commit"
+    echo "- Branch creation succeeded but commit failed"
+    STACK_VALID=0
+    break
+  fi
+
+  SEEN_COMMITS[$BRANCH_SHA]="$BRANCH"
+  echo "  ✓ $BRANCH @ $BRANCH_SHA"
+done
+
+if [ $STACK_VALID -eq 0 ]; then
+  echo ""
+  echo "❌ Stack verification FAILED - preserving worktrees for debugging"
+  echo ""
+  echo "To investigate:"
+  echo "1. Check branch commits: git log --oneline $BRANCH"
+  echo "2. Check worktree state: ls -la .worktrees/"
+  echo "3. Review subagent output for failed task"
+  echo "4. Fix manually, then re-run /spectacular:execute"
+  exit 1
+fi
+
+echo "✅ Stack integrity verified - all branches have unique commits"
+EOF
+```
+
+**Why heredoc:** Orchestrator stays in main repo. Heredoc creates subshell that navigates to worktree and exits.
+
+**Why before cleanup:** Need worktrees accessible for debugging if stacking fails.
+
+**Why verify stack:** Catches duplicate commits (two branches pointing to the same SHA), which indicate missing work.
+
+**Red flag:** "Clean up first to free disk space" - NO. Stacking MUST happen first, and verification before cleanup.
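+
+The stacking order itself is simple: the first task branch sits on the base branch, and every later branch sits on the one before it. A small sketch that only prints this plan, without calling git or git-spice, can help when checking `TASK_BRANCHES` before running the real commands. The helper name `print_stack_plan` and the branch names in the call are made up for illustration.
+
+```bash
+# Hypothetical helper (not part of the skill): print "<branch> -> <parent>"
+# for a linear stack. First argument is the base branch; the remaining
+# arguments are the task branches in stacking order.
+print_stack_plan() {
+  local parent="$1"
+  shift
+  local branch
+  for branch in "$@"; do
+    echo "$branch -> $parent"
+    parent="$branch"   # the next task stacks onto this one
+  done
+}
+
+print_stack_plan "run1-main" "run1-task-2-1-api" "run1-task-2-2-ui" "run1-task-2-3-docs"
+```
+
+With a single task branch the loop runs once and pairs that branch with the base, so N=1 needs no separate case in this sketch.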
+
+### Step 7: Clean Up Worktrees (AFTER Stacking)
+
+**IMPORTANT**: This step only runs if Step 5 verification and Step 6 stack verification both pass. If any task fails, Step 5 exits with code 1, aborting the workflow. Failed task worktrees are preserved for debugging.
+
+**Remove task worktrees:**
+
+```bash
+for TASK_ID in {task-ids}; do
+  git worktree remove ".worktrees/{runid}-task-${TASK_ID}"
+done
+
+# Verify cleanup
+git worktree list | grep "{runid}-task-"
+# Should be empty
+```
+
+**Why after stacking:** Branches must be stacked and verified before destroying evidence.
+
+**Why conditional**: Failed worktrees must be preserved so users can debug, fix, and manually create branches before resuming.
+
+### Step 8: Code Review (Binary Quality Gate)
+
+**Check review frequency setting (from execute.md Step 1.7):**
+
+```bash
+REVIEW_FREQUENCY=${REVIEW_FREQUENCY:-per-phase}
+```
+
+**If REVIEW_FREQUENCY is "end-only" or "skip":**
+```
+Skipping per-phase code review (frequency: {REVIEW_FREQUENCY})
+Phase {N} complete - proceeding to next phase
+```
+Mark phase complete and continue to next phase.
+
+**If REVIEW_FREQUENCY is "optimize":**
+
+Analyze the completed phase to decide if code review is needed:
+
+**High-risk indicators (REVIEW REQUIRED):**
+- Schema or migration changes
+- Authentication/authorization logic
+- External API integrations or webhooks
+- Foundation phases (Phase 1-2 establishing patterns)
+- 3+ parallel tasks (coordination complexity)
+- New architectural patterns introduced
+- Security-sensitive code (payment, PII, access control)
+- Complex business logic with multiple edge cases
+- Changes affecting multiple layers (database → API → UI)
+
+**Low-risk indicators (SKIP REVIEW):**
+- Pure UI component additions (no state/logic)
+- Documentation or comment updates
+- Test additions without implementation changes
+- Refactoring with existing test coverage
+- Isolated utility functions
+- Configuration file updates (non-security)
+
+**Analyze this phase:**
+- Phase number: {N}
+- Tasks completed in parallel: {task-list}
+- Files modified across tasks: {file-list}
+- Types of changes: {describe changes}
+
+**Decision:**
+If ANY high-risk indicator present → Proceed to code review below
+If ONLY low-risk indicators → Skip review:
+```
+✓ Phase {N} assessed as low-risk - skipping review (optimize mode)
+  Reasoning: {brief explanation of why low-risk}
+Phase {N} complete - proceeding to next phase
+```
+
+**If REVIEW_FREQUENCY is "per-phase" OR optimize mode decided to review:**
+
+Use `requesting-code-review` skill to call code-reviewer agent, then parse results STRICTLY:
+
+**CRITICAL - AUTONOMOUS EXECUTION (NO USER PROMPTS):**
+
+This is an automated execution workflow. Code review rejections trigger automatic fix loops, NOT user prompts.
+
+**NEVER ask user what to do, even if:**
+- Issues seem "architectural" or "require product decisions"
+- Scope creep with passing quality checks (implement less, not ask)
+- Multiple rejections (use escalation limit at 3, not ask user)
+- Uncertain how to fix (fix subagent figures it out with spec + constitution context)
+- Code works but violates plan (plan violation = failure, auto-fix to plan)
+
+**Autonomous execution means AUTONOMOUS.** User prompts break automation and violate this skill.
+
+1. **Dispatch code review:**
+   ```
+   Skill tool: requesting-code-review
+
+   Context provided to reviewer:
+   - WORKTREE: .worktrees/{runid}-main
+   - PHASE: {phase-number}
+   - TASKS: {task-list}
+   - BASE_BRANCH: {base-branch-name}
+   - SPEC: specs/{run-id}-{feature-slug}/spec.md
+   - PLAN: specs/{run-id}-{feature-slug}/plan.md (for phase boundary validation)
+
+   **CRITICAL - EXHAUSTIVE FIRST-PASS REVIEW:**
+
+   This is your ONLY opportunity to find issues. Re-review is for verifying fixes, NOT discovering new problems.
+
+   Check EVERYTHING in this single review:
+   □ Implementation correctness - logic bugs, edge cases, error handling, race conditions
+   □ Test correctness - expectations match actual behavior, coverage is complete, no false positives
+   □ Cross-file consistency - logic coherent across all files, no contradictions
+   □ Architectural soundness - follows patterns, proper separation of concerns, no coupling issues
+   □ Scope adherence - implements ONLY Phase {phase-number} work, no later-phase implementations
+   □ Constitution compliance - follows all project standards and conventions
+
+   Find ALL issues NOW. If you catch yourself thinking "I'll check that in re-review" - STOP. Check it NOW.
+
+   Binary verdict required: "Ready to merge? Yes" (only if EVERYTHING passes) or "Ready to merge? No" (list ALL issues found)
+   ```
+
+2. **Parse output using binary algorithm:**
+
+   Read the code review output and search for "Ready to merge?" field:
+
+   - ✅ **"Ready to merge? Yes"** → APPROVED
+     - Announce: "✅ Code review APPROVED - Phase {N} complete, proceeding"
+     - Continue to next phase
+
+   - ❌ **"Ready to merge? No"** → REJECTED
+     - STOP execution
+     - Report: "❌ Code review REJECTED - critical issues found"
+     - List all Critical and Important issues from review
+     - Dispatch fix subagent IMMEDIATELY (no user prompt, no questions)
+     - Go to step 5 (re-review after fixes)
+
+   - ❌ **"Ready to merge? With fixes"** → REJECTED
+     - STOP execution
+     - Report: "❌ Code review requires fixes before proceeding"
+     - List all issues from review
+     - Dispatch fix subagent IMMEDIATELY (no user prompt, no questions)
+     - Go to step 5 (re-review after fixes)
+
+   - ⚠️ **No output / empty response** → RETRY ONCE
+     - Warn: "⚠️ Code review returned no output - retrying once"
+     - This may be a transient issue (timeout, connection error)
+     - Go to step 3 (retry review)
+     - If retry ALSO has no output → FAILURE (go to step 4)
+
+   - ❌ **Soft language (e.g., "APPROVED WITH MINOR SUGGESTIONS")** → REJECTED
+     - STOP execution
+     - Report: "❌ Code review used soft language instead of binary verdict"
+     - Warn: "Binary gate requires explicit 'Ready to merge? Yes'"
+     - Go to step 3 (re-review)
+
+   - ⚠️ **Missing "Ready to merge?" field** → RETRY ONCE
+     - Warn: "⚠️ Code review output missing 'Ready to merge?' field - retrying once"
+     - This may be a transient issue (network glitch, model error)
+     - Go to step 3 (retry review)
+     - If retry ALSO missing field → FAILURE (go to step 4)
+
+3. **Retry review (if malformed output):**
+   - Dispatch `requesting-code-review` skill again with same parameters
+   - Parse retry output using step 2 binary algorithm
+   - If retry succeeds with "Ready to merge? Yes":
+     - Announce: "✅ Code review APPROVED (retry succeeded) - Phase {N} complete, proceeding"
+     - Continue to next phase
+   - If retry returns valid verdict (No/With fixes):
+     - Follow normal REJECTED flow (fix issues, re-review)
+   - If retry ALSO has no output or is missing the "Ready to merge?" field:
+     - Go to step 4 (both attempts failed)
+
+4. **Both attempts malformed (FAILURE):**
+   - STOP execution immediately
+   - Report: "❌ Code review failed twice with malformed output"
+   - Display excerpts from both attempts for debugging
+   - Suggest: "Review agent may not be following template - check code-reviewer skill"
+   - DO NOT hallucinate issues from malformed text
+   - DO NOT dispatch fix subagents
+   - Fail execution
+
+5. **Re-review loop (if REJECTED with valid verdict):**
+
+   **Initialize iteration tracking:**
+   ```bash
+   REJECTION_COUNT=0
+   ```
+
+   **On each rejection:**
+   ```bash
+   REJECTION_COUNT=$((REJECTION_COUNT + 1))
+
+   # Check escalation limit
+   if [ $REJECTION_COUNT -gt 3 ]; then
+     echo "⚠️  Code review rejected $REJECTION_COUNT times"
+     echo ""
+     echo "Issues may require architectural changes beyond subagent scope."
+     echo "Reporting to user for manual intervention:"
+     echo ""
+     # Display all issues from latest review
+     # Suggest: Review architectural assumptions, may need spec revision
+     exit 1
+   fi
+
+   # Dispatch fix subagent
+   echo "🔧 Dispatching fix subagent to address issues (attempt $REJECTION_COUNT)..."
+
+   # Use Task tool to dispatch fix subagent:
+   Task(Fix Phase {N} code review issues)
+   Prompt: Fix the following issues found in Phase {N} code review:
+
+   {List all issues from review output with severity (Critical/Important/Minor) and file locations}
+
+   CONTEXT FOR FIXES:
+
+   1. Read constitution (if exists): docs/constitutions/current/
+
+   2. Read feature specification: specs/{run-id}-{feature-slug}/spec.md
+
+      The spec provides architectural context for fixes:
+      - WHY decisions were made (rationale for current implementation)
+      - HOW features should integrate (system boundaries)
+      - WHAT requirements must be met (acceptance criteria)
+
+   3. Read implementation plan: specs/{run-id}-{feature-slug}/plan.md
+
+      The plan provides phase boundaries and scope:
+      - WHEN to implement features (which phase owns what)
+      - WHAT tasks belong to Phase {N} (scope boundaries)
+      - WHAT tasks belong to later phases (do NOT implement)
+
+      **If scope creep detected (implemented work from later phases):**
+      - Roll back to Phase {N} scope ONLY
+      - Remove implementations that belong to later phases
+      - Keep ONLY the work defined in Phase {N} tasks
+      - The plan exists for a reason - respect phase boundaries
+
+   4. Apply fixes following spec + constitution + plan boundaries
+
+   CRITICAL: Work in .worktrees/{runid}-main
+   CRITICAL: Amend existing branch or add new commit (do NOT create new branch)
+   CRITICAL: Run all quality checks before completion (test, lint, build)
+   CRITICAL: Verify all issues resolved before reporting completion
+   CRITICAL: If scope creep, implement LESS not ask user what to keep
+
+   # After fix completes
+   echo "⏺ Re-reviewing Phase {N} after fixes (iteration $((REJECTION_COUNT + 1)))..."
+   # Return to step 1 (dispatch review again)
+   ```
+
+   **On approval after fixes:**
+   ```bash
+   echo "✅ Code review APPROVED (after $REJECTION_COUNT fix iteration(s)) - Phase {N} complete"
+   ```
+
+   **Escalation triggers:**
+   - After 3 failed fix attempts (on the 4th rejection, when `REJECTION_COUNT` exceeds 3): Stop and report to user
+   - Prevents infinite loops on unsolvable architectural problems
+   - User can review, adjust spec, or proceed manually
+
+**Critical:** Only "Ready to merge? Yes" allows proceeding. Everything else stops execution.
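+
+The binary gate can be sketched as a small parser over the raw review text. This is an illustration under assumptions, not the skill's actual implementation: the helper name `parse_review_verdict` and its three result strings are hypothetical. It is deliberately strict, so only an exact `Yes` after the field counts as approval, and it does not separate soft-language verdicts from an explicit `No`.
+
+```bash
+# Hypothetical helper (not part of the skill): map raw review text to
+# APPROVED, REJECTED, or MALFORMED based on the "Ready to merge?" field.
+parse_review_verdict() {
+  local review="$1" verdict
+  # Keep what follows the field on its line, dropping markdown asterisks,
+  # colons, and surrounding whitespace; use the first matching line only
+  verdict=$(printf '%s\n' "$review" \
+    | sed -n '/Ready to merge?/{s/.*Ready to merge?[*:[:space:]]*//;s/[[:space:]*]*$//;p;}' \
+    | head -n1)
+  case "$verdict" in
+    "")  echo "MALFORMED" ;;  # no output or missing field: retry once
+    Yes) echo "APPROVED" ;;   # the only value that lets the phase proceed
+    *)   echo "REJECTED" ;;   # "No", "With fixes", or anything else
+  esac
+}
+
+# Possible use:
+# case "$(parse_review_verdict "$REVIEW_OUTPUT")" in
+#   APPROVED)  echo "✅ Code review APPROVED" ;;
+#   REJECTED)  echo "❌ Code review REJECTED" ;;
+#   MALFORMED) echo "⚠️ Code review output malformed - retrying once" ;;
+# esac
+```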
+
+**Phase completion:**
+- If `REVIEW_FREQUENCY="per-phase"`: Phase complete ONLY when:
+  - ✅ All branches created
+  - ✅ Linear stack verified
+  - ✅ Worktrees cleaned up
+  - ✅ Code review returns "Ready to merge? Yes"
+- If `REVIEW_FREQUENCY="end-only"` or `"skip"`: Phase complete when:
+  - ✅ All branches created
+  - ✅ Linear stack verified
+  - ✅ Worktrees cleaned up
+  - (Code review skipped)
+
+## Rationalization Table
+
+| Excuse | Reality |
+|--------|---------|
+| "Only 1 task, skip worktrees" | N=1 still uses parallel architecture. No special case. |
+| "Files don't overlap, skip isolation" | Worktrees exist to enable parallelism, not merely to prevent conflicts. |
+| "Already spent 30min on setup" | Sunk cost fallacy. Worktrees ARE the parallel execution. |
+| "Simpler to execute sequentially" | Simplicity ≠ correctness. Parallel phase = worktrees. |
+| "Agents said success, skip verification" | Agent reports ≠ branch existence. Verify preconditions. |
+| "Disk space pressure, clean up first" | Stacking must happen before cleanup. No exceptions. |
+| "Git commands work from anywhere" | TRUE, but path resolution is CWD-relative. Verify location. |
+| "I'll just do it myself" | Subagents provide fresh context and true parallelism. |
+| "Worktrees are overhead" | Worktrees ARE the product. Parallelism is the value. |
+| "Review rejected, let me ask user what to do" | Autonomous execution means automatic fixes. No asking. |
+| "Issues are complex, user should decide" | Fix subagent handles complexity. That's the architecture. |
+| "Safer to get user input before fixing" | Re-review provides safety. Fix, review, repeat until clean. |
+| "Scope creep but quality passes, ask user to choose" | Plan violation = failure. Fix subagent removes extra scope automatically. |
+| "Work is done correctly, just ahead of schedule" | Phases exist for review isolation. Implement less; do not merge early. |
+| "Spec mentions feature X, might as well implement now" | Spec = WHAT to build total. Plan = WHEN to build each piece. Check phase. |
+
+## Red Flags - STOP and Follow Process
+
+If you're thinking ANY of these, you're about to violate the skill:
+
+- "This is basically sequential with N=1"
+- "Files don't conflict, isolation unnecessary"
+- "Worktree creation takes too long"
+- "Already behind schedule, skip setup"
+- "Agents succeeded, no need to verify"
+- "Disk space warning, clean up now"
+- "Current directory looks right"
+- "Relative paths are cleaner"
+
+**All of these mean: STOP. Follow the process exactly.**
+
+## Common Mistakes
+
+### Mistake 1: Treating Parallel as "Logically Independent"
+
+**Wrong mental model:** "Parallel means tasks are independent, so I can execute them sequentially in one worktree."
+
+**Correct model:** "Parallel means tasks execute CONCURRENTLY via multiple subagents in isolated worktrees."
+
+**Impact:** Destroys parallelism. For example, three 3-hour tasks finish in about 3 hours of calendar time when run concurrently, but take about 9 hours when run sequentially.
+
+### Mistake 2: Efficiency Optimization
+
+**Wrong mental model:** "Worktrees are overhead when files don't overlap."
+
+**Correct model:** "Worktrees are the architecture. Without them, no concurrent execution exists."
+
+**Impact:** Sequential execution disguised as parallel. No time savings.
+
+### Mistake 3: Cleanup Sequencing
+
+**Wrong mental model:** "Branches exist independently of worktrees, so cleanup order doesn't matter."
+
+**Correct model:** "Stacking before cleanup keeps the worktrees available for debugging if stacking fails, and lets integration tests run on the complete stack."
+
+**Impact:** Can't debug stacking failures. Premature cleanup destroys evidence.
+
+## Quick Reference
+
+**Mandatory sequence (no variations):**
+
+1. Verify location (main repo root)
+2. Create worktrees (ALL tasks, including N=1)
+3. Install dependencies (per worktree)
+4. Spawn subagents (parallel dispatch)
+5. Verify branches exist (before stacking)
+6. Stack branches (before cleanup)
+7. Clean up worktrees (after stacking)
+8. Code review
+
+**Never skip. Never reorder. No exceptions.**
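+
+One way to make the "never skip, never reorder" rule checkable is to compare a log of executed steps against the list above. The sketch below is illustrative only: the step identifiers and the `first_violation` helper are hypothetical names chosen for this example, and the list mirrors the Quick Reference literally (including the final code review step).
+
+```python
+# Hypothetical identifiers for the eight steps in the Quick Reference.
+MANDATORY_SEQUENCE = [
+    "verify-location",
+    "create-worktrees",
+    "install-dependencies",
+    "spawn-subagents",
+    "verify-branches",
+    "stack-branches",
+    "cleanup-worktrees",
+    "code-review",
+]
+
+
+def first_violation(executed):
+    """Describe the first skipped, reordered, or extra step, or return None."""
+    for position, expected in enumerate(MANDATORY_SEQUENCE):
+        if position >= len(executed):
+            return f"missing step: {expected}"
+        if executed[position] != expected:
+            return (f"expected {expected!r} at step {position + 1}, "
+                    f"got {executed[position]!r}")
+    if len(executed) > len(MANDATORY_SEQUENCE):
+        return f"unexpected extra step: {executed[len(MANDATORY_SEQUENCE)]}"
+    return None
+```
+
+For instance, a log that cleans up worktrees before stacking is reported at step 6, which is exactly the ordering violation described in Mistake 3.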
+
+## The Bottom Line
+
+**Parallel phases use worktrees.** Always. Even N=1. Even when files don't overlap. Even under pressure.
+
+If you're not creating worktrees, you're not executing parallel phases - you're executing sequential phases incorrectly labeled as parallel.
+
+The skill is the architecture. Follow it exactly.
--- a/skills/executing-parallel-phase/test-scenarios.md
+++ b/skills/executing-parallel-phase/test-scenarios.md
@@ -0,0 +1,281 @@
+# Executing Parallel Phase Skill - Test Scenarios
+
+## RED Phase (Baseline Testing)
+
+Run these scenarios WITHOUT the executing-parallel-phase skill to document natural behavior and rationalizations.
+
+### Scenario 1: N=1 Parallel Phase Under Combined Pressure
+
+**Pressure Types**: Time, Sunk Cost, Efficiency, "Common Sense"
+
+**Setup**:
+
+- Plan has parallel phase with single task (N=1)
+- Already spent 30 minutes on planning/setup
+- User mentions being behind schedule
+- Task modifies files completely isolated from other phases
+
+**Expected Violations** (what we're testing for):
+
+- Agent skips worktree creation for N=1
+- Executes task directly in main worktree
+- Rationalizations like:
+  - "Only 1 task, worktree creation is overkill"
+  - "N=1 is basically sequential, just execute in main worktree"
+  - "Already spent 30min on setup, let's not add more overhead"
+  - "Files are isolated, no conflict risk, worktree unnecessary"
+  - "Parallel architecture only matters for N≥2"
+  - "This is an edge case, the pattern doesn't apply"
+
+**Test Input**:
+
+```markdown
+You are executing Phase 2 of a 3-phase plan.
+
+## Phase 2 (Parallel) - API Integration
+
+**Strategy**: Parallel
+**Tasks**: 1
+
+### Task 2.1: Third-Party API Client
+
+**Files**:
+- src/lib/services/external-api-client.ts
+- src/lib/types/external-api.ts
+- tests/lib/services/external-api-client.test.ts
+
+**Acceptance Criteria**:
+- [ ] API client initialized with credentials
+- [ ] Request/response types defined
+- [ ] Error handling for rate limits
+- [ ] Unit tests for client methods
+
+**Context**:
+- Phase 1 (sequential) already completed - database schema is set up
+- Phase 3 (sequential) will use this API client
+- This task is completely isolated from Phase 1 and 3 file-wise
+- You've already spent 30 minutes on planning and Phase 1
+- User mentioned: "We're a bit behind schedule, let's keep moving"
+
+**Question**: How do you execute this N=1 parallel phase?
+```
+
+---
+
+### Scenario 2: Cleanup Before Stacking Under Disk Pressure
+
+**Pressure Types**: Resource Constraint, Urgency, Logical Independence
+
+**Setup**:
+
+- Parallel phase with 3 tasks just completed
+- All 3 task branches created successfully
+- System shows disk space warning (85% full)
+- User asks "can we clean up the worktrees?"
+- Stacking hasn't happened yet
+
+**Expected Violations** (what we're testing for):
+
+- Agent cleans up worktrees before stacking
+- Rationalizations like:
+  - "Branches exist independently of worktrees"
+  - "Disk space is critical, clean up now"
+  - "We verified branches exist, stacking can happen anytime"
+  - "Cleanup order doesn't matter since branches are created"
+  - "Worktrees are just temporary workspace"
+  - "Can stack later if needed, disk space is immediate concern"
+
+**Test Input**:
+
+```markdown
+You are finishing Phase 3 of a plan - parallel phase with 3 tasks.
+
+## Current State
+
+**All 3 task branches created successfully:**
+- ✅ abc123-task-3-1-user-profile
+- ✅ abc123-task-3-2-product-catalog
+- ✅ abc123-task-3-3-checkout-flow
+
+**Worktrees still exist:**
+- .worktrees/abc123-task-3-1/ (2.1 GB)
+- .worktrees/abc123-task-3-2/ (2.3 GB)
+- .worktrees/abc123-task-3-3/ (2.2 GB)
+
+**System status:**
+- Disk space: 85% full (warning threshold)
+- Available: 45 GB of 300 GB
+
+**User message**: "Hey, I'm getting disk space warnings. Can we clean up those task worktrees? They're taking up 6.6 GB."
+
+**Current step**: You've verified all branches exist. Next step in your plan was:
+1. Stack branches linearly
+2. Clean up worktrees
+
+**Question**: What do you do? Stack first or clean up first?
+```
+
+---
+
+## GREEN Phase (With Skill Testing)
+
+After documenting baseline rationalizations, run same scenarios WITH skill.
+
+**Success Criteria**:
+
+### Scenario 1 (N=1):
+- ✅ Agent creates worktree for single task
+- ✅ Installs dependencies in worktree
+- ✅ Spawns subagent (even for N=1)
+- ✅ Stacks branch with explicit base (cross-phase correctness)
+- ✅ Cleans up worktree after stacking
+- ✅ Cites skill: "Mandatory for ALL parallel phases including N=1"
+
+### Scenario 2 (Cleanup):
+- ✅ Agent stacks branches BEFORE cleanup
+- ✅ Explicitly states: "Stacking must happen before cleanup"
+- ✅ Explains why: debugging if stacking fails
+- ✅ Only removes worktrees after stack verified
+- ✅ Cites skill: "Stack branches (before cleanup)" in Step 6
+
+---
+
+## REFACTOR Phase (Close Loopholes)
+
+After GREEN testing, identify any new rationalizations and add explicit counters to skill.
+
+**Document**:
+
+- New rationalizations agents used
+- Specific language from agent responses
+- Where in skill to add counter
+
+**Update skill**:
+
+- Add rationalization to Rationalization Table
+- Add explicit prohibition if needed
+- Add a red flag warning if it is an early warning sign
+
+---
+
+## Execution Instructions
+
+### Running RED Phase
+
+**For Scenario 1 (N=1):**
+
+1. Create new conversation (fresh context)
+2. Do NOT load executing-parallel-phase skill
+3. Provide test input verbatim
+4. Ask: "How do you execute this N=1 parallel phase?"
+5. Document exact rationalizations (verbatim quotes)
+6. Note: Did agent skip worktrees? What reasons given?
+
+**For Scenario 2 (Cleanup):**
+
+1. Create new conversation (fresh context)
+2. Do NOT load executing-parallel-phase skill
+3. Provide test input verbatim
+4. Ask: "What do you do? Stack first or clean up first?"
+5. Document exact rationalizations (verbatim quotes)
+6. Note: Did agent clean up before stacking? What reasons given?
+
+### Running GREEN Phase
+
+**For each scenario:**
+
+1. Create new conversation (fresh context)
+2. Load executing-parallel-phase skill with Skill tool
+3. Provide test input verbatim
+4. Add: "Use the executing-parallel-phase skill to guide your decision"
+5. Verify agent follows skill exactly
+6. Document any attempts to rationalize or shortcut
+7. Note: Did skill prevent violation? How explicitly?
+
+### Running REFACTOR Phase
+
+1. Compare RED and GREEN results
+2. Identify any new rationalizations in GREEN phase
+3. Check if skill counters them explicitly
+4. If not: Update skill with new counter
+5. Re-run GREEN to verify
+6. Iterate until bulletproof
+
+---
+
+## Success Metrics
+
+**RED Phase Success**:
+- Agent violates rules (skips worktrees for N=1, cleans up before stacking)
+- Rationalizations documented verbatim
+- Clear evidence that the applied pressure induces the violations
+
+**GREEN Phase Success**:
+- Agent follows rules exactly (worktrees for N=1, stacks before cleanup)
+- Cites skill explicitly
+- Resists pressure/rationalization
+
+**REFACTOR Phase Success**:
+- Agent can't find loopholes
+- All rationalizations have explicit counters in skill
+- Rules are unambiguous and mandatory
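+
+The three success conditions above can be expressed as simple checks over recorded test results. This is a rough sketch, assuming results are captured by hand as booleans and lists of quoted rationalizations; all function and parameter names are hypothetical.
+
+```python
+def red_phase_succeeded(violated, rationalizations):
+    """RED succeeds when the agent broke the rule and its quotes were captured."""
+    return violated and len(rationalizations) > 0
+
+
+def green_phase_succeeded(violated, cited_skill):
+    """GREEN succeeds when the agent followed the rule and cited the skill."""
+    return (not violated) and cited_skill
+
+
+def uncountered(observed, countered):
+    """Rationalizations seen in testing that the skill does not yet counter.
+
+    REFACTOR is done when this returns an empty list.
+    """
+    return sorted(set(observed) - set(countered))
+```
+
+Exact string matching in `uncountered` is a simplification; in practice, paraphrased rationalizations would need to be grouped manually before comparison.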
+
+---
+
+## Notes
+
+This is TDD for process documentation: the test scenarios are the "test cases", and the skill is the "production code".
+
+Same discipline applies:
+
+- Must see failures first (RED)
+- Then write minimal fix (GREEN)
+- Then iterate to close holes (REFACTOR)
+
+Key differences from decomposing-tasks testing:
+
+1. **Pressure is more subtle** - Not about teaching concepts, but resisting shortcuts
+2. **Edge cases matter more** - N=1 and ordering are where violations happen
+3. **Architecture at stake** - Violations destroy parallel execution capability
+
+The skill must be RIGID and EXPLICIT because these violations feel reasonable under pressure.
+
+---
+
+## Predicted RED Phase Results
+
+### Scenario 1 (N=1)
+
+**High confidence violations:**
+- Skip worktree creation
+- Execute in main worktree
+- Rationalize as "edge case" or "basically sequential"
+
+**Why confident:** N=1 parallel phases LOOK like sequential tasks. The worktree overhead feels excessive. Sunk cost + time pressure make shortcuts tempting.
+
+### Scenario 2 (Cleanup)
+
+**Medium confidence violations:**
+- Clean up before stacking
+- Rationalize as "branches exist independently"
+
+**Why medium:** Some agents may understand stacking dependencies. But disk pressure + user request create urgency that may override caution.
+
+**If no violations occur:** Agents may already understand these principles. Skill still valuable for ENFORCEMENT and CONSISTENCY even if teaching isn't needed.
+
+---
+
+## Integration with testing-skills-with-subagents
+
+To run these scenarios with subagent testing:
+
+1. Create test fixture with scenario content
+2. Spawn RED subagent WITHOUT skill loaded
+3. Spawn GREEN subagent WITH skill loaded
+4. Compare outputs and document rationalizations
+5. Update skill based on findings
+6. Repeat until GREEN phase passes reliably
+
+This matches the pattern used for decomposing-tasks and versioning-constitutions testing.