Initial commit

This commit is contained in:
Zhongwei Li
2025-11-29 18:09:26 +08:00
commit 71330f5583
76 changed files with 15081 additions and 0 deletions

View File

@@ -0,0 +1,242 @@
---
name: Validating Review Feedback
description: Validate code review feedback against implementation plan to prevent scope creep and derailment
when_to_use: when orchestrating plan execution with code review checkpoints, after receiving review feedback, before dispatching fixes to agents
related_skills: conducting-code-review
related_practices: code-review.md
version: 1.0.0
---
# Validating Review Feedback
## Overview
When orchestrating plan execution, code review feedback must align with the plan's goals. This workflow validates BLOCKING feedback against the plan, gets user decisions on misalignments, and annotates the review file to guide fixing agents.
**Core principle:** User decides scope changes, not the agent. Validate → Ask → Annotate.
**Announce at start:** "I'm using the validating-review-feedback skill to validate this review against the plan."
## The Workflow
### Phase 1: Parse Review Feedback
**Step 1: Read review file**
- Path provided by orchestrator
- Expected format: BLOCKING and NON-BLOCKING sections
**Step 2: Extract items**
- Parse all BLOCKING items into list
- Parse all NON-BLOCKING items (for awareness only)
- Preserve original wording and line numbers
### Phase 2: Validate Against Plan
**Step 1: Read plan file**
- Path provided by orchestrator
- Understand original scope and goals
**Step 2: Categorize each BLOCKING item**
For each BLOCKING item, determine:
- **In-scope**: Required by plan OR directly supports plan goals OR fixes bugs introduced while implementing plan
- **Out-of-scope**: Would require work beyond current plan (new features, refactoring unrelated code, performance optimizations not in plan)
- **Unclear**: Needs user judgment (edge cases, architectural decisions, ambiguous recommendations)
**Step 3: Document reasoning**
For each categorization, note brief reasoning:
- In-scope: "Task 3 requires auth validation"
- Out-of-scope: "SRP refactoring not in plan scope"
- Unclear: "Review recommends documentation alternative - needs user decision"
**Note on NON-BLOCKING items:** All NON-BLOCKING items are automatically marked [DEFERRED] without user consultation (see Phase 4 Step 3), as they are by definition not required for merge. User can choose to address them in a follow-up or ignore them entirely.
### Phase 3: Present Misalignments to User
**When:** Any BLOCKING items categorized as out-of-scope or unclear
**Step 1: Show misaligned items**
For each misaligned item:
```
BLOCKING Item: [exact text from review]
Categorization: [Out-of-scope / Unclear]
Reasoning: [why it doesn't clearly align with plan]
```
**Step 2: Ask user about each item**
Use AskUserQuestion for each misaligned BLOCKING item:
```
Question: "Should we address this BLOCKING issue in the current scope?"
Options:
- "[FIX] Yes, fix now" (Add to current scope)
- "[WONTFIX] No, reject feedback" (User disagrees with review)
- "[DEFERRED] Defer to follow-up" (Valid but out of scope)
```
**Step 3: Check for plan revision**
If user selected [DEFERRED] for multiple items or items seem interconnected:
- Ask: "Do you want to revise the plan to accommodate these deferred items?"
- If yes: Set `plan_revision_needed` flag
### Phase 4: Annotate Review File
**Step 1: Add tags to BLOCKING items**
For each BLOCKING item in original review file:
- Prepend `[FIX]` if in-scope or user approved
- Prepend `[WONTFIX]` if user rejected
- Prepend `[DEFERRED]` if user deferred
**Step 2: Add clarifying notes**
For each tagged item, add Gatekeeper note explaining categorization:
- `(Gatekeeper: In-scope - {reasoning})`
- `(Gatekeeper: Out-of-scope - {reasoning})`
- `(Gatekeeper: User approved - {decision})`
**Step 3: Tag all NON-BLOCKING items**
- Prepend `[DEFERRED]` to all NON-BLOCKING items
- Add note: "(Gatekeeper: NON-BLOCKING items deferred by default)"
**Step 4: Write annotated review**
Save back to same review file path with annotations.
Example annotated review:
```markdown
# Code Review - 2025-10-19
## Summary
Found 3 BLOCKING issues and 2 NON-BLOCKING suggestions.
## BLOCKING (Must Fix Before Merge)
### [FIX] Security vulnerability in auth endpoint
Missing input validation on user-provided email parameter allows potential injection attacks.
(Gatekeeper: In-scope - required by Task 2 auth implementation)
### [DEFERRED] SRP violation in data processing module
The processUserData function handles both validation and database writes.
(Gatekeeper: Out-of-scope - refactoring not in current plan)
### [FIX] Missing tests for preference storage
No test coverage for the new user preference persistence logic.
(Gatekeeper: In-scope - Task 3 requires test coverage)
## NON-BLOCKING (Can Be Deferred)
(Gatekeeper: All NON-BLOCKING items deferred by default)
### [DEFERRED] Ambiguous variable name in utils
The variable 'data' in formatUserData could be more descriptive like 'userData'.
### [DEFERRED] Consider extracting magic number
The timeout value 5000 appears in multiple places.
```
### Phase 5: Update Plan with Deferred Items
**When:** Any items marked [DEFERRED]
**Step 1: Check if plan has Deferred section**
- Read plan file
- Look for `## Deferred Items` section
**Step 2: Create or append to Deferred section**
Add to end of plan file:
```markdown
---
## Deferred Items
Items deferred during code review - must be reviewed before merge.
### From Batch N Review ({review-filename})
- **[DEFERRED]** {Item description}
- Source: Task X
- Severity: BLOCKING or NON-BLOCKING
- Reason: {why deferred}
```
**Step 3: Save updated plan**
Write plan file with deferred items section.
### Phase 6: Return Summary
Provide summary to orchestrator:
```
Validation complete:
- {N} BLOCKING items marked [FIX] (ready for fixing agent)
- {N} BLOCKING items marked [DEFERRED] (added to plan)
- {N} BLOCKING items marked [WONTFIX] (rejected by user)
- {N} NON-BLOCKING items marked [DEFERRED]
- Plan revision needed: {yes/no}
Annotated review saved to: {review-file-path}
Plan updated with deferred items: {plan-file-path}
```
## Key Principles
| Principle | Application |
|-----------|-------------|
| **User decides scope** | Never auto-approve out-of-scope items, always ask |
| **Annotate in place** | Modify review file with tags, don't create new files |
| **Track deferrals** | All deferred items must go in plan's Deferred section |
| **Clear communication** | Tags ([FIX]/[WONTFIX]/[DEFERRED]) guide fixing agent |
| **No silent filtering** | User must explicitly decide on every misalignment |
## Error Handling
**Missing required inputs (plan or review path):**
- Error immediately with clear message: "Cannot validate without both plan file path and review file path"
- Do not attempt to proceed with partial inputs
**No BLOCKING items found:**
- Still tag all NON-BLOCKING as [DEFERRED]
- Return summary indicating clean review
**User marks all BLOCKING as [WONTFIX]:**
- Annotate review accordingly
- Return to orchestrator with plan_revision_needed suggestion
- Orchestrator should pause and ask user about plan validity
**Plan file not found:**
- Error immediately
- Cannot validate without plan context
**Review file not parseable:**
- Error immediately
- Show user the review file format issue
## Integration
**Called by:**
- Gatekeeper agent (enforces this workflow)
- /execute command (via Gatekeeper dispatch)
**Requires:**
- Plan file path (from orchestrator)
- Review file path (from code-review-agent agent)
**Produces:**
- Annotated review file (with [FIX]/[WONTFIX]/[DEFERRED] tags)
- Updated plan file (with Deferred Items section)
- Summary for orchestrator
## Test Scenarios
See `test-scenarios.md` for baseline and with-skill tests proving this workflow prevents scope creep and misinterpretation.

View File

@@ -0,0 +1,323 @@
# Test Scenarios: Validating Review Feedback
## Baseline Test: Agent Misinterprets Review Recommendations
**Goal:** Prove that without gatekeeper, agents misinterpret code review feedback as permission to skip BLOCKING issues.
### Setup
**Mock plan:**
```markdown
# Orbital Mechanics Feature
## Task 1: Add ArrivalBurn state
- Implement state transition
- Calculate burn parameters
## Task 2: Add tests for ArrivalBurn
- Unit tests for state logic
- Integration tests for burn calculations
```
**Mock review (from code-review-agent):**
```markdown
# Code Review - Batch 2
## BLOCKING (Must Fix Before Merge)
### No Lambert recalculation at ArrivalBurn entry
The current implementation assumes Lambert solution from transfer is still valid when entering ArrivalBurn state. For long transfers, this can become stale.
**Recommendation:** Option B - Add documentation explaining why recalculation is skipped for this iteration. Include TODO for future enhancement.
### Missing unit tests for state transitions
No test coverage for ArrivalBurn state entry/exit logic.
## NON-BLOCKING (Can Be Deferred)
### Variable naming: 'data' is too generic
Consider renaming to 'burnParameters' for clarity.
```
### Dispatch rust-agent WITHOUT gatekeeper
**Prompt:**
```
Fix all BLOCKING issues found in Batch 2 code review.
Review file: {path-to-review}
Plan file: {path-to-plan}
```
### Expected Agent Failure
**Agent reasoning (flawed):**
1. Sees "Recommendation: Option B"
2. Interprets as "skip recalculation, documentation is enough"
3. Thinks "review explained it, I don't need to add docs"
4. Reports: "✅ All blocking issues resolved"
**What agent actually fixes:**
- ✅ Missing unit tests (clearly actionable)
- ❌ Lambert recalculation (skipped entirely, no documentation added)
**Why this fails:**
- BLOCKING = must resolve (either implement OR document, not skip)
- Agent confused "solution suggestion" with "permission to ignore"
- No checkpoint to validate interpretation
### Success Criteria for Baseline
❌ Agent skips BLOCKING item entirely
❌ No user consultation on ambiguous recommendation
❌ Reports completion despite unresolved BLOCKING issue
**Baseline proves:** Without gatekeeper, agents misinterpret review feedback and skip BLOCKING issues when recommendations suggest alternatives.
---
## With-Skill Test: Gatekeeper Enforces Resolution
**Goal:** Verify gatekeeper prevents the baseline failure by forcing explicit categorization and user decisions.
### Same Setup (plan + review from baseline)
### Dispatch gatekeeper agent BEFORE rust-agent
**Prompt:**
```
Validate review feedback against plan.
Plan file: {path-to-plan}
Review file: {path-to-review}
Batch: 2
```
### Expected Gatekeeper Behavior
**Phase 1: Parse review**
- Extract 2 BLOCKING items
- Extract 1 NON-BLOCKING item
**Phase 2: Validate against plan**
- Item 1 (Lambert recalculation):
- **Unclear**: Recommendation suggests documentation, but is that in scope?
- Plan says "calculate burn parameters" but doesn't specify recalculation strategy
- Item 2 (missing tests):
- **In-scope**: Task 2 explicitly requires tests
**Phase 3: Present misalignment to user**
Uses AskUserQuestion:
```
Question: "Should we address this BLOCKING issue in the current scope?"
BLOCKING Item: No Lambert recalculation at ArrivalBurn entry
Categorization: Unclear
Reasoning: Plan requires burn calculations but doesn't specify recalculation strategy. Review recommends documentation alternative.
Options:
- [FIX] Yes, fix now (add documentation as recommended)
- [WONTFIX] No, reject feedback
- [DEFERRED] Defer to follow-up
```
**User chooses: [FIX]** (accepts recommendation to document)
**Phase 4: Annotate review**
Gatekeeper modifies review file:
```markdown
## BLOCKING (Must Fix Before Merge)
### [FIX] No Lambert recalculation at ArrivalBurn entry
The current implementation assumes Lambert solution from transfer is still valid when entering ArrivalBurn state. For long transfers, this can become stale.
**Recommendation:** Option B - Add documentation explaining why recalculation is skipped for this iteration. Include TODO for future enhancement.
(Gatekeeper: In-scope - user approved Option B documentation approach)
### [FIX] Missing unit tests for state transitions
No test coverage for ArrivalBurn state entry/exit logic.
(Gatekeeper: In-scope - Task 2 requires tests)
## NON-BLOCKING (Can Be Deferred)
(Gatekeeper: All NON-BLOCKING items deferred by default)
### [DEFERRED] Variable naming: 'data' is too generic
Consider renaming to 'burnParameters' for clarity.
```
**Phase 5: Update plan**
No deferred items (user approved both BLOCKING as [FIX]).
**Phase 6: Return summary**
```
Gatekeeper Validation Complete - Batch 2
BLOCKING Items:
- 2 marked [FIX] (Lambert documentation, unit tests)
- 0 marked [DEFERRED]
- 0 marked [WONTFIX]
NON-BLOCKING Items:
- 1 marked [DEFERRED] (variable naming)
Plan Status:
- Deferred items added: no
- Plan revision needed: no
Files Updated:
- Annotated review: {review-path}
Next Steps for Orchestrator:
Proceed to rust-agent with annotated review. Fix ONLY [FIX] items.
```
### Dispatch rust-agent WITH annotated review
**Prompt:**
```
Fix ONLY items marked [FIX] in the annotated review.
Do NOT address items marked [DEFERRED] or [WONTFIX].
Review file: {path-to-annotated-review}
```
### Expected Agent Success
**Agent sees:**
- [FIX] Lambert recalculation → Add Option B documentation
- [FIX] Missing tests → Write unit tests
- [DEFERRED] Variable naming → SKIP
**Agent reasoning:**
1. Clear [FIX] tag = must address
2. Review includes "Option B documentation" recommendation
3. Implements: Add doc comment explaining no recalculation + TODO
4. Implements: Add unit tests
5. Reports: "✅ All [FIX] items resolved"
**What agent actually fixes:**
- ✅ Lambert recalculation (documentation added per Option B)
- ✅ Missing unit tests
- ⏭️ Variable naming (correctly skipped, marked [DEFERRED])
### Success Criteria
✅ Gatekeeper identifies unclear item (Lambert recalculation)
✅ Gatekeeper uses AskUserQuestion (not auto-deciding)
✅ User explicitly approves Option B approach
✅ Review annotated with [FIX] tags and clarifying notes
✅ Rust-engineer sees unambiguous instructions
✅ Both BLOCKING items resolved correctly
**With-skill proves:** Gatekeeper prevents misinterpretation by forcing explicit categorization and user validation of ambiguous feedback.
---
## Additional Test Scenario: Scope Creep Prevention
**Goal:** Verify gatekeeper blocks out-of-scope BLOCKING feedback from derailing plan.
### Setup
**Mock plan:**
```markdown
# Auth Feature Plan
## Task 1: Add basic username/password auth
- Login endpoint
- Password hashing
- Session creation
## Task 2: Add session validation middleware
- Check session on protected routes
- Return 401 if invalid
```
**Mock review:**
```markdown
# Code Review - 2025-10-19
## BLOCKING (Must Fix Before Merge)
### Security vulnerability: passwords stored in plain text
The current implementation stores passwords without hashing.
### SRP violation: auth handler does too much
The handleAuth function validates input, hashes passwords, creates sessions, and writes to DB. Should be split into separate functions.
### Missing tests for session validation
No test coverage for the middleware in Task 2.
## NON-BLOCKING (Can Be Deferred)
### Variable naming: 'data' is too generic
Consider renaming to 'userData' for clarity.
```
### Expected Gatekeeper Behavior
**Validation:**
- Security vulnerability → In-scope (Task 1 requires password hashing)
- SRP violation → **Out-of-scope** (plan doesn't mention code architecture refactoring)
- Missing tests → In-scope (Task 2 mentioned)
**User question for SRP violation:**
```
BLOCKING Item: SRP violation: auth handler does too much
Categorization: Out-of-scope
Reasoning: Plan focuses on basic auth implementation. Architectural refactoring not mentioned in plan scope.
Options:
- [FIX] Yes, fix now
- [WONTFIX] No, reject feedback
- [DEFERRED] Defer to follow-up
```
**User chooses: [DEFERRED]**
**Annotated review:**
```markdown
## BLOCKING (Must Fix Before Merge)
### [FIX] Security vulnerability: passwords stored in plain text
...
(Gatekeeper: In-scope - Task 1 requires password hashing)
### [DEFERRED] SRP violation: auth handler does too much
...
(Gatekeeper: Out-of-scope - architectural refactoring not in current plan)
### [FIX] Missing tests for session validation
...
(Gatekeeper: In-scope - Task 2 requires tests)
```
**Plan updated with Deferred section:**
```markdown
---
## Deferred Items
### From Batch 1 Review (2025-10-19-review.md)
- **[DEFERRED]** SRP violation in auth handler
- Source: Task 1
- Severity: BLOCKING (architectural)
- Reason: Out of scope for basic auth implementation
```
### Success Criteria
✅ Gatekeeper identifies SRP violation as out-of-scope
✅ User makes explicit decision to defer
✅ Deferred item tracked in plan
✅ Rust-engineer fixes only 2 items ([FIX]), skips SRP violation
✅ Plan remains focused on original scope
**Proves:** Gatekeeper prevents scope creep by getting user validation before adding work beyond plan.