---
name: sre-task-refinement
description: Use when refining subtasks into actionable plans, ensuring all corner cases are handled and all requirements are understood.
---
<skill_overview>
Review bd task plans from a Google Fellow SRE perspective to ensure a junior engineer can execute without questions; catch edge cases, verify granularity, strengthen criteria, and prevent production issues before implementation.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow the 7-category checklist exactly. Apply all categories to every task. No skipping red flag checks. Always verify no placeholder text after updates. Reject plans with critical gaps.
</rigidity_level>
<quick_reference>
| Category | Key Questions | Auto-Reject If |
|----------|---------------|----------------|
| 1. Granularity | Tasks 4-8 hours? Phases <16 hours? | Any task >16h without breakdown |
| 2. Implementability | Junior can execute without questions? | Vague language, missing details |
| 3. Success Criteria | 3+ measurable criteria per task? | Can't verify ("works well") |
| 4. Dependencies | Correct parent-child, blocking relationships? | Circular dependencies |
| 5. Safety Standards | Anti-patterns specified? Error handling? | No anti-patterns section |
| 6. Edge Cases | Empty input? Unicode? Concurrency? Failures? | No edge case consideration |
| 7. Red Flags | Placeholder text? Vague instructions? | "[detailed above]", "TODO" |
**Perspective**: Google Fellow SRE with 20+ years of experience reviewing junior engineers' designs.
**Time**: Don't rush - catching one gap pre-implementation saves hours of rework.
</quick_reference>
<when_to_use>
Use when:
- Reviewing bd epic/feature plans before implementation
- Need to ensure junior engineer can execute without questions
- Want to catch edge cases and failure modes upfront
- Need to verify task granularity (4-8 hour subtasks)
- After hyperpowers:writing-plans creates initial plan
- Before hyperpowers:executing-plans starts implementation
Don't use when:
- Task already being implemented (too late)
- Just need to understand existing code (use codebase-investigator)
- Debugging issues (use debugging-with-tools)
- Want to create plan from scratch (use brainstorming → writing-plans)
</when_to_use>
<the_process>
## Announcement
**Announce:** "I'm using hyperpowers:sre-task-refinement to review this plan with Google Fellow-level scrutiny."
---
## Review Checklist (Apply to Every Task)
### 1. Task Granularity
**Check:**
- [ ] No task >8 hours (subtasks) or >16 hours (phases)?
- [ ] Large phases broken into 4-8 hour subtasks?
- [ ] Each subtask independently completable?
- [ ] Each subtask has clear deliverable?
**If task >16 hours:**
- Create subtasks with `bd create`
- Link with `bd dep add child parent --type parent-child`
- Update parent to coordinator role
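A compact sketch of that sequence (placeholder titles and IDs; the full workflow appears in "Breaking Down Large Tasks" below):
```bash
bd create "Subtask 1: [Specific Component]" --type task --priority 1 \
  --design "[Complete subtask design]"        # suppose this returns bd-10
bd dep add bd-10 bd-3 --type parent-child     # bd-10 is a child of parent bd-3
bd update bd-3 --design "[Coordinator design: goal, child list, integration criteria]"
```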
---
### 2. Implementability (Junior Engineer Test)
**Check:**
- [ ] Can junior engineer implement without asking questions?
- [ ] Function signatures/behaviors described, not just "implement X"?
- [ ] Test scenarios described (what they verify, not just names)?
- [ ] "Done" clearly defined with verifiable criteria?
- [ ] All file paths specified or marked "TBD: new file"?
**Red flags:**
- "Implement properly" (how?)
- "Add support" (for what exactly?)
- "Make it work" (what does working mean?)
- File paths missing or ambiguous
---
### 3. Success Criteria Quality
**Check:**
- [ ] Each task has 3+ specific, measurable success criteria?
- [ ] All criteria testable/verifiable (not subjective)?
- [ ] Includes automated verification (tests pass, clippy clean)?
- [ ] No vague criteria like "works well" or "is implemented"?
**Good criteria examples:**
- ✅ "5+ unit tests pass (valid VIN, invalid checksum, various formats)"
- ✅ "Clippy clean with no warnings"
- ✅ "Performance: <100ms for 1000 records"
**Bad criteria examples:**
- ❌ "Code is good quality"
- ❌ "Works correctly"
- ❌ "Is implemented"
---
### 4. Dependency Structure
**Check:**
- [ ] Parent-child relationships correct (epic → phases → subtasks)?
- [ ] Blocking dependencies correct (earlier work blocks later)?
- [ ] No circular dependencies?
- [ ] Dependency graph makes logical sense?
**Verify with:**
```bash
bd dep tree bd-1 # Show full dependency tree
```
---
### 5. Safety & Quality Standards
**Check:**
- [ ] Anti-patterns include unwrap/expect prohibition?
- [ ] Anti-patterns include TODO prohibition (or must have issue #)?
- [ ] Anti-patterns include stub implementation prohibition?
- [ ] Error handling requirements specified (use Result, avoid panic)?
- [ ] Test requirements specific (test names, scenarios listed)?
**Minimum anti-patterns:**
- ❌ No unwrap/expect in production code
- ❌ No TODOs without issue numbers
- ❌ No stub implementations (unimplemented!, todo!)
- ❌ No regex without catastrophic backtracking check
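A minimal Rust sketch of what the unwrap/expect prohibition means in practice (function names are illustrative only, not from any plan):
```rust
use std::num::ParseIntError;

// ❌ Anti-pattern: panics on malformed input and hides the failure mode
fn parse_record_id_bad(raw: &str) -> u64 {
    raw.trim().parse().unwrap()
}

// ✅ Preferred: return Result so the caller decides how to handle the error
fn parse_record_id(raw: &str) -> Result<u64, ParseIntError> {
    raw.trim().parse()
}
```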
---
### 6. Edge Cases & Failure Modes (Fellow SRE Perspective)
**Ask for each task:**
- [ ] What happens with malformed input?
- [ ] What happens with empty/nil/zero values?
- [ ] What happens under high load/concurrency?
- [ ] What happens when dependencies fail?
- [ ] What happens with Unicode, special characters, large inputs?
- [ ] Are these edge cases addressed in the plan?
**Add to Key Considerations section:**
- Edge case descriptions
- Mitigation strategies
- References to similar code handling these cases
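As a sketch of what "addressed in the plan" should look like once translated into tests, the task can name edge-case tests explicitly (the `scan` function and test names below are hypothetical):
```rust
// Hypothetical scanner under test; the point is naming edge cases as explicit tests.
fn scan(input: &str) -> Vec<String> {
    input.split_whitespace().map(|s| s.to_string()).collect()
}

#[cfg(test)]
mod edge_case_tests {
    use super::scan;

    #[test]
    fn empty_input_returns_no_matches() {
        assert!(scan("").is_empty());
    }

    #[test]
    fn unicode_input_does_not_panic() {
        // RTL text, combining marks, and emoji must not break the scanner
        let _ = scan("שלום e\u{0301} 😀");
    }

    #[test]
    fn very_large_input_completes() {
        // Guard against pathological-input blowups (e.g. regex backtracking)
        let big = "a ".repeat(100_000);
        let _ = scan(&big);
    }
}
```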
---
### 7. Red Flags (AUTO-REJECT)
**Check for these - if found, REJECT plan:**
- ❌ Any task >16 hours without subtask breakdown
- ❌ Vague language: "implement properly", "add support", "make it work"
- ❌ Success criteria that can't be verified: "code is good", "works well"
- ❌ Missing test specifications
- ❌ "We'll handle this later" or "TODO" in the plan itself
- ❌ No anti-patterns section
- ❌ Implementation checklist with fewer than 3 items per task
- ❌ No effort estimates
- ❌ Missing error handling considerations
- ❌ **CRITICAL: Placeholder text in design field** - "[detailed above]", "[as specified]", "[complete steps here]"
---
## Review Process
For each task in the plan:
**Step 1: Read the task**
```bash
bd show bd-3
```
**Step 2: Apply all 7 checklist categories**
- Task Granularity
- Implementability
- Success Criteria Quality
- Dependency Structure
- Safety & Quality Standards
- Edge Cases & Failure Modes
- Red Flags
**Step 3: Document findings**
Take notes:
- What's done well
- What's missing
- What's vague or ambiguous
- Hidden failure modes not addressed
- Better approaches or simplifications
**Step 4: Update the task**
Use `bd update` to add missing information:
```bash
bd update bd-3 --design "$(cat <<'EOF'
## Goal
[Original goal, preserved]
## Effort Estimate
[Updated estimate if needed]
## Success Criteria
- [ ] Existing criteria
- [ ] NEW: Added missing measurable criteria
## Implementation Checklist
[Complete checklist with file paths]
## Key Considerations (ADDED BY SRE REVIEW)
**Edge Case: Empty Input**
- What happens when input is empty string?
- MUST validate input length before processing
**Edge Case: Unicode Handling**
- What if string contains RTL or surrogate pairs?
- Use proper Unicode-aware string methods
**Performance Concern: Regex Backtracking**
- Pattern `.*[a-z]+.*` has catastrophic backtracking risk
- MUST test with pathological inputs (e.g., 10000 'a's)
- Use possessive quantifiers or bounded repetition
**Reference Implementation**
- Study src/similar/module.rs for pattern to follow
## Anti-patterns
[Original anti-patterns]
- ❌ NEW: Specific anti-pattern for this task's risks
EOF
)"
```
**IMPORTANT:** Use `--design` for full detailed description, NOT `--description` (title only).
**Step 5: Verify no placeholder text (MANDATORY)**
After updating, read back with `bd show bd-N` and verify:
- ✅ All sections contain actual content, not meta-references
- ✅ No placeholder text like "[detailed above]", "[as specified]", "[will be added]"
- ✅ Implementation steps fully written with actual code examples
- ✅ Success criteria explicit, not referencing "criteria above"
- ❌ If ANY placeholder text found: REJECT and rewrite with actual content
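A mechanical check can back up the manual read (a sketch; extend the pattern list with whatever placeholders you encounter, and it assumes `bd show` prints the design to stdout):
```bash
# Fail loudly if any known placeholder phrase survived the update
if bd show bd-3 | grep -nE '\[(detailed above|as specified|will be added|complete steps here)\]'; then
  echo "REJECT: placeholder text still present in bd-3" >&2
fi
```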
---
## Breaking Down Large Tasks
If task >16 hours, create subtasks:
```bash
# Create first subtask
bd create "Subtask 1: [Specific Component]" \
--type task \
--priority 1 \
--design "[Complete subtask design with all 7 categories addressed]"
# Returns bd-10
# Create second subtask
bd create "Subtask 2: [Another Component]" \
--type task \
--priority 1 \
--design "[Complete subtask design]"
# Returns bd-11
# Link subtasks to parent with parent-child relationship
bd dep add bd-10 bd-3 --type parent-child # bd-10 is child of bd-3
bd dep add bd-11 bd-3 --type parent-child # bd-11 is child of bd-3
# Add sequential dependencies if needed (LATER depends on EARLIER)
bd dep add bd-11 bd-10 # bd-11 depends on bd-10 (do bd-10 first)
# Update parent to coordinator
bd update bd-3 --design "$(cat <<'EOF'
## Goal
Coordinate implementation of [feature]. Broken into N subtasks.
## Success Criteria
- [ ] All N child subtasks closed
- [ ] Integration tests pass
- [ ] [High-level verification criteria]
EOF
)"
```
---
## Output Format
After reviewing all tasks:
```markdown
## Plan Review Results
### Epic: [Name] ([epic-id])
### Overall Assessment
[APPROVE ✅ / NEEDS REVISION ⚠️ / REJECT ❌]
### Dependency Structure Review
[Output of `bd dep tree [epic-id]`]
**Structure Quality**: [✅ Correct / ❌ Issues found]
- [Comments on parent-child relationships]
- [Comments on blocking dependencies]
- [Comments on granularity]
### Task-by-Task Review
#### [Task Name] (bd-N)
**Type**: [epic/feature/task]
**Status**: [✅ Ready / ⚠️ Needs Minor Improvements / ❌ Needs Major Revision]
**Estimated Effort**: [X hours] ([✅ Good / ❌ Too large - needs breakdown])
**Strengths**:
- [What's done well]
**Critical Issues** (must fix):
- [Blocking problems]
**Improvements Needed**:
- [What to add/clarify]
**Edge Cases Missing**:
- [Failure modes not addressed]
**Changes Made**:
- [Specific improvements added via `bd update`]
---
[Repeat for each task/phase/subtask]
### Summary of Changes
**Issues Updated**:
- bd-3 - Added edge case handling for Unicode, regex backtracking risks
- bd-5 - Broke into 3 subtasks (was 40 hours, now 3x8 hours)
- bd-7 - Strengthened success criteria (added test names, verification commands)
### Critical Gaps Across Plan
1. [Pattern of missing items across multiple tasks]
2. [Systemic issues in the plan]
### Recommendations
[If APPROVE]:
✅ Plan is solid and ready for implementation.
- All tasks are junior-engineer implementable
- Dependency structure is correct
- Edge cases and failure modes addressed
[If NEEDS REVISION]:
⚠️ Plan needs improvements before implementation:
- [List major items that need addressing]
- After changes, re-run hyperpowers:sre-task-refinement
[If REJECT]:
❌ Plan has fundamental issues and needs redesign:
- [Critical problems]
```
</the_process>
<examples>
<example>
<scenario>Developer reviews task but skips edge case analysis (Category 6)</scenario>
<code>
# Review of bd-3: Implement VIN scanner
## Checklist review:
1. Granularity: ✅ 6-8 hours
2. Implementability: ✅ Junior can implement
3. Success Criteria: ✅ Has 5 test scenarios
4. Dependencies: ✅ Correct
5. Safety Standards: ✅ Anti-patterns present
6. Edge Cases: [SKIPPED - "looks straightforward"]
7. Red Flags: ✅ None found
Conclusion: "Task looks good, approve ✅"
# Task ships without edge case review
# Production issues occur:
- VIN scanner matches random 17-char strings (no checksum validation)
- Lowercase VINs not handled (should normalize)
- Catastrophic regex backtracking on long inputs (DoS vulnerability)
</code>
<why_it_fails>
- Skipped Category 6 (Edge Cases) assuming task was "straightforward"
- Didn't ask: What happens with invalid checksums? Lowercase? Long inputs?
- Missed critical production issues:
- False positives (no checksum validation)
- Data handling bugs (case sensitivity)
- Security vulnerability (regex DoS)
- Junior engineer didn't know to handle these (not in task)
- Production incidents occur after deployment
- Hours of emergency fixes, customer impact
- SRE review failed to prevent known failure modes
</why_it_fails>
<correction>
**Apply Category 6 rigorously:**
```markdown
## Edge Case Analysis for bd-3: VIN Scanner
Ask for EVERY task:
- Malformed input? VIN has checksum - must validate, not just pattern match
- Empty/nil? What if empty string passed?
- Concurrency? Read-only scanner, no concurrency issues
- Dependency failures? No external dependencies
- Unicode/special chars? VIN is alphanumeric only, but what about lowercase?
- Large inputs? Regex `.*` patterns can cause catastrophic backtracking
Findings:
❌ VIN checksum validation not mentioned (will match random strings)
❌ Case normalization not mentioned (lowercase VINs exist)
❌ Regex backtracking risk not mentioned (DoS vulnerability)
```
**Update task:**
```bash
bd update bd-3 --design "$(cat <<'EOF'
[... original content ...]
## Key Considerations (ADDED BY SRE REVIEW)
**VIN Checksum Complexity**:
- ISO 3779 requires transliteration table (letters → numbers)
- Weighted sum algorithm with modulo 11
- Reference: https://en.wikipedia.org/wiki/Vehicle_identification_number#Check_digit
- MUST validate checksum, not just pattern - prevents false positives
**Case Normalization**:
- VINs can appear in lowercase
- MUST normalize to uppercase before validation
- Test with mixed case: "1hgbh41jxmn109186"
**Regex Backtracking Risk**:
- CRITICAL: Pattern `.*[A-HJ-NPR-Z0-9]{17}.*` has backtracking risk
- Test with pathological input: 10000 'X's followed by 16-char string
- Use possessive quantifiers or bounded repetition
- Reference: https://www.regular-expressions.info/catastrophic.html
**Edge Cases to Test**:
- Valid VIN with valid checksum (should match)
- Valid pattern but invalid checksum (should NOT match)
- Lowercase VIN (should normalize and validate)
- Ambiguous chars I/O not valid in VIN (should reject)
- Very long input (should not DoS)
EOF
)"
```
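To show the scale of the checksum requirement, here is a minimal sketch of the ISO 3779 check-digit computation (a standalone illustration using the standard transliteration table and weights, not code taken from the plan):
```rust
/// Returns true if the VIN's 9th character matches its ISO 3779 check digit.
/// Assumes the input has already been uppercased; rejects I, O, Q and wrong lengths.
/// e.g. vin_checksum_valid("1HGBH41JXMN109186") == true (sum 340, 340 % 11 = 10 -> 'X')
fn vin_checksum_valid(vin: &str) -> bool {
    const WEIGHTS: [u32; 17] = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2];

    // ISO 3779 transliteration: digits map to themselves, letters to 1-9 (I, O, Q excluded)
    fn transliterate(c: char) -> Option<u32> {
        match c {
            '0'..='9' => Some(c as u32 - '0' as u32),
            'A'..='H' => Some(c as u32 - 'A' as u32 + 1), // A=1 .. H=8
            'J'..='N' => Some(c as u32 - 'J' as u32 + 1), // J=1 .. N=5
            'P' => Some(7),
            'R' => Some(9),
            'S'..='Z' => Some(c as u32 - 'S' as u32 + 2), // S=2 .. Z=9
            _ => None,
        }
    }

    let chars: Vec<char> = vin.chars().collect();
    if chars.len() != 17 {
        return false;
    }
    let mut sum = 0u32;
    for (i, &c) in chars.iter().enumerate() {
        match transliterate(c) {
            Some(v) => sum += v * WEIGHTS[i],
            None => return false,
        }
    }
    let expected = match sum % 11 {
        10 => 'X',
        d => (b'0' + d as u8) as char,
    };
    chars[8] == expected
}
```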
**What you gain:**
- Prevented false positives (checksum validation)
- Prevented data handling bugs (case normalization)
- Prevented security vulnerability (regex DoS)
- Junior engineer has complete requirements
- Production issues caught pre-implementation
- Proper SRE review preventing known failure modes
- Customer trust maintained
</correction>
</example>
<example>
<scenario>Developer approves task with placeholder text (Red Flag #10)</scenario>
<code>
# Review of bd-5: Implement License Plate Scanner
bd show bd-5:
## Implementation Checklist
- [ ] Create scanner module
- [ ] [Complete implementation steps detailed above]
- [ ] Add tests
## Success Criteria
- [ ] [As specified in the implementation checklist]
- [ ] Tests pass
## Key Considerations
- [Will be added during implementation]
# Developer's review:
"Looks comprehensive, has implementation checklist and success criteria ✅"
# During implementation:
Junior engineer: "What are the 'implementation steps detailed above'?"
Junior engineer: "What specific success criteria should I verify?"
Junior engineer: "What key considerations exist?"
# No answers in the task - junior engineer blocked
# Have to research and add missing information
# Implementation delayed by 2 days
</code>
<why_it_fails>
- Missed Red Flag #10: Placeholder text present
- "[Complete implementation steps detailed above]" is meta-reference, not content
- "[As specified in the implementation checklist]" is circular reference
- "[Will be added during implementation]" is deferral, not specification
- Junior engineer can't execute - missing critical information
- Task looks complete but actually incomplete
- Implementation blocked until details added
- SRE review failed to catch placeholder text
</why_it_fails>
<correction>
**Check for placeholder text after reading:**
```markdown
## Red Flag Check (Category 7)
Read through bd-5 line by line:
Line 15: "[Complete implementation steps detailed above]"
❌ PLACEHOLDER - "detailed above" is meta-reference, not actual content
Line 22: "[As specified in the implementation checklist]"
❌ PLACEHOLDER - Circular reference to another section, not explicit criteria
Line 30: "[Will be added during implementation]"
❌ PLACEHOLDER - Deferral to future, not actual considerations
DECISION: REJECT ❌
Reason: Contains placeholder text - task not ready for implementation
```
**Update task with actual content:**
```bash
bd update bd-5 --design "$(cat <<'EOF'
## Implementation Checklist
- [ ] Create src/scan/plugins/scanners/license_plate.rs
- [ ] Implement LicensePlateScanner struct with ScanPlugin trait
- [ ] Add regex patterns for US states:
  - CA: `[0-9][A-Z]{3}[0-9]{3}` (e.g., 1ABC123)
  - NY: `[A-Z]{3}[0-9]{4}` (e.g., ABC1234)
  - TX: `[A-Z]{3}[0-9]{4}|[0-9]{3}[A-Z]{3}` (e.g., ABC1234 or 123ABC)
  - Generic: `[A-Z0-9]{5,8}` (fallback)
- [ ] Implement has_healthcare_context() check
- [ ] Create test module with 8+ test cases
- [ ] Register in src/scan/plugins/scanners/mod.rs
## Success Criteria
- [ ] Valid CA plate "1ABC123" detected in healthcare context
- [ ] Valid NY plate "ABC1234" detected in healthcare context
- [ ] Invalid plate "123" NOT detected (too short)
- [ ] Valid plate NOT detected outside healthcare context
- [ ] 8+ unit tests pass covering all patterns and edge cases
- [ ] Clippy clean, no warnings
- [ ] cargo test passes
## Key Considerations
**False Positive Risk**:
- License plates are short and generic (5-8 chars)
- MUST require healthcare context via has_healthcare_context()
- Without context, will match random alphanumeric sequences
- Test: Random string "ABC1234" should NOT match outside healthcare context
**State Format Variations**:
- 50 US states have different formats
- Implement common formats (CA, NY, TX) + generic fallback
- Document which formats supported in module docstring
- Consider international plates in future iteration
**Performance**:
- Regex patterns are simple, no backtracking risk
- Should process <1ms per chunk
**Reference Implementation**:
- Study src/scan/plugins/scanners/vehicle_identifier.rs
- Follow same pattern: regex + context check + tests
EOF
)"
```
**Verify no placeholder text:**
```bash
bd show bd-5
# Read entire output
# Confirm: All sections have actual content
# Confirm: No "[detailed above]", "[as specified]", "[will be added]"
# ✅ Task ready for implementation
```
**What you gain:**
- Junior engineer has complete specification
- No blocked implementation waiting for details
- All edge cases documented upfront
- Success criteria explicit and verifiable
- Key considerations prevent common mistakes
- No placeholder text - task truly ready
- Professional SRE review standard maintained
</correction>
</example>
<example>
<scenario>Developer accepts vague success criteria (Category 3)</scenario>
<code>
# Review of bd-7: Implement Data Encryption
bd show bd-7:
## Success Criteria
- [ ] Encryption is implemented correctly
- [ ] Code is good quality
- [ ] Tests work properly
# Developer's review:
"Has 3 success criteria ✅ Meets minimum requirement"
# During implementation:
Junior engineer: "How do I know if encryption is 'correct'?"
Junior engineer: "What makes code 'good quality'?"
Junior engineer: "What does 'tests work properly' mean?"
# Junior engineer makes best guesses:
- Uses ECB mode (insecure, should use GCM)
- No key rotation (bad practice)
- Tests only happy path (misses edge cases)
# Code review finds critical security issues
# Complete rewrite required - 3 days wasted
</code>
<why_it_fails>
- Category 3 check: "All criteria testable/verifiable?" ❌ FAILED
- "Implemented correctly" - not measurable (correct by what standard?)
- "Good quality" - subjective, not verifiable
- "Work properly" - vague, what is proper?
- Junior engineer can't verify criteria objectively
- Makes incorrect assumptions filling gaps
- Security vulnerabilities introduced
- Wastes time on implementation that fails review
- SRE review failed to strengthen criteria
</why_it_fails>
<correction>
**Apply Category 3 rigorously:**
```markdown
## Success Criteria Analysis for bd-7
Current criteria:
- [ ] Encryption is implemented correctly
❌ NOT TESTABLE - "correctly" is subjective, no standard specified
- [ ] Code is good quality
❌ NOT TESTABLE - "good quality" is opinion, not measurable
- [ ] Tests work properly
❌ NOT TESTABLE - "properly" is vague, no definition
Minimum requirement: 3+ specific, measurable, testable criteria
Current: 0 testable criteria
DECISION: REJECT ❌
```
**Update with measurable criteria:**
```bash
bd update bd-7 --design "$(cat <<'EOF'
[... original content ...]
## Success Criteria
**Encryption Implementation**:
- [ ] Uses AES-256-GCM mode (verified in code review)
- [ ] Key derivation via PBKDF2 with 100,000 iterations (NIST recommendation)
- [ ] Unique IV generated per encryption (crypto_random)
- [ ] Authentication tag verified on decryption
**Code Quality** (automated checks):
- [ ] Clippy clean with no warnings: `cargo clippy -- -D warnings`
- [ ] Rustfmt compliant: `cargo fmt --check`
- [ ] No unwrap/expect in production: `rg "\.unwrap\(\)|\.expect\(" src/` returns 0
- [ ] No TODOs without issue numbers: `rg "TODO" src/` returns 0
**Test Coverage**:
- [ ] 12+ unit tests pass covering:
  - test_encrypt_decrypt_roundtrip (happy path)
  - test_wrong_key_fails_auth (security)
  - test_modified_ciphertext_fails_auth (security)
  - test_empty_plaintext (edge case)
  - test_large_plaintext_10mb (performance)
  - test_unicode_plaintext (data handling)
  - test_concurrent_encryption (thread safety)
  - test_iv_uniqueness (security)
  - [4 more specific scenarios]
- [ ] All tests pass: `cargo test encryption`
- [ ] Test coverage >90%: `cargo tarpaulin --packages encryption`
**Documentation**:
- [ ] Module docstring explains encryption scheme (AES-256-GCM)
- [ ] Function docstrings include examples
- [ ] Security considerations documented (key management, IV handling)
**Security Review**:
- [ ] No hardcoded keys or IVs (verified via grep)
- [ ] Key zeroized after use (verified in code)
- [ ] Constant-time comparison for auth tag (timing attack prevention)
EOF
)"
```
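For orientation, a minimal roundtrip sketch of what the AES-256-GCM criteria imply, assuming the RustCrypto `aes-gcm` crate (0.10-style API); key derivation (PBKDF2) and key storage are out of scope for this sketch:
```rust
use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    Aes256Gcm,
};

// Encrypt-then-decrypt roundtrip; decryption fails if the auth tag does not verify.
fn roundtrip(plaintext: &[u8]) -> Result<Vec<u8>, aes_gcm::Error> {
    let key = Aes256Gcm::generate_key(OsRng);          // 256-bit key (derive via PBKDF2 in real code)
    let cipher = Aes256Gcm::new(&key);
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng); // fresh 96-bit IV for every encryption
    let ciphertext = cipher.encrypt(&nonce, plaintext)?;
    cipher.decrypt(&nonce, ciphertext.as_ref())
}
```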
**What you gain:**
- Every criterion objectively verifiable
- Junior engineer knows exactly what "done" means
- Automated checks (clippy, fmt, grep) provide instant feedback
- Specific test scenarios prevent missed edge cases
- Security requirements explicit (GCM, PBKDF2, unique IV)
- No ambiguity - can verify each criterion with command or code review
- Professional SRE review standard: measurable, testable, specific
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Apply all 7 categories to every task** → No skipping any category for any task
2. **Reject plans with placeholder text** → "[detailed above]", "[as specified]" = instant reject
3. **Verify no placeholder after updates** → Read back with `bd show` and confirm actual content
4. **Break tasks >16 hours** → Create subtasks, don't accept large tasks
5. **Strengthen vague criteria** → "Works correctly" → measurable verification commands
6. **Add edge cases to every task** → Empty? Unicode? Concurrency? Failures?
7. **Never skip Category 6** → Edge case analysis prevents production issues
## Common Excuses
All of these mean: **STOP. Apply the full process.**
- "Task looks straightforward" (Edge cases hide in "straightforward" tasks)
- "Has 3 criteria, meets minimum" (Criteria must be measurable, not just 3+ items)
- "Placeholder text is just formatting" (Placeholders mean incomplete specification)
- "Can handle edge cases during implementation" (Must specify upfront, not defer)
- "Junior will figure it out" (Junior should NOT need to figure out - we specify)
- "Too detailed, feels like micromanaging" (Detail prevents questions and rework)
- "Taking too long to review" (One gap caught saves hours of rework)
</critical_rules>
<verification_checklist>
Before completing SRE review:
**Per task reviewed:**
- [ ] Applied all 7 categories (Granularity, Implementability, Criteria, Dependencies, Safety, Edge Cases, Red Flags)
- [ ] Checked for placeholder text in design field
- [ ] Updated task with missing information via `bd update --design`
- [ ] Verified updated task with `bd show` (no placeholders remain)
- [ ] Broke down any task >16 hours into subtasks
- [ ] Strengthened vague success criteria to measurable
- [ ] Added edge case analysis to Key Considerations
- [ ] Strengthened anti-patterns based on failure modes
**Overall plan:**
- [ ] Reviewed ALL tasks/phases/subtasks (no exceptions)
- [ ] Verified dependency structure with `bd dep tree`
- [ ] Documented findings for each task
- [ ] Created summary of changes made
- [ ] Provided clear recommendation (APPROVE/NEEDS REVISION/REJECT)
**Can't check all boxes?** Return to review process and complete missing steps.
</verification_checklist>
<integration>
**This skill is used after:**
- hyperpowers:writing-plans (creates initial plan)
- hyperpowers:brainstorming (establishes requirements)
**This skill is used before:**
- hyperpowers:executing-plans (implements tasks)
**This skill is also called by:**
- hyperpowers:executing-plans (REQUIRED for new tasks created during execution)
**Call chains:**
```
Initial planning:
hyperpowers:brainstorming → hyperpowers:writing-plans → hyperpowers:sre-task-refinement → hyperpowers:executing-plans
(if gaps: revise and re-review)
During execution (for new tasks):
hyperpowers:executing-plans → creates new task → hyperpowers:sre-task-refinement → STOP checkpoint
```
**This skill uses:**
- bd commands (show, update, create, dep add, dep tree)
- Google Fellow SRE perspective (20+ years distributed systems)
- 7-category checklist (mandatory for every task)
**Time expectations:**
- Small epic (3-5 tasks): 15-20 minutes
- Medium epic (6-10 tasks): 25-40 minutes
- Large epic (10+ tasks): 45-60 minutes
**Don't rush:** Catching one critical gap pre-implementation saves hours of rework.
</integration>
<resources>
**Review patterns:**
- Task too large (>16h) → Break into 4-8h subtasks
- Vague criteria ("works correctly") → Measurable commands/checks
- Missing edge cases → Add to Key Considerations with mitigations
- Placeholder text → Rewrite with actual content
**When stuck:**
- Unsure if task too large → Ask: Can junior complete in one day?
- Unsure if criteria measurable → Ask: Can I verify with command/code review?
- Unsure if edge case matters → Ask: Could this fail in production?
- Unsure if placeholder → Ask: Does this reference other content instead of providing content?
**Key principle:** Junior engineer should be able to execute task without asking questions. If they would need to ask, specification is incomplete.
</resources>