---
name: verification-before-completion
description: Use before claiming work complete, fixed, or passing - requires running verification commands and confirming output; evidence before assertions always
---
<skill_overview>
Claiming work is complete without verification is dishonesty, not efficiency. Evidence before claims, always.
</skill_overview>
<rigidity_level>
LOW FREEDOM - NO exceptions. Run verification command, read output, THEN make claim.
No shortcuts. No "should work". No partial verification. Run it, prove it.
</rigidity_level>
<quick_reference>
| Claim | Verification Required | Not Sufficient |
|-------|----------------------|----------------|
| **Tests pass** | Run full test command, see 0 failures | Previous run, "should pass" |
| **Build succeeds** | Run build, see exit 0 | Linter passing |
| **Bug fixed** | Test original symptom, passes | Code changed |
| **Task complete** | Check all success criteria, run verifications | "Implemented bd-3" |
| **Epic complete** | `bd list --status open --parent bd-1` shows 0 | "All tasks done" |
**Iron Law:** NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
**Use test-runner agent for:** Tests, pre-commit hooks, commits (keeps verbose output out of context)
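A minimal sketch of gathering fresh evidence (bash; `cargo test` stands in for whatever command proves your claim):
```bash
# Run the full command fresh and keep the real exit code as evidence
cargo test 2>&1 | tail -n 20        # read the actual summary, not a memory of a past run
echo "exit code: ${PIPESTATUS[0]}"  # 0 supports the claim; anything else refutes it
```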
</quick_reference>
<when_to_use>
**ALWAYS before:**
- Any success/completion claim
- Any expression of satisfaction
- Committing, PR creation, task completion
- Moving to next task
- ANY communication suggesting completion/correctness
**Red flags you need this:**
- Using "should", "probably", "seems to"
- Expressing satisfaction before verification ("Great!", "Perfect!")
- About to commit/push without verification
- Trusting agent success reports
- Relying on partial verification
</when_to_use>
<the_process>
## The Gate Function
Before claiming ANY status:
### 1. Identify
What command proves this claim?
### 2. Run
Execute the full command (fresh, complete).
**For tests/hooks/commits:** Use `hyperpowers:test-runner` agent
- Agent captures verbose output in its context
- Returns only summary + failures
- Prevents context pollution
**For other commands:** Run directly and capture output
### 3. Read
Full output, check exit code, count failures.
### 4. Verify
Does output confirm the claim?
- If NO: State actual status with evidence
- If YES: State claim with evidence
### 5. Only Then
Make the claim.
**Skip any step = lying, not verifying**
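As an illustration only, the gate can be sketched as a shell helper; `verify_then_claim` is hypothetical, and `cargo test` stands in for your project's proving command:
```bash
# Hypothetical helper: run the proving command, read the output, THEN claim
verify_then_claim() {
  local claim="$1"; shift
  local output status
  output=$("$@" 2>&1)             # 2. Run: fresh, complete execution
  status=$?
  if [ "$status" -eq 0 ]; then
    echo "VERIFIED: $claim"
    echo "$output" | tail -n 5    # 4./5. Verify, then claim with evidence
  else
    echo "NOT VERIFIED: $claim (exit $status)"
    echo "$output" | tail -n 20   # 3. Read: see the actual failures
  fi
  return "$status"
}

verify_then_claim "tests pass" cargo test
```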
</the_process>
<examples>
<example>
<scenario>Developer claims tests pass without running them</scenario>
<code>
Developer modifies authentication logic.
Developer thinks: "This fix is straightforward, tests should pass now"
Developer writes: "Great! All tests passing. The bug is fixed."
[No test command run, no output shown]
</code>
<why_it_fails>
**No evidence:**
- "Should pass" ≠ evidence
- Confidence ≠ verification
- Might have broken other tests
- Might not have fixed the bug
**Why dangerous:**
- Broken code ships
- Trust with your partner erodes
- Time is wasted on false completion
</why_it_fails>
<correction>
**Run verification:**
```bash
# Dispatch test-runner agent
"Run: cargo test"
```
**Agent returns:**
```
Summary: 33 passed, 1 failed
Failures:
- test_login_with_expired_token: assertion failed
Expected: Err(AuthError::TokenExpired)
Got: Ok(User { id: 123 })
```
**State actual status:**
```
Tests: 33 passed, 1 failed.
Failure: test_login_with_expired_token still fails.
The fix didn't handle expired tokens.
Investigating...
```
**Result:** Accurate status, bug caught before committing.
</correction>
</example>
<example>
<scenario>Developer closes bd task without verifying success criteria</scenario>
<code>
bd-3 success criteria:
- [ ] All functions fully implemented (no stubs, no TODOs)
- [ ] Tests written and passing: test_generate_token, test_verify_token
- [ ] Pre-commit hooks pass
Developer implements token functions.
Developer thinks: "I implemented everything, task complete"
Runs: bd close bd-3
[No verification commands run]
</code>
<why_it_fails>
**Success criteria not verified:**
- Might have TODO comments left
- Specific tests not run
- Pre-commit hooks not checked
**Result:** Incomplete task closed, issues discovered later.
</why_it_fails>
<correction>
**Verify each criterion:**
```bash
# 1. Check for TODOs
rg "TODO|FIXME|todo!|unimplemented!" src/
# Output: [nothing found] ✅
# 2. Run specific tests
cargo test test_generate_token test_verify_token
# Output: test result: ok. 2 passed ✅
# 3. Run pre-commit hooks (via test-runner agent)
"Run: cargo fmt --check && cargo clippy && cargo test"
```
**Agent returns:**
```
All checks passed:
- cargo fmt: no formatting issues
- cargo clippy: no warnings
- cargo test: 34/34 passed
```
**THEN close task:**
```bash
bd close bd-3
```
**Result:** Task actually complete, all criteria met with evidence.
</correction>
</example>
<example>
<scenario>Developer claims build works based on linter passing</scenario>
<code>
Developer fixes TypeScript errors flagged by linter.
Runs: npm run lint
Output: ✅ No issues found
Developer writes: "Build passes! Ready to commit."
[No build command run]
</code>
<why_it_fails>
**Linter ≠ compiler:**
- Linter checks style and simple errors
- Doesn't verify imports, types, compilation
- Build might still fail
**Why it happens:** Partial verification extrapolated into a broader claim.
</why_it_fails>
<correction>
**Run actual build:**
```bash
npm run build
```
**Output:**
```
ERROR in src/auth/service.ts:23:15
TS2304: Cannot find name 'validateToken'
Did you mean 'verifyToken'?
```
**State actual status:**
```
Linter passes, but build fails.
Error: validateToken not defined (should be verifyToken).
Fixing...
```
**After fix, run build again:**
```bash
npm run build
# Output: ✓ built in 1.2s
```
**Now can claim:**
```
Build passes. [Evidence: npm run build exit 0]
Ready to commit.
```
**Result:** Actual build status verified, error caught.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **No claims without fresh verification** → Run command, see output, THEN claim
- "Should work" = forbidden
- "Looks correct" = forbidden
- Previous run ≠ fresh verification
2. **Use test-runner agent for verbose commands** → Tests, hooks, commits
- Prevents context pollution
- Returns summary + failures only
- Never run `git commit` or `cargo test` directly if output is verbose
3. **Verify ALL success criteria** → Not just "tests pass"
- Read each criterion from bd task
- Run verification for each
- Check all pass before closing
4. **Evidence in every claim** → Show the output
- Not: "Tests pass"
- Yes: "Tests pass [Ran: cargo test, Output: 34/34 passed]"
## Common Excuses
Every one of these means the same thing: stop and run the verification:
- "Should work now"
- "I'm confident this fixes it"
- "Just this once"
- "Linter passed" (when claiming build works)
- "Agent said success" (without independent verification)
- "I'm tired" (exhaustion ≠ excuse)
- "Partial check is enough"
## Pre-Commit Hook Assumption
**If your project uses pre-commit hooks enforcing tests:**
- The hooks guarantee the previous commit passed
- So every test failure comes from your current changes
- Never check whether errors were "pre-existing"
- Don't run `git checkout <sha> && pytest` to verify
- Just fix the error directly
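A sketch of acting on this assumption (plain git, illustrative):
```bash
# The hooks passed on the previous commit, so the breakage is in your working tree
git diff HEAD   # the only changes that can have broken the hooks
# Fix the error directly, then re-run the hooks via the test-runner agent
```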
</critical_rules>
<verification_checklist>
Before claiming tests pass:
- [ ] Ran full test command (not partial)
- [ ] Saw output showing 0 failures
- [ ] Used test-runner agent if output verbose
Before claiming build succeeds:
- [ ] Ran build command (not just linter)
- [ ] Saw exit code 0
- [ ] Checked for compilation errors
Before closing bd task:
- [ ] Re-read success criteria from bd task
- [ ] Ran verification for each criterion
- [ ] Saw evidence all pass
- [ ] THEN closed task
Before closing bd epic:
- [ ] Ran `bd list --status open --parent bd-1`
- [ ] Saw 0 open tasks
- [ ] Ran `bd dep tree bd-1`
- [ ] Confirmed all tasks closed
- [ ] THEN closed epic
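A sketch of the epic gate (assumes `bd list` prints one line per open task and that epics close with `bd close`, as tasks do; `bd-1` is this document's example epic):
```bash
# Only close the epic when zero open child tasks remain
open_tasks=$(bd list --status open --parent bd-1 | wc -l)
if [ "$open_tasks" -eq 0 ]; then
  bd dep tree bd-1   # confirm every task in the tree shows closed
  bd close bd-1
else
  echo "NOT complete: $open_tasks open task(s) under bd-1"
fi
```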
</verification_checklist>
<integration>
**This skill calls:**
- test-runner (for verbose verification commands)
**This skill is called by:**
- test-driven-development (verify tests pass/fail)
- executing-plans (verify task success criteria)
- refactoring-safely (verify tests still pass)
- ALL skills before completion claims
**Agents used:**
- hyperpowers:test-runner (run tests, hooks, commits without output pollution)
</integration>
<resources>
**When stuck:**
- Tempted to say "should work" → Run the verification
- Agent reports success → Check VCS diff, verify independently
- Partial verification → Run complete command
- Tired and want to finish → Run verification anyway, no exceptions
**Verification patterns:**
- Tests: Use test-runner agent, check 0 failures
- Build: Run build command, check exit 0
- bd task: Verify each success criterion
- bd epic: Check all tasks closed with bd list/dep tree
</resources>