<overview>

The most common debugging mistake: declaring victory too early. A fix isn't complete until it's verified. This document defines what "verified" means and provides systematic approaches to proving your fix works.

</overview>

<definition>

A fix is verified when:

1. **The original issue no longer occurs**
   - The exact reproduction steps now produce correct behavior
   - Not "it seems better" - it definitively works

2. **You understand why the fix works**
   - You can explain the mechanism
   - Not "I changed X and it worked" but "X was causing Y, and changing it prevents Y"

3. **Related functionality still works**
   - You haven't broken adjacent features
   - Regression testing passes

4. **The fix works across environments**
   - Not just on your machine
   - In production-like conditions

5. **The fix is stable**
   - Works consistently, not intermittently
   - Not just "worked once" but "works reliably"

**Anything less than this is not verified.**

</definition>

<examples>

❌ **Not verified**:
- "I ran it once and it didn't crash"
- "It seems to work now"
- "The error message is gone" (but is the behavior correct?)
- "Works on my machine"

✅ **Verified**:
- "I ran the original reproduction steps 20 times - zero failures"
- "The data now saves correctly and I can retrieve it"
- "All existing tests pass, plus I added a test for this scenario"
- "Verified in dev, staging, and production environments"

</examples>

<pattern name="reproduction_verification">

**The golden rule**: If you can't reproduce the bug, you can't verify it's fixed.

**Process**:

1. **Before fixing**: Document exact steps to reproduce

   ```markdown
   Reproduction steps:
   1. Login as admin user
   2. Navigate to /settings
   3. Click "Export Data" button
   4. Observe: Error "Cannot read property 'data' of undefined"
   ```

2. **After fixing**: Execute the same steps exactly

   ```markdown
   Verification:
   1. Login as admin user ✓
   2. Navigate to /settings ✓
   3. Click "Export Data" button ✓
   4. Observe: CSV downloads successfully ✓
   ```

3. **Test edge cases** related to the bug

   ```markdown
   Additional tests:
   - Export with empty data set ✓
   - Export with 1000+ records ✓
   - Export while another request is pending ✓
   ```

**If you can't reproduce the original bug**:
- You don't know if your fix worked
- Maybe it's still broken
- Maybe your "fix" did nothing
- Maybe you fixed a different bug

**Solution**: Revert your fix and rerun the reproduction steps. If the bug comes back without the fix and disappears when you re-apply it, you've verified your fix addressed it.
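
This revert check can be automated. A minimal sketch, assuming git and Jest are available, the reproduction steps are captured in a hypothetical `repro.test.js`, and the fix is the only uncommitted change in the working tree:

```javascript
// Sketch of an automated revert check. repro.test.js is a hypothetical
// test file that encodes the reproduction steps.
const { execSync } = require("child_process");

function testsPass() {
  try {
    execSync("npx jest repro.test.js", { stdio: "pipe" });
    return true;
  } catch {
    return false;
  }
}

// 1. With the fix applied, the reproduction test should pass.
if (!testsPass()) throw new Error("Fix in place, but the bug still reproduces");

// 2. Stash the fix; the bug should come back.
execSync("git stash");
const bugReturns = !testsPass();

// 3. Restore the fix regardless of the outcome.
execSync("git stash pop");

if (!bugReturns) {
  throw new Error("Bug did not return without the fix - something else changed the behavior");
}
console.log("Verified: bug reproduces without the fix and is gone with it.");
```
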
</pattern>

<pattern name="regression_testing">

**The problem**: You fix one thing, break another.

**Why it happens**:
- Your fix changed shared code
- Your fix had unintended side effects
- Your fix broke an assumption other code relied on

**Protection strategy**:

**1. Identify adjacent functionality**
- What else uses the code you changed?
- What features depend on this behavior?
- What workflows include this step?

**2. Test each adjacent area**
- Manually test the happy path
- Check error handling
- Verify data integrity

**3. Run existing tests**
- Unit tests for the module
- Integration tests for the feature
- End-to-end tests for the workflow

<example>

**Fix**: Changed how user sessions are stored (from memory to database)

**Adjacent functionality to verify**:
- Login still works ✓
- Logout still works ✓
- Session timeout still works ✓
- Concurrent logins are handled correctly ✓
- Session data persists across server restarts ✓ (new capability)
- Password reset flow still works ✓
- OAuth login still works ✓

If you only tested "login works", you missed 6 other things that could break.

</example>
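
That adjacency list is worth encoding as an automated suite. A sketch under assumed names - `login`, `logout`, `sessionStore`, and the `ttlMs` option are hypothetical helpers, not a real API:

```javascript
// Hypothetical regression suite for the session-storage change above.
const { login, logout, sessionStore } = require("./auth");

describe("session storage regressions", () => {
  test("login creates a retrievable session", async () => {
    const session = await login("admin", "password");
    expect(await sessionStore.get(session.id)).toBeDefined();
  });

  test("logout destroys the session", async () => {
    const session = await login("admin", "password");
    await logout(session.id);
    expect(await sessionStore.get(session.id)).toBeUndefined();
  });

  test("sessions expire after their timeout", async () => {
    const session = await login("admin", "password", { ttlMs: 10 });
    await new Promise((resolve) => setTimeout(resolve, 50));
    expect(await sessionStore.get(session.id)).toBeUndefined();
  });

  test("concurrent logins get independent sessions", async () => {
    const [a, b] = await Promise.all([
      login("admin", "password"),
      login("admin", "password"),
    ]);
    expect(a.id).not.toBe(b.id);
  });
});
```
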
</pattern>

<pattern name="test_first_debugging">

**Strategy**: Write a failing test that reproduces the bug, then fix until the test passes.

**Benefits**:
- Proves you can reproduce the bug
- Provides automatic verification
- Prevents regression in the future
- Forces you to understand the bug precisely

**Process**:

1. **Write a test that reproduces the bug**

   ```javascript
   test('should handle undefined user data gracefully', () => {
     const result = processUserData(undefined);
     expect(result).toBe(null); // Currently throws error
   });
   ```

2. **Verify the test fails** (confirms it reproduces the bug)

   ```
   ✗ should handle undefined user data gracefully
     TypeError: Cannot read property 'name' of undefined
   ```

3. **Fix the code**

   ```javascript
   function processUserData(user) {
     if (!user) return null; // Add defensive check
     return user.name;
   }
   ```

4. **Verify the test passes**

   ```
   ✓ should handle undefined user data gracefully
   ```

5. **Test is now regression protection**
   - If someone breaks this again, the test will catch it

**When to use**:
- Clear, reproducible bugs
- Code that has test infrastructure
- Bugs that could recur

**When not to use**:
- Exploratory debugging (you don't understand the bug yet)
- Infrastructure issues (can't easily test)
- One-off data issues

</pattern>

<pattern name="environment_verification">

**The trap**: "Works on my machine"

**Reality**: Production is different.

**Differences to consider** (a fingerprint sketch follows these lists):

**Environment variables**:
- `NODE_ENV=development` vs `NODE_ENV=production`
- Different API keys
- Different database connections
- Different feature flags

**Dependencies**:
- Different package versions (if not locked)
- Different system libraries
- Different Node/Python/etc. versions

**Data**:
- Volume (100 records locally, 1M in production)
- Quality (clean test data vs messy real data)
- Edge cases (nulls, special characters, extreme values)

**Network**:
- Latency (local: 5ms, production: 200ms)
- Reliability (local: perfect, production: occasional failures)
- Firewalls, proxies, load balancers
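
One cheap way to surface these differences is to print the same environment fingerprint in every environment and diff the outputs. A minimal sketch - the fields chosen here are illustrative, not exhaustive:

```javascript
// Minimal environment fingerprint. Run in dev, staging, and production,
// then diff the outputs to spot configuration drift.
const fs = require("fs");

console.log(JSON.stringify({
  nodeVersion: process.version,
  nodeEnv: process.env.NODE_ENV || "(unset)",
  platform: process.platform,
  timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
  // Unlocked dependencies can resolve differently per environment
  dependenciesLocked: fs.existsSync("package-lock.json"),
}, null, 2));
```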

**Verification checklist**:

```markdown
- [ ] Works locally (dev environment)
- [ ] Works in Docker container (mimics production)
- [ ] Works in staging (production-like)
- [ ] Works in production (the real test)
```

<example>

**Bug**: Batch processing fails in production but works locally

**Investigation**:
- Local: 100 test records, completes in 2 seconds
- Production: 50,000 records, times out at 30 seconds

**The difference**: Volume. Local testing didn't catch it.

**Fix verification**:
- Test locally with 50,000 records
- Verify performance in staging
- Monitor first production run
- Confirm all environments work

</example>
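
A sketch of the first of those steps - reproducing production volume locally. `processBatch` and the record shape are hypothetical stand-ins; the 30-second budget comes from the production timeout above:

```javascript
// Generate production-scale synthetic data, including messy values,
// and time the batch job against the production timeout.
const { processBatch } = require("./batch"); // hypothetical module

async function volumeTest(count = 50000) {
  const records = Array.from({ length: count }, (_, i) => ({
    id: i,
    name: i % 97 === 0 ? null : `user-${i}`,         // occasional nulls
    note: i % 131 === 0 ? 'émoji ✓ & "quotes"' : "", // special characters
  }));

  const start = Date.now();
  await processBatch(records);
  const elapsed = Date.now() - start;

  console.log(`Processed ${count} records in ${elapsed}ms`);
  if (elapsed > 30000) {
    throw new Error("Exceeds the 30s production timeout");
  }
}

volumeTest().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
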
</pattern>

<pattern name="stability_testing">

**The problem**: It worked once, but will it work reliably?

**Intermittent bugs are the worst**:
- Hard to reproduce
- Hard to verify fixes
- Easy to declare fixed when they're not

**Verification strategies**:

**1. Repeated execution**

```bash
for i in {1..100}; do
  npm test -- specific-test.js || echo "Failed on run $i"
done
```

If it fails even once, it's not fixed.

**2. Stress testing**

```javascript
const assert = require("assert");

// Run many instances in parallel (processData and testInput are
// whatever operation and input exposed the bug)
const promises = Array(50).fill().map(() => processData(testInput));
const results = await Promise.all(promises);

// Every result must be correct, not just the first one
results.forEach((result) => assert.deepStrictEqual(result, expectedOutput));
```

**3. Soak testing** (a sketch follows this list)
- Run for an extended period (hours, days)
- Monitor for memory leaks, performance degradation
- Ensure stability over time
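
A minimal soak harness, assuming `fn` is the async operation under test - heap readings here are a coarse leak signal, not a precise measurement:

```javascript
// Repeatedly run the operation and watch heap growth over time.
async function soak(fn, durationMs = 60 * 60 * 1000) {
  const start = Date.now();
  const baselineHeap = process.memoryUsage().heapUsed;
  let runs = 0;

  while (Date.now() - start < durationMs) {
    await fn();
    runs += 1;
    if (runs % 1000 === 0) {
      const growthMb = (process.memoryUsage().heapUsed - baselineHeap) / 1e6;
      console.log(`runs=${runs} heapGrowth=${growthMb.toFixed(1)}MB`);
    }
  }
}
```

Heap growth that climbs steadily across thousands of runs is the classic leak signature; a flat line after warm-up is what stability looks like.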

**4. Timing variations**

```javascript
// For race conditions, vary the timing between steps
const randomDelay = (min, max) =>
  new Promise((resolve) => setTimeout(resolve, min + Math.random() * (max - min)));

async function testWithRandomTiming() {
  await randomDelay(0, 100);
  triggerAction1();
  await randomDelay(0, 100);
  triggerAction2();
  await randomDelay(0, 100);
  verifyResult();
}

// Run this 1000 times
```

<example>

**Bug**: Race condition in file upload

**Weak verification**:
- Upload one file
- "It worked!"
- Ship it

**Strong verification**:
- Upload 100 files sequentially: all succeed ✓
- Upload 20 files in parallel: all succeed ✓
- Upload while navigating away: handles correctly ✓
- Upload, cancel, upload again: works ✓
- Run all tests 50 times: zero failures ✓

Now it's verified.

</example>
</pattern>

<checklist>

Copy this checklist when verifying a fix:

```markdown
### Original Issue
- [ ] Can reproduce the original bug before the fix
- [ ] Have documented exact reproduction steps

### Fix Validation
- [ ] Original reproduction steps now work correctly
- [ ] Can explain WHY the fix works
- [ ] Fix is minimal and targeted

### Regression Testing
- [ ] Adjacent feature 1: [name] works
- [ ] Adjacent feature 2: [name] works
- [ ] Adjacent feature 3: [name] works
- [ ] Existing tests pass
- [ ] Added test to prevent regression

### Environment Testing
- [ ] Works in development
- [ ] Works in staging/QA
- [ ] Works in production
- [ ] Tested with production-like data volume

### Stability Testing
- [ ] Tested multiple times (n=__): zero failures
- [ ] Tested edge cases: [list them]
- [ ] Tested under load/stress: stable

### Documentation
- [ ] Code comments explain the fix
- [ ] Commit message explains the root cause
- [ ] If needed, updated user-facing docs

### Sign-off
- [ ] I understand why this bug occurred
- [ ] I understand why this fix works
- [ ] I've verified it works in all relevant environments
- [ ] I've tested for regressions
- [ ] I'm confident this won't recur
```

**Do not merge/deploy until all checkboxes are checked.**

</checklist>

<distrust>

Your verification might be wrong if:

**1. You can't reproduce the original bug anymore**
- Maybe you forgot how
- Maybe the environment changed
- Maybe you're testing the wrong thing
- **Action**: Document reproduction steps FIRST, before fixing

**2. The fix is large or complex**
- Changed 10 files, modified 200 lines
- Too many moving parts
- **Action**: Simplify the fix, then verify each piece

**3. You're not sure why it works**
- "I changed X and the bug went away"
- But you can't explain the mechanism
- **Action**: Investigate until you understand, then verify

**4. It only works sometimes**
- "Usually works now"
- "Seems more stable"
- **Action**: Not verified. Find and fix the remaining issue

**5. You can't test in production-like conditions**
- Only tested locally
- Different data, different scale
- **Action**: Set up a staging environment or use production data in dev

**Red flag phrases**:
- "It seems to work"
- "I think it's fixed"
- "Looks good to me"
- "Can't reproduce anymore" (but you never could reliably)

**Trust-building phrases**:
- "I've verified 50 times - zero failures"
- "All tests pass, including the new regression test"
- "Deployed to staging, tested for 3 days, no issues"
- "Root cause was X, fix addresses X directly, verified by Y"

</distrust>

<mindset>

**Assume your fix is wrong until proven otherwise.**

This isn't pessimism - it's professionalism.

**Questions to ask yourself**:
- "How could this fix fail?"
- "What haven't I tested?"
- "What am I assuming?"
- "Would this survive production?"

**The cost of insufficient verification**:
- Bug returns in production
- User frustration
- Lost trust
- Emergency debugging sessions
- Rollbacks

**The benefit of thorough verification**:
- Confidence in deployment
- Prevention of regressions
- Trust from your team
- Learning from the investigation

**Verification is not optional. It's the most important part of debugging.**

</mindset>