The most common debugging mistake: declaring victory too early. A fix isn't complete until it's verified. This document defines what "verified" means and provides systematic approaches to proving your fix works.

A fix is verified when:

1. **The original issue no longer occurs**
   - The exact reproduction steps now produce correct behavior
   - Not "it seems better" - it definitively works
2. **You understand why the fix works**
   - You can explain the mechanism
   - Not "I changed X and it worked" but "X was causing Y, and changing it prevents Y"
3. **Related functionality still works**
   - You haven't broken adjacent features
   - Regression testing passes
4. **The fix works across environments**
   - Not just on your machine
   - In production-like conditions
5. **The fix is stable**
   - Works consistently, not intermittently
   - Not just "worked once" but "works reliably"

**Anything less than this is not verified.**

❌ **Not verified**:

- "I ran it once and it didn't crash"
- "It seems to work now"
- "The error message is gone" (but is the behavior correct?)
- "Works on my machine"

✅ **Verified**:

- "I ran the original reproduction steps 20 times - zero failures"
- "The data now saves correctly and I can retrieve it"
- "All existing tests pass, plus I added a test for this scenario"
- "Verified in dev, staging, and production environments"

**The golden rule**: If you can't reproduce the bug, you can't verify it's fixed.

**Process**:

1. **Before fixing**: Document exact steps to reproduce

   ```markdown
   Reproduction steps:
   1. Login as admin user
   2. Navigate to /settings
   3. Click "Export Data" button
   4. Observe: Error "Cannot read property 'data' of undefined"
   ```

2. **After fixing**: Execute the same steps exactly

   ```markdown
   Verification:
   1. Login as admin user ✓
   2. Navigate to /settings ✓
   3. Click "Export Data" button ✓
   4. Observe: CSV downloads successfully ✓
   ```

3. **Test edge cases** related to the bug

   ```markdown
   Additional tests:
   - Export with empty data set ✓
   - Export with 1000+ records ✓
   - Export while another request is pending ✓
   ```

**If you can't reproduce the original bug**:

- You don't know if your fix worked
- Maybe it's still broken
- Maybe your "fix" did nothing
- Maybe you fixed a different bug

**Solution**: Revert your fix. If the bug comes back, you've verified your fix addressed it.

**The problem**: You fix one thing, break another.

**Why it happens**:

- Your fix changed shared code
- Your fix had unintended side effects
- Your fix broke an assumption other code relied on

**Protection strategy**:

**1. Identify adjacent functionality**

- What else uses the code you changed?
- What features depend on this behavior?
- What workflows include this step?

**2. Test each adjacent area**

- Manually test the happy path
- Check error handling
- Verify data integrity

**3. Run existing tests**

- Unit tests for the module
- Integration tests for the feature
- End-to-end tests for the workflow

**Fix**: Changed how user sessions are stored (from memory to database)

**Adjacent functionality to verify**:

- Login still works ✓
- Logout still works ✓
- Session timeout still works ✓
- Concurrent logins are handled correctly ✓
- Session data persists across server restarts ✓ (new capability)
- Password reset flow still works ✓
- OAuth login still works ✓

If you only tested "login works", you missed 6 other things that could break.
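Where the project has test infrastructure, that adjacent-functionality list can become automated checks instead of a one-off manual pass. Below is a minimal sketch in Jest style, assuming hypothetical `loginAs`/`logout`/`restartServer` test helpers and a `sessionStore` module - the names are illustrative, not from any particular codebase:

```javascript
// Regression checks for the session-storage change (memory -> database).
// loginAs(), logout(), restartServer(), and sessionStore are hypothetical stand-ins.
const { loginAs, logout, restartServer } = require('./testHelpers');
const sessionStore = require('../src/sessionStore');

describe('session storage regression checks', () => {
  test('login creates a session in the database-backed store', async () => {
    const { sessionId } = await loginAs('admin');
    expect(await sessionStore.get(sessionId)).not.toBeNull();
  });

  test('logout removes the session', async () => {
    const { sessionId } = await loginAs('admin');
    await logout(sessionId);
    expect(await sessionStore.get(sessionId)).toBeNull();
  });

  test('sessions survive a server restart (new capability)', async () => {
    const { sessionId } = await loginAs('admin');
    await restartServer();
    expect(await sessionStore.get(sessionId)).not.toBeNull();
  });

  test('concurrent logins each get an independent session', async () => {
    const sessions = await Promise.all([loginAs('alice'), loginAs('bob')]);
    const ids = sessions.map(s => s.sessionId);
    expect(new Set(ids).size).toBe(ids.length);
  });
});
```

Each bullet in the checklist becomes a test that fails loudly if a future change regresses it.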
**Strategy**: Write a failing test that reproduces the bug, then fix until the test passes.

**Benefits**:

- Proves you can reproduce the bug
- Provides automatic verification
- Prevents regression in the future
- Forces you to understand the bug precisely

**Process**:

1. **Write a test that reproduces the bug**

   ```javascript
   test('should handle undefined user data gracefully', () => {
     const result = processUserData(undefined);
     expect(result).toBe(null); // Currently throws error
   });
   ```

2. **Verify the test fails** (confirms it reproduces the bug)

   ```
   ✗ should handle undefined user data gracefully
     TypeError: Cannot read property 'name' of undefined
   ```

3. **Fix the code**

   ```javascript
   function processUserData(user) {
     if (!user) return null; // Add defensive check
     return user.name;
   }
   ```

4. **Verify the test passes**

   ```
   ✓ should handle undefined user data gracefully
   ```

5. **Test is now regression protection** - If someone breaks this again, the test will catch it

**When to use**:

- Clear, reproducible bugs
- Code that has test infrastructure
- Bugs that could recur

**When not to use**:

- Exploratory debugging (you don't understand the bug yet)
- Infrastructure issues (can't easily test)
- One-off data issues

**The trap**: "Works on my machine"

**Reality**: Production is different.

**Differences to consider**:

**Environment variables**:

- `NODE_ENV=development` vs `NODE_ENV=production`
- Different API keys
- Different database connections
- Different feature flags

**Dependencies**:

- Different package versions (if not locked)
- Different system libraries
- Different Node/Python/etc. versions

**Data**:

- Volume (100 records locally, 1M in production)
- Quality (clean test data vs messy real data)
- Edge cases (nulls, special characters, extreme values)

**Network**:

- Latency (local: 5ms, production: 200ms)
- Reliability (local: perfect, production: occasional failures)
- Firewalls, proxies, load balancers

**Verification checklist**:

```markdown
- [ ] Works locally (dev environment)
- [ ] Works in Docker container (mimics production)
- [ ] Works in staging (production-like)
- [ ] Works in production (the real test)
```

**Bug**: Batch processing fails in production but works locally

**Investigation**:

- Local: 100 test records, completes in 2 seconds
- Production: 50,000 records, times out at 30 seconds

**The difference**: Volume. Local testing didn't catch it.

**Fix verification**:

- Test locally with 50,000 records
- Verify performance in staging
- Monitor the first production run
- Confirm all environments work

**The problem**: It worked once, but will it work reliably?

**Intermittent bugs are the worst**:

- Hard to reproduce
- Hard to verify fixes
- Easy to declare fixed when they're not

**Verification strategies**:

**1. Repeated execution**

```bash
for i in {1..100}; do
  npm test -- specific-test.js || echo "Failed on run $i"
done
```

If it fails even once, it's not fixed.

**2. Stress testing**

```javascript
// Run many instances in parallel
const promises = Array(50).fill().map(() => processData(testInput));
const results = await Promise.all(promises);
// All results should be correct
```

**3. Soak testing**

- Run for an extended period (hours, days)
- Monitor for memory leaks and performance degradation
- Ensure stability over time
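Soak tests are easy to describe and easy to skip, so it helps to script them. A minimal sketch, assuming the same hypothetical `processData` and `testInput` used in the stress test above - run the operation for a fixed duration and log heap usage so leaks and slowdowns become visible:

```javascript
// Minimal soak-test sketch: run the operation repeatedly for a fixed duration
// and report heap usage at intervals. `processData` and `testInput` are
// hypothetical stand-ins for the code under test.
async function soakTest({ durationMs = 60 * 60 * 1000, reportEveryMs = 60 * 1000 } = {}) {
  const start = Date.now();
  let iterations = 0;
  let lastReport = start;

  while (Date.now() - start < durationMs) {
    await processData(testInput);
    iterations++;

    if (Date.now() - lastReport >= reportEveryMs) {
      const heapMb = (process.memoryUsage().heapUsed / 1024 / 1024).toFixed(1);
      console.log(`soak: ${iterations} iterations, heapUsed=${heapMb} MB`);
      lastReport = Date.now();
    }
  }
  console.log(`soak complete: ${iterations} iterations over ${durationMs / 1000}s`);
}
```

A steadily climbing heapUsed, or a falling iteration rate between reports, is the signal to keep investigating before calling the fix stable.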
**4. Timing variations**

```javascript
// For race conditions, add random delays between steps so that
// different interleavings actually get exercised.
// triggerAction1/triggerAction2/verifyResult stand in for the code under test.
const randomDelay = (min, max) =>
  new Promise(resolve => setTimeout(resolve, min + Math.random() * (max - min)));

async function testWithRandomTiming() {
  await randomDelay(0, 100);
  triggerAction1();
  await randomDelay(0, 100);
  triggerAction2();
  await randomDelay(0, 100);
  verifyResult();
}
// Run this 1000 times
```

**Bug**: Race condition in file upload

**Weak verification**:

- Upload one file
- "It worked!"
- Ship it

**Strong verification**:

- Upload 100 files sequentially: all succeed ✓
- Upload 20 files in parallel: all succeed ✓
- Upload while navigating away: handles correctly ✓
- Upload, cancel, upload again: works ✓
- Run all tests 50 times: zero failures ✓

Now it's verified.

Copy this checklist when verifying a fix:

```markdown
### Original Issue
- [ ] Can reproduce the original bug before the fix
- [ ] Have documented exact reproduction steps

### Fix Validation
- [ ] Original reproduction steps now work correctly
- [ ] Can explain WHY the fix works
- [ ] Fix is minimal and targeted

### Regression Testing
- [ ] Adjacent feature 1: [name] works
- [ ] Adjacent feature 2: [name] works
- [ ] Adjacent feature 3: [name] works
- [ ] Existing tests pass
- [ ] Added test to prevent regression

### Environment Testing
- [ ] Works in development
- [ ] Works in staging/QA
- [ ] Works in production
- [ ] Tested with production-like data volume

### Stability Testing
- [ ] Tested multiple times (n=__): zero failures
- [ ] Tested edge cases: [list them]
- [ ] Tested under load/stress: stable

### Documentation
- [ ] Code comments explain the fix
- [ ] Commit message explains the root cause
- [ ] If needed, updated user-facing docs

### Sign-off
- [ ] I understand why this bug occurred
- [ ] I understand why this fix works
- [ ] I've verified it works in all relevant environments
- [ ] I've tested for regressions
- [ ] I'm confident this won't recur
```

**Do not merge/deploy until all checkboxes are checked.**

Your verification might be wrong if:

**1. You can't reproduce the original bug anymore**

- Maybe you forgot how
- Maybe the environment changed
- Maybe you're testing the wrong thing
- **Action**: Document reproduction steps FIRST, before fixing

**2. The fix is large or complex**

- Changed 10 files, modified 200 lines
- Too many moving parts
- **Action**: Simplify the fix, then verify each piece

**3. You're not sure why it works**

- "I changed X and the bug went away"
- But you can't explain the mechanism
- **Action**: Investigate until you understand, then verify

**4. It only works sometimes**

- "Usually works now"
- "Seems more stable"
- **Action**: Not verified. Find and fix the remaining issue

**5. You can't test in production-like conditions**

- Only tested locally
- Different data, different scale
- **Action**: Set up a staging environment or test with production-scale data in dev

**Red flag phrases**:

- "It seems to work"
- "I think it's fixed"
- "Looks good to me"
- "Can't reproduce anymore" (but you never could reliably)

**Trust-building phrases**:

- "I've verified 50 times - zero failures"
- "All tests pass, including a new regression test"
- "Deployed to staging, tested for 3 days, no issues"
- "Root cause was X, fix addresses X directly, verified by Y"

**Assume your fix is wrong until proven otherwise.** This isn't pessimism - it's professionalism.

**Questions to ask yourself**:

- "How could this fix fail?"
- "What haven't I tested?"
- "What am I assuming?"
- "Would this survive production?"
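One way to answer those questions with evidence rather than intuition is to script the stability check instead of running it by hand. A minimal sketch - `reproduceIssue` is a hypothetical async function that wraps your documented reproduction steps and resolves to true when the behavior is correct:

```javascript
// Run the documented reproduction steps N times and report every failure,
// so "verified 50 times - zero failures" is a measured claim, not a feeling.
// `reproduceIssue` is a hypothetical stand-in for your reproduction script.
async function verifyFix(runs = 50) {
  const failures = [];

  for (let i = 1; i <= runs; i++) {
    try {
      const ok = await reproduceIssue();
      if (!ok) failures.push({ run: i, reason: 'incorrect behavior' });
    } catch (err) {
      failures.push({ run: i, reason: err.message });
    }
  }

  console.log(`${runs - failures.length}/${runs} runs passed`);
  failures.forEach(f => console.log(`  run ${f.run} failed: ${f.reason}`));
  return failures.length === 0; // anything less than all passes is "not verified"
}
```

If this returns false even once, the sign-off checklist above stays unchecked.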
**The cost of insufficient verification**:

- Bug returns in production
- User frustration
- Lost trust
- Emergency debugging sessions
- Rollbacks

**The benefit of thorough verification**:

- Confidence in deployment
- Prevention of regressions
- Trust from team
- Learning from the investigation

**Verification is not optional. It's the most important part of debugging.**