<overview>
Debugging is applied scientific method. You observe a phenomenon (the bug), form hypotheses about its cause, design experiments to test those hypotheses, and revise based on evidence. This isn't metaphorical - it's literal experimental science.
</overview>
<principle name="falsifiability">
A good hypothesis can be proven wrong. If you can't design an experiment that could disprove it, it's not a useful hypothesis.
**Bad hypotheses** (unfalsifiable):
- "Something is wrong with the state"
- "The timing is off"
- "There's a race condition somewhere"
- "The library is buggy"
**Good hypotheses** (falsifiable):
- "The user state is being reset because the component remounts when the route changes"
- "The API call completes after the component unmounts, causing the state update on unmounted component warning"
- "Two async operations are modifying the same array without locking, causing data loss"
- "The library's caching mechanism is returning stale data because our cache key doesn't include the timestamp"
**The difference**: Specificity. Good hypotheses make specific, testable claims.
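Specificity is what makes a hypothesis mechanically checkable. The first good hypothesis above, for instance, can be tested with two log lines. A minimal sketch, assuming a React function component (`UserPanel` is an illustrative name):
```javascript
import { useEffect } from 'react';

function UserPanel() {
  useEffect(() => {
    // If a route change really remounts this component, the logs will show an
    // unmount/mount pair on every navigation. If they don't, the hypothesis is refuted.
    console.log('[UserPanel] mounted');
    return () => console.log('[UserPanel] unmounted');
  }, []);

  return null; // render omitted; only the lifecycle logging matters here
}
```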
</principle>
<how_to_form>
**Process for forming hypotheses**:
1. **Observe the behavior precisely**
   - Not "it's broken"
   - But "the counter shows 3 after a single click; it should show 1"
2. **Ask "What could cause this?"**
   - List every possible cause you can think of
   - Don't judge them yet; just brainstorm
3. **Make each hypothesis specific**
   - Not "state is wrong"
   - But "state is being updated twice because handleClick is called twice"
4. **Identify what evidence would support or refute each**
   - If hypothesis X is true, I should see Y
   - If hypothesis X is false, I should see Z
<example>
**Observation**: Button click sometimes saves data, sometimes doesn't.
**Vague hypothesis**: "The save isn't working reliably"
❌ Unfalsifiable, not specific
**Specific hypotheses**:
1. "The save API call is timing out when network is slow"
- Testable: Check network tab for timeout errors
- Falsifiable: If all requests complete successfully, this is wrong
2. "The save button is being double-clicked, and the second request overwrites with stale data"
- Testable: Add logging to count clicks
- Falsifiable: If only one click is registered, this is wrong
3. "The save is successful but the UI doesn't update because the response is being ignored"
- Testable: Check if API returns success
- Falsifiable: If UI updates on successful response, this is wrong
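Hypothesis 2, for example, turns into a few lines of instrumentation. A minimal sketch, assuming a plain DOM button (`saveButton` and `save` are illustrative names):
```javascript
let lastClickAt = 0;

saveButton.addEventListener('click', () => {
  const now = Date.now();
  if (now - lastClickAt < 500) {
    // Two clicks within 500ms supports the double-click hypothesis.
    console.warn(`Double click detected: ${now - lastClickAt}ms apart`);
  }
  lastClickAt = now;
  save();
});
```
If the warning never fires while the bug still reproduces, hypothesis 2 is refuted and you move on.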
</example>
</how_to_form>
<experimental_design>
An experiment is a test that produces evidence supporting or refuting a hypothesis.
**Good experiments**:
- Test one hypothesis at a time
- Have clear success/failure criteria
- Produce unambiguous results
- Are repeatable
**Bad experiments**:
- Test multiple things at once
- Have unclear outcomes ("maybe it works better?")
- Rely on subjective judgment
- Can't be reproduced
<framework>
For each hypothesis, design an experiment:
**1. Prediction**: If hypothesis H is true, then I will observe X
**2. Test setup**: What do I need to do to test this?
**3. Measurement**: What exactly am I measuring?
**4. Success criteria**: What result confirms H? What result refutes H?
**5. Run the experiment**: Execute the test
**6. Observe the result**: Record what actually happened
**7. Conclude**: Does this support or refute H?
</framework>
<example>
**Hypothesis**: "The component is re-rendering excessively because the parent is passing a new object reference on every render"
**1. Prediction**: If true, the component will re-render even when the object's values haven't changed
**2. Test setup**:
- Add console.log in component body to count renders
- Add console.log in parent to track when object is created
- Add useEffect with the object as dependency to log when it changes
**3. Measurement**: Count of renders and object creations
**4. Success criteria**:
- Confirms H: Component re-renders match parent renders, object reference changes each time
- Refutes H: Component only re-renders when object values actually change
**5. Run**: Execute the code with logging
**6. Observe**:
```
[Parent] Created user object
[Child] Rendering (1)
[Parent] Created user object
[Child] Rendering (2)
[Parent] Created user object
[Child] Rendering (3)
```
**7. Conclude**: CONFIRMED. New object every parent render → child re-renders
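The instrumentation behind this run is small. A sketch of step 2, assuming a React setup (`Parent`, `Child`, and the `user` shape are illustrative):
```javascript
import React, { useEffect, useRef } from 'react';

// Memoized child: it should skip re-renders when its props are shallow-equal.
// If it still re-renders on every parent render, the `user` reference must be changing.
const Child = React.memo(function Child({ user }) {
  const renders = useRef(0);
  renders.current += 1;
  console.log(`[Child] Rendering (${renders.current})`);

  useEffect(() => {
    console.log('[Child] user reference changed');
  }, [user]);

  return <p>{user.name}</p>;
});

function Parent() {
  // A new object literal is created on every render - the suspected cause.
  const user = { name: 'Ada', role: 'admin' };
  console.log('[Parent] Created user object');
  return <Child user={user} />;
}
```
`React.memo` is what makes this experiment decisive: without it, the child re-renders whenever the parent does, and the object reference couldn't be isolated as the cause.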
</example>
</experimental_design>
<evidence_quality>
Not all evidence is equal. Learn to distinguish strong from weak evidence.
**Strong evidence**:
- Directly observable ("I can see in the logs that X happens")
- Repeatable ("This fails every time I do Y")
- Unambiguous ("The value is definitely null, not undefined")
- Independent ("This happens even in a fresh browser with no cache")
**Weak evidence**:
- Hearsay ("I think I saw this fail once")
- Non-repeatable ("It failed that one time but I can't reproduce it")
- Ambiguous ("Something seems off")
- Confounded ("It works after I restarted the server and cleared the cache and updated the package")
<examples>
**Strong**:
```javascript
console.log('User ID:', userId); // Output: User ID: undefined
console.log('Type:', typeof userId); // Output: Type: undefined
```
✅ Direct observation, unambiguous
**Weak**:
"I think the user ID might not be set correctly sometimes"
❌ Vague, not verified, uncertain
**Strong**:
```javascript
const failures = [];
for (let i = 0; i < 100; i++) {
  const result = processData(testData);
  if (result !== expected) {
    failures.push(i);
  }
}
console.log('Failed on iterations:', failures.join(', '));
// Output: Failed on iterations: 3, 7, 12, 23, 31...
```
✅ Repeatable, shows pattern
**Weak**:
"It usually works, but sometimes fails"
❌ Not quantified, no pattern identified
</examples>
</evidence_quality>
<decision_point>
Don't act too early (premature fix) or too late (analysis paralysis).
**Act when you can answer YES to all**:
1. **Do you understand the mechanism?**
   - Not just "what fails" but "why it fails"
   - Can you explain the chain of events that produces the bug?
2. **Can you reproduce it reliably?**
   - Either it always reproduces, or you understand the conditions that trigger it
   - If you can't reproduce it, you don't understand it yet
3. **Do you have evidence, not just theory?**
   - You've observed the behavior directly
   - You've logged the values and traced the execution
   - You're not guessing
4. **Have you ruled out alternatives?**
   - You've considered other hypotheses
   - The evidence contradicts the alternatives
   - This is the most likely cause, not just the first idea
**Don't act if**:
- "I think it might be X" - Too uncertain
- "This could be the issue" - Not confident enough
- "Let me try changing Y and see" - Random changes, not hypothesis-driven
- "I'll fix it and if it works, great" - Outcome-based, not understanding-based
<example>
**Too early** (don't act):
- Hypothesis: "Maybe the API is slow"
- Evidence: None, just a guess
- Action: Add caching
- Result: Bug persists, now you have caching to debug too
**Right time** (act):
- Hypothesis: "API response is missing the 'status' field when user is inactive, causing the app to crash"
- Evidence:
  - Logged API response for active user: has 'status' field
  - Logged API response for inactive user: missing 'status' field
  - Logged app behavior: crashes on accessing undefined status
- Action: Add defensive check for missing status field
- Result: Bug fixed because you understood the cause
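Grounded in that evidence, the fix can be precise. A minimal sketch, assuming the response shape from the logs above (names are illustrative):
```javascript
function renderUserStatus(user) {
  // Responses for inactive users lack a 'status' field; default instead of crashing.
  const status = user.status ?? 'inactive';
  return `Status: ${status}`;
}
```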
</example>
</decision_point>
<recovery>
You will be wrong sometimes. This is normal. The skill is recovering gracefully.
**When your hypothesis is disproven**:
1. **Acknowledge it explicitly**
   - "This hypothesis was wrong because [evidence]"
   - Don't gloss over it or rationalize
   - Be intellectually honest with yourself
2. **Extract the learning**
   - What did this experiment teach you?
   - What did you rule out?
   - What new information do you have?
3. **Revise your understanding**
   - Update your mental model
   - What does the evidence actually suggest?
4. **Form new hypotheses**
   - Based on what you now know
   - Don't just jump to your next guess; let the evidence point the way
5. **Don't get attached to hypotheses**
   - You're not your ideas
   - Being wrong quickly is better than being wrong slowly
<example>
**Initial hypothesis**: "The memory leak is caused by event listeners not being cleaned up"
**Experiment**: Check Chrome DevTools for listener counts
**Result**: Listener count stays stable, doesn't grow over time
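The count itself can come straight from the DevTools console. A sketch using `getEventListeners`, a Chrome-only console utility (it covers only the node you pass it, so repeat for other suspect nodes):
```javascript
// Run in the Chrome DevTools console; getEventListeners is not a web API.
const count = Object.values(getEventListeners(document))
  .reduce((total, listeners) => total + listeners.length, 0);
console.log('Listeners on document:', count);
// Exercise the app, then run it again: a stable count refutes the leak hypothesis.
```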
**Recovery**:
1. ✅ "Event listeners are NOT the cause. The count doesn't increase."
2. ✅ "I've ruled out event listeners as the culprit"
3. ✅ "But the memory profile shows objects accumulating. What objects? Let me check the heap snapshot..."
4. ✅ "New hypothesis: Large arrays are being cached and never released. Let me test by checking the heap for array sizes..."
This is good debugging. Wrong hypothesis, quick recovery, better understanding.
</example>
</recovery>
<multiple_hypotheses>
Don't fall in love with your first hypothesis. Generate multiple alternatives.
**Strategy**: "Strong inference" - Design experiments that differentiate between competing hypotheses.
<example>
**Problem**: Form submission fails intermittently
**Competing hypotheses**:
1. Network timeout
2. Validation failure
3. Race condition with auto-save
4. Server-side rate limiting
**Design experiment that differentiates**:
Add logging at each stage:
```javascript
try {
  console.log('[1] Starting validation');
  const validation = await validate(formData);
  console.log('[1] Validation passed:', validation);

  console.log('[2] Starting submission');
  const response = await api.submit(formData);
  console.log('[2] Response received:', response.status);

  console.log('[3] Updating UI');
  updateUI(response);
  console.log('[3] Complete');
} catch (error) {
  console.log('[ERROR] Failed at stage:', error);
}
```
**Observe results**:
- Fails at [2] with timeout error → Hypothesis 1
- Fails at [1] with validation error → Hypothesis 2
- Succeeds but [3] has wrong data → Hypothesis 3
- Fails at [2] with 429 status → Hypothesis 4
**One experiment, differentiates between four hypotheses.**
</example>
</multiple_hypotheses>
<workflow>
```
1. Observe unexpected behavior
2. Form specific hypotheses (plural)
3. For each hypothesis: What would prove/disprove?
4. Design experiment to test
5. Run experiment
6. Observe results
7. Evaluate: Confirmed, refuted, or inconclusive?
8a. If CONFIRMED → Design fix based on understanding
8b. If REFUTED → Return to step 2 with new hypotheses
8c. If INCONCLUSIVE → Redesign experiment or gather more data
```
**Key insight**: This is a loop, not a line. You'll cycle through multiple times. That's expected.
</workflow>
<pitfalls>
**Pitfall: Testing multiple hypotheses at once**
- You change three things and it works
- Which one fixed it? You don't know
- Solution: Test one hypothesis at a time
**Pitfall: Confirmation bias in experiments**
- You only look for evidence that confirms your hypothesis
- You ignore evidence that contradicts it
- Solution: Actively seek disconfirming evidence
**Pitfall: Acting on weak evidence**
- "It seems like maybe this could be..."
- Solution: Wait for strong, unambiguous evidence
**Pitfall: Not documenting results**
- You forget what you tested
- You repeat the same experiments
- Solution: Write down each hypothesis and its result (see the sketch after this list)
**Pitfall: Giving up on the scientific method**
- Under pressure, you start making random changes
- "Let me just try this..."
- Solution: Double down on rigor when pressure increases
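The hypothesis log needs no tooling; plain text is enough. A hypothetical example (the bug and entries are illustrative):
```
Bug: form submission fails intermittently
H1 Network timeout ....... REFUTED: all requests completed in under 200ms
H2 Validation failure .... REFUTED: validation passed in every captured failure
H3 Double-click race ..... CONFIRMED: two POSTs logged 40ms apart
```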
</pitfalls>
<excellence>
**Great debuggers**:
- Form multiple competing hypotheses
- Design clever experiments that differentiate between them
- Follow the evidence wherever it leads
- Revise their beliefs when proven wrong
- Act only when they have strong evidence
- Understand the mechanism, not just the symptom
This is the difference between guessing and debugging.
</excellence>