Bad hypotheses (unfalsifiable):
- "Something is wrong with the state"
- "The timing is off"
- "There's a race condition somewhere"
- "The library is buggy"
Good hypotheses (falsifiable):
- "The user state is being reset because the component remounts when the route changes"
- "The API call completes after the component unmounts, causing the state update on unmounted component warning"
- "Two async operations are modifying the same array without locking, causing data loss"
- "The library's caching mechanism is returning stale data because our cache key doesn't include the timestamp"
The difference: Specificity. Good hypotheses make specific, testable claims.
<how_to_form> Process for forming hypotheses:
- Observe the behavior precisely
  - Not "it's broken"
  - But "the counter shows 3 when clicking once, should show 1"
- Ask "What could cause this?"
  - List every possible cause you can think of
  - Don't judge them yet, just brainstorm
- Make each hypothesis specific
  - Not "state is wrong"
  - But "state is being updated twice because handleClick is called twice"
- Identify what evidence would support/refute each
  - If hypothesis X is true, I should see Y
  - If hypothesis X is false, I should see Z
Vague hypothesis: "The save isn't working reliably" ❌ Unfalsifiable, not specific
Specific hypotheses:
- "The save API call is timing out when network is slow"
  - Testable: Check the network tab for timeout errors
  - Falsifiable: If all requests complete successfully, this is wrong
- "The save button is being double-clicked, and the second request overwrites with stale data"
  - Testable: Add logging to count clicks (see the sketch after this list)
  - Falsifiable: If only one click is registered, this is wrong
- "The save is successful but the UI doesn't update because the response is being ignored"
  - Testable: Check whether the API returns success
  - Falsifiable: If the UI updates on a successful response, this is wrong </how_to_form>
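For example, the double-click hypothesis can be instrumented in a few lines. A minimal sketch, assuming a hypothetical `#save-button` element and `saveForm()` handler:
```
// Count and timestamp clicks on the save button to test the double-click hypothesis.
const saveButton = document.querySelector('#save-button'); // hypothetical selector
let clickCount = 0;

saveButton.addEventListener('click', () => {
  clickCount += 1;
  console.log(`[save] click #${clickCount} at ${Date.now()}`);
  saveForm(); // hypothetical existing save handler
});

// Two clicks logged milliseconds apart for a single save → the hypothesis survives.
// Exactly one click per save → the hypothesis is refuted.
```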
<experimental_design> An experiment is a test that produces evidence supporting or refuting a hypothesis.
Good experiments:
- Test one hypothesis at a time
- Have clear success/failure criteria
- Produce unambiguous results
- Are repeatable
Bad experiments:
- Test multiple things at once
- Have unclear outcomes ("maybe it works better?")
- Rely on subjective judgment
- Can't be reproduced
1. Prediction: If hypothesis H is true, then I will observe X
2. Test setup: What do I need to do to test this?
3. Measurement: What exactly am I measuring?
4. Success criteria: What result confirms H? What result refutes H?
5. Run the experiment: Execute the test
6. Observe the result: Record what actually happened
7. Conclude: Does this support or refute H?
**Hypothesis**: "The component is re-rendering excessively because the parent is passing a new object reference on every render"
1. Prediction: If true, the component will re-render even when the object's values haven't changed
2. Test setup (see the sketch after this section):
- Add console.log in component body to count renders
- Add console.log in parent to track when object is created
- Add useEffect with the object as dependency to log when it changes
3. Measurement: Count of renders and object creations
4. Success criteria:
- Confirms H: Component re-renders match parent renders, object reference changes each time
- Refutes H: Component only re-renders when object values actually change
5. Run: Execute the code with logging
6. Observe:
```
[Parent] Created user object
[Child] Rendering (1)
[Parent] Created user object
[Child] Rendering (2)
[Parent] Created user object
[Child] Rendering (3)
```
7. Conclude: CONFIRMED. New object every parent render → child re-renders </experimental_design>
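A minimal sketch of the logging setup from step 2, assuming a hypothetical `Parent`/`Child` pair (the component and field names are not from the original code):
```
import React, { useEffect, useRef, useState } from 'react';

function Child({ user }) {
  // Count renders directly in the component body.
  const renders = useRef(0);
  renders.current += 1;
  console.log(`[Child] Rendering (${renders.current})`);

  // Fires only when the `user` reference changes between renders.
  useEffect(() => {
    console.log('[Child] user reference changed');
  }, [user]);

  return <span>{user.name}</span>;
}

function Parent() {
  const [tick, setTick] = useState(0);
  const user = { name: 'Ada' }; // new object reference created on every render
  console.log('[Parent] Created user object');

  return (
    <div>
      <Child user={user} />
      <button onClick={() => setTick(tick + 1)}>Re-render parent ({tick})</button>
    </div>
  );
}

export default Parent;
```
If the child logs a render and a reference change on every parent render even though the object's values never change, the hypothesis is confirmed; memoizing the object (for example with `useMemo`) would be the corresponding fix.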
<evidence_quality> Not all evidence is equal. Learn to distinguish strong from weak evidence.
Strong evidence:
- Directly observable ("I can see in the logs that X happens")
- Repeatable ("This fails every time I do Y")
- Unambiguous ("The value is definitely null, not undefined")
- Independent ("This happens even in a fresh browser with no cache")
Weak evidence:
- Hearsay ("I think I saw this fail once")
- Non-repeatable ("It failed that one time but I can't reproduce it")
- Ambiguous ("Something seems off")
- Confounded ("It works after I restarted the server and cleared the cache and updated the package")
Weak: "I think the user ID might not be set correctly sometimes" ❌ Vague, not verified, uncertain
Strong:
```
// Run the suspect function repeatedly and record every failing iteration.
// `processData`, `testData`, and `expected` are the code and fixtures under test.
const failures = [];
for (let i = 0; i < 100; i++) {
  const result = processData(testData);
  if (result !== expected) {
    failures.push(i);
  }
}
console.log('Failed on iterations:', failures);
// Output: Failed on iterations: [3, 7, 12, 23, 31, ...]
```
✅ Repeatable, shows pattern
Weak: "It usually works, but sometimes fails" ❌ Not quantified, no pattern identified </evidence_quality>
<decision_point> Don't act too early (premature fix) or too late (analysis paralysis).
Act when you can answer YES to all:
- Do you understand the mechanism?
  - Not just "what fails" but "why it fails"
  - Can you explain the chain of events that produces the bug?
- Can you reproduce it reliably?
  - Either it always reproduces, or you understand the conditions that trigger it
  - If you can't reproduce it, you don't understand it yet
- Do you have evidence, not just theory?
  - You've observed the behavior directly
  - You've logged the values and traced the execution
  - You're not guessing
- Have you ruled out alternatives?
  - You've considered other hypotheses
  - The evidence contradicts the alternatives
  - This is the most likely cause, not just the first idea
Don't act if:
- "I think it might be X" - Too uncertain
- "This could be the issue" - Not confident enough
- "Let me try changing Y and see" - Random changes, not hypothesis-driven
- "I'll fix it and if it works, great" - Outcome-based, not understanding-based
Right time (act):
- Hypothesis: "API response is missing the 'status' field when user is inactive, causing the app to crash"
- Evidence:
- Logged API response for active user: has 'status' field
- Logged API response for inactive user: missing 'status' field
- Logged app behavior: crashes on accessing undefined status
- Action: Add a defensive check for the missing status field (see the sketch after this section)
- Result: Bug fixed because you understood the cause </decision_point>
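A minimal sketch of that defensive check, assuming the response shape from the example (the helper name and the fallback value are hypothetical):
```
// Guard against the 'status' field the API omits for inactive users.
function getStatus(apiResponse) {
  if (apiResponse.status === undefined) {
    return 'inactive'; // assumed fallback; the right default depends on the app
  }
  return apiResponse.status;
}

// At the call site that previously crashed:
// const status = getStatus(response);
```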
When your hypothesis is disproven:
- Acknowledge it explicitly
  - "This hypothesis was wrong because [evidence]"
  - Don't gloss over it or rationalize
  - Be intellectually honest with yourself
- Extract the learning
  - What did this experiment teach you?
  - What did you rule out?
  - What new information do you have?
- Revise your understanding
  - Update your mental model
  - What does the evidence actually suggest?
- Form new hypotheses
  - Based on what you now know
  - Don't just jump to your second guess; let the evidence point the way
- Don't get attached to hypotheses
  - You're not your ideas
  - Being wrong quickly is better than being wrong slowly
Example: the hypothesis was that leaked event listeners were causing the memory growth.
Experiment: Check Chrome DevTools for listener counts
Result: Listener count stays stable, doesn't grow over time
Recovery:
- ✅ "Event listeners are NOT the cause. The count doesn't increase."
- ✅ "I've ruled out event listeners as the culprit"
- ✅ "But the memory profile shows objects accumulating. What objects? Let me check the heap snapshot..."
- ✅ "New hypothesis: Large arrays are being cached and never released. Let me test by checking the heap for array sizes..."
This is good debugging. Wrong hypothesis, quick recovery, better understanding.
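One way to instrument that follow-up check, as a rough sketch: periodically sample the JS heap size and see whether it climbs while the page is idle (`performance.memory` is a Chrome-only, non-standard API; the interval and duration here are arbitrary).
```
// Sample used heap size every 5 seconds; a steady climb while idle supports
// the "objects are accumulating" observation, a flat line argues against it.
const samples = [];

const timer = setInterval(() => {
  const usedMb = performance.memory.usedJSHeapSize / (1024 * 1024);
  samples.push(usedMb);
  console.log(`[heap] ${usedMb.toFixed(1)} MB used (sample ${samples.length})`);
}, 5000);

// Stop after five minutes and inspect the trend.
setTimeout(() => clearInterval(timer), 5 * 60 * 1000);
```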
<multiple_hypotheses> Don't fall in love with your first hypothesis. Generate multiple alternatives.
Strategy: "Strong inference" - Design experiments that differentiate between competing hypotheses.
**Problem**: Form submission fails intermittently
Competing hypotheses:
1. Network timeout
2. Validation failure
3. Race condition with auto-save
4. Server-side rate limiting
Design an experiment that differentiates between them by adding logging at each stage:
```
try {
  console.log('[1] Starting validation');
  const validation = await validate(formData);
  console.log('[1] Validation passed:', validation);

  console.log('[2] Starting submission');
  const response = await api.submit(formData);
  console.log('[2] Response received:', response.status);

  console.log('[3] Updating UI');
  updateUI(response);
  console.log('[3] Complete');
} catch (error) {
  console.log('[ERROR] Failed at stage:', error);
}
```
Observe results:
- Fails at [2] with timeout error → Hypothesis 1
- Fails at [1] with validation error → Hypothesis 2
- Succeeds but [3] has wrong data → Hypothesis 3
- Fails at [2] with 429 status → Hypothesis 4
One experiment, differentiates between four hypotheses. </multiple_hypotheses>
```
1. Observe unexpected behavior
   ↓
2. Form specific hypotheses (plural)
   ↓
3. For each hypothesis: What would prove/disprove it?
   ↓
4. Design experiment to test
   ↓
5. Run experiment
   ↓
6. Observe results
   ↓
7. Evaluate: Confirmed, refuted, or inconclusive?
   ↓
8a. If CONFIRMED → Design fix based on understanding
8b. If REFUTED → Return to step 2 with new hypotheses
8c. If INCONCLUSIVE → Redesign experiment or gather more data
```
Key insight: This is a loop, not a line. You'll cycle through multiple times. That's expected.
Pitfall: Testing multiple hypotheses at once
- You change three things and it works
- Which one fixed it? You don't know
- Solution: Test one hypothesis at a time
Pitfall: Confirmation bias in experiments
- You only look for evidence that confirms your hypothesis
- You ignore evidence that contradicts it
- Solution: Actively seek disconfirming evidence
Pitfall: Acting on weak evidence
- "It seems like maybe this could be..."
- Solution: Wait for strong, unambiguous evidence
Pitfall: Not documenting results
- You forget what you tested
- You repeat the same experiments
- Solution: Write down each hypothesis and its result (a minimal example follows below)
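A minimal example of such a record, reusing the hypothetical form-submission bug from the previous section (the outcomes shown are illustrative, not real results):
```
## Debugging log: intermittent form submission failure
- H1 Network timeout: REFUTED (all requests completed in under 2s)
- H2 Validation failure: REFUTED (stage [1] logged success in every run)
- H3 Race with auto-save: UNTESTED
- H4 Server rate limiting: CONFIRMED (stage [2] returned 429 after rapid submits)
```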
Pitfall: Giving up on the scientific method
- Under pressure, you start making random changes
- "Let me just try this..."
- Solution: Double down on rigor when pressure increases
This is the difference between guessing and debugging.