Heuristics and Checklists Methodology
Advanced techniques for decision heuristics, checklist design, and cognitive bias mitigation.
Table of Contents
- When to Use Heuristics vs. Checklists
- Heuristics Research and Theory
- Checklist Design Principles
- Validating Heuristics and Checklists
- Refinement and Iteration
- Cognitive Biases and Mitigation
1. When to Use Heuristics vs. Checklists
Heuristics (Decision Shortcuts)
Use when:
- Time pressure (need to decide in <1 hour)
- Routine decisions (happens frequently, precedent exists)
- Good enough > perfect (satisficing appropriate)
- Environment stable (patterns repeat reliably)
- Cost of wrong decision low (<$10k impact)
Don't use when:
- Novel situations (no precedent, first time)
- High stakes (>$100k impact, irreversible)
- Adversarial environments (deception, misleading information)
- Multiple factors equally important (interactions matter)
- Legal/compliance requirements (need documentation)
Checklists (Procedural Memory Aids)
Use when:
- Complex procedures (>5 steps)
- Error-prone (history of mistakes)
- High consequences if step missed (safety, money, reputation)
- Infrequent procedures (easy to forget details)
- Multiple people involved (coordination needed)
Don't use when:
- Simple tasks (<3 steps)
- Fully automated (no human steps)
- Creative/exploratory work (checklist constrains unnecessarily)
- Expert with muscle memory (checklist adds overhead without value)
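To make the triage above concrete, here is a minimal sketch in Python; the function name, parameters, and thresholds are illustrative stand-ins for the criteria in the two lists, not a calibrated rule.

```python
def suggest_tool(impact_usd, is_novel, is_adversarial, n_steps, error_prone,
                 hours_available):
    """Rough triage of the criteria above; thresholds mirror the two lists."""
    if is_novel or is_adversarial or impact_usd > 100_000:
        return "full analysis"   # heuristics ruled out: novel, adversarial, or high stakes
    if hours_available < 1 and impact_usd < 10_000:
        return "heuristic"       # time pressure and low cost of a wrong call
    if n_steps > 5 and error_prone:
        return "checklist"       # complex, error-prone procedure
    return "judgment call"       # no criterion dominates
```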
2. Heuristics Research and Theory
Fast and Frugal Heuristics (Gigerenzer)
Key insight: Simple heuristics can outperform complex models in uncertain environments with limited information.
Classic heuristics:
- Recognition Heuristic: If you recognize one object but not the other, infer that the recognized object has the higher value on the criterion.
- Example: Which city is larger, Detroit or Tallahassee? (Recognize Detroit → Larger)
- Works when: Recognition correlates with criterion (r > 0.5)
- Fails when: Misleading advertising, niche quality
- Take-the-Best: Rank cues by validity, use the highest-validity cue that discriminates, and ignore the rest (see the sketch after this list).
- Example: Hiring based on coding test score alone (if validity >70%)
- Works when: One cue dominates, environment stable
- Fails when: Multiple factors interact, no dominant cue
- 1/N Heuristic: Divide resources equally among N options.
- Example: Investment portfolio - equal weight across stocks
- Works when: No information on which option better, diversification reduces risk
- Fails when: Clear quality differences exist
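Take-the-best is simple enough to express directly. A minimal sketch, assuming each cue is a callable that scores an option and that the cue list is pre-sorted by validity; the hiring cues in the usage example are hypothetical:

```python
def take_the_best(option_a, option_b, cues):
    """Decide between two options using cues sorted by validity (best first).
    The first cue that discriminates decides; all remaining cues are ignored."""
    for cue in cues:
        a, b = cue(option_a), cue(option_b)
        if a != b:                   # first discriminating cue wins
            return option_a if a > b else option_b
    return None                      # no cue discriminates: fall back to guessing

# Hypothetical hiring cues: coding test dominates, then years of experience
cues = [lambda c: c["test_score"], lambda c: c["years_experience"]]
pick = take_the_best({"test_score": 92, "years_experience": 3},
                     {"test_score": 78, "years_experience": 10}, cues)
```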
Satisficing (Herbert Simon)
Concept: Search until option meets aspiration level (threshold), then stop. Don't optimize.
Formula:
Aspiration level = f(past outcomes, time pressure, search costs)
Adaptive satisficing: Lower threshold if no options meet it after K searches. Raise threshold if too many options qualify.
Example:
- Job search: "Salary ≥$120k, culture fit ≥7/10"
- After 20 applications, no offers → Lower to $110k
- After 5 offers all meeting bar → Raise to $130k
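A minimal sketch of the stopping rule, assuming a stream of offers and a predicate that encodes the aspiration level from the example above:

```python
def satisfice(offers, meets_bar):
    """Stop at the first offer that clears the aspiration level; don't optimize."""
    for offer in offers:
        if meets_bar(offer):
            return offer             # good enough: stop searching
    return None                      # bar unmet: lower the aspiration level and retry

# The job-search bar from the example above
bar = lambda o: o["salary"] >= 120_000 and o["culture_fit"] >= 7
offer = satisfice([{"salary": 115_000, "culture_fit": 8},
                   {"salary": 125_000, "culture_fit": 7}], bar)  # second offer wins
```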
Ecological Rationality
Key insight: Heuristic's success depends on environment, not complexity.
Environment characteristics:
- Redundancy: Multiple cues correlated (take-the-best works)
- Predictability: Patterns repeat (recognition heuristic works)
- Volatility: Rapid change (simple heuristics adapt faster than complex models)
- Feedback speed: Fast feedback enables learning (trial-and-error refinement)
Mismatch example: Using recognition heuristic in adversarial environment (advertising creates false recognition) → Fails
3. Checklist Design Principles
Atul Gawande's Checklist Principles
Based on research in aviation, surgery, construction:
- Keep it short: 5-9 items max. Longer checklists get skipped.
- Focus on killer items: Steps that are often missed AND have serious consequences.
- Verb-first language: "Verify backups complete" not "Backups"
- Pause points: Define WHEN to use checklist (before start, after critical phase, before finish)
- Fit on one page: No scrolling or page-flipping
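As an illustration of these principles applied to a software deployment, here is a hypothetical checklist: five verb-first killer items, one explicit pause point, and short enough to fit on one screen. The items themselves are assumptions, not a recommended standard.

```python
# Hypothetical deployment checklist shaped by the principles above
DEPLOY_CHECKLIST = {
    "pause_point": "before running the deploy command",
    "items": [
        "Verify all tests pass on the release branch",
        "Verify database migration tested against a staging copy",
        "Confirm rollback procedure documented and rehearsed",
        "Confirm on-call engineer notified of the deploy window",
        "Verify monitoring dashboards open and alerts enabled",
    ],  # five items: within the 5-9 range
}
```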
READ-DO vs. DO-CONFIRM
READ-DO (Challenge-Response):
- Read item aloud → Perform action → Confirm → Next item
- Use for: Novices, unfamiliar procedures, irreversible actions
- Example: Surgical safety checklist (WHO)
DO-CONFIRM:
- Perform entire procedure from memory → Then review checklist to confirm all done
- Use for: Experts, routine procedures, flow state important
- Example: Aviation pre-flight (experienced pilots)
Which to choose?:
- Expertise level: Novice → READ-DO, Expert → DO-CONFIRM
- Familiarity: First time → READ-DO, Routine → DO-CONFIRM
- Consequences: Irreversible (surgery) → READ-DO, Reversible → DO-CONFIRM
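The two formats differ only in when the list is consulted. A minimal sketch, assuming `perform` and `confirm` callbacks supplied by the caller:

```python
def read_do(items, perform, confirm):
    """READ-DO: read an item, perform it, confirm it, then move to the next."""
    for item in items:
        perform(item)                # act on the item just read
        if not confirm(item):
            raise RuntimeError(f"Step not confirmed: {item}")

def do_confirm(items, confirm):
    """DO-CONFIRM: the procedure was already performed from memory; sweep the
    list afterwards to catch anything missed."""
    missed = [item for item in items if not confirm(item)]
    if missed:
        raise RuntimeError(f"Missed steps: {missed}")
```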
Forcing Functions and Fail-Safes
Forcing function: Design that prevents proceeding without completing step.
Examples:
- Car won't start unless seatbelt fastened
- Deployment script fails if tests not passing
- Door won't lock if key inside
vs. Checklist item: A checklist reminds but can be ignored; a forcing function prevents.
When to use forcing function instead of checklist:
- Critical safety step (must be done)
- Automatable check (can be enforced by system)
- High compliance needed (>99%)
When checklist sufficient:
- Judgment required (can't automate)
- Multiple valid paths (flexibility needed)
- Compliance good with reminder (>90%)
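The deployment example above can be sketched as a forcing function: the script refuses to proceed rather than merely reminding. `./deploy.sh` is a placeholder for the real deploy step.

```python
import subprocess
import sys

def deploy():
    """Forcing function: deployment cannot proceed unless the test suite passes."""
    tests = subprocess.run(["pytest", "-q"])           # run the test suite
    if tests.returncode != 0:
        sys.exit("Tests failing: deployment blocked")  # prevents, not reminds
    subprocess.run(["./deploy.sh"], check=True)        # placeholder deploy step
```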
4. Validating Heuristics and Checklists
Testing Heuristics
Retrospective validation: Test heuristic on historical cases.
Method:
- Collect past decisions (N ≥ 30 cases)
- Apply heuristic to each case (blind to actual outcome)
- Compare heuristic decision to actual outcome
- Calculate accuracy: % cases where heuristic would've chosen correctly
Target accuracy: ≥80% for good heuristic. <70% → Refine or abandon.
Example (Hiring heuristic):
- Heuristic: "Hire candidates from top 10 tech companies"
- Test on past 50 hires
- Outcome: 32/40 (80%) of top-company hires succeeded vs. 5/10 (50%) of other hires
- Conclusion: Heuristic valid (80% success vs. 50% for the comparison group)
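The retrospective method reduces to a single accuracy loop. A sketch, assuming each historical case records its `actual_outcome` and the heuristic returns a decision in the same form:

```python
def retrospective_accuracy(cases, heuristic):
    """Score a heuristic against historical outcomes (decisions made blind to them)."""
    if len(cases) < 30:
        raise ValueError("Need >= 30 cases for a meaningful estimate")
    hits = sum(1 for case in cases if heuristic(case) == case["actual_outcome"])
    return hits / len(cases)   # target >= 0.80; below 0.70, refine or abandon
```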
A/B Testing Heuristics
Prospective validation: Run controlled experiment.
Method:
- Group A: Use heuristic
- Group B: Use existing method (or random)
- Compare outcomes (quality, speed, consistency)
Example:
- A: Customer routing by fast & frugal tree
- B: Customer routing by availability
- Metrics: Response time, resolution rate, customer satisfaction
- Result: A faster (20% ↓ response time), higher satisfaction (8.2 vs. 7.5) → Adopt heuristic
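A minimal comparison helper for such an experiment; a real analysis would add a significance test and a sample-size plan, both omitted here:

```python
def compare_groups(group_a, group_b, metric):
    """Mean outcome for heuristic group A vs. control group B, plus relative lift."""
    mean_a = sum(metric(x) for x in group_a) / len(group_a)
    mean_b = sum(metric(x) for x in group_b) / len(group_b)
    return mean_a, mean_b, (mean_a - mean_b) / mean_b
```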
Checklist Validation
Error rate measurement:
Before checklist:
- Track error rate for procedure (e.g., deployments with failures)
- Baseline: X% error rate
After checklist:
- Introduce checklist
- Track error rate for same procedure
- New rate: Y% error rate
- Improvement: (X - Y) / X × 100%
Target: ≥50% error reduction. If <25%, checklist not effective.
Example (Surgical checklist):
- Before: 11% complication rate
- After: 7% complication rate
- Improvement: (11 - 7) / 11 = 36% reduction (good, continue using)
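The improvement formula in code, checked against the surgical numbers:

```python
def error_reduction(before_rate, after_rate):
    """Improvement = (X - Y) / X * 100, as defined above."""
    return (before_rate - after_rate) / before_rate * 100

print(error_reduction(0.11, 0.07))   # ~36.4%: the surgical-checklist example
```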
5. Refinement and Iteration
When Heuristics Fail
Diagnostic questions:
- Wrong cue: Are we using the best predictor? Try different criterion.
- Threshold too high/low: Should we raise/lower aspiration level?
- Environment changed: Did market shift, competition intensify, technology disrupt?
- Exceptions accumulating: Are special cases becoming the norm? Need more complex rule.
Refinement strategies:
- Add exception: "Use heuristic EXCEPT when [condition]"
- Adjust threshold: Satisficing level up/down based on outcomes
- Switch cue: Use different criterion if current one losing validity
- Add layer: Convert simple rule to fast & frugal tree (2-3 questions max)
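As an example of the "add layer" strategy, here is a hypothetical fast & frugal tree for customer routing (the A/B example from Section 4): three questions, each with one exit branch. The ticket fields and routes are illustrative.

```python
def route_ticket(ticket):
    """Fast & frugal tree: at most three questions, each with an exit branch.
    Question order follows cue validity."""
    if ticket["is_outage"]:
        return "page on-call"        # Q1 exit: outages never wait
    if ticket["is_paying_customer"]:
        return "priority queue"      # Q2 exit
    if ticket["age_hours"] > 24:
        return "escalate"            # Q3 exit
    return "standard queue"          # default leaf
```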
When Checklists Fail
Diagnostic questions:
- Too long: Are people skipping because overwhelming? → Cut to killer items only.
- Wrong format: Are experts resisting READ-DO? → Switch to DO-CONFIRM.
- Missing critical step: Did error happen that checklist didn't catch? → Add item.
- False sense of security: Are people checking boxes without thinking? → Add verification.
Refinement strategies:
- Shorten: Remove non-critical items. Aim for 5-9 items.
- Reformat: Switch READ-DO ↔ DO-CONFIRM based on user feedback.
- Add forcing function: Critical items become automated checks (not manual).
- Add challenge-response: Two-person verification for high-stakes items.
6. Cognitive Biases and Mitigation
Availability Bias
Definition: Judge frequency/probability by ease of recall. Recent, vivid events seem more common.
How it misleads heuristics:
- Plane crash on news → Overestimate flight risk
- Recent fraud case → Overestimate fraud rate
- Salient failure → Avoid entire category
Mitigation:
- Use base rates (statistical frequencies) not anecdotes
- Ask: "What's the actual data?" not "What do I remember?"
- Track all cases, not just memorable ones
Example:
- Availability: "Customer from [Country X] didn't pay, avoid all [Country X] customers"
- Base rate check: "Only 2% of [Country X] customers defaulted vs. 1.5% overall" → Marginal difference, not categorical
Representativeness Bias
Definition: Judge probability by similarity to stereotype/prototype. "Looks like X, therefore is X."
How it misleads heuristics:
- "Looks like successful founder" (hoodie, Stanford, articulate) → Overestimate success
- "Looks like good engineer" (quiet, focused) → Miss great communicators
Mitigation:
- Use objective criteria (track record, test scores) not stereotypes
- Check base rate: How often does stereotype actually predict outcome?
- Blind evaluation: Remove identifying information
Example:
- Representativeness: "Candidate reminds me of [successful person], hire"
- Base rate: "Only 5% of hires succeed regardless of who they remind me of"
Anchoring Bias
Definition: Over-rely on first piece of information. Initial number shapes estimate.
How it misleads heuristics:
- First salary offer anchors negotiation
- Initial project estimate anchors timeline
- First price seen anchors value perception
Mitigation:
- Set your own anchor first (make first offer)
- Deliberately adjust away from anchor (mental correction)
- Use external reference (market data, not internal anchor)
Example:
- Anchoring: Candidate asks $150k, you offer $155k (anchored to their ask)
- Better: You offer $130k first (your anchor), negotiate from there
Confirmation Bias
Definition: Seek, interpret, recall information confirming existing belief. Ignore disconfirming evidence.
How it misleads heuristics:
- Heuristic works once → Notice only confirming cases
- Initial hypothesis → Search for supporting evidence only
Mitigation:
- Actively seek disconfirming evidence ("Why might this heuristic fail?")
- Track all cases (not just successes)
- Pre-commit to decision rule, then test objectively
Example:
- Confirmation: "Recognition heuristic worked on 3 cases!" (ignore 5 failures)
- Mitigation: Track all 50 cases → 25/50 = 50% accuracy (coin flip, abandon heuristic)
Sunk Cost Fallacy
Definition: Continue based on past investment, not future value. "Already spent X, can't stop now."
How it misleads heuristics:
- Heuristic worked in past → Keep using despite declining accuracy
- Spent time designing checklist → Force it to work despite low adoption
Mitigation:
- Evaluate based on future value only ("Will this heuristic work going forward?")
- Pre-commit to abandonment criteria ("If accuracy <70%, switch methods")
- Ignore past effort when deciding
Example:
- Sunk cost: "Spent 10 hours designing this heuristic, must use it"
- Rational: "Heuristic only 60% accurate, abandon and try different approach"
Advanced Topics
Swiss Cheese Model (Error Prevention)
James Reason's model: Multiple defensive layers, each with holes. Error occurs when holes align.
Layers:
- Organizational (policies, culture)
- Supervision (oversight, review)
- Preconditions (fatigue, time pressure)
- Actions (individual performance)
Checklist as defensive layer: Catches errors that slip through other layers.
Example (Deployment failure):
- Organizational: No deployment policy
- Supervision: No code review
- Preconditions: Friday night deployment (fatigue)
- Actions: Developer forgets migration
- Checklist: "☐ Database migration tested" catches error
Adaptive Heuristics
Concept: Heuristic parameters adjust based on outcomes.
Example (Satisficing with adaptive threshold):
- Start: Threshold = 80% of criteria
- After 10 searches, no options found → Lower to 70%
- After 5 options found → Raise to 85%
Implementation:
```python
def adaptive_satisficing(scored_options, threshold, k=10, m=5):
    """Sketch of the adaptive rule above: scan a stream of (option, score) pairs,
    keep the first option that clears the aspiration threshold, and adapt the
    threshold for the next decision episode."""
    search_count = 0
    options_found = 0
    chosen = None
    for option, score in scored_options:
        search_count += 1
        if score >= threshold:
            options_found += 1
            if chosen is None:
                chosen = option      # satisfice: first qualifying option wins
        if search_count > k and options_found == 0:
            threshold *= 0.9         # no qualifiers after k searches: lower the bar
        if options_found > m:
            threshold *= 1.1         # too many qualify: raise the bar
    return chosen, threshold         # reuse the adapted threshold next time
```
Context-Dependent Heuristics
Concept: Different rules for different contexts. Meta-heuristic chooses which heuristic to use.
Example (Decision approach):
- Check context:
- Is decision reversible? Yes → Use fast heuristic (satisficing)
- Is decision irreversible? Yes → Use slow analysis (full evaluation)
- Choose heuristic based on stakes:
- <$1k: Recognition heuristic
- $1k-$10k: Satisficing
- >$10k: Full analysis
Implementation:
```python
def decide(stakes):
    """Meta-heuristic: stakes pick the decision rule (thresholds from the list above).
    recognition_heuristic, satisficing, and full_analysis are assumed defined."""
    if stakes < 1_000:
        return recognition_heuristic()
    elif stakes < 10_000:
        return satisficing(threshold=0.8)
    else:
        return full_analysis()
```
Key Takeaways
- Heuristics work in stable environments: Recognition and take-the-best excel when patterns repeat; they fail in novel, adversarial contexts.
- Satisficing beats optimization under uncertainty: "Good enough" is faster and often as good as "perfect" when the environment is unpredictable.
- Checklists catch 60-80% of errors: Proven in aviation, surgery, and construction. Focus on killer items only (5-9 max).
- READ-DO for novices, DO-CONFIRM for experts: Match format to user. Forcing experts into READ-DO creates resistance.
- Test heuristics empirically: Don't assume rules work. Validate on ≥30 historical cases; target ≥80% accuracy.
- Forcing functions > checklists for critical steps: If a step must be done, automate enforcement rather than relying on a manual check.
- Update heuristics when the environment changes: Rules optimized for the past may fail when markets shift, technology disrupts, or competition intensifies. Re-validate quarterly.