Heuristics and Checklists Methodology

Advanced techniques for decision heuristics, checklist design, and cognitive bias mitigation.

Table of Contents

  1. When to Use Heuristics vs. Checklists
  2. Heuristics Research and Theory
  3. Checklist Design Principles
  4. Validating Heuristics and Checklists
  5. Refinement and Iteration
  6. Cognitive Biases and Mitigation

1. When to Use Heuristics vs. Checklists

Heuristics (Decision Shortcuts)

Use when:

  • Time pressure (need to decide in <1 hour)
  • Routine decisions (happens frequently, precedent exists)
  • Good enough > perfect (satisficing appropriate)
  • Environment stable (patterns repeat reliably)
  • Cost of wrong decision low (<$10k impact)

Don't use when:

  • Novel situations (no precedent, first time)
  • High stakes (>$100k impact, irreversible)
  • Adversarial environments (deception, misleading information)
  • Multiple factors equally important (interactions matter)
  • Legal/compliance requirements (need documentation)

Checklists (Procedural Memory Aids)

Use when:

  • Complex procedures (>5 steps)
  • Error-prone (history of mistakes)
  • High consequences if step missed (safety, money, reputation)
  • Infrequent procedures (easy to forget details)
  • Multiple people involved (coordination needed)

Don't use when:

  • Simple tasks (<3 steps)
  • Fully automated (no human steps)
  • Creative/exploratory work (checklist constrains unnecessarily)
  • Expert with muscle memory (checklist adds overhead without value)

2. Heuristics Research and Theory

Fast and Frugal Heuristics (Gigerenzer)

Key insight: Simple heuristics can outperform complex models in uncertain environments with limited information.

Classic heuristics:

  1. Recognition Heuristic: If you recognize one object but not the other, infer that the recognized object has the higher value on the criterion you're judging.

    • Example: Which city is larger, Detroit or Tallahassee? (Recognize Detroit → Larger)
    • Works when: Recognition correlates with criterion (r > 0.5)
    • Fails when: Misleading advertising, niche quality
  2. Take-the-Best: Rank cues by validity, use the highest-validity cue that discriminates, and ignore the rest (see the sketch after this list).

    • Example: Hiring based on coding test score alone (if validity >70%)
    • Works when: One cue dominates, environment stable
    • Fails when: Multiple factors interact, no dominant cue
  3. 1/N Heuristic: Divide resources equally among N options.

    • Example: Investment portfolio - equal weight across stocks
    • Works when: No information on which option better, diversification reduces risk
    • Fails when: Clear quality differences exist
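
A minimal take-the-best sketch, assuming cue validities are already known (the cue names, validities, and candidate records below are illustrative):

def take_the_best(option_a, option_b, cue_validities):
    """Decide using the highest-validity cue that discriminates; ignore the rest."""
    for cue in sorted(cue_validities, key=cue_validities.get, reverse=True):
        if option_a[cue] != option_b[cue]:          # cue discriminates -> decide immediately
            return option_a if option_a[cue] > option_b[cue] else option_b
    return None                                     # no cue discriminates -> defer or guess

# Hypothetical hiring comparison: the coding test dominates, so it usually decides alone.
validities = {"coding_test": 0.75, "referral": 0.60, "degree": 0.55}
alice = {"coding_test": 9, "referral": 1, "degree": 0}
bob = {"coding_test": 7, "referral": 1, "degree": 1}
print(take_the_best(alice, bob, validities))        # -> Alice's record (coding_test discriminates)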

Satisficing (Herbert Simon)

Concept: Search until option meets aspiration level (threshold), then stop. Don't optimize.

Formula:

Aspiration level = f(past outcomes, time pressure, search costs)

Adaptive satisficing: Lower threshold if no options meet it after K searches. Raise threshold if too many options qualify.

Example:

  • Job search: "Salary ≥$120k, culture fit ≥7/10"
  • After 20 applications, no offers → Lower to $110k
  • After 5 offers all meeting bar → Raise to $130k

Ecological Rationality

Key insight: Heuristic's success depends on environment, not complexity.

Environment characteristics:

  • Redundancy: Multiple cues correlated (take-the-best works)
  • Predictability: Patterns repeat (recognition heuristic works)
  • Volatility: Rapid change (simple heuristics adapt faster than complex models)
  • Feedback speed: Fast feedback enables learning (trial-and-error refinement)

Mismatch example: Using recognition heuristic in adversarial environment (advertising creates false recognition) → Fails


3. Checklist Design Principles

Atul Gawande's Checklist Principles

Based on research in aviation, surgery, construction:

  1. Keep it short: 5-9 items max. Longer checklists get skipped.
  2. Focus on killer items: Steps that are often missed AND have serious consequences.
  3. Verb-first language: "Verify backups complete" not "Backups"
  4. Pause points: Define WHEN to use checklist (before start, after critical phase, before finish)
  5. Fit on one page: No scrolling or page-flipping

READ-DO vs. DO-CONFIRM

READ-DO (Challenge-Response):

  • Read item aloud → Perform action → Confirm → Next item
  • Use for: Novices, unfamiliar procedures, irreversible actions
  • Example: Surgical safety checklist (WHO)

DO-CONFIRM:

  • Perform entire procedure from memory → Then review checklist to confirm all done
  • Use for: Experts, routine procedures, flow state important
  • Example: Aviation pre-flight (experienced pilots)

Which to choose?:

  • Expertise level: Novice → READ-DO, Expert → DO-CONFIRM
  • Familiarity: First time → READ-DO, Routine → DO-CONFIRM
  • Consequences: Irreversible (surgery) → READ-DO, Reversible → DO-CONFIRM
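
A minimal sketch of the two formats as interaction patterns (the checklist items and prompts are illustrative):

CHECKLIST = ["Verify backups complete", "Confirm rollback plan exists", "Notify on-call channel"]

def read_do(items):
    """READ-DO: read each item aloud, perform it, confirm before moving to the next."""
    for item in items:
        input(f"READ: {item} -- perform now, then press Enter to confirm")

def do_confirm(items):
    """DO-CONFIRM: work from memory first, then review the list and return anything missed."""
    return [item for item in items if input(f"Done? {item} [y/n] ").strip().lower() != "y"]

# read_do(CHECKLIST)              # novice, unfamiliar, or irreversible procedure
# missed = do_confirm(CHECKLIST)  # expert, routine procedure; resolve `missed` before sign-off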

Forcing Functions and Fail-Safes

Forcing function: Design that prevents proceeding without completing step.

Examples:

  • Car won't start unless seatbelt fastened
  • Deployment script fails if tests not passing
  • Door won't lock if key inside

Contrast with a checklist item: a checklist reminds but can be ignored; a forcing function makes skipping the step impossible.
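
A minimal sketch of the deployment example as a forcing function (the test command and deploy step are placeholders):

import subprocess
import sys

def deploy():
    if subprocess.run(["pytest", "-q"]).returncode != 0:    # assumed test command
        sys.exit("Tests failing: deployment blocked")       # cannot be skipped, unlike a checklist item
    print("Tests passed, deploying...")                     # placeholder for the real deploy step

if __name__ == "__main__":
    deploy()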

When to use forcing function instead of checklist:

  • Critical safety step (must be done)
  • Automatable check (can be enforced by system)
  • High compliance needed (>99%)

When checklist sufficient:

  • Judgment required (can't automate)
  • Multiple valid paths (flexibility needed)
  • Compliance good with reminder (>90%)

4. Validating Heuristics and Checklists

Testing Heuristics

Retrospective validation: Test heuristic on historical cases.

Method:

  1. Collect past decisions (N ≥ 30 cases)
  2. Apply heuristic to each case (blind to actual outcome)
  3. Compare heuristic decision to actual outcome
  4. Calculate accuracy: % cases where heuristic would've chosen correctly

Target accuracy: ≥80% for good heuristic. <70% → Refine or abandon.

Example (Hiring heuristic):

  • Heuristic: "Hire candidates from top 10 tech companies"
  • Test on 60 past hires (50 from top-10 companies, 10 from others)
  • Outcome: 40/50 (80%) of top-company hires succeeded vs. 5/10 (50%) of the others
  • Conclusion: Heuristic valid (80% success vs. 50% for other hires)
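
A minimal sketch of the retrospective calculation, assuming each past case is recorded with the heuristic's input and the actual outcome (field names and records are illustrative):

def heuristic(case):
    return case["from_top_company"]                  # predict success iff hire came from a top-10 company

def accuracy(cases, rule):
    """Share of historical cases where the rule's prediction matched the actual outcome."""
    return sum(rule(c) == c["succeeded"] for c in cases) / len(cases)

# A real validation needs >= 30 records; two are shown only to illustrate the shape.
history = [
    {"from_top_company": True, "succeeded": True},
    {"from_top_company": False, "succeeded": False},
]
print(f"Accuracy: {accuracy(history, heuristic):.0%}")   # target >= 80%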

A/B Testing Heuristics

Prospective validation: Run controlled experiment.

Method:

  1. Group A: Use heuristic
  2. Group B: Use existing method (or random)
  3. Compare outcomes (quality, speed, consistency)

Example:

  • A: Customer routing by fast & frugal tree
  • B: Customer routing by availability
  • Metrics: Response time, resolution rate, customer satisfaction
  • Result: A faster (20% ↓ response time), higher satisfaction (8.2 vs. 7.5) → Adopt heuristic
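
A minimal sketch of the comparison step, assuming per-ticket metrics were logged for each group (the numbers are placeholders):

from statistics import mean

group_a = {"response_min": [12, 9, 11, 10], "csat": [8.4, 8.1, 8.0, 8.3]}   # routed by fast & frugal tree
group_b = {"response_min": [14, 13, 12, 15], "csat": [7.6, 7.4, 7.5, 7.5]}  # routed by availability

for metric in group_a:
    print(f"{metric}: A={mean(group_a[metric]):.1f} vs. B={mean(group_b[metric]):.1f}")
# Adopt the heuristic only if A beats B on the metrics agreed before the experiment started.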

Checklist Validation

Error rate measurement:

Before checklist:

  • Track error rate for procedure (e.g., deployments with failures)
  • Baseline: X% error rate

After checklist:

  • Introduce checklist
  • Track error rate for same procedure
  • New rate: Y% error rate
  • Improvement: (X - Y) / X × 100%

Target: ≥50% error reduction. If <25%, checklist not effective.

Example (Surgical checklist):

  • Before: 11% complication rate
  • After: 7% complication rate
  • Improvement: (11 - 7) / 11 = 36% reduction (good, continue using)
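
The improvement formula applied to these numbers (a trivial sketch):

def error_reduction(before_pct, after_pct):
    return (before_pct - after_pct) / before_pct * 100

print(f"{error_reduction(11, 7):.0f}% reduction")   # -> 36%: above the 25% floor, below the 50% target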

5. Refinement and Iteration

When Heuristics Fail

Diagnostic questions:

  1. Wrong cue: Are we using the best predictor? Try different criterion.
  2. Threshold too high/low: Should we raise/lower aspiration level?
  3. Environment changed: Did market shift, competition intensify, technology disrupt?
  4. Exceptions accumulating: Are special cases becoming the norm? Need more complex rule.

Refinement strategies:

  • Add exception: "Use heuristic EXCEPT when [condition]"
  • Adjust threshold: Satisficing level up/down based on outcomes
  • Switch cue: Use different criterion if current one losing validity
  • Add layer: Convert simple rule to fast & frugal tree (2-3 questions max)
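
A minimal sketch of the "add layer" strategy as a fast & frugal tree (the questions, thresholds, and routing targets are illustrative):

def route_ticket(ticket):
    """Each node asks one question; one branch exits immediately, the other continues."""
    if ticket["is_outage"]:               # Q1: critical branch exits at once
        return "page on-call engineer"
    if ticket["paying_customer"]:         # Q2
        return "priority queue"
    if ticket["age_hours"] > 24:          # Q3: final question, both branches exit
        return "escalate to support lead"
    return "standard queue"

print(route_ticket({"is_outage": False, "paying_customer": True, "age_hours": 2}))   # -> priority queue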

When Checklists Fail

Diagnostic questions:

  1. Too long: Are people skipping because overwhelming? → Cut to killer items only.
  2. Wrong format: Are experts resisting READ-DO? → Switch to DO-CONFIRM.
  3. Missing critical step: Did error happen that checklist didn't catch? → Add item.
  4. False sense of security: Are people checking boxes without thinking? → Add verification.

Refinement strategies:

  • Shorten: Remove non-critical items. Aim for 5-9 items.
  • Reformat: Switch READ-DO ↔ DO-CONFIRM based on user feedback.
  • Add forcing function: Critical items become automated checks (not manual).
  • Add challenge-response: Two-person verification for high-stakes items.

6. Cognitive Biases and Mitigation

Availability Bias

Definition: Judge frequency/probability by ease of recall. Recent, vivid events seem more common.

How it misleads heuristics:

  • Plane crash on news → Overestimate flight risk
  • Recent fraud case → Overestimate fraud rate
  • Salient failure → Avoid entire category

Mitigation:

  • Use base rates (statistical frequencies) not anecdotes
  • Ask: "What's the actual data?" not "What do I remember?"
  • Track all cases, not just memorable ones

Example:

  • Availability: "Customer from [Country X] didn't pay, avoid all [Country X] customers"
  • Base rate check: "Only 2% of [Country X] customers defaulted vs. 1.5% overall" → Marginal difference, not categorical

Representativeness Bias

Definition: Judge probability by similarity to stereotype/prototype. "Looks like X, therefore is X."

How it misleads heuristics:

  • "Looks like successful founder" (hoodie, Stanford, articulate) → Overestimate success
  • "Looks like good engineer" (quiet, focused) → Miss great communicators

Mitigation:

  • Use objective criteria (track record, test scores) not stereotypes
  • Check base rate: How often does stereotype actually predict outcome?
  • Blind evaluation: Remove identifying information

Example:

  • Representativeness: "Candidate reminds me of [successful person], hire"
  • Base rate: "Only 5% of hires succeed regardless of who they remind me of"

Anchoring Bias

Definition: Over-rely on first piece of information. Initial number shapes estimate.

How it misleads heuristics:

  • First salary offer anchors negotiation
  • Initial project estimate anchors timeline
  • First price seen anchors value perception

Mitigation:

  • Set your own anchor first (make first offer)
  • Deliberately adjust away from anchor (mental correction)
  • Use external reference (market data, not internal anchor)

Example:

  • Anchoring: Candidate asks $150k, you offer $155k (anchored to their ask)
  • Better: You offer $130k first (your anchor), negotiate from there

Confirmation Bias

Definition: Seek, interpret, recall information confirming existing belief. Ignore disconfirming evidence.

How it misleads heuristics:

  • Heuristic works once → Notice only confirming cases
  • Initial hypothesis → Search for supporting evidence only

Mitigation:

  • Actively seek disconfirming evidence ("Why might this heuristic fail?")
  • Track all cases (not just successes)
  • Pre-commit to decision rule, then test objectively

Example:

  • Confirmation: "Recognition heuristic worked on 3 cases!" (ignore 5 failures)
  • Mitigation: Track all 50 cases → 25/50 = 50% accuracy (coin flip, abandon heuristic)

Sunk Cost Fallacy

Definition: Continue based on past investment, not future value. "Already spent X, can't stop now."

How it misleads heuristics:

  • Heuristic worked in past → Keep using despite declining accuracy
  • Spent time designing checklist → Force it to work despite low adoption

Mitigation:

  • Evaluate based on future value only ("Will this heuristic work going forward?")
  • Pre-commit to abandonment criteria ("If accuracy <70%, switch methods")
  • Ignore past effort when deciding

Example:

  • Sunk cost: "Spent 10 hours designing this heuristic, must use it"
  • Rational: "Heuristic only 60% accurate, abandon and try different approach"

Advanced Topics

Swiss Cheese Model (Error Prevention)

James Reason's model: Multiple defensive layers, each with holes. Error occurs when holes align.

Layers:

  1. Organizational (policies, culture)
  2. Supervision (oversight, review)
  3. Preconditions (fatigue, time pressure)
  4. Actions (individual performance)

Checklist as defensive layer: Catches errors that slip through other layers.

Example (Deployment failure):

  • Organizational: No deployment policy
  • Supervision: No code review
  • Preconditions: Friday night deployment (fatigue)
  • Actions: Developer forgets migration
  • Checklist: "☐ Database migration tested" catches error

Adaptive Heuristics

Concept: Heuristic parameters adjust based on outcomes.

Example (Satisficing with adaptive threshold):

  • Start: Threshold = 80% of criteria
  • After 10 searches, no options found → Lower to 70%
  • After 5 options found → Raise to 85%

Implementation:

# Aspiration level persists across searches and adapts to what the search has found.
class AdaptiveSatisficer:
    def __init__(self, initial_threshold, K=10, M=5):
        self.threshold = initial_threshold
        self.K = K                            # searches allowed before lowering the bar
        self.M = M                            # acceptable options found before raising the bar
        self.search_count = 0
        self.options_found = 0

    def acceptable(self, score):
        """Evaluate one option's score; return True if it clears the current threshold."""
        self.search_count += 1
        meets_bar = score >= self.threshold
        if meets_bar:
            self.options_found += 1

        if self.search_count > self.K and self.options_found == 0:
            self.threshold *= 0.9             # Lower threshold
        if self.options_found > self.M:
            self.threshold *= 1.1             # Raise threshold
        return meets_bar

# Take the first option that clears the (adapting) bar:
#   satisficer = AdaptiveSatisficer(initial_threshold=0.8)
#   if satisficer.acceptable(score(option)): decide(option)

Context-Dependent Heuristics

Concept: Different rules for different contexts. Meta-heuristic chooses which heuristic to use.

Example (Decision approach):

  1. Check context:

    • Is decision reversible? Yes → Use fast heuristic (satisficing)
    • Is decision irreversible? Yes → Use slow analysis (full evaluation)
  2. Choose heuristic based on stakes:

    • <$1k: Recognition heuristic
    • $1k-$10k: Satisficing
    • >$10k: Full analysis

Implementation:

# recognition_heuristic(), satisficing(), and full_analysis() are the procedures described above.
def choose_method(stakes):
    if stakes < 1_000:
        return recognition_heuristic()
    elif stakes < 10_000:
        return satisficing(threshold=0.8)
    else:
        return full_analysis()

Key Takeaways

  1. Heuristics work in stable environments: Recognition, take-the-best excel when patterns repeat. Fail in novel, adversarial contexts.

  2. Satisficing beats optimization under uncertainty: "Good enough" faster and often as good as perfect when environment unpredictable.

  3. Checklists catch 60-80% of errors: Proven in aviation, surgery, construction. Focus on killer items only (5-9 max).

  4. READ-DO for novices, DO-CONFIRM for experts: Match format to user. Forcing experts into READ-DO creates resistance.

  5. Test heuristics empirically: Don't assume rules work. Validate on ≥30 historical cases, target ≥80% accuracy.

  6. Forcing functions > checklists for critical steps: If step must be done, automate enforcement rather than relying on manual check.

  7. Update heuristics when environment changes: Rules optimized for past may fail when market shifts, tech disrupts, competition intensifies. Re-validate quarterly.