# Heuristics and Checklists Methodology

Advanced techniques for decision heuristics, checklist design, and cognitive bias mitigation.

## Table of Contents

1. [When to Use Heuristics vs. Checklists](#1-when-to-use-heuristics-vs-checklists)
2. [Heuristics Research and Theory](#2-heuristics-research-and-theory)
3. [Checklist Design Principles](#3-checklist-design-principles)
4. [Validating Heuristics and Checklists](#4-validating-heuristics-and-checklists)
5. [Refinement and Iteration](#5-refinement-and-iteration)
6. [Cognitive Biases and Mitigation](#6-cognitive-biases-and-mitigation)

---

## 1. When to Use Heuristics vs. Checklists

### Heuristics (Decision Shortcuts)

**Use when**:
- Time pressure (need to decide in <1 hour)
- Routine decisions (happen frequently, precedent exists)
- Good enough > perfect (satisficing appropriate)
- Environment stable (patterns repeat reliably)
- Cost of wrong decision low (<$10k impact)

**Don't use when**:
- Novel situations (no precedent, first time)
- High stakes (>$100k impact, irreversible)
- Adversarial environments (deception, misleading information)
- Multiple factors equally important (interactions matter)
- Legal/compliance requirements (need documentation)

### Checklists (Procedural Memory Aids)

**Use when**:
- Complex procedures (>5 steps)
- Error-prone (history of mistakes)
- High consequences if a step is missed (safety, money, reputation)
- Infrequent procedures (easy to forget details)
- Multiple people involved (coordination needed)

**Don't use when**:
- Simple tasks (<3 steps)
- Fully automated (no human steps)
- Creative/exploratory work (checklist constrains unnecessarily)
- Expert with muscle memory (checklist adds overhead without value)

---

## 2. Heuristics Research and Theory

### Fast and Frugal Heuristics (Gigerenzer)

**Key insight**: Simple heuristics can outperform complex models in uncertain environments with limited information.

**Classic heuristics**:

1. **Recognition Heuristic**: If you recognize one object but not the other, infer that the recognized object has the higher value.
   - **Example**: Which city is larger, Detroit or Tallahassee? (Recognize Detroit → larger)
   - **Works when**: Recognition correlates with the criterion (r > 0.5)
   - **Fails when**: Misleading advertising, niche quality

2. **Take-the-Best**: Rank cues by validity, use the highest-validity cue that discriminates. Ignore the others.
   - **Example**: Hiring based on coding test score alone (if validity >70%)
   - **Works when**: One cue dominates, environment stable
   - **Fails when**: Multiple factors interact, no dominant cue

3. **1/N Heuristic**: Divide resources equally among N options.
   - **Example**: Investment portfolio: equal weight across stocks
   - **Works when**: No information on which option is better, diversification reduces risk
   - **Fails when**: Clear quality differences exist

### Satisficing (Herbert Simon)

**Concept**: Search until an option meets the aspiration level (threshold), then stop. Don't optimize.

**Formula**:

```
Aspiration level = f(past outcomes, time pressure, search costs)
```

**Adaptive satisficing**: Lower the threshold if no options meet it after K searches. Raise the threshold if too many options qualify.

**Example**:
- Job search: "Salary ≥$120k, culture fit ≥7/10"
- After 20 applications, no offers → Lower to $110k
- After 5 offers all meeting the bar → Raise to $130k
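As a minimal sketch of this stopping rule in Python: the loop below accepts the first offer that clears the job-search thresholds above. The offers themselves are hypothetical, invented for illustration.

```
# Satisficing sketch: take the first option that clears the aspiration level,
# instead of ranking every option. Thresholds mirror the job-search example;
# the offers are hypothetical.

ASPIRATION = {"salary": 120_000, "culture_fit": 7}

def satisfice(offers):
    """Return the first offer meeting every threshold, or None."""
    for offer in offers:                                     # search in arrival order
        if all(offer[k] >= v for k, v in ASPIRATION.items()):
            return offer                                     # good enough → stop searching
    return None                                              # nothing met the bar

offers = [
    {"company": "A", "salary": 115_000, "culture_fit": 8},  # fails salary threshold
    {"company": "B", "salary": 125_000, "culture_fit": 7},  # meets both → accepted
    {"company": "C", "salary": 140_000, "culture_fit": 9},  # never evaluated
]
print(satisfice(offers))  # -> offer B; the search stops before ever seeing C
```

The point of the sketch is the early stop: offer C is never examined, which is the time satisficing saves. The adaptive-threshold variant appears under Advanced Topics below.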
### Ecological Rationality

**Key insight**: A heuristic's success depends on the environment, not its complexity.

**Environment characteristics**:
- **Redundancy**: Multiple cues are correlated (take-the-best works)
- **Predictability**: Patterns repeat (recognition heuristic works)
- **Volatility**: Rapid change (simple heuristics adapt faster than complex models)
- **Feedback speed**: Fast feedback enables learning (trial-and-error refinement)

**Mismatch example**: Using the recognition heuristic in an adversarial environment (advertising creates false recognition) → Fails

---

## 3. Checklist Design Principles

### Atul Gawande's Checklist Principles

Based on research in aviation, surgery, and construction:

1. **Keep it short**: 5-9 items max. Longer checklists get skipped.
2. **Focus on killer items**: Steps that are often missed AND have serious consequences.
3. **Verb-first language**: "Verify backups complete," not "Backups."
4. **Pause points**: Define WHEN to use the checklist (before start, after critical phase, before finish).
5. **Fit on one page**: No scrolling or page-flipping.

### READ-DO vs. DO-CONFIRM

**READ-DO** (Challenge-Response):
- Read item aloud → Perform action → Confirm → Next item
- **Use for**: Novices, unfamiliar procedures, irreversible actions
- **Example**: Surgical safety checklist (WHO)

**DO-CONFIRM**:
- Perform the entire procedure from memory → Then review the checklist to confirm everything was done
- **Use for**: Experts, routine procedures, flow state important
- **Example**: Aviation pre-flight (experienced pilots)

**Which to choose?**
- Expertise level: Novice → READ-DO, Expert → DO-CONFIRM
- Familiarity: First time → READ-DO, Routine → DO-CONFIRM
- Consequences: Irreversible (surgery) → READ-DO, Reversible → DO-CONFIRM

### Forcing Functions and Fail-Safes

**Forcing function**: A design that prevents proceeding without completing the step.

**Examples**:
- Car won't start unless the seatbelt is fastened
- Deployment script fails if tests aren't passing
- Door won't lock if the key is inside

**vs. Checklist item**: A checklist reminds but can be ignored. A forcing function prevents.

**When to use a forcing function instead of a checklist**:
- Critical safety step (must be done)
- Automatable check (can be enforced by the system)
- High compliance needed (>99%)

**When a checklist is sufficient**:
- Judgment required (can't automate)
- Multiple valid paths (flexibility needed)
- Compliance good with a reminder (>90%)

---

## 4. Validating Heuristics and Checklists

### Testing Heuristics

**Retrospective validation**: Test the heuristic on historical cases.

**Method**:
1. Collect past decisions (N ≥ 30 cases)
2. Apply the heuristic to each case (blind to the actual outcome)
3. Compare the heuristic's decision to the actual outcome
4. Calculate accuracy: % of cases where the heuristic would have chosen correctly

**Target accuracy**: ≥80% for a good heuristic. <70% → Refine or abandon.

**Example** (Hiring heuristic):
- Heuristic: "Hire candidates from top 10 tech companies"
- Test on past hires: 50 from top-10 companies, 10 from others
- Outcome: 40/50 (80%) of top-company hires succeeded vs. 5/10 (50%) of the others
- **Conclusion**: Heuristic valid (80% vs. 50% success rate)
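To make step 4 concrete, here is a minimal validation sketch in Python. The company names, the `heuristic_says_hire` rule, and the case records are hypothetical placeholders, not data from this document.

```
# Retrospective validation sketch: replay the heuristic over historical cases
# and measure how often it agrees with the known good decision.
# Company names and case records are hypothetical placeholders.

TOP_COMPANIES = {"AcmeCorp", "Globex", "Initech"}

def heuristic_says_hire(case):
    """'Hire candidates from top tech companies' expressed as a yes/no rule."""
    return case["last_company"] in TOP_COMPANIES

def accuracy(cases):
    """Fraction of cases where the heuristic matches the actual good decision."""
    hits = sum(heuristic_says_hire(c) == c["should_have_hired"] for c in cases)
    return hits / len(cases)

cases = [
    {"last_company": "Globex",   "should_have_hired": True},
    {"last_company": "TinyShop", "should_have_hired": False},
    {"last_company": "Initech",  "should_have_hired": False},  # heuristic misses this one
    {"last_company": "AcmeCorp", "should_have_hired": True},
]
print(f"accuracy = {accuracy(cases):.0%}")  # compare against the >=80% target
```

With real data this runs over the full set of N ≥ 30 past decisions, and the printed accuracy is compared against the ≥80% target above.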
### A/B Testing Heuristics

**Prospective validation**: Run a controlled experiment.

**Method**:
1. Group A: Use the heuristic
2. Group B: Use the existing method (or random assignment)
3. Compare outcomes (quality, speed, consistency)

**Example**:
- A: Customer routing by a fast-and-frugal tree
- B: Customer routing by availability
- **Metrics**: Response time, resolution rate, customer satisfaction
- **Result**: A faster (20% lower response time) and higher satisfaction (8.2 vs. 7.5) → Adopt the heuristic

### Checklist Validation

**Error rate measurement**:

**Before checklist**:
- Track the error rate for the procedure (e.g., deployments with failures)
- Baseline: X% error rate

**After checklist**:
- Introduce the checklist
- Track the error rate for the same procedure
- New rate: Y% error rate
- **Improvement**: (X - Y) / X × 100%

**Target**: ≥50% error reduction. If <25%, the checklist is not effective.

**Example** (Surgical checklist):
- Before: 11% complication rate
- After: 7% complication rate
- **Improvement**: (11 - 7) / 11 = 36% reduction (good, continue using)

---

## 5. Refinement and Iteration

### When Heuristics Fail

**Diagnostic questions**:
1. **Wrong cue**: Are we using the best predictor? Try a different criterion.
2. **Threshold too high/low**: Should we raise or lower the aspiration level?
3. **Environment changed**: Did the market shift, competition intensify, technology disrupt?
4. **Exceptions accumulating**: Are special cases becoming the norm? May need a more complex rule.

**Refinement strategies**:
- **Add exception**: "Use heuristic EXCEPT when [condition]"
- **Adjust threshold**: Move the satisficing level up or down based on outcomes
- **Switch cue**: Use a different criterion if the current one is losing validity
- **Add layer**: Convert a simple rule into a fast-and-frugal tree (2-3 questions max)

### When Checklists Fail

**Diagnostic questions**:
1. **Too long**: Are people skipping it because it's overwhelming? → Cut to killer items only.
2. **Wrong format**: Are experts resisting READ-DO? → Switch to DO-CONFIRM.
3. **Missing critical step**: Did an error happen that the checklist didn't catch? → Add an item.
4. **False sense of security**: Are people checking boxes without thinking? → Add verification.

**Refinement strategies**:
- **Shorten**: Remove non-critical items. Aim for 5-9 items.
- **Reformat**: Switch READ-DO ↔ DO-CONFIRM based on user feedback.
- **Add forcing function**: Critical items become automated checks (not manual).
- **Add challenge-response**: Two-person verification for high-stakes items.

---

## 6. Cognitive Biases and Mitigation

### Availability Bias

**Definition**: Judging frequency or probability by ease of recall. Recent, vivid events seem more common.

**How it misleads heuristics**:
- Plane crash on the news → Overestimate flight risk
- Recent fraud case → Overestimate fraud rate
- Salient failure → Avoid the entire category

**Mitigation**:
- Use base rates (statistical frequencies), not anecdotes
- Ask "What's the actual data?" not "What do I remember?"
- Track all cases, not just memorable ones

**Example**:
- Availability: "A customer from [Country X] didn't pay, avoid all [Country X] customers"
- Base rate check: "Only 2% of [Country X] customers defaulted vs. 1.5% overall" → Marginal difference, not categorical
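A base-rate check like this is easy to script. Below is a minimal sketch; the customer records and rates are invented for illustration, loosely echoing the example above.

```
# Base-rate check sketch: compare a segment's default rate to the overall
# base rate before reacting to one memorable incident.
# The records below are hypothetical; real data would come from your billing system.

def default_rates(records, segment):
    """Return (segment_rate, overall_rate) of payment defaults."""
    seg = [r for r in records if r["country"] == segment]
    seg_rate = sum(r["defaulted"] for r in seg) / len(seg)
    overall_rate = sum(r["defaulted"] for r in records) / len(records)
    return seg_rate, overall_rate

records = (
    [{"country": "X", "defaulted": d} for d in [True] + [False] * 49]    # 1 default in 50
    + [{"country": "Y", "defaulted": d} for d in [True] + [False] * 99]  # 1 default in 100
)
seg_rate, overall_rate = default_rates(records, "X")
print(f"Country X: {seg_rate:.1%} vs. overall: {overall_rate:.1%}")
# Marginal difference → adjust terms if needed, don't exclude the whole category.
```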
### Representativeness Bias

**Definition**: Judging probability by similarity to a stereotype or prototype. "Looks like X, therefore is X."

**How it misleads heuristics**:
- "Looks like a successful founder" (hoodie, Stanford, articulate) → Overestimate success
- "Looks like a good engineer" (quiet, focused) → Miss great communicators

**Mitigation**:
- Use objective criteria (track record, test scores), not stereotypes
- Check the base rate: How often does the stereotype actually predict the outcome?
- Blind evaluation: Remove identifying information

**Example**:
- Representativeness: "Candidate reminds me of [successful person], hire"
- Base rate: "Only 5% of hires succeed regardless of who they remind me of"

### Anchoring Bias

**Definition**: Over-relying on the first piece of information. The initial number shapes the estimate.

**How it misleads heuristics**:
- First salary offer anchors the negotiation
- Initial project estimate anchors the timeline
- First price seen anchors value perception

**Mitigation**:
- Set your own anchor first (make the first offer)
- Deliberately adjust away from the anchor (mental correction)
- Use an external reference (market data, not the internal anchor)

**Example**:
- Anchoring: Candidate asks for $150k, you offer $145k (anchored to their ask)
- Better: You offer $130k first (your anchor), negotiate from there

### Confirmation Bias

**Definition**: Seeking, interpreting, and recalling information that confirms an existing belief. Ignoring disconfirming evidence.

**How it misleads heuristics**:
- Heuristic works once → Notice only confirming cases
- Initial hypothesis → Search for supporting evidence only

**Mitigation**:
- Actively seek disconfirming evidence ("Why might this heuristic fail?")
- Track all cases (not just successes)
- Pre-commit to a decision rule, then test objectively

**Example**:
- Confirmation: "The recognition heuristic worked on 3 cases!" (ignoring 5 failures)
- Mitigation: Track all 50 cases → 25/50 = 50% accuracy (coin flip, abandon the heuristic)

### Sunk Cost Fallacy

**Definition**: Continuing based on past investment, not future value. "Already spent X, can't stop now."

**How it misleads heuristics**:
- Heuristic worked in the past → Keep using it despite declining accuracy
- Spent time designing a checklist → Force it to work despite low adoption

**Mitigation**:
- Evaluate based on future value only ("Will this heuristic work going forward?")
- Pre-commit to abandonment criteria ("If accuracy <70%, switch methods")
- Ignore past effort when deciding

**Example**:
- Sunk cost: "Spent 10 hours designing this heuristic, must use it"
- Rational: "The heuristic is only 60% accurate, abandon it and try a different approach"

---

## Advanced Topics

### Swiss Cheese Model (Error Prevention)

**James Reason's model**: Multiple defensive layers, each with holes. An error occurs when the holes align.

**Layers**:
1. Organizational (policies, culture)
2. Supervision (oversight, review)
3. Preconditions (fatigue, time pressure)
4. Actions (individual performance)

**Checklist as defensive layer**: Catches errors that slip through other layers.

**Example** (Deployment failure):
- Organizational: No deployment policy
- Supervision: No code review
- Preconditions: Friday night deployment (fatigue)
- Actions: Developer forgets the migration
- **Checklist**: "☐ Database migration tested" catches the error

### Adaptive Heuristics

**Concept**: Heuristic parameters adjust based on outcomes.

**Example** (Satisficing with adaptive threshold):
- Start: Threshold = 80% of criteria
- After 10 searches, no options found → Lower to 70%
- After 5 options found → Raise to 85%

**Implementation**:

```
# Adaptive satisficing sketch. Assumes the search keeps screening options
# after one qualifies (e.g., reviewing many candidates), so both threshold
# adjustments can trigger. K, M, and the 0.9/1.1 factors are tunable.

def adaptive_search(options, threshold, K=10, M=5):
    accepted = []                            # options that met the bar
    search_count = 0
    for option, score in options:            # stream of (option, score) pairs
        search_count += 1
        if score >= threshold:
            accepted.append(option)          # decide: option is good enough
        if search_count > K and not accepted:
            threshold *= 0.9                 # lower threshold: search is stalling
        if len(accepted) > M:
            threshold *= 1.1                 # raise threshold: bar is too easy
    return accepted, threshold
```
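A hypothetical usage run (scores invented for illustration): the early options fall short of the 0.80 bar, the threshold drifts down, and later options start qualifying.

```
# Hypothetical scores: nothing clears the initial 0.80 bar, so after K=10
# fruitless searches the threshold starts drifting down.
scores = [0.70, 0.72, 0.68, 0.75, 0.71, 0.74, 0.69, 0.73, 0.70, 0.76,
          0.72, 0.74, 0.71, 0.75, 0.73, 0.74]
options = [(f"option-{i}", s) for i, s in enumerate(scores)]

accepted, final_threshold = adaptive_search(options, threshold=0.80)
print(f"accepted {len(accepted)} option(s), final threshold {final_threshold:.2f}")
# -> accepted 4 option(s), final threshold 0.72
```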
### Context-Dependent Heuristics

**Concept**: Different rules for different contexts. A meta-heuristic chooses which heuristic to use.

**Example** (Decision approach):

1. **Check context**:
   - Decision reversible? → Use a fast heuristic (satisficing)
   - Decision irreversible? → Use slow analysis (full evaluation)

2. **Choose heuristic based on stakes**:
   - <$1k: Recognition heuristic
   - $1k-$10k: Satisficing
   - >$10k: Full analysis

**Implementation**:

```
# Meta-heuristic: route the decision method by stakes.
# recognition_heuristic, satisficing, and full_analysis are placeholders
# for whatever procedures your team defines.
def choose_method(stakes):
    if stakes < 1_000:
        return recognition_heuristic()
    elif stakes < 10_000:
        return satisficing(threshold=0.8)
    return full_analysis()
```

## Key Takeaways

1. **Heuristics work in stable environments**: Recognition and take-the-best excel when patterns repeat. They fail in novel, adversarial contexts.
2. **Satisficing beats optimization under uncertainty**: "Good enough" is faster and often just as good as "perfect" when the environment is unpredictable.
3. **Checklists catch 60-80% of errors**: Proven in aviation, surgery, and construction. Focus on killer items only (5-9 max).
4. **READ-DO for novices, DO-CONFIRM for experts**: Match the format to the user. Forcing experts into READ-DO creates resistance.
5. **Test heuristics empirically**: Don't assume rules work. Validate on ≥30 historical cases; target ≥80% accuracy.
6. **Forcing functions > checklists for critical steps**: If a step must be done, automate the enforcement rather than relying on a manual check.
7. **Update heuristics when the environment changes**: Rules optimized for the past may fail when the market shifts, technology disrupts, or competition intensifies. Re-validate quarterly.