Initial commit

skills/heuristics-and-checklists/SKILL.md
@@ -0,0 +1,262 @@
---
name: heuristics-and-checklists
description: Use when making decisions under time pressure or uncertainty, preventing errors in complex procedures, designing decision rules or checklists, simplifying complex choices, or when user mentions heuristics, rules of thumb, mental models, checklists, error prevention, cognitive biases, satisficing, or needs practical decision shortcuts and systematic error reduction.
---

# Heuristics and Checklists

## Table of Contents

- [Purpose](#purpose)
- [When to Use](#when-to-use)
- [What Is It?](#what-is-it)
- [Workflow](#workflow)
- [Common Patterns](#common-patterns)
- [Guardrails](#guardrails)
- [Quick Reference](#quick-reference)

## Purpose

This skill provides practical frameworks for fast decision-making through mental shortcuts (heuristics) and systematic error prevention through structured procedures (checklists). It guides you through designing effective heuristics for routine decisions, creating checklists for complex procedures, and understanding when shortcuts work vs. when they lead to biases.

## When to Use

Use this skill when:

- **Time-constrained decisions**: Need to decide quickly without full analysis
- **Routine choices**: Repetitive decisions where full analysis is overkill (satisficing)
- **Error prevention**: Complex procedures where mistakes are costly (surgery, software deployment, flight operations)
- **Checklist design**: Creating pre-flight, pre-launch, or pre-deployment checklists
- **Cognitive load reduction**: Simplifying complex decisions into simple rules
- **Bias mitigation**: Understanding when heuristics mislead (availability, anchoring, representativeness)
- **Knowledge transfer**: Codifying expert intuition into transferable rules
- **Quality assurance**: Ensuring critical steps aren't skipped
- **Onboarding**: Teaching newcomers reliable decision patterns
- **High-stakes procedures**: Surgery, aviation, nuclear operations, financial trading

Trigger phrases: "heuristics", "rules of thumb", "mental models", "checklists", "error prevention", "cognitive biases", "satisficing", "quick decision", "standard operating procedure"

## What Is It?

**Heuristics and Checklists** combines two complementary approaches for practical decision-making and error prevention:

**Core components**:
- **Heuristics**: Mental shortcuts or rules of thumb that simplify decisions (e.g., "recognition heuristic": choose the option you recognize)
- **Checklists**: Structured lists ensuring critical steps are completed in order (aviation pre-flight, surgical safety checklist)
- **Fast & Frugal Trees**: Simple decision trees with few branches that yield good-enough decisions from minimal information
- **Satisficing**: "Good enough" decisions (Herbert Simon) vs. exhaustive optimization
- **Bias awareness**: Recognizing when heuristics fail (availability bias, anchoring, representativeness)
- **Error prevention**: Swiss cheese model, forcing functions, fail-safes

**Quick example:**

**Scenario**: Startup CEO deciding whether to hire a candidate after interview.

**Without heuristics** (exhaustive analysis):
- Compare to all other candidates (takes weeks)
- 360-degree reference checks (10+ calls)
- Skills assessment, culture fit survey, multiple rounds
- Analysis paralysis; good candidates lost to faster competitors

**With heuristics** (fast & frugal):
1. **Recognition heuristic**: Have they worked at a company I respect? (Yes → +1)
2. **Take-the-best**: What's their track record on the most important skill? (Strong → +1)
3. **Satisficing threshold**: If 2/2 positive, hire. Don't keep searching for the "perfect" candidate.

**Outcome**: Hired a strong candidate in 3 days instead of 3 weeks. Not perfect, but good enough and fast.
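
As a minimal sketch of these three rules in code (the company list, field names, and rating scale are illustrative assumptions, not part of the skill):

```
# Hedged sketch of the fast & frugal hiring rule above.
respected_companies = {"Google", "Stripe", "Datadog"}  # hypothetical list

def hire_decision(candidate: dict) -> bool:
    score = 0
    # 1. Recognition heuristic: worked at a company I respect?
    if candidate["past_companies"] & respected_companies:
        score += 1
    # 2. Take-the-best: strong on the single most important skill?
    if candidate["key_skill_rating"] >= 4:  # assumed 1-5 scale
        score += 1
    # 3. Satisficing threshold: 2/2 positive -> hire, stop searching
    return score == 2

print(hire_decision({"past_companies": {"Stripe"}, "key_skill_rating": 5}))  # True
```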

**Checklist example** (Software Deployment):

```
Pre-Deployment Checklist:
☐ All tests passing (unit, integration, E2E)
☐ Database migrations tested on staging
☐ Rollback plan documented
☐ Monitoring dashboards configured
☐ On-call engineer identified
☐ Stakeholders notified of deployment window
☐ Feature flags configured for gradual rollout
☐ Backups completed
```

**Benefit**: Prevents missing critical steps. Error reductions on the order of 60-80% are reported for checklist adoption in fields such as aviation, surgery, and software operations.

**Core benefits**:
- **Speed**: Heuristics enable fast decisions under time pressure
- **Cognitive efficiency**: Reduce mental load, free capacity for complex thinking
- **Error reduction**: Checklists catch mistakes before they cause harm
- **Consistency**: Standardized procedures reduce variance in outcomes
- **Knowledge codification**: Capture expert intuition in transferable form

## Workflow

Copy this checklist and track your progress:

```
Heuristics & Checklists Progress:
- [ ] Step 1: Identify decision or procedure
- [ ] Step 2: Choose approach (heuristic vs. checklist)
- [ ] Step 3: Design heuristic or checklist
- [ ] Step 4: Test and validate
- [ ] Step 5: Apply and monitor
- [ ] Step 6: Refine based on outcomes
```

**Step 1: Identify decision or procedure**

What decision or procedure needs simplification? Is it repetitive? Time-sensitive? Error-prone? See [resources/template.md](resources/template.md#decision-procedure-identification-template).

**Step 2: Choose approach (heuristic vs. checklist)**

Heuristic for decisions (choosing an option). Checklist for procedures (sequence of steps). See [resources/methodology.md](resources/methodology.md#1-when-to-use-heuristics-vs-checklists).

**Step 3: Design heuristic or checklist**

Heuristic: Define a simple rule (recognition, take-the-best, satisficing threshold). Checklist: List critical steps, choose READ-DO or DO-CONFIRM format. See [resources/template.md](resources/template.md#heuristic-design-template) and [resources/template.md](resources/template.md#checklist-design-template).

**Step 4: Test and validate**

Pilot test with sample cases. Check: Does the heuristic produce good-enough decisions? Does the checklist catch errors? See [resources/methodology.md](resources/methodology.md#4-validating-heuristics-and-checklists).

**Step 5: Apply and monitor**

Use in real scenarios. Track outcomes: decision quality, error rate, time saved. See [resources/template.md](resources/template.md#application-monitoring-template).

**Step 6: Refine based on outcomes**

Adjust rules based on data. If the heuristic fails in specific contexts, add an exception. If the checklist is too long, prioritize critical items. See [resources/methodology.md](resources/methodology.md#5-refinement-and-iteration).

Validate using [resources/evaluators/rubric_heuristics_and_checklists.json](resources/evaluators/rubric_heuristics_and_checklists.json). **Minimum standard**: Average score ≥ 3.5.

## Common Patterns

**Pattern 1: Recognition Heuristic**
- **Rule**: Choose the option you recognize over the one you don't
- **Best for**: Choosing between brands, cities, experts when quality correlates with fame
- **Example**: "Which city is larger, Detroit or Milwaukee?" (Choose Detroit if only one is recognized)
- **When works**: Stable environments where recognition predicts quality
- **When fails**: Advertising creates false recognition, niche quality unknown

**Pattern 2: Take-the-Best Heuristic**
- **Rule**: Identify the single most important criterion, choose based on that alone
- **Best for**: Multi-attribute decisions with one dominant factor
- **Example**: Hiring - "What's their track record on [critical skill]?" Ignore other factors.
- **When works**: One factor is predictive, others add little value
- **When fails**: Multiple factors equally important, interactions matter

**Pattern 3: Satisficing (Good Enough Threshold)**
- **Rule**: Set minimum acceptable criteria, choose the first option that meets them (see the sketch after this pattern)
- **Best for**: Routine decisions, time pressure, diminishing returns from analysis
- **Example**: "Candidate meets 80% of requirements → hire, don't keep searching for 100%"
- **When works**: Search costs high, good enough > perfect delayed
- **When fails**: Consequences of a suboptimal choice are severe
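
A minimal sketch of this pattern, assuming illustrative criteria names and thresholds:

```
# Satisficing: take the first option that meets every minimum threshold.
thresholds = {"skills": 0.8, "culture_fit": 7}  # illustrative minimums

def satisfice(options):
    for option in options:  # evaluate in the order received
        if all(option[key] >= minimum for key, minimum in thresholds.items()):
            return option  # good enough: stop searching
    return None  # nothing qualified; consider lowering the bar

candidates = [
    {"name": "A", "skills": 0.70, "culture_fit": 9},
    {"name": "B", "skills": 0.85, "culture_fit": 8},  # first to qualify
    {"name": "C", "skills": 0.95, "culture_fit": 9},  # never evaluated
]
print(satisfice(candidates)["name"])  # "B"
```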

**Pattern 4: Aviation Checklist (DO-CONFIRM)**
- **Format**: Perform actions from memory, then confirm each with checklist
- **Best for**: Routine procedures with critical steps (pre-flight, pre-surgery, deployment)
- **Example**: Pilot flies from memory, then reviews checklist to confirm all done
- **When works**: Experts doing familiar procedures, flow state preferred
- **When fails**: Novices, unfamiliar procedures (use READ-DO instead)

**Pattern 5: Surgical Checklist (READ-DO)**
- **Format**: Read each step, then perform, one at a time
- **Best for**: Unfamiliar procedures, novices, high-stakes irreversible actions
- **Example**: Surgical team reads checklist aloud, confirms each step before proceeding
- **When works**: Unfamiliar context, learning mode, consequences of error high
- **When fails**: Expert routine tasks (feels tedious, adds overhead)

**Pattern 6: Fast & Frugal Decision Tree**
- **Format**: Simple decision tree with 1-3 questions, binary choices at each node
- **Best for**: Triage, classification, go/no-go decisions
- **Example**: "Is customer enterprise? Yes → Assign senior rep. No → Is deal >$10k? Yes → Assign mid-level. No → Self-serve."
- **When works**: Clear decision structure, limited information needed
- **When fails**: Nuanced decisions, exceptions common
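
The example tree above, sketched as nested binary checks (the field names and dict shape are assumptions; the $10k cut-off comes from the example):

```
# Fast & frugal tree: at most two questions, one action per leaf.
def route(customer: dict) -> str:
    if customer["is_enterprise"]:
        return "senior rep"
    if customer["deal_value"] > 10_000:
        return "mid-level rep"
    return "self-serve"

print(route({"is_enterprise": False, "deal_value": 25_000}))  # mid-level rep
```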

## Guardrails

**Critical requirements:**

1. **Know when heuristics work vs. fail**: Heuristics excel in stable, familiar environments with time pressure. They fail in novel, deceptive contexts (adversarial, misleading information). Don't use the recognition heuristic when advertising creates false signals.

2. **Satisficing ≠ low standards**: The "good enough" threshold must be calibrated. Set it based on the cost of continued search vs. the value of a better option. Too low → poor decisions. Too high → analysis paralysis.

3. **Checklists for critical steps only**: Don't list every trivial action. Focus on steps that (1) are skipped often, (2) have serious consequences if missed, (3) are not immediately obvious. A short checklist that gets used beats a long one that gets ignored.

4. **READ-DO for novices, DO-CONFIRM for experts**: Match format to user expertise. Forcing experts into READ-DO creates resistance and abandonment. Let experts flow, confirm after.

5. **Test heuristics empirically**: Don't assume a rule works. Test on historical cases. Compare heuristic decisions to optimal decisions. If accuracy <80%, refine or abandon.

6. **Bias awareness is not bias elimination**: Knowing availability bias exists doesn't prevent it. Heuristics operate unconsciously. External checks (checklists, peer review, base rates) are needed to counteract biases.

7. **Update heuristics when environment changes**: Rules optimized for the past may fail in a new context. Markets shift, technology changes, competitor strategies evolve. Re-validate quarterly.

8. **Forcing functions beat reminders**: "Don't forget X" fails. "Can't proceed until X done" works. Build constraints (e.g., deployment script requires all tests to pass) rather than relying on memory.

**Common pitfalls:**

- ❌ **Heuristic as universal law**: "Always choose recognized brand" fails when dealing with deceptive advertising or niche quality.
- ❌ **Checklist too long**: A 30-item checklist gets skipped. Keep to 5-10 critical items max.
- ❌ **Ignoring base rates**: "This customer seems like they'll buy" (representativeness heuristic) vs. "Only 2% of leads convert" (base rate). Use base rates to calibrate intuition.
- ❌ **Anchoring on first option**: "First candidate seems good, let's hire" without considering alternatives. Set a satisficing threshold, then evaluate multiple options.
- ❌ **Checklist as blame shield**: "I followed the checklist, not my fault" ignores the responsibility to think. Checklists augment judgment, they don't replace it.
- ❌ **Not testing heuristics**: Assuming a rule works without validation. Test on past cases, measure accuracy.

## Quick Reference

**Common heuristics:**

| Heuristic | Rule | Example | Best For |
|-----------|------|---------|----------|
| **Recognition** | Choose what you recognize | Detroit > Milwaukee (size) | Stable correlations between recognition and quality |
| **Take-the-best** | Use single most important criterion | Hire based on track record alone | One dominant factor predicts outcome |
| **Satisficing** | First option meeting threshold | Candidate meets 80% requirements → hire | Time pressure, search costs high |
| **Availability** | Judge frequency by ease of recall | Plane crashes seem common (vivid) | Recent, vivid events (WARNING: bias) |
| **Representativeness** | Judge by similarity to prototype | "Looks like successful startup founder" | Stereotypes exist (WARNING: bias) |
| **Anchoring** | Adjust from initial value | First price shapes negotiation | Numerical estimates (WARNING: bias) |

**Checklist formats:**

| Format | When to Use | Process | Example |
|--------|-------------|---------|---------|
| **READ-DO** | Novices, unfamiliar, high-stakes | Read step → Do step → Repeat | Surgery (WHO checklist) |
| **DO-CONFIRM** | Experts, routine, familiar | Do from memory → Confirm with checklist | Aviation pre-flight |
| **Challenge-Response** | Two-person verification | One reads, other confirms | Nuclear launch procedures |

**Checklist design principles:**

1. **Keep it short**: 5-10 items max (critical steps only)
2. **Use verb-first language**: "Verify backups complete" not "Backups"
3. **One step per line**: Don't combine "Test and deploy"
4. **Checkbox format**: ☐ Clear visual confirmation
5. **Pause points**: Identify natural breaks (before start, after critical phase, before finish)
6. **Killer items**: Mark items that block proceeding (e.g., ⚠ Tests must pass)

**When to use heuristics vs. checklists:**

| Decision Type | Use Heuristic | Use Checklist |
|---------------|---------------|---------------|
| **Choose between options** | ✓ Recognition, take-the-best, satisficing | ✗ Not applicable |
| **Sequential procedure** | ✗ Not applicable | ✓ Pre-flight, deployment, surgery |
| **Complex multi-step** | ✗ Too simplified | ✓ Ensures nothing skipped |
| **Routine decision** | ✓ Fast rule (satisficing) | ✗ Overkill |
| **Error-prone procedure** | ✗ Doesn't prevent errors | ✓ Catches mistakes |

**Cognitive biases (when heuristics fail):**

| Bias | Heuristic | Failure Mode | Mitigation |
|------|-----------|--------------|------------|
| **Availability** | Recent/vivid events judged as frequent | Overestimate plane crashes (vivid), underestimate heart disease | Use base rates, statistical data |
| **Representativeness** | Judge by stereotype similarity | "Looks like successful founder" ignores base rate of success | Check against actual base rates |
| **Anchoring** | First number shapes estimate | Initial salary offer anchors negotiation | Set own anchor first, adjust deliberately |
| **Confirmation** | Seek supporting evidence | Only notice confirming data | Actively seek disconfirming evidence |
| **Sunk cost** | Continue due to past investment | "Already spent $100k, can't stop now" | Evaluate based on future value only |

**Inputs required:**
- **Decision/procedure**: What needs simplification or systematization?
- **Historical data**: Past cases to test heuristic accuracy
- **Critical steps**: Which steps, if skipped, cause failures?
- **Error patterns**: Where do mistakes happen most often?
- **Time constraints**: How quickly must the decision be made?

**Outputs produced:**
- `heuristic-rule.md`: Defined heuristic with conditions and exceptions
- `checklist.md`: Structured checklist with critical steps
- `validation-results.md`: Test results on historical cases
- `refinement-log.md`: Iterations based on real-world performance

skills/heuristics-and-checklists/resources/evaluators/rubric_heuristics_and_checklists.json
@@ -0,0 +1,211 @@
{
  "criteria": [
    {
      "name": "Heuristic Appropriateness",
      "description": "Heuristic type (recognition, take-the-best, satisficing, fast & frugal tree) matches decision context and environment stability.",
      "scale": {
        "1": "Wrong heuristic type chosen. Using recognition in adversarial environment or satisficing for novel high-stakes decisions.",
        "3": "Heuristic type reasonable but suboptimal. Some mismatch between heuristic and context.",
        "5": "Perfect match: Heuristic type suits decision frequency, time pressure, stakes, and environment stability. Ecological rationality demonstrated."
      }
    },
    {
      "name": "Checklist Focus (Killer Items Only)",
      "description": "Checklist contains only critical steps (often skipped AND serious consequences if missed). Not comprehensive list of all steps.",
      "scale": {
        "1": "Checklist too long (>15 items) or includes trivial steps. Comprehensive but unusable.",
        "3": "Some critical items present but includes non-critical steps. Length 10-15 items.",
        "5": "Focused on 5-9 killer items only. Each item meets criteria: often skipped, serious consequences, not obvious. Concise and actionable."
      }
    },
    {
      "name": "Format Alignment (READ-DO vs DO-CONFIRM)",
      "description": "Checklist format matches user expertise level. READ-DO for novices/unfamiliar, DO-CONFIRM for experts/routine.",
      "scale": {
        "1": "Format mismatch: READ-DO for experts (resistance), or DO-CONFIRM for novices (high error rate).",
        "3": "Format generally appropriate but could be optimized for specific user/context.",
        "5": "Perfect format match: READ-DO for novices/high-stakes/unfamiliar, DO-CONFIRM for experts/routine. Clear rationale provided."
      }
    },
    {
      "name": "Heuristic Validation",
      "description": "Heuristic tested on historical data (≥30 cases) or validated through A/B testing. Accuracy measured (target ≥80%).",
      "scale": {
        "1": "No validation. Heuristic assumed to work without testing. No accuracy measurement.",
        "3": "Informal validation on small sample (<30 cases). Some accuracy data but incomplete.",
        "5": "Rigorous validation: ≥30 historical cases tested, or A/B test run. Accuracy ≥80% demonstrated. Compared to baseline/alternative methods."
      }
    },
    {
      "name": "Checklist Error Reduction",
      "description": "Checklist effectiveness measured through before/after error rates. Target ≥50% error reduction demonstrated or projected.",
      "scale": {
        "1": "No measurement of checklist effectiveness. Error rates not tracked.",
        "3": "Some error tracking. Before/after comparison attempted but incomplete data.",
        "5": "Clear error rate measurement: before vs. after checklist. ≥50% reduction demonstrated or realistic projection based on similar cases."
      }
    },
    {
      "name": "Threshold Calibration (Satisficing)",
      "description": "For satisficing heuristics, threshold set based on search costs, time pressure, and past outcomes. Adaptive adjustment specified.",
      "scale": {
        "1": "Threshold arbitrary or missing. No rationale for 'good enough' level.",
        "3": "Threshold set but rationale weak. Some consideration of costs/benefits.",
        "5": "Well-calibrated threshold: based on search costs, time value, historical data. Adaptive rule specified (lower if no options, raise if too many)."
      }
    },
    {
      "name": "Bias Awareness and Mitigation",
      "description": "Recognizes when heuristics susceptible to biases (availability, representativeness, anchoring). Mitigation strategies included.",
      "scale": {
        "1": "No bias awareness. Heuristic presented as universally valid without limitations.",
        "3": "Some bias awareness mentioned but mitigation weak or missing.",
        "5": "Clear identification of bias risks (availability, representativeness, anchoring). Specific mitigations: use base rates, blind evaluation, external anchors."
      }
    },
    {
      "name": "Exception Handling",
      "description": "Heuristics include clear exceptions (when NOT to use). Checklists specify pause points and killer items (blocking conditions).",
      "scale": {
        "1": "No exceptions specified. Rule presented as universal. No blocking conditions in checklist.",
        "3": "Some exceptions mentioned but incomplete. Pause points present but killer items unclear.",
        "5": "Comprehensive exceptions: contexts where heuristic fails (novel, high-stakes, adversarial). Checklist has clear pause points and ⚠ killer items blocking proceed."
      }
    },
    {
      "name": "Cue Validity (Take-the-Best)",
      "description": "For take-the-best heuristics, cue validity documented (how often criterion predicts outcome, target >70%). Single most predictive cue identified.",
      "scale": {
        "1": "Cue chosen arbitrarily. No validity data. May not be most predictive criterion.",
        "3": "Cue seems reasonable but validity not measured. Assumption it's most predictive.",
        "5": "Cue validity rigorously measured (>70%). Compared to alternative cues. Confirmed as most predictive criterion through data."
      }
    },
    {
      "name": "Iteration and Refinement Plan",
      "description": "Plan for monitoring heuristic/checklist performance and refining based on outcomes. Triggers for adjustment specified.",
      "scale": {
        "1": "No refinement plan. Set-it-and-forget-it approach. No monitoring of outcomes.",
        "3": "Informal plan to review periodically. No specific triggers or metrics.",
        "5": "Detailed refinement plan: metrics tracked (accuracy, error rate), review frequency (quarterly), triggers for adjustment (accuracy <80%, error rate >X%). Iterative improvement process."
      }
    }
  ],
  "guidance_by_type": {
    "Hiring/Recruitment Decisions": {
      "target_score": 4.0,
      "key_criteria": ["Heuristic Appropriateness", "Threshold Calibration (Satisficing)", "Bias Awareness and Mitigation"],
      "common_pitfalls": ["Representativeness bias (looks like successful hire)", "Anchoring on first candidate", "No validation against historical hires"],
      "specific_guidance": "Use satisficing with clear thresholds (technical ≥75%, culture ≥7/10). Test on past hires (did threshold predict success?). Mitigate representativeness bias with blind evaluation and base rates."
    },
    "Operational Procedures (Deployment, Surgery, Aviation)": {
      "target_score": 4.5,
      "key_criteria": ["Checklist Focus (Killer Items Only)", "Format Alignment", "Checklist Error Reduction"],
      "common_pitfalls": ["Checklist too long (>15 items, gets skipped)", "Wrong format for users", "No error rate measurement"],
      "specific_guidance": "Focus on 5-9 killer items. Use READ-DO for unfamiliar/high-stakes, DO-CONFIRM for routine. Measure error rates before/after, target ≥50% reduction. Add forcing functions for critical steps."
    },
    "Customer Triage/Routing": {
      "target_score": 3.8,
      "key_criteria": ["Heuristic Appropriateness", "Exception Handling", "Iteration and Refinement Plan"],
      "common_pitfalls": ["Fast & frugal tree too complex (>3 levels)", "No exceptions for edge cases", "Not adapting as customer base evolves"],
      "specific_guidance": "Use fast & frugal tree (2-3 binary questions max). Define clear routing rules. Track misrouted cases, refine tree quarterly. Add exceptions for VIPs, escalations."
    },
    "Investment/Resource Allocation": {
      "target_score": 4.2,
      "key_criteria": ["Cue Validity (Take-the-Best)", "Heuristic Validation", "Bias Awareness and Mitigation"],
      "common_pitfalls": ["Availability bias (recent successes over-weighted)", "Confirmation bias (seek supporting evidence only)", "No backtest on historical cases"],
      "specific_guidance": "Use take-the-best with validated cue (founder track record, market size). Test on ≥30 past investments, accuracy ≥75%. Mitigate availability bias with base rates, sunk cost fallacy with future-value focus."
    },
    "Emergency/Time-Critical Decisions": {
      "target_score": 3.7,
      "key_criteria": ["Heuristic Appropriateness", "Exception Handling", "Format Alignment"],
      "common_pitfalls": ["Heuristic too slow (defeats purpose of quick decision)", "No exceptions for novel emergencies", "Checklist too long for urgent situations"],
      "specific_guidance": "Use recognition or satisficing for speed. Keep checklist to 3-5 critical items. Define clear exceptions ('if novel situation, escalate to expert'). Practice drills to build muscle memory."
    }
  },
  "guidance_by_complexity": {
    "Simple (Routine, Low Stakes)": {
      "target_score": 3.5,
      "focus_areas": ["Heuristic Appropriateness", "Checklist Focus", "Format Alignment"],
      "acceptable_shortcuts": ["Informal validation (spot checks)", "Shorter checklists (3-5 items)", "Basic exception list"],
      "specific_guidance": "Simple satisficing or recognition heuristic. Short checklist (3-5 items). DO-CONFIRM format for routine tasks. Informal validation acceptable if low stakes."
    },
    "Standard (Moderate Stakes, Some Complexity)": {
      "target_score": 4.0,
      "focus_areas": ["Heuristic Validation", "Checklist Error Reduction", "Bias Awareness and Mitigation"],
      "acceptable_shortcuts": ["Validation on smaller sample (20-30 cases)", "Informal error tracking"],
      "specific_guidance": "Validate heuristic on ≥20 cases, accuracy ≥75%. Track error rates before/after checklist. Identify 1-2 key biases and mitigations. Checklists 5-7 items."
    },
    "Complex (High Stakes, Multiple Factors)": {
      "target_score": 4.5,
      "focus_areas": ["All criteria", "Rigorous validation", "Continuous refinement"],
      "acceptable_shortcuts": ["None - comprehensive analysis required"],
      "specific_guidance": "Rigorous validation: ≥30 cases, A/B testing, accuracy ≥80%. Comprehensive bias mitigation. Error reduction ≥50% measured. Forcing functions for critical steps. Quarterly refinement. All exceptions documented."
    }
  },
  "common_failure_modes": [
    {
      "name": "Checklist Too Long",
      "symptom": "Checklist >15 items, includes trivial steps. Users skip or ignore.",
      "detection": "Low adoption rate (<50% of users complete). Items frequently unchecked.",
      "fix": "Ruthlessly cut to 5-9 killer items only. Ask: 'Is this often skipped? Serious consequences if missed? Not obvious?' If no to any, remove."
    },
    {
      "name": "Wrong Heuristic for Context",
      "symptom": "Using recognition heuristic in adversarial environment (advertising), or satisficing for novel high-stakes decision.",
      "detection": "Heuristic accuracy <60%, or major failures in novel contexts.",
      "fix": "Match heuristic to environment: stable → recognition/take-the-best, uncertain → satisficing, novel → full analysis. Add exceptions for edge cases."
    },
    {
      "name": "No Validation",
      "symptom": "Heuristic assumed to work without testing. No accuracy data. Checklist deployed without error rate measurement.",
      "detection": "Ask: 'How often does this rule work?' If no data, no validation.",
      "fix": "Test heuristic on ≥30 historical cases. Measure accuracy (target ≥80%). For checklists, track error rates before/after (target ≥50% reduction)."
    },
    {
      "name": "Ignoring Base Rates",
      "symptom": "Using representativeness or availability heuristics without checking actual frequencies. Recent vivid event over-weighted.",
      "detection": "Compare heuristic prediction to base rate. If large discrepancy, bias present.",
      "fix": "Always check base rates first. 'Customer from X looks risky' → Check: 'What % of X customers actually default?' Use data, not anecdotes."
    },
    {
      "name": "Format Mismatch (Expert vs. Novice)",
      "symptom": "Forcing experts into READ-DO creates resistance and abandonment. Novices with DO-CONFIRM make errors.",
      "detection": "User feedback: 'Too tedious' (experts) or 'Still making mistakes' (novices).",
      "fix": "Match format to expertise: Novices/unfamiliar → READ-DO, Experts/routine → DO-CONFIRM. Let experts flow, then confirm."
    },
    {
      "name": "Satisficing Threshold Uncalibrated",
      "symptom": "Threshold too high (analysis paralysis, no options qualify) or too low (poor decisions).",
      "detection": "Search budget exhausted with no options (too high), or many poor outcomes (too low).",
      "fix": "Calibrate based on search costs and past outcomes. Adaptive rule: lower after K searches if no options, raise if too many qualify."
    },
    {
      "name": "No Exceptions Specified",
      "symptom": "Heuristic presented as universal law. Applied blindly to novel or adversarial contexts where it fails.",
      "detection": "Major failures in contexts different from training data.",
      "fix": "Define clear exceptions: 'Use heuristic EXCEPT when [novel/high-stakes/adversarial/legally sensitive].' Document failure modes."
    },
    {
      "name": "Cue Not Most Predictive",
      "symptom": "Take-the-best uses convenient cue, not most valid cue. Accuracy suffers.",
      "detection": "Heuristic accuracy <75%. Other cues perform better in backtests.",
      "fix": "Test multiple cues, rank by validity (% of time cue predicts outcome correctly). Use highest-validity cue, ignore others."
    },
    {
      "name": "Checklist as Blame Shield",
      "symptom": "'I followed checklist, not my fault.' Boxes checked without thinking. False sense of security.",
      "detection": "Errors still occur despite checklist completion. Users mechanically checking boxes.",
      "fix": "Emphasize checklist augments judgment, doesn't replace it. Add forcing functions for critical items (can't proceed unless done). Challenge-response for high-stakes."
    },
    {
      "name": "No Refinement Plan",
      "symptom": "Set-it-and-forget-it. Heuristic/checklist not updated as environment changes. Accuracy degrades over time.",
      "detection": "Heuristic accuracy declining. Checklist error rate creeping back up.",
      "fix": "Quarterly review: Re-validate heuristic on recent cases. Track error rates. Adjust thresholds, add exceptions, update checklist items based on data."
    }
  ],
  "minimum_standard": 3.5,
  "target_score": 4.0,
  "excellence_threshold": 4.5
}

skills/heuristics-and-checklists/resources/methodology.md
@@ -0,0 +1,422 @@
# Heuristics and Checklists Methodology

Advanced techniques for decision heuristics, checklist design, and cognitive bias mitigation.

## Table of Contents
1. [When to Use Heuristics vs. Checklists](#1-when-to-use-heuristics-vs-checklists)
2. [Heuristics Research and Theory](#2-heuristics-research-and-theory)
3. [Checklist Design Principles](#3-checklist-design-principles)
4. [Validating Heuristics and Checklists](#4-validating-heuristics-and-checklists)
5. [Refinement and Iteration](#5-refinement-and-iteration)
6. [Cognitive Biases and Mitigation](#6-cognitive-biases-and-mitigation)

---

## 1. When to Use Heuristics vs. Checklists

### Heuristics (Decision Shortcuts)

**Use when**:
- Time pressure (need to decide in <1 hour)
- Routine decisions (happens frequently, precedent exists)
- Good enough > perfect (satisficing appropriate)
- Environment stable (patterns repeat reliably)
- Cost of wrong decision low (<$10k impact)

**Don't use when**:
- Novel situations (no precedent, first time)
- High stakes (>$100k impact, irreversible)
- Adversarial environments (deception, misleading information)
- Multiple factors equally important (interactions matter)
- Legal/compliance requirements (need documentation)

### Checklists (Procedural Memory Aids)

**Use when**:
- Complex procedures (>5 steps)
- Error-prone (history of mistakes)
- High consequences if a step is missed (safety, money, reputation)
- Infrequent procedures (easy to forget details)
- Multiple people involved (coordination needed)

**Don't use when**:
- Simple tasks (<3 steps)
- Fully automated (no human steps)
- Creative/exploratory work (checklist constrains unnecessarily)
- Expert with muscle memory (checklist adds overhead without value)

---

## 2. Heuristics Research and Theory

### Fast and Frugal Heuristics (Gigerenzer)

**Key insight**: Simple heuristics can outperform complex models in uncertain environments with limited information.

**Classic heuristics**:

1. **Recognition Heuristic**: If you recognize one object but not the other, infer that the recognized object has higher value.
   - **Example**: Which city is larger, Detroit or Tallahassee? (Recognize Detroit → Larger)
   - **Works when**: Recognition correlates with criterion (r > 0.5)
   - **Fails when**: Misleading advertising, niche quality

2. **Take-the-Best**: Rank cues by validity, use the highest-validity cue that discriminates. Ignore the others (see the sketch after this list).
   - **Example**: Hiring based on coding test score alone (if validity >70%)
   - **Works when**: One cue dominates, environment stable
   - **Fails when**: Multiple factors interact, no dominant cue

3. **1/N Heuristic**: Divide resources equally among N options.
   - **Example**: Investment portfolio - equal weight across stocks
   - **Works when**: No information on which option is better, diversification reduces risk
   - **Fails when**: Clear quality differences exist

### Satisficing (Herbert Simon)

**Concept**: Search until an option meets the aspiration level (threshold), then stop. Don't optimize.

**Formula**:
```
Aspiration level = f(past outcomes, time pressure, search costs)
```

**Adaptive satisficing**: Lower the threshold if no options meet it after K searches. Raise the threshold if too many options qualify.

**Example**:
- Job search: "Salary ≥$120k, culture fit ≥7/10"
- After 20 applications, no offers → Lower to $110k
- After 5 offers all meeting the bar → Raise to $130k

### Ecological Rationality

**Key insight**: A heuristic's success depends on the environment, not its complexity.

**Environment characteristics**:
- **Redundancy**: Multiple cues correlated (take-the-best works)
- **Predictability**: Patterns repeat (recognition heuristic works)
- **Volatility**: Rapid change (simple heuristics adapt faster than complex models)
- **Feedback speed**: Fast feedback enables learning (trial-and-error refinement)

**Mismatch example**: Using the recognition heuristic in an adversarial environment (advertising creates false recognition) → Fails

---

## 3. Checklist Design Principles

### Atul Gawande's Checklist Principles

Based on research in aviation, surgery, construction:

1. **Keep it short**: 5-9 items max. Longer checklists get skipped.
2. **Focus on killer items**: Steps that are often missed AND have serious consequences.
3. **Verb-first language**: "Verify backups complete" not "Backups"
4. **Pause points**: Define WHEN to use the checklist (before start, after critical phase, before finish)
5. **Fit on one page**: No scrolling or page-flipping

### READ-DO vs. DO-CONFIRM

**READ-DO** (Challenge-Response):
- Read item aloud → Perform action → Confirm → Next item
- **Use for**: Novices, unfamiliar procedures, irreversible actions
- **Example**: Surgical safety checklist (WHO)

**DO-CONFIRM**:
- Perform entire procedure from memory → Then review checklist to confirm all done
- **Use for**: Experts, routine procedures, flow state important
- **Example**: Aviation pre-flight (experienced pilots)

**Which to choose?**:
- Expertise level: Novice → READ-DO, Expert → DO-CONFIRM
- Familiarity: First time → READ-DO, Routine → DO-CONFIRM
- Consequences: Irreversible (surgery) → READ-DO, Reversible → DO-CONFIRM

### Forcing Functions and Fail-Safes

**Forcing function**: Design that prevents proceeding without completing a step.

**Examples**:
- Car won't start unless seatbelt fastened
- Deployment script fails if tests not passing
- Door won't lock if key inside

**vs. Checklist item**: A checklist reminds, but can be ignored. A forcing function prevents.

**When to use a forcing function instead of a checklist**:
- Critical safety step (must be done)
- Automatable check (can be enforced by system)
- High compliance needed (>99%)

**When a checklist is sufficient**:
- Judgment required (can't automate)
- Multiple valid paths (flexibility needed)
- Compliance good with a reminder (>90%)
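
A sketch of the deployment example as a forcing function (assumes a pytest suite is available; the deployment step is a placeholder, not a real API):

```
# Forcing function: deployment is impossible, not just inadvisable,
# while the test suite fails. A checklist item could be skipped; this can't.
import subprocess
import sys

def deploy():
    result = subprocess.run(["pytest", "-q"])
    if result.returncode != 0:
        sys.exit("Tests failing - deployment blocked.")
    print("Tests green - deploying...")
    # ... actual deployment steps would go here

if __name__ == "__main__":
    deploy()
```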

---

## 4. Validating Heuristics and Checklists

### Testing Heuristics

**Retrospective validation**: Test the heuristic on historical cases.

**Method**:
1. Collect past decisions (N ≥ 30 cases)
2. Apply the heuristic to each case (blind to actual outcome)
3. Compare the heuristic's decision to the actual outcome
4. Calculate accuracy: % of cases where the heuristic would've chosen correctly

**Target accuracy**: ≥80% for a good heuristic. <70% → Refine or abandon.

**Example** (Hiring heuristic):
- Heuristic: "Hire candidates from top 10 tech companies"
- Test on past 60 hires (50 from top companies, 10 from others)
- Outcome: 40/50 (80%) from top companies succeeded, 5/10 (50%) from others
- **Conclusion**: Heuristic valid (80% > 50% base rate)
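
A minimal backtest sketch of this method (the cases and heuristic are illustrative):

```
# Retrospective validation: apply the heuristic to past cases and
# measure accuracy against known outcomes (1 = success, 0 = failure).
cases = [
    ({"top_company": True}, 1),
    ({"top_company": True}, 1),
    ({"top_company": False}, 0),
    ({"top_company": True}, 0),
    # ... in practice, N >= 30 historical cases
]

def heuristic(features: dict) -> int:
    return 1 if features["top_company"] else 0  # predict success

hits = sum(heuristic(features) == outcome for features, outcome in cases)
print(f"accuracy = {hits / len(cases):.0%}")  # target >= 80%; refine if below
```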

### A/B Testing Heuristics

**Prospective validation**: Run controlled experiment.

**Method**:
1. Group A: Use heuristic
2. Group B: Use existing method (or random)
3. Compare outcomes (quality, speed, consistency)

**Example**:
- A: Customer routing by fast & frugal tree
- B: Customer routing by availability
- **Metrics**: Response time, resolution rate, customer satisfaction
- **Result**: A faster (20% ↓ response time), higher satisfaction (8.2 vs. 7.5) → Adopt heuristic
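
One way to check whether such a difference is more than noise is a two-proportion z-test; the counts below are illustrative, not taken from the example above:

```
# Compare resolution rates between group A (heuristic) and group B.
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# A: 172 of 200 inquiries resolved; B: 150 of 200 resolved
z = two_proportion_z(172, 200, 150, 200)
print(f"z = {z:.2f}")  # |z| > 1.96 -> significant at roughly the 5% level
```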

### Checklist Validation

**Error rate measurement**:

**Before checklist**:
- Track error rate for procedure (e.g., deployments with failures)
- Baseline: X% error rate

**After checklist**:
- Introduce checklist
- Track error rate for same procedure
- New rate: Y% error rate
- **Improvement**: (X - Y) / X × 100%

**Target**: ≥50% error reduction. If <25%, checklist not effective.

**Example** (Surgical checklist):
- Before: 11% complication rate
- After: 7% complication rate
- **Improvement**: (11 - 7) / 11 = 36% reduction (good, continue using)
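
The improvement formula as a tiny helper, using the surgical numbers above:

```
def error_reduction(before: float, after: float) -> float:
    # (X - Y) / X * 100: percent reduction in error rate
    return (before - after) / before * 100

print(f"{error_reduction(11, 7):.0f}% reduction")  # ~36%; >=50% is the target
```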

---

## 5. Refinement and Iteration

### When Heuristics Fail

**Diagnostic questions**:
1. **Wrong cue**: Are we using the best predictor? Try a different criterion.
2. **Threshold too high/low**: Should we raise/lower the aspiration level?
3. **Environment changed**: Did the market shift, competition intensify, technology disrupt?
4. **Exceptions accumulating**: Are special cases becoming the norm? Need a more complex rule.

**Refinement strategies**:
- **Add exception**: "Use heuristic EXCEPT when [condition]"
- **Adjust threshold**: Satisficing level up/down based on outcomes
- **Switch cue**: Use a different criterion if the current one is losing validity
- **Add layer**: Convert simple rule to fast & frugal tree (2-3 questions max)

### When Checklists Fail

**Diagnostic questions**:
1. **Too long**: Are people skipping because it's overwhelming? → Cut to killer items only.
2. **Wrong format**: Are experts resisting READ-DO? → Switch to DO-CONFIRM.
3. **Missing critical step**: Did an error happen that the checklist didn't catch? → Add item.
4. **False sense of security**: Are people checking boxes without thinking? → Add verification.

**Refinement strategies**:
- **Shorten**: Remove non-critical items. Aim for 5-9 items.
- **Reformat**: Switch READ-DO ↔ DO-CONFIRM based on user feedback.
- **Add forcing function**: Critical items become automated checks (not manual).
- **Add challenge-response**: Two-person verification for high-stakes items.

---

## 6. Cognitive Biases and Mitigation

### Availability Bias

**Definition**: Judge frequency/probability by ease of recall. Recent, vivid events seem more common.

**How it misleads heuristics**:
- Plane crash on news → Overestimate flight risk
- Recent fraud case → Overestimate fraud rate
- Salient failure → Avoid entire category

**Mitigation**:
- Use base rates (statistical frequencies) not anecdotes
- Ask: "What's the actual data?" not "What do I remember?"
- Track all cases, not just memorable ones

**Example**:
- Availability: "Customer from [Country X] didn't pay, avoid all [Country X] customers"
- Base rate check: "Only 2% of [Country X] customers defaulted vs. 1.5% overall" → Marginal difference, not categorical

### Representativeness Bias

**Definition**: Judge probability by similarity to stereotype/prototype. "Looks like X, therefore is X."

**How it misleads heuristics**:
- "Looks like successful founder" (hoodie, Stanford, articulate) → Overestimate success
- "Looks like good engineer" (quiet, focused) → Miss great communicators

**Mitigation**:
- Use objective criteria (track record, test scores) not stereotypes
- Check base rate: How often does the stereotype actually predict the outcome?
- Blind evaluation: Remove identifying information

**Example**:
- Representativeness: "Candidate reminds me of [successful person], hire"
- Base rate: "Only 5% of hires succeed regardless of who they remind me of"

### Anchoring Bias

**Definition**: Over-rely on the first piece of information. The initial number shapes the estimate.

**How it misleads heuristics**:
- First salary offer anchors negotiation
- Initial project estimate anchors timeline
- First price seen anchors value perception

**Mitigation**:
- Set your own anchor first (make the first offer)
- Deliberately adjust away from the anchor (mental correction)
- Use external reference (market data, not internal anchor)

**Example**:
- Anchoring: Candidate asks $150k, you offer $155k (anchored to their ask)
- Better: You offer $130k first (your anchor), negotiate from there

### Confirmation Bias

**Definition**: Seek, interpret, recall information confirming an existing belief. Ignore disconfirming evidence.

**How it misleads heuristics**:
- Heuristic works once → Notice only confirming cases
- Initial hypothesis → Search for supporting evidence only

**Mitigation**:
- Actively seek disconfirming evidence ("Why might this heuristic fail?")
- Track all cases (not just successes)
- Pre-commit to the decision rule, then test objectively

**Example**:
- Confirmation: "Recognition heuristic worked on 3 cases!" (ignoring 5 failures)
- Mitigation: Track all 50 cases → 25/50 = 50% accuracy (coin flip, abandon heuristic)

### Sunk Cost Fallacy

**Definition**: Continue based on past investment, not future value. "Already spent X, can't stop now."

**How it misleads heuristics**:
- Heuristic worked in the past → Keep using despite declining accuracy
- Spent time designing a checklist → Force it to work despite low adoption

**Mitigation**:
- Evaluate based on future value only ("Will this heuristic work going forward?")
- Pre-commit to abandonment criteria ("If accuracy <70%, switch methods")
- Ignore past effort when deciding

**Example**:
- Sunk cost: "Spent 10 hours designing this heuristic, must use it"
- Rational: "Heuristic only 60% accurate, abandon and try a different approach"

---

## Advanced Topics

### Swiss Cheese Model (Error Prevention)

**James Reason's model**: Multiple defensive layers, each with holes. An error occurs when the holes align.

**Layers**:
1. Organizational (policies, culture)
2. Supervision (oversight, review)
3. Preconditions (fatigue, time pressure)
4. Actions (individual performance)

**Checklist as defensive layer**: Catches errors that slip through other layers.

**Example** (Deployment failure):
- Organizational: No deployment policy
- Supervision: No code review
- Preconditions: Friday night deployment (fatigue)
- Actions: Developer forgets migration
- **Checklist**: "☐ Database migration tested" catches the error

### Adaptive Heuristics

**Concept**: Heuristic parameters adjust based on outcomes.

**Example** (Satisficing with adaptive threshold):
- Start: Threshold = 80% of criteria
- After 10 searches, no options found → Lower to 70%
- After 5 options found → Raise to 85%

**Implementation**:

```
def satisfice(options, meets_threshold, threshold, K, M):
    """Adaptive satisficing: take the first option meeting the threshold."""
    search_count = 0
    options_found = 0
    chosen = None

    for option in options:
        search_count += 1
        if meets_threshold(option, threshold):
            options_found += 1
            if chosen is None:
                chosen = option  # decide on the first qualifier; keep counting
        if search_count > K and options_found == 0:
            threshold *= 0.9  # no qualifiers after K searches: lower the bar

    if options_found > M:
        threshold *= 1.1  # too many qualified: raise the bar for next time

    return chosen, threshold
```

### Context-Dependent Heuristics

**Concept**: Different rules for different contexts. A meta-heuristic chooses which heuristic to use.

**Example** (Decision approach):
1. **Check context**:
   - Is the decision reversible? Yes → Use fast heuristic (satisficing)
   - Is the decision irreversible? Yes → Use slow analysis (full evaluation)

2. **Choose heuristic based on stakes**:
   - <$1k: Recognition heuristic
   - $1k-$10k: Satisficing
   - >$10k: Full analysis

**Implementation**:

```
def choose_method(stakes):
    # Meta-heuristic: pick the decision method by stakes (in dollars),
    # delegating to the heuristics named above.
    if stakes < 1000:
        return recognition_heuristic()
    elif stakes < 10000:
        return satisficing(threshold=0.8)
    else:
        return full_analysis()
```

## Key Takeaways

1. **Heuristics work in stable environments**: Recognition and take-the-best excel when patterns repeat. They fail in novel, adversarial contexts.

2. **Satisficing beats optimization under uncertainty**: "Good enough" is faster and often as good as perfect when the environment is unpredictable.

3. **Checklists catch 60-80% of errors**: Proven in aviation, surgery, construction. Focus on killer items only (5-9 max).

4. **READ-DO for novices, DO-CONFIRM for experts**: Match the format to the user. Forcing experts into READ-DO creates resistance.

5. **Test heuristics empirically**: Don't assume rules work. Validate on ≥30 historical cases, target ≥80% accuracy.

6. **Forcing functions > checklists for critical steps**: If a step must be done, automate enforcement rather than relying on a manual check.

7. **Update heuristics when the environment changes**: Rules optimized for the past may fail when the market shifts, tech disrupts, competition intensifies. Re-validate quarterly.

skills/heuristics-and-checklists/resources/template.md
@@ -0,0 +1,371 @@
# Heuristics and Checklists Templates

Quick-start templates for designing decision heuristics and error-prevention checklists.

## Decision/Procedure Identification Template

**What needs simplification?**

**Type**: [Decision / Procedure]

**Frequency**: [How often does this occur? Daily / Weekly / Monthly]

**Current approach**: [How is this currently done?]

**Problems with current approach**:
- [ ] Too slow (takes [X] hours/days)
- [ ] Inconsistent outcomes (variance in quality)
- [ ] Errors frequent ([X]% error rate)
- [ ] Cognitive overload (too many factors to consider)
- [ ] Analysis paralysis (can't decide, keep searching)
- [ ] Knowledge not transferable (depends on expert intuition)

**Goal**: [What do you want to achieve? Faster decisions? Fewer errors? Consistent quality?]

---

## Heuristic Design Template

### Step 1: Define the Decision

**Decision question**: [What am I choosing?]
- Example: "Which job candidate to hire?"
- Example: "Which customer segment to prioritize?"

**Options**: [What are the alternatives?]
1. [Option A]
2. [Option B]
3. [Option C]
...

### Step 2: Choose Heuristic Type

**Type selected**: [Recognition / Take-the-Best / Satisficing / Fast & Frugal Tree]

#### Option A: Recognition Heuristic

**Rule**: If you recognize one option but not the other, choose the recognized one.

**Applies when**:
- Recognition correlates with quality (brands, cities, experts)
- Environment stable (not deceptive advertising)

**Example**: "Choose candidate from company I recognize (Google, Amazon) over unknown startup"

#### Option B: Take-the-Best Heuristic

**Most important criterion**: [What single factor best predicts the outcome?]

**Rule**: Choose the option that scores best on this criterion alone. Ignore other factors.

**Example**: "For hiring engineers, use 'coding test score' only. Ignore school, years of experience, personality."

**Cue validity**: [How often does this criterion predict success? Aim for >70%]

#### Option C: Satisficing

**Minimum acceptable threshold**: [What criteria must be met?]

| Criterion | Minimum | Notes |
|-----------|---------|-------|
| [Criterion 1] | [Threshold] | [e.g., "Coding score ≥80%"] |
| [Criterion 2] | [Threshold] | [e.g., "Experience ≥2 years"] |
| [Criterion 3] | [Threshold] | [e.g., "Culture fit ≥7/10"] |

**Rule**: Choose the first option that meets ALL thresholds. Stop searching.

**Search budget**: [How many options to evaluate before lowering the threshold? e.g., "Evaluate 3 candidates, if none meet threshold, adjust."]

#### Option D: Fast & Frugal Tree

**Decision tree** (see the code sketch after the example below):

```
Question 1: [Binary question, e.g., "Is deal >$10k?"]
├─ Yes → [Action A or Question 2]
└─ No → [Action B]

Question 2: [Next binary question]
├─ Yes → [Action C]
└─ No → [Action D]
```

**Example** (Customer routing):
```
Is customer enterprise (>1000 employees)?
├─ Yes → Assign senior rep
└─ No → Is deal value >$10k?
    ├─ Yes → Assign mid-level rep
    └─ No → Self-serve flow
```
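
One way to code a tree like this is as an ordered list of (predicate, action) pairs, so routing rules can be edited without touching control flow; the predicates and field names below are illustrative:

```
# Data-driven fast & frugal tree: first matching predicate wins.
tree = [
    (lambda c: c["employees"] > 1000, "senior rep"),
    (lambda c: c["deal_value"] > 10_000, "mid-level rep"),
]
DEFAULT = "self-serve"

def route(customer: dict) -> str:
    for predicate, action in tree:
        if predicate(customer):
            return action
    return DEFAULT

print(route({"employees": 40, "deal_value": 15_000}))  # mid-level rep
```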

### Step 3: Define Exceptions

**When this heuristic should NOT be used**:
- [Condition 1: e.g., "Novel situation with no precedent"]
- [Condition 2: e.g., "Stakes >$100k (requires full analysis)"]
- [Condition 3: e.g., "Adversarial environment (deception likely)"]

---

## Checklist Design Template

### Step 1: Identify Critical Steps

**Procedure**: [What process needs a checklist?]

**Brainstorm all steps** (comprehensive list first):
1. [Step 1]
2. [Step 2]
3. [Step 3]
...

**Filter to critical steps** (keep only steps that meet the killer-item criteria below; if a step fails any of them, drop it):
- [ ] Often skipped (easy to forget)
- [ ] Serious consequences if missed (failures, errors, safety risks)
- [ ] Not immediately obvious (requires deliberate check)

**Critical steps** (5-10 items max):
1. [Critical step 1]
2. [Critical step 2]
3. [Critical step 3]
4. [Critical step 4]
5. [Critical step 5]

### Step 2: Choose Checklist Format

**Format**: [READ-DO / DO-CONFIRM / Challenge-Response]

**Guidance**:
- **READ-DO**: Novices, unfamiliar procedures, irreversible actions
- **DO-CONFIRM**: Experts, routine procedures, familiar tasks
- **Challenge-Response**: Two-person verification for high-stakes

### Step 3: Write Checklist

**Checklist title**: [e.g., "Software Deployment Pre-Flight Checklist"]

**Pause points**: [When to use this checklist?]
- Before: [e.g., "Before initiating deployment"]
- During: [e.g., "After migration, before going live"]
- After: [e.g., "Post-deployment verification"]

**Format template** (READ-DO):

```
[CHECKLIST TITLE]

☐ [Verb-first action 1]
  └─ [Specific detail or criteria if needed]

☐ [Verb-first action 2]

⚠ [KILLER ITEM - must pass to proceed]

☐ [Verb-first action 3]

☐ [Verb-first action 4]

☐ [Verb-first action 5]

All items checked → Proceed with [next phase]
```

**Format template** (DO-CONFIRM):

```
[CHECKLIST TITLE]

Perform procedure from memory, then confirm each item:

☐ [Action 1 completed?]
☐ [Action 2 completed?]
☐ [Action 3 completed?]
☐ [Action 4 completed?]
☐ [Action 5 completed?]

All confirmed → Procedure complete
```

---

## Example Heuristics

### Example 1: Hiring Decision (Satisficing)

**Decision**: Choose job candidate from pool

**Satisficing threshold**:
| Criterion | Minimum |
|-----------|---------|
| Technical skills (coding test) | ≥75% |
| Communication (interview rating) | ≥7/10 |
| Culture fit (team feedback) | ≥7/10 |
| Reference check | Positive |

**Rule**: Evaluate candidates in order received. First candidate meeting ALL thresholds → Hire immediately. Don't keep searching for the "perfect" candidate.

**Search budget**: If the first 5 candidates don't meet the threshold, lower the bar to 70% technical (but keep other criteria).
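
A sketch of this example as code, assuming illustrative field names; the 5-candidate search budget and the 75%→70% adjustment come from the rule above:

```
# Satisficing with a search budget: hire the first candidate meeting
# ALL thresholds; after 5 misses, lower only the technical bar.
def hire(candidates):
    tech_min = 75
    for i, c in enumerate(candidates):
        if i == 5:
            tech_min = 70  # search budget spent: lower the technical bar
        if (c["technical"] >= tech_min and c["communication"] >= 7
                and c["culture_fit"] >= 7 and c["reference_ok"]):
            return c  # good enough: stop searching
    return None

pool = [{"technical": 72, "communication": 8, "culture_fit": 9,
         "reference_ok": True}] * 6
print(hire(pool) is not None)  # True: 6th candidate passes the lowered bar
```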

### Example 2: Investment Decision (Take-the-Best)

**Decision**: Which startup to invest in?

**Most predictive criterion**: Founder track record (prior exits or significant roles)

**Rule**: Rank startups by founder quality alone. Invest in the top 2. Ignore market size, product, traction (assumes cue validity of founder >70%).

**Validation**: Test on past investments. If founder track record predicts success <70% of the time, switch to a different criterion.

### Example 3: Customer Triage (Fast & Frugal Tree)

**Decision**: How to route an incoming customer inquiry?

**Tree**:
```
Is customer enterprise (>1000 employees)?
├─ Yes → Assign to senior account exec (immediate response)
└─ No → Is issue billing/payment?
    ├─ Yes → Assign to finance team
    └─ No → Is trial user (not paid)?
        ├─ Yes → Self-serve knowledge base
        └─ No → Assign to support queue (24h SLA)
```

---

## Example Checklists

### Example 1: Software Deployment (READ-DO)

```
Software Deployment Pre-Flight Checklist

☐ All tests passing
  └─ Unit, integration, E2E tests green

⚠ Database migrations tested on staging
  └─ KILLER ITEM - Deployment blocked if migrations fail

☐ Rollback plan documented
  └─ Link to runbook: [URL]

☐ Monitoring dashboards configured
  └─ Alerts set for error rate, latency, traffic

☐ On-call engineer identified and notified
  └─ Name: [___], Phone: [___]

☐ Stakeholders notified of deployment window
  └─ Email sent with timing and impact

☐ Feature flags configured
  └─ Gradual rollout enabled (10% → 50% → 100%)

☐ Backups completed
  └─ Database backup timestamp: [___]

All items checked → Proceed with deployment
```

### Example 2: Aviation Pre-Flight (DO-CONFIRM)

```
Pre-Flight Checklist (DO-CONFIRM)

Perform checks, then confirm:

☐ Fuel quantity verified (visual + gauge)
☐ Flaps freedom of movement checked
☐ Landing gear visual inspection complete
☐ Oil level within limits
☐ Control surfaces free and correct
☐ Instruments verified (altimeter, compass, airspeed)
☐ Seatbelts and harness secured
☐ Flight plan filed

All confirmed → Cleared for takeoff
```

### Example 3: Surgical Safety (WHO Checklist - Challenge/Response)

```
WHO Surgical Safety Checklist - Time Out (Before Incision)

[Entire team pauses, nurse reads aloud, surgeon confirms each]

☐ Confirm patient identity
  Response: "Name: [___], DOB: [___]"

☐ Confirm surgical site and procedure
  Response: "Site marked, procedure: [___]"

☐ Anticipated critical events reviewed
  Response: "Surgeon: [Key steps], Anesthesia: [Concerns], Nursing: [Equipment ready]"

☐ Antibiotic prophylaxis given within 60 min
  Response: "Administered at [time]"

☐ Essential imaging displayed
  Response: "[X-ray/MRI] displayed and reviewed"

All confirmed → Surgeon may proceed with incision
```

---

## Application & Monitoring Template

### Applying Heuristic

**Heuristic being applied**: [Name of rule]

**Case log**:

| Date | Decision | Heuristic Used | Outcome | Notes |
|------|----------|----------------|---------|-------|
| [Date] | [What decided] | [Which rule] | [Good / Bad] | [What happened] |
| [Date] | [What decided] | [Which rule] | [Good / Bad] | [What happened] |

**Success rate**: [X good outcomes / Y total decisions = Z%]

**Goal**: ≥80% good outcomes. If <80%, refine the heuristic.

### Applying Checklist

**Checklist being applied**: [Name of checklist]

**Usage log**:

| Date | Procedure | Items Caught | Errors Prevented | Time Added |
|------|-----------|--------------|------------------|------------|
| [Date] | [What done] | [# items flagged] | [What would've failed] | [+X min] |
| [Date] | [What done] | [# items flagged] | [What would've failed] | [+X min] |

**Error rate**:
- Before checklist: [X% failure rate]
- After checklist: [Y% failure rate]
- **Improvement**: [(X-Y)/X × 100]% reduction

---

## Refinement Log Template

**Iteration**: [#1, #2, #3...]

**Date**: [When refined]

**Problem identified**: [What went wrong? When did the heuristic/checklist fail?]

**Root cause**: [Why did it fail?]
- [ ] Heuristic: Wrong criterion chosen, threshold too high/low, environment changed
- [ ] Checklist: Missing critical step, too long (skipped), wrong format for user

**Refinement made**:
- **Before**: [Old rule/checklist]
- **After**: [New rule/checklist]
- **Rationale**: [Why this change should help]

**Test plan**: [How to validate the refinement? Test on X cases, monitor for Y weeks]

**Outcome**: [Did the refinement improve performance? Yes/No, by how much?]