gh-k-dense-ai-claude-scient…/skills/hypothesis-generation/references/hypothesis_quality_criteria.md
2025-11-30 08:30:10 +08:00


# Hypothesis Quality Criteria
## Framework for Evaluating Scientific Hypotheses
Use these criteria to assess the quality and rigor of generated hypotheses. A robust hypothesis should score well across multiple dimensions.
## Core Criteria
### 1. Testability
**Definition:** The hypothesis can be empirically tested through observation or experimentation.
**Evaluation questions:**
- Can specific experiments or observations test this hypothesis?
- Are the predicted outcomes measurable?
- Can the hypothesis be tested with current or near-future methods?
- Are there multiple independent ways to test it?
**Strong testability examples:**
- "Increased expression of protein X will reduce cell proliferation rate by >30%"
- "Patients receiving treatment Y will show a 50% reduction in symptom Z within 4 weeks"
**Weak testability examples:**
- "This process is influenced by complex interactions" (vague, no specific prediction)
- "The mechanism involves quantum effects" (if no method to test quantum effects exists)
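As a minimal sketch of what "measurable predicted outcome" means in practice, the first strong example above (a reduction of more than 30% in proliferation rate) can be turned into an explicit check. The function names, the 0.30 default threshold, and the sample values are illustrative assumptions, not part of any prescribed workflow; the numbers are made up.

```python
from statistics import mean


def relative_reduction(control, treated):
    """Fractional drop of the treated mean relative to the control mean."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean is zero; relative reduction is undefined")
    return (control_mean - mean(treated)) / control_mean


def prediction_met(control, treated, threshold=0.30):
    """True if the observed reduction exceeds the predicted threshold."""
    return relative_reduction(control, treated) > threshold


# Made-up proliferation rates (arbitrary units), for illustration only.
control_rates = [100.0, 96.0, 104.0]
treated_rates = [62.0, 58.0, 60.0]

print(round(relative_reduction(control_rates, treated_rates), 2))
print(prediction_met(control_rates, treated_rates))
```

A comparison of means like this only shows that the prediction is stated precisely enough to check; a real test would also need replicates and an appropriate statistical analysis.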
### 2. Falsifiability
**Definition:** Clear conditions or observations would disprove the hypothesis (Popperian criterion).
**Evaluation questions:**
- What specific observations would prove this hypothesis wrong?
- Are the falsifying conditions realistic to observe?
- Is the hypothesis stated clearly enough to be disproven?
- Can null results meaningfully falsify the hypothesis?
**Strong falsifiability examples:**
- "If we knock out gene X, phenotype Y will disappear" (can be falsified if phenotype persists)
- "Drug A will outperform placebo in at least 80% of patients" (clear falsification threshold)
**Weak falsifiability examples:**
- "Multiple factors contribute to the outcome" (too vague to falsify)
- "The effect may vary depending on context" (built-in escape clauses)
### 3. Parsimony (Occam's Razor)
**Definition:** Among competing hypotheses with equal explanatory power, prefer the simpler explanation.
**Evaluation questions:**
- Does the hypothesis invoke the minimum number of entities/mechanisms needed?
- Are all proposed elements necessary to explain the phenomenon?
- Could a simpler mechanism account for the observations?
- Does it avoid unnecessary assumptions?
**Parsimony considerations:**
- Simple ≠ simplistic; complexity is justified when evidence demands it
- Established mechanisms are "simpler" than novel, unproven ones
- Direct mechanisms are simpler than elaborate multi-step pathways
- One well-supported mechanism beats multiple speculative ones
### 4. Explanatory Power
**Definition:** The hypothesis accounts for a substantial portion of the observed phenomenon.
**Evaluation questions:**
- How much of the observed data does this hypothesis explain?
- Does it account for both typical and atypical observations?
- Can it explain related phenomena beyond the immediate observation?
- Does it resolve apparent contradictions in existing data?
**Strong explanatory power indicators:**
- Explains multiple independent observations
- Accounts for quantitative relationships, not just qualitative patterns
- Resolves previously puzzling findings
- Makes sense of seemingly contradictory results
**Limited explanatory power indicators:**
- Only explains part of the phenomenon
- Requires additional hypotheses for complete explanation
- Leaves major observations unexplained
### 5. Scope
**Definition:** The range of phenomena and contexts the hypothesis can address.
**Evaluation questions:**
- Does it apply only to the specific case or to broader situations?
- Can it generalize across conditions, species, or systems?
- Does it connect to larger theoretical frameworks?
- What are its boundaries and limitations?
**Broader scope (generally preferable):**
- Applies across multiple experimental conditions
- Generalizes to related systems or species
- Connects phenomenon to established principles
**Narrower scope (acceptable if explicitly defined):**
- Limited to specific conditions or contexts
- Requires different mechanisms in different settings
- Context-dependent with clear boundaries
### 6. Consistency with Established Knowledge
**Definition:** Alignment with well-supported theories, principles, and empirical findings.
**Evaluation questions:**
- Is it consistent with established physical, chemical, or biological principles?
- Does it align with or reasonably extend current theories?
- If contradicting established knowledge, is there strong justification?
- Does it require violating well-supported laws or findings?
**Levels of consistency:**
- **Fully consistent:** Applies established mechanisms in new context
- **Mostly consistent:** Extends current understanding in plausible ways
- **Partially inconsistent:** Contradicts some findings but has explanatory value
- **Highly inconsistent:** Requires rejecting well-established principles (requires exceptional evidence)
### 7. Novelty and Insight
**Definition:** The hypothesis offers new understanding beyond merely restating known facts.
**Evaluation questions:**
- Does it provide new mechanistic insight?
- Does it challenge assumptions or conventional wisdom?
- Does it suggest unexpected connections or relationships?
- Does it open new research directions?
**Novel contributions:**
- Proposes previously unconsidered mechanisms
- Reframes the problem in a productive way
- Connects disparate observations
- Suggests non-obvious testable predictions
**Note:** Novelty alone doesn't make a hypothesis valuable; it must also be testable, parsimonious, and explanatory.
## Comparative Evaluation
When evaluating multiple competing hypotheses, weigh their trade-offs and check whether experiments can tell them apart.
### Trade-offs and Balancing
Hypotheses often involve trade-offs:
- Greater parsimony but less explanatory power
- Broader scope but lower testability with current methods
- Novel insight but less consistency with current knowledge
**Evaluation approach:**
- No hypothesis needs to be perfect on all dimensions
- Identify each hypothesis's strengths and weaknesses
- Consider which criteria are most important for the specific phenomenon
- Note which hypotheses are most immediately testable
- Identify which would be most informative if supported
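One way to make these trade-offs explicit is a simple weighted rubric over the seven core criteria. The sketch below is an assumption for illustration: this framework does not prescribe numeric scores, and the 1-5 scale, the weights, the cut-offs, and the two example hypotheses are all hypothetical.

```python
CRITERIA = [
    "testability", "falsifiability", "parsimony", "explanatory_power",
    "scope", "consistency", "novelty",
]


def weighted_score(scores, weights=None):
    """Weighted mean of per-criterion scores (assumed 1-5 scale)."""
    weights = weights or {c: 1.0 for c in CRITERIA}
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(scores[c] * weights[c] for c in CRITERIA) / total_weight


def strengths_and_weaknesses(scores, high=4, low=2):
    """Split criteria into strengths (>= high) and weaknesses (<= low)."""
    strengths = [c for c in CRITERIA if scores[c] >= high]
    weaknesses = [c for c in CRITERIA if scores[c] <= low]
    return strengths, weaknesses


# Hypothetical scores for two competing hypotheses.
h_a = {"testability": 5, "falsifiability": 4, "parsimony": 4,
       "explanatory_power": 2, "scope": 3, "consistency": 4, "novelty": 2}
h_b = {"testability": 2, "falsifiability": 3, "parsimony": 2,
       "explanatory_power": 5, "scope": 4, "consistency": 3, "novelty": 5}

# Weights that emphasise what can be tested now (an assumption).
weights = {c: 1.0 for c in CRITERIA}
weights["testability"] = 2.0
weights["falsifiability"] = 2.0

for name, scores in (("A", h_a), ("B", h_b)):
    print(name, round(weighted_score(scores), 2),
          round(weighted_score(scores, weights), 2),
          strengths_and_weaknesses(scores))
```

The two example hypotheses tie on the unweighted mean and separate only once the weights reflect which criteria matter most for the phenomenon. A single number should support the strengths-and-weaknesses comparison, not replace it.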
### Distinguishability
**Key question:** Can experiments distinguish between competing hypotheses?
- Identify predictions that differ between hypotheses
- Prioritize hypotheses that make distinct predictions
- Note which experiments would most efficiently narrow the field
- Consider whether hypotheses could all be partially correct
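The points above can be sketched as a small ranking: for each candidate experiment, count how many pairs of hypotheses predict different outcomes, then prefer the experiments that separate the most pairs. The experiment names, hypothesis labels, and predicted outcomes are hypothetical placeholders, and counting pairs is only one possible measure of distinguishing power.

```python
from itertools import combinations


def distinguishing_power(predictions):
    """Rank experiments by the number of hypothesis pairs they separate."""
    ranking = []
    for experiment, by_hypothesis in predictions.items():
        pairs = combinations(sorted(by_hypothesis), 2)
        separated = sum(
            1 for a, b in pairs if by_hypothesis[a] != by_hypothesis[b]
        )
        ranking.append((experiment, separated))
    # Most discriminating experiments first; ties keep input order.
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Hypothetical predicted outcomes per experiment and hypothesis.
predictions = {
    "knock out gene X": {
        "H1": "phenotype lost", "H2": "phenotype persists", "H3": "phenotype persists",
    },
    "overexpress protein X": {
        "H1": "proliferation down", "H2": "proliferation down", "H3": "proliferation down",
    },
    "inhibit pathway P": {
        "H1": "no change", "H2": "phenotype lost", "H3": "phenotype reduced",
    },
}

for experiment, separated in distinguishing_power(predictions):
    print(f"{experiment}: separates {separated} hypothesis pair(s)")
```

An experiment on which all hypotheses agree (zero separated pairs) can still be worth running as a sanity check, but it does nothing to narrow the field.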
## Common Pitfalls
### Untestable Hypotheses
- Too vague to generate specific predictions
- Invoke unobservable or unmeasurable entities
- Require technology that doesn't exist
### Unfalsifiable Hypotheses
- Built-in escape clauses ("may or may not occur")
- Post-hoc explanations that fit any outcome
- No specification of what would disprove them
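A rough first-pass screen for escape clauses can be sketched as a phrase check. The phrase list below is an assumption, taken from the weak examples in this document, and it is deliberately crude: it flags wording for human review and cannot decide whether a hypothesis is actually falsifiable.

```python
import re

# Hedging phrases taken from the weak examples in this document (assumed list).
ESCAPE_PATTERNS = [
    r"\bmay or may not\b",
    r"\bmay vary\b",
    r"\bdepending on context\b",
    r"\bmultiple factors\b",
    r"\bcomplex interactions\b",
]


def flag_escape_clauses(statement):
    """Return the hedging phrases found in a hypothesis statement."""
    lowered = statement.lower()
    found = []
    for pattern in ESCAPE_PATTERNS:
        match = re.search(pattern, lowered)
        if match:
            found.append(match.group(0))
    return found


print(flag_escape_clauses("The effect may vary depending on context"))
print(flag_escape_clauses("If we knock out gene X, phenotype Y will disappear"))
```

A statement with no flagged phrases can still be unfalsifiable, so the evaluation questions under Falsifiability remain the actual test.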
### Overly Complex Hypotheses
- Invoke multiple unproven mechanisms
- Add unnecessary steps or entities
- Complexity not justified by explanatory gains
### Just-So Stories
- Plausible narratives without testable predictions
- Explain observations but don't predict new ones
- Impossible to distinguish from alternative stories
## Practical Application
When generating hypotheses:
1. **Draft initial hypotheses** focusing on mechanistic explanations
2. **Apply quality criteria** to identify weaknesses
3. **Refine hypotheses** to improve testability and clarity
4. **Develop specific predictions** to enhance testability and falsifiability
5. **Compare systematically** across all criteria
6. **Prioritize for testing** based on distinguishability and feasibility
Remember: The goal is not a perfect hypothesis, but a set of testable, falsifiable, informative hypotheses that advance understanding of the phenomenon.