# Hypothesis Quality Criteria

## Framework for Evaluating Scientific Hypotheses

Use these criteria to assess the quality and rigor of generated hypotheses. A robust hypothesis should score well across multiple dimensions.

## Core Criteria

### 1. Testability

**Definition:** The hypothesis can be empirically tested through observation or experimentation.

**Evaluation questions:**
- Can specific experiments or observations test this hypothesis?
- Are the predicted outcomes measurable?
- Can the hypothesis be tested with current or near-future methods?
- Are there multiple independent ways to test it?

**Strong testability examples:**
- "Increased expression of protein X will reduce cell proliferation rate by >30%"
- "Patients receiving treatment Y will show a 50% reduction in symptom Z within 4 weeks"

**Weak testability examples:**
- "This process is influenced by complex interactions" (vague, no specific prediction)
- "The mechanism involves quantum effects" (untestable if no method exists to detect those effects)
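
A strong testability example can be written as a check against data. The following is a minimal sketch for the "protein X reduces proliferation by >30%" prediction. The function names, the 0.30 default, and the sample values are all made up for illustration, and the check ignores statistical uncertainty, which a real analysis would need to address.

```python
from statistics import mean


def relative_reduction(control, treated):
    """Fractional drop in the mean of `treated` relative to the mean of `control`."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean is zero; relative reduction is undefined")
    return (control_mean - mean(treated)) / control_mean


def prediction_holds(control, treated, min_reduction=0.30):
    """True if the observed reduction exceeds the predicted threshold."""
    return relative_reduction(control, treated) > min_reduction


# Invented proliferation rates (arbitrary units), for illustration only.
control_rates = [1.00, 0.95, 1.05, 1.02]
treated_rates = [0.60, 0.65, 0.58, 0.62]

print(prediction_holds(control_rates, treated_rates))
```

The vague counterexample ("influenced by complex interactions") cannot be expressed this way, because it names no measured quantity and no threshold.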

### 2. Falsifiability

**Definition:** Clear conditions or observations would disprove the hypothesis (Popperian criterion).

**Evaluation questions:**
- What specific observations would prove this hypothesis wrong?
- Are the falsifying conditions realistic to observe?
- Is the hypothesis stated clearly enough to be disproven?
- Can null results meaningfully falsify the hypothesis?

**Strong falsifiability examples:**
- "If we knock out gene X, phenotype Y will disappear" (can be falsified if phenotype persists)
- "Drug A will outperform placebo in 80% of patients" (clear falsification threshold)

**Weak falsifiability examples:**
- "Multiple factors contribute to the outcome" (too vague to falsify)
- "The effect may vary depending on context" (built-in escape clauses)
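
One way to make a falsification threshold explicit is to record it before looking at data. The sketch below assumes the "80% of patients" example is read as "at least 80%". The class name, the `min_sample_size` field, and the three verdict labels are hypothetical choices, and the comparison ignores sampling error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FalsifiablePrediction:
    claim: str
    min_fraction: float   # predicted minimum fraction of patients who respond
    min_sample_size: int  # below this, a result is treated as inconclusive

    def verdict(self, responders: int, total: int) -> str:
        # A small or empty sample cannot meaningfully falsify the claim.
        if total <= 0 or total < self.min_sample_size:
            return "inconclusive"
        if responders / total < self.min_fraction:
            return "falsified"
        # Surviving a test is not the same as being confirmed.
        return "not falsified"


drug_a = FalsifiablePrediction(
    claim="Drug A will outperform placebo in at least 80% of patients",
    min_fraction=0.80,
    min_sample_size=50,  # illustrative value, not a recommendation
)

print(drug_a.verdict(responders=30, total=100))
```

A statement such as "the effect may vary depending on context" gives no value for `min_fraction`, which is why no outcome can count against it.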

### 3. Parsimony (Occam's Razor)

**Definition:** Among competing hypotheses with equal explanatory power, prefer the simpler explanation.

**Evaluation questions:**
- Does the hypothesis invoke the minimum number of entities/mechanisms needed?
- Are all proposed elements necessary to explain the phenomenon?
- Could a simpler mechanism account for the observations?
- Does it avoid unnecessary assumptions?

**Parsimony considerations:**
- Simple ≠ simplistic; complexity is justified when evidence demands it
- Established mechanisms are "simpler" than novel, unproven ones
- Direct mechanisms are simpler than elaborate multi-step pathways
- One well-supported mechanism beats multiple speculative ones

### 4. Explanatory Power

**Definition:** The hypothesis accounts for a substantial portion of the observed phenomenon.

**Evaluation questions:**
- How much of the observed data does this hypothesis explain?
- Does it account for both typical and atypical observations?
- Can it explain related phenomena beyond the immediate observation?
- Does it resolve apparent contradictions in existing data?

**Strong explanatory power indicators:**
- Explains multiple independent observations
- Accounts for quantitative relationships, not just qualitative patterns
- Resolves previously puzzling findings
- Makes sense of seemingly contradictory results

**Limited explanatory power indicators:**
- Only explains part of the phenomenon
- Requires additional hypotheses for complete explanation
- Leaves major observations unexplained

### 5. Scope

**Definition:** The range of phenomena and contexts the hypothesis can address.

**Evaluation questions:**
- Does it apply only to the specific case or to broader situations?
- Can it generalize across conditions, species, or systems?
- Does it connect to larger theoretical frameworks?
- What are its boundaries and limitations?

**Broader scope (generally preferable):**
- Applies across multiple experimental conditions
- Generalizes to related systems or species
- Connects phenomenon to established principles

**Narrower scope (acceptable if explicitly defined):**
- Limited to specific conditions or contexts
- Requires different mechanisms in different settings
- Context-dependent with clear boundaries

### 6. Consistency with Established Knowledge

**Definition:** Alignment with well-supported theories, principles, and empirical findings.

**Evaluation questions:**
- Is it consistent with established physical, chemical, or biological principles?
- Does it align with or reasonably extend current theories?
- If contradicting established knowledge, is there strong justification?
- Does it require violating well-supported laws or findings?

**Levels of consistency:**
- **Fully consistent:** Applies established mechanisms in a new context
- **Mostly consistent:** Extends current understanding in plausible ways
- **Partially inconsistent:** Contradicts some findings but has explanatory value
- **Highly inconsistent:** Requires rejecting well-established principles (demands exceptional evidence)

### 7. Novelty and Insight

**Definition:** The hypothesis offers new understanding beyond merely restating known facts.

**Evaluation questions:**
- Does it provide new mechanistic insight?
- Does it challenge assumptions or conventional wisdom?
- Does it suggest unexpected connections or relationships?
- Does it open new research directions?

**Novel contributions:**
- Proposes previously unconsidered mechanisms
- Reframes the problem in a productive way
- Connects disparate observations
- Suggests non-obvious testable predictions

**Note:** Novelty alone doesn't make a hypothesis valuable; it must also be testable, parsimonious, and explanatory.

## Comparative Evaluation

When evaluating multiple competing hypotheses:

### Trade-offs and Balancing

Hypotheses often involve trade-offs:
- Greater parsimony but less explanatory power
- Broader scope but lower testability with current methods
- More novel insight but less consistency with current knowledge

**Evaluation approach:**
- No hypothesis needs to be perfect on all dimensions
- Identify each hypothesis's strengths and weaknesses
- Consider which criteria are most important for the specific phenomenon
- Note which hypotheses are most immediately testable
- Identify which would be most informative if supported
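
The balancing described above could be sketched as a weighted rubric. This is an illustrative assumption, not a prescribed scoring method: the 1-5 scale, the weights, the cut-offs for "strong" and "weak", and both example hypotheses are invented.

```python
CRITERIA = (
    "testability", "falsifiability", "parsimony", "explanatory_power",
    "scope", "consistency", "novelty",
)


def weighted_score(scores, weights=None):
    """Weighted mean of per-criterion scores (assumed 1-5 scale)."""
    weights = weights or {c: 1.0 for c in CRITERIA}
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    total_weight = sum(weights.get(c, 0.0) for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[c] * weights.get(c, 0.0) for c in CRITERIA) / total_weight


def strengths_and_weaknesses(scores, strong=4, weak=2):
    """List criteria scoring at or above `strong` and at or below `weak`."""
    strengths = [c for c in CRITERIA if scores[c] >= strong]
    weaknesses = [c for c in CRITERIA if scores[c] <= weak]
    return strengths, weaknesses


# Invented scores for two hypothetical hypotheses.
h_simple = {"testability": 5, "falsifiability": 5, "parsimony": 5,
            "explanatory_power": 2, "scope": 2, "consistency": 4, "novelty": 2}
h_broad = {"testability": 2, "falsifiability": 3, "parsimony": 2,
           "explanatory_power": 5, "scope": 5, "consistency": 3, "novelty": 4}

# Hypothetical weighting that favors near-term testability.
testability_first = {**{c: 1.0 for c in CRITERIA},
                     "testability": 3.0, "falsifiability": 3.0}

for name, h in (("simple", h_simple), ("broad", h_broad)):
    print(name, round(weighted_score(h, testability_first), 2),
          strengths_and_weaknesses(h))
```

A single number hides the trade-offs, so the per-criterion strengths and weaknesses are reported alongside it. Changing the weights changes the ranking, which is the point: the weights encode which criteria matter most for the phenomenon at hand.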

### Distinguishability

**Key question:** Can experiments distinguish between competing hypotheses?

- Identify predictions that differ between hypotheses
- Prioritize hypotheses that make distinct predictions
- Note which experiments would most efficiently narrow the field
- Consider whether hypotheses could all be partially correct
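
As a rough sketch of the first three points, predictions can be tabulated per hypothesis and experiment, then filtered for experiments where the hypotheses disagree. The function name, the hypothesis labels, the experiments, and the outcomes below are all hypothetical. Ranking by the number of distinct predicted outcomes is one simple heuristic among many.

```python
def discriminating_experiments(predictions):
    """Rank experiments by how many distinct outcomes the hypotheses predict.

    `predictions` maps hypothesis name -> {experiment name: predicted outcome}.
    Only experiments for which every hypothesis makes a prediction are compared.
    """
    if not predictions:
        return []
    shared = set.intersection(*(set(p) for p in predictions.values()))
    ranked = []
    for experiment in sorted(shared):
        outcomes = {h: p[experiment] for h, p in predictions.items()}
        n_distinct = len(set(outcomes.values()))
        if n_distinct > 1:  # identical predictions cannot separate hypotheses
            ranked.append((experiment, n_distinct, outcomes))
    ranked.sort(key=lambda item: -item[1])  # most discriminating first
    return ranked


# Invented predictions for three hypothetical hypotheses.
predictions = {
    "H1": {"knockout": "phenotype lost", "overexpression": "enhanced", "inhibitor": "no change"},
    "H2": {"knockout": "phenotype lost", "overexpression": "enhanced", "inhibitor": "reduced"},
    "H3": {"knockout": "phenotype persists", "overexpression": "enhanced", "inhibitor": "enhanced"},
}

for experiment, n_distinct, outcomes in discriminating_experiments(predictions):
    print(experiment, n_distinct, outcomes)
```

In this invented table, the overexpression experiment is dropped because all three hypotheses predict the same result, so running it would not narrow the field.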

## Common Pitfalls

### Untestable Hypotheses
- Too vague to generate specific predictions
- Invoke unobservable or unmeasurable entities
- Require technology that doesn't exist

### Unfalsifiable Hypotheses
- Built-in escape clauses ("may or may not occur")
- Post-hoc explanations that fit any outcome
- No specification of what would disprove them

### Overly Complex Hypotheses
- Invoke multiple unproven mechanisms
- Add unnecessary steps or entities
- Complexity not justified by explanatory gains

### Just-So Stories
- Plausible narratives without testable predictions
- Explain observations but don't predict new ones
- Impossible to distinguish from alternative stories

## Practical Application

When generating hypotheses:

1. **Draft initial hypotheses** focusing on mechanistic explanations
2. **Apply quality criteria** to identify weaknesses
3. **Refine hypotheses** to improve testability and clarity
4. **Develop specific predictions** to enhance testability and falsifiability
5. **Compare systematically** across all criteria
6. **Prioritize for testing** based on distinguishability and feasibility

Remember: The goal is not a perfect hypothesis, but a set of testable, falsifiable, informative hypotheses that advance understanding of the phenomenon.