# Experimental Design Checklist
## Research Question Formulation
### Is the Question Well-Formed?
- [ ] **Specific:** Clearly defined variables and relationships
- [ ] **Answerable:** Can be addressed with available methods
- [ ] **Relevant:** Addresses a gap in knowledge or practical need
- [ ] **Feasible:** Resources, time, and ethical considerations allow it
- [ ] **Falsifiable:** Can be proven wrong if incorrect
### Have You Reviewed the Literature?
- [ ] Identified what's already known
- [ ] Found gaps or contradictions to address
- [ ] Learned from methodological successes and failures
- [ ] Identified appropriate outcome measures
- [ ] Determined typical effect sizes in the field
## Hypothesis Development
### Is Your Hypothesis Testable?
- [ ] Makes specific, quantifiable predictions
- [ ] Variables are operationally defined
- [ ] Specifies direction/nature of expected relationships
- [ ] Can be falsified by potential observations
### Types of Hypotheses
- [ ] **Null hypothesis (H₀):** No effect/relationship exists
- [ ] **Alternative hypothesis (H₁):** Effect/relationship exists
- [ ] **Directional vs. non-directional:** One-tailed vs. two-tailed tests
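The one-tailed vs. two-tailed distinction can be made concrete with a z statistic. A minimal sketch (the observed value 1.80 is illustrative): the same statistic can be "significant" one-tailed yet not two-tailed, which is why the choice must be prespecified.

```python
from statistics import NormalDist

z = 1.80  # illustrative observed test statistic

# One-tailed: H1 predicts an effect in one specific direction
p_one = 1 - NormalDist().cdf(z)
# Two-tailed: H1 predicts an effect in either direction
p_two = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(p_one, 4), round(p_two, 4))  # ≈ 0.0359 vs. ≈ 0.0719
```

At α = .05 the one-tailed p crosses the threshold while the two-tailed p does not; deciding tails after seeing the data is a form of p-hacking.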
## Study Design Selection
### What Type of Study is Appropriate?
**Experimental (Intervention) Studies:**
- [ ] **Randomized Controlled Trial (RCT):** Gold standard for causation
- [ ] **Quasi-experimental:** Non-random assignment but manipulation
- [ ] **Within-subjects:** Same participants in all conditions
- [ ] **Between-subjects:** Different participants per condition
- [ ] **Factorial:** Multiple independent variables
- [ ] **Crossover:** Participants receive multiple interventions sequentially
**Observational Studies:**
- [ ] **Cohort:** Follow groups over time
- [ ] **Case-control:** Compare those with/without outcome
- [ ] **Cross-sectional:** Snapshot at one time point
- [ ] **Ecological:** Population-level data
**Consider:**
- [ ] Can you randomly assign participants?
- [ ] Can you manipulate the independent variable?
- [ ] Is the outcome rare (favor case-control) or common?
- [ ] Do you need to establish temporal sequence?
- [ ] What's feasible given ethical, practical constraints?
## Variables
### Independent Variables (Manipulated/Predictor)
- [ ] Clearly defined and operationalized
- [ ] Appropriate levels/categories chosen
- [ ] Manipulation is sufficient to test hypothesis
- [ ] Manipulation check planned (if applicable)
### Dependent Variables (Outcome/Response)
- [ ] Directly measures the construct of interest
- [ ] Validated and reliable measurement
- [ ] Sensitive enough to detect expected effects
- [ ] Appropriate for statistical analysis planned
- [ ] Primary outcome clearly designated
### Control Variables
- [ ] **Confounding variables identified:**
- Variables that affect both IV and DV
- Alternative explanations for findings
- [ ] **Strategy for control:**
- Randomization
- Matching
- Stratification
- Statistical adjustment
- Restriction (inclusion/exclusion criteria)
- Blinding
### Extraneous Variables
- [ ] Potential sources of noise identified
- [ ] Standardized procedures to minimize
- [ ] Environmental factors controlled
- [ ] Time of day, setting, equipment standardized
## Sampling
### Population Definition
- [ ] **Target population:** Who you want to generalize to
- [ ] **Accessible population:** Who you can actually sample from
- [ ] **Sample:** Who actually participates
- [ ] Differences among these documented
### Sampling Method
- [ ] **Probability sampling (preferred for generalizability):**
- Simple random sampling
- Stratified sampling
- Cluster sampling
- Systematic sampling
- [ ] **Non-probability sampling (common but limits generalizability):**
- Convenience sampling
- Purposive sampling
- Snowball sampling
- Quota sampling
### Sample Size
- [ ] **A priori power analysis conducted**
- Expected effect size (from literature or pilot)
- Desired power (typically .80 or .90)
- Significance level (typically .05)
- Statistical test to be used
- [ ] Accounts for expected attrition/dropout
- [ ] Sufficient for planned subgroup analyses
- [ ] Practical constraints acknowledged
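As a rough sketch of the power-analysis step, the per-group sample size for comparing two means can be approximated from the standardized effect size with a normal approximation (a t-based tool such as G*Power gives a slightly larger answer; the α and power defaults below follow the checklist):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80, two_tailed=True):
    """Normal-approximation sample size per group for a two-sample
    comparison of means with standardized effect size d (Cohen's d)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# A "medium" effect (d = 0.5) at alpha = .05, power = .80:
print(n_per_group(0.5))  # 63 per group (an exact t-based answer is ~64)
```

Note how sensitive n is to the assumed effect size: halving d roughly quadruples the required sample, which is why the effect-size estimate from the literature or a pilot matters so much.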
### Inclusion/Exclusion Criteria
- [ ] Clearly defined and justified
- [ ] Not overly restrictive (over-restriction limits generalizability)
- [ ] Based on theoretical or practical considerations
- [ ] Ethical considerations addressed
- [ ] Documented and applied consistently
## Blinding and Randomization
### Randomization
- [ ] **What is randomized:**
- Participant assignment to conditions
- Order of conditions (within-subjects)
- Stimuli/items presented
- [ ] **Method of randomization:**
- Computer-generated random numbers
- Random number tables
- Coin flips (for very small studies)
- [ ] **Allocation concealment:**
- Sequence generated before recruitment
- Allocation hidden until after enrollment
- Sequentially numbered, sealed envelopes (if needed)
- [ ] **Stratified randomization:**
- Balance important variables across groups
- Block randomization to ensure equal group sizes
- [ ] **Check randomization:**
- Compare groups at baseline
- Report any significant differences
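The block-randomization idea above can be sketched in a few lines (a minimal illustration with a hypothetical fixed seed for auditability, not a production allocation system; real trials should use dedicated software with concealed allocation):

```python
import random

def block_randomize(n, block_size=4, arms=("A", "B"), seed=2024):
    """Blocked allocation: each block contains equal numbers of each arm,
    shuffled, so group sizes never drift apart by more than half a block."""
    assert block_size % len(arms) == 0
    rng = random.Random(seed)  # fixed seed so the sequence is reproducible/auditable
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n:
        block = list(arms) * per_arm   # e.g. ["A", "B", "A", "B"]
        rng.shuffle(block)             # randomize order within the block
        sequence.extend(block)
    return sequence[:n]

seq = block_randomize(20)
print(seq.count("A"), seq.count("B"))  # 10 10 — balanced by construction
```

Generate the full sequence before recruitment begins and keep it concealed from everyone who enrolls participants.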
### Blinding
- [ ] **Single-blind:** Participants don't know group assignment
- [ ] **Double-blind:** Participants and researchers don't know
- [ ] **Triple-blind:** Participants, researchers, and data analysts don't know
- [ ] **Blinding feasibility:**
- Is true blinding possible?
- Placebo/sham controls needed?
- Identical appearance of interventions?
- [ ] **Blinding check:**
- Assess whether blinding maintained
- Ask participants/researchers to guess assignments
## Control Groups and Conditions
### What Type of Control?
- [ ] **No treatment control:** Natural course of condition
- [ ] **Placebo control:** Inert treatment for comparison
- [ ] **Active control:** Standard treatment comparison
- [ ] **Wait-list control:** Delayed treatment
- [ ] **Attention control:** Matches contact time without active ingredient
### Multiple Conditions
- [ ] Factorial designs for multiple factors
- [ ] Dose-response relationship assessment
- [ ] Mechanism testing with component analyses
## Procedures
### Protocol Development
- [ ] **Detailed, written protocol:**
- Step-by-step procedures
- Scripts for standardized instructions
- Decision rules for handling issues
- Data collection forms
- [ ] Pilot tested before main study
- [ ] Staff trained to criterion
- [ ] Compliance monitoring planned
### Standardization
- [ ] Same instructions for all participants
- [ ] Same equipment and materials
- [ ] Same environment/setting when possible
- [ ] Same assessment timing
- [ ] Deviations from protocol documented
### Data Collection
- [ ] **When collected:**
- Baseline measurements
- Post-intervention
- Follow-up timepoints
- [ ] **Who collects:**
- Trained researchers
- Blinded when possible
- Inter-rater reliability established
- [ ] **How collected:**
- Valid, reliable instruments
- Standardized administration
- Multiple methods if possible (triangulation)
## Measurement
### Validity
- [ ] **Face validity:** Appears to measure construct
- [ ] **Content validity:** Covers all aspects of construct
- [ ] **Criterion validity:** Correlates with gold standard
- Concurrent validity
- Predictive validity
- [ ] **Construct validity:** Measures theoretical construct
- Convergent validity (correlates with related measures)
- Discriminant validity (doesn't correlate with unrelated measures)
### Reliability
- [ ] **Test-retest:** Consistent over time
- [ ] **Internal consistency:** Items measure same construct (Cronbach's α)
- [ ] **Inter-rater reliability:** Agreement between raters (Cohen's κ, ICC)
- [ ] **Parallel forms:** Alternative versions consistent
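Cronbach's α from the internal-consistency item above follows directly from its classical formula, α = k/(k−1) · (1 − Σ item variances / variance of totals). A minimal sketch (the item scores are illustrative):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha. items: one score list per item, each of
    length n_respondents. The variance-divisor choice cancels in the
    ratio, so population variance is used throughout."""
    k = len(items)
    item_var = sum(pvariance(it) for it in items)
    totals = [sum(vals) for vals in zip(*items)]  # per-respondent total score
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Three items, four respondents (illustrative Likert-style scores):
alpha = cronbach_alpha([[2, 4, 3, 5], [3, 4, 2, 5], [2, 5, 3, 4]])
print(round(alpha, 2))  # ≈ 0.89
```

Values of roughly .70–.90 are conventionally treated as acceptable; very high α can also signal redundant items.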
### Measurement Considerations
- [ ] Objective measures preferred when possible
- [ ] Validated instruments used when available
- [ ] Multiple measures of key constructs
- [ ] Sensitivity to change considered
- [ ] Floor/ceiling effects avoided
- [ ] Response formats appropriate
- [ ] Recall periods appropriate
- [ ] Cultural appropriateness considered
## Bias Minimization
### Selection Bias
- [ ] Random sampling when possible
- [ ] Clearly defined eligibility criteria
- [ ] Document who declines and why
- [ ] Minimize self-selection
### Performance Bias
- [ ] Standardized protocols
- [ ] Blinding of providers
- [ ] Monitor protocol adherence
- [ ] Document deviations
### Detection Bias
- [ ] Blinding of outcome assessors
- [ ] Objective measures when possible
- [ ] Standardized assessment procedures
- [ ] Multiple raters with reliability checks
### Attrition Bias
- [ ] Strategies to minimize dropout
- [ ] Track reasons for dropout
- [ ] Compare dropouts to completers
- [ ] Intention-to-treat analysis planned
### Reporting Bias
- [ ] Preregister study and analysis plan
- [ ] Designate primary vs. secondary outcomes
- [ ] Commit to reporting all outcomes
- [ ] Distinguish planned from exploratory analyses
## Data Management
### Data Collection
- [ ] Data collection forms designed and tested
- [ ] REDCap, Qualtrics, or similar platforms
- [ ] Range checks and validation rules
- [ ] Regular backups
- [ ] Secure storage (HIPAA/GDPR compliant if needed)
### Data Quality
- [ ] Real-time data validation
- [ ] Regular quality checks
- [ ] Missing data patterns monitored
- [ ] Outliers identified and investigated
- [ ] Protocol deviations documented
### Data Security
- [ ] De-identification procedures
- [ ] Access controls
- [ ] Audit trails
- [ ] Compliance with regulations (IRB, HIPAA, GDPR)
## Statistical Analysis Planning
### Analysis Plan (Prespecify Before Data Collection)
- [ ] **Primary analysis:**
- Statistical test(s) specified
- Hypothesis clearly stated
- Significance level set (usually α = .05)
- One-tailed or two-tailed
- [ ] **Secondary analyses:**
- Clearly designated as secondary
- Exploratory analyses labeled as such
- [ ] **Multiple comparisons:**
- Adjustment method specified (if needed)
- A single prespecified primary outcome limits alpha inflation
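Among the adjustment methods that could be specified here, the Holm step-down procedure is a common, uniformly-more-powerful alternative to Bonferroni. A minimal sketch (p-values illustrative):

```python
def holm_adjust(pvals):
    """Holm step-down adjusted p-values: multiply the i-th smallest
    p-value (0-indexed) by (m - i), cap at 1, and enforce monotonicity
    so adjusted values never decrease as raw p-values increase."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted

adj = holm_adjust([0.01, 0.04, 0.03])
print([round(p, 3) for p in adj])  # [0.03, 0.06, 0.06]
```

Compare each adjusted value to α directly; here only the first test survives at α = .05.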
### Assumptions
- [ ] Assumptions of statistical tests identified
- [ ] Plan to check assumptions
- [ ] Backup non-parametric alternatives
- [ ] Transformation options considered
### Missing Data
- [ ] Anticipated amount of missingness
- [ ] Missing data mechanism (MCAR, MAR, MNAR)
- [ ] Handling strategy:
- Complete case analysis
- Multiple imputation
- Maximum likelihood
- [ ] Sensitivity analyses planned
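A small simulation illustrates why the missingness mechanism matters: when the probability of dropout depends on the unobserved value itself (MNAR), complete-case analysis is biased, and no amount of data fixes it. All numbers below are illustrative assumptions:

```python
import random

rng = random.Random(7)  # illustrative fixed seed
# True scores ~ N(50, 10); higher scorers are more likely to drop out (MNAR)
scores = [rng.gauss(50, 10) for _ in range(5000)]
# P(missing) grows linearly with the score: e.g. ~1/3 at 50, ~1/2 at 60
observed = [s for s in scores if rng.random() > (s - 30) / 60]

full_mean = sum(scores) / len(scores)
cc_mean = sum(observed) / len(observed)
print(round(full_mean, 1), round(cc_mean, 1))  # complete-case mean is biased downward
```

Under MCAR, complete-case analysis is merely inefficient; under MAR, multiple imputation or maximum likelihood can recover unbiased estimates; under MNAR, only sensitivity analyses can bound the damage.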
### Effect Sizes
- [ ] Appropriate effect size measures identified
- [ ] Will be reported alongside p-values
- [ ] Confidence intervals planned
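A minimal sketch of reporting an effect size with its confidence interval: Cohen's d from pooled standard deviation, with a large-sample normal approximation for the CI (the two groups below are illustrative):

```python
from math import sqrt
from statistics import NormalDist, mean, variance  # sample variance (n - 1)

def cohens_d_ci(g1, g2, conf=0.95):
    """Cohen's d with an approximate CI via the large-sample
    standard error sqrt((n1+n2)/(n1*n2) + d^2 / (2*(n1+n2)))."""
    n1, n2 = len(g1), len(g2)
    sp = sqrt(((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2))
    d = (mean(g1) - mean(g2)) / sp
    se = sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return d, (d - z * se, d + z * se)

d, ci = cohens_d_ci([5, 6, 7, 8], [4, 5, 6, 7])
print(round(d, 2), tuple(round(x, 2) for x in ci))  # d ≈ 0.77 with a wide CI
```

With tiny groups like these the interval spans zero despite a "large" point estimate, which is exactly why the checklist asks for CIs alongside p-values.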
### Statistical Software
- [ ] Software selected (R, SPSS, Stata, Python, etc.)
- [ ] Version documented
- [ ] Analysis scripts prepared in advance
- [ ] Will be made available (Open Science)
## Ethical Considerations
### Ethical Approval
- [ ] IRB/Ethics committee approval obtained
- [ ] Study registered (ClinicalTrials.gov, etc.) if applicable
- [ ] Protocol follows Declaration of Helsinki or equivalent
### Informed Consent
- [ ] Voluntary participation
- [ ] Comprehensible explanation
- [ ] Risks and benefits disclosed
- [ ] Right to withdraw without penalty
- [ ] Privacy protections explained
- [ ] Compensation disclosed
### Risk-Benefit Analysis
- [ ] Potential benefits outweigh risks
- [ ] Risks minimized
- [ ] Vulnerable populations protected
- [ ] Data safety monitoring (if high risk)
### Confidentiality
- [ ] Data de-identified
- [ ] Secure storage
- [ ] Limited access
- [ ] Reporting doesn't allow re-identification
## Validity Threats
### Internal Validity (Causation)
- [ ] **History:** External events between measurements
- [ ] **Maturation:** Changes in participants over time
- [ ] **Testing:** Effects of repeated measurement
- [ ] **Instrumentation:** Changes in measurement over time
- [ ] **Regression to mean:** Extreme scores becoming less extreme
- [ ] **Selection:** Groups differ at baseline
- [ ] **Attrition:** Differential dropout
- [ ] **Diffusion:** Control group receives treatment elements
### External Validity (Generalizability)
- [ ] Sample representative of population
- [ ] Setting realistic/natural
- [ ] Treatment typical of real-world implementation
- [ ] Outcome measures ecologically valid
- [ ] Time frame appropriate
### Construct Validity (Measurement)
- [ ] Measures actually tap intended constructs
- [ ] Operations match theoretical definitions
- [ ] No confounding of constructs
- [ ] Adequate coverage of construct
### Statistical Conclusion Validity
- [ ] Adequate statistical power
- [ ] Assumptions met
- [ ] Appropriate tests used
- [ ] Alpha level appropriate
- [ ] Multiple comparisons addressed
## Reporting and Transparency
### Preregistration
- [ ] Study preregistered (OSF, ClinicalTrials.gov, AsPredicted)
- [ ] Hypotheses stated a priori
- [ ] Analysis plan documented
- [ ] Distinguishes confirmatory from exploratory
### Reporting Guidelines
- [ ] **RCTs:** CONSORT checklist
- [ ] **Observational studies:** STROBE checklist
- [ ] **Systematic reviews:** PRISMA checklist
- [ ] **Diagnostic studies:** STARD checklist
- [ ] **Qualitative research:** COREQ checklist
- [ ] **Case reports:** CARE guidelines
### Transparency
- [ ] All measures reported
- [ ] All manipulations disclosed
- [ ] Sample size determination explained
- [ ] Exclusion criteria and numbers reported
- [ ] Attrition documented
- [ ] Deviations from protocol noted
- [ ] Conflicts of interest disclosed
### Open Science
- [ ] Data sharing planned (when ethical)
- [ ] Analysis code shared
- [ ] Materials available
- [ ] Preprint posted
- [ ] Open access publication when possible
## Post-Study Considerations
### Data Analysis
- [ ] Follow preregistered plan
- [ ] Clearly label deviations and exploratory analyses
- [ ] Check assumptions
- [ ] Report all outcomes
- [ ] Report effect sizes and CIs, not just p-values
### Interpretation
- [ ] Conclusions supported by data
- [ ] Limitations acknowledged
- [ ] Alternative explanations considered
- [ ] Generalizability discussed
- [ ] Clinical/practical significance addressed
### Dissemination
- [ ] Publish regardless of results (reduce publication bias)
- [ ] Present at conferences
- [ ] Share findings with participants (when appropriate)
- [ ] Communicate to relevant stakeholders
- [ ] Plain language summaries
### Next Steps
- [ ] Replication needed?
- [ ] Follow-up studies identified
- [ ] Mechanism studies planned
- [ ] Clinical applications considered
## Common Pitfalls to Avoid
- [ ] No power analysis → underpowered study
- [ ] Hypothesis formed after seeing data (HARKing)
- [ ] No blinding when feasible → bias
- [ ] P-hacking (data fishing, optional stopping)
- [ ] Multiple testing without correction → false positives
- [ ] Inadequate control group
- [ ] Confounding not addressed
- [ ] Instruments not validated
- [ ] High attrition not addressed
- [ ] Cherry-picking results to report
- [ ] Causal language from correlational data
- [ ] Ignoring assumptions of statistical tests
- [ ] Failing to preregister → selective reporting skews the literature
- [ ] Conflicts of interest not disclosed
## Final Checklist Before Starting
- [ ] Research question is clear and important
- [ ] Hypothesis is testable and specific
- [ ] Study design is appropriate
- [ ] Sample size is adequate (power analysis)
- [ ] Measures are valid and reliable
- [ ] Confounds are controlled
- [ ] Randomization and blinding implemented
- [ ] Data collection is standardized
- [ ] Analysis plan is prespecified
- [ ] Ethical approval obtained
- [ ] Study is preregistered
- [ ] Resources are sufficient
- [ ] Team is trained
- [ ] Protocol is documented
- [ ] Backup plans exist for problems
## Remember
**Good experimental design is about:**
- Asking clear questions
- Minimizing bias
- Maximizing validity
- Appropriate inference
- Transparency
- Reproducibility
**The best time to think about these issues is before collecting data, not after.**