Experimental Design Checklist

Research Question Formulation

Is the Question Well-Formed?

  • Specific: Clearly defined variables and relationships
  • Answerable: Can be addressed with available methods
  • Relevant: Addresses a gap in knowledge or practical need
  • Feasible: Resources, time, and ethical considerations allow it
  • Falsifiable: Can be proven wrong if incorrect

Have You Reviewed the Literature?

  • Identified what's already known
  • Found gaps or contradictions to address
  • Learned from methodological successes and failures
  • Identified appropriate outcome measures
  • Determined typical effect sizes in the field

Hypothesis Development

Is Your Hypothesis Testable?

  • Makes specific, quantifiable predictions
  • Variables are operationally defined
  • Specifies direction/nature of expected relationships
  • Can be falsified by potential observations

Types of Hypotheses

  • Null hypothesis (H₀): No effect/relationship exists
  • Alternative hypothesis (H₁): Effect/relationship exists
  • Directional vs. non-directional: One-tailed vs. two-tailed tests
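A minimal sketch of the one-tailed vs. two-tailed distinction, using SciPy's independent-samples t-test on hypothetical treatment/control scores (the data here are made up for illustration):

```python
from scipy import stats

treatment = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 6.0, 5.4]  # hypothetical scores
control = [4.8, 5.0, 4.7, 5.2, 4.9, 5.1, 4.6, 5.0]

# Two-tailed: H1 is "the means differ" (non-directional)
t_two, p_two = stats.ttest_ind(treatment, control)

# One-tailed: H1 is "treatment mean is greater" (directional)
t_one, p_one = stats.ttest_ind(treatment, control, alternative="greater")

# For the same positive t statistic, the one-tailed p is half the two-tailed p,
# which is why the direction must be justified and prespecified, not chosen
# after seeing the data.
```

The one-tailed test buys power only if the direction is committed to in advance; otherwise it is a form of p-hacking.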

Study Design Selection

What Type of Study is Appropriate?

Experimental (Intervention) Studies:

  • Randomized Controlled Trial (RCT): Gold standard for causation
  • Quasi-experimental: Non-random assignment but manipulation
  • Within-subjects: Same participants in all conditions
  • Between-subjects: Different participants per condition
  • Factorial: Multiple independent variables
  • Crossover: Participants receive multiple interventions sequentially

Observational Studies:

  • Cohort: Follow groups over time
  • Case-control: Compare those with/without outcome
  • Cross-sectional: Snapshot at one time point
  • Ecological: Population-level data

Consider:

  • Can you randomly assign participants?
  • Can you manipulate the independent variable?
  • Is the outcome rare (favor case-control) or common?
  • Do you need to establish temporal sequence?
  • What's feasible given ethical, practical constraints?

Variables

Independent Variables (Manipulated/Predictor)

  • Clearly defined and operationalized
  • Appropriate levels/categories chosen
  • Manipulation is sufficient to test hypothesis
  • Manipulation check planned (if applicable)

Dependent Variables (Outcome/Response)

  • Directly measures the construct of interest
  • Validated and reliable measurement
  • Sensitive enough to detect expected effects
  • Appropriate for statistical analysis planned
  • Primary outcome clearly designated

Control Variables

  • Confounding variables identified:
    • Variables that affect both IV and DV
    • Alternative explanations for findings
  • Strategy for control:
    • Randomization
    • Matching
    • Stratification
    • Statistical adjustment
    • Restriction (inclusion/exclusion criteria)
    • Blinding

Extraneous Variables

  • Potential sources of noise identified
  • Standardized procedures to minimize
  • Environmental factors controlled
  • Time of day, setting, equipment standardized

Sampling

Population Definition

  • Target population: Who you want to generalize to
  • Accessible population: Who you can actually sample from
  • Sample: Who actually participates
  • Difference between these documented

Sampling Method

  • Probability sampling (preferred for generalizability):
    • Simple random sampling
    • Stratified sampling
    • Cluster sampling
    • Systematic sampling
  • Non-probability sampling (common but limits generalizability):
    • Convenience sampling
    • Purposive sampling
    • Snowball sampling
    • Quota sampling
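Proportionate stratified sampling can be sketched in a few lines of standard-library Python. The sampling frame, strata, and `stratified_sample` helper below are all hypothetical, purely for illustration:

```python
import random
from collections import defaultdict

def stratified_sample(population, strata_key, frac, seed=0):
    """Draw the same fraction from each stratum (proportionate stratified sampling)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[strata_key(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * frac))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical frame: 60 urban and 40 rural clinics
frame = [{"id": i, "site": "urban" if i < 60 else "rural"} for i in range(100)]
picked = stratified_sample(frame, lambda u: u["site"], frac=0.1)
# Preserves the 60/40 urban/rural proportion in the sample
```

Stratification guarantees each subgroup is represented in its population proportion, which simple random sampling achieves only in expectation.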

Sample Size

  • A priori power analysis conducted
    • Expected effect size (from literature or pilot)
    • Desired power (typically .80 or .90)
    • Significance level (typically .05)
    • Statistical test to be used
  • Accounts for expected attrition/dropout
  • Sufficient for planned subgroup analyses
  • Practical constraints acknowledged
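The power-analysis inputs above can be turned into a per-group sample size. This sketch uses the normal approximation for a two-sided, two-sample t-test; the `n_per_group` helper and the 15% attrition figure are illustrative assumptions (exact t-based tools such as statsmodels' `TTestIndPower` give slightly larger n):

```python
from math import ceil
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample t-test.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{1-beta}) / d)^2.
    Exact t-based calculations give slightly larger n; adding 1-2 per
    group is a reasonable safety margin.
    """
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

n = n_per_group(d=0.5)        # d = 0.5: "medium" effect by Cohen's convention
total = ceil(n * 2 * 1.15)    # inflate recruitment for ~15% expected attrition
```

Note how sensitive n is to the effect size: halving d roughly quadruples the required sample, which is why an honest effect-size estimate from the literature or a pilot matters so much.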

Inclusion/Exclusion Criteria

  • Clearly defined and justified
  • Not overly restrictive (limits generalizability)
  • Based on theoretical or practical considerations
  • Ethical considerations addressed
  • Documented and applied consistently

Blinding and Randomization

Randomization

  • What is randomized:
    • Participant assignment to conditions
    • Order of conditions (within-subjects)
    • Stimuli/items presented
  • Method of randomization:
    • Computer-generated random numbers
    • Random number tables
    • Coin flips (for very small studies)
  • Allocation concealment:
    • Sequence generated before recruitment
    • Allocation hidden until after enrollment
    • Sequentially numbered, sealed envelopes (if needed)
  • Stratified randomization:
    • Balance important variables across groups
    • Block randomization to ensure equal group sizes
  • Check randomization:
    • Compare groups at baseline
    • Report any significant differences
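Permuted-block randomization, mentioned above, can be sketched as follows. The `block_randomize` helper and arm names are hypothetical; in a real trial the sequence would be generated before recruitment and concealed from recruiters:

```python
import random

def block_randomize(n_participants, arms=("treatment", "control"),
                    block_size=4, seed=2024):
    """Permuted-block allocation: shuffle each block so arm counts stay balanced.

    After every complete block, the arms are exactly balanced, which
    prevents long runs of one arm and keeps group sizes equal.
    """
    assert block_size % len(arms) == 0
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        block = list(arms) * per_arm
        rng.shuffle(block)        # computer-generated random order within block
        sequence.extend(block)
    return sequence[:n_participants]

seq = block_randomize(20)
```

Varying the block size randomly (e.g., mixing blocks of 4 and 6) makes the sequence harder for unblinded staff to predict near block boundaries.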

Blinding

  • Single-blind: Participants don't know group assignment
  • Double-blind: Participants and researchers don't know
  • Triple-blind: Participants, researchers, and data analysts don't know
  • Blinding feasibility:
    • Is true blinding possible?
    • Placebo/sham controls needed?
    • Identical appearance of interventions?
  • Blinding check:
    • Assess whether blinding maintained
    • Ask participants/researchers to guess assignments

Control Groups and Conditions

What Type of Control?

  • No treatment control: Natural course of condition
  • Placebo control: Inert treatment for comparison
  • Active control: Standard treatment comparison
  • Wait-list control: Delayed treatment
  • Attention control: Matches contact time without active ingredient

Multiple Conditions

  • Factorial designs for multiple factors
  • Dose-response relationship assessment
  • Mechanism testing with component analyses

Procedures

Protocol Development

  • Detailed, written protocol:
    • Step-by-step procedures
    • Scripts for standardized instructions
    • Decision rules for handling issues
    • Data collection forms
  • Pilot tested before main study
  • Staff trained to criterion
  • Compliance monitoring planned

Standardization

  • Same instructions for all participants
  • Same equipment and materials
  • Same environment/setting when possible
  • Same assessment timing
  • Deviations from protocol documented

Data Collection

  • When collected:
    • Baseline measurements
    • Post-intervention
    • Follow-up timepoints
  • Who collects:
    • Trained researchers
    • Blinded when possible
    • Inter-rater reliability established
  • How collected:
    • Valid, reliable instruments
    • Standardized administration
    • Multiple methods if possible (triangulation)

Measurement

Validity

  • Face validity: Appears to measure construct
  • Content validity: Covers all aspects of construct
  • Criterion validity: Correlates with gold standard
    • Concurrent validity
    • Predictive validity
  • Construct validity: Measures theoretical construct
    • Convergent validity (correlates with related measures)
    • Discriminant validity (doesn't correlate with unrelated measures)

Reliability

  • Test-retest: Consistent over time
  • Internal consistency: Items measure same construct (Cronbach's α)
  • Inter-rater reliability: Agreement between raters (Cohen's κ, ICC)
  • Parallel forms: Alternative versions consistent
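Cronbach's α is simple enough to compute directly from an item-score matrix. This is a minimal sketch with hypothetical scale data; the conventional ≥ .70 threshold is a rule of thumb, not a law:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 3-item scale answered by 5 respondents
scores = [[3, 4, 3], [2, 2, 3], [4, 5, 4], [1, 2, 2], [5, 4, 5]]
alpha = cronbach_alpha(scores)   # >= .70 is conventionally "acceptable"
```

Perfectly redundant items give α = 1; α also rises mechanically with the number of items, so a high α on a long scale is weaker evidence of internal consistency than the same α on a short one.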

Measurement Considerations

  • Objective measures preferred when possible
  • Validated instruments used when available
  • Multiple measures of key constructs
  • Sensitivity to change considered
  • Floor/ceiling effects avoided
  • Response formats appropriate
  • Recall periods appropriate
  • Cultural appropriateness considered

Bias Minimization

Selection Bias

  • Random sampling when possible
  • Clearly defined eligibility criteria
  • Document who declines and why
  • Minimize self-selection

Performance Bias

  • Standardized protocols
  • Blinding of providers
  • Monitor protocol adherence
  • Document deviations

Detection Bias

  • Blinding of outcome assessors
  • Objective measures when possible
  • Standardized assessment procedures
  • Multiple raters with reliability checks

Attrition Bias

  • Strategies to minimize dropout
  • Track reasons for dropout
  • Compare dropouts to completers
  • Intention-to-treat analysis planned

Reporting Bias

  • Preregister study and analysis plan
  • Designate primary vs. secondary outcomes
  • Commit to reporting all outcomes
  • Distinguish planned from exploratory analyses

Data Management

Data Collection

  • Data collection forms designed and tested
  • REDCap, Qualtrics, or similar platforms
  • Range checks and validation rules
  • Regular backups
  • Secure storage (HIPAA/GDPR compliant if needed)

Data Quality

  • Real-time data validation
  • Regular quality checks
  • Missing data patterns monitored
  • Outliers identified and investigated
  • Protocol deviations documented
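One common way to operationalize "outliers identified and investigated" is Tukey's IQR fence rule. The helper below is an illustrative sketch; flagged points should be investigated (data-entry error? protocol deviation?), never silently deleted:

```python
import numpy as np

def flag_outliers_iqr(x, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return (x < lo) | (x > hi)

values = np.array([12, 14, 13, 15, 14, 13, 95, 14])  # 95: likely an entry error
mask = flag_outliers_iqr(values)
flagged = values[mask]
```

The decision rule for handling flagged values (verify against source documents, correct, or exclude with documentation) should itself be prespecified in the protocol.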

Data Security

  • De-identification procedures
  • Access controls
  • Audit trails
  • Compliance with regulations (IRB, HIPAA, GDPR)

Statistical Analysis Planning

Analysis Plan (Prespecify Before Data Collection)

  • Primary analysis:
    • Statistical test(s) specified
    • Hypothesis clearly stated
    • Significance level set (usually α = .05)
    • One-tailed or two-tailed
  • Secondary analyses:
    • Clearly designated as secondary
    • Exploratory analyses labeled as such
  • Multiple comparisons:
    • Adjustment method specified (if needed)
    • Single designated primary outcome limits Type I error inflation
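A common adjustment method is Holm's step-down procedure, which controls the family-wise error rate and is uniformly more powerful than plain Bonferroni. A minimal sketch, with hypothetical p-values from four secondary outcomes:

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure: reject while p_(i) <= alpha / (m - i)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject

p = [0.001, 0.012, 0.021, 0.30]   # hypothetical secondary-outcome p-values
decisions = holm_bonferroni(p)
# Holm rejects the first three here; plain Bonferroni (p <= alpha/m = 0.0125)
# would reject only the first two.
```

Whatever method is chosen, it must be named in the prespecified analysis plan, not selected after seeing which p-values it favors.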

Assumptions

  • Assumptions of statistical tests identified
  • Plan to check assumptions
  • Backup non-parametric alternatives
  • Transformation options considered
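The "plan to check assumptions" can be made concrete with standard SciPy tests. This sketch uses simulated data and an illustrative decision rule; the fallback tests named here are examples of prespecified alternatives, not a universal recipe:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=5.0, scale=1.0, size=40)  # simulated outcome scores
group_b = rng.normal(loc=5.5, scale=1.0, size=40)

# Normality of each group (Shapiro-Wilk); small p suggests non-normality
_, p_norm_a = stats.shapiro(group_a)
_, p_norm_b = stats.shapiro(group_b)

# Homogeneity of variance (Levene's test); small p suggests unequal variances
_, p_levene = stats.levene(group_a, group_b)

# Prespecified fallback: non-parametric test if normality is doubtful,
# Welch's t-test if only the equal-variance assumption fails
if min(p_norm_a, p_norm_b) < 0.05:
    stat, p = stats.mannwhitneyu(group_a, group_b)
else:
    stat, p = stats.ttest_ind(group_a, group_b, equal_var=(p_levene >= 0.05))
```

With realistic sample sizes, graphical checks (Q-Q plots, residual plots) are often more informative than significance tests of the assumptions themselves, which become overpowered at large n.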

Missing Data

  • Anticipated amount of missingness
  • Missing data mechanism (MCAR, MAR, MNAR)
  • Handling strategy:
    • Complete case analysis
    • Multiple imputation
    • Maximum likelihood
  • Sensitivity analyses planned
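The difference between the handling strategies can be illustrated in a few lines. This is a deliberately simplified sketch on a hypothetical outcome vector; single mean imputation is shown only to contrast with complete-case analysis, since it understates variance and multiple imputation (e.g., scikit-learn's IterativeImputer or R's mice) is preferred for MAR data:

```python
import numpy as np

# Hypothetical outcome with missing values
y = np.array([4.2, 5.1, np.nan, 3.8, np.nan, 4.9, 5.4])

# Complete-case analysis: drop missing rows (unbiased only under MCAR)
complete = y[~np.isnan(y)]

# Single mean imputation: fills gaps but artificially shrinks variance;
# shown here for contrast, not as a recommended primary strategy
imputed = np.where(np.isnan(y), np.nanmean(y), y)
```

Because each strategy encodes a different assumption about why data are missing, the sensitivity analysis should show whether conclusions change across strategies.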

Effect Sizes

  • Appropriate effect size measures identified
  • Will be reported alongside p-values
  • Confidence intervals planned
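For a two-group comparison, the standard effect size is Cohen's d with the pooled standard deviation. A minimal sketch on toy data (the `cohens_d` helper is illustrative):

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n1, n2 = a.size, b.size
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

d = cohens_d([2, 4, 6], [1, 3, 5])   # toy data: mean difference 1, pooled SD 2
```

Unlike a p-value, d does not shrink toward "significance" as n grows, which is why reporting it alongside a confidence interval conveys the magnitude of the effect rather than just its detectability.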

Statistical Software

  • Software selected (R, SPSS, Stata, Python, etc.)
  • Version documented
  • Analysis scripts prepared in advance
  • Will be made available (Open Science)

Ethical Considerations

Ethical Approval

  • IRB/Ethics committee approval obtained
  • Study registered (ClinicalTrials.gov, etc.) if applicable
  • Protocol follows Declaration of Helsinki or equivalent

Informed Consent

  • Voluntary participation
  • Comprehensible explanation
  • Risks and benefits disclosed
  • Right to withdraw without penalty
  • Privacy protections explained
  • Compensation disclosed

Risk-Benefit Analysis

  • Potential benefits outweigh risks
  • Risks minimized
  • Vulnerable populations protected
  • Data safety monitoring (if high risk)

Confidentiality

  • Data de-identified
  • Secure storage
  • Limited access
  • Reporting doesn't allow re-identification

Validity Threats

Internal Validity (Causation)

  • History: External events between measurements
  • Maturation: Changes in participants over time
  • Testing: Effects of repeated measurement
  • Instrumentation: Changes in measurement over time
  • Regression to mean: Extreme scores becoming less extreme
  • Selection: Groups differ at baseline
  • Attrition: Differential dropout
  • Diffusion: Control group receives treatment elements

External Validity (Generalizability)

  • Sample representative of population
  • Setting realistic/natural
  • Treatment typical of real-world implementation
  • Outcome measures ecologically valid
  • Time frame appropriate

Construct Validity (Measurement)

  • Measures actually tap intended constructs
  • Operations match theoretical definitions
  • No confounding of constructs
  • Adequate coverage of construct

Statistical Conclusion Validity

  • Adequate statistical power
  • Assumptions met
  • Appropriate tests used
  • Alpha level appropriate
  • Multiple comparisons addressed

Reporting and Transparency

Preregistration

  • Study preregistered (OSF, ClinicalTrials.gov, AsPredicted)
  • Hypotheses stated a priori
  • Analysis plan documented
  • Distinguishes confirmatory from exploratory

Reporting Guidelines

  • RCTs: CONSORT checklist
  • Observational studies: STROBE checklist
  • Systematic reviews: PRISMA checklist
  • Diagnostic studies: STARD checklist
  • Qualitative research: COREQ checklist
  • Case reports: CARE guidelines

Transparency

  • All measures reported
  • All manipulations disclosed
  • Sample size determination explained
  • Exclusion criteria and numbers reported
  • Attrition documented
  • Deviations from protocol noted
  • Conflicts of interest disclosed

Open Science

  • Data sharing planned (when ethical)
  • Analysis code shared
  • Materials available
  • Preprint posted
  • Open access publication when possible

Post-Study Considerations

Data Analysis

  • Follow preregistered plan
  • Clearly label deviations and exploratory analyses
  • Check assumptions
  • Report all outcomes
  • Report effect sizes and CIs, not just p-values

Interpretation

  • Conclusions supported by data
  • Limitations acknowledged
  • Alternative explanations considered
  • Generalizability discussed
  • Clinical/practical significance addressed

Dissemination

  • Publish regardless of results (reduce publication bias)
  • Present at conferences
  • Share findings with participants (when appropriate)
  • Communicate to relevant stakeholders
  • Plain language summaries

Next Steps

  • Replication needed?
  • Follow-up studies identified
  • Mechanism studies planned
  • Clinical applications considered

Common Pitfalls to Avoid

  • No power analysis → underpowered study
  • Hypothesis formed after seeing data (HARKing)
  • No blinding when feasible → bias
  • P-hacking (data fishing, optional stopping)
  • Multiple testing without correction → false positives
  • Inadequate control group
  • Confounding not addressed
  • Instruments not validated
  • High attrition not addressed
  • Cherry-picking results to report
  • Causal language from correlational data
  • Ignoring assumptions of statistical tests
  • No preregistration → selective reporting and publication bias
  • Conflicts of interest not disclosed

Final Checklist Before Starting

  • Research question is clear and important
  • Hypothesis is testable and specific
  • Study design is appropriate
  • Sample size is adequate (power analysis)
  • Measures are valid and reliable
  • Confounds are controlled
  • Randomization and blinding implemented
  • Data collection is standardized
  • Analysis plan is prespecified
  • Ethical approval obtained
  • Study is preregistered
  • Resources are sufficient
  • Team is trained
  • Protocol is documented
  • Backup plans exist for problems

Remember

Good experimental design is about:

  • Asking clear questions
  • Minimizing bias
  • Maximizing validity
  • Appropriate inference
  • Transparency
  • Reproducibility

The best time to think about these issues is before collecting data, not after.