Experimental Design Checklist
Research Question Formulation
Is the Question Well-Formed?
- Specific: Clearly defined variables and relationships
- Answerable: Can be addressed with available methods
- Relevant: Addresses a gap in knowledge or practical need
- Feasible: Resources, time, and ethical considerations allow it
- Falsifiable: Specific observations could, in principle, show it to be wrong
Have You Reviewed the Literature?
- Identified what's already known
- Found gaps or contradictions to address
- Learned from methodological successes and failures
- Identified appropriate outcome measures
- Determined typical effect sizes in the field
Hypothesis Development
Is Your Hypothesis Testable?
- Makes specific, quantifiable predictions
- Variables are operationally defined
- Specifies direction/nature of expected relationships
- Can be falsified by potential observations
Types of Hypotheses
- Null hypothesis (H₀): No effect/relationship exists
- Alternative hypothesis (H₁): Effect/relationship exists
- Directional vs. non-directional: One-tailed vs. two-tailed tests
Study Design Selection
What Type of Study is Appropriate?
Experimental (Intervention) Studies:
- Randomized Controlled Trial (RCT): Gold standard for causal inference
- Quasi-experimental: Non-random assignment but manipulation
- Within-subjects: Same participants in all conditions
- Between-subjects: Different participants per condition
- Factorial: Multiple independent variables
- Crossover: Participants receive multiple interventions sequentially
Observational Studies:
- Cohort: Follow groups over time
- Case-control: Compare those with/without outcome
- Cross-sectional: Snapshot at one time point
- Ecological: Population-level data
Consider:
- Can you randomly assign participants?
- Can you manipulate the independent variable?
- Is the outcome rare (favor case-control) or common?
- Do you need to establish temporal sequence?
- What's feasible given ethical, practical constraints?
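The decision questions above can be compressed into a toy selector. This sketch is illustrative only: `suggest_design` and its three inputs are invented for this example, and real design choice weighs many more factors (ethics, resources, temporal sequence).

```python
def suggest_design(can_randomize: bool, can_manipulate: bool,
                   outcome_is_rare: bool) -> str:
    """Map the feasibility questions above to one candidate design.

    Illustrative toy logic, not a complete decision procedure.
    """
    if can_manipulate:
        # Manipulation possible: experimental or quasi-experimental
        return "RCT" if can_randomize else "quasi-experimental"
    # No manipulation: observational; rare outcomes favor case-control
    return "case-control" if outcome_is_rare else "cohort"

print(suggest_design(can_randomize=True, can_manipulate=True,
                     outcome_is_rare=False))  # RCT
print(suggest_design(can_randomize=False, can_manipulate=False,
                     outcome_is_rare=True))   # case-control
```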
Variables
Independent Variables (Manipulated/Predictor)
- Clearly defined and operationalized
- Appropriate levels/categories chosen
- Manipulation is sufficient to test hypothesis
- Manipulation check planned (if applicable)
Dependent Variables (Outcome/Response)
- Directly measures the construct of interest
- Validated and reliable measurement
- Sensitive enough to detect expected effects
- Appropriate for statistical analysis planned
- Primary outcome clearly designated
Control Variables
- Confounding variables identified:
  - Variables that affect both IV and DV
  - Alternative explanations for findings
- Strategy for control:
  - Randomization
  - Matching
  - Stratification
  - Statistical adjustment
  - Restriction (inclusion/exclusion criteria)
  - Blinding
Extraneous Variables
- Potential sources of noise identified
- Standardized procedures to minimize
- Environmental factors controlled
- Time of day, setting, equipment standardized
Sampling
Population Definition
- Target population: Who you want to generalize to
- Accessible population: Who you can actually sample from
- Sample: Who actually participates
- Difference between these documented
Sampling Method
- Probability sampling (preferred for generalizability):
  - Simple random sampling
  - Stratified sampling
  - Cluster sampling
  - Systematic sampling
- Non-probability sampling (common but limits generalizability):
  - Convenience sampling
  - Purposive sampling
  - Snowball sampling
  - Quota sampling
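Stratified sampling can be sketched in a few lines: split the frame into strata, then draw the same fraction from each so the sample preserves the population's proportions. The function and toy population below are illustrative.

```python
import random
from collections import defaultdict

def stratified_sample(population, strata_of, frac, seed=0):
    """Draw `frac` of each stratum at random (illustrative sketch)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[strata_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(frac * len(members))       # proportional allocation
        sample.extend(rng.sample(members, k))
    return sample

# Toy frame: 60 'A' units and 40 'B' units; a 10% stratified sample
# keeps the 60/40 split (6 'A', 4 'B').
pop = [("A", i) for i in range(60)] + [("B", i) for i in range(40)]
s = stratified_sample(pop, strata_of=lambda u: u[0], frac=0.10)
print(len(s))  # 10
```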
Sample Size
- A priori power analysis conducted:
  - Expected effect size (from literature or pilot)
  - Desired power (typically .80 or .90)
  - Significance level (typically .05)
  - Statistical test to be used
- Accounts for expected attrition/dropout
- Sufficient for planned subgroup analyses
- Practical constraints acknowledged
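The power-analysis inputs above combine into the standard normal-approximation formula for a two-sided, two-sample comparison of means; it slightly underestimates n relative to a t-based calculation (e.g. statsmodels' `TTestIndPower`), so treat it as a back-of-the-envelope sketch. The dropout inflation is the attrition adjustment from the checklist.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80, dropout=0.0):
    """Approximate sample size per group for a two-sided two-sample
    test of means with effect size d (Cohen's d), inflated for
    expected dropout. Normal approximation; a t-based power analysis
    gives a slightly larger n."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha=.05
    z_b = NormalDist().inv_cdf(power)          # ~0.84 for power=.80
    n = 2 * ((z_a + z_b) / d) ** 2
    return ceil(n / (1 - dropout))             # inflate for attrition

print(n_per_group(d=0.5))                 # 63 per group
print(n_per_group(d=0.5, dropout=0.15))   # 74 per group after 15% dropout
```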
Inclusion/Exclusion Criteria
- Clearly defined and justified
- Not overly restrictive (over-restriction limits generalizability)
- Based on theoretical or practical considerations
- Ethical considerations addressed
- Documented and applied consistently
Blinding and Randomization
Randomization
- What is randomized:
  - Participant assignment to conditions
  - Order of conditions (within-subjects)
  - Stimuli/items presented
- Method of randomization:
  - Computer-generated random numbers
  - Random number tables
  - Coin flips (for very small studies)
- Allocation concealment:
  - Sequence generated before recruitment
  - Allocation hidden until after enrollment
  - Sequentially numbered, sealed envelopes (if needed)
- Stratified randomization:
  - Balance important variables across groups
  - Block randomization to ensure equal group sizes
- Check randomization:
  - Compare groups at baseline
  - Report any significant differences
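Permuted-block randomization, mentioned above for keeping group sizes equal, can be sketched as follows. As the checklist notes, the sequence should be generated before recruitment and concealed from those enrolling participants.

```python
import random

def block_randomize(n_participants, arms=("treatment", "control"),
                    block_size=4, seed=None):
    """Permuted-block randomization: each block holds equal numbers
    of every arm in random order, so group sizes never drift far apart."""
    assert block_size % len(arms) == 0, "block size must divide evenly"
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)                 # random order within the block
        sequence.extend(block)
    return sequence[:n_participants]

seq = block_randomize(20, seed=42)
print(seq.count("treatment"), seq.count("control"))  # 10 10
```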
Blinding
- Single-blind: Participants don't know group assignment
- Double-blind: Participants and researchers don't know
- Triple-blind: Participants, researchers, and data analysts don't know
- Blinding feasibility:
  - Is true blinding possible?
  - Placebo/sham controls needed?
  - Identical appearance of interventions?
- Blinding check:
  - Assess whether blinding maintained
  - Ask participants/researchers to guess assignments
Control Groups and Conditions
What Type of Control?
- No treatment control: Natural course of condition
- Placebo control: Inert treatment for comparison
- Active control: Standard treatment comparison
- Wait-list control: Delayed treatment
- Attention control: Matches contact time without active ingredient
Multiple Conditions
- Factorial designs for multiple factors
- Dose-response relationship assessment
- Mechanism testing with component analyses
Procedures
Protocol Development
- Detailed, written protocol:
  - Step-by-step procedures
  - Scripts for standardized instructions
  - Decision rules for handling issues
  - Data collection forms
- Pilot tested before main study
- Staff trained to criterion
- Compliance monitoring planned
Standardization
- Same instructions for all participants
- Same equipment and materials
- Same environment/setting when possible
- Same assessment timing
- Deviations from protocol documented
Data Collection
- When collected:
  - Baseline measurements
  - Post-intervention
  - Follow-up timepoints
- Who collects:
  - Trained researchers
  - Blinded when possible
  - Inter-rater reliability established
- How collected:
  - Valid, reliable instruments
  - Standardized administration
  - Multiple methods if possible (triangulation)
Measurement
Validity
- Face validity: Appears to measure construct
- Content validity: Covers all aspects of construct
- Criterion validity: Correlates with gold standard
  - Concurrent validity
  - Predictive validity
- Construct validity: Measures theoretical construct
  - Convergent validity (correlates with related measures)
  - Discriminant validity (doesn't correlate with unrelated measures)
Reliability
- Test-retest: Consistent over time
- Internal consistency: Items measure same construct (Cronbach's α)
- Inter-rater reliability: Agreement between raters (Cohen's κ, ICC)
- Parallel forms: Alternative versions consistent
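The two reliability statistics named above, Cronbach's α and Cohen's κ, follow directly from their textbook formulas; the minimal implementations below use toy data for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha from item-score columns
    (each column = one item's scores across participants)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-person totals
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

def cohen_kappa(r1, r2):
    """Cohen's kappa: agreement between two raters beyond chance."""
    n = len(r1)
    po = sum(a == b for a, b in zip(r1, r2)) / n   # observed agreement
    cats = set(r1) | set(r2)
    pe = sum((r1.count(c) / n) * (r2.count(c) / n)
             for c in cats)                         # chance agreement
    return (po - pe) / (1 - pe)

# Perfectly consistent 3-item scale across 4 participants -> alpha = 1.0
print(cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]))
# Two raters agree on 3 of 4 binary ratings -> kappa = 0.5
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```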
Measurement Considerations
- Objective measures preferred when possible
- Validated instruments used when available
- Multiple measures of key constructs
- Sensitivity to change considered
- Floor/ceiling effects avoided
- Response formats appropriate
- Recall periods appropriate
- Cultural appropriateness considered
Bias Minimization
Selection Bias
- Random sampling when possible
- Clearly defined eligibility criteria
- Document who declines and why
- Minimize self-selection
Performance Bias
- Standardized protocols
- Blinding of providers
- Monitor protocol adherence
- Document deviations
Detection Bias
- Blinding of outcome assessors
- Objective measures when possible
- Standardized assessment procedures
- Multiple raters with reliability checks
Attrition Bias
- Strategies to minimize dropout
- Track reasons for dropout
- Compare dropouts to completers
- Intention-to-treat analysis planned
Reporting Bias
- Preregister study and analysis plan
- Designate primary vs. secondary outcomes
- Commit to reporting all outcomes
- Distinguish planned from exploratory analyses
Data Management
Data Collection
- Data collection forms designed and tested
- REDCap, Qualtrics, or similar platforms
- Range checks and validation rules
- Regular backups
- Secure storage (HIPAA/GDPR compliant if needed)
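Range checks and validation rules like those REDCap or Qualtrics apply at entry can also be scripted for batch checks. The field names and limits below are made up for illustration.

```python
# Hypothetical validation rules (field names and limits are examples)
RULES = {
    "age":   lambda v: 18 <= v <= 90,
    "score": lambda v: 0 <= v <= 100,
    "group": lambda v: v in {"treatment", "control"},
}

def validate_record(record):
    """Return the fields in a record that violate their rule."""
    return [field for field, ok in RULES.items()
            if field in record and not ok(record[field])]

print(validate_record({"age": 34, "score": 88, "group": "control"}))   # []
print(validate_record({"age": 130, "score": 88, "group": "placebo"}))  # ['age', 'group']
```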
Data Quality
- Real-time data validation
- Regular quality checks
- Missing data patterns monitored
- Outliers identified and investigated
- Protocol deviations documented
Data Security
- De-identification procedures
- Access controls
- Audit trails
- Compliance with regulations (IRB, HIPAA, GDPR)
Statistical Analysis Planning
Analysis Plan (Prespecify Before Data Collection)
- Primary analysis:
  - Statistical test(s) specified
  - Hypothesis clearly stated
  - Significance level set (usually α = .05)
  - One-tailed or two-tailed
- Secondary analyses:
  - Clearly designated as secondary
  - Exploratory analyses labeled as such
- Multiple comparisons:
  - Adjustment method specified (if needed)
  - A single designated primary outcome limits Type I error inflation
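One common adjustment method is the Holm step-down procedure, which controls the familywise error rate and is uniformly more powerful than plain Bonferroni. A minimal sketch:

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values (familywise error control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * p_values[i])  # step-down multiplier
        running_max = max(running_max, adj)       # enforce monotonicity
        adjusted[i] = running_max
    return adjusted

print(holm_adjust([0.01, 0.04, 0.03]))  # [0.03, 0.06, 0.06]
```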
Assumptions
- Assumptions of statistical tests identified
- Plan to check assumptions
- Backup non-parametric alternatives
- Transformation options considered
Missing Data
- Anticipated amount of missingness
- Missing data mechanism (MCAR, MAR, MNAR)
- Handling strategy:
  - Complete case analysis
  - Multiple imputation
  - Maximum likelihood
- Sensitivity analyses planned
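The simplest two strategies above can be contrasted in a few lines; note that single mean imputation understates variance and is shown only as a baseline against multiple imputation or maximum likelihood.

```python
def complete_cases(values):
    """Listwise deletion: drop missing observations."""
    return [v for v in values if v is not None]

def mean_impute(values):
    """Replace missing values with the observed mean.
    Caution: single mean imputation shrinks variance estimates;
    shown only to contrast with multiple imputation."""
    observed = complete_cases(values)
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

data = [1.0, 2.0, None, 5.0]
print(complete_cases(data))  # [1.0, 2.0, 5.0]
print(mean_impute(data))     # [1.0, 2.0, 2.666..., 5.0]
```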
Effect Sizes
- Appropriate effect size measures identified
- Will be reported alongside p-values
- Confidence intervals planned
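For a two-group mean comparison, the standard effect size is Cohen's d with a pooled standard deviation, reported alongside the p-value. A minimal sketch on toy data:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Cohen's d: standardized mean difference with pooled SD."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) +
                  (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

print(round(cohens_d([2, 4, 6, 8], [1, 3, 5, 7]), 2))  # 0.39
```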
Statistical Software
- Software selected (R, SPSS, Stata, Python, etc.)
- Version documented
- Analysis scripts prepared in advance
- Will be made available (Open Science)
Ethical Considerations
Ethical Approval
- IRB/Ethics committee approval obtained
- Study registered (ClinicalTrials.gov, etc.) if applicable
- Protocol follows Declaration of Helsinki or equivalent
Informed Consent
- Voluntary participation
- Comprehensible explanation
- Risks and benefits disclosed
- Right to withdraw without penalty
- Privacy protections explained
- Compensation disclosed
Risk-Benefit Analysis
- Potential benefits outweigh risks
- Risks minimized
- Vulnerable populations protected
- Data safety monitoring (if high risk)
Confidentiality
- Data de-identified
- Secure storage
- Limited access
- Reporting doesn't allow re-identification
Validity Threats
Internal Validity (Causation)
- History: External events between measurements
- Maturation: Changes in participants over time
- Testing: Effects of repeated measurement
- Instrumentation: Changes in measurement over time
- Regression to the mean: Extreme scores becoming less extreme on remeasurement
- Selection: Groups differ at baseline
- Attrition: Differential dropout
- Diffusion: Control group receives treatment elements
External Validity (Generalizability)
- Sample representative of population
- Setting realistic/natural
- Treatment typical of real-world implementation
- Outcome measures ecologically valid
- Time frame appropriate
Construct Validity (Measurement)
- Measures actually tap intended constructs
- Operations match theoretical definitions
- No confounding of constructs
- Adequate coverage of construct
Statistical Conclusion Validity
- Adequate statistical power
- Assumptions met
- Appropriate tests used
- Alpha level appropriate
- Multiple comparisons addressed
Reporting and Transparency
Preregistration
- Study preregistered (OSF, ClinicalTrials.gov, AsPredicted)
- Hypotheses stated a priori
- Analysis plan documented
- Distinguishes confirmatory from exploratory
Reporting Guidelines
- RCTs: CONSORT checklist
- Observational studies: STROBE checklist
- Systematic reviews: PRISMA checklist
- Diagnostic studies: STARD checklist
- Qualitative research: COREQ checklist
- Case reports: CARE guidelines
Transparency
- All measures reported
- All manipulations disclosed
- Sample size determination explained
- Exclusion criteria and numbers reported
- Attrition documented
- Deviations from protocol noted
- Conflicts of interest disclosed
Open Science
- Data sharing planned (when ethical)
- Analysis code shared
- Materials available
- Preprint posted
- Open access publication when possible
Post-Study Considerations
Data Analysis
- Follow preregistered plan
- Clearly label deviations and exploratory analyses
- Check assumptions
- Report all outcomes
- Report effect sizes and CIs, not just p-values
Interpretation
- Conclusions supported by data
- Limitations acknowledged
- Alternative explanations considered
- Generalizability discussed
- Clinical/practical significance addressed
Dissemination
- Publish regardless of results (reduce publication bias)
- Present at conferences
- Share findings with participants (when appropriate)
- Communicate to relevant stakeholders
- Plain language summaries
Next Steps
- Replication needed?
- Follow-up studies identified
- Mechanism studies planned
- Clinical applications considered
Common Pitfalls to Avoid
- No power analysis → underpowered study
- Hypothesis formed after seeing data (HARKing)
- No blinding when feasible → bias
- P-hacking (data fishing, optional stopping)
- Multiple testing without correction → false positives
- Inadequate control group
- Confounding not addressed
- Instruments not validated
- High attrition not addressed
- Cherry-picking results to report
- Causal language from correlational data
- Ignoring assumptions of statistical tests
- Not preregistering → selective reporting and publication bias in the literature
- Conflicts of interest not disclosed
Final Checklist Before Starting
- Research question is clear and important
- Hypothesis is testable and specific
- Study design is appropriate
- Sample size is adequate (power analysis)
- Measures are valid and reliable
- Confounds are controlled
- Randomization and blinding implemented
- Data collection is standardized
- Analysis plan is prespecified
- Ethical approval obtained
- Study is preregistered
- Resources are sufficient
- Team is trained
- Protocol is documented
- Backup plans exist for problems
Remember
Good experimental design is about:
- Asking clear questions
- Minimizing bias
- Maximizing validity
- Appropriate inference
- Transparency
- Reproducibility
The best time to think about these issues is before collecting data, not after.