# Experimental Design Checklist

## Research Question Formulation

### Is the Question Well-Formed?

- [ ] **Specific:** Clearly defined variables and relationships
- [ ] **Answerable:** Can be addressed with available methods
- [ ] **Relevant:** Addresses a gap in knowledge or practical need
- [ ] **Feasible:** Resources, time, and ethical considerations allow it
- [ ] **Falsifiable:** Can be proven wrong if incorrect

### Have You Reviewed the Literature?

- [ ] Identified what's already known
- [ ] Found gaps or contradictions to address
- [ ] Learned from methodological successes and failures
- [ ] Identified appropriate outcome measures
- [ ] Determined typical effect sizes in the field

## Hypothesis Development

### Is Your Hypothesis Testable?

- [ ] Makes specific, quantifiable predictions
- [ ] Variables are operationally defined
- [ ] Specifies the direction/nature of expected relationships
- [ ] Can be falsified by potential observations

### Types of Hypotheses

- [ ] **Null hypothesis (H₀):** No effect/relationship exists
- [ ] **Alternative hypothesis (H₁):** Effect/relationship exists
- [ ] **Directional vs. non-directional:** One-tailed vs. two-tailed tests

## Study Design Selection

### What Type of Study Is Appropriate?

**Experimental (Intervention) Studies:**

- [ ] **Randomized controlled trial (RCT):** Gold standard for causal inference
- [ ] **Quasi-experimental:** Manipulation without random assignment
- [ ] **Within-subjects:** Same participants in all conditions
- [ ] **Between-subjects:** Different participants per condition
- [ ] **Factorial:** Multiple independent variables
- [ ] **Crossover:** Participants receive multiple interventions sequentially

**Observational Studies:**

- [ ] **Cohort:** Follow groups over time
- [ ] **Case-control:** Compare those with/without the outcome
- [ ] **Cross-sectional:** Snapshot at one time point
- [ ] **Ecological:** Population-level data

**Consider:**

- [ ] Can you randomly assign participants?
- [ ] Can you manipulate the independent variable?
- [ ] Is the outcome rare (favor case-control) or common?
- [ ] Do you need to establish temporal sequence?
- [ ] What's feasible given ethical and practical constraints?

## Variables

### Independent Variables (Manipulated/Predictor)

- [ ] Clearly defined and operationalized
- [ ] Appropriate levels/categories chosen
- [ ] Manipulation is sufficient to test the hypothesis
- [ ] Manipulation check planned (if applicable)

### Dependent Variables (Outcome/Response)

- [ ] Directly measures the construct of interest
- [ ] Validated and reliable measurement
- [ ] Sensitive enough to detect expected effects
- [ ] Appropriate for the planned statistical analysis
- [ ] Primary outcome clearly designated

### Control Variables

- [ ] **Confounding variables identified:**
  - Variables that affect both IV and DV
  - Alternative explanations for findings
- [ ] **Strategy for control:**
  - Randomization
  - Matching
  - Stratification
  - Statistical adjustment
  - Restriction (inclusion/exclusion criteria)
  - Blinding

### Extraneous Variables

- [ ] Potential sources of noise identified
- [ ] Standardized procedures to minimize them
- [ ] Environmental factors controlled
- [ ] Time of day, setting, and equipment standardized

## Sampling

### Population Definition

- [ ] **Target population:** Who you want to generalize to
- [ ] **Accessible population:** Who you can actually sample from
- [ ] **Sample:** Who actually participates
- [ ] Differences between these documented

### Sampling Method

- [ ] **Probability sampling (preferred for generalizability):**
  - Simple random sampling
  - Stratified sampling
  - Cluster sampling
  - Systematic sampling
- [ ] **Non-probability sampling (common but limits generalizability):**
  - Convenience sampling
  - Purposive sampling
  - Snowball sampling
  - Quota sampling

### Sample Size

- [ ] **A priori power analysis conducted:**
  - Expected effect size (from literature or a pilot study)
  - Desired power (typically .80 or .90)
  - Significance level (typically .05)
  - Statistical test to be used
- [ ] Accounts for expected attrition/dropout
- [ ] Sufficient for planned subgroup analyses
- [ ] Practical constraints acknowledged

### Inclusion/Exclusion Criteria

- [ ] Clearly defined and justified
- [ ] Not overly restrictive (limits generalizability)
- [ ] Based on theoretical or practical considerations
- [ ] Ethical considerations addressed
- [ ] Documented and applied consistently

## Blinding and Randomization

### Randomization

- [ ] **What is randomized:**
  - Participant assignment to conditions
  - Order of conditions (within-subjects)
  - Stimuli/items presented
- [ ] **Method of randomization:**
  - Computer-generated random numbers
  - Random number tables
  - Coin flips (for very small studies)
- [ ] **Allocation concealment:**
  - Sequence generated before recruitment
  - Allocation hidden until after enrollment
  - Sequentially numbered, sealed envelopes (if needed)
- [ ] **Stratified randomization:**
  - Balance important variables across groups
  - Block randomization to ensure equal group sizes
- [ ] **Check randomization:**
  - Compare groups at baseline
  - Report any significant differences

### Blinding

- [ ] **Single-blind:** Participants don't know group assignment
- [ ] **Double-blind:** Participants and researchers don't know
- [ ] **Triple-blind:** Participants, researchers, and data analysts don't know
- [ ] **Blinding feasibility:**
  - Is true blinding possible?
  - Placebo/sham controls needed?
  - Identical appearance of interventions?
- [ ] **Blinding check:**
  - Assess whether blinding was maintained
  - Ask participants/researchers to guess assignments

## Control Groups and Conditions

### What Type of Control?
- [ ] **No-treatment control:** Natural course of the condition
- [ ] **Placebo control:** Inert treatment for comparison
- [ ] **Active control:** Standard treatment comparison
- [ ] **Wait-list control:** Delayed treatment
- [ ] **Attention control:** Matches contact time without the active ingredient

### Multiple Conditions

- [ ] Factorial designs for multiple factors
- [ ] Dose-response relationship assessment
- [ ] Mechanism testing with component analyses

## Procedures

### Protocol Development

- [ ] **Detailed, written protocol:**
  - Step-by-step procedures
  - Scripts for standardized instructions
  - Decision rules for handling issues
  - Data collection forms
- [ ] Pilot tested before the main study
- [ ] Staff trained to criterion
- [ ] Compliance monitoring planned

### Standardization

- [ ] Same instructions for all participants
- [ ] Same equipment and materials
- [ ] Same environment/setting when possible
- [ ] Same assessment timing
- [ ] Deviations from protocol documented

### Data Collection

- [ ] **When collected:**
  - Baseline measurements
  - Post-intervention
  - Follow-up timepoints
- [ ] **Who collects:**
  - Trained researchers
  - Blinded when possible
  - Inter-rater reliability established
- [ ] **How collected:**
  - Valid, reliable instruments
  - Standardized administration
  - Multiple methods if possible (triangulation)

## Measurement

### Validity

- [ ] **Face validity:** Appears to measure the construct
- [ ] **Content validity:** Covers all aspects of the construct
- [ ] **Criterion validity:** Correlates with a gold standard
  - Concurrent validity
  - Predictive validity
- [ ] **Construct validity:** Measures the theoretical construct
  - Convergent validity (correlates with related measures)
  - Discriminant validity (doesn't correlate with unrelated measures)

### Reliability

- [ ] **Test-retest:** Consistent over time
- [ ] **Internal consistency:** Items measure the same construct (Cronbach's α)
- [ ] **Inter-rater reliability:** Agreement between raters (Cohen's κ, ICC)
- [ ] **Parallel forms:** Alternative versions consistent

### Measurement Considerations

- [ ] Objective measures preferred when possible
- [ ] Validated instruments used when available
- [ ] Multiple measures of key constructs
- [ ] Sensitivity to change considered
- [ ] Floor/ceiling effects avoided
- [ ] Response formats appropriate
- [ ] Recall periods appropriate
- [ ] Cultural appropriateness considered

## Bias Minimization

### Selection Bias

- [ ] Random sampling when possible
- [ ] Clearly defined eligibility criteria
- [ ] Document who declines and why
- [ ] Minimize self-selection

### Performance Bias

- [ ] Standardized protocols
- [ ] Blinding of providers
- [ ] Monitor protocol adherence
- [ ] Document deviations

### Detection Bias

- [ ] Blinding of outcome assessors
- [ ] Objective measures when possible
- [ ] Standardized assessment procedures
- [ ] Multiple raters with reliability checks

### Attrition Bias

- [ ] Strategies to minimize dropout
- [ ] Track reasons for dropout
- [ ] Compare dropouts to completers
- [ ] Intention-to-treat analysis planned

### Reporting Bias

- [ ] Preregister the study and analysis plan
- [ ] Designate primary vs. secondary outcomes
- [ ] Commit to reporting all outcomes
- [ ] Distinguish planned from exploratory analyses

## Data Management

### Data Collection

- [ ] Data collection forms designed and tested
- [ ] REDCap, Qualtrics, or similar platforms
- [ ] Range checks and validation rules
- [ ] Regular backups
- [ ] Secure storage (HIPAA/GDPR compliant if needed)

### Data Quality

- [ ] Real-time data validation
- [ ] Regular quality checks
- [ ] Missing data patterns monitored
- [ ] Outliers identified and investigated
- [ ] Protocol deviations documented

### Data Security

- [ ] De-identification procedures
- [ ] Access controls
- [ ] Audit trails
- [ ] Compliance with regulations (IRB, HIPAA, GDPR)

## Statistical Analysis Planning

### Analysis Plan (Prespecify Before Data Collection)

- [ ] **Primary analysis:**
  - Statistical test(s) specified
  - Hypothesis clearly stated
  - Significance level set (usually α = .05)
  - One-tailed or two-tailed
- [ ] **Secondary analyses:**
  - Clearly designated as secondary
  - Exploratory analyses labeled as such
- [ ] **Multiple comparisons:**
  - Adjustment method specified (if needed)
  - A designated primary outcome protects against alpha inflation

### Assumptions

- [ ] Assumptions of statistical tests identified
- [ ] Plan to check assumptions
- [ ] Backup non-parametric alternatives
- [ ] Transformation options considered

### Missing Data

- [ ] Anticipated amount of missingness
- [ ] Missing data mechanism (MCAR, MAR, MNAR)
- [ ] Handling strategy:
  - Complete-case analysis
  - Multiple imputation
  - Maximum likelihood
- [ ] Sensitivity analyses planned

### Effect Sizes

- [ ] Appropriate effect size measures identified
- [ ] Will be reported alongside p-values
- [ ] Confidence intervals planned

### Statistical Software

- [ ] Software selected (R, SPSS, Stata, Python, etc.)
- [ ] Version documented
- [ ] Analysis scripts prepared in advance
- [ ] Will be made available (open science)

## Ethical Considerations

### Ethical Approval

- [ ] IRB/ethics committee approval obtained
- [ ] Study registered (ClinicalTrials.gov, etc.) if applicable
- [ ] Protocol follows the Declaration of Helsinki or equivalent

### Informed Consent

- [ ] Voluntary participation
- [ ] Comprehensible explanation
- [ ] Risks and benefits disclosed
- [ ] Right to withdraw without penalty
- [ ] Privacy protections explained
- [ ] Compensation disclosed

### Risk-Benefit Analysis

- [ ] Potential benefits outweigh risks
- [ ] Risks minimized
- [ ] Vulnerable populations protected
- [ ] Data safety monitoring (if high risk)

### Confidentiality

- [ ] Data de-identified
- [ ] Secure storage
- [ ] Limited access
- [ ] Reporting doesn't allow re-identification

## Validity Threats

### Internal Validity (Causation)

- [ ] **History:** External events between measurements
- [ ] **Maturation:** Changes in participants over time
- [ ] **Testing:** Effects of repeated measurement
- [ ] **Instrumentation:** Changes in measurement over time
- [ ] **Regression to the mean:** Extreme scores becoming less extreme
- [ ] **Selection:** Groups differ at baseline
- [ ] **Attrition:** Differential dropout
- [ ] **Diffusion:** Control group receives treatment elements

### External Validity (Generalizability)

- [ ] Sample representative of the population
- [ ] Setting realistic/natural
- [ ] Treatment typical of real-world implementation
- [ ] Outcome measures ecologically valid
- [ ] Time frame appropriate

### Construct Validity (Measurement)

- [ ] Measures actually tap the intended constructs
- [ ] Operations match theoretical definitions
- [ ] No confounding of constructs
- [ ] Adequate coverage of the construct

### Statistical Conclusion Validity

- [ ] Adequate statistical power
- [ ] Assumptions met
- [ ] Appropriate tests used
- [ ] Alpha level appropriate
- [ ] Multiple comparisons addressed

## Reporting and Transparency

### Preregistration

- [ ] Study preregistered (OSF, ClinicalTrials.gov, AsPredicted)
- [ ] Hypotheses stated a priori
- [ ] Analysis plan documented
- [ ] Confirmatory distinguished from exploratory

### Reporting Guidelines

- [ ] **RCTs:** CONSORT checklist
- [ ] **Observational studies:** STROBE checklist
- [ ] **Systematic reviews:** PRISMA checklist
- [ ] **Diagnostic studies:** STARD checklist
- [ ] **Qualitative research:** COREQ checklist
- [ ] **Case reports:** CARE guidelines

### Transparency

- [ ] All measures reported
- [ ] All manipulations disclosed
- [ ] Sample size determination explained
- [ ] Exclusion criteria and numbers reported
- [ ] Attrition documented
- [ ] Deviations from protocol noted
- [ ] Conflicts of interest disclosed

### Open Science

- [ ] Data sharing planned (when ethical)
- [ ] Analysis code shared
- [ ] Materials available
- [ ] Preprint posted
- [ ] Open-access publication when possible

## Post-Study Considerations

### Data Analysis

- [ ] Follow the preregistered plan
- [ ] Clearly label deviations and exploratory analyses
- [ ] Check assumptions
- [ ] Report all outcomes
- [ ] Report effect sizes and CIs, not just p-values

### Interpretation

- [ ] Conclusions supported by the data
- [ ] Limitations acknowledged
- [ ] Alternative explanations considered
- [ ] Generalizability discussed
- [ ] Clinical/practical significance addressed

### Dissemination

- [ ] Publish regardless of results (reduces publication bias)
- [ ] Present at conferences
- [ ] Share findings with participants (when appropriate)
- [ ] Communicate to relevant stakeholders
- [ ] Plain-language summaries

### Next Steps

- [ ] Replication needed?
- [ ] Follow-up studies identified
- [ ] Mechanism studies planned
- [ ] Clinical applications considered

## Common Pitfalls to Avoid

- [ ] No power analysis → underpowered study
- [ ] Hypothesis formed after seeing the data (HARKing)
- [ ] No blinding when feasible → bias
- [ ] P-hacking (data fishing, optional stopping)
- [ ] Multiple testing without correction → false positives
- [ ] Inadequate control group
- [ ] Confounding not addressed
- [ ] Instruments not validated
- [ ] High attrition not addressed
- [ ] Cherry-picking results to report
- [ ] Causal language applied to correlational data
- [ ] Ignoring assumptions of statistical tests
- [ ] Not preregistering → selective reporting and literature bias
- [ ] Conflicts of interest not disclosed

## Final Checklist Before Starting

- [ ] Research question is clear and important
- [ ] Hypothesis is testable and specific
- [ ] Study design is appropriate
- [ ] Sample size is adequate (power analysis)
- [ ] Measures are valid and reliable
- [ ] Confounds are controlled
- [ ] Randomization and blinding implemented
- [ ] Data collection is standardized
- [ ] Analysis plan is prespecified
- [ ] Ethical approval obtained
- [ ] Study is preregistered
- [ ] Resources are sufficient
- [ ] Team is trained
- [ ] Protocol is documented
- [ ] Backup plans exist for problems

## Remember

**Good experimental design is about:**

- Asking clear questions
- Minimizing bias
- Maximizing validity
- Appropriate inference
- Transparency
- Reproducibility

**The best time to think about these issues is before collecting data, not after.**
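## Appendix: Worked Examples

The "A priori power analysis" items under Sample Size can be made concrete. The sketch below is a minimal illustration (not a replacement for a dedicated tool such as R's `pwr` package or G*Power) of the standard normal-approximation formula for a two-sided, two-sample comparison of means, including the attrition inflation the checklist calls for. The function name and defaults are illustrative.

```python
import math
from statistics import NormalDist

def sample_size_per_group(d, alpha=0.05, power=0.80, dropout=0.0):
    """Approximate n per group for a two-sided, two-sample comparison
    of means via the normal approximation:

        n ≈ 2 * ((z_{1-α/2} + z_{power}) / d)^2

    where d is the expected effect size (Cohen's d). The result is
    rounded up and inflated for the expected dropout rate."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for two-sided α
    z_power = z.inv_cdf(power)           # quantile for desired power
    n = 2 * ((z_alpha + z_power) / d) ** 2
    return math.ceil(n / (1 - dropout))  # round up; allow for attrition

# A medium effect (d = 0.5) at α = .05 and 80% power needs ~63 per group;
# planning for 20% dropout raises the recruitment target to ~79 per group.
print(sample_size_per_group(0.5))               # 63
print(sample_size_per_group(0.5, dropout=0.2))  # 79
```

The exact t-test calculation gives a slightly larger n (about 64 per group for d = 0.5); the normal approximation is close enough for planning but a validated tool should produce the final number.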
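The block-randomization item under Stratified Randomization can likewise be sketched in a few lines. This is an illustrative implementation only, not a substitute for validated trial software; in a real trial the seed and sequence would be generated and concealed by someone independent of recruitment, per the allocation-concealment items above.

```python
import random

def block_randomize(n, arms=("treatment", "control"), block_size=4, seed=None):
    """Build a block-randomized allocation sequence: within every block,
    each arm appears equally often, so group sizes stay balanced no
    matter when recruitment stops."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)          # random order within the block
        sequence.extend(block)
    return sequence[:n]

# 20 participants, two arms, blocks of 4: after every 4th enrollment the
# groups are exactly balanced (10 vs. 10 overall).
allocation = block_randomize(20, seed=42)
print(allocation.count("treatment"), allocation.count("control"))  # 10 10
```

Small fixed block sizes make the tail of each block partly predictable to unblinded staff; varying the block size randomly is a common mitigation.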