# Statistical Test Selection Guide This guide provides a decision tree for selecting appropriate statistical tests based on research questions, data types, and study designs. ## Decision Tree for Test Selection ### 1. Comparing Groups #### Two Independent Groups - **Continuous outcome, normally distributed**: Independent samples t-test - **Continuous outcome, non-normal**: Mann-Whitney U test (Wilcoxon rank-sum test) - **Binary outcome**: Chi-square test or Fisher's exact test (if expected counts < 5) - **Ordinal outcome**: Mann-Whitney U test #### Two Paired/Dependent Groups - **Continuous outcome, normally distributed**: Paired t-test - **Continuous outcome, non-normal**: Wilcoxon signed-rank test - **Binary outcome**: McNemar's test - **Ordinal outcome**: Wilcoxon signed-rank test #### Three or More Independent Groups - **Continuous outcome, normally distributed, equal variances**: One-way ANOVA - **Continuous outcome, normally distributed, unequal variances**: Welch's ANOVA - **Continuous outcome, non-normal**: Kruskal-Wallis H test - **Binary/categorical outcome**: Chi-square test - **Ordinal outcome**: Kruskal-Wallis H test #### Three or More Paired/Dependent Groups - **Continuous outcome, normally distributed**: Repeated measures ANOVA - **Continuous outcome, non-normal**: Friedman test - **Binary outcome**: Cochran's Q test #### Multiple Factors (Factorial Designs) - **Continuous outcome**: Two-way ANOVA (or higher-way ANOVA) - **With covariates**: ANCOVA - **Mixed within and between factors**: Mixed ANOVA ### 2. Relationships Between Variables #### Two Continuous Variables - **Linear relationship, bivariate normal**: Pearson correlation - **Monotonic relationship or non-normal**: Spearman rank correlation - **Rank-based data**: Spearman or Kendall's tau #### One Continuous Outcome, One or More Predictors - **Single continuous predictor**: Simple linear regression - **Multiple continuous/categorical predictors**: Multiple linear regression - **Categorical predictors**: ANOVA/ANCOVA framework - **Non-linear relationships**: Polynomial regression or generalized additive models (GAM) #### Binary Outcome - **Single predictor**: Logistic regression - **Multiple predictors**: Multiple logistic regression - **Rare events**: Exact logistic regression or Firth's method #### Count Outcome - **Poisson-distributed**: Poisson regression - **Overdispersed counts**: Negative binomial regression - **Zero-inflated**: Zero-inflated Poisson/negative binomial #### Time-to-Event Outcome - **Comparing survival curves**: Log-rank test - **Modeling with covariates**: Cox proportional hazards regression - **Parametric survival models**: Weibull, exponential, log-normal ### 3. Agreement and Reliability #### Inter-Rater Reliability - **Categorical ratings, 2 raters**: Cohen's kappa - **Categorical ratings, >2 raters**: Fleiss' kappa or Krippendorff's alpha - **Continuous ratings**: Intraclass correlation coefficient (ICC) #### Test-Retest Reliability - **Continuous measurements**: ICC or Pearson correlation - **Internal consistency**: Cronbach's alpha #### Agreement Between Methods - **Continuous measurements**: Bland-Altman analysis - **Categorical classifications**: Cohen's kappa ### 4. Categorical Data Analysis #### Contingency Tables - **2x2 table**: Chi-square test or Fisher's exact test - **Larger than 2x2**: Chi-square test - **Ordered categories**: Cochran-Armitage trend test - **Paired categories**: McNemar's test (2x2) or McNemar-Bowker test (larger) ### 5. Bayesian Alternatives Any of the above tests can be performed using Bayesian methods: - **Group comparisons**: Bayesian t-test, Bayesian ANOVA - **Correlations**: Bayesian correlation - **Regression**: Bayesian linear/logistic regression **Advantages of Bayesian approaches:** - Provides probability of hypotheses given data - Naturally incorporates prior information - Provides credible intervals instead of confidence intervals - No p-value interpretation issues ## Key Considerations ### Sample Size - Small samples (n < 30): Consider non-parametric tests or exact methods - Very large samples: Even small effects may be statistically significant; focus on effect sizes ### Multiple Comparisons - When conducting multiple tests, adjust for multiple comparisons using: - Bonferroni correction (conservative) - Holm-Bonferroni (less conservative) - False Discovery Rate (FDR) control (Benjamini-Hochberg) - Tukey HSD for post-hoc ANOVA comparisons ### Missing Data - Complete case analysis (listwise deletion) - Multiple imputation - Maximum likelihood methods - Ensure missing data mechanism is understood (MCAR, MAR, MNAR) ### Effect Sizes - Always report effect sizes alongside p-values - See `effect_sizes_and_power.md` for guidance ### Study Design Considerations - Randomized controlled trials: Standard parametric/non-parametric tests - Observational studies: Consider confounding and use regression/matching - Clustered/nested data: Use mixed-effects models or GEE - Time series: Use time series methods (ARIMA, etc.)