Statistical Test Selection Guide

This guide provides a decision tree for selecting appropriate statistical tests based on research questions, data types, and study designs.

Decision Tree for Test Selection

1. Comparing Groups

Two Independent Groups

  • Continuous outcome, normally distributed: Independent samples t-test
  • Continuous outcome, non-normal: Mann-Whitney U test (Wilcoxon rank-sum test)
  • Binary outcome: Chi-square test, or Fisher's exact test if any expected cell count is below 5
  • Ordinal outcome: Mann-Whitney U test
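The first two branches above can be sketched as follows. This is a rough illustration that assumes SciPy is installed. The function name compare_two_groups and the 0.05 normality threshold are illustrative choices, not part of this guide. The sketch uses Welch's variant of the independent samples t-test (equal_var=False), which does not assume equal variances. A formal normality test is only one way to judge the assumption, and plots are often more informative.

```python
from scipy import stats


def compare_two_groups(a, b, alpha_normality=0.05):
    """Pick a two-sample test from a Shapiro-Wilk check (illustrative sketch)."""
    # Shapiro-Wilk: a small p-value suggests the sample is not normal.
    normal = all(stats.shapiro(g)[1] > alpha_normality for g in (a, b))
    if normal:
        # equal_var=False selects Welch's t-test rather than Student's.
        stat, p = stats.ttest_ind(a, b, equal_var=False)
        name = "Welch t-test"
    else:
        stat, p = stats.mannwhitneyu(a, b, alternative="two-sided")
        name = "Mann-Whitney U"
    return name, stat, p


group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 4.7]
group_b = [6.2, 6.8, 7.1, 6.5, 7.4, 6.9, 6.3, 7.0]
name, stat, p = compare_two_groups(group_a, group_b)
print(f"{name}: statistic={stat:.3f}, p={p:.5f}")
```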

Two Paired/Dependent Groups

  • Continuous outcome, normally distributed: Paired t-test
  • Continuous outcome, non-normal: Wilcoxon signed-rank test
  • Binary outcome: McNemar's test
  • Ordinal outcome: Wilcoxon signed-rank test
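For the paired binary case, McNemar's test depends only on the discordant pairs. A minimal sketch of the exact (binomial) version follows, using only the standard library. The function name mcnemar_exact and the argument names b and c (the two discordant counts) are assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant-pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    # Under H0 the discordant pairs split 50/50: b ~ Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts: 1 pair changed yes -> no, 9 pairs changed no -> yes.
print(mcnemar_exact(1, 9))
```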

Three or More Independent Groups

  • Continuous outcome, normally distributed, equal variances: One-way ANOVA
  • Continuous outcome, normally distributed, unequal variances: Welch's ANOVA
  • Continuous outcome, non-normal: Kruskal-Wallis H test
  • Binary/categorical outcome: Chi-square test
  • Ordinal outcome: Kruskal-Wallis H test

Three or More Paired/Dependent Groups

  • Continuous outcome, normally distributed: Repeated measures ANOVA
  • Continuous outcome, non-normal: Friedman test
  • Binary outcome: Cochran's Q test

Multiple Factors (Factorial Designs)

  • Continuous outcome: Two-way ANOVA (or higher-way ANOVA)
  • With covariates: ANCOVA
  • Mixed within and between factors: Mixed ANOVA
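The group-comparison branches of this section (excluding factorial designs) can be encoded as a small lookup, which may help when the same decision is made repeatedly. The function name select_group_test and its argument names are hypothetical. Mapping ordinal outcomes with three or more paired groups to the Friedman test is an assumption added here, since the list above does not cover that case.

```python
def select_group_test(n_groups, paired, outcome, normal=True, equal_var=True):
    """Return the test this guide suggests for a group comparison.

    outcome is one of "continuous", "ordinal", "binary".
    """
    if n_groups < 2:
        raise ValueError("need at least two groups")
    if outcome not in ("continuous", "ordinal", "binary"):
        raise ValueError(f"unknown outcome type: {outcome!r}")
    two = n_groups == 2
    if outcome == "binary":
        if paired:
            return "McNemar's test" if two else "Cochran's Q test"
        return "Chi-square or Fisher's exact test" if two else "Chi-square test"
    # Ordinal and non-normal continuous outcomes use rank-based tests.
    if outcome == "ordinal" or not normal:
        if paired:
            return "Wilcoxon signed-rank test" if two else "Friedman test"
        return "Mann-Whitney U test" if two else "Kruskal-Wallis H test"
    # Remaining case: continuous and approximately normal.
    if paired:
        return "Paired t-test" if two else "Repeated measures ANOVA"
    if two:
        return "Independent samples t-test"
    return "One-way ANOVA" if equal_var else "Welch's ANOVA"


print(select_group_test(3, paired=False, outcome="continuous", equal_var=False))
```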

2. Relationships Between Variables

Two Continuous Variables

  • Linear relationship, bivariate normal: Pearson correlation
  • Monotonic relationship or non-normal: Spearman rank correlation
  • Rank-based data: Spearman or Kendall's tau
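To see why the distinction between linear and monotonic relationships matters, the sketch below computes both coefficients by hand on made-up data where y is a cubic function of x. The helper names (pearson, average_ranks, spearman) are illustrative. In practice a statistics library would normally be used, and it would also provide p-values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic but clearly non-linear
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

Because the relationship is perfectly monotonic, the rank-based coefficient reaches its maximum, while the Pearson coefficient falls below it.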

One Continuous Outcome, One or More Predictors

  • Single continuous predictor: Simple linear regression
  • Multiple continuous/categorical predictors: Multiple linear regression
  • Categorical predictors: ANOVA/ANCOVA framework
  • Non-linear relationships: Polynomial regression or generalized additive models (GAM)

Binary Outcome

  • Single predictor: Logistic regression
  • Multiple predictors: Multiple logistic regression
  • Rare events: Exact logistic regression or Firth's method

Count Outcome

  • Poisson-distributed: Poisson regression
  • Overdispersed counts: Negative binomial regression
  • Zero-inflated: Zero-inflated Poisson/negative binomial
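A quick screen for the three cases above is to compare the variance of the counts with their mean, and the observed share of zeros with the share a Poisson distribution would predict. The sketch below is only a rough heuristic on raw counts (it ignores covariates), the function names are hypothetical, and the data are invented for illustration.

```python
from math import exp
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by the mean; close to 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


def zero_excess(counts):
    """Observed zero fraction and the fraction expected under Poisson(mean)."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    expected = exp(-mean(counts))
    return observed, expected


counts = [0, 0, 0, 1, 2, 10, 15, 0]  # made-up example data
print(dispersion_index(counts))  # well above 1 suggests overdispersion
print(zero_excess(counts))       # observed >> expected suggests zero inflation
```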

Time-to-Event Outcome

  • Comparing survival curves: Log-rank test
  • Modeling with covariates: Cox proportional hazards regression
  • Parametric survival models: Weibull, exponential, log-normal
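As an illustration of how the log-rank test compares survival curves, here is a minimal two-group sketch using only the standard library. At each event time it compares the observed events in group A with the number expected if both groups shared one hazard. The function name logrank_test and the 1/0 event coding are assumptions made for this example. A dedicated survival analysis package would be the usual choice for real analyses.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 (observed) or 0 (censored)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = var_sum = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if g == 0 and u >= t)  # at risk in A
        n_b = sum(1 for u, _, g in data if g == 1 and u >= t)  # at risk in B
        d_a = sum(1 for u, e, g in data if g == 0 and u == t and e)
        d_b = sum(1 for u, e, g in data if g == 1 and u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            var_sum += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if var_sum == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / var_sum
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


# Made-up data: every subject has an observed event, group B survives longer.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(chi2, p)
```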

3. Agreement and Reliability

Inter-Rater Reliability

  • Categorical ratings, 2 raters: Cohen's kappa
  • Categorical ratings, >2 raters: Fleiss' kappa or Krippendorff's alpha
  • Continuous ratings: Intraclass correlation coefficient (ICC)
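Cohen's kappa corrects the raw agreement rate for the agreement expected by chance. A minimal unweighted sketch for two raters follows. The function name and the example ratings are made up for illustration.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater1)
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement from each rater's marginal label frequencies.
    p_expected = sum(counts1[k] * counts2[k] for k in counts1) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this toy example the raters agree on 3 of 4 items, but half of that agreement is expected by chance, so kappa is noticeably lower than the raw 75% agreement.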

Test-Retest Reliability

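  • Continuous measurements: ICC (preferred), or Pearson correlation, which does not detect a systematic shift between sessions
  • Internal consistency of a multi-item scale (a related but distinct form of reliability): Cronbach's alpha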

Agreement Between Methods

  • Continuous measurements: Bland-Altman analysis
  • Categorical classifications: Cohen's kappa
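The core quantities of a Bland-Altman analysis are the mean difference between methods (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. A minimal sketch follows, with a hypothetical function name and invented measurements. The accompanying plot of differences against pairwise means is omitted.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


bias, lower, upper = bland_altman([10, 12, 14, 16], [9, 12, 13, 17])
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```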

4. Categorical Data Analysis

Contingency Tables

  • 2x2 table: Chi-square test or Fisher's exact test
  • Larger than 2x2: Chi-square test
  • Ordered categories: Cochran-Armitage trend test (binary outcome across ordered groups)
  • Paired categories: McNemar's test (2x2) or McNemar-Bowker test (larger)
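For a 2x2 table, the choice between the chi-square and Fisher's exact test is commonly based on the expected cell counts. The sketch below computes those counts and also implements a two-sided Fisher's exact test from the hypergeometric distribution, using only the standard library. Function names and the threshold of 5 follow the rule of thumb used in this guide and are otherwise illustrative. SciPy's scipy.stats.fisher_exact and scipy.stats.chi2_contingency cover the same ground.

```python
from math import comb


def expected_counts(table):
    """Expected cell counts under independence for an r x c table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    return [[r * c / total for c in cols] for r in rows]


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as unlikely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def choose_2x2_test(table, min_expected=5):
    smallest = min(min(row) for row in expected_counts(table))
    return "Fisher's exact test" if smallest < min_expected else "Chi-square test"


table = [[3, 1], [1, 3]]  # made-up counts
print(choose_2x2_test(table), fisher_exact_2x2(table))
```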

5. Bayesian Alternatives

Many of the analyses above have Bayesian counterparts:

  • Group comparisons: Bayesian t-test, Bayesian ANOVA
  • Correlations: Bayesian correlation
  • Regression: Bayesian linear/logistic regression

Advantages of Bayesian approaches:

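  • Provide the probability of hypotheses given the data
  • Incorporate prior information naturally
  • Yield credible intervals, which support direct probability statements about parameters
  • Avoid common p-value misinterpretations, although conclusions can be sensitive to the choice of prior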
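To make the first and third points concrete, here is a small sketch comparing two success rates with a conjugate Beta-Binomial model and Monte Carlo draws from the standard library. It is not one of the specific Bayesian tests named above. The uniform Beta(1, 1) priors, the function name, and the counts are all assumptions chosen for illustration.

```python
import random


def posterior_prob_b_greater(succ_a, n_a, succ_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A | data) under independent Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    diffs = []
    for _ in range(draws):
        # Conjugate update: posterior is Beta(1 + successes, 1 + failures).
        pa = rng.betavariate(1 + succ_a, 1 + n_a - succ_a)
        pb = rng.betavariate(1 + succ_b, 1 + n_b - succ_b)
        diffs.append(pb - pa)
        wins += pb > pa
    diffs.sort()
    # Approximate 95% credible interval for the difference in rates.
    interval = (diffs[int(0.025 * draws)], diffs[int(0.975 * draws) - 1])
    return wins / draws, interval


# Hypothetical data: 10/100 successes in group A, 30/100 in group B.
prob, interval = posterior_prob_b_greater(10, 100, 30, 100)
print(prob, interval)
```

The first return value is read directly as the posterior probability that group B has the higher rate, which is the kind of statement a p-value does not provide.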

Key Considerations

Sample Size

  • Small samples (n < 30 is a common rule of thumb): Consider non-parametric tests or exact methods
  • Very large samples: Even small effects may be statistically significant; focus on effect sizes
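The second point can be illustrated with a back-of-the-envelope calculation. The sketch below uses a normal approximation with known unit variance and equal group sizes (simplifying assumptions), and shows how the p-value for a fixed, very small standardized difference shrinks as the sample grows. The function name is hypothetical.

```python
from math import erfc, sqrt


def two_sample_z_pvalue(effect_size_d, n_per_group):
    """Two-sided p-value when the observed standardized difference equals d."""
    # Standard error of a difference in means with unit variance is sqrt(2 / n).
    z = effect_size_d * sqrt(n_per_group / 2)
    return erfc(abs(z) / sqrt(2))


for n in (50, 500, 5000, 20000):
    print(n, two_sample_z_pvalue(0.05, n))
```

The effect size stays at d = 0.05 throughout, so only the sample size changes the verdict on significance.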

Multiple Comparisons

  • When conducting multiple tests, control the overall error rate using one of:
    • Bonferroni correction (controls the family-wise error rate; conservative)
    • Holm-Bonferroni (same family-wise control, never less powerful than Bonferroni)
    • False Discovery Rate (FDR) control (Benjamini-Hochberg)
    • Tukey HSD for post-hoc pairwise comparisons after ANOVA
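The first three procedures differ only in the threshold each p-value is compared against. A minimal standard-library sketch follows, with illustrative function names and made-up p-values. Each function returns one reject/keep decision per input p-value, in the original order.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject when p <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def holm(pvals, alpha=0.05):
    """Step-down: compare sorted p-values to alpha / (m - rank), stop at first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] > alpha / (m - rank):
            break
        reject[i] = True
    return reject


def benjamini_hochberg(pvals, alpha=0.05):
    """Step-up: reject the k smallest, where k is the largest rank with p <= k/m * alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject


pvals = [0.04, 0.01, 0.03, 0.015]
print(bonferroni(pvals))          # [False, True, False, False]
print(holm(pvals))                # [False, True, False, True]
print(benjamini_hochberg(pvals))  # [True, True, True, True]
```

On this example the three methods reject one, two, and four hypotheses respectively, which reflects their ordering from most to least conservative.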

Missing Data

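  • Complete case analysis (listwise deletion): simple, but loses power and is generally unbiased only under MCAR
  • Multiple imputation: valid under MAR when the imputation model is adequate
  • Maximum likelihood methods (e.g., full-information maximum likelihood): also valid under MAR
  • Identify the plausible missing data mechanism (MCAR, MAR, MNAR) before choosing among these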

Effect Sizes

  • Always report effect sizes alongside p-values
  • See effect_sizes_and_power.md for guidance
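As one common example of an effect size for a two-group comparison, Cohen's d divides the difference in means by the pooled standard deviation. The sketch below uses only the standard library. The function name and data are illustrative, and the referenced file may recommend other measures.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


print(cohens_d([2, 4, 6, 8], [1, 3, 5, 7]))
```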

Study Design Considerations

  • Randomized controlled trials: Standard parametric/non-parametric tests
  • Observational studies: Consider confounding and use regression/matching
  • Clustered/nested data: Use mixed-effects models or GEE
  • Time series: Use time series methods (ARIMA, etc.)