Initial commit
This commit is contained in:
661
skills/statistical-analysis/references/bayesian_statistics.md
Normal file
661
skills/statistical-analysis/references/bayesian_statistics.md
Normal file
@@ -0,0 +1,661 @@
|
||||
# Bayesian Statistical Analysis
|
||||
|
||||
This document provides guidance on conducting and interpreting Bayesian statistical analyses, which offer an alternative framework to frequentist (classical) statistics.
|
||||
|
||||
## Bayesian vs. Frequentist Philosophy
|
||||
|
||||
### Fundamental Differences
|
||||
|
||||
| Aspect | Frequentist | Bayesian |
|
||||
|--------|-------------|----------|
|
||||
| **Probability interpretation** | Long-run frequency of events | Degree of belief/uncertainty |
|
||||
| **Parameters** | Fixed but unknown | Random variables with distributions |
|
||||
| **Inference** | Based on sampling distributions | Based on posterior distributions |
|
||||
| **Primary output** | p-values, confidence intervals | Posterior probabilities, credible intervals |
|
||||
| **Prior information** | Not formally incorporated | Explicitly incorporated via priors |
|
||||
| **Hypothesis testing** | Reject/fail to reject null | Probability of hypotheses given data |
|
||||
| **Sample size** | Often requires minimum | Can work with any sample size |
|
||||
| **Interpretation** | Indirect (probability of data given H₀) | Direct (probability of hypothesis given data) |
|
||||
|
||||
### Key Question Difference
|
||||
|
||||
**Frequentist**: "If the null hypothesis is true, what is the probability of observing data this extreme or more extreme?"
|
||||
|
||||
**Bayesian**: "Given the observed data, what is the probability that the hypothesis is true?"
|
||||
|
||||
The Bayesian question is more intuitive and directly addresses what researchers want to know.
|
||||
|
||||
---
|
||||
|
||||
## Bayes' Theorem
|
||||
|
||||
**Formula**:
|
||||
```
|
||||
P(θ|D) = P(D|θ) × P(θ) / P(D)
|
||||
```
|
||||
|
||||
**In words**:
|
||||
```
|
||||
Posterior = Likelihood × Prior / Evidence
|
||||
```
|
||||
|
||||
Where:
|
||||
- **θ (theta)**: Parameter of interest (e.g., mean difference, correlation)
|
||||
- **D**: Observed data
|
||||
- **P(θ|D)**: Posterior distribution (belief about θ after seeing data)
|
||||
- **P(D|θ)**: Likelihood (probability of data given θ)
|
||||
- **P(θ)**: Prior distribution (belief about θ before seeing data)
|
||||
- **P(D)**: Marginal likelihood/evidence (normalizing constant)
|
||||
|
||||
---
|
||||
|
||||
## Prior Distributions
|
||||
|
||||
### Types of Priors
|
||||
|
||||
#### 1. Informative Priors
|
||||
|
||||
**When to use**: When you have substantial prior knowledge from:
|
||||
- Previous studies
|
||||
- Expert knowledge
|
||||
- Theory
|
||||
- Pilot data
|
||||
|
||||
**Example**: Meta-analysis shows effect size d ≈ 0.40, SD = 0.15
|
||||
- Prior: Normal(0.40, 0.15)
|
||||
|
||||
**Advantages**:
|
||||
- Incorporates existing knowledge
|
||||
- More efficient (smaller samples needed)
|
||||
- Can stabilize estimates with small data
|
||||
|
||||
**Disadvantages**:
|
||||
- Subjective (but subjectivity can be strength)
|
||||
- Must be justified and transparent
|
||||
- May be controversial if strong prior conflicts with data
|
||||
|
||||
---
|
||||
|
||||
#### 2. Weakly Informative Priors
|
||||
|
||||
**When to use**: Default choice for most applications
|
||||
|
||||
**Characteristics**:
|
||||
- Regularizes estimates (prevents extreme values)
|
||||
- Has minimal influence on posterior with moderate data
|
||||
- Prevents computational issues
|
||||
|
||||
**Example priors**:
|
||||
- Effect size: Normal(0, 1) or Cauchy(0, 0.707)
|
||||
- Variance: Half-Cauchy(0, 1)
|
||||
- Correlation: Uniform(-1, 1) or Beta(2, 2)
|
||||
|
||||
**Advantages**:
|
||||
- Balances objectivity and regularization
|
||||
- Computationally stable
|
||||
- Broadly acceptable
|
||||
|
||||
---
|
||||
|
||||
#### 3. Non-Informative (Flat/Uniform) Priors
|
||||
|
||||
**When to use**: When attempting to be "objective"
|
||||
|
||||
**Example**: Uniform(-∞, ∞) for any value
|
||||
|
||||
**⚠️ Caution**:
|
||||
- Can lead to improper posteriors
|
||||
- May produce non-sensible results
|
||||
- Not truly "non-informative" (still makes assumptions)
|
||||
- Often not recommended in modern Bayesian practice
|
||||
|
||||
**Better alternative**: Use weakly informative priors
|
||||
|
||||
---
|
||||
|
||||
### Prior Sensitivity Analysis
|
||||
|
||||
**Always conduct**: Test how results change with different priors
|
||||
|
||||
**Process**:
|
||||
1. Fit model with default/planned prior
|
||||
2. Fit model with more diffuse prior
|
||||
3. Fit model with more concentrated prior
|
||||
4. Compare posterior distributions
|
||||
|
||||
**Reporting**:
|
||||
- If results are similar: Evidence is robust
|
||||
- If results differ substantially: Data are not strong enough to overwhelm prior
|
||||
|
||||
**Python example**:
|
||||
```python
|
||||
import pymc as pm
|
||||
|
||||
# Model with different priors
|
||||
priors = [
|
||||
('weakly_informative', pm.Normal.dist(0, 1)),
|
||||
('diffuse', pm.Normal.dist(0, 10)),
|
||||
('informative', pm.Normal.dist(0.5, 0.3))
|
||||
]
|
||||
|
||||
results = {}
|
||||
for name, prior in priors:
|
||||
with pm.Model():
|
||||
effect = pm.Normal('effect', mu=prior.mu, sigma=prior.sigma)
|
||||
# ... rest of model
|
||||
trace = pm.sample()
|
||||
results[name] = trace
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Bayesian Hypothesis Testing
|
||||
|
||||
### Bayes Factor (BF)
|
||||
|
||||
**What it is**: Ratio of evidence for two competing hypotheses
|
||||
|
||||
**Formula**:
|
||||
```
|
||||
BF₁₀ = P(D|H₁) / P(D|H₀)
|
||||
```
|
||||
|
||||
**Interpretation**:
|
||||
|
||||
| BF₁₀ | Evidence |
|
||||
|------|----------|
|
||||
| >100 | Decisive for H₁ |
|
||||
| 30-100 | Very strong for H₁ |
|
||||
| 10-30 | Strong for H₁ |
|
||||
| 3-10 | Moderate for H₁ |
|
||||
| 1-3 | Anecdotal for H₁ |
|
||||
| 1 | No evidence |
|
||||
| 1/3-1 | Anecdotal for H₀ |
|
||||
| 1/10-1/3 | Moderate for H₀ |
|
||||
| 1/30-1/10 | Strong for H₀ |
|
||||
| 1/100-1/30 | Very strong for H₀ |
|
||||
| <1/100 | Decisive for H₀ |
|
||||
|
||||
**Advantages over p-values**:
|
||||
1. Can provide evidence for null hypothesis
|
||||
2. Not dependent on sampling intentions (no "peeking" problem)
|
||||
3. Directly quantifies evidence
|
||||
4. Can be updated with more data
|
||||
|
||||
**Python calculation**:
|
||||
```python
|
||||
import pingouin as pg
|
||||
|
||||
# Note: Limited BF support in Python
|
||||
# Better options: R packages (BayesFactor), JASP software
|
||||
|
||||
# Approximate BF from t-statistic
|
||||
# Using Jeffreys-Zellner-Siow prior
|
||||
from scipy import stats
|
||||
|
||||
def bf_from_t(t, n1, n2, r_scale=0.707):
|
||||
"""
|
||||
Approximate Bayes Factor from t-statistic
|
||||
r_scale: Cauchy prior scale (default 0.707 for medium effect)
|
||||
"""
|
||||
# This is simplified; use dedicated packages for accurate calculation
|
||||
df = n1 + n2 - 2
|
||||
# Implementation requires numerical integration
|
||||
pass
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Region of Practical Equivalence (ROPE)
|
||||
|
||||
**Purpose**: Define range of negligible effect sizes
|
||||
|
||||
**Process**:
|
||||
1. Define ROPE (e.g., d ∈ [-0.1, 0.1] for negligible effects)
|
||||
2. Calculate % of posterior inside ROPE
|
||||
3. Make decision:
|
||||
- >95% in ROPE: Accept practical equivalence
|
||||
- >95% outside ROPE: Reject equivalence
|
||||
- Otherwise: Inconclusive
|
||||
|
||||
**Advantage**: Directly tests for practical significance
|
||||
|
||||
**Python example**:
|
||||
```python
|
||||
# Define ROPE
|
||||
rope_lower, rope_upper = -0.1, 0.1
|
||||
|
||||
# Calculate % of posterior in ROPE
|
||||
in_rope = np.mean((posterior_samples > rope_lower) &
|
||||
(posterior_samples < rope_upper))
|
||||
|
||||
print(f"{in_rope*100:.1f}% of posterior in ROPE")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Bayesian Estimation
|
||||
|
||||
### Credible Intervals
|
||||
|
||||
**What it is**: Interval containing parameter with X% probability
|
||||
|
||||
**95% Credible Interval interpretation**:
|
||||
> "There is a 95% probability that the true parameter lies in this interval."
|
||||
|
||||
**This is what people THINK confidence intervals mean** (but don't in frequentist framework)
|
||||
|
||||
**Types**:
|
||||
|
||||
#### Equal-Tailed Interval (ETI)
|
||||
- 2.5th to 97.5th percentile
|
||||
- Simple to calculate
|
||||
- May not include mode for skewed distributions
|
||||
|
||||
#### Highest Density Interval (HDI)
|
||||
- Narrowest interval containing 95% of distribution
|
||||
- Always includes mode
|
||||
- Better for skewed distributions
|
||||
|
||||
**Python calculation**:
|
||||
```python
|
||||
import arviz as az
|
||||
|
||||
# Equal-tailed interval
|
||||
eti = np.percentile(posterior_samples, [2.5, 97.5])
|
||||
|
||||
# HDI
|
||||
hdi = az.hdi(posterior_samples, hdi_prob=0.95)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Posterior Distributions
|
||||
|
||||
**Interpreting posterior distributions**:
|
||||
|
||||
1. **Central tendency**:
|
||||
- Mean: Average posterior value
|
||||
- Median: 50th percentile
|
||||
- Mode: Most probable value (MAP - Maximum A Posteriori)
|
||||
|
||||
2. **Uncertainty**:
|
||||
- SD: Spread of posterior
|
||||
- Credible intervals: Quantify uncertainty
|
||||
|
||||
3. **Shape**:
|
||||
- Symmetric: Similar to normal
|
||||
- Skewed: Asymmetric uncertainty
|
||||
- Multimodal: Multiple plausible values
|
||||
|
||||
**Visualization**:
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import arviz as az
|
||||
|
||||
# Posterior plot with HDI
|
||||
az.plot_posterior(trace, hdi_prob=0.95)
|
||||
|
||||
# Trace plot (check convergence)
|
||||
az.plot_trace(trace)
|
||||
|
||||
# Forest plot (multiple parameters)
|
||||
az.plot_forest(trace)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Bayesian Analyses
|
||||
|
||||
### Bayesian T-Test
|
||||
|
||||
**Purpose**: Compare two groups (Bayesian alternative to t-test)
|
||||
|
||||
**Outputs**:
|
||||
1. Posterior distribution of mean difference
|
||||
2. 95% credible interval
|
||||
3. Bayes Factor (BF₁₀)
|
||||
4. Probability of directional hypothesis (e.g., P(μ₁ > μ₂))
|
||||
|
||||
**Python implementation**:
|
||||
```python
|
||||
import pymc as pm
|
||||
import arviz as az
|
||||
|
||||
# Bayesian independent samples t-test
|
||||
with pm.Model() as model:
|
||||
# Priors for group means
|
||||
mu1 = pm.Normal('mu1', mu=0, sigma=10)
|
||||
mu2 = pm.Normal('mu2', mu=0, sigma=10)
|
||||
|
||||
# Prior for pooled standard deviation
|
||||
sigma = pm.HalfNormal('sigma', sigma=10)
|
||||
|
||||
# Likelihood
|
||||
y1 = pm.Normal('y1', mu=mu1, sigma=sigma, observed=group1)
|
||||
y2 = pm.Normal('y2', mu=mu2, sigma=sigma, observed=group2)
|
||||
|
||||
# Derived quantity: mean difference
|
||||
diff = pm.Deterministic('diff', mu1 - mu2)
|
||||
|
||||
# Sample posterior
|
||||
trace = pm.sample(2000, tune=1000, return_inferencedata=True)
|
||||
|
||||
# Analyze results
|
||||
print(az.summary(trace, var_names=['mu1', 'mu2', 'diff']))
|
||||
|
||||
# Probability that group1 > group2
|
||||
prob_greater = np.mean(trace.posterior['diff'].values > 0)
|
||||
print(f"P(μ₁ > μ₂) = {prob_greater:.3f}")
|
||||
|
||||
# Plot posterior
|
||||
az.plot_posterior(trace, var_names=['diff'], ref_val=0)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Bayesian ANOVA
|
||||
|
||||
**Purpose**: Compare three or more groups
|
||||
|
||||
**Model**:
|
||||
```python
|
||||
import pymc as pm
|
||||
|
||||
with pm.Model() as anova_model:
|
||||
# Hyperpriors
|
||||
mu_global = pm.Normal('mu_global', mu=0, sigma=10)
|
||||
sigma_between = pm.HalfNormal('sigma_between', sigma=5)
|
||||
sigma_within = pm.HalfNormal('sigma_within', sigma=5)
|
||||
|
||||
# Group means (hierarchical)
|
||||
group_means = pm.Normal('group_means',
|
||||
mu=mu_global,
|
||||
sigma=sigma_between,
|
||||
shape=n_groups)
|
||||
|
||||
# Likelihood
|
||||
y = pm.Normal('y',
|
||||
mu=group_means[group_idx],
|
||||
sigma=sigma_within,
|
||||
observed=data)
|
||||
|
||||
trace = pm.sample(2000, tune=1000, return_inferencedata=True)
|
||||
|
||||
# Posterior contrasts
|
||||
contrast_1_2 = trace.posterior['group_means'][:,:,0] - trace.posterior['group_means'][:,:,1]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Bayesian Correlation
|
||||
|
||||
**Purpose**: Estimate correlation between two variables
|
||||
|
||||
**Advantage**: Provides distribution of correlation values
|
||||
|
||||
**Python implementation**:
|
||||
```python
|
||||
import pymc as pm
|
||||
|
||||
with pm.Model() as corr_model:
|
||||
# Prior on correlation
|
||||
rho = pm.Uniform('rho', lower=-1, upper=1)
|
||||
|
||||
# Convert to covariance matrix
|
||||
cov_matrix = pm.math.stack([[1, rho],
|
||||
[rho, 1]])
|
||||
|
||||
# Likelihood (bivariate normal)
|
||||
obs = pm.MvNormal('obs',
|
||||
mu=[0, 0],
|
||||
cov=cov_matrix,
|
||||
observed=np.column_stack([x, y]))
|
||||
|
||||
trace = pm.sample(2000, tune=1000, return_inferencedata=True)
|
||||
|
||||
# Summarize correlation
|
||||
print(az.summary(trace, var_names=['rho']))
|
||||
|
||||
# Probability that correlation is positive
|
||||
prob_positive = np.mean(trace.posterior['rho'].values > 0)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Bayesian Linear Regression
|
||||
|
||||
**Purpose**: Model relationship between predictors and outcome
|
||||
|
||||
**Advantages**:
|
||||
- Uncertainty in all parameters
|
||||
- Natural regularization (via priors)
|
||||
- Can incorporate prior knowledge
|
||||
- Credible intervals for predictions
|
||||
|
||||
**Python implementation**:
|
||||
```python
|
||||
import pymc as pm
|
||||
|
||||
with pm.Model() as regression_model:
|
||||
# Priors for coefficients
|
||||
alpha = pm.Normal('alpha', mu=0, sigma=10) # Intercept
|
||||
beta = pm.Normal('beta', mu=0, sigma=10, shape=n_predictors)
|
||||
sigma = pm.HalfNormal('sigma', sigma=10)
|
||||
|
||||
# Expected value
|
||||
mu = alpha + pm.math.dot(X, beta)
|
||||
|
||||
# Likelihood
|
||||
y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y)
|
||||
|
||||
trace = pm.sample(2000, tune=1000, return_inferencedata=True)
|
||||
|
||||
# Posterior predictive checks
|
||||
with regression_model:
|
||||
ppc = pm.sample_posterior_predictive(trace)
|
||||
|
||||
az.plot_ppc(ppc)
|
||||
|
||||
# Predictions with uncertainty
|
||||
with regression_model:
|
||||
pm.set_data({'X': X_new})
|
||||
posterior_pred = pm.sample_posterior_predictive(trace)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Hierarchical (Multilevel) Models
|
||||
|
||||
**When to use**:
|
||||
- Nested/clustered data (students within schools)
|
||||
- Repeated measures
|
||||
- Meta-analysis
|
||||
- Varying effects across groups
|
||||
|
||||
**Key concept**: Partial pooling
|
||||
- Complete pooling: Ignore groups (biased)
|
||||
- No pooling: Analyze groups separately (high variance)
|
||||
- Partial pooling: Borrow strength across groups (Bayesian)
|
||||
|
||||
**Example: Varying intercepts**:
|
||||
```python
|
||||
with pm.Model() as hierarchical_model:
|
||||
# Hyperpriors
|
||||
mu_global = pm.Normal('mu_global', mu=0, sigma=10)
|
||||
sigma_between = pm.HalfNormal('sigma_between', sigma=5)
|
||||
sigma_within = pm.HalfNormal('sigma_within', sigma=5)
|
||||
|
||||
# Group-level intercepts
|
||||
alpha = pm.Normal('alpha',
|
||||
mu=mu_global,
|
||||
sigma=sigma_between,
|
||||
shape=n_groups)
|
||||
|
||||
# Likelihood
|
||||
y_obs = pm.Normal('y_obs',
|
||||
mu=alpha[group_idx],
|
||||
sigma=sigma_within,
|
||||
observed=y)
|
||||
|
||||
trace = pm.sample()
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Model Comparison
|
||||
|
||||
### Methods
|
||||
|
||||
#### 1. Bayes Factor
|
||||
- Directly compares model evidence
|
||||
- Sensitive to prior specification
|
||||
- Can be computationally intensive
|
||||
|
||||
#### 2. Information Criteria
|
||||
|
||||
**WAIC (Widely Applicable Information Criterion)**:
|
||||
- Bayesian analog of AIC
|
||||
- Lower is better
|
||||
- Accounts for effective number of parameters
|
||||
|
||||
**LOO (Leave-One-Out Cross-Validation)**:
|
||||
- Estimates out-of-sample prediction error
|
||||
- Lower is better
|
||||
- More robust than WAIC
|
||||
|
||||
**Python calculation**:
|
||||
```python
|
||||
import arviz as az
|
||||
|
||||
# Calculate WAIC and LOO
|
||||
waic = az.waic(trace)
|
||||
loo = az.loo(trace)
|
||||
|
||||
print(f"WAIC: {waic.elpd_waic:.2f}")
|
||||
print(f"LOO: {loo.elpd_loo:.2f}")
|
||||
|
||||
# Compare multiple models
|
||||
comparison = az.compare({
|
||||
'model1': trace1,
|
||||
'model2': trace2,
|
||||
'model3': trace3
|
||||
})
|
||||
print(comparison)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Checking Bayesian Models
|
||||
|
||||
### 1. Convergence Diagnostics
|
||||
|
||||
**R-hat (Gelman-Rubin statistic)**:
|
||||
- Compares within-chain and between-chain variance
|
||||
- Values close to 1.0 indicate convergence
|
||||
- R-hat < 1.01: Good
|
||||
- R-hat > 1.05: Poor convergence
|
||||
|
||||
**Effective Sample Size (ESS)**:
|
||||
- Number of independent samples
|
||||
- Higher is better
|
||||
- ESS > 400 per chain recommended
|
||||
|
||||
**Trace plots**:
|
||||
- Should look like "fuzzy caterpillar"
|
||||
- No trends, no stuck chains
|
||||
|
||||
**Python checking**:
|
||||
```python
|
||||
# Automatic summary with diagnostics
|
||||
print(az.summary(trace, var_names=['parameter']))
|
||||
|
||||
# Visual diagnostics
|
||||
az.plot_trace(trace)
|
||||
az.plot_rank(trace) # Rank plots
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 2. Posterior Predictive Checks
|
||||
|
||||
**Purpose**: Does model generate data similar to observed data?
|
||||
|
||||
**Process**:
|
||||
1. Generate predictions from posterior
|
||||
2. Compare to actual data
|
||||
3. Look for systematic discrepancies
|
||||
|
||||
**Python implementation**:
|
||||
```python
|
||||
with model:
|
||||
ppc = pm.sample_posterior_predictive(trace)
|
||||
|
||||
# Visual check
|
||||
az.plot_ppc(ppc, num_pp_samples=100)
|
||||
|
||||
# Quantitative checks
|
||||
obs_mean = np.mean(observed_data)
|
||||
pred_means = [np.mean(sample) for sample in ppc.posterior_predictive['y_obs']]
|
||||
p_value = np.mean(pred_means >= obs_mean) # Bayesian p-value
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Reporting Bayesian Results
|
||||
|
||||
### Example T-Test Report
|
||||
|
||||
> "A Bayesian independent samples t-test was conducted to compare groups A and B. Weakly informative priors were used: Normal(0, 1) for the mean difference and Half-Cauchy(0, 1) for the pooled standard deviation. The posterior distribution of the mean difference had a mean of 5.2 (95% CI [2.3, 8.1]), indicating that Group A scored higher than Group B. The Bayes Factor BF₁₀ = 23.5 provided strong evidence for a difference between groups, and there was a 99.7% probability that Group A's mean exceeded Group B's mean."
|
||||
|
||||
### Example Regression Report
|
||||
|
||||
> "A Bayesian linear regression was fitted with weakly informative priors (Normal(0, 10) for coefficients, Half-Cauchy(0, 5) for residual SD). The model explained substantial variance (R² = 0.47, 95% CI [0.38, 0.55]). Study hours (β = 0.52, 95% CI [0.38, 0.66]) and prior GPA (β = 0.31, 95% CI [0.17, 0.45]) were credible predictors (95% CIs excluded zero). Posterior predictive checks showed good model fit. Convergence diagnostics were satisfactory (all R-hat < 1.01, ESS > 1000)."
|
||||
|
||||
---
|
||||
|
||||
## Advantages and Limitations
|
||||
|
||||
### Advantages
|
||||
|
||||
1. **Intuitive interpretation**: Direct probability statements about parameters
|
||||
2. **Incorporates prior knowledge**: Uses all available information
|
||||
3. **Flexible**: Handles complex models easily
|
||||
4. **No p-hacking**: Can look at data as it arrives
|
||||
5. **Quantifies uncertainty**: Full posterior distribution
|
||||
6. **Small samples**: Works with any sample size
|
||||
|
||||
### Limitations
|
||||
|
||||
1. **Computational**: Requires MCMC sampling (can be slow)
|
||||
2. **Prior specification**: Requires thought and justification
|
||||
3. **Complexity**: Steeper learning curve
|
||||
4. **Software**: Fewer tools than frequentist methods
|
||||
5. **Communication**: May need to educate reviewers/readers
|
||||
|
||||
---
|
||||
|
||||
## Key Python Packages
|
||||
|
||||
- **PyMC**: Full Bayesian modeling framework
|
||||
- **ArviZ**: Visualization and diagnostics
|
||||
- **Bambi**: High-level interface for regression models
|
||||
- **PyStan**: Python interface to Stan
|
||||
- **TensorFlow Probability**: Bayesian inference with TensorFlow
|
||||
|
||||
---
|
||||
|
||||
## When to Use Bayesian Methods
|
||||
|
||||
**Use Bayesian when**:
|
||||
- You have prior information to incorporate
|
||||
- You want direct probability statements
|
||||
- Sample size is small
|
||||
- Model is complex (hierarchical, missing data, etc.)
|
||||
- You want to update analysis as data arrives
|
||||
|
||||
**Frequentist may be sufficient when**:
|
||||
- Standard analysis with large sample
|
||||
- No prior information
|
||||
- Computational resources limited
|
||||
- Reviewers unfamiliar with Bayesian methods
|
||||
Reference in New Issue
Block a user