# Code Data Analysis Scaffolds Template

## Workflow

Copy this checklist and track your progress:

```
Code Data Analysis Scaffolds Progress:
- [ ] Step 1: Clarify task and objectives
- [ ] Step 2: Choose appropriate scaffold type
- [ ] Step 3: Generate scaffold structure
- [ ] Step 4: Validate scaffold completeness
- [ ] Step 5: Deliver scaffold and guide execution
```

**Step 1: Clarify task** - Ask context questions to understand task type, constraints, and expected outcomes. See [Context Questions](#context-questions).

**Step 2: Choose scaffold** - Select TDD, EDA, Statistical Analysis, or Validation based on the task. See [Scaffold Selection Guide](#scaffold-selection-guide).

**Step 3: Generate structure** - Use the appropriate scaffold template. See [TDD Scaffold](#tdd-scaffold), [EDA Scaffold](#eda-scaffold), [Statistical Analysis Scaffold](#statistical-analysis-scaffold), or [Validation Scaffold](#validation-scaffold).

**Step 4: Validate completeness** - Check that the scaffold covers requirements, includes validation steps, and makes assumptions explicit. See [Quality Checklist](#quality-checklist).

**Step 5: Deliver and guide** - Present the scaffold, highlight next steps, and surface any gaps discovered. Execute if the user wants help.

## Context Questions

**For all tasks:**

- What are you trying to accomplish? (Specific outcome expected)
- What's the context? (Dataset characteristics, codebase state, existing work)
- Any constraints? (Time, tools, data limitations, performance requirements)
- What does success look like? (Acceptance criteria, quality bar)

**For TDD tasks:**

- What functionality needs tests? (Feature, bug fix, refactor)
- Existing test coverage? (None, partial, comprehensive)
- Test framework preference? (pytest, jest, junit, etc.)
- Integration vs. unit tests? (Scope of testing)

**For EDA tasks:**

- What's the dataset? (Size, format, source)
- What questions are you trying to answer? (Exploratory vs. hypothesis-driven)
- Existing knowledge about the data? (Schema, distributions, known issues)
- End goal? (Feature engineering, quality assessment, insights)

**For Statistical/Modeling tasks:**

- What's the research question? (Descriptive, predictive, causal)
- Available data? (Sample size, variables, treatment/control)
- Causal or predictive goal? (Understanding why vs. forecasting what)
- Significance level / acceptable error rate?

## Scaffold Selection Guide

| User Says | Task Type | Scaffold to Use |
|-----------|-----------|-----------------|
| "Write tests for..." | TDD | [TDD Scaffold](#tdd-scaffold) |
| "Explore this dataset..." | EDA | [EDA Scaffold](#eda-scaffold) |
| "Analyze the effect of..." / "Does X cause Y?" | Causal Inference | See methodology.md |
| "Predict..." / "Classify..." / "Forecast..." | Predictive Modeling | See methodology.md |
| "Design an A/B test..." / "Compare groups..." | Statistical Analysis | [Statistical Analysis Scaffold](#statistical-analysis-scaffold) |
| "Validate..." / "Check quality..." | Validation | [Validation Scaffold](#validation-scaffold) |

## TDD Scaffold

Use when writing new code, refactoring, or fixing bugs. **Write tests FIRST, then implement.**

### Quick Template

```python
# Test file: test_[module].py
import pytest
from unittest.mock import Mock
from [module] import [function_to_test]

# 1. HAPPY PATH TESTS (expected usage)
def test_[function]_with_valid_input():
    """Test normal, expected behavior"""
    result = [function](valid_input)
    assert result == expected_output
    assert result.property == expected_value

# 2. EDGE CASE TESTS (boundary conditions)
def test_[function]_with_empty_input():
    """Test with empty/minimal input"""
    result = [function]([])
    assert result == expected_for_empty

def test_[function]_with_maximum_input():
    """Test with large/maximum input"""
    result = [function](large_input)
    assert result is not None

# 3. ERROR CONDITION TESTS (invalid input, expected failures)
def test_[function]_with_invalid_input():
    """Test proper error handling"""
    with pytest.raises(ValueError):
        [function](invalid_input)

def test_[function]_with_none_input():
    """Test None handling"""
    with pytest.raises(TypeError):
        [function](None)

# 4. STATE TESTS (if function modifies state)
def test_[function]_modifies_state_correctly():
    """Test side effects are correct"""
    obj = Object()
    obj.[function](param)
    assert obj.state == expected_state

# 5. INTEGRATION TESTS (if interacting with external systems)
@pytest.fixture
def mock_external_service():
    """Mock external dependencies"""
    return Mock(spec=ExternalService)

def test_[function]_with_external_service(mock_external_service):
    """Test integration points"""
    result = [function](mock_external_service)
    mock_external_service.method.assert_called_once()
    assert result == expected_from_integration
```

### Test Data Setup

```python
# conftest.py or test fixtures
@pytest.fixture
def sample_data():
    """Reusable test data"""
    return {
        "valid": [...],
        "edge_case": [...],
        "invalid": [...]
    }

@pytest.fixture(scope="session")
def database_session():
    """Database for integration tests"""
    db = create_test_db()
    yield db
    db.cleanup()
```

### TDD Cycle

1. **Red**: Write a failing test (defines what success looks like)
2. **Green**: Write minimal code to make the test pass
3. **Refactor**: Improve the code while keeping tests green
4. **Repeat**: Move to the next test case (one Red-Green pass is sketched below)
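To make the cycle concrete, here is a minimal sketch of one Red-Green pass. The `slugify` function and its tests are hypothetical illustrations, not part of the template:

```python
# Red: written first, this test fails because slugify does not exist yet
# (hypothetical example function, used only to illustrate the cycle).
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"

def test_slugify_strips_surrounding_whitespace():
    assert slugify("  Hello World  ") == "hello-world"

# Green: the minimal implementation that makes both tests pass.
def slugify(text: str) -> str:
    return text.strip().lower().replace(" ", "-")

# Refactor comes next (e.g., handling punctuation or unicode),
# with the test suite guarding against regressions at every step.
```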
## EDA Scaffold

Use when exploring a new dataset. Follow a systematic plan to understand data quality and patterns.

### Quick Template

```python
# 1. DATA OVERVIEW
# Load and inspect
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_[format]('data.csv')

# Basic info
print(f"Shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")
print(df.dtypes)
print(df.head())
print(df.info())
print(df.describe())

# 2. DATA QUALITY CHECKS
# Missing values
missing = df.isnull().sum()
missing_pct = (missing / len(df)) * 100
print(missing_pct[missing_pct > 0])

# Duplicates
print(f"Duplicates: {df.duplicated().sum()}")

# Data type consistency
print("Check: Are numeric columns actually numeric?")
print("Check: Are dates parsed correctly?")
print("Check: Are categorical variables encoded properly?")

# 3. UNIVARIATE ANALYSIS
# Numeric: mean, median, std, range, distribution plots, outliers (IQR method)
for col in df.select_dtypes(include=[np.number]).columns:
    print(f"{col}: mean={df[col].mean():.2f}, median={df[col].median():.2f}, std={df[col].std():.2f}")
    df[col].hist(bins=50); plt.title(f'{col} Distribution'); plt.show()
    Q1, Q3 = df[col].quantile([0.25, 0.75])
    outliers = ((df[col] < (Q1 - 1.5*(Q3-Q1))) | (df[col] > (Q3 + 1.5*(Q3-Q1)))).sum()
    print(f"  Outliers: {outliers} ({outliers/len(df)*100:.1f}%)")

# Categorical: value counts, unique values, bar plots
for col in df.select_dtypes(include=['object', 'category']).columns:
    print(f"{col}: {df[col].nunique()} unique, most common={df[col].mode()[0]}")
    df[col].value_counts().head(10).plot(kind='bar'); plt.show()

# 4. BIVARIATE ANALYSIS
# Correlation heatmap, pairplots, categorical vs. numeric boxplots
sns.heatmap(df.select_dtypes(include=[np.number]).corr(), annot=True, cmap='coolwarm'); plt.show()
sns.pairplot(df[['var1', 'var2', 'var3', 'target']], hue='target'); plt.show()
# For each categorical-numeric pair, create boxplots to compare distributions

# 5. INSIGHTS & NEXT STEPS
print("\n=== KEY FINDINGS ===")
print("1. Data quality: [summary]")
print("2. Distributions: [any skewness, outliers]")
print("3. Correlations: [strong relationships found]")
print("4. Missing patterns: [systematic missingness?]")
print("\n=== RECOMMENDED ACTIONS ===")
print("1. Handle missing data: [imputation strategy]")
print("2. Address outliers: [cap, remove, transform]")
print("3. Feature engineering: [ideas based on EDA]")
print("4. Data transformations: [log, standardize, encode]")
```

### EDA Checklist

- [ ] Load data and check shape/dtypes
- [ ] Assess missing values (how much, which variables, patterns? see the missingness sketch below)
- [ ] Check for duplicates
- [ ] Validate data types (numeric, categorical, dates)
- [ ] Univariate analysis (distributions, outliers, summary stats)
- [ ] Bivariate analysis (correlations, relationships with target)
- [ ] Identify data quality issues
- [ ] Document insights and recommended next steps
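One way to probe whether missingness is systematic rather than random is to compare other variables across missing and non-missing rows. A minimal sketch using a hypothetical toy frame (the columns `value` and `other` are illustrative, not from the template):

```python
import pandas as pd

# Hypothetical example: 'value' has gaps, 'other' is numeric.
df = pd.DataFrame({
    "value": [1.0, None, 3.0, None, 5.0, 6.0],
    "other": [10,   50,  12,   55,  11,  13],
})

# Flag rows where 'value' is missing, then compare 'other' across the flag.
is_missing = df["value"].isnull()
print(df.groupby(is_missing)["other"].describe())

# If the two groups differ sharply (here 'other' is much larger where
# 'value' is missing), missingness is likely systematic, and naive
# imputation may bias downstream analysis.
```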
## Statistical Analysis Scaffold

Use for hypothesis testing, A/B tests, and comparing groups.

### Quick Template

```python
# STATISTICAL ANALYSIS SCAFFOLD
import numpy as np
import pandas as pd
from scipy import stats

# 1. DEFINE RESEARCH QUESTION
question = "Does treatment X improve outcome Y?"

# 2. STATE HYPOTHESES
H0 = "Treatment X has no effect on outcome Y (null hypothesis)"
H1 = "Treatment X improves outcome Y (alternative hypothesis)"

# 3. SET SIGNIFICANCE LEVEL
alpha = 0.05  # 5% significance level (Type I error rate)
power = 0.80  # 80% power (1 - Type II error rate)

# 4. CHECK ASSUMPTIONS (t-test: independence, normality, equal variance)
_, p_norm = stats.shapiro(treatment_group)  # Normality test (repeat for control_group)
_, p_var = stats.levene(treatment_group, control_group)  # Equal variance test
print(f"Normality: p={p_norm:.3f} {'✓' if p_norm > 0.05 else '✗ use non-parametric'}")
print(f"Equal variance: p={p_var:.3f} {'✓' if p_var > 0.05 else '✗ use Welch t-test'}")

# 5. PERFORM STATISTICAL TEST
# Choose the appropriate test based on data type and assumptions

# For continuous outcome, 2 groups:
statistic, p_value = stats.ttest_ind(treatment_group, control_group)
print(f"t-statistic: {statistic:.3f}, p-value: {p_value:.4f}")

# For categorical outcome, 2 groups:
contingency_table = pd.crosstab(df['group'], df['outcome'])
chi2, p_value, dof, expected = stats.chi2_contingency(contingency_table)
print(f"Chi-square: {chi2:.3f}, p-value: {p_value:.4f}")

# 6. INTERPRET RESULTS & EFFECT SIZE
if p_value < alpha:
    n1, n2 = len(treatment_group), len(control_group)
    pooled_std = np.sqrt(((n1 - 1) * treatment_group.std(ddof=1)**2
                          + (n2 - 1) * control_group.std(ddof=1)**2) / (n1 + n2 - 2))
    cohen_d = (treatment_group.mean() - control_group.mean()) / pooled_std
    effect = ("Negligible" if abs(cohen_d) < 0.2 else
              "Small" if abs(cohen_d) < 0.5 else
              "Medium" if abs(cohen_d) < 0.8 else "Large")
    print(f"REJECT H0 (p={p_value:.4f}). Effect size (Cohen's d)={cohen_d:.3f} ({effect})")
else:
    print(f"FAIL TO REJECT H0 (p={p_value:.4f}). Insufficient evidence for effect.")

# 7. CONFIDENCE INTERVAL & SENSITIVITY
ci_95 = stats.t.interval(0.95, len(treatment_group)-1,
                         loc=treatment_group.mean(),
                         scale=stats.sem(treatment_group))
print(f"95% CI: [{ci_95[0]:.2f}, {ci_95[1]:.2f}]")
print("Sensitivity: Check without outliers, with a non-parametric test, with confounders")
```
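The template sets `alpha` and `power` but never uses them; before collecting data, you can turn those choices into a required sample size. A minimal sketch with statsmodels, assuming a planning effect size of d = 0.5 (the effect size is an assumption you must supply, not something measured here):

```python
# Sample-size estimate for a two-sample t-test.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,            # assumed standardized effect (Cohen's d)
    alpha=0.05,                 # Type I error rate
    power=0.80,                 # 1 - Type II error rate
    alternative="two-sided",
)
print(f"Required n per group: {n_per_group:.0f}")  # ~64 per group
```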
### Statistical Test Selection

| Data Type | # Groups | Test |
|-----------|----------|------|
| Continuous | 2 | t-test (or Welch's if unequal variance) |
| Continuous | 3+ | ANOVA (or Kruskal-Wallis if non-normal) |
| Categorical | 2 | Chi-square or Fisher's exact |
| Ordinal | 2 | Mann-Whitney U |
| Paired/Repeated | 2 | Paired t-test or Wilcoxon signed-rank |

## Validation Scaffold

Use for validating data quality, code quality, or model quality before shipping.

### Data Validation Template

```python
# DATA VALIDATION CHECKLIST

# 1. SCHEMA VALIDATION
expected_columns = ['id', 'timestamp', 'value', 'category']
assert set(df.columns) == set(expected_columns), "Column mismatch"

expected_dtypes = {'id': 'int64', 'timestamp': 'datetime64[ns]',
                   'value': 'float64', 'category': 'object'}
for col, dtype in expected_dtypes.items():
    assert str(df[col].dtype) == dtype, f"{col} type mismatch: expected {dtype}, got {df[col].dtype}"

# 2. RANGE VALIDATION
assert df['value'].min() >= 0, "Negative values found (should be >= 0)"
assert df['value'].max() <= 100, "Values exceed maximum (should be <= 100)"

# 3. UNIQUENESS VALIDATION
assert df['id'].is_unique, "Duplicate IDs found"

# 4. COMPLETENESS VALIDATION
required_fields = ['id', 'value']
for field in required_fields:
    missing_pct = df[field].isnull().mean() * 100
    assert missing_pct == 0, f"{field} has {missing_pct:.1f}% missing (required field)"

# 5. CONSISTENCY VALIDATION (if date-range columns exist)
assert (df['start_date'] <= df['end_date']).all(), "start_date after end_date found"

# 6. REFERENTIAL INTEGRITY
valid_categories = ['A', 'B', 'C']
assert df['category'].isin(valid_categories).all(), "Invalid categories found"

print("✓ All data validations passed")
```
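Bare `assert`s stop at the first violation; for a full data-quality report, it can help to collect every failure before raising. A minimal sketch of that pattern (the `checks` list and its rules are illustrative, matching the hypothetical columns above):

```python
import pandas as pd

def run_validations(df: pd.DataFrame) -> list[str]:
    """Run all checks and return the list of failures instead of
    stopping at the first violation."""
    # Each entry: (description, callable returning True when the check passes).
    checks = [
        ("ids are unique",         lambda: df["id"].is_unique),
        ("no missing values",      lambda: df["value"].notnull().all()),
        ("values within [0, 100]", lambda: df["value"].between(0, 100).all()),
    ]
    return [name for name, check in checks if not check()]

# Usage: raise once, with the complete picture.
# failures = run_validations(df)
# if failures:
#     raise ValueError(f"Data validation failed: {failures}")
```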
### Code Validation Checklist

- [ ] **Unit tests**: All functions have tests covering happy path, edge cases, and errors
- [ ] **Integration tests**: APIs and database interactions tested end-to-end
- [ ] **Test coverage**: ≥80% coverage for critical paths
- [ ] **Error handling**: All exceptions caught and handled gracefully
- [ ] **Input validation**: All user inputs validated before processing
- [ ] **Logging**: Key operations logged for debugging
- [ ] **Documentation**: Functions have docstrings, README updated
- [ ] **Performance**: No obvious performance bottlenecks (profiled if needed)
- [ ] **Security**: No hardcoded secrets, SQL injection protected, XSS prevented

### Model Validation Checklist

- [ ] **Train/val/test split**: Data split before any preprocessing (no data leakage)
- [ ] **Baseline model**: Simple baseline implemented for comparison
- [ ] **Cross-validation**: k-fold CV performed (k ≥ 5)
- [ ] **Metrics**: Appropriate metrics chosen (accuracy, precision/recall, AUC, RMSE, etc.)
- [ ] **Overfitting check**: Training vs. validation performance compared
- [ ] **Error analysis**: Failure modes analyzed, edge cases identified
- [ ] **Fairness**: Model checked for bias across sensitive groups
- [ ] **Interpretability**: Feature importance or SHAP values computed
- [ ] **Robustness**: Model tested with perturbed inputs
- [ ] **Monitoring**: Drift detection and performance tracking in place

## Quality Checklist

Before delivering, verify:

**Scaffold Structure:**
- [ ] Clear step-by-step process defined
- [ ] Each step has concrete actions (not vague advice)
- [ ] Validation checkpoints included
- [ ] Expected outputs specified

**Completeness:**
- [ ] Covers all requirements from the user's task
- [ ] Includes example code/pseudocode where helpful
- [ ] Anticipates edge cases and error conditions
- [ ] Provides decision guidance (when to use which approach)

**Clarity:**
- [ ] Assumptions stated explicitly
- [ ] Technical terms defined or illustrated
- [ ] Success criteria clear
- [ ] Next steps obvious

**Actionability:**
- [ ] User can execute scaffold without further guidance
- [ ] Code snippets are runnable (or nearly runnable)
- [ ] Gaps surfaced early (missing data, unclear requirements)
- [ ] Includes validation/quality checks

**Rubric Score:**
- [ ] Self-assessed with rubric ≥ 3.5 average