Initial commit

2025-11-30 08:47:54 +08:00
commit bd35f442d8
35 changed files with 12544 additions and 0 deletions
--- a/skills/policyengine-analysis-skill/SKILL.md
+++ b/skills/policyengine-analysis-skill/SKILL.md
@@ -0,0 +1,569 @@
+---
+name: policyengine-analysis
+description: Common analysis patterns for PolicyEngine research repositories (CRFB, newsletters, dashboards, impact studies)
+---
+
+# PolicyEngine Analysis
+
+Patterns for creating policy impact analyses, dashboards, and research using PolicyEngine.
+
+## For Users 👥
+
+### What are Analysis Repositories?
+
+Analysis repositories produce the research you see on PolicyEngine:
+
+**Blog posts:**
+- "How Montana's tax cuts affect poverty"
+- "Harris EITC proposal costs and impacts"
+- "UK Budget 2024 analysis"
+
+**Dashboards:**
+- State tax comparisons
+- Policy proposal scorecards
+- Interactive calculators (GiveCalc, SALT calculator)
+
+**Research reports:**
+- Distributional analyses for organizations
+- Policy briefs for legislators
+- Impact assessments
+
+### How Analysis Works
+
+1. **Define policy reform** using PolicyEngine parameters
+2. **Create household examples** showing specific impacts
+3. **Run population simulations** for aggregate effects
+4. **Calculate distributional impacts** (who wins, who loses)
+5. **Create visualizations** (charts, tables)
+6. **Write report** following policyengine-writing-skill style
+7. **Publish** to blog or share with stakeholders
+
+### Reading PolicyEngine Analysis
+
+**Key sections in typical analysis:**
+
+**The proposal:**
+- What policy changes
+- Specific parameter values
+
+**Household impacts:**
+- 3-5 example households
+- Dollar amounts for each
+- Charts showing impact across income range
+
+**Statewide/national impacts:**
+- Total cost or revenue
+- Winners and losers by income decile
+- Poverty and inequality effects
+
+**See policyengine-writing-skill for writing conventions.**
+
+## For Analysts 📊
+
+### When to Use This Skill
+
+- Creating policy impact analyses
+- Building interactive dashboards with Streamlit/Plotly
+- Writing analysis notebooks
+- Calculating distributional impacts
+- Comparing policy proposals
+- Creating visualizations for research
+- Publishing policy research
+
+### Example Analysis Repositories
+
+- `crfb-tob-impacts` - Policy impact analyses
+- `newsletters` - Data-driven newsletters
+- `2024-election-dashboard` - Policy comparison dashboards
+- `marginal-child` - Specialized policy analyses
+- `givecalc` - Charitable giving calculator
+
+## Repository Structure
+
+Standard analysis repository structure:
+
+```
+analysis-repo/
+├── analysis.ipynb           # Main Jupyter notebook
+├── app.py                   # Streamlit app (if applicable)
+├── requirements.txt         # Python dependencies
+├── README.md               # Documentation
+├── data/                   # Data files (if needed)
+├── outputs/                # Generated charts, tables
+└── .streamlit/             # Streamlit config
+    └── config.toml
+```
+
+## Common Analysis Patterns
+
+### Pattern 1: Impact Analysis Across Income Distribution
+
+```python
+import pandas as pd
+import numpy as np
+from policyengine_us import Simulation
+
+# Define reform
+reform = {
+    "gov.irs.credits.ctc.amount.base_amount": {
+        "2024-01-01.2100-12-31": 5000
+    }
+}
+
+# Analyze across income distribution
+incomes = np.linspace(0, 200000, 101)
+results = []
+
+for income in incomes:
+    # Baseline
+    situation = create_situation(income=income)
+    sim_baseline = Simulation(situation=situation)
+    tax_baseline = sim_baseline.calculate("income_tax", 2024)[0]
+
+    # Reform
+    sim_reform = Simulation(situation=situation, reform=reform)
+    tax_reform = sim_reform.calculate("income_tax", 2024)[0]
+
+    results.append({
+        "income": income,
+        "tax_baseline": tax_baseline,
+        "tax_reform": tax_reform,
+        "tax_change": tax_reform - tax_baseline
+    })
+
+df = pd.DataFrame(results)
+```
+
+### Pattern 2: Household-Level Case Studies
+
+```python
+# Define representative households
+households = {
+    "Single, No Children": {
+        "income": 40000,
+        "num_children": 0,
+        "married": False
+    },
+    "Single Parent, 2 Children": {
+        "income": 50000,
+        "num_children": 2,
+        "married": False
+    },
+    "Married, 2 Children": {
+        "income": 100000,
+        "num_children": 2,
+        "married": True
+    }
+}
+
+# Calculate impacts for each
+case_studies = {}
+for name, params in households.items():
+    situation = create_family(**params)
+
+    sim_baseline = Simulation(situation=situation)
+    sim_reform = Simulation(situation=situation, reform=reform)
+
+    case_studies[name] = {
+        "baseline_tax": sim_baseline.calculate("income_tax", 2024)[0],
+        "reform_tax": sim_reform.calculate("income_tax", 2024)[0],
+        "ctc_baseline": sim_baseline.calculate("ctc", 2024)[0],
+        "ctc_reform": sim_reform.calculate("ctc", 2024)[0]
+    }
+
+case_df = pd.DataFrame(case_studies).T
+```
+
+### Pattern 3: State-by-State Comparison
+
+```python
+states = ["CA", "NY", "TX", "FL", "PA", "OH", "IL", "MI"]
+
+state_results = []
+for state in states:
+    situation = create_situation(income=75000, state=state)
+
+    sim_baseline = Simulation(situation=situation)
+    sim_reform = Simulation(situation=situation, reform=reform)
+
+    state_results.append({
+        "state": state,
+        "baseline_net_income": sim_baseline.calculate("household_net_income", 2024)[0],
+        "reform_net_income": sim_reform.calculate("household_net_income", 2024)[0],
+        "change": (sim_reform.calculate("household_net_income", 2024)[0] -
+                  sim_baseline.calculate("household_net_income", 2024)[0])
+    })
+
+state_df = pd.DataFrame(state_results)
+```
+
+### Pattern 4: Marginal Analysis (Winners/Losers)
+
+```python
+import plotly.graph_objects as go
+
+# Calculate across income range
+situation_with_axes = {
+    # ... setup ...
+    "axes": [[{
+        "name": "employment_income",
+        "count": 1001,
+        "min": 0,
+        "max": 200000,
+        "period": 2024
+    }]]
+}
+
+sim_baseline = Simulation(situation=situation_with_axes)
+sim_reform = Simulation(situation=situation_with_axes, reform=reform)
+
+incomes = sim_baseline.calculate("employment_income", 2024)
+baseline_net = sim_baseline.calculate("household_net_income", 2024)
+reform_net = sim_reform.calculate("household_net_income", 2024)
+
+gains = reform_net - baseline_net
+
+# Identify winners and losers
+winners = gains > 0
+losers = gains < 0
+neutral = gains == 0
+
+print(f"Winners: {winners.sum() / len(gains) * 100:.1f}%")
+print(f"Losers: {losers.sum() / len(gains) * 100:.1f}%")
+print(f"Neutral: {neutral.sum() / len(gains) * 100:.1f}%")
+```
+
+## Visualization Patterns
+
+### Standard Plotly Configuration
+
+```python
+import plotly.graph_objects as go
+
+# PolicyEngine brand colors
+TEAL = "#39C6C0"
+BLUE = "#2C6496"
+DARK_GRAY = "#616161"
+
+def create_pe_layout(title, xaxis_title, yaxis_title):
+    """Create standard PolicyEngine chart layout."""
+    return go.Layout(
+        title=title,
+        xaxis_title=xaxis_title,
+        yaxis_title=yaxis_title,
+        font=dict(family="Roboto Serif", size=14),
+        plot_bgcolor="white",
+        hovermode="x unified",
+        xaxis=dict(
+            showgrid=True,
+            gridcolor="lightgray",
+            zeroline=True
+        ),
+        yaxis=dict(
+            showgrid=True,
+            gridcolor="lightgray",
+            zeroline=True
+        )
+    )
+
+# Use in charts
+fig = go.Figure(layout=create_pe_layout(
+    "Tax Impact by Income",
+    "Income",
+    "Tax Change"
+))
+fig.add_trace(go.Scatter(x=incomes, y=tax_change, line=dict(color=TEAL)))
+```
+
+### Common Chart Types
+
+**1. Line Chart (Impact by Income)**
+```python
+fig = go.Figure()
+fig.add_trace(go.Scatter(
+    x=df.income,
+    y=df.tax_change,
+    mode='lines',
+    name='Tax Change',
+    line=dict(color=TEAL, width=3)
+))
+fig.update_layout(
+    title="Tax Impact by Income Level",
+    xaxis_title="Income",
+    yaxis_title="Tax Change ($)",
+    xaxis_tickformat="$,.0f",
+    yaxis_tickformat="$,.0f"
+)
+```
+
+**2. Bar Chart (State Comparison)**
+```python
+fig = go.Figure()
+fig.add_trace(go.Bar(
+    x=state_df.state,
+    y=state_df.change,
+    marker_color=TEAL
+))
+fig.update_layout(
+    title="Net Income Change by State",
+    xaxis_title="State",
+    yaxis_title="Change ($)",
+    yaxis_tickformat="$,.0f"
+)
+```
+
+**3. Waterfall Chart (Budget Impact)**
+```python
+fig = go.Figure(go.Waterfall(
+    x=["Baseline", "Tax Credit", "Phase-out", "Reform"],
+    y=[baseline_revenue, credit_cost, phaseout_revenue, 0],
+    measure=["absolute", "relative", "relative", "total"],
+    connector={"line": {"color": "gray"}}
+))
+```
+
+## Streamlit Dashboard Patterns
+
+### Basic Streamlit Setup
+
+```python
+import streamlit as st
+from policyengine_us import Simulation
+
+st.set_page_config(page_title="Policy Analysis", layout="wide")
+
+st.title("Policy Impact Calculator")
+
+# User inputs
+col1, col2, col3 = st.columns(3)
+with col1:
+    income = st.number_input("Income", value=60000, step=5000)
+with col2:
+    state = st.selectbox("State", ["CA", "NY", "TX", "FL"])
+with col3:
+    num_children = st.number_input("Children", value=0, min_value=0, max_value=10)
+
+# Calculate
+if st.button("Calculate"):
+    situation = create_family(
+        parent_income=income,
+        num_children=num_children,
+        state=state
+    )
+
+    sim_baseline = Simulation(situation=situation)
+    sim_reform = Simulation(situation=situation, reform=reform)
+
+    # Display results
+    col1, col2, col3 = st.columns(3)
+    with col1:
+        st.metric(
+            "Baseline Tax",
+            f"${sim_baseline.calculate('income_tax', 2024)[0]:,.0f}"
+        )
+    with col2:
+        st.metric(
+            "Reform Tax",
+            f"${sim_reform.calculate('income_tax', 2024)[0]:,.0f}"
+        )
+    with col3:
+        change = (sim_reform.calculate('income_tax', 2024)[0] -
+                 sim_baseline.calculate('income_tax', 2024)[0])
+        st.metric("Change", f"${change:,.0f}", delta=f"${-change:,.0f}")
+```
+
+### Interactive Chart with Streamlit
+
+```python
+# Create chart based on user inputs
+incomes = np.linspace(0, income_max, 1001)
+results = []
+
+for income in incomes:
+    situation = create_situation(income=income, state=selected_state)
+    sim = Simulation(situation=situation, reform=reform)
+    results.append(sim.calculate("household_net_income", 2024)[0])
+
+fig = go.Figure()
+fig.add_trace(go.Scatter(x=incomes, y=results, line=dict(color=TEAL)))
+st.plotly_chart(fig, use_container_width=True)
+```
+
+## Jupyter Notebook Best Practices
+
+### Notebook Structure
+
+```python
+# Cell 1: Title and Description
+"""
+# Policy Analysis: [Policy Name]
+
+**Date:** [Date]
+**Author:** [Your Name]
+
+## Summary
+Brief description of the analysis and key findings.
+"""
+
+# Cell 2: Imports
+import pandas as pd
+import numpy as np
+import plotly.graph_objects as go
+from policyengine_us import Simulation
+
+# Cell 3: Configuration
+YEAR = 2024
+STATES = ["CA", "NY", "TX", "FL"]
+
+# Cell 4+: Analysis sections with markdown headers
+```
+
+### Export Results
+
+```python
+# Save DataFrame
+df.to_csv("outputs/impact_analysis.csv", index=False)
+
+# Save Plotly chart
+fig.write_html("outputs/chart.html")
+fig.write_image("outputs/chart.png", width=1200, height=600)
+
+# Save summary statistics
+summary = {
+    "total_winners": winners.sum(),
+    "total_losers": losers.sum(),
+    "avg_gain": gains[winners].mean(),
+    "avg_loss": gains[losers].mean()
+}
+pd.DataFrame([summary]).to_csv("outputs/summary.csv", index=False)
+```
+
+## Repository-Specific Examples
+
+This skill includes example templates in the `examples/` directory:
+
+- `impact_analysis_template.ipynb` - Standard impact analysis
+- `dashboard_template.py` - Streamlit dashboard
+- `state_comparison.py` - State-by-state analysis
+- `case_studies.py` - Household case studies
+- `reform_definitions.py` - Common reform patterns
+
+## Common Pitfalls
+
+### Pitfall 1: Not Using Consistent Year
+**Problem:** Mixing 2024 and 2025 calculations
+
+**Solution:** Define year constant at top:
+```python
+CURRENT_YEAR = 2024
+# Use everywhere
+simulation.calculate("income_tax", CURRENT_YEAR)
+```
+
+### Pitfall 2: Inefficient Simulations
+**Problem:** Creating new simulation for each income level
+
+**Solution:** Use axes for efficiency:
+```python
+# SLOW
+for income in incomes:
+    situation = create_situation(income=income)
+    sim = Simulation(situation=situation)
+    results.append(sim.calculate("income_tax", 2024)[0])
+
+# FAST
+situation_with_axes = create_situation_with_axes(incomes)
+sim = Simulation(situation=situation_with_axes)
+results = sim.calculate("income_tax", 2024)  # Array of all results
+```
+
+### Pitfall 3: Forgetting to Compare Baseline and Reform
+**Problem:** Only showing reform results
+
+**Solution:** Always show both:
+```python
+results = {
+    "baseline": sim_baseline.calculate("income_tax", 2024),
+    "reform": sim_reform.calculate("income_tax", 2024),
+    "change": reform - baseline
+}
+```
+
+## PolicyEngine API Usage
+
+For larger-scale analyses, use the PolicyEngine API:
+
+```python
+import requests
+
+def calculate_via_api(situation, reform=None):
+    """Calculate using PolicyEngine API."""
+    url = "https://api.policyengine.org/us/calculate"
+
+    payload = {
+        "household": situation,
+        "policy_id": reform_id if reform else baseline_policy_id
+    }
+
+    response = requests.post(url, json=payload)
+    return response.json()
+```
+
+## Testing Analysis Code
+
+```python
+import pytest
+
+def test_reform_increases_ctc():
+    """Test that reform increases CTC as expected."""
+    situation = create_family(income=50000, num_children=2)
+
+    sim_baseline = Simulation(situation=situation)
+    sim_reform = Simulation(situation=situation, reform=reform)
+
+    ctc_baseline = sim_baseline.calculate("ctc", 2024)[0]
+    ctc_reform = sim_reform.calculate("ctc", 2024)[0]
+
+    assert ctc_reform > ctc_baseline, "Reform should increase CTC"
+    assert ctc_reform == 5000 * 2, "CTC should be $5000 per child"
+```
+
+## Documentation Standards
+
+### README Template
+
+```markdown
+# [Analysis Name]
+
+## Overview
+Brief description of the analysis.
+
+## Key Findings
+- Finding 1
+- Finding 2
+- Finding 3
+
+## Methodology
+Explanation of approach and data sources.
+
+## How to Run
+
+\```bash
+pip install -r requirements.txt
+python app.py  # or jupyter notebook analysis.ipynb
+\```
+
+## Outputs
+- `outputs/chart1.png` - Description
+- `outputs/results.csv` - Description
+
+## Contact
+PolicyEngine Team - hello@policyengine.org
+```
+
+## Additional Resources
+
+- **PolicyEngine API Docs:** https://policyengine.org/us/api
+- **Analysis Examples:** https://github.com/PolicyEngine/analysis-notebooks
+- **Streamlit Docs:** https://docs.streamlit.io
+- **Plotly Docs:** https://plotly.com/python/