Initial commit
569
skills/policyengine-analysis-skill/SKILL.md
Normal file
---
name: policyengine-analysis
description: Common analysis patterns for PolicyEngine research repositories (CRFB, newsletters, dashboards, impact studies)
---

# PolicyEngine Analysis

Patterns for creating policy impact analyses, dashboards, and research using PolicyEngine.

## For Users 👥

### What are Analysis Repositories?

Analysis repositories produce the research you see on PolicyEngine:

**Blog posts:**
- "How Montana's tax cuts affect poverty"
- "Harris EITC proposal costs and impacts"
- "UK Budget 2024 analysis"

**Dashboards:**
- State tax comparisons
- Policy proposal scorecards
- Interactive calculators (GiveCalc, SALT calculator)

**Research reports:**
- Distributional analyses for organizations
- Policy briefs for legislators
- Impact assessments

### How Analysis Works

1. **Define policy reform** using PolicyEngine parameters
2. **Create household examples** showing specific impacts
3. **Run population simulations** for aggregate effects
4. **Calculate distributional impacts** (who wins, who loses)
5. **Create visualizations** (charts, tables)
6. **Write report** following policyengine-writing-skill style
7. **Publish** to blog or share with stakeholders

### Reading PolicyEngine Analysis

**Key sections in a typical analysis:**

**The proposal:**
- What the policy changes
- Specific parameter values

**Household impacts:**
- 3-5 example households
- Dollar amounts for each
- Charts showing impact across the income range

**Statewide/national impacts:**
- Total cost or revenue
- Winners and losers by income decile
- Poverty and inequality effects

**See policyengine-writing-skill for writing conventions.**

## For Analysts 📊

### When to Use This Skill

- Creating policy impact analyses
- Building interactive dashboards with Streamlit/Plotly
- Writing analysis notebooks
- Calculating distributional impacts
- Comparing policy proposals
- Creating visualizations for research
- Publishing policy research

### Example Analysis Repositories

- `crfb-tob-impacts` - Policy impact analyses
- `newsletters` - Data-driven newsletters
- `2024-election-dashboard` - Policy comparison dashboards
- `marginal-child` - Specialized policy analyses
- `givecalc` - Charitable giving calculator

## Repository Structure

Standard analysis repository structure:

```
analysis-repo/
├── analysis.ipynb          # Main Jupyter notebook
├── app.py                  # Streamlit app (if applicable)
├── requirements.txt        # Python dependencies
├── README.md               # Documentation
├── data/                   # Data files (if needed)
├── outputs/                # Generated charts, tables
└── .streamlit/             # Streamlit config
    └── config.toml
```

## Common Analysis Patterns

### Pattern 1: Impact Analysis Across Income Distribution

```python
import pandas as pd
import numpy as np
from policyengine_us import Simulation

# Define reform
reform = {
    "gov.irs.credits.ctc.amount.base_amount": {
        "2024-01-01.2100-12-31": 5000
    }
}

# Analyze across income distribution
incomes = np.linspace(0, 200000, 101)
results = []

for income in incomes:
    # create_situation is assumed to be a helper defined in your analysis
    # code that returns a situation dict for the given income
    situation = create_situation(income=income)

    # Baseline
    sim_baseline = Simulation(situation=situation)
    tax_baseline = sim_baseline.calculate("income_tax", 2024)[0]

    # Reform
    sim_reform = Simulation(situation=situation, reform=reform)
    tax_reform = sim_reform.calculate("income_tax", 2024)[0]

    results.append({
        "income": income,
        "tax_baseline": tax_baseline,
        "tax_reform": tax_reform,
        "tax_change": tax_reform - tax_baseline
    })

df = pd.DataFrame(results)
```

### Pattern 2: Household-Level Case Studies

```python
# Define representative households
households = {
    "Single, No Children": {
        "income": 40000,
        "num_children": 0,
        "married": False
    },
    "Single Parent, 2 Children": {
        "income": 50000,
        "num_children": 2,
        "married": False
    },
    "Married, 2 Children": {
        "income": 100000,
        "num_children": 2,
        "married": True
    }
}

# Calculate impacts for each
case_studies = {}
for name, params in households.items():
    situation = create_family(**params)

    sim_baseline = Simulation(situation=situation)
    sim_reform = Simulation(situation=situation, reform=reform)

    case_studies[name] = {
        "baseline_tax": sim_baseline.calculate("income_tax", 2024)[0],
        "reform_tax": sim_reform.calculate("income_tax", 2024)[0],
        "ctc_baseline": sim_baseline.calculate("ctc", 2024)[0],
        "ctc_reform": sim_reform.calculate("ctc", 2024)[0]
    }

case_df = pd.DataFrame(case_studies).T
```

### Pattern 3: State-by-State Comparison

```python
states = ["CA", "NY", "TX", "FL", "PA", "OH", "IL", "MI"]

state_results = []
for state in states:
    situation = create_situation(income=75000, state=state)

    sim_baseline = Simulation(situation=situation)
    sim_reform = Simulation(situation=situation, reform=reform)

    # Calculate each value once and reuse it for the change
    baseline_net = sim_baseline.calculate("household_net_income", 2024)[0]
    reform_net = sim_reform.calculate("household_net_income", 2024)[0]

    state_results.append({
        "state": state,
        "baseline_net_income": baseline_net,
        "reform_net_income": reform_net,
        "change": reform_net - baseline_net
    })

state_df = pd.DataFrame(state_results)
```

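Patterns 1 through 3 call `create_situation` and `create_family` without defining them. The sketch below shows one way such helpers might look. Everything in it is an assumption rather than PolicyEngine API: the function names and signatures simply mirror the calls above, the ages (35 for adults, 8 for children) are arbitrary, and the entity layout (`people`, `families`, `marital_units`, `tax_units`, `spm_units`, `households`, with `state_name` set on the household) should be checked against the `policyengine_us` version you have installed.

```python
def create_family(income, num_children=0, married=False, state="CA", year=2024):
    """Build a situation dict for one household (hypothetical helper)."""
    period = str(year)

    # The first adult earns all of the employment income (an assumption)
    people = {
        "parent": {"age": {period: 35}, "employment_income": {period: income}}
    }
    adults = ["parent"]
    if married:
        people["spouse"] = {"age": {period: 35}, "employment_income": {period: 0}}
        adults.append("spouse")

    children = [f"child_{i + 1}" for i in range(num_children)]
    for child in children:
        people[child] = {"age": {period: 8}}

    members = adults + children

    # Adults share one marital unit; each child gets their own
    marital_units = {"marital_unit": {"members": list(adults)}}
    for child in children:
        marital_units[f"{child}_marital_unit"] = {"members": [child]}

    return {
        "people": people,
        "families": {"family": {"members": list(members)}},
        "marital_units": marital_units,
        "tax_units": {"tax_unit": {"members": list(members)}},
        "spm_units": {"spm_unit": {"members": list(members)}},
        "households": {
            "household": {
                "members": list(members),
                "state_name": {period: state},
            }
        },
    }


def create_situation(income, state="CA", year=2024):
    """Single adult with no children (hypothetical helper)."""
    return create_family(income, state=state, year=year)
```

Keeping both helpers in one module means the notebook, the dashboard, and the tests all build households the same way.
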
### Pattern 4: Marginal Analysis (Winners/Losers)

```python
import plotly.graph_objects as go

# Calculate across income range
situation_with_axes = {
    # ... setup ...
    "axes": [[{
        "name": "employment_income",
        "count": 1001,
        "min": 0,
        "max": 200000,
        "period": 2024
    }]]
}

sim_baseline = Simulation(situation=situation_with_axes)
sim_reform = Simulation(situation=situation_with_axes, reform=reform)

incomes = sim_baseline.calculate("employment_income", 2024)
baseline_net = sim_baseline.calculate("household_net_income", 2024)
reform_net = sim_reform.calculate("household_net_income", 2024)

gains = reform_net - baseline_net

# Identify winners and losers
winners = gains > 0
losers = gains < 0
neutral = gains == 0

# These are shares of the income grid points, not shares of the population
print(f"Winners: {winners.sum() / len(gains) * 100:.1f}%")
print(f"Losers: {losers.sum() / len(gains) * 100:.1f}%")
print(f"Neutral: {neutral.sum() / len(gains) * 100:.1f}%")
```

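The exact comparisons in Pattern 4 (`gains > 0`, `gains == 0`) can label a household a winner or loser over a rounding difference of a fraction of a cent. One possible refinement, sketched here with only the standard library, is to treat changes smaller than a tolerance as neutral. The function name `classify_gains` and the one-cent default are illustrative choices, not PolicyEngine conventions.

```python
def classify_gains(gains, tolerance=0.01):
    """Return the share of winners, losers, and neutral cases.

    A change counts as a gain or loss only if it exceeds `tolerance`
    in absolute value (one cent by default, an arbitrary choice).
    """
    gains = list(gains)
    if not gains:
        raise ValueError("gains must not be empty")

    n = len(gains)
    winners = sum(1 for g in gains if g > tolerance)
    losers = sum(1 for g in gains if g < -tolerance)
    neutral = n - winners - losers

    return {
        "winners": winners / n,
        "losers": losers / n,
        "neutral": neutral / n,
    }


shares = classify_gains([100.0, -50.0, 0.0, 0.004])
print(f"Winners: {shares['winners'] * 100:.1f}%")
print(f"Losers: {shares['losers'] * 100:.1f}%")
print(f"Neutral: {shares['neutral'] * 100:.1f}%")
```

As in Pattern 4, these shares describe the points you pass in. Applied to an income grid, they are shares of grid points rather than of the population.
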
## Visualization Patterns

### Standard Plotly Configuration

```python
import plotly.graph_objects as go

# PolicyEngine brand colors
TEAL = "#39C6C0"
BLUE = "#2C6496"
DARK_GRAY = "#616161"

def create_pe_layout(title, xaxis_title, yaxis_title):
    """Create standard PolicyEngine chart layout."""
    return go.Layout(
        title=title,
        xaxis_title=xaxis_title,
        yaxis_title=yaxis_title,
        font=dict(family="Roboto Serif", size=14),
        plot_bgcolor="white",
        hovermode="x unified",
        xaxis=dict(
            showgrid=True,
            gridcolor="lightgray",
            zeroline=True
        ),
        yaxis=dict(
            showgrid=True,
            gridcolor="lightgray",
            zeroline=True
        )
    )

# Use in charts (df is the DataFrame built in Pattern 1)
fig = go.Figure(layout=create_pe_layout(
    "Tax Impact by Income",
    "Income",
    "Tax Change"
))
fig.add_trace(go.Scatter(x=df.income, y=df.tax_change, line=dict(color=TEAL)))
```

### Common Chart Types

**1. Line Chart (Impact by Income)**
```python
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=df.income,
    y=df.tax_change,
    mode='lines',
    name='Tax Change',
    line=dict(color=TEAL, width=3)
))
fig.update_layout(
    title="Tax Impact by Income Level",
    xaxis_title="Income",
    yaxis_title="Tax Change ($)",
    xaxis_tickformat="$,.0f",
    yaxis_tickformat="$,.0f"
)
```

**2. Bar Chart (State Comparison)**
```python
fig = go.Figure()
fig.add_trace(go.Bar(
    x=state_df.state,
    y=state_df.change,
    marker_color=TEAL
))
fig.update_layout(
    title="Net Income Change by State",
    xaxis_title="State",
    yaxis_title="Change ($)",
    yaxis_tickformat="$,.0f"
)
```

**3. Waterfall Chart (Budget Impact)**
```python
# baseline_revenue, credit_cost, and phaseout_revenue are placeholders
# for values computed in your own analysis
fig = go.Figure(go.Waterfall(
    x=["Baseline", "Tax Credit", "Phase-out", "Reform"],
    # The final 0 is ignored: a "total" bar is computed from the bars before it
    y=[baseline_revenue, credit_cost, phaseout_revenue, 0],
    measure=["absolute", "relative", "relative", "total"],
    connector={"line": {"color": "gray"}}
))
```

## Streamlit Dashboard Patterns

### Basic Streamlit Setup

```python
import streamlit as st
from policyengine_us import Simulation

# create_family and reform are assumed to be defined in your analysis code

st.set_page_config(page_title="Policy Analysis", layout="wide")

st.title("Policy Impact Calculator")

# User inputs
col1, col2, col3 = st.columns(3)
with col1:
    income = st.number_input("Income", value=60000, step=5000)
with col2:
    state = st.selectbox("State", ["CA", "NY", "TX", "FL"])
with col3:
    num_children = st.number_input("Children", value=0, min_value=0, max_value=10)

# Calculate
if st.button("Calculate"):
    situation = create_family(
        income=income,
        num_children=num_children,
        state=state
    )

    sim_baseline = Simulation(situation=situation)
    sim_reform = Simulation(situation=situation, reform=reform)

    # Calculate each value once and reuse it
    tax_baseline = sim_baseline.calculate("income_tax", 2024)[0]
    tax_reform = sim_reform.calculate("income_tax", 2024)[0]
    change = tax_reform - tax_baseline

    # Display results
    col1, col2, col3 = st.columns(3)
    with col1:
        st.metric("Baseline Tax", f"${tax_baseline:,.0f}")
    with col2:
        st.metric("Reform Tax", f"${tax_reform:,.0f}")
    with col3:
        # delta_color="inverse" shows a tax decrease in green
        st.metric(
            "Change",
            f"${change:,.0f}",
            delta=f"{change:,.0f}",
            delta_color="inverse"
        )
```

### Interactive Chart with Streamlit

```python
# Create chart based on user inputs
# income_max and selected_state are assumed to come from Streamlit widgets
incomes = np.linspace(0, income_max, 1001)
results = []

# One simulation per point is slow; see Pitfall 2 for the axes approach
for chart_income in incomes:
    situation = create_situation(income=chart_income, state=selected_state)
    sim = Simulation(situation=situation, reform=reform)
    results.append(sim.calculate("household_net_income", 2024)[0])

fig = go.Figure()
fig.add_trace(go.Scatter(x=incomes, y=results, line=dict(color=TEAL)))
st.plotly_chart(fig, use_container_width=True)
```

## Jupyter Notebook Best Practices

### Notebook Structure

```python
# Cell 1: Title and Description
"""
# Policy Analysis: [Policy Name]

**Date:** [Date]
**Author:** [Your Name]

## Summary
Brief description of the analysis and key findings.
"""

# Cell 2: Imports
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from policyengine_us import Simulation

# Cell 3: Configuration
YEAR = 2024
STATES = ["CA", "NY", "TX", "FL"]

# Cell 4+: Analysis sections with markdown headers
```

### Export Results

```python
from pathlib import Path

# Make sure the output directory exists before writing to it
Path("outputs").mkdir(exist_ok=True)

# Save DataFrame
df.to_csv("outputs/impact_analysis.csv", index=False)

# Save Plotly chart
fig.write_html("outputs/chart.html")
# Static image export requires the kaleido package
fig.write_image("outputs/chart.png", width=1200, height=600)

# Save summary statistics (gains, winners, losers as computed in Pattern 4)
summary = {
    "total_winners": winners.sum(),
    "total_losers": losers.sum(),
    "avg_gain": gains[winners].mean(),
    "avg_loss": gains[losers].mean()
}
pd.DataFrame([summary]).to_csv("outputs/summary.csv", index=False)
```

## Repository-Specific Examples

This skill includes example templates in the `examples/` directory:

- `impact_analysis_template.ipynb` - Standard impact analysis
- `dashboard_template.py` - Streamlit dashboard
- `state_comparison.py` - State-by-state analysis
- `case_studies.py` - Household case studies
- `reform_definitions.py` - Common reform patterns

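The contents of `reform_definitions.py` are not shown here, so the following is only a guess at the kind of helper such a module could hold. It builds reform dicts in the same shape Pattern 1 uses (a parameter path mapped to a `"start.end"` period key) and merges several of them. The names `make_reform` and `combine_reforms` are hypothetical.

```python
from datetime import date


def make_reform(parameter, value, start="2024-01-01", end="2100-12-31"):
    """Build a one-parameter reform dict keyed by a "start.end" period."""
    # fromisoformat raises ValueError on malformed dates
    if date.fromisoformat(start) > date.fromisoformat(end):
        raise ValueError("start must not be after end")
    return {parameter: {f"{start}.{end}": value}}


def combine_reforms(*reforms):
    """Merge reform dicts; later values win for the same parameter and period."""
    combined = {}
    for reform in reforms:
        for parameter, periods in reform.items():
            combined.setdefault(parameter, {}).update(periods)
    return combined


# The CTC reform from Pattern 1, rebuilt with the helper
ctc_reform = make_reform("gov.irs.credits.ctc.amount.base_amount", 5000)
print(ctc_reform)
```

Validating the dates up front catches a swapped start and end before a simulation is ever run.
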
## Common Pitfalls

### Pitfall 1: Not Using a Consistent Year
**Problem:** Mixing 2024 and 2025 calculations

**Solution:** Define a year constant at the top:
```python
CURRENT_YEAR = 2024
# Use everywhere
simulation.calculate("income_tax", CURRENT_YEAR)
```

### Pitfall 2: Inefficient Simulations
**Problem:** Creating a new simulation for each income level

**Solution:** Use axes for efficiency:
```python
# SLOW
results = []
for income in incomes:
    situation = create_situation(income=income)
    sim = Simulation(situation=situation)
    results.append(sim.calculate("income_tax", 2024)[0])

# FAST
situation_with_axes = create_situation_with_axes(incomes)
sim = Simulation(situation=situation_with_axes)
results = sim.calculate("income_tax", 2024)  # Array of all results
```

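The "FAST" version relies on a `create_situation_with_axes` helper that is not defined in this guide. As a rough sketch of the idea, the function below takes an existing situation dict and attaches an `axes` entry in the same shape Pattern 4 uses. The name `add_income_axis`, its parameters, and the single-adult base situation are all assumptions made for illustration.

```python
import copy


def add_income_axis(situation, max_income=200_000, count=1001,
                    year=2024, variable="employment_income"):
    """Return a copy of `situation` that varies `variable` along one axis."""
    result = copy.deepcopy(situation)  # leave the caller's dict untouched
    result["axes"] = [[{
        "name": variable,
        "count": count,
        "min": 0,
        "max": max_income,
        "period": year,
    }]]
    return result


# Hypothetical single-adult base situation
base = {
    "people": {"adult": {"age": {"2024": 35}}},
    "households": {
        "household": {"members": ["adult"], "state_name": {"2024": "CA"}}
    },
}

situation_with_axes = add_income_axis(base, max_income=100_000, count=11)
print(situation_with_axes["axes"])
```

Copying the situation first keeps the base household reusable for other scenarios, since the axes entry is added only to the returned dict.
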
### Pitfall 3: Forgetting to Compare Baseline and Reform
**Problem:** Only showing reform results

**Solution:** Always show both:
```python
baseline = sim_baseline.calculate("income_tax", 2024)
reformed = sim_reform.calculate("income_tax", 2024)

results = {
    "baseline": baseline,
    "reform": reformed,
    "change": reformed - baseline
}
```

## PolicyEngine API Usage

For larger-scale analyses, use the PolicyEngine API:

```python
import requests

def calculate_via_api(situation, policy_id):
    """Calculate using PolicyEngine API.

    policy_id is the ID of the policy to apply: the reform's ID,
    or the baseline policy's ID for a baseline run.
    """
    url = "https://api.policyengine.org/us/calculate"

    # Payload shape as given in this guide; check it against the API docs
    payload = {
        "household": situation,
        "policy_id": policy_id
    }

    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()  # Fail loudly on HTTP errors
    return response.json()
```

## Testing Analysis Code

```python
import pytest
from policyengine_us import Simulation

# create_family and reform are assumed to be imported from your analysis code

def test_reform_increases_ctc():
    """Test that reform increases CTC as expected."""
    situation = create_family(income=50000, num_children=2)

    sim_baseline = Simulation(situation=situation)
    sim_reform = Simulation(situation=situation, reform=reform)

    ctc_baseline = sim_baseline.calculate("ctc", 2024)[0]
    ctc_reform = sim_reform.calculate("ctc", 2024)[0]

    assert ctc_reform > ctc_baseline, "Reform should increase CTC"
    # Compare floats with a tolerance rather than exact equality
    assert ctc_reform == pytest.approx(5000 * 2), "CTC should be $5000 per child"
```

## Documentation Standards

### README Template

```markdown
# [Analysis Name]

## Overview
Brief description of the analysis.

## Key Findings
- Finding 1
- Finding 2
- Finding 3

## Methodology
Explanation of approach and data sources.

## How to Run

\```bash
pip install -r requirements.txt
python app.py  # or jupyter notebook analysis.ipynb
\```

## Outputs
- `outputs/chart1.png` - Description
- `outputs/results.csv` - Description

## Contact
PolicyEngine Team - hello@policyengine.org
```

## Additional Resources

- **PolicyEngine API Docs:** https://policyengine.org/us/api
- **Analysis Examples:** https://github.com/PolicyEngine/analysis-notebooks
- **Streamlit Docs:** https://docs.streamlit.io
- **Plotly Docs:** https://plotly.com/python/